7,306 Matching Annotations
  1. Jul 2021
    1. And we carry the scars of those stories as epigenetically-programmed determiners of our everyday, modern lives. But those stories are not the whole truth. Rather, they are that truth which the conscious ego and its master, the autonomic nervous system, which is delivered in each new human with factory settings locked on Sympathetic Response (stressed, ego-centered, fearful), has determined as the impetus to create the climate-catastrophic society. And so the trauma at the deepest part of us… which may be driving all of our most self-destructive impulses and patterns… is the belief that we do not belong in this world. That we are strangers in a strange and dangerous land, instead of children who live in a supernatural garden. Because that is the other part of the story.

      interesting narrative shaping here.

      scientifically, yes, our understanding of our autonomic nervous system is super important to healing the things that are wild and disturbing above, but jumping to the next part is where I get lost.

      Agree that a lot of our pain comes from "the belief that we do not belong in this world," which I think we'll have to work on to fix the climate thing.

    1. Author Response

      Reviewer #1 (Public Review):

      [...] My main technical concern lies in the choice of decomposition filter for SEP and alpha oscillations, and the conclusions the authors draw from that. Specifically, a CCA spatial filter is optimized here for the N20 component, which is then identically applied to isolate alpha sources, with the logic being that this procedure extracts the alpha oscillation from the same sources (e.g., L359). I have no issues (or expertise) with using the CCA filter for the SEP, but if my understanding of the authors' intent is correct, then I don't agree with the logic of using the same filter to isolate alpha as well. The prestimulus alpha oscillation can have arbitrary source configurations that are different from the SEP sources, which may hypothetically have a different association with the behavioral responses when it's optimally isolated. In other words, just because one uses the same spatial filter, it does not imply that one is isolating alpha from the same source as the SEP, but rather simply projecting down to the same subspace - looking at a shadow on the same wall, if you will. To show that they are from the same sources, alpha should be isolated independently of the SEP (using CCA, ICA, or other methods), and compared against the SEP topology. If the topology is similar, then it would strengthen the authors' current claims, but ideally the same analyses (e.g., using the 1st and 5th quintile of alpha amplitude to partition the responses) are repeated using alpha derived from this procedure. Also, have the authors considered using individualized alpha filters given that alpha frequency varies across individuals? Why or why not?

      Indeed, applying the same spatial filter to EEG signals with different spatial arrangements of the sources can lead to the extraction of neuronal activity which does not originate from the very same sources. We had chosen our approach, as it is well known that the generators of the early SEP components and the generators of the prominent somatosensory alpha rhythm co-reside at similar sites in the primary somatosensory cortex (e.g., Haegens et al., 2015). Therefore, we considered our approach appropriate to specifically focus on neural activity from the somatosensory region both in the frequency band of the SEP as well as of the alpha rhythm. Yet, we agree with the reviewer that it should be acknowledged that we may have missed or mixed-up effects of alpha activity from other sources by using this procedure (which might have led to different conclusions otherwise). In order to account for this, we repeated our analyses with an SEP-independent reconstruction of the oscillatory effects in source space (“whole brain analysis”). For this, we first reconstructed the sources of alpha activity using eLORETA and head models based on participant-specific MRI scans, and estimated the respective effects independently for all sources across the cortex using both linear-mixed effects models (LME) as well as a binning approach for the Signal Detection Theory (SDT) parameters sensitivity d’ and criterion c (consistent with the previous analyses in our manuscript). In the LME analyses, both the effects of pre-stimulus alpha activity on N20 amplitudes as well as on perceived stimulus intensity were strongest in the right primary somatosensory cortex – in accordance with the sources of the originally extracted tangential CCA component of the SEP (see Supplementary Figure 1 for Peer Review). 
Also, using the binning approach to examine the relation of pre-stimulus alpha activity with SDT parameter criterion c, the effects were most pronounced around the right somatosensory regions (Supplementary Figure 2 for Peer Review), yet these effects did not survive statistical correction for multiple comparisons (FDR-correction with p<.01). However, when performing the same binning analysis for our region of interest (ROI), the hand area in BA 3b of the right somatosensory cortex, a significant effect of pre-stimulus alpha on criterion c was indeed confirmed, t(31)=-2.951, p=.006, CI95%=[-.173, -.032]. Furthermore, in line with our previous CCA results, for sensitivity d’, neither the whole brain analysis nor the ROI analysis showed effects of pre-stimulus alpha amplitude, t(31)=0.633, p=.531, CI95%=[-.083, .157]. Taken together, the findings we report in our original manuscript for pre-stimulus alpha activity obtained with the spatial CCA filter can thus be replicated with an SEP-uninformed source reconstruction, both using LMEs for a “whole-brain analysis” as well as SDT analyses in a ROI-based approach. We therefore conclude that the relationships between pre-stimulus alpha activity, N20 potential of the SEP, and perceived stimulus intensity can indeed be attributed to neural activity from the same (or at least very similar) sources in the primary somatosensory cortex.
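
      For readers less familiar with the SDT parameters reported above, the following is a minimal sketch of the textbook equal-variance definitions of sensitivity d’ and criterion c. It is not the authors' code: the function name, the mapping of responses onto "hits" and "false alarms", and the log-linear correction (adding 0.5 to each cell) are assumptions made for illustration.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion_c) from trial counts.

    Assumption: a log-linear correction (add 0.5 to each cell) is used so
    that hit and false-alarm rates never reach exactly 0 or 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    d_prime = z(hit_rate) - z(fa_rate)               # sensitivity
    criterion_c = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion_c


# Hypothetical counts for one amplitude bin (not data from the study).
d_prime, criterion_c = sdt_measures(80, 20, 20, 80)
```

      In a binning analysis of the kind described, such a function would be evaluated separately for the trials in the lowest and in the highest amplitude quintile, and the resulting d’ and c values compared across participants.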

      Addressing the question on filtering alpha activity in individualized frequency bands, we considered this option, too. However, the rather short length of our pre-stimulus window (-200 to -10 ms) constitutes a natural limit for the frequency resolution in the alpha range and slightly different filter ranges (adjusted with regards to the individual alpha peak frequency) are thus unlikely to lead to large differences in the estimation of pre-stimulus alpha amplitudes. Therefore, we refrained from using individualized frequency bands here and focused on the more generic approach using one common alpha band (8-13 Hz) for all participants, which should also facilitate direct comparisons with previous studies on pre-stimulus oscillatory effects.
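
      The limit mentioned here can be made concrete with the usual rule of thumb that a window of length T cannot separate frequencies closer than roughly 1/T. The sketch below applies that rule to the stated window; the function name is hypothetical, and the 1/T rule is a general signal-processing approximation rather than something taken from the manuscript.

```python
def rayleigh_resolution_hz(t_start_s, t_end_s):
    """Approximate frequency resolution (Hz) of a window, taken as 1 / duration."""
    return 1.0 / (t_end_s - t_start_s)


# Pre-stimulus window from -200 ms to -10 ms, i.e. 190 ms long.
window_resolution = rayleigh_resolution_hz(-0.200, -0.010)  # roughly 5.3 Hz

# Width of the common alpha band (8 to 13 Hz) used for all participants.
alpha_band_width = 13.0 - 8.0
```

      Under this approximation the window resolves frequencies only to about the width of the whole alpha band, so shifting the band by a hertz or so to follow an individual peak frequency would be expected to change the amplitude estimate very little.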

      In the same vein, both alpha and N20 amplitude relate to perceptual judgement, and to each other. I believe this is nicely accounted for in the multivariate analysis using the SEM, but the analysis that partitions the behavioral responses using the 20% and 80% are done separately, which means that different behavioral trials are used to compute the effect of N20 and alpha on sensitivity and criterion. While this is not necessarily an issue given that there IS a multivariate analysis, I would like to know how many of those trials overlap between the two analyses.

      This is an interesting point indeed. We included both the binning analyses and the multivariate analyses in our manuscript as we believe they offer complementary views on the data, and also allow a direct comparison to previous studies in the field (e.g., Iemi et al., 2017). In fact, the trial overlap between the extreme bins of the alpha and N20 data was rather small.

      Since the expected trial overlap is 20% when partitioning the data into quintiles randomly, the effect-driven increments and reductions in trial overlap in our data appear to be rather small. However, they showed the expected directions: Larger alpha amplitudes were associated with more negative N20 amplitudes (and vice versa). Presumably, these small differences in trial overlap reflect the rather small effect sizes we also observed in the multivariate analyses. We have added this information to our revised manuscript in the following way to give the reader a better picture of the underlying data for the binning analyses (page 9, lines 137 ff.): “(Please note that this procedure resulted in a different trial selection as compared to the SDT analysis of pre-stimulus alpha activity. Please refer to Fig. 2—figure supplement 2 for further details on the trial overlap.)”
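
      The 20% chance level can be checked with a small simulation. The sketch below bins two independent random variables into extreme quintiles and measures how many trials they share; the variable names, the trial count of 1000, and the Gaussian data are illustrative assumptions, not the study's data.

```python
import random


def extreme_bin(values, which):
    """Return the set of trial indices in the lowest or highest quintile."""
    order = sorted(range(len(values)), key=values.__getitem__)
    k = len(values) // 5
    return set(order[:k]) if which == "low" else set(order[-k:])


def overlap_fraction(bin_a, bin_b):
    """Fraction of the trials in bin_a that also fall in bin_b."""
    return len(bin_a & bin_b) / len(bin_a)


rng = random.Random(0)
n_trials = 1000
alpha = [rng.gauss(0, 1) for _ in range(n_trials)]
n20 = [rng.gauss(0, 1) for _ in range(n_trials)]  # independent of alpha by construction

# With unrelated measures, about one fifth of the trials should be shared.
chance_overlap = overlap_fraction(extreme_bin(alpha, "high"), extreme_bin(n20, "high"))
```

      A true relation between the two measures would push this fraction above or below one fifth, depending on which pair of extreme bins is compared.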

      At multiple points, the authors comment that the covariation of N20 and alpha amplitude in the same direction is counterintuitive (e.g., L123-125), and it wasn't clear to me why that should be the case until much later on in the paper. My naive expectation (perhaps again being unfamiliar with the field) is that alpha amplitude SHOULD be positively correlated with SEP amplitude, due to the brain being in a general state of higher variability. It was explained later in the manuscript that lower alpha amplitude and higher SEP amplitude are associated with excitability, and hence should have the opposite directions. This could be explicitly stated earlier in the introduction, as well as the expected relationship between alpha amplitude and behavior.

      Thank you for pointing out this lack of clarity. We have now made this rationale explicit at an early point in the introduction (page 3, lines 26 ff.): “According to the baseline sensory excitability model (BSEM; Samaha et al., 2020), higher alpha activity preceding a stimulus indicates a generally lower excitability level of the neural system, resulting in smaller stimulus-evoked responses, which are in turn associated with a lower detection rate of near-threshold stimuli but no changes in the discriminability of sensory stimuli (since neural noise and signal are assumed to be affected likewise).”

      Furthermore, I have a concern with the interpretation here that's rooted in the same issue as the assumption that they are from the same sources: the authors' physiological interpretation makes sense if alpha and N20 originated from the same sources, but that is not necessarily the case. In fact, the population driving the alpha oscillation could hypothetically have a modulatory effect on the (separate) population that eventually encodes the sensory representation of the stimulus, in which case the explanation the authors provide would not be wrong per se, just not applicable. A comment on this would be appreciated in the revision.

      Our extensive additional analyses suggest that the sources of behaviorally relevant alpha and N20 activity were located at very similar cortical sites. Nevertheless, this is not a proof that exactly the same neuronal populations were involved (for example, alpha and N20 effects could originate from different cortical layers). Therefore, we have added this potential limitation to our revised manuscript in the following way (page 19, lines 379 ff.): “Furthermore, with the present data, we cannot unambiguously conclude that the observed relation between pre-stimulus alpha activity and initial SEP indeed involved the very same neuronal populations – which may represent a limitation of the hypothesized mechanism. However, all approaches to localize these effects pointed to very similar cortical regions as discussed in the following section.”

      In addition, given how closely related the investigation of these two quantities are in this specific study, I think it would be relevant to discuss the perspective that SEPs are potentially oscillation phase resets. Even though the SEP is extracted using an entirely different filter range, it could nevertheless be possible that when averaged over many trials, small alpha residues (or other low freq components) do have a contribution in the SEP. If the authors are motivated enough, a simulation study could be done to check this, but is not necessary from my point of view if there is an adequate discussion on this point.

      Indeed, the phase reset mechanism may be a possible alternative explanation for relations between oscillations and later parts of the ERP. However, the N20 potential reflects the very first excitation of the cortex in response to a somatosensory stimulus and should therefore represent a textbook example of an additive response (EPSPs are added to ongoing background activity). Moreover, the N20 response should be over long before a possible phase reset in lower frequencies (such as alpha frequencies) would start to play a role (Hanslmayr et al., 2007; Sauseng et al., 2007). Nevertheless, we ran additional control analyses (including a simulation study) in order to exclude that some odd combination of phase-locking and filter residues led to the present findings: Please see Essential Revision #4 for details and how we included these considerations in our revised manuscript.

      Reviewer #2 (Public Review):

      [...] The main weakness of the manuscript becomes most apparent with respect to the stated impact that "The widespread belief that a larger brain response corresponds to a stronger percept of a stimulus may need to be revisited.". I am not really sure that many cognitive neuroscientists would actually subscribe to such a simplistic relationship between evoked responses and perception; temporal differentiation (early vs late responses) and the biasing influence of prestimulus activity patterns are becoming increasingly recognized. So rather than actually changing a dominant paradigm, this work is an (excellent) contribution to a paradigm shift that is already taking place.

      Thank you for this feedback. We agree that the paradigm shift away from simplistic assumptions about the relationship between variability of neural responses and perception is already taking place and that this is already being appreciated by many scientists in the field. Also, we agree that the present study contributes more evidence to this emerging notion rather than changing the whole field. However, we do think that particularly the observation of opposite amplitude modulations of initial somatosensory evoked responses associated with presented stimulus intensity on the one hand and pre-stimulus excitability state on the other, provides a novel perspective for our understanding of how fundamental features of sensory stimuli are processed at initial cortical levels. Following your suggestions to tone down claims about the controversiality as well as to avoid over-generalization, we have therefore adjusted the impact statement of this manuscript to: “Larger evoked responses during initial cortical processing may reflect states of lower excitability.”

      Furthermore, we have adjusted similar statements throughout the manuscript accordingly.

      Also, it should be considered that, with regard to the analysis approach using CCA, the claims are mainly restricted to BA3b: i.e. while I also think that this is a strength of the current study, one should refrain from overinterpreting the results in a very generalized manner. The authors do include some "thalamus" and "late" evoked response patterns as well; however, the presentation of those results differs from that of the N20 (e.g. using LMEs rather than comparison of extremes; not using SEMs). The readability of results and especially the comparison of effects would profit from a more coherent approach.

      We agree that our findings indeed have the specific focus on the N20 component and thus on its generators in BA3b. We did not intend to suggest that the effects we observed for this initial cortical response can be readily generalized to other (later) ERP components, too. However, we do believe (and hypothesize) that similar mechanisms may be in place for corresponding initial cortical responses in other sensory modalities, too – yet it is clear that we cannot test this generalization with the current study. To avoid misunderstandings of these interpretations and their limitations, we have further specified these aspects in the Discussion.

      Regarding our analyses of the later SEP (i.e., N140 component) and thalamus-related activity (i.e., P15 component), we initially decided to use linear mixed-effects models as they are mathematically equivalent to the way the sub-equations of the structural equation model were constructed (Table 2 in the manuscript). Nevertheless, we have now additionally run binning analyses to make a direct comparison with Signal Detection Theory (SDT) parameters possible as well: For the N140 component, there was a significant effect on criterion c, t(31)=-3.010, p=.005, but no effect on sensitivity d’, t(31)=0.246, p=.807. For the P15 component, no effects emerged either for criterion c or sensitivity d’, t(12)=1.201, p=.253, and t(12)=-0.201, p=.844, respectively. These findings correspond well to the previous LME analyses and may indeed further facilitate the comparison with the findings for the N20 potential and pre-stimulus alpha activity. Therefore, we have added these complementary analyses to our manuscript in the following way:

      Results: “In addition, the SDT analysis based on binning of the P15 amplitudes into quintiles neither suggested a relation with criterion c nor with sensitivity d’, t(12)=1.201, p=.253, and t(12)=-0.201, p=.844, respectively.” (page 14, lines 241 ff.)

      “These findings were in line with a separate SDT analysis: N140 amplitudes were associated with an effect on criterion c, t(31)=-3.010, p=.005, but no effect on sensitivity d’ emerged, t(31)=0.246, p=.807.” (page 15, lines 263 ff.)

      Discussion: “Crucially, our data are at the same time consistent with previous studies on somatosensory processing at later stages, where larger EEG potentials are typically associated with a stronger percept of a given stimulus (e.g., Al et al., 2020; Schröder et al., 2021; Schubert et al., 2006), as both our SDT and LME analyses of the N140 component showed.” (page 19, lines 367 ff.)

      “Yet, neither our SDT analyses nor the LME models of the thalamus-related P15 component supported this notion.” (page 21, lines 414 ff.)

      Methods (page 32, lines 681 ff.): “The effects of the EEG measures pre-stimulus alpha amplitude, N20 peak amplitude, P15 mean amplitude, and N140 mean amplitude on the SDT measures sensitivity d’ and criterion c were examined using a binning approach: […]”

      I have some concerns whether the relationship between large alpha power and more negative N20s could be driven by more trivial factors rather than the model explanations the authors develop in the discussion. Concretely the question whether phase locking of large alpha power along with >30 Hz high pass filtering could produce a similar finding as shown e.g. in Figure 2c. This is an important issue, as prestimulus alpha influences the N20 amplitudes as well as the perceptual reports.

      Indeed, potential phase-locking of alpha oscillations to stimulus onset and filter-related effects are important issues that could potentially offer an alternative explanation for the observed relationship between amplitudes of pre-stimulus alpha activity and the N20 potential of the SEP. Although such pre-stimulus alpha locking is rather unlikely in a paradigm with jittered stimulus onsets (in our case uniformly distributed between -50 ms and +50 ms; corresponding to a whole alpha cycle), we have run the following control analyses to fully exclude this possibility:

      First, we analyzed whether pre-stimulus alpha phase values were distributed uniformly and whether these phase distributions differed between high and low alpha amplitudes as well as between high and low N20 amplitudes. The phase of pre-stimulus alpha activity was obtained from a fast Fourier transform (FFT) in the pre-stimulus time window from -200 to -10 ms, applied to unfiltered but otherwise identically pre-processed data as in the original manuscript (i.e., applying the spatial filter of the tangential CCA component). For the FFT, we used zero padding (extending the pre-stimulus data segments to 2048 data points each) in order to obtain an interpolated frequency resolution of around 3 Hz. The phase was extracted at the frequency 9.766 Hz (i.e., the closest available frequency to 10 Hz). As can be seen in Supplementary Figure 3 for Peer Review, pre-stimulus alpha phases were distributed uniformly across all five quintiles of both alpha and N20 amplitudes. This observation was confirmed by the Rayleigh test (testing for deviations from a uniform distribution; Berens, 2009): Neither in the concatenated phase data of all participants, z=1.130, p=.323, nor in single-participant analyses within every alpha amplitude or N20 amplitude bin, did we find evidence for a non-uniform distribution of alpha phase, all p>.367 (after Bonferroni correction for multiple testing). Thus, there was no phase-locking of pre-stimulus alpha activity that could serve as a trivial alternative explanation of the relationship between pre-stimulus alpha amplitude and N20 amplitude.
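
      The two steps of this control analysis (reading the phase from one bin of a zero-padded transform, then testing the phases for uniformity) can be sketched as follows. This is an illustration, not the authors' pipeline. The sampling rate of 5000 Hz is an assumption, chosen because it places a bin of a 2048-point transform at 9.766 Hz; all function names are hypothetical; and the p-value uses a commonly used closed-form approximation for the Rayleigh test.

```python
import cmath
import math


def nearest_bin(target_hz, fs_hz, n_fft):
    """Index and frequency of the DFT bin closest to target_hz."""
    k = round(target_hz * n_fft / fs_hz)
    return k, k * fs_hz / n_fft


def phase_at_bin(segment, k, n_fft):
    """Phase (radians) of DFT bin k after zero-padding segment to n_fft points."""
    # The padded zeros add nothing to the sum, so only the data samples are summed.
    coeff = sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n, x in enumerate(segment))
    return cmath.phase(coeff)


def rayleigh_test(phases):
    """Rayleigh z statistic and approximate p-value for circular uniformity."""
    n = len(phases)
    c = sum(math.cos(p) for p in phases)
    s = sum(math.sin(p) for p in phases)
    r_n = math.hypot(c, s)  # length of the summed unit vectors
    z = r_n ** 2 / n
    p = math.exp(math.sqrt(1 + 4 * n + 4 * (n ** 2 - r_n ** 2)) - (1 + 2 * n))
    return z, p


# Assumed sampling rate (not stated in the response); gives the bin at 9.766 Hz.
k_alpha, f_alpha = nearest_bin(10.0, 5000.0, 2048)
```

      With the assumed rate, the bins of the padded transform lie about 2.4 Hz apart. A large p-value from `rayleigh_test` means the phases give no evidence against a uniform distribution, which is the outcome reported above.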

      Second, in order to examine whether the combination of our temporal filters (30 to 200 Hz band-pass for the SEP, and 8 to 13 Hz band-pass for alpha activity) could have led to the present findings, we additionally re-ran our analysis pipeline with simulated data: We mixed exemplary SEP responses with constant amplitudes (unfiltered; derived from within-participant averages), with simulated alpha band activity with randomized amplitude fluctuations, and pink noise, reflecting neural background activity as is typical for the human EEG. The SEP onsets were chosen according to our original experimental paradigm with inter-stimulus intervals of 1513 ms and a jitter of ±50 ms. Next, we filtered these mixed signals between 30 and 200 Hz in order to extract the single-trial SEPs, and estimated the pre-stimulus alpha amplitudes between -200 and -10 ms in the same way as was done in the original manuscript (i.e., by filtering the mixed signal between 8 and 13 Hz). This procedure was repeated for 32 generated data streams, containing 1000 SEPs each (corresponding to our empirical dataset of 32 participants). The resulting average SEPs showed no visually detectable difference between the five alpha amplitude quintiles, nor did a random-slope linear mixed-effects model indicate any relation between pre-stimulus alpha amplitude and N20 amplitude on a single-trial level, βfixed=-.0005, t(255.16)=-.094, p=.925. Therefore, our findings cannot be explained by filter artifacts or residual activity leaking from the alpha frequency band to the frequency band of the N20 potential.

      Third, we re-analyzed our empirical EEG data in time-frequency space to obtain a more detailed view of the effects of pre-stimulus alpha activity on N20 amplitudes. For this, we decomposed our pre-processed but unfiltered data with wavelet transformation (complex Morlet wavelets) and calculated linear-mixed effects models on the relation between signal amplitudes in the time-frequency domain and single-trial N20 amplitudes as obtained from our original analyses. As shown in Supplementary Figure 5 for Peer Review, the time-frequency representations of the effects on N20 amplitudes indeed indicated a specific role of the alpha band, with its effects (i.e., already 200 ms before stimulus and in the upper alpha frequency range) separated from the time- and frequency range of the N20 potential of the SEP (i.e., from ~20 ms after stimulus onwards and above ~20 Hz). In addition, we ran the same analysis for the behavioral effect (i.e., perceived stimulus intensity). Also here, pre-stimulus effects were predominantly visible in the alpha band. Of note, there were also strong effects in the beta band. These may be interesting to study further in future studies – in particular, whether they reflect independent physiological processes or rather harmonics of the alpha band. Furthermore, these time-frequency representations suggest that the studied pre-stimulus effects might have been even more pronounced if we had analyzed the data in pre-stimulus time windows from -300 to -10 ms. However, in order to avoid inflating effect sizes by post-hoc data digging (“p-hacking”), we prefer to keep the original, a priori chosen time window for the main analyses of the manuscript. Yet, these onsets of pre-stimulus effects at around -300 ms may be of interest for future work. 
Taken together, these time-frequency analyses further support the notion that the observed relation between pre-stimulus alpha activity and N20 amplitudes is not due to technical issues (such as filter leakage and phase-locking) but rather reflects genuine neurophysiological effects of alpha oscillations on SEPs.

      We have added the time-frequency analysis, as well as the SEP simulation analysis as figure supplements to Figure 2 in our revised manuscript (page 8) since we believe that these control analyses comprehensively show that the observed effects were (a) specific to the alpha band and (b) not due to any data processing-related artifacts.

      It is important to emphasize that the model developed is a post-hoc one, i.e. the authors do not develop in the discussion various alternative result scenarios based on different model predictions. Therefore there is no strong evidence in support of the specific one advanced in the discussion.

      Thank you for raising this issue. Indeed, we cannot prove with the current findings that our proposed physiological model of the relation between alpha oscillations and the SEP is the correct model (or that it is at least the best one out of a selection of possible alternative models). To do so, future studies would be needed that can actually directly measure and/or manipulate differences in membrane potentials and trans-membrane currents. Rather, we aimed with the present study to associate a physiological meaning with the concept of excitability changes in the human EEG – offering a hypothesis that may be worthwhile to be studied (and either confirmed or rejected) in future studies. We have tried to make this motivation more explicit in the Discussion section (page 20, lines 384 ff.): “Also, we would like to emphasize that the presented mechanism reflects a hypothesized model, which shall be further supported or falsified with more targeted studies, for example, directly quantifying membrane potentials and trans-membrane currents in relation to different excitability states in somatosensation.”

    1. Author Response

      Reviewer #1 (Public Review):

      [...] The manuscript is excellently written and discusses the simulation results clearly and succinctly. The resolution of the simulations is very impressive and yields unprecedented insight into the effect of merozoite shape on alignment dynamics, which has important implications for how effectively the parasite can survive and multiply. The conclusions reached by the authors are certainly justified by the simulation data. In particular, the authors are careful not to draw conclusions beyond the limits of their study, and acknowledge other factors which may influence the merozoite shape, such as internal structural constraints and the energy of invasion following successful alignment.

      We thank the reviewer for a thorough reading of our manuscript and the very positive judgement.

      Regarding weaknesses of the manuscript, some of the explanations of the trends observed in the simulation data could be expanded slightly, to help gain a deeper understanding of the competition between adhesion and RBC deformability underlying the alignment dynamics. These are described in more detail below.

      1. Line 114 and lines 120-129: The discussion here of the trends observed in Figure 1 (including why the LE shape has a larger energy compared to the OB shape despite having a smaller adhesion area) is somewhat vague and should be developed further. For example, currently there is only a video showing the egg-like shape and a second video comparing the LE shape to a spherical shape - it would be helpful to have a further video comparing the LE and OB shapes and the different RBC deformations they cause. Moreover, the explanation of the energy/mobility of each shape in terms of curvatures (e.g. the OB shape having "lower curvature at its flat side") could be made more precise. I would expect that the adhesion area depends on how close the principal curvatures of the merozoite surface are to being equal and opposite to the natural curvatures of the RBC, since this determines the bending energy associated with wrapping the merozoite and forming short bonds. This would explain why the spherical shape is most mobile (its principal curvatures are constant so there is no region where at least one is relatively small), and why alignment is most likely to occur in the dimple of the RBC where the membrane is naturally concave-outward. For a given adhesion area, the deformation energy should depend on the difference in principal curvatures in the contact region, with a larger difference causing more bending of the RBC membrane. This difference is larger for the LE shape, since one principal curvature remains large at each point on the surface, compared to the OB shape whose principal curvatures are both small on the 'flat side' where contact is most likely to occur.

      We have expanded the discussion of these results to make it clearer. Furthermore, a new video was generated to visualize the differences between the shapes.

      1. Lines 175-176: Given that the ratio A_m/A_s (adhesion area to total surface area) plays a key role in the probability of alignment, the authors should be more quantitative at this point. How does the ratio A_m/A_s (as measured directly, or indirectly e.g. by the area under the probability distributions inside the alignment region in figures 3a,b) scale with the system parameters, such as the adhesion strength and the off-rate k_off? Can it be estimated from an energy balance between RBC bending/stretching and the average adhesion energy?

      A change in A_m as a function of adhesion strength can be estimated analytically for a sphere, as was done in Hillringhaus et al. Biophys. J. 117:1202, 2019. For small deformations, there is essentially a competition of bending and adhesion energies, while for strong adhesion, the stretching-elasticity contribution becomes important. We have included this theoretical result in the manuscript and discuss its implications.

      1. Line 197-198 and Figure 4c: Why is the deformation energy associated with the OB shape much lower than all other shapes for values of k_off/k_on^{long} smaller than 2?

      For k_off/k_on^{long} < 2, the magnitude of local curvature has a pronounced effect. For the OB shape, a large adhesion area is formed over the area with very low curvature, and close to the rim where the curvature is large, the adhesion strength may not be strong enough to induce membrane wrapping and deformation. For other shapes, the adhesion strength is large enough to lead to partial wrapping of the parasite by the membrane over moderate curvatures. As a result, the integrated deformation energy is significantly lower for the OB shape than for the other shapes in this regime of adhesion strengths. We have added this clarification to the manuscript.

      1. Alignment requires that the distance between the merozoite apex and RBC membrane is very small, and the alignment criteria necessitate examining small changes in the apex angle \theta from \pi. Can the authors comment on how sensitive the results are to the numerical discretisation used?

      The discretization length does affect the tightness of the alignment criteria. In our simulations, the average discretization length of the RBC membrane is about l0 = 0.2 μm. The half circumference of a parasite (corresponding to an angle of π) is πR, which is equal to about 12 l0 for R = 0.75 μm, such that our angle resolution with respect to the parasite size is about 0.1π. Therefore, we use 0.2π for the alignment criteria, which is large enough to avoid strong discretization effects. Simulations with a finer discretization are possible, but they become very expensive computationally.
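
      The resolution estimate in this reply can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the authors' simulation code: it takes the half circumference to be πR and uses the two values quoted in the reply (l0 = 0.2 μm, R = 0.75 μm). The variable names are ours.

      ```python
      import math

      l0 = 0.2   # average RBC membrane discretization length in micrometres (value quoted in the reply)
      R = 0.75   # parasite radius in micrometres (value quoted in the reply)

      # Arc length of half the parasite circumference, i.e. the arc spanning an angle of pi.
      half_circumference = math.pi * R

      # Number of membrane edges of length l0 that fit along that arc.
      n_segments = half_circumference / l0

      # Angle subtended by one edge, in radians (this is simply l0 / R).
      angle_resolution = math.pi / n_segments

      # The same resolution expressed as a fraction of pi.
      resolution_in_pi = angle_resolution / math.pi

      print(f"edges along half circumference: {n_segments:.1f}")
      print(f"angle resolution: {resolution_in_pi:.3f} * pi")
      ```

      Under these assumptions one edge subtends somewhat less than 0.1π, so an alignment tolerance about twice that size spans roughly two membrane edges.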

      Reviewer #2 (Public Review):

      [...] A major strength of the results is that they address an unstudied problem in malarial pathogenesis. The results pertaining to adhesion strength may be informative for preventing the organism from invading red blood cells. A primary weakness is that there is too little detail provided in the methods for this reviewer to adequately assess the computational method. Secondly, the results are somewhat inconclusive. While the egg shape performs better than certain other shapes, there is no clear final understanding of why this shape is preferred over the spherical or short ellipsoidal shapes. However, this possibly provides some clues as to why a certain malarial species does actively adopt a spherical shape during red blood cell binding and invasion.

      We thank the reviewer for the positive judgment of our manuscript. We have significantly expanded the methods section, so it should now contain all necessary simulation details. We agree with the reviewer that the conclusions about shape advantages and disadvantages are equivocal to some extent, but this is exactly what our simulation data show. However, it is clear from our data that two shapes (i.e. egg-like and spherical) stand out, and they also correspond to real examples of merozoite shapes. As the reviewer points out, we do discuss some clues for the importance of parasite shape in the alignment process.

      Overall, the authors achieved their aims by quantitatively assessing the effect of parasite shape and adhesion strength on cell alignment, which is a proxy for invasion. The discussion at the end of the manuscript provides an accurate evaluation of the results that puts them into the context of invasion. While to some extent the results presented here are inconclusive, I do think that this paper achieves an important goal for its field. This is an understudied area pertinent to a major disease. This manuscript has the potential to bring questions of the biophysics of malarial invasion out to the broader community, specifically introducing these questions to biophysicists as well as microbiologists. Furthermore, the results naturally lead to new questions. If the spherical and egg shapes do not confer a strong advantage, then these specific shapes must also play a role in other processes. The authors do suggest some possibilities in the Discussion. That there remain interesting questions is a great spur for future work.

      Thank you for emphasizing the importance of multidisciplinarity. We also hope that our work will ignite interest in different communities, as only a multidisciplinary effort can bring us much closer to an understanding of parasite alignment and invasion, which clearly involve a combination of different mechanical and biochemical processes.

    1. Author Response

      Reviewer #1 (Public Review):

      [...] Their studies were complemented by transcriptomics and metabolomics and these results support the general conclusions that pollen contains diverse carbon sources which could be used in complementary ways by the different species, which have diverse metabolic capabilities encoded in their genomes.

      Reply: We thank the reviewer for the positive assessment of our manuscript.

      One of the points that was not completely explored in the paper is what happens in the simplified diet both in vitro and in the Bee gut. They propose in the discussion that in the presence of few and simple carbon sources (sugars) there is competition for nutrients and competitive exclusion is driving loss of some species. But this is not fully addressed in the paper.

      Reply: All four species can colonize the gut individually and grow on their own in axenic cultures when provided with either the simple sugars or the pollen as the only carbon source. When cultured together, all four strains are stably maintained in the presence of pollen. However, three of the four strains steadily decrease in abundance in the simple sugars. These findings are, in our opinion, consistent with the consumer-resource model (more resources = more species that can coexist) and the competitive exclusion principle, which predicts that if two or more strains compete for the same nutrients they will not be able to coexist. We have added a corresponding section on lines 423-425.
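
      The consumer-resource argument in this reply can be illustrated with a toy chemostat model. The sketch below is a generic textbook-style construction and not the authors' analysis: the uptake matrix, supply values, dilution rate, and strain labels are all hypothetical, chosen only so that every strain can use a shared sugar (with unequal affinity) and each strain additionally has one pollen-derived substrate of its own.

      ```python
      # Toy chemostat consumer-resource model (illustrative only; all numbers are hypothetical).
      # Resource 0 is a shared simple sugar; resources 1-4 are pollen-derived substrates,
      # each used by exactly one of four hypothetical strains.

      UPTAKE = [
          # sugar, pollen substrates 1..4
          [1.0, 1.0, 0.0, 0.0, 0.0],  # strain 0: strongest sugar competitor
          [0.8, 0.0, 1.0, 0.0, 0.0],  # strain 1
          [0.6, 0.0, 0.0, 1.0, 0.0],  # strain 2
          [0.4, 0.0, 0.0, 0.0, 1.0],  # strain 3: weakest sugar competitor
      ]


      def simulate(supply, uptake=UPTAKE, dilution=0.1, dt=0.05, steps=40_000):
          """Integrate the model with forward Euler and return the final strain abundances."""
          n_strains, n_res = len(uptake), len(supply)
          N = [0.1] * n_strains   # initial strain abundances
          R = list(supply)        # resources start at their supply concentrations
          for _ in range(steps):
              # Per-capita growth: linear uptake summed over resources, minus washout.
              growth = [
                  sum(uptake[i][j] * R[j] for j in range(n_res)) - dilution
                  for i in range(n_strains)
              ]
              # Resource change: inflow/outflow minus consumption by all strains.
              dR = [
                  dilution * (supply[j] - R[j])
                  - R[j] * sum(uptake[i][j] * N[i] for i in range(n_strains))
                  for j in range(n_res)
              ]
              N = [max(N[i] + dt * N[i] * growth[i], 0.0) for i in range(n_strains)]
              R = [max(R[j] + dt * dR[j], 0.0) for j in range(n_res)]
          return N


      # "Simple sugar" diet: only the shared resource is supplied.
      sugar_only = simulate([1.0, 0.0, 0.0, 0.0, 0.0])
      # "Pollen-like" diet: the sugar plus all four strain-specific substrates.
      pollen = simulate([1.0, 1.0, 1.0, 1.0, 1.0])

      # Strains whose abundance stays above an arbitrary extinction threshold.
      survivors_sugar = [i for i, n in enumerate(sugar_only) if n > 1e-6]
      survivors_pollen = [i for i, n in enumerate(pollen) if n > 1e-6]
      print("survivors on sugar only:", survivors_sugar)
      print("survivors on pollen-like diet:", survivors_pollen)
      ```

      In this toy setup only the strongest sugar competitor persists when the sugar is the sole resource, whereas all four strains persist when each also has its own substrate. The model says nothing about which real strain would win; it only shows the qualitative pattern that the consumer-resource model and the competitive exclusion principle predict.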

      The system they use (with 4 closely related bacterial species) is a simplified system. Therefore, it is not clear if the same general findings will hold in more complex systems. But the results supporting that nutrient complexity (in diet) and metabolic diversity (from the microbial side) are key factors to enable co-existence and persistence of complex microbiota communities are strong and likely generalizable. Although, it is possible that with other communities and other hosts other factors will also come into play. Nonetheless, the current study is important because it sets a good example for how these questions can be addressed to study more complex systems.

      Reply: It is true that bacterial coexistence does not necessarily need to be dependent on the nutrient complexity and that in other communities the host, the structure of the environment, or cross-feeding activities may play a more important role. We have discussed this point in the revised manuscript starting on line 423 and on line 427.

      Overall, the study described here is complete, and rigorous, except for a few points that still need to be addressed and clarified. Namely, it would be interesting to understand what drives exclusion of some members of the community in the simplified diet.

      Reply: See our reply above.

      Importantly, the current study opens the door for new studies (including in vitro studies) on the identification of network interactions that are important for Microbe-Microbe interactions that enable co-existence in other systems. Additionally, this study also highlights the importance of identifying the relevant nutritional (and metabolic) conditions for addressing those questions given the importance of the metabolic context in shaping microbe-microbe interactions.

      Reply: Thank you. We agree.

      Reviewer #2 (Public Review):

      [...] Strengths: The use of community profiling, transcriptomics, and metabolomics adds depth, as does the comparison of defined culture conditions to the host environment. The main conclusions drawn by the authors are that the presence of pollen is necessary for gut species to coexist, and that the different species, although closely related, respond in distinct ways to nutrients in pollen and consume different profiles of nutrients from pollen.

      Reply: We thank the reviewer for the positive feedback and the many valuable comments which helped us to further strengthen our manuscript.

      Weaknesses: The main weakness I see with this work is the choice of in vitro comparison conditions. The strains are cultured either on pollen or sugar water, whereas in vivo bees are fed a diet of pollen and sugar water, or only sugar water. A direct comparison is possible between the strains grown on sugar water in vitro or in vivo, but I think that in several places, the authors may have to reconsider or modify their interpretations comparing in vitro culture on pollen/pollen extract with the in vivo growth of the community on pollen and sugar water. Because there is sugar in the bee diet, differences in assembly dynamics, transcription, or metabolite consumption between pollen-containing culture conditions and the bee gut might stem from the dietary intake of sugar, or from an aspect of the host environment.

      Reply: We agree with the reviewer that the nutrient conditions that were used in vitro and in vivo are not identical and may have impacted the relative abundance of some of the community members, the transcriptional profiles, or the metabolite changes. Nevertheless, we believe that our experimental design is valid to test the main hypothesis of our study, i.e. a complex, pollen-based diet facilitates coexistence, while simple sugars lead to the dominance of a single strain independent of the environment (culture tube versus host). An important point to consider here is that bees will pre-digest the consumed pollen, and partially absorb dietary nutrients such as amino acids, glucose, and fructose, before they reach the bacteria in the hindgut. Consequently, the in vivo and in vitro conditions would never be the same even if we had used identical nutrients in our treatments. Also, pollen by itself contains glucose, fructose, and sucrose. So, although we did not add glucose to the in vitro pollen condition, this simple sugar was present in the corresponding condition. We have added a corresponding section in the discussion on lines 402-422. That said, while we cannot recapitulate the exact same nutritional conditions in vitro, we still think that our main conclusion holds, which is that we can recapitulate in vitro the pollen-dependent coexistence found in vivo.

      Reviewer #3 (Public Review):

      [...] Overall, the paper is strong and the arguments and conclusions put forth are well supported by the data. I only have a few suggestions:

      Reply: We thank the reviewer for the positive evaluation of our manuscript.

      1) The study focuses on one strain each of the 4 Firm-5 species; however, there is diversity within each species. This is only briefly mentioned in the paper at the very end, and I think the authors should address this a bit more directly. In particular, they have previously generated a large amount of genomic data from some of these other strains, so it is likely possible to infer or speculate, based on this data, whether they expect different strains within each species to utilize similar nutrients. Also, I'm wondering if the authors can comment on how their findings could extend to the related bumble bee gut microbiome. Such a discussion would help enhance the applicability and importance of this study.

      Reply: We agree that the large amount of strain-level diversity within a given species is an important point to consider. However, we would prefer not to expand on this point much further, as it would require a relatively complex genomic analysis. Also, considering that many of the strain-specific transcriptional changes are in genes shared with the other species, we are not sure how much such an analysis would reveal. In any case, we plan to compare the coexistence between strains from the same versus different lineages in a follow-up study.

      As for the bumble bees, we currently do not know how many strains or species of Lactobacillus Firm5 can coexist in bumble bees. Therefore, we feel that a discussion extending to bumble bees would be too speculative. However, we included a sentence in the discussion which states that since pollen facilitates coexistence, it follows that dietary differences are likely to influence the diversity of Lactobacillus Firm5, and we give the example of the Asian honey bee, which seems to harbor only one species of this phylotype. See lines 479-488.

      2) It is interesting that different species ended up dominating in the in vivo vs. in vitro simple sugar-based communities. What do the authors think may be behind this difference?

      Reply: This is indeed an interesting point. We have not used the same sugars in vivo (sucrose) and in vitro (glucose). Moreover, the nutritional and physicochemical conditions in the hindgut are likely different from those found in a culture tube. We have mentioned that these are potential reasons for the observed differences in the relative abundance of different community members between in vivo and in vitro conditions on lines 402-422 of the manuscript.

      3) Since the observed coexistence of these gut microbes is largely due to nutritional niche partitioning, it would be helpful if the authors can comment on the natural variation of key pollen derived metabolites, and if/how we could expect ecological variation in the bee microbiome due to plant pollen availability based on biogeography and seasonality.

      Reply: We agree and have included a corresponding sentence in the discussion on line 479. See also our reply to point 1.

      4) The supplementary information is nicely documented and accessible, but I think it would be even more useful if genome-wide data for the RNA-seq results, not just for select genes, are made available. Furthermore, I suggest including descriptive titles and labels within the supplementary Excel files, as there are many separate sheets and it is not always clear what each one shows.

      Reply: This has been included in the revised manuscript.

  2. migration-encounters-prototype.netlify.app
    1. Isabel: Yeah. That's good then. This is a weird question, so do with it what you will. Do you feel Mexican or American?
       Nadxieli: Mexican. Hell, yeah.
       Isabel: Hell, yeah. Why is that?
       Nadxieli: Well, I don't know. This is who I am, you cannot change that. Even though you move out a country, a continent, you are what you are at the end, I think you never forget that. I think when you forget that is when you lost your identity, most likely people take advantage of that. So as long as you remember who you are and where you're coming from, you're good.
       Isabel: I like that.
       Nadxieli: Yeah.
       Isabel: I know some people say, oh, you talk different or you have these different things about you because of your time in the US, and some people may say "Oh, you're from neither here or there," some people they don't know you may not have the same experience, but some people think you're too Mexican to be American or too American to be Mexican. That's a trend you see. What would you say to that?
       Nadxieli: I would say I'm 100% Mexican. I never changed that. It took a while to get into the Mexican stuff again. But at the end we already knew that. So it was not that hard. I also think that it's because I didn't spend a lot of years there. So I know people working where I'm working, they spent their whole life, so that will be different if I spend like 22 years out of 23, I guess that will be different.

      Identity, Mexican;

    1. Anita: Did Gerald Ford know you were undocumented?
       Rodolfo: No, Gerald Ford didn't know I was undocumented, no. I was still very young at that point. My mother and my family always told me, "Don't let anybody know you're undocumented." If somebody finds out, for whatever reason, there's some people who just are plain out racist or don't want people like me in the States. Sometimes they just do things to... I don't know. That's what I understood and that's what I took in and that's what I applied to my life. It's like living a secret, it was like living a second life or whatever. It's like, "Oh shit, why do I have to lie, why?" I guess it's neither here nor there now, right? I'm here in Mexico.
       Anita: That must have been incredibly difficult. I know personally, because I've had to keep secrets.
       Rodolfo: Yeah, I guess it's one of those things where you think it's never really gonna affect you, until you're in the back of the DHS, the Department of Homeland Security, van. You're next to a whole bunch of people you never met, and they're also in the same position. Some don't even speak English. You don't really understand how immediately it can affect you until it affects you. I never thought it would affect me. Okay, well I mean, I'm working, I'm going to school—I'm in high school—I'm doing this, this and that. Some of my friends who are students already dropped out. Did everything, they've already gone to prison and back and everything, and they haven't even hit their 21st birthday.
       Rodolfo: And I'm still good, I'm still good. I may not be a straight A student or anything, but hey man, I'm still here! Why can't I have the same privilege as you all do? Why can't I get my license? You know how happy I was when I got my license here, damn. I love to drive, that's one of my passions. Always, always, always I love to drive. I couldn't get my license over there. I remember even in high school in drivers ed, I knew what the answer was, but I asked my mom, "Hey mom, can I apply for drivers ed, so I can get my license?" She was like, "You know you can't get your license." Again, one of the primary things, I'm like damn, I'm just not gonna be able to drive all my life? Or if I do drive and I get pulled over—as a matter of fact, that's the reason why I got deported, driving without a valid drivers license.
       Rodolfo: I never got why the paper said, "Driving on a suspended license." I would always ask them, "If I don't have a license, why is it suspended?" They just told me, "Because you have a drivers license number, but you don't have a drivers license." I'm like, "Okay, so if I have a drivers license number, why can't I get my drivers license?" "You don't have the proper documentation." I'm like, "But I have my..."
       Rodolfo: One day I thought, "Well why don't I just grab the driver license number and have somebody make me a fake drivers license, and put the drivers license on there?" But see, if I get caught with it, now I'm in more trouble, and now I'm seen as a real criminal, because now I'm going around the system once again. That's why we don't want you here, because you're gonna do things like that. [Exhale] I haven't talked about this in a while. It just makes me want to…I don't know.

      Time in the US, Immigration Status, Being secretive, Hiding/lying, In the shadows, Living undocumented; Reflections, The United States, US government and immigration; Feelings, Frustration; Time in the US, Jobs/employment/work, Documents, Driver's license, Social security card/ID

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their thoughtful comments. We were delighted the reviewers found our results “compelling”, “striking”, “well presented”, “implications exciting”, “excellent results! really nice!”, “this microscopy is beautiful!” and “translational-dependence (of mRNA localization) in a transcript-specific way without perturbing translation globally”, which is a “complete surprise, and opens exciting doors to investigate how translation leads to mRNA organization and its connection to tissue development” and “may represent a new pathway of mRNA transport”.

      We also appreciated the comments regarding the “wide appeal”, “broad readership of readers”, and “broad interest” the reviewers gave to our manuscript regarding its impact, and also the comments of “well-written (and) well-cited”.

      We can address all the concerns raised by the reviewers. In addition to textual changes, we will add the following to the Results section:

      1. Additional quantitation of smFISH beyond Figure 2;
      2. Addition of a negative (uniformly distributed) mRNA control and its quantitation;
      3. Western blots for our ΔATG lines to determine what and how much protein is made;
      4. Unbiased nuclear masking.

      Our specific responses are shown below, in blue.

      Reviewer #1

      Major comments

      Fig. 1: Main and supplementary figures present smFISH signals for eight localized mRNAs, while in the results section authors describe that they analyzed twenty-five transcripts. Authors should explain the choice of transcripts presented in the paper.

      We will include a panel in Fig. S1E to show every mRNA that we tested, and we will edit Table 1 to describe the observed subcellular localization.

      We will edit the text, adding a few sentences to clarify, along the lines of: “Our survey revealed mRNAs with varying degrees of localization within epithelia that we divided into three classes: CeAJ/membrane localized, perinuclearly localized, and unlocalized (Fig. 1 and S1 and Table 1).” and “The rest of our tested mRNAs did not possess any evident subcellular localization at any of the analyzed embryonic stages/tissues and were not further investigated (Fig. S1E and Table 1).”

      Moreover, smFISH signals of different localized mRNAs in epidermal cells were visualized at different stages (bean, comma or late comma), and the authors did not comment on the reason for these choices. This may make the transcript localization results difficult to interpret, as further analysis showed that mRNA localization varied in a stage-specific manner.

      We have now clarified this point in the legend of Figure 1: “Specific embryonic stages were selected for each transcript based on the highest degree of mRNA localization they exhibited.”

      Did the authors use smFISH probes designed against endogenous mRNAs for all tested transcripts?

      We did not. We now clarify this point in Materials and methods: “All probes were designed against the endogenous mRNA sequences except dlg-1 (some constructs), pkc-3, hmp-2, spc-1, let-805, and vab-10a, whose mRNAs were detected with gfp probes in their corresponding transgenic lines (Table S2). An exception to this is Fig. S1A where we used probes against the endogenous dlg-1 mRNA.”

      Marking dlg-1 mRNA as dlg-1-gfp suggests that smFISH probe was specific for gfp transcript. Is it true? If yes, authors should compare localization of wild-type endogenous dlg-1 mRNA with that of the transcript encoding a fusion protein, to confirm that fusion does not affect mRNA localization.

      Yes, in Fig. 1C we show smFISH for GFP (i.e., the tagged dlg-1 only). In Fig. S1A, we show smFISH against endogenous dlg-1. Tagged and endogenous dlg-1 mRNAs are both localized. We clarified this point in the main text: “Five of these transcripts were enriched at specific loci at or near the cell membrane: laterally and at the CeAJ for dlg-1 (Fig. 1C for endogenous/GFP CRISPR-tagged dlg-1::gfp mRNA and S1A for endogenous/non-tagged dlg-1 mRNA), (…)”. And in the Supplemental figure legend (Fig. S1A): “Endogenous/non-tagged dlg-1 mRNA shows CeAJ/membrane localization like its endogenous/GFP CRISPR-tagged counterpart.”

      Fig. 2B: Authors conclude that at later stages of pharyngeal morphogenesis mRNA enrichment at the CeAJ decreased gradually in comparison to comma stage. Data do not show a statistically significant decrease in the ratio of localized mRNAs - for dlg-1: bean: 0.39 ± 0.09, comma: 0.29 ± 0.08, 1.5-fold: 0.30 ± 0.09; for ajm-1: bean: 0.36 ± 0.08, comma: 0.30 ± 0.05, 1.5-fold: 0.28 ± 0.09.

      A one-tailed t-test revealed a significant difference between bean and comma stages for both dlg-1 and ajm-1 mRNAs. Statistical analysis and data will be provided.

      Fig. 4: What was the difference between the first and the second ΔATG transgenic line? Authors should analyze the size of the truncated DLG-1 protein that is expressed from the second ΔATG transgenic line that localizes to CeAJ. Knowing alternative ATGs and protein size may suggest domain composition of the truncated protein. This will allow to confront truncated protein localization with the results from.

      We will perform a Western blot to determine the size and levels of proteins produced.

      Fig. 5. Moreover, to prove that the localization of dlg-1 mRNA at the CeAJ is translation-dependent, an additional experiment should be performed in which transcript localization is analyzed in embryos treated with translation inhibitors such as cycloheximide (a translation elongation inhibitor) and puromycin (which induces premature termination).

      We believe this comment might refer to Fig. 4. If this is the case: drugs like cycloheximide and puromycin affect the translation of the whole transcriptome, whereas with our ΔATG experiment, we aimed to target the translation of one specific transcript and avoid secondary effects. Nevertheless, we understand Reviewer #1’s concern and will include a second experiment. In our hands, cycloheximide and puromycin have never worked in older embryos (it’s hard to get past the eggshell and into the embryo). Instead, we will use stress conditions, which induce a “ribosome drop-off” (Spriggs et al., 2010). Heat stress has been shown to decrease polysome occupancy (Arnold et al., 2014). We, therefore, have used heat-shock at 33°C for 30’, and the results are now shown in Fig. S4. These show the loss of RNA localization upon heat shock.

      Minor comments

      In the introduction section authors should emphasize the main goal and scientific significance of the paper.

      We added these sentences to state the significance before summarizing the results: “To investigate the impact of mRNA localization during embryonic development, we conducted a single molecule fluorescence in situ hybridization (smFISH)-based survey (…)” and “Our data demonstrate that the dlg-1 UTRs are dispensable, whereas translation is required for localization, therefore providing an example of a translation-dependent mechanism for mRNA delivery in C. elegans.”

      Fig. 1A: It's hard to distinguish the different colors on the schematic. The schematic presents intermediate filaments that are not included in Table 1.

      We modified Table 1 based on this and other reviewers’ comments.

      Fig. 1C: dlg-1 transcript is marked as dlg-1-gfp on the left panel and dlg-1 on the right panel.

      Corrected.

      Fig. 2B: Axis labels and titles are not visible, larger font size should be used.

      We will modify the graph (following Reviewer #2’s suggestion) and will increase the font size of the axis labels and titles.

      Fig. 5C: Enlarge the font size.

      Will do.

      Fig. S2: Embryonic stages should be marked on the figure for easier interpretation.

      Added.

      Reviewer #2

      Major comments

      Figure 2 requires a negative (or uniformly distributed) mRNA control for comparison. Figure 2C should be quantified. The plot quality should be improved, and appropriate statistical tests should be employed to strengthen the claimed findings.

      We will add a negative control (jac-1 mRNA) and quantify Fig. 2C as well. Plots will be changed according to the suggestion.

      Most claims of perinuclear mRNA localization are difficult to see and not well supported visually or statistically. The usage of DAPI markers, membrane markers, 3D rendering, or a quantified metric would bolster this claim. Also, sax-7 is claimed to be perinuclear and elsewhere claimed to be uniform then used as a uniform control. Please explain or resolve these discrepancies more clearly.

      Regarding perinuclear mRNAs:

      We are not trying to make a big statement out of these data, as perinuclear (ER) localization of mRNAs coding for transmembrane/secreted proteins is well known. The aim of our study was to describe transcripts localized at or in the proximity of the junction. However, we thought it was worth mentioning these examples of perinuclearly localized mRNAs (hmr-1, sax-7, and eat-20) for two reasons: scientific correctness (showing accessory results that might be interesting for other scientists) and their use as positive controls for our smFISH survey (these mRNAs were expected to localize perinuclearly for the reasons mentioned above). We will rewrite the text to make these points clearer.

      Regarding sax-7 mRNA:

      sax-7 mRNA localizes perinuclearly in sporadic instances (Fig S1C), but it is predominantly scattered throughout the cytoplasm (i.e., unlocalized). It presumably localizes perinuclearly in a translation-dependent manner as sax-7 codes for a transmembrane protein that would be targeted to the ER. We have described this ER-type of localization in the introduction and reiterated it partially in the first paragraph of the results. sax-7 UTRs are therefore presumably not responsible for subcellular localization, which would instead depend on a signal sequence. We will better clarify this point in the main text.

      The major concern about the paper is the data display and interpretation of Figure 5C. I'm not comfortable with the approach the authors took of blurring out the nucleus. A more faithful practice would be to use an automated mask over DAPI staining or to quantify the entirety of the cell. If the entirety of the cell were quantified, one could still focus analysis on specific regions of relevance. The interpretations distinguishing membrane versus cytoplasmic localization (or mislocalization) are hard to differentiate in these images especially since they are lacking a membrane marker. The ability to make these distinctions forms the basis of Tocchini et al's two pathways of dlg-1 mRNA localization. These interpretations also heavily rely on how the image was processed through the different Z-stacks, and it's not clear to me how that was done. For example, the diffusion of mRNA in figure 5F and 5I are indistinguishable to my eye but are claimed to be different.

      In the images, the nuclei have been blurred to allow the reader to focus on the cytoplasmic signal and not on the nuclear (transcriptional) signal as it is not meaningful for this study. In the quantitation, the nuclear signal has been unbiasedly and specifically removed from the analysis by cropping out the DNA signal from the other channels. The frontal plane views of the seam cells in Fig. 5 show maximum intensity projections (MIPs) of 3 Z-stacks (0.54 µm total) that each contain nuclei and, therefore, the transcriptional signal (schematics in Fig. 5B). We will clarify these points in the text.

      Regarding cytoplasmic versus membrane-associated mRNAs, although we did not have a membrane marker, we relied on the brightness of the DLG-1::GFP signal to identify the cell borders (i.e., membranes) after over-exposure. This approach allowed us to discern apicobasal and apical sides for the intensity profile analyses. We will clarify this point as well in the text and, in parallel, we will try a different approach using transverse sections on top views to clarify our data.

      To my eye, it seems that Figure 5 could be more faithfully interpreted to state that DLG-1 protein localization depends on the L27-SH3 domains. The Hook/Guk domains are dispensable for DLG-1 protein localization; however, through other studies, we know they are important for viability. In contrast, dlg-1 mRNA localization requires all domains of the protein (L27-Guk). It is exceptionally interesting to find a mutant condition in which the mRNA and protein localizations are uncoupled. It would be very interesting to explore in the discussion or by other means what the purpose of localized translation may be. Because, in this instance, proper mRNA localization and protein function are closely associated, it may suggest that DLG-1 needs to be translated locally to function properly.

      We will rewrite the Results and Discussion to clarify our model. We agree that L27 and SH3 domains are critical, but we also detected effects of the HooK/GuK domains. We have refined our model to describe functions of the N and C termini for membrane or junctional localization.

      The manuscript requires an improved materials & methods description of the quantification procedures and statistics employed.

      We will add these points.

      Minor & Major comments together - text

      Summary statement: Is "adherent junction" supposed to be "adherens junction?"

      Corrected.

      Abstract: Sentence 1, I think they should add a caveat word to this sentence. Something like "...phenomenon that can facilitate sub-cellular protein targeting." In most instances this isn't very well characterized or known.

      Corrected.

      In the first paragraph, it might be good to mention that Moor et al. also showed that mRNAs localize to different regions to alter their level of translation (to concentrate them in ribosome-dense regions of the cell).

      Added as follows: “For example, a global analysis of localized mRNAs in murine intestinal epithelia found that 30% of highly expressed transcripts were polarized and that their localization coincided with highly abundant regions in ribosomes (Moor, 2017).”

      There are some new studies of translation-dependent mRNA localization - that might be good to highlight - Li et al., Cell Reports (PMID: 33951426) 2021; Sepulveda et al., 2018 (PCM), Hirashima et al., 2018; Safieddine, et al 2021. Also, Hughes and Simmonds, 2019 reviews membrane associated mRNA localization in Drosophila. And a new review by Das et al (Nat Rev MCB) 2021 is also nice.

      We will add them to the text.

      Parker et al. did not show that the 3'UTR was dispensable for mRNA localization. They showed the 3'UTR was sufficient for mRNA localization.

      Quoting from the paper Parker et al.: “3′UTRs of erm-1 and imb-2 were not sufficient to drive mRNA subcellular localization. Endogenous erm-1 and imb-2 mRNAs localize to the cell or nuclear peripheries, respectively, but mNeonGreen mRNA appended with erm-1 or imb-2 3′UTRs failed to recapitulate those patterns (Fig. 4A-D).” We will make this point clearer in the rewritten text.

      In the second paragraph, the sentence about bean stages is missing one closing parenthesis.

      Corrected.

      Last paragraph: FISH is fluorescence, not fluorescent.

      Corrected.

      Both "subcellular" and "sub-cellular" are used.

      Corrected.

      Minor comments – Figures

      Figure 1

      o Figure 1A is confusing. It's not totally clear what the rectangles and circles signify. There are many acronyms within the figure. Which of the cell types depicted in the figure are shown here? For example, for the dorsal cells, which is the apical v. basal side?

      We tried to simplify the cartoon for a general C. elegans epithelial cell. We followed schematics already shown in previous publications to maintain consistency. Acronyms and color-codes are listed in the corresponding figure legend and have been better clarified.

      o Some of the colors are difficult to distinguish, particularly when printed out or for red/green colorblind readers. Is erm-1 meant to be a cytoskeletal associated or a basolateral polarity factor?

      We understand the issue, but unfortunately, with 8 classes of factors, shades of gray might not solve the problem. We tried to circumvent the red-green issue by changing red to dark grey. Furthermore, we added details about shapes to the figure legends. We will continue to improve the color scheme.

      ERM-1 is a cytoskeletal-associated factor.

      o The nomenclature for dlg-1 is inconsistent within "C".

      Corrected.

      o Please specify what the "cr" is in "cr.dlg-1:-gfp" in the legend.

      Added.

      Figure 2

      o Can Figure 2C be quantified in a similar manner to 2A/2B?

      Currently our script cannot do that, but we will try to optimize it so that it can quantify this type of image.

      o 2B - please jitter the dots to better visualize them when they land on top of one another

      Yes, we will.

      o Please include a negative control example, a transcript that is not peripherally localized for comparison.

      Yes, we will.

      o There is no place in the text of the document where Fig 2C is referenced

      Corrected (it was wrongly referred to as “2B”).

      o I can't see any discernable ajm-1 localization in Fig 2A.

      We added some arrowheads to point at specific examples and increased the intensities of the corresponding smFISH signal for better visualization.

      o I can't see any dlg-1 pharyngeal localization in Fig2C.

      We added some arrowheads to point at specific examples and increased the intensities of the corresponding smFISH signal for better visualization.

      o More details on how the quantification was performed would be welcome. Particularly, in 2B, what is the distance from the membrane in which transcripts were called as membrane-associated? What statistics were used to test differences between groups?

      We will add a full description of the script used as well as the statistic details.

      Figure 3

      o Totally optional but might be nice: can you make a better attempt to approximate the scale of the cartoon depiction?

      The UTRs, especially the 5’ one, are much smaller than the dlg-1 gene sequence. Scaling the cartoon to the actual sequences would draw attention away from the main subjects of this figure, the UTRs. Nevertheless, we made sure the corresponding figure legend states clearly that the cartoon is not to scale: “The schematics are not in scale with the actual size of the corresponding sequences. UTR lengths: dlg-1 5’UTR: 61 nucleotides; sax-7 5’UTR: 63 nucleotides; dlg-1 3’UTR: 815 nucleotides; unc-54 3’UTR: 280 nucleotides.”

      o The GFP as an asterisk illustration may be confusing for some readers. Could you add another rectangular box to depict the gfp coding sequence?

      Corrected.

      o This microscopy is beautiful!

      Thanks Reviewer #2!

      o Were introns removed? Is the endogenous copy still present?

      All the transgenes were analyzed in a wild-type background; therefore, yes, the endogenous copy was still present. All the transgenes possessed introns. We will change the corresponding text as follows: “To test whether the localization of one of the identified localized mRNAs, dlg-1, relied on zip codes, we generated extrachromosomal transgenic lines carrying a dlg-1 gene whose sequence was fused to an in-frame GFP and to exogenous UTRs.” In the figure, “dlg-1 ORF” has been replaced with “dlg-1 gene”.

      o The wording in the legend "CRISPR or transgenic" may be confusing as Cas9 genome editing is still a form of transgenesis.

      We added “extrachromosomal” to clarify the nature of the mRNA.

      o The authors state that the 5'-3'UTR construct produces perinuclear dlg-1 transcripts but in the absence of DAPI imaging, it's not clear that this is the case.

      We could not find such a statement, but we tried to clarify the localization of these mRNAs in the text: “The mRNA localization patterns of the two UTR reporters were compared to the localization of dlg-1 transcripts from the CRISPR line (“wild-type”, Fig. 3A; Heppert et al., 2018), described in Fig. 2. Both reporter strains showed enrichment at the CeAJ and localization dynamics of their transcripts that were comparable to the wild-type cr.dlg-1 (Fig. 3B). These results indicate that the UTR sequences of dlg-1 mRNA are not required for its localization.”

      o Which probe set was used? The gfp probe?

      Yes, please see the main text: “Given that the transgenic constructs were expressed in a wild-type background, smFISH experiments were conducted with probes against GFP RNA sequences to focus on the transgenic dlg-1::GFP mRNAs (cr.dlg-1 and tg.dlg-1).”

      o Here, sax-7 is used as a uniform control, but sax-7 is claimed in Fig S1B-D as being perinuclear. This is a bit confusing.

      sax-7 mRNA localizes perinuclearly in sporadic instances (Fig S1C), but it is predominantly scattered throughout the cytoplasm (i.e., unlocalized). It presumably localizes perinuclearly in a translation-dependent manner as sax-7 codes for a transmembrane protein that would be targeted to the ER. We have described this ER-type of localization in the introduction and reiterated it partially in the first paragraph of the results. sax-7 UTRs are therefore presumably not responsible for any subcellular localization, which would instead rely on a signal sequence. We will better clarify this point in the main text.

      Figure 4

      o Excellent results! Really nice!

      Thanks Reviewer #2!

      o Fig 4A. The GFP depicted as a circle is strange.

      We changed it into a rectangle.

      o Fig 4A. Can you include the gene/protein name for easy skimming?

      Added.

      o Fig 4B. the color here is too faint and it is unclear what is being depicted. Overall, this part of the figure could be improved.

      We are optimizing the coloring and simplifying the schematics.

      o Were the introns removed?

      No, the introns were maintained in this and in all our transgenic lines. We described our transgenic lines in the materials and methods section (now with more detail). What we depict in the scheme (Fig. 4A) is the mature RNA (now specified in the figure); therefore, no introns are depicted. We will also specify this in the main text.

      Figure 5

      o Fig 5A. can you add the gene/protein name

      Added.

      o Fig 5B. Can you make the example apicobasal (non-apical) mRNA more distinctive? If it had its own peak in the lower trace, the reader would more clearly understand that this mRNA will be excluded from apical measurements whereas it will be included in apicobasal measurements.

      We actually wanted to show this specific example: a cytoplasmic mRNA and a junctional mRNA may seem close from the apicobasal analysis (partially overlapping peaks that Reviewer #2 mentioned). With the apical analysis, instead, we can show that these mRNAs are actually not close, and they belong to two different compartments (cytoplasm and junction). We would therefore like to keep the current scheme, while better clarifying this point in the corresponding figure legend.

      o D' - I' The grey font is too light.

      Noted. We will change it.

      o D' - I' The inconsistent y-axis scaling makes it difficult to compare across these samples. Can you set them to the same maximum number?

      The values are indeed quite different. We tried using the same scale, but this made some of the data hard to discern. The idea was to evaluate, within each graph, how mRNA and protein are localized relative to the junctional marker. We will make this clearer in the text.

      o D' - I' The x-axis labels are formatted incorrectly

      Corrected.

      o The practice of masking out the nucleus appears to remove potentially important mRNAs that are not nuclear localized. This could really impact the findings and interpretation. Instead, consider an automated DAPI mask.

      The masking shown in the images is not the same as that used for the analysis: in the images, a shaded circle was drawn on the DNA channel and moved onto the corresponding location in the other channels or merges. For the analysis, the DNA signal was specifically removed from the channel with the smFISH signal. Given that the analysis was performed on maximum intensity projections of 3 Z-stacks, we believe we did not remove any non-nuclear mRNA. We will clarify this point in Materials and methods.
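
      As an illustration only, the two operations named in this response (a maximum intensity projection over Z planes, followed by removal of the DNA signal from the smFISH channel) could be sketched as below. This is not the script used in the study: the function names, the pixel-wise threshold rule, and the toy 2 x 2 images are all assumptions made for the example.

```python
from typing import List

Image = List[List[float]]  # one plane: rows of pixel intensities


def max_intensity_projection(planes: List[Image]) -> Image:
    """Collapse Z planes into one image by taking the per-pixel maximum."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [
        [max(plane[r][c] for plane in planes) for c in range(cols)]
        for r in range(rows)
    ]


def remove_nuclear_signal(smfish: Image, dna: Image, dna_threshold: float) -> Image:
    """Zero smFISH pixels wherever the DNA channel exceeds the threshold.

    The threshold rule is an assumption; the response does not state how
    the DNA signal was delimited.
    """
    return [
        [0.0 if d > dna_threshold else s for s, d in zip(s_row, d_row)]
        for s_row, d_row in zip(smfish, dna)
    ]


# Toy data: 3 planes of a 2 x 2 image per channel (hypothetical values).
smfish_planes = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 5.0], [0.0, 1.0]],
    [[3.0, 0.0], [0.0, 0.0]],
]
dna_planes = [
    [[9.0, 0.0], [0.0, 0.0]],
    [[8.0, 0.0], [0.0, 1.0]],
    [[7.0, 0.0], [0.0, 0.0]],
]

smfish_mip = max_intensity_projection(smfish_planes)
dna_mip = max_intensity_projection(dna_planes)
masked = remove_nuclear_signal(smfish_mip, dna_mip, dna_threshold=5.0)
print(masked)  # [[0.0, 5.0], [0.0, 2.0]]
```

      In this toy case only the pixel under the bright DNA signal is zeroed, while the neighbouring smFISH pixels are kept, which is the behaviour the response describes for non-nuclear mRNA.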

      o I can't see what the authors are calling membrane diffuse versus cytoplasmic. This is making it hard for me to see their "two step" pathway to localization.

      We will add an example of a membrane-localized mRNA to Fig. 5B-C. Furthermore, we will add transverse sections of membrane and cytoplasm to make the data clearer to the reader.

      o Can more details of the quantification be included? How were Z-sections selected, chosen for inclusion? Which Z-sections and how many were selected?

      We will add the details to Materials and methods.

      o Also, why do these measurements focus on what I think are the seam cells when Lockwood et al., 2008 show the entire epithelium that is much easier to see?

      We are focusing on the seam cells at the bean stage as these are the cells and the embryonic stage where we see the highest localization of dlg-1 mRNA in the wild-type.

      o Please name these constructs to correlate the text more explicitly to the figures.

      Added.

      o How many embryos were analyzed for each trace? How many embryos showed consistent patterns?

      We will add the details of the analysis to Materials and methods.

      o Why were these cells used for study here? Lockwood et al., 2008 use a larger field of epithelial cells for visualization.

      As stated before: we are focusing on the seam cells at the bean stage as these are the cells and the embryonic stage where we see the highest localization of dlg-1 mRNA in the wild-type.

      Figure 6

      There are major discrepancies between what this figure is depicting graphically and what is described in the text. Again, I'm not comfortable making the "two step" claims this figure purports given the data shared in Figure 5.

      We are planning to re-write the last part of the results to better clarify our two-step model. A two-step model had been previously suggested in McMahon et al., 2001, where they could show that DLG-1 and AJM-1 (referred to in that publication as JAM-1) are initially localized laterally and only later in development are then enriched apically. Our data agree with McMahon very well, so we used the earlier study as a start. We will cite and explain this paper in greater depth during the rewriting.

      Minor comments - Tables & Supplemental Figures

      Table 1

      I think this table could be improved to more clearly illustrate which mRNAs were tested and what their mRNA localization patterns were (for example, gene name identifiers included, etc). Could the information that is depicted by gray shading instead be added as its own column? For example, have a column for "Observed mRNA localization"

      We modified Table 1 based on these and the other reviewers’ comments.

      Can you add distinct column names for the two columns that are labeled as "protein localization - group"

      We modified Table 1 based on these and the other reviewers’ comments.

      Can you also add which of these components are part of ASI v. ASII (as described in the introduction?)

      A new table has been added with the factors belonging to the two adhesion systems (same color code as in Table 1).

      Supplemental Figure 1

      It is hard to see that some of these spots are perinuclear. More information (membrane marker, 3D rendering, improved metrics) is required to support this claim.

      We are not trying to make a big statement out of these data, as perinuclear localization for mRNAs coding for transmembrane/secreted proteins is well known. The aim of our study was to describe transcripts localized at or in the proximity of the junction. We thought it was worth mentioning these examples of perinuclearly localized mRNAs (hmr-1, sax-7, and eat-20) for two reasons: scientific correctness – show accessory results that might be interesting for other scientists – and use as positive controls for our smFISH survey – these mRNAs were expected to have a somewhat perinuclear localization for the reasons mentioned above.

      What do these images look like over the entire embryo, not just in the zoomed in section?

      We added a column with the zoom-out embryos.

      sax-7 localization in S4 looks similar but a different localization claim is made.

      sax-7 mRNA can localize perinuclearly in sporadic instances (Fig S1C), but is predominantly scattered throughout the cytoplasm (i.e., unlocalized). It presumably localizes perinuclearly in a translation-dependent manner as sax-7 codes for a transmembrane protein that would be targeted to the ER. We have described this ER-type of localization in the introduction and reiterated it partially in the first paragraph of the results. sax-7 UTRs are therefore presumably not responsible for any subcellular localization, which would instead rely on a signal sequence. We will better clarify this point in the main text.

      Supplemental Figure 2

      Before adherens junctions even exist dlg-1 goes to the membrane - this is really neat!

      Thanks Reviewer #2!

      Supplemental Figure 3

      Technical question: If either 5 or 3 stack images are used, how does this work? Do they have different z-spacings? Or do the 5-stack images represent a wider Z-space?

      This is the sentence in question: “Maximum intensity projections of 5 (1.08 µm) (A) and 3 (0.54 µm) (B) Z-stacks”. The space between each Z-stack image is constant in all our imaging and its value is 270 nm. When we consider 5 planes, the distance from the 1st to the 5th is 4 x 270 nm = 1.08 µm, whereas for 3 planes it is 2 x 270 nm = 0.54 µm.
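
      The depth arithmetic above can be checked with a short sketch: n planes are separated by n - 1 gaps, so the depth is (n - 1) times the spacing. The function name `projection_depth_nm` is hypothetical; the 270 nm spacing and the plane counts are taken from the response.

```python
def projection_depth_nm(n_planes: int, spacing_nm: float = 270.0) -> float:
    """Distance from the first to the last plane of a Z-stack projection.

    n planes are separated by (n - 1) gaps, so the depth is (n - 1) * spacing.
    """
    if n_planes < 1:
        raise ValueError("a projection needs at least one plane")
    return (n_planes - 1) * spacing_nm


for planes in (5, 3):
    depth_um = projection_depth_nm(planes) / 1000  # convert nm to µm
    print(f"{planes} planes: {depth_um:.2f} µm")
# 5 planes: 1.08 µm
# 3 planes: 0.54 µm
```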

      Supplemental Figure 4

      Line #2 retains translation and keeps mRNA localization.

      Totally optional, but consider showing both lines in the main figure to illustrate the two possibilities.

      Noted.

      Materials and methods - how did they create the ATG mutations? Is it an array? - why does one translate, and one doesn't?

      We will clarify this point in Materials and methods: “dlg-1 deletion constructs ΔATG (SM2664 and SM2663) and ΔL27-PDZs (SM2641) were generated by overlap extension PCR using pML902 as a template.”

      We will perform a Western blot to clarify Reviewer #2’s last point. Currently we do not know what peptide is translated, but the comparison with our full-length control will probably shed some light on the issue.

      Reviewer #3

      Major comments

      The smFISH results are striking and implications exciting. The conclusions made from the smFISH results reported in all Figures will be strengthened considerably by quantifying the mRNA localized to the defined specific subcellular regions. At the very least, localization to the cytoplasm versus the plasma membrane should be determined as performed in Figure 2B, but quantifying finer localization will enhance the conclusions made about regional localization (e.g. CeAJ versus plasma membrane mRNA localization in Figure 5). Inclusion of a non-localizing control in Figures 1-4 will enable statistical comparisons between mRNA localizing and non-localizing groups.

      We will add more quantitation, statistics, and negative controls.

      The script used for smFISH quantitation should be included in the methods or published in an accessible forum (Github, etc). Criteria for mRNA "dot" calling should be defined in the methods. All raw smFISH counts should also be reported.

      We will add the full description of the script in Materials and methods, and we will provide the raw data in an additional supplementary table.

      Figure 2: What is the localizing ratio of a non-localizing control mRNA (e.g. jac-1)? Including an unlocalized control with quantitation would strengthen the localization arguments presented.

      Yes, we will add quantitation for an unlocalized mRNA.

      Figure 5: Quantifying colocalization of mRNA and protein (+/- AJM-1) will strengthen the arguments made about mRNA/protein localization.

      Yes, we will quantify Fig. S5 to have a full picture of the cells (the images in Fig. 5 represent only a portion of the cell).

      Discussion of the CeAJ mRNA localization mechanism is warranted. Do the authors speculate that the newly translated protein drives localization during translation, similar in concept to SRP-mediated localization to the ER, or ribosome association is a trigger to permit a secondary factor to drive mRNA localization, or another model?

      Unfortunately, this is hard to say at the moment as we do not have any data regarding where translation actually occurs. We will add a conjecture to the Discussion.

      Minor comments

      Please complete the following sentence: "We identified transcripts enriched at the CeAJ in a stage- and cell type-specific."

      Corrected.

      It would be helpful to provide reference(s) for the protein localization summary in Table 1.

      Added.

      Figure 2B: Did dlg-1 and ajm-1 localize at similar ratios? Appropriate statistics comparing the different ratios may be informative.

      We will modify the graph (following Reviewer #2’s suggestion) and add the requested details.

      Figure 2: In the paragraph that begins, "Morphogenesis of the digestive track," the text should refer to Figure 2C? If not, the text requires further clarification.

      Corrected.

      Figure 2: Reporting the smFISH localizing ratios of 8E and 16E will be informative.

      We will add the information.

      Please include citations when summarizing the nonsense-mediated decay NMD mechanism and AJM-1 identifying the CeAJ.

      Added.

      The sentence, "Embryos from our second ΔATG transgenic line displayed a little GFP protein and some dlg-1::gfp mRNA," should refer to Figure S4.

      Added.

      An immunoblot of this reporter versus wild type may be informative regarding the approximate position of putative alternative start codon.

      We will perform a Western blot to verify the size of the protein product produced.

      Figure 5: N's and repetitions performed should be included for localization experiments.

      Yes, we will add them here and in all the other quantifications we will add to the manuscript.

      Please clarify that the "the mechanism of UTR-independent targeting is unknown in any species" refers to dlg-1 mRNA localization.

      Added.

      "Our findings suggest..." discussion paragraph should reference Figure 6.

      Added.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Tocchini et al. screened apical junction and cell membrane proteins for mRNA localization. They identified multiple proteins that are translated from localized mRNAs. Of these, dlg-1 (Discs large) mRNA localizes to cell cortices of dorsal epithelial cells, endoderm cells, and epidermal (seam) cells and is dependent on active translation for transport. The manuscript dissects the contributions of different DLG-1 protein domains to mRNA localization.

      A major strength of the paper is the way it assesses translational dependence in a transcript-specific way without perturbing translation globally. The authors cleverly combine mutations in ATG start sites with a knockdown of the nonsense-mediated decay pathway. This allows Tocchini et al. to examine whether dlg-1 mRNA depends on active translation for localization, which it does. The authors observe an interesting finding, that the domains required for protein localization can be separated from those required for mRNA localization. Namely, mRNA localization (but not protein localization) requires C-terminal domains of the protein.

      My major points of concern focus on the presentation and interpretation of Figure 5. In this figure, the blocking approach used seems confounding, the observations described by the authors are not visible, the quantification is confusing, and the interpretations seem like an over-reach. The

      Major comments:

      • Figure 2 requires a negative (or uniformly distributed) mRNA control for comparison. Figure 2C should be quantified. The plot quality should be improved, and appropriate statistical tests should be employed to strengthen the claimed findings.

      • Most claims of perinuclear mRNA localization are difficult to see and not well supported visually or statistically. The usage of DAPI markers, membrane markers, 3D rendering, or a quantified metric would bolster this claim. Also, sax-7 is claimed to be perinuclear and elsewhere claimed to be uniform then used as a uniform control. Please explain or resolve these discrepancies more clearly.

      • The major concern about the paper is the data display and interpretation of Figure 5C. I'm not comfortable with the approach the authors took of blurring out the nucleus. A more faithful practice would be to use an automated mask over DAPI staining or to quantify the entirety of the cell. If the entirety of the cell were quantified, one could still focus analysis on specific regions of relevance. The interpretations distinguishing membrane versus cytoplasmic localization (or mislocalization) are hard to differentiate in these images especially since they are lacking a membrane marker. The ability to make these distinctions forms the basis of Tocchini et al's two pathways of dlg-1 mRNA localization. These interpretations also heavily rely on how the image was processed through the different Z-stacks, and it's not clear to me how that was done. For example, the diffusion of mRNA in figure 5F and 5I are indistinguishable to my eye but are claimed to be different.

      • To my eye, it seems that Figure 5 could be more faithfully interpreted to state that DLG-1 protein localization depends on the L27-SH3 domains. The Hook/GuK domains are dispensable for DLG-1 protein localization; however, through other studies, we know they are important for viability. In contrast, dlg-1 mRNA localization requires all domains of the protein (L27-GuK). It is exceptionally interesting to find a mutant condition in which the mRNA and protein localizations are uncoupled. It would be very interesting to explore in the discussion or by other means what the purpose of localized translation may be. Because, in this instance, proper mRNA localization and protein function are closely associated, it may suggest that DLG-1 needs to be translated locally to function properly.

      • The manuscript requires an improved materials & methods description of the quantification procedures and statistics employed.

      Minor & Major comments together:

      Text

      • Summary statement: Is "adherent junction" supposed to be "adherens junction?"

      • Abstract: Sentence 1, I think they should add a caveat word to this sentence. Something like "...phenomenon that can facilitate sub-cellular protein targeting." In most instances this isn't very well characterized or known.

      • In the first paragraph, it might be good to mention that Moor et al. also showed that mRNAs localize to different regions to alter their level of translation (to concentrate them in ribosome-dense regions of the cell).

      • There are some new studies of translation-dependent mRNA localization - that might be good to highlight - Li et al., Cell Reports (PMID: 33951426) 2021; Sepulveda et al., 2018 (PCM), Hirashima et al., 2018; Safieddine, et al 2021. Also, Hughes and Simmonds, 2019 reviews membrane associated mRNA localization in Drosophila. And a new review by Das et al (Nat Rev MCB) 2021 is also nice.

      • Parker et al. did not show that the 3'UTR was dispensable for mRNA localization. They showed the 3'UTR was sufficient for mRNA localization.

      • In the second paragraph, the sentence about bean stages is missing one closing parenthesis.

      • Last paragraph: FISH is fluorescence, not fluorescent.

      • Both "subcellular" and "sub-cellular" are used.

      Minor comments - Figures

      • Figure 1

      o Figure 1A is confusing. It's not totally clear what the rectangles and circles signify. There are many acronyms within the figure. Which of the cell types depicted in the figure are shown here? For example, for the dorsal cells, which is the apical v. basal side?

      o Some of the colors are difficult to distinguish, particularly when printed out or for red/green colorblind readers. Is erm-1 meant to be a cytoskeletal associated or a basolateral polarity factor?

      o The nomenclature for dlg-1 is inconsistent within "C".

      o Please specify what the "cr" is in "cr.dlg-1:-gfp" in the legend.

      • Figure 2

      o Can Figure 2C be quantified in a similar manner to 2A/2B?

      o 2B - please jitter the dots to better visualize them when they land on top of one another

      o Please include a negative control example, a transcript that is not peripherally localized for comparison.

      o There is no place in the text of the document where Fig 2C is referenced

      o I can't see any discernable ajm-1 localization in Fig 2A.

      o I can't see any dlg-1 pharyngeal localization in Fig2C.

      o More details on how the quantification was performed would be welcome. Particularly, in 2B, what is the distance from the membrane in which transcripts were called as membrane-associated? What statistics were used to test differences between groups?

      • Figure 3

      o Totally optional but might be nice: can you make a better attempt to approximate the scale of the cartoon depiction?

      o The GFP as an asterisk illustration may be confusing for some readers. Could you add another rectangular box to depict the gfp coding sequence?

      o This microscopy is beautiful!

      o Were introns removed? Is the endogenous copy still present?

      o The wording in the legend "CRISPR or transgenic" may be confusing as Cas9 genome editing is still a form of transgenesis.

      o The authors state that the 5'-3'UTR construct produces perinuclear dlg-1 transcripts but in the absence of DAPI imaging, it's not clear that this is the case.

      o Which probeset was used? The gfp probe?

      o Here, sax-7 is used as a uniform control, but sax-7 is claimed in Fig S1B-D as being perinuclear. This is a bit confusing.

      • Figure 4

      o Excellent results! Really nice!

      o Fig 4A. The GFP depicted as a circle is strange.

      o Fig 4A. Can you include the gene/protein name for easy skimming?

      o Fig 4B. the color here is too faint and it is unclear what is being depicted. Overall, this part of the figure could be improved.

      o Were the introns removed?

      • Figure 5

      o Fig 5A. can you add the gene/protein name

      o Fig 5B. Can you make the example apicobasal (non-apical) mRNA more distinctive? If it had its own peak in the lower trace, the reader would more clearly understand that this mRNA will be excluded from apical measurements whereas it will be included in apicobasal measurements.

      o D' - I' The grey font is too light.

      o D' - I' The inconsistent y-axis scaling makes it difficult to compare across these samples. Can you set them to the same maximum number?

      o D' - I' The x-axis labels are formatted incorrectly

      o The practice of masking out the nucleus appears to remove potentially important mRNAs that are not nuclear localized. This could really impact the findings and interpretation. Instead, consider an automated DAPI mask.

      o I can't see what the authors are calling membrane diffuse versus cytoplasmic. This is making it hard for me to see their "two step" pathway to localization.

      o "F" looks the same as "I" to me, but the authors claim they represent different patterns and use these differences as the basis for their claim that X.

      o Can more details of the quantification be included? How were Z-sections selected, chosen for inclusion? Which Z-sections and how many were selected?

      o Also, why do these measurements focus on what I think are the seam cells when Lockwood et al., 2008 show the entire epithelium that is much easier to see?

      o Please name these constructs to correlate the text more explicitly to the figures.

      o How many embryos were analyzed for each trace? How many embryos showed consistent patterns?

      o Why were these cells used for study here? Lockwood et al., 2008 use a larger field of epithelial cells for visualization.

      • Figure 6

      o There are major discrepancies between what this figure is depicting graphically and what is described in the text. Again, I'm not comfortable making the "two step" claims this figure purports given the data shared in Figure 5.

      Minor comments - Tables & Supplemental Figures

      Table 1

      • I think this table could be improved to more clearly illustrate which mRNAs were tested and what their mRNA localization patterns were (for example, gene name identifiers included, etc). Could the information that is depicted by gray shading instead be added as its own column? For example, have a column for "Observed mRNA localization"

      • Can you add distinct column names for the two columns that are labeled as "protein localization - group"

      • Can you also add which of these components are part of ASI v. ASII (as described in the introduction)?

      Supplemental Figure 1

      • It is hard to see that some of these spots are perinuclear. More information (membrane marker, 3D rendering, improved metrics) is required to support this claim.

      • What do these images look like over the entire embryo, not just in the zoomed in section?

      • sax-7 localization in S4 looks similar but a different localization claim is made.

      Supplemental Figure 2

      • Before adherens junctions even exist, dlg-1 goes to the membrane - this is really neat!

      Supplemental Figure 3

      • Technical question: If either 5 or 3 stack images are used, how does this work? Do they have different z-spacings? Or do the 5-stack images represent a wider Z-space?

      Supplemental Figure 4

      • Line #2 retains translation and keeps mRNA localization.

      • Totally optional, but consider showing both lines in the main figure to illustrate the two possibilities.

      • Materials and methods - how did they create the ATG mutations? Is it an array? - why does one translate, and one doesn't?

      Significance

      The authors discover that dlg-1, ajm-1, and hmr-1 mRNAs (among others) are locally translated, and this represents an important conceptual advance in the field as these are well studied proteins and important markers. This is the first study to illustrate translation-dependent mRNA localization in C. elegans, to my knowledge. The mechanisms transporting these mRNAs and their associated translational complexes to the membrane may represent a new pathway of mRNA transport and is therefore significant. The authors identify domains within DLG-1 responsible which is a nice advance. If they are unable to order the events of association as they claim in Figure 5 (and that I dispute), this doesn't detract from the impact of the paper.

      Other high-profile studies have recently been published that echo how mRNA localization to membranes can be observed for transcripts that encode membrane-associated proteins (Choaib et al., Dev Cell, 2020; Li et al., Cell Reports, 2021 (PMID: 33951426); and Reviewed in Hughes & Simmonds, Front Gen, 2019). These recent findings underscore the impact of Tocchini et al.'s paper. Similar studies have identified mRNAs localizing through translation dependent mechanisms to a variety of different regions of the cell (Sepulveda et al., eLife, 2018; Hirashima et al., Sci Reports, 2018; Safieddine, et al., Nat Comm, 2021; and reviewed in Ryder et al., JCB 2020). Given the timely nature of these findings and the recent interest in these concepts, a broad readership should be interested in this paper.

      My field of expertise is in mRNA localization imaging and quantification. I feel sufficiently qualified to evaluate the manuscript on all its merits.

    1. new digital tools may be transforming these methods and this basic work. Is the very computer upon which humanists rely so heavily still a tool, something akin to their medieval writing tablets? Or has it become an environment, its screen no longer a blank sheet on which to write but a window or portal into the entire digital realm, which acts upon the humanist as much as or more than she acts upon it? As such tools become even more integrated with the human body - Google Glass or the new Apple Watch, for example - will the distinction between tool and environment disappear even further? Might we be approaching the time when the distinction created by the term homo faber, the human as maker, outside and above the world of her creations, becomes meaningless in the world of the semantic web and 3D bacterial printing?

      I think that technology has developed to the point that it is both a tool and an environment. When I use it to write a paper, it is a tool, but it becomes an environment when I use it to interact with my classmates. Things like search engines are more ambiguous. They are a tool in how they help me achieve the goal of finding what I am looking for, but they immerse me in the environment created by websites and documents. Things like Google Street View are, without a doubt in my mind, both a tool and an environment. They help me find the place I was looking for and also immerse me in that environment so that I experience it visually.

    1. This is a risk we often take when working with children. Even if we are not conscious of it, we face this dilemma every day because of our own pre-conceived notions and theories. I believe that we can choose to offer topics for the children’s consideration as long as we are aware of

      I think that awareness is key for being able to teach within a context we as adults may be familiar with. In this particular experience there was almost a system of checks and balances to make sure the students' hypotheses and ideas stayed at the forefront of their research.

    1. Author Response:

      Reviewer #1:

      By sequencing a large number of SARS-CoV-2 samples in duplicate and to high depth, the authors provide a detailed picture of the mutational processes that shape within-host diversity and go on to generate diversity at the global level.

      1) Please add a description of the sequencing methods and how exactly the samples were replicated (two swabs? two RNA extractions? two RT-PCRs?). Have any limiting dilutions been done to quantify the relationship between RNA template input and CT values? Also, the read mapping/assembly pipeline needs to be described.

      Limiting dilutions were not performed; however, the association between Ct and discordance between replicates was explored. Samples with Ct>=24 were found to have considerable discordance between replicates, likely resulting from a low number of input RNA molecules. This is described in the first section of the results and illustrated in Figure 1 - figure supplement 3.

      We have now added additional sections to the methods to better describe the sequencing and mapping pipelines.

      Sequencing: A single swab was taken for each sample. Two libraries were then generated from two aliquots of each sample with separate reverse transcription (RT), PCR amplification and library preparation steps in order to evaluate the quality and reproducibility of within-host variant calls. The ARTIC protocol v3 was used for library preparation (a full description of the protocol used is available at dx.doi.org/10.17504/protocols.io.be3wjgpe).

      Alignment and variant calling: Alignment was performed using the ARTIC Illumina nextflow pipeline available from https://github.com/connor-lab/ncov2019-artic-nf...

      2) I find the way variants are reported rather unintuitive. Within-host variation is best characterized as minor variants relative to consensus (or first sample consensus when there are multiple samples). Reporting "Major Variants" along with minor variants conflates mutations accumulated prior to infection with diversity that arose within the host. The relative contributions of these two categories to the graphs in Fig 1 would for example be very different if this study was repeated now. Furthermore, it is unclear whether variants at 90% are reversions at 10% or within-host mutations at 90%. I'd suggest calling variants relative to the sample or patient consensus rather than relative to the reference sequence (as is the norm in most within-host sequencing studies of RNA viruses).

      We are grateful for this comment and have tried to improve and clarify the reporting of variants to align with previous literature.

      Our original classification intended to classify non-reference sites as fixed changes (VAF>95%) or within-host variants (which we called “minor variants”). While we chose 95% as a cutoff (which may have been confusing), the results are analogous with a 99% cutoff, as variants in this set essentially have VAF~100%, and nearly all are expected to have occurred in a previous host. Thus, the previous classification intended to cleanly separate inter-host (fixed) mutations from within-host mutations, to compare their patterns of selection and their mutation spectra.

      Following the reviewer’s request, we have modified this classification to better align with other studies of RNA viruses by defining the majority allele at a site as the “consensus”. We note that the results remain largely similar, since the vast majority of within-host variants identified had low VAFs (<<50%) with the majority/consensus allele most often corresponding to the reference (Wuhan) base.

      When considering recurrent mutations we now discuss the number of times variants are observed at each location within a sample. This avoids the issue of how variants are polarised.

      3) It is often unclear how numbers reported in the manuscript depend on various thresholds and parameters of the analysis pipeline. On page 2, for example, the median allele frequency will depend critically on the threshold used to call a variant, while the mean will depend on how variation is polarized. Why not report the mean of p(1-p) and show a cumulative histogram of iSNV frequencies on a log-log scale including. I think most of these analyses should be done without strict lower cut-offs or at least be done as a function of a cut-off. In contrast to analyses of cancer and bacteria, the mutation rates of the virus are on the same order of magnitude as errors introduced by RT-PCR and sequencing. Whether biological or technical variation dominates can be assessed straightforwardly, for example by plotting diversity at 1st, 2nd, and 3rd codon position as a function of the frequency threshold. See for example here:

      https://academic.oup.com/view-large/figure/134188362/vez007f3.tif

      There are more sophisticated ways of doing this, but simpler is better in my mind.
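
      The check suggested above can be sketched as follows. This is only an illustration under stated assumptions: the iSNV calls are a hypothetical list of (position, frequency) pairs, there is a single coding region starting at `cds_start`, and the function counts calls per codon position at each threshold rather than computing a diversity measure. None of the names come from the manuscript or its pipeline.

```python
from collections import Counter


def codon_position(genome_pos, cds_start):
    """Return 1, 2 or 3 for a 1-based genome position inside a coding
    region whose first base is at cds_start (hypothetical single CDS)."""
    return (genome_pos - cds_start) % 3 + 1


def counts_by_codon_position(isnvs, cds_start, thresholds):
    """For each frequency threshold, count iSNVs at codon positions 1, 2, 3.

    isnvs: iterable of (genome_pos, frequency) pairs (toy input format).
    Returns {threshold: (n_pos1, n_pos2, n_pos3)}.
    """
    result = {}
    for t in thresholds:
        c = Counter(
            codon_position(pos, cds_start)
            for pos, freq in isnvs
            if freq >= t
        )
        result[t] = (c[1], c[2], c[3])
    return result


# Toy data, not taken from the study.
toy_isnvs = [(1, 0.002), (2, 0.004), (3, 0.01), (6, 0.05), (9, 0.2), (4, 0.001)]
table = counts_by_codon_position(toy_isnvs, cds_start=1, thresholds=[0.001, 0.01, 0.1])
```

      If technical errors dominated at low thresholds, the three counts would be expected to be roughly equal there and to become skewed towards the third position only as the threshold rises.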

      It would be good to explore how estimates of the mean number of mutations per genome (0.72) depend on the cut-offs used. A more robust estimate might be 2\sum_i p_i(1-p_i) (where p_i is the iSNV frequency at site i) as a measure of the expected number of differences between two randomly chosen genomes. Ideally, the results of viral RNA produced from a plasmid would be subtracted from this.

      The reviewer raises a number of important points that we have tried to address and clarify.

      We think that the quality of our variant calls is supported by several lines of evidence, including: (1) the use of the ShearwaterML calling algorithm, which uses a base-specific overdispersed error model and calls mutations only when read support is statistically above background noise in other genomes, (2) we use two independent replicates from the RT step, (3) we provide several biological signals that cannot be expected to arise from errors, including the fact that the mutation spectra of low VAF iSNVs called in our study recapitulate that of consensus mutations and the clear signal of negative selection acting on iSNVs. We note that this dN/dS analysis is closely related to the suggestion by the reviewer of comparing the frequency of mutations at positions 1/2/3 of a codon.

      To address this comment in the manuscript, we have amended the text to include these arguments and we provide two new supplementary figures: (1) a figure of the frequency of mutations at the three codon positions, as requested by the reviewer, and (2) the mutation spectra of low VAF iSNVs, demonstrating the quality of the mutation calls. Similar to the finding in Dyrdak et al. (2019), and as expected from the dN/dS ratios, the distribution of variant sites is dominated by variants at the third position and not equally distributed as one might expect if errors were dominating the signal.

      We have amended the relevant section of the text to read:

      “To reliably detect within-host variants with the ARTIC protocol, we used ShearwaterML, an algorithm designed to detect variants at low allele frequencies. ShearwaterML uses a base-specific overdispersed error model and calls mutations only when read support is statistically above background noise in other genomes \cite{Gerstung2014-av,Martincorena2015-ef} (Methods). Two samples were excluded, as they had an unusually high number of low frequency variants unlikely to be of biological origin, leaving 1,179 samples for analysis, comprising 1,121 infected individuals of whom 49 had multiple samples. For all analyses we used only within-host variants that were statistically supported by both replicates (q-value<0.05 in at least one replicate and p-value<0.01 in the other, Methods). Within each sample, we classified variant calls as `consensus' if they were present in the majority of reads aligned to a position in the reference or as within-host variants otherwise. The allele frequency for each variant was taken as the frequency of the variant in the combined set of reads for both replicates.”

      ...

      “The use of replicates and a base-specific statistical error model for calling within-host diversity reduces the risk of erroneous calls at low allele frequencies. We noticed a slight increase in the number of within-host diversity calls for samples with high Ct values, which may be caused by a small number of errors or by the amplification of rare alleles and that could inflate within-host diversity estimates (Figure 1 - figure supplement 3) \cite{McCrone2016-se}. However, the overall quality of the within-host mutation calls is supported by a number of biological signals. As described in the following sections, this includes the fact that the mutational spectrum of within-host mutations closely resembles that of consensus mutations and inter-host differences and the observation of a clear signal of negative selection from within-host mutations, as demonstrated by dN/dS and by an enrichment of within-host mutations at third codon positions \cite{Dyrdak2019-xk} (Figure 1 - figure supplement 4).”
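
      The replicate-support rule and the consensus definition quoted in the first amended passage can be sketched as below. This is a hedged illustration only: the function names, argument layout and default cut-offs are assumptions chosen to mirror the quoted text (q-value<0.05 in at least one replicate, p-value<0.01 in the other, majority of reads defines consensus), and it is not the ShearwaterML interface.

```python
def supported_by_both(p1, q1, p2, q2, q_cut=0.05, p_cut=0.01):
    """True if one replicate passes the q-value cut-off and the other
    replicate passes the p-value cut-off (hypothetical helper)."""
    return (q1 < q_cut and p2 < p_cut) or (q2 < q_cut and p1 < p_cut)


def combined_frequency(alt1, depth1, alt2, depth2):
    """Allele frequency in the combined set of reads from both replicates."""
    return (alt1 + alt2) / (depth1 + depth2)


def classify(freq):
    """'consensus' if the allele is carried by the majority of reads,
    otherwise 'within-host'."""
    return "consensus" if freq > 0.5 else "within-host"


# Toy numbers, not taken from the study.
keep = supported_by_both(p1=0.001, q1=0.03, p2=0.005, q2=0.2)
freq = combined_frequency(alt1=10, depth1=1000, alt2=30, depth2=1000)
label = classify(freq)
```

      Pooling reads before computing the frequency weights each replicate by its depth, which matches the wording of the quoted passage.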

      Whilst we believe the remaining variant calls are reliable, we acknowledge that how variants are polarised could impact some of the summary statistics reported. To help improve this we have amended Figure 1 to include a cumulative histogram of within-host variant frequencies on a log-log scale as suggested by the reviewer. We have also included estimates of the mean value of sqrt(p(1-p)) (indicating an estimate of the standard deviation of within-host variants assuming a Bernoulli distribution). We have also replaced the estimates of the mean number of mutations per genome with the expected number of differences between two randomly chosen genomes. The amended Figure 1C now displays a histogram of the expected number of differences between two genomes for each sample rather than the mean number of mutations.
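
      The two summary statistics mentioned here can be written out in a few lines. This is a minimal sketch assuming a hypothetical list of per-site within-host variant frequencies for one sample, with each site treated as biallelic; the function names are illustrative and not from the authors' code.

```python
import math


def expected_pairwise_differences(freqs):
    """2 * sum_i p_i * (1 - p_i): expected number of differences between
    two genomes drawn at random from the same sample."""
    return 2 * sum(p * (1 - p) for p in freqs)


def mean_bernoulli_sd(freqs):
    """Mean of sqrt(p * (1 - p)) across variant sites."""
    return sum(math.sqrt(p * (1 - p)) for p in freqs) / len(freqs)


# Toy frequencies, not taken from the study.
toy_freqs = [0.5, 0.1]
pi_hat = expected_pairwise_differences(toy_freqs)
sd_hat = mean_bernoulli_sd(toy_freqs)
```

      Both quantities are unchanged when p is swapped for 1-p, which is why they are insensitive to how variants are polarised.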

      4) This paper provides an important baseline characterization of within-host diversity, while the patterns themselves are not extremely surprising. It is thus important that the data are provided in a form that facilitates reuse. It would be helpful to provide intermediate analysis results in addition to the raw reads in the SRA and the shearwater calls. I would like to see simple csv tables with the number of times A,C,G,U,- was observed at every position in the genomes for every sample. This would greatly facilitate the reuse of the data.

      We have now added raw count tables for each sample and each replicate to the GitHub repository. We have also archived this data using Zenodo to ensure it remains easily accessible.
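
      As an illustration of how such count tables might be reused, the sketch below derives the consensus base and the summed minor-allele frequency per position. The column layout (`pos,A,C,G,U,-`) is an assumption based on the reviewer's request, not the documented format of the files in the repository, and the function name is hypothetical.

```python
import csv
import io

ALLELES = ["A", "C", "G", "U", "-"]  # assumed column names


def minor_allele_frequencies(csv_text):
    """Map position -> (consensus base, fraction of reads not matching it).

    csv_text: contents of a count table with the assumed header
    'pos,A,C,G,U,-'. Positions with zero depth are skipped.
    """
    out = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts = {a: int(row[a]) for a in ALLELES}
        depth = sum(counts.values())
        if depth == 0:
            continue
        consensus = max(counts, key=counts.get)
        out[int(row["pos"])] = (consensus, (depth - counts[consensus]) / depth)
    return out


# Toy table, not taken from the study.
toy_table = "pos,A,C,G,U,-\n1,90,10,0,0,0\n2,0,0,0,0,0\n3,0,0,0,200,0\n"
freqs = minor_allele_frequencies(toy_table)
```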

      Reviewer #2:

      The paper by Tonkin-Hill and colleagues describes the analysis of intra-host variation across a large number of SARS-CoV-2 samples. The authors invested a lot of effort in replicate sequencing, allowing them to focus on more reliable data. They obtained several important insights regarding patterns of mutation and selection in this virus. Overall, this is an excellent paper that adds much novelty to our understanding of intra-host variation that develops during the time course of infection, its impact on transmission, and what we can or cannot learn on relationships between samples.

      We are grateful to the reviewer for their positive comments.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      We thank the Referees for their evaluation and their useful comments.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The MS from Bonaventure and colleagues used a CRISPR screen to identify novel IFN-induced antiviral effectors targeting HIV-1. One hit, the DEAD Box helicase DDX42, while not itself part of the IFN response, exerts a substantial inhibitory effect on HIV-1 replication when over expressed, and gives a several fold boost to viral replication when knocked down in cells. The effect of DDX42 KO or O/E is manifest at reverse transcription and PLA analysis suggests an interaction with incoming virions. Moreover, DDX42 appears to exert an inhibitory effect generally against retroviruses and retroelements, with evidence that it associates with viral/transposon RNA. The authors further show that DDX42 has antiviral activity against a range of (but not all) RNA viruses, with very striking phenotypes seen especially with Zika, CHIKV and SARS CoV2, with DDX42 associating with dsRNA in infected cells. These data suggest DDX42 is a constitutively expressed, broad-spectrum inhibitor of a range of mammalian RNA viruses. The manuscript is very well written, the data is of good quality and clearly DDX42 is having a general effect on viral replication. The results are novel, important and potentially of wide interest. Where the MS is somewhat lacking is understanding whether DDX42 has direct antiviral activity or is globally affecting cellular RNA metabolism. Some important areas for the authors to consider are:

      • DDX42 has a potential role in splicing and/or RNA metabolism so I think it would be important to see whether there is any clear global change in gene expression in knockout or knockdown cells vs control that might be suggestive of a generalized effect.

      Responses

      We thank the reviewer for this important question. Indeed, DDX42 didn’t impact the replication of 2 negative strand RNA viruses and this suggested that DDX42 didn’t have a global impact on the target cells, but we could not formally exclude a generalized effect. Therefore, we have performed RNA-seq analysis in order to evaluate the impact of DDX42 depletion (using 3 different siRNAs targeting DDX42 in comparison to a CTRL siRNA in U87-MG cells, and 2 different siRNAs in comparison to a CTRL siRNA in A549-ACE2 cells, in samples obtained in 3 independent silencing experiments). The RNA-seq data (See Supplemental File 1 and Figure S5) showed that only 63 genes are commonly differentially expressed by the 3 siRNAs targeting DDX42 in U87-MG cells and only 23 of these genes were also found differentially expressed in A549-ACE2 cells depleted for DDX42. Importantly, the identity of these genes could not explain the observed antiviral phenotypes. These data are in favor of the absence of a generalized effect on the target cells, which could have explained the antiviral phenotypes of the sensitive viruses.
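
      The overlap logic described in this response (genes differentially expressed with every siRNA in one cell line, then shared with a second cell line) amounts to nested set intersections. The sketch below assumes hypothetical gene labels and a hypothetical helper name; it does not reproduce the authors' RNA-seq analysis or its thresholds.

```python
def common_hits(*gene_lists):
    """Genes present in every differential-expression list supplied."""
    sets = [set(genes) for genes in gene_lists]
    return set.intersection(*sets)


# Placeholder gene labels, not taken from Supplemental File 1.
u87_sirna_1 = ["g1", "g2", "g3", "g4"]
u87_sirna_2 = ["g2", "g3", "g5"]
u87_sirna_3 = ["g2", "g3", "g9"]
a549_common = ["g3", "g7"]

u87_common = common_hits(u87_sirna_1, u87_sirna_2, u87_sirna_3)
shared_across_lines = common_hits(u87_common, a549_common)
```

      Requiring agreement across independent siRNAs before comparing cell lines filters out changes caused by off-target effects of any single siRNA.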

      • The HIV experiments in primary cells are only one round at present. Does the DDX42 knockdown enhance viral replication in multiround? Does it lead to more viral PAMPs for PRRs to induce IFN?

      Responses

      We agree with the reviewer that it would have been very informative to measure the impact of DDX42 knockdown in multiround infections in primary T cells. However, we tried several times to do this experiment (with primary T cells from several donors) and we were not successful: indeed, DDX42 KO appeared to slow down cell division, which could be taken into account for a short, one-cycle experiment (i.e. 24 h) 3 days post-Cas9/sgRNA electroporation by adjusting the number of cells at the time of infection. However, DDX42 KO appeared quite toxic in longer experiments, with cells ceasing to grow.

      The question regarding the generation of more viral PAMPs for PRRs to induce IFN is also very interesting. We know from published work (including ours) that primary T cells don’t normally produce IFN following HIV-1 infection (see for instance Bauby and Ward et al, mBio 2021). However, one can indeed hypothesize that as more viral DNAs are produced in the absence of DDX42, perhaps the primary T cells could detect them and produce IFN. To address this question in primary T cells, we would have needed to be able to perform multiround infections, which was not possible, as mentioned above. Moreover, we could not test this hypothesis in the cell lines that we used, such as U87-MG/CD4/CXCR4 cells, as they are unable to produce IFN following HIV-1 infection.

      • More could be made mechanistically of the lack of sensitivity of Flu and VSV to DDX42. In particular showing whether or not DDX42 interacts with the RNA of the insensitive virus, or whether DDX42/virus or dsRNA interactions by PLA occur with Flu would highlight the relevance of these observations to the antiviral mechanism.

      Responses

      This is an excellent remark. We have now performed RNA immunoprecipitation experiments using 2 viruses targeted by DDX42 (CHIKV and SARS-CoV-2) and 1 virus that is insensitive to DDX42 (IAV) (See New Figure 4J-L): whereas CHIKV and SARS-CoV-2 RNAs could be specifically pulled-down with DDX42 immunoprecipitation, this was not the case for IAV RNA. This strongly argues for a direct mechanism of action of DDX42 helicase on viral RNAs.

      Reviewer #1 (Significance (Required)):


      The role of helicases in host defence is of wide interest and importance. This has the potential to be a very important study that deserves a wide audience. However in my opinion it needs some further mechanistic insight along the lines I have suggested.

      Responses

      As mentioned above, we have now added important data: First, DDX42 is able to interact with RNAs from targeted viruses (and not from an insensitive virus); Second, we have checked that DDX42 didn’t have a substantial impact on the cell transcriptome. Taken together, these data are clearly in favour of a direct mode of action of DDX42.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this brief report, the authors use a CRISPR screening approach to identify cellular proteins that limit HIV infection. The screen itself is elegantly designed and most of the top hits are components of the interferon signaling pathway that would be expected to emerge from such a screen, thus providing confidence in the results. The authors followed up on DDX42 as a new hit identified in their screen and confirmed that targeting DDX42 with distinct guide RNAs resulted in increased HIV infection in at least 3 cell lines. Conversely, DDX42 overexpression inhibited infection. They also confirmed a role for DDX42 in inhibiting HIV infection in primary macrophages and CD4 T cells using siRNA and CRISPR KO strategies, respectively. They also demonstrate that DDX42 inhibits several other divergent lentiviruses as well as Chikungunya virus and SARS-CoV-2, but not influenza virus. These data convincingly show that DDX42 plays a role in inhibiting many lentivirus and positive sense RNA virus infections. Using PCR assays for reverse transcription products they conclude that DDX42 inhibits an early process in the HIV life cycle occurring after virus entry, though the statistical significance of these differences is not clear. They further use proximity ligation assays to suggest that DDX42 is in proximity to HIV-1 and SARS-CoV-2 replication complexes. Mechanistically, these data are largely unsatisfying as they do not provide specific insight into how DDX42 so broadly inhibits virus replication. Overall, although the manuscript presents a significant advance, it also has some weaknesses as listed below.

      1. Statistical analysis is not included in any of the figures.

      Response

      Statistical analyses have now been included.

      Many of the figure legends do not state how many independent biological replicates the figures are based on.

      Response

      The number of biological replicates for each panel is stated at the very end of each figure legend.

      Detailed mechanistic understanding of DDX42 effects on virus replication is not provided by the manuscript.


      Response

      As mentioned in response to Reviewer 1, we have now added data showing that DDX42 could interact with RNAs from targeted viruses but not from an insensitive virus, arguing for a direct antiviral mode of action of this DEAD-box helicase.

      Reviewer #2 (Significance (Required)):

      DDX42 is a new antiviral protein identified and confirmed in this manuscript. It was also identified as one of many hits in a genome wide CRISPR screen for cellular proteins that regulate SARS-CoV-2 infections, but was not followed up. Thus, the identification and confirmation of DDX42 antiviral activity is highly significant for both the HIV and SARS-CoV-2 fields. This high significance may compensate to some extent for the lack of mechanistic insight contained in this initial report.

      **Referees Cross-commenting**

      I find the comments of the other reviewers to be fair and reasonable, and I concur that the work is overall important and novel. It seems that reviewers generally agreed that some additional mechanistic insights would be desirable for publication in a high impact journal. Reviewer 1 makes some good suggestions in this regard. As for mouse experiments, I would reserve these for a follow up manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):


      In this manuscript, Bonaventure et al report the results of a screen to identify cellular inhibitors of HIV-1 infection in IFN-treated cells. They identify DDX42 as such a factor though, unexpectedly, DDX42 did not turn out to be an ISG. Strikingly, DDX42 turns out to inhibit a wide range of retroviruses as well as retrotransposons and + sense, but not - sense, RNA viruses among which SARS-CoV2 turns out to be especially sensitive to DDX42, with siRNAs specific for SARS-CoV2 DDX42 increasing viral RNA expression by a startling 3 orders of magnitude, compared to only a 2-5 fold positive effect with HIV-1.

      Response

      We agree with the reviewer that DDX42’s impact on HIV-1 may appear as somewhat modest, however, it is highly reproducible across cell lines and primary cells and, more importantly, it is observed upon depletion of the endogenous protein (either by KO or silencing) in target cells that are highly permissive to viral replication, such as activated primary CD4+ T cells. We therefore believe that these findings, combined with the findings that other positive-strand RNA viruses are targeted, are of high interest.

      Reviewer #3 (Significance (Required)):


      I found this paper generally convincing and technically sound though the emphasis was odd and clearly driven more by the history of how this work was done than by the actual results obtained. Specifically, the emphasis is on HIV-1 yet the most interesting data are the dramatic effects seen with Chikungunya and SARS2. If I was writing this paper, I would delete figure 4 and focus this paper entirely on retroviruses and retrotransposons. In that form, I think it would be competitive at PLoS Pathogens or perhaps EMBO Journal. The RNA virus work shown in figure 4 could then be figure 1 of a new, high impact, paper looking at the mechanism of action of DDX42 as an inhibitor of + sense, but not - sense, viral gene expression. Though Wei et al do mention DDX42 in their SARS-CoV2 screening paper this is certainly not a major theme of that paper so I don't think that would be a problem.

      Responses

      We thank the reviewer for this comment. We had hesitated to present the manuscript as suggested by the reviewer (i.e. focusing only on HIV-1, retroviruses and retroelements) and prepare a second manuscript with the remaining data. We’ve finally decided against it, as we believe that showing a broad antiviral effect of DDX42 on +strand RNA viruses increases the impact of our findings.

      On another note, a conditional DDX42 KO mouse has been generated by the Wellcome Trust Sanger Institute and it would greatly improve this manuscript if they could show an in vivo result similar to figure 3F using MLV.

      Responses

      We thank the reviewer for this information. We completely agree that in vivo work would be a massive plus and we plan to explore this in the future, but not at this stage as it would require specific funding and resources.

    1. Author Response:

      Reviewer #1:

      This study focuses on how the vmPFC supports delay discounting. The authors tested patients with vmPFC lesions (N=12) and healthy controls (N=41) on a delay discounting (DD) task with two additional conditions: (1) reward magnitude and (2) cues that should evoke episodic future thinking (EFT).

      The authors replicate their previous finding that patients with vmPFC lesions show steeper DD, and report two novel findings: (1) DD in patients is insensitive to reward magnitude, suggesting that vmPFC is critical for reward magnitude to modulate DD; (2) vmPFC patients show normal effects of EFT cues on DD, such that all subjects discounted less in the presence of cues that promote episodic future thinking. These findings have important implications for how vmPFC contributes to delay discounting, as they suggest that vmPFC is not necessary for prospective thinking to affect the evaluation of future rewards.

      1) A potential issue with the EFT finding is that it rests on accepting the null hypothesis of no group differences. However, there are reasons to assume this is not a trivial null result due to a lack of statistical power. Specifically, there is a significant effect of EFT within the vmPFC patient group and there is a significant group difference for the effect of reward magnitude. Assuming comparable power to detect effects of EFT and reward magnitude, it seems unlikely that the non-significant EFT effect is simply a lack of power. In any case, this caveat has to be considered when interpreting the effect.

      We have added a discussion of this caveat on p. 10, which reads: “Before discussing this finding further, we note that it rests on accepting the null hypothesis of no group differences in the EFT effect on DD between vmPFC patients and controls. It is unlikely, however, that this null finding simply reflects a lack of statistical power, for example due to a small sample size. First, the null effect on group differences indeed reflects a significant within-participant effect, with greater regard for future amounts in the EFT compared to the Standard condition in vmPFC patients. Second, together with the preservation of the EFT effect, we found a significant reduction of the magnitude effect in the same vmPFC patient sample. Bayesian analyses confirmed greater evidence in favour of the null compared to the alternative hypothesis regarding group differences in the EFT effect on DD.”

      2) It is somewhat surprising that the authors had such a strong prediction about the absence of group differences for the EFT effect. Based on previous work (Bertossi et al., 2016a, b), one could expect a smaller EFT effect in the VMPFC group. The authors appear to put much weight on the results by Ghosh et al. 2014, which suggest that vmPFC is critical for schema reinstatement. The rationale for this strong prediction is not very clear from the introduction.

      We have now reframed our hypotheses as suggested by the reviewers and the editors. In the Introduction, we now make only the hypothesis of a reduced EFT effect on DD in vmPFC patients, which is based on previous evidence of an EFT impairment in vmPFC patients. We present the hypothesis that vmPFC is critical for schema instantiation only in the Discussion, as an explanation of the null finding on group differences on the EFT effect.

      Thus, p. 5 now reads: “Concerning prospection, previous studies have observed an EFT effect on DD, such that people discount future rewards less steeply if cued to imagine personal future events during intertemporal choice (Peters and Büchel, 2010; Benoit et al., 2011). Considering that vmPFC is implicated in prospection (Schacter et al., 2012) and that vmPFC patients are impaired in EFT (Bertossi et al., 2016a,b; Bertossi et al., 2017), vmPFC patients' DD should remain steep even when EFT cues are provided, because patients may nevertheless fail to construct the vivid future events that might be needed to counteract DD. Thus, we predict a reduced EFT effect on DD in vmPFC patients compared to healthy controls.”

      Reviewer #2:

      Ciaramelli et al. address a timely and theoretically important issue with respect to the functional role of the vmPFC in decision-making more generally, and temporal discounting in particular. Strong points of the paper include 1) a theoretically important research question and 2) much-needed lesion data on two important behavioral effects in temporal discounting: the magnitude effect, and a modulation of discounting via episodic future thinking. Weaker points of the paper include 1) lack of clarity for a number of methodological issues (group comparisons & control group for the AI data, inconsistency analysis) and 2) many remaining open questions with respect to how vmPFC patients might have utilized the EFT cues, and whether different processes were at work compared to controls.

      We thank the reviewer for this positive evaluation of the paper and address the reviewer’s comments below.

      Major points:

      1) The authors note that their interpretation of the preserved EFT effects in the vmPFC patients in terms of e.g. semantic processing remains speculative, but is supported by the finding of intact external details production following vmPFC damage in earlier studies. But was this also the case in the present data set? This remains unclear, because for the AI data, only z-scores relative to some earlier control group (Kwan et al. 2015) are reported (Table 1 and Supplement p. 30). Was this control group matched to the patients? And since the referenced Kwan et al. (2015) paper reports only on six patients (presumably the patients from the Canada site?) - what about the patients from the Italian site, which control group were their AI data compared to?

      The Crovitz data of the Canadian patients are unpublished (the Kwan et al., 2015 paper is not about vmPFC patients, but about 6 MTL patients). We compared them to a sample of 18 age-matched healthy controls, a subset of those included in Kwan et al. (2015). The 4 Italian patients were part of the vmPFC sample tested on EFT (and episodic memory) in Bertossi et al. (2016). We compared their performance with that of the 11 healthy controls from the same study who were age-matched to the patients.

      This is clarified on p. 17, which reads: “The results of the Italian patients (a subset of those included in Bertossi et al. 2016b) were contrasted with those of the 11 healthy controls from the same study (all males; Bertossi et al., 2016b) who were age-matched to the patients (vmPFC patients: M = 47.75, SD = 5.25; healthy controls: M = 41.63, SD = 11.89, t13 = -0.97, p = 0.34). The results of the Canadian patients (unpublished) were contrasted with those of 18 healthy controls (10 males; a subset of those included in Kwan et al., 2015) age-matched to the patients (vmPFC patients: M = 61.00, SD = 9.83; healthy controls: M = 67.94, SD = 13.57, t22 = 1.15, p = 0.26).”
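
      The age-matching statistics quoted above can be cross-checked from the summary values alone. The sketch below is a reader's check, not the authors' analysis script: it assumes a pooled-variance (Student's) t-test, takes the difference as controls minus patients (which is what the signs of the reported t values suggest), and assumes 6 Canadian patients, a number inferred from the reported 22 degrees of freedom with 18 controls. The function name `pooled_t` is illustrative.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's t (group a minus group b), assuming equal variances.

    Returns the t statistic and its degrees of freedom.
    """
    df = n_a + n_b - 2
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Controls first, patients second (assumed ordering; see the note above).
# Italian sample: 11 controls, 4 patients.
t_italy, df_italy = pooled_t(41.63, 11.89, 11, 47.75, 5.25, 4)
# Canadian sample: 18 controls, 6 patients (patient count inferred from df = 22).
t_canada, df_canada = pooled_t(67.94, 13.57, 18, 61.00, 9.83, 6)
```

      Under these assumptions both recomputed statistics agree with the reported values to within rounding, which supports reading the test as a standard two-sample t-test on age.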

      2) Directly related to my previous point: The methods section states that external details were in the normal range in the vmPFC group (mean z-score for EFT = -.73) but from Table 1 we can see that 8/10 patients in fact exhibit a negative z-score. This suggests that a direct group comparison of the external details scores would very likely reveal a significant group difference. Generally, it would help to report to actual control data here, not just the z-scores, and report the respective group comparisons.

      We now report the Crovitz data in Table 2 and have run two ANOVAs on internal and external details separately in vmPFC patients and controls tested in Italy and in Canada. As the two ANOVAs show, both patient groups produced fewer internal (episodic) details than healthy controls, but a similar number of external details, during EFT (as well as episodic remembering). Therefore, the previously reported EFT problems for internal (but not external) details in vmPFC patients also apply to the patients tested here.

      P. 17 now reads: “As for the Italian sample, an ANOVA on the details produced with Group (vmPFC patients, healthy controls), Time (Past, Future), and Detail (internal, external) as factors showed a significant effect of Time (F1,13 = 14.66, p = 0.002, partial η2 = 0.53), such that all participants produced more details for past than future events (18.19 vs. 15.37). There were also significant effects of Group (F1,13 = 6.16, p = 0.02, partial η2 = 0.32) and Detail (F1,13 = 9.14, p = 0.009, partial η2 = 0.41), qualified by a Group x Detail interaction (F1,13 = 8.99, p = 0.01, partial η2 = 0.40). Post hoc Fisher tests showed that vmPFC patients produced fewer internal details (11.45 vs. 25.51; p = 0.004) but a similar number of external details than controls (11.39 vs. 11.96; p = 0.89). No other effect was significant (p > 0.31 in all cases). The same ANOVA on the Canadian sample revealed an effect of Group (F1,22 = 17.76, p = 0.0003, partial η2 = 0.44), qualified by a significant Group x Detail interaction (F1,22 = 4.72, p = 0.04, partial η2 = 0.18), again indicating that vmPFC patients produced fewer internal details (10.63 vs. 31.78; p = 0.0003) but a similar number of external details than controls (16.79 vs. 25.65; p = 0.09). No other effect was significant (p > 0.32 in all cases).”
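
      For readers who want to sanity-check the effect sizes quoted above, partial η2 can be recovered from an F statistic and its degrees of freedom using the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The sketch below is a reader's check under that identity, not the authors' code; the function name is illustrative. Because the published F values are rounded, a recomputed value may differ from the reported one in the last decimal place.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom taken from the quoted passage.
time_effect_italy = partial_eta_squared(14.66, 1, 13)
group_effect_italy = partial_eta_squared(6.16, 1, 13)
interaction_canada = partial_eta_squared(4.72, 1, 22)
```

      By construction the result always lies between 0 and 1, which gives a quick plausibility check on any reported partial η2.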

      3) The description of the inconsistency analysis was somewhat unclear. The authors use the procedure suggested by Johnson & Bickel (2008), which makes sense, given the overall analytical approach that focuses on the analysis of indifference points. However, this procedure is based on a comparison of adjacent indifference points. In contrast, the authors are referring to the number of inconsistent choices - this is either a typo, or a different procedure. I think the former, because the reported absolute numbers (e.g. means around 1) and the single subject plots in the supplement appear to reflect the number of inconsistent ID points rather than choices. If this is the case, I disagree with the statement that the "mean number of inconsistent choices was very low" (p. 10) - as this probably reflects the mean number of inconsistent indifference points and not choices, about 1 out of 6 ID points was inconsistent in the vmPFC group, which is a lot.

      We apologize for the lack of clarity. Yes, we are referring to indifference points (as in our previous study; Sellitto et al., 2010), not single choices. Inconsistent preferences are defined as “data points in which the subjective value of a future outcome (amount = R) at a given delay (R2) was greater than that at the preceding delay (R1) by more than 10% of the amount of the future outcome (i.e., R2 > R1 + R/10, as in Sellitto et al., 2010).” To avoid confusion, we have now corrected the expression ‘inconsistent choice’ to ‘inconsistent preference’ throughout the paper, and have eliminated the claim about the low number of inconsistent choices in vmPFC patients.
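
      The criterion quoted above (R2 > R1 + R/10, applied to adjacent indifference points) can be written out as a short sketch. This is an illustration of the stated rule only; the function name and the example values are hypothetical and are not taken from the study's data or code.

```python
def count_inconsistent_preferences(indifference_points, amount):
    """Count adjacent-delay violations of the rule R2 > R1 + R/10.

    indifference_points: subjective values ordered from shortest to longest delay.
    amount: the amount of the future outcome (R).
    """
    threshold = amount / 10  # 10% of the future amount
    pairs = zip(indifference_points, indifference_points[1:])
    return sum(1 for earlier, later in pairs if later > earlier + threshold)


# Hypothetical example: six indifference points for a future amount of 100.
# Only the step from 70 to 85 exceeds the 10-unit tolerance.
example_points = [90, 70, 85, 40, 42, 20]
n_inconsistent = count_inconsistent_preferences(example_points, amount=100)
```

      Note that the count is over indifference points, not single choices, so with six delays the maximum possible value is five.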

      4) The EFT cues are suggested to help vmPFC patients to "circumvent their initiation problems" (p. 12) but I am not sure I follow this logic. First, the AI procedure typically entails external cues as well, and here vmPFC patients showed impairments (Table 1, but see my point 1 above). Second, some of the cited papers (e.g. Verfaellie et al., 2019) also used specific event cues, and still observed reduced internal details production in vmPFC patients.

      The AI (Crovitz) procedure uses external cues but typically these are words that are not particularly meaningful to the participants (indeed, they are the same for all participants). Other studies have used specific event cues (e.g., “Imagine attending a Fourth of July cookout a few years from now”; Verfaellie et al., 2019) but, again, these cues are the same for all participants. We used personalized cues, which were events that participants (1) had selected themselves, and (2) had already planned or found plausible in their future, and which were therefore presumably the most self-relevant and familiar to the participants, including patients. We think that these events may have been effective in activating self- and event-relevant schemata. We clarify this point on p. 11, which reads: “We propose, therefore, that subject-specific event cues, which were self-relevant and familiar to the participants because they had been selected by participants themselves, and were already planned or were plausible in their future, acted as external triggers of self- and situation-relevant schemata, helping to circumvent vmPFC patients’ EFT initiation problems. Their intact MTLs allowed them to construct episodic future events, which were then integrated into intertemporal choice, reducing DD.” As we note on p. 14, indeed, vmPFC patients are capable of imagining detailed experiences if they are guided to choose for themselves a specific moment from an extended future event to narrate in detail (Kurczek et al., 2015). Of course, we agree with the Reviewer’s point below that this interpretation is speculative at this point.

      5) One shortcoming with the paper is that no data are available that could inform how vmPFC patients might have utilized the EFT cues, and whether the processes at work might have differed from those in controls. Many points mentioned in the discussion (self-referential processing, semantic processing, activation of schemata, self-initiation vs. external cueing etc.) thus necessarily remain conjecture.

      We agree with the Reviewer, and we admit in several parts of the Discussion that this interpretation is speculative at this point. However, the interpretation that we offer seems the most plausible to us at this time, considering what we know about the role of the vmPFC (vs. the MTL) in event construction and the absence of the EFT effect on DD in MTL patients. We also propose an alternative interpretation, but the pattern of findings on the EFT effect on DD makes it less likely to us. On p. 12, we state, “An alternative interpretation of the DD modulation is that EFT cues simply shifted attention towards the future, or conferred a positive valence to it, as we encouraged positively valenced EFT. If so, however, one should consistently observe an EFT-induced benefit on DD also in patients with MTL lesions, but this is not the case (Kwan et al., 2015; Palombo et al., 2015).”

      Reviewer #3:

      In this manuscript, Ciaramelli et al. examined the decision-making behavior of 12 patients with vmPFC damage in a delay discounting task. The authors carried out two manipulations in this task: 1. They presented participants with small and large offers for both the immediate and delayed reward (magnitude manipulation), 2. They prefaced decisions with a cue prompting participants to vividly imagine an event in their future that was expected to occur at the same delay as the proposed larger offer (episodic future thinking (EFT) manipulation). Compared to age and education matched healthy controls, patients with vmPFC damage showed steeper discounting of delayed rewards, particularly when the amounts offered were large (reduced effect of magnitude). However, like controls, vmPFC damaged patients displayed shallower discounting of delayed rewards following the EFT manipulation.

      The manuscript is clear and concise in its presentation of the results, while still providing a detailed description of the behavior of these patients. This paper is also a good example of how pooling participants from multiple institutions can increase statistical power in a study of patients with focal brain damage targeting a fairly specific cognitive question. The positive results of the study mostly replicate previous findings. While the null result for the EFT manipulation is novel, the finding is hard to interpret. The authors state that they predicted that the EFT manipulation would not change discounting behavior in vmPFC damaged patients a priori despite the deficits of these patients in EFT in previous papers, which are also replicated here. However, I do not know why the authors would design their task in such a way to test for a null result. It is also not clear if this null result is observed for the reason proposed by the authors (that the EFT cues externally activate this process), or if this result is null for some other reason that is not accounted for here. As the authors do not provide a direct test for their hypothesized rationale for predicting this null result, the findings are hard to interpret.

      We agree with the reviewer’s and editor’s point that this paradigm does not allow testing whether subject-specific, personally relevant cues, such as those we used, are indeed effective in externally initiating EFT in vmPFC patients. Therefore, we concur that, for the sake of clarity, this is best presented only as speculative discussion of the preserved EFT effect on DD in vmPFC patients. In the Introduction, therefore, we now formulate only the hypothesis based on previous evidence of impaired EFT in vmPFC patients (e.g., Bertossi et al., 2016a,b, Verfaellie et al., 2019), which would lead to the prediction of a reduced EFT effect in vmPFC patients. We present the hypothesis that vmPFC is critical for schema instantiation only in the Discussion, as an explanation of the null finding on group differences on the EFT effect.

      P. 5 now reads: “Concerning prospection, previous studies have observed an EFT effect on DD, such that people discount future rewards less steeply if cued to imagine personal future events during intertemporal choice (Peters and Büchel, 2010; Benoit et al., 2011). Considering that vmPFC is implicated in prospection (Schacter et al., 2012) and that vmPFC patients are impaired in EFT (Bertossi et al., 2016a,b; Bertossi et al., 2017), vmPFC patients' DD should remain steep even when EFT cues are provided, because patients may nevertheless fail to construct the vivid future events that might be needed to counteract DD. Thus, we predict a reduced EFT effect on DD in vmPFC patients compared to healthy controls.”

      Overall, this manuscript makes a relatively modest contribution to our knowledge about the function of vmPFC during inter-temporal choice. It bolsters previous claims about how vmPFC damage impacts delay discounting and EFT, while not revealing new information about how vmPFC specifically contributes to the processes involved in these behaviors and why damage to this region impacts intertemporal choice in this way.

      We concur with the reviewer that our findings confirm previous evidence that vmPFC is necessary for balanced DD and for EFT. However, we think that our finding of a complete abolishment of the magnitude effect together with a complete preservation of the EFT effect on DD in vmPFC patients constitutes a remarkable theoretical advancement on the role of vmPFC in intertemporal choice. Indeed, it shows that during intertemporal choice vmPFC is more prominently implicated in reward valuation than in prospection. This finding is important for current theories of intertemporal choice, and is surprising considering previous demonstrations of impaired EFT in vmPFC patients (a finding that was replicated in the current study), and therefore has important implications also for theories relating to the role of vmPFC in EFT. Finally, we note that the paper focuses on one important facet of impulsivity following damage to the vmPFC in humans: steep DD. Our findings, therefore, may inform the clinical management of impulsivity in patients with vmPFC damage or dysfunction, delineating the contextual manipulations that are or are not expected to push the reach of patients' choice into the future.

    1. Author Response:

      Reviewer #1:

      This manuscript shows cell to cell variability in the relative levels of Sox2 and Brachyury (Bra) expression by individual cells within the region of the epiblast containing axial progenitors (the progenitor zone, PZ). Accordingly, some cells express high Bra and low Sox2 levels, others high Sox2 and low Bra and a third group expressing equivalent levels of both transcription factors. They then show that by experimentally promoting high Sox2 expression cells enter neural tube (NT) fates, whereas high Bra brings cells in the progenitor zone to enter the presomitic mesoderm (PSM). The authors then complement these experiments with evaluation of cell movements within the PZ, NT and PSM to show that cells in the NT are much less motile than those in the PZ and PSM. These data led the authors to propose a fundamental role for Sox2/Bra heterogeneity to maintain a pool of resident progenitors and that it is the high cell motility promoted by high Bra levels what pushes cells to join the PSM, whereas high Sox2 levels inhibit cell movement forcing cells to take NT fates. To validate their hypothesis, the authors generated a mathematical model to show that those expression and motility characteristics can indeed lead to axial extension generating NT and PSM derivatives in the proper positions, while keeping a PZ at the posterior end.

      Some specific comments on the manuscript are specified below.

      1) Although the description of cells within the PZ containing different Sox2 and Bra expression ratios is more explicit and quantitative in the present manuscript, this has already been previously reported by different methods including immunofluorescence (e.g., Wymeersch et al, 2016). Similarly, that breaking the Sox2/Bra balance towards high Sox2 or Bra is an essential step to bring the progenitors towards NT or PSM fates has also been previously shown in different ways. These observations are, therefore, not totally new. The novel contribution of this paper is the authors' interpretation that "heterogeneity among a population of progenitor cells is fundamental to maintain a pool of resident progenitors". In this work, however, this conclusion is only supported by their mathematical simulation, as the experiments described in this manuscript are not aimed at homogenizing Sox2/Bra expression levels in the progenitor cells (meaning keeping the double positive feature) but, instead, forcing the progenitors to express Sox2 or Bra alone, which permits evaluation of differentiation routes rather than how to maintain the resident progenitor pool. Interestingly, their alternative mathematical model in which the relative Sox2/Bra levels follow an anterior-posterior gradient (which is actually a feature observed in the embryo) was also successful in producing an extending embryo. This model was not favored by the authors (but see my comment below). According to this model, the progenitor zone could be maintained by a cell pool containing equivalent Sox2/Bra levels; when this balance is broken cells eventually enter NT or PSM routes. Therefore, while expression heterogeneity can be observed in the PZ, I am not sure that the work shown in this manuscript is conclusive enough to claim an essential role of such heterogeneity to maintain the progenitor pool.

      We acknowledge that regional heterogeneity of Sox2 and Bra has been described in the PZ, and we have made sure to cite the relevant literature, including Wymeersch et al., 2016 and Kawachi, 2020. Although these papers described different levels of Sox2 and Bra in the PZ, they did not clearly report or quantify the fact that directly neighboring cells have very different levels of Sox2 and Bra; we therefore believe that our description of a “random-like” pattern of heterogeneity constitutes a real novelty. Along the same lines, we are aware of the several studies independently showing that gain or loss of function of Sox2 or Bra can act on the progenitor decision to join either the NT or the PSM (these references are cited l.70, l.72). However, we believe that our study is the first to test systematically both overexpression and downregulation of Sox2 and Bra on progenitor distribution in the same biological system and to link Sox2/Bra functions to cellular motility.

      Testing the requirement of spatial cell-to-cell heterogeneity for maintaining a pool of progenitors is experimentally challenging: even if we were able to homogenize Sox2 and Bra expression, we would have to do it in all progenitors, which is not, so far, technically possible using the bird embryo as a model system. We are well aware of these limitations and have toned down claims on the essential role of heterogeneity in maintaining the progenitor pool. In particular, we have changed the abstract (we removed the last sentence stating that heterogeneity is fundamental to maintain a pool of resident progenitors), as well as the end of the introduction (we removed “while progenitors expressing intermediate/equivalent levels of the two proteins tend to remain resident”). We have qualified our model in the Discussion by saying that cells with comparable levels of Sox2 and Bra “could” remain resident (L.370).

      To better understand the role of cell-to-cell spatial heterogeneity, we have developed a new mathematical model (Figure 5) which integrates both gradient and random heterogeneity in Sox2/Bra values within the PZ and thus fits our biological results better. In the new version of the manuscript, we compared this model with a model in which the PZ is fully gradient-like and a second one in which it is completely random. These comparisons allow us to describe better what properties random and patterned heterogeneities could bring to the system (Figure 6).

      2) The other main novelty of this manuscript is the idea that differences in cell motility derived from their Sox2 or Bra contents are a major force driving the generation of NT and PSM from the progenitors in the PZ. While there are clear differences between cell motility in the NT and the other two regions, the differences between what is observed in the PSM and PZ is not that high (actually, from the data presented it is not clear that such differences actually exist). However, independently of motility differences, there is no experimental evidence demonstrating that the essential driver of the cell fate choices is motility itself. Differences in cell motility could be just one of the results of more fundamental (and causal) changes in cell characteristics triggered by Sox2 or Bra activity. Indeed, NT and PSM cells are different in many different ways, including adhesion properties, which are normally a major determinant of tissue morphogenesis. Cell motility could, therefore, be one of the factors but it is not clear that it plays the essential role proposed by the authors. (see also next comment).

      Cell motility distributions in the PZ are slightly different from those of the PSM, since slower cells were found in the PZ. We agree with the reviewer that this difference might be difficult to see because the average motilities of the two tissues are very similar (Figure 3 and Figure 3-figure Supplement 1). To reveal this difference more clearly, we have used a reporter gene for Sox2 and analyzed progenitor motility by time-lapse imaging. We have specifically tracked GFP-positive cells (reporter gene for Sox2) in the PZ and compared them to cells that do not express GFP. The result is that Sox2-high progenitors are globally slower than other progenitors, clearly revealing heterogeneity in cell movements within the PZ and its relation to Sox2 expression (L.225-232, Figure 3-figure Supplement 1B, video 2).

      We agree that there is no experimental evidence that motility itself is the driver of the cell fate choices. To test whether the effect on cell motility takes place downstream of differentiation events, we have analyzed the expression of markers for mesodermal and neural fate (Msgn1 and Pax6) 7 hrs after overexpression of Sox2 and Bra. While Sox2 or Bra overexpression triggers changes in cell motility in this short time window, we did not observe any changes in Msgn1 and Pax6 expression (L.267-274, Figure 4-figure Supplement 2), arguing that the effect on motility is an early consequence of Bra and Sox2 misexpression. Nevertheless, we are aware that this is not a strict demonstration that the effects on fate come from differential motility only. We have therefore toned down our arguments and changed the title of the manuscript (“...guides destiny by controlling their motility” has been replaced by “...guides motility and destiny”).

      The effects on cell motility we observe could be a consequence of an effect of Sox2 and Bra on adhesion, as suggested by the reviewer; this is an interesting possibility that we cannot, and do not want to, rule out. The effect on cell adhesion is taken into account in our model and we discuss this hypothesis in the new version of the manuscript (L. 456-459). Identifying the mechanisms underlying the effects of Sox2 and Bra on cell motility is an extremely interesting project that we want to pursue, but we consider that this aspect goes beyond the scope of the current manuscript.

      3) The authors developed a mathematical model to confirm their hypothesis that Sox2/Bra expression diversity combined with different motility of cells with high, low or intermediate relative levels of Sox2 and Bra expression are the key to guarantee proper axial elongation from the PZ. I am, however, not sure that the model, the way it was designed, actually proves their point. In particular, because it introduces an additional variable that might actually be the essential parameter for the success of the mathematical model: physical boundaries between NT and PSM cells, meaning that cells with high Sox2 or high Bra are unable to mix. As I commented above, this variable reflects a key biological property of the two tissues involved, one epithelial and the other mesenchymal in nature, which might be more relevant that the motility of the cells themselves (e.g. by different cell adhesion properties). How would a model that does not include such physical barriers work? Conversely, how would a model work in which only physical barriers are applied, using similar starting conditions: a prefigured central neural tube (Sox2 high), flanked at both sides by PSM (Brachyury high) and with the PZ (variable Sox2/Bra levels) just posterior to the neural tube?

      We agree that adhesion and non-mixing properties are essential to our models. Because this was not clear in the previous version, we have explained them in more detail in the new version of the manuscript (l.295-300 and Appendix 1). To assess their roles, we have run two new simulations: one without the regulation of non-mixing/adhesion properties and one without motility control by Sox2/Bra. Both simulations show strong defects in morphogenesis, arguing that motility on its own is a key component of the system and that the non-mixing and adhesion properties are also important but not sufficient to drive morphogenesis (Figure 5F). Having the same non-mixing/adhesion and motility properties downstream of Sox2 and Bra in all our models allows us to isolate the phenomenon we wish to study: the role of the distribution of cell-to-cell heterogeneity in the PZ (Figure 6).

      4) The authors generate two mathematical models, differing in whether they start with a random distribution of Sox2 and Bra expression throughout the PZ or with prefigured opposing Sox2 and Bra expression gradients, somehow resembling the image observed in the embryo. The two models generated structures resembling the elongating embryo, although with small differences in the extension process and the extension rate. After analyzing the behavior of those models, they concluded that the random model fits better with the expectations from the in vivo characteristics in the embryo. I am however not sure that I agree with the authors' interpretation. First, because the gradient model includes a natural characteristic observed in the embryo, which the random model does not. Second, because one of the deciding characteristics, namely the slower extension rate observed in the gradient model, does not necessarily make it worse than the random model, as it is not possible to properly determine which extension rate actually resembles more accurately axial extension in the embryo. Third, because the observation that in the gradient model the PZ undergoes fewer transient deformations and self-corrective behaviour is in my view an argument to favor, instead of to disfavor the gradient model, both because the final result is at least as good as the one obtained with the random model and it is actually not clear that in the embryo the PZ undergoes such clearly visible deformations and self-corrections during axial extension. In addition, the gradient model generates a "pure" PZ (just yellow cells) in the posterior end of the structure, while in the random model the PZ contains some islands of NT cells, which is not what is observed in the embryo. According to the last features, the gradient model seems better than the random model.

      To answer the reviewer’s concern about similarity to the embryo, we have developed a new model that is clearly closer to the biological system because it integrates both the gradient and the random ratio distributions (new Figure 5). Interestingly, by comparing it to the two extreme models (random and gradient), we found that this more “natural” model combines the stability and fluidity brought by the gradient model and the random model, respectively. As pointed out by the reviewer, we found that a graded distribution brings more stability to the system, with a “purer” PZ. Conversely, a random distribution allows more tissue fluidity and cell rearrangements as well as tissue shape conservation (Figure 6). We want to thank the reviewer for his or her input; we think that the new model and the comparison with the two extreme cases allowed us to reveal more clearly the properties that are specific to the two types of spatial distribution, and therefore to point out what general morphogenetic properties could emerge from random-like heterogeneity in the embryo.

      Reviewer #2:

      In this manuscript, Romanos et al show firstly that there is extensive cell-to-cell heterogeneity in the relative levels of Sox2 and Bra in the region containing progenitors for neural and paraxial mesoderm, gradually resolving towards high Bra/low Sox2 in the mesoderm or high Sox2/low Bra in emerging neurectoderm. They then show that overexpression of Sox2/morpholino-based inhibition of Bra or vice versa lead cells to favour neurectoderm or mesoderm respectively. Next they show that cells expressing high Bra are more motile than those expressing Sox2, and show using mathematical modelling that these behaviours can explain many aspects of the eventual segregation of Sox2-high neurectoderm and Bra-high mesoderm.

      This interesting and well-presented work leads to the elegant and novel hypothesis that random cell motility induced by Bra and inhibited by Sox2 are sufficient to explain the segregation of NMps towards mesoderm and neurectoderm respectively. The work will be of broad interest to developmental and mathematical biologists interested in the cell biological basis of self-organising cell behaviours. Nevertheless there are some concerns to address in order to solidify the claims in the manuscript.

      1) The section where Sox2 and Bra levels are manipulated (line 152 onwards) is somewhat under-analysed. Results are presented as supporting a model where the two proteins mutually repress each other and lead to segregation of neural (high Sox2) and mesodermal (high Bra) cells. However the data presented does not unequivocally support the claims in the manuscript and would require further clarification.

      In the new version of our manuscript, we give more details on the analysis of the manipulations of Sox2 and Bra levels. In particular, we provide data showing the tissue localization of manipulated cells on transverse sections (L. 192, Figure 2-figure supplement 3). We have also studied the effects of Sox2 and Bra overexpression on cell fate maturation in the PZ and provide some evidence that progenitors do not yet express differentiation markers as they acquire specific motile properties in response to Sox2 or Bra overexpression (L. 267-273, Figure 4-figure supplement 1). In line with our results and with the literature, we revised the text by removing mentions of Sox2 and Bra mutual repression (L. 171, L. 386, L. 389).

      2) The mathematical model may be an oversimplification of the role of these two genes in organising a balanced production of neurectoderm and mesoderm.

      In the new version of our manuscript, we have made significant efforts to better explain how non-mixing properties are taken into consideration in our models and thus, hopefully, to avoid an impression of oversimplification. We would like to point out that simulations performed to evaluate the impact of non-mixing properties on the elongation process indicate that adhesion and non-mixing properties alone cannot account for the morphogenetic events we modelled (new Figure 5F), thus reinforcing the view that regulation of cell motility is a key element in the system. Furthermore, we have designed a new mathematical model, which is closer to the biological system because it integrates both graded and random distributions of Sox2/Bra values (as observed in vivo) (new Figure 5). As explained above in response to reviewer 1, comparison of this model with our previous models, based on either a graded or a random distribution of the Sox2/Bra values, points out the importance of random-like cell-to-cell heterogeneity in this morphogenetic process.

      Reviewer #3:

      The manuscript by Romanos and colleagues examines how Sox2 and Brachyury control the behavior and cell fate of neuro-mesodermal progenitors (NMPs) in avian embryos. Using immunohistochemistry, the authors showed that the cells residing in the progenitor zone (PZ) display high variability in Sox2/Bra expression. Manipulation on the levels of the two transcription factors affected NMPs' choice to stay or exit the PZ and their future tissue contributions. This motivated the authors to employ an agent-based computational model and additional functional experiments to explore the importance of Sox2/Bra for cellular motility. The results led the authors to propose that (i) heterogeneity in Sox2/Bra ratio is important for the spatial organization of the PZ and its derivatives and that (ii) Sox2/Bra determine the fate of progenitor cells by controlling cellular movements.

      This is a technically sound report that combines single-cell analysis, in vivo functional experiments, and mathematical modeling to explore the link between cell motility and cell identity. While the model proposed by the authors is intriguing, I found that the study should provide evidence placing Sox2/Bra as primary regulators of cell motility in the context of the PZ. Given the extensively-studied role of these transcription factors in NMPs, it is challenging to decouple cellular behavior from cellular identity during tissue formation. The study would benefit from further demonstration that cell fate commitment is regulated by - and not a regulator of - cell migration of NMPs.

      We have now tested the effect of Sox2 and Bra overexpression on cell identity. We show that, 7 hrs after electroporation (a time at which we observe an effect on cell movement), there is no modification of the expression of the neural (Pax6) and mesodermal (Msgn1) maturation markers. These data thus indicate that the effect on cell motility happens without a major acceleration of the maturation program (Figure 4-figure supplement 2). However, as mentioned in response to Reviewer 1, these experiments are correlative and do not demonstrate that the effects of Sox2 and Bra on the neural and mesodermal differentiation programs act only through cell motility; we have therefore toned down our arguments accordingly in the new version of our manuscript.

      Strengths and Weaknesses:

      • The idea that heterogeneity in cellular behaviors within a progenitor field may act as a driver of morphogenesis is interesting and nicely supported by the agent-based model.

      We want to thank the reviewer for this comment. We believe that in the new version of the manuscript we go even further by developing a new model (Figure 5) that is closer to reality and by testing the influence of random versus gradient Sox2/Bra distribution on morphogenesis (Figure 6).

      • One of the premises of the model (Fig 4) is that Sox2/Bra ratio determines how much cells move, but this is not clear from the in vivo experiments and seems speculative. A clear demonstration of correlation between Sox2/Bra ratio and cellular motility is necessary for proper support of the model.

      The role of the Sox2 to Bra ratio on PZ cell motility is demonstrated in Figure 4. In the new version of the manuscript, these results are presented before the modelling section; we hope this will help clarify any doubt the reader may have that we clearly demonstrate a role for Sox2 and Bra in controlling PZ cell motility in vivo.

      • The authors found that manipulation of the levels of the TFs results in changes in NMP motility, but it is not clear if this is the cause or a consequence of commitment to a neural or mesodermal fate. Could Bra-High cells be moving more because they have been specified to a mesodermal fate? Conversely, Sox2-High cells might migrate less since they get incorporated into the neural tube. Establishing the timing of cell fate commitment is necessary to resolve this issue.

      We agree with the reviewer that this is an interesting issue; we have checked for expression of specification markers 7 hrs after electroporation of Sox2 and Bra expression vectors, a time point at which electroporated cells had not yet left the PZ but had already changed their motility. In these conditions, overexpression of Sox2 and Bra had no discernible effect on expression of the neural marker Pax6 and of the PSM marker Msgn1, respectively (Figure 4-figure supplement 2).

      • The study's impact and novelty depend on the demonstration that the primary function of Sox2/Bra in NMPs is to drive cell movement. This is not sufficiently explored in the study, and there are no proposed mechanisms for how Sox2/Bra modulate cellular behavior.

      We have shown that Sox2 and Bra act on progenitor motility in vivo (Figure 4). As a mechanism, we propose that Sox2 and Bra could act directly on motility or indirectly by regulating differential adhesion. Control of cell adhesion by Sox2/Bra is part of our modeling assumptions and is therefore a hypothesis that will be the subject of future investigations in the lab. This hypothesis is part of the discussion in the new version of the manuscript (L. 457).

    1. Author Response:

      Reviewer #1 (Public Review):

      [...] Strengths:

      1. The loss of ciliary GPR161 has a more robust phenotype in specific tissues (i.e., the limbs and face). As a result, the limb data (in Figure 6) and craniofacial data (in Figure 7) are well presented and clear. In these figures, the authors directly compare and highlight differences between primarily two genotypes (wt and Gpr161mut1/mut1 embryos) and quantify the changes (digit number and distance between nasal pits). Overall, these two figures support the existing GPR161 model, showcasing that a loss of ciliary GPR161 results in a tissue-specific loss of GLI3R (Figure 6D) and consequently the development of additional digits (Figure 6E) and craniofacial defects (Figure 7D and 7E).

      Thank you.

      Weaknesses:

      1. There is no data in the paper showing that Gli3 repressor function is affected preferentially compared to Gli Activator function. In Figure 4C, Gli3 FL/R ratios are not different between wt/wt and mut/mut embryos. The data can be explained by the fact that the mutant Gpr161 is a partial loss of function allele and the resultant weaker phenotypes (compared to the full KO) show some tissue specificity. Linking this allele to a specific biochemical mechanism is not justified by the data.

      We have now revised the title of the paper and the discussion to emphasize these limitations. We have also added a new section in the discussion on the limitations of our methods and of other optogenetic/chemogenetic methods for generating cAMP in cilia. These limitations arise from the cilioplasm not being strictly restricted from the cytoplasm. Therefore, the second messengers cAMP and Ca2+ are freely diffusible between ciliary and extraciliary compartments (Delling et al., 2016; Truong et al., 2021). A paper published in Cell during revision of this study used optogenetic tools to show that ciliary, but not cytoplasmic, production of cAMP functions through PKA localized in cilia (Truong et al., 2021) to repress sonic hedgehog-mediated somite patterning in zebrafish (Wolff et al., 2003). We have also compared and discussed these results with our study. Our study highlights that the effects of the loss of ciliary Gpr161 pools are tissue specific and dependent on the requirements of the tissues for GliR vs GliA in the morpho-phenotypic spectrum. Overall, our results using the Gpr161mut1 allele are complementary to the optogenetic study by showing that lack of ciliary Gpr161 pools results in Hh hyperactivation phenotypes arising mainly from lack of GliR in the limb buds, mid-face and intermediate neural tube.

      2. The authors use an endpoint assay based on overexpression in 293T cells to claim that cAMP production is unaffected by the Gpr161mut allele. However, weak effects (very likely given the weak phenotypes) may not be evident in this assay. We also do not know if the mutant allele is defective in some other biochemical function or in localization to other places in the cell. One way to address this is to measure ciliary and extraciliary cAMP in their knock-in cells. In Gpr161mut1/mut1 cells, is ciliary cAMP reduced to levels comparable to Gpr161ko/ko cells? Is extraciliary cAMP unchanged compared to WT cells? Or, is cAMP able to diffuse into the cilia from GPR161mut1 localized to vesicles at the ciliary base (Figure 1B)? Many of the conclusions made in the paper equate a loss of ciliary GPR161 to a loss of ciliary cAMP, but this loss of ciliary cAMP is not definitively shown in the paper.

      As physiological ligands for Gpr161 are currently not known, we are unable to test extraciliary vs ciliary contribution of Gpr161 in cAMP production in a physiological context. Therefore, we resort to overexpression assays for constitutive cAMP production by Gpr161 and Gpr161mut1. Using these assays, we do not find a difference in constitutive activity among these variants.

      As the cilioplasm is not strictly compartmentalized from the cytoplasm, the second messengers cAMP and Ca2+ are freely diffusible between ciliary and extraciliary compartments (Delling et al., 2016; Truong et al., 2021). Thus, in any approach for generating subcellular pools of cAMP, be it genetic, optogenetic or chemogenetic (Guo et al., 2019; Hansen et al., 2020; Truong et al., 2021), extraciliary cAMP could diffuse into ciliary compartments. A recent paper using optogenetic and chemogenetic tools for cAMP production inside cilia or in the cytoplasm shows that cytoplasmic cAMP has free access to intraciliary compartments but is unable to reach critical thresholds for activating PKA (Truong et al., 2021). Thus, we would assume that the extraciliary cAMP produced by extra copies of Gpr161mut1 could diffuse to cilia but is likely to be less effective in activating downstream effectors. In addition, the PKA regulatory subunit-AKAP complexes are fundamentally important in organizing and sustaining PKA catalytic subunit activation to organize localized substrate phosphorylation in restrictive nanodomains (Bock et al., 2020; Zhang et al., 2020). The dual functions of Gpr161 in Gs coupling and as an atypical AKAP (Bachmann et al., 2016) are likely to further restrict cAMP signaling in ciliary or extraciliary microdomains.

      3. Compared to Figures 6 and 7, the data presented in Figures 3 and 5 are very confusing and difficult to interpret. On the one hand, this is understandable, the Gpr161mut/mut phenotypes are complex, and some tissues (like the developing spinal cord) are more resistant to change due to a loss of GliR. On the other hand, the data collected from the numerous genotypes analyzed could be easier to interpret by (i) providing a penetrance of the phenotypes and (ii) quantifying the phenotypes.

      Thank you for all the suggestions. We have now carried out these quantifications or tabulations, which have considerably improved the presentation of the datasets (Table 2 and Figure 5-figure supplement 1). Some of these experiments required additional experimental animals (Table 1), and we have updated the text accordingly.

      Below are a few examples of data that could be improved with quantifications:

      — In Figure 3, the authors are trying to convey that the Gpr161mut allele is partially functional and produces a milder phenotype than the Gpr161ko allele. However, the Gpr161ko/ko, Gpr161mut/ko, and Gpr161mut/mut phenotypes showcased in the figure all look quite severe, and it is difficult to appreciate the differences in the defects fully. An accompanying table summarizing the phenotypes and their penetrance in the affected genotypes would help to convey this point.

      We have added an accompanying Table 2 summarizing the phenotypes and penetrance for the respective genotypes, when present. Please note that rostral malformations such as exencephaly are similar between Gpr161 ko/ko and Gpr161 ko/mut1, whereas Gpr161 mut1/mut1 embryos have mid-face widening. Along the same lines, Gpr161 ko/ko has no forelimbs and Gpr161 ko/mut1 has smaller forelimb buds, whereas Gpr161 mut1/mut1 embryos have polydactyly.

      — In Table 1, the authors note that the Gpr161mut1/mut1 mouse is embryonic lethal by e14.5, but the analysis in Table 1 appears to be incomplete. In the table titled "breeding between Gpr161 mut1/+ parents," the authors indicate that they only assessed one litter of e14.5 and e15.5 embryos. Oddly, the authors note that additional litters were collected, but the embryos were not genotyped because the embryos exhibited no phenotypes. The absence of phenotypes could be due to an absence of viable Gpr161mut1/mut1 embryos; however, the embryos need to be genotyped and a chi-square analysis conducted to verify this. Death can be a measure of phenotype severity, but I think it is important to surmise why the embryos are dying. It is unclear whether the embryos are dying due to the heart defects mentioned in the discussion. If the embryos are dying due to the heart defect, then it would be important to know whether the heart defects are more severe in the Gpr161ko/ko embryos.

      Our apologies for the oversight. We have now analyzed additional timed pregnancies at E14.5, E14.75 and E15.5. We find that the embryonic lethality is complete by E14.75. Heart defects in Gpr161 ko/ko embryos are not apparent as they are E10.5 lethal. We do see apparent heart defect phenotypes in Gpr161 ko/mut1 vs Gpr161 mut1/mut1. These defects include pericardial effusion, outflow tract defects, A-V cushion abnormalities and smaller ventricles. These phenotypic descriptions are beyond the scope of the current paper. However, we have mentioned pericardial effusion in the text and Table 2.
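
      The chi-square check requested by the reviewer can be sketched as follows. This is only an illustration, not the authors' analysis: it assumes a cross between heterozygous parents with an expected 1:2:1 Mendelian ratio, the function name `mendelian_chi_square` is hypothetical, and the embryo counts are invented placeholders rather than data from the paper. With three genotype classes the test has 2 degrees of freedom, for which the chi-square p-value has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def mendelian_chi_square(observed, ratios=(1, 2, 1)):
    """Chi-square goodness-of-fit of three genotype counts against expected ratios.

    Hypothetical helper for illustration; returns (chi2 statistic, p-value).
    """
    if len(observed) != 3 or len(ratios) != 3:
        raise ValueError("this sketch only handles three genotype classes (df = 2)")
    total = sum(observed)
    ratio_sum = sum(ratios)
    # Expected count per genotype class under the assumed Mendelian ratio.
    expected = [total * r / ratio_sum for r in ratios]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For df = 2 the chi-square survival function is exactly exp(-x / 2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


# Placeholder counts for (+/+, mut1/+, mut1/mut1) embryos -- NOT data from the paper.
counts = (12, 22, 2)
chi2, p = mendelian_chi_square(counts)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

      A small p-value would indicate that homozygous mutants are recovered less often than the assumed ratio predicts at that stage; with small litters the expected counts can be low, in which case an exact test may be more appropriate.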

      — In Figure 5, quantifying the progenitor domains would greatly assist in discerning differences between the various genotypes. For example, a quantification would help readers assess differences in NKX6.1 across the various genotypes.

      We have now quantified the differences in Nkx6.1 across genotypes. The data is presented in Figure 5-figure supplement 1.

      On an unrelated note, the PAX7 staining of the Gpr161mut1/ko spinal cord looks very strange because the line adjacent to the image does not accurately represent the dorsal-ventral patterning of PAX7 seen in the image. This image would need to be replaced.

      Our apologies for the oversight. We have now revised this image.

      Reviewer #2 (Public Review):

      The premise of the entire study is predicated on GPR161mut1 failing to target to cilia and being WT in every other aspect. The Gs coupling of GPR161mut1 is examined. The ciliary localization of GPR161mut1 is carefully assessed by conducting staining not just in WT cells but also in INPP5E cells where GPR161 ciliary levels are known to be elevated. Another prediction is that GPR161mut1 is found in an intermediate biosynthetic compartment. Some insights into the compartment where GPR161mut1 is found would help interpret the phenotype of the GPR161mut1 animals. It would be important to know whether the GPR161mut1 mimics a pre-cilia targeted GPR161 (say at the plasma membrane) or whether it mimics a post-ciliary exit state (say recycling endosomes). In the past few years, work from the von Zastrow lab and others has shown that GPCRs keep activating their downstream partners after endocytosis from the plasma membrane. If GPR161mut1 were to mimic the post-ciliary exit state of GPR161, it may assume some of the signaling functions of ciliary GPR161.

      Thank you for all the suggestions. We have now examined and extensively discussed the plausible source of extraciliary Gpr161 in mediating Hh repression. We already showed that Gpr161 localizes to the periciliary recycling endosomal compartment in addition to cilia (Mukhopadhyay et al., 2013) and could activate ACs and PKA in proximity to the centrosome. We now show that Gpr161mut1 also localizes to similar compartments (Figure 1-figure supplement 3). We propose that this compartment could promote Gpr161 activity outside cilia in GliR formation in vivo (please see model in Figure 8D).

      We also compare our results with a recently published paper showing that ciliary, but not cytoplasmic, production of cAMP functions through PKA localized in cilia to repress sonic hedgehog-mediated somite patterning in zebrafish (Truong et al., 2021). While this paper is an elegant demonstration of ciliary pools of cAMP in repressing Hh activity despite having no strict compartmentalization exclusively in cilia, it does not capture the roles of ciliary and extraciliary pools of Gpr161-mediated cAMP signaling in different tissues, which we show are dependent on the requirements of the tissues for GliR vs GliA in the morpho-phenotypic spectrum.

      A second point that the authors may wish to address is whether GPR161mut1 may fail to enrich in cilia because it is hyperactive and undergoes constitutive exit from cilia. The hypothesis here is that GPR161mut1 couples to beta arrestin better than WT GPR161. Blocking GPR161mut1 exit via depletion of beta arrestin or BBSome is a simple way to test this hypothesis.

      As advised by the reviewer, we have tested for Gpr161/Gpr161mut1 levels in cilia upon arrestin1/2 or BBSome loss. These experiments show that Gpr161mut1 is not present in cilia in arrestin1/2 (Arrb1/2) double ko MEFs (Figure 1-figure supplement 1) or upon RNAi of BBS4 (Figure 5-figure supplement 2). We also showed previously that knockdown of the 5'-phosphatase INPP5E, which causes accumulation of Gpr161 in cilia, does not cause any accumulation of Gpr161mut1 in cilia. Based on all these experiments, we surmise that Gpr161mut1 does not transit through cilia.

      Finally, it would be good to learn about the levels of expression of GPR161mut1 compared to WT GPR161 using immunoblotting. If GPR161mut1 were to be expressed at much higher levels than WT GPR161, it may compensate for its lack of ciliary localization by elevated total cellular activity.

      We were unable to determine protein stability of the mutant receptor in the Gpr161mut1 embryos due to technical constraints in immunoblotting for endogenous levels. However, we note Gpr161mut1 in vesicles surrounding the base of cilia (Figure 1B) and constitutive cAMP signaling activity (Figure 1G, Figure supplements 1-3) in stable cell lines, suggesting that protein levels and activity of the mutant were comparable with wild type Gpr161. As suggested by the reviewer, we also tested LAP-tagged Gpr161mut1 protein levels by tandem affinity purification and immunoblotting, with respect to LAP-tagged Gpr161wt in MEFs stably overexpressing these variants. We noted a similar immunoblotting pattern from receptor glycosylation in both variants (Figure 2-figure supplement 2).

    2. Reviewer #1 (Public Review):

      The authors created a new GPR161 mutant mouse (Gpr161mut/mut) in which GPR161 does not localize to the primary cilium but is still cAMP signaling competent based on an over-expression assay in 293T cells. Through a detailed analysis of the Gpr161mut/mut mouse and its comparison to a previously generated Gpr161 knockout mouse (Gpr161ko/ko), the authors try to discriminate the ciliary and non-ciliary roles of GPR161. The current prevailing model is that GPR161 (localized to the primary cilium in the absence of Hh pathway activation) is constitutively active and elevates cAMP levels within the primary cilium. Elevated ciliary cAMP then activates ciliary (or ciliary adjacent) PKA, driving the processing of bifunctional GLI proteins into transcriptional repressors (GLIR). According to this model, the ciliary pool of GPR161 is critical for suppressing Hh signaling activity, and one would predict that the Gpr161mut/mut embryos would look identical to the Gpr161ko/ko embryos. However, this was not the case. Across multiple developmental tissues, the Gpr161mut/mut phenotype is less severe than the complete knockout, suggesting a role for non-ciliary GPR161 in suppressing Hh signaling activity. The observations made in this paper are interesting, but the data fails to make a clear distinction between the ciliary and non-ciliary roles of GPR161.

      Strengths:

      1. The loss of ciliary GPR161 has a more robust phenotype in specific tissues (i.e., the limbs and face). As a result, the limb data (in Figure 6) and craniofacial data (in Figure 7) are well presented and clear. In these figures, the authors directly compare and highlight differences between primarily two genotypes (wt and Gpr161mut1/mut1 embryos) and quantify the changes (digit number and distance between nasal pits). Overall, these two figures support the existing GPR161 model, showcasing that a loss of ciliary GPR161 results in a tissue-specific loss of GLI3R (Figure 6D) and consequently the development of additional digits (Figure 6E) and craniofacial defects (Figure 7D and 7E).

      Weaknesses:

      1. There is no data in the paper showing that Gli3 repressor function is affected preferentially compared to Gli Activator function. In Figure 4C, Gli3 FL/R ratios are not different between wt/wt and mut/mut embryos. The data can be explained by the fact that the mutant Gpr161 is a partial loss of function allele and the resultant weaker phenotypes (compared to the full KO) show some tissue specificity. Linking this allele to a specific biochemical mechanism is not justified by the data.

      2. The authors use an endpoint assay based on overexpression in 293T cells to claim that cAMP production is unaffected by the Gpr161mut allele. However, weak effects (very likely given the weak phenotypes) may not be evident in this assay. We also do not know if the mutant allele is defective in some other biochemical function or in localization to other places in the cell. One way to address this is to measure ciliary and extraciliary cAMP in their knock-in cells. In Gpr161mut1/mut1 cells, is ciliary cAMP reduced to levels comparable to Gpr161ko/ko cells? Is extraciliary cAMP unchanged compared to WT cells? Or, is cAMP able to diffuse into the cilia from GPR161mut1 localized to vesicles at the ciliary base (Figure 1B)? Many of the conclusions made in the paper equate a loss of ciliary GPR161 to a loss of ciliary cAMP, but this loss of ciliary cAMP is not definitively shown in the paper.

      3. Compared to Figures 6 and 7, the data presented in Figures 3 and 5 are very confusing and difficult to interpret. On the one hand, this is understandable, the Gpr161mut/mut phenotypes are complex, and some tissues (like the developing spinal cord) are more resistant to change due to a loss of GliR. On the other hand, the data collected from the numerous genotypes analyzed could be easier to interpret by (i) providing a penetrance of the phenotypes and (ii) quantifying the phenotypes. Below are a few examples of data that could be improved with quantifications:

      — In Figure 3, the authors are trying to convey that the Gpr161mut allele is partially functional and produces a milder phenotype than the Gpr161ko allele. However, the Gpr161ko/ko, Gpr161mut/ko, and Gpr161mut/mut phenotypes showcased in the figure all look quite severe, and it is difficult to appreciate the differences in the defects fully. An accompanying table summarizing the phenotypes and their penetrance in the affected genotypes would help to convey this point.

      — In Table 1, the authors note that the Gpr161mut1/mut1 mouse is embryonic lethal by e14.5, but the analysis in Table 1 appears to be incomplete. In the table titled "breeding between Gpr161 mut1/+ parents," the authors indicate that they only assessed one litter of e14.5 and e15.5 embryos. Oddly, the authors note that additional litters were collected, but the embryos were not genotyped because the embryos exhibited no phenotypes. The absence of phenotypes could be due to an absence of viable Gpr161mut1/mut1 embryos; however, the embryos need to be genotyped and a chi-square analysis conducted to verify this. Death can be a measure of phenotype severity, but I think it is important to surmise why the embryos are dying. It is unclear whether the embryos are dying due to the heart defects mentioned in the discussion. If the embryos are dying due to the heart defect, then it would be important to know whether the heart defects are more severe in the Gpr161ko/ko embryos.

      — In Figure 5, quantifying the progenitor domains would greatly assist in discerning differences between the various genotypes. For example, a quantification would help readers assess differences in NKX6.1 across the various genotypes. On an unrelated note, the PAX7 staining of the Gpr161mut1/ko spinal cord looks very strange because the line adjacent to the image does not accurately represent the dorsal-ventral patterning of PAX7 seen in the image. This image would need to be replaced.

    1. If we should simply found a few professorships, of such a nature as to attract attention on account of a special degree of distinction attached to them, it would go far to remove the prejudice which now exists against the idea of college professorships held by women. The plan that I have in mind is this: Instead of waiting for the colleges to offer professorships to our young doctors of philosophy, I would suggest that we offer our young doctors of philosophy as professors to the colleges -- and not in the way of founding fixed professorships in any given college, but rather of establishing what may be called peripatetic professorships, to be held, in any particular case, by our most available young woman and at the college or the university which shall best fulfil certain requirements of ours which I shall state in a moment.

      While Franklin has a great suggestion for how the professorship should be set up, I think she makes a great point that, like men, women should be sought after to fill these positions. So, instead of waiting for someone to stop in to claim the position, they should instead seek out the brilliant minds to fill the position.

    1. Reviewer #1 (Public Review):

      1) The user manual and tutorial are well documented, although the actual code could do with more explicit documentation and comments throughout. The overall organisation of the code is also a bit messy.

      2) My understanding is that this toolbox can take maps from BigBrain to MRI space and vice versa, but the maps that go in the direction BigBrain->MRI seem to be confined to those provided in the toolbox (essentially the density profiles). What if someone wants to do some different analysis on the BigBrain data (e.g. looking at cellular morphology) and wants that mapped onto MRI spaces? Does this tool allow for analyses that involve the raw BigBrain data? If so, then at what resolution and with what scripts? I think this tool will have much more impact if that was possible. Currently, it looks as though the 3 tutorial examples are basically the only thing that can be done (although I may be lacking imagination here).

      3) An obvious caveat to BigBrain is that it is a single brain and we know there are sometimes substantial individual variations in e.g. areal definition. This is only slightly touched upon in the discussion. Might be worth commenting on this more. As I see it, there are multiple considerations. For example (i) Surface-to-Surface registration in the presence of morphological idiosyncrasies: what parts of the brain can we "trust" and what parts are uncertain? (ii) MRI parcellations mapped onto BigBrain will vary in how accurately they may reflect the BigBrain areal boundaries: if histo boundaries do not correspond with MRI-derived ones, is that because BigBrain is slightly different or is it a genuine divergence between modalities? Of course addressing these questions is out of scope of this manuscript, but some discussion could be useful; I also think this toolbox may be useful for addressing these very concerns!

    1. One of the first material scientists I spoke to about making things that last for thousands of years offered a compelling insight: “Everything is burning, just at different rates.” What he means is that what we perceive as aging is actually oxidisation, like rusting. When we imagine materials that may last for thousands of years, most people think of stone or precious metals like gold – because they don't oxidise readily. But even bodies can be preserved for millennia if stored in the right chemical environment, as the mummies of Egypt demonstrate.

      A fascinating take on "everyone is dying"

    1. Anne: What was family life like with you and your brother and your mother and father? Did you guys speak English at home? Did you do American things, activities? Do they work a lot? Tell me a little bit about family life.
       Juan: Right now, my dad, he's always been the boss of the family. He's always worked, he works in construction, and as you know, Utah, with the climate change, it snows, it rains, all of the climates. Since he works in construction, he does work outside all the time, so even if it snows or even if it rains, even if it's minus five degrees outside, he still goes out and works because nobody's going to give him the money to provide for his family.
       Juan: In a way, my dad, you can say he's one of those hard working men who doesn't look out for himself, but rather looks out for his family. In my house we spoke Spanish all the time because of my mom. To this day, she doesn't want to learn English even though we tell her to learn English. My little sister, she doesn't speak Spanish, she speaks more English and with her it's different. We tell her, "You have to learn Spanish because it's going to help you," but she doesn't want to learn.
       Anne: Is she a citizen?
       Juan: Yes, she was born in the US. So my parents didn't really adapt to the American culture. They always wanted to follow Mexican traditions, even when it's Mother's Day over there … I think here it's May 10th but over there, when is Mother's Day?
       Anne: I think it's the second Sunday of May, so it could be different days.
       Juan: We could take that as an example. They'd rather follow Mother's Day here in Mexico than over there. Also Christmas, I guess the one thing they did adapt to was Thanksgiving. We don't celebrate that here in Mexico, but they do celebrate there, and they did adapt that. Another thing, Easter day. You go out with your family, you hide the eggs as a tradition, no? They adapted to that, but here in Mexico they don't do that. They don't even know about that. In a way they wanted to keep their Mexican culture alive even though they were in the US, but they also wanted to adapt to the things that they did there.

      Time in the US, Homelife, Mexican traditions, Holidays, Spanish language, US traditions, Holidays

    1. Anne: I see. Ben: I mean it's a nice house. It's up in the mountains and I had a lot of family members, including my wife go, "Why are you leaving? Why are you going to Mexico City? You don't need to.” I go, "Well one I'm going, I want to be involved in helping these people. I gotta go out and do something, I know I can still do something, I need a job. I need a job, I need a real job.” Raising goats and sheep is fine and it was common people and stuff, but I'm a busy body and I need to do something. Ben: And then I became aware of New Comienzos and when I seen that, that's what I want to do. I want to go down there, I want to be involved in that. I want to be involved in that because that's something that I know I can help and contribute to. And at the same time, I can get me a job down there and I'll stay put. I'll come back and visit every now and then, but I'm a city person [Laughs]. Anne: Yeah. So, did you fight the detention or no? Ben: No. When my first, I was detained when I was 19—well no, I got in trouble when I was 19, detained at 27. That time, I signed away, I didn't fight it. So, this time, I had no rights. I could not fight anymore because I'd already signed away. This time around, I probably would've fought it, because I had the money this time. Even if I knew I was going to lose, at least I knew I had the money for the bond and I could put it off two, three, four years. But, the first time I didn't have the money. So, I said, “Sit here two years and wait and then probably get deported? No.” Unfortunately, this time, I just, there was no rights that I could— Anne: And have your kids or your wife been to visit you? Ben: Yes, they have up there. Hopefully once I get settled here. My wife was supposed to come here in May, like around my birthday, which was the week before last. 
But when my son got this scholarship, well he said, "We gotta go,” so her and my daughter both drove him down to Orlando and they went to Disney, like we used to always go to Disney World. We would go at least twice a year. There was one year that I had two projects that ran over a year down there and I bought them season passes, because it was easier for them to fly down on the weekend and come see me. And when they come down, if you buy three individual park tickets, it's more expensive than the season pass. Anne: Yeah. Ben: But they're still keeping up the traditions [Laughs]. They're still going to Disney. Anne: And you spent a lot of time volunteering while you were in the states. Ben: Yes. Anne: So, it seems like, does that make it a good fit to try it here? Ben: Oh yes. Yes, it's voluntary here, it's a different theme here. It's a stronger, I feel it's a stronger theme. Not that my volunteer work back over there wasn't, but my volunteer… Like helping out at the school whenever I was in town, I would let them know that I would be in town and I was available to substitute if one of the teachers needed a break or was going to be missing. And I was qualified to take the classes on. Ben: But I also was a volunteer English teacher when they started, they started a Spanish church. When that Spanish church started, it was actually my father that was the preacher. My father was at another church, but when they wanted to do that, I talked to my father to see if he would, because they asked me to, but I was honest, I go, "You know I'm not that knowledgeable of the Bible, to be able to. I don't want to stumble over myself.” And you know when people are barely getting into a church and you say one thing but then you contradict yourself, you're going to destroy their faith. Anne: Don't want to do that. Ben: No. And I did a lot of volunteer work there at the church and the school. It was great. 
And they've been right by my family's side, they're still going to church there and anytime that they need anything, they're right there. But good thing …. they've been fine. My wife, she's got a pretty good job. She worked for a mortgage company, so she does pretty well. And my daughter helps out too now that she's making money. It's been a long ride. [Laughs]. Anne: So, we hear a lot of stories about young men who come over as babies or toddlers and then for some reason get caught up in gangs or crime. What was different for you? Why do you think that never happened? Ben: Well, I can tell you that I think, probably the single most important thing, the most important thing in a person's life is environment. Parenting is important, but you can have the best parents in the world, but if you have them in a bad environment, your parenting is not going to supersede the environment. And that's one of the things that I focus with my wife is that—well my parents, they provided a good environment. And when I got married from my life experiences, I stepped that up a bit. I told a lot of other relatives, this is one thing I've told a lot of other relatives, this happens a lot in America—not just with Mexicans or Central Americans, Blacks or whatever—is a lot of people yell out racism or discrimination. Ben: And I sincerely believe that sometimes we discriminate ourselves, that we put it on ourselves, because we teach that to our children, because weekends we all want to go get together with other relatives, other friends of our own ethnicity. And that's not really what America's about and that's not what I taught my children because that's not how I lived my life. I was out with everybody, congregating with everybody, and that's the environment that we brought our children up in. We brought them up in their church—I was talking to you earlier, our church and the school that they went to was part of the church. 
We were the only Hispanics. Ben: But that doesn't mean that we didn't allow them or try to get them to forget who they were. We didn't, because we brought them around our relatives, but we let them see that environment and so that they felt comfortable. So, when they got out into the world, they're comfortable around anybody and they're not looking at colors or whatever. And they don't feel like they're different and they don't feel different. I honestly, I think I felt more different when I got back here [Laughs]. Anne: Right. Ben: Because it was really kind of weird. But over there I didn't, but I think environment is one of the most important things. If you put a good person in a bad situation, in a bad environment, sooner or later he'll break. If you get a bad person that's never known what life is really supposed to be about, guide him a little bit and give him a little time, and if he's willing— Anne: It might work out. Ben: Yeah, it might work out. Anne: Interesting. So, you achieved your dreams in America. Ben: Oh yeah. Anne: Do you have dreams now for yourself here? Ben: Yeah. My dream here is, one, to help here and I can't say it's a goal that's going to be met. And the other is I'm going to have here what I had over there and I'm confident that I can make that happen. Anne: And will you make it through construction business, or will you make it through…? Ben: Right now, I think that there's other areas here that I could probably succeed in without jumping into the construction business. We have land back here (in the family home) and buy a little bit of cattle, make some money here. There’s just several different ideas. But I know that I can excel in a job here, because there's several people here that are making some pretty high incomes and just, some pretty much as telemarketers, but just there's some call centers with some good bonuses. You're not going to get rich there, but you can make a good living. Anne: Right. Ben: But there's some opportunities right now.

      Return to Mexico, Jobs, Community, Opportunity, Family Relationships, Feelings, Dreams; Reflections, Mexico, The United States

    1. psychology may be defined rigidly so as to include only a scientific description of mind, of mental activity, or of mental products

      This does not seem to be such a rigid definition to me. I think we use psychology in combination with closely related subjects, such as sociology, and it can become easy to mix the two. I think "a scientific description of the mind, of mental activity, or of mental products" seems like a reasonable definition for psychology.

    1. Peer Reviewed and recommended by Peer Community in Evolutionary Biology

      Recommendation<br> Separating adaptation from drift: A cautionary tale from a self-fertilizing plant<br> by Christoph Haag based on reviews by Jon Agren, Pierre Olivier Cheptou and Stefan Laurent.

      In recent years many studies have documented shifts in phenology in response to climate change, be it in arrival times in migrating birds, budset in trees, adult emergence in butterflies, or flowering time in annual plants (Cohen et al. 2018; Piao et al. 2019). While these changes are, in part, explained by phenotypic plasticity, more and more studies find that they involve also genetic changes, that is, they involve evolutionary change (e.g., Metz et al. 2020). Yet, evolutionary change may occur through genetic drift as well as selection. Therefore, in order to demonstrate adaptive evolutionary change in response to climate change, drift has to be excluded as an alternative explanation (Hansen et al. 2012). A new study by Gay et al. (2021) shows just how difficult this can be.

      The authors investigated a recent evolutionary shift in flowering time in a population of an annual plant that reproduces predominantly by self-fertilization. The population has recently been subjected to increased temperatures and reduced rainfall, both of which are believed to select for earlier flowering times. They used a “resurrection” approach (Orsini et al. 2013; Weider et al. 2018): Genotypes from the past (resurrected from seeds) were compared alongside more recent genotypes (from more recently collected seeds) under identical conditions in the greenhouse. Using an experimental design that replicated genotypes, eliminated maternal effects, and controlled for microenvironmental variation, they found said genetic change in flowering times: Genotypes obtained from recently collected seeds flowered significantly (about 2 days) earlier than those obtained 22 generations before. However, neutral markers (microsatellites) also showed strong changes in allele frequencies across the 22 generations, suggesting that effective population size, Ne, was low (i.e., genetic drift was strong), which is typical for highly self-fertilizing populations. In addition, several multilocus genotypes were present at high frequencies and persisted over the 22 generations, almost as in clonal populations (e.g., Schaffner et al. 2019). The challenge was thus to evaluate whether the observed evolutionary change was the result of an adaptive response to selection or may be explained by drift alone.

      Here, Gay et al. (2021) took a particularly careful and thorough approach. First, they carried out a selection gradient analysis, finding that earlier-flowering plants produced more seeds than later-flowering plants. This suggests that, under greenhouse conditions, there was indeed selection for earlier flowering times. Second, investigating other populations from the same region (all populations are located on the Mediterranean island of Corsica, France), they found that a concurrent shift to earlier flowering times occurred also in these populations. Under the hypothesis that the populations can be regarded as independent replicates of the evolutionary process, the observation of concurrent shifts rules out genetic drift (under drift, the direction of change is expected to be random).
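
      The selection gradient analysis mentioned first in this paragraph can be illustrated with a minimal sketch. It assumes the common regression approach in which relative fitness (seed number divided by its mean) is regressed on the variance-standardized trait; the recommendation does not give the authors' exact model, and the flowering times and seed counts below are invented purely for illustration.

```python
from statistics import mean, pstdev

def selection_gradient(trait, fitness):
    """Slope of relative fitness on the variance-standardized trait.

    This is an illustrative sketch, not the model used in the preprint.
    """
    w_bar = mean(fitness)
    rel_fitness = [w / w_bar for w in fitness]   # relative fitness, mean 1
    z_bar, z_sd = mean(trait), pstdev(trait)
    z = [(t - z_bar) / z_sd for t in trait]      # mean 0, variance 1
    # With z standardized, the least-squares slope reduces to the
    # (population) covariance between z and relative fitness.
    w_rel_bar = mean(rel_fitness)
    return sum(zi * (wi - w_rel_bar) for zi, wi in zip(z, rel_fitness)) / len(z)

# Hypothetical data: later-flowering plants set fewer seeds.
flowering_days = [60, 62, 64, 66, 68]
seed_counts = [120, 110, 100, 90, 80]
print(selection_gradient(flowering_days, seed_counts))
```

      A negative value indicates that earlier flowering is associated with higher relative fitness in the environment where the measurements were taken.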

      The study may well have stopped here, concluding that there is good evidence for an adaptive response to selection for earlier flowering times in these self-fertilizing plants, at least under the hypothesis that selection gradients estimated in the greenhouse are relevant to field conditions. However, the authors went one step further. They used the change in the frequencies of the multilocus genotypes across the 22 generations as an estimate of realized fitness in the field and compared them to the phenotypic assays from the greenhouse. The results showed a tendency for high-fitness genotypes (positive frequency changes) to flower earlier and to produce more seeds than low-fitness genotypes. However, a simulation model showed that the observed correlations could be explained by drift alone, as long as Ne is lower than ca. 150 individuals. The findings were thus consistent with an adaptive evolutionary change in response to selection, but drift could only be excluded as the sole explanation if the effective population size was large enough.
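
      To make the role of Ne in this argument concrete, here is a minimal sketch of a drift-only simulation. It is not the authors' simulation model: it assumes complete selfing (each offspring copies one parent's multilocus genotype), a constant population of Ne individuals, no selection, and an invented set of starting frequencies. The function names are hypothetical.

```python
import random

def simulate_drift(freqs, ne, generations, rng):
    """Resample multilocus genotypes for `generations` under pure drift.

    freqs: starting genotype frequencies (sum to 1).
    ne: individuals drawn each generation (assumed constant).
    Complete selfing is assumed, so genotypes are inherited intact.
    """
    current = list(freqs)
    labels = range(len(current))
    for _ in range(generations):
        draws = rng.choices(labels, weights=current, k=ne)
        counts = [0] * len(current)
        for g in draws:
            counts[g] += 1
        current = [c / ne for c in counts]
    return current

def mean_abs_change(freqs, ne, generations, replicates, seed=1):
    """Average absolute frequency change per genotype across replicates."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        final = simulate_drift(freqs, ne, generations, rng)
        total += sum(abs(f - s) for f, s in zip(final, freqs)) / len(freqs)
    return total / replicates

start = [0.4, 0.3, 0.2, 0.1]  # hypothetical starting frequencies
for ne in (20, 150, 1000):
    print(ne, round(mean_abs_change(start, ne, 22, 100), 3))
```

      Under these assumptions, frequency changes over 22 generations are much larger at small Ne, which is why an observed association between frequency change and flowering time is harder to distinguish from chance when Ne is low.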

      The study did provide two estimates of Ne (19 and 136 individuals, based on individual microsatellite loci or multilocus genotypes, respectively), but both are problematic. First, frequency changes over time may be influenced by the presence of a seed bank or by immigration from a genetically dissimilar population, which may lead to an underestimation of Ne (Wang and Whitlock 2003). Indeed, the low effective size inferred from the allele frequency changes at microsatellite loci appears to be inconsistent with levels of genetic diversity found in the population. Moreover, high self-fertilization reduces effective recombination and therefore leads to non-independence among loci. This lowers the precision of the Ne estimates (due to a higher sampling variance) and may also violate the assumption of neutrality due to the possibility of selection (e.g., due to inbreeding depression) at linked loci, which may be anywhere in the genome in case of high degrees of self-fertilization.
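
      As an illustration of how temporal frequency changes are turned into an Ne estimate, the sketch below uses a standard moment-based temporal estimator: a standardized variance in allele frequency, corrected for sampling error at both time points. This is an assumption made for illustration; the recommendation does not say which estimator the authors used. The estimator presumes a closed population, discrete generations, and neutral, independent loci, which are exactly the assumptions questioned in the paragraph above. The frequencies and sample sizes are invented.

```python
def fc_statistic(p0, pt):
    """Standardized variance in allele frequency, averaged over alleles."""
    terms = []
    for x, y in zip(p0, pt):
        denom = (x + y) / 2 - x * y
        if denom > 0:  # skip alleles absent (or fixed) in both samples
            terms.append((x - y) ** 2 / denom)
    if not terms:
        raise ValueError("no informative alleles")
    return sum(terms) / len(terms)

def temporal_ne(p0, pt, generations, s0, st):
    """Moment-based temporal estimate of effective population size.

    s0, st: number of diploid individuals sampled at each time point.
    Returns infinity when sampling error alone can explain the change.
    """
    fc = fc_statistic(p0, pt)
    drift_signal = fc - 1 / (2 * s0) - 1 / (2 * st)
    if drift_signal <= 0:
        return float("inf")
    return generations / (2 * drift_signal)

# Hypothetical biallelic locus sampled 22 generations apart.
print(temporal_ne([0.5, 0.5], [0.7, 0.3], generations=22, s0=100, st=100))
```

      Because immigration or a seed bank changes allele frequencies in ways this calculation attributes to drift, the resulting value should be read with the caveats given above.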

      There is thus no definite answer to the question of whether or not the observed changes in flowering time in this population were driven by selection. The study sets high standards for other, similar ones, in terms of thoroughness of the analyses and care in interpreting the findings. It also serves as a very instructive reminder to carefully check the assumptions when estimating neutral expectations, especially when working on species with complicated demographies or non-standard life cycles. Indeed, the issues encountered here, in particular the difficulty of establishing neutral expectations in species with low effective recombination, may apply to many other species, including partially or fully asexual ones (Hartfield 2016). Furthermore, they may not be limited to estimating Ne but may also apply, for instance, to the establishment of neutral baselines for outlier analyses in genome scans (see, e.g., Orsini et al. 2012).

      References

      Cohen JM, Lajeunesse MJ, Rohr JR (2018) A global synthesis of animal phenological responses to climate change. Nature Climate Change, 8, 224–228. https://doi.org/10.1038/s41558-018-0067-3

      Gay L, Dhinaut J, Jullien M, Vitalis R, Navascués M, Ranwez V, Ronfort J (2021) Evolution of flowering time in a selfing annual plant: Roles of adaptation and genetic drift. bioRxiv, 2020.08.21.261230, ver. 4 recommended and peer-reviewed by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2020.08.21.261230

      Hansen MM, Olivieri I, Waller DM, Nielsen EE (2012) Monitoring adaptive genetic responses to environmental change. Molecular Ecology, 21, 1311–1329. https://doi.org/10.1111/j.1365-294X.2011.05463.x

      Hartfield M (2016) Evolutionary genetic consequences of facultative sex and outcrossing. Journal of Evolutionary Biology, 29, 5–22. https://doi.org/10.1111/jeb.12770

      Metz J, Lampei C, Bäumler L, Bocherens H, Dittberner H, Henneberg L, Meaux J de, Tielbörger K (2020) Rapid adaptive evolution to drought in a subset of plant traits in a large-scale climate change experiment. Ecology Letters, 23, 1643–1653. https://doi.org/10.1111/ele.13596

      Orsini L, Schwenk K, De Meester L, Colbourne JK, Pfrender ME, Weider LJ (2013) The evolutionary time machine: using dormant propagules to forecast how populations can adapt to changing environments. Trends in Ecology & Evolution, 28, 274–282. https://doi.org/10.1016/j.tree.2013.01.009

      Orsini L, Spanier KI, Meester LD (2012) Genomic signature of natural and anthropogenic stress in wild populations of the waterflea Daphnia magna: validation in space, time and experimental evolution. Molecular Ecology, 21, 2160–2175. https://doi.org/10.1111/j.1365-294X.2011.05429.x

      Piao S, Liu Q, Chen A, Janssens IA, Fu Y, Dai J, Liu L, Lian X, Shen M, Zhu X (2019) Plant phenology and global climate change: Current progresses and challenges. Global Change Biology, 25, 1922–1940. https://doi.org/10.1111/gcb.14619

      Schaffner LR, Govaert L, De Meester L, Ellner SP, Fairchild E, Miner BE, Rudstam LG, Spaak P, Hairston NG (2019) Consumer-resource dynamics is an eco-evolutionary process in a natural plankton community. Nature Ecology & Evolution, 3, 1351–1358. https://doi.org/10.1038/s41559-019-0960-9

      Wang J, Whitlock MC (2003) Estimating Effective Population Size and Migration Rates From Genetic Samples Over Space and Time. Genetics, 163, 429–446. PMID: 12586728

      Weider LJ, Jeyasingh PD, Frisch D (2018) Evolutionary aspects of resurrection ecology: Progress, scope, and applications—An overview. Evolutionary Applications, 11, 3–10. https://doi.org/10.1111/eva.12563

      Reviews.

      Revision round #2.<br> 2021-04-19.<br> Author's Reply.<br> Download author's reply (PDF file) Download tracked changes file.

      Dear Dr Haag,

      Thanks for handling the review of our manuscript. We agree that the comments of Jon Agren have further improved the quality of this manuscript and we tried to answer all of them (see the point-by-point reply below). We provide a track-changes version where the changes in the main text and supplementary files are highlighted in bold. The new version is also available online on bioRxiv: https://www.biorxiv.org/content/10.1101/2020.08.21.261230v3.

      We hope that you will find this updated version of our manuscript suitable for recommendation by PCIEvolBiol and would be happy to take any further comments if you judge it would improve the manuscript.

      Laurène Gay, on behalf of all the coauthors

      Decision round #2.

      Dear Dr Gay,

      Your revised preprint "Evolution of flowering time in a selfing annual plant: Roles of adaptation and genetic drift" has now been reconsidered by two of the original reviewers. As you will see, while one of them is satisfied with the new version, the other is positive but recommends an additional round of minor revision. From my own reading, I agree that the suggestions by the reviewer will likely further strengthen the manuscript. Therefore, before reaching a final decision, I would like to ask you to consider these suggestions, and to revise the manuscript accordingly. When you submit the revised version, please include a letter in which you describe how you have responded to each of the referees' comments.

      Best wishes, and many thanks for submitting to PCI Evol Biol,

      Christoph Haag

      Preprint DOI: https://www.biorxiv.org/content/10.1101/2020.08.21.261230v2

      Reviewed by Jon Agren, 2021-04-16 11:54.<br> I think the presentation has benefitted from the revisions made by the authors. Below is a list of comments on details regarding terminology and presentation that the authors may want to consider.

      p. 1, Abstract first sentence. Resurrection experiments can detect correlations between trait modifications and changes in the environment, but this is not really a test of a causation, is it? Or is the argument here that simultaneous parallel changes in many populations indicate a change in the environment acting over a large area? This could be indicated with a slight rewording.

      p. 1, Abstract first sentence. Change “traits modifications” to “trait modifications”.

      p. 2, first paragraph. Not fully clear what the important difference is between experimental and natural populations. In both cases, an estimate of effective population size is required.

      p. 2, right column, line 3. What does “>0.5” refer to? A broad-sense heritability estimate?

      p. 2, right column, line 27. Insert “selfing” after “predominantly”.

      p. 2, right column, line 36, “across 22 generations”. Does this species have any seed bank that may affect “effective generation time”?

      p. 2, right column, line 47, “taking into account the multilocus genotypic composition…”. Unclear how this should be understood. Reword?

      p. 2, right column, line 50, “for neutrality”. I suggest the authors indicate how this is achieved. – By using estimates of genotypic values for flowering time and assuming flowering time is a neutral trait?

      p. 3, first paragraph. I still find the procedure for building “families of full sibs” unclear: I suggest the authors state explicitly whether the families multiplied in 2011 each originated from a different pod collected in the field, or whether the families originated from seeds that had been randomly selected from pooled samples of seeds from 1987 and 2009, respectively.

      p. 3, right column, paragraph “Temporal changes in sensitivity to vernalization”, “measured as the slope…” This needs some more explanation. Are differences calculated between all possible pairs of plants in the two treatments?

      p. 4, third paragraph, “good approximation of the additive genetic covariance”. What about maternal environmental and genetic effects?

      p. 4, right column, first paragraph. State explicitly that the individuals analysed represented 145 different families?

      p. 5, second paragraph, “As a preliminary step,…”. To me the argument would make more sense in the reverse order, as the changes in flowering time and MLG frequency between 1987 and 2009 are the most direct estimates of evolutionary change. In other words, starting from the observation of the changes in flowering time and MLG frequency, one can examine the strength of the association between flowering time and MLG in the greenhouse, and whether the change is consistent with selection observed in the greenhouse. I see no a priori reason why selection on flowering time in the greenhouse should mirror that at the site of the focal population. To make this order of logic clear, the authors may want to move the description of the selection gradient analyses to after this argument has been formulated.

      p. 5, second paragraph, “whether selection in quantified in the greenhouse is likely to mirror selection in the field at present and 22 years ago”. To be strict, it would only need to mirror the predominant selection between 1987 and 2009 to be correlated with the change observed, right? Current selection in the field should matter little?

      p. 5, second paragraph, “We then measured…”. I like this approach! The authors should indicate which measure of flowering time was used in this analysis. The legend of Fig. 3 speaks about “average flowering time”. The sensitivity to vernalization treatment varied among genotypes. Are the results of this analysis essentially the same if the analysis is conducted separately for treatment 1 or 2, or separately for estimates of flowering time obtained based on the seed sample from 1987 and from 2009, respectively?

      p. 6, first paragraph; Table 3. Since a single line was sampled in each population, it is a bit misleading to call the examined effect a “population effect”. Change to “line effect”?

      p. 7, first paragraph, “predict an evolution of towards earlier flowering”. Since estimates of selection and heritabilities are specific to a given environment, this prediction is valid for the greenhouse and not necessarily for other environments.

      p. 7. Was there an effect of year of sampling on estimates of flowering time for MLGs sampled in both 1987 and 2009?

      p. 7, right column, second paragraph, “were persistent through time”. Change to “were observed in both years” to make the fact that altogether 5 lines were observed in both the 1987 and 2009 sampling more obvious?

      p. 7, right column, second paragraph, “Fig. 3A, regression only significant…”. Add sample size (i.e., number of family means included in this regression).

      p. 11, second paragraph, “Munguia-Rosas et al.”. Note that selection estimates considered in this meta-analysis largely ignore the effect of variation in number of flowers and plant size, suggesting that many of them rather reflect a correlation between plant condition and fitness.

      Finally, I suggest the authors somewhere add a caveat regarding possible G x E interactions for flowering time (greenhouse vs. field), when discussing the possible association between flowering time as expressed in the greenhouse and fitness and evolutionary change in the field.

      Reviewed by Stefan Laurent, 2021-03-20 17:00.

      I am satisfied with the answers to my comments and with the modifications to the main text. The qqplots should be added to the supplementary figures linked to main figure 3.

      Revision round #1.<br> 2020-10-26.

      Author's Reply.<br> Download author's reply (PDF file).<br> Download tracked changes file.

      Dear Dr Haag, Please find enclosed a revised version of our manuscript. We are very grateful to you and the reviewers for the comments and suggestions that have improved the manuscript substantially. We tried to answer to all of them (see the point-by-point reply below). We provide a track-changes version with line numbers, where the changes in the main text and supplementary files are highlighted in bold. We also added a revised version that you can find after the track-changes (starting page 19). We hope that you will find this updated version of our manuscript suitable for recommendation by PCIEvolBiol and would be happy to take any further comments if you judge it would improve the manuscript.<br> Laurène Gay, on behalf of all the coauthors.

      Decision round #1.<br> Dear Dr Gay, Thank you for submitting your preprint "Evolution of flowering time in a selfing annual plant: Roles of adaptation and genetic drift" to PCI Evol Biol. Your work has now been considered by three reviewers, whose comments are enclosed. As you will see, the reviews are largely positive, and, based on these reviews as well as my own reading, I am happy to further consider your preprint for recommendation. However, before reaching a final decision, I would like you to revise your manuscript according to the recommendations by the reviewers. Besides the more minor points (which also should be considered carefully), I think there are two main issues that need particular attention:

      • First, the introduction (and perhaps also some other sections) would profit from some streamlining. In my opinion, this does not mean that you should entirely drop the discussion of the effects of selfing on the efficacy of selection. But this section should be reduced in length and care should be taken to clearly state the objective of the study early on without raising issues (e.g., comparison between selfers and outcrossers) that are not subsequently addressed. Incidentally, from my own reading, I also think that the last part of page 1 (where you give some more detail on the different possible approaches to investigate the influence of selection on phenotypic change) would profit from some reformulation: I found this part difficult to follow and its purpose is not entirely clear to me: Do you want to provide details on some of the approaches or do you want to explain why you used only some but not others in your study? Moreover, the statement that natural populations cannot be replicated may also need to be nuanced (replication might in principle be possible across different populations or using independent samples from the same population).
      • Second, the analysis of the frequency changes of the multilocus genotypes needs some clarification, both in terms of potential effects of excluding rare genotypes and in terms of confidence intervals given the (likely) non-normal distribution of residuals.

      If you submit a revised version, please include a letter in which you describe how you have responded to each of the referees’ comments. Best wishes, and apologies again for the delayed decision, Christoph Haag

      Additional requirements of the managing board:<br> As indicated in the 'How does it work?’ section and in the code of conduct, please make sure that: -Data are available to readers, either in the text or through an open data repository such as Zenodo (free), Dryad or some other institutional repository. Data must be reusable, thus metadata or accompanying text must carefully describe the data. -Details on quantitative analyses (e.g., data treatment and statistical scripts in R, bioinformatic pipeline scripts, etc.) and details concerning simulations (scripts, codes) are available to readers in the text, as appendices, or through an open data repository, such as Zenodo, Dryad or some other institutional repository. The scripts or codes must be carefully described so that they can be reused. -Details on experimental procedures are available to readers in the text or as appendices. -Authors have no financial conflict of interest relating to the article. The article must contain a "Conflict of interest disclosure" paragraph before the reference section containing this sentence: "The authors of this preprint declare that they have no financial conflict of interest with the content of this article." If appropriate, this disclosure may be completed by a sentence indicating that some of the authors are PCI recommenders: “XXX is one of the PCI XXX recommenders.”

      Preprint DOI: 10.1101/2020.08.21.261230

      Reviewed by Pierre Olivier Cheptou, 2020-10-20 11:18.<br> The study by Gay et al. reports empirical data on the evolution of flowering time in a highly selfing species: Medicago truncatula. The authors used several approaches to investigate the question. In particular, they used a resurrection approach with seeds from 1987 and 2009. The aim of the study is to disentangle the roles of drift and selection in the observed shift, as well as to estimate the selection gradient on flowering time. The study is interesting and the different experiments (population-centered, regional) are consistent with a shift in flowering time. Below, my comments:

      1-The introduction discusses the question of adaptation in the face of environmental change. While the text is rich and well referenced, I found the introduction a bit long. There is a long discussion on whether outcrossing/selfing traits influence adaptation. The logical consequence would be to compare outcrossing/selfing populations. Since the study does not compare outcrossing and selfing populations, I think this part should be greatly reduced. Also, the statement that bottlenecks are more frequent in selfers (if true !!) would be more striking if the references were reporting empirical data. To my knowledge, Schoen and Brown (1991) and Ingvarsson (2002) hypothesize that it is the case but did not demonstrate that selfers suffer from stronger bottlenecks. In the following paragraph, I found it confusing to assert that “self-fertilization mays have facilitated adaptations to agricultural practices” when discussing the role of mating system in adaptation. Is it because the traits were preadapted or because the genetic architecture of selfers facilitates adaptation? In short, the introduction should be more focused, so as to introduce the question of short-term adaptation of flowering time in the face of warming.

      2-Sum of temperature. The individual flowering time is converted into a sum of temperatures. The base temperature (Tb) is assumed to be 5°C, based on Moreau et al. (2007). Would it be possible that Tb has evolved during the two decades? Would the conclusions be different if flowering time were measured as the number of days? At least, the possibility of a shift in Tb should be discussed, as I find it contradictory to evaluate adaptation to warming while keeping Tb constant.
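
      The dependence on Tb raised in this comment can be illustrated with a minimal sketch. It assumes the simplest degree-day accumulation, in which each day contributes its mean temperature minus Tb, floored at zero; the exact formula used in the preprint is not given in this review, and the temperatures below are invented.

```python
def thermal_time(daily_mean_temps, t_base=5.0):
    """Sum of daily mean temperatures above the base temperature (degree-days)."""
    return sum(max(0.0, t - t_base) for t in daily_mean_temps)

# Hypothetical daily mean temperatures (in °C) from sowing to flowering.
temps = [4.0, 8.0, 12.0, 15.0, 10.0]
for tb in (3.0, 5.0, 7.0):
    print(tb, thermal_time(temps, tb))
```

      The same number of days maps to different thermal sums depending on Tb, so a comparison between the 1987 and 2009 samples on this scale implicitly assumes that Tb is shared by both.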

      3-Maternal effects. If I understood correctly, the results on the studied populations are corrected for maternal effects (one generation to refresh the seed stock), but the results of the regional analysis are based on the F1 generation (without correcting for maternal effects). I was interested in the amplitude of the shift: two days in the Cape Corsica populations but five days in the regional analysis. This may be a “true result” or an effect of correcting for maternal effects. Did the authors measure the flowering date in the F1 of the Cape Corsica populations? I would suggest mentioning this result in the discussion. Is it possible that the difference in flowering date reported in the Cape Corsica population has changed because of the F1 generation in the greenhouse? My feeling is that these results are, as such, interesting. We often see this pattern of a lower amplitude after one generation. If it were only noise, the first generation should exhibit either a lower or a higher difference than the F2. Epigenetic components of flowering could have played a role in adaptation to warming, and these effects cannot be distinguished from true quantitative genetic effects if part of them lasts more than one generation. Do the same MLGs (from 1987 and 2009) have the same fitness? Because the authors are fortunate to have the same MLGs in both years, it would be interesting to look at this relationship to investigate maternal effects.

      4-Genetic analysis. If I understood correctly, the test for selection versus drift is based only on conserved multilocus genotypes, i.e. a fraction of the population. Why make this choice? Why not use a Qst/Fst approach that would take all the individuals into account? (The design allows Qst to be estimated, doesn’t it?) In addition, I see a potential bias because the test assumes that the population behaves as a fully selfing population, which is not the case. While the authors point out the potentially different selective responses of outcrossers and selfers, the results reported are based only on the fully selfing fraction of the population, which I found contradictory.

      Overall, I found the ms interesting, and such a long-term dataset is rare. However, the ms would benefit from being more focused (particularly the introduction) in order to highlight the results and their biological interpretation.

      Reviewed by Jon Agren, 2020-10-19 15:12.
      This study uses a resurrection experiment and simulations to explore the possible causes of changes in flowering time and genetic composition of a Medicago truncatula population across 22 generations. In the resurrection experiment, plants grown from seeds collected 22 years apart were raised in the greenhouse to produce selfed lines. These lines were then used to document possible changes in flowering time and to quantify selection on flowering time in the greenhouse. Changes in genetic composition were characterized by scoring 20 microsatellite loci (16 kept after filtering) and documenting changes in the frequencies of multilocus genotypes. The paper is well written and addresses interesting problems of wide general interest. However, I think the authors need to (a) motivate their approach to use estimates of selection obtained in the greenhouse to infer selection in the field, (b) provide more detail on the distribution of multi-locus genotypes and the power of their analysis of change in genetic composition, and (c) clarify a few details when it comes to sampling procedure (see below).

      Main comments:

      The authors appear to assume that selection quantified in the greenhouse is likely to mirror selection in the field at present and 22 years ago. This needs to be motivated.

      I suggest the authors provide more detail on the distribution of multilocus genotype (MLG) frequencies, and that this information be given already at the start of the third paragraph on p. 7. They report that 60 different MLGs were detected in their sample of 145 individuals. Two MLGs were common, and 12 MLGs were shared between the two sampling years. This suggests that most MLGs were rare and perhaps represented by only a single plant. The authors may want to discuss whether their sample sizes are sufficient to characterize changes in the genetic composition of a population with such a skewed distribution of MLGs.

      I suggest the authors clarify a few details regarding sampling:

      (a) For the resurrection experiment, “100 seeds per sampling were replicated” (p. 3, second paragraph). Were these seeds from 100 different pods and thus sampled from 100 plants, or were they a random sample of 100 seeds from a pooled seed sample from each year?

      (b) For the genetic analysis, leaves were sampled from “the multiplication generation in the greenhouse” (p. 4, fifth paragraph), and after filtering 145 individuals remained in the data set to be analysed. Please, state explicitly that the “multiplication generation” refers to the plants derived from the 200 field-collected seeds (presumably representing seeds from 200 plants(?); see previous comment). Were seeds from the two sampling occasions equally represented among the 145 individuals included in the analysis?

      Minor corrections:

      Abstract, line 11 from bottom. Change “population” to “populations”

      p. 7, first paragraph, second line from bottom, “in both years”. From this wording, you easily get the impression that selection was quantified in two years. I suggest you add a few words to indicate that this rather refers to a similar negative relationship being observed among lines derived from each of the two years.

      To make text in graphs readable, font size should be increased in Figures (in particular in Figs. 3-5).

      Reviewed by Stefan Laurent, 2020-10-16 11:05.
      In this study, the authors test whether flowering time evolved in an experimental population of Medicago truncatula and whether this change could represent an adaptation to varying environmental conditions. For this, they measure changes in flowering time in a natural population over 22 generations (2 timepoints), they quantify the association between flowering time and fitness (as approximated by the number of seeds produced), they track changes in haplotype frequencies characterized by different approximated fitness values, and finally they also measure changes in flowering time in 17 populations from the same geographical region that have been sampled twice over a comparable time range.

      The authors report a significant reduction in flowering time in the main population and in the regional analysis that appears to be consistent with the specific effects of climate change in the Mediterranean region (i.e. the limiting summer drought occurs earlier in the year). They also report a significant association between flowering time and seed production. However, the evidence for positive selection obtained by analyzing the changes in haplotype frequencies is at best marginal; even if the authors do a good job of describing some of the uncertainty associated with this analysis, I think that one more aspect should be addressed.

      Besides my major comment, I find the manuscript clearly written, the analyses carefully conducted and presented, and the intro and discussion very well written and informative, at least for the non-expert.

      Major comment

      My only major criticism refers to the results presented in Figure 3. The selection gradients measured here seem to be heavily influenced by two outlier points with low seed production and early flowering. As a result, the linear models (especially the one for MLG found in 1987) appear to be a poor fit to the data, as can probably be seen by inspecting the residuals, which are unlikely to be normally distributed. I think that the authors should report the uncertainty around the slopes and that this uncertainty should be further considered in the analyses presented in figure 4, which will likely cause the observed selection gradient to be non-significant under a larger range of Ne values. I am not sure about the best way to obtain confidence intervals for the selection gradients but I imagine that a bootstrap approach should be applicable.
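
      The bootstrap suggested above could be sketched as follows. This is only an illustration under stated assumptions: the selection gradient is taken as the ordinary least-squares slope of relative fitness on flowering time, genotypes are resampled with replacement (case resampling), and a percentile interval is reported. The function names and the data are hypothetical and are not taken from the manuscript.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # slope is undefined when all resampled x are equal
        ys = [y[i] for i in idx]
        slopes.append(ols_slope(xs, ys))
    slopes.sort()
    lower = slopes[int((alpha / 2) * n_boot)]
    upper = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical genotype means: flowering time (degree-days) and relative fitness.
flowering = [820, 835, 850, 860, 875, 890, 905, 760, 770]
rel_fitness = [1.20, 1.15, 1.05, 1.00, 0.95, 0.90, 0.85, 0.40, 0.45]

print("slope:", ols_slope(flowering, rel_fitness))
print("95% percentile interval:", bootstrap_slope_ci(flowering, rel_fitness))
```

      With a few influential points, as in Figure 3, the interval from such a procedure can be wide, and carrying its bounds into the Ne analysis would show how robust the inference of selection is.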

      Minor

      I agree with the authors that the N_e value estimated from the temporal Fst is very likely underestimated. Comparing the expected heterozygosity under Ne=19 with the observed He would further support the idea that larger Ne values are indeed realistic. How does the observed heterozygosity in the population compare to the theoretical expectations given by Nordborg and Donnelly (1997)? Rescaling the census number (>2000) by 1/(1+F) would lead to a less conservative Ne value for the test for selection and may allow a putative selection signal to be detected even after considering the uncertainty around the observed selection gradient.
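
      A rough numerical sketch of both checks is given below. It assumes the standard drift expectation for loss of gene diversity in an idealized population, H_t/H_0 = (1 - 1/(2Ne))^t, and the equilibrium relation F = s/(2 - s) between the selfing rate s and the inbreeding coefficient. The selfing rate of 0.95 is a hypothetical placeholder, not a value from the manuscript; Ne=19, 22 generations and a census size of 2000 are taken from the text above.

```python
def inbreeding_from_selfing(selfing_rate):
    """Equilibrium inbreeding coefficient under partial selfing: F = s / (2 - s)."""
    return selfing_rate / (2.0 - selfing_rate)


def rescaled_ne(census_n, f):
    """Effective size after rescaling the census number by 1 / (1 + F)."""
    return census_n / (1.0 + f)


def heterozygosity_retained(ne, generations):
    """Expected fraction of initial gene diversity left after drift alone."""
    return (1.0 - 1.0 / (2.0 * ne)) ** generations


# Expected retention of He over 22 generations if Ne were really 19.
print("He retained at Ne=19:", round(heterozygosity_retained(19, 22), 3))

# Rescaled Ne for a census size of 2000 and a hypothetical selfing rate of 0.95.
f = inbreeding_from_selfing(0.95)
ne = rescaled_ne(2000, f)
print("F:", round(f, 3), "rescaled Ne:", round(ne))
print("He retained at rescaled Ne:", round(heterozygosity_retained(ne, 22), 3))
```

      If the observed He in 2009 is much closer to the 1987 value than the Ne=19 expectation allows, that would support using a larger Ne in the test for selection.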

    1. Author Response

      We are grateful for the thorough and thoughtful comments provided by the reviewers, and we appreciate their support for the design and implications of this study. We have addressed the major points raised by the reviewers as follows.

      Major Concerns:

      1) Limitations of extrapolation to human health and disease.

      From Reviewer 2: Though I found the work largely beyond critique technically, I would have appreciated additional discussion of the limitations of the use of a captive non-human primate to model human dietary response.

      From Reviewer 3: However, my major concern is the suitability of these results to explain human relevance and how far they can address the actual evolutionary significance. I think the authors should tone these claims down a little. For example, is there really any strong reason to assume that macaques will mimic dietary responses in humans? I appreciate the fundamental importance of macaque-specific responses, but I am unclear how captive primates can model human effects. How do the authors factor in the (obvious?) fundamental differences between the species, such as different immune response profiles activated against similar cues and different standing microbiomes, which would warrant divergent interactions with the said dietary manipulations? I think these are caveats that need to be carefully discussed to avoid raising excessive expectations among readers.

      From Reviewer 3: Could there be more discussion on the relevance of differentially expressed macaque genes in humans?

      We appreciate the concern regarding possible overinterpretation of results. There is an extensive body of literature demonstrating the utility of the cynomolgus macaque model to explore influences of diet on numerous phenotypes including atherosclerosis and cardiovascular disease, bone metabolism, breast and uterine biology, and other phenotypes (Adams et al., 1997; Clarkson et al., 2004, 2013; Cline et al., 2001; Cline & Wood, 2006; Haberthur et al., 2010; Lees et al., 1998; Mikkola et al., 2004; Mikkola & Clarkson, 2006; Naftolin et al., 2004; Nagpal, Shively, et al., 2018; Nagpal, Wang, et al., 2018; Register, 2009; Register et al., 2003; Shively & Clarkson, 2009; Sophonsritsuk et al., 2013; Walker et al., 2008; Wood et al., 2007). The cynomolgus model was remarkably accurate in predicting effects of hormone therapies on both cardiovascular disease and breast cancer later demonstrated in the very large Women’s Health Initiative (Adams et al., 1997; Clarkson et al., 2013; Naftolin et al., 2004; Shively & Clarkson, 2009; Wood et al., 2007). Cynomolgus macaque responses to other therapies (tamoxifen, selective estrogen receptor modulators, blood pressure medications, etc.) also have shown great similarities to those in humans (Cline et al., 2001). We have added additional text to the Abstract (lines 51-52), Introduction (lines 136-141), and Discussion (lines 531-542) to situate the current work in the extensive literature that uses cynomolgus macaques as a model to understand human health. We have also included discussion regarding the limitations of extrapolating these results to humans in lines 543-545 of the Discussion.

      We also tested the overlap of differential gene expression induced by the Western diet with genes implicated in human complex traits (Zhang et al., 2020). Genes implicated in numerous traits associated with cardiometabolic health were enriched in Western genes, while no traits were enriched in Mediterranean genes. We describe these findings in lines 206-215 of the Results section and in Figure 1—figure supplement 1, which depicts traits relevant to human health and disease identified by previous groups where gene expression profiles overlapped with the “Western genes” in the current study. Lines 668-672 of the Materials and Methods detail the statistical approach used.

      2) Limitations of this experimental design to test the evolutionary mismatch hypothesis.

      From Reviewer 2: My worry is that macaques are so ill-adapted to the Western human diet that the behavioral and inflammation differences seen are explained by this macaque-Western diet mismatch, which dwarfs the human-Western diet mismatch that likely nonetheless exists. This concern can be partially mitigated by careful discussion of this study limitation.

      From Reviewer 2: One critique of dietary interventions that attempt to correct the evolutionary mismatch (which would be useful to address when discussing human-macaque differences) is that human evolution continuing to the present day has been marked by putative selection regime changes associated with multiple major dietary shifts, including meat eating and those arising from cooking and domestication of plants and animals. Such selection may have differentiated humans from macaques in key ways that influence macaque suitability as a dietary model.

      From Reviewer 2: My recommendations for strengthening the work are minor, besides those outlined above to include caveats concerning the differences between macaques and humans that will hopefully prevent lay readers from over-interpreting the results. Specifically, species-level differences which warrant mention include gross differences in "natural" diet between the species, as well as known recent selection on diet-related genes in humans (reviewed in, e.g., Luca et al. 2010; doi:10.1146/annurev-nutr-080508-141048) and gut microbiome differences between the species (e.g., Chen et al. 2018; doi:10.1038/s41598-018-33950-6).

      From Reviewer 2: A simple analysis that begins to address this point analytically would be to compare what results exist for humans (e.g., Camargo et al, 2012; doi:10.1017/S0007114511005812) to those of your study.

      From Reviewer 2: Additionally, one could check whether the DE genes you identify are known to be selected in humans.

      We appreciate the suggestion to strengthen our discussion of the macaque model of human health. As with early hunter-gatherer humans, macaques are omnivorous in the wild, eating a variety of plants and animals. In addition, the cynomolgus macaque often co-exists with human populations, and in that respect may have co-evolved in many ways. Furthermore, cynomolgus macaques have been used in studies of dietary influences on chronic prevalent human disease for 50 years (Malinow et al., 1972), and nearly 700 papers in a PubMed literature search support the idea that cynomolgus responses to diet are remarkably similar to those of humans in all systems studied. Some of these studies are identified above. With respect to the microbiome, previous work by others has demonstrated that the gut microbiome of omnivorous nonhuman primates is similar to that of humans living a modern lifestyle (Ley et al., 2008), and we previously reported similarities in patterns of microbiome responses to Mediterranean vs. Western diets between humans and NHPs in the present study (Nagpal, Shively, et al., 2018). We have added discussion of the above and note limitations of extrapolation to humans due to species-level differences in natural diets and the role that selection may play in responses of humans to Western or Mediterranean dietary patterns (lines 543-545). Similarities between humans in DE genes are noted in responses above. In addition, we already had noted that our studies complement and extend the findings of Camargo (line 399), and we added more detail that we found similar effects of diet on expression of IL6 and NF-kB pathway members (line 397).

      3) Lack of control group maintained on a standard chow diet.

      From Reviewer 2: In future studies, it would be useful to have samples from proper control monkeys fed a standard primate diet.

      From Reviewer 3: Also, this is slightly unfortunate because there is no full control treatment where macaques are maintained in their regular diet (i.e., standard monkey chow) and then compared with groups switched to the Mediterranean vs western diet to estimate the relative deviations from their expected physiological processes and behavioural traits.

      We appreciate the concern regarding the lack of a standard monkey chow diet control group. All monkeys ate chow during the baseline phase and were thoroughly phenotyped, exhibiting minimal differences in monocyte gene expression profiles between groups subsequently assigned to the two diets, which involved stratified randomization based on key baseline characteristics while consuming the same diet. Importantly, monkey chow is unlike any historic or current human or nonhuman primate diet as is apparent in Table 1. It is quite low in fat, and rich in soy protein and isoflavones, which are known to alter physiology and immune system function. Therefore, parallel assessments of health measures in monkeys consuming chow long term do not provide data relevant to diet effects on human health. We have added discussion of the strengths of the study (lines 136-141, 531-542), which was designed in order to be able to draw causal inference about the diet manipulation, and we acknowledge limitations to assess directionality of changes (i.e. which experimental diet is driving a particular observed difference) in lines 545-553.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this manuscript, using in vivo infection of zebrafish embryos with Mycobacterium marinum and THP1-derived macrophages infected with Mycobacterium tuberculosis, the authors show that these pathogenic mycobacteria trigger an increase of K+ concentration through the expression of OXSR1. The ESX1 secretion system that is essential for the virulence of M. marinum is required for the expression of OXSR1 and SPAK. OXSR1 and SPAK are involved in the WNK signaling pathway and are cytoplasmic serine/threonine protein kinases that regulate the function of a series of sodium, potassium and chloride co-transporters via phosphorylation. Given that K+ efflux is now accepted as the main inducer of the NLRP3 inflammasome, the authors report that this infection-induced OXSR1 expression restrains the protective NLRP3 inflammasome response leading to IL-1b maturation and secretion. IL-1b, a very potent pro-inflammatory cytokine, triggers TNF-a production, and the authors demonstrate that infection-induced OXSR1 expression suppressed host-protective TNF-a and cell death early in infection. It therefore appears that virulent mycobacteria induce OXSR1 expression to reduce inflammasome activation by maintaining high intracellular K+. The results presented by the authors are convincing and the conclusions raised by the authors are well supported by the data. In zebrafish embryos, OXSR1 knockdown nicely reduces mycobacterial burden. Based on their conclusion that infection-induced OXSR1 expression reduces NLRP3 inflammasome activation, NLRP3 inflammasome activation therefore has a protective effect against bacterial infection. My main concern is that, surprisingly, nlrp3 or il1b knockdown has no effect on bacterial burden in comparison to control embryos. In line 256, as an explanation, the authors wrote "This may have been because we were using mosaic F0 CRISPR knockout, which is not a complete removal". The removal using mosaic F0 CRISPR knockout is nevertheless sufficient to observe a decrease in bacterial burden following OXSR1 knockdown. Would it be possible that OXSR1 also regulates immunity independently of the NLRP3 inflammasome?

      Yes, we will add text to the discussion to address potential NLRP3-independent mechanisms that connect OXSR1 to immunity against mycobacterial infection.

      The lack of effect of il1b knockdown on M. marinum burden has been corroborated by independent laboratories including a publication from the Elks lab in Journal of Immunology: Ogryzko et al 2019. The Ogryzko study found no effect of il1b knockout on M. marinum burden.

      **Other comments:** OXSR1 WB in extended Data 3 is really poor quality so that it is hard to see the increased expression of OXSR1 following infection.

      The western blot will be repeated for cleaner images.

      Figure 2C. It is not shown, but I guess that similar results should be obtained using M. tuberculosis.

      Material leaving our BSL3 facility must be decontaminated which makes this suggested analysis impossible in our facility.

      Figures 5D and 5E. To confirm the involvement of NLRP3, in addition to using MCC950, NLRP3 knockdown using siRNA should also be performed. NLRP3-deficient THP-1 cells are also commercially available if the siRNA-mediated knockdown of NLRP3 is not convincing enough.

      We will purchase NLRP3 deficient THP-1 cells and use our existing shRNA vector to create NLRP3 and OXSR1 deficient cells. We will repeat the experiments in 5D and 5E in these cells to confirm NLRP3 involvement.

      **Minor comments:** How do the authors think that mycobacterium induces OXSR1 expression following infection? It has not been investigated and it is not discussed.

      In Fig1A we showed upregulation of oxsr1a transcription and in Fig2A we showed upregulation of OXSR1 protein. In line 204 of the discussion we described our hypothesis that oxsr1a transcription is responsive to the mycobacterial ESX1 secretion system.

      Reviewer #1 (Significance (Required)): The observations reported in this manuscript are interesting since for the first time, it is described that virulent mycobacteria induce OXSR1 expression to reduce NLRP3 inflammasome activation by maintaining high intracellular K+. This is quite a significant advance in the field. To escape immune control, many successful intracellular pathogens have evolved methods to limit inflammasome activation. While it is known that potassium efflux is a trigger for inflammasome activation, the interaction between mycobacterial infection, potassium efflux and inflammasome activation was not explored. My field of expertise is the regulation of inflammasome activation. As far as I remember, I've never reviewed a paper using zebrafish embryos but here, the explanations and data are clear so that it was easy to understand and to evaluate. Likewise, I did not know the WNK signaling pathway but the literature clearly shows that it is involved in intracellular ionic balance.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Hortle et al., in this study, evaluated the role of the WNK pathway kinases SPAK and OXSR1 during M. marinum and M. tuberculosis infection. These two kinases inhibit the KCC channels, which tend to export potassium out of the cell. Since potassium efflux is a known stimulator of NLRP3 inflammasome activation, this raises the possibility of a role for these kinases in inflammation and infection. The authors showed that inhibiting OXSR1 genetically and chemically reduced mycobacterial survival in cells and in the zebrafish model, thus proposing OXSR1 as a host-directed therapeutic candidate. They showed that knockdown of OXSR1a leads to NLRP3 inflammasome-mediated IL1B induction, which results in an increase in TNFa and suppression of mycobacterial growth. Furthermore, the reduction in mycobacterial growth in OXSR1a KD zebrafish embryos was found to be dependent on the ESX1 machinery of Mycobacterium. The role of potassium in regulating the host response to Mycobacterium is novel. However, there are a few things missing from this interesting work.

      **Main comments**

      1. Since OXSR1 is known to inhibit KCC channels, which should increase intracellular potassium, why is there no increase in potassium in infected control cells (Fig 2C)? What would be the role of potassium in OXSR1-mediated control of Mtb growth?

      We will perform more experiments with altered levels of extracellular potassium to determine if infected control cells have increased intracellular potassium compared to OXSR1 knockdown cells.

      Does addition of extracellular potassium restrict mycobacterium in OXSR1-KD cells?

      We will perform additional experiments with the addition of potassium to the cell culture medium to address this concern.

      Since OXSR1 is known to inhibit KCC channels, what happens to the activity of these channels in OXSR1 KD cells? This is important, because the authors could not find any difference in intracellular potassium between uninfected control and uninfected OXSR1 KD cells (Fig 2C). It would be good to add the flow cytometric histograms or dot plots of potassium staining in the main figure or in the extended figures.

      We have data showing that although there is minimal difference in basal K+ level in OXSR1 KD cells, there is significantly lower K+ level when the cells are placed in High K+ media, or osmotic shock. We will include this data in the revised manuscript. We will amend the figures to include Flow plots.

      Acquisition of potassium-stained cells: the methodology mentions that ion K+ Green stained undifferentiated THP1 cells were acquired using the PE channel, while differentiated THP1 cells were acquired using the FITC channel. Furthermore, the methods mention that a Leica Sp8 microscope was used to acquire images; however, I do not see any of this data in the manuscript.

      Ion K+ green emits into both the PE and FITC channels. Our choice to use the FITC or PE channel depended on whether the cells were also infected with red fluorescent bacteria, which “contaminate” the PE channel.

      Fig 2E and 3D - What is the meaning of "Normalized CFU/ml"? What does each dot represent? How many times was this experiment performed? Please add this information to the legend.

      Normalized CFU/ml means that the CFU at 3 day post infection were normalized to the 0 day post infection intracellular bacterial burden, to adjust for any differences in phagocytosis of bacteria. Each dot represents the CFU from an infected well in a single representative experiment and the experiment was repeated 3 times. This information will be added to the figure legend.
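
      As an illustration of the normalization described above, here is a minimal sketch. It assumes that each day-3 well is divided by the mean day-0 burden of the same condition; the response does not state whether the mean or a paired value is used, so this detail, the function name and the numbers are all hypothetical.

```python
def normalize_cfu(day3_wells, day0_wells):
    """Divide each day-3 CFU/ml value by the mean day-0 CFU/ml of that condition.

    Assumption: the day-0 wells measure bacterial uptake (phagocytosis) and the
    same wells cannot be re-measured on day 3, so their mean is used.
    """
    if not day0_wells:
        raise ValueError("at least one day-0 well is required")
    baseline = sum(day0_wells) / len(day0_wells)
    if baseline <= 0:
        raise ValueError("day-0 burden must be positive")
    return [cfu / baseline for cfu in day3_wells]


# Hypothetical CFU/ml values for one condition.
day0 = [1.8e4, 2.0e4, 2.2e4]
day3 = [6.0e4, 5.0e4, 7.0e4]

print(normalize_cfu(day3, day0))
```

      A value above 1 then indicates net intracellular growth since uptake, and a value below 1 indicates net killing, independently of how many bacteria each cell line phagocytosed.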


      Fig 1D - What could be the reason for the lack of a statistically significant difference between wild-type and homozygous oxsr1a-KO fish?

      This data is from two experimental replicates. We are currently growing more breeding fish to generate embryos for experimental replicates.

      It would be good to have a schematic model showing the findings of the study.

      We will add a schematic model to the manuscript.

      TNFa is a double-edged sword and can lead to pathology. Hence, treatment of chronically infected animals (say, mice) with Compound B will be needed to confirm the HDT activity of OXSR1 inhibition.

      Yes, we will add discussion of this point as a caveat to our future direction of using OXSR1 inhibition as a HDT.

      Reviewer #2 (Significance (Required)): This study shows a role for kinases that regulate potassium trafficking in the mycobacterium-host interaction. Since kinases are druggable, this opens a new area for developing host-directed therapies for TB.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): In this study, the authors claim to have evidence that OXSR1 inhibits NLRP3 inflammasome activation by limiting potassium efflux during mycobacterial infection. In my opinion, the study lacks important results supporting its main conclusions. In many instances, the authors have over-interpreted their data, and I therefore do not support publication of this study. **Main comments:** Activation of the NLRP3 inflammasome upon OXSR1 knockdown was not convincingly demonstrated.

      We will address the activation state of the NLRP3 inflammasome with NLRP3 KO and OXSR1 KD cells as also suggested by reviewer 1: We will purchase NLRP3 deficient THP-1 cells and use our existing shRNA vector to create NLRP3 and OXSR1 deficient cells. We will repeat the experiments in 5D and 5E in these cells to confirm NLRP3 involvement.

      Clearance of bacteria in an organism, herein zebrafish, involves mechanisms in different cell types including downstream of inflammasome activation. Thus, bacterial clearance experiments in THP-1 cells might not necessarily be related to in vivo experiments in an organismal context. Finally, a mechanism as to how mycobacteria enhance OXSR1 expression to block a NLRP3-mediated response has not been addressed.

      We are not able to perform in depth analysis of the bacterial side of this host-pathogen interaction as my lab will close in the next 4 months. We have shown that transcriptional upregulation of oxsr1a is ESX1-dependent. We will include data on OXSR1 protein expression with WT and ESX1 mutant bacteria when we repeat the western blots in Extended data 3.

      **Specific comments:**

      1. The authors showed that the M. marinum ESX1 secretion system induced OXSR1 expression to inhibit NLRP3 inflammasome activation. This contradicts another recent study (PMID: 18852239), which showed that the ESX1 secretion system activated the NLRP3 inflammasome.

      These effects are not mutually exclusive. The ESX1 secretion system has a “deliberate” purpose in exporting mycobacterial effector proteins to subvert cellular immunity while also having an “accidental” role in exposing the host cell cytosol to vacuolar contents that can activate cellular immunity. We do not assert that mycobacteria completely inhibit all NLRP3 activation; rather, we propose that they attempt to prevent full activation by inducing the expression of host OXSR1. This can be seen in the IL-1b data in figure 3E, where infected WT cells release more IL-1b than MCC950 treated cells, but less than OXSR1 KD cells.

      In line 102, based on Data shown in Fig 1D, the authors concluded that homozygous, but not heterozygous, oxsr1asyd5 embryos showed reduced bacterial burden. However, in Fig 1D, the difference among the genotypes is not significant.

      This concern will be addressed with additional replicates.

      In line 196, the authors stated that "We present evidence that pathogenic mycobacteria increase macrophage K+ concentration by inducing expression of OXSR1." However, the authors did not provide evidence for this.

      We will soften this phrase in the discussion to replace “by inducing” with “and induce”.

      Based on Extended data 3, the authors concluded that infection increases the expression of OXSR1. However, this is not evidenced in the Western Blot. In addition, in panel B, the OXSR1 blot showed many non-specific bands with decreased intensity in OXSR1 knockdown conditions suggesting that there is unequal protein loading making it impossible to interpret these results.

      We will repeat the western blots as per Reviewer 1’s comment as well.

      The authors concluded that infection-induced OXSR1 expression suppressed inflammasome activity to aid mycobacterial infection. Experiments with Compound B, that inhibits OXSR1 phosphorylation, are used in support of the above conclusion. I do not really see a connection between OXSR1 expression and the inhibitor experiment.

      We will reword “expression” to “activity” in regards to the inhibitor experiment.

      In line 187, "Knockdown of tnfa reduced the amount of infection-induced tnfa promoter-driven GFP produced around sites of infection ....". How can a knockdown of tnfa affect the GFP expression driven by the tnfa promoter ?

      The promoter fragment used in the TgBAC construct contains target sites for two of our guide RNAs. We will also include qPCR validation of the knockdown.

      Reviewer #3 (Significance (Required)): Mechanism underlying decreased intracellular potassium level is of great interest in the inflammasome field. However, their observation is not in line with published studies. Audience in the pathogen-host interaction field will be interested. Expertise: dissection of signalling pathway regulation, molecular and cellular mechanism underlying NLRP3 inflammasome activation. We are not using zebrafish model.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, using in vivo infection of Zebrafish embryos with Mycobacterium marinum and THP1-derived macrophages infected with Mycobacterium tuberculosis, the authors show that these pathogenic mycobacteria trigger an increase of K+ concentration through the expression of OXSR1. The ESX1 secretion system that is essential for the virulence of M. marinum is required for the expression of OXSR1 and SPAK. OXSR1 and SPAK are involved in the WNK signaling pathway and are cytoplasmic serine/threonine protein kinases that regulate the function of a series of sodium, potassium and chloride co-transporters via phosphorylation. Given that K+ efflux is now accepted as the main inducer of NLRP3 inflammasome, the authors report that this infection-induced OXSR1 expression restrains the protective NLRP3 inflammasome response leading to IL-1b maturation and secretion. IL-1b, as a very potent pro-inflammatory cytokine, triggers TNF-a production and the authors demonstrate that infection-induced OXSR1 expression suppressed host protective TNF-a and cell death early in infection. It appears therefore that virulent mycobacteria induce OXSR1 expression to reduce inflammasome activation by maintaining high intracellular K+.

      The results presented by the authors are convincing and the conclusions raised by the authors are well supported by the data.

      In zebrafish embryos, OXSR1 knockdown nicely reduces mycobacteria burden. Based on their conclusions that infection-induced OXSR1 expression reduces NLRP3 inflammasome activation, NLRP3 inflammasome activation has therefore a protective effect against bacterial infection. My main concern is that surprisingly, nlrp3 or il1b knockdown has no effect on bacterial burden in comparison to control embryos. Line 256, as an explanation, the authors wrote "This may have been because we were using mosaic F0 CRISPR knockout, which is not a complete removal". The removal using mosaic F0 CRISPR knockout is nevertheless sufficient to observe a decrease in bacterial burden following OXSR1 knockdown. Would it be possible that OXSR1 also regulates immunity independently of NLRP3 inflammasome?

      Other comments:

      OXSR1 WB in extended Data 3 is really poor quality so that it is hard to see the increased expression of OXSR1 following infection.

      Figure 2C. It is not shown but I guess that similar results should be obtained using M. tuberculosis.

      Figures 5D and 5E. To confirm the involvement of NLRP3, in addition to using MCC950, NLRP3 knock down using siRNA should also be performed. NLRP3-deficient THP-1 cells are also commercially available if the siRNA-mediated knock down of NLRP3 is not convincing enough.

      Minor comments:

      How do the authors think that mycobacterium induces OXSR1 expression following infection? It has not been investigated and it is not discussed.

      Significance

      The observations reported in this manuscript are interesting since for the first time, it is described that virulent mycobacteria induce OXSR1 expression to reduce NLRP3 inflammasome activation by maintaining high intracellular K+. This is quite a significant advance in the field. To escape immune control, many successful intracellular pathogens have evolved methods to limit inflammasome activation. While it is known that potassium efflux is a trigger for inflammasome activation, the interaction between mycobacterial infection, potassium efflux and inflammasome activation was not explored.

      My field of expertise is the regulation of inflammasome activation. As far as I remember, I've never reviewed a paper using zebrafish embryos but here, the explanations and data are clear so that it was easy to understand and to evaluate. Likewise, I did not know the WNK signaling pathway but the literature clearly shows that it is involved in intracellular ionic balance.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this project, authors develop a colorimetric and luminescence assay for the detection of SARS-CoV-2 RNA in vitro. They design an RNA based sensor that will be triggered by target RNA then release the ribosome binding site and a translation start site followed by a reporter gene. The released sequence will then trigger the production of reporter protein by transcription-translation coupled assay. Authors also introduce an RNA amplification step in order to increase the sensitivity of this assay.

      **Strengths:**

      This assay provides a simple, rapid way to detect SARS-CoV2 and it is an elegant way to incorporate transcription-translation coupled assay for SARS-CoV-2 RNA detection and identify SARS-CoV-2 patient samples. It is a nice assay and the performance is comparable with the existing method.

      **Weaknesses:**

      However, the positioning of this assay is not very clear. The readout of this assay could be recorded by camera whereas it includes several steps such as RNA extraction, amplification, transcription-translation coupled assay and reporter reaction. The limitations of the existing methods (RT-PCR, paper strip) and the advantages of this assay haven't been demonstrated by the experiments. The stability of RNA may also restrict the application of the proposed assay on site.

      **Major comments:**

      Authors are suggested to design an experiment to show the advantage of this assay compared with the existing method.

      Response: We thank the reviewer for pointing this out. In Fig 5, we show a comparison of our assay with the benchmark in COVID-19 diagnostics, which is the RT-qPCR assay. We specifically correlate the Ct values obtained for RT-qPCRs with the amount of color or luminescence obtained through our assay. From these experiments we note that the sensitivity of our assay is a little lower than that of RT-qPCR, in that our assay does not detect samples with Ct values in the 36 to 38 range (very low viral loads). This comparative experiment highlights that our assay bears clear advantages over the RT-qPCR in terms of ease of assay set up, ease of color detection, amenability to cell-phone imaging and no requirement of sophisticated equipment or technical training to interpret results. The full details of these comparisons are discussed in the manuscript.

      This is consistent with the literature on COVID-19 diagnostics where new assays are routinely benchmarked against the “gold-standard” RT-qPCR assay (Corman et al., 2020; Pearson et al., 2021).

      What is the limit of detection of this assay using LacZ and Luciferase reporter respectively?

      Response: The limit of detection of the assay as shown in Fig 4B and Fig 4C-D, was found to be 100 copies of RNA, which translates to a concentration of 8 attomolar RNA. In this case, we find the limit of detection to be the same for both LacZ (Fig 4B) and Luciferase (Fig 4C-D) reporter.

      The calculations of copy number and sensitivity were made using a commercial source of synthetic CoV-2 RNA (Twist Biosciences) that is used in several studies about COVID-19 diagnostics (Joung et al., 2020; Rabe & Cepko, 2020; Wu et al., 2021). The RNA copy numbers are taken from the product details provided by the manufacturer. These details are now clearly stated in the manuscript. The commercial RNA is provided at 10^6 copies per µl. From this we take as low as 100 copies per 20 µl of NASBA reaction, which we are able to detect using our assay. Hence our sensitivity comes to 8 attomolar. We have clarified this in the manuscript. We noticed a typo in the original submission where we refer to a sensitivity of 80 attomolar in the Discussion. This is corrected to 8 attomolar. With this sensitivity we are within the range to detect RNA in patient samples, as confirmed by our patient data.

      Authors have not examined the selectivity of this assay. What is the specificity, selectivity for each of these variants? Does altering target RNA change the specificity?

      Response: We thank the reviewer for raising this point. As recommended by the reviewer, we have now examined the selectivity of this assay through new data (See new Fig S3, new Fig S4 and new Fig S8, also shown below).

      We have examined selectivity in 3 different ways.

      1. Is our sensor selective to the said region of the SARS-CoV-2 genome? To address this, we generated 19 different Target (Trigger) RNAs spread across the SARS-CoV-2 genome. These were tested against Sensor 12 to examine for their ability to trigger the sensor. We find that our sensor is highly selective for its target RNA and does not show any detectable response to the other regions of SARS-CoV-2 (see new Fig S3).

      Next, we asked if our assay is selective to SARS-CoV-2 versus other related human corona viruses. For this, we first examined the sequence of the target RNA (Amplicon RNA 12) that is sensed by Sensor 12. We selected equivalent regions of RNA from a different coronavirus, the HKU1 human coronavirus family. We generated these RNA sequences in vitro and performed IVTT. These new data are shown in new Fig S4 and below. We find that the human coronavirus (HKU1) RNAs are not able to turn on our sensor, whereas the cognate SARS-CoV-2 RNA is able to.

      We then asked if our assay can detect a current prominent variant of SARS-CoV-2. A major cause of concern is the ability of SARS-CoV-2 to accumulate mutations in its genome, resulting in different variant strains of SARS-CoV-2. Of these variants, the Delta variant (B.1.617.2) is not only highly contagious but has been noted as a possible vaccine breakthrough mutant of SARS-CoV-2. For this, we obtained RNA from the patient nasopharyngeal swab samples from the NCBS-inStem Covid-19 testing Center, Bangalore, India. RNA was isolated in the BSL-3 facility at the testing center. RNA samples were sequenced and confirmed to be the Delta variant B.1.617.2 (sequences deposited in GISAID). RNA extracted from these patient samples was tested against Sensor 12 using NASBA followed by IVTT. We find that our assay can efficiently detect the Delta variant SARS-CoV-2 RNA from patient samples with a build up of color, but no color was observed from control samples. These new data are shown below and in new Fig 5F and new Fig S8. The ability to detect the Delta variant of SARS-CoV-2 is an important feature of our sensor since this variant is now of global concern and extensively found in the population, even becoming the dominant variant in several countries (Callaway, 2021; O’Dowd, 2021; Torjesen, 2021).

      In Figure 2C-F, sensor 17 showed higher fold change and sensitivity. Why was sensor 12 selected for further study in Figure 3?

      Response: The reviewer rightly notes that sensor 17 responds to 10^12 copies of RNA and hence appears to be inherently more sensitive than sensor 12, which responds to 10^13 copies of RNA. However, neither of these sensitivities is good enough to detect the levels of viral RNA found in patient samples. Hence we coupled these sensors with a step of NASBA amplification. The screen to identify pairs of NASBA primers gave us great hits for sensor 12 right off the bat, where we could detect down to 100 copies of RNA. Hence we moved forward with sensor 12 for further experiments. This has now been clarified in the manuscript.

      Authors should show the error bar in all plots. Authors should also indicate what the error bar means (SD, S.E.M. etc.) throughout the manuscript.

      Response: This is an important point. We have added the error bars and statistical analyses to all relevant plots. We have included the description of these statistical parameters in the figure legends throughout the manuscript, where relevant. Alternatively, experimental replicates are indicated and shown in the revised manuscript. Specifically in Figures 2 and 3 and 4D we have performed statistical analysis to include p-values to show significance of the data. For the data in Figure 4 B-C we include the experimental replicates as a new Supplementary Figure (see new Fig S5). Data in Figure S5 is now updated to include the experimental replicates. For the patient data in Figure 5, we have included details of specificity and sensitivity analysis for clinical samples (see new Fig 5C).

      **Minor comments:**

      "This method is relatively faster but may generate false positives due to non-specific amplification and primer interactions." Reference is needed.

      Response: We have now added the following references in support of this statement. (Gadkar, Goldfarb, Gantt, & Tilley, 2018; Sahoo, Sethy, Mohapatra, & Panda, 2016)

      "using the softwares Primer 3 and NUPACK." Reference is needed.

      Response: We have now added the following references (Untergasser et al., 2012; Zadeh et al., 2011)

      Reference 15 belongs to CRISPR-CAS based assay but it was cited under RT-LAMP assay.

      Response: This has now been corrected. We thank reviewer for this.

      Reviewer #1 (Significance (Required)):

      This paper will be of interest to scientists interested in developing diagnostic tools for the detection of SARS-CoV2 in viral and host pathogenic sequences; genetic disorders and development of precision medicine.

      Reviewer works in the field of Chemical Biology and Nanotechnology including sensor development and the application in diagnosis, cell physiological studies.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Charkravarthy et al. report a new method for detecting SARS-CoV-2 RNA in both in vitro and human saliva and nasal samples. The new detection method, PHANTOM, is capable of detecting as few as 100 copies of the SARS-CoV-2 genome. The method is demonstrated to be reproducible over a large range of viral titers and results in a binary report on CoV-2 infection. From my perspective the results are strong and fairly convincing (please see comments below). There is clear, logical, flow to the experiments and engineering of the PHANTOM system. The collaborative work is well organized and logical. The work is clearly of high significance and certainly merits expedited review and publication. I would like to unambiguously state that I support publication of this manuscript in its current form in the non-peer reviewed context of this journal, would be more than happy to provide further peer review of this manuscript upon submission to another journal, and would be more than happy to provide further comments if requested by the authors.

      My personal background is broad in range, however, I have a long track record of research in RNA folding, structural biology, biosensor development, and bioinformatics. Given this knowledge base, I found the manuscript rather easy to read and digest. The manuscript is well written and clear. In order to expedite the process of review I will not give a detailed review which would include grammatical errors (there are very few). Rather, I will touch on the most pressing issues I see.

      **Major concerns:**

      1) There are a number of figures that do not show a statistical measure of significance (e.g. error bars, ANOVA, etc.). It is essential that these be included in the final peer reviewed publication. (See Figure 2A, Figure 3D, Figure 4B, Figure 4C, Figure 5A, Figure 5C, Figure 5D).

      Response: This is an important point. We have added the error bars and statistical analyses to all relevant plots. We have included the description of these statistical parameters in the figure legends throughout the manuscript, where relevant. Alternately, experimental replicates are indicated.

      Specifically in Figures 2 and 3 and 4D we have performed statistical analysis to include p-values to show significance of the data. For the data in Figure 4 B-C we include the experimental replicates as a new Supplementary Figure (see new Fig S5). Data in Figure S5 is now updated to include the experimental replicates. For the data in Figure 5, we have included details of specificity and sensitivity analysis for clinical samples (see new Fig 5C).

      2) There are some important points that do not include references within the manuscript. I believe that the authors should reference Abdolahzadeh et al. RNA 2019 in the introduction. This manuscript describes another NASBA viral detection system using fluorescent RNA reporters (also see Trachman et al. Q. Rev. Biophys 2019, for reference on fluorescent aptamers). Also see the ROSALIND method (Jung et al. 2020 Nature Biotechnology) for detecting water contaminants using visual identification by fluorescent aptamers.

      Response: We have added the above mentioned references to the manuscript as suggested by the reviewer.

      3) The discussion states that "The overall sensitivity in the attomolar range ensures detection of infection in the majority of Covid-positive patients in a population". Please provide a reference to support this and explicitly state the concentration of viral RNA in patient samples. There are a number of times that the copy number of viral genomes and sensitivity of the measurement is stated throughout the manuscript. There should also be a reference and statement about concentration.

      Response: The reviewer has raised multiple connected points here, which we address in the revised manuscript.

      1. Concentration of RNA in patient samples: We have added the references (Pujadas et al., 2020; Wyllie et al., 2020) where the authors report that the typical concentration of viral RNA in patient nasopharyngeal swab samples lies in the range of 10^4 to 10^5 copies of RNA per ml. This translates to a concentration range of 10 to 100 attomolar. This reference is now added to the manuscript. For the patient samples used in our study, we refer to the Ct values obtained from the RT-PCR tests and correlate Ct values to the readout from our assay, consistent with other reports on COVID-19 diagnostics (Joung et al., 2020; Vogels et al. 2020; Wu et al., 2021).

      Copy number and sensitivity: As the reviewer notes, we refer to viral genome copy number and sensitivity of our assay in the manuscript. These calculations of copy number and sensitivity were made using a commercial source of synthetic CoV-2 RNA (Twist Biosciences) that is used in several studies about COVID-19 diagnostics (Joung et al., 2020; Rabe & Cepko, 2020; Wu et al., 2021). The RNA copy numbers are taken from the product details provided by the manufacturer. These details are now clearly stated in the manuscript. The commercial RNA is provided at 10^6 copies per µl. From this, we take as low as 100 copies per 20 µl of NASBA reaction, which we are able to detect using our assay. Hence our sensitivity comes to 8 attomolar. We have clarified this in the manuscript. We noticed a typo in the original submission where we refer to a sensitivity of 80 attomolar in the Discussion. This is corrected to 8 attomolar. With this sensitivity we are within the range to detect RNA in patient samples, as confirmed by our patient data.
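
      The copy-number-to-concentration arithmetic in this response can be checked with a short calculation. The sketch below is only an illustration, not code from the manuscript: the function name and constants are assumptions introduced here, and the inputs are the figures quoted in the response (100 copies in a 20 µl NASBA reaction; 10^4 to 10^5 copies per ml in swab samples).

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
ATTO = 1e-18              # 1 attomolar = 1e-18 mol/L


def copies_to_attomolar(copies: float, volume_litres: float) -> float:
    """Convert a count of RNA molecules in a given volume to attomolar.

    Hypothetical helper for illustration; not taken from the manuscript.
    """
    if volume_litres <= 0:
        raise ValueError("volume must be positive")
    moles = copies / AVOGADRO
    return moles / volume_litres / ATTO


# Limit of detection quoted in the response: 100 copies per 20 µl reaction.
lod_am = copies_to_attomolar(100, 20e-6)

# Patient swab range quoted in the response: 10^4 to 10^5 copies per ml.
patient_low_am = copies_to_attomolar(1e4, 1e-3)
patient_high_am = copies_to_attomolar(1e5, 1e-3)

print(f"LOD: {lod_am:.1f} aM")
print(f"Patient range: {patient_low_am:.0f} to {patient_high_am:.0f} aM")
```

      Under these assumptions the limit of detection works out to about 8.3 aM, matching the corrected 8 attomolar figure, and the patient range to roughly 17 to 166 aM, which is the tens-to-hundreds attomolar order of magnitude stated in the response.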

      Reviewer #3 (Significance (Required)):

      I think this is a significant advancement in the field. The introduction of smartphone technology to this robust diagnostic is very attractive. The work is of high significance since the researchers demonstrated robust responses against SARS-CoV-2 variants. As we all now know these are on the rise and cheap robust detection methods are essential for containing this virus.

      Response: We thank the reviewers for the positive comments.

  3. Jun 2021
    1. very few traditional humanists would call their source material “data.” You may have seen this piece in the LA Review of Books in October 2012. While the language is pretty hyperbolic, I do think it helps to convey how uncongenial many humanists feel the notion of data is to the work that they actually do.

      This point about where to draw the line between data and artifacts is interesting considering that digital humanities is built upon the very concept of turning artifacts into data. This connects to the concepts in the article by Trevor Owens about the various properties of data. If we view data as an artifact which can also serve as a source of evidence, then we can preserve the integrity and multifaceted nature of the dataset while still using it to serve the purpose of providing a specific source of numerical evidence. It seems to me that this idea is very important to the digital humanities considering the susceptibility to losing the integrity and humanist nature of original data sources when viewing them as sources for discrete data sets.

    1. I worry that social justice may become simply a “topic du jour” in music education, a phrase easily cited and repeated without careful examination of the assumptions and actions it implicates.

      I completely agree with this statement, and I think that it's become a buzzword (like Alex said) in schools in general, not even just in the field of music education. Our district hired an Equity Officer about 2 years ago, and I was really hoping that they would have a strong presence in our district, at curriculum review meetings, providing PD, etc....I think I have seen them once since I was hired and it was at New Teacher Orientation. We have someone there that could be helping us to fully understand some of these terms/topics instead of assuming we know what it is, its implications, its assumptions, etc. but it feels as if they're not being fully utilized.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Responses to reviewers’ comments

      We thank the reviewers for their encouraging comments and helpful suggestions.

      Reviewer #1

      (Evidence, reproducibility and clarity (Required)):

      Sanchez et al report several new findings about the adhesive protrusions on Plasmodium falciparum infected erythrocytes. Using super resolution microscopy and correlation analysis, they tracked associations between the knob protein KAHRP and erythrocyte membrane cytoskeleton proteins. They have expanded on and improved previous work on the unusual spiral structure of the knobs, which appears to be a spiral ribbon or blade and have shown a developmental pathway for the association of KAHRP with the cytoskeleton. They have localised KAHRP close to the spiral and determined its abundance in the knobs. They have also used cryo electron tomography and subtomogram averaging to get an improved 3D view of the knob structure.

      The work appears to be carefully and thoroughly done, and the paper is clearly written, though non specialists in the optical methods may find it challenging to navigate through the many super resolution images and correlation plots.

      Comment 1: The writing needs minor editing to fix a variety of small linguistic errors and typos. For example, line 97 "sideway positions" (they presumably mean lateral location), line 980 typo overlay, line 366 "then could reorganizes", line 435, "a predict volume".

      We apologize for the linguistic errors and typos. These have been corrected in the revised manuscript.

      (Significance (Required)):

      Comment 2: The study provides a distinct advance on the previous state of knowledge of the structure and biochemistry of the knobs. The knobs play a key role in virulence of P. falciparum and they are quite poorly understood. Although this paper does not represent a major breakthrough in determining the molecular structure or mechanistic role of the knobs, e.g. the biochemical identity of the spiral remains unknown, the new information is valuable and likely to be important in understanding the pathogenic actions of P falciparum.

      We thank the reviewer for appreciating the importance of our study. We believe that our first-time observations on the dynamics of KAHRP are a very important advance in the field and that revealing the mechanistic basis is a great challenge that at the current stage has to be left to future work.

      Comment 3: The interpretation shown in Figure 7 seems fine, except for the proposal that the actin cytoskeleton is reorganised. There is no evidence for that. The cryo tomograms of the cytoskeleton in Watermeyer et al addressed this point and did not find any evidence for reorganisation of the cytoskeleton other than the insertion of the knobs.

      In two previous studies we could show that actin is indeed reorganized by the parasite. It is mined from the protofilaments to generate long actin filaments that connect the knobs with the Maurer’s clefts and which are used for trafficking of cargo vesicles from the Maurer’s clefts to the erythrocyte plasma membrane (Cyrklaff et al. Hemoglobins S and C interfere with actin remodeling in Plasmodium falciparum-infected erythrocytes. Science. 2011 334:1283-1286; Cyrklaff et al. Oxidative insult can induce malaria-protective trait of sickle and fetal erythrocytes. Nat Commun. 2016 7:13401). Moreover, a life-cycle resolved AFM-study of the cytoplasmic side of iRBCs by the group of CT Lim has demonstrated dramatic coarsening of the spectrin network, which must be accompanied by changes to the actin component of the skeleton (Shi, Hui, et al. "Life cycle-dependent cytoskeletal modifications in Plasmodium falciparum infected erythrocytes." PLoS One 8.4 (2013): e61170). Coarsening of the actin-spectrin network would imply a decrease of the amount of actin in the network, which is consistent with its use in the parasite-derived long actin filaments.

      **Referee Cross-commenting**

      I also agree with the other 2.

      Reviewer #2

      (Evidence, reproducibility and clarity (Required)):

      Malaria parasites replicate within circulating red blood cells (RBC). During parasite maturation, the parasite coordinates extensive modification of the host cell, including structural modifications of the RBC cytoskeleton and surface membrane. These host cell alterations play crucial roles in the pathology of malaria, including vascular adhesion by parasitised cells and avoidance of splenic clearance, and so are of great interest. This interesting manuscript describes a detailed examination of the role in these RBC modifications of a well-described parasite protein called KAHRP. Using a combination of cutting-edge super-resolution microscopy, cryo-electron tomography, immuno-EM, SEM and parasite mutagenesis, the authors provide evidence that KAHRP localisation alters during parasite maturation but eventually becomes closely associated with the previously-described spiral structures that underlie infected RBC surface membrane protrusions called knobs. The authors provide improved resolution of the spiral formations, generate a quantitative estimate of the number of KAHRP molecules per knob, and provide a model for the role of KAHRP in attaching other proteins to the spirals based on their observations.

      In general, this study is thorough and well-performed, and the conclusions drawn are well-supported by the data. Although the work does not advance understanding of knob function or the parasite components that form the bulk of the spirals, it provides an interesting and useful contribution to understanding of the manner in which this important pathogen manipulates its host cell.

      We thank the reviewer for appreciating the importance of our study and in acknowledging that it is an important intermediate step towards a complete understanding of skeleton remodelling by the parasite.

      I have just a few minor suggestions that should improve the manuscript.

      Comment 1: Line 91 (Page 2 paragraph 2). It would be greatly helpful here if the authors could provide a more detailed background on the makeup of the RBC cytoskeleton, and in particular the interactions between beta-spectrin and the actin protofilaments of the junctional complexes. The authors should make it clear that the actin-binding domain of beta-spectrin comprises 2 calponin like domains, and that these are attached to the end of the tandem spectrin repeat domains that make up the bulk of the molecule.

      We thank the reviewer for this helpful suggestion and have added a new paragraph to the results section providing detailed background information on the makeup of the RBC membrane skeleton. The new text reads as follows:

      “Major components of the red blood cell membrane skeleton are spectrin and actin filaments (Fig. 1B). The spectrin filaments consist of α- and ß-spectrin, which form α2ß2 heterotetramers by head-to-head association of two αß dimers (Lux, 2016; Machnicka et al., 2014). The N-termini of the ß-spectrin subunits are positioned at the tail ends of the heterotetramer and contain two calponin homology (CH) domains for binding to actin protofilaments consisting of 6 to 8 actin monomers in each of the two strands (Lux, 2016; Machnicka et al., 2014). Protein 4.1R strengthens the spectrin-actin interaction (Lux, 2016; Machnicka et al., 2014). Groups of up to six spectrin heterotetramers can attach to an actin protofilament, resulting in a pseudohexagonal meshwork (Lux, 2016). Ankyrin binds to the C-terminal domain of ß-spectrin and connects integral membrane proteins with the actin-spectrin network in an ankyrin complex (Lux, 2016; Machnicka et al., 2014).”

      Comment 2: Line 97 "These values are slightly larger than the reported physical dimension of the protofilament...". Please provide these reported dimensions here, as well as relevant references.

      The requested information is now provided. The sentence now reads as follows:

      “These values are slightly larger than the reported physical dimension of the protofilaments of ~37 nm (Lux, 2016) and might be explained by the lateral localization of the spectrin binding sites and the additional sizes of the primary and secondary antibody trees used to detect the two targets.”

      Comment 3: Line 366 "reorganize"

      The spelling mistake has been corrected.

      (Significance (Required)):

      Comment 4: This is a useful technical advance in understanding of the structure of the P. falciparum-infected red blood cell, and builds on the work of Watermeyer et al. (2016). The study should certainly be of interest to most malaria researchers, particularly those interested in the pathobiology of the organism.

      We thank the reviewer for supporting our study.

      **Referee Cross-commenting**

      I fully agree with and endorse the comments of the other 2 reviewers.

      Reviewer #3

      (Evidence, reproducibility and clarity (Required)):

      The binding of P. falciparum infected erythrocyte (iRBCs) to the endothelium is mediated by protuberances (knobs). These knobs are assembled by a multi-protein complex at the iRBC surface. It acts as a scaffold for the presentation of the major virulence antigen, P. falciparum Erythrocyte Membrane Protein-1 (PfEMP1). The knob-associated histidine-rich protein (KAHRP) is an essential component of the knobs and therefore essential for the binding of iRBC to the endothelium under physiological conditions. This manuscript focusses on the knob architecture and KAHRP localization.

      Comment 1: It is, at least for this reviewer - hard to assess how the "preparation of exposed membranes by hypotonic shock" and the analysis of the "inverted erythrocyte membrane ghosts" is i) reflective of the physiological architecture within the iRBC and ii) how the authors exclude remnants from Maurer's clefts (MCs) in their preparation. The latter appears especially important for the interpretation of dynamic KAHRP repositioning, as MCs are mobile in early stages and non-mobile later on (e.g. McMillan et al. 2013, Grüring et al. 2011) and the authors observed at least some MAHRP1 signal (Figure S8), which is hard to interpret by the single representative image provided.

      We understand the reviewer’s concerns, but are convinced that we have done the necessary controls to evaluate our approaches. For example, we evaluated the exposed membrane approach by investigating uninfected erythrocytes and comparing the findings with literature reports (see Figure 1). A high degree of agreement was observed. We further would like to point out that the exposed membrane approach has been successfully used by several other studies referenced in the manuscript (Dearnley et al., 2016; Looker et al., 2019; Shi et al., 2013). Please also allow us to explain why we have used exposed membranes instead of whole cells. The reason is that the hemozoin produced by the parasite interferes with STED microscopy, resulting in a quick and strong build-up of resonance energy in the specimen and, eventually, in the disruption of the cell.

      With regard to the question of whether remnants of Maurer’s clefts are present in our preparations, we do not think so: we never observed membrane profiles reminiscent of Maurer’s clefts in SEM images of exposed membranes (see figure at the end of the response letter). Nevertheless, we will double-check this result using STED imaging of exposed membranes treated with an antibody against the established Maurer’s clefts marker SBP1. These data could be added to a revised manuscript.

      Comment 2: line 173: Please provide a detailed description about parasite synchronization (also absent in the methods section).

      A detailed description, including references, has now been added to the methods section:

      “For synchronization of cultures, schizont-infected erythrocytes were sterile purified using a strong magnet (VarioMACS, Miltenyi Biotec) (Staalsoe et al., 1999) and mixed with fresh erythrocytes to high parasitaemia. 5000 heparin units (Heparin-sodium 25000, Ratiopharm) were added and the cells were returned to culture for 4 hrs (Boyle et al., 2010). Following the treatment with heparin, cells were washed with pre-warmed supplemented RPMI 1640 medium and then returned to culture for 2 hrs to allow for re-invasion of erythrocytes. Subsequently, cells were treated with 5% sorbitol to remove late parasite stages (Lambros and Vanderberg, 1979).”

      Comment 3: line 136: Please re-check nomenclature of "PHIS1605w" (mixed nomenclature used throughout the manuscript). I suggest to use either LyMP or the up-to-date ID PF3D7_0532400.

      We apologize for the oversight and now consistently use the ID PF3D7_0532400.

      Comment 4: Please provide source and references for PfEMP1, MAHRP1 and "PHIS1605w" antibodies that are used. I cannot find them in the methods section or in Table S1.

      We apologize for the oversight and now provide the requested information in the amended Table S1.

      Comment 5: line 165: Warncke et al. (2016) appears to be misplaced as an appropriate MAHRP1 reference.

      We now cite the original MAHRP1 publication by Spycher et al. 2003.

      Comment 6: line 159: the sentence "The strong cross-correlation between KAHRP and actin is consistent with previous cryo-electron tomographic analysis showing long actin filaments connecting the knobs with Maurer's clefts in trophozoites (Cyrklaff et al., 2012; Cyrklaff et al., 2011; Cyrklaff et al., 2016)" could be moved to the discussion section.

      The sentence was indeed redundant with a section in the discussion and was removed.

      Comment 7: line 199: The text refers to Fig. 9AB - but should refer to 4AB or suppl. 11.

      We are sorry for this mistake and now refer to the correct figures in the revised manuscript.

      Comment 8: Fig. 4: A solid average for the number of subtomograms, but please provide information about what the arrowheads (4E) indicate.

      Thank you for this comment. The arrowheads indicate peripheral crown-like densities. We have updated the figure legend to clarify this issue.

      Comment 9: The "flexible periphery" is likely a combination of flexibility and occupancy as the average was made from subtomograms with varying number of turns in the spiral. As occupancy is likely a significant contributing factor to the average that should be discussed or at least mentioned.

      Thank you for this important comment. Indeed, a significant variation was observed between the individual knobs. The spirals have variable diameters, and the number of peripheral proteins also varied. We added measurements to Supplementary Figure 11D. In addition, we updated the text and extended the discussion.
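
      The reviewer's occupancy argument can be illustrated with a deliberately simplified toy model. The sketch below is not the authors' averaging pipeline: it assumes each knob is reduced to a count of spiral turns and that every turn present contributes a density of 1.0, and the turn counts are hypothetical values chosen for illustration. Even with no positional flexibility at all, the averaged density falls off toward the periphery simply because fewer knobs contain the outer turns.

```python
# Toy model (an assumption for illustration, not the authors' pipeline):
# each knob is reduced to the number of spiral turns it contains, and every
# turn that is present contributes density 1.0 at its own turn index.

def averaged_turn_density(turns_per_knob, max_turns):
    """Return the mean density at each turn index across all knobs."""
    n_knobs = len(turns_per_knob)
    return [
        sum(1 for turns in turns_per_knob if turns > index) / n_knobs
        for index in range(max_turns)
    ]

# Hypothetical turn counts for five knobs; illustrative values only.
turns_per_knob = [2, 3, 3, 4, 5]
profile = averaged_turn_density(turns_per_knob, max_turns=5)

for index, density in enumerate(profile, start=1):
    print(f"turn {index}: mean density {density:.2f}")
```

      In this toy case the inner turns average to full density while the outer turns fade, which is the occupancy effect that would be superimposed on any genuine flexibility.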

      Comment 10. On that note, did the authors try and classify based on number of turns prior to averaging and if so did the authors see any differences in structures between few turn and many turn spirals?

      We attempted several classifications on the full knobs with variable masks. However, due to the limited number of particles in the dataset, we could not converge to stable solutions. Instead, we decided to adopt the subboxing strategy, in which locally ordered segments at the periphery could be analyzed. This revealed several structural snapshots at the periphery of the knobs.

      Comment 11. What size mask was used? Was it a soft sphere around the core or big enough for the knobs with multiple spiral turns?

      While we attempted several alignments and classifications with variable masks, the final refinement and measurement of FSC was performed with a soft contour mask. We overlaid it with the structure in Figure S11F and uploaded it as a part of the EMDB deposition. We further show the masks used in this study in a new Figure S14.

      Comment 12. It might be useful for readers who are not familiar with Dynamo to provide a little bit more information about how the initial reference was produced. Additionally, more information about the sub-boxing strategy, i.e. spacing etc., would be helpful.

      Thank you very much for the suggestion. For the initial reference we manually aligned all the particles, summed them up and low-pass filtered them. We now describe it in the methods section.

      For the subboxing procedure we added more description to the main text:

      “40 segments were extracted at the radius of the 2nd and 3rd spirals followed by their classification into structural classes.”

      We further extended and simplified the description in the results section (line ~221).
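
      To make the geometry of such a sub-boxing step concrete, the following is a minimal sketch under stated assumptions. It is not Dynamo code and not the authors' procedure: it assumes segments are evenly spaced in angle on circles in the plane of the knob, the two radii are hypothetical placeholders for the 2nd and 3rd spiral turns, and the split of 20 segments per radius is an assumption, since the quoted text only gives the total of 40.

```python
import math

def segment_centres(radius, n_segments):
    """Return (x, y) centres of n_segments evenly spaced on a circle."""
    step = 2 * math.pi / n_segments
    return [
        (radius * math.cos(i * step), radius * math.sin(i * step))
        for i in range(n_segments)
    ]

# Hypothetical radii (arbitrary units) standing in for the 2nd and 3rd
# spiral turns, with an assumed 20 segments per radius (40 in total).
second_turn = segment_centres(radius=10.0, n_segments=20)
third_turn = segment_centres(radius=15.0, n_segments=20)
centres = second_turn + third_turn

print(len(centres), "segment centres")
```

      Each centre would then serve as the origin of a smaller sub-volume that is aligned and classified independently of the rest of the knob.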

      Comment 13: Fig. 5 Additional (earlier) maturation stages of the iRBC with Ni2+NAT-gold-labelling would be a nice add on - this could help confirm the model and would itself be a control for the later stage labelling.

      We thank the reviewer for this insightful suggestion. We are currently performing the proposed experiment and will include it in a revised version of the manuscript.

      Comment 14: line 637: DMSI typo and please provide the supplier for DMSI (DSM1).

      We corrected the typographic error and now provide the name of the supplier.

      Comment 15: Figure 7: Please provide what the purple arrows indicate.

      The figure legend has been updated.

      Comment 16: Fig S11D: The labels X, Y and Z are confusing, describing the slicing axis as "XZ, YZ and XY" view is more intuitive.

      Done as suggested by the reviewer.

      Comment 17: Figure S13 B: WBs are cropped. Please provide un-cropped WB.

      Uncropped Western blots will be provided in the revised manuscript.

      (Significance (Required)):

      In general, I highly appreciated the solid data and the thorough analysis of the microscopy data. The authors investigate the structural organization of knobs in iRBCs using high-resolution imaging techniques including STED and PALM super-resolution microscopy-based approaches and electron tomography. The beauty of this paper is that it does nicely re-investigate knob architecture in iRBC (e.g. Watermeyer et al., 2016, Cutts et al., 2017, Looker et al., 2019, McHugh et al., 2020) and provides some intriguing KAHRP co-localization with cytoskeleton components. The downside of it is that - by nature - it is descriptive (and the data rather confirmatory) and as it stands does not provide us with a deeper molecular dissection of the knob associated structure and its cellular function.

      We thank the reviewer for appreciating our study and would like to emphasize the following novelties in our study:

      • We show that the association of KAHRP with membrane skeletal components is highly dynamic and changes as the parasite matures. Our results on the dynamics of KAHRP organization reconcile conflicting reports in the literature and establish, for the first time, a dynamic model for KAHRP organization.
      • We further show that KAHRP finally assembles at remnant actin-junctional complexes devoid of the actin-capping factors adducin and tropomodulin.
      • We further quantified the number of KAHRP molecules per knob and show that KAHRP is present as 60 copies per knob, a number one order of magnitude greater than previously thought.
      • Last but not least, we provide a 35 Å map of the spiral scaffold underlying knobs and show that KAHRP associates with the spiral scaffold.
      • We conclude by providing a novel model on the biological function of KAHRP by proposing that KAHRP acts as a glue that connects spectrin and parasite-remodeled actin filaments with the knob spiral.

      Referee Cross-commenting

      Fully agreed.

      Boyle, M.J., Wilson, D.W., Richards, J.S., Riglar, D.T., Tetteh, K.K., Conway, D.J., Ralph, S.A., Baum, J., and Beeson, J.G. (2010). Isolation of viable Plasmodium falciparum merozoites to define erythrocyte invasion events and advance vaccine and drug development. Proc Natl Acad Sci U S A 107, 14378-14383.

      Lambros, C., and Vanderberg, J.P. (1979). Synchronization of Plasmodium falciparum erythrocytic stages in culture. J Parasitol 65, 418-420.

      Lux, S.E., IV (2016). Anatomy of the red cell membrane skeleton: unanswered questions. Blood 127, 187-199.

      Staalsoe, T., Giha, H.A., Dodoo, D., Theander, T.G., and Hviid, L. (1999). Detection of antibodies to variant antigens on Plasmodium falciparum-infected erythrocytes by flow cytometry. Cytometry 35, 329-336.

    1. Author Response:

      Reviewer #1 (Public Review):

      This paper examines muscle activity at the single-muscle level during Drosophila ecdysis (adult hatching) behavior. The premise is that quantifying behavior or motor neuron activity is insufficient to understand how the CNS generates behavior - it is also critical to quantify muscle activity. They show that abdominal body wall muscles generate stereotyped patterns of activity during four developmental stages (phase 0, stochastic activity; phases 1-3, each with different patterns of activity). Co-active groups of muscles form "syllables" which are used in different combinations to generate the stereotyped activity seen in phases 1-3. This analysis was facilitated by use of a convolutional neural network. Interestingly, they found examples where muscle contraction did not match muscle activity (GCaMP elevation), showing the importance of measuring both attributes.

      In addition to mapping the stereotyped muscle activity at single muscle resolution in the generation of ecdysis behavior, they find that phase 1 and 3 are quite variable, and speculate that other constraints on the CNS output (e.g. during larval locomotion) may prevent a sharpening up of muscle patterns. They show that the hormone ETH is required for initiating phase 1, and the neuromodulators bursicon and CCAP are required for initiating phase 2. Failure to initiate either phase is lethal. Lastly, they show that in addition to initiating phase 1 or 2, the hormone/neuromodulators result in more coherent muscle activity.

      Overall this study sets the stage for a detailed analysis of motor neuron function in driving muscle activity patterns, and then further into the CNS to understand the role of premotor neurons. Ecdysis behavior has the potential to be a powerful system for understanding how the CNS generates behavior at the single muscle /single motor neuron level, as well as for understanding how neuromodulators act to regulate muscle/motor neuron activity.

      The figures are almost all too small to see the salient information, and the color scheme is often difficult to resolve. Please enlarge the key aspects of the figures; and try to use more distinctive colors where critical comparisons need to be made. Some examples: left/right colored lines in 1G; panel 3D; lines in 3E; all data in 5G (this is the worst for tiny data); 6C,D,J; all of 7.

      Thank you for your thoughtful review and your suggestions on how to improve the manuscript. Some figure panels (e.g. 5G) have been completely replaced. The others mentioned have been divided into multiple figures or panels, which allowed us to enlarge the material in each. Fig. 7 was deleted from the revised manuscript because it was generally found unhelpful. We also felt that the other revisions rendered this figure unnecessary. The revised manuscript now has 11 main figures and 9 figure supplements with more generous layouts for individual panels so that details are more easily resolved. In addition, we attempted to improve the color scheme to facilitate clarity, using the color palette recommended for the color-blind. Other specific changes are referenced in our responses to individual concerns below.

      Reviewer #2 (Public Review):

      The manuscript by Diao et al. is an important extension of their eLife paper of 2017. They have developed new tools that allow them to follow Ca2+ transients in single muscle fibers over the whole animal through the behavioral sequence and also to independently monitor the Ca2+ transients in the endplates of the motor neurons that innervate these muscles. Their goal is to break down the movements that control the ecdysis sequence into elemental "syllables" and then to define the role of these syllables in constructing progressively complex behavioral programs and as targets of neuropeptide modulation.

      A crucial behavior that occurs during P1 in higher flies is the movement of the gas bubble, but this event is largely ignored in the paper. Prior to pupal ecdysis, gas is expelled into the posterior puparial space and then actively translocated, via muscular contractions of the body wall, to the anterior end of the puparium during the latter portion of P1 (shown nicely in the authors' 2017 Video). A detailed study by C.G. Chadfield & J.C. Sparrow (1985. Dev. Genetics 5: 103) of pupal ecdysis in Drosophila emphasized the importance of this translocation for head eversion. When they simply removed the operculum at the start of bubble movement, then the gas bubble could not push the animal backwards in the puparial case and head eversion could not occur. However, they saw normal pupation and head eversion if the removed operculum was immediately replaced and sealed down with petroleum jelly.

      During translocation, the bubble moves in a fragmented fashion between the pupal cuticle and the puparium. Ignoring this movement leads to statements like on line 378 "Because pupal ecdysis is independent of environmental factors and executed in the absence of competing physiological needs, it is likely that its variability is intrinsic to the ecdysis network." For the pupating animal, its "environment" is the inside of the puparial case and the moving bubble is an unpredictable variable in this environment. The trajectory and route of bubble movement are not fixed, and it is likely that variation in sensory feedback from the gas movement explains the motor variability and reduced stereotypy during P1. The role for proprioception during this phase is likely to inform the CNS of the progression of the bubble fragments. The authors' finding that the blockage of proprioceptors suppresses the behavior progression could mean that this sensory information is needed to signal that an anterior space has been produced, and without this signal, the behavior does not progress to its next phase. This should be addressed in the text if not experimentally.

      We very much appreciate the reviewer’s point that the environment within the puparium may affect the pupa’s motor performance. We have now amended our comment on environmental influences to include this point (ll. 479-481 [515-517]), and we elaborate in the Discussion on conditions within the puparium that may influence movement and sensory processing (ll. 457-477 [493-513]). Following the reviewer’s advice, we note that the gas bubble and its dispersion during P1 must be considered a possible determinant of pupal movement. In addition, we mention other possible determinants that we did not previously discuss, namely substrate and surface tension interactions between the body wall, puparium, and residual molting fluid. In line with the reviewer’s point that understanding the environment of the puparium is critical, we stress the need to account for all external forces acting on the pupal body to achieve a complete understanding of the pupal motor output. In the Discussion, we also now mention the reviewer’s interesting hypothesis that creation of the anterior space at the end of P1 may provide sensory information necessary for progression of the behavioral sequence (ll. 534-535 [601-602]).

      Another aspect of the background that is missing is considering earlier studies on the ontogeny of behaviors leading up to ecdysis/hatching. Notable are studies of the progressive construction of the flight motor program during metamorphosis in moths (Kammer & Rheuben 1976 J. Exp. Biol. 65:65.) and a similar feature of assembly of motor programs prior to hatching in Drosophila (Crisp et al., 2008 Development 135:3707). In the moth studies, complex motor programs were gradually assembled during ontogeny with motor neurons firing but without muscle contraction (as the authors see in prepupae during P0 - Fig 2C). A lack of excitation-contraction coupling in the moth prevents muscle movement through most of development. This suppression of contraction is essential because prior to production of adult cuticle, muscle contraction would rip the developing animal apart. The same requirement to suppress muscle contraction would be seen in fly prepupa until sufficient pupal cuticle has been secreted to prevent rupture from actual muscle contractions! This should be addressed in the text.

      We thank the reviewer for his comments and for the references on motor program assembly. We agree that this topic deserved more attention than it was originally given. We have now amended our discussion of P0 to contextualize our observations, pointing to the previous literature on both suppressed muscle activity and latent motor programs observed in other developing animals (ll. 487-500 [523-536]).

      Besides not being explicit about how the syllables combine to build the eight basic movements, it is not clear how these basic movements then combine to support the major behaviors of each phase. This is seen in P1, where we see that swing and brace movements can co-occur (e.g., Fig 3D) but is a swing on one side always associated with a brace on the other? What are their phase relationships? Does their temporal association remain stable as the bouts progress? Another example is in Phase 3. There appear to be 5 basic behaviors associated with bouts in Phase 3. The example in Fig 1H shows double peak bouts in phase 3, and the bulk Ca data show a preponderance of double peaks. The different shapes suggest that there are different movements during the two peaks. Their discussion of P3 movements (around line 273), though, does not address this feature of the double peaks. The example in Fig 7A suggests that some movements, like the PostSwing occur at half the frequency of other movements such as the PostCon and AntComp. Is this the basis of the double peaks and how is that reflected in the movements that are finally produced? This should be addressed in the text.

      We regret the confusion on these points. As described there, we have made numerous changes to the manuscript to clarify how elements of behavior at one level (e.g. movements) derive from lower-level elements (e.g. syllables) and are used to build higher-level elements (e.g. phases). We describe the phase relationships at all levels for P1 and P2 and summarize the more variable constituents of P3 movements in the text (Fig. 7D, E and ll. 247-275 [274-302]). The specific questions raised by the reviewer are also now answered in the text. In brief, early P2 bouts (roughly those prior to head eversion) differ from later bouts in containing only a Swing. Later bouts contain, in addition to the Swing, a Brace performed concomitantly on the contralateral side of the body (ll. 182-183 [197-199]). The movements contributing to the peak-double peak motif common to P3 are now more carefully described (ll. 351-360 [383-393]).

      One approach that I did not find useful was dividing the analysis into compartments - anterior versus posterior and dorsal-lateral-ventral. This may provide a way of generating some statistical analysis, but it did not illuminate anything about the behavior. The line between anterior and posterior segments seems to be arbitrary. Of course, it is important to know if there is directionality of movement [waves going anteriorly versus posteriorly], but beyond that, I am not sure what it adds. [Indeed, it made Fig 7 very confusing!] Also, I could not see a rationale for considering separate dorsal-lateral-ventral compartments. This should be addressed in the text.

      We thank the reviewer for this question, which we now address in a revised section of the Discussion on the topic of neuromodulation and compartmentalization (ll. 539-588 [606-655]). To briefly expand upon our explanation there, we think, first, that compartmental activity allows a useful coarse-grained description of the sequential body wall contractions that give rise to movement, as indicated by the SequenceMatcher similarity scores (Fig. 6E in the revised manuscript). Second, and more important, we think that how activity flows across compartments provides clues about both the central organization and the neuromodulatory control of ecdysis behavior. Both ETHRB and CCAP neuron suppression exert selective effects on A-P compartments. ETHRB neuron suppression blocks the Lift, a movement of the posterior compartment, while suppressing CCAP neurons prematurely terminates the first (and only) swing-like movement by blocking its progression into the anterior compartment. Additionally, the distribution of CCAP-R appears to reflect mechanisms for selectively regulating distinct D-V compartments. Myotopic maps of larval motor neuron dendrites show that MNs innervating dorsal and ventral muscles are spatially segregated from those innervating lateral muscles and have distinct inputs. This suggests distinct regulation of activity in D-V and L compartments and likely distinct functions. Importantly, CCAP-R is expressed only in motor neurons of the D and V compartments, but in the L compartment it is expressed in muscles. As we suggest, this may allow the different regulatory mechanisms of compartmental regulation to synergize during P2. Finally, our subdivision of the A-P axis at the boundary between segments 5 and 6 has both anatomical and functional importance. At the pupal stage, selective muscle loss imposes differences in muscle composition of segments anterior and posterior to this boundary. Most importantly, anterior segments contain M12, which is a major contributor to behavior only after P1 and is targeted by neuromodulatory Type III terminals containing CCAP and Bursicon. In addition, the A-P boundary also conforms to the functionally and neuroanatomically defined “hinge” region of Tastekin et al. (2018, eLife), which regulates the switch from forward to backward movement in the larva. Because the compartmental subdivisions we define conform with neuroanatomical differences and appear to underlie functional differences, our working hypothesis is that they will be important landmarks for mapping behaviorally relevant CNS activity as we begin to image it in the next phase of our work.
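
      For readers unfamiliar with this kind of score, the sketch below shows how a sequence-similarity ratio behaves on coarse-grained compartment sequences. It assumes the scores are of the kind produced by Python's standard-library difflib.SequenceMatcher, which the response does not confirm, and the compartment labels (PD, PL, PV, AD, AL, AV for posterior/anterior crossed with dorsal/lateral/ventral) and the three bouts are hypothetical examples, not data from the study.

```python
from difflib import SequenceMatcher

def sequence_similarity(bout_a, bout_b):
    """Similarity in [0, 1] between two ordered lists of compartment labels."""
    return SequenceMatcher(None, bout_a, bout_b).ratio()

# Hypothetical activation orders of six body wall compartments.
reference_bout = ["PD", "PL", "PV", "AD", "AL", "AV"]
similar_bout = ["PD", "PL", "AD", "PV", "AL", "AV"]   # two neighbours swapped
reversed_bout = list(reversed(reference_bout))          # opposite direction

same = sequence_similarity(reference_bout, reference_bout)
close = sequence_similarity(reference_bout, similar_bout)
far = sequence_similarity(reference_bout, reversed_bout)

print(f"identical: {same:.2f}, swapped: {close:.2f}, reversed: {far:.2f}")
```

      A score of this kind rewards shared ordered runs of compartments, so a bout whose activity flows in the opposite direction scores much lower than one with a local reordering, which is why it can serve as a coarse measure of stereotypy.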

    1. Author Response:

      Reviewer #1 (Public Review):

      This manuscript presents new data and a model that extend our understanding of color vision. The data are measurements of activity in human primary visual cortex in response to modulations of activity in the L- and M-cone photoreceptors. The model describes the data with impressive parsimony. This elegant simplification of a complex data set reveals a useful organizing principle of color processing in the visual cortex, and it is an important step towards construction of a model that predicts activity in the visual cortex to more complex visual patterns.

      Strengths of the study include the innovative stimulus generation technique (which avoided technical artifacts that would have otherwise complicated data interpretation), the rigor of experimental design, the clear and even-handed data presentation, and the success of the QCM.

      The study could be improved by a more thorough vetting of the QCM and additional discussion on the biological substrate of the activation patterns.

      We thank the reviewer for the thoughtful summary of our work, for highlighting the strengths of our methodology and analysis, and for noting that our study will make a worthy contribution to understanding the organizing principles of visual cortex.

      Reviewer #2 (Public Review):

      The goal of this work is to advance knowledge of the neural bases of color perception. Color vision has been a model system for understanding how what we see arises from the coordinated action of neurons; detailed behavioral measurements revealed color vision's dependence upon three types of photoreceptors (trichromacy) and three second stage retinal circuits that compute sums and differences of the cone signals (color opponency). The processing of color at later, cortical stages has remained poorly understood, however, and studies of human cortex have been hampered by methodologies that abandoned the detailed approach. Typical past work simply compared neural responses in two conditions, the presentation of colorful (formally, chromatic) vs grayscale (luminance) images. The present work returns to the older tradition that proved so successful.

      The project's specific goals were to measure functional MRI responses in human cortex to a large range of colors and, equally importantly, to capture the pattern of responses with a quantitative model that can be used to predict responses to many additional colors with just a few parameters. The reported work achieved these goals, establishing both a comprehensive data set and a modeling framework that together will provide a strong basis for future investigations. I would not hesitate to query the data further or to use the QCM model the paper provides to characterize other data sets.

      The strengths of the work include its methodological rigor, which gives high confidence that the goals were achieved. Specifically:

      1) The visual presentation equipment was uniquely sophisticated, allowing it to correct for possible confounds due to differences in photoreceptor responses across the retina.

      2) The testing of the model was quite rigorous, aided by distinct replications of the experiment planned prior to data collection.

      3) The fMRI methods were also state of the art.

      The work was well-situated within the literature, comparing its findings to past results. The limitations and assumptions of the present work were also clearly stated, and conclusions were not overstated.

      Weaknesses of the current draft are, I believe, relatively minor:

      1) The data could be presented in a way to make them more comparable to prior fMRI work, e.g. by using percent change units in more places, comparing the R^2 of model fits reported here to those reported in other papers, and explaining and exploring how the spatially uniform stimuli, used here but not in other fMRI studies, limited responses in visual areas beyond V1.

      2) Comparison between the two models, the GLM and QCM is not quite complete.

      3) The present results are not discussed in context with past results using EEG, and Brouwer and Heeger's model of fMRI responses to color.

      4) Implications of the basic pattern of response for the cortical neurons producing the data are discussed less than they could be.

      We thank the reviewer for this clear summary of the paper, calling to attention our detailed approach to studying cortical color processing, and enthusiasm regarding the impact of our data and computational modeling.

      Reviewer #3 (Public Review):

      The authors describe a method for fitting a simple, separable function of contrast and cone excitation to a set of fMRI data generated from large, unstructured chromatic flicker stimuli that drive the L- and M- cone photoreceptors across a range of amplitudes and ratios. The function is of the form of a scaled ellipse – hereafter referred to as a 'Quadratic Color Model' (QCM). The QCM fits 6 parameters (ellipse orientation, ellipse elongation, and 4 parameters from a non-linear, saturating (Naka-Rushton) contrast response curve). The QCM fits the dataset well and the authors compare it (favorably) to a 40-parameter GLM that fits each combination of chromatic direction and contrast separately.
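
      The two-stage structure described here (an elliptical isoresponse contour followed by a saturating contrast response) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' published implementation: the parameter names, the choice of amplitude, semi-saturation, exponent and offset as the four Naka-Rushton parameters, and all numerical values are hypothetical.

```python
import math

def equivalent_contrast(l_contrast, m_contrast, angle_deg, minor_axis_ratio):
    """Collapse an (L, M) contrast pair to one number via an elliptical contour.

    angle_deg is the orientation of the ellipse's major axis in the L-M plane;
    minor_axis_ratio (between 0 and 1) is minor-axis length / major-axis length.
    """
    theta = math.radians(angle_deg)
    # Project the stimulus onto the ellipse's major and minor axes.
    along_major = l_contrast * math.cos(theta) + m_contrast * math.sin(theta)
    along_minor = -l_contrast * math.sin(theta) + m_contrast * math.cos(theta)
    # The contour is narrow along the minor axis, so contrast there counts more.
    return math.hypot(along_major, along_minor / minor_axis_ratio)

def naka_rushton(contrast, amplitude, semi_saturation, exponent, offset):
    """Saturating contrast response function (one common parameterization)."""
    c_n = contrast ** exponent
    return amplitude * c_n / (c_n + semi_saturation ** exponent) + offset

def qcm_response(l_contrast, m_contrast, angle_deg, minor_axis_ratio,
                 amplitude, semi_saturation, exponent, offset):
    """Six-parameter prediction: ellipse (2 parameters) then Naka-Rushton (4)."""
    c_eq = equivalent_contrast(l_contrast, m_contrast, angle_deg, minor_axis_ratio)
    return naka_rushton(c_eq, amplitude, semi_saturation, exponent, offset)

# Hypothetical parameter values: major axis along L+M (45 degrees).
params = dict(angle_deg=45.0, minor_axis_ratio=0.2,
              amplitude=2.0, semi_saturation=0.5, exponent=2.0, offset=0.0)

l_minus_m = qcm_response(0.1, -0.1, **params)
l_plus_m = qcm_response(0.1, 0.1, **params)
print(f"L-M response: {l_minus_m:.3f}, L+M response: {l_plus_m:.3f}")
```

      With the major axis oriented along L+M, the same physical contrast produces a larger predicted response in the L-M direction, which matches the qualitative pattern the reviewer summarizes.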

      The authors note three things that 'did not have to be true' (and which are therefore interesting):

      1) The data are well-fit by a separable ellipse+contrast transducer - consistent with the idea that the underlying neuronal computations that process these stimuli combine relatively independent L-M and L+M contrast.

      2) The short axis of the QCM tends to align with the L-M cone contrast direction (indicating that this direction is one of maximum sensitivity and that the L+M direction (long axis) is least sensitive). This finding is qualitatively consistent with psychophysical measurements of chromatic sensitivity.

      3) Fit parameters do not change much across the cortical surface – and in particular they are relatively constant with respect to eccentricity.

      This is a technically solid paper – the data processing pipeline is meticulous, stimuli are tightly-calibrated (the ability to apply cone-isolating stimuli to fovea and periphery simultaneously is an impressive application of the 56-primary stimulus generator) and the authors have been careful to measure their stimuli before and after each experimental session. I have a few technical questions but I am completely satisfied that the authors are measuring what they think they are measuring.

      The analysis, similarly, is exemplary in many ways. Robust fitting procedures are used, and model performance and generalizability are evaluated with leave-run-out and leave-session-out cross-validation procedures. Bootstrapped confidence intervals are generated for all fits and analysis code is available online.
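
      For readers unfamiliar with the scheme, leave-run-out (or leave-session-out) cross-validation can be sketched as below: each run (or session) is held out in turn, the model is fit on the rest, and it is evaluated on the held-out group. The function name and the label format are hypothetical, and this is a generic sketch rather than the authors' pipeline.

```python
def leave_one_group_out(group_labels):
    """Yield (held_out_label, train_indices, test_indices) for each group.

    group_labels: one label per sample, e.g. the run or session it came from.
    """
    for held_out in sorted(set(group_labels)):
        train = [i for i, g in enumerate(group_labels) if g != held_out]
        test = [i for i, g in enumerate(group_labels) if g == held_out]
        yield held_out, train, test
```

      Passing run labels gives leave-run-out splits; passing session labels gives leave-session-out splits.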

      The paper is also useful: it summarises a lot of (similar) previous findings in the fMRI color literature going back to the late 90s and points out that they can, in general, be represented with far fewer parameters than conditions. My main concerns are:

      1) Underlying mechanisms: The QCM is a convenient parameterization of low spatial-frequency, high temporal-frequency L-M responses. It will be a useful tool for future color vision researchers but I do not feel that I am learning very much that is new about human color vision. The choice to fit an ellipse to these data must have been motivated at least in part by inspection. It works in this case (possibly because of the particular combination of spatial and temporal frequencies that are probed) but it is not clear that this is a generic parametric model of human color responses in V1. Even very early fMRI data from stimuli with non-zero spatial frequency (for example, Engel, Zhang and Wandell '97) show response envelopes that are ellipse-like but which might well also have additional 'orthogonal' lobes or other oddities at some temporal frequencies.

      2) Model comparison: The 40-parameter GLM model provides a 'best possible' linear fit and gives a sense of the noisiness of the data but it feels a little like a strawman. It is possible to reduce the dimensionality of the fit significantly with the QCM but was it ever really plausible that the visual system would generate separate, independent responses for each combination of color direction and contrast? I suspect that given the fact that the response data are not saturating, it would be possible to replace the Naka-Rushton part of the model with a simple power function, reducing the parameter space even further. It would be more interesting to use the data to compare actual models of color processing in retina/V1 and, potentially, beyond V1.

      3) Link to perception. As the authors note, there is a rich history of psychophysics in this domain. The stimuli they choose are also, I think, well suited to modelling in the sense that they are likely to drive a very limited class of chromatic cells in V1 (those with almost no spatial frequency tuning). It is a shame therefore that no corresponding psychophysical data are presented to link physiology to perception. The issue is particularly acute because the stimulus differs from those typically used in more recent psychophysical experiments: it flickers relatively quickly and it has no spatial structure. It may, however, be more similar to the types of stimuli used prior to the advent of color CRTs: Maxwellian view systems that presented a single spot of light.

      We thank the reviewer for their detailed comments on our paper and for highlighting our careful methodological approach and modeling of the data. We address the specific points.

    1. Author Response:

      Evaluation Summary:

      This paper compares the properties of UV cone output synapses in different regions of the zebrafish retina using a combination of electron microscopy, quantitative imaging and computational modeling. They relate these differences to ultrastructural differences in synaptic ribbons and evaluate them using a previously-developed biophysical model for the operation of the synapse. The finding of regional differences in ribbon behavior is novel and suggests an under-appreciated degree of control of release by ribbon structure and behavior. The presentation of some of the results, particularly the model, could be strengthened.

      We thank the reviewers for their valuable inputs. In response, we have substantially extended and restructured the description of preprocessing steps and modelling to aid clarity. Moreover, we include new analysis of “old” GCaMP6f data to show the similarity of calcium dynamics across retinal regions. Additionally, we worked on the description of the simulation-based inference method and provided more intuitive explanations. Finally, we updated the discussion of the model results. We hope to have addressed the helpful critique of the reviewers and strengthened our conclusions and the whole manuscript.

      Reviewer #1 (Public Review):

      Preprocessing of glutamate traces. The bulk of the analysis in the paper uses "scaled and denoised" traces. It is important to verify that this process did not either introduce or obscure any differences across regions. This should include some validation of the assumptions that go into the scaling process (such as whether a sufficiently low calcium level is achieved to use that as a standard). An example of how this concern could impact the conclusions is that the AZ glutamate traces look less rectified than the others, perhaps due to an elevated baseline, as suggested in the text. But the conclusion about the elevated baseline relies on the scaling process creating a proper alignment such that it is accurate to superimpose the traces as in Figure 3a.

      Thank you for giving us the opportunity to clarify this point. AZ UV-cones indeed have an elevated baseline, as explicitly shown in our previous publication (Yoshimatsu et al. 2020 Neuron). The scaling process recapitulates this baseline shift, as expected. In this previous work we also show how the lower rectification of AZ cones is directly linked to this baseline shift, and it includes experiments specifically designed to find the “true” minimum calcium levels achievable in UV-cones in different parts of the eye, as suggested by the reviewer.

      However, we fully agree that the scaling/denoising process could be described more clearly, and we expanded the explanation in the method section and added a figure (Fig. S3) to visualize all steps explicitly.

      Model fitting. Some key aspects of the model fitting were difficult to evaluate and follow. For example, is the loss function the same as the discrepancy defined in the methods (I assumed that is the case - if not the loss function needs to be defined)? The definition of the discrepancy could be clearer (e.g. be careful about using x here and as the offset of the calcium trace). Related, the results would benefit from a more intuitive description of the fitting, rather than just a reference to the methods (which is a bit dense to go through for that intuitive-level explanation of the model development).

      We added an overview of the simulation-based inference method to the main section of the manuscript. Additionally, we updated the definition of the loss function and tried to give more intuitive explanations. We hope that these changes will help the reader to better understand the computational methods used.

      Some statements seem too strong given the state of current knowledge. E.g. lines 79-80 I think goes too far about the functional role of the ribbon. Similarly lines 97-98 are quite explicit about the connection to prey capture. Lines 276-279 are a particularly important example; I would argue that the statement there requires showing uniqueness of the model.

      We agree that the mentioned statements were perhaps quite strong and we have toned them down in the revised manuscript.

      Could fixation of the retina for EM change the distribution of vesicles in different compartments? I realize this may not be answerable, but a caution about that possibility might be warranted.

      We are not aware of such an effect in previous works. As the reviewer notes it may not be answerable. However, in a way we have an “internal control” for such a possibility, since the different eye regions were treated equally for fixation, yet vesicle distributions differ across eye regions. It seems unlikely that the fixation would have disproportionately distorted vesicle distributions in one eye region without also affecting the others. This is now noted when first discussing the EM approach.

      Line 159: it is not clear how similar the calcium signals are. Specifically, could differences in calcium signal get amplified when passed through simple nonlinearity (e.g. due to the calcium dependence of transmitter release) to account for the differences in glutamate output? Maybe rewording here to leave open that possibility unless you have reason to reject it.

      We agree that this statement was perhaps too strong at this point of the manuscript. We softened it and included a detailed analysis of additional calcium data later to investigate the regional differences of the calcium signal (Fig. 3k-n).

      Can you quantify the fits in Figure 4f,g? For example, can you give a probability of a particular experimental trace or summary parameters for that experimental trace given the parameter probability distributions from the same area and from a different area?

      A quantification of the fits is shown in Fig. S4b,c (previously S3b,c). As we perform “likelihood-free inference”, we cannot give probabilities for the model traces, but we show two different loss functions for the model fits as well as for the linear model: the relevant loss, on which the models are optimized (which is based on the summary statistics) and for comparison the MSE to the experimental traces. We apologize if this was not clearly mentioned in the manuscript. We added it more prominently in the revised version.

      Reviewer #2 (Public Review):

      This study images synaptic calcium and glutamate release from larval zebrafish UV-sensitive cones in vivo. They also study the ultrastructure of ribbon synapses from UV cones in different regions of the retina. They find differences in ribbon dimension and light-evoked glutamate release from cones in different regions of the retina. Cones from dorsal retina show a more pronounced transient component of glutamate release than those from nasal retina. Those in the acute zone in the center of the retina showed intermediate kinetics. Ultrastructural reconstructions of UV-sensitive cones from those regions showed fewer and smaller ribbons in dorsal cones vs. those in the nasal region or acute zone. Light-evoked changes in the kinetics of synaptic calcium were not significantly different, suggesting that differences in release kinetics may be related to differences in ribbon behavior in cones from different regions. To relate these different measurements to one another, the authors modified an existing model of cone release to incorporate a simulation-based Bayesian inference approach for estimating best-fit parameters. The model suggested that the differences in glutamate release kinetics could be explained by differences in the rates of transfer between vesicle pools on and off the ribbon. By fixing different parameters, the authors then used the model to explore the parameter space and general properties of ribbon tuning. They also provide a link to the model for others to use.

      The main new experimental finding is that glutamate release properties differ among cones in different regions. The finding that kinetics of glutamate release and ribbon ultrastructure vary systematically in different regions of the retina is interesting. They relate these data using a model of ribbon release. While the model is not novel in its general design, the incorporation of Bayesian inference is new. The most interesting finding from the model is that the kinetic differences in release between cones are not due to calcium kinetics but arise primarily from differences in transitions between vesicle pools. Nevertheless, using the model, the authors show that calcium levels and kinetics matter, since if they hold other parameters fixed, calcium levels and kinetics are the most important factors in shaping response detectability and response kinetics. This is consistent with a lot of earlier work that calcium kinetics are important for shaping response kinetics at ribbon synapses.

      1) The measured changes in glutamate and calcium are small and noisy and there is considerable overlap in the data from cones in different regions. While the example waveforms show considerable differences, the scatter in the data is less persuasive. If I understand correctly, the imaging data come from 30 AZ, 16 dorsal, and 9 nasal UV cones. With such noisy data, 9 cones seems like a particularly small sample. With imaging data, it should be possible to record from dozens or hundreds of cells and a larger sample would strengthen the conclusions.

      We agree that the sample size is quite small; however, the dual-color experiments are technically extremely challenging. This is partly related to the laser wavelength compromise that needs to be reached for concurrent excitation of red and green fluorescent probes, and to the fact that red probes generally give comparatively poor SNR. Notably, to our knowledge, concurrent 2P imaging of presynaptic calcium and consequent glutamate release in an in vivo scenario is quite novel, and still very much on the edge of experimental possibilities.

      The green glutamate recordings based on iGluSnFR, which are particularly central to our work, do have a reasonably high SNR; the “problem” is more obviously linked to the calcium recordings. For a better understanding of the calcium handling, we therefore reanalysed an “old” dataset from Yoshimatsu et al., 2020, Neuron (see Fig. 3k-n) that was recorded with SyGCaMP6f, which provides much higher SNR (and is a little faster, albeit also more nonlinear). Notably, the SyGCaMP6f calcium dynamics were also analysed in some detail in Yoshimatsu et al., 2020, Neuron, and we built on these conclusions.

      We hope that the analysis of the additional calcium dataset which is now included in the manuscript adds to more persuasive conclusions.

      2) Calcium and iGluSnfr measurements are both single wavelength measurements and thus sensitive to differences in expression of the indicator. In Fig. 3, the authors show that dorsal cones exhibit larger calcium responses than nasal cones (3c) and that AZ cones show larger glutamate responses than nasal cones (3d). Please address the potential impact of differences in expression on these measurements.

      Thank you for this comment. In Yoshimatsu et al., 2020, Neuron we compared “live 2p” and “fixed confocal” data of the same sample to show that biosensor expression in UV-cones was uniform across regions, and that the different brightness levels were rather a result of variations in calcium levels. We extrapolated this knowledge to the biosensors used in the new experiments. We now note this explicitly in the revised manuscript.

      3) Please describe controls performed to assess the potential for spectral overlap between the red and green channels. Is there any bleed-through of one dye into the other channel?

      The expression profiles of the two indicators are very different: the red fluorescence signal appears in cones, the green in HCs. We illustrate this separation in an additional figure (Fig. S2a,b), which shows that there was no obvious spectral mixing of the two fluorescence channels. We have now clarified this in the revised manuscript.

      4) I am not a modeler and while I understand the general approach used for the model, I am not competent to critique specific details of the implementation, particularly the Bayesian inference. However, the fact that the linear statistical model seems to perform just as well as the more ornate model is comforting since it says that the Bayesian inference approach didn't lead the model into an unrealistic parameter space. However, while to my eye the linear model appears to perform just as well as the fancier model, the text says otherwise (Figure 4, lines 270-273). Please clarify.

      Indeed, the linear model captures the general shape of the glutamate response. However, it fails to recover adaptational processes, more precisely the transient components and adaptation over several steps. The model performances are quantified in Fig. S4 (previously S3), and especially with respect to the relevant loss, which is measuring the relevant features, the biophysical model outperforms the linear model. We expanded the discussion on these points in the manuscript and made a more prominent reference to the quantification figure.

      5) Adding a diagram to show where the different regions (dorsal, nasal, acute zone) are located in the eye would be helpful. Is there a difference in the number or size of UV cones from different regions of the retina in larval zebrafish?

      A diagram has been added to Figure 1 as requested. Regarding UV-cone numbers, they do indeed vary across the eye, peaking in the acute zone and, to a lesser extent, nasally. This relationship was explored in some detail in Zimmermann et al. 2018 Curr Biol, and also touched upon in Yoshimatsu 2020 Neuron. This known density difference is now noted in the introduction.

      6) Are differences in ribbon morphology, glutamate responses or calcium changes retained in adult zebrafish retina? While it may not be feasible to perform similar experiments in adult, some discussion of possible differences and similarities with adult retina would be helpful for putting the results in a more general context.

      The reviewer raises an interesting point. Adult zebrafish display a much broader array of visual behaviours than larvae, and moreover have a rather different diet (meaning that the UV-dependence of prey capture - see Yoshimatsu et al., 2020 Neuron - may be different). Unfortunately, the visual ecology of adult zebrafish remains poorly explored, so at this point we can only speculate. Notably, unlike larvae, adults also feature a crystalline mosaic of all cones, meaning that at least numerical anisotropies in cones as they occur in larvae (Zimmermann et al. 2018) are not expected. However, this does not preclude the possibility that UV-cones have different properties across the retina; perhaps this would be the most straightforward way to regionally tune outer retinal outputs in adults. Accordingly, we fully agree that this topic would be exciting to explore; however, it would go beyond what could be achieved within a reasonable revision cycle.

      We now added a summarising note of the above into the discussion section.

      Reviewer #3 (Public Review):

      The strengths of the manuscript: It contains a thorough characterization of the anatomical and physiological differences of UV cone ribbons at different locations using state-of-the-art techniques including serial block-face scanning EM reconstruction and dual-color, simultaneous calcium and glutamate imaging. The Bayesian simulation-based inference model captured the key features of the calcium responses and glutamate release dynamics and provided distributions for each biophysical parameter, which gave insights into their interactions and their impacts on ribbon function. The online tool for ribbon synapse modeling is quite useful. Overall, it is a great effort to understand the function of the ribbon synapse with a suitable system that allows multi-faceted data collection and a new modeling approach.

      The weaknesses of the manuscript: 1) Overall the writing/formatting of the manuscript can be much improved - there are many imprecise, hard to understand descriptions in the manuscript; figure legends/descriptions are often inadequate for easy understanding; inconsistencies between description in the main text and methods; and above all, the descriptions of model itself and the results from the model are not communicated in a way that facilitates the understanding of process and implications. In contrast, the previous papers from the same group employing similar modeling approaches are much better explained. 2) Based on the intuitions from the modeling, there has not been a strong connection established between the anatomical data and the functional data to which the model is built to fit. More clearly identifying the consistencies and discrepancies between the data and the model will help the readers to understand the pros and cons of the model and the limitations of the generalizations from the model.

      Specific questions and recommendations for the authors:

      1) It will be helpful to have a retina diagram indicating the locations of three different regions.

      The requested diagram has been added to Figure 1.

      2) Fig 1d,e,f (and other figure panels in general) there is no need to mark n.s. On the other hand, in the Statistical Analysis section, GAMs models are mentioned only for Fig 1g, but not other results - needs a clarification.

      We find the “n.s.” labels useful, in part because in some panels none of the differences were significant and the label makes this quite explicit. Accordingly, we have opted to retain them. GAMs were indeed only used for Figure 1g - this is motivated by the difference in data structure of this panel compared to others (i.e. a comparison between continuous rather than discrete distributions). We now clarified this in the methods and added a short paragraph on the used testing procedure.

      3) Fig 1h is quite confusing, with a mixture of 3D and 2D plot, schematic drawing and statistical marks. What comparisons are these marks for? The legend is not specific and the Suppl Fig S1 doesn't clarify much.

      The asterisks are meant to indicate a statistically significant difference in the indicated property (e.g. ribbon size/number) relative to the acute zone. We apologise for not making this clear in the previous version; it is now directly noted in the panel. Regarding the 2D/3D representation, we agree that it may be a little confusing, but we cannot think of a “better” way of summarising all properties analysed by EM in a single panel, so we opted to keep it. We did however expand on the related explanation in the legend to further clarify what is shown.

      4) It will be good to discuss the properties of the calcium sensor. Deconvolution of the calcium signal (lines 617-619) notwithstanding, presumably, the sensor has neither the temporal nor spatial resolution to catch the nano-domain calcium peak near the vesicles in RRP, which is critical for the release of RRP.

      This point seems to link to the ongoing debate on the extent to which release from ribbons is driven by micro- and/or nano-domain calcium signalling. It is our understanding that this debate remains unresolved in a truly general sense. Rather, the two seem to be non-mutually exclusive (i.e. both micro- and nano-domain signals working together), and moreover quite specific to each ribbon synapse in question. In larval zebrafish cones, the pedicle has a rather small cytoplasmic volume, there is only one invagination from postsynaptic processes, and all ribbons inside the cone are opposed to this single invagination. Accordingly, on a possible “sliding scale” of micro- vs nano-domain dominance, we think it is likely that in larval zebrafish cones microdomains will have a notable impact on release. While we are not aware of any data directly looking at this question in zebrafish larval UV-cones, there is good data available from systems that are perhaps quite similar, such as mammalian rods (which also have a single invagination site). For example, from Thoreson et al., 2004, Neuron, Figure 3.

      Already at low micromolar concentrations of calcium that are readily achieved at the level of bulk calcium in the terminal (e.g. 1-2 microM), release is driven to a substantial degree.

      However, we fully agree that we cannot detect possible nano-domain calcium signalling with our imaging method (in fact we are unsure that with currently available technology it is technically possible in an in-vivo preparation). We therefore now further emphasise the possibility of nanodomains acting on release in the discussion.

      Notably, we do already allow exploring the possible influence of nanodomain-type calcium kinetics in the online model, and we think this usefully adds to our exploration of links between calcium signalling and glutamate release.

      5) Likewise, the kinetics of iGluSnFR and of glutamate concentration in the cleft. Admittedly, figs 2a, 3c etc. show that the glutamate signal drops rapidly following the transition from dark to light, however, the rates of vesicle pool replenishment are a topic in the field-some discussion of how glutamate clearance from the cleft and the kinetics of the sensor will influence your estimates of replenishment rates would help future readers better interpret your findings in the context of their own observations.

      We agree that there are technical limitations as to what the iGluSnFR signal can tell us about the exact dynamics of glutamate in an unperturbed situation. Likely this will never be fully addressable. Rather, we use the iGluSnFR signals in a comparative fashion across eye regions, where presumably any distortion of the signals as alluded to by the reviewer would be approximately equal. Following the reviewer’s suggestion, we now explain this more directly in the main text.

      6) In Fig 2d, the rising phase kinetics of the Glu for that nasal cone is strikingly different from that of the acute zone cone. However, such difference is not seen in Fig 3. Therefore, the one in Fig 2d may not be a good representation?

      Thanks, we agree. We have replaced the nasal example with a more representative trace.

      7) In Fig 3a, c.u. and v.u. (only defined in Fig 4 in the context of the model) were used here but not S.D. as in Fig 2, any explanation?

      After scaling, SD adopts arbitrary units. For consistency with the model later, we decided to use c.u. and v.u. here (i.e. “calcium units” and “vesicle units”). We agree that this could be explained better, and have now rephrased as follows: “We show the rescaled traces in c.u. (calcium units) and v.u. (vesicle units) respectively, to be consistent with the used units in the model later.”

      8) Lines 186-188, how were traces "normalized with respect to the UV-bright stimulus periods"?

      The traces were rescaled such that the UV-bright stimulus periods had a mean of zero and a standard deviation of one. We included this missing piece of information and expanded additionally the explanation of the pre-processing.
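
      The rescaling described above amounts to a z-score computed from a reference window only. A minimal sketch follows; the function name, the boolean-mask representation of the UV-bright periods and the use of the population standard deviation are assumptions for illustration.

```python
import statistics


def normalize_to_reference(trace, reference_mask):
    """Rescale a trace so the reference samples have mean 0 and SD 1.

    trace:          sequence of fluorescence values.
    reference_mask: sequence of booleans, True where the sample belongs to
                    the reference (here: UV-bright) stimulus periods.
    """
    reference = [x for x, is_ref in zip(trace, reference_mask) if is_ref]
    mean = statistics.fmean(reference)
    sd = statistics.pstdev(reference)  # population SD (an assumption)
    if sd == 0:
        raise ValueError("reference period has zero variance")
    # The same shift and scale are applied to the whole trace.
    return [(x - mean) / sd for x in trace]
```

      Because only the reference period defines the shift and scale, samples outside it (for example dark periods) keep their relative size and can be compared across cells.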

      9) Lines 194-195, "In addition, the glutamate release baseline of AZ UV-cones was increased during 50% contrast at the start of the stimulus" - it is unclear whether higher glutamate baseline occurred during the adaptation step (i.e. it increased during that period) or said increase was the level during adaptation compared to that during bright periods?

      Thank you, we meant the former (i.e. glutamate release “is” higher during the adaptation step). This is now clarified in the text.

      10) Lines 219-220, "a sigmoidal non-linearity with slope k and offset x0 which drives the final release" - this sentence is not clear, needs to clarify that it is referring to the relationship between calcium and release.

      Thanks, this is now clarified in the manuscript.

      11) Lines 230-232, "x0 can be understood as the inverted calcium baseline (see Methods)" - Methods don't cover this point, though it is described in the f(Ca) equation, but it isn't obvious how x0 should be the inverted baseline, as if Ca=x0, f(Ca) = 0.5 (i.e., the point of half-release probability). Please clarify this. In general, there are places where explanations of model found in methods don't match those described in the main text (also see some of the points below). Please go over carefully to ensure consistency.

      x0 can be seen as an inverted baseline, as it shifts the whole non-linearity to a different operating point: the smaller x0, the less additional calcium is needed to trigger vesicle release. If we assume a fixed calcium affinity, this implies an increased baseline level. We apologise for having omitted these explanations in the initial manuscript; we have expanded the explanation in the Methods of the revised manuscript.
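
      The equivalence between lowering x0 and raising the calcium baseline can be checked with a generic logistic function. The form below is an assumption consistent with the reviewer's remark that f(Ca) = 0.5 when Ca = x0; the numerical values of k and x0 are arbitrary and purely illustrative.

```python
import math


def release_nonlinearity(calcium, k, x0):
    """Logistic nonlinearity with slope k and offset x0 (assumed form)."""
    return 1.0 / (1.0 + math.exp(-k * (calcium - x0)))


# Half-maximal output when calcium equals the offset.
half = release_nonlinearity(0.4, 5.0, 0.4)

# Lowering x0 by 0.2 gives the same output as raising calcium by 0.2.
shifted_offset = release_nonlinearity(0.4, 5.0, 0.2)
shifted_calcium = release_nonlinearity(0.6, 5.0, 0.4)
```

      Only the difference between calcium and x0 enters the function, so a smaller x0 cannot be distinguished from a higher calcium baseline, which is the sense in which x0 acts as an inverted baseline.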

      12) Fig 4e suggests a 5-10 times difference in RRP size between acute zone and nasal UV cones, which is not in line with the anatomical data (Fig 1h). Some discussions and clarifications will be helpful.

      As we note in the manuscript, it is difficult to quantitatively link anatomical structures to functional data. However, the small RRP size in the nasal zone inferred by the model (Fig. 4e) matches very well to the low vesicle densities at a small distance from the ribbon in the nasal zone in Fig. 1. Our model thus picks up the right trends for an anatomical structure from pure functional recordings, which is in our opinion already remarkable given the experimental noise and fine-grained differences. We commented on this point in the revised manuscript.

      13) From Fig 4h, and Fig S3b,c, the linear model doesn't look too bad (unless I misunderstand the figure panels, which are not explained in great detail). The explanation in lines 272-274 needs some work to make it clearer.

      Compared to the “best model”, the linear model clearly lacks in accuracy, perhaps most intuitively visible when looking at adaptation kinetics. This is especially the case for the relevant loss, which is based on the summary statistics. We extended the mentioned lines and hope to clarify it now in the manuscript.

      14) Sobol indices and their explanation are lacking. Are they computed using Ca2+ and glutamate signals, or just glutamate? It is hard to parse their relative "contributions" to model behavior as described in the text, when the methods caution against interpreting this analysis as determining the "importance" of parameters (lines 805-806).

      The first-order Sobol indices measure the direct effect of each parameter on the variance of the model output. More specifically, each index tells us the expected reduction in relative variance of the output if we fix one parameter. For the computation, broadly speaking, many parameter sets were drawn from the posterior distribution and the model was evaluated on them. Afterwards, the reduction in variance of the model evaluations was computed when one dimension of the parameter space was fixed. We agree that the indices are non-intuitive to interpret at a single time point; however, their temporal changes give us insight into the time-dependent influence of each parameter on the model output. Often Sobol indices are computed by drawing random samples from a uniform distribution on a high-dimensional cuboid [r1,s1] x … x [rn,sn], where each interval [ri,si] is simply defined as the mean ±10% of the parameter fit. The choice of 10% leaves much room for interpretation and may not be equally meaningful for all parameters. We believe that the inferred posterior distributions are much better suited probability distributions, as they encode all parameter combinations that agree with the experimental data.

      We expanded our explanation on this point in the manuscript.
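
      To make the definition concrete, the sketch below estimates a first-order Sobol index as the expected relative reduction in output variance when one parameter is fixed. It is a brute-force Monte Carlo illustration on a toy model, not the procedure used in the manuscript: the function and argument names are hypothetical, and the estimator assumes the parameters are sampled independently of each other, which a real posterior generally does not guarantee.

```python
import random
import statistics


def first_order_sobol(model, sample_params, index,
                      n_outer=200, n_inner=200, seed=0):
    """Estimate S_i = 1 - E[Var(Y | X_i)] / Var(Y) by nested sampling.

    model:         function mapping a parameter list to a scalar output.
    sample_params: function taking a random.Random and returning a new
                   parameter list (parameters assumed independent).
    index:         position of the parameter that is held fixed.
    """
    rng = random.Random(seed)
    # Total output variance with all parameters free.
    free_outputs = [model(sample_params(rng)) for _ in range(n_outer * n_inner)]
    total_variance = statistics.pvariance(free_outputs)

    conditional_variances = []
    for _ in range(n_outer):
        fixed_value = sample_params(rng)[index]
        outputs = []
        for _ in range(n_inner):
            params = sample_params(rng)
            params[index] = fixed_value  # hold one parameter fixed
            outputs.append(model(params))
        conditional_variances.append(statistics.pvariance(outputs))

    return 1.0 - statistics.fmean(conditional_variances) / total_variance
```

      For a toy model such as y = 2*x1 + x2 with independent uniform inputs on [0, 1], the analytical first-order indices are 0.8 for x1 and 0.2 for x2, which the estimator should approach. For a time-varying model output, the same computation would be repeated at each time point to obtain the time-dependent indices discussed above.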

      15) The sensitivity analysis suggests that vesicle transitions are more important than pool sizes or their calcium dependence. Thus, it appears that one intuition from the model is that ribbon size - the main anatomical difference of the UV cone ribbons from different regions - is not very important for the functional difference observed (also see discussion in lines 438-439). Although, it has been discussed that ribbon size does not necessarily correlate with IP or RRP size, but this appears to be the hallmark of the acute zone.

      As the reviewer notes, one potentially interesting hint from our work is that ribbon size does not necessarily translate 1:1 to vesicle pool sizes, or their relative transition rates. One particularly clear example of this might come from comparing Figs. 1d-f and Fig. 1h, between nasal and acute zone. Both have similar ribbon geometry (Fig. 1d-f), but nasal ribbons nevertheless appear to pack fewer vesicles (Fig. 1h). Linking with our functional data and modelling, it then appears that perhaps on top of that, vesicles simply move at different rates between the pools, a property that is impossible to pick up from a static EM reconstruction.

      More generally, as mentioned in the manuscript and discussed in the previous point, it is difficult to judge the overall importance of a parameter from the sensitivity analysis. However, we clearly see time-dependent effects of the different parameters, and the RRP size in particular matters for the transient component, as can be seen in Fig. 5. Indeed, the pattern for IP size seems to be different, and it may be the case that the stimulus used is not optimal for inferring this parameter from functional recordings.

      How the ribbon size relates to different vesicle densities and how these densities could potentially influence the changing is however still an open question and cannot be answered in the scope of this manuscript.

      16) Lines 460-461, intuitively, a slower RRP refill rate will result in more transient response - after the depletion of RRP, less refilled vesicles to give the sustained component of the response. This is the opposite of what model predicted (a faster RRP). Some explanation and discussion will be helpful.

      The RRP refill rate indeed influences the transience in the mentioned way. However, its influence already starts earlier and is also influencing the overall amplitude (if some minimal background activation is assumed). It is therefore especially influencing the sustained component. However, for the nasal model already the inferred RRP size is the smallest and it seems that a small RRP refill rate is sufficient to produce the sustained response behaviour which we see in Fig. 4f. We thank the reviewer for this thoughtful comment and mentioned this behaviour in the discussion.

      17) Also, the model simplifies vesicle transition rates by removing their calcium dependence. The Methods section indicates that this choice resulted from early fitting results that essentially "dialed out" the calcium dependence. Given the relative freedom that the model seems to have in finding suitable solutions, how is the lack of calcium dependence justified, and what potential impact might it have on the modeling results?

      Identifying model (mis-)specification is a non-trivial task in general. The presented model is complex enough to replicate the recorded data but can easily be extended to more complex dynamics (e.g. more complex calcium handling) in future studies, as it is publicly available online. Further added components could even act as “distractors” to compare the other parameters across zones and we thus decided to use an “as simple as possible” model. Interestingly our previous study (Schröder et al., 2019, Approximate bayesian inference for a mechanistic model of vesicle release at a ribbon synapse, NeurIPS.) showed that even at a temporal resolution of single released glutamate vesicles, it was not necessary to include calcium dependency for the refilling of the vesicle pools. This study thus supports our model choice.

      18) Lines 503-508, "In combination with the approximately equal and opposite effects of calcium baseline on the detectability of On- and Off-events (Fig. 7b,f), this suggest(s) that the calcium baseline may present a key variable that enables ribbons to trade-off the transmission of high frequency stimuli against providing an approximately balanced On- and Off- response behaviour." - what will be the physiological relevance for such conditions, perhaps the level of adaptation? Any existing data or predictions?

      The reviewer raises an interesting but ultimately perhaps unanswerable point, given the scarcity of available data on temporal natural image statistics in the UV band across the larval zebrafish visual field. It is of course tempting to speculate that the ecological need to tune kinetics and On/Off preferences might be linked (e.g. detecting a “dark looming predator” might disproportionately benefit from a rapid Off response). However, to truly understand this idea at a useful level of detail would likely be a rather involved study in its own right. Accordingly, we here prefer to simply point at the possibility to “tune” the ribbon using calcium baseline, and what effects this might have on kinetics if all else was kept equal.

      19) I am slightly skeptical of the predictions that the model might make about the ribbon's frequency tuning (Fig. 7) in light of the fact that the AZ model in particular seems unable to reliably capture the fast transient response to dark flashes (Fig. 4c,f).

      The noted effect in the fast transient components in Fig. 4c,f is partially due to the slow calcium recordings which act as an input for the model in Fig. 4. As mentioned, and discussed above, there is an ongoing discussion to what extent nanodomain or more global calcium concentration drives the release. For this reason, we added a simple calcium model for the simulations for Fig. 7 which includes a variable time constant for calcium (nanodomains would presumably have much faster calcium transients than used for the model default). This allows us to explore the influence of different possible calcium handlings. Although this extrapolation to new stimuli is based on the fitted model, it allows for varying all essential parameters. In the online simulation it can be observed that for fast calcium handlings the ribbon is able to also follow higher frequency stimuli. However, we agree that experimentally testing the influence of different ribbon configurations on frequency tuning is an interesting research direction but goes beyond the scope of this manuscript.

    1. Perhaps a tool for thought isn’t so much a tool for collecting answers, as a tool for asking questions? Can a tool offer new ways to uncover the important questions we can’t yet articulate? I think so.

      Better still an Engine Discovery a Serendipity Engine for Questions too.. Not by the machine, but helping to bring to the human mind a constellation of ideas that may point to the adjacent possible questions arising from the 'clues' pointing to 'clues'. 'Clue' is what TrailMarks Pages are composed of. The primary means of Combination. By construction 'Clues' can be assigned identities, Human readable permanent Identities. The fundamental Means of Abstraction in TrailMarks. In turn Clues contain listicles comprising mixtures of plain text, HTML mashups, and further nested clues, recursively. It is Clues all the way up, ever extending the unending frontier of knowledge. Bringing into purview new things that we did not know about ready to be experienced, brought to awareness, articulated, connected to the existing body of articulation/knowledge creating new questions as well as answers.

    1. Author Response:

      Reviewer #1 (Public Review):

      In this manuscript, the authors build off their previous data where they have identified differences in the sst1 locus as responsible for differences in susceptibility of B6 and C3HeB/Fej mice to Mycobacterium tuberculosis infection. The authors have previously shown that this susceptibility is attributed to higher levels of type I IFN signaling and in particular, the ISG IL-1Ra. The sst1 locus contains many genes that could be contributing to the differential susceptibility in C3HeB/Fej mice, and the model in the field was that differences in Sp110 expression was a likely candidate to explain the susceptibility. However, in this manuscript, the authors show that it is not lower expression of Sp110, but instead decreased expression of another gene in the sst1 locus, Sp140, that contributes to the increased susceptibility of mice carrying the sst1S sequence to bacterial infections. This is a very significant and surprising finding, supported by very clear and convincing data from experiments performed with a high level of rigor. Although identification of the gene responsible for differences in susceptibility and outcomes during bacterial infections is an advance for the field, the manuscript stops there in terms of new insight and falls short of providing any additional information beyond what has already been published regarding how this gene or locus is functioning to regulate immune responses to infection. This limited scope embodies the major concern for this otherwise strong manuscript.

      We thank the reviewer for recognizing the importance of our discovery that loss of Sp140 (not Sp110) confers susceptibility to M. tuberculosis. Our generation of Sp140 deficient mice allows us to demonstrate, for the first time, that Sp140 is a negative regulator of type I IFNs. By generating crosses between Sp140–/– and Ifnar–/– mice, we further demonstrate that type I IFNs mediate the susceptibility of Sp140–/– mice to M. tuberculosis and Legionella. The reviewer appears to believe that because IFNs were previously shown to mediate the phenotype of Sst1S mice that somehow the function of Sp140 was already known. By contrast, we feel that in fact the function of Sp140 was not at all clear prior to our work, and that our work does indeed provide important mechanistic insight into the function of Sp140 as a regulator of type I IFNs. Sst1S mice contain many genetic differences compared to B6 mice. It is only because of our work that we can now go back and reinterpret the prior work on Sst1S mice, but this would not be possible without the work we have reported in this paper. Of course we would love to be able to describe more about the molecular mechanism by which Sp140 represses interferon transcription. This is indeed something we are working on. However, our preliminary experiments indicate this is not likely to be straightforward and will require considerable effort that is certainly beyond the scope of this current paper. It should be noted, for example, that Sp140 is in the same protein family as the well-known transcriptional regulator Aire. The mechanism by which Aire regulates gene expression has been studied for almost two decades and is still not entirely clear (and was certainly not clear in the initial foundational paper on Aire function published by Anderson et al in Science in 2002). We expect the mechanism of Sp140 to be similarly complex.
Importantly, we now know for the first time which protein to study mechanistically, i.e., SP140 instead of SP110.

      Reviewer #2 (Public Review):

      The authors have suggested the importance of SP140 for resistance to Mtb, Legionella infections in mice. They also provide evidence for IFNaR signalling in mediating the increased susceptibility of SP140-/- mice. While they attribute an important function of the transcriptional regulator SP140 to regulation of type I IFN responses by demonstrating the dysregulation of these responses in the SP140-/- mice, more direct evidence for this is needed.

      We appreciate the reviewer’s succinct summary of the main conclusions of our manuscript. While we would agree that there is more to learn about the mechanism of SP140 function, it is not entirely clear to us what the reviewer means when they say that more “direct” evidence is needed for our claim that Sp140 regulates the IFN response during bacterial infection. We feel that the genetic experiments we provide are clear on this point. The reviewer may be thinking that we are proposing a specific mechanism, e.g., that our model is that Sp140 regulates IFN production by binding to the IFN beta gene; although that is an appealing possibility, we agree that is not shown in our manuscript, and indeed, we are careful not to make any such claim. Indeed, we explicitly state that a more indirect mechanism is possible (line 390). What is clear, though, is that loss of Sp140 mediates susceptibility to infection via (direct or indirect) increases in type I IFN. We observe increased type I IFN responses in Sp140–/– mice in vivo, and moreover, we find that a cross of Sp140–/– mice to Ifnar–/– mice reverses susceptibility to infection. These results demonstrate that the dysregulation of type 1 IFN in the absence of Sp140 is not merely correlative, but in fact drives susceptibility to bacterial infection in vivo.

      Reviewer #3 (Public Review):

      In this manuscript Ji et al carefully examine candidate genes driving a previously described susceptibility within the severe susceptibility to tuberculosis (sst1) locus. Surprisingly, mice deficient in the original candidate gene within this locus, SP110, showed no change in susceptibility to infection with M. tuberculosis. In contrast, the authors found that loss of a second gene in this locus, SP140, recapitulated many phenotypes seen in the SST1 mouse, including increased Type I IFN. SP140 susceptibility was reversed by blocking these exacerbated type I IFNs, similar to SST1 mice. RNAseq analysis identified changes in pro-inflammatory cytokines and type I IFNs. The strengths of this paper are the careful and controlled experiments to target and analyze mouse mutants within a notoriously challenging region with homopolymers. Their results are robust, convincing and will be of broad interest to the field of immunology and host-pathogen interactions. Convincingly identifying a single gene within this region that recapitulates many aspects of the SST1 mouse is very important. While a minor weakness is the lack of any mechanistic understanding of how SP140 functions, this is overcome by the impact of the other findings and it is anticipated that this mouse will now be a key resource to dissect the mechanisms of susceptibility in much greater detail.

      We thank the reviewer for their generous evaluation. Mechanistically, we do show that Sp140 affects resistance to bacterial infection via regulation of the interferon response, which we think is an important and technically non-trivial advance that provides insight into the function of Sp140. However, we agree that the mechanism for how Sp140 regulates type I IFN is not shown (nor is it claimed to be shown) and addressing this mechanism is now an important and exciting question for future studies.

    1. Author Response:

      Reviewer #2 (Public Review (required)):

      Using high-speed holographic methodology, the swimming trajectories of two Leishmania life cycle stages are measured. Significant differences between the life stages become apparent. In addition, the authors show in a chemotaxis experiment that the infectious metacyclics respond chemotactically to the presence of macrophages.

      The physics part of the study is flawless, and the holography is very impressive, especially in view of the comparatively simple setup. The analysis and presentation of the data is also flawless.

      What is not so clear is the biological interpretation of the data. Chemotactic behavior has been repeatedly postulated for Leishmania, trypanosomes, and other parasites. However, there have been no experiments to date that allow conclusions to be drawn about in vivo relevance. Unfortunately, this does not really change with this study.

      It has been shown in trypanosomes that the swimming behavior of different species and life stages are influenced by the mechanical conditions of their microenvironments. Viscosity, obstacles, and hydrodynamics can all play a critical role in determining motility. These factors are ignored in the study. Cell culture medium with the viscosity of water cannot image the situation in the vector or body fluids such as blood or lymph. A chemotactic gradient such as the one generated here by rather simple means cannot arise at all in vivo, simply because everything is in flux and parasites and macrophages move continuously. Moreover, one may wonder why Leishmania should actively move chemotactically toward macrophages when they come into contact with target cells much more rapidly by chance due to self-stirring properties of body fluids. I am not questioning the finding at all. I am merely questioning its biological relevance. Perhaps it would be better to describe this aspect of the paper more cautiously and to discuss it quite openly critically. Otherwise, the result might enter our knowledge as evidence for biologically relevant chemotaxis, and that would be problematic.

      We thank the reviewer for their perspective and agree that providing formal evidence for chemotaxis in vivo is complicated. The reviewer is right that mechanical stimulus, viscosity, elasticity etc. are present in body tissues, and that they will affect the motion of the flagellum, and that there is evidence that physical obstructions interrupt the flagellar beat (though ‘stirring’ does not play a role in Leishmania’s motion through tissue). At any rate, we contend that an in vitro study such as ours decouples the mechanical heterogeneity of the in vivo environment from the parasite’s cellular response. If a chemotactic response is present in the parasite, then it will be most sensitively and uniquely tested in an isotropic environment such as a bulk Newtonian fluid - indeed, this is what we find. Chemical gradients are known to occur and persist in cutaneous infections, as damage to tissue, sand fly saliva and Leishmania-derived molecules have been shown to recruit immune cells by this mechanism - we have added references and words to this effect on lines 211-214.

      Reviewer #3 (Public Review (required)):

      The authors describe a clever and powerful assay to show chemotactic behavior in metacyclic Leishmania, which is an important result. The data seem mostly solid, but some results are confusing (perhaps partly an issue with presentation?) and overall conclusions seem like they need to be toned down a little. It is expected that this work will have long-lasting impact on the research community, and the new methods developed will be widely utilized.

      Major concerns:

      • "Pre-Adaptation", e.g. lines 149-150: A major message of the work is to suggest that motility behavior and chemotaxis is a "pre-adaptation". However, I don't agree that the current studies show that "…flagellar motility is a …preadaptation to infection of human hosts." What are the data to support this? The authors do a very good job of defining motility features of PCF and META forms, including quantitative analysis of motility features in 3D. They find that motility differs in PCF vs META forms. They also demonstrate chemotaxis in META forms. But, I don't see how these combined results demonstrate a "pre-adaptation" to infection of human hosts. As such, the "pre-adaptation" statement should be moved to speculation. Notably, I did not see tests for chemotaxis in PCF. Thus, it is not even formally demonstrated whether or not chemotaxis itself is an "adaptation" specific to META forms, or rather (and quite likely) is a fundamental property of all life cycle stages.

      o To test if chemotaxis is an 'adaptation', the authors would need to provide an analysis of PCFs. To be an adaptation, one would expect to find either that PCFs do not exhibit chemotaxis, or that they do not chemotax toward macrophages in the assay used. Without this, the authors cannot say whether chemotaxis is a stage-specific behavior, much less a "pre"-adaptation.

      We have moderated the language around claims of ‘pre-adaptation’ (please see next point for locations), and provided additional results from chemotaxis assays in PCF. Consistent with previous studies (e.g. Oliveira et al, Exp. Parasitol. (2000), Leslie et al., Exp. Parasitol. (2002), Barros et al., Exp. Parasitol. (2006)), we find a different chemo/osmotactic response in which PCF cells are drawn towards the agar in the pipette tip even in the absence of an embedded stimulant such as macrophages. We speculate that this result is due to the presence of small carbohydrate molecules from the unrefined agar - and note that the response is distinct to META, which show no such attraction. However, as suggested, this has been made more speculative in the revised discussion.

      o Note, I think the work would not be negatively affected if the whole concept of "adaptation" were omitted and the work was framed around the very important results of developing a new and powerful approach to investigate Leishmania motility in 3D; quantitative definition of motility parameters; demonstration of chemotaxis in META forms.

      We thank the reviewer for their suggestion (and their positive words), and have modified the language around claims of pre-adaptation. We have rephrased the claims in the abstract, and around lines 188-90 in the summary/conclusions.

      • Chemotaxis: The work would benefit from some commentary on chemotaxis in kinetoplastids. A 'suggestion' for a potential advantage provided by chemotaxis (lines153-155) is not unwarranted, but that should be kept to speculation at this point, and implication that this is an 'adaptation' is not supported by the current data. With report of chemotaxis being a major message, the paper would benefit from a brief discussion on what's been demonstrated regarding chemotaxis in trypanosomatids, as this is an important, yet under-represented area of research on these organisms. Without this, the novelty and significance of the author's rigorous, novel and very interesting work are not brought out.

      We thank the reviewer for this suggestion, and have added another paragraph to the introduction (lines 53-81), giving additional context to our results by providing an overview of more experiments in the field. We have also changed the word ‘suggest’ to ‘speculate’ in the summary and conclusions (line 243).

      • Lines 125 - 129: How is it that tumble frequency decreases, but run duration is unaffected? I would think that less frequent tumbles would lead to longer runs? This warrants more comment.

      We thank the reviewer for pointing out the apparent confusion here. This stems from the fact that (as stated in the subsequent sentence) in the majority of the population, the tumble rate is significantly suppressed, to either one or zero tumbles per track. We require at least two tumbles per track to measure run duration, so the small fraction of the population unaffected by the stimulus contributes the bulk of the measurable runs. We have revised this section of the text to clarify how we measure run duration.
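The measurement rule described in this response can be illustrated with a short sketch. It is a simplified reading of the rule and not the authors' analysis code: a run is taken to be the interval between two successive tumbles, so a track with fewer than two tumbles yields no measurable run. The function name `run_durations`, the track labels, and the tumble times are hypothetical.

```python
def run_durations(tumble_times):
    """Return the intervals between successive tumbles in one track.

    A track with zero or one tumble gives an empty list, because a run
    needs a tumble at both ends to have a measurable duration.
    """
    times = sorted(tumble_times)
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Hypothetical tumble times (in seconds) for three tracked cells.
tracks = {
    "unaffected": [0.5, 2.0, 3.0],  # several tumbles: two measurable runs
    "suppressed_one": [1.0],        # one tumble: no measurable run
    "suppressed_zero": [],          # no tumbles: no measurable run
}
measurable = {name: run_durations(times) for name, times in tracks.items()}
```

In this toy data set only the unaffected cell contributes runs, so the measured run durations stay the same even though the tumble frequency across the whole population drops.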

      • Fig 3 and Lines 135-139: How does one reconcile the finding that murine macrophages and human macrophages both induce taxis toward the pipet tip (3A), but there is opposite impact on speed profiles, with murine macrophages causing slower speeds, and human macrophages causing faster speeds (3H,K vs 3I,L)? Perhaps analysis done for human macrophages must also be done for murine macrophages. Some more commentary, and analysis needs to be provided on this point.

      We thank the reviewer for this suggestion, and in the light of their comments, we have revised our description of the murine data, highlighting that the results are not statistically significant. To further emphasise this point to the reader, we have recast the error bars in figure 3a in terms of 95% confidence intervals rather than using the standard error of the mean, as in the previous version. Although one may be calculated directly from the other without any further assumptions, the 95% CI representation might be more familiar to the readership. In this light, the fairly modest decrease in average swimming speed (also seen in absolute terms in the DMEM case) reinforces the revised conclusion that the null hypothesis (META are not stimulated by mmΦ) cannot be rejected.
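The relationship between the two kinds of error bar can be sketched as follows. This is an illustration under stated assumptions and not the authors' calculation: it uses the common large-sample normal approximation, in which the 95% interval is the mean plus or minus about 1.96 standard errors. The function name `sem_to_ci` and the speed values are made up for the example.

```python
import math
from statistics import NormalDist, mean, stdev


def sem_to_ci(sample_mean, sem, level=0.95):
    """Convert a mean and its standard error into a confidence interval.

    Assumes the sampling distribution of the mean is approximately normal.
    """
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sem
    return sample_mean - half_width, sample_mean + half_width


# Made-up swimming speeds, used only to exercise the function.
speeds = [1.0, 2.0, 3.0, 4.0, 5.0]
sem = stdev(speeds) / math.sqrt(len(speeds))
low, high = sem_to_ci(mean(speeds), sem)
```

For small samples, a critical value from Student's t distribution would give a somewhat wider interval than this normal approximation.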

      • Regarding replicates: While the number of cells tracked are clearly indicated, I did not see a description of how many different chambers were imaged for each condition, or how many different fields per chamber.

      This has been amended in the Methods section, under the subheading “Chemotaxis Assay”.

    1. In my house we spoke Spanish all the time because of my mom. To this day, she doesn't want to learn English even though we tell her to learn English. My little sister, she doesn't speak Spanish, she speaks more English and with her it's different. We tell her, "You have to learn Spanish because it's going to help you," but she doesn't want to learn.Anne: Is she a citizen?Juan: Yes, she was born in the US. So my parents didn't really adapt to the American culture. They always wanted to follow Mexican traditions, even when it's Mother's Day over there … I think here it's May 10th but over there, when is Mother's Day?Anne: I think it's the second Sunday of May, so it could be different days.Juan: We could take that as an example. They'd rather follow Mother's Day here in Mexico than over there. Also Christmas, I guess the one thing they did adapt to was Thanksgiving. We don't celebrate that here in Mexico, but they do celebrate there, and they did adapt that. Another thing, Easter day. You go out with your family, you hide the eggs as a tradition, no? They adapted to that, but here in Mexico they don't do that. They don't even know about that. In a way they wanted to keep their Mexican culture alive even though they were in the US, but they also wanted to adapt to the things that they did there.

      Family, mom, parents, translating for, learning English, Homelife, Mexican traditions, holidays, Spanish language;

    1. intrinsically the mind was virtually omniscient and that it merely it was not in fact omniscient here and now because for the benefit of the 00:34:28 animal who has to survive on the surface of this planet we cannot be omniscient because we should be so full of irrelevant information that we should simply not be able to get out of the way of the cars in the street and 00:34:42 consequently the nervous system central nervous system the brain exists in order to limit this virtually in this quantity of consciousness which we virtually have 00:34:57 to limit it and to funnel it through for the purposes of biological survival on the surface of this particular planet well my own feeling is I would I would think this 00:35:10 idea of a completely omniscient mind is a low seems to me a little fantastic but I would think that there is something to be said for a view which would say that 00:35:24 the this psychic medium whatever it may be is let us say virtually omniscient that is it could take on into itself 00:35:38 every kind of specialized information but what it is in itself is a kind of undifferentiated consciousness and as I shall try to point out later on in this 00:35:52 lecture there is a lot of evidence from the part of the on the part of the Mystics both east and west to the effect that our particular specialized 00:36:06 individualized consciousness is underlain by an undifferentiated consciousness and this again differentiated consciousness possesses

      brain is there to limit

    1. Author Response:

      Reviewer #1:

      This study reports the novel and interesting finding that AKAP220 knockout leads to a dramatic increase in primary cilia in renal collecting ducts. AKAP220 is known to sequester PKA, GSK3, the Rho GTPase effector IQGAP-1 and PP1. Previous work from this group demonstrated that AKAP220-/- mice exhibit reduced accumulation of apical actin in the kidney attributable to less GTP-loading of RhoA. Relatedly, AKAP220-/- mice display mild defects in aquaporin 2 trafficking. In this work, Gopalan et al. examine the effects of AKAP220 mutation on cilia. They demonstrate increased numbers of primary cilia decorating AKAP220-/- collecting ducts. This phenotype is striking as little is known about negative regulators of cilium biogenesis.

      The authors also provide evidence that interaction of AKAP220 with protein phosphatase 1 (PP1) is critical for its function. Through PP1, AKAP220 may regulate HDAC6, which may in turn inhibit tubulin acetylation, which may in turn control cilia stability. Aberrant cilia function is implicated in autosomal dominant polycystic kidney disease. The authors also speculate that AKAP220 and tubulin acetylation may have clinical relevance for autosomal dominant polycystic disease. However, it remains unclear how increased cilia biogenesis may affect cell or tissue physiology. This work is of interest to cell biologists seeking to understand the biogenesis of the primary cilium, and to others interested in ciliopathies (i.e., disorders of the primary cilium).

      We thank Reviewer 1 for their insightful comments and concur with their assessment that “it remains unclear how increased cilia biogenesis may affect cell or tissue physiology”. This is clearly a topic for further study within the field that will include ourselves and other laboratories.

      Reviewer #2:

      The authors show that AKAP220 knockout in kidney collecting ducts leads to a pronounced increase in primary cilia. They go on to demonstrate that this effect holds true in multiple different preparations, before clearly demonstrating that the PP1 anchoring site is critical for the normal role of AKAP220 in limiting primary cilia formation.

      Although the key overall finding is well supported, I did not find the specific mechanism concerning an AKAP220-PP1-HDAC6 signaling complex/axis sufficiently convincing. The authors propose that AKAP220 interacts with HDAC6 via PP1, and that within the complex HDAC6 is stabilised through phosphorylation. The knock-on effect is efficient deacetylation. Although this complicated mechanism is consistent with the data, three supporting observations towards this specific mechanism come with caveats: (i) in figure 2C, they show an increase in acetyl tubulin by immunoblotting, but the densitometry seems to be the ratio of acetyl tubulin to GAPDH - would it not be more appropriate to reference to total tubulin?

      We are encouraged that this reviewer considers that our “overall findings are well supported”. In response to their comments, we have bolstered our evidence that AKAP220 interacts with HDAC6 via PP1 by including new biochemical and imaging data showing that recruitment of the histone deacetylase is attenuated in kidney cells engineered to express a PP1-binding defective mutant of the anchoring protein. These new data are incorporated into figure 3D and supplemental figures S3D-L.

      The mechanism investigated in this paper is concerned with absolute levels of acetylated tubulin. Since the levels of both control proteins (alpha tubulin and GAPDH) do not change between wildtype and AKAP220KO, we chose to normalize to GAPDH. It is important to note that normalizing to total tubulin does not change the result.

      Reviewer #3:

      The authors had previously generated a mouse line with inactivation of AKAP220, which encodes an A-kinase anchoring protein, and observed defects in their collecting ducts (CD) leading to defects in trafficking of aquaporin 2. While further characterizing the samples, they observed that CD epithelia had increased numbers and length of their primary cilia compared to CD cells of control mice. While some AKAP proteins have been localized to the primary cilium, AKAP220 was not one of them so the authors pursued a systematic series of experiments to determine how AKAP220 has these effects. Using a combination of CRISPR-manipulated renal epithelial cell lines (IMCD cells), drugs/compounds, 3D and organ-on-a chip cell culture systems they present compelling data that show that AKAP220 anchors a complex of HDAC6 and Protein Phosphatase-1 (PP1) that controls the polymerization of actin and thereby affects cilia formation and elongation. Genetic or pharmacologic manipulations that disrupt AKAP220 or its ability to bind to PP1, inhibit HDAC6, or affect actin stability result in a similar phenotype of enhanced ciliogenesis and ciliary length. Given that polycystic kidney disease has been described as a ciliopathy, with the gene products of the two most common forms of the disease (polycystin-1 and polycystin-2) localized to the cilia, they tested whether inhibiting HDAC6 activity might affect cyst growth using a human iPSC organoid system. They found that organoids lacking polycystin-2 treated with tubacin had smaller cyst size compared to vehicle-treated mutants, leading them to propose manipulation of HDAC6 as a tentative therapeutic strategy for human autosomal dominant polycystic kidney disease and for ciliopathies characterized by defects in ciliogenesis.

      Strengths: These findings will be of interest to the ciliary community. They have identified a new factor and its associated partners that appear to regulate ciliogenesis. The studies follow a logical progression and are generally well-done with suitable controls, rigorous quantitation, and a reasonable level of replication (all done at least three times). They have used complementary methods (i.e., genetic manipulation, pharmacologic inhibition) to support their model, sometimes in combination to show that the underlying factor targeted by either genetics or drugs works through the same mechanism.

      Weaknesses: The major weakness of the report is in its attempt to be translational. Here, the report has a number of serious theoretical and experimental limitations. On the theoretical level, the rationale behind using an HDAC6 inhibitor is unclear given their data and their model. On the one hand, a prior study had reported that a non-specific inhibitor of HDACs slowed cyst growth in an orthologous mouse model of ADPKD. The current work could suggest that HDAC6 was the actual target in the prior work and that a specific inhibitor for HDAC6 should confer the same benefits. On the other hand, there are compelling reports that show that genetic inhibition of ciliogenesis actually attenuates cystic disease in orthologous mouse models of human ADPKD. The current paradigm is that preserved ciliary activity in the absence of Polycystin-1 or Polycystin-2 promotes cystic growth. This would suggest that any intervention that boosts ciliary function could actually worsen disease. And while the authors never directly comment on the functional properties of the "mutant" cilia that result from deletion of AKAP220 or inhibition of HDAC6, they imply that these "enhanced" cilia are functional by suggesting the use of HDAC6 inhibitors as therapy for ciliopathies that are the result of defective biogenesis. Their prior work also provides indirect support for the notion that the enhanced cilia are functional. AKAP220 knock-out mice are reported to be generally functional, apparently lacking phenotypes commonly associated with defective cilia structure or function. These contradictory observations suggest one or more of the following conclusions: the "mutant" cilia are in fact poorly functional, the HDAC inhibitors are working through a different mechanism than that which has been proposed, or that the assay as used in this report is not a good read-out of cyst-modulating effects. The last point is particularly relevant for this report. The investigators scored effectiveness of tubacin based on the relative rate of growth of cysts treated with different concentrations of tubacin vs vehicle. In this assay, cyst growth is principally driven by rates of cellular proliferation. Tubacin is an anti-proliferative agent with some toxicity, and while it might be highly selective for HDAC6, these studies cannot distinguish between effects mediated through the AKAP220-HDAC6 pathway versus others. In sum, while tubacin or a similarly-acting drug may or may not be effective for slowing cyst growth, there are multiple reasons to think it isn't through the mechanism the authors propose.

      We are encouraged that reviewer 3 considers “our studies follow a logical progression and are generally well-done with suitable controls, rigorous quantitation, and a reasonable level of replication”. In terms of weaknesses, our reading of the reviewer’s detailed passage has identified two specific points that we can address.

      1) Lesions in cilia and polycystins are linked to Autosomal Dominant Polycystic Kidney Disease (Hughes et al., 1995; Mochizuki et al., 1996). Although there is general agreement on this point, the molecular details remain unclear and are inherently paradoxical. For example, loss of morphologically intact cilia favors a less severe cystic phenotype (Ma et al., 2013). In contrast, other investigators report that loss of intact primary cilia results in renal cystogenesis (Kolb and Nauli, 2008; Lin et al., 2003). How primary cilia can be pro-cystogenic in one context yet anti-cystogenic in another context remains an unsolved paradox for the field. We appreciate the need for further clarification on this point as raised by reviewer 3. This conundrum is now noted in the discussion on page 34, line 3.

      2) Searching for a therapeutic approach to restore functional primary cilia is the rationale behind our concluding studies. However, the complexity of genetic models for ADPKD and the above-mentioned “cilia paradox” limit our ability to accurately predict how pharmacological agents targeting cilia might affect cellular models of cystogenesis. That being said, we realize that HDAC6 inhibitors have been used by other groups to target cyst size (Cebotaru et al., 2016; Yanda et al., 2017). The reviewer is correct in pointing out that the mechanism by which HDAC6 inhibitors act to inhibit cystogenesis could be less than straightforward given the multitude of functions for HDAC6. We have amended the discussion on page 34, line 5 to reflect the reviewer’s valid point.

    1. Author Response:

      Reviewer #1:

      In this paper, the authors study one of the understudied aspects of the evolutionary transition to multicellularity: the evolution of irreversible somatic differentiation of germ cells. Division of labour via functional specialisation of cells to perform different tasks is pervasive across the tree of life. Various studies assume that the differentiation of reproductive cells ("germ-role cells" in this manuscript) into a non-reproducing cell type ("soma-role cells") is irreversible. In reality, the conditions that promote the evolution of this irreversible transition are unclear. Here, the authors set out to fill in this knowledge gap. They model a population of organisms that grow from a single germ-role cell and find the optimal developmental strategy in terms of differentiation probabilities, under different scenarios. Under their model assumptions, they show that irreversible somatic differentiation can evolve when 1) cell differentiation is costly, 2) somatic cells' contribution to growth rate is large, 3) organismal body size is large.

      Overall, I think the authors identified an interesting and neglected aspect of cellular differentiation and division of labour. I enjoyed reading the paper; I thought the writing was clear and the modelling approach was adequate to address the authors' question.

      Thank you for a detailed and constructive review.

      Some aspects that can be improved:

      1) Throughout the manuscript, I was somewhat confused about what system the authors have in mind: a colony with division of labour or a multicellular organism? While their model can potentially capture both, their Introduction and Discussion seem to be geared towards colonies at the transition to multicellularity, whereas the Results section gives the impression that the authors have multicellular organisms in mind (e.g. very large body sizes).

      We are interested in the transition from colonial life, where tasks are distributed in time, to multicellular organisms, where tasks are divided between cells. As such, our model covers these scenarios as two limit cases. In the context of our study, we discuss examples from nature where this transition is observed, e.g. among Volvocales algae. For the purpose of the necessary colony/organism size, we do not need to go further than 2^6 = 64 cells. However, to infer the patterns of the composition effect Fcomp (Fig.3 C,D), we consider organisms that undergo four more rounds of cell division before reproduction, leading to a maturity size of 2^10 = 1024 cells. There, irreversible somatic differentiation can occur at a wide range of differentiation costs (see Fig.4 A). Also, smaller sizes put stronger restrictions on the composition effect Fcomp, so the distribution of parameters presented in Fig.3 C,D, if taken at n=6 instead of n=10, would have far fewer data points, and this could obscure the pattern found in this study. Overall, the scale of about 1000 cells, for which we report most of our modeling results, features entities of very diverse complexity: from undifferentiated colonies (the ocean alga Phaeocystis antarctica), to intermediary life forms (slime mold slugs), to paradigm multicellular organisms (higher Volvocales and C. elegans). We think that the chosen range of organism sizes is adequate for comparing entities with undifferentiated and differentiated cells. In the updated manuscript, we extend the exposition of organism size to reflect this aspect.

      2) From the point of view of someone who works on topics related to cancer and senescence, I think these fields are very much connected to the evolution of multicellularity. Maybe because I had multicellular organisms in mind rather than colonies with division of labour (above), I thought the manuscript missed this connection. Damage accumulation is key to Weismann and Kirkwood's theories of germ-soma divide and disposable soma, respectively, whereas dysregulated differentiation is one of the important aspects of tumour development (e.g. Aktipis et al. 2015). Making these links could also be relevant to discuss some of the model assumptions. For instance, the authors assume that fast growth comes with no cost in terms of cell damage, which may not always be the case (e.g. Ricklefs 2006) and reversibility of somatic differentiation can come at a cost of increased risk of somatic "cheaters" or cancerous cell lines.

      Thank you for this suggestion. Indeed, the aspect of cancer risk was not considered in the initially submitted manuscript. In the updated manuscript, we introduce a model where differentiation is linked to the risk of organism death instead of a delay in development. The results with this model exhibit a very similar pattern, see Fig.5. Hence, the term “cost of differentiation” can be interpreted more broadly than just the cell division delay suggested by our main model.

      3) The authors assume the differentiation strategy (D) does not change over the lifetime (which equates to ontogenesis in their model, i.e. they do not consider mature lifespan). I wonder if this is really the case, or whether organisms/cells can respond to the composition of cells they perceive. For instance, at least in some animal tissues, a small number of stem cells are kept to replenish differentiated tissue cells when needed. I understand that making D plastic can make the model really complicated, but maybe it is worth talking about what strategy would evolve if D was not stable through ontogenesis (and mature lifespan). My initial guess is that if differentiation probabilities can change through life and if one considers cellular damage accumulation, senescence and cancer (as above), the conditions that favour irreversible somatic differentiation would expand.

      Indeed, we assume the differentiation strategy to be constant in our model. We do not know whether this is true at the brink of multicellularity and, for sure, once evolution makes a species complex enough, this assumption will become inadequate. Yet, when we consider a dynamic differentiation strategy, we find a very efficient but unrealistic solution: at the very beginning of a life cycle a germ-role cell gives rise to two soma-role cells, then these soma-role cells produce only soma-role cells and finally, at the very last round of cell division, they give rise to as many germ-role cells as possible. This scenario is the most efficient in terms of the rate of organism development (100% useful soma-role cells during growth), the amount of offspring produced (every cell becomes a germ-role cell at the end), and differentiation costs/risks (differentiation occurs only twice in a lifetime). Still, it is unrealistic. There must be some constraints on the flexibility of the dynamic differentiation strategy. We think that exploring the space of dynamic differentiation strategies and their constraints goes beyond the scope of the current study. Nevertheless, we are very interested in exploring this topic further in subsequent projects.
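
      The idealized dynamic schedule described in this response can be written out as a short sketch. It is only an illustration of the verbal description, not the authors' code; the function name and the (germ, soma) tuple layout are assumptions introduced here.

```python
def idealized_dynamic_schedule(n_divisions):
    """Return (germ, soma) counts after each synchronous division round.

    Illustrative assumption: the organism starts from one germ-role cell,
    converts fully to soma-role cells at the first division, and converts
    fully back to germ-role cells at the last division.
    """
    if n_divisions < 2:
        raise ValueError("need at least two rounds: one to make soma, one to revert")
    counts = []
    for t in range(1, n_divisions + 1):
        size = 2 ** t  # synchronous doubling from a single cell
        if t < n_divisions:
            counts.append((0, size))  # growth phase: every cell is soma-role
        else:
            counts.append((size, 0))  # final round: every cell is germ-role
    return counts
```

      Under this schedule every growth round has a fully somatic body and every cell of the mature organism is germ-role, which is why the response calls it maximally efficient yet unrealistic.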

      Reviewer #2:

      This works seeks to determine the conditions in which simple multicellular groups can evolve irreversibly somatic cells, that is: a replicating cell lineage that provides cooperative benefits as the group grows and cannot de-differentiate into reproductive germ cells.

      This question is addressed with a well-constructed model that is easy to understand and provides intuitive results. Groups are composed of germ and soma cells that replicate synchronously until the group has reached a maximal size. When each type of cell divides, they may have different probabilities of producing daughter cells of each type, and the analysis determines the optimal differentiation probabilities for each type of cell depending on a variety of factors. Critically, irreversible somatic differentiation arises when the optimal probability for soma cells is to produce only soma cells.
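
      To make the set-up concrete, the growth process summarized above can be sketched as a small stochastic simulation. This is a minimal sketch and not the authors' implementation: it assumes that each daughter cell switches type independently with a fixed probability, and the names `grow_group`, `germ_to_soma` and `soma_to_germ` are hypothetical.

```python
import random


def grow_group(n_divisions, germ_to_soma, soma_to_germ, seed=0):
    """Simulate synchronous growth from a single germ-role cell.

    Assumption (not taken from the source): each of the two daughters of a
    dividing cell independently takes the other type with the given
    probability. Returns the final (germ, soma) cell counts.
    """
    rng = random.Random(seed)
    cells = ["germ"]
    for _ in range(n_divisions):
        next_cells = []
        for cell in cells:
            for _ in range(2):  # every cell divides into two daughters
                if cell == "germ":
                    daughter = "soma" if rng.random() < germ_to_soma else "germ"
                else:
                    daughter = "germ" if rng.random() < soma_to_germ else "soma"
                next_cells.append(daughter)
        cells = next_cells
    return cells.count("germ"), cells.count("soma")
```

      In this sketch, irreversible somatic differentiation corresponds to `soma_to_germ = 0`: once a lineage becomes soma-role, it never returns germ-role cells.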

      The elegance of the model means that the predictions are easy to interpret. First, when there is a higher cost for soma cells to produce germ cells, then a dedicated lineage of somatic cells is more favourable. Second, when soma cells produce only soma cells and germ cells can produce both types, the proportion of soma cells in the group will increase with each division. Consequently, for irreversible somatic cells to be optimal, germ cells must produce a small number of soma cells and these few must provide large benefits. Third, larger group sizes are required for a small number of soma cells to arise and provide sufficient benefits to the group.
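
      The second prediction can be checked with a few lines of arithmetic on expected cell counts. The sketch below assumes, for illustration only, that each daughter of a germ-role cell becomes soma-role with a fixed probability `germ_to_soma` and that soma-role cells produce only soma-role cells; the function name is hypothetical.

```python
def expected_soma_fraction(n_divisions, germ_to_soma):
    """Expected soma fraction after each division when soma is irreversible.

    Assumption: every daughter of a germ-role cell independently becomes
    soma-role with probability germ_to_soma; soma-role cells only make soma.
    """
    germ, soma = 1.0, 0.0  # start from a single germ-role cell
    fractions = []
    for _ in range(n_divisions):
        new_germ = 2.0 * germ * (1.0 - germ_to_soma)
        new_soma = 2.0 * soma + 2.0 * germ * germ_to_soma
        germ, soma = new_germ, new_soma
        fractions.append(soma / (germ + soma))
    return fractions
```

      Under these assumptions the expected soma fraction after t divisions is 1 - (1 - p)^t, where p is `germ_to_soma`, so it rises with every division for any p > 0. A small p keeps soma rare for longer, which is consistent with the reviewer's point that a few soma cells must then deliver large benefits.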

      Inevitably, there is a trade-off between the benefits of a simple model and the costs of idealised assumptions.

      Among other assumptions, the model assumes that germ cells and soma cells replicate synchronously and at the same rate, and that soma cells provide benefits throughout the growth of the group, but do not increase the fecundity of germ cells in the last generation. Consequently, it is not clear to what extent the predictions of the model apply to the notable empirical cases where these assumptions do not hold. For instance, in the often-cited Volvocine algae, soma cells do not provide any benefits until the last generation of the group life cycle. This may help to explain why many Volvocine species have a very large number of somatic cells, counter to the second prediction of the model.

      Overall, this analysis is targeted and provides clear predictions within the bounds of its assumptions. Thus, these results provide a compelling framework or stepping-stone against which future models of germ-soma differentiation in alternate scenarios can be compared and evaluated.

      Thank you for the kind words and the well-thought-out review. Indeed, our model makes a number of simplifying assumptions. In the revised manuscript, we consider a model in which the strongest of our simplifications, that of simultaneous cell divisions, is violated. This asynchronous cell division model shows that irreversible differentiation may evolve, at least under asymmetric differentiation costs. However, its evolution is observed less often than in the synchronous model.

      We absolutely agree that the design of our model does not replicate the details of Volvocine life cycles. However, our work is not intended to be a model of germ-soma differentiation in Volvocales. Instead, we developed a simplistic model implementing features from a diverse range of organisms. While in higher Volvocales young colonies develop within a maternal organism, there is a wide range of colonial organisms that grow from an independently living single cell, e.g. colonial diatoms, the haptophyte Phaeocystis antarctica, and the amoebozoan Phalansterium. We agree that protection by the maternal organism should play a major role in Volvocales, and we look forward to investigating a follow-up model that takes this factor into account.

      Reviewer #3:

      This paper provides a theoretical investigation of the evolution of somatic differentiation. While many studies have considered this broad topic, far fewer have specifically modelled the evolutionary dynamics of the reversibility of somatic differentiation. Within this subset, the conditions that select for irreversible somatic differentiation have appeared conspicuously restrictive. This paper suggests that an overly simplified fitness function (mapping the soma-germline composition of an organism to its growth rate) may be partly to blame. By allowing for a more complex fitness function (that captures the effect of upper and lower bounds for the contribution of somatic cells to organism fitness) the authors are able to identify three conditions for the evolution of irreversible somatic differentiation: costly cell differentiation (particularly for the redifferentiation of soma-cell lineages to germ line); a high/near maximal organismal growth advantage imbued by a small proportion of soma cells; a large maturity size for the organism (typically greater than 64 cells).

      The model presented is simple and elegant, and succeeds in its aim of providing biologically feasible conditions for the evolution of irreversible somatic differentiation. Although the observation arising from the first condition (that high costs to reversible somatic differentiation promote the evolution of irreversible somatic differentiation) is perhaps unsurprising, the remaining conditions on the fitness function and the organism maturity size are interesting and initially non-obvious. Particularly tantalising is the prospect of testing these conditions, either against available empirical data, or in an experimental setting.

      The model does however make a number of simplifying assumptions, the effects of which may limit the broad applicability of the results.

      The first is to assume that cell division is synchronous, so that the costs of cell differentiation can be straight-forwardly averaged across the organism at each division. While the authors present a convincing biological justification for this assumption for algae such as Eudorina illinoiensis and Pleodorina californica, it is not immediately clear that this assumption should hold more widely.

      The second is to assume that the development strategy (i.e. the rates of differentiation between somatic and germ-line cell types) is constant throughout the organism's growth. For instance, there may be a growth advantage in the current model (aside from the advantages with respect to reduced mutation accumulation) of producing more germ cells early in the developmental programme, before transitioning to producing more soma cells in later development.

      Exploring such extensions to this model presents a seam of potential avenues for investigation in future theoretical studies.

      Thank you for the kind assessment of our findings. In the updated manuscript, we additionally investigated a model with asynchronous cell divisions. However, due to computational limitations, we are unable to fully replicate the investigation protocol of the original synchronous model. The execution time of the synchronous model scales linearly with the number of generations (n), and it still takes about a week to compute a single map like Fig.2A on a 2000-node cluster. The asynchronous model, in turn, scales linearly with the number of cell divisions and hence exponentially with the number of generations (as 2^n), which results in much longer calculations. For instance, the map in Fig.2A requires about 160 times more computer time with the asynchronous model. Nevertheless, we were able to implement this model for smaller organisms, with less extensive statistics. There, we found that the asynchronous model allows irreversible somatic differentiation to evolve. However, it is suppressed compared with the synchronous model: the fraction of Fcomp profiles promoting irreversible differentiation is much smaller and the organism size restriction is higher.
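
      The scaling argument can be illustrated by counting update steps. The sketch below is a back-of-the-envelope count, not a model of the authors' run times; the function names are hypothetical, and it assumes one update per generation in the synchronous model and one update per individual cell division in the asynchronous model.

```python
def synchronous_updates(n_generations):
    """One update per round of simultaneous divisions."""
    return n_generations


def asynchronous_updates(n_generations):
    """One update per individual division.

    Growing from 1 cell to 2**n cells takes 2**n - 1 divisions,
    because each division adds exactly one cell.
    """
    return 2 ** n_generations - 1


# Compare the two counts for a few organism sizes.
for n in (6, 8, 10):
    print(n, synchronous_updates(n), asynchronous_updates(n))
```

      These counts show only the linear versus exponential trend. The cost per update differs between the two models, so the ratio of counts is not expected to reproduce the factor of about 160 quoted for Fig.2A.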

      Studying a dynamic differentiation strategy would be wonderful. Early on, we considered studying this scenario. The crucial factor here is how flexible the strategy can be. In a naïve situation with complete flexibility between every cell generation, the most successful strategy would be for all cells of an organism to first turn completely into soma-role cells to gain the maximal benefits, and then, at the last step, to convert back to germ-role cells to produce the maximal number of offspring. This is not observed in natural species; hence the flexibility of the dynamic differentiation program must be constrained. We are curious to study what kind of constraints can lead to irreversible soma, but this task is beyond the scope of the current study. Our work with a constant differentiation program is the beginning of a future line of research. We are already looking forward to exploring the space of dynamic differentiation programs in later projects.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We are grateful to the reviewers for their thoughtful comments and propose the experiments and clarifications listed below for a revised manuscript.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors use a combination of Dsn1-Flag kinetochore purification from yeast extracts and laser trapping experiments (as in a number of previous studies), to study the effect of Mps1-dependent phosphorylation on reconstituted kinetochore-microtubule attachments in vitro. They complement this analysis with genetic experiments characterizing the effects of non-Mps1 phosphorylatable mutants on checkpoint activity and chromosome segregation in yeast.

      The authors had previously shown that Mps1 is the major kinase activity that copurifies with Dsn1-Flag in their purification scheme. They now investigate the effect of adding ATP and thereby allowing Mps1 phosphorylation in the reconstituted system. They show that addition of ATP decreases the rupture force of kinetochore-microtubule attachments, meaning it weakens the strength of the attachment. This effect can be negated either by inhibiting Mps1 with reversine, or by providing kinetochores in which the Mps1 phosphorylation sites on Ndc80 (most of them in the N-terminal tail) have been mutated to alanine. Thus, like the activity of Ipl1, Mps1 phosphorylation of the Ndc80 N-tail (which is known to be important for full MT affinity) weakens kinetochore-microtubule attachments.

      Cellular experiments demonstrate that non-Mps1 phosphorylatable Ndc80 14-A mutants have a functional mitotic checkpoint (contrary to previous claims by Kemmler et al., 2009), but show synthetic sickness with stu2 alleles that are involved in error correction.

      **Major points:**

      Within the framework of this experimental setting, the study as presented is logical and clear. The conclusions regarding the effect of Mps1 in this reconstituted system are overall well supported by the data. I have a couple of major and some minor points that can further improve data interpretation and should therefore be considered:

      1. In previous publications (e.g. Gutierrez et al., Current Biology 2020), the authors have reported that the Dam1 complex, an established Mps1 substrate, is required for full attachment strength in this system. Are the effects of Mps1-dependent Ndc80 phosphorylation and Dam1 independent from one another? For example would dad1-1 or non Cdk1 phosphorylatable Dam1 complex further reduce the rupture force in ATP? Or does Mps1 phosphorylation affect, for example, Dam1 binding to Ndc80?

      Response: To better understand the effects of ATP treatment, we analyzed the levels of Dam1 on the kinetochores after ATP treatment and did not see any change. We will add this data to a supplemental figure. Dam1 clearly makes a major contribution to the strength of the kinetochores because their strength even after ATP-treatment is higher than the rupture force of kinetochores purified from a dad1-1 mutant strain. However, as we report in the paper, blocking the eight Mps1 target sites in the tail of Ndc80 was sufficient to block the effect of ATP, so it is unlikely that phosphorylation of the Dam1 complex by Mps1 makes a major contribution to the ATP-dependent kinetochore weakening in vitro. We think Dam1 phosphorylation by Aurora B probably contributes independently to error correction, because the dam1-3D mutant, carrying phospho-mimetic substitutions in three Aurora B sites, is synthetically lethal when combined with the ndc80-8D phospho-mimetic mutant in eight Mps1 sites. We will add this genetic interaction data to the revised manuscript to provide additional information about the pathways.

      What is the effect of ATP on initial binding events? Are there differences in the fraction of beads that spontaneously attach laterally at the start of the experiment? This may make it possible to conclude whether any kind of binding, or specifically force-generating end-on attachments, are affected by ATP.

      Response: We did measure a reduction in the fraction of free kinetochore-decorated beads capable of binding microtubules upon exposure to ATP (from 20% binding in the absence of adenosine to 11% in the presence of ATP). This observation suggests that the microtubule-binding activity of the kinetochores, like their rupture strength, is reduced upon exposure to ATP, as reported in the methods, in the "rupture force measurements" section. However, because we worked with a low density of kinetochores on the beads, the initial number of beads that spontaneously attached was quite low, and free beads capable of binding to microtubules were relatively rare. In addition, when we find a bead already attached to the lattice, we cannot distinguish whether it bound initially to the lattice or instead bound to a tip that then grew beyond the bead. For these reasons, we feel it would be very difficult using our current approach to draw statistically significant conclusions about whether there were ATP-dependent changes in the relative affinities of the kinetochores for lateral versus tip attachments.

      Ndc80-8D has low attachment strength, consistent with lowered MT affinity of the phospho-mimetic Ndc80 tail. Interestingly, Supplementary Figure S4B shows that the amount of Cse4 in the pull-down western appears substantially reduced in 8D vs 8A or wt. Is the amount of co-purified inner kinetochore affected in this mutant? This may be an alternative explanation for decreased attachment strength, for example if the fraction of "full" or "complete" kinetochores may be reduced. Could this also happen upon inclusion of ATP?

      Response: The reviewer is correct that the level of Cse4 and other inner kinetochore components is slightly reduced in the Ndc80-8D kinetochores, for reasons that are not clear to us. However, the incubation of wild type kinetochores with ATP does not affect the levels of these proteins, suggesting that the weakened rupture strength is not due to reduced levels of these inner kinetochore proteins. We will add the data showing that ATP does not affect levels of inner kinetochore proteins into a supplemental figure to clarify this point.

      **Minor points:**

      page 13 (heading): "Weakening occurs via phosphorylation...". Probably good to mention what is weakened ("Weakening of kinetochore-microtubule attachments occurs via phosphorylation...").

      Response: We will alter the heading as suggested.

      page 14/Figure5C: Median Rupture Force for Ndc80-8D is 4.8 pN according to the text. In the graph it looks like >5 pN.

      Response: We thank the reviewer for noticing this mistake and will correct the median rupture force to 5.6 pN.

      page 23: comma missing between T21 S37 and T47 (should be T21, S37 and T47)

      Response: We thank the reviewer for noticing this omission and will correct it.

      page 24/25: different spelling of G1 (sometimes with subscript)

      Response: We thank the reviewer for noticing this inconsistency and will correct all to be G1.

      page 24/25: ug instead of µg

      Response: Thanks. We will fix this mistake.

      page 28: Figure 5B instead of Figure 5A

      Response: Thanks for noticing this mistake. We will correct this.

      Figure 6A: Lambda-Phosphatase treatment for 20 minutes according to figure legend and 30 minutes according to Material and Methods section.

      Response: The material and methods section specified a 20-minute incubation with phosphatase, in agreement with the figure legend. We believe the reviewer might have accidentally confused the time value with the temperature, which was 30 degrees.

      Figure 6E: One should not draw any conclusions from the anti-phospho T47 blot here, the quality is simply too poor to allow a statement regarding an mps1-1 effect

      Response: While the immunoblots with the T74 phospho-specific antibody are not as clean as many standard antibodies, we have reproduced the results multiple times and therefore feel comfortable concluding that there is a decrease in signal that is Mps1-dependent.

      Figure 6: Labelling T47P misleading (Proline substitution?, use pT47 instead)

      Response: We will change the labeling on this figure, as suggested, from T74P to pT74. To be consistent, we will also change this nomenclature in the text.

      Figure 6F: Make clear in the labelling that a stu2-AID background is used here, makes it easier to understand why Auxin is used here.

      Response: We will change the labeling, as suggested, to include the genotype of stu2-AID in the figure.

      how specific is reversine for yeast Mps1? I have not seen any data on this in previous publications.

      Response: Reversine is not necessarily specific for Mps1. However, the only kinase activity that co-purifies with the isolated kinetochores is from Mps1, so reversine should inhibit only Mps1 in our in vitro experiments. Nevertheless, to further address this concern, we will include optical trapping results using mps1-1 mutant kinetochores in the revised manuscript. We have already performed these additional experiments and found that mps1-1 kinetochores do not undergo ATP-dependent weakening, strongly reinforcing our conclusion that Mps1 is the major kinase involved.

      additional genetic interactions might be informative, if Ndc80-8D has weakened attachments, it may have synthetic effects with other mutants (dam1?), conversely, ndc80-8A may show genetic interactions with ipl1 alleles, for example.

      Response: We agree that the ndc80 phospho-mutant alleles might have genetic interactions with other mutants. Consistent with this prediction, we have found that ndc80-8D is synthetically lethal when combined with the dam1-3D mutant in three Ipl1 sites. As mentioned above, we will add this data into the revised text. We will also perform additional genetic interaction experiments with ipl1 and mps1 alleles and add any additional interactions we discover into the revised text.

      Reviewer #1 (Significance (Required)):

      The study adds to the characterization of the effects of Mps1 kinase on kinetochore-microtubule attachments and characterizes the cellular phenotypes of non-Mps1 phosphorylatable Ndc80 mutants. The major conceptual point that Mps1 phosphorylation can weaken kinetochore-microtubule interactions and thereby contributes to error correction in a manner similar to Ipl1 has previously been made in the literature. Maure et al., (Tanaka lab, 2007, Current Biology) have characterized the effects of mps1 mutant alleles on biorientation of authentic chromosomes and on replicated/unreplicated mini-chromosomes. In particular the experiments with unreplicated mini-chromosomes have revealed less frequent detachment in mps1 mutants, demonstrating that Mps1 activity is required to release attachments that are not under tension.

      Another benefit of this study is that it puts the Kemmler 2009 EMBO J. paper into perspective and corrects some of its claims, in particular the notion of sustained checkpoint activation in the Mps1 phospho-mimetic Ndc80-14D mutant, whose lethality was claimed to be rescued by checkpoint deletion. It is confirmed here that the allele is lethal but that its lethality cannot be alleviated by simultaneous checkpoint deletion. Conversely, the Ndc80-14A mutant is shown to have a functional checkpoint. One could argue that since the publication of the Kemmler paper, the idea of a requirement for Mps1 phosphorylation on Ndc80 for checkpoint activity has not gained any traction in the field, but it's still useful for the field to put some of these earlier claims into perspective. The paper will therefore be interesting to researchers working on mechanisms of chromosome segregation and error correction.

      From my background I cannot comment on technical details of the biophysical force spectroscopy experiments (laser trapping), but I have no reason to doubt that the authors accurately report their findings.

      Response: We sincerely thank the reviewer for their careful reading, helpful comments, and enthusiasm for our manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This paper focusses on the mechanisms underlying chromosome biorientation in mitosis, an essential process that warrants equal chromosome segregation to the dividing cells. Correction of improper kinetochore-microtubule attachments relies on two conserved protein kinases, Aurora B and Mps1, that detach kinetochores that are not under tension in order to provide them with a second opportunity to establish bipolar connections. In vivo, Aurora B and Mps1 have intertwined functions and share some common targets. For this reason, despite the large body of literature on the subject, their precise roles in chromosome biorientation have been difficult to tease apart.

      The authors take advantage of an in vitro reconstitution assay that they previously published (Akiyoshi et al., 2010) to identify the critical target(s) of Mps1 in weakening kinetochore-microtubule connections. The assay uses kinetochore particles purified from budding yeast cells that bear Mps1 but are notably deprived of Aurora B. Upon addition of ATP to activate the co-purified kinases (e.g. Mps1), kinetochores are added to coverslip-anchored microtubules to which they attach laterally. Through a laser trap, kinetochores are brought to the microtubule plus-end and pulled with increasing force until the kinetochore detaches, which allows measurements of the average rupture forces that reflect the strength of the attachments. The approach is straightforward and potentially very powerful, first because it provides a simplified experimental set-up in comparison to the cellular context, and second because it directly measures the impact of protein phosphorylation on the strength of attachments.

      The authors convincingly show that Mps1-dependent phosphorylation of the N-terminal part of Ndc80 significantly weakens the strength of kinetochore-microtubule attachments in vitro, while phosphorylation of other known Mps1 targets, such as Spc105, does not seem to have an effect. Eight phosphorylation sites in Ndc80, which were previously identified as Mps1-dependent phosphorylation sites (Kemmler et al., 2009), are shown to be critical to destabilise kinetochore-microtubule attachments in the in vitro reconstitution assays. The authors also present evidence for a moderate involvement of Ndc80 phosphorylation by Mps1 in correcting improper attachments in vivo, suggesting that additional mechanisms are physiologically relevant for error correction.

      The experiments are mostly well designed, the data are solid and support the main conclusions. However, in my opinion, additional experiments could be performed, as outlined below, to strengthen the physiological relevance of the main findings and corroborate some of the conclusions.

      **Major points:**

      1. Given the partially overlapping function of Mps1 and Ipl1 (Aurora B) in error correction, the ndc80-8A mutant should display synthetic growth and chromosome mis-segregation defects with ipl1 temperature-sensitive alleles. Conversely, the ndc80-8D mutant should suppress the lethality at high temperatures of mps1-3 mutant cells, which were recently shown to be defective in chromosome biorientation (Benzi et al., 2020). Finally, chromosome mono-orientation could become apparent in ndc80-8A cells upon a transient treatment with microtubule-depolymerising drugs, which should amplify the cellular need for error correction.

      Response: We agree that further exploration of the possible genetic interactions might help to reinforce the physiological relevance of our main findings. Toward this goal, we will obtain the mps1-3 mutant to determine whether ndc80-8D can suppress its lethality and will add this to the revised manuscript if there is a positive result. As mentioned in response to Reviewer 1, we will add a synthetic lethal interaction between ndc80-8D and a dam1-3D mutant where the Aurora B sites are altered to the revised text. We will also perform additional genetic interactions with ipl1 and mps1 mutants and add any we find into the revision. As requested, we will perform a nocodazole wash out experiment, to determine if ndc80-8A cells show a defect in error correction and add this data to the revision if there is a defect.

      The authors show that Mps1-dependent phosphorylation of Ndc80 is not involved in the spindle assembly checkpoint, a conclusion that contradicts a previous report (Kemmler et al., 2009). They also find, in contrast with the same report, that the lethal phenotype of the ndc80-14D phospho-mimetic mutant cannot be rescued by disabling the spindle checkpoint. In my opinion, Kemmler et al. convincingly showed, through a number of different experimental approaches, that ndc80-14D cells die because of spindle checkpoint hyperactivation. Not only deletion of checkpoint genes was shown to rescue the lethality, but re-introduction of a wild type copy of the deleted checkpoint gene reinstated lethality. Thus, the explanation invoked here that spontaneous suppressing mutations could underlie the viability of ndc80-14D SAC-deficient mutants is not consistent with the published observations. A thorough examination by the authors of the phenotype of ndc80-14D cells in their hands should be carried out to support these conflicting conclusions. If authors find that ndc80-14D cells actually die because of chromosome mono-orientation, then this would highlight an important function for some or all the six additional phosphorylation sites, relative to the ndc80-8D mutant, for chromosome biorientation in vivo.

      Response: We were unable to reproduce the data that deletion of the spindle checkpoint suppresses lethality of the ndc80-14D mutant, so it remains unclear why our results differ from those of the Kemmler paper. However, we note that re-introducing a wild-type checkpoint gene via transformation and restoring lethality to the ndc80-14D cells does not necessarily mean there were no suppressors. While that is one possible interpretation, another possibility is that there was a suppressor mutation in the viable ndc80-14D cells that also required the lack of the checkpoint to live. Kemmler and co-workers selected for viability on FOA media and never backcrossed those viable strains to show that they could regenerate the double mutant through a cross with the expected segregation pattern of two mutations, which would have been a more rigorous demonstration that the viability was specifically due to ndc80-14D and the checkpoint mutation. Instead, they transformed a wild-type copy of the checkpoint gene back into the strain that was selected for growth on FOA and showed that it reverted the phenotype. This approach cannot rule out a suppressor mutation that fails to suppress in the presence of an active checkpoint. Therefore, in our opinion, the Kemmler paper does not make an entirely convincing case that the ndc80-14D cells die because of spindle checkpoint hyperactivation.

      To further analyze the phenotype of ndc80-14D cells, we have constructed an Ndc80-AID ndc80-14D strain and added auxin, to deplete the wild-type copy of Ndc80. In agreement with the findings of Kemmler et al., this did trigger the spindle assembly checkpoint. However, when we made an Ndc80-AID ndc80-14D mad2 strain and analyzed segregation, we found that chromosome 8 missegregated in 28% of the cells compared to 2% of control cells. This observation suggests that there is a kinetochore defect in these cells that may have triggered the checkpoint and is inconsistent with the mutant solely activating the checkpoint in the absence of any other kinetochore defect. In addition, the levels of Ndc80-14D as well as Mps1 were altered on the mutant kinetochores. The combination of these defects strongly suggests that the ndc80-14D mutant alters kinetochore function in addition to leading to constitutive checkpoint signaling. Because our manuscript is mainly focused on phosphorylation of the Mps1 target sites within the N-terminal tail, we do not plan to add this data involving many additional sites, including Ipl1 target sites and sites on the CH domains of Ndc80, into the current manuscript. We will further pursue the other phosphorylation sites in the future.

      The conclusion that Spc105 phosphorylation by Mps1 is not required for the Mps1-mediated weakening of kinetochore attachments in vitro is based on the comparison between kinetochore particles bearing wild type, untagged Spc105 and particles bearing non-phosphorylatable Spc105-6A tagged at the C-terminus with twelve myc epitopes. Thus, the presence of the tag could obliterate the effects of the mutations in the phosphorylation sites by destabilising kinetochore-microtubule attachments in the presence of ATP. Consistent with this conclusion, Spc105-6A-12myc-bearing kinetochores withstand lower rupture forces than Spc105-bearing kinetochores upon ATP addition. Furthermore, Spc105-6A-12myc kinetochore particles show an interacting protein at a molecular weight above 150 kDa that is not present in wild type particles (Fig. S2A), suggesting that either the tag or the mutations might affect kinetochore composition. Thus, this set of experiments should be repeated using Spc105-6A kinetochore particles lacking the tag.

      Response: If we understand correctly, the reviewer is suggesting that the myc tag on Spc105-6A could cause an ATP-dependent effect on kinetochore strength. While this is formally possible, it seems highly unlikely to us, for two reasons: First, a myc tag is not expected to bind nucleotides, and while it can sometimes have a general effect on protein stability or interfere with protein-protein interactions, we are not aware of any evidence for a myc tag directly causing an ATP-dependent effect in vitro. Second, when we measured Spc105-6A kinetochores in control experiments, without adenosine or with ADP, their rupture strengths were high like wild-type kinetochores. The strength of ADP-treated Spc105-6A kinetochores (8.7 pN), for example, was statistically indistinguishable from that of ADP-treated wild-type kinetochores (8.7 pN, p = 0.27 based on a log-rank test). The wild-type-like behavior of untreated and mock-treated Spc105-6A kinetochores indicates that their composition is not affected in a manner that significantly impacts kinetochore-microtubule strength.

      In general, it would have been informative to complement the data presented here with a mass spec analysis of the composition of kinetochore particles, at least for the experiments that are most relevant to the conclusions. For instance, the composition of the Ndc80-8A kinetochore particles is assumed to be similar to that of wild type kinetochores based on gel silver staining (Fig. S4A; note also that ndc80-8A particles are compared to ndc80-8D particles and not to wild type particles). However, the authors previously showed that kinetochore particles purified from dad1-1 mutant cells (affecting the Dam1 complex) have an apparently identical composition to particles purified from wild type cells by silver staining, yet they display significantly lower resistance to the rupture strength in vitro (Akiyoshi et al., 2010). What is the status of the Dam1 complex (or other kinetochore subunits) in kinetochores purified from ndc80-8A/-8D or spc105-6A cells relative to wild type kinetochore particles?

      Response: We agree that further characterization of the kinetochore particle composition would be valuable and propose to further analyze the composition by purifying wild-type, Ndc80-8A, Ndc80-8D and Spc105-6A kinetochores and performing immunoblotting against the Dam1 complex. In addition, we will analyze the Ndc80-8A and Ndc80-8D kinetochores by mass spectrometry and report a qualitative analysis of the relative amounts of each kinetochore subcomplex in the revised manuscript supplementary data.

      **Minor comment:**

      I believe that the right reference for the sentence in the Discussion "If Aurora B is defective, for example, the opposing phosphatase PP1 prematurely localizes to kinetochores" is Liu et al. 2010.

      Response: We had cited the reference showing this effect in yeast, since our work was performed in yeast. We will also add the Liu et al paper, which showed the same result in human cells.

      Reviewer #2 (Significance (Required)):

      Although the experiments are well designed and the conclusions are mainly supported by the data, the question arises as to what extent the in vitro assays recapitulate, at least partly, what happens in vivo. An emblematic example is the involvement of Spc105 in the error correction pathway. The Biggins lab previously showed that Spc105 phosphorylation by Mps1 and subsequent Bub1 recruitment is not only essential for the spindle assembly checkpoint, but is also crucial for chromosome segregation in vivo, as shown by slow-growth phenotype and aneuploidy of the spc105-6A non-phosphorylatable mutant (London et al., 2012). Additionally, a recent paper showed that Spc105 is a crucial Mps1 target in chromosome biorientation (Benzi et al., 2020).

      In sharp contrast, the ndc80-8A mutant, which in vitro completely erases the ability of Mps1 to destabilise kinetochore-microtubule attachments, displays no growth defects in otherwise wild type cells and only modestly enhances chromosome mis-segregation in a mutant affecting an intrinsic correction pathway (stu2ccΔ). The N-terminal part of Ndc80 (aa 1-116) containing the aforementioned eight phosphorylation sites can even be deleted altogether without any consequence on cell viability (Kemmler et al., 2009). Thus, although the in vitro assays presented here produced clear-cut and reproducible results, their physiological relevance in vivo remains unclear.

      This criticism aside, the manuscript has several merits outlined above and will be of interest to people working in the fields of chromosome segregation, kinetochore assembly, spindle assembly checkpoint, etc.

      Expertise of this reviewer: mitosis and related checkpoints

      Response: We are grateful to the reviewer for carefully reading our manuscript and detailing their concerns. We agree that it can be challenging to establish the physiological relevance of experiments performed in vitro. However, our in vitro approach allowed the effects of Mps1 specifically on kinetochore-microtubule attachment strength to be disentangled from its numerous other effects in vivo. In our view, the relatively mild phenotypes associated with mutants in the Mps1 phosphorylation sites on the Ndc80 tail are consistent with similarly mild phenotypes of mutants in the Aurora B phosphorylation sites on the Ndc80 tail. In both cases, this appears to be due to additional error correction pathways that compensate in vivo.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Sarangapani, Koch, Nelson et al. applied a combination of in vitro biophysical assays with purified kinetochore particles and in vivo analyses to investigate the contribution of Mps1 kinase to kinetochore-microtubule (KT-MT) attachment stability and error correction.

      The manuscript is well written and the authors nicely highlight the facts that 1) the focus of the field has long been on the contribution of Aurora kinases (Ipl1 in budding yeast) to attachment stability and error correction, and 2) it has been difficult to assess the relative contributions of Aurora versus Mps1 kinases in cell-based experiments. The authors note that their KT particle assay is uniquely positioned to address this gap in our understanding and to specifically isolate the contribution of Mps1 to attachment stability in vitro. The findings are well-presented and quite convincing although I have several comments that should be addressed to strengthen the central conclusion that this work has isolated the contribution of Mps1 in their assays.

      **Major points:**

      1) I think it is important to note that reversine is not specific for Mps1 kinase - although it is typically presented as such in the field. It was initially identified as an Aurora kinase inhibitor (IC50: ~25 nM (Aurora B) - 900 nM (Aurora A)) that turned out to be an even more potent Mps1 inhibitor (IC50 ~6 nM). I have concerns that the in vitro assays were done with 5 µM reversine - a concentration so high that it could certainly inhibit any Ipl1 that is present (see comment 3 below) and possibly even inhibit Bub1 activity as Santaguida et al. (JCB, 2010) measured an IC50 >1 µM for Bub1 inhibition. It is important to complement/confirm the chemical inhibitor experiment by repeating the rupture assays +/- ATP in KT particles purified from the mps1-1 strain (shown in Figure 6).
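      The reviewer's concentration argument can be made concrete with a rough calculation. The sketch below assumes a simple one-site inhibition model (Hill coefficient of 1) and assumes that the IC50 values quoted above carry over unchanged to the assay conditions; neither assumption comes from the manuscript, and measured IC50 values depend on ATP concentration and on the kinase ortholog. The function name and the dictionary are illustrative only.

```python
def fractional_inhibition(conc_nM: float, ic50_nM: float, hill: float = 1.0) -> float:
    """Fraction of kinase activity inhibited under a simple Hill model.

    Assumption: inhibition = C^h / (C^h + IC50^h). This is a textbook
    approximation, not the analysis used in the manuscript.
    """
    if conc_nM < 0 or ic50_nM <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    return conc_nM**hill / (conc_nM**hill + ic50_nM**hill)


REVERSINE_NM = 5000.0  # 5 uM reversine, as quoted by the reviewer

# IC50 values as quoted in the reviewer's comment (nM).
# The Bub1 entry is a lower bound on the IC50 (reported as > 1 uM),
# so the computed inhibition for Bub1 is an upper bound.
QUOTED_IC50_NM = {
    "Mps1": 6.0,
    "Aurora B": 25.0,
    "Aurora A": 900.0,
    "Bub1 (IC50 lower bound)": 1000.0,
}

for kinase, ic50 in QUOTED_IC50_NM.items():
    frac = fractional_inhibition(REVERSINE_NM, ic50)
    print(f"{kinase}: {frac:.1%} inhibited at 5 uM")
```

      Under these assumptions, 5 µM sits far above the quoted IC50 for both Mps1 and Aurora B, so the inhibitor alone would not distinguish between them; this is why the genetic control with mps1-1 kinetochores is the more decisive experiment.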

      Response: We agree that reversine is not necessarily specific for Mps1 and this concern was also brought up by Reviewer 1. Because Mps1 is the only kinase activity that co-purifies with the isolated kinetochore particles, we expect reversine to inhibit only Mps1 in our in vitro assays. However, to further address this point, we will add rupture force assays using kinetochores purified from mps1-1 mutant cells to the revised manuscript. We have already performed these experiments and they confirm that kinetochores lacking Mps1 do not undergo ATP-dependent weakening. We did not put this data into the original submission because the experiment needs to be performed differently due to altered Dam1 levels. But we will clarify the changes in the materials and methods and add the data to a supplementary figure.

      2) If the ATP-mediated reduction on rupture force is lost in the mps1-1 KT particles, which will also lack Bub1 kinase, then preserving the ATP-dependent reduction in rupture force from KT particles purified from the Bub1delta mutant strain would be strong evidence that the contribution of Mps1 kinase has been disentangled from other kinases in this assay.

      Response: Although Mps1 recruits Bub1, we think it is unlikely that we are assaying Bub1 kinase activity in our in vitro experiments. We cannot detect Bub1 activity on the purified kinetochores using a sensitive radioactive kinase assay (London et al, Curr Bio 2011), and the levels of Bub1 in our kinetochore purifications are very low (for example, see Akiyoshi et al, Nature, 2010). However, we agree with the reviewer that this caveat should be mentioned and will add this point to the revised text for clarity.

      3) Recent work has shown that Sli15-Ipl1 interacts with and is recruited to KTs by the COMA complex (Rodriguez et al., Curr Biol, 2019 and Fischbock-Halwachs et al., eLife 2019) and that this population of Ipl1 is important for accurate chromosome segregation as also shown 10 years prior by Knockleby and Vogel (Cell Cycle, 2009). I realize that this group previously showed (London et al., Curr Biol, 2012) that phosphorylation of KT particles was not affected when purified from the ipl1-321 mutants, but in light of the recent findings how sure are the authors that there is not any Sli15-Ipl1 in the preparations? I think commenting on this would be worthwhile.

      Response: We have not detected Ipl1 or Sli15 in the numerous mass spectrometry experiments we have performed on the kinetochore purifications. In addition, we have been separately assaying the effects of Ipl1 phosphorylation on kinetochores for another project (de Regt, https://doi.org/10.1101/415992), which independently confirmed that the only detectable kinase activity in our kinetochore purifications is Mps1. We will add this additional reference to the manuscript.

      4) Since the interplay between Mps1 and Aurora B is central to this story, the authors should expand upon the sentence on page 5 reading "While there is some evidence that Mps1 regulates Aurora B activity (Jelluma et al., 2010; Saurin et al., 2011; Tighe et al., 2008), significant data suggests it has an independent role in error correction and acts downstream of Aurora B (Hewitt et al., 2010; Maciejowski et al., 2010; Maure et al., 2007; Meyer et al., 2013; Santaguida et al., 2010)." I am not entirely convinced that the in vivo experiments presented here differentiate as to whether Mps1 is upstream from Ipl1 or whether they are acting independently. For example, phosphorylation of T74 looks to be completely lost in figure 6E (although it's difficult to tell since the blot for T74P is very smeary). If they are acting independently in error correction then Ipl1 should still be able to phosphorylate T74 in this condition. However, if the P-T74 really is lost completely in the mcd1-1 cells then this suggests to me that Ipl1 is downstream of Mps1 in this live cell error correction assay.

      Response: We thank the reviewer for bringing this to our attention. We did not mean to imply that Mps1 is downstream from Aurora B in budding yeast and were intending only to summarize findings from the literature regarding other organisms. We will revise this section of the text to make that point clearer, and we agree that the order of events remains unresolved. In addition, we will note that Mps1 does not eliminate the phosphorylation detected by the T74 antibody in the revision, to avoid misconceptions about the order of events.

      **Other points:**

      1) On p.8 "a median strength of 7.5 pN, similar to untreated and ADP-treated kinetochores". "Similar" is vague, so I'm curious as to whether there is a statistically significant difference between this and the 9.8 pN and 8.7 pN measured in the other conditions. If so this could be explained by partial dephosphorylation with the phosphatase.

      Response: The quoted phrase refers to the 7.5-pN strength measured when λ-phosphatase was included together with ATP (data from Fig. 1D and Supp. Fig. S1B). P-values computed from comparisons of survival plots using the log-rank test show that this strength was not significantly different from the ADP-treated wild-type (8.7 pN, p = 0.06), nor was it significantly different from the ADP- and MnCl2-treated wild-type (8.1 pN, p = 0.35). However, it was barely significantly different from MnCl2-treated wild-type (8.6 pN, p = 0.03), and it was more significantly different from untreated wild-type (9.8 pN, p = 0.0007). With the revised manuscript, we will include a supplemental table with p-values computed from log-rank tests for all the key statistical comparisons, including those mentioned here.
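      For readers unfamiliar with how a log-rank test applies to rupture-force data, the following is a minimal sketch. It treats rupture force as the "time" variable and assumes every event is observed (no censoring), which is a simplification; the authors' actual pipeline is not described here. The function name and the two force lists are hypothetical placeholders, not data from the manuscript.

```python
import math


def logrank_test(group_a, group_b):
    """Two-sample log-rank test for fully observed (uncensored) values.

    Returns (chi_square, p_value) with 1 degree of freedom.
    Assumption: every rupture event is observed, so no censoring is handled.
    """
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")

    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted(set(group_a) | set(group_b)):
        n_a = sum(1 for x in group_a if x >= t)  # still attached in A at force t
        n_b = sum(1 for x in group_b if x >= t)  # still attached in B at force t
        d_a = sum(1 for x in group_a if x == t)  # ruptures in A at force t
        d_b = sum(1 for x in group_b if x == t)  # ruptures in B at force t
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n       # observed minus expected in A
        if n > 1:
            # Hypergeometric variance of the number of ruptures in A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        raise ValueError("variance is zero; the groups cannot be compared")

    chi_square = obs_minus_exp**2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical rupture forces in pN, for illustration only
forces_condition_1 = [4.1, 5.3, 6.0, 7.2, 7.9, 8.4, 9.1, 10.5]
forces_condition_2 = [6.8, 7.7, 8.9, 9.6, 10.2, 11.0, 12.3, 13.1]

chi_square, p_value = logrank_test(forces_condition_1, forces_condition_2)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

      Because the test compares whole survival curves rather than medians, two conditions with close medians can still differ significantly, and vice versa, which is consistent with the mixed p-values reported in the response above.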

      2) On p.19 the authors note that Aurora A phosphorylates Ndc80 tail during mitosis. Ye et al. (Curr Biol, 2015) also showed that Aurora A can phosphorylate Aurora B sites and that this activity "converges" at the tail to weaken attachments during error correction.

      Response: We will add the reference and thank the reviewer for pointing out this omission.

      3) Optional: I am curious as to whether the addition of ATP to the Ndc80-8D particles further reduces the rupture force. If so then other sites may also be in play.

      Response: We agree this is an interesting question but we have not yet performed those assays and agree it might be worthwhile for a future study.

      4) Please comment on why MnCl2 is used in the rupture assays in Figure S1. I saw no mention of this in the main text.

      Response: We include MnCl2 in the assay because it is required for phosphatase activity and will add this point to the legend of supplementary Figure S1.

      5) Consider moving S2 A and B to Figure 3 C and D. This is an interesting result and would go well in the main figure next to the significantly reduced rupture force measurements for the 6A mutant so the reader doesn't have to dig into the supplemental for the data providing this reasonable explanation for the rupture force result.

      Response: We thank the reviewer for this suggestion and will move S2A and S2B into Figure 3.

      Reviewer #3 (Significance (Required)):

      The significance of this relates to focusing on an important phenomenon - error correction - and in looking beyond the traditional focus of the field on Aurora kinases to Mps1 kinase, which is largely implicated in checkpoint signaling. Disentangling the contributions of these two players is an important advance.

      The work will be of interest to audiences interested in: kinases, cell division, checkpoints, kinetochore biology, biophysics

      The above areas of interest overlap with my expertise.

      Response: We thank the reviewer for their enthusiasm for our experiments that help distinguish kinase activities and thus contribute to understanding the process of error correction.

    1. 1. A solution which has become increasingly popular for dealing with resistance to change is to get the people involved to “participate” in making the change. But as a practical matter “participation” as a device is not a good way for management to think about the problem. In fact, it may lead to trouble.

      Kunal - This is what we are trying to do. They have proposed a solution a few paragraphs below.

    1. Author Response:

      Reviewer #1 (Public Review):

      The authors have studied mutations in the K13 gene that is linked to Artemisinin resistance in a range of African parasites. They show that these mutations can confer resistance in an in vitro survival assay but that they are often linked to reduced fitness. The authors also show that different parasites have less of an impact on fitness when the K13 mutations are introduced in line with the suggestion that the overall genetic background is critical for transmission of K13 mutations. The paper also shows evidence that genes potentially contributing to the genetic background are not involved.

      The overall work involves a significant amount of work to generate a wide range of different parasite lines that allow a detailed assessment of how different mutations interact with the genetic background of the parasite. This provides a significant amount of new insights. A key conclusion the authors draw from this work relates to the relationship between fitness and resistance and by inference on why artemisinin resistance has occurred in SE Asia. While this indeed would be a striking conclusion I think the data at this stage is not strong enough to make this claim. The claim is mainly based on Figure 3 E and F as well as 5 C and D. While indeed, initially it looks like RSA has much less of a survival impact in Dd2 there is some concern that the data is generated using different baselines (isogenic WT parasite in Figure 3 and Dd2eGFP in Figure 5 D). This is noteworthy as in Figure 5C the Dd2wt parasite is used and the fitness cost appears to be different.

      Please see our reply below to Reviewer 1 Comment #2.

      A striking finding is that the UG659C580Y line appears to have a relatively small fitness cost - especially if looked at for the whole 40 generations rather than the somewhat arbitrarily picked 38 days. This data could suggest that there are parasites in Africa that have the capacity to acquire resistance with minimal cost to fitness.

      We thank the Reviewer for this suggestion and have now recalculated our fitness data using a 36-day period, which we have adopted as a standardized timeline and which allows us to compare across all prior and newly acquired fitness assays. We note that this is already relatively lengthy compared to a number of other reports in the literature. For example, Baragana et al. (2015, Nature) measured competitive growth rates over a 14-day period. Gabryszewski et al. (2016, Mol Biol Evol) used 20-day assays. Siddiqui et al. (2020, mBio) used longer 48-day assays. We agree with the Reviewer that our data suggest that some African strains can achieve in vitro ART resistance with a minimal cost to fitness. In support of this, our new data presented in the revised Figure 3 provide evidence for the R561H mutation having little to no fitness cost in 3D7 parasites that are closely related to Rwandan isolates (see our response above to Comment #2 from the Editors).

      As pointed out above, we now include new fitness data on the R561H variant in African parasites, based on competition assays with an eGFP reporter line. To standardize our fitness data, we now have analyzed our data to day 36 across assays, as follows:

      Methods lines 538-539: “Cultures were maintained in 12-well plates and monitored every four days over a period of 36 days (18 generations) by harvesting at each time point a fraction of each co-culture for saponin lysis.”

      Figure 3 Legend lines 920-921: “K13 mutant clones were co-cultured at 1:1 starting ratios with isogenic K13 wild-type controls over a period of 36 days.”
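      As an illustration of how a per-generation fitness cost can be derived from a 1:1 co-culture of this kind, here is a minimal sketch. It assumes a simple exponential competition model in which the mutant-to-wild-type ratio shrinks by a constant factor (1 - x) each generation; this model and the function name are assumptions made for illustration and may differ from the calculation used in the manuscript. The 18-generation figure follows the 36-day assay length quoted above.

```python
def fitness_cost_per_generation(start_fraction: float,
                                end_fraction: float,
                                generations: int) -> float:
    """Estimate the relative fitness cost x per generation.

    Assumed model (not taken from the manuscript):
        ratio_n = ratio_0 * (1 - x) ** n
    where ratio = mutant / wild-type in the co-culture.
    A positive x means the mutant is outcompeted by the wild-type control.
    """
    for frac in (start_fraction, end_fraction):
        if not 0.0 < frac < 1.0:
            raise ValueError("fractions must lie strictly between 0 and 1")
    if generations <= 0:
        raise ValueError("generations must be positive")

    ratio_start = start_fraction / (1.0 - start_fraction)
    ratio_end = end_fraction / (1.0 - end_fraction)
    return 1.0 - (ratio_end / ratio_start) ** (1.0 / generations)


# Hypothetical example: mutant starts at 50% of the co-culture (1:1 ratio)
# and falls to 30% after 18 generations (36 days at roughly 48 h per cycle).
cost = fitness_cost_per_generation(0.50, 0.30, 18)
print(f"Estimated fitness cost: {cost:.1%} per generation")
```

      Expressing the result per generation rather than per assay is what makes assays of different lengths (14, 20, 36, or 48 days) comparable, provided the constant-cost assumption holds over the whole period.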

      The selective sweep to C580Y in SE Asia is something that has been known for a while. It is striking that it has been selected, as based on the data presented here P553L has a similar fitness and RSA profile. The authors could explore this further.

      The Reviewer highlights the important point that RSA values and fitness were comparable for C580Y and P553L, yet only the former swept across Southeast Asia. This would argue for additional factors that contribute to the successful dissemination of C580Y. These could include favorable genetic backgrounds that help propagate C580Y mutant parasites, or increased transmission rates, relative to P553L. To date, reasons for C580Y’s success beyond its moderate resistance and relatively minor fitness cost have not been firmly established. One possibility might be related to piperaquine pressure that selected for amplification in plasmepsins II and III as well as novel mutations in PfCRT, which emerged in parasites harboring K13 C580Y and which have been shown to spread as a series of genetically closely related sublineages (referred to as KEL1/PLA1; Hamilton et al. 2019, Lancet Infect Dis; Imwong et al. 2020, Lancet Infect Dis). These points are discussed as follows:

      Discussion lines 361-369: “Our studies into the impact of K13 mutations on in vitro growth in Asian Dd2 parasites provide evidence that the C580Y mutation generally exerts less of a fitness cost relative to other K13 variants, as measured in K13-edited parasites co-cultured with an eGFP reporter line. A notable exception was P553L, which compared with C580Y was similarly fitness neutral and showed similar RSA values. P553L has nonetheless proven far less successful in its regional dissemination compared with C580Y (Menard et al., 2016). These data suggest that additional factors have contributed to the success of C580Y in sweeping across SE Asia. These might include specific genetic backgrounds that have favored the dissemination of C580Y parasites, possibly resulting in enhanced transmission potential (Witmer et al., 2020), or ACT use that favored the selection of partner drug resistance in these parasite backgrounds (van der Pluijm et al., 2019).”

      Overall, the main conclusion that there are K13 mutations that can confer resistance to ART in the context of African parasites is clearly presented and convincing, and this highlights the risk that exists for public health officials in African nations. What would be interesting from a reader's perspective is how likely it is that this loss-of-fitness hurdle is going to be overcome in Africa and whether the risk of resistance development will increase as transmission rates drop.

      We appreciate this suggestion from the Reviewer. Our revised manuscript now addresses this topic as follows:

      Discussion lines 393-399: “It is nonetheless possible that secondary determinants will allow some African strains to offset fitness costs associated with mutant K13, or otherwise augment K13-mediated ART resistance. Identifying such determinants could be possible using genome-wide association studies or genetic crosses between ART-resistant and sensitive African parasites in the human liver-chimeric mouse model of P. falciparum infection (Vaughan et al., 2015; Amambua-Ngwa et al., 2019). Reduced transmission rates in areas of Africa where malaria is declining, leading to lower levels of immunity, may also benefit the emergence and dissemination of mutant K13 (Conrad and Rosenthal, 2019).”

      Reviewer #2 (Public Review):

      In this paper, the investigators performed two large-scale surveys of the propeller domain mutations in the K13 gene, a marker of artemisinin (ART) resistance, in African (3299 samples) and Cambodian (3327 samples) Plasmodium falciparum populations. In the African parasite population, they identified the K13 R561H variant in Rwanda, while parasites from other areas had the wild-type K13. In Cambodia, however, they documented a hard genetic sweep of C580Y mutation that occurred rapidly. They generated the C580Y and M579I mutations in four different parasite strains with different genetic backgrounds and found that these mutations conferred varying degrees of in vitro ART resistance. They further edited the SE Asian parasite strains Dd2 and Cam3.II with 7 K13 mutations and found that all the propeller domain mutations conferred ART resistance in the Dd2 parasite, whereas three of the mutations did so in the Cam3.II background. The R561H and C580Y mutations were also evaluated in several parasites collected from Thailand. In vitro growth competition analysis showed that K13 mutations caused substantial fitness costs in the African parasite background, but much less fitness costs in the SE Asian parasites. This study demonstrated the potential emergence of ART resistance in African parasite populations and offered insights into the importance of the parasite's genetic background in the emergence of ART resistance.

      We thank the Reviewer for this thorough summary and favorable assessment of our work.

      Reviewer #3 (Public Review):

      Stokes et al address the question: Why have mutations in the K13 gene spread rapidly across South East Asia and led to widespread treatment failure with artemisinin-based antimalarials? In contrast, why do K13 mutations remain quite rare in Africa, and artemisinin-based antimalarials remain effective?

      The work combines a number of different studies on different parasites of different origins. Gene editing has been used to assess the effects of K13 mutations in different parasite backgrounds, leading to a very complex view of the competing factors of level of resistance conferred and fitness cost.

      The authors put forward the hypothesis that fitness costs associated with K13 mutations select against their dissemination in the high malaria transmission settings in Africa. However, the complexity of the genetic backgrounds of the parasites makes it difficult to tease out the contributing factors.

      We agree that these are complex and multifactorial areas of investigation and appreciate the Reviewer’s summary.

    1. Author Response:

      Reviewer #1 (Public Review):

      This work described a novel approach, host-associated microbe PCR (hamPCR), to both quantify microbial load compared to the host and describe interkingdom microbial community composition with the same amplicon library preparation. The authors used the host single (low-copy) genes as PCR targets to set the host reference for microbial amplicons. To handle the problem that in many cases, the host DNA is excessive compared to the microbiome DNA, the authors adjusted the host-to-microbe amplicon ratio before sequencing. To prove the concept, hamPCR was tested with the synthetic communities, was compared to the shotgun metagenomics results, was applied in the biological systems involving the interkingdom microbial communities (oomycetes and bacteria), or diverse hosts, or crop hosts with large genomes. Substantial data from diverse biological systems confirmed the hamPCR approach is accurate, versatile, easy-to-setup, low-in-cost, improving the sample capacity and revealing the invisible phenomena using regular microbial amplicon sequencing approaches.

      Since the amplification of host genes would be the key step for this hamPCR approach, the authors might also include more strategy discussions about the selection of single (low copy) genes for a specific host and the primer design for the host genes to guarantee the hamPCR usage in the biological systems other than those mentioned in the manuscript.

      A deeper discussion about the design of suitable host primers has been added to the Supplementary Information as Supplementary Discussion 3, and is now mentioned in the main text in the first section of the Methods.

      Reviewer #2 (Public Review):

      Lundberg and colleagues provide a detailed set of data showing the utility of host-associated microbe PCR. By simultaneously amplifying microbial community and host DNA, hamPCR provides an opportunity to measure the microbial load of a sample. I was largely convinced about the robustness of this approach after seeing the many different optimization datasets that were presented in the paper. I also appreciated the various applications of hamPCR that were demonstrated and compared to other standard approaches (CFU counting and shotgun metagenomics, for example). As clearly illustrated in Figure 6f, hamPCR could dramatically improve our understanding of interactions within microbiomes as it helps remove issues of relative abundance data.

      One challenge about the approach presented is that it cannot be quickly adapted to a new system. Unlike most primers for 'standard' microbial amplicon sequencing, considerable time will be required to determine which host gene to target, how to make that host gene size larger than the size of the microbial amplicon, etc. This may limit wide adoption of hamPCR in the field. I do appreciate the authors providing some details in the Supplement on how they developed hamPCR for the several different systems described in this paper. The helpful tips may make it easier for others to develop hamPCR for their own systems.

      Additional primer design strategy is addressed in our response to Reviewer #1 (Public Review).

      An issue that repeatedly came up is that at high and low ends of host:microbe ratios, inaccurate estimates can occur. For example, with high levels of microbial infection, the authors note that hamPCR has reduced accuracy. The authors propose three solutions to this problem (1. altering host:microbe amplicon ratio, 2. use a host gene with higher copy number, 3. and adjust concentrations of host primers), but only present data for #1 and 3. Do they have any data to show that #2 would actually work?

      One instance of potential unreliable load that sticks out in the paper is in Figure 5b. The authors note that this is likely due to unreliable load calculation. Is this just one of 4 replicates? What are other potential reasons this would be an outlier and how can the authors rule this out? Did they repeat the hamPCR for this outlier to confirm the striking difference from the other three samples in the eds1-1 Hpa + Pto sample?

      Both qPCR and amplicon sequencing can be used to detect copy number variation in genomes [1]. Because amplicon-based methods are known to be sensitive to small differences in gene copy number, we are confident, without generating additional data on the topic, that #2 would work.

      Furthermore, bacterial genomes from different taxa are known to vary in their copy number of 16S rDNA, usually from 1 to about 15 copies [2]. These variations are reflected in sequence counts from amplicon sequencing, biasing the counts towards taxa with more 16S rDNA gene copies [2, 3, 4]. This well-documented phenomenon distorts the description of microbial communities and has therefore led to efforts to correct 16S rDNA gene amplicon data by dividing the counts from each taxon by the (estimated) 16S rDNA copy number of that taxon, so that the counts better reflect the numbers of bacterial cells.
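      The correction described above can be illustrated with a minimal sketch. The taxon names, read counts, and copy numbers below are hypothetical values chosen for illustration only, and the function is not taken from the manuscript or from any published tool.

```python
def correct_for_copy_number(counts, copy_numbers):
    """Divide each taxon's read count by its (estimated) 16S copy number,
    then rescale the corrected values to relative abundances."""
    corrected = {taxon: counts[taxon] / copy_numbers[taxon] for taxon in counts}
    total = sum(corrected.values())
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical example: equal read counts, different 16S copy numbers.
counts = {"taxon_A": 600, "taxon_B": 600}
copy_numbers = {"taxon_A": 2, "taxon_B": 6}
relative = correct_for_copy_number(counts, copy_numbers)
```

      In this toy case the uncorrected data suggest two equally abundant taxa, whereas the corrected estimate assigns taxon_A three times as many cells as taxon_B. As references [3] and [4] caution, the reliability of such a correction depends on how well the copy numbers are known.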

      Because amplicon methods are sensitive to copy number variation (whether those copies are from inside the same cell, or coming from different cells), we reasoned that choosing a host gene with a higher copy number, similar to the effects of copy number variation on 16S rDNA gene counts, will increase the representation of that host amplicon in the final library (because there will be more template host DNA molecules available to amplify). We did not test this explicitly - we think the evidence from literature is strong support on its own. We have added to the paper a statement that now references the Kembel 2012 paper, which we hope adequately supports our claim:

      “Second, a host gene with a higher copy number could be chosen for HM-tagging throughout the entire project, which would increase host representation by a factor of that copy number (Kembel et al., 2012).”
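      As a rough illustration of this reasoning, the sketch below assumes that host read counts scale linearly with host gene copy number, so that dividing by the copy number returns an estimate in host genome equivalents. The function name, the read counts, and the linearity assumption are ours for illustration; the sketch ignores any deliberate adjustment of the host-to-microbe amplicon ratio before sequencing.

```python
def microbial_load(microbe_reads, host_reads, host_copy_number=1):
    """Estimate microbial load as microbial reads per host genome equivalent.

    Assumes (hypothetically) that host reads increase in proportion to the
    copy number of the chosen host gene.
    """
    if host_reads <= 0 or host_copy_number <= 0:
        raise ValueError("host_reads and host_copy_number must be positive")
    host_genome_equivalents = host_reads / host_copy_number
    return microbe_reads / host_genome_equivalents


# Hypothetical sample amplified with a single-copy host gene.
load_single_copy = microbial_load(3000, 1000, host_copy_number=1)

# Same hypothetical sample with a four-copy host gene: four times the host
# reads, but the same load after accounting for copy number.
load_four_copy = microbial_load(3000, 4000, host_copy_number=4)
```

      Under these assumptions, the higher-copy gene raises host representation in the library without changing the load estimate, which is the behavior the quoted statement relies on.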

      1) Martins, W.F.S., Subramaniam, K., Steen, K. et al. Detection and quantitation of copy number variation in the voltage-gated sodium channel gene of the mosquito Culex quinquefasciatus . Sci Rep 7, 5821 (2017). https://doi.org/10.1038/s41598-017-06080-8

      2) Kembel, S. W., Wu, M., Eisen, J. A., & Green, J. L. (2012). Incorporating 16S gene copy number information improves estimates of microbial diversity and abundance. PLoS Computational Biology, 8(10), e1002743. https://doi.org/10.1371/journal.pcbi.1002743

      3) Starke, R., Pylro, V. S., & Morais, D. K. (2021). 16S rRNA Gene Copy Number Normalization Does Not Provide More Reliable Conclusions in Metataxonomic Surveys. Microbial Ecology, 81(2), 535–539. https://doi.org/10.1007/s00248-020-01586-7

      4) Louca, S., Doebeli, M., & Parfrey, L. W. (2018). Correcting for 16S rRNA gene copy numbers in microbiome surveys remains an unsolved problem. Microbiome, 6(1), 41. https://doi.org/10.1186/s40168-018-0420-9

      Could the DNA extraction method used cause biases in hamPCR for/against either the host or the microbiome? If two different labs study the same system (let's say bacterial communities growing on Arabidopsis leaves) but use different DNA extraction approaches, would we expect them to obtain different answers using hamPCR? Did the authors try several different DNA extraction methods to see if this is an issue? Or has another team of researchers considered this and addressed it in a separate paper? I would appreciate seeing either data to address this or a discussion paragraph that reasons through this.

      Differences in DNA extraction method will certainly change the results, not only of the microbe-to-plant ratio, but also in the representation of microbes, because microbes differ in their sensitivity to different lysis methods. This is a well-documented concern in microbiome studies and has been demonstrated by using different methods on the same mock community in papers such as the following:

      Yuan, S., Cohen, D. B., Ravel, J., Abdo, Z., & Forney, L. J. (2012). Evaluation of methods for the extraction and purification of DNA from the human microbiome. PloS One, 7(3), e33865. https://doi.org/10.1371/journal.pone.0033865

      Albertsen, M., Karst, S. M., Ziegler, A. S., Kirkegaard, R. H., & Nielsen, P. H. (2015). Back to Basics--The Influence of DNA Extraction and Primer Choice on Phylogenetic Analysis of Activated Sludge Communities. PloS One, 10(7), e0132783. https://doi.org/10.1371/journal.pone.0132783

      In short, if the DNA is not extracted because plant or microbial cells are not lysed, it cannot be amplified in PCR. However, there is a good overall strategy to minimize the problem, as also proposed in the above papers, and that is to err on the side of a harsher lysis (using strong bead beating, as we have done), since this will leave fewer cells unlysed (and thus less information will be hidden). We note that similar concerns about lysis methods changing results also apply to DNA extraction for qPCR and live bacterial isolation for CFU counting (for which too harsh a lysis will kill bacteria, but too gentle a lysis will not release them from host tissue).

      We addressed this in two places. First, in the results section we mention briefly the following:

      “All DNA preps employed heavy bead beating to ensure thorough lysis of both host and microbes, as an incomplete DNA extraction can lead to underrepresentation of hard-to-lyse cells (Albertsen et al., 2015; Yuan et al., 2012).”

      Second, we added a paragraph to the discussion about sample selection and DNA extraction as follows:

      “Because hamPCR can only quantify the DNA available in the template, choice of sample and appropriate DNA extraction methods are very important. In particular, the sample must in the first place include a meaningful quantity of host DNA. For example, although there is some host DNA in mammalian fecal samples or in plant rhizosphere soil samples, this host DNA does not accurately represent the sample volume, and therefore relating microbial abundance to host abundance probably has less value in these cases. Further, the DNA extraction method chosen must lyse both the host and microbial cell types. An enzymatic lysis suitable for DNA extraction from pure cultures of E. coli may not lyse host cells or even other microbes. Appropriate DNA preparation methods for metagenomics have been thoroughly evaluated elsewhere (Albertsen et al., 2015; Yuan et al., 2012), and a common point of agreement is that strong bead-beating increases the yield and completeness of the DNA extraction, but comes at the cost of some DNA fragmentation. Especially for short reads, as we have used here, this fragmentation is not a problem, and we recommend to err on the side of a harsher lysis, using strong bead beating potentially preceded by grinding steps using a mortar and pestle as necessary for tougher tissue.”

      One emerging theme in microbiome science is to have consistent methodologies that are used across studies/labs to allow direct comparisons of microbiome datasets. Standardization of approaches may make microbiome science more robust in the long-term. Given much of the nuance in developing hamPCR for different systems, my impression is that this method is best for comparing samples within a particular host-microbe system and not across systems. For example, it may be challenging to directly compare my bacterial load hamPCR data from Arabidopsis to another lab's if we used different Arabidopsis host genes or if we used different 16S gene regions. Can the authors unpack this a bit in a discussion paragraph? If it is widely adopted, is there a way to standardize hamPCR so that it can be consistently used and compared across datasets? Or should that not be the goal?

      There appears to be considerable non-specific amplification or dimers in the gels presented throughout the manuscript. Could this non-specific amplification vary across host-microbe primer combinations? Would this impact quantification of host and microbial amplicons?

      Non-specific amplification / dimers do vary across host-microbe primer combinations. Indeed, they also vary between common 16S rRNA primer pairs used on their own (not shown). Fortunately non-specific amplicons amplified during the exponential PCR step do not, at least with our method, seem to impact quantification of host and microbial amplicons.

      One reason is that non-specific amplicons can be recognized by their sequence and ignored. After the sequences of the amplicons have been extracted from the short read data, only those that match expected length and sequence patterns of the targeted amplicons need to be counted. Non-specific amplicons are certainly a nuisance because they represent wasted sequencing resources, but they can be excluded bioinformatically and therefore do not change the accuracy of the microbial load measurement. This is in contrast to ddPCR/qPCR, for which any off-target amplicons are also quantified!
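      The bioinformatic exclusion described above can be sketched as follows. The primer-adjacent prefixes, length ranges, and reads are hypothetical placeholders and do not correspond to the actual hamPCR amplicons or to the pipeline used in the paper.

```python
from collections import Counter

# Hypothetical expected amplicons: a required leading sequence pattern and
# an allowed length range for each target.
EXPECTED = {
    "host": {"prefix": "ACGT", "min_len": 12, "max_len": 16},
    "microbe": {"prefix": "TTGA", "min_len": 8, "max_len": 10},
}


def classify(seq):
    """Return the matching target name, or 'off_target' if none matches."""
    for name, spec in EXPECTED.items():
        length_ok = spec["min_len"] <= len(seq) <= spec["max_len"]
        if seq.startswith(spec["prefix"]) and length_ok:
            return name
    return "off_target"


def count_amplicons(seqs):
    """Tally sequences per class; off-target sequences are counted separately."""
    return Counter(classify(seq) for seq in seqs)


reads = ["ACGTACGTACGTAC", "TTGACCGTA", "GGGGGG", "TTGACC"]
tally = count_amplicons(reads)

# Only on-target classes enter the load estimate.
load = tally["microbe"] / tally["host"]
```

      Off-target sequences still consume sequencing depth, but because they are tallied in their own class they do not enter the ratio of microbial to host counts.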

      A second reason is that the sensitive exponential amplicon step of hamPCR is done with a single primer pair. Off-target sequences do squander PCR reagents including primers and dNTPs, such that they become limiting at earlier cycles than without off-target sequences, but because the exponential PCR step is done with a single primer pair, such inferior amplification conditions are shared by all molecules, and therefore do not differentially affect the host or microbial amplicon. Any off-target binding occurring in the initial tagging reaction (before the PCR step) would certainly be a concern if the reaction was carried on long enough, because for example the microbial primer pair might become limiting at an earlier cycle number, leading to underestimates of microbial load. However, limiting the tagging cycle to a low number of cycles ensures that – should primers targeting a particular host or microbial amplicon be non-specific – the fraction still available to bind the correct sequence remains in excess.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      General comments:

      We thank the reviewers for their constructive critique and are pleased they see the results as interesting and of general relevance. We also acknowledge their concerns on the issue of whether all claims are supported by sufficiently strong data. Our careful reading and analysis of the points that are raised suggest there are different reasons for the different cases that are brought up:

      1. Misunderstandings, due to lack of clarity on our side. Example: When talking about ‘reduced actin’, our wording focussed on the endosome-associated actin (partly out of consideration for the fact the actual measurements we show come from the area around the endosomes, so we did not want to make any stronger claims), even though we should have made it clear that other areas of the cell tip are also affected. This will be addressed by clearer explanations.

      2. Ill-advised wording we chose that is or can be seen as overinterpretation. Example: ‘anchoring’ of actin at endosomes. We had not intended to infer anything about specific anchoring sites or mechanisms. We should have used a more neutral term, such as ‘associate with’ or ‘accumulate around’ for the description. This and other cases can also be resolved by rewriting and better wording.

      3. Anecdotal evidence or insufficient data. Example: Images of phalloidin stainings depicting how actin is organized around late endosomes in control embryos. These and other cases will be addressed by adding further examples and additional quantification.

      Finally, one suggestion was made for obtaining additional experimental data, which would involve laser ablation. While the experiment would provide an interesting extension of our findings, we will sadly not be in a position to carry it out in the foreseeable future, as explained below. We hope the referees will agree that our now extended discussion addresses the point in question sufficiently to support the conclusions from the experiments we do present.

      These and all other points are addressed individually below. We highlighted the corresponding text changes in the manuscript file for the reviewers to identify them more easily.

      Detailed responses:

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      Rios-Barrera and Leptin investigate the formation and guidance of the subcellular tube that forms in the terminal cell of the dorsal branches of the Drosophila tracheal system. In previous work the authors documented the presence of late endosomes at the tip of the growing terminal cell, ahead of the forming subcellular tube, which are involved in membrane recycling (Mathew et al., 2020). In this present work they analyze the organization of the actin cytoskeleton in the tip terminal cell, in relation with the late endosomes, and assess the guidance of the subcellular tube. They find that the presence and localization of late endosomes play a role in tube guidance. They also find that late endosomes recruit actin around them, mediated by the activity of the actin polymerization regulator Wash, which is recruited to maturing late endosomes. When Wash activity is decreased, actin around late endosomes is decreased and tube guidance is compromised. Based on this observation, laser ablation experiments of actin ahead of the tube and actin staining at the tip of the terminal cell, the authors propose an exciting model: late endosomes recruit actin, which connects the actin pool of the basal membrane and the actin pool of the apical (subcellular tube) membrane thereby directing tube growth and guidance.

      The manuscript is well-written and well-presented, the images and movies are of high quality and the experimental data, which is technically challenging, is very good and sufficiently replicated.

      **Major comments:**

      1. A critical point in the model that the authors put forward (which is also contained in the title and abstract) is that actin organized at late endosomes anchors apical and basal actin cortices. However, there is no clear and conclusive evidence for this. Clear evidence in this direction should be provided to propose it as a mechanism (as it is in the text, particularly in the first sentences of the discussion) and imply it in the title. The authors show endogenous actin around late endosomes and actin fibers at the tip of the terminal branch. However, at the level of resolution presented (Fig 3A,B), it is not possible to determine whether the different actin populations are actually "anchored". I suggest to present stronger data supporting this important conclusion.

      In the same direction, it would be critical to show that this anchoring of actin fibers is disturbed when actin enrichment at the late endosome is perturbed (see also point 5).

      Actually, the authors show that when Vha or Wash activity is downregulated, actin accumulation around the CD4 vesicles decreases. However, this experiment has a few inconveniences. First, it is difficult to determine levels of a construct that is overexpressed (UAS-utr::GFP). Could the authors use phalloidin or an actin antibody to confirm the result?

      Second, I find the result difficult to interpret. In the images provided I see a general decrease of actin (UtrGFP) at the tip, not only around the CD4 vesicles (Fig 6D,F) . Are these mutant conditions also affecting the rest of actin pools? If this is the case, can the authors attribute the defects exclusively to the abnormal recruitment of actin to the late endosomes?

      Most importantly, the authors should analyze the pattern of actin distribution (labelling endogenous actin) and determine a possible loss of "anchoring" of fibers when late endosome maturation is perturbed.

      We understand the referee addresses three issues here, to which we will respond in turn below:

      • As already mentioned above, the referee interprets the term ‘anchoring’ in a more specific meaning than we had intended it to have. We obviously have to rephrase.
      • A technical critique of the use of an overexpressed construct to visualize actin, which in turn has two sub-points: potential physiological effects on actin, and potentially inaccurate localisation. Both are valid points, but in our view do not undermine our conclusions. We will raise and discuss these concerns in our revised text.
      • The specificity of the effects of reducing Vha and Wash function on actin associated with endosomes versus throughout the growth cone of the cell – a very good point, about which we should have been much clearer and now will be.

      (a) Considering the use of the term ‘anchoring’, and the referee’s concern over whether we provide the appropriate evidence, gave us a good starting point to re-think what we actually show and how it can be interpreted.

      Put in neutral terms, what [we felt] we had shown was an accumulation or enrichment of actin around endosomes that was dependent on proper functioning of Vha and Wash.

      We agree that the term ‘anchoring’ cannot be justified by the description of actin localisation alone. The term implies a physical (and perhaps strong or long term) interaction between the endosome and the surrounding actin.

      We see a strong enrichment of actin around endosomes, including in experiments in which we use phalloidin to visualize actin (Fig. 3B). The resolution of our images is approximately 200nm so they are able to reveal the very close association. The question is what the mechanistic basis for this closeness is. It is unlikely to be random, as shown by a quantification we have now included (Figure S3C). It is difficult to imagine how it could persist without at least transient physical interaction between the two components. The association is indeed highly dynamic and is constantly being re-established. This must mean that something ‘attracts’ actin to endosomes, most likely a molecule that is itself associated with endosomes. The presence or accessibility of such a molecule depends on the proper maturation of endosomes, as shown by the results of reducing Vha activity. And the ability of actin to associate depends on Wash. Together these findings suggest to us the existence of a (dynamic) molecular link between the endosomes and the actin network.

      In order not to give the impression that we are claiming a permanent ‘anchor’, we now use more general terms such as ‘associates’ or ‘accumulates’, but also include the clarification on our thinking in the text. Furthermore, to illustrate a representative range of cases, we will add more examples of late endosomes and the actin meshwork surrounding them (Figure S3A, B). These images should give a broader reflection of the actin populations and their dynamism during tube growth.

      (b) A major reason why we use live imaging with actin reporters is that the distribution of actin around late endosomes and the tip compartment in general is very dynamic, so capturing cells at the right time point can be challenging from fixed samples. This problem is exacerbated by a technical limitation: For the actin cytoskeleton to be well preserved during fixation, embryos have to be manually dechorionated which limits the throughput of the experiment. We therefore found that analysing cells over time is more informative than analysing cells fixed at a given time point.

      As the reviewer points out, using an overexpressed reporter can have drawbacks. With regard to the problem of not representing the endogenous distribution faithfully, this can be the case when making statements about the absolute distribution. However, what we are looking at here is not absolute quantities of actin but relative changes in the area of interest with respect to other, unaffected regions of the cells, and then comparing these between mutant conditions and the control. We do this by normalizing the signal to the levels seen in the subcellular tube, using it as an internal control that allows us to adjust for variation in expression levels.
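      The normalization described above can be sketched in a few lines. The pixel intensities and function name below are hypothetical and serve only to show why a ratio to the tube signal is insensitive to overall reporter expression; this is not the authors' image-analysis code.

```python
def normalized_actin(endosome_roi, tube_roi):
    """Mean reporter intensity around the endosome divided by the mean
    intensity at the subcellular tube (used as an internal control)."""
    tube_mean = sum(tube_roi) / len(tube_roi)
    if tube_mean <= 0:
        raise ValueError("tube signal must be positive")
    endosome_mean = sum(endosome_roi) / len(endosome_roi)
    return endosome_mean / tube_mean


# Hypothetical control cell: endosome signal is twice the tube signal.
control = normalized_actin([200, 220, 180], [100, 100, 100])

# Hypothetical mutant cell with half the reporter expression overall and
# no enrichment around the endosome.
mutant = normalized_actin([50, 60, 40], [50, 50, 50])
```

      In this toy example the control ratio is 2.0 and the mutant ratio is 1.0, so the comparison reflects relative enrichment rather than absolute reporter levels.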

      There is one case where such a normalization could be problematic, and that is when comparing actin levels in cells expressing bitesize RNAi, because Bitesize is itself involved in organizing the actin cytoskeleton in the tube membrane (JayaNandanan et al., 2014). However, in this experiment, the analysis still shows that actin levels at late endosomes do not correlate with the tube misguidance phenotype.

      With regard to potential physiological effects of an over-expressed construct, some of the commonly used actin reporters have subtle effects on actin physiology, whereas Utr-ABD has negligible or no effects on the actin cytoskeleton, and it also reproduces actin dynamics faithfully (Spracklen et al., 2014). It is therefore generally considered the most reliable tool for live imaging of actin in Drosophila.

      We have adapted the text and commented on these issues and hope we have achieved more clarity.

      (c) We agree that when Vha or Wash are downregulated, actin levels are overall reduced in the growth cone of the cells, while this is not the case in other regions, for example at the base of the cell. Although we had not explicitly stated this (but now will), this is a further indication that the different actin populations in the growing tip interact with each other.

      For the downregulation of Wash, this could potentially have been due to a direct effect of Wash on the apical and basal actin, but then we would have expected a similar result in other parts of the cell, including the cell body and the proximal part of the branch, but we do not see that. Even more importantly, the expression of Vha100-DN has the same effect and this cannot be easily explained by a direct action on actin. Together, these findings therefore indicate that depletion of actin around endosomes has a knock-on effect on the basal and apical actin cortex in the vicinity. We have included this reasoning in the paper now.

      Another critical point in the model put forward by the authors is that late endosomes drive tube guidance. To test this point the authors use an elegant system to mislocalize Rab7 late endosomes.

      However, the effects are not strong (1G), and only a proportion of branches show misguided tubes. Do the cases with a ventrally-guided tube in the experiment Rab7:YFP+/+ (Fig. 1G) have a CD4 endosome (with Rab7YFP) at the tip? This would help to explain the weak effect.

      This is an excellent point, and it is indeed what we observe: all cells with ventrally guided tubes have a late endosome that is positive for the YRab7-nanobody-membrane complex at the tip of the cell (n = 42), whereas only 2/3 of misguided tubes do (n = 12), and those always have the additional endosome at the tip of the misguided tube. As the reviewer suggests, this provides an obvious explanation for why these cells do not have a tube misguidance phenotype. We have added a representative image of this condition (Rab7::YFP+/+, ventrally-guided tube) in Figure 2 to illustrate the phenotype.

      What is the cause that preventing proper endosome maturation and acidification leads to misguided tubes (rather than missing ones)?

      A complete loss of late endosome activity would indeed result in the absence of the subcellular tube. However, we and others have shown that partial loss of function (as caused by RNAi) can have more subtle effects. For instance, fully blocking endocytosis using the shibire^ts line completely prevents proper tube extension (Mathew et al., 2020), but expression of a shibire RNAi still allows tube formation to proceed, albeit in a defective manner (Schottenfeld-Roames et al., 2014). Similarly, the misguidance phenotypes resulting from Vha downregulation likely reflect weaker loss of late endosome function. These perturbations would allow initial tube growth to proceed, but later on they would uncover this later function of the endocytic pathway in regulating tube guidance.

      We believe that what we see as this weaker defect is an uncoupling of direction from growth per se. The cells still receive their growth-inducing signals from the FGF-receptor, and this leads to directed cell growth in the direction of the chemotactic signal. The normal trafficking of membrane material from the apical to the basal domain is also not disrupted. Thus, membrane keeps being added to both domains and both the tube and the basal domain continue to grow. However, the growing tube has been disconnected from its guiding structure at the tip of the cell (our speculation: because failed endosome maturation no longer allows proper actin coordination) and therefore follows a random path. We had not been sufficiently clear about this but have now hopefully remedied this in the text.

      The authors indicate that downregulating Vha activity leads to defects in acidification, but late endosome-MVB normally form. It is intriguing to see extra CD4 vesicles (like in 1C or 6C).

      Wouldn't we expect to see "normal" tip accumulation of CD4 vesicles only, and not extra ones? How relevant are these extra CD4 vesicles?

      Wouldn't we expect to see "non functional" CD4 vesicles, unable to recruit actin and lead intracellular tube formation (i.e. no tube) rather than misguidances? (1D shows higher proportion of misguided tubes than no tubes)

      Similarly, is Wash-RNAi producing extra CD4 vesicles (as observed in movie 5, fig 6E)?

      We do not postulate that the late endosomes are morphologically normal – there are vesicles carrying the CD4 marker (which is only a membrane marker, not specific for endosomes), but the literature indicates that the endosomes do not undergo their normal maturations, and we would have no reason to claim otherwise. So we agree that the ones we see in the Vha-downregulated cells are not fully functional, and this is indeed confirmed by their inability to recruit actin.

      With regard to the number of large CD4 vesicles at the tip, terminal cells can normally have from 1 to 3 in the growth cone, and the fact that the experimental cells we showed were at the upper range whereas the control at the lower end was pure chance. We have now quantified the number of vesicles in the abnormal conditions and see that there is no increase (Figure S5F).

      Actin recruitment to late endosomes was already documented, where it plays a role in cargo trafficking.

      The authors propose that Wash is recruited to late endosomes upon acidification where it would prime actin nucleation around the endosome. The authors indicate a decrease in Wash accumulation upon expression of Vha dominant negative. However, this decrease is not quantified. In addition, it is difficult to determine levels of a construct when this is overexpressed (UAS-Wash::GFP). It would be desirable to use antibodies against the endogenous protein (Wash in this case) to claim differences in accumulation in mutant conditions.

      We have quantified the amount of Wash::GFP in CD4 vesicles. As mentioned, the vesicles are very dynamic, and so is their recruitment of Wash::GFP, and doing the analysis in the live cells is therefore more meaningful than extracting information from fixed samples, but we will also try to obtain the antibody for confirmation in fixed material. We appreciate that, as discussed above for actin, results using overexpressed constructs have to be interpreted with care, but here again we mitigate this by assessing relative changes rather than absolute amounts and by normalizing the signal to the one seen in the cytoplasm.

      The results presented do not rule out a requirement of Wash in terminal branching which is not associated with the enrichment in the late endosomes. The genetic interaction observed with Shrub is also compatible with both proteins acting on terminal branching but in different/parallel mechanisms.

      While the fact that downregulation of Vha has the same effect cannot be explained in this manner, we agree with the reviewer and will rephrase this section in the paper.

      Laser ablation experiments

      The laser ablation experiments are difficult to interpret.

      First, it is unclear to me what the results exactly indicate. What does the recoil observed suggest? Does it fit with the expected tension exerted by a link of the actin cytoskeleton relayed by late endosomes?

      The observed recoil suggests that there was tension across the ablated area. The laser ablation experiments were one way to evaluate whether the actin cytoskeleton within the tip of the cell was continuous between the subcellular tube and the leading edge of the cell. Tension along this axis would support such a model. We assumed that if the actin cytoskeleton at the tip is continuous with both membrane compartments it was likely to be under tension, and our laser ablation experiments showed that is indeed the case. We have rewritten this section to make it clearer.

      From the text and figure I don't understand how is the recoil calculated: retraction of the subcellular tube backwards? "enlargement" of the bleached area?

      Briefly, we had used three measuring points: the backward displacement of (i) the subcellular tube and (ii) of the plasma membrane adjacent to the ablated area, which both retract towards the cell body, and we also measured (iii) the forward displacement of plasma membrane on the other side of the ablated area. We then calculated the average of these for each experiment.

      However, we have now redone the evaluations of these experiments using PIV, an established method that is commonly used to calculate initial recoil after ablation and have explained this in the text.

      Second, it is unclear to me what laser ablation actually ablates. Does it only affect actin? Or are also CD4-late endosomes and other tip structures affected?

      The laser ablations with the conditions we use have in the past been shown to temporarily disrupt the actin cytoskeleton without otherwise damaging the cell (Rauzzi et al., 2015).

      The ablations were done in cells that express the actin reporter Utr::GFP together with the membrane marker CD4::mIFP but we have no reason to believe that CD4 containing structures were damaged. For example, upon ablation, the CD4 vesicles in the ablated area are bleached, but in the recovery phase, we observe actin puncta in the positions where CD4 vesicles were originally located, suggesting that the vesicles themselves persist. Our interpretation of these observations is that the bleached CD4 vesicles do not recover their fluorescence (CD4::mIFP is an integral membrane protein and cannot simply be re-inserted within short periods), but they are still capable of recruiting actin. We have added a representative image of this to better describe the experiment (Fig. S4).

      Third, is the recovery observed after ablation correlated with new actin recruitment around old or new late endosomes?

      Actin rapidly reappears in the bleached area and the region that recoiled, where it is first seen in the basal cortex and filopodia. The tube re-extends towards the ablated area, and actin reassembles around the tube within seconds. During further recovery, actin reappears in puncta ahead of the tube and we assume that this is partly de novo assembly around the existing vesicles (Fig. S4A, B). At the same time, we also see new CD4 vesicles reaching the tip, so it is likely that both populations (old and new vesicles) mediate the recovery phase. We have added images of additional examples that illustrate these points.

      Fourth, I find the experiments in cells with secondary subcellular tubes very confusing and the explanations very speculative.

      The data on cuts in cells with tube duplications are indeed difficult to interpret, and because the emergence of secondary branches is unpredictable, it is not easy to obtain large numbers of observations. Figure S4 is another example of the response of these cells to the laser cut, and we will make clear that our interpretations are merely speculative.

      Finally, and most importantly. I think that performing laser ablation experiments in mutant conditions that affect actin recruitment (VhaDN and Wash RNAi,....) would be very informative. One would expect to find a decrease in recoil. If this was the case, it would validate, on the one hand, that in control conditions there is a tension that depends (at least in part) on actin organization, and on the other hand it would show that when actin recruitment is affected tension decreases, supporting the "anchoring" model. I understand that laser ablation experiments are not easy to perform, but I think this would be a useful experiment.

      To my understanding, as it stands, the laser ablation experiments "....support the notion that adequate cytoskeletal organization at the tip is required for tube guidance and stability" as the authors acknowledge, but they do not convincingly support their "anchoring" model

      Laser cuts on cells that express Vha100-DN or wash-RNAi would be a nice addition that would take the work to the next level. But sadly, these are among the experiments that right now are impossible to carry out because of all the logistical and other problems resulting from the Covid pandemic, as explained in the cover letter.

      **Other comments:**

      • From the images presented, it is often difficult to figure out where the subcellular tube forms, the presence of vesicles, the cell morphologies,... and to determine the correlation between the CD4 vesicles and tube guidance.

      This is the result of a frustrating technical limitation. In experiments in the past we have used markers for the outline of the cell, as we do here, too. Thus, where CD4 is expressed under the btl-gal4 driver it marks the entire outline of the cell against a completely negative background. Even for other markers, if expressed under btl-gal4, the outline of the cell is visible against the dark background. However, for endogenously marked proteins that are expressed ubiquitously, this is no longer true, and as we add more markers to follow different structures, we run out of fluorescent colours for everything we would like to highlight (and genetically, out of chromosomes to accommodate the necessary transgenic or endogenously modified constructs). We will provide tracings of the outlines of the cells to make the images clearer.

      For instance, in Fig 1H and 1J, is there a "lateral" CD4 vesicle? Why does it not generate a misguided tube?

      Yes, there are also CD4 vesicles closer to the proximal part of the cell. They are enriched at but not restricted to the tip of the cell. As we have shown previously (Mathew et al., 2020), they emerge along the subcellular tube, and most are transported towards the tip (also seen in Fig. 1A, for example). Why the remaining ones do not affect the guidance of the tube is unclear, but it is almost certain that the growth of the tip of the cells towards the chemotactic FGF signal plays a role: the basal membrane is constantly moving away from the tip of the tube at this location, but not at the sides further down the branch.

      Fig 1I, are there 2 subcellular tubes? Can the authors mark them? I cannot really visualize them with the CD4 marker, they seem stalled or short or missing.

      In Fig 1I, the tube is curled up inside of the cell, a phenotype often seen in larval terminal cells with excessive FGF signaling (for instance see Ukken et al., 2014). We added diagrams that explain the morphology of the tubes in this figure.

      Fig 1L: what do the authors mean by "corrected" tube sprouts?

      This is not well phrased, and we will also improve the figure to make the point clearer.

      Panels 1K-M (now 1H-J) show snapshots from a movie of a cell that originally had only a misguided tube (at the top left) and is here in the process of forming its ‘correct’ tube growing in the ventral direction. In 1L (now 1I) this second tube is showing first signs of emerging; in 1M (now 1J) it is clearly visible. We have changed the wording in the figure and added an explanation in the legend, and we added a second example of this process in Figure S1.

      It is difficult to identify the cell in Fig 2D-F

      We added a dotted line in one of the channels showing the general morphology of the cell.

      • Movie S3: I find it difficult to spot the association of CD4 and utrGFP that the authors point. Can the authors label in the movie the vesicles and the association?

      We added pauses in the movie and arrows to the frames where actin is seen surrounding late endosomes.

      • The results with the Rab7 downregulation and upregulation are not very clear.

      Does the downregulation of Rab 7 (Rab7 DN construct) have any effect on tube guidance?

      Does it decrease or eliminate actin association with CD4 vesicles in the embryo? The authors show that in the larvae expression of Rab7 DN leads to loss of actin enrichment in Rab7 vesicles. Does this have an effect on terminal branching?

      Rab7DN is not visible in the embryo, so we did not pursue further experiments in those stages, and we previously showed that loss of Rab7 does not affect branching in larvae (Best and Leptin 2019). However, as the reviewer rightfully pointed out, expression of Rab7DN prevents actin nucleation at late endosomes in larval stages, so having the phenotypic consequence of this experiment would be informative and we are grateful for the observation. We had done the experiment, and we found no difference in the number of branches compared to controls. This suggests either that at larval stages actin recruitment at late endosomes is no longer required for branching or that there are redundant mechanisms that can balance the lack of actin nucleation. We favour the second model, because it has been shown that microtubules also play a role in tube branching and in coordinating the actin cytoskeleton (Araujo lab, 2021), so it is possible that actin nucleation can be bypassed. This is also consistent with the fact that the phenotypes we describe are not all fully penetrant, again pointing to redundant mechanisms ensuring consistent directed growth.

      We added the data regarding Rab7DN to the manuscript (Figure S2).

      The Rab7 active construct produces effects at larval stages but not in the embryo. Is terminal cell branching in the larvae also dependent on late endosomes? Can the authors show "excess" of late endosomes in the larvae that lead to extra terminal branches? Even though the authors indicate that they cannot detect Rab7Q67L, can they find any effect at embryonic stages (e.g. presence and position of CD4 vesicles, other unrelated effects,...)?

      Expression of Rab7CA in the embryo generates similar defects as nanobody-mediated mislocalisation of Rab7. We include below an example for the reviewer, but we did not feel comfortable including these data in the paper because some technical complications made them impossible to document and interpret with the certainty that we would wish. Most importantly, the YFP fusion protein is not detectable at embryonic stages, even with the most sensitive microscopes and detectors available to us. This means that we cannot correlate the observed phenotypes with the presence or absence of Rab7CA, which in our view makes them too weak for publication. At face value, these results suggest that Rab7CA begins to trigger branching during embryonic development, which eventually leads to the excess number of branches we see in the larva, but alas, we think this is too speculative to include in the paper.

      • In some examples in the movies there seems to be a correlation between CD4 vesicles presence/positioning and basal lamellipodia/filopodia or actin enrichment, and also in -btl experiments. Have the authors explored this? They may want to comment on this in the discussion section.

      That is a very pertinent point, and we should indeed have commented on it. If we assume the reviewer is looking at examples such as the one in Fig. 1I (currently S1C), then the explanation is the following. The terminal cells in the embryo often form transient side-branches, presumably in response to low-level FGF signals from the environment. In those cases, the basal actin cytoskeleton rearranges in the branching area to form the filopodia that lead the outgrowth of the branch, and what the reviewer observed is that this transient branch also forms the late endosome structure that we see in the main or proper growth cone. Thus, the guiding FGF-signal leads to a reorganisation of the entire actin cytoskeleton in the growth cone, and the formation of the actin-covered endosome is part of that process. We have included this in the discussion.

      Reviewer #2 (Significance (Required)):

      This work is relevant for the morphogenesis field and deals with the important issue of how the cytoskeleton regulates shape and cellular events. The work represents a deep analysis of a specific issue in the specialized field of tracheal development, but the results may be relevant for other types of cells forming subcellular tubes. Describing a function of trafficking vesicles (late endosome in this case) in cell morphogenesis (in addition to cargo trafficking) in an in vivo system is also relevant to advance in the cell biology field.

      **Referees cross-commenting**

      I agree with the comments of reviewer #1. I find relevant the points raised in "major comments number 2 and 4".

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The authors investigate the role of late endosomes in the context of actin organization during cell morphogenesis. As an experimental model they use the polarized terminal cells of the Drosophila tracheal system, which form a sub-cellular projection containing a tube. The authors show that disruption of the sub-cellular localization or maturation of late endosomes leads to an increased proportion of terminal cells with mis-guided tubes. Their analysis indicates that endosomal F-actin recruitment is crucial for the directionality of tube growth. The authors propose a model in which late endosomes control a coordinated crosstalk between endosomal and cortical actin pools to drive subcellular tube guidance.

      **Major comments:**

      1. The conclusion about how WASH functions in tube guidance is not clearly shown and should be better explained and documented. It is known that loss of function of WASH leads to dysregulation of endosomal tubulation, inducing enlarged endosomes, which in turn affects the endosome-to-plasma-membrane recycling of various cargos (including luminal cargos like Serp) (Gomez, et al., Mol. Biol. Cell, 2012; Dong et al, Nat. Comm. 2013). The authors should clarify if there is a defect in the integrity of endosomes located close to the cell tip in the btl>WASHIR knock down. In the cartoon panel C' (Figure 7), the endosomes in the cell tip are shown intact (including their relative position to the tip) but no experimental data support this conclusion.

      Given the fact that Wash contributes to proper late endosome morphology we do not necessarily expect the endosomes to look normal. We had not shown this in the diagram because our own data do not directly address this point, but the literature is of course clear enough about this, so we have modified our diagrams so that they better reflect the expected phenotypes and included a reference to the relevant literature.

      We and others have shown the important role of late endosomes in plasma membrane and luminal cargo delivery, and as elaborated in the response to referee 2’s point 3, complete loss of endosomal function blocks these processes. Here, at reduced but not abolished function plasma membrane delivery is clearly still functional.

      -The functional analysis of WASH was based on RNAi knock down. The authors express a single RNAi construct against WASH. The expression of this RNAi line gave a low penetrance phenotype. A well-known caveat of RNAi is off-targeting. Hence, phenotypic analysis needs to include a verification by a second independent RNAi construct or a rescue of the RNAi phenotype with an overexpressed cDNA of WASH. Ideally, the null wash mutant (Nagel et al. 2017) can be used to confirm the phenotype.

      Analysing wash mutants would provide a welcome additional confirmation of the knockdown results, and it is true in general that poorly characterised RNAi lines can have off target effects. However, this is a well validated line: Nagel et al. (2017) showed that the same RNAi line that we used fully recapitulates the phenotype seen in wash mutants: In both cases, actin fails to localize to late endosomes, and this is what we also found in terminal cells.

      Whereas we believe therefore that the experiment is not essential to support our conclusions, we agree it is desirable and have ordered these flies. However, progress is being hampered by import restrictions at the first author’s lab: the necessary paperwork for flies to be imported for his work is still under revision by officials. The experiment thus cannot be done at the moment.

      The authors claim a role of the late endosomes in subcellular tube growth and guidance, but show no data on lumen formation to prove tube presence in the tracheal terminal cells of V100R755A, btl>WASHIR, shrb mutants or in GrabFP-Bint treated terminal cells. The interpretation and quantification of the phenotypic classes "miss-tube-guided" and "ventrally-tube-guided" are based on membrane markers and not on luminal markers. The presented data with the provided resolution do not prove whether the mCD4-mIFP or PH-GFP markers define apical membrane protrusions/extensions or tubular structures. Therefore, the classifications of the tube-guidance phenotype and the quantification of "distance from tube to tip" may be only suggestive. The authors need to provide additional confocal data of co-stainings of the endosomal compartments with luminal antigens (i.e. GASP or Serp or Verm).

      We are very unsure as to what this would add and in what context it would be necessary. Membrane and actin markers have been widely used to follow the formation of the subcellular tube by all groups working in this field. There is ample documentation in the literature that the subcellular tube, as defined by luminal content (Serp, Verm, Gasp, CBP-GFP, ANF-GFP), is surrounded by apical plasma membrane which carries apical transmembrane proteins (Crb, Uif) and their associated apical cytoplasmic complexes (Par3, Par6, aPKC, pMoesin), and apical phospholipids which can be visualized by specific PIP-binding markers, e.g. the PLC-δ PH-domain that binds to PIP2 (see, e.g., Kato et al., 2004; Oshima et al., 2006; Okenve-Ramos & Llimargas, 2014 (here they use both luminal and actin reporters); Ochoa-Espinosa et al., 2017; and from our lab: JayaNandanan et al., 2014; Mathew et al., 2020). Therefore, all labs in this field have used these markers interchangeably to visualize the subcellular tube and we are not aware of a single case where luminal shape and apical membrane shape were not exactly congruent.

      We have in fact used luminal markers in some experiments here, but we believe there is no reason to assume that luminal markers would have a different distribution compared to membrane, apical or actin reporters in any of the experiments described here. Finally, the focus of the paper is on the behaviour of the early out-growing membrane rather than the mature tube, and on how membrane is remodelled in this process by modifications in the actin cytoskeleton. Including confirmation of the presence of luminal material would not add to the paper.

      page 8 line 248, the authors interpret that reducing the dose of Shrb by half strongly enhances the wash-RNAi phenotype and suggest that WASH and Shrb act in the same pathway. Shrub is a subunit of the ESCRT-III complex involved in inward membrane budding of endosomes and WASH functions in outward endosomal membrane budding.

      Shrb and WASH form discrete molecular complexes in endosomes. The authors should consider that Shrb and WASH may well act in parallel to control directional tube growth.

      This is a good point and we will rephrase our conclusions from this experiment.

      The authors use a nanobody-based GFP trap construct to investigate the effect of Rab7YFP localization. This is an excellent way to provide novel information on protein mislocalization in vivo. Using this method the authors concluded that ... "the correct distribution of late endosomes is required for proper tube guidance" (page 5, lines 157-158). The authors obviously consider that the GrabFP-B-Int construct affected the distribution of late endosomes. However, this is unclear and additional control experiments are needed to support the authors' claims. For instance, did expression of GrabFP-B-Int target the Rab7-YFP protein or the Rab7-associated endosomes? With the presented data, it is not clear whether the Rab7-YFP positive vesicles are endosomes or aggregates formed by the trapped Rab7-YFP protein. Co-stainings using GFP in Rab7-YFP terminal cells with other endosomal markers, e.g. Avl or Hrs, should be provided. It is also not clear if endocytosis of apical/basal membrane or luminal cargos was affected in GrabFP-B-Int treated terminal cells. The loss of endocytic components has been associated with defects in subcellular tube shape and morphology (Schottenfeld-Roames et al, Cur Biol. 2014). The authors should clarify these issues.

      The nanobody would of course bind both to free Rab7::YFP (if there is any available) and to endosome-associated Rab7::YFP. However, in addition to Rab7::YFP we also assayed the distribution of CD4::mIFP, a membrane-associated protein that is seen at very low levels in all membranes (Mathew et al., 2020), but highly enriched in cytoplasmic vesicles, which we showed by co-expressed markers to correspond to endosomes (Mathew et al., 2020). If the nanobody sequestered free Rab7::YFP, we would expect little overlap between Rab7::YFP and CD4::mIFP puncta. Instead, we see that the large Rab7::YFP/nanobody puncta have membrane associated with them (63% of vesicles are triple positive, vs 8% of Rab7::YFP-GrabFP vesicles) indicating that they are not merely Rab7 aggregates. We will include a quantification of the degree of overlap between these components.

      Regarding the question of whether endocytosis is affected, we believe this is unlikely, or if it is at all, only to a minimal extent, since growth of the outer membrane, which crucially depends on endocytosis, continues in these cells. We have added a comment to this effect in the text. The cells look very different from cells in which endocytosis has been inhibited.

      In the legends of Figure 7 (C'), the authors stated that.... "lack of actin regulators at the basal cortex prevents the connection of the actin meshwork at the tip to the basal plasma membrane".... by depicting the singed mutant phenotype. singed mutant analysis is not shown in the manuscript.

      Singed/Fascin has previously been shown to be required for actin organization in filopodia (Okenve-Ramos & Llimargas, 2014). We have now included new data that show that cells expressing singed RNAi also have reduced amounts of actin at late endosomes, and that reduced actin correlates strongly with tube misguidance. This shows that an actin bundling protein that has previously been shown to be needed for actin bundles in filopodia also affects actin around endosomes, providing another illustration that these compartments interact with each other.

      Our quantifications of actin around late endosomes show that interfering with endosome maturation, actin nucleation via Wash and basal/filopodial actin all lead to loss of actin around endosomes, and the misguidance phenotype correlates with actin loss (Figure 6J). By contrast, disruption of the apical actin cortex does not affect endosomal actin but does lead to misguidance. This establishes a hierarchy of actin organisation in the tip of the cell: basal actin affects endosomal actin, loss of endosomal actin affects both apical and basal actin, but apical actin does not feed back on endosomal actin. All three pools are nevertheless required for tube guidance.

      The authors consider that the late endosomes nucleate actin ahead of the tube (i.e. page 3, lines 87-88; page 9, line 285). This is not very convincing from the presented data. The authors should provide some quantitative data showing that lack of WASH (and of the endosomal F-actin network) affects the apical and basal F-actin pools in the tip of the cell.

      If we understand the reviewer correctly, there are two comments included in this point: (i) whether actin is nucleated at late endosomes, and (ii), whether reducing endosomal F-actin affects apical-basal actin pools in the tip of the cell.

      (i) As stated above in the response to reviewer #2, we have added quantitative data illustrating actin recruitment at late endosomes with phalloidin stainings. Actin association with endosomes is also confirmed by the Rab7 stainings in larval terminal cells in Fig. 3G-H that show actin puncta associated with endosomes.

      (ii) Again, as mentioned in the response to reviewer #2, we do think that all actin pools in the growth cone are affected. We are glad that the reviewers encouraged us to make this more explicit and will now discuss more clearly how endosomal F-actin could affect apical and basal F-actin pools.

      **Minor concerns:**

      1. The authors concluded (page 9, line 285) that "endosomes serve as actin nucleating centres that propagate forces within the cell by physically linking different subcellular compartments".

      We agree with the reviewer, this is a good way of phrasing it, and we will rewrite this conclusion accordingly.

      The authors should depict in the panels the Ventral/Dorsal axis.

      All images are positioned in the same orientation, but we will ensure that the D/V axis orientation is stated in the manuscript.

      Numerous omissions need to be corrected. Labeling is missing in the panels J-M' (Figure 1). The statistical significance and the p value levels are not indicated in Figure 2 (G). The panel Figure 5 (D) is mislabelled. The panels C-C' in Figure 7 are not very informative. They do not reflect the general model of the study. How the prevention of actin nucleation at late endosomes, or at the apical or basal cortex, affects tube directionality is not graphically shown.

      We thank the reviewer for noticing these omissions, we will fix them for resubmission. Having added more discussion about the general organization of actin at the tip of the cell, we think the relevance of panels 7C is justified.

      In the section "Crosstalk between cytoskeletal compartments" (lines 359-400, discussion) the argument about the involvement of microtubules in tube guidance is a likely scenario, but I found this argument over-extended. WASH interacts with tubulin (Derivery et al., Dev Cell 2009) and WASH activity balances the endosomal and cortical F-actin networks during epithelial tube maturation in multicellular tracheal tubes (Tsarouhas et al., Nat. Comm 2019). These results should be considered in the discussion section.

      We will incorporate these references into the discussion; they will certainly enrich it.

      Reviewer #1 (Significance (Required)):

      The important role of actin cytoskeleton in the initiation of endocytosis is well established. Actin structures in the plasma membrane are dynamically organized to assist the remodeling of the cell surface and to facilitate the inward movement of vesicles. Similarly actin networks in endosomes are critical for endosomal fusion and fission. In this work, the authors identified an opposing but interesting scenario. They propose a role for the late endocytic pathway in organizing actin networks for proper cell morphogenesis and point out an intracellular crosstalk and coordination between distinct cytoskeletal pools within a cell.

      Although the mechanism by which the separate F-actin pools communicate is not shown, the paper is interesting and makes an original contribution to the area of cell morphogenesis. In addition, it represents a conceptual advance, as it proposes a mechanism through which the actin cytoskeleton is coordinated to regulate tube morphogenesis. The proposed mechanism may be relevant for tracheal terminal cells, but could represent a general mechanism in the field of cell biology. The methodology is appropriate and the text flow is well organized. However, as explained, there are a few inconsistencies in the manuscript. I believe the above additions would strengthen the conclusion of the paper.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      Rios-Barrera and Leptin investigate the formation and guidance of the subcellular tube that forms in the terminal cell of the dorsal branches of the Drosophila tracheal system. In previous work the authors documented the presence of late endosomes at the tip of the growing terminal cell, ahead of the forming subcellular tube, which are involved in membrane recycling (Mathew, 2020). In the present work they analyze the organization of the actin cytoskeleton in the tip of the terminal cell, in relation to the late endosomes, and assess the guidance of the subcellular tube. They find that the presence and localization of late endosomes play a role in tube guidance. They also find that late endosomes recruit actin around them, mediated by the activity of the actin polymerization regulator Wash, which is recruited to maturing late endosomes. When Wash activity is decreased, actin around late endosomes is decreased and tube guidance is compromised. Based on this observation, on laser ablation experiments of actin ahead of the tube, and on actin staining at the tip of the terminal cell, the authors propose an exciting model: late endosomes recruit actin, which connects the actin pool of the basal membrane and the actin pool of the apical (subcellular tube) membrane, thereby directing tube growth and guidance. The manuscript is well-written and well-presented, the images and movies are of high quality and the experimental data, which are technically challenging to obtain, are very good and sufficiently replicated.

      Major comments:

      1. A critical point in the model that the authors put forward (which is also contained in the title and abstract) is that actin organized at late endosomes anchors apical and basal actin cortices. However, there is no clear and conclusive evidence for this. Clear evidence in this direction should be provided to propose it as a mechanism (as it is in the text, particularly in the first sentences of the discussion) and imply it in the title.

      The authors show endogenous actin around late endosomes and actin fibers at the tip of the terminal branch. However, at the level of resolution presented (Fig 3A,B), it is not possible to determine whether the different actin populations are actually "anchored". I suggest to present stronger data supporting this important conclusion.

      In the same direction, it would be critical to show that this anchoring of actin fibers is disturbed when actin enrichment at the late endosome is perturbed (see also point 5). The authors do show that when Vha or Wash activity is downregulated, actin accumulation around the CD4 vesicles decreases. However, this experiment has a few limitations. First, it is difficult to determine the levels of a construct that is overexpressed (UAS-utr::GFP). Could the authors use phalloidin or an actin antibody to confirm the result? Second, I find the result difficult to interpret. In the images provided I see a general decrease of actin (UtrGFP) at the tip, not only around the CD4 vesicles (Fig 6D,F). Are these mutant conditions also affecting the other actin pools? If this is the case, can the authors attribute the defects exclusively to the abnormal recruitment of actin to the late endosomes? Most importantly, the authors should analyze the pattern of actin distribution (labelling endogenous actin) and determine a possible loss of "anchoring" of fibers when late endosome maturation is perturbed.

      2. Another critical point in the model put forward by the authors is that late endosomes drive tube guidance. To test this point the authors use an elegant system to mislocalize Rab7 late endosomes. However, the effects are not strong (1G), and only a proportion of branches show misguided tubes. Do the cases with a ventrally-guided tube in the experiment Rab7:YFP+/+ (Fig. 1G) have a CD4 endosome (with Rab7YFP) at the tip? This would help to explain the weak effect.
      3. Why does preventing proper endosome maturation and acidification lead to misguided tubes (rather than missing ones)? The authors indicate that downregulating Vha activity leads to defects in acidification, but that late endosome-MVBs form normally. It is intriguing to see extra CD4 vesicles (as in 1C or 6C). Wouldn't we expect to see only the "normal" tip accumulation of CD4 vesicles, and not extra ones? How relevant are these extra CD4 vesicles? Wouldn't we expect to see "non-functional" CD4 vesicles, unable to recruit actin and lead intracellular tube formation (i.e. no tube), rather than misguidance? (1D shows a higher proportion of misguided tubes than of missing tubes.) Similarly, is Wash-RNAi producing extra CD4 vesicles (as observed in movie 5, fig 6E)?
      4. Actin recruitment to late endosomes was already documented, where it plays a role in cargo trafficking. The authors propose that Wash is recruited to late endosomes upon acidification, where it would prime actin nucleation around the endosome. The authors indicate a decrease in Wash accumulation upon expression of Vha dominant negative. However, this decrease is not quantified. In addition, it is difficult to determine the levels of a construct when it is overexpressed (UAS-Wash::GFP). It would be desirable to use antibodies against the endogenous protein (Wash in this case) to claim differences in accumulation in mutant conditions.

      The results presented do not rule out a requirement of Wash in terminal branching which is not associated with the enrichment in the late endosomes. The genetic interaction observed with Shrub is also compatible with both proteins acting on terminal branching but in different/parallel mechanisms.

      5. Laser ablation experiments. The laser ablation experiments are difficult to interpret. First, it is unclear to me what the results exactly indicate. What does the observed recoil suggest? Does it fit with the expected tension exerted by a link of the actin cytoskeleton relayed by late endosomes? From the text and figure I don't understand how the recoil is calculated: retraction of the subcellular tube backwards? "Enlargement" of the bleached area? Second, it is unclear to me what laser ablation actually ablates. Does it only affect actin? Or are CD4-late endosomes and other tip structures also affected? Third, is the recovery observed after ablation correlated with new actin recruitment around old or new late endosomes? Fourth, I find the experiments in cells with secondary subcellular tubes very confusing and the explanations very speculative. Finally, and most importantly, I think that performing laser ablation experiments in mutant conditions that affect actin recruitment (VhaDN and Wash RNAi,....) would be very informative. One would expect to find a decrease in recoil. If this was the case, it would validate, on the one hand, that in control conditions there is a tension that depends (at least in part) on actin organization, and on the other hand it would show that when actin recruitment is affected tension decreases, supporting the "anchoring" model. I understand that laser ablation experiments are not easy to perform, but I think this would be a useful experiment. To my understanding, as it stands, the laser ablation experiments "....support the notion that adequate cytoskeletal organization at the tip is required for tube guidance and stability" as the authors acknowledge, but they do not convincingly support their "anchoring" model.

      Other comments:

      • From the images presented, it is often difficult to figure out where the subcellular tube forms, the presence of vesicles, the cell morphologies,... and to determine the correlation between the CD4 vesicles and tube guidance. For instance, in Fig 1H and 1J, is there a "lateral" CD4 vesicle? Why does it not generate a misguided tube? In Fig 1I, are there 2 subcellular tubes? Can the authors mark them? I cannot really visualize them with the CD4 marker; they seem stalled, short or missing. Fig 1L: what do the authors mean by "corrected" tube sprouts? It is difficult to identify the cell in Fig 2D-F.
      • Movie S3: I find it difficult to spot the association of CD4 and utrGFP that the authors point. Can the authors label in the movie the vesicles and the association?
      • The results with the Rab7 downregulation and upregulation are not very clear. Does the downregulation of Rab7 (Rab7 DN construct) have any effect on tube guidance? Does it decrease or eliminate actin association with CD4 vesicles in the embryo? The authors show that in the larvae expression of Rab7 DN leads to loss of actin enrichment in Rab7 vesicles. Does this have an effect on terminal branching? The Rab7 active construct produces effects at larval stages but not in the embryo. Is terminal cell branching in the larvae also dependent on late endosomes? Can the authors show an "excess" of late endosomes in the larvae that leads to extra terminal branches? Even though the authors indicate that they cannot detect Rab7Q67L, can they find any effect at embryonic stages (e.g. presence and position of CD4 vesicles, other unrelated effects,...)?
      • In some examples in the movies there seems to be a correlation between CD4 vesicle presence/positioning and basal lamellipodia/filopodia or actin enrichment, and also in the -btl experiments. Have the authors explored this? They may want to comment on this in the discussion section.

      Significance

      This work is relevant for the morphogenesis field and deals with the important issue of how the cytoskeleton regulates shape and cellular events. The work represents a deep analysis of a specific issue in the specialized field of tracheal development, but the results may be relevant for other types of cells forming subcellular tubes. Describing a function of trafficking vesicles (late endosome in this case) in cell morphogenesis (in addition to cargo trafficking) in an in vivo system is also relevant to advance in the cell biology field.

      Referees cross-commenting

      I agree with the comments of reviewer #3. I find relevant the points raised in "major comments number 2 and 4".

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Very high evidence and clarity. Excellent scientific rigor. The findings are important and reported clearly. The experiments are conducted in a rigorous way by numerous participating laboratories.

      Reviewer #1 (Significance (Required)):

      Very high significance, from both molecular biology and clinical standpoints. This is an important manuscript that challenges the findings and conclusions of a prior high-profile paper in Science by Mao et al. 2016, claiming that LAG3 is a receptor for aggregation-prone species of alpha-synuclein and that deletion of LAG3 results in reduced cell-to-cell propagation of alpha-synuclein aggregates. The experiments in this paper are numerous and employ a variety of techniques. The overall conclusions are that LAG3 is not expressed by the relevant neurons and that LAG3 is not a receptor for alpha-synuclein fibrils (of different sizes). Therefore, the authors conclude that LAG3 is unlikely to play a role in the spread of alpha-synuclein pathology in Parkinson's disease and related disorders. There are, however, some weaknesses. For example, the Introduction contains passages that are not written in a stringent way:

      1. "Histologically, PD is characterized by α-synuclein aggregates known as Lewy Bodies in neurons of the substantia nigra," That is not a good description of PD neuropathology. Lewy pathology is present in numerous areas of the CNS and PNS, and is not restricted to the substantia nigra.

      We have added a more detailed account:

      “Histologically, PD is characterized by α-synuclein inclusions known as Lewy Bodies whose accumulation is associated with neurodegeneration (Dickson, 2012; Mullin and Schapira, 2015; Corbillé et al., 2016). These inclusions affect the Substantia nigra and other mesencephalic regions as well as, in some cases, the amygdala and neocortex (Dickson, 2018).”

      2. "Growing evidence suggests that α-synuclein fibrils spread from cell to cell". While alpha-synuclein pathology can spread from cell to cell, it is not known if the fibrils are the species (alone or combined with other conformers) that cause the spreading of the pathology in a seeding fashion, or if smaller alpha-synuclein assemblies play that role.

      We have reformulated the sentence to credit the fact that we do not know which synuclein species is the one that is transmitted:

      “Growing evidence suggests that α-synuclein aggregates spread from cell to cell (Volpicelli-Daley et al., 2011; Volpicelli-Daley, Luk and Lee, 2014)…”

      3. "...by a "prionoid" process of templated conversion (Aguzzi, 2009; Aguzzi and Lakkaraju, 2016; Jucker and Walker, 2018; Kara, Marks and Aguzzi, 2018; Scheckel and Aguzzi, 2018; Uemura et al., 2020)." This sentence gives the impression that the corresponding author has led the field when it comes to alpha-synuclein's prionid properties. That is not really the case, and it would be appropriate to cite the literature in a more scholarly fashion that reflects how this part of the alpha-synuclein research field developed.

      I cannot disagree, and in fact I suspect that the present paper may be my second and possibly last experimental contribution to the synuclein field! However, I do claim intellectual parenthood of the prionoid (not “prionid”) concept, which I first expounded in a 2009 Nature paper. Anyway, we now provide a more balanced citation:

      “…by a “prionoid” process of templated conversion (Aguzzi, 2009; Jucker and Walker, 2018; Kara, Marks and Aguzzi, 2018; Henderson, Trojanowski and Lee, 2019; Karpowicz, Trojanowski and Lee, 2019; Uemura et al., 2020; Kara et al., 2021).”

      4. "Interrupting transmission of a-synuclein may slow down or abrogate the disease course." This is a bold statement and far from certain. While one might propose that this is the case, it is still just a hypothesis and the Introduction should reflect that.

      We have rewritten the sentence in a more subdued manner:

      “It is thought that interrupting transmission of a-synuclein may slow down or abrogate the disease course.”

      **Referee Cross-commenting** I concur with reviewers 2 and 3, and the new comment from reviewer 2. This paper should be published as soon as possible.

      *********************************************

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): This study conclusively shows that LAG3 is not the receptor for a-synuclein that underlies the spread of synucleinopathic damage in various PD-related conditions. The paper is done extremely carefully and comprehensively. My only suggestion is to indicate the significance level in Figure 5a, as it may turn out that LAG3 is actually protective.

      We have added the significance level in Fig. 5A, in the legend: “The survivals of ASYNA53T LAG3-/-, LAG3+/- and LAG3+/+ mice were similar (Mantel-Cox log-rank test, p-value = 0.165).”

      Reviewer #2 (Significance (Required)): This study is of extremely high significance - we need mechanisms to deal with spectacular results in the literature that should not have been published because they were uncompelling to begin with, but were published for various sociological/political reasons. Science won't progress if we don't find correction mechanisms for wrong conclusions.

      **Referee Cross-commenting** I agree with reviewers 1 and 3, especially with the suggestions made by reviewer 1, which should be instituted. I think we all concur that the paper should be published without new experiments. I believe testing a-synuclein propagation in vivo in LAG3 KO mice would be useful, but given the complete lack of replication of LAG3 expression in brain and of a-synuclein binding to LAG3, this is not necessary.

      We considered running experiments in addition to those performed in vivo in ASYNA53T transgenic mice (including LAG3 KO) and ex vivo in organotypic slices, the latter using pre-formed fibrils. However, the outcome of these experiments, along with the absence of LAG3 expression in neurons and its unclear binding, convinced us that the usage of further animals and reagents would be unwarranted.

      *****************************************

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): It was proposed that LAG3 is important in the treatment of PD and related disorders, because it functions as a receptor of pathogenic α-synuclein and treatment with anti-LAG3 antibodies attenuated the spread of pathological α-synuclein and drastically lowered the aggregation in vitro (Mao et al, Science 2016). In this study, the authors characterized 8 antibodies to LAG3 and tested cultured cell lines, NSC-derived neural cultures, and organ homogenates for the presence of human or murine LAG3, but it was not detected in any of the neuronal samples tested. In addition, single cell (sc) RNAseq yielded only minimal counts for the LAG3 transcript in neurons, astrocytes, and mixed glial cells, and analysis of a single-nucleus (sn) RNAseq human brain dataset for LAG3 expression across different cell types confirmed no LAG3 signals for any of 34 identified cell clusters, including 13 clusters of excitatory and 11 subtypes of inhibitory neurons, oligodendrocytes, oligodendrocyte precursor cells, microglia, astrocytes, and endothelial cells. The authors also analyzed the binding of LAG3 to α-synuclein in mouse and human model systems, and concluded that the affinity of LAG3 for α-synuclein fibrils, if any, is micromolar or less. Furthermore, the authors studied the propagation of pre-formed fibrils (PFFs) of α-synuclein in neural stem cell (NSC)-derived neural cultures in the presence or absence of LAG3, and the impact of LAG3 on survival in ASYNA53T transgenic mice expressing wild-type LAG3 as well as hemizygous or homozygous deletions thereof. However, they were unable to see any significant role for LAG3 in these in vitro and in vivo models of α-synucleinopathies. In this connection, the reviewer would like to ask one question: Have you conducted any experiments on the propagation of PFFs of α-synuclein in LAG3-KO mice? If so, what were the results?

      We did consider the possibility of replicating the experiments using PFFs in LAG3 KO mice. However, as stated above, we felt that our experiments – including the survival study in vivo in ASYNA53T transgenic mice – were unambiguous. After critical consideration, we remained unconvinced that this additional experiment would change the weight of our evidence in a substantial manner that would justify the inoculation of other animals and the utilisation of more resources.

      **Minor point** In Page 10, I think it's a typo: ASYYN mice must be ASYN mice.

      Thank you for pointing this out. We corrected it.

      Reviewer #3 (Significance (Required)): The negative findings about LAG3 in α-synucleinopathies shown in this manuscript do not provide any new insight into the mechanisms of α-synuclein propagation. However, it is clear that LAG3 is not expressed in neuronal cells and the binding of LAG3 to α-synuclein fibrils appears limited. Overexpression of LAG3 in cultured human neural cells did not cause any worsening of α-synuclein pathology ex vivo. The overall survival of A53T α-synuclein transgenic mice was unaffected by LAG3 depletion and the seeded induction of α-synuclein lesions in hippocampal slice cultures was unaffected by LAG3 knockout. The data shown in this manuscript are convincing and the information is very important in terms of correcting the direction of disease treatment and research.

      **Referee Cross-commenting** I agree with reviewers 1 and 2. This paper should be published as soon as possible.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      This study conclusively shows that LAG3 is not the receptor for a-synuclein that underlies the spread of synucleinopathic damage in various PD-related conditions. The paper is done extremely carefully and comprehensively. My only suggestion is to indicate the significance level in Figure 5a, as it may turn out that LAG3 is actually protective.

      Significance

      This study is of extremely high significance - we need mechanisms to deal with spectacular results in the literature that should not have been published because they were uncompelling to begin with, but were published for various sociological/political reasons. Science won't progress if we don't find correction mechanisms for wrong conclusions.

      Referee Cross-commenting

      I agree with reviewers 1 and 3, especially with the suggestions made by reviewer 1, which should be instituted. I think we all concur that the paper should be published without new experiments. I believe testing a-synuclein propagation in vivo in LAG3 KO mice would be useful, but given the complete lack of replication of LAG3 expression in brain and of a-synuclein binding to LAG3, this is not necessary.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Rebuttal letter – Response to Reviewers

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This study focused on P. vivax, which is an important neglected human malaria killer. The reported evidence will have a significant impact on diagnosing infectious diseases. The language in the manuscript is very good. However, some typos were found, and some paragraphs might need particular attention to punctuation. Overall, the work is very good. The statistics are straightforward. However, there are a couple of major points that must be addressed before publication. Some of my comments are just recommendations to clarify some sections of the text.

      **Major comments:**

      The statistical methods can be improved by using generalised mixed models (GLMM).

      1- PCA graphs need to be organised in more descriptive ways. Dim1 and Dim2 in each axis need to be defined clearly in the figures. PCA in Fig2 c is very difficult to follow, and it needs to be organised.

      Answer: Figures have been amended to be more self-explanatory and clearer to the reader.

      2- In this study, patients were male and female, and we already know that male and female haematological parameters differ greatly, especially Hb level. My question is how the sex variable was treated in this study. Was your control group drawn from both sexes? Sex could be treated as a random variable in all analyses if GLMM models were used.

      Answer: Information on how the sex variable was treated in the study has been added to the methods section. In our cross-sectional study with uncomplicated P. vivax malaria patients seen at FMT-HVD in Manaus, Brazil, patients and healthy donors (controls) were matched by age and sex. In both groups, the frequency of female individuals was 30% and of male individuals 70%.

      We think sex is better fitted as a fixed effect, since only two levels are possible for this factor. Thus, we used linear models with age and sex as fixed variables for statistical testing and to ensure that the differences observed between P. vivax-infected patients and controls, as well as between the clusters, were only due to disease status. This analysis showed that red blood cell count, hemoglobin, hematocrit, MXD and neutrophil counts (the latter only when comparing the clusters) needed to be corrected, and only for the influence of sex. For these parameters, estimates of the predicted sex influence were subtracted from the raw parameter values and the residuals were used for statistical testing. We have added this information in the Methods section as indicated below:

      Page 6, line 128: Patients and healthy donors were age and sex-matched, with a frequency of 30% female and 70% male individuals in both groups.

      Page 14, line 336:

      To ensure that differences observed between P. vivax-infected patients and controls, as well as between the clusters, were due to disease status and not confounded by age or sex, the clinical parameters were fitted as response variables in a linear model with sex and/or age fitted as explanatory variables. Age and sex were included in the model if their coefficients were estimated as different from zero with p-value […]. The residuals from the linear model were then used as age and/or sex corrected parameters in subsequent analyses.
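
      The covariate correction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function names, the 0/1 coding of sex and the toy hemoglobin values are hypothetical, the model is fitted by ordinary least squares, and the step that keeps a covariate only when its coefficient differs significantly from zero is omitted. Note also that the residuals here have the intercept removed as well as the covariate effect.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [a | b] so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def covariate_corrected_residuals(
    y: Sequence[float], covariates: Sequence[Sequence[float]]
) -> Tuple[List[float], List[float]]:
    """Fit y ~ intercept + covariates by OLS; return (coefficients, residuals)."""
    # Each design row is [1, covariate_1, covariate_2, ...].
    rows = [[1.0] + [float(v) for v in cov] for cov in covariates]
    p = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return beta, residuals


# Hypothetical toy data: hemoglobin (g/dL) with sex coded 0 = female, 1 = male.
hemoglobin = [12.0, 13.0, 15.0, 16.0]
sex = [[0], [0], [1], [1]]
coefficients, corrected = covariate_corrected_residuals(hemoglobin, sex)
print(coefficients)  # intercept and estimated sex effect
print(corrected)     # sex-corrected values to use in downstream tests
```

      With a single binary covariate the fit reduces to subtracting each group's mean, so the corrected values carry no remaining sex difference; adding an age column to each covariate row extends the same call to age and sex together.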

      3- Why 6h and 18h used for the HUVEC evaluation?

      Answer: We ran several optimization experiments with individual plasma samples where we observed maximal mRNA expression changes after 6h of stimulation. For experiments detecting protein expression (IFA and flow cytometry), we increased the stimulation time to 18h. Preliminary experiments suggested this to be the optimal duration without compromising cellular viability.

      4- Only neutrophil enrichment is reported in this study. If myelopoiesis is affected, why did the other granulocytes not show a significant enhancement?

      Answer: Our data reveal no change in the number of circulating neutrophils in the different clusters of individuals. However, mixed cell counts (MXD), a parameter representing monocyte, basophil and eosinophil numbers, were significantly reduced in Vivaxhigh patients. As a result, there was a significant enrichment of neutrophils in the leukocyte fraction in the blood of Vivaxhigh patients, as well as a higher Neutrophil:Lymphocyte count ratio (NLCR) (Figure 4).

      In hematopoietic progenitors, stochastic changes in each factor’s concentration could result in one factor becoming more abundant and committing a hematopoietic progenitor to a particular lineage. To generate each mature granulocyte population (e.g. basophils, eosinophils and neutrophils), common myeloid precursor cells (CMPs), and later precursors for the granulocytic and monocytic lineages (GMPs), follow different lineage commitment programs in the bone marrow (BM). These programs are tightly regulated or instructed by a specific set of soluble factors, cell-cell interactions and transcription factors that define cell fate decisions and lineage restrictions. For instance, differential PU.1 activity can specify different cell fates during haematopoiesis, regulating monocyte and neutrophil differentiation. Genetic and biochemical analyses have shown that G-CSF can direct granulocyte differentiation by changing the ratio of C/EBPα to PU.1 (Zhu et al., Oncogene 2002; Friedman Oncogene 2002; Dahl et al., Nat Immunol 2003). High expression levels of PU.1 and C/EBPα, stimulated by G-CSF, promote GMP differentiation to neutrophils and inhibit monocyte differentiation, while only PU.1 expression, IRF-8 and lower expression/activity of C/EBPs induce GMP differentiation to monocytes (Zhu et al., Oncogene 2002; Friedman Oncogene 2002; Dahl et al., Nat Immunol 2003). Meanwhile, a combination of PU.1, C/EBPβ and low levels of GATA-1 differentiates GMPs to the eosinophil lineage (Kulessa et al., 1995; McDevitt et al., 1997; Yamaguchi et al., 1999), and PU.1 must also cooperate with GATA2 to direct mast cell differentiation (Walsh et al., Immunity 2002). In addition, eosinophil and basophil differentiation are induced by a different set of cytokines, usually produced in a prevalent T-helper 2 response, such as IL-5, which should be inhibited in the strong Th1 environment evidenced by our and previous Luminex data in Pv patients.

      The enrichment of activated neutrophils in the peripheral circulation of P. vivax patients could be due to a response that specifically enhances neutrophil production and release from the BM. This hypothesis is supported by emerging evidence for enrichment of P. vivax parasites in the hematopoietic niche of the BM, by our Luminex data showing a significant increase in pro-inflammatory cytokines associated with emergency myelopoiesis (e.g., TNF-α, IL-1α, IL-1β, IL-6, IL-8), and by increased circulating levels of G-CSF, the major inducer of neutrophil production in the BM. Likewise, increased activation-induced cell death (AICD) in T cells, splenic T-cell and platelet accumulation, or decreased lymphopoiesis due to myeloid-biased HSC differentiation induced by inflammatory cytokines and EC activation in the BM (refs 36,37,39) might explain the neutrophil enrichment in vivax patients.

      5- I would also ask the authors to speculate a bit on what the mechanism behind the different behaviour of P. vivax compared to P. falciparum could be. From an evolutionary perspective, the parasite should rather become milder and keep the host alive for its own benefit.

      Answer: One of the characteristics of P. vivax that could play an important role in immunity is its restriction to invading immature reticulocytes. For example, the infected reticulocyte could play a role in the presentation of parasite antigens, as reticulocytes (but not mature RBCs) express MHC-I and are capable of processing and presenting antigens on their surface for recognition by T cells. Indeed, it has been shown that reticulocytes act directly as antigen-presenting cells, emphasizing the importance of erythrocyte surface antigens both in the induction and as the target of a protective immune response (Burel et al 2016, Junqueira et al 2018). Recent investigations comparing P. vivax and P. falciparum controlled human infection models (CHMIs) also revealed marked differences in the immune profiles generated following infection with the two species and postulated that protective immune responses to Plasmodium are species-specific. It has been hypothesized that this difference is due to the strict P. vivax tropism for MHC-I-expressing reticulocytes that, unlike mature red blood cells, can present antigen directly to CD8+ T cells. Specifically, P. vivax but not P. falciparum infection led to the expansion of a specific subset of CD38+CD8+ T cells which were associated with an activated phenotype and cytotoxic potential. Corroborating the findings of Burel et al in the CHMI model, Junqueira et al showed that P. vivax-infected reticulocytes express HLA-I. In P. vivax-infected patients, CD8+ T cells in the peripheral blood express high levels of cytotoxic proteins, and recognize and form immunological synapses with P. vivax-infected reticulocytes in an HLA-dependent manner. Next, it was shown that P. vivax-specific CD8+ T cells release their cytotoxic granules to kill both the host cell and the intracellular parasite, which prevented reinvasion (Junqueira et al 2018). Although these data indicate a protective role of cytotoxic CD8+ T cells during P. vivax blood-stage malaria, it is not clear whether these lymphocytes would always be beneficial, because they might contribute to anemia, inflammation or other pathological sequelae of infection, which needs to be further investigated.

      **Minor comments:**

      • It is important to have a reference, version, and date for the R software, packages and GraphPad.

      Answer: We have added the version and date for the R and GraphPad software.


      2- In Fig 5, panel E is missing. This figure can be better organised. It is very hard to read and follow.

      Answer: There is no E in Figure 5. We will organize the figure to make it easier to read and follow.

      Reviewer #2 (Significance (Required)):

      P. vivax remains endemic in 51 countries across Central and South America, the Horn of Africa, Asia and the Pacific islands. In most areas it is co-endemic with P. falciparum, which has been the priority species for national malaria control programmes. Malaria-related deaths are mostly attributable to the more pathogenic P. falciparum, and these have declined over the last decade; however, there has been a consistent rise in the proportion of malaria cases due to P. vivax. Because it is difficult to diagnose resistant strains, strategies to detect and track drug-resistant P. vivax are limited. In this context it is vital to develop better tools to assess diagnostic, antimalarial efficacy and drug susceptibility so that emerging drug resistance can be tracked, and novel treatment strategies explored. From my viewpoint, despite some statistical problems in handling the complex nature of the data (mixed interactions among multiple variables), these findings seem to be very interesting and (after a major revision) worth publishing. As said before, the story told by the authors could become interesting.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript titled: "Total parasite biomass but not peripheral parasitaemia is associated with endothelial and haematological perturbations in Plasmodium vivax patients" by Silva-Filho et al., reinforces the original observation and data by the group of Nicholas Anstey and coworkers, who first proposed the use of plasma parasite lactate dehydrogenase and PvLDH as a marker of parasite biomass. In that work, it was already demonstrated that P. vivax biomass is related to the plasma concentration of LDH. As such, the present work cannot be considered of high novelty. Yet, through a meticulous approach including clinical data, computational approaches, machine learning, LDH measurement, multiplex analysis and quantitative RT-PCR, the authors here have extended the original observations that a large biomass of P. vivax parasites is out of blood circulation. In contrast, unlike the original observations of Anstey's group, a correlation between total parasite biomass and systemic levels of markers of endothelial cell activation was observed. The manuscript is very well written and the discussion brings new knowledge in this key topic for elimination of malaria. This manuscript is therefore recommended for publication after the following comments are addressed.

      **Major comments:**

      1. The vascular endothelium plays a pivotal role in malaria. Therefore, to test whether cell and/or parasite factors affect the vascular endothelium, HUVEC cells were used in this study. This is of major concern, as endothelial cells from the bone marrow, where most hematological disturbances, notoriously thrombocytopenia, occur, were not used instead. HUVEC cells seem to be the only endothelial cells that do not express ABO blood group antigens, thus suggesting that surface expression on these cells is highly altered (O'Donnell et al., 2000 J Vasc Res). Moreover, significant functional differences between HUVEC cells and adult vascular endothelium have been reported (Chan et al., 2004). Together, this indicates that results obtained with HUVEC cells might not reflect responses of the bone marrow vascular endothelium. As one of the corresponding authors has ample experience working with human bone marrow endothelial cells (Mantel et al., 2016 Nat Comm), it is suggested to perform some experiments with these cells to assure extrapolation of the results obtained with HUVEC cells.

      Answer: We agree with the reviewer that performing ex vivo assays with primary human bone marrow endothelial cells would be an excellent alternative. However, we would like to argue that HUVECs are also suitable for our purposes. HUVECs are widely used to study endothelial barrier function, for example in the context of angiogenesis and inflammatory responses/barrier disruption. To emphasise this point, we have now referenced examples where HUVECs were used in the context of endothelial barrier biology and in different inflammatory conditions (see also lists a, b, c below).

      a. Papers showing the use of HUVECs in studies yielding important insights about endothelial barrier function
      • Krispin S et al. Growth Differentiation Factor 6 Promotes Vascular Stability by Restraining Vascular Endothelial Growth Factor Signaling. Arterioscler Thromb Vasc Biol. 2018.
      • Aranda JF et al. MYADM controls endothelial barrier function through ERM-dependent regulation of ICAM-1 expression. Mol Biol Cell. 2013.
      • Orsenigo F et al. Phosphorylation of VE-cadherin is modulated by haemodynamic forces and contributes to the regulation of vascular permeability in vivo. Nat Commun. 2012.

      b. Papers that used HUVECs in studies about endothelial barrier function in inflammatory conditions
      • Dickinson CM et al. Leukadherin-1 ameliorates endothelial barrier damage mediated by neutrophils from critically ill patients. J Intensive Care. 2018.
      • Kuck JL et al. Ascorbic acid attenuates endothelial permeability triggered by cell-free hemoglobin. Biochem Biophys Res Commun. 2018.
      • Tramontini Gomes de Sousa Cardozo F et al. Serum from dengue virus-infected patients with and without plasma leakage differentially affects endothelial cells barrier function in vitro. PLoS One. 2017.
      • Fox ED et al. Neutrophils from critically ill septic patients mediate profound loss of endothelial barrier integrity. Crit Care. 2013.
      • Rahbar E et al. Endothelial glycocalyx shedding and vascular permeability in severely injured trauma patients. J Transl Med. 2015.

      c. Papers showing that HUVECs behave similarly to other endothelial cell types in regard to barrier function, except when the comparison is with blood brain barrier models
      • Totani L et al. Mechanisms of endothelial cell dysfunction in cystic fibrosis. Biochim Biophys Acta Mol Basis Dis. 2017, Dec;1863(12):3243-3253.
      • Gündüz D et al. Effect of ticagrelor on endothelial calcium signalling and barrier function. Thromb Haemost. 2017 Jan 26;117(2):371-381.
      • Deitch EA et al. Mesenteric lymph from rats subjected to trauma-hemorrhagic shock are injurious to rat pulmonary microvascular endothelial cells as well as human umbilical vein endothelial cells. 2001 Oct;16(4):290-3.

      Importantly, we were able to reproduce in the HUVEC ex vivo assays a phenotype of endothelial perturbations that is inferred based on the in vivo Luminex data using the same plasma sample. These data also support our hypothesis that patients with higher parasite biomass present higher endothelial cell perturbations, corroborating the associations between parasite accumulation in deep tissues (total parasite biomass represented by PvLDH levels) and endothelial cell activation as demonstrated in Figure 6.

      Strikingly, the authors stated that "P. vivax infection results in different ranges of EC alterations without massive cytoadhesion". This statement has no data supporting it. In fact, their own flow cytometry data convincingly demonstrated that exposure of HUVEC cells to plasma of vivax-high patients significantly increased the surface expression of ICAM-1 and VCAM. ICAM-1 is a well-known receptor for cytoadhesion in malaria, and Dr. Costa first demonstrated the importance of this receptor in cytoadherence of P. vivax (Carvalho et al., 2010). Moreover, these data are in some contradiction with the original observations of Anstey and collaborators, who demonstrated that parasite LDH concentration did not correlate with markers of endothelial activation (Barber et al., 2015 PLoS Path). Therefore, this sentence should be modified to accommodate the alternative possibility of cytoadherence or deleted from the manuscript, or functional binding assays should be performed to sustain it.

      Answer: We agree with the reviewer and have removed this statement.

      Page 22, line 543: The association between endothelial activation, Syndecan-1 and parasite biomass (PvLDH) indicates a positive feedback loop between glycocalyx breakdown, activation of endothelial receptors such as ICAM-1 and VCAM-1 and parasite accumulation in deep tissues9,12.

      Extracellular vesicles are key players in pathology of malaria and this includes P. vivax where concentration of circulating microparticles were associated with acute infections (Campos et al., 2010 Mal J). Moreover, Dr. Marti has pioneered this field since the original manuscript describing the role of EVs in malaria as intercellular communicators (Mantel et al., 2013 Cell). More recently, his group also demonstrated that interaction of EVs with bone marrow endothelial cells induce expression of IL-6 and IL-1 as well as vascular endothelium perturbations after trans-endothelial electrical resistance experiments (Mantel et al., 2016 Nat Comm). Furthermore, another recent report showed the physiological role of EVs in vivax malaria by demonstrating that EV uptake by human spleen fibroblast induced nuclear translocation of the NF-kB transcriptional factor, concomitant with surface expression of ICAM-1, thus facilitating cytoadherence of infected reticulocytes from P. vivax patients (Toda et al., 2020 Nat Comm). This growing evidence indicates that plasma circulating EVs are key communicators in malaria infections potentially explaining some of the findings reported in this work. Neglecting the importance of EVs in the discussion of this article is not reasonable and weakens this manuscript. Including a paragraph on EVs and accurate references in the discussion is thus strongly recommended.


      Answer: We agree with the reviewer that extracellular vesicles are key communicators in malaria infection. We have not measured them in our study, however, and therefore can only speculate about their impact on our observations. We have added a phrase in the discussion:

      Page 27, line 661: It is likely that other circulating factors that we have not directly measured in our study are also contributing to EC activation and vascular permeability. In particular, extracellular vesicles (EV) originating from ECs, platelets, and RBCs are present during malaria infection and are known to modulate the host immune response to the parasite54-56. In P. falciparum, infected RBCs release EVs containing immunogenic parasite antigens that activate macrophages, induce neutrophil migration and alter endothelial barrier function54,55. In P. vivax, plasma-derived EVs from iRBCs are taken up by human spleen fibroblasts (hSFs). This event signals NF-kB translocation and upregulation of ICAM-1 expression, facilitating cytoadherence of P. vivax-infected reticulocytes56.

      **Minor comments:**

      1. The lack of a group including severe vivax malaria patients is a drawback of this article as this group would have firmly validated the predictor of severe disease.

      Answer: This study investigated a cohort of uncomplicated P. vivax malaria patients compared to controls. We agree that it will be important to extend our analysis to severe vivax malaria in future studies.

      In the selection criteria of the patients to be included in the study, no information on other co-infections was mentioned. Is this information available? If so, this should be mentioned.

      Answer: As described in the Methods section, Page 6, line 132, mono-infection by P. vivax was confirmed by analysis of blood smears and quantitative PCR (qPCR) for both P. vivax and P. falciparum. We agree that excluding other coinfections could have been of interest. However, the differential diagnosis for an acute febrile illness is very broad and it would be impractical to track all other possible diseases. In addition, the patients included in the present work had mild disease, and therefore were discharged from hospital after a positive malaria diagnosis. No further investigation of other infections was done.

      The main coinfection to be considered for an acute febrile illness with no localizing signs in our context is Dengue Fever. Although Dengue coinfection in our cohort is possible, the incidence at the Hospital is only 2.8% (P. vivax/Dengue coinfection) (Magalhães et al, PLoS NTD 2014). Thus, it is unlikely that such a coinfection would have a major impact on our findings.

      This work determined the levels of PvLDH in a cohort of uncomplicated P. vivax patients as well as healthy volunteers using a double-sandwich ELISA assay: (i) are the clones to determine PvLDH values freely available to facilitate similar studies by independent groups? (ii) How was the cut-off of positivity defined? This is not evident, neither in the materials and methods, nor in the results.

      Answer: Clones are commercially available and were purchased from Vista Diagnostics International LLC, WA, USA. Information has been amended to the text in the Methods section.

      Page 8, line 186: “Cut-off of positivity was defined by correcting absorbance values generated in the plasma samples from healthy donors (controls) by blank values (plate controls), with both values being in the same range. Absorbance values higher than controls were considered positive. In parallel, we used schizont extracts to perform standard curves and lower absorbance values were in the range of O.D. = 0.03-0.04. All positive patient samples gave O.D. values equal to or higher than 0.05.” This information has also been added in the Methods section.
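
      The cut-off rule quoted above can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis code used in the study: the function names, the example absorbance readings, and the choice of the highest blank-corrected healthy-donor value as the threshold are all assumptions made here for illustration.

```python
from statistics import mean


def blank_correct(od_values, blank_values):
    """Subtract the mean plate-blank absorbance from each raw O.D. reading."""
    blank = mean(blank_values)
    return [od - blank for od in od_values]


def classify(patient_od, control_od, blank_values):
    """Flag patient samples whose corrected O.D. exceeds every control value."""
    # Assumption: the cut-off is the highest blank-corrected healthy-donor value.
    cutoff = max(blank_correct(control_od, blank_values))
    corrected = blank_correct(patient_od, blank_values)
    return cutoff, [od > cutoff for od in corrected]


# Hypothetical readings for illustration only, not data from the study.
blanks = [0.040, 0.042]
controls = [0.041, 0.043, 0.040]
patients = [0.095, 0.042, 0.310]

cutoff, calls = classify(patients, controls, blanks)
```

      In this hypothetical plate the first and third patient samples are flagged as positive, while the second falls within the control range.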

      It is not clear why varying percentages of pooled plasma (30% for imaging and flow cytometry, and 20% for impedance changes) from the different clusters were used for the functional EC assays. Moreover, no information about the concentration of plasma used for transcriptional analysis is available. Please clarify.

      Answer: The concentration of 30% pooled plasma was also used for transcriptional analysis, as indicated in the Methods section, page 11, line 250. This information was also added in the legend of Figure 5B. We had run several optimisation time-course and titration experiments with individual plasma samples, testing concentrations of plasma varying from 10% up to 30% v/v and we did not observe differences in mRNA expression between 20% and 30% v/v plasma conditions.

      As for the ECIS, our collaborators (Erich V de Paula group) have optimised this assay and they use a range of 15 to 20% (Santaterra et al 2020). Higher concentrations of plasma reduce reproducibility, probably due to fibrin formation.

      Reference 9 is a nonhuman primate study where no LDH is used. Please remove it.

      Answer: Reference 9 has been removed following the reviewer's suggestion.

      Reference 39 is a review on the subject and cannot be included in the sentence on line 556, "In agreement with a previous study8,39", where reference 8 is accurate. Please remove reference 39 from here.

      Answer: The text has been amended as suggested.

      Reviewer #3 (Significance (Required)):

      This paper further contributes to explain the conundrum of low peripheral blood parasitemia and clinical severity in P. vivax. Moreover, by including new human markers and solidly applying computational tools, this paper further contributes to advance clinical research in P. vivax.

      Clinical diagnoses of hematological disorders, including anemia, lymphopenia and thrombocytopenia, are routinely obtained from a complete blood count. Therefore, I believe the major significance of this work is to raise public health awareness of including the determination of PvLDH levels in these clinical examinations. They might, as suggested by the authors, allow better diagnosis and treatment of P. vivax.

      My main expertise is the biology of host-pathogen interactions with a focus on P. vivax.


      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      The study evaluates P. vivax biomass (serum LDH) versus peripheral parasitemia with multiple variables. Based on biomass (Vivax high vs. Vivax low), they compare multiple determinations in patients with uncomplicated P. vivax. This raises questions about disease and the presence of parasites in various organs. The question is whether P. vivax sequesters, and the answer is yes, in the bone marrow and spleen. Does it sequester like P. falciparum, which causes disease by sequestration through binding endothelium in various organs? That is less clear. As P. vivax is rarely fatal, the sequestration has not been studied. Parasites have been found in the bone marrow and liver of P. vivax-infected splenectomized squirrel and Aotus monkeys (note: the monkeys are splenectomized so that parasitemia can rise to higher levels than in non-splenectomized monkeys). Studies of binding of schizont-infected red cells to lung endothelium in vitro do not answer the question of whether sequestration occurs in vivo.

      The most important complication of P. vivax is generally anemia. This did not correlate with vivax biomass, but this raises the question of the length of infection and the possibility that parasite biomass may vary at different times of infection. Anemia was seen in P. vivax infected patients, but it did not relate to biomass at the time of study. Note the caveat mentioned in the previous sentence on long term effects of infection on anemia.

      The association of biomass with reduced platelet counts and endothelial effects may be related to a serum factor and not sequestration. This is the main limitation of the paper, besides the unknown long-term effect of infection. If one could identify an effect of P. vivax-infected human serum, it may be worth a future study on what in the serum is causing the effects.

      Reviewer #4 (Significance (Required)):

      This study is unique with the caveats mentioned above. It has a good review of the literature.

      Answer: We appreciate the reviewer's comments. In our cohort, the frequency of anaemia was not as high or severe as the frequency of thrombocytopenia and lymphopenia. However, we still find that the endothelial cell activation marker Ang-2 and the pro-inflammatory cytokine IL-1 are negatively associated with several markers of anaemia, such as haemoglobin, haematocrit and RBC numbers. Although we did not further investigate this association, it may indicate indirect effects of parasite biomass on anaemia mediated by inflammation and EC activation, which will be further investigated in other current longitudinal cohort studies.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Response/revision plan

      (Point-by-point response)


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Pennauer et al is the first to systematically investigate the role of class I&II Arfs using a knockout approach. It builds on earlier work by the Kahn lab, who used an RNAi approach (Volpicelli-Daley et al. 2005), and is complementary to the overexpression approach used by the Hauri lab (Ben-Tekaya et al, 2010). The work is elegant and the data are strong. I am strongly in favor of publishing this work and my comments are technical in nature (2-5) and a request for some text changes (1). I have the following comments for improvement:

      1- When it comes to evaluating the role of depletions of Arfs on cell fitness, it would be better to use a non-transformed cell line. I am not asking the authors to go through the painstaking process of generating knockout cell lines in RPE1 cells for instance. Rather, I suggest that the authors make the reader aware that conclusions about cell survival have to be taken with care due to the use of a transformed cell line.

      We will add this valid point to the Discussion.

      2- Why do Arf1 and Arf4 ko cells grow more slowly? Is it a higher rate of cell death? Is it a block in a certain phase of the cell cycle? Given the link of the Golgi to G2-M entry, I think that an analysis of the cell cycle distribution would add more depth to these data. If the cell cycle distribution is unaffected, then I would conclude that the differences in doubling time are due to reduced cell survival. If there is an effect on the cell cycle distribution, then the conclusion of the authors that no single Arf is required for survival is safe.

      We plan to analyze cell cycle distribution.

      3- It is not clear to me how many cells were quantified in Figure 2D-F. I suppose that each dot represents a cell. In this case, the number of cells quantified is a bit low. Such a quantification of fluorescence intensities in two channels in the same region is a simple task and I think it should be no problem obtaining at least 100 cells per condition.

      We will add the number of cells analyzed to the figure legends: at least 40 Golgis were quantified in each experiment, thus >100 in total.

      4- Is the drop in the ratio of beta-COP/GM130 in Arf1 depleted cells reflecting reduced recruitment to the Golgi? Because the Golgi is bigger, it might be reflecting a reduced density in the number of coatomer molecules per surface area. If it is due to reduced recruitment, then the ratio of membrane/cytosolic betaCOP should be altered. This of course requires to show that the knockout does not affect total levels of coatomer. I think that such fractionation experiments would be a valuable addition to the manuscript and increase the depth of the data.

      We are currently performing immunoblot analysis to determine bCOP levels.

      In the Figure below, we have plotted the total intensity of GM130 or bCOP per Golgi from our immunofluorescence data. Total intensity of GM130 significantly increased in the cell lines lacking Arf1, consistent with the increase in Golgi volume. The amount of bCOP at the Golgi remained constant, resulting in a reduced bCOP/GM130 ratio. Deletion of Arf1 thus results in a reduced rate of coat recruitment that is compensated by an increase in Golgi mass. In the simplest model, reduced formation of Golgi-exit carriers causes Golgi growth until exit carrier formation allows for the required flux.

      We propose to include this data in the revised manuscript.

      FIGURE

      5- The finding that Arf4-ko cells exhibit a defect in retrieval of ER-resident proteins is exciting, and in my opinion, it is the most significant finding in this manuscript. How can this be reconciled with the lack of an Arf4 ko effect on coatomer recruitment to the Golgi? Looking carefully at the data, I see that in 2 out of 3 experiments, Arf4 ko reduced the betaCOP/GM130 ratio. This is why I think it is crucial to perform more experiments and add more cells to increase the confidence in the data. Reduced retrieval of ER chaperones is frequently found in tumors and we still don't understand the reason behind this. Therefore, this finding is of significance beyond the community of cell biologists.

      We plan to repeat quantitation with COPI for better statistical validity.

      6- I find Figure 6A confusing. Why do Arf1 overexpressing parental HeLa cells exhibit less Arf1 than control cells?

      In order not to overload the immunoblot of Arf overexpressing lysates, a smaller aliquot (1/20) was loaded. We will indicate this directly below the blots to make this more obvious in the revised figure.

      7- Why was the following condition not tested: Arf4ko cells with Arf1 overexpression. Given the importance of Arf1 in retrograde (Golgi-to-ER) trafficking, I would expect a partial rescue of the retrieval of ER chaperones.

      We will do this experiment.

      Reviewer #1 (Significance (Required)):

      **Significance of the work:**

      The paper is important because it is the first to examine the role of Arfs using a knockout approach. Another very important finding is that Arf4 depleted cells exhibit problems with retrieval of ER chaperones. This is a very novel finding and to the best of my knowledge

      **Audience:**

      The primary audience is of course the community working on membrane trafficking, organelle biology and proteostasis. However, I think that the data on the role of Arf4 in retrieval of ER chaperones might be of relevance for cancer biologists. Secretion of ER chaperones is frequently found in many tumors and we still do not understand why this is happening and what the significance thereof is.

      **My own expertise:**

      Export from the endoplasmic reticulum; Golgi fragmentation in cancer cell migration; Rho GTPases; kinase signaling; pseudoenzymes; cell migration of breast cancer cells; proteostasis in multiple myeloma

      **Referee Cross-commenting**

      Just a follow-up comment from my side:

      I agree that it has not been unequivocally established that Arf1 is the main/sole of retrograde transport. However, even less established is the role of Arf4 in this process. The authors show that it is mainly Arf1 depletion that reduces the amount of COPI at the Golgi (ratio of COPI/GM130). Thus, I remain very surprised that it is actually the Arf4 depletion that results in reduced retrieval.

      What is the significance of having less COPI at the Golgi in Arf1-ko cells? Certainly, the Golgi is not more "leaky". Does the level of COPI at the Golgi not reflect the strength of retrograde trafficking? Maybe there is no less COPI at the Golgi, and it only appears to be less, because the Golgi is bigger. This is why a simple fractionation experiment would be good. Something like making a cytosol and a microsome fraction and looking at the ratio of COPI (Cyt/Mem).

      If both reviewers think it is too much, or unlikely to work, then I am happy to drop this point.

      Below are my comments to the evaluations by the other two reviewers:

      1- I agree with most comments that the two other reviewers made. Some of them are actually overlapping with mine (e.g. the use of a cell line other than HeLa).

      2- I am not sure whether the impact of the paper would improve by adding data on Arf6.

      3- To the comment on Golgi polarity. Maybe we could be more specific here and say that it would be sufficient to show that a trans-Golgi protein and a cis-Golgi protein can be separated by fluorescence microscopy, or whether we alternatively want them to actually do it by immunogold labeling for EM (which is more difficult).

      4- I agree with reviewer 2 that the work proposed needs 1-3 months. I think reviewer 3 is a bit too optimistic with 1 month, because her/his comment on using a cell line other than HeLa cannot be addressed in just a month.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Pennauer uses HeLa cells and CRISPR/Cas9 to delete the 5 members of the class I and class II ARF family of small regulatory GTPases either individually or in combinations. The characterization of the KO cells is excellent and convincingly demonstrates that true KOs were generated. The quality of the data presented is high. Using the KO cells she documents minor alterations in Golgi architecture and the recruitment of vesicular coats in cells deleted of all ARFs except ARF4. In contrast, there is a significant lack of retention/recycling to the ER of KDEL-containing ER proteins in ARF4 KO cells, with numerous ER chaperones now released into the medium (the ARF4 KO secretome). This is a well-done study that showcases the ability of ARF4 alone to sustain cellular life (quite a surprise to this reviewer). Yet, the characterization of the phenotypes is somewhat minimal and the conclusions would be more robustly supported by additional experiments. Specifically:

      1. The authors completely ignore class III ARF6 and this paper would be much more comprehensive and informative if analysis of that ARF was also included (ARF6 has been seen at the Golgi and also mediates endosomal trafficking that intersects with the TGN).

      In agreement with the reviewers' consensus in cross-commenting, we consider Arf6ko to be beyond the scope of this study.

      Although the overall Golgi architecture seems to be largely conserved, it remains essential to test whether Golgi polarity is similarly maintained, and such data would significantly expand the significance of the reported findings

      We have performed super-resolution microscopy of wild-type and Arf1ko Golgis for GM130 and TGN46 as cis- and trans-Golgi markers, respectively, showing that polarity is still intact for Arf1ko, the morphologically most affected knockout cell line. We plan to include the following Figure in the revised manuscript.

      FIGURE

      Golgi complexes were imaged by superresolution microscopy for GM130 (green) and TGN46 (red), and displayed as maximum intensity projections, or tomographic 2D slices. Scale bar, 3 μm.

      Since there is a defect in retrieval of KDEL-proteins, it would be important to show the intracellular localization of the KDEL-R in the cells (especially in the ARF4 KO cells that don't retrieve KDEL-GFP) - is the receptor degraded, stuck in some specific place - knowing that would increase the impact of this study and provide a mechanistic explanation for the observed phenotype

      We plan to perform immunoblot analysis for KDELR to test for changes in levels in Arf deletion cells, and immunofluorescence microscopy to analyze changes in KDELR localization.

      The rescue experiments in Figure 6 are good as far as they go, but this experiment would be much more informative if in addition to the same class rescue, the other class ARFs (at least one!) were also characterized.

      We will do this experiment.

      This is maybe a little too much to ask, but since the authors propose a mechanistic explanation for the ARF4 KO KDEL phenotype as being due to different effectors recruited by this ARF (in this case different COPI isotypes), this study would increase in impact by actually testing this mechanism, either by assessing whether the ARF4Q71L mutant preferentially binds any particular isotype of COPI or even by using mass spec to identify relevant effectors for this extremely interesting ARF.

      We also think that this additional analysis is beyond the scope of this study.

      The Discussion is very limited and would be more impactful with some discussion of the organismal effects of ARF deletions (many are embryonic lethal while cells seem to live quite happily) or mutations (links to cancer come to mind here), as well as some mention of data from yeast ARF (what is and isn't essential in those cells). As is, the authors miss an opportunity to highlight the importance of their findings as they relate to current knowledge of ARFology.

      We agree to add a discussion of information on embryonic lethality and disease.

      Reviewer #2 (Significance (Required)):

      This is an important paper for the ARF field and people interested in ARF signaling will be glad to read about the findings and perhaps also use the developed KO cell lines - this is a significant advancement. The impact would be even higher if some of the experiments suggested above were incorporated into the manuscript.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This paper describes the application of CRISPR/Cas9 to systematically delete from HeLa cells all four Arf genes, either singly or in combination. The authors find it is possible to generate a number of double deletions (notably one lacking both Class I Arfs), and a triple deletion lacking all but Arf4. The authors characterise the structure of the Golgi in these mutants as well as retention of ER residents. The work is a comprehensive study of an exceptionally high technical standard. There is excellent validation of the deletions, and then the application of a wide range of methods including immunofluorescence, electron microscopy and mass-spectrometry, all with careful and extensive quantitation. The finding that cells can survive without Class I Arfs is interesting and unexpected, as is the fact that Arf4 alone is sufficient. This work will provide an excellent platform for future studies on Arf protein function in human cells. There are of course many questions that arise from these findings, but given the scope and quality of the work they would seem better left for future publications. There is one experiment that could be added, and some additions needed to the text for clarity and minor adjustments to the figures (all listed below), but if these are addressed, this would be a high-quality paper of wide interest to a cell biological audience.

      **Specific comments:**

      1) Have the authors tested the levels and/or localisation of the KDEL receptor in the various lines? This is not essential, but if it were easily done, it would add to the work on ER resident secretion.

      We plan to perform immunoblot analysis for KDELR to test for changes in levels in Arf deletion cells, and immunofluorescence microscopy to analyze changes in KDELR localization.

      2) The work is entirely done in HeLa cells. The authors should note that the situation might be different in other cell types and cell lines. For instance, the DepMap CRISPR database suggests that quite a lot of cell lines are strongly affected by loss of Arf1.

      We agree to add a discussion on known effects in other tissues.

      3) Figure 2. Please show single channels as grey scale, and only merge as RGB. This is easier to see, especially for the colour blind. Likewise, Figure 3D would be clearer in greyscale rather than green, and 6B better in grey than in red.

      We will make these changes.

      4) Figure 5C. A brief comment is needed as to why it might be that BiP and calreticulin are not so efficiently secreted when Arf5 is knocked out in addition to Arf4.

      This was a mistake in labeling that lane and will be corrected. It should read "Arf3+5ko", not "Arf4+5ko". Thank you for pointing this out.

      5) Discussion:

      a) The authors should relate these studies to work in other species. For instance, in yeast reduction of Arf levels causes the Golgi to enlarge (PubMed ID 9487133).

      We can discuss this.

      b) Some more discussion is needed of the fact that Arfs may not all act in the same part of the Golgi, which could explain some of the differences observed between the various deletions.

      We can add this point in the discussion.

      Reviewer #3 (Significance (Required)):

      The Arf GTPases have been studied extensively for over 30 years as major regulators of Golgi function. They are essential for the recruitment to Golgi membranes of both COPI and clathrin/AP-1 coats, as well as various other proteins that regulate Golgi function. In addition, they have been reported to have roles in viral replication, and even other cellular processes such as lipid droplet formation and mitochondrial division. In humans there are four Arfs, Arf1 and Arf3 (Class I Arfs), and Arf4 and Arf5 (Class II Arfs). All are present on the Golgi, but their precise individual roles have remained unclear. Attempts have been made to deplete individual Arfs using RNAi, but incomplete knockdowns have made the results hard to interpret.

      **Referee Cross-commenting**

      There is probably no need for a prolonged debate about this, but I agree that the importance of Arf4 is striking, but it reflects the nature of this work that CRISPR has finally allowed these sorts of questions to be addressed unequivocally. COPI is also involved in recycling of Golgi resident enzymes, and it may be that Arf1 acts in this role.

      If the authors check levels of COPI by blotting, and measure the intensity over the Golgi by quantitative IMF, that will reveal whether stability or membrane association is affected without fractionation, which is probably less reliable.

      If they want to do some extra experiments, then it would be quite easy to check the levels of some Golgi enzymes, or look at lectin binding as a proxy for glycosylation enzyme levels.

      Overall, I agree with the positive comments of Reviewers 1 and 2, and it is good that we all recognise the quality and importance of the work. However, I feel that one or two of their requests go beyond the scope of a single publication, or would add rather little for a lot of additional work. It is of course easy to propose experiments that someone else has to do!

      **[On] Reviewer 1:**

      Point 4. I agree that it would be useful to perform a blot to determine if the levels of coatomer are affected in the various KO lines. I am not sure if Reviewer 1 is also proposing fractionation to determine cytosol vs membrane ratio, but if so, then this would be less useful as peripheral membrane proteins tend to fall off membranes during fractionation and so such analysis is generally questionable. A blot, and clarification of the way the COPI/GM130 ratio is determined, would answer the key points in a relatively straightforward way.

      Point 5. I agree that the defect in retrieval of ER residents in Arf4-KO is striking, but it is a clear effect even if the reviewer does not understand it themselves! It does not seem so surprising to me, given that Arf4 is likely to act on the early Golgi, where such retrieval occurs. However, the experiment suggested by myself and Reviewer 2 of checking the levels and localisation of the KDEL-receptor would seem to me a good first step to addressing possible mechanism, and certainly sufficient for an initial publication.

      Point 7. I am not sure that it has been unequivocally established that Arf1 is important in retrograde traffic. The reality is that many labs have taken Arf1 as being representative of all others and so concentrated biochemical and in vivo studies on this protein. This paper is really important as it highlights the need to investigate both Class I and Class II Arfs, and to bear in mind that their roles in vivo may well be more distinct than their in vitro properties would lead one to suspect. Perhaps, the simplest explanation for this is that the GEFs that activate them have a strong preference for one or the other.

      Follow up Comment 1. I was not suggesting that the authors repeat all this in a cell line other than HeLa cells, as this is clearly impractical. HeLa cells are widely used, and so the findings are useful, and whilst it seems certain that some other cell lines would give different data (and indeed the DepMap data show this), then testing one other line would not change the conclusions much. All I wanted the authors to do is to clearly state in the text that what they see in HeLa cells may well be different in other cell lines. This does not detract from the fact that their HeLa cells will provide an excellent platform for focused studies on the role of individual Arfs.

      Follow up Comment 2. I agree that Arf6 is not relevant to this paper (as discussed in detail below).

      Follow up Comment 3. I agree that a simple IMF experiment would suffice to check polarity and immuno-EM is technically very demanding and would add little in this context. The authors have already shown that the Golgi forms stacks in the KO cell lines, and I cannot see how this could occur without the stack being polarised - it has to form at one end and then mature to the other. In addition, after decades of working on the Golgi I have yet to see a credible report of a change to cells causing a loss of Golgi polarity, but maintaining a stacked structure. If the Golgi is not polarised it could not form a stack.

      Follow up Comment 4. I agree that one month is perhaps too short to look at KDEL-R, COPI levels and checking polarity by IMF. As noted above, I am NOT suggesting that they repeat all this in a different cell line.

      **[On] Reviewer 2.**

      Point 1. I agree with Reviewer 1 that the authors are correct to ignore Arf6. It is a completely different GTPase with a distinct function in a different part of the cell. The family of Arf1-Arf5 arose in metazoans from a single Arf, but Arf6 had already split away from the Arf1-5 family in the last eukaryotic common ancestor, as Arf6 is present in plants and yeasts. There is overwhelming evidence that Arf1-Arf5 are partially redundant and this has hampered their study. Arf6 does not share these roles. The fact that it acts on endosomes and has been reported to be on the Golgi (which is not widely agreed) is also true of many other GTPases. Indeed, other distant relatives of Arf1-5 are actually on the Golgi (Arl1, Arl5 etc), but these are also not relevant as, like Arf6, they do not bind coat proteins and other major effectors of Arf1-5.

      Point 2. As noted above, it is hard to see how polarity could be affected given that a Golgi stack is formed, but, at most, a simple application of IMF would seem sufficient to confirm this.

      Point 3. Agreed.

      Point 5. I agree with the reviewer that this is (much!) too much to ask for an initial publication. Various labs have already reported analysis of the effectors of Class II Arfs and they tend to overlap with Class I. Moreover, it is quite possible that the difference of role in vivo reflects differing interactions with regulators.

      Point 6. Agreed.

    1. Author Response:

      Reviewer #1 (Public Review):

      In this work, Panigrahi et al. develop a powerful deep-learning-based cell segmentation platform (MiSiC) capable of accurately segmenting bacterial cells densely packed within both homogeneous and heterogeneous cell populations. Notably, MiSiC can be easily implemented by a researcher without the need for high computational power. The authors first demonstrate MiSiC's ability to accurately segment cells with a variety of shapes including rods, crescents and long filaments. They then demonstrate that MiSiC is able to segment and classify dividing and non-dividing Myxococcus cells present in a heterogeneous population of E. coli and Myxococcus. Lastly, the authors outline a training workflow with which MiSiC can be trained to identify two different cell types present in a mixed population using Myxococcus and E. coli as examples.

      While we believe that MiSiC is a very powerful and exciting tool that will have a large impact on the bacterial cell biological community, we feel explanations of how to use the algorithm should be more greatly emphasized. To help other scientists use MiSiC to its fullest potential, the range of applications should be clarified. Furthermore, any inherent biases in MiSiC should be discussed so that users can avoid them.

      We thank the reviewer for the positive feedback and for comments that help disseminate MiSiC to the broad bacterial cell biology community, as it is meant to be. As described above, we have largely addressed this comment by writing a comprehensive handbook. As detailed below, we now also provide precise measurements of the MiSiC segmentation accuracy compared to ground truth for the various imaging modalities and bacterial species.

      Major Concerns:

      1) It is unclear to us how a MiSiC user should choose/tune the value for the noise variance parameter. What exactly should be considered when choosing the noise variance parameter? Some possibilities include input image size, cell size (in pixels), cell density, and variance in cell size. Is there a recommended range for the parameter? These questions along with our second minor correction can be addressed with a paragraph in the Discussion section.

      Setting the noise parameters is now detailed in the handbook (section 1.d). A set of rules of thumb and recommendations is provided. In addition, a paragraph explaining the importance of noise addition for images with sparse bacterial cell density has been added to the Results section.

      “Associated Figure S1. Background noise can lead to spurious cell detection by MiSiC. SI images retain the shape/curvature information of the intensities in a raw image through eigenvalues of the Hessian of the image and an arctan function, creating smooth areas corresponding to cell bodies and propagating noisy regions where there is no shape information. Thus, MiSiC segments the cells by discriminating between “smooth” and “rough” regions. In effect, when adjusting the size parameter, scaling smooths out the image noise, leading to background regions that have a smoother SI than in the raw image. Some of these areas could be falsely detected as bacterial cells. This effect is shown here: when an image with uniform and random intensity values is segmented with MiSiC with increasing smoothing (here using a Gaussian blur filter), spurious cell detection becomes apparent. In addition, since the SI keeps the shape information and not the intensity values, background objects of relatively low contrast (i.e. dead cells or debris) may be detected as cells. All these artifacts can be mitigated by adding synthetic noise to the scaled images.”
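
      The noise-addition step mentioned at the end of this legend can be illustrated with a minimal sketch. It assumes zero-mean Gaussian noise controlled by a single variance value and an image stored as a list of rows with intensities on an arbitrary scale; `add_gaussian_noise` is an illustrative name and not part of the MiSiC package, whose own noise handling may differ.

```python
import random


def add_gaussian_noise(img, variance, seed=None):
    """Return a copy of `img` (list of rows) with zero-mean Gaussian noise added.

    `variance` plays the role of a noise-variance setting; `seed` makes the
    result reproducible. Both names are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    sigma = variance ** 0.5
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in img]


# A perfectly flat background: after rescaling, such regions can look
# "smooth" and be mistaken for cell bodies; noise makes them "rough" again.
flat = [[0.5] * 8 for _ in range(8)]
noisy = add_gaussian_noise(flat, variance=0.01, seed=0)
```

      With a variance of zero the image is returned unchanged, so the same call can be used whether or not noise is wanted.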

      2) Could the authors expand on using algorithms like watershed, conditional random fields, or snake segmentation to segment bacteria when there is not enough edge information to properly separate them? How accurate are these methods at segmenting the cells? Should other MiSiC parameters be tuned to increase the accuracy when implementing these methods?

      We thank the reviewer for raising this point as it is important to make clear that post-processing algorithms can certainly improve the accuracy of MiSiC masks downstream. To show this specifically, we further processed MiSiC masks of Bacillus subtilis filamentous cells to resolve division septa using the watershed algorithm. This example is now provided as Figure S3. Importantly, there is no particular MiSiC adjustment that needs to be performed prior to running these processing steps, which can be done directly in ImageJ or its bacterial cell analysis plug-in, MicrobeJ. It is worth noting that the post-processing strategy may depend on the scientific question under consideration. In the handbook, we also give an example of post-processing methods that may be used.

      “Associated Figure S3. Refining cell separations with watershed. Watershed methods may be used to obtain a more accurate segmentation of septate filaments such as Bacillus subtilis. In this example applying this method to the MiSiC mask effectively resolves cell boundaries that are not captured in the prediction but are visible by eye (arrows).”

      3) Can MiSiC's ability to accurately segment phase and brightfield images be quantitatively compared against each other and against fluorescent images for overall accuracy? A figure similar to Fig. 2C, with the three image modalities instead of species would nicely complement Fig. 2A. If the segmentation accuracy varies significantly between image modalities, a researcher might want to consider the segmentation accuracy when planning their experiments. If the accuracy does not vary significantly, that would be equally useful to know.

      This is a very important issue that was also raised by reviewer 3 and which we decided to address in full. For each imaging modality and distinct species, we measured the Jaccard index as a function of the threshold set for the Intersection over Union (IoU). The resulting curves are now provided in two separate Figures 2 and 3 and a supplemental Figure S2; they provide a robust measure of the segmentation for each modality/tested species.

      “Figure 2. MiSiC predictions under various imaging modalities. a) MiSiC masks and corresponding annotated masks of fluorescence, phase contrast and bright field images of a dense E. coli microcolony. b) Jaccard index as a function of IoU threshold for each modality determined by comparing the MiSiC masks to the ground truth (see Methods). The obtained Jaccard score curves are the average of analyses conducted over three biological replicates and n=763, 811, 799 total cells for Fluorescence, Phase Contrast and Bright Field, respectively (bands are the maximum range, the solid line is the median). The fluorescence images were pre-processed using a Laplacian of Gaussian filter to improve MiSiC prediction (see Methods).”

      “Associated Figure S2. MiSiC predictions under various imaging modalities. a) MiSiC masks and corresponding annotated masks of fluorescence, phase contrast and bright field images of a dense M. xanthus microcolony. b) Jaccard index as a function of IoU threshold for each modality determined by comparing the MiSiC masks to the ground truth (see Methods). The obtained curves are the average of analyses conducted over three biological replicates and n=193, 206, 211 total cells for Fluorescence, Phase Contrast and Bright Field, respectively (bands are the maximum range, the solid line is the median). The fluorescence images were pre-processed using a Laplacian of Gaussian filter to improve MiSiC prediction (see Methods). c) A human observer performs slightly less well than MiSiC. The same ground truth as used in Figure 2 (dashed lines) was compared to an independent observer’s annotation (solid lines) and Jaccard score curves were constructed as shown in Figure 2. BF: Bright Field, PC: Phase Contrast, Fluo: Fluorescence.”

      “Figure 3. MiSiC predictions in various bacterial species and shapes. a) MiSiC masks and corresponding annotated masks of phase contrast images of Pseudomonas aeruginosa (rod shape), Caulobacter crescentus (crescent shape) and Bacillus subtilis (filamentous shape). b) Jaccard index as a function of IoU threshold for each species determined by comparing the MiSiC masks to the ground truth (see Methods). The obtained Jaccard score curves are the average of analyses conducted over three biological replicates and n=1149, 101, 216 total cells for P. aeruginosa, B. subtilis and C. crescentus, respectively (bands are the maximum range, the solid line is the median). Note that the B. subtilis filaments are well predicted but edge information is missing for optimal detection of the cell separations.”
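
      For readers unfamiliar with the metric used in these legends, the following is a minimal sketch of a Jaccard-versus-IoU-threshold curve. It assumes the common object-level definition, in which a predicted cell counts as a true positive when its IoU with a ground-truth cell reaches the threshold and the Jaccard index is TP / (TP + FP + FN); the exact procedure is the one described in the authors' Methods, which may differ. Cells are represented as sets of pixel coordinates and all function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two objects given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def jaccard_at_threshold(truth, pred, thr):
    """Object-level Jaccard index TP / (TP + FP + FN) at one IoU threshold.

    A ground-truth/prediction pair is a true positive when its IoU is at
    least `thr`; each object is matched at most once, best pairs first.
    """
    pairs = sorted(
        ((iou(t, p), i, j) for i, t in enumerate(truth) for j, p in enumerate(pred)),
        reverse=True,
    )
    used_truth, used_pred = set(), set()
    for score, i, j in pairs:
        if score < thr or score == 0.0:
            break  # remaining pairs overlap even less
        if i not in used_truth and j not in used_pred:
            used_truth.add(i)
            used_pred.add(j)
    tp = len(used_truth)
    fp = len(pred) - tp   # predicted objects with no matching ground truth
    fn = len(truth) - tp  # ground-truth objects that were missed
    total = tp + fp + fn
    return tp / total if total else 1.0


def jaccard_curve(truth, pred, thresholds):
    """Jaccard index evaluated at each IoU threshold."""
    return [(thr, jaccard_at_threshold(truth, pred, thr)) for thr in thresholds]


# Toy example: one well-segmented cell, one missed cell, one spurious object.
truth = [{(0, 0), (0, 1), (0, 2), (0, 3)}, {(5, 0), (5, 1)}]
pred = [{(0, 0), (0, 1), (0, 2)}, {(9, 9)}]
curve = jaccard_curve(truth, pred, [0.5, 0.8])
```

      In this toy case the single overlapping pair has an IoU of 0.75, so it counts as a match at a threshold of 0.5 but not at 0.8, which is why such curves fall as the threshold becomes stricter.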

      4) The ability of MiSiC to segment dense clusters of cells is an exciting advancement for cell segmentation algorithms. However, is there a minimum cell density required for robust segmentation with MiSiC? The algorithm should be applied to a set of sparsely populated images in a supplemental figure. Is the algorithm less accurate for sparse images (perhaps reflected by an increase in false-positive cell identifications)? Any possible biases related to cell density should be noted.

      In fact, MiSiC performs well with both densely and sparsely populated images. In the case of sparsely populated images it is however possible that non-cell objects occasionally appear in the MiSiC mask. As mentioned above, inclusion of noise can help remove these objects in the sparsely populated images. This issue is now fully explained in a supplemental Figure S1. Of note, non-cell objects, if any remain after noise addition, can be eliminated using additional general morphometric filters or specific models fitting bacterial cells, for example those included in MicrobeJ and Oufti. These points are now clarified in the text.

      “Associated Figure S1. Background noise can lead to spurious cell detection by MiSiC. SI images retain the shape/curvature information of the intensities in a raw image through eigenvalues of the Hessian of the image and an arctan function, creating smooth areas corresponding to cell bodies and propagating noisy regions where there is no shape information. Thus, MiSiC segments the cells by discriminating between “smooth” and “rough” regions. In effect, when adjusting the size parameter, scaling smooths out the image noise, leading to background regions that have a smoother SI than in the raw image. Some of these areas could be falsely detected as bacterial cells. This effect is shown here: when an image with uniform and random intensity values is segmented with MiSiC with increasing smoothing (here using a Gaussian blur filter), spurious cell detection becomes apparent. In addition, since the SI keeps the shape information and not the intensity values, background objects of relatively low contrast (i.e. dead cells or debris) may be detected as cells. All these artifacts can be mitigated by adding synthetic noise to the scaled images.”

      and:

      “Along similar lines, non-cell objects can appear in the MiSiC masks and while some can be removed by the introduction of noise, an easy way to do it is to apply a post-processing filter, for example using morphometric parameters to remove objects that are not bacteria. This can be easily done using Fiji, MicrobeJ or Oufti.”
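
      As an illustration of the kind of morphometric post-processing filter described here, the sketch below labels connected objects in a binary mask and keeps only those whose area falls within a plausible range for a cell. It is a dependency-free stand-in for what Fiji, MicrobeJ or Oufti provide; the function names and the area bounds are assumptions chosen for the example, and real filters would typically also use length, width or shape descriptors.

```python
def label_objects(mask):
    """Return the 4-connected objects of a binary mask as lists of (row, col) pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    objects = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                stack, pixels = [(y, x)], []
                seen[y][x] = True
                while stack:  # iterative flood fill
                    cy, cx = stack.pop()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                objects.append(pixels)
    return objects


def filter_by_area(mask, min_area, max_area):
    """Keep only objects whose pixel area lies in [min_area, max_area]."""
    out = [[0] * len(mask[0]) for _ in mask]
    for pixels in label_objects(mask):
        if min_area <= len(pixels) <= max_area:
            for y, x in pixels:
                out[y][x] = 1
    return out


# A rod-like object (area 8) and a one-pixel speck that should be discarded.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
cleaned = filter_by_area(mask, min_area=4, max_area=50)
```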

      5) It is exciting to see the ability of MiSiC to segment single cells of M. xanthus and E. coli species in densely packed colonies (Fig. 4b). Although three morphological parameters after segmentation were compared with ground truth, the comparison was conducted at the ensemble level (Fig. 4c). Could the authors use the Mx-GFP and Ec-mCherry fluorescence as a ground truth at the single cell level to verify the results of segmentation? For example, for any Ec cells identified by MiSiC in Fig. 4b, provide an index of whether its fluorescence is red or green. This single-cell level comparison is most important for the community.

      We have now performed this comparison and determined Jaccard indexes for E. coli and Myxococcus detection using the individual fluorescence images as a reference (Figure 5b). Since we were only able to make this comparison in relatively small fields, we also kept the comparison of expected morphometric parameters in large images. Taken together, these data now demonstrate that the semantic classification, as performed, separates Myxococcus cells from E. coli cells well (see more details in our response to reviewer 3).
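
      The single-cell check that the reviewer proposes can be sketched as follows. The sketch assumes that each segmented cell is available as a list of pixel coordinates together with the species label assigned by the classifier, that Myxococcus expresses GFP and E. coli expresses mCherry as stated by the reviewer, and that a cell is assigned to whichever channel has the higher mean intensity. The function names and the "Mx"/"Ec" labels are illustrative; this is not the Jaccard-based analysis the authors report in Figure 5b.

```python
def mean_intensity(channel, pixels):
    """Mean intensity of one fluorescence channel over a cell's pixels."""
    return sum(channel[y][x] for y, x in pixels) / len(pixels)


def fluorescence_label(green, red, pixels):
    """Assign 'Mx' (GFP) or 'Ec' (mCherry) from the brighter channel."""
    if mean_intensity(green, pixels) > mean_intensity(red, pixels):
        return "Mx"
    return "Ec"


def agreement(cells, green, red):
    """Fraction of cells whose predicted label matches the fluorescence label.

    `cells` is a list of (predicted_label, pixels) pairs.
    """
    hits = sum(1 for label, pixels in cells
               if fluorescence_label(green, red, pixels) == label)
    return hits / len(cells)


# Toy two-channel image: left half green, right half red.
green = [[9, 9, 0, 0], [9, 9, 0, 0]]
red = [[0, 0, 8, 8], [0, 0, 8, 8]]
cells = [
    ("Mx", [(0, 0), (1, 0)]),  # correctly classified
    ("Ec", [(0, 2), (1, 3)]),  # correctly classified
    ("Mx", [(0, 3)]),          # classified as Mx but fluoresces red
]
score = agreement(cells, green, red)
```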

      Reviewer #2 (Public Review):

      Panigrahi and co-authors introduce a program that can segment a variety of images of rod-shaped bacteria (with somewhat different sizes and imaging modalities) without fine-tuning. Such a program will have a large impact on any project requiring segmentation of a large number of rod-shaped cells, including the large images demonstrated in this manuscript. To my knowledge, training a U-Net to classify an image from the image's shape index maps (SIM) is a new scheme, and the authors show that it performs fairly well despite a small training set including synthetic data that, based on Figure 1, does not closely resemble experimental data other than in shape. The authors discuss extending the method to objects with other shapes and provide an example of labelling two different species - these extensions are particularly promising.

      The authors show that their network can reproduce results of manual segmentation with bright field, phase and fluorescence input. Performance on fluorescence data in Fig. 1, where intensities vary so much, is particularly good and shows the benefits of the SIM transformation. Automated mapping of FtsZ shows that this method can be immediately useful, though the authors note this required post-processing to remove objects with abnormal shapes. The application in mixed samples in Fig. 4 shows good performance. However, no Python workflow or application is provided to reproduce it or train a network to classify mixtures in different experiments.

      We thank the reviewer for the positive comment. As discussed in our answer to reviewer 1, the classification presented in Figure 4 (now Figure 5) is meant to provide an example of how MiSiC can be further used to train networks to classify species in interspecies communities by generating two datasets, one per species of interest, to further train a U-Net. Here, the secondary U-Net was developed to specifically discriminate Myxococcus from E. coli, which is a very specialized application. Hence it was not included in the MiSiC package. Nevertheless the code is accessible at https://github.com/pswapnesh/MyxoColi (which is mentioned in the Methods).

      Performance was compared between SuperSegger with default parameters and MiSiC with tuned parameters for a single data set. Perhaps other SuperSegger parameters would perform better with the addition of noise, and it's unclear that adding Gaussian noise to a phase contrast image is the best way to benchmark performance. An interesting comparison would be between MiSiC and other methods applying neural networks to unprocessed data such as DeepCell and DeLTA, with identical training/test sets and an attempt to optimize free parameters.

      In fact, we believe that it does make sense to test how MiSiC performs in the presence of noise and to show that it is robust, making it suitable for use on complex multi-tile images. For this analysis we kept the comparison with SuperSegger, which provides a reference as it is done on a data set optimized for SuperSegger segmentation. Importantly, we keep the parameters constant throughout the analysis because it would not be feasible to tweak parameters tile-by-tile in a multi-tile image. This analysis shows that MiSiC is better suited to this application.

      INSTALLATION: I installed both the command line and GUI versions of MiSiC on a Windows PC in a conda environment following provided instructions. Installation was straightforward for both. MiSiCgui gave one error and required reinstallation of NumPy as described on GitHub. Both give an error regarding AVX2 instructions. MiSiCgui gives a runtime error and does not close properly. These are all fairly small issues. Performance on a stack of images was sufficiently fast for many applications and could be sped up with a GPU implementation.

      We have updated the pip install script for MiSiCgui available on GitHub, which remedies some of these issues: the NumPy error no longer occurs, the program closes properly, and the only remaining messages are warnings about future deprecations in the napari packages. We have tested it on Windows 10, Ubuntu 18 Linux, and macOS Catalina. For the moment it seems impossible to install on macOS Big Sur, possibly because of the Python 3.7 requirement. We will work on this problem in the near future. We have removed the command line interface, as we are developing a future version with an easier way to provide MiSiC as a napari or Fiji/ImageJ plugin.

      TESTING: I tested the programs using brightfield data focused at a different plane than data presumably used to train the MiSiC network, so cells are dark on a light background and I used the phase option which inverts the image. With default settings and a reasonable cell width parameter (10 pixels for E. coli cells with 100-nm pixel width; no added noise since this image requires no rescaling) MiSiCgui returned an 8-bit mask that can be thresholded to give segmentation acceptable for some applications. There are some straight-line artifacts that presumably arise from image tiling, and the quality of segmentation is lower than I can achieve with methods tuned to or trained on my data. Tweaking magnification and added noise settings improved the results slightly. The MiSiC command line program output an unusable image with many small, non-cell objects. Looking briefly at the code, it appears that preprocessing differs and it uses a fixed threshold.

      We thank the reviewer for testing the programs. Tiling-related artifacts can now be avoided by excluding a few pixels at the border in the new version of the MiSiC code. This is implemented in the MiSiC.segment function as segment(im,invert = False,exclude = 16). Without seeing the reviewer's data it is difficult for us to see how the segmentation (which is said to be acceptable) could be further improved. The command line program has now been removed in favor of continuous development of the graphical interface.
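
      One common way to avoid tile-edge artifacts is to process overlapping tiles and keep only the interior of each one. The sketch below illustrates that general idea; it is an assumption about how a border-exclusion setting can work and may not match what MiSiC's `exclude` argument does internally. `process_tiled`, `core` and `box_sum` are hypothetical names, and `box_sum` is a toy stand-in for a segmentation model whose output is unreliable near tile edges.

```python
def process_tiled(img, core, exclude, fn):
    """Run `fn` on overlapping tiles of `img` and stitch the tile interiors.

    img     : 2D image as a list of rows
    core    : side length of the region kept from each tile
    exclude : margin (pixels) processed with each tile, then discarded
    fn      : per-tile function returning a tile of the same shape
    """
    h, w = len(img), len(img[0])

    def clamp(v, size):
        return min(max(v, 0), size - 1)  # replicate pixels at the image border

    out = [[0] * w for _ in range(h)]
    for y0 in range(0, h, core):
        for x0 in range(0, w, core):
            # Tile = core region plus an `exclude`-pixel margin on every side.
            tile = [
                [img[clamp(y, h)][clamp(x, w)]
                 for x in range(x0 - exclude, x0 + core + exclude)]
                for y in range(y0 - exclude, y0 + core + exclude)
            ]
            result = fn(tile)
            # Keep only the core; the margin, where edge effects live, is dropped.
            for dy in range(core):
                for dx in range(core):
                    y, x = y0 + dy, x0 + dx
                    if y < h and x < w:
                        out[y][x] = result[dy + exclude][dx + exclude]
    return out


def box_sum(tile):
    """Toy model: 3x3 neighbourhood sum, treating pixels outside the tile as zero."""
    h, w = len(tile), len(tile[0])
    return [
        [sum(tile[yy][xx]
             for yy in range(y - 1, y + 2)
             for xx in range(x - 1, x + 2)
             if 0 <= yy < h and 0 <= xx < w)
         for x in range(w)]
        for y in range(h)
    ]


img = [[(3 * y + 5 * x) % 7 for x in range(8)] for y in range(8)]
full = box_sum(img)                           # whole image processed at once
tiled = process_tiled(img, 4, 1, box_sum)     # margins discarded
naive = process_tiled(img, 4, 0, box_sum)     # no margin: seams appear
```

      Away from the outer image border, the tiled result with a one-pixel margin matches the whole-image result, whereas the version without a margin differs along the tile seams, which is the kind of straight-line artifact the reviewer describes.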

      Reviewer #3 (Public Review):

      The authors aimed to develop a 2D image analysis workflow that performs bacterial cell segmentation in densely crowded colonies, for brightfield, fluorescence, and phase contrast images. The resulting workflow achieves this aim and is termed "MiSiC" by the authors.

      I think this tool achieves high-quality single-cell segmentations in dense bacterial colonies for rod-shaped bacteria, based on inspection of the examples that are shown. However, without a quantification of the segmentation accuracy (e.g. Jaccard coefficient vs. intersection over union, false positive detection, false negative detection, etc), it is difficult to pass a final judgement on the quality of the segmentation that is achieved by MiSiC.

      We thank the reviewer for this comment. To address it we divided the previous Figure 2 into two figures (and associated supplemental figures) separately showing how MiSiC performs (i) in segmenting two very distinct bacterial species, E. coli and Myxococcus, under various imaging modalities, and (ii) in segmenting other bacterial species: rods (P. aeruginosa), filaments (B. subtilis) and crescent shapes (C. crescentus). The results now clearly show both the strengths and the limitations of the system.

      A particular strength of the MiSiC workflow arises from the image preprocessing into the "Shape Index Map" images (before the neural network analysis). These shape index maps are similar for images that are obtained by phase contrast, brightfield, and fluorescence microscopy. Therefore, the neural network trained with shape index maps can apparently be used to analyze images acquired with at least the above three imaging modalities. It would be important for the authors to unambiguously state whether really only a single network is used for all three types of image input, and whether MiSiC would perform better if three separate networks were trained.

      A single network is used, which takes a shape index map rather than the original image as input. As mentioned by the reviewer, this is a major strength of the workflow given that it permits segmentation independent of the imaging modality, which we now measure for each modality.

      As the reviewer hints, three separate networks, one specific to each modality (PC, fluorescence and BF), could also be trained, allowing direct end-to-end segmentation of raw images. In theory, this could improve the segmentation (although the benefits might be negligible given the actual segmentation quality).
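
      To make the shape index map discussed in this exchange more concrete, here is a minimal sketch. It assumes the standard Koenderink shape index, SI = (2/π)·arctan((λ2 + λ1)/(λ2 − λ1)) with λ1 ≥ λ2 the eigenvalues of the image Hessian, estimated by finite differences without any prior smoothing; the source only states that MiSiC uses Hessian eigenvalues and an arctan function, so the exact formula, scale and conventions in MiSiC may differ. `shape_index` is an illustrative name.

```python
import math


def shape_index(img):
    """Shape index of a 2D image given as a list of rows of floats."""
    h, w = len(img), len(img[0])

    def px(y, x):
        # Replicate edge pixels so that differences are defined at the border.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Second derivatives by central finite differences.
            hxx = px(y, x + 1) - 2 * px(y, x) + px(y, x - 1)
            hyy = px(y + 1, x) - 2 * px(y, x) + px(y - 1, x)
            hxy = (px(y + 1, x + 1) - px(y + 1, x - 1)
                   - px(y - 1, x + 1) + px(y - 1, x - 1)) / 4.0
            # Eigenvalues of the symmetric 2x2 Hessian, l1 >= l2.
            mean = (hxx + hyy) / 2.0
            root = math.sqrt(((hxx - hyy) / 2.0) ** 2 + hxy ** 2)
            l1, l2 = mean + root, mean - root
            # atan2 form of arctan((l2 + l1) / (l2 - l1)); safe when l1 == l2.
            out[y][x] = (2.0 / math.pi) * math.atan2(-(l1 + l2), l1 - l2)
    return out


# A bright blob, the same blob with different gain and offset, and a dark blob.
blob = [[math.exp(-((x - 4) ** 2 + (y - 4) ** 2) / 8.0) for x in range(9)]
        for y in range(9)]
rescaled = [[3.0 * v + 10.0 for v in row] for row in blob]
inverted = [[-v for v in row] for row in blob]

si_blob = shape_index(blob)
si_rescaled = shape_index(rescaled)
si_inverted = shape_index(inverted)
```

      Because the index depends only on the ratio of the Hessian eigenvalues, it is unchanged by a positive gain or an intensity offset and simply flips sign when contrast is inverted, which is consistent with the point that a single network fed shape index maps can serve several imaging modalities.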

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      In this study, the authors investigated the role of NAMPT, NAD+ and PARP1/parthanatos in skin inflammation using a zebrafish psoriasis model with a hypomorphic mutation of spint1a and human organotypic 3D skin models of psoriasis. The authors showed that genetic deletion and/or pharmacological inhibition of Nampt/PARP1/AIFM1/NADPH oxidases reduced oxidative stress, inflammation, keratinocyte DNA damage, hyperproliferation and cell death in zebrafish models of chronic skin inflammation. The authors also showed the expression of pathology-associated genes in human organotypic 3D skin models of psoriasis with pharmacological inhibition of Nampt/PARP1/AIFM1/NADPH oxidases. The key finding of this study is that PARP1 hyperactivation caused by ROS-induced DNA damage mediates skin inflammation through parthanatos.

      **Major comments:**

      This is a very comprehensive study to investigate the role of PARP1 in skin inflammation. The main conclusion was made based on the genetic inhibition and/or pharmacological inhibition of Nampt/PARP1/AIFM1/NADPH oxidases. Although the finding of this study that NAMPT-derived NAD+ fuels PARP1 to promote skin inflammation through parthanatos is interesting and important, there are many major concerns and questions that have to be addressed to better support the main conclusion. In addition, the data and methods were not presented in sufficient detail.

      1. This study relies heavily on pharmacological inhibition. However, the specificity and selectivity of many inhibitors were not tested in this study.

      At least 3 concentrations of each inhibitor were tested, and the lowest one able to rescue the phenotype was then used for further testing (please see Table S1). More importantly, the specificity of all compounds used was confirmed by genetic inhibition of their targets.

      Fig. 1: it is quite confusing how NAD+ increases H2O2 levels. Is NAD+ cell permeable? It is not clear whether NAD+ has really been taken up by cells in the larvae. If NAD+ fuels PARP1 to promote skin inflammation, why did NAM treatment increase H2O2 levels while the NMN precursor failed to increase skin oxidative stress? No reasonable explanation has been provided.

      This is an interesting point. We have shown that exogenous NAD+ added to the water of larvae increased larval NAD+ (please see Fig. 2K). It has been shown that neurons can take up NAD+ through CX43 (Fig. S7), so a similar mechanism may operate in larval skin. As regards the effect of NAM and NMN, a recent study has demonstrated that NAM supplementation increased zebrafish larval NAD+, whereas NA, NMN and NR failed to boost larval NAD+ levels (PMID: 32197067). These results are consistent with our data.

      Fig. 1E and 1G: it is not clear what is the green channel. Similarly, there is no clear description what is red or green in many other figures.

      To help the interpretation of larval pictures, we have indicated in all figures what is analyzed in each fluorescent channel.

      1. Fig. 1K and 1L: It is hard to understand why FK-866 reduced H2O2 release but increased neutrophil infiltration. How should this result be interpreted?

      Fig. 2C-D: Why did low doses of FK-866 reduce neutrophil infiltration whereas a high dose of FK-866 increased neutrophil infiltration?

      Answer to 4&5: As explained in lines 145-156, FK-866 induces NF-kB activation in the muscle and neutrophil infiltration in this tissue when used at 100 µM. This result may be deleted if the reviewers think it is confusing, since a 10 µM dose was used in all subsequent experiments to study the impact of Nampt in skin inflammation. This dose has no effect in the muscle but robustly reduced skin H2O2 production and neutrophil skin infiltration.

      Fig. 2I-J: it is not clear how NF-kB activity was measured. Is that based on green fluorescence shown in Fig. 2J? if so, the representative images were not consistent with the quantification data shown in I. Similarly, many other representative images were also not consistent with their quantification data throughout the manuscript. For example, Fig. 3C/D, 3E/F, 3G/H, 3L/M, Figure S2C/D, S2G/H, Fig. 4C/D, 4J/K.

      NF-kB activity was quantified in the skin, as previously reported (Candel et al., 2014). This is indicated in the M&M section. The images show the whole larvae, and NF-kB is expressed at high levels in different tissues, such as neuromasts. To clarify this, we have included an additional figure to explain the ROI used for quantification of H2O2 and NF-kB (Fig. S1G).

      Figure S1C, Nampta/Namptb protein expression should be checked and shown after its KO using crispr/cas9 technique.

      Unfortunately, we have used two different antibodies and both failed to cross-react with zebrafish Nampta/b. However, we have included the efficiency of CRISPR-Cas9 in Fig. S1F of the revised version. The efficiency is relatively low, probably indicating that Nampt is indispensable for zebrafish development, as occurs in mice (PMID 28333140).

      Fig. 3I: protein expression of nox1, nox4 and nox 5 should be checked after genetic inhibition using CRISPR/Cas9 technique.

      Unfortunately, we do not have antibodies able to recognize zebrafish Nox1, Nox4 and Nox5. However, we have provided the efficiency of the gRNA used for each gene (Fig. S3) and it is about 65%.

      Fig. 4: If Olaparib treatment increased DNA damage, will it increase PARP1 activation and PAR formation?

      As has been widely shown in mammalian models, parthanatos is triggered by overactivation of PARP1 following DNA damage. Therefore, although Parp1 inhibition by olaparib may further induce DNA damage, it blocks parthanatos. This is consistent with our results showing that olaparib reduces PARylation (Fig. S4H) and cell death (Figs. 4J, 4K).

      Fig. 4M: it is not clear what staining has been done. No difference was observed among different groups.

      As indicated in the figure legends, pγH2Ax+ (green) keratinocytes (red) are shown. We have indicated this in the figure and included arrows to show pγH2Ax+ cells. The quantitation of this experiment (Fig. 4L) shows that FK-866 robustly reduces, while olaparib increases, keratinocyte DNA damage.

      Authors used N-phenylmaleimide (NP) to block AIF nuclear translocation. How does this inhibitor work? what is its actual effect on AIF nuclear translocation? Experiments are required to show this inhibitor actually blocks AIF nuclear translocation.

      NP has been shown to block AIFM1 nuclear translocation, since it inhibits the cysteine proteases required for the cleavage of AIFM1 that precedes its nuclear translocation (PMID 8879205). Although we have shown that genetic inhibition of Aifm1 rescues skin inflammation, confirming the specificity of the inhibitor, we agree on this point. Therefore, we have performed additional experiments and showed nuclear Aifm1 in keratinocyte aggregates of Spint1a-deficient larvae and that NP treatment blocked nuclear translocation (Fig. S6C). In addition, we have also shown increased nuclear translocation of AIFM1 in keratinocytes of lesional skin from psoriasis patients (Figs. 6C, 6D).

      Figure S4: it is hard to understand why lane #2 with Olaparib has the highest PAR signal.

      We apologize for the mislabeling of the WB. The correct legend is: 1 +/+, 2 -/- treated with DMSO, 3 -/- treated with FK-866 and 4 -/- treated with olaparib.

      Does spint1a-/- zebrafish show parthanatos cell death? It is not clear how cell death was measured.

      We have shown that skin keratinocytes from Spint1a-deficient fish show increased cell death, as assayed by TUNEL, that is fully reversed by olaparib (Figs. 4J, 4K). In addition, skin keratinocytes from the mutant fish also have increased PARylation that is reversed by either FK-866 or olaparib (Fig. S4G, S4H). Further, pharmacological and genetic inhibition of Aifm1 and forced expression of Parga also rescue skin inflammation. Finally, we have included new experiments showing Aifm1 nuclear translocation in both Spint1a-deficient larvae and psoriasis patient lesional skin. Taken together, these results show that Spint1a-deficient fish display inflammation induced by parthanatos cell death.

      NAD+ levels were regulated by 3 different pathways. Expression of many genes involved in these 3 pathways were altered in psoriasis. However, it is not clear if the other two pathways play a role in PARP1-mediated inflammation.

      The NAD+ salvage pathway has been shown to be the major pathway regulating NAD+ levels in most tissues. The inhibition of this pathway with FK-866 rescues all skin phenotypes observed in Spint1a-deficient larvae as well as in organotypic 3D skin models of psoriasis. These results were further validated using another inhibitor (GMX1778) and genetic inhibition. Therefore, our results support that the salvage pathway is the one involved in psoriasis and that inhibition of this pathway would rescue inflammation. However, it would be worthwhile to investigate whether other pathways play a role in psoriasis, specifically upon inhibition of the salvage pathway.

      **Minor comments:**

      1. Page 9: To test this hypothesis, we used N-phenylmaleimide (NP), a chemical inhibitor of Aifm1 translocation from the nucleus to the mitochondria (Susin et al., 1996). The statement is not correct.

      We apologize for this mistake. It has been amended to: “To test this hypothesis, we used N-phenylmaleimide (NP), a chemical inhibitor of Aifm1 translocation from the mitochondria to the nucleus (Susin et al., 1996).”

      Page 12: To the best of our knowledge, this is the first study demonstrating the existence of parthanatos in vivo. This statement is not correct.

      We have removed this statement.

      Figure S3 and S6E: they should be presented in an easy understandable way for the general readers.

      We have explained in the legends the graph output of TIDE analysis.

      Figure legends should be presented in a clearer way.

      We have done our best to clarify the legends. All suggested changes and requests have been incorporated.

      Reviewer #1 (Significance (Required)): Parthanatos is a new type of cell death distinct from apoptosis, necrosis and necroptosis, and plays a pivotal role in ischemic stroke and neurodegenerative diseases (Wang Y et al., Science. 2016; Kam TI et al., Science 2018). The current study may provide new evidence of the importance of PARP1 and parthanatos in skin inflammation and potential targets for the treatment of skin inflammation.

      We thank the reviewer for this opinion on the significance of our study.

      The reviewer has expertise in oxidative stress, PARP1 and parthanatos research.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): **Summary:** The manuscript entitled "NAMPT-derived NAD+ fuels PARP1 to promote skin inflammation through parthanatos" is well written, divided and organized. This work demonstrated that models of psoriasis are characterized by ROS stress, inflammation and cell death. It was clear that NAMPT, a rate-limiting enzyme of the NAD salvage pathway, and PARP1, a poly-ADP-ribose polymerase, could be targeted to decrease the ROS stress and inflammation that contribute to cell death through the parthanatos pathway. However, it was not clear that NAD+ is responsible for fueling these processes in the psoriasis models analyzed. Nevertheless, the present work demonstrated that the cell death observed in the psoriasis model analyzed was correlated to an unidentified programmed cell death pathway, parthanatos, which to date has not been demonstrated.

      We are pleased with the reviewer’s comments on our study.

      **Major comments:** Most of the data shown confirm that inhibition of NAMPT or PARP1 seems to be beneficial for relieving some characteristics related to oxidative stress and inflammation in the skin. However, the authors should show data on NAD+ levels alone, instead of the NAD+/NADH ratio, to state that NAMPT-derived NAD+ is promoting oxidative stress (line 366-368) (Fig. 2K).

      The data shown in Fig 2K are NAD+ plus NADH. Considering that cytosolic and nuclear NAD+/NADH ratio typically ranges from 100 to 1000 (PMID: 21982715), these data mainly show intracellular NAD+ concentration in larvae.
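      The arithmetic behind this statement can be sketched as follows. This is only an illustration, assuming that the cited free cytosolic/nuclear NAD+/NADH ratio (100 to 1000) is representative of the measured pool; the function name is hypothetical and not part of the study.

```python
def nad_plus_fraction(ratio: float) -> float:
    """Fraction of the combined NAD+ + NADH pool that is NAD+.

    `ratio` is [NAD+]/[NADH]; the fraction follows from
    NAD+ / (NAD+ + NADH) = ratio / (1 + ratio).
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return ratio / (1.0 + ratio)


# Assumed range of the NAD+/NADH ratio, taken from the rebuttal text.
for ratio in (100, 1000):
    fraction = nad_plus_fraction(ratio)
    print(f"NAD+/NADH = {ratio}: NAD+ fraction of total pool = {fraction:.4f}")
```

      Under that assumption, NAD+ would make up roughly 99% or more of the combined signal, which is why a total NAD+ plus NADH measurement can approximate NAD+ alone.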

      Some data images are not convincing, or they don't really show an increase or decrease as the author showed in graph data. (Fig1D, 1E - 1F,1G).

      H2O2 and NF-kB were quantified in the skin, as previously reported (Candel et al., 2014). To clarify this, we have shown the ROI used for quantification of H2O2 and NF-kB in Fig. S1G.

      What is the relevance to analyze muscle and what is the relevance of the results obtained, since the effect of FK-866 in muscle increases the NFKB activity?

      This is essentially the same concern raised by reviewer 1. FK-866 induces NF-kB activation in the muscle and neutrophil infiltration in this tissue when used at 100 µM. This result may be deleted if the reviewers think it is confusing, since a 10 µM dose was used in all subsequent experiments to study the impact of Nampt in skin inflammation. This dose has no effect in the muscle but robustly reduced skin H2O2 production and neutrophil infiltration.

      Figure S4H is not convincing with what the author wrote.

      We apologize for the mislabeling of the WB. The correct legend is: 1 +/+, 2 -/- treated with DMSO, 3 -/- treated with FK-866 and 4 -/- treated with olaparib. Both FK-866 and olaparib rescue PARylation in the skin of Spint1a-deficient larvae.

      The author should make the keratinocyte aggregation experiment with FK-866 treatment to better substantiate what they are proposing.

      These results are shown in Figs. 2E and 2F.

      **Minor comments:** Line 281: "NP, a chemical inhibitor of Aifm1 translocation from the nucleus to the mitochondria..." should be the opposite: NP, a chemical inhibitor of Aifm1 translocation from mitochondria to nucleus.

      We are sorry for this mistake. It has been amended.

      Line 299 "figure 6A" should be Figure 6B.

      We have checked and it is correct.

      How do the authors explain that all the results are related to NAMPT, and supposedly to NAD+, yet NMN, an important precursor of NAD in the salvage pathway and a well-known NAD+ booster, did not show any effect?

      This is an interesting point that was also raised by reviewer 1. A recent study has demonstrated that NAM supplementation increased zebrafish larval NAD+, whereas NA, NMN and NR failed to boost larval NAD+ levels (PMID: 32197067). This explains our results. We have discussed this point in the revised manuscript.

      Line 178: should be NAMPT inhibitor instead of FK-866 inhibitor.

      Thanks a lot. It has been amended.

      Line 191-192: I suggest reformulating this sentence, since the result shown was only the NAD/NADH ratio.

      Please, see our response above. We are measuring NAD+ plus NADH. We have amended the text to clarify this fact.

      Reviewer #2 (Significance (Required)): The present work convincingly demonstrated the relevance of PARP1 and NAMPT in the field of inflammation and ROS in the skin, which contribute to diseases like psoriasis. Although it is not a lethal disease, as the authors mentioned, it affects the physical and mental health of the individual. Understanding the mechanisms that underlie this condition would help to develop new and more efficient treatments. It was clear that the results showed a promising strategy in targeting NAMPT and PARP1. Furthermore, inhibitors for them are already known and may be useful for future treatment of psoriasis.

      We thank the reviewer for these comments on the impact of our study.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): This study shows that NAMPT-derived NAD facilitates PARP activation to promote skin inflammation via parthanatos. The authors used the zebrafish model and organoid models of psoriasis and observed that inhibition of NAMPT reduces inflammation in zebrafish and human skin organoid models. They also observed that NADPH oxidase-derived oxidative stress activates PARP, and PARP inhibition or over-expression of PARG or AIF mimics protection mediated by NAMPT inhibition. This is an interesting study, but there are several weaknesses in the support for its conclusions. While pharmacological inhibition is a powerful tool, complementary methods (knockout of PARP-1) are critical for this paper's conclusions. The PARP inhibitor used in this study may not specifically inhibit PARP1 but other PARPs too. Therefore, genetic knockout of PARP would make the conclusions/interpretation of this study stronger.

      We thank the reviewer for these comments on our manuscript. All pharmacological inhibitions used in this study were confirmed by genetic experiments, including Parp1. The genetic inhibition of Parp1 is shown in Figs. S4C-S4F.

      Additional comments include: This study's primary focus is PARP activation and PAR-mediated parthanatos, but it is not shown how different inhibitors used in this study and supplementations of NAD alter PARP activation and PAR formation.

      We have shown through the quantitation of PARylation that Spint1a-deficient skin shows increased PAR activity and that pharmacological inhibition of either Nampt or Parp was able to fully reverse it (Figs. S4G & S4H). In addition, we have also shown a dramatically increased PAR activity in lesional skin biopsies from psoriasis patients (Fig. 6E).

      NAMPT is not the only NAD biosynthesis pathway; how do other NAD pathways respond when NAMPT is inhibited with FK-866?

      The NAD+ salvage pathway has been shown to be the major pathway regulating NAD+ levels in most tissues. The inhibition of this pathway with FK-866 rescues all skin phenotypes observed in Spint1a-deficient larvae as well as in organotypic 3D skin models of psoriasis. Therefore, our results support that the salvage pathway is the one involved in psoriasis and that inhibition of this pathway would rescue inflammation. We agree that it would be worthwhile to investigate whether other pathways play a role in psoriasis, specifically upon inhibition of the salvage pathway, but this is outside the scope of this manuscript.

      PARG is used in this study, but the protein levels of PARG are not shown, and it is not clear whether the PARG overexpression is sufficient to reduce PAR levels in the models used. Pharmacological and genetic manipulation of AIF is used, but it is not shown that AIF translocates to the nucleus in this model.

      We agree on these points, so we have analyzed Aifm1 translocation in Spint1a-deficient larvae and psoriasis patient lesional skin (please see our response to reviewer 1 above) and PARylation upon forced expression of Parga (Fig. 5M).

      Does NAMPT inhibition reduce NADPH oxidase activity?

      Our results indicate that Nampt inhibition reduces NADPH oxidase activity, since a drastic reduction of H2O2 production was observed in the skin of Spint1a-deficient larvae treated with FK-866.

      PAR plots provided in fig S4 need quantification, and the blots (Fig S4 G&H) should be run on the same gel to make sure the exposure levels are the same. It is not clear which group is represented in lane 4 of Fig S4 G.

      We have provided the quantitation. The problem is that we mislabeled the legend of Fig. S4H. The correct legend is: 1 +/+, 2 -/- treated with DMSO, 3 -/- treated with FK-866 and 4 -/- treated with olaparib. Therefore, either Nampt or Parp inhibition robustly reduces PARylation of Spint1a-deficient skin to the levels of their wild-type counterparts.

      Reviewer #3 (Significance (Required)): This study is interesting, potentially showing the role of PARP-1 activation and parthanatos in skin inflammation. It could be very significant if the weaknesses identified above are addressed.

      We are pleased with this reviewer’s assessment on the significance of our study.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We thank the reviewers for the positive assessment of our work and for the constructive comments that helped us to improve the quality of our manuscript. We have carefully considered each point and have addressed most by modifying the manuscript text to increase clarity of our work. Based on a suggestion by Reviewer 2 we have also included the results of a new experiment.

      In addition to addressing all comments of the reviewers, we have expanded the part of the study analysing the functionality of Caulobacter’s DnaA Nt in the heterologous host E. coli. Furthermore, we have replaced our original set of fluorescence data by a new data set that has been acquired using optimized measurement parameters (bottom read and 100 for the detector gain - see Material and Methods for details), which have improved the signal-to-noise ratio and the overall quality of the fluorescence profiles. Importantly, these new data do not change, but rather strengthen, our conclusions.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Felletti et al provide compelling new evidence that a CDS element in the dnaA mRNA is required for nutrient-dependent translational control. This provides a mechanism by which dnaA translation is shut off during carbon starvation, and is supported by a rather rigorous analysis of the mRNA performed both in vitro and in vivo. Overall it was a pleasure to read and the data are generally very compelling. My specific comments are below:

      **Major Comments:**

      While the authors rule out differences in charging of different ala-tRNAs as controlling the nutrient dependent repression in translation, the authors assume that this must be due to the nascent sequence. However, could it also be possible that all ala-tRNA isoacceptors have lower charging after C-starvation?

      We thank the reviewer for raising this important point. As Reviewer 1 pointed out, we cannot conclusively exclude that carbon starvation could lead to reduced charging levels of all isoacceptor Ala-tRNAs. However, based on the available literature, we consider it unlikely. In an initial study by Elf et al 2003 (confirmed later by Dittmar et al 2005 and Subramaniam et al 2014), the authors argued that under amino acid-limiting conditions the charging levels of the different isoacceptor tRNAs depend directly on their codon usage during translation. Importantly, in our work we could show that Nt mediates the inhibition of translation independently of the synonymous codon choice, suggesting that aa-tRNA levels are not limiting in our experimental conditions. To address this comment of Reviewer 1, we discuss this matter in greater detail in the revised version of the manuscript (line 374-379).

      **Minor comments:**

      It was observed many years ago that tmRNA is required for the proper timing of DNA replication initiation in Caulobacter (Cheng and Keiler J Bact 2009). Since the AAI motif appears to alter translation elongation, it might be interesting to discuss whether the AAI motif may be linked to ribosome arrest and rescue.

      We appreciate this suggestion. Cheng and Keiler 2009 proposed an indirect involvement of the tmRNA in the transcriptional regulation of DnaA over Caulobacter’s cell cycle. In the revised version of the manuscript, we mention the tmRNA and ArfB protein as possible factors involved in ribosome rescue following Nt-induced ribosome stalling and we refer to Keiler et al 2000 and Feaga et al 2014.

      Line 49 - add "initiation"

      The word “initiation” was added to the text.

      Line 61 - is "cleared" meant to be proteolyzed or simply meaning to have a lower protein level?

      We apologize if we were not clear. We rephrased the text as follows: “[…] DnaA levels decrease at the onset of carbon starvation […]”.

      Line 92-93 - is this 5' UTR based on a previously defined TSS determined in their previous study?

      The dnaA TSS was first determined by primer extension (Zweiger and Shapiro 1994) and later by global 5’RACE (Schrader et al 2014 and Zhou et al 2015). In the new version of the manuscript, we include references to these previous studies (line 94).

      Line 115-118 - this is interesting, might this conserved 5' UTR be added to rfam?

      We thank the reviewer for this suggestion. We will submit our alignment to rfam after publication of the manuscript in a journal.

      Line 126-127, 131,189 - Is the 3nt sequence the authors found here considered a Shine-Dalgarno site? I would imagine that this would be too small to consider this. Perhaps calling it SD-like sequence might be more appropriate.

      We agree with this comment. In the new version of the manuscript, we refer to the identified 3-nucleotide sequence as a “SD-like sequence”.

      Lines 136-140, 208-210 - Would the authors consider this upstream site with a potential CUG start codon a standby site? It appears to fit many of the criteria which could be used to define one.

      According to our probing data, the mRNA region in proximity of the CUG start codon forms a very stable stem-loop structure. Based on our previous experience (especially the extensive work by the Wagner lab), typical ribosome standby sites only occur in largely unstructured regions. Furthermore, in Supplementary Fig. 4 we show that the deletion of stem P4 does not affect eGFP expression levels. For these reasons, we consider it unlikely that the putative CUG start codon is part of a ribosome standby site.

      Lines 253-255 - this is a beautiful experiment, but very hard to understand from the text. Perhaps add a sentence or two to explain it in more detail.

      We thank the reviewer for this comment. In the revised version of the manuscript, we provide a more detailed description of the dfsNt reporter mutant. We hope this will address the reviewer’s concerns.

      Line 307 - add "synonymous"

      The word “synonymous” was added in the revised version of the manuscript.

      When dnaA is depleted, it was observed that the chromosome can be erroneously segregated by the ParA/B/S system (Mera et al PNAS). Does this occur in C-starvation when DnaA levels drop?

      In a separate study we have also observed that in a fraction of DnaA-depleted cells the origin of replication is erroneously translocated from the stalked to the swarmer cell pole. We have not studied this phenomenon under carbon starvation, as it lies outside the scope of this paper. However, if the ParA/B/S system remains functional under carbon starvation, this might also happen in G1-arrested starved cells.

      Reviewer #1 (Significance (Required)):

      Appears to be quite significant to researchers studying regulation of bacterial cell cycle and translation. Since DnaA is conserved across bacteria, and this mechanism works in E. coli, it appears that the findings will likely be important in many bacterial systems.

      Referee Cross-commenting

      All the reviewer comments I read seem reasonable. Specifically, I find the point about E. coli 30S ribosomes very important for the authors to address. This could be done in writing, but it should be clearly listed as a caveat to those experiments.

      As suggested by the reviewers, we have partially rephrased some parts of the text describing the toeprint results. Moreover, we have inserted in the main text and in Fig. 1 legend explicit references to the use of purified E. coli 30S subunits and tRNA-fMet. We believe these changes will address the reviewers’ concerns.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:** The Jonas lab provides good evidence that they have found a new mechanism to regulate the amount of the DnaA protein by a starvation signal. The DnaA protein is the key chromosome replication initiator, probably for most bacteria, and as such DnaA is the target of many regulatory inputs. The authors created an accurate reporter system that allows them to dissect the 5' mRNA translated and untranslated sequences of dnaA, and they have convincingly demonstrated that the N-terminal DnaA peptide sequence, and not the RNA, mediates the response to starvation by glucose exhaustion. This is potentially a model example for global translational responses in bacteria.

      **Major comments:**

      The main conclusion, i.e. that the DnaA leader peptide "Nt" mediates this response is convincing. However, there were 2 major problems that should be easily addressed. These do not subtract from the main conclusion.

      Problem 1

      E. coli 30S subunits were used in the "Toeprint" assay of Fig. 1. Obviously Caulobacter 30S Ribosome subunits should have been used, or a justification should be given. One remedy would be to make this supplementary information.

      We thank Reviewer 2 for this comment. We agree that it would be better to use Caulobacter 30S ribosome subunits in our toeprint experiments. However, because toeprint assays with E. coli 30S ribosome were already established in our lab (i.e. the Wagner lab, where the assays were performed) and because works by other groups have shown that E. coli 30S subunits can be used to study the translation of mRNAs from other bacteria, we decided to use this experimental set up. Based on our results, we also had no reason to doubt the suitability of the E. coli 30S subunits. The toeprint showed that translation starts at the in silico predicted translation start site, which was further confirmed by our in vivo mutagenesis experiments. For these reasons, we are confident that the toeprint assays indicate the true translational start site. However, we acknowledge that we could have been more explicit about the use of the purified E. coli 30S subunits and tRNA-fMet in toeprinting assay. To increase clarity and transparency, in this revised version of the manuscript, some parts of the main text were rephrased and references to the use of E. coli 30S and tRNA-fMet were introduced (including Fig. 1 legend). We hope that these changes will address the reviewer’s concerns.

      Problem 2

      The results in Fig. 6B could be due to the Nt simply making the hybrid protein more unstable in E. coli. This is the main impression given by the drop in signal. In this case, the conclusion would be wrong, and Nt is not transferring a starvation translation block from C. crescentus to E. coli. Nt is just making the protein unstable. These results should be treated as preliminary pending protein stability measurements. However, this defect does not subtract from the other main points and without the Fig. 6 E. coli experiments they still make a complete and interesting story. One remedy would be to make this also supplementary information.

      It is indeed striking that a drop of normalised fluorescence is observed for the 5’UTRdnaA-Nt construct in E. coli but not in Caulobacter. In order to address if this behavior can be explained by reduced protein stability, we have performed a translation shut-off assay using the 5’UTRdnaA-Nt E. coli reporter construct. The results of this experiment (shown in Supplementary Fig. 9A and described in line 327-329) show that the normalised fluorescence remains stable over 10 hours after chloramphenicol addition to the culture, ruling out that the presence of Nt significantly affects eGFP protein stability in E. coli. Importantly, this experiment also showed that in contrast to the chloramphenicol treated culture, in which the OD600 decreased after reaching stationary phase, the OD600 of the non-treated cultures slightly increased between 2 and 10 hours (Supplementary Fig. 9A). Because this increase was not observed in carbon starved Caulobacter cultures, we consider the different growth dynamics between E. coli and Caulobacter to be the most likely explanation for differences in eGFP accumulation at later time points during the experiment.

      To further strengthen our E. coli data, we have analysed additional relevant Nt mutants that we identified as most critical mutants in our Caulobacter experiments presented in Fig. 5, namely dfsNt, mutD1, mutD2, ΔAAI and AAI>DDK. Determination of Δt and Δf values for the E. coli strains carrying these different Nt constructs showed similar results as for the corresponding constructs in Caulobacter. Collectively, these new data further support the notion that Nt operates in E. coli through a conserved inhibitory mechanism of translation. These data are now included in a reorganized new version of Fig. 6 (panels A, B) as well as in Supplementary Fig. 9.

      **Minor comments:**

      There are also 6 minor issues that are easily addressed, most by small changes to the text, and these should improve this otherwise fine manuscript.

      Issue 1

      Line 88 Fig. 1A shows DnaA degradation upon entering stationary phase from a low glucose media and not a simple starvation response to one component like glucose. Did the authors consider trying simple washout experiments, i.e. resuspend the cells in glucose-free media? This would have the advantage of suddenly exposing the cells to starvation and thereby studying the sudden response rather than the slower lingering response which would be due to many factors and not just glucose removal.

      In a previous work from our lab (Leslie et al 2015), we have conclusively shown that the downregulation of DnaA synthesis depends primarily on the nutrient content of the growth medium.

      Besides being in continuity with our previous work, we think that the starvation protocol that we used in the present study, and that was also used by the Sean Crosson lab (Boutte et al. 2012), might better reproduce what happens in the natural environment when nutrient levels gradually decrease until becoming limiting for bacterial growth.

      Issue 2

      Reference 16 should be cited as the first publication to show that glucose and other starvations induce DnaA degradation in Caulobacter.

      We have added Reference 16 to the first sentence of the results section, in which we state that DnaA levels decrease when cells are shifted from a glucose-supplemented minimal medium to a glucose-limiting medium.

      Issue 3

      Fig. 1D shows that the TOEprint is not changed by adding the ribosome, very surprising considering its size and SD docking & alignment. 2 Minor bands then appear when the tRNA-Met is further added. These are presumably the "toeprints". A control with just the added tRNA-Met would make this result much more significant.

      In the existing literature, there is a common consensus to consider real toeprints (i.e., indicative of the presence of an assembled translation pre-initiation complex) as only those bands that appear faintly in the presence of the 30S ribosome subunit but that become clearly enhanced upon addition of the initiator tRNA-fMet. Some examples can be found in Hoekzema et al 2019, Romilly 2014, Romilly 2020. In cases when the translation start site is buried in a structural element, the intensity of the toeprint signal is further increased when the mRNA is rendered unfolded, as seen in our data.

      tRNA-30S-independent bands always show up in toeprint experiments, but their intensities differ with the sequence of the mRNA and sometimes the choice of RT used for primer extension. Addition of initiator tRNA-fMet alone is commonly not done in toeprint experiments (see references mentioned above). Finally, we want to point out again (see also our answer on “Problem 1”) that the toeprint data are very much consistent with our in silico predictions and our in vivo mutagenesis data. Therefore, we are confident that the observed toeprint upstream of the AUG corresponds to the true ribosome binding site.

      Issue 4

      Why does the cell OD drop, e.g. in Fig. 2, is it cell death from starvation?

      We don’t think that the slight reduction of OD600 observed in our experiments is due to cell death. To our knowledge, carbon-starved cells remain viable up to 24 hours after the starvation onset. Instead, we have observed a cell volume reduction, which may at least partially explain the observed OD600 decrease.

      Issue 5

      Line 327 Discussion "This study reveals a new mechanism, by which some bacteria can regulate the synthesis of the replication initiator DnaA in response to nutrient availability by modulating the rate of translation." Rate of translation or rate of translation abortions (as implied in Fig. 6)?

      The rate of translation is the result of multiple contributions such as initiation, elongation, abortion and termination. Our data indicate that Nt is a regulator of DnaA translation elongation responding specifically to the nutritional state of the cell. Translation abortion could be one of the possible outcomes (but not the only one) of ribosome stalling. For these reasons, in the new version of the manuscript, we added the word “elongation” at the end of the sentence mentioned by Reviewer 2 (line 354).

      Issue 6

      It seems that for most experiments with the eGFP the translation and protein decay components of the signal could have been easily uncoupled by running a parallel +chloramphenicol control. For example, this would simplify the interpretation of Fig. 6, where Nt eGFP stabilities are an issue and it is important to establish comparable protein stability with and without the Nt peptide.

      To address the reviewer’s comment, we have now included a chloramphenicol control experiment (stability assay) performed with E. coli carrying the 5’UTRdnaA-Nt reporter construct (Supplementary Fig. 9A). Please see the response above for more details. For the experiments with the Caulobacter 5’UTRdnaA-Nt reporter, we show in Supplementary Fig. 7 that the Nt peptide has no destabilising effect on eGFP.

      Reviewer #2 (Significance (Required)):

      Caulobacter crescentus is a model bacterium that has provided many insights into bacterial physiology that are now exploited to understand many organisms. The present results may provide one such example. It is known that the first amino acids of translated peptides can promote or impede exit from the ribosome, so this is a potential translation-level regulatory point that might be used by many organisms. This manuscript gives a concrete and important example of such usage, suggesting that it may be widespread. Therefore, this work should find a wide audience and it should stimulate research in many other systems.

      My lab also studies Caulobacter crescentus and we studied the same dnaA gene and protein including starvation responses. We at present do not have projects on dnaA but we do study other regulators and regulatory mechanisms of chromosome replication in Caulobacter crescentus.

    1. Author Response:

      Reviewer #1:

      This meta-analysis addresses a double-edged sword in evolutionary biology. Group living may be beneficial for many reasons, but has costs in terms of increased rates of parasitism. Furthermore, if groups are highly related, parasites that are genetically able to infect one member of the group may be able to infect all of them, putting the entire group at risk. In the meta-analysis presented here, many original studies working on questions related to parasitism, relatedness and group living are brought together in one unifying framework. The authors find that indeed, group living can facilitate the spread of infectious diseases. However, they also find that the negative effects of disease can be overcompensated by the benefits of being social. The authors stress that experimental studies are necessary to disentangle these effects. The study is of high standard and well-conducted. The take-home message is clear and of general interest.

      The study highlights that experimental work is important to understand the relationship between parasitism, relatedness and living in groups. However, I missed an important aspect here. Experiments tend to stretch factors (sometimes to extremes), which may run counter to the biology of the species. In some cases, this results in non-social organisms being pressed into a group environment. For example, the monoculture effect as we know it in agriculture is highly artificial. Clonal lines of crop are planted in high density, promising high yield, if pathogens stay out. These plants do not have a history of evolving mechanisms to deal with the effect of high relatedness. In contrast, animals living in social groups may never experience settings with non-relatives. Social insects evolved to deal with parasites by expressing specific adaptations, such as grooming, hygiene and social structure in the colony. Many social insects may never experience conditions of low relatedness. Thus, I expect it makes a difference if you experimentally force a non-social organism to be social, or a social organism to be asocial. I would be happy if this factor could be included in the reasoning, and maybe even analyzed quantitatively. For example, I would expect that non-social species artificially made to grow in groups of relatives suffer much more from parasites than typical social animals with the same degree of relatedness.

      This is an important point. One of the main motivations for conducting this study was to test if species that typically live with kin have evolved adaptations to minimise any increase in susceptibility to pathogens brought about by living in groups with relatives. We therefore collected data on whether species are: a) typically social or non-social, and b) average levels of relatedness between individuals in groups under natural conditions (see Methods section ‘Data on species characteristics’).

      a) Testing differences between social and non-social species. All species included in our dataset had some part of their life-cycle where they were social (note we specifically excluded any studies on non-natural systems such as crops and domesticated species). This meant that only comparisons between species that are obligately social versus species that are social during specific life stages could be made. This is problematic as assumptions need to be made about the strength of selection during different life cycle phases. For example, mortality caused by pathogens may be particularly high during the social juvenile phases of otherwise non-social species, resulting in selection for adaptations to reduce pathogen spread being similar to that in species that are obligately social. An additional problem was that experimental studies (a key factor highlighted by our analyses) of species that are non-social apart from specific life-cycle phases were rare (n=1, Rana latastei), precluding any meaningful comparisons.

      We have now added the following sentences to the methods to clarify this point:

      “We also collected data on whether species always lived in social groups (‘obligately social’) or whether species were only social during specific life stages (‘periodically social’). However, it was not possible to analyse this data as experimental manipulations of pathogens, a key factor influencing the relationship between relatedness and mortality and pathogen abundances, were only performed for one periodically social species (Rana latastei)” (Lines 425-430).

      b) Testing differences between species that typically live with kin and non-kin. The third aim of the paper was to test if species that typically live with kin have evolved to deal with pathogens as the referee suggests. We found that species that live with kin, such as social insects, have similar rates of mortality and pathogen abundances to species that live with non-kin (Figure 3). However, species that typically live with kin had lower rates of mortality in groups with higher relatedness when pathogens were absent compared to species that typically live with non-kin. This suggests that pathogens represent an omnipresent threat to all species, but that adaptations have evolved to reap the benefits of living with relatives in social species.

      In summary, as suggested by the referee we analysed whether “species made artificially to grow in groups of relatives, suffer much more from parasites than typical social animals with the same degree of relatedness” as much as was possible given the limitations of the published data. We have edited parts of the manuscript to emphasise that this was a key aim of the paper (Lines 66-74; 92-94; 136-153).

      The term (and concept) "monoculture" is typically used to describe clonal populations, predominantly in agricultural settings. I understand that the authors like to expand this term (as have others done before) to include social animals. However, for most people this would be a change in terminology and may cause misunderstandings. I would prefer if you could stick with the mainstream terminology and avoid pressing this concept into a new costume.

      We included the term “monoculture effect” to facilitate links to existing literature, both in the fields of agriculture and evolutionary biology (e.g. Ekroth et al. 2019). While we think that making the reader aware of relevant work in other fields is valuable, we understand its prominence could give the impression that we included agricultural studies. Therefore, we have removed it from the abstract, but have chosen to keep one reference to the monoculture effect in the introduction.

      Reviewer #2:

      This study uses an unusually broad comparative data set to disentangle the positive (relatedness) and negative (pathogen pressure) effects of living in groups. The authors largely succeed in this task even though the data do not allow answers to all outstanding issues. Not unexpectedly, experimental manipulation studies appear to be most informative. The results are broadly consistent with expectations based on kin-selection theory and clarify the effects of a number of important covariables. The study is thoroughly executed and innovative in its approach. I expect this study to be interesting for a broad readership and this method of searching literature data to have considerable impact. Some suggestions strengthening this paper are below:

      • I think it would be helpful for readers to have the Discussion start with a few lines on what your study achieved in language that is complementary to the abstract, perhaps followed by a brief explanation of which angles/ambiguities/challenges you will be taking up in the paragraphs to follow.

      We have now edited the beginning of the discussion in accordance with this suggestion. It reads:

      “Our analyses show that pathogens can increase rates of mortality in groups of relatives. The detrimental effects of pathogens were, however, counteracted by high relatedness reducing mortality when pathogens were rare, particularly in species that live in kin groups. Such contrasting effects of relatedness meant that experimental manipulations were crucial for detecting the costs and benefits of living with relatives when the presence of pathogens varied. Additionally, high relatedness resulted in more even abundances of pathogens across groups, but more variable rates of mortality, highlighting the importance of population genetic structure in explaining the epidemiology of diseases. We discuss these findings in relation to the environments favouring the evolution of different social systems, the mechanisms that have evolved to prevent disease spread in social groups, and the types of study system where more experimental data are required” (Lines 171-181).

      • The rationale of this study is (often implicitly) that tendencies to live with relatives or not is a continuous variable. This surprised me because the senior author has written influential papers showing that family groups are different from non-family groups. In some contexts of this study it seems crucial to make that distinction. For example, a number of data points come from studies of social insects (bumblebees, honeybees, ants). Here, living with non-relatives is not an option but a given. It is well documented that these caste-differentiated colonies originated from ancestors that had exclusively full-sib colonies, so maximal relatedness was ancestral and became only diluted secondarily in some lineages. Would it be possible to check statistically whether the social insect data points always showed the same pattern as the other data points? That would test whether it matters that low relatedness is either derived or ancestral (as I think we implicitly assume to be the case in all other organisms).

      The primary studies included in our analyses were conducted on a diverse set of species where relatedness was often reported and measured on a continuous scale (range 0 to 1). Our rationale and statistical treatment of the data (the effect size of Pearson’s correlation coefficient captures continuous variation in relatedness) reflect the measures reported in the primary studies. This does not mean, however, that we believe groups evolve along a continuum of within-group relatedness.

      As the referee points out, there are two distinct routes to group formation that set the limits to relatedness within groups. In species where offspring do not disperse from their natal patch (‘family’ groups) the opportunity for interacting with relatives is high, whereas in species where groups form after individuals disperse from natal patches (‘non-family’ groups) relatedness is typically low. Some variation in within-group relatedness subsequently arises within these two categories because of a number of modifying factors (breeder turnover, number of males and females founding groups, ‘budding’ dispersal and so on). However, the potential for kin selection to favour adaptations, including those that limit pathogen spread, remains fundamentally different between family and non-family groups. We tried to capture such differences by classifying species as typically living with kin and non-kin using life-history information (dispersal patterns, mating systems) and direct estimates of relatedness.

      We used the terms kin and non-kin rather than family and non-family because across such a diverse set of study species, with variable types of information (e.g. some species only had molecular genetic estimates of relatedness others had only life-history information), it was not possible to ascertain exactly how groups form for each species. Nevertheless, our analyses are aimed at addressing if species that typically live with kin, such as the social insects, have more effective mechanisms for reducing the impact of pathogens amongst relatives than species that live with non-kin.

      The referee makes an additional valuable point that for social insects ancestral levels of relatedness in groups are known to be high, with lower levels of relatedness being derived. Examining whether species with low versus high contemporary estimates of relatedness may therefore shed light on the importance of current versus past evolutionary responses to pathogens.

      Unfortunately, the sample sizes are just too limited to conduct any meaningful analyses. Only one species of social insect in our dataset was classified as living with non-kin (r <0.25). We also examined finer scale predictors of relatedness applicable to social insects (queen mating frequency: monogamous (r = 0.5) versus polyandrous (r > 0.25 & <0.5)). Sample sizes for crucial comparisons were again too small for formal analysis (Number of monogamous species with experimental data: pathogens present = 3, Pathogens absent = 3. Number of polyandrous species with experimental data: pathogens present = 2, Pathogens absent = 1).

      We have extended the discussion highlighting that more work on species with ancestral and derived levels of high and low levels of relatedness will aid our understanding of the evolutionary history of adaptations to minimise pathogen spread in groups (Lines 248-250). We have also checked and edited the manuscript to remove any implication that groups originate from a continuum of relatedness.

      • I wondered whether you could (interpretationally, i.e. in the discussion) do more with comparative data on pathogen pressure in the wild. The 1987 Hamilton chapter that you cite has lots of interesting natural history observations, which are now often supported by better data. I think he speculates about how altruistic soldiers evolved in aphids and thrips and connects their sociality with living in their own food (galls), which should mean low parasite pressure. The same is true for the lower termites. Would your results allow you to conjecture that all independent lineages that evolved differentiated castes (only possible in families with full siblings; or clones as in aphids) likely had to do that in disease free habitats?

      This is an interesting point and an area where further research would be very valuable. It fits in nicely with our current discussion of how the evolution of groups with high relatedness may be more likely to occur in environments where pathogens are rare. This was rather vertebrate-focused before and so we are grateful for the referee’s suggestion, which has broadened this point. The section now reads:

      “Parallel arguments have been made for social insects. Species with sterile worker castes, that only evolved in groups with high levels of relatedness, are thought to have arisen in environments protected from pathogens (Hamilton 1987). For example, sterile soldier castes have evolved at least six independent times in clonal groups of aphids, and the majority of these cases form galls that provide protection against pathogens (Hamilton, 1987; Stern and Foster, 1996). Escape from pathogens may therefore be a general feature governing the evolutionary origin, as well as the current ecological niches, of species living in highly related groups” (Lines 190-197).

      • I think some effort should be made to make Figures 2, 3 and 4 easier to interpret. The ultra-brief acronyms along the y-axis take a while to digest and to realize the nestedness of the analyses. Could you give one piece of information on the left axis (spelled out like 'experimental data' and 'observational data') and the other piece on the right axis (spelled out as 'pathogens absent' and 'pathogens present')? It would also be helpful if the reader could fully understand the figures without first having to go through the entire method section, so I recommend you extend the legend to explain: 1. What Zr stands for. 2. What the directionality is (so the cryptic line just below Zr can become a proper sentence in the legend), and 3. The rationale of the multifactorial analyses with four or eight combinations (as you describe in the methods; I believe Figure 4 is an example of eight, but this remains rather hazy).

      Many thanks for these suggestions. We have now revised the axis labels and figure legends to improve interpretability.
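
      For readers who, like the referee, want the Zr axis spelled out: in meta-analyses of correlations, Zr conventionally denotes Fisher's z-transformed Pearson correlation coefficient. The response does not state this explicitly, so the sketch below assumes that convention; the function names and the example values are illustrative only and are not taken from the study.

```python
import math


def fisher_z(r):
    """Fisher's z-transform of a Pearson correlation r (requires -1 < r < 1)."""
    return math.atanh(r)  # identical to 0.5 * ln((1 + r) / (1 - r))


def fisher_z_variance(n):
    """Approximate sampling variance of Zr for a sample of n > 3 observations."""
    return 1.0 / (n - 3)


def back_transform(z):
    """Convert Zr back to the correlation scale."""
    return math.tanh(z)


# Hypothetical example: a correlation between relatedness and mortality
# of r = 0.5 estimated from n = 23 groups (made-up numbers).
zr = fisher_z(0.5)
var_zr = fisher_z_variance(23)
r_again = back_transform(zr)
```

      Under this convention, the sign of Zr carries the same direction as the underlying correlation, and the variance depends only on sample size, which is why the transform is convenient for weighting studies.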

    1. Reviewer #3 (Public Review): 

      Brochet et al. find that four species of the Lactobacillus Firm-5 lineage, one of the core bacterial lineages of the honey bee microbiome, are able to coexist because they utilize different pollen-derived flavonoids and sugars. They demonstrated this both in vivo, in gnotobiotic bees, and in vitro with laboratory co-cultures. Simple yet robust experiments involving diet or growth media with just simple sugars resulted in loss of diversity, whereas diets and media supplemented with pollen allowed the persistence of all four Firm-5 species over multiple serial passages. The authors then proceeded to examine the genes that were differentially expressed in response to different nutrient growth conditions, as well as the presence of metabolites to infer utilization of pollen-derived nutrients. The results paint a convincing picture of niche partitioning via differentiation in both encoded metabolic capabilities and in the differential expression of commonly encoded genes among co-resident bacterial species. 

      Overall, the paper is strong and the arguments and conclusions put forth are well supported by the data. I only have a few suggestions: 

      1) The study focuses on one strain each of the 4 Firm-5 species; however, there is diversity within each species. This is only briefly mentioned in the paper at the very end, and I think the authors should address this a bit more directly. In particular, they have previously generated a large amount of genomic data from some of these other strains, so it is likely possible to infer or speculate, based on this data, whether they expect different strains within each species to utilize similar nutrients. Also, I'm wondering if the authors can comment on how their findings could extend to the related bumble bee gut microbiome. Such a discussion would help enhance the applicability and importance of this study. 

      2) It is interesting that different species ended up dominating in the in vivo vs. in vitro simple sugar-based communities. What do the authors think may be behind this difference? 

      3) Since the observed coexistence of these gut microbes is largely due to nutritional niche partitioning, it would be helpful if the authors can comment on the natural variation of key pollen derived metabolites, and if/how we could expect ecological variation in the bee microbiome due to plant pollen availability based on biogeography and seasonality. 

      4) The supplementary information is nicely documented and accessible, but I think it would be even more useful if genome-wide data for the RNA-seq results, not just for select genes, are made available. Furthermore, I suggest including descriptive titles and labels within the supplementary Excel files, as there are many separate sheets and it is not always clear what each one shows.

    1. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.

      His definition of a Memex is simply a mechanized (or what we would now call digitized) commonplace book, which has a long history in the literature of knowledge management.


      I'll note here that he's somehow still stuck on the mechanical engineering idea of mechanized. Despite the fact that he was the advisor to Claude Shannon, father of the digital revolution, he is still thinking in terms of mechanical pipes, levers, and fluids. He literally had Shannon building a computer out of pipes and fluid while he was a student at MIT.

    2. One cannot hope thus to equal the speed and flexibility with which the mind follows an associative trail, but it should be possible to beat the mind decisively in regard to the permanence and clarity of the items resurrected from storage.

      the idea of an "[[associative trail]]" here brings to mind both the ars memorativa and the method of loci as well as--even more specifically--the idea of songlines.

      Bush's version is the same thing simply renamed.

      <small><cite class='h-cite ht'> <span class='p-author h-card'>Jeremy Dean</span> in Via: ‘What I Really Want Is Someone Rolling Around in the Text’ - The New York Times (<time class='dt-published'>06/09/2021 14:50:00</time>)</cite></small>

    3. just as though he had the physical page before him.

      Strange that we also want to do more than the material is capable of, but we still want the sense of material interaction. Why?

    4. No human vocal chords entered into the procedure at any point;

      I thought he was going to get to benefits for health/medicine/disabilities but alas...it is all reviewed as a benefit for masculine and able-bodied intellectualism

    5. A record if it is to be useful to science, must be continuously extended, it must be stored, and above all it must be consulted.

      I'd disagree with this. Old and forgotten media have had great use; if they were not extended or needed that doesn't cancel out their former use (thinking of the work of Lisa Gitelman here)

    6. square-rigged ships.

      What a weird metaphor! A square-rigged ship—I think—was a type of vessel that had been improved over several hundreds of years and was commonly used in the nineteenth century. Bush seems to be using it to represent something that was formerly considered a hallmark of civilization (a tool of conquest, nationalism, exploration), but was outdated in a 20thC technological environment

    7. The investigator is staggered by the findings and conclusions of thousands of other workers—conclusions which he cannot find time to grasp, much less to remember, as they appear.

      This astonishment at the "findings and conclusions of thousands of other workers" seems connected to our "information overload" not only in the amount of information available, but in how we relate to it. There is so much out there; is this useful in de-centering individual intellectual authority, or harmful in making all discourse relative?

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      The authors wish to thank all three Reviewers for their appreciative comments regarding our ECPT and for very useful suggestions. Responses to all points raised are presented below; we hope that the responses and new experiments proposed in the following pages will fully address the remaining concerns.

      Reviewers’ comments on the bioRxiv paper by Chesnais et al, 2021

      “High content Image Analysis to study phenotypic heterogeneity in endothelial cell monolayers”

      https://www.biorxiv.org/content/10.1101/2020.11.17.362277v3


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors highlight the importance of endothelial heterogeneity using endothelial cells from different tissues. They examined aortic and pulmonary endothelium as well as HUVECs. They cultured the cells in identical conditions and also stimulated them with a physiological concentration of vascular endothelial growth factor as well as high concentrations as would be found in cancers. They developed a profiling tool that allowed analysis of individual endothelial cells within a monolayer and quantification of inter-endothelial junctions, Notch activation, proliferation and other features.

      **Major comments**

      1. It would be useful to apply this technology one step beyond two-dimensional culture, to use vessels opened up longitudinally so that one can see the monolayer of endothelial cells and assess whether it is relevant in primary material in situ. I think this would be a major utility of the whole approach.

      R: We thank this reviewer for the suggestion. In vivo analysis is not in the objectives of the paper. However, we propose to perform “En face” staining of murine blood vessels following the protocol in the reference below. We will perform stainings for murine CDH5 (VE-Cadherin), NOTCH1 intracellular domain, HES1 and DNA which parallel that used in vitro on human EC. We will then apply our revised ECPT workflow and present data in a new Figure.

      En Face Preparation of Mouse Blood Vessels. Ko KA, Fujiwara K, Krishnan S, Abe JI. J Vis Exp. 2017 May 19;(123):55460. doi: 10.3791/55460. PMID: 28570508

      2. There are some very nice images here, but I was disappointed not to see a field that could show staining and markers for several of the target proteins and thus show the heterogeneity and randomness or organisation of the endothelial cells.

      R: We thank the reviewer for the appreciative comment. We propose to include representative microphotographs to illustrate the heterogeneity of different EC monolayers in the revised version of the manuscript. Furthermore, to further illustrate these aspects we will also include spatial correlation maps of cells and features measured with ECPT as explained below.





      3.

      • The Notch signalling is an important aspect of this work, particularly evidence of lateral inhibition would have been of value. For example, one might expect cells adjacent to each other to have alternating high and low NICD.

      R: We thank the reviewer for the suggestion. To address this, we are currently developing a new module to perform spatial autocorrelation analysis based on cell maps built using ECPT. In particular, we have developed a new module to export cell maps as spatial objects in R, which can then be analysed using the adespatial R package to provide correlation metrics such as Moran’s autocorrelation index (see reference below). The index works with continuous data, removing the need to establish arbitrary thresholds, and thus provides formal metrics to demonstrate heterogeneity in EC monolayers. We have derived this index as an example of such analysis for synthetic data and for one ECPT cell map as shown below.

      Figure 1: Moran’s spatial autocorrelation analysis using R and adespatial package. Moran’s index has values between –1 and 1. If adjacent cells had a consistent tendency to acquire alternate high and low NICD values, the corresponding bivariate Moran’s index would have an I+ value ~ 0 and an I- value approaching -1. In the example cell map both I+ and I- have relatively small absolute values and large p values which suggest a random cell distribution. The analysis was performed on synthetic data and ECPT derived data (HUVEC at baseline).


      Community ecology in the age of multivariate multiscale spatial analysis

      S Dray et al, Ecological Monographs, 2012. doi:10.1890/11-1183.1
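To make the expected behaviour of the index concrete, the following is a minimal sketch of the global (univariate) Moran's I on a regular grid. It is not the adespatial implementation and does not reproduce its I+/I- decomposition; the grid layout, the rook (edge-sharing) neighbourhood and the function names are assumptions chosen for illustration only. A perfect alternation of high and low values, as lateral inhibition would ideally produce, gives an index close to -1, while two contiguous blocks of like values give a clearly positive index.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for one continuous variable.

    values  : 1-D array of per-cell measurements (e.g. nuclear intensity)
    weights : square spatial weight matrix, weights[i, j] > 0 if i and j are neighbours
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                      # deviations from the mean
    numerator = (w * np.outer(z, z)).sum()
    denominator = (z ** 2).sum()
    return (x.size / w.sum()) * numerator / denominator


def rook_weights(n_rows, n_cols):
    """Binary weights for a regular grid: cells sharing an edge are neighbours."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if r + 1 < n_rows:            # neighbour below
                j = (r + 1) * n_cols + c
                w[i, j] = w[j, i] = 1.0
            if c + 1 < n_cols:            # neighbour to the right
                j = i + 1
                w[i, j] = w[j, i] = 1.0
    return w


def checkerboard(n_rows, n_cols):
    """Synthetic map with strictly alternating low (0) and high (1) values."""
    rows, cols = np.indices((n_rows, n_cols))
    return ((rows + cols) % 2).astype(float).ravel()


if __name__ == "__main__":
    w = rook_weights(6, 6)
    alternating = checkerboard(6, 6)
    # Each row reads 0, 0, 0, 1, 1, 1: two contiguous blocks of like values.
    two_blocks = np.tile(np.repeat([0.0, 1.0], 3), 6)
    print("alternating:", round(morans_i(alternating, w), 3))
    print("two blocks :", round(morans_i(two_blocks, w), 3))
```

Real cell maps are irregular, so in practice the weight matrix would be built from segmented cell contacts or centroid distances rather than from a grid.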

      • NICD staining alone does not score the extent of the signalling, because many factors can influence the transport of the cleaved NICD. A downstream marker of Notch signalling (e.g. HES or HEY family, DLL4) is really needed to give more information about this critical aspect.

      R: We thank the reviewer for the suggestion. We are currently performing HES1 staining (without PhA staining) along with a new NICD mAb (see below). Preliminary qualitative data (Fig 2) show that HES1 staining also reveals single-cell heterogeneity of NOTCH activation in the same monolayer. We will include ECPT analysis of HES1 and its correlation with NICD and other features, as suggested. We will reformat the current Fig 5 to include HES1 analysis and improved metrics of NOTCH pathway activation, including spatial analysis (point 3 above).

      Figure 2: HES1 immunostaining on HUVEC (image enhanced for visualisation). Cell nuclei labelled 1, 2 and 3 have raw mean grey values of HES1 signal equal to 2271, 11210 and 48261, respectively (C2/C1 and C3/C2 both >4-fold).




      I really do not think that in Figure 5 it is justified to have a red line drawn through the cloud of points. The correlation coefficient is so low that this is meaningless. The failure to distinguish a P value from biological relevance is worrying. A much better comparison would have been between NICD staining and a downstream gene regulated by Notch.

      R: We appreciate the reviewer’s concerns and are presenting our analyses of NOTCH activation using new immunostainings (HES1) and robust metrics for NOTCH activation, as discussed above. We will therefore remove the mentioned correlation plots from the revised version of the manuscript.

      It is important to know that the antibodies used for staining have been validated by the investigators. They would need to show a single band on Western blots or be able to block staining on immunohistochemistry. We all know the manufacturers can be unreliable and use high concentrations of proteins for Western blots. These should be added as a supplementary figure.

      R: While the paper was under revision, the AB8925 antibody (NICD, Abcam) was withdrawn from the market. To address this major concern, we have acquired a different antibody targeting the intracellular portion of the NOTCH receptor and validated its specificity by western blot. Fig 3 below shows western blots demonstrating a clean band at ~98 kDa, as expected for cleaved NOTCH1 intracellular domain (NICD).

      We are currently repeating the whole experiment presented in the current version of the manuscript, and the ECPT analysis, using the new antibody and including HES1, one of the canonical NOTCH target genes, as also suggested by this and other Reviewers. We will provide WB analysis of all antibodies used in the paper in a supplementary figure in the revised manuscript.

      Figure 3: WB analysis (NOTCH1 intracellular domain, AB52627, Abcam) of HUVEC (lanes 2,3), HAoEC (lanes 4,5) and HPMEC (lanes 6,7).

      Reviewer #1 (Significance (Required)):

      This represents a valuable and thorough methodology likely to be highly useful to many groups and show new insights into endothelial biology.

      Wide audience, cancer, cardiology, vascular disease-covid.

      My expertise: >100 papers on angiogenesis in cancer, basic mechanism, therapy models, bioinformatics, IHC, patients, clinical trials. H score 190 (Google Scholar).

      R: We thank Reviewer One for their very appreciative comments and we hope that the proposed revisions will fully address remaining concerns.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Chesnais et al reports development of a workflow for analysis of cultured endothelial cells, which they call the Endothelial Cell Profiling Tool (ECPT). Using ECPT they analyse several parameters in three different endothelial cell types (HuAEC, HUVEC and HPMEC), such as cell morphology, activation of cytoskeleton, VE-cadherin junctions, cell proliferation and Notch activation, under steady conditions and upon treatment with VEGF. The analysis reveals some predicted changes, such as an increase in cell cycle and junctional activation in cells treated with VEGF-A, and such changes are highly heterogeneous. Overall, this is a potentially useful, albeit not revolutionary, tool for batch analysis of cultured endothelial cell phenotypes.

      I have the following comments:

      1. To make their case, the authors should provide a comparison with other currently used approaches for EC phenotypic analysis in vitro: what is the advantage of using ECPT? The authors repeatedly use the term "single-cell level of analysis", but this is in fact the case for any IF-based analysis of cultured cells.

      R: We thank the reviewer for the suggestions. Indeed, several tools for imaging-based single-cell phenotyping are available. However, ECPT represents an improvement in several respects. First, it allows improved segmentation of difficult-to-segment and heterogeneous cells; second, ECPT allows multi-parametric analysis of large image datasets in a semi-automated and structured way, facilitating downstream data analysis; third, ECPT is open source.

      Furthermore, ECPT is a very flexible workflow including tools which facilitate and automate several tasks, such as systematic image re-labelling and grouping. We will draft a table with a complete list of features and improvements in comparison to other available tools and include it in appendix 1 of the revised manuscript. We will also include analyses which are not implemented in any currently available software, such as spatial autocorrelation.

      I strongly recommend to stain HPMECs for PROX1, these cells are frequently 100% lymphatic endothelial cells. In this case the authors compare different lineages and not blood endothelial cells from different locations.

      R: We thank the reviewer for the suggestion. We will address this with a new characterisation as a supplementary figure in the revised manuscript. We are currently performing a qRT-PCR screening of several EC markers, including arterial, venous and lymphatic markers (e.g., CXCR4, Tie2, CDH5, PROX1, LYVE1), as well as baseline NOTCH1 and Dll4 and downstream genes such as HES1 and HEY2.

      Please provide evidence for specificity of NICD antibody.

      R: We thank this reviewer for the suggestion. Please see response to Reviewer one point 5.

      Figure 1: HPMEC picture appears out of focus

      R: We thank this reviewer for noticing; we will include a clearer picture in the revised version of the manuscript.

      Figure 3 A - it is not entirely clear what is the difference between activated and stressed phenotype, they look quite similar.


      R: We will clarify the definitions of cell activation in the revised version of the manuscript and present this analysis as supplementary material, to demonstrate the flexibility of ECPT, rather than in the main figures. We have removed PhA staining from the new experiments we are performing, to allow HES1 staining and address NOTCH signalling in more detail. The assessment of PhA and stress fibres in previous experiments will be moved to supplementary material. The classification is based on PhA staining using a CPA classifier which was trained to distinguish between the two phenotypes by the presence of stress fibres. The general rule during training of the CPA model was to place cells with stress fibres crossing the nucleus in the stressed category, and cells with peripheral bundles of actin in the activated category.


      Figure 5 - what is the difference in NICD localization between "high" and "On" conditions?

      R: Since it has been noted by this and other reviewers that this classification might be difficult to interpret and that, in fact, the established thresholds are somewhat arbitrary, we will completely revise the way we present the analysis of NOTCH activation data, including downstream analysis and more formal metrics of spatial correlation and extent of activation, eliminating the need to impose thresholds (also see response to Reviewer One point 3).

      Since the authors make a correlation between Notch activity and junctional stabilization, it would be important to confirm this by other means, such as analysis of Notch target genes.

      R: We thank this reviewer for the comment, which resonates with other Reviewers’ comments. We will include HES1 analysis in the revised manuscript; please see the response to point 6 and to Reviewer One point 3 above.


      **Technical and minor**

      1. Methods mentions HDMECs (human dermal microvascular endothelial cells) but the authors discuss HPMEC throughout the text.
      2. Please add scale bars on all microscopy pictures.
      3. Please provide the information on what isoform of VEGF-A was used for stimulation and the rationale for selecting the concentration.

      R: We thank this reviewer for flagging these imprecisions; we will fix them in the revised version of the manuscript.

      Reviewer #2 (Significance (Required)):

      The authors provide a workflow for the phenotypic analysis of cultured cells. Such tool is potentially useful, although the examples the authors show do not reveal striking examples of why such analysis is better in comparison to existing approaches. My guess is that the analysis may be faster and less tedious, once the training sets are generated, but this is not specified. My speciality is endothelial cells biology.

      R: We thank this reviewer for their very useful and appreciative comments. As mentioned above, we will expand appendix 1 to fully explain the potential and utility of ECPT and revise the main text to clearly highlight its main advantages. We hope that our plan for revision will fully address remaining concerns.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **SUMMARY:**

      The manuscript by Chesnais et al. presents a novel endothelial cell (EC) profiling tool (ECPT) which provides spatial and phenotypic information from individual ECs, and was tested with a variety of specialized EC subtypes (arterial, venous, microvascular). They present a high throughput immunostaining and imaging-based platform using culture of human ECs on 96-well plates and capture of fixed, stained samples on a Perkin Elmer Operetta CLS system. The authors report the use of this ECPT tool to investigate EC phenotypes from human umbilical vein ECs (HUVEC), human aortic ECs (HAoEC) and human pulmonary microvascular (HPMEC) in relation to 50 ng/mL VEGF stimulation for 48 hours, and the general parameters of proliferation, Notch activation and stress fiber rearrangement (F-actin), and present this as a prospective platform to examine differences in EC phenotypes and responses at a more individualized level.

      **MAJOR COMMENTS:**

      1. Fundamentally, the advantage of single cell technologies is the ability to segregate populations to make novel observations. One area that would be of interest to explore in this manuscript using this ECPT platform would be reporting the results from single cell analysis that is then subsequently pooled within a sub-population, rather than sub-stratifying populations to reflect the multiple phenotypes that may be present within a single "confluent" well. With analysis of EC heterogeneity, it would be of interest to differentiate heterogeneity within EC subtypes at the culture/treatment conditions presented, and heterogeneity between EC subtypes.

      R: We thank this reviewer for the suggestions. We believe that the new approach to evaluating heterogeneity through spatial autocorrelation can provide a much better and clearer picture of this aspect (see responses to Reviewer One point 3 and Reviewer Two points 6 and 7). Furthermore, we are currently restructuring the ECPT data structure into a more intuitive layout (a list of lists rather than a single huge data frame) without affecting downstream data presentation. We will also update our Shiny App to enable the user to perform analyses on data subsets of interest without any R coding, and we will present examples and a walkthrough of this approach in the appendix.

      2.

      The term "stable IEJ" is used and refers to 48h after seeding 40,000 cells on a 96-well plate, but it is unclear how the authors defined or demonstrated a "stable" junction. In previous reports, longer-term culturing of EC monolayers well beyond the point of confluence has been shown to result in junctional complex rearrangement (Andriopoulou P et al. Arterioscler Thromb Vasc Biol. 1999; reviewed in Bazzoni G & Dejana E. Physiol Rev. 2004). To this point, the fact that the different EC subtypes investigated had different percentages of "quiescent cells" suggests that the monolayers were not completely quiescent. The statement that the IEJ classification is "an immediate index of EC activation in contrast to quiescence" should be further supported by references or data. The definition of quiescent EC as simply non-proliferating, non-migrating is somewhat reductionist, and oversimplifies EC states. The authors state that HAoEC and HUVEC "...appeared more active...", but it is unclear what "active" means, and whether this may simply reflect that these cells had not yet reached confluence or quiescence in the 48h total culture time. As well, it is unclear how "migratory phenotypes" could occur in confluent monolayers. It would be helpful to see the data for these observations. If leaving ECs longer in culture, are the authors able to achieve a higher percentage of quiescent cells?


      R: We thank this reviewer for the very insightful comments and for suggesting the references. Indeed, we considered these aspects carefully. Regarding cell culture density and confluency, we previously tested seeding densities of 30000-60000 cells/well in 96-well plates (0.32 cm2, ~95000-190000 cells/cm2) and selected 40000 as the maximum seeding density allowing adhesion of >99% of cells. For HUVEC, a seeding density of 40000 cells/well (125000 cells/cm2) produced a high-density culture immediately after seeding (close to that reported for long-confluent cultures in Andriopoulou P et al, ATVB 1999, 140000 cells/cm2). We allowed a further 48h of culture, aiming to achieve junctional “stabilisation” and “maximal” cell density. For consistency, we also seeded 40000 HAoEC and HPMEC per well in all our experiments; however, both cell types are significantly larger than HUVEC (Fig 4). For all cell cultures we used EGM2 medium, which differs in a few respects from that reported in Andriopoulou P et al, namely the absence of antibiotics and antimycotics and the use of a defined cocktail of recombinant growth factors instead of Endothelial Cell Growth Supplement. In the past we compared HUVEC cultured in EGM2 and in supplemented M199 medium; in our experience EGM2 promotes higher proliferation rates in sub-confluent cultures but similar morphology upon confluency. It is notable that several other factors (including flow, matrix, perivascular cells) are absent in our culture conditions, and therefore the homeostatic balance found in vivo might not be fully achievable under our experimental conditions. However, we argue that the described culture conditions should be sufficient to reach a bona fide, relatively quiescent EC phenotype in culture.
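The per-area densities quoted above follow directly from the stated growth area of 0.32 cm2 per well. A short sketch of the conversion is given here for readers who want to check the figures or adapt them to another plate format; the function and constant names are illustrative only.

```python
# Growth area of one well of a 96-well plate, as stated in the response (cm^2).
WELL_AREA_CM2 = 0.32


def density_per_cm2(cells_per_well, area_cm2=WELL_AREA_CM2):
    """Convert a per-well seeding number into cells per cm^2."""
    return cells_per_well / area_cm2


if __name__ == "__main__":
    for cells in (30000, 40000, 60000):
        print(f"{cells} cells/well -> {round(density_per_cm2(cells))} cells/cm2")
```

The tested range of 30000-60000 cells/well corresponds to 93750-187500 cells/cm2, consistent with the rounded range quoted in the response, and 40000 cells/well gives 125000 cells/cm2.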

      Notwithstanding these considerations, we agree with this reviewer that providing examples of longer-term cultures would help substantiate our findings and further validate the ECPT approach. We will perform a supplementary experiment to evaluate this aspect by comparing 48h cultures with longer culture times (72h and 96h). Furthermore, we will expand the methods section with the details discussed above and in relation to the suggested references.


      Regarding the definition of “stable IEJ” and “active EC”, we used this terminology to refer exclusively to our measures of IEJ stability (STB index) and to the PhA-based cell classification, where we used the terms “quiescent”, “active” or “stressed”. Therefore, all statements mentioning more or less “stable IEJ” or “active” EC are relative to the specific context of our experiment (not in absolute terms).

      Overall, we appreciate that the terminology we employed is a source of confusion and might suggest inappropriate over-interpretation of our results. We will correct the text in the manuscript to avoid this confusion and to clarify that our observations are valid within the context of our in vitro conditions. In particular, we will present the data regarding junctions as proportions of different junction types per cell, and we will rename cell “activation” categories based on PhA staining using more neutral terms (e.g., No Fibres, Peripheral Bundles, Stress Fibres). Finally, we will also attempt to generalise our observations to a more physiological context by performing immunostaining on “en face” preparations of murine blood vessels (cf. response to R1 point 1).

      Fig 4: Cell area density distribution for HUVEC, HAoEC and HPMEC in baseline conditions.

      Could the authors comment on the baseline NICD immunoreactivity in the nuclei in HAoEC and HPMEC compared to HUVEC? Is this a reflection of active NOTCH signaling? Or rather, is it possible contact-inhibition (and downregulation of NOTCH) may not have occurred? Demonstration of EC quiescence would help to ensure similar cell cycle states. The definition of "Notch-positive" and "Notch-negative" cells is a bit misleading, as NICD levels and localization are a better indication of canonical Notch activation, and not the presence or absence of Notch protein(s). Further, NICD activation is also dependent on the levels of Notch ligands, which was not addressed. Are the authors able to confirm "OFF", "Low", "High", and "ON" classifications based on NICD intensity and localization with downstream Notch gene activation at a single-cell level? Or correlation between NICD status and the phase of cell cycle or proliferation status?

      R: We thank this reviewer for the comment. Overall, NICD, whether nuclear or cytoplasmic, can give a measure of how much a cell is relaying canonical NOTCH signalling on a short timescale (minutes, which is also the timescale affected by lateral inhibition; Sjoqvist M and Andersson ER, Dev Biol, 2019). By evaluating single cells in the context of their population in multiple fields of view and samples, we can get an indication of how frequently a particular cell type tends to actively transduce canonical NOTCH (under confluent conditions). As this and other reviewers have pointed out, NOTCH signal transduction mediated by NICD can be affected by several factors, limiting the potential to infer actual activation of the pathway (i.e., downstream gene transcription). As suggested by this and other reviewers, we are including measures of downstream gene activation; in particular, we have included HES1 staining in our workflow, and we will include these data in a new analysis (also see response to R1 point 3). We will also provide new metrics of spatial autocorrelation to evaluate the tendency to lateral inhibition (R1 point 3) and the correlation between parameters using continuous measures, and we will therefore remove the previous classification based on thresholds. Finally, we are performing a qRT-PCR screening to assess baseline levels of DLL4, NOTCH1 and JAG1, which we will present as supplementary material.

      Do as I say, Not(ch) as I do: Lateral control of cell fate

      Sjoqvist M and Andersson ER, Dev Biol, 2019

      PMID: 28969930

      The existing workflow/platform is adapted for images obtained from the Operetta CLS system (Perkin Elmer) and Harmony software (proprietary), which may not be available for broader users in the EC field. It would be helpful to include ImageJ macros for the bulk automatic import of TIFF, renaming and upscaling of resolution/bit quality to match the formats that are compatible with the software.

      R: We thank this reviewer for the comment. We have now included an ImageJ macro (available in the GitHub repository) which in principle can import and process images from any source. We did not include a specific option in our current user interface because the relabelling operates by parsing the original filenames into fields, which are then renamed according to user input, and each HT platform adopts a different filename convention that requires its own regular expression. Any user with basic literacy in ImageJ macro scripting can relabel and process their own files, provided that their filenames follow a pattern which can be parsed with regular expressions. It is also relatively easy (again by modifying the macro) to include user-defined pre-processing steps, including image scaling. An example of a parsing method for Operetta CLS filenames is provided in appendix 1.
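The parsing-and-renaming idea described above can be sketched as follows. This is not the authors' ImageJ macro: it is written in Python for brevity, the filename layout (`r<row>c<col>f<field>p<plane>-ch<channel>...`) is an assumption made for illustration, and the output naming scheme and channel labels are hypothetical. A user with a different platform would only need to swap the regular expression.

```python
import re

# ASSUMED filename layout, for illustration only:
#   r02c03f01p01-ch2sk1fk1fl1.tiff  (row, column, field, plane, channel)
FILENAME_PATTERN = re.compile(
    r"r(?P<row>\d+)c(?P<col>\d+)f(?P<field>\d+)p(?P<plane>\d+)-ch(?P<channel>\d+)"
)


def parse_filename(name):
    """Split a filename into integer fields, or raise if it does not match."""
    match = FILENAME_PATTERN.search(name)
    if match is None:
        raise ValueError(f"filename does not match the expected pattern: {name}")
    return {key: int(value) for key, value in match.groupdict().items()}


def relabel(name, channel_names):
    """Build a new, human-readable filename (hypothetical naming scheme).

    channel_names maps channel numbers to user-supplied stain labels.
    """
    fields = parse_filename(name)
    row_letter = chr(ord("A") + fields["row"] - 1)      # row 1 -> A, row 2 -> B, ...
    well = f"{row_letter}{fields['col']:02d}"
    stain = channel_names.get(fields["channel"], f"ch{fields['channel']}")
    return f"{well}_f{fields['field']:02d}_{stain}.tiff"


if __name__ == "__main__":
    stains = {1: "DNA", 2: "VEC"}                        # user input, hypothetical labels
    print(relabel("r02c03f01p01-ch2sk1fk1fl1.tiff", stains))
```

Keeping the pattern in one named constant means that supporting another acquisition platform is a one-line change, which is the flexibility the response describes.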

      Could the authors comment on the manpower (hours from start to finish for experiments, staining, imaging, analysis, etc.) and cost of the ECPT pipeline relative to emerging single cell technologies such as single cell-RNA sequencing.

      Further, one major advantage of imaging technologies is the ability to assess live cell dynamics, which are particularly relevant in response to stimuli and agonists. Have the authors utilized the ECPT platform for these approaches, in particular, to assess the differential EC subtype dynamics in proliferative conditions?

      R: In terms of manpower, the workflow is not very demanding. Our current dataset is based on images extracted from 4 independent experiments (18 wells each). The process is sequential; therefore, a single user trained in cell biology, automated microscopy and the use of the different ECPT components (ImageJ, CP, CPA and R) could perform the experiment alone. The timing of each experiment will depend on circumstantial factors. However, once ECPT is trained for a specific user’s requirements (which can require some trial and error, depending on the user’s experience), the whole process, from cell fixation and staining, through image acquisition (~2 h acquisition for each experiment on an Operetta CLS system), to dataset build-up, can take less than one week. For example, processing the current image database (~6000 images for four fluorescence channels), from which the data presented throughout are derived, required the following raw processing times on a 2017 MacBook Pro equipped with an Intel i7 processor and 16 GB of RAM:

      - Image pre-processing and relabelling ~1h

      - Generation of probability maps for VEC and NICD ~3h

      - CP pipeline run (Cell segmentation, objects measurements and classification) ~16h

      - Data import (R studio) ~20m
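Summing the approximate step timings listed above gives the total computational time for the current dataset. The sketch below simply adds the four quoted durations; the step labels are copied from the list and the values are the approximate figures reported there, not new measurements.

```python
from datetime import timedelta

# Approximate processing times quoted in the list above.
steps = {
    "Image pre-processing and relabelling": timedelta(hours=1),
    "Generation of probability maps for VEC and NICD": timedelta(hours=3),
    "CP pipeline run": timedelta(hours=16),
    "Data import": timedelta(minutes=20),
}

# Start the sum from a zero timedelta so that the result is a timedelta.
total = sum(steps.values(), timedelta())

if __name__ == "__main__":
    print("Total processing time:", total)  # 20:20:00
```

The computational part therefore amounts to roughly 20 hours, most of it unattended, which is compatible with the stated overall duration of less than one week.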


      We will measure these parameters more precisely in the new experimental run and present timings for each step in a new table in appendix 1.


      Once the main dataset is created, RStudio can perform most statistical analyses and data plotting almost instantly.


      We fully appreciate the value of employing ECPT in live imaging setups and we believe it is one of the most promising future applications. We did not address live microscopy experiments in the context of the ECPT development and validation presented in the current manuscript; therefore, we cannot present example data or a proof of concept. However, we can confidently comment that time-lapse experiments would not add further layers of complexity in terms of the image analysis workflow. Therefore, given an appropriate set of live markers (e.g., transgenic fluorescently tagged CDH5 for EC segmentation and junction analysis), we believe that the current implementation of ECPT is already fully equipped to facilitate processing and analysis of imaging data derived from time-lapse experiments.

      The authors should discuss the ability to amend or revise the ECPT platform to incorporate analysis of additional markers that may be obtained through imaging, and discuss the greater implications and utility of specifically tailoring the workflow for other researchers in vascular biology, or for other monolayer culture systems. Further, they should better highlight the novel observations obtained with the ECPT compared to traditional methodology.

      R: We thank this reviewer for the comments. We will provide evidence of ECPT flexibility within this manuscript by including, during the time of this review process, a new analysis of downstream NOTCH signalling (HES1). We will move the analysis of cell “activation” (i.e., stress fibre analysis) to supplementary information and include a more thorough discussion of how automated single-cell classification could improve the content, speed, reliability and robustness of quantification tasks which are currently subject to long and tedious processing times and conscious/unconscious observer biases.

      **MINOR COMMENTS:**

      We thank this reviewer for the very thorough revision of the manuscript, which is truly invaluable in helping us to improve it. Responses to specific technical points are given below; we will fix all stylistic, formatting and typographical issues in the revised version of the manuscript.

      1. There are minor typographical, capitalization and grammatical errors throughout.

      R1: Thanks, we will fix these in the updated version of the manuscript.

      Why was fibronectin used to coat plates, and what was rationale for using this ECM substrate versus gelatin (most commonly used in EC cultures) or type I collagen?

      R2: We used fibronectin for immunostaining experiments, similar to what was reported in our previous work (Veschini et al, 2007, 2011; Wiseman et al, 2019) and in Andriopoulou P et al, 1999. In general, in our experience FN gives better cell adhesion than gelatin when culturing EC on glass or other substrates different from cell culture plastic. FN is the cell culture substrate recommended by Promocell; therefore, we also used FN for cell expansion, to avoid any phenotypic change which might have been caused by a switch in cell culture substrate.

      3. Based on the various box plots present throughout the figures, it appears that some parameters have a large range of values. Is it possible or helpful to set minimum and maximum exclusionary criteria? Further, in the way that these data are presented, it is difficult to appreciate the effects of a treatment such as 48h of VEGF, as the magnitude of STB Index difference, for example, appears small, and it is difficult to understand whether these significant differences are biologically relevant, as assessed.

      R3: We agree that in the absence of exclusion criteria it is difficult to infer biological meaning from subtle differences (e.g., the tiny difference in STB index between HAoEC in the presence or absence of VEGF). In the current version of the manuscript, we attempted to be agnostic as to whether some observed small but significant mean differences could carry biological meaning, and discussed larger variations as biologically meaningful, for example the differences in STB index among cell types. We argue that tiny differences in the distribution of some selected parameter across experimental conditions could reflect underlying mechanisms masked by biological noise; therefore, catching a glimpse of these variations via ECPT could inspire novel experiments to specifically address their full biological significance.

      In the interest of making the current manuscript easier to understand, we will re-analyse our data to provide more immediate metrics and highlight outstanding features.


      Use of arrows and further description in Figure 1 would help the reader understand what specific features are different in the various EC subtypes. As well, the representative micrographs for HPMEC appear blurry compared to other panels (Fig. 1).

      In Figure 2, the panels in A, B and C do not correspond horizontally, and it may be clearer to demonstrate "Segmentation & features extraction" overlays from the same representative micrographs shown in panel A. Labelling of the individual panels and of the software used for panel B would help the readership understand what is being quantified and how. The second panel in "C" appears blurry.

      In Figure 3, labelling the color code for quiescent, activated and stressed categories on graphs and in legend would be helpful to easily identify populations.

      R4-6: Thanks, we will fix these in the updated version of the manuscript.

      For Figure 4, line separators or more obvious grouping would help to distinguish discontinuous, linear and stabilized junction types in panel A. What proportion of the different EC subtypes contains discontinuous, linear and stabilized junctions at confluence/quiescence? Is there a correlation between discontinuous junctions and proliferating cells?


      R7: We will perform new analyses to address the correlation between proliferation and junctions, and between proliferation and HES1. We will restructure data presentation on junctions to display the proportions of different junction types per cell or per cell type, rather than a unified value (STB index).


      It would be useful to distinguish the effects of published mediators on junctional integrity in intact EC monolayers (i.e. histamine; VEGF) from those shown in this automated quantitation. It appears that 50 ng/mL of VEGF treatment for 48h only slightly increases STB index based on panel C.

      R7c: We agree that the increase of STB index in HAoEC and HPMEC upon VEGF treatment might not be highly biologically meaningful, notwithstanding the considerations in point 3 above. However, the difference in HUVEC (±VEGF) is visually appreciable in images (i.e., VEGF-treated HUVEC seem to have more linear junctions); therefore, we believe that the ~16-unit difference in STB index is biologically meaningful. As discussed in point 7 above, we will restructure data presentation to better clarify these aspects.


      Figure 5 panel B should provide a legend in the graphs/figures or figure legends to highlight the color-coding matching the OFF, Low, High and ON groups. Further, the difference between the "High" and "ON" groups is unclear. The authors state that "thresholds were selected empirically"; however, it is unclear whether this was derived through utilization of known Notch activators or inhibitors, and how this relates to the threshold of Notch activity necessary to enhance proliferation or maintain quiescence. Supplementary Figure 4 (which I believe is mislabelled as Supplementary Figure 5) shows only a weak positive correlation between nuclear NICD intensity and mean STB index. It would be of interest to see the plot from Supplementary Figure 5 for each of the EC subtypes, in the presence and absence of VEGF. As well, for Figure 5, on C and D panels, it would improve clarity to replace the "Low" and "High" descriptors with "Low NICD activity" and "High NICD activity".

      R8: As discussed above we will revise our analyses to remove NOTCH categories and instead show spatial autocorrelation analyses which work on continuous data.

      In Supplementary Table 1, "Widt/length" should be "Width/length"


      R9: Thanks, we will fix this in the updated version of the manuscript.

      For Supplementary Figure 3, it would be of use to show DNA distribution intensities from proliferating, non-confluent EC subtypes to demonstrate the validity of this methodology to identify cells in G0/G1, S and G2/M phases, as highlighted in panel A. Could the authors comment on the discrepancy between the percentage of cells identified as quiescent by ECPT and the percentage of cells in G0/G1? The comment that "VEGF induced a small detectable increase in proliferation rate in all EC" is curious, as a dose of 50 ng/mL of VEGF should be a relatively strong stimulator of proliferation/migration in ECs.

      R10: We will perform ECPT analysis on sub-confluent or sparse cells to further validate our analysis. Qualitative data on preliminary images seem to confirm that the proliferation rate in sparse cells is very high (>70%). To perform the evaluation, we followed and improved a previously published method (Roukos et al, Nat Prot, 2015).

      Regarding the relation between cells in G0/G1 and the assessment of the “quiescent” phenotype (the nomenclature of which will be revised as discussed above), it is important to highlight that we reported data on stress fibre analysis (i.e., classification into “quiescent”, “activated” and “stressed” cells) only for the cells in G0/G1 (i.e., we excluded proliferating cells from this analysis, as we assumed that all proliferating cells would be “not quiescent” and would bias our estimation).

      For Supplementary Figure 5, "Nuclear NOTCH intensity" on the Y-axis should read "Nuclear NICD intensity", as it does not appear that Notch was stained. It would also be of benefit to overlay the ranges for "OFF, Low, High and "ON" to appreciate ranges of activation. Is there any correlation between NICD nuclear intensity and proliferative index?

      R11: We will present the correlation between NICD or HES1 and proliferation in the revised version of the manuscript.

      Definitions should be provided for many terms. i.e. vascular endothelial-cadherin (VE-CAD; CDH5); HUVEC (human umbilical vein endothelial cell); HAoEC (human aortic endothelial cell); HDMEC (human dermal microvascular endothelial cell); NICD (NOTCH intracellular domain); VEGF (vascular endothelial growth factor); etc. at first appearance.


      R12: We will add this information in the revised version of the manuscript.

      For EC subtypes purchased from commercial vendor, it would be of interest to understand how many unique donors these cells/data were derived from, and whether there are any differences in basic donor information such as age, sex, etc. Further, Promocell catalogs proliferative rate for each of their lot numbers, and it would be of interest how this compares to the values determined using the ECPT software analysis package.

      R13: We will add this information in the revised version of the manuscript.

      In the "Cell culture" section of the methods, HDMEC from Promocell are listed, however, the manuscript and figures show data from HPMEC. Both EC subtypes are available from Promocell, however, HDMEC are from dermal origin.

      Vascular endothelial-cadherin should be abbreviated "VE-CAD" or "CDH5" and not "VEC", as this is not a standard or gene notation, and will likely be confused with the more common abbreviations for venous or vascular EC. It seems as though "CDH5" is used most commonly throughout manuscript, so this should be used throughout.

      The authors refer to "activated NOTCH" when describing antibodies in the methods, however, it would be clearer to the reader to simply refer to the antibody target (NICD), and mention that this reflects canonical NOTCH downstream activation.

      The sentence in the "Immunostaining" methods "CDH5 is a lineage marker..." should be moved to results/discussion as these details are out of place in methods.

      How were the 3 areas captured per well designated? Were these locations automated, and the same for all wells?

      "Appendix - Figure" notation should be revised to "Appendix Figure" for consistency and to avoid confusion.

      R14-19: Thanks, we will fix these in the updated version of the manuscript.

      How were artifacts and mis-segmented cell objects excluded?

      R20: We will add this information in the revised appendix. As general rules, cells containing NaN values in any of the parameters, cell fragments or merged cells (evaluated using area measurements), and cells with no detectable junctions were all excluded (the total number of cells excluded from analysis was ~2.5% of the initial dataset).
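
      The exclusion rules listed in R20 can be sketched as a simple filter. This is only an illustration under stated assumptions: the field names (`area`, `junction_signal`), the area bounds and the helpers `keep_cell` and `filter_cells` are hypothetical and are not taken from the ECPT code.

```python
import math

# Hypothetical bounds (placeholders, not ECPT values): objects below MIN_AREA
# are treated as cell fragments, objects above MAX_AREA as merged cells.
MIN_AREA = 200.0
MAX_AREA = 5000.0


def keep_cell(cell, min_area=MIN_AREA, max_area=MAX_AREA):
    """Return True if a cell record passes all three exclusion rules."""
    # Rule 1: drop cells with NaN in any numeric parameter.
    for value in cell.values():
        if isinstance(value, float) and math.isnan(value):
            return False
    # Rule 2: drop fragments and merged cells, judged by area.
    if not (min_area <= cell["area"] <= max_area):
        return False
    # Rule 3: drop cells with no detectable junctions.
    if cell["junction_signal"] <= 0:
        return False
    return True


def filter_cells(cells):
    """Return the kept cell records and the fraction that was excluded."""
    kept = [c for c in cells if keep_cell(c)]
    excluded_fraction = 1 - len(kept) / len(cells) if cells else 0.0
    return kept, excluded_fraction
```

      Reporting the excluded fraction alongside the kept cells makes it easy to check that the filter removes a proportion comparable to the one quoted in the response.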


      In "Statistical analysis" "Tuckey's" should be "Tukey's". "HSD" should be defined "honestly significant difference" or simply removed, as Tukey's is most common name.

      In "Statistical analysis", "significative" should be "significant" or "statistically significant".

      Scale bars should be added to micrographs.


      R21-23: Thanks, we will fix these in the updated version of the manuscript.

      Could the authors comment on the necessity of µclear plates, which substantially increases the cost per plate/experiment.

      R24: µclear plates were used to allow image acquisition with a 40x water immersion objective in the Operetta CLS (impossible with standard 96-well plates). Cells grown on coverslips and mounted on microscopy slides could be used as well, with a significant increase in acquisition time (Wiseman et al, 2019).


      Were other seeding densities and times investigated?

      R25: We will evaluate sparse cells in the revised version of the manuscript, as discussed above.


      More description of potentially novel observations between these three primary EC subtypes would be informative for the readership.

      The references do not appear in chronological order. Further, consistency of reference formatting should be reviewed, and appropriate journal name abbreviations should be used.

      R26-27: Thanks, we will fix these in the updated version of the manuscript.

      Reviewer #3 (Significance (Required)):

      • This manuscript presents a conceptual and technical advance, introducing a high throughput imaging platform to assess endothelial phenotypes
      • Within the field of angiogenesis, several tools exist, either proprietary, or leveraging ImageJ software to assist in assessment of cells. The ECPT provides a more complex analysis platform to integrate analysis of multiple endpoints
      • This work would be of interest to vascular biology laboratories to adopt a more comprehensive view of heterogeneous endothelial phenotypes in vitro
      • As a vascular biology researcher, I have had extensive experience with in vitro culture of various endothelial cell subtypes from human and mouse. My field of expertise gives me the perspective of the nuances of the direct handling and phenotyping of ECs, and I have worked specifically with HUVEC, HAoEC and HPMEC, and assessed the impact of key factors relevant in angiogenesis such as VEGF, Notch and other mediators.

      R: We thank the reviewer for the very appreciative comments, and we hope that with the revised version of the manuscript we will be able to fully address remaining concerns.

    1. Reflecting on how new digital tools have re-invigorated annotation and contributed to the creation of their recent book, they suggest annotation presents a vital means by which academics can re-engage with each other and the wider world.

      I've been seeing some of this in the digital gardening space online. People are actively hosting their annotations, thoughts, and ideas, almost as personal wikis.

      Some are using RSS and other feeds as well as Webmention notifications so that these notebooks can communicate with each other in a realization of Vannevar Bush's dream.

      Networked academic samizdat anyone?

    1. Reviewer #2 (Public Review): 

      Orije et al. employed diffusion weighted imaging to longitudinally monitor the plasticity of the song control system during multiple photoperiods in male and female starlings. The authors found that both sexes experience similar seasonal neuroplasticity in multisensory systems and cerebellum during the photosensitive phase. The authors' findings are convincing and rely on a set of well-designed longitudinal investigations encompassing previously validated imaging methods. The authors' identification of a putative sensitive window during which sensory and motor systems can be seasonally re-shaped in both sexes is an interesting finding that advances our understanding of the neural basis of seasonal structural neuroplasticity in songbirds. 

      Overall, this is a strong paper whose major strengths are: 

      1) The longitudinal and non-invasive measure of plasticity employed 

      2) The use of two complementary MR assays of white matter microplasticity 

      3) The careful experimental design 

      4) The sound and balanced interpretation of the imaging findings 

      I do not have any major criticism but just a few minor suggestions: 

      # Pp 6-7. While the comparative description of canonical DTI with respect to fixel-based analysis is well written and of interest to readers with formal training in MR imaging, I found this entire section (and especially the paragraphs in page 7) too technical and out of context in a manuscript that is otherwise fundamentally about neuroplasticity in song birds. The accessibility of this manuscript to non-MR experts could be improved by moving this paragraph into the methods section, or by including it as supplemental material. 

      # Similarly, many sections, especially results, are in my opinion too detailed and analytical. While the employed description has the benefit of being systematic and rigorous, the ensuing narrative tends to be very technical and not easily interpretable by non experts. I think the manuscript may be substantially shortened (by at least 20% e.g. by removing overly technical or analytical descriptions of all results and regions affected) without losing its appeal and impact, but instead gaining in strength and focus especially if the new result narrative were aimed to more directly address the interesting set of questions the authors define in the introductory sections. 

      # The possible effect of brain size has been elegantly controlled by using a medial split approach. Have the authors considered using tensor-based morphometry (i.e. using the 3D RARE scans they acquired) to account for where in the brain the small differences in brain size occur? That could be more informative and sensitive than a whole-brain volume quantification. 

      # I think Fig. 3 and Fig. 4 may benefit from a ROI-based quantification of parameters of interest across groups (similar to what has been done for Fig. 7 and its related Fig. 8). This could help readers assess the biological relevance of the parameter mapped. For instance, in Fig. 3, most FA differences are taking place in low FA (i.e. gray matter dense?) regions. 

      # In Abstract: "We longitudinally monitored the song and neuroplasticity in male.." Perhaps something should be specified after the "the song"? Did the authors mean "the neuroplasticity of song system"?

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Point-by-point response, comments in (blue), our response in (black)

      Note: we included 6 Figures in our response, yet the ReviewCommons system does not appear to support including images as part of the response. These Figures are in the original "Initial Response" file available to ReviewCommons. We requested that Review Commons post our "Initial Response" file that contains these figures so that this information is available.

      Reviewer #1

      *In the paper by Gowthaman et al., the authors aim at better understanding the molecular mechanisms controlling divergent non-coding transcription (DNC). They describe a high-throughput yeast genetic screen using two strains in which two loci consisting of a coding and a divergent non-coding transcription unit (GCG1-SUT098 or ORC2-SUT014) were replaced by a bidirectional fluorescent reporter construct encoding mCherry in the coding direction and YFP in the non-coding direction. The two reporter strains were crossed with the yeast deletion library and mutants leading to increased or decreased YFP signal were selected as potential DNC repressors or activators. The two screens identified a number of common potential repressors and activators. Components of the Hda1C histone deacetylase complex were identified as DNC repressors in both screens. This phenomenon was confirmed genome-wide by performing NET-Seq in WT as well as hda1Δ and hda3Δ strains. This experiment allowed the identification of 1517 DNC transcripts repressed by Hda1. Further analyses indicate that Hda1C represses DNC genome-wide independently of expression levels and that loss of Hda1 does not substantially affect coding transcription.

      Live-cell imaging of transcription was then used to show that loss of Hda1 increases DNC transcription frequency rather than duration providing novel information on the link between DNC transcription initiation kinetics and chromatin regulation. Finally, using Chip-seq, the authors show that the level of acetylation over the divergent non-coding units is increased in the absence of Hda1 and some experiments suggest that H3K56 acetylation also contributes to DNC regulation, further strengthening the importance of elevated histone acetylation in efficient DNC.

      Importantly, several components of the SWI/SNF chromatin remodeling complex were identified as activators confirming earlier observations (Marquardt et al., 2014). SAGA subunits were also among potential DNC activators, however these effects could not be confirmed through validation experiments. The authors conclude that DNC may be independent of specific activators and mainly due to transcriptional noise resulting from the adjacent NDR.

      Overall this paper is very well structured, clearly written and the experiments are well controlled. The genetic screen identifies novel factors involved in the regulation of DNC. The study clearly demonstrates that the level of acetylation is a key regulator of divergent non-coding transcription and that histone deacetylation by Hda1 reduces the frequency of DNC initiation events. While this conclusion is strongly supported by the Net-Seq and Chip-seq metagene analyses, the fluorescence mCherry and YFP values or RT-qPCR analyses of specific genes do not always behave as expected when looking at absolute values rather than mCherry/YFP or GCG1/SUT098 ratios, which is sometimes disturbing when reading the paper. Therefore, the following points should be clarified.*

      We are grateful for the kind appreciation of our manuscript and clarify the remaining questions in the revised manuscript.

      **Major points**

      #1.1: Figures 2 and S2A: Figures 2C and D show the mCherry/YFP fluorescence and GCG1/SUT098 RT-qPCR gene expression ratios respectively, which are consistent with a repressive effect of Hda1C on DNC transcription and a potential DNC activating effect of SAGA components. However, the absolute mCherry and YFP or GCG1 and SUT098 expression values presented in Figures S2A and S2B show the opposite: loss of Hda1C subunits rather leads to a decrease in mCherry with not much effect on YFP; moreover loss of Hda3 results in decreased SUT098, which is inconsistent with the whole model. The same comment is valid for the SAGA mutants. It would be good to provide some explanation for these a priori contradictory observations, especially for the Hda1c mutants, which are the major focus of the study. The Net-Seq analyses are certainly more reliable since less subject to protein or RNA stability effects, which may underlie some of the inconsistencies between protein and RNA absolute levels.

      Thank you for this comment. We offer enhanced clarity in the revised manuscript.

      In general, transcription in each direction shows a weak yet highly statistically significant positive correlation (Spearman rho = 0.26, p-value = 4.94e-24). We are enclosing a plot based on NET-seq data that supports the correlation in each direction of an NDR as part of our response below (RFig1). In our experience, the ratio captures relative effects better than the individual fluorescence measurements or RT-qPCR. Of course, we are ultimately interested in transcription, and fluorescence measurements or RT-qPCR of steady-state RNA are only an approximation. Resource and time constraints limit how many mutations we can examine by techniques such as NET-seq, which are arguably most informative. Because transcription in the two directions is positively correlated, relative differences can manifest themselves through detectable effects on the other fluorophore. As this reviewer mentions, we can be most confident of results that we could further validate by NET-seq or live-cell imaging.

      (INSERT Rfig1)

      RFig1: Scatterplot of NET-seq data for DNC/host gene pairs. Each point corresponds to a bidirectional gene promoter overlapping with a nucleosome-depleted region (NDR). The values represent NET-seq FPKM values in protein-coding (x-axis) vs non-coding (y-axis) directions. These data support a statistically significant correlation (Spearman test: rho = 0.2554876, p-value = 4.939658e-24).
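
      For readers who want to reproduce this kind of statistic, the Spearman coefficient is the Pearson correlation of the ranked values. The following is a minimal sketch using only the standard library; the function names and the example values are hypothetical, it does not compute a p-value, and it is not the analysis code used for RFig1.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up FPKM values for five promoters (coding vs. non-coding direction).
coding_fpkm = [10.0, 20.0, 30.0, 40.0, 50.0]
noncoding_fpkm = [1.0, 3.0, 2.0, 5.0, 4.0]
rho = spearman_rho(coding_fpkm, noncoding_fpkm)
```

      Because the coefficient depends only on ranks, it is insensitive to the large dynamic range typical of FPKM values.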

      #1.2: Figure 3: this figure examines the effect of Hda1 and Hda3 on the 1517 DNC transcripts. Does loss of this HDAC also increase the expression of all the other 2219 non-coding transcripts identified by Net-Seq, which would make Hda1C a more general repressor of non-coding transcription?

      We have performed the analysis for all other non-coding transcripts in Hda1C mutant NET-seq data and added it as part of this response (RFig2). Quantification of CUTs, SUTs and other lncRNAs that do not result from DNC shows a slight increase in nascent transcription in Hda1C mutants that is not statistically significant. These data do not offer strong support for the idea that Hda1C represents a more general repressor. We added this plot as new supplementary figure S3D and adjusted the text of the revised manuscript (line 214).

      (INSERT Rfig2)

      RFig2: Metagene plot of NET-seq data for non-coding RNA that are not classified as DNC. Metagene plot shows genomic windows [TSS - 100 bp, TSS + 500 bp] relative to the annotated starts of ncRNA transcripts.
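
      The window definition in this legend can be made concrete with a short, strand-aware sketch. It assumes per-base signal stored in a dictionary keyed by (chromosome, strand, position) and a plain mean across transcripts; the function names, the data layout and the averaging choice are illustrative assumptions, not the pipeline used for RFig2.

```python
def metagene_window(tss, strand, upstream=100, downstream=500):
    """Return (start, end), inclusive, of the window around a TSS."""
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        # On the minus strand, upstream lies at higher coordinates.
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")


def relative_position(pos, tss, strand):
    """Offset from the TSS in transcript direction (negative = upstream)."""
    return pos - tss if strand == "+" else tss - pos


def metagene_profile(transcripts, coverage, upstream=100, downstream=500):
    """Mean per-base signal across transcripts, aligned at the TSS.

    transcripts: list of (chrom, tss, strand) tuples.
    coverage: dict mapping (chrom, strand, pos) to signal; missing means 0.
    Returns a list whose index 0 corresponds to TSS - upstream.
    """
    totals = [0.0] * (upstream + downstream + 1)
    for chrom, tss, strand in transcripts:
        start, end = metagene_window(tss, strand, upstream, downstream)
        for pos in range(start, end + 1):
            rel = relative_position(pos, tss, strand)
            totals[rel + upstream] += coverage.get((chrom, strand, pos), 0.0)
    if not transcripts:
        return totals
    return [t / len(transcripts) for t in totals]
```

      Flipping the window for minus-strand transcripts is what keeps "upstream" and "downstream" meaningful when loci on both strands are averaged into one profile.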

      #1.3: Moreover, does loss of Hda1 or Hda3 reveal DNC transcripts that were not detected in wild-type? This may increase even more the number of genes with divergent transcription.

      We are grateful for the opportunity to clarify this point. We noticed that the yeast genome shows evidence for much more non-coding transcription than annotated. In this paper, we used TranscriptomeReconstructoR for a data-driven annotation of yeast non-coding transcripts, with an emphasis on the boundaries (see also DOI: 10.1186/s12859-021-04208-2). The set of non-coding transcripts was, for example, informed by the previously published NET-seq data on wild-type samples (Churchman et al., 2009; Marquardt et al., 2014; Harlen et al., 2016; Fischl et al., 2017). We have clarified relevant Methods sections to make this point more accessible (line 733). The combination of these NET-seq datasets gives a very good sequencing coverage. The Hda1C mutant NET-seq data does not have a better coverage than this combined reference set, so it would be very hard to find new transcripts without prior evidence in our exhaustive set of combined NET-seq data. However, our Supplementary table S3 contains the fold-change values for all DNC transcripts in mutant compared to wild type. Loci with a high fold-change could arguably be regarded as hda-specific.

      #1.4: Figures S3A, B, C: are the 3 groups of DNCs derepressed to the same extent by loss of Hda1 or Hda3? This is difficult to judge given the differences in y-axis scales. Figures S3D, E: the authors show the Net-Seq snapshots for the GCG1 and ORC2 loci. It would be good to add the quantifications as presented in Figure 3 for YPL172C and YDRr216C.

      Thank you for the suggestion. We replaced S3A-C with plots that show the same range of the y-axis in the supplementary figure. Hda1C represses DNC in all three cohorts stratified by DNC expression strength. We also added a quantification boxplot for NET-seq signal in the GCG1 and ORC2 loci in revised S3F-I.

      #1.5: Figures S4A, B, C and D are not well explained. What does the y axis frequency correspond to? Is it the % of cells showing a signal? Is the intensity of SUT098 higher because the transcription initiation frequency is higher and therefore the transcription site signal is more intense?

      We improved the annotation for the supplementary figure S4. We clarified in the legend that the y-axis frequency represents the percentage of frames recorded for transcription initiation spots (TS). The bars represent transcription intensity in all the frames recorded, with active transcription ‘ON’ and without TS ‘OFF’. The intensity increases with higher initiation rates and thus the intensity of SUT098 transcription initiation is high.

      #1.6: Figures S4 A-I should be more specifically cited in the text.

      We have cited the figures in the text in the revised version.

      #1.7: Figure 5A: it is really unexpected and unclear why the mCherry/YFP in the WTH3/hda1Δ and WTH3/hda1Δ/H3K56mut is increasing compared to WTH3, since DNC is supposed to increase. Similar comment for Figure S5C. This should be clarified in the text.

      Thank you for pointing this out. We failed to address this in the text. The isogenic control “H3 wild type” carries only one copy of the two genes coding for H3, which has a general effect on transcription. We added data showing this as part of our response (RFig3), and explained this part more clearly in the revised text (line 263). Essentially, the genetic background of the yeast synthetic histone mutant collection sensitizes for a decreased ratio of mCherry/YFP (RFig3). This result is also included in table S2, where deletions of the histone genes HHT2 (H3) and HHF2 (H4) are listed as shared repressors in both screens. Hda1C mutations show the increased ratio in the sensitized “H3 wild type” background, but not in backgrounds we tested that contain a wild-type dosage of histone genes.

      These data remain valid to support the genetic interaction of hda1Δ with the substitution mutants of H3K56.

      (INSERT Rfig3)

      RFig3. Fluorescence signal values of H3WT and BY4741 strains with GCG1pr FPR. The H3WT affects general transcription of coding transcript and decreases the ratio of mCherry/YFP fluorescence.

      #1.8: More generally, as already mentioned above, the fluorescence data are expressed either as mCherry/YFP ratio or as absolute values. It would be good to systematically show the ratios and the absolute values of mCherry and YFP signal; the same for coding and DNC RT-qPCR as well as Net-Seq values when available.

      We ensured that the absolute data values for flow cytometry and qPCR have been represented in the supplementary figures S2 and S5. The FPKM values for NET-seq data for individual transcript units are provided in the supplementary table S3.

      #1.9: Figures S5A and B are not referred to in the text. It should be mentioned and explained how normalization to H3 affects the levels of acetylated H3 over the NDR.

      We now refer to the figures in the main text and explained the rationale for normalization.

      #1.10:* p. 12 "Our data thus suggest to extend the transcriptional noise hypothesis with activities limiting DNC transcription to account for genome-wide variation in non-coding transcription".

      If DNC is the result of "transcriptional noise", it is surprising that in the case of GCG1-SUT098, the transcription frequency is higher in the non-coding versus the coding direction. Is the SUT098 behaving like the coding unit in this case? The authors should comment on that. *

      This is a very interesting point. One interpretation of the “transcriptional noise” hypothesis is indeed that non-coding transcription occurs at a low level. We selected loci with high DNC expression, so these loci are somewhat contradictory to this idea a priori. Nevertheless, identifying a biological function of non-coding RNAs is challenging, and it remains to be tested if SUT098 represents particularly “loud noise” or if the high transcription indicates that it carries a yet unknown cellular function. In theory, this screen is suitable to identify factors that may be required to induce DNC, perhaps even specifically. To identify such factors, a locus with high DNC is needed to facilitate detection, since our previous screen using the PPT1/SUT129 system had lower SUT expression and failed to identify such mutants systematically. This is important, since a mutation lowering DNC needs to start from a sufficiently high fluorescence signal to distinguish it from background fluorescence. Since the results presented did not clearly uncover such factors, we favor the hypothesis of DNC arising due to the promoter architecture at NDRs (see also the positive correlation plot in RFig1). The many repressive pathways also act on highly expressed DNCs, which is certainly interesting information provided by this manuscript.

      **Minor comments**

      #1.11: p. 4 should one talk about Hda1C-linked histone acetylation facilitates... (should be deacetylation...??)

      Done.

      #1.12: The authors should explain why they chose two coding/non-coding pairs that are cac2Δ insensitive and whether other criteria, such as level of DNC transcription, were also considered, since GCG1-SUT098 represents one of the most highly expressed divergent non-coding transcripts.

      The GCG1 and ORC2 loci were chosen based on i) high DNC levels, ii) a low fold-change of NET-seq data in the cac2Δ mutant and iii) a DNC region free from other transcriptional units. However, this was based on the state-of-the-art annotation in 2015 when we started this project. Also, when we categorized genes as affected by cac2Δ, we used a fold-change expression cut-off that suggested that about a third of DNCs are repressed by CAF-I. It appears that we still underestimated the effect of CAF-I, since our data show that the target regions of our new screens are also affected by CAF-I. DNC expression at these loci is high, which would result in a low fold-change in mutants that further increased DNC here.

      #1.13: It is hard to understand why both the H3K56A and H3K56Q mutations lead to increased DNC, a result already presented in the Marquardt et al. 2014 paper. It would be helpful to provide a more extensive explanation or hypothesis.

      The H3K56 substitution mutant Q is expected to mimic the acetylated state, whereas A is devoid of post-translational modifications. We observe an increase of the signal ratio in both mutants because H3K56ac is responsible for both incorporation and eviction of -1 nucleosomes (Marquardt et al., 2014). Mutations affecting H3K56 can thus result in lower -1 nucleosome density and more DNC through reducing incorporation or enhancing eviction. We have clarified this in the revised text (line 271).

      #1.14: What defines the level of DNC repression? How does the level of repression correlate with the level of coding transcription?

      We have added RFig.1 to address the question about correlation. There is a statistically significant positive correlation between transcription in each direction by NET-seq data in wild type samples genome-wide. However, the correlation is weak (rho = 0.26), which is consistent with locus-specific adjustments of transcriptional strength in each direction. For DNC, several chromatin-based pathways contribute to repression. The resulting level of DNC transcription thus reflects the combined action of several pathways. Here, we characterize Hda1C as a novel player with a genome-wide effect on this phenomenon. Elucidating the mechanistic interplay at specific target DNC loci will be an exciting future research question.

      Reviewer #1 (Significance (Required)):

      This is a very interesting and innovative study using cutting edge genetic approaches, genome-wide sequencing as well as single cell imaging to extend our understanding of non-coding transcription regulation and its potential impact on gene expression. It is a nice continuation and complement of an earlier study from the same author (Marquardt et al., 2014) and will certainly be of interest to a large chromatin biology audience.

      We are grateful for the appreciation of our research on this topic.

      Reviewer #2

      *Promoters are frequently transcribed in both directions. The divergent, 'upstream' transcript is frequently unstable. Transcription initiation is regulated through the acetylation of promoter-proximal nucleosomes, where HDAC-dependent deacetylation of histones typically represses transcription initiation.*

      *The current manuscript addresses the question whether initiation of coding and divergent, non-coding (DNC) transcription is regulated by the same factors. Previously Marquardt and others showed that H3K56ac-mediated histone exchange has a differential effect on coding and DNC transcription.

      Using a clever reporter system, the authors screened for positive and negative regulators that preferentially affect DNC transcription. They discover the Hda1 deacetylase complex as a DNC-biased repressor and diverse HATs as DNC-biased activators. The role of activators could not be validated, presumably due to high variability of the system.*

      Focusing on Hda1c the authors present data suggesting a larger effect of Hda1c on 'upstream' nucleosomes associated with DNC transcription than in coding transcription. Genome-wide NET-seq mapping was consistent with this differential regulation. Live-cell imaging of one specific case argues that Hda1-mediated repression reduced the time between initiation events. The authors employ state of the art methods and in general the data are of very good quality. The effect size is very small, which raises the broader question whether the results, while statistically significant, are biologically relevant. I have a few comments that the authors may use to revise their manuscript.

      Thank you for the appreciation of our very good data quality. We hope our revision plan will help to clarify some confusion about the scope and effect size.

      #2.1) The differentially regulated coding and DNC transcription are defined by a directionality score. The screen was performed with two reporter loci that are strongly biased for DNC transcription (the idea to detect activators did not work out). Considering that coding and DNC transcription may not be totally independent because of the proximity of target nucleosomes, and sense and antisense transcription may compete for regulators, the question arises how levels of coding transcription affect DNC transcription in wildtype and mutants. The authors stratified their results according to levels of DNC transcription, but discussion and data analysis of the effect of coding transcription on the directionality score may be relevant.

      We added the plot in RFig.1 above to address the question of correlation between transcription in each direction. NET-seq data supports a weak but highly statistically significant positive correlation between transcription in each direction genome-wide (rho = 0.26, p-value = 4.94e-24). We agree that it is relevant to discuss the effect of coding transcription on the directionality scores and revised the discussion accordingly (line 315). We have used both the coding and DNC signal values to create the comprehensive quadrant scatter plot in Fig. 1D-E. Analysis of mutants along the diagonal illustrates that many mutations affect coding transcription as well as DNC. The directionality score measures deviations from the axis of positive correlation, which requires us to use the information of both fluorophores.
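
      One way to picture "deviation from the axis of positive correlation" is the signed perpendicular distance of each mutant from the diagonal in log2 space. The sketch below uses that definition purely as an illustration: the formula, the pseudocount and the function name are assumptions and are not the directionality score defined in the manuscript, which may instead use a fitted regression line.

```python
import math


def directionality_score(coding, noncoding, pseudocount=1.0):
    """Signed distance from the diagonal y = x in log2 space (illustrative).

    Positive values indicate a shift toward the coding reporter, negative
    values a shift toward the non-coding reporter. A mutant that scales both
    reporters by the same factor stays close to the diagonal.
    """
    x = math.log2(coding + pseudocount)
    y = math.log2(noncoding + pseudocount)
    # The perpendicular distance of (x, y) from the line y = x is (x - y) / sqrt(2).
    return (x - y) / math.sqrt(2)
```

      Under this definition both fluorophores enter the score, so a change in one reporter that is matched by a proportional change in the other leaves the score nearly unchanged.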

      #2.2) The study is strong where the findings can be generalized. The single-molecule live-cell imaging analysis, while done properly, has only limited impact, because the corresponding coding transcript could not be detected. This is more of an anecdotal finding.

      There seems to be a misunderstanding: the live-cell imaging measurements of transcription for SUT098 are stand-alone data. SUT098 is a transcription unit by itself, so we measure DNC of this unit independently from GCG1, which has much lower expression. The measurements are specific to SUT098 transcription, and the quantification provides new information about the mechanisms involved in the regulation of DNC. We clarified the text in this regard (line 233).

      #2.3) The effect size is small (20%, on average) and the variability is high. The fact that the HATs that emerged as very robust activators of DNC transcription could not be validated and that the Hda2 subunit of the HDAC complex was not found statistically significant show the limitations of the study. To their credit, the authors discuss these limitations appropriately.

      We have worked on the Methods in the revised manuscript to clarify this confusion (line 712). For the screen, the median signal values represent data from up to 50,000 individual cells. These experiments are remarkably accurate and highly reproducible, especially for molecular biology where n=3 is common. We have uploaded these data to the FlowCore public repository. We encourage any colleague to take the opportunity to analyze these data independently and experience the high data quality. With a high number of observations, a 20% average change is a large effect and reflects a rather big shift of the population. As is standard for genetic screens, resource constraints make it prohibitive to pursue all hits. In addition, it is expected that only some hits will affect transcription of DNC, since the fluorescence reporter can be affected by many other cellular events. We focused on the effects on DNC in this manuscript.

      There seems to be some misunderstanding: Hda2 is a statistically significant hit in the ORC2/SUT14pr screen; this information is in Fig. 1E. The Hda1C subunits are labeled in purple.

      #2.4) Figure S3C suggests that the Hda effect is largest at genes that are poorly expressed, and smaller at more average expression levels. Are we looking at a phenomenon that mainly applies to repressed genes?

      Thank you very much for this suggestion. We replaced S3A-C with revised panels where the data is shown with the same y-axis scale, please see also #1.4. We believe the revised presentation also helps to clarify that the mutations increase DNC for all cohorts stratified by DNC expression.

      **Minor issues**

      #2.5) The NET-seq study involves two replicates. How well did they correlate?

      The WT and mutant NET-seq replicates have good correlation (Spearman’s correlation coefficient was above 0.6 for WT and above 0.8 for the mutants).

      (INSERT Rfig4)

      RFig4. Correlation scatter plot of individual NET-seq replicates of WT, hda1D and hda3D. Spearman correlation coefficients of WT, hda1D and hda3D are 0.677, 0.8 and 0.825, respectively.

      #2.6) For the live-cell imaging replicates were not mentioned. Were replicate studies performed?

      We have updated the text to make this important point more accessible (line 230). For live-cell imaging studies, transcription is recorded as movies of cells over time. We took multiple movies, and pooled the data from all the cells to improve statistical power. Data from each movie represent individual repeats. We monitored 130 cells on average for the WT and mutant strains over time.

      #2.7) Fig 4E is not mentioned in the text (mislabeled as 4D)

      Done.

      #2.8) Fig S5 is not mentioned in the main text.

      Done.

      Reviewer #2 (Significance (Required)):

      In summary, this is a high-quality study that presents the results of a genome-wide screen that will be of interest to colleagues in the narrower field. Due to the small effects the results may appeal less to a general readership.

      We are grateful for appreciating our manuscript as a high-quality study. We hope our revisions help to clarify confusion concerning effect size.

      Reviewer #3

      In this manuscript, Gowthaman et al describe the results and follow-up of their screen aimed at identifying regulators of divergent noncoding (DNC) transcription in S. cerevisiae. From this screen, they identify Hda1C as a repressor of DNC transcription, and perform follow-up experiments to support and detail this finding. In addition to RTqPCR to confirm the reporter and endogenous changes, the authors perform NET-seq to look at global DNC alteration upon Hda1C subunit deletion and identify a number of non-coding transcripts with altered expression levels. In addition, the authors perform live cell imaging to demonstrate that there is a modest restriction of initiation frequency when one of the subunits of Hda1C is deleted. Finally, the authors explore changes to pan-H3 acetylation and the genetic overlap between Hda1C and H3K56ac, demonstrating independent genetic pathways but overall increases in H3 acetylation over DNCs when Hda1C is deleted. Overall, the screen and results are of interest, but the authors overstate some of the conclusions (perhaps most importantly within the title!). I have the following suggestions to improve the manuscript:

      Thank you for recognizing the interest in our results. We have revised the manuscript to state the conclusions more cautiously.

      **Major comments**

      #3.1. The title of the manuscript is based on the single molecule live cell imaging experiments presented in Figure 4. While there is a statistically significant decrease in initiation frequency from deletion of one Hda1C subunit, there is no statistical decrease in deletion of the other two. Furthermore, these experiments were performed at one locus. As a result, I find the title to be an overstatement of the findings of the paper and suggest the authors refocus on the more robust findings of the manuscript.

      Live-cell imaging requires extensive engineering of the target loci. Perhaps this was lost in the Methods, but it is a 5-step process to integrate the stem-loops. We tried to engineer other loci, but this is far from trivial and the technique does not work for all loci tested. The hairpins are also unstable and need to be carefully checked prior to experimentation, which makes it challenging to scale this approach up to higher throughput. It appears that we undersold this point, but the fact that we now provide a locus and strains for the community that make such studies possible for DNC represents a tremendous achievement. Since hda1D also decreases time between initiations, we generalized the finding to Hda1C.

      However, we recognized that the reviewer makes a helpful suggestion to choose a more careful title since there is no statistically significant reduction of initiation frequency in some mutants. We have revised the title to “__Hda1C limits divergent non-coding transcription and restricts transcription initiation frequency__” in the revised manuscript to address this point.

      #3.2. Relatedly, in Figure 4, the authors present the findings from the single molecule live cell imaging experiments. Within this experiment, the authors include a cac2 deletion (CAF-1 subunit) strain, and observe a modest effect, similar to hda1 deletion. This is surprising as the authors mentioned this location (GCG1/SUT098) was selected as CAF-1 was NOT shown to regulate the DNC previously (Marquardt et al 2014; as mentioned at the beginning of the Results section). The similar decrease in initiation frequency between cac2 deletion and hda1 deletion further concerns me regarding the use of these data as the headlining finding.

      We believe there is a misunderstanding. We clarify that selection of the GCG1 locus was based on a cut-off value for the cac2D effect, as is also shown in Fig S1C. The fold-change is small, but since DNC transcription of the chosen loci is high in wild type, an increase in a mutant would not necessarily give a high fold-change. Hence, we need to be cautious in concluding that CAF-I does not regulate DNC at this locus. The fold-change analysis suggested it, but regulation by CAF-I remained possible. CAF-I appears to affect even more loci than initially identified with the chosen cut-off. We see the same trend in Hda1C mutants as in cac2, which offers support to the exciting idea that modulation of the initiation frequency may be a shared mechanism by chromatin-based regulators acting on DNC.

      #3.3. It is unclear to me why the change in mRNA expression is included within the screen. Why not solely look at the expression change of the DNC? Importantly, the authors note in the discussion that perhaps the reason the SAGA complex was identified was due to regulating mRNA expression and not DNC expression and therefore was identified in the screen. Could the authors not just present the fold change in DNC expression using their YFP reporter, and not the YFP vs mCherry?

      The regulation of initiation frequency in each direction is super-imposed on a general positive correlation __(rho = 0.26, p-value = 4.94e-24) between the coding and non-coding directions__, please see also RFig.1. For the purpose of this study about selective effects on the direction of transcription, it is vital to incorporate both sides of the reporter. Otherwise, we would select for factors that activate or repress the transcription from the target promoter NDR. This point is accessible in Fig.1D-E, where mutations that affect YFP usually also have an effect on mCherry. The aim of this study was to identify mutants that affect the relative expression, and therefore a focus on one fluorophore would not improve the analysis. We clarified this important point more accessibly in the revised manuscript (line 315).
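      The argument above, that a change in one fluorophore alone cannot distinguish a general promoter effect from a change in direction, can be illustrated with a small sketch. This is not the directionality score defined in the manuscript; the function name and the log-ratio formula are assumptions chosen only to show why both reporter signals are needed. Following the reviewer's wording, YFP is taken to report DNC transcription and mCherry the coding direction.

```python
from math import log2


def directionality_shift(yfp_mutant: float, mcherry_mutant: float,
                         yfp_wt: float, mcherry_wt: float) -> float:
    """Hypothetical illustration, not the manuscript's score.

    Returns the log2 change of the YFP/mCherry ratio in a mutant
    relative to wild type. A mutant that changes both directions
    equally (moving along the diagonal) scores 0.
    """
    for value in (yfp_mutant, mcherry_mutant, yfp_wt, mcherry_wt):
        if value <= 0:
            raise ValueError("signal values must be positive")
    return log2(yfp_mutant / yfp_wt) - log2(mcherry_mutant / mcherry_wt)


# Made-up signal values, for illustration only.
general_activation = directionality_shift(200.0, 200.0, 100.0, 100.0)
dnc_specific = directionality_shift(200.0, 100.0, 100.0, 100.0)
print(general_activation, dnc_specific)
```

      In this toy example both mutants double the YFP signal, so a YFP-only fold change would rank them equally; only the comparison with mCherry separates the general effect from the direction-specific one.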

      Please also note that all the raw data are available, so colleagues are in the position to perform their independent analyses. We believe that it is very valuable for the community to have access to these data since they may be useful for other purposes and could be analyzed in many different ways. In fact, we have tried several methods and approaches over the years and present what we believe is most appropriate in this manuscript. For example, Hda1C comes out as a convincing hit with a range of different approaches to analyze the data, which is also a reason we feel confident about the characterization of Hda1C.

      #3.4. This is absolutely beyond the scope of the paper, but limiting the screen to only nonessential proteins likely misses important regulators. In the future, perhaps the authors could pursue a SATAY screen to look for essential proteins as well? Again, the findings of this paper are appropriate, and the screen is a great undertaking, but I want to suggest this to the authors for potential future projects.

      Thank you for this excellent suggestion. We agree that capturing the role of essential factors would be very informative, and the saturated transposition approach would be promising. However, as the reviewer points out, performing these analyses is beyond the scope of the current manuscript.

      #3.5. The authors perform NETseq experiments in deletion strains and identify ~1500 DNC transcripts with altered expression. Later the authors look into the mechanism and demonstrate an increased H3ac in hda1 deletion strains. The authors could enhance the representation of these datasets by correlating the change in H3ac with the change in DNC transcription - do they correlate?

      Thank you for bringing up this excellent point. We present the correlation data of change in H3ac and DNC transcription in the hda1D mutant (RFig5). The ChIP-seq and NET-seq values of hda1D were divided by the respective WT values in order to quantify the relative increase of H3 acetylation or nascent transcription in hda1D. The data showed a weak (Spearman rho = 0.23) but significant (p-value = 3.0e-20) positive correlation between the ratio values. The hda1D-dependent increase in H3 acetylation correlates with the hda1D-dependent increase of RNAPII occupancy in DNC transcripts. We enhanced our representation of these data by including this plot as S5D in the revised manuscript as suggested.

      (INSERT Rfig5)

      RFig5. Scatterplot of hda1D/WT NET-seq (y-axis) and ChIP-seq (x-axis) ratios. Each point corresponds to a bidirectional gene promoter overlapping with an NDR. Spearman correlation test: rho = 0.234, with a statistically significant p-value = 3.0e-20.
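      The analysis described for RFig5 (divide each mutant value by its WT value, then rank-correlate the two sets of ratios) can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline; the function names and the example numbers are hypothetical, and no p-value is computed.

```python
from math import sqrt
from typing import List, Sequence


def ratios(mutant: Sequence[float], wild_type: Sequence[float]) -> List[float]:
    """Element-wise mutant/WT ratio for each promoter."""
    return [m / w for m, w in zip(mutant, wild_type)]


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up values for four promoters, for illustration only.
chip_ratio = ratios([2.0, 3.0, 1.5, 4.0], [1.0, 2.0, 1.5, 2.0])
net_ratio = ratios([5.0, 4.0, 2.0, 9.0], [4.0, 4.0, 2.5, 3.0])
print(spearman_rho(chip_ratio, net_ratio))
```

      Working on ranks rather than raw ratios makes the measure insensitive to the skewed distributions that ratio values often have.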

      #3.6. In Figure 5, the authors argue that Hda1C works non-redundantly with K56ac, using point mutants to mutate K56 to A or Q. Did the screen identify anything else in the K56ac pathway? Rtt109 or Asf1, for example? Because Hda1C deacetylates H3, including but not limited to K56, it is a bit surprising the K56 point mutations result in a larger increase in SUT098-YFP levels. The authors discuss within the text that Hda1C has multiple targets; but coming back to my previous point that CAF-I was not supposed to impact this location, I am having a hard time understanding these results.

      This is an excellent point. We improved the manuscript by highlighting other factors with links to H3K56ac in our scatter plots, for example Rtt109 in Fig 2A. Nevertheless, the reviewer may wish to satisfy his/her curiosity by exploring table S2 in more detail. Table S2 lists the top candidates from both screens.

      We hope our answer to point #3.2 helped to clarify the aspect of this comment related to CAF-I.

      **Minor comments**

      #3.7. The authors follow up the screen using RTqPCR for GCG1/SUT098 in newly made deletion strains. I was surprised the authors choose this locus rather than the ORC2/SUT014 locus, as the screen showed a strong increase for this reporter. While I appreciate generating the deletion strains within the reporter is beyond necessary, assessing the endogenous locus within the deletion strains by RTqPCR seems reasonable.

      We chose the GCG1 locus since the fold change in directionality in the genetic screen was high for the activator mutants. We will perform this experiment and add the missing validation experiment for the ORC2 locus in the revised manuscript.

      #3.8. The authors tend to show their genomic data as metaplots; it would be nice to see heatmaps where more can be gleaned from the display of all the loci. This applies to the NET-seq data (Figure 3) and the ChIP-seq data (Figure 5).

      We appreciate the suggestion and generated the requested heatmaps using the NET-seq tracks of WT and hda mutants (RFig6). The heatmap represents the same genomic intervals as the corresponding metagene plot (Figure 3A). We find that the differences between WT and hda samples are more clearly accessible at first glance on the metagene plot than on the heatmap. We believe that this could be because the heatmaps do not represent what transcripts have in common and rather underline the differences between individual transcripts. In contrast, the metagene plots reveal the common trends by taking the average of the signal. We thus prefer showing metagene plots in the manuscript, as they allow for overlay of multiple tracks on the same plot, thus enhancing visual comparison for the readers.

      (INSERT Rfig6)

      RFig6. Heatmap representing NET-seq data in WT, hda1D and hda3D. Genomic intervals covering [TSS - 100 bp, TSS + 500 bp] of DNC transcripts (n=1517) are shown. The color indicates the log2-transformed NET-seq values.

      #3.9. In Figure 5B, the authors present H3ac ChIP-seq data, presented as a ratio of H3ac/total H3. While this is a perfectly acceptable way to present the data, I was surprised to see a decrease in total H3 levels when examining the supplemental data. Has this decrease in H3 occupancy upon hda1 deletion been shown previously? This finding should be discussed within the manuscript.

      We appreciate that the reviewer noticed this. We do not think this has been explicitly stated before, as the focus thus far had been on the effects towards the mRNA. However, the effect is not statistically significant between the WT and hda1D as observed in S5B. We thus prefer to remain cautious about this conclusion.

      #3.10. In Supplemental Figure S3, the authors break down the NET-seq data by DNC FPKM, which is very nice. Very minor point that the font here is quite small.

      Thanks, we improved the font size. Note that we also revised the y-axis scale in response to comment #1.4.

      Reviewer #3 (Significance (Required)):

      **Significance:**

      The regulation of divergent non-coding RNAs is an understudied field. In this paper, the authors perform a screen for all non-essential yeast proteins in regulating the expression of these ncRNAs. The screen results and follow up defining the role of Hda1C in broadly repressing the expression of these ncRNAs is of interest to the field.

      We are grateful to the reviewer for highlighting the interest of our work to the field.

      **Context:**

      This work follows from Marquardt's previous 2014 study that identified Caf1 as regulating DNCs in S. cerevisiae.

      **Audience:**

      Broadly, the chromatin and transcription field. Anyone interested in how chromatin regulates transcription, regulation of ncRNAs, and functions of histone modifying enzymes.

      **Expertise:**

      I am a member of the chromatin and transcription field, largely performing genomic experiments. We do not perform microscopy, although sufficiently understand the experiments and results presented here.

    1. How can you be against faith when we take leaps of faith all the time, with friends and potential spouses and investments? Here, the meaning of the word “faith” is shifted from a spiritual belief in a creator to a risky undertaking. A common invocation of this fallacy happens in discussions of science and religion, where the word “why” may be used in equivocal ways.

      This term specifically has been seen, and even critiqued, in my writing. I think this is because many students and writers attempt to use more educational or specific words that we may not always understand how to use. By attempting to use them in new ways, we risk confusing the audience or writing an argument that has a meaning different from what we intended.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Point-by-point response:

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **SUMMARY**

      This MS tackles a largely unknown topic of vessel formation: how vessels anastomose and lumenise. The authors demonstrate that a matrix protein, svep1, produced by the neural tube during zebrafish embryogenesis plays a key role, together with blood flow, in orchestrating anastomosis formation. Indeed, the absence of this protein together with blood flow reduction results in a significant decrease of lumenised DLAV segments.

      In absence of svep1 they observed an expansion of apelin positive endothelial cells connected with a defect in tip/stalk cell specification. Interestingly the phenotype is amplified by blocking the kinase activity of VEGFR2

      **MAJOR COMMENTS**

      The most solid evidence on the role of blood flow in cooperating with svep1 relies on the use of tricaine, which reduces heart contractility. Interestingly, the authors report some data using embryos lacking cardiac troponin T2. I suggest the authors better analyze the phenotype obtained by the deletion of svep1 together with a dose-dependent reduction of tnnt2. This approach is more elegant and physiologic than the use of a chemical compound. Furthermore, this approach will allow a better analysis of the relationship between blood flow and the expression of svep1 in the neural tube. It would be relevant to establish a sort of flow threshold required to dampen lumenisation. *

      Response: We appreciate the comment and have previously attempted to titrate the tnnt2 morpholino, as published, to obtain a graded reduction in blood flow. In our hands, this has not proved to be a robust approach, but we are willing to give it another try. In addition, we propose to use alternative compounds to tricaine for blood flow reduction without affecting neural physiology. Alternatively, we will use α-bungarotoxin mRNA injection to selectively affect neural activity and immobilize the embryos without effects on heart rate and blood flow (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4526548/).

      To further improve the findings here reported I suggest to analyze the expression of klf2, which is a well known mechano-sensor of blood flow in several animal species including zebrafish.

      Response: We will perform klf2 expression analysis

      It's likely that apelin is relevant in the observed phenotype. Which is the phenotype of a double mutant lacking both apl and svep1? Is there a direct influence of blood flow on apl expression?

      Response: We will investigate the double loss of function. However, double mutants would take some time, and a combination of morpholino and mutant would likely be the first and best option to answer this question in a reasonable time frame. The effect of flow on apl expression can be tested.

      Is there any suggestion that this mechanism is operative in mammals?

      Response: This is an interesting question and certainly relevant for follow-up studies. At present, we can only speculate on a possible connection with flow, given that Svep1 mutations have recently been associated with atherosclerosis. However, whether the anastomosis defect we identify is conserved remains to be seen.

      *Reviewer #1 (Significance (Required)):

      The data here reported might represent a step forward in the field because a new mechanism is suggested.

      The interest is sufficiently broad.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The authors demonstrated that loss of svep1 in zebrafish contributed to defective anastomosis of intersegmental vessels, in addition, such Svep1 acted synergistically with blood flow to modulate vascular network formation in the zebrafish trunk.

      **Major comments:**

      The expression of svep1 is localized in neurons of the neural tube, dorsal epithelial cells (as indicated by transgenic zebrafish) and the ventral somite boundary (as indicated by in situ), but is excluded from endothelial cells and the vasculature. It remains puzzling, and the authors have not addressed, how a gene that is expressed in non-vascular tissue plays a crucial role in vessel anastomosis, i.e. DLAV, ISV lumenization, during angiogenesis. As the entire story of svep1 is related to its function in angiogenic sprouting and lumen formation of vascular tissues, it would be helpful for readers to be able to put the pieces together of how such a gene may be functionally involved in this angiogenic process. A previous publication showed this gene to be involved in lymphangiogenesis; in this manuscript the authors could provide more evidence of how the gene and its localized expression contribute to a different tissue in the vascular system, i.e. the DLAV, rather than the neural tube, dorsal epidermis or ventral somite boundary.*

      Response: We appreciate the wish to understand exactly how non-endothelial expression of Svep1 causes an endothelial phenotype selectively under reduced flow conditions. The very nature of this new phenotype requires analysis in vivo, and cannot easily be transferred to an ex vivo assay. Therefore, selective loss of function in different cell populations is not easily available. More importantly, the interpretation of such efforts, when mosaic, is marred with issues. At this point, we feel that full molecular characterization of how Svep1 affects endothelial cells during anastomosis will require entirely new approaches and lies beyond what can be achieved in this manuscript.

      We will however attempt to clarify the findings and the potential mechanisms in the discussion.

      Another puzzling point is that tricaine is at the center of this study. The authors claim that tricaine-dependent blood flow reduction synergistically augmented the effect of svep1 deficiency. However, tricaine is known to act on neural voltage-gated sodium channels; the authors could provide more explanation and argument in the discussion as to whether svep1 function, and possibly its expression, was affected by tricaine in the neural tissues.

      Response: As mentioned in our response to reviewer 1, we will perform additional experiments to try to clarify whether an effect of tricaine on neuronal sodium channels contributes to the phenotype.

      The statement on p12, "These results suggest that while svep1 loss-of-function produces a cardiac defect that enhances the effect of tricaine on reducing blood flow, svep1 has an additive effect in modulating blood vessels anastomosis", is unclear. svep1 deficiency enhances the effect of tricaine, leading to reduced blood flow; however, it is not accurate to state that svep1 loss-of-function produces a cardiac defect. It is not certain whether the effect of svep1 is actually in neural rather than cardiovascular tissue; for example, tricaine acts on neural voltage-gated sodium channels, which slows down the heartbeat. Could the authors explore the possibility that svep1 functions in neural rather than cardiovascular tissues, and perhaps discuss why they think svep1 enhances the effect of the blood flow defect (tnnt2a knockdown or tricaine) on angiogenesis, such as the DLAV phenotype?

      Response: We will attempt to dissect potential contributions by neural effects from cardiac and flow related effects as stated above. Tnnt2 MO and alternative drugs to reduce heart function selectively will be used. We will also clarify the discussion.

      On p13, the authors stated that svep1 expression was inhibited by reduced blood flow, however, is it really the effect of reduced blood flow or caused by the chemical tricaine? If tnnt2a knockdown showed a similar phenotype, then it may be more convincing.

      Response: see above

      \*Minor comments:**

      The work on "svep1 loss-of-function and knockdown are rescued by flt1 knockdown" was beautifully done and it is very clear and convincing.

      The last two sections, "Vegfa/Vegfr signalling is necessary for ISV lumenisation maintenance and DLAV formation" and "Vegfa/Vegfr signalling inhibition exacerbates svep1 loss-of-function DLAV phenotype in reduced flow conditions", are more related to the flt1 knockdown phenotype. These 3 different sections are actually related, in the sense that the rescue phenotype should be explained in terms of the vegf signaling pathway. It would be better to discuss this vegf pathway more cohesively, which will help readers to better appreciate the work on svep1. *

      Answer: We agree and will do so.

      *Reviewer #2 (Significance (Required)):

      This manuscript on svep1 in zebrafish provides new insight into angiogenesis, particularly the development of vessel anastomosis in the zebrafish embryo. It is very significant in the field and for readers who are interested in angiogenesis and zebrafish development, including myself.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript reports that the secreted extra-cellular matrix protein Svep1 plays a role in vascular anastomosis during developmental angiogenesis in zebrafish. Further, the study demonstrates that flow and Svep1 modulate the vascular network in a synergistic fashion. This is a high quality manuscript presenting novel data which compellingly support the conclusions that are made. I have no suggestions for further experimentation but list minor points below.

      1. The final paragraph of Discussion is underdeveloped in that it claims regulation of phenotypic robustness in angiogenesis and its failure promises crucial insights into the mechanisms causing breakdown of vascular homeostasis in human disease. However, this issue is not pursued in any substantial way in Discussion. For example, are there known mutations in humans which lead to anastomosis defects and, if so, do any of them relate to the molecules or signaling pathways which are the subject of this manuscript? *

      Response: We agree with the wish to see more substantial discussion of the issue of phenotypic robustness and potential links to human disease. The question of anastomosis itself is something that has not been addressed in humans, as it is a rather detailed phenotype, observable where predictive patterning occurs and can be dynamically studied. As such, there is a lack of literature and knowledge on signalling pathways that drive anastomosis in humans, and not many have been identified in experimental systems or animal models either. Flt1 and Vegf signalling, junctional molecules and a few other pathways have been shown to be involved, but nothing is known so far about Svep1 and anastomosis in other systems. We will attempt to complement the discussion to make this clearer.

      • There are typographical errors in the text so a further proof-read is required. *

      Response: Thank you, these will be corrected.

      *Reviewer #3 (Significance (Required)):

      This manuscript provides an incremental conceptual advance in our understanding of the molecular mechanisms responsible for vascular anastomosis during developmental angiogenesis. The manuscript will be of interest to developmental biologists and vascular biologists.

      My field of expertise pertains to angiogenesis and lymphangiogenesis in the setting of cancer and other diseases. *I am not a developmental biologist.

  4. May 2021
    1. Idempotency means calling a method multiple times without changing the result. The idempotent methods are required for Webhooks because a resource may be called multiple times if the network is interrupted. In this scenario, non-idempotent operations can cause significant unintended side-effects by creating additional resources or changing them unexpectedly. For businesses that rely on data, non-idempotency poses a considerable risk.

      I don't think we need this paragraph.

      We can start with -

      There could be scenarios where your endpoint might receive the same webhook event multiple times. This is expected by design and can be handled easily using the x-razorpay-event-id header.

      Check the value of x-razorpay-event-id in the webhook request header. The value of this header is unique per event. You can cross-reference it on your end to identify whether an event with the same header value has already been processed, and so avoid duplicates.

      But why does Razorpay send the same event multiple times? To avoid an event being missed, Razorpay follows at-least-once delivery semantics. In this approach, if we do not receive a successful response from your server, we resend the Webhook.

      There could be situations where your server accepts the event but fails to return a response within 5 seconds. In such cases, the session is marked as timed out. It is assumed that the Webhook has not been processed, and it is sent again. Ensure your server is configured to handle the same event details being received multiple times, using the solution mentioned above.
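      A minimal sketch of the duplicate check described above might look like the following. Only the header name x-razorpay-event-id comes from the text; the function name, the return values and the in-memory set are assumptions for illustration. A real endpoint would more likely record processed ids in a durable store shared by all server instances.

```python
from typing import Callable, Mapping, MutableSet

EVENT_ID_HEADER = "x-razorpay-event-id"  # header name taken from the text above


def handle_webhook(
    headers: Mapping[str, str],
    payload: dict,
    seen_event_ids: MutableSet[str],
    process: Callable[[dict], None],
) -> str:
    """Hypothetical handler: run `process` at most once per event id.

    Returns "processed", "duplicate" or "missing-id".
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    event_id = normalised.get(EVENT_ID_HEADER)
    if event_id is None:
        return "missing-id"
    if event_id in seen_event_ids:
        # Already handled: acknowledge without repeating side effects.
        return "duplicate"
    process(payload)
    # Record the id only after processing succeeds, so that a failed
    # attempt can still be handled when the event is delivered again.
    seen_event_ids.add(event_id)
    return "processed"


# Example: the same (made-up) event delivered twice.
seen: set = set()
handled: list = []
headers = {"X-Razorpay-Event-Id": "evt_example_1"}
first = handle_webhook(headers, {"example": True}, seen, handled.append)
second = handle_webhook(headers, {"example": True}, seen, handled.append)
print(first, second, len(handled))
```

      Because a slow response is treated as a failed delivery, it may also help to acknowledge the request quickly and do slow work afterwards; that part is outside this sketch.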

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their comments, criticisms and suggestions that will help to improve the quality of our manusrcipt.

      Please find enclosed in this initial response our answer to each point raised by the reviewers.

      Please note that several answers would normally be accompanied by an additional figure that could be added to the fully revised version of the manuscript. However, these additional figures could not be included in the format in which we have to submit our answers; we are ready to send a PDF file containing our answers with the additional figures upon request.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The paper by Genest et al. describes the effect of flotillins and sphingosine kinase 2 to stabilize AXL as a mechanism to promote epithelial-mesenchymal transition in breast (cancer) cells. The potential role of vesicles trafficking EMT-promoting proteins is of high interest in the field, also for exploring new opportunities of pharmacological targeting. However, the paper fails to convincingly demonstrate that the proposed mechanism is of real importance to support or promote EMT for the following main reasons:

      1-a) The role of flotillins is studied only by overexpression and in the context of non-cancerous MCF10A cells, while breast cancer cells of epithelial-like origin are not analyzed.

      Regarding the first part of the point raised here, we are not sure that we correctly understand the sentence “[…] while breast cancer cells of epithelial-like origin are not analyzed”. Indeed, we used the breast cancer cell line MDA-MB-231 and a derived cell line that we generated by knocking down flotillin expression (MDA-MB-231shFlot2) in the second part of this study (Figure 6C, F and H and S7A, E and F). This previously characterized cell line allowed us to demonstrate that abolishing flotillin overexpression was sufficient to significantly inhibit the invasive properties of MDA-MB-231 cells (Planchon et al, J Cell Science 2018, https://doi.org/10.1242/jcs.218925).

      Although flotillin upregulation induces some major mechanisms of the EMT process in MCF10A cells, flotillin downregulation was not sufficient to reverse the EMT phenotype in MDA-MB-231 cells. This could be explained by the fact that EMT is a multifactorial process and that MDA-MB-231 cells went through too many irreversible changes leading to this process. By contrast, when we analyzed EMT markers after SphK2 inhibition or knock down in MCF10AF1F2 and in MDA-MB-231 cells (Figure 6A-C), we could observe a significant decrease in ZEB1 expression.

      1-b) This is in contrast with the purpose of the paper (see abstract, introduction, patients' data) which is to study tumors and EMT. Effect of shRNAs is also not reported, making it difficult to estimate the importance on the EMT phenotype.

      As we mentioned in our manuscript, previous studies by other groups who downregulated flotillin expression in different cancer cell lines using siRNA approaches or re-expression of miRNAs that inhibit flotillin expression, already showed flotillin participation in EMT (for review please see Gauthier-Rouvière et al, Cancer Metastasis Review, 2020, doi: 10.1007/s10555-020-09873-y).

      In this context, the novelty and the first goal of our study was to investigate how strongly flotillin upregulation contributes to EMT induction. To achieve this goal, we chose on purpose to use non-tumoral epithelial cells that do not harbor the anomalies already favoring EMT, unlike the cancer cell lines used in previous studies. In these non-tumoral models (the human MCF10A and mouse NMuMG mammary epithelial cell lines), we ectopically overexpressed flotillins (MCF10AF1F2 and NMuMGF1F2) to levels similar to those observed in invasive breast cancer cells. Using this approach, we found that flotillin overexpression is enough to induce EMT.

      1-c) Then, alteration of EMT should be concluded also from other non-genetic functional parameters, not just by markers. For instance: was morphology of the cells changed? Was cell migration affected with F1F2?

      Our conclusion that flotillin upregulation is sufficient to induce EMT in MCF10AF1F2 and NMuMGF1F2 cells is not based only on genetic functional parameters or markers. For instance, Figure S1 (panels H and I) shows a strong modification of the cell morphology and of the actin cytoskeleton organization in NMuMG cells upon flotillin upregulation. NMuMGF1F2 cells became flat and lost their apical F-actin belt and exhibited an increase in stress fibers.

      As shown below (Additional Figure 1), similar modifications of the cell morphology and of the F-actin cytoskeleton organization occur also when flotillins are upregulated in MCF10A cells (see below the comparison of MCF10A and MCF10AF1F2 cells) (these data could be added in the manuscript).

      ADDITIONAL FIGURE 1 CAN NOT BE ADDED BUT IS AVAILABLE UPON REQUEST

      Additional figure 1: Upregulation of flotillins in MCF10A cells leads to changes in the cell morphology and in F-actin cytoskeleton organization. Comparison of the morphology and of the actin cytoskeleton organization in MCF10AmCh and MCF10AF1F2 cells. Confluent cells were fixed and stained for F-actin (green) using Alexa488-conjugated-Phalloidin and for nuclei (blue) using Hoechst (in panel A flotillin2-mCherry signal is shown). (A) Upper panels show the maximum intensity projection images (MIP) of MCF10AmCh (control) and MCF10AF1F2 (flotillin overexpression) cells obtained from a stack of images acquired by confocal microscopy. Lower panels show magnified images from the boxed areas, including one single plane and the x-z and y-z projections along the indicated axes. (B) 3D reconstruction images obtained from the region in the boxed area from the MIP-images shown in A.

      These data show that in MCF10AF1F2 cells the apical actin belt is lost and the height of the cellular monolayer is lower compared with control MCF10AmCh cells.

      We also analyzed the migration capacity of these cells (shown in Figure 3G of the submitted manuscript). Briefly, using a Boyden chamber assay, we showed that flotillin upregulation significantly increased migration of MCF10A cells (Figure 3G). We previously demonstrated that flotillin upregulation also promotes cell invasion in 3D using a spheroid assay (Planchon et al, J Cell Science, 2018, https://doi.org/10.1242/jcs.218925). As shown below (Additional Figure 2), using a wound healing assay, we also observed that cell velocity is higher in flotillin-overexpressing NMuMGF1F2 cells than in control NMuMG cells (this could be added to the manuscript).

      ADDITIONAL FIGURE 2 CAN NOT BE ADDED BUT IS AVAILABLE UPON REQUEST

      Additional figure 2: Upregulation of flotillins in NMuMG cells increases cell velocity in a 2D migration assay. (A) Representative images of NMuMGmCh (control) and NMuMGF1F2 cells during wound healing. The yellow dashed line indicates the leading edge of the migrating monolayer at the indicated times. The trajectory of 60 individual cells was tracked and the cell velocity and persistence of migration were extracted. The histogram shows the velocity quantification (mean ± SEM of 4 independent experiments). (B) Representative trajectories of individual cells.

      2) AXL up-regulation is not very strong (2-fold). What is unclear is if the minimal AXL increase due to F1F2 really provides a significant contribution to the EMT phenotype (as the authors conclude). The siRNA experiment knocks down all AXL, not just the F1F2-induced levels, making it difficult to estimate the real effect of the mechanism proposed.

      As shown in figure 3A and D, in MCF10AF1F2 cells compared with MCF10AmCh cells, we measured a significant 2.5 ± 0.7-fold increase in the AXL protein level. We do not think that this can be considered as a minimal increase.

      Considering that flotillin upregulation may affect simultaneously different receptors (Figure S2I, Figure S6A-F), we did not expect that downregulating a single receptor would have a major impact on the level of EMT markers and on cell migration. Yet, after knocking down AXL in MCF10AF1F2 cells, we observed a decrease in ZEB1 and N-cadherin expression and the re-expression of E-cadherin (Figure 3D-F) and the inhibition of cell migration (Figure 3G). The fact that we observed such an effect by downregulating AXL, which according to Reviewer #1 is minimally increased, might be explained by its well-known ability to act not alone but through cross-talk with other signaling receptors (Graham et al, Nature Reviews Cancer 2014; Halmos and Haura, Science Signaling 2016; Colavito et al, Journal of Oncology 2020).

      As suggested by Reviewer #1, ideally, it would be interesting to bring back AXL to its level in MCF10AmCh cells to better evaluate only the contribution of its increase. However, adjusting so precisely the efficacy of AXL downregulation by siRNA seems quite difficult to achieve.

      3) Why didn’t the author focus on EphA4 (or to a lesser extent ALK), which showed better regulation ?

      As we mentioned (page 18) “the available tools allowed us to validate this result only for AXL, but not for EphA4 and ALK”.

      Nevertheless, for EphA4, we showed in Figure S6 that it is located in flotillin-positive late endosomes (Figure S6 A and C, for MCF10AF1F2 and NMuMGF1F2 cells, respectively) in a phosphorylated form (using an antibody against P-Y588/Y596-EphA4 that works in NMuMG cells, Figure S6D). However, the signals obtained by western blotting using the same antibody were too low to validate any significant variation of EphA4 Y-phosphorylation status, as suggested by the results from the phospho-RTK array.

      Regarding ALK, the increase in its phosphorylation, suggested by the phospho-RTK array, remains puzzling to us. By western blotting of cell lysates and in the presence of positive controls, we did not detect any positive signal for phosphorylated ALK and even for total ALK in MCF10A and MCF10AF1F2 cells. In addition, to our knowledge, ALK expression in MCF10A cells has never been reported in the literature. These observations did not encourage us to pursue our investigations on ALK.

      Moreover, several points led us to focus on AXL. Indeed, AXL expression is associated with the acquisition of a mesenchymal cell phenotype, invasive properties, and resistance to treatments and AXL is an attractive therapeutic target against which several inhibitors are in preclinical and clinical development (Shen Y et al. Life Sciences 2018). Moreover, AXL expression in tumors is attributed to post-transcriptional regulation, but the mechanisms are totally unknown. Understanding how its stabilization and signaling can be triggered by flotillin-mediated endocytic pathways is new and of high significance for the cancer field and the trafficking community.

      3) The conclusions of the manuscript are contradicted by the reported clinical data. In Figure S4 the authors clearly observe co-expression of Flotillin 1 and AXL prevalently in luminal breast cancers, which is the subtype known to not be driven by EMT. This evidence already indicates that this (otherwise interesting) mechanism is not relevant to EMT in breast cancer. So, the conclusions are not supported by the data, and the experimental setup and model chosen are not appropriate to generalize the findings to cancer.

      We acknowledge that flotillin 1/AXL co-expression is highest in the luminal subtype. If this co-expression had been observed only in this particular subtype, we would agree that it argues against a role for flotillin and AXL co-overexpression in EMT in tumor cells. However, our results show that flotillin 1 and AXL are also co-expressed in other subtypes that have undergone EMT. Considering this observation and the influence of flotillin upregulation on AXL overexpression we reported here, we believe that the point raised by the Reviewer is not sufficient to exclude that the co-upregulation of flotillins and AXL can participate in EMT induction in breast cancer cells.

      **Minor (here the most important):**

      4) The point of Figure 2 is not clear. Why should this part have such a central role in the story? The data presented are not followed up in the rest of the paper. Moreover, in some cases the upregulations are also questionably significant (for example, RAS and STAT3 are not even 2-fold).

      Moreover, the error bars are so small that it seems unrealistic that the plots indicate three independent experiments.

      Because the activation of oncogenic signaling pathways is crucial to promote EMT, we think that analyzing these pathways in the context of flotillin upregulation is coherent with the message of the paper.

      To our knowledge, the amplitude of up- or down-regulation has nothing to do with its significance. The amplitude also depends strongly on the context (stimulation with an agonist, overexpression of GEF, etc). For instance, increases lower than 2-fold are frequently reported (Bodin and Welch, Mol Biol Cell, 2005; Miura SI et al, Arteriosclerosis, Thromb and Vasc Biology, 2003; Matsunaga-Udagawa R et al, J Bio Chem 2010) when assessing the activity of Ras or small GTPases, but they represent real upregulations. Furthermore, Ras activation is supported by the downstream 4-fold activation of ERK that we measured (Figure 2C).

      In Figure 2, panels B, C, E, F and J, considering the amplitude of the mean increases shown, the error bars corresponding to SEM do not seem disproportionately small.

      As the Reviewer seems to insinuate that we have not performed independent experiments, we present in the table below the detailed results, all obtained from independent experiments.

      | Panel | Parameter measured | Number of independent experiments | Fold increase in MCF10AF1F2 cells compared with MCF10AmCh cells in each experiment | Mean | SEM | p-value |
      |---|---|---|---|---|---|---|
      | B | Ras-GTP | 5 | 1.95 ; 1.96 ; 1.18 ; 1.67 ; 1.86 | 1.72 | 0.14 | 0.001 |
      | C | Phospho-ERK | 5 | 1.24 ; 5.43 ; 3.22 ; 6.11 ; 3.52 | 3.71 | 0.73 | 0.0042 |
      | E | Phospho-AKT | 4 | 2.29 ; 6.54 ; 3.76 ; 2.6 | 3.8 | 0.97 | 0.0276 |
      | F | Phospho-STAT3 | 4 | 1.63 ; 1.63 ; 2.42 ; 1.60 | 1.82 | 0.20 | 0.0066 |
      | J | Phospho-SMAD3 | 8 | 4.1 ; 5.12 ; 6.29 ; 1.82 ; 2.58 ; 6.66 ; 2.82 ; 5.40 | 4.35 | 0.64 | 0.0001 |

      In the legend to figure 2 panels C, E, F, J, “The histograms show […] with control MCF10AmCh cells calculated from 4 independent experiments” was corrected to “The histograms show […] with control MCF10AmCh cells calculated from at least 4 independent experiments” as data shown in panel J were actually calculated from 8 independent experiments.
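
      For readers who want to see how a mean ± SEM pair relates to per-experiment fold changes, a minimal sketch follows. It assumes that the SEM is the sample standard deviation (n − 1 denominator) divided by the square root of the number of experiments; the response does not state the formula used, so this is an assumption, and the helper name `mean_and_sem` is hypothetical. The input values are those listed for panel F (Phospho-STAT3).

```python
import math
from statistics import mean, stdev


def mean_and_sem(values):
    """Return the mean and the standard error of the mean of the values."""
    n = len(values)
    # Assumption: SEM = sample standard deviation (n - 1) / sqrt(n).
    return mean(values), stdev(values) / math.sqrt(n)


# Fold increases listed for panel F (Phospho-STAT3), one per experiment.
phospho_stat3 = [1.63, 1.63, 2.42, 1.60]

m, sem = mean_and_sem(phospho_stat3)
print(f"Phospho-STAT3: {m:.2f} ± {sem:.2f} (n = {len(phospho_stat3)})")
```

      Under that assumption, the four panel F values give 1.82 ± 0.20, in line with the figures reported for that panel.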

      5) More robust statistical analysis should be provided in the Figure 1 to support that EMT is suppressed with F1F2 overexpression. For instance a more standard GSEA on hallmark signatures.

      To avoid confusion, we understand that Reviewer #1 meant “… that EMT is induced with F1F2 overexpression” and not “… suppressed …”.

      As recommended by Reviewer #1, we performed a GSEA on the hallmark signature and the results are already included in the current revised version of our manuscript (figure 1C).

      6) In Figure 3 E-Cadherin is rescued with siAXL in the IF but not in the western blot.

      Using siRNA transfection, we can have a mosaic effect due to the fact that not all the cells of the sample are transfected and thus efficiently knocked down. This mosaicism was clear when we analyzed E-cadherin by immunocytochemistry. Indeed, in some cells, probably the ones that have been more efficiently transfected with the AXL siRNA, E-cadherin expression is clearly seen. By western blotting, which provides a global analysis in which transfected and non-transfected cells are mixed, this was not significantly higher than in MCF10AF1F2 cells transfected with a control siRNA, although there was a trend towards increased E-cadherin expression in MCF10AF1F2 transfected with the AXL siRNA.

      For the revised version of our manuscript we will try to improve the efficacy of the AXL siRNA and test whether we can fully rescue E-cadherin expression. The corresponding panel could be modified according to the data we will obtain.

      7) Some sentences require clarifications. The authors should be more clear on why ZEB2 antibody was not available or what they mean with "Unfortunately the available tools..".

      Page 7: we wrote «no anti-Zeb2 antibody is available». We should have said: «none of the anti-Zeb2 antibodies tested worked in MCF10A cells». We decided to remove “no anti-Zeb2 antibody is available” from the sentence to avoid confusion in the revised version of our manuscript.

      Page 19: we wrote «unfortunately the available tools» to refer to the available tools against EphA4 and ALK that did not allow us to validate the data obtained using the phospho-RTK array showing that the Y-phosphorylation of these two RTK is increased in cells with upregulated flotillins (see also our answer to major point 2).

      8) Western blot from the CHX experiment should be shown, at least in the supplements. Again, the standard deviation in this experiment is minimal, was this really an average of three independent experiments (and not three western on the same lysates)?

      As asked, a representative western blot is now shown in Figure 3C in the current revised version of the manuscript.

      As indicated in the legend to the figure already in the initial version of our manuscript: “The results are the mean ± SEM of 6 to 8 independent experiments depending on the time point, and are expressed as the percentage of AXL level at T0”. We wish to reassure Reviewer #1 that the results are really based on western blots performed on different lysates obtained in independent experiments. We can show the Reviewer these data obtained from independent experiments if necessary.

      9) All conclusions are derived from a single cell line, MCF10A. NMuMG cells are shown at the beginning but not used for the rest of the paper. Anyway, if this wants to be a cancer research paper, then cancer cells need to be used.

      It is true that we did not use a cancer cell line at the beginning of the paper because, as expected, flotillin knock-down did not revert the mesenchymal phenotype of MDA-MB-231 cells toward an epithelial one. If this had been obtained, we would have used these cells from the beginning of the paper. The lack of reversion of the mesenchymal phenotype after flotillin knock-down was expected. Indeed, the EMT process is multifactorial and the decrease of flotillins alone is obviously not sufficient to reverse it in a tumor cell line bearing multiple oncogenic mutations. Moreover, because we wanted to assess whether flotillin upregulation is sufficient in normal cells to acquire the properties of tumor cells and particularly to induce EMT, we used human MCF10A and murine NMuMG cells, two non-tumoral epithelial cell lines. Until now, the studies carried out on the effects of flotillin overexpression have used tumor cells that already harbor pro-oncogenic perturbations, which made it impossible to show that flotillin overexpression alone activates oncogenic processes leading to EMT, or to identify the downstream mechanisms.

      Nevertheless, we have used the MDA-MB-231 cell line in several experiments to analyze: i) AXL distribution and internalization following the knock-down of flotillins (Figures 4 and S5), ii) SphK2 and flotillin 2 co-localization and co-endocytosis (Figures 5A and D and S7A), iii) the impact of SphK2 inhibition on AXL expression level, distribution and endocytosis (Figure 6), iv) SphK2 expression level upon flotillin knock-down (Figure S7E) and AXL expression level upon SphK1 inhibition (Figure S7F). With these experiments performed in MDA-MB-231 cells, we showed that AXL and SphK2 colocalize in flotillin-positive late endosomes and are co-endocytosed from the plasma membrane containing flotillin-rich domains to flotillin-positive vesicles. We also demonstrated that flotillins and SphK2 control the rate of AXL endocytosis and its stabilization.

      We recently obtained additional data with HS578T cells, another triple negative breast cancer cell line, on the co-trafficking of AXL and flotillins as well as the co-trafficking of SphK2 and flotillins (Additional Figure 3; these data could be added in the fully revised version of our manuscript).

      In addition, we observed that inhibiting SphK2 also decreased the level of AXL in HS578T cells. These data could be added in the revised version of the manuscript (see data in our answer to Point #1 from Reviewer #3).

      ADDITIONAL FIGURE 3 CAN NOT BE ADDED BUT IS AVAILABLE UPON REQUEST

      Additional figure 3: Co-trafficking of SphK2 and AXL with flotillin 1 in intracellular vesicles in HS578T cells. HS578T cells co-expressing Flot1-mCherry with SphK2-GFP (A) or AXL-GFP (B) were monitored by time lapse spinning disk confocal video-microscopy. Still images of the boxed area at different time points (min) are shown on the right of each panel. The colored arrows follow three distinct vesicles that are positive for Flot1-mCherry and SphK2-GFP, or AXL-GFP.

      10) The methods section contains inconsistent data about patients' samples (9 are indicated, but the Figure S4 features 37). Then, where those other 527 come from?

      We corrected the manuscript and added all characteristics regarding the 37 patients in the “Supplementary information” section.

      The 527 patients are from another cohort and were used for the analysis of the correlation between the mRNA levels of FLOT1 and p63 in breast cancer biopsies (Figure 2I). This cohort was described in our previous study (Planchon et al. J Cell Science 2018, https://doi.org/10.1242/jcs.218925). In the revised version of our manuscript, we now refer to this previous article in the “Results” section and in the legend to figure 2I to explain the origin and characteristics of this cohort.

      11) Some figures do not match with the legends or with the description in the text. It has not been easy to review this paper.

      We apologize: the figure inserted into the manuscript as figure 2 was actually figure S2 (which therefore appeared twice). However, the correct figure 2 was uploaded on the website of Review Commons and BioRxiv. Regarding the comments made in point 4, it seems that Reviewer #1 examined the correct figure 2 that was uploaded and that matches the legend indicated in the manuscript.

      Besides this mistake, we do not see any other mismatch between figures and legends.

      Reviewer #1 (Significance (Required)):

      I am a cancer biologist working on EMT.

      **Referee Cross-commenting** I have nothing to comment on other's reviews.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Genest and co-authors present in this paper new fascinating evidence on how intracellular trafficking can modulate oncogenic signalling.

      First of all, they show how overexpression of Flotillin1 and 2 in non-cancerous breast lines can induce a strong reprogramming towards an EMT phenotype. They analyse mRNA and protein expression, intracellular distribution of activated proteins, cell phenotypes to demonstrate a strong activation of oncogenic signalling pathways. They then identify AXL as a key player in this process and show how this protein is stabilised upon Flotillin expression. The authors use an amazing variety of approaches to study the endocytosis and the trafficking of endogenous, GFP-tagged, Halo-tagged and Myc-tagged AXL in different cell lines and their data are strong and very convincing, the images are of very high quality and the analysis rigorous. Their data strongly support the hypothesis that high Flotillin levels triggers AXL endocytosis and accumulation in non-degradative late endosomes where signalling remains active. The authors then show how SphK2 has a key role in AXL stabilisation, it colocalises with Flotillin, AXL and CD63 and its activity (which they block by using inhibitors or siRNA) is necessary for flotillin-induced AXL stabilisation and EMT induction. The paper is extremely well written, the data flow logically and they are appropriately presented and analysed. I don't have any major comment and I believe the paper is suitable for publication.

      We thank the Reviewer for the positive appreciation on our manuscript.

      I have only some minor comments/questions: 1) did the authors try to colocalise AXL with endogenous Flotillin in MDA-MB-231 cells? They could use the antibodies used in Fig S1B. Of note, the authors have shown it in luminal tumours in Fig S4C.

      We performed co-immunofluorescence experiments to detect endogenous AXL together with endogenous Flotillin in MDA-MB-231 cells. As shown below (Additional Figure 4), we found AXL and Flotillin in the same intracellular endosomes. Images could be added in the revised version of the manuscript.

      ADDITIONAL FIGURE 4 CAN NOT BE ADDED BUT IS AVAILABLE UPON REQUEST

      Additional figure 4: Endogenous AXL and flotillin 1 are found in the same intracellular vesicles in MDA-MB-231 cells. MDA-MB-231 cells were fixed and labelled with relevant antibodies directed against Flotillin1 and AXL. Scale bar in the main image: 10 µm. Scale bars in the magnified images from the boxed area: 1 µm. Arrows indicate flotillin- and AXL-positive vesicles.

      2) In Fig6G, it appears that AXL-Flotillin colocalization is lost upon SphK2 inhibition. Is this the case? It could be that the correct lipids are necessary for the formation of Flotillin-positive internalisation domains and this could be very interesting and reinforce the model proposed in the paper.

      In figure 6G, cells were not permeabilized. Thus, only AXL at the cell surface was labelled using an antibody against the extracellular domain of AXL. Because flotillin 2 is tagged with mCherry, it could be visualized both at the cell surface and intracellularly, as shown in the inset of the lower panel of figure 6G.

      After 6 hours of treatment with the inhibitor opaganib, we did not notice any major change in AXL-flotillin colocalization at the cell surface. This is somewhat expected because blocking the generation of S1P is more likely to inhibit the invagination of flotillin-rich membrane microdomains rather than their formation.

      3) I would remove the sentence on line 995-997 "to our knowledge this is the first report to describe ligand-independent AXL stabilization..." as the cells are not serum starved in all experiments and animal serum can contain variable amounts of the ligand GAS6.

      We understand and agree with Reviewer #2; this sentence has been changed to “To our knowledge this is the first report to describe AXL stabilization following its endocytosis”.

      Please note that the authors don't have to necessarily address comments 1-2, their paper is already very rich in convincing data.

      Reviewer #2 (Significance (Required)):

      AXL is a major oncogene that promotes EMT in a variety of tumour types. Understanding how its signalling can be triggered by endocytic pathways even in cells that are non-cancerous is very important and of high significance for the cancer field and the trafficking community.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This is an interesting and well written paper describing that upregulated flotillin promotes an endocytic pathway called upregulated flotillins-induced trafficking (UFIT) that mediates AXL endocytosis and allows its stabilization. Consequently, stabilized AXL in these flotillin-positive late endosomes enhances activation of oncogenic signaling pathways that promotes EMT. The authors suggest that Flotillin upregulation-induced AXL stabilization requires the activity of SphK2. However, this latter point is not supported by the data and further studies are needed to support this important conclusion.

      **Major concerns:**

      1. Most of the conclusions are based on effects of high concentrations (50 uM) of an ill-defined SphK2 inhibitor. The experiment described in Figure 6C-H need to be confirmed by downregulation of SphK2.

      We understand that Reviewer #3 is concerned about whether, in our experimental conditions, the effects we observed can really be explained by a specific inhibition of SphK2.

      From the literature, among all the inhibitors described for SphK2, opaganib (ABC294640) is the most specific inhibitor available. It was shown to have no inhibitory effect on SphK1 up to 100 µM (French et al, J Pharmacol Exp Ther 2010; Neubauer HA and Pitson SM, The FEBS Journal 2013). In agreement, we found that PF543, the most specific SphK1 inhibitor, had no effect on AXL level (Figure S7F), unlike incubation with opaganib (Figure 6A and C), and that was confirmed in MCF10AF1F2 cells by the knock down of SphK2 with a specific siRNA (Figure 6B).

      In the literature, depending on the cell lines, opaganib is used in vitro in the 10 to 60 µM range. Opaganib IC50 on recombinant SphK2 was established at 60 µM (French et al, J Pharmacol Exp Ther 2010). In our experiments, opaganib was used at a concentration of 50 µM, below the IC50 value, as previously done by Nichols’ group (Riento et al, PLoS ONE, 2018). In most of our experiments (Figure 6, A, D, E-I, Figure S7D), opaganib was added for a maximum of 10 hours, which is shorter than in other studies (24-48 hours). Furthermore, it was shown that an opaganib concentration of 50 µM does not have any inhibitory effect in vitro on 20 protein kinases tested, including PKA, PKB, PKC, CDK, MAP-K, PDK1 and Src (French et al, J Pharmacol Exp Ther 2010).

      In addition to inhibiting SphK2 in a sphingosine-competitive manner, opaganib was also shown to act as an antagonist of estrogen receptor (ER), and to inhibit ER-positive breast cancer tumor formation in vivo (Antoon JW et al, Endocrinology 2010). If Reviewer #3 is concerned about the possibility that the opaganib downstream effects we observed in our study might be explained by ER inhibition, we note that we used cellular models that do not express ER. Indeed, the MDA-MB-231 cell line is a triple negative breast cancer cell line. MCF10A cells also do not express ER (Lane MA et al, Oncology Reports, 1999) and our transcriptomic analysis (Table S1) did not reveal any increase in the expression of ER genes in MCF10AF1F2 cells in which flotillins are upregulated, thus eliminating a possible non-specific effect of opaganib in these cells.

      In conclusion, we hope that these arguments help to convince Reviewer #3 that our experiments were performed in conditions where we carefully limited the possibility of opaganib off-target effects, on the basis of the currently available opaganib-related data from the literature.

      We totally agree with Reviewer #3 that complementary experiments by downregulating SphK2 must be used. In agreement, we already downregulated SphK2 by siRNA in MCF10AF1F2 cells. This led to a significant decrease in AXL and ZEB1 expression. In the current revised version of the manuscript we have added data obtained with similar siRNA experiments performed in MDA-MB-231 cells (now Figure 6C). In agreement, we observed AXL and ZEB1 downregulation.

      As shown below (Additional Figure 5) we recently obtained similar data in HS578T cells, showing that inhibiting SphK2 also affects AXL protein level in this triple negative breast cancer cell line (these data could be added in the manuscript).

      ADDITIONAL FIGURE 5 CAN NOT BE ADDED BUT IS AVAILABLE UPON REQUEST

      Additional figure 5: SphK2 inhibition decreases AXL level in HS578T cells. HS578T cells were incubated with opaganib (50µM, 10 hours) (A) or with siRNA Ctrl or siRNA SphK2 for 72 hours (B). Cell lysates were blotted with relevant antibodies against AXL, SphK2 and actin. The histograms show AXL level (normalized to actin) expressed as fold-increase compared with the control condition, and data are the mean ± SEM of 3 (A) and 4 (B) independent experiments.

      Reviewer #3 also asks to use the siRNA approach on experiments shown in previous panels D-H (now panels E-I) of figure 6.

      To complement Figure 6D (now Figure 6E), experiments using a siRNA against SphK2 to show that “AXL decrease upon SphK2 inhibition is not due to protein synthesis inhibition” are ongoing, and the data obtained could be added to the fully revised version of our manuscript.

      However, we are not in favor of using a siRNA against SphK2, in addition to opaganib, in the experiments measuring the effect of SphK2 inhibition on the rate of AXL internalization (previously Figure 6E and F, now Figure 6F and G) and on the level of AXL at the cell surface (previously Figure 6G and H, now Figure 6H and I). Indeed, we carefully chose a short (4 hours) incubation with opaganib, at the end of which the total cellular level of AXL was not yet decreased, allowing us to measure unambiguously a defect in AXL endocytosis or a change in the level of AXL at the cell surface. We believe that it would be very difficult to perform similar experiments using a siRNA against SphK2. It would require determining the exact time after siRNA transfection at which the SphK2 level is sufficiently reduced while the AXL level is still maintained. Because transfection efficiency is mosaic, we think it is impossible to synchronize precisely the onset of the siRNA effect.

      1. Does overexpression of SphK2 reverse the effects of the SphK2 inhibitor? In a similar manner, does overexpression of SphK2 enhance stabilization of AXL?

      To answer the first question, it is not clear to us how to test whether SphK2 overexpression can reverse the effects of the SphK2 inhibitor, because the ectopically expressed SphK2 would also be sensitive to the inhibitor. This would require overexpressing a SphK2 mutant that is catalytically active but insensitive to the inhibitor and, to our knowledge, such a mutant does not exist.

      Regarding the second question, we are currently generating a retroviral DNA construct that allows SphK2 to be overexpressed homogeneously in the cell population. We will then test whether this further increases the AXL level through its stabilization. This will be tested in cells in which flotillins are upregulated. As we showed in Figure 6 A and D (previously Figure 6 A and C) that the AXL level depends on SphK2 activity only in cells that overexpress flotillins, we anticipate that there will be no impact in a cell line with a moderate level of flotillin. Results could be added to the fully revised manuscript.

      1. Although the authors suggest recruitment of SphK2 and formation of S1P in UFIT, there are no measurements of S1P. Also, there is no indication that SphK2 is activated despite the fact that ERK and AKT are activated in UFIT and are known to phosphorylate and activate SphK2. Is SphK2 that is recruited to flotillin phosphorylated?

      To answer the first point raised by Reviewer #3, we recently performed, in collaboration with a lipidomic platform, a comparative analysis by quantitative mass-spectrometry of S1P levels in MCF10AmCh and MCF10AF1F2 cells. As we anticipated, the results show a 3.5-fold increase in S1P in MCF10AF1F2 cells compared with MCF10AmCh cells (Additional Figure 6). These data agree with our finding that the SphK2 catalytic activity is required for UFIT pathway-mediated AXL stabilization. This result is also in agreement with the study from Nichols’ group, which detected a decrease in S1P in cells in which flotillins were knocked out (Riento et al, PloS ONE, 2018). The results of the S1P level analysis, along with the complete methodology used, will be added to the fully revised version of our manuscript.

      ADDITIONAL FIGURE 6 CAN NOT BE ADDED BUT IS AVAILABLE UPON REQUEST

      Additional figure 6: Upregulation of flotillins in MCF10A cells promotes an increase in the level of Sphingosine-1-phosphate. The level of sphingosine-1-phosphate was compared by quantitative mass-spectrometry analysis of three independent samples of MCF10AmCh and MCF10AF1F2 cells. The results are expressed in pmol equiv / 1 × 10⁶ cells. The graph shows the value for each sample, and the horizontal bars indicate the mean value for each condition.

      Regarding the second point, we would like to clarify that we do not think that SphK2 interacts directly or indirectly with flotillins, because SphK2 did not co-immunoprecipitate with flotillins (not shown). Thus, investigating the SphK2 phosphorylation status in flotillin immunoprecipitates by western blotting would be pointless. In theory, we could investigate the activity-related phosphorylation status of SphK2 associated with flotillin-rich membranes and endosomes. However, this seems difficult to achieve because, unfortunately, the only two commercially available antibodies against phosphorylated SphK2 are not described to work for immunofluorescence staining. One is against the Thr578 residue (https://www.abcam.com/sphk2-phospho-t578-antibody-ab215750.html), identified as phosphorylated downstream of ERK by Sarah Spiegel’s group (Hait et al, J Biol Chem, 2007). The second is designed to recognize specifically the phospho-Thr614 residue (https://www.abcam.com/sphk2-phospho-t614-antibody-ab111948.html), but this site has not been rigorously demonstrated to be phosphorylated downstream of AKT or ERK or to stimulate SphK2 activity. Thus, considering the lack of appropriate tools, and considering that we already showed, using opaganib, that the catalytic activity of SphK2 is required for the UFIT pathway, we believe that investigating the activity-related phosphorylation status of SphK2 in flotillin-positive vesicles would be difficult to achieve in a reasonable amount of time and would not add substantial value to our present study.

      To answer the question “Is SphK2 recruited to flotillin phosphorylated?” more broadly, we anticipate that this could be the case at least for the Ser419 and Ser420 residues, because Nakamura’s group demonstrated that the phosphorylation of these sites favors the nuclear export of SphK2 (Ding G et al, J Biol Chem, 2007). This group developed an antibody against these phospho-sites that potentially works in immunofluorescence. However, as it is unknown whether phosphorylation of these residues influences the SphK2 activation status, we do not plan to perform immunofluorescence experiments with this tool (not available commercially) because the results would not address the Reviewer’s question.

      1. It should be determined whether the optogenetic system used to induce flotillin oligomerization also induces recruitment and activation of SphK2.

      As we already have all the available tools, optogenetic experiments will be performed to answer this point and the results could be added to the fully revised version of our manuscript.

      As suggested, we plan to perform experiments in which exogenous S1P will be added to cells with a moderate flotillin expression level to check whether it could recapitulate the effect of flotillin upregulation on AXL expression. Results could be added to the fully revised version of the manuscript.

      However, our current results on the localization and the involvement of SphK2 suggest that the generation of S1P involved in the UFIT pathway occurs at the plasma membrane and in late endosomes. Because the exogenous S1P that will be added in the culture medium will not go through the plasma membrane, we anticipate that it could be insufficient to mimic all the mechanisms of the UFIT pathway. Its effect will be limited to the plasma membrane. In addition, these mechanisms are very likely based on a local concentration of S1P in some microdomains (at the plasma membrane and in intracellular membranes) scaffolded by flotillins. It will be very difficult to mimic such local concentration of S1P just by adding S1P to the cells.

      We agree that identifying the S1P receptors involved would be valuable for a better characterization of the UFIT pathway. However, we think that this is beyond the scope of our present study. Among the five known S1P receptors, we do not know whether any could be involved in membrane remodeling at the plasma membrane to promote endocytosis. To our knowledge, involvement of S1P receptors in endocytosis has never been reported. However, based on the work by Nakamura’s group (Kajimoto et al, Nat Comm, 2013 and Kajimoto et al, J Biol Chem, 2018), the S1P1 and S1P3 receptors are involved in membrane remodeling and cargo sorting from the outer membrane of late endosomes (where flotillins accumulate in our cell models). We could hypothesize that these receptors are influenced by flotillins and are involved in the UFIT pathway, but we think that testing this hypothesis would be the subject of a distinct study.

      At the plasma membrane, we totally agree that the effect of S1P could be mediated, as suggested by De Camilli’s group (Shen et al, Nat Cell Biol 2014), by the formation of tubular endocytic structure rich in sphingosine after acute cholesterol extraction. Reciprocally, in our cell models, upregulated flotillins, thanks to their ability to bind to sphingosine (demonstrated by Nichols’ group (Riento et al, PloS ONE, 2018)) and to oligomerize, could create sphingosine-rich membrane regions.

      1. There is a commercial antibody for endogenous SphK2 that can be used to validate and substantiate the data with GFP-SphK2. (F1000Res . 2016 Dec 6;5:2825. doi: 10.12688/f1000research.10336.2. eCollection 2016. Validation of commercially available sphingosine kinase 2 antibodies for use in immunoblotting, immunoprecipitation and immunofluorescence)

      We thank Reviewer #3 for this suggestion and advice. Being able to detect the localization of endogenous SphK2 in late endosomes would be valuable for our study. We already tried, without success, antibodies from Sigma and Cell Signaling Technology (not described to work in immunofluorescence experiments).

      We will follow the advice from Reviewer #3 and test the anti-SphK2 antibody from ECM-Biosciences mentioned in the article by Neubauer and Pitson (F1000Research, 2016). If we obtain interesting results, they will be included in the revised version of our manuscript.

      However, in experiments using SphK2-GFP, we noticed that in live cells, the signal in late endosomes was completely lost after fixation using paraformaldehyde. Similarly, we also observed in live cells that NBD-Sphingosine, added in the culture medium, quickly accumulated in flotillin-positive late endosomes (Additional Figure 7, this data could be added in the fully revised version of the manuscript), but this accumulation was no longer detectable after fixation. Based on these observations, we believe that SphK2 recruitment to flotillin-positive late endosomes is highly labile probably because it mainly involves its interaction with sphingosine molecules that are enriched in these intracellular compartments. This is supported by our observation that addition of opaganib, characterized as a sphingosine competitive inhibitor, displaces SphK2-GFP from flotillin-positive late endosomes in live cells (Figure S7D). In addition, we showed that SphK2-Halo is more recruited in CD63-positive late endosomes in cells overexpressing flotillins (Figure 5E). This could be due to a higher concentration of sphingosine promoted by flotillins (that bind to sphingosine) accumulating in these compartments.

      Thus, we will try the immunofluorescence staining of endogenous SphK2 using the recommended antibody, but it might be difficult to detect its presence in flotillin-rich late endosomes in fixed cells. The data could be added in the fully revised version of the manuscript.

      ADDITIONAL FIGURE 7 CAN NOT BE ADDED BUT IS AVAILABLE UPON REQUEST

      Additional figure 7: Visualization of NBD-sphingosine in flotillin-positive late endosomes. Live HS578T, MDA-MB-231 and MCF10AF1F2 cells expressing Flot1-mCherry were monitored by time lapse spinning disk confocal video-microscopy, 5 min after addition of fluorescent NBD-Sphingosine in the culture medium. On the right are shown still images corresponding to the boxed areas to illustrate the accumulation of NBD-sphingosine in virtually all flotillin-positive endosomes.

      Reviewer #3 (Significance (Required)): This is an interesting paper. If the authors confirm the involvement of Sphk2 and mechanism of action of S1P, this would be an important contribution to the field.

      Modifications made in the initial revised version of our manuscript (at the time of the initial response). A fully revised version will be provided once all the additional experiments requested by the Reviewers have been completed.

      Revisions are highlighted in grey in the initial revised-version of the manuscript

      1) Figure 1 has been modified and now includes results from a GSEA analysis as recommended by Reviewer #1. The texts of the corresponding legend and of the “Results” and “Methods” sections have been modified accordingly.

      1) The Figure 2 version that was inserted in the manuscript was wrong because it was a copy of Figure S2. However, the correct Figure 2 was uploaded to the Review Commons website and accessible for the Reviewers. The correct Figure 2 is now inserted in the manuscript.

      2) In the legend to panels C, E, F, J of Figure 2, the sentence: “The histograms show […] with control MCF10AmCh cells calculated from 4 independent experiments” was corrected to “The histograms show […] with control MCF10AmCh cells calculated from at least 4 independent experiments” because data shown in panel J are actually calculated from 8 independent experiments.

      3) Figure 6 has been modified with the addition of panel C showing the effect of SphK2 downregulation by siRNA on AXL and ZEB1 level in MDA-MB-231 cells. The text has been modified accordingly.

      4) In Figure 3 C, representative western blots have been added as asked by Reviewer #1.

      5) In the Supplementary information section, the full clinicopathological characteristics of only 9 patients were indicated, whereas Figure S4 mentioned 37 patients. We corrected this mistake and now provide the characteristics of all patients.

      6) In the sentence “Conversely, it induced ZEB 1 and 2 mRNA expression (Figures 1H and S1K) and ZEB1 protein expression (Figures 1I and S1L) (no anti-ZEB2 antibody is available)”, we removed “no anti-ZEB2 antibody is available”.

      7) The sentence previously on lines 995-997, "to our knowledge this is the first report to describe ligand-independent AXL stabilization...", has been modified to “To our knowledge this is the first report to describe AXL stabilization following its endocytosis”.

      8) We are now referring to reference 18 (Planchon et al. J Cell Science, 2018) for the description of the cohort of 527 patients with breast cancer because this was missing.

    1. This will seem little to you with your strong practical sense for it takes fifty years for a poet’s weapons to influence the issue.”

      This reminds me of some of the influential poets and writers I admire, and their own perspectives on social activism and global change. It makes me think of how we write things in hopes of inspiring change in people and, in certain cases, to spark a fire of rebellion. And yet by the time a piece of literature has made its way around the world, those who hold the same beliefs but were keener to pursue them in a practical way have already made some kind of change. I think literature is meant to aid people as a whole, for generations to come, and I think what makes a piece of writing so strong is that it still holds meaning no matter what time you are in and that it captures human existence.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the three reviewers for their helpful and valuable comments. We plan to address their criticisms in a revised manuscript and hope that our manuscript will then be significantly improved.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors have presented a very interesting and compelling set of data regarding the impact of conditional deletion of the only known pathway allowing the uptake of pyruvate into mitochondria. The paper comprises two interwoven stories that are both important. The first is the remarkable finding that the majority of excitatory neurons in the cortex (i.e. those under the influence of the CaMKII promoter) show remarkable metabolic flexibility as they tolerate elimination of pyruvate oxidation, considered the major supplier of ATP in neurons. The data on this seem clear, although the authors did not delve into the potential mechanisms of metabolic compensation that likely occurs. Instead they examined whether there was some mal-adaptive compensation and they found clear evidence of this: in the absence of MPC activity the mice are much more prone to epileptic seizures, unveiled experimentally by relatively standard protocols (kindling). The authors present largely very convincing evidence that this mal-adaptive compensation in turn ends up decreasing the activity of KV7.2/7.3 channels, whose job is normally to limit runaway repetitive firing by mediating a hyperpolarizing K+ efflux following an action potential. This channel, put on the map as it was one of the downstream targets modulated by cholinergic metabotropic activation, is also now known to be controlled by Calmodulin and therefore cytosolic Ca levels. Overall, I think at its core this manuscript is interesting and important. There are, however, several weaknesses that, I fear, will diminish the impact on the eventual readership. If these points can be addressed, it will strengthen the longevity of these findings:

      1) It is puzzling why the authors resorted to using shRNA-mediated KD of MPC1 for some of the in vitro studies when they have gone to the trouble of making a floxed CRE-dependent mouse. Primary cells (e.g. Fig 1) or organotypic cultures (Fig. 6) from these mice would have made a more consistent set of starting conditions to compare data across the manuscript. As viruses expressing the CRE recombinase are widely available, these could have been used on mice simply harboring the floxed gene if they are worried about waiting for the expression of the CaMKII promoter for in-vitro conditions.

      This is indeed a good point. Initially, when we started these experiments, we tried to use viruses expressing the CRE recombinase in cultured neurons from mice harboring the floxed gene, as proposed by the reviewer. However, for reasons that we do not fully understand, AAVs or lentiviruses expressing CRE were found to be deleterious for the cultured neurons. In view of this toxicity, we tried using TAT-CRE recombinase, a recombinant cell-permeant fusion recombinase, which we added directly to the medium. However, this strategy proved to be poorly efficient. We finally used cultures of Cre-floxed neurons in which we tried to knock out the MPC1 gene using 4-hydroxytamoxifen in the culture medium. However, we did not obtain satisfactory results because, as previously reported, cortical neurons grow poorly in the presence of 4-hydroxytamoxifen (Nichols et al., Cell Death and Disease, 2018. https://doi.org/10.1038/s41419-018-0607-9). For these reasons we turned to the shRNA strategy and to the use of 3 small molecule inhibitors of the MPC, each with a different chemical structure. Both the RNA interference and the pharmacological approaches gave similar results, reinforcing our confidence in the specificity of the results and the unlikelihood of off-target effects.

      2) The data in Figure 5 gets a little less convincing as using extracellular glutamate to drive Ca elevations is so non-physiological that the results might really be distorted by the participation of something irrelevant to the story, even though it supports the overall interpretation for a role of Ca/CaM in the control of the channel. Similarly, the use of RU360 should be done with caution. The drug, although a useful antagonist of MCU in purified mitochondria, is famously finicky with respect to its ability to cross membranes and could well have off target impact. A much cleaner experiment would be to suppress the expression of MCU via KD. Presumably in the MPC-deficient neurons, this would have minimal impact on Ca signals. Given the frequent ambiguity associated with interpreting pharmacological results, coupled to the central importance of this finding in interpreting the entire paper, I think carrying out experiments with molecular genetic manipulation of MCU is warranted.

      The main point of this figure is to study the capacity of MPC1 KO neurons to handle an increase in intracellular calcium and to regulate calcium homeostasis. To this end, we used strategies described to acutely increase cytosolic calcium, either through membrane depolarization with KCl (Rienecker et al., ASN Neuro. 2020. https://doi.org/10.1177/1759091420974807) or through activation of glutamate receptors using glutamate (see, for example, Wong, Vis Neurosci, 1995. DOI: 10.1017/s0952523800009469). It is important to mention that the concentration of glutamate used in our experiments (10 µM for 2 min) is well below the concentration normally used to induce excitotoxicity (100-500 µM for 30 min). Both stimulations provided similar results and clearly indicated a defect in the clearance of cytosolic calcium in MPC-deficient neurons.

      Regarding the concern with RU360, we are aware of the problems with plasma membrane permeability associated with this compound, and for this reason we included a membrane permeabilizer (0.02% pluronic acid) to facilitate its entry into the cell. This was indicated in the Material and Methods section (line 585) as well as in the figure legend (line 948). To clarify this methodology, we will add this information to the main text. It should be noted that this concern would not apply to the electrophysiological experiments, since in this case the compound was injected directly into the cell. We would like to add that we chose to inhibit the MCU using a chemical inhibitor rather than a shRNA because of the well-known difficulty in obtaining a complete loss of function of the MCU using RNA interference (Nichols et al., Cell Death and Disease, 2018. https://doi.org/10.1038/s41419-018-0607-9). Nevertheless, as recommended by the reviewer, we will attempt to downregulate the expression of MCU using shRNA.

      3) The authors have not really made clear in this paper whether the ability to suppress the phenotype of the MPC deficiency with ketones is really related to a providing TCA cycle support or instead a pharmacological impact on non-TCA related targets (such as the Kv7.2/7.3 channels). Presumably the use of other ketones might circumvent this. The action of ketone bodies has been a topic of considerable interest in neuroscience, given the clinical relevance for childhood epilepsies. Previous studies for example have argued for direct inhibition of the vesicular glutamate transporter (Juge et al. Neuron 2010). The use of other ketones (acetoacetate) would narrow down the interpretations of the data.

      Our results point to two possible mechanisms of action of ketone bodies: i) providing acetyl-CoA to the Krebs cycle, thereby stimulating OXPHOS, and ii) a direct action of 3-beta hydroxybutyrate on the activity of Kv7.2/7.3 channels. The reviewer is asking whether, in addition to 3-beta hydroxybutyrate, other ketone bodies (acetone or acetoacetate) may display antiepileptic activity, which would probably indicate that providing substrates to the TCA cycle is sufficient to prevent neuron-intrinsic hyperactivity and seizures. We agree that this is an interesting question and we will now test the effect of acetoacetate on PTZ-induced seizures in MPC KO mice.

      **other**

      1) In vitro - scramble controls only serve to demonstrate there is no general effect of treating cells with shRNAs, but do not address if there is an off-target effect. The most convincing thing here would be to have an shRNA-insensitive variant that rescues.

      We have used 2 different shRNAs and 3 chemically unrelated inhibitors of the MPC and in all cases we obtained similar results. Therefore, we think that it is unlikely that the effects we observe are due to an off-target activity. The experiment proposed by the reviewer is interesting but extremely difficult. The idea would be to reintroduce a shRNA-insensitive MPC1 into MPC1-deficient neurons treated with shRNA. This is difficult as it is known that the expression level of MPC1 needs to be matched to that of MPC2, otherwise it leads to depolarization of the mitochondria. Obtaining the right level of MPC1 would be extremely difficult to achieve in practice.

      2) Does rescuing CaMK binding to KCNQ channels rescue the phenotypes?

      The question raised by the Reviewer implies that CaM is not constitutively bound to KCNQ channels, which is a matter of debate. As we pointed out in the discussion, ‘Intracellular calcium decreases CaM-mediated KCNQ channel activity (32, 36) by detaching CaM from the channel or by inducing changes in configuration of the calmodulin-KCNQ channel complex (36).’ The CaM-KCNQ tethering is also described in a review by Alaimo and Villaroel, 2018 (doi:10.3390/biom80300579): ‘[…] CaM was first defined as an integral subunit constitutively tethered to the C-terminal region of Kv7.2/3 channels since Kv7.2 mutants that were deficient in CaM binding were unable to generate measurable currents [5,21]. However, this model has been questioned since Kv7.2 channels, carrying a hB mutation [40] or Kv7.4 hA mutated channels [41] that do not bind CaM, can still reach the plasma membrane and are functional.’

      When considering how to manipulate CaM binding to KCNQ, it should also be noted that previous studies on this matter have mainly worked with heterologous systems and through genetic manipulation of CaM (by expression of a dominant negative or by overexpression of CaM) or of the KCNQ binding motif.

      Based on both theoretical and practical issues, we, thus, believe that it is not feasible to implement a straightforward approach that would be compatible with our mouse model.

      An alternative, indirect approach, as indicated by Reviewer #3, would be to test the effect of Ca2+ chelators. Although this is likely to introduce confounding effects through the inhibition of other Ca2+-dependent channels, we propose to focus on trying this option and assess whether a XE991-sensitive component will be unmasked in MPC1 deficient cells.

      3) As the authors imply that BHB activates KCNQ channels, showing this directly in their prep would provide some convincing data. If this is true, why doesn't BHB increase firing rate of WT neurons?

      Activation of KCNQ channels is expected to reduce (not increase) neuronal firing. When exposed to BHB, we indeed found that WT cells also show a trend towards decreased excitability (p=0.08). We will report this trend in the revised figure 5F. Given that KCNQ channels are already available to be recruited upon repetitive firing in WT cells (to a larger extent as compared to KO, as indicated by our data with XE991) it is conceivable that a further potentiating effect of BHB at the concentration used for ex vivo recordings (2 mM) will be limited.

      4) How does the anti-epileptic effects of ketones in this study relate to previous reports of regulation of KATP channels? One of main concerns is that ketones might have a parallel anti-epileptic effect in the MPC1 KO mice that is unrelated to the mechanism proposed here.

      The ketogenic diet is likely to exert several effects, including disruption of glutamatergic synaptic transmission, inhibition of glycolysis, and activation of ATP-sensitive potassium channels, as pointed out by the reviewer. We do not exclude that inhibition of the MPC could also have an impact on the KATP channels, and we are currently exploring this possibility. However, work to dissect the potential implication of the KATP channels would go well beyond the scope of the present paper. Nevertheless, we will certainly raise this important possibility in the discussion.

      **Minor comments:**

      1- What is the MPC1 KO efficiency in CaMK neurons? The western blot in 2c is from the whole cortex and therefore does not show that.

      This is indeed a good comment. However, please note that the MPC1 KO efficiency has also been evaluated in synaptosomes isolated from MPC1 KO cortices. These structures are mainly isolated from neurons (Carlin et al., JCB, 1980. 10.1083/jcb.86.3.831). As shown in figure 2C, these synaptosomes are massively enriched for CaMKII and contain less of the astrocytic marker GFAP in comparison to the whole cortex. The amount of MPC1 in the synaptosomes prepared from the KO animals is strongly decreased. Nevertheless, as recommended by the reviewer, we plan to quantify the efficiency of the KO by performing a double immunostaining for MPC1 and a specific marker for neurons.

      2- Mitochondrial Ca2+ levels are not measured directly, for which there are many tools. This is needed to demonstrate definitively that there is a defect in Ca2+ handling.

      The reviewer raises an important point, and we plan to monitor the levels of mitochondrial calcium in MPC-deficient neurons using mito-Aequorin, a luminescent quantitative probe targeted to mitochondria (Granatiero et al., Cold Spring Harb. Protoc. 2014. 10.1101/pdb.top066118).

      Reviewer #1 (Significance (Required)):

      see above.

      **Referee Cross-commenting**

      It seems we are in reasonable agreement about the pros & cons of the manuscript. I agree that alternative approaches to RU360 are warranted.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      De la Rossa and colleagues examined the consequences of conditionally knocking out MPC1, a subunit of the mitochondrial pyruvate carrier. They found that despite decreased levels of oxidative phosphorylation in excitatory neurons, phenotypically these conditional knockout mice were normal at rest. However, when challenged by inhibition of GABA neurotransmission, these animals developed severe seizure activity and expired. These authors then showed that neurons with an absence of MPC1 were hyperexcitable in part through abnormal calcium homeostasis, which was associated with a reduction in M-type inhibitory potassium channel activity. Intriguingly, the ketogenic diet and the major ketone body beta-hydroxybutyrate were able to reverse these changes.

      This is a carefully conducted research study that reveals cell type-specific alterations of MPC1 deletion and functional consequences. The study design was logical and involved an exhaustive array of methodologies. The manuscript was generally well written and organized, and there are no major concerns. This study shows a direct causal relationship between impaired bioenergetics at the mitochondrial level and subsequent behavioral seizures, and is perhaps the most direct demonstration to date that an intrinsic disturbance of metabolic function can result in seizure activity (through changes in calcium regulation and impairment of ion channel activity). This will be an important contribution to the scientific literature.

      **MINOR:**

      1. Page 4, line 86: Would recommend changing "paroxystic" to "paroxysmal" (the latter which is a more recognized term).

      We will make the change.

      Page 5, line 124: recommend including the concentration of beta-hydroxybutyrate used when first mentioned. In general, concentration and dose information were difficult to find, as well as route of administration (for kainate, page 7, line 175). This type of information was not conveniently presented.

      We will follow the reviewer’s recommendation.

      Page 5, line 128: "both overcomed" is awkward. Would recommend using "both reversed".

      We fully agree and will make the change in the revised manuscript.

      Page 8, line 193: the authors probably meant "astro-MPC1-WT mice", not "neuro-MPC1-WT mice".

      Thank you for the careful reading. This will be changed.

      Page 12, lines 280-282: the authors might want to mention that chronic exposure of BHB might reduce the hyperexcitability of neuro-MPC1-KO mice.

      This point could indeed be discussed.

      Please review entire manuscript and use consistent tense. For example, on page 13, line 309, to maintain the past tense, it should read "We first assessed whether..."

      Thanks for the recommendation.

      Page 13, line 318: the authors used 10 mM BHB when examining calcium levels, but they earlier used 2 mM. They need to explain why they used a different concentration; and 2 vs 10 mM are quite different.

      The reviewer makes a valid point. When we performed the in vitro experiments, we used 10 mM BHB, which is slightly higher than the amount of ketone bodies measured in the blood of mice fed on a ketogenic diet for 2 days (Supplementary figure 4). This concentration of BHB has also been used by others (see for example: Izumi et al., JCI 1998, 101:1121-1132). Later on, when electrophysiology experiments were performed, the person in charge of these experiments followed a previously published protocol by Yellen and colleagues, in which the authors had used 2 mM BHB (Ma et al., J. Neurosci 2007,27: 3618-3625). This explains the differences between the concentrations used in vitro and in vivo.

      Page 13, line 323: it is not necessary to say "...interesting study published during the preparation of this manuscript." This phrase should be deleted, and the relevant reference simply cited.

      We will follow the reviewer’s recommendation.

      The authors need to explain more clearly in the beginning what exactly is meant by "paradoxical" hyperactivity. They provide greater meaning later in the manuscript, but this should be clarified at the outset.

      We will explain why we used this adjective in the beginning as recommended by the reviewer.

      Reviewer #2 (Significance (Required)):

      This is a very important study to show how primary defects in metabolism (i.e., disruption of the mitochondrial pyruvate carrier) can lead to epilepsy. Moreover, it details a primary mechanism that connects cellular bioenergetics to membrane excitability (through changes in calcium homeostasis and M-current function).

      This is a well-conducted study that utilizes a multiplicity of experimental tools to link biochemistry with seizure activity. This type of study is not so readily done, and strengthens the notion that primary defects in metabolism can result in epileptic seizures.

      This study is unique and attempts successfully to be more than just correlational. Hence it is a valuable contribution to the field.

      The audience will likely consist of metabolic geneticists, neurologists/epileptologists, and neuroscientists. This is a beautiful study that runs the translational spectrum from biochemistry to behavior.

      My expertise is in the field of translational epilepsy research, with a focus on mitochondria, metabolism, the ketogenic diet and ketone bodies. Thus, I am qualified to critically evaluate the entire manuscript.

      **Referee Cross-commenting**

      After reading comments and reviewing the manuscript again, would agree with Reviewer #1, and would change recommendation to MAJOR REVISION.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript tests the genetic requirement of the mitochondrial pyruvate carrier (MPC) in regulation of neuronal excitability. The authors find that MPC deficiency in glutamatergic neurons is associated with aerobic glycolysis, inhibition of the M-type K channels, and neuronal hyperexcitability that manifests in increased sensitivity to chemical pro-convulsants without changes in resting conditions. Alterations in Ca homeostasis in MPC-deficient neurons is consistent with reduced mitochondrial membrane potential and attendant diminution of mitochondrial calcium buffering capacity. The authors further show that the effect of MPC deficiency can be phenocopied by treatment of wild type neurons with a chemical inhibitor of the mitochondrial Ca uniporter (MCU). Based on these data, it is proposed that reduced mitochondrial Ca uptake causes neuronal hyperexcitability in the absence of MPC. Overall, the manuscript presents detailed electrophysiology and in vivo seizure studies. However, there is significant disconnect between the actual data in Fig. 6 and the authors' conclusions/proposed mechanism. In particular, the evidence for the role of Ca in the hyperexcitability due to MPC deficiency is the weak link in the authors' argument.

      1. The studies linking reduced mitochondrial Ca uptake to hyperexcitability in MPC-deficient neurons (Fig. 6) have several limitations that significantly weaken the paper: 1a. The Ca measurements in cortical neurons (Fig. 6A-F) are performed under conditions (glutamate/KCl) that are fundamentally different from those used in electrophysiology of CA1 pyramidal neurons (Fig. 6G-N). The electrophysiological excitation is much briefer and less extreme than the chemical stimulation, and it is not clear that the Ca dysregulation occurs at the earliest times (see Fig. 6A).

      This point was also raised by reviewer 1. Please see our response to point 2.

      1b. The conclusion that MCU is functionally responsible for MPC's effect on neuronal excitability is singularly based on the use of RU360 as a chemical inhibitor of MCU but the specificity of this reagent is questionable. Evidence for a cause and effect relationship that directly implicates altered MCU/mitochondrial Ca buffering has not been provided.

      This valid point was also raised by Reviewer 1. Please see our response to point 2 for a complete response. We will downregulate expression of MCU using shRNAs. We will also measure the mitochondrial calcium level in the hope of better understanding whether the phenotype of the MPC-deficient mice is due to impaired mitochondrial calcium uptake.

      1c. There is a large variation in the effect of 10 uM RU360 on firing frequency, comparing Fig. 6H and N (blue traces), including the shape of the traces and values at ramp number 6. This calls into question the reliability of the comparisons in each separate figure.

      Data presented in each single graph in the main Figures were obtained from groups of littermates through recordings conducted on consecutive days. Some caution is warranted when comparing data between different figures (i.e. between different experimental series), as several factors may contribute to inter-experiment variability, including variability between different batches of animals. However, the difference pointed out by the reviewer regarding the values of cell firing reported in Fig. 6H and N is only apparent. When applying depolarizations with ramps of 5 s, a fair number of WT cells infused with RU-360 show high instantaneous firing frequency, especially for the last ramps that steeply reach high current levels. This leads to accommodation/inactivation of the action potential towards the end of the ramps, as shown in the example trace in Fig. 6G. As a result, the current-frequency plot deviates from linearity, as is the case in Fig. 6H (blue trace) and, even more evidently, in Fig. 6N. We have now reanalyzed the same recordings from WT cells infused with 10 µM RU-360 and measured the firing frequency in response to a square depolarizing step (250 pA) of 0.5 or 1 second. No difference was found between the firing frequencies of the cells from Fig. 6H and Fig. 6N (group 1 and group 2, respectively, in the figure below). Although the ramps may lead to some distortion at higher stimulation levels, we have decided to show results from ramps consistently throughout the main figures because this protocol with continuously increasing currents allows us to measure the rheobase and the firing threshold more precisely (as opposed to the stepwise increments of a square stimulation).

      1d. The calcium > PIP2 > M-type K+ channel axis is well established but has not been fully explored in the context of MPC deficiency. The use of a calcium chelator will likely be informative in this context, and would be better evidence for a role of Ca in the MPC effects.

      Although the use of a Ca2+ chelator such as BAPTA is likely to introduce confounding effects through the inhibition of other Ca2+-dependent channels, we will try this option and assess whether an XE991-sensitive component is unmasked in MPC-deficient cells.

      1e. The ability of BHB to rescue various parameters in this and other figures in the paper is interesting but does not directly speak to the specific mechanism as to how MPC deficiency affects neuronal excitability. BHB's effect is consistent with the metabolic flexibility of neurons when the TCA cycle cannot be fueled by glucose/pyruvate (as in GLUT1 or MPC deficiency).

      The mechanism we propose to explain the hyperexcitability of MPC-deficient neurons relies on the low mitochondrial membrane potential and their decreased capacity to buffer calcium. Based on our data, we propose that calcium accumulation in the cytosol disrupts the CaM-KCNQ interaction, leading to hyperexcitability. Indeed, BHB could act in two possible (and parallel) ways: (1) directly on the M-type channels, or (2) on mitochondria by providing acetyl-CoA to the TCA cycle. The use of an alternative ketone body will be informative in disentangling these two possibilities.

      The manuscript (and the field) will benefit from a more scholarly discussion and integration of published literature:

      2a. The published studies on the outcome of pharmacologic MPC inhibition in neurons (Ref 18, Divakaruni et al.) are not only consistent with the bioenergetic effect in Fig. 1, but more importantly, show that interference with MPC does not lead to broad deficiencies in energy metabolism but rather remodel fuel utilization patterns to alternative substrates that feed the TCA cycle (BHB, leucine, etc). For this reason, terms such as "mitochondrial dysfunction" and "OXPHOS deficiency" used throughout the manuscript to describe MPC deficiency are vague and imprecise. In addition, this metabolic flexibility may explain lack of defects under resting conditions. In light of these considerations, the argument as to whether aerobic glycolysis in MPC-deficient neurons explains the lack of phenotype in resting conditions (p 17) seems one-sided. Overall, the studies in ref 18 are relevant to the current manuscript and should be better integrated in the discussion.

      We fully agree with the possibility that the rewiring of cell metabolism in MPC-deficient neurons in the presence of leucine, BHB and other metabolites could explain the lack of phenotype in resting conditions. We thank the reviewer for this highly relevant comment which we will include in the revised discussion.

      2b. Several references are cited to describe the role of OXPHOS vis-à-vis aerobic glycolysis in neuronal function. At times, however, the authors' statements are not consistent with what these papers actually show (or do not show). For example, see the use of refs 6 and 44 on p17 of the discussion, where the authors state that aerobic glycolysis uncoupled from OXPHOS is sufficient to provide ATP for normal neurotransmission, but this does not mean OXPHOS is not needed.

      We agree that these references are not appropriate here and they will be removed.

      2c. Although the XE991 experiments support an important role for the M-type channels in the altered excitability with deficiency, it is not clear that the proposed mechanism can explain all of the electrophysiological differences, particularly those resting properties that are measured without a Ca challenge to the neurons. It would be good to discuss other possible mechanisms that could affect neuronal excitability.

      Our results point to M-type channels as important players in the phenotype of the MPC-deficient mice. Previous reports indicate that inhibition of this channel by XE991 can modulate input resistance, membrane potential and firing threshold of pyramidal cells (e.g. Shah et al, 2018, doi/10.1073/pnas.0802805105; Hu et al. 2007, DOI:10.1523/JNEUROSCI.4463-06.2007; Petrovic et al., 2012, doi:10.1371/journal.pone.0030402). We also found that XE991 induced a shift towards more negative potentials in the firing threshold of WT cells, but not in MPC1 deficient cells (-3.3±0.6 vs. -0.4±1.0, n=9, 8, p=0.027). However, we agree with the reviewer that the phenotype is probably highly complex and that additional mechanisms may contribute to modulate the intrinsic excitability of MPC-deficient neurons. One such mechanism could be closure of KATP channels, which we are currently investigating. This will be discussed.

      Reviewer #3 (Significance (Required)):

      The significance of the advance: The studies provide genetic evidence for the role of MPC in neuronal excitability.

      The work in the context of existing literature: Please see specific comments above under point 2 regarding the need for a scholarly discussion and integration of existing literature.

      Audience that might be interested: mitochondrial bioenergetics and metabolism and metabolic control of neuronal excitation.

      Keywords describing expertise: metabolism, mitochondria and electrophysiology.

    1. Judgments made by different people are even more likely to diverge. Research has confirmed that in many tasks, experts’ decisions are highly variable: valuing stocks, appraising real estate, sentencing criminals, evaluating job performance, auditing financial statements, and more. The unavoidable conclusion is that professionals often make decisions that deviate significantly from those of their peers, from their own prior decisions, and from rules that they themselves claim to follow.

      As educators (and disciplinary "experts") we like to think that our judgements on student performance are objective. As if our decisions are free from noise. I often point out to my students that their grades on clinical placements may be more directly influenced by their assessor's relationship with their spouse than by their actual clinical performance.

    1. Can a page opt-out of showing annotations

      One of the most important questions. We wrote about it here and here. Many people voiced their feelings about this in the six months or so that we spent investigating it and exploring it.

      The tension is between what users want and what authors want. The way the web works now, authors control what is delivered by the web server, users control how they consume web content. If I want to install the "Drumpf" chrome extension, I can, and web sites can't (easily?) block me from doing so. Greasemonkey is another good example of this.

      If the browser comes w/ the native ability to fetch, anchor and display annotations, but as the user I have the ability to decide which services I subscribe to-- then should page authors be able to block me from that? And if you think they should, should that blockage only be for public commentary, but not personal notes or private group annotation?

      One particularly thorny problem is around how governments and other public entities might want to use that same blocking power to prevent you from marking up their sites and documents. Should your government, or any government, be able to block its citizens or others from critical analysis of documents that it hosts? And if not, then how do you distinguish them from just any old page author?

      For the moment, a countermeasure that authors can employ is to add shrapnel to their web pages which blocks some of the strategies that are used. Wordpress plugins are available that do this. In a crude sense, this may be a reasonable compromise.

    1. Author Response:

      Reviewer #1 (Public Review):

      We thank the Reviewer #1 for their valuable comments. We agree with the Reviewer that our current results are not sufficient to confirm the therapeutic effects. The statement related to therapy is removed.

      The study by Song and colleagues explores the role of circRNAs in fibrosis of the endometrium. Endometrial cells from patients with and without fibrosis were subjected to expression profiling analysis, and circPTPN12 and miR-21-5p were markedly altered in endometrial fibrosis, with circPTPN12 acting as an inhibitory factor for miR-21-5p. Through the use of various molecular approaches, the authors further show that miR-21-5p inhibition results in upregulation of ΔNp63α, a transcription factor that induces EMT. The role of circPTPN12 was also confirmed in vivo using a mouse model of mechanically induced endometrial fibrosis. The authors concluded that targeting the circPTPN12/miR-21-5p/∆Np63α pathway may be a therapeutic strategy for endometrial fibrosis.

      The authors clearly and convincingly show the involvement of the circPTPN12/miR-21-5p/∆Np63α in EMT and its potential involvement in endometrial fibrosis. Whether or not this can be a therapeutic target is too preliminary at this point. First because the in vivo experiments confirm the link between circPTPN12/miR-21-5p/∆Np63α at the RNA level only (p63) and it would be more convincing to see protein data as well.

      We did try to detect ΔNp63α protein in mice by immunochemistry and immunofluorescence, using three antibodies (CST, cat# 67825 and 39692; Abcam, ab124762). Unfortunately, we did not obtain positive results. However, ΔNp63α mRNA was significantly changed.

      The involvement of p63 in the process remains a little elusive in this paper.

      We have reported that ΔNp63α is ectopically expressed in endometrial epithelial cells in IUA patients (Cao et al., 2018), and showed that ΔNp63α promotes the expression of SNAI1 by DUSP4/GSK3B pathway and induces EECs-EMT and fibrosis (Zhao et al., 2020). We've put this description of ΔNp63α in the discussion section (2nd paragraph).

      In addition, if the authors believe this pathway can be a real future target to treat endometrial fibrosis, they could better contextualise such a statement, specifically describe what kinds of therapeutic intervention they think of, like regression or prevention of fibrosis. These should be tested in vitro and in vivo.

      Our results showed that replenishing miR-21-5p can reverse EMT and remit endometrial fibrosis in vivo and in vitro. However, therapeutic use of miR-21-5p in the clinic needs more research in other animal models such as rats, pigs, and non-human primates. Thus, we removed the therapeutic statements (page 1, Line 1-2; page 2, Line 37-40; page 4, Line 74-76; and page 13, Line 273).

      More evidence of the involvement of circPTPN12/miR-21-5p/∆Np63α and the correlation between the three players using clinical material is also necessary.

      The involvement of ∆Np63α in endometrial fibrosis was demonstrated in our published paper, and those results are cited in this paper (Zhao et al., 2020). The correlation between circPTPN12 and miR-21-5p in clinical material is shown in Figure 2J. In vivo and ex vivo experiments confirmed that overexpression of circPTPN12 downregulates miR-21-5p and upregulates ∆Np63α (Figure 3H/Figure 4J/Figure 5B/Figure 5E). In addition, ex vivo experiments suggested that the decrease of ∆Np63α is secondary to the increase of miR-21-5p (Figure 4C-E).

    1. In a way she realized that she herself was doomed, that sooner or later the Thought Police would catch her and kill her, but with another part of her mind she believed that it was somehow possible to construct a secret world in which you could live as you chose. All you needed was luck and cunning and boldness. She did not understand that there was no such thing as happiness, that the only victory lay in the far future, long after you were dead, that from the moment of declaring war on the Party it was better to think of yourself as a corpse. 'We are the dead,' he said. 'We're not dead yet,' said Julia prosaically. 'Not physically. Six months, a year--five years, conceivably. I am afraid of death. You are young, so presumably you're more afraid of it than I am. Obviously we shall put it off as long as we can. But it makes very little difference. So long as human beings stay human, death and life are the same thing.' 'Oh, rubbish! Which would you sooner sleep with, me or a skeleton? Don't you enjoy being alive? Don't you like feeling: This is me, this is my hand, this is my leg, I'm real, I'm solid, I'm alive! Don't you like THIS?' She twisted herself round and pressed her bosom against him. He could feel her breasts, ripe yet firm, through her overalls. Her body seemed to be pouring some of its youth and vigour into his. 'Yes, I like that,' he said. 'Then stop talking about dying. And now listen, dear, we've got to fix up about the next time we meet. We may as well go back to the place in the wood. We've given it a good long rest. But you must get there by a different way this time. I've got it all planned out. You take the train--but look, I'll draw it out for you.' And in her practical way she scraped together a small square of dust, and with a twig from a pigeon's nest began drawing a map on the floor.

      the idea of life and death, that they are already dead

    1. In some important professions, such as physics and engineering, Asian Americans are overrepresented and African Americans underrepresented. We presumably get better research because of this. This may or may not outweigh the inequity of unequal group representation. That is a social decision.

      This is a great article, but this statement irks me. "We presumably get better research out of this" - I do not think we can presume that. While a stopwatch may make a truer meritocracy (although one can argue that environment still plays a part in this), there are certainly a tremendous number of environmental factors involved in what drives overrepresentation of certain racial or ethnic groups in "high-achieving" professions like physics or engineering.

    1. Steven J. Vaughan-Nichols - Technology Journalism (FLOSS Weekly Episode 629). Doc Searls and Jonathan Bennett talk with Steven J. Vaughan-Nichols about what's happening in technology journalism, with the open source world he knows perhaps better than any other journalist on the case, and with where he got started: in space and space technologies. (Bonus fact: Steven digs Starlink, and Jonathan is using it to participate in the show.) Hosts: Doc Searls, Jonathan Bennett. Guest: Steven Vaughan-Nichols. More info: https://twit.tv/shows/floss-weekly/episodes/629

    1. Author Response:

      Reviewer #1:

      Weaknesses: The main aim of the study is to identify biomarkers that predict S/MD dengue early in the course of dengue. This requires biomarkers of which the levels change early after symptom onset. However, levels of several of the biomarkers did not change markedly between the two time points (early vs late), suggesting that the levels of these biomarkers had not yet changed on day 1-3, thereby questioning their use as 'early biomarkers'.

      Thank you. We acknowledge that the levels of some biomarkers do not differ markedly between the early and late time points. However, this does not affect the aims of the study. First, the late time point may not represent the patient’s baseline, as it fell within 2-3 weeks of the acute illness. Second, our focus was on the first 3 days of illness in order to identify early predictors, noting that this may not capture the peak for many of the biomarkers, which would occur in the critical phase. We were nevertheless able to achieve our main aim, which was to compare biomarkers on days 1-3 between patients who progressed to more severe outcomes and those who did not.

      The authors selected the biomarkers based on earlier pathophysiology studies. An alternative approach might have been to first measure a larger set of candidate biomarkers in a selection of patients and select only those biomarkers showing a clear change in the early phase.

      Thank you for your suggestion. For this study, given the limited number of outcomes (281 moderate-severe events) and the limited volume of the blood samples, we selected 10 biomarkers, because the events-per-variable ratio should be greater than 10 and we also wanted to investigate non-linear effects and interactions of the biomarkers [Heinze et al., Biom J 2018]. We therefore selected the most promising biomarkers systematically based on pilot data and published literature.

      Reference: Heinze G, Wallisch C, Dunkler D. Variable selection - A review and recommendations for the practicing statistician. Biom J 2018; 60(3): 431-49.
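
      The events-per-variable reasoning above can be sketched as simple arithmetic. This is only an illustration: the figure of 281 events and the threshold of 10 come from the response, while the function names and the choice of 2 parameters per biomarker (to allow a non-linear term) are hypothetical assumptions, not the authors' actual model specification.

```python
def events_per_variable(n_events: int, n_parameters: int) -> float:
    """Ratio of outcome events to candidate model parameters."""
    return n_events / n_parameters


def max_parameters(n_events: int, min_epv: float = 10.0) -> int:
    """Largest parameter count that keeps the ratio at or above min_epv."""
    return int(n_events // min_epv)


if __name__ == "__main__":
    n_events = 281        # moderate-severe events reported in the response
    n_biomarkers = 10     # biomarkers selected by the authors
    params_per_marker = 2 # ASSUMPTION: one extra term per marker for non-linearity

    n_params = n_biomarkers * params_per_marker
    print("parameter budget:", max_parameters(n_events))
    print("EPV with linear terms only:", events_per_variable(n_events, n_biomarkers))
    print("EPV with assumed non-linear terms:", events_per_variable(n_events, n_params))
```

      Under these assumptions the budget leaves room for non-linear terms on all 10 biomarkers, but a much larger candidate panel would quickly push the ratio below the threshold.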

      The predictive values of many of the biomarkers were only modest or absent. In addition, some of the findings appear a bit counterintuitive. Examples include the trend of the association of IP-10 with S/MD dengue that changed from positive to negative in the global model, and the opposite trends of some of the biomarkers (e.g. IL-8, ferritin) in adults and children. The authors acknowledge the existence of differences in dengue pathology between children and adults, but could discuss the possible biological reasons in more detail. For example, why would specifically IL-8 or ferritin have an opposite effect in children and adults?

      The change in the direction of the association of IP-10 with S/MD between the single and global models does not diminish the possibility of that biomarker being selected in the best combinations. In this study we do not try to elucidate causal pathways. Another biomarker in our model may be a mediator or confounder of IP-10 in the pathway to the outcome. This could be IL-1RA, as its association with S/MD was similar between the single and global models, and the correlation between IP-10 and IL-1RA was strong (Spearman’s rank correlation coefficient was 0.75). A change in direction after correction for another variable is often referred to as Simpson’s paradox. We have added this point to the discussion of the revised manuscript (page 14, lines 10-16).
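
      The sign change described above can be reproduced in a small simulation. This is a minimal sketch on synthetic data, not the study data: the variable names (marker_a, marker_b), the coefficients, and the use of ordinary linear regression with a continuous outcome (rather than the authors' models for a binary outcome) are all assumptions made for illustration. The correlation of 0.75 between the two markers loosely echoes the reported IP-10/IL-1RA correlation but is otherwise arbitrary.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def simulate(n=5000, seed=0):
    """Two correlated markers; marker_a has a negative direct effect."""
    rng = random.Random(seed)
    marker_b = [rng.gauss(0, 1) for _ in range(n)]
    # marker_a is built to correlate about 0.75 with marker_b
    marker_a = [0.75 * b + (1 - 0.75 ** 2) ** 0.5 * rng.gauss(0, 1)
                for b in marker_b]
    # ASSUMED effects: -0.5 for marker_a, +1.0 for marker_b
    outcome = [-0.5 * a + 1.0 * b + rng.gauss(0, 1)
               for a, b in zip(marker_a, marker_b)]
    return marker_a, marker_b, outcome


def single_slope(x, y):
    """Slope of y on x alone (single-marker model)."""
    return cov(x, y) / cov(x, x)


def adjusted_slope(x1, x2, y):
    """Slope of y on x1 after adjusting for x2 (two-marker model)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    return (s1y * s22 - s2y * s12) / (s11 * s22 - s12 ** 2)


if __name__ == "__main__":
    a, b, y = simulate()
    print("single-marker slope for marker_a:", round(single_slope(a, y), 3))
    print("adjusted slope for marker_a:", round(adjusted_slope(a, b, y), 3))
```

      In this setup the single-marker slope is positive because marker_a borrows the positive effect of its correlated partner, and it turns negative once marker_b enters the model.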

      The opposing effect in children and adults is likely to be due to the composite endpoint of severe and moderate dengue. As shown in the analysis of severe dengue alone (figure S5, table S6), the effects of IL-8 and ferritin were similar in children and adults, which suggests these biomarkers are still associated with severe disease in all age groups and that the difference is driven by the moderate dengue group. In addition, adults with uncomplicated dengue have higher ferritin levels than children, with increasing age and chronic conditions in adults likely contributing to this. We have added this point to the discussion in the revised manuscript (page 14, lines 21-26 and page 15, lines 1-2).

      The study does not include a validation cohort. The authors conclude that their findings 'assist the development of biomarker panels for clinical use.' Can the authors put into perspective the performance of their current combined biomarker panel to rule out S/MD dengue.

      Thank you for your comment. This is a preliminary case-control study to investigate potential combinations of biomarkers associated with dengue clinical outcomes. We quantify importance by means of AIC and p-value. Another dataset without selection by outcome is needed to validate the findings in relation to predictive value. We have added to the limitations that this was not a prediction study and that, therefore, the performance of the combined biomarker panel with respect to predictive value was not assessed (page 16, lines 14-17).

      Overall, the authors show convincingly in a unique cohort that biomarkers can be helpful to triage dengue patients already in the first days from symptom onset. Identification of the best biomarkers for this goal, validation in other cohorts, and a better understanding of differences between children and adults are required before such panels can be introduced in daily clinical practice.

      Thank you for your comment.

      Reviewer #2:

      The main weakness is the exclusion of virological markers, such as plasma/serum viral RNA levels or NS1 antigenaemia. Indeed, previous observations have found severe dengue patients to have higher viraemia in the acute phase of illness compared to those with uncomplicated dengue. More recently, several mechanistic studies have suggested that dengue virus NS1 protein could bind endothelial cells and disrupt their integrity, leading to vascular leakage. Indeed, the authors have pointed out these findings in lines 20-25 on page to lines 1-2 on page 6. Despite these reports, it is curious that the authors have not included either viraemia or NS1 antigenaemia as possible biomarkers for severe dengue.

      Thank you. We acknowledge that plasma viremia and NS1 antigenaemia levels are important factors in dengue disease outcomes. In this study, only enrolment viremia levels were available; NS1 antigenaemia levels were not. We have previously investigated the association between viremia levels and clinical outcomes using a pooled dataset of the IDAMS international study and three other studies in Vietnam. We found that higher plasma viremia was associated with increased dengue severity [Vuong et al., Clin Infect Dis 2020]. For this study, the main aim was to investigate host biomarkers which could be combined in a multiplex test panel.

      However, as suggested, we have added the information on viremia levels to table S3 (which was previously table 2) of the revised manuscript. We have also performed a sensitivity analysis to include viremia levels as a potential biomarker and found that: (1) higher plasma viremia was associated with an increased risk of severe/moderate dengue in both single and global models, and (2) viremia was not selected in children but was selected fourth in adults when performing the best subset procedure. We have added this information in the Statistical analysis (page 10, lines 20-24) and Results sections (page 13, lines 17-20), and the Supplementary file (appendix 8, figure S8, tables S13-S15, pages 30-34).

      Reference: Vuong NL, Quyen NTH, Tien NTH, et al. Higher plasma viremia in the febrile phase is associated with adverse dengue outcomes irrespective of infecting serotype or host immune status: an analysis of 5642 Vietnamese cases. Clin Infect Dis 2020.

      The manuscript in its present form may favour those with a strong statistical background to fully appreciate the nuances. Clearer explanations on the statistical findings would, I think, be helpful to those without such statistical background but who would nonetheless be in positions to translate these findings into clinical practice.

      We have added more explanation in the Statistical analysis, Results and Discussion sections to clarify statistical methods used in this study and the interpretation of the results.

      Most of the cases included in this study had DENV-1 infection. The biomarkers identified in this study may thus be DENV-1 specific and may not be readily applied to triage dengue cases caused by other DENV infection.

      In our study, DENV-1 accounted for 42% of all cases. We have performed a sensitivity analysis taking into account differences between serotypes. The results showed that there was no significant difference between serotypes with respect to the association between the biomarkers and the primary endpoint in both the single and global models. This suggests that the study’s results are applicable to all serotypes. This information has been added in the Statistical analysis (page 10, lines 18-20) and Results sections (page 12, lines 18-20), and the Supplementary file (appendix 5, figures S3-S4, tables S4-S5, pages 13-17).

      Reviewer #3:

      1) For general ease of readership, it would greatly help if the authors could explain the choice of the statistical method used in the data analysis and perhaps briefly explain the model and how AIC should be interpreted in the main rather than the supplementary text.

      We have clarified this in more detail in the Statistical analysis section of the revised manuscript.

      2) While this reviewer understands that the authors want to focus on host immune and inflammatory biomarkers, it would be helpful if NS1 and viremia data were also shown (at least in supplementary data) if these have been found not to correlate with disease severity.

      Thank you; please see the response to comment #1 of reviewer #2. Quantitative NS1 results were not available in this study. We have added viremia in a sensitivity analysis and the results showed that higher viremia was associated with increased risk of severe/moderate dengue, similar to our previous study [Vuong et al., Clin Infect Dis 2020]. In the best subset procedure, viremia was not selected in children and was selected fourth in adults.

      Reference: Vuong NL, Quyen NTH, Tien NTH, et al. Higher plasma viremia in the febrile phase is associated with adverse dengue outcomes irrespective of infecting serotype or host immune status: an analysis of 5642 Vietnamese cases. Clin Infect Dis 2020.

      3) It is interesting to note that some biomarkers (particularly the vascular markers) in the severe group do not return to the same baseline as mild cases at convalescence even after >20 days. Are such individuals already in a higher inflammatory state at baseline (pre-infection) as a result of underlying co-morbidities such as obesity or diabetes? Table 1 did not provide such information, but it would be interesting to show whether there is any difference in health state between the 2 groups, especially for obesity.

      We have added the information of obesity and diabetes in table 1, Results section (page 11, lines 13-14). There were 5 patients with diabetes; obesity was balanced between groups (14% in control group and 10% in S/MD group).

      4) It is rather confusing that the 2nd paragraph of discussion stated "Balancing model fit, robustness, and parsimony, we suggest the combination of five biomarkers IL-1RA, Ang-2, IL-8, ferritin, and IP-10 for children, and the combination of three biomarkers SDC-1, IL-8, and ferritin for adults to be used in practice."

      But the concluding paragraph went on to state "The best biomarker combination for children includes IL-1RA, Ang-2, IL-8, ferritin, IP-10, and SDC-1; for adults, SDC-1, IL-8, ferritin, sTREM-1, IL-1RA, IP-10, and sCD163 were selected." This should be clarified further.

      Thank you for pointing this out. The conclusion was based on the best combinations (taking into account AIC only), which consisted of 6 biomarkers for children and 7 biomarkers for adults. In the discussion, we reduced the number of biomarkers, taking into consideration not only the AIC, but also parsimony for clinical translation purposes, while keeping the model fit as good as possible (taking a difference of AIC of less than 5 compared to the best combination). We therefore suggested a combination of 5 biomarkers for children and 3 biomarkers for adults, considering these 3 factors - model fit, robustness and parsimony. We have clarified this point in the Discussion section of the revised manuscript (page 15, lines 20-25).
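
      The selection rule described in this response (take the best AIC, then prefer a smaller combination whose AIC is within 5 of the best) can be sketched as follows. The marker names and AIC values are hypothetical placeholders, not the study's results, and the sketch does not model the robustness criterion the authors also mention.

```python
# Hypothetical AIC values for candidate biomarker combinations.
# Marker names "A".."E" and all numbers are invented for illustration.
candidates = {
    ("A",): 130.0,
    ("A", "B"): 112.0,
    ("A", "B", "C"): 103.5,
    ("A", "B", "C", "D"): 101.0,
    ("A", "B", "C", "D", "E"): 100.0,
}


def best_by_aic(models):
    """Return the combination with the lowest AIC."""
    return min(models, key=models.get)


def parsimonious_choice(models, max_delta=5.0):
    """Return the smallest combination whose AIC is within max_delta of the best.

    Ties in size are broken by the lower AIC.
    """
    best_aic = min(models.values())
    close = [m for m, aic in models.items() if aic - best_aic < max_delta]
    return min(close, key=lambda m: (len(m), models[m]))


best = best_by_aic(candidates)
choice = parsimonious_choice(candidates)
print("best by AIC alone:  ", best)
print("parsimonious choice:", choice)
```

      With these invented values the AIC-only rule picks the five-marker combination, while the parsimony rule picks the three-marker one. This mirrors the gap between the combinations named in the conclusion and those suggested for practice.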

    1. Author Response:

      Reviewer #1:

      Summary and Strength:

      Single-cell RNA sequencing is the most appropriate technique to profile unknown cell types and Koiwai et al. made good use of the suitable tool to understand the heterogeneity of shrimp hemocyte populations. The authors profiled single-cell transcriptomes of shrimp hemocytes and revealed nine subtypes of hemocytes. Each cluster recognizes several markers, and the authors found that Hem1 and Hem2 are likely immature hemocytes while Hem5 to Hem9 would play a role in immune responses. Moreover, pseudotime trajectory analysis discovered that hemocytes differentiate from a single subpopulation to four hemocyte populations, indicating active hematopoiesis in the crustacean. The authors explored cell growth- and immune-related genes in each cluster and suggested putative functions of each hemocyte subtype. Lastly, scRNA-seq results were further validated by in vivo analysis and identified biological differences between agranulocytes and granulocytes. Overall, conclusions are well-supported by data and hemocyte classifications were carefully performed. Given the importance of aquaculture in both biology and industry, this study will be an extremely useful reference for crustacean hematopoiesis and immunity. Moreover, it will be a good example and prototype for cell-type analysis in non-model organisms.

      Thank you very much for your kind review. We hope that this paper will lead to a better understanding of the immune system of shrimp and further development of aquaculture.

      Weaknesses:

      The conclusions of this paper are mostly well supported by data, but some aspects of data analysis QC and in vivo lineage validation need to be clarified.

      1) It is not a trivial task to perform genome-wide analyses of gene expression on species without sufficient reference genome/transcriptome maps. With this respect, the authors should have de novo assembled a transcriptome map with a careful curation of the resulting transfrags. One of the weaknesses of this study is the lack of proper evaluation for the assembly results. To reassure the results, the authors would need to first assess their de novo transcripts in detail and additional data QC analysis would help substantiate the validity.

      The genome sequence of the kuruma shrimp M. japonicus has only been registered, and the high-quality data have not been published yet. Therefore, we could not perform validation using the genome sequence. However, by applying the BUSCO tool to the assembled sequences, we verified the quality of the assembled genes. Line 80-82 and 634-636.

      2) The authors applied SCTransform to adjust batch effects and to integrate independent sequencing libraries. SCTransform performs well in general; however, the authors would need to present results on how batch effects were corrected along with before and after analysis. In addition, the authors would need to check if any cluster was primarily originated from a single library, which could be indicative of library-specific bias (or batch effects).

      Thank you for your suggestion. The triplicate distribution after batch correction is shown in Figure 2-figure supplement 1 and Figure 5-figure supplement 1. Line 123 (Figure 2-figure supplement 1), 244 (Figure 5-figure supplement 1) and 686-689.

      3) Hem6 cells lack specific markers and some cells in this cluster are scattered throughout the other clusters (Fig. 1 & 2). Based on the pattern, it is possible that these cells are continuous subsets of other clusters. It would be good if the authors could group these cells with Hem7 or other clusters based on transcriptomic similarities or by changing clustering resolution. Additionally, they may also be a result of doublets, and it is unclear whether doublets were removed. Hem6 cells require additional measures to fully categorize as a unique subset.

      Based on the new UMI counts, we re-did in silico clustering and pseudotime analysis with new parameters. To exclude doublets, we kept only cells with fewer than 4000 UMIs this time, because no cells had prominently high UMI counts. Line 118 (Figure 2), 237 (Figure 5), 686-689 and 710-712.

      4) The authors took advantage of FACS sorting, qRT-PCR, and microscopic observation to verify in silico analyses and defined R1 and R2 populations. While the experiments are appropriate to delineate differences between the two populations, it is not sufficient to determine agranulocytes as a premature population (Hem1-4) and granulocytes as differentiated subsets (Hem5-9). To better understand the two groups (ideally nine subtypes), additional in vivo experiments would be essential. For example, proliferation markers (BrdU or EdU) could be examined after FACS sorting R1 and R2 cells to show R1 cells (immature hemocytes) are indeed proliferating as indicated in the analyses.

      Since stable culture of shrimp hemocytes is still difficult, it is difficult to implement a BrdU assay now. We believe the advantages of our study are that single-cell analysis can be used in shrimp, that we explored marker candidates, and that we were able to provide guidelines for cell classification in the future. Of course, we are going to adapt the BrdU or EdU assay to hemocytes in the future.

      5) FACS-sorted R1 or R2 population does not look homogeneous based on the morphology and having two subgroups under nine hemocyte subtypes may not be the most appropriate way to validate the data. The better way to prove each subtype is to use in situ hybridization to validate marker gene expressions and match with morphology.

      What we want to show here is that it is very difficult to classify hemocytes morphologically, and even if we could, they are likely to be divided into only two rough groups (FACS result). As in the answer to the question above, we believe the advantage of this project is that we were able to search for marker candidates and provide guidelines for cell classification in the future. Of course, in the future, we hope to look at the function and expression of each gene. Since it is difficult to perform the in situ assay or BrdU assay in shrimp hemocytes immediately, we have removed Figure 7.

      Reviewer #2:

      In this manuscript Koiwai et al. used single cell RNA sequencing of hemocytes from the shrimp Marsupenaeus japonicus. Due to lack of complete genome information for this species, they first did a de novo assembly of transcript data from shrimp hemocytes, and then used this as reference to map the scRNA results. Based on expression of the 3000 most variable genes, and a subsequent cluster analysis, nine different subpopulations of hemocytes were identified, named as Hem1-Hem9. They used the Seurat marker tool to find in total 40 cluster specific marker transcripts for all clusters except for Hem6. Based upon the predicted markers the authors suggested Hem1 and Hem2 to be immature hemocytes. In order to determine differentiation lineages they then used known cell-cycle markers from Drosophila melanogaster and could confirm Hem1 as hemocyte precursors. While genes involved in the cell cycle could be used to identify hemocyte precursors, the authors concluded that immune-related genes from the fly could not be used to determine functions or different lineages of hemocytes in the shrimp. This is an important (and known) fact, since it is often taught that the fruit fly can be used as a general model organism for invertebrate immunologists, which obviously is not the case. Even among arthropods, animals are different. The authors suggest four lineages based upon a pseudotemporal analysis using the Drosophila cell-cycle genes and other proliferation-related genes. Further, they used growth factor genes and immune-related genes and could nicely map these into different clusters, thereby in a way validating the nine subpopulations. This paper will provide a good framework to detect and analyze immune responses in shrimp and other crustaceans in a more detailed way.

      Strengths:

      The determination of nine classes of hemocytes will enable much more detailed studies in the future about immune responses, which so far have been performed using expression analysis in mixed cell populations. This paper will give scientists a tool to understand differential cell response upon an injury or pathogen infection. The subdivision into nine hemocyte populations is carefully done using several sets of markers and the conclusions are on the whole well supported by the data.

      Thank you for taking the time to review our paper. We hope that this paper will serve as a guideline for crustacean hemocyte research.

      Weaknesses:

      One obvious drawback of the paper is first the low number of UMIs. A total number of 2704 cells gave a median UMI as low as 718 which is very low. Especially shrimp no. 2 has an average far below 500 and should perhaps be omitted. Therefore, one question is about cell viability prior to the drop-seq analysis. The fact of this low number of UMIs should be discussed more thoroughly.

      By checking the mitochondria-derived sequences, we ruled out the suspicion that large numbers of dead cells were contaminating the samples. We have also succeeded in increasing the number of UMIs by changing mapping software and adjusting the parameters. The number of UMIs is still lower than that of model organisms, but we think it will improve once the reference genome is published. We have discussed this in the manuscript. Line 87-89, 118 (Figure 2) and 716-717.

      Details about how quality control (QC) was performed would be needed, for example the cutoff values for number of UMI per cell, and also one important information showing the quality is the proportion of mitochondrial genes.

      As we answered in the above section, we checked the mitochondrial content and present the results in a figure. Since there are no set rules here, we set the per-cell parameters based on the initial distribution plot. Line 87-89, 118 (Figure 2) and 686-689.
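
      A per-cell QC filter of the kind discussed here could look like the sketch below. It is not the authors' pipeline: the cell records, the lower UMI bound and the mitochondrial cutoff are hypothetical, and only the upper bound of 4000 UMIs is taken from the response to Reviewer #1.

```python
# Hypothetical per-cell summaries: total UMIs and mitochondrial UMIs.
cells = [
    {"barcode": "cell_a", "umi": 950, "mito_umi": 20},
    {"barcode": "cell_b", "umi": 120, "mito_umi": 5},
    {"barcode": "cell_c", "umi": 5200, "mito_umi": 100},
    {"barcode": "cell_d", "umi": 800, "mito_umi": 240},
    {"barcode": "cell_e", "umi": 1500, "mito_umi": 60},
]

MIN_UMI = 200             # assumption: drops near-empty droplets
MAX_UMI = 4000            # upper bound quoted in the response (possible doublets)
MAX_MITO_FRACTION = 0.10  # assumption: flags likely dead or damaged cells


def mito_fraction(cell):
    """Fraction of a cell's UMIs that come from mitochondrial genes."""
    if cell["umi"] == 0:
        return 1.0
    return cell["mito_umi"] / cell["umi"]


def passes_qc(cell):
    """True if the cell is within the UMI bounds and below the mito cutoff."""
    in_umi_range = MIN_UMI <= cell["umi"] < MAX_UMI
    return in_umi_range and mito_fraction(cell) <= MAX_MITO_FRACTION


kept = [cell["barcode"] for cell in cells if passes_qc(cell)]
print("cells kept after QC:", kept)
```

      In practice the cutoffs would be read off the observed distributions, as the response describes, rather than fixed in advance.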

      The clustering into nine subpopulations seems solid, however the determination of lineages based upon the pseudo time analysis with cell-cycle related genes is not that strong. The authors identify four lineages, all starting from Hem1 via Hem2-Hem3-Hem4 and then one to Hem5, another through part of Hem 6 to Hem 7, next through part of Hem 6 to Hem 8 and finally through part of Hem 6 to Hem 9. Referring to Figure 3 - supplement 3, it seems as if Hem6 could be subdivided into two clusters, one visible in B and C, while another part of Hem 6 is added in D.

      Based on the new UMI counts, we re-did in silico clustering and pseudotime analysis with new parameters. This produced a clearer result. Line 118 (Figure 2), 237 (Figure 5), 686-689 and 710-712.

      Also, the data in figure 3 - supplement 1 showing expression of cell cycle markers do not convincingly show the lineages. Clusters Hem 3 and 4 seem to express far fewer of these markers, and at lower levels, than clusters Hem6 - Hem9.

      As a result of the new clustering and other analyses, we can now see more clearly how growth-related genes vary along the clusters (Figure 7). Line 366 (Figure 7).

      It is also clear (from figure 5 - supplement 1) that there are more than one TGase gene and the authors would need to discuss that fact related to differentiation.

      Thank you for your suggestion. We discuss the different types of TGase in the revised paper. Line 386-399, 457 (Figure 8-figure supplement 2).

      While the part to determine subpopulations is very strong, the part about FACS analysis and qRT-PCR is weaker than the other sections, and doesn't add so much information. Validation of marker genes and the relationship between clusters and morphology shown in figure 6 is not totally convincing. It seems clear that both R1 and R2 contain a mixture of different cell types even if TGase expression is a bit higher in R1. A better way to confirm the results could be to do in situ hybridization (or antibody staining) and show the cell morphology of some selected marker proteins in a mixed hemocyte population. FACS sorting is very crude and does not really separate the shrimp hemocytes in clear groups based on granularity and size. This may be because the size of hemocytes without granules varies a lot. You need cell surface markers to do a good sorting by FACS.

      We agree with your comment that in situ hybridization or antibody staining are powerful tools to support our new findings. However, it is difficult to perform an in situ assay or prepare antibodies for shrimp hemocytes immediately. What we want to show here is that it is very difficult to classify hemocytes morphologically, and even if we could, they are likely to be divided into only two rough groups (FACS result). As in the answer to the question above, we believe the advantage of this project is that we were able to search for marker candidates and provide guidelines for cell classification in the future. Of course, in the future, we hope to look at the function and expression of each gene.

      Another minor issue is the discussion about KPI. There are a huge number of Kazal-type proteinase inhibitors in crustaceans and it is not clear from this data if the authors discuss a specific KPI-gene, and there is a mistake in referring to reference 65 which is about a Kunitz-type inhibitor.

      Thank you for pointing this out. In the case of kuruma shrimp, the de novo assembled genes and BLAST results showed low (around 60%) identity to L. vannamei’s Kazal-type proteinase inhibitor, and not to any kuruma shrimp sequence. Therefore, we could not discuss which type of KPI is involved in this study. We consider it important that further research on KPIs in kuruma shrimp be conducted in the future. Also, as you pointed out, reference 65 was wrong, so we removed it. Line 474 (Figure 8-figure supplement 5).

      In summary, this paper is a very important contribution to crustacean immunology, and although a bit weak in lineage determination it will be of extremely high value.

      Thank you for the positive feedback. We understand that a weakness of the study is that we could not confirm the evaluation of the genes as markers or the expression of the marker genes in each cell. However, we hope that our research will serve as a basis for future research.

      Reviewer #3:

      This manuscript by Koiwai et al. described the single-cell RNA-seq analysis of shrimp hemocytes and was submitted as a Resource Paper in eLife. In this study, they identified 9 cell types in shrimp hemocytes based on their transcriptional profiles and identified markers for each subpopulation. They predicted different immune roles among these subpopulations from differentially expressed immune-related genes. They also identified cell growth factors that might play important roles in hemocyte differentiation. This study helps to understand the immune system of shrimp and may be useful for improving the control of pathogen infections. The analysis of the data and interpretation is overall good but there are also some concerns:

      Thank you for your careful peer review. We hope that this paper will be useful to other researchers in the future. We have revised the manuscript based on your comments; please review it again.

      1) The number of UMI and genes detected per cell after mapping to the in-house reference genome does not appear to be presented, and the similarities or differences between the three replicated samples are not discussed, as well as the low number of genes detected per cell (~300 in this study).

      By checking the mitochondria-derived sequences, we ruled out the suspicion that large numbers of dead cells were contaminating the samples. We have also succeeded in increasing the number of UMIs by changing mapping software and adjusting the parameters. The number of UMIs is still lower than that of model organisms, but we think it will improve once the reference genome is published. We have discussed this in the manuscript. Line 87-89, 118 (Figure 2) and 686-689.

      2) The correlation between the morphology and the expression of marker genes demonstrated in Figure 6 is questionable. Cells of the same size could express totally different genes. On the other hand, cells that are different in size can express nearly identical genes. The evidence presented in this manuscript is not enough to support a correlation between cell size and gene expression. Therefore, the author would either need to provide more evidence to support this correlation, or not make such correlation.

      Yes, we agree with your comment. What we want to show here is that it is very difficult to classify hemocytes morphologically, and even if we could, they are likely to be divided into only two rough groups (FACS result). So, it is not surprising that similar cells may or may not express similar genes. However, some genes, such as the TGase or proPO genes, can be used as markers for cell type (and may reflect cell size too).

      3) There are many spindle-shaped cells in Figure 6B, but none of them appeared in Figure 6C and D after sorting, and the reason for this is unclear.

      We do not know why the cells were deformed either, and we think this is exactly why it is so difficult to classify hemocytes morphologically. The reason remains unknown, as cell culture is also not currently possible.

      4) The hemocyte differentiation model in Figure 7 is not supported by any experimental data.

      We understand your comment. Since we could not conduct any functional research on the marker genes, we have removed figure 7.

    1. Reviewer #1 (Public Review):

      Strengths:

      1) The model structure is appropriate for the scientific question.

      2) The paper addresses a critical feature of SARS-CoV-2 epidemiology which is its much higher prevalence in Hispanic or Latino and Black populations. In this sense, the paper has the potential to serve as a tool to enhance social justice.

      3) Generally speaking, the analysis supports the conclusions.

      Other considerations:

      1) The clean distinction between susceptibility and exposure models described in the paper is conceptually useful but is unlikely to capture reality. Rather, susceptibility to infection is likely to vary more by age whereas exposure is more likely to vary by ethnic group / race. While age cohorts are not explicitly distinguished in the model, the authors would do well to at least vary susceptibility across ethnic groups according to different age cohort structure within these groups. This would allow a more precise estimate of the true effect of variability in exposures. Alternatively, this could be mentioned as a limitation of the current model.

      2) I appreciated that the authors maintained an agnostic stance on the actual value of HIT (across the population & within ethnic groups) based on the results of their model. If there was available data, then it might be possible to arrive at a slightly more precise estimate by fitting the model to serial incidence data (particularly sorted by ethnic group) over time in NYC & Long Island. First, this would give some sense of R_effective. Second, if successive waves were modeled, then the shift in relative incidence & CI among these groups that is predicted in Figure 3 & Sup fig 8 may be observed in the actual data (this fits anecdotally with what I have seen in several states). Third, it may (or may not) be possible to estimate values of critical model parameters such as epsilon. It would be helpful to mention this as possible future work with the model.

      Caveats about the impossibility of truly measuring HIT would still apply (due to new variants, shifting use & effectiveness of NPIs, etc.). However, as is, the estimates of possible values for HIT are so wide as to make the underlying data used to train the model almost irrelevant. This makes the potential to leverage the model for policy decisions more limited.

      3) I think the range of R0 in the figures should be extended to go as low as 1. Much of the pandemic in the US has been defined by local Re that varies between 0.8 & 1.2 (likely based on shifts in the degree of social distancing). I therefore think lower HIT thresholds should be considered and it would be nice to know how the extent of assortative mixing affects estimates at these lower R_e values.

      4) line 274: I feel like this point needs to be considered in much more detail, either with a thoughtful discussion or even with some simple additions to the model. How should these results make policy makers consider race and ethnicity when thinking about the key issues in the field right now such as vaccine allocation, masking, and new variants? I think to achieve the maximal impact, the authors should be very specific about how model results could impact policy making, and how we might lower the tragic discrepancies associated with COVID. If the model / data is insufficient for this purpose at this stage, then what type of data could be gathered that would allow more precise and targeted policy interventions?
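
      For orientation on point 3, the classical herd immunity threshold under homogeneous mixing is HIT = 1 - 1/R0. The sketch below evaluates it for reproduction numbers close to 1. This is the textbook baseline only, not the structured model from the paper, whose exposure and susceptibility heterogeneity is expected to shift these values; the function name is a hypothetical label.

```python
def classical_hit(r0):
    """Herd immunity threshold under homogeneous mixing: 1 - 1/R0.

    At or below R0 = 1 an outbreak does not grow, so the threshold is 0.
    """
    if r0 <= 1.0:
        return 0.0
    return 1.0 - 1.0 / r0


for r0 in (1.0, 1.2, 1.5, 2.0, 3.0):
    print(f"R0 = {r0:.1f} -> classical HIT = {classical_hit(r0):.1%}")
```

      Near R0 = 1 the classical threshold is small and changes quickly with R0, which is one reason the lower end of the range is informative.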

      Minor issues:

      -This is subjective but I found the words "active" and "high activity" to describe increases in contacts per day to be confusing. I would just say more contacts per day. It might help to change "contacts" to "exposure contacts" to emphasize that not all contacts are high risk.

      -The abstract has too much jargon for a generalist journal. I would avoid words like "proportionate mixing" & "assortative" which are very unique to modeling of infectious diseases unless they are first defined in very basic language.

      -I would cite some of the STD models which have used similar matrices to capture assortative mixing.

      -Lines 164-5: very good point but I would add that members of ethnic / racial groups are more likely to be essential workers and also to live in multigenerational houses

      -Line 193: "Higher than expected" -> expected by who?

      -A limitation that needs further mention is the fact that race & ethnic group, while important, could be sub classified into strata that inform risk even more (such as SES, job type etc.)

    1. Author Response:

      We thank you for the careful review and the opportunity to resubmit this manuscript. We particularly acknowledge the reviewer who helped to clarify the statistical arguments and stimulated our re-analysis of all results. This re-analysis has helped to change the focus of the work to identify significantly variable (higher) familial cancer risks in several race/ethnically described minority groups in the US, which we feel has broadened the message stimulating a word change in the title.

      Reviewer #1 (Public Review):

      This is a very well written and comprehensive paper that is a valuable contribution to the literature of childhood cancers. It shows that some childhood cancers have an inherited component and the risk could be to the mother or to the siblings. Although the relative risks are significant, childhood cancer is fortunately rare and the actual risk to the siblings is small.

      Can we assume this is less than one percent? I think it would be helpful to provide some absolute risk numbers for the siblings so that parents could be reassured that the risk to other children is small.

      Response: We appreciate this comment on absolute risk. It is true that the actual risk is very small given the rarity of childhood cancers. We calculated the overall absolute risk for mothers and siblings of a proband and compared it with the general population. It now reads “Moreover, due to the rarity of childhood cancers, the absolute risk is very small, but still higher among young siblings and mothers in the current study (0.074%) compared to general population (0.023%) of the same age group” in line 316 of the Discussion section.
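
      To put the two absolute risks quoted in this response on a more intuitive scale, the short calculation below converts them to cases per 100,000 and takes their crude ratio. Only the two percentages come from the response; the ratio is a plain quotient of those figures, not the adjusted relative risk estimated in the study.

```python
# Absolute risks quoted in the response, in percent.
risk_relatives_pct = 0.074   # young siblings and mothers of a proband
risk_population_pct = 0.023  # general population of the same age group

# Convert percentages to cases per 100,000 people.
relatives_per_100k = risk_relatives_pct / 100 * 100_000
population_per_100k = risk_population_pct / 100 * 100_000

# Crude comparison measures derived from the two quoted figures.
risk_ratio = risk_relatives_pct / risk_population_pct
excess_per_100k = relatives_per_100k - population_per_100k

print(f"relatives:  about {relatives_per_100k:.0f} per 100,000")
print(f"population: about {population_per_100k:.0f} per 100,000")
print(f"crude risk ratio: {risk_ratio:.1f}")
print(f"excess cases: about {excess_per_100k:.0f} per 100,000")
```

      Read this way, the elevated risk corresponds to roughly 51 additional cases per 100,000 relatives, well under the one percent the reviewer asked about.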

      Do the authors have a suggestion on what genetic tests should be done on children with cancer? Do you have recommendations to make? I assume that the authors do not recommend screening of siblings for cancer except in rare cases. It would be useful to see what the authors recommend.

      Response: In this manuscript we do not provide clinical recommendations as we feel that is out of the scope of this research. Instead, we are making several points:

      1) That conventional US-based birth and cancer registries can be utilized to study familial-based cancer risks.

      2) That different ethnic groups appear to have different familial risks for some cancer subtypes.

      3) Early onset parental cancers can add information about familial-based risks.

      4) Second primary malignancies are enriched in families that exhibit familial risks (line 260 of the Results section).

      These characteristics will provide useful information for genetic counselors who need to advise families on their own decisions about genetic testing and family planning. At present, the genetic counseling discipline is tasked with making specific recommendations to families about screening siblings for cancer and for the presence of cancer predisposition alleles; such advice is prompted by examining the family history of cancer. Our work suggests that Latino families may have a higher risk of familial alleles in solid tumors overall, which may promote more attention or scrutiny of families by ethnicity.

      Are there some sites where the risk to siblings is there but not to parents which might suggest recessive inheritance?

      Response: This is an interesting question, but there are two reasons why our study may not be adequate to assess it. First, our sample size may not be large enough to adequately study this point. The risk of cancer in the general population is higher in children than it is in young adults, and therefore the low number of cancers in mothers that we see is largely a reflection of the low risk of cancer in young adults, since we cut off our observational age at 26 (due to the extent of follow-up on our young population). There is a lack of cancer at many of the ICC-03 defined childhood cancer sites among our parents, making it impossible to estimate cancer risk in the adults. Second, childhood cancers are biologically distinct from adult cancers, so predisposition alleles that impart risk for childhood cancers may not always have any effect on young adult cancers. Additionally, the progenitor cells at risk for childhood cancer may have differentiated, leaving no cells “at risk” of transformation after adolescence, so that the effect of childhood cancer predisposition alleles on adult cancers is not a meaningful comparison. Of course, there are exceptions to this, such as TP53 alleles, which affect cancer risk of many subtypes at any age.

      If the childhood cancer is rare and fatal one might not see it in the parents because of loss of reproductive fitness. Please comment.

      Response: We appreciate this comment a lot and share the concern that patients whose cancer stems from a strong genetic predisposition may not be able to reproduce (even if the patient survives). We added a comment in the discussion section, and it now reads “Furthermore, it is likely that the low number of mothers with cancer is a result of bias against some very strong cancer predisposition alleles, so the patients could not survive long enough or be healthy enough to reproduce” on line 408.

      Should we assume that the higher risks for Latino children are purely due to genetic influences? Could there be environmental factors at play as well?

      Response: We appreciate this comment and totally agree that environmental factors also play a role. Genetic factors, environmental factors, and the interaction between them would all contribute to the variation in relative risks. We have addressed this point in lines 341 (“This familial concordance is likely due to both shared genetic and environmental…”) and 419 (“Second, the comparative attributable fraction of familial risk based on environmental risk factors interacting…”) of the discussion section. We believe that this point should stimulate further research, and we are constructing our own future studies to explore environmental factors along with genetics.

      Reviewer #2 (Public Review):

      [...] Although the authors comment that the results from the Chi-Sq test are not consistent with the specific group SIRs and 95%CIs, they do not explain how these results can be so different.

      I am concerned that there is either an error in the calculations or an error in the assumptions. It is not acceptable to have such contradictory results between the two distinct methods.

      For example, for hematological cancers the 95% CI for Latinos is entirely contained within the 95%CI for Non-Latino white, while this gives a p less than 0.05. The authors need to explore why these methods are giving very different answers and be clear that the low p-values are not simply an artifact of poor assumptions.

      Response: We sincerely appreciate the comments from Reviewer 2, and we want to thank Reviewer 2 for pushing on the inconsistency between the confidence intervals and the p-values comparing the SIRs between race/ethnic groups. While overlapping CIs do not necessarily indicate a lack of significance in the effect sizes, the apparent contrast in these statistical measures was too extreme to be believable, and indeed there was an error.

      We reconstructed our data from scratch and recalculated all statistical comparisons with our statistician, Dr. W. J. Gauderman, and found a recurrent mistake in the calculation of the p-values comparing the SIRs between race/ethnic groups. We have corrected this mistake throughout the manuscript. Please refer to the new Figures 1 and 3 and the supplementary materials for the corrected numbers. The p-values are now somewhat attenuated, and significant differences between Latinos and NL whites persist for solid tumors. In addition, Asians have significantly increased familial risk for hematologic cancers, and non-Latino Blacks have significantly increased risk of solid tumors when compared to non-Latino whites. Because of this broader enhanced risk evident in minority groups (with the corrected statistical comparisons), the focus of the manuscript was changed slightly to emphasize higher risks among minority groups in the respective hematologic and solid tumor categories. There were also SIR differences suggested between many individual types of cancer, although these did not reach formal statistical significance.
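      The reviewer's point about nested confidence intervals can be checked with a small calculation. The sketch below is not the authors' method (the response does not describe it); it assumes independent Poisson counts, a Wald interval on the log scale, and entirely hypothetical observed and expected counts chosen so that one interval sits inside the other.

```python
import math


def sir_with_ci(observed, expected, z=1.96):
    """SIR = observed / expected, with a Wald interval on the log scale."""
    sir = observed / expected
    se_log = 1 / math.sqrt(observed)  # approximate SE of log(SIR) for a Poisson count
    return sir, sir * math.exp(-z * se_log), sir * math.exp(z * se_log)


def compare_sirs(obs_a, exp_a, obs_b, exp_b):
    """Two-sided Wald test for the ratio of two independent SIRs."""
    log_ratio = math.log(obs_a / exp_a) - math.log(obs_b / exp_b)
    se = math.sqrt(1 / obs_a + 1 / obs_b)
    z = log_ratio / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


# Hypothetical counts (not taken from the manuscript).
sir_a, lo_a, hi_a = sir_with_ci(observed=40, expected=10.0)  # larger group, narrow CI
sir_b, lo_b, hi_b = sir_with_ci(observed=8, expected=2.5)    # smaller group, wide CI
z_stat, p_value = compare_sirs(40, 10.0, 8, 2.5)

print(f"group A: SIR {sir_a:.2f} (95% CI {lo_a:.2f}-{hi_a:.2f})")
print(f"group B: SIR {sir_b:.2f} (95% CI {lo_b:.2f}-{hi_b:.2f})")
print(f"nested: {lo_b < lo_a and hi_a < hi_b}, p = {p_value:.3f}")
```

      Under this approximation, when one group's interval lies entirely inside the other's, the difference between the log SIRs is smaller than 1.96 times the larger standard error, so the Wald test cannot return p < 0.05. That is why the pattern the reviewer described pointed to a calculation error.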

    1. Author Response:

      Reviewer #1 (Public Review):

      This Research Advance builds on the findings of this group's 2019 eLife paper which showed that conserved acidic and basic helices associate to enable heteropolymer formation by Snf7 and Vps24. This work provides some general structure/sequence relationships among the homologous ESCRT-III proteins that will be of interest to those in the ESCRT field. While there are no new mechanistic principles obtained from this study, the data allow the authors to propose a model of the minimal or core units needed for ESCRT-III membrane remodeling.

      The focus is largely on similarities and differences between the closely related Vps24 and Vps2, where they show that a few key point mutations or chimeric swaps (for Vps4 binding by the C-terminal region of Vps2) can exchange their functions. The last portion of the paper further tests similarities within the subgroups of ESCRT-III proteins to experimentally test functional groupings defined by sequence relationships.

      We thank the reviewer for their generous comments. We’d like to emphasize that one of the main aims of this study is to generate a minimal ESCRT-III system that can be functional. We study Vps24 and Vps2 to generate a model ESCRT-III module with their specific properties. We previously engineered Snf7 to replace Vps20 (and other ESCRT components, eLife 2016). In this paper, we also extend some of the analysis to other ESCRT-III components. We agree that the current manuscript combines previously described mechanisms to understand the minimal ESCRT-III system, and it gives us a direction for understanding why in some cases (for example, archaeal systems) there may be only two ESCRT-III subunits. This work, following up on previous work from our lab and others, takes us one additional step in that direction.

      In addition, we’d also like to highlight from our work that in yeast, MVB biogenesis does have strong contributions from Did2 (CHMP1) and Vps60 (CHMP5), but not from Ist1 (IST1) and Chm7 (CHMP7) (Fig. 5). These have previously been under-emphasized in the literature.

      Reviewer #2 (Public Review):

      The manuscript by Emr and colleagues addresses the important question of how core ESCRT-III members Vps2 and Vps24 interact to form functional polymers using protein engineering and genetic selection approaches.

      Major findings are:

      Vps2 overexpression can functionally replace Vps24 in MVB sorting.

      Helix 1 mutations N21K, T28A, and E31K in Vps2 were identified as sufficient for suppression, leading to the conclusion that Vps2 and its overexpression can replace the function of Vps24 and Vps2.

      Vps24 over expression does not rescue delta Vps2. The authors propose that this is due to the lack of the MIM and helix5 binding sites for Vps4 present in Vps2.

      Vps24 E114K mutation was identified to rescue deltaVps2 upon over expression and even better as a Vps24/Vps2 chimera suggesting that auto-activated Vps24 that can recruit Vps4 can functionally replace Vps2.

      Analyzing the effect of single ESCRT-III deletions on Mup1 sorting confirmed Snf7, Vps20, Vps2 and Vps24 as essential for sorting.

      In summary, the manuscript provides new insight into the assembly of ESCRT-III. It confirms some redundancy of Vps2 and Vps24 and shows how Vps2 can substitute for Vps24 but not vice versa.

      We thank the reviewer for this summary of our work. One point we’d like to emphasize is that while we agree that Snf7, Vps20, Vps2 and Vps24 form a minimal core subunit to form MVBs, there are important functions of other ESCRT-III molecules Did2 and Vps60 (Figure 5 and supplement) for MVB biogenesis.

      Comments:

      The three minimal principles for ESCRT-III assembly stated in the abstract are not novel. Spiral formation of ESCRT-III has been described before for yeast Vps2-Vps24 as well as its mammalian homologues. The requirement for VPS4 recruitment is also well documented and finally, the manuscript does not provide proof for lateral association of the spirals via hetero-polymerization.

      We agree with the first two comments about spiral formation and Vps4 recruitment. We’d like to emphasize that the mechanism of lateral association through heteropolymerization extends from our previous work (eLife 2019) and is supported by this work through mutational analysis of Vps2’s helix-1 motif. In our previous work, we provided evidence of the association of Snf7’s helix-4 region with Vps24’s helix-1 region, and also of lateral association of Snf7 and Vps24/Vps2 with in vitro assays. In the previous work, we didn’t characterize the Vps2-Snf7 interaction, which we do further in this work. We find that charge-inversion mutations in Vps2 increase its affinity for Snf7, and this effect is sufficient to replace Vps24. We believe that these analyses strengthen our model and also enhance our knowledge of ESCRT-III polymerization. Therefore, this manuscript is a strong extension/advance on our previous eLife paper, and both papers should be analyzed together.

      The authors show that 8-fold over expression is necessary to rescue Mup1 sorting to an extent of 40%. The authors hypothesize that over expression of Vps2 can rescue Vps24 deletion because Vps2 may have a lower affinity for Snf7 than Vps24. This is in agreement with data on mammalian homologues which showed that indeed CHMP3 binds with 10x higher affinity to CHMP4B than CHMP2A (Effantin et al, 2012). This could have been included in the discussion, since the function of yeast and mammalian core ESCRT-III proteins is most likely not different.

      We apologize for this oversight and have included appropriate reference to this paper in the next version.

      The authors designed several chimeric Vps24/Vps2 constructs and show that some of the Vps24 chimera including Vps2 helix 5 and the MIM are fully functional in Mup1 sorting in delta Vps24 cells, but lack the ability to functionally replace Vps2 in Vps2 delta cells. It is unclear whether the chimeras are in the closed conformation in the cytosol. It would be interesting to know whether they are activated more easily and possibly prematurely.

      With our current assays we cannot distinguish the open vs. closed conformations in solution vs. membrane for Vps24. We do not think that these chimeras are activated prematurely because they do remain functional (as highlighted by the reviewer) in vps24∆ strain.

      We’d like to thank the reviewer for pointing us to these mutants, which has encouraged us to study these and related chimeras further. To better understand the role of swapping the Vps2 helix-5 and MIM region, we have added a couple more experiments.

      We replaced the helix-5 and MIM regions of Vps2 onto Snf7 to ask whether this construct remains functional, and whether they can replace function of Vps24-Vps2 (by directly recruiting Vps4).

      In this set of data, we present evidence that when incorporated into Snf7, the helix-5 and MIM motifs of Vps2 make this chimeric Snf7 dysfunctional (Fig. 3 – Supp. 3). These data are consistent with the reviewer’s interpretation that premature recruitment of Vps4 to ESCRT-III filaments is presumably dysfunctional. However, inclusion of these motifs in Vps24 most likely does not prematurely disassemble ESCRT-III filaments, hence these chimeras remain functional. Also, mere substitution of the H5 and MIM motifs into Snf7 (and therefore the Vps4 binding) is not sufficient for ESCRT-III function in cells.

      The larger point behind this set of analyses is that there are additional functions of Vps24-Vps2 beyond just recruitment of the AAA+ ATPase Vps4. Since we extensively analyzed the lateral association of Vps24-Vps2 to Snf7 in our previous manuscript (Banjade et al., eLife 2019), we ascribe these additional functions to lateral polymerization of Vps24-Vps2 on the Snf7 filament.

      The authors show that Vps24 E114K can form some kind of polymers in the presence of Vps2 in vitro while no polymerization is observed for wt Vps24 at 1 µM. It would be interesting to know whether wt Vps24 polymerizes at higher concentrations in this assay.

      We don’t observe polymers with 15 µM of Vps24 and 15 µM of Vps2, as the proteins start forming amorphous assemblies. We do refer to previous manuscripts that have observed similar linear polymers of Vps24 at higher concentrations (>300 µM) and longer incubation. So we believe that the ESCRT-III proteins Vps24 and Vps2 are able to form copolymers with a similar structure that is enhanced when these “activating” mutations are included.

      While the conclusion that E114K shifts the equilibrium to the open state is plausible, there is no evidence provided that this mimics Vps2 as stated. If so, Vps24 E114K should form the same polymers as shown in figure 4 supp 1 in the absence of Vps2 and spiral formation with Snf7 should not require Vps2.

      We agree with this interpretation from the in vitro assays, and have appropriately changed the language in the manuscript. We now describe the effect of the E114K mutation as “enhancing” association with existing Vps2. We hypothesize that this enhanced association with Vps2 occurs due to an “activation” process whereby Vps24 adopts a higher population of an open (or a semi-open) conformation, and have changed the language to reflect this interpretation. As an aside, we do note that Snf7 and Vps24 do form helices at higher concentrations without Vps2, as we showed in Banjade et al., eLife 2019.

      The speculation in the results section that Vps24 may not extend its helices 2 and 3 in an activated form due to potential helix breaking Asn residues in the linker region is not backed up by data, and it would have been appropriate to indicate this in the manuscript.

      We have now moved this analysis to the discussion and emphasized that this is a hypothesis. We also added the following sentence when describing the data regarding the mutations in the potential helix-breaking Asn residues: “We note that these data are indicative of mutations that control the conformations of the proteins. However, further biophysical analyses will be required for definitive evidence of this conformational flexibility.”

      The proposal that Vps2-Vps24 heteropolymers are formed by interactions along helices 2 and 3 is not supported by data presented in the manuscript. The authors would need to use recombinant proteins to test their mutants in biophysical interaction studies.

      We have now moved this interpretation to the discussion. Further dissection with biochemical and biophysical assays of Vps24-Vps2 would be a future direction in this area.

      Reviewer #3 (Public Review):

      This study sought to identify essential features of ESCRT-III subunits, with a focus on the yeast proteins Vps2 and Vps24, in order to reveal the required features of both subunits. The combined genetic and biochemical studies solidified the model that essential functions of ESCRT-III polymers - spiral formation, lateral association, and binding of Vps4 - are mostly distributed between different subunits (with some redundancy) and can be engineered into a single polypeptide. This study also sheds light on the long-standing and initially surprising finding that ESCRT-dependent budding of HIV does not require CHMP3 (Vps24), presumably because the distribution of distinct functions between different ESCRT-III subunits is not absolute.

      Inspired by earlier studies, the ability of overexpression of one ESCRT-III subunit to compensate for deletion of another subunit was explored using sorting assays. The demonstration of partial rescue inspired a mutagenesis approach that identified three residues that cluster on one face of a helix that enhanced rescue, and therefore confer functionality that in wt is primarily provided in the deleted subunits, which in this case is binding to Snf7. Extension of this analysis by protein engineering further demonstrated that the essential role of recruiting the Vps4 ATPase is normally performed by Vps2 but can be transferred to Vps24 by substitution of residues near the ESCRT-III subunit C-terminus. Similarly, it is shown that sequences that alter the propensity for bending of a helix at a point where open and closed ESCRT-III subunits differ in conformation contributed to the ability of Vps24 to substitute for deletion of Vps2, presumably by conferring the ability to adopt the open, activated conformation as well as the closed conformation.

      I don't have concerns about design or technical aspects of the experimental approach.

      We appreciate the reviewer’s comments and the summary of our work.

    1. Author Response:

      Reviewer #1:

      The authors sought to assess the relationship between developmental lineage and connectivity.

      This is a tour de force. It relies on detailed EM reconstructions, knowledge of complete neuroblast lineages thus correlating wiring with lineage, and through genetic manipulations of N gene function correlates developmental programs with wiring. The conclusion is important and provides a well described cellular and genetic system for linking the developmental program of a cell to its connection specificity. It provides a framework for considering how to study these questions in other regions of Drosophila and can be extended to the study of more complex mammalian systems where a similar neuroblast-lineage strategy generates different neuron types.

      There are no major weaknesses.

      This is an excellent study and, in my opinion, is ready to publish in its current form.

      We appreciate this comment!

      Reviewer #2:

      The conclusions of this paper are mostly well supported by data, however, there are several points that should be discussed further in the manuscript:

      1) The authors state that overexpression of Notchintra transforms Notch OFF neurons into Notch ON neurons. However, since this decision happens at the level of the GMC, wouldn't it be more correct to say that Notch OFF neurons were not produced and only Notch ON neurons were generated? Moreover, the authors state that the Notchintra overexpression phenotypes are due to hemilineage transformation rather than to death of Notch OFF neurons, by providing the total neuronal number in both experimental conditions using NB5-2 lineage. I think this statement is too much of a generalization when only one NB lineage has been analyzed and should be addressed in more lineages to claim this as a general mechanism. Moreover, the opposite hypothesis could have also been tested to make the argument stronger: Would depletion of Notch in GMCs make all neurons in a lineage target the ventral neuropil domain?

      We agree, and now provide cell counts for WT and Notch-intra in all four lineages (5-2, 7-1, 7-4, and 1-2) in the text. In all cases, the number of neurons in wild type and Notch-intra lineages are not significantly different, supporting the Notch OFF to Notch ON transformation. We don't say that Notch-OFF neurons are missing, because there is no loss of neurons from the lineage, but rather the neurons that would have been Notch-OFF in wild type are now duplicating the Notch-ON neurons. Regarding presenting the opposite transformation, we tried to do it with misexpressing UAS-numb, but were unable to get the expected positive control phenotype in which all five Eve+ U neurons are transformed to Eve-negative siblings (Skeath and Doe, 1998). Thus, we were not able to do lineage-specific Notch inhibition. Unfortunately, we can’t use whole embryo N or N pathway mutants, as has been done before (Skeath and Doe, 1998), because they have massive disruption in the CNS that obscures lineage specific axon phenotypes.

      2) Temporal cohorts described in this work are an approximation to neuronal temporal identity. The authors validate the correlation of early and late temporal cohorts to the expression of the temporal TFs Hb and Cas (Fig 4G). Given the resolution of the TEM dataset and the existence of specific NBs and neuronal drivers for the neurons studied, a correlation between the 4 temporal cohorts presented in this work and the 4 temporal TFs Hb, Kr, Pdm and Cas expressed by these neurons could have been possible and would have presented a more comprehensive view of the relationship between tTF expression and neurite and synapse localization. Does temporal cohort between lineages (cortex neurite length) mean expression of the same temporal TF? For example: would mid-early neurons in different lineages express the same temporal factor?

      Excellent question! We show that radial position is a proxy for temporal identity, but the precise relationship of Hb, Kr, Pdm, and Cas expressing neurons to the four radial “bins” we describe remains unknown. In fact, a graduate student is doing these experiments by generating MCFO single neuron clones in newly hatched larvae (the stage of the TEM volume) and staining with Hb, Kr, and Cas temporal transcription factors (it is impossible to do this with Pdm because neurons lose expression at stage 15). This will be many months of work and probably over a thousand MCFO+ neurons to analyze, and we feel it is beyond the scope of the paper -- although very important and very interesting! Plus, we are still limited in lab time due to University of Oregon covid restrictions.

      Since shared temporal identity between different lineages on its own does not confer shared neuronal projections, but shared temporal cohort hemilineage does: Does this mean that the expression of a given temporal TF and/or neuronal birth order does not play a role in this shared connectivity? Please clarify these ideas in the text.

      We have tried to clarify this in the text. Whereas temporal identity alone has no detectable role in generating common synapse localization or connectivity, it does have some role in the context of hemilineage identity. That is, hemilineage temporal cohorts have more shared synapse localization and connectivity than either temporal or hemilineage identity alone. See Figure 6 for synapse localization, and Figure 7 for connectivity data.

      3) Although the authors claim so, it is not convincing that the role of spatial patterning in neuronal connectivity has been assessed in this work, since the authors do not present an obvious correlation between specific connectivity features (morphology, axon or synapses localization) and the position of a given NB in the VNC. This should be clarified in the text.

      Great point! We agree that spatial patterning was not directly tested in our manuscript, thank you for pointing this out. Our claim that spatial patterning is involved is simply based on the idea that lineages (and thus hemilineages) are more related to one another than neurons from other hemilineages suggesting that the identity of the parent neuroblast plays some role. You make the excellent point that we did not look at the relationship of projections from all NBs in a “row” or “column” within the NB array. That analysis would potentially reveal a role for spatial factors in determining neuron projections. Unfortunately, we have a very limited set of neurons from any one row or column, not enough to make claims about direct relationships between row or column identity and targeting/connectivity.

      Reviewer #3:

      Specific comments:

      1) Figure 1; page 3: The authors refer to the "striking" similarity between EM reconstructions and GFP filled clones and yet there are clear differences in some of the clones in the extent and localization of arborization. This may be in part technical but almost certainly also reflects inter-individual differences in single neuron morphology. Since EM reconstructions presumably come from one animal, the use of GFP clones allows the authors to map the degree of variation between clones and it would be interesting for them to show this.

      That is an interesting point. Elegant work from Tzumin Lee and Jim Truman has shown that clones from larval neuroblasts are very similar, and our qualitative findings support this conclusion. Thus, it would be a quite minor advance for us to quantify clonal similarity in embryonic neuroblasts. Plus, since the number of neurons in a clone varies slightly, we would have to count neuron numbers per clone and only compare those with identical neuron numbers, which is possible but time-consuming. Then there are the covid restrictions, which make it difficult to rapidly generate new clones to increase the number with identical neurons. All in all, we decided that the benefit of answering this question was not worth the cost of performing it, and that other experiments were a higher priority in our limited research time. We have toned down the language to remove the word “striking” in the Introduction.

      2) Figures 2 and 4; pages 3-5: Along the same lines as above, the authors make categorical statements about the mapping of arbors to dorsal and ventral regions of the nerve cord and correlate that to hemilineage identity. Again, there is clear mixing in almost all neuroblast lineages, that seems to range from 15-30% as a rough estimate, and perhaps a bit more dorsally than ventrally, which the authors do not comment on (except to say it's "mostly non-overlapping"). This is a pity because they obviously have the tools to do so quantitatively and the information is already there in their data.

      Yes, good point – there is some overlap in most lineages for both axon/dendrite targeting (Figure 2) and synapse targeting (Figure 4). We now quantify the synapse similarity and observe that hemilineage-related neurons have much greater synapse similarity than they have with their sister hemilineage. The non-overlapping relationship between hemilineages is somewhat obscured by the simple posterior view shown in Figures 2 and 4, so we add a new figure (Figure 4 – supplement 2) that shows hemilineage synapse targeting in all three axes: A/P, M/L, and D/V. This makes it possible to see the true relationship.

      3) The analysis of Notch activity in hemilineages is excellent and very interesting, as is the new tool they develop. However, the analysis lacks loss of Notch function data and where and when Notch signaling is required to segregate the connectivity space (i.e. in neurons or in precursors such as Nbs and GMCs). Is this a binary fate specification mechanism or lateral inhibition among competing neurons? What about Notch activity manipulation in single neurons? If the authors wish to draw strong conclusions about the role of Notch in segregating target space and its relation to hemilineage identity, these experiments are essential. Alternatively, drawing subtler conclusions and acknowledging these caveats would be very welcome.

      Great point about the possible role of non-canonical Notch signaling in post-mitotic neurons (PMID: 22608692). We do not have the tools to perform lineage-specific, axon-specific removal of Notch protein. In theory we could do single neuron MARCM experiments, but these are extremely difficult due to the perdurance of the Gal80 protein, which would prevent us from assaying in newly hatched larvae. We add a Discussion section addressing the unresolved issue of post-mitotic neuron Notch function: “Another point to consider is the potential role of Notch in post-mitotic neurons (Crowner et al., 2003), as our experiments generated Notch-intra misexpression in both new-born sibling neurons as well as mature post-mitotic neurons. Future work manipulating Notch levels specifically in mature post-mitotic neurons undergoing process outgrowth will be needed to identify the role of Notch in mature neurons, if any.”

      4) Figure 7; Page 7: The authors state that 75% of hemilineage neurons correlated by temporal identity are separated by 2 synapses or less, suggesting greater connectivity than expected. How are these data normalized? What is the expected connectivity between neurons that are less related along these two developmental axes?

      Thanks for the question, which helped us change the text for clarity. The quantifications in Figure 7 actually do compare connectivity between unrelated neurons. Thus, we have changed “random” to “unrelated” in the text and figure legends. Additionally, the methods for this analysis were obviously not clear enough, so we have updated them with this text below:

      Path Length Analysis:

      We computed the pairwise path length between all hemilineages as well as all sensory and motor neurons in A1 in the undirected connectivity graph. We found that neurons that are unrelated by developmental grouping had an average path length greater than that of neurons related by hemilineage. Additionally, we found that neurons related by hemilineage alone had an average path length greater than that of neurons in hemilineage-temporal cohorts. For this analysis, unrelated neurons were defined as neurons that were in the same D/V axis (i.e. dorsal to dorsal and ventral to ventral) and same hemisegment (left or right), but not in the same hemilineage. Hemilineage comparisons were neurons in the same hemilineage, but not in the same temporal cohort. Significance was determined with a two-sample KS test on the empirical distributions of pairwise path lengths.
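      The path length comparison described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the toy graph, the neuron names, and the grouping are hypothetical, path length is taken as hop count in an unweighted undirected graph, and only the two-sample KS statistic is computed (a statistics library, for example SciPy's `ks_2samp`, would be needed for the p-value).

```python
from collections import deque
from itertools import combinations


def shortest_path_lengths(adj, source):
    """BFS hop counts from source in an undirected graph given as adjacency sets."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def pairwise_path_lengths(adj, pairs):
    """Path length for each (a, b) pair; unreachable pairs are skipped."""
    lengths = []
    for a, b in pairs:
        dist = shortest_path_lengths(adj, a)
        if b in dist:
            lengths.append(dist[b])
    return lengths


def ks_statistic(xs, ys):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    def ecdf(sample, t):
        return sum(v <= t for v in sample) / len(sample)
    points = sorted(set(xs) | set(ys))
    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in points)


# Hypothetical toy graph: a1..a3 form one group, b1..b3 another, x links them.
edges = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
         ("b1", "b2"), ("b2", "b3"), ("a3", "x"), ("x", "b1")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

group_a = ["a1", "a2", "a3"]
group_b = ["b1", "b2", "b3"]
related = list(combinations(group_a, 2)) + list(combinations(group_b, 2))
unrelated = [(a, b) for a in group_a for b in group_b]

related_len = pairwise_path_lengths(adj, related)
unrelated_len = pairwise_path_lengths(adj, unrelated)
d_stat = ks_statistic(related_len, unrelated_len)

print("mean related:  ", sum(related_len) / len(related_len))
print("mean unrelated:", sum(unrelated_len) / len(unrelated_len))
print("KS statistic:  ", d_stat)
```

      In the toy graph, pairs within a group are one or two hops apart while pairs across groups are two to five hops apart, which is the kind of separation between distributions that the KS test is meant to detect.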

      Independent of path length, we also calculated connectivity similarity between related neurons in Figure 8. Similarity here was defined as the cosine of the angle between the input or output vectors of each neuron. Similarity by this metric was also found to be greater for developmentally related neurons. Finally, we added this line to Figure 7 legend to clarify normalization: “Frequency corresponds to the fraction of pairwise distances observed for each group.”
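
      The similarity metric mentioned above, the cosine of the angle between two connectivity vectors, can be sketched as follows. The vectors and names are made-up examples: each entry is assumed to be a synapse count onto one possible partner, and the handling of all-zero vectors is an assumption rather than something stated in the response.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length connectivity vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a neuron with no connections is treated as dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical output vectors: synapse counts onto three possible partners.
neuron_a = [3, 0, 4]
neuron_b = [6, 0, 8]  # same partners as neuron_a, twice the synapse counts
neuron_c = [0, 5, 0]  # a different partner entirely

sim_ab = cosine_similarity(neuron_a, neuron_b)
sim_ac = cosine_similarity(neuron_a, neuron_c)
print(sim_ab, sim_ac)
```

      Because the cosine ignores vector length, two neurons that contact the same partners in the same proportions score as maximally similar even if one makes more synapses overall.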

      5) Figure 8; page 7 and discussion: The authors conclude that the combination between temporal identity and hemilineage identity predicts connectivity beyond what would be predicted by spatial proximity alone. This conclusion is problematic on at least two levels. First, practically what really matters for proximity is proximity during the time in development when synapses are forming between neuronal pairs, not proximity at the end in the final pattern.

      This is a good point that we need to clarify, although we note that synaptic connectivity is not a "one and done" in the embryo, but rather a continuous process that extends from the late embryo into the third larval instar ("Conserved neural circuit structure across Drosophila larval development revealed by comparative connectomics" by Gerhard, Andrade, Fetter, Cardona, and Schneider-Mizell, eLife 2017).

      Nevertheless, we now add the following additional text to the Results and to the Discussion. To the Results: “Interestingly, even neurons with the highest observed levels of overlap were not always connected (Figure 8A''). Thus, proximity alone can't explain the observed connectivity, consistent with a role for hemilineage-temporal cohorts providing increased synaptic specificity. Of course, our assays are in newly hatched larvae, and it is likely that dendritic arbors are more widely distributed during circuit establishment in the late embryo (Valdes-Aleman et al., 2021), yet only a specific region of the neuropil is targeted by larval hatching, which suggests the initial broad dendrite targeting is not sufficient to establish connectivity to many neurons contacted by these early dendrites, again arguing against a simple proximity mechanism.” To the Discussion: “Our results strongly suggest that hemilineage identity and temporal identity act combinatorially to allow small pools of neurons to target pre- and postsynapses to highly precise regions of the neuropil, thereby restricting synaptic partner choice. Yet precise neuropil targeting is not sufficient to explain connectivity, as many similarly positioned axons and dendrites fail to form connections (Figure 8C), despite active synapse addition throughout larval life (Gerhard et al., 2017).”

      Second, conceptually, opposing spatio-temporal mechanisms with proximity-based bias for connectivity makes no sense because that's exactly what spatio-temporal mechanisms achieve: getting neurons to the same space at the same time so connectivity can happen. At any rate, drawing strong conclusions about where and when neurons meet to form (or not form) synapses requires live imaging, and absent that the authors should refrain from making such a strong statement about what their excellent correlative dataset means.

      Yes, spatiotemporal mechanisms get axons (or dendrites) to precise neuropil domains, but that does not invariably generate connectivity. What is interesting is that hemilineage-temporal cohorts share more connectivity than predicted by proximity alone. Thus, proximity is necessary but not sufficient for proper connectivity. An additional mechanism is in play, and our data suggest that it depends on the neuron's hemilineage-temporal identity. We agree that our data are correlative – shared development correlates with shared connectivity – so we have moved any suggestion of possible mechanism from the Results to the Discussion. We agree this is an important change that will increase manuscript accuracy, and also provide a clear future direction for mechanistic experiments. Thanks for helping us focus the paper better.

    1. Author Response:

      Reviewer #1:

      In this manuscript the authors show that a designer exon containing a Fluorescent Protein insert can be used to edit vertebrate genes using an NHEJ based repair mechanism. The approach utilizes CRISPR to generate DSBs in intronic sequences of a target gene along with excision of a donor fragment from a co-transfected plasmid to initiate insertion of the exon cassette by ligation into the chromosome DSB.

      I like the idea here of inserting FP sequences (and other tags) into introns in this way. Focusing on the N- and C-termini for insertions has always seemed arbitrary to me. In practice these internal sites may even tolerate tag insertions better than the termini. However, this remains to be seen.

      My major reservation with this study is that the concepts here are not particularly novel. The approach is very similar to a concept already well established in gene-therapy circles of using introns as targets for inserting a super-exon preceded by a splice acceptor to correct inborn genetic lesions. The methodology employed is essentially HITI (https://www.nature.com/articles/nature20565).

      What is new is the finding that FP insertions are frequently expressed and at least partly functional as evidenced by their ability to localize to the expected intracellular structures. However, no actual functional data is provided in this study so it remains to be seen how frequently the insertion of FP exons is tolerated. It would help the study substantially to have functional information for a few insertions.

      The value and utility of this study hinges on whether insertions of this type frequently retain function. The authors speculate that "labeling at an internal site of a gene is feasible as long as the insertion does not disrupt the function of the encoded protein. Many introns reside at the junctions of functional domains because introns have evolved in part to facilitate functional domain exchanges (Kaessmann et al., 2002; Patthy, 1999)." Thus an analysis of how often intron tags are tolerated as homozygotes would be helpful for users who will worry that a potentially "quick and dirty" CRISPIE insertion might not accurately report on the function and localization of their protein of interest.

      We thank the reviewer for appreciating our idea. CRISPIE is indeed improved HITI, with the notable difference that the insertion takes place at the intronic region and that a designer intron/exon module is used. This design has a significant benefit in that INDELs in both labeled and unlabeled alleles will be unlikely to cause mutations at the levels of mRNA and proteins. CRISPIE is also different from the super-exon, which is now cited (Bednarski et al, 2016). CRISPIE does not involve the 3’ UTR and the poly A signal. This makes the donor template more standardized and smaller. Transcriptional controls embedded in endogenous introns after the editing sites can be retained in CRISPIE, but not when super-exons are used. We also achieve much higher efficiency in vivo than previous editing methods, which we feel is an important advance.

      We now provide three different experiments to address the function of CRISPIEd β-actin and, in one experiment, the function of CRISPIEd α-tubulin 1B. One of the key functions of the cytoskeleton is to support growth. We now show that neither CRISPIE labeling of β-actin (hACTB) at two different intronic loci nor CRISPIE labeling of α-tubulin 1B (TUBA1B) affects the growth of U2OS cells (New Experiment #1; Figure 1H, and Figure 1-figure supplement 4), suggesting that labeled β-actin and α-tubulin are functional. In addition, as suggested, we now demonstrate that cells homozygous for CRISPIE insertions are viable and able to divide (New Experiment #2; Figure 4-figure supplement 1). We also show that two important neuronal functional parameters – the mEPSC frequency and amplitude – are not altered by CRISPIE labeling of hACTB in neurons in cultured hippocampal slices (New Experiment #3; Figure 5-figure supplement 2).

      Having shown the above results, we also hope to emphasize that, although CRISPIE provides a way to perform FP tagging of endogenous protein with high efficiency and low error rates, it cannot ensure that FP-tagging itself is benign for all proteins. Numerous studies have overexpressed FP-tagged proteins, which is well documented to have side effects. The CRISPIE method empowers researchers by allowing them to tag endogenous proteins without overexpression. However, if the FP-tagging itself affects protein function, CRISPIE will not be helpful. Each FP-tagging project, whether it is based on CRISPIE or other methods, will require its own systematic characterization. We have now made this clear in the discussion (pg. 17): “… although CRISPIE enables the tagging of endogenous proteins with low error rates, it does not ensure that the tagged protein functions the same as the wild-type protein. Not all tagging is benign, and rigorous characterizations will be needed for each tagging experiment.”

      Other comments:

      1) Were homozygotes identified and were they viable in each instance?

      We now provide data showing that cells homozygous for CRISPIE insertions are viable and able to divide (New Experiment #2; Figure 4-figure supplement 1).

      2) You say: "The CRISPIE method should be broadly applicable for use with different FPs or with other functional domains, different protein targets, and different animal species." I don't know if you optimized your FP to avoid potential reverse strand splice acceptors, but some discussion of this important point should be made so that those trying to apply the approach will make sure that strong acceptors are not included accidentally in reverse oriented inserts.

      Our RT-PCR does not detect reversed inserts at the mRNA level. We now add in the Discussion that donor design needs to eliminate unintended splicing sites in the reverse orientation. We write (pg. 17): “It should also be noted that, when designing the donor template, care should be given to not create unintended splicing acceptor sites in the inverted orientation. Otherwise, inverted insertion events can cause mutations at the mRNA and protein levels.”

      3) Would your mRNA sequencing methodologies detect defective transcripts where the splice acceptor and a portion of the upstream FP exon was inserted, causing a frame-shifted and misspliced mRNA? Such mRNAs would be unstable due to NMD and thus not detected readily in a PCR based approach. Thus disruption of the mRNA by partial insertion of your donor (or fragments of the other co-injected DNA) might be much more widespread than is measured here. This could be tested by recovering clones that partially inserted the donor in the forward orientation and carefully monitoring for defects in mRNA splicing of the inserted allele. Were such clones detected and how frequently?

      Our method should detect defective mRNAs, if they are not degraded. However, if defective mRNAs are quickly degraded, they are not measured in our current RT-PCR and NGS experiments, as described in Figure 2. While we cannot address this question directly, we now provide evidence that the cell growth and neuronal function after CRISPIE labeling of β-actin remain normal.

      We also thank the reviewer for suggesting the cloning approach. This proposed experiment, however, may potentially be affected if potential defective mRNAs can result in decreased cell survival/growth. Although this experiment will require time beyond the three-month revision period expected by eLife due to the length of time required to clone cells, we will keep this in mind in our future efforts.

      4) You note that in the case of vinculin the coding sequence of the last exon of hVCL was included in the insertion donor sequence, and a stop codon was introduced at the end of the mEGFP coding sequence. This is essentially the strategy for super-exon insertion into targets for gene therapy; instead of a splice donor on the C-terminus you include a stop codon. You should cite these previous studies. Inclusion of a stop codon in frame would be expected to cause NMD; did you also include transcription termination signals?

      NMD will happen if the stop codon is more than approximately 50 nucleotides upstream of any exon-junction complex (Lewis et al, PNAS 2009). However, NMD won’t occur if the stop codon is within 50 nucleotides of the junction. For example, synaptophysin – a highly expressed neuronal protein – has its stop codon in its second-to-last exon, within 50 nucleotides of the exon junction. The stop codon we used for labeling hVCL is also within 50 nucleotides (~20 nt) of the exon junction.
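
      The distance rule described above can be written as a very small check. This is only a sketch of the approximate 50-nucleotide rule as stated in the response: the function name, the hard threshold, and the treatment of a stop codon with no downstream junction are illustrative assumptions, and real NMD susceptibility depends on more than this one distance.

```python
NMD_THRESHOLD_NT = 50  # approximate cutoff quoted in the response above


def predicts_nmd(stop_to_junction_nt, threshold_nt=NMD_THRESHOLD_NT):
    """Apply the ~50 nt rule to a single stop codon.

    stop_to_junction_nt: distance in nucleotides from the stop codon to the
    next downstream exon-exon junction, or None when no junction lies
    downstream (assumed here to mean the stop codon is in the last exon).
    """
    if stop_to_junction_nt is None:
        return False  # no downstream junction, so the rule predicts no NMD
    return stop_to_junction_nt > threshold_nt


# A stop codon about 20 nt upstream of the junction, as described for hVCL.
print(predicts_nmd(20))
# A hypothetical stop codon far upstream of the junction.
print(predicts_nmd(200))
```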

      We now cite Bednarski et al, 2016, which describes the use of super-exons in gene therapy. At the same time, we think that our approach is still different from the super-exon concept. After the stop codon, the 3’ UTR is not included. Instead, a splicing donor is included, allowing the exon to be spliced to the subsequent endogenous exon. This allows the insert to remain small for high insertion efficiency and makes it easy to produce the template (some 3’ UTRs can be several kilobase pairs in length), while utilizing the endogenous translational controls built into the native 3’ UTR.

      Reviewer #2:

      In-frame insertion of fluorescent protein tags into endogenous genes allows observation of protein localization at native expression levels, and is therefore an essential approach for quantitative cell biology. Once limited to unicellular model organisms such as yeast, endogenous gene tagging has become well-established in invertebrate model systems such as C. elegans and Drosophila since the advent of CRISPR technology in the last decade. However, a robust and widely accepted endogenous gene tagging strategy for mammalian cells has remained elusive. This is largely due to the fact that homologous recombination, the method used to create knock-ins in invertebrates, is inefficient (or sometimes doesn't work at all) in mammalian cells, especially those that do not divide rapidly.

      Several studies have attempted to bypass the need for homologous recombination by using a different method, non-homologous end joining (NHEJ) to insert GFP tags into vertebrate genomes (e.g. Auer et al. Genome Res 2014; Suzuki et al. Nature 2016; Artegiani et al. Nature Cell Biol. 2020). Such approaches can be orders of magnitude more efficient than homologous recombination, but the generated alleles require careful validation because of the error-prone nature of NHEJ.

      Here, Zhong and colleagues improve upon the existing NHEJ-based gene tagging approaches by designing synthetic exons (comprising a FP coding sequence with 5' and 3' splice sites) that can be inserted into native introns using NHEJ. The beauty of this approach is that any mutations (indels) created by the error-prone NHEJ repair mechanism are spliced out, and therefore do not affect the sequence of the encoded protein. A limitation is that tags must be inserted internally within a protein of interest and cannot be targeted to the extreme N- or C-terminus, but this limitation is clearly stated and discussed by the authors. Overall, this is a novel (to my knowledge) and powerful strategy that is likely to advance the field.

      We thank the reviewer for the very positive comments regarding our CRISPIE method.

    1. Author Response:

      Reviewer #1:

      In this manuscript Lituma et al. provide compelling evidence demonstrating the physiological role of presynaptic NMDA receptors at mossy fiber synapses. The existence of these receptors on the presynaptic site at this synapse was suggested more than 20 years ago based on morphological data, but their functional role was only shown in a single abstract since then (Alle, H., and Geiger, J. R. (2005)). The current manuscript uses a wide variety of complementary technical approaches to show how presynaptic NMDA receptors contribute to shaping neurotransmitter release at this synapse. They show that presynaptic NMDA receptors enhance short-term plasticity and contribute to presynaptic calcium rise in the terminal. The authors use immunocytochemistry, electrophysiology, two-photon calcium imaging, and uncaging to build a very solid case to show that these receptors play a role in synaptic communication at mossy fiber synapses. The authors' conclusions are supported by the experimental data provided.

      The study is built on a solid and logical experimental plan, the data is high quality. However, the authors would need to provide stronger evidence to demonstrate the physiological function of these receptors. It is hard to reconcile these experimental conditions with the authors' claim in the abstract: 'Here, we report that presynaptic NMDA receptors (preNMDARs) at hippocampal mossy fiber boutons can be activated by physiologically relevant patterns of activity'. We know that extracellular calcium can have a very significant impact on neurotransmitter release and how short-term plasticity is shaped. For this reason, it would be important to explore how the activity of these receptors at more physiological calcium concentrations contributes to calcium entry and short-term plasticity at these synapses.

      We thank the reviewer for noting our study is “built on a solid and logical experimental plan, the data is of high quality”. We agree with the reviewer that exploring the role of preNMDARs under more physiological conditions is extremely important. In response, we have performed new experiments at 35 °C and at more physiological concentrations of 1.2 mM Ca2+ and 1.2 mM Mg2+. Our new results, now included in Figure 4-figure supplement 1, demonstrate that our conclusion that preNMDARs at mossy fiber boutons can be activated by physiologically relevant patterns of activity is also true under more physiological recording conditions.

      Reviewer #2:

      Lituma et al. examined the presence and functions of preNMDARs in dentate gyrus granule cells (GCs) in the hippocampus. The authors found that GluN1+ preNMDARs are indeed present at mossy fiber (mf) terminals with electron microscopy. With pharmacological and genetic approaches, the authors showed that preNMDARs are important in low frequency facilitation (LFF), burst-induced facilitation and information transfer at the mf-CA3 synapse. The authors further demonstrated that this preNMDAR contribution is independent of the somatodendritic compartment of the GCs. With 2-photon calcium imaging, the authors found that preNMDARs contribute to presynaptic Ca2+ transients and can be activated by local glutamate uncaging. Separately, the authors showed that GluN1+ preNMDARs might also contribute to BDNF release at mossy fiber terminals during repetitive stimulation. Lastly, non-postsynaptic NMDARs specifically mediate mf transmission onto mossy cells, similar to mf-CA3 synapses, but not interneurons. The authors concluded that preNMDARs mediate synapse-specific transmission originating from the GCs/mf inputs.

      Overall, the study provides compelling evidence from a battery of techniques, ranging from EM, pharmacology, genetic deletion, electrophysiology to 2-photon imaging/uncaging. The data supports a coherent story on the presence of preNMDARs at mf terminals and that preNMDARs play important roles in LFF.

      In conclusion, this study reveals how NMDA receptors can be found in unexpected locations and how they may have unconventional functions, i.e. outside the narrow textbook view that they primarily serve as coincidence detectors in Hebbian learning. This study thus helps to change the way we think about NMDA receptor functioning, so should be of broad interest.

      We appreciate the reviewer’s comments that our study provides compelling evidence for the presence and role of preNMDARs at mossy fiber terminals. We also agree with the reviewer that our study challenges the way we think about NMDA receptor function.

      Reviewer #3:

      In this manuscript Lituma and colleagues investigate a potential role for presynaptic NMDARs at hippocampal mossy fiber (MF) synapses in regulating synaptic transmission. The combined use of electron microscopy, electrophysiology, optogenetics, calcium imaging, and genetic manipulations expertly employed by the authors yields high quality compelling evidence that presynaptic NMDARs can participate in activity dependent short term facilitation of release onto postsynaptic CA3 pyramid and mossy cell targets but not onto inhibitory interneurons. Moreover, presynaptic NMDAR activation is demonstrated to be particularly effective in promoting BDNF release from MF boutons. The investigation is well designed with a clear hypothesis, appropriate methodological considerations, and logical flow yielding results that fully support the authors' conclusions. The manuscript fills an important gap in our understanding of MF regulation by unambiguously confirming a functional role for presynaptic NMDARs that were first described anatomically at MF terminals nearly 30 years ago. Combined with a handful of other studies describing presynaptic NMDARs at various central synapses this study expands the role of NMDARs as critical players in synaptic plasticity on both sides of the cleft.

      We very much appreciate the reviewer’s positive remarks of our study as “well designed with a clear hypothesis, appropriate methodological considerations, and logical flow”. We concur that the manuscript fills an important gap in understanding MF regulation by preNMDARs and expanding the role of NMDARs in synaptic plasticity on both sides of the cleft.

    2. Reviewer #2 (Public Review):

      Lituma et al. examined the presence and functions of preNMDARs in dentate gyrus granule cells (GCs) in the hippocampus. The authors found that GluN1+ preNMDARs are indeed present at mossy fiber (mf) terminals with electron microscopy. With pharmacological and genetic approaches, the authors showed that preNMDARs are important in low frequency facilitation (LFF), burst-induced facilitation and information transfer at the mf-CA3 synapse. The authors further demonstrated that this preNMDAR contribution is independent of the somatodendritic compartment of the GCs. With 2-photon calcium imaging, the authors found that preNMDARs contribute to presynaptic Ca2+ transients and can be activated by local glutamate uncaging. Separately, the authors showed that GluN1+ preNMDARs might also contribute to BDNF release at mossy fiber terminals during repetitive stimulation. Lastly, non-postsynaptic NMDARs specifically mediate mf transmission onto mossy cells, similar to mf-CA3 synapses, but not interneurons. The authors concluded that preNMDARs mediate synapse-specific transmission originating from the GCs/mf inputs.

      Overall, the study provides compelling evidence from a battery of techniques, ranging from EM, pharmacology, genetic deletion, electrophysiology to 2-photon imaging/uncaging. The data supports a coherent story on the presence of preNMDARs at mf terminals and that preNMDARs play important roles in LFF.

      In conclusion, this study reveals how NMDA receptors can be found in unexpected locations and how they may have unconventional functions, i.e. outside the narrow textbook view that they primarily serve as coincidence detectors in Hebbian learning. This study thus helps to change the way we think about NMDA receptor functioning, so should be of broad interest.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Point-by-point response to reviewer comments


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In the current manuscript, Millarte et al. report a novel role of Rabaptin5 in selectively clearing damaged endosomes via canonical autophagy. They have identified FIP200 as a novel interactor of Rabaptin5 under basal conditions using yeast two-hybrid screening and further confirmed the interaction of Rabaptin5 with FIP200 by immunoprecipitation. They next used Chloroquine and monitored colocalization of Rabaptin5 with WIPI2, ATG16L1 and LC3B to demonstrate the potential interaction of Rabaptin5 with the autophagic machinery. They have primarily used Gal-3 as a marker of membrane damage after 30 minutes of Chloroquine treatment. In order to further elucidate the role of Rabaptin5 in autophagic induction mediated by Chloroquine, they have silenced Rabaptin5, FIP200, ULK1 and ATG13 and observed a decrease in the formation of LC3- or WIPI2-positive autophagosomes. Based on these observations they tested whether Rabaptin5 interacts with ATG16L1 upon Chloroquine treatment and confirmed the interaction, together with potential interaction sites on both Rabaptin5 and ATG16L1, by IP. The authors confirmed the interaction of Rabaptin5 with ATG16L1 by complementing the KO line with the mutant form of Rabaptin5 containing alanine residues in its consensus motif. Finally, they have used Salmonella and SCV as a model to study the role of Rabaptin5 in endomembrane damage and monitored a 50% decrease in the removal of Salmonella in Rabaptin5 KO or KD cells.

      Major concerns: One of the major concerns is the membrane damage attributed to chloroquine, which is known to induce lysosomal swelling and further targeting of the swollen compartments to degradation by direct conjugation of LC3 onto single membranes as a form of non-canonical autophagy. The evidence regarding membrane damage by Gal3 colocalization on the Rabaptin5 vesicles is preliminary. As suggested by the authors, the canonical autophagy pathway recognizing damaged membranes also recruits ALIX to the damaged membrane, which was not observed in Supplementary Figure 2. The link between membrane damage by chloroquine and monensin and Rabaptin5 is not convincing, as there is not sufficient evidence of membrane damage. In relation to this issue, the authors should consider using other damage markers such as Gal8, p62 or NDP52 to provide additional support for membrane damage induced by chloroquine.

      To expand on the question of CQ treatment damaging early endosomes, we also tested for Gal8 on Rabaptin5-positive enlarged endosomes and quantified the fraction of Rabaptin5-positive rings positive for Gal3 and Gal8 after 30 min of CQ treatment. We propose to include this data in Figure 2:

      We have tested the importance of Gal3 and p62 by siRNA-mediated knockdown where we found a robust inhibition of induction of WIPI2 puncta with CQ, but not with Torin1. Formation of LC3 puncta was less reduced, similar to knockdowns of FIP200, ATG13, or Rabaptin5.

      We propose to add these knockdown experiments as a supplementary figure:

      One of the main claims here is that Rabaptin5 regulates the targeting of damaged endosomes to autophagy. Clearly, these are early endosomes as stated in the abstract. However, the evidence presented here showing these are early endosomes is not convincing. Analysing Gal3 and Gal8 positive vesicles that are positive for Rabaptin5 and an early endosomal marker will be important in this context. For example, there needs to be additional evidence showing that early endosomes are targeted to autophagy. Is the degradation of TfR affected by this targeting? Did the authors look at the effect of Bafilomycin A1? If this process affects exclusively early endosomes, it should be BafA1 independent. This will give more direction as to the cellular function of this process.

      Rabaptin5 is a bona fide marker of Rab5-positive early sorting endosomes. As a control, we confirmed colocalization of Rabaptin5 with transferrin receptor, another endosomal marker, on CQ-induced rings (Fig. 2B). We now also analyzed swollen endosomes with triple-staining for Rabaptin5, transferrin receptor, and Gal3 as shown in this gallery (30 min CQ, as in Fig. 2). All Rabaptin5-positive swollen endosomes (rings) were positive for transferrin receptor and ~80% for mCherry-Gal3.

      We further tested transferrin receptor levels with and without CQ. Since CQ inhibits autophagic flux, this assay may not be very sensitive. Nevertheless, we found a significant reduction of ~15% and ~30% after overnight incubation with CQ in parental HEK293A cells and in Rbpt5-KO cells re-expressing wild-type Rabaptin5, respectively, but no reduction in Rbpt5-KO cells expressing the Rabaptin5-AAA mutant defective in binding to ATG16L1.

      As to the effect of BafA1, see our general response on top. The osmotic effect of CQ or Mon on endosomes that leads to membrane breakage requires an acidic pH. Preincubation with BafA1 neutralizes the pH, prevents osmotic swelling by CQ/Mon, and was shown to block LC3 lipidation (Florey et al., 2015, Jacquin et al., 2017). When BafA1 was added simultaneously, CQ was found to induce LC3 despite the presence of BafA1 (Mauthe et al., 2018), and Mon was shown to still be able to break endosomal membranes and recruit LC3 to EEA1-positive endosomes (Fraser et al., 2019). However, CQ-induced LC3 recruitment to latex bead-containing phagosomes or entotic vacuoles, i.e. LAP-like autophagy, was blocked (Florey et al., 2015). Consistent with this literature, we found increased LC3B lipidation already within 30 min of CQ treatment independently of BafA1 (no preincubation).

      Upon longer incubations, LC3B lipidation is very strong already with BafA1 alone so that the effect of CQ cannot be assessed anymore, since both drugs inhibit autophagic flux.

      Furthermore, we found a CQ-dependent increase in WIPI2- and LC3B-positive puncta to be insensitive to BafA1 (panel A below). Colocalization of Rabaptin5 to LC3B and LC3B to Rabaptin5 significantly increased upon CQ treatment independently of the presence of BafA1 (no pretreatment), indicating that at least a large part of CQ-induced LC3B puncta is not due to LAP-like autophagy.

      Minor concerns: Both for Figure 2 and Supplementary Figure 7 it will be clearer to have the images in colour rather than black and white for better interpretation.

      We thought the grayscale images were clearer, but are happy to provide color images.

      The interaction of FIP200 and ATG16L1 with Rabaptin5 is well characterized with immunoprecipitation and imaging, but the interactions of Rabaptin5 with FIP200 and ATG16L1 DWD in the presence of chloroquine are missing, and it will be important to show whether or not these interactions increase in the presence of chloroquine.

      We can do co-IP experiments also upon CQ treatment.

      In order to further support the role of Rabaptin5 for LC3 lipidation upon chloroquine induced membrane damage, western blots of WT, +Rabaptin5, Rabaptin5 KO, Rabaptin5 KO +WT or +AAA cell lines were analysed. However, the lysates were collected upon 30 minutes of chloroquine treatment which does not correlate with the imaging performed in Figure 2 as the number of LC3 vesicles did not show an increase upon 30 minutes of chloroquine treatment. The authors should include the 150 minutes time point for the LC3 lipidation in these conditions.

      Because CQ inhibits autophagic flux, LC3-II accumulates after longer times in all cell lines. The differences can only be seen early.

      The experiments with Salmonella are of great quality. The relationship of Rabaptin5 with SCV and the endomembrane damage induced by Salmonella could be further elucidated with Rabaptin5 positive vesicles at early infection stages. It is not very clear from the text how the authors link the endosomal network previously described for chloroquine with infection. It would be important here to show that Salmonella mutants unable to damage endosomal membranes do not have an effect. In Figure 7 panel C, the time points on graphs are in hours but should be in minutes. Corrected.

      Since Salmonella require T3SS for infection of HEK cells and T3SS causes the membrane damage, the proposed experiment is difficult.

      The targeting of damaged membranes for degradation has been well characterized through the recognition of these membranes by Gal3 and Gal8 and the recruitment of autophagic receptors to the site of damage (Chauhan et al. 2016; Jia et al. 2019; Thurston et al. 2012; Maejima et al. 2013; Kreibich et al. 2015). This manuscript introduces a new potential platform for the formation of autophagic machinery on endosomes with the interaction of Rabaptin5 with FIP200 and ATG16L1; however, more evidence is required to link this to the clearance of damaged membranes. Previously it was shown that endolysosomal compartments that were neutralized and swollen by monensin and chloroquine had been directed to degradation by direct conjugation of LC3 to single membranes via noncanonical autophagy, but here the authors propose another mechanism for this event via canonical autophagy.

      As discussed in the general response above, the literature reports CQ and Mon to initiate both canonical autophagy and LAP-like autophagy, the latter particularly on phagosomes containing latex beads or entotic vacuoles. Our results – including the additional data above – concern the effects of CQ and Mon damaging early endosomes and causing recruitment of galectins and ubiquitination, triggering autophagy dependent on the ULK complex and WIPI2 as hallmarks of canonical autophagy, and Rabaptin5. The reviewer comments highlighted the possibility of LAP-like autophagy occurring in parallel, perhaps on endosomes that are not broken, which might explain the relative insensitivity of LC3 puncta induced by CQ and Mon – compared to the strong and robust reduction of WIPI2 puncta – to the knockdown of FIP200, ATG13, or Rabaptin5. In an alternative explanation, inhibition of autophagic flux causes the products of remaining canonical autophagy to accumulate, while WIPI2 puncta are strongly inhibited. In support of the latter interpretation, ULK inhibition by MRT68921 (Fig. 4C and D) or FIP200 knockout (Fig. 6B and C) abolished CQ-induced LC3 structures, suggesting that – unlike on phagosomes or entotic vacuoles – there is little LAP-like autophagy. We propose to revise the manuscript to discuss these considerations more clearly.

      Reviewer #1 (Significance (Required)):

      Overall this work is very novel and shows some evidence of early endosomal autophagy. It could be relevant for some form of receptor-mediated signalling (although this is not discussed by the authors). My experience is in intracellular trafficking of pathogens and membrane damage.

      **Referee Cross-commenting**

      In my opinion, the only way you can distinguish between double or single membrane is by EM. For me, the important part is to show this is targeting of early endosomes to autophagy, either using other early endosomal markers, analysing by WB some early endosome receptors such as TfR or other inhibitors. If the authors are able to address some of these comments, I agree the paper will be in a better position for publication.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Millarte et al. study the role of Rabaptin-5 (Rbpt5) during early endosome damage recognition by autophagy. The authors focus on using chloroquine (CQ) as an agent to induce endosomal swelling/damage and suggest that Rbpt5 is required for the recruitment of the autophagy machinery to perturbed endosomes. They further use Salmonella infection as a model to confirm the role of Rbpt5 in this process. The authors initially show that Rbpt5 binds to FIP200 and subsequently focus on its interaction with ATG16L1 and identify a mutant that is unable to bind ATG16L1 or mediate the recognition of early endosomes by autophagy. Overall, this is an interesting study which provides molecular insights into how early endosomes can be targeted by the autophagy machinery, where Rbpt5 may act as an autophagy receptor. Some specific comments are as follows:

      Fig.3A: siRbpt5 seems to induce the localization of LC3 to ring-like structures during CQ treatment. Are these LAP-like structures (e.g. sensitive to BafA1 treatment)? And were they included in the quantification in Fig.3C?

      Ring-like LC3 structures were also counted.

      As discussed in the general remarks above, it is a possibility that knockdown-resistant LC3 recruitment (particularly rings) is due to a CQ-induced LAP-like process. The alternative explanation is that there is residual canonical autophagy upon knockdown of Rabaptin5, ATG13, or FIP200: while WIPI2 puncta are strongly reduced, LC3-positive structures accumulate due to inhibition of autophagic flux. In support of the latter interpretation, ULK inhibition by MRT68921 (Fig. 4C and D) or FIP200 knockout (Fig. 6B and C) abolished CQ-induced LC3 puncta or rings.

      We can also test BafA1 treatment. Certainly, we will revise the text to discuss this point in more detail.


      Fig.4A&B: Since Rbpt5 KD has a weak effect on LC3 puncta formation (Fig.3) and to distinguish the effects of CQ in inducing LAP, the effects of ATG13 and ULK1 KD should be assessed by localising Rbpt5 with WIPI2 or ATG16L1.

      We can do that.

      Fig.4: It is not clear why ULK1 KD would affect Torin1-induced autophagy but not LC3/WIPI2 localisation during CQ induced early endosome-damage. As the ULK inhibitors can target other pathways, the authors should confirm this finding in ULK1/2 double KO or KD cells.

      We have used MRT68921 because it is frequently used in the literature for this purpose with high specificity. It was used for example by Lystad et al. (2019) together with VPS34IN1 to block all canonical autophagy to analyze exclusively noncanonical effects of monensin treatment. We could perform ULK1/2 double knockdowns, but since ULK2 cannot be detected on immunoblots in HEK293 cells, the result would be interpretable only when there is an effect.

      Fig.5: The contribution of FIP200 in the interaction between Rbpt5 and ATG16L1 is unclear. Is binding between Rbpt5 and ATG16L1 mediated by ATG16L1's interaction with FIP200? The plasmid details describing the delta-WD40 deletion plasmid used in this study are missing and could be important to confirm that the delta-WD40 still retains binding to FIP200.

      We will of course include the details on the deletion plasmid, which were missing by mistake. Our WD deletion construct of ATG16L1 consists of residues 1–319, precisely deleting just the WD40 repeats, but retaining the FIP200 interaction sequence and the second membrane binding segment (b).

      We did a co-immunoprecipitation experiment and found both wild-type ATG16L1 and the ∆WD mutant to co-immunoprecipitate with FIP200:


      Fig.5E: the authors should test Rbpt5 AAA mutant binding to FIP200. Since the mutant appears to express less, its binding to ATG16L1 should be quantified or repeated with more comparable expression levels.

      We will quantify the immunoblots and perhaps attempt getting more equal expression levels.

      Fig.6: CQ treatment can induce various endosomal damage (in addition to early endosomes) and LC3 lipidation processes (e.g. LAP-like). The authors show that Rbpt5 is specifically involved in damaged early endosome autophagy. In this figure, it would be important to distinguish CQ-induced LC3 puncta as a result of early endosome damage or other lipidation processes (e.g. canonical or non-canonical autophagy). The use of FIP200 KO cells shows that LC3 puncta is inhibited. However, here a specific readout to look at early endosome recognition by autophagy is important. The authors can localize early endosome markers (EEA1) with autophagy players (e.g. WIPI2 and LC3). This is also relevant to other figures (e.g. supplementary figure 7E).

      Rabaptin5 is a bona fide marker of Rab5-positive early sorting endosomes. As a control, we confirmed colocalization of Rabaptin5 with transferrin receptor, another endosomal marker, on CQ-induced rings (Fig. 2B). We also analyzed swollen endosomes with triple-staining for Rabaptin5/ transferrin receptor/ Gal3 as shown in this gallery (30 min CQ, as in Fig. 2). All Rabaptin5-positive swollen endosomes (rings) were positive for transferrin receptor and ~80% for mCherry-Gal3.


      Our results are in agreement with Fraser et al. (2019) where they use EEA1 as an endosomal marker upon monensin treatment.

      We also performed a colocalization analysis for Rabaptin5 and LC3B, showing enhanced colocalization after CQ treatment for 150 min: ~20% of LC3B is (still) positive for Rabaptin5 after 150 min of CQ treatment.

      Fig.6F&G: the authors should show representative images of these localization images quantified here. These can be added in the supplementary figures.

      We are happy to do this.

      **Minor comments:**

      Fig.2E: FIP200 seems to be highly overexpressed in this image. Commercial antibodies that recognise endogenous FIP200 are widely used and should be tested to confirm the colocalisation between FIP200 and Rbpt5.

      We plan to do this.

      Fig.7C image: the different setting denoted by +/-, +/+ ..etc are not clearly defined.

      We will improve this.

      Reviewer #2 (Significance (Required)):

      This is an interesting study and provides important mechanistic insights underlying the recognition of perturbed early endosomes by the autophagy machinery. Researchers interested in endosomal trafficking or autophagic substrate recognition are likely to benefit from this study.

      **Referee Cross-commenting**

      In my opinion, the authors have attempted to distinguish single membrane from double membrane LC3 lipidation by looking at the ULK complex requirement. As other reviewers suggested, this can be further confirmed by using ATG16L1 mutants. It is important however that these experiments are supplemented by co-localising autophagy proteins with alternative early endosome markers when Rbpt5 is inhibited.

      I think if the authors are able to address the suggested experiments, this would help improve the manuscript and make it suitable for publication.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Millarte and colleagues find that Rabaptin5, a critical regulator of Rab5 activity, and a protein localized to early endosomes, interacts with FIP200 and ATG16L1. This interaction is confirmed and validated by a number of approaches (yeast 2 H, co-immunoprecipitation) and the binding sites of Rabaptin5 are mapped on FIP200 and ATG16L1. More precisely the binding site for ATG16L1 is nicely mapped on Rabaptin 5 by analogy with other ATG16L1 binders. The authors investigate the significance of this binding of Rabaptin5 to the autophagy proteins by proposing this interaction is required for targeting "autophagy to damaged endosomes". Endosomes are damaged with short treatments of chloroquine, a well studied compound previously shown to inhibit autophagy by disrupting fusion of autophagosomes with lysosomes. They propose the recruitment of autophagy (proteins) to the damaged endosomes may allow them to be eliminated. They use another model (phagocytosis of salmonella) to probe the role for rabaptin5 and its partners FIP200 and ATG16L1 in the well-documented role of autophagy on the elimination of salmonella in SCVs (Salmonella containing vacuole) formed from endosomes. Using short infection time points, and the Rabaptin5 mutants which prevent ATG16L1 binding they suggest Rabaptin5 binding contributes to elimination and killing of Salmonella by recruitment of ATG16L1.

      **Major comments:**

      1. The authors make an unfortunate and confusing choice of wording in the title and the text of "autophagy being recruited" to damaged early endosomes. A protein can recruit another protein but it can not recruit a process or pathway to a membrane.

      In the title we use the term "target". It is OK for us to avoid the expression "recruiting autophagy".

      The authors conclude that Rabaptin5 may have a role in autophagy directed to damaged early endosomes. The conclusion that Rabaptin5 binds FIP200 and ATG16L1 is convincing. The main issue, however, is in identifying what sort of process they are following. They have shown that WIPI2 and LC3 can be recruited to early endosomes after 30 min chloroquine treatment, but there is no data to explain the consequences of the binding of these proteins. They do not provide proof that canonical autophagosomes are formed which engulf and remove the damaged endosomes, nor do they show that the recruitment of WIPI2 is to a single membrane (presumably damaged early endosomes), which would be a non-canonical pathway. They often use the terminology "chloroquine-induced autophagy" (see Figure 4) but have virtually no proof they have induced either canonical or non-canonical pathways in their experiments. The only evidence they provide that there is some alteration in a membrane-mediated event is the increase in lipidation of LC3 in Figure 6. The authors must follow either an early endosome protein or cargo to demonstrate lysosome-mediated degradation indicative of autophagy, or demonstrate the process is a variation on non-canonical autophagy.

      We analyzed transferrin receptor levels with and without CQ to test degradation of an early endosomal marker protein. Since CQ inhibits autophagic flux, this assay may not be very sensitive. Nevertheless, we found a significant reduction of ~15% and ~30% after overnight incubation with CQ in parental HEK293 cells and in Rbpt5-KO cells re-expressing wild-type Rabaptin5, resp., but no reduction in Rbpt5-KO cells expressing the Rabaptin5-AAA mutant defective in binding to ATG16L1:


      There are concerns about the replicates done for many experiments in particular the co-immunoprecipitations which are not quantified (Figure 1 and 5).

      We will quantify these blots.

      The rescue experiments, even if done with stable cells lines made in the parental HEK293 cell line should be viewed with caution because of the very different amounts of Rabaptin5 (see Figure 6A). The overexpression of Rabaptin5 has not been well studied and comparisons with the mutants are therefore preliminary (Figure 6F and G).

      Fig 6A shows that Rabaptin5 levels are similar except for +Rbpt, where they are higher, and R-KO, which has none. Additional Rabaptin5 seems not to significantly enhance early WIPI and ATG16L1 colocalization.

      Conclusions about the role of the ULK complex, or ULK1 versus ULK2, should be expanded by studying the activity of the complex (phosphorylation of ATG13 for example) in order to make the conclusions more significant.

      We consider this to be beyond the scope of this study. Rabaptin5-dependent autophagy depends on the components of the ULK complex.

      **Minor comments:**

      1. Much of the labelling in the immunofluorescence images is not visible even on the screen version.

      We were careful to have the signals within the dynamic range of the image, but we can enhance the signals for better visibility.

      The LC3-lipidation experiment (Figure 6D) should be re-analysed by normalization to the loading control. The result may be significantly different and is open to re-interpretation. The quality of this western blot is also very poor.

      Quantitation was based on the ratio between LC3B-I and -II or the percentage of II of the total, always within the same lane and therefore largely independent of loading.
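
      The loading-independence argument can be illustrated with a short sketch. It assumes that band intensities scale linearly with the amount of lysate loaded; the function name and the example intensities are hypothetical and are not taken from the blots. Under that assumption, scaling both bands of a lane by the same loading factor leaves the LC3B-II fraction unchanged:

```python
def lc3_ii_fraction(band_i: float, band_ii: float) -> float:
    """Return LC3B-II as a fraction of total LC3B (I + II) within one lane."""
    total = band_i + band_ii
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return band_ii / total


# Hypothetical band intensities in arbitrary units (not measured data).
lane = (120.0, 80.0)            # (LC3B-I, LC3B-II)
loading_factor = 1.5            # assumed 50% more lysate in a second lane
scaled_lane = (lane[0] * loading_factor, lane[1] * loading_factor)

fraction = lc3_ii_fraction(*lane)
scaled_fraction = lc3_ii_fraction(*scaled_lane)
print(fraction, scaled_fraction)
```

      The same cancellation does not hold once a band approaches detector saturation, which is why the response says "largely" rather than fully independent.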

      Reviewer #3 (Significance (Required)):

      This manuscript topic fits into the field of study of canonical versus non-canonical autophagy. This literature is best described as "LAP", first discovered by Doug Green (Sanjuan in 2009), but more recently as a phenomenon induced by monensin and viral infection, amongst others. Most relevant to this study are the references (in the text) by Florey (Autophagy 2015), Fletcher (EMBO J, 2018) and others. However, this manuscript fails to cite and consider the critical findings in a key study published by Lystad et al., Nature Cell Biology 2019, which examines the role of ATG16 in both canonical and non-canonical autophagy. The current study, if placed into the context of the Lystad study, would have significantly more value, and potentially make the findings more significant.

      We did not refer to Lystad et al. (2019), because they analyzed different ATG16L1 mutants on their contribution to monensin-induced processes on LC3 lipidation after completely blocking canonical autophagy with the ULK inhibitor MRT68921 and the VPS34 inhibitor VPS34IN1. The Rabaptin5-dependent CQ-induced processes are blocked by MRT68921 (Fig. 4C). We plan to refer to this study in the revision.

      Furthermore, the short chloroquine treatments used here could be of interest to the field if, using the cited study of Mauthe et al. (which very clearly defines the effect of chloroquine after long (5 h) treatment), the authors were to revisit and repeat some of the key experiments in order to demonstrate the effects of the 30-minute treatment. Does such short treatment block fusion? Does it affect the pH of the acidic compartments? Does it inactivate the endocytic pathway? As the manuscript stands, the lack of understanding of the effect of chloroquine at short times makes the observations difficult to place into any biological context.

      This reviewer has expertise in autophagy, autophagosome formation and is familiar with the areas of endocytosis and infection.

      **Referee Cross-commenting**

      I think a major concern about the manuscript which is present in all reviews is the lack of clarity about what type of membrane LC3 is added to- is this the damaged endosome or a forming autophagosome? This leads to the question of what type of process is being observed here? non-canonical versus canonical autophagy.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      We thank the reviewers for their constructive and critical feedback on our original manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this study, the authors explored the tissue-specific regulation of DT size using both global and targeted deletion of Fgf9. They found cell hypertrophy and mineralization dynamics of the DT, as well as transcriptional signatures from skeletal muscle but not bone, were influenced by the global loss of Fgf9. Deletion of Fgf9 in skeletal muscle leads to postnatal enlargement of the DT. However, the innovation of this paper is not enough, the phenotypes of global deletion of Fgf9 were previously reported, most of the data in this paper are mainly descriptive analysis of the phenotypes, and internal cellular and molecular mechanisms were not well investigated.

      Here are the major issues:

      1. The data showed that fewer osteoclasts were present at both E16.5 and P0 in Figure 2R, V. Does FGF9 affect both osteogenesis and osteoclast formation?

      • Authors’ response to Reviewer: Thank you for your feedback. We revised this manuscript to reflect the concerns of Reviewer 1 related to the lack of cellular and molecular mechanisms as described below. Based on this question from the Reviewer, we have revised our discussion to clarify our findings as follows: “From our EdU proliferation assays, we observed a decline in cell proliferation in Fgf9null attachments, suggesting an accelerated chondrocyte maturation. Though we saw similar levels of Pthlh expression (a chondrocyte hypertrophy suppressor) in both WT and Fgf9null attachments, we also saw increased expression of Gli1 (a marker of chondrocyte hypertrophy) localized to the attachment in Fgf9null embryos compared to WT embryos. This decrease in proliferation was in parallel with increased hypertrophy of chondrocytes adjacent to the attachment cells within the Fgf9null DT, which may have led to a rapid expansion of matrix in the DT. Even though the DT was enlarged in Fgf9null mutants, we found fewer Sost+ cell clusters in their DTs compared to WT mice. Mature osteocytes express Sost (Winkler et al., 2003), and fewer Sost+ cells may indicate an impaired ability of Fgf9null osteoblasts to embed and mature into osteocytes. Overexpression of FGF9 in the perichondrium has been previously shown to suppress chondrocyte proliferation and limit bone growth in the limb (Karuppaiah et al., 2016); in our study, we found that loss of Fgf9 globally leads to an accelerated enlargement of chondrocytes in the tuberosity. This accelerated enlargement may limit the ability of these cells to deposit matrix and mineral and therefore limit osteocyte differentiation. We also found fewer osteoclasts in the Fgf9null DT which mirrors previous reports using the same mutation to study the length and vascularity of developing limb (Hung et al., 2007). Because the DT is enlarged and resides on the surface of a shortened bone, this phenotype may elucidate a divergent role of FGF9 in patterning of an arrested (e.g., attachment) growth plate compared to a regular (e.g., long bone) growth plate. This includes unexplored roles of FGF9 in vascularity of the tendon attachment and formation of bone ridges that overlap with or deviate from its role in growth plate development that are beyond the scope of the current study.”
      2. RNA-sequencing analysis showed the decreased expression of mitochondria/energy and lipid associated genes in Fgf9 null muscle compared to WT muscle. How does this relate to the enlargement of the DT? What are the detailed molecular mechanisms?
      • Authors’ response to Reviewer:
      • Based on this question from the Reviewer, we have revised our discussion to reflect the potential molecular mechanisms related to muscle mitochondria, fiber type, and metabolism as follows:

      “Fgf9 is expressed in muscle during embryonic stages, which we and others have observed using ISH (Colvin et al., 1999; Garofalo et al., 1999; Hung et al., 2007; Yang and Kozin, 2009). Previous work has established a connection between Fgf9 and muscle, as treatment of muscle and muscle progenitor cells with FGF9 slows maturation, enhances proliferation, and decreases expression of various myogenic genes (Huang et al., 2019). This study found supporting evidence that Fgf9 expression in muscle may be a limiting factor in tuberosity growth. However, it remains unknown how other FGFs and their receptors, FGFRs, regulate superstructure and attachment formation. In this study, we identified potential mediators of skeletal muscle metabolism in Fgf9null muscle, including downregulated mitochondrial-related genes associated with oxidative respiration and proton transport (i.e., Slc36a2 and Ucp1, amongst others). In cultured myoblasts, FGF9 can inhibit myogenic differentiation potentially via increased production of Myostatin (Huang et al., 2019), a well-established mediator of fast glycolytic muscle fibers (Girgenrath et al., 2005; Hennebry et al., 2009). While the role of FGF9 in myoblast fusion has been investigated in vitro, its effect on muscle fiber type and fiber metabolism (i.e., oxidative vs. glycolytic) has not yet been explored. Our findings from bulk RNA-seq of Fgf9null muscle point to potential mechanisms in muscle metabolism that may contribute to the enlarged phenotype that is mimetic of that found in Myostatin deficient mice and other animals (Elkasrawy and Hamrick, 2010; Hamrick et al., 2002). Additionally, further investigations are needed to investigate the potential role of Fgf9 in mitochondrial function and lipid metabolism. Recent work by Huang et al. also identified FGF9 as a potent regulator of calcium signaling and homeostasis in myoblast culture in vitro, and calcium release from the sarcoplasmic reticulum in muscle plays a critical role during embryonic skeletal myogenesis via ryanodine receptor 1 (RYR1). Although Ryr1 was not significantly different in between Fgf9null and WT muscle in the present study, we did find that calmodulin-associated genes (e.g., Calm4, Calml3, Camsap3, Calm5) were all significantly upregulated in Fgf9null muscle compared to WT muscle. Calmodulin interacts with RYR1 and its activation is required for intracellular binding of calcium (Newman et al., 2014, 1). Calmodulin is a crucial component of the calcium signal transduction pathway and also plays an important role in lipid and glucose metabolism (Nishizawa et al., 1988). Taken together, our findings along with recent work by Huang et al. support more mechanistic studies to investigate the metabolic effects of loss and gain of function of Fgf9 on skeletal muscle as well as the muscle secretome.”

      Reviewer #1 (Significance (Required)):

      R1 The authors compared the phenotypes of global and muscle-specific deletion of Fgf9 in mice, and found that Fgf9 secreted by muscle may induce the enlargement of the DT. However, the detailed molecular mechanisms were not well investigated.

      **Referees cross-commenting**

      R2 I do not disagree with Rev 1, but I do not think such a task is so trivial, which is why I did not suggest it; it could take years to determine the molecular mechanisms of anything. The authors could expand the discussion and offer some possibilities. If they had some RNAseq data, they could perhaps suggest some of the key signaling pathways involved.

      **Referees cross-commenting**

      R1 We still suggest that the internal cellular and molecular mechanisms should be well investigated in this paper.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      • This paper deals with an important topic which is exact molecular mechanisms regulating the growth of bony tuberosities; because this region is essential for force transmission and movement.
      • Based on the previous information they had that in the global KO of the gene FGF9 the deltoid muscle is enlarged; and this muscle is in a very important tuberosity; they decided to look at FGF9 as a potential genetic regulator.
      • The manuscript is clear, objective, concise. Very clear. Authors used both the global and targeted deletions, very high reproducibility.

      Reviewer #2 (Significance (Required)):

      • This manuscript advances several areas since we know little about the mechanisms controlling local mechanisms of tuberosities. It also advances our knowledge of FGF9. There were several studies before mostly in vitro showing that FGF9 when added to muscle cells could arrest myogenesis, but the types of experiments in vivo had not been performed yet. The authors used an array of methods; the studies are unbiased and very rigorous and also they always show all experimental points, which is excellent. The conclusions are supported by the data.

      • The main suggestion for authors: They essentially do not discuss the nature of the potential muscle-to-bone signaling occurring when they target the deletion of FGF9 in skeletal muscles, the muscles enlarge, and there is a series of adaptations in the tuberosity. Do the authors believe this to be all due to the genetic changes, or potentially to act through secreted myokines? In the paper of Huang et al., 2019, the authors document an effect of FGF9 on intracellular calcium homeostasis/signaling; could this be part of the mechanism? Perhaps the authors could propose a model?

      Authors’ response to Reviewer:

      • Future studies could investigate the secretome of muscle in Fgf9null or muscle-specific knockouts, as well as assess calcium signaling homeostasis in Fgf9 mutant muscles. We did find calcium- and ion-associated genes in the RNAseq and revised the discussion to include this information.
      • Based on this question from the Reviewer, we have revised our discussion to reflect the potential molecular mechanisms related to muscle mitochondria, fiber type, and metabolism as follows: “Fgf9 is expressed in muscle during embryonic stages, which we and others have observed using ISH (Colvin et al., 1999; Garofalo et al., 1999; Hung et al., 2007; Yang and Kozin, 2009). Previous work has established a connection between Fgf9 and muscle, as treatment of muscle and muscle progenitor cells with FGF9 slows maturation, enhances proliferation, and decreases expression of various myogenic genes (Huang et al., 2019). This study found supporting evidence that Fgf9 expression in muscle may be a limiting factor in tuberosity growth. However, it remains unknown how other FGFs and their receptors, FGFRs, regulate superstructure and attachment formation. In this study, we identified potential mediators of skeletal muscle metabolism in Fgf9null muscle, including downregulated mitochondrial-related genes associated with oxidative respiration and proton transport (i.e., Slc36a2 and Ucp1, amongst others). In cultured myoblasts, FGF9 can inhibit myogenic differentiation potentially via increased production of Myostatin (Huang et al., 2019), a well-established mediator of fast glycolytic muscle fibers (Girgenrath et al., 2005; Hennebry et al., 2009). While the role of FGF9 in myoblast fusion has been investigated in vitro, its effect on muscle fiber type and fiber metabolism (i.e., oxidative vs. glycolytic) has not yet been explored. Our findings from bulk RNA-seq of Fgf9null muscle point to potential mechanisms in muscle metabolism that may contribute to the enlarged phenotype that is mimetic of that found in Myostatin deficient mice and other animals (Elkasrawy and Hamrick, 2010; Hamrick et al., 2002). Additionally, further investigations are needed to investigate the potential role of Fgf9 in mitochondrial function and lipid metabolism. Recent work by Huang et al. also identified FGF9 as a potent regulator of calcium signaling and homeostasis in myoblast culture in vitro, and calcium release from the sarcoplasmic reticulum in muscle plays a critical role during embryonic skeletal myogenesis via ryanodine receptor 1 (RYR1). Although Ryr1 was not significantly different in between Fgf9null and WT muscle in the present study, we did find that calmodulin-associated genes (e.g., Calm4, Calml3, Camsap3, Calm5) were all significantly upregulated in Fgf9null muscle compared to WT muscle. Calmodulin interacts with RYR1 and its activation is required for intracellular binding of calcium (Newman et al., 2014, 1). Calmodulin is a crucial component of the calcium signal transduction pathway and also plays an important role in lipid and glucose metabolism (Nishizawa et al., 1988). Taken together, our findings along with recent work by Huang et al. support more mechanistic studies to investigate the metabolic effects of loss and gain of function of Fgf9 on skeletal muscle as well as the muscle secretome.

      In conclusion, this work established a new role of skeletal muscle derived Fgf9 during skeletal development and tuberosity growth. Additionally, our unbiased transcriptomic approaches and rigorous analyses identified new potential mechanisms associated with muscle development, mitochondrial bioenergetics, and muscle metabolism that warrant further investigation into the role of FGF9 in muscle-bone crosstalk.”

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      Summary:

      In this work the authors present a simple mathematical model for the distribution of morphogen molecules that travel via cytonemes through a 1-dimensional system. This model is used as a basis for a software package called Cytomorph that takes as an input a set of experimentally measured distributions of cytoneme dynamics as well as experimenter-determined parameters such as contact probability and method of cytoneme growth and retraction. The Cytomorph package then outputs spatial and temporal information on the distribution of morphogen as well as cytonemes and their contacts with cells and other cytonemes, all obtained over thousands of simulation runs. A number of in silico experiments are then performed to show that these outputs agree with experimentally measured morphogen distributions of Hedgehog in the imaginal wing disc and abdominal histoblast nest. Further in silico experimentation is done to study how this distribution is affected by a wide array of parameters such as producer row number, cytoneme connection method, and connection probability function. Comparisons to the traditional diffusion-based model are also made. The authors find a suite of results based on these experiments and accordingly present the Cytomorph software package as a useful and adaptable tool for the community.

      Major comments:

      While the various in silico experiments present an expansive and exhaustive study of the different ways in which Cytomorph can be used to examine a cytoneme based distribution system, the machinery behind the software is left notably underdescribed. The authors do not sufficiently make clear what exactly happens within each iteration of the simulations run by Cytomorph, leaving the results irreproducible without the reader going into and deciphering the software code itself.

      In order to improve the description of the mathematical and computational steps behind the software, we have created a visual organigram (new Supplementary Figure S.1) with a detailed depiction of the steps. We have also included a short description in the main text and an extended explanation in the Supplementary Material section.

      Some of the specific details left undiscussed are how it is determined when and where a cytoneme will spawn or what its maximum length will be, the dynamics of morphogen transport within the cytonemes, the effects of one cytoneme making multiple connections on how much morphogen is delivered through each connection, and where exactly stochasticity is introduced so as to allow for variations between simulation runs; amongst others.

      In the new description of the software steps, we have tried to address the Referee’s comments about the dynamics and stochasticity in more detail. In order to help the understanding of the variables, we have also tried to improve their description in the main text.

      Additionally, when the authors investigate the diffusion model their stated boundary conditions do not match those presented at the end of the Materials and Methods section. The initial condition u(x,0)=0 and boundary condition du(L,t)/dt=0 represent a perfectly absorbing molecule sink at the x=L end of the system, not the reflecting boundary condition du(L,t)/dx=0 that would correspond to a zero morphogen flux.

      We thank the Referee for noticing this notation mistake: the derivative in the boundary condition should be taken with respect to x, not t. We have corrected this error and included in the Supplementary Materials the exact lines of code used in Matlab pdepe to document the conditions used to solve the diffusion equation (new Supplementary Figure S.10).
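
      To make the difference between the two boundary conditions concrete, the following is a minimal sketch and not the authors' pdepe code. It assumes pure diffusion with a fixed source concentration at x = 0 and no degradation term; the function name `diffuse_1d` and all parameter values are hypothetical. A reflecting end (du/dx = 0) lets morphogen accumulate until the profile is flat, whereas holding u(L, t) = 0 acts as a sink and produces a linear steady-state profile.

```python
import numpy as np


def diffuse_1d(n_cells=21, n_steps=10000, D=1.0, dx=1.0, dt=0.2,
               source=1.0, right_boundary="reflecting"):
    """Explicit finite-difference solution of du/dt = D * d2u/dx2.

    Illustrative assumptions (not taken from the manuscript):
    - left end held at a fixed source concentration, u(0, t) = source
    - no degradation term
    - right end is either 'reflecting' (du/dx = 0, zero flux)
      or 'absorbing' (u(L, t) = 0, perfect sink)
    """
    r = D * dt / dx ** 2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for D*dt/dx^2 > 0.5")

    u = np.zeros(n_cells)  # initial condition u(x, 0) = 0
    u[0] = source
    for _ in range(n_steps):
        new = u.copy()
        # interior nodes: standard three-point Laplacian
        new[1:-1] = u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
        if right_boundary == "reflecting":
            # ghost node equal to u[-2] enforces zero gradient at x = L
            new[-1] = u[-1] + 2.0 * r * (u[-2] - u[-1])
        else:
            # concentration pinned to zero: everything arriving is removed
            new[-1] = 0.0
        new[0] = source
        u = new
    return u


if __name__ == "__main__":
    print("reflecting, last cell:", diffuse_1d(right_boundary="reflecting")[-1])
    print("absorbing, middle cell:", diffuse_1d(right_boundary="absorbing")[10])
```

      With these illustrative settings the two choices give clearly different profiles, which is why the distinction between dx and dt in the stated boundary condition matters.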

      Finally, while the authors spend a great deal of effort analyzing signal variability between simulation runs, there is no effort made to account for the inherently stochastic nature of molecular production, movement, and degradation. Particularly if molecule numbers are small, fluctuations in these processes could greatly increase signal variability. The authors should either address why these fluctuations are negligible or include them in the modelling.

      This work is mainly focused on the transport of the morphogen; other terms, such as degradation, were introduced directly using published experimental data. Regarding the main concern about whether the fluctuations in cytoneme transport are negligible, we agree with the Referee on the importance of this point. Therefore, we have included a detailed description of the variability and fluctuations in a new section of the Supplementary Material. To aid understanding, we have also included a new Supplementary Figure (Supplementary Figure S.11).

      The largest fluctuations were found at the tail of the morphogen gradient (last rows of receiving cells). Since this corresponds to the region where the amount of morphogen is low, the absolute fluctuations do not change the activation of the low-threshold target. We then conclude that those fluctuations are biologically negligible for our study.
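
      The Referee's point about small molecule numbers can be illustrated with a short sketch. It assumes, purely for illustration, that the number of morphogen molecules reaching a cell is Poisson distributed; this assumption and the function name `relative_fluctuation` are not taken from the manuscript. Under that assumption the relative fluctuation scales as one over the square root of the mean, so it grows where the morphogen level is low.

```python
import numpy as np


def relative_fluctuation(mean_molecules, n_runs=20000, seed=0):
    """Coefficient of variation (std / mean) of Poisson-distributed counts.

    Hypothetical illustration: each simulated run draws the number of
    molecules received by one cell from a Poisson distribution.
    """
    rng = np.random.default_rng(seed)
    counts = rng.poisson(mean_molecules, size=n_runs)
    return counts.std() / counts.mean()


if __name__ == "__main__":
    for mean in (4, 100, 400):
        cv = relative_fluctuation(mean)
        print(f"mean = {mean:4d}  relative fluctuation = {cv:.3f}")
```

      Whether such relative fluctuations matter biologically depends on the activation thresholds of the target genes, which is the argument made in the response above.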

      Minor comments:

      The authors should double check all equation and figure references as I noted several instances in which it appeared that the wrong equation or figure was being referred back to. Similarly, the authors should double check the equations themselves, particularly those in the supplemental material.

      We thank the Referee for noticing these mistakes. We have reviewed those references in order to fix the wrongly linked ones.

      Eqs. SM1.1 and SM1.2 have a plethora of parameters with a wide array of different sub- and superscripts that are left unexplained and possibly incorrectly labelled in some cases,

      Equations SM1.1 and SM1.2 described a general form of Triangular and Trapezoidal dynamics and the different sub- and superscripts come from the published experimental data. Nevertheless, in order to make them more intuitive we have simplified the expressions and included a more detailed description of those parameters and their scripts in the revised version.

      while the second line of Eq. SM2.2 is nonsensical unless r_I*p=0 and p_i<=1.

      We thank the Referee for noticing the ambiguity in this equation, which was written in the iterative syntax used in the software code. The code itself therefore never encountered this nonsensical range of data, but we agree that the equation should be stated in mathematical syntax like the rest of the equations in the manuscript. We have therefore redefined the notation and better specified the numerical domains of those variables.

      Additionally, the notation used in Figs. 5 and 6 as well as the bottom part of Fig. 7 is confusing. The caption should more explicitly state what the various expressions in the second row of each column represent.

      The second row represents the statistical analysis between cases coded in a color matrix, as described in the footnote. We thank the Referee for this recommendation because this is not the usual representation. Therefore, we have changed the previous explanation to one that is hopefully clearer and more intuitive; we have also included a specific label in the figures.

      In Fig. 5A specifically it is unclear what exactly the variable phi represents.

      Phi is a widely used notation in biology to define cell size diameter and cell position. We didn't realize it could be unclear. For a better understanding within a multidisciplinary field we have changed this symbol.

      Does it have anything to do with the phi that is used as a position variable for the cells, and if it is a ratio of cytoneme length to cell diameter then why does it have units of microns?

      We agree that this phi notation is confusing. It has been used to indicate distance position as well as cell diameter. Although these variables are biologically related, in the new version of the manuscript we have changed the notation to separate both concepts and avoid misunderstandings.

      Significance:

      As the Cytomorph model and software can be applied to a wide variety of systems involving morphogen transport via cytonemes, it provides a technical advance in our ability to analyze and discuss the results of measurements on cytonemes in a more homogenous way. This work and the resulting software are particularly applicable to, and build on, studies done by other groups that study the dynamics of cytonemes such as the Kornberg lab (works from which are cited by the authors) and the Scholpp lab (such as Stanganello E, Scholpp S. Role of cytonemes in Wnt transport. J Cell Sci. 2016; 129(4):665-672), and as such it is experimental labs such as these that will be the most interested in this manuscript and its findings.

      My field of expertise lies primarily in stochastic modeling and linear response theory. As such, I feel I do not have sufficient expertise to evaluate the experimental methods outlined in this manuscript and determine their level of scientific rigor.

      Reviewer #2

      The manuscript "Improving the understanding of cytoneme-mediated morphogen gradients by in silico modelling" addresses the role of in silico modelling in understanding pattern formation via cytonemes: filopodia that transport signalling molecules to and from cells. Investigating the role of cytonemes and, in particular, their dynamics, during development is an important and emerging field in developmental biology, and there is great potential for mathematical modelling to aid in understanding these processes.

      The present manuscript attempts to derive a general set of equations describing pattern formation in the context of cytonemes, akin to that of the classic Turing model of morphogenesis. The authors replace the standard diffusion term in the PDE with a non-local term, intended to represent transport via cytonemes. This model is then posed over a one-dimensional domain with a source at one end and no flux boundary conditions at the other and is shown to be able to generate a morphogen gradient profile that could pre-pattern a biological tissue. The model is tested against a key experimental system, namely, Hh signalling in the Drosophila wing imaginal disc and is shown to reproduce some experimental results. Finally, the authors have developed a Matlab-based software package that they claim will be applicable to a wide range of systems. This GUI-based software allows users to input experimentally measured averages of cytoneme properties and explore the effect of these properties on tissue patterning.

      My primary concern is that the paper presents itself as a mathematical model of cytoneme formation in general. The authors themselves state in their introduction that the mechanisms for cytoneme generation and maintenance are presently unknown. In fact, it is not even known if they are consistent across biological systems (and in fact, are probably not in general). As such, any present instantiation that connects cytoneme dynamics to tissue patterning can only hope to be specific to a particular system (in this case, the Drosophila wing imaginal disc).

      As mentioned in the introduction, the connection of cytonemes with patterning has been described in several works. We had included a list of publications describing the implication of cytoneme-mediated signaling for several morphogens (FGF, Egf, Hh, Dpp, Wnt or Notch) and in many vertebrate and invertebrate systems (Drosophila, chicken, Xenopus, zebrafish, mouse and human tissue culture cells).

      Whilst one may use general models (like the heat equation) to study pattern formation since it requires only specification of parameters, the model here requires specification of families of functions, that are likely to differ from context to context and so the model is not general.

      Our model inputs are parameters determined experimentally rather than families of functions. This misunderstanding might derive from the use of triangular and trapezoidal dynamics, which are equations included in the software code but not input functions. To avoid this confusion, we have specified the input data in tables S.1 and S.2 and clarified in the main text that the triangular or trapezoidal family of functions are just the names for the basic dynamics of cytonemes (triangular for elongation and retraction, and trapezoidal when there is a stationary phase in between).
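
      The triangular and trapezoidal dynamics described above can be sketched as a piecewise-linear length function. This is a minimal illustration under stated assumptions, not the Cytomorph code: the function name `cytoneme_length` and its parameter names are hypothetical, and constant elongation and retraction velocities are assumed.

```python
def cytoneme_length(t, v_elong, t_elong, t_static, v_retract):
    """Cytoneme length at time t after extension starts.

    Hypothetical parameters (not the manuscript's notation):
    v_elong   -- elongation velocity
    t_elong   -- duration of the elongation phase
    t_static  -- duration of the stationary phase
                 (0 gives triangular dynamics, > 0 gives trapezoidal)
    v_retract -- retraction velocity
    """
    if t < 0:
        return 0.0
    max_length = v_elong * t_elong
    if t <= t_elong:
        # elongation phase: length grows linearly
        return v_elong * t
    if t <= t_elong + t_static:
        # stationary phase: length stays at its maximum
        return max_length
    # retraction phase: length shrinks linearly, never below zero
    retracted = v_retract * (t - t_elong - t_static)
    return max(max_length - retracted, 0.0)


if __name__ == "__main__":
    # triangular: no stationary phase
    print([cytoneme_length(t, 2.0, 10, 0, 4.0) for t in (0, 5, 10, 12, 15)])
    # trapezoidal: stationary phase of 5 time units
    print([cytoneme_length(t, 2.0, 10, 5, 4.0) for t in (10, 12, 15, 17, 20)])
```

      In this sketch the only inputs are scalar times and velocities, which matches the point made in the response that the shape of the dynamics is fixed and only measured parameters are supplied.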

      Ultimately, the model is a statistical modelling framework masquerading as a mechanistic one.

      In this work, we have not specified the mathematical area to which the model belongs. Furthermore, we always explicitly described the different variables and functions modeled. Therefore, we do not understand what the supposed masquerade is.

      As further evidence of the lack of generality of the model, the studied domain is only one dimensional and has signalling sources at one end. This scenario is perfectly adequate for theoretical explorations of pattern-forming systems but is highly unlikely to capture the geometrical intricacies of real-world systems (and I note that even in the diffusive case, boundary conditions are critical for understanding what patterns ultimately arise for a given system).

      We agree with the Referee that there are cases in biological systems in which it is required to work in 2D or even 3D to have a full comprehension of the process. Nevertheless, those are mainly related to biological patterns rather than to biological signaling gradients, which usually are studied (experimentally and theoretically) in 1D. Therefore, we have limited our model to this case and compared our in silico results with the published experimental data. In any case, we have emphasized in the text that our model is limited to signaling gradients with the source at one end, which is the case of the best studied morphogens: Hh (Sonic-hh), Dpp (BMP) or Wg (Wnt).

      In fact, as proof of the generality of the model, we have predicted different properties of Dpp and Wg gradients using our model. We then validated the simulated results using the experimental data obtained from independent publications.

      To simulate their model, the authors need to specify triangular and trapezoidal functions, which are unlikely to be generalisable to all contexts. As such, the model is not general and, in particular, there is no way to change the software to make it so.

      Cytonemes are filopodial structures based on actin filaments that polymerize and depolymerize to elongate and retract. This is a general process for all filopodial structures, which is why a previously published work classified cytoneme behavior as triangular or, if the dynamics include a stationary phase, as trapezoidal (González-Méndez et al., 2017). Therefore, these functions are just a categorization introduced to better describe the intrinsic dynamics of cytonemes, and they could be applied to most experimental cases. To address this Referee's concern, we have included in the introduction a more detailed description of these behaviors, as well as references to publications describing the dynamic behaviors of cytonemes for different morphogens and in different organisms.

      Trying to make a generalization for all cases, we included in the model those situations in which the cytonemes were static rather than dynamic (detailed simulations comparing dynamic and static cases can be found in the old Supplementary Figure S.5 A (now S.7 A)).

      We have concluded that the model can be considered generalizable since it includes the simplest and most general cases in terms of cytoneme dynamics.

      Whilst the development of a GUI for this scenario is a nice contribution, I feel that the lack of generalisability will, at best, mean that the software enjoys little use, and at worst, may lead researchers unfamiliar with the modelling context to misuse it in error.

      Once we knew the model could be generalized, we were concerned about the misuse of the mathematical model, and that was the reason why we decided to develop a GUI as simple as possible.

      Furthermore, the online repository contains, together with the open software, a user guide for Cytomorph with a full description of parameters, variables and outputs and how to use them properly.

      In my opinion, this work would be better suited as a presentation of specific mathematical modelling of tissue patterning in the Drosophila wing imaginal disc. In this case, many of the above concerns would be addressed.

      We have rewritten part of the text to indicate the limits of the model and make clear that it has been tested experimentally for the Hh pathway and in two different developing systems: wing imaginal discs and abdominal histoblast nests.

      As evidence of a more general use of Cytomorph, we have added in the revised version of the manuscript a new section focused on data prediction for the gradients of Dpp and Wg. We have also included supplementary figures that validate the predictions of our model using published experimental data.

      That said, there are still a number of issues with the presentation of the model and results. I shall detail these in the bullet point list below:

      1. The domain for Eq. 1 needs to be made explicit. Later, it appears that the domain is a closed one-dimensional interval, but the use of arrows here implies that x is a vector and hence x ∈ D ⊂ ℝⁿ with n > 1.

      We initially described the general equation for morphogens with x ∈ ℝⁿ and later limited it to 1D. This is why x initially carried an arrow, as a vector, although later it was a scalar variable. Since we were interested in 1D in this work, to avoid this kind of misunderstanding we have rewritten the equations in 1D from the beginning and clearly specified the x domain used: the set of natural numbers, x ∈ ℕ₀.

      2. It is unclear over what the sum in Eq. 2 is being taken.

      The sum in Eq. 2 is over the number of producing cell rows. We have changed the notation to clarify this point.

      3. The statement "we used the discrete cell position x = φ as spatial coordinate" is vague and does not help the reader understand the discretization.

      The number of cell diameters is a widely used discrete unit for position in Developmental Biology. As we expect the readers of this publication to be multidisciplinary, we have changed the notation to avoid misunderstandings and clarify this discretization.

      4. p is used both as a probability and as an index for producer cells. This is confusing.

      We have changed the notation to avoid misunderstandings.

      5. As previously stated, the choice of trapezoidal/triangular cytoneme dynamics is not general. More work needs to be done to showcase how the authors came to the conclusion that this is the best choice, and how the functions (and their associated parameters) describing them were selected.

      The names triangular and trapezoidal stand for the published dynamics of cytoneme elongation and retraction, and we have already argued for their generality. As we specified in the manuscript, these types of behaviors have been experimentally observed and, therefore, we considered that the experimental observation was reason enough to include them in the model. If more details are required, the Materials and Methods section and Supplementary Table S.3 show that the times measured for triangular and trapezoidal dynamics are statistically different and, consequently, both behaviors have to be considered.

      As mentioned in the manuscript, the associated parameters represent the times and velocities for the elongation or retraction that have already been thoroughly analyzed and published (González-Méndez et al., 2017). The question of the Referee about how these functions affect the gradient is answered in the text and in Figure 7 F.

      6. I can see how Type 1 and Type 2 cytonemes could be expanded naturally to a higher dimensional case, but it is not clear how Type 3 cytonemes could be, since the probability of any two cytonemes occupying the same space in higher dimensions is likely to be small (if they are imbued with independent dynamics).

      We agree with the Referee on this point. It is something that should be considered for future extensions of the model to higher dimensions. For instance, a complex scenario in 2D would require a cytoneme guidance model. Nevertheless, since the present study is limited to 1D, this concern is not applicable to the current model.

      7. The statement: "the distance between cells must be smaller than, or equal to, the maximum length of the cytonemes" seems inconsistent with the equations below since λ(t) does not appear to be a maximum length.

      The length of the cytonemes is controlled as a dynamic function described by λ(t). Our statement referred to the maximum length at each time step, which is given by λ(t). We agree that the initial statement could lead to misunderstanding, so we have removed the word “maximum”.

      8. I think the authors are confusing probabilities and rates in their discussion of the model. Eq. 1 is a density model and so calling events probabilities here is slightly misleading. As a more general statement, I am currently interpreting contact function C as one defined as a rate, rather than as a set of probabilistic terms. If the latter is true, then Eq. 1 is invalid since it mixes processes at different levels of description.

      We thank the Referee for this comment. We have studied this observation in depth, but we could not determine exactly why the Referee considers that the model works at different levels. Even though we could not find where in the text we called the events of Eq. 1 “probabilities”, we have rewritten the text to make clear what we consider a probability and what we consider a rate. In addition, in the Supplementary Material section we clarify how the model works and at what levels of modeling we are working.

      Significance

      In general, the paper is well written, however, the focus of the findings should be on patterning within an epithelium such as the Drosophila wing imaginal disk.

      The work will be interesting for the developmental biology community as well as for the upcoming biomathematical modelling community.

      Expertise: Developmental biologist with experience in tissue patterning and morphogen gradients

      Referees cross-commenting

      I agree with Reviewer 3 that the importance of cytoneme-mediated signalling has been described in several systems - invertebrates and vertebrates. However, I think the focus of this work in particular should be on cytoneme signalling in the wing imaginal disc. IMO, this would not limit the conclusion but rather focus it and make it then applicable to epithelial tissues in general. I agree with the other point.

      Reviewer #3

      There is much to like in this thoughtful and worthwhile study that develops mathematics to describe how cytonemes might generate experimentally observed Hh gradients. Two suggestions:

      1. I am not equipped to evaluate the mathematics and as a non-expert would find it helpful if the authors explicitly stated at the outset what assumptions they took, the specific contexts they sought to model, and the parameters that they explored.

      We agree with the Referee on the excessively mathematical focus of our interpretation of the results in the old version of the manuscript. We have rewritten part of the text to clarify the biological implications of the variables and simulations explored.

      Am I correct that they assume that the Hh gradient correlates with a cytoneme gradient, that all cytoneme contacts have the same duration and exchange equivalent amounts of Hh, and that the variables that were characterized are cytoneme length distributions, cytoneme extension rate, contact duration, and cytoneme density?

      Since the mechanism of morphogen exchange is not fully identified, we assumed the simplest case in which all the contacts have the same duration and exchange the same amount of morphogen. Using this approach, we were able to reproduce the gradient and concluded that it is not strictly necessary to propose a more complex mechanism to establish a graded distribution of morphogens. We therefore worked under this assumption.
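
      The following toy sketch illustrates why the equal-exchange assumption can already yield a graded distribution. It is not the Cytomorph algorithm: it assumes a single producer cytoneme, a deterministic length trace, and one unit of morphogen delivered per time step in which a receiving row is within reach (distance ≤ λ(t)). The function name `contact_profile` and the example values are hypothetical.

```python
def contact_profile(n_rows, cell_diameter, lengths):
    """Count, for each receiving row, the time steps in which it is reachable.

    Hypothetical illustration:
    n_rows        -- number of receiving cell rows, row 1 nearest the source
    cell_diameter -- spacing between rows, same units as the lengths
    lengths       -- cytoneme length at each time step, i.e. samples of lambda(t)

    A row counts as contacted at a time step when its distance from the
    source is smaller than or equal to the cytoneme length at that step.
    Every contact is assumed to deliver the same amount of morphogen.
    """
    profile = []
    for row in range(1, n_rows + 1):
        distance = row * cell_diameter
        contacts = sum(1 for lam in lengths if distance <= lam)
        profile.append(contacts)
    return profile


if __name__ == "__main__":
    # triangular length trace, in units of cell diameters
    trace = [0, 1, 2, 3, 4, 3, 2, 1, 0]
    print(contact_profile(n_rows=5, cell_diameter=1, lengths=trace))
```

      Rows closer to the source stay within reach for more time steps, so they accumulate more contacts even though every contact is identical.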

      The variables characterized were the ones pointed out by the Referee, mainly cytoneme features such as the cytoneme length distributions or the different parameters of the temporal dynamics. We have tried to define these variables better in the new version of the manuscript.

      2. One of the unusual features of the Hh gradient in the wing disc is that the size of the posterior compartment field of Hh-producing cells is large relative to the size and extent of the Hh gradient in the adjacent anterior compartment. Wing discs with large hh mutant clones, wing discs with large smo mutant clones, and wing discs with ttv mutant clones that block Hh uptake provide evidence that the Hh gradient is constituted with Hh that is produced by many cells, some that are far from the compartment border as well as some that are close. Has this been factored into the authors' model?

      Indeed. Being aware of the importance of the size of the signal source, we simulated how changing the size of the posterior compartment affects the gradient (altering the number of producing cell rows involved, figure 5B). In the old version of the manuscript we had focused on the theoretical approach, so we thank the reviewer for noticing that we should introduce a more biological point of view. Therefore, we included in the revised version of the manuscript a biological interpretation of how our simulations can help to understand the question posed by the reviewer.

      Does the fact that the relative size of the posterior compartments and Hh gradients in the histoblasts is not as extreme as it is in the wing disc influence their model?

      Following the Referee’s question, we decided to simulate the influence of the relative size of the posterior compartment in the abdominal histoblast nests. We found that in both wing discs and histoblasts, the size of the posterior compartment affects the gradient, but by a different scale factor. We have included these data in the revised version of the manuscript (new Supplementary Figure S.5).

      Interestingly, this feature of the Hh gradient in the wing disc is not shared with other gradients in the wing disc such as the Wg, Dpp, and Bnl gradients. I would be interested to know if the authors' model can be queried to suggest what properties might contribute to this difference?

      In order to answer the reviewer's question, we have used our model to tentatively simulate Wg and Dpp gradients. Our preliminary results suggest that, considering only cell position and cytoneme length, the Wg and Dpp gradient lengths can be predicted in the wing imaginal disc. Nevertheless, each morphogen has its own particularities and further studies are required for a precise simulation of these gradients. We included these results in a new section of the manuscript and in the new Supplementary Figure S.9.

      Significance

      This is an important contribution to gaining a basic understanding of the role of various properties of dynamic cytonemes to gradient formation.

      Referees cross-commenting

      I discount the apparently strongly held opinion of Reviewer #2 that "it is not even known if they [cytonemes] are consistent across biological systems (and in fact, are probably not in general)". I do not know where this comes from and do not think that such opinions are appropriate for anonymous reviews.

      Cytoneme-mediated signaling has in fact been observed and characterized in many diverse biological systems. I submit that in contrast, mechanisms of dispersion based on diffusion are inferred and lack direct experimental evidence. I do agree that it is fair to ask the authors to carefully describe their work in the context of epithelial signaling, but it is not correct to ask them to limit their conclusions to the wing disc as the authors analyze both wing disc and histoblast signaling. They clearly state that their work is limited to 1D and so we understand that it is inadequate to model 3D morphologies. I do not criticize them for this.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript "Improving the understanding of cytoneme-mediated morphogen gradients by in silico modelling" addresses the role of in silico modelling in understanding pattern formation via cytonemes: filopodia that transport signalling molecules to and from cells. Investigating the role of cytonemes and, in particular, their dynamics, during development is an important and emerging field in developmental biology, and there is great potential for mathematical modelling to aid in understanding these processes.

      The present manuscript attempts to derive a general set of equations describing pattern formation in the context of cytonemes, akin to that of the classic Turing model of morphogenesis. The authors replace the standard diffusion term in the PDE with a non-local term, intended to represent transport via cytonemes. This model is then posed over a one-dimensional domain with a source at one end and no flux boundary conditions at the other and is shown to be able to generate a morphogen gradient profile that could pre-pattern a biological tissue. The model is tested against a key experimental system, namely, Hh signalling in the Drosophila wing imaginal disc and is shown to reproduce some experimental results. Finally, the authors have developed a Matlab-based software package that they claim will be applicable to a wide range of systems. This GUI-based software allows users to input experimentally measured averages of cytoneme properties and explore the effect of these properties on tissue patterning.

      My primary concern is that the paper presents itself as a mathematical model of cytoneme formation in general. The authors themselves state in their introduction that the mechanisms for cytoneme generation and maintenance are presently unknown. In fact, it is not even known if they are consistent across biological systems (and in fact, are probably not in general). As such, any present instantiation that connects cytoneme dynamics to tissue patterning can only hope to be specific to a particular system (in this case, the Drosophila wing imaginal disc). Whilst one may use general models (like the heat equation) to study pattern formation since it requires only specification of parameters, the model here requires specification of families of functions, that are likely to differ from context to context and so the model is not general. Ultimately, the model is a statistical modelling framework masquerading as a mechanistic one.

      As further evidence of the lack of generality of the model, the studied domain is only one dimensional and has signalling sources at one end. This scenario is perfectly adequate for theoretical explorations of pattern-forming systems but is highly unlikely to capture the geometrical intricacies of real-world systems (and I note that even in the diffusive case, boundary conditions are critical for understanding what patterns ultimately arise for a given system). To simulate their model, the authors need to specify triangular and trapezoidal functions, which are unlikely to be generalisable to all contexts. As such, the model is not general and, in particular, there is no way to change the software to make it so. Whilst the development of a GUI for this scenario is a nice contribution, I feel that the lack of generalisability will, at best, mean that the software enjoys little use, and at worst, may lead researchers unfamiliar with the modelling context to misuse it in error.

      In my opinion, this work would be better suited as a presentation of specific mathematical modelling of tissue patterning in the Drosophila wing imaginal disc. In this case, many of the above concerns would be addressed. That said, there are still a number of issues with the presentation of the model and results. I shall detail these in the bullet point list below:

      1. The domain for Eq. 1 needs to be made explicit. Later, it appears that the domain is a closed one-dimensional interval, but the use of arrows here implies that x is a vector and hence x ∈ D ⊂ Rn with n > 1.
      2. It is unclear over what the sum in Eq. 2 is being taken.
      3. The statement "we used the discrete cell position x = φ as spatial coordinate" is vague and does not help the reader understand the discretization.
      4. p is used both as a probability and as an index for producer cells. This is confusing.
      5. As previously stated, the choice of trapezoidal/triangular cytoneme dynamics is not general. More work needs to be done to showcase how the authors came to the conclusion that this is the best choice, and how the functions (and their associated parameters) describing them were selected.
      6. I can see how Type 1 and Type 2 cytonemes could be expanded naturally to a higher dimensional case, but it is not clear how Type 3 cytonemes could be, since the probability of any two cytonemes occupying the same space in higher dimensions is likely to be small (if they are imbued with independent dynamics).
      7. The statement: "the distance between cells must be smaller than, or equal to, the maximum length of the cytonemes" seems inconsistent with the equations below since λ(t) does not appear to be a maximum length.
      8. I think the authors are confusing probabilities and rates in their discussion of the model. Eq. 1 is a density model and so calling events probabilities here is slightly misleading. As a more general statement, I am currently interpreting contact function C as one defined as a rate, rather than as a set of probabilistic terms. If the latter is true, then Eq. 1 is invalid since it mixes processes at different levels of description.

      Significance

      In general, the paper is well written; however, the focus of the findings should be on patterning within an epithelium such as the Drosophila wing imaginal disc.

      The work will be interesting for the developmental biology community as well as for the upcoming biomathematical modelling community.

      Expertise: Developmental biologist with experience in tissue patterning and morphogen gradients

      Referees cross-commenting

      I agree with Reviewer 3 that the importance of cytoneme-mediated signalling has been described in several systems - invertebrates and vertebrates. However, I think the focus of this work in particular should be on cytoneme signalling in the wing imaginal disc. IMO, this would not limit the conclusion but rather focus it and make it then applicable to epithelial tissues in general. I agree with the other point.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Response to Reviewers "Cell-cell communication through FGF4 generates and maintains robust proportions of differentiated cell types in embryonic stem cells"

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript Raina et al. use an in vitro model of PE specification based on the transient overexpression of GATA4 in ESCs to show that the acquisition of primitive endoderm (PE) identity is governed at the population level by cell-cell interactions mediated by FGF signaling. The authors further argue that the specification of a defined proportion of "PE" and "Epiblast" cells in a differentiating population of ESCs is an emergent property of a system where paracrine signaling shifts the balance between two alternative stable states. Overall, the work does not reach radically new conclusions: broadly similar models are outlined in several other publications, including from the authors. Yet this study makes use of elegant genetic models and is particularly well executed. In addition, it includes a very accurate characterisation of the spatial range of FGF signaling activity that is original and adds to existing knowledge. Moreover, the authors show novel evidence suggesting that GATA factors inhibit Fgf4 transcription and the activity of the FGF signaling pathway in ESCs.

      We thank the Reviewer for commending the execution of the experiments, and for highlighting the novel insights that they bring. The Reviewer acknowledges that the specification of a defined proportion of PrE-like and Epiblast-like cells in a differentiating population of ESCs is an emergent property which is mediated by paracrine FGF4 signaling. This has not been experimentally demonstrated before. In contrast to the Reviewer’s assertion, we therefore think that our work does reach a conclusion that is radically different from previous experimental studies, a view that is also shared by Reviewer #3 below. In a revised version of the manuscript we will further emphasize the conceptual differences between published models that focus on single cell dynamics, and our experimental and theoretical demonstration of qualitatively different dynamics that emerge at the population level as a consequence of cell fate coupling.

      **Two major points deserve further clarification:**

      In this manuscript the authors claim that the proportion of cells acquiring PE fate is, at least in the experimental setup adopted, largely independent of the levels of GATA4 induction, and therefore of the initial state of the gene regulatory network regulating this cell fate transition. However, the authors should discuss how the current findings relate to their previous results, showing that the duration/levels of Gata4 induction, in a similar experimental setting, play an important role in determining the final proportion of cells acquiring "PE" fate. Absolute expression levels may be crucial for this distinction, but the authors seem to exclude this possibility (see figure S3).

      The different roles of GATA4-mCherry induction levels in determining the final proportion of cells acquiring a PrE-like fate, as reported in our previous work (PMID: 26511924) and in the current work, are due to important differences in the experimental settings between the two studies. In PMID: 26511924, we assayed PrE-like differentiation in medium supplemented with serum and LIF, which provides exogenous signals that promote PrE-like differentiation. These conditions reveal the function of the cell-autonomous circuit, in which GATA4-mCherry levels do control the probability of PrE-like differentiation. In the current work, we likewise observe that cell type proportions depend on GATA4-mCherry induction levels when we supply exogenous FGF4 during the differentiation of wild type cells (Figures S2C and S3D, lower panel). Differentiation in the absence of exogenous factors, in contrast, reveals the behavior of the coupled system, in which cell type proportions are independent of GATA4-mCherry induction levels.

      Furthermore, in the present manuscript, we use new inducible cell lines in which the majority of cells can be induced above the critical GATA4-mCherry threshold required for PrE-like differentiation, in contrast to our previous study where the distribution of GATA4-mCherry induction levels was straddling this threshold.

      In a revised version of the manuscript, we will more explicitly emphasize these important differences in the experimental design between the two studies, and discuss how the specific conditions in the present study lead to new conclusions.

      Most importantly, the authors incorporate in their model the notion that GATA6 inhibits FGF signaling. It would be interesting to understand how such inhibition is mechanistically mediated. For instance GATA6 has been shown to bind in proximity of the Fgfr2 gene (Wamaitha et al., Genes and Dev., 2015). Alternatively, the authors show a direct effect on Fgf4 expression. The short time window of the reported repressive transcriptional effects (8h, Fig 2 middle), might suggest a direct regulation. The authors should test this possibility, and discuss what alternative modes of regulation could be envisaged (for instance, indirect effects mediated by Nanog). This is a key result that deserves a more detailed mechanistic characterisation.

      All three reviewers pointed out the regulation of FGF signaling by GATA factors as a central new result of our study, and we will be happy to expand on it in a revised manuscript. Regulation of Fgfr2 expression by GATA6 as suggested by the ChIP-seq data in Wamaitha et al., 2015 (PMID: 26109048) is one possible mechanistic explanation that we will of course discuss.

      Most importantly, we will test possible direct effects of GATA factors on Fgf4 expression that are indicated by the short timescales of the transcriptional effects shown in Fig. 2, as noted by the Reviewer. We have already mined the ChIP-seq data from Wamaitha et al., 2015 (PMID: 26109048) and found a GATA6-binding peak approximately 10 kb upstream of the Fgf4 start codon in a region that is highly enriched for GATA6 consensus binding sites. To test the functional role of this binding region, we propose to delete it by CRISPR-mediated mutagenesis in the inducible lines, and to test its ability to regulate reporter gene expression in heterologous assays.

      To address the question of alternative modes of regulation of Fgf signaling through NANOG, we have already performed in situ mRNA stainings for Fgf4 expression in cells grown for 40 h in N2B27 medium. While Nanog expression is much reduced under these conditions, Fgf4 mRNA continues to be expressed, indicating that positive regulation through NANOG is not essential for Fgf4 mRNA expression in ESCs. We will add this data to a revised manuscript, and discuss its implications for the regulation of Fgf4 transcription (see also our response to Reviewer #3 below). As a complementary approach to further test the role of indirect effects mediated through NANOG, we will dissect more closely the timing of Fgf4 downregulation reported in Fig. 2B relative to the upregulation of the inducible GATA4-mCherry protein and the downregulation of NANOG protein.

      **Minor points:**

      Fig S1: The authors should show quantifications of Nanog and GATA6 levels before the beginning of the differentiation protocol.

      We will be happy to add this data in a revised version, as part of a more extensive analysis of GATA4-mCherry and GATA6 expression at early stages of the differentiation protocol. See also our response to the next point.

      Line 106: The authors write "the initially large proportion of GATA6+; NANOG+ double positive cells". It appears that at 16h of differentiation ESCs have already partitioned between Gata6 or Nanog expressing cells. The authors should rephrase the sentence to reflect what seems to be an almost total absence of truly double positive cells. Possibly, an analysis conducted at earlier time points could clarify these dynamics.

      The Reviewer rightly points out that at 16 h of differentiation, most cells are already associated with one of two clusters in the NANOG/GATA6 expression space. The misleading classification of a large number of cells as double positive at 16 h was caused by applying a single gating strategy to the entire experiment, even though the mean expression levels of NANOG and GATA6 in the two clusters change significantly over time. We will update our gating strategy and rephrase this section to more appropriately describe cell clustering and gene expression dynamics over the time course. We will also extend Figure S1 with analysis of GATA6 and NANOG expression levels at earlier time points of the differentiation protocol, to test whether this allows detecting a truly double positive population.

      Line 124: The authors write "... concentration dependent downregulation of NANOG expression". The effects may rather depend on the time of doxycycline stimulation.

      We agree with the Reviewer that in isolation, the data shown in Fig. 1 and Fig. S2 leave open the possibility that the stronger downregulation of NANOG at higher GATA4-mCherry expression levels is caused by the extended time of doxycycline stimulation rather than GATA4-mCherry concentration. However, in our opinion, this concern is already addressed by the experiments performed in the four clonal lines with independent integrations shown in Figure S3. Here, the time of doxycycline induction is held constant, and a similar relationship between GATA4-mCherry and NANOG expression levels is observed as in the experiments where we modulate induction time in a single clonal line (compare Fig. S2A to Fig. S3B). In a revised version of the manuscript we will describe more clearly how the experiments shown in Figure S3 control for time-dependent effects of doxycycline stimulation.

      Line 192: The authors write "...and confined to cells with low GATA4-mCherry expression levels". It would be helpful to have an indication of the cell boundaries, possibly showing localisation of a membrane bound protein.

      We agree that more firmly establishing a correlation between GATA4-mCherry expression levels and Fgf4 mRNA expression in single cells would greatly benefit from co-staining with a plasma membrane marker. However, the protocol for mRNA in situ hybridization involves incubation steps with ethanol and formamide and is thus incompatible with staining for commonly used membrane markers. There is one commercially available membrane stain (CellBrite by Biotium) that promises to survive the treatments necessary for in situ hybridization and that we will try to use in our stainings. Should this not be successful, we will resort to identifying a subset of the cytoplasm corresponding to each nucleus by dilating nuclear masks that we will segment based on the DNA stain.
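The fallback described above, approximating a per-cell cytoplasmic region by dilating segmented nuclear masks, can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline: the function name `dilate_labels`, the `radius` parameter, the 4-connected neighbourhood, and the toy label image are all assumptions made for illustration, and a real analysis would more likely rely on an image analysis library.

```python
from collections import deque


def dilate_labels(labels, radius):
    """Grow each nonzero nuclear label into background (0) pixels.

    Hypothetical helper: expands every label by up to `radius`
    4-connected steps and never overwrites a pixel that already
    belongs to a label. The input image is left unchanged.
    """
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    # Multi-source breadth-first search seeded from all nuclear pixels,
    # so each background pixel is claimed by a label that reaches it first.
    queue = deque(
        (r, c, 0) for r in range(rows) for c in range(cols) if labels[r][c]
    )
    while queue:
        r, c, dist = queue.popleft()
        if dist == radius:
            continue  # this pixel is already at the maximum dilation distance
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and out[nr][nc] == 0:
                out[nr][nc] = out[r][c]
                queue.append((nr, nc, dist + 1))
    return out


# Toy label image with two "nuclei" (labels 1 and 2); values are illustrative.
nuclei = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 2, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
dilated = dilate_labels(nuclei, 1)
# Cytoplasmic ring: pixels gained by dilation that were not nuclear.
ring = [
    [d if n == 0 else 0 for d, n in zip(d_row, n_row)]
    for d_row, n_row in zip(dilated, nuclei)
]
```

In such a scheme, mRNA spots falling inside a ring would be assigned to the nucleus carrying the same label. The dilation radius would need to be chosen so that rings of neighbouring cells do not swallow each other's signal.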

      It would be interesting for the authors to discuss how the spatial range of FGF activity measured in culture could affect PE specification in the embryo.

      During lineage specification in the embryo, Epi and PrE cells are initially arranged in a salt-and-pepper pattern (PMID: 16678776; PMID: 18725515; PMID: 30514631). In Fig. 4 and Fig. S9 of our manuscript, we show experimentally and theoretically how similar patterns in ESC colonies arise from the short range of FGF activity. In a revised version of the manuscript, we will discuss how the spatial range of FGF activity measured in culture provides a possible mechanistic explanation for the spatial arrangement of cell types in the embryo.

      Reviewer #1 (Significance (Required)):

      See above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In their manuscript entitled "Cell-cell communication through FGF4 generates and maintains robust proportions of differentiated cell types in embryonic stem cells" Raina et al. study the effect of Fgf-signalling based local cell-cell communication for the establishment of PrE-like and Epi-like cells. The authors use an elegant, albeit artificial, system to analyse the effect of Fgf signalling on establishing 'normal' lineage proportions after transient induction of Gata4 expression. The main conclusions of the manuscript are: i) Gata6 positive cells emerge through short range Fgf4 based cell-cell communication. ii) Fgf4 signalling can compensate a wide range of initial levels of Gata6 expression and produce properly proportioned cell identities. The authors also state that this mechanism could operate in a range of developing tissues.

      **Major points:**

      1. Fgf4 KO ESCs are deficient in initiating epiblast lineage differentiation (Kunath 2007). Therefore, the effect studied by the authors might be multifactorial and the general inability of Fgf4 deficient cells to enter differentiation might contribute to the observed differentiation defects and defects of cell fate proportioning. Specifically, it could be expected that Nanog regulation is affected in Fgf4 mutants, although, to my knowledge, the specific phenotype of Fgf4 depletion has not been evaluated in Gata4 induced cell programming towards PrE. What steps have the authors taken to exclude an impact of general cell fate change defects in Fgf4 KO ESCs?

      While it is true that Fgf4 mutant cells have a general deficiency in initiating epiblast lineage differentiation, it was already shown in the original publication by Kunath et al. (PMID: 17660198) that general differentiation of Fgf4 mutant cells is restored to wild type levels by supplementing the culture medium with 5 ng/ml recombinant FGF4. This is a concentration that is well within the range of concentrations applied in our study. In initial experiments to characterize our Fgf4 mutant lines, we have measured NANOG expression to test the effectiveness of recombinant FGF4 to restore epiblast lineage differentiation. We found that FGF4 treatment of Fgf4 mutant cells in the absence of doxycycline induction leads to a downregulation of NANOG expression, to levels comparable to those seen in wild type cells grown in N2B27. These data indicate that treatment with recombinant FGF4 rescues defects of general cell fate change in Fgf4 KO ESCs. We will add these data to Figure S4 of a revised manuscript, and explicitly mention the function of recombinant FGF4 to rescue lineage differentiation potential more generally.

      Increasing the time of Gata4 expression results in increasing Gata4 levels (Fig 1C). This is shown at the overall mean fluorescence level. However, it is important to also quantify how many cells actually show some increase in Gata4 levels. Fig 1D suggests that the number of Gata4 expressing cells is quite similar between 4h and 8h induction, but this needs to be quantified. An explanation for the apparent dosage independence of Gata4 could then be simple threshold effects, such that there is no additional effect of increased Gata4 levels in WT cells without any further requirement of feedback regulation after a certain threshold level of Gata4 is reached. Have the authors considered such a simple model?

      The current version of the manuscript already contains quantifications of GATA4-mCherry expression levels in single cells - see Fig. S2A for the experiments where we vary doxycycline induction time, and Fig. S3B for experiments with independent clonal lines. This analysis confirms the Reviewer’s visual impression of Fig. 1D - the number of GATA4-mCherry expressing cells is similar for different induction times and clonal lines, such that the increase in overall mean fluorescence levels is mainly due to an increase in GATA4-mCherry expression levels in single cells. This analysis therefore rules out the simple model based on threshold effects proposed by the Reviewer. In a revised version of the manuscript, we will more explicitly discuss the quantifications in Fig. S2A and Fig. S3B.

      An important point is that in the current setup dosage effects and effects of extended presence of Gata4 cannot be distinguished. Wouldn't titrating the amount of doxycycline used for induction be a more direct way to achieve different initial levels of Gata4 expression?

      This concern has also been raised by Reviewer #1, and is addressed in detail in our response to their comment above. Briefly, in our opinion this concern is addressed in the current manuscript by the experiments performed in the four clonal lines with independent integrations (Figure S3). Here, the duration of doxycycline induction and hence time of GATA4-mCherry exposure is held constant, such that the only difference between the conditions is GATA4-mCherry dosage. We will discuss this important function of Fig. S3 in a revised version of the manuscript.

      Unfortunately titrating doxycycline does not allow titrating transgene induction levels in a meaningful way, as sub-saturating doses of doxycycline lead to an increased heterogeneity in transgene expression with many non-expressing cells, rather than to reduced expression levels across all cells. See PMID: 17048983 for a possible explanation of this observation.

      Another point the authors should appropriately discuss and consider is that a lack of effect of different doses/durations of Gata4 expression could be due to the fact that by the time Gata6 is induced, the levels of Gata4 in cells previously treated for different periods of time are no longer detectably different. Such a regulation would equally result in indistinguishable cell fate proportioning. Can the authors exclude such a regulation? This is an important point at the heart of the authors' conclusion.

      The Reviewer seems to suggest that by separating the initiation of GATA6 expression from the GATA4-mCherry pulse in time, the decision to initiate PrE-like differentiation could be independent from GATA4-mCherry concentration, thus explaining the robust cell type proportions. The data shown in Figs. S2C, S3D and Fig. 3 A - C clearly exclude such a regulation: In conditions where we supply recombinant FGF4, the proportions of the different cell types scale with GATA4-mCherry expression levels, indicating that GATA4-mCherry dose does indeed affect Gata6 expression. In a revised version of the manuscript we will discuss and consider how these observations argue against a model where the decision to initiate PrE-like differentiation occurs independently from GATA4-mCherry levels.

      The authors make some general statements on cell differentiation (e.g. l205). They also claim that the Fgf4-based mechanism of lineage proportioning could act in a range of tissues during development. However, the use of the term differentiation for the induction of PrE-identity (or Gata-factor expression to be exact, see comment below) after Gata4 overexpression is problematic. The system chosen by the authors is entirely artificial. ES cells normally do not differentiate into extraembryonic cell types. It needs to be made clear in the manuscript that they do not study a differentiation process that normally occurs in the embryo or in differentiating ESC cultures. The system the authors are using would, in my opinion, rather qualify as cell programming or transdifferentiation than as differentiation. I suggest presenting the system using clearer unambiguous language and to try to avoid any generalisations based on an artificial transgene-overexpression based system. The results have to be presented with this limitation in mind.

      To address the Reviewer’s concerns regarding terminology, we will expand on the relationship of our system to normal ESC differentiation and lineage specification in the embryo, and discuss its possible limitations. We disagree however with the Reviewer’s assertion that using a transgene-based overexpression system precludes drawing any general conclusions. Rather, the system allows mimicking Epi- and PrE-like differentiation in a uniquely accessible context, and thereby to exploit the molecularly simple regulation of this cell fate decision for studying basic principles of cell differentiation. This view is supported by Reviewer #3 in the referees cross-commenting section below, who emphasizes the value of such models and notes that they are very common in developmental biology.

      It is unclear how 'PrE-like' (as stated e.g. in the abstract) the cells really are after a short pulse of Gata4 expression. No proper characterisation has been performed but needs to be included, if the authors want to term these cells PrE-like.

      A recent study by Amadei et al. (PMID: 33378662) supports the notion that a short pulse of GATA4 expression can trigger bona fide PrE-like differentiation. In this study, the authors induced a similar doxycycline-inducible GATA4 expression system for 6 hours, and observed subsequent differentiation into several PrE derivatives, including the anterior visceral endoderm. In a revised version, we will cite this study to support our claim that the GATA6-positive cells are indeed PrE-like. Additionally, we offer to perform immunostainings with an extended panel of known PrE marker proteins to substantiate the PrE-like character of the GATA6-expressing cells.

      How is the statement in l112 that "The clear separation between the two populations suggests that the increase in the proportion of double negative cells at the expense of GATA6+; NANOG- PrE-like cells beyond 40 h is mostly fueled by the downregulation of NANOG expression in the GATA6-negative cell population, combined with a slower proliferation of the GATA6-positive population, rather than by the reversion of PrE-like into double negative cells." supported by the data?

      We realize from the comments of all three reviewers that this section was confusing and potentially misleading in the original version of the manuscript. In a revision, we will reword this paragraph to better bring out the major conclusions from the GATA6 and NANOG expression patterns shown in Fig. S1A. These data show that the majority of cells belong to one of two discrete clusters from 16 h onwards. The clear separation of the two clusters furthermore indicates that cells rarely switch their gene expression patterns. Given these observations, the changes of cell type proportions reported in Figure S1B can be explained as a consequence of slower proliferation of cells in the GATA6-positive relative to the GATA6-negative cluster. In addition, NANOG expression in the GATA6-negative cluster declines over time, such that progressively more cells are classified as double negative.

      Would the data and modelling performed by the authors be in line with a model in which the decision to express Gata6 is a stochastic choice (with a certain probability based on the levels of Gata4 induction) that is then stabilized and reinforced by Fgf signalling rather than Fgf signalling having an instructive role?

      The simulations shown for the Fgf4 mutant case in Fig. 3 D - G, right column, are based on a model in which the decision to express Gata is a stochastic choice with a probability based on the initial levels of GATA expression, and reinforced by FGF signaling. Thus, our data from the Fgf4 mutant, but not the wild type, are perfectly in line with such a model.

      We realize from the Reviewer’s comment that we have not made sufficiently clear the conceptual differences between the models for the mutant and the wild type case. We suspect that this lack of clarity stems from the fact that the two models rely on the same circuitry, except for the regulatory link between GATA and FGF. This link however makes a crucial difference: It transforms the simple single cell input-output model of the mutant case, which is common to many previous publications, into a population level model with cell-cell feedback which shows new emergent behavior. And only this population level model, but not the single cell model for the Fgf4 mutant, can recapitulate the experimental data observed in the wild type. In a revised version of the manuscript we will expand on these crucial differences when describing the model and data in Fig. 3.

      The statement in line 187 "This indicates that GATA4-mCherry expression negatively regulates FGF4 signaling during cell type specification." is not supported by the data. The authors show only a correlation and actually correctly say so in line 195.

      Prompted by the comments of both Reviewer #1 and #3, we will carry out experiments to mechanistically explore the regulation of Fgf4 expression by GATA factors (see our response to Reviewer #1 above for a detailed description). Depending on the outcome of these experiments we will reword this statement.

      In Fig 2F statistical analysis between the re-seeded conditions is required for the conclusion that "the proportion of PrE-like cells systematically increased with cell density". Replating itself appears to quite drastically impact lineage distribution. Do the authors have an explanation for this?

      The p-value in line 221 of the original manuscript refers to a test for a linear trend between the three conditions following a one-way ANOVA in GraphPad Prism. We apologize that this has not been made clear and will add this information in a revised version.

      The observation that replating drastically impacts lineage distribution is perfectly in line with the overall conclusion from this section, namely that FGF signaling is enhanced by cell-cell contacts. Replating strongly reduces the number of direct cell-cell contacts by disrupting the colony structure of the culture. Thus it is expected that the proportion of the PrE-like cells - which require exposure to FGF ligands - is reduced under these conditions compared to the condition that has not been replated. We will discuss this explanation in a revision.

      Fig 2G shows a key experiment illustrating the local effect of Fgf4 expression on first and second neighbours. The authors have investigated this effect using a Fgf-signalling reporter. Why did they not assay Gata6 expression in this assay instead of a Spry reporter? This would be the experiment to show that also Gata6 expressing cells (after transient Gata4 induction) are clustered around Fgf4 producing cells and be a strong piece of evidence to show that local Fgf4 signalling and cell-cell communication is indeed involved in cell identity proportioning. The cell lines required for this experiment (including Fgf4 mutant Gata4 inducible ESCs) appear to be available.

      We decided to measure the FGF4 signaling range with a Spry4:H2B-Venus reporter because its response time is faster than that of GATA6 expression during differentiation. Furthermore, the Spry4:H2B-Venus reporter provides a quantitative readout for FGF4 signaling, in contrast to a binary read-out that would be expected for GATA6 expression. We will be happy to discuss these considerations in a revised manuscript.

      We agree that measuring FGF4 signaling range with Fgf4 mutant Gata4-mCherry inducible cells as suggested by the Reviewer constitutes a complementary approach to further corroborate the role of local FGF4 signaling in cell differentiation. However, we would like to stress that our demonstration of local FGF4 signaling is already supported by two fully orthogonal quantitative experiments, one relying on cell replating and the other one relying on the signalling reporter. The concept of local signaling is further supported by our quantitative analysis of the spatial arrangement of cell types in Fig. 4. The additional experiment suggested by the Reviewer is therefore unlikely to substantially change the paper’s conclusions, as also pointed out by Reviewer #3 in the referees cross-commenting section. Therefore, we offer to perform this experiment for a revision, but would like to seek the editor’s opinion if this is deemed necessary to make the paper acceptable for publication.

      The authors conclude from data in Fig 3A that proper cell type proportioning depends on initial Gata4 levels in Fgf4 mutants, in contrast to WT cells where the initial levels appear more irrelevant. Is 10ng/ml too high a dose? Would using a lower concentration (such as ~2ng/ml suggested by Fig 2D to give WT-like distribution) result in a complete rescue of cell lineage proportioning in this assay? Formally a control of adding additional Fgf4 to WT cells will also be needed to control for a potential effect of exogenous Fgf4 addition.

      In our initial characterization of the Fgf4 mutant cell lines, we have performed experiments where we examined cell type proportions upon culture in the presence of different doses of FGF4 following doxycycline induction times between 1 h and 8 h. These experiments confirm the suspicion of the Reviewer that cell type proportions similar to the wild type can be obtained with a lower dose of 2.5 ng/ml FGF4 after 8 h of induction. For shorter induction times followed by differentiation in the presence of 2.5 ng/ml FGF4 however, cell type proportions were strongly skewed towards Epiblast-like cells. These data thus further support the major conclusion from Fig. 3A quoted by the Reviewer: Proper cell type proportioning in Fgf4 mutants depends on GATA4 levels, and this behavior is independent from the FGF4 concentration applied. We offer to add this data to a revised manuscript.

      The effects of adding FGF4 to wild type cells are shown in Fig. S2C and S3D in the current version of the manuscript. This control has been performed in all experiments shown in Fig. 3A - C, but we decided to omit it for clarity. We are happy to add this information back in as requested by the Reviewer.

      Does the model in Fig 3E consider potentially varying doses of exogenous Fgf4? Can the model also predict what happens if Fgf4 is added to WT cells, as suggested above as control? In general, the value of this model is unclear. Figure 3E is near impossible to understand, no quantitative information is given.

      The model in Fig. 3E can of course be simulated with different doses of exogenous FGF4. These simulations recapitulate the experimental results described under point 10 above: Cell type proportions for the Fgf4 mutant case are skewed towards NANOG-positive cells at lower FGF4 doses, and vary with initial conditions irrespective of FGF4 dose. We offer to show the results of these simulations in a revised manuscript alongside the experimental data discussed above.

      It is also possible to incorporate into the model addition of exogenous FGF4 to the wild type. Simulations of this condition confirm the experimentally observed increase in PrE-like cells shown in Fig. S2C and S3D of the current manuscript.

      To help the reader digest Fig. 3E, we will add separating lines similar to the gates of the flow cytometry data in panel A, and indicate the proportion of cells in the respective quadrants.

      The Reviewer’s comment that the value of the model is unclear indicates to us that we have not explained in sufficient detail the conceptual differences between the behavior of the model of the wild type and the mutant case. As detailed in our response to Reviewer’s comment 6. above, we will rewrite the text to bring out more clearly the insight that the model brings.

      Fig4A: Why were WT and Fgf4 mutant cells treated differently in this assay (8h vs 4h, respectively)?

      The spatial arrangement of cell types in Fgf4 mutant cells has been assayed in two conditions that give similar cell type proportions as seen in the wild type, as motivated in lines 366 - 370 of the current manuscript. We decided to show the condition with 4 h induction followed by differentiation in the presence of 10 ng/ml FGF4 in the main Figure 4 because it is most similar to the condition that gives wild-type like cell type proportions in the Fgf4 mutant shown in the immediately preceding main Figure 3, while the condition that uses 8 h induction followed by differentiation in the presence of 2.5 ng/ml FGF4 refers back to the main Figure 2. We show both primary data and the complete analysis for the latter condition in Figures S8D and S10. Fig. S10 provides a direct comparison between the two conditions and clearly demonstrates that they show similar dynamics. We do not think that exchanging the two datasets between main and supplementary Figures will add value to the manuscript.

      Does the interpretation that at 24h there is a difference in Fig 4C survive statistical scrutiny? Only few datapoints are shown and any apparent differences seem due to outliers rather than a shift in cluster radii. How often were these experiments independently repeated? This information is missing. In Fig 4B, I cannot appreciate any difference between cell lines.

      We will perform statistical testing to assess whether the spatial arrangement of cell types is significantly different between the time points, and mention the results in the text.

      To evaluate the spatial arrangement of cell types, we have performed two independent experiments in the wild type, and analyzed two conditions for the mutant case. In each experiment, we have analyzed at least eight positions per condition and control. Spatial clustering of wild type cells at 40 h is also observed in earlier Figures in the manuscript (e.g. Fig. 1D, S2B, S3C).

      The similarities between wild type and Fgf4 mutant cells shown in Fig. 4B are not surprising and fully in line with the data shown in panel C, which shows that differences between time points are much more pronounced than the differences between genotypes. However, we realize that the micrographs and analysis plots in Fig. 4A and B were perhaps not fully representative of the aggregate behavior shown in panel C. In a revision, we will therefore show data from more representative colonies in panels A and B.

      **Minor points:**

      a) More information on statistics should be given in the Figures and legends.

      To address this concern we will perform statistical tests for differences in proportions of the main cell types in Figures 1D and 3C. In addition, we will perform statistical testing on Fig. 4C as detailed in point 13 above.

      b) Percentages should be indicated in the quadrants of the FACS plots of Fig 3A and E.

      This is a good suggestion; we will add this information. See also our response to point 11 above.

      c) What is the underlying evidence for the statement: "The specification of Epi- and PrE-like cells in ESCs shows both molecular and functional parallels to the patterning of the ICM of the mouse preimplantation embryo."

      In the current manuscript, this statement is further substantiated in the subsequent paragraph (lines 483 - 503). We realize that this order is potentially confusing and will change it. We will further modify this section as part of our response to major point 3. above.

      d) Fig 5C is difficult to interpret without a comprehensive decoding of colour information.

      To facilitate interpretation of this panel, we will add a legend to decode the colour information of the traces (purple: VNPhigh, cyan: VNPlow).

      Reviewer #2 (Significance (Required)):

      This manuscript provides novel insights into the role of Fgf-mediated cell-cell communication to establish proper ratios of cell identities in a PrE-induction system. The authors provide some interesting data and interpretation. Overall, the significance is slightly impaired by the highly artificial nature of the studied cell fate specification event.

      This manuscript will be of interest to readers working on early embryonic cell fate decision as well as researchers working on modelling of cellular processes.

      My expertise lies in the field of cell fate decision and pluripotency.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      It is well established that FGF signalling plays a role in the partitioning of the Primitive Endoderm and Epiblast fates during preimplantation mammalian development. Recent work has shown that this fate decision is associated with a mechanism that is able to maintain the proportions of the two fates stable in the face of perturbations. Here, the authors address this mechanism and show that it is dependent on FGF signalling and associated with the fate decision. In the process they suggest and test a novel mechanism based on short range FGF signalling. A series of carefully designed and executed experiments refine and provide evidence for the model. This is an original and important piece of work that will influence the field of pattern formation.

      Overall the manuscript is well written but, at least from the perspective of this reviewer, there are places in which clarity can be improved.

      Lines 104 and ff: the description of the dynamics of the different populations after the GATA4 pulse can be clarified. The reference to the double negative population emerging from the PrEnd population is not clear. It is stated that the proportion of these cells increased continuously, and this is said to be at the expense of the PrEnd population, whose variation is referred to as 'slightly declined'. How can a slight decline fuel a steady increase in the double negative?

      Also, what are these double negative? Could they be cells differentiating into embryonic lineages?

      We realize from the comments of all three Reviewers on this paragraph that it was confusing and potentially misleading in the original manuscript. In a revised version we will rewrite this section to clarify our interpretation of the data in Fig. S1. First, the clear separation of the two clusters observed in NANOG-GATA6 expression space indicates that cells rarely switch between the two clusters. Then, a likely explanation for the slow decline in the fraction of GATA6-positive cells is a slower proliferation compared to the GATA6-negative cells. Third, the increase in the proportion of double negative cells is caused by a progressive downregulation of NANOG expression in the GATA6-negative cluster. These NANOG expression dynamics are consistent with NANOG expression dynamics in epiblast cells of the embryo, and could indeed indicate differentiation towards embryonic lineages. We will mention this possibility in a revised manuscript.

      See also our response to Reviewer #1 and Reviewer #2, point 5.

      In Figure 1 and its discussion, it would be good to see a representation of the stability of the final proportions relative to the different initial conditions, a variation on 1E.

      This is a good suggestion. In a revised version, we plan to add a panel to Fig. 1 in which we plot the final proportions of the different lineages versus the GATA4-mCherry expression levels for the different induction times. This will illustrate more clearly that the final proportions of cell types are largely independent from the initial conditions.

      Paragraph lines 182 and ff: the report that GATA4 expression is able to suppress FGF4 signalling, autonomously is, at least for this reviewer, a novel and important result and one that impinges on the understanding of the process. The authors should emphasize this.

      We agree with the Reviewer that the direct regulation of Fgf4 expression through GATA factors is a new regulatory link suggested by our data that has not been described before and that is crucial for the functioning of the system. Prompted by a similar comment of Reviewer #1 above, we offer to further explore the mechanistic basis of this link through an analysis of published ChIPseq data, functional studies of a GATA binding site upstream of the Fgf4 start codon, or a more detailed temporal dissection of NANOG, GATA and Fgf4 expression dynamics following doxycycline induction (see our response to Reviewer #1 above for more details). These new experiments and analyses will allow us to emphasize this novel result, and thereby significantly strengthen our paper.

      Paragraph lines 274 and ff (section on the involvement of FGF4 in the robustness of the process) needs some explanations. The derivation of the conclusion that 'recursive communication via FGF4 underlies a population-level phenotype ...characterized by the differentiation of robust proportions of cell types...' from the experiments requires some unwrapping. It would be helpful if the authors could reason how the conclusion follows from the experiments.

      We realize from this Reviewer’s comment and the comments of Reviewer #2 above that we have not explained well enough how the results shown in Fig. 3 A-C (lines 274 - 283) lead to our conclusion of emergent behavior, which is then further substantiated in the modelling in panels D - G. The central conclusion of this paragraph rests on the observation that cell type proportions are dependent on initial conditions in the Fgf4 mutant, but not in wild type cells. As we had supplied FGF4 externally to the Fgf4 mutant cells, the only difference between these two conditions is that FGF4 dose in wild type cells is regulated by the cell population, i.e. cells can communicate via FGF4, whereas mutant cells cannot. We will expand on this line of reasoning, and also explain in more detail the differences in the models for the mutant case and the wild type, which we believe will help to conceptualize the experimental results. See also our response to Reviewer #2, points 6. and 11.

      Their model does not seem to include the commonly agreed regulatory interaction between Nanog and FGF4, at least not directly, and it would be helpful if a reasoning could be provided for this decision.

      A discussion of the regulatory interaction between NANOG and Fgf4 has also been requested by Reviewer #1. In our response to their point above, we provide a reasoning why we have omitted it in the current manuscript. Briefly, our decision not to include a direct positive link between NANOG and Fgf4 expression rests on our observation that Fgf4 mRNA continues to be expressed 2 days after switching cells from 2i + LIF medium to N2B27, a time at which NANOG already starts to be downregulated as a consequence of differentiation along embryonic lineages. We will add this data to a revised manuscript. In addition, we propose above to dissect in more detail the temporal sequence of GATA4-mCherry, Fgf4 and NANOG expression upon doxycycline induction. This analysis will provide further information about the role of NANOG for Fgf4 mRNA expression in ESCs.

      Reviewer #3 (Significance (Required)):

      In this manuscript, Raina and colleagues use an Embryonic Stem (ES) cell based experimental system to address a central problem in developmental biology, namely the emergence of stable scaled populations of different cell fates. The experiments are elegant in design, carefully executed and the effort provides a solution to the problem: a novel mechanism based on short range FGF signalling that provides homeostatic control of relative cell populations. This is an important piece of work with sound conclusions that establishes a new paradigm in pattern formation whose implications are likely to lead to a reassessment of the role of FGF in different patterning paradigms. The experiments are quantitative and supported by a modelling effort based on a theoretical piece of work (Stanoev et al. 2021) which underpins the conclusion.

      This manuscript will appeal to a wide audience including developmental and stem cell biologists as well as modellers.

      My expertise covers the areas addressed in the manuscript.

      **Referees cross-commenting**

      It looks as if, with some nuances, we all agree on the value of the work. I do not have any issues with the comments of Reviewer 1, though I disagree that the model tested and improved here is similar to existing ones. While it is true that this work is related to a theory paper by some of the authors, the experimental test and resulting conclusions are very important. On the other hand, I am very surprised by the comments of Reviewer 2 who, after conceding the value and potential significance of the work, raises a list of queries, largely small details and opinions rather than points of substantial concern, hinting at a need for the authors to perform extra work and analysis that will not change the conclusions of the manuscript. Some of this, e.g. #9, would be a nice piece of additional evidence, but more an adornment than a necessary piece of additional evidence. The main problem of this reviewer is the lack of appreciation of what they define as 'highly artificial nature' of the study without providing any reason for why such experiments (very common in developmental biology) can lead to misleading conclusions. It seems to me that most, if not all, of their significant concerns can be dealt with in a rebuttal or by altering the text, either to discuss the issues raised, to clarify the points or qualify the conclusions.

  5. Apr 2021
    1. SciScore for 10.1101/2021.04.26.21255801

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      NIH rigor criteria are not applicable to paper type.

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text=align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">We operationalize the intensity of this intervention by means of the mobility index produced by Google (xm), specifically the average of the transit and workplaces indexes as previously explained(16).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>Google</div><div>suggested: (Google, RRID:SCR_017097)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      However, we think that these cases are out of the mainstream of the epidemic, and they pose no serious limitation to the model. On the other hand, the proposed model is highly flexible, allowing for both completely asymptomatic and mild symptomatic cases, that are thought to play a significant role in the SARS-CoV2 epidemic. Among the limitations of our research, we have to mention the data quality. Changes in real-time data due to corrections, poor data-quality and slow reporting may affect therefore the assumptions of our model. Under-reporting due to slow data processing, restrictive testing policies and lack of testing availability impacted on the cumulative number of cases acknowledged by official data sources. While we use Russell’s method to correct for this, we are introducing potential limitations of Russell’s method into our model. Also regarding data quality, imported cases also introduce uncertainty in the model, and data is not as granular as it is required to account for that. Another limitation is that in our model, 81% of the infections are asymptomatic. While as mentioned previously, some local data shows that for every PCR-diagnosed case there were 9 IgG SARS CoV-2 positive individuals that had not been diagnosed during the outbreak in a very poor neighborhood in the City of Buenos Aires (30) reaching 50-60% seroprevalence, studies conducted in Europe after local outbreaks show 15 to 20% seroprevalence (31). Further research is required to understand if this...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.

      Results from scite Reference Check: We found no unreliable references.


      <footer>

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

      </footer>

    1. Author Response:

      Reviewer #1:

      Guo et al. describes interesting experiments recording from various sites along a cortico-cerebellar loop involved in limb control. Using neuropixels recordings in motor cortex, pontine nuclei, cerebellar cortex and nuclei, the authors amass a large physiological dataset during a cued reach-to-grasp task in mice. In addition to these data, the authors 'ping' the system with optogenetic activation of pontocerebellar neurons, asking how activity introduced at this node of the loop propagates through the cerebellum to cortex and influences reaching. From these experiments they conclude the following: the cerebellum transforms activity originating in the pontine nuclei, this activity is not sufficient to initiate reaches, and supports the long standing view that the cerebellum 'fine tunes' movement, since reaches are dysmetric in response to pontine stimulation. Overall these data are novel, of high quality, and will be of interest to a variety of neuroscientists. As detailed below however, I think these data could provide much more insight than they currently do. Thus below I provide some suggestions on improving the manuscript.

      1) Since the loop is the focus of this study, it would be nice if the authors better characterized latencies of responsivity to pontine stimulation through the loop, to address how cortically derived information routed to the cerebellum may loop back to influence cortical function. In the data provided, we know that pontine stimulation modulates Purkinje and deep nuclear firing (but latency to responses are not transparently provided in the main text, if anywhere), while motor cortical responses peak at 120 ms (after stimulus onset?, unclear), and that this responsivity is preferentially observed in neurons engaged early in the reaching movement. Is the idea, then, that cortical activity early in the reach is further modulated by cerebellar processing to (Re) influence that same cortical population? Does this interpretation align with the duration of reaches, the duration of early responsive activity during reach, and the latency of responsivity; or is the idea that independent information from other modalities entering the pontine nuclei modulates early cells? Latency to respond at the different nodes, might aid in thinking through what these data mean for the function of the loop.

      We thank the reviewer for this important suggestion. We have now added measurements of the latency from the onset of sinusoidal PN stimulation to neural responses in Purkinje cells, DCN neurons, and motor cortex (Supplemental Fig. 7), and observe a progressive recruitment of laser-evoked spiking along this pathway. There is a tradeoff between temporal resolution (which increases with decreasing bin width) and statistical power (which decreases with decreasing bin width), and we have opted to use 10 ms bins in a sliding window, which provides a reasonable compromise between these criteria. Although we potentially detect fewer tagged neurons at shorter latencies than we would with larger bins, this approach enables us to detect the timing of the earliest responses (defined as the earliest time point at which 5% of the neurons eventually recruited are responsive). Note that the sinusoidal stimulation used in these experiments is not ideal for latency measurements, as it takes 6.25 ms for the laser to reach peak power. We have also added a similar analysis for the response latency of PN neurons to pulse train stimulation of motor cortex (Supplemental Fig. 1). Based on these analyses, our estimate of the delay for signals to propagate across the entire loop is 26 ms: PN to motor cortex (21 ms) + motor cortex to PN (5 ms). Given that the movement duration (lift-to-grab) is approximately 110 ms on average, this would allow ~4 full feedback cycles throughout the reach. Thus, these delays are consistent with the possibility that cortical activity during planning or early in the reach is further modulated by cerebellar processing to influence that same cortical population later in the reach. Regarding the earliest motor cortical responses that we observe in PN-tagged units, it is possible that they result from ponto-cerebellar input driven by other cortical regions. Alternatively, the responses of motor cortical neurons early in the movement may be driven more directly by other cortical areas or the basal ganglia, but these early-responding neurons may also receive strong ponto-cerebellar input due to plasticity during development or learning.
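
      The timing argument in this response can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constants are the values quoted above, all function names are hypothetical, and the quarter-period rise time assumes the sinusoidal stimulus reaches its peak a quarter of a period after onset, which reproduces the quoted 6.25 ms at 40 Hz but is not stated explicitly in the text. The recruitment threshold takes one first-response latency per neuron as input, which is also an assumption about how the data are organised.

```python
import math

# Values quoted in the author response.
STIM_FREQ_HZ = 40.0        # sinusoidal PN stimulation frequency
PN_TO_CORTEX_MS = 21.0     # PN -> motor cortex latency
CORTEX_TO_PN_MS = 5.0      # motor cortex -> PN latency
REACH_DURATION_MS = 110.0  # approximate lift-to-grab duration


def time_to_peak_ms(freq_hz):
    """Rise time of a sinusoid, assuming the peak occurs a quarter period after onset."""
    period_ms = 1000.0 / freq_hz
    return period_ms / 4.0


def loop_delay_ms(pn_to_cortex_ms, cortex_to_pn_ms):
    """Total delay for one pass around the loop."""
    return pn_to_cortex_ms + cortex_to_pn_ms


def full_cycles(duration_ms, loop_ms):
    """Number of complete loop traversals that fit in the movement."""
    return int(duration_ms // loop_ms)


def earliest_response_time(latencies_ms, fraction=0.05):
    """Earliest time by which `fraction` of the eventually recruited neurons respond.

    `latencies_ms` holds one first-response latency per recruited neuron
    (hypothetical input format).
    """
    ordered = sorted(latencies_ms)
    k = max(1, math.ceil(fraction * len(ordered)))
    return ordered[k - 1]


if __name__ == "__main__":
    loop = loop_delay_ms(PN_TO_CORTEX_MS, CORTEX_TO_PN_MS)
    print("time to peak (ms):", time_to_peak_ms(STIM_FREQ_HZ))
    print("loop delay (ms):", loop)
    print("full cycles per reach:", full_cycles(REACH_DURATION_MS, loop))
```

      The 5% threshold is taken over the neurons that are eventually recruited, so it is insensitive to how many neurons never respond at all.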

      2) Many of the figures need work to aid interpretation. Axis labels are often missing (eg 2F); color keys are often unlabeled (2F); color gradients often used but significance thresholds are hard to evaluate (using same colors for z scores and control / laser is confusing 6, 8); and within-figure keys would be useful (5D-h). These issues occur throughout the manuscript.

      We have added the axis and color labels in Fig. 2F, and have added additional annotation throughout the main and supplemental figures. For firing rate z-score heatmaps, we have kept the gray color scale for control and laser to facilitate direct comparison between the panels, but have added orange and blue boxes around the heatmaps in Fig. 6, 7, S8, and S9 to emphasize that they reflect different experimental conditions.

      3) Relatedly, but also conceptually, Figure 3B has particular issues, such as identifying where the neuropixel multiunit activity is coming from. I assume that in the gray boxes illustrating the spatio-temporal profile of spiking band activity that the lower part of the box is the ventral direction, upper, dorsal. This is not spelled out. From the two examples it would seem that the spiking band is in different places in the cerebellum, undermining, I think, the objective of the figure. It would be sensible to revisit this entire figure to identify the key takeaways and design figures around those ideas. As it stands, these examples appear anecdotal. Consider moving this to a supplement. Powerband density strength is missing an axis. More importantly, it would be nice to corroborate the interpretation of the MUA with the single unit recordings, since the idea is that many neurons are entraining to the PN activity. Yet, the examples don't seem particularly entrained. Is the activity being picked up on just axonal firing of the PN axons? Fourier analysis of spiking of isolated neurons in cerebellum should be used to corroborate the idea that cerebellar neurons are entraining, rather than the neuropixel picking up entrained PN axons.

      To examine spike entrainment to the 40 Hz PN stimulation for Purkinje cells and DCN neurons, we computed the phase of sinusoidal stimulation coinciding with each individual spike. If a neuron is entrained to the stimulation, the phase distribution for its spikes will differ from the uniform distribution on the circle; this can be assessed for each cell using a Rayleigh test. Furthermore, we can calculate the strength of entrainment and preferred phase by calculating the magnitude and angle of the mean resultant for each cell. If a neuron’s spikes are completely unrelated to the stimulation phase, the mean resultant length will tend to 0 as the number of spikes observed goes to infinity. If, on the other hand, a neuron is completely entrained (with every spike occurring at exactly the same phase), the mean resultant length will be 1. This approach is illustrated schematically in Supplemental Fig. 6A.
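
      The entrainment measure described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the stimulus is assumed to start at phase 0 at time 0, and the p-value uses a standard closed-form approximation for the Rayleigh test rather than whatever implementation was used in the study.

```python
import numpy as np


def spike_phases(spike_times_s, stim_freq_hz=40.0):
    """Stimulus phase in [0, 2*pi) at each spike (stimulus assumed to start at phase 0)."""
    t = np.asarray(spike_times_s, dtype=float)
    return (2.0 * np.pi * stim_freq_hz * t) % (2.0 * np.pi)


def mean_resultant(phases):
    """Length (entrainment strength) and angle (preferred phase) of the mean resultant."""
    z = np.mean(np.exp(1j * np.asarray(phases, dtype=float)))
    return float(np.abs(z)), float(np.angle(z) % (2.0 * np.pi))


def rayleigh_p(phases):
    """Approximate Rayleigh-test p-value for departure from circular uniformity."""
    n = len(phases)
    r, _ = mean_resultant(phases)
    big_r = n * r  # resultant length before normalising by n
    return float(np.exp(np.sqrt(1 + 4 * n + 4 * (n**2 - big_r**2)) - (1 + 2 * n)))


if __name__ == "__main__":
    # One spike per 25 ms cycle, always 5 ms after cycle onset: fully entrained.
    locked = np.arange(100) / 40.0 + 0.005
    print(mean_resultant(spike_phases(locked)), rayleigh_p(spike_phases(locked)))
```

      A mean resultant length near 1 with a small p-value indicates tight phase locking. A length near 0 with a p-value near 1 is what a neuron unrelated to the stimulation would produce.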

      This new analysis revealed two key features of the data we had not previously appreciated. First, it revealed PN-stimulation-induced changes in neural activity that were not apparent from the mean firing rate profiles: most Purkinje cells and DCN neurons were significantly entrained to the 40 Hz stimulation. Second, the entrainment strength was higher in the DCN than Purkinje cells (Supplemental Fig. 6B-D), suggesting the corticonuclear pathway amplifies the rhythmic input. This result is strikingly similar to published observations obtained from slice electrophysiology and anesthetized mice (Person & Raman, 2012), which we now discuss in the text. It is also possible that direct excitation from PN collaterals contributes to the DCN entrainment.

      We agree that the original analysis of multiunit activity is difficult to interpret, for two reasons: (1) the signal likely reflects the combined contribution of multiple cell types, including pontine mossy fiber terminals, and (2) the depth profile will differ for different electrode penetrations, due to the geometry of the cerebellar cortex. Furthermore, this analysis is largely redundant, since we have recorded from individual Purkinje cells and added new analyses demonstrating their entrainment to the 40 Hz stimulation (Supplemental Fig. 6). We have now moved this figure to the supplement and added labels to all axes (Supplemental Fig. 3).

      4) The use of the GLM is puzzling. In addressing the question of how cerebellum and motor cortex interact (from the Abstract, "how and why" do these regions interact) it is unclear why these regions are treated separately. I would have expected some kind of joint GLM where DCN activity is used to predict M1 variance (5 co-recordings are reported but nothing to analyze?); or where DCN + M1 activity is used to decode kinematics to see if it is better than one or the other alone. As it stands, we learn that there is more kinematic information in the motor cortex than in DCN. This is not necessarily surprising given previous literature on cerebellar contributions to reaching movements. In principle the idea that 'PN stimulation might perturb reaching kinematics through descending projections to the spinal cord, or by altering activity in motor cortex' is treated as mutually exclusive outcomes, though it is highly unlikely to be so. Analyzing M1+DCN together could address whether DCN activity adds nothing to decoding kinematics that isn't there in M1 or adds something that M1 does not have access to. The main point here is that the physiological datasets could be better leveraged with these fits to derive insight into the interactions of the loop. R2 should be provided in the GLMs (Fig 8) to assess statistically how well they perform relative to one another, not just correlations between the two.

      We have added two additional analyses to address these questions. First, in addition to motor cortex-based and DCN-based decoders for all sessions (Fig. 8 and Supp. Fig. 12A-D, G-H; all the R2 values are reported in Supp. Fig. 12C-D, G-H), we now also train a decoder using both motor cortical and DCN multiunit activity in sessions with simultaneous recordings (Supp. Fig. 12E-F, I-J). When we train only on control trials, the decoder performs about equally well with or without the DCN multi-units for control trials (Supplemental Fig. 12E), but performs slightly worse on laser trials in comparison to using only cortical data (Supplemental Fig. 12F). When we train on both control and laser trials, adding DCN multi-units slightly degrades decoding performance on both control and laser trials in 3 out of 5 sessions (Supplemental Fig. 12I-J). Based on this comparison, it does not appear that DCN contributes kinematic information that is not already present in cortex. However, there are several cautionary notes to consider in interpreting these results. (1) This dataset consists of only 5 sessions, in all of which the recording yield in DCN was not as high as in cortex, so it is possible that dimensions of activity unique to DCN may not have been sampled enough in these experiments. (2) Our task involves only a single reaching target (in comparison to, e.g., center-out reaching tasks with eight targets, which are possible in primates), so we cannot assess whether DCN contains direction-specific kinematic information not present in cortex. Thus, in light of these factors, it is difficult to draw strong conclusions from our experiments about differences in kinematic information between motor cortex and DCN. A more rigorous comparison requires carefully controlled experiments with many reaching targets, as in Fortier, Smith, & Kalaska (1993).
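
      The decoder comparison described above amounts to fitting the same model on two feature sets and comparing held-out R2. The sketch below shows the logic on synthetic data. It is an assumption-laden stand-in: ridge regression is used here for simplicity and is not claimed to be the decoder used in the study, the function names are hypothetical, and the simulated "DCN" features are deliberately constructed to be redundant with the "cortex" features.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalised intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w


def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def held_out_r2(X, y, n_train, alpha=1.0):
    """Fit on the first n_train samples, score on the remainder."""
    w, b = fit_ridge(X[:n_train], y[:n_train], alpha)
    return r_squared(y[n_train:], X[n_train:] @ w + b)


# Synthetic example (all values are made up for illustration).
rng = np.random.default_rng(0)
cortex = rng.normal(size=(400, 10))
dcn = cortex[:, :4] + 0.5 * rng.normal(size=(400, 4))  # redundant with cortex
kinematics = cortex @ np.ones(10) + 0.5 * rng.normal(size=400)

r2_cortex = held_out_r2(cortex, kinematics, n_train=300)
r2_joint = held_out_r2(np.hstack([cortex, dcn]), kinematics, n_train=300)
```

      When the added features carry no information beyond what the first set already contains, the joint decoder's held-out R2 stays close to the cortex-only value, which is the pattern the response reports for the simultaneous recordings.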

      Second, we have added an additional analysis to determine how predictive cortical activity is of DCN activity at the single-trial level, and vice versa. We considered several possible statistical approaches to this issue. Computing pairwise correlations of neurons in the cortex and DCN would be one possible method, but the outcome of this analysis would be difficult to interpret, as the sign and timing of firing rate peaks will vary across neurons. Another approach would be to regress principal component scores in one region - or their derivatives, as in Sauerbrei et al., 2020 - on the scores in another region. However, because cortex and DCN are bidirectionally connected, the choice of which region’s scores should be considered as the dependent variables is ambiguous, and this approach will merely “align” activity in one region (as a projection onto regression coefficients) with activity in the other. Ideally, we would like to find simultaneous linear transformations of both cortical and DCN activity that would maximally “align” them with one another, and to compute the correlations of the aligned neural trajectories. This is precisely what canonical correlation analysis (CCA) does, and CCA has been used increasingly in recent years to align population activity from different brain regions or samples - e.g., Lara et al., Nat. Comm. (2018), Perich et al., Neuron (2020), and Gallego et al., Nat. Neuro (2020). We took this approach with our simultaneous recordings of multiunit activity in the motor cortex and DCN, and found that:

      (a) In each of the 5 sessions, CCA found two pairs of canonical variates that were strongly correlated (Supplemental Fig. 11A, first two columns; Supplemental Fig. 11B, correlations in the range 0.58-0.88 for the first two canonical variates), and two pairs of canonical variates that were weakly correlated (Supplemental Fig. 11B, correlations <0.27 for the last two canonical variates)

      (b) The first two canonical variates accounted for half or more of the variance in each region (49%-64% in cortex, 51%-70% in DCN; Supplemental Fig. 11C, left column)

      (c) Between a quarter and a half of the variance in each region was accounted for by canonical variates in the other region (25%-50% of variance in DCN explained by cortex, 26%-47% in cortex explained by DCN; Supplemental Fig. 11C, right column)

      From these results we conclude that, within the constraints of our behavioral task, some but not all of the dominant dimensions of cortical and cerebellar activity are strongly correlated. We also performed additional CCA analyses using only laser trials or only control trials, to assess whether PN perturbation strongly affected the similarity in population activity between the two regions, but found limited differences between the results of the two analyses (Supplemental Fig. 11D).
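      As an illustration of the kind of analysis described above, the following is a minimal sketch of CCA on synthetic data, not the authors' pipeline. The variable names (`cortex`, `dcn`), the number of dimensions, the noise level, and the QR-plus-SVD formulation are all assumptions chosen for the example; NumPy is assumed to be available. Two latent signals are shared between the simulated "regions" and two dimensions in each are independent, so one would expect two strongly correlated pairs of canonical variates and two weakly correlated pairs.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between X (samples x p) and Y (samples x q)."""
    # Center each column so that correlations reflect fluctuations, not means.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations,
    # returned in descending order.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic example (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
n_samples = 2000
shared = rng.standard_normal((n_samples, 2))  # latent signals common to both regions

def simulate_region():
    # Two dimensions carry the shared signal plus private noise;
    # two dimensions are entirely private to the region.
    latent = np.hstack([
        shared + 0.5 * rng.standard_normal((n_samples, 2)),
        rng.standard_normal((n_samples, 2)),
    ])
    # A random linear mixing stands in for the unknown mapping from
    # latent dimensions to recorded channels.
    mixing = rng.standard_normal((4, 4))
    return latent @ mixing

cortex = simulate_region()
dcn = simulate_region()
rho = canonical_correlations(cortex, dcn)
print(rho)
```

      In this toy setting the first two canonical correlations should be high (the population value is 0.8 given unit signal variance and noise variance 0.25) and the last two should be near zero, which mirrors the qualitative pattern reported in (a) without reproducing any of the actual values.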

      Reviewer #2:

      Guo et al examine the cortico-cerebellar loop during skilled forelimb movements in mice. The authors use optogenetic stimulation of the pontine nuclei (PN) and recordings in PN, cerebellar cortex, cerebellar nuclei (DCN), and motor cortex to show that PN output is transformed into a variety of activity patterns at different stages of the cortico-cerebellar loop. Stimulation only slightly alters movement-related activity in these structures and degrades movement accuracy. The authors conclude that the cortico-cerebellar loop fine tunes dexterous movement. The study is technically impressive, employing recordings in 4 brain regions, and recordings during optogenetic manipulations and behavior. The experiments are well done and the analyses are appropriate. The comparison across brain regions is comprehensive. The results that PN perturbation alters skilled movement and the perturbed activity could predict perturbed movement are important. The study adds to a long line of work supporting the view that the cortico-cerebellar pathway is required for fine motor control. I have a few comments on the interpretation and analysis which I believe could be addressed with changes to the text and additional analysis.

      1) The authors conclude that the cortico-cerebellar loop "does not drive movement" but "fine tunes" the movement. While I generally agree with this interpretation, I wonder if the authors could flesh out the concepts of "driving movement execution" vs. "fine-tuning movement" more clearly. Do the authors consider them separate processes? How can they be disentangled? I also feel the data on its own has some limitations that should be considered or discussed. First, the data shows that PN stimulation degrades movement accuracy. However, this does not yet reveal the function of the cerebellar loop in fine motor control. Certain places in the text make stronger assertions (for example, "cortico-cerebellar loop fine-tunes movement parameters") that I feel the data does not support. It is not clear from the data how the loop tunes movement parameters. Second, Fig. 5F shows that stimulating PN blocked movement initiation in some sessions (this is also mentioned in the Methods). Could the authors consider the possibility that stimulating PN at a higher intensity might block movement? This is related to the distinction between "driving" vs. "fine-tuning" movement. At the very least, the authors should discuss these limitations and possibilities.

      In our view, the claim that a brain area drives reaching means that it is necessary for generating the large changes in muscle activity that set the limb in motion towards the target. The claim that a brain area fine-tunes reaching means that it is necessary for generating smaller changes in muscle activity that subtly adjust the limb trajectory and enable precise and accurate behavior. Previous work has demonstrated that motor cortex drives reaching: if it is transiently silenced, the initiation of reaching is robustly blocked (see Guo et al. 2015, Sauerbrei et al. 2020, and Galinanes et al. 2018). In the present manuscript, we show that perturbation of the PN has a very different effect: mice are usually able to initiate reaching, but they are less skillful (the success rate drops), slower (movement duration increases), and less precise (endpoint standard deviation increases). Our interpretation of these results is that while the total output of cortex drives movement (likely through corticospinal and cortico-reticulospinal routes), the cortico-cerebellar loop makes more subtle adjustments to the ongoing movement; that is, it fine-tunes. We have updated the text (in particular, the Abstract, Introduction par. 1, and Discussion par. 1-2) to clarify the distinction between driving and fine-tuning.

      We agree that several interpretive statements in the previous version (especially concluding sentences at the end of some Results paragraphs) were not clearly connected with the data, and we have removed or modified these statements. We now lay out our interpretation of the data as evidence for a cortico-cerebellar contribution to fine-tuning, rather than driving, in the first two paragraphs of the Discussion, but emphasize that this is an interpretation, rather than a direct description of the data. We have also changed the title to more directly state our experimental observations.

      We now mention the possibility that stronger stimulation or inactivation of PN neurons might have robustly blocked movement, and also mention several experimental variables which might have contributed to animal-to-animal variability in behavioral effects: “It is possible that the variability of behavioral effects ...” (Discussion).

      2) Related to point 1, in Fig. 5F, for stimulation trials in which mice failed to initiate movement, did mice fail to move altogether, or did they move in an abnormal fashion?

      We have added a new video documenting the behavior of the animal with the largest blocking effect from PN stimulation (supplemental video 2). This animal does not struggle through a partial reach, but fails to initiate movement. Small movements of the arm occurred (this also occurred in control trials), but these were not tightly synchronized with the onset of the laser across trials.

      3) In the abstract, the authors state that PN stimulation is "reduced to transient excitation in motor cortex". Also in the results (page 5) and discussion (page 8), "pontine stimulation only led to increases in cortical firing rates". These statements are based on the comparison between Fig 3D, 3F, and 4B. But I think the current presentation is somewhat misleading. First, Fig 3D, 3F, and 4B use different neuron selections that make direct comparison difficult. Fig 3 shows all neurons from Purkinje cell and DCN recordings. Fig 4B shows only PN-tagged motor cortex neurons. Furthermore, based on the methods description, it appears that PN-tagged neurons were defined using a one-sided sign-rank test. Since the test is one-tailed, does that mean neurons shown in Fig 4B are, by definition, neurons significantly excited by photostimulation? Looking at Fig 4B and 4C closely, there appear to be neurons suppressed by PN stimulation. Could the authors organize the rows in Fig 4 in the same way as Fig 3, where neurons that show suppression are grouped together?

      We now display the PN stimulation-aligned firing rates in the same format for Purkinje cells (Fig. 3B), DCN neurons (Fig. 3D), and motor cortical cells (Fig. 4A, lower), with all neurons in a single panel, sorted by response magnitude, for each area. The dominant response pattern in the cortical population is a transient firing rate increase, and this is more readily apparent with the new panel in Fig. 4A (lower). We also use a two-tailed test (which has slightly less statistical power, but allows us to test for both firing rate increases and decreases) for the identification of PN-tagged cortical neurons, and display neurons with stimulation-locked increases (n = 94) and decreases (n = 13) separately (Fig. 4B). In Fig. 4B-C, we still sort the neurons by their reach-related responses, as this reveals a difference in lift-aligned patterns between tagged and non-tagged neurons, which would be masked if we ordered according to stimulation-aligned responses. In Fig. 4D-E, we pool neurons with PN-stimulation-aligned increases and decreases into a “PN-tagged” group, as the small number of stimulation-aligned decreasing neurons (n = 13) does not allow adequate statistical power for a 3x3 contingency table test or for within-group averaging of lift-aligned firing rates.

      4) Fig 7 shows that PN stimulation has only subtle effects on movement-related activity in motor cortex. However, only a small portion (1/8) of the motor cortex neurons show modulation to PN stimulation. Fig 7 shows all neurons. Would the results look similar for PN-tagged neurons?

      We have added a new analysis to address this question, shown in Supplemental Fig. 10. The laser - control difference in lift-aligned activity is indeed larger for PN-tagged neurons; however, the largest peak in this difference occurs before lift, when the laser has been turned on, but the animal hasn’t started to move (Supplemental Fig. 10C).

      5) Page 3 "Our observation that the activity of some motor cortex-recipient PN neurons is aligned both to the cue and movement suggests that these neurons might integrate signals of multiple modalities." Presumably, motor cortex neurons also have cue and movement-related activity and PN simply inherits this activity from the motor cortex.

      As described in our response to the first reviewer’s seventh comment, we cannot conclude that the cue-related responses in the PN are inherited entirely from motor cortex. Briefly, (1) it has been difficult for us to reliably dissociate cue and movement responses for individual motor cortical cells (for instance, the GLM approach we took with PN neurons resulted in very poor model fits when applied to cortical cells), though our previous work has suggested that at the population level, the dominant signal in motor cortex is aligned to movement onset. To reliably disentangle cue and movement responses in cortex, we would need to train mice to wait for a relatively long and variable delay period before reaching. (2) The PN receive convergent input from many cortical areas, and there is likely a convergence of multiple inputs onto the motor-cortex-tagged PN units (cf. the convergence of inputs from visual and somatosensory cortex onto individual PN neurons in rats reported in Potter, Ruegg, & Wiesendanger, 1978). Hence it is possible (if not likely) that the multi-modal activity we observe in PN neurons results from the integration of inputs from different cortical areas, rather than being entirely inherited from motor cortex.

      6) Do Purkinje cells follow the 40 Hz PN stimulation as in the multi-unit recordings? The PSTHs in Fig 3 are too smoothed out to see this.

      As described in the response to reviewer 1.3 above, we have added a new analysis to the manuscript to address this question (Supplemental Fig. 6). Most Purkinje cells and DCN neurons are entrained to the 40 Hz stimulation, and the entrainment is much stronger in the DCN, consistent with previous work (Person & Raman, 2012).

      7) For the correlation analysis in Fig 6C top and 7C top, is the correlation computed from z-scored firing rates rather than on raw firing rates? This is not clear from the text. If computed on raw firing rates, one would expect the correlation to be above 0 even before photostimulation, since different neurons exhibit different baseline firing rates that presumably will be the same across control and stim trials.

      The correlations were indeed computed on z-scores, rather than raw firing rates, for this reason. We have clarified this in the Methods section. This analysis was designed to capture correlations in movement-related modulation between control and laser trials, and we z-scored the firing rates to avoid the confound that would have been introduced by baseline differences.
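      The confound discussed here can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' analysis code: the neuron count, baseline range, noise level, and the choice to z-score each neuron against its own mean and standard deviation within each condition are all hypothetical. Each simulated neuron has its own baseline firing rate, shared between control and laser trials, but no shared modulation.

```python
import random
from statistics import mean, pstdev

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def zscore(trace):
    """Z-score one neuron's firing rate trace against its own mean and SD."""
    m, s = mean(trace), pstdev(trace)
    return [(x - m) / s for x in trace]

random.seed(1)
n_neurons, n_bins = 200, 50

# Hypothetical baselines (spikes/s) that differ widely across neurons.
baselines = [random.uniform(1, 50) for _ in range(n_neurons)]

# Control and laser traces share each neuron's baseline but nothing else.
control = [[b + random.gauss(0, 1) for _ in range(n_bins)] for b in baselines]
laser = [[b + random.gauss(0, 1) for _ in range(n_bins)] for b in baselines]

t = 10  # an arbitrary time bin at which to compare population vectors
raw_r = pearson([c[t] for c in control], [l[t] for l in laser])

z_control = [zscore(c) for c in control]
z_laser = [zscore(l) for l in laser]
z_r = pearson([c[t] for c in z_control], [l[t] for l in z_laser])

print(f"raw firing rates: r = {raw_r:.3f}")
print(f"z-scored rates:   r = {z_r:.3f}")
```

      With raw rates the across-neuron correlation between conditions is close to 1 purely because of the shared baselines, whereas after z-scoring it falls to near zero, so any remaining correlation in real data would reflect shared modulation rather than baseline differences.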

      Reviewer #3:

      It is generally thought that the cerebellum is primarily involved in the short-timescale control of movements, while motor cortex is involved in motor planning. The present paper follows classic studies in primates and a recent study in mouse that investigated the role of cortico-cerebellar loops in motor control. To date, studies in both species applied perturbations to the cerebellum to then study changes in cortical activity. For example, it has been long known that cooling deep cerebellar nucleus produces changes in the responses of motor cortex neurons in primate (e.g., Meyer-Lohmann et al., 1975). Further, Gao and colleagues' recent paper (Nature 2018) used optogenetics to perturb responses in the deep cerebellar nucleus before licking movements. The authors of this 2018 Nature paper conclude that persistent neural dynamics are maintained during voluntary movements by connectivity within this cortico-cerebellar loop.

      The experiments are well performed, and the results are logically organized and presented. However, a main concern is that the authors have not well justified that these experiments prove a conceptual advance. The conclusions appear to be largely consistent with those of prior work, both regarding changes in the responses of motor cortex neurons, and resultant (subtle) changes in behavior (i.e., altered arm kinematics). The impact of the paper would be improved if the authors adapted a more precise style of reporting the novelty of their results throughout.

      Major concerns:

      1) The experiments are well performed, and the results are logically organized and presented. However, a main concern is that the authors have not well justified that these experiments prove a conceptual advance. As noted above, prior studies have probed the role of cortico-cerebellar loops by applying perturbations to cerebellar activity (cerebellar cortex and/or deep cerebellar nuclei) and quantifying changes in cortical activity prior to and during movement. The main novelty of the present study is that the authors perturbed the loop at a different locus, namely in the pontine nuclei (PN) which send projections to the cerebellum rather than directly to the cerebellum. The rationale for why this specific perturbation provides a conceptual advance to the field was not adequately motivated.

      The authors do clearly review prior literature showing that perturbation of cortico-cerebellar projections impacts the rest of the loop and behavior, and they also explain well the application of their exciting new tool to specifically target PN neurons with their optogenetic stimulation. Yet, the authors do not motivate why it is important to specifically perturb the pontine nuclei (PN) to gain new insights into the role of "cortico-cerebellar loops" nor do they provide any reason to expect a difference in changes in loop dynamics for perturbations applied to the PN versus to the DCN. Indeed, the conclusions appear to be largely consistent with those of prior work, both regarding changes in the responses of motor cortex neurons, and resultant (subtle) changes in behavior (i.e., altered arm kinematics). Generally, these results are similar to those previously reported in primate DCN cooling experiments characterizing changes in hand movement in a voluntary tracking task (e.g., Brooks et al., 1973; Conrad and Brooks 1974).

      We agree that the rationale and conceptual advance require clarification. Previous work has established that silencing motor cortex blocks reaching (Guo et al. 2015, Sauerbrei et al. 2020, Galinanes et al. 2018), but the perturbations used in these studies were not selective to specific output channels (e.g., corticospinal, corticoreticulospinal, or corticocerebellar), and simultaneously influenced many projection targets of motor cortex. Other work from the Brooks, Prut, Person, and Svoboda groups has shown that altering cerebellar output impairs movement planning or execution, but their methodology did not test the effects of disrupting specific cerebellar inputs (e.g., from cortex). Thus, we would argue that previous studies have not provided direct evidence of the behavioral and neural effects of disrupting cortico-cerebellar signals. The central goal of the present manuscript is to test how selective impairment of cortico-cerebellar communication - not the simultaneous impairment of corticospinal, corticoreticulospinal, and cortico-cerebellar communication, and not a nonselective disruption of cerebellar output - disrupts behavior and neural dynamics across the cortico-cerebellar loop. Our conceptual advance, then, is to show that impairment of cortico-cerebellar communication does not typically block movement execution (as simultaneous perturbation of all motor cortical outputs does), but disrupts the fine kinematic details, similar to a direct manipulation downstream in the cerebellum. We have updated the text, particularly the Abstract, Introduction par. 1, and Discussion par. 1-2, to clarify this rationale and conclusion.

      2) The description of the connectivity of the loop illustrated in Figure 1 is straightforward. Motor cortex recipient PN neurons project to PN neurons, which then project directly to the cerebellar cortex and deep cerebellar nuclei, etc. Thus, the effect of any perturbation to PN neurons should be realized rapidly within neurons in the cerebellar cortex and deep cerebellar nuclei if they are part of this direct loop. However, onset latencies for the effect of the perturbations are not documented for these experiments (Figs 3&6 in the test/reaching conditions, and associated text). Similarly, latencies are not reported for the onset of changes in motor cortex neuron responses to PN perturbations in either condition (Figs 4&7 in the test/reaching conditions, and associated text). The only reference I could find to latencies specified the time required to reach the peak firing rate - not the latency of the change. Specifically: "these were stereotypical, mostly consisting of transient excitation (Fig. 4B, left; median time of firing rate peak 120 ms)" - 120 ms seems very long for the loop in Fig 1. It would be useful to know the latency between optogenetic stimulation in PN and changes in PN firing rate. And then the question is at what latency are the neurons in subsequent nodes altered? Quantification of latencies of the effects that are observed in the different nodes of the cortico-cerebellar loops would strengthen the authors' conclusion that they are actually studying the direct loop in Figure 1, which would then make the study's conclusions more compelling.

      We agree that it is important to characterize the latencies of neural responses to PN stimulation, and now provide these numbers for Purkinje cells, DCN neurons, and motor cortical neurons in the text and Supplemental Fig. 7. On stimulation of the PN, activity propagates first to Purkinje cells, then the DCN, and finally to motor cortex. We also quantify the latency of PN responses to motor cortical stimulation in Supplemental Fig. 1. (For a discussion of the rationale and limitations of our method, see also our response above to reviewer 1’s first comment.) Unfortunately, we have not been able to measure the delay from stimulation onset to the earliest spikes induced by ChR2 currents in PN neurons, as this would require simultaneous insertion of a stimulation fiber and recording probe to a deep target in the PN. Furthermore, we note that the earliest measurable response in Purkinje cells occurs 10 ms after stimulation onset, and this is likely an overestimate of the minimum latency, as it takes 6.25 ms for the laser to reach peak power under sinusoidal stimulation.
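      The 6.25 ms figure follows from the stimulation frequency: a 40 Hz sinusoid has a 25 ms period and, if it starts at zero phase, first reaches its maximum a quarter of a period later. The short sketch below checks this arithmetic; the zero-phase start and the `sin` waveform are assumptions inferred from the stated number, not details confirmed in the text.

```python
import math

FREQ_HZ = 40.0                     # stimulation frequency stated in the response
period_ms = 1000.0 / FREQ_HZ       # duration of one full cycle, in ms
time_to_peak_ms = period_ms / 4.0  # a zero-phase sine peaks after a quarter cycle

# Numerical check: sample one cycle at 0.01 ms resolution and locate the maximum.
step_ms = 0.01
times_ms = [i * step_ms for i in range(int(period_ms / step_ms) + 1)]
numeric_peak_ms = max(
    times_ms,
    key=lambda t: math.sin(2.0 * math.pi * FREQ_HZ * t / 1000.0),
)

print(f"period: {period_ms} ms, analytic time to peak: {time_to_peak_ms} ms")
print(f"numerically located peak: {numeric_peak_ms:.2f} ms")
```

      Under a different waveform convention (for example, a raised cosine starting at its minimum), the first peak would instead fall at half a period, so the quarter-period value depends on the assumed phase.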

      3) Overall, there was often a sharp incongruity between the complexity of many of the findings described in results and accompanying figures and the short summary conclusion provided for the Results. Here is one of many examples (bottom of page 5), where the authors conclude "These results demonstrate that the cortico-cerebellar loop does not drive reaching, but fine-tunes the behavior to enable precise and accurate movement." Yet, what the results above describe is considerable heterogeneity and variability across animals and cases. These conclusions should be more aligned with, and justified by, the authors' description of their actual results.

      Throughout the Results section, we have now tied the interpretations more closely to the data. For example, in the instance the reviewer mentions, we now state: “These results demonstrate that PN stimulation impairs reaching performance, typically by disrupting precision, accuracy, duration or success rate of the movement.” In the first two paragraphs of the Discussion, we lay out our interpretation of the data as evidence that the cortico-cerebellar loop contributes to fine-tuning the movement, rather than driving it, but emphasize that this is an interpretation rather than a description of experimental results. Furthermore, we now address possible factors that could underlie the diversity of behavioral effects in the fourth paragraph of the Discussion (“It is possible that the variability of behavioral effects ...”).

      4) A related issue is the disconnection between description and summary in the description of Figures 6-8. The emphasis is on correlation, yet the authors' main point here seems to be that changes in the activity in cortex and DCN induced by the PN stimulation during movement explain the changes in hand trajectory. For example, Figure 6D and its implications are not effectively described in the text.

      The main conclusion of figures 6 and 7 is that PN stimulation during movement alters movement-aligned cortical and DCN activity, but this modulation is typically subtle; that is, activity on control and laser trials is highly correlated for most neurons and time points. This is in contrast with more dramatic effects observed for perturbations delivered to other nodes in the loop; for instance, thalamic perturbations can robustly prevent the generation of the cortical pattern that drives movement (Sauerbrei et al. 2020). Supplemental Fig. 8D-E and Supplemental Fig. 9D-E suggest that these subtle stimulation-induced changes during movement are largely consistent with the changes that would be expected based on neural responses to laser alone, outside engagement with the task. Finally, the decoding analysis in Fig. 8 allows us to interpret these subtle neural changes: they do not appear to be random, but are consistent with the effects of stimulation on the hand. That is, the difference in hand velocity between laser and control trials decoded from neural activity is correlated with the observed hand velocity difference. We have added a video (supplemental video 3) to better visualize this result in all three spatial dimensions simultaneously, and have edited the text in the Results section to clarify these findings.

      5) Finally, the authors conclude that changes in the activity in cortex and DCN induced by the PN stimulation during movement explain the subtle deviations in hand trajectory and conclude that the cortico-cerebellar loop is responsible for fine-tuning movement parameters (bottom of page 5 and top of page 8). However, i) the statement that this pathway fine-tunes motion is not justified by the analysis, and ii) the novelty is not made clear relative to prior work that has investigated the cortico-cerebellar loop (beyond the experimental difference in perturbation site).

      Regarding (i), we agree that the fine-tuning is an interpretation rather than a direct reflection of the data presented in the paragraph, and have altered the statement accordingly: “Overall, these results show that the subtle changes in the activity in cortex and DCN induced by the PN stimulation during movement are consistent with the changes in hand trajectory for individual mice.” We now explain our interpretation of the data as supporting a fine-tuning role in the Discussion, rather than the Results. Regarding (ii), we have now clarified in the Abstract, Introduction, and Discussion that perturbation of the PN enables us to test the effects of a selective disruption of cortico-cerebellar communication, in contrast with direct manipulations of motor cortex or cerebellum (see also our response to comment 3.1 above).

      Overall, the text that follows in the discussion presented the findings in a far clearer and more compelling way than much of the text in the Abstract, Introduction and Results: "perturbing cortico-cerebellar communication did not block movement execution: animals were typically able to generate the basic motor pattern during optogenetic stimulation of the PN, and neural activity in cortex and cerebellum largely recapitulated the firing patterns observed during normal movement. Instead, PN perturbation altered arm kinematics, decreasing the precision and accuracy of the reach, and perturbation-induced shifts in neural activity explained these behavioral effects." The paper would be improved if the authors adopted this more precise style of reporting throughout.

      We have edited the main text throughout to improve clarity and precision.

    2. Reviewer #1 (Public Review):

      Guo et al. describe interesting experiments recording from various sites along a cortico-cerebellar loop involved in limb control. Using Neuropixels recordings in motor cortex, pontine nuclei, cerebellar cortex and nuclei, the authors amass a large physiological dataset during a cued reach-to-grasp task in mice. In addition to these data, the authors 'ping' the system with optogenetic activation of pontocerebellar neurons, asking how activity introduced at this node of the loop propagates through the cerebellum to cortex and influences reaching. From these experiments they conclude the following: the cerebellum transforms activity originating in the pontine nuclei, this activity is not sufficient to initiate reaches, and the data support the long-standing view that the cerebellum 'fine tunes' movement, since reaches are dysmetric in response to pontine stimulation. Overall these data are novel, of high quality, and will be of interest to a variety of neuroscientists. As detailed below, however, I think these data could provide much more insight than they currently do. Thus below I provide some suggestions on improving the manuscript.

      1) Since the loop is the focus of this study, it would be nice if the authors better characterized latencies of responsivity to pontine stimulation through the loop, to address how cortically derived information routed to the cerebellum may loop back to influence cortical function. In the data provided, we know that pontine stimulation modulates Purkinje and deep nuclear firing (but latencies to response are not transparently provided in the main text, if anywhere), while motor cortical responses peak at 120 ms (after stimulus onset? unclear), and that this responsivity is preferentially observed in neurons engaged early in the reaching movement. Is the idea, then, that cortical activity early in the reach is further modulated by cerebellar processing to (re)influence that same cortical population? Does this interpretation align with the duration of reaches, the duration of early responsive activity during reach, and the latency of responsivity; or is the idea that independent information from other modalities entering the pontine nuclei modulates early cells? Latency to respond at the different nodes might aid in thinking through what these data mean for the function of the loop.

      2) Many of the figures need work to aid interpretation. Axis labels are often missing (e.g., 2F); color keys are often unlabeled (2F); color gradients are often used but significance thresholds are hard to evaluate (using the same colors for z scores and control / laser is confusing: 6, 8); and within-figure keys would be useful (5D-H). These issues occur throughout the manuscript.

      3) Relatedly, but also conceptually, Figure 3B has particular issues, such as identifying where the neuropixel multiunit activity is coming from. I assume that in the gray boxes illustrating the spatio-temporal profile of spiking band activity that the lower part of the box is the ventral direction, upper, dorsal. This is not spelled out. From the two examples it would seem that the spiking band is in different places in the cerebellum, undermining, I think, the objective of the figure. It would be sensible to revisit this entire figure to identify the key takeaways and design figures around those ideas. As it stands, these examples appear anecdotal. Consider moving this to a supplement. Powerband density strength is missing an axis. More importantly, it would be nice to corroborate the interpretation of the MUA with the single unit recordings, since the idea is that many neurons are entraining to the PN activity. Yet, the examples don't seem particularly entrained. Is the activity being picked up on just axonal firing of the PN axons? Fourier analysis of spiking of isolated neurons in cerebellum should be used to corroborate the idea that cerebellar neurons are entraining, rather than the neuropixel picking up entrained PN axons.

      4) The use of the GLM is puzzling. In addressing the question of how cerebellum and motor cortex interact (from the Abstract, "how and why" do these regions interact) it is unclear why these regions are treated separately. I would have expected some kind of joint GLM where DCN activity is used to predict M1 variance (5 co-recordings are reported but nothing to analyze?); or where DCN + M1 activity is used to decode kinematics to see if it is better than one or the other alone. As it stands, we learn that there is more kinematic information in the motor cortex than in DCN. This is not necessarily surprising given previous literature on cerebellar contributions to reaching movements. In principle, the idea that 'PN stimulation might perturb reaching kinematics through descending projections to the spinal cord, or by altering activity in motor cortex' treats these as mutually exclusive outcomes, though they are highly unlikely to be so. Analyzing M1+DCN together could address whether DCN activity adds nothing to decoding kinematics that isn't there in M1 or adds something that M1 does not have access to. The main point here is that the physiological datasets could be better leveraged with these fits to derive insight into the interactions of the loop. R2 should be provided in the GLMs (Fig 8) to assess statistically how well they perform relative to one another, not just correlations between the two.

    1. Nonetheless, as we noted, (1) entails (3) but not (3'). That (3') is false says nothing about (1). But if (3) is false, (1) is also false. Put differently (3) is entailed by orthodox theism, while (3') is certainly not. Thus while use of (3) in showing that (1) and (2) are logically compatible is perfectly legitimate, the theist is committed to (3) in a stronger sense than that in which (3) is one of various propositions he may adopt for legitimate logical manoeuvres, and I think this is worth emphasizing.

      (3) is consistent and entailed by (1), (2).

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Review of "Co-chaperone involvement in knob biogenesis implicates host-derived chaperones in malaria virulence." by Diehl et al for Review Commons.


      **Major Comments.**

      1. In this paper the function of the Plasmodium falciparum exported protein PFA66 is investigated by replacing its functionally important dnaJ region with GFP. These modified parasites grew fine but produced elongated knob-like structures, called mentulae, at the surface of the parasite-infected RBCs. Knobs are elevated platforms formed by exported parasite proteins at the surface of the infected RBC that are used to display PfEMP1 cytoadherence proteins which help the parasites avoid host immunity. The mentulae still display some PfEMP1 and contain exported proteins such as KAHRP but can no longer facilitate cytoadherence. Complementation of the truncated PFA66 with full length protein restored normal knob morphology; however, complementation with a non-functional HPD to QPD mutant did not restore normal morphology, implying that interaction of PFA66 with an HSP70, possibly of host origin, is important for function. While a circumstantial case is made for PFA66 interacting with human HSP70 rather than parasite HSP70-x, is there any direct evidence for this, e.g. protein binding evidence? I feel that without some additional evidence for a direct interaction between PFA66 and human HSP70 then the paper's title is a little misleading.

        We thank the reviewer for their kind words. They are correct that we do not show direct evidence of such an interaction, but would like to note that we, and others, despite concerted efforts to produce direct evidence, have always been hindered by the nature of the experimental system. As noted also in our reply to Reviewer 3, the inability to genetically modify the host cell leads us to suggest that indirect evidence is the best that can conceivably be provided at this time. Our evidence, although indirect, is the first experimental evidence for the importance of such an interaction, all other suggestions having been based on “guilt by association” i.e. protein localisation or co-IP analyses.

      Was CSA binding restored upon complementation of ∆PFA with the full-size copy of PFA66?

      As this project grew organically and was driven by the results already obtained, we decided to use knob morphology via SEM as a “proof-of-principle” to show that we could reverse the phenotype. Thus, while we cannot comment on whether ALL functions of PFA66 are complemented, we suspect that if the knobs revert to their WT morphology, this is likely to be true for the other tested phenotypes. We do not feel that revisiting all of our assays (which would basically entail repeating almost every experiment so far carried out) would really be much more informative. We have added a note in the discussion stating “We wish to note that we cannot unequivocally state that our complementation construct allows reversion of all the aberrant phenotypes herein investigated, however we feel it likely that all abnormal phenotypes are linked and thus our “proof-of-principle” investigation of knob/eKnob phenotypes is likely to be reflected in other facets of host cell modification and can thus be seen as a proxy for such.”.

      **Minor Comments**

      Line 36, NPP should be NPPs if referring to the plural.


      Changed


      Line 37, MC should be MCs if referring to the plural. By the way this acronym is never used in the text, it's always written 'Maurer's clefts'.

      Changed

      Abstract, Line 52-53, could be changed to "uncover a new KAHRP-independent..." as it currently implies (albeit weakly) that this is the first observation of a KAHRP-independent mechanism for correct knob biogenesis. Maier et al 2008 have previously shown that knock out of PF3D7_1039100 (J-domain exported protein) greatly reduced knob size and knock out of PHISTb protein PF3D7_0424600 resulted in knobless parasites.

      Correct. In line with the suggestions of another reviewer, this section has been changed.

      In the Abstract it is mentioned that "Our observations open up exciting new avenues for the development of new anti-malarials." This is never really expanded upon in the rest of the paper and so seems like a bit of a throwaway line and could be left out.

      Good point, changed

      Line 59, WHO world malaria report should be cited here since these numbers are from the report not a paper from 2002.

      Done

      Line 67, Marti et al 2004 should be cited here as its published at the same time as Hiller et al 2004.

      Our mistake. Done

      Line 76, I suggest using either 'erythrocyte' or 'red blood cell' throughout the text not both.

      We now use erythrocyte throughout

      Line 80, Maier et al 2008 should be referenced here.

      Done

      Line 87, the authors should cite Birnbaum et al 2017 for the technique used. This is cited immediately after (line 98) in the results section but could be addressed at both points in the text.

      Done

      Line 123, IFAs and live cell imaging failed to detect the PFA-GFP protein and the author proposes this is due to low expression levels. However, PFA66 is expressed at ~350 FPKM in the ring stage and previous studies from your own group have visualised it using GFP before. Is there another explanation for this such as disruption of the locus here has served to greatly reduce the expression level of the fusion protein?

      The truncated protein is now distributed throughout the whole erythrocyte cytosol, not concentrated into J-dots, likely making detection difficult. We wish to note that our original GFP tagged PFA66 lines (Külzer et al, 2010) did not really show a strong signal in comparison to other lines we are used to analysing. We further believe that the sub-cellular fractionation (Figure S1) demonstrates the erythrocyte cytosolic localization of the truncated PFA66. We have no evidence that truncation causes lower expression, but any future revision will include a comparison of expression levels of endogenously GFP tagged dPFA and PFA66.

      Line 147, for consistency it would be best to introduce infected red blood cell (iRBC) at the beginning of the main text and use throughout the text instead of switching between 'infected human erythrocyte' and iRBC.

      We agree, and have changed accordingly

      Line 153, Fig S2A does not exist.

      We apologise, this has been changed

      Lines 156-158: Different knob morphologies are described with repeated reference to Fig2 and FigS2. Since multiple whole-cell SEM images are displayed in these figures it would be worth adding lettering and/or zoomed-in regions of interest highlighting examples of each aberrant knob type.


      This has now been added to Figure S2.

      Line 178-179, "Although not highly abundant in either sample, the morphology of Maurer's clefts appeared comparable in both samples (data not shown)." Why is the data not shown? Representative images of Maurer's clefts from each line should be included in the supplementary figures or this in-text statement should be more clearly justified.

      Figure S3 has been adjusted to also show Maurer's clefts in more detail. An Excel table of data can be provided if necessary.

      Line 196, indirect immunofluorescence assay (IFA).


      Changed

      Line 201, how was the 'non-significant difference' measured? PHISTc looks quite different by eye. Rephrase the term "significant difference" as localisation of these exported proteins was compared visually rather than quantified. Otherwise, a measure of mean fluorescence intensity could be taken for each protein as a basic comparison between the two lines. In the Figure legend of S4, the term "no drastic difference", is used suggesting this was not quantified. By the way, PHISTc appears different by the represented figure.

      We apologise for our use of a specific term for non-statistically verified observations. The PHISTc image the reviewer comments on, was presented incorrectly (too much brightness introduced during processing) and is now correct. We mean to say that we could not (in a blinded check), tell the difference between WT and KO IFA images. Only KAHRP (in our opinion) demonstrated a different fluorescence pattern. As KAHRP has previously been implicated in knob formation, we then analysed this phenotype in more detail. A detailed analysis of the fluorescence pattern in the other IFAs does, in our eyes, not add to the story or add any real value to our observations.

      Line 213, you now have 3 versions for the word wild type, 'wild type', 'wild-type' and 'WT', best to choose one for consistency.

      Changed

      Line 232, 'tubelike' to 'tube-like'.

      Changed

      Line 279, just use 'IFA', the acronym has already been explained earlier in the text.

      Changed

      Line 319, 'permeation' should be 'permeability'.

      Changed

      Line 353, 'The action of host actin is known' to 'Host actin is known'.

      Changed

      Line 373, 'through their role as regulators'.

      Changed

      Line 402, either use 'HSP70-x' or 'HSP70-X' throughout the text.

      Changed

      Line 540, the speed used to pellet the samples for sorbitol lysis assay, 1600g is quite high and could reflect RBC fragility rather than direct sorbitol induced lysis. The parasitemia is also very low, and previous published methods have used ~90% parasitemia rather than the 2% used here. We are not saying the method is wrong but please check it is accurate.

      We used the method of our former colleague Stefan Baumeister (University of Marburg), who is an expert in analysis of NPP, thus we are sure the method is correct. We are in fact tempted to remove the NPP data as they distract from the main narrative of the manuscript, this being the reason we include them only as supplementary data.

      Line 479, 10µm should be 10 µM.

      Changed

      In Fig 1A, the primers A, B, C etc are not explained anywhere that I can see.

      This information has now been included in the 1A Figure legend and table 2A.

      Figure 1B, I do not see any clear band for the 3' integration indicated with the *. Can a better image be shown?

      We apologise. Integration PCRs are notoriously challenging. Any revised manuscript will include better quality images.

      It seems from Fig 3G,H,I that the KAHRP puncta are bigger in ∆PFA but are as abundant as CS2. Given that KAHRP is associated with knobs how do you reconcile this with there being fewer knobs per unit area in ∆PFA compared to CS2 as in Fig 2B? The numbers of knobs/KAHRP spots/objects per µm² seem to vary between Fig 2 and 3. Please provide some commentary about this.

      We are not sure if all KAHRP spots actually label eKnobs, and it is possible that there are KAHRP “foci” that are not associated with eKnobs. We also wish to note that the data in figure 2 and 3 were produced using very different techniques. Sample preparation may lead to membrane shrinkage or stretching, and the different microscopy techniques have very different levels of resolution. For this reason we do not believe that the data from these very different independent experiments can be compared, however a comparison within a data set is possible and good practice.

      In the bottom panels of Fig 4, KAHRP::mCherry appears to extend beyond the glycocalyx beyond the cell. Is this an artifact?

      We checked assembly of the figure and are sure that this was not introduced during production of the figure. Our only explanation is that WGA does not directly stain the erythrocyte membrane, but the glycocalyx. A closer examination of the WGA signal reveals that it is weaker at this point (and also in the eKnobs i, ii) so potentially the KAHRP signal is beneath the erythrocyte plasma membrane, but the membrane cannot be visualised at this point.

      Line 837, does this refer to 10 technical replicates or was the experiment repeated on 10 independent occasions? This should at least be done in 2 biological replicates given the range in technical replicates on the graph. Was CS2 considered as '100% lysis' or the water control described in the method? Please provide more detail.


      This figure is the result of 10 biological and 4 technical replicates. A number of data points were removed as outliers from a normal distribution (Grubbs' test). The highest value within a biological replicate was set to 100% to allow comparison of results. This has now been corrected in the text.
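The normalisation step described above (highest value within each biological replicate set to 100%) can be sketched as follows. This is a minimal illustration only: the function name and the readings are hypothetical, and outlier removal by Grubbs' test is not implemented here.

```python
def normalise_to_replicate_max(replicates):
    """Scale each biological replicate so that its highest value equals 100%."""
    normalised = []
    for values in replicates:
        top = max(values)
        normalised.append([100.0 * v / top for v in values])
    return normalised

# Hypothetical lysis readings: one inner list per biological replicate.
raw = [[2.0, 4.0, 8.0], [1.0, 2.5, 5.0]]
scaled = normalise_to_replicate_max(raw)
print(scaled)
```

Because each replicate is scaled to its own maximum, values become comparable between replicates, but absolute differences between replicates are no longer visible.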

      Reviewer #1 (Significance (Required)):

      This is a reasonably significant publication as it describes knob defects that to my knowledge have never been observed before. Importantly, the deletion of the J domain from PFA66 is genetically complemented to restore function really confirming a role for this protein in knob development. Amino acids critical for the function of the J-domain are also resolved. Apart from some minor technical and wording issues the paper is really nice work apart from one area which is the proposed partnership of PFA66 with human HSP70 for which there is not much direct evidence. If this evidence can be provided, we think this work could be published in a high impact journal. Without the evidence, it could find a home in a mid-level journal with some tempering of the claims of PFA66's interaction with human HSP70.

      **Referee Cross-commenting**


      There seems to be a high degree of similarity in the reviewers' comments and I think as many issues as possible should be addressed. I definitely agree that the term mentula should not be used.


      We have now adopted the suggestion of Reviewer 3, and use the term eKnobs.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Plasmodium falciparum exports several proteins that contain J-domains and are hypothesized to act as co-chaperones to support partner HSP70 chaperones in the host erythrocyte, but the function of these co-chaperones is largely unknown. Here the authors provide a functional analysis of one of these exported HSP40 proteins known as PFA66 by using the selection-linked integration approach to generate a truncation mutant lacking the C-terminal substrate binding domain. While there is no fitness cost during in vitro culture, light and electron microscopy analysis of this mutant reveals defects in knob formation that produce a novel, extended knob morphology and ablate Var2CSA-mediated cytoadherence. These knob formation defects are distinct from previous mutants and this unique phenotype is exploited by the authors to show that the HSP70-stimulating "HPD" motif of PFA66 impacts rescue of the altered knob phenotype. In other HSP40 co-chaperones, this motif is critical to stimulate partner HSP70 activity, suggesting that PFA66 acts as a bona fide co-chaperone. Importantly, previous work by the Przyborski lab and others has shown that deletion of PfHSP70x, the only HSP70 exported by the parasite, does not phenocopy the PFA66 mutant, implying that the partner HSP70 is of host origin. The results are exciting but I have some concerns about controls needed to properly interpret the functional complementation experiments. My specific comments are below.


      We agree that some control experiments are missing, and these will be included in any future revision.

      **Major comments**


      • The failure of the HPD mutant PFA66 to rescue the knob-defect is very interesting. However, the authors need to determine that the HPA mutant is expressed at the same level as the WT (by quantification against the loading controls in the western blots in Fig 1D and Fig S6H) and is properly exported (by IFA and/or WB on fractionated iRBCs, as done for the GFP-fused truncation in Fig S1A). Otherwise, the failure to rescue is hard to interpret. If these controls were in place, the conclusion that a host HSP70 is likely being hijacked by PFA66 is appropriate. This genetic data would be greatly strengthened by in vitro experiments with recombinant protein showing activation of a host HSP70 by PFA66, but I realize this may be out of the scope of the present study. Along these lines, it might be worth discussing the finding by Daniyan et al 2016 that recombinant PFA66 was found to bind human HSPA1A with similar affinity to PfHSP70x but did not substantially stimulate its ATPase activity, suggesting this is not the relevant host HSP70. This study is cited but the details are not discussed.

      As in our answer to Reviewer 1, we will examine the expression and localisation of both WT and mutant PFA66.

      We are currently expressing and purifying a number of HSP40/70 combinations for exactly the kind of analysis suggested and hope to include such data in future revisions, but as the reviewer fairly notes, this is really beyond the scope of the current study.

      Regarding Daniyan et al (and other) papers: The fact that PFA66 can stimulate PfHSP70x does not preclude that it also interacts with human HSP/HSC70, and indeed there is some stimulation of human HSP70. Daniyan and colleagues did steady-state assays in the absence of nucleotide exchange factors. Therefore, the stimulation of human HSP/HSC70 is not very prominent. One should either do single-turnover experiments or add a nucleotide exchange factor to make sure that nucleotide exchange does not become rate-limiting for ATP hydrolysis. This is completely independent of the results for PfHSP70-X, as the intrinsic nucleotide exchange rates of the studied HSP70s could be very different. Also, it is important to understand that J-domain proteins generally do not stimulate ATPase activity much by themselves but in synergism with substrates, allowing the possibility that such an in vitro assay may not reflect the situation in cellula. Additionally, the resonance units in the SPR analysis for PFA66-HsHSP70 are lower than those for PFA66-PfHSP70-X. This could mean that PFA66 is a good substrate for PfHSP70-X but not for HsHSP70, but this does not mean that PFA66 does not cooperate with HsHSP70.
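The point about rate-limiting nucleotide exchange can be illustrated with a deliberately simplified calculation. The sketch below assumes a two-step irreversible cycle (ATP hydrolysis followed by nucleotide exchange), for which the steady-state turnover is 1 / (1/k_hyd + 1/k_ex). All rate constants are hypothetical, in arbitrary units, and are not taken from Daniyan et al or from our own data.

```python
def steady_state_rate(k_hyd, k_ex):
    """Steady-state turnover of a two-step cycle: hydrolysis (k_hyd), then nucleotide exchange (k_ex)."""
    return 1.0 / (1.0 / k_hyd + 1.0 / k_ex)

# Hypothetical rate constants (arbitrary units), chosen only for illustration.
k_hyd_basal = 1.0        # hydrolysis without J-domain protein
k_hyd_stimulated = 10.0  # assumed 10-fold stimulation by the J-domain protein
k_ex_slow = 0.5          # exchange without a nucleotide exchange factor
k_ex_fast = 100.0        # exchange with a nucleotide exchange factor added

fold_slow = steady_state_rate(k_hyd_stimulated, k_ex_slow) / steady_state_rate(k_hyd_basal, k_ex_slow)
fold_fast = steady_state_rate(k_hyd_stimulated, k_ex_fast) / steady_state_rate(k_hyd_basal, k_ex_fast)
print(f"Apparent stimulation, slow exchange: {fold_slow:.2f}-fold")
print(f"Apparent stimulation, fast exchange: {fold_fast:.2f}-fold")
```

Under these assumptions the same 10-fold stimulation of hydrolysis appears as less than 2-fold in a steady-state assay when exchange is slow, but as close to 10-fold once exchange is no longer limiting.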

      - The authors claim that truncation of PFA66 alters the localization of KAHRP but not the other exported proteins they evaluated by IFA (Fig S4). This seems baseless as they don't apply the same imageJ evaluation to these other proteins. Similarly, the statement that KAHRP structures "appear by eye to have a lower circularity, although we were not able to substantiate this with image analysis" is subjective/qualitative and should probably be removed.

      We mean to say that we could not (in a blinded check), tell the difference between WT and KO IFA images. Only KAHRP (in our opinion) demonstrated a different fluorescence pattern. As KAHRP has previously been implicated in knob formation, we then analysed this phenotype in more detail. A detailed analysis of the fluorescence pattern in the other IFAs does, in our eyes, not add to the story or add any real value to our observations.

      The statement on the circularity has been removed according to the reviewer's wishes.

      -The section title "Chelation of membrane cholesterol...causes reversion of the mutant phenotype in ∆PFA" seems an overstatement given the MBCD effect on the knob morphology is fairly weak and remains significantly abnormal.

      The title of this section was misleading, we agree. We have retitled it “Chelation of membrane cholesterol but not actin depolymerisation or glycocalyx degradation causes partial reversion of the mutant phenotype in ∆PFA” to clarify that the reversion was only partial (as explained by the following text in the manuscript).

      **Minor comments**

      - The DNA agarose gel image in Fig 1B is not very convincing. Most of the bands are faint and there is a lot of background/smear signal in the lanes. Also, it would help for clarity if the primer pairs used for each reaction were stated as shown in the diagram (rather than simply "WT", "5' Int" and "3' Int").

      We apologise. Integration PCRs are notoriously challenging. Any revised manuscript will feature clearer images.

      - Given the vulgar connotation of "mentula", the authors might consider an alternative term.

      We have now adopted the term “eKnobs” suggested by Reviewer 3.

      - lines 67-69: The authors may wish to cite a more recent review that takes into account updated Plasmepsin 5 substrate prediction from Boddey et al 2013 (PMID: 23387285). For example, Boddey and Cowman 2013 (PMID: 23808341) or de Koning-Ward et al 2016 (PMID: 27374802).

      A fair point, we have now added de Koning-Ward et al 2016.

      - lines 77-79: "deleted" is repetitive in this sentence.

      Changed

      - line 115: It might be clearer to state "endogenous PFA66 promoter"

      Changed

      - lines 131-132: "...these data suggests that deletion of the SBD of PFA66 leads to a non-functional protein." Behl et al 2019 (PMID: 30804381) showed the recombinant C-terminal region of PFA66 (residues 219-386, including the SBD truncated in the present study) binds cholesterol. The authors may wish to mention this along with their reference to Kulzer et al 2010 showing PFA66 segregates with the membrane fraction, suggesting cholesterol is involved in J-dot targeting.

      We should have noted this connection and thank the reviewer for bringing it to our attention. This section has been revised to include this important information.

      - line 198: It's not clear what is meant by "+ve" here and afterward. Please define.

      We have now changed this to “structures labelled by anti-KAHRP antibodies”, or merely “KAHRP”.

      - lines 749-750: "Production of PFA and NEO as separate proteins is ensured with a SKIP peptide". Translation of the 2A peptide does not always cause a skip (see PMID: 24160265) and often yields only about 50% skipped product (for example, PMID: 31164473). Because of the close cropping in the western blots in Fig 1C or S1A this is difficult to assess. Is a larger unskipped product also visible? Beyond this one point, it is generally preferable that the blots not be cropped so close.

      A very valid point, and in other parasite lines we have indeed detected non-skipped protein. In our case, we visualise a band at the predicted molecular mass for the skipped dPFAGFP and the commonly observed circa 26 kDa GFP degradation product. The full-length blots have now been included as supplementary data (Figure S7).

      - lines 867-868: Explain more clearly what "Cy3-caused fluorescence" is measuring.

      The Cy3 channel refers to anti-var2CSA staining, and we have now included this information.

      - Several figure legends would benefit from a title sentence describing what the figure is about (ie, Fig legends 1, 3, 5, S1, S5 & S6)

      This has been added.

      Reviewer #2 (Significance (Required)):

      This manuscript by Diehl et al reports on the function of the exported P. falciparum J-domain protein PFA66 in remodeling the infected RBC. Obligate intracellular malaria parasites export effector proteins to subvert the host erythrocyte for their survival. This process results in major renovations to the erythrocyte, including alteration of the host cell cytoskeleton and formation of raised protuberances on the host membrane known as knobs. Knobs serve as platforms for presentation of the variant surface antigen PfEMP1, enabling cytoadherence of the infected RBC to the host vascular endothelium. This process is of great interest as it is critical for parasite survival and severe disease during in vivo infection. The basis for trafficking of exported effectors within the erythrocyte after they are translocated across the vacuolar membrane is not well understood but is known to involve chaperones. This is a particularly interesting study in that it provides evidence in support of the hypothesis, initially proposed nearly 20 years ago, that the parasite hijacks host chaperones to remodel the erythrocyte. This is biologically intriguing and also suggests new therapeutic strategies targeting host factors that would not be subjected to escape mutations in the parasite genome. The work will be of interest to those studying exported protein trafficking and/or virulence in Plasmodium (such as this reviewer) as well as the broader chaperone and host-pathogen interaction fields.

      **Referee Cross-commenting**

      I also agree that the comments are similar. Some additional discussion on the failure to localize the PFA66 truncation by live FL is warranted, as noted by reviewer #1. It seems likely that either the level of PFA66 protein is reduced by the truncation or the truncated PFA66 is dispersed from J-dots and harder to visualise when diffuse instead of punctate. In either case, the complementing copy (WT or QPD) should be visualized by IFA.


      As noted above, we believe our inability to visualize the truncated protein is likely due to its dispersal throughout the whole erythrocyte cytosol as opposed to lower expression levels, but we will be checking this, and also the localisation of WT and mutant PFA66 complementation chimera and expect to have this result for the next revision.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The data are for the most part well controlled and reveal a potential function for PFA66 in knob formation. The assays are state of the art and the data provides insight into knob formation.

      However, some conclusions are not fully supported by the data. For example, 'uncover a KAHRP-independent mechanism for correct knob biogenesis' (line 52-53) is not supported by the data because PFA66 truncation could result in misfolding of KAHRP and thus lead to knob biogenesis defects.

      We meant to imply that not only perturbations/absence of KAHRP lead to aberrant knobs. This is now changed to “…uncover a new KAHRP-independent molecular factor required for correct knob biogenesis.”.

      The other major issue is that despite having a complemented parasite line in hand, the parental parasite line is used as a control for almost all assays. This is a critical issue because an alternative explanation for their data would be that expression of truncated PFA66 leads to expression of a misfolded protein that aggregates in the host RBC OR it clogs up the export pathway and indirectly leads to knob biogenesis defects. It is surprising that the authors do not test the localization of dPFA using microscopy especially since it is tagged with GFP. While the complemented parasite line does revert back, this could also be due to the fact that the complement overexpresses the chaperone helping mitigate issues caused by the truncated protein.

      As all virulence characteristics we monitor in this study have been verified many times in the parental CS2 parasites in the literature, we think that the best comparative control for the truncated cell line is indeed the parental line. The large part of our study aimed to characterize differences in various characteristics upon inactivation of PFA66 function, and for this reason we used the parental WT line as a control. Using the complementation line would not truly reflect the effect of PFA66 truncation, as PFA66::HA was not expressed from an endogenous locus, but rather from an episomal plasmid. This itself may result in expression levels which differ from WT, and thus this parasite line cannot be seen as the gold-standard control for assaying PFA66 function.

      We did indeed try to localize dPFA (lines 122-123 in the original manuscript), but were unsuccessful, likely due to diffusion of dPFA throughout the entire erythrocyte cytosol (as opposed to concentration into J-dots as the WT). For this reason we carried out fractionation instead, and could show that dPFA is soluble within the erythrocyte cytosol. This experiment additionally excludes any blockage of the export pathway as no dPFA was associated with the pellet/PV fraction. Other proteins were still exported as normal (Figure S4), further supporting a functional export pathway. Indeed, as reported by ourselves and our colleagues (particularly from the Spielmann laboratory, Mesen-Ramirez et al 2016, Grüring et al 2012), blockage of the export pathway is likely to lead to non-viable parasites as the PTEX translocon seems to be the bottleneck for export of a number of proteins, many of which are essential for parasite survival.

      Reviewer #3 (Significance (Required)):

      The malaria-causing parasite extensively modifies the host red blood cell to convert the host into a suitable habitat for growth as well as to evade the immune response. It does so by exporting several hundred proteins into the host cell. The functions of these proteins remain mostly unknown. One parasite-driven modification, essential for immune evasion, is the assembly of 'knob' like structures on the RBC surface that display the variant antigen PfEMP1. How these knobs are assembled and regulated is unknown.

      In the current manuscript, Diehl et al target an exported parasite chaperone from the Hsp40 family, termed PFA66. The phenotypic observations described in the manuscript are quite spectacular and well characterized. The truncation of PFA66 results in some abnormal knob formation where the knobs are no longer well-spaced and uniform but instead sometimes form tubular structures termed mentulae. The mechanistic underpinnings driving the formation of mentulae remain to be understood but that will probably require several more manuscripts to be deciphered.

      We thank the reviewer for their kind comments, and also for the recognition that this current manuscript is merely the exciting beginning of a story!

      **Major Comments:**

      General comment on the use of controls: The large part of our study aimed to characterize differences in various characteristics upon inactivation of PFA66 function, and for this reason we used the parental WT line as a control. Using the complementation line as a control in this context would not truly reflect the effect of PFA66 truncation, as PFA66::HA was not expressed from an endogenous locus, but rather from an episomal plasmid. This itself may result in expression levels which differ from WT, and thus this parasite line cannot be seen as the gold-standard control for assaying PFA66 function. Our complementation experiments were initially designed to verify that phenotypic changes ONLY related to inactivation of PFA66 function and were (as unlikely as this is) not due to second site changes during the genetic manipulation process. To avoid lengthy and not really very informative analysis of the complementation line, we used knob morphology via SEM as a “proof-of-principle”. However, as the reviewer is formally correct, we have added a passage to the discussion stating that “We wish to note that we cannot unequivocally state that our complementation construct caused reversion of all the aberrant phenotypes herein investigated, however we feel it likely that all abnormal phenotypes are linked and thus our “proof of principle” investigation of knob/eKnob phenotypes is likely to be reflected in other facets of host cell modification and can thus be seen as a proxy for such.“.

      Fig 3: The control used here is the parental line. Was there a reason why the complemented parasite line was not used as the control? Showing that the KAHRP localization and distribution is restored upon complementation would greatly increase the confidence in the phenotype.

      Please see our general comments above.

      Fig 5: The data showing a defect in CSA binding are convincing but again only the parental control is used and not the complemented parasite line. The complemented parasite line should be used as a control for the PFA binding mutant.

      Please see our general comments above, and also our response to reviewer 1.

      In 5D, the defect in dPFA seems to occur to a lesser degree than in Fig. 2C. How many biological replicates are shown in each of these figures? The figure legend says 20 cells were quantified via IFA but were these cells from one experiment? The expression of mentulae seems quite variable: while the authors mention '22%' (line 164), in most other experiments it seems to be more like ~10% (5D and S6B, D-E). Were these experiments blinded?

      As the reviewer is likely aware, subtle differences in parasite culture conditions, stage, fixation, SEM conditions and length of time in culture between experimental time points can lead to variations in results. Due to the time required to generate the data for figure 5, these experiments took place months after the original (i.e. Figure 2C) analysis. It is not possible to directly compare the results of these two independent experiments, however it is possible to compare the results of the parasite lines included within each set of experimental data. Due to the time and cost involved, each of these experiments represents only one biological replicate. If required, we can include more replicates, although this is more likely to further complicate the situation due to the reasons mentioned above.

      Fig S6G: The staining suggests that most PfEMP1 is not exported, in any parasite line. Staining for PfEMP1 is technically challenging and these data are not enough to show that expression level is 'similar' (Line 279-280). It may be more feasible to use the anti-ATS antibody and stain for the non-variant part of PfEMP1 (Maier et al 2008, Cell).


      It is well known that a large portion of PfEMP1 remains intracellular. This figure does not aim to differentiate between surface exposed and internal PfEMP1, but merely to show that similar TOTAL PfEMP1 is expressed in the deletion line, and also that the parasites have not undergone a switching event which would lead to loss of CSA binding ability. We will endeavour to address this in future revisions by Western Blot but wish to note that WB analysis of PfEMP1 is notoriously difficult.

      Lines 320-322: The logic of why increased robustness of the RBC membrane would lead to faster parasite growth is confusing. It is likely that the loss of PfEMP1 expression leads to faster growth. The loss of NPP is minimal and may not cause growth defects in rich media.

      As far as we can detect, there is no loss of total PfEMP1 expression (as verified by figure S6G), but rather a drop in surface exposure and functionality, which is unlikely to affect parasite growth rates. What we intended to say was that the NPP assay is influenced by fragility of the erythrocyte, and therefore a stiffer erythrocyte may be more resistant to sorbitol-induced lysis. As the NPP result does not really add much to the main narrative of this manuscript, we would prefer not to invest unnecessary effort for a minimal potential readout. Indeed, we are tempted to remove the NPP data as they detract from the main findings of the manuscript, this being the reason we include them only as supplementary data.

      Lines 433-434: These data do support a function for HsHsp70 but these data are among many others that have previously provided circumstantial evidence for its role in host RBC modification. Maybe a co-IP would help support these conclusions better.

      Despite all our best efforts and publications, we have been unable to detect this interaction in co-IP or crosslink experiments, although we were successful in detecting interactions between another HSP40 (PFE55) and HsHSP70 (Zhang et al, 2017). Although this is disappointing, it may be explained by the transient nature of HSP40/HSP70 interactions. We agree that our suggestion (that parasite HSP40s functionally interact with human HSP70) is not novel (we and others have noted this possibility for over 10 years), however the challenging nature of the experimental system makes it very difficult to show direct evidence of the importance of this interaction in cellula. Over the past decade we have used numerous experimental approaches to try to address this but have always been confounded by technical challenges. In 2017 the corresponding author took a sabbatical to attempt manipulation of hemopoietic stem cells to reduce HSP70 levels in erythrocytes, however it appears (unsurprisingly) that HsHSP70 is required for stem cell differentiation, and thus this tactic was not followed further. The authors believe that, due to the lack of the necessary technology, indirect evidence for this important interaction is all that can realistically be achieved at this time, and this current study is the first to provide such evidence.

      We would further like to note that a successful co-IP would not directly verify a functional interaction between PFA66 and HsHSP70, but could also reflect a chaperone:substrate interaction between these proteins, and is therefore not necessarily informative.

      **Minor Comments:**

      Fig1: The bands are hard to see in WT and 3’Int. Maybe a higher-resolution figure would help? Also, the schematic shows primers A-D but the figure legend does not refer to them. It would be useful to the reader to have the primers indicated above the PCR gel along with the expected sizes.

      We apologise. Integration PCRs are notoriously challenging. Any revised manuscript will contain clearer images.


      Fig S1: The NPP data could be improved if tested in minimal media. It has been shown that NPP defects do not show up in rich media (Pillai et al 2012, Mol. Pharm. PMID: 22949525). Does complementation restore NPP and growth rate?

      As the NPP result does not really add much to the main narrative of this manuscript, we would prefer not to invest unnecessary effort for a minimal potential readout. Indeed, we are tempted to remove the NPP data as they detract from the main findings of the manuscript, this being the reason we include them only as supplementary data. Likewise the complementation experiments are, we feel, unnecessary.

      Fig 4: It is not clear what the line scan analyses are supposed to show. What does ‘value’ on the y-axis mean?


      These are line scans of fluorescence intensity (arbitrary units) along the yellow arrows shown on the fluorescent panels. This is now indicated in the figure legend.

      Fig S5D: Maybe it was a problem with the file but no actin staining is visible.

      The actin stain was visible on the screen, but unfortunately not in the PDF. We have applied (suitable) enhancement to produce the images in the new version.

      Fig 6: A model for mentulae formation is not really proposed. Only what the authors expect the mentulae to look like.

      We have changed the legend to reflect this “Figure 6. Proposed model for eKnob formation and structure.”. We do propose that runaway extension of an underlying spiral protein may lead to eKnobs, thus would like to keep the word “formation”.

      Lines 312-313: It is not clear what 'highly viable' means, parasites are either viable or not.


      This has been changed.

      Lines 400-405: The authors forgot to cite a complementary paper that showed no virulence defect upon 70x knockout or knockdown (Cobb et al mSphere 2017). Those data also support a role for HsHsp70.

      We apologise for the omission. This is now included.

      **Referee Cross-commenting**


      I agree, the comments are pretty similar. The authors could tone down their conclusions or add more data to support their conclusions. Maybe call them elongated knobs or eKnobs, instead of mentulae?

      We have now removed the offending term and use eKnobs.

    1. Reviewer #4 (Public Review):

      In this paper, the author uses an impressive comparative dataset of 172 species to investigate the relationship between intraspecific genetic diversity and census (actual) population size. They find that even when they use phylogenetic comparative methods, the relationship between neutral diversity and population size is much weaker than predicted by theory and that selection on linked sites is unlikely to explain this difference. The paper convincingly demonstrates that the paradox of variation first pointed out by Lewontin in the 70s remains paradoxical.

      This paper is exceptionally strong in multiple ways. First, it is statistically rigorous; this is particularly impressive given that the paper uses methods and data from multiple fields (genomics, macroecology, conservation biology, macroevolution). This is the most robust estimate of the relationship between diversity and population size that has been published to date. Second, it is conceptually rigorous: the paper clearly lays out the various hypotheses that have been put forth over the years for this pattern as well as the logic behind these. The author has done a great job at synthesizing some complex debates and different types of data that are potentially relevant to resolving it. Third, it is exceptionally well-written. I sincerely enjoyed reading it. Overall, I think this is a major contribution to this field and though the paper does not resolve the challenge laid down by Lewontin, I think these analyses (and curated data/computational scripts) will inspire other researchers to dig into this question.

      I do however, have some suggestions as to how this paper could be strengthened.

      First, in phylogenetic comparative methods (PCMs) there has been a persistent confusion as to what phylogenetic signal is relevant -- when applying a phylogenetic generalized linear model with a phylogenetically structured residual structure (which the author does here), one is estimating the phylogenetic structure in the errors and not the traits themselves. The comparative analyses are well-done and properly interpreted but at some points in the text, particularly when addressing Lynch's conjecture that PCMs are irrelevant for coalescent times and in the comments/analysis on the appropriateness of Brownian motion as a model of evolution, there is some conceptual slippage and I suggest that the author take a close look and make sure their language is consistent. Strictly speaking the PGLM approach doesn't assume that the underlying traits are purely BM -- only that the phylogenetic component of the error model is Brownian. As such, running the node-height test on both the predictors and the response variable separately -- while interesting and informative about the phylogenetic patterns in the data (including the shift points you have observed) -- isn't really a test of the assumptions of the phylogenetic regression model. It is at least theoretically plausible (if not biologically) that both Y and X have phylogenetic structure but that the estimated lambda = 0 (if for instance, Y and X were perfectly correlated because changes in Y were only the result of changes in X). To be clear, I am fine with the PGLM analysis you've done and with the node-height test; I just don't think that the latter justifies the former.

      One note about the ancestral character reconstruction: I think it is a fine visualization and realize you didn't put too much emphasis on it but strictly speaking the ASR's were done under a constant process model and therefore they wouldn't provide evidence for (a probably very real shift) between phyla. I think it was a good idea to run the analyses on the clade specific trees (particularly given how deep and uncertain the branches dividing the phyla are) but I just don't think you could have gotten there from the ASR.

      I am not convinced that the IUCN RedList analysis helps that much here and in my view, you might consider dropping this from the main text. This is for two reasons: 1) species may be of conservation concern both because they have low abundance in general and/or that their abundance is known to have experienced a recent decline -- distinguishing these two scenarios is impossible to do with the data at hand; and 2) there is of course a huge taxonomic bias in which species are considered; I don't think we can infer anything ecologically relevant from whether a species is listed on the RedList or not (as you suggest regarding the lynx, wolverine, and Massasauga rattlesnake) except that people care about it.

      This is not really a weakness but I find it notable that recombination map length is correlated with body size. I realize this is old news but I was left really curious as to a) why such a relationship exists; and b) whether the mechanism that generates this might help explain some of the patterns you've observed. I would be keen to read a bit more discussion on this point.

    2. Reviewer #3 (Public Review):

      This study is quite directly a follow-up study of the recent work of Corbett-Detig et al (2015) and the commentary by Coop (2016) which aimed to understand the relation between population size and diversity, and the degree to which the shape of the relation could be explained by the action of linked selection. The analysis here scales up the sample size for a large-scale focus on comparative analyses of animals, and introduces the application of phylogenetic correction to control for relatedness.

      As the most comprehensive analysis of its type to date, and with the addition of phylogenetic correction, this work's strength primarily lies in confirming the conclusions laid out in the commentary by Coop, notably that linked selection is unable to fully explain the narrowness of the diversity across species with orders of magnitude variation in population sizes. Through an explicit model-fitting of the effects of linked selection, the main conclusions are essentially that Lewontin's Paradox remains unexplained. The Introduction and discussion provide a very nice accounting of the range of possible explanations. I also appreciated the connection of the population size inferences to IUCN status.

      I wasn't so convinced that the assessment of phylogenetic inertia (Lambda>0) really provides a way to assess Lynch's argument that coalescent times are too short to have a phylogenetic effect. For reasons outlined by the author in the discussion, it could well be that any phylogenetic inertia signal is due to inertia of life history traits correlated with effective population size rather than with diversity itself. The discussion raises this important point, but I think leaves us with the difficulty of really assessing how important that phylogenetic correction really is: if diversity has no direct phylogenetic non-independence, I am a bit unsure how much we have learned through this analysis alone (i.e. what is lambda telling us), without an explicit assessment of how often divergence times may actually truly be on the same order as coalescent times.

      That said, I think it's a very open question whether diversity actually has phylogenetic independence because of short split times relative to effective population sizes. The author mentions the possible effect of large Ne on causing this to be violated; but I also wondered whether many of the small Nc species are still retaining a fair bit of ancestral polymorphism, further homogenizing diversity levels.

      Overall a number of possible explanations (such as the effect of variable selected site densities, and variable recombination) were raised, and rather quickly rejected as 'unlikely to explain the qualitative patterns'. In a number of cases these statements were fairly brief, and I wondered how likely it is that a combination of these, in aggregate, COULD explain the patterns. Looking at Figure 5B, it seems like the major effect of phylogeny (or correlated life history) is also apparent for the discrepancy between observed and predicted diversity: Chordates seem to have the largest discrepancy. With that in mind, I do wonder whether some feature of genome structure in Chordates, including a combination of the effects discussed in the paper (e.g. the effects of variable recombination rates/genome size and functional densities, variation in mutation rates, etc.), could collectively account for the paradox, even though individually the author rules them out as being able to explain the 'qualitative pattern'. Could the genome structure of chordates lead to a major difference in linked selection that's unaccounted for here?

      Mei et al (2018) (American Journal of Botany, Volume 105, Issue 1, p1-124) argued that species with larger genomes have greater 'functional space', implying a greater deleterious mutation rate in species with larger genomes. This could potentially be a factor driving those Chordates with intermediate Nc values furthest below the predicted line?

    3. Reviewer #1 (Public Review):

      The standard neutral model, which is our null model for levels of genetic variation, predicts that they should be proportional to census population sizes. In reality census population sizes across metazoan species span several orders of magnitude more than the ~3 orders spanned by levels of genetic diversity. This discrepancy is referred to as Lewontin's paradox, and to resolve it would mean to explain how basic population genetic processes lead to the modest span of genetic diversity levels that we observe. This is a central question in population genetics (which is, after all, concerned with understanding patterns of genetic variation) and is of substantial general interest.

      The manuscript addresses Lewontin's paradox through three main analyses:

      1) It derives novel estimates of census population size across metazoans, which alongside previous estimates of neutral diversity levels, enables a revised quantification of the relationship between diversity levels (\pi) and census populations sizes (Nc).

      2) It quantifies the relationship between \pi and Nc controlling for phylogenetic relatedness.

      3) It revisits the question of whether this relationship can be accounted for by the effects of selection at linked loci (e.g., sweeps and background selection). I address each of these analyses in turn.

      Novel estimation of census population sizes in metazoans: The estimates are derived by: 1) estimating the density of individuals within their range, based on body size and a previously observed linear relationship between body size and density (Damuth 1981, 1987); 2) applying a geometric algorithm (finding the minimum alpha-shape computationally, sometimes adjusting alpha manually) to geographic occurrence data to estimate the area of the range; and 3) multiplying the two.
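      To make the three steps summarized above concrete, here is a minimal sketch of the density-times-area calculation. It is not the author's pipeline: the allometric intercept and slope are illustrative placeholder values, a convex hull is swapped in for the alpha-shape used in the manuscript, and occurrence points are assumed to be already projected to planar coordinates in km. All function names are hypothetical.

```python
import math


def density_from_mass(mass_g, intercept=4.0, slope=-0.75):
    """Individuals per km^2 from body mass (g), assuming a log-log linear relation.

    The intercept and slope are illustrative placeholders, not fitted values.
    """
    return 10 ** (intercept + slope * math.log10(mass_g))


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order.

    A convex hull is a simplification of the alpha-shape described in the review.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def census_size(mass_g, occurrences_km):
    """Step 3: multiply estimated density by estimated range area."""
    range_area_km2 = polygon_area(convex_hull(occurrences_km))
    return density_from_mass(mass_g) * range_area_km2


# Hypothetical occurrence records (km) for a 10 kg animal.
records = [(0, 0), (100, 0), (100, 100), (0, 100), (50, 50)]
print(census_size(10_000, records))
```

      Because a convex hull cannot represent concave or disjoint ranges, it will generally overestimate area relative to an alpha-shape, which is one reason the census estimates deserve independent validation.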

      The results are sometimes surprising. For example, Drosophila melanogaster is estimated to have a population size > 10^17 (Fig. 1); if the volume of an individual is 1 mm^3, this implies a total volume > 1 km x 1 km x 100 m. Additionally, some species classified as endangered have census estimates > 10^8 (Fig. 3). The author compares his area estimates with estimates for species in the IUCN Red List (focused on endangered species) to find that they largely correlate (although this is not quantified). I think further investigation of the quality of the census size estimates is warranted. Are there other estimates of census size or biomass that can be used for validation, e.g., for species of economic and biomedical importance (e.g., herring and anopheles)?
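      The back-of-the-envelope volume check above can be reproduced in a few lines. This is only the unit conversion, using the 1 mm^3 per individual figure assumed in the comment; the function name is hypothetical.

```python
MM3_PER_M3 = 1e9  # (1000 mm)^3 in one cubic metre


def total_volume_m3(n_individuals, volume_per_individual_mm3):
    """Total volume occupied by a population, in cubic metres."""
    return n_individuals * volume_per_individual_mm3 / MM3_PER_M3


flies_m3 = total_volume_m3(1e17, 1.0)
box_m3 = 1000 * 1000 * 100  # a 1 km x 1 km x 100 m box

print(flies_m3, box_m3)
```

      The two quantities agree at 10^8 m^3, so the stated comparison holds for a census size of exactly 10^17.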

      If the proposed method proves to work well, I imagine that the estimates of census size may be of broad interest in other contexts. In the context of Lewontin's paradox, it may be interesting to quantify the difference in the relationship between \pi and Nc suggested by the new estimates vs the proxies used in previous work (e.g., Leffler et al. 2012).

      Quantifying the relationship between \pi and Nc controlling for phylogenetic relatedness: I am unclear about the motivation for this analysis. As Lynch argued (and the author describes), if TMRCAs of neutral loci within a species are smaller than the split time from another species in the sample, its genetic diversity level was shaped after the split, and it could be considered an independent sample for the relationship between \pi and Nc. There may be underlying factors shaping this relationship that are not phylogenetically independent (e.g., similar life history traits) but it is unclear why that would justify down-weighting a sample. In that sense, I am not convinced by the author's argument that finding a 'phylogenetic signal' justifies the correction. Stated differently, it is not obvious what the 'true' relationship being estimated is and why relatedness biases it. One could imagine that the 'true' relationship is the one across extant species, in which case the correction is not needed (with the possible exception of species in which TMRCAs are on the same order or greater than split times). I don't know what an alternative 'true' relationship would be.

      Moreover, I am not sure how a more precise 'quantification' of the relationship between diversity and census size serves us. Regardless of corrections, it is obvious that the null provided by the standard neutral model is off by orders of magnitude. Perhaps once we have alternative explanations for this relationship then testing them may require corrections, but presumably the corrections will depend on the explanations.

      One context in which phylogenetic considerations and quantification may be relevant is the comparison of the \pi - Nc relationship among clades. Notably, one could imagine that different population genetic processes are important in different clades (e.g., due to reproductive strategy) and a comparative analysis may highlight such differences. It is less clear whether the corrections that are applied here are the relevant ones. Separating clades makes sense in this regard, but it is unclear why to correct for non-independence within a clade. Furthermore, it seems that in order to point to different processes one would like to control for the distribution of census population sizes in comparisons between clades (to the extent possible). Otherwise, one can imagine the same process shaping the relationship in different clades, but having a non-linear (in log-log scale) functional dependence on census population size (as in the case of genetic draft studied next). In this regard, I am not sure I follow the argument attributed to Gillespie (1991) and specifically how the current analysis supports it.

      In summary, I find the ideas of clade level analyses and of using phylogenetic comparative methods (PCMs) to look at census population size (and possibly diversity levels) promising. For example, as the author alludes to in the Discussion (bottom of P. 13), PCMs may be informative about the hypothesis that species with large census sizes have a greater rate of speciation. Yet I find the current analyses difficult to interpret.

      Analysis of the effects of linked selection: The author investigates whether the effect of selection at linked sites (e.g., selective sweeps and background selection) can account for the observed relationship between diversity levels and census population size. To this end, he assumes that different species have the same sweeps and background selection parameters inferred in Drosophila melanogaster, but differ in census size and genetic map length.

      As justification for using selection parameters inferred in D. melanogaster, the author argues that this is a "generous" assumption in that the effects of linked selection in this species are on the high end. One issue with this argument is that among reasons for the strong effects in D. melanogaster is its short genetic map length. This is not a substantial caveat, given that the analysis is meant as an illustration and it can be resolved by using appropriate wording. Perhaps more troubling is that the author's estimate of the reduction in diversity level in D. melanogaster is much greater than the reduction estimated in the inference that he relies on (several orders of magnitude and less than one, respectively). This discrepancy is mentioned but should probably be addressed more substantially.

      The results of the analysis are intriguing. The effects of linked selection `shrink' the ~13 orders of magnitude of census population sizes to ~3 orders of magnitude of diversity levels. This massive effect is largely due to the genetic draft (Gillespie 2001) and to a lesser extent to the decrease in map length with increasing census size: when the census population size becomes very large (Nc~10^9) and coalescence rates due to genetic drift decrease accordingly (~1/2Nc), coalescence rates due to sweeps, which increase owing to the smaller map lengths (and would otherwise remain constant), become dominant. In hindsight this is quite intuitive and aligns with Gillespie's original argument, but this is in hindsight, and using this argument in conjunction with data, specifically with census population size and map length estimates, is novel.
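      The saturation argument in this paragraph can be illustrated with a simple competing-rates sketch. It assumes that the drift coalescence rate is 1/(2Nc) (i.e., Ne = Nc in the absence of sweeps), that sweeps contribute a constant additional pairwise coalescence rate, and that diversity is twice the mutation rate times the expected pairwise coalescence time. The mutation rate and sweep rate below are hypothetical values chosen for illustration, not estimates from the manuscript, and the map-length dependence of the sweep rate is ignored.

```python
import math


def expected_diversity(n_census, mu, sweep_coal_rate):
    """Pairwise diversity when drift and sweeps both cause coalescence.

    drift rate = 1 / (2 * Nc); total rate = drift + sweeps;
    pi = 2 * mu * (expected pairwise coalescence time) = 2 * mu / total rate.
    """
    drift_rate = 1.0 / (2.0 * n_census)
    return 2.0 * mu / (drift_rate + sweep_coal_rate)


MU = 1e-9          # hypothetical per-site mutation rate per generation
SWEEP_RATE = 1e-6  # hypothetical per-generation coalescence rate due to sweeps

for exponent in range(3, 16, 3):
    n = 10.0 ** exponent
    neutral = expected_diversity(n, MU, 0.0)      # equals 4 * Nc * mu
    drafted = expected_diversity(n, MU, SWEEP_RATE)
    print(f"Nc=1e{exponent:<2d} neutral pi={neutral:.3e} with sweeps pi={drafted:.3e}")

# Orders of magnitude spanned by pi across Nc = 1e3 .. 1e15 with sweeps.
span = math.log10(
    expected_diversity(1e15, MU, SWEEP_RATE) / expected_diversity(1e3, MU, SWEEP_RATE)
)
print(f"span with sweeps: {span:.2f} orders of magnitude")
```

      Under these assumptions diversity plateaus near 2 * mu / sweep rate once 1/(2Nc) falls well below the sweep-induced rate, which is the qualitative behaviour described above.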

      As the author points out, the resulting relationship between diversity levels and census population sizes does not fit the data well. Notably, predicted diversity levels are too high in the intermediate range of census population sizes. Nonetheless, the analysis suggests that linked selection may play a much greater role than previous studies suggested (i.e., the analyses of Corbett-Detig et al. (2015) and Coop (2016) suggest that it cannot account for more than 1 order of magnitude). Maybe the poor fit is due to the importance of other factors (e.g., bottlenecks) in species with intermediate census population sizes?

      I also wonder whether the potential role of linked selection may be clearer if the different effects are shown separately, and perhaps with less reliance on the estimates from D. melanogaster. Namely, the effects of background selection can be shown for a few different values of Udel, e.g., between 0.3-3 (this range seems plausible based on many estimates). They can be shown both accounting and not accounting for the relationship between map length and census size. Similarly, the effect of sweeps can be shown for several values of corresponding parameters, and perhaps even for different models for how the number of beneficial substitutions varies with census size (see Gillespie's work to that effect). I believe that such illustrations will be fairly intuitive and less restrictive.
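      As a rough illustration of the kind of figure suggested here, the sketch below tabulates the reduction in diversity due to background selection for Udel between 0.3 and 3. It assumes a simplified functional form, B = exp(-Udel / M), with M the genetic map length in Morgans; this form is taken as an assumption for illustration (it ignores selection coefficients and heterogeneity along the genome) and is not claimed to be the model used in the manuscript. The map lengths are hypothetical.

```python
import math


def bgs_reduction(u_del, map_length_morgans):
    """Fraction of neutral diversity remaining under the assumed form exp(-U/M)."""
    return math.exp(-u_del / map_length_morgans)


U_VALUES = (0.3, 1.0, 3.0)     # range of Udel suggested in the review
MAP_LENGTHS = (0.5, 2.0, 10.0)  # hypothetical map lengths in Morgans

for m in MAP_LENGTHS:
    row = ", ".join(f"U={u}: B={bgs_reduction(u, m):.3f}" for u in U_VALUES)
    print(f"M={m:>4} Morgans -> {row}")
```

      Holding the map length fixed shows the effect of Udel alone, while varying the map length alongside census size would show the additional contribution of the map length relationship.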

    1. This data shortage is caused by chronic under‐funding of conservation science, especially in the species‐rich tropics (Balmford and Whitten 2003), and the high financial cost and logistical difficulties of multi‐taxa field studies.

      Why is conservation science under-funded? I know that this text may be a little older, so funding may not be as bad now. With the effects of climate change seen everywhere, you would think the government would put more efforts and funds into conservation. We need to be responsible for these species and their ecosystems, especially if we're the ones responsible. Luckily it seems that the Biden administration is taking a bigger focus on the environment and climate change than the past administration. I read in an article that the new administration plans to triple the amount of protected land in the U.S., which will be huge for conservation. More changes also seem to be on the way.

      https://www.nationalgeographic.com/environment/article/biden-commits-to-30-by-2030-conservation-executive-orders

    1. We propose that neurophenomenology of dreaming is a nascent discipline that requires rethinking the relative role of third-, first- and second-person methodologies, and that a paradigm shift is required in order to investigate dreaming as a phenomenon on a continuum of conscious phenomena as opposed to a break from or an alteration of consciousness

      We may need to think of dreams not as a break from consciousness, but as a unique continuation of our wakeful conscious state.


    1. Yet specialization becomes increasingly necessary for progress, and the effort to bridge between disciplines is correspondingly superficial.

      Conversely, the less specialized a discipline is, the more superficial its understanding and therefore more easily bridged. I'm thinking education in this context.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      Bhide and colleagues present an insightful study of how cellular mechanics influences differential cell behaviour during morphogenesis despite apparent genetic homogeneity of the cellular ensembles. They dissect the extensively studied system of mesoderm invagination in Drosophila, focussing on the differences in cell behaviours between the cells in the middle of the infolding tissue and on the periphery that, as far as we know, share a common gene expression profile. They describe sub-cellular dynamics of the major effector of apical constriction morphogenesis, the myosin motor distribution, in the invaginating cells and conclude that differences in myosin levels alone cannot account for the observed differences in cell behaviours. In order to understand the cell behaviour inhomogeneity, they turn to biophysical simulation and in an impressively exhaustive manner substantiate the idea that non-linear effects are required for explaining the phenomenon. This theoretical treatment fits well with the notion that not the genetic identity of the cells but rather cell-cell mechanical coupling determines the differences in the invaginating cells' behaviours. Additionally, the modelling is consistent with the myosin asymmetry and dynamics in the cells whose behaviours are being contrasted. Complementary and beautifully executed filament-based modelling of microscopic actomyosin contractility further corroborates this view. Finally, the proposed model of non-linear actomyosin contractility dynamics governing the differential cell behaviour across a genetically homogeneous cellular field is challenged by two complementary experimental approaches, laser ablation and optogenetics.

      Overall, the results represent convincing evidence that points the tissue mechanics field of Drosophila mesoderm into an interesting new direction and has general implications for the understanding of the interplay between genetic regulation and emergent behaviours of cells operating in a mechanically complex multicellular embryonic context. The study is meticulously executed, highly quantitative and effectively combines experiment and theory. I have only minor comments that concern in particular the presentation of the results.

      The paper is very dense and the text does not complement well the results presented in the main figures. Many panels in the Figures are not referred to explicitly. Figure elements are referenced out of order both within and across Figures. Sometimes, particularly, in the last two Figures (3 and 4) the reader is left alone to figure out what the data show (with the appropriately terse legends and without the clear narrative in the text, it is an uphill battle for non-specialists). Some key results are hidden in the sea of elements within the Figure 2 that contains the most important, relevant and impressive data.

      We have split this figure in two, moved some of the results from Suppl. Fig. 5 into one of its parts and included new calculations and data. We have also extended the description of these results in the main text and in the figure legends.

      As an example, on line the authors point to panel 2F to demonstrate the asymmetry of myosin distribution in some cells. To the best of my understanding, this phenomenon is actually shown in Fig 2E which is curiously not referenced at all.

      We have corrected the references to the panels.

      Similarly, Figure 2K and L provide crucial data substantiating much of the conclusions of the paper. It requires a major effort to understand what the graphs mean. The following simulation results are quite impressive and would deserve a separate Figure which could provide more space for explaining what the parameter maps actually show. What is for instance plotted on the Y axis as steepness?

      We have added the following explanation: “The ‘width’ of the profile is the number of cells with maximum value; the ‘steepness’ is the slope between minimal and maximal values (equation 2 in materials and methods).”
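      A minimal sketch of how these two descriptors could be read off a discretised myosin profile is given here. Equation 2 of the materials and methods is not reproduced in this response, so the sketch rests on assumptions: the profile is a list of per-cell values running from the ventral midline outwards, `profile_width` and `profile_steepness` are hypothetical helper names, and the slope is taken per cell step between the last cell at the maximum and the first cell at the minimum.

```python
def profile_width(profile, tol=1e-9):
    """Number of cells whose value equals the profile maximum (within tol)."""
    peak = max(profile)
    return sum(1 for value in profile if peak - value <= tol)


def profile_steepness(profile, tol=1e-9):
    """Slope between maximal and minimal values, per cell step.

    Assumption (not taken from the manuscript): the slope is measured from
    the last cell at the maximum to the first later cell at the minimum.
    """
    peak, low = max(profile), min(profile)
    last_peak = max(i for i, value in enumerate(profile) if peak - value <= tol)
    first_low = next(
        (i for i in range(last_peak + 1, len(profile)) if profile[i] - low <= tol),
        None,
    )
    if first_low is None:  # flat profile or no descending flank
        return 0.0
    return (peak - low) / (first_low - last_peak)


# Hypothetical half-profile: three cells at maximum, then a linear decline.
example_profile = [1.0, 1.0, 1.0, 0.75, 0.5, 0.25, 0.0, 0.0]
example_width = profile_width(example_profile)          # 3 cells at maximum
example_steepness = profile_steepness(example_profile)  # 0.25 per cell
```

      In a parameter scan of the kind described, each point of the map would correspond to one (width, steepness) pair generated this way; the actual parametrisation used in the paper may differ.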

      Secondly, I find the overall narrative of the manuscript needing some reorganisation. The main question is set-up extremely well, however in the middle of the manuscript the focus on the connection between cell behaviours and genetic programs is lost. New conclusions on force transmission between cells emerge, however they are not obviously connected with the question posed from the onset and addressed in the discussion section.

      To us, the section on force transmission seemed like an important component of the issue of intrinsic versus extrinsically determined cell behaviours. We had seen that the intrinsic programme of the cells, as reflected in their myosin levels, might not be sufficient to explain the difference between stretching and constricting. If their behaviour is not intrinsically determined, then there must be something acting from the outside, and we are looking here at what that might be, i.e. we need to find out how the potential constriction is influenced. The first model tests under which conditions differential contractility leads to different ‘cell’ behaviours. This in turn leads directly to the question of the forces the cells in the epithelium exert on each other.

      My impression is that the authors are conservative in their reasoning, however it does compromise the overall message of the story that should ideally focus on one subject. I find the combined evidence presented sufficiently supportive of the model that is beautifully and eloquently presented in the concluding sentence of the paper:

      "This mechanism, which we propose corresponds to the non-linear behaviour predicted by the models, would apply both to central and to lateral cells, with a catastrophic 'flip' being stochastic and rare in central cells, but reproducible in lateral cells because of the temporal and spatial gradient in which contractions occur."

      This may not turn out to be the entire story or even entirely correct, but it is certainly an exciting way of thinking about the problem. I wish that the manuscript would stay more on this subject throughout and provide intermediate conclusions supporting this model as the story develops.

      Few more minor comments:

      We have corrected all of the typos, mistakes and omissions and adapted the text, as mentioned below.

      Line 36 - typo

      Line 97 - starting bracket missing

      Line 126 - data on intensity are presented here. There is also a panel on concentration (Fig 1H). Where is this discussed?

      An explanation (definition) has been added to the main text.

      Line 132 - panel 2G - disruptive out of sequence reference to a future figure

      Line 135 - with this regard - please spell out this important conclusion

      We have expanded this part, basically introducing the conclusion more clearly (we hope).

      Line 183 - typo

      Line 210 - insects do not have intermediate filaments

      We have added ‘mammalian’ to the reported experiment in the text, to make it clear that this does not refer to Drosophila cells.

      Line 238 - please provide a hint of how such global ablations are performed

      We have added this, both explicitly and through the relevant references.

      Line 240 - walk us through the Figure, it is too complex to figure it out alone

      We have added a more extensive explanation both in the text and in the new figure legend.

      Line 245 - why is the clear hypothesis mentioned above (point 2) rephrased?

      Line 273 - vague statement

      We have changed the text in response to these useful pointers.

      Significance:

      The results represent convincing evidence that points the tissue mechanics field of Drosophila mesoderm into an interesting new direction and has general implications for the understanding of the interplay between genetic regulation and emergent behaviours of cells operating in mechanically complex multicellular embryonic context.

      Reviewer #2

      Bhide and colleagues explore the mechanisms of cell expansion in epithelial morphogenesis. During the invagination of the Drosophila mesoderm, cells in the center of the prospective mesoderm constrict under the action of actomyosin pulses, while lateral cells elongate towards the center of the mesodermal placode to accommodate the reduction in apical surface of the central cells. Central and lateral cells display strong similarities in terms of gene expression. How, then, are these different behaviors (contraction and expansion) accomplished? The authors found that both central and lateral cells assemble actomyosin networks, although lateral cells do it with a certain delay. Mathematical models of cell constriction across the mesoderm using different strain-stress responses showed that strain-induced cell softening was necessary to recapitulate the patterns of constriction and expansion observed in vivo. Furthermore, modelling predicts that cells can stretch until the actin networks yield and break. Laser ablation and optogenetic reduction of contractility in central cells results in a reduction in the apical surface area of lateral cells. An optogenetic increase in contractility in lateral cells caused an increase in apical area in central cells. Together, these data suggest that mechanical cues can override and contribute to sculpting genetically defined morphogenetic domains.

      I propose to address the following points before further considering the manuscript:

      Major

      1. Figure 3: following laser ablation of central cells, lateral cells reduce their apical surface. How do the authors know that this reduction in lateral cell apical surface area is an active process, driven by actomyosin-based contraction, rather than a passive response to the expansion of the wound induced by laser ablation?

        A similar argument could explain the constriction of lateral cells after optogenetic inhibition of actomyosin networks: the central cells relax, expand and compress the lateral cells.

      With regard to the comparison to wounds, it is important to note that the epithelium is not actually wounded by either ablation method. Thus, while the treatments ablate the actomyosin meshwork, they do not ablate or kill the cells. Perhaps the term is an unfortunate choice, since it is more commonly used in developmental biology for killing cells. However, here the cells remain intact and when the optogenetic or laser treatment is released the cells resume their physiological activities.

      We have added a note in the text and now refer to ‘laser microdissection’, a term of art in the field, for more clarity.

      Regarding the more important question of what is the active process, expansion of the central cells or constriction of the lateral cells, a contribution from expanding central cells is of course in theory not impossible.

      However, for this scenario to work, in the absence of pulling from the lateral cells, there would have to be a force that is generated in the central cells, in this case a pushing force that would expand the cells and act on the lateral cells. We have shown in our previous work that if the actomyosin is dissected in dorsal cells, which are not surrounded by potentially contractile cells, the cells do not expand (Rauzi et al, 2017). This shows that ‘relaxing’ by itself does not have ‘expansion’ as a consequence. One would therefore have to consider how such a pushing force could arise in these cells. We can think of only two possibilities: hydrostatic pressure or an active force from the subcellular molecular machinery.

      Considering hydrostatic pressure, if the apical actomyosin that is ablated was responsible for maintaining such a pressure inside the cell (a reasonable assumption), then releasing the actomyosin would allow the cell volume to push against the neighbouring cell. However, such a recoil would occur on a very short time scale (seconds), whereas we see the contraction of the lateral cells continuing over extended periods (minutes).

      Alternatively, expansive forces could be generated by the cytoskeleton. Cytoskeletal pushing forces can come from microtubules (classical example: mitotic spindle; epithelial morphogenesis: work from T. Harris and B. Baum labs: PMID 18508861 and 20647372), or from continuous creation of new cross-linked or branching actin networks pushing against plasma membranes, as in the leading edge of crawling cells. But the microtubules in the blastoderm cells are not oriented in such a way that they could provide a force in the correct dimension in these cells (the majority are oriented along the apical-basal axis). In addition, the connection between MT and the plasma membrane depends on the cortical actin meshwork (involving, for example, the actin-binding proteins P120-Catenin or patronin/Shot; Roeper lab, PMID 24914560; St Johnston lab, PMID 27404359), but the connection of actin with the plasma membrane has been severed in the optogenetically manipulated cells.

      By contrast, we show that normal lateral mesodermal cells possess a contractile actin network. So the only sustained force generated in the system at this point is the contractile force in lateral cells (which is normally counteracted by the stronger contractile force from central cells).

      Thus, we conclude that the expansion of central cells is a passive response to a contractile force from lateral cells, not an active process and conversely, the constriction of lateral cells is an active autonomous process.

      To demonstrate active responses of the lateral cells upon laser ablation and optogenetic manipulations of central cells, at the very least the authors should show the distribution of myosin in the lateral cells that constrict and demonstrate the assembly of contractile networks.

      We have now included the requested data for the experiments with laser ablations. Suppl. Fig. 8 and Suppl. video 3 show the myosin that accumulates in lateral cells. It would be nice also to be able to show this for the optogenetic experiments. However, despite trying hard, we have not succeeded in generating healthy embryos that carry the entire set of transgenes that are necessary to carry out the optogenetic experiments and at the same time visualize myosin (see also response to referee 2, point 3).

      2. Modelling suggests that actin networks yield and break in lateral cells. Does this occur in vivo?

      We postulate that the skewed and inhomogeneous distribution of myosin and the large myosin-free areas in stretched cells (lines 170 – 172 in the original text) are indications of a yielding meshwork, or at least of uneven force distribution in the network that leads to ineffective contraction or even release – i.e. functionally correspond to yielding. We have made this more explicit now.

      We have also added an additional panel quantifying more clearly the proportion of low-myosin areas in lateral cells (now Fig. 3H).

      Work from the Lecuit lab has recently shown beautifully that it is the connectivity of the myosin mesh rather than the underlying actin meshwork that affects apical forces in epithelial cells (PMID: 32483386), and our own findings are entirely consistent with that.

      3. Lines 166-175: The authors propose that constriction of a cell affects the localization of myosin in its neighbors. However, this is not directly measured. The authors should quantify the relative myosin offset in the cells around constricting cells, and show that that offset is greater (and oriented towards the constricting cell) than in cells around expanding cells. There should be a correlation between the relative size change of a cell and the myosin offset (not just concentration) in their neighbours.

      We now provide measurements of the rate of cell area change against the offset of surrounding myosin (the distance of myosin from a cellular border). We see that surrounding myosin is closer to the border of constricting cells and tends to be further away from the borders of expanding cells.

      We have added these data to the new Fig. 3I.

      In addition, does optogenetic activation of constriction in lateral cells affect the offset of myosin networks in central cells?

      This is technically challenging. For such an experiment we would need an embryo to express membrane and myosin markers in addition to the two optogenetic constructs and the GAL4 driver. We tried multiple times to generate such a cross, but obtained either no embryos or, at best, deformed embryos. We also tried to use the MCP-MS2 system in parallel to CRY2-RhoGEF2 but the crosses had the same problem. This sensitivity to additional genetic load was also observed in the DeRenzis lab, who generated these strains and tested and used them extensively.

      4. Fig. 2E-F: the authors argue that the mean myosin concentration in lateral cells at certain times is equivalent to that of central cells earlier in the invagination process. However, the fraction of apical surface area covered by myosin network is consistently lower for lateral cells (and also for central cells that remain unconstricted!). Have the authors considered this fact, and if not, why? Wouldn't this explain, at least in part, why some cells constrict and others do not, if medial myosin networks drive the disassembly of the apical surface?

      We believe in fact that this is precisely part of the picture and it was what we had meant to propose, but the text was perhaps indeed just too condensed. Thus, we had stated in line of the original document:

      “While the asymmetry is visible in all cell rows, there are larger areas without myosin and the distance of displacement is greater in lateral cells (Fig. 2G-J)”,

      and in the discussion (line 277 – 285):

      “Despite the homogeneous actin meshwork in stretching cells, the areas that are free of active myosin occupy a large proportion of the apical surface – similar to ectodermal or amnioserosa cells in which the connection of pulsatile foci to the underlying actin meshwork is lost. ... Dilution of cortical myosin may compromise a cell’s ability to make sufficient physical connections, in particular along the dorso-ventral axis, so that even if sufficient force is generated, it cannot shorten the cell in the long dimension. In other words, even though the cells have enough myosin to create force, the system is not properly engaged and its force is not transmitted to the cell boundary.”

      However, we didn’t state this with sufficient clarity in the results section and have added an extra sentence to this effect.

      If myosin activity were increased in lateral cells once central cells begin constricting, would that lead to an increased fraction of lateral cell surfaces covered by actomyosin networks and to reduced lateral cell elongation?

      This is a really nice experiment, and we have indeed tried to induce activation at later time points, but unfortunately this did not yield unambiguous results. If we did the manipulation after the central cells had clearly constricted, then activating lateral cells did not lead to their contraction. However, this is a negative result, and we have no independent criterion for knowing how 'strong' the induced contraction was (as explained above, we are unfortunately not able to visualize the myosin in these experiments) or why it might not have been sufficient to overcome the pull from central cells.

      In this context it is worth remembering that in mutants in which myosin is overactivated as a result of defective upstream signalling, lateral cells stretch less or not at all. See PMID: 24026125 for gprk2 mutants and our own results for active Rho1:

      Figure: Confocal Z-section of embryos expressing sqh::GFP (myosin; green) and GAP43::mCherry (membrane; magenta) imaged ventrally. A constitutively active form of Rho1 is ectopically expressed using a maternal Gal4 driver, inducing activation of myosin in more lateral cells. White dots mark the mesectoderm determined by backtracing after ventral furrow invagination. Yellow arrows in B are constricted cells in row 7/8.

      Minor

      1. Image panels are missing scale bars in many figures.

      2. Fig. 1C'-D': The authors should include a color bar to provide some indication of the scale of the apical areas measured. Same comment for other figures in which apical area is color-coded.

      We have added the missing elements.

      3. Supp. Fig. 2E-F, G-H and Supp. Fig. 6: what is the difference between myosin intensity and myosin concentration? Junctional vs medial localization? Or summed vs mean pixel value? Please be specific, the difference between intensity and concentration is not clear.

      In the cases where we talk about myosin ‘amount’ we have now exchanged the term ‘intensity’, i.e. the physical term for the amount of light, for ‘amount’ (i.e. that for which we use the light intensity as a proxy) and have explained in the main text how we define total apical myosin amount and apical myosin concentration (amount over area). However, in the cases where we are describing the actual image analysis, as in Suppl. Fig. 3, we use ‘intensity’ as the term of art that is used for the methods employed here. Similarly, the terms ‘sum intensity’ and ‘mean intensity’ are terms used for image analysis in Fiji.
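      The distinction between ‘amount’ (sum intensity) and ‘concentration’ (amount over area, i.e. mean intensity) can be illustrated with a small sketch. It assumes a 2D image given as nested lists of pixel intensities and a boolean mask marking the apical area of one cell; the function name `apical_myosin` and the pixel values are hypothetical and chosen only to show that two cells can carry the same amount at different concentrations.

```python
def apical_myosin(image, mask):
    """Return (amount, area, concentration) for the masked region.

    amount        = sum of pixel intensities inside the mask ("sum intensity")
    area          = number of pixels inside the mask
    concentration = amount / area ("mean intensity")
    """
    amount = 0.0
    area = 0
    for image_row, mask_row in zip(image, mask):
        for value, inside in zip(image_row, mask_row):
            if inside:
                amount += value
                area += 1
    concentration = amount / area if area else 0.0
    return amount, area, concentration


# Hypothetical constricted cell: 4 pixels, each with intensity 2.
small_image = [[2, 2, 0], [2, 2, 0], [0, 0, 0]]
small_mask = [[True, True, False], [True, True, False], [False, False, False]]

# Hypothetical stretched cell: 9 pixels carrying the same total signal.
large_image = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
large_mask = [[True] * 3 for _ in range(3)]

small = apical_myosin(small_image, small_mask)
large = apical_myosin(large_image, large_mask)
```

      Both hypothetical cells have an amount of 8, but the stretched cell has the lower concentration because the same signal is spread over a larger apical area.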

      The definitions of “junctional” and “medial” actin were introduced by the Lecuit lab (PMID: 21068726), and we have included the appropriate reference.

      4. Line 118: Supp. Fig. 2 does not have panels I and K.

      5. Line 223: the authors reference data at sec, but Supp. Fig. 6 does not show any images at that time point. They should be added or a different time point indicated.

      These errors have been corrected.

      Typos

      1. Abstract: "[in a supracellular context" should be "in a supracellular context".

      2. Line 145: should this be a reference to Supp. Fig. 5 instead of Supp. Fig. 4?

      3. Line 166: I am not sure how Supp. Fig. 5 supports this statement. Is this the right figure reference? Should it be Supp. Fig. 4 instead?

      4. Line 881: "representing on line" should be "representing one line".

      These errors have been corrected.

      Optional

      Tony Harris' lab showed that the Arf-GEF Steppke antagonizes myosin and facilitates cell deformation at the leading edge of the embryonic epidermis during Drosophila dorsal closure (West et al., Curr Biol, 2017). Does Steppke localize to junctions in lateral but not central mesoderm cells? Does the pattern of Steppke localization in the mesoderm change with manipulations to the contractility of central cells?

      This is certainly interesting, and we have ordered the protein trap, UAS constructs and RNAi lines. However, these will be long-term and time-consuming experiments.

      Significance:

      This is an interesting study, and one that makes use of beautiful tools, including quantitative microscopy and image analysis, mathematical modeling and optogenetic manipulations. The prediction that embryonic cells display non-linear stress-strain responses is exciting, as linearity has been the predominant assumption so far. However, I find that model predictions are not well supported by the data, and that alternative interpretations of some results are possible. Additionally, the paper lacks insight into the molecular mechanisms that facilitate stretching (although that could be the subject of a follow-up study).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this study, the authors explore potential mechanisms for why some cells constrict while other cells expand, despite similar intrinsic genetic programs, during Drosophila ventral furrow formation at the onset of gastrulation. The authors combine quantitative analyses of cell shapes and myosin levels from multiphoton confocal and Multi-View SPIM imaging, optogenetic and laser perturbation experiments, and mechanical models to argue that nonlinear mechanical interactions between cells are required to explain the cell behaviors. Based on microscopic models of the actomyosin cytoskeleton in the tissue the authors argue that the required nonlinear mechanical behavior is consistent with actomyosin network reorganization.

      Major comments:

      • Although the area of investigation is exciting and the results are interesting, unfortunately the quality of the results and comparison between experiment and modeling in the current version of the manuscript are not convincing. Although it is not clearly explained in the manuscript, the experimental results on cell shapes, myosin intensity, laser manipulation, and optogenetic perturbations appear to be from a single embryo or small number of embryos for each experiment (Figures 1, 3, 4).

      We had analysed a much larger number of embryos, but only included those for presentation that provided the most extensive data. It is extremely difficult to obtain absolutely ‘perfect’ embryos at high resolution for full quantification over long periods. ‘Perfect’ means that the embryos are mounted in such a way that they are imaged from an angle of 45 degrees off the dorso-ventral axis, so that initially mesodermal rows 3 to 7 are seen, and then, as furrow formation progresses, the more lateral rows move through the field of vision. It is difficult to mount in this perfect manner for two reasons: the shape of the embryo means that the embryo does not ‘like’ to be balanced in this position, but instead prefers to fall back on its side. Secondly, the embryo has to be mounted at a time point before visible differentiation along the D-V axis, so no visual cues exist to get the positioning right. This means that many of our recordings lack either the more ventral or the lateral cell rows. While the findings for these more restricted observations are fully consistent with our reports, they cannot be quantified with a full comparison across all cell rows over the entire imaging period. Nevertheless, we have processed and analysed further examples which we have now included in Suppl. Fig. 2 and Suppl. Fig. 8.

      The authors state that the cell stretching pattern "was best recapitulated by a superelastic response", but did not provide direct quantitative comparisons of the different mechanical models to the experimental data to clearly demonstrate this.

      Data that illustrate this were shown in Suppl. Fig 5 – but, admittedly, were not well explained, or rather, not at all. We have now added better explanations, expanded the figure, included new analyses, and now present some of these data in the new Fig. 2. Briefly, the figure shows that superelastic and elastoplastic responses are the only curves that successfully reproduce the pattern of stretching lateral cells (last 3 cells stretching with the inner cell stretching most and the last cell stretching least) while at the same time matching the ratio between the cell sizes of the most stretching cells to the least stretching cell.

      The top row of the parameter scans in Suppl. Fig. 5 (now Fig. 2) shows how many cells stretch for each combination of myosin curve steepness (y-axis) and width (x-axis) with shades of blue indicating the number of cells, and the red outline in the field where 3 cells stretch outlining those conditions where the inner cell stretches most. The bottom row shows the resulting size ratios of largest to smallest cell. High ratios in the region outlined in red in the top row are only reached for the superelastic and elastoplastic responses, with the elastomeric tending in the right direction.

      We have now also quantified a goodness-of-fit (root mean squared error, RMSE) measurement between our experimental data and the simulated data of all our models. This is shown now in the new Fig. 2.[1]
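      For readers unfamiliar with the measure, a minimal sketch of an RMSE comparison follows. The per-row values and the model labels (`model_A`, `model_B`) are invented placeholders for illustration only and are not data from the manuscript; the sketch only shows how a lower RMSE identifies the simulated profile that lies closer to the measured one.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error between two equally long sequences."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


# Placeholder relative apical areas per cell row (NOT measured values).
observed = [0.4, 0.5, 0.7, 1.2, 1.6]
simulated = {
    "model_A": [0.8, 0.8, 0.9, 1.0, 1.1],  # hypothetical output of one model
    "model_B": [0.4, 0.6, 0.8, 1.1, 1.5],  # hypothetical output of another
}

scores = {name: rmse(observed, values) for name, values in simulated.items()}
best_model = min(scores, key=scores.get)  # model with the smallest error
```

      RMSE weights all cell rows equally; if some rows are measured with far fewer cells than others, a weighted variant may be more appropriate.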

      We also note that only the parameter maps of the superelastic and elastoplastic models (Fig. 2J,K) resemble the equivalent parameter maps of the microscopic model (Fig. 3Q).

      Moreover, the local optogenetic myosin recruitment experiments in Figure 4 do not provide sufficient information on optogenetic tool recruitment,

      We have included images that illustrate the optogenetic construct in the illuminated cells, but not in the central cells in Suppl. Fig. 8. It is impossible to show the construct in the ‘dark’ cells, because illuminating them would activate the construct.

      myosin localization,

      As explained above, this is unfortunately technically not feasible. The best we can do is refer to the description of the construct by Izquierdo et al. (PMID: 29915285), which shows the accuracy of the tool and the highly specific membrane recruitment of myosin.

      or cell behaviors

      We have added quantitative comparisons between the experimental and control areas to justify the claim that the central cells are not activated by the optogenetic perturbation and are only responding to the forces from neighboring cells.

      • The authors should provide direct quantitative comparisons of the models and experiments to clearly demonstrate their claims that the superelastic model is better than the linear model or other nonlinear models.

      See response above.

      • The authors should do additional experiments and/or provide more details for the existing experiments (to include several embryos per condition) on myosin quantification, photo-manipulation, and optogenetics experiments.

      We have provided data for more embryos for all cases.

      Additional controls would likely be necessary for claims resulting from the optogenetics experiments in Figure 4.

      This has been addressed above – we have provided additional data and controls.

      • The additional time and resources required to address these concerns would depend on the experimental details, N values, and statistics in the current studies, which unfortunately were not described in the current manuscript.

      We have been able to add substantial additional data and have added the requested numbers. For many of the experiments each recording can be very time-consuming and, for the reasons explained in this response, it is not always easy to obtain precisely the desired recording from the desired imaging angle with the manipulations having been done precisely in the desired position. The numbers of embryos are therefore not high, but multiple shorter recordings provide a body of results that supports the findings, although they are not easily comparable statistically.

      • Methods descriptions for reproducibility are generally adequate, with the exception of N values and statistics

      See above.

      • Are the experiments adequately replicated and statistical analysis adequate?

      No, see above.

      Minor comments:

      1) Scale bars for images are missing throughout.

      We have added these.

      2) Number of embryos and cells analyzed missing throughout text and figure legends.

      We have added additional embryos for all conditions and have included the numbers of cells analysed for all quantifications (except in cases where each data point represents a cell).

      3) Units are missing for many quantities in figures and tables throughout.

      We have added these.

      4) Many figure references in the main text are incorrect, pointing either to the wrong figure or wrong figure panel.

      These have been corrected.

      5) Line 728. What time point was used for myosin concentrations used in the model?

      We have added this information to the figure legend.

      How might myosin dynamics influence these findings?

      As regards the subcellular dynamics of myosin, these are included in the microscopic model (see ref Belmonte et al.; PMID: 28954810). Preliminary results showed that small changes in myosin stall force and unloaded myosin speed have little effect on our general results. This is now shown in a new supplemental figure (Suppl. Fig. 6). However, if the referee is referring to the dynamics of myosin accumulation over time, this is an interesting question.

      We had begun to explore this topic, but then realized for the linear stress-strain model that it is in fact expected that the time course of myosin accumulation would ultimately not affect the outcome. This is because in a linear model the final state of the system is determined by the final shape of the governing myosin profile regardless of the time evolution of the profile, and our simulations confirm this. A systematic analysis for all other stress-strain curves with temporal changes in myosin profiles (where a dependency on the profile temporal evolution is expected) is very time-consuming and will be interesting to pursue in future.

      The main conclusion here that linear models do not recapitulate the observed data as well as the non-linear ones stands regardless of how the temporal dynamics of myosin accumulation may affect the non-linear systems.
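      The path independence of the linear case can be illustrated with a toy calculation. The sketch below is not the model used in the paper: it assumes a one-dimensional chain of cells with fixed ends, each cell a linear spring plus an active myosin tension, relaxing with overdamped dynamics. All names and parameter values (`simulate`, `FINAL_MYOSIN`, stiffness, ramp times) are hypothetical. Two different time courses that end at the same myosin profile are compared with the analytic steady state, in which every cell carries the same tension.

```python
# Hypothetical final myosin profile (cell 0 is the most central cell).
FINAL_MYOSIN = [1.0, 1.0, 0.8, 0.6, 0.4, 0.2, 0.0, 0.0]


def simulate(myosin_at, n_cells=8, k=1.0, rest=1.0, gamma=1.0,
             dt=0.01, t_end=200.0):
    """Explicit Euler integration of an overdamped chain with fixed ends.

    myosin_at(i, t) returns the active tension of cell i at time t.
    Returns the final length of each cell.
    """
    x = [i * rest for i in range(n_cells + 1)]  # vertex positions
    for step in range(int(round(t_end / dt))):
        t = step * dt
        # Linear stress-strain law: elastic term plus active myosin tension.
        tension = [k * ((x[i + 1] - x[i]) - rest) + myosin_at(i, t)
                   for i in range(n_cells)]
        # Interior vertex j is pulled right by cell j and left by cell j - 1.
        for j in range(1, n_cells):
            x[j] += dt * (tension[j] - tension[j - 1]) / gamma
    return [x[i + 1] - x[i] for i in range(n_cells)]


def simultaneous(i, t):
    """All cells ramp up together over 20 time units."""
    return FINAL_MYOSIN[i] * min(t / 20.0, 1.0)


def sequential(i, t):
    """Central cells ramp first; each more lateral cell starts 5 units later."""
    return FINAL_MYOSIN[i] * min(max((t - 5.0 * i) / 10.0, 0.0), 1.0)


lengths_simultaneous = simulate(simultaneous)
lengths_sequential = simulate(sequential)

# Analytic steady state: uniform tension with the total length conserved.
mean_myosin = sum(FINAL_MYOSIN) / len(FINAL_MYOSIN)
expected = [1.0 + (mean_myosin - m) / 1.0 for m in FINAL_MYOSIN]
```

      In this toy chain both protocols relax to the same cell lengths because the steady state of a linear system depends only on the final forcing. Replacing the linear law with a strain-softening or yielding one would be the natural place to look for the history dependence mentioned above.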

      6) The authors show a few examples of myosin pulsing in lateral cells and then conclude that myosin pulsing is not qualitatively different from central cells (lines 135-136). The authors should quantify the number of pulsing lateral cells as well as period and amplitude of pulsing, or discuss relevant results from prior studies in more detail to justify this conclusion.

      By ‘not qualitatively different’ we had meant only ‘in the sense that they are capable of generating contractile forces’, and we have made that more explicit in the text now. The quantitative differences have already been analysed and reported by the Martin lab (https://doi.org/10.1101/2020.04.15.043893; the pulses are slower and less persistent), and our point was that in spite of these known differences, the pulses are able to mediate constriction.

      7) Lines 145-150. The authors very briefly describe the results of the linear stress-strain response and conclude this did not yield outputs corresponding to in vivo data and leave this largely to the supplementary figures. This is a key point in the paper and deserves much more discussion and space in the main text.

      We have included a more extensive description and interpretation of the results in the main text, as detailed in several responses above.

      As mentioned in main comments above, a quantitative comparison of the different mechanical models to show that the superelastic model better describes the observations should be included (potentially as an inset to Fig 2D showing a quantitative measure of the quality of model fit to the data).

      These comparisons have now been expanded and explained more extensively and moved to the main Figures.

      8) Lines 162-163. Provide more rationale for why strain-softening would most likely manifest as permanent or reversible cytoskeletal reorganization.

      The only component of the cell that can likely mediate this physical property and also respond at the observed time scales is the cytoskeleton. In these cells it is the main mechanical determinant. Other components that could in principle contribute to the nonlinearity of stress-strain response might be the viscosity of the cytosol, or the plasma membrane. However, stress responses of fluids to shear are usually in the direction of increasing stiffness, and rarely, if ever, with shear thinning. The same is mostly true for colloidal solutions. Therefore it is more likely that the stress-strain relationships at the apical surface of the cells are dominated by the dynamics of the actin cytoskeleton given that even the shape of the plasma membrane is in general determined by the cytoskeleton. We have added a note to this effect in the text.

      9) Lines 187-188. "This shows that forces acting on each cell from its neighbors have an important role in determining the cell's behavior." This seems somewhat obvious; perhaps a bit more explanation would help the reader to understand the importance of these results.

      We have expanded the explanations of these findings and added a sentence to relate them to the main model of the paper.

      10) Lines 196-198. How were the concentrations and lengths of F-actin chosen? How were the concentration and properties of linkers chosen?

      The parameters were chosen on the basis of our earlier studies on simulated contractile meshworks and the theory underlying their behaviour. We had reported the conditions under which such meshes are able to contract, and also shown that the underlying theory correctly predicts behaviour of experimental meshworks (for those few conditions for which they have been reported).

      Unfortunately, there are practically no measurements for the length of F-actin filaments in vivo and estimates vary widely. Reliable data on the density of the cortical network are equally sparse.

      Based on our own previous work we chose concentrations of cross-linkers, myosin motors and transmembrane connectors that are able to ensure optimal contraction and force. Our in vivo measurements reported here show that the amounts of F-actin do not vary significantly across the mesoderm, so we used the same concentration of actin, crosslinkers and membrane connectors in all cells of the model, varying only myosin concentration. Taking into account the cell diameter of the mesodermal cells (~7 um), and to ensure that the meshwork is sufficiently cross-connected (dense) to generate contraction and transmit forces between cells, we used a model where each cell contains F-actin filaments of 1.5 um.

      We have expanded our supplemental material to make these points clearer.

      How sensitive are the results to these details of the cytoskeletal composition?

      We varied both the amounts of cytoskeletal components and the parameters controlling their dynamics (such as myosin stall force and viscosity) and found little impact on model predictions. These data are now presented in Suppl Fig. 6.

      11) Lines 238-244. It would be helpful to include some additional quantification that clearly shows the reader the differences in cell behaviors in control and perturbed tissue.

      We have added quantitative comparisons of the cells in the perturbed region with cells in an equivalent control region, together with evaluations of two additional embryos.

      For the optogenetics experiment, it would be important to show quantification that the lateral cells are not being directly perturbed during photoactivation of neighboring cells (e.g. due to light leakage).

      We have included this information, as described above.

      In both perturbations, it would be helpful to quantify how many cells in rows 7 and 8 constricted and by how much did they constrict? How reproducible were these effects?

      The perturbation experiments were those where it was most difficult to obtain a large number of identical-looking embryos that would allow broad statistics to be applied. For this to work, we would have to have embryos that were identically mounted and illuminated in the identical area of precisely rows 1 to 6 on each side of the midline – at a resolution of one cell row of 6.2 um width. And all this blind, because at the start of the manipulation there are no visual cues for orientation. Morphology gives no cues at this stage. The MS2-MCP-GFP works for laser ablations, but cannot be used for the optogenetics, because the embryo must not be exposed to blue light. This means we cannot predetermine precisely which rows we target.

      We have however added data and quantifications for the control and two further laser-manipulated embryos, which are now shown in Suppl. Fig. 8. It is evident from both that our perturbations were slightly asymmetric and included the outer rows on only one side, and that on that side several cells that would normally have stretched are now strongly constricted. While by no means true for all lateral cells, this is a case of one black swan disproving the hypothesis that all swans are white: any constricting cell within two cell diameters of the mesectoderm, i.e. one that would normally stretch, proves that lateral cells do have the capacity to constrict.

      12) Lines 245-252. A key assumption in interpreting this experiment seems to be that the central cells are not directly perturbed by the optogenetic activation. Additional quantifications of RhoGEF2-CRY2 and/or myosin should be shown to support this.

      We have included an image of the optogenetically activated construct in this experiment in Fig. 5, but we cannot show its behaviour in the non-activated part because if we illuminated it, it would be activated. We were unable to create the embryos necessary to document the behaviour of myosin.

      It would be helpful to include some additional quantification that clearly shows the reader the differences in cell behaviors in control and experimental regions. How reproducible were these effects?

      We now provide the results from two additional embryos in Suppl. Fig. 8, and include quantitative comparisons between the control and experimental regions for these and for the embryos that are currently shown in Fig. 5 E.

      13) A section on statistics is missing from the methods section.

      We have added descriptions of the quantifications and statistics.

      14) Line 615. Ensure that Eq. 1 is dimensionally consistent; crucially, what units are used for 'M'? If the model is non-dimensionalized, provide the reference scales.

      Apart from the initial distance between membrane positions (set to 6.2 um) all other units in our visco-elastic model are arbitrary. In order to make this clearer, instead of using the term “viscosity” in equation 1, we now call it a “damping constant”.

      15) Line 675: The investigated stress-strain relationships are presented in Table S1. What are the definitions of xpl and xsh?

      We have included these definitions in materials and methods:

      All stress-strain curves are linear for extensive strains (Δx) lower than the proportionality limit (xpl), with some curves (elastoplastic and superelastic) undergoing a strain-softening to strain-hardening change after a given strain-hardening limit (xsh).
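
      To make the two limits concrete, here is a minimal sketch of a loading curve that is linear up to a proportionality limit, softens between the two limits, and hardens beyond the strain-hardening limit. It is only an illustration under stated assumptions: the function name, the piecewise-linear form, and every slope and limit value are hypothetical, in arbitrary units, and are not the functional forms or parameters of Table S1 or Table S2.

```python
def softening_hardening_stress(dx, x_pl=1.0, x_sh=2.0, k=1.0, k_soft=0.2, k_hard=2.0):
    """Continuous, piecewise-linear stress for an extensive strain dx >= 0.

    All defaults are hypothetical illustration values in arbitrary units:
    x_pl   proportionality limit
    x_sh   strain-hardening limit (must exceed x_pl)
    k      slope of the initial linear regime
    k_soft reduced slope between x_pl and x_sh (strain softening)
    k_hard increased slope beyond x_sh (strain hardening)
    """
    if not 0 < x_pl < x_sh:
        raise ValueError("expected 0 < x_pl < x_sh")
    if dx <= x_pl:
        # Linear regime below the proportionality limit.
        return k * dx
    if dx <= x_sh:
        # Softened regime: stress keeps rising, but with a smaller slope.
        return k * x_pl + k_soft * (dx - x_pl)
    # Hardened regime: slope increases again past the strain-hardening limit.
    return k * x_pl + k_soft * (x_sh - x_pl) + k_hard * (dx - x_sh)
```

      The sketch describes loading only; it does not distinguish the reversible (superelastic) from the permanent (elastoplastic) case, which differ in how the material unloads.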

      16) Line 678: Parameter values for the stress-strain relationships are given in Table S2. Can you provide more information on how these values were selected and their units? How sensitive are the results to changes in these values? Provide references when possible.

      The values for xpl and xsh were chosen to be within the range of the observed lengths of stretching cells, with xpl < xsh. Changing the values of each parameter listed in Table S2 does change the results quantitatively, but over the ranges we tested them, never to the point of making the linear or the other non-linear models reproduce the target pattern of stretching.

      We have stated this in the materials and methods section.

      17) Line 697. Please comment on why the embryo appears skewed to the right.

      Embryos are not always ‘perfect’, unfortunately. In addition, they can get slightly squashed during mounting and imaging. In spite of its imperfection, we showed this particular one, because we had imaging data for a long period without drift or other interference, and with good contrast at great depth.

      18) Line 712. A color-bar corresponding to this color-code is missing in the figure.

      This has been corrected.

      19) Lines 715-717. It seems panels E and E' are swapped in the legend.

      Corrected.

      20) Line 724 (Fig 2). It is difficult to read anything in panel K inset or Panel L inset.

      We have rearranged this figure and replaced some panels for greater clarity, and to remove redundancy.

      21) Line 728. What does "embryo 1" refer to?

      This was a remainder from an old plan where each embryo was numbered and listed in a table so that it could be cross-referred to. We have now described in the supplementary table the genotypes and imaging technique for each group of embryos. Where we show data or analyses of the same embryo in different figures, we refer directly to the relevant panels. We have made sure the embryos are referred to correctly in the figure legends.

      22) Line 732. A quantitative measure of the quality of the fits of the models to the experimental data should be included.

      We have done this, and the new data are now included in the new Figure 2.

      23) Line 739. What exactly does "Embryo 2" refer to?

      See comment 21

      24) Line 779. Why is a z-plane of 15 microns below surface chosen?

      25) Line 797. Why is a z-plane of 25 microns below the surface chosen?

      The planes were chosen in each case to show the reader in one single plane rows 7 and 8 along with the central cells.

      26) Line 900. Panel G in Supp Fig 5 is not described in figure description.

      The panel captions were wrongly numbered. This has now been corrected, and more information on this figure has been included in the text.

      • Are prior studies referenced appropriately?

      Yes.

      • Are the text and figures clear and accurate?

      No (see details listed above).

      • It would be very helpful to the reader to show direct quantitative comparison of the different mechanical models with the experimental observations to show how much better the nonlinear model is compared to the linear model.

      We have included this.

      An extended explanation of experiments and experimental results within the main text would improve the manuscript.

      We have expanded our explanations in many places.

      Significance:

      The key advance in this work is in identifying a potential role of nonlinear mechanical properties in contributing to distinct cell behaviors within a tissue during development in vivo. This contributes to a growing body of work highlighting the importance of cell and tissue mechanical properties in regulating cell behaviors during the formation of tissue structure.

      This work adds to a growing body of work connecting actomyosin contractility in cells to tissue-scale behavior during development. This work provides a unique mechanical modeling perspective to the study of apical constriction during Drosophila ventral furrow invagination, highlighting a potential role for superelastic cell mechanical behaviors during morphogenesis in vivo.

      The finding would be of interest to researchers working in the areas of morphogenesis, mechanobiology, the cytoskeleton, and active matter.

      This reviewer's expertise is in experimental studies of the cytoskeleton and cell mechanics during morphogenesis.

    1. “Oh! dear, there are a great many people like me, I dare say, only a great deal better. Good morning to you.” “But I say, Miss Morland, I shall come and pay my respects at Fullerton before it is long, if not disagreeable.” “Pray do. My father and mother will be very glad to see you.” “And I hope — I hope, Miss Morland, you will not be sorry to see me.” “Oh! dear, not at all. There are very few people I am sorry to see. Company is always cheerful.” “That is just my way of thinking. Give me but a little cheerful company, let me only have the company of the people I love, let me only be where I like and with whom I like, and the devil take the rest, say I. And I am heartily glad to hear you say the same. But I have a notion, Miss Morland, you and I think pretty much alike upon most matters.” “Perhaps we may; but it is more than I ever thought of. And as to most matters, to say the truth, there are not many that I know my own mind about.” “By Jove, no more do I. It is not my way to bother my brains with what does not concern me. My notion of things is simple enough. Let me only have the girl I like, say I, with a comfortable house over my head, and what care I for all the rest? Fortune is nothing. I am sure of a good income of my own; and if she had not a penny, why, so much the better.” “Very true. I think like you there. If there is a good fortune on one side, there can be no occasion for any on the other. No matter which has it, so that there is enough. I hate the idea of one great fortune looking out for another. And to marry for money I think the wickedest thing in existence. Good day. We shall be very glad to see you at Fullerton, whenever it is convenient.” And away she went. It was not in the power of all his gallantry to detain her longer. 
With such news to communicate, and such a visit to prepare for, her departure was not to be delayed by anything in his nature to urge; and she hurried away, leaving him to the undivided consciousness of his own happy address, and her explicit encouragement. The agitation which she had herself experienced on first learning her brother’s engagement made her expect to raise no inconsiderable emotion in Mr. and Mrs. Allen, by the communication of the wonderful event. How great was her disappointment! The important affair, which many words of preparation ushered in, had been foreseen by them both ever since her brother’s arrival; and all that they felt on the occasion was comprehended in a wish for the young people’s happiness, with a remark, on the gentleman’s side, in favour of Isabella’s beauty, and on the lady’s, of her great good luck. It was to Catherine the most surprising insensibility. The disclosure, however, of the great secret of James’s going to Fullerton the day before, did raise some emotion in Mrs. Allen. She could not listen to that with perfect calmness, but repeatedly regretted the necessity of its concealment, wished she could have known his intention, wished she could have seen him before he went, as she should certainly have troubled him with her best regards to his father and mother, and her kind complimen

      She thinks he wants to marry her. But could this be harmless flirting? She worries about his intentions.

    1. My dearest Catherine, I received your two kind letters with the greatest delight, and have a thousand apologies to make for not answering them sooner. I really am quite ashamed of my idleness; but in this horrid place one can find time for nothing. I have had my pen in my hand to begin a letter to you almost every day since you left Bath, but have always been prevented by some silly trifler or other. Pray write to me soon, and direct to my own home. Thank God, we leave this vile place tomorrow. Since you went away, I have had no pleasure in it — the dust is beyond anything; and everybody one cares for is gone. I believe if I could see you I should not mind the rest, for you are dearer to me than anybody can conceive. I am quite uneasy about your dear brother, not having heard from him since he went to Oxford; and am fearful of some misunderstanding. Your kind offices will set all right: he is the only man I ever did or could love, and I trust you will convince him of it. The spring fashions are partly down; and the hats the most frightful you can imagine. I hope you spend your time pleasantly, but am afraid you never think of me. I will not say all that I could of the family you are with, because I would not be ungenerous, or set you against those you esteem; but it is very difficult to know whom to trust, and young men never know their minds two days together. I rejoice to say that the young man whom, of all others, I particularly abhor, has left Bath. You will know, from this description, I must mean Captain Tilney, who, as you may remember, was amazingly disposed to follow and tease me, before you went away. Afterwards he got worse, and became quite my shadow. Many girls might have been taken in, for never were such attentions; but I knew the fickle sex too well. He went away to his regiment two days ago, and I trust I shall never be plagued with him again. He is the greatest coxcomb I ever saw, and amazingly disagreeable. 
The last two days he was always by the side of Charlotte Davis: I pitied his taste, but took no notice of him. The last time we met was in Bath Street, and I turned directly into a shop that he might not speak to me; I would not even look at him. He went into the pump–room afterwards; but I would not have followed him for all the world. Such a contrast between him and your brother! Pray send me some news of the latter — I am quite unhappy about him; he seemed so uncomfortable when he went away, with a cold, or something that affected his spirits. I would write to him myself, but have mislaid his direction; and, as I hinted above, am afraid he took something in my conduct amiss. Pray explain everything to his satisfaction; or, if he still harbours any doubt, a line from himself to me, or a call at Putney when next in town, might set all to rights. I have not been to the rooms this age, nor to the play, except going in last night with the Hodges, for a frolic, at half price: they teased me into it; and I was determined they should not say I shut myself up because Tilney was gone. We happened to sit by the Mitchells, and they pretended to be quite surprised to see me out. I knew their spite: at one time they could not be civil to me, but now they are all friendship; but I am not such a fool as to be taken in by them. You know I have a pretty good spirit of my own. Anne Mitchell had tried to put on a turban like mine, as I wore it the week before at the concert, but made wretched work of it — it happened to become my odd face, I believe, at least Tilney told me so at the time, and said every eye was upon me; but he is the last man whose word I would take. I wear nothing but purple now: I know I look hideous in it, but no matter — it is your dear brother’s favourite colour. Lose no time, my dearest, sweetest Catherine, in writing to him and to me, Who ever am, etc.

      This bears resemblance to Fantomina, where Beauplaisir grows bored of Fantomina after he has raped her and had his fun. From this it seems pretty clear Frederick broke up with her. Do you think Frederick was bored of her because he received sex? Or did he do this to punish her?

    1. Historical ecological studies can provide a baseline on which to design biodiversity recovery strategies and conservation goals. Ma

      One example I think of is the catacombs in Europe. Something we do not often think about is how unsustainable and harmful to the environment burying people after they die is. It causes deforestation, contaminates soils, and uses a lot of resources. I have been to Italy twice, and the second time I was able to go into the catacombs and learn about their history. Not a lot of people know this, but there are catacombs stretching throughout all of Europe, many of them under Rome. There are dozens of levels of catacombs, all stacked on top of one another under the ground. Most cannot be accessed because they have flooded or are not structurally sound. Rome is a large city, but most of Ancient Rome is not visible to the naked eye. There are layers upon layers of buildings under present-day buildings that have been covered in sediment due to that powerful river. I went into a building in ancient Rome that had 3 buildings underneath it. The one at the very bottom was from before 100 B.C. So you can imagine, there were a lot of people in Rome. They did not have much space and this river posed a threat to their crops. They couldn't afford to bury people like we do today, so they had to think of something clever. They stacked cemeteries on top of one another to reduce the amount of land area affected by the bodies. The hallways of the catacombs were designed for people to visit past family members, but that did not last long because of the smell of decomposition (yummy). The hallways of the catacombs are extremely small and narrow; I am 5'4" and had to duck the entire time. They have holes cut into the walls where bodies were laid to rest, multiple right next to one another to take up as little space as possible.

      This is an example of something we could take inspiration from when it comes to sustainable practices. The catacombs had their own issues, but some aspects of their design may be useful for the future of cemeteries. They reduced deforestation, which is something cemeteries are very bad about. Maybe we should start to stack our cemeteries in a similar way to reduce the number of acres cleared for the dead. We can do the same with the findings of this and other papers involving ancient civilizations. Learn from the past.

    1. more important than working to become “competent” in the cultures of those with whom we work and interact

      I really like their definition of cultural humility and find it interesting that there is a discussion on whether cultural humility or competency within a particular culture is more important. This reminds me of the "jack of all trades" vs. "expert in a field" discussion in veterinary careers. As veterinary students we build a very diverse background of knowledge to be able to address a wide variety of conditions and diseases. Some individuals decide to pursue board certification and expertise in a particular field of veterinary medicine. Even though a veterinarian may be boarded in cardiology and primarily see cardiac patients, it is important to remember other systemic conditions that can lead to similar symptoms or lesions and to be able to address them accordingly. From this standpoint, I think that proper cultural humility and competency within the particular cultures with which you work and interact can be equally important. If you predominantly work within a certain culture, it is natural to become more competent in interacting with individuals from that culture. At the same time, it is important to maintain cultural humility so you can properly and respectfully address people from all cultures.

    1. Feedback from the faculty teaching team after teaching for almost 8 weeks is how to template and simplify space for students to use, here is a direct quote: “could we create dedicated blog page for students that would be a pre-made, fool-proof template? When a student’s WordPress blog does not work and we can’t fix the problem, it is very frustrating to be helpless beside an exasperated student.”

      There may be a bit of a path forward here that some might consider using that has some fantastic flexibility.

      There is a WordPress plugin called Micropub (which needs to be used in conjunction with the IndieAuth plugin for authentication to their CMS account) that will allow students to log into various writing/posting applications.

      These are usually slimmed-down interfaces that don't provide the panoply of editing options that the Gutenberg interface or Classic editor metabox interfaces do. Quill is a good example of this and has a Medium.com-like interface. iA Writer is a solid markdown editor that has this functionality as well (though I think it only works on iOS presently).

      Students can write and then post from these, but still have the option to revisit their posts in the built-in editors to add any additional bells and whistles they might like if they're so inclined.

      This system is a bit like SPLOTs, but has a broader surface area and flexibility. I'll also mention that many of the Micropub clients are open source, so if one were inclined they could build their own custom posting interface specific to their exact needs. Even further, other CMSes like Known, Drupal, etc. either support this web specification out of the box or with plugins, so if you built a custom interface it could work just as well with other platforms that aren't just WordPress. This means that in a class where different students have chosen a variety of ways to set up their Domains, they can be exposed to a broader variety of editing tools or if the teacher chooses, they could be given a single editing interface that is exactly the same for everyone despite using different platforms.
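
      For a sense of how small a custom posting client can be, here is a minimal sketch of the form-encoded "create a post" request defined by the Micropub specification: an `h=entry` field, a `content` field, and a Bearer token in the `Authorization` header. The function name, endpoint URL, and token below are hypothetical placeholders; in practice a client discovers the endpoint from the site and obtains the token through IndieAuth. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_micropub_request(endpoint: str, token: str, content: str, name: str = "") -> Request:
    """Build (but do not send) a form-encoded Micropub 'create entry' request."""
    fields = {"h": "entry", "content": content}
    if name:
        fields["name"] = name  # optional post title
    body = urlencode(fields).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Hypothetical endpoint and token, for illustration only.
req = build_micropub_request(
    "https://student.example/micropub",
    "example-token",
    "Hello from a custom posting client",
)
```

      Passing `req` to `urllib.request.urlopen` would submit the post; per the specification, a successful create is answered with a 201 or 202 status and a `Location` header pointing at the new post.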

      For those who'd like to delve further, I did a WordPress-focused crash course session on the idea a while back:

      Micropub and WordPress: Custom Posting Applications at WordCamp Santa Clarita 2019 (slides)

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1:

      I think the experiments within the manuscript are generally of good quality and well controlled.

      We would like to thank the reviewer for the appreciation of our work.

      ...However, I find that the authors' conclusions are very often not supported by the experiments performed (as detailed below) and I would strongly recommend that the authors stick to the conclusions that can be drawn based on the data they have generated. In my opinion, this manuscript contains findings that are of interest to the field but it needs to be rewritten with more justifiable conclusions.

      We have extensively rewritten the manuscript and toned down the role of the HMR/LHR complex in hybrids while emphasizing its role in Drosophila melanogaster.

      1) 'Speciation Core Complex' - The only link to speciation is the fact that the 'SCC' includes D.melanogaster HMR, a known hybrid incompatibility gene. On the other hand, all of these proteins have important functions in a pure species context and all of the interactions reported between the members of the SCC occur in a D.melanogaster background. Also, SCC assembly in viable/inviable hybrids is not tested. Essentially, I would come up with a different and more functionally consistent name for the complex. I highly recommend against naming these stable interactors as the 'SCC' unless the authors can show that mutating any of the other 'SCC' proteins (specifically NLP, NPH, BOH1 & BOH2), which should presumably also disrupt SCC formation, leads to the rescue of hybrid male lethality?

      We agree with the reviewer that we base the naming of the complex on the presence of the products of the two known hybrid incompatibility genes Hmr and Lhr. As we did not investigate the complex's composition in hybrids, we agree with the three reviewers that the term SCC is probably misleading. We also agree with the reviewer that it would be highly interesting to investigate whether NLP, NPH, BOH1 or BOH2 mutations also rescue hybrid male lethality. However, we would need to generate fly lines carrying mutations in both the D.mel and the D.sim alleles since the respective genes are autosomal, and we feel that this would be beyond the scope of the manuscript. Moreover, such assays would only be possible if those genes are non-essential, unlike Nlp, of which the available hypomorphic or deletion alleles are homozygous lethal (Padeken, J. et al. (2013)).

      2) Is it a stable 6-membered complex? - The only line of evidence for the presence of a stable complex between all 6 proteins are the MS data from Figure 1C and Figure S1A-C. Although I don't think it is necessarily required, a biochemical demonstration that these proteins co-sediment at a high MW would be a much stronger indication of complex formation. That being said, I think the authors can use their expertise in AP-LC/MS to more comprehensively characterize complex formation.

      Besides the fact that we observe all six components in AP-MS experiments using any one of the subunits, we have also shown in our previous experiments (Thomae et al, 2013) that all subunits can be purified by a tandem purification using first an antibody against FLAG-HMR followed by a Myc-LHR antibody. We also tried to purify the HMR complex via size exclusion chromatography (SEC) to determine the size of the complex, as suggested by the reviewer. Unfortunately, we did not manage to isolate enough of the complex in a soluble form to detect a single peak on a size exclusion column. This may be due either to disassembly of the complex during the unavoidable dilution during SEC or to a lack of antibody sensitivity. We also tried to reconstitute the entire complex from recombinantly expressed proteins but failed to express all subunits in a soluble form. It is worth mentioning that a similar observation has been made, for example, for the Dosage Compensation Complex, which, despite being well characterized, has also eluded characterization by size exclusion chromatography.

      a) For example, the authors could test whether loss of BOH1/BOH2 in S2 cells impacts complex formation. A reduction of interactions between other complex members would strengthen the authors' conclusion of a stable and stoichiometric 6-membered complex.

      Based on our observation that HMR and LHR form a stable heterodimeric complex in vitro (Figure S4) we assume that the presence or absence of the other components does not affect the complex composition in its entirety. The experiment suggested by the reviewer would allow us to distinguish between direct and indirect interactions between BOH1/2 and HMR. Though this is clearly a very exciting approach, RNAi-mediated knockdowns are rarely complete in S2 cells, making such experiments difficult to interpret. Therefore, these experiments would need to be supported by reconstitution of the different complexes in vitro and potentially crosslinking MS experiments. Such extensive molecular analysis would very likely require at least 6 months to complete and would be beyond the scope of the current manuscript.

      b) Additionally, I would suggest that they use one (or more) of BOH1/BOH2/NLP/LHR as baits in the S2 cells expressing HMR mutations (HMR2 and HMR DC, Figure 3) to test complex formation. Beyond Figs. 1 and S1, the authors only test one-way interactions between HMR (or HMR mutants) and the other 5 binding partners. It is unclear if the other 5 'SCC' members are capable of binding each other when HMR is mutated. As a result, how HMR affects the ability of other proteins to interact with each other and its role in complex formation remains somewhat unclear. This is particularly important since the authors conclude in the discussion that "HMR acts as a molecular bridge between different modules of the SCC" and that "the integrity of the SCC is essential for its function".

      Similar to our answer to the reviewer’s suggestion above, we believe that this experiment requires an additional extensive molecular analysis to be meaningful, which is beyond the scope of the current manuscript. It is important to clarify here that the S2 cells we use still express endogenous full-length HMR, which could participate in complex formation even when Hmr mutant alleles are expressed. To unambiguously show that BOH1 and BOH2 still interact with the other complex components when they no longer associate with HMR, we would therefore need to generate a CRISPR-based exchange of all HMR genes in SL2 cells with a mutated version of HMR and analyze their interaction partners. As both alleles fail to fully rescue HMR functionality in a deletion background, and as we have shown previously that removal of HMR results in mitotic defects, it may not even be possible to generate such cell lines.

      3) Centromeric vs heterochromatic localization of HMR - There appears to be some differences between Hmr localization across different tissues as the authors have noted in their introduction. In this manuscript, the authors assess HMR localization in S2 cells as well as mitotic and endocycling follicle cells from various stages of oogenesis. In these cell types, the authors compare HMR localization to both Cenp-C (centromere) and HP1 (constitutive heterochromatin). In my opinion, it is not easy to get a clear perspective on what the authors consider to be HMR's true localization in these cells and tissues. I would recommend the following straightforward changes/experiments related to this point,

      a) Label the image categories in Figure 4A. Please also describe in detail the classification criteria that were used to separate these image categories from one another.

      In the revised manuscript we will label the image categories in Figure 4A. An extensive description of how the classification criteria were applied can be found in the methods section.

      b) I would also move Figure S7A to the main text since it demonstrates centromeric colocalization of HMR in early follicle cells.

      In the revised manuscript we will move figure S7A to a new figure 5C. We have furthermore investigated the localization of endogenous HMR in various cell types in ovaries, which is going to be included in the revised manuscript as a new figure 5A.

      c) Use linescans on existing images to better demonstrate colocalization between Hmr and Cenp-C and/or HP1

      In the revised manuscript we will prepare linescans/profile plots for all IF pictures when necessary.

      d) Show Cenp-A and HMR staining for the images in Figure 5C and stage 10 follicle cells from Figure S7A.

      As stainings with the Cenp-C antibody resulted in more stable and reproducible signals, we used Cenp-C as a proxy for Cenp-A and centromere localization. In Figure S7A and B we stained Cenp-C and showed a greatly reduced expression in follicle cells undergoing endoreplication. We therefore did not perform a Cenp-C (or Cenp-A)/HMR co-staining in these cells and do not think it would add to a better understanding of the mechanisms of HMR localization (Figure 5C).

      e) I feel the authors do not spend enough time discussing the fact that HMRDC still appears to localize to centromeres in most follicle cells up to Stage 7.

      We now also include the staining of endogenous HMR (figure 5A revised ms) in the various cell types in ovaries. This allows us to expand the discussion of HMR’s localization depending on the cell type and stage. These studies not only reveal the high diversity of HMR localization but also suggest that the potential of HMR to localize to the centromere as well as pericentromeric heterochromatin is crucial for its function. In the revised manuscript we have now discussed the fact that HMRdC still localizes to the centromere up to stage 7 more extensively.

      In sum, it would also be nice for the authors to take a clear position on whether HMR is centromeric, heterochromatic or both in the cells they analyze by microscopy and why these localizations may change between the cells they have looked at.

      The fact that we now include a novel figure where we investigate HMR’s localization in different cell types allows us to discuss the (diverse) localization as well as its potential regulation more extensively. As the localization is highly dependent on the cell type observed as well as the cell cycle stage, we feel that these aspects need to be taken into account when describing HMR’s localization. This is now discussed in the revised manuscript.

      4) HMR2 analyses - I think HMR2 is an important mutant to include as a control for HMRDC, especially since the authors should already have the required strains/data. I specifically mean the following,

      a) Figure 4C - Please add HMR2 ChIP-seq tracks only if the authors already have this data.

      Unfortunately, we were unable to acquire convincing HMR2-ChIP data. This may be due to the fact that HMR2 localizes quite diffusely or due to a lower percentage of cells expressing this allele in the S2 lines used. Neither issue influences our interpretations in AP-MS experiments or in single-cell-based fluorescence microscopy assays, but both are problematic in bulk cell population assays like ChIP. Therefore, we cannot provide good HMR2 ChIP-Seq tracks.

      b) Figure 5C and Figure S7B - Add HMR2 IF images. Please also discuss HMR2 localization to centromeres and heterochromatin.

      In the revised manuscript, we have attached or will attach IF images of ovarian tissue made from strains heterozygous for the Hmr2 allele. Due to the lower gene dosage, the intensity of HMR stainings is reduced, making a precise localization more difficult. As the manuscript mainly focusses on the description of the newly discovered HmrdC allele, we have added this as supplemental material.

      c) Figure 5E - Increase n's for the HMR2 fertility assay.

      The HMR2 allele has been extensively characterized by Aruna and colleagues (Aruna et al., Genetics (2009)) with regard to its effect on fertility. For this particular assay we only use it as a positive control and reference for the newly described HMRdC allele. We therefore feel that an increase in the number of replicates would be redundant to the earlier publications.

      5) HMR localization in female germline cells - Given that the authors indicate that female fertility and telomeric transposon suppression are compromised with HMR2 and HMRDC, I think it would strengthen the manuscript to address HMR localization with respect to heterochromatin and centromeres in the nurse cells and/or oocytes.

      We now also include the staining of endogenous HMR (figure 5A revised ms) in nurse cells, oocytes and early-stage follicle cells. This allows us to expand the discussion of HMR’s localization depending on the cell type and stage.

      6) I find the last part of the abstract and discussion i.e. HMR bridges heterochromatin and the centromere, to be very speculative based on the data presented. As far as I can tell, the only experimental basis for this conclusion is the fact that HMR binds known centromeric and heterochromatic proteins. With this logic, you could easily make a similar argument for the numerous proteins that colocalize with centromeric and pericentromeric heterochromatin. Personally, I would not speculate extensively on a HMR bridging activity without more compelling functional readouts.

      Our hypothesis of HMR as a bridging factor between centromeric and pericentromeric heterochromatin is not only based on its colocalization and interaction with components of both chromatin types but also on our previous findings that an HMR knockdown results in a moderate centromere declustering and on studies using super-resolution microscopy, which indicate that HMR is sandwiched between the two components (Kochanova, N. Y. et al. (2020)). As the proteomic analysis of the two HMR alleles presented in this study suggests that interactions with both components are required for full functionality of HMR, we assume that it bridges between the two chromatin components. However, we agree with the reviewer that this could also be explained by a centromeric as well as a heterochromatic function of HMR, which are independent from each other. We therefore removed the hypothesis from the abstract and discussed it together with other potential explanations for our findings.

      **Minor comments:**

      1) Intersection plot - I would explain the intersection plot on Figure 1C more thoroughly (I found it confusing).

      We expanded the paragraph in which we explain the intersection plot in figure 1C.

      2) Image colours - The images in Figure S2 and Figure S7 are hard to interpret due to the colours used for the HA and Hmr channel respectively. I would use the white pseudo-colour for DAPI and omit this channel from the merged image and insets (a line demarcating the nucleus would suffice in the merged image). In addition, a linescan would better represent colocalizations or lack thereof.

      We will omit the DAPI channel from the merged images and use a line to demarcate the nucleus as suggested by the reviewer in the revised manuscript. To better illustrate co-localisation of distinct factors we will use line profile plots.

      3) I'm not convinced that one can determine stoichiometry and sub-stoichiometry of protein complexes based on spectral counts; spectral counts could be affected by other factors. Therefore, I would hesitate to use "However, HP1a is only present in sub-stoichiometric amounts in the AP-MS purifications with antibodies against the SCC...."

      The question of whether the stoichiometry of complexes can be determined using iBAQ values of purified protein complexes is intensely discussed in the field. Several studies do suggest that this can indeed be done (e.g. Wohlgemuth, I. et al. Proteomics 15, 862–879 (2015); Smits, A. H., Nucleic Acids Research 41, e28–e28 (2012)), which is why we commented on the lower intensity of HP1a relative to the other subunits of the complex. However, we agree with the reviewer that this can only be an approximation rather than a precise measurement (which would need a full in vitro reconstitution, see comments above). We have mentioned this in the revised manuscript.

      4) Ambiguity in description of methods - In the methods section 'Crosses for generating Hmr genotypes for hybrid viability assays', the authors state that "In the rescue experiment, Hmr+ served as a positive (lethality rescue) and Hmr2 as a negative control (no lethality rescue)". The authors might consider rewording this as I think it's a bit strange to refer to hybrid male lethality as a rescued state.

      We agree with the reviewer that the wording to describe the assay we used to investigate HMR’s function in male hybrids is counterintuitive as a “rescue of functionality” results in male hybrid lethality. To better describe it we now call the assay “hybrid viability suppression”, according to the nomenclature that has been used by Aruna et al, 2009 (Aruna, S. et al. Genetics (2009)).

      Reviewer #1 (Significance (Required)):

      **Nature and Significance of the advance:**

      This work adds to the study of reproductive isolation in Drosophila by defining a stable set of molecular interactors of the HMR hybrid incompatibility protein. In my opinion, this study offers a platform for future research into the poorly understood molecular events that trigger hybrid incompatibility in Drosophila. In addition, the authors generate a novel HMR mutation (HMRDC) that also rescues hybrid male lethality and it would be interesting to determine in finer detail how closely this mutation mimics other known HMR mutations. A characterization of BOH1/BOH2 would have also significantly strengthened the manuscript.

      We would like to thank the reviewer for the appreciation of our work. We agree with the reviewer that a deeper characterization of BOH1/BOH2 will further unravel their role in the complex. However, our initial experiments using null alleles or knockdowns of BOH1 and BOH2 in D.mel showed no effect or only minor effects on transposon activation and hybrid male lethality. This is most probably due to the fact that the D.sim alleles can fully complement their function. Moreover, the recombinant expression of BOH1 and 2 turned out to be difficult due to problems in protein solubility. We therefore need to postpone our BOH1 and 2 studies to a later timepoint.

      **My Expertise:**

      Satellite DNA repeats, Chromocenters, Speciation, Hybrid Incompatibility

      **Referees cross-commenting**

      I also agree that all the reviewer comments are reasonable. The manuscript would be significantly improved by making conclusions that can be supported by the data. I think some additional experiments are also warranted to make the paper more robust.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this study, the authors identify a protein complex that contains hybrid incompatibility genes Hmr and Lhr, naming it SCC (Speciation Core Complex). This paper's major conclusions are: 1) overexpression of Hmr (which resembles the situation in hybrid, where hmr/lhr are overexpressed) results in ectopic protein-protein interaction. 2) Hmr's DNA binding domain (mutated in Hmr2) and C-terminal domain (known to interact with Lhr) are important for its function and in causing hybrid lethality.

      The identification of SCC complex is quite intriguing, but this paper does not cover much of functional significance of this complex at all. For example, does mutating other components of SCC complex (BOH1 etc) rescue hybrid lethality? Without examining these important issues, they instead drifted to study the domain function of Hmr. It is not so clear why these two lines of studies are glued together in one paper.

      It is not that I insist that the authors have to do all these experiments, but the assembly of the paper makes this paper quite inconclusive. After reading it, the readers are left behind wondering what is the function of SCC---and we do not even know whether 'speciation core complex' is a fair naming, without any knowledge whether any of the components being involved in speciation or not.

      Overall, this work contains a lot of important information, which promises future breakthrough on the subject matter. However, unfortunately, the study is not carried out to generate any conclusion and is fairly incomplete at this point.

      We thank the reviewer for his appreciation of the importance of our work and apologize that we did not clarify the reasoning of the experiments sufficiently. We think that part of the reviewer’s disappointment is due to the fact that we named the complex speciation core complex (SCC), which was indeed an unfortunate decision as we are unable to investigate the complex in male hybrids where it exerts its function in mediating hybrid incompatibility (see also answer to comments of reviewer 1). We therefore changed the name to HMR complex and tried to better explain the rationale of our experiments in the text.

      **Specific comments.**

      • Quality of Fig4A is too low. I cannot even tell where is the boundary of nucleus. Diffuse signal in category 'yellow' and 'grey'---are they entire cell or nucleus or nucleolus? Please add additional marker(s) for better interpretation of the Hmr signal presented.

      We have improved the quality of figure 4A by adding lines to indicate the nuclear boundary and inserting profile plots to better illustrate the different types of co-localisation.

      • In Fig4A and 5C, the localization of Hmr (wild type version) looks quite different in these two images. Which image is more 'representative' for Hmr localization? (as they build the logic on Hmr localization, this inconsistency is quite bothering). This might be cell-type-specific issue, but if so, how do we know the relevance of their localization? These issues make the result of localization analysis of wt/mutant Hmr inconclusive.

      After reading the reviewers’ responses we realized that we did not describe our findings well enough, which resulted in a major confusion about the localization of HMR in cells. Indeed, the localization of HMR differs widely depending on the cell type used. We have now included a new figure (new Figure 5A) illustrating the analysis of the endogenous HMR localization in ovaries isolated from D.mel. We hope that the additional figure together with our interpretation helps to alleviate the confusion and adds to the understanding of HMR’s function and potential evolution of HMR.

      Reviewer #2 (Significance (Required)):

      Hmr and Lhr are known as 'hybrid incompatibility genes', deletion of which rescues male hybrid lethality in Drosophila melanogaster/simulans hybrid crosses. Understanding the molecular function of Hmr and Lhr is expected to provide insights into the fundamental question of how two species become incompatible (i.e. how speciation occurs). This study investigates the protein complex that contains Lhr and Hmr, identifying a previously unidentified 'core' complex. Understanding the function of this complex may significantly advance our understanding of speciation.

      **Referees cross-commenting**

      I think all review comments are reasonable. However, I'd like to emphasize that the biggest issue with this paper is not about the data, but how the authors frame it. The term such as 'speciation core complex' is beyond 'hype' (not even 'exaggeration'). Simply there is no evidence that this term can be supported. I think the authors need to be more ethical. I would be surprised if authors truly believe they can claim that the term 'speciation core complex' is justifiable in science.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The manuscript "The integrity of the speciation core complex is necessary for centromeric binding and reproductive isolation in Drosophila" by Lukacs and colleagues describes a study that shows, by mass-spec and ChIP-seq, that two well established hybrid incompatibility proteins form a 6-protein complex that predominantly localizes near HP1a bound chromatin boundaries. With a C-terminal domain of HMR deleted, the 6-protein core complex was not disrupted, but its interaction and subsequent localization to HP1a domain near centromeres was lost. In addition, an HMR double mutant that disrupts the interaction between HMR and other components of the 6-protein core complex was tested and similar distribution patterns as for the dC mutant were observed. Next, the nuclear localization of HMR was tested in fruit fly follicle cells by IF. In endoreplicating cells, HMR-dC did not colocalize with HP1a, as did the double mutant. The expression level of several transposable elements (TEs) was assessed and only the full length wt Hmr transgene was able to rescue the repression of TEs, whereas neither the dC nor the double mutant did. When the number of offspring was assayed, a similar pattern was observed. Finally, male hybrid lethality was assayed by crossing D melanogaster mothers with different Hmr alleles with wt D simulans and only the wt Hmr allele resulted in male lethality, whereas both dC and double mutants resulted in 10-40% of the offspring being male. These findings led the authors to conclude that 1) a 6-protein speciation core complex exists, containing HMR, LHR, NLP, NPH, and two uncharacterized proteins called BOH1 and BOH2, 2) overexpression of HMR/LHR results in novel interactions with other chromatin factors, 3) both the double mutant (E317K and G527A) and the C-terminal deletion mutant are important for protein-protein interaction within the 6-protein complex and associated factors such as HP1a, and 4) HMR bridges heterochromatin and centromeres.

      **Major comments:**

      • Most of the key conclusions are supported by the evidence presented in this manuscript. The link between centromeres and HMR (and presumably the rest of the 6-protein complex) hinges only on colocalization IF and ChIP-seq data. The change in Hmr localization in cycling follicle vs endoreplicating cells of especially the dC mutant is very interesting. The loss of CENP-C signal correlates with a change in Hmr^dC signal. What exactly drives this change is not explored.

      We have shown in the past that HMR requires full length Cenp-C to localize to the centromere in S2 cells. We assume that this is also the case in the follicle cells. Therefore, the lack of Cenp-C recruitment in endoreplicating cells is likely the reason why HMR localizes primarily to HP1a containing heterochromatin. Differently from wild type HMR, HMRdC can’t bind LHR/HP1a as our AP-MS data show and therefore is not recruited to heterochromatin and diffuses away in later stages. We have described this point more extensively in the revised manuscript.

      • The data presented in this manuscript are mostly clear (see minor comments) and appear to be reproducible, especially as the methods section is detailed and both the ChIP-seq and mass-spec data are deposited in publicly accessible databases.
      • The rationale why both HMR and LHR are overexpressed in cell lines is not clearly explained.

      As outlined in our response to reviewer 1 the overexpression of HMR and LHR was designed to simulate the hybrid situation, which shows an increase in HMR and LHR levels (Thomae, A. W. et al. Developmental Cell 27, 412–424 (2013)). We have indicated this in the revised manuscript.

      • The HMR/LHR overexpression experiment is very nice, and as one would expect, resulted in more protein interactions. Some of these might simply result from the abundance of HMR and LHR, which have saturated the core 6-protein complex. This leaves the question what is the true minimal size of the HMR/LHR complex? The dC mutant that removes the BESS domain as well as the double point mutations that disrupt the complex altogether, get to the importance of the stability of the complex and its association with especially HP1a. What the minimal interacting partners of HMR and LHR are could be explored by knocking down both factors and performing mass-spec.

      We agree with the reviewer that the abundance of HMR and LHR results in a saturation of the core complex thereby having a spillover effect on other proteins. In this regard it is worth mentioning that the expression of the Hmr2 allele does not completely disrupt the complex but rather results in a loss of interactions with NLP, NPH, BOH1 and BOH2 while maintaining the interaction with LHR and HP1a. In fact, when the HMR2 protein is expressed, it shows a stronger interaction with known heterochromatic proteins than the wt protein (Figure 3B). As both mutant alleles show functional defects in pure species and in male hybrids we assume that HMR and LHR need to bind both chromatin types simultaneously. We consider the complex to be somewhat modular as we show that HMR and LHR can interact in isolation (Figure S4) while others have shown that LHR and HP1a, as well as NLP and NPH interact (Greil, F. et al. EMBO J (2007); Anselm, E. et al. Nucleic Acids Research (2018), respectively). This is now pointed out in the revised manuscript.

      • For the telomeric TE expression as well as offspring count shown in Figure 5D,E, a wild-type control would be informative as a measure how well the Hmr+/+ rescues both phenotypes.

      The misregulation of transposable elements (TE) and fertility defects of Hmr loss of function mutants have been previously characterized (Satyaki, P. R. V. et al. PLoS genetics (2014); Aruna et al., Genetics (2009)). We therefore rather focused on the relative expression of TEs in the HmrdC and Hmr2 mutants relative to the wild type rescue allele (Hmr+). Hmr2 serves as a known non-rescue allele (Aruna et al., 2009) in the fertility experiment, while in the TE experiment we describe for the first time a defect in TE repression for this allele.

      **Minor comments:**

      • In the opening paragraph of the introduction, the authors describe a scenario of sympatric speciation, which is subsequently highlighted by the speciation event between D. melanogaster and D. simulans. Yet, these two species have similar but not identical distribution ranges, leaving open the possibility the speciation event happened in parapatry. It might be worth rephrasing the first paragraph to leave open both modes of speciation, especially as the manuscript focuses on the mechanistic side of hybrid incompatibility-associated proteins.

      We did not want to imply that our experiments allow a distinction between a sympatric or parapatric speciation. We thank the reviewer for pointing this out and rephrased the first paragraph accordingly.

      • Some of the abbreviations are repeated (e.g. SCC) others aren't introduced (e.g. HI). Overall, fewer abbreviations will make the text more readable, especially for non-experts.

      We tried to avoid acronyms wherever possible and got rid of the term SCC altogether. All acronyms are introduced at the first appearance.

      • The IF signal in Figure 4A is difficult to see on the black background. I would suggest either increasing the gain to improve the visibility of the signal or show in black-and-white. In addition, the colors should be labeled in the figure for clarity.

      We improved the quality of Figure 4A and labeled the different types of localization (see also answer to reviewer 1).

      • In Figure 5C the images for the Hmr^KO;Hmr^2 appear to be missing.

      See answer to reviewer 1 (4b). We have included or will include the corresponding picture as supplementary material as we consider the characterization of the novel Hmr allele to be the main focus of the manuscript.

      In addition, for non-experts it might be helpful to mention which set of IF images are controls, rescues, and test, similar to what was done in Figure 5B.

      We have indicated or will indicate which IF pictures are controls and rescue experiments.

      Reviewer #3 (Significance (Required)):

      **Significance:**

      • This study provides novel insight into two factors involved in male hybrid lethality, which chromatin factors they are associated with, and how two mutants impact the chromatin localization and in vivo phenotypes.
      • Understanding the molecular basis of speciation is limited as most factors that drive speciation are not identified. Drosophila species are at the forefront of this research. Post-zygotic factors have predominantly been found to have strong speciation potential. This work builds very nicely on this work.
      • This manuscript will be predominantly interesting for the Drosophila chromatin field and speciation field.
      • I am trained in comparative genomics focusing on centromeric repeats and now study chromatin dynamics at the single molecule level, using cell biology, biochemical and biophysical tools.

      We thank the reviewer for appreciating our work. We think that our work will also be interesting for researchers focusing on centromere clustering and genome organization in general and independently of the Drosophila system.

      **Referees cross-commenting**

      Reviewer comments look reasonable to me- 1-3 months revision is not an undue burden, I think they can do at least some of what was requested. In response to Rev2: Agreed, they ought to tone it down

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this study, the authors identify a protein complex that contains hybrid incompatibility genes Hmr and Lhr, naming it SCC (Speciation Core Complex). This paper's major conclusions are: 1) overexpression of Hmr (which resembles the situation in hybrid, where hmr/lhr are overexpressed) results in ectopic protein-protein interaction. 2) Hmr's DNA binding domain (mutated in Hmr2) and C-terminal domain (known to interact with Lhr) are important for its function and in causing hybrid lethality.

      The identification of SCC complex is quite intriguing, but this paper does not cover much of functional significance of this complex at all. For example, does mutating other components of SCC complex (BOH1 etc) rescue hybrid lethality? Without examining these important issues, they instead drifted to study the domain function of Hmr. It is not so clear why these two lines of studies are glued together in one paper.

      It is not that I insist that the authors have to do all these experiments, but the assembly of the paper makes this paper quite inconclusive. After reading it, the readers are left behind wondering what is the function of SCC---and we do not even know whether 'speciation core complex' is a fair naming, without any knowledge whether any of the components being involved in speciation or not.

      Overall, this work contains a lot of important information, which promises future breakthrough on the subject matter. However, unfortunately, the study is not carried out to generate any conclusion and is fairly incomplete at this point.

      Specific comments.

      • Quality of Fig4A is too low. I cannot even tell where is the boundary of nucleus. Diffuse signal in category 'yellow' and 'grey'---are they entire cell or nucleus or nucleolus? Please add additional marker(s) for better interpretation of the Hmr signal presented.
      • In Fig4A and 5C, the localization of Hmr (wild type version) looks quite different in these two images. Which image is more 'representative' for Hmr localization? (as they build the logic on Hmr localization, this inconsistency is quite bothering). This might be cell-type-specific issue, but if so, how do we know the relevance of their localization? These issues make the result of localization analysis of wt/mutant Hmr inconclusive.

      Significance

      Hmr and Lhr are known as 'hybrid incompatibility genes', deletion of which rescues male hybrid lethality in Drosophila melanogaster/simulans hybrid crosses. Understanding the molecular function of Hmr and Lhr is expected to provide insights into the fundamental question of how two species become incompatible (i.e. how speciation occurs). This study investigates the protein complex that contains Lhr and Hmr, identifying a previously unidentified 'core' complex. Understanding the function of this complex may significantly advance our understanding of speciation.

      **Referees cross-commenting**

      I think all review comments are reasonable. However, I'd like to emphasize that the biggest issue with this paper is not about the data, but how the authors frame it. The term such as 'speciation core complex' is beyond 'hype' (not even 'exaggeration'). Simply there is no evidence that this term can be supported. I think the authors need to be more ethical. I would be surprised if authors truly believe they can claim that the term 'speciation core complex' is justifiable in science.

    1. switched to biodegradable packaging

      Why do you think they don't make the switch? Is it just because single-use plastic is cheaper? How can we as consumers hold these companies accountable, when there may not be other options?

    1. As for the land, oceanic inventories are likely very incomplete. For example, there are more than 500 species of the lovely and medically important genus of marine snail, Conus. Of the 316 species of Conus from the Indo-Pacific region, Röckel et al. (1995) find that nearly 14% were described in the 20 years before their publication.

      Much of the ocean is undiscovered, as we can read here, but think of the species we do not know about among the creatures we can't see. Viruses and bacteria are so diverse, and understanding them may lead to learning about new biological systems.

    1. Swarming may overturn the existing order of world power

      Think about this from a surveillance lens... The surveillance lens tells us that as technology develops, and the capacity to store data increases (and centralizes/decentralizes), surveillance increases, becomes more precise, domineering, and increases its power for control and subjugation, and that the 'existing order of the world' evolves as the power of surveillance increases. Remember: docile bodies: it's not that it becomes more efficient, it becomes more effective through its ability to become internalized and self-perpetuated by subject populations. Here, with this new type of data-collecting military technology, we see the capacity emerging for the 'existing order of world power' to evolve to new thresholds once again. (As in, from sovereign, to discipline, to control).

    1. It is often difficult for conservation officials to understand the histories of peoples that may have preceded a park or protected area.

      I feel like we as a society have always struggled with trying to understand the history of people's cultures. And why do you think that is?

    1. But who controls articulation? Because the English language is a multifaceted oration Subject to indefinite transformation Now you may think that it is ignorant to speak broken English But I’m here to tell you that even “articulate” Americans sound foolish to the British

      I loved the last line, but essentially she's right: it's like there are grammar Nazis dictating how language is supposed to be used and performed, in a sense. Language is a tool that we as humans use to express abstract ideas and, better yet, the spectrum of emotion that we sometimes can't find words for. How much do you want to bet another language has a spot-on way of describing that very feeling? Anyway, language has come a long way, but I guess her point is it's still got a ways to go; it's still changing.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their positive comments on our manuscript. To address their criticisms, we propose to do the following experiments:

      Reviewer 1 (minor comments):

      1. In Fig. 1, the authors show that Btz-WT, but not Btz-HD, localizes to the posterior pole of the oocyte. Do the authors see Btz-WT and/or Btz-HD localized to MNs/muscles/glia at the NMJ?

      We have had difficulty detecting the expression of our Btz-GFP transgenes at the NMJ. In case this was due to competition with endogenous wild-type Btz, we will repeat the staining in a btz mutant background. If the protein is still undetectable, we can include data showing the localization of UAS-Btz-GFP when overexpressed in muscles or motor neurons.

      The mitochondrial phenotypes observed in Btz mutants are striking. But it seems possible that there are defects in overall mitochondrial levels in muscle in addition to defects in their localization. Overall, mitochondrial levels seemed reduced in Btz mutants. Is it possible to do a ATP5A immunoblot in Btz mutants to test whether overall mitochondrial levels are altered?

      We will do a Western blot to compare ATP5A levels in btz2/+ and btz2/Df(3R)BSC497 larval carcasses.

      ECM proteins are known to be critical for regulating TGFB signaling. That, taken with the multi-tissue genetic requirement for Btz, suggests that Btz might directly regulate either Ltl or Frac RNA, given that these ECM proteins are likely deposited by multiple cell types.

      We agree that this is a possibility and we will mention it in the Discussion.

      Reviewer 2 (major comments):

      1. In Figure 1, regarding the validation of rescue constructs: the EJC interaction-defective mutant is based solely on conservation, as all structural/interaction studies cited with Btz bound to EJC have been with human proteins. They use Vasa localization as a readout of EJC-dependent function, but this is indirect and only assesses one aspect of EJC function (localization). Since many of the main conclusions in the paper are predicated on this mutant being EJC-independent, they should validate this with the Drosophila orthologs using immunoprecipitation. They demonstrate the capability of expressing GFP-tagged versions of Casc3 WT and mutant in S2 cells, so this should not be a cumbersome control experiment to include.

      We will express tagged Btz-WT and Btz-HD proteins in S2 cells and test whether they can be co-immunoprecipitated with Myc-tagged Drosophila eIF4AIII.

      Regarding Figure 3, it could be postulated that the number of boutons would be influenced by the length of axons. Is axon outgrowth accounted for in these experiments? This would influence number of synaptic boutons. Panel F looks very different from panel A in terms of axon length (could this be due to axon outgrowth defect and/or impacted muscle size?) Can quantitation be done also by normalizing to axon length (bouton number/axon length)? Or perhaps this is accounted for in muscle size? If so, this should be explained.

      The NMJ grows during development by adding both axonal branches and synaptic boutons, so its size can be measured by counting the number of boutons or branches or measuring branch length. These measures are usually well correlated. In this paper we used bouton number normalized to muscle surface area as our measure of NMJ size, but we did observe corresponding changes in the number and length of branches, as the reviewer points out. We will explain this more clearly in the text.

      In Figure 3 quantification: n's vary between genotypes significantly, and this should be explained (e.g. was there a recovery issue between genotypes or just fewer needed for WT-like?).

      The btz mutant larvae are more difficult to dissect due to muscle fragility, and some crosses in this genetic background may have yielded fewer usable filets than desired. We believe the numbers we obtained are sufficient to show which differences are significant.

      In Figure 4 panels B and F (mutants), there appears to be reduced axon outgrowth (see point above). This should be taken into account when expressing bouton number.

      As explained in our response to point 2, axon length and bouton number are correlated measures of synapse size and vary together in this figure as expected.

      The RNA-seq data (Figure 5) has a potential issue in that they used larvae with a balancer chromosome (Df), which yields a 50% reduction in any genes on that chromosome. They acknowledge this and removed these genes from the analysis, but the concern remains that this still might be a confounding variable (for example, if reduction in any of these genes might disrupt a signaling pathway). We do not think that the RNA-seq needs to be repeated, but we propose that the authors validate these targets using qPCR in their MN-specific btz knockdown system (this way, they can also include magoh and eif4aIII knockdowns for comparison).

      Because only one btz allele was available, we used transheterozygotes with a deficiency for the region to avoid homozygosing other mutations that might be present on the btz2 chromosome. As a consequence, we did observe reduced expression of genes located within the deficiency (which covers a small region, not an entire chromosome), and it is possible that this might contribute to the phenotype. However, we have seen a similar reduction in NMJ size in btz2 homozygotes. We do not think that motor neuron-specific btz knockdown is a useful genotype to validate the RNA-Seq results because ltl and frac levels do not change significantly in the CNS, only in muscle, and knockdown only in motor neurons would be unlikely to change daw levels measured in the whole CNS. Knocking down mago or eIF4AIII in muscle is lethal before the third larval instar stage, preventing us from comparing their effects on gene expression to those of btz. However, we will do qRT-PCR to measure daw, ltl and frac mRNA levels in btz2 homozygous mutant muscles.

      Reviewer 2 (minor comments):

      1. Some statements made in the introduction that are not entirely accurate:

        "A fourth core subunit, known as Barentsz (Btz), Cancer susceptibility candidate gene 3 (CASC3), or Metastatic lymph node 51 (MLN51), associates with the complex following the completion of splicing, and is required for the effects of the EJC on translation, NMD and mRNA localization (Chazal et al., 2013; Palacios et al., 2004; Shibuya et al., 2006; van Eeden et al., 2001)."

        A recent study indicates that Casc3 is not required for EJC-dependent NMD targets in human cells, but rather enhances NMD on a subset of targets (Gerbracht et al. 2020 NAR). Perhaps "is required" should be changed to "plays a role in cytoplasmic EJC-mediated processes, such as...". It has also been shown that EJC core can assemble without Casc3 (e.g. Ballut et al 2005 NSMB, Gehring et al 2009 PLoS Biol). Previous work from the authors shows that Casc3 (Btz) is not necessary for EJC function in pre-mRNA splicing (Roignant et al, 2010 Cell). Further, there exists a population of Casc3 lacking EJCs in human cells (Mabin et al 2018 Cell Reports). Collectively, all this evidence points to Casc3 not being a core EJC subunit.

      We will change the text so that we do not refer to Btz/Casc3 as a core subunit.

        • "In the mouse brain, haploinsufficiency for Magoh, Rbm8a or Eif4a3 causes severe microcephaly, but complete loss of Casc3 has a much milder effect that can be attributed to developmental delay (Mao et al., 2017; Mao et al., 2016; Mao et al., 2015; Silver et al., 2010)."

        From Mao et al 2017: complete loss and hypomorphic mutants were embryonic and perinatally lethal (contrary to what the authors are stating here), while compound mutants and heterozygotes exhibited neurodevelopmental delay. By "milder effects" the authors could also be referring to brain size being proportional to body size in the complete loss homozygotes; either way, this should be clarified.

      By “milder effects” we meant the effect on brain size. We will clarify this in the revised text.

      Fly-specific nomenclature could be made more accessible to a broader audience, as the full readership will likely not have expertise in Drosophila genetics. For example, w118, btz2 labels used in figures are not explained anywhere in the manuscript. While the authors do a good job of describing various mutants in a more accessible fashion in the results section, the genotype labels in figures can be better explained in the legends.

      We apologize for this and will clarify the genotype labels in the figure legends.

      Fig 2 L-N panels might warrant more explanation. Can the mitochondria be counted here? Is there also a difference in volume/morphology that could be quantitated? In Figure 2N, muscle fibers are more densely packed in mutant vs. control; can this be explained?

      We are hesitant to quantify mitochondria or comment on muscle fiber packing based on the EM images, because only one individual of each genotype was examined. We prefer to simply use these images to provide a higher resolution view of the change in mitochondrial distribution that we observed and quantified using light microscopy. However, we do plan to do a Western blot to determine whether there are changes in the number of mitochondria in btz mutants (see Reviewer 1 point 2).

      In Fig 2, to draw parallels between panels A-K and L-N, it might also be helpful to use the red/yellow arrow system on panel A for comparison.

      This is a good suggestion that we will follow.

      In Figure 3, it might be helpful for a general audience to include zoomed-in picture of boutons (as in Fig 5B), as some panels appear to have less defined bouton shape.

      We do observe that boutons tend to be less well separated from each other in btz mutants, and will include zoomed-in pictures to document this.

      Is the bouton size different in the mutant in Figure 3? Can this be quantified?

      We do not think that there is a significant difference in bouton size in btz mutants, but we will measure this and include a quantification.

      Fold changes are modest and not very apparent in staining (we acknowledge that this could be due to early developmental time point). Images could better point out differences in WT vs. mutant that are not readily apparent to those outside the fly neurodevelopment audience.

      Because of the inherent variability in synapse shape, it can be difficult to appreciate changes in bouton number from a single image. However, our quantifications show that the changes are consistent and significant.

      Fig 4 NMJs are shown on a different scale (more zoomed in) than in Figure 3, and differences are a bit easier to see at this scale. Presenting Fig 3 on this scale might help the reader with visualizing the differences in WT versus mutant.

      We will crop the images in Figure 3 so as to show them at the same scale as in Figure 4.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      Ho et al. describe the developmental functions of the Drosophila Casc3 ortholog, Barentsz (Btz), using in vivo loss-of-function and rescue experiments in Drosophila larvae. In this study, the authors find that loss of Casc3 contributes to neuromuscular defects in the larval fly. Utilizing transgenics of WT and EJC interaction-defective mutants, they demonstrate that Btz has both EJC-dependent and independent functions in the larval neuromuscular junction, wherein muscle defects are EJC-dependent and synaptic defects are EJC-independent. Using RNA-seq, they find that upregulated mRNAs include those that belong to the Activin signaling pathway. They go on to find that the neuromuscular defects in Btz mutants can be attributed to dysregulation of Activin signaling, and are rescued with loss of the Activin ligand, Dawdle (Daw).

      Major Comments

      Overall, the paper presents well-controlled experiments that support the main conclusions. We propose achievable validation experiments that we believe will strengthen the conclusions of the paper. There is some concern that the magnitude of the effects is overstated, or could be made more apparent to a broader audience (i.e. those in the mRNA regulation field beyond Drosophila geneticists).

      • In Figure 1, regarding the validation of rescue constructs: the EJC interaction-defective mutant is based solely on conservation, as all structural/interaction studies cited with Btz bound to EJC have been with human proteins. They use Vasa localization as a readout of EJC-dependent function, but this is indirect and only assesses one aspect of EJC function (localization). Since many of the main conclusions in the paper are predicated on this mutant being EJC-independent, they should validate this with the Drosophila orthologs using immunoprecipitation. They demonstrate the capability of expressing GFP-tagged versions of Casc3 WT and mutant in S2 cells, so this should not be a cumbersome control experiment to include.

      • Regarding Figure 3, it could be postulated that the number of boutons would be influenced by the length of axons. Is axon outgrowth accounted for in these experiments? This would influence number of synaptic boutons. Panel F looks very different from panel A in terms of axon length (could this be due to axon outgrowth defect and/or impacted muscle size?) Can quantitation be done also by normalizing to axon length (bouton number/axon length)? Or perhaps this is accounted for in muscle size? If so, this should be explained.

      • In Figure 3 quantification: n's vary between genotypes significantly, and this should be explained (e.g. was there a recovery issue between genotypes or just fewer needed for WT-like?).

      • In Figure 4 panels B and F (mutants), there appears to be reduced axon outgrowth (see point above). This should be taken into account when expressing bouton number.

      • The RNA-seq data (Figure 5) has a potential issue in that they used larvae with a balancer chromosome (Df), which yields a 50% reduction in any genes on that chromosome. They acknowledge this and removed these genes from the analysis, but the concern remains that this still might be a confounding variable (for example, if reduction in any of these genes might disrupt a signaling pathway). We do not think that the RNA-seq needs to be repeated, but we propose that the authors validate these targets using qPCR in their MN-specific btz knockdown system (this way, they can also include magoh and eif4aIII knockdowns for comparison).

      Minor comments

      Some statements made in the introduction that are not entirely accurate:

      • "A fourth core subunit, known as Barentsz (Btz), Cancer susceptibility candidate gene 3 (CASC3), or Metastatic lymph node 51 (MLN51), associates with the complex following the completion of splicing, and is required for the effects of the EJC on translation, NMD and mRNA localization (Chazal et al., 2013; Palacios et al., 2004; Shibuya et al., 2006; van Eeden et al., 2001)."

      A recent study indicates that Casc3 is not required for EJC-dependent NMD targets in human cells, but rather enhances NMD on a subset of targets (Gerbracht et al. 2020 NAR). Perhaps "is required" should be changed to "plays a role in cytoplasmic EJC-mediated processes, such as...". It has also been shown that EJC core can assemble without Casc3 (e.g. Ballut et al 2005 NSMB, Gehring et al 2009 PLoS Biol). Previous work from the authors shows that Casc3 (Btz) is not necessary for EJC function in pre-mRNA splicing (Roignant et al, 2010 Cell). Further, there exists a population of Casc3 lacking EJCs in human cells (Mabin et al 2018 Cell Reports). Collectively, all this evidence points to Casc3 not being a core EJC subunit.

      • "In the mouse brain, haploinsufficiency for Magoh, Rbm8a or Eif4a3 causes severe microcephaly, but complete loss of Casc3 has a much milder effect that can be attributed to developmental delay (Mao et al., 2017; Mao et al., 2016; Mao et al., 2015; Silver et al., 2010)."

      From Mao et al 2017: complete loss and hypomorphic mutants were embryonic and perinatally lethal (contrary to what the authors are stating here), while compound mutants and heterozygotes exhibited neurodevelopmental delay. By "milder effects" the authors could also be referring to brain size being proportional to body size in the complete loss homozygotes; either way, this should be clarified.

      General minor comments:

      • Fly-specific nomenclature could be made more accessible to a broader audience, as the full readership will likely not have expertise in Drosophila genetics. For example, w118, btz2 labels used in figures are not explained anywhere in the manuscript. While the authors do a good job of describing various mutants in a more accessible fashion in the results section, the genotype labels in figures can be better explained in the legends.

      • Fig 2 L-N panels might warrant more explanation. Can the mitochondria be counted here? Is there also a difference in volume/morphology that could be quantitated? In Figure 2N, muscle fibers are more densely packed in mutant vs. control; can this be explained?

      • In Fig 2, to draw parallels between panels A-K and L-N, it might also be helpful to use the red/yellow arrow system on panel A for comparison.

      • In Figure 3, it might be helpful for a general audience to include zoomed-in picture of boutons (as in Fig 5B), as some panels appear to have less defined bouton shape.

      • Is the bouton size different in the mutant in Figure 3? Can this be quantified?

      • Fold changes are modest and not very apparent in staining (we acknowledge that this could be due to early developmental time point). Images could better point out differences in WT vs. mutant that are not readily apparent to those outside the fly neurodevelopment audience.

      • Fig 4 NMJs are shown on a different scale (more zoomed in) than in Figure 3, and differences are a bit easier to see at this scale. Presenting Fig 3 on this scale might help the reader with visualizing the differences in WT versus mutant.

      Significance

      Overall, this paper contributes conceptually to understanding EJC-mediated mRNA regulation during development. The contribution here is incremental, but meaningful in terms of defining the scope of regulation by the EJC and its peripheral factors in various contexts. These findings will likely be of interest to the fields of RNA metabolism and neurodevelopment. It also adds to the existing work suggesting Casc3 may have additional functions outside of the EJC (e.g. Mao et al. 2017 RNA, Baguet et al 2007 J Cell Sci, Cougot et al. 2014 J Cell Sci); while these previous studies have suggested Casc3 roles in development and mRNA localization/granule formation that are different from the EJC core proteins, this study more directly tests an EJC-independent role in mRNA regulation of specific targets. Further addressing the molecular basis of this regulation will be outside the scope of this article but will be of interest to the field.

      We are molecular biologists who study NMD and are thus equipped to address the EJC-related molecular functions and impact on the transcriptome. We do not have expertise in Drosophila genetics or neurobiology, and thus cannot critically evaluate the specific genetic approaches used or anatomy presented to the full extent. We have, however, pointed out areas that need elaboration regarding the genetic approaches and/or presentation of data that may be unfamiliar to a broader audience (i.e. the RNA metabolism field).

    1. Author Response:

      Reviewer #1:

      The authors demonstrate in this study that it is possible to train mice to perform a challenging tactile discrimination task, in a highly controlled manner, in a fully automated setup in which the animals learn to head-fix voluntarily. A number of well-described tricks are used to prolong the self-fixation time and thereby obtain enough training time to reach good performance when the perceptual decision is difficult. In addition, the study establishes that this experimental design allows targeted silencing of relatively deep brain areas through a clear skull preparation.

      It has already been demonstrated that mice can perform voluntary head-fixation and can do behavioral tasks in this context. However, this is the first time this methodology is applied, first, to a tactile task and, second, to a task that mice learn in thousands of trials. Another advantage of the present technique is that it is fully automated and allows training with virtually no human intervention.

      The demonstration that optogenetic silencing can be performed in this context is nice but not very surprising, as it has already been done in other contexts. Nevertheless, it is an interesting application of self head-fixation. The authors should make sure that a maximum of information is available about the efficiency of the silencing (fraction of cells silenced) and about its impact on the behavior (does it result or not in a complete impairment?).

      We have improved presentation in various places of the paper to provide more information about the optogenetic manipulation. We added new analysis of the fraction of neurons affected by photostimulation (Figure 8E). We also analyzed the impact on behavioral performance relative to chance performance (Figure S4A and S6). We compared the effect size to prior studies (Figure S4) and we discuss the interpretation of effect size (Discussion, page 22).

      In the power range tested in this study, photostimulation did not reduce performance to chance level (Figure S6). One limitation of the optogenetic workflow is the interpretation of behavioral deficit effect size. We examined this issue in ALM, a brain region from which we have the most extensive data. In previous studies, we have shown that bilateral photoinhibition of ALM results in chance level performance (Li et al 2016, Fig 2b; Gao et al, 2018, Extended Data Fig 6b). Here, mouse performance was above chance during photoinhibition of ALM (Figure S4). This difference in effect size likely resulted from incomplete silencing of ALM. The photostimulus intensity used here was much lower than that used in previous studies (0.3 vs. 11.9 mW/mm2). In addition, a single virus injection was not sufficient to cover the entire ALM. Thus a partial behavioral effect could be due to incomplete silencing of a brain region, or partial involvement of the brain region in the task. Given this limitation, we caution that the function of a brain region could only be fully deduced in more detailed analysis and together with neurophysiology. The workflow presented here can be used as a discovery platform to quickly identify regions of interest for more detailed neurophysiology analysis. We now better highlight these points in the Discussion.

      Reviewer #2:

      Hao and colleagues developed an automatic system for high-throughput behavioral and optogenetic experiments for mice in home cage settings. The system includes a voluntary head-fixation apparatus and integrated fiber-free optogenetic capabilities. The authors describe in detail the design of the system and the stages for successful automatic training. They perform proof-of-concept experiments to validate their system. The experiments are technically solid and I am convinced that their system will be of interest to some laboratories that perform similar experiments. Despite the large variety of similar automated systems out there, this one may prove to become a popular design.

      The weak side of the work is that it is not particularly novel scientifically. The system is complex, but it is not an innovative technology. The body of the study has too many technical details, as if it were the Methods section of a regular manuscript. There are bits of interesting information scattered around the paper (like the insights about the strategy mice use, which stem from the regression analysis), but these are not developed into any coherent direction that answers outstanding questions. The potential advantages of this system compared to other systems are marginal. In my eyes, the fact that manual training is so similar to the automatic one is not only a positive point. Rather, it signifies that the differences are mainly quantitative (e.g. # of mice a lab can train per day, etc). Thus, even as a methods paper, the lack of qualitative difference between this and other methods weakens it as a potential substrate for novel findings.

      The automated workflow presented here significantly boosts the yield and duration of training to rival and slightly surpass that of manual training for the first time (new Supplemental Table 1). We think this degree of automation is an important technical advance. We show that the workflow can significantly scale up the throughput of optogenetic experiments probing behaviors that require thousands of trials to learn. This enables efficient and systematic mapping of large subcortical structures, which was previously difficult to achieve. We better highlight comparisons to previous methods in several key areas in the Supplemental Table 1. We have also strengthened the Discussion (page 20).

      We highlight one line of inquiry enabled by our workflow, a systematic mapping of the cortico-basal-ganglia loops during perceptual decision-making. The striatum is topographically organized. Previous studies examined different subregions of the striatum in different perceptual decision behaviors, making comparisons across studies difficult. The striatum in the mouse brain is ~21.5 mm3 in size (Allen reference brain, (Wang, et al, Cell 2020)). Optogenetic experiments using optical fibers manipulate activity near the fiber tip (approximately 1 mm3). A systematic survey of different striatal domains’ involvement in specific behaviors is currently difficult. In our workflow, individual striatal subregions (~1 mm3, Figure 8) could be rapidly screened through parallel testing. At moderate throughput (15 mice / 2 months), a screen that tiles the entire striatum could be completed in under 12 months with little human effort. To illustrate its feasibility, we tested 3 subregions in the striatum previously implicated in different types of perceptual decision behaviors (Yartsev et al, eLife 2018; Sippy et al, Neuron 2015; Znamenskiy & Zador, Nature 2013), including an additional region in the posterior striatum that does not receive ALM and S1 inputs. The results revealed a hotspot in the dorsolateral striatum that biased tactile-guided decision-making (Figure 8). Our approach thus opens the door to rapid screening of the striatal domains during complex operant behaviors.
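      The throughput estimate above can be followed with a back-of-the-envelope calculation. The sketch below is only an illustration: the striatal volume, subregion volume, and throughput come from the response, but the number of mice tested per subregion is not stated there, so `mice_per_site` is a hypothetical placeholder chosen to make the arithmetic concrete.

```python
import math

def screen_duration_months(total_volume_mm3, site_volume_mm3,
                           mice_per_site, mice_per_batch, months_per_batch):
    """Estimate how long a tiling screen would take.

    mice_per_site is a hypothetical parameter (not given in the source).
    Returns (number of sites, number of mice, duration in months).
    """
    # Number of ~1 mm^3 subregions needed to tile the structure.
    n_sites = math.ceil(total_volume_mm3 / site_volume_mm3)
    n_mice = n_sites * mice_per_site
    # Throughput expressed as mice per month.
    rate = mice_per_batch / months_per_batch
    return n_sites, n_mice, n_mice / rate

# 21.5 mm^3 striatum, 1 mm^3 per site, 15 mice per 2 months (from the text);
# 4 mice per site is an assumption for illustration only.
n_sites, n_mice, months = screen_duration_months(21.5, 1.0, 4, 15, 2)
print(n_sites, n_mice, round(months, 1))
```

      Under these assumptions the estimate scales linearly with the cohort size per subregion, so a larger cohort would lengthen the screen proportionally.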

      Moreover, by eliminating human intervention, automated training allows quantitative assaying of task learning (Figure 4). Home-cage testing also exposes behavioral signatures of motivation in self-initiated behavior (Figure 6). These observations suggest additional opportunities for inquiries into goal-directed behaviors in the context of home-cage testing.

      Reviewer #3:

      In this study, Hao et al. developed an automated operant box to perform decision-making tasks and optogenetic perturbations without requiring the experimenter's manipulation. For this aim, mice learn to head-fix and to perform a task by themselves. The optogenetic experiment using red-shifted opsins allows manipulation of circuits without the need of an implanted optical fiber. The automation of behavioral tasks in home cages (isolated rodents or in groups) is an intense area of research in neuroscience. The possibility of coupling home cage behavioral analysis with optogenetic manipulation and with complex tasks that require precise positioning of the animal for controlled stimulations (vibrating stimulation, visual, ...) is thus of great interest and I commend the authors for their comprehensive dissection of the automated behavioral training setup. Some clarification, reporting of additional behavioral measures and refinement of analyses could improve the impact of this work.

      1) The first part of the paper nicely describes the experimental procedure to automate such a complex task. The procedure is very well described, the important points (e.g. the possibility for the animal to disengage…) are properly highlighted, and the online site allows readers to download the plans and 3D descriptions of the tools and the procedures. The authors compare task learning in automated versus manual training and show that there are overall very few differences. Whisker trimming reduces performance, indicating that the animals used information from the whiskers to make the choice. This part of the work is already impressive. Apart from that, the authors do not consider in their description what could be an essential aspect of experiments in a home cage, i.e. the control of the motivation to perform the task. Mice perform the task (here, engage in head fixation to obtain a reward) when they wish and thus, compared with manual training, there is no explicit control of the animal's motivation. This could have consequences for i) the inter-fixation intervals, which become an element of the decision, and ii) whether the commitment to the task is always motivated by drinking, or whether there is also a commitment to explore, or to check… This could impact the success in the task (e.g. if the animal is not motivated by water, it can explore…). Adding data analyses (information about the daily water consumption, whether the inter-fixation intervals are correlated with success or failure in the last trial, …) and even a short discussion or introduction of these aspects (see for example Timberlake et al, JEAB 1987 or Rowland et al 2008, Physiol behavior for the distinction between closed and open economy paradigms) could strengthen the behavioral description.

      We thank the reviewer for these suggestions. We performed additional analyses to examine these issues which led us to include a new section of Results in the revised manuscript (page 13-14 and Figure 6).

      We have added a new Figure 6 showing water consumption and body weight information in home-cage testing. At steady state, a mouse typically consumed ~1mL of water daily (~400 rewarded trials) while maintaining stable body weight. This amount of water consumption was similar to mice engaged in daily manual experiments (Guo et al, Plos ONE 2014). The number of head-fixations per day was correlated with body weight (Figure 6). Since body weight reflects prior water consumption, this indicates different levels of motivation due to thirst, which drives engagement in the task.

      We also examined the inter-fixation-interval. Interestingly, the inter-fixation-interval after an error (which led to no reward) was significantly longer than following a correct trial (Figure 6E). This is inconsistent with error from exploration. Rather it likely reflects a loss of motivation after an error, perhaps due to the loss of an expected reward. We suspect that error trials violated the mice’s expectation of reward, and therefore discouraged the mice, leading to a loss in motivation. Consistent with this interpretation, we also found a significant increase in inter-fixation-intervals shortly after a sensorimotor contingency reversal (Figure 6F), coinciding with an increase in error rate due to the rule change.

      Despite these changes in motivation to engage in the task, the choice behavior in the task was similar. In highly trained mice, task performance was stable despite the body weight change (Figure 6D). Logistic regression analysis of the choice behavior shows that mice maintained the same strategy in their choice behavior (Figure 6G).

      2) In the second part of the work, the authors focus on the description of choice behavior. To characterize it, the authors used a logistic model to predict choices. They suggest that at the beginning of the task the animals biased their current choice by their last choice (parameter A1) and that once the task is learned they alternate according to the current stimulation (parameter S0). The model was a logistic function of the weighted sum of several behavioral and task variables and has 19 parameters (the β parameters). If the animal only used these two pieces of information, can a model that only takes into account A1 and S0 reproduce the data? If not, this certainly indicates that other information (even distributed) is necessary, and also indicates individual strategies. Finally, analyses are made by considering trials as a discrete chain (trial n, n+1…). However, the self-head-fixed methodology causes the trials to be organized with more or less time between successive trials depending on motivation (see above). Again, do the authors note differences in performance according to the timing between trials? Could it be a variable in the model?

      We thank the reviewer for these great suggestions. We tested a model that included only choice history A1, tactile stimulus S0, and a constant bias term (β0). This 3-parameter model performed as well as the full model in predicting choice. This indicates that other factors do not contribute significantly to the choice behavior. We have included this result in the revised Figure 4C.
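
As a rough illustration of the reduced model described above, the sketch below fits a logistic regression with only a constant bias (β0), the current stimulus S0, and the previous choice A1. Everything here is a hypothetical reconstruction: the ±1 coding of S0 and A1, the simulated data, and the plain gradient-ascent fitting routine are assumptions chosen for readability, not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_session(n_trials, beta0, beta_s0, beta_a1, seed=0):
    """Simulate choices (1 = right, 0 = left) from a 3-parameter model.

    Assumed coding (hypothetical): S0 and A1 are +1 for right, -1 for left.
    Each row of X is [1, S0, A1], so the weights are [beta0, beta_S0, beta_A1].
    """
    rng = random.Random(seed)
    X, y = [], []
    prev_choice = rng.choice([-1, 1])
    for _ in range(n_trials):
        s0 = rng.choice([-1, 1])
        p_right = sigmoid(beta0 + beta_s0 * s0 + beta_a1 * prev_choice)
        choice = 1 if rng.random() < p_right else 0
        X.append([1.0, float(s0), float(prev_choice)])
        y.append(choice)
        prev_choice = 1 if choice == 1 else -1
    return X, y


def fit_logistic(X, y, lr=1.0, n_iter=500):
    """Fit weights by batch gradient ascent on the mean log-likelihood."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = yi - sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j in range(d):
                grad[j] += err * xi[j]
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def accuracy(w, X, y):
    """Fraction of trials where the more likely choice matches the data."""
    hits = 0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        hits += int((p >= 0.5) == (yi == 1))
    return hits / len(y)


if __name__ == "__main__":
    X, y = simulate_session(600, beta0=0.0, beta_s0=2.0, beta_a1=0.5, seed=1)
    w = fit_logistic(X, y)
    print("fitted [beta0, beta_S0, beta_A1]:", w)
    print("prediction accuracy:", accuracy(w, X, y))
```

Comparing the prediction accuracy of such a reduced fit with that of the full 19-parameter model on held-out trials is one way to check whether the extra regressors add predictive power.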

      We next examined whether inter-fixation-interval (i.e. the time elapsed between head-fixations and presumably the motivation to engage in the task) could impact mice’s choice behavior. There are multiple ways inter-fixation-interval could be incorporated into the logistic regression model. For example, it could be modeled as an explicit variable that biases left/right choice, or as a modulation of existing regressors (i.e. a gain variable that modulates the contribution of specific regressors). Each approach requires assumptions about how motivation affects the behavioral strategy of the mice. Instead, as a first-order analysis, we examined whether the logistic regression model could predict choice equally well in trials following short vs. long inter-fixation-intervals. Our logic is that if mice adopted different strategies in different motivational states (reflected in short vs. long inter-fixation-intervals), the predictive power of the model would differ between these conditions. We fit the logistic regression model using trials in their natural sequential order (regardless of the inter-fixation-intervals). The model was then used to predict choice on independent trials. Trials were then sorted by the preceding inter-fixation-intervals. Prediction performance was calculated separately for trials following short vs. long inter-fixation-intervals. We did not find a significant difference in the model prediction performance. The result was similar in early and late stages of task learning (Figure 6G), even though mice used distinct strategies during these periods (Figure 4). These results suggest consistent strategies in the choice behavior. We have included this analysis in the new Figure 6.
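
The comparison described above can be sketched as follows: take the model's held-out predictions, sort trials by the preceding inter-fixation-interval, and compute prediction accuracy separately for the two groups. The median split and the function name `accuracy_by_interval` are assumptions made for illustration; the response does not state where the short/long boundary was drawn.

```python
from statistics import median


def accuracy_by_interval(predicted, actual, intervals):
    """Prediction accuracy for trials following short vs. long intervals.

    predicted, actual: per-trial choices (any comparable labels).
    intervals: inter-fixation-interval preceding each trial.
    Trials at or below the median interval count as "short" (assumed split).
    """
    cut = median(intervals)
    groups = {"short": [], "long": []}
    for p, a, t in zip(predicted, actual, intervals):
        groups["short" if t <= cut else "long"].append(p == a)
    return {
        name: (sum(hits) / len(hits) if hits else float("nan"))
        for name, hits in groups.items()
    }
```

If the two accuracies differ by more than chance would allow (for example under a permutation test on the short/long labels), that would suggest the strategy changes with motivational state.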

      3) The third part described optogenetic manipulations. It is clear that group sizes are small. Nevertheless, if the objective was to show that the method works, the results are convincing. Some experimental details and in particular the choice of the statistical procedure need clarification.

      We have improved the presentation and clarified experimental details of the task, hypotheses for targeting specific brain regions, and statistical procedures.

    1. Author Response:

      Reviewer #1:

      The authors note how previous studies on myocardial infarction have usually studied individual tissues and not examined the cross talk between tissues and their dysregulation. To address this challenge they have therefore performed, in a mouse model of MI, an integrated analysis of heart, liver, skeletal muscle and adipose tissue responses at 6 and 24 hours. They have then validated their findings at 24 hours in two independent mouse model data sets.

      A major strength is their comprehensive approach. They have used high throughput RNA seq and applied integrative network analysis. They show for multiple genes whether they are up regulated or down regulated in these four tissues at the 6 and 24 hour time points and whether the regulation directions are concordant or opposite and note in particular that for the liver both concordant and opposite effects occur. They identify key tissue specific clusters in each tissue and identify the key genes in each cluster. Finally they use whole body modelling to identify cross talk between tissues.

      A further strength of this paper is the integration of transcriptomic data (differential expression, functional analysis and reporter metabolite analysis). The final strength is the very clear presentation of the findings and their implications such that the reader gets a very clear message and at the same time can go in to more detail if this is their area of research interest.

      There are no major weaknesses. The authors have achieved their aims and the data supports their conclusions.

      This work represents a major advance in both methodology and understanding of a multi tissues approach to the study of the metabolic impact of MI and the underlying up and down regulation of relevant genes.

      The relevance of these findings in human MI will need to be tested and may ultimately have therapeutic implications.

      First and foremost, we would like to thank the reviewer for the positive and encouraging comments. We agree that further research, especially rigorous validation of the findings from this work in humans, is needed, and we hope that it can be translated into clinical settings. Moreover, we would like to thank the reviewer for highlighting our comprehensive approach, which we hope can serve as a framework for future multi-tissue research in disease settings.

      Reviewer #2:

      The authors collected post-myocardial infarction (MI) transcriptome data from a mouse model as well as sham-operated control mice to identify systemic molecular changes in multiple tissues at pathway level. The data were collected at two time points (6 hours and 24 hours post-MI), and several computational systems biology tools were applied to the dataset to identify altered molecular processes. The applied tools vary from very standard tools (eg. enrichment analysis) to advanced methods based on mapping data on biological networks. A specific focus was put on the altered signaling pathways as well as metabolic pathways and metabolites. Identified up-/down-regulated pathways were in agreement with the literature.

      Strengths:

      • One unique aspect of the work is the fact that the transcriptomic data were collected not only from heart, the source tissue for MI, but also from three more tissues (liver, skeletal muscle, adipose). Therefore, molecular alterations in the related tissues could also be monitored and discussed comparatively. The introduced transcriptomic dataset has a high re-use potential for other researchers in the field, since coverage of responses by four tissues at two different time points makes it unique.

      • Correlation-based coexpression networks were created for all four tissues, and some of the clusters in these networks were shown to be tissue-specific clusters, which nicely validates both the experimental and computational approach in the paper.

      • The results were validated by using independent transcriptomic datasets available in the literature. The authors showed that there is a high overlap between their dataset and the literature datasets in terms of identified differentially expressed genes and enriched pathways. This additional validation strengthens the results reported in the manuscript.

      • Use of a variety of computational approaches and showing that they point to similar or complementary molecular mechanisms increase the impact of the paper. The employed computational tools include not only information-extraction methods such as enrichment, coexpression networks, reporter metabolites, but also predictive methods based on modelling. The authors construct a multi-tissue genome-scale metabolic network covering all four tissues of interest in the study, and they show that this model can correctly predict some major post-MI changes in the metabolism. It is interesting to see that two completely different computational approaches (constraint-based metabolic modeling versus information-extraction based approaches) point to same/similar molecular mechanisms.

      We would like to thank the reviewer for providing positive comments and a comprehensive summary of our work. We also really appreciate the constructive comments from the reviewer to improve our work.

      Weaknesses:

      • Regarding predictions made by multi-tissue metabolic network modeling, the control case fluxes were predicted by maximizing the rate of lipid droplet accumulation in the adipose tissue. Although there is an agreement between the model predictions and the results obtained by other bioinformatics tools used in the study as well as literature information, it looks rather like an oversimplification to assume that all other three tissues are programmed to serve maximum fat production in adipose tissue. This should be further elaborated by the authors.

      We would like to thank the reviewer for the comment and we agree that there is a simplification of the situation in the modeling. However, we would like also to emphasize that the model has been carefully constrained with the dietary composition as well as the tissue specific resting energy expenditure. In our opinion, these constraints have already included a great part of the metabolic activity and satisfied the basic metabolic needs of the mice. The rest of the energy in the diet could be either used as physical activity (energy production in muscle) or stored as fat (lipid droplet accumulation in adipose tissue), and in our analysis, we assumed the latter as we think it is more realistic in this study as mice in the cage might have very little physical activities. We added a clarification for this in the revised manuscript as follows,

      ‘To simulate the metabolic flux distribution in the sham-operated mice, we set the lipid droplet accumulation reaction in adipose tissue (m3_Adipose_LD_pool) as the objective function as we assume the energy additional to the resting energy expenditure will be mostly stored as fat rather than used by the muscle for physical activities because mice raised in the cages might have very little exercise. Then, we used parsimonious FBA to calculate the flux distribution.’
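
For readers unfamiliar with the method named in the quoted passage, parsimonious FBA is usually written as a two-step optimization. The formulation below is the standard textbook version and is an assumption about the variant used here; $S$ is the stoichiometric matrix, $v$ the flux vector with bounds $v^{\mathrm{lb}}, v^{\mathrm{ub}}$, and $v_{\mathrm{obj}}$ the objective flux (in this study, the adipose lipid droplet accumulation reaction).

```latex
% Step 1: ordinary FBA, maximize the chosen objective flux
v_{\mathrm{obj}}^{*} = \max_{v} \; v_{\mathrm{obj}}
\quad \text{s.t.} \quad S v = 0, \quad v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}

% Step 2: among all optimal solutions, pick the one with minimal total flux
\min_{v} \; \sum_{i} \lvert v_i \rvert
\quad \text{s.t.} \quad S v = 0, \quad v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}, \quad v_{\mathrm{obj}} = v_{\mathrm{obj}}^{*}
```

The second step removes much of the degeneracy of the first: many flux distributions can reach the same optimum, and minimizing total flux selects the most economical one.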

      Reviewer #3:

      In the manuscript, "Integrative transcriptomic analysis of tissue-specific metabolic crosstalk after myocardial infarction" by Arif et al., the authors describe analyses of transcriptomes of +/- myocardial infarction (MI) mice. The study is useful and reports interesting results. These results could be of interest to further develop cellular insight into the effects of and treatments for MI. However, I do not find any methodological advances here. The manuscript appears to be a repository of transcriptomics analyses. All the techniques used have been tried and applied to other scientific problems. The authors have presented differential expression analysis, followed by GSEA, and then they perform different network analyses - co-expression networks, reporter analyses, multi-tissue model, etc.

      My main issue is that the authors do too many different analyses, but none of them gets sufficient attention in the paper. Also, no other independent quantitative evidence is shown in support of the results of their analyses. Further, validation was done the same way the pipeline was built. This makes their results come across as circular. For example, when validating metabolic models of cells built using transcriptomic data, CRISPR-Cas9 essentiality screens are used. Here, it appears they basically repeated the same analyses on the same kind of transcriptome data from a different experiment.

      First of all, we would like to thank the reviewer for the positive summary of our research. We agree that this study can usefully be explored further, especially by validating it in humans. We would also like to thank the reviewer for the constructive comments. We agree that we presented multiple transcriptomics analyses that have been used before. Apart from understanding the metabolic effect of MI in multiple tissues (which is unique as of now), our secondary goal is to propose a novel integrative framework for analyzing multi-tissue transcriptomics data based on the available techniques. We would like to emphasize that, even though the individual analyses were not novel, the integrative analysis in a multi-tissue and disease setting, at both the transcriptomic and the metabolic crosstalk level, is a strong novelty of this study. This required employing not only state-of-the-art network analyses but also reconstruction of multi-tissue models through new methods that enable joint modeling of the metabolic interactions within and between tissues.

      As this study is unique (as of now), we tried our best to validate it with other data with similar settings (from a tissue and we found only transcriptomics data) and run our pipeline to validate and strengthen our findings. Moreover, we also recognized the limitation that all the results presented in this study are purely based on transcriptomics data (as stated in the “Discussion” section of the manuscript). More experiments, such as with metabolomics and proteomics data, are in our pipeline to complement the results from the current study. In summary, we recognized the reviewer’s concerns and we would like to address it in our future studies.

    1. First, we can control for whether the Pradhan is new or not. It would not be legitimate to compare investments in all unreserved GP where Pradhan are new to those in reserved GP: the fact that the Pradhan is new may reflect unobserved characteristics of the GP, and this non-random sample selection would bias the results. There is, however, a random subset of unreserved GP where the Pradhan is always new in office. Individuals may run for a council seat only in the village in which she or he resides. Once elected, the councilors choose one of them to be Pradhan. As part of the reservation scheme, one third of council seats (identified by village) were reserved for women: thus, if the previous councilor was a man, and his seat was reserved for a woman in the 1998 election, we can be sure that the Pradhan for that GP will be new to office. We can therefore compare women Pradhan to this subgroup of new Pradhan, to control for the fact that the Pradhan is new in office. Clearly, this does not fully control for the Pradhan’s experience: even new Pradhan could be experienced politicians. Second, we can control for whether the Pradhan is likely to be re-elected in 2003. Every third GP starting with the second in the list will be reserved for a female Pradhan for the 2003 election. Pradhan in those GP should realize that they will not be able to stand for re-election as Pradhan (if their particular seat is not reserved, they may still be able to run for a position of member of the GP council). We therefore restrict the sample to GP reserved in 1998 and those which will be reserved in 2003, to examine whether and to what extent the differences we observe are due to the fact that women may not think they have a chance to be reelected. Again, men could still have a longer horizon in office than women, if they plan to be elected on another position or be elected again in 10 years. Finally, we take advantage of the reservation of about 44% of the seats to SC and ST. These reservations were also selected randomly, and within each list, one third of positions were reserved for women. Irrespective of their gender, all the leaders elected under this reservation policy tend to be new leaders and to be elected in large part due to the quota system. They also tend to

      hello

    1. Author Response:

      Reviewer #1 (Public Review):

      The manuscript by Schrieber et al. explores whether inbreeding affects floral attractiveness to pollinators, with the additional factors of sex and origin in play, in male and female plants of Silene latifolia. The authors use a combination of spatial sampling, floral volatiles, flower color, and floral rewards coupled with the response of a specialized pollinator to these traits. Their results show that females are more affected by inbreeding and that, in general, inbreeding negatively impacts the "composite nature" of floral traits. The manuscript is well written, and the experiments are detailed and quite elaborate. For example, the methodology for flower color estimation is the most detailed effort in this area that I can remember. All the experiments in the manuscript show meticulous planning, with extensive data collection addressing minute details, including the statistics used. However, I do have some concerns that need to be addressed.

      Core strengths: Detailed experimental design, elaborate data collection methods, well-defined methodology that is easy to follow. There is a logical flow for the experiments, and no details are missing in most of the experiments.

      Weaknesses: A recent study has addressed some of the questions detailed in the manuscript. So, introduction needs to be tweaked to reflect this.

      Thank you very much for bringing this excellent article to our attention! We adjusted the writing in the introduction and the discussion accordingly. Please note that this article was first published on 15 January 2021, while our manuscript was submitted on 9 January. Hence, we were not able to account for this study in the first submission. Introduction pp 4-5, ll 48-54: “Although in a few cases inbreeding has been shown to alter single components of flower attractiveness (Ivey and Carr, 2005; Ferrari et al., 2006; Haber et al., 2019), insight into syndrome-wide effects is restricted to a single study. Kariyat et al. (2021) demonstrated that inbred Solanum carolinense L. display reduced flower size, pollen and scent production and receive fewer visits from diurnal generalists. It is necessary to broaden such integrated methodological approaches to other plant-pollinator systems (e.g., nocturnal specialist pollinators) and further floral traits (i.e., flower colour).” Discussion p 19, ll 535-542: “In summary, our research on S. latifolia suggests that in addition to inbreeding disrupting interactions with herbivores by changing plant leaf chemistry (Schrieber et al., 2018) it affects plant interactions with pollinators by altering flower chemistry. Our observations are in line with studies on other plant species (Ivey and Carr, 2005; Kariyat et al., 2012, 2021) and highlight that inbreeding has the potential to reset the equilibrium of species interactions by altering functional traits that have developed in a long history of co-evolution. These threats to antagonistic and symbiotic plant-insect interactions may mutually magnify in reducing plant individual fitness and altering the dynamics of natural plant populations under global change.”

      Some details and controls are missing in floral scent estimation. Details such as flower age and a pesticide treatment of plants that could affect chemistry need to be better refined.

      We clarified this issue at different occasions in the methods section. Previous studies (and our study) on S. latifolia have shown no clear differences in the quality of floral scent between sexes. However, one study found higher total emission of VOC in males, while others found no differences. Hence, females produce no specific VOC that are used as oviposition cues but may be differentiated from males by the total amount of emitted VOC and pronounced differences in spatial flower traits. We highlight this at p 6, ll 111-116: “Silene latifolia exhibits various sexual dimorphisms with male plants producing more and smaller flowers that excrete lower volumes of nectar with higher sugar concentrations as compared to females (Gehring et al., 2004; Delph et al., 2010). The quality of floral scent exhibits no clear sex-specific patterns, while male plants have been shown to emit higher or equal total amounts of VOC as compared to females in different studies (Dötterl & Jürgens 2005, Waelti et al. 2009)”.

      Both male and female moths show pronounced behavioural responses to lilac aldehyde isomers and other VOC in the floral scent of S. latifolia (Dötterl et al., 2006). We therefore treated these VOC as typical floral scent compounds. We clarified this at p 7, ll 125-126: “A substantial fraction of floral VOC produced by S. latifolia triggers antennal and behavioural responses in male and female H. bicruris moths (Dötterl et al., 2006).” and p 9, ll 2010-218:” For targeted statistical analyses, we focused on those VOC that evidently mediate communication with H. bicruris according to Dötterl et al. (2006). We analysed the Shannon diversity per plant (calculated with R-package: vegan v.2.5-5, Oksanen et al. 2019) for 20 floral VOC in our data set that were shown to elicit electrophysiological responses in the antennae of H. bicruris (Supplementary File 1). Moreover, we analysed the intensities of three lilac aldehyde isomers, which trigger oriented flight and landing behaviour in both male and female H. bicruris most efficiently when compared to other VOC in the floral scent of S. latifolia. Furthermore, H. bicruris is able to detect the slightest differences in the concentration of these three compounds at very low dosages (Dötterl et al. 2006).”
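
The per-plant Shannon diversity mentioned in the quoted methods can be illustrated with a few lines of code. This is a minimal sketch, assuming the natural-log definition H = -Σ p_i ln p_i (which is what `vegan::diversity` computes by default for `index = "shannon"`); the input is a hypothetical list of per-compound intensities for one plant.

```python
import math


def shannon_diversity(intensities):
    """Shannon index H = -sum(p_i * ln p_i) over compounds with nonzero intensity.

    intensities: non-negative intensities of each VOC for a single plant.
    Returns 0.0 when nothing was emitted or only one compound is present.
    """
    total = sum(intensities)
    if total <= 0:
        return 0.0
    h = 0.0
    for x in intensities:
        if x > 0:
            p = x / total
            h -= p * math.log(p)
    return h
```

A plant emitting 20 compounds at equal intensity would reach the maximum value ln(20), whereas a bouquet dominated by a single compound gives a value close to zero.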

      We used biological pest control agents in a preventive manner because S. latifolia is often infested by thrips and aphids under greenhouse conditions. The writing in the previous manuscript version was not clear with this regard and we changed the text at p 8, ll 157-161: ” Plants received water and fertilisation (UniversolGelb 12-30-12, Everris-Headquarters, NL) when necessary for the entire experimental period and were prophylactically treated with biological pest control agents under greenhouse conditions to prevent thrips (agent Amblyseius barkeri and Amblyseius cucumeris) and aphid (agent Chrysoperla carnea) infestation (Katz Biotech GmbH, GE) .”

      Indeed, flower size and scent emission can be correlated. Although the question whether differences in scent emission were based on a difference in flower size is an interesting one, it seemed less relevant to us because it is unlikely that our pollinators correct their perception of a scent for the size of a flower (see also p 19, 520-526). We were rather interested in whether scent emission differs between the plant treatments and thus pollinators may chemically perceive such differences. Moreover, we found it problematic to correct our models for flower size by including it as a covariate, which is the reason why we have not assessed this trait during scent collection. In this case, we would have corrected our scent responses for the effects of inbreeding, sex and population origin (i.e., the predictors we are interested in) because all of them determine the size of a flower (Figure 2 c,d). Hence, the inbreeding, sex and origin effects on flower scent would likely vanish. However, it is highly unlikely that the set of genes contributing to sex-, breeding treatment- and origin-based variation in flower size is exactly the same one that determines variation in scent emission per flower, which is basically the assumption underlying the model that includes flower size as a covariate. We critically mentioned the trade-off relationships and our reasoning to not correct for flower size at 9p ll 208-210: “The intensities of VOC were not corrected for flower size because we wanted to capture all variation in scent emission that is relevant for the receiver i.e., the pollinator.”

      While the study is laser-focused on floral traits, as the authors are aware inbreeding affects the total phenotype of the plants including fitness and defense traits. For example, there are quite a few studies that have shown how inbreeding affects the plant defense phenotype. This could be addressed in the introduction and discussion.

      We agree that this aspect is important and therefore addressed it in further detail in the introduction at p 4 ll 34-38: “While it is well established that inbreeding can increase a plant’s susceptibility to herbivores by diminishing morphological and chemical defences (Campbell et al., 2013; Kariyat et al., 2012; Kalske et al., 2014), its effects on plant-pollinator interactions are less well understood. Inbreeding may reduce a plant’s attractiveness to pollinating insects by compromising the complex set of floral traits involved in interspecific communication.” Since other referees suggested to rather tone down than increase the discussion based on floral scent results, we stick to the general feedback relationship among of herbivory and pollination, rather than relating it specifically to volatiles in the discussion at p 19, ll 535-544: “In summary, our research on S. latifolia suggests that in addition to inbreeding disrupting interactions with herbivores by changing plant leaf chemistry (Schrieber et al., 2018) it affects plant interactions with pollinators by altering flower chemistry. Our observations are in line with studies on other plant species (Ivey and Carr, 2005; Kariyat et al., 2012, 2021) and highlight that inbreeding has the potential to reset the equilibrium of species interactions by altering functional traits that have developed in a long history of co-evolution. These threats to antagonistic and symbiotic plant-insect interactions may mutually magnify in reducing plant individual fitness and altering the dynamics of natural plant populations under global change. As such, our study adds to a growing body of literature supporting the need to maintain or restore sufficient genetic diversity in plant populations during conservation programs.”

      Reviewer #2 (Public Review):

      A summary of what the authors were trying to achieve. This interesting and data-rich paper reports the results of several detailed experiments on the pollination biology of the dioecious plant Silene latifolia. The authors use multiple accessions from several European (native range) and North American (introduced range) populations of S. latifolia to generate an experimental common garden. After one generation of within-population crosses, in which each cross included either two (half-)siblings or two unrelated individuals, they compared the effects of one generation of inbreeding on multiple plant traits (height, floral size, floral scent, floral color), controlling for population origin. Thereby, they set out to test the hypothesis that inbreeding reduces plant attractiveness. Furthermore, they ask if the effect is more pronounced in female than male plants, which may be predicted from sexual selection and sex-chromosome-specific expression, and if the effect of inbreeding is larger in native European populations than in North American populations, which may have already undergone genetic purging during a bottleneck. Finally, the authors evaluate to what extent the inbreeding-related trait changes affect floral attractiveness (measured as visitation rates) in field-based bioassays.

      An account of the major strengths and weaknesses of the methods and results. The major strength of this paper is the ambitious and meticulous experimental setup and implementation that allows comparisons of the effect of multiple predictors (i.e. inbreeding treatment, plant origin, plant sex) on the intraspecific variation of floral traits. Previous work has shown direct effects of plant inbreeding on floral traits, but no previous study has taken this wholesale approach in a system where the pollination ecology is well known. In particular, very few studies, if any, have tested the effects of inbreeding on floral scent or color traits. Moreover, I particularly appreciate that the authors go the extra mile and evaluate the biological importance of the inbreeding-induced trait variation in a field bioassay. I also very much appreciate that the authors have taken into account the biological context by using a relevant vision model in the color analyses and by focusing on EAD-active compounds in the floral scent analyses.

      The results are very interesting and show that the effects of inbreeding on trait variation are both origin- and sex-dependent, but that the strongest effects were not always consistent with the hypothesis that North American plants would have undergone genetic purging during a bottleneck that would make these plants less susceptible to inbreeding effects. The authors made a large collection effort, securing seeds from eight populations from each continent, but then only used population origin and seed family origin as random factors in the models when testing the overall effect of inbreeding on floral traits. It would have been very interesting to see an analysis that partitions the variance, both in the actual traits under study and in the response to inbreeding, to determine to what extent there is variation among populations within continents. Not least because it is increasingly clear that the ecological outcome of species interactions (mutualistic/antagonistic) in nursery pollination systems often varies among populations (cf. Thompson 2005, The geographic mosaic of coevolution), and some results suggest that this is also the case in Hadena-Silene interactions (e.g. Kephart et al. 2006, New Phytologist). Furthermore, some plants involved in nursery pollination systems show evidence of distinct canalization across populations of floral traits of importance for the interaction (e.g. Svensson et al. 2005), whereas others show unexpected and fine-grained variation in floral traits among populations (e.g. Suinyuy et al. 2015, Proceedings B, Thompson et al. 2017 Am. Nat., Friberg et al. 2019, PNAS). Hence, it is possible that the local population history and local variation in the interactions between the plants and their pollinators may be more important predictors for explaining variation in floral trait responses to inbreeding than the larger-scale continental analyses. Not least because North American S. latifolia probably has multiple origins, with subsequent opportunity for admixture in secondary contact.

      Yes, it is necessary to put populations from the same continent into one category, since native and invasive plant populations differ significantly in their evolutionary history (p 5, ll 74-81, http://onlinelibrary.wiley.com/doi/10.1111/j.1365-294X.2012.05751.x). Origin explained sufficient amounts of variation in several traits including flower number, corolla expansion, VOC diversity, lilac aldehyde A intensity, and pollinator visitation rates (see Figures 2-3; and Table 2) and some variation in the magnitude of inbreeding effects (Figure 2e, f; Figure 3). Even if we were not interested in differences among native and invasive populations, we would have to include origin as a fixed effect in our models because:

      i) populations within a distribution range are not independent samples,

      ii) origin explains sufficient variation in many responses,

      iii) origin cannot be fitted as a random factor, since it has only two levels (the minimum number of levels for a random effect is 4). We agree that it would be very interesting to specifically assess differences in the magnitude of breeding and sex effects among populations within origins. We now discuss this as an important future research direction at p 18, ll 500-507: “As such, the precise mechanisms underlying variation in inbreeding effects on different scent traits across population origins of S. latifolia can only be explored based on comprehensive genomic resources, which are currently not available. Future studies should also incorporate field-data on the abundance of specialist pollinators and extend the focus from variation in the magnitude of inbreeding effects among geographic origins to variation among populations within geographic origins and individuals within populations. This would allow a detailed quantification of geographic variation in inbreeding effects and elaborating on the causes and ecological consequences of such variation (Thompson, 2005; Schrieber and Lachmuth, 2017; Thompson et al., 2017)”.

      To empirically address within-origin variation of inbreeding effects with our data, we would have to i) fit correlated random intercepts and slopes for the interaction breeding-sex on the population random factor (models consume min. 22 DF); or ii) include population as a fixed effect in our models (models consume min. 67 DF). We tried both of these approaches when preparing the revision, but unfortunately it turned out that our study is not designed to address this question. The models for both variants only partially converge (see R-script ll. 1568-1580), and even if they do, this does not imply that one can draw solid inference from them. Approach i often results in multiple singular convergence warning messages, implying that no variance is explained by population-specific reaction norms to the fixed effects specified in the random effects structure. Approach ii results in odd rank-deficient models (I was seriously worried about type I errors). We simply have too few replicates (5) per population-breeding treatment-sex combination for both approaches. For solid inference we would need 10 (approach i) to 40 (approach ii) replicates = 640-2600 individuals. However, our experimental design is sufficient to address the hypothesis we have raised in the introduction as well as general differences in response variables among populations. We now provide information on variance partitioning for all models that include population as a random effect in S9. As you will see, population explains less variation in our responses than the fixed effects in 9 out of 12 models. The random effects maternal and paternal genotype (mother&father) explain more variation than the random effect population in 6 of 12 cases. Thus, these data do not make a strong case for an extensive discussion of population-based differences in floral traits, and this was also not a question or hypothesis we wanted to address with our study.
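
      The replicate numbers quoted above can be checked against the factorial design. The sketch below is only a rough illustration: it assumes 16 populations (eight per origin, as stated in the review), two breeding treatments and two sexes, and the function name is hypothetical.

```python
# Rough sample-size arithmetic for a fully crossed design.
# Assumed design counts (taken from the review text, not from the manuscript):
populations = 8 * 2        # eight populations per origin, two origins
breeding_treatments = 2    # inbred vs. outcrossed
sexes = 2                  # female vs. male

# Number of population x breeding treatment x sex combinations.
cells = populations * breeding_treatments * sexes

def individuals_needed(replicates_per_cell: int) -> int:
    """Total individuals if every combination gets the same number of replicates."""
    return cells * replicates_per_cell

approach_i = individuals_needed(10)    # random slopes on population
approach_ii = individuals_needed(40)   # population as a fixed effect
print(cells, approach_i, approach_ii)
```

      Under these assumptions there are 64 combinations, giving 640 individuals for 10 replicates and 2560 for 40 replicates, which matches the quoted range if the upper bound is read as rounded to 2600.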

      I see no major weaknesses in the study, but in my detailed response I have raised a few questions and suggestions about the floral scent analyses. In short, the authors have used a technique that is not the standard method for quantitative floral scent analyses, and I am curious how they made sure that the results obtained from static headspace sampling with PDMS adsorbents could be used as a quantitative measure. I suggest that the authors validate the use of this method more thoroughly in the manuscript, and I have detailed this comment in my response to the authors.

      Also, and this may seem like a nit-picky comment, I am not convinced that the best way to describe the traits under study is "plant attractiveness", because in the experimental bioassays, most of the traits under study that are affected by the inbreeding treatment, did not result in a reduced pollinator visitation. Most (or all) of these traits may also be involved in other plant functions and important for other interactions, so I suggest potentially using a term like "floral traits" or "(putative) signalling traits".

      We now avoid the term floral attractiveness throughout the manuscript and instead refer to “floral traits”.

      An appraisal of whether the authors achieved their aims, and whether the results support their conclusions: By and large, the authors achieved the aims of this study, and drew conclusions based on these results. One interesting aspect of this work that I think could be discussed a bit deeper is the lack of congruence between the effects of inbreeding on floral traits and the variation in visitation pattern in the bioassay. In fact, the only large effect of inbreeding on a floral trait that may play a role as an explanatory factor is the reduction of emission of lilac aldehyde A in inbred female S. latifolia from North America, which corresponds to a reduced visitation rate in this group in the pollinator visitation bioassay. I have made some specific suggestions in my comments to the authors.

      We agree that this aspect required deeper discussion and revised the section at p 19, ll 520-526 accordingly. We believe that the limited spatial vision of H. bicruris in combination with our experimental setup for pollinator observations increased the relative importance of floral scent for pollinator visitation rates (suggested by referee #3).

      A discussion of the likely impact of the work on the field, and the utility of the methods and data to the community: I think that one important aspect of this work that may broaden the impact of this study further is the link between these experiments and our expectations from the evolution of selfing. Selfing plant species most often conform to the selfing syndrome, presenting smaller, less scented flowers than outcrossing relatives. Traditionally, the selfing syndrome is explained by natural selection against individuals that invest energy into floral signalling when attracting pollinators is no longer crucial for reproduction. Some studies (for example Andersson, 2012, Am. J. Bot), however, have shown that only one, or a few, generations of inbreeding may reduce floral size as much as quite strong selection for reduced signalling. Here, at least for some populations and sexes, similar results are obtained for several traits (including floral scent), and one way to put this paper in context is by discussing the results in the light of these previous papers.

      We now address this issue at p 16, ll 417-420: “However, our findings highlight that even weak degrees of biparental inbreeding (i.e., one generation sib-mating) can result in a severe reduction of spatial flower trait and scent trait values that is detectable against the background of natural variation among multiple plant populations from a broad geographic region. This observation indirectly supports that the selfing syndrome (i.e., smaller, less scented flowers observed in selfing relative to outcrossing populations of hermaphroditic plant species) may not merely be a result of natural selection against resource investment into floral traits, but also a direct negative consequence of inbreeding (Andersson, 2012).”

      Reviewer #3 (Public Review):

      Schrieber et al. studied the effects of biparental inbreeding in the dioecious plant Silene latifolia, focusing specifically on traits important for floral attractiveness and pollinator attraction. These traits are especially important for dioecious species with separate sexes as they are obligate outcrossers. The authors find that inbreeding mostly decreases floral attractiveness, but that this effect tended to be stronger in the female flowers, which the authors suspect to result from the trade-off with larger investment in the sexual functions in the female plants. The authors then go on to couple the changes in visual and olfactory floral traits to pollinator attraction, which allows them to conclude or at least speculate that differences in pollinator behavior are mostly driven by the changes in olfactory traits. The study is robust in its broad and well-balanced sampling of populations, rigorous and in large part meticulously documented experimental designs and linking of the effects on mechanisms to ecological function. The hypotheses are clearly stated and the study is able to address them mostly convincingly. However, some aspects of the decisions the authors made and possible caveats need to be addressed and elaborated on.

      A major caveat, in my opinion, is that while the authors find stronger effects of inbreeding on pollinator visitation rates in the plants from the North American (Na) origin, these plants were tested in an environment that was foreign to them, which could have important consequences for the results of this study. This is specifically because the main pollinator, the Hadena bicruris moth, is completely absent from the populations in Na, and yet was the main pollinator observed in the pollinator attraction experiment. As this pollinator is also a seed predator, the Na populations are released from the selection pressure to avoid attracting the females of this species and thus risking the loss of seeds and fitness. In fact, some of the results suggest that the release from the specialist pollinator and seed predator in Na has led to an increase in the attractiveness of the female flowers, based on the higher number of flowers visited in the outcrossed females compared to outcrossed males in the plants from the Na origin and the similar, though not statistically significant, pattern in the olfactory cue. While ideally this pollinator attraction experiment should be repeated within the local range of the Na plants, this is of course not feasible. Instead I suggest the problem should be addressed in the discussion explicitly and its consequences for the interpretation of the results should be considered.

      Indeed, North American populations are tested in their “away”- habitat only and the observed plant performance and pollinator visitation rates can thus provide no direct implications for their “home”-habitat. We state this now more clearly at pp 11-12, ll 283-285. However, our design is appropriate for investigating inbreeding effects on plant-pollinator interactions in multiple plant populations in a common environment. Given the close taxonomic relationship of H. bicruris (main pollinator in Europe) and H. ectypa (main pollinator in North America), the behavioural responses of the former species to variation in the quality of its host plant was considered to overlap sufficiently with responses of the latter species as outlined at pp 11-12, ll 285-291.

      The hypothesis that North American (NA) S. latifolia evolved higher attractiveness to female Hadena moths because H. ectypa is not able to oviposit on female plants in contrast to H. bicruris is indeed a highly interesting one. However, as you have outlined correctly, our study is not designed to elaborate on questions related to adaptive evolutionary differentiation among North American and European plants. Instead of addressing this hypothesis based on our data, we thus take reference to previous studies in the discussion p 17, ll 482-487: “As discussed in detail in previous studies, higher flower numbers in North American S. latifolia plants (Figure 1b) may result from changes in the selective regimes for numerous abiotic factors (Keller et al., 2009) or from the release of seed predation. As opposed to H. bicruris, H. ectypa pollinates North American S. latifolia without incurring costs for seed predation, which may result in the evolution of higher flower numbers, specifically in female plants (Elzinga and Bernasconi, 2009).”

      The incorporation of the VOC data in the actual manuscript was quite limited and I found the reasoning for picking only the three lilac aldehydes (in addition to the Shannon diversity index) for the univariate statistical tests insufficient. How much more efficient was the effect of the lilac aldehydes compared to the other 17 compounds deemed important in the previous study? While the data on this one aldehyde matches the pollinator attraction results, having one compound out of 70 (or out of 20 if only considering the ones identified important for the main pollinator) seems, perhaps, fortuitous lest there is a good reason for focusing on these particular compounds.

      We adapted the text to increase clarity but kept our previous choice for the analyses of VOC data.

      i) We now explain our choice of analysing lilac aldehydes in more detail at p 9, ll 210-218: “For targeted statistical analyses, we focused on those VOC that evidently mediate communication with H. bicruris according to Dötterl et al. (2006). We analysed the Shannon diversity per plant (calculated with R-package: vegan v.2.5-5, Oksanen et al. 2019) for 20 floral VOC in our data set that were shown to elicit electrophysiological responses in the antennae of H. bicruris (Supplementary File 1). Moreover, we analysed the intensities of three lilac aldehyde isomers, which trigger oriented flight and landing behaviour in both male and female H. bicruris most efficiently when compared to other VOC in the floral scent of S. latifolia. Furthermore, H. bicruris is able to detect the slightest differences in the concentration of these three compounds at very low dosages (Dötterl et al. 2006).”

      ii) If one analyses 20 compounds with zero-inflation models (actually two models in one) + 8 floral trait models + 2 pollinator visitation models (zi-models with two component models), one ends up with 52 models investigating complex fixed and random effect structures. To keep type-1 errors as low as possible (see also comment 2.12.b from Referee#2), we approached the more comprehensive VOC data sets with multivariate analyses or Shannon diversity.

      iii) We tested the effect of sex × origin × breeding treatment on the Shannon diversity of 20 active VOC as well as in the random forest analyses with the 20 VOC and 70 VOC datasets and transparently reported the results from all of these analyses in the manuscript. Hence, the incorporation of VOC data was not limited. However, we agree that we referred too little to these results and have now changed the text accordingly. Results section p 13 ll 351-354: “Multivariate statistical analyses of 20 H. bicruris active VOC and all 70 VOC detected in S. latifolia revealed no clear separation of floral headspace VOC patterns for any of the treatments (Figure 2-figure supplement 2). In summary, the combined effects of breeding treatment, sex and range on floral scent were rather weak.”
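
      The Shannon diversity per plant mentioned in point i) can be illustrated with a short sketch. This is not the authors' R code: it is a plain-Python version of the standard index H' = -Σ p_i ln p_i (natural logarithm, which is also the default in vegan's diversity function), and the intensity values in the example are made up for illustration.

```python
import math

def shannon_diversity(intensities):
    """Shannon index H' = -sum(p_i * ln(p_i)) over compounds with non-zero intensity.

    `intensities` is one plant's list of non-negative VOC intensities
    (hypothetical input format; one value per compound).
    """
    total = sum(intensities)
    if total <= 0:
        return 0.0  # no detectable compounds: diversity defined as zero here
    h = 0.0
    for x in intensities:
        if x > 0:
            p = x / total          # relative contribution of this compound
            h -= p * math.log(p)   # natural log
    return h

# Made-up example: four equally intense compounds vs. one dominant compound.
even_bouquet = shannon_diversity([5.0, 5.0, 5.0, 5.0])
skewed_bouquet = shannon_diversity([17.0, 1.0, 1.0, 1.0])
print(even_bouquet, skewed_bouquet)
```

      An even bouquet of four compounds reaches the maximum ln(4) for that number of compounds, while a bouquet dominated by a single compound scores lower, so the index summarises evenness as well as the number of compounds detected.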

      Sampling time of VOCs is reported ambiguously. Was it from 21:00 to 17:00 the next day or in fact from 9pm to 5AM (instead of 5 pm as reported)? Please be more specific in the text as this is quite important. If sampling tubes were left in place during the daytime, some of the compounds could have evaporated due to heating of the tubes in the summer. It would also be important to mention whether all of the headspace VOCs were sampled on the same day and whether there could be variation in i.e. temperature.

      Thank you very much for identifying this typo! It is from 9 pm to 5 am (p 9, l 186).

      Considering the experimental setup for the pollinator attraction observations and the pooling of the data at the block level (which I think is the right choice), it seems possible the authors were more likely to get a result where pollinator behavior matches the long-distance cue, the VOCs. Short-distance cues such as a subtle difference in flower size would perhaps not be distinguished with the current setup. I would be interested to know if the authors agree, and if so, I suggest mentioning this in the discussion.

      Thank you very much for this excellent suggestion! We agree and discuss this aspect in detail at p 19, ll 520-526. Indeed, one would need two different experimental setups to assess the contributions of long- and short-distance cues. Our setup (large distances among plots) is optimal for long-distance cues, while a setup for short-distance cues should have all plants in close spatial proximity. However, the latter approach does not then allow one to address long-distance cues or to exclude competition/facilitation for pollinators among plants from different treatment groups.

    1. We are not going to save each other, ourselves, America, or the world. But we certainly can leave it a little bit better. As my grandmother used to say, “If the Kingdom of God is within you, then everywhere you go, you ought to leave a little Heaven behind.”

      Yes we may think things are in turmoil in this world but it is up to us to do something and do our part to make the future better.

    1. Rooney tells us Marianne is isolated, lonely and also very smart but she never actually shows us Marianne’s supposedly exceptional intelligence or feelings of remoteness. Marianne may have been intended to appear deeply flawed, or difficult, but we only ever get a sense of her isolation through her persistent sexual degradation at the hands of men. As a character, Marianne is a cipher, inherently desirable to men and envied by women.

      I found it to be interesting that Rooney differentiated how Connell and Marianne were both shown. It is clearly shown through specific situations and examples that Connell is this popular, well-loved individual but when it comes to Marianne, I tend to agree with this statement. Rooney tells us all of these poor characteristics about Marianne, such as the fact that she's a loner and oddly smart but when do we ever see this actually play out in the story? We don't. I find this difference to be a little disturbing to be honest. I don't see why the men are viewed as so transparent and women are seen as this overly complex individual that we think we know but in reality, never actually get to know on a deeper level.

    1. Author Response:

      Reviewer #2:

      The current work makes the case that local neural measurements of selectivity to stimulus features and categories can, under certain circumstances, be misleading. The authors illustrate this point first through simulations within an artificial, deep, neural network model that is trained to map high-level visual representations of animals, plants, and objects to verbal labels, as well as to map the verbal labels back to their corresponding visual representations. As activity cycles forward and backward through the model, activity in the intermediate hidden layer (referred to as the "Hub") behaves in an interesting and non-linear fashion, with some units appearing first to respond more to animals than objects (or vice-versa) and then reversing category preference later in processing. This occurs despite the network progressively settling to a stable state (often referred to as a "point attractor"). Nevertheless, when the units are viewed at the population level, they are able to distinguish animals and objects (using logistic regression classifiers with L1-norm regularization) across the time points when the individual unit preferences appear to change. During the evolution of the network's states, classifiers trained at one time point do not apply well to data from earlier or later periods of time, with a gradual expansion of generalization to later time points as the network states become more stable. The authors then ask whether these same data properties (constant decodability, local temporal generalization, widening generalization window, change in code direction) are also present in electrophysiological recordings (ECoG) of anterior ventral temporal cortex during picture naming in 8 human epilepsy patients. Indeed, they find support for all four data properties, with more stable animal/object classification direction in posterior aspects of the fusiform gyrus and more dynamic changes in classification in the anterior fusiform gyrus (calculated in the average classifier weights across all patients).

      Strengths:

      Rogers et al. clearly expose the potential drawbacks to massive univariate analyses of stimulus feature and task selectivity in neuroimaging and physiological methods of all types -- which is a really important point given that this is the predominant approach to such analyses in cognitive neuroscience. fMRI, while having high spatial resolution, will almost certainly average over the kinds of temporal changes seen in this study. Even methods with high temporal and moderate spatial resolution (e.g. MEG, EEG) will often fail to find selectivity that is detectable only through multivariate methods. While some readers may be skeptical about the relevance of artificial neural networks to real human brain function, I found the simulations to be extremely useful. For me, what the simulations show is that a relatively typical multi-layer, recurrent backpropagation network (similar to ones used in numerous previous papers) does not require anything unusual to produce these kinds of counterintuitive effects. They simply need to exhibit strong attractor dynamics, which are naturally present in deep networks with multiple hidden layers, especially if the recurrent network interactions aid the model during training. This kind of recurrent processing should not be thought of as a stretch for the real brain. If anything, it should be the default expectation given our current knowledge of neuroanatomy. The authors also do a good job relating properties detected in their simulations to the ECoG data measured in human patients.

      We thank the reviewer for these positive comments.

      Weaknesses:

      While the ECoG data generally show the properties articulated by the authors, I found myself wanting to know more about the individual patients. Averaging across patients with different electrode locations -- and potentially different latencies of classification on different electrodes -- might be misleading. For example, how do we know that the shifts from negative to positive classification weights seen in the anterior temporal electrode sites are not really reflecting different dynamics of classification in separate patients? The authors partially examine this issue in the Supplementary Information (SI-3 and Figure SI-4) by analyzing classification shifts on individual patient electrodes. However, we don't know the locations of these electrodes (anterior versus posterior fusiform gyrus locations). The use of raw-ish LFPs averaged across the four repetitions of each stimulus (making an ERP) was also not an obvious choice, particularly if one desires to maximize the spatial precision of ECoG measures (compare unfiltered LFPs, which contain prominent low frequency fluctuations that can be shared across a larger spatial extent, to high frequency broadband power, 80-200 Hz).

      In the new statistical tests described above, we compute each metric separately for each patient, then conduct cross-subject statistical tests against a null hypothesis to assess whether the global pattern observed in the mean data is reliable across patients. We hope this addresses the reviewer's general concern that the mean pattern obscures heterogeneity across patients. With regard to the question of greater variability in anterior electrodes, the new analysis showing a remarkably strong correlation between variability of coefficient change and electrode location along the anterior-posterior axis provides a formal statistical test of this observation. We view variability of decoder coefficients as more informative than the independent correlations between electrode activity and category label shown in the supplementary materials, because the coefficients indicate the influence of electrode activity on classification when all other electrode states are taken into account (akin in some ways to a partial correlation coefficient). This distinction is noted in SI-3, p 48.

      The authors are well-known for arguing that conceptual processing is critically mediated by a single hub region located in the anterior temporal lobe, through which all sensory and motor modalities interact. I think that it's worth pointing out that the current data, while compatible with this theory, are also compatible with a conceptual system with multiple hubs. Deep recurrent dynamics from high-level visual processing, for which visual properties may be separated for animals and objects in the posterior aspects of the fusiform gyrus, through to phonological processing of object names may operate exactly as the authors suggest. However, other aspects of conceptual processing relating to object function (such as tool use) may not pass through the anterior fusiform gyrus, but instead through more posterior ventral stream (and dorsal stream) regions for which the high-level visual features are more segregated for animals versus tools. Social processing may similarly have its own distinct networks that tie in to visual<->verbal networks at a distinct point. So while the authors are persuasive with regard to the need for deep, recurrent interactions, the status of one versus multiple conceptual hubs, and the exact locations of those hubs, remains open for debate.

      We agree that the current data do not speak to hypotheses about other components of the cortical semantic network outside the field-of-view of our dataset. We have added an explicit statement of this in the General Discussion (page 22).

      The concepts that the authors introduce are important, and they should lead researchers to examine the potential utility of multivariate classification methods for their own work. To the extent that fMRI is blind to the dynamics highlighted here, supplementing fMRI with other approaches with high temporal resolution will be required (e.g. MEG and simultaneous fMRI-EEG). For those interested in applying deep neural networks to neuroscientific data, the current demonstration should also be a cautionary tale for the use of feed-forward-only networks. Finally, the authors make an important contribution to our thinking about conceptual processing, providing novel arguments and evidence in support of point-attractor models.

      Thanks to the reviewer for highlighting these points, which we take to be central contributions of this work!

      Reviewer #3:

      The authors compared how semantic information is encoded as a function of time between a recurrent neural network trained to link visual and verbal representations of objects and in the ventral anterior temporal lobe of humans (ECOG recordings). The strategy is to decode between 'living' and 'nonliving' objects and test/train at different timepoints to examine how dynamic the underlying code is. The observation is that coding is dynamic in both the neural network as well as the neural data as shown by decoders not generalizing to all other timepoints and by some units contributing with different sign to decoders trained at different timepoints. These findings are well in line with extensive evidence for a dynamic neural code as seen in numerous experiments (Stokes et al. 2013, King&Dehaene 2014).
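
      The train/test-across-timepoints strategy described above can be sketched on synthetic data. This is a toy illustration only, not the authors' pipeline: it swaps the L1-regularized logistic regression for a difference-of-class-means linear classifier without a bias term, and it simulates two "units" whose category code rotates by 180 degrees over five timepoints. All names and parameter values are hypothetical.

```python
import math
import random

def simulate(n_trials, n_times, noise_sd, rng):
    """Return data[t] = list of (features, label); the code direction rotates over time."""
    data = []
    for t in range(n_times):
        theta = math.pi * t / (n_times - 1)   # 0 degrees at the first timepoint, 180 at the last
        direction = (math.cos(theta), math.sin(theta))
        trials = []
        for i in range(n_trials):
            label = 1 if i % 2 == 0 else -1    # balanced 'living' (+1) / 'nonliving' (-1)
            x = tuple(label * d + rng.gauss(0.0, noise_sd) for d in direction)
            trials.append((x, label))
        data.append(trials)
    return data

def fit_mean_difference(trials):
    """Weight vector proportional to the difference of the two class means."""
    w = [0.0] * len(trials[0][0])
    for x, label in trials:
        for j, xj in enumerate(x):
            w[j] += label * xj
    return w

def accuracy(w, trials):
    """Fraction of trials whose projection onto w has the sign of the label."""
    correct = 0
    for x, label in trials:
        score = sum(wj * xj for wj, xj in zip(w, x))
        if (score > 0) == (label > 0):
            correct += 1
    return correct / len(trials)

def temporal_generalization(train, test):
    """matrix[t_train][t_test] = accuracy of the classifier fit at t_train on data from t_test."""
    return [[accuracy(fit_mean_difference(train[t_train]), test[t_test])
             for t_test in range(len(test))]
            for t_train in range(len(train))]

rng = random.Random(0)
train = simulate(n_trials=100, n_times=5, noise_sd=0.3, rng=rng)
test = simulate(n_trials=100, n_times=5, noise_sd=0.3, rng=rng)
matrix = temporal_generalization(train, test)
for row in matrix:
    print(" ".join(f"{a:.2f}" for a in row))
```

      In this toy case accuracy is high along the diagonal (category is decodable at every timepoint) but falls to chance and then below chance as training and test times move apart, because the simulated code reverses direction. The sketch does not reproduce the widening generalization window, which would require the rate of rotation to slow over time.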

      Strengths of this paper include a direct model-to-data comparison with the same analysis strategy, a model capable of generating a dynamic code, and the usage of rare intracranial recordings from humans. Weaknesses: While the model-driven examination of recordings is a major strength, the data analysis provides only limited support for the major claim of a 'distributed and dynamic semantic code': it isn't clear that the code is semantic, and the claims of dynamics and anatomical distribution are not quantitative.

      Major issues:

      1) Claims re a 'semantic code'. The ECOG analysis shows that decoding 'living' from 'nonliving' during viewing of images exhibits a dynamic code, with some electrodes contributing to early decodability and some to later, and with some contributing with different signs. It is a far stretch to conclude from this that this shows evidence for a 'dynamic semantic code'. No work is done to show that this representation is semantic; in fact, this kind of single categorical distinction could probably also be made based on purely visual signals (such as in higher levels of a network such as VGG or higher visual cortex recordings). In contrast, the model has rich structure across numerous semantic distinctions.

      We have added a new analysis showing that the animate/inanimate distinction cannot be decoded for these stimuli from purely visual information as captured by a well-known unsupervised method for computing visual similarity structure amongst bitmap line drawings (Chamfer matching). We did not consider deep layers of the VGG-19 model as that model is explicitly trained to assign photographs to human-labeled semantic categories, so the representations do not reflect purely visual structure. The new analysis appears as part of the description of the stimulus set on page 31.
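
      As a rough illustration of the kind of purely visual similarity measure mentioned above, the sketch below computes a symmetric Chamfer distance between two sets of 2D contour points. It is an assumption-laden toy: the exact Chamfer matching variant used in the manuscript (for example, one based on distance transforms of edge maps) is not specified here, and the function names are hypothetical.

```python
import math

def directed_chamfer(a, b):
    """Mean distance from each point in `a` to its nearest point in `b`."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)

def chamfer_distance(a, b):
    """Symmetric Chamfer distance: average of the two directed distances."""
    return 0.5 * (directed_chamfer(a, b) + directed_chamfer(b, a))

# Made-up contour points standing in for the inked pixels of two line drawings.
drawing_a = [(0.0, 0.0), (1.0, 0.0)]
drawing_b = [(0.0, 0.0)]
print(chamfer_distance(drawing_a, drawing_a))  # identical drawings
print(chamfer_distance(drawing_a, drawing_b))
```

      Pairwise distances of this kind give a visual similarity structure over the stimulus set, on which a category decoder can then be tested to see whether the animate/inanimate distinction is recoverable from shape alone.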

      The proposal that ventral anterior temporal cortex encodes semantic information is not new to this paper but is based on an extensive prior literature that includes studies of semantic impairments in patients with pathology in this area (e.g. refs 7, 13, 29-32), studies of semantic disruption by TMS applied to this region (refs. 37-38), functional brain imaging of semantic processing with PET (33), distortion-corrected MRI (34-36), MEG (e.g. Mollos et al., 2017, PLOS ONE), and ECOG (ref. 46), and neurally-constrained computational models of developing, mature, and disordered semantic processing (refs. 7, 31, 40, 53). A great deal of this literature uses the same animate/inanimate distinction employed here as a paradigmatic example of a semantic distinction. It is especially useful in the current case because the animate/inanimate distinction is unrelated to the response elicited by the stimuli (the basic-level name).

      2) Missing quantification of model-data comparison. These conclusions aren't supported by quantitative analysis. This includes, importantly, statements regarding anatomical location (Fig 4E), resemblances in dynamic coding patterns ('overlapping waves' Fig 4C-D), and presence of electrodes that 'switch sign'. These key conclusions seem to be derived purely by graphical inspection, which is not appropriate.

      We have added new statistical analyses of each core claim as explained above.

      3) ECOG recordings analysis. Raw LFP voltage was used as the feature (if I interpreted the methods correctly, see below). This does not seem like an appropriate way to decode from ECOG signals given the claims that are made due to sensitivity to large deflections (evoked potentials). Analysis of different frequency bands, power, phase etc would be necessary to substantiate these claims. As it stands, a simpler interpretation of the findings is that the early onset evoked activity (ERPs) gives rise to clusters 1-4, and more sustained deflections to the other clusters. This could also give rise to sign changes as ERPs change sign.

      The reviewer's comment suggests that information about the category should be reflected in spectral properties of the time-varying signals but not the direction/magnitude of the LFP itself. While we recognize that this is a common hypothesis in the literature, an alternative hypothesis more consistent with neural-network models of cognition suggests that such information can be encoded in magnitude and direction of the LFP itself—the closest brain analog to unit activity in a neural network model. The fact that semantic information can be accurately decoded from the LFPs, following a pattern closely resembling that arising in the model, is consistent with this hypothesis. We agree that, in future, it would be interesting to look at decoding of spectral properties of the signal. We note these points on revised manuscript page 22.

      With regard to this comment:

      a simpler interpretation of the findings is that the early onset evoked activity (ERPs) gives rise to clusters 1-4, and more sustained deflections to the other clusters. This could also give rise to sign changes as ERPs change sign

      We are not sure how this constitutes a simpler or even a different explanation of our data. ERPs at an intracranial electrode reflect local neural responses to the stimulus, which change over stimulus processing. The data show that semantic information about the stimulus can be decoded from these signals at the initial evoked response and all subsequent timepoints, but the relationship between the neural response and the semantic category (i.e., how the semantic information is encoded in the measured response) changes as the stimulus is processed. The changing sign of an ERP reflects changing activity of nearby neural populations. "More sustained deflections" indicates that changes to the code are slowing over time. These are essentially the conclusions that we draw about the dynamic code from our data.

      Maybe the reviewer is concerned that the results are an artifact of just the temporal structure of the LFPs themselves: that these change rapidly with stimulus onset and then slow down, so that the “expanding window” pattern arises from, for instance, temporal autocorrelation in the raw data. Testing this possibility was the goal of the analysis in SI-5, where we show that autocorrelation of the raw LFP signal does not grow broader over time, so the widening-window pattern observed in the generalization of classifiers is not attributable to the temporal autocorrelation structure of the raw data.
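
      The kind of check described here can be sketched as follows. This is only an illustration under stated assumptions, not the analysis reported in SI-5: the function names, the 0.5 threshold, the window sizes, and the synthetic sinusoidal "signal" are all hypothetical. The idea is to estimate, in successive time windows, the lag at which the autocorrelation of the raw signal first drops below a threshold, and then ask whether that lag grows over time.

```python
import math

def autocorr(x, lag):
    """Normalised (biased) sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    if var == 0:
        return 0.0
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var

def autocorr_width(x, threshold=0.5):
    """Smallest lag at which the autocorrelation falls below the threshold."""
    for lag in range(1, len(x)):
        if autocorr(x, lag) < threshold:
            return lag
    return len(x)

def windowed_widths(signal, win, step):
    """Autocorrelation width in successive windows of the signal."""
    return [autocorr_width(signal[s:s + win])
            for s in range(0, len(signal) - win + 1, step)]

# Hypothetical signal: a fast oscillation followed by a slow one.
fast = [math.sin(2 * math.pi * i / 8) for i in range(64)]
slow = [math.sin(2 * math.pi * i / 32) for i in range(64)]
widths = windowed_widths(fast + slow, win=64, step=64)
print(widths)
```

      In this synthetic case the width increases from the first window to the second, which is the pattern that would raise the concern. A width that stays flat across windows would indicate that the raw signal's autocorrelation does not broaden over time.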

    2. Reviewer #2 (Public Review):

      The current work makes the case that local neural measurements of selectivity to stimulus features and categories can, under certain circumstances, be misleading. The authors illustrate this point first through simulations within an artificial, deep, neural network model that is trained to map high-level visual representations of animals, plants, and objects to verbal labels, as well as to map the verbal labels back to their corresponding visual representations. As activity cycles forward and backward through the model, activity in the intermediate hidden layer (referred to as the "Hub") behaves in an interesting and non-linear fashion, with some units appearing first to respond more to animals than objects (or vice-versa) and then reversing category preference later in processing. This occurs despite the network progressively settling to a stable state (often referred to as a "point attractor"). Nevertheless, when the units are viewed at the population level, they are able to distinguish animals and objects (using logistic regression classifiers with L1-norm regularization) across the time points when the individual unit preferences appear to change. During the evolution of the network's states, classifiers trained at one time point do not apply well to data from earlier or later periods of time, with a gradual expansion of generalization to later time points as the network states become more stable. The authors then ask whether these same data properties (constant decodability, local temporal generalization, widening generalization window, change in code direction) are also present in electrophysiological recordings (ECoG) of anterior ventral temporal cortex during picture naming in 8 human epilepsy patients. 
      Indeed, they find support for all four data properties, with more stable animal/object classification direction in posterior aspects of the fusiform gyrus and more dynamic changes in classification in the anterior fusiform gyrus (calculated in the average classifier weights across all patients).
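
      The temporal generalization logic summarized above (train a classifier at one time point, test it at every other time point) can be sketched on toy data. Everything in this sketch is an assumption made for illustration: the two-unit "hub", the rotation schedule of the category code, the noise level, and the helper names are hypothetical, and a simple difference-of-class-means linear classifier is swapped in for the L1-regularized logistic regression used in the paper.

```python
import math
import random

def simulate(n_times=8, n_trials=20, noise=0.1, seed=0):
    """Toy two-unit 'hub': the category code rotates quickly, then settles."""
    rng = random.Random(seed)
    data = []  # data[t] is a list of (pattern, label) pairs
    for t in range(n_times):
        theta = math.pi * (1.0 - math.exp(-t / 2.0))  # fast early, slow late
        trials = []
        for label in (+1, -1):  # +1 = animate, -1 = inanimate (hypothetical)
            for _ in range(n_trials):
                x = (label * math.cos(theta) + rng.gauss(0, noise),
                     label * math.sin(theta) + rng.gauss(0, noise))
                trials.append((x, label))
        data.append(trials)
    return data

def fit(trials):
    """Weights proportional to mean(animate) - mean(inanimate)."""
    w = [0.0, 0.0]
    for x, label in trials:
        w[0] += label * x[0]
        w[1] += label * x[1]
    return w

def accuracy(w, trials):
    """Fraction of trials whose projection onto w has the sign of the label."""
    hits = sum(1 for x, label in trials
               if (w[0] * x[0] + w[1] * x[1]) * label > 0)
    return hits / len(trials)

def generalization_matrix(train, test):
    """acc[i][j]: classifier trained at time i, tested at time j."""
    return [[accuracy(fit(train[i]), test[j]) for j in range(len(test))]
            for i in range(len(train))]

train = simulate(seed=0)
test = simulate(seed=1)  # independent trials for testing
acc = generalization_matrix(train, test)
# Number of test times at which each classifier still performs well.
window = [sum(a > 0.9 for a in row) for row in acc]
print(window)
```

      In this toy setting the diagonal of the matrix stays high (constant decodability), a classifier trained at the first time point performs below chance at the last one (a reversal of code direction), and the count of time points with good generalization is larger for late classifiers than for early ones (a widening window).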

      Strengths:

      Rogers et al. clearly expose the potential drawbacks to massive univariate analyses of stimulus feature and task selectivity in neuroimaging and physiological methods of all types -- which is a really important point given that this is the predominant approach to such analyses in cognitive neuroscience. fMRI, while having high spatial resolution, will almost certainly average over the kinds of temporal changes seen in this study. Even methods with high temporal and moderate spatial resolution (e.g. MEG, EEG) will often fail to find selectivity that is detectable only through multivariate methods. While some readers may be skeptical about the relevance of artificial neural networks to real human brain function, I found the simulations to be extremely useful. For me, what the simulations show is that a relatively typical multi-layer, recurrent backpropagation network (similar to ones used in numerous previous papers) does not require anything unusual to produce these kinds of counterintuitive effects. They simply need to exhibit strong attractor dynamics, which are naturally present in deep networks with multiple hidden layers, especially if the recurrent network interactions aid the model during training. This kind of recurrent processing should not be thought of as a stretch for the real brain. If anything, it should be the default expectation given our current knowledge of neuroanatomy. The authors also do a good job relating properties detected in their simulations to the ECoG data measured in human patients.

      Weaknesses:

      While the ECoG data generally show the properties articulated by the authors, I found myself wanting to know more about the individual patients. Averaging across patients with different electrode locations -- and potentially different latencies of classification on different electrodes -- might be misleading. For example, how do we know that the shifts from negative to positive classification weights seen in the anterior temporal electrode sites are not really reflecting different dynamics of classification in separate patients? The authors partially examine this issue in the Supplementary Information (SI-3 and Figure SI-4) by analyzing classification shifts on individual patient electrodes. However, we don't know the locations of these electrodes (anterior versus posterior fusiform gyrus locations). The use of raw-ish LFPs averaged across the four repetitions of each stimulus (making an ERP) was also not an obvious choice, particularly if one desires to maximize the spatial precision of ECoG measures (compare unfiltered LFPs, which contain prominent low frequency fluctuations that can be shared across a larger spatial extent, to high frequency broadband power, 80-200 Hz).

      The authors are well-known for arguing that conceptual processing is critically mediated by a single hub region located in the anterior temporal lobe, through which all sensory and motor modalities interact. I think that it's worth pointing out that the current data, while compatible with this theory, are also compatible with a conceptual system with multiple hubs. Deep recurrent dynamics from high-level visual processing, for which visual properties may be separated for animals and objects in the posterior aspects of the fusiform gyrus, through to phonological processing of object names may operate exactly as the authors suggest. However, other aspects of conceptual processing relating to object function (such as tool use) may not pass through the anterior fusiform gyrus, but instead through more posterior ventral stream (and dorsal stream) regions for which the high-level visual features are more segregated for animals versus tools. Social processing may similarly have its own distinct networks that tie in to visual<->verbal networks at a distinct point. So while the authors are persuasive with regard to the need for deep, recurrent interactions, the status of one versus multiple conceptual hubs, and the exact locations of those hubs, remains open for debate.

      The concepts that the authors introduce are important, and they should lead researchers to examine the potential utility of multivariate classification methods for their own work. To the extent that fMRI is blind to the dynamics highlighted here, supplementing fMRI with other approaches with high temporal resolution will be required (e.g. MEG and simultaneous fMRI-EEG). For those interested in applying deep neural networks to neuroscientific data, the current demonstration should also be a cautionary tale for the use of feed-forward-only networks. Finally, the authors make an important contribution to our thinking about conceptual processing, providing novel arguments and evidence in support of point-attractor models.

    1. Yes some of the conditions I have described here work to systematically over empower certain groups. Such privilege simply confers dominance because of one’s race or sex.

      I find these points quite interesting. The semantics and the specific words we use when talking about racism matter and deserve discussion. Some of the terms we use do not capture the full extent of the experience they try to describe, and may inadvertently downplay the gravity of what they address. Similarly, some black people now ask non-black people to refer to "the n word" as "the n slur", since the word "slur" emphasizes that it absolutely should not be used by those who cannot use it.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 20 2020, follows.

      Summary:

      In this manuscript, Mughrabi et al reported a technical advance of long term vagus nerve stimulation (VNS) in mice. VNS has been used in clinics for treating certain patients with epilepsy and depression and pioneered in clinical trials for a number of disorders including inflammation. Yet, VNS has not been widely used in mice for mechanistic studies largely due to technical challenges dealing with the small size. Here, the authors developed a method for chronic implantation of VNS stimulator in mice, and tested the effectiveness of the method using measurements of heart rate changes and effects on inflammation. This method is potentially useful to investigate the therapeutic potential of long-term VNS in chronic disease models in mice. While reviewers were positive about the work performed in this study including that it was carried out by multiple labs, there are major concerns about certain points and additional essential experiments are needed. These include the need for robust data related to the LPS inflammation studies and histological analysis. There were also missing details of methodologies that decrease the enthusiasm for this study.

      Essential Revisions:

      1) At least two papers (PMID: 28628030, 32521521) have reported implants usable for the same application (long-term VNS in mice) although more extensive validation and characterization were performed in this manuscript. A comparison between those implants and the one in this manuscript needs to be discussed. As the authors stated, one technical challenge is that the vagus nerve in mice is very small and fragile. However, it is unclear how the approach presented here is different from previous designs, and in particular, how mechanical damage is reduced using the reported apparatus.

      2) If the paper is going to be a resource, the authors should provide detailed descriptions of the materials and construction of the electrode. Currently the details are sparse and the photos are of poor resolution. It is unclear how the custom cuff was built (no details provided in the method section), what materials were used, and whether these materials are bio-compatible. Also, it is not clear whether and how the cuff electrode is appropriately insulated to prevent stimulation of surrounding muscles/nerves. In addition, the contact point between the nerve and the cuff is easily damaged. In the description of the implantation procedure, it should also be made clearer when the cuff electrode is placed on the nerve. A clear description could help prevent torsion or other injury to the nerve.

      3) LPS experiments: All reviewers thought the LPS experiment needed improvement. This study is under-powered and lacks a control group (saline + Sham stim). The LPS study is inconclusive due to a small number of animals. Increasing N to get conclusive data is important because this implant will be very useful to investigate the anti-inflammatory effect of long-term VNS in chronic disease models in mice. Related to this point, out of the 4 animals with bradycardia, 2 animals did not show a decrease in serum TNF. This raises a concern that using heart rate threshold may not be appropriate to deliver a consistent stimulation dose within/across animals if the goal is to get a consistent anti-inflammatory effect. It is likely that vagus efferent fibers responsible for HR decrease (innervating the sinoatrial and atrioventricular nodes) and those responsible for an anti-inflammatory effect are different populations. Those two populations might be differently affected by the implantation surgery and repetitive stimulation. In addition, performing VNS in awake animals is closer to the human situation.

      4) Please confirm that 0.1mg/kg is the correct dose, this seems low to induce this amount of TNFa.

      5) The histology of the vagus nerve raised questions and needs to be addressed. Here were relevant comments by reviewers.

      • In fig 4b, the vagus nerve in the cuff is quite clear, as is the carotid artery. But there are other nerve fragments and/or auto-fluorescent tissue immediately adjacent. What are these? Leads one to wonder if they only stimulated the vagus? The cervical sympathetic travels with the cervical vagus and care is needed to separate them from the carotid sheath. On the right side of fig 4b, the "control" side, they highlight a nerve nowhere near the carotid artery. This is intact tissue, so the vagus has to be next to the carotid artery. There is a big nerve next to the right carotid that I would bet is the vagus. I think they've got it wrong. It is not clear at what level these photos are taken, is it the cervical vagus? The authors should indicate the left and right carotid in these figures.

      • Figure 4. I do not see how fibrosis is determined. Is this actually collagen? Can the sections in B be stained with Masson's trichrome? In "B" I am not sure that I see that the indicated regions are in fact the vagus nerve. It is hard to tell what other nerves would be present as there are few indications of the anatomical area these sections are from other than neck. Thus it is hard to discern if this really is the vagus or not. I would have thought that the carotid artery should be visible in close proximity to the nerve bundle; this seems not to be the case and leads to uncertainty that this is the correct nerve.

      • Was there any difference in histology between mice with functioning and non-functioning cuffs? As stated in Discussion, left VN without surgery in different animals would be a better control than right VN in the same animals.

      6) In the data presented in fig 2 or any of the studies where the Kent Scientific pulse/ox was used, did O2 saturation decrease with the change in breathing?

      7) Why didn't animals receiving awake VNS show visible changes in BR, which is in contrast to remarkable changes in BR in anesthetized animals?

      8) In video 1, it is unclear when the stimulation starts or stops. As a result, it is uncertain if the mouse scratching is due to stimulation. Is this a pain/nociceptive response?

      9) Fig 3 is presented in a confusing manner. In "A", I'm not sure why two mice are presented for different days post implantation and what this is showing. There is a clear effect of VNS on the heart rate and breathing (rate, and air flow); is this the minimum current for each day that was found to induce the heart rate threshold change? While I appreciate that the longer pulse widths are less susceptible to the effect of bio-encapsulation of the electrode over time, I'm not sure how one compares 100 uA at 100 us to 400 uA at 600 us. In B, how is the HRT achieved without damaging the electrode as the ICIC is exceeded, or are we not understanding this graph correctly? In C there are days that seem to be missing given the legend. The supplementary figure also appears to have data points missing or obscured.

      10) Success rate tops out at 75% with a skilled surgeon, and ranges between 40-60% for your average player. I'd say this is not too good.

      11) It would be nice to show that the implant does not cause chronic inflammation as this would impact its usefulness as a method. The authors should measure TNFa 14 days post-implantation in cuff-implanted and sham-implanted mice.

      12) What behavioral experiments were done, and what were the results? These are mentioned in several places (line 172, line 279 etc) but not reported.

      13) The vagus nerve is critically involved in many essential body functions. Chronic implantation of the VNS stimulator may cause severe inflammation, nerve damage, and neuronal dysfunction. Therefore, it is critical to demonstrate that the chronic implantation does not alter nerve function. The chronic effect of the VNS stimulator implantation needs to be carefully monitored. For example, whether there is any change in body weight, food intake, as well as the sensitivity of diverse physiological reflexes such as the baroreflex, the Hering-Breuer reflex, and the stomach accommodation reflex.

    1. Author Response:

      We would like to thank the reviewers for their thoughtful and thorough critique of our manuscript. In our revised preprint, we added important additional data and restructured our manuscript to reflect as many of the recommendations as possible. Additionally, we have added experiments to define the cellular mechanisms underlying observed damage following mechanical injury. The most significant additions of new data include:

      • Further experiments demonstrating block of glutamate clearance exacerbates stimulus-induced hair-cell synapse loss.
      • Analysis of neuromast disruption in lhfpl5b mutant null larvae showing mechanical displacement. Lhfpl5b mediates mechanosensitivity in lateral-line hair cells, allowing us to determine whether mechanotransduction is required for mechanical disruption of neuromasts.
      • Testing the vibratory stimulus at various frequencies to confirm the optimal frequency to induce acute, generally sub-lethal damage to lateral-line hair cells is 60 Hz.
      • Assessment of neuromast supporting cell and hair cell proliferation following mechanical overstimulation.
      • Quantitative analysis of kinocilia SEM and confocal images of hair bundles in control and stimulus-exposed fish.

      Individual comments are addressed as outlined below.

      Reviewer #1:

      1) The authors use a vertically-oriented Brüel+Kjær LDS Vibrator to deliver a 60 Hz vibratory stimulus to damage lateral line hair cells. It is not made clear on why this frequency was selected. Did the authors choose this frequency because they screened a number of frequencies and this is the one that did the most damage to hair cells or was it chosen for another reason? Or, do all frequencies do the same amount of damage? The authors should screen a number of frequencies and choose the stimulus that does the most damage to hair cells. This would set the field in the best direction, should members of the community attempt this new technique. It is not necessary to repeat all of the experiments, but the authors should show which frequencies are best for inducing damage.

      The frequency selected for mechanical overexposure of lateral-line organs was based on previous studies showing 60 Hz to be within the optimal upper frequency range of mechanical sensitivity of superficial posterior lateral-line neuromasts, with maximal response between 10-60 Hz, but a suboptimal frequency for hair cells of the anterior macula in the ear (Weeg and Bass 2002, Trapani et al, 2009, Levi et al, 2015). To confirm that 60 Hz was the optimal frequency to induce damage, we tested 45, 60, and 75 Hz at comparable intensities. We observed no apparent damage to lateral line neuromasts at 75 Hz, while 45 Hz at a comparable intensity proved lethal to the fish. We have updated the Results and Method Details to include our rationale for choosing 60 Hz.

      2) The SEM images of the hair bundle are beautiful and do show damage to the hair bundle, but historically speaking older studies in mammals have shown that the actin core of the stereocilia is damaged. It would be critical to know if this was the case. Showing damage to the kinocilium and stereocilia splaying is a start, but readers would need to know if the actin cores are damaged. So, TEM should be used to find damage to the actin cores of stereocilia.

      Our main goal of this initial manuscript was to survey morphological and functional changes in mechanically injured lateral line organs with an emphasis on inflammation and synapse loss. We agree TEM studies showing damage to the actin core of the stereocilia will be important to determine whether mechanical damage to neuromast hair bundles fully mimics mammalian stereocilia damage, but these experiments will require significant time to perform and optimize. We have expanded our analysis of hair-bundle morphology in this study and intend to pursue deeper analysis of hair bundle damage, i.e. examination of the stereocilia actin core, in future follow-up studies.

      3) I think the use of "Noise-exposed lateral line" as a term for mechanically overstimulated lateral line hair cells is not correct and could be misleading. The lateral line senses water motion not sound as the word noise would imply. Calling the stimulus "noise" should be removed throughout.

      We have removed the term “noise” throughout the manuscript and replaced it with either “strong water current stimulus” or “mechanical overstimulation” where appropriate.

      4) Decreases in mechanotransduction are shown by dye entry. These results should be strengthened using microphonic potentials to determine the extent of damage. This experiment is not necessary but would improve the quality of the document.

      While we agree that microphonic recordings would provide further support for reduced mechanotransduction, quantitative FM1-43 uptake in zebrafish lateral line hair cells is a well-established proxy for microphonic measurements. In a previous study using the same protocol utilized in our manuscript, FM1-43 labeling intensity was shown to directly correspond with microphonic amplitude (Toro et al, 2015). Moreover, the fixable analogue of FM1-43 (FM1-43FX) gave us comparable relative measurements of uptake as live FM1-43 and provided the additional advantage of high temporal resolution and the ability to simultaneously assay entire cohorts of control and overstimulated fish (which is not possible with microphonic measurements or live FM1-43 imaging), as we could expose groups of fish briefly to the dye at determined time intervals following overstimulation, then immediately place in fixative.

      5) In figure 2, PSD labeling is not clear.

      We assume the reviewer meant PSD labeling in Figure 4 and we agree it is difficult to discern. We have changed the hair-cell label from gray to blue in the images so that the green PSD labeling is clear.

      Reviewer #2:

      1) While the findings are carefully measured and described, the effects of insult on hair cells are relatively minor, with a change in hair cell number, extent of innervation or synapses per hair cell (Figs 3 and 4) in the range of 10% reduction compared to control. One potential value of the model would be to use it to discover underlying pathways of damage or screen for potential therapeutics. However with these modest changes it is not clear that there will be enough power to determine effects of potential interventions.

      One advantage of the zebrafish model is the ability to overstimulate large cohorts of larvae, thereby providing enough power to uncover modest but significant changes resulting from moderate damage to hair cells. While not as well suited for unbiased large-scale screens of therapeutics, our overexposure protocol provides the opportunity to determine the role of specific cellular pathways (e.g. metabolic stress, inflammation, and glutamate excitotoxicity) in hair-cell damage and synapse loss following mechanically-induced damage via genetic or pharmacological manipulation of these pathways. Additionally, as the hair cell synapses fully repair following stimulus-induced loss, the zebrafish model has the potential for identifying novel pathways for repair through transcriptomic profiling (for an example, see Mattern et al, Front. Cell Dev. Biol., 2018). Cumulatively, these future experimental directions will provide important mechanistic information that could be used toward the development of targeted therapeutic interventions.

      2) The most dramatic phenotype after shaking is a physical displacement of hair cells, described as disrupted morphology. However it is not clear what the underlying cause of this change. Are only posterior neuromasts damaged in this way? Is it a wounding response as animals are exposed to an air interface during shaking? It is also not clear to what extent this displacement reveals more general principles of the effects of noise on hair cells. Additional discussion of underlying causes would be welcome.

      We agree that the underlying causes of the physical displacement of posterior lateral-line neuromasts warranted further investigation and we have expanded appropriate sections of the results. To determine if excessive hair-cell activity plays a role in the displacement of neuromasts, we exposed lhfpl5b mutants (fish that have intact hair cell function in the ear, but no mechanotransduction in hair cells of the lateral line) to mechanical overstimulation. We observed comparable disruption of neuromasts lacking mechanotransduction, supporting that displacement of lateral-line hair cells is due to mechanical damage and does not require intact mechanotransduction. Further, when examining the adjacent supporting cells in disrupted neuromasts, we observed they are similarly displaced and elongated. We conclude that the observed disruption of hair cells is a consequence of mechanical displacement of the entire neuromast organ. We have added additional discussion of this phenomenon to the Results and Discussion sections of the manuscript.

      3) Because afferent neurons innervate more than one neuromast and more than one hair cell per neuromast, measurements of innervation of neuromasts (Figure 3) or synapses per hair cell (Fig 4) cannot be assumed to be independent events. That is, changes in a single postsynaptic neuron may be reflected across multiple synapses, hair cells, and even neuromasts. This needs to be accounted for in experimental design for statistical analysis.

      We agree that changes in single postsynaptic neurons, which innervate groups of hair cells of the same polarity within a neuromast, could be reflected across multiple synapses. Additionally, it is plausible that excitotoxic events at the postsynapse, while not contributing to apparent neurite retraction, could be contributing to synapse loss across multiple innervated hair cells. We have updated the manuscript to reflect the potential contribution of postsynaptic signaling to synapse loss and added experiments pharmacologically blocking glutamate uptake.
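
      One common way to respect the non-independence raised in this comment is to summarize measurements at the level of the independent unit before comparing groups. The sketch below is only an illustration of that idea, assuming the animal is the independent unit; it is not a description of the statistics used in the manuscript, and the record layout, animal identifiers, and synapse counts are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def per_animal_means(records):
    """Collapse (animal_id, value) measurements to one mean per animal."""
    grouped = defaultdict(list)
    for animal_id, value in records:
        grouped[animal_id].append(value)
    return {animal_id: mean(values) for animal_id, values in grouped.items()}

# Hypothetical synapses-per-hair-cell counts, several hair cells per fish.
control = [("fish1", 3), ("fish1", 4), ("fish2", 4), ("fish2", 4)]
exposed = [("fish3", 3), ("fish3", 3), ("fish4", 2), ("fish4", 4)]

control_means = per_animal_means(control)
exposed_means = per_animal_means(exposed)
# Group comparisons would then use one value per animal (n = 2 per group
# here) rather than one value per hair cell (n = 4 per group).
print(control_means, exposed_means)
```

      Mixed-effects models that nest hair cells within neuromasts within animals are another option; the per-animal summary is simply the most conservative version of the same idea.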

      4) The SEM analysis provides compelling snapshots of apical damage, but could be supplemented by quantitative analysis with antibody staining or transgenic lines where kinocilia are labeled. The amount of reduced FM1-43 labeling is one of the more dramatic effects of the shaking insult, suggesting widespread disruption to mechanotransduction that could be related to this apical damage. Further examination of the recovery of mechanotransduction would be interesting.

      To supplement the SEM snapshots of severe apical damage, we have expanded the SEM image analysis with quantitative data on kinocilia morphology. We have also added confocal images of hair bundles using antibody labeling of acetylated tubulin in a transgenic line expressing β-actin-GFP in hair cells. We agree that correlative studies of mechanotransduction recovery relative to hair-bundle morphology would be interesting, and we intend to examine this question in a future follow-up study.

      5) A previous publication by Uribe et al. 2018 describes a somewhat similar shaking protocol with somewhat different results - more long-lasting changes in hair cell number, presynaptic changes in synapses, etc. It would be worth discussing potential differences across the two studies.

      We agree we did not adequately address the considerable differences between our mechanical damage protocol for the zebrafish lateral line and the damage protocol described by Uribe et al, 2018. We have provided a more direct comparison in the Results section and addressed the differences in our protocols in-depth in the Discussion section.

      Our damage protocol uses a stimulus within the known frequency range of lateral-line hair cells (60 Hz) that is applied to free-swimming larvae and evokes a behaviorally relevant response (fast start response). The damage is observable immediately following noise exposure, is specific to posterior lateral-line neuromasts, and appears to be rapidly repaired. Some features of the damage we observe—reduced mechanotransduction and hair-cell synapse loss—may correspond to mechanically induced damage of hair cell organs in other species. Notably, hair cell synapse loss in seemingly intact neuromasts is exacerbated by pharmacologically blocking synaptic glutamate clearance, supporting that the 60 Hz frequency stimulus is overstimulating neuromast hair cells directly and suggesting that the mechanism of synapse loss may be similar to inner hair cell synapse loss reported in mice following moderate noise exposures.

      By contrast, the damage protocol published by Uribe et al used ultrasonic transducers (40-kHz) to generate small, localized shock waves rather than directly stimulate neuromast hair cells. The damage they reported (delayed hair-cell death and modest synapse loss with no effect on hair-cell mechanotransduction) was not apparent until 48 hours following exposure and was not specific to the lateral-line organ. Some of the features of the damage they observed (delayed onset apoptosis and hair-cell death) may correspond to damage reported in mice following blast injuries.

      Reviewer #3:

      1) As the authors point out, zebrafish hair cells can be regenerated. With that in mind, and to make the relevance for mammalian hair cell repair clear, a clear distinction between mechanisms mediated by "repair" or "regeneration" needs to be made. The authors discuss that proliferative hair cell generation can be excluded based on the short time period, but suggest that transdifferentiation might be involved. Recovery of NM hair cell number occurs within the same 2 hour period in which NM morphology and hair cell function improved, making it difficult to determine the extent to which "regeneration" contributed to the recovery. The amount of transdifferentiation has to be shown experimentally (lineage tracing?).

      We agree that the distinction between "repair" and "regeneration" needs to be made when discussing this model of mechanical damage to zebrafish hair cell organs. We have tried to clarify that most of what we observe regarding recovery (restoration of neuromast shape, mechanotransduction, afferent contacts, and synapse number) reflects mechanisms of repair following mechanical damage (and, in the case of synapse loss, overstimulation) rather than regeneration. However, one feature of damage that may reflect rapid regeneration is restoration of hair cell number following mechanical injury. To experimentally determine whether proliferation contributed to hair cell generation, we assessed the incorporation of the thymidine analog EdU during a 4 hour recovery following mechanical overexposure in a transgenic line expressing GFP in neuromast supporting cells and observed a modest but not statistically significant increase in the number of proliferating supporting cells in neuromasts exposed to strong current stimulus, suggesting recovery of lost hair cells is not primarily due to renewed proliferation.

      The number of hair cells that are lost and recover within several hours is low, i.e., typically ~1 hair cell/neuromast. We observed this consistently in all of our experiments, but the mechanisms responsible are not clear. Based on previous studies of hair cell regeneration in the lateral line, the recovery time appears too rapid to be caused by renewed proliferation, a notion that is further supported by our EdU studies. On the other hand, it is possible that a few supporting cells may undergo the initial phases of phenotypic change into hair cells during this short time period, and we speculate that such transdifferentiation may be responsible for the observed recovery. We should emphasize that this is a new observation and, at present, we do not fully understand the underlying mechanism. However, the focus of the present study is on mechanical damage, synaptic loss, and subsequent repair. We believe that it is important to report our consistent findings of low level hair cell loss and recovery, but a detailed characterization of the mechanism would require considerable effort and would best be the topic of a future study.

      2) The classification of "normal" vs "disrupted" is vague and not quantitative. The examples shown in the paper seem to be quite clear-cut, but this reviewer doubts that was the case throughout all analyzed samples. Formulate clear benchmarks and criteria for the disrupted phenotype (even when blind analysis is performed).

      We have defined measurable criteria for "normal" vs "disrupted" neuromasts that we have added to the Method Details section: “We defined exposed neuromast morphology as “normal” when hair cells appeared radially organized with a relatively uniform shape and size, with ≤7 μm difference observed when comparing the lengths from apex to base of an opposing pair of anterior/posterior hair cells. Length was measured from a fixed point at the center of the hair bundle to the basolateral end of each opposing hair cell. We defined neuromasts as “disrupted” when hair cells appeared elongated and displaced to one side, with >7 μm difference observed when comparing the lengths of an opposing pair of anterior/posterior hair cells. Generally, the apical ends of the hair cells were displaced posteriorly, with the basolateral ends oriented anteriorly.”

      3) Sustained and periodic exposure: These two exposure protocols not only differ with respect to sustained vs periodic, they also differ in total exposure time (Fig 2B). This complicates the interpretation, especially considering the authors own finding that a pre-exposure is protective.

      To clarify—pre-exposure was not protective to hair-cell survival. Rather, in preliminary experiments, pre-exposure appeared to reduce larval mortality, and we have clarified that observation in the text of the Results and the Methods Details sections. We agree with the reviewer that comparing the two protocols based on differences in time distribution is complicated in that they also differ in total exposure time. For the purpose of clarity, we now focus on the sustained exposure in the main figures and created supplemental figures for the reduced damage still observed using periodic exposure, specifying that reduced damage may be the result of periodic time distribution of stimulus and/or less cumulative time exposed to the stimulus.

      4) The data on the mitochondrial ROS aspect seems not well integrated into the overall story.

      We agree that the ROS story was not well integrated and incomplete. We have removed the data describing mpv17-/- mutants and mitochondrial dysfunction from this manuscript. A more comprehensive report of mpv17-/- mutant mitochondrial function and morphological analysis of neuromasts following noise exposure is now described in a follow-up manuscript (“Influence of Mpv17 on hair-cell mitochondrial homeostasis, synapse integrity, and vulnerability to damage in the zebrafish lateral line”).

      5) It is surprising that the hair bundle morphology was not assessed after recovery. This is crucial. Overall, it would be good to see some quantification of the SEM data, e.g. kinocilia length and number of splayed bundles.

      We have expanded the SEM image analysis to quantitatively assess kinocilia morphology following exposure. We agree that assessment of recovery using live imaging of hair bundles paired with subsequent SEM analysis will be informative, and we intend to perform those experiments in a future study.

      6) Behavioral recovery (measured as number of "fast start" responses) was also not assessed. This is essential for determining the functional relevance of the recovery.

      We attempted to measure behavioral recovery of lateral-line function by measuring “fast-start” responses immediately and several hours after recovery, and discovered that i) strong water current provided stimulation that was too intense to reveal subtle behavioral changes following lateral-line damage and recovery, and ii) when testing larvae immediately following sustained strong current exposures, it was difficult to discern if fewer “fast-start” responses were due to lateral-line organ damage or larval fatigue. We agree that behavioral recovery is important to assay but acknowledge that assessing lateral-line mediated behavior following mechanical damage will require a more sensitive testing paradigm that stimulates the lateral-line sensory organ with a relatively gentle, calibrated water flow stimulus. We are currently performing a follow-up study to this paper using a testing paradigm developed by a postdoctoral associate in our lab that analyses subtle changes in larval orientation to water flow (rheotaxis) mediated by the lateral-line organ. Using this behavior paradigm, we will directly correlate morphological and functional recovery over time.

      7) This reviewer is not yet convinced that this damage model displays enough commonalities to mammalian noise damage to justify the ubiquitous use of the term "noise" throughout the manuscript. It would be more prudent to use a more careful term along the lines of "mechanical overstimulation-induced damage".

      We have removed the term “noise” throughout the manuscript and replaced it with either “strong water current stimulus” or “mechanical overstimulation” where appropriate.

      8) Overall, there was a lack of experimental and analysis detail in the results section. For example, how was afferent innervation quantified? Just counting GFP labeled contacts to hair cells?

      Innervation of neuromast hair cells was quantified during blinded analysis by scrolling through confocal z-stacks of each neuromast (step size 0.3 μm) containing hair cell and afferent labeling and identifying hair cells that were not directly contacted by an afferent neuron, where direct contact means no discernible space between the hair cell and the neurite. Hair cells that were identified as no longer innervated showed measurable neurite retraction; there was generally >0.5 μm distance between a retracted neurite and hair cell. We have added this information to the Methods Detail section.

      There was also inconsistency in the use of two variations of the mechanical damage protocol, the time points at which repair was assessed, and whether the damage was quantified in all neuromasts or in normal vs. disrupted neuromasts separately, making the data difficult to interpret.

      We have revised our figure legends to clearly indicate when we are assessing damage in all exposed neuromasts (pooled) to control vs. comparative analysis of normal vs. disrupted neuromasts relative to control. In addition, we now focus on the sustained exposure in the main figures, which was the exposure protocol used for the time points in which repair and recovery were assessed.

    1. Author Response:

      Reviewer #1:

      In this manuscript, Ma, Hung and colleagues rewind the tape to explore the genetic landscape that precedes carbapenem resistance of Klebsiella pneumoniae strains. The importance of this work is underscored by the paucity of new drugs to treat CPO (carbapenemase producing organisms). 'Given the need for greater antibiotic stewardship, these findings argue that in addition to considering the current efficacy of an antibiotic for a clinical isolate in antibiotic selection, considerations of future efficacy are also important.' And so I would say the major weakness of the paper is the aspirational nature of how this work could be used by clinicians in antibiotic selection or treatment of the patient.

      We consider this study as a first step towards recognizing the need to develop more comprehensive diagnostics and more sophisticated antibiotic stewardship programs. This study suggests that factors besides MICs could inform clinical antibiotic selection, including that specific lineages have higher propensity to develop resistance (i.e., ST258), stepping-stone mutations that facilitate the evolution of resistance (i.e., mutations in rseA and ompK36), and antibiotics that have high level resistance barriers (i.e., meropenem). We have now added language to both the introduction and discussion to note that next steps are needed to extend these findings into the clinic, including more extensive whole genome sequencing of isolates and tracking of these strains in the clinic, associated patient outcome and strain evolution data, to understand the full impact of these mutational events in CREs.

      The strains selected for these experiments and the evolutionary in vitro models are both well considered. One idea that has stuck with me from the figures of a review article by Kishony (https://pubmed.ncbi.nlm.nih.gov/23419278/, figure 4) is the concept of constraining the evolutionary pathways or fitness landscape for antibiotic resistance. Are there any peaks that a microbial strain reaches that optimize resistance to one AbX but basically leave it inherently unable to evolve resistance to another AbX? This could have application for dual drug therapy or pulsed therapy.

      This is a good evolutionary question that might be suggested by Kishony’s work. In our particular study however, because the majority of isolates used that are carbapenem susceptible are already resistant to many other antibiotics, we cannot measure their resistance frequencies to other clinically relevant antibiotics. It does suggest that such a strategy would have to be implemented early enough before strains have already acquired significant resistance and cannot be used to manage currently existing resistance.

      When you sequence the isolates that have increased their MIC do you find 'unrelated' mutations in genes that would control protein synthesis or other functions that might be compensatory mutations. Developing a clearer understanding of the rewiring of the bacterium's basic processes might also elucidate both integrated functions and potential weaknesses. You mention mutations in wzc, ompA, resA, bamD.

      Yes. We found some strains had acquired multiple mutations in multiple genes. Please refer to supplementary file 12. In some cases, we found additional mutations of unclear significance; for example, we identified two mutations in Mut86. We tested these two mutations separately and found that only the mutation in ompA affects the susceptibility of the mutant. However, this does not exclude the possibility that the other mutation might have other compensatory functions versus just being a random passenger mutation; this will require further investigation.

      On the other hand, in some cases, we indeed found mutations that affect the fitness of the isolates when cultured in LB medium or M9, e.g., mutations in rseA. Some mutations affect fitness only in LB medium but not M9, e.g., mutations in ompK36. Some mutations do not significantly affect the fitness in either LB or M9, e.g., duplication of blaSHV-12. We are performing RNA sequencing on these mutants to further understand the “rewiring of the bacterium's basic processes.”

      Point of discussion. Classic ST258 carries blaKPC on pKpQIL plasmid. Your ST258 strain (UCI38) carries blaSHV-12 on pESBL. Am I to assume that pESBL is in lieu of pKpQIL?

      Indeed, pESBL encodes an ESBL in UCI38 and may obviate the need for another classical KPC-carrying plasmid such as pKpQIL. However, pESBL and pKpQIL are not incompatible and so it is not clear that anything is precluding UCI38 from picking up pKpQIL.

      Transformation of CPO have many variables and in vitro data does not always mirror what is observed in vivo. So the findings of Fig 2f might need to be considered under different laboratory conditions (substrate, temperature) [https://pubmed.ncbi.nlm.nih.gov/27270289/].

      We revised the statement in the revision and pointed out that the results in Fig. 2F were limited to our assay condition.

      Reviewer #2:

      In this manuscript, Ma et al. sought to investigate the breadth of genetic mechanisms available across various lineages of clinical isolates of Klebsiella pneumoniae, with a specific focus on carbapenem resistance evolution. The authors systematically evaluated how different carbapenems and genetic backgrounds affect the rate of evolution by measuring mutation frequencies. The authors report three major observations: First, that a higher mutational frequency is dependent on genetic background and high-level transposon activity affecting porins associated with carbapenem resistance. Importantly, transposon activity was not only higher than SNP acquisition rates in distinct backgrounds, but was also reversible, thus emphasizing that resistance evolution via this mechanism might impart less of a cost than by the accumulation of mutations in other genetic backgrounds. Second, that CRISPR-cas systems have the potential to restrict the horizontal acquisition of resistance elements. Importantly, determining the presence or absence of such systems alone is not enough to determine whether a strain is "resistant" to certain foreign elements, but specific sequences within the different spacers can be more informative of the exact range of plasmids or genetic elements to which the system is restrictive. Third, pre-selection with ertapenem increases the likelihood of resistance evolution against other carbapenems both via de novo mutation and HGT.

      Altogether, these results emphasize the importance of additional factors, other than MIC values, such as genetic background, plasmid/transposon activity, and drug identity and choice in determining the rate at which resistance can evolve in K. pneumoniae. I consider that the data generally support the authors' conclusions and provide relevant observations to the field. I do not have any major concerns and think the authors have done a very complete and systematic evaluation of the data necessary to answer their questions.

      My only minor concern is regarding the authors' emphasis in their introduction and discussion on how this kind of data is relevant for clinical decision making. It remains unclear to me exactly how. While I completely agree that genomic information and drug choice play a major role in the evolution of antibiotic resistance, it is unclear to me how to efficiently and promptly translate all of this information at the bedside. Genome sequencing, however economical it has become in recent years, is still not affordable to implement at the scales needed for diagnosis at the clinic. Perhaps the authors could expand on how they envision this could be implemented?

      We consider this study as a first step towards the development of more comprehensive diagnostics and more sophisticated antibiotic stewardship. Indeed, as current diagnostics exist, it would be difficult to implement. However, we hope that as studies such as these grow, it will usher in a new era of diagnostics that can indeed take such factors into account. We have now added such a discussion to the introduction and discussion in the revised manuscript.

    1. Author Response:

      Reviewer #1:

      This MS combines two-photon glutamate sensing (using the iGluSnFR fluorescent probe), two-photon glutamate uncaging, two-photon calcium imaging and electrophysiology to investigate whether synaptically released glutamate activates receptors outside the synapse of release, and at neighboring synapses. The data themselves are very impressive. The authors arrive at the revolutionary conclusion that synaptically released glutamate is able to activate both NMDA and even AMPA receptors at neighboring synapses, remarkably strongly. I say revolutionary, because previous modelling has yielded diametrically opposite conclusions. The reflex would be to prefer experiment over theory, yet the modelling was based upon quite strongly constrained physical parameters that would be quite incompatible with the interpretations reported here. However, I believe the authors have failed to take into account significant technical limitations inherent in the technologies they apply. These include spatial averaging of fluorescence, possible saturation of iGluSnFR and diffusive exchange of (caged) glutamate during uncaging. As a result, the conclusion is wholly unproven. Indeed, I believe it highly probable that all of the data in favor of distal activation will prove to be consistent with synapse specificity and the presence of technical artifacts related to spatial averaging of fluorescence signals and diffusive exchange of (caged) glutamate during uncaging.

      We agree that there are technical limitations and that the interpretation of signals recorded from near synapses is difficult. This concerns the length constants we describe and name SARGe. Our usage of those terms in the Results may have suggested that we propose that the value of lambda itself describes the action range of glutamate well. This is not the case, as the reviewer states, and at the beginning of the Discussion section we note this limitation.

      However, our interpretation that glutamate may regularly activate AMPA-R in neighboring synapses is not based on lambda values (see discussion).

      It is based on the facts that a) ~5% iGluSnFR responses are observed at more than 1.5 µm remote to a synapse and b) uncaging at 500 nm produces a current response of ~38% of the quantal synaptic amplitude. Here, the remarks of the reviewer are incorrect: a) is not affected by volume averaging or saturation of iGluSnFR, and previous models predict an activation of only up to 1-2%. We have shown this by simulation in an appeal letter, which unfortunately was not forwarded to the reviewer. b) is not increased by “diffusive exchange of glutamate during uncaging”. In fact, releasing the same amount of glutamate over a longer period reduces distant receptor activation, and current models predict a 2- to 4-fold lower activation of AMPA-R than we observe here. This was also shown by simulation in the appeal letter, but a further exchange with the reviewer on this was not permitted by the editors.

      Reviewer #2:

      Matthews, Sun, McMahon et al. addresses the extent of the spread of the neurotransmitter glutamate into the extracellular space. The authors use a combination of imaging techniques, 2-photon glutamate uncaging and electrophysiology to conclude that vesicular glutamate release reaches nearby, adjacent synapses. Although this is an interesting question, and one that has been addressed many times previously, I have several technical concerns about the strength of the conclusions that reduces my enthusiasm.

      Unfortunately, only this general part of Reviewer #2's comments is published, so we cannot meaningfully rule out or comment on the reviewer’s concerns.

      Reviewer #3:

      This is an interesting paper combining several impressive techniques to argue that synaptically released glutamate is allowed to diffuse to and activate receptors at much greater distance than previously thought. iGluSnFR recordings show that glutamate released from single vesicles activates the indicator with a spatial spread (length constant) of 1.2 µm, substantially farther than previous estimates based on the time course of glutamate clearance by glial transporters (PMC6725141). Similar parameters are observed with spontaneous and evoked events, large or small, or when glutamate is released via 2P uncaging. Further uncaging experiments show that both AMPARs and especially NMDARs are activated at a substantial distance. AMPARs, previously thought to be recruited only within active synapses, are activated with a spatial length constant that compares quite closely with the average distance between synapses in the hippocampus. More heroic experiments and some geometric calculations show that this behavior enables neighboring synapses to interact supralinearly. The results suggest that "crosstalk" between neighboring synapses may be substantially more common than previously thought.

      The experiments in this paper appear carefully performed and are analyzed thoroughly. Despite all of the quantitative rigor and careful thought, however, the authors fail to reconcile convincingly their results with what we know about neuropil structure and the laws of diffusion. There are very good data in the literature regarding the extracellular volume fraction and geometric tortuosity of the neuropil, the diffusion characteristics of glutamate and the time course of glutamate uptake. These data more or less demand that synaptically released glutamate is diluted over a much smaller spatial range than that suggested here. In the Discussion, the authors suggest that this discrepancy might reflect a simplified view of the neuropil as an isotropic diffusion medium (PMC6763864, PMC6792642, PMC6725141), whereas a more realistic network of sheets and tunnels (PMC3540825) might prolong the extracellular lifetime of neurotransmitter. I like this idea in principle, but there is no quantitative support in the paper for the claim - in fact, it seems at odds with the authors' very nice demonstration that diffusion appears to be similar in all directions (Figure 3B). I don't necessarily think a solution is within the scope of this single paper, but I would suggest that the authors acknowledge the present lack of a compelling explanation.

      It is correct that our results are not predicted by the modelling studies cited, and in our eyes this is what makes them important. It is important to note, however, that those modelling/simulation studies use a strong simplification and treat the extracellular space/neuropil as a porous medium. This is a powerful approach, but it is only a valid description when considering diffusion distances of several micrometers; it is not applicable on the submicron scale of neighboring synapses (PMID: 15345540 p1608; PMID: 7338810 p227, and DOI: 10.1088/0034-4885/64/7/202). This drawback of the simulations has been overlooked, the reviewer seems not to be aware of it, and we point it out at the end of the Discussion section. We do not suggest anisotropy near a synapse, nor a particular perisynaptic geometry in which specific channels would lead from one synapse to the next. We, too, assume that the neuropil is random (as shown by PMID 9547224); rather, everywhere in the neuropil the initial, submicron diffusion will not follow the “porous medium approach”.

      It is true that we do not offer a quantitative description of how this violation of the porous medium approach would lead to an underestimation of synaptic cross-talk; we provide experimental data. However, in our appeal letter we explicitly describe this discrepancy in detail to make the reviewer aware of it, but regrettably this information never reached the reviewer.

    2. Reviewer #3 (Public Review):

      This is an interesting paper combining several impressive techniques to argue that synaptically released glutamate is allowed to diffuse to and activate receptors at much greater distance than previously thought. iGluSnFR recordings show that glutamate released from single vesicles activates the indicator with a spatial spread (length constant) of 1.2 µm, substantially farther than previous estimates based on the time course of glutamate clearance by glial transporters (PMC6725141). Similar parameters are observed with spontaneous and evoked events, large or small, or when glutamate is released via 2P uncaging. Further uncaging experiments show that both AMPARs and especially NMDARs are activated at a substantial distance. AMPARs, previously thought to be recruited only within active synapses, are activated with a spatial length constant that compares quite closely with the average distance between synapses in the hippocampus. More heroic experiments and some geometric calculations show that this behavior enables neighboring synapses to interact supralinearly. The results suggest that "crosstalk" between neighboring synapses may be substantially more common than previously thought.

      The experiments in this paper appear carefully performed and are analyzed thoroughly. Despite all of the quantitative rigor and careful thought, however, the authors fail to reconcile convincingly their results with what we know about neuropil structure and the laws of diffusion. There are very good data in the literature regarding the extracellular volume fraction and geometric tortuosity of the neuropil, the diffusion characteristics of glutamate and the time course of glutamate uptake. These data more or less demand that synaptically released glutamate is diluted over a much smaller spatial range than that suggested here. In the Discussion, the authors suggest that this discrepancy might reflect a simplified view of the neuropil as an isotropic diffusion medium (PMC6763864, PMC6792642, PMC6725141), whereas a more realistic network of sheets and tunnels (PMC3540825) might prolong the extracellular lifetime of neurotransmitter. I like this idea in principle, but there is no quantitative support in the paper for the claim - in fact, it seems at odds with the authors' very nice demonstration that diffusion appears to be similar in all directions (Figure 3B). I don't necessarily think a solution is within the scope of this single paper, but I would suggest that the authors acknowledge the present lack of a compelling explanation.

    1. Goleman’s Model of Situational Leadership

      Although I think all three models have good points, Goleman's model is the one that resonates most strongly with me. This model considers more than just competency, commitment and confidence. While all of those are certainly important factors, they are fairly rigid and leave little room for people who may be at the in-between stages. For example, someone might be warming up to an idea, so you may not need to go as heavy on selling it, but they aren't quite at a place where they are ready to dive in to build competence.

      Goleman's model includes more of the grey area. In social work, we literally live and work in the grey area, with very little being black and white, so a leader can only be effective if they understand and respond to the grey. This model essentially gives a leader more options in how to respond without restricting them to the four categories.

      It could be argued, however, that some of Goleman's leadership styles could actually be damaging. For example, the coercive leader may only serve to heighten a crisis instead of taking swift action to address it. Each of the styles listed by Goleman has both an upside and a downside. In counterpoint, however, I would say that all three models have pros and cons, and leaders will come up against times when no one model totally fits.

    1. However, it can be extremely frustrating placing the tiles. Very commonly there will be no position to place a tile in and it will be put to one side. Perhaps someone new to tile-laying games wouldn't find this so odd, but to anyone with experience of Carcassonne it will seem very limiting. In Carcassonne you can pretty much always place a tile, with several choices of position available. Every player I've introduced this game to has looked at me as if to say, "We must be doing something wrong." But no, that game is designed that way. Sometimes it feels like the map builds itself - there is often only one viable placement, so it starts to feel like a jigsaw, searching for that available position. Surely placing a single tile shouldn't be this difficult!

      I don't think I'd find it frustrating. I think I would enjoy the puzzle part of it.

      But indirectly I see that difficulty in placing tiles impacting my enjoyment: because it means that there are no/few meaningful decisions to be had in terms of where to place your tile (because there's often only 1 place you can put it, and it may sometimes benefit your opponent more than yourself) or which tile to place (because you don't get any choice -- unless you can't play the first one, and then you can play a previously unplayable one or draw blind).

    1. The training Rob May refers to, by contrast, is "training" in a broad sense. He writes: ……and by training I don’t mean training a neural net. The current model of doing so is way too targeted to be a generic benefit like we need for the AI-as-electricity framework to make sense. At some point, I think training will be a more generic process that includes humans training machines, and machines learning by reacting to a broad based environment like humans do - not just narrowly targeted applications. I think broad based training is the place to really get economies of scale - train a thing once and see it execute that training as many times as the world needs it to for all kinds of applications. With this broader kind of "training" between humans and machines, machines and their environment, and even machines and other machines, there is hope of substantially reducing the cost of training AI.

      Training between machines and their environment, and between machines and machines.

      Just imagining it, this feels like it could greatly reduce training costs. Imp

      So the goal of M2M is not merely automation, but automatic feedback between machines.

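      On a controllable network, with 5G and edge intelligence, the purpose of all this is to reduce training costs rather than simply to chase applications. imp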

    1. Accountability to the Learning Community

      This is talk that attends seriously to and builds on the ideas of others; participants listen carefully to one another, build on each other’s ideas, and ask each other questions aimed at clarifying or expanding a proposition. When talk is accountable to the community, participants listen to others and build their contributions in response to those of others. They make concessions and partial concessions (yes...but...) and provide reasons when they disagree or agree with others. They may extend or elaborate someone else’s argument, or ask someone for elaboration of an expressed idea.

      I think this paragraph thoroughly explains accountability to the learning community. As teachers, it is crucial we create an environment where students listen to and expand upon the ideas of their classmates. Thus, this is a collaborative discussion between students and guided by teachers so that a meaningful discussion can be had.

      Examples of this could be a teacher asking if anyone else wants to add on to a classmate's response or providing wait time during a discussion.

    2. Q1: Accountability to the learning community involves talk that has meaning. This allows students to be active listeners and participants in conversations. The goal is for students to be heard and to hear one another, we want them to be able to build on each other’s ideas and ask questions to clarify their thoughts and expand their understanding. Example: Take your time, we’ll wait. I really like this example because even today (as a 27 year old) there are times when I’m talking, I lose my thought and I need that chance to gather myself back.

      Q2: Accountability to standards of reasoning is “talk that emphasizes logical connections and the drawing of reasonable conclusions...involves explanation and self-correction.” In other words, accountability to standards of reasoning is talk that is supported with reasonable thought. What I learned from the kindergarten discussion is that we want students to be able to have a discussion, and in this discussion there will be disagreements. However, what happens after the disagreement is what matters: can I get the other person to agree with me? And can I use reasonable evidence to help them understand why my view may be better? Example: Student A: I think Robert is happy that Stevie is gone because now he doesn’t need to play with him anymore. Student B: I disagree, I think Robert is sad Stevie is gone, because he was like a little brother to him.

      Q3: Accountability to knowledge is based on information that can be shared and accessed by everyone, which can include facts, written texts, or other publicly accessible information. According to the article, this is the most complex of the three accountabilities. Example: George Washington was the first president of the United States; we read about it in a book and it’s written down in history.

      Q4: Interdependent means that the three dimensions go hand in hand; they work together, or “must co-occur”. I think it is best explained in the article: “Knowledge is most easily identified as agreed-upon facts. Yet disconnected facts are a weak basis for reasoned argument. What makes facts usable is the connection to other facts, tools, and problem-solving situations, that is, the network of concepts, relationships, and the norms of evidence characteristic of reasoned argument taking place within a coherent discipline or practice.” In order for discourse to occur, the three dimensions must work side by side so that students can fully grasp and be involved in meaningful conversations.

      Q5: The main challenge to accountable talk is the different backgrounds students come from and how the discourse norms may be available to some but not others. This is a challenge because some students will easily engage in accountable talk while others may struggle because this is their first time practicing it. I agree with the article that there are challenges to accountable talk; however, because this is school and our job is to educate our students, I think it should still be done. One way to help students who aren't familiar with accountable talk is to pair them with students who are familiar with it and have each pair work as a team against a similar pair, so they learn that way first. Make it an observable lesson before it becomes an active-participation lesson.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to Reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Major points:**

      • The affinity analyses need more work. This is against A/B/C isoforms, and also the dimerization affinity between the fluorescent proteins could change the apparent on/off rates. This point is not quantified or discussed. Due to the chemical equilibrium analysis, the apparent equilibrium is not only affected by this on/off rates, but also the local availability (concentrations) of the reacting moieties. In the limit where the biosensor concentration is low within a cellular subcompartment or vice versa, how this is going to change the sensitivity of detection because this can push the reaction in either directions. Since equimolar distribution of the moieties are not guaranteed, this affects the detection characteristics of this biosensor. This point should be discussed and emphasized.

      Regarding the A/B/C isoforms: We did not mean to claim that the sensor is specific for RhoA; based on the literature, we are certain it will also bind RhoB and RhoC. We observed binding to active RhoB in an experiment not shown in the manuscript. To make this clearer, we changed the name of the Rho GTPase to Rho.

      Regarding the dimerization affinity: Some initial data have been acquired for the weaker dimers Venus and iRFP. They seem to have a slightly beneficial effect, but less beneficial than the stronger dimer dTomato. We agree that the biosensor concentration affects the performance (which is an important point with respect to optimizing the right concentration, as will be discussed later). We think that the local availability is not limiting because of fast diffusion of the soluble biosensor. However, this may be an issue in highly polarized cell types such as neurons. This is added to the discussion: ‘The biosensor concentration of relocation probes affects their performance. Although the diffusion of a soluble probe will not readily lead to differences in local availability in most cell types, this may be an issue in highly polarized cell types.’

      • Fig 1 A: Are the fluorescence changes of the biosensors due to stimulation with histamine completely reversible ? In other words, is it possible to see a total recovery of the signals with pyrilamine or in the presence of another antagonist ? If not, why?

      This is typically what we observe for this antagonist. Although it is added at a saturating concentration, it cannot completely switch off the Rho GTPase activity. This has also been observed with a DORA FRET sensor (Figure 4B in: https://doi.org/10.1124/mol.116.104505)

      Does histamine stimulation induce a maximal activation of RhoA in HeLa cells? What happens in terms of fluorescence changes when the activity of RhoA is inhibited or in the presence of a Gαq-inhibitor, and in conditions in which RhoA activating GEF, RhoA GAP or RhoA GDI is overexpressed ? Generally, I think it is useful to have a calibration curve of the biosensors activity, maximal/minimal (ON/OFF) response. For example, it would help to answer the question concerning biosensors binding affinity for RhoA ("The function of rhotekin is not clear, it seems to lock RhoA in the GTP bound state (Ito et al., 2018; Reid et al., 1996). We can only speculate that rhotekin has a stronger binding affinity for active RhoA than anillin and PKN1 have." (p.15))

      We have optimized our system to achieve high Rho activation and this has previously allowed us to do a quantitative comparison of the contrast of RhoA FRET sensors (see supplemental material of: https://doi.org/10.1038/srep14693). Whether this is a maximal response is unclear, but we do observe robust and consistently strong responses, which were not achieved by other strategies.

      What is the effect of histamine stimulation on a membrane marker expression/location ?

      We propose to perform an additional experiment, measuring the fluorescent intensity for a cytosolic fluorescent protein in the HeLa cell histamine stimulation assay, since we measure the depletion in fluorescent intensity of the sensor in the cytosol.

      What is the effect of histamine stimulation on dT2xrGBD biosensor response when this one is forced to be located in other subcellular compartments (mitochondria, nucleus) by fusing the construct to targeting sequences.

      We have not tried this experiment and we are not sure what would be the point of that experiment? If the construct would be forced to localize, we would not observe relocalization.

      Physiological control: Effect of the presence of the biosensor in cell morphology/behavior... Experimental data concerning this point are evoked in the discussion section. "We demonstrate that low expression of the biosensor, through the truncated CMV promotor, did not inhibit cell division and cell edge retraction. Plus, endothelial cells expressing the sensor still show the typical reaction of contracting followed by spreading, when stimulated with thrombin. Low expression results in a low fluorescent signal of the sensor." (p.16) I think this results would deserve a section in this manuscript.

      This is the data shown in Figure 6, we will refer to it more clearly.

      Fig 2D : "The anillin sensor AHD+PH showed a 15% decrease in cytosolic intensity (Figure 2D), but it also relocalizes to striking punctuate structures upon histamine stimulation. These structures did not seem to represent local, high activity of RhoA, as the optimized rGBD sensor in the same cell showed no such locally clustered RhoA activation, but rather a homogenous activation at the membrane and a 60% drop in cytosolic intensity. Similar punctuate structures were observed in endothelial cells, when stimulated with the strong RhoA activator thrombin (Supplemental Movie 5)." And p. 15 : "However, we noticed that the AHD+PH sensor, containing aGBD, C2 and PH domain, localizes in a punctate manner. These 'dots' were observed in both HeLa cells and endothelial cells and were only observed with the AHD+PH RhoA sensor. As aGBD does not localize in puncta, it seems that the localization is caused by domains other than of the RhoA binding domain, i.e. the C2- and/or PH-domain." Punctate structures are also present in HeLa cells expressing the anillin sensor before histamine stimulation (see Supplemental Movie 4). Moreover, punctuate pattern activated by thrombin in endothelial cells looks different (more widespread) than the one activated by histamine in HeLA cells. In addition, these structures can also be found in human endothelial cells expressing dT2xrGBD (fig. 6B, Supplemental movie 10). What are those structures thrombin activated in endothelial cells that would be similar to the ones in Hela cells activated by histamine and that "did not seem to represent local, high activity of RhoA"? This is not further commented by the authors.

      Very well spotted. What can be seen in Figure 6B and SMovie 10 are different vesicles that are always observed in endothelial cells expressing fluorescent proteins. We think they are endosomes/lysosomes, which would explain why especially the more pH-stable red fluorescent proteins are visible in these structures. They do not localize at the membrane but in the cytosol. These structures are not induced by RhoA activation, and are not present in the TIRF data, which excludes the cytosol.

      • Fig 3A: "The rGBD sensors solely colocalized in the nucleus with RhoA but not with Rac1 and Cdc42, indicating that rGBD specifically binds constitutively active RhoA." What about dT2xrGBD binding specificity for the three homologues RhoA, RhoB and RhoC? This point is evoked in the discussion part (p.16) but there is no experimental data to support it "The specificity of the relocation sensor is determined by the binding specificity of the GBD. The rGBD binds the three homologues RhoA, B and C but not to Rac1 and Cdc42". So, why rGBD is presented as a RhoA biosensor?

      We apologize for this misunderstanding. We have no reason to assume that the biosensor does not bind all three isoforms. We will refer to the RhoA/B/C isoforms as ‘Rho’ and we will call it a Rho sensor.

      Fig 3B: The data scatter for the dTomato-2xrGBD is very wide compared to the mScarlet-1xrGBD. What is causing this wide data scatter and such heterogeneous response? This is a problem if the sensor is really so heterogeneously responding to a strong mutant of RhoA, is this a dimerization-dependent problem?

      We think that this is related to expression levels. Since dTomato-2xrGBD shows higher amplitudes, the spread also becomes larger, and so we think the coefficient of variation will be similar. We will add standard deviations and indicate fluorescent intensity.
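
      The argument that a larger amplitude brings a proportionally larger spread can be checked with the coefficient of variation (standard deviation divided by the mean). The following is a minimal sketch, not the authors' analysis: the response values are made-up numbers chosen only for illustration, and the function name is hypothetical.

```python
import numpy as np

def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the absolute mean."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    if mean == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    # ddof=1 gives the sample (not population) standard deviation
    return values.std(ddof=1) / abs(mean)

# Hypothetical response amplitudes (illustrative numbers, not measured data).
small_response = np.array([0.08, 0.10, 0.12])
# The same responses scaled five-fold: the absolute spread grows with the amplitude.
large_response = 5.0 * small_response

cv_small = coefficient_of_variation(small_response)
cv_large = coefficient_of_variation(large_response)
print(cv_small, cv_large)
```

      If the wider scatter of one sensor is only a consequence of its larger amplitude, the two coefficients of variation should be similar; a clearly larger value for one sensor would point to genuine extra heterogeneity.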

      These domain-based biosensors could cause dominant negative/inhibitory artefacts. Also the dimerizing fluorescent proteins could introduce oligomerization of the signaling complex which is not real in cells and clearly affect phenotype. These issues should be tested and addressed by a quantitative measure of cell behavior against increasing concentration/changing dimerization potentials of the biosensor in live cell assays.

      We agree that this type of biosensor can, in a general sense, cause dominant negative/inhibitory artefacts, and we explicitly mention this in the text: “Visualizing the endogenous Rho activity may interfere with the biological role of Rho, as the sensor binds endogenous Rho and may compete with natural effectors of Rho”

      We were worried about this possible downside and have been looking very carefully at the effects of the biosensor. As highlighted in the manuscript, we observed mitosis and natural contraction/spreading of endothelial cells. We were able to make stable cell lines. These are all signs that there are no strong negative effects. We also advise using low expression of the sensor to limit negative effects: “To limit the perturbation, the sensor should be expressed at a low level to allow Rho signaling”

      Fig 4 C: "Given the successful improvement of the rGBD-based biosensor by increasing the number of binding domains, we explored whether the same strategy can be applied to the G protein binding domains from PKN1 and Anillin" and "The dimericTomato-2xrGBD sensor shows the best relocation efficiency, with a median change in cytosolic intensity of close to 50%"... So why the dT-2xaGBD construct has not been tried ?

      Because we did not see the stepwise improvement as we saw for the rGBD sensor, so we do not expect an improvement in that construct. Plus, the cloning for the 2xaGBD was initially not working out.

      p.9 : "None of the pGBD sensors showed a clear membrane localization upon stimulation with histamine (Figure 4A). The increase in cytosolic intensity observed in some cells, seems to be caused by changes in cell shape." Do changes in HeLa cell shape induced by histamine stimulation? How this can be explained? Do some cells expressing the rGBD sensors (single, tandem and triple and dimericTomato) undergo these changes of shape too, upon histamine stimulation? If yes, to what extent these changes in cell shape affect signals?

      The activation of Rho GTPases by the histamine receptor often results in changes in cell shape in HeLa cells. We propose to perform an additional experiment with a cytosolic fluorescent protein in the HeLa cell histamine stimulation assay, to measure potential intensity changes caused solely by shape changes.

      p9: Overall, the paragraph about Fig 4 E,F is not clear. What amino acid sequences of G Protein Binding Domains of Anillin and PKN1 bring for the understanding of rGbD, aGBD and pGBD sensors?

      Since there is no crystal structure for rGBD available, we thought it is interesting to compare the amino acid sequences to see how similar/ different these domains are.

      p. 12, Fig 6C, Fig. 6E: "The membrane marker showed a relatively small increase in intensity after stimulation and the curve did not show the same pattern as the RhoA biosensor intensity curve. Therefore, we conclude that the increase in RhoA biosensor intensity is caused by relocalization." It surprises me that decrease in cell areas induced a very small increase in fluorescence intensity of the membrane marker. It would be very helpful to see a figure with a quantification of the membrane marker intensity changes during this process. What about a cytoplasmic marker?

      Figure 6D shows the intensity measurements of the membrane marker. The small change can be caused by membrane changes, but also by other factors that affect intensity (e.g. a focus change). We will add the membrane intensity measurements to Figure 6F and G as well. Since these measurements are made in TIRF, the intensity of a cytoplasmic marker would be very low. Therefore, we decided to use a membrane marker.

      In addition, how does the movement artefact is corrected?

      The ROIs were drawn by hand to measure the fluorescence intensity.

      "Our data revealed that the RhoA biosensor displays RhoA activity at subcellular locations where RhoA activity is expected, and appears mostly independent of fluorescent intensity measured by a separate membrane marker." This part should be developed further. Are there examples of cells for which the biosensor activity is dependent on fluorescent intensity measured by a separate membrane marker?

      The intensity of the membrane marker is only affected by changes in membrane area or morphology (and other technical reasons that lead to a change in intensity, e.g. focal drift, bleaching). This point is made in the paper by Dewitt that we cite (https://doi.org/10.1083/jcb.200806047). We are not aware of papers that show biosensor activity dependent on a separate membrane marker. One potential confounding issue is quenching of the membrane marker by FRET, but this would lead to a decrease in intensity and we do not observe that.

      Discussion (p.16): "Comparing relocation sensors to FRET sensors, both have their own advantages and disadvantages." The dT2xrGBD sensor is here presented as a new relocation sensor for RhoA activity. However in general, there should be more development of the direct comparisons, pros and cons, with quantitative data or more details allowing to have a general overview of the advantages and disadvantages of this new relocation biosensor as compared to the existing ones.

      We explain the pros and cons of FRET sensors and relocation sensors in the introduction and we show a quantitative comparison of this new relocation biosensor as compared to existing relocation biosensors (figure 2). The advantage of the relocation sensor relative to a FRET sensor is highlighted in the discussion: “Furthermore, the relocation sensor requires confocal microscopy or TIRF microscopy to spatially separate the bound from unbound probe, whereas FRET measurements are usually performed with widefield microscopes. However, the former mentioned techniques usually offer the higher resolution. Here we presented previously unachieved visualization of Rho activity at subcellular resolution. We observed local activation of Rho at the Golgi which was not possible with the DORA RhoA FRET sensor (Van Unen et al., 2015), indicating a higher sensitivity of the relocation sensor.”

      Minor points:

      • Overall, scale bars should have to be included in HeLa cells microscopy images.

      We will provide the width of the image in the figure captions.

      It was not clear until the Methods section that the widefield analysis appeared to be normalized against another fluorescent protein-based cytoplasmic signal to correct for variations in cell volume. I think this point should be mentioned in the main text more prominently and emphasized so that readers are not misled.

      The normalization of time traces has been done to account for differences in the initial intensity (e.g. due to differences in expression level), this is now better explained: “The mean gray value or cell area respectively, were normalized by dividing each value by the value of the first frame, to account for differences in the initial intensity.” Of note, there is no extra cytoplasmic signal to correct for variations in cell volume.
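
      The first-frame normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function name and the example intensity values are hypothetical.

```python
import numpy as np

def normalize_to_first_frame(trace):
    """Divide every value in a time trace by the value of its first frame."""
    trace = np.asarray(trace, dtype=float)
    if trace.size == 0:
        raise ValueError("trace is empty")
    if trace[0] == 0:
        raise ValueError("first frame is zero; cannot normalize")
    return trace / trace[0]

# Hypothetical cytosolic mean gray values for two cells whose expression
# levels differ ten-fold but whose relative response is identical.
dim_cell = [200.0, 180.0, 120.0, 100.0]
bright_cell = [2000.0, 1800.0, 1200.0, 1000.0]

dim_norm = normalize_to_first_frame(dim_cell)
bright_norm = normalize_to_first_frame(bright_cell)
print(dim_norm)
print(bright_norm)
```

      After normalization both traces start at 1, so cells with different initial intensities can be compared on the same axis. As the response notes, this corrects only for the starting intensity, not for changes in cell volume.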

      • p. 9 : "Anillin AH+PH sensor" instead of "Anillin AHD+PH sensor"

      Corrected.

      • Fig 2B and 2D : Explain what parameter is used for the normalization of each signals ?

      We state in the methods: “ The mean gray value or cell area respectively, were normalized by dividing each value by the value of the first frame, to account for differences in the initial intensity.”

      • Fig. 1A, top panel: it would be good to know which images correspond to the addition of histamine and which ones correspond to the addition of pyrilamine

      The time line with the grey bars indicating the stimulus of the graph matches the images. We changed the legend to clarify: “The images match with the perturbation that is indicated for the plot in panel C.”

      • "TRIF microscopy" is written in legends of Fig. 6 and of Supplemental movie 11, and in Materiel and Methods section p. 23
      • Fig. 3 legend: Correct "mScralet-I-1xrGBD"
      • Fig 4F, legend: " Anillin and the bound RhoA are depicted in dark and light yellow, respectively. PKN1 and the bound RhoA are depicted in light and dark blue, respectively." Color codes in legend are opposites to the figure ones.
      • p.11 : "To examine this, we used a rapamycin-induced hetero dimerization system to recruit the dbl homology (DH) domain, of the RhoA activating GEF p63, to the membrane of the Golgi apparatus." Corresponding references should be included.

      Thanks for pointing these out, all have been addressed/corrected.

      Fig. 5A : Explain FRB, Fig 5C : no unit for a ratio

      We changed the legend “A) Still images of HeLa cells expressing FRB (part of rapamycin hetero-dimerization system) anchored to the membrane, Golgi and mitochondria (first column), FKBP-p63-DH (counterpart of rapamycin hetero-dimerization system, not shown), localization of the dimericTomato-2xrGBD sensor pre activation (second column) and post activation with 100 nM rapamycin (third column).”

      Reviewer #1 (Significance (Required)):

      Mahlandt et al. optimized and compared several G protein binding domain (GBD)-based biosensors in order to improve the potential of existing RhoA-domain-based biosensors for visualizing and reporting RhoA subcellular activity in living cells and tissue. The authors demonstrate that fusing a dimerizing fluorescent protein to the rhotekin GBD (rGBD) is an efficient strategy to increase the brightness of the sensor. The use of Rhotekin-RBD as affinity domain for Rho-class of GTPase is very well established, both in the methods of affinity pulldowns and in biosensor designs for Rho-class of GTPases in the field. The authors show that the dimericTomato-2xrGBD biosensor can indicate endogenous RhoGTPase spatial activity in dividing HeLa cells and during cell retraction of human endothelial cells.

      The dimericTomato-2xrGBD biosensor is thus introduced and described as a RhoA localization-based biosensor, however no experimental data demonstrate the binding specificity of the biosensor for RhoA. Moreover, authors discuss about a previous work showing that rGBD binds the three paralogs RhoA, RhoB and RhoC. This point and the apparent singular claim of this biosensor reporting RhoA activity as this manuscript alludes to are inappropriate and misleading.

      We apologize for the misconception that this probe is specific for RhoA. We do not want to claim this sensor is specific for RhoA (and note that we have been involved in generating FRET biosensors for the different isoforms, RhoA/B/C ourselves: https://doi.org/10.1038/srep25502). We have addressed this in the introduction, and we have changed RhoA to Rho to better reflect that we are looking at all three isoforms.

      This point especially in light of the field has moved on in the past 20 years to assign more specificity (not less) to which GTPase the biosensors are being specific, i.e., via FRET, etc., significantly tempers the enthusiasm of this reviewer. In addition to this main issue, the incomplete characterization of the relative affinities of the domain to the target GTPase isoforms and of the dimerization affinities of the fluorescent proteins (which could change the apparent reaction rate constants), and the impact of which on the reversibility, oligomerization states and detection sensitivity, and the biology, also appeared lacking. Additional stoichiometric considerations and apparent reaction equilibrium that are impacted by the relative concentrations of interacting moieties require careful and further analyses, study and discussion. In general, I think that this work could be interesting to a more specialized field audience with further analyses of the affinities of the interacting moieties and better characterization of the behavior of this biosensor in living cells since it is likely causing oligomerization of the signaling units due to the forced dimerization of the detection unit.

      **Referees cross-commenting**

      This is a dimerizing probe. It gets pretty bulky. Is dimerization occurring prior to GTPase binding or after? Is the dimerized probe/GTPase complex somehow more stable than would otherwise be if they were monomeric? If so, how would that affect the lifetime of the detection and also the diffusivity of the probe("s", if already dimerized) and possibly the whole oligomer?

      dTomato has been shown to be a strong, obligate dimer. Therefore, we assume that the fluorescent probe is present as a dimer before (and after) binding to the GTPase. With respect to size/bulkiness, we’d like to note that the biosensor is only somewhat larger than a FRET sensor, i.e., 2x47 kDa and 74 kDa, respectively.

      It still feels to me that, yes new brighter fluorescent proteins were used, and dimerization and multimerization of the signaling complex increased the SNR of the system, but the whole premise just reverted the biosensor field back 20yrs, which has been my biggest single concern regarding this paper.

      This evaluation is in our opinion largely based on the misconception that we claim RhoA specificity. We do not claim that this sensor is specific for RhoA (and we have revised the manuscript accordingly) and we are not aiming to replace FRET sensors (being quite fond of FRET sensors, as is clear from our previous work). We think that there are ample opportunities and applications for the improved relocation sensor (as is also evident from requests for the plasmids that encode the probe), for instance in experiments where FRET sensors are challenging to use, such as optogenetics experiments and multiplexing biosensors. We state in the discussion: “Single color relocation sensors are ideal candidates for multiplexing experiments. Plus, the growing field of optogenetics is in need of single color biosensors to detect the effect of optogenetic perturbations. The conventional CFP-YFP FRET sensor is incompatible with most, blue light induced optogenetic tools.”

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Visualization of subcellular activity of GTPases is critical for the understanding of signal transduction of cell growth, differentiation, morphogenesis, etc. For this purpose, researchers often use relocation probes, which comprise a fluorescent protein(s) and a GTPase-binding domain(s), and move from cytosol to the location of active GTPases. The authors improved a previously reported RhoA probe with a strategy of increasing the avidity of RhoA-binding domain and optimizing the fluorescent protein. In the beginning, the authors declare "the relocation of the original, single rGBD monomeric fluorescent protein sensor is hardly detectable" in HeLa cells. To overcome this problem, they developed six constructs by changing the number of rGBD (rhotekin GBD) domains and fluorescent proteins. They found that the increase in the number of rGBD and a dimeric prone fluorescent protein, tdTomato, generate a better probe for RhoA activity. The specificity was examined by using active Rac1 and Cdc42 proteins. Different RhoA-bind domains derived from Rhotekin, PKN1, and Anillin were compared to show the superiority of rhotekin GBD. Finally, they show that subcellular RhoA activation detected by the probe is consistent with the knowledge on RhoA activation by using vascular endothelial cells. Overall this work has been well done in an organized way and disclose a novel RhoA probe that will be useful in future research of RhoA.

      **Major comments:**

      Reproducibility: The number of analyzed cells is described in the legend, but the number of independent experiments is not shown. This is critical to evaluate the reproducibility of the data. Preferably, the data should be presented to show data set derived from each trial clearly. It should also be described how cells were selected for the analysis? It is also preferable to apply automatic analysis. Ideally, the raw data with code sets for analysis should be presented.

      We will indicate the independent experiments. ROIs were partly drawn by hand. We agree that segmentation based methods would increase reproducibility, but this data set is not suitable for automated analysis.

      1. A serious defect of the relocation probe is the dependency on the expression level. The lower the number of the probe in a cell, the higher the fraction of recruited to active RhoA. However, lowering the probe concentration will be accompanied by dim fluorescence. The authors should describe how the optimal expression level was achieved.

      We fully agree. Using the low expression promoter improved the dynamic range but we have not gained control over the optimal expression level. It does vary from cell to cell. We added this paragraph to the discussion: “However, the optimal expression level is crucial for the dynamic range of the relocation sensor. Low concentrations of the sensor will show higher levels of relocalization, as a larger fraction of the sensor molecules binds the limited, active, endogenous Rho molecules. Nevertheless, if the concentration of sensor is too low, the fluorescent signal cannot be detected. To optimize the expression level, the CMVdel promoter, leading to a lower expression level, was applied (Watanabe and Mitchison 2002). Even though this minimal promoter improved the performance of the relocation sensor, a variety of expression levels was observed. Cell sorting could be applied to select for cells with the optimal expression level.”
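
      The dependence of relocalization on expression level follows from simple binding equilibrium. The sketch below assumes a well-mixed, 1:1 binding model between the sensor and active Rho, which is a simplification that ignores avidity from multiple binding domains, dimerization, and competition with endogenous effectors. The dissociation constant and concentrations are arbitrary illustrative numbers in the same arbitrary units, not measured values, and the function name is hypothetical.

```python
import math

def fraction_sensor_bound(sensor_total, rho_active_total, kd):
    """Equilibrium fraction of sensor bound to active Rho for 1:1 binding.

    Solves [SR]^2 - (S + R + Kd) [SR] + S * R = 0 for the complex
    concentration [SR] and returns [SR] / S.
    """
    if sensor_total <= 0:
        raise ValueError("sensor_total must be positive")
    b = sensor_total + rho_active_total + kd
    # The smaller root is the physically meaningful one ([SR] <= S and [SR] <= R).
    complex_conc = (b - math.sqrt(b * b - 4.0 * sensor_total * rho_active_total)) / 2.0
    return complex_conc / sensor_total

# Assumed values in arbitrary concentration units (illustration only).
RHO_ACTIVE = 1.0
KD = 1.0

low_expression = fraction_sensor_bound(0.1, RHO_ACTIVE, KD)
high_expression = fraction_sensor_bound(10.0, RHO_ACTIVE, KD)
print(low_expression, high_expression)
```

      Under these assumptions a larger fraction of the sensor is bound when the sensor is scarce relative to active Rho, which matches the qualitative statement in the added discussion paragraph. The trade-off is that the absolute signal falls with expression level.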

      1. Statistical analysis is absent throughout the paper.

      We will add standard deviations to the dot plots.

      **Minor comments:**

      In Figure 1, mNeonGreen (mNG) was used as the fluorescent protein fused to rGBD instead of EGFP, which was used in the original paper. For a fair comparison with the previous report, analysis using the original probe, i.e., EGFP-rGBD, is desirable. Or, the author may simply tone down.

      That is a good point. We propose to perform the HeLa cell histamine stimulation assay for the eGFP-rGBD sensor and add the data to Figure 1B.

      1. In the introduction, it says " The RhoA FRET sensors achieve subcellular resolution to a certain extent, but due to their design they do not localize as endogenous RhoA". Reference is required.

      We changed the following in the introduction: The RhoA FRET sensors achieve subcellular resolution to a certain extent, but due to their design they may not localize as endogenous RhoA (Michaelson et al., 2001).

      1. rGBD should be rhotekin GBD. It should be clearly stated in the beginning.

      We wrote in the introduction: “Secondly, the rhotekin G protein binding domain (rGBD)-based eGFP-rGBD Rho sensor, that was reported in 2005 (Benink & Bement, 2005).” and in the results “ The eGFP-rGBD biosensor consists of an enhanced green fluorescent protein (eGFP) and a rhotekin G protein binding domain (rGBD).”

      1. The reason why the CMVdel promoter is used should be stated clearly.

      Thanks for the suggestion. We added to the discussion: “However, the optimal expression level is crucial for the dynamic range of the relocation sensor. Low concentrations of the sensor will show higher levels of relocalization, as a larger fraction of the sensor molecules binds the limited, active, endogenous Rho molecules. Nevertheless, if the concentration of sensor is too low, the fluorescent signal cannot be detected. To optimize the expression level, the CMVdel promoter, leading to a lower expression level, was applied (Watanabe and Mitchison 2002). Even though this minimal promoter improved the performance of the relocation sensor, a variety of expression levels was observed. Cell sorting could be applied to select for cells with the optimal expression level.”

      1. Page 23: TRIF should read as TIRF.

      Corrected

      1. Figures: Grey letters should be avoided.

      We will verify the figures for readability

      1. Fig. 3A: Apparently the probe binds to Rac1 G12V to some extent. The discrepancy of RhoA localization between mSca-1xrGBD and dt-2xrGBD must be discussed. This observation clearly suggests that GBD may change the localization of RhoA. It is interesting to note that Rac1 and RhoA may localize to the nucleolus.

      We have changed the text to make clear that the dTomato-2xrGBD binds better to RhoA than the 1xrGBD variant: “Comparing the original single rGBD sensor (mScarlet-I-1xrGBD) with the dimericTomato-2xrGBD sensor, a higher nuclear to cytosolic intensity ratio for the multi-domain sensor was detected, supporting its higher affinity for RhoA.”

      Reviewer #2 (Significance (Required)):

      1. This work discloses an improved RhoA probe, which will be welcomed by researchers in the field of small GTPases.

      We are glad that the reviewer shares our enthusiasm

      1. Novelty of increased GBD: The idea of increasing the GTPase-binding domain in the relocation probe was reported some time ago: Augsten et al., Live-cell imaging of endogenous Ras-GTP illustrates predominant Ras activation at the plasma membrane. EMBO Rep. 7, 46-51 (2006).

      Agreed - we added the reference to the discussion: “This strategy, to utilize multiple repeating domains has also been effective for a PH domain based lipid sensor and a cRAF derived Ras-binding domain Ras activity sensor (Augsten et al., 2006; Goulden et al., 2018)”

      1. Novelty of rhotekin GBD: The reason why the GBD of PKN is chosen in intramolecular FRET biosensors such as DORA and Raichu is that the affinity of other GBDs is too high [Table 1, Yoshizaki et al., J. Cell Biol. 162, 223-232 (2003)]. Judging from this old data, the GBDs of mDia and Rhophilin may work better than that of Rhotekin. Moreover, it is known that the PH domain may be required for proper conformation of GBDs. Thus, it is not surprising that removal of the PH domain from the Anillin probe abolishes its translocation ability. Therefore, to the reviewer's eyes, the choice of GBDs in Figure 4 is biased towards those that will work less efficiently.

      We see the point, but we have chosen these (PKN/anillin) for a practical reason, namely that we had cDNA encoding these probes in our lab. We thank the reviewer for the suggestion to look into other GBDs.

      1. The authors' proposal of "systematic optimization" sounds exaggerated, considering the small number of constructs tested in Fig. 1 and Fig. 4. Similarly, it is not clear whether dimerization-prone fluorescent proteins are a better choice from simply comparing tdTomato and mNeonGreen.

      Fair enough, we think of it as a systematic comparison (figure 1) and we have rephrased the sentence: “Improving the rGBD probe by increasing the avidity was successful”

      1. Keywords of expertise: Fluorescent probes. Cell signaling.

      **Referees cross-commenting**

      Because Review Commons does not specify the journal of publication, the request by Reviewer #1 seems excessive. The probe reported in this work deserves publishing, although it may not be a ground-breaking probe.

      We thank the reviewer for the encouraging words and support.

      Reading the comments by the other reviewers, the following concerns should be addressed.

      1. Relationship between the probe's concentration and the response.

      2. Specificity to RhoA, RhoB, and RhoC.

      3. The effect of the cell morphology, as pointed out by Reviewer #1.

      Concern 1 will be addressed by re-analysis of the data. Concern 2 is addressed by changes in the text, as we have indicated in our response. Concern 3 will be addressed by control experiments that look into changes in cell morphology.

      To Reviewer #1

      -Since equimolar distribution of the moieties is not guaranteed, this affects the detection characteristics of this biosensor. This point should be discussed and emphasized. The probe will diffuse rapidly within the cytosol. Therefore, the subcellular concentration of the probe may not significantly affect its performance.

      -What is the effect of histamine stimulation on the dT2xrGBD biosensor response when it is forced to be located in other subcellular compartments (mitochondria, nucleus) by fusing the construct to targeting sequences? I did not quite understand this question.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      In this paper, Mahlandt et al compared and improved relocation sensors to visualize the activity of endogenous Rho. As a result of screening for several Rho binding domains (GBDs) and the number of repeats, the authors found that dTomato-2xrGBD is optimal, and succeeded in visualizing the activity of Rho during cytokinesis and migrating cells. Overall, this sensor would be a useful tool for many cell biologists. The data are represented clearly in the figures. I provide some concerns; that would be worth addressing in a revised version.

      **Major comments**

      1. The authors should experimentally show the quantitative relationship between biosensor expression level and degree of relocation. In principle, this relocation type sensor binds to the endogenous GTP-bound Rho. Since the number of endogenous GTP-bound Rho is limited in cells, the degree of relocation is considered to be dependent on the expression level of the sensor. If the number of biosensors expressed is too small in a cell, the response will be saturated. If the number of biosensors is too large, the relocation will be weakened and the Rho signal will be suppressed. Furthermore, although a weak promoter is used, the heterogeneity of the expression level in each cell makes quantitative analysis difficult, especially in transient expression experiments. I would like to suggest the addition of quantitative experimental data.

      We propose to re-analyze our data, indicating the relative expression levels of the biosensor (based on intensity) in the dot plots. We agree that the expression level potentially affects sensor performance and we will address this more clearly in the text. We added to the introduction: “A potential drawback is the background signal of the unbound biosensor in the cytosol, which may occlude the bound pool and reduce the dynamic range.” We added to the discussion: “However, the optimal expression level is crucial for the dynamic range of the relocation sensor. Low concentrations of the sensor will show higher levels of relocalization, as a larger fraction of the sensor molecules binds the limited, active, endogenous Rho molecules. Nevertheless, if the concentration of sensor is too low, the fluorescent signal cannot be detected. To optimize the expression level, the CMVdel promoter, leading to a lower expression level, was applied (Watanabe and Mitchison 2002). Even though this minimal promoter improved the performance of the relocation sensor, a variety of expression levels was observed. Cell sorting could be applied to select for cells with the optimal expression level.”
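
      The dependence of relocation on expression level discussed here can be illustrated with a minimal sketch, assuming simple 1:1 equilibrium binding between the sensor and a fixed pool of active Rho. The function name `bound_fraction`, the dissociation constant and the amount of active Rho are hypothetical values chosen for illustration; they are not measurements from the manuscript, and the real sensor may bind more than one Rho molecule.

```python
import math


def bound_fraction(sensor_total, rho_active, kd):
    """Fraction of sensor molecules bound to active Rho at equilibrium.

    Assumes 1:1 binding (an illustrative simplification); all three
    arguments use the same arbitrary concentration unit.
    """
    s = sensor_total + rho_active + kd
    # The complex concentration C is the smaller root of
    # C^2 - s*C + sensor_total*rho_active = 0.
    complex_conc = (s - math.sqrt(s * s - 4.0 * sensor_total * rho_active)) / 2.0
    return complex_conc / sensor_total


RHO_ACTIVE = 1.0  # hypothetical amount of active, endogenous Rho
KD = 0.5          # hypothetical dissociation constant

for sensor in (0.1, 1.0, 10.0, 100.0):
    fraction = bound_fraction(sensor, RHO_ACTIVE, KD)
    print(f"sensor = {sensor:6.1f}  bound fraction = {fraction:.3f}")
```

      Under these assumptions the bound (relocated) fraction falls as the sensor concentration rises, which is the qualitative behaviour described in the added discussion text; the detection limit at very low expression is not modelled.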

      1. Most of the time-series data show only a representative example, namely, N = 1. In relation to the aforementioned issue, data and distribution derived from several cells (e.g. SD) should be shown in a clear manner.

      We focused not primarily on the kinetics, but more on maximal relocation, therefore we do not have time lapse movies for all the shown data points (e.g. a time lapse is shown in 1C and the data for a higher number of cells is shown in 1B). However, we can provide time series for multiple cells from our existing data sets.

      **Minor comments**

      1. I hesitate to call the biosensor developed in this study "RhoA sensor". This is because, as the authors mention, it has been reported that the rGBD also binds to RhoB and RhoC. If the authors call it a RhoA sensor, they should investigate the specificity of binding to RhoB and RhoC in addition to RhoA. If not, I would like to suggest changing the name to "Rho sensor" instead of "RhoA sensor".

      This is a fair point, also made by other reviewers. We will change the name to Rho sensor.

      Reviewer #3 (Significance (Required)):

      Rho is one of the low molecular weight G proteins, which regulate the reorganization of the actin cytoskeleton. As biosensors for visualizing the activity of Rho proteins, it has been reported intramolecular and intermolecular FRET biosensors and relocation sensors. The latter is less widely used than the former, because of insufficient sensitivity and specificity. Therefore, the improvement of Rho biosensors is really important and needed in the community of cell biology research field. The importance of this manuscript, I believe, is that the authors compared the existing relocation type Rho sensors. This is informative.

      Rho is one of the low molecular weight G proteins that regulate the rearrangement of the actin cytoskeleton. Intramolecular and intermolecular FRET biosensors and relocation sensors have been reported as biosensors for visualizing the activity of Rho proteins. The latter is not as widely used as the former due to its inadequate sensitivity and specificity. Therefore, improving the Rho biosensor is very important and is needed by the community in the field of cell biology research. I believe the importance of this manuscript is that the author compared existing relocation-type Rho sensors. This is beneficial and informative.

      My expertise: Cell biology, live-cell imaging, development of genetically encoded fluorescent probes

      We thank the reviewer for the positive evaluation of our work.

      **Referees cross-commenting**

      I generally agree with Reviewer 2's opinion. The opinions of the three reviewers can be summarized in three points: expression level, specificity, and statistical analysis and representation. I think these should be raised with the authors as major criticisms to be addressed before publication.

      We agree and we propose to address the three main points (see also response to reviewer 2).

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      **SUMMARY:**

      Mahlandt and colleagues use advanced microscopy techniques to test new configurations of several Rho relocation sensors, which report on the activity of members of the endogenous RhoA GTPase family of proteins. A novel variant containing the dimericTomato fluorescent protein and a double rGBD domain shows a substantial increase in dynamic range in comparison with 2 originally published sensors and other new variants they tested. They use a cellular assay to show that this novel variant is specific for the activity of Rho family of Rho GTPases and not the Cdc42/Rac families. Finally, the authors show that this new variant can be used to measure a specific localised increase of Rho activity at the Golgi, and during cell division and cellular morphology changes that are known to activate the RhoA family of Rho GTPases. The biosensor can be useful for the community. However, I think the paper is not well written (I was very confused by several statements). The manuscript should be thoroughly proofread, there are quite some unclear or duplicate passages (for examples, see "text comments" below). Currently this hampers the interpretation of the manuscript for the reader. The authors are very dogmatic - they make claims about the literature that I do not agree with at all. Some of these unbalanced views will confuse the non-expert readers.

      **MAJOR COMMENTS:**

      -The reported dTomato-rGBD sensor is unable to distinguish between the different members of the RhoA family of Rho GTPases (measures combined activity of RhoA, RhoB and RhoC), which is unclear for the reader in the current text phrasing in the introduction. The authors seemingly suggest throughout the manuscript to work with a specific RhoA biosensor, which is not the case. This strong statement is completely misleading. The authors need to refer to the biosensor being specific for Rho (RhoA,B,C) GTPases versus Rac1/Cdc42 biosensors, and discuss what this means for the field. Some discussions about this are made in a JCB paper by Graessl et al, that the authors also cite.

      We agree that the probe measures the combined activity of all three isoforms and apologize for the confusion. We have changed the name to Rho sensor and updated the manuscript.

      -If the authors really want to sell that the biosensor is only specific for RhoA, then they need to make a series of experiments with RhoB and RhoC dominant positive/negative constructs, to tackle that specific point.

      No, we did not intend to claim the sensor is specific for RhoA in comparison to Rho B and C.

      -Did the authors consider to use the artificial GBD from Keller, 2019 to make a specific relocation sensor for RhoA? Perhaps the authors can comment on the feasibility of this approach?

      We think that this might be the only way to make a specific RhoA relocation sensor. Recently, we have received the DNA and plan to do the histamine stimulation experiment in HeLa cells as in Figure 1B.

      -A strong (dogmatic) statement is that Rho GTPases FRET sensors report solely on the activity of GEFs. This is not the case, these sensors report on the flux of GAP and GEF activity for Rho GTPase in cells. This is also true for relocation sensors, and has been documented in work from the Bement/Pertz/Nalbant/Dehmelt labs.

      We thank the referee for this correction and we have changed the text to: “By design, these FRET sensors report on the balance between activating guanine exchange factors (GEFs) and inactivating GTPase-activating proteins, instead of visualizing endogenous RhoA-GTP”

      -From the data in Figure 1, it seems to follow that the efficiency of PM relocation is mainly determined by the number of rGBD modules on the sensors. Could the authors speculate on how this works in practice; is the multi-rGBD sensor increasingly kinetically trapped by a single RhoA molecule, or is the sensor mostly bound to multiple RhoA molecules at the PM?

      This is an interesting question to which we do not have an answer. We added some text to the discussion: “It is currently not clear how each of the GBDs of the dimericTomato-2xrGBD sensor contribute to Rho binding and the probe may bind anywhere between 1 and 4 Rho molecules. If the probe is capable of binding multiple Rho proteins, the binding efficiency will depend on the local density of Rho in the membrane.”

      -Some form of statistical analysis should be performed on the data to give the reader a sense of robustness of the findings and its uncertainty. Either a non-parametric test on the median, confidence intervals or e.g. boxplots showing notches.

      We will include standard deviations in our dot plots.
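
      One of the options named in the comment above, a confidence interval on the median, could be obtained with a percentile bootstrap. The following is a minimal sketch using only the Python standard library; the function name `bootstrap_median_ci` is hypothetical and the example values are invented placeholders, not data from the manuscript.

```python
import random
import statistics


def bootstrap_median_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the median.

    Resamples `values` with replacement, computes the median of each
    resample, and returns the (alpha/2, 1 - alpha/2) percentiles.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(values)
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = medians[int((alpha / 2) * n_resamples)]
    upper = medians[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Invented placeholder values standing in for per-cell relocation changes (%).
example_changes = [12.0, 15.5, 9.8, 20.1, 14.2, 11.7, 18.3, 13.9, 16.4, 10.5, 17.0, 14.8]
low, high = bootstrap_median_ci(example_changes)
print(f"median = {statistics.median(example_changes):.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

      The percentile bootstrap makes no normality assumption, which suits dot plots with skewed per-cell distributions; with very few cells per condition the interval will be wide and should be read cautiously.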

      -Time-series now show single example traces (fig1C, fig2B,D, fig5B). It would be informative for the reader if the curves of all experiments were plotted, and statistical analysis would be performed on the data. It is unclear how representative the kinetics in these curves are.

      We can show the kinetics for more examples but we did not acquire time lapses for all the data points shown in the dot plots, since the microscope could not move fast enough to acquire frames with an interval of 10-20 s.

      -About the spatial patterns of Rho activity (cytokinesis, tail retraction, ...), the reviewers agree that statistical analysis is much more difficult. But maybe showing 2-3 cells instead of only one, would make the data more convincing.

      We will provide more examples.

      **MINOR COMMENTS:**

      -(fig4a) dTomato-2xpGBD, why is this not good? how is it possible that it binds good to nucleus, but no translocation is observed? const activity? expression levels?

      We were surprised and somewhat disappointed by this as well and we do not have an explanation, besides that the binding affinity required for dynamic relocation seems to be higher than the one for binding the overexpressed active Rho GTPase.

      -(fig4f) The aGBD/pGBD binding sites for RhoA show great overlap but bind to completely different sites at RhoA, is this correct? (color scheme used for the structures is not easily interpretable)

      It is correct: they both have two binding sites, but apparently crystals were obtained for one or the other. Maesaki et al. 1999 describes the two binding sites. We will change the colors.

      -(fig5) Unclear how the intensity at the specific organelles is measured? were the organelles segmented or hand-drawn ROI based? The quantified difference is very small, no statistics are performed, and it is unclear how it was measured. This is currently weak evidence for the main claim in this subsection.

      ROIs are drawn by hand. We will provide standard deviations in our dot plots.

      -(fig5) The kinetics of the response to histamine (fig1C) seems to be much faster than the rapamycin mediated increase in fig5B for the PM condition. Any explanation for this? Why does it not reach a plateau like in the histamine experiments?

      It is probably the recruitment of the p63-DH that takes more time than the activation of the H1R and the downstream signaling. We have the data of the p63-DH recruitment channel so we will check the recruitment kinetics of the p63-DH to the membrane.

      -(fig6F) Data from 6D is repeated here, 6F could potentially show aggregate time-series instead of individual cells. Would also improve interpretation if the membrane marker curve is plotted in every subfigure. Potentially membrane marker intensity could be used to normalise the (TIRF) measurements?

      We will include the data of the membrane intensity for every trace in F.
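
      The normalisation raised in the comment above could look like the following minimal sketch, in which the sensor intensity is divided by the membrane marker intensity frame by frame and the ratio is then scaled to a pre-stimulation baseline. The function name `normalised_ratio`, the number of baseline frames and the example intensities are hypothetical; whether this normalisation is appropriate for the TIRF data in the manuscript is an assumption.

```python
def normalised_ratio(sensor, marker, n_baseline=3):
    """Sensor/marker ratio per frame, scaled so the baseline mean equals 1.

    `sensor` and `marker` are equally long sequences of background-corrected
    intensities from the same region; the first `n_baseline` frames are
    taken as the pre-stimulation baseline.
    """
    if len(sensor) != len(marker):
        raise ValueError("sensor and marker traces must have the same length")
    ratios = [s / m for s, m in zip(sensor, marker)]
    baseline = sum(ratios[:n_baseline]) / n_baseline
    return [r / baseline for r in ratios]


# Invented example: the sensor signal rises while the marker stays constant.
sensor_trace = [10.0, 10.0, 10.0, 20.0]
marker_trace = [5.0, 5.0, 5.0, 5.0]
print(normalised_ratio(sensor_trace, marker_trace))
```

      If sensor and marker rise together, for example because more membrane enters the evanescent field, the normalised trace stays flat, so only sensor enrichment relative to the membrane is reported.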

      -can the authors provide scale bars on the micrographs, as is usually done in any manuscript ? It would also be useful to put time labels when images corresponding to timeseries are shown.

      We will provide the width of the image in the figure captions.

      -ratio values are dimensionless by definition, so no need to write "arbitrary units"

      We will change that.  

      **TEXT COMMENTS:**

      -(abstract): "Due to the improved avidity of the new biosensors for RhoA activity, cellular processes regulated by RhoA can be better understood." -> unclear what the authors mean with 'avidity' in this context? (here, and throughout rest the manuscript)

      Avidity refers to “the accumulated strength of multiple affinities”; we added this explanation to the text in the introduction. Another paper that uses multiple binding domains to improve a relocation sensor also calls it avidity: A high-avidity biosensor reveals plasma membrane PI(3,4)P2 is predominantly a class I PI3K signaling product (Goulden et al. 2018 JCB).

      -(introduction) "Although these three Rho GTPases may have different functions, we generally refer to RhoA in this manuscript." -> unclear what message the authors try to convey with this sentence.

      We changed to: “We will use ‘Rho’ throughout the manuscript, which refers to all three isoforms”

      -(introduction) "Active RhoA mainly localizes at the plasma membrane, due to its prenylated C-terminus" -> where else would it be localised? Where is inactive RhoA localised?

      We included: “Active Rho mainly localizes at the plasma membrane, due to its prenylated C-terminus (Garcia-Mata et al., 2011). However, a fraction of RhoA has been found at the Golgi apparatus. Inactive RhoA, in comparison, can be extracted from the plasma membrane by Rho-specific guanine nucleotide dissociation inhibitors (RHOGDIs) (Garcia-Mata et al., 2011)”.

      -(introduction) "Unimolecular Rho GTPase FRET-based biosensors consist of the Rho GTPase itself, a GBD and a FRET pair." -> a short description/explanation of what a "FRET pair" is would benefit the non-specialised audience.

      We included: “Unimolecular Rho GTPase FRET-based biosensors consist of the Rho GTPase itself, a GBD and a FRET pair, which is commonly a cyan and a yellow fluorescent protein.”

      -(Results p9) "For the original Anillin AH+PH sensor...around 15%" -> did the authors do the experiment with G14V on this original sensor variant?

      Yes, it is supposed to say AHD+PH here as well, which has been corrected. We performed the experiment with mScarlet-AHD-PH.

      -(Results p9) The "mScarlet-I-AHD+PH" seems to perform quite good on the fig4D assay, but is not present in 4C analysis?

      eGFP-AHD+PH was used as the original sensor for the 4C assay. Due to the color of the RhoA G14V (mTq2) we switched to the mScarlet version to exclude bleed through. We assume that the sensor performs similarly with different monomeric fluorescent proteins.

      -(Results p9) "mScarlet-I-AHD+PH" is the same as "AHD+PH (aGBD+C2+PH)"? descriptions unclear. Would generally advise to thoroughly check the manuscript for consistency of condition descriptions / abbreviations in both text and legends.

      Changed to: AHD+PH (consisting of aGBD+C2+PH). We mention earlier: “Moreover, a published relocation sensor AHD+PH based on Anillin contains, next to a G protein binding domain, also a C2 and a PH domain and localizes in punctate structures which do not represent Rho activity (Figure 2C, Supplemental Movie 4 and 5) (Munjal et al., 2015; Piekny & Glotzer, 2000). Here, we used only the G protein binding domain of Anillin (aGBD) as a basis for another sensor.”

      -(Results p12) "Visualizing endogenous RhoA activity" as subsection title could potentially confuse readers, since all measured Rho activity in the manuscript is endogenous.

      That could indeed be confusing. What we intended to highlight is that we did not overexpress any signaling molecules or receptors in these experiments. We changed the title to: “Visualizing endogenous Rho activity under physiological conditions”

      **minor text:**

      -(fig3b legend) "mScralet-I-1xrGBD"

      Corrected

      -(fig6H legend) "TRIF", and "cbBOEC" is same as "BOEC"?

      It is a detail, but these are indeed different and we have updated the materials and methods to better reflect this: “cord blood Blood Outgrowth Endothelial cells (cbBOEC)” and “Blood Outgrowth Endothelial cells from healthy adult donor blood (BOEC)”

      Reviewer #4 (Significance (Required)):

      The novel "Rho" family GTPase relocation sensor that the authors present might be a significant improvement over the currently existing ones (for refs, see manuscript). This might provide a substantial technical advance in the field and increases the utilisation and the reproducibility of this tool in the field. This sensor will be of significant interest for the Rho GTPase signalling field, and more broadly the cytoskeleton biology community. My expertise in Rho GTPase biology, biosensor development and advanced microscopy granted me the opportunity to judge the complete manuscript.

      The reviewer thinks that the new sensor will be of significant interest and we agree.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #4

      Evidence, reproducibility and clarity

      SUMMARY:

      Mahlandt and colleagues use advanced microscopy techniques to test new configurations of several Rho relocation sensors, which report on the activity of members of the endogenous RhoA GTPase family of proteins. A novel variant containing the dimericTomato fluorescent protein and a double rGBD domain shows a substantial increase in dynamic range in comparison with 2 originally published sensors and other new variants they tested. They use a cellular assay to show that this novel variant is specific for the activity of Rho family of Rho GTPases and not the Cdc42/Rac families. Finally, the authors show that this new variant can be used to measure a specific localised increase of Rho activity at the Golgi, and during cell division and cellular morphology changes that are known to activate the RhoA family of Rho GTPases. The biosensor can be useful for the community. However, I think the paper is not well written (I was very confused by several statements). The manuscript should be thoroughly proofread, there are quite some unclear or duplicate passages (for examples, see "text comments" below). Currently this hampers the interpretation of the manuscript for the reader. The authors are very dogmatic - they make claims about the literature that I do not agree with at all. Some of these unbalanced views will confuse the non-expert readers.

      MAJOR COMMENTS:

      -The reported dTomato-rGBD sensor is unable to distinguish between the different members of the RhoA family of Rho GTPases (measures combined activity of RhoA, RhoB and RhoC), which is unclear for the reader in the current text phrasing in the introduction. The authors seemingly suggest throughout the manuscript to work with a specific RhoA biosensor, which is not the case. This strong statement is completely misleading. The authors need to refer to the biosensor being specific for Rho (RhoA,B,C) GTPases versus Rac1/Cdc42 biosensors, and discuss what this means for the field. Some discussions about this are made in a JCB paper by Graessl et al, that the authors also cite.

      -If the authors really want to sell that the biosensor is only specific for RhoA, then they need to make a series of experiments with RhoB and RhoC dominant positive/negative constructs, to tackle that specific point.

      -Did the authors consider to use the artificial GBD from Keller, 2019 to make a specific relocation sensor for RhoA? Perhaps the authors can comment on the feasibility of this approach?

      -A strong (dogmatic) statement is that Rho GTPases FRET sensors report solely on the activity of GEFs. This is not the case, these sensors report on the flux of GAP and GEF activity for Rho GTPase in cells. This is also true for relocation sensors, and has been documented in work from the Bement/Pertz/Nalbant/Dehmelt labs.

      -From the data in Figure 1, it seems to follow that the efficiency of PM relocation is mainly determined by the number of rGBD modules on the sensors. Could the authors speculate on how this works in practice; is the multi-rGBD sensor increasingly kinetically trapped by a single RhoA molecule, or is the sensor mostly bound to multiple RhoA molecules at the PM?

      -Some form of statistical analysis should be performed on the data to give the reader a sense of robustness of the findings and its uncertainty. Either a non-parametric test on the median, confidence intervals or e.g. boxplots showing notches.

      -Time-series now show single example traces (fig1C, fig2B,D, fig5B). It would be informative for the reader if the curves of all experiments were plotted, and statistical analysis would be performed on the data. It is unclear how representative the kinetics in these curves are.

      -About the spatial patterns of Rho activity (cytokinesis, tail retraction, ...), the reviewers agree that statistical analysis is much more difficult. But maybe showing 2-3 cells instead of only one, would make the data more convincing.

      MINOR COMMENTS:

      -(fig4a) dTomato-2xpGBD, why is this not good? how is it possible that it binds good to nucleus, but no translocation is observed? const activity? expression levels?

      -(fig4f) The aGBD/pGBD binding sites for RhoA show great overlap but bind to completely different sites at RhoA, is this correct? (color scheme used for the structures is not easily interpretable)

      -(fig5) Unclear how the intensity at the specific organelles is measured? were the organelles segmented or hand-drawn ROI based? The quantified difference is very small, no statistics are performed, and it is unclear how it was measured. This is currently weak evidence for the main claim in this subsection.

      -(fig5) The kinetics of the response to histamine (fig1C) seems to be much faster than the rapamycin mediated increase in fig5B for the PM condition. Any explanation for this? Why does it not reach a plateau like in the histamine experiments?

      -(fig6F) Data from 6D is repeated here, 6F could potentially show aggregate time-series instead of individual cells. Would also improve interpretation if the membrane marker curve is plotted in every subfigure. Potentially membrane marker intensity could be used to normalise the (TIRF) measurements?

      -can the authors provide scale bars on the micrographs, as is usually done in any manuscript ? It would also be useful to put time labels when images corresponding to timeseries are shown.

      -ratio values are dimensionless by definition, so no need to write "arbitrary units"

      TEXT COMMENTS:

      -(abstract): "Due to the improved avidity of the new biosensors for RhoA activity, cellular processes regulated by RhoA can be better understood." -> unclear what the authors mean with 'avidity' in this context? (here, and throughout rest the manuscript)

      -(introduction) "Although these three Rho GTPases may have different functions, we generally refer to RhoA in this manuscript." -> unclear what message the authors try to convey with this sentence.

      -(introduction) "Active RhoA mainly localizes at the plasma membrane, due to its prenylated C-terminus" -> where else would it be localised? Where is inactive RhoA localised?

      -(introduction) "Unimolecular Rho GTPase FRET-based biosensors consist of the Rho GTPase itself, a GBD and a FRET pair." -> a short description/explanation of what a "FRET pair" is would benefit the non-specialised audience.

      -(Results p9) "For the original Anillin AH+PH sensor...around 15%" -> did the authors do the experiment with G14V on this original sensor variant?

      -(Results p9) The "mScarlet-I-AHD+PH" seems to perform quite good on the fig4D assay, but is not present in 4C analysis?

      -(Results p9) "mScarlet-I-AHD+PH" is the same as "AHD+PH (aGBD+C2+PH)"? descriptions unclear. Would generally advise to thoroughly check the manuscript for consistency of condition descriptions / abbreviations in both text and legends.

      -(Results p12) "Visualizing endogenous RhoA activity" as subsection title could potentially confuse readers, since all measured Rho activity in the manuscript is endogenous.

      minor text:

      -(fig3b legend) "mScralet-I-1xrGBD"

      -(fig6H legend) "TRIF", and "cbBOEC" is same as "BOEC"?

      Significance

      The novel "Rho" family GTPase relocation sensor that the authors present might be a significant improvement over the currently existing ones (for refs, see manuscript). This might provide a substantial technical advance in the field and increases the utilisation and the reproducibility of this tool in the field. This sensor will be of significant interest for the Rho GTPase signalling field, and more broadly the cytoskeleton biology community. My expertise in Rho GTPase biology, biosensor development and advanced microscopy granted me the opportunity to judge the complete manuscript.

    1. older-younger relationships are still quite malleable

      I found the idea of older-younger relationships somewhat relevant to my life because, at least for me, the friends I have across an age gap are people with whom I share similar life experiences, such as playing the same sport, having the same interests, or simply living in close proximity as neighbors. I also think many occurrences in life bring people together for different reasons, even though they may not be the same age. I like how, in this context, the children were pretty open to the idea of older-younger relationships for a variety of reasons.

    1. In view of all this we may say, not, I think, that psychology is all there is of philosophy, as Wundt does, nor even that it is related to the systems as philosophy to theology, nor that it is a philosophy of philosophy, implying a higher potence of self-consciousness, but only that it has a legitimate standpoint from which to regard the history of philosophy,-- a standpoint from which it does not seem itself a system in the sense of Hegel, but the natural history of mind, not to be understood without parallel [p. 131] study of the history of science, religion, and the professional disciplines, especially medicine, nor without extending our view from the tomes of the great speculators to their lives and the facts and needs of the world they saw. It strives to catch the larger human logic within which all systems move, and which even at their best they represent only as the scroll-work of an illuminated missal resembles real plants and trees, in a way which grows more conventionalized the more finished and current it becomes. In a word, it urges the methods of modern historic research, in a sense which even Zeller has but inadequately seen, in the only field of academic study where they are not yet fully recognized.

      Hall explains how psychology is a different entity than philosophy. Although both fields may share similar beliefs, psychology explores various aspects that philosophy does not. Hall also mentions how psychology is much bigger than a subsection. Psychology deals with various concepts and being confined to a subsection of philosophy limits individuals from gaining new knowledge/information for this area (psychology).

    1. SciScore for 10.1101/2021.04.02.21254818

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Institutional Review Board Statement</td><td style="min-width:100px;border-bottom:1px solid lightgray">Consent: MD website all participants provided informed consent at the start of the online questionnaire for their data to be used for research purposes, and had to agree to the corresponding Your.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">The gender variable was coded as 0 for male and 1 for female.</td></tr></table>

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text=align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">(5) For clustering, we excluded symptoms experienced by less than 5% of the individuals in each group to avoid chaining.(6) All statistical analyses were performed using custom programs in the MATLAB R2019b (MathWorks) environment. Ethics: The Covid-10 Symptom Mapper data was provided to us by Your.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>MATLAB</div><div>suggested: (MATLAB, RRID:SCR_001622)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      Strengths & limitations: One of the strengths of this study was the ability to look globally at symptoms with a specific breakdown by nationality, allowing geolocation and culture/behavioural aspects to be investigated. Symptom reports were conducted in local languages (e.g. Portuguese, Hindi, etc) thus increasing accessibility, however translations may not match exactly within cultural contexts, e.g. “pain” in Brazil is “joint ache” in UK (but not stomach pain). Given the data for this analysis came from an Internet based survey, there will be differential access, however only a very low effort was needed to partake given the questionnaire was accessed via a simple website and not an app. Given the widespread use of smartphones globally, this should facilitate participation, however we acknowledge that those who are younger or in wealthier countries may be more likely to partake thus skewing the results, equally educational factors may have played a role and we do not have any socioeconomic or ethnicity information. Whilst we acknowledge that the data used are self-reported, we do not think this undermines the accuracy of underlying disease or symptom reporting. For those who report a COVID-19 positive test, we do not distinguish between type of tests and thus cannot account for differences in accuracy. Clinical and Public Health impact: Our information may be utilised in a clinical setting as an additional triage tool and for target testing, especially to better inform decis...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on page 7. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.

      <footer>

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

      </footer>

    1. SciScore for 10.1101/2021.04.01.21254744

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Institutional Review Board Statement</td><td style="min-width:100px;border-bottom:1px solid lightgray">IRB: This study was approved by the Ethics Committee of the University of Occupational and Environmental Health, Japan (R2-079).</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr></table>

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text=align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">We used Stata/SE 16.1 (StataCorp, College Station, TX, USA) for all analyses.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>StataCorp</div><div>suggested: (Stata, RRID:SCR_012763)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      This study has several limitations. First, the study recruited panelists who registered with an online research company. Therefore, the participants may not represent general workers. For example, online panelists may be particularly willing to use online tools or more familiar than others with using apps. Consequently, the results may underestimate negative factors for use of the app. Second, we evaluated the use of the contact tracing app by asking participants about whether they had downloaded it. Therefore, we did not confirm whether the application was installed. However, we think that most people who downloaded the app also installed it. Despite these limitations, to the best of our knowledge, this study represents the first study in Japan to examine current use of the COCOA with a large sample. In conclusion, the present study evaluated the associations of industry and workplace characteristics with the use of a COVID-19 contact tracing app in a large-scale online survey of Japanese workers. Those working in the public service sector or in information technology, as well as managers, were frequently found to use the contact tracing app, whereas those working in the retail and wholesale and food/beverage industries were less likely to use it. One possible reason for the under-implementation of the contact tracing app in the retail and wholesale and food/beverage industries may be the small size of businesses in these types of industries. An awareness campaign should be ...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.


    1. technical skills, interpersonal (or human) skills, and conceptual skills.

      Katz’s model postulates that good leadership comes down to three specific skills or attributes. First, a good leader must have a strong technical foundation, meaning they have to know how the company or agency works on a day-to-day level. For example, in my role, this knowledge includes the procedures used by staff, such as completing an intake or an exit, and an understanding of the criteria for someone to access shelter. Next, a good leader must have strong conceptual skills, meaning they must have the ability to think creatively and look at the bigger picture. In my case this might be looking at the philosophy behind how we help survivors and how we can better integrate it into daily practice. Lastly, a good leader must have good interpersonal skills. In my case this means being able to read what my staff may need emotionally and being able to meet that need. An area in which I often need to use this skill is in watching for burnout and supporting staff who are struggling with it.

      I think that these three skills are absolutely still relevant during the technological age. In fact, I think they may be even more important, especially the interpersonal skills. In the age of Zoom and digital platforms, it is important to be able to pick up on nonverbal cues that might be harder to see in this format. Also, it’s essential to be able to keep the team connected to one another and to make sure the team feels connected to their leader when they are not always in the same room. In my role, there are agency staff members that I have literally never met in person, so I need to be able to figure out how to meet their emotional needs and respond appropriately using the limited information I get through a picture on a screen.

      One thing I would argue is that there is some need for floor-level managers to have conceptual skills. In order for an agency to truly adhere to its mission and vision, everyone needs to have an understanding of what it is and how it applies to their own job. You can’t get to your destination if you don’t know what it is or how to get there. I have also anecdotally found that the agencies for whom I’ve worked that are the most successful are the ones where everyone, regardless of role or position in the hierarchy, has a part in looking at and fine-tuning the bigger picture. At my current agency, we literally go through our annual agency goals at staff meetings and will discuss why those are the goals as well as the action steps to get there.

    1. Author Response:

      Reviewer #1 (Public Review):

      Redmond et al. use single-cell and single-nucleus RNA-sequencing to reveal the molecular heterogeneity that underlies regional differences in neural stem cells in the adult mouse V-SVZ. The authors generated two datasets: one which was whole cell RNA-seq of whole V-SVZ and one which consisted of nuclear RNA-seq of V-SVZ microdissected into anterior-posterior and dorsal-ventral quadrants. The authors first identified distinct subtypes of B cells and showed that these B cell subtypes correspond to dorsal and ventral identities. Then, they identified distinct subtypes of A cells and classified them into dorsal and ventral identities. Finally, the authors identified a handful of genes that they conclude constitute a conserved molecular signature for dorsal or ventral lineages. The text of the manuscript is well written and clear, and the figures are organized and polished. The datasets generated in this manuscript will be a great resource for the field of adult neurogenesis. However, the arguments and supporting data used to assign dorsal/ventral identities to B cells and A cells could be strengthened, and more rigorous data analysis could result in new biological insights into stem and progenitor cell heterogeneity in the V-SVZ.

      We thank the Reviewer for their feedback on our manuscript. As suggested by Reviewer #1, we are performing additional analyses in the following areas:

      1) Performing additional analyses to further strengthen the dorsal/ventral scRNA-Seq B cell marker analysis and its relationship to our sNucRNA-Seq B data.

      2) Performing additional analyses to identify potential novel biological insights into stem & progenitor cell heterogeneity and text edits to discuss how differentially-expressed sets of genes among B cells and A cells are related to biological processes and/or signaling pathways.

      Reviewer #2 (Public Review):

      The paper is well written, and the data are well analyzed and presented. My concerns centre on terminology and alternative explanations of some of the data, which the authors might deal with in the introduction or discussion.

      We thank Reviewer #2 for their positive reception of our manuscript and the data, and for the constructive suggestions, which we have addressed by changes to the manuscript and in our responses below:

      1) I am slightly confused about some of the data shown in Figure 1. If B cells are defined as GFAP expressing cells, then why do only 25% of the B cells in the plot in Figure 1C express GFAP? I may be missing something here, as other readers may as well. Similarly in the same panel, only 25% of astrocytes seem to be expressing GFAP or GFP driven by a GFAP promoter.

      Importantly, among all cells captured in our scRNA-Seq, only B cells (51.86%), a subpopulation of parenchymal astrocytes (25%) and a small subpopulation of ependymal cells (E cells) had GFAP expression. This is consistent with immunocytochemical staining (Ponti et al. 2013) and other studies of scRNA-Seq expression (Xie et al. 2020). Similarly, Gfp (under the control of hGFAP promoter) is not expected to be expressed in all B cells (here 31.08% of B cells are Gfp+).

      Note that previous work has shown that B cells express different levels of GFAP protein, and some B1 cells were negative (Ponti et al. 2013). This supports the notion that this intermediate filament is a good marker of the V-SVZ primary progenitors, but also present in a subpopulation of parenchymal astrocytes and ependymal cells. However, a negative signal for GFAP does not imply that a cell is not a B cell. This highlights the importance of our clustering analysis revealing additional genes associated with B cells. Our analysis suggests that a combination of Gfap, Thrombospondin 4, Slc1a3 (GLAST) and S100a6 provide a better marker combination to identify B cells.

      The reason for the variability among B cells in the expression of GFAP remains unknown. It could be associated with the normal regulation of intermediate filaments as B cells transit the cell cycle or different stages of their activation or quiescence. It could also be linked to technical aspects of scRNA-Seq analysis, e.g., gene dropout, detection limits, or sequencing saturation. Since the actual proportion is shown only graphically on our dot plot, to clarify this issue in the text we have added the specific percentages and the following sentences:

      “A fraction of both populations expressed GFAP: 51.85% of B cells (clusters 5, 13, 14 & 22), 24.37% of parenchymal astrocytes (clusters 21, 26 & 29). This is consistent with previous reports (Chai et al. 2017; Xie et al. 2020; Ponti et al. 2013). Note that across all cells captured in our scRNAseq analysis, only B cells, parenchymal astrocytes or ependymal cells expressed GFAP. Among these three cell types, B cells had the highest average expression of GFAP (4.41 for B cells, 1.00 for astrocytes, 0.29767 for ependymal cells, values in Pearson residuals). Other markers, like S100a6 (Kjell et al. 2020) (88.9% of B cells; 54% of parenchymal astrocytes and 80% of ependymal cells) and Thbs4 (Zywitza et al. 2018) (45% of B cells; 28.77% in parenchymal astrocytes, 2.88% in ependymal cells) are also expressed preferentially in B cells and parenchymal astrocytes, but they alone do not distinguish these two cell populations.”
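
      The percentages quoted above are per-population detection rates: the share of cells in a population with at least one detected transcript of a gene. As a minimal sketch of that calculation (the counts, labels, and the function name `detection_rate` are hypothetical toy values for illustration, not the study's data or pipeline), in Python:

```python
from collections import defaultdict

def detection_rate(counts, labels):
    """Return {label: percent of cells with count > 0} for one gene."""
    total = defaultdict(int)
    positive = defaultdict(int)
    for count, label in zip(counts, labels):
        total[label] += 1
        if count > 0:
            positive[label] += 1
    return {label: 100.0 * positive[label] / total[label] for label in total}

# Hypothetical toy counts for one gene across eight cells.
gfap_counts = [3, 0, 1, 0, 0, 0, 2, 0]
cell_types = ["B", "B", "B", "B", "astro", "astro", "astro", "astro"]
rates = detection_rate(gfap_counts, cell_types)
```

      Because a zero count in a single cell can reflect dropout as well as true absence, a detection rate well below 100% for a marker gene does not by itself rule out expression in the remaining cells.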

      2) The authors term the germinal zone of the adult mouse brain - the ventricular-subventricular zone. They should discuss the evidence that the adult germinal zone is made up of cells from both the ventricular zone and the sub ventricular zone in the late embryo, where those zones are described clearly on the basis of morphology. Many of the early embryonic neural stem cells are present in the ventricular zone before the sub ventricular zone has developed and continue to be present into the adult. If there is not clear mouse evidence that the progeny of embryonic sub ventricular cells are present in the adult germinal zone independent of embryonic ventricular zone progeny, then the authors might consider calling the zone - the adult ventricular zone, or alternatively terming the neurogenic area around the lateral ventricle the adult germinal zone or by a more straightforward descriptive term - the adult subependymal zone or the adult periventricular zone. Also, I think the first word in line 6 on page 3 should be neural rather than neuronal.

      We agree that the terminology in the field is confusing and multiple names have been used to describe the same region. In order to clarify that we are referring to the same adult periventricular germinal region, we have added a short sentence in the introduction to indicate that the V-SVZ is also referred to by other authors as the SVZ, the subependyma or the subependymal zone. We have added in the text: “This neurogenic region has also been referred to as the SVZ or the subependymal zone (Kazanis et al. 2017; Morshead et al. 1994)”.

      This reviewer argues that the adult V-SVZ should only be called V-SVZ if a lineage relationship could be established with the embryonic SVZ. To our knowledge there is no need to link the adult SVZ to the embryo, as this structure, like the embryonic SVZ, anatomically sits beneath the VZ (the area next to the ventricle). However, a lineage relationship does exist between the adult V-SVZ and the embryonic VZ, established in previous studies showing that PreB1 cells around E15.5 become quiescent and give rise to adult B cells in the V-SVZ (Fuentealba et al., 2015; Furutachi et al., 2015). In addition, developmental studies show a continuum in the gradual transformation of the embryonic periventricular germinal layers, including the SVZ. Importantly, B1 cells are derived from VZ radial glia (RG), maintain RG markers and retain RG-like interkinetic behavior, establishing that functionally and anatomically a VZ is retained in the adult (Merkle et al., 2004; Mirzadeh et al., 2008). Therefore, the adult periventricular epithelium is not made of a pure layer of ependymal cells with progenitor cells underneath, as previously thought. Moreover, recent work indicates that, just like in the embryo, the more basal adult SVZ progenitors (B2 cells) can be derived from adult VZ progenitors (B1 cells) (Obernier et al. 2018). This transformation of apical to basal cells begins to occur in embryonic stages, further suggesting equivalences between the adult and the embryonic progenitor cells. For all the above reasons we prefer to use the term V-SVZ.

      In line 6, page 3, we have changed neuronal cell types to “neural cell types”, as suggested.

      3) The authors refer to their molecularly described B cells as stem cells. Certainly, their lab and others have shown that adult olfactory bulb neurons are the progeny of those B cells, however the classic definition of stem cells (in the blood or intestine for example) requires demonstration that single stem cells can make all of the differentiated cells in that tissue. Is there evidence that a single adult B1 cell can make astrocytes, neurons and oligodendrocytes? Indeed, what percentage of the single adult B cells characterized here on the basis of RNA expression can be shown to be multipotent for both macroglial and neuron lineages in vivo or in vitro? Perhaps progenitor or precursor cells might be a better term for B cells that appear to give rise to neurons primarily.

      This is also an issue of definitions. We modified the text to refer to the primary progenitors in the V-SVZ as adult neural stem or progenitor cells (“NSPCs”). We agree that this needs to be clarified, and in the introduction we modified one paragraph to indicate:

      “From the initial interpretation that adult NSPCs are multipotent and able to generate a wide range of neural cell types (Reynolds and Weiss 1992; van der Kooy and Weiss 2000; Morshead et al. 1994), more recent work suggests that the adult NSPCs in vivo are heterogeneous and specialized, depending on their location, for the generation of specific types of neurons, and possibly glia (Merkle et al. 2014; Fiorelli et al. 2015; Chaker, Codega, and Doetsch 2016; Merkle, Mirzadeh, and Alvarez-Buylla 2007; Tsai et al. 2012; Delgado et al. 2020).”

      Under normal in vivo conditions, a primitive state for NSCs capable of generating all neuronal and glial cell types of the CNS may only exist at very early stages of development, and even their regional specification seems to occur very early (as early as E10.5; Fuentealba et al. 2015). Note that recent work in the hematopoietic system suggests that stem cells there also become restricted embryonically (Carrelha et al., 2018) and in adults their potential to generate lymphoid or myeloid lineages changes dramatically with age, yet at all these ages they are referred to as HSCs. We are well aware of the work from the van der Kooy lab, suggesting the existence in the V-SVZ of rare “primitive” Oct4+/GFAP- cells that may be pluripotent and earlier in the lineage from B cells (Reeve et al., 2017). However, as indicated above, lineage tracing from the embryo indicates that adult NSPCs are specified in the embryo and are already in place and regionally specified between E11.5 and E15. We have investigated whether we could detect Oct4+/Gfap- cells in our datasets. However, we did not detect Oct4 expression in B cells or other cell types. We now indicate in the discussion:

      “It has been suggested that in the adult V-SVZ a more primitive population of Oct4+/GFAP- NSCs may be present and that these cells may be earlier in the lineage from the “definitive” GFAP+ B cells (Reeve et al. 2017). However, regionally specified NSPCs can be lineage traced to the embryo (Fuentealba et al. 2015; Furutachi et al. 2015), and we could not detect a population of Oct4+ cells in our datasets. We, however, cannot exclude that rare primitive OCT4+ NSPCs were not captured in our analysis for technical reasons.” ……. “This underscores the early embryonic regional specification of adult V-SVZ NSPCs and how these primary progenitors maintain a memory of their regions of origin.”

      4) This may be more than a semantic issue, as the rare clonal neurosphere-forming neural stem cells do make all three neural cell types in vitro and also maintain their AP and DV positional identity through clonal passaging in vitro (Hitoshi et al, 2002). However, Emx1-expressing cortical neural stem cells can be lineage traced as they migrate from the embryonic cortical germinal zone to the striatal germinal zone in the perinatal period (Willaime-Morawek et al, 2006). Surprisingly, in their new striatal home the Emx1 lineage cortical neural stem cells will turn down Emx1 expression and turn up Dlx2 striatal germinal zone expression. The switch in positional identities of clonal neural stem cells can also be seen in vitro when the stem cells are co-cultured with an excess of cells from a different region and then regrown as clonal neural stem cells. This may suggest that Emx1-expressing neural stem cells (the clonal neurosphere-forming cells) may switch their positional identities in vivo as they migrate into the striatal germinal zone, but the downstream neuron-producing precursor B cells studied in this paper may maintain their Emx1 expression into the adult germinal zone. This raises an interesting issue concerning which cells in the neural stem cell lineage can be regionally re-specified.

      The interesting question about plasticity and respecification is not addressed by our current manuscript, which focuses on the gene expression profile of unmanipulated cells from adult samples. However, regional re-specification is controversial. While work from the van der Kooy lab suggests that striatal Emx1+ NSPCs originate in the pallium and migrate into the striatum in the perinatal brain (Willaime-Morawek et al., 2006), other studies suggest that rare Emx1 cells are already present in the developing LGE from embryonic stages as early as E12.5 (Gorski et al. 2002). In addition, we have labeled neonatal radial glial cells in the pallium, when this migration has been suggested to occur, and do not see migration of cells ventrally into the striatal wall. We have also transplanted dorsal NSPCs into ventral locations -- and vice versa -- and do not observe evidence of regional re-specification (Merkle, Mirzadeh, and Alvarez-Buylla 2007; Delgado et al. 2020).

      5) The authors nicely show that dorsal versus ventral germinal zone lineages are marked by some of the same positional genes from B cells to A cells, suggesting complete dorsal versus ventral neurogenic lineages giving rise to different types of olfactory bulb neurons. Indeed, they nicely test this idea with dissection of the dorsal versus ventral germinal zones, followed by nuclear RNA sequencing. However, they don't discuss the broader issues concerning the embryological origins of the dorsal versus ventral germinal zones. Emx1 is one of the genes the authors use to mark dorsal lineages. The authors reference papers (Young et al, 2007; Willaime-Morawek et al, 2006; 2008) that use Emx1 lineage tracing to show that certain classes of olfactory bulb neurons originate from embryonic cortical neural stem cells that migrate perinatally from the cortical germinal zone into the dorsal subcortical germinal zone. Could cortical versus subcortical embryonic origins of the dorsal versus ventral adult germinal zone explain the origin of different sets of adult olfactory bulb neurons? Further, the authors report that one of the GO terms for their dorsal lineages is cortical regionalization.

      This is a very interesting question that unfortunately we cannot answer. The dorsal domain includes both pallial and subpallial components, but the specific origin of B cells in this dorsal domain and the contribution of the pallium and subpallium remains unresolved.

      We went back to our data to try to find evidence of pallial vs. subpallial components in the dorsal B clusters (5 & 22). Indeed, there are some hints that cluster 22 may be more pallial and 5 more dorsal subpallial. However, when we try to confirm differential distribution of markers associated with these two dorsal subdomains anatomically, it is not possible to determine segregation, likely due to the intermixing of cells as the wedge is formed. We also looked for Dbx1, a relatively specific marker of the border region between pallium and subpallium that has been termed ventral pallium, but unfortunately our scRNA-Seq dataset did not capture this marker. Further, targeted lineage tracing of this region is required to determine the subdivisions of the dorsal V-SVZ. We have added as requested a short discussion on this issue:

      “The dorsal V-SVZ domain is likely further subdivided into multiple subdomains. In the current analysis we pooled together clusters B(5) and B(22) as dorsal. However, largely pallial marker Emx1 and dorsal lateral ganglionic eminence marker Gsx2 were differentially enriched in clusters B(22) and B(5), respectively, suggesting that these two clusters may also represent different sets of regionally specified B cells with distinct embryonic origins. These regions become blurred by cells intermixing in the formation of the wedge region in the postnatal V-SVZ making it difficult to confirm their origin based on expression patterns. In addition to pallial and dorsal subpallial markers, this wedge region likely also includes what has been termed the ventral pallium (Puelles et al. 2016), which is characterized in the embryo by the expression of Dbx1. Unfortunately, our scRNA-Seq analysis did not detect this marker. Further lineage tracing experiments will help determine the precise embryonic origin and nature of the dorsal V-SVZ, including the wedge region.”

      6) The percentages of dividing cells based on gene expression are given for some clusters of cells but not others. It might be useful to have a chart showing the percentages of cells in cycle (Ki67 expression) for each cluster. This might be especially useful in characterizing some of the differences between various subclusters of B, A and C cells. On page 9 it is suggested that the heterogeneity amongst C cell clusters was driven by cell cycle genes. However, it is possible to remove the cell cycle genes from the data analysis to see if this then produces clearer dorsal versus ventral positional identities. This may be an important issue, as the dorsal versus ventral positional identity genes appear to be expressed more in the less dividing A and B cells than in the more dividing C cells. This leads to a potentially alternative conclusion: that dorsal/ventral regional identity genes are primarily expressed in the non-dividing post mitotic cells in their resident dorsal or ventral region, and not in precursor cells in the lineage. This could be easily tested by removing the cell cycle genes from the analysis of highly dividing clusters to see if they then break down into dorsal versus ventral clusters.

      We now provide a table indicating the fraction of proliferating cells (defined as in S phase or G2-M phase) for all scRNA-Seq clusters.

      Concerning whether dorsal and ventral identities are maintained during the period of proliferation we have analyzed our data looking at dorsal and ventral signature levels over pseudotime (Figure 6-Supplement 1F). Here we do not observe a reduction in either dorsal or ventral score at the proliferative cell stages (pseudotime ~0.75, Figure 2L). This is in contrast to gene signatures that show clear up- or down-regulation over pseudotime, such as Gfap, Egfr & Dcx (Figure 2M). To understand how cell clustering is affected in the absence of proliferative gene influence, and whether clearer dorsal/ventral signatures are observed in proliferating cells, we are performing additional analyses using our scRNA-Seq dataset that is clustered after cell-cycle gene regression.
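The response does not say which tool or model is used for the cell-cycle gene regression. As a minimal sketch of the general idea, assuming a single per-cell cell-cycle score and ordinary least squares (both assumptions, not the authors' stated method), the linear effect of the score can be removed from one gene's expression by keeping only the residuals. The function name and the toy data are hypothetical.

```python
from statistics import mean

def regress_out(expression, score):
    """Remove the linear effect of `score` from `expression` (one gene, many cells).

    Fits expression ~ intercept + slope * score by ordinary least squares and
    returns the residuals, which are uncorrelated with `score`.
    """
    mx, my = mean(score), mean(expression)
    sxx = sum((x - mx) ** 2 for x in score)
    if sxx == 0:
        # Constant covariate: nothing to regress out, so only centre the gene
        return [y - my for y in expression]
    slope = sum((x - mx) * (y - my) for x, y in zip(score, expression)) / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(score, expression)]

# Hypothetical toy data: one gene in six cells, plus a per-cell cell-cycle score
cycle_score = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
gene = [1.0, 1.2, 1.4, 2.6, 2.8, 3.0]  # constructed as 1 + 2 * score
residuals = regress_out(gene, cycle_score)
```

Clustering would then be run on the residuals for every gene, so that any remaining structure (for example dorsal versus ventral signatures) is not driven by proliferation state. Real pipelines typically regress several covariates at once; this sketch uses one for clarity.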

      References Cited:

      Chaker, Zayna, Paolo Codega, and Fiona Doetsch. 2016. “A Mosaic World: Puzzles Revealed by Adult Neural Stem Cell Heterogeneity.” Wiley Interdisciplinary Reviews. Developmental Biology 5 (6): 640–58.

      Delgado, Ryan N., Benjamin Mansky, Sajad Hamid Ahanger, Changqing Lu, Rebecca E. Andersen, Yali Dou, Arturo Alvarez-Buylla, and Daniel A. Lim. 2020. “Maintenance of Neural Stem Cell Positional Identity by.” Science 368 (6486): 48–53.

      Fiorelli, Roberto, Kasum Azim, Bruno Fischer, and Olivier Raineteau. 2015. “Adding a Spatial Dimension to Postnatal Ventricular-Subventricular Zone Neurogenesis.” Development 142 (12): 2109–20.

      Fuentealba, Luis C., Santiago B. Rompani, Jose I. Parraguez, Kirsten Obernier, Ricardo Romero, Constance L. Cepko, and Arturo Alvarez-Buylla. 2015. “Embryonic Origin of Postnatal Neural Stem Cells.” Cell 161 (7): 1644–55.

      Furutachi, Shohei, Hiroaki Miya, Tomoyuki Watanabe, Hiroki Kawai, Norihiko Yamasaki, Yujin Harada, Itaru Imayoshi, et al. 2015. “Slowly Dividing Neural Progenitors Are an Embryonic Origin of Adult Neural Stem Cells.” Nature Neuroscience 18 (5): 657–65.

      Gorski, Jessica A., Tiffany Talley, Mengsheng Qiu, Luis Puelles, John L. R. Rubenstein, and Kevin R. Jones. 2002. “Cortical Excitatory Neurons and Glia, but Not GABAergic Neurons, Are Produced in the Emx1-Expressing Lineage.” The Journal of Neuroscience: The Official Journal of the Society for Neuroscience 22 (15): 6309–14.

      Kazanis, Ilias, Kimberley A. Evans, Evangelia Andreopoulou, Christina Dimitriou, Christos Koutsakis, Ragnhildur Thora Karadottir, and Robin J. M. Franklin. 2017. “Subependymal Zone-Derived Oligodendroblasts Respond to Focal Demyelination but Fail to Generate Myelin in Young and Aged Mice.” Stem Cell Reports 8 (3): 685–700.

      Kooy, D. van der, and S. Weiss. 2000. “Why Stem Cells?” Science 287 (5457): 1439–41.

      Merkle, Florian T., Luis C. Fuentealba, Timothy A. Sanders, Lorenza Magno, Nicoletta Kessaris, and Arturo Alvarez-Buylla. 2014. “Adult Neural Stem Cells in Distinct Microdomains Generate Previously Unknown Interneuron Types.” Nature Neuroscience 17 (2): 207–14.

      Merkle, Florian T., Zaman Mirzadeh, and Arturo Alvarez-Buylla. 2007. “Mosaic Organization of Neural Stem Cells in the Adult Brain.” Science 317 (5836): 381–84.

      Morshead, C. M., B. A. Reynolds, C. G. Craig, M. W. McBurney, W. A. Staines, D. Morassutti, S. Weiss, and D. van der Kooy. 1994. “Neural Stem Cells in the Adult Mammalian Forebrain: A Relatively Quiescent Subpopulation of Subependymal Cells.” Neuron 13 (5): 1071–82.

      Ponti, Giovanna, Kirsten Obernier, Cristina Guinto, Lingu Jose, Luca Bonfanti, and Arturo Alvarez-Buylla. 2013. “Cell Cycle and Lineage Progression of Neural Progenitors in the Ventricular-Subventricular Zones of Adult Mice.” Proceedings of the National Academy of Sciences of the United States of America 110 (11): E1045–54.

      Puelles, Luis, Loreta Medina, Ugo Borello, Isabel Legaz, Anne Teissier, Alessandra Pierani, and John L. R. Rubenstein. 2016. “Radial Derivatives of the Mouse Ventral Pallium Traced with Dbx1-LacZ Reporters.” Journal of Chemical Neuroanatomy 75 (Pt A): 2–19.

      Reeve, Rachel L., Samantha Z. Yammine, Cindi M. Morshead, and Derek van der Kooy. 2017. “Quiescent Oct4 Neural Stem Cells (NSCs) Repopulate Ablated Glial Fibrillary Acidic Protein NSCs in the Adult Mouse Brain.” Stem Cells 35 (9): 2071–82.

      Reynolds, B. A., and S. Weiss. 1992. “Generation of Neurons and Astrocytes from Isolated Cells of the Adult Mammalian Central Nervous System.” Science 255 (5052): 1707–10.

      Tsai, Hui-Hsin, Huiliang Li, Luis C. Fuentealba, Anna V. Molofsky, Raquel Taveira-Marques, Helin Zhuang, April Tenney, et al. 2012. “Regional Astrocyte Allocation Regulates CNS Synaptogenesis and Repair.” Science 337 (6092): 358–62.

      Xie, Xuanhua P., Dan R. Laks, Daochun Sun, Asaf Poran, Ashley M. Laughney, Zilai Wang, Jessica Sam, et al. 2020. “High Resolution Mouse Subventricular Zone Stem Cell Niche Transcriptome Reveals Features of Lineage, Anatomy, and Aging.” Cold Spring Harbor Laboratory. https://doi.org/10.1101/2020.07.27.223602.

    1. And many white working-class voters feel a sense of subordination, derived from a lack of formal education, and that can play a part in their politics. Back in the early 1970s, the sociologists Richard Sennett and Jonathan Cobb recorded these attitudes in a study memorably titled The Hidden Injuries of Class. This sense of vulnerability is perfectly consistent with feeling superior in other ways. Working-class men often think that middle-class and upper-class men are unmanly or undeserving. Still, a significant portion of what we call the American white working class has been persuaded that, in some sense, they do not deserve the opportunities that have been denied to them. They may complain that minorities have unfair advantages in the competition for work and the distribution of government benefits. Nevertheless, they do not think it is wrong either that they do not get jobs for which they believe they are not qualified, or that the jobs for which they are qualified are typically less well paid. They think minorities are getting “handouts” – and men may feel that women are getting unfair advantages, too – but they don’t think the solution is to demand handouts for themselves. They are likely to regard the treatment of racial minorities as an exception to the right general rule: they think the US mostly is and certainly should be a society in which opportunities belong to those who have earned them.

      A bit of deja vu because it's still happening today

    1. But this “rationalization” account, though compelling in some contexts, does not strike us as the most natural or most common explanation of the human weakness for misinformation. We believe that people often just don’t think critically enough about the information they encounter.

      On the other hand, there is the risk of being critical of everything you read or hear out of paranoia about misinformation. You may dismiss something that is true. Usually we turn this on and off with our political beliefs.

    1. restrict our discussion, as Althusser himself seems to do, to the “individual” interpellation which generates the subject-position of an “individual,” the class content of such subject-formation will not emerge, since it can only become visible as such when we grasp the positioning of that particular subject against radically different interpellations. Within the individual “consciousness” this differential relationship must remain external; it is not experienced from within, but interpretively added on by some apparently omniscient commentator.

      In sum, I think, at odds with Althusser's subject-position interpellation, since the school has to be there institutionally to act on the subject. To the subject these classes may not be so, but would be ascribed from above.

    1. I think of all the atrocities we have committed as members of the church: I am saying “we”, not “they”: “we”. The Constitutions of my own congregation reminds me: In Christ we unite ourselves to the whole of humanity, especially to the poor and suffering. We accept our share of responsibility for the sin of the world and so live that his love may prevail. (SHCJ Constitutions #6). I think all of us must acknowledge that our mediocrity, hypocrisy and complacency have brought us to this disgraceful and scandalous place that we find ourselves as a church.

      [8:01 - 8:54]

    1. High-BIA-producing cultivars have lost substantial genetic diversity through successive bottlenecks owing to domestication and long-term selective breeding for traits that increase yield

      This is interesting to think about given what we talked about today. What happens in conservation when you can synthesize something valuable in a lab? Well, before we even get to lab synthesis, what happens when you can easily cultivate it? This! A loss of genetic diversity which arguably is just as bad as habitat loss and extinction. While there is a little bit of a saving grace, it may not always be enough to work with.

    1. It also raises questions about whether and how an author’s works should be posthumously curated to reflect evolving social attitudes, and what should be preserved as part of the cultural record.

      I think it is also crucial to remember that we should consider how and why this passes publishing companies. The types of books that get published, even when they are problematic will tell what the companies, authors, and parts of society truly think and thoughts they prioritize depending on the era, world events, and harness the power to hurt, though it may not be the intent at the time.

  6. Mar 2021
    1. Author Response:

      Reviewer #1 (Public Review):

      In this project, the authors set out to create an easy to use piece of software with the following properties: The software should be capable of creating immersive (closed loop) virtual environments across display hardware and display geometries. The software should permit easy distribution of formal experiment descriptions with minimal changes required to adapt a particular experimental workflow to the hardware present in any given lab while maintaining world-coordinates and physical properties (e.g. luminance levels and refresh rates) of visual stimuli. The software should provide equal or superior performance for generating complex visual cues and/or immersive visual environments in comparison with existing options. The software should be automatically integrated with many other potential data streams produced by 2-photon imaging, electrophysiology, behavioral measurements, markerless pose estimation processing, behavioral sensors, etc.

      To accomplish these goals, the authors created two major software libraries. The first is a package for the Bonsai visual programming language called "Bonsai.Shaders" that brings traditionally low-level, imperative OpenGL programming into Bonsai's reactive framework. This library allows shader programs running on the GPU to seamlessly interact, using drag and drop visual programming, with the multitude of other processing and IO elements already present in numerous Bonsai packages. The creation of this library alone is quite a feat given the complexities of mapping the procedural, imperative, and stateful design of OpenGL libraries to Bonsai's event driven, reactive architecture. However, this library is not mentioned in the manuscript despite its power for tasks far beyond the creation of visual stimuli (e.g. GPU-based coprocessing) and, unlike BonVision itself, is largely undocumented. I don't think that this library should take center stage in this manuscript, but I do think its use in the creation of BonVision as well as some documentation on its operators would be very useful for understanding BonVision itself.

      We have added a reference to the Shaders package at multiple points in the manuscript including lines 58-59 and in Supplementary Details. We will be adding documentation of key Shaders nodes that are important for the creation of BonVision stimuli to the documentation on the BonVision website.

      Following the creation of Bonsai.Shaders, the authors used it to create BonVision which is an abstraction on top of the Shaders library that allows plug and play creation of visual stimuli and immersive visual environments that react to input from the outside world. Impressively, this library was implemented almost entirely using the Bonsai visual programming language itself, showcasing its power as a domain-specific language. However, this fact was not mentioned in the manuscript and I feel it is a worthwhile point to make.

      Thank you - we have now added clarification on this in Supplementary details (section Customised nodes and new stimuli)

      The design of BonVision, combined with the functional nature of Bonsai, enforces hard boundaries between the experimental design of visual stimuli and (1) the behavioral input hardware used to drive them, (2) the dimensionality of the stimuli (i.e. 2D textures via 3D objects), (3) the specific geometry of 3D displays (e.g. dual monitors, versus spherical projection, versus head mounted stereo vision hardware), and (4) automated hardware calibration routines. Because of these boundaries, experiments designed using BonVision become easy to share across labs even if they have very different experimental setups. Since Bonsai has integrated and standardized mechanisms for sharing entire workflows (via copy paste of XML descriptions or upload of workflows to publicly accessible Nuget package servers), this feature is immediately usable by labs in the real world.

      After creating these pieces of software, the authors benchmarked them against other widely used alternatives. BonVision met or exceeded frame rate and rendering latency performance measures when compared to other single purpose libraries. BonVision is able to do this while maintaining its generality by taking advantage of advanced JIT compilation features provided by the .NET runtime and using bindings to low-level graphics libraries that were written with performance in mind. The authors go on to show the real-world utility of BonVision's performance by mapping the visual receptive fields of LFP in mouse superior colliculus and spiking in V1. The fact that they were able to obtain receptive fields indicates that visual stimuli had sufficient temporal precision. However, I do not follow the logic as to why this is because the receptive fields seem to have been created using post-hoc aligned stimulus-ephys data, that was created by measuring the physical onset times of each frame using a photodiode (line 389). Wouldn't this preclude any need for accurate stimulus timing presentation?

      We thank the reviewer for this suggestion. We now include receptive field maps calculated using the BonVision timing log in Figure5 – figure supplement 1. Using the BonVision timing alone was also effective in identifying receptive fields.

      Finally the authors use BonVision to perform one human psychophysical and several animal VR experiments to prove the functionality of the package in real-world scenarios. This includes an object size discrimination task with humans that relies on non-local cues to determine the efficacy of the cube map projection approach to 3D spaces (Fig 5D). Although the results seem reasonable to me (a non-expert in this domain), I feel it would be useful for the authors to compare this psychophysical discrimination curve to other comparable results. The animal experiments prove the utility of BonVision for common rodent VR tasks.

      The psychometric test we performed on human subjects was primarily to test the ability of BonVision to present VR stimuli on a head-mounted display. We have edited the text to reflect this. The efficacy of the cube map approach for 3D spaces is well-established in computer graphics and gaming and is currently the industry standard, which was the reason for our choice.

      In summary, the professionalism of the code base, the functional nature of Bonsai workflows, the removal of overhead via advanced JIT compilation techniques, the abstraction of shader programming to high-level drag and drop workflows, integration with a multitude of input and output hardware, integrated and standardized calibration routines, and integrated package management and workflow sharing capabilities make Bonsai/BonVision serious competitors to widely-used, closed-source visual programming tools for experiment control such as LabView and Simulink. BonVision showcases the power of the Bonsai language and package management ecosystem while providing superior design to alternatives in terms of ease of integration with data sources and facilitation of sharing standardized experiments. The authors exceeded the apparent aims of the project and I believe BonVision will become a widely used tool that has major benefits for improving experiment reproducibility across laboratories.

      Reviewer #2 (Public Review):

      BonVision is a package to create virtual visual environments, as well as classic visual stimuli. Running on top of Bonsai-RX, it tries and succeeds in removing the complexity of the above mentioned task and creating a framework that allows non-programmers the opportunity to create complex, closed loop experiments, with enough speed to capture receptive fields while recording from different brain areas.

      At the time of the review, the paper benchmarks the system using 60Hz stimuli, which is more than sufficient for the species tested, but leaves an open question on whether it could be used for other animal models that have faster visual systems, such as flies, bees etc.

      Thank you for prompting us to do this - we have now added new benchmarks for a faster refresh rate (144 Hz; new Figure 4 - figure supplement 1).

      The authors do show in a nice way how the system works and give examples for interested readers to start their first workflows with it. Moreover, they compare it to other existing software, making sure that readers know exactly what "they are buying" so they can make an informed decision when starting with the package.

      Being written to run on top of Bonsai-RX, BonVision directly benefits from the great community effort that exists in expanding Bonsai, such as its integration with DeepLabCut and Auto-pi-lot. This shows that developing open source tools and fostering a community is a great way to bring research forward in an additive and less competitive way.

      Reviewer #3 (Public Review):

      Major comments:

      While much of the classic literature on visual systems has utilized egocentrically defined ("2D") stimuli, it seems logical to project that present and future research will extend to not only 3D objects but also 3D environments where subjects can control their virtual locations and viewing perspectives. A single software package that easily supports both modalities can therefore be of particular interest to neuroscientists who wish to study brain function in 3D viewing conditions while also referencing findings to canonical 2D stimulus responses. Although other software packages exist that are specialized for each of the individual functionalities of BonVision, I think that the unifying nature of the package is appealing for reasons of reducing user training and experimental setup time costs, especially with the semi-automated calibration tools provided as part of the package. The provisions of documentation, demo experiments, and performance benchmarks are all highly welcome and one would hope that with community interest and contributions, this could make BonVision very friendly to entry by new users.

      Given that one function of this manuscript is to describe the software in enough detail for users to judge whether it would be suited to their purposes, I feel that the writing should be fleshed out to be more precise and detailed about what the algorithms and functionalities are. This includes not shying away from stating limitations, which as I see it is just the reality of no tool being universal, and which for that reason is some of the most important information to transmit to potential users. My following comments point out various directions in which I think the manuscript can be improved.

      We thank the reviewer for this suggestion. We have added a major new section, “Supplementary Details”, where we have highlighted known limitations and available workarounds. We also added new rows in the Supplementary Table that make these limitations transparent (eg. web-based deployment).

      The biggest point of confusion for me was whether the 3D environment functionality of BonVision is the same as that provided by virtual spatial environment packages such as ViRMEn and gaming engines such as Unity. In the latter software, the virtual environment is specified by geometrically laying out the shape of the traversable world and locations of objects in it. The subject then essentially controls an avatar in this virtual world that can move and turn, and the software engine computes the effects of this movement (i.e. without any additional user code) then renders what the avatar should see onto a display device. I cannot figure out if this is how BonVision also works. My confusion can probably be cured by some additional description of what exactly the user has to do to specify the placement of 3D objects. From the text on cube mapping (lines 43 and onwards), I guessed that perhaps objects should be specified by their vectorial displacement from the subject, but I have very little confidence in my guess and also cannot locate this information either in the Methods or the software website. For Figure 5F it is mentioned that BonVision can be used to implement running down a virtual corridor for a mouse, so if some description can be provided of what the user has to do to implement this and what is done by the software package, that may address my confusion. If BonVision is indeed not a full 3D spatial engine, it would be important to mention these design/intent differences in the introduction as well as Supplementary Table 1.

      Thank you for prompting us to do this. BonVision does indeed essentially render the view of an avatar in a virtual world (or multiple views, of multiple avatars), without any additional coding required by the user. We have now included in the new “Supplementary Details” specific pathways to the construction and rendering of 3D scenes. We have avoided the use of the terminology ‘game-engine’ as it has a particular definition that most software packages do not satisfy.
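The cube-mapping idea raised in the reviewer's question can be illustrated generically: a view direction from the observer selects one of six cube faces according to its largest-magnitude component, and the other two components give the position within that face. The sketch below is not BonVision's code. The face labels and the orientation of the in-face coordinates are assumptions made for illustration and do not follow any particular graphics API's convention.

```python
def cube_face(direction):
    """Return (face, a, b) for a non-zero 3D view direction.

    `face` names the cube face the direction points at; `a` and `b` are
    the remaining two components scaled into [-1, 1] on that face.
    """
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if max(ax, ay, az) == 0:
        raise ValueError("direction must be non-zero")
    if ax >= ay and ax >= az:
        # x is the dominant axis
        return ("+x" if x > 0 else "-x", y / ax, z / ax)
    if ay >= az:
        # y is the dominant axis
        return ("+y" if y > 0 else "-y", x / ay, z / ay)
    # z is the dominant axis
    return ("+z" if z > 0 else "-z", x / az, y / az)

# Hypothetical example: an object slightly up and to the side of straight ahead (+x)
face, a, b = cube_face((1.0, 0.5, -0.25))
```

Because the scene is rendered once onto the six faces around the observer, any display geometry can then be served by looking up the direction of each display pixel in this way.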

      More generally, it would be useful to provide an overview of what the closed-loop rendering procedure is, perhaps including a Figure (different from Supplementary Figure 2, which seems to be regarding workflow but not the software platform structure). For example, I imagine that after the user-specified texture/object resources have been loaded, then some engine runs a continual loop where it somehow decides the current scene. As a user, I would want to know what this loop is and how I can control it. For example, can I induce changes in the presented stimuli as a function of time, whether this time-dependence has to be prespecified before runtime, or can I add some code that triggers events based on the specific history of what the subject has done in the experiment, and so forth. The ability to log experiment events, including any viewpoint changes in 3D scenes, is also critical, and most experimenters who intend to use it for neurophysiological recordings would want to know how the visual display information can be synchronized with their neurophysiological recording instrumental clocks. In sum, I would like to see a section added to the text to provide a high-level summary of how the package runs an experiment loop, explaining customizable vs. non-customizable (without directly editing the open source code) parts, and guide the user through the available experiment control and data logging options.

      We have now added a brief paragraph regarding the basic structure of a BonVision program, and how to ‘close the loop’ in the new “Supplementary Details”.

      Having some experience myself with the tedium (and human-dependent quality) of having to adjust either the experimental hardware or write custom software to calibrate display devices, I found the semi-automated calibration capabilities of BonVision to be a strong selling point. However I did not manage to really understand what these procedures are from the text and Figure 2C-F. In particular, I'm not sure what I have to do as a user to provide the information required by the calibration software (surely it is not the pieces of paper in Fig. 2C and 2E..?). If for example, the subject is a mouse head-fixed on a ball as in Figure 1E, do I have to somehow take a photo from the vantage of the mouse's head to provide to the system? What about the augmented reality rig where the subject is free to move? How can the calibration tool work with a single 2D snapshot of the rig when e.g. projection surfaces can be arbitrarily curved (e.g. toroidal and not spherical, or conical, or even more distorted for whatever reasons)? Do head-mounted displays require calibration, and if so how is this done? If the authors feel all this to be too technical to include in the main text, then the information can be provided in the Methods. I would however vote for this as being a major and important aspect of the software that should be given air time.

      We have a dedicated webpage going through the step-by-step protocol for the automated screen calibration. We now explicitly point to this page in the new Supplementary Details section.

      As the hardware-limited speed of BonVision is also an important feature, I wonder if the same ~2 frame latency holds also for the augmented reality rendering where the software has to run both pose tracking (DeepLabCut) as well as compute whole-scene changes before the next render. It would be beneficial to provide more information about which directions BonVision can be stressed before frame-dropping, which may perhaps be different for the different types of display options (2D vs. 3D, and the various display device types). Does the software maintain as strictly as possible the user-specified timing of events by dropping frames, or can it run into a situation where lags can accumulate? This type of technical information would seem critical to some experiments where timings of stimuli have to be carefully controlled, and regardless one would usually want to have the actual display times logged as previously mentioned. Some discussion of how a user might keep track of actual lags in their own setups would be appreciated.

      We now provide this as part of the Supplementary Details, specifically animation and timing lags.
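As one way a user might keep track of lags in their own setup, the sketch below summarises dropped frames and accumulated lag from a list of logged frame onset times. It assumes the timestamps are in seconds and come from some per-frame log (for example a photodiode trace or a software timing log). The function name and log format are hypothetical and are not part of BonVision.

```python
def frame_report(timestamps, refresh_hz=60.0):
    """Summarise dropped frames and accumulated lag from frame onset times (s)."""
    if len(timestamps) < 2:
        raise ValueError("need at least two frame timestamps")
    period = 1.0 / refresh_hz
    dropped = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        # Number of refresh periods spanned by this interval, to the nearest integer
        steps = round((cur - prev) / period)
        if steps > 1:
            dropped += steps - 1
    # Lag: how much later the last frame appeared than it would have with no drops
    expected_end = timestamps[0] + (len(timestamps) - 1) * period
    return {"dropped": dropped, "lag_s": timestamps[-1] - expected_end}

# Hypothetical log at 60 Hz in which the frame due at 3/60 s was skipped
log = [i / 60.0 for i in (0, 1, 2, 4, 5)]
report = frame_report(log)
```

If the software keeps to the intended schedule by dropping frames, the stimulus content stays aligned to wall-clock time even though `dropped` is non-zero; if it instead shows every frame late, the lag grows with each missed refresh.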

      On the augmented reality mode, I am a little puzzled by the layout of Figure 3 and the attendant video, and I wonder if this is the best way to showcase this functionality. In particular, I'm not entirely sure what the main scene display is although it looks like some kind of software rendering — perhaps of what things might look like inside an actual rig looking in from the top? One way to make this Figure and Movie easier to grasp is to have the scene display be the different panels that would actually be rendered on each physical panel of the experiment box. The inset image of the rig should then have the projection turned on, so that the reader can judge what an actual experiment looks like. Right now it seems for some reason that the walls of the rig in the inset of the movie remain blank except for some lighting shadows. I don't know if this is intentional.

      Because we have had limited experimental capacity in this period, we only simulated a real-time augmented reality environment off-line, using pre-existing videos of animal behaviour. We think that the comment above reflects a misunderstanding of what the Figure and associated Supplementary Movie represents, and we realise that their legends were not clear enough. We have now made sure that these legends make clear that these are based on simulations (new legends for Figure 3 and Figure 3 - video supplement 1).

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Author responses are written in bold and are italicized. We have underlined the important points in the reviewer's comments. All responses have been read and authorized by all authors of this manuscript. Authors would like to thank the reviewers and the editor for their valuable time. We believe that the comments and suggestions from both reviewers will significantly improve SMorph and the manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      First of all, I want to apologize to the authors and editor for my delay. Secondly, for clarity, I want to disclose that I am the author of the Fiji's 'Sholl Analysis' plugin, that the authors cite extensively (Ferreira et al, Nat Methods, 2014).

      In this study, Sethi et al introduce a software tool - SMorph - for bulk morphometric analysis of neurons and glia (astrocytes and microglia), based on the Sholl technique. The authors compare it to the state-of-the-art in a series of validation experiments (stab wound injury), to conclude that it is 1000 times faster than existing tools. Empowered by the tool, the authors show that chronic administration of a tricyclic antidepressant (DMI) leads to structural changes of astrocytes in the mouse hippocampus. The paper is well written, the description of the tool is clear, and the authors make all of the source code available, as well as most of the imagery analyzed in the manuscript. The latter on its own makes me really appreciative of the authors' work.
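For readers unfamiliar with the Sholl technique mentioned above: it counts how many times a cell's processes cross concentric circles centred on the soma. The sketch below illustrates that counting on traced paths. It is not the code of SMorph or of the Fiji plugin, both of which handle details this sketch omits; the function name, the polyline input format, and the toy arbor are assumptions for illustration.

```python
from math import hypot

def sholl_counts(paths, center, radii):
    """Count crossings of traced paths with circles of the given radii.

    `paths` is a list of polylines, each a list of (x, y) points. A segment
    is counted as crossing a circle when its endpoints lie on opposite sides
    of it, which is accurate when segments are short relative to the spacing
    between radii.
    """
    cx, cy = center
    counts = []
    for r in radii:
        n = 0
        for path in paths:
            dists = [hypot(x - cx, y - cy) for x, y in path]
            for d0, d1 in zip(dists, dists[1:]):
                if (d0 < r) != (d1 < r):
                    n += 1
        counts.append(n)
    return counts

# Hypothetical toy arbor: one primary process that splits into two at (2, 0)
paths = [
    [(0, 0), (1, 0), (2, 0)],
    [(2, 0), (3, 1), (4, 2)],
    [(2, 0), (3, -1), (4, -2)],
]
counts = sholl_counts(paths, center=(0, 0), radii=[1.5, 2.5, 3.5])
```

The resulting profile of counts against radius is the basis for summary metrics such as the maximum number of crossings and the radius at which it occurs.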

      We thank reviewer #1 for their careful reading of the manuscript and their comments.

      Major comments:

      A major strength of SMorph is that it leverages the Python ecosystem, which allows the authors to take advantage of powerful python packages such as sklearn, without the need for external packages or tools. However, I have strong criticisms for the claims that are made in terms of speed and broad-applicability of the software, including PCA.

      Speed:

      The 1000x speed gains, assumes - for the most part -- <u>that the processing in Fiji cannot be automated</u>. This is false. I read the source code of SMorph, and with exception of the PCA analysis, all aspects of SMorph can be automated in Fiji, using any of Fiji's scripting languages to make direct calls to the Fiji and Sholl Analysis plugin APIs (See https://javadoc.scijava.org/) . Now, perhaps the authors do not have experience with ImageJ scripting, or perhaps we Fiji developers failed to provide clear tutorials and examples on how to do so. Or perhaps, there is something inherently cumbersome with Fiji scripting that makes this hard (e.g., there is a current limitation with the ImageJ2 version of 'Sholl Analysis' that does not make it macro recordable). It such limitations do exist, it is perfectly fine to mention them, but do contact us at https://forum.image.sc, if something is unclear. We do strive to make our work as re-usable as possible. Unfortunately our own research does not always allow us the time required to do so. Case in point, our scripting examples (e.g., https://github.com/tferr/ASA/blob/master/scripting-examples/3D_Analysis_ImageStack.py; https://github.com/tferr/ASA/blob/master/scripting-examples/3D_Analysis_ImageStack.py) are not well advertised. <u>That being said, I am still surprised that in their side-by-side comparisons the authors were not able to automate more the processing steps</u> (e.g., the ImageJ1 version of 'Sholl Analysis' remains fully functional and is macro recordable). If I misunderstood what was done, please provide the ImageJ macros you used. Also, I wanted to mention that i) semi-manual tracing with Simple Neurite Tracer (now "SNT"), can also be scripted (see https://doi.org/10.1101/2020.07.13.179325); and that ii) Fiji commands and plugins can also be called in native python using pyimagej (https://pypi.org/project/pyimagej/), see e.g., https://github.com/morphonets/SNT/tree/master/notebooks#snt-notebooks). 

      Arguably, the fact that SMorph handles blob detection and skeletonization-based metrics directly is more advantageous from a user point of view. In Fiji, blob detection, skeletonization and Strahler analysis (https://imagej.net/Strahler_Analysis) of the skeleton are handled by different plugins. However, those are also fully scriptable, and interoperate well. The point that topographic skeletonization in Fiji can originate loops is valid; however, the authors should know that such cycles can be detected and pruned programmatically using e.g., pixel intensities (see https://imagej.net/AnalyzeSkeleton.html#Loop_detection_and_pruning and the original publication, https://pubmed.ncbi.nlm.nih.gov/20232465/).

      We completely agree with the reviewer’s assertion that most of SMorph’s functionality can be automated within ImageJ as well, and that in such a comparison the speed gains with SMorph will not be >1000X.

      However, automating the analysis in ImageJ is beyond the scope of the present manuscript. In fact, the ImageJ comparison was not part of our original manuscript at all. Upon presubmission inquiry to one of the affiliate journals of Review Commons, we were specifically asked to include a side-by-side comparison with <u>“already available”</u> methods. So, we decided to use ImageJ as it is, and automation, if any, was limited to simple macros that run a series of commands sequentially on batches of images. Although it is true that this analysis could be done much more efficiently with additional scripting, that would not have met the definition of “already available” tools. The ImageJ analysis was performed the way an average biologist with no programming experience would perform it, since that group will find SMorph most useful. In no way do we intend to imply that ImageJ analysis can’t be made more efficient and automated. Perhaps this was not clear from the way the text was framed in the initial version of the manuscript. We will add text to make this point clearer.

      On a side-note, in response to reviewer #2’s comments, we will perform the speed comparison on a per-image basis, so the speed gain (1080X) may change a little in the new comparison.

      Broad applicability:

      In our work, we made a significant effort to ensure that automated Sholl analysis could be performed on any cell type, e.g., by supporting 2D and 3D images, by allowing repeated measures at each sampled distance, and by improving curve fitting. For linear profiles, we implemented the ability to perform <u>polynomial fits of arbitrary degree, and implemented heuristics for 'best degree' determination</u>. For normalized profiles, we implemented several normalizers, and alternatives for determining regression coefficients. We did not tackle segmentation of images directly (we did provide some accompanying scripts to aid users, see e.g. https://imagej.net/BAR) because in our case that is handled directly by ImageJ and Fiji's large collection of plugins. However, <u>in SMorph, several of these parameters are hard-wired in the code</u>. They may be suitable for the analyzed images, but they can hardly be generalized to other datasets. In detail: in terms of segmentation, SMorph is restricted to 2D images, scales data to a fixed 98th percentile, and uses a fixed auto-threshold method (Otsu). These settings are tethered to the authors' imagery. They will give poor results for someone using a different imaging setup or staining method. In terms of curve fitting, the polynomial regression seems to be fixed at a 3rd order polynomial, which will not be suitable for different cell types (not even for all cells of 'radial morphology').

      We have indeed hard-coded the parameters that the reviewer mentions, and we agree that we can perhaps give all options to the end-users to choose from. The decision was made to hard-code the parameters so that SMorph becomes very easy and minimalistic to use for the end-users. But the reviewer is right to point out that this may compromise the broad applicability and accuracy. We will update the code in the revised version of the manuscript to give the users control over choosing these parameters.
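
      As an illustration only, and not taken from SMorph's code, the sketch below shows one way the hard-coded values named by the reviewer (98th percentile scaling, Otsu thresholding, 3rd order polynomial) could become overridable defaults. The names `AnalysisParams` and `rescale_to_percentile` are hypothetical, and the nearest-rank percentile is an assumption chosen for simplicity.

```python
import math
from dataclasses import dataclass


@dataclass
class AnalysisParams:
    """Hypothetical user-facing settings; defaults mirror the values the reviewer lists."""
    percentile: float = 98.0        # upper intensity percentile used for rescaling
    threshold_method: str = "otsu"  # name of the auto-threshold method
    poly_degree: int = 3            # degree of the polynomial fitted to the Sholl profile


def rescale_to_percentile(pixels, percentile):
    """Clip non-negative intensities at a percentile (nearest-rank) and scale to [0, 1]."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(pixels)
    # Nearest-rank percentile: smallest value with at least `percentile` % of the data at or below it.
    rank = max(1, math.ceil(len(ordered) * percentile / 100))
    upper = ordered[rank - 1]
    if upper <= 0:
        return [0.0 for _ in pixels]
    return [min(p, upper) / upper for p in pixels]


# Defaults reproduce the current behaviour; a user with different imagery can override them.
default_params = AnalysisParams()
custom_params = AnalysisParams(percentile=99.5, poly_degree=5)
```

      Keeping the current values as defaults would preserve the minimal interface for most users while letting others adapt the analysis to a different imaging setup.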

      PCA:

      <u>The idea of making PCA analysis of Sholl-based morphometry accessible to a broader user base has merit and is welcomed</u>. However, it has to be done carefully, in a <u>self-critical manner as opposed to a black-box solution</u>. E.g., the text mentions that 2 principal components are used, whereas the tutorial notebook uses 3. <u>Why not provide intuitive scree plots that empower users to critique that choice?</u> Also, it would be useful for users to understand which metrics correlate with each other, and their variable weights.

      Reviewer #1’s suggestions would indeed make the PCA analysis more useful to the users. In the revised version of the code, we will provide additional data/plots to the user for making an informed choice of the significant principal components e.g. the elbow method, Ogive or Pareto plots, variable weights of different features in the principal components and correlation/covariance matrices.
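
      The numbers behind a scree or Pareto plot are simple to derive once the PCA eigenvalues are available. The following is a minimal sketch, assuming the eigenvalues have already been computed by whichever PCA routine is used; the function names and the 0.9 default threshold are illustrative assumptions, not SMorph's API.

```python
def explained_variance(eigenvalues):
    """Return (ratios, cumulative) for PCA eigenvalues, sorted in descending order."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive number")
    ordered = sorted(eigenvalues, reverse=True)
    ratios = [value / total for value in ordered]  # heights of the scree plot bars
    cumulative = []
    running = 0.0
    for ratio in ratios:
        running += ratio
        cumulative.append(running)                 # line of the Pareto plot
    return ratios, cumulative


def components_for_threshold(eigenvalues, threshold=0.9):
    """Smallest number of leading components whose cumulative variance reaches `threshold`."""
    _, cumulative = explained_variance(eigenvalues)
    for count, value in enumerate(cumulative, start=1):
        if value >= threshold - 1e-12:  # small tolerance for floating-point round-off
            return count
    return len(cumulative)


# Made-up eigenvalues for four components.
ratios, cumulative = explained_variance([4.0, 2.0, 1.0, 1.0])
n_components = components_for_threshold([4.0, 2.0, 1.0, 1.0], threshold=0.7)
```

      Reporting the ratios alongside the chosen number of components lets users see how much variance a 2-component versus a 3-component summary would retain.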

      When we showcased the utility of PCA to distinguish closely related morphology groups (as in Type-1 and Type-2 PV neurons), we had been unable to base the distinction on individual metrics, at least not in a robust manner (see Fig. S4 in Ferreira et al, 2014). <u>A minor conundrum of the paper is that it does not directly highlight the advantages of "analyzes in a multidimensional space"</u>. The differences between groups in the stab wound and DMI assays are such that PCA is hardly needed: i.e., the differences depicted in Fig. 2F,G are already significant, and already convey changes in "size and branch complexity" (as per PC1). The same argument applies to Fig. 5. The paper would profit from having this discussed.

      PCA data is indeed not required for any of the inferences we make in the paper, and in that sense it is superfluous. However, as mentioned in the discussion section of this manuscript, the low-dimensional PCA data can be used in future for other applications, e.g., to cluster the astrocytes into morphometrically-defined subpopulations. SMorph can be further developed to perform real-time classification of these cells into morphometric clusters, which will allow researchers to investigate cluster-specific gene expression, electrophysiology etc. Preliminary results from our lab do suggest that such clusters are differentially altered by stress and antidepressant treatments. However, these results are part of a long-term future study and are too premature to publish at this stage, since it will require a lot of experimentation to show that these astrocyte subpopulations are indeed physiologically and functionally different. Nevertheless, we think that the utility of SMorph for such analyses may help others to come up with additional innovative ways to use the PCA data. Hence, we do believe that the community will benefit from the current release of SMorph having PCA. PCA data was shown in the figures just to demonstrate the functionality of SMorph. We will add text to make these points clearer.

      Other:

      • All metrics and parameters should be expressed in physical units (e.g.," radii increasing by 3 pixels", axes in Figure 2, 3, 5, S2) so that readers can directly interpret them.

      In the revised manuscript, we will convert all units into actual physical distances.
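
      The conversion itself is a single multiplication by the image calibration. A minimal sketch, assuming a hypothetical calibration of 0.5 µm per pixel (the real value depends on the objective and acquisition settings) and the 3 pixel radius step quoted by the reviewer:

```python
def radii_in_microns(start_px, step_px, count, microns_per_pixel):
    """Convert Sholl radii sampled in pixels to microns using the image calibration."""
    if microns_per_pixel <= 0:
        raise ValueError("microns_per_pixel must be positive")
    return [(start_px + i * step_px) * microns_per_pixel for i in range(count)]


# Radii increasing by 3 pixels, with an assumed calibration of 0.5 um per pixel.
radii_um = radii_in_microns(start_px=3, step_px=3, count=4, microns_per_pixel=0.5)
```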

      We thank the reviewer for suggesting this paper. We will include this in the discussion of the manuscript.

      Minor comments:

      • Usage of RGB images (8-bit per channel) seems hardly justifiable. Aren't you losing dynamic range of the GFAP signal?

      We agree that we could have captured the images at a higher dynamic range. However, for the changes we observe between treatment groups using GFAP immunoreactivity signal as presented in the manuscript, we do not see an advantage of using higher dynamic range. However, as the reviewer rightly pointed out, under certain conditions, imaging using a higher dynamic range may help and hence, we will include this recommendation in the materials and methods section.

      • Please explain how MaxAbsScaler "prevents sub-optimal results"

      Since morphometric features extracted from cell images either have different units or are scalar, we had to perform normalization before PCA. We will add further explanation in the methods section of the manuscript.
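
      To make the effect of this normalization concrete: max-abs scaling divides each feature by its largest absolute value, so that a feature with a large numeric range (for example, an area) does not dominate the variance that PCA decomposes. The pure-Python sketch below mirrors what scikit-learn's `MaxAbsScaler` does per feature; the feature values are made up for illustration.

```python
def max_abs_scale(rows):
    """Divide every column by its largest absolute value so each feature lies in [-1, 1]."""
    if not rows:
        return []
    n_features = len(rows[0])
    scales = []
    for j in range(n_features):
        largest = max(abs(row[j]) for row in rows)
        scales.append(largest if largest > 0 else 1.0)  # leave all-zero columns unchanged
    return [[row[j] / scales[j] for j in range(n_features)] for row in rows]


# Hypothetical cells described by (area, number of branch points).
cells = [[2000.0, 4.0], [1000.0, 8.0], [500.0, 2.0]]
scaled_cells = max_abs_scale(cells)
```

      After scaling, both columns span a comparable range, so neither feature outweighs the other merely because of its units.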

      • The fact that automated batch processing can stall on a single bad 'contrast ratio' image seems rather cumbersome to deal with

      This problem has been resolved in the current version of SMorph, which will be uploaded with the revised version of the manuscript.
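
      One common pattern for keeping a batch run alive is to catch per-image failures and report them at the end. This is a generic sketch and not necessarily how the updated SMorph handles it; `process_batch` and the `analyze` callback are hypothetical names.

```python
def process_batch(image_names, analyze):
    """Run `analyze` on every image; collect failures instead of aborting the batch."""
    results = {}
    failures = {}
    for name in image_names:
        try:
            results[name] = analyze(name)
        except Exception as error:  # broad on purpose: one bad image must not stop the run
            failures[name] = str(error)
    return results, failures
```

      Returning the failures together with the results also documents which cells were excluded, which matters if exclusion could bias a treatment comparison.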

      We will add a GPLv3 license

      • "mounted on stereotax" should be "mounted on a stereotaxis device"?

      We will make this change

      • Ensure Schoenen is capitalized

      We will make this change

      Reviewer #1 (Significance):

      <u>I find the Desipramine results interesting</u>. However, given the existing claims that DMI can modulate LTP, I regret that the authors did not look at <u>structural modifications in hippocampal neurons</u> (e.g., by performing the experiments in Thy1-M-eGFP animals). I understand that doing so at this point would be a large undertaking.

      Another manuscript from our lab (1), as well as work from other labs, has shown that stress causes significant degenerative changes in hippocampal astrocytes (2,3). In light of these observations, we do believe that our observation of chronic antidepressant treatment inducing structural plasticity in astrocytes is significant. Structural alterations in neurons after DMI treatment are of interest. But in our experience, we have not seen gross morphological (dendritic arborization) changes in hippocampal neurons as a result of antidepressant drug treatments. Such changes are restricted to spine morphology and axonal varicosities, which are beyond the capabilities of SMorph.

      Reviewer #2 (Evidence, reproducibility and clarity):

      This paper addresses the challenge of automatic Sholl analysis of large datasets of multiple cell types such as neurons, astrocytes and microglia. <u>The developed approach should improve the speed of morphology analysis compared to the state of the art without compromising on the accuracy</u>. The authors present an interesting application of their tool to the morphological analysis of astrocytes following chronic antidepressant treatment. The paper is well written, and the tool presented could be <u>beneficial for different applications and contexts</u>. However, some major aspects should be addressed by the authors concerning the description of the algorithms used and the quantification of the results.

      We thank reviewer #2 for their careful reading of the paper and their comments.

      Major comments/Questions:

      1. In the Results and/or Methods sections, the authors should better describe how their approach differs from state-of-the-art approaches in terms of the algorithms used, and how these differences impact the speed and accuracy of the analysis.

      We will add these descriptions in the methods section in response to this comment as well as some comments from reviewer #1.

      1. Imaging was performed on a Zeiss LSM 880 airyscan confocal microscope. Is this method robust to other types of imaging techniques, other microscopes, variable levels of signal-to-noise? This should be tested and quantified.

      We will demonstrate the results obtained from images taken using different microscopes and imaging techniques, and quantify the outcome.

      1. Manual cropping of the cells with ImageJ was used. However, in the methods section, the authors mention that other machine learning tools could be used for this task. Why were these tools not implemented in this paper in order to propose a fully automated analysis approach in combination with SMorph?

      We have tried both the machine learning tools cited in this paper (one for DAB images and the other for confocal images). However, in our experience, we do not get robust performance from these tools with our datasets, and these tools will perhaps need more optimization for broad applicability. We are developing an auto-cropping tool in-house, but that is beyond the scope of the current study. Another point is that these tools are tailor-made for astrocytes, and their integration into SMorph would restrict its applicability to just one cell type.

      1. In the methods section you state that cropped cells need to have a good contrast ratio for automated batch processing. Could you define what a good contrast ratio is and characterize the performance of your approach for different contrast ratio?

      In the revised manuscript, we will compare the images taken from multiple microscopes and quantify the outcome. We will change the text accordingly. As such, the comment on rejected cells referred to really poor quality images. In the revised manuscript, we will make specific recommendations on imaging parameters so that this should not be an issue at all.

      1. It is mentioned that the analysis routine can be interrupted by a cell with lower contrast ratio. This is a major drawback of the approach (but I think that it could be easily improved), as such interruptions may not be practicable for many applications that need to rely on automated processing.

      We have already rectified this problem and the updated version of SMorph will be uploaded with the revised manuscript.

      1. Also, you should specify how the contrast ratio should be enhanced without modifying raw data in order to be processed with your approach. You suggest removing cells with lower contrast ratio from the analysis, but can this affect the findings, especially if some treatments affect the detected fluorescence signal? Can you propose ways to improve the robustness of your approach to variable signal ratios?

      It is indeed possible that removing cells from analysis, may in certain cases, affect the results. To rectify this, we are testing the method on images obtained from different microscopes and under different imaging conditions. From these analyses, we will deduce minimum recommendations for imaging conditions so that images don’t have to be edited/altogether removed from analysis for the software to work. In the materials and methods section, we will add these recommendations to the users on the optimal range of imaging parameters. This way, rejection/modification of images should not be an issue.

      1. In the Results section, you describe the time necessary to perform different analysis. However, giving a total time in hours is not very informative as this will likely vary a lot depending on the size of the dataset, complexity of the images, etc. You should compare the average time per image for both methods and types of analysis.

      We compared the total time required for the entire dataset, since SMorph is meant for batch-processing all the images at once. However, we can change the comparisons to time taken per image. We can divide the total time taken by SMorph by the number of images analysed. However, in our opinion, the time taken to initiate SMorph will make these comparisons inaccurate.
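
      The concern about start-up time can be made explicit by reporting the per-image figure both with and without the one-off initialization cost. A minimal sketch with made-up timings; the function name and all numbers are illustrative assumptions:

```python
def time_per_image(total_seconds, n_images, startup_seconds=0.0):
    """Average processing time per image, optionally excluding a one-off startup cost."""
    if n_images <= 0:
        raise ValueError("n_images must be positive")
    if not 0 <= startup_seconds <= total_seconds:
        raise ValueError("startup_seconds must lie between 0 and total_seconds")
    return (total_seconds - startup_seconds) / n_images


# Hypothetical run: 120 s in total for 50 images, of which 20 s is initialization.
naive = time_per_image(120.0, 50)
steady_state = time_per_image(120.0, 50, startup_seconds=20.0)
```

      The gap between the two figures shrinks as the batch grows, so stating the batch size next to any per-image time keeps the comparison interpretable.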

      1. You state that, for the number of branch points, the lower value of the measured slope when comparing SMorph and ImageJ was related to a constant overestimation of this parameter with ImageJ. How was this quantified? I think you should place more emphasis on the comparison of both approaches with the manually annotated dataset.

      In the revised version of this manuscript, we will include some examples of skeletonized images that overestimate the number of forks. We have observed this to be a recurring problem with the skeletonization tools we have tried in ImageJ. This can be rectified in ImageJ itself, as pointed out by reviewer #1. However, that is beyond the scope of the present study and would not fit the definition of comparison with “already available” methods.

      1. How can you explain the differences in the 2D-projected Area, total skeleton length and convex hull between SMorph and ImageJ, which all show a slope around 0.83? Can you quantify the performance of both methods by comparing them with your manually annotated dataset?

      In the revised version, we will include the correlation data between completely manual and SMorph comparisons. We will discuss these comparisons further in the manuscript and make specific conclusions about the accuracy.
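
      For readers who want to reproduce this kind of method comparison, the slope and correlation between paired measurements can be computed as below. This is a generic least-squares sketch; which method goes on which axis, and the example values, are assumptions for illustration only.

```python
import math


def slope_and_correlation(x, y):
    """Least-squares slope of y on x and the Pearson correlation coefficient."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, with at least two points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("inputs must not be constant")
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Made-up paired measurements of one metric from two methods.
method_a = [10.0, 20.0, 30.0, 40.0]
method_b = [8.0, 16.0, 24.0, 32.0]
slope, r = slope_and_correlation(method_a, method_b)
```

      A slope below 1 with a correlation near 1 indicates a systematic, proportional difference between the methods rather than random disagreement; only a comparison against manual annotation can say which method is closer to the truth.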

      1. In the introduction and discussion, you mention that you present a method that works on neurons, astrocytes and microglia. However, I don't see in the paper the comparison between the accuracy for all these cell types as you seem to have analyzed only the morphology of astrocytes.

      In the revised manuscript, we will include the Sholl analysis comparison (ImageJ vs SMorph) from images of neurons and microglia.

      1. You mention that your method is quite sensitive to variation in contrast ratio. You should quantify the contrast ratio throughout the experiments and ensure that this is not biasing the SMorph analysis for some of the treatments.

      We thank both reviewers for highlighting this issue in the initial version of SMorph. As mentioned in our response to point #6, we will perform additional analyses to make specific recommendations to the end users regarding imaging parameters so that SMorph can work on images as they are. As such, our comments on contrast ratio applied only to very poor quality images. If images are acquired conforming to the imaging parameters we will recommend in the revised manuscript, images can be analysed without any issues.

      Minor Points :

      1. Specify the exact inclusion and exclusion criteria for soma detection and rephrase: "The high-intensity blobs were detected as a position of soma..." & "Boundary blobs coming from adjacent cells...".

      We will add a complete explanation of blob detection and the exclusion criterion in the methods section.

      1. Throughout the text, make sure to always refer to an analysis time per image or per cell, and do not give absolute durations without reference to the task at hand (e.g., in the discussion: SMorph took 40 second to complete the analysis... please state exactly which analysis you are referring to and, if applicable, whether it varies from cell to cell).

      We will change all comparisons to time taken per cell. Text will be added to mention which datasets were used when any claims of speed are made.

      1. When you state in the discussion that "Although some methods do allow Sholl analysis without manual neurite tracing, they still work on one cell at a time", please specify whether the only aspect missing from this type of analysis is batch processing (looping through the data) or whether there is a major obstacle to automating this technique. This is important, as SMorph does proceed with the analysis one cell at a time but can work in a loop/batch.

      We will elaborate further on our assertion regarding the challenges of using ImageJ plugins for Sholl analysis in large batches of cells.

      Reviewer #2 (Significance):

      <u>This tool could be very useful to researchers in the field of cellular neuroscience working with high-throughput analysis of microscopy data</u>. The authors show some interesting improvements over existing methods. An improved quantitative characterization of the robustness of their approach would be of great importance to ensure the significance of this tool to a large community of researchers using different types of microscopes or studying different cell types.

      My expertise is in the field of optical microscopy and high-throughput (automated) image analysis for neuroscience. My expertise to evaluate the biological findings in this study is very limited.

      We thank reviewer #2 for their careful reading of the manuscript and their insightful comments. Growing evidence (clinical and preclinical) shows a significant reduction in astrocyte density in key limbic brain regions as a result of depression. We believe that the structural plasticity induced by chronic antidepressant treatment, as demonstrated in this manuscript, is an interesting novel plasticity mechanism that can negate deleterious effects of stress on astrocytes.

      The improvements suggested by both reviewers will help us to greatly improve SMorph in the revised version of this manuscript.

      References:

      1. Virmani, G., D’almeida, P., Nandi, A. & Marathe, S. Subfield-specific Effects of Chronic Mild Unpredictable Stress on Hippocampal Astrocytes. doi:10.1101/2020.02.07.938472.

      2. Czéh, B., Simon, M., Schmelting, B., Hiemke, C. & Fuchs, E. Astroglial plasticity in the hippocampus is affected by chronic psychosocial stress and concomitant fluoxetine treatment. Neuropsychopharmacology 31, 1616–1626 (2006).

      3. Musholt, K. et al. Neonatal separation stress reduces glial fibrillary acidic protein- and S100beta-immunoreactive astrocytes in the rat medial precentral cortex. Dev. Neurobiol. 69, 203–211 (2009).

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Overall, we were pleased that the reviewers found our study carefully designed and interesting. We have addressed their comments below.

      Reviewer #1 (Evidence, reproducibility and clarity)

      The manuscript by Kern, et al., demonstrates that phagocytosis in macrophages is regulated in part by the intermolecular distance of phagocytosis-promoting receptors engaging phagocytic targets. Cells expressing chimeric receptors containing cytosolic domains of Fc receptors (FcR) and defined ligand-binding DNA domains were used to drive phagocytosis of opsonized glass beads coated with complementary DNA ligands of defined spacing and number. These so-called origami ligands allowed manipulation of receptor spacing following engagement, which allowed the demonstration that tight spacing of ligands (7 nm or 3.5 nm) optimized signaling for phagocytosis. The study is carefully performed and convincing. I have a few technical concerns and minor suggestions.

      1. It is assumed that the origami preparations were entirely uniform. How much variation was there? Is that supported by TIRF microscopy of origami preparations? Was the TIRF microscopy calibrated for uniformity of fluorescence (ie., shade correction)?

      Our laboratory has extensively characterized the uniformity and robustness of these exact origami pegboards. This paper was just posted on bioRxiv (Dong et al., 2021). We have also cited this paper in our revised manuscript in reference to the characterization of the DNA origami (Line 117).

      We did not use any shade correction. Instead, we only collected data from a central ROI in our TIRF field. To check for uniformity of illumination, we plotted the origami pegboard fluorescence intensity along the x and y axes. We observed a very modest drop-off in signal: the average signal intensity of origamis within 100 pixels of the edge is 76 ± 6% of the intensity of origamis in a 100 pixel square in the center of the ROI. Fitting this data with a Gaussian model resulted in very poor R values. While this may account for some of the variation in signal intensity at individual points, we expect the normalized averages of each condition to be unaffected. We have amended the methods to describe this strategy (Lines 851-854).
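
      The edge-versus-center comparison described above can be expressed in a few lines. The sketch below is an illustration under stated assumptions, not the analysis code used for the manuscript: spots are given as `(x, y, intensity)` tuples, the ROI is `width` by `height` pixels, and the example intensities are invented.

```python
def edge_to_center_ratio(spots, width, height, margin=100, center_size=100):
    """Mean intensity of spots near the ROI edge divided by that of spots in a central square."""
    half = center_size / 2
    cx, cy = width / 2, height / 2
    edge, center = [], []
    for x, y, intensity in spots:
        near_edge = min(x, y, width - x, height - y) < margin
        in_center = abs(x - cx) <= half and abs(y - cy) <= half
        if near_edge:
            edge.append(intensity)
        elif in_center:
            center.append(intensity)
        # Spots in neither region are ignored.
    if not edge or not center:
        raise ValueError("need at least one spot in both the edge band and the center square")
    return (sum(edge) / len(edge)) / (sum(center) / len(center))


# Invented spots in a 500 x 500 pixel ROI.
example_spots = [
    (10, 10, 75.0), (490, 250, 77.0),       # within 100 pixels of the edge
    (250, 250, 100.0), (260, 240, 100.0),   # inside the central 100 pixel square
    (150, 150, 90.0),                       # in neither region
]
ratio = edge_to_center_ratio(example_spots, width=500, height=500)
```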

      2. Likewise, how much variation was there in the expression of the chimeric receptors? Large variation in receptor numbers per cell could significantly alter the quantitative studies. Aside from the flow sorting for cells expressing two different molecules, how were cells selected for analysis?

      We thank the reviewer for bringing up this point. We confirmed comparable receptor expression levels at the cell cortex of the DNA CAR-𝛾 and the DNA CAR-adhesion used throughout the paper. We also have confirmed that receptor levels at the cell cortex were similar for the large DNA CAR constructs used in Figure 6C-D. This data is now included in Figures S5 and S7. We have also altered the text to include this (lines 169-172):

      Expression of the various DNA CARs at the cell cortex was comparable, and engulfment of beads functionalized with both the 4T and the 4S origami platforms was dependent on the Fc𝛾R signaling domain (Figure S5).

      When quantifying bead engulfment, cells were selected for analysis based on a threshold of GFP fluorescence, which was held constant throughout analysis for each individual experiment. We have amended the “Quantification of engulfment” methods section to convey this (lines 921-923).

      3. The scale of the origami relative to the cells is difficult to discern in Figures 2C and D. Additional text would be helpful to indicate, for example, that the spots on the Fig. 2D inset indicate entire origami rather than ligand spots on individual origami particles.

      Thank you for pointing this out. We see how the legend was unclear and have corrected it (lines 453-454), including specifically noting “Each diffraction limited magenta spot represents an origami pegboard.” We have also outlined the cell boundary in yellow to make the cell size clearer.

      4. Figure 5 legend, line 482: How was macrophage membrane visualized for these measurements?

      We have added the following clarification (line 535-536): “The macrophage membrane was visualized using the DNA CAR-𝛾, which was present throughout the cell cortex.”

      5. line 265: "our data suggest that there may be a local density-dependent trigger for receptor phosphorylation and downstream signaling". This threshold-dependent trigger response was also indicated in the study of Zhang, et al. 2010. PNAS.

      The Zhang et al. study was influential in our study design, and we wish to give the appropriate credit. Zhang et al. found that a sufficient amount of IgG is necessary to activate late (but not early) steps in the phagocytic signaling pathway. In contrast, our study addresses IgG concentration in small nanoclusters. We find that this nanoscale density affects receptor phosphorylation. Thus, we think these two studies are distinct and complementary.

      Lines 283-287 now read:

      While this model has largely fallen out of favor, more recent studies have found that a critical IgG threshold is needed to activate the final stages of phagocytosis (Zhang et al., 2010). Our data suggest that there may also be a nanoscale density-dependent trigger for receptor phosphorylation and downstream signaling.

      6. line 55: Rephrase, “we found that a minimum threshold of 8 ligands per cluster maximized FcgR-driven engulfment.” It is difficult to picture how a minimum threshold maximizes something.

      We now state “we found that 8 or more ligands per cluster maximized FcgR-driven engulfment.”

      7. line 184: Rephrase, "we created... pegboards with very high-affinity DNA ligands that are predicted not to dissociate on a time scale of >7 hr". Remove "not".

      Thank you for pointing this out. It is now corrected.

      Reviewer #1 (Significance):

      This study provides a significant advance in understanding about the molecular mechanisms of signaling for particle ingestion by phagocytosis.


      Reviewer #2 (Evidence, reproducibility and clarity):

      The manuscript “Tight nanoscale clustering of Fcg-receptors using DNA origami promotes phagocytosis” studies how clustering and nanoscale spacing of ligand molecules for a chimeric Fcg-receptor influence the phagocytosis of functionalized silicon beads by macrophage cell lines. The basis of this study is the design of a chimeric Fc-receptor (DNA-CARg) comprising an extracellular SNAP-tag domain that can be loaded with single-stranded (ss) DNA, the transmembrane part of CD86 and the cytosolic part of the Fc-receptor g-chain containing an immunoreceptor tyrosine-based activation motif (ITAM) as well as a C-terminal green fluorescent protein (GFP). As a control, the authors used a similarly designed DNA-CAR that lacks the intracellular ITAM-containing Fcg tail. The chosen targets for this chimeric DNA-CAR are silicon beads covered by a lipid bilayer that contains biotin-labelled lipids that, via Neutravidin, can be loaded with a biotinylated DNA origami pegboard displaying complementary ssDNA as ligand for the DNA-CAR. The DNA origami pegboard contains four ATTO647N fluorophores for visualization and the ssDNA ligand in different quantities and spacing.

      Using these principles, the authors study how ligand affinity, concentration and spacing influence the activation of the DNA-CARg and the engulfment of the loaded beads.

      The authors show that bead engulfment increases from 2 to 8 ssDNA ligands on the pegboard. Beyond this, ligand number no longer plays a role in engulfment. They then study the role of ligand spacing using pegboards that contain 4 ssDNA ligands either in close proximity (7 nm/3.5 nm) or in a more widely spaced arrangement (21/17.5 nm or 35/38.5 nm). The authors find that bead engulfment is maximally and positively affected by close spacing of the ssDNA ligands. In their final experiments, the authors vary the design of the DNA-CARs by tetramerization of the ITAM-containing Fcg-signaling subunit. In their discussion, the authors mention different possibilities for the effect of spacing on the engulfment process.

      I think that, in general, this is an interesting study. However, it has some caveats and open issues that should be clarified before its publication.

      Major comments

      1. As a general comment, it is somewhat a pity that the authors did not use the endogenous FcR as a control. It would have been quite easy for the authors to place the SNAP-tag domain on the Fcg extracellular domain, which would allow them to do all their experiments in parallel, not only with the DNA-CAR, but also with a DNA-containing wild type receptor. Such a control would be important because, by using a CD86 transmembrane domain, the authors do not know whether the nanoscale localization of their chimeric receptors reflects that of the endogenous Fcg receptor.

      We agree with the reviewer completely. We have repeated experiments shown in Figure 4A with a DNA-CAR containing the Fc𝛾 transmembrane domain instead of CD86 as the reviewer suggests. We also included a DNA-CAR version of the Fc𝛾R1 alpha chain, although this construct was not expressed as well as the others. These data are now included in Figure S5, and referenced in lines 167-168.

      2. An important issue that is discussed by the authors but not addressed in this manuscript is whether the different amount and spacing of the ligand is only impacting on signaling or also on the mechanical stress of the cells. Indeed, mechanical stress on the cytoskeleton arrangement could influence the engulfment process. For this, it would be very important to test that the different bead engulfment, for example, those shown in Fig. 4, is strictly dependent on signaling kinases. The authors should repeat the experiment of Fig. 4 a and b in the presence or absence of kinase inhibitors such as the Syk inhibitor R406 or the Src inhibitor PP2 to show whether the different phase of engulfment is dependent on the signaling function of these kinases. This crucial experiment is clearly missing from their study.

      We agree this is an interesting point. We find that ligand spacing affects receptor phosphorylation; however this does not preclude effects on downstream aspects of the signaling pathway. We will clarify this by adding the following comment to the manuscript (line 299-301):

      While our data pinpoints a role for ligand spacing in regulating receptor phosphorylation, it is possible that later steps in the phagocytic signaling pathway are also directly affected by ligand spacing.

      The DNA-CAR-adhesion in Figure 1 strongly suggests that intracellular signaling is essential for phagocytosis. We have now included additional controls using this construct as detailed in our response to point 3 below. Unfortunately, Src and Syk inhibitors or knockout abrogate Fc𝛾R mediated phagocytosis (for example, PMIDs 11698501, 9632805, 12176909, 15136586) and thus would eliminate phagocytosis in both the 4T and 4S conditions. This precludes analysis of downstream steps in the phagocytic signaling pathway.

      3. Another problem of this study is that the authors show in Fig. 1A the control DNA-CAR-adhesion but then hardly use it in their study. For example, the crucial experiments shown in Fig. 4 should be conducted in parallel with DNA-CAR-adhesion expressing macrophage cells. This study could provide another indication whether or not ITAM signaling is important for the engulfment process.

      We have added this control. It is now included in Figure S5 and S7. Figure 3D also shows that the DNA-CAR-adhesion combined with the 4T origami pegboards does not activate phagocytosis and we have amended the text to make this more clear (line 152).

      4. Another important aspect is how the concentration of the loaded origami pegboard is influencing the engulfment process. In particular, it would be interesting to show whether the padlocks with different spacings, such as the 4T closed spacing versus the 4S large spacing, show a different dependency on the concentration of this padlock loading on the beads. This would be another important experiment to add to their study.

      We agree that this is an interesting question. We suspect that at a very high origami density, 4S signaling would improve, and potentially approach the 4T. However, we are currently coating the beads in saturating levels of origami pegboards. Thus we cannot increase origami pegboard density and address this directly.

      Minor comments:

      1. The definition of the ITAM is Immunoreceptor Tyrosine-based Activation Motif and not "Immune Tyrosine Activation Motif" as stated by the authors.

      We have corrected this.

      2. The authors discuss that the segregation of the inhibitory phosphatase CD45 from the clustered Fc receptors is the major mechanism explaining their finding that 4T closed spacing is more effective than 4S large spacing. With the advent of CRISPR/Cas9 technology it is trivial to delete the CD45 gene in the genome of the RAW264.7 macrophage cell line used in this study, and I am puzzled why the authors are not conducting such a simple but, for their study, very important experiment (it takes only 1-2 months to get the results).

      This experiment may be informative but we have two concerns about its feasibility. First, CD45 is a phosphatase with many different roles in macrophage biology, including activating Src family kinases by dephosphorylating inhibitory phosphorylation sites (PMID 8175795, 18249142, 12414720). Second, CD45 is not the only bulky phosphatase segregated from receptor nanoclusters. For example, CD148 is also excluded from the phagocytic synapse (PMID 21525931). CD45 and CD148 double knockout macrophages show hyperphosphorylation of the inhibitory tyrosine on Src family kinases, severe inhibition of phagocytosis, and an overall decrease in tyrosine phosphorylation (PMID 18249142). CD45 knockout alone showed mild phenotypes in macrophages. We anticipate that knocking out CD45 alone would have little effect, and knocking out both of these phosphatases would preclude analysis of phagocytosis. Because of our feasibility concerns and the lengthy timeline for this experiment, we believe this is outside of the scope of our study.

      In our discussion, we simplistically described our possible models in terms of CD45 exclusion, as the mechanisms of CD45 exclusion have been well characterized. This was an error and we have amended our discussion to read (lines 335-343):

      As an alternative model, a denser cluster of ligated receptors may enhance the steric exclusion of the bulky transmembrane proteins like the phosphatases CD45 and CD148 (Bakalar et al., 2018; Goodridge et al., 2012; Zhu, Brdicka, Katsumoto, Lin, & Weiss, 2008).

      Reviewer #2 (Significance):

      The innovative part of this study is the combination of SNAP-tag attached, chimeric Fc-receptor with the DNA origami pegboard technology to address important open questions on receptor function.

      Referees cross-commenting

      I find most of the comments of my three reviewing colleagues reasonable. I also agree with Reviewer #1's comment 2:

      Likewise, how much variation was there in the expression of the chimeric receptors?

      Large variation in receptor numbers per cell could significantly alter the quantitative studies. Aside from the flow sorting for cells expressing two different molecules, how were cells selected for analysis?

      But I want to add that it is not only the amount of receptors but also the nanoscale location that is key to receptor function.

      We have ensured that all receptors are trafficked to the cell surface. We have also measured their intensity at the cell cortex as discussed in response to Reviewer 1.

      Reviewer #3 (Evidence, reproducibility and clarity):

      This is a very nicely done synthetic biology/biophysics study on the effect of ligand spacing on phagocytosis. They use a DNA based recognition system that the group has previously used to investigate T cell signaling, but express the SNAP tag linked transmembrane receptor in a macrophage cell line and present the ligands using DNA origami mats to control the number and spacing of complementary ligands that are designed to be in the typical range for low or high affinity FcR, a receptor that can trigger phagocytosis. The study offers some very nice quantitative data sets that will be of immediate interest to groups working in this area and, in the future, for design of synthetic receptors for immunotherapy applications. Other groups are working on similar platforms for TCR. I don't feel there is any need for more experiments, but I have some questions and suggestions. Answering and considering these could clarify the new biological knowledge gained.

      We thank the reviewer for their support of our manuscript. Given the reviewer’s statement that no new experiments are required, we have answered their questions to the best of our ability given the current data. Should the editor decide that any of these topics require experimental data to enhance the significance of the paper, we are happy to discuss new experiments.

      Reviewer #3 (Significance):

      I think the significance would be increased by addressing these questions, which would help in understanding how the synthetic system described relates to other systems directed at similar questions and to more natural settings.

      1. The densities of the freely mobile DNA ligands required to trigger phagocytosis are quite high. Was the length of the DNA duplexes optimized? The entire complex for both the intermediate and high affinity duplexes seems quite short, perhaps <10 nm. Might the stimulation be more efficient if a short stretch of DS DNA is added to increase the length to 12-13 nm?

      The extracellular domain of the DNA-CAR (SNAP tag and ssDNA strand) is approximately 10 nm (PMID 28340336). The biotinylated ligand ssDNA is attached to the bilayer via neutravidin, resulting in a predicted 14 nm intermembrane spacing. The endogenous IgG FcR complex is 11.5 nm. Bakalar et al. (PMID 29958103) tested the effect of antigen height on phagocytosis and found that the shortest intermembrane distance tested (approximately 15 nm) was the most effective. As the reviewer notes, the optimal distance between macrophage and target may be larger than our DNA-CAR. However, we think the intermembrane spacing in our system is within the biologically relevant range.

      We saw robust phagocytosis at 300 molecules/micron of ssDNA, which is similar to the IgG density used on supported lipid bilayer-coated beads in other phagocytosis studies (PMID 29958103, 32768386). As the reviewer noticed, this is significantly higher than ligand density necessary to activate T cells (PMID 28340336). We have added a comment on ligand density to lines 96-97.

      2. Are the origami mats generally laterally mobile on the bilayers? If so, what is the diffusion coefficient? Can one detect the mats accumulating in the initial interface between the bead and cell, particularly in cases where there is no phagocytosis? Would immobility of the mats make them more efficient at mediating phagocytosis compared to the monodispersed ligands, which I assume are highly mobile and might even be "slippery"?

      We have confirmed that our bead protocol generally produces mobile bilayers, where his-tagged proteins can freely diffuse to the cell-bead interface (see accumulation of a his-tagged FRB binding to a transmembrane FKBP receptor at the cell-bead synapse below). We can qualitatively say that the origamis appear mobile on a planar lipid bilayer (see Dong et al., 2021 and images below). Directly measuring the diffusion coefficient on the beads is extremely difficult because the beads themselves are mobile (both diffusing and rotating), and cannot be imaged via TIRF. We do not see much accumulation of the origami at cell-bead synapses. This could reflect lower mobility of the origamis, or could be because the relative enrichment of origamis is difficult to detect over the signal from unligated origamis.

      Overall, we expect the origami pegboards (tethered by 12 neutravidins) are less mobile than single strand DNA (tethered by a single neutravidin, supported by qualitative images below). We are uncertain whether this promotes phagocytosis. At least one study suggests that increased IgG mobility promotes phagocytosis (PMID 25771017). However, the zipper model would suggest that tethered ligands may provide a better foothold for the macrophage as it zippers the phagosome closed (PMID 14732161). Hypothetically, ligand mobility could affect signaling in two ways - first by promoting nanocluster formation, and second by serving as a stable platform for signaling as the phagosome closes. Since our system has pre-formed nanoclusters, the effect of ligand mobility may be quite different than in the endogenous setting.

      [[image cannot be shown]]

      In the above images, a 10xHis-FRB labeled with AlexaFluor647 was conjugated to Ni-chelating lipids in the bead supported lipid bilayer. The macrophages express a synthetic receptor containing an extracellular FKBP and an intracellular GFP. Upon addition of rapamycin, FRB and FKBP form a high affinity dimer, and FRB accumulates at the bead-macrophage contact sites.

      [[image cannot be shown]]

      In the above images, single molecules were imaged for 3 sec. The tracks of each molecule are depicted by lines, colored to distinguish between individual molecules. The scale bar represents 5 microns in both panels.

      3. Breaking down the analysis into initiation and completion is interesting. When using the non-signalling adhesion constructs, would they get to the initiation stage, or would that attachment be less extensive than the initiation phase?

      This is an interesting question. While we did not include the DNA-CAR-adhesion in our kinetic experiments, we have now quantified the frequency of cups that would match our ‘initiation’ criteria in 3 representative data sets where macrophages were fixed after 45 minutes of interaction with origami pegboard-coated beads. We found that an average of 16/125 of 4T beads touching DNA-CAR-adhesion macrophages met the ‘initiation’ criteria and an average of 2/125 were eaten (14% total). In comparison, we examined 4T beads touching DNA CAR𝛾 macrophages and found that on average 23/125 met the ‘initiation’ criteria, and 45/125 were already engulfed (54%). This suggests that the DNA-CAR-adhesion alone may induce enough interaction to meet our initiation criteria, but without active signaling from the FcR this extensive interaction is rare. We have added this data in a new Figure S6 and commented on this in lines 213-215.
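
      As a quick check on the percentages quoted above, the arithmetic can be sketched as follows. This is only an illustration: the function and variable names are hypothetical and are not part of the original analysis, and the counts are the per-125-bead averages stated in the response.

```python
# Average counts per 125 bead contacts, as quoted in the response above.
BEADS_PER_SET = 125


def percent_engaged(initiated: int, engulfed: int, total: int = BEADS_PER_SET) -> float:
    """Percentage of bead contacts that met the initiation criteria or were engulfed."""
    return 100.0 * (initiated + engulfed) / total


# 4T beads touching DNA-CAR-adhesion macrophages: 16 initiated, 2 eaten.
adhesion = percent_engaged(16, 2)
# 4T beads touching DNA-CARγ macrophages: 23 initiated, 45 engulfed.
signaling = percent_engaged(23, 45)

print(f"DNA-CAR-adhesion: {adhesion:.1f}%")
print(f"DNA-CARγ: {signaling:.1f}%")
```

      Rounded to whole numbers, these values match the 14% and 54% totals reported in the response.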

      4. It would be interesting to put these results in perspective of earlier work on spacing with planar nanoarrays, although these can't be applied to beads. For integrin mediated adhesion there was a very distinct threshold for RGD ligand spacing that could be related to the size of some integrin-cytoskeletal linkers (PMID: 15067875). On the other hand, T cell activation seemed more continuous with changes in spacing over a wide range with no discrete threshold (PMID: 24117051, 24125583) unless the spacing was increased to allow access to CD45, in which case a more discrete threshold was generated (PMID: 29713075). The results here for phagocytosis with the very small ligands that would likely exclude CD45 seem to be more of a continuum without a discrete threshold, although high densities of ligand are needed. This issue of continuous sensing vs sharp threshold is biologically interesting, so it would be good to assess it by standards that are as consistent as possible across systems.

      We agree that this is an interesting body of literature worth adding to our discussion. We have added a paragraph that puts our study in the context of prior work on related systems, including these nanolithography studies (lines 364-382):

      How do the spacing requirements for Fc𝛾R nanoclusters compare to other signaling systems? Engineered multivalent Fc oligomers revealed that IgE ligand geometry alters Fcε receptor signaling in mast cells (Sil, Lee, Luo, Holowka, & Baird, 2007). DNA origami nanoparticles and planar nanolithography arrays have previously examined optimal inter-ligand distance for the T cell receptor, B cell receptor, NK cell receptor CD16, death receptor Fas, and integrins (Arnold et al., 2004; Berger et al., 2020; Cai et al., 2018; Deeg et al., 2013; Delcassian et al., 2013; Dong et al., 2021; Veneziano et al., 2020). Some systems, like integrin-mediated cell adhesion, appear to have very discrete threshold requirements for ligand spacing while others, like T cell activation, appear to continuously improve with reduced intermolecular spacing (Arnold et al., 2004; Cai et al., 2018). Our system may be more similar to the continuous improvement observed in T cell activation, as our most spaced ligands (36.5 nm) are capable of activating some phagocytosis, albeit not as potently as the 4T. Interestingly, as the intermembrane distance between T cell and target increases, the requirement for tight ligand spacing becomes more stringent (Cai et al., 2018). This suggests that IgG bound to tall antigens may be more dependent on tight nanocluster spacing than short antigens. Planar arrays have also been used to vary inter-cluster spacing, in addition to inter-ligand spacing (Cai et al., 2018; Freeman et al., 2016). Examining the optimal inter-cluster spacing during phagosome closure may be an interesting direction for future studies.


      Additional experiments performed in revision

      In addition to these reviewer comments, we have added additional controls validating the DNA-CAR-4x𝛾 used in Figure 6c,d. We compared the DNA-CAR-4x𝛾 to versions of the DNA-CAR-1x𝛾-3x𝛥ITAM construct with the functional ITAM in the second and fourth positions (see the schematics now included in Figure S7). We found that four individual receptors with a single ITAM each were able to induce phagocytosis regardless of which position the ITAM was in. However, the DNA-CAR-4x𝛾 construct, which also contains 4 ITAMs, was not. This further validates the experiment presented in 6c,d. We also fixed minor errors we discovered in the presentation of data for Figures 1C and S1A.

    1. Reviewer #2 (Public Review):

      In this manuscript, the authors set out to provide a comprehensive meta-analysis of associations between masculinized phenotypes and fitness-relevant outcomes (mating, reproduction, and offspring viability), so as to assess the current state of evidence for hypotheses of sexual selection on human males across high- and low-fertility populations. I enjoyed reading this manuscript, which is well organized and very clearly written. I also appreciated the depth of the analyses reported by the authors. Overall, I am pleased with this research and think it will make a valuable contribution to the literature on human sexual selection and masculinity more generally.

      I do not have any major concerns regarding the methods and results. However, I think the paper would greatly benefit from introducing greater nuance into the theoretical framework and conclusions, which I believe will meaningfully change some of the takeaways presented in the discussion. I have provided references throughout to aid the authors in this effort during revision, though they should certainly not feel compelled to cite each reference provided. I would also appreciate that the authors provide some estimates of (a priori) statistical power when they make claims regarding statistical power in the interpretation of results.

      Major comments:

      The authors have done a very nice job of efficiently introducing the reader to mainstream hypotheses regarding sexual selection on human male phenotypes, particularly those emphasized within evolutionary psychology. I recognize that the authors' primary contribution is empirical and that they have in large part followed the typical presentation of these hypotheses in previous literature. However, given that this paper may be an important point of reference for future research in this area, I would like to encourage the authors to address some important nuances in greater detail that are frequently overlooked.

      (i) The authors argue that "Sexual selection is commonly argued to have acted more strongly on male traits as a consequence of greater variance in males' reproductive output (3) and male-biased operational sex ratio, i.e. a surplus of reproductively available males relative to fertile females (e.g. 4)". This argument then leads to a discussion of why formidability as indexed by strength and other potential indicators of physical dominance are expected to be under selection in males. However, recent work in sexual selection theory has begun to emphasize the importance of the co-evolution of male offspring care and reproductive competition, leading in many cases to opposite predictions compared to classical models of OSR. In particular, more recent models predict that males should often increase rather than decrease offspring care relative to mating effort when men are in relative abundance. These predictions have received support in recent empirical studies in human populations, and help to explain otherwise puzzling patterns such as e.g. the association between male-biased sex ratios and monogamy + low reproductive skew across many taxa. Please see

      Kokko, H., & Jennions, M. D. (2008). Parental investment, sexual selection and sex ratios. Journal of evolutionary biology, 21(4), 919-948.
      Schacht, R., Rauch, K. L., & Mulder, M. B. (2014). Too many men: the violence problem?. Trends in Ecology & Evolution, 29(4), 214-222.
      Schacht, R., & Borgerhoff Mulder, M. (2015). Sex ratio effects on reproductive strategies in humans. Royal Society open science, 2(1), 140402.

      Considering these models, one might expect that a variety of behavioral and psychological phenotypes would be under male-specific sexual selection that are simply not considered in the present study. One might also expect that appropriate proxies of male fitness will also vary across populations, independently of the presence/absence of contraception. The authors argue that they selected mating-based proxies of reproductive behaviors and attitudes under the assumption that "preferences for casual sex, number of sexual partners, and age at first sexual intercourse (earlier sexual activity allows for a greater lifetime number of sexual partners)... correlated with reproductive success in men under ancestral conditions". Yet, in large-scale industrialized societies that have undergone a demographic transition, high status males are often observed to invest more in offspring care and the production of intergenerationally transferable wealth at the expense of greater fertility, which may be an adaptive response to shifting demands in relation to competition for status.

      Shenk, M. K., Kaplan, H. S., & Hooper, P. L. (2016). Status competition, inequality, and fertility: implications for the demographic transition. Philosophical Transactions of the Royal Society B: Biological Sciences, 371(1692), 20150150.

      In general, long-run fitness may often not map onto promiscuous sexual behavior in such a straightforward way. Measures such as age at first intercourse may also be confounded with environmental heterogeneity among participants, which could instead indicate environmentally induced plasticity within individuals' lifetimes toward a faster pace of life.

      (ii) Related to this point, the authors' discussion of the relationship between testosterone and male phenotypes is somewhat over-simplified, although again in keeping with much of the previous literature in evolutionary psychology. While it was long emphasized that testosterone is a mechanism of aggression per se, recent work has shown that testosterone is better understood as a mechanism for increasing status-seeking, competitive behavior, which can greatly vary in form across socioecological contexts.

      Eisenegger, C., Haushofer, J., & Fehr, E. (2011). The role of testosterone in social interaction. Trends in cognitive sciences, 15(6), 263-271.

      Unfortunately, most of the fWHR and 2D:4D literature has ignored these findings and continues to focus solely on aggression even in WEIRD student samples, where we can be certain that aggression is generally not a viable strategy for attaining and maintaining social status. To my knowledge, only a few studies have explicitly tested this more nuanced hypothesis regarding associations between masculinized phenotypes and differing forms of status-seeking behavior, both of which have found support for ecologically contingent effects in regards to fWHR. Martin et al. (2019) predicted and found support in bonobos for higher fWHR predicting higher scores on an affiliative measure of social rank among both males and females, consistent with the importance of relationship strength and social network centrality for competitive advantage among bonobos. Similarly, Hahn et al. (2017) found that fWHR in human males consistently predicts prosocial behavior and leadership in large-scale institutions. This is consistent with the fact that leadership traits, rather than aggression and formidability per se, are often important predictors of status in human societies (and in contexts of relatively higher SES within those societies).

      Hahn, T., Winter, N. R., Anderl, C., Notebaert, K., Wuttke, A. M., Clément, C. C., & Windmann, S. (2017). Facial width-to-height ratio differs by social rank across organizations, countries, and value systems. PLoS One, 12(11), e0187957.
      Martin, J. S., Staes, N., Weiss, A., Stevens, J. M. G., & Jaeggi, A. V. (2019). Facial width-to-height ratio is associated with agonistic and affiliative dominance in bonobos (Pan paniscus). Biology Letters, 15(8), 20190232.

      In regard to the male-male competition hypothesis, as noted in the previous comment, we might therefore expect sexual selection to occur on a variety of male traits other than formidability-related measures, as well as to be highly population-specific (rather than there being some universal optimum for "masculine" traits), given that what constitutes an adaptive male phenotype likely varies across populations in regard to both male-male competition and female choice. Finally, it should be noted that testosterone is by no means the only sex hormone relevant to considering patterns of human sexual dimorphism. Please see Dunsworth (2020) for a discussion of the centrality of estrogen in proximally explaining sexual dimorphism in body size.

      Dunsworth, H. M. (2020). Expanding the evolutionary explanations for sex differences in the human skeleton. Evolutionary Anthropology, 29, 108-116.

      (iii) The authors should provide more references to (and brief discussion of) mixed results regarding the degree of sexual dimorphism in facial and digit ratio metrics. While they cite a few studies in the introduction, one might leave the text with the impression that there is clear enough evidence for 2D:4D being influenced by (pre-natal) sex hormones and being a sexually dimorphic phenotype. However, these results have been strongly challenged, not only by refs 14 and 20 in the main text, but also by various other studies, e.g.

      Barrett, E., Thurston, S. W., Harrington, D., Bush, N. R., Sathyanarayana, S., Nguyen, R., ... & Swan, S. (2020). Digit ratio, a proposed marker of the prenatal hormone environment, is not associated with prenatal sex steroids, anogenital distance, or gender-typed play behavior in preschool age children. Journal of Developmental Origins of Health and Disease, 1-10.
      Richards, G. (2017). What is the evidence for a link between digit ratio (2D: 4D) and direct measures of prenatal sex hormones?. Early Human Development.
      Richards, G., Browne, W. V., Aydin, E., Constantinescu, M., Nave, G., Kim, M. S., & Watson, S. J. (2020). Digit ratio (2D: 4D) and congenital adrenal hyperplasia (CAH): Systematic literature review and meta-analysis. Hormones and Behavior, 126, 104867.
      Richards, G., Browne, W. V., & Constantinescu, M. (2021). Digit ratio (2D: 4D) and amniotic testosterone and estradiol: An attempted replication of Lutchmaya et al.(2004). Journal of Developmental Origins of Health and Disease.

      Similarly, not all metrics of facial masculinity are equally valid given current empirical evidence. In a recent longitudinal study, only cheekbone prominence was found to show consistent evidence of sexual dimorphism across age groups.

      Robertson, J. M., Kingsley, B. E., & Ford, G. C. (2017). Sexually dimorphic faciometrics in humans from early adulthood to late middle age: Dynamic, declining, and differentiated. Evolutionary Psychology, 15(3), 1474704917730640.

      Overall, I found the authors' discussion of how they selected the specific facial metrics lumped together in their analyses to be underspecified. Please note in the discussion as well that BMI is a well-known confound in studies of facial masculinity and may be a cause of null results in the present study (unless I happened to miss this in regard to the moderation results - if so, my apologies!).

      Geniole, S. N., Denson, T. F., Dixson, B. J., Carré, J. M., & McCormick, C. M. (2015). Evidence from meta-analyses of the facial width-to-height ratio as an evolved cue of threat. PloS one, 10(7), e0132726.

      (iv) Finally, please provide reference to and potentially brief discussion of the current state of the literature as regards "good genes" hypotheses of female choice, which is relevant for determining how useful previous studies are for directly addressing this hypothesis. Please see:

      Achorn, A. M., & Rosenthal, G. G. (2020). It's not about him: Mismeasuring 'good genes' in sexual selection. Trends in Ecology & Evolution, 35, 206-219.

    1. SciScore for 10.1101/2021.03.23.21254207:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Institutional Review Board Statement</td><td style="min-width:100px;border-bottom:1px solid lightgray">IRB: This study was conducted with the approval of the Ethics Committee of the University of Occupational and Environmental Health,</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr></table>

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text=align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">: Release 14.2; StataCorp LLC, TX, USA).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>StataCorp</div><div>suggested: (Stata, RRID:SCR_012763)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      This study has several limitations. First, selection bias was unavoidable because the study was a survey of Internet monitors. To reduce potential bias, recruitment was conducted by sampling by occupation and gender in each region according to the infection rate. To understand the characteristics of the participants in this study, we compared our findings with those from national and occupational surveys that use various batteries (17). A previous study that used WFun to examine 33,985 workers from a general company showed that 20% had severe work functioning impairment (24). Given that our study protocol found that 21% of the entire study population has severe work functioning impairment (17), we concluded that our present study population was relatively unbiased. Second, we relied on respondents’ self-assessment of their physical environment while working from home, but did not examine the actual physical environments. Therefore, there may be discrepancies with objective evaluation. However, because we inquired about the physical environment, the possibility of erroneous answers is low. Third, since this study was a cross-sectional study, it is impossible to determine the causal relationship between the exposure factors and outcome. However, we think it is unlikely that individuals with severe work functioning impairment would choose to create a poor working environment. For example, a person with back pain is unlikely to actively choose a small space or an ill-fitting desk...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.

      <footer>

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

      </footer>

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to all reviewers

      We thank all the reviewers for carefully considering our manuscript and providing useful comments and suggestions. We agree with the general comment that testing our key findings in breast cancer cells is important. We will therefore carry out this work over the coming months and include this data in the revision. The other specific comments we address individually in the point-by-point responses below, which provides an outline of the other new experiments we plan to carry out prior to revision.

      In addition to this, we would like to highlight one general point that we only picked up when considering these responses. It is important to highlight this to all reviewers now, since we believe it adds clinical weight to our conclusions. This relates to the issue of P53 loss, which our manuscript shows drives resistance to CDK4/6 inhibition in cells by inhibiting long-term cell cycle withdrawal following genotoxic damage.

      P53 loss has been implicated in abemaciclib resistance in breast cancer patients (P53 mutation was detected in 2/18 responsive patients and 10/13 non-responsive patients (Patnaik et al., 2016)). This was recently corroborated in a larger scale study in breast cancer: the first whole exome sequencing study aimed at characterising intrinsic and acquired resistance to CDK4/6 inhibitors (Wander et al., 2020). In this recent study, P53 loss/mutation was identified in 0/18 sensitive tumours, 14/28 intrinsically resistant tumours, and 9/13 tumours with acquired resistance. This was the most frequent single genetic change associated with resistance (58.5%), although 8 other genetic changes were also associated with resistance to differing degrees (7-27%).

      Most of these other resistance events occurred in pathways known previously to help drive G1/S progression following CDK4/6 inhibition: i.e. fully predictable resistance mechanisms (RB loss, CCNE2 amplification, ER loss, RAS/AKT1 activation, FGFR2/ERBB2 mutation/amplification). Importantly, when the authors attempted to recapitulate these resistance events in breast cancer cell lines, they could demonstrate the expected increase in proliferation following CDK4/6 inhibition in all situations tested, except for P53 loss. This caused the authors to conclude that “loss of P53 function is not sufficient to drive CDK4/6i resistance”. This would appear to us to be an unsatisfactory explanation given the clinical data. However, the authors speculated further that: “Enrichment of TP53 mutation in resistant specimens may result from heavier pre-treatment (including chemotherapies), may be permissive for the development of other resistance-promoting alterations, or may cooperate with secondary alterations to drive CDK4/6i resistance in vivo.”

      We believe that our data provide a crucial alternative explanation for these clinical findings. P53 loss does not affect the efficiency of a G1 arrest (fig.2), but rather it prevents the resulting genotoxic damage from inducing long-term cell cycle withdrawal (figs.2,3). Therefore, this could explain why it drives resistance in clinical disease but not in the in vitro cell growth assays employed by Wander et al. This highlights a crucial general point of our paper: important effects like this can be missed or misinterpreted until the true nature of long-term cell cycle withdrawal is appreciated.

      As part of our breast cancer work at revision we will analyse this closely by comparing the effect of p53 loss on long-term cell cycle withdrawal. If the current RPE1 data hold true in breast cancer, then we believe that our study would provide a crucial explanation for these clinical findings, and in turn, these clinical data would throw weight behind our conclusion that genotoxic damage and p53 loss is a clinically important consequence of CDK4/6 inhibition in patients.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Comments on 'CDK4/6 inhibitors induce replication stress to cause long-term cell cycle withdrawal' The rationale for this work is to understand the mechanism by which Cdk4/6 inhibitors inhibit tumour cell growth, specifically via senescence which seems to be a frequent outcome of Cdk4/6 inhibition. Although several mechanisms by which Cdk4/6 inhibition induce senescence have been proposed these have varied with the cancer cell model studied. To examine the mechanism for the cytostatic effect of cdk4/6i in therapy without potential confounding effects of different cancer cell line backgrounds, Crozier et al tackle this question in the non-transformed, immortalised diploid human cell line, RPE1. They use live cell imaging and colony formation to track the impact of G1 arrests of different lengths induced by a range of clinically relevant cdk4/6 inhibitors. They also use CRISPR-mediated removal of p53 to examine the role of p53 in the observed cell cycle responses. After noting that G1 arrest of over 2 days leads to a pronounced failure in continued cell cycle and proliferation that is associated with features of replication stress, they perform a proteomics analysis to determine the factors responsible for this. They discover that MCM complex components and some other replicative proteins are downregulated and overall suggest a mechanism whereby downregulation of these essential replication components during a prolonged G1 induce replication stress and ultimate failure of proliferation. They show the impact of cdk4/6 inhibition can be increased by combining with either aneuploidy induction (to indirectly elevate replication stress), aphidicolin (to directly elevate replication stress) or chemotherapy agents that damage DNA. Overall this is a well written and presented manuscript. Data are extremely clearly presented and described clearly within the text. 
      Most appropriate controls were included and the work is performed to a high standard. I have a few comments about the proteomic analysis, and the link between MCM component deregulation and the induction of replication stress:

      - We thank the reviewer for this careful, detailed review, and for their kind comments about our work.

      **Major points:**

      1. Relevance to cancer. I appreciate that examining the mechanism in a diploid line is a sensible place to start. However it remains a bit unclear precisely which aspects of this mechanism might be conserved in cancer. It could be helpful to provide evidence (if it exists) of the impact of cdk4/6 inhibition in tumour cells. For example, are catastrophic mitosis, senescence, etc observed? And is there anything further known about the relationship between tumour mutations such as p53 and clinical response to Cdk4/6i?

      - It is important to point out that senescence is a common outcome of CDK4/6 inhibition in tumour cells, but exactly why tumour cells become senescent is still unclear. There have been many possible explanations proposed (see introduction), but so far, none of these implicate DNA damage. This is surprising to us, considering that DNA damage remains the best-known inducer of senescence and this is how most other broad-spectrum anti-cancer drugs induce permanent cell cycle exit. P53 loss has been associated with CDK4/6i resistance in the clinic, but this has also not previously been linked to genotoxic stress or senescence following CDK4/6 inhibition (see detailed description of this in comment to all reviewers above). Therefore, our data could help to explain both of these key findings. However, we appreciate the importance of testing these results in breast cancer cells, therefore we will perform these experiments and include the data after revision.

      Also - many of the phenotypes followed in this manuscript vary considerably with the length of G1 and the length of release. Which of these scenarios might mimic in vivo conditions?

      - We see that a prolonged arrest (> 2 days) is necessary to see genotoxic effects in RPE cells. Clinically, palbociclib is administered in 3-week on/1-week off cycles, therefore this is consistent with the possibility that replication stress is induced during the off periods to cause genotoxic damage and cell cycle withdrawal.

      Relating to the downregulation of MCM complex members, and the potential impact on origin licensing, how would this mechanism be manifest in cancer cells that have already deregulated gene transcription programs, and are already experiencing replication stress?

      - We hypothesise that cancer cells with ongoing replication stress may be more sensitive to the MCM downregulation caused by CDK4/6 inhibition. The rationale is that a reduction in licenced origins would impair the ability of dormant origins to fire in response to replication problems, therefore making elevated levels of replication stress less tolerable. This is consistent with the enhanced effect of CDK4/6 inhibition seen when replication stress is elevated in RPE cells. Moreover, others have shown that experimentally reducing MCM protein levels induces hypersensitivity to replication stress in transformed cell lines such as U2OS and HeLa (Ge et al., 2007; Ibarra et al., 2008). Thus, low MCM levels and reduced origin licensing can contribute to replication failure in cancer cells.

      1. MCM protein levels and proposed impact on chromatin loading and origin licensing. Several MCM components are clearly reduced at the protein level. A chromatin assay (assaying fluorescence of signal remaining after pre-extraction of cytosolic proteins) suggests that MCM loading on chromatin is reduced, and this is taken to suggest a reduction in origin licensing. This is quite an indirect method - and it is difficult to conclude that the reduced chromatin bound fraction really represents a meaningful reduction in origin licensing. It would be more convincing if both positive and negative controls for this assay were included. Moreover it is not clear if this MCM reduction and proposed reduction in licensed origins would actually impact replication in an otherwise unperturbed state? Many more origins are licensed than actually fire during a normal S-phase, so it is not entirely clear that MCM levels could lead directly to replication stress here.

      - Quantifying the non-extractable MCM proteins is in truth the most direct assay for origin licensing (not origin firing) available in human cells. To our knowledge, there are no reports of MCM loading by this or similar assays that are not strongly correlated with origin licensing per se. The reviewer is correct that modest reductions in MCM loading are well-tolerated in the absence of other perturbations. Specifically, Ge et al found no proliferation effects after 50% MCM loading reduction, but any further reduction introduced a proliferation delay (Ge et al., 2007). Of note, the U2OS cells used in that study also have a functional p53 response.

      - Another important point that is worth emphasizing, is that many of the differentially downregulated proteins only function at replication forks (fig.4c). Therefore, we believe that the replication stress is a combined result of poor licencing and reduced levels of replication fork proteins that are needed after the origins fire. We will clarify this point in the revised manuscript.

      1. Loss of MCM protein levels and chromatin loading occurs after 1 day, not 4 days, of Cdk4/6 inhibition. The current proposal (based on evidence from the live cell imaging, and the induction of hallmarks of replication stress in figures 1-3) seems to be that something occurs between 2 and 7 days of cdk4/6i to prevent cells from resuming a normal cell cycle. Thus the proteomics was performed between 2 and 7 days, and MCM proteins identified as major changed proteins between those times. However, according to Western blots and FACS profiles in Figure 4, the major reduction in MCM protein levels and chromatin loading occurs already at 1 day of cdk4/6i (Figure 4d,e,f). However, replication stress is not observed after this timepoint (Figure 3) - so this seems to decouple the timings of MCM reduction from induction of replication stress. How can this be reconciled?

      - We agree that some of the observed changes to replisome components are quite considerable after just 1 day of arrest (some of these downregulations such as Cdc6 or phospho-Rb can be attributed to the cell cycle arrest itself - Cdc6 is unstable in G1 - but others, such as MCM proteins, are not typically lost during G1). We were initially surprised by this too, considering that the phenotype clearly appears later than 1 day of arrest. It is important to state though, that the levels of almost all replisome components continue to decline as the duration of arrest is extended, eventually falling to considerably lower levels than seen after just 1 day. This is observed for MCM2, MCM3 and PCNA by western (fig.4e,e) and a large number of other replisome components by proteomics (fig.4c, 2 vs 7 days). Even MCM loading, which is 58% reduced after just 1-day arrest, is reduced even further to just 20% of controls after 7 days (p- Our interpretation of the phenotypic data in light of this is that replication problems become apparent when the number of licensed origins and the function of the replisome are compromised below a certain threshold, which most likely depends on cell type and, in particular, the levels of endogenous replication stress. So, in RPE cells, 1-day treatment is clearly tolerable, perhaps because there are still enough origins to complete DNA replication successfully. But, importantly, if replication stress is enhanced in these cells then 1 day of palbociclib arrest now starts to cause observable defects. This is evident in Figure 5h, where 1-day palbociclib treatment causes minimal effect on long-term growth on its own, but growth is reduced considerably when replication stress is elevated with genotoxic drugs. We interpret this to mean that the reduction in licenced origins and replisome components observed after 1 day of arrest starts to become problematic in situations when replication stress is elevated.

      - This is actually an important point that we will highlight at revision, because one prediction is that other cells with elevated replication stress (e.g. tumour cells with oncogene-induced replication stress) may begin to see defects after as little as 1-day palbociclib arrest.

      **Minor points:**

      1. All the live cell tracking figures would be even more informative if a quantification of key features (such as a cumulative frequency of S-phase entry, or a mean+SD of time in G1, S and G2) were also presented.

      - We agree this will be useful, and we will include this information after revision.

      1. In Figure 2D the cells released from palbociclib seem to delay longer in G1 until they start to enter S phase, compared to cells co-treated with STLC (Figure 2B). Why would this be? It is difficult to tell if other subtle effects might be present in between the +STLC and -STLC conditions, so additional graphs such as those suggested above might be informative here in particular.

      - Fig.2d shows a representative experiment (50 cells) because it is difficult to interpret these individual cell cycle profiles when more than 50 cells are presented. However, we have all the data from 3 experiments (150 cells), therefore we will also calculate timings as suggested and present this information after revision.

      1. Figure 4f It would be helpful to see the FACS plot for at least one of the conditions quantified in the graph as a comparison.

      - These plots will be included after revision.

      1. MCM2 protein is not down in p53 wt, but is reduced in p53 KO cells - why is this? And why is MCM2 not impacted when the other MCM complex members are?

      - We think perhaps there has been a mistake in interpreting these graphs. MCM2 is actually slightly lower in WT than KO cells at 1 day, and similar at 4 and 7 days (Fig.4d,e). MCM2 is also reduced slightly more than MCM3 (fig.4d,e) and MCM2, 3, 4, and 5 are all reduced by similar extents between 2 and 7 days palbociclib arrest (30-40% reductions; Fig.4c).

      Inducing aneuploidy with reversine to elevate replication stress may result in additional aneuploidy-related stresses that confound this interpretation. For example, aneuploidy per se is known to elevate p21 and p53 levels, and chromosome mis-segregation could elevate DNA damage. For these reasons these experiments are not as compelling as the direct elevation of replication stress using aphidicolin.

      - We agree that the aneuploidy experiment could have many different interpretations, and only one of these relates specifically to replication stress. This was also commented on by reviewer 3, so we feel it is best to remove this data and just keep the data on drugs that affect replication stress or DNA damage directly. We will address the effects of aneuploidy more extensively in a separate study.

      **Interesting points to follow up/add more mechanism**

      1. What is mechanism of protein downregulation of MCM etc? Was gene transcription impacted, or is this a question of protein stability? Depletion of one subunit can destabilise the complex leading to protein loss of the other MCM subunits, so perhaps this effect could be due to downregulation of a single MCM complex member.
      2. Are these findings specific to Cdk4/6 inhibitors, or would another means of arresting cells in G1 have the same impact?

      Both of these points are interesting questions and they are actually the focus of an entirely separate study that is ongoing. In particular, we are working on the mechanism(s) of MCM and replisome downregulation.

      Reviewer #1 (Significance (Required)): The central question of the paper is an important one so this work would be of interest to many in the clinical and preclinical fields, and also to the cell cycle and replication stress fields.

      - We thank the reviewer for this, and we agree that linking CDK4/6 inhibitors to genotoxic stress is important both for our understanding of cell cycle control and for cancer treatment. We are actually amazed that these drugs have not previously been linked to genotoxic stress, given that they appear to have broad pan-cancer activity and all other broad-spectrum anti-cancer drugs work by causing genotoxic stress.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): In this paper, Saurin and colleagues investigate the effects of CDK4/6 inhibitors on cell cycle arrest and re-entry. The authors report that long-term G1 arrest induced by CDK4/6i interferes with DNA replication during the next cell cycle, leading to DNA damage and mitotic catastrophe. Additionally, this compromised replication state sensitizes cells to chemotherapeutics that enhance replication stress. The major claims advanced in this paper are well-supported by the presented evidence. While I have several questions regarding the significance (see below), I have only a few minor points regarding the methodology. 1) Regarding the down-regulation of MCM components induced by long-term palbo treatment shown in Figure 4: MCM levels are tightly regulated by cell cycle phase. I could imagine that this gene expression change may be a consequence of, for instance, 2 days CDK4/6i treatment arresting 95% of cells in G1 while 7 days of CDK4/6i treatment causes a 99.9% G1 arrest. The data in Figure 1B seems to argue against this hypothesis, but how was that data generated? Can the authors rule out a subtle change in S-phase % over 7 days in palbo? Alternately, is the down-regulation of MCM genes a consequence of cells entering senescence?

      - We have performed extensive long-term movies with these cells, and we never see cells dividing or exiting G1 after the first day of palbociclib treatment. This is illustrated in fig.1b which demonstrates that 100% of FUCCI cells are in G1 (Red) at each of the timepoints. This will be clarified in the legend. In addition, MCM protein levels do not actually oscillate with cell cycle phase (Matson et al., 2017; Méndez and Stillman, 2000), although their mRNA levels certainly do (Leone et al., 1998; Whitfield et al., 2002). Furthermore, RPE and mammalian fibroblasts retain MCM proteins after 2 days of growth factor withdrawal despite transcriptional repression of their respective genes (Cook et al., 2002; Matson et al., 2019).

      - We see significant changes in MCM levels at a time when cells are still permissive to enter the cell cycle following drug release. Therefore, MCM reduction is not a consequence of senescence. Rather, we believe that it is one of the causes of cell cycle withdrawal following the subsequent S-phase.

      2) For the drug studies presented in figure 5, it is important that the authors perform the appropriate statistical comparisons and analyses to demonstrate true synergy. The authors show that combining palbo and certain chemotherapies causes a greater decrease in clonogenicity than palbo alone. This may or may not be surprising (see below) - but this by itself is insufficient to support the claim that palbo "sensitizes" cells to genotoxins. If you treat cells with two poisons, in 9 out of 10 cases, you'll kill more cells than if you treat cells with one poison alone. But that could be due to totally independent effects - see, for instance, Palmer and Sorger Cell 2017. There are several well-established statistical methods for investigating drug synergy - like Loewe Additivity or Bliss Independence - and one of these methods should be used to analyze the drug-combination studies presented in Figure 5.

      - This analysis will be performed at revision.
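
      For orientation only, the Bliss Independence comparison the reviewer mentions can be sketched as below. This is a minimal illustration, not the analysis planned for the revision: the function names and the example fractions are hypothetical and are not data from the manuscript. Under Bliss Independence, two agents that act independently with fractional effects fa and fb are expected to give a combined effect of fa + fb - fa*fb; an observed combined effect above that expectation is read as synergy, below it as antagonism.

```python
def bliss_expected(fa: float, fb: float) -> float:
    """Expected combined fractional effect if the two agents act independently.

    fa and fb are fractional effects (e.g. 1 - surviving colony fraction),
    each between 0 and 1.
    """
    for f in (fa, fb):
        if not 0.0 <= f <= 1.0:
            raise ValueError("fractional effects must lie between 0 and 1")
    return fa + fb - fa * fb


def bliss_excess(fa: float, fb: float, fab_observed: float) -> float:
    """Observed minus expected combined effect.

    Positive values suggest synergy, negative values antagonism,
    and values near zero are consistent with independent action.
    """
    return fab_observed - bliss_expected(fa, fb)


if __name__ == "__main__":
    # Hypothetical numbers chosen purely for illustration.
    effect_cdk46i = 0.2      # 20% reduction in colonies after CDK4/6i alone
    effect_genotoxin = 0.4   # 40% reduction after the genotoxic drug alone
    effect_combo = 0.7       # 70% reduction after the combination

    expected = bliss_expected(effect_cdk46i, effect_genotoxin)
    excess = bliss_excess(effect_cdk46i, effect_genotoxin, effect_combo)
    print(f"expected (Bliss): {expected:.2f}, excess over Bliss: {excess:.2f}")
```

      In practice the excess would be computed per replicate and tested statistically across experiments; a single point estimate such as the one above does not by itself establish synergy.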

      Reviewer #2 (Significance (Required)): While this study is a comprehensive analysis of the effects of CDK4/6i in RPE1 cells in 2d culture, I am not convinced of its broader significance. 1) So far as I can tell, the authors do not cite any studies establishing that CDK4/6i results in a significant increase in G1-arrested cells in treated patients. What evidence is there for this claim? I am aware that this has been demonstrated in xenografts and in mouse models, but I could not find evidence for this from actual clinical studies. Here, I am reminded of the very interesting work from Beth Weaver's group on paclitaxel - Zasadil STM 2014. While it had been widely assumed that paclitaxel causes a mitotic arrest, they actually show that this drug kills tumor cells by promoting mitotic catastrophe without inducing a complete mitotic arrest. Similarly, in the absence of existing clinical data, the underlying assumption regarding the effects of CDK4/6i that motivates this paper may not be accurate. For instance, if CDK4/6i acts through the immune system (as suggested by Jean Zhao and others), then this G1 arrest phenotype could be entirely secondary to the drug's actual mechanism-of-action.

      - We are very surprised by the suggestion that CDK4/6 inhibitors may not need to cause a G1 arrest in patient tumours. We appreciate that these inhibitors affect the immune system in many different ways to combat tumourigenesis, but there is also an overwhelming amount of evidence that a G1 arrest in patient tumours is critical for the overall response. Perhaps the most striking evidence is the fact that RB loss in tumours is one of the best-characterised mechanisms of resistance in breast cancer patients (Condorelli et al., 2018; Costa et al., 2020; Li et al., 2018; O'Leary et al., 2018; Wander et al., 2020). In addition, tumour types that typically achieve a poor CDK4/6i-induced G1 arrest in preclinical models, such as TNBCs, also exhibit a poor response to CDK4/6i therapy in patients. Recently a luminal androgen receptor subtype of TNBCs has been identified that responds to CDK4/6 inhibition, due to low CDK2 activity which can otherwise drive G1 progression independently of CDK4/6 in basal-like TNBCs (Asghar et al., 2017; Liu et al., 2017). This rationalises combination therapies that converge to inhibit G1 more effectively in this subtype (e.g. AR antagonist + CDK4/6 inhibition (Christenson et al., 2021)), which is akin to the oestrogen receptor and CDK4/6 combinations that have proven so successful at treating HR+ breast cancer. Many other combinations are also currently in trials based on the same premise that inhibiting upstream G1/S regulators can enhance the response by inducing a more efficient G1 arrest (MEK, PI3K, AKT, mTOR) (Klein et al., 2018).

      - In response to the specific question about clinical G1 arrest in patients, tumour samples from breast cancer patients show a decrease in S-phase specific markers pRB and TopoIIa following abemaciclib treatment (Patnaik et al., 2016) and there is extensive evidence of a profound cell cycle arrest following CDK4/6 inhibition as judged by staining with the mitotic marker Ki67 (Hurvitz et al., 2020; Johnston et al., 2019; Ma et al., 2017; Prat et al., 2020). Whilst this does not formally prove a G1-arrest is specifically responsible for this overall cell cycle arrest, that is the implicit assumption given the known mechanism of action of CDK4/6 inhibitors in cells.

      2) How relevant are RPE1 cells? Clinically, CDK4/6 inhibitors are combined with fulvestrant (which would not have an effect in RPE1), and the activity that they exhibit in breast cancer has not been matched in any other cancer types. The underlying biology of HR+ breast cancer (particularly regarding the regulation of CCND1 expression and the G1/S transition by estrogen) may not be recapitulated by other cell types. Moreover, the artificial media used in cell culture experiments may alter the regulation of the G1/S transition. I do not believe that these experiments conducted in RPE1 cells in 2d cell culture are generalizable.

      - Fulvestrant/tamoxifen are effective because they enhance the efficiency of a CDK4/6i arrest by reducing Cyclin D expression to enhance Cyclin D-CDK4/6 inhibition. That convergence onto the G1/S transition is why ER antagonists enhance the CDK4/6 response: CDK activity is inhibited and CycD transcription is reduced, and this double hit allows breast cancer cells to arrest in G1 more efficiently than healthy tissue, which is not oestrogen-responsive (this provides yet more evidence that the G1 arrest in tumours is crucial for the clinical response). It is true that RPE1 cells do not respond to the oestrogen treatment, but that is not really relevant here in our opinion. We are not testing the efficiency of a G1 arrest beyond the initial characterisation in figure 1. We are mainly examining how cells respond to that G1 arrest afterwards. It could be that components of the cell culture media affect that downstream response in unanticipated ways, but we feel that is very unlikely.

      - Having said that, we agree that the general point on the relevance of RPE cells is a valid one, and we will repeat key experiments in breast cancer cells. We suspect that the reason replisome components become widely downregulated during a G1 arrest will not be a specific phenomenon that is characteristic of one particular cell type. Nevertheless, it is important to validate that assumption.

      3) I am confused about the effects of CDK4/6i on genotoxin sensitivity. Replogle and Amon PNAS 2020 and several citations contained therein report that CDK4/6i protects cells from DNA damage. Moreover, trilaciclib has recently received FDA approval for its ability to protect the bone marrow from cytotoxic chemotherapy! Is this a question of dose timing/intensity? The FDA approval of trilaciclib for this indication should certainly be discussed. This underscores my concern that certain findings in this paper are RPE1/tissue culture artifacts, with limited generalizability.

      - The studies the reviewer refers to demonstrate that halting cell cycle progression can protect cells from genotoxic drugs that cause DNA damage during S-phase. However, we can only think that the reviewer must have missed the critical point here: The genotoxic agents in figure 5 were added after washout from CDK4/6 inhibition (we will highlight this more clearly in the revised manuscript). After drug removal, cells enter S-phase with replication competence problems (as a result of the CDK4/6 arrest) and they then experience additional problems during S-phase (as a result of the genotoxic agents included following washout). These effects synergise to enhance replication stress, a key conclusion of figure 5.

      - This in no way supports the notion that “findings in this paper are RPE1/tissue culture artefacts with limited generalizability”. Experiments in 2D tissue culture have furnished some of the most important fundamental discoveries in cancer research. It remains to be seen whether our study will cause a paradigm shift in our thinking about how CDK4/6 inhibitors work, but we believe that it may do. We appreciate that this will not become clear until our findings are followed up and validated in preclinical models and human disease, but that does not, in our opinion, make them any less valid at this stage. As stated earlier, we will confirm this is not an RPE1 cell phenomenon, but if this holds up in breast cancer cells then we believe our data will have an important impact on future preclinical and clinical work in this area.

      **Referees cross-commenting** I think that we largely agree that RPE1 is not a great model for this study, and repeating certain key experiments in an ER+ BC line like MCF7 may be warranted.

      - We agree that it would add value to examine our findings in BC cells, therefore we will address this point at revision by repeating key experiments in BC cells.

      Additionally, I wanted to draw attention to the fact that, to my knowledge, the evidence for palbociclib inducing a G1 arrest in patients is incredibly spotty. For early-stage breast tumors where palbo is most effective, nearly all tumor cells are in G1 anyway. I think that it makes the most sense that palbo is actually working through immune modulation or through some secondary mechanism, rather than enforcing a G1 arrest. So I'm not sure about the premise of this study.

      - As discussed above, there is extensive evidence that proliferation is reduced in response to CDK4/6 inhibition in patients (Hurvitz et al., 2020; Johnston et al., 2019; Ma et al., 2017; Patnaik et al., 2016; Prat et al., 2020). We agree that proliferation in patient tumours can be slower than observed in preclinical models, and there can be many reasons for this, especially within solid tumours where hypoxia is a major factor that limits proliferation. However, we do not agree that this implies that drugs that target these tumours do not act on proliferating cells. In fact, most other broad-spectrum non-targeted chemotherapies used to treat cancer also work by targeting dividing cells, and many of these are also more effective in early stage breast cancer. In addition, and as discussed extensively above, there are many studies supporting the interpretation that a G1 arrest is critical for CDK4/6i response in breast cancer patients. Considering all of these points, we strongly believe that the premise of our study – to characterise why a G1 arrest becomes irreversible – is valid and important. This point is also made in numerous recent reviews which also highlight that this key mechanistic information is currently lacking (Goel et al., 2018; Klein et al., 2018; Knudsen and Witkiewicz, 2017; Wagner and Gil, 2020).

      - We do not disagree that the immune effects are important in patients – indeed, we cited and discussed these studies in our manuscript. However, we would argue that this works together with a G1 arrest in tumour cells. The G1 arrest most likely induces a senescent response that stimulates immune engagement and tumour clearance. These multifactorial effects of CDK4/6 inhibition, on both the tumour and the immune system, are discussed at length in these reviews: (Goel et al., 2018; Klein et al., 2018; Wagner and Gil, 2020).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): The authors clearly demonstrate, with appropriate techniques, that treating cells with clinically relevant CDK4/6 inhibitors leads to a cell cycle arrest that is only partly reversible. The authors also demonstrate clearly that release from a CDK4/6i arrest leads to two phenomena: the inability to initiate S-phase, and a cell cycle exit in G2. The inability to initiate S-phase is partly dependent on p53; the cell cycle exit is fully dependent on p53. In the absence of p53, cells that are released from a CDK4/6i block frequently enter mitosis with unrepaired DNA lesions. The authors clearly demonstrate that CDK4/6 inhibition leads to downregulation of key replication genes. Combined treatment with genotoxic agents further exaggerates the phenotype of cell cycle exit upon CDK4/6 inhibition. **Specific comments:** Figure 1B: the loss of reversibility remains at approximately 50%. Does the phenotype of replication protein depletion not happen in the 50% of cells that do restart the cell cycle? It would be good if the authors could experimentally address the heterogeneity that is observed.

      - This is actually a result of the fixed analysis used in Fig. 1B. The irreversibility is much higher than 50% after long durations of arrest, but at the 24h timepoint used in this fixed assay many cells have exited G1 but not yet had a chance to revert back into G1 from S/G2 phase. We will reinforce this point in the legend. This highlights the value of our extensive live cell assays that can fully capture cell cycle profiles, and accurately determine when cells do/don’t enter or withdraw from different stages of the cell cycle. We believe that an overreliance on fixed endpoints in previous studies may have contributed to the genotoxic effects in S-phase being missed previously: many studies show senescence after drug washout, but the cause of that senescence only becomes apparent when you observe that cells withdraw with defects after the first S-phase.

      Figure 1C: the G1 state after S-phase. The read-out here is loss of the Fucci reporter geminin. Does this observation reflect premature p53-dependent activation of the APC/C-Cdh1? This is a known effect of persistent DNA damage in G2 cells.

      - Yes, we expect that APC/C-Cdh1 activation causes geminin and cyclin degradation when cells permanently withdraw from the cell cycle from G2. This is likely caused by p53-dependent p21 activation in response to DNA replication defects, as has been shown previously in direct response to DNA damage.

      Figure 2: there seem to be two distinct phenotypes when comparing p53-wt and p53-KO: the ability to initiate S-phase after CDK4/6i removal (which is largely gone in p53 KO, only slight number after 7d treatment), and cell cycle drop-out after S-phase (this seems to be fully p53 dependent). I am not sure if a single mechanism explains both.

      - We agree that there are p53-dependent effects on speed/extent of S-phase entry and on the resulting withdrawal from G2. It may not be a single mechanism that connects these effects, although they may be related. Our manuscript mainly focusses on the DNA replication defects and cell cycle withdrawal, but in the future, it will be important to also characterise what causes the delay in cell cycle re-entry following CDK4/6 inhibition. We suspect that this could reflect differing depths of quiescence, potentially caused by p21, which would explain the p53-dependence.

      Figure 3a: related to the previous point, it is unclear whether the p21 upregulation happens in G1 or G2 cells, and whether it relates to the inability of cells to initiate S-phase or to the cell cycle exit in G2.

      - This is a good point, and as discussed above, we suspect both may be related to p21. We will examine p21 levels during a G1 arrest to compare to the levels seen following release, and we will include these data after revision.

      It is stated that a combined action of the p53 pathway and ATR signaling prevents mitotic entry in RPE-wt cells. However, ATR should also be able to do this in p53-KO cells. Does CDK4/6 inhibition also lead to down-regulation of ATR pathway components?

      - We do not detect downregulation of any ATR pathway components in the mass spec data comparing 2 and 7 day palbociclib arrest.

      Following the observation that CDK4/6i leads to replication stress, I would hypothesise that these cells would be very sensitive to agents that inhibit the response to replication stress (inhibitors of Wee1, ATR or Chk1). Yet, these agents work preferentially in p53-deficient cells, and require cell cycle progression. Sequential treatment with CDK4/6 inhibition followed by cell cycle checkpoint inhibition may help in uncovering the phenotype.

      - This is a good point and we will perform experiments with ATR inhibitors after release from CDK4/6 inhibition to examine if this enhances the phenotype.

      The authors increase the amount of replication stress using chemotherapeutic approaches or MPS1 inhibitors. The chemotherapeutic approaches are relevant clinically, but mechanistically I don't understand this beyond adding up treatments that lead to replication defects.

      - We agree that the main value of these experiments is not to provide mechanistic insight, but rather to demonstrate that CDK4/6 inhibition can enhance the effect of current genotoxic drugs. Considering CDK4/6 inhibitors are well-tolerated, this could represent an effective way to enhance the tumour-selectivity of current genotoxic therapeutics. This has been suggested previously in a pancreatic cancer study (Salvador-Barbero et al., 2020), but the reasons given for synergy were different (DNA damage repair) and the order of drug exposure was reversed (genotoxic before CDK4/6i). This underscores the potential importance of our new data.

      - From a mechanistic point of view, these data do still suggest that CDK4/6i and genotoxic drugs converge onto the same replication stress phenotype, thereby supporting our overall conclusions. One interpretation is that a reduction in replisome levels and licenced replication origins impairs the ability of cells to overcome replication problems induced by chemotherapy drugs. Conceptualising how these drugs may synergize in this way will be important in designing new studies and trials to address this synergy more broadly.

      The aneuploidy treatment is a bit weird, because it may trigger a p53 response before the cells are released from a CDK4/6i arrest. Besides, MPS1 inhibition does more than just cause replication stress and is not very clinically relevant in this context.

      - We agree that the aneuploidy experiment could have many different interpretations, and only one of these relates specifically to replication stress. This was also commented on by reviewer 1, so we feel it is best to remove this data and just keep the data on drugs that affect replication stress or DNA damage directly. We will address the effects of aneuploidy more extensively in a separate study.

      Reviewer #3 (Significance (Required)): In their manuscript entitled: Crozier and co-workers studied the effects of CDK4/6 inhibition on cell growth. CDK4/6 inhibitors are currently used in the treatment for hormone-positive breast cancers, but their cell biological effects on tumor cells remain incompletely clear, which may hamper the further clinical development of these drugs for breast cancer or other cancers. Inhibition of CDK4/6 is known to trigger a cell cycle arrest, and it is currently unclear how this could lead to long-term tumor control. This manuscript addresses the question why cdk4/6 inhibitors cause long-term cell cycle exit.

      - We thank the reviewer for this simple description of our work, which we think pitches the significance very clearly. There are currently 15 different CDK4/6 inhibitors in clinical trials, and more than 100 further trials using the 3 currently licenced inhibitors in a wide variety of tumour types and drug combinations. Although the clinical work on these drugs is huge, it is unclear how they cause long-term cell cycle arrest and we now link this to genotoxic stress for the first time. This explains clearly why this work is potentially very significant. We agree, however, that the main caveat is the need to demonstrate our findings are also applicable to breast cancer cells. But, if this is the case, we believe this would represent a paradigm shift in our understanding of how these drugs work, especially considering that genomic damage is a universal route to prolonged cell cycle exit in response to almost all other broad-spectrum anti-cancer drugs.

      There are two issues that affect the significance of the findings: the authors start their manuscript with a strong translational/clinical issue, but solely use RPE1 cell lines to address this issue. It remains unclear if their observations hold true in breast cancer models. It would be advisable to repeat key findings in a hormone receptor-positive breast cancer model.

      - We will examine the applicability of our findings in breast cancer cells and include this work at revision.

      The effects of CDK4/6 inhibitors are observed at clinically relevant doses. However, the effects are observed upon switch-like washout of the CDK4/6 inhibitors. This does not per se reflect the pharmacodynamics of the more gradual increase and decrease of drug concentrations in tumor cells. The significance of this work would be greater if cell cycle exit with replication stress were observed either in clinical samples or in in vivo treated cancer cells.

      - We agree that the significance of this work will ultimately only become fully apparent if replication stress is confirmed in clinical samples or in vivo. We envisage that our study will stimulate exactly this type of analysis in future. However, we would also add that the gradual increase/decrease in drug concentrations seen in patients is still likely to lead to switch-like cell cycle re-entry given the switch-like nature of cell cycle controls at the G1/S transition. So, the timing may be different, but we would not predict that the downstream response in S-phase would be. However, whether replication stress is seen during drug-free washout periods in patients is clearly a critical future question, as we highlight in the discussion.

      References

      Asghar, U.S., Barr, A.R., Cutts, R., Beaney, M., Babina, I., Sampath, D., Giltnane, J., Lacap, J.A., Crocker, L., Young, A., et al. (2017). Single-Cell Dynamics Determines Response to CDK4/6 Inhibition in Triple-Negative Breast Cancer. Clin Cancer Res 23, 5561-5572.

      Christenson, J.L., O'Neill, K.I., Williams, M.M., Spoelstra, N.S., Jones, K.L., Trahan, G.D., Reese, J., Van Patten, E.T., Elias, A., Eisner, J.R., et al. (2021). Activity of combined androgen receptor antagonism and cell cycle inhibition in androgen receptor-positive triple-negative breast cancer. Mol Cancer Ther.

      Condorelli, R., Spring, L., O'Shaughnessy, J., Lacroix, L., Bailleux, C., Scott, V., Dubois, J., Nagy, R.J., Lanman, R.B., Iafrate, A.J., et al. (2018). Polyclonal RB1 mutations and acquired resistance to CDK 4/6 inhibitors in patients with metastatic breast cancer. Annals of oncology : official journal of the European Society for Medical Oncology 29, 640-645.

      Cook, J.G., Park, C.H., Burke, T.W., Leone, G., DeGregori, J., Engel, A., and Nevins, J.R. (2002). Analysis of Cdc6 function in the assembly of mammalian prereplication complexes. Proceedings of the National Academy of Sciences of the United States of America 99, 1347-1352.

      Costa, C., Wang, Y., Ly, A., Hosono, Y., Murchie, E., Walmsley, C.S., Huynh, T., Healy, C., Peterson, R., Yanase, S., et al. (2020). PTEN Loss Mediates Clinical Cross-Resistance to CDK4/6 and PI3Kα Inhibitors in Breast Cancer. Cancer Discov 10, 72-85.

      Ge, X.Q., Jackson, D.A., and Blow, J.J. (2007). Dormant origins licensed by excess Mcm2-7 are required for human cells to survive replicative stress. Genes Dev 21, 3331-3341.

      Goel, S., DeCristo, M.J., McAllister, S.S., and Zhao, J.J. (2018). CDK4/6 Inhibition in Cancer: Beyond Cell Cycle Arrest. Trends Cell Biol 28, 911-925.

      Hurvitz, S.A., Martin, M., Press, M.F., Chan, D., Fernandez-Abad, M., Petru, E., Rostorfer, R., Guarneri, V., Huang, C.S., Barriga, S., et al. (2020). Potent Cell-Cycle Inhibition and Upregulation of Immune Response with Abemaciclib and Anastrozole in neoMONARCH, Phase II Neoadjuvant Study in HR(+)/HER2(-) Breast Cancer. Clin Cancer Res 26, 566-580.

      Ibarra, A., Schwob, E., and Méndez, J. (2008). Excess MCM proteins protect human cells from replicative stress by licensing backup origins of replication. Proceedings of the National Academy of Sciences of the United States of America 105, 8956-8961.

      Johnston, S., Puhalla, S., Wheatley, D., Ring, A., Barry, P., Holcombe, C., Boileau, J.F., Provencher, L., Robidoux, A., Rimawi, M., et al. (2019). Randomized Phase II Study Evaluating Palbociclib in Addition to Letrozole as Neoadjuvant Therapy in Estrogen Receptor-Positive Early Breast Cancer: PALLET Trial. J Clin Oncol 37, 178-189.

      Klein, M.E., Kovatcheva, M., Davis, L.E., Tap, W.D., and Koff, A. (2018). CDK4/6 Inhibitors: The Mechanism of Action May Not Be as Simple as Once Thought. Cancer Cell 34, 9-20.

      Knudsen, E.S., and Witkiewicz, A.K. (2017). The Strange Case of CDK4/6 Inhibitors: Mechanisms, Resistance, and Combination Strategies. Trends in cancer 3, 39-55.

      Leone, G., DeGregori, J., Yan, Z., Jakoi, L., Ishida, S., Williams, R.S., and Nevins, J.R. (1998). E2F3 activity is regulated during the cell cycle and is required for the induction of S phase. Genes Dev 12, 2120-2130.

      Li, Z., Razavi, P., Li, Q., Toy, W., Liu, B., Ping, C., Hsieh, W., Sanchez-Vega, F., Brown, D.N., Da Cruz Paula, A.F., et al. (2018). Loss of the FAT1 Tumor Suppressor Promotes Resistance to CDK4/6 Inhibitors via the Hippo Pathway. Cancer Cell 34, 893-905.e898.

      Liu, C.Y., Lau, K.Y., Hsu, C.C., Chen, J.L., Lee, C.H., Huang, T.T., Chen, Y.T., Huang, C.T., Lin, P.H., and Tseng, L.M. (2017). Combination of palbociclib with enzalutamide shows in vitro activity in RB proficient and androgen receptor positive triple negative breast cancer cells. PloS one 12, e0189007.

      Ma, C.X., Gao, F., Luo, J., Northfelt, D.W., Goetz, M., Forero, A., Hoog, J., Naughton, M., Ademuyiwa, F., Suresh, R., et al. (2017). NeoPalAna: Neoadjuvant Palbociclib, a Cyclin-Dependent Kinase 4/6 Inhibitor, and Anastrozole for Clinical Stage 2 or 3 Estrogen Receptor-Positive Breast Cancer. Clin Cancer Res 23, 4055-4065.

      Matson, J.P., Dumitru, R., Coryell, P., Baxley, R.M., Chen, W., Twaroski, K., Webber, B.R., Tolar, J., Bielinsky, A.K., Purvis, J.E., et al. (2017). Rapid DNA replication origin licensing protects stem cell pluripotency. eLife 6.

      Matson, J.P., House, A.M., Grant, G.D., Wu, H., Perez, J., and Cook, J.G. (2019). Intrinsic checkpoint deficiency during cell cycle re-entry from quiescence. J Cell Biol 218, 2169-2184.

      Méndez, J., and Stillman, B. (2000). Chromatin association of human origin recognition complex, cdc6, and minichromosome maintenance proteins during the cell cycle: assembly of prereplication complexes in late mitosis. Mol Cell Biol 20, 8602-8612.

      O'Leary, B., Cutts, R.J., Liu, Y., Hrebien, S., Huang, X., Fenwick, K., André, F., Loibl, S., Loi, S., Garcia-Murillas, I., et al. (2018). The Genetic Landscape and Clonal Evolution of Breast Cancer Resistance to Palbociclib plus Fulvestrant in the PALOMA-3 Trial. Cancer Discov 8, 1390-1403.

      Patnaik, A., Rosen, L.S., Tolaney, S.M., Tolcher, A.W., Goldman, J.W., Gandhi, L., Papadopoulos, K.P., Beeram, M., Rasco, D.W., Hilton, J.F., et al. (2016). Efficacy and Safety of Abemaciclib, an Inhibitor of CDK4 and CDK6, for Patients with Breast Cancer, Non-Small Cell Lung Cancer, and Other Solid Tumors. Cancer Discov 6, 740-753.

      Prat, A., Saura, C., Pascual, T., Hernando, C., Muñoz, M., Paré, L., González Farré, B., Fernández, P.L., Galván, P., Chic, N., et al. (2020). Ribociclib plus letrozole versus chemotherapy for postmenopausal women with hormone receptor-positive, HER2-negative, luminal B breast cancer (CORALLEEN): an open-label, multicentre, randomised, phase 2 trial. Lancet Oncol 21, 33-43.

      Salvador-Barbero, B., Álvarez-Fernández, M., Zapatero-Solana, E., El Bakkali, A., Menéndez, M.D.C., López-Casas, P.P., Di Domenico, T., Xie, T., VanArsdale, T., Shields, D.J., et al. (2020). CDK4/6 Inhibitors Impair Recovery from Cytotoxic Chemotherapy in Pancreatic Adenocarcinoma. Cancer Cell 37, 340-353.e346.

      Wagner, V., and Gil, J. (2020). Senescence as a therapeutically relevant response to CDK4/6 inhibitors. Oncogene.

      Wander, S.A., Cohen, O., Gong, X., Johnson, G.N., Buendia-Buendia, J.E., Lloyd, M.R., Kim, D., Luo, F., Mao, P., Helvie, K., et al. (2020). The genomic landscape of intrinsic and acquired resistance to cyclin-dependent kinase 4/6 inhibitors in patients with hormone receptor positive metastatic breast cancer. Cancer Discov.

      Whitfield, M.L., Sherlock, G., Saldanha, A.J., Murray, J.I., Ball, C.A., Alexander, K.E., Matese, J.C., Perou, C.M., Hurt, M.M., Brown, P.O., et al. (2002). Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol Biol Cell 13, 1977-2000.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this paper, Saurin and colleagues investigate the effects of CDK4/6 inhibitors on cell cycle arrest and re-entry. The authors report that long-term G1 arrest induced by CDK4/6i interferes with DNA replication during the next cell cycle, leading to DNA damage and mitotic catastrophe. Additionally, this compromised replication state sensitizes cells to chemotherapeutics that enhance replication stress.

      The major claims advanced in this paper are well-supported by the presented evidence. While I have several questions regarding the significance (see below), I have only a few minor points regarding the methodology.

      1) Regarding the down-regulation of MCM components induced by long-term palbo treatment shown in Figure 4: MCM levels are tightly regulated by cell cycle phase. I could imagine that this gene expression change may be a consequence of, for instance, 2 days CDK4/6i treatment arresting 95% of cells in G1 while 7 days of CDK4/6i treatment causes a 99.9% G1 arrest. The data in Figure 1B seems to argue against this hypothesis, but how was that data generated? Can the authors rule out a subtle change in S-phase % over 7 days in palbo?

      Alternately, is the down-regulation of MCM genes a consequence of cells entering senescence?

      2) For the drug studies presented in figure 5, it is important that the authors perform the appropriate statistical comparisons and analyses to demonstrate true synergy. The authors show that combining palbo and certain chemotherapies causes a greater decrease in clonogenicity than palbo alone. This may or may not be surprising (see below) - but this by itself is insufficient to support the claim that palbo "sensitizes" cells to genotoxins. If you treat cells with two poisons, in 9 out of 10 cases, you'll kill more cells than if you treat cells with one poison alone. But that could be due to totally independent effects - see, for instance, Palmer and Sorger Cell 2017. There are several well-established statistical methods for investigating drug synergy - like Loewe Additivity or Bliss Independence - and one of these methods should be used to analyze the drug-combination studies presented in Figure 5.
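
      The Bliss Independence check named above can be sketched as follows. This is a minimal illustration rather than the authors' or the referee's analysis: the function names and all numbers are hypothetical, and each effect is assumed to be expressed as the fraction of colonies lost (1 minus the surviving fraction). Under Bliss Independence, two independently acting drugs are expected to give a combined effect of fa + fb - fa*fb. An observed effect above that expectation points towards synergy, and one below it towards antagonism.

```python
def bliss_expected(fa: float, fb: float) -> float:
    """Expected combined fractional effect if the two drugs act independently."""
    for f in (fa, fb):
        if not 0.0 <= f <= 1.0:
            raise ValueError("fractional effects must lie in [0, 1]")
    return fa + fb - fa * fb


def bliss_excess(fa: float, fb: float, fab_observed: float) -> float:
    """Observed combination effect minus the Bliss expectation.

    Positive values suggest synergy, negative values antagonism,
    and values near zero are consistent with independent action.
    """
    return fab_observed - bliss_expected(fa, fb)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the calculation.
    effect_cdk46i = 0.40      # fraction of colonies lost with CDK4/6i alone
    effect_genotoxin = 0.30   # fraction of colonies lost with genotoxin alone
    effect_combo = 0.75       # fraction of colonies lost with the combination

    expected = bliss_expected(effect_cdk46i, effect_genotoxin)
    excess = bliss_excess(effect_cdk46i, effect_genotoxin, effect_combo)
    print(f"Bliss expectation: {expected:.2f}, excess over Bliss: {excess:.2f}")
```

      In practice the excess would be computed per replicate and across a dose matrix, with a statistical test applied to the excess rather than to a single comparison.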

      Significance

      While this study is a comprehensive analysis of the effects of CDK4/6i in RPE1 cells in 2d culture, I am not convinced of its broader significance.

      1) So far as I can tell, the authors do not cite any studies establishing that CDK4/6i results in a significant increase in G1-arrested cells in treated patients. What evidence is there for this claim? I am aware that this has been demonstrated in xenografts and in mouse models, but I could not find evidence for this from actual clinical studies. Here, I am reminded of the very interesting work from Beth Weaver's group on paclitaxel - Zasadil STM 2014. While it had been widely assumed that paclitaxel causes a mitotic arrest, they actually show that this drug kills tumor cells by promoting mitotic catastrophe without inducing a complete mitotic arrest. Similarly, in the absence of existing clinical data, the underlying assumption regarding the effects of CDK4/6i that motivates this paper may not be accurate. For instance, if CDK4/6i acts through the immune system (as suggested by Jean Zhao and others), then this G1 arrest phenotype could be entirely secondary to the drug's actual mechanism-of-action.

      2) How relevant are RPE1 cells? Clinically, CDK4/6 inhibitors are combined with fulvestrant (which would not have an effect in RPE1), and the activity that they exhibit in breast cancer has not been matched in any other cancer types. The underlying biology of HR+ breast cancer (particularly regarding the regulation of CCND1 expression and the G1/S transition by estrogen) may not be recapitulated by other cell types. Moreover, the artificial media used in cell culture experiments may alter the regulation of the G1/S transition. I do not believe that these experiments conducted in RPE1 cells in 2d cell culture are generalizable.

      3) I am confused about the effects of CDK4/6i on genotoxin sensitivity. Replogle and Amon PNAS 2020 and several citations contained therein report that CDK4/6i protects cells from DNA damage. Moreover, trilaciclib has recently received FDA approval for its ability to protect the bone marrow from cytotoxic chemotherapy! Is this a question of dose timing/intensity? The FDA approval of trilaciclib for this indication should certainly be discussed. This underscores my concern that certain findings in this paper are RPE1/tissue culture artifacts, with limited generalizability.

      Referees cross-commenting

      I think that we largely agree that RPE1 is not a great model for this study, and repeating certain key experiments in an ER+ BC line like MCF7 may be warranted.

      Additionally, I wanted to draw attention to the fact that, to my knowledge, the evidence for palbociclib inducing a G1 arrest in patients is incredibly spotty. For early-stage breast tumors where palbo is most effective, nearly all tumor cells are in G1 anyway. I think that it makes the most sense that palbo is actually working through immune modulation or through some secondary mechanism, rather than enforcing a G1 arrest. So I'm not sure about the premise of this study.

    1. Author Response:

      Evaluation Summary:

      This paper will be of considerable interest to anybody focusing on highly sensitive T cell antigen recognition. It uses an extended experimental protocol and analytical methods to assess very low T cell receptor binding affinities, and to determine how T cells discriminate between self- and non-self antigens. The main conclusions are well supported by the presented analysis and provide a novel view on a previously considered concept.

      Reviewer #1 (Public Review):

      The presented manuscript takes a comprehensive and elaborated look at how T cell receptors (TCR) discriminate between self and non-self antigens. By extending a previous experimental protocol for measuring T cell receptor binding affinities against peptide MHC complexes (pMHC), they are able to determine very low TCR-pMHC binding affinities and, thereby, show that the discriminatory power of the TCR seems to be imperfect. Instead of a previously considered sharp threshold in discriminating between self and non-self antigen, the TCR can respond to very low binding affinities leading to a more transient affinity threshold. However, the analysis still indicates an improved discrimination ability for TCR compared to other cell surface receptors. These findings could affect how T cell-mediated autoimmunity is studied.

      The authors follow a comprehensive and elaborated approach, combining in vitro experiments with analytical methods to estimate binding affinities. They also show that the general concept of kinetic proofreading fits their data, providing estimates of the number of proofreading steps and the corresponding rates. The statistical and analytical methods are well explained and outlined in detail within the Supplemental Material. The source of all data, and especially how the data used to analyze other cell surface receptor binding affinities were extracted, is given in detail as well. Besides being able to quantify TCR-pMHC interactions for very low binding affinities, their findings will improve the ability to assess how autoimmune reactions are potentially triggered, and how potent anti-tumour T cell therapies can be generated.

      In summary, the study represents an elaborated and concise analysis of TCR-pMHC affinities and the ability of TCR to discriminate between self and non-self antigens. All conclusions are well supported by the presented data and analyses without major caveats.

      Reviewer #2 (Public Review):

      The paper revisits the question of ligand discrimination ability of TCRs of T cells. The authors find that the commonly held notion of very sharp discrimination between strongly and weakly binding peptides does not hold when the affinities of the weak peptides are re-measured more accurately, using their own new method of calibration of SPR measurements. They are able to phenomenologically fit their results with a ~2 step Kinetic Proofreading model.

      It is a very carefully researched and thorough paper. The conclusions seem to be supported by the data and fundamental for our understanding of the T cell immune response with potentially very high impact in many scientific and applied fields. The calibration method could be of potential use in other cases where low affinities are an issue.

      As a non-expert in the details of the experimental technique, I find it somewhat difficult to understand in detail the Ab calibration of the SPR curve, which is a central piece of the paper. The main question is: what are the grounds (theoretical and/or empirical) to expect that the B_max of the TCR dose-response curve will continue to be proportional to the plateau level of the Ab? Figure 1D does suggest that, but it would be hard to predict what shape the proportionality curve will take for lower-affinity peptides. Given that essentially all of the paper's claims rest on this assumption, this should be explained, reasoned, and supported more clearly.

      We have revised the relevant Results and Methods sections to provide additional information. This information should clarify the expected relationship between Bmax and W6/32 binding. We emphasise that we have only interpolated within the curve and therefore, have not relied on any assumptions about the relationship between these two values outside of the empirical curve that we have generated.
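
      To illustrate why an independent estimate of Bmax matters for weak binders, here is a minimal sketch under stated assumptions. It assumes a standard one-site binding model, R = Bmax * C / (KD + C), and uses a simple linearised least-squares estimate of KD with Bmax held fixed. The function names, the estimator, and the synthetic numbers are all hypothetical and are not the authors' fitting procedure. When every injected concentration is far below KD, the curve is close to a straight line with slope Bmax/KD, so Bmax and KD cannot be separated from the binding data alone. Fixing Bmax from a calibration such as the W6/32 curve removes that ambiguity.

```python
from typing import Sequence


def one_site_binding(conc: float, b_max: float, k_d: float) -> float:
    """Equilibrium response for a one-site binding model."""
    return b_max * conc / (k_d + conc)


def estimate_kd_fixed_bmax(
    concs: Sequence[float], responses: Sequence[float], b_max: float
) -> float:
    """Estimate KD with Bmax held fixed (hypothetical linearised estimator).

    Rearranging R = Bmax*C/(KD + C) gives C*(Bmax - R) = KD*R, which is a
    straight line through the origin in R. The least-squares slope is KD.
    """
    numerator = sum(r * c * (b_max - r) for c, r in zip(concs, responses))
    denominator = sum(r * r for r in responses)
    if denominator == 0:
        raise ValueError("all responses are zero; KD cannot be estimated")
    return numerator / denominator


if __name__ == "__main__":
    # Synthetic, noise-free data for a weak binder: the top concentration
    # (200) is well below the true KD (500), so the curve never saturates.
    true_b_max, true_k_d = 1000.0, 500.0
    concs = [10.0, 50.0, 100.0, 200.0]
    responses = [one_site_binding(c, true_b_max, true_k_d) for c in concs]

    k_d_hat = estimate_kd_fixed_bmax(concs, responses, b_max=true_b_max)
    print(f"KD estimated with Bmax fixed: {k_d_hat:.1f}")
```

      With real, noisy data a nonlinear fit with Bmax constrained would normally be preferred over this linearisation, which is used here only to keep the sketch dependency-free.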

      On the theoretical side, I think the scaling α ≃ 2 in Figure 2 is indeed consistent with a two-step KPR amplification. However, there are some questions regarding the fitting of the full model to the P_15 of the CD69 response. As explained in the Supplementary Material, the authors use 3 global and 2 local parameters, resulting in 37 (or 27) parameters for 32 data points. To a naive reader this might look excessive and prone to overfitting. On the other hand, Figure S8 shows that the value ranges of lambda and k_p are quite tight. This is in contrast to gamma and delta, which look completely unconstrained.

      We have revised the relevant Results section to explicitly indicate that the number of data points exceeds the number of free parameters, which together with the ABC-SMC results, should provide additional confidence that we are not over-fitting.
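
      The link between a scaling exponent near 2 and a two-step proofreading chain can be sketched numerically. The snippet below uses the textbook kinetic proofreading expression, in which a bound receptor completes N sequential steps (each at rate k_p) before unbinding (rate k_off) with probability (k_p / (k_p + k_off))^N. This is an assumption made for illustration and is not the full model fitted in the paper; the function names and parameter values are hypothetical. For weak ligands with k_off much larger than k_p, the log-log slope of this probability against k_off approaches -N. If the on-rate is roughly constant across ligands, KD is proportional to k_off, so the antigen dose needed to reach a fixed response would scale roughly as KD^N.

```python
import math


def kp_signal_fraction(k_off: float, k_p: float, n_steps: int) -> float:
    """Probability that a binding event completes all proofreading steps."""
    return (k_p / (k_p + k_off)) ** n_steps


def local_log_slope(
    k_off: float, k_p: float, n_steps: int, factor: float = 1.01
) -> float:
    """Finite-difference estimate of d ln(signal) / d ln(k_off)."""
    s_low = kp_signal_fraction(k_off, k_p, n_steps)
    s_high = kp_signal_fraction(k_off * factor, k_p, n_steps)
    return (math.log(s_high) - math.log(s_low)) / math.log(factor)


if __name__ == "__main__":
    k_p = 1.0  # hypothetical proofreading rate (per second)
    for k_off in (0.01, 1.0, 1000.0):
        slope = local_log_slope(k_off, k_p, n_steps=2)
        print(f"k_off = {k_off:>7}: log-log slope = {slope:.3f}")
```

      The slope is close to 0 for long-lived binders and tends towards -2 for short-lived ones, which is the sense in which an exponent of about 2 is consistent with two proofreading steps.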

      Finally, one of the stated advantages of the adaptive proof-reading model is that it is capable of explaining antagonism. It is hard to see how a "vanilla" KPR model is capable of explaining antagonism.

      We have added a discussion paragraph to discuss antagonism, which cannot be explained by the basic KP model that we found is sufficient to explain our data on antigen discrimination in the presence of self pMHCs on autologous APCs. We describe how the methods we have employed can be used to study antagonism.

      Reviewer #3 (Public Review):

      Pettmann et al. aimed at significantly improving the accuracy of SPR-based measurements of low-affinity TCR-pMHC interactions by including a 100% binding control (injection of a conformation-specific HLA antibody) in the surface plasmon resonance protocol. Interpolating with the information of saturated pMHC binding on the chip, the authors arrive at KDs for low affinity binders that are significantly higher than the previously reported constants. If correct, this has considerable ramifications for the interpretations of the results obtained from functional assays measuring the T cell response towards pMHCs featured in a titrated fashion. Unlike what was put forward by earlier reports, the authors conclude that the discriminatory power of TCRs is far from perfect, as T cells still respond to low affinity pMHC-ligands without a sharp affinity threshold. This is also because they managed to detect T cells responding to even ultra-low affinity ligands if provided in sufficient numbers.

      The body of work convinces in several regards:

      (i) It is exceedingly well thought out and introduces a quality of analytical strength that is absent in most of the literature published thus far on this topic.

      (ii) At the same time theoretical arguments are bolstered by a large body of experimental "wet" work, which combines a synthetic approach with cellular immunology and which appears overall well executed.

      (iii) The data lead to hypotheses in the field of T cell antigen recognition in general and in the theatre of autoimmunity, cancer and infectious diseases.

      There are a few aspects that may limit the impact of the study. I have listed them below:

      (i) The study does not provide kinetic data for the low affinity ligand-TCR binding but rather argues from the position of affinities as determined via Bmax. This limits somewhat the robustness of the statements made with regard to kinetic proofreading.

      We agree with this statement and are hoping to directly measure off-rates in the future. We note that in the published literature, including our own work, point mutations to the peptide generally modify the off-rate with only minor impact on the on-rate. An example of this can be found in Lever et al (2016) PNAS where point mutations led to 100,000-fold change in the off-rate but only a 10-fold change in the on-rate. This likely explains why antigen potency is often well-correlated with affinity when using point mutations to the peptide.

      (ii) Thresholds for readouts were arbitrarily chosen (e.g. 15% activation). It appears such choices were based on system behavior (with the largest differences observed among the groups) but may have implications for the drawn conclusions.

      We chose 15% in order to capture the ultra-low-affinity pMHCs in our potency plots and have now added a sentence explaining why we chose this particular threshold. We explored different thresholds but found that they produced similar values of α. The precise threshold could change the estimate of α if the shape of the dose-response curves depended on antigen affinity, but we did not find any evidence for this in our data.
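
      The threshold argument above can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes, purely for illustration, that α is the exponent of a power-law relation between potency and KD, and that dose-response curves follow a Hill function. The names `potency`, `fitted_alpha` and `hill_per_decade` are hypothetical. Under these assumptions, the fitted exponent is the same at any activation threshold when all curves share one shape, and shifts only when the Hill coefficient varies with affinity.

```python
import math

def potency(kd, threshold, alpha=1.0, hill=1.0, hill_per_decade=0.0):
    """Dose at which a Hill-type curve reaches `threshold` (fraction of max).

    Assumptions (illustrative only): EC50 = kd**alpha, and the Hill
    coefficient may drift by `hill_per_decade` per tenfold change in kd.
    """
    ec50 = kd ** alpha
    h = hill + hill_per_decade * math.log10(kd)
    # Solve threshold = x**h / (ec50**h + x**h) for x.
    return ec50 * (threshold / (1.0 - threshold)) ** (1.0 / h)

def fitted_alpha(kds, threshold, **kwargs):
    """Least-squares slope of log10(potency) against log10(kd)."""
    xs = [math.log10(k) for k in kds]
    ys = [math.log10(potency(k, threshold, **kwargs)) for k in kds]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

kds = [1.0, 10.0, 100.0, 1000.0]
# Same curve shape for every ligand: the threshold does not matter.
print(fitted_alpha(kds, 0.15, alpha=2.0), fitted_alpha(kds, 0.50, alpha=2.0))
# Affinity-dependent shape: the estimate at 15% now deviates from 2.0.
print(fitted_alpha(kds, 0.15, alpha=2.0, hill_per_decade=0.2))
```

      With a shared curve shape, changing the threshold multiplies every potency by the same constant, which moves the intercept of the log-log fit but leaves its slope unchanged.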

      In summary, the work presented contributes to demystifying the link between TCR engagement and (membrane-proximal) signaling. It also provides a fresh perspective on the potential of TCR cross-reactivity.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      We are grateful to the editors at Review Commons and to the reviewers for their thoughtful attention to our manuscript. Our work presents data showing that deletion of the apoptosis regulator Mcl-1 in CNS stem cells that give rise to neurons and glia resulted in specific degeneration of the white matter, beginning after postnatal day 7 (P7). Cellular analysis shows that oligodendrocytes were depleted while astrocytes persisted. Co-deletion of apoptosis effectors Bax or Bak rescued different aspects of the Mcl-1 deletion phenotype, confirming the role of apoptosis. Based on these observations, we conclude that oligodendrocytes require MCL-1 to prevent spontaneous apoptosis, and that MCL-1 depletion results in leukodystrophy, which resembles severe cases of the human disorder Vanishing White Matter Disease (VWMD). We further suggest that MCL-1 deficiency, caused by the eIF2B mutations of VWMD, may play a critical role in VWMD pathogenesis.

      The reviewers questioned the similarity of the Mcl-1 deletion phenotype to VWMD and were not convinced that MCL-1 deficiency is integral to VWMD. Based on reviewer feedback, we concede that a firm link to VWMD is not supported by the available data. We consider, however, that our findings that MCL-1 is required for oligodendrocyte survival and white matter stability remain highly significant. Accordingly, we propose to revise the work as suggested by Reviewer 1 to highlight the insight our data provide as to apoptosis regulation in glia and its importance for brain development, and to revise the title, as suggested by Reviewer 3, to remove the specific reference to VWMD.

      In the revision, we will make clear that the comparison to specific leukodystrophies is hypothetical and will require extensive follow-up experiments that are suggested by the findings of this work, as described in the reviews. Revising our work by removing the assertion that our data strongly implicate MCL-1 in VWMD pathogenesis will address the main reviewer concern, strengthen the logical flow, and highlight the potential for MCL-1 to be broadly relevant to white matter pathology. The significance of our findings that oligodendrocytes depend on MCL-1 protein to prevent their spontaneous apoptosis, and that MCL-1 deficiency produces white matter degeneration, will not be altered by these changes. Our data will continue to show that MCL-1 dependence is a physiologic vulnerability of oligodendrocytes that sets them apart from astrocytes and neurons and that this vulnerability is sufficient to cause white matter-specific brain degeneration when MCL-1 expression is blocked.

      The other issues raised by the reviewers are all tractable and can be addressed with new experiments that we can complete in a short time-frame, such as studies of retinal pathology and additional immunohistochemistry studies, or with changes to the text. We consider that with these revisions, the manuscript will be an important contribution to understanding glial biology and the pathogenesis of white matter-specific disorders. We describe in detail below our responses to reviewer feedback and planned changes to the manuscript.

      Reviewer comments are in italics and our responses are in plain text.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Comments**

      While we acknowledge many important points in this review, this first point is based on a premise that is inaccurate. Based on published data, we respectfully disagree with the statement that “Depletion of MCL-1 in any tissue would promote apoptosis in cells of this tissue”. Most cells do not require an anti-apoptotic protein to prevent spontaneous apoptosis; cells that depend on anti-apoptotic proteins are specifically referred to as “primed for apoptosis” (1-5). Our conditional deletion genotypes ablated Mcl-1 in neurons of the forebrain and cerebellum and in all subtypes of glial cells. The loss of oligodendrocytes in our Mcl-1-deleted mice shows that a specific subset of white matter cells in the postnatal brain requires MCL-1. Together with the increase in apoptosis and the rescues by co-deletion of Bax or Bak, these data demonstrate that cells within the oligodendrocyte lineage are primed for apoptosis in a manner that is restricted by MCL-1. In contrast, we have shown in published data that we cite in this manuscript that conditional deletion of Mcl-1 in cerebellar granule neurons, the largest neuronal population in the brain, does not cause apoptosis (6); these data provide direct evidence that large populations of cells in the brain do not depend on MCL-1. We therefore disagree with the characterization of the brain-specific Mcl-1 deletion phenotype as “non-specific”.

      • The white matter disease is interpreted as similar to VWM; VWM is specifically investigated and MCL-1 is found to be decreased in VWM brain tissue. The decrease is most likely nonspecific. Decrease in MCL-1 is most likely part of a general mechanism of degeneration of brain tissue or white matter. That is a different but also important conclusion. It is essential that other progressive leukodystrophies and acquired brain diseases with tissue degeneration, such as encephalitis, are investigated as well to see whether MCL-1 is also decreased in these disorders. If so, the MCL-1 decrease in white matter disease and other brain degenerative disease should be described as a final common pathway rather than specifically applicable to VWM.

      We agree that MCL-1 is likely to be a final common point in multiple disease processes that affect white matter. As described in our response to point 3 below, we are persuaded by the reviewers that the proposed similarity of the Mcl-1 deletion phenotype and VWMD is not sufficiently supported by the available evidence. We will revise the text to make clear that we consider that impaired MCL-1 is “likely part of a general mechanism of degeneration of… white matter”.

      • Adding to point 2 is the fact that the pathology of the brain-specific MCL-1 knock-out mouse does not resemble the pathology of VWM at all. The central features of VWM are abnormal astrocyte morphology with astrocytes having a few stunted processes, lack of reactive astrogliosis, lack of microgliosis, increase in number of oligodendrocytes and presence of foamy oligodendrocytes. The increase in oligodendrocytes in VWM may be such that the high cellularity leads to diffusion restriction on MRI. Bergmann glia are typically ectopic, but not reduced in number. By contrast, the brain-specific MCL-1 knock-out mouse is characterized by decreased numbers of oligodendrocytes, increased numbers of microglia, reactive astrogliosis, decreased numbers of Bergmann glia and ectopic granule cells. No morphological abnormalities of oligodendrocytes and astrocytes are observed. So, histopathologically the only shared feature is preferential involvement of the brain white matter.

      We are persuaded by the reviewers that our assertion of a high degree of similarity between the Mcl-1 deletion phenotype and VWMD was not adequately supported by our available data. In the revision, we will state that a role for MCL-1 deficiency in VWMD pathogenesis is hypothetical, and that additional studies beyond the scope of this project will be needed to test this hypothesis. However, we reassert that the white matter specificity of the Mcl-1-deletion phenotype is important.

      The reviewer accurately characterizes the pathology of the Mcl-1 deletion phenotype and notes “the preferential involvement of the white matter”. We consider the preferential involvement of white matter, and of oligodendrocytes within the white matter, to be highly significant. We will revise the work to focus on the Mcl-1 deletion phenotype, the white matter specificity, and the potential relevance to diverse white matter-specific diseases.

      While we concede that more data would be needed to firmly connect MCL-1 to VWMD, we do not agree that the Mcl-1 phenotype “does not resemble the pathology of VWM at all”. There is a diversity of published observations of pathology in VWMD and not all published reports support the descriptions in the reviewer comment. This diversity of findings is highly relevant to our work. For example, while autopsy studies of humans with end stage VWMD show lack of microgliosis (7), studies of mice with a mutation known to cause VWMD in humans, that clearly recapitulate VWMD, show robust microgliosis earlier in the disease process (8). These different observations raise the possibility that microgliosis occurs during the period of active neurodegeneration or at least that in murine brain, the VWMD process activates a microglial reaction. Either interpretation would support a likeness between Mcl-1-deleted mice and VWMD mouse models. Another study of cerebellar pathology in twin human fetuses with characteristic VWMD mutations showed complete absence of Bergmann glia (9). We propose in the revision to address the reviewer’s concerns by presenting the diversity of perspectives on microglial reaction and Bergmann glial changes in VWMD, including all of the citations above.

      • The clarity of the work would benefit from a different approach to introduce the study. It would help the reader to know that (1) gray matter cell specific Mcl-1 deletion in mice did not cause apoptosis and (2) apoptosis may have different effector proteins. This important information is now in the discussion. The switch to another cell type in the brain (hGFAP+ cells) would be logical and the significance of the work may improve. When approaching the topic from the field of leukodystrophies one would not necessarily think of deleting the Mcl-1 gene, especially as this gene is not associated with any known leukodystrophy and tends to associate with preneoplastic and neoplastic disease.

      We appreciate these suggestions, which we agree will enhance the logical flow and the significance, in line with our response to point 3. We will revise the Introduction as suggested.

      • The authors claim that the ISR is activated in VWM, which means that eIF2α phosphorylation levels are increased, general protein synthesis is decreased and a transcription pathway is regulated by ATF4 and other factors. However, this is not what is seen in VWM. Increased eIF2α phosphorylation and reduced general protein synthesis are not observed in VWM; strikingly, the level of eIF2α phosphorylation is reduced, general protein synthesis appears at a normal rate, and only the ATF4-regulated transcriptome is continuously expressed in VWM astrocytes.

      This point is not well-settled, as published studies show that the ISR is activated in VWMD despite decreased eIF2α phosphorylation (10, 11). Moreover, published scRNA-seq studies of mice with VWMD mutations show that the ISR transcriptome is activated in oligodendrocytes, as well as neurons, endothelial cells and microglia (8). We will address this concern in the revision by citing these published reports that show both decreased eIF2α phosphorylation and lines of evidence that support ISR activation.

      Fritsh et al. show that MCL-1 protein synthesis is reduced by increased eIF2α phosphorylation due to reduced translation rates at the Mcl-1 mRNA and not due to differences in Mcl-1 mRNA levels.

      We agree with this interpretation of Fritsh et al, which is fully compatible with our proposed mechanism. We suggest that ISR activation in VWMD decreases translation of Mcl-1 mRNA, leading to reduced MCL-1 protein expression. MCL-1 protein is rapidly degraded and may therefore be a more sensitive detector of impaired translation than other readouts. We currently cite published work documenting altered translation in VWMD in the manuscript and in the revision will add the reference Moon et al, which is directly on point (11).

      One would a priori not expect to find altered MCL-1 synthesis rates in the mildly affected VWM mouse model Eif2B5R132H/R132H.

      The model does not show reduced global translation under normal conditions, but rather hypo-activity of eIF2B affects the translation of specific mRNAs (12). We will make this point clear in the revision.

      Actually, ISR deregulation has not been reported in the Eif2B5R132H/R132H VWM mouse model. The authors need to rephrase this part of their study taking this information into account, when explaining their experiments and interpreting their results.

      Consistent with the data that the ISR is activated in VWMD, mice show ATF4 up-regulation and other evidence of ISR activation (13) and impaired responses to physiologic stress (14, 15). In the revision, we will add these citations. To address the reviewer concerns, we will state in the revision that ISR activation is one of many potential mechanisms of reduced MCL-1 expression.

      The authors now imply that their study adds mechanistic insight into the VWM field and that is not the case.

      As we describe in response to point 3, we will acknowledge in the revision that the assertion that MCL-1 deficiency causes VWMD is hypothetical.

      In addition, Figure 7C shows differences in actin signal rather than MCL-1 signal, suggesting that transfer of the actin protein from the gel to the blot was not optimal for the middle lanes. MCL-1 protein may thus not be reduced in these samples from Eif2B5R132H/R132H VWM mice.

      We stand by our Western blot data that show that MCL-1 levels are lower in the Eif2B5R132H/R132H VWM mouse model, coincident with the onset of symptoms. The Western blot shown is a representative image that includes 3 biological replicates for each condition, from a total of 12 mice. The quantification demonstrates the reproducibility of the finding.

      • Can the authors show in which cell type was apoptosis found (lines 315-316)? Their study uses the hGFAP - Cre mouse model to generate conditional Mcl-1 knock-out mice. The original paper by Zhuo et al. describing the hGFAP - promoter mouse model suggests that Mcl-1 expression is also affected in neurons and ependymal cells. The authors can investigate this further to assess which cell types (1) are sensitive to apoptosis by Mcl-1 deletion and (2) depend on Bax and Bak.

      Apoptosis may occur at different times in different cell populations, and asynchronous apoptosis can be difficult to detect at any point in time, which can complicate the suggested studies. Despite significant effort, we have not been able to co-localize any markers with dying cells in our model.

      To address the question of neuronal involvement, the revised manuscript will refer to prior published studies (16-18) which show that Mcl-1 deletion affects forebrain neural progenitors. In this context, we will discuss that our Mcl-1 deletion studies show that significant neural progenitor populations survive prenatal Mcl-1 deletion and generate appropriate cortical and hippocampal architecture in Mcl-1-deleted mice at P7, prior to the onset of white matter degeneration.

      To identify involved glial cells, we quantified the cells that were depleted or persisted in the Mcl-1-deleted brain. These studies identified oligodendrocytes and Bergmann glia as cell types depleted during P7-P15, when postnatal degeneration occurs in Mcl-1-deleted mice. In contrast, astrocytes persisted, indicating that astrocytes are not MCL-1-dependent. In the revision, we will add new data quantifying the immature, PDGFRA-expressing subset of oligodendrocytes, which will further specify which cells are depleted by Mcl-1 deletion.

      We share the reviewer’s interest in the question of which subsets of Mcl-1 dependent cells are rescued by co-deletion of Bax or Bak. As known markers may not be sufficient to distinguish these subsets, we consider that scRNA-seq studies are an ideal approach to identify these subsets and their specific gene expression patterns. However, these studies are outside the scope of the present work, which establishes that specific white matter cells depend on Mcl-1.

      • Heterozygous deletion of Bak greatly reduces the number of Bak-expressing cells (Fig. 3C, lines 331-333). Authors need to explain this remarkable finding.

      As we state in the text, the reduced Bak expression in the heterozygous Bak +/- mice is consistent with a gene dosage effect, which has been observed for other genes.

      Please provide raw IHC data.

      Our IHC data is “raw” in the sense of unaltered. We are happy to include a supplementary figure with additional low power and high-power images of BAK staining.

      Co-staining with neuronal, astrocytic or oligodendrocytic markers would be insightful.

      To address this point, we have successfully performed double labeling with antibodies to BAK and with antibodies to the oligodendrocyte marker SOX10 and the astrocyte marker GFAP. We will add these images to the revision. These images show that BAK+ cells include oligodendrocytes and astrocytes. The position and morphology of the BAK+ cells show that they are not neurons.

      In addition, what does the Western blot signal for the BAX protein represent in Bax homozygous knock out mice (Fig. 3C)?

      We will add text stating that the small residual BAX protein detected in the conditional Bax-deleted mice can be attributed to BAX expression in cells outside the Gfap lineage, including endothelial cells, vascular fibroblasts, and microglia.

      Can the percentage of BAX+ cells in Mcl-1/BaxdKO corpus callosum be determined, similarly as was done for BAK? Co-staining with neuronal, astrocytic or oligodendrocytic markers would be insightful here as well. The legend of Fig. 3D does not state what staining is shown (H&E?).

      We were not able to label BAX protein in individual cells using immunohistochemistry. In contrast, BAK immunohistochemistry worked well, allowing us to analyze the cellular distribution of BAK protein. We will revise the legend of Fig. 3D to state that the staining is H&E.

      • What explains the strong GFAP expression in processes of Mcl-1 KO astrocytes? Are these cells refractory to apoptosis or to hGFAP-driven Cre expression and recombination? Do they lack BAK or BAX or other apoptosis-regulating protein? Or do specific factors compensate for the loss of MCL-1?

      As we discuss in our response to point 1 above, not all cells require MCL-1 to prevent spontaneous apoptosis. The persistence of GFAP+ astrocytes in Mcl-1-deleted mice shows that astrocytes do not require MCL-1 to maintain their survival. These data do not mean that these astrocytes are refractory to apoptosis, but rather they are not primed for apoptosis in a way that is critically restricted by MCL-1. We will add a discussion of these implications to the revision.

      • Which developing symptoms do the authors refer to in line 468? Please specify and introduce appropriate references.

      We will add a description of symptoms to the revision.

      • The definition of leukodystrophies given in the paper is outdated. Leukodystrophies are not invariably progressive and fatal disorders. For more recent definition of leukodystrophies see Vanderver et al., Case definition and classification of leukodystrophies and leukoencephalopathies, Mol Genet Metab 2015, and van der Knaap et al., Leukodystrophies a proposed classification system based on pathology, Acta Neuropathol 2017.

      We appreciate this advice. We will revise the Introduction accordingly and cite the recommended work.

      • It is not correct that there is no specific targeted therapy clinically implemented to arrest progression of the disease in any leukodystrophy. Perhaps hematopoietic stem cell transplantation is not specific targeted, although curative if applied in time in adrenoleukodystrophy and metachromatic leukodystrophy, but certainly genetically engineered autologous hematopoietic stem cells would qualify the definition. In any case, the suggestion that no leukodystrophy is treatable is not correct.

      We appreciate this correction. We will revise the text to provide a more detailed description of treatment options while underscoring the need for mechanistic insight.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors characterize the phenotype associated with brain-specific deletion of the mcl-1 gene in mice as a model for vanishing white matter-like disease in humans. Unfortunately, the gfap gene is expressed in many cell types during development which are outside of the intended cell type for this study, so functional data presented from the mutant mice is open to interpretation. The authors have not ruled out other interpretations of their results. The authors need to address major shortcomings in their data interpretation by addressing the following issues.

      We appreciate the concerns related to vision and hearing in the Mcl-1-deleted mice and address them as described below.

      On line 57, the authors indicate that seizures are common in leukodystrophy. This is controversial. Patients may have attacks that look like seizures, but without EEG recordings there is no way to distinguish these events from myoclonus. The authors should note this ambiguity.

      We will note this ambiguity in the revision.

      On line 58, the authors indicate the absence of treatments for leukodystrophies. The authors should review the following articles: PMID: 7582569, 15452666 and 27882623, and moderate the text.

      We will cite these papers and moderate the text as recommended.

      The methods section is lacking in details in several areas. For example beginning line 136, there is virtually no indication of the MRI details without going to secondary literature. The authors should provide a brief description including magnet strength, type of imaging and the general sequence, software used to collect and analyze the images.

      We will include these details in the revision.

      Were the brains actually harvested fresh, where mechanical stresses easily deform brain structure, prior to immersion fixation for 48h? This could be troubling despite the method being previously published.

      Brains were harvested fresh and drop fixed. We have extensive experience over more than ten years in handling brain tissue from neonatal mice and subsequently analyzing MRI images and sections. These methods have allowed us to make quantitative volumetric comparisons of the 3-dimensional architecture of the developing brain using MRI in prior studies that detected genotypic differences in brain growth without confounding fixation artefact (19). We can confirm that no mechanical stress of handling can reproduce the white matter-specific changes that we see in the Mcl-1-deleted brain. We did not detect any abnormalities in control brains subjected to the same handling techniques.

      Beginning on line 126, the authors could at least indicate the fixative details and whether the mice were perfused or tissue was immersion fixed. Compare this lack of detail with the description of lysis buffer beginning on line 158.

      We will add fixation details to the revision.

      Behavioral testing at young ages is rather problematic regarding data interpretation. For example, open field testing (Fig. 2B) at postnatal day 7, which relies on visual cues, is rather dubious when mice do not open their eyes until 12-13 days after birth. How would the pups know if they were in the middle of an open field and exhibit thigmotaxis, even if they were capable of the behavior at such a young age? Thus, the P7 data likely cannot be interpreted in terms of the knockouts being normal.

      We fully agree with the reviewer on the challenges with behavioral analysis of such young mice. The rationale for the open field test was that, at P7, mouse pups are gaining greater control of hind limb function, which can be observed as a transition from pivoting in one place to forward locomotion. Thus, we measured the number of pivots and distance traveled in the open field as indicators for maturation of motor function. Center time was presented to show that, at P7, both WT and knockout mice stayed in the middle (i.e., the groups were at the same stage of limited mobility). We consider that these measures, together with geotaxis and latency to righting (Table 1), provide a developmentally-appropriate neurologic assessment for an age when behaviors are very limited. We will make clear in the revision that these specific tests must be considered together in order to be informative.

      By P14, when the mutants exhibit a phenotype, they are already significantly underweight, which can lead to non-specific phenotypes such as retinal dysfunction or degeneration. Did the authors look for pathological changes in the retina?

      Further, GFAP is expressed in retina of many vertebrate species (PMID 1283834) which would inactivate mcl1 in that tissue and possibly lead to blindness. Indeed, the table at the following link provides a list of tissues in which the gfap-cre transgene is expressed during development. The authors need to address this major issue. http://www.informatics.jax.org/allele/MGI:2179048?recomRibbon=open

      We appreciate this suggestion and we will look for pathology in the retina and optic nerve. Such pathology, if we find it, is likely to be specific, as the optic nerve is myelinated and we have already noted extensive myelination abnormalities in the Mcl-1-deleted mice. If we find retinal or optic nerve abnormalities, we will note the potential for these abnormalities to impact on open field testing.

      For the startle response, which relies on normal hearing, did the authors check to determine if the mutants are deaf? This is very difficult at such a young age, especially prior to tight junction assembly in the lateral wall at around P14. Again, GFAP is expressed in the cochlea at an early age (see PMID 20817025) and may have caused degenerative pathology in this tissue. The authors need to address this major issue.

      The reviewer brings up the potential issue of deafness as a confounding factor for acoustic startle testing. Our results showed that startle responses in the mutant mice were increased at P14, which clearly indicates the mice were able to hear the acoustic stimuli. Further, at P14 and P21, both WT and knockout mice had orderly patterns of prepulse inhibition, providing confirmation of good hearing ability at each timepoint. We will make these points clear in the revision.

      Reviewer #2 (Significance (Required)):

      Unknown.

      The reviewer has not raised specific issues with the significance. We consider the significance of our work to be the finding that oligodendrocyte-lineage glial cells depend on MCL-1 and thus are primed for apoptosis, such that disrupting MCL-1 expression results in catastrophic degeneration of the cerebral white matter. Addressing the reviewer’s concerns described in the section on Evidence, reproducibility and clarity will support this significance.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Cleveland et al. tried to argue that brain-specific depletion of apoptosis regulator MCL-1 reproduces Vanishing White Matter Disease (VWMD) in mice. The authors show that brain-specific MCL-1 deficiency leads to brain atrophy, increased brain cell apoptosis, decreased oligodendrocytes, decreased MBP immunoreactivity, and activation of astrocytes and microglia. It is known that VWMD is a hypomyelinating disorder caused by mutation of eIF2B subunits, which displays severe myelin loss but minimal oligodendrocyte apoptosis or loss in the CNS white matter. In fact, a number of studies show increased oligodendrocyte numbers in the CNS white matter.

      Published reports show decreased normal oligodendrocytes and increased immature oligodendrocyte populations (20).

      The characteristic oligodendrocyte pathology is foamy oligodendrocytes (Wong et al., 2000), rather than apoptosis.

      Foamy oligodendrocyte pathology and increased oligodendrocyte apoptosis are not mutually exclusive. The above referenced paper, Wong et al, in addition to foamy oligodendrocytes, also describes a “decrease in numbers of cells with oligodendroglial phenotype, both normal and abnormal” (21); this decrease is compatible with increased apoptosis. Moreover, published reports specifically describe apoptotic oligodendrocytes in human brains with VWMD (22). To address this point, we propose to include both of these citations in the revision to reference foamy oligodendrocyte pathology in VWMD and to state that this pathologic finding does not exclude a role for apoptosis in VWMD pathogenesis.

      Since the CNS pathology of brain-specific MCL-1 deficient mice is driven by brain cell apoptosis, the relevance of this mouse model to VWMD is very limited.

      Whether apoptosis plays a mechanistic role in VWMD is less clear than this comment suggests, as described in multiple publications (22, 23).

      The title of this manuscript is misleading, and should be changed.

      We accept that our statement that Mcl-1-deletion recapitulates VWMD is premature and not adequately supported by the available data. We will revise the title, introduction and discussion accordingly, to focus on the white matter specificity of the Mcl-1-deletion phenotype.

      Moreover, there are a number of major concerns.

      1. Figure 1 clearly shows severe atrophy of neocortex in Mcl-1 cKO mice; however, the white matter appears largely normal in the cerebellum and brain stem. Mcl-1 cKO mice also display ventricular dilation and possible atrophy of corpus callosum. The authors should discuss severe atrophy of neocortex in Mcl-1 cKO mice and the possibility that ventricular dilation and corpus callosum atrophy result from severe atrophy of neocortex.

      The cortical atrophy that the reviewer notes begins after P7 and is minimal at P14 when white matter loss is already pronounced. At P21, when there is clear cortical thinning, the white matter loss is extreme. Based on the time course, we consider that the white matter loss is the primary pathology, and the cortical thinning is secondary. Importantly, glial cells populate the cortex as well as the white matter and our cellular data show that oligodendrocytes are reduced in the cortex as well as in the white matter structures. Based on these lines of evidence, we consider that the primary cell type affected is the oligodendroglial population of the glia. We will add a discussion along these lines to the revision.

      We agree that the brain stem is preserved. Our data show that the hGFAP-Cre promoter is least efficient in the brain stem and midbrain regions (Sup Fig.1). We will note this differential efficiency in the revision.

      • The motor and sensory tests in Figure 2 are potentially interesting, but their relevance to myelin abnormalities is limited. The authors should perform behavioral tests that are highly relevant to myelin abnormalities.

      The tests presented show progressive neurologic impairment, correlating with the onset of neuropathology. In the revision we will note that ataxia and tremor are common features of leukodystrophies and the Mcl-1-deleted mice show both ataxia and tremor.

      *3. It is to be expected that there are increased apoptotic cells in the brain of Mcl-1 cKO mice. The authors should perform double labeling to demonstrate which cell types undergo apoptosis: neurons, oligodendrocytes, or other cell types? On the other hand, Figure 3A shows that there are substantial numbers of apoptotic cells in the cerebral cortex, which is consistent with severe cerebral cortex atrophy in Mcl-1 cKO mice, suggesting neuron apoptosis in the cerebral cortex. Neuron apoptosis would further rule out the relevance of Mcl-1 cKO mice to VWMD.*

      These studies would be of interest, but we have not been able to co-label apoptotic cells in the Mcl-1-deleted mice with any marker. In the advanced state of apoptosis when dying cells are detectable by TUNEL staining, the relevant marker proteins have been degraded beyond recognition by IHC. In contrast, the apoptotic marker cleaved caspase-3, which is positive earlier in the apoptotic process and might allow marker co-labeling, was not detectably elevated in the Mcl-1-deleted mice. We attribute the lack of cleaved caspase-3+ cells to the asynchronous nature of the increased cell death, and to the short duration in which dying cells are cleaved caspase-3+. While double label studies of dying cells have been problematic, our studies quantifying each cell type provide information to address the reviewer’s question. Our cell counts show clearly that oligodendrocytes are the primary cell type reduced in number in the Mcl-1 deleted mice.

      *4. In Figures 1 and 4, the authors use H&E staining to demonstrate white matter loss. H&E staining is good for showing general CNS morphology; however, it cannot be used to quantify the integrity of the white matter. The authors should perform specific staining to quantify white matter loss in the mouse models.*

      Our MBP stains later in the paper are used to quantify white matter loss.

      *5. Figure 5: MBP IHC is good for showing general myelin staining, but it is not a reliable assay for quantifying myelin integrity in the CNS. The authors should perform electron microscopy analysis to quantify myelin integrity in the CNS in the mouse models.*

      Our studies of MBP staining show that the myelinated area in cross sections is significantly reduced in the Mcl-1-deleted mice. Electron microscopy studies cannot show whether the myelinated area is reduced and studies of myelin integrity are not needed to prove that reduced oligodendrocytes correlate with reduced myelination.

      *6. Figure 6: SOX10 is a marker of oligodendrocytes and OPCs. The authors should quantify the number of oligodendrocytes (using oligodendrocyte markers, such as CC1) and the number of OPCs (using OPC markers, such as NG2). Does deletion of BAK or BAX reduce oligodendrocyte apoptosis in the CNS of Mcl-1 cKO mice?*

      We agree that this is an important question, and we are working to quantify OPCs in the Mcl-1-deleted mice by counting cells labelled with the OPC marker PDGFRA. We will add these data to the revision and discuss their significance when we know what they show.

      *7. The authors show that the level of MCL-1 is comparable in brain lysates of wildtype and eIF2B5 R132H/R132H mice at the age of 7 months, and moderately decreased in eIF2B5 R132H/R132H mice at the age of 10 months. VWMD is a developmental disorder. Similarly, brain-specific MCL-1 deficiency causes developmental abnormalities in the CNS. The normal level of MCL-1 in 7-month-old eIF2B5 R132H/R132H mice strongly suggests that MCL-1 is not a major player involved in the pathogenesis of VWMD. Does brain-specific MCL-1 deficiency starting at the age of 10 months (using CreERT mice) cause CNS abnormalities in adult mice?*

      We agree that Mcl-1 deletion in our model disrupts postnatal brain development. Our studies show that in early life, oligodendrocytes depend on MCL-1 to prevent spontaneous apoptosis. It is an interesting, but separate, question whether Mcl-1 deletion induced in the adult would cause a similar phenotype. The suggested studies would take over a year to conduct, and while they are of interest, they are not required to prove our main point, which is that developmental leukodystrophies may result from the dependence of oligodendrocytes on MCL-1. In the revision, we will state that our comparison of the Mcl-1-deletion phenotype to VWMD is hypothetical, and that additional studies are needed to test this hypothesis.

      *8. Does MCL-1 deletion exacerbate the pathology in eIF2B5 R132H/R132H mice? Moreover, does MCL-1 overexpression rescue the pathology in eIF2B5 R132H/R132H mice? These two experiments are necessary to demonstrate the involvement of MCL-1 in VWMD pathogenesis.*

      We agree that these are interesting and important studies; however, they will require years to complete and extensive resources. These studies are not needed to show that Mcl-1 deletion produces early-onset white matter degeneration, which is our main point. As in our response to point 7 above, we will state in the revision that our comparison of the Mcl-1-deletion phenotype to VWMD is hypothetical, and list these experiments as follow-up studies that are needed to test this hypothesis.

      *Reviewer #3 (Significance (Required)):

      The study will not significantly advance the understanding of VWMD pathogenesis.*

      We recognize that our assertion of a direct relevance to VWMD was premature, and that additional studies, beyond the scope of this project, are needed to determine if MCL-1 deficiency contributes to VWMD pathology. We agree that the available data do not yet inform VWMD pathogenesis, but these data may become relevant to VWMD as follow-up studies are conducted. The data remain highly relevant to the broad group of leukodystrophies as they demonstrate a physiologic vulnerability of oligodendrocytes that sets them apart from astrocytes and neurons, and thus may play a role in disorders in which oligodendrocyte pathology is central.

      Neuroscientists may be interested in the reported findings.

      We appreciate the reviewer noting the significance for neuroscience.

      My field of expertise: oligodendrocyte, myelin, neurodegeneration, ER stress

      References cited:

      1. K. A. Sarosiek, C. Fraser, N. Muthalagu, P. D. Bhola, W. Chang, S. K. McBrayer, A. Cantlon, S. Fisch, G. Golomb-Mello, J. A. Ryan, J. Deng, B. Jian, C. Corbett, M. Goldenberg, J. R. Madsen, R. Liao, D. Walsh, J. Sedivy, D. J. Murphy, D. R. Carrasco, S. Robinson, J. Moslehi, A. Letai, Developmental Regulation of Mitochondrial Apoptosis by c-Myc Governs Age- and Tissue-Specific Sensitivity to Cancer Therapeutics. Cancer Cell 31, 142-156 (2017).
      2. R. Dumitru, V. Gama, B. M. Fagan, J. J. Bower, V. Swahari, L. H. Pevny, M. Deshmukh, Human Embryonic Stem Cells Have Constitutively Active Bax at the Golgi and Are Primed to Undergo Rapid Apoptosis. Mol Cell 46, 573-583 (2012).
      3. T. Ni Chonghaile, K. A. Sarosiek, T. T. Vo, J. A. Ryan, A. Tammareddi, G. Moore Vdel, J. Deng, K. C. Anderson, P. Richardson, Y. T. Tai, C. S. Mitsiades, U. A. Matulonis, R. Drapkin, R. Stone, D. J. Deangelo, D. J. McConkey, S. E. Sallan, L. Silverman, M. S. Hirsch, D. R. Carrasco, A. Letai, Pretreatment mitochondrial priming correlates with clinical response to cytotoxic chemotherapy. Science 334, 1129-1133 (2011).
      4. J. A. Ryan, J. K. Brunelle, A. Letai, Heightened mitochondrial priming is the basis for apoptotic hypersensitivity of CD4+ CD8+ thymocytes. Proc Natl Acad Sci U S A 107, 12895-12900 (2010).
      5. M. Certo, V. D. G. Moore, M. Nishino, G. Wei, S. Korsmeyer, S. A. Armstrong, A. Letai, Mitochondria primed by death signals determine cellular addiction to antiapoptotic BCL-2 family members. Cancer Cell 9, 351-365 (2006).
      6. A. J. Crowther, V. Gama, A. Bevilacqua, S. X. Chang, H. Yuan, M. Deshmukh, T. R. Gershon, Tonic activation of Bax primes neural progenitors for rapid apoptosis through a mechanism preserved in medulloblastoma. J Neurosci 33, 18098-18108 (2013).
      7. D. Rodriguez, A. Gelot, B. della Gaspera, O. Robain, G. Ponsot, L. L. Sarlieve, S. Ghandour, A. Pompidou, A. Dautigny, P. Aubourg, D. Pham-Dinh, Increased density of oligodendrocytes in childhood ataxia with diffuse central hypomyelination (CACH) syndrome: neuropathological and biochemical study of two cases. Acta Neuropathol 97, 469-480 (1999).
      8. Y. L. Wong, L. LeBon, A. M. Basso, K. L. Kohlhaas, A. L. Nikkel, H. M. Robb, D. L. Donnelly-Roberts, J. Prakash, A. M. Swensen, N. D. Rubinstein, S. Krishnan, F. E. McAllister, N. V. Haste, J. J. O'Brien, M. Roy, A. Ireland, J. M. Frost, L. Shi, S. Riedmaier, K. Martin, M. J. Dart, C. Sidrauski, eIF2B activator prevents neurological defects caused by a chronic integrated stress response. Elife 8, (2019).
      9. A. Trimouille, F. Marguet, F. Sauvestre, E. Lasseaux, F. Pelluard, M. L. Martin-Negrier, C. Plaisant, C. Rooryck, D. Lacombe, B. Arveiler, O. Boespflug-Tanguy, S. Naudion, A. Laquerriere, Foetal onset of EIF2B related disorder in two siblings: cerebellar hypoplasia with absent Bergmann glia and severe hypomyelination. Acta Neuropathol Commun 8, 48 (2020).
      10. T. E. M. Abbink, L. E. Wisse, E. Jaku, M. J. Thiecke, D. Voltolini-Gonzalez, H. Fritsen, S. Bobeldijk, T. J. Ter Braak, E. Polder, N. L. Postma, M. Bugiani, E. A. Struijs, M. Verheijen, N. Straat, S. van der Sluis, A. A. M. Thomas, D. Molenaar, M. S. van der Knaap, Vanishing white matter: deregulated integrated stress response as therapy target. Ann Clin Transl Neurol 6, 1407-1422 (2019).
      11. S. L. Moon, R. Parker, EIF2B2 mutations in vanishing white matter disease hypersuppress translation and delay recovery during the integrated stress response. RNA 24, 841-852 (2018).
      12. G. Raini, R. Sharet, M. Herrero, A. Atzmon, A. Shenoy, T. Geiger, O. Elroy-Stein, Mutant eIF2B leads to impaired mitochondrial oxidative phosphorylation in vanishing white matter disease. J Neurochem 141, 694-707 (2017).
      13. L. Kantor, D. Pinchasi, M. Mintz, Y. Hathout, A. Vanderver, O. Elroy-Stein, A point mutation in translation initiation factor 2B leads to a continuous hyper stress state in oligodendroglial-derived cells. PLoS One 3, e3783 (2008).
      14. Y. Cabilly, M. Barbi, M. Geva, L. Marom, D. Chetrit, M. Ehrlich, O. Elroy-Stein, Poor cerebral inflammatory response in eIF2B knock-in mice: implications for the aetiology of vanishing white matter disease. PLoS One 7, e46715 (2012).
      15. L. Marom, I. Ulitsky, Y. Cabilly, R. Shamir, O. Elroy-Stein, A point mutation in translation initiation factor eIF2B leads to function- and time-specific changes in brain gene expression. PLoS One 6, e26992 (2011).
      16. L. C. Fogarty, R. T. Flemmer, B. A. Geizer, M. Licursi, A. Karunanithy, J. T. Opferman, K. Hirasawa, J. L. Vanderluit, Mcl-1 and Bcl-xL are essential for survival of the developing nervous system. Cell Death Differ 26, 1501-1515 (2019).
      17. S. M. Hasan, A. D. Sheen, A. M. Power, L. M. Langevin, J. Xiong, M. Furlong, K. Day, C. Schuurmans, J. T. Opferman, J. L. Vanderluit, Mcl1 regulates the terminal mitosis of neural precursor cells in the mammalian brain through p27Kip1. Development 140, 3118-3127 (2013).
      18. C. D. Malone, S. M. Hasan, R. B. Roome, J. Xiong, M. Furlong, J. T. Opferman, J. L. Vanderluit, Mcl-1 regulates the survival of adult neural precursor cells. Mol Cell Neurosci 49, 439-447 (2012).
      19. S. E. Williams, I. Garcia, A. J. Crowther, S. Li, A. Stewart, H. Liu, K. J. Lough, S. O'Neill, K. Veleta, E. A. Oyarzabal, J. R. Merrill, Y. I. Shih, T. R. Gershon, Aspm sustains postnatal cerebellar neurogenesis and medulloblastoma growth. Development, (2015).
      20. M. Bugiani, I. Boor, B. van Kollenburg, N. Postma, E. Polder, C. van Berkel, R. E. van Kesteren, M. S. Windrem, E. M. Hol, G. C. Scheper, S. A. Goldman, M. S. van der Knaap, Defective glial maturation in vanishing white matter disease. J Neuropathol Exp Neurol 70, 69-82 (2011).
      21. K. Wong, R. C. Armstrong, K. A. Gyure, A. L. Morrison, D. Rodriguez, R. Matalon, A. B. Johnson, R. Wollmann, E. Gilbert, T. Q. Le, C. A. Bradley, K. Crutchfield, R. Schiffmann, Foamy cells with oligodendroglial phenotype in childhood ataxia with diffuse central nervous system hypomyelination syndrome. Acta Neuropathol 100, 635-646 (2000).
      22. K. Van Haren, J. P. van der Voorn, D. R. Peterson, M. S. van der Knaap, J. M. Powers, The life and death of oligodendrocytes in vanishing white matter disease. J Neuropathol Exp Neurol 63, 618-630 (2004).
      23. M. Bugiani, I. Boor, J. M. Powers, G. C. Scheper, M. S. van der Knaap, Leukoencephalopathy with vanishing white matter: a review. J Neuropathol Exp Neurol 69, 987-996 (2010).
    1. Arguments

      I couldn't highlight the section I wanted to highlight since this table is a .jpg, but I wanted to cover the argument of "This has always been done"

      This argument is heard so often, not just in education but in the workforce, politics, and more. My own experience with it comes from my undergraduate studies in landscape architecture, where at the end of every design studio we give a presentation of our work. I would consider a number of the professors in the program to be old-fashioned, but not all of them. The format of the final presentation would always be a PowerPoint slide presentation where visuals of sites would be shown as pictures and photography. A common critique would be that the pictures either looked bad, looked too good, or did not show parts of the site they wanted to see.

      I wanted to make not only the PowerPoint presentations but also the design projects more interactive and engaging, so I developed presentations for my class in a video format. While some forms of interpretation may be lost, such as going back to a slide for a closer look at infographics and images, the viewing experience of the presentation became more engaging and interesting.

      Another process I wanted to incorporate was using real-time modeling tools like Minecraft as a development tool for my site designs. That idea was shot down, but I think that with the accessibility of the application, as well as the realism mod packs available for the game, that dream can become a reality, and it could go even further by implementing VR capabilities. This applies not just to landscape design, but to other forms of media making as well.

    1. Conversely, avoiding species extinction can be seen as the fundamental goal of biodiversity conservation, because while all of humanity’s other impacts on the Earth can be repaired, species extinction, Jurassic Park fantasies notwithstanding, is irreversible.

      I think this is an extremely important thing to focus on, and to educate about. People seem to have the wrong idea about what extinction truly means, especially for plants and pollinator species like bees. Losing these food sources is permanent, and will change the face of the earth. I think it would be a good start to do as we have talked about in previous classes and start early with educating children on what extinction means, and why we must preserve natural habitats for both plants and animals. Yes, we talk about how our kids will not get to see polar bears, but we should also talk about how our kids may not have enough food to feed their families because of the way we are driving plant life and pollinators to dangerous levels of endangerment.

    2. Madagasca

      The rosy periwinkle is found only in Madagascar, as are many indigenous plants. This flower has been proven to fight cancer. There are so many species threatened on Madagascar that we may not even know about. Think of all of the medicines, etc., that could be hiding on this island that we have no idea about and may never know about because of the high rate of extinction currently present on the island due to human interference. https://livingrainforest.org/learning-resources/rosy-periwinkle

    1. Author Response to Public Reviews

      Reviewer #1 (Public Review):

      [...] What is left unclear is what is unique about the fibrotic substrate in ESUS patients in comparison to AFib patients in the future.

      We thank the reviewer for these reasonable and accurate critiques. In the revised version of our manuscript, we offer a more in-depth analysis of potential cohort-scale differences in the spatial distribution of fibrosis between ESUS and AFib patients and how that might affect the overall arrhythmogenicity of fibrotic remodeling between the two populations. We further acknowledge that a comprehensive understanding of the pathophysiological consequences of fibrosis in ESUS will require much more research in the future. Our plans include analysis of how fibrosis might affect LA hemodynamic properties and the likelihood of clot formation. Future work (both clinical and computational) will also be needed to test the hypothesis generated by the present study that ESUS patients lack the triggers needed to initiate AFib. We have added clarifying text to the Discussion section of our manuscript to acknowledge these two points (see lines 286-289, 367-368).

      Reviewer #2 (Public Review):

      [...] 1) As the authors point out, clinical studies have revealed that the fibrotic burden in ESUS patients is similar to those with aFib. The question is why then, do so few ESUS patients exhibit clinically detectable arrhythmias with long-term monitoring. The authors hypothesize and their data support the notion that while the substrate is prime for pro-arrhythmia in ESUS patients, a lack of triggering events may explain the differences between the two groups.

      We thank the reviewer for their kind comment about the level of anatomical and structural variability in our study. We concur that additional analysis of fibrosis spatial pattern properties (local fibrosis density and entropy, as calculated in our previous work) on a region-wise basis between AFib/ESUS and inducible/non-inducible models would add significant value to our work. Accordingly, we have made significant additions to the text including a completely new figure.

      2) I think the authors could go further in describing why this is surprising. Generally, severe fibrosis is thought to potentially serve as a means or mechanism for pro-arrhythmic triggers. This is because damage to cardiac tissue typically results in calcium dysregulation. When calcium overload occurs in isolated fibrotic tissue areas, or depolarization of the resting membrane potential due to localized ischemia allows for ectopic pacemaking, we might expect that the diseased/fibrotic tissue is itself the source of arrhythmia generation. I think the novel finding here is that this notion may be a simplification, and the sources of arrhythmia generation may be more complex and may need to come from outside the areas of fibrosis. I think this is a big deal.

      Patients with stroke were excluded from the AFib cohort because the etiology of stroke in our AFib cohort was not explicitly adjudicated to be cardioembolic, other ischemic such as atherosclerotic, or haemorrhagic and therefore would not allow us to draw reliable conclusions regarding the role of the atrial substrate in stroke in this population. A separate issue is the fact that the cell- and tissue-scale electrophysiology in models reconstructed from ESUS patients was based on the same representation as those used in AFib models. In fact, this was a deliberate design choice to ensure that our modeling results represented a “worst case scenario” for the potential impact of fibrosis in patients with ESUS. Given the fact that our aim is to determine whether there are any differences in the pro-arrhythmic capacity of fibrotic substrate in ESUS and AFib groups, we believe that this is a suitable and justifiable modeling choice – modeling fibrosis differently in the two populations would be difficult to justify due to a lack of good experimental data and would introduce more confounding factors.

      Nevertheless, we agree this is a relevant limitation of our study and we have added an acknowledgement of that fact to our revised manuscript (see lines 361-365).

      3) An acknowledged limitation of the study is the assumption of fixed conduction velocity and action potential duration/effective refractory period. Bifulco et al. base this assumption on previous studies by the group (e.g. L312), which, however, concluded that reentrant driver locations and inducibility are sensitive to changes of action potential and conduction velocity (Deng et al.). For conduction velocity, wider ranges have been reported since the publication of the supporting reference (35) in 1994, e.g. Verma et al.; Roney et al.

      The reviewer’s point is well taken. Accordingly, we have added qualifying language pertaining to RD localization analysis in our Discussion (see lines 323-326). Having said that, we do not think this issue stands to fundamentally change our top-line interpretation of the findings from simulations, as it pertains to the idea that fibrosis in ESUS might plausibly be latent proarrhythmic substrate. The point of the paper by Deng et al. was to analyze sensitivity of reentrant driver localization to altered cell- and tissue-scale electrophysiological properties, not the concept of inducibility per se. It is thus likely that if our entire study were repeated with ±10% CV or APD (both within normal physiological range for average fibrotic atrial tissue) the take-home message would be the same.

      4) The number of pacing sites is rather low for a comprehensive in silico arrhythmia inducibility test but likely a good balance of coverage and computational feasibility considering that the primary goal of this research was to check whether the two groups of models show differences when undergoing the same (but not necessarily exhaustive) protocol.

      We would argue that 15 sites in the LA alone is comparable in coverage to prior studies in biatrial models (e.g., 30 LA/RA sites in Zahid et al. [2016] Cardiovasc Res; 40 LA/RA sites in Boyle et al. [2019] Nat Biomed Eng). We would further stress that our decision to use these specific sites was based on our motivation to simulate triggered activity (i.e., rapid pacing) exclusively from sites identified as common clinically relevant trigger locations documented in AFib patients (see ref. [14] by Santangeli et al. [2016] Heart Rhythm). If we were to instead pace from randomly distributed atrial sites as in prior work, we would jeopardize our ability to draw conclusions on the potential relevance of our simulations to the real-world susceptibility of atrial fibrotic substrate in ESUS patients to ectopic beats originating from realistic locations.

      5) The discussion does a good job in putting the results into context. Two interesting observations that deserve more attention are that i) the Inducibility Score was always higher for AFib vs. ESUS (Figure 6A, no statistical test performed). However, this did not translate to a difference in in silico arrhythmia burden (inducibility). ii) Reentrant drivers were about twice as likely to localize to the left pulmonary veins than the right pulmonary veins in the AFib models (Figure 6D).

      Regarding the first point (i), with corrections made to the fiber mapping process, the statement regarding uniformly higher IdS values in AFib models is no longer true. Moreover, with our revised analysis there is no significant difference in the region-wise inducibility rates (P=0.45). The reviewer’s second point (ii) still stands and is even more pronounced with a ~3x higher rate of localization to the LPV vs. RPV areas in AFib models. Notably, our new region-wise analysis of fibrosis spatial pattern (see new Fig. 6 and our response to major points 4 and 5 above) shows that LPV regions in AFib models in this cohort were much more likely to have the combination of high fibrosis density and entropy previously shown to be highly favorable to reentrant driver localization. However, we recognize that a more thorough analysis will be required to draw truly meaningful conclusions on this subject in the context of either AFib or ESUS patients; this has been briefly noted in our Limitations section (see lines 332-335).

      6) The study succeeded in answering the question it posed in the sense that no marked difference was found between the ESUS and AFib models. This leads to the question what the stroke-inducing mechanism is in the ESUS patients. A hypothesis for future work could be that the fibrotic infiltrations in the ESUS patients reduce the hemodynamic efficacy of the left atrium and render clot formation (e.g. in the atrial appendage) more likely in this way.

      The reviewer’s comment is duly noted and entirely consistent with our plans for future work. In fact, we recently published a white paper (Boyle et al. [2021] Heart) outlining a vision to combine electrophysiological models of the left atrium with biomechanics and hemodynamics simulation to comprehensively understand how fibrosis might influence clot formation. Our revised Discussion emphasizes this exciting trajectory for future work (see lines 370-372).

      7) The negative finding in this study (no difference between groups) does not naturally allow us to draw clinical implications for diagnosis or stratification. Additional ways to put the hypothesis proposed by the authors (fewer arrhythmogenic triggers in the ESUS patients) to test could be to consider readouts/surrogate measures of the autonomic nervous system.

      We have noted in our Discussion (see lines 286-289) that future work could test the hypothesis arising from this project via electrocardiographic monitoring in ESUS patients with different levels of fibrosis. Concerning the idea of using direct readouts of autonomic tone, we chose to leave this out since we are unaware of any clinically available systems. The usefulness of surrogate measurements (e.g., heart rate variability) in this context also remains unclear.

      Reviewer #3 (Public Review):

      [...] 1) As the authors point out, clinical studies have revealed that the fibrotic burden in ESUS patients is similar to those with aFib. The question is why then, do so few ESUS patients exhibit clinically detectable arrhythmias with long-term monitoring. The authors hypothesize and their data support the notion that while the substrate is prime for pro-arrhythmia in ESUS patients, a lack of triggering events may explain the differences between the two groups.

      We thank the reviewer for these kind remarks. It is encouraging to have our results interpreted so elegantly and accurately. We are excited to test this new hypothesis (and others prompted by the peer review process for this manuscript) in future studies.

      2) I think the authors could go further in describing why this is surprising. Generally, severe fibrosis is thought to potentially serve as a means or mechanism for pro-arrhythmic triggers. This is because damage to cardiac tissue typically results in calcium dysregulation. When calcium overload occurs in isolated fibrotic tissue areas, or depolarization of the resting membrane potential due to localized ischemia allows for ectopic pacemaking, we might expect that the diseased/fibrotic tissue is itself the source of arrhythmia generation. I think the novel finding here is that this notion may be a simplification, and the sources of arrhythmia generation may be more complex and may need to come from outside the areas of fibrosis. I think this is a big deal.

      This is an excellent point and we strongly concur that the “trigger-centric” interpretation of the pathophysiological consequences of fibrotic remodeling should be reconsidered. To further reinforce this fact, we ran additional simulations to rule out the possibility that there might be exaggerated resting membrane potential depolarization in AFib but not in ESUS, which might provide an alternative explanation for the clinical manifestation of arrhythmia in the former but not the latter. Our new results support the point raised by the reviewer and, in our opinion, increase the overall impact of the work.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors aimed to understand the control and the elimination of disseminated tumor cells by NK cells within the lung, their main question being how pulmonary NK cells are able to prevent tumor cells from colonization in the lung.

      To dissect this question, Hiroshi Ichise and colleagues took advantage of the ultra-sensitive bioluminescence whole body imaging system combined with intravital two-photon microscopy technology involving genetically-encoded biosensors tumor or NK cells to explore the behavior and functional competences of NK cells in an experimental lung metastasis model.

      First, the authors monitored the fate of intravenously injected B16-Akaluc cells from 5 min to 10 days and observed that tumor cells decrease rapidly within the first 12-24 hours. In parallel, they depleted asialoGM1+ and NK1.1+ cells by injecting depleting anti-aGM1 and anti-NK1.1 antibodies in order to assess the involvement of these populations in the elimination of the disseminated tumor cells. They conclude that the rapid decrease of the tumor cells is mediated by NK cells. Consistent with these first data, the authors also observe the same early NK cell-mediated impact on two other syngeneic mouse tumor cell lines: the BRAFV600E melanoma and the colon adenocarcinoma MC-38.

      In a second part, the authors dissected the dynamic behavior of NK cells in the pulmonary capillaries by taking advantage of the NKp46iCRExrosa26dtTomato mice, in which NKp46+ cells are fluorescent, and performed 2P intravital imaging to follow the behavior of NKp46+ cells in situ. They could nicely observe that NK cells arrive from the capillaries and patrol on the lung epithelial cells in a stall-crawl-jump manner. Moreover, they also show that the attachment to the pulmonary capillaries is mediated by LFA-1. In the presence of the B16F10 tumor model, they observe that NK cells stay longer in the capillaries and crawl for longer, indicating that NK cells stay in contact longer with tumor cells.

      The authors then explored NK-mediated tumor killing in the lung by measuring tumor cell apoptosis using B16F10-SCAT3 cells (which allow visualization of caspase-3 activation) and Ca2+ influx in tumor cells expressing two Ca2+ sensors, GCaMP6s and R-GECO. They could observe caspase-3 activation but also Ca2+ influx in tumor cells within a few minutes after encountering NK cells. They also observe that evasion of NK cell surveillance is mediated by Nectin-5 and Nectin-2 expressed on tumor cells.

      Then, they focus on NK cell activation by looking at ERK activation. To do so, they isolated NK cells from Tg mice expressing a FRET-based ERK biosensor and performed an in vitro killing assay against B16-R-GECO tumor cells, as well as in vivo experiments. For the in vivo experiments, they developed reporter mice whose NK cells express the FRET biosensor for ERK. They observe that ERK-dependent NK cell activation contributes to the elimination of disseminated tumor cells within the first few hours but not after 24 hours. Indeed, they observe that B16F10-Akaluc tumor cells are equally eliminated when injected 24 h after a first injection of B16F10 or PBS in mice. The authors concluded that tumor cells acquire the capacity to evade NK cell surveillance after 24 h, rather than NK cells losing tumoricidal activity over time.

      Finally, the authors followed up on their last result, the potential evasion of NK cell surveillance by tumor cells. They show that this evasion is mediated by the shedding of cell surface Necl-5. They next show that cleavage of the extracellular domain of Necl-5 is mediated by thrombin in vitro and that anticoagulants such as Warfarin, Edoxaban or Dabigatran Etexilate promote tumor elimination, as observed in the bioluminescence experiments. This loss prevents the NK cell signaling needed for effective killing of tumor targets.

      However, most of the results remain correlations and have not been formally demonstrated or lack controls.

      B16F10 is a well-known and well-characterized NK cell target in in vivo models, so the first part is not really new except for the in situ behavior of NK cells within the lung capillaries. The new mechanism of thrombin-mediated shedding of Necl-5 causing evasion from NK cell surveillance is concentrated in the last figure (Fig. 6), and some supplemental experiments are needed to confirm this claim.

      Response: We deeply appreciate the reviewer’s effort to evaluate our work. The reviewer criticizes that the mechanism is well known except “the in situ behavior of NK cells within the lung capillaries.” Indeed, this is what we wish to emphasize in our work. Nobody has ever seen how NK cells kill metastatic tumor cells in the lung. There is a big gap between in vitro tissue culture experiments and in vivo macroscopic counting of metastatic nodules. Most researchers do not even know when and where in the lung NK cells kill metastatic tumor cells. Live imaging is a powerful approach to address such questions.

      Reviewer #1 (Significance (Required)):

      There are several points to address to improve the significance of these data.

      **Major points**

      1) A global point: 3 mice/group is too small to analyse and interpret data because of the heterogeneity of the mice. Mean +/- SEM should be presented instead of SD.

      Response: For the sake of animal welfare, researchers are asked to use the minimal number of mice. Moreover, only one mouse can be observed in each imaging session, which takes several hours. In most cases we performed two independent experiments with three mice each. We believe the number is appropriate for this type of experiment. With a small number of samples, we think SD is better than SEM.
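
For readers weighing the SD versus SEM point in this exchange, the relation between the two is SEM = SD / sqrt(n), so at n = 3 the SEM is always about 58% of the SD and error bars drawn with it look correspondingly tighter. The sketch below only illustrates that arithmetic. The function name `sd_and_sem` and the three readings are hypothetical and are not taken from the manuscript.

```python
import math
import statistics


def sd_and_sem(values):
    """Return (sample SD, SEM) for a list of measurements."""
    n = len(values)
    sd = statistics.stdev(values)  # sample SD, uses the n - 1 denominator
    sem = sd / math.sqrt(n)        # SEM shrinks with the square root of n
    return sd, sem


# Hypothetical readings from three mice (arbitrary units), for illustration only.
example = [0.8, 1.0, 1.2]
sd, sem = sd_and_sem(example)
print(f"SD = {sd:.3f}, SEM = {sem:.3f}, SEM/SD = {sem / sd:.3f}")
```

SD describes the spread between animals, while SEM describes the uncertainty of the mean, which is why the choice matters more when groups are small.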

      2) The authors used the well-known polyclonal anti-asialoGM1 Ab to deplete NK cells. AsialoGM1 is also expressed by ILC1, T, NKT and gd+ T cells as well as basophils (Trambley J et al., Asialo GM1(+) CD8(+) T cells play a critical role in costimulation blockade-resistant allograft rejection. JCI, 1999). The authors checked the involvement of basophils only. They have to check the depletion of each of these populations specifically in the lung to assume that the depletion affects only NK cells, or they must change their conclusion throughout the manuscript and say that not only NK cells but possibly also ILC1, NKT and/or gd+ T cells are responsible for and involved in the control of disseminated tumor cells.

      Response: We obtained similar observations by using BALB/c nu/nu mice, which lack T cells. Therefore, we can exclude the contribution of T cells at least in the acute phase (

      3) Lines 133 to 136: The authors say that they « did not observe any significant difference in the relative increase of the bioluminescence signal between the control and αAGM1-treated mice, implying that NK cells eliminate disseminated melanoma cells primarily in the acute phase (

      Response: After 24 hrs, the rate of increase of bioluminescence intensity (BLI) did not differ significantly between αAGM1-treated mice and control mice. In both groups, the doubling time of melanoma cells is approximately one day.
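
The doubling-time statement can be made concrete under the assumption that the bioluminescence signal grows exponentially between two time points, in which case the doubling time is elapsed time × ln 2 / ln(signal_end / signal_start). The following is a minimal sketch of that formula only. The function name `doubling_time` and the numbers in the usage line are hypothetical and do not come from the authors' data.

```python
import math


def doubling_time(signal_start, signal_end, elapsed_hours):
    """Doubling time in hours, assuming exponential growth of the signal."""
    if signal_start <= 0 or elapsed_hours <= 0:
        raise ValueError("signal_start and elapsed_hours must be positive")
    if signal_end <= signal_start:
        raise ValueError("signal must increase for a doubling time to exist")
    return elapsed_hours * math.log(2) / math.log(signal_end / signal_start)


# Hypothetical example: a signal that quadruples over 48 h.
print(f"{doubling_time(1.0, 4.0, 48.0):.1f} h")
```

Equal doubling times in two groups correspond to equal slopes of log(BLI) against time, which is the comparison the response describes.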

      4) Fig S3A-B: The authors say that basophils express aGM1, so they tested basophil involvement in the elimination of B16F10 tumor cells with the depleting aCD200R3 mAb. They also checked the involvement of neutrophils and monocytes. They observed that basophils, neutrophils and monocytes are not involved in B16F10 elimination. But what is the hypothesis behind assessing the role of neutrophils and monocytes? Moreover, they did not explore the role of basophils in the other models, including MC-38, BRAFV600E and 4T1 tumor cells.

      Response: We depleted neutrophils and monocytes because antibody-mediated removal of leukocytes could have non-specifically increased the survival of tumor cells. As for expanding the experiments to different cell lines, we are afraid that this would be too great a burden, considering the time required for the experiments and animal welfare.

      5a) Fig 1D: Missing control: the authors must add WT BALB/c + a-AGM1 as a control.

      Response: We have this data, which will be included in the revised paper.

      5b) Lines 154 to 156: the authors say that « T cell immunity does not contribute to tumor cell reduction » because tumor cells are eliminated in the nu/nu mice as efficiently as in the WT BALB/c mice. This is not correct because they are looking at a window that corresponds to innate immunity activation (up to 24h), so they cannot talk about T cell immunity; the adaptive response comes later, around 8 days after.

      Response: Yes, we are focusing on the early phase of the rejection of metastatic tumor cells. We will rephrase the sentences.

      6) Line 159: (refer to point #2) To affirm that NK cells are critical for and involved in the elimination of the disseminated tumor, the authors have to perform experiments in a model of NK cell deficiency. The most relevant nowadays is the NKp46ICRExrosa26DTA mouse, which is deficient in NK cells but also in ILC1 cells. Indeed, the authors have used the NKp46iCre mouse model for other questions.

      Response: As the reviewer stated, the contribution of NK cells in the rejection of metastatic tumors is very well known. We do not think we need to repeat the experiments by using other genetically modified mouse lines, which will take at least one year. We wish to emphasize again that the new findings of our paper are in the in vivo imaging.

      7a) Fig 2F : IC missing

      Response: According to the reviewer's suggestion, we will perform control experiments with an isotype control.

      7b) Lines 181-182: The authors conclude that NK cell adhesion to the pulmonary endothelial cells is mediated primarily by LFA-1. This is not entirely true because it is only partially mediated, as observed in Fig 2F. So the authors should change their conclusion and specify that the adhesion is partially mediated by LFA-1.

      Response: We will rephrase the result section in the revised paper.

      8) Fig S5B-C-D and S7: The authors talk about tumor cell death. But they are analyzing Ca2+ influx in vitro, which is somewhat different from cell death. I'm wondering how cell death is measured, especially in Fig S5D and S7?

      Response: Under microscopes, apoptosis can be easily recognized by the appearance of blebs. We will include videos in the revised paper.

      9) Fig 4H and lines 232-233: the authors conclude that « damage to tumor cells is dependent on the engagement of DNAM-1 on NK cells ». No experiment was performed to support this point, so the authors cannot maintain this conclusion. First, the authors only analyzed Ca2+ influx at a specific time point. So this result only shows that Nectin-5 and/or Nectin-2 expressed by B16F10 is involved in the Ca2+ influx following NK cell contact, but there are no data on the contribution of DNAM-1. So, the role of NK cells, and specifically of DNAM-1+ NK cells, has not been addressed here. To answer that question, the authors have to perform an in vivo model of engrafted WT vs Necl-5/2 KO B16F10 in a WT vs DNAM-1-deficient NK cell mouse model to ascertain the contribution of Necl-5/2-DNAM-1 on NK cells. Moreover, survival curves and bioluminescence experiments would be very much appreciated.

      Response: We have shown the data with Necl-5/Nectin-2-deficient B16F10 cells in Fig. S7. We understand the importance of the experiment with DNAM-1-deficient mice, but another knockout mouse line cannot be introduced easily. Instead, we will tone down the conclusion on the requirement of signaling from Necl-5/Nectin-2 to DNAM-1.

      10) Lines 253-254 : the authors talk about tumor apoptosis but they are looking at Ca2+ influx. So, they should change their conclusion or show killing experiment.

      Response: In Figure S7, we have shown that the sustained Ca2+ influx is a useful surrogate marker for apoptosis. We will include this information explicitly in the revised paper.

      11) Fig 6: the authors conclude that the thrombin-dependent shedding of Necl-5 causes evasion of NK cell surveillance. However, all experiments are correlations and do not involve Necl-5, DNAM-1+ NK cells and thrombin or anticoagulants in the same experiment. So, as in comment #9, to address this point, the authors should inject WT vs Necl-5-deficient B16F10-Akaluc into WT vs NK cell-depleted mice and monitor the bioluminescence of the tumor cells within 24h following injection of anticoagulants, as in Fig 6H. Moreover, monitoring the survival curve and the number of lung metastases would also be very important and informative to really answer this point.

      Response: We will try the requested experiments during revision.

      **Minor points**

      1) Fig 2E: The authors assess the involvement of LFA-1 and MAC-1 in NK cell attachment to the pulmonary endothelial cells. But there are other adhesion molecules known to be expressed by NK cells, for example CR4 (CD11c/CD18). So, the attachment of NK cells could also be due to this molecule.

      Response: We agree. The text will be modified to suggest the involvement of other adhesion molecules.

      2) Lines 190 to 197 : Authors should put this methodology part in the « material and method » in order to be more clear on the message they want to deliver.

      Response: We will modify the text according to the suggestion.

      3) Line 228: There is no hypothesis or explanation regarding the use of Necl5/Necl2-deficient B16F10. Why did the authors decide to explore this pathway? The authors could add a transition sentence and explanation to help readers.

      Response: We will refer to previous papers suggesting the role of DNAM-1 and its ligands, Necl-5 and nectin-2.

      4) The authors could perform the same experiment as in Fig S7D and assess ERK activation of DNAM-1+ vs DNAM-1- NK cells against WT vs Necl-5/Necl-2 KO R-GECO B16F10 cells.

      Response: We will try the suggested experiments.

      5) Line 283: Please reformulate the sentence. Check the figures associated with the text.

      Response: We will correct this error. The figures will be Fig. 5E and 5F.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors use in vivo imaging techniques to investigate the killing of lung metastases by NK cells. They demonstrate that the cleavage of CD155 may result in resistance to killing by NK cells and suggest that this could be an immune evasion mechanism of metastatic tumor cells.

      Overall, the subject is highly relevant, and the in vivo imaging is an interesting and highly relevant technique. However, the message, that tumor cells escape the killing by NK cells by cleavage of CD155 is interesting, but not yet fully supported by the data.

      **Major comments:**

      • Figure 6: To support their main claim, the authors would need to transfect the tumor cells with a CD155 mutant that cannot be cleaved by thrombin and show that these tumor cells can no longer escape NK cell-mediated killing. This experiment is straightforward and feasible. Another important experiment along this line would be to use the CD155/CD112-deficient tumor cells (which the authors use in Figure 4) in the experiments shown in Figure 1. One would expect that tumor control by NK cells within the first 24h is absent when using these tumor cells.

      Response: We previously made five CD155 mutants that could be resistant to thrombin-mediated cleavage and re-expressed them in CD155/CD112-deficient tumor cells. However, none of the mutants was killed by NK cells, either in vivo or in vitro. It appears that the potential thrombin-cleavage site(s) reside in the site recognized by DNAM-1. We will include this observation in the discussion.

      • Figure 5: The demonstration that ERK is activated in this in vivo setting is novel. However, ERK activation is not DNAM-1 specific and the ERK inhibitor is significantly less effective than the depletion of NK cells. Therefore, the relevance of these data to the main message of the manuscript is unclear and the figure could be omitted.

      Response: We agree that the modest effect of MEKi implies that ERK activation is dispensable for NK activation. However, ERK activation is a useful marker of NK cell activation. The data shown here vividly show the timing of NK cell activation and following tumor cell killing. Because the in vivo dynamics of NK cell activation and tumor cell killing is the most important message of this work, we wish to show this data.

      • In general, the issue of NK cell exhaustion should be addressed in more detail. The experiments do not address the serial killing activity of NK cells, and more data are needed to show that it is not exhaustion of NK cells but the cleavage of CD155 from the tumor cells that prevents further killing.

      Response: We believe Fig. 5G clearly shows that NK cells are not exhausted 24 hours after tumor cell injection.

      **Minor comments:**

      • Figure 1C: The relevance of this experiment needs to be better explained.

      Response: We will rephrase the result section in the revised paper.

      • Figure 3A: What does SHG stand for?

      Response: It is given in line 625 of the M&M section. We will state in the figure legend that SHG stands for second harmonic generation.

      • Figure 3: Please add a statistical analysis for these experiments.

      Response: We will include P values in the revised paper.

      • Figure 4: The use of the caspase-3 and the calcium sensors may detect different cytotoxic mechanisms used by the NK cells. While caspase-3 can be activated by death receptor and perforin/granzyme B mediated killing, the calcium sensor may report mostly on perforin-mediated membrane damage. These killing mechanisms have different kinetics and are differentially used during serial killing by NK cells. This should be addressed (at least in the discussion).

      Response: We thank the reviewer for this invaluable comment. We will include this discussion.

      Reviewer #2 (Significance (Required)):

      Investigating the in vivo cytotoxicity of NK cells against tumor cells by using live imaging technologies is highly relevant for understanding the dynamic relationship between tumor and killer cells. Therefore, the subject of this manuscript and the technologies used are very relevant, as in vitro killing activities do not always translate to the in vivo setting.

      Response: We thank the reviewer for the favorable comment.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      Ichise et al. present a solid work describing the modality and time frame of NK cell control over seeding metastatic cells within the lung vasculature. The authors use a variety of techniques to dissect how NK cells patrol the lung vasculature, how they interact with cancer cells as they interact with endothelial cells, and how ERK-dependent activation leads to calcium influx in cancer cells and then to their death. The data support the notion that this NK control occurs over an early time frame, 4h after cancer cell arrival, and is mediated by Necl expression on cancer cells. After this time point, cancer cells show a thrombin-dependent loss of Necl expression on their surface and therefore become resistant to NK control.

      **Comments:**

      The data presented support the conclusions. This work utilizes a variety of elegant strategies, combining reporters with in vivo imaging to assess interaction, ERK activation, calcium influx and apoptosis activation directly in the lung.

      In terms of experiments, I found the work thorough and complete.

      The data are presented well overall and the statistics seem adequate.

      I only have a few suggestions:

      Supplementary Figure S3 shows the use of anti-Ly6G to deplete neutrophils in the lungs of C57BL/6 mice injected with melanoma B16F10 cells. It was recently shown that this antibody is not efficient in depleting neutrophils in this background, but only leads neutrophils to internalise Ly6G so that they cannot be detected by FACS. As shown in Boivin et al. 2020 (http://doi.org/10.1038/s41467-020-16596-9), neutrophil depletion in C57BL/6 mice can be achieved by using anti-Gr1 antibody. Therefore, if the authors aim to show this additional control, which I also agree is really good to have, I suggest performing the experiment according to best-known practice.

      Response: We will perform the suggested experiment.

      Figure 1E: in the text the experiment is described as 4T1 Akaluc cells being inoculated into the footpad of BALB/c mice with either control antibody or αAGM1, but the legend states that mice were subcutaneously injected with B16 Akaluc cells into the footpad.

      As B16 melanoma cells are not of BALB/c background, I assume the legend needs to be corrected, as the cells should be 4T1. However, I wonder if injecting 4T1 breast cancer cells in the footpad could have led to the substantial growth required for lung metastasis without impairing the animal's mobility. Could it be that the cells were actually injected in the fat pad of the mice and this is just a misspelling in the text?

      In this case, the difference in tissue-resident NK cells could also potentially explain why 4T1 are not cleared in the fat pad like the B6 cells are in the footpad.

      The authors should comment on the difference in the clearance of the cells at the injection site in Figure 1C vs Figure 1E.

      Response: We apologize for the error in the legend.

      Figure 1C was performed to examine whether NK cells in the lung could be exhausted or inert 14 days after the inoculation of B16F10 cells. In this experiment, Akaluc-expressing B16F10 cells were inoculated to monitor the bioluminescence for 24 hrs.

      In Figure 1E, we used Akaluc-expressing 4T1 breast cancer cells because 4T1 cells inoculated into the footpad can spontaneously metastasize to the lung (Kamioka et al., 2017). We observed the bioluminescence of 4T1 cells in the lung for up to 20 days.

      Ref: Kamioka, Y., Takakura, K., Sumiyama, K., and Matsuda, M. (2017). Intravital FRET imaging reveals osteopontin-mediated polymorphonuclear leukocyte activation by tumor cell emboli. Cancer Sci 108, 226-235.

      Reviewer #3 (Significance (Required)):

      The present work is highly relevant to the field of cancer metastasis. While it is known that NK cells are responsible for the first line of defence against metastatic seeding, most studies focus on how they are suppressed or influenced by other immune cells. The present study provides a very accurate description of their mechanism of action and of how they depend on the interaction with endothelial cells, and highlights the novel role of thrombin in inducing cancer cell resistance to NK cells. What causes thrombin activation is the next relevant question, but in my opinion this study is complete and important.

      My field of expertise is cancer metastasis and its interaction with the immune system, and I personally very much enjoyed reading this work.

      Response: We thank the reviewer for favorable comments and appreciate the effort to evaluate our work.

    1. SciScore for 10.1101/2020.04.30.20086223:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      NIH rigor criteria are not applicable to paper type.

      Table 2: Resources

      No key resources detected.


      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      This study has several limitations. First, while quotas were used to ensure a sample that was broadly representative of the general UK population, we cannot be certain whether respondents in survey panels are representative of the general population.(16, 17) We also cannot rule out participation bias. Given potential participants were not aware of the topic of the survey before starting it, the risk of this is low. Second, we did not differentiate between outings that were in line with Government guidelines and those that were not in our measure of “total out-of-home activity”. Third, because we used a cross-sectional study design, we are unable to determine the direction of associations. Fourth, due to the large sample size, small differences between groups were statistically significant. Where detected differences were very small, there may not be meaningful influence of these differences (e.g. perceived risk to self). People are likely to change their behaviour in line with their belief of whether they have had COVID-19. Even when tested, the reported result of an antigen test was not necessarily reflected in people’s belief about whether they had had COVID-19. Results from this study indicate that people who think they have had COVID-19 are less likely to adhere to social distancing measures. Clear, targeted communications might be used to advise this constantly growing group both to reduce reliance on self-diagnosis in the absence of a test and to provide advice on what ...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.

      <footer>

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

      </footer>

    1. SciScore for 10.1101/2020.05.05.20091983:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Institutional Review Board Statement</td><td style="min-width:100px;border-bottom:1px solid lightgray">IRB: The study was approved by the institutional review board (IRB) of the Albert Einstein College of Medicine with a waiver of the inform consent (IRB number 2020-11296).<br>Consent: The study was approved by the institutional review board (IRB) of the Albert Einstein College of Medicine with a waiver of the inform consent (IRB number 2020-11296).</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr></table>

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text-align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">All analyses were performed using STATA software (version 14·1; STATA Corporation, College Station, TX, USA).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>STATA</div><div>suggested: (Stata, RRID:SCR_012763)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      On the other hand, our study has several limitations. First, our sample was relatively small, but given the nature of the evolving pandemic, it was of paramount importance to make our early data and findings widely available as soon as possible, especially given the lack of data up to date in COVID-19 in minorities and underserved population. Second, this was a real-world study with a retrospective design utilizing the electronic medical records, which is suboptimal compared to a prospective study that could have more accurate follow-up assessment. Third, the rapidly changing management of COVID-19 might have affected our results but it highly unlikely that could have differentials affected associations between obesity and mortality. Fourth, we handled BMI as a categorical variable in the regression analysis. This can lead to suboptimal conclusions, but we think that specific cut-offs, following established clinical guidelines on obesity, may be of more interest and ease for the clinicians compared to interpretation of continuous variables in a regression model. In conclusion in this early cohort of hospitalized patients with COVID-19 in an underserved, minority-predominant population in the Bronx, we found that severe obesity was associated with higher in-hospital mortality even after adjusting for other pertinent potential confounding factors. Particular attention should be paid in protection of this population given the higher chance for negative outcomes once they are dia...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.

    1. SciScore for 10.1101/2020.08.28.272955:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      NIH rigor criteria are not applicable to paper type.

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Antibodies</th></tr><tr><td style="min-width:100px;text=align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Transfected cells were then incubated overnight at 4°C with monoclonal anti-FLAG M2 mouse antibody (dilution 1:2,000, Sigma-Aldrich).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>anti-FLAG</div><div>suggested: None</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">After 3 washes for 10 min with PBS 0.01% Triton X-100, cells were incubated for 1 h at 37 °C with the Alexa Fluor 647-conjugated secondary antibody donkey anti mouse IgG (H+L) (1:2,000</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>anti mouse IgG</div><div>suggested: None</div></div></td></tr><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Experimental Models: Cell Lines</th></tr><tr><td style="min-width:100px;text=align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">T-REx™ HEK293 cells were grown in Dulbecco’s Modified Eagle’s Medium (DMEM, Gibco) supplemented with 10% fetal bovine serum (FBS, Sigma-Aldrich), GlutaMAX™ and Penicillin-Streptomycin (1x).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>HEK293</div><div>suggested: None</div></div></td></tr><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text=align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td 
style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">BioID data analysis: The proteins were identified by comparing all MS/MS data with the Homo sapiens proteome database (Uniprot, release March 2020, Canonical+Isoforms, comprising 42,360 entries + viral bait protein sequences added manually), using the MaxQuant software version 1.5.8.3.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>MaxQuant</div><div>suggested: (MaxQuant, RRID:SCR_014485)</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">The statistical analysis was done by Perseus software (version 1.6.2.3).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>Perseus</div><div>suggested: (Perseus, RRID:SCR_015753)</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">The tabs ‘Enrichment basal condition’ and ‘Enrichment poly(I:C)’ have been generated entering the lists of high confidence proximal interactors of each viral bait protein in the ToppCluster online tool4 (</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>ToppCluster</div><div>suggested: ( ToppCluster , RRID:SCR_001503)</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">All interactors individual annotations are shown in Supplemental Table 2, which was generated using the Metascape annotation tool5 (https://metascape.org/).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>Metascape</div><div>suggested: (Metascape, RRID:SCR_016620)</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Processing of the images was performed using Zeiss Zen 2 software and assembled using Adobe Illustrator.</td><td 
style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>Adobe Illustrator</div><div>suggested: (Adobe Illustrator, RRID:SCR_010279)</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Our code is written in Python 3.87 and makes use of several modules, primarily: NetworkX8 for graph operations, NumPy9 for array manipulation and numerical computations, pandas10 for data handling and Plotly11 for visualization.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>Python</div><div>suggested: (IPython, RRID:SCR_001658)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      We thus do think that this apparent limitation could be also seen as an asset of our study; (ii) In our opinion, the main limitation sits in the expression of a single protein at a time. Knowing that several viral proteins require viral cofactors, infected context and/or the presence viral RNA to function properly (e.g. NSP10-NSP14 or NSP10-NSP16), the present analysis almost certainly misses cooperative viral interactions. Similar studies performed in infected cells will thus bring highly valuable additional information on putative SARS-CoV-2 pathogenesis mechanisms. As an attempt to mimic a physiopathological context, we artificially induced an anti-viral response by transfecting poly(I:C) and repeating the proximal interactome analysis. These experiments already revealed novel interactions of the utmost importance; and (iii), the proximal interactions are not necessarily physical and should therefore be considered as a discovery step systematically requiring orthogonal or functional validation. However, the proximal interactomics multiple analysis generated by us and others have been at the basis of fundamental mechanism discoveries, supporting the validity of the approach for identifying new biology (see177 for review). This first proximal interaction mapping of SARS-CoV-2 proteins provides a plethora of novel research tracks to better understand this virus pathogenesis. Although for a few proteins our approach did not lead to satisfying results (NSP3, NSP5, NSP8, ORF8 an...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We found bar graphs of continuous data. We recommend replacing bar graphs with more informative graphics, as many different datasets can lead to the same bar graph. The actual data may suggest different conclusions from the summary statistics. For more information, please see Weissgerber et al (2015).


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.

      <footer>

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

      </footer>

    1. SciScore for 10.1101/2020.12.04.20244087:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Institutional Review Board Statement</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr></table>

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text-align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Any N95 that failed the seal check or the saccharine fit-test was further evaluated with a confirmatory quantitative fit-test using the ambient aerosol condensation nuclei counter (PortaCount®) protocol6,7.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>PortaCount®</div><div>suggested: None</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Analyses were performed using StataCorp 2019 (College Station, TX: StataCorp LLC).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>StataCorp</div><div>suggested: (Stata, RRID:SCR_012763)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      Our study has limitations. We did not perform a PortaCount on N95s that passed the seal check or the saccharine fit-test due to limited N95 supplies and we may have overestimated “passes”; however, false passes are infrequent with the saccharin method8. Although we evaluated two of the most commonly used N95 respirators in the United States 9, findings may not be generalizable to alternative models. The number of repeated N95 donnings was based on HCW recall, which may have been under- or over-estimated; however, we do not think there was bias in either direction. We did not sample the N95s to assess for pathogen contamination, a risk of N95 reuse, but our protocol of face shield to protect N95 reduces this risk by preventing droplets from landing on the respirator surface. This study was not powered to assess effectiveness of N95s to prevent SARS-COV-2 infection or other potentially airborne transmitted infections. Notably, no patient-to-HCW SARS-COV-2 transmissions have been documented for HCWs who complied with the recommended COVID-19 precautions at JHH to date (authors’ personal communication). There was missing PortaCount data from some HCWs who failed the seal check or saccharine fit-test; however, we performed a sensitivity analysis to minimize the impact of missing data on interpretation of the study results. In summary, extensive reuse of the N95 models tested in our study seems an acceptable and safe approach during critical supply shortages rather than uniform dis...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.


    1. SciScore for 10.1101/2021.01.22.21250304:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Institutional Review Board Statement</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr></table>

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text-align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Software and reproducibility: Data management was performed using the OpenSAFELY software, Python 3.8 and SQL, and analysis using Stata 16.1.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>Python</div><div>suggested: (IPython, RRID:SCR_001658)</div></div></td></tr></table>

      Results from OddPub: Thank you for sharing your code.


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      Strengths and limitations: We were able to source our cohorts from the OpenSAFELY platform, which contains over 17m adults. This gave us a population of patients who were discharged following hospitalisation with COVID-19 of 31,569, allowing us to obtain precise estimates of the rate of each outcome. We were also able to draw on multiple linked data sources, including primary care records, hospitalisations and death certificates. This allows a more complete picture to be presented of the clinical activity surrounding each outcome. We believe that our use of an active control population of patients hospitalised with pneumonia in 2019 provides useful context for the rates of these outcomes in COVID-19 patients who survive hospitalisation. A comparison cohort could also have been attained by matching patients from the general population on various attributes such as age, sex and comorbidities. However, such a cohort would be lacking the exposure of an acute respiratory illness event requiring hospitalisation. We think presenting the rates in this context is more informative than within a general population. We note that our study aimed to describe clinical events that occurred after discharge from hospital, and therefore may not reflect the true additional morbidity burden of COVID-19 hospitalisation: specifically we did not set out to describe events that occurred during hospital admission with COVID-19 or pneumonia. However, in our view reliable analysis of in-hospital events ...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We found bar graphs of continuous data. We recommend replacing bar graphs with more informative graphics, as many different datasets can lead to the same bar graph. The actual data may suggest different conclusions from the summary statistics. For more information, please see Weissgerber et al (2015).


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a protocol registration statement.


    1. SciScore for 10.1101/2020.04.11.20062158:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Institutional Review Board Statement</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">Population and study period: We included three groups of patients in our study: Group 1 (healthy controls): a randomly selected group of 55 patients who had a serum sample taken for other serologic studies, from October 1 to November 30, 2019 (before the first cases of COVID-19 were reported).</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr></table>

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text-align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Statistical analysis was performed with SPSS v20.0 (IBM Corp., Armonk, NY, USA).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>SPSS</div><div>suggested: (SPSS, RRID:SCR_002865)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      Our study is subject to some limitations. First, it has been conducted in a single hospital. Further multicenter studies are necessary to reinforce our findings. Second, patient selection was made according to the diagnostic needs of our hospital. Consequently, group 3 patients were all patients with negative PCR patients with clinical and radiological criteria of pneumonia and because of that, our results could not be generalized to other patients with COVID-19 and other clinical syndromes. Additionally, group 3 patients also presented a longer evolution time than group 2 patients. This probably explains that the overall positivity rates of the serological test are better than in group 3 (88.9% vs 47.3% in group 2). However, when we focus on patients with 14 or more days from onset of symptoms, the sensitivity and positivity rate increased for groups (91.1% for group 3 and 73.9% for group 2 patients). Because all of these limitations, further studies including all kinds of clinical presentations are needed in order to reinforce our conclusions. The question about the reliability of serologic rapid tests is still under debate (18,19) and more research is needed on this topic. We think that our study may help to point out the usefulness of these rapid tests.

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.


    1. SciScore for 10.1101/2020.05.04.20090878:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Institutional Review Board Statement</td><td style="min-width:100px;border-bottom:1px solid lightgray">IRB: Retrospective collection of patient data was approved by UCI’s Institutional Review Board.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr></table>

      Table 2: Resources

      No key resources detected.


      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      The significantly greater presence of metabolic syndrome in underserved populations11 may provide some insight into the racial-ethnic disparities in COVID-19 incidence, particularly for critical disease. Limitations of this study include the small sample of patients from a single-center, so conclusions may not be broadly generalizable. However, the strengths of this study include the collection of comprehensive data, including race-ethnic and census-tract derived community determinants from all patients presenting with COVID-19 over the study period. Future studies should evaluate the complex interactions of the social determinants of income and ethnicity with other demographic, clinical, and laboratory factors. In summary, our study examines the unveiling of race-ethnic disparities over the first six weeks of COVID-19 in Orange County, CA, and highlights vulnerable populations that are at increased risk for contracting COVID-19 and experiencing disproportionately severe outcomes. While our findings that Hispanic/Latinx populations are at increased risk corroborates reports elsewhere in the United States2, this study demonstrates the increase was most dramatic in minority groups living in disadvantaged communities. When we think of race-ethnic disparities, we often investigate immediate causes of disease, including risk factors. Our descriptive case series illustrates that for COVID-19 disparities, we also need to consider the “causes of those causes,” which ultimately set the...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.


    1. SciScore for 10.1101/2020.06.01.20119149:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Institutional Review Board Statement</td><td style="min-width:100px;border-bottom:1px solid lightgray">IRB: TOCIVID-19, an academic multicentre clinical trial, was promoted by the National Cancer Institute of Naples and was approved for all Italian centres by the National Ethical Committee at the Lazzaro Spallanzani Institute on March 18th, 2020; two amendments followed on March 24th, 2020 and April 28th, 2020.<br>Consent: Informed consent for participation in the study could be oral if a written consent was unfeasible.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">Phase 2 study design and analysis: Sample size for the phase 2 study was initially calculated using 1-month lethality rate as the primary endpoint; based on March 10th daily report on Italian breakout, 1-month mortality for the eligible population was estimated around 15%; 330 patients were planned to test the alternative hypothesis that tocilizumab may halve lethality rate (from 15% to 7.5%), with 99% power and 5% bilateral alpha error.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr></table>
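
      The power-analysis sentence quoted in the table above gives enough numbers for a rough plausibility check. The following is only a sketch under stated assumptions: it treats the design as a one-sample, two-sided test of a proportion and uses the textbook normal approximation. The excerpt does not say which method the trial statisticians used, so the result need not equal the 330 patients they planned. The function name `single_arm_sample_size` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def single_arm_sample_size(p0, p1, alpha=0.05, power=0.99):
    """Approximate n for a one-sample, two-sided test of a proportion.

    Assumption: normal approximation to the binomial; this is NOT
    necessarily the method used in the quoted study.
    p0 = rate under the null hypothesis, p1 = rate under the alternative.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    numerator = (z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))) ** 2
    return ceil(numerator / (p0 - p1) ** 2)


# Values quoted in the report: lethality halved from 15% to 7.5%,
# 99% power, 5% two-sided alpha.
n = single_arm_sample_size(p0=0.15, p1=0.075)
print(n)
```

      Under these assumptions the estimate has the same order of magnitude as the planned enrolment; exact binomial methods or a continuity correction would give a somewhat larger figure than the plain normal approximation.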

      Table 2: Resources

      No key resources detected.


      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      Mostly, retrospective or observational data have been reported so far, not based on prospective hypothesis testing, with prevalently positive results.[8, 16-25] However, our study has several limitations that deserve discussion for a better interpretation of findings. The first limitation is the single-arm study design, which prevents definitive conclusions.[26] However, we think that a randomised controlled trial was unfeasible for many reasons. There was a tremendous pressure to have the drug available, due to a widespread media diffusion of positive expectations and the increasing number of patients hospitalized for the disease, as confirmed by the massive registration of centres when the study began. Thus, obtaining a proper informed consent to randomization would have been extremely difficult also due to patients’ condition and clinical burden. Finally, developing a placebo was impossible, and, within a non-blind study, the risk of cross-over from the control to the experimental arm would have been high, reducing the validity of the randomised trial. Within this context, the problem of “learning while doing” was increased.[27] In our opinion, when the TOCIVID-19 trial started this protocol was the best trade-off between do-something and learn-something. A critical issue of the single-arm design was the definition of the null hypotheses to be tested, already acknowledged in the initial protocol where future modifications of study design were explicitly planned as an optio...

      Results from TrialIdentifier: We found the following clinical trial numbers in your paper:<br><table><tr><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Identifier</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Status</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Title</td></tr><tr><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">NCT04317092</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Active, not recruiting</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Tocilizumab in COVID-19 Pneumonia (TOCIVID-19)</td></tr><tr><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">NCT04320615</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Completed</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">A Study to Evaluate the Safety and Efficacy of Tocilizumab i…</td></tr><tr><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">NCT04381936</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Recruiting</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomised Evaluation of COVID-19 Therapy</td></tr><tr><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">NCT04330638</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Active, not recruiting</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Treatment of COVID-19 Patients With Anti-interleukin Drugs</td></tr><tr><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">NCT04346355</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Terminated</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Efficacy of Early Administration of Tocilizumab in COVID-19 …</td></tr></table>


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.


    1. SciScore for 10.1101/2020.06.08.138990:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Institutional Review Board Statement</td><td style="min-width:100px;border-bottom:1px solid lightgray">Consent: All donors provided written informed consent and tested negative for SARS-CoV-2 at the time of plasmapheresis.<br>IRB: Studies were conducted with the approval of the Houston Methodist Research Institute ethics review board, and with informed patient or legally-authorized representative consent when applicable.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">Generalized Liner model (GLM), using the first plasma donation data only, was performed between the same variables, as a response, and each of the following predictor factors: dyspnea (yes, no), disease severity (five classes as described above), hospitalization (yes, no) gender (male, female), and age combined into five age groups (<=30, 31-40, 41-50, 51-60 and >60).</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Cell Line Authentication</td><td 
style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr></table>

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Antibodies</th></tr><tr><td style="min-width:100px;text=align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Donors were documented to be negative for anti-HLA antibodies, hepatitis B, C, HIV, HTLV I/II, Chagas disease, WNV, Zika virus, and syphilis per standard blood banking practices</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>anti-HLA</div><div>suggested: None</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">The ELISA used to measure antispike IgG antibodies in donor serum specimens was performed as follows.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>antispike IgG</div><div>suggested: None</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">A similar ELISA was used to study anti-spike ECD antibody titers in serum obtained from surveilled asymptomatic individuals</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>anti-spike ECD</div><div>suggested: None</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">All samples were tested with an initial screen assay and IgG antibody titers were subsequently performed on positive samples.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>IgG</div><div>suggested: None</div></div></td></tr><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Experimental Models: Cell Lines</th></tr><tr><td style="min-width:100px;text=align:center">Sentences</td><td 
style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">The virus and plasma mixture was added to Vero E6 cells grown in a 96-well microtiter plate, incubated for 3 d, after which the host cells were treated for 1 h with crystal violet-formaldehyde stain (0.013% crystal violet, 2.5% ethanol, and 10% formaldehyde in 0.01 M PBS).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>Vero E6</div><div>suggested: RRID:CVCL_XD71)</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Diluted plasma was mixed with the SARS-CoV-2 WA1 strain, incubated at 37° C for 1 h, then added to Vero-E6 cells at a target MOI of 0.4.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>Vero-E6</div><div>suggested: None</div></div></td></tr><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text=align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Cells were fixed 24 h post-infection, and the number of infected cells was determined using SARS-CoV-S specific mAb (Sino Biological 401430-R001) and fluorescently labeled secondary antibody.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>SARS-CoV-S</div><div>suggested: None</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Whole genome alignments of consensus virus genome sequence generated from the ARTIC nCoV-2019 bioinformatics pipeline were trimmed to the start of orf1 ab and the end of orf10 and used to generate a phylogenetic tree using RAxML 
(https://cme.h-its.org/exelixis/web/software/raxml/indexhtml).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>RAxML</div><div>suggested: (RAxML, RRID:SCR_006086)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      Limitations: Our study has several limitations. The study was retrospective, only IgG titers were analyzed, and all VN studies were conducted in vitro. Plasma from the convalescent donors was used for VN assays, whereas serum samples were used for ELISA assays. As such, the findings may not be entirely applicable to all antibody testing platforms or other sample types. Conclusions: Taken together, the data clearly show that anti-RBD and anti-ECD IgG titers serve as important surrogates for in vitro VN activity. A substantial fraction of convalescent plasma donors may have VN titers below the FDA recommended cutoff of ≥1:160. Dyspnea, hospitalization, and higher disease severity were associated with higher VN titer. Importantly, a small percentage of asymptomatic individuals have virus-neutralizing antibodies, including some with a titer of ≥1:160. In the aggregate, it is reasonable to think that our findings provide impetus for widespread implementation of anti-RBD and anti-ECD IgG antibody titer testing programs. The resulting data could be useful in several settings, including, but not limited to, identification of plasma donors for therapeutic uses (e.g., convalescent plasma transfusion and/ or source plasma for fractionation in the manufacture of hyperimmune globulin) (5, 11), assessment of recipients of candidate vaccines, assessment of recipients of passive immune therapies, assessment of previously infected individuals, and identification of asymptomatic individuals wi...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.

      <footer>

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

      </footer>

    1. Author Response to Public Reviews

      We thank the reviewers and editors for their detailed and insightful comments. We believe the consequent revisions have greatly increased the overall clarity of the manuscript, and provide important additional context and analysis.

      Reviewer #1 (Public Review):

      We thank the reviewer for the detailed comments.

      [...] Overall, the manuscript lacks substantial statistical support or clear evidence of some of the patterns they are stating and would require a substantial revision to justify their conclusions. The majority of the manuscript relies on 8 infant/mother pairs where they have evidence of pertussis infection and rely on the dense sampling to investigate infection dynamics. However, this is a very small sample size and further, based on the results displayed in Figure 1, it is not obvious that the data have a very clear pattern that warrants their assertions.

      As noted in the introduction, we begin our results with “a descriptive analysis of eight mother/infant pairs where each symptomatic infant had definitive qPCR-based evidence of pertussis infection.” Our goal in this section is to use noteworthy examples to highlight salient epidemiological patterns, which we explore in further detail using data from the full cohort in subsequent sections. We note that the results presented in Fig 3 onwards in no way rely on any arguments and/or specific patterns described in Fig 2. In other words, the original eight pairs revealed several unanticipated findings (particularly the repeated PCR findings with high CT values in the mothers of a child with definite pertussis) that were intriguing and potentially relevant in terms of pertussis epidemiology. They are also unique – we have not seen any published time series data using qPCR in this way before. These early observations motivated us to conduct a more detailed and quantitative analysis of the cohort of >1,300 mother/infant pairs.

      The sample size under consideration in the majority of the manuscript (i.e., all except for the above section) is 1,320 mother/infant pairs (2,640 subjects), as shown in Table 1 and 2. In the original submission, sample sizes were also clearly indicated in Figure 2B (assays per week), Fig 3B (subjects per group), Table 2 (subjects per group), Figures S1-2 (study profile), Figure S3 (NP samples per infant), and Table S1.

      We have revised the panel order and axes labels of the current Figure 3 to more clearly illustrate the relationship between panels, and to clarify that the 6 example pairs shown in Fig 3A are unrelated to the 8 pairs shown in Figure 2. We hope this addresses any remaining confusion.

      While there are some instances with a combination of higher/lower IS481 CT values, it does not appear to have a clear pattern. For example, what are possible explanations for time periods between samples with evidence of IS481 and those without (such as pair A, C, D, E, F and H)? There also does not appear to be a clear pattern of symptoms in any of these samples (aside from having fewer symptoms in the mothers than infants).

      The ambiguity of these patterns played a role in guiding our analysis of the entire cohort, where we establish evidence for infection based on a preponderance of evidence from a large number of individuals.

      Further, it is not obvious how similar or different these observed patterns (such as a mixture of times of high or low values often preceded or followed by times when IS481 was not detected) are to those in the rest of the cohort (in contrast to those who have a definitive positive NP sample during a symptomatic visit).

      The main results are primarily a descriptive analysis of these 8 mother/infant pairs with little statistical analyses or additional support.

      We strongly disagree with this characterization of our results, where we state that “In this analysis, we focus on the 1,320 pairs with ≥4 NP samples per subject (Figure S3)”. We believe the reviewer’s confusion may stem, in part, from a mis-interpretation of Figure 2 (below), along with our erroneous reference to Figure 3 (we incorrectly stated Fig 2, adding to the confusion). With this in mind, we have revised the previous Figure 2 (now Figure 3) in the interest of clarity, and more carefully described exactly what the points displayed in Figure 3 represent.

      The authors do not provide evidence or detail about what is known about the variability in IS481 CT values, amongst individuals, or over time, or pre/post vaccination. Without this information, it is not clear how informative some of this variability is versus how much variability in these values is expected.

      We agree that this is important information, and we have added figures and results summarizing the observed impact of vaccination on CT values (see essential revision 1, above), and the patterns of transitions of CT values across adjacent samples within individuals throughout the study (see essential revision 2). This latter analysis is now summarized in Figure 6, and shows a clear tendency for step-wise transitions over time. The implication is that the data present structure rather than random noise. This supports our overall contention that full-range CT values can provide meaningful insights into pertussis epidemiology. We also note that Fig. 7A (previously Fig 3A) and Table 3 (previously Table S1) do indeed summarize the distribution of CT values, including variability amongst individuals. As noted above, we have also included an additional analysis summarizing the interdependence of CT value on both symptoms and antibiotics (Fig 8-figure supplement 1).

      I think, particularly in Figure 1, that the number of individuals with periods in which IS481 was not detected, falling between times when it was detected, raises concern about whether these data (at this granular a level) are measuring true infection dynamics.

      Adding in additional information about the distribution and patterns of these values for the other cohort members would also provide valuable insight into how Figure 1 should be interpreted in this context.

      We believe our previous comments concerning the relationship between the current Figure 2 (illustrative example) and the remaining figures (cohort analysis) addresses this comment.

      As it stands, the authors do not provide sufficient interpretation and evidence for having relevant infection arcs.

      We have revised the manuscript to clarify that infection arcs are observed in other studies and expected in infected individuals, rather than directly observed and/or quantified in this study.

      It appears that Figure 2A was created using only 8 data points (from the infant data values). If so, this level of extrapolation from such few data points does not provide enough evidence to support to the results in the text (particularly about evidence for fade-in/fade-out population-level dynamics). Also, in Figure 2, it is not clear to me the added value of Figure 2C and the main goal of this figure.

      We believe our previous comments have addressed this point. As noted, we have revised the current Fig 3 for clarity. Figure 3A and 3C are intended to demonstrate the structure of the cohort across the study period. We have revised the caption to clarify this point.

      The authors have created a measure called evidence for infection (EFI), which is a summary measure of their IS481 CT values across the study. However, it is not clear why the authors are only considering an aggregated (sum) value, which loses any temporality or relationship with symptoms/antibiotic use. For example, the values may have been high earlier in the study, but symptoms were unrelated to that evidence for infection - or vice versa.

      We believe that the temporal patterns of CT values within subjects now described in Figure 6 deserve further detailed attention that is outside the scope of the current work. We believe the high-level empirical summaries presented here are strengthened by their reliance on a preponderance of evidence. In the current revision, we have also included additional analyses that we believe address some (if not all) of the reviewer's concerns.

      This seems to be an important factor - were these possible undiagnosed, asymptomatic, or mild symptomatic pertussis infections? It is not clear why the authors only focus on a sum value for EFI versus other measures (such as multiple values above or below certain thresholds, etc.) to provide additional support and evidence for their results.

      Our approach seeks to use an objective statistical summary (geometric mean RCD proportion) to quantify the “signal” contained in IS481 assays within individuals across the course of the study. We note that, while both false positives and false negatives are likely in this study, the sample characteristics of the cohort mean that repeated false positives within individuals are unlikely based on chance alone. Further, a central aspect to our argument is that dichotomizing a continuous variable at an arbitrary threshold is reductive and unnecessarily introduces misclassification that reduces, rather than improves, statistical power.
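
      The general statistical point about dichotomization can be illustrated with a small simulation. The sketch below is not the authors' analysis and uses no study data: it draws synthetic, normally distributed values with an assumed true correlation of 0.3, splits the predictor at its median, and compares the correlation with the outcome before and after the split. The function names, sample size, correlation, and seed are all illustrative assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, rho=0.3, seed=0):
    """Return (r_continuous, r_dichotomized) for synthetic data.

    x and y are standard normal with true correlation rho (an assumed
    value, not taken from the study). The dichotomized predictor is 1
    above the sample median of x and 0 otherwise.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    cutoff = sorted(xs)[n // 2]  # sample median used as the threshold
    binary = [1.0 if x > cutoff else 0.0 for x in xs]
    return pearson(xs, ys), pearson(binary, ys)


r_cont, r_bin = simulate()
print(f"continuous predictor:   r = {r_cont:.3f}")
print(f"dichotomized predictor: r = {r_bin:.3f}")
```

      Under these assumptions the dichotomized predictor shows a weaker correlation with the outcome than the continuous one, which is the loss of information the response refers to. A threshold further from the median would generally attenuate the association more.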

      It is not clear why the authors have emphasized the novelty and large proportion of asymptomatic infections observed in these data. For example, there have been household studies of pertussis (see https://academic.oup.com/cid/article-abstract/70/1/152/5525423?redirectedFrom=PDF which performed a systematic review that included this topic) that have also found such evidence.

      We are aware of the paper above, which we had cited in the discussion. A key limitation of the referenced study is reliance on retrospective recall spanning many months. Since pertussis infections may be mild and non-specific, the fact that household contacts of an index case cannot recall a pertussis-like infection is consistent with asymptomatic infection, but far from definitive evidence. Moreover, the use of seroconversion as the measure of exposure is unreliable, since variations in antibody concentrations can be driven by a number of factors other than natural exposure.

      While cross-sectional surveys may be commonly used in practice, it is not clear that there is no other type of study that provides any evidence for asymptomatic infections.

      Our core argument is that it is impossible to know with certainty that a symptom-free patient with a detecting qPCR on Monday would not have become symptomatic if recontacted on Tuesday. By their nature, cross-sectional studies simply cannot parse asymptomatic from pre-symptomatic infections. To do that, one needs a longitudinal design, as reflected in the aforementioned longitudinal household contact studies. A key consideration addressed in the current work is the extent to which low and/or borderline CT values should be reinterpreted within the context of A) repeated sampling of individuals over time and B) epidemiological surveillance versus clinical diagnosis. We do not claim that our approach is the only one possible.

      Further, it is not clear why the authors refer to widespread asymptomatic pertussis when a large proportion of individuals with evidence for pertussis infection had symptoms. Would it not be undiagnosed pertussis if it is associated with clinical symptomatology?

      We have revised the text to highlight the significance of both asymptomatic and minimally symptomatic pertussis. As we describe both here and in Gill et al. 2016, only a handful of individuals meet the consensus criteria for clinical pertussis (Ct<35). In addition, qPCR results were not available to clinic staff in real-time. This, coupled with the relative absence of severe symptoms during study visits (especially in mothers), meant that only one study participant was diagnosed with pertussis at the time of their visit.

      Reviewer #2 (Public Review):

      We thank the reviewer for their supportive comments.

      This study was done in a population with wP vaccine, I wonder if that's part of the reason many of the CT values are high. Can the authors speculate what this study would look like in a population having received aP for a long period? I'd appreciate more discussion around vaccination in general.

      We have added results summarizing the possible interaction between IS481 assays with infant vaccination.

      We also note that aP is widely used in high-resource settings where overall pertussis incidence is lower, while pertussis diagnosis and treatment are more widely available. Our results indicate that mothers in this population experience non-trivial pertussis incidence over time, yielding immunological profiles from repeated infection that we expect differ markedly from that of individuals who lack naturally-derived resistance to infection via, e.g., mucosal antibodies and tissue-resident T-cells. Recognizing that our study does not provide a direct comparison with aP-vaccinated populations, we nonetheless believe that directly comparable populations (urban poor in under-served communities) are both numerous and under-studied.

    1. Saura has observed: "I have never believed in the child's paradise. On the contrary, I think that childhood is a stage where nocturnal terror, fear of the unknown, loneliness, are present with at least the same intensity as the joy of living and that curiosity of which pedagogues talk so much."3 The intensity of Ana's passions is made so credible that, without any melodrama, we can accept a nine-year-old contemplating suicide and poisoning one of her family elders. Her interior fantasy life is so vivid that when she awakens from a nightmare and discovers her mother's phantom has fled, without any sentimentality, we can identify with her panic and terror and her desperate longing for her mother. The child's perception of adult realities (e.g., her father's sexual adventures, which lead to his fatal heart attack, and his mistreatment of her mother, which is partly responsible for her death) is so convincing because, without fully comprehending all of the events, she intuits the emotional reality. Through her eyes, we are able to see the adults with a double perspective that may also partially reflect the adult Ana's consciousnes

      write about this maybe?

    1. Secondly, does everyone have the building blocks of a better life: education, information, health and a sustainable environment? And does everyone have the opportunity to improve their lives, through rights, freedom of choice, freedom from discrimination, and access to the world’s most advanced knowledge?

      I think that these are all good points. However, I do wonder if things such as equal access to 'top tier' knowledge and education could cripple the growth of societies. If everyone has access to knowledge and resources, I think two bad things could happen. First, because everyone has access to everything, there is no drive to 'push forward' or be innovative, because they have 'everything' they need right in front of them. Second, having access to top tier information means less failure as an entrepreneur or normal person in our economy, and while less failure is better, failure usually means you learn more. Without failure, Elon Musk may not be THE Elon we know today, and so on. The amount of access and databases could potentially hinder the exponential growth of society that we have seen in the past.

    1. Despite what many may believe, online learning is fast, efficient, prioritizes students' own learning pace, adds flexibility, and lets each individual student learn at their own best time. As individuals, we are not all the same. 

      Do you think that online learning for information and more interaction in class is our future?

    1. we have to think about what we can start saving.

      So many people are uneducated about the impacts of hunger, while some may not even be interested in this issue as they have not personally experienced “true” hunger. We can apply pathos by spreading awareness of this issue with frequent commercials or billboards showing the victims of hunger, as well as the ludicrous amount of food being wasted and the faces of the hungry around the world (or of people in surrounding communities). These images can make the issue real for some people, as visuals can be more impactful than words. Words can go in one ear and out the other, while an image can stay in somebody’s mind because seeing it is more of a shock than reading about it. Raising awareness may be the solution, but I also believe that some people are too stuck in their ways to contribute to change, as our world has become very money-hungry and less focused on our surrounding environment.

    2.  There will always be waste. I’m not that unrealistic that I think we can live in a waste-free world. But that black line shows what a food supply should be in a country if they allow for a good, stable, secure, nutritional diet for every person in that country.

      I completely agree with this. It's so normalized to have fast food every day, for breakfast, lunch and dinner. That can be caused by laziness, not knowing how to cook, whatever the case may be. But as humans, we should realize what we put into our bodies, especially for a necessity like food. Home-cooked meals are 10x better than fast food in many aspects. They're so much healthier. They can taste so much better. You can cook food to last several days. And you know exactly what you're putting into your body. Yes, of course there are people who can't cook out there, but what else is life about? We learn something new every day. Cooking is no different. And as time passes, you might even learn to love cooking. Cooking at home benefits not just you, but also the community.

    3. This is the result of my hobby, which is unofficial bin inspections. (Laughter) Strange you might think, but if we could rely on corporations to tell us what they were doing in the back of their stores, we wouldn’t need to go sneaking around the back, opening up bins and having a look at what’s inside.

      Corporations may hide things, but this is what we call least privilege. As employees, we should not be sneaking around and opening up bins. We have our own responsibilities and that should be it. For example, as an employee working at a supermarket, say Giant, your job may be as a cashier. If you are a cashier for that store, you wouldn't be tasked to go through inventory in the back unless specifically requested by your supervisor. It's very suspicious for you to sneak around the back and open up bins. Being the cashier or inventory supplier has nothing to do with opening up bins. Your main role is cashier, so you should focus only on payments and card transactions for items.

    1. Q: So, this means you don’t value hearing from readers? A: Not at all. We engage with readers every day, and we are constantly looking for ways to hear and share the diversity of voices across New Jersey. We have built strong communities on social platforms, and readers inform our journalism daily through letters to the editor. We encourage readers to reach out to us, and our contact information is available on this How To Reach Us page.

      We have built strong communities on social platforms

      They have? Really?! I think it's more likely the social platforms have built strong communities which happen to be talking about and sharing the paper's content. The paper doesn't have any content moderation or control capabilities on any of these platforms.

      Now it may be the case that there is a broader diversity of voices on those platforms than in their own comments sections. This means that a small proportion of potential trolls won't drown out the signal with noise, as may happen in their comments sections online.

      If the paper is really listening on the other platforms, how are they doing it? Isn't reading some or all of it a large portion of content moderation? How do they get notifications of people mentioning them (is it only direct @mentions)?

      Couldn't/wouldn't an IndieWeb version of this help them or work better?

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their insightful comments and suggestions. Addressing them will improve our work. Please find below our point-by-point answers to the issues raised. We also provide a partially revised version of the manuscript with changes indicated in blue.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      The authors propose a mechanism through which voltage-dependent water pore formation is key to the internalization of cell-permeable peptides (CPPs). The claim is based on an in-silico study and on several experimental approaches. The authors compare 5 peptides (R9, TAT-48-57, Penetratin, MAP and Transportan) and use 3 distinct cell lines (Raji, SKW6.4 and HeLa cells), plus neurons in primary cultures. They also present in vivo experiments (mouse skin and zebrafish embryo). All in all, it is an interesting study, but it raises several issues that need to be addressed. Moreover, the length and structure of the manuscript make it very difficult to read (see below under "Reviewer statement").

      **Reviewer statement**

      The instructions are to use the "Major comments" section to answer 6 precise questions. Unfortunately, this is not possible due to the structure of the document to review. The main manuscript (22 pages) comes with 4 primary figures and 19 supplemental ones. Most of these figures have an enormous number of panels and their legends occupy 17 pages. To this, are added 6 supplemental tables and 7 supplemental movies (with 2 pages of legends), 28 pages of Material and Methods, and 146 References (109 for the main manuscript and 37 for Supplemental information). To be frank, I was often tempted to send the manuscript back, asking for the authors to submit a document facilitating the task of the reviewers.

      Because of this complexity, my "Major comments" will come after a page by page, paragraph (§s) by paragraph and figure by figure "Detailed analysis" of the manuscript.

      **Detailed analysis**

      Q1. Page 4 § 3

      The test is based on the ability of TAT-RasGAP to kill the cells. Although controls exist, this is worrying since necrotic death might participate in the rupture of the membrane and artificially amplify internalization after a first physiological entry of the peptide. It is also a bit dangerous to add a FITC group to a short peptide without controlling that it has no effect on the interaction with the membrane (FITC-induced local hydrophobicity can provoke peptide tilting and membrane shearing). In the same vein, the very high peptide concentrations often used in the study (40µM for Raji and SKW6.4 cells and 80µM on HeLa cells) can be highly toxic.

      A1. We took advantage of the fact that TAT-RasGAP317-326 can kill cells to design a CRISPR/Cas9 screen based on cell survival for the identification of genes encoding proteins involved in CPP uptake. For this purpose, it was important therefore that the peptide was able to kill wild-type cells. Even if we consider the possibility that “necrotic death might participate in the rupture of the membrane and artificially amplify internalization after a first physiological entry of the peptide”, it remains that the cells that survived the screen did so because they were carrying mutations in genes that encoded potassium channels required for CPP uptake. And since the cells that survived the screen, by definition, were not dying, the issue raised by the reviewer is void in this case. The reviewer mentions that we included controls to validate the observations made with FITC-TAT-RasGAP317-326. Indeed, these controls were performed to address the potential problem raised by the reviewer. These controls, listed below, demonstrate that the genes identified through the CRISPR/Cas9 screen were also involved in the uptake of CPPs devoid of killing properties as well as CPPs that were not labelled with fluorophores.

      i) Three different cell lines, lacking specific potassium channels identified through the CRISPR/Cas9 screen, were unable to allow a non-labelled, non-toxic CPP (TAT-PNA) to enter cells (Supplementary Fig. 8a).

      ii) The Cre recombinase hooked to TAT, a construct that is not labelled with a fluorochrome and that is not toxic, did not enter Raji cells lacking the KCNQ5 potassium channels, also identified through a CRISPR/Cas9 screen (Supplementary Fig. 8b).

      iii) The internalization of a TAT-conjugated FITC-labelled cell-protective therapeutic compound was inhibited, sometimes fully, in three different cell lines, lacking specific potassium channels identified through the CRISPR/Cas9 screen (Supplementary Fig. 8c).

      Additionally, we are now reporting that the entry of FITC-labelled TAT, R9, and penetratin, all non-toxic CPPs, is impaired in Raji cells lacking the KCNQ5 potassium channel identified in the CRISPR/Cas9 screen. These new results will be incorporated in the revised version of our manuscript.

      Further evidence that a potential toxic effect of TAT-RasGAP317-326 is not a confounding factor in experiments recording the initial uptake of the peptide is that internalization is measured after one hour of incubation with the cells (Figure 1), a time at which the peptide only minimally impacts the survival of cells (PNAS December 15, 2020 117 31871-31881).

      Finally, please note that depolarizing cells, which is what happens in cells lacking the potassium channels identified through the CRISPR/Cas9 screen, not only blocked the uptake of TAT-RasGAP317-326, but also the uptake of a series of non-toxic CPPs (using short-time incubation protocols; Figure 2).

      Page 5 § 1

      Q2. Supp. Fig.1a shows no differences between the 3 cell types, even though they differ in their modes of peptide internalization, some favoring vesicular staining and others cytoplasmic diffusion.

      A2. The images shown in panel A of this figure depict, for each cell line, examples of cells that do not take up the CPP, those that only display vesicular staining, and those that additionally take up the peptides in their cytosol. These images were picked to depict these uptake phenotypes and this is why they are similar in the three cell lines. Panel A does not provide any quantitative information on the prevalence of these different uptake modes in the three cell lines. This is shown in panel B of Supplementary Fig. 1. There are, therefore, no discrepancies between the two panels.

      Q3. Multiplying cell and peptide types contributes to the complexity of the manuscript without increasing its interest. If there is a conceptual breakthrough, as might be the case, it is obscured by the accumulation of useless images and data. A step into simplifying the manuscript would be (i), to concentrate on Raji cells (leaving out SKW6.4 and HeLa cells) and (ii) to only discuss the R9, TAT (including TAT-RasGAP) and Penetratin peptides.

      A3. We are sorry that the inclusion of several cell lines and several CPPs was seen as confusing by the reviewer. Our current view is that our observations are strengthened if we show that the observed effects are seen in several cell lines and with a variety of CPPs. We would therefore prefer not to exclude supportive evidence presented in our work, because removing some of the data shown in the manuscript would weaken some of our claims. We nevertheless remain open on this point, which can be further discussed with the editors.

      Q4. TAT and R9 are poly-R peptides, which is not the case for Penetratin that has only 3 Rs. These 3 Rs are important (cannot be replaced by 3 Ks), but the two Ws absent in R9 and TAT are equally important as they cannot be replaced by Fs. This must be considered by the authors when they tend to generalize their model.

      A4. The point raised by the reviewer concerning the importance of W and R residues in CPPs is well taken. We have now developed this in the discussion with the addition of the paragraphs shown below.

      An additional potential explanation to the internalization differences observed between arginine- and lysine-rich peptides is that even though both arginine and lysine are basic amino acids, they differ in their ability to form hydrogen bonds, the guanidinium group of arginine being able to form two hydrogen bonds1** while the lysyl group of lysine can only form one. Compared to lysine, arginine would therefore form more stable electrostatic interactions with the plasma membrane.

      Cationic residues are not the only determinant in CPP direct translocation. The presence of tryptophan residues also plays an important role in the ability of CPPs to cross cellular membranes. This can be inferred from the observation that Penetratin, despite only bearing 3 arginine residues, penetrates cells with similar or even greater propensity compared to R9 or TAT, which contain 9 and 8 arginine residues, respectively (Supplementary Fig. 9g). The aromatic character of tryptophan is not sufficient to explain how it favors direct translocation, as replacing tryptophan residues with the aromatic amino acid phenylalanine decreases the translocation potency of the RW9 (RRWWRRWRR) CPP2. Rather, differences in the direct translocation promoting activities of tryptophan and phenylalanine residues may come from the higher lipid bilayer insertion capability of tryptophan compared to phenylalanine3-5. There is a certain degree of interchangeability between arginine and tryptophan residues, as demonstrated by the fact that replacing up to 4 arginine residues with tryptophan residues in the R9 CPP preserves its ability to enter cells6. It appears that the loss of positive charges that contribute to water pore formation can be compensated by the acquisition of strengthened lipid interactions when arginine residues are replaced with tryptophan residues. This can explain why a limited number of arginine/tryptophan substitutions does not compromise CPP translocation through membranes**.

      Q5. Supp. Fig1c-d is not necessary (very little information in it) and Supp. Fig 1e is misleading as it takes a lot of imagination to see a difference between homogenous (top) and focal (bottom) diffusion.

      A5. Since we perform cytosolic quantitation to infer direct translocation, it appears important to us to report precisely how we perform our experiments, so that others can potentially replicate our results. For Supplementary Fig. 1e, we agree that the examples shown are not easily interpretable. We have now removed this panel, as well as the accompanying panel f, from Supplementary Fig. 1.

      Q6. Supp. Fig.1g: How many cells are we looking at? Given the high variance, the result cannot be interpreted easily. A distribution according to fluorescence bits would be a better way to present the data.

      A6. Over 230 cells have been quantitated per condition, which includes all cells where CPP entry has occurred regardless of the intensity or the type of entry. We did not only focus on cells with strong cytosolic staining to avoid any bias with regards to detection limitations. High variance can also be explained by the fact that CPP cellular entry is not synchronized. We tested the way of showing the data as suggested by the reviewer but this did not improve the visualization of the results in our opinion. We will therefore keep the initial presentation. Note that regardless of the way the data are presented, the conclusion remains the same, namely that illumination in our hands is not the cause of CPP membrane translocation.

      Q7. Supp. Fig2i. This panel confirms that Raji cells differ from the two other cell types by showing clear temperature dependency. The explanation will come later with the energy barrier for low Vm-induced pore formation. This contradicts earlier reports showing that Penetratin translocation is not temperature-dependent, possibly because it was done on neurons naturally hyperpolarized. Or else because mechanisms are, at least in part, different from the one proposed here for R9 and TAT. This requires some clarification and supports the suggestion that, instead of multiplying models and peptides, it would be more efficient to compare TAT, R9 and Penetratin internalization by Raji cells and primary neurons.

      A7. Supplementary Fig. 1i (not Supplementary Fig. 2i as indicated by the reviewer) was reporting the overall CPP uptake, both through direct translocation and endocytosis, as a function of temperature. As there is limited endocytosis in Raji cells, the data shown for this cell type mostly correspond to direct translocation. For HeLa and SKW6.4 cells, however, endocytosis is not marginal, and we will perform a new set of experiments to define the role of temperature (4, 20, 24, 28, 32°C) in CPP direct translocation (i.e. cytosolic acquisition) in HeLa and SKW6.4 cells (using the CPPs listed by the reviewer). We have partially performed this for HeLa cells already and this shows that direct translocation is indeed inhibited by low temperatures (more than 10-fold at 4°C compared to 37°C). Bear in mind that no endosomal escape occurs in our settings (see Supplementary Fig. 7c). This indicates that the decrease in cytoplasmic fluorescence induced by low temperature is not a consequence of diminished CPP endocytosis.

      Q8. Supp. Fig. 2a-f. Last sentence of the legend "Concentrations above 40µM led to too extensive cell death preventing analysis of peptide internalization". This confirms the warning against the use of concentrations varying between 40 µM and 80 µM and partially jeopardizes the validity of some experiments.

      A8. The reviewer has truncated this sentence that actually reads “Note: concentrations above 40 µM of TAT-RasGAP317-326 led to too extensive Raji and SKW6.4 cell death, preventing analysis of peptide internalization at these concentrations.” As different cell lines display various sensitivities to potential toxic effects induced by CPPs (Raji and SKW6.4 cells being more sensitive than HeLa cells for example), we have adapted the concentrations of CPPs used to monitor cellular uptake so that cell death was minimal or non-existent in order to prevent the potential confounding effects mentioned by the reviewer. Hence, in contrast to what the reviewer is stating, we took toxicity into account and performed our experiments in conditions where toxicity is minimal. The logic of the reviewer to state that we “jeopardize[d] the validity of some experiments” is therefore unclear to us, as we did take care not to expose our cells to toxic CPP concentrations.

      Page 6 § 2

      Q9. The authors advocate 2 modes of entry, opposing transport across the membrane and endocytosis. In contrast with R9, TAT and Penetratin, Transportan or MAP seem to be purely endocytosed but, if they reach the cytoplasm, they still have to cross a membrane (unless "a miracle happens"). For Penetratin and R9/TAT, the authors consider that water pore and inverted micelle formation are incompatible. This is a bit rapid as inverted micelles might induce water pores through W/lipids interactions requiring less R residues and, possibly, less energy. This provides the opportunity to signal that, in spite of their very high number, key references are missing or hidden in cited reviews, some of them written by colleagues who are not among the main contributors to the CPP field.

      A9. Transportan in our hands indeed appears to enter cells mostly via endocytosis. As reported by the reviewer, how Transportan reaches the cytosol remains unresolved.

      Our data support a model where CPPs enter cells via water pores that are not made by the CPPs themselves but that are created by the megapolarization state of the membrane. Our data therefore do not support toroidal or barrel-stave pore models because these pores would be built as a result of CPP assemblage.

      Inverted micelles have been hypothesized to mediate CPP translocation across membranes7 but to our knowledge, there is no in silico or cellular experimental evidence for this in the literature. To us, the data on which the involvement of inverted micelle in CPP translocation is based are also fully consistent with the water pore model. CPP translocation through water pores has been seen by several authors during in silico experiments but, to the best of our knowledge, simulations have not reported the formation of inverted micelles during CPP translocation across membranes.

      Finally, we would be grateful to this reviewer if the “key references” that are apparently missing from our manuscript are disclosed so that we could acknowledge them appropriately.

      Page 7 § 1

      Q10.Fig. 1b confirms that Raji cells provide a good model for loss and gain of function (lovely rescue experiment) and that the authors should drop the two other cell types that provide no decisive information.

      A10. Raji and HeLa cells display a stronger direct CPP uptake impairment phenotype when lacking a given potassium channel (KCNQ5 and KCNN4, respectively). In these cell lines, it appears that one potassium channel predominantly controls the plasma membrane potential. In contrast, in SKW6.4 cells, several potassium channels (e.g. KCNN4 and KCNK5) appear to be equally or redundantly involved in the control of the membrane potential. This probably explains the intermediate impact on the Vm and on CPP direct translocation when knocking out a given potassium channel in this cell line. When pharmacologically inducing cellular depolarization, a clear impairment in CPP translocation is however observed in this cell line. Thus, even though the Vm in SKW6.4 cells is controlled by several potassium channels, it remains that an appropriate membrane potential is crucially required for these cells to take up CPPs across their membrane. We agree with the reviewer that the stronger phenotypic effect observed in Raji and HeLa cells allows easy interpretation. On the other hand, it seems important to us that we provide data reporting intermediate situations so that readers can appreciate the variability that can be observed in different cell lines. Nevertheless, in line with the reviewer’s suggestion, we would like to propose moving the SKW6.4 data from figure 1 to the supplemental data. Feedback from the editors would also be appreciated in this particular instance.

      Page 8 § 1

      Q11. A) Supp. Fig. 6b (no serum conditions) allows for the use of "normal" CPP concentrations and suggests that a fraction of the peptides may bind to serum components. No arrows in Supp. Fig.6b (but in 6c), and the R/pyrene butyrate interaction is not in 6c but in 6a. Still for Supp. Fig. 6c, the death of cells at 20µM (or less) even in the absence of K+ channels, confirms that we are borderline in term of peptide toxicity.

      B)There is a confusion between Supp. Fig. 6d and 6e and a legend problem (6e is not described). Cell death is assessed in % of PI-positive cells. Does this securely distinguish between death and holes allowing for PI entry without death?

      C) The CPP is incubated in the presence of Pyrene butyrate, making the KO cells less resistant. How does that demonstrate that the potassium channels are not involved in the killing if the peptide is already in? Unless the KO is done after internalization (but the cells should be already dead or dying?). This lacks clarity.

      A11. We apologize for the lack of clarity in the legend of Supplementary Fig. 6. This will be corrected in the revised version of the manuscript.

      A) Supp. Fig. 6b (no serum conditions) allows for the use of "normal" CPP concentrations and suggests that a fraction of the peptides may bind to serum components.

      A) The reviewer is correct that CPPs interact with serum components. This is indeed what is reported in this figure. The presence or absence of serum has therefore an important impact in experiments performed with CPPs and should be reported to allow proper interpretation of our data.

      No arrows in Supp. Fig.6b (but in 6c), and the R/pyrene butyrate interaction is not in 6c but in 6a.

      Thank you for noting this. This is now corrected.

      Still for Supp. Fig. 6c, the death of cells at 20µM (or less) even in the absence of K+ channels, confirms that we are borderline in term of peptide toxicity.

      It has to be understood that in Supplementary Fig. 6c, we use the TAT‑RasGAP317‑326 peptide that is inducing cell death when translocating into cells8. This cell death response is not provided by the CPP portion of TAT‑RasGAP317‑326 (i.e. TAT) but by its bioactive cargo (i.e. RasGAP317‑326). The read-out in this particular experiment is therefore cell death and this should not be confused with general CPP toxicity.

      B) There is a confusion between Supp. Fig. 6d and 6e and a legend problem (6e is not described).

      B) This has now been fixed.

      Cell death is assessed in % of PI-positive cells. Does this securely distinguish between death and holes allowing for PI entry without death?

      The answer to this question is yes. In this manuscript we used PI in two very different experimental set-ups.

      i) the conventional cell death detection assay where cells are incubated with 8 µg/ml PI prior to flow cytometry. In this set-up, dead cells with compromised membrane integrity have their nucleus brightly stained with PI.

      ii) the detection of small pores in the plasma membrane (water pores) where cells are incubated with ~30 µg/ml PI and the fluorescence of PI is measured in the cytosol by confocal microscopy. In this set-up, PI enters the cytosol through small plasma membrane pores but does not stain the DNA in the nucleus. This protocol has been previously described9 and we have further validated it in the present work (Figure 3 and Supplementary Fig. 12).

      PI does not fluoresce well unless it binds to DNA. In solution without cells, PI cannot be detected below 128 µg/ml (Supplementary Fig. 12e). At low PI concentrations (8 µg/ml), living cells (even when treated with compounds such as CPPs that create transitory pores) do not display cytosolic PI fluorescence. At high PI concentrations (32 µg/ml), the cytosol of CPP-treated cells becomes PI fluorescent. PI is positively charged and is attracted by the negative membrane potential of the cells. Its movement across the cell membrane is therefore unidirectional. This enables the PI molecules to accumulate/concentrate within the cytosol to values (> 64 µg/ml) allowing its detection (Supplementary Fig. 12a-c). PI and CPPs do not interact (Supplementary Figure 12d); hence they move independently from one another. If PI enters through the water pores induced by CPPs, the entry kinetics of PI and CPPs should be identical. Indeed, this is what we now show in a new figure (refer to our answer #31).
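
      The argument that a positively charged dye is concentrated in the cytosol by a negative membrane potential can be illustrated with a Nernst-type estimate. The following is only a minimal sketch, not part of the analysis in the manuscript: it assumes full electrochemical equilibrium (which transient pores may never reach), treats propidium as a divalent cation (an assumption; the text above only states that PI is positively charged), and uses hypothetical Vm values chosen purely for illustration. The function name is likewise hypothetical.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_accumulation_ratio(vm_mv, charge, temp_c=37.0):
    """Equilibrium inside/outside concentration ratio for an ion.

    vm_mv  : membrane potential in mV (inside relative to outside)
    charge : valence of the ion (assumed +2 for propidium in the example below)
    temp_c : temperature in degrees Celsius
    """
    temp_k = temp_c + 273.15
    return math.exp(-charge * F * (vm_mv / 1000.0) / (R * temp_k))


# Hypothetical membrane potentials, for illustration only.
for vm in (-25.0, -60.0, -90.0):
    ratio = nernst_accumulation_ratio(vm, charge=2)
    print(f"Vm = {vm:6.1f} mV -> equilibrium in/out ratio ~ {ratio:.1f}")
```

      Under these assumptions, any negative Vm gives a ratio above 1, and the ratio grows steeply as the cell becomes more hyperpolarized, which is in line with a cytosolic dye concentration exceeding the concentration in the medium.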

      C) The CPP is incubated in the presence of Pyrene butyrate, making the KO cells less resistant. How does that demonstrate that the potassium channels are not involved in the killing if the peptide is already in? Unless the KO is done after internalization (but the cells should be already dead or dying?). This lacks clarity.

      C) For the pyrene butyrate experiments the rationale was the following. The CRISPR/Cas9-identified potassium channels could either be involved in CPP internalization or they could be required for the killing activity of TAT-RasGAP317-326 when the peptide is already in the cytosol. To experimentally introduce TAT-RasGAP317-326 in the cytosol and to bypass any potential entry depending on potassium channels, we used pyrene butyrate, which efficiently creates an artificial entry route for CPPs into cells. Our data show that when TAT-RasGAP317-326 is introduced in the cytosol through the use of pyrene butyrate, cells die whether or not they lack specific potassium channels. This led to our interpretation that potassium channels do not modulate the cell death activity of TAT-RasGAP317-326 once in the cytosol but that they are required for the entry of the CPP into the cytosol.

      Page 9 § 1

      Q12.The conclusion that the diffuse staining does not come from endosomal escape is based on the certainty that LLOME disrupts both endosomes and lysosomes. First, it should be verified with specific markers (rab5, rab7) that the fluorescent vesicles are endosomes. Second, the literature strongly suggests that LLOME primarily disrupts lysosomes and not endosomes. Finally, even if some endosomes are disrupted, the endosomal population is heterogenous and some CPPs may be in a subpopulation insensitive to LLOME. In addition, the importance of this issue is not well explained. In practice, access to the cytoplasm and nucleus requires crossing the plasma and/or the endosomal membrane and the latter, at least in early endosomes (thus the need of identifying the CPP-enriched vesicles), might not be very different from the plasma membrane.

      A12. The conclusion that diffuse staining does not come from endosomal escape is based on experiments where HeLa cells were incubated in the presence of CPP for 30 minutes to allow CPP entry into cells, then the cells were washed to prevent further uptake (Supplementary Fig. 7c). We only monitored the cells that initially took up the CPP by endocytosis and not through direct translocation (for the HeLa cell line, there is always a substantial fraction of such cells; see Supplementary Fig. 1b). We measured the cytosolic CPP fluorescence intensity in these cells by time-lapse confocal microscopy for 4 ½ hours. The procedure to do this is now explained in new Supplementary Fig. 7c. We then assessed the CPP fluorescence intensity within the cytosol. No increase in cytosolic fluorescence was detected in this condition, speaking against the possibility that cytosolic acquisition of CPPs by the cells resulted from vesicular escape (the identity of the vesicles being unimportant in this context). Our set-up has the potential to detect CPPs in the cytosol if these CPPs leak out from vesicles because we could measure increased CPP fluorescence in the cytosol in cells treated with LLOME. It did not matter in this positive control experiment what types of CPP-containing vesicles are disrupted by LLOME. What was important to show in this control condition was that the disruption of at least some CPP-containing vesicles permitted us to detect a cytosolic signal.

      Page 9 § 2

      Q13. Is Supp. Fig. 7e really necessary? First, as mentioned several times, if 20 µM is a borderline concentration in term of toxicity, raising the concentration up to 100 µM is problematic. Secondly, what matters is not "binding" in general, but binding to the proper membrane components. As mentioned by the authors themselves (Supp. Fig. 1e and movie), there are privileged sites of entry that may correspond to the recognition of specific molecular entities/structures.

      A13. The goal of the experiments presented in Supplementary Figure 7e was to determine whether the CRISPR/Cas9-identified potassium channels modulate CPP/membrane interaction. If those channels were to be required for the initial binding of the CPPs to the plasma membrane, this would have not hampered cells to take up the CPPs. Our data showed (Figure 7e) that Raji cells lacking the KCNQ5 potassium channel had a slightly decreased ability to bind TAT-RasGAP317-326 but importantly, these cells, at similar or even higher initial surface binding compared to wild-type cells (this was achieved by adequately varying the CPP concentrations), were still drastically impaired in taking up the peptide. Note that after one hour of incubation with TAT-RasGAP317-326 in the presence of serum there is only marginal amount of cell death (317-326, we have now performed an additional experiment with TAT that is not toxic to cells that confirms our data obtained with TAT-RasGAP317-326.

      Page 9 § 3 and Page 10 § 1

      Q14.The authors should have used a construct that does not kill the cells much earlier, just after the screening experiments based on resistance to necrosis induced by TAT-rasGAP. For Supp. Fig 8a and b: I am fully convinced by Raji cells and HeLa cells but not by the SKW6.4 cells.

      A14. As mentioned in our answer to point 10, we agree that SKW6.4 cells present intermediate phenotypes probably because, unlike Raji and HeLa cells, a combination of ion channels seems to regulate the plasma membrane potential. As indicated above, we can move the SKW6.4 data to the supplementary information to clarify the message presented in the main text. Again, feedback from the editors is welcome here.

      Page 10 § 2

      Q15. A) Supp. Fig 9 is quite convincing but adds the information that 2 µM are sufficient in neurons. This again makes the 20 to 80 µM concentrations used on transformed cells unsatisfactory.

      B) If one needs a cell line (more user friendly than primary cultures), there are several neural ones that can be differentiated (SHY, LHUMES, etc.) that may have an appropriate membrane potential (below -90mV). Indeed, it would then be important to verify if pore formation is still induced by TAT, R9 and Penetratin (separately) on "naturally" hyperpolarized cells.

      C) Figure 2a confirms that changes in Vm are not solid for HeLa and SKW6.4 cells. This casts a doubt on the validity of the results obtained with the latter 2 cell lines.

      A15. A) The experiments in Supplementary Fig. 9d with cortical rat neurons and HeLa cells were performed in the absence of serum, which accounts for the low concentrations used. We apologize for not emphasizing enough when experiments were performed in the presence or absence of serum, which explains the use of high CPP concentrations (40-80 µM) and low CPP concentrations (2-10 µM), respectively. We would like to emphasize, however, that we have adjusted the concentrations of CPPs in our study so as to get similar levels of CPP activity or CPP uptake between the different cell lines used. The concentrations used should not be compared as mere numbers; it is the CPP activity or uptake that should be considered.

      B) We thank the reviewer for his/her suggestion. To address this point, we will perform a new experiment to determine if in neurons TAT, R9, and Penetratin induce pores (using the PI uptake approach).

      C) Please see our answer to point 10.

      Page 11 § 2

      Q16. Why valinomycin was only tried on Raji cells?

      A16. In this study, valinomycin was used on Raji and HeLa cells (Figure 2 and 3). We did not use valinomycin on SKW6.4 cells, as the drug-induced hyperpolarization levels were insufficient in this cell line. As we got a nice hyperpolarization in HeLa wild-type and KCNN4 KO cells through ectopic expression of the KCNJ2 potassium channels (which restored the ability of the KO cells to take up the CPPs), we did not perform the CPP uptake experiment with valinomycin in HeLa cells (although we had tested that valinomycin is able to hyperpolarize HeLa cells).

      Page 12 § 2

      Q17.A)Looking at Fig. 2c, it seems that low Vm increases the uptake of all CPPs, except Transportan. Is there any reason why this Figure does not provide the number of vesicles per cell in the hyperpolarized conditions?

      B) In fact, if one goes to Supp. Fig. 9c, it appears that, among all peptides, only Penetratin is almost entirely cytoplasmic after 90' of incubation, whereas MAP and Transportan remain essentially vesicular. TAT and R9 are at mid-distance between these two extremes. This leads to send again the warning that all CPPs cannot be placed in a single category. The table that describes the sequences strongly suggests that, TAT and R9 uptake is due to the numerous Rs that cannot be replaced by Ks. In the case of Penetratin, that only has 3 Rs, the situation is thus different with the presence of 2 Ws previously shown to be mandatory for internalization, although absent in TAT ad R9.

      C) In Supp. Fig9, panel g is useless.

      D) A difference between peptides is also visible in Figure 2d where depolarization with KCl does not show the same efficiency on all peptides. The issue is whether these differences are significant and, if so, why? This discussion could be restricted to TAT, R9 and Penetratin.

      E) Supp. Fig. 10a also suggests that all peptides do not respond similarly to depolarization and that the effects differ between cell types and concentrations used. However, given the high concentrations used and the high variance between replicates, this figure might not be a priority in the reorganization of the manuscript.

      A17. A) As mentioned in the figure legend, “Quantitation of vesicles was not performed in hyperpolarizing conditions due to masking from strong cytosolic signal.” Such masking would create a bias towards underestimation of vesicle numbers in cells displaying strong cytosolic signal.

      B) We agree with the reviewer that Transportan enters cells primarily through endocytosis. This is mentioned in the text as well as other differences that were observed with regards to the prevalence of endocytosis or direct translocation. These mentions are reported below.

      Page 12: “With the notable exception of Transportan, depolarization led to decreased cytosolic fluorescence of all CPPs, while hyperpolarization favored CPP translocation in the cytosol (Fig. 2c, Supplementary Fig. 9h and 10a). Transportan, unlike the other tested CPPs, enters cells predominantly through endocytosis (Supplementary Fig. 9e), which could explain the difference in response to Vm modulation.”

      Page 14: “Even though this extrapolation is likely to lack accuracy because of the well-known limitation of the MARTINI forcefield in describing the absolute kinetics of the molecular events, the values obtained are consistent with the kinetics of CPP direct translocation observed in living cells (Figure 1c and Supplementary Fig. 1b and 9e). With the exception of Transportan, the estimated CPP translocation occurred within minutes. This is consistent with our observation that Transportan enters cells predominantly through endocytosis and its internalization is therefore not affected by changes in Vm (Fig 2c-d and Supplemental Fig. 9e)”.

      Page 20: “On the other hand, when endocytosis is the predominant type of entry, CPP cytosolic uptake will be less affected by both hyperpolarization and depolarization, which is what is observed for Transportan internalization in HeLa cells (Fig. 2c and Supplementary Fig. 10a).”

      Concerning the roles of arginine and tryptophan residues, please refer to our answer #4.

      C) We do not think this panel (now panel h) is useless as it shows representative examples of the quantitation shown in Figure 2c. We can however remove it if requested by the editors.

      D) The reviewer is correct in observing that KCl-induced depolarization does not inhibit the uptake of all tested CPPs to the same extent. As mentioned in the text, these differences can be explained by the prevalence of direct translocation in the cells. For example, Transportan enters cells primarily through endocytosis, which, as we show, is not regulated/affected by the membrane potential (Figure 2c, lower graphs). Consequently, it is expected that KCl treatment will not impact Transportan cellular uptake.

      E) The reviewer is correct in mentioning that there is quantitative heterogeneity between the different CPPs tested. We mentioned these differences in the manuscript. These mentions are those that are reported under B, plus those listed below.

      Page 19: “It is known for example that peptides made of 9 lysines (K9) poorly reach the cytosol (Fig. 3f and Supplementary Fig. 9e) and that replacing arginine by lysine in Penetratin significantly diminishes its internalization10,11. According to our model, K9 should induce megapolarization and formation of water pores that should then allow their translocation into cells. However, it has been determined that, once embedded into membranes, lysine residues tend to lose protons12,13. This will thus dissipate the strong membrane potential required for the formation of water pores and leave the lysine-containing CPPs stuck within the phospholipids of the membrane. In contrast, arginine residues are not deprotonated in membranes and water pores can therefore be maintained allowing the arginine-rich CPPs to be taken up by cells.”

      Page 21: “Therefore, the uptake kinetics of lysine-rich peptide, such as MAP, appears artefactually similar as the uptake kinetics of arginine-rich peptides such as R9 (Supplementary Fig. 11b).

      Page 21: “The differences between CPPs in terms of how efficiently direct translocation is modulated by the Vm (Fig. 2c-d and Supplementary Fig. 10a) could be explained by their relative dependence on direct translocation or endocytosis to penetrate cells. The more positively charged a CPP is, the more it will enter cells through direct translocation and consequently the more sensitive it will be to cell depolarization (Fig. 2c). On the other hand, when endocytosis is the predominant type of entry, CPP cytosolic uptake will be less affected by both hyperpolarization and depolarization, which is what is observed for Transportan internalization in HeLa cells (Fig. 2c and Supplementary Fig. 10a).

      However, what remains is that depolarization always affects CPP uptake, at most concentrations tested. The heterogeneity reported in Supplementary Fig. 10a for a given experimental condition in a given cell type is in itself of interest as it suggests that there are varying factors within a cell population (e.g. cell cycle, metabolism, etc.) that may impact on the ability of cells to take up CPPs. As per reviewer’s suggestion we may remove this panel from the figure if instructed to do so by the editors.

      Page 12 § 3 and Page 13 § 1

      Q18. The pH story is either too long or too short.

      A18. One mechanism put forward to explain direct translocation relies on pH variation between the extracellular milieu and the cytosol14. It was therefore of interest, in the context of the model we are putting forward, to see whether pH affects the uptake of CPPs in our experimental model. Our data show that pH variations do not affect CPP direct translocation. This information should in our opinion be disclosed.

      Page 14 § 2

      Q19. At low Vm values, there is a decrease in free energy barrier. Does this modify temperature-dependency for internalization? Do cells really require energy when the Vm is very low, like is often the case for neurons?

      A19. We thank the reviewer for this interesting comment. We will now address this by visualizing, under a confocal microscope, CPP direct translocation in rat cortical neurons incubated at various temperatures (4°C, 24°C, 37°C).
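
      The reviewer's question can be illustrated with a simple barrier-crossing estimate. The sketch below is not taken from the manuscript: it assumes that direct translocation is limited by a single free energy barrier and that its rate scales as exp(-ΔG/RT) with a temperature-independent prefactor. The barrier heights are hypothetical placeholders chosen only to show the trend at the three temperatures listed above.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)


def relative_rate(barrier_kj_mol: float, temp_c: float, ref_temp_c: float = 37.0) -> float:
    """Rate at temp_c relative to the rate at ref_temp_c for a process limited by
    one free energy barrier (Arrhenius-like, same prefactor at both temperatures)."""
    t = temp_c + 273.15
    t_ref = ref_temp_c + 273.15
    dg = barrier_kj_mol * 1000.0  # convert kJ/mol to J/mol
    return math.exp(-dg / R * (1.0 / t - 1.0 / t_ref))


if __name__ == "__main__":
    # Hypothetical barrier heights (kJ/mol); not values reported in the source.
    for barrier in (20.0, 50.0):
        for temp in (4.0, 24.0, 37.0):
            ratio = relative_rate(barrier, temp)
            print(f"barrier = {barrier:4.1f} kJ/mol, T = {temp:4.1f} C -> rate / rate(37 C) = {ratio:.3f}")
```

      Under these assumptions, a lower barrier gives a weaker temperature dependence, which is the behaviour the planned experiment in neurons is meant to probe. Temperature also affects membrane fluidity and the membrane potential itself, which this estimate ignores.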

      Page 15 § 2

      Q20. Figure 2e is not explained, not even in the legend while the statement that CPPs induce a local hyperpolarization is central to the study.

      A20. As there is no Figure 2e, we believe that the reviewer is talking about Figure 3e, the legend of which was present in the initial version of the manuscript.

      Page 16 § 1

      Q21. It is confusing that the same agent, here PI, is used to measure internalization (2 nm pore formation in response to hyperpolarization) and cell death. I have seen the explanation below, but I do not find it fully satisfactory.

      A21. We have tried to explain this better under our answer to point 11B.

      Page 16 § 2

      Q22. Entry is not necessarily a size issue. Structure is an important parameter, including possible structure changes, for example in response to Vm modifications. Therefore, the statement that molecules with larger diameters are mostly prevented from internalization is not only vague ("mostly") but incorrect.

      A22. We agree with the reviewer’s comment in the sense that the secondary structure of a molecule will also play an important role in its internalization. For that reason, we have used a series of molecules of identical structure (dextrans) but of different molecular weights. In these experiments we saw that dextrans of higher molecular weight enter less efficiently than those of lower molecular weight (Figure 3). We will rephrase some of our sentences to specify that both the size and the shape (structure) of molecules determine their ability to enter cells through water pores of a given diameter.

      Page 2: “Using dyes of varying sizes and shapes, we assessed the diameter of the water pores.”

      Page 4: “translocation and we characterize the diameter of the water pores used by CPPs.”

      Page 15: “cells were co-incubated with molecules of different sizes and structure and FITC-labelled CPPs at a peptide/lipid ratio of 0.012-0.018 (Supplementary Fig. 11c-d).”

      Page 16: “3 kDa, 10 kDa, and 40 kDa dextrans, 2.3 ± 0.38 nm, 4.5 nm and 8.6 nm (diameter estimation provided by Thermofisher), respectively, were used to estimate the diameter of the water pores formed in the presence of CPP.”

      Page 16: “These results are in line with the in silico prediction of the water pore diameter obtained by analyzing the structure of the pore at the transition state.”

      Page 16: “The marginal cytosolic co-internalization of dextrans was inversely correlated with their diameter.”

      Page 35: “200 µg/ml dextran of different molecular weight in the presence or in the absence of the indicated CPPs in normal […]”.

      Page 17 § 4 and Page 18 § 1

      Q23. In Supp. Fig. 13b and c, since the GAP domain is mutated, death is not due to RasGAP activity. So what causes zebrafish death (hyperpolarization)? The results seem contradictory with those of Supp. Fig. 13f where survival is 100% at 48 h.

      A23. Indeed, it appears that valinomycin in water leads to zebrafish embryo death, as can be seen in Supplementary Fig. 13c. However, the main difference between Supplementary Fig. 13c and 13f is that in Supplementary Fig. 13f zebrafish were not incubated in valinomycin-containing water, but were locally injected with a CPP in the presence or in the absence of valinomycin. This has now been clarified in the text. We saw that local injections of the hyperpolarizing agent are much less toxic and are well tolerated by the zebrafish embryos.

      Page 18 § 2

      Q24. The formation of inverted micelles is not incompatible with that of pores. CPP-induced hyperpolarization (Vm) is not measured directly, but deduced from experiments involving artificial membranes and in silico modeling. It would be useful to distinguish between what takes place on live cells (in vitro and in vivo) and what is speculated (based on modeling and artificial systems).

      A24.

      The formation of inverted micelles is not incompatible with that of pores.

      As mentioned above (point 9), we also think that what has been presented as inverted micelles could in fact have been water pores.

      CPP-induced hyperpolarization (Vm) is not measured directly, but deduced from experiments involving artificial membranes and in silico modeling. It would be useful to distinguish between what takes place on live cells (in vitro and in vivo) and what is speculated (based on modeling and artificial systems).

      If we understand this point correctly, the reviewer is talking about the -150 mV hyperpolarization. This value is not a speculation but has been estimated from in silico experiments and also from experiments using live cells (not artificial membranes). In living cells, the hyperpolarization (megapolarization) has been estimated based on accumulation of intracellular PI over time in the presence or in the absence of CPP.

      Page 19 § 3

      Q25A. The model posits that the number of Rs influences the ability of the CPPs to hyperpolarize the membrane and, consequently, to induce pore formation. Since pore formation is key to the addressing to the cytoplasm, how can one explain that Penetratin which has only 3 Rs is transported to the cytoplasm more readily than TAT or R9? The authors should take this contradiction into consideration and should not leave aside, in the literature, what does not fit with their model.

      A25A. We fully agree that this should be discussed and not left aside. Please refer to point 4 for detailed discussion about the role of arginine and tryptophan in the ability of CPPs to translocate across membranes.

      Q25B. The fact that Rs cannot be replaced by Ks, both in R9 and Penetratin, is explained by differences in deprotonation. This is interesting but speculative. It might be that the interactions of Rs versus Ks with lipids and sugars are different and not only based on charge. After all their atomic structures, beyond charges, are different.

      A25B. We do not claim that protonation differences between R and K are the definitive explanation for their ability to promote CPP translocation. It is one possible explanation that we find sound. As suggested by the reviewer, the ability of K and R to bind lipids and sugars can also play a role. We can mention in this context that the guanidinium group of arginine residues can form two hydrogen bonds1, which allows for more stable electrostatic interactions, while the lysyl group of lysine residues can only form one hydrogen bond. We have included these additional possibilities in the revised version of our manuscript as indicated under point 4.

      Page 20 § 1

      Q26. We still need to understand endosomal escape.

      A26. We agree with the reviewer that endosomal escape is still poorly understood. This is an interesting research topic that deserves its own separate study.

      **Major comments**

      • The key conclusions are convincing for a subset of CPPs and cell types
      • Yes, some claims should be qualified as speculative, but not preliminary
      • Many experiments should be removed. Neuronal primary cultures should be introduced to verify the main conclusions, at least for the 3 main CPPs (TAT, R9, Penetratin). Answers must be given to the concentration issue. Vesicles should be characterized as well as the localization of the peptides in or around the vesicles. See above for less decisive but still important experiments that would benefit the study.
      • Yes, the requested experiments correspond to a reasonable cost and amount of time (10 to 20,000 € and 3 to 5 months of work)
      • Yes, the methods are presented in great detail.
      • Yes, the experiments are adequately replicated and statistical analysis is adequate

      **Minor comments (not so minor for some of them)**

      • See "Detailed analysis"
      • No, prior studies are not referenced appropriately (see above)
      • No, the text and figures are not clear and not accurate (see above)
      • (i) use Raji cells and primary neuronal cultures, plus in vivo model and forget the other cell types; (ii) forget MAP and Transportan and compare TAT/R9 and Penetratin; (iii) drastically reduce the number of figures, tables and movies (6 primary figures, 6 supplemental figures and 4 tables are reasonable numbers; movies are not absolutely necessary); (iv) limit to 6 (max) the number of panels per figure; (v) limit the number of references to less than 50 and cite the primary reports rather than reviews; (vi) reduce the size of the Material and Methods and the length of figure legends.

      Reviewer #1 (Significance (Required)):

      • The mode of CPP internalization is an unanswered question and the report, if revised, will represent a conceptual and technical advance.
      • Bits and pieces of the conclusions can be found in previous reports. But the Vm-dependent pore formation as well as the CPP-induced "megapolarization" (even if only shown for a subset of CPPs) would be an important contribution. The authors must resist the temptation to generalize to all CPPs what might only be true for a few of them.
      • I do not have the expertise for the in-silico work, but my field of expertise allows me to understand all other aspects of the manuscript.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors investigated the effect of membrane potential on the internalization of CPPs into the cytosol of some cancer cell lines. Using a CRISPR/Cas9-based screening, they found that some potassium channels play an important role in the internalization of CPPs. The depolarization decreases the rate of internalization of CPPs and the hyperpolarization using valinomycin increases the rate. Using the coarse-grained MD simulations, the authors investigated the interaction of CPPs with a lipid bilayer in the presence of membrane potential. In the interaction of CPPs with the cells, propidium iodide (PI) enters the cytosol significantly. Based on this result, the authors concluded that pores with 2 nm diameter are formed in the plasma membrane.

      This reviewer raises one main issue concerning CPP endocytosis. The reviewer challenges our method to investigate CPP direct translocation and specifically how we make sure that what we consider direct translocation is not a combination of CPP endocytosis (followed or not by endosomal escape) and CPP plasma membrane translocation. As explained below in detail, our methodology is able to accurately distinguish CPP uptake by direct translocation from CPP endocytosis, and we further demonstrate that endosomal escape does not occur in our experimental settings.

      Q27. One of the defects in this manuscript is the method to determine the fraction of internalization of CPPs via direct translocation across plasma membrane. The authors estimated the fraction of the direct translocation of CPPs by the fluorescence intensity of the cytosolic region (devoid of endosomes) and the fraction of the internalization via endocytosis by the fluorescence intensity of vesicles. However, the CPPs can enter the cytoplasm via endocytosis, and thus the increase in the fluorescence intensity of the cytoplasm is due to two processes (via endocytosis and direct translocation). The authors should use inhibitors of clathrin-mediated endocytosis and macropinocytosis to determine the fraction of internalization of CPPs via direct translocation accurately. Low temperature (4 C) has been also used as the inhibitor of endocytosis (e.g., J. Biophysics, 414729, 2011; J. Biol. Chem., 284, 33957, 2009). Supplementary Figure 1i (the temperature dependence of internalization of TAT-RasGAP317-326) clearly shows that at 4 C the fraction of the internalization was very low, indicating that this peptide enters the cytosol mainly via endocytosis. The determination of the fraction of the internalization via endocytosis by the fluorescence intensity of vesicles in this manuscript is not accurate because it is difficult to examine all endosomes in cells and it is not easy to discriminate the fluorescence intensity due to the endosomes from that due to the cytosol.

      It is important to follow a time course of the fluorescence intensity of single cells from the beginning of the interaction of CPPs with the cells (at least from 5 min) in the presence and absence of inhibitors of the endocytosis (J. Biol. Chem., 278, 585, 2003) to elucidate the process of the internalization of CPPs in the cytosol.

      A27. The reviewer raises the possibility that the signal of fluorescent CPPs in endosomes somehow perturbs the acquisition of the signal in the cytosol. This could occur in two ways: CPP endosomal escape, or diffusion of the signal located in endosomes into adjacent cytosolic regions (halo effect). The second possibility can be readily dismissed because in situations where cells only take up fluorescent CPPs by endocytosis, the cytosol emits only background fluorescence (autofluorescence). This can be seen in Supplementary Fig. 1a (“vesicular” condition) or in Supplementary Fig. 9h in the depolarized cells that cannot take up CPP by direct translocation. Also note that when we record the cytosolic signal we take great care to use regions of interest (ROI) that are distant from endosomes. In contrast to what the reviewer is saying (“it is not easy to discriminate the fluorescence intensity due to the endosomes from that due to the cytosol”), it is actually not difficult to discriminate the cytosolic fluorescence from the endosome fluorescence. To illustrate this, we now provide examples of high magnification images of cells incubated with fluorescent CPPs (new Supplementary Fig. 1c, right[1]) to better explain our methodology and to show that it is quite straightforward to find cytosolic areas devoid of endosomes. Such high magnification images are those that are used for our blinded quantitation. The other possibility is endosomal escape. We demonstrate in Supplementary Fig. 7c that in our experimental conditions, no endosomal escape is detected[2]. We may not have explained our methodology well enough in the earlier version. We will improve the description of our quantitation procedures in the revised version. To this end, we have now added a scheme illustrating the experimental setup (now part of Supplementary Fig. 7c) that is used to assess endosomal escape.

      The reviewer also questions the way we quantitate the CPP signals in endosomes. In the present paper, our goal is to characterize the direct translocation process of CPPs into cells. We do not wish here to investigate in detail the endocytic pathway taken by CPPs. This has been done in a separate study that we are currently submitting for publication. In a nutshell, this work shows that the endocytic pathway taken by CPPs is different from the classical Rab5- and Rab7-dependent pathway and that the CPP endocytic pathway is not inhibited by compounds that affect the classical pathway. Thus, even if we had wanted to use the inhibitors mentioned by the reviewer, they would not have blocked CPP endocytosis.

      To sum up the issues raised under this point, we believe we have presented the reasons why there are no grounds to support the concerns raised by the reviewer.

      [1] Supplementary Fig. 1c (right) is mentioned in the “Cell death and CPP internalization measurements” section of the methods.

      [2] In this experiment, cells were incubated with CPPs for 30 minutes to allow CPP entry into cells. Then the cells were either washed (to prevent further uptake including uptake through direct translocation) or incubated in the continued presence of CPPs. In both conditions, cells where only endocytosis took place were followed by time-lapse confocal microscopy for 4 hours (i.e. these cells do not display any cytosolic CPP signal at the beginning of the recording). We then assessed the CPP fluorescence intensity within the cytosol (i.e. away from endosomes). From these experiments we saw that cytosolic fluorescence increased only in conditions where CPP was present in the media throughout the experiment. No increase of cytosolic fluorescence was detected in the condition where CPPs were washed out. These results demonstrate that the cytosolic signal that we observed in our experiments is due to direct translocation and not endosomal escape. In these experiments we have used the LLOME lysosomotropic agent as a control to make sure that if endosomal escape had occurred (even if only from a subset of endosomes/lysosomes), we would have been able to detect it. Indeed, upon addition of LLOME we were able to record CPP release from endosomes to the cytosol. There is therefore no endosomal escape occurring in our experimental conditions. In conclusion, the observed cytosolic signal in our confocal experiments does not originate, even partly, from endosomal escape.

      Supplementary Figure 1i (the temperature dependence of internalization of TAT-RasGAP317-326) clearly shows that at 4 C the fraction of the internalization was very low, indicating that this peptide enters the cytosol mainly via endocytosis.

      The experiment shown in Supplementary Fig. 1i was analyzed by flow cytometry that cannot discriminate the cytosolic signal from the endosomal signal. We will therefore perform this experiment again but this time using confocal imaging to record the impact of temperature on CPP cytosolic acquisition. We have performed this for HeLa cells already and this shows that direct translocation is indeed inhibited by low temperatures (full blockage at 4°C). Bear in mind that no endosomal escape occurs in our settings (see Supplementary Fig. 7c). This indicates that the decrease in cytoplasmic fluorescence induced by low temperature is not a consequence of diminished CPP endocytosis.

      Q28. Recently, it has been well recognized that membrane potential greatly affects the structure, dynamics and function of plasma membranes (e.g., Science, 349, 873, 2015; PNAS, 107, 12281, 2010). The results of the effect of membrane potential on the internalization of CPPs (depolarization decreases the rate of internalization and hyperpolarization increases the rate), which is the main result of this manuscript, can be interpreted in various ways. For example, the rate of endocytosis may be greatly controlled by membrane potential, which can explain the authors' results.

      A28. This reviewer may have missed the experiment presented in Figure 2c that clearly shows that CPP endocytosis is unaffected by depolarization or hyperpolarization of cells. We have also determined that transferrin uptake through endocytosis is not affected by potassium channel knockout (which also leads to depolarization). The possibility raised by the reviewer is therefore refuted by our experimental evidence.

      Q29. A) The authors used similar concentrations of various CPPs for their experiments (10 to 40 microM), and did not examine the peptide concentration dependence of the internalization. It has been recognized that the CPP concentration affects the mode of internalization of CPPs (e.g., J. Biol. Chem., 284, 33957, 2009). The authors should examine the peptide concentration dependence of the mode of internalization (less than 10 microM, e.g., 1 microM).

      B) In the case of depolarization, can higher concentrations of CPPs (e.g., 100 microM) induce their internalization?

      A29. A) We agree that CPPs/cell ratio might prompt one mode of entry over the other. It has been reported by imaging that at lower CPP concentrations endocytosis is favored since only vesicles were observed15-19. Our data confirm this (new Supplementary Fig. 9f).

      B) In Supp. Fig. 7e we have incubated KCNQ5 KO Raji cells, which are slightly more depolarized than WT cells, in the presence of increasing CPP concentrations up to 100 µM. From the results obtained, we can see that at 100 µM, the uptake in depolarized cells is increased but does not reach the level of uptake seen in wild-type cells. Therefore, lack of hyperpolarization can be compensated to a mild extent by increased CPP availability.

      Q30. A) The effects of membrane potential on plasma membranes and lipid bilayers have been extensively investigated experimentally and thus are well understood, although currently the coarse-grained MD simulations cannot provide quantitative results which can be compared with experimental results. In this manuscript, using the coarse-grained MD simulations, the authors applied 2.2 V to a lipid bilayer to examine the translocation of CPPs. However, it is well known from experimental results that application of such a large voltage to a lipid bilayer induces pore formation in the membrane or its rupture (Bioelectrochem. Bioenerg., 41, 135, 1996; Sci. Rep., 7, 12509, 2017), but at low membrane potential (B) What is the probability of the existence of R9 in the surface of the membrane? R9 cannot bind to the electrically neutral lipid bilayers (such as PC) under a physiological ion concentration (Biochemistry, 55, 4154, 2016). Even if in the case of R9 the membrane potential reaches -150 mV, the other CPPs have lower surface charge density than that of R9, and hence, the decrement of membrane potential is lower. The authors should provide the data of other CPPs.

      C) It has been reported that the negative membrane potential increases the rate of entry of two kinds of CPPs into the lumen of giant unilamellar vesicles (GUVs) without leakage of water-soluble fluorescent probe (Stokes-Einstein radius; ~0.9 nm diameter), i.e., no pore formation in the GUV membrane (Biophys., 118, 57, 2020, J. Bacteriology, 2021, DOI: 10.1128/JB.00021-21). The authors should discuss the similarity and the difference between the results in these papers and the above results in this manuscript.

      A30. A) As correctly stated by this Reviewer, we reported simulations with high transmembrane potential values, which is a common procedure in in silico simulations used to accelerate the kinetics of the studied process. In this manuscript we have additionally developed and carefully validated a novel protocol to estimate the free energy landscape of water pore formation and CPP translocation under physiological transmembrane potential (further details about the methodological procedure, the convergence and the validation of the free energy estimation are reported in Supplementary Fig. 15-19 of the manuscript). This protocol allowed us to demonstrate the impact of megapolarization (‑150 mV) on the free energy barrier corresponding to the CPP translocation process. The results exemplify how the megapolarization process modifies the uptake probability of the R9 peptide, reducing locally the free energy barrier of the membrane translocation (Fig. 3c-d). Moreover, we have also demonstrated how a single CPP produces a local transmembrane potential of about -150 mV, in agreement with our hypothesis (Fig. 3e).

      Finally, the quantitative accuracy of the molecular simulations was found to be satisfactory because the water pore formation free energy in a symmetric DOPC membrane that we calculated is in excellent agreement with previous atomistic estimation (Table S5).

      B) It has been demonstrated that CPP/membrane interactions are mostly electrostatic between positively charged amino acids carried by the CPPs and various negatively charged cell membrane components, such as glycosaminoglycans20-31 and phosphate groups32. It is in line with our model that the more positively charged CPPs are the better they should translocate into cells. Therefore, we agree with the reviewer that the level of megapolarization may vary according to the charges carried by the CPPs. However, our data clearly indicate that a certain membrane potential hyperpolarization threshold must be achieved to induce water pore formation. As suggested by the reviewer we will now conduct additional modeling experiments with other CPPs.

      C) We have carefully read these papers and do not necessarily reach the same conclusions as the authors. In both papers, the translocation of CPPs in polarized GUVs is monitored through CPP acquisition on vesicles found within the GUVs (intraluminal vesicles; either smaller GUVs or LUVs). There is actually no evidence of the presence of luminal CPPs outside of the intraluminal vesicle membranes. We would therefore argue that these studies elegantly demonstrate that membrane potential increases CPP binding and insertion into the membrane of the mother GUVs but that the CPPs then move, by diffusion, from the lipidic boundary of the mother GUVs to the lipidic membranes of their intraluminal vesicles. This CPP diffusion would presumably occur when the intraluminal vesicles touch the outer membrane bilayer of the mother GUV. There is a marked lag between binding of the CPPs to the membrane of the mother GUV and appearance of CPPs on the intraluminal vesicles (Figure 3c of the Biophysical Journal paper). This lag is, in our view, more compatible with the explanation we are giving than with a translocation mechanism. If there were direct translocation of the CPP through the membrane of the mother GUV, such a large lag would not be expected (see next point). If there is no translocation of the CPPs across the GUV membrane, it could explain why the water-soluble dye within the mother GUVs does not leak out.

      Q31. The authors consider that the translocation of CPPs induces depolarization, and as a result, the pore closes immediately. This kind of transient pore cannot explain the authors' result of the significant entry of PI into the cytosol during the interaction of CPPs with the cells. The authors should explain this point.

      A31. Our interpretation is that PI takes advantage of the water pore triggered by hyperpolarization to penetrate cells. PI is positively charged and is attracted by the negative membrane potential of the cells. Its movement across the cell membrane is therefore unidirectional. This enables the PI molecules to accumulate/concentrate within the cytosol (Supplementary Fig. 12). When PI is in the presence of a CPP, both molecules enter with similar kinetics (Supplementary Fig. 12a and the new quantitation provided in the partially revised version of the manuscript; Supplementary Fig. 12b). PI and CPPs do not interact (Supplementary Figure 12d); hence they move independently from one another.
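
      The direction of this electrical driving force can be illustrated with the Nernst relation. The sketch below is only an upper-bound illustration and is not a calculation from the manuscript: it assumes that PI behaves as a freely permeant ion carrying two positive charges and that it reaches electrochemical equilibrium, neither of which is claimed in the text (pores are transient and PI binds nucleic acids once inside). The function name and the list of membrane potentials are illustrative choices.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_accumulation_ratio(vm_mv: float, z: int = 2, temp_c: float = 37.0) -> float:
    """Equilibrium [inside]/[outside] concentration ratio for a permeant ion of
    valence z at membrane potential vm_mv (inside relative to outside, in mV)."""
    temp_k = temp_c + 273.15
    vm_v = vm_mv / 1000.0
    return math.exp(-z * F * vm_v / (R * temp_k))


if __name__ == "__main__":
    # -150 mV is the megapolarization value discussed above; the other
    # potentials are arbitrary comparison points.
    for vm in (-150.0, -70.0, -20.0, 0.0):
        print(f"Vm = {vm:6.1f} mV -> equilibrium inside/outside ratio = {nernst_accumulation_ratio(vm):.3g}")
```

      Under these assumptions, the more negative the membrane potential, the more strongly a cation is driven inward, which is consistent with the unidirectional movement described above.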

      Q32. In this manuscript, the authors used only cancer cell lines (Raji cell, SKW6.4 cell, and HeLa cell). The lipid compositions and the stability of the plasma membranes of these cells may be different from normal cells (e.g., 33; Cancer Res., 51, 3062, 1991). Is there a possibility that negatively charged lipids such as PS and PIP2 locate in the outer leaflet locally in these cells? At least, some discussion on this point is essential.

      A32. We agree with the reviewer that plasma membrane composition may vary between cancerous and non-cancerous cells and that this may affect the ability of CPPs to cross cellular membranes. We now mention this in the discussion: “While the nature of the CPPs likely dictates their uptake efficiency as discussed in the preceding paragraph, the composition of the plasma membrane could also modulate how CPPs translocate into cells. In the present work, we have recorded CPP direct translocation in transformed or cancerous cell lines as well as in primary cells. These cells display various abilities to take up CPPs by direct translocation and the present work indicates that this is modulated by their Vm. But as cancer cells display abnormal plasma membrane composition33, it will be of interest in the future to determine how important this is for their capacity to take up CPPs”.

      Q33. The authors found that PI enters the cytosol significantly when CPPs interact with these cells. Based on this result, the authors concluded that pores with 2 nm diameter are formed in the plasma membrane. However, they did not show the time courses of entry of PI and that of CPPs, and thus we cannot judge whether the pore formation in the plasma membrane is the cause of the entry of CPPs or the result of the entry of CPPs. We can reasonably consider that CPPs enter the cytosol via endocytosis and bind to the inner leaflet of the plasma membrane, inducing pore formation in the plasma membrane.

      A33. The kinetics we are now showing in point A31 indicate co-entry of CPPs and PI, an observation that is in line with our model. Also note that we have demonstrated that CPPs do not escape endosomes (please see our answers to questions 12 and 28). These data are therefore not compatible with the reviewer’s interpretation.

      Q34. It has been reported that the negative membrane potential increases the rate constant of antimicrobial peptide (AMP)-induced pore formation or local damage in the GUV membrane (J. Biol. Chem., 294, 10449, 2019; BBA-Biomembranes, 1862, 183381, 2020). These results are related to those in the present manuscript, because here the authors consider that CPPs induce pores in the plasma membrane in the presence of negative membrane potential.

      A34. We thank the reviewer for mentioning these interesting articles. As we understand them, they demonstrate that antimicrobial peptides (AMPs) bind membranes better as a function of increasing negative membrane potential and that this favors their ability to form pores in the membrane, compromising membrane integrity and inducing the release of cytosolic or luminal content. These AMPs do not behave exactly like CPPs because the latter do not compromise the integrity of the membranes.

      In conclusion, the results of the membrane potential dependence of the rate of the internalization of CPPs may be solid results, which is an important contribution. However, the other analyses and the interpretations are not conclusive at the current stage.

      We thank the reviewer for the positive assessment of our results concerning the membrane potential dependence of CPP uptake. We hope that the answers developed above and the new data we are presenting have clarified the remaining points.

      Reviewer #2 (Significance (Required)):

      (1) Using a CRISPR/Cas9-based screening, the authors found that some potassium channels play an important role in the internalization of CPP TAT-RasGAP317-326. This result advances the field of CPPs.

      (2) Several studies have suggested that depolarization decreases the rate of internalization of CPPs into the cell cytosol and that hyperpolarization increases it. It has also been reported that negative membrane potential increases the rate of entry of two kinds of CPPs into the lumen of GUVs of lipid bilayers. The authors provide new genetic evidence that membrane potential plays an important role in the internalization of CPPs into the cytosol. However, modulation of membrane potential greatly affects the structure, dynamics and function of plasma membranes. At the current stage, it is difficult to judge which step of CPP internalization is affected by the membrane potential.

      (3) Researchers working on CPPs and AMPs will be interested in these results once the authors have improved the manuscript.

      (4) My field of expertise is membrane biophysics, especially the interaction of AMPs and CPPs with GUVs and cells.

      References

      1 Fromm, J. R., Hileman, R. E., Caldwell, E. E. O., Weiler, J. M. & Linhardt, R. J. Differences in the Interaction of Heparin with Arginine and Lysine and the Importance of these Basic Amino Acids in the Binding of Heparin to Acidic Fibroblast Growth Factor. Archives of Biochemistry and Biophysics 323, 279-287, doi:10.1006/abbi.1995.9963 (1995).

      2 Derossi, D., Joliot, A. H., Chassaing, G. & Prochiantz, A. The third helix of the Antennapedia homeodomain translocates through biological membranes. The Journal of biological chemistry 269, 10444-10450 (1994).

      3 Jobin, M. L., Blanchet, M., Henry, S., Chaignepain, S., Manigand, C., Castano, S., Lecomte, S., Burlina, F., Sagan, S. & Alves, I. D. The role of tryptophans on the cellular uptake and membrane interaction of arginine-rich cell penetrating peptides. Biochim Biophys Acta 1848, 593-602, doi:10.1016/j.bbamem.2014.11.013 (2015).

      4 MacCallum, J. L., Bennett, W. F. D. & Tieleman, D. P. Distribution of amino acids in a lipid bilayer from computer simulations. Biophysical journal 94, 3393-3404, doi:10.1529/biophysj.107.112805 (2008).

      5 Christiaens, B., Symoens, S., Vanderheyden, S., Engelborghs, Y., Joliot, A., Prochiantz, A., Vandekerckhove, J., Rosseneu, M. & Vanloo, B. Tryptophan fluorescence study of the interaction of penetratin peptides with model membranes. European Journal of Biochemistry 269, 2918-2926, doi:10.1046/j.1432-1033.2002.02963.x (2002).

      6 Walrant, A., Bauza, A., Girardet, C., Alves, I. D., Lecomte, S., Illien, F., Cardon, S., Chaianantakul, N., Pallerla, M., Burlina, F., Frontera, A. & Sagan, S. Ionpair-pi interactions favor cell penetration of arginine/tryptophan-rich cell-penetrating peptides. Biochim Biophys Acta Biomembr 1862, 183098, doi:10.1016/j.bbamem.2019.183098 (2020).

      7 Derossi, D., Calvet, S., Trembleau, A., Brunissen, A., Chassaing, G. & Prochiantz, A. Cell internalization of the third helix of the Antennapedia homeodomain is receptor-independent. J Biol Chem 271, 18188-18193, doi:10.1074/jbc.271.30.18188 (1996).

      8 Serulla, M., Ichim, G., Stojceski, F., Grasso, G., Afonin, S., Heulot, M., Schober, T., Roth, R., Godefroy, C., Milhiet, P. E., Das, K., Garcia-Saez, A. J., Danani, A. & Widmann, C. TAT-RasGAP317-326 kills cells by targeting inner-leaflet-enriched phospholipids. Proc Natl Acad Sci U S A, doi:10.1073/pnas.2014108117 (2020).

      9 Bowman, A. M., Nesin, O. M., Pakhomova, O. N. & Pakhomov, A. G. Analysis of plasma membrane integrity by fluorescent detection of Tl(+) uptake. J Membr Biol 236, 15-26, doi:10.1007/s00232-010-9269-y (2010).

      10 Mitchell, D. J., Kim, D. T., Steinman, L., Fathman, C. G. & Rothbard, J. B. Polyarginine enters cells more efficiently than other polycationic homopolymers. J Pept Res 56, 318-325 (2000).

      11 Amand, H. L., Rydberg, H. A., Fornander, L. H., Lincoln, P., Norden, B. & Esbjorner, E. K. Cell surface binding and uptake of arginine- and lysine-rich penetratin peptides in absence and presence of proteoglycans. Biochim Biophys Acta 1818, 2669-2678, doi:10.1016/j.bbamem.2012.06.006 (2012).

      12 Armstrong, C. T., Mason, P. E., Anderson, J. L. & Dempsey, C. E. Arginine side chain interactions and the role of arginine as a gating charge carrier in voltage sensitive ion channels. Sci Rep 6, 21759, doi:10.1038/srep21759 (2016).

      13 Li, L., Vorobyov, I. & Allen, T. W. The different interactions of lysine and arginine side chains with lipid membranes. J Phys Chem B 117, 11906-11920, doi:10.1021/jp405418y (2013).

      14 Herce, H. D., Garcia, A. E. & Cardoso, M. C. Fundamental molecular mechanism for the cellular uptake of guanidinium-rich molecules. J Am Chem Soc 136, 17459-17467, doi:10.1021/ja507790z (2014).

      15 Kosuge, M., Takeuchi, T., Nakase, I., Jones, A. T. & Futaki, S. Cellular Internalization and Distribution of Arginine-Rich Peptides as a Function of Extracellular Peptide Concentration, Serum, and Plasma Membrane Associated Proteoglycans. Bioconjugate Chemistry 19, 656-664, doi:10.1021/bc700289w (2008).

      16 Fretz, M. M., Penning, N. A., Al-Taei, S., Futaki, S., Takeuchi, T., Nakase, I., Storm, G. & Jones, A. T. Temperature-, concentration- and cholesterol-dependent translocation of L- and D-octa-arginine across the plasma and nuclear membrane of CD34+ leukaemia cells. The Biochemical journal 403, 335-342, doi:10.1042/BJ20061808 (2007).

      17 Drin, G., Cottin, S., Blanc, E., Rees, A. R. & Temsamani, J. Studies on the internalization mechanism of cationic cell-penetrating peptides. J Biol Chem 278, 31192-31201, doi:10.1074/jbc.M303938200 (2003).

      18 Duchardt, F., Fotin‐Mleczek, M., Schwarz, H., Fischer, R. & Brock, R. A Comprehensive Model for the Cellular Uptake of Cationic Cell‐penetrating Peptides. Traffic 8, 848-866, doi:10.1111/j.1600-0854.2007.00572.x (2007).

      19 Ziegler, A., Nervi, P., Dürrenberger, M. & Seelig, J. The Cationic Cell-Penetrating Peptide CPPTAT Derived from the HIV-1 Protein TAT Is Rapidly Transported into Living Fibroblasts: Optical, Biophysical, and Metabolic Evidence. Biochemistry 44, 138-148, doi:10.1021/bi0491604 (2005).

      20 Ziegler, A. Thermodynamic studies and binding mechanisms of cell-penetrating peptides with lipids and glycosaminoglycans. Advanced Drug Delivery Reviews 60, 580-597, doi:10.1016/j.addr.2007.10.005 (2008).

      21 Rullo, A., Qian, J. & Nitz, M. Peptide–glycosaminoglycan cluster formation involving cell penetrating peptides. Biopolymers 95, 722-731, doi:10.1002/bip.21641 (2011).

      22 Bechara, C., Pallerla, M., Zaltsman, Y., Burlina, F., Alves, I. D., Lequin, O. & Sagan, S. Tryptophan within basic peptide sequences triggers glycosaminoglycan-dependent endocytosis. The FASEB Journal 27, 738-749, doi:10.1096/fj.12-216176 (2013).

      23 Gonçalves, E., Kitas, E. & Seelig, J. Binding of Oligoarginine to Membrane Lipids and Heparan Sulfate: Structural and Thermodynamic Characterization of a Cell-Penetrating Peptide. Biochemistry 44, 2692-2702, doi:10.1021/bi048046i (2005).

      24 Rusnati, M., Tulipano, G., Spillmann, D., Tanghetti, E., Oreste, P., Zoppetti, G., Giacca, M. & Presta, M. Multiple Interactions of HIV-I Tat Protein with Size-defined Heparin Oligosaccharides. Journal of Biological Chemistry 274, 28198-28205, doi:10.1074/jbc.274.40.28198 (1999).

      25 Butterfield, K. C., Caplan, M. & Panitch, A. Identification and Sequence Composition Characterization of Chondroitin Sulfate-Binding Peptides through Peptide Array Screening. Biochemistry 49, 1549-1555, doi:10.1021/bi9021044 (2010).

      26 Åmand, H. L., Rydberg, H. A., Fornander, L. H., Lincoln, P., Nordén, B. & Esbjörner, E. K. Cell surface binding and uptake of arginine- and lysine-rich penetratin peptides in absence and presence of proteoglycans. Biochimica et Biophysica Acta (BBA) - Biomembranes 1818, 2669-2678, doi:10.1016/j.bbamem.2012.06.006 (2012).

      27 Ghibaudi, E., Boscolo, B., Inserra, G., Laurenti, E., Traversa, S., Barbero, L. & Ferrari, R. P. The interaction of the cell-penetrating peptide penetratin with heparin, heparansulfates and phospholipid vesicles investigated by ESR spectroscopy. Journal of Peptide Science 11, 401-409, doi:10.1002/psc.633 (2005).

      28 Fuchs, S. M. & Raines, R. T. Pathway for polyarginine entry into mammalian cells. Biochemistry 43, 2438-2444, doi:10.1021/bi035933x (2004).

      29 Ziegler, A. & Seelig, J. Contributions of Glycosaminoglycan Binding and Clustering to the Biological Uptake of the Nonamphipathic Cell-Penetrating Peptide WR9. Biochemistry 50, 4650-4664, doi:10.1021/bi1019429 (2011).

      30 Ziegler, A. & Seelig, J. Interaction of the Protein Transduction Domain of HIV-1 TAT with Heparan Sulfate: Binding Mechanism and Thermodynamic Parameters. Biophysical Journal 86, 254-263, doi:10.1016/S0006-3495(04)74101-6 (2004).

      31 Hakansson, S. & Caffrey, M. Structural and Dynamic Properties of the HIV-1 Tat Transduction Domain in the Free and Heparin-Bound States. Biochemistry 42, 8999-9006, doi:10.1021/bi020715+ (2003).

      32 Kawamoto, S., Takasu, M., Miyakawa, T., Morikawa, R., Oda, T., Futaki, S. & Nagao, H. Inverted micelle formation of cell-penetrating peptide studied by coarse-grained simulation: importance of attractive force between cell-penetrating peptides and lipid head group. J Chem Phys 134, 095103, doi:10.1063/1.3555531 (2011).

      33 Szlasa, W., Zendran, I., Zalesinska, A., Tarek, M. & Kulbacka, J. Lipid composition of the cancer cell membrane. J Bioenerg Biomembr 52, 321-342, doi:10.1007/s10863-020-09846-4 (2020).

    1. BETWEEN me and the other world there is ever an unasked question: unasked by some through feelings of delicacy; by others through the difficulty of rightly framing it. All, nevertheless, flutter round it. They approach me in a half-hesitant sort of way, eye me curiously or compassionately, and then, instead of saying directly, How does it feel to be a problem? they say, I know an excellent colored man in my town; or, I fought at Mechanicsville; or, Do not these Southern outrages make your blood boil? At these I smile, or am interested, or reduce the boiling to a simmer, as the occasion may require. To the real question, How does it feel to be a problem? I answer seldom a word. 1 And yet, being a problem is a strange experience,—peculiar even for one who has never been anything else, save perhaps in babyhood and in Europe. It is in the early days of rollicking boyhood that the revelation first bursts upon one, all in a day, as it were. I remember well when the shadow swept across me. I was a little thing, away up in the hills of New England, where the dark Housatonic winds between Hoosac and Taghkanic to the sea. In a wee wooden schoolhouse, something put it into the boys’ and girls’ heads to buy gorgeous visiting-cards—ten cents a package—and exchange. The exchange was merry, till one girl, a tall newcomer, refused my card,—refused it peremptorily, with a glance. Then it dawned upon me with a certain suddenness that I was different from the others; or like, mayhap, in heart and life and longing, but shut out from their world by a vast veil. I had thereafter no desire to tear down that veil, to creep through; I held all beyond it in common contempt, and lived above it in a region of blue sky and great wandering shadows. That sky was bluest when I could beat my mates at examination-time, or beat them at a foot-race, or even beat their stringy heads. 
Alas, with the years all this fine contempt began to fade; for the worlds I longed for, and all their dazzling opportunities, were theirs, not mine. But they should not keep these prizes, I said; some, all, I would wrest from them. Just how I would do it I could never decide: by reading law, by healing the sick, by telling the wonderful tales that swam in my head,—some way. With other black boys the strife was not so fiercely sunny: their youth shrunk into tasteless sycophancy, or into silent hatred of the pale world about them and mocking distrust of everything white; or wasted itself in a bitter cry, Why did God make me an outcast and a stranger in mine own house? The shades of the prison-house closed round about us all: walls strait and stubborn to the whitest, but relentlessly narrow, tall, and unscalable to sons of night who must plod darkly on in resignation, or beat unavailing palms against the stone, or steadily, half hopelessly, watch the streak of blue above. 2 After the Egyptian and Indian, the Greek and Roman, the Teuton and Mongolian, the Negro is a sort of seventh son, born with a veil, and gifted with second-sight in this American world,—a world which yields him no true self-consciousness, but only lets him see himself through the revelation of the other world. It is a peculiar sensation, this double-consciousness, this sense of always looking at one’s self through the eyes of others, of measuring one’s soul by the tape of a world that looks on in amused contempt and pity. One ever feels his two-ness,—an American, a Negro; two souls, two thoughts, two unreconciled strivings; two warring ideals in one dark body, whose dogged strength alone keeps it from being torn asunder. 3 The history of the American Negro is the history of this strife,—this longing to attain self-conscious manhood, to merge his double self into a better and truer self. In this merging he wishes neither of the older selves to be lost. 
He would not Africanize America, for America has too much to teach the world and Africa. He would not bleach his Negro soul in a flood of white Americanism, for he knows that Negro blood has a message for the world. He simply wishes to make it possible for a man to be both a Negro and an American, without being cursed and spit upon by his fellows, without having the doors of Opportunity closed roughly in his face. 4 This, then, is the end of his striving: to be a co-worker in the kingdom of culture, to escape both death and isolation, to husband and use his best powers and his latent genius. These powers of body and mind have in the past been strangely wasted, dispersed, or forgotten. The shadow of a mighty Negro past flits through the tale of Ethiopia the Shadowy and of Egypt the Sphinx. Throughout history, the powers of single black men flash here and there like falling stars, and die sometimes before the world has rightly gauged their brightness. Here in America, in the few days since Emancipation, the black man’s turning hither and thither in hesitant and doubtful striving has often made his very strength to lose effectiveness, to seem like absence of power, like weakness. And yet it is not weakness,—it is the contradiction of double aims. The double-aimed struggle of the black artisan—on the one hand to escape white contempt for a nation of mere hewers of wood and drawers of water, and on the other hand to plough and nail and dig for a poverty-stricken horde—could only result in making him a poor craftsman, for he had but half a heart in either cause. By the poverty and ignorance of his people, the Negro minister or doctor was tempted toward quackery and demagogy; and by the criticism of the other world, toward ideals that made him ashamed of his lowly tasks. 
The would-be black savant was confronted by the paradox that the knowledge his people needed was a twice-told tale to his white neighbors, while the knowledge which would teach the white world was Greek to his own flesh and blood. The innate love of harmony and beauty that set the ruder souls of his people a-dancing and a-singing raised but confusion and doubt in the soul of the black artist; for the beauty revealed to him was the soul-beauty of a race which his larger audience despised, and he could not articulate the message of another people. This waste of double aims, this seeking to satisfy two unreconciled ideals, has wrought sad havoc with the courage and faith and deeds of ten thousand thousand people,—has sent them often wooing false gods and invoking false means of salvation, and at times has even seemed about to make them ashamed of themselves. 5 Away back in the days of bondage they thought to see in one divine event the end of all doubt and disappointment; few men ever worshipped Freedom with half such unquestioning faith as did the American Negro for two centuries. To him, so far as he thought and dreamed, slavery was indeed the sum of all villainies, the cause of all sorrow, the root of all prejudice; Emancipation was the key to a promised land of sweeter beauty than ever stretched before the eyes of wearied Israelites. In song and exhortation swelled one refrain—Liberty; in his tears and curses the God he implored had Freedom in his right hand. At last it came,—suddenly, fearfully, like a dream. With one wild carnival of blood and passion came the message in his own plaintive cadences:— “Shout, O children! Shout, you’re free! For God has bought your liberty!” 6 Years have passed away since then,—ten, twenty, forty; forty years of national life, forty years of renewal and development, and yet the swarthy spectre sits in its accustomed seat at the Nation’s feast. 
In vain do we cry to this our vastest social problem:— “Take any shape but that, and my firm nerves Shall never tremble!” 7 The Nation has not yet found peace from its sins; the freedman has not yet found in freedom his promised land. Whatever of good may have come in these years of change, the shadow of a deep disappointment rests upon the Negro people,—a disappointment all the more bitter because the unattained ideal was unbounded save by the simple ignorance of a lowly people. 8 The first decade was merely a prolongation of the vain search for freedom, the boon that seemed ever barely to elude their grasp,—like a tantalizing will-o’-the-wisp, maddening and misleading the headless host. The holocaust of war, the terrors of the Ku-Klux Klan, the lies of carpet-baggers, the disorganization of industry, and the contradictory advice of friends and foes, left the bewildered serf with no new watchword beyond the old cry for freedom. As the time flew, however, he began to grasp a new idea. The ideal of liberty demanded for its attainment powerful means, and these the Fifteenth Amendment gave him. The ballot, which before he had looked upon as a visible sign of freedom, he now regarded as the chief means of gaining and perfecting the liberty with which war had partially endowed him. And why not? Had not votes made war and emancipated millions? Had not votes enfranchised the freedmen? Was anything impossible to a power that had done all this? A million black men started with renewed zeal to vote themselves into the kingdom. So the decade flew away, the revolution of 1876 came, and left the half-free serf weary, wondering, but still inspired. Slowly but steadily, in the following years, a new vision began gradually to replace the dream of political power,—a powerful movement, the rise of another ideal to guide the unguided, another pillar of fire by night after a clouded day. 
It was the ideal of “book-learning”; the curiosity, born of compulsory ignorance, to know and test the power of the cabalistic letters of the white man, the longing to know. Here at last seemed to have been discovered the mountain path to Canaan; longer than the highway of Emancipation and law, steep and rugged, but straight, leading to heights high enough to overlook life. 9 Up the new path the advance guard toiled, slowly, heavily, doggedly; only those who have watched and guided the faltering feet, the misty minds, the dull understandings, of the dark pupils of these schools know how faithfully, how piteously, this people strove to learn. It was weary work. The cold statistician wrote down the inches of progress here and there, noted also where here and there a foot had slipped or some one had fallen. To the tired climbers, the horizon was ever dark, the mists were often cold, the Canaan was always dim and far away. If, however, the vistas disclosed as yet no goal, no resting-place, little but flattery and criticism, the journey at least gave leisure for reflection and self-examination; it changed the child of Emancipation to the youth with dawning self-consciousness, self-realization, self-respect. In those sombre forests of his striving his own soul rose before him, and he saw himself,—darkly as through a veil; and yet he saw in himself some faint revelation of his power, of his mission. He began to have a dim feeling that, to attain his place in the world, he must be himself, and not another. For the first time he sought to analyze the burden he bore upon his back, that dead-weight of social degradation partially masked behind a half-named Negro problem. He felt his poverty; without a cent, without a home, without land, tools, or savings, he had entered into competition with rich, landed, skilled neighbors. To be a poor man is hard, but to be a poor race in a land of dollars is the very bottom of hardships. 
He felt the weight of his ignorance,—not simply of letters, but of life, of business, of the humanities; the accumulated sloth and shirking and awkwardness of decades and centuries shackled his hands and feet. Nor was his burden all poverty and ignorance. The red stain of bastardy, which two centuries of systematic legal defilement of Negro women had stamped upon his race, meant not only the loss of ancient African chastity, but also the hereditary weight of a mass of corruption from white adulterers, threatening almost the obliteration of the Negro home. 10 A people thus handicapped ought not to be asked to race with the world, but rather allowed to give all its time and thought to its own social problems. But alas! while sociologists gleefully count his bastards and his prostitutes, the very soul of the toiling, sweating black man is darkened by the shadow of a vast despair. Men call the shadow prejudice, and learnedly explain it as the natural defence of culture against barbarism, learning against ignorance, purity against crime, the “higher” against the “lower” races. To which the Negro cries Amen! and swears that to so much of this strange prejudice as is founded on just homage to civilization, culture, righteousness, and progress, he humbly bows and meekly does obeisance. But before that nameless prejudice that leaps beyond all this he stands helpless, dismayed, and well-nigh speechless; before that personal disrespect and mockery, the ridicule and systematic humiliation, the distortion of fact and wanton license of fancy, the cynical ignoring of the better and the boisterous welcoming of the worse, the all-pervading desire to inculcate disdain for everything black, from Toussaint to the devil,—before this there rises a sickening despair that would disarm and discourage any nation save that black host to whom “discouragement” is an unwritten word. 
11 But the facing of so vast a prejudice could not but bring the inevitable self-questioning, self-disparagement, and lowering of ideals which ever accompany repression and breed in an atmosphere of contempt and hate. Whisperings and portents came borne upon the four winds: Lo! we are diseased and dying, cried the dark hosts; we cannot write, our voting is vain; what need of education, since we must always cook and serve? And the Nation echoed and enforced this self-criticism, saying: Be content to be servants, and nothing more; what need of higher culture for half-men? Away with the black man’s ballot, by force or fraud,—and behold the suicide of a race! Nevertheless, out of the evil came something of good,—the more careful adjustment of education to real life, the clearer perception of the Negroes’ social responsibilities, and the sobering realization of the meaning of progress. 12 So dawned the time of Sturm und Drang: storm and stress to-day rocks our little boat on the mad waters of the world-sea; there is within and without the sound of conflict, the burning of body and rending of soul; inspiration strives with doubt, and faith with vain questionings. The bright ideals of the past,—physical freedom, political power, the training of brains and the training of hands,—all these in turn have waxed and waned, until even the last grows dim and overcast. Are they all wrong,—all false? No, not that, but each alone was over-simple and incomplete,—the dreams of a credulous race-childhood, or the fond imaginings of the other world which does not know and does not want to know our power. To be really true, all these ideals must be melted and welded into one. The training of the schools we need to-day more than ever,—the training of deft hands, quick eyes and ears, and above all the broader, deeper, higher culture of gifted minds and pure hearts. The power of the ballot we need in sheer self-defence,—else what shall save us from a second slavery? 
Freedom, too, the long-sought, we still seek,—the freedom of life and limb, the freedom to work and think, the freedom to love and aspire. Work, culture, liberty,—all these we need, not singly but together, not successively but together, each growing and aiding each, and all striving toward that vaster ideal that swims before the Negro people, the ideal of human brotherhood, gained through the unifying ideal of Race; the ideal of fostering and developing the traits and talents of the Negro, not in opposition to or contempt for other races, but rather in large conformity to the greater ideals of the American Republic, in order that some day on American soil two world-races may give each to each those characteristics both so sadly lack. We the darker ones come even now not altogether empty-handed: there are to-day no truer exponents of the pure human spirit of the Declaration of Independence than the American Negroes; there is no true American music but the wild sweet melodies of the Negro slave; the American fairy tales and folk-lore are Indian and African; and, all in all, we black men seem the sole oasis of simple faith and reverence in a dusty desert of dollars and smartness. Will America be poorer if she replace her brutal dyspeptic blundering with light-hearted but determined Negro humility? or her coarse and cruel wit with loving jovial good-humor? or her vulgar music with the soul of the Sorrow Songs? 13 Merely a concrete test of the underlying principles of the great republic is the Negro Problem, and the spiritual striving of the freedmen’s sons is the travail of souls whose burden is almost beyond the measure of their strength, but who bear it in the name of an historic race, in the name of this the land of their fathers’ fathers, and in the name of human opportunity.


      14 And now what I have briefly sketched in large outline let me on coming pages tell again in many ways, with loving emphasis and deeper detail, that men may listen to the striving in th

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:** Previously, the authors showed the importance of contractile force in cell positioning and cell fate specification in preimplantation mouse development. In this study, the authors generated maternal-zygotic mutants of the non-muscle myosin-II heavy chain (NMHC) genes Myh9 and Myh10, and quantitatively analyzed their development using time-lapse microscopy and immunostaining. The authors first examined the expression of NMHCs. Myh9 and Myh10 are present in preimplantation embryos, and Myh9 is maternally inherited. Single maternal-zygotic mutants of Myh9 or Myh10 revealed that maternal Myh9 plays a major role in actomyosin contractility. In maternal Myh9 mutants, compaction and contractility at the 8-cell stage were reduced. Maternal Myh9 mutants demonstrated a longer 8-cell stage, and mutant blastocysts had reduced cell numbers. Cell positioning was not affected; however, cell differentiation was slightly affected by reduced expression of TE and ICM markers. Maternal Myh9 mutants formed blastocoels, but lumen opening was observed earlier than that in wild-type embryos. In double maternal-zygotic mutants of Myh9 and Myh10, cytokinesis was severely affected. Nevertheless, TE fate was specified and embryos formed blastocoels. Interestingly, single-celled mutants swelled upon the formation of fluid-filled vacuoles in their cytoplasm. Similar TE fate specifications and cytoplasmic vacuoles were also observed with single-celled embryos produced by blastomere fusion. Based on these results, the authors concluded that maternal Myh9 is the major NMHC. However, Myh10 can significantly compensate for the loss of Myh9, and that cell fate specification and morphogenesis are independent of the success of cell division.

      **Minor comments:** Overall, the conclusions of this study are supported by high-quality data. However, I have a few minor concerns:

      We thank the referee for her/his careful analysis of our manuscript.

      1. Line 200~205. The authors showed the correlation between the cell number at the blastocyst stage and the 8-cell stage, and concluded that "the lengthened 8-cell stage of mzMyh9 is an important determinant of their reduced cell number at the blastocyst stage". This conclusion is not well supported for several reasons. First, the timing of the cell count is not clear. Cell number was compared at the blastocyst stage, but Figure 1c shows that mzMyh9 embryos initiate blastocoel formation earlier than wild-type embryos. Therefore, if cell count timing was determined based on the blastocyst morphology of the embryos, the timing of the cell count (i.e., time after 3rd cleavage) for mzMyh9 mutants is earlier than that observed for wild-type embryos. This shorter culture time likely contributes to the reduced cell number of mzMyh9. Second, the authors only showed a correlation, and no experimental data supporting this conclusion were shown. If the cell number was counted at the same time after the 3rd cleavage, and if the authors' hypothesis is correct, then culturing mzMyh9 mutants for an additional three hours, which is the difference in the duration of the 8-cell stage, should make the cell numbers of mutants comparable to those of wild-type blastocysts.

      Although this correlation provides the best explanation we had based on the data, we agree that the statement above is weakly supported by our study. We do not want to make a strong point about it since we do not think it adds much to the narrative of the study. We have removed the sentence.

      Discussion. In the paragraph starting from line 405, the authors discussed the inconsistencies between the phenotypes observed in mzMyh9 and mzMyh10 mutants and the conclusions of previous studies by others about cell polarization. It would be informative to also discuss the inconsistency with their own previous observations on cell fate. In their previous report (reference 8), the authors concluded that without contractile forces, blastomeres adopt an inner-cell-like fate regardless of their position. This is clearly the opposite of the phenotype of mzMyh9;mzMyh10 mutants, in which all the cells are specified to TE. Please add a discussion addressing this discrepancy.

      The data provided here are consistent with those from ref 8 (Maître et al, 2016): reduced contractility (Myh9 KO, double Myh9;Myh10 KO or Blebbistatin treatment) leads to reduced CDX2 levels. In ref 8, CDX2 and YAP are checked at the 16-cell stage, before the definitive differentiation into TE and ICM, whereas here we present data at the mid-blastocyst stage (~64 cells). We had not checked SOX2 in ref 8 since it is not expressed at such an early stage, so we cannot draw conclusions about this marker.

      We want to clarify that, as stated in the manuscript, in mzMyh9;mzMyh10 KO we detect CDX2 in 5/7 embryos only and therefore not all cells are correctly specified into TE. However, SOX2 could be detected in the inner cell of the one embryo that produced an inner cell. We had not discussed this issue further since it is difficult to conclude much from such rare events and we would prefer to keep it as such.

      To strengthen our argument about reduced differentiation in NMHC mutant embryos, we now provide YAP immunostaining (Fig S4). YAP is correctly patterned in Myh10 mutants and shows slightly less defined nuclear localization in Myh9 mutants, in agreement with our previous observations on CDX2 in the present study and previous observations on YAP at the 16-cell stage (Maître et al 2016).

      Together, we can conclude that, at the 16-cell stage, when ICM fate is not engaged yet (no detectable SOX2 expression), “inhibition of contractility causes (…) blastomeres to become inner-cell-like with respect to (…) Yap localization and Cdx2 levels, despite their external position” (Maître et al, 2016). At the blastocyst stage, embryos with chronically impaired contractility succeed in producing TE in some but not all cases (this study). Between these two developmental stages, blastomeres are exposed to prolonged signals from the apical domain and can be strongly deformed by the growing lumen. Based on the literature (Hirate et al 2013, Dupont et al 2011), both of these stimuli could potentially favor YAP nuclear localisation despite low contractility.

      Throughout the paper, the description of gene and protein symbols should follow the rules of MGI's guidelines for nomenclature of genes (http://www.informatics.jax.org/mgihome/nomen/gene.shtml#gene_sym). Gene and allele symbols are italicized. Protein symbols use all uppercase letters and are not italicized.

      We have corrected this.

      Line 163. The term "contact angles" is used without any explanation or definition. The term should be introduced with a brief explanation in the text, preferably with a figure. This should help scientists working in different fields to follow the paper.

      We have labelled a contact angle on Fig 1A and specified this in the text and in the figure legend.

      Reviewer #1 (Significance (Required)): The importance of actomyosin contractility in compaction, cell polarization, cell positioning, and cell fate specification in preimplantation embryos has been reported by several groups, mostly using chemical inhibitors, except for the study cited in reference 8, in which chimeras of wild-type and mMyh9 mutant embryos were used. This is the first genetic analysis of the roles of actomyosin contractility in the development of preimplantation embryos. Thus, the major advancement of this study is the genetic dissection of the roles of actomyosin contractility in preimplantation mouse development, and clarifying the contribution of maternal/zygotic Myh9 and Myh10 genes. While the phenotypes of reduced compaction and blastomere contractility are consistent with those observed in previous studies, polarization and TE fate specification of the mutant cells appear inconsistent with the conclusions of previous inhibitor experiments, which show defects in polarization processes and fate specification to ICM. These are potentially important issues, but detailed analyses were not performed. The requirement of actomyosin contractility for the cytokinesis of preimplantation embryos is also a novel finding, although it is expected from studies conducted in other systems. Vacuole formation in single-celled mzMyh9;mzMyh10 mutants in a timely manner suggested that fluid accumulation is a cell autonomous process and that cell differentiation occurs independently of cell division. These are also novel findings, although the latter is somewhat expected from previous studies performed using cell number manipulated embryos. In summary, the conceptual advance offered by this study is small. However, this is a high-quality study and makes critical observations in the field of preimplantation mouse development. 
Scientists in the field of developmental biology, especially those working on preimplantation development, should be interested in this paper. My field of expertise is preimplantation development.

      We thank the reviewer for her/his appreciation of our work. We want to argue that we did perform a very detailed analysis of the development of the NMHC mutant embryos, with multiple quantitative image and data analyses to thoroughly and objectively characterise the phenotypes of these mutants. If by “detailed analysis” the reviewer meant a molecular dissection of the phenotype, we argue that 1/ checking the end result (i.e. presence of TE and ICM markers, presence of polarised fluid transport) was sufficient to assess the functionality of biological processes without checking every step of a signalling cascade; 2/ we now provide additional molecular information on the state of YAP and apico-basal polarisation (Fig S3-4).

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): In this manuscript, Schliffka et al. report that maternally deposited Myh9 is the major NMHC in preimplantation embryonic morphogenesis and complete removal of both Myh9 and Myh10 caused severe cytokinesis failure similar to tissue culture cells. Interestingly, although the mutant embryos completely failed cytokinesis thus forming single-celled embryos, they initiated trophoblast gene expression and vacuolization (likely similar to blastocoel formation), suggesting that the timing of preimplantation developmental events is independent from cell number and morphogenetic events.

      We thank the reviewer for her/his appreciation of our work.

      Major comments Vacuolization in single-celled embryos is interesting. In the images, there appear to be two types of vacuoles, F-actin positive and negative. The authors speculate on the similarity to blastocoel formation. To support this, it is important to stain them with some basolateral markers like Na+ ATPase, E-cadherin and β-catenin. It is also important to confirm whether the apical domain is properly formed by staining for apical domain markers like aPKC and Pard6.

      We thank the reviewer for this suggestion. We now provide immunostaining of single Myh9 or Myh10 and double Myh9;Myh10 mutants for aPKC (PRKCz), Na/K ATPase (ATP1A1), Aquaporin-3 (AQP3), the best basolateral marker in our hands, which is also very relevant to fluid pumping, CDH1 and F-actin (Fig S3). We observe that these markers localise similarly in multiple-celled and single-celled embryos, suggesting that vacuoles de facto substitute for the basolateral compartment normally consisting of cell-cell contacts and the lumen. This suggests that the same machinery is at the origin of the fluid inside the lumen and inside vacuoles.

      Minor comments All gene names should be Italicized.

      We have corrected this.

      L157. Myh10 and Myh9 should be mMyh10 and mMyh9.

      We have corrected this.

      L294 1/8 embryos. What does this mean?

      This means this was observed in 1 embryo out of 8 in total.

      L333 6/25 embryos. Does this mean 6 out of 25 embryos combined all maternal double mutants?

      Precisely.

      L438-442. I do not find these embryos to be similar to tetraploid embryos. I suggest removing the sentences.

      We have removed the sentences.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): This study investigates the roles of non-muscle myosin in development, reporting a requirement for maternal and zygotic Myh9 and Myh10. Strengths of the study include robust genetic techniques, innovative nested imaging to visualize events over different timescales within the same embryos, and analysis of morphological as well as transcriptional/cell fate phenotypes. However, the somewhat superficial phenotype analysis limits the authors' ability to draw strong mechanistic conclusions about what is going on in these mutants. Is cell polarization normal? Is cell signaling (HIPPO signalling) normal?

      We thank the reviewer for carefully assessing our study. We argue that we have thoroughly characterized the phenotypes of the NMHC mutants, which allowed us to draw many important mechanistic conclusions (such as the ability of NMHC mutants to polarise, or to pump fluid in a cell autonomous manner). Each mutant embryo has been imaged at multiple time scales, stained and genotyped. The time-lapses and immunostaining have been extensively quantified using manual as well as automated methods such as particle image velocimetry. We also provided fusion experiments, which phenocopy some aspects of the mutants to provide evidence of the mechanisms causing the observed phenotype.

      Nevertheless, we agree that one can always do more and that we had focused on the biological processes (lineage specification, morphogenesis and cleavages) rather than molecular characterisation. Although polarised fluid pumping ascertains a functioning epithelial polarity, we now provide immunostaining of polarity markers in mutant embryos. Although CDX2 and SOX2 staining inform on the output of the signalling cascade leading to effective TE and ICM differentiation, we now provide YAP immunostaining of mutant embryos. We hope this satisfies the request from the reviewer.

      What determines whether an embryo can form an inside cell or not?

      This is an outstanding question. Cells can internalise by oriented cell division or by contractility-mediated cell sorting. Contractility-mediated internalisation functions with only 2 cells (as when doublets of 16-cell stage blastomeres form a cell-in-a-cell structure) but requires the tension asymmetry to grow above a threshold (of 1.5 in WT and most likely above 3 in these mutants due to their poor compaction, see Maître et al 2016). Oriented cell division only works if there is a cell-cell contact to push dividing cells in between. Therefore, at least 3 cells are required for an inner cell to be internalised by this mechanism.

      In double mutants, the average cell number is 2.9. No embryo consisting of only 2 cells contained an inner cell, about half of embryos with 3-5 cells contained a single inner cell and all embryos with 6 cells or more contained inner cells (Fig 4D). Based on the low contractility of double mutants, we can speculate that they do not succeed in overcoming the tension asymmetry threshold. This would explain why no inner cell is observed in embryos with only 2 cells. We can speculate that with 3-5 cells, oriented divisions could occur thanks to the presence of functional polarity (Korotkevitch et al 2017).

      We have added a discussion about this important matter.

      Similarly, the manuscript would benefit from rewriting to reframe the authors' discoveries within the context of what is known regarding lineage specification (e.g., why does CDX2/SOX2 expression indicate normal lineage specification). Additional minor comments are listed below.

      We elaborate on these points.

      **Minor comments:** • Introduction focuses overly on the work of the PI and his mentor, giving the presentation an unnecessarily biased quality.

      We have corrected this to the best of our ability. Please note that, to our knowledge, there are 8 studies (Anani et al., 2014; Maître et al., 2015; Samarage et al., 2015; Maître et al., 2016; Zhu et al., 2017; Zenker et al., 2018; Chan et al., 2019; Dumortier et al., 2019) looking in more or less detail into the contractility of the preimplantation embryo. We mention and cite all of these studies.

      • The text asserts that Myh9 levels are highest during zygote stage, on the basis of qPCR (Fig. S1A), and that this is also observed by RNA-seq (Fig. S1B). However, this conclusion is not supported by the data shown.

      We have corrected this.

      • Would be nice to repeat the qPCR on the mz null.

      We agree with the referee that this would help in assessing the level of compensation between NMHC paralogs in individual mutants. Our qPCR protocol requires a few tens of embryos to be able to amplify the different paralogs. Unfortunately, pooling embryos from our current mating strategy would result in pooling homozygote and heterozygote mutants as we cannot know a priori which embryo is of which genotype.

      We believe that, as nice as this information would be, the current study does not require this information, which would be technically challenging.

      • Were the measurements shown in Fig. S1F taken from the images shown in Fig. S1E? If so, the authors should clarify how the measurements were normalized, since the images in Fig. S1E were clearly taken with different camera settings (as judged by background fluorescence level surrounding the embryos).

      The camera settings were identical but the LUTs are set differently (to the maximal signal of a given genotype) so that some signal is visible. The signal intensities are so different between genotypes that, if set to a common LUT, we either get the maternal GFP as a saturated white circle or the other genotypes as black images. We explain our LUT settings both in the methods and figure legends.

      As an alternative to the current data presentation, we would be fine to have the same LUT for all images and show almost black images for WT and paternal GFP.

      • Can't really conclude that Myh9 is essential for compaction since compaction occurs (albeit abnormally) in the absence of Myh9 (line 177-178).

      Our statement is “we conclude that maternal Myh9 is essential for embryos to compact fully”. WT and mzMyh10 mutants increase their contact angles by 60° whereas mzMyh9 only grow by 30°. Double mutants compact less than single Myh9 mutants. Therefore, the compaction movement is halved in mzMyh9 and the residual weak compaction could be explained by compensation from Myh10. We stand by our statement.

      • Line 211: "observe" rather than "measure".

      We have corrected this.

      • If the embryos achieve proper ICM/TE ratio, in spite of having half the number of cells in the mutants, is that to be expected? Would/do halved embryos also possess the same ICM/TE ratio? Or is this outcome peculiar to the mutants?

      This is an interesting question on which we had not sufficiently elaborated. Our experiments with cell fusion at the 4-cell stage (Fig. S5) produced embryos with reduced cell number. These resemble Myh9 mutant embryos in that they show a reduced cell number while maintaining the total embryonic cell mass. In both cases, the ICM/total cell ratio is similar to control embryos. This indicates that the mechanism setting the ICM/TE ratio is robust to the change in cell number observed in the single mutant. We have added a discussion about this.

      • Line 222: what is the evidence that Cdx2 and Sox2 are TE and ICM markers?

      We have added references to the studies from Strumpf et al., 2005 and Avilion et al., 2003 to support these claims.

      • Is the reported reduction in CDX2 and SOX2 levels due to a stage-delay? What would the comparison look like in wt embryos with half as many cells? Timing of lumen formation may or may not indicate developmental timing...

      We address this point by fusing embryos to half the cell number and find that the fate marker levels are specifically affected as a result of mutation of Myh9 (Fig S5).

      We agree that the timing of lumen formation is unlikely to be a good reference for staging and we did not use this event. We do synchronise embryos based on lumen opening only when comparing lumen growth rate.

      • Line 240 - what was the correction on the multiple pairwise comparisons (multiple t tests)?

      To compare lumen growth rate, individual growth rates of mutants are compared to those of WT using Student’s t test. Growth rates are considered as normally distributed and independent (not pairwise).
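
      As an illustration of the comparison described above, the following is a minimal sketch of a pooled-variance (Student's) t statistic for two independent samples. The function name and the growth-rate values are hypothetical and in arbitrary units; they are not data from the study, and the sketch does not compute a p value.

```python
import math
from statistics import mean, variance

def student_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two
    independent samples (assumes equal variances and normality)."""
    na, nb = len(sample_a), len(sample_b)
    dof = na + nb - 2
    # Pooled estimate of the common variance
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / dof
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, dof

# Hypothetical growth rates (arbitrary units), for illustration only
wt_rates = [1.0, 2.0, 3.0]
mutant_rates = [2.0, 3.0, 4.0]
t_stat, dof = student_t(wt_rates, mutant_rates)
```

      Each mutant genotype would be compared with WT in a separate call; the samples are treated as unpaired.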

      • Lumen forms on time in mutants, despite having fewer cells. Alternatively, lumen forms early, prior to acquisition of proper cell number. Is there a reason the authors did not consider this alternative?

      The referee is correct. Lumens form with fewer cells in mutant embryos and therefore prior to the acquisition of proper cell number.

      • Lines 306 and 339: why does lack of SOX2 expression suggest that the lineage specification program is intact? Why does expression of CDX2 suggest TE initiation has occurred normally? The regulation of these two markers was not introduced.

      We have better introduced and justified this aspect.

      • Line 349: why is blastocoel formation a cell-autonomous property when it clearly occurs extracellularly? Does this also happen in wild type embryos?

      Blastocoel formation is clearly a multi-cellular process. We argue that fluid accumulation is not. The implications for WT embryos are that fluid can be accumulated in the blastocoel entirely trans-cellularly (no need for fluid to flow through cell-cell junction).

      • Speculate in Discussion on why the ML-7/Blebbistatin experiments results could differ from the genetic results produced here.

      Blebbistatin experiments are in agreement with the mutant data. ML-7 experiments are partially in agreement with the mutant data. The discrepancy lies in the effect on cell polarity. ML-7 affects kinases other than the MLCK, such as PKC, which is a known regulator of cell polarity during preimplantation development. Although this is speculative, we specify this in the revised manuscript.

      • Can these mutant embryos implant?

      We grow colonies of heterozygous mutants, therefore mMyh9, mMyh10 and mMyh9;mMyh10 embryos are viable and must be able to implant. As for homozygous mutants, they are not viable and we do not know whether they can implant.

      Reviewer #3 (Significance (Required)): The study provides the first strong evidence of a requirement for non-muscle myosin in epithelialization. This is significant to embryology and to epithelial biology.

      We thank the reviewer for appreciating the significance of our study. We want to clarify that our study provides evidence for NMHC as NOT being required for de novo epithelialization.

    1. Writer-director Lee Isaac Chung based the film, which was shot in Oklahoma over 25 days, on his own family's story

      The film itself was based off of the director's story. Crazy to think about but I feel that in the film industry we as an audience must take things that are based off of a true story with a grain of salt. While the event and the emotion may be true, the specific details can be very easy to change to better fit the plot.

    1. The second is that when you find (as you often do) three young cads and idiots going about together and getting drunk together every day you generally find that one of the three cads and idiots is (for some extraordinary reason) not a cad and not an idiot.

      Same. Is this true? I generally find that in a group of 5-6 at least that 1-2 are this way. I wonder if in a group of 3 that is still true. It makes me think of how Augustine repeatedly says in the Confessions that he would not have stolen fruit from the pear tree if he had been alone. The things we will do with others that we won't do alone is an interesting phenomenon, and may help explain why someone who is not a cad or an idiot will act like one in a group.

    1. Author Response to Public Reviews

      We thank the reviewers for their careful reading of our work, and their detailed and helpful comments. Their insights have helped us in improving this manuscript. We include their comments and our replies to them below.

      Reviewer #2 (Public Review):

      Line 293, by "comparing the Apo_NE and IB_EQ simulations at equivalent points in time" and perform subtraction "from the corresponding Ca atom from one system to another at 0.05, 0.5, 1, 3, 5ns". It is not clear to me why those time points were chosen? Have authors attempted at validating whether or not the signal from the ligand-binding site has had enough time to propagate across the allosteric signaling pathway? If one considers that the ligand is a spatially localized signal, it requires time to propagate. This is in contrast with the Kubo-Onsager paper cited by authors in which the molecule is responding to a global perturbation such as an external field. However, a local perturbation on one side of the protein will need time to propagate to the other side of the protein (30 angstroms away in this case).

      The time points are chosen to highlight the propagation of signal in the short nonequilibrium simulations. We agree with the reviewer that the signal will take time to propagate; indeed, it evolves over time, as can be seen in the figures and accompanying movies. It is important to emphasise that this is averaged over many trajectories. Some conformational rearrangements will not be fully sampled, as can be seen in Figure 3–Figure supplement 3. We also stress that the short nonequilibrium simulations are used here to measure the immediate structural response to a perturbation. The timescale of this response in the nonequilibrium simulation does not correspond to the physical timescale of conformational change induced by, or associated with, ligand binding. The perturbation here is nonphysical, and the response is rapid. For long simulation times, as the correlation between the equilibrium and nonequilibrium trajectories is lost, the subtraction technique is no longer useful because the noise arising from the natural divergence of the simulations overcomes the structural response of the system to the perturbation. Thus, this method allows for the identification of the initial conformational changes associated with signal propagation. Also, the difference calculated at any given time point should not be seen in isolation. Instead, it should be compared with the other time points, as it is such a comparison that highlights the cascade of events associated with signal propagation. This is clearly illustrated in Figure 3–Figure supplement 3 and in the movies, where the collective signal from the short nonequilibrium simulations progresses in a trend that is comparable with the equilibrium simulations. The time evolution of the signal is striking and thought-provoking.

      A simple and naive example is to map out all the bus stops on one's route. 800 simulations between the first and second stop will not be able to provide the locations of other stops. Since authors have used this "subtraction technique" on several other proteins, it would be nice to clarify how this approach works on mapping out signaling propagation perturbed by local ligand binding/unbinding and how to choose the time points for subtraction.

      Analogies can be helpful in understanding the nonequilibrium simulations, some aspects of which are not immediately obvious. One could perhaps think of these nonequilibrium simulations as analogous to striking a bell to see how it rings. The bus stop analogy suggested by the referee is intriguing, and we develop it here.

      In this case, when ‘getting on the bus’ (beginning the simulation), we do not know where the bus is going or the route it will take to get there (i.e. we only know that we are starting at the allosteric site, so the only thing that we know is the place where we board the bus). The bus is not travelling on a straight road, and the destination is unknown. We could wend our way slowly by standard equilibrium MD, but we would only reach the first or second stop on the route in the time available, and we would still not know where the bus was going. We would never find out where the bus is going: it takes too long. The nonequilibrium approach is a magic bus! In this approach, as the bus meanders close to its starting point, we suddenly replace the driver. The new driver puts her or his foot on the accelerator and immediately sets off for a new destination, heading away fast from the starting point. The driver is guided by the roads available. The bus can only drive on the road network, i.e. its progress is defined by its physical environment and the available directions of travel. So, while she/he may drive at an unsafe speed, the bus should stay on the road. It’s possible that it will take a short cut or indeed take a wrong turn or enter a dead-end street. But overall, doing this ‘driver replacement’ hundreds of times, on average the bus should follow the right route and go much faster along it. So, it might be a terrifying journey, but we should get to the destination faster! It might not reach the final destination, depending on how long we let it go on, but we should pass several of the bus stops along the correct route. We can test how likely the route is by averaging over hundreds of crazy new bus drivers. A well-defined route implies a well-designed network. The bus can take any of the roads available to it on the network, and the route taken by the bus may be unpredictable (if it was obvious, we would not need all these crazy drivers!). 
In other words, the response to a perturbation is non-linear. In terms of the final destination, specifically here in TEM-1 and KPC-2, the omega loop, the 3-4 loop and the hinge region are known to be involved in substrate binding and catalysis. We observe the signal reaching these structural elements, so we can say with confidence that the perturbation is communicated to distant, catalytically important parts of the enzyme. So, in terms of the bus analogy, we show that starting in the distant hills, the crazy bus drivers actually end up in the capital city. The simulations identify the capital city as the actual destination. And the fact that the crazy drivers tend to follow the same route allows us to say that we have identified the bus route to the capital, and the important points along the route.

      Another question is whether tracing the dynamics of Calpha alone is enough. As we have seen from the network analysis papers, Calpha sometimes misses some paths or can overemphasize others. The center of mass of the residue has been proposed to be a better indicator of protein allostery. The authors may wish to clarify the particular choice of Calpha in this study.

      This is an interesting question. In our previous analyses of nicotinic acetylcholine receptors and other systems, we found that analysing the C-alphas allows the identification of pathways of signal transduction (Oliveira et al. Structure 1171-1183. e3 (2019)), and we went on to show that these pathways were common across different receptor subtypes (J. Am. Chem. Soc. 2019, 141, 51, 19953–19958 (2019)). Obviously, all residues in the protein are represented equally when analysing C-alphas. Thus, analysing the C-alphas allows direct comparison of closely related proteins with different sequences, and identification and analysis of the pathway in the framework of the protein backbone. Here, of course, we are interested in whether these C-alpha pathways identify positions of sequence variation that affect function, and the results indicate that indeed they do. There is also a practical advantage to analysing C-alpha behaviour: their motions are less subject to noise and converge more rapidly than, e.g., those of sidechains. Other features could be chosen to trace signal pathways, such as the centre of mass of residues. However, choosing more flexible parts to track signal propagation would also have an impact on the speed of convergence (i.e. the number of trajectories required): more simulations would be required to achieve convergence. Therefore, as in previous work on other proteins, we chose C-alpha atoms to study signal propagation here.

      The order of events associated with signal propagation is computed by directly comparing the positions of individual C-alpha atoms at equivalent points in time (namely after 0, 50, 500, 1000, 3000 and 5000 ps of simulation) for every pair of unperturbed equilibrium ligand-bound and perturbed nonequilibrium apo simulation. The C-alpha positional deviation is a simple way to directly identify the conformational changes induced by ligand annihilation and their evolution over the 5 ns of simulation. Due to statistics collected over the large number of simulations, we can be sure of the statistical significance of the structural changes identified. The conformational changes extracted from the nonequilibrium simulations reflect the (statistically significant) structural response of the system to the perturbation. These changes propagate over time from the allosteric site to the active site, demonstrating a direct connection between them. Due to the very short timescale of the nonequilibrium simulations (5 ns), the observed conformational rearrangements do not represent the complete mechanism of conformational change, but rather reflect its first steps.
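
      The paired comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names and the toy coordinates are hypothetical, any structural fitting is assumed to have been done beforehand, and the per-residue deviation is taken here as the Euclidean distance between matching C-alpha atoms, averaged over all trajectory pairs at one time point.

```python
import math

def ca_deviation(eq_frame, neq_frame):
    """Distance between matching C-alpha atoms of one equilibrium frame
    and the nonequilibrium frame taken at the same point in time."""
    return [math.dist(a, b) for a, b in zip(eq_frame, neq_frame)]

def mean_deviation(pairs):
    """Average the per-residue deviation over many
    (equilibrium, nonequilibrium) frame pairs."""
    n_pairs = len(pairs)
    n_res = len(pairs[0][0])
    totals = [0.0] * n_res
    for eq_frame, neq_frame in pairs:
        for i, d in enumerate(ca_deviation(eq_frame, neq_frame)):
            totals[i] += d
    return [t / n_pairs for t in totals]

# Toy data: two trajectory pairs, three residues, one time point.
# Only the middle residue moves after the perturbation.
pairs_at_t = [
    ([(0, 0, 0), (1, 0, 0), (2, 0, 0)], [(0, 0, 0), (1, 0, 1), (2, 0, 0)]),
    ([(0, 0, 0), (1, 0, 0), (2, 0, 0)], [(0, 0, 0), (1, 0, 3), (2, 0, 0)]),
]
profile = mean_deviation(pairs_at_t)
```

      Repeating the calculation at each time point (for example 50, 500, 1000, 3000 and 5000 ps) and comparing the resulting profiles is what would reveal the order in which residues respond.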

      In Figure 5, the authors seem to use Pearson correlation to compute dynamic cross-correlation maps. Mutual information (MI) or linear MI has advantages over Pearson correlation, as has been discussed in the dynamical network analysis literature.

      The reviewer is indeed correct; the DCCMs were calculated based on the Pearson’s correlation. We have tested and validated this approach over the last 15 years, with results reproduced experimentally by a number of our collaborators for over 10 different enzyme systems, including cyclophilin A, dihydrofolate reductase, ribonuclease, APE1 and Rev1 DNA binding enzymes (Biochemistry 43, no. 33 (2004): 10605-10618; Nature 438, no. 7064 (2005): 117-121; Biochemistry 58, no. 37 (2019): 3861-3868; PLoS Biol 9, no. 11 (2011): e1001193; Structure 26, no. 3 (2018): 426-436; Nucleic acids research 48, no. 13 (2020): 7345-7355; Proceedings of the National Academy of Sciences 117, no. 41 (2020): 25494-25504). The reviewer’s suggestion is an interesting one, and we would be happy to investigate it in future studies. Mutual information analyses offer useful features. Based on our experience, we expect the results to be qualitatively similar and not likely to change the conclusions described in this manuscript.
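
      For readers unfamiliar with DCCMs, the following minimal sketch shows the commonly used normalized covariance form, in which each matrix element is the time-averaged dot product of two atoms' displacements from their mean positions, divided by the product of their root-mean-square fluctuations. It is an illustration only: the function name and toy trajectory are hypothetical, frames are assumed to be pre-aligned, and the exact procedure used for Figure 5 is not specified in this response.

```python
import math

def dccm(frames):
    """frames: list of frames, each a list of (x, y, z) per C-alpha atom.
    Returns the matrix of normalized displacement correlations."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    # Mean position of each atom over the trajectory
    mean = [
        tuple(sum(f[i][k] for f in frames) / n_frames for k in range(3))
        for i in range(n_atoms)
    ]
    # Displacement of each atom from its mean position in every frame
    disp = [
        [tuple(f[i][k] - mean[i][k] for k in range(3)) for i in range(n_atoms)]
        for f in frames
    ]

    def avg_dot(i, j):
        # Time average of the dot product of displacements of atoms i and j
        return sum(
            sum(d[i][k] * d[j][k] for k in range(3)) for d in disp
        ) / n_frames

    matrix = []
    for i in range(n_atoms):
        row = []
        for j in range(n_atoms):
            denom = math.sqrt(avg_dot(i, i) * avg_dot(j, j))
            row.append(avg_dot(i, j) / denom if denom > 0 else 0.0)
        matrix.append(row)
    return matrix

# Toy trajectory: atoms 0 and 1 move together along x, atom 2 moves the opposite way
toy_frames = [
    [(0, 0, 0), (5, 0, 0), (10, 0, 0)],
    [(1, 0, 0), (6, 0, 0), (9, 0, 0)],
]
cmap = dccm(toy_frames)
```

      Values near +1 indicate atoms moving in the same direction, and values near -1 indicate anticorrelated motion. A measure of this kind captures only linear relationships between displacements, which is the limitation the reviewer's mutual information suggestion addresses.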

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      We thank Review Commons and its three reviewers for your supportive and insightful responses to our manuscript. Below, we provide detailed responses to the reviewers’ individual comments and how we plan to address them during the revision.

      Reviewer #1: **Major comments:**

      The manuscript is very well written. The data is clearly presented. The methods are explained in sufficient detail with a few exceptions mentioned below, and statistical analysis are adequate. There are some concerns and suggestions about the experimental design and data presentation.

      - Drug treatments. It is not clear whether the cells were previously grown in charcoal-stripped serum before hormone treatments. From the methods, it seems they were grown in 5% FBS and directly treated with the hormones. Also, what does "hormone-free medium" mean? Is it charcoal-stripped serum or no serum at all?

      For all experiments, the cells were grown in medium containing 5% FBS. Throughout the manuscript, “hormone-free” refers to medium containing 5% FBS with no dexamethasone added. Technically, this medium is not hormone-free, as FBS contains low levels of cortisol. Based on experiments comparing medium containing charcoal-stripped serum with medium containing regular 5% FBS, however, the cortisol levels contributed by the FBS seem insufficient to elicit a transcriptional response or DNA binding by GR. We nonetheless acknowledge that it should be made clear to the reader that the growth conditions were technically not hormone-free. We will include this information in both the methods and results sections of a revised manuscript. In addition, we will state explicitly that our naive cells are those that have not been exposed to a high dose of hormone.

      Replicates for these data sets? The ATAC and Chip-Seq should have at least 2. The concordance of the ATAC-seq and Chip-seq replicates should be described and shown in supplemental figures.

      The ChIP-seq peaks for GR are the intersection of two biological replicates. This is described in the Methods section (page 7). For the ATAC data, we used two biological replicates for the vehicle-treated cells and, for the hormone-treated cells, treated the experiments with two different hormones (dexamethasone and cortisol) as replicates. In a revised manuscript, we will add a supplemental figure to show the concordance between the replicates.
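
      To make the idea of taking the intersection of two replicate peak sets concrete, here is a minimal sketch for a single chromosome. It assumes half-open `(start, end)` intervals that are sorted and non-overlapping within each replicate; the function name and coordinates are hypothetical, and the authors' actual pipeline may define the intersection differently (for example, by keeping whole peaks rather than the overlapping segment).

```python
def intersect_peaks(rep1, rep2):
    """Return the segments shared by two sorted peak lists (one chromosome).

    Each peak is a half-open (start, end) tuple. Only the overlapping
    portion of each pair of peaks is reported.
    """
    shared = []
    i = j = 0
    while i < len(rep1) and j < len(rep2):
        start = max(rep1[i][0], rep2[j][0])
        end = min(rep1[i][1], rep2[j][1])
        if start < end:
            shared.append((start, end))
        # Advance whichever peak ends first; it cannot overlap anything later.
        if rep1[i][1] < rep2[j][1]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical peak calls from two biological replicates.
replicate_1 = [(100, 200), (300, 400), (500, 600)]
replicate_2 = [(150, 250), (390, 410), (700, 800)]
reproducible = intersect_peaks(replicate_1, replicate_2)
```

      In this toy example, only the first two peaks of each replicate overlap, so the third peak of each replicate is discarded as not reproducible.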

      Fig1A - The ATAC-seq HM should be clustered to show which peaks in opening/closing and unchanged peaks also have called GR chip peaks. Showing browser shots as in Fig1B is cherry picking data and can be put in a supplementary figure as an example. This is a main point of emphasis of the manuscript so show the data. The atac peaks that do overlap with GR chip peaks should be sorted by GR peak intensity. The QPCR is then only needed to confirm the quantitative changes.

      This is a good idea. As suggested by this reviewer (and also in response to a comment by one of the other reviewers), we will revise this figure panel to make the overlap between GR binding and opening and closing sites more obvious. Here are the numbers:

      A549 cells:

      opening sites: 49%

      closing: 10%

      nonchanging: 18%

      U2OS cells:

      opening: 54%

      closing: 0.2%

      nonchanging: 7%

      Regarding the use of browser shots, obviously these are cherry picked examples, however in our opinion they serve a purpose beyond illustrating examples of individual loci that open or close as they also give the reader an idea of the quality of the ATAC-seq data.

      To show both the ATAC sites and H3K27ac sites are specific to hormone treatment, a random set of 15K peaks not in this peak set also should be shown in HMs and should not change with the treatments. Why does the H3K27ac go down in the 6768 non changing sites with dex?

      The proposed group of control peaks is essentially what we included as “non-changing” peaks. For the revision, we will refine this group and compare the H3K27ac signal between GR-occupied and non-occupied groups. Regarding the reduced H3K27ac signal upon Dex treatment at non-changing sites: notably, this comparison is based on a single ChIP-seq replicate. In our experience, ChIP-seq experiments show considerable variability between biological replicates, which limits our ability to compare signal levels quantitatively. Thus, the difference could simply reflect a difference in ChIP efficiency between the treated and untreated cells. Alternatively, there could be a general redistribution of H3K27ac signal towards GR-occupied opening sites. To pin down which of these explanations is valid, we would need to perform additional experiments, e.g. using spike-ins. However, this is beyond what we can do at the moment; we will therefore revise the text to make clear that the interpretation of these results is speculative.

      The D & E parts of Fig1 can then be eliminated to become parts of Fig1A. Its not clear in the text that the HMs in Fig1 are all sorted in the same way.

      We will revise figure 1 as requested. In our initial submission, the data was always sorted by signal intensity of the feature shown. We will revise this and sort by ATAC-signal and keep a consistent sorting order for other features shown (and stratify each group into GR-occupied or not).

      - Fig. 1b (and d). The ChIP data is from 3h-hormone treatment while the ATAC-seq data is from a 20h hormone treatment. It seems a bit misleading to directly compare GR occupancy with the state of the chromatin at different time windows. Shouldn't the authors show their ATAC-seq 4h treatment data (shown in Fig S1) here instead?

      We will revise the figures as suggested to show the same time point for ChIP and ATAC-seq data.

      - Fig. 1f. The authors state "downregulated genes only show a modest enrichment of GR peaks". However, there is a significant enrichment of GR-peaks in repressive genes compared to non-regulated genes. It would be interesting to see how some of these peaks look in a browser shot. While the general conclusion "transcriptional repression, in general, does not require nearby GR binding", seems valid, the observation that many GR peaks appear directly bound to nearby repressed genes ought to be more emphatically recognized in the text.

      This is a fair point and was also raised by the other reviewers. During the revision, we will make textual changes to acknowledge that GR binding is enriched near repressed genes, albeit less so than for activated genes. In addition, we will include genome browser shots of genes with nearby peaks that are repressed by GR.

      - Concept of naïve cells (Fig. 3A). If cells are normally grown in serum-containing media, which is known to have some level of steroids, can the cells described here as "Basal expression" be truly free of a primed state? In the first part of the experimental design (+/- 4h hormone), which type of media is present here? Is it 5% FBS? A concern is that the authors may require the assumption that the (4h + 24h) period is sufficient to erase all memory of the cells, which is exactly what they are trying to test.

      See our response to the first major comment above.

      It would be interesting to do a time course of the hormone-free period of the washout to determine the memory of the chromatin environment that results in the enhanced transcriptional response instead of just 24 and 48 hrs in A549 cells.

      We agree that that would be interesting but this is something that we cannot include for now.

      Fig 5A appears to show H3K27ac overlaying H3K27me marks near the promoter of ZBTB16 and at the GR sites within the gene locus with no reduction in H3K27me levels. This seems counterintuitive and should be explained or addressed especially since the authors use quantitative comparisons of H3K27ac levels with and without treatment in other figures.

      A trivial explanation for the overlaying H3K27ac and H3K27me3 marks at the ZBTB16 locus is that the ChIP results represent a population average. From our single-cell FISH experiments, we found that only a subset of cells activates ZBTB16 expression upon hormone treatment. Thus, a potential explanation is that the cells of the population that respond are responsible for the H3K27ac signal whereas the non-responders are decorated with H3K27me3. We will include this information in a revised discussion.

      Showing the changes of ZBTB16 upon 2nd stimulation via FISH is not terribly surprising and is even the most expected reason for higher RNA levels. Why does it only occur at that gene is a better question and is touched on in the discussion. It is more likely that this gene has a very low level of pre-hormone transcription compared to FKBP5 (see Fig 3e and the FISH images). ZBTB16 is in the lower 3rd of basemean RNA levels of GR responsive genes according to the RNAseq data. Selection of 1 or 2 other genes with similar basemean levels of RNA (from the RNA-Seq data) would make the data more

      When compared to FKBP5, ZBTB16 indeed has very low levels of pre-hormone expression. However, this is unlikely to explain the observed “memory” for ZBTB16, given that there are other genes with similarly low pre-hormone levels that do not show more robust responses upon repeated hormone exposure (see Fig. 3B,D). For the FISH experiments, we decided to include a non-primed gene (FKBP5) as a control. We agree that adding additional control genes with comparable basemean levels would be informative. For example, this would tell us whether a response to hormone in only a subset of cells in the population is specific to ZBTB16. Based on single-cell studies by others (PMID: 32170217), most GR target genes show a response in only a subset of cells, indicating that this is unlikely to be a unique feature of ZBTB16 that explains the priming observed. Rather than performing additional experiments, we will revise the discussion to acknowledge the difference in basemean and the potential role of cell-to-cell variability in explaining the observed “memory” for the ZBTB16 gene.

      **Minor comments:**

      - In the Intro (paragraph two), the authors explain the different mechanisms by which GR might repress genes. One alternative the authors appear to have missed is the possibility of direct binding to GREs while, for example, recruiting a selective corepressor such as GRIP1 (Syed et al., 2020). There are many recent critics to the notion that transrepression via tethering is responsible for GR repressive actions at all (Escoter-Torres et al., 2020; Hudson et al., 2018; Weikum et al., 2017).

      We are aware of these studies and agree that they should be included when listing the possible mechanisms by which GR can repress genes. We will revise the text accordingly.

      - When the authors introduce the concept of tethering to AP-1, they go way back to the first description of tethering. However, one of the references (Ref 20) actually goes against the tethering model as they did not detect protein-protein interactions between AP-1 and GR, and also, they conclude that repression requires the DNA-binding domain.

      We will pick a more appropriate reference indicative of tethering as a mechanism by which GR might repress genes.

      -Figure 2. The authors state "This suggests that the few sites with persistent opening are likely a simple consequence of an incomplete hormone washout and associated residual GR binding". The authors should check the subcellular distribution of GR after their washout protocol. If the washout is not completed, GR should still be in the nuclear compartment.

      The careful phrasing here was to include the possibility that GR might bind DNA even when hormone is completely washed out. However, a more likely explanation is that the washout is incomplete. The residual GR binding we find in our ChIP assays shows us that a subset of GR is indeed still chromatin-bound which implies that some GR is still in the nuclear compartment.

      - The first part of the manuscript (Repression through "squelching") seems a bit disconnected from the rest of the results (reversibility in accessibility). The abstract is structured in a way that this disconnection seems much less obvious. Perhaps the authors could try to present their squelching part in the middle of the manuscript, following the flow of the abstract? This is just a suggestion.

      When revising the manuscript, we will see if implementing this suggestion is feasible.

      - Figures have CAPS panel letters (A,B,C, etc) while the text calls for lower case letter (a,b,c...)

      We will fix this as part of the revision.

      Reviewer #2: **Major Comments**

      We agree that long-term and repeated GC treatment would be very interesting to study and would yield insights that are more likely to be relevant to, for example, emerging GC-resistance during therapeutic use. We are aware of the limitations of our study and will make sure that these are acknowledged in the revised manuscript and we will point out the speculative nature of translating our findings to an in-vivo setting.

      2a.) The authors show several heatmaps to indicate changes in accessibility, H3K27ac and P300 upon Dex treatment as well as GR binding patterns in Fig. 1 and S1. Those are sorted by decreasing signal strength (I assume). To make those results more comparable, I suggest to sort them all in the same way (e.g. by descending ATAC-Seq signal or fold-change).

      A similar suggestion was made by reviewer 1. We agree that using the same sort order for the datasets makes it easier to link the different types of data we generated. We will present the data with a consistent sorting order and stratified by GR-occupied or not when we revise the manuscript.

      2b.) In line with a.), it is unclear to the reader if those sides opening /closing are the same sides showing increased/decreased H3K27ac or P300 occupancy and if those sides bind GR. Integrating this data together with mRNA e.g as correlation plots would strengthen the author's argument that accessibility, H3K27ac and mRNA changes are indeed correlated. What about the GR binding sites that do not change accessibility or H3K27ac? What makes those different? ** Therefore, the statement "Furthermore, closing peaks, which show GC-induced loss of H3K27ac levels and lack GR occupancy (Fig. S1c-f), were enriched near repressed genes" on page 10 as well as the statement "suggesting that transcriptional repression by GR does not require nearby GR binding." in the abstract and discussion cannot be made from how the data is presented.

      The first issue raised will be addressed by using the same sort order across different types of data. It might also shed light on features associated with GR binding sites that do not change accessibility or H3K27ac. Once we implement the revised sorting order, we will evaluate if the statements mentioned are indeed supported by the data.

      2c.) Several recent studies have shown that GR's effect on gene expression and chromatin modification at enhancers might be locus-/context-specific ("tethering", competition, composite DNA binding) and/or recruitment of different co-regulators (see Sacta et al. 2018 (doi: 10.7554/eLife.34864), Gupte et al. 2013 (doi.org/10.1073/pnas.1309898110) and many more). Defining the GR-bound or opening/closing sides in terms of changing H3K27ac (or having H3K27ac or not) more closely would help to link those to gene expression changes e.g. in violin plots. Furthermore, the authors could include a motif analysis to see if the different enhancer behaviours can be explained by differences in the GR motif sequence or co-occurring motifs. Thereby more closely defining the mechanism of chromatin closure a sites that lack GR binding e.g. by displacement of other transcription factors as described for p65 in macrophages (Oh et al. 2017 (doi.org/10.1016/j.immuni.2017.07.012)). In general a more detailed analysis of the data is required before the authors could state "Instead, our data support a 'squelching model' whereby repression is driven by a redistribution of cofactors away from enhancers near repressed genes that become less accessible upon GC treatment yet lack GR occupancy." on page 10. The results might also be explained by competitive transcription factor binding, tethering or selective co-regulator recruitment (e.g. HDACs).

      We will include a motif analysis comparing opening, closing and non-changing sites (stratified into GR-occupied or not) in a revised version of the manuscript. In addition, we will further investigate the redistribution of p300 upon Dex-treatment e.g. to test the correlation between p300 loss at closing sites lacking GR occupancy and transcriptional repression. We agree that the “squelching model” is just one of several explanations for repression and will provide a more comprehensive list of possible explanations beyond squelching as part of the revision.

      We will discuss the difference in receptor levels between the cell lines, the different number of genomic GR binding sites and its possible implication in the observed residual binding after wash-out in U2OS-GR cells as suggested.

      We agree that the coverage plots do not take the fraction of binding sites with signal into account. However, by also showing the heat maps, this information is also available to the reader. In our opinion, the coverage plots provide a straight-forward way to compare the signal for the different categories of peaks. The violin plots are an interesting alternative way to present the data, which also captures the diversity in the signal within each group. We will add violin plots to the supplementary data as requested.

      We see your point. However, based on the ATAC signal (Fig. 5D), the changes in nucleosomal occupancy upon GC treatment are the same for naive and primed cells and revert to their baseline level after hormone withdrawal. This indicates that these loci have comparable nucleosome occupancy after wash-out. Yet, the levels for these histone modifications do not differ between primed and naive cells, indicating that these histone marks do not “mark” the promoter of primed genes after wash-out.

      We are reluctant to put p-values on every chart, especially for experiments with few replicates. Importantly, we always plot the values for each individual data point, so the reader can gauge whether they differ between conditions. We will add p-values for figure 4 to test (support) our claim that ZBTB16 is primed whereas other GR target genes are not.

      A similar suggestion was brought up by reviewer #1; here is the response we gave to this comment: When compared to FKBP5, ZBTB16 indeed has very low levels of pre-hormone expression. However, this is unlikely to explain the observed “memory” for ZBTB16, given that there are other genes with similarly low pre-hormone levels that do not show more robust responses upon repeated hormone exposure (see Fig. 3B,D). For the FISH experiments, we decided to include a non-primed gene (FKBP5) as a control. We agree that adding additional control genes with comparable basemean levels would be informative. For example, this would tell us whether a response to hormone in only a subset of cells in the population is specific to ZBTB16. Based on single-cell studies by others (PMID: 32170217), most GR target genes show a response in only a subset of cells, indicating that this is unlikely to be a unique feature of ZBTB16 that explains the priming observed. Rather than performing additional experiments, we will revise the discussion to acknowledge the difference in basemean and the potential role of cell-to-cell variability in explaining the observed “memory” for the ZBTB16 gene.

      The fact that we do not observe elevated expression of other genes upon repeated exposure could be due to the relatively short length of the hormone treatment, 4 hours, which was chosen to enrich for direct target genes of GR. These four hours might be insufficient for transcription, translation and ultimately gene regulation by the ZBTB16 protein. We have not looked at ZBTB16 protein levels.

      **Minor Comments**

      We will include this information in a revised version of the manuscript.

      We will add the requested peak-centric view. Based on a previous study (PMID: 29385519), we expect that binding is a poor predictor of gene regulation of nearby genes, especially for repressed genes.

      In our analysis, we looked at opening and closing peaks independently. If a peak is in the vicinity of multiple genes, it will only be assigned to the closest one. Thus, genes that have both an opening and a closing peak in the 50kb window will be included in both the analysis of closing sites and the analysis of opening sites. We have not looked at clusters of binding sites, but agree that it would be interesting to see whether the combinatorial action of multiple peaks makes regulation of the gene more likely. We will look into this during the revision process.
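
      The assignment rule described above can be sketched as follows. This is an illustration only: it assumes that peaks are represented by their midpoints, that genes are represented by a single transcription start site on the same chromosome, and that the 50kb window is measured as the distance between the two. The function name, gene names and coordinates are hypothetical.

```python
def assign_peaks_to_genes(peaks, genes, window=50_000):
    """Assign each peak to its closest gene if it lies within the window.

    peaks: dict mapping peak name -> midpoint coordinate.
    genes: dict mapping gene name -> TSS coordinate (same chromosome).
    Returns a dict mapping gene name -> list of assigned peak names.
    """
    assignment = {}
    for peak, position in peaks.items():
        # Each peak goes to exactly one gene: the nearest one.
        nearest = min(genes, key=lambda gene: abs(genes[gene] - position))
        if abs(genes[nearest] - position) <= window:
            assignment.setdefault(nearest, []).append(peak)
    return assignment


# Hypothetical genes and peaks; opening and closing peaks are analysed independently.
genes = {"geneA": 100_000, "geneB": 180_000}
opening = assign_peaks_to_genes({"open1": 120_000, "open2": 400_000}, genes)
closing = assign_peaks_to_genes({"close1": 90_000}, genes)

# A gene with both an opening and a closing peak appears in both analyses.
in_both = sorted(set(opening) & set(closing))
```

      Because the two peak classes are processed separately, `geneA` ends up in both result sets here, which mirrors the behaviour described in the response.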

      1. The authors claim on p10 that "We could validate several examples of opening and closing sites and noticed that opening sites are often GR-occupied whereas closing sites are not occupied by GR". As most of the ChIP-Seq experiments were performed on formaldehyde-only fixed cells, the authors might miss "tethered" sides, which are mostly linked to gene repression. You might rephrase this part to most closing sites lack direct DNA binding.

      Even though several studies indicate that tethered binding can be captured using formaldehyde-only fixed cells (e.g. PMID: 32619221, PMID: 15879558), we agree that the ChIP-assay might have blind spots, for instance for tethered binding, and will revise our statements as suggested.

      This might be related to comment #4 given that P300 is brought to the DNA by other transcription factors whereas H3K27ac is directly DNA-bound which likely influences the cross-linking efficiency. By resorting the heat-maps, we will be able to determine the overlap between p300 recruitment and changes in H3K27ac levels (the other main enzyme that deposits this mark is CREBBP (a.k.a. CBP)).

      We will include this information in a revised version of the manuscript.

      We have not looked into this but a previous study by the Reddy lab (PMID: 22801371) has investigated binding sites in A549 cells that are occupied at very low Dex concentrations. They found that this is not driven by a specific GR motif but rather by the presence of binding sites for other transcription factors and chromatin accessibility.

      This data for the GILZ gene is shown in Figure S2C. When we revise the manuscript we will add this information to main figures 1 and 2 as suggested.

      This is shown in figure S3C and shows that expression levels of certain genes (ZBTB16 and FKBP5 but not GILZ) stay high after Dex washout (but not cortisol wash-out) consistent with persistent GR binding at a subset of GR-occupied loci for the experiments using Dex.

      For both S2C and S3C, cells were treated for 4h with Dex before the wash-out. For the ZBTB16 and FKBP5 genes, the persistent GR binding after wash-out is accompanied by a preserved Dex response after wash-out. For GILZ, GR binding at one of the peaks near the GILZ gene is also preserved, yet the expression of this gene reverses to its pre-treatment levels after wash-out. A possible explanation is that the residual binding at the GILZ gene is observed for only one of several nearby GR peaks. Previous studies, where we deleted GR binding sites near the GILZ gene, have shown that the combined action of multiple GR-occupied regions is needed for robust induction of this gene (PMID: 29385519).

      A trivial explanation for the overlaying H3K27ac and H3K27me3 marks at the ZBTB16 locus is that the ChIP data represent a population average. From our single-cell FISH experiments, we found that only a subset of cells activates ZBTB16 expression upon hormone treatment, so a potential explanation is that the cells of the population that respond are responsible for the H3K27ac signal whereas the non-responders are decorated with H3K27me3. We will include this information in a revised discussion. On a single histone, H3K27me3 and H3K27ac are mutually exclusive. However, given that a nucleosome has 2 copies of histone H3, both modifications can in principle co-exist.

      We’re guessing here, but we assume the reviewer refers to the potentially slightly higher H3K27me3 levels upon Dex treatment for ChIP-seq whereas the qPCR indicates that the levels do not change? The change seen in the ChIP-seq experiment is marginal and based on a single experiment. In contrast, the qPCR data shows the results from three biological replicates and therefore is probably a more reliable source of information.

      We will include this information in a revised version of the manuscript.

      Cancer cell lines often have variable karyotypes and our FISH data suggests that the ZBTB16 locus is present in more than 2 copies in some of the A549 cells. The ATCC website describes the karyotype of A549 cells as follows: “… This is a hypotriploid human cell line with the modal chromosome number of 66, occurring in 24% of cells. Cells with 64 (22%), 65, and 67 chromosome counts also occurred at relatively high frequencies; the rate with higher ploidies was low at 0.4% …”.

      Upon quick inspection, we find that GR target genes are typically not marked by H3K27me3, however ZBTB16 does not appear to be the only one. When we revise the manuscript, we will look more systematically at the link between gene regulation by GR and genes marked by H3K27me3 to determine how “special” the presence of this mark is, which will also inform us about the likelihood that it is linked to the transcriptional memory observed for the ZBTB16 gene.

      We are not sure if ZBTB16 regulation by GR is tissue independent. However, in contrast to most GR target genes that are regulated in a cell-type-specific manner, ZBTB16 is regulated in both cell lines we examined and has also been reported to be a GR target gene in other cell types e.g. in macrophages (PMID: 30809020).

      Reviewer #3 **Major Comments:**

      For sure the washout time matters and we do not doubt that the persistent changes observed upon shorter wash-out by the Hager lab are real. One of the reasons we chose the 24h period was to see if the changes observed by Lightman and Hager might persist for extended periods of time as suggested by Zaret and Yamamoto. Our findings suggest that this is not the case and that the majority of GR-induced changes are short-lived. Perhaps future studies can shed light on how long changes persist. However, given the slow dissociation between GR and Dex, we expect that it might be hard to dissect if persistent changes are indeed persisting in the absence of GR binding or reflect an incomplete hormone wash-out.

      The objective of this study was to find out whether persistent changes as observed in Ref33 are the exception or the rule, not to test whether the original observation is correct (importantly, another cell line was used in Ref33, which makes a 1:1 comparison impossible to begin with). We believe that we have convincingly shown that, for the cell lines we assayed, persistent changes are rare, if they occur at all. Given that no convincing persistent changes were observed after a 24h washout, we think that it is very unlikely that such changes would be observable after even longer wash-out periods. We do not intend to include experiments using longer wash-out but will revise the discussion to emphasize that the lack of persistent changes we found might be specific to the cell lines we chose for our studies.

      We agree that adding this percentage is a good idea as this would allow for a more quantitative comparison between the different groups. Here are the numbers:

      A549 cells:

      opening sites: 49%

      closing: 10%

      nonchanging: 18%

      U2OS cells:

      opening: 54%

      closing: 0.2%

      nonchanging: 7%

      We will include this information in a revised version of the manuscript.

      For the ATAC-seq experiments, we treated the dex-treated and cort-treated experiments as replicates to find candidate regions with persistent chromatin changes. For the ATAC-seq data, a site is 'persistent' if it is called (by MACS2, e.g. DEX vs EtOH) upon treatment and then again 24h after washout (DEX washout vs EtOH washout). For the ATAC-qPCR experiments, we performed 4 biological replicates and will perform a t-test to determine if the small difference we observe at some sites between the EtOH and washout conditions is statistically significant. Given the overlapping error bars and the very small difference, we do not expect the difference to be significant even for these most promising candidates from our genome-wide analysis.
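
      The two steps described above can be sketched as follows, under stated assumptions. Sites are represented by hypothetical identifiers that have already been matched between the two comparisons, the qPCR values are invented toy numbers, and Welch's unequal-variance statistic is used as one possible form of the t-test; the response does not specify which variant will be applied.

```python
import math
from statistics import mean, variance


def persistent_sites(called_on_treatment, called_after_washout):
    """Sites called upon treatment and called again 24h after washout."""
    return sorted(set(called_on_treatment) & set(called_after_washout))


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (no p-value computed)."""
    se_a = variance(sample_a) / len(sample_a)
    se_b = variance(sample_b) / len(sample_b)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)


# Hypothetical site identifiers from the two differential calls.
treatment_calls = ["site1", "site2", "site3"]
washout_calls = ["site2", "site4"]
candidates = persistent_sites(treatment_calls, washout_calls)

# Toy ATAC-qPCR values for one candidate site, 4 biological replicates each.
washout_signal = [2.0, 3.0, 4.0, 5.0]
vehicle_signal = [1.0, 2.0, 3.0, 4.0]
t_stat = welch_t(washout_signal, vehicle_signal)
```

      Converting the statistic into a p-value requires the t distribution with the Welch-Satterthwaite degrees of freedom, which is typically taken from a statistics package rather than computed by hand.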

      Indeed, we did not find a mechanistic explanation for the ZBTB16-specific memory. Possible explanations are discussed in the following section of the results (page 14-15): “… Mirroring what we saw in terms of chromatin accessibility, transcriptional responses also seem universally reversible with no indication of priming-related changes in the transcriptional response to a repeated exposure to GC for any gene with the exception of ZBTB16. Although several changes in the chromatin state occurred at the ZBTB16 locus, none of these changes persisted after hormone washout arguing against a role in transcriptional memory at this locus (Fig. 5). Similarly, the increased long-range contact frequency between the ZBTB16 promoter region and a GR-occupied enhancer does not persist after washout (Fig. 5e). Notably, our RNA FISH data showed that ZBTB16 is only transcribed in a subset of cells, hence, it is possible that persistent epigenetic changes occurring at the ZBTB16 locus also only occur in a small subset of cells and could thus be masked by bulk methods such as ChIP-seq or ATAC-seq. Another mechanism underlying the priming of the ZBTB16 gene could be a persistent global decompaction of the chromatin as was shown for the FKBP5 locus upon GR activation [35]. Likewise, sustained chromosomal rearrangements, which we may not capture by 4C-seq, could occur at the ZBTB16 locus and affect the transcriptional response to a subsequent GC exposure. Furthermore, prolonged exposure to GCs (several days) can induce stable DNA demethylation as was shown for the tyrosine aminotransferase (Tat) gene [71]. The demethylation persisted for weeks after washout and after the priming, activation of the Tat gene was both faster and more robust when cells were exposed to GCs again [71]. Interestingly, long-term (2 weeks) exposure to GCs in trabecular meshwork cells induces demethylation of the ZBTB16 locus raising the possibility that it may be involved in priming of the ZBTB16 gene [72]. However, it should be noted that our treatment time (4 hours) is much shorter. Finally, enhanced ZBTB16 activation upon a second hormone exposure might be the result of a changed protein composition in the cytoplasm following the first hormone treatment. In this scenario, increased levels of a cofactor produced in response to the first GC treatment would still be present at higher levels and facilitate a more robust activation of ZBTB16 upon a subsequent hormone exposure. Although several studies have reported gene-specific cofactor requirements [73], the fact that we only observe priming for the ZBTB16 gene would make this an extreme case where only a single gene is affected by changes in cofactor levels …”.

      **Minor Comments**

      We will include a motif analysis for opening and closing sites in a revised version of the manuscript.

      We will revise the label in a revised version of the manuscript as suggested.

      We actually prefer the MA plots as they also provide information regarding the basemean counts for regulated genes. This allows one, for example, to see that other GR-regulated genes with similar basemean counts do not show a “memory” suggesting that the low expression level for ZBTB16 likely does not explain the observed priming.

      We will include this information in a revised version of the figure.

    1. The difference today is that the landscapes within which species would move in response to climate change have been highly modified by human activity through deforestation, agricultural conversion, wetland drainage and the like.

      It is extremely saddening to think that humans are destroying species at an alarming rate not only through the causes listed in the reading, but also through climate change. Climate change has been talked about for a very long time, and many big businesses seem to ignore it in hopes that it will go away and things won't have to change. The problem with this is that it seems like nothing is going to dramatically change until it starts affecting humans, as it already has with some of the fires and other natural disasters.

      Just as anyone would naturally move when their current living situation is uncomfortable, these animals are trying to do the same, but can't because we're already in their way.

      The only way things are going to change is if people are educated about what is going on. This "climate change isn't real" stuff needs to stop, and people in higher power need to put regulations on everyday life or else this will only continue. The entire world needs to work together as a whole to fight what is coming our way.

      If the glaciers completely melt in the 15 years stated earlier in the chapter, will we be able to stop what has happened? Positive feedback loops are usually not reversible. For example, pregnancy is a positive feedback loop, and isn't complete until birth. As glaciers moved, they created significant biodiversity in plant life as they carried seeds and other things with them. And it seems that as these glaciers leave us, the biodiversity in many cases is leaving too, which feels ironic in a way. And it is all caused by humans.

    1. Author Response:

      Reviewer 1:

      In the study by Buus et al., the authors set out to address an important need to understand how oligo-conjugated antibodies should be optimally utilized in droplet-based scRNA-seq studies. These techniques, often referred to as CITE-seq, complement techniques such as flow cytometry and mass cytometry yet also further extend them by the ability to jointly measure intra-cellular RNA-based cell states together with antibody-based measurements. As is the case with flow cytometry, manufacturers provide staining recommendations, yet encourage users to titrate antibodies on their specific samples in order to derive a final staining panel. Based on the ability to stain with hundreds of antibodies jointly, few studies to date have assessed how the antibodies present in these pre-made staining panels respond to a standard titration curve. In order to address this point, this study tests two dilution factors, staining volume, cell count, and tissue of origin to understand the relationships between signal and background for a commercially available antibody panel. They arrive at the general recommendation that these panels could be improved, grouping various antibodies into distinct categories.

      This study is of general interest to the scRNA-seq and CITE-seq communities as it draws attention to this important aspect of CITE-seq panel design. However, it would stand to be substantially improved by not only providing suggestions but also testing at least one, if not more, of their suggestions from Supplementary Table 2, and preferably performing experiments using more technical replicates or biological replicates. As it stands now, the study is largely based on one PBMC and one lung sample, that were stained once with each manipulation as far as can be gathered from the Methods.

      We appreciate the reviewer’s insight into the methodology and enthusiasm for the study.

      We do want to clarify that the study does not use a “pre-made staining panel” from a commercial vendor, but rather a cocktail of individual antibodies available from a commercial vendor (with emphasis on epitopes relevant to immunology and cancer research). We have also clarified this in the text of the manuscript.

      We hope that the added analysis, our point-by-point response to the issues raised by the reviewer, and the inclusion of new CITE-seq data from the panel with adjusted concentrations alleviate the main concerns of the reviewer.

      1) Given the title is improving oligo-conjugated antibody… it would be important to functionally test one of the suggestions. We would suggest a full titration curve of selected antibodies, perhaps one from each of the categories, but if cost is a concern at least two or three antibodies, to identify how titration impacts antibodies, and especially those in categories labeled as in need of improvement. Relatedly, if the idea is that if antibodies (such as gD-TCR) do not have a cognate receptor leading to general background spread, does spiking in a cell that is a known positive in increasing ratios remedy this issue by acting as a target for the antibodies? Does adding extra washes help to remedy these issues of background?

      These are excellent points. Full titration curves have previously been published showing that oligo-conjugated antibodies respond to titration, and in that regard behave similarly to fluorophore-conjugated antibodies assayed by flow cytometry (see Stoeckius et al. 2018. Genome Biology; Fig. 3A-D). Our study does not aim to identify the optimal concentration of individual antibodies in isolation but strives to provide the optimal signal-to-noise ratio for each antibody in a cocktail while taking sequencing requirements into account - this is why we don’t focus on full titration curves and saturation kinetics for each antibody/epitope. If we used all antibodies at their highest signal-to-noise ratios, this would drastically increase sequencing requirements of the library as highly expressed markers would use the vast majority of the total sequencing reads. As such, we aimed to get “sufficient” signal-to-noise while keeping the sequencing allocated to each marker balanced.

      Furthermore, because our results show that background signal can be largely attributed to free-floating antibodies in the solution, using high concentrations for all markers in one or more conditions would increase the background in all conditions if these were multiplexed into the same droplet segregation. This phenomenon would likely obscure the positive signals and possibly the titration response at lower concentrations (similar to what we see for category A antibodies). To avoid this, and for full titration curves to be meaningful, each condition should be run in its own droplet segregation, making such titration efforts prohibitively costly. We have elaborated on this in the discussion of the revised manuscript.

      We agree that it would greatly improve the study to include results from our panel with adjusted concentrations. In the revised manuscript, we have made efforts to address this by making a comparison between the sample stained with the pre-titration (DF1) concentrations and a sample stained with concentrations that have been adjusted based on their assigned categories (from Table 1). We believe that these new data convincingly demonstrate improvements both in the individual antibody signals and in the sequencing balance (see new Fig. 5). While the adjusted concentrations could still benefit from further improvements, we show that at similar sequencing depths, the adjusted concentrations provide a more balanced sequencing output and exhibit a 57 % increase in the median positive signal and a 43 % reduction in the median background signal for the 52 antibodies in our panel. The benefit of the adjusted concentration was particularly remarkable for CD86, which went from having 76.5 % to 12.6 % of UMIs assigned to background signal and thus yielded comparable positive signal while using 4.8-fold fewer UMIs (new Fig. 5G).

      Spiking in cells that express the cognate antigen is an interesting idea. However, as the spiked in cells would be included in all the downstream processes including sequencing of mRNA and other modalities, it would be quite costly to spike-in cells that are not of biological interest – only to decrease background of one or a few antibodies.

      While the results presented in the manuscript do not address this directly, our data strongly suggest that adding extra washing would help reduce free-floating antibodies in the solution captured in the gel-bead emulsions responsible for some of the observed background signal (as can be assayed by the non-cell-containing droplets). For such a test to make sense, the staining conditions should be identical for two samples that are differentially washed (including the exact same cell composition) and would require fully separate droplet segregations (i.e. utilization of separate 10x lanes), which would make it a very costly experiment solely to test the washing effect. However, we have done preliminary tests using a short (150 bp) cDNA amplicon spiked into different tubes or plates containing ~750 × 10³ PBMCs to determine washing efficiency by qPCR. Here we assayed how increasing the washing volume from 200 µl (96-well) to 1.5 mL or 50 mL for two washes reduced the detection of the spiked-in amplicon in the supernatant as compared to an unwashed sample. While short cDNA amplicons may not behave identically to oligo-conjugated antibodies, they simulate background signal stemming from free-floating antibodies and thus can be used to evaluate different washing conditions for a given set-up. As expected, using higher washing volumes does indeed greatly reduce the amount of amplicon (simulating free-floating “background” antibodies) detected in the resulting suspension. (https://raw.githubusercontent.com/Terkild/CITE-seq_optimization/master/figures/review_washing_test.png)

      2) Another way of improving these panels is through reducing the costs spent on both staining but perhaps more importantly the sequencing-based readouts. Several times in the manuscript (at line 77 for example or line 277) it is alluded to that the background signal of antibodies can make up a substantial cost of sequencing these libraries. However, no formal data on cost is presented, which would be important to formalize the author's points. It would be important to provide cost calculations and recommendations on sequencing depth of ADT libraries based on variation of staining concentration. Relatedly, in the methods, sequencing platform and read depth for ADT libraries was not discussed, nor is the RNA-seq quality control metrics provided other than a mention of ~5,000 reads/cell targeted. This is important to report in all transcriptomic studies, and especially a methods development study.

      Thank you for pointing out the very sparse description of choice of sequencing method and RNA-seq quality controls. We have included additional metrics in the materials and methods and included a new Suppl. Fig. S1 showing number of detected genes as well as UMI counts within the mRNA and ADT modalities in the revised manuscript. We agree that reducing sequencing cost (without reducing biological information) is a major reason for optimizing staining with oligo-conjugated antibodies. We have now added a section in which we elaborate on the potential cost saving, and other benefits of titration of antibody panels and provide some examples from our datasets. Actual savings of optimization of these panels will be very dependent on a given setup, starting concentrations and the depth of sequencing that the particular research questions (and budget) warrant.

      Due to the 10-1000 fold higher numbers of proteins as compared to coding mRNA [16], ADT libraries have high library complexity (unique UMI content) and are rarely sequenced near saturation. Thus, either sequencing deeper or squandering fewer reads on a handful of antibodies, will result in an increased signal from other antibodies in the panel. We found that by simply reducing the concentration of the five antibodies used at 10 µg/mL, we gained 17 % more reads for the remaining antibodies. Consequently, assuming we are satisfied with the magnitude of signal we got from all other antibodies using the starting concentration, this directly translates to a 17 % reduction in sequencing costs.
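
      The arithmetic behind a gain of this kind can be sketched as follows. This is only an illustration at a fixed total read budget: the read shares are hypothetical, chosen so that the gain comes out near 17 %, and are not values from the study.

```python
def relative_gain(share_before: float, share_after: float) -> float:
    """Relative change in reads available to the remaining antibodies
    when the total number of ADT reads is held fixed."""
    return share_after / share_before - 1.0


# Hypothetical share of all ADT reads consumed by the five
# high-concentration antibodies, before and after lowering them.
top5_before = 0.400
top5_after = 0.298

# Whatever the five antibodies do not use is left for the rest of the panel.
remaining_before = 1.0 - top5_before
remaining_after = 1.0 - top5_after

gain = relative_gain(remaining_before, remaining_after)
print(f"{gain:.1%} more reads for the remaining antibodies")
```

      The point of the sketch is that the gain for the rest of the panel depends only on how the read shares shift, not on the absolute sequencing depth.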

      In terms of sequencing depth, we are not comfortable giving very broad recommendations. This is due to the fact that sequencing requirements will be very different depending on the composition of the antibody panel as well as the cell type distribution (epitope abundance) (as has been previously noted in Mair et al. 2020 Cell Rep.). If the antibody panel contains only antibodies targeting epitopes that are largely present on a small subset of cells (such as CD56 or CD8 for PBMCs) it would require fewer reads per marker per total cell count than markers that are broadly expressed (such as HLA-ABC or CD45 for PBMCs). However, in a different sample composition (for instance a tissue with few leukocytes) these same antibodies would require fewer reads per cell whereas other epitopes may be more abundant.

      We want to also stress, that aside from cost savings, an optimized balanced panel with low background will yield improved resolution compared to a non-optimized panel. Fortunately, CITE-seq and related methods are very flexible in this regard as you can start by shallow sequencing and then “top-up” the sequencing depth to an optimal level based on the actual data in subsequent sequencing runs (for instance together with the next batch of samples).

      3) One of the powerful elements of joint multi-modal profiling, as mentioned in the title, is to be able to measure protein and RNA from a single cell. This study does not formally look at correlation of protein and RNA levels, and whether a decrease in concentration of antibody either improves or diminishes this correlation. This would be important to test within this study to ensure that decreasing antibody levels does not then adversely affect the power of correlating protein with RNA, and whether it may even improve it.

      We appreciate the reviewer’s suggestion – this is a great idea. Unfortunately, such correlations are notoriously hard to do for scRNA-seq data due to the sparsity of the RNA measurements (which contain a high frequency of 0 UMI counts). This is, in part, due to low reverse transcriptase efficiency, and also due to the fact that most proteins have 10-1000 fold more copies than the mRNA transcripts that encode them (Marguerat et al. 2012 Cell). This is exacerbated in our study by the fact that we only shallowly sequenced the RNA modality (~4000 reads/cell). Consequently, we see a very high number of cells that, despite clustering together within distinct lineage clusters (based on their full transcriptome) and expressing the expected lineage marker surface proteins, do not have readily detectable transcripts for the same marker(s). For instance, for all cells that are positive for CD8 at the RNA level, there are at least as many that are negative for CD8 RNA while being positive for CD8 ADT. Importantly, these additional CD8+ cells are still located within clusters consistent with a CD8+ phenotype (see below): (https://raw.githubusercontent.com/Terkild/CITE-seq_optimization/master/figures/review_CD8_protein_rna_correlation.png)

      As such, due to the sparsity of RNA counts, if ADT signal is diluted too much, leading to truly positive cells being called as negative, it may actually increase individual cell correlation between RNA and ADT but mean higher levels of “false negative” cells. Direct correlation between RNA and antibody measurements within each individual cell is further complicated by the presence of non-specific/background signal in protein data that is rarely found in RNA data. This can also be seen in the plot above by the fact that positive cells are defined at a cut-off “7” at the ADT level, and not “0” as is the case for RNA. Thus, while having only a few UMI counts for a given transcript is sufficient to call expression, having a few UMIs from an ADT can easily be attributed to background (particularly in an unoptimized panel).

      Due to these technical limitations, we find it more suitable to correlate “positivity” called by either ADT (gated positive as shown in Suppl. Fig. S2) or mRNA expression (i.e. > 0 UMI counts). While this comparison is less quantitative (it does not distinguish “high” from “low” expression), it enables us to show whether reducing antibody concentrations affects the ability of the ADT signal to distinguish positive from negative cells (as compared to GEX), which is at the core of the reviewer’s suggestion. The figure below demonstrates that four-fold titration reduces the fraction of positive cells for some markers (reduction in the blue+red bars by dilution) whereas other markers are largely unaffected, both of which are consistent with the analysis in the manuscript: (https://raw.githubusercontent.com/Terkild/CITE-seq_optimization/master/figures/review_protein_rna_correlations.png)
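
      A positivity comparison of this kind can be sketched as a simple cross-tabulation. This is a minimal illustration, not the analysis code behind the figure: the function name, the UMI counts and the single ADT cut-off are all hypothetical, and in practice the ADT threshold would be set per marker by gating.

```python
from collections import Counter


def positivity_table(adt_counts, rna_counts, adt_cutoff, rna_cutoff=0):
    """Cross-tabulate cells by whether ADT and RNA each call a marker positive.

    Keys of the returned Counter are (adt_positive, rna_positive) pairs.
    """
    table = Counter()
    for adt, rna in zip(adt_counts, rna_counts):
        adt_pos = adt > adt_cutoff   # ADT needs a cut-off above background
        rna_pos = rna > rna_cutoff   # any detected transcript counts as positive
        table[(adt_pos, rna_pos)] += 1
    return table


# Hypothetical UMI counts for one marker across six cells.
adt = [25, 40, 3, 12, 1, 30]
rna = [1, 0, 0, 2, 0, 0]

table = positivity_table(adt, rna, adt_cutoff=7)
for (adt_pos, rna_pos), n_cells in sorted(table.items()):
    print(f"ADT positive={adt_pos}, RNA positive={rna_pos}: {n_cells} cells")
```

      Cells in the (ADT positive, RNA negative) group correspond to the situation described above, where sparse RNA counts miss a marker that the antibody detects.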

      In terms of assuring specificity, we have also modified the “titration plots” to show more detailed cell type distribution at each rank (by the “barcode plot” to the right of the “rank plot”) as well as the distribution of UMIs among cell types (by the bar plot above the “barcode plot”) at each condition. Finally, to make these “titration plots” more accessible, we have now included a guide to the different components of the “titration plots” in Fig. 2 of the revised manuscript.

      4) How was the lack of antibody binding determined for Category E? CD56 is frequently detected on NK cells in peripheral blood, CD117 should be detected on mast cells in the lung, and CD127 should be found on T cells, particularly CD8+ T cells. From inspecting Figure 1E, it appears as if all three of these markers are detected on small but consistent cell subsets. As the clusters are only numbered and no supplementary table is provided to help the reader in their interpretation, it is difficult to determine if these represent rare but specific binding, or have not bound with any specificity.

      Thank you for pointing this out. In light of this comment, it is obvious that we need to annotate the cell types of the clusters. We have annotated all the fine-grained clusters by cell types and re-worked all relevant panels in Figures 1, 2 and 3 (and all their related supplementary figures) to show more detailed and consistent cell type annotation. We have also added Suppl. Fig. 1C, D to show marker genes for each of the annotated cell types, which together with the re-worked Fig 1E, give the reader a clear description of the cluster identity. We do indeed see some signal for Category E antibodies such as CD56, CD117 and CD127 within the expected clusters. This indicates that the antibodies do work to some extent. However, we also find that the signal for these markers is modest, at best, and not present in some populations where we would have expected them (CD127 should be more pronounced in T cells and we are finding an unexpectedly high frequency of CD56-negative NK cells).

      5) References: At 14 references, the paper overall could benefit from a more comprehensive citation of related literature including flow cytometry and/or CyTOF best practices for antibody staining and dealing with background, and joint RNA and protein measurement from single cells.

      We agree that the reference list of the original manuscript was sparse and may have missed important relevant studies. We have done our best to include additional studies relevant for the optimization and titration of mass cytometry panels and flow cytometry staining and added references to a few newly published joint RNA and protein measurement studies. We have strived to reference all studies directly relevant to the present work and do not want to overlook any appropriate publications that should be referenced and so welcome any suggestions of the reviewers.

      Reviewer 2:

      Recombinant antibodies are the most common and powerful reagents in life science research to identify and study proteins. Yet, every single antibody should always be validated and carefully tested for its relevant application, to ensure constructive and reproducible scientific endeavor. I was thus extremely pleased to review the manuscript of Terkild Buus et al, as it provides a careful assessment of oligo-conjugated antibody signal in CITE-seq. The authors tested four variables (antibody concentration, staining volume, cell numbers and tissue origin) and clearly showed that antibody titration is a crucial step to optimize a CITE-seq panel. The authors found that, as a general rule, concentrations in the 0.625 to 2.5 µg/mL range provide the best results, while the concentrations recommended by vendors, in the 5 to 10 µg/mL range, increase background signal.

      In my opinion, the study is well-performed and may serve as a guideline to accurately validate antibodies for CITE-seq; as a consequence, I have only minor comments.

      We are very happy that you appreciate the necessity of our work and that you found it to be a useful resource for improving CITE-seq experiments.

      As stated by the authors, the starting concentration used for each antibody was based on historical experience and assumptions about the abundance of the epitopes. This approach may not be ideal, and the optimal concentration may have been missed. Do the authors think that a proper titration would be an advantage? Maybe this could be discussed in the text.

      We agree that using starting concentrations based on historical experience etc. may not be ideal for a completely objective assessment of how oligo-conjugated antibodies respond to the four-variables test. However, we firmly believe that using informed starting concentrations greatly increases the potential improvement of a panel while keeping costs to a minimum (which has to be a consideration for these expensive methods). With that said, we agree that this approach may not reach the optimal concentration (a definition that is a bit complex in this setting). As mentioned in our reply to reviewer 1, point 1, a previous study has shown a more formal titration response for three antibodies using a broader range of concentrations (Stoeckius et al. 2018. Genome Biology; Fig. 3A-D) and we believe that titration for CITE-seq is as much about balancing the sequencing needs of the full panel as it is about reaching the optimal signal-to-noise for the individual antibodies. We have elaborated on this in the discussion of the revised manuscript.

      The authors showed by testing four variables (see above) that they could define the optimal conditions to reduce background signal and increase the sensitivity of antibodies, and thereby improve CITE-seq outcome. Nevertheless, the authors rely on the fact that all antibodies used in their panel are specific for their targeted antigens. I am not asking here to test the specificity of every single antibody used in the study as this would be a colossal amount of work. But I feel that this aspect should be discussed in the manuscript, especially when an "uncommon" antibody is intended to be used in the CITE-seq panel; the specificity of this antibody should indeed be tested prior to its use.

      Thank you for this suggestion. This is indeed an aspect of antibody optimization that we have not touched upon. Because we use commercially available oligo-conjugated antibody clones that are broadly used, and given the extensive testing of many of these clones by multiple labs within the immunology community (for flow/mass cytometry applications) and our personal experience with the majority of the clones for flow cytometry applications, we expected that the antibodies in our panel should be specific for their antigen. This is supported by the labelling matching what we would expect to find in PBMCs and lung leukocytes, as well as the correlation between expression of the gene encoding the targeted epitope and antibody binding (see our response to reviewer 1, point 3). We have added a paragraph to the revised manuscript discussing that, particularly when using antibodies for the first time or using clones that are unfamiliar, it is important to assure specificity.

    1. right things like but really it was 01:02:01 nonlinear text was what he that's what that's what I used to teach about hypertext is nonlinear text he got hypermedia back in the sixties right amazing two-way links 01:02:13 he was right about that Tim I'm sorry Ted was right micropayments he was right about that transclusion which I am a great fan of and I think that was one of his most profound inventions that we 01:02:26 don't use enough of and I often ask me but is that a transclusion I don't know actually very profound of that I think we should use more on of course he wanted everything to go through his system it was very proprietary and but 01:02:39 in terms of the what we needed for this new world he knew about his Xanadu design inspired the design of the dynamic generic links we had in 01:02:51 Microcosm again also As We May Think and who then along comes Tim with the World Wide Web embedded one way links retro and we've been retrofitting hypertext to 01:03:05 the web ever since which seems the most ridiculous thing to say but it's true

      nonlinear text

      hypermedia

      two-way links

      he was right about that sorry Tim

      Transclusion

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This is a very interesting and well designed study on mNGS of mosquitoes. The authors demonstrate that they can distill valuable information on the vector species, the source of the blood meals and the microbiome/virome using a simple experimental approach and using single mosquitoes. A highlight of the work is that the paper is very comprehensive with an overwhelming dataset and thoughtful analysis. It is a showcase of how sequencing data from a relatively compact number of mosquito specimens can be used to conduct sophisticated computational analysis leading to meaningful conclusions. The authors make a strong case for the power of mNGS of mosquitoes that may be applicable to other (invertebrate) species. Especially the phylogenetic analysis based on SNP distance without having reference genomes and the grouping of contigs by means of co-occurrence in datasets are original. We feel that the work deserves to be published.

      Significance

      We have a number of comments that the authors may consider in further improving the quality of their manuscript:

      What is the impact of this paper?

      I think it is possible that the paper will have a decent impact on the mosquito arbovirus field, because it adequately shows the possibilities that individual mosquito sequencing can bring (e.g. co-occurrence analysis). It may shift the balance to doing more individual mosquito sequencing instead of pools. The paper is also very extensive in the analyses that it does on this very rich data set. Below, some suggestions are given for additional analysis, which should be interpreted as a compliment to the interesting data set acquired. It should however be noted that the ideas and approaches taken are not entirely new. Sequencing individual mosquitoes, co-occurrence analysis and metagenomic sequencing have been done before, although not to this extent and not in this field. Several novel possibilities:

      1. An unbiased way to check if you have the correct mosquito species and the ability to detect subspecies. Using the genetic distance of the transcriptomes they have likely corrected the misidentification of some samples, where the original calls were plausibly mistaken. The fact that subspecies overlapped with the sites of capture is very interesting and confirms the relevance of looking at the genetic distance also within species.
      2. Blood-meal analysis from sequence data. Here they can get to species level for 10 out of 40 blood-engorged mosquitoes. The idea is interesting, as you would be able to get a lot more information if you can determine blood-meal origin from RNA-seq data (as shown in this paper). However, I feel that in the current paper (and this may be intentional) they do not properly show that RNA-seq is an adequate alternative to DNA sequencing of the blood. To convince me, I would have liked to have these results compared to DNA sequencing and see how much overlap there is. I understand however that the choice was made not to do this, but I do have a small note for the information given now. It was mentioned that 1 contig with an LCA of vertebrates is enough for a 'blood-meal origin' call. I am however left to wonder how reliable is 1 read? Are there really no contigs with an LCA in vertebrates in the non blood-fed mosquitoes? Also, what do we think happened in the mosquitoes that were visibly bloodfed but nothing was found; any speculation?
      3. The study of co-occurrence, although not novel, is a nice addition to the mosquito virome/microbiome determination field. Identifying novel segments and missed segments of viruses is very nice. I do however wonder: did it ever occur that co-occurrence finds a 'linked' fragment that was clearly wrong? Were some post-analyses done to check if the results make sense? It seems, especially because the paper elaborates on examples, that you need some follow-up. This is not problematic, but a nice addition to the paper would be (as is also described below) to mention which segments were added to viral genomes by co-occurrence and if some checks were done to verify these hits.
      4. Being able to say something about differences in viruses within the same mosquito species is super interesting. Pools do not give the possibility to say something about profiles and prevalence, and the large sample size (148 mosquitoes) allows interesting correlations to be found.

      What parts do you think are problematic?

      1. We question the validity of the 'blood-meal calls' as outlined above.
      2. In this study they use % of non-host reads as a measure for the abundance of a pathogen (see e.g. Figure 3). I don't understand this at all... If you have more pathogens, then the amount of non-host reads would have to go up right? It seems to assume that the amount of non-host reads you have is similar in all samples? It becomes even more problematic when the trend is mentioned that having a higher % of non-host reads for Wolbachia is related to a lower % of non-host reads for viruses. This seems to be trivial as the amount of non-host reads goes up with increased Wolbachia infection, and therefore the % of non-host reads for viruses goes down due to the larger denominator. A different number than 'non-host reads' should be taken to normalise the data and say something about abundance. E.g. host reads or spiked RNA?
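
      The denominator effect raised in point 2 can be made concrete with a minimal sketch. All read counts below are hypothetical and the function names are illustrative; normalising to host reads is only one of the alternatives suggested above.

```python
def percent_of_non_host(virus_reads, wolbachia_reads, other_non_host_reads):
    """Virus reads as a percentage of all non-host reads in a sample."""
    non_host = virus_reads + wolbachia_reads + other_non_host_reads
    return 100.0 * virus_reads / non_host


def per_million_host(virus_reads, host_reads):
    """Virus reads normalised to host reads (reads per million host reads)."""
    return 1e6 * virus_reads / host_reads


# Two hypothetical mosquitoes with the same number of virus reads and the
# same number of host reads; only the Wolbachia read count differs.
low_wolbachia = dict(virus=10_000, wolbachia=1_000, other=9_000, host=1_000_000)
high_wolbachia = dict(virus=10_000, wolbachia=81_000, other=9_000, host=1_000_000)

for name, s in [("low Wolbachia", low_wolbachia), ("high Wolbachia", high_wolbachia)]:
    pct = percent_of_non_host(s["virus"], s["wolbachia"], s["other"])
    rpm = per_million_host(s["virus"], s["host"])
    print(f"{name}: {pct:.1f} % of non-host reads, {rpm:.0f} per million host reads")
```

      In this toy case the percentage of non-host reads assigned to the virus drops as Wolbachia reads increase, even though the virus read count is unchanged, while the host-normalised value stays the same.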

      What are the most relevant questions you are left with?

      1. I am curious about the limited overlap with Sadeghi et al., 2018, who sequenced so many Culex mosquitoes in California. I would suggest saying a little bit more about these discrepancies and their potential causes in the discussion.
      2. What do the authors think are in those 'dark reads'? Is the amount of dark reads the same across the different samples? Similarly, are the 'tetrapoda' reads reduced/absent in mosquitoes with a reference genome available?
      3. In the first part of the results, mention is made of being able to characterize to kingdom level 77% of the 13 million non-host reads (also see comment on non-host reads below). I am however puzzled by the description in the text and supplemental figure 3: which 3 million contigs were not able to be characterized? Where in supplemental figure 3 are they? This is especially puzzling as the main text mentions that 11 million non-host reads are from complete viral genomes, 0.9 million from eukaryotic taxa and 0.7 million from prokaryotic taxa.
      4. There seem to be 131 bars, corresponding to individual mosquitoes, in figure 3? Where are the remaining 17?

      What are your tips (in addition to responses to above questions)?

      1. I think the definition of 'non-host reads' needs to be clearly made and used consistently across the document. At the end of the paragraph 'Comprehensive and quantitative analysis of non-host sequences detected in single mosquitoes' the concept of "...13 million non-host reads..." is introduced. At first glance of supplemental figure 3 it seems that "non-host reads" could also be defined as the 16.7 aligned reads that are left after putative host sequences are removed. Although it is true that the derivation of 13 million is explained in the figure text of supplemental figure 3, it may be easier for the reader (as it cost me some time) to explain this in the main text. In addition, is the definition of 'non-host reads' (corresponding to 13-million reads) corresponding to "classified non-host reads" in the following excerpt: "For every sample, "classified non-host reads" refer to those reads mapping to contigs that pass the above filtering, Hexapoda exclusion, and decontamination steps. "Non-host reads" refers to the classified non-host reads plus the reads passing host filtering which failed to assemble into contigs or assembled into a contig with only two reads."? This caused some confusion.
      2. I believe it would be a valuable addition to add a table for the viruses which includes: 1) How it was determined that the complete genome is there, 2) The percentage overlap for those segments that were identified with blast and 3) Which viruses were already known.
      3. Please state the numbers of mosquitoes caught somewhere in the Materials and Methods.
      4. Pg2 L1-3: "Metagenomic sequencing..... a single assay." Perhaps a bit early for this statement. Would suggest placing it two paragraphs later, before: "Here, we analyzed...."
      5. Figure S4 is too pixelated to read. Perhaps due to pdf conversion, but please do check before submission.
    1. System architects: equivalents to architecture and planning for a world of knowledge and data Both government and business need new skills to do this work well. At present the capabilities described in this paper are divided up. Parts sit within data teams; others in knowledge management, product development, research, policy analysis or strategy teams, or in the various professions dotted around government, from economists to statisticians. In governments, for example, the main emphasis of digital teams in recent years has been very much on service design and delivery, not intelligence. This may be one reason why some aspects of government intelligence appear to have declined in recent years – notably the organisation of memory.57 What we need is a skill set analogous to architects. Good architects learn to think in multiple ways – combining engineering, aesthetics, attention to place and politics. Their work necessitates linking awareness of building materials, planning contexts, psychology and design. Architecture sits alongside urban planning which was also created as an integrative discipline, combining awareness of physical design with finance, strategy and law. So we have two very well-developed integrative skills for the material world. But there is very little comparable for the intangibles of data, knowledge and intelligence. What’s needed now is a profession with skills straddling engineering, data and social science – who are adept at understanding, designing and improving intelligent systems that are transparent and self-aware58. Some should also specialise in processes that engage stakeholders in the task of systems mapping and design, and make the most of collective intelligence. As with architecture and urban planning supply and demand need to evolve in tandem, with governments and other funders seeking to recruit ‘systems architects’ or ‘intelligence architects’ while universities put in place new courses to develop them.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to reviewers

      We first thank Review Commons for recruiting such knowledgeable reviewers to comment on our manuscript. We appreciate their diverse set of useful and constructive comments, which should help us improve the manuscript substantially. Please see our response to each reviewer’s comments below.

      Reviewer #1:

      **Summary:** The authors describe a useful modified fluctuation assay that couples conventional mutation rate analysis with mutational spectrum characterization of forward mutations at the S. cerevisiae CAN1 locus. They nicely showed that wild yeast isolates display a wide range of mutation rates with strains AAR and AEQ displaying rates ~10-fold higher than the control lab strain. These two strains also showed a bias for C>A mutations, and were the only strains analyzed that had a mutation spectrum statistically different from the lab control. Together, these data provide a compelling proof-of-principle of the applicability of the modified fluctuation analysis approach described in this manuscript. Overall, the manuscript is very well written, and the work reported in it does represent a valuable contribution to the field. However, two primary shortcomings were identified that can be addressed to strengthen the conclusions prior to publication. Both points described below pertain to the analysis of the possible C>A specific mutator phenotype in strains AAR and AEQ.

      Response:

      We thank the reviewer for this positive response. We have made a plan, detailed below, to address the shortcomings the reviewer has highlighted.

      **Major comments:**

      1. The work presented in the manuscript does suggest that these two haploids are likely to display the C>A mutator phenotype. Yet, the authors fell short of providing a full and unambiguous demonstration that would elevate the significance of their discovery. They could have directly tested the predicted C>A specific mutator phenotype by conducting additional experiments, one of which is relatively simple. Specifically, they could have performed a simple reversion-based mutation assay to validate the reported C>A mutator phenotype displayed by AAR and AEQ. For example, into AAR, AEQ, and a wild type control, the authors could introduce an engineered auxotrophic marker allele (e.g., ura3 mutation) caused by an A to C substitution, which upon mutation back to A restores prototrophic growth in minimal media (ie. reversion from ura3-C to URA3-A). Such specific reversible allele should be relatively easy to integrate into the AAR and AEQ genomes, as well as in the control strain. Based on the authors' prediction, AAR and AEQ should display a very large increase (far higher than 10 fold) in the reversion rate when compared to a control haploid. To demonstrate the specificity of the mutation spectrum, the authors could test the reversion rates of a different engineered allele requiring a reversion mutation in the opposite direction (ie. reversion from ura3-A to URA3-C). If the AAR and AEQ mutator is specific C>A, one would predict that all three strains should have similar mutation rates for a reversion in the A>C direction. This additional genetic work would thoroughly validate the central discovery and would reinforce the usefulness of the method described in the manuscript.

      Alternatively, a conventional mutation accumulation and whole genome re-sequencing experiment with parallel lines of AAR, AEQ and a control strain would also very effectively validate the C>A mutator prediction, and it would also answer the authors' discussion point about specificity to the CAN1 locus. However, it would be more costly and much more time consuming.

      Response:

      We thank the reviewer for these detailed, clear suggestions regarding additional methodology for further validating our results. We appreciate that parallel independent validations always add credibility to unexpected results like the ones presented in our manuscript. We’ve been considering these suggestions seriously, but our concern is that it is much less straightforward to engineer the genomes of these wild yeast than one might expect based on experiments with standard laboratory strains. Unforeseen roadblocks related to the biology of AAR and AEQ could end up making the URA3 reversion assay take even longer than an MA study. As we understand it, the two main concerns that might necessitate this additional undertaking are that either our novel assay for ascertaining mutations in CAN1 doesn’t work properly, or that the mosaic beer strains mutate significantly differently outside CAN1. Below we describe revisions to the text that we think will clearly represent these caveats and the relatively modest uncertainty associated with them.

      To further justify the soundness of our claim that AAR and AEQ have distinctive mutation rates and spectra, we plan to add additional discussion of the validation approaches that are presented in the manuscript to verify the accuracy of our pipeline. Although the ability of fluctuation assays to estimate mutation rates is well established, the identification of the spectra using our next-generation-sequencing-based pipeline is novel, so we used Sanger sequencing to validate the exact de novo mutations it ascertained in a select control strain. Our Sanger sequencing test found our assay to have an undetectably low false positive rate and a false negative rate that was much too low to account for the differences we measured between AAR, AEQ, and the standard lab strains. The fact that we also observed similar mutation spectra from control lab strains used in previous CAN1-based studies further demonstrates the reliability of our method, and it is notable that most natural isolates were measured to have very similar mutation spectra to lab strains (Figure 4 and Supplementary Figure S8-S9). We agree that further validation would be needed to read much into the more subtle differences in mutation rates and spectra that we saw hints of between other strains, and for that reason, we focused this paper on the differences that well exceed what we measured to be our measurement pipeline’s margin of error.
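      As a concrete illustration of how a fluctuation assay converts plate counts into a mutation rate, the following is a minimal sketch of the classical Luria-Delbrück p0 method. It is not necessarily the estimator used in the manuscript, which this response does not specify; the function name and the example numbers are hypothetical.

```python
import math


def p0_mutation_rate(n_cultures, n_without_mutants, cells_per_culture):
    """Estimate a per-cell-division mutation rate with the p0 method.

    Assumes the number of mutational events per culture is Poisson
    distributed with mean m, so the fraction of cultures with no
    resistant colonies is p0 = exp(-m).
    """
    if not 0 < n_without_mutants < n_cultures:
        raise ValueError("p0 method needs some, but not all, cultures without mutants")
    p0 = n_without_mutants / n_cultures
    m = -math.log(p0)              # expected mutational events per culture
    return m / cells_per_culture   # events per cell division (approximate)


# Hypothetical example: 72 parallel cultures of 1e7 cells each,
# 36 of which produced no canavanine-resistant colonies.
rate = p0_mutation_rate(72, 36, 1e7)
print(f"{rate:.2e}")
```

      The p0 method discards information about how many mutants each culture contains, so likelihood-based estimators are generally preferred when most cultures contain mutants; it is shown here only because it is the simplest to follow.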

      It is true that the genome-wide mutation rate might differ somewhat from the mutation rate at the CAN1-locus, but the mutation spectrum at the CAN1 locus measured in a previous study (Lang and Murray, 2008) was very similar to the genome-wide mutation spectra obtained from MA studies (Sharp et al., 2018), with just a small overall increase of mutations with C/G nucleotides (the second to last paragraph on page 17 and Supplementary Figure S13). Moreover, we have avoided making any claims of seeing distinct mutation rates or spectra based on “apples-to-oranges” comparisons between mutation spectra measured at CAN1 and spectra measured across the whole genome.

      We also note that the enrichment of C>A mutations in AEQ and AAR is observed not only in our de novo mutation data from CAN1, but also in rare natural polymorphisms genome-wide (Figure 1B, 5A,B). Rare natural polymorphisms are recent mutations that occurred during the history of the strain, and the fact that they are disproportionately enriched for C>A mutations in these strains indirectly shows that the C>A enrichment occurs not only at CAN1, as measured in our experiment, but has also been occurring during natural mutation accumulation genome-wide.

      The second concern is in regard to the relatively extensive conclusions drawn about the possible evolutionary significance of the possible C>A mutator in AAR and AEQ. The authors should be more cautious and conservative in the proposed interpretation. As the authors note:

      'Three of the four C>A-enriched mosaic beer strains, AAR, AEQ, and SACE_YAG, are all haploid derivatives of the [highly heterozygous] diploid Saccharomyces cerevisiae var diastaticus strain CBS1782, which was isolated in 1952 from super-attenuated beer.'

      From this statement, and because the paper cited provided few details on the isolation of CBS1782, it is presumed that these haploid derivatives were most likely isolated as recombinant spores. Furthermore, it is unclear when this isolation occurred, and for how many generations strains AAR and AEQ have been propagated in a haploid state.

      Herein lies a critical point: AAR and AEQ were recently derived from a diploid background with a "high level of heterozygosity". In a heterozygous diploid context, deleterious point mutations (and any resulting mutator phenotypes) would likely be masked by the presence of wild-type alleles. Now, as haploids, they express a novel genotype (i.e., combination of defective or incompatible parental alleles), which manifests as a mutator phenotype. In this respect, AAR and AEQ appear analogous to the spore derivatives of the incompatible cMLH1-kPMS1 isolate referred to in the manuscript as a notable exception. The analysis of strains harboring incompatible MLH1-PMS1 mutations by Raghavan et al. demonstrated that the heterozygous diploid parents were not themselves mutators, but that haploid spores which had inherited the pair of incompatible alleles displayed mutator phenotype. Collectively, while it can certainly be argued that the strains AAR and AEQ (like the MLH1/PMS1 incompatible strains) are mutators now, this fact alone does not support the conclusion that they have adapted to survive the expression of an extant mutator phenotype. This premise could be tested by analyzing the mutation rates/spectra of four new spores derived from a single tetrad of CBS 1782. Do the four sibling spores display similar or different mutational rates and spectra? If all four spores from a single tetrad exhibit the 10-fold increase in CAN1 mutation rate and the C>A transversion bias, then it can be inferred that the diploid parent is also a mutator in the same manner. Further direct analysis of mutation rates and spectrum in the parent diploid CBS 1782 would complete the work. This finding would be quite significant, and would provide strong evidence that wild strains can in fact tolerate the expression of a chronic mutator allele.

      Response:

      We thank the reviewer for suggesting additional study of the ancestral diploid strain CBS 1782, and we agree this could add a lot to the manuscript, especially given the high level of heterozygosity in the diploid and the link to the previous MLH1-PMS1 incompatibility story. We have obtained a sample of CBS 1782 and plan to knock out its HO locus using CRISPR, perform tetrad dissection of spores freshly derived from the diploid, and then measure mutation rates and spectra in all four segregants derived from a single tetrad (provided that all four spores end up growing). We plan to collect and sequence about 50 mutations to get qualitative results on the mutation rates and spectra of these segregants. We also plan to sequence the whole genome of the strain CBS 1782 and examine polymorphisms together with the 1011 strains to check for any signal of C>A enrichment. We recognize that our pipeline as currently implemented will not let us directly measure the mutation spectrum of the diploid, which is inaccessible to our pipeline given its two functional copies of CAN1 and the recessive nature of canavanine resistance. That being said, the elevation of the C>A fraction in natural polymorphisms found in AAR and AEQ provides evidence for prolonged activity of the mutator phenotype in the wild and/or in the domesticated environment from which CBS 1782 was derived. However, we acknowledge we have limited information about how these haploids were propagated before they were banked.

      **Minor comments:** A final, relatively minor point. That the new haploids AAR and AEQ show distinct mutation rates and spectra opens the door to an interesting line of inquiry, which may help to identify the causative mutator allele in a manner more efficient than searching for missense mutations. It is stated, and it is understandable, that the identification of the possible causal mutations is beyond the scope of the present manuscript. In this spirit, it would be much more appropriate to restrict such considerations to the Discussion section. Specifically, while the authors make a plausible case for OGG1 being a candidate gene responsible for the C>A mutator phenotype, no experimental demonstration was attempted. As such, that text segment should be moved from the Results to the Discussion section.

      Response:

      We agree with the reviewer that the current manuscript lacks genetic evidence on OGG1, and we will move that section from the Results to the Discussion. Work is underway to test and identify the causal loci for the mutator phenotype.

      Reviewer #1 (Significance (Required)): As stated in the summary section above, the manuscript by Jiang et al represents a substantial contribution to the fields of genome stability and genome evolution. The method described is likely to be useful beyond budding yeast. The work will be appreciated by a broad audience of geneticists. The additional work and text modifications proposed above would likely further elevate the impact of this work.

      Response:

      We are very grateful for this generous assessment and we likewise hope our planned revisions will further elevate the paper’s potential impact.

      Reviewer #2:

      Mutation is a fundamental force in organismal evolution, and therefore understanding the evolution of mutational mechanisms is important in evolutionary studies. In this manuscript, the authors used strains of S. cerevisiae as a model system to study the variations of rates and spectra in mutations with bioinformatic and experimental approaches. First, the authors analyzed the polymorphism data from 1011 strains by PCA and show the variations in spectra. Second, the authors used a fluctuation test combined with deep sequencing of the resistance gene to identify mutation rates and spectra in 18 strains, which show ~10-fold mutation rate variations and increased C-to-A mutations in two strains.

      For the second part, the experimental procedures and statistical analysis are mostly solid. For the first part, as the authors said in the introduction, polymorphism is not equal to the mutation spectra. I think the authors did a good job by being cautious in the wording and avoiding over-inference after the analysis. It is thus inevitable that the conclusion of this part sounds mostly descriptive. The overall writing is very clear. I will recommend publication in field-specific journals.

      Response:

      We thank the reviewer for these positive comments. We will address each minor point below.

      **Minor comments:** P9 - It is very hard to not wonder how the 16 strains were picked in the fluctuation tests. Some comments on that will be appreciated. E.g., was that informed by the results of Fig 1?

      Response:

      We actually did not pick strains based on the results of Figure 1, one reason being that the CAN1 reporter method only works on haploid strains with a canavanine sensitivity phenotype. We also restricted our analysis to strains without known aneuploidies to maximize our ability to accurately measure the spectra of the strains’ polymorphisms. Given these constraints, we included at least two randomly selected strains from each clade of the 1011 collection whenever possible. These constraints are currently explained in the second to last paragraph on page 9, and will be explained in more detail in revision.

      P17- In the paragraph "natural selection might contribute ..." , is there any example of "certain mutation types are more often beneficial than others"?

      Response:

      One example of this is that transitions are more often synonymous than transversions are (Freeland and Hurst, 1998), and mutations that create or destroy CpG sites are more likely to alter gene regulation than other mutation types are (in species other than yeast where CpGs are methylated). We recognize that these effects are likely not large, which is one reason we don’t think natural selection is a great explanation for mutation spectrum differences among groups. We will mention these examples explicitly in the revised text.

      P20 - Extra ')' in the sentence "Adjacent indels were merged if their frequencies differed by less than 10%)."

      Response:

      We will fix this in revision.

      In the discussion, it might be good to add a paragraph comparing the rates and spectra reported here with the ones found by the MA-then-NGS approach (e.g., Zhu et al. 2014).

      Response:

      We’ll be sure to add a reference to the Zhu et al. (2014) spectrum in the discussion, extending our existing comparison of mutation spectra previously reported using CAN1 (Lang and Murray, 2008) and the MA approach (Sharp et al., 2018) (currently discussed in the second to last paragraph on page 17 and Supplementary Figure S13). Our CAN1 method also obtains results that are consistent with the Lang and Murray (2008) study on the same control strain (the last paragraph on page 11).

      Reviewer #2 (Significance (Required)): The significance of this manuscript will be relatively specific to evolutionary biologists and geneticists, especially those who use yeasts as a model system. For example, I expect the variation of mutation rates and spectra found in this manuscript will impact the following population-genetic analysis in this collection of 1011 strains and motivate more studies on the molecular machineries which affect mutation rates and spectra.

      In addition, in terms of methodological novelty, adding a novel step of reporter-gene sequencing is a reasonable way to get some information on mutation spectra as it is less labor-intensive than NGS of MAs. Other statistical or experimental procedures in this manuscript mostly follow the approaches which have been developed in previous literature and thus show not much novelty.

      Response:

      We thank the reviewer for this positive assessment. Since evolutionary biology, population genetics, and model organism genetics are three of eLife’s major focus areas, we are hoping to communicate our results to this journal’s broad audience rather than restrict ourselves to a journal focusing too narrowly on just one of these focus areas.

      Reviewer #3:

      **Summary** The authors show that certain yeast strains have altered mutation rates/bias. The study is well motivated, since genetic variation in mutation rates is not easily uncovered, and it capitalizes on yeast and a high-throughput mutation rate/bias method that validates findings of C>A bias from yeast polymorphism data. The results are solid and clearly presented and I have no major concerns.

      Response:

      We are very grateful for this positive response. Please find our response to each minor comment below.

      **Major comments** None

      **Minor comments** Should have comma: "In addition, environmental ..."

      Response:

      We will fix this in revision.

      Using S. paradoxus to classify derived vs ancestral alleles may not work as well as allele frequency. A 1/100 rare variant is 100x more likely derived than common variant. But with S. paradoxus divergence of say 5%, 5% polymorphic sites are misclassified or NA. Of course, since you used both, this is not a concern. But the number of variants included/excluded in each analysis should be reported. Also, I was a bit surprised that the rare variants are more noisy since most variants are rare.

      Response:

      We agree that the heuristic of classifying rare alleles as derived will do the right thing the majority of the time, but this could potentially create artifactual differences between the mutation spectra of different populations because the exact ratio of rare derived alleles to common derived alleles depends on the population’s demographic history and true site frequency spectrum. If two populations had the same mutation spectrum but very different proportions of variants that are polarized incorrectly, this could create the appearance of a mutation spectrum difference where none exists. In the revision, we will be sure to report the total number of variants filtered because of the variation present in S. paradoxus.
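      The argument above can be made concrete with a small worked sketch. It assumes a simplified model in which a fixed fraction of variants is polarized incorrectly, so that each affected mutation is recorded as its reverse (for example, C>A recorded as A>C). The spectrum values and the function name are hypothetical and chosen only for illustration.

```python
def observed_spectrum(true_spectrum, mispolarized_fraction):
    """Return the spectrum seen when some variants are polarized backwards.

    `true_spectrum` maps types such as "C>A" to proportions and must
    contain every type together with its reverse (e.g. "A>C").
    """
    observed = {}
    for mutation_type, freq in true_spectrum.items():
        ancestral, derived = mutation_type.split(">")
        reverse_type = f"{derived}>{ancestral}"
        # Correctly polarized share of this type, plus the share of the
        # reverse type that is misread as this type.
        observed[mutation_type] = (
            (1 - mispolarized_fraction) * freq
            + mispolarized_fraction * true_spectrum[reverse_type]
        )
    return observed


# Two hypothetical populations sharing one true spectrum but differing
# in how often the ancestral state is misassigned.
true_spectrum = {"C>A": 0.3, "A>C": 0.1, "C>T": 0.4, "T>C": 0.2}
population_1 = observed_spectrum(true_spectrum, 0.02)
population_2 = observed_spectrum(true_spectrum, 0.20)
print(round(population_1["C>A"], 3), round(population_2["C>A"], 3))  # 0.296 0.26
```

      Under this toy model the apparent C>A fraction differs between the two populations even though the underlying mutational process is identical, which is the artifact the outgroup-based polarization is meant to avoid.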

      The reviewer is right to point out that rare variants are generally more abundant than common variants, but this pattern is much more pronounced in a species like humans that has undergone recent population expansion than it appears to be in S. cerevisiae, which appears to have a higher proportion of older, shared variation. We hope this clarifies why the rare variant mutation spectrum PCA appears noisier than the plot made from variation across more frequency categories.

      Regarding variation in mutation rate based on canavanine resistance: there is a caveat that some strains may be more canavanine resistant, due to differences in transporter abundance or some other aspect of metabolism. Thus, the same mutation would survive and grow (barely) in one strain background, but not another. This caveat is very unlikely to have much of an impact but it would be worth discussing.

      Response:

      Thanks for pointing this out. We also considered the possibility that our mutation rate estimates could be confounded by slight differences in canavanine resistance between strains, and will address this point in the discussion.

      The explanation for synonymous mutations is hitchhikers or errors. However, they could also disrupt translation, here's one possibility PMC4552401.

      Response:

      Thanks for pointing this out. We will expand our statement on the possible significance of synonymous mutations to include modification of transcription and translation efficiency.

      Are there CAN allele differences between strains? If there are some, it might be worth mentioning why you do/don't think this influences the mutation rate. E.g. CGG is one step from stop but CGT is not.

      Response:

      The reviewer makes a good point that there are segregating differences among these strains in the sequence of CAN1. We plan to add an analysis where we calculate the number of opportunities for missense and nonsense mutations in each strain, as a function of its CAN1 sequence, to put a bound on the amount that these differences could affect our estimates of mutation rates in each strain.
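      One way such a count of mutational opportunities could be computed is sketched below. This is an illustration under the standard genetic code, not the authors' pipeline; the function name is hypothetical. The arginine codons CGA and CGT are used as the example because CGA is a single substitution away from the stop codon TGA, whereas no single substitution turns CGT into a stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def mutation_opportunities(cds):
    """Count single-base substitutions in a coding sequence by effect."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    cds = cds.upper()
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        ref_aa = CODON_TABLE[codon]
        if ref_aa == "*":
            continue  # ignore substitutions within an existing stop codon
        for pos, alt in product(range(3), BASES):
            if alt == codon[pos]:
                continue
            alt_aa = CODON_TABLE[codon[:pos] + alt + codon[pos + 1:]]
            if alt_aa == ref_aa:
                counts["synonymous"] += 1
            elif alt_aa == "*":
                counts["nonsense"] += 1
            else:
                counts["missense"] += 1
    return counts


print(mutation_opportunities("CGA"))  # {'synonymous': 4, 'missense': 4, 'nonsense': 1}
print(mutation_opportunities("CGT"))  # {'synonymous': 3, 'missense': 6, 'nonsense': 0}
```

      Applying the same function to each strain's full CAN1 coding sequence would give per-strain totals that can be compared directly; weighting each substitution by its mutation type would be a natural refinement.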

      For the allele counts in Figure 5B. 2 indicates a variant is present in one strain so there are only 9 mutations present in AAR and not found in ANY other strain or just not found in the four listed? Likewise AAR has 36 for count 4, meaning that there are 36 variants present in AAR and one other strain, where other strains are just the 4 shown in the table, or other strains being any of the 1011?

      Response:

      The allele count in Figure 5B represents the number of times the derived allele is present in the whole population. In this case, the whole population refers to the 1011 strains minus 336 strains that are so closely related to other strains in the panel that they are effectively duplicates. An allele of count 2 might be homozygous in AAR and absent from all other strains, or present as one heterozygous copy in AAR as well as one heterozygous copy in another strain. We will explain this more clearly in the revised manuscript.
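      The two scenarios described above can be illustrated with a short sketch. It assumes each strain is recorded with 0, 1, or 2 copies of the derived allele (a diploid-style encoding); the strain label `strain_X` and the function names are hypothetical.

```python
def derived_allele_count(genotypes):
    """Total copies of the derived allele across all strains in the panel."""
    return sum(genotypes.values())


def carriers(genotypes):
    """Strains carrying at least one copy of the derived allele."""
    return sorted(strain for strain, copies in genotypes.items() if copies > 0)


# Scenario 1: derived allele homozygous in AAR and absent elsewhere.
private_to_aar = {"AAR": 2, "AEQ": 0, "strain_X": 0}
# Scenario 2: one heterozygous copy in AAR and one in another strain.
shared = {"AAR": 1, "AEQ": 0, "strain_X": 1}

for genotypes in (private_to_aar, shared):
    print(derived_allele_count(genotypes), carriers(genotypes))
# 2 ['AAR']
# 2 ['AAR', 'strain_X']
```

      Both configurations give an allele count of 2, so the count alone does not say how many strains carry the variant.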

      "To our knowledge, this is one of the first" This is an odd way to put it and could be rephrased. As it stands you are either the first and not knowledgeable or knowledgeable and not the first.

      Response:

      Thanks. We will revise this to state that to our knowledge, we are the first to report such a discovery.

      "humans, great apes, .." Could you put the citations in the discussion too. I was a little surprised there was no mention of C>A bias as it relates to studies in bacteria and cancer, where there has been a lot of work on mutational spectra. A comment on this literature or whether the C>A biases are not found elsewhere would be nice.

      Response:

      We will add citations and discussion of bacteria and cancer in the revised manuscript. The reviewer is right to point out that C>A mutations do come up in cancer signatures, for example in familial adenomatous polyposis disorders where excision repair of 8-oxoguanine is compromised.

      Reviewer #3 (Significance (Required)):

      I am an evolutionary geneticist with expertise in genomics and bioinformatics. In addition to reviewing papers I also regularly handle papers as an editor. The manuscript provides rare insight into population variation in mutation rates. While differences in mutational biases are well known between species and in some cases within a species, we typically don't know what causes these biases. Environmental factors are often thought to be involved; this work clearly shows that genetic factors (mutator strains) exist and impact polymorphism in yeast. The manuscript does a nice job in the introduction of explaining the background on mutation rate research and motivation for the work. It also clearly explains the advantage of an experimental high-throughput mutation rate/spectra approach. Thus, I believe this new angle on a long-standing problem will be of interest to the community of evolutionary geneticists outside of yeast researchers.

      Response:

      We appreciate this very generous assessment, thank you!

      References

      Freeland, S. J. and Hurst, L. D. (1998) ‘The genetic code is one in a million’, Journal of molecular evolution, 47(3), pp. 238–248.

      Lang, G. I. and Murray, A. W. (2008) ‘Estimating the Per-Base-Pair Mutation Rate in the Yeast Saccharomyces cerevisiae’, Genetics, 178(1), pp. 67–82.

      Sharp, N. P. et al. (2018) ‘The genome-wide rate and spectrum of spontaneous mutations differ between haploid and diploid yeast’, Proceedings of the National Academy of Sciences of the United States of America, 115(22), pp. E5046–E5055.

      Zhu, Y. O. et al. (2014) ‘Precise estimates of mutation rate and spectrum in yeast’, Proceedings of the National Academy of Sciences of the United States of America, 111(22), pp. E2310–E2318.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The authors show that certain yeast strains have altered mutation rates/bias. The study is well motivated, since genetic variation in mutation rates is not easily uncovered, and it capitalizes on yeast and a high-throughput mutation rate/bias method that validates findings of C>A bias from yeast polymorphism data. The results are solid and clearly presented and I have no major concerns.

      Major comments

      None

      Minor comments

      Should have comma: "In addition, environmental ..."

      Using S. paradoxus to classify derived vs ancestral alleles may not work as well as allele frequency. A 1/100 rare variant is 100x more likely derived than common variant. But with S. paradoxus divergence of say 5%, 5% polymorphic sites are misclassified or NA. Of course, since you used both, this is not a concern. But the number of variants included/excluded in each analysis should be reported. Also, I was a bit surprised that the rare variants are more noisy since most variants are rare.

      Regarding variation in mutation rate based on canavanine resistance: there is a caveat that some strains may be more canavanine resistant, due to differences in transporter abundance or some other aspect of metabolism. Thus, the same mutation would survive and grow (barely) in one strain background, but not another. This caveat is very unlikely to have much of an impact but it would be worth discussing.

      The explanation for synonymous mutations is hitchhikers or errors. However, they could also disrupt translation, here's one possibility PMC4552401.

      Are there CAN allele differences between strains? If there are some, it might be worth mentioning why you do/don't think this influences the mutation rate. E.g. CGG is one step from stop but CGT is not.

      For the allele counts in Figure 5B. 2 indicates a variant is present in one strain so there are only 9 mutations present in AAR and not found in ANY other strain or just not found in the four listed? Likewise AAR has 36 for count 4, meaning that there are 36 variants present in AAR and one other strain, where other strains are just the 4 shown in the table, or other strains being any of the 1011?

      "To our knowledge, this is one of the first" This is an odd way to put it and could be rephrased. As it stands you are either the first and not knowledgeable or knowledgeable and not the first.

      "humans, great apes, .." Could you put the citations in the discussion too. I was a little surprised there was no mention of C>A bias as it relates to studies in bacteria and cancer, where there has been a lot of work on mutational spectra. A comment on this literature or whether the C>A biases are not found elsewhere would be nice.

      Significance

      I am an evolutionary geneticist with expertise in genomics and bioinformatics. In addition to reviewing papers, I also regularly handle papers as an editor. The manuscript provides rare insight into population variation in mutation rates. While differences in mutational biases are well known between species, and in some cases within a species, we typically don't know what causes these biases. Environmental factors are often thought to be involved; this work clearly shows that genetic factors (mutator strains) exist and impact polymorphism in yeast. The manuscript does a nice job in the introduction of explaining the background on mutation rate research and the motivation for the work. It also clearly explains the advantage of an experimental high-throughput mutation rate/spectra approach. Thus, I believe this new angle on a long-standing problem will be of interest to the community of evolutionary geneticists beyond yeast researchers.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Answers to the reviewers’ comments

      We deeply appreciate the reviewers for their thoughtful, critical and constructive comments, which have undoubtedly provided us with valuable opportunities to improve our manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Extravasation of lymphocytes from HEV in the lymph nodes is mediated by the interaction between lymphocyte L-selectin and PNAd-carrying sulfated sugars expressed by HEVs. Multiple steps of lymphocyte migration interacting with ECs at the luminal side of HEVs have been studied intensively; however, post-luminal migration steps are unclear. In this study, using intravital confocal microscopy of peripheral lymph nodes (pLNs), the authors found that GlcNAc6ST1 deficiency, required for sulfation of PNAd, delays trans-fibroblastic reticular cell (FRC) migration of lymphocytes, and hot spots of trans-HEV EC migration and trans-FRC migration. Interestingly, hot spots of trans-FRC migration are often associated with dendritic cells (DCs). Thus, the authors concluded that FRCs delicately regulate the transmigration of T and B cells across the HEV wall, which could be mediated by perivascular DCs.

      **Main comments**

      1. This study focused on pLNs, which are quite different from mesenteric lymph nodes (mLNs) in many ways. The authors should include mLNs in their study to make the general statement with regard to the T/B cell entry into lymph nodes. In addition, it will be more significant if this study includes challenged pLNs.

      We thank the reviewer for raising this important point. We agree that mesenteric lymph nodes differ considerably from the peripheral lymph nodes on which this study focuses. We therefore specified popliteal or peripheral lymph nodes in the revised manuscript as follows.

      In the Abstract (page 2), “… Herein, we performed intravital imaging to investigate post-luminal T and B cell migration in popliteal lymph node, consisting of trans-EC migration, crawling in the perivascular channel (a narrow space between ECs and FRCs) and trans-FRC migration. … These results suggest that HEV ECs and FRCs with perivascular DCs delicately regulate T and B cell entry into peripheral lymph nodes.”

      In the Introduction (page 4), “Herein, we clearly visualized the multiple steps of post-luminal T and B cell migration in popliteal lymph node, including trans-EC migration, intra-PVC crawling and trans-FRC migration, using intravital confocal microscopy and fluorescent labelling of ECs and FRCs with different colours.”

      In the Discussion (page 21), “… These results imply that pericyte-like FRCs, the second cellular barrier of HEVs, regulate the entry of T and B cells to maintain peripheral lymph node homeostasis more precisely and restrictively than we previously thought.”

      In addition, we discussed the differences in lymphocyte migration across HEVs among peripheral lymph nodes, mesenteric lymph nodes, and Peyer’s patches in the Discussion of the revised manuscript. We also discussed inflamed lymph nodes in the Discussion as follows.

      In the Discussion (page 20), “… Although this work focused on peripheral lymph node, the other lymphoid organs have different lymphocyte homing efficiency61 due to organ-specific gene expression on HEVs62. B cells home better to mesenteric lymph nodes and peyer’s patches than peripheral lymph nodes61 by CD22-binding glycans expressed preferentially on the HEVs of mesenteric lymph nodes and peyer’s patches62.

      Inflamed peripheral lymph node become larger by recruiting more lymphocytes and even L-selectin-negative leukocytes that are excluded in the steady state63,64. Inflamed HEV ECs show different gene expression, such as downregulation of GLYCAM1 and GlcNAc6ST-160. In addition, inflamed HEV integrity may be loosen due to markedly increased leukocyte influx although the HEV FRCs can prevent bleeding by interacting with platelet CLEC-248. CD11c+ DCs are associated with inflamed HEV EC proliferation that is functionally associated with increased leukocyte entry65. The stepwise migration of lymphocyte across inflamed HEVs and their hot spots with perivascular CD11+ DCs will be interesting topic for future study.”

      The finding that GlcNAc6ST1 deficiency delays lymphocyte trans-FRC migration but not trans-HEV EC migration is surprising. However, the reason this occurs is neither shown nor discussed. Is GlcNAc6ST1 also expressed in FRCs? Or does GlcNAc6ST1 expression on HEV license lymphocytes to transmigrate across FRCs?

      This is a valid point. GlcNAc6ST-1 is predominantly involved in PNAd expression on the abluminal side rather than on the luminal side. Therefore, our finding that GlcNAc6ST-1 deficiency increased the time required for trans-FRC migration, but not that for trans-EC migration, could be attributable to the lack of GlcNAc6ST-1-synthesized L-selectin ligands on the abluminal side of HEVs.

      In addition to PNAd expression on the luminal and abluminal sides of HEV endothelial cells, PNAd expression has been observed in the reticular network close to HEVs, as shown in the following figures. We believe that PNAds are expressed in FRCs close to HEVs and can affect lymphocyte migration, such as trans-FRC migration and parenchymal migration. According to the data of Rodda et al. (Table S1, Immunity 2018), GlcNAc6ST-1 (Chst2) is expressed in T-cell-zone reticular cells while GlcNAc6ST-2 (Chst4) is absent. Therefore, it is plausible that FRC-expressed GlcNAc6ST-1 may regulate trans-FRC migration to some extent.

      Figures. PNAd expression on HEVs (arrows) and the reticular network (arrowheads) close to the HEVs

      We included these points in the Discussion of the revised manuscript (page 15) as follows.

      “… GlcNAc6ST-1 is predominantly involved in PNAd expression on the abluminal side rather than on the luminal side, although GlcNAc6ST-1 deficiency also modestly affects the luminal migration of lymphocytes by increasing the rolling velocity9. GlcNAc6ST-1 deficiency increased the time required for trans-FRC migration but not that for trans-EC migration. This could be attributable to deficiency of GlcNAc6ST-1-synthesizing L-selectin ligands in the abluminal side of HEV. In addition to the abluminal side of HEV endothelial cells, FRCs also express GlcNAc6ST-1, but not GlcNAc6ST-227, implying that FRC-expressed GlcNAc6ST-1 may regulate trans-FRC migration in some extent. … Thus, PNAds expressed at the endothelial junction and on the abluminal side of HEVs facilitate the efficient transmigration of lymphocytes across the HEV wall but do not slow transmigration in the perivascular region. GlcNAc6ST-1 deficiency and MECA79 antibody also decreased the parenchymal B and T cell velocities immediately after extravasation, respectively, probably because of blockade of parenchymal expression of PNAd in close proximity to HEV6,21,28.”

      Because of the adoptive transfusion experiment, the actual number of transmigrating lymphocytes in Fig. 3F is underestimated.

      We agree with the reviewer’s comment. We corrected the y-axis label in Fig. 3F from ‘average number of cells transmigrating at one site’ to ‘average number of labeled cells transmigrating at one site.’

      Whether DCs covering FRCs have a role for lymphocyte trans-migration is not shown.

      We left this work for future research and discussed the potential mechanisms in the Discussion (pages 17-18), namely that DCs may regulate lymphocyte entry by interacting with FRCs through LTβR or CLEC-2 signaling. We also included ‘Martinez et al., Cell Rep. 2019 (ref. 51)’ in the Discussion of the revised manuscript (page 18). In addition, we discussed the need for better characterization of the CD11c+ DCs in the Discussion of the revised manuscript (page 19) as follows.

      In the Discussion (page 18), “The podoplanin of FRCs also controls FRC contractility49,50 and ECM production51 by interacting with the CLEC-2 of DCs in inflamed lymph nodes. In the steady state, resident DCs in lymph nodes express CLEC-252. Thus, it is conceivable that CLEC-2+ resident DCs may control the contractility of FRCs and remodel ECM surrounding HEVs to facilitate the trans-FRC migration of T and B cells. Thus, the CLEC-2/podoplanin signalling may represent a key molecular mechanism underlying our discovery that trans-FRC migration hot spots preferentially occur at FRCs covered by CD11c+ DCs.”

      In the Discussion (page 19), “… In addition, better characterization of the CD11c+ DCs located in the hot spots of HEVs is required to differentiate them from the other CD11c+ DCs observed in the non-hot-spot regions of HEVs. Some T-cell-zone resident macrophages can also express CD11c54. Imaging of a triple-transgenic mouse with Zbtb46-cre;tdTomato and CD11b-GFP will be able to differentiate 3 types of DCs and macrophages potentially associated with the hot spots: Zbtb46+CD11b- cDC1, Zbtb46+CD11b+ cDC2, and Zbtb46-CD11b+ macrophage54,55.”

      In Fig. 1, time required for trans HEV EC migration and trans-FRC migration of T cells is shorter than that of B cells; however, this finding is not observed in Fig. 2C and E.

      Although statistical comparisons between T and B cells are not shown in Fig. 2C-F and Fig. S5, there are in fact significant differences between T and B cells, consistent with Fig. 1 except for the dwell time in the PVC. P values between T and B cells in wildtype mice are 0.0003, In the Results (page 6), “… The mean velocity of T cells (5.3 ± 1.7 μm/min) was significantly higher than that of B cells (4.1 ± 1.4 μm/min) during intra-PVC migration (Fig. 1E), while the dwell time and total path length in the PVC were not significantly different between T and B cells (Fig. 1, H and I). Similar results were obtained when both cells were imaged simultaneously, except that B cells had significant longer dwell time than T cells (Fig. 2C-F and Fig. S5). Interestingly, more than half of the T and B cells crawled from 50 μm to 350 μm inside the PVC (Fig. 1I), …”

      In the legend of Fig. 2, “… P values between T and B cells in wild-type mice were 0.0003 (C), …”

      In the legend of Fig. S5, “… P values between T and B cells in wild-type mice were 0.0240 (A), 0.3614 (B), 0.7518 (C) and 0.1337 (D). …”

      **Minor comments**

      1. Please provide evidence for GlcNAc6ST1 deficiency in HEV and surrounding tissues.

      Previous studies (Uchimura et al., JBC 2004, Nat. Immunol. 2005; ref9 and 10, respectively, in the manuscript) confirmed systemic deficiency of GlcNAc6ST-1 in peripheral lymph nodes of the GlcNAc6ST-1 KO mice.

      Images for delayed trans-FRC migration in GlcNAc6ST1 KO mice relative to WT are not convincing (Fig. 2G and H).

      The images probably look unconvincing because it is not easy to identify quickly which frames in the image sequence correspond to trans-FRC migration. To make the transmigration events easier to recognize, we added arrowheads indicating the transmigration site in Fig. 2G, Fig. 2H and Fig. S4 as follows.

      Provide actual time periods required for Fig. 3F and G. Lack of isotype control IgG experiment in Fig. S3.

      We added the time periods (3 hours) in the figure legend as follows.

      “… (F) Average numbers of labeled T and B cells transmigrating at one site for 3 hours. (G) Ratio of hot spots to total transmigration sites for 3 hours. …”

      The purpose of Fig. S3 was to confirm that injecting the anti-ER-TR7 antibody to label FRCs does not alter normal T cell motility, rather than to confirm the function of ER-TR7. Therefore, we used a non-injected group as the control rather than a group injected with a control antibody.

      Line 12 on page 11, "the ratio of hot spots to the total “observed” transmigration sites..." is not appropriate. The ratio must be calculated by hot spots to the total "potential" transmigration sites, although it is challenging to find total potential sites.

      We corrected the expression from ‘the total observed transmigration sites’ to ‘the total potential transmigration sites’.

      Please correct typos of angiomoduin to angiomodulin (page 16), ET-TR7 to ER-TR7 (page 17), Anti-CD3 to anti-CD3 (page 22), half the dose to half dose (page 22), the Multiple step to the multiple step (page 23).

      We thank the reviewer for finding those errors. We corrected them and proofread the manuscript repeatedly to correct typos and grammatical errors.

      Please provide an additional explanation of why actin-DsRed in HEVs is more strongly expressed than surrounding tissues such as FRCs in Fig. 1 although actin-DsRed should be expressed in all cell types in mice.

      We were also surprised to find that HEV ECs expressed red fluorescence more strongly than the surrounding tissues. Although other cells, such as FRCs and endogenous lymphocytes, also express DsRed under the control of the beta-actin promoter, HEV ECs express it more strongly, which is sufficient to image only HEV ECs by adjusting the image contrast. We revised the explanation of this point in the Methods (page 21) as follows.

      “HEV ECs of actin-DsRed mouse popliteal lymph node expressed red fluorescence much stronger than the surrounding stromal cells and endogenous lymphocytes, which was sufficient to image only HEV ECs by adjusting an image contrast (Fig. 1, A and B).”

      Reviewer #1 (Significance (Required)):

      The study focused on lymphocytes after extravasation from HEVs, which is an understudied question, using intravital imaging. The in vivo imaging study was deliberately and beautifully performed, and the finding is insightful for understanding lymphocyte trafficking in lymph nodes. However, additional experiments should be performed to address some weaknesses listed in our comments.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The present study by K. Choe meticulously monitored the stepwise transmigration behavior of T cells and B cells, respectively, through the high endothelial venules of the mouse popliteal lymph node using the laser scanning confocal microscopy. In particular, the study focused on the post-luminal migration of T and B cells and reported the following. (1) Mice deficient in GlcNAc6ST-1 which is necessary for PNAd expression on the abluminal side of HEV showed significantly reduced abluminal migration of both T and B cells, (2) the footpad injection of the ER-TR7 antibody did not affect T cell transmigration across HEVs but marginally increased the parenchymal T cell velocity when compared with injection of control antibody, (3) T cells and B cells tended to share FRC migration hot spots but this was not the case with trans-EC migration hot spot, (4) the trans-FRC migration was observed at the FRCs closely associated with CD11c+ dendritic cells in HEV.

      While the present study is obviously the product of very meticulous and time-consuming work, it basically describes only a phenomenology, just reporting the lymphocyte behavior within and outside lymph node HEVs, without sufficiently analyzing the mechanistic aspect of the individual event they observed. The only antibody blocking experiments they performed to obtain mechanistic insights was by the use of commercially available monoclonal antibodies, all of which unfortunately contained a preservative, sodium azide, which potently blocks lymphocyte migration in vivo (Freitas AA & Bognacki J, Immunol 36:247, 1979). Therefore, the results of these antibody blocking experiments cannot be taken at face value.

      We thank the reviewer for raising this important point. Freitas et al. pre-treated lymphocytes with sodium azide in vitro for 1 hour, whereas we injected the antibody into the footpad of the recipient mouse 3 hours before lymphocyte injection via the tail vein and imaging. Sodium azide may therefore be highly diluted under in vivo conditions. In addition, Fig. S3 shows no significant difference in T cell migration in HEVs between the anti-ER-TR7 antibody-injected and non-injected groups, although the anti-ER-TR7 antibody also contains sodium azide. We therefore believe that sodium azide had little effect on the differences we observed between the PNAd-blocking antibody and the control antibody (Fig. S8). The potential side effect of sodium azide is mentioned in the Methods of the revised manuscript (page 22) as follows.

      “All antibodies we used contains sodium azide that has potential side effects on lymphocyte migration in lymph node57. However, Fig. S3 shows no significant difference in T cell migration in HEV between anti-ER-TR7-injected and non-injected groups.”

      Reviewer #2 (Significance (Required)):

      Real time imaging experiments were performed very carefully. However, as mentioned above, authors used sodium azide-containing antibodies for blocking experiments, and hence, these experiments cannot be interpreted properly.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This study presents a detailed investigation of T and B cell entry into lymph nodes (LN) via HEV. Substantial high quality intravital imaging is used to examine trans-EC and trans-FRC migration and define the role of PNAds in this process. The authors find that T and B cells use 'hot spots' to cross EC and FRC barriers, which supports prior similar observations by others. They also show that where T and B cells cross EC and FRC layers can differ, with regions of shared trans-FRC migration but more distinct EC crossing sites. This may relate to differences in the structure of these cellular layers, but provides novel insight into the mechanisms of cell entry into LNs via HEV. Assessment of the dependence on PNAd using antibodies or GlcNAc6ST-1 KO mice revealed perivascular and parenchymal cell behavior is also influenced by these signals. Lastly, examination of DCs that sit on the perivascular FRCs suggested that cells may prefer to cross at sites co-localized by DCs, although the reasons for this are not explored.

      This is a well performed study, with high quality imaging data and analysis. The results are convincing, with sufficient numbers of mice and adequate statistical analysis. There are a number of minor grammatical errors throughout the text, which should be easy to fix.

      We thank the reviewer for the positive evaluation. We carefully performed proofreading repeatedly to correct typos and grammatical errors.

      Reviewer #3 (Significance (Required)):

      Although 'hot spots' have been proposed by others, this detailed analysis provides new knowledge of how lymphocytes can cross the HEV and FRC barriers to enter LNs. This is an important study to advance our understanding of cell recruitment to lymph nodes. The role of perivascular and parenchymal PNAd signals observed here should also be of interest to immunologists to help define the signals required for immune cell motility in tissues.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      The authors have used a combination of intravital confocal imaging and transgenic models to study the migration of T and B cells through the HEVs. They move on from the studies of Moscacci et al. and Park et al. on lymphocyte migration. This study focuses on visualization and the molecular mechanism of post-trans-EC migration, including the intra-PVC and trans-FRC migration of T and B cells in HEVs. They have been able to show how lymphocytes migrate through the HEV into the parenchyma. Using the GlcNAc6ST-1 (catalyst for sulfation of PNAds) KO model (and MECA control for PNAd blocking), they identify the role of L-selectin/PNAd in lymphocyte transmigration. The identification of hot spots of T and B cell transmigration in HEVs is novel and extremely interesting for the field; however, the data shown are not entirely convincing in their current form. The hot spots were defined as areas where the lymphocytes migrate through the HEV epithelial cells and pericyte (FRC) regions. These are areas where migration was largely shared by T and B cells. Using the CD11c-YFP mouse model, they identified CD11c+ cells in proximity to the FRCs located at the migration hot spots, which can drive further speculation regarding the mechanism by which these areas of the HEVs are more permissive.

      **Major comments**

      1) Intravital imaging of T and B cell transmigration across HEVs composed of ECs and FRCs

      • Figure 1: The authors mention that they performed similar experiments for B cells. Authors should show comparative data for T cells and B cells.

      • Panel S1B should be provided for both T and B cells in figure 1.

      We added the image sequence of B cell migration and the panels (Fig S1B of previous manuscript) showing intra-PVC segments of T or B cells in Fig. 1C of the revised manuscript as follows.

      2) T and B cells preferentially share hotspots for trans-FRC migration not EC-migration

      • Figure 4: This data is important to the storyline but as presented it is difficult to understand. Results are overstated in the text however it is difficult to see where these conclusions come from based on the figure. In Figure 4B the authors should show percentages on the Venn diagram or remove it entirely. In Figure 4C the authors should add labels to their y-axis and separate the data in order to assist with the storyline and convince of the presence of hot spots.

      We agree with the reviewer’s opinion. We removed the Venn diagram, separated the Fig. 4C into 4B and 4C, and added y-axis labels in the figures. In addition, we revised the figure legends and the text in the Results to make it easier to understand as follows.

      In the figure legend, “…(B-C) The round and diamond symbols represent predicted and observed values, respectively, for the percentage of T cell hot spots in B cell hot spots (B), for the percentage of B cell hot spots in T cell hot spots (C). …”

      In the Results (page12), “Simultaneously imaging T and B cells showed that some T and B cells transmigrated across FRCs at the same site (Fig. 4A and Movie S8). To investigate whether T and B cells share their hot spots preferentially or accidentally, we compared the percentage of T cell hot spots in total B cell hot spots (diamond symbols in Fig. 4B) with its predicted value that is the possibility of accidently sharing T and B cell hot spots (round symbols in Fig. 4B). The predicted value can be calculated as the percentage of T cell hot spots in total transmigration sites. To note, the percentage of hot spots in total sites for trans-FRC migration was higher than that for trans-EC migration (Fig. 3G and round symbols in Fig. 4B) maybe because the number of trans-FRC migration sites was less than that of trans-EC migration sites. It implies that the possibility of accidently sharing T and B cell hot spots for trans-FRC migration is higher than that for trans-EC migration. However, surprisingly, the percentage of T cell hot spots in B cell hot spots was significantly higher than its predicted value of accidently sharing hot spots for trans-FRC migration (Fig. 4B). Similarly, the percentage of B cell hot spots in T cell hot spots was also significantly higher than its predicted value for trans-FRC migration (Fig. 4C). These results imply that T and B cells preferentially share trans-FRC migration hot spots beyond the prediction for accidently sharing. However, there were no significant differences between observed and predicted values for trans-EC migration (Fig. 4B and 4C), which implies T and B cells just accidently share their trans-EC migration hot spots.”
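
The comparison described in the revised Results can be sketched as follows. This is an illustrative reconstruction only: the function name, the representation of sites as integer IDs, and the numbers are all hypothetical, and the sketch does not reproduce the statistical test the authors used to compare observed and predicted values.

```python
def hot_spot_sharing(all_sites, t_hot_spots, b_hot_spots):
    """Return (predicted, observed) percentages for T cell hot spots.

    predicted: percentage of T cell hot spots among all transmigration sites,
               i.e. the sharing expected if hot spots coincide only by chance.
    observed:  percentage of B cell hot spots that are also T cell hot spots.
    """
    all_sites = set(all_sites)
    t_hot = set(t_hot_spots)
    b_hot = set(b_hot_spots)
    if not all_sites or not b_hot:
        raise ValueError("need at least one site and one B cell hot spot")
    predicted = 100.0 * len(t_hot) / len(all_sites)
    observed = 100.0 * len(t_hot & b_hot) / len(b_hot)
    return predicted, observed


# Hypothetical example: 20 transmigration sites identified by integer IDs.
sites = range(1, 21)
t_hot = {1, 2, 3, 4, 5}   # sites used repeatedly by T cells
b_hot = {3, 4, 5, 6}      # sites used repeatedly by B cells
predicted, observed = hot_spot_sharing(sites, t_hot, b_hot)
print(f"predicted = {predicted:.1f}%, observed = {observed:.1f}%")
```

Swapping the roles of the T and B cell sets gives the corresponding pair of values for B cell hot spots among T cell hot spots. An observed value well above the predicted one, across mice, is what the text interprets as preferential rather than accidental sharing.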

      3) T and B cells prefer to transmigrate across FRCs covered by perivascular CD11c+ DCs

      • DCs drive changes to FRC phenotype and contractility. The interaction between CLEC-2 (on DCs and platelets) is important for driving permeability of the HEVs. The authors use the CD11c-YFP mouse model in Figure 5 (and the supporting figures) to show the proximity of the CD11c+ cells and FRCs. Data from Baratin et al., (Immunity, 2017) suggest that CD11c+ cells in the parenchyma are also T cell zone macrophages (TZMs) that were previously characterized as DCs. Macrophages have previously been shown important for perivascular transmigration of neutrophils during bacterial skin infection (Abtin et al.2014- Nat Immun). CD11c-YFP alone does not show the cells proximal to FRCs are DCs so the authors should try to stain them with CLEC-2 or use the CLEC9a-cre mouse model to better characterise these cells.

      We thank the reviewer for raising this important point. We agree that the perivascular CD11c+ cells could be T-cell-zone macrophages (TZMs). Better characterization of the CD11c+ cells located in the hot spots of HEVs is required to determine whether they are DCs or macrophages, and also to differentiate them from the other CD11c+ cells observed in the non-hot-spot regions of the HEVs. To differentiate DCs from TZMs, the Zbtb46-GFP mouse can be used for imaging because Zbtb46-GFP is highly expressed in conventional DCs (cDCs) but not in monocytes, macrophages, or other lymphoid or myeloid lineages (Satpathy et al., JEM 2012). However, endothelial cells also express Zbtb46-GFP. To visualize only DCs in HEVs, we would need to make a chimeric mouse by adoptive transfer of Zbtb46-GFP bone-marrow cells into an irradiated wild-type mouse. Furthermore, a triple-transgenic mouse with Zbtb46-cre;tdTomato and CD11b-GFP would make it possible to differentiate 3 types of DCs and TZMs potentially associated with the hot spots: Zbtb46+CD11b- cDC1 (red), Zbtb46+CD11b+ cDC2 (yellow), and Zbtb46-CD11b+ macrophage (green). In addition, we think that it will be difficult to differentiate the CLEC2 of perivascular DCs from that of platelets by in vivo labeling with an injected, fluorescent dye-conjugated anti-CLEC2 antibody, because the CLEC2 of platelets maintains HEV integrity by interacting with FRC podoplanin (Herzog et al., Nature 2013). Since generating or obtaining these transgenic mouse models, including the CLEC9a-cre mouse, will take a long time, we leave this work for future research and discussed this point in the Discussion of the revised manuscript as follows.

      In the Discussion (page 19), “… In addition, better characterization of the CD11c+ DCs located in the hot spots of HEVs is required to differentiate them from the other CD11c+ DCs observed in the non-hot-spot regions of HEVs. Some T-cell-zone resident macrophages can also express CD11c54. Imaging of a triple-transgenic mouse with Zbtb46-cre;tdTomato and CD11b-GFP will be able to differentiate 3 types of DCs and macrophages potentially associated with the hot spots: Zbtb46+CD11b- cDC1 (red), Zbtb46+CD11b+ cDC2 (yellow), and Zbtb46-CD11b+ macrophage (green)54,55.”

      **Minor comments**

      1) Intravital imaging of T and B cell transmigration across HEVs composed of ECs and FRCs

      • The velocity differences observed could be due to the location of the HEV in the parenchyma. Furthermore, FRC plasticity can cause differences in the secretion of chemokine gradients based on the location of cells and their niche (Rodda et al., Immunity 2018). HEV regulation of lymphocyte entry can be influenced by their niche (Veerman et al., Cell Reports 2019). The authors should comment on the HEV position relative to B cell areas.

      We included this point with the references (Rodda et al., Immunity 2018, ref 27; Veerman et al., Cell Rep. 2019, ref 60) in the Discussion of the revised manuscript (pages 19-20) as follows.

      “Compared to T cell, B cells took a longer time to pass EC and FRC layers in HEV and had lower velocity in PVC and parenchyma just after extravasation. Furthermore, the adhesion rate of B cells to HEV EC in luminal side is lower than that of T cells5. These could be attributed to lower expression of L-selectin and CCR7 on B cells than T cells18,59. The difference in homing efficiency between T and B cells may vary depending on the HEV location due to the heterogeneous expression of chemokines and integrins on HEV EC and surrounding FRCs in peripheral lymph node27,60. The HEVs imaged in this work were located around 40-70 μm depth from the capsule where might be close to B cell follicles. B cell homing efficiency in the deeper paracortical T cell zone could be different from our data probably due to less CXCL13 that is chemoattractant for B cells highly expressed in follicles. …”

      • Images shown in Fig1A is the same as Fig S1A/B. I presume this is an error.

      Fig. 1A and Fig. S1A correspond to a 20-μm-thick maximum intensity projection and a single z-frame without projection, respectively. To avoid confusion, we changed Fig. 1A to the single z-frame (Fig. S1A) and removed the 20-μm-thick maximum intensity projection.

      • Figure S3: Data for Ab treated appears to be identical to what is shown for T cells in Fig 1. I presume this is an error and the correct control will be shown.

      We used the data of Fig. 1D-1I as the Ab-injected group in Fig. S3. We apologize for not explaining this clearly. We included the explanation in the figure legend as follows.

      In the legend of Fig. S3, “(A-E) There is no significant difference between antibody-injected group (Ab) and non-injected group (Non) in T cell migration from trans-EC migration to trans-FRC migration. Non-injected means that no substance is injected into a footpad of mouse. We used the data of Fig. 1D-1I as the antibody-injected group. …”

      2) Non-redundant role of L-selectin/PNAd interactions in post-luminal migration of T and B cells in HEV

      • Could the authors clarify the number of mice used for this analysis (same applies to figure 1)

      The legends of Fig. 1-2, S6 and S8 state the number of mice we used. In Fig. 1, “Four and 3 mice were used for the analysis of T and B cells, respectively.” In Fig. 2, “Four mice were analysed for each group.” In Fig. S6, “Three mice were analysed for each group.” In Fig. S8, “Five and 4 mice were analysed for the control Ab and MECA79 groups, respectively.”

      In addition, we added the number of mice in the legend of Fig. S7. In Fig. S7, “The images are representative of 4 popliteal lymph nodes of 2 mice and 2 popliteal lymph nodes of a mouse for MECA79 and control IgM antibody, respectively.”

      • Figure S6: in addition to the percentages of T cell populations, the authors should also provide the number of T cells (CD4, CD8, CM and naive) for both wildtype and KO.

      We included the analyzed cell number by FACS in Fig. S6 and revised the figure legend as follows.

      In the legend of Fig. S6, “… (B) Analyzed cell numbers by FACS for 3 control and 3 KO mice. (C) Percentage of each type of T cells in DsRed+ T cells. No difference in the percentage of homing central memory, Naïve CD4 and CD8 T cells between wild-type and KO mice. …”

      **Methods:** For the flow cytometry analysis, could the details of how samples were processed (or a reference) be provided?

      We added the details in the Methods (page 24) as follows.

      “Popliteal and inguinal lymph nodes were harvested and single-cell suspensions were prepared by mechanical dissociation on a cell strainer (RPMI-1640 with 10% FBS). Cell suspensions were centrifuged at 300 g for 5 min. Erythrocytes in lymph nodes were lysed with ACK lysis buffer for 5 min at RT. Cell suspensions were washed and filtered through 40 µm filters. Non-specific staining was reduced by using Fc receptor block (anti-CD16/CD32). Cells were incubated for 30 min with varying combinations of the following fluorophore-conjugated monoclonal antibodies: anti-CD3e (clone 145-2C11, BD Pharmingen), anti-CD4 (clone GK1.5, BD Pharmingen), anti-CD8 (clone 53-6.7, eBioscience), anti-CD44 (clone IM7, BioLegend) and anti-CD62L (clone MEL-14, eBioscience) antibodies (diluted at a ratio of 1:200) in FACS buffer (5% bovine serum in PBS). After several washes, cells were analyzed by FACS Canto II (BD Biosciences) and the acquired data were further evaluated by using FlowJo software (Treestar).”

      **References:** The discussion covers key references in the field, but more recent studies should be included. Some examples have been suggested in the comments sections. Key missing references that can help the discussion and interpretation of the data include: 1) Veerman et al., 2019, Cell Reports. The data in that paper show the heterogeneity of the HEV and the different regulation of genes that control lymphocyte entry. This can also be linked to the comments above regarding sections 1 and 2. 2) Rodda et al., 2018, Immunity, which focuses on niche-associated heterogeneity of lymph node stromal cells. The authors should also include Webster et al., 2006, JEM, which describes the role of DCs in regulating vascular growth in the lymph node.

      We thank the reviewer for suggesting these references. We included references #1 and #2 in the revised manuscript, as described in our response to minor comment #1. We also cited Webster et al., JEM 2006 (as ref 65) in the Discussion of the revised manuscript (page 20) as follows.

      “Inflamed peripheral lymph nodes become larger by recruiting more lymphocytes and even L-selectin-negative leukocytes that are excluded in the steady state [63,64]. Inflamed HEV ECs show different gene expression, such as downregulation of GLYCAM1 and GlcNAc6ST-1 [60]. In addition, inflamed HEV integrity may be loosened due to markedly increased leukocyte influx, although the HEV FRCs can prevent bleeding by interacting with platelet CLEC-2 [48]. CD11c+ DCs are associated with inflamed HEV EC proliferation that is functionally associated with increased leukocyte entry [65]. The stepwise migration of lymphocytes across inflamed HEVs and their hot spots with perivascular CD11c+ DCs will be an interesting topic for future study.”

      Reviewer #4 (Significance (Required)):

      This paper asks important questions and can make a significant contribution to the field if all revisions are addressed. The authors identified PNAd as an important factor for T cell migration. Building on previous studies in the field suggesting non-random transmigration sites, the authors used intra-vital confocal imaging to identify how lymphocytes cross the endothelial cells and FRCs of the HEVs to migrate to the parenchyma. The authors identify hotspots used by lymphocytes to transmigrate. Finally, the authors show that CD11c+ cells are proximal to FRC hotspots and might have a role in driving lymphocyte transmigration.

      Audience: Lymphocyte/immune cell biology, stromal immunology, FRC and lymph node inflammation. My expertise: Stromal immunology, immunology, innate immunity

    1. Soil means “we hope something will grow here.”

      I do agree that soil communicates a sense of hope or opportunity for new growth, but it has a second connotation that I think is more thematically aligned with this essay. Soil may also be understood as an aggregate of decomposed material. These two perspectives are obviously related, but it is interesting how the "soil" here is discussed independently of the "land" mentioned earlier in the essay.