7,842 Matching Annotations
  1. May 2022
    1. Author Response

      Reviewer #1 (Public Review):

      The paper is very well written, the question is interesting, and the analyses are innovative. However, I do have concerns about the overall approach. My main concern is about looking at asymmetries in the low dimensional representation of connectivity. A secondary concern has to do with looking at the parcellated connectome. I explain these concerns in succession below.

      We thank the Reviewer for the appreciation of our work and the insightful comments, which we have addressed below. The page numbers refer to the clean version of the manuscript.

      The first concern is to me quite a fundamental issue: looking at connectivity in a low dimensional space, that of the Laplacian eigenvectors. There are two issues with this. The first one, which is less important than the second, is that the authors have a reference embedding to which they align other embeddings using a Procrustes method with no scaling. While the 3D embedding is still optimally representing the connectivity (because distances don't change under rotations), we can no longer look at one axis at a time, which is what the authors do when they look at G1. In this case, G1 is representative of the connectivity of the reference matrix (LL), but not the others.

      But even if the authors only projected their matrices onto a single G1 dimension with no Procrustes (and only sign flipping if necessary), there is still a major issue. One implicit assumption of this whole approach is that if there is a change in connectivity somewhere in the original matrix, the same "nodes" of the matrix will change in the embedding. This is not the case. Any change in the original matrix, even if it is a single edge, will affect the positions of all the nodes in the embedding. That is because the embedding optimises a global loss function, not a local one.

      To make this point clear, consider the following toy example. Say we have 4 brain regions A,B,C,D. Let us say that we have the following connectivity:

      In the Left Hemisphere: A-B-C-D

      In the Right Hemisphere: A-B=C-D

      So the connection between B and C is twice as strong in the right hemi, and everything else remains the same.

      The low dimensional embedding of both will look like this:

      Left: ... A ... B ....... C ... D ...

      Right: A ... ... ... B ... C ... ... ... D

      Note how B,C are closer to each other in the RIGHT, but also that A,D have moved away from each other because the eigenvector has to have norm 1.

      So if we were to calculate an asymmetry index, we would say that:

      A is higher on the LEFT

      B is higher on the RIGHT

      C is higher on the LEFT

      D is higher on the RIGHT

      So we have found asymmetry in all of our regions. But in fact the only thing that has changed is the connection between B and C.

      This illustrates the danger of using a global optimisation procedure (like low-dim embedding) to analyse and interpret local changes. One has to be very careful.
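
      The reviewer's toy example can be reproduced numerically. The sketch below is only an illustration and not the manuscript's pipeline: it assumes a one-dimensional embedding given by the Fiedler vector (the eigenvector of the second-smallest eigenvalue) of the unnormalised graph Laplacian, and the function name `fiedler_embedding` is hypothetical.

```python
import numpy as np

def fiedler_embedding(weights):
    """1-D spectral embedding of a weighted chain graph (hypothetical helper).

    weights[i] is the connection strength between node i and node i + 1.
    Returns the unit-norm eigenvector of the second-smallest Laplacian eigenvalue.
    """
    n = len(weights) + 1
    W = np.zeros((n, n))
    for i, w in enumerate(weights):
        W[i, i + 1] = W[i + 1, i] = w
    laplacian = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors
    _, eigvecs = np.linalg.eigh(laplacian)
    v = eigvecs[:, 1]
    # Fix the arbitrary sign so that node A is always positive
    return v if v[0] > 0 else -v

left = fiedler_embedding([1.0, 1.0, 1.0])   # A-B-C-D
right = fiedler_embedding([1.0, 2.0, 1.0])  # A-B=C-D (B-C twice as strong)
diff = left - right                         # per-node "asymmetry"

for name, l, r, d in zip("ABCD", left, right, diff):
    print(f"{name}: left={l:+.3f} right={r:+.3f} left-right={d:+.3f}")
```

      Under these assumptions, doubling only the B-C weight shifts the coordinates of all four nodes: B and C move closer together while A and D move further apart, because the eigenvector is constrained to unit norm.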

      We thank the Reviewer for the detailed description of the first concern. We agree that low-dimensional embeddings describe global embedding of local features, rather than local phenomena. Moreover, we indeed assume that the connectivity embedding of a given node gives us information about its position along ‘gradients’ relative to other nodes and their respective embedding. Thus, indeed, when a single node (node X) has a different connectivity profile in the right hemisphere relative to the left, this will also have some impact on the embeddings of all nodes showing a relevant (i.e., top 10%) connection to node X.

      To evaluate whether asymmetry could also be observed in average connectivity within functional networks, we took an alternative approach to measuring asymmetry: we computed the average connectivity within each functional network and then compared this within-network connectivity between the left and right hemispheres. We have now added this analysis to the robustness analyses in our Results section. In short, we observed that transmodal networks (DMN, FPN, and language network) showed higher connectivity in the left hemisphere, whereas other networks showed higher connectivity in the right hemisphere. This indicates that observations made with respect to asymmetry of functional gradients are similar to those observed for within-network functional asymmetry between the left and right hemispheres. We have now detailed the outcome of this analysis in our Results section and Supplementary Materials.

      Results, p.14.: “As low-dimensional embedding is a global approach to summarize functional connectivity we reiterated our analysis by evaluating asymmetry of within network functional connectivity in the current sample. Observations made with respect to asymmetry of functional gradients are similar to those observed for within-network functional asymmetry between the left and right hemispheres.”

      “To further explore functional connectivity asymmetry between left and right hemispheres, we calculated the LL within network FC and RR within network FC (Figure 2-figure supplement 5). It showed that connections in the left hemisphere and right hemisphere were relatively equal in the global scale. However, for the local differences, networks showed significant subtle leftward or rightward asymmetry (vis1: t = -5.203, P < 0.001; vis2: t = -22.593, P < 0.001; SMN: t = -8.262, P < 0.001; CON: t = -32.715, P < 0.001; DAN: t = -11.272, P < 0.001; Lan.: t = 33.827, P < 0.001; FPN: t = 24.439, P < 0.001; Aud.: t = 0.191, P = 0.849; DMN: t = 11.303, P < 0.001; PMN: t = -35.719, P < 0.001; VMN: t = -11.056, P < 0.001; OAN: t = 0.311, P = 0.756).”
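
      The within-network comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the response reports t values without naming the test, so a paired t-test across subjects is assumed here; the helper names `within_network_fc` and `paired_t` and the toy matrix are hypothetical and not taken from the study.

```python
import numpy as np

def within_network_fc(fc, labels):
    """Mean FC among parcels sharing a network label, excluding the diagonal.

    Assumes every network contains at least two parcels.
    """
    labels = np.asarray(labels)
    result = {}
    for net in np.unique(labels):
        idx = np.flatnonzero(labels == net)
        block = fc[np.ix_(idx, idx)]
        off_diagonal = block[~np.eye(len(idx), dtype=bool)]
        result[str(net)] = float(off_diagonal.mean())
    return result

def paired_t(a, b):
    """Paired t statistic for per-subject values a (e.g. LL) versus b (e.g. RR)."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(d.size))

# Toy 4-parcel hemisphere with two made-up networks
toy_fc = np.array([
    [1.0, 0.5, 0.1, 0.1],
    [0.5, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.3],
    [0.1, 0.1, 0.3, 1.0],
])
toy_means = within_network_fc(toy_fc, ["net1", "net1", "net2", "net2"])
print(toy_means)
```

      In a real analysis, `within_network_fc` would be applied to each subject's LL and RR matrices, and `paired_t` to the resulting per-subject values for each network; a positive t would then indicate leftward asymmetry.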

      Regardless, we have further highlighted that such a global interpretation of areal asymmetry is still meaningful, given that a node is always placed in a global context. We have now further explained in the Introduction, p. 3, that our metrics describe the global embedding of local features.

      Introduction, p. 3: “These low-dimensional gradient embeddings describe global embedding of local features, rather than local phenomena. Thus, interpretation for asymmetry of areas is under a global context.”

      My second concern is about interpreting the brain asymmetry as differences in connectivity, as opposed to differences in other things like regional size. The authors use a parcellated approach, where presumably the parcels are left-right symmetric. If one area is actually larger in one hemisphere than in the other, this will manifest itself in the connectivity values. To mitigate this, it may be necessary to align the two hemispheres to each other (maybe using spherical registration) using connectivity prior to applying the parcellation.

      Thanks for this nice idea. We have now computed the left-right differences of the mean rsfMRI connectome along the first gradient at the vertex level using 100 random subjects. This was possible because the data are mapped to a symmetric template (fs_LR_32k), in which each vertex in the left hemisphere has a symmetric counterpart in the right hemisphere. Our results show left-right asymmetry as language/default mode-visual-frontoparietal vertices, which is consistent with the main results of the parcel-based approach. We have also added this response to the Supplementary Materials.

      Though the overall findings are consistent, spherical registration may also introduce new issues. Complete anatomical spatial symmetry may not provide functional comparability at the vertex level between the left and right hemispheres. For example, during language tasks in the current sample, the activated frontal region in the left hemisphere is larger than the activated contralateral region in the right hemisphere. In the current study, we aimed to evaluate asymmetry between functionally and structurally homologous regions, as described by the Glasser atlas. For the resting-state fMRI data, we used the region-wise symmetric multimodal parcellation (Glasser et al., 2016), which provides functionally corresponding contralateral regions in both hemispheres. A previous study (Williams et al., 2021) investigated structural and functional asymmetry in newborn infants. They used spherical registration (making fs_LR symmetric) for structural asymmetry but not for functional asymmetry. As spherical registration may obscure functional information, we think it may be more suitable for structural studies.

      To address the concern regarding the alignment of hemispheres, we used joint alignment for LL and RR and compared the results with those of the Procrustes alignment technique (Pearson r=0.930, P_spin<0.001). The figure below shows asymmetry along the principal gradient (upper: joint alignment; lower: Procrustes alignment), indicating convergence between both approaches. We have reported this information in the Supplementary Materials.

      Lastly, we agree that parcel size might be an important factor influencing the asymmetry pattern. To test for such an effect, we correlated the rank of parcel-size asymmetry, (left-right)/(left+right), with the rank of the asymmetry index. This yielded only a small, non-significant correlation along G1 (Spearman r_intra=0.130, P_spin=0.105; Spearman r_inter=0.130, P_spin=0.084). Of note, there is a systematic difference in parcel size as a function of the sensory-association hierarchy, indicating that the link between parcel size and asymmetry may vary between sensory and associative regions.
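
      The parcel-size check can be illustrated with a short sketch. It assumes the normalised size asymmetry (left-right)/(left+right) given above and a plain Spearman rank correlation; all numbers are made up for illustration, the helper names are hypothetical, and the spin-permutation test behind the reported P_spin values is not reproduced here.

```python
import numpy as np

def size_asymmetry(left, right):
    """Normalised parcel-size asymmetry, (left - right) / (left + right)."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    return (left - right) / (left + right)

def ranks(x):
    """Ranks 1..n of the values in x; assumes there are no ties."""
    order = np.argsort(x)
    r = np.empty(len(x), dtype=float)
    r[order] = np.arange(1, len(x) + 1)
    return r

def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return np.corrcoef(ranks(x), ranks(y))[0, 1]

# Made-up parcel sizes and gradient asymmetry values for five parcels
left_size = np.array([120.0, 80.0, 200.0, 150.0, 95.0])
right_size = np.array([100.0, 90.0, 210.0, 120.0, 99.0])
gradient_ai = np.array([0.03, -0.01, 0.02, 0.05, -0.04])

rho = spearman(size_asymmetry(left_size, right_size), gradient_ai)
print(f"Spearman rho = {rho:.3f}")
```

      A library routine that handles tied ranks would be preferable on real data; the hand-rolled ranking here only keeps the sketch free of dependencies beyond NumPy.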

      Reviewer #2 (Public Review):

      Using recently-developed functional gradient techniques, this study explored human brain hemispheric asymmetry. The functional gradient is a hot technique in recent years and has been applied to study brain asymmetries in two papers of 2021. Compared to previous studies, the current study further evaluated the degree of genetic control (heritability) and evolutionary conservation for such gradient asymmetries by using human twin data and monkey's fMRI data. These investigations are of value and do provide interesting data. However, it suffers from a lack of specific hypotheses/questions/motivations underlying all kinds of analyses, and the rich observational or correlational results seem not to offer significant improvement of theoretical understanding about brain asymmetries or functional gradient. In addition, given the limited number of twins in HCP project (for a heritability estimation), the limited number of monkeys (20 monkeys), and the relatively poor quality of monkeys' resting functional MRI data, the results and conclusion should be taken cautiously. Below are major concerns and suggestions.

      We thank the Reviewer for the evaluation of our work and the helpful suggestions.

      The gradient from resting-state functional connectome has been frequently used but mainly at the group level. The current study essentially applied the gradient comparison (i.e., gradient score) at the individual level. Biological interpretation for individual gradient score at the parcel level as well as its comparability between individuals and between hemispheres should be resolved. This is the fundamental rationale underlying the whole analyses.

      We thank the Reviewer for this remark, and are happy to provide further rationale for using and comparing individual gradient scores to evaluate individual variation in asymmetry and associated heritability. Though gradients from resting-state functional connectivity have frequently been used at the group level, various studies have also examined individual differences. Examples include using linear mixed models to compare gradient scores between left and right across subjects (Liang et al., 2021), applying individual gradient scores to compare patients and controls (Dong et al., 2020, 2021; Hong et al., 2019; Park et al., 2021), and linking individual hippocampal gradients to memory recollection (Przeździk et al., 2019). Together, these studies show individual variations of local gradients, indicating changes in node centrality and hubness (Hong et al., 2019), and connectivity profile distance (Y. Wang et al., 2021). Of note, low-dimensional embeddings describe global embedding of local features, rather than local phenomena. Thus, asymmetry of areas is interpreted in a global context. Biologically, individual gradients can be interpreted as reflecting the degree to which the system is segregated and integrated, which relates to patterns of ongoing neural activity (Mckeown et al., 2020). This reflects that individuals have different functional boundaries between anatomical regions. At the same time, individual neurons are embedded within these global-local boundaries through a cortical wiring space consisting of intricate long- and short-range white matter fibers (Paquola et al., 2020).

      Introduction, p. 4: “We applied the individual gradient scores to study the asymmetry, consistent with prior studies (Gonzalez Alam et al., 2021; Liang et al., 2021). Individual variation along the gradients reflects a global change across subjects in the functional connectome integration and segregation, and it is under genetic control (Valk et al., 2021). Moreover, to what degree the system segregated and integrated relates to patterns of ongoing neural activity (Mckeown et al., 2020), and different individuals have different functional boundaries between anatomical regions.”

      Results, p. 5: “Next, individual gradients were computed for each subject and the four different FC modes and aligned to the template gradients with Procrustes rotation. It rotates a matrix to maximum similarity with a target matrix minimizing sum of squared differences. As noted, Procrustes matching was applied without a scaling factor so that the reference template only matters for matching the order and direction of the gradients. Therefore, it allows comparison between individuals and hemispheres. The individual mean gradients showed high correlation with the group gradients LL (all Pearson r > 0.97, P spin < 0.001).”
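
      Procrustes rotation without scaling, as described in the passage above, can be sketched with a singular value decomposition. This is a generic orthogonal-Procrustes illustration under stated assumptions, not the authors' implementation: the function name `procrustes_rotation` is hypothetical, the data are random, and the 180 rows merely stand in for the parcels of one hemisphere.

```python
import numpy as np

def procrustes_rotation(source, target):
    """Orthogonal matrix R minimising ||source @ R - target||_F (no scaling).

    R may include reflections, so gradient sign flips are corrected as well.
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

rng = np.random.default_rng(0)

# Template gradients: 180 illustrative parcels x 3 gradients
target = rng.standard_normal((180, 3))

# An "individual" embedding: the template under an arbitrary orthogonal transform
q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
source = target @ q.T

rotation = procrustes_rotation(source, target)
aligned = source @ rotation

print("max abs deviation after alignment:", np.abs(aligned - target).max())
```

      Because the transform is orthogonal and no scaling factor is applied, distances between parcels within the embedding are unchanged; only the order and direction of the axes are matched to the template.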

      Only the first three gradients are used but why? What about the fourth gradient? Specific theoretical interpretation is needed. At the individual level, is it ensured that the first gradients of all individuals correspond to each other? In this study, it is unclear whether we should or should not care about the G2 and G3. The results of G2 and G3 showed up randomly to some degree.

      In the current study we focused on the principal gradient in the main analysis, given its association with sensory-transmodal hierarchy, microstructure, and evolutionary alterations (Margulies et al., 2016; Paquola et al., 2019; Xu et al., 2020).

      Conversely, gradient 2 reflects the dissociation between visual and sensory-motor networks and gradient 3 is linked to task-positive, control, versus ‘default’ and sensory-motor regions. We analyzed the asymmetry and heritability of the first three gradients (explaining 23.3%, 18.1%, and 15.0% of the variance of the rsFC matrix, respectively). However, we extracted the first ten gradients to maximize the degree of fit (Margulies et al., 2016; Mckeown et al., 2020). We now also show the mean asymmetry of G4-10 in a supplementary figure. To ensure correspondence of gradients across individuals, we aligned the individual gradients to the group-level template with Procrustes rotation. Procrustes rotation rotates a matrix to maximum similarity with a target matrix by minimizing the sum of squared differences. The approach is typically used to compare ordination results and is particularly useful for comparing alternative solutions in multidimensional scaling. Figure S1 shows the mean gradients across subjects for each FC mode, which are close to the template gradient space in Figure 1D.

      Results, p. 5: “The current study analyzed asymmetry and its heritability of the first three gradients explaining most variance (Figure 1d). As they all have reasonably well described functional associations (G1: unimodal-transmodal gradient with 24.1%, G2: somatosensory-visual gradient with 18.4%, G3: multi-demand gradient with 15.1%). However, given we extracted ten gradients to maximize the degree of fit 26,52. We stated mean asymmetry of G4-10 in Figure 1-figure supplement 1.”

      The intra-hemispheric gradient is intuitive. However, it is hard to understand what the inter-hemispheric gradient means. From the data perspective, yes you can do such gradient comparison between the LR and RL connectome but what does this mean? Why should we care about such asymmetry? From the introduction to the discussion, the authors simply showed the data of inter-hemispheric gradients without useful explanation. This issue should be solved.

      We are happy to further clarify. The LR and RL connectivity reflects cross-hemispheric functional signal interaction via the corpus callosum, whose structural asymmetry is commonly studied (Karolis et al., 2019). Intra-hemispheric connections, compared to inter-hemispheric connections, have been suggested to reflect the inhibition of the corpus callosum and to underlie hemispheric specialization. Different information relies on hemispheric specialization (e.g., language, reasoning, and attention) and/or inter-hemispheric information transfer (e.g., visual, motor, and crude spatial information) (Gazzaniga, 2000). To clarify and motivate the analysis of both intra- and inter-hemispheric asymmetry in functional gradients, we have now added further detail in the introduction, p. 5.

      Here is text: Introduction, p. 4. “The full FC matrix contains both intra-hemispheric and inter-hemispheric connections. Intra-hemispheric connections, compared to the inter-hemispheric connections, have been suggested to reflect the inhibition of corpus callosum and may underlie hemispheric specializations involving language, reasoning, and attention. Conversely, inter-hemispheric connectivity may reflect information transfer between hemispheres, for example a wide range of modal and motor information, and crude information concerning spatial locations 48. Previous studies have reported intra-hemispheric FC to study gradient asymmetry 6,38. By having the callosum related to association white matter fibers, one hemisphere could develop for new functions while the other hemisphere could continue to perform the previous functions for both hemispheres 48. Therefore, in addition to the intra-hemispheric FC gradients, we depicted the inter-hemispheric FC, which is abnormal in patients with schizophrenia 23,49 and autism 24.”

      as well as Discussion, p. 16 “Conversely, the transmodal frontoparietal network was located at the apex of rightward preference, possibly suggesting a right-ward lateralization of cortical regions associated with attention and control and ‘default’ internal cognition 62,63. The observed dissociation between language and control networks is also in line with previous work suggesting an inverse pattern of language and attention between hemispheres 3,64. Such patterns may be linked to inhibition of corpus callosum 65, promoting hemispheric specialization. It has been suggested that such inter-hemispheric connections set the stage for intra-hemispheric patterns related to association fibers 48. Future research may relate functional asymmetry directly to asymmetry in underlying structure to uncover how different white-matter tracts contribute to asymmetry of functional organization.”

      and Discussion, p.18 “Though overall intra- and inter-hemispheric connectivity showed a strong spatial overlap in humans, we also observed marked differences between both metrics across our analysis. For example, although we found both intra- and inter-hemispheric differences in gradient organization to be heritable, only for intra-hemispheric asymmetry we found a correspondence between degree of asymmetry and degree of heritability. Similarly comparing asymmetry observed in human data to functional gradient asymmetry in macaques, we only observed spatial patterning of asymmetry was conserved for intra-hemispheric connections. Whereas intra-hemispheric asymmetry relates to association fibers, commissural fibers underlie inter-hemispheric connections 77 It has been suggested that there is a trade-off within and across mammals of inter- and intra-hemispheric connectivity patterns to conserve the balance between grey and white-matter 76. Consequently, differences in asymmetry of both ipsi- and contralateral functional connections may be reflective of adjustments in this balance within and across species. Secondly, previous research studying intra- and inter-hemispheric connectivity and associated asymmetry has indicated a developmental trajectory from inter- to intra-hemispheric organization of brain functional connectivity, varying from unimodal to transmodal areas 78,79. It is thus possible that a reduced correspondence of asymmetry and heritability in humans, as well as lack of spatial similarities between humans and macaques for inter-hemispheric connectivity may be due to the age of both samples (young adults in humans, adolescents in macaques). Further research may study inter- and intra-hemispheric asymmetry in functional organization as a function of development in both species to further disentangle heritability and cross-species conservation and adaptation.”

      When aligning intra-hemispheric gradient, choosing averaged LL mode as the reference may introduce systematic bias towards left hemisphere. Such an issue also applies to LR-RL gradient alignment as well as cross-species gradient alignment. This methodological issue should be solved.

      We thank the Reviewer for raising this point. We repeated the analysis using RR as the reference, and the results were virtually identical. We have stated this in the Results, p. 13. Regarding the cross-species alignment, we averaged the left and right hemispheres to reduce the systematic bias; the correlation and comparison results remained robust. We have now updated the method and the corresponding results (p.10). Here is the text:

      Results (p.15): “We also set the RR FC gradients as reference, the first three of which explained 22.8%, 18.8%, and 15.9% of total variance. We aligned each individual to this reference. It suggested all results were virtually identical (Pearson r > 0.9, P spin < 0.001).”

      Results (p.10): “To reduce a possible systematic hemispheric bias during the cross-species alignment, we averaged the left and right hemisphere. We found that the macaque and macaque-aligned human AI maps of G1 were correlated positively for intra-hemispheric patterns (Pearson r = 0.345, P spin = 0.030). For inter-hemispheric patterns, we didn’t observe a significant association (Pearson r = -0.029, P spin = 0.858)”

      The sample size of monkey (i.e., 20) is far less than human subjects (> 1000). Such limitation raises severe concern on the validity of the currently observed gradient asymmetry pattern in the monkey group, as well as the similarity results with human gradient asymmetry pattern. Despite the marginal significance of G1 inter-hemisphere gradient between humans and monkeys, I feel overall there is no convincingly meaningful similarity between these two species. However, the authors' discussion and conclusion are largely based on strong inter-species similarity in such asymmetry. The conclusion of evolutionary conservation for gradient asymmetry, therefore, is not well supported by the results.

      We agree with your comments. Although the macaque sample is small compared to the human sample, it is a relatively decent size for NHP studies (most of which have N<10). Of note, recent work suggested that the individual variation pattern can be captured using 4 subjects in both humans and macaques (Ren et al., 2021).

      To overcome potential overinterpretation of our findings, we have now changed the title to a more descriptive format: “Heritability and cross-species comparisons of asymmetry of human cortical functional organization”

      We have also detailed the findings further in the Abstract: “These asymmetries were heritable in humans and, for intra-hemispheric asymmetry of functional connectivity, showed similar spatial distributions in humans and macaques, suggesting phylogenetic conservation.”

      We have pointed out the small sample size in the limitations. Please find the text below: Discussion, p. 18: “Due to the small sample size of macaques, it is important to be careful when interpreting our observations regarding asymmetry in macaques, and its relation to asymmetry patterning observed in humans. Therefore, further study is needed to evaluate the asymmetry patterns in macaques using large datasets 53,79”

      We have also nuanced the conclusion, p.19: “This asymmetry was heritable and, in the case of organization of intra-hemispheric connectivity, showed spatial correspondence between humans and macaques. At the same time, functional asymmetry was more pronounced in language networks in humans relative to macaques, suggesting adaptation.”

    1. Author Response

      Reviewer #2 (Public Review):

      Weaknesses:

      1) It is surprising that certain enzymes with established depalmitoylation activity were excluded from BrainPalmSeq data-base (e.g. ABHD4, ABHD11, ABHD12, ABHD6)

      We have now included additional depalmitoylating enzymes in our database and manuscript.

      2) Albeit not essential it will be of great interest to include in the established database enzymes necessary for synthesis of ACYL-CoA (e.g. ACSL enzymes). One improvement may include the ability of future researchers to add such curated analysis to the platform within future research studies.

      We agree with the reviewer there are many expansions of our gene set that would be interesting to include. Given the size of the current manuscript however, for brevity we have decided at present to curate data for the core set of genes that directly regulate dynamic palmitoylation. We have also added a ‘Contact Us’ feature to the website, so that repeatedly requested genes or datasets can be added in future.

      3) The experimental validation presented in figure 6 relies on over-expression of substrates and ZDHHC enzymes. This setup is known to often provide unspecific S-acylation events which result from excess enzyme or substrate availability. Hence, such validation would be greatly strengthened by loss of function experiments.

      We have now done loss-of-function experiments and included results in major discussion point 1 above. If the editors/reviewers think it is appropriate to add to the manuscript, we will comply. However, as our negative data does not negate the fact that ZDHHC9 is able to palmitoylate the myelin proteins tested, but merely suggests it may not be necessary for protein palmitoylation in vivo, we do not think it strengthens the manuscript.

      4) The authors relevantly use in-situ hybridization images from the Allen Brain atlas to validate their predictions. Although it is understandable that an extensive experimental validation of the predictions here established would be out of the scope of the current study, this work could be improved by validating the RNA expression at the protein level of certain abundant ZDHHC enzymes in available neuro-associated cell types.

      We have now validated RNA expression at the protein level for a few palmitoylating and depalmitoylating enzymes.

      5) It would be interesting if the authors would further compare the predicted association clusters (e.g. figure 1), substrates (figures 1 and 2), and S-acylation pairs (figure 4) determined here with previously determined ZDHHC enzyme associations described in different cell types and biological systems. Alternatively, further relevant validation could include testing whether further established ZDHHC-ZDHHC cascades (e.g. ZDHHC3-7) can also be detected within specific cells or regions of the CNS.

      On our website, all expression data can be downloaded below the heatmaps for each study, and the cell type expression relationships between any 2 genes can be plotted by the user to reveal cell types (if any) within which genes are co-expressed. In response to this comment and that of Reviewer 3 below, we have now performed such analysis on ZDHHC5/ZDHHC20 and ZDHHC6/ZDHHC16, which are to our knowledge the best established ZDHHC cascades. We have included these plots in new Figure 1 – figure supplement 2, along with discussion on line 172. Similar analysis has been performed on the known ZDHHC-accessory protein pairs (see below).

      6) Figure 3B: it is not clear why the cluster of zdhhcs with high layer specific expression displayed at the top of the graph does not follow the low-to-high expression scale of the table.

      The expression data in this figure is grouped by hierarchical clustering, rather than in order of low-to-high expression, in order to be consistent with Figure 2B. While we believe this is the better way to display the data, we are willing to modify if the editors/reviewers have a strong preference.

      7) Figure 4D: the more relevant potential cooperative pairs (ZDHHCs-APTs) could be highlighted in more contrasted colours.

      We thank the reviewer for this suggestion but at this stage would prefer to keep the color scheme as it is so that readers are better able to formulate their own hypotheses when observing these figures.

      Reviewer #3 (Public Review):

      Weaknesses:

      1) There is a vast amount of data available and the description and discussion of this could be endless, but there are a few points that could be brought out in more detail. For example, the correlation (or lack of correlation) of expression of the proposed zDHHC-PAT accessory proteins with their cognate zDHHCs. The dominance of a relatively small number of zDHHC enzymes (20, 2, 17, 3, 21, 8) in the CNS also merits some discussion. Is the combination of a high-capacity, low-specificity enzyme (zDHHC3) with others that are regarded as more 'specific'? I believe none of these are ER-resident - they represent Golgi and PM?

      The reviewer brings up many interesting questions. Indeed, we were hopeful that this type of mining of RNAseq data would bring to light many questions that can be followed up on in future publications.

      We have addressed the correlation in expression of accessory proteins with their cognate ZDHHCs with new data.

      We are unsure how to address the dominance of a relatively small number of ZDHHC enzymes (20, 2, 17, 3, 21, 8) in the CNS, beyond highlighting this expression pattern. We believe that interpretation of the expression of this in any way (e.g. co-expression of high-capacity, low-specificity enzymes (ZDHHC3) with more 'specific' ZDHHCs) would merely be speculative. However, we are open to adding further discussion with some guidance from the reviewer.

    1. [Bruno Giussani, co-curator of TED] gave the example of Steven Pinker’s popular TED talk on the decline of violence over the course of history, based on his book The Better Angels of Our Nature. Pinker is a respected professor of psychology at Harvard, and few would accuse him of pulling his punches or yielding to thought leadership’s temptations. Yet his talk became a cult favorite among hedge funders, Silicon Valley types, and other winners. It did so not only because it was interesting and fresh and well argued, but also because it contained a justification for keeping the social order largely as is. Pinker’s actual point was narrow, focused, and valid: Interpersonal violence as a mode of human problem-solving was in a long free fall. But for many who heard the talk, it offered a socially acceptable way to tell people seething over the inequities of the age to drop their complaining. ‘It has become an ideology of: The world today may be complex and complicated and confusing in many ways, but the reality is that if you take the long-term perspective you will realize how good we have it,’ Giussani said. The ideology, he said, told people, ‘You’re being unrealistic, and you’re not looking at things in the right way. And if you think that you have problems, then, you know, your problems don’t really matter compared to the past’s, and your problems are really not problems, because things are getting better.’ Giussani had heard rich men do this kind of thing so often that he had invented a verb for the act: They were ‘Pinkering’ — using the long-run direction of human history to minimize, to delegitimize the concerns of those without power. There was also economic Pinkering, which ‘is to tell people the global economy has been great because five hundred million Chinese have gone from poverty to the middle class. And, of course, that’s true,’ Giussani said. ‘But if you tell that to the guy who has been fired from a factory in Manchester because his job was taken to China, he may have a different reaction. But we don’t care about the guy in Manchester. So there are many facets to this kind of ideology that have been used to justify the current situation.’ —Winners Take All, pp. 126-127

      An early example of the verbification of Steven Pinker's name. Here it describes the tendency of predominantly privileged men to argue that, because the direction of history has been so positive, those without power shouldn't complain.

      I've also heard it used more generally to mean amassing a preponderance of evidence on a topic, as in Pinker's book The Better Angels of Our Nature, without necessarily proving one's thesis convincingly.

    1. It is ironical that we Senators can in debate in the Senate directly or indirectly, by any form of words, impute to any American who is not a Senator any conduct or motive unworthy or unbecoming an American -- and without that non-Senator American having any legal redress against us -- yet if we say the same thing in the Senate about our colleagues we can be stopped on the grounds of being out of order. It is strange that we can verbally attack anyone else without restraint and with full protection and yet we hold ourselves above the same type of criticism here on the Senate Floor.  Surely the United States Senate is big enough to take self-criticism and self-appraisal.  Surely we should be able to take the same kind of character attacks that we "dish out" to outsiders. I think that it is high time for the United States Senate and its members to do some soul-searching -- for us to weigh our consciences -- on the manner in which we are performing our duty to the people of America -- on the manner in which we are using or abusing our individual powers and privileges.

      Aristotelian criticism is largely concerned with how effective an artifact is in reaching its intended audience. In this case, Senator Smith never mentions Sen. McCarthy by name, but given the historical context, it is obvious she refers to him and his supporters in condemning "the Senate and its members". One measure of her success in doing so appears in Lorraine Boissoneault's Smithsonian article, which states, "The one person who didn’t forget Smith’s speech was McCarthy himself. 'Her support for the United Nations, New Deal programs, support for federal housing and social programs placed her high on the list of those against whom McCarthy and his supporters on local levels sought revenge,' writes Gregory Gallant in Hope and Fear in Margaret Chase Smith’s America. When McCarthy gained control of the Permanent Subcommittee on Investigations (which monitored government affairs), he took advantage of the position to remove Smith from the group, replacing her with acolyte Richard Nixon, then a senator from California." This may not have been an intended effect, but it nonetheless shows the significance of Smith's speech and the degree to which it reached her audience. Unfortunately, it's hard to say how effective her speech would ultimately have been, since its popularity waned when the Korean War broke out later the same month, inclining many toward the more right-wing, anti-communist approach favored by McCarthy and many other Republicans.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Excellent quality of cell biology and biochemistry. Additional support is needed for the claim of actin elongation using different formin variants.

      Reviewer #1 (Significance (Required)): Ingrid Billault-Chaumartin and co-authors described interesting research that provides insights on formin-isoform specific function in fission yeast and a new role of Fus1 FH2 domain in cell-cell fusion event. While three formin isoforms have different localization, research proposed an additional dissection in their functional differences by having different functions in C-terminus, including FH1 FH2 and formin C-terminus. The work also described additional factors that regulate cell fusions from autotrophy effect and formin expression level, in addition to the well-accepted formin biochemical activities. Here are my comments regarding the strengths of the work and improvements that could further strengthen the story.

      Major comments 1. Fig. 1 shows Cdc12C could recapitulate Fus1 function by ~80% if fused with Fus1C, whereas deletion of the C-terminal tail of Cdc12 following FH2 introduces drastic dysfunction. Together with Fig. 3, these results indicate Cdc12 Cter plays a more important role than Fus1 Cter for their respective functions. Such results suggest a Cter-mediated mechanism that differentiates the functions of the three fission yeast formin isoforms. The authors examined contributions from the differences in FH1 (Figs 4, 5) and FH2 residues (Fig. 6), whereas the obvious phenotype of Cter was not further investigated and not much discussed. The Cter of budding yeast formins interacts with nucleation-promoting factors, Bud6 and Aip5. Although S. pombe does not have orthologs of budding yeast Bud6 and Aip5, I wonder whether the authors could discuss the potential contribution of Cter in differentiating S. pombe formins.

      The reviewer is correct that the C-terminal tail region of Cdc12 beyond the FH1-FH2 domains has a strong influence on the ability of Cdc12C to replace Fus1C. This is one reason why we specifically investigated the possible role of the Fus1 C-terminal tail, which is much shorter than that of Cdc12. We found that the Fus1 C-terminal tail plays only a very minor role in regulating Fus1 function, as described in Figure 3. We note that, contrary to what the reviewer states, Bud6 exists in S. pombe and binds the C-terminal tail of the formin For3 (see Martin et al, MBoC 2007), but whether it binds Fus1 is unknown. We have expanded our discussion to include a paragraph on the role of formin C-termini.

      Because the manuscript is focused on the function of Fus1 formin, we did not explore further the role of the Cdc12 C-terminal tail. It was previously shown that this region of Cdc12 contains an oligomerization domain that promotes actin bundling (Bohnert et al, Genes and Dev 2013). It is thus likely that this helps Cdc12 FH1-FH2 perform well in replacement of Fus1. In fact, it is likely that oligomerization boosts formin function, as we have discovered that Fus1 N-terminus contains a disordered region that fulfils exactly this function. This is described in a distinct manuscript under review elsewhere and just deposited on BioRxiv (Billault-Chaumartin et al, BioRxiv 2022; DOI: 10.1101/2022.05.05.490810). We have now cited this point in the discussion.

      1. Here, the study focuses on the FH1 between Fus1 and Cdc12 to understand their different functions in actin polymerization. FH1 mediates actin elongation through its interaction with profilin via polyP. The transfer rate of G-actin from profilin and profilin sliding depend on the polyP patterns, namely the length of each polyP motif and its distance to FH2 (Naomi Courtemanche and Thomas D. Pollard, JBC, 2012). To better understand the mechanisms by which these engineered FH1 variants act on both Fus1 and Cdc12 in Fig. 4, the authors may want to list the sequences of these engineered FH1 domains, including the number and length of polyP motifs, and discuss these patterns.

      This list and discussion were available in the initial paper that characterized each of the constructs in vitro (Scott et al, MBoC 2011). We have now re-drawn it in a supplemental figure for convenience (as also answered in response to minor point 2), which is already provided in the revised manuscript as Figure S1. (Previous supplementary figures are re-numbered S1>S2, S2>S3 and S3>S4).

      1. Figs.4,5 cell biology results do not directly support the point of specific elongation rate unless the LifeAct-labeled actin cable elongation speed could be followed and quantified. The fluorescent tagging of tropomyosin does not show the actin cable pattern, which makes it very difficult to be used to study actin cable dynamics, such as elongation. Therefore, I feel the data in current Fig. 4 and Fig. 5 could not claim the differences in actin elongation without a quantitative comparison of elongation rate. I suggest a CK666 treatment to increase the visibility of the actin cable pattern of LifeAct, used before in both fission and budding yeasts, which would allow the author to quantify the actin cable elongation rate. Another way is to use the TIRF assay used in this study, which would give a better quantitation of formin nucleation and profilin-aided elongation.

      We respectfully disagree with the reviewer on this point. All the constructs we use in vivo have been characterized in vitro and their elongation rate carefully measured (Scott et al, MBoC 2011). These values are thus known and can be directly compared to our results in vivo.

      Of course, it would be fantastic to be able to directly measure formin elongation rates in vivo, but we are not aware that this has been done in any system. The proxy experiments that the reviewer suggests would be good ones, but each faces technical challenges that make them impossible in our system. First, because the fusion focus is a structure that forms in response to cell-cell pheromonal communication, we cannot add CK-666 or any other drug during this phase, as this perturbs the pheromone signal. Indeed, we had shown that a simple buffer wash leads to loss of the fusion focus (see Dudin et al, Genes and Dev 2016). Second, the fusion focus is at the contact site between partner cells, i.e. somewhat distant (1-2µm) from the coverslip during imaging. It is thus impossible to use TIRF. Finally, the fusion focus is a tightly packed actin structure. This is the reason why (rather than use of the tropomyosin marker) we cannot image single actin filaments (or even bundles) of which we could follow the dynamics as has been done to measure the retrograde flow of actin cables in yeast.

      What we have done is to use a better tropomyosin tag, mNeonGreen-Cdc8, which was just described (Hatano et al, BioRxiv 2022; DOI: 10.1101/2022.05.19.492673) to quantify amounts of linear actin. Although this is not a measure of elongation rate, it would give some sense about amounts of polymer assembled. We have obtained images with mNeonGreen-Cdc8 of all experiments previously conducted with GFP-Cdc8 and have replaced them in Figure 4C, Figure 5E, Figure 6E and Figure S2B. We have also quantified the relevant strains. The relative intensities of mNeonGreen-Cdc8 at the fusion focus at fusion time reflect remarkably well the measured elongation rates of the various formin constructs characterized in vitro. These data are now provided as new panels Figure 4F and Figure 5F.

      1. I appreciated the detailed biochemical dissections of multiple aspects of WTFus1 and Fus1R1054E, although the biochemical assays could not identify the mechanism by which R1054E causes the cell fusion. In many cases, the formin functions are diverse in diverse biological processes and sophisticated that cannot be explained well only from its biochemical activities in actin polymerization, such as the bundling, nucleation, and elongation studied in this story regarding fusion. This exciting information allows us to think of more possibilities that might regulate formin function rather than a direct change of formin activities in actin polymerization. I think a discussion of different aspects of functional regulation of formin might inspire society to investigate new possibilities to solve the mysteries. For example, the changes in formin behaviors and functions could be regulated by stress-induced formin turnover by degradation, cell signaling-regulated formin clustering and complex assembly, and their potential relevance to recruit protein constituents for fusion progression.

      We have added a paragraph on the role of Fus1 C-terminus. If you feel we should expand more on the diverse modes of regulation of formins, we could, but we have so far kept the discussion centred around the points of investigation in this paper, whose aim was to probe how changes in nucleation and elongation rates, rather than other regulations, affect the in vivo function of Fus1.

      Minor comments. 1. There are two types of "C", one includes FH1/FH2 and one following FH2, used in the manuscript, and it is a bit confusing. Better to differentiate them that allows an easy following. Fig. 1 uses Cdc12C-deltaC, Fig. 3 uses Fus1-delta Cter.

      We have updated the nomenclature to make this clearer: the C-terminal region beyond the FH1-FH2 domains is now called Cter throughout the manuscript.

      1. It's better to specify the amino acid position on the schematic of formins, such as panel A in many figures. It's always more informative to compare formin activities by considering the domain lengths, especially for the C-terminal tail that is variable in lengths and sequences. With similar thoughts, I suggest a supplementary figure that lists the sequence of all FH1 domains variants and Cter domains, such as the FH2 domain in Fig. S1.

      We have made a supplementary figure (new Figure S1) listing all constructs with specific aa positions as well as the FH1 domain variants and their sequences (see also answer to point 2 above). We have not added the sequence of the Cter domains in this figure, as these are extremely divergent and not particularly informative at this point.

      1. "n" for the statistic needs to be provided for Fig. S3.

      We have added the information to the legend of the figure (now Fig S4).

      1. The SDS-PAGE staining gel of the purified recombinant proteins for biochemical assays should be provided, particularly for these newly reported mutant variants.

      This is now provided as new panel S4C. We show the purified recombinant Cdc122FH1-Fus1FH2 proteins, which are the newly reported ones.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): In this study, Billault-Chaumartin and colleagues investigate the molecular specialization of the S. pombe formin, Fus1. The authors systematically modulate the actin filament elongation and nucleation activities of Fus1 by expressing chimeric constructs that contain Formin Homology 1 and 2 domains from two other formins with known polymerization activities. By characterizing the architecture of the fusion focus and the efficiency of cell fusion, they find that both the elongation and nucleation properties of Fus1 are specifically tailored for its cellular role. Comparison of formin constructs with similar elongation and nucleation activities also reveals that the Fus1 FH2 domain possesses a specific property that promotes efficient cell fusion. Using sequence alignment and homology modeling, the authors identify R1054 as the residue that confers this novel, fusion-specific activity to Fus1, despite producing no effect on its bundling or polymerization properties in vitro.

      Overall, this study is well motivated, and the results support the conclusions that are drawn. I have only minor suggestions, as described below.

      Minor comments: (1) The schematic diagrams of the chimeric formin constructs are very helpful. However, it is difficult to distinguish the colors from one another, especially in the case of the Cdc12FH1-Fus1FH2 variant, which requires discernment of the relatively small purple region within the dark blue molecule. Would it be possible to modify the colors to increase their contrast? Similarly, the blue and gray data sets in Figure 3B are very difficult to discern.

      We have changed the colours to improve contrasts.

      (2) The affinities (Kd) with which the formins bind the barbed ends as described in the second-to-last paragraph on page 8, in Figure Legend 7G, and in the "Analysis of pyrene data" section of the Materials and Methods should be defined as dissociation "constants", rather than dissociation "rates". Also, these affinities are lacking units in the following sentence on page 8.

      We have corrected this. The unit is nM.

      (3) When comparing the TIRF micrographs in Figure S3A, it looks as though both formins (but especially the R1054E variant) nucleate more filaments in the presence of profilin than in its absence. Is this a reproducible effect? If so, can the authors provide an explanation for this?

      There is strong variability in the filament numbers observed by TIRF in replicate experiments, which makes it difficult to use this technique to determine the nucleation efficiency. This may be due for instance to the stickiness of the glass, which may influence the number of observed filaments. We have measured the number of filaments after 130s of polymerization for each condition to test whether there are any significant differences between conditions despite overall variability. The measurements suggest that the addition of profilin increases the number of actin filaments. However, these results should be taken very carefully due to the experimental variations (very large error bars). Additionally, because Fus1-associated filaments are very short in absence of profilin, it is quite likely that this influences their crowding at the glass surface compared to longer filaments (in presence of profilin). Since in TIRF we can only observe the filaments at the glass surface, we may miss a portion of short Fus1-bound actin filaments in absence of profilin.

      For these reasons, and because the possible role of profilin in modulating nucleation efficiency by formins is not the object of the work here, we would prefer not to include this graph in the manuscript.

      Reviewer #2 (Significance (Required)): This study contributes a key advancement towards understanding how the polymerization activities of formins are tailored to support diverse and specific cellular functions. The results in this study nicely complement and expand upon similar recent work that dissected the polymerization requirements of the formin Cdc12, which mediates cytokinetic ring assembly in S. pombe, and For2, which drives the assembly of apical networks that are necessary for polarized growth in Physcomitrella patens. As such, this work will likely be of significant interest to scientists who study mechanisms of actin dynamics regulation. The identification of R1054 as a residue that confers a novel regulatory activity to the FH2 domain of Fus1 will also likely be of great interest to biochemists and other scientists who study formins at the molecular level.

      My expertise is in the field of formins and actin polymerization.

    1. Reviewer #1 (Public Review): 

      In this article Farrell et al. leverage existing datasets which measure frailty longitudinally in mice and humans to model 'robustness' (the ability to resist damage) and 'resilience' (the ability to recover from damage), their dynamics across age, and their relative contributions to overall frailty and mortality. The concept of separating damage/robustness from recovery/resilience is valid and has many important applications including better assessment and prediction of effective intervention strategies. I also appreciate the authors' sophisticated attempts to effectively model longitudinal data, which is a challenge in the field. The use of human and mouse data is another strength of the study, and it is quite interesting to see overlapping trends between the two species. 

      While I find the rationale sound and appreciate the approach taken at a high level, there are a few key considerations of the specific data used which are lacking. The authors conceptualize resilience based on studies which primarily use short time scales and dynamic objective measures (ex. complete blood cell counts in Pyrkov et al.) often in conjunction with an acute stress stimulus. For example, they heavily cite Ukraintseva et al. who define resilience as "the ability to quickly and completely recover after deviation from normal physiological state or damage caused by a stressor or an adverse health event." 

      Given these definitions, the human data used seem to fit within this framework, but we should carefully consider the mouse data. The mouse frailty index is a very useful tool for efficiently measuring the organismal state in large cohorts. A tradeoff for quickly measuring a broad range of health domains is that the individual measurements are low resolution (categorical) and involve inherent subjectivity (which may be considered part of the measurement error). Some transitions in individual components are due to random measurement error and I believe this is especially likely with decreases (or 'resilience' transitions). 

      The reason I think the resilience transitions are subject to high measurement error is that I am skeptical as to whether many of the deficits in the mouse index are reversible under normal physiologic conditions. For example, it is exceptionally unlikely for a palpable/visible tumor to resolve in an aged mouse over the time scales studied here, thus any reversal that was observed is very likely due to random measurement error. Other components whose reversibility I doubt are alopecia, loss of fur color, loss of whiskers, tumors, kyphosis, hearing loss, cataracts, corneal opacity, vision loss, rectal prolapse, and genital prolapse. 

      In summary, I applaud the authors' efforts in generating complex models to better understand longitudinal aging data. This is an important area that needs further development. I appreciate their conceptualization of resilience and robustness and think this framework has an important place in aging research. I also appreciate their cross-species approach. However, the authors may have over-conceptualized and made some assumptions about the mouse data which may not be valid. It will be important to assess the results with careful consideration of the time scales of the underlying biology and the resolution and measurement error inherent to these tools.

    1. What did Franklin himself think about abortions? In 1728 during his early years as a printer, he generated controversy over something he would end up doing himself. According to “Benjamin Franklin: An American Life” by Walter Isaacson, he “manufactured” an abortion debate, largely because he wanted to crush a rival, but his own opinions may not have been too strong about it. Franklin wrote a series of anonymous letters for another paper to draw attention away from Samuel Keimer’s paper: The first two pieces were attacks on poor Keimer, who was serializing entries from an encyclopedia. His initial installment included, innocently enough, an entry on abortion. Franklin pounced. Using the pen names “Martha Careful” and “Celia Shortface,” he wrote letters to Bradford’s paper feigning shock and indignation at Keimer’s offense. As Miss Careful threatened, “If he proceeds farther to expose the secrets of our sex in that audacious manner [women would] run the hazard of taking him by the beard in the next place we meet him.” Thus Franklin manufactured the first recorded abortion debate in America, not because he had any strong feelings on the issue, but because he knew it would help sell newspapers.

      Benjamin Franklin manufactured the first recorded abortion debate in America to help sell his newspapers and to crush a rival.

    1. The student doesn’t have a strong preference for any of these archetypes. Their notes serve a clear purpose that’s often based on a short-term priority (e.g, writing a paper or passing a test), with the goal to “get it done” as simply as possible.

      The typical student note-taking method of transcribing, using (or often not using at all), and keeping notes is doomed to failure.

      Many students make the mistake of not making their own actual notes. By this I don't mean they're not writing information down. In fact many are writing information down, but we can't really call these notes. Notes by definition ought to transform something seen or heard into one's own words. Without the transformation, these students think that they're taking notes, but in reality they're focusing their efforts on being transcriptionists. They're attempting to capture something for later consumption. This is a deadly trap! By only transcribing, they're not taking advantage of transforming information by putting ideas down in their own words to test their understanding. Often worse, even if they do transcribe notes, they don't revisit them. If they do revisit them, they're simply re-reading them and not actively working with them. Only re-reading them will lead to the illusion that they're learning something when in fact they're falling into the mere-exposure effect.

      Students who are acting as transcriptionists would be better off simply reading a textbook and taking notes directly from that.

      A note that isn't revisited or revised may as well be a note not taken. If we were to consider a spectrum of useful, valuable, and worthwhile notes, these notes would be at the lowest end of the spectrum.

      link to: https://hypothes.is/a/QgkL6IkIEeym7OeN9v9New

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      From the start, the authors would like to thank all the reviewers for their careful and constructive consideration of our manuscript. We have now made several changes to the paper and believe it to be better for the feedback.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this study, Rees et al. perform an RNA-seq circadian time course experiment in the recently formed allopolyploid wheat. Through comparisons with other circadian transcriptomic datasets in other species it appears that the period of rhythmic genes is much more variable in wheat with a shift to longer periods compared to the other species examined. Interestingly, by analyzing circadian parameters among expressed genes, they find evidence that this newly formed allopolyploid already shows signs of divergence in circadian traits among homoeologs. A thorough comparison with circadian regulated genes in Arabidopsis reveals overlap in phasing of genes involved in certain biological processes such as photosynthesis and light signaling whereas genes involved in starch metabolism were found to have different levels of rhythmicity and phasing. This dataset will be a great resource for the community and enable new predictions about the influence of polyploidy on the circadian control of important crop improvement traits and the circadian regulation of gene expression.

      Major Comments

      1. The results section starts with very little explanation of the experiment. It would help to provide a little more detail at the start of the results to explain the context for the experiment and what was done, when samples were collected and for how long. For the methods section, it isn't until line 650 that it is clearly stated that the sampling started at ZT0. It would be better to put this in the plant materials and growth condition section.

      Thank you for highlighting the need for this context; we agree that the manuscript is improved by an introduction to the experiments. We have now included an “Experimental context” section in the results and have taken the opportunity to explain how the full 0-68h and 24-68h datasets are used within our analysis (Ln 74-82). We have also edited the Methods as suggested (Ln 610-615).

      The low proportion of circadian regulated genes is likely due to the very low cutoff for calling a gene expressed, especially when there are three days of repeated timepoints. If a gene is expressed across the time course it should have values above TPM 0 for at least 3 time points in order for it to be expressed each day. I'd also be suspicious of a gene with a TPM value less than 0.5. Comparing these types of numbers is always challenging due to the various cutoffs used. Along those lines, why was a different filtering scheme used for Arabidopsis (line 657)?

      We completely agree that the proportion of genes described as rhythmic changes a great deal with the threshold at which low-expression transcripts are excluded, as well as with the window over which measurements are taken and the q-value cut-off for rhythmicity. We performed an analysis to test the effects of applying a pre-filtering step to exclude low-expression genes. Briefly, we removed genes with expression less than 0.1 TPM in six or more timepoints and again ran Metacycle to define numbers of rhythmic genes. Our results are discussed in Supplementary Note 1 and are presented in Supplementary Table 1. Regardless of the cut-offs applied, Arabidopsis and wheat data were treated identically, and our findings reported in the main results were consistent with those reported in the Supplementary analysis. Thank you for raising this point, as we have now improved our description of this analysis in the main text (Ln 92-95).

      Regarding the different filtering schemes, the filtering mentioned by Reviewer 1 was applied to both Arabidopsis and wheat data for a stricter retention of rhythmic genes, as part of the pre-WGCNA clustering analysis. Filtering to retain genes with >0.5 TPM across 3 timepoints was applied to reduce lowly expressed genes, which act as background 'noise' when defining clusters. We applied this across 3 timepoints rather than the WGCNA suggestion of 90% of samples because the patterns of expression in our rhythmically filtered datasets were cyclical in nature.
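
      The two expression filters described in this response can be sketched as follows. This is a minimal illustration, not the code used for the paper: it assumes a hypothetical matrix with genes as rows and timepoints as columns (replicate means, in TPM), and it reads ">0.5 TPM across 3 timepoints" as "above 0.5 TPM in at least 3 timepoints". The function names and toy values are invented for the example.

```python
import numpy as np

# Hypothetical layout: rows are genes, columns are timepoints (replicate means, TPM).

def passes_low_expression_prefilter(tpm, threshold=0.1, n_low_to_exclude=6):
    """True for genes kept by the pre-filter.

    A gene is removed when it is below `threshold` TPM in six or more
    timepoints, so it is kept when fewer than six timepoints are low.
    """
    n_low = (tpm < threshold).sum(axis=1)
    return n_low < n_low_to_exclude


def passes_clustering_filter(tpm, threshold=0.5, min_timepoints=3):
    """True for genes above `threshold` TPM in at least `min_timepoints` timepoints."""
    n_high = (tpm > threshold).sum(axis=1)
    return n_high >= min_timepoints


# Toy data, 18 timepoints per gene (values are made up).
toy_tpm = np.array([
    [0.05] * 18,                   # always near zero
    [5.0, 6.0, 7.0] + [0.0] * 15,  # expressed in only three timepoints
    [2.0] * 18,                    # steadily expressed
])

prefilter_kept = passes_low_expression_prefilter(toy_tpm)
clustering_kept = passes_clustering_filter(toy_tpm)
print(prefilter_kept, clustering_kept)
```

      The second toy gene shows why the two filters can disagree: it fails the 0.1 TPM pre-filter but passes the 3-timepoint clustering filter.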

      In reference to the shortening of the period every day, this should be interpreted with caution. Period estimates from a single cycle are not very reliable and the SD for each day is around 3h, so it is difficult to draw any conclusions about changes in period each day. One option would be to only include genes with an SD less than 1h or alternatively to remove the discussion surrounding the comparison of period across the three days and focus on the period results for the full 24h-68h window shown in 1b. While 2 days is better, it is still not ideal for calling period; however, your first day will still have a strong diurnal driven pattern that will likely skew your circadian period.

      Thank you for your comments. Our question here was to determine whether the mean period lengths of rhythmic transcripts in wheat were always immediately longer upon transfer to constant light, or whether they got progressively longer over time. Upon reading the reviewer’s comment, we realize that the explanation provided of how we conducted this analysis was misleading. Our approach was to take a 44h sliding window (almost 2 days) and measure period at 0-44h, 12-56h and 24-68h. We have now added the previously missing statistics that support our findings in the main text, and which hopefully show the significance of the period changes over time (supplementary note 2). One of the most surprising findings from this analysis was that the periods in the first window were the longest 28.61h (SD=3.421), suggesting that the diel (driven) oscillation had little impact upon immediate transfer to free run. Our interpretation is that the mean period initially lengthens trying to follow the missing dusk signal, before the free-running endogenous period asserts itself in later cycles (Ln 129-128).
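
      The sliding-window scheme described here can be sketched as below. This is an illustrative reconstruction only: it assumes samples every 4h from 0h to 68h (as stated in the reviewer summary) and a 12h step between windows, and the function name is hypothetical. Period estimation itself (done with Metacycle in the paper) is not reproduced.

```python
import numpy as np

def sliding_windows(timepoints, width=44, step=12):
    """Return (start, end, column_indices) for each window fully inside the series."""
    start = int(timepoints[0])
    last = int(timepoints[-1])
    windows = []
    while start + width <= last:
        end = start + width
        # Indices of the samples that fall inside this window (inclusive bounds).
        idx = np.flatnonzero((timepoints >= start) & (timepoints <= end))
        windows.append((start, end, idx))
        start += step
    return windows


timepoints = np.arange(0, 72, 4)  # 0, 4, ..., 68 h: 18 samples
windows = sliding_windows(timepoints)
for start, end, idx in windows:
    print(f"{start}-{end}h uses {len(idx)} timepoints")
```

      Each window then contains the same number of samples, so period estimates from the three windows are based on equally sized inputs.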

      Line 87-93: If the dusk cue is important for clock expression you would think this would be biased towards genes that peak later in the day or near dusk. This argument should be connected better to the period results discussed on lines 98-101.

      Following on from our statement above, we have now combined our hypothesis for why wheat transcripts expressed at dusk have longer periods with the discussion about longer periods upon transfer to constant light. We agree that the two processes are likely to be connected and have now placed them together in Ln 129-128.

      1. Lines 650-652 of the Methods mentions that one of the main interests was the response to transfer to L:L, but this isn't mentioned in the introduction and doesn't come up much in the Results section. Most of the expression comparisons are focused on the 24-68h window. It also isn't clearly explained why the first day in LL is still a diurnal cycle. This would be helpful for non-circadian readers who may wonder why the first day is not included in all the analyses.

      We believe this point is now also addressed by the addition of an Experimental Context section in the results (Ln 74-82), in response to the reviewer’s previous comment.

      1. The phase comparisons shown in Figure suppl 4 are confusing. Suppl. Note 3 states that the period from the 24-68h data window was used to establish the bins but then the phase is shown for 3 different windows for each column? When calculating the phase for each of those 3 windows which period was used as the denominator in the phase calculation? Was it the period that matches the window used to calculate phase? What does the plot look like if phase is called on the same window used to calculate period (24-68)? What method was used to call phase in Suppl. Fig 4? As shown in Suppl Fig. 3 the method can influence the phase distributions. The methods suggest that the phase was determined with Metacycle but then FFT and MESA were used to verify. What does this mean verify, were they adjusted if FFT/MESA didn't agree?

      We agree that this Figure was unnecessarily complicated. We have now simplified Supplementary Figure 4 so that only the phases from 24-68h are presented. We have also clarified the legend to explain why we used FFT-NLLS to improve accuracy of Metacycle predictions.

      It is difficult to interpret the value of the period and phase comparisons shown in Fig. 1b, c, e and f after the preceding section about how variable the period and phase is across days. It is also surprising that the full 3 days were used to calculate the circadian statistics considering the first day is still under diurnal control. Do the ratios remain the same if the statistics are performed only on the 24h-68h window? For consistency with the rest of the paper and avoid confusion it would be best to have all circadian parameters measured using the same time window (24h-68h).

      Thank you for your comments, we can see how our logic in using the different data windows was not clear enough. As mentioned above, we have now explained the use of the full and shortened data windows in Experimental context section (Ln 74-82). Fig 1c is a comparison between different circadian datasets and as such we have only compared periods across 24-68h window. Similarly, Fig 1b is a global analysis of periods in rhythmic genes in comparison with Arabidopsis and so is again measured from 24-68h. We have now clarified this in the Figure legend for 1b.

      For comparisons of homoeologs within wheat triads, our question was in identifying homoeologs which behaved differently when placed under free-running conditions. We therefore still feel justified in using the full 0-68h dataset to identify homoeolog periods and phases which indicate differential circadian regulation, but we have now clarified that we are using the full dataset for the triad analysis in the results (Ln 140).

      Fig 1h-m. How were those genes chosen? It would help to see the SD of the replicates shown, since this is just showing one triad. It would be helpful to see a plot that represents the full set of triads rather than just one that looks best. If normalized to a standard phase they could be put on the same plot. For example, panel j is meant to show the 8h lag of subgenome D. If the data is normalized so that A and B are set to the same phase all the triads could be displayed with shaded SD bars to show the variation. Something like this would be a better representation of the data rather than showing just one example.

      Fig. 1h-m are case studies illustrating the different forms of circadian imbalance between homoeologs. We agree that it is helpful to see the standard deviation as error bars on these triad plots and have added it as suggested. In line with Reviewer 2’s suggestion we have removed Fig 1k and have replaced this with a comparison of mean normalised data for Triad 408 and Triad 2454, highlighting the difference between imbalanced rhythmicity and imbalanced amplitudes between homoeologs. Fig 1 i and m do not have error bars as adding standard deviations to mean normalised data wasn’t appropriate.

      Thank you for your suggestion on how to display the different phases between homoeologs. We feel that if we were to plot all of the triads displaying imbalanced phases, the differences in period length and accompanying noise differences would make the plot so busy as to be unreadable. We hope that the pie charts Fig 1 d-g give a global overview of the proportions of triads with circadian imbalance, but agree with the point that it is useful to allow readers to view triads of their own preference. Therefore, we have now provided the replicate level TPM data with the triad IDs annotated (Supplementary File 12) and Supplementary file 11 provides the classification of each triad alongside Metacycle statistics, ortholog identification and cluster information discussed elsewhere in the paper. Readers can now look up a triad or gene of interest and see how it was classified and what the expression looks like over the full dataset.

      It is surprising that there aren't more comparisons with the B. rapa dataset, especially when discussing the clock genes that show balanced or imbalanced expression. Are they similar in B. rapa and does it support your hypothesis that unbalance for certain genes are selected against?

      While we agree that a thorough, multiple species, comparative transcriptomic analysis is undoubtedly of interest for the future, we feel it is beyond the scope of the questions being addressed in this paper. We do compare paralogs defined as “similar” in the Greenham dataset with homoeologs described as “balanced” in our dataset and find that genes involved with “photosynthesis” and “generation of precursor metabolites and energy” tend to be common between the two groups, potentially suggesting conservation of balance for certain types of genes (Ln 206-217).

      Figure 2 networks. Why were these specific modules selected? Is it actually appropriate to directly compare these modules? I do see that some of the comparisons have high correlations from panel a, but not all. For example, in panel b the W9 and A9 modules have a correlation value of 0.92, which seems appropriate. However, panel c (modules W3 and A2) have a correlation of 0.42, which seems far too low to make any sort of comparison meaningful.

      The modules were selected to simplify the comparison of genes expressed in the dawn, midday, dusk, and night. We were interested in identifying common GO-enrichment in genes peaking throughout the day, although as you have identified, the differences in period length between Arabidopsis and wheat made this difficult. Our reasons for comparing module W3 with module A2, were that, even though their eigengenes are not highly correlated per se, when period length is taken into account, both modules peak during the subjective day (CT 6.34h and 6.19h) and they share commonly enriched GO terms which make sense for day peaking genes.
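
      Taking period length into account when comparing peak times is commonly done by rescaling phase to circadian time (CT), so that one full cycle always spans 24 circadian hours. The sketch below assumes the usual conversion CT = phase / period × 24; whether exactly this formula was used for the module comparison is an assumption, and the input values are invented.

```python
def circadian_time(phase_h, period_h):
    """Rescale a peak phase (hours) to circadian time on a 24 h scale.

    Assumes CT = phase / period * 24, with phase wrapped into one cycle.
    """
    if period_h <= 0:
        raise ValueError("period must be positive")
    return (phase_h % period_h) / period_h * 24.0


# Invented example: two rhythms peaking at different clock hours
# land on the same circadian time once period is accounted for.
ct_long_period = circadian_time(6.5, 26.0)
ct_24h_period = circadian_time(6.0, 24.0)
print(ct_long_period, ct_24h_period)
```

      With this rescaling, two modules whose eigengenes drift apart in real time because of a period difference can still be recognised as peaking at the same point of the subjective day.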

      Further, as described in methods comments, using a cutHeight as low as 0.15 will likely lead to some number of genes in any given module that do not necessarily "share" a similar expression pattern. These genes could have a pattern that has very low correlation to their module eigengene and were only placed in that module because the pattern was "less similar" to other module eigengenes. The current expression plots in this figure follow a clear pattern, but I suspect this would be even more apparent if the genes within these modules had a higher correlation to the module eigengene. Perhaps the current genes in these modules could just be filtered to have a higher correlation score?

      Thank you for your comments; we have now made changes to the Results and Methods to clarify our approach (Ln 237-239 and Ln 738-765). Merging modules with highly correlated module eigengenes (ME) is the final step in constructing our co-expression networks. To do this, as the reviewer describes, we used the WGCNA default mergeCutHeight parameter of 0.15. This results in the merging of modules with highly correlated MEs, as the mergeCutHeight of 0.15 refers to the dissimilarity metric of 1 minus the eigengene correlation. So for WGCNA, a mergeCutHeight of 0.15 corresponds to a correlation of 0.85. For the wheat modules, we took the additional step of merging closely related modules (mergeCloseModules()) using a cutHeight of 0.25, again a dissimilarity metric of 1 minus the eigengene correlation (corresponding to a correlation of 0.75). Reducing the stringency of the cutHeight to merge highly correlated wheat modules enabled us to more easily compare significantly correlated wheat and Arabidopsis co-expression modules, to identify groups of genes in wheat and Arabidopsis expressed at similar times in the day, and to assess whether similarly phased transcripts in wheat and Arabidopsis had similar biological roles.
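
      The relationship between cut height and eigengene correlation can be illustrated with a small sketch. It is a simplification: WGCNA clusters all module eigengenes hierarchically on the dissimilarity 1 minus correlation and cuts the tree, whereas the hypothetical helper below only checks a single pair. The two eigengenes are synthetic sine waves, not data from the paper.

```python
import numpy as np

def eigengene_dissimilarity(me_a, me_b):
    """Dissimilarity used for merging: 1 minus the Pearson correlation."""
    return 1.0 - np.corrcoef(me_a, me_b)[0, 1]


def would_merge(me_a, me_b, cut_height):
    """Pairwise simplification: merge when dissimilarity is below the cut height."""
    return eigengene_dissimilarity(me_a, me_b) < cut_height


# Synthetic eigengenes: 24 h sine waves sampled every 4 h, offset by 2.5 h.
t = np.arange(0, 72, 4)
me_a = np.sin(2 * np.pi * t / 24)
me_b = np.sin(2 * np.pi * (t - 2.5) / 24)

correlation = 1.0 - eigengene_dissimilarity(me_a, me_b)
print(round(correlation, 3))
print(would_merge(me_a, me_b, 0.15), would_merge(me_a, me_b, 0.25))
```

      A pair correlated at roughly 0.79 stays separate at a cut height of 0.15 (correlation threshold 0.85) but merges at 0.25 (threshold 0.75), which is the effect of the less stringent wheat setting.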

      Lines 327-334: I am not following the connection between 'response to abiotic stimulus' and the photoreceptor and light signaling proteins. At the start of this section (line 308) the authors say that the GO analysis was only done on rhythmically expressed genes but the reference to only one PHYA being rhythmic and yet multiple genes are shown in the plot in fig. S16. Does this mean that all the genes were shown and not just the rhythmic ones? This would explain why many of the PHY and CRY genes don't seem to have rhythms. This should be clarified better in the text or indicated in the plot which ones were called rhythmic. Since the first day following transfer is still the diel pattern from the entrainment condition, what does the PHY and CRY expression look like? Does it appear rhythmic under diel but lose rhythmicity in LL? It should be noted in the text that arrhythmicity in circadian conditions doesn't mean there isn't rhythmicity under diel conditions. This could be an additional explanation apart from the current one in the text that the regulation is at the level of protein stability/localization. Overall, this entire section is very long and entirely based on data shown in the supplemental material. I do appreciate having the individual gene plots that supplement Figure 4 and would suggest either providing a main figure to highlight a small subset of genes or pathways in this section or shorten it and focus on the results shown in the main figures.

      Upon reading the reviewer’s comment, we realize that we should have made our motivations and processes clearer within this section. We used the data filtered for rhythmicity to conduct the GO-enrichment analysis and then used that to identify processes which should be of interest for further investigation. We have now added an additional sentence (Ln 352-354) to explain this more clearly. We then considered the orthologs of well-known Arabidopsis gene networks and extracted their expression from our circadian dataset, whether rhythmic or not. Supplementary Table 10 contains all of the genes we investigated, their expression and their MetaCycle statistics. We have also indicated here which genes are plotted in which Supplementary Figure 18-20. The reasons for plotting non-rhythmic genes in some cases was that it illustrates the differences between circadian control in Arabidopsis versus wheat (as is the case for the PHY and CRY genes). We understand that it is useful to see at a glance which genes are classified as rhythmic or arrhythmic, so have now highlighted each row in Supplementary Table 10 to make this more intuitive, and added a read me tab.

      Regarding your point about oscillation under diel cycles, we agree that some transcripts will show rhythmic behaviour under entraining environments but not under constant conditions, and may perform time-of-day specific functions. However, these transcripts are likely to not be regulated by the circadian clock (at the transcriptional level) and so are not discussed in the context of a circadian transcriptome.

      For your interest, here is the full expression of PHY and CRY transcripts starting at ZT0:

      [Image]

      It is difficult to say for definite, but it seems likely that some of these photoreceptors will have rhythmic patterns of expression under diel cycles, but these rhythms do not endogenously persist under constant conditions.

      We appreciate your feedback that this section would benefit from cutting down of text and addition of a Figure to illustrate the text. We have now cut some of this section down and created a new main figure based on some of the oscillation plots from Supplementary Figure 18 and 19. We chose examples that reflect a conservation of relationships between transcripts of different peak phases, as we find it interesting that both species have similar patterns. (Main Figure 4, Ln 361--363, 382).

      1. Primary metabolism section: in terms of the supplemental figure, similar to the previous one I think it would declutter the plots if the genes that are not rhythmic were left out and simply indicate below the plot that they didn't meet the rhythmicity cutoff. This is another area where there is more discussion surrounding the supplemental figures than the main figure 4.

      One of the overall findings of this section was that many of the genes involved in Starch and T6P metabolism which are rhythmically expressed in Arabidopsis are not rhythmically expressed in wheat. We feel removing these genes from the results would detract from the importance of this finding. We have now edited Supplementary Table 10 to highlight which genes are classified as rhythmic. We have also added in a sentence to the start of this section which lays out our motivations for this analysis, summarises our findings and better connects the text with an explanation of Fig. 5 (Ln 408-430).

      For all gene expression figures there should be SD or SE shown either as bars or ribbons to represent the variation in replicates.

      Although we agree that error bars are informative for showing variation between replicates (and have added them to Fig. 1 to show differences within wheat triads) we feel that adding error bars to the gene expression plots in Fig. 3, Fig 4 and Supplementary Fig 19-20 would make these plots difficult to read, particularly where the wheat homoeologs are very similar. The purpose of these gene expression plots is to compare circadian profiles in Arabidopsis and wheat orthologs rather than to claim significant differences in expression at any particular timepoint. This is fairly common in other circadian biology studies:

      https://www.pnas.org/doi/10.1073/pnas.1408886111

      https://www.jbc.org/article/S0021-9258(17)49454-3/fulltext#seccestitle20

      https://journals.plos.org/plosone/article/comments?id=10.1371/journal.pone.0169923

      https://www.science.org/doi/10.1126/science.290.5499.2110?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%20%200pubmed

      https://www.frontiersin.org/articles/10.3389/fgene.2021.664334/full

      https://www.science.org/doi/full/10.1126/science.1161403

      The replication level information for each gene has now been made available in Supplementary file 12.

      1. It would be very helpful to include the code used to generate the networks and perform the cross-correlation of eigengenes across networks should be included in the Methods. This will also save you from responding to email requests!

      Thank you for your comment. Code for the cross-correlation analysis, Loom plots and WGCNA network construction is now available from our group's GitHub repository: https://github.com/AHallLab/circadian_transcriptome_regulation_paper_2022/tree/main

      Minor Comments

      1. Figure 1, panel d: - The "unbalanced" triads that are depicted by the lighter shading; do these in fact have a different cutoff than the original rhythmic homoeologs? In the figure it says q…

      Thank you for bringing this to our attention; this has now been corrected.

      Hard to directly compare the GO term overlap in Figure 2f. Might be better to only show the results for the 4 pairs shown in b-e and put them side by side in the bubble plot.

      Thank you for this feedback. We have tried to make this plot easier to understand without losing any of the available information. Hopefully it is now more intuitive to understand which columns are being compared. We have changed the coloured lines to make them slightly wider, put the modules in corresponding coloured boxes and highlighted GO-slim terms shared by modules being compared.

      1. Line 314 -316 don't see supp tables 10, 11

      Our apologies, these files were previously missed from the upload and are now available.

      1. For the selection of B. rapa circadian paralogs with similar and differential expression patterns (starting line 714), the authors choose a hard cut-off of 0.001 (differentially patterned) OR 0.1 (similarly patterned). What happens to the genes that are between these two cut-offs, or is this a typo? Since all the other cut-offs for rhythmicity were set at 0.01, it seems likely that this is a typo.

      We have now clarified this in the methods (Ln 807-822). This is not a typo, but it is a different method to the Metacycle approach we have used for our wheat data. We defined similar/different paralogs as characterized in Greenham et al. (2020) using DiPALM p-values. We chose these DiPALM p-value cut-offs as they gave us approximately equal numbers of paralogs in each category, which represent tails of similarly expressed or differently expressed circadian genes. We checked these cut-offs by calculating average Pearson’s correlation statistics between paralogs and found that differential Brassica paralogs had a mean Pearson correlation coefficient of 0.31 (SD = 0.43) and similar Brassica paralogs had a mean Pearson correlation of 0.75 (SD = 0.23), which confirms that the DiPALM method of defining expression patterns makes sense in the context of this analysis.
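
      A two-threshold classification of this kind can be sketched as follows. The direction of the cut-offs (p below 0.001 for "differential", p above 0.1 for "similar") is inferred from the reviewer's wording and should be treated as an assumption; the function name and example p-values are hypothetical, and DiPALM itself is not reproduced.

```python
def classify_paralog_pair(p_value, differential_cutoff=0.001, similar_cutoff=0.1):
    """Assign a paralog pair to one of the two tails, or to neither.

    Assumed direction: small p-values indicate differential patterns,
    large p-values indicate similar patterns.
    """
    if p_value < differential_cutoff:
        return "differential"
    if p_value > similar_cutoff:
        return "similar"
    return "unclassified"  # between the two cut-offs: in neither tail


# Invented p-values for three paralog pairs.
labels = [classify_paralog_pair(p) for p in (0.0005, 0.05, 0.5)]
print(labels)
```

      Under this reading, pairs falling between the two cut-offs belong to neither category, which matches the description of the two groups as tails of the distribution.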

      Line 681. Should be supplemental Figure 6 not 9.

      1. References to most supplemental figures are not the correct number.

      2. Labels above the plots in Supp Fig5 do not match the legend.

      We apologise for these mistakes. We realize that we had mistakenly submitted an earlier draft of the Supplementary materials file, which was missing Supplementary Figure 5, 6 and 9 which therefore shifted the order of the remaining figures. This is now updated.

      1. Suppl table 7 should be as a separate .csv file or similar to be able to see the full table.

      This is a good suggestion, and we have added this.

      1. Line 723 should be B. rapa not B. napus.

      Thank you for catching this! Corrected.

      1. Figure 4. There is no explanation for what the black boxes represent in the figure legend.

      Thank you for your comment. Figure 4 (new Figure 5) has now been updated.

      Reviewer #1 (Significance (Required)):

      This study provides new insight into the circadian regulation of the transcriptome in a new allopolyploid. It adds a valuable resource to a growing collection of circadian studies in important crops and will greatly improve our efforts to learn more about the circadian control of important crop improvement traits. The dataset will be of interest to other plant circadian biologists as well as the general plant biology community who focus on monocot crops. My expertise is more on the transcriptomic side and I do not have the expertise to evaluate the phylogenetic work presented in this study.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Rees et al. present an RNAseq time course of bread wheat. Its recent polyploidisation is one motivation for this study as gene expression dosage is known to be important for clock function in other plants. The time course covers 3 days at sampling intervals of 4h of 2-week-old wheat plants (all aerial tissues), in triplicate. The subsequent analysis of the RNAseq data includes analysis of the generated data by itself (e.g. GO analysis, rhythmicity, period and phase analysis, rhythmicity of transcription factor families as well as TF binding sites) as well as thorough comparison with published datasets of other species (Arabidopsis, Brassica rapa, Brachypodium distachyon). One of the key findings is that the mean period length and the period spread are larger in wheat than in these other species. Circadian clock genes largely have similar dynamics in wheat compared to Arabidopsis. In addition, one focus is the analysis of the dynamics of three genes of one triad and imbalance / balance of such triads. To the surprise of the authors, circadian regulated and clock genes were not necessarily balanced. Silencing is one of their explanations for imbalance of circadian genes as arrhythmic genes of one triad are typically those with the lowest expression level. Finally, the authors point out more examples of rhythmic processes and genes (photoreceptors and signalling, auxin, carbon metabolism) and their commonalities and differences with Arabidopsis.

      Major comments - The key conclusions and the data are convincing

      We thank the reviewer for their supportive comments.

      • line 120 and figure 1: In my opinion, q > 0.05 is not a good definition of arrhythmicity as non-significant q-values can result from either noise in spite of rhythmicity or from arrhythmicity. A more statistically sound way to detect arrhythmicity could for example be two one-sided tests (for example in the R package 'equivalence', e.g. see usage for time courses by Noordally et al. 2018, https://www.biorxiv.org/content/10.1101/287862v1).

      Thank you for pointing us in the direction of this package. We agree that choosing methods for circadian quantification and q-value cut-offs is always tricky and different approaches will perform better for noisier or non-sinusoidal waveforms. For future work, we will investigate the application of the suggested method in circadian rhythmicity analysis. However, we believe that the criteria used in this paper for rhythmicity quantification are suitable for addressing our questions, and overall, we are satisfied that rhythms with a q-value of >0.05 would also be classified by eye as being arrhythmic, and rhythms with a q-value … Many other studies have used meta2d B.H q-values as a metric of rhythmicity: e.g. (https://bmcplantbiol.biomedcentral.com/articles/10.1186/s12870-022-03565-1 , https://link.springer.com/content/pdf/10.1186%2Fs12915-022-01258-7 , https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8782462/pdf/pcbi.1009762.pdf )
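
      For readers unfamiliar with B.H q-values, the Benjamini-Hochberg adjustment can be sketched in a few lines. This is a generic textbook implementation with invented p-values, shown only to illustrate what the q-value represents; it is not the meta2d code, and the function name is hypothetical.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values)."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.minimum(scaled, 1.0)
    return q


# Invented p-values for four transcripts.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(q_values)
```

      A threshold applied to these q-values controls the expected proportion of false positives among the transcripts called rhythmic, which is a different statement from a transcript above the threshold being shown to be arrhythmic.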

      • lines 480-484 and intro: In the introduction, the authors write that expression levels of clock components are important for the function of the clock, and that this is one motivation for the current study where polyploidisation is expected to affect the expression levels of clock genes and their outputs. I wonder what answers or speculations this study provides in the end, or whether such answers / speculations should be made clearer. For example, do the authors think that the higher variability of periods in wheat could be a consequence of lower robustness (in addition to possible spatial differences that are mentioned) due to polyploidisation? Is anything known about the period of rhythms of close wheat relatives that did not undergo polyploidisation? Did you look at dampening over the time course in wheat vs. Arabidopsis?

      The point above is an interesting one, and we thank the reviewer for raising it. We agree that the high variability of periods in wheat may be a product of polyploidisation, as functional redundancy between homoeologs may allow a tolerance for less tightly regulated, non-dominantly expressed circadian transcripts. We have now added this hypothesis to our discussion: Ln536-550.

      In our comparative analysis of period distributions, we looked at periods of transcripts from a diploid relative of hexaploid wheat, Brachypodium distachyon. In Brachypodium, period lengths have around the same SD as in Arabidopsis but the mean period length is slightly longer (Supplementary Table 2). We have now edited our results to make the relationship between wheat and Brachypodium clearer (Ln 109-110).

      Minor comments:

      Introduction - lines 49: it is unclear what is meant by ppd-1 at this position of the sentence

      We agree this was unclear and have revised it to “notably the ppd-1 locus within TaPRR3/7” Ln 52

      • line 54/55: clarify that this refers to Arabidopsis thaliana

      Corrected.

      Results - line 69 and 76: cite references for these tools here (not only in the methods section)

      Corrected.

      • line 90-93: Why wouldn't the same thing happen on subsequent subjective evenings?

      Thank you for your comments. We have now combined our hypothesis for why wheat transcripts expressed at dusk have longer periods with the discussion about longer periods upon transfer to constant light. We think that the two processes are likely to be connected and have now placed them together in Ln 126-131.

      The behaviour of mean period lengths of wheat transcripts upon transfer to constant light was unexpected and, we believe, quite interesting. One explanation is that the ongoing light zeitgeber, present when dusk was expected, delays the expression of evening-peaking genes. Then, on subsequent evenings, the influence of the diel dusk signal is ‘forgotten’ as the governance of the endogenous clock takes over. The very long period observed at 0-24h (28.61h) may be due to a phase shift rather than an intrinsic lengthening of period per se. Whether this trait is unique to wheat or can also be seen in other plant species is, to our knowledge, unknown.

      • line 118: what is your defined cutoff for significance of the Chi square test (p=0.03 not regarded significant?)

      The reviewer is completely right, we have now clarified this. Ln 145-149

      • figure 1h,i: In order for the reader to see whether A and D (Figure 1h) or A (figure 1i) are indeed arrhythmic, one would need to see plots with a normalisation as done in figure 1m for 1l.

      We have now removed the triad showing one rhythmic gene and two arrhythmic genes (as Fig. 1h already illustrates this type of circadian imbalance) and replaced this with a side-by-side comparison of how imbalance in rhythmicity differs from imbalance in relative amplitude, as suggested.

      • figure 1h-m (and others with circadian time course traces): could a measure of variation (e.g. SD, SEM, confidence interval) be plotted as a shaded region around the curves (unless they're so small that they are there but not visible)?

      We have now added error bars to these plots to show standard deviation between replicates, in Fig. 1 h, j, k and l. We could not think of an accurate way to display this information for the mean normalised data (Fig 1. i and m) so have not put error bars on these plots.

      • line 139 (also in 737 and 450): give reference to Ramirez-Gonzalez et al in the same style as the rest of the manuscript (number)

      Thank you for raising this, we believe we have corrected all in-text citations (both narrative and fully parenthetical form) for consistency with the APA format used by the majority of Review Commons affiliate journals.

      • Clustering (modules): What is the reason for choosing 9 clusters? Was this number optimised or chosen for other reasons?

      WGCNA uses an unsupervised clustering algorithm that works within the supplied parameters to determine the optimum number of clusters to explain the dataset, without prior specification of the number of clusters. We have amended the manuscript text to clarify this Ln237-239.

      • lines 280 - 284: The TaELF3-1D phenotype could be explained a bit better to the non-wheat specialist, for example by mentioning in the beginning of this set of sentences.

      Done (Ln 314-318).

      • The authors present an analysis of TF binding sites. Can they say something about binding sites in a less sophisticated manner, such as on some very well-known motifs in promoters like the evening element?

      We agree that this is a very interesting question, and one that we may investigate in more detail with our data in the future. In this paper, we performed a global analysis of wheat TFBS predicted from orthologous Arabidopsis TF targets. These targets have been experimentally validated in Arabidopsis using DAP-seq, but we have not validated that these binding sites exist in wheat promoters. We therefore took a tentative approach, and presented only enrichments at the superfamily level rather than talking about specific regulatory motifs.

      The evening element would most likely fit within the MYB or MYB-related TFBS superfamily; however, the diversity of transcription factors in this family means that there is significant enrichment of these TFBS in multiple modules throughout the day (Supplementary Figure 11). In summary, a more in-depth TFBS analysis of known circadian motifs is of great interest, but we feel it would be a substantial work in its own right.

      • Figure 1h-l: If known or meaningful, it would be interesting to know the gene identities behind the triads shown, as in supplementary figure 5.

      These triads were selected as case studies to exemplify the ways in which we were defining imbalanced circadian triads. They have no particular relevance to the figure, but out of curiosity, these are the closest Arabidopsis orthologs for the triads displayed in Fig. 1:

      Triad 408 has highest identity to a hypothetical protein (AT4G26415).

      Triad 2454 is similar to AT3G07600, a heavy metal transport/detoxification superfamily protein.

      Triad 13405 is similar to AT3G22360, encoding an ALTERNATIVE OXIDASE 1B, AOX1B.

      Triad 10854 is similar to NSE4A, a δ-kleisin component of the SMC5/6 complex, possibly involved in synaptonemal complex formation (AT1G51130).

      Information about wheat gene names in each triad and their Arabidopsis orthologs can be viewed in Supplementary Table 11, so that readers can search for genes of particular interest to them.

      • Figure 4 and text: The illustration of starch metabolism is very helpful. However, I think the paper would benefit from giving a better reason for the selection of this specific set of processes, for example by relating these findings to functional differences in starch metabolism in the two species (in contrast to Arabidopsis, wheat stores little starch in leaves but uses fructans as main reserve carbohydrate)? Are there known differences in the dynamics of starch degradation during the night?

      The reviewer raises an interesting point, and we have now clarified in our results that the stated differences between starch regulation in Arabidopsis and wheat were part of the motivation behind studying this pathway. Starch is at the centre of plant primary metabolism as a carbon storage source and is arguably one of the most important features that breeders look for in regard to grain filling and yields. Additionally, it is of interest to circadian biologists as starch (as well as sucrose) has been shown to transiently cycle and to be regulated by the circadian clock. However, in wheat, carbon storage primarily uses sucrose rather than starch, and we have now added sucrose to Figure 5 to place it in this context. We think your suggestion has now improved our explanation for why we focused on starch in the manuscript, and we are grateful for your input (Ln 408-421).

      We also agree that the differences in the ways that Arabidopsis and wheat utilise starch versus sucrose, and perhaps the role that fructans have as a reserve carbohydrate and in protection against freezing in wheat, may be one of the reasons we are seeing differences in circadian regulation of starch. We have now added this to our discussion (Ln 584-592).

      • Figure 4: triose-phosphates can be transported in and out of the chloroplast, as is illustrated in the figure. However, the illustration looks as though they are converted to hexose phosphates during the transport process. In order to be consistent with other transport processes of the figure (maltose and glucose), triose-phosphate should be repeated on the cytosolic side.

      We have now amended this (new Fig. 5). Thank you for your feedback.

      Methods - line 543: if I understand correctly that triplicates were collected and analysed for each time point, '18 samples' is mis-leading (18 time points would be more accurate).

      We agree this was badly worded. Changed Ln 615.

      Supplementary - Supplementary figure 3: x axis label very small and contains typo

      Now corrected. Also enlarged axis for Supplementary Figure 2.

      • Supplementary table 1: Romanowski et al 2020 (add year), or use ref. number citation style as in the rest of the manuscript

      Thank you for raising this. We believe we have now corrected all in-text citations (both narrative and fully parenthetical form) to be consistent with the APA format used by the majority of Review Commons affiliate journals.

      • Supplementary table 9, primary metabolism: does bold highlighting of Arabidopsis accession numbers have a meaning or is it accidental?

      We apologise that this was unclear. We have corrected this. Supplementary Table 10 now also has a “Read me” tab which explains that table.

      Reviewer #2 (Significance (Required)):

      I believe this is a precious, carefully generated and analysed dataset which many biologists will benefit from, beyond wheat or circadian specialists. The dataset expands the knowledge of circadian transcriptome regulation to an important crop and contributes a resource of which only a handful of others exist in other species. Many high impact papers on RNAseq include some follow-up on candidates, for example in Romanowski et al 2020, which is admittedly easier to do in Arabidopsis than wheat due to the availability of genetic resources.

      My expertise: Plant circadian clock (Arabidopsis), dataset analysis (but not specifically for RNAseq)

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript is based on the analysis of a single experiment consisting in transcriptomic profiling of one (hexaploid) wheat genotype along 3 days (samples taken every 4 hours). The experiment is performed in constant light conditions, allowing detection of transcripts controlled by the circadian clock. The bioinformatic analysis studies the dynamics of the different homoeologous transcript in the polyploid genome and compares cycling transcripts in wheat with what is known from Arabidopsis.

      The manuscript is well written, the methods are correct, the analysis performed is sufficiently extensive and the figures are clear. The manuscript finds interesting expression patterns among homeologous genes, and goes into detail on important differences in circadian regulation of relevant gene families between Arabidopsis and wheat. The work is purely descriptive and does not aim at associations with physiological phenotypes, but the bioinformatic analysis is very thorough and uncovers interesting examples.

      Only one caveat: From what I gather, there is no replication in the RNA-seq experiment, although the exact method does not appear in the text. From the Methods section: "tissue was sampled every 4h for 3 days (18 samples in total)" and "At each timepoint, we sampled the entire aerial tissue from 3 replicate plants". Whether these samples were pooled or not is not described. The "Data Availability" section links to 18 RNA-seq paired end libraries, which suggests that the replicates were pooled, although some type of barcoding might have been used. The text should mention if the replicates were pooled or not, and, if so, what was the method used for pooling (tissue, RNA or libraries). Even in the case of no biological replication the manuscript brings interesting insights into wheat transcriptomics and circadian biology. The editor (or the rules of the journal) should decide if they accept articles with no "real" biological replication (I am sure we all understand by now the benefits and limitations of pooling biological replicates into a single RNA-seq library).

      There was replication within the RNA sequencing experiment, and we apologise that this was unclear from our manuscript. Each timepoint consisted of three independent biological replicates. We have now created a new “Experimental context” section in the results to explain this (Ln 74-82) and have clarified in the methods how our data was processed (Ln 609-615 and 636-638).

      We have now included an additional matrix with TPMs at the replicate level to assist readers in looking at specific genes of interest (Supplementary Table 12).

      Minor comments:

      The description of the experimental setup in the first sentence of the Results section is too brief. Could you please talk about for how long the experiment was running? At what intervals the samples were taken? What conditions were used?

      We apologise that this was unclear. We hope that the new Experimental Context section, added in response to comments from several reviewers, makes this much clearer, alongside the clarification in the methods (Ln 609-615 and 636-638).

      Line 280: "...due *to* an introgression..."

      Corrected. Ln 315

      The legend of Figure 3l says elf4 instead of elf3

      We thank the reviewer for noticing this mistake that we have now corrected.

      Line 306 "says Supplementary Note 7 instead of Supplementary Note 7

      We are not sure what is to be corrected here!

      Reviewer #3 (Significance (Required)):

      This work advances our knowledge on how genome-wide expression levels are controlled by the circadian clock in polyploids. Although previous works had performed similar analyses in other polyploid plants, this is the first time this is done in a hexaploid. This work is a starting step to understand gene regulation in this important crop, and is of interest for researchers working in fundamental and applied plant biology.

      Thank you for your positive comments and your feedback in improving this manuscript. We would like to clarify that, to our knowledge, this work presents the first analysis of a circadian transcriptome in a polyploid crop. The work by Greenham et al, although undoubtedly providing insight into circadian regulation of ancient paralogs, was performed in the diploid Brassica rapa.

    1. • About 99% of the time, the right time is right now.
       • No one is as impressed with your possessions as you are.
       • Don’t ever work for someone you don’t want to become.
       • Cultivate 12 people who love you, because they are worth more than 12 million people who like you.
       • Don’t keep making the same mistakes; try to make new mistakes.
       • If you stop to listen to a musician or street performer for more than a minute, you owe them a dollar.
       • Anything you say before the word “but” does not count.
       • When you forgive others, they may not notice, but you will heal. Forgiveness is not something we do for others; it is a gift to ourselves.
       • Courtesy costs nothing. Lower the toilet seat after use. Let the people in the elevator exit before you enter. Return shopping carts to their designated areas. When you borrow something, return it better shape (filled up, cleaned) than when you got it.
       • Whenever there is an argument between two sides, find the third side.
       • Efficiency is highly overrated; Goofing off is highly underrated. Regularly scheduled sabbaths, sabbaticals, vacations, breaks, aimless walks and time off are essential for top performance of any kind. The best work ethic requires a good rest ethic.
       • When you lead, your real job is to create more leaders, not more followers.
       • Criticize in private, praise in public.
       • Life lessons will be presented to you in the order they are needed. Everything you need to master the lesson is within you. Once you have truly learned a lesson, you will be presented with the next one. If you are alive, that means you still have lessons to learn.
       • It is the duty of a student to get everything out of a teacher, and the duty of a teacher to get everything out of a student.
       • If winning becomes too important in a game, change the rules to make it more fun. Changing rules can become the new game.
       • Ask funders for money, and they’ll give you advice; but ask for advice and they’ll give you money.
       • Productivity is often a distraction. Don’t aim for better ways to get through your tasks as quickly as possible, rather aim for better tasks that you never want to stop doing.
       • Immediately pay what you owe to vendors, workers, contractors. They will go out of their way to work with you first next time.
       • The biggest lie we tell ourselves is “I don’t need to write this down because I will remember it.”
       • Your growth as a conscious being is measured by the number of uncomfortable conversations you are willing to have.
       • Speak confidently as if you are right, but listen carefully as if you are wrong.
       • Handy measure: the distance between your fingertips of your outstretched arms at shoulder level is your height.
       • The consistency of your endeavors (exercise, companionship, work) is more important than the quantity. Nothing beats small things done every day, which is way more important than what you do occasionally.
       • Making art is not selfish; it’s for the rest of us. If you don’t do your thing, you are cheating us.
       • Never ask a woman if she is pregnant. Let her tell you if she is.
       • Three things you need: The ability to not give up something till it works, the ability to give up something that does not work, and the trust in other people to help you distinguish between the two.
       • When public speaking, pause frequently. Pause before you say something in a new way, pause after you have said something you believe is important, and pause as a relief to let listeners absorb details.
       • There is no such thing as being “on time.” You are either late or you are early. Your choice.
       • Ask anyone you admire: Their lucky breaks happened on a detour from their main goal. So embrace detours. Life is not a straight line for anyone.
       • The best way to get a correct answer on the internet is to post an obviously wrong answer and wait for someone to correct you.
       • You’ll get 10x better results by elevating good behavior rather than punishing bad behavior, especially in children and animals.
       • Spend as much time crafting the subject line of an email as the message itself because the subject line is often the only thing people read.
       • Don’t wait for the storm to pass; dance in the rain.
       • When checking references for a job applicant, employers may be reluctant or prohibited from saying anything negative, so leave or send a message that says, “Get back to me if you highly recommend this applicant as super great.” If they don’t reply take that as a negative.
       • Use a password manager: Safer, easier, better.
       • Half the skill of being educated is learning what you can ignore.
       • The advantage of a ridiculously ambitious goal is that it sets the bar very high so even in failure it may be a success measured by the ordinary.
       • A great way to understand yourself is to seriously reflect on everything you find irritating in others.
       • Keep all your things visible in a hotel room, not in drawers, and all gathered into one spot. That way you’ll never leave anything behind. If you need to have something like a charger off to the side, place a couple of other large items next to it, because you are less likely to leave 3 items behind than just one.
       • Denying or deflecting a compliment is rude. Accept it with thanks, even if you believe it is not deserved.
       • Always read the plaque next to the monument.
       • When you have some success, the feeling of being an imposter can be real. Who am I fooling? But when you create things that only you — with your unique talents and experience — can do, then you are absolutely not an imposter. You are the ordained. It is your duty to work on things that only you can do.
       • What you do on your bad days matters more than what you do on your good days.
       • Make stuff that is good for people to have.
       • When you open paint, even a tiny bit, it will always find its way to your clothes no matter how careful you are. Dress accordingly.
       • To keep young kids behaving on a car road trip, have a bag of their favorite candy and throw a piece out the window each time they misbehave.
       • You cannot get smart people to work extremely hard just for money.
       • When you don’t know how much to pay someone for a particular task, ask them “what would be fair” and their answer usually is.
       • 90% of everything is crap. If you think you don’t like opera, romance novels, TikTok, country music, vegan food, NFTs, keep trying to see if you can find the 10% that is not crap.
       • You will be judged on how well you treat those who can do nothing for you.
       • We tend to overestimate what we can do in a day, and underestimate what we can achieve in a decade. Miraculous things can be accomplished if you give it ten years. A long game will compound small gains to overcome even big mistakes.
       • Thank a teacher who changed your life.
       • You can’t reason someone out of a notion that they didn’t reason themselves into.
       • Your best job will be one that you were unqualified for because it stretches you. In fact only apply to jobs you are unqualified for.
       • Buy used books. They have the same words as the new ones. Also libraries.
       • You can be whatever you want, so be the person who ends meetings early.
       • A wise man said, “Before you speak, let your words pass through three gates. At the first gate, ask yourself, “Is it true?” At the second gate ask, “Is it necessary?” At the third gate ask, “Is it kind?”
       • Take the stairs.
       • What you actually pay for something is at least twice the listed price because of the energy, time, money needed to set it up, learn, maintain, repair, and dispose of at the end. Not all prices appear on labels. Actual costs are 2x listed prices.
       • When you arrive at your room in a hotel, locate the emergency exits. It only takes a minute.
       • The only productive way to answer “what should I do now?” is to first tackle the question of “who should I become?”
       • Average returns sustained over an above-average period of time yield extraordinary results. Buy and hold.
       • It’s thrilling to be extremely polite to rude strangers.
       • It’s possible that a not-so smart person, who can communicate well, can do much better than a super smart person who can’t communicate well. That is good news because it is much easier to improve your communication skills than your intelligence.
       • Getting cheated occasionally is the small price for trusting the best of everyone, because when you trust the best in others, they generally treat you best.
       • Art is whatever you can get away with.
       • For the best results with your children, spend only half the money you think you should, but double the time with them.
       • Purchase the most recent tourist guidebook to your home town or region. You’ll learn a lot by playing the tourist once a year.
       • Don’t wait in line to eat something famous. It is rarely worth the wait.
       • To rapidly reveal the true character of a person you just met, move them onto an abysmally slow internet connection. Observe.
       • Prescription for popular success: do something strange. Make a habit of your weird.
       • Be a pro. Back up your back up. Have at least one physical backup and one backup in the cloud. Have more than one of each. How much would you pay to retrieve all your data, photos, notes, if you lost them? Backups are cheap compared to regrets.
       • Don’t believe everything you think you believe.
       • To signal an emergency, use the rule of three; 3 shouts, 3 horn blasts, or 3 whistles.
       • At a restaurant do you order what you know is great, or do you try something new? Do you make what you know will sell or try something new? Do you keep dating new folks or try to commit to someone you already met? The optimal balance for exploring new things vs exploiting them once found is: 1/3. Spend 1/3 of your time on exploring and 2/3 time on deepening. It is harder to devote time to exploring as you age because it seems unproductive, but aim for 1/3.
       • Actual great opportunities do not have “Great Opportunities” in the subject line.
       • When introduced to someone make eye contact and count to 4. You’ll both remember each other.
       • Take note if you find yourself wondering “Where is my good knife? Or, where is my good pen?” That means you have bad ones. Get rid of those.
       • When you are stuck, explain your problem to others. Often simply laying out a problem will present a solution. Make “explaining the problem” part of your troubleshooting process.
       • When buying a garden hose, an extension cord, or a ladder, get one substantially longer than you think you need. It’ll be the right size.
       • Don’t bother fighting the old; just build the new.
       • Your group can achieve great things way beyond your means simply by showing people that they are appreciated.
       • When someone tells you about the peak year of human history, the period of time when things were good before things went downhill, it will always be the years of when they were 10 years old — which is the peak of any human’s existence.
       • You are as big as the things that make you angry.
       • When speaking to an audience it’s better to fix your gaze on a few people than to “spray” your gaze across the room. Your eyes telegraph to others whether you really believe what you are saying.
       • Habit is far more dependable than inspiration. Make progress by making habits. Don’t focus on getting into shape. Focus on becoming the kind of person who never misses a workout.
       • When negotiating, don’t aim for a bigger piece of the pie; aim to create a bigger pie.
       • If you repeated what you did today 365 more times will you be where you want to be next year?
       • You see only 2% of another person, and they see only 2% of you. Attune yourselves to the hidden 98%.
       • Your time and space are limited. Remove, give away, throw out things in your life that don’t spark joy any longer in order to make room for those that do.
       • Our descendants will achieve things that will amaze us, yet a portion of what they will create could have been made with today’s materials and tools if we had had the imagination. Think bigger.
       • For a great payoff be especially curious about the things you are not interested in.
       • Focus on directions rather than destinations. Who knows their destiny? But maintain the right direction and you’ll arrive at where you want to go.
       • Every breakthrough is at first laughable and ridiculous. In fact if it did not start out laughable and ridiculous, it is not a breakthrough.
       • If you loan someone $20 and you never see them again because they are avoiding paying you back, that makes it worth $20.
       • Copying others is a good way to start. Copying yourself is a disappointing way to end.
       • The best time to negotiate your salary for a new job is the moment AFTER they say they want you, and not before. Then it becomes a game of chicken for each side to name an amount first, but it is to your advantage to get them to give a number before you do.
       • Rather than steering your life to avoid surprises, aim directly for them.
       • Don’t purchase extra insurance if you are renting a car with a credit card.
       • If your opinions on one subject can be predicted from your opinions on another, you may be in the grip of an ideology. When you truly think for yourself your conclusions will not be predictable.
       • Aim to die broke. Give to your beneficiaries before you die; it’s more fun and useful. Spend it all. Your last check should go to the funeral home and it should bounce.
       • The chief prevention against getting old is to remain astonished.

      So much wisdom and stuff to think about here.

    1. Author Response

      Reviewer #1 (Public Review):

      1) The connectivity patterns along the anterior-posterior hippocampal axis broadly follow an anterior-posterior cortical bias, such that posterior regions, e.g. the visual cortex, are preferentially connected to the hippocampal tail, and anterior regions, e.g. the temporal pole, are preferentially connected to the hippocampal head. The authors focus on the twenty regions with the highest connectivity profiles, which appears to capture the majority of all connections. However, some of the present structural connectivity patterns differ in interesting ways from previously described cortical networks reported in resting-state fMRI studies. Most notably, the medial PFC and orbitofrontal regions combined account for less than 1% of all connections in the present investigation (Table S1 & S2). This is an interesting contrast to functional investigations which tend to find that these regions cluster with the aHPC (e.g., Adnan et al. 2016 Brain Struct Func; Barnett et al. 2021 PLoS Biol; Robinson et al. 2016 NeuroImage). In contrast, the present DWI results suggesting preferential pHPC-medial parietal connectivity dovetail with those observed in fMRI studies. It seems important to discuss why these differences may arise: whether this is a differentiation between structural and functional networks, or whether this is due to a difference in methods.

      We thank Reviewer 1 for making this important point and agree that these observations are deserving of further expansion. We have now included additional text where we place the surprising observation of sparse connectivity between PFC regions and the hippocampus more firmly in the context of recent evidence and argue that these observations suggest a potential differentiation between structural and functional networks.

      We have included the following text in the discussion (pp. 16-17, lines 439-457):

      “While many of our observed anatomical connections dovetail nicely with known functional associations, patterns of anatomical connectivity strength did not always mirror well characterised functional associations between the hippocampus and cortical areas. For example, a surprising observation from our study was that only weak patterns of anatomical connectivity were observed between the hippocampus and the ventromedial prefrontal cortex (vmPFC) and other frontal cortical areas. This lies in contrast to well documented functional associations between these regions (46-48). Our observation, however, supports a growing body of evidence that direct anatomical connectivity between the hippocampus and areas of the PFC may be surprisingly sparse in the human brain. For example, Rosen and Halgren (49) recently reported that long range connections between the hippocampus and functionally related frontal cortical areas may constitute fewer than 10 axons/mm2 and more broadly observed that axon density between spatially distant but functionally associated brain areas may be much lower than previously thought. Our observation of sparse anatomical connectivity between the hippocampus and PFC mirrors this recent work and suggests a potential differentiation between structural and functional networks as they relate to the hippocampus. It remains possible, however, that methodological factors may contribute to these differences. We return to this point later in the discussion. A future dedicated study aimed at assessing whether the well characterised functional associations between the hippocampus and vmPFC are driven by sparse direct connections or primarily by intermediary structures is necessary to address this issue in an appropriate level of detail.”

      2) While the analytic pipeline is described in sufficient detail in the Methods, it is somewhat unclear to a non-DWI expert what the major methodological advance is over prior approaches. The authors refer to a tailored processing pipeline and 'an advance in the ability to map the anatomical connectivity' (p. 5), but it's not immediately clear what these entail. It would be useful to highlight the key methodological differences or advances in the Introduction to help with the interpretation of the similarities and differences with previous connectivity findings.

      We have now included a brief description in the Introduction highlighting the key methodological advances used in the current study.

      We have included the following text in the Introduction (pp. 4-5, lines 130-144):

      “In typical fibre-tracking studies, we cannot reliably ascertain where streamlines would naturally terminate, as they have been found to also display unrealistic terminations, such as in the middle of white matter or in cerebrospinal fluid (39). While methods have been proposed to ensure more meaningful terminations (40), for example, with terminations forced at the grey matter-white matter interface (gmwmi), this approach is still not appropriate for characterising terminations within complex structures like the hippocampus. A key methodological advance of our approach was to remove portions of the gmwmi inferior to the hippocampus (where white matter fibres are known to enter/leave the hippocampus). This allowed streamlines to permeate the hippocampus in a biologically plausible manner. Importantly, we combined this with a tailored processing pipeline that allowed us to follow the course of streamlines within the hippocampus and identify their ‘natural’ termination points. These simple but effective methodological advances allowed us to map the spatial distribution of streamline ‘endpoints’ within the hippocampus. We further combined this approach with state-of-the-art tractography methods that incorporate anatomical information (40) and assign weights to each streamline (41) to achieve quantitative connectivity results that more faithfully reflect the biological accuracy of the connection’s strength (39).”

      3) Related to the point above, it was a bit unclear to me how the present connections map onto canonical white matter tracts. In Fig. 4A, the tracts are shown for a single participant, but it would be helpful to map or quantify how many of the connections for a given hippocampal subregion are associated with a given tract to provide a link to prior work or clarify the approach. A fairly large body of prior research on hippocampal white matter connectivity has focused on the fornix, but it's a little difficult to align these prior findings with the connectivity density results in the current paper.

      We thank Reviewer 1 for this comment and agree this would be an interesting avenue to pursue. However, the reliable segmentation of white matter fibre bundles is currently an area of contention in the DWI community. This pervasive and problematic issue was highlighted in a recently published large multi-site study that revealed a high degree of variability in how white matter bundles are defined, even from the same set of whole-brain streamlines (Schilling et al., 2021, Neuroimage. Nov; 243:118502. https://pubmed.ncbi.nlm.nih.gov/34433094/). This means that, even if we were to choose a particular method to segment white matter bundles, our results would not be readily translatable to those reported in previous DWI studies. This significantly limits meaningful comparison and/or interpretation. Indeed, such an approach may paradoxically take away from the detailed characterisations we have achieved in the current study. As highlighted in that study, it is now paramount that consensus is reached in this field to define criteria to reliably and reproducibly define white matter fibre bundles. Once that is achieved, we plan to conduct a follow-up study to characterise this in more detail, with bundles that will be able to be reliably reproduced by others.

      4) Finally, on a more speculative note: based on the endpoint density maps, there seems to be a lot of overlap between the EDMs associated with different cortical regions (which makes sense given the subregion results). Does this effectively mean that the same endpoints may be equally connected with multiple different cortical regions? Part of the answer can be found in Fig. 3D showing the combined EDM for three different regions, but how spatially unique is each endpoint? This is likely not a feasible question to address analytically but it might be helpful to provide some more context for what these maps represent and how they might relate to differences across individuals.

      The primary aim of the current analysis was to characterise broad patterns of endpoint density captured by our averaged group level analysis. However, Reviewer 1 is astute in assuming that, although there is overlap in the group averaged endpoint density maps (EDMs) associated with different cortical areas, at the single participant level, there are both overlaps and spatial uniqueness in the location of individual endpoints. For example, while group level analysis revealed that area V1 and area V2 showed preferential connectivity with overlapping regions of the posterior medial hippocampus, when visualising individual endpoints associated with each of these areas at the single participant level, we can see that some endpoints overlap while others display spatially unique patterns (see image below). Although a more in-depth analysis of individual variability in these patterns was beyond the scope of this investigation (as noted on Page14; Lines 379-381), we agree with Reviewer 1 that this is an important point to note in the manuscript. We have, therefore, included additional text touching on this and have included a new Supplementary Figure (Page 42; also see below) to emphasise that, at the single participant level, different cortical areas display both overlapping and spatially unique endpoints within specific regions of the hippocampus (using areas V1 and V2 as an example).

      We have included the following text in the Results section (Page 14; Lines 370-379);

      “Finally, while we observed clear overlaps in the group averaged EDMs associated with specific cortical areas, a closer inspection of individual endpoints at the single participant level revealed that endpoints associated with different cortical areas displayed both overlapping and spatially unique characteristics within these areas of overlap. For example, at the group level, areas V1 and V2 showed preferential connectivity with overlapping regions of the posterior medial hippocampus (see Supplementary Figure S5) while, at the single participant level, individual endpoints associated with each of these areas display both overlapping and spatially unique patterns (see Supplementary Figure S6). This suggests that, while specific cortical areas display overlapping patterns of connectivity within specific regions of the hippocampus, subtle differences in how these cortical regions connect within these areas of overlap likely exist.”

      Reviewer #2 (Public Review):

      Dalton and colleagues present an interesting and timely manuscript on diffusion weighted imaging analysis of human hippocampal connectivity. The focus is on connectivity differences along the hippocampal long axis, which in principle would provide important insights into the neuroanatomical underpinnings of functional long axis differences in the human brain. In keeping with current models of long-axis organisation, connectivity profiles show both discrete areas of higher connectivity in long axis portions, as well as an anterior-to-posterior gradient of increasing connectivity. Endpoint density mapping provided a finer grained analysis, by allowing visualisation of the spatial distribution of hippocampal endpoint density associated with each cortical area. This is particularly interesting in terms of the medial-lateral distribution within the hippocampal head, body and tail. Specific areas map to precise hippocampal loci, and some hippocampal loci receive inputs from multiple cortical areas.

      This work is well-motivated, well-written and interesting. The authors have capitalised on existing data from the Human Connectome Project. I particularly like the way the authors try to link their findings to human histological data, and to previous NHP tracing results.

      Many thanks.

      1) There are some important surprises in the results, particularly the relatively strong connectivity between hippocampus and early visual areas (including V1) and low connectivity with areas highly relevant from functional perspectives, such as the medial prefrontal cortex (rank order by strength of connectivity 7th and 78th of all cortical structures, respectively). This raises a concern that the fibre tracking method may be joining hippocampal connections with other tracts. In particular, given the anatomical proximity of the lateral geniculate nucleus to the body and tail of the hippocampus, the reported V1 connectivity potentially reflects a fusion of tracked fibres with the optic radiation. In visualizing the putative posterior hippocampus-to-V1 projection (Figure 4B, turquoise), the tract does indeed resemble the optic radiation topography. Although care was taken to minimise the hippocampus mask 'spilling' into adjacent white matter, this was done with focus on the hippocampal inferior margin, whereas the different components of the optic radiation lie lateral and superior to the hippocampus.

      We agree with Reviewer 2 that our observations relating to area V1 could be the result of limitations inherent to current tracking methodology. Indeed, probabilistic tracking can result in tracks mistakenly ‘jumping’ between fibre bundles. Unfortunately, primarily due to limitations in image resolution, we do not believe that we can categorically rule this possibility out in the current dataset beyond the measures we have already taken in our analysis pipeline. We have now included additional text in the Discussion acknowledging and emphasising this possible limitation of our study.

      We have included the following text in the Discussion section (Page 25; Lines 694-699);

      “Also, we cannot rule out that some connections observed in the current study may result from limitations inherent to current probabilistic fibre-tracking methods whereby tracks can mistakenly ‘jump’ between fibre bundles (e.g. for connections between the posterior medial hippocampus and area V1 due to the proximity to the optic radiation), especially in “bottleneck” areas. Again, future work using higher resolution data may allow more targeted investigations necessary to confirm or refute the patterns we observed here.”

      Beyond the possibility of tracks jumping between fibre bundles, we feel it is important to emphasise that an integral part of our analysis was the close attention we paid to minimising ‘spillage’ of the entire hippocampus mask. It is not the case that we primarily focussed on inferior portions of the hippocampus as stated by Reviewer 2. Equal focus was paid to medial, lateral and superior portions of the mask which lie adjacent to visual thalamic nuclei, the optic radiation posteriorly and a number of other structures. We can see that our description relating to this lacked the necessary detail to convey this important point clearly and we apologise for the confusion. We have, therefore, included additional text in the Methods section clarifying this further.

      We have included the following text in the Methods section (Page 26; Lines 751-755);

      “We took particular care to ensure that all boundaries of the hippocampus mask (including inferior, superior, medial and lateral aspects) did not encroach into adjacent white or grey matter structures (e.g., amygdala, thalamic nuclei). This minimised the potential fusion of white matter tracts associated with other areas with our hippocampus mask.”

      These points notwithstanding, our results support recently observed structural and functional associations between the posterior hippocampus and early visual processing areas. We agree that these findings are potentially of great conceptual importance for how we think about the hippocampus and its connectivity with primary sensory cortices in the human brain and we have now included a brief comment relating to this in the Discussion.

      We have included the following text in the Discussion (Page 23-24; Lines 638-644);

      “However, this observation supports recent reports of similar patterns of anatomical connectivity as measured by DWI in the human brain (38) and functional associations between these areas (43, 60). Collectively, these findings are potentially of great conceptual importance for how we think about the hippocampus and its connectivity with early sensory cortices in the human brain and open new avenues to probe the degree to which these regions may interact to support visuospatial cognitive functions such as episodic memory, mental imagery and imagination.”

      2) A second concern pertains to the location of endpoint densities within the hippocampus from the cortical mantle. These are almost entirely in CA1/subiculum/presubiculum. It is, however, puzzling why, in Supp Figure 2, the hippocampal endpoints for entorhinal projections are really quite similar to what is observed for other cortical projections (e.g., those from area TF). One would expect more endpoint density in the superior portions of the hippocampal cross section in head and body, in keeping with DG/CA3 termination. I note that streamlines were permitted to move within the hippocampus, but the highest density of endpoints is still around the margins.

      We agree with Reviewer 2 that, in relation to the entorhinal cortex, we would expect to see more endpoint density in areas aligning with the dentate gyrus (DG) and CA3 regions of the hippocampus. We noted in the discussion that “Despite the high-quality HCP data used in this study, limitations in spatial resolution likely restrict our ability to track particularly convoluted white-matter pathways within the hippocampus and our results should be interpreted with this in mind”. We believe that this limitation applies to pathways between the entorhinal cortex and DG/CA3. We have now included additional text specifically noting that this limitation likely affects our ability to track streamlines as they relate to DG/CA3. A targeted investigation of this effect using higher resolution diffusion MRI data may help address this issue, and this will be the subject of future work.

      We have included the following text in the Discussion (Page 25; Lines 690-693);

      “Indeed, this may explain the surprising lack of endpoint density observed in the DG/CA4-CA3 regions of the hippocampus where we would expect to see high endpoint density associated with, for example, the entorhinal cortex which is known to project to these regions. Future dedicated studies using higher resolution data are needed to assess these pathways in greater detail.”

      3) On a related point, the use of "medial" and "lateral" hippocampus can be confusing. In the head, CA2/3 is medial to CA1, but so are subicular subareas, just that the latter are inferior.

      We agree that applying the terms ‘medial’ and ‘lateral’ to our three-dimensional representations can lead to some ambiguities and confusion. We have included a new description defining our use of these terms in the Results section.

      We have included the following text in the Results section (Page 10; Lines 268-273).

      “In relation to nomenclature, our use of the term ‘medial’ hippocampus refers to inferior portions of the hippocampus aligning with the distal subiculum, presubiculum and parasubiculum. Our use of the term ‘lateral’ hippocampus refers to inferior portions of the hippocampus aligning with the proximal subiculum and CA1. In instances that we refer to portions of the hippocampus that align with the DG or CA3/2 we state these regions explicitly by name”.

    1. Discussion, revision and decision


      Decision

      Verified with reservations: The content is scientifically sound, but has shortcomings that could be improved by further studies and/or minor revisions.

      Dr. Bañuelos: Verified manuscript

      Dr. Morris: Verified with reservations


      Revision

      Response to Reviewer 1 (Dr. Bañuelos)

      1. Most importantly, I would like to see an introduction that explains the authors’ general arguments about grading changes – including the trajectory of these changes at Dalhousie and why this arc contributes to our knowledge of the history of higher education more broadly. Then, the authors might continually remind us of the arc they present at the outset of their paper – especially when they are highlighting a piece of evidence that illustrates their central argument. To me, the quotes from students and faculty responding to grading changes are among the most interesting parts of the paper and placing these in additional context should make them shine even more brightly!

      Our Response: Thank you so much for your thoughtful review. We have added a larger new introduction section of the paper (paragraphs 1-5 in the latest draft are new) that outlines the general importance of the topic, the Canadian context, details on Dalhousie University, and our overall thesis statement (i.e., most decisions were to improve the external communication value of grades). Moreover, we have added three new student quotes from the Dalhousie Gazette to build a stronger picture of student reactions, and to build a better case for our overall thesis statement (i.e., that changes in grading were often to increase the external communication value of grades). Finally, throughout we have added some details on the overall funding trajectory for institutions in Canada that created some pressure to standardize grading. We think that these changes have improved the manuscript.

      2. I’d like to read a little more about Dalhousie itself – why it is either a remarkable or unremarkable place to study changes in grading policies. Is it representative of most Canadian universities and thus, a good example of how grading changes work in this national context? Is it unlike any other institution of higher education and thus, tells us something important about grades that we could not learn from other case studies? I don’t think this kind of description needs to be particularly long, but it should be a little more involved than the brief sentences the authors currently include (p.3, paragraph 1) and should explain the choice of this case.

      Our Response: This comment revealed that two additional pieces of context were needed for the introduction: (a) some national context for higher education policy in Canada and (b) some extended description of Dalhousie University when compared to other universities in Canada. To this end, two new paragraphs have been added to the paper (paragraphs 2 & 3 in the current draft).

      Jones (2014) notes that “Canada may have the most decentralized approach to higher education than any other developed country on the planet” (pg 20). With this in mind, any historical review of education policy is by necessity specific to province and institution – that is, the information can be placed in its context, but resists wide generalization to the country as a whole. In the newest draft, we tried to describe the national, provincial, and institutional context in some more detail in paragraphs 2 & 3.

      3. I’d also like to know more about the archival materials the authors used. The authors mention that they drew from “Senate minutes, university calendars, and student newspapers” (p. 3), but what kinds of conversations about grades did these materials include? At various points, the authors engage in “speculation” (e.g. p.4) about why a particular change occurred. This is just fine and, in fact, it’s good of the authors to remind us that they are not really sure why some of these shifts happened. But, they might go one step further and tell us why they have to speculate. Were explicit discussions of grading changes – including in inter- and intradepartmental letters and memos, reports, and other documents – not available in these archives? Why are these important discussions absent from the historical record?

      Our Response: We have added a new paragraph (paragraph 4) to the paper discussing the sources in some more detail. It is true that the verbatim discussions are frequently absent from the record, especially earlier in history – or if they exist, we have not found them! Instead, we frequently are reviewing meeting minutes or committee reports, which are summaries of discussions. As we now note in the paper, “Thus, the sources used showed what policy changes were implemented, when they were implemented, and a general sense of whether there was opposition to changes; however, there were notable gaps in faculty and student reactions to grade policy changes, as these reactions were frequently not written down and archived.”

      This gap was most apparent in the Senate minutes around the 1940s, where I (the first author) could not find any direct discussions of why changes were implemented. Under the 1937-1947 heading, we more clearly indicate that the rationale for the changes was absent from the Senate minutes during this period. I add some further speculation on why these records might be absent, based on summaries from Waite (1998b); specifically, the university president of the time often made unilateral decisions, circumventing Senate, which might account for why the changes are absent from the records.

      This will hopefully make the limitations of what can be learned from this approach more apparent.

      4. At various points, the authors make references to the outside world – for example, WWII (p. 5), the Veteran’s Rehabilitation Act (pp. 6-7), and British versus American grading schemas (p. 6). But, these references are brief and seem almost off-handed. I know space is limited, but putting these grading changes in their broader context might help make the case for why this study is interesting and important. Are the changes in the 1940s, for example, related to the ascendance of one national graduate education model over another (e.g. American versus British)? Are there any data on how many Canadian undergraduates enrolled in British versus American graduate programs over time? If so, I would share any information you might have on these broader trends.

      Our Response: To our knowledge, there isn’t any comparable report to what we’ve written here documenting the transition from British “divisions” to American “letter grades” in Canadian Universities, making our report novel in this regard. It might well be that a similar historical arc exists in many of the 223 public and private universities in Canada, but we don’t believe such data exists in any readily accessible way – excepting perhaps undergoing a similar deep dive into historical documents at each respective institution! So, we do not have the answer to your question: “Are there any data on how many Canadian undergraduates enrolled in British versus American graduate programs over time?” However, we did add one reference which provided a snapshot point of comparison in 1960, noting in the paper “Baldwin (1960) notes that the criteria for “High First Class” grades in the humanities was around 75-80% at Universities of Toronto, Alberta, and British Columbia in 1960, suggesting that Dalhousie’s system was similar to other research-intensive universities around this time.” That said, there are a few major national events related to the funding of universities in Canada that we have elaborated on in the text to address the spirit of your recommendation for describing the national context:

      a) In the “Late 1940s” section of the paper, we added: “Though Dalhousie had an unusually high proportion of veterans enrolled relative to other maritime universities during this period (Turner, 2011), the Veteran’s Rehabilitation Act was a turning point for large increases in enrollment and government funding Canada-wide, at least until the economic recession of the 1970s (Jones, 2014).”

      b) In the 1990s, there were major government cuts to funding, creating challenging financial times for the university. We discuss the funding pressures that likely contributed to standardization of grading during this time by saying the following in the 1980s-2000s section: “Starting in the 1980s-1990s there were major government cuts to university funding nation-wide, with the cuts becoming more severe in the 1990s (Jones, 2014; Higher Education Strategy Associates, 2021). Because of the nature of the funding formulas, cuts in Nova Scotia were especially deep. Beyond tuition increases, university administrators knew that obtaining external research grants, Canada Research Chairs, and scholarship funding was one of the few other ways for a university to balance budgets, so there was extra pressure to be competitive in these pools. […] The increased standardization was likely related to increased financial pressures at this time – standardization is an oft-employed tool to deal with ever-increasing class sizes with no additional resources.”

      c) In the 2010s section of the paper, we added context on how universities country-wide have become increasingly dependent on tuition fees for funding: “Following the 2008 recession, federal funding decreased again (Jones, 2014; Higher Education Strategy Associates, 2021); however, this time universities tended to balance budgets by increasing tuition and international student fees. This trend towards increased reliance on tuition for income is especially pronounced in Nova Scotia, which has the highest tuition rates in the country (Higher Education Strategy Associates, 2021). Thus, the university moved closer to a “consumer” model of education, so it makes sense that a driving force for standardization was student complaints.”

      5. This is a very nitpicky concern that doesn’t fit well elsewhere, so please take it with a grain of salt. I was surprised at the length of the reference list – it seemed quite short for a historical piece! I wonder, again, if more description of the archival material - including why you looked at these sources, in particular, and what was missing from the record – would help explain this and further convince the reader that you have all your bases covered.

      Our Response: In the introduction section, paragraph 4, we describe our sources in more detail including what is likely missing from the record and why we used them. Regarding the length of the reference list, we did add ~12 new references to the list in the course of making various revisions, which partially addresses your concern. Beyond this though, it’s worth noting that some of the sources are more extensive than they seem, even though they don’t take up much space in the reference list (e.g., there is one entry for course calendars, but this covers ~100 documents reviewed!). Moreover, there were many dead-ends in the archives that are not cited (e.g., reviewing 10 years of Senate minutes in the 1940s produced little of relevance), so the reference list is curated to only those sources where relevant materials were found.

      Reviewer response to revisions

      The new introduction to the piece addresses many of my previous questions about the authors’ general arguments, the Dalhousie context, and the source material. Thank you for addressing these! Reading this version, it is much clearer that the key argument is that standardized, centralized grading practices were “to improve the external communication value of the grades, rather than for pedagogical reasons” (p. 6). I also really enjoyed the added quotes from students in the Dalhousie Gazette.

      The authors’ response to Reviewer 2 really gave me a better sense of why they wrote this piece and also helped me to more clearly put my finger on what was troubling me in the first round. It still reads a little like a report for an internal audience – which is just fine and, in fact, can be extremely useful for historians of the future. But, as Reviewer 2 notes, this means it does not really seem like a piece of historical scholarship. I do worry that shaping it into this form would take an extensive revision and might not be in the spirit of what the authors intended to do.

      A different version of this article might start with this idea that grades were standardized for external audiences and in response to financial pressures. It would then develop a richer story behind the sudden importance of these external audiences and the nature (i.e. source, type) of financial pressures Dalhousie was facing. It would highlight the impact such changes had on students and their future careers/graduate experiences. It could then connect these trends to other similar changes for external audiences and the increasing interconnectedness of American, Canadian, and British systems through graduate education. It might even turn to sociological theories of organizational change and adaptation and make an argument for when (historically) similar forms of decoupling were likely to occur in the Canadian higher education system. Finally, it might connect these grading changes to current trends – including accusations of grade inflation and accepted best practices for measuring learning outcomes.

      But, it doesn’t seem that the authors necessarily want to do this, which I can understand and respect. I think there is enormous value in a piece of scholarship like this existing – both for internal audiences and for future historians. Indeed, imagine if every university had a detailed history of its grading policies like this available somewhere online! Comparing such practices across institutions would certainly tell us a lot about why grading currently looks the way it does.

      Decision changed

      Verified manuscript: The content is scientifically sound, only minor amendments (if any) are suggested.


      Response to Reviewer 2 (Dr. Morris)

      The authors dove headfirst into Dalhousie’s archives, unpacking the subtle shifts in grading policy. Their work seems to be comparable to archaeologists, digging deep beneath mountains of primary sources to find nuggets of clues into Dalhousie’s grading evolution. I particularly liked when the authors were able to link these changes to student voices, as seen in moments when they referenced student publications.

      Ultimately, I kept coming back to one main comment that I wrote in the margins: “So what?” I would humbly suggest that the authors reflect on why this history matters to them. Granted, they do this in the conclusion, where they touch on Schneider & Hutt’s argument that grades evolved to increasingly be a form of external communication with audiences beyond school communities. Sure. But I want more. I wanted to see a new insight that makes this microhistory of Dalhousie significant to the history of Canada or the history of education more generally.

      If the authors are so inclined, there might be several approaches to transform this manuscript. I would suggest the following. First, instead of tracing the entire history of grading at the institution, choose one moment of change that you think is the most important. Perhaps in the 1920s and the lack of transparency in grading, or the post-war shift toward American grading. Second, show me – don’t tell me – what Dalhousie was like at this moment. Paint a picture of the institution with details about student demographics, curriculum, educational goals, the broader town, etc. Make the community come alive. Show me what makes Dalhousie unique from other institutions of higher ed. Once you establish that picture, perhaps you could link the change in grading practices to subtle changes at the university community, thereby establishing a before and after snapshot. This will require considerable amounts of work, and the skills of a historian. You will have to find primary and secondary sources that go far beyond what you’ve relied on thus far.

      In the end, I found myself wanting the authors to humanize this manuscript, meaning I wanted them to show me that changes in grading practices have tangible effects on real-life human beings. A humanization of their research would mean going narrower and deeper; or, in other words, eliminating much of what they have documented.

      However, if that is too tall of an order, I would ask that the authors clarify for themselves who this manuscript is for. Is this a chronicling of facts for an internal audience of Dalhousie’s faculty, alumni, and students? Fine. But my guess is that even members of the Dalhousie community want to read something relatable.

      I am suggesting revisions, although not because of objective errors. History is more of an art, in my opinion. With that in mind, I would suggest that the authors paint a more vivid picture (metaphorically) of Dalhousie, showing me how one moment of change in grading practices impacted the lives of human beings.

      Our Response: Thank you very much for taking the time to read our paper and provide your thoughts and recommendations. It may be helpful to begin by describing why I (the first author) decided to write this paper. Ultimately, I wrote this paper to satisfy my own personal curiosity and to connect with other people at my own place of employment by exploring our shared history. At present, Dalhousie has a letter grading scheme with a standardized percentage conversion scheme that all instructors use. I wanted to know why this particular scheme was used, but I quickly realized that nobody at Dalhousie really knew how we ended up grading this way! There was an institutional memory gap, and a puzzle that was irresistible to me. So, I wrote this paper for the most basic of all academic reasons: Pure curiosity. I do very much recognize that the subject matter is very niche, perhaps too niche for a traditional journal outlet. Thus, my publishing plan is to self-publish a manuscript to the Education Resources Information Center (ERIC) database and a preprint server as a way of sharing my work with others who might be interested in what I found. Nonetheless, I believe in the importance and value of peer review, especially since I am writing in a field different than most of my scholarly work. That is why I chose PeerRef as a place to submit, so that I could undergo rigorous peer review to improve the work while still maintaining the niche subject matter and focus that drives my passion and curiosity for the project. Of course, if you feel the whole endeavor is so flawed that it precludes publication anywhere, then we can consider this a “rejection” and I will not make any further edits through PeerRef.

      The core of your critique suggested that I should write a fundamentally different paper on different subject matter. While I don’t necessarily disagree that the kind of paper you describe might have broader appeal, it would no longer answer the core research question I wanted an answer to: How has Dalhousie’s grading changed over time? So, I must decline to rewrite the paper to focus on a single timeframe as recommended. All this said, I did try my best to address the spirit of your various concerns to improve the quality of the manuscript. Below, I outline the major changes that we made to the manuscript along the lines you described, while maintaining our original vision for the structure and focus of the paper:

      a) Two new paragraphs (now paragraphs 1-2 of the revised manuscript) were added to explain the “so what” part of the question. Specifically, we describe why we think the subject matter might be of interest to others and summarize the general dearth of historical information on grading practices in Canada as a whole.

      b) Consistent with recommendations from the other reviewer, we now state a core argument (i.e., that most major grading changes were implemented to improve the external communication value of the grades) earlier in the introduction in paragraph 5 and describe how various pieces of evidence throughout the manuscript tie back to that core theme.

      c) In an attempt to “humanize” the manuscript more, we added more student quotes from the Dalhousie Gazette throughout the paper so that readers can get a better sense of how students thought about grading practices at various times throughout history. Specifically, three new quotes were added in the following sections: 1901-1936, late 1940s, 1950s-1970s. We also added this short note about the physical location where grades used to be posted: “Naturally, this physical location was dreaded by students, and was colloquially referred to as “The Morgue” (Anonymous Dalhousie Gazette Author, 1937).”

      d) Early in the paper, we describe why we chose Dalhousie and the potential audience of interest: “As employees of Dalhousie, we naturally chose this institution as a case study due to accessibility of records and because it has local, community-level interest. The audience was intended to be members of the Dalhousie community; however, it may also be a useful point of comparison for other institutions, should similar histories be written.”

      e) We have described some of the limitations of our sources in paragraph 4, which may explain why the manuscript takes the form it does – it has conformed to the information that is available!

      f) We have linked events at Dalhousie to the national context in some more detail, by detailing some national events related to the funding of universities in Canada. See our response to Reviewer 1, #4 above for more details on the specific changes.

      g) Consistent with your stylistic recommendations, we have changed various spots throughout the paper from the present tense (e.g., “is”) to the past tense (e.g., “was”), and were careful in our new additions to maintain the past tense, when appropriate. If there are any spots that we missed, let us know the page number / section, and we will make further changes, as necessary.

      h) We retained the first person in our writing – this may be discipline-specific, but in Psychology (the first author’s home discipline), first person is acceptable in academic writing. If you feel strongly about this, we can go through the manuscript and remove all instances of the first person, but we would prefer to keep it, if at all possible.

      Hopefully this helps address the spirit of your concerns, and I look forward to hearing your thoughts in the second round of reviews.

      Decision changed

      Verified with reservations: The content is scientifically sound, but has shortcomings that could be improved by further studies and/or minor revisions.

    1. Author Response

      Reviewer #1 (Public Review):

      We thank the reviewer for a very constructive evaluation of our work and for a fair summary of its main strengths. We have addressed her/his main concerns as follows:

      1) The experiments involve an invasive neurosurgical procedure used to perform hippocampal imaging, which removes the ipsilateral overlying somatosensory cortex, and it is not possible to evaluate from the data provided that this surgery does not disrupt network function, especially given the focus on movement-related activity patterns.

      We thank the reviewer for bringing up this important issue. Indeed, our experimental access to early hippocampal activity with 2-photon calcium imaging relies on a quite invasive procedure. However, the many control experiments we have performed indicate that early hippocampal dynamics were not significantly altered by the surgery. First, our extracellular electrophysiological recordings from a sample of 6 mice (ranging from P6 to P11, Figure 1- figure supplement 1C) show that the frequency of early sharp waves (eSW) was slightly but not significantly reduced in the ipsilateral hemisphere compared to the contralateral one. Of note, a similar “non-significant” decrease had been previously reported by another group (Graf et al 2021 Fig S6C). As suggested by the reviewer, we can speculate that this slight decrease may result from a reduction of the sensory feedback re-afference originating from the right limbs. Indeed, we observed that movements of the right limbs (contralateral to the window implant) elicited a slightly smaller response than those from the left limbs. This observation has been added to Figure 1 - Supplement 1E and described in the results (lines 128-134) and discussion (lines 314-320).

      We have performed additional control experiments using EMG nuchal electrodes in two pups aged P5 and P6. We observed that, an hour following the surgery (corresponding to the recovery time in our experimental procedure), the composition of the sleep-wake cycle (with 70 to 80 % of active sleep) was comparable to previous reports (Jouvet-Mounier, 1969, Fig 4). This quantification was added to Figure 1- figure supplement 1B (lines 82-86).

      2) State-dependent parameters are not adequately described, controlled, and examined quantitatively to ensure that data from similar behavioral states is being used for analysis across ages. Network activity from wakefulness, REM/active sleep and NREM/quiet sleep should not be presumed to be indistinguishable.

      We would like to point out that our analysis across ages focused on the population response following animal movements, and not across all behavioral states. That said, it is true that two types of movements can be distinguished, namely twitches and complex movements. To take this behavioral heterogeneity into account, we have now separately quantified the hippocampal activation following twitches (movements during active sleep) and complex movements (during wakefulness). We show in Figure 2 - figure supplement 1B that the hippocampal response to twitches and complex movements is similar across ages. Thus, even if the amount of time spent in each behavioral state changes over the developmental period that we have studied, we are confident that it does not impact the transition we have described in the relationship between animal movements and hippocampal activity. Additionally, we were able to combine 2p-imaging with nuchal EMG recordings in one P5 mouse pup and separately compute the PMTH for movements observed during REM or wakefulness (Figure 2 - figure supplement 1C). We show that CA1 hippocampal neurons were activated time-locked to movement in both behavioral states, with only the amplitude of the population response differing between wakefulness and REM. This point is now included in the results section (lines 148-152) and discussed (lines 324-327).

      3) Currently employed statistics are not rigorous, unified, or sensitive, and do not support all of the authors' claims. Data shown suggest potentially significant changes that have not been identified due to suboptimal statistical approach and/or underpowering.

      We obviously agree with this reviewer that rigorous statistics should be employed and can certify that the data analyzed in the submitted manuscript were carefully examined following that principle. We feel that his/her strong criticism regarding that point was not fully justified. In particular, we do not understand why statistical tests should be “unified” across different figures of the paper. Rather, statistical tests should be adapted to the sample size and distribution. Of course, the same tests were used for similar datasets. This revised manuscript now contains further description and justification of all the tests included in every figure panel.

      4) The authors use an artificial neural network approach to infer cell classification (pyramidal cell vs. interneuron). From the data provided, it is not possible to adequately evaluate whether these 'inferred' interneurons represent the same population as conventionally labeled interneurons.

      We thank the reviewer for this important remark and apologize for the lack of detailed description of our method to ‘infer’ interneurons. This method was previously published (Denis et al., 2020), and designed to identify interneurons from their calcium fluorescence signals in the absence of a reporter. Most importantly, this cell type classifier was trained and tested on a dataset in which interneurons were labeled using a reporter mouse line (GAD67-Cre). This dataset is included in this article. This means that all the ‘labeled’ interneurons included here were also used for the training and the test dataset. As for the activity classifier, the training and test datasets covered all the developmental ages used in the study. Thus, the previously published statistics (accuracy/sensitivity) of this classifier should apply well to the present analysis. This method is now described in better detail in the results (line 183) and methods parts (lines 616-619). We now also illustrate in the figures how this classifier can infer interneurons with 91% precision (the split of prediction vs ground truth in test data is reported from Denis et al.) and that these ‘inferred’ interneurons are activated with movement just as genetically ‘labeled’ interneurons are (Figure 3 - figure supplement 1B-E).

      5) Functional GABAergic activity is not assessed across development (only at P9-10), limiting mechanistic conclusions that can be drawn.

      We thank the reviewer for this comment that reveals some lack of clarity in the previous description of our experiments. Indeed, functional GABAergic activity was also assessed before P9; however, given that there are no GABAergic axons in the CA1 pyramidal layer at early stages (for both CCK cf. Morozov and Freund 2003, and prospective PV cells cf. Figure 4A,B), there is no signal to be measured either. We have now added a new figure (Figure 4 - figure supplement 1) to clarify this point. In agreement with our Syt2 longitudinal quantification, we show, using tdTomato expression in the Gad67cre driver mouse line, that GABAergic perisomatic innervation is only visible after P9. This also matches our attempted imaging experiments using axon-enriched GCaMP in mice before P9.

      6) The present analyses are almost exclusively focused on movement-related epochs, substantially limiting conclusions that can be drawn as to what neural dynamics are actually occurring during epochs that the authors propose comprise internal representations.

      We agree with this reviewer that our study focuses on movement-related episodes and that we are not assessing hippocampal representations, especially since the pups are recorded in conditions that minimize external environmental influences. Still, we observe that there is a switch in the distribution of spontaneous activity in CA1 after P9, with most activity occurring outside of the synchronous calcium events and detached from movement. The exact nature of this activity remains to be studied; however, it is most likely not evoked by extrinsic phasic inputs and rather represents local dynamics. We have now removed reference to “internal representations” or “internal models” in the two previous instances of use (in the abstract and discussion) and replaced them, when possible, with “self-referenced” representations, alluding to self-generated-movement-triggered activity.

      Reviewer #2 (Public Review):

      The study by Dard et al aims to uncover the post-natal emergence of mature network dynamics in the hippocampus, with a particular focus on how pyramidal cells and interneurons change their response to spontaneous limb movement. Several previous studies have investigated this topic using electrophysiology, but this study is the first to utilize 2-photon calcium imaging, enabling the recording of hundreds of individual neurons, and discrimination between pyramidal cell and interneuron activity. The aims of the study are of broad interest to all neuroscientists studying development (including neurodevelopmental disorders) and the basic science of network dynamics.

      The main conclusions of the study are that (1) in early life, most pyramidal cell activity occurs in bursts synchronized to spontaneous movement, (2) by P12, pyramidal cell activity is largely desynchronized from spontaneous movement, and indeed movement triggers an inhibition in the pyramidal network (approximately 2-4sec following movement), (3) unlike pyramidal cells, interneuron activity remains positively modulated by movement, throughout the period P1-P12, (4) the changes in pyramidal cell activity are achieved by means of increases in perisomatic inhibition, between P8 and P10.

      It should be noted that conclusion (1) and to some extent conclusion (2) have already been reported, by previous studies using electrophysiology (as clearly acknowledged by the authors).

      A principal strength of this manuscript is the extremely high quality of the data that the authors are able to use in support of (1) and (2), with very large numbers of neurons being analyzed to clearly delineate the relationship between neural activity and movement. The finding that pyramidal cells become inhibited following movement is novel, I believe. Furthermore, this study offers the first description of the development of interneuron activity, in this experimental context.

      The main weakness of the manuscript is that the authors cannot provide direct functional evidence for the conclusion (4). As shown by the analysis in support of conclusion (3), interneuron activity with respect to movement does not actually change during the developmental period being studied, making it prima facie unlikely that this is the cause of changes in pyramidal network responses to movement. To overcome this, the study describes the activity of GABA-ergic axon terminals in the pyramidal cell layer at P9-10, but it appears that due to technical problems this was not possible in younger animals. It, therefore, remains unknown if the functional inhibitory inputs to pyramidal cells are changing over the ages studied.

      We thank this reviewer for acknowledging the broad interest of the study, its novelty, and the high quality of our dataset. The main concern raised by this reviewer (lack of axonal activity experiments in younger pups) was in fact a misunderstanding of the experiments performed and we apologize for this lack of clarity. Reviewer #2 is correct in that the relationship between interneuron activity and movement does not change over the developmental period studied. However, we have only included GABAergic axonal imaging after P9, not due to a technical problem but rather because there are no GABAergic axons in the pyramidal layer before (we see GABAergic neurites only outside the layer). We have now dedicated a new supplementary figure (Figure 4 - figure supplement 1) to explain why we could not image GABAergic axons in the pyramidal cell layer at earlier developmental stages.

      The study does describe increases in the protein synaptotagmin-2, in the pyramidal cell layer, between P3 and P11, but in my opinion, this molecular evidence for increases in perisomatic inhibition does not match the (very high) standards of neuronal function/activity reported elsewhere in the manuscript.

      In the absence of parvalbumin expression in early development, synaptotagmin-2 has been described as the best marker of prospective PV boutons in the cortex (Someijer et al. 2012). This molecular marker has been used in other studies (Modol et al. Neuron 2020, Sigal et al. PNAS 2019). We respectfully disagree with this reviewer, and think that quantification from immunohistochemistry experiments is as high of a standard as functional imaging as it is the only way to describe the anatomical structure of active neuronal processes.

      Reviewer #3 (Public Review):

      Dard and colleagues use both in vivo calcium imaging and computational modelling to explore the relationship between the early movement of CA1 hippocampal activity in neonatal mice.

      The manuscript represents a significant technical advance in that the authors have pioneered the use of multiphoton imaging to record activity in the hippocampus of awake neonates. Overall the presentation of the data is convincing although I would recommend a number of tweaks to the figures and the inclusion of some raw data to better direct and inform non-expert readers. I also believe that the assessment of long-range inputs using pseudo-rabies virus should be present in the main body of the manuscript as opposed to supplemental material. The computational modeling supports their idea but does not exclude other possibilities. Further, it is not clear to what extent the strengthening of local excitatory input onto the interneurons - the dominant route of recurrent input in the hippocampus, is important; something that the authors acknowledge in the discussion.

      Overall, I believe the paper adds to our knowledge of the timeline of development and further identified the postnatal day (P)9-P10 window as important in emergent cortical processing. The fact that this is linked to an increase in GABAergic innervation has implications for our understanding of both normal and dysfunctional brain development.

      We thank the reviewer for his constructive comments and helpful suggestions. As suggested, this revised version now includes some raw data and better descriptions to guide non-expert readers. Regarding the inclusion of rabies-tracing experiments in the main part of the manuscript, we would like to state here that there are still a number of limitations with the use of this method during development (incubation time, spatial precision of the injection site, etc.) that limit the interpretation and quantification of the results. As a result, we have decided to remain only qualitative, focusing on identifying the brain regions that could send projections onto CA1 pyramidal cells and interneurons. We believe that this type of description is more suited to a supplementary figure than a principal figure, but will be happy to change this if the reviewer and editors think otherwise.

    1. Some students do as well in online courses as in in-person courses, some may actually do better, but, on average, students do worse in the online setting, and this is particularly true for students with weaker academic backgrounds.

      I think this statement is important because it shows that the argument is not as simple as, "Online courses are bad and in-person classes are good". It shows that, while plenty of students do just fine learning online, the online courses themselves lack a lot of the edge that an in-person course can give a student. This is an important observation because we can use this research to optimize the way we learn online moving forward!

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2021-01219

      Corresponding author(s): Rajan, Akhila

      1) General Statements [optional]

      The goal of this study is to:

      • Define how prolonged exposure to a high-sugar diet (HSD) regime alters both the lipid landscape and feeding behavior.
      • Determine how changes in lipid classes within the adipose tissue regulate feeding behavior.

      Key findings:

      In this study, by taking an unbiased systems level and genetic approach, we reveal that phospholipid status of the fat tissue controls global satiety sensing.

      Impact of Key findings:

      By uncovering a critical role for adipose tissue phospholipid balance as a key regulator of organismal feeding, our work raises the possibility that the rate-limiting enzymes in phospholipid synthesis, including Pect, are potential targets for therapeutic interventions for obesity and feeding disorders.

      Peer review comments:

      This study has immensely benefited from the thoughtful peer review of three reviewers. As per their recommendations, we have performed a major revision by performing additional experiments (see summary table below in next section) and strived to address the major concerns raised. Based on our reading, two major concerns were shared among the reviewers. They are as follows:

      • Does the genetic disruption of Pect in fly fat body alter phospholipid levels? Two reviewers (#2 and #3) recommended that we perform lipidomic analyses on adult flies with adipose tissue specific knockdown of Pect. For the revised version, we have completed this lipidomic experiment, and present the results as a new main Figure 6 and Supplemental S7 and S9.
      • Is the dampened HSD-induced hunger-driven feeding (HDF) behavior due to increased baseline feeding (#1 and #3)? In addition, reviewer #1 asked us whether HSD flies experience an energy deficit. In other words, we were asked to determine whether what we observed was HSD-driven allostasis or, as we had interpreted, that HSD dampened the hunger-driven feeding response.

      Hence, they recommended that we:

      1. Re-analyze our hunger-driven feeding datasets and present non-normalized data (also requested by Reviewer #3) and show baseline feeding behavior on HSD. To address this, we have completed this analysis and present our results in Figure 1B-D and S1.
      2. Determine whether the HSD-fed flies display an energy deficit on starvation. To this end, we assayed starvation-induced fat mobilization on HSD; the results are now presented in Figure 1E-G and S2.

      Conclusions after the revision:

      First, it is important to note here that the additional experiments have not caused a significant revision of the major conclusions of the original version of our study. In fact, we hope that the revised version provides clarity and further substantiation to our original arguments.

      • The lipidomics experiments on Pect fat-specific knock-down flies show that reducing Pect in fat-body causes a significant reduction in certain PE lipid species (PE 36.2 specifically; Figure 6B). This is consistent with a prior report on lipidomics of the Pect null allele by Tom Clandinin’s group (PMID: 30737130). Furthermore, we note that when Pect is knocked down in the fat body, there is a significant increase in two other classes of phospholipids, LPC and LPE (Figure 6A). Together, this suggests an imbalance in phospholipid composition in the absence of Pect activity in fat.
      • The starvation-induced fat mobilization experiments show that despite being fed a prolonged HSD, adult flies sense starvation and effectively mobilize fat stores, at a level comparable to Normal food (NF) fed adult flies, suggesting that even despite HSD exposure, adult flies experience an energy deficit on starvation.
      • In our non-normalized data, we find that the baseline feeding events are not significantly altered between HSD and NF-fed flies (Figure 1D). This suggests that the effects we observe are not due to an increase in the “denominator”, but a dampening of hunger-driven feeding on HSD.

      With regard to our original version, all three peer reviewers found that the study was interesting, significant, important, and novel. Reviewer #1: “The work is potentially novel and interesting”; #2: “I find the study to be potentially very important - the authors combine a longitudinal study that would be difficult in any other model with the powerful genetic tools available in the fly. The conclusions are mostly convincing”; #3: “This manuscript demonstrates how fat body Pect levels affect HSD induced changes in hunger-driven feeding response. I agree with all the reviewers points; potentially very interesting”. However, they requested that we provide further substantiation and clarification.

      We sincerely hope that the peer-reviewers find that our revised version with additional new experimental datasets, improved data visualization, and the presentation of non-normalized raw data points, makes this study clear, compelling, and well-substantiated.

      Point-by-point description of the revisions

      Below we summarize in Part A, the key experiments that were performed to address the major concerns. In Part B, we provide a point-point response to each reviewer with embedded datasets.

      Part A:

      We performed several new experiments, including:

      • To address the primary concern of Reviewer #1 regarding whether the HSD flies have a similar energy deficit to Normal food (NF) fed flies, we analyzed stored neutral fat triacylglycerol (TAG) reserves and how HSD-fed flies mobilized fat stores on starvation. We present these results in Figure 1E-G and S2. These results show that HSD-fed flies, despite accumulating more TAG (S2), break down a similar amount of fat reserves as NF-fed flies on starvation at any time point (Figure 1E-G). This suggests that HSD-fed flies do sense and respond to energy deficit.
      • To address concerns of reviewers #2 and #3 on whether Pect genetic manipulation affects specific phospholipid classes, we performed lipidomic analyses. The table below summarizes the 3 new figures and 4 supplemental figures (blue text marks all new figure numbers and figure panels) and three new Supplementary files as per the reviewers’ request.

      | Figure # | Main point | New datasets in revision | Companion Supplement |
      |---|---|---|---|
      | 1 | HSD alters feeding behavior, but flies still break down TAG on starvation. | TAG storage and breakdown over longitudinal HSD shows that HSD and NF fed flies show similar levels of TAG breakdown on starvation, despite consistently elevated TAG on HSD. This supports the idea that flies do sense starvation even on HSD, but there is an uncoupling of the feeding behavior after Day 14. Revised the data representation of Figure 1 to show non-normalized data over time. S1 and S2 companions are new in the revision. Panels 1D to 1E are new for the revision. | S1- Raw data of feeding events plotted. S2- Elevated TAG at all time points. |
      | 2 | HSD causes insulin resistance | S3A added to show that insulin transcript levels remain the same in response to reviewer #3’s concerns. | S3 |
      | 3 | Phospholipid concentration raw data from lipidomics on Day 7 and Day 14 HSD suggest that PC, PE levels are increased on Day 14 HSD. | Figure 3 revamped to show new data visualization and non-normalized raw data to address Reviewer #2’s major concerns. S4A and S4B added. In addition, Supplementary File 1 and 2 provided with raw lipidomics data as per reviewer #2’s request. | S4. S4A- non-normalized raw data of all other lipid classes on HSD. S4B- fatty acid species data on Day 14 added as per request of rev. #2. |
      | 4 | HSD regulates Apo-I levels in the IPCs and phenocopies Pect KD. | Added Figure 4A to show that HSD phenocopies Pect-KD in terms of delivery to brain | S5 showing the validation of the Apo-I antibody. S6 validation of Pect KD and over-expression and Pect mRNA levels dysregulation on HSD. |
      | 5 | Pect RNAi is insulin resistant | N/A | N/A |
      | 6 | Pect knockdown shows significant increase in LPC and LPE, and a non-significant reduction in PC, PE levels. Specifically, the PE lipid class PE36.2 is downregulated. | Fig 6, S7, S9 are completely new based on reviewer #2 and #3 requests. In addition, Supplementary File 3 provided with raw lipidomics data as per reviewer #2’s request | S7, S8, S9#. S7- new Pect KD other classes. S8- new PE classes for day 14 and Pect associated classes. S9- Pect OE lipidomics |
      | 7 | Pisd and Pect activity in adipocytes are required for hunger-driven feeding behavior in normal diets | Pisd RNAi data was moved from supplement to main figure. | N/A |

      Note on revised text: We have revised text not only in the results section, but also as per reviewer #2’s recommendation, we have revamped our introduction and discussion as well. Since the manuscript has been significantly revised to include a main figure 6, fully altered Figure 1 and 3, multiple new supplemental figures, the changes in text are extensive. Hence, they are unmarked in the main text. Nonetheless, we hope that the reviewers will be able to evaluate these changes, as we have provided the specific locations in text and embed key figures in the point-point response below.

      Part B: Point-by-point responses to reviewer comments.

      Reviewer #1 comments in Blue, author response in black.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Kelly et al. show that the difference between the feeding behavior of fed and starved flies (hunger-driven feeding; HDF) is absent in animals fed a high-sugar diet (HSD) for two weeks or more. The disappearance of HDF with HSD coincides with changes in phospholipid profiles caused by HSD. Furthermore, RNAi-mediated downregulation of Pect in the fat body-a key enzyme in the PE biosynthesis pathway-phenocopies physiological effects of HSD. Moreover, downregulation or overexpression of Pect in the fat body abolishes or induces HDF, respectively, independent of HSD treatment.

      Overall, the manuscript is well-written and the phenotypes are clear. However, I have major concerns regarding the authors' interpretation of the data and their conclusion. Most importantly, while it is clear that the authors' high-sugar dietary treatment affects feeding behavior and physiology, I am not convinced that the changes can be considered "hunger-driven"-which is central to the main point of the manuscript. Therefore, it is my recommendation that the authors substantially revise the manuscript by either showing additional/re-analyzed data that rule out alternative hypotheses, or rewriting the manuscript keeping alternative interpretations in mind.

      We are thankful to this reviewer for their thoughtful critique, and constructive and specific suggestions on how we can redress these concerns. We have taken on board the concerns of this reviewer regarding our interpretation of whether the changes in feeding behavior can be considered hunger-driven or not. Based on their advice, we have made significant changes by addressing: i) whether HSD increases baseline feeding: we now show non-normalized raw data, and the data support the conclusion that baseline feeding is not higher; ii) whether HSD-fed flies can sense an energy deficit at levels similar to NF-fed flies: we show that HSD flies sense energy deficit. We have provided a detailed response below, and we hope the reviewer finds the additional datasets and re-analyzed data consistent with the interpretation that prolonged HSD dampens starvation-induced feeding. In addition to this key concern, this reviewer has made many other salient points that we have addressed with additional data or by clarifying the text.

      Major comments:

      1) The data do not sufficiently show that the long-term HSD regime disrupts "hunger-sensing." The manuscript should address alternative hypotheses by showing raw instead of normalized data, rewriting the manuscript with a new central conclusion, or running additional experiments that actually show a defect in hunger-driven response.

      a. The main results that the authors rely on for the argument is that the ratio of feeding events that the starved and non-starved flies eat is different between the groups fed normal or HSD. However, because the authors only show normalized data (normalized to non-starved flies; Fig. 1), it is difficult to tell whether the change is due to a chronically increased feeding in non-starved HSD flies-maybe in perpetual hunger-like allostasis-or dampened starvation response. Indeed, the data shown in Fig S1 show that flies fed HSD for as short as 5 days show more frequent feeding events compared to age-matched controls fed normal food. It is possible that because the HSD-fed flies eat more than NF-fed flies, even without being starved, the ratio of starved/non-starved feeding is lower in the HSD-fed group-due to changes in the denominator, rather than the numerator.

      We have taken on board this concern that presenting only normalized data clouded the interpretation and left open other possibilities. In the completely revised Figure 1 and S1, we now show non-normalized data as a function of time. First, we note that HSD-fed flies do not show higher baseline feeding than NF-fed flies, except on Day 10 of HSD, when there is a modest but significant elevation (Figure 1D).

      Nonetheless, on Day 10 of HSD, flies still display increased hunger-driven feeding (HDF) (Figure 1C); it is only after Day 14 of HSD that HSD dampens the starvation-induced feeding.
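
      The denominator concern discussed above can be illustrated with a minimal sketch. All counts below are hypothetical and chosen only for illustration; they are not data from the study. The sketch shows why a normalized starved/non-starved ratio alone cannot distinguish raised baseline feeding from a dampened starvation response, and why the raw counts are needed.

```python
# Hypothetical feeding-event counts (illustrative only, NOT data from the study).
def normalized_response(starved: float, fed: float) -> float:
    """Starved feeding events divided by non-starved (baseline) feeding events."""
    return starved / fed

groups = {
    # Normal food: starvation doubles feeding events.
    "nf": {"fed": 100.0, "starved": 200.0},
    # Scenario A: baseline feeding is elevated, starved feeding is unchanged.
    "hsd_high_baseline": {"fed": 160.0, "starved": 200.0},
    # Scenario B: baseline feeding is unchanged, starved feeding is dampened.
    "hsd_dampened": {"fed": 100.0, "starved": 125.0},
}

ratios = {
    name: normalized_response(counts["starved"], counts["fed"])
    for name, counts in groups.items()
}

for name, ratio in ratios.items():
    counts = groups[name]
    print(f"{name}: fed={counts['fed']}, starved={counts['starved']}, ratio={ratio}")
```

      In this toy case both hypothetical HSD scenarios give the same normalized ratio, so only the non-normalized baseline values separate them.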

      b. It is also possible that the HSD-fed flies are simply not in as big an energy deficit physiologically, due to the increased fat deposits they've accumulated (as the authors show later in the manuscript). It may take longer for the fat HSD flies to reach substantial energy deficiency than the NF flies, but they still may eventually be able to appropriately respond to hunger, just like NF flies. In such case, it would be a misnomer to call this behavioral change a 'defect in hunger-driven feeding behavior.' Maybe an experiment with a dose-response curve of "hunger driven feeding response" as a function of duration of starvation would help?

      Prompted by this reviewer's question, we asked whether HSD-fed flies, which have a higher baseline neutral fat (triacylglycerol, TAG) store, can sense an energy deficit. For this, we revisited the longitudinal assays for neutral fat triacylglycerol (TAG) storage that our lab had generated, along with the HSD-HDF studies. We now present this evidence as Figure 1E-1G and Figure S2. Overall, our experiments point to the idea that adult flies fed HSD are able to sense an energy deficit and mobilize TAG stores effectively throughout the 28-day period that we analysed.

      First, as shown in Figure S2, flies fed HSD display an increase in TAG levels. Note, however, that while TAG stores increase, the increase is not linear with time. This suggests that adult flies exposed to HSD store excess energy as TAG, but the increased TAG stores stay within a certain range regardless of the length of HSD exposure. In other words, adult flies on HSD still display TAG homeostasis.

      Next, to directly address the reviewer's point about HSD-fed flies not sensing an energy deficit, we subjected HSD-fed flies to overnight starvation, the same regime used in the overnight feeding experiments, and asked whether they mobilize TAG. We noted that flies exposed to HSD break down TAG throughout the 28-day exposure, at statistically significant levels from Day 3 to Day 28 except on Days 14 and 21 (Figure 1F). While there is TAG mobilization on Days 14 and 21, the difference is not statistically significant. Nonetheless, we note the same levels of TAG breakdown for flies fed normal lab food (NF) on Days 14 and 21 (Figure 1E). Overall, HSD-fed flies sense and display an energy deficit, as measured by TAG store mobilization, throughout the 28 days of HSD exposure, at levels comparable to NF-fed flies (Figure 1G).

      Taken together, these results suggest that HSD-fed flies experience an energy deficit on starvation, at levels comparable to NF-fed flies, throughout the 28-day time course assayed. However, their starvation-driven feeding response is dampened by Day 14, and by Day 28 the HSD-fed flies display more feeding events than HSD-starved flies. These results are consistent with the interpretation that in HSD-fed flies the starvation-induced feeding behavior becomes desynchronized from the starvation-induced TAG mobilization, suggesting that there is an absence of hunger-driven feeding.

      2) How can you be sure that lower Dilp5 immunofluorescence is indicative of increased Dilp5 secretion? Wouldn't decreased production of dilp5 also have the same results?

      It has been shown previously that HSD-fed larvae are hyperinsulinemic, i.e., they have a 55% increase in circulating Dilp2 (PMID: 22567167). Additionally, we have shown that ectopic activation of the insulin-producing neurons by expressing TRPA1, an ion channel that activates neurons, reduces Dilp5 accumulation without a change in Dilp5 mRNA levels (PMID: 32976758), suggesting that reduced Dilp5 accumulation without alterations to mRNA levels is a proxy for increased secretion. In response to this concern, we have added qPCR data for Dilp2 and Dilp5 to the revised manuscript (Figure S3A), which show no difference in expression levels after 14 days on HSD. Therefore, there is no dip in Dilp5 mRNA production. Given that Dilp2 and Dilp5 mRNA levels remain the same but we see reduced Dilp5 accumulation, we interpret this to mean that Dilp5 secretion is increased.

      1. Also, the authors should state in the main text that it is Dilp5, not just any Dilp.

      Thanks for this suggestion. We have fixed this and now refer to Dilp5 specifically throughout the results section.

      3) Data presentation: a. Sometimes the data are normalized to NF (Fig 4B-C), sometimes not (ex. Fig 4A, S4C). Unless there is a specific rationale for the data transformation, it would be more appropriate to show untransformed data (ex. Fig 4A, S4C), especially as the authors use two-way ANOVA to determine significance. Only showing the differences implies comparison against a hypothetical mean (i.e. μ0=0), not between two group means.

      We thank the reviewers for bringing this issue to our attention. We updated all the figures to show untransformed data in the revised manuscript.
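
      The reviewer's statistical point can be illustrated with a minimal sketch. The values below are hypothetical and invented for illustration only, and the helper names `one_sample_t` and `welch_t` are assumptions of this sketch rather than the analysis used in the manuscript. Expressing one group as differences from the other group's mean and testing against zero treats that mean as a fixed constant, so the control group's variability drops out of the test statistic.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(values, mu0=0.0):
    """t statistic for testing a sample mean against a fixed value mu0."""
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / sqrt(n))

def welch_t(a, b):
    """Welch t statistic for comparing the means of two independent groups."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / sqrt(var_a + var_b)

# Hypothetical measurements, NOT data from the study.
nf = [10.0, 12.0, 11.0, 13.0]
hsd = [12.0, 15.0, 11.0, 16.0]

# "Normalized" presentation: HSD values as differences from the NF mean.
diffs = [h - mean(nf) for h in hsd]

t_vs_zero = one_sample_t(diffs)   # ignores the spread of the NF group
t_two_group = welch_t(hsd, nf)    # accounts for the spread of both groups

# The numerators are identical, but the one-sample denominator omits the
# NF variance, so its statistic is always the larger in magnitude.
print(t_vs_zero, t_two_group)
```

      Showing untransformed data for both groups keeps the control variance visible and matches the two-group comparison that the test actually performs.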

      1. Some figures show both individual data points and summary statistics (mean, SD, ... ex. Fig 2A), which I believe is ideal, but some show only one or the other (ex. Fig 2B, no summary statistics; Fig. 3, no data points). The manuscript would read more convincingly if data visualization were consistent across figures.

      We thank the reviewers for their feedback. We have made changes to all the figures in the revised manuscript to improve visual consistency.

      Minor comments: 1) High sugar diet: what is the actual sugar concentration in the NF v. HSD diets? The authors write that the HSD diet contains "30% more sugar" than the NF, but providing the final sugar concentrations-sucrose or others-would be informative for other scientists studying the effect of high sugar diets.

      We thank the reviewer for their suggestion. We have updated the methods to include this sentence: “After 7 days, flies were either maintained on normal diet or moved to a high sugar diet (HSD), composed of the same composition as normal diet but with an additional 300g of sucrose per liter”.

      1. Additionally, the definition of HSD is inconsistent. Main text (Page 5, line 17) states that their HSD is "60% more sugar than normal media," whereas the figure legend (Fig 1) and the Methods state that the HSD contains "30% more sugar."

      We apologize for this egregious typo in the figure legend! We have now fixed this to say 30% HSD. Only 30% HSD was used throughout this study.

      2) Starvation medium: please provide justification for why the authors used 1% sucrose/agar for starvation medium, instead of plain agar/water that most labs use. At least clarify and provide a reference for the claim that the 1% sucrose/agar "is a minimal food media to elicit a starvation response."

      We are very grateful to this reviewer for identifying this error in the methods description and bringing it to our attention. We used 0% sucrose agar for overnight starvation in this study, as most labs do. The error occurred because we were using another manuscript from the lab to help draft the methods section (PMID: 29017032). In that study, where we assayed the effect of chronic starvation, our lab used “1% sucrose agar for 5 days at 25C”. In the current study, however, because we are testing the acute effects of overnight starvation, we use 0% sucrose agar.

      3) Pect mRNA level is higher with HSD. This is surprising because not only, as authors mention, is increased PC32.2 with HSD suggests lower Pect activity, but also because Pect RNAi phenocopies long-term HSD in HDF behavior, lipid morphology, FOXO accumulation in fat body. The authors speculate that the data "likely shown an upregulation in an attempt to mediate the Pect dysregulation occurring at the protein level." If that were true, a western blot may be informative. Zhao and Wang (2020, PLoS Genetics) generated a Pect antibody that seems compatible with western blot applications. That being said, I don't think such data is critical for the manuscript. I mention this simply as a suggestion for the authors. a. page 8, line 22-23, did you mean to write "Given how PC32.2 is elevated after 14 days of exposure to HSD, we assumed that Pect levels would be low for flies under HSD," not "high?" Otherwise the subsequent 2 sentences don't make sense.

      We agree that the most confusing aspect of the study was that Pect mRNA levels were very high on Day 14 of HSD, yet the effects of Pect-KD phenocopied HSD. To resolve this, we have now performed lipidomic analyses on whole adult flies in which Pect is knocked down (KD) by RNAi in the fat tissue. We now present a new dataset in Figure 6. Two striking changes occur. They are:

      1. Pect-KD shows increase in the phospholipid classes LPC and LPE (Figure 6A). In contrast, LPE is significantly downregulated on HSD Day 14 (Figure 3).
      2. Pect-KD shows a significant reduction in a specific PE species, PE 36.2 (Figure 6B). Our data regarding the decrease in PE 36.2 agree with a previous lipidomic analysis of Pect mutant retina (PMID: 30737130). In contrast, PE 36.2 trends upwards on 14-day HSD (Figure S7C), though not significantly.

      In 14-day HSD-fed flies, consistent with the extreme upregulation of Pect mRNA (Figure S6A; Pect mRNA 200-250 fold), PE trends upwards (Figure 3) and PE 36.2 trends higher (Figure S7C). We note that, on the surface, the PE and LPE changes are opposite between the 14-day HSD lipidome and fat-specific Pect-KD. But there is a significant commonality: in both states there is an imbalance of the phospholipid classes PE and LPE. Hence, we propose that maintaining the compositional balance of the phospholipid classes PE and LPE is critical to hunger-driven feeding and insulin sensitivity, and that either an increase or a decrease in these key phospholipid species may lead to abnormal hunger-driven feeding.

      We agree that a western blot would be informative as well, but we were unable to obtain the reagent from Dr. Wang’s group, precluding us from performing this request. See email snapshot.

      To ensure that we appropriately discuss and clarify this issue, we have now included a section in the discussion (Page 14, Lines 26-34) under the subtitle “The implications of relationship between Pect levels and HSD”. We have pasted an excerpt from that subsection below for this reviewer's assessment.

      “Also, we note that over-expression of Pect cDNA in the fat-body does not alter phospholipid balance (Figure S9) and indeed improves HDF on HSD (Figure 7B). While this may appear inconsistent, it is critical to note that over-expression of Pect cDNA using UAS/Gal4 only increases Pect mRNA expression by 7-fold (Figure S6A), whereas HSD causes its upregulation by 250-fold (Figure S6B). Hence, we speculate that an increased ‘basal’ level of Pect, such as that provided by cDNA over-expression in fat, may be protective against the negative effects of HSD (Figure 7B) without affecting overall phospholipid levels (Figure S9), but extreme upregulation of Pect on HSD affects the PE and LPE balance (Figure 3).”

      Reviewer #1 (Significance (Required)):

      The work is potentially novel and interesting, but at this stage it's difficult to interpret what the phenotype signifies. Although the manuscript could be revised simply by modifying the text, experimentally addressing the concerns would significantly improve the work.

      In sum, we hope we have addressed the key concern for Reviewer #1 as to whether the behavior we report here is indeed a dampening of starvation-induced feeding, or an effect of increase in baseline feeding. We hope that by reviewing our non-normalized data, they can appreciate that it is the former. Also, we hope that Reviewer #1 appreciates that we have strived to address the concerns by additional experiments, to clarify our findings and improve the impact of the work.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This intriguing manuscript by Kelly and colleagues uses the fruit fly Drosophila melanogaster as a model to understand how diet-induced obesity alters the feeding response over time. In particular, the authors findings indicate that chronic exposure to a high-sugar diet significantly alters the starvation-induced feeding response. These behavioral studies are complemented by a lipidomics approach that reveals how a chronic high sugar affects many lipid species, including phospholipids. The authors then pursue mechanistic studies that indicate phospholipid metabolism within the fat body appears to remotely affect insulin secretion from the insulin producing cells. Moreover, the changes in phospholipid abundance are associated with changes in insulin-signaling, including increased insulin secretion from the IPCs and elevated levels of FOXO within the nucleus.

      I find the study to be potentially very important - the authors combine a longitudinal study that would be difficult in any other model with the powerful genetic tools available in the fly. The conclusions are mostly convincing, but a few follow-up experiments are required:

      We are grateful for the reviewer's constructive, detail-oriented, and balanced feedback, and for their recognition of the value of this study. We have now performed additional experiments to address the key concerns raised by all reviewers. We hope that, on reading the revised version of our study, the reviewer continues to feel positive about its message and potential impact.

      1. The key conclusions from the manuscript assume that manipulation of Pect expression levels alters phosphatidylethanolamine (PE) levels. However, the authors make no attempt to verify that the genetic experiments described herein actually affect PE levels. At a minimum, changes in PE levels should be verified for the Pect knockdown and overexpression lines. Similarly, there is no evidence that manipulation of either EAS or Pcyt2 induces the expected metabolic effects. I'm not asking that the longitudinal feeding experiments be repeated, simply that the authors measure the relevant lipid species, preferably with a targeted LC-MS approach.

      Prompted by this reviewer, we performed targeted LC-MS on whole adult flies, on normal diet, to assess lipid levels for fat-specific Pect-KD and overexpression. We decided to focus on Pect, as its knock-down even on normal diet causes a dampened hunger-driven feeding behavior (Figure 7A) and phenocopied a 14-day HSD feeding phenotype.

      We now present a new dataset in Figure 6. Two striking changes occur:

      • Pect-KD shows a significant reduction in a specific PE species, PE 36.2 (Figure 6B). Our data regarding the decrease in PE 36.2 agree with a previous lipidomic analysis of Pect mutant retina (PMID: 30737130). Note that although overall levels of all PE species trend downwards, like the Clandinin lab study on Pect (PMID: 30737130), we did not find a significant change in the overall PC and PE levels.

      • Pect-KD shows an increase in the phospholipid classes LPC and LPE (Figure 6A). In contrast, LPE is significantly downregulated on HSD Day 14 (Figure 3). In 14-day HSD-fed flies, consistent with the extreme upregulation of Pect mRNA (Figure S6A; Pect mRNA 200-250 fold), PE trends upwards (Figure 3) and PE 36.2 trends higher (Figure S7C). We note that, on the surface, the PE and LPE changes are opposite between the 14-day HSD lipidome and fat-specific Pect-KD. But there is a significant commonality: in both states there is an imbalance of the phospholipid classes PE and LPE. Hence, we propose that maintaining the compositional balance of the phospholipid classes PE and LPE is critical to hunger-driven feeding and insulin sensitivity, and that either an increase or a decrease in these key phospholipid species may lead to abnormal hunger-driven feeding.

      Finally, fat-specific Pect-OE did not cause significant changes to lipid species (Figure S9). This could be because the fat-specific Pect-OE flies were on normal food, or because we were assaying whole-body lipid levels rather than fat-specific lipid changes. Against the latter explanation, even a 60% reduction in Pect mRNA levels (Figure S6A) was sufficient to produce an effect on whole-body phospholipid balance (Figure 6). Hence, we speculate that maintaining a basally higher Pect mRNA level (7-fold higher; Figure S6A) might allow 14-day HSD-fed flies to buffer the negative effects of HSD, and we predict that it might then take longer to disrupt the phospholipid balance and HDF response.

      We have now included a section in the discussion (Page 14, Lines 26-34) under the subtitle “The implications of relationship between Pect levels and HSD”. We have pasted an excerpt from that subsection below for this reviewer's assessment.

      “Also, we note that over-expression of Pect cDNA in the fat-body does not alter phospholipid balance (Figure S9) and indeed improves HDF on HSD (Figure 7B). While this may appear inconsistent, it is critical to note that over-expression of Pect cDNA using UAS/Gal4 only increases Pect mRNA expression by 7-fold (Figure S6A), whereas HSD causes its upregulation by 250-fold (Figure S6B). Hence, we speculate that an increased ‘basal’ level of Pect, such as that provided by cDNA over-expression in fat, may be protective against the negative effects of HSD (Figure 7B) without affecting overall phospholipid levels (Figure S9), but extreme upregulation of Pect on HSD affects the PE and LPE balance (Figure 3).”

      A central hypothesis in the study is that the HSD over a period of 14 days results in insulin resistant and that these changes are leading to changes in hunger dependent feeding. I would encourage the authors to determine if Foxo mutants are resistant to these HSD-induced effects on HFD.

      We thank the reviewers for this suggestion. However, given that dFOXO nuclear localization rather than expression levels regulate insulin sensitivity, we feel that disrupting dFOXO levels via mutation or knockdown will produce a plethora of indirect effects including developmental abnormalities (PMID: 24778227, PMID: 16179433, PMID: 29180716, PMID: 12893776). Our data suggest that chronic HSD treatment and Pect affect insulin sensitivity in fat tissue. However, we feel that investigating whether insulin sensitivity/FOXO signaling in fat tissue regulates feeding behavior is outside the scope of our work.

      1. In lines 25-30, the authors draw the conclusion that an increase in unsaturated fatty acid species is associated with the HSD and that these changes results in a more fluid lipid environment. While I agree with the model, the manuscript contains no evidence to support such a model. Either test the hypothesis or move the last line of the section to the discussion.

      We thank the reviewer for this important and insightful comment. We agree that the data we presented and discussed in the original version is at the moment speculative. Addressing the hypothesis that increase in unsaturated fatty acid species result in a more fluid lipid environment will require us to build tools and expertise. Hence, this hypothesis is better suited for exploration in a future study. Given this, we have moved this out of the results section into the Discussion section titled “HSD and fat-specific PECT-KD causes changes to phospholipid profile” (See excerpt below from page 13, lines 24-35).

      “In addition to changes in phospholipid classes, we found that HSD caused an increase in the concentration of PE and PC species with double bonds (Figure S4C and S4D). Double bonds create kinks in the lipid bilayer, leading to increased lipid membrane fluidity which impacts vesicle budding, endocytosis, and molecular transport (refs. 14, 92). Hence it is possible that a mechanism by which HSD induces changes to signaling is by altering the membrane biophysical properties, such as by increased fluidity, which would have a significant impact on numerous biological processes including synaptic firing and inter-organ vesicle transport.”

      Also, as per the reviewer’s guidance, given that we are speculating here, we have also shifted this dataset from Main figure 4 to supplement S4C and S4D.

      In addition, lines 25-30 state that FFAs are increased after 14 days of a HSD. Figure 3A shows the exact opposite - FFAs are significantly decreased in 14 day fed animals despite being elevated in the 7 day fed animals. This is an interesting result that warrants discussion. Moreover, I would encourage to examine the lipidomic data more carefully to ensure that the text accurately portrays the lipid profiles.

      We apologize for misstating in lines 25-30 that FFAs are increased on 14-day HSD. It was an error and we have corrected it. We agree with the reviewer that the reduction of FFA on Day 14 of HSD is an intriguing and unexpected observation that needs to be emphasized and further discussed. To this end, we have added Figure S4B, in which we show the difference in FFA concentration (by species) after days 7 and 14.

      Furthermore, we discuss the potential implications of reduced FFA at Day 14 on page 12, lines 19-27, of the Discussion section titled “HSD and fat-specific PECT-KD causes changes to phospholipid profile”. We have stated the following:

      “We speculate that this reduction in FFA may be due to their involvement in TAG biogenesis (PMID: 13843753). We were interested to see if the decrease in FFA correlated to a particular lipid species, as PE and PC are made from DAGs with specific fatty acid chains. However, further analysis of FFAs at the species level did not reveal any distinct patterns. The majority of FFA chains decreased in HSD, including 12.0, 16.0, 16.1, 18.0, 18.1, and 18.2 (Figure S4B). This data was more suggestive of a global decrease in FFA, likely being converted to TAG and DAG, rather than a specific fatty acid chain being depleted.”

      The processed lipidomics data should also be included as supplementary data table so that they can be independently analyzed by the reader.

      We thank the reviewer for this suggestion. As per the reviewer's request, we have included the raw data as an attachment in our supplementary material (Supplementary Files 1-3), so that interested readers can use the datasets generated in this study for future work and further analysis.

      Beyond these experimental suggestions, the manuscript needs significant editing for clarity. While I won't provide a comprehensive list, the authors need to provide accurate descriptions and annotation of genotypes (including w[1118], which is written as W1118), typos, and formatting. I've listed a few examples below:

      1. Page 3, Line 1 and 2: "...have been shown to impact feeding behavior and metabolism that leads to..." This is an awkward and grammatically incorrect sentence.
      2. Page 3, Lines 7-32 is one very large paragraph but contains concepts that should be broken down over at least three paragraphs.
      3. Page 3, Line 25: A description of the reaction catalyzed by Pect would be helpful for a manuscript focused on Pecte activity.
      4. Page 4, Line 10: "previously characterized method of eliciting diet induced feeding behavior." As stated in the text, the method is previously described yet the manuscript characterizing the method isn't cited.
      5. Figure legend 3 contains a random assortment of capitalized lipid species. Also, the names of lipid species are inappropriately broken into multiple names. Please use correct nomenclature throughout the manuscript.

      The list above is nowhere near comprehensive. The manuscript requires significant editing.

      We are grateful to the reviewer for drawing our attention to these errors. We have made significant edits to the revised manuscript to address the above-mentioned concerns, made additional textual changes throughout, and copyedited it. We hope that the reviewer will find that the manuscript reads better and that its clarity and precision are significantly improved.

      Reviewer #2 (Significance (Required)):

      I find the study to be potentially very important - the authors combine a longitudinal study that would be difficult in any other model with the powerful genetic tools available in the fly. The findings will significantly advance our understanding of how lipid metabolism links dietary nutrition with feeding behavior.

      Once again, we are grateful for this reviewer’s thoughtful critique and encouraging words regarding our work and its potential impact.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: This manuscript uses Drosophila to investigate how diet-induced obesity and the changes in the lipid metabolism of the fat body modulate hunger-driven feeding (HDF) response. The authors first demonstrate that chronic exposure (14 days) of high sugar diet (HSD) suppresses HDF response. Through lipidome analysis, the authors identify a specific class of lipids to be elevated upon chronic HSD feeding. This coincided with the changes in expression of Pect, an enzyme that regulates the biosynthesis of these lipids. Modulating the expression of Pect specifically in the fat body affected HDF response.

      We thank this reviewer for their rigorous and thoughtful critique and for identifying a key gap in our original study: Pect mRNA levels are elevated on 14-day HSD, yet Pect-KD phenocopies the HDF defect. By performing whole-body adult fly lipidomics on fat-specific Pect-KD, we have now resolved this issue and clarified the role of Pect in maintaining phospholipid homeostasis, which in turn impacts hunger-driven feeding. We hope the reviewer finds that the revised manuscript further clarifies the functional link between Pect’s role in the fat body and hunger-driven feeding.

      Major comments: The author claim that the HDF response in HSD is distinct between early (5d, 7d) and chronic (day 14) HSD feeding. However, the data seem to indicate that HDF response is significantly decreased at all time points in HSD. For example, at day 5 HDF response was increased only 3-fold in HSD (Figure 1C) compared to around 50-fold increase in NF (Figure 1B). The scale of the Y-axis in Figure 1B and 1C is an order of magnitude different. Including the starved data (NFstv and HSDstv) in Figure S1, normalized to NF fed group, would better visualize the overall trends. Related to this, having the source data for the actual number of feeding events would be useful (e.g., to see the baseline changes in feeding in different time points in Figure 1 and the effect of genetic manipulations in Figure 7).

      As per the reviewer's request, we have now modified our graphs to show source data (Figure S1) and the raw feeding events.

      In the non-normalized graphs we plot baseline and hunger-driven feeding events over a longitudinal time course (Figure 1B-D). We also show that HSD-fed flies do not display increased baseline feeding (Figure 1D), suggesting that the effect we see on HDF is not clouded by increased baseline feeding.

      Yes, the reviewer makes an important point that the HDF response of HSD-fed flies is of a lower magnitude than that of NF-fed flies. We think that is a biologically meaningful observation, as it suggests that flies have a remarkably fine-tuned ability to coordinate food intake with nutrient store levels.

      We have now included a paragraph in the Discussion, Page 11, Lines 23-27, that says the following, to ensure that readers appreciate this salient point raised by the reviewer.

      “It is to be noted that the HDF response of HSD-fed flies (Figure 1C, Days 3-10) is of a lower order of magnitude than that of the NF-fed flies. This suggests that, in addition to sensing an energy deficit and mobilizing fat stores (Figure 1F, 1G, S1), HSD-fed flies calibrate their starvation-induced feeding to compensate only for the lost amount of fat. Overall, this suggests that flies have a remarkably fine-tuned ability to coordinate food intake with nutrient store levels.”

      The association between fat body Pect level and phospholipid levels is not clear. Day 14 of HSD feeding shows high expression of Pect in the fat body and elevated levels of PC32.0 and PC32.2. The authors assume the high expression of Pect in the fat body is due to the compensatory response, but there are no data indicating downregulation of Pect levels at the earlier time points of HSD feeding. A previous study demonstrated that Pect mutant flies have lower levels of PC32.0 but higher PC32.2 (PMID: 30737130).

      We agree that one puzzling aspect of the original version of this study was that Pect mRNA levels were very high on Day 14 of HSD, yet the effects of Pect-KD phenocopied HSD. To resolve this, prompted by the concerns of Reviewers #2 and #3, we have now performed lipidomic analyses on whole adult flies in which Pect is knocked down (KD) by RNAi in the fat tissue. We now present a new dataset in Figure 6. Two striking changes occur. They are:

      1. Pect-KD shows increase in the phospholipid classes LPC and LPE (Figure 6A). In contrast, LPE is significantly downregulated on HSD Day 14 (Figure 3).
      2. Pect-KD shows a significant reduction in a specific PE species, PE 36.2 (Figure 6B). Our data regarding the decrease in PE 36.2 agree with a previous lipidomic analysis of Pect mutant retina (PMID: 30737130). In contrast, PE 36.2 trends upwards on 14-day HSD (Figure S7C), though not significantly.

      In 14-day HSD-fed flies, consistent with the extreme upregulation of Pect mRNA (Figure S6A; Pect mRNA 200-250 fold), PE trends upwards (Figure 3) and PE 36.2 trends higher (Figure S7C). We note that, on the surface, the PE and LPE changes are opposite between the 14-day HSD lipidome and fat-specific Pect-KD. But there is a significant commonality: in both states there is an imbalance of the phospholipid classes PE and LPE. Hence, we propose that maintaining the compositional balance of the phospholipid classes PE and LPE is critical to hunger-driven feeding and insulin sensitivity, and that either an increase or a decrease in these key phospholipid species may lead to abnormal hunger-driven feeding.

      On day 14, HDF response was increased 70-fold in w1118 flies in NF (Figure 1B; w1118), but only 2.5-fold in lpp>LucRNAi control flies in NF (Figure 7A). This suggests that lpp-gal4 driver lines have a significant effect on HDF response. Using a different fat-body specific Gal4 line would be necessary to validate conclusions.

      Regarding the reduced HDF magnitude: in our experience, using UAS-Gal4 consistently reduces the magnitude of the HDF response, so these lines cannot be compared to w1118, which is more robust. To account for background differences, we use UAS-Gal4 with a control RNAi. The control clearly shows a difference in HDF response on starvation, but Pect and Pisd RNAi do not (Figure 7A). Hence, given that this experiment internally controls for any changes in HDF response for UAS-Gal4>RNAi, we conclude that the HDF response is disrupted in Pect and PISD KD (Figure 7).

      We presented only the Lpp driver in our study because it is the only fat-specific driver with no leaky expression in other tissues: the apolpp promoter used to generate this Gal4 line is expressed only in fat tissue (Eaton and colleagues, PMID: 22844248). Other widely used fat-specific drivers, including the pumpless-Gal4 (ppl-Gal4) driver, have leaky expression in the gut or other tissues (see Table 2 of this detailed study by Dr. Drummond-Barbosa: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7642949/). If the reviewer is aware of a fat-specific Gal4 line, other than Lpp-Gal4, with highly specific expression in the fat tissue and no leaky expression in other tissues, we are happy to take on board the reviewer’s suggestion and try that line.

      HSD feeding promotes Pect expression (Figure S3C) and global changes in phospholipid levels (Figure 3, 4). Therefore, shouldn't Pect overexpression (not Pect RNAi) in a normal diet mimic HSD feeding state and promote loss of HDF response? Conversely shouldn't knockdown of Pect in HSD rescue loss of HDF response?

      We agree that a puzzling aspect is that Pect mRNA levels are significantly elevated in HSD Day-14, but Pect-KD showed displays the inappropriate HDF response. As we have described in our response to this reviewer on Page 19, we believe that Pect-KD and HSD disrupt PE and LPE balance overall but in different ways. Whereas Pect-OE using cDNA expression in fat body does not cause a significant change to any lipid class (Figure S9), and our results suggest that basally higher level of PECT is likely to be protective on HSD with respect to HDF(Figure 7B).

      To ensure that we appropriately discuss and clarify this issue, we have now included a section in the discussion - Page 14 Lines 26-33- under the subtitle “The implications of relationship between Pect levels and HSD”. We have pasted an excerpt from that subsection below for this reviewers assessment.

      Also, we note that over-expression of Pect cDNA in the fat-body does not alter phospholipid balance (Figure S9) and indeed improves HDF on HSD (Figure 7B). While this may appear inconsistent, it is critical to note that over-expression of Pect cDNA using UAS/Gal4 only increases Pect mRNA expression by 7-fold (Figure S6A), whereas HSD causes its upregulation by 250-fold (Figure S6B). Hence, we speculate that an increased ‘basal’ level of Pect such as by that provided by a cDNA over-expression in fat, may be protective to the negative effects of HSD (Figure 7B) without affecting overall phospholipid levels (Figure S9) , but extreme upregulation Pect on HSD affects the PE and LPE balance (Figure 3).”

      We would have liked to test Pect protein expression on HSD, but we were unable to access the antibodies for Pect published in a prior study (PMID: 33064773) from Dr. Wang’s lab (see Page 10-11 of the response to Reviewer #1). Hence, we were unable to test how Pect protein levels correlate with the 250-fold increase in mRNA expression.

      In conclusion, we hope the reviewer appreciates that our results regarding Pect function are consistent with the main conclusion that achieving the right phospholipid balance between PE and LPE, is critical for an organism to display an appropriate HDF response.

      Minor comments: All graphs should plot individual data points and be shown as box-and-whisker plots wherever possible.

      Thank you for this suggestion. We have added individual data points to the vast majority of figures in the paper. We have made exceptions for graphs such as those in Figure 1 and Figure S4B-D, where we find that individual data points add an unnecessary layer of complexity. We hope these changes provide additional clarity and strength to the claims made in this manuscript.

      Data for day 14 missing in Figure S4A and S4B.

      We have provided Day 14 data for the PC composition and PE composition. Due to changes in the figures, these panels are now S7A and S7B.

      Reviewer #3 (Significance (Required)):

      The interactions between diet-induced obesity, peripheral tissue homeostasis and feeding behavior are an interesting topic that can be addressed using Drosophila. This manuscript demonstrates how fat body Pect levels affect HSD-induced changes in hunger-driven feeding response. However, at this point, the functional association between fat body Pect level, global phospholipid level, and loss of hunger-driven feeding response in chronic HSD feeding is not clear.

      We hope the revised data and discussion provide a well-substantiated functional association supporting the importance of maintaining phospholipid balance, driven by the Pect enzyme, as a critical regulator of hunger-driven feeding behavior. As stated in the revised discussion, the key take-home message of our manuscript is that on prolonged HSD exposure PC, PE and LPE levels are dysregulated, and this loss of phospholipid homeostasis coincides with a loss of hunger-driven feeding. Following this lead on phospholipid imbalance, we then uncovered a critical requirement for the activity of the rate-limiting PE enzyme PECT within the fat tissue in controlling hunger-driven feeding.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Kelly et al. show that the difference between the feeding behavior of fed and starved flies (hunger-driven feeding; HDF) is absent in animals fed a high-sugar diet (HSD) for two weeks or more. The disappearance of HDF with HSD coincides with changes in phospholipid profiles caused by HSD. Furthermore, RNAi-mediated downregulation in the fat body of PECT, a key enzyme in the PE biosynthesis pathway, phenocopies physiological effects of HSD. Moreover, downregulation or overexpression in the fat body abolishes or induces HDF, respectively, independent of HSD treatment.

      Overall, the manuscript is well-written and the phenotypes are clear. However, I have major concerns regarding the authors' interpretation of the data and their conclusion. Most importantly, while it is clear that the authors' high-sugar dietary treatment affects feeding behavior and physiology, I am not convinced that the changes can be considered "hunger-driven"-which is central to the main point of the manuscript. Therefore, it is my recommendation that the authors substantially revise the manuscript by either showing additional/re-analyzed data that rule out alternative hypotheses, or rewriting the manuscript keeping alternative interpretations in mind.

      Major comments:

      1. The data do not sufficiently show that the long-term HSD regime disrupts "hunger-sensing." The manuscript should address alternative hypotheses by showing raw instead of normalized data, rewriting the manuscript with a new central conclusion, or running additional experiments that actually show a defect in hunger-driven response.
        • a. The main results that the authors rely on for the argument is that the ratio of feeding events that the starved and non-starved flies eat is different between the groups fed normal or HSD. However, because the authors only show normalized data (normalized to non-starved flies; Fig. 1), it is difficult to tell whether the change is due to a chronically increased feeding in non-starved HSD flies-maybe in perpetual hunger-like allostasis-or dampened starvation response. Indeed, the data shown in Fig S1 show that flies fed HSD for as short as 5 days show more frequent feeding events compared to age-matched controls fed normal food. It is possible that because the HSD-fed flies eat more than NF-fed flies, even without being starved, the ratio of starved/non-starved feeding is lower in the HSD-fed group-due to changes in the denominator, rather than the numerator.
        • b. It is also possible that the HSD-fed flies are simply not in as big an energy deficit physiologically, due to the increased fat deposits they've accumulated (as the authors show later in the manuscript). It may take longer for the fat HSD flies to reach substantial energy deficiency than the NF flies, but they still may eventually be able to appropriately respond to hunger, just like NF flies. In such case, it would be a misnomer to call this behavioral change a 'defect in hunger-driven feeding behavior.' Maybe an experiment with a dose-response curve of "hunger driven feeding response" as a function of duration of starvation would help?
      2. How can you be sure that lower Dilp5 immunofluorescence is indicative of increased Dilp5 secretion? Wouldn't decreased production of dilp5 also have the same results?
        • a. Also, the authors should state in the main text that it is Dilp5, not just any Dilp.
      3. Data presentation:
        • a. Sometimes the data are normalized to NF (Fig 4B-C), sometimes not (ex. Fig 4A, S4C). Unless there is a specific rationale for the data transformation, it would be more appropriate to show untransformed data (ex. Fig 4A, S4C), especially as the authors use two-way ANOVA to determine significance. Only showing the differences implies comparison against a hypothetical mean (i.e. μ0=0), not between two group means.
        • b. Some figures show both individual data points and summary statistics (mean, SD, ... ex. Fig 2A), which I believe is ideal, but some show only one or the other (ex. Fig 2B, no summary statistics; Fig. 3, no data points). The manuscript would read as more convincing if data visualization were consistent across figures.
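The denominator concern raised in comment 1a can be made concrete with a small numerical sketch. The counts below are hypothetical, chosen only for illustration, and are not data from the manuscript; `hdf_ratio` is an illustrative helper name. Two different scenarios (elevated baseline feeding versus a dampened starvation response) yield the same normalized value, which is why raw counts would be needed to tell them apart.

```python
def hdf_ratio(starved_events: float, fed_events: float) -> float:
    """Starved feeding events normalized to the non-starved (fed) baseline."""
    if fed_events <= 0:
        raise ValueError("fed baseline must be positive")
    return starved_events / fed_events


# Hypothetical feeding-event counts (assumptions for illustration only).
groups = {
    "NF": {"fed": 10.0, "starved": 30.0},
    # Same starved intake as NF, but more feeding without starvation.
    "HSD_high_baseline": {"fed": 25.0, "starved": 30.0},
    # Same baseline as NF, but a weaker response to starvation.
    "HSD_dampened": {"fed": 10.0, "starved": 12.0},
}

ratios = {name: hdf_ratio(g["starved"], g["fed"]) for name, g in groups.items()}

for name, value in ratios.items():
    print(f"{name}: starved/fed = {value:.2f}")
```

Both hypothetical HSD scenarios give a ratio well below the NF ratio, so the normalized plot alone cannot distinguish a change in the numerator from a change in the denominator.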

      Minor comments:

      1. High sugar diet: what is the actual sugar concentration in the NF v. HSD diets? The authors write that the HSD diet contains "30% more sugar" than the NF, but providing the final sugar concentrations-sucrose or others-would be informative for other scientists studying the effect of high sugar diets.
        • a. Additionally, the definition of HSD is inconsistent. Main text (Page 5, line 17) states that their HSD is "60% more sugar than normal media," whereas the figure legend (Fig 1) and the Methods state that the HSD contains "30% more sugar."
      2. Starvation medium: please provide justification for why the authors used 1% sucrose/agar for starvation medium, instead of plain agar/water that most labs use. At least clarify and provide a reference for the claim that the 1% sucrose/agar "is a minimal food media to elicit a starvation response."
      3. PECT mRNA level is higher with HSD. This is surprising not only because, as the authors mention, increased PC32.2 with HSD suggests lower PECT activity, but also because PECT RNAi phenocopies long-term HSD in HDF behavior, lipid morphology, and FOXO accumulation in fat body. The authors speculate that the data "likely shown an upregulation in an attempt to mediate the PECT dysregulation occurring at the protein level." If that were true, a western blot may be informative. Zhao and Wang (2020, PLoS Genetics) generated a PECT antibody that seems compatible with western blot applications. That being said, I don't think such data is critical for the manuscript. I mention this simply as a suggestion for the authors.
        • a. page 8, line 22-23, did you mean to write "Given how PC32.2 is elevated after 14 days of exposure to HSD, we assumed that PECT levels would be low for flies under HSD," not "high?" Otherwise the subsequent 2 sentences don't make sense.

      Significance

      The work is potentially novel and interesting, but at this stage it's difficult to interpret what the phenotype signifies. Although the manuscript could be revised simply by modifying the text, experimentally addressing the concerns would significantly improve the work.

      The co-reviewer and I have expertise in Drosophila neurobiology and behavior.

      Referees cross-commenting

      Hi all, although the reviews hit upon some overlapping, but mostly different points, I agree with all of the concerns raised. There's some really interesting stuff here but some of the results, as presented, don't make sense. It's possible this will be clarified by revising the text, although I suspect it's more likely that the authors will have to add a number of the experimental suggestions made by the reviewers.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewers' comments in italics

      We thank all reviewers for their positive and encouraging comments and for the criticisms that helped improve our work. Here we present a revised version of the manuscript that addresses the comments raised.

      • Reviewer #1 (Evidence, reproducibility and clarity (Required)): This is an interesting paper that identifies Tns3 as a potential effector of oligodendrocyte differentiation based on an ingenious strategy comparing regulatory binding sites of known master regulators of differentiation, and then shows using in vivo genetics that this role is indeed correct. Next, a potential mechanism is identified by showing co-localization with beta 1 integrin, known to regulate apoptosis of newly-formed oligodendrocytes. The results are well illustrated and the experiments performed with appropriate power using a broad range of techniques that combine in silico, in vitro and in vivo work to great effect.

      I think this represents an important contribution that will be of significant interest to neuroscientists - the mechanisms regulating oligodendrocyte generation remain poorly understood and the evidence that this contributes to adult learning (adaptive myelination) and CNS regeneration makes this a key question. I would suggest that the following are considered before publication:

      We thank the reviewer for these positive comments and critiques to improve the manuscript.

      The work describing the KO mice that were not used as they proved unsuitable need not be described - it breaks the logical flow.

      In agreement with the reviewer's comment, we have reduced this part to a short paragraph indicating that our analyses of several Tns3 constitutive KO lines showed developmental lethality and possible genetic compensation in Tns3 expression, leading us to conclude that they are inappropriate tools to study Tns3 function in oligodendrogenesis. We have summarized the data in Fig. S7 and the description in the method section.

      It would be useful to compare the extent of cell death in the Tns3 cKO mice with that described in the alpha6 integrin KO and the integrin beta1 cKO (the Colognato and Benninger papers). Do they match? If not (and I suspect the Tns3 cKO death is greater) could other mechanisms be downstream of the Tns3?

      In agreement with the reviewer comment, we have added the following paragraph to the discussion:

      ‘Knockout mice for integrin-a6 present a 50% reduction in brainstem MBP+ OLs at E18.5, just before they die at birth, accompanied by an increase in TUNEL+ dying OLs (Colognato et al, 2002). Similarly, conditional deletion of integrin-b1 in immature OLs by Cnp-Cre also leads to a 50% reduction in cerebellar OLs at P5, with a parallel increase in TUNEL+ dying OLs (Benninger et al., 2006). Therefore, given that Tns3-induced deletion in postnatal OPCs also leads to 40-50% reduction in OLs in both grey and white matter regions of the postnatal telencephalon (this study), paralleled by similar increase in TUNEL+ apoptotic oligodendroglia, we suggest that Tns3 is required for integrin-b1 mediated survival signal in immature oligodendrocytes.’

      I'm not sure why the authors argue that the activation of beta 1 would not be an informative experiment? This will regulate actin dynamics just as it regulates other integrin signaling pathways. Indeed, I would argue that an integrin activation experiment would be a neat way to prove mechanism (as it would be predicted to rescue the Tns3 cKO phenotype).

      In agreement with the reviewer comment, we have removed this sentence: ‘If so, exogenous activation of integrin a6b1 in cultured OPCs by Mn2+ (Colognato et al., 2004) would not be expected to increase oligodendrogenesis in Tns3-iKO oligodendroglia.’

      In an effort to understand Tns3 function through acute Tns3 deletion in postnatal OPCs, we have compared the transcriptome of Tns3-iKO oligodendroglia with that of control cells. We present these results in Figure 7, pinpointing deregulated genes associated with reduced oligodendroglial differentiation, integrin dysregulation, increased apoptosis, and conflicting cell cycle signaling. We leave for further studies the full characterization of how the loss of Tns3 leads to the deregulation of these processes.

      Can the authors provide any data on GM oligos and their OPCs? Is the requirement for Tns3 the same, and if so what might the implications be in the adult where new oligodendrocytes are being generated throughout life?

      Indeed, in our analyses of Tns3-iKO mice, we provide quantifications of the cortex as a grey matter territory, showing a similar 40-50% reduction in OLs as in white matter areas (corpus callosum and fimbria) and mixed regions such as the striatum.

      I note in S13 that integrin beta1 is not highly expressed in human oligos at the time in question. Does this call into question the relevance for human disease?

      We realize that scRNAseq plots are never easy to interpret, but it is important to note that the level of expression is coded by the intensity of the color scale, while the size of each dot indicates the proportion of cells in a given cluster/cell type in which the transcript was detected (a measure limited by the drop-out of current single cell RNA-seq technologies). Considering this, please note that beyond a stronger expression in neural progenitor cells (NPCs, blue color), integrin-b1 (Itgb1) transcripts are expressed at medium to high levels (green to blue) in human immature OLs (Fig. S13B), similar to their pattern of expression in mouse oligodendroglia (Fig. S13A).

      Reviewer #1 (Significance (Required)): See above

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      *In this article, the authors identify and characterise Tensin3 (Tns3) as a target of key oligodendroglial transcription factors driving differentiation in the mouse. They use multiple transgenic models to describe loss of function, and suggest Tns3's action through integrin B1 signalling, with the key function being oligodendroglial survival.

      There is extensive and impressive work here, including identification of Tns3 by ChIPseq, expression of Tns3 in brain development, analysis of human (ES-derived) and mouse scRNAseq to infer timing of expression in the differentiation pathway, generation of V5-tagged Tns3-KI mice to overcome antibody limitations, identification of its expression in mouse remyelination, generation of a new Tns3KO mouse, in vivo Crispr Tns3KO in development, generation of a conditional KO, for deletion in adulthood, and finally some culture work to investigate potential mechanisms of actions. The bottom line is that Tns3 is required for survival of OPCs and immature oligodendrocytes in development/remyelination in mouse at least, and loss leads to apoptosis (through p53 increase and loss of integrin-B1 signalling), leading to a failure of proper differentiation.

      The experiments are carefully done, convincing and the tools generated impressive. There is clearly more to be done on clarifying the mechanism of action of Tns3, but I do not think further experiments on this topic are needed for this paper - they can wait for the next.*

      We thank the reviewer for the positive and encouraging comments. In an effort to understand Tns3 function through acute Tns3 deletion in postnatal OPCs, we have compared the transcriptome of Tns3-iKO oligodendroglia with that of control cells. We present these results in Figure 7, pinpointing deregulated genes associated with reduced oligodendroglial differentiation, integrin dysregulation, increased apoptosis, and conflicting cell cycle signaling. We leave for further studies the full characterization of how the loss of Tns3 leads to the deregulation of these processes.

      My only query is whether Tns3 is also expressed in immature OLs in human brain (rather than human ES-derived OLs). This should be easily checked by interrogating online Shiny apps from already published snRNAseq from various groups on human post mortem adult brain, and if not present there, then also in baby/fetal brain. This would be interesting and may well differ from the ES-derived cells, which tend to be very immature, and would add interest to the possible translational impact.

      Following the reviewer's suggestion, we analyzed 69,174 nuclei profiled by snRNAseq from GW9-GW22 fetal cerebellum (Aldinger & Miller, 2021; https://doi-org.proxy.insermbiblio.inist.fr/10.1038/s41593-021-00872-y), which we now present in Figure S3. We found a cluster of cells expressing iOL markers, including NKX2-2, TNS3, ITPR2, and BCAS1, similar to the hiPSCs-derived iOL1/iOL2 clusters and mouse iOL1/iOL2 clusters shown in Fig. S2.

      We also analyzed other datasets without finding iOLs given their age or numbers, including:

      • Immunopanned PDGFRA+ cells from human cortex GW20-GW24 (2690 cells, Huang and Kriegstein, Cell 2020), containing OPCs but not iOLs.

      • The recently published dataset from GW8-GW10 human forebrain oligodendroglia (van Bruggen & Castelo-Branco, Dev Cell 2022; https://doi.org/10.1016/j.devcel.2022.04.016), containing OPCs but not iOLs.

      • The GW17 to GW18 human cortex (40,000 cells, Polioudakis & Geschwind, 2019, https://doi.org/10.1016/j.neuron.2019.06.011), containing OPCs but not iOLs.

      Reviewer #2 (Significance (Required)): This work extends our knowledge of oligodendroglial differentiation, links it to the ECM and provides interest in manipulating this in diseases including glioma. My expertise: myelin, oligodendroglia, remyelination, human neuropathology

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      see below

      Reviewer #3 (Significance (Required)): Using purified oligodendrocytes, target genes of key regulators of oligodendrocyte differentiation were analyzed, which led to the identification of Tensin-3. The authors performed a detailed characterization of Tensin-3 expression. They found that Tensin-3 is highly expressed in immature mouse and human oligodendrocytes. Interestingly, Tensin-3 is selectively enriched in immature oligodendrocytes, and not present at detectable levels in OPCs and mature oligodendrocytes. Subsequently, the authors characterized Tensin-3 function by a series of knockdown approaches in vitro and in vivo. This series of experiments revealed an essential function of Tensin-3 in supporting oligodendrocyte survival. In the absence of Tensin-3 a large fraction of oligodendrocytes undergo apoptosis while differentiating to mature oligodendrocytes. This is a remarkable study applying an impressive array of methods that led to an important discovery in the field of oligodendrocyte biology. The main advances for the field are: 1) identification of a novel marker for premyelinating oligodendrocytes, 2) elucidation of Tensin-3 as a pro-survival factor in oligodendrocyte differentiation, 3) evidence of a link between Tensin-3-integrin signaling and oligodendrocyte survival. The data is well presented and organized, and the paper well written. I recommend publication with only minor suggestions for a revision:


      We thank the reviewer for these positive comments and critiques to improve the manuscript.

      In Figure 2, only images are shown, and the data is referred to as highly expressed or strong co-localization. Even if the data looks clear, the authors should provide some quantification of the data in the figure.

      We thank the reviewer for this comment. We have now provided a quantification of the fraction of Tns3+ cells expressing different markers of oligodendrocyte lineage progression/stages, and the percentage of cells at each stage expressing Tns3.

      Figure 3 is given too much weight in the manuscript text. I would recommend to shorten the text in the result section, and to move this figure to the supplement as it does not advance the story. It mainly shows that the KO mice still express transcripts in the brain. Were the transcripts lost in peripheral tissue?


      As mentioned above, in agreement with the comments of reviewers #1 and #3, we have reduced this part to a short paragraph indicating that our analyses of several Tns3 constitutive KO lines showed developmental lethality and possible genetic compensation in Tns3 expression, leading us to conclude that they are inappropriate tools to study Tns3 function in oligodendrogenesis. We have summarized the data in Fig. S7 and the description in the method section.

      Page 11: the authors describe in the text how the floxed allele was generated. This should be shifted to the supplement.

      Following the reviewer's suggestion, we have moved the description of Tns3 floxed allele generation to the Methods section.

      Page 16: the authors refer to Bcas1 as a problematic marker for immature oligodendrocytes, because the transcript is also expressed in mature oligodendrocytes. The authors are correct that the transcript is expressed in mature oligodendrocytes. However, the protein changes its localization when oligodendrocytes mature. At the protein level, it is a valuable and selective marker, as antibodies only label pre-myelinating and actively myelinating cells. In mature oligodendrocytes, antibodies against Bcas1 do not label the cell, only myelin. The text is misleading and needs to be corrected.

      In agreement with the reviewer's comment, we have modified the text as follows: ‘An optimized protocol for immunodetection using Bcas1-recognizing antibodies has been shown to label iOLs (Fard et al., 2017).’

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      The manuscript by Tran et al. describes the mechanism by which IFNa treatment prevents the development of liver CRC metastasis in several mouse models. They show how continuous administration of IFNa strengthens the liver vascular barrier through a direct effect on endothelial cells and prevents the trans-sinusoidal migration of tumour cells.

      Major points:

      1. Authors use an elegant orthotopic model of liver metastasis to confirm the effect of continuous IFNa on hepatic colonization (Fig.3). Although they extensively characterize the metastatic lesions, they do not show data on the potential impact of IFNa treatment in the primary caecum tumour. Authors should clarify if the described effects are taking place in the liver and/or in the caecum. It would be interesting to show if IFNa affects the primary tumour size, the extravasation of cancer cells and the immune infiltration, since all these factors could have an impact on the number of liver lesions.

      We thank the reviewer for acknowledging the importance of our results, particularly in the context of the orthotopic mouse model we developed. We agree that displaying the results of continuous IFNα therapy on primary intracecal tumors, as well as the results pertaining to the few mice that develop microscopic or macroscopic liver metastasis, is important for the interpretation of our work. Thus, we evaluated the dimension of primary intracecal CRC lesions (Fig 3D,E) and we performed additional IHC characterization of the primary tumors (Fig S4A,B). The analysis showed that the dimension of the primary lesions and the markers we analyzed were not significantly modified by continuous IFNα therapy (Fig 3D,E and Fig S4A,B). These results favor the hypothesis that IFNα therapy does not modify the number of cells that spread from the primary tumors and seed into the liver, but rather impinges on the intravascular containment of CRC cells circulating within the liver (Fig 3F). As said earlier, the data also highlight the possibility that CRC tumors may become refractory to IFNα or that the dose and schedule we adopted do not significantly affect the growth of established liver CRCs at late time points. The data are also consistent with results obtained with MC38Ifnar1_KO CRC cells indicating that continuous IFNα therapy does not require Ifnar1 expression by tumor cells to exert its antimetastatic function (Fig 4A,C-D). This is also in line with the high IFNα concentrations required to activate the "tunable" direct antiproliferative functions of this cytokine, which exceed those achieved in our system (Catarinella et al, 2016; Schreiber, 2017). Text has been added in the revised manuscript at lines 175-197 and in the discussion lines 425-431.

      1. Figure 3f right shows liver images without any obvious metastatic lesion. Since authors are analysing the effect of IFNa treatment in proliferation, vascularization and immune composition in liver tumours, they may show and quantify images with metastatic lesions and restrict the analysis to the tumour area.

      Since the main finding of our manuscript regards the prevention of hepatic colonization by continuous IFNα therapy, we think that the original data presented in Fig 3G,H are representative of the overall efficacy of our strategy, which confers protection in up to 60% of the mice carrying intramesenteric tumors of increasing dimensions (Fig 3H). We have thus maintained our original results, adding the quantification of all IHC data on groups of Sham control livers (n=6), as suggested. In any case, we also included the same IHC characterization of the few and small intrahepatic lesions that have bypassed the intravascular antimetastatic barrier (Fig S4C,D). Indeed, in agreement with the results observed in primary intracecal lesions, the metastatic lesions that developed in IFNα-treated mice showed markers of cell proliferation, neoangiogenesis, F4/80 macrophages and CD3+ T cells similar to those of control lesions detected in NaCl-treated mice. Once again, the results highlight the possibility that CRC tumors, once established as micro/macroscopic metastases, may become refractory and resistant to IFNα therapy by downregulating Ifnar1 in various components of the tumor microenvironment (Boukhaled et al., 2021; Katlinski et al., 2017). Text has been added in the revised manuscript at lines 175-197 and in the discussion lines 496-515.

      1. Authors analyse the recombination efficiency of different mouse CRE lines by non-quantitative methods (PCR of hepatic genomic DNA and GFP expression by immunofluorescence in healthy liver). Since PDGFRβ-Cre/ERT2 and CD11c-Cre lines are used to exclude a role of IFNa on the targeted cells, authors should provide stronger evidence to support this. They may consider studying the ablation of Ifnar1 in FACS-sorted fibroblasts and myeloid cells. Moreover, it would be important to show the proportion of GFP+ cells in the sorted populations to understand how broadly these stromal populations are targeted.

      We thank the referee for raising this important issue, which is related to the relative efficiency of Ifnar1 recombination in each of the Cre-expressing mouse models we have used in the study. In this regard, we newly performed an extensive colocalization analysis quantifying the percentage of GFP+ cells that colocalize with cell-specific markers (i.e., PDGFRβ, CD11c, F4/80 and CD31) of the various mouse models (PDGFRβCreERT2, CD11cCre and VeCadCreERT2, respectively) crossed with RosaZsGreen reporter mice. Colocalization analysis of GFP in the different systems was performed using the ImageJ “colocalization” algorithm developed by Pierre Bourdoncle (Institut Jacques Monod, Service Imagerie, Paris; 2003–2004). The method allows the generation of unsupervised profiles of co-localized pixels between two channels. This methodology has been included in the section Methods and Protocols, line 806-809. Of note, we observed an almost complete recombination in liver fibroblasts (GFP+/PDGFRβ+), with about 98.2 ± 0.72% of hepatic stellate cells co-expressing GFP and PDGFRβ signals (see the new Fig S5E). Similarly, hepatic DCs (GFP+/CD11c+) had 94.17 ± 2.16% colocalization, while F4/80+ KCs or LCMs (GFP+/F4/80+) colocalized in 78.14 ± 5.03% (see the new Fig S5E). Finally, HECs, including LSECs, (GFP+/CD31+) showed 85.3 ± 5.03% colocalization (see the new Fig S5E,F), with no expression of GFP signals in cells other than CD31+. Note that these values indicate an almost complete colocalization of the Cre recombinase in the target cell types analyzed (see representative IF shown in Fig S5E). Text has been added in the revised manuscript at lines 225-233.
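As a rough illustration of what a pixel-based colocalization percentage measures, here is a minimal threshold-based sketch. It is not the ImageJ plugin cited above, whose internal algorithm is not described in this response; the function name, the fixed thresholds, the choice of marker-positive pixels as the denominator, and the toy intensity values are all assumptions made for illustration.

```python
from typing import Sequence


def colocalization_percent(
    marker: Sequence[Sequence[float]],
    reporter: Sequence[Sequence[float]],
    marker_threshold: float,
    reporter_threshold: float,
) -> float:
    """Percent of marker-positive pixels that are also reporter-positive."""
    if len(marker) != len(reporter):
        raise ValueError("channels must have the same number of rows")
    marker_positive = 0
    double_positive = 0
    for marker_row, reporter_row in zip(marker, reporter):
        if len(marker_row) != len(reporter_row):
            raise ValueError("channels must have the same row lengths")
        for m, r in zip(marker_row, reporter_row):
            if m > marker_threshold:
                marker_positive += 1
                if r > reporter_threshold:
                    double_positive += 1
    if marker_positive == 0:
        return float("nan")  # no marker-positive pixels to normalize by
    return 100.0 * double_positive / marker_positive


# Toy 2x3 intensity images (hypothetical values, not real microscopy data).
marker_channel = [[200, 180, 10], [190, 5, 8]]
gfp_channel = [[150, 160, 12], [20, 140, 9]]

pct = colocalization_percent(marker_channel, gfp_channel, 100, 100)
print(f"{pct:.1f}% of marker-positive pixels are GFP-positive")
```

In practice the thresholds would be set per image (for example by an automatic method) rather than fixed, and the result depends on which channel is used as the denominator.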
Moreover, DEGs analysis between NaCl-treated VeCadIfnar1_KO and Ifnar1fl/fl HECs showed a significant downregulation of Ifnar1 expression in CD31+ VeCadIfnar1_KO cells, with a log2 fold-change of -0.387 and an adjusted p-value of 0.033, further confirming Cre recombination in HECs isolated from VeCadIfnar1_KO mice (as depicted in the heatmap of Fig 6B; the 12th gene of the Type I IFN response is Ifnar1). We have prepared all source images at higher dimension to better appreciate the colocalization within liver microvasculature. In addition, we performed several flow cytometry analyses to identify liver cell populations of Cre-recombinant mice that express Ifnar1. Unfortunately, the predicted low cellular surface expression of this molecule coupled with the experimental conditions needed to extract viable non-parenchymal cells from the liver have prevented us from obtaining informative results.
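For readers less used to log2 units, the reported log2 fold-change of -0.387 can be converted to a linear ratio. The short sketch below does only that arithmetic; the variable names are illustrative, and the conversion assumes the fold-change is expressed as knockout relative to control, as the sentence above implies.

```python
# Convert the reported log2 fold-change for Ifnar1 into a linear ratio.
log2_fold_change = -0.387  # value reported in the response above

# A log2 fold-change x corresponds to a linear ratio of 2**x.
linear_ratio = 2.0 ** log2_fold_change

# Express the same quantity as a percent change relative to control.
percent_change = (linear_ratio - 1.0) * 100.0

print(f"linear ratio: {linear_ratio:.3f}")
print(f"percent change vs control: {percent_change:.1f}%")
```

On a linear scale this corresponds to Ifnar1 expression in the sorted CD31+ population at roughly three quarters of the control level.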

      1. Ifnar1 ablation in VeCad+ cells prevents the effect of IFNa on tumour growth (Fig. 4d), suggesting the existence of anti-tumour mechanisms beyond the effects on hepatic colonization. Authors may consider checking proliferation, vascularization and immune infiltration in these tumours to enhance their conclusion.

      We fully agree with the referee’s concern and, as mentioned above, we have followed the suggestion and examined the existence of antitumor mechanisms beyond the effects on hepatic colonization in VeCadIfnar1_KO mice treated with NaCl or IFNα. To this end, 4 NaCl-Ifnar1fl/fl, 7 IFNα-Ifnar1fl/fl, 4 NaCl-VeCadIfnar1_KO and 4 IFNα-VeCadIfnar1_KO mice were intrasplenically injected with MC38 CRC cells (Fig S7A,B). Twenty-one days after injection, mice were euthanized and their livers analyzed for tumor size, proliferation, signs of angiogenesis (as denoted by CD34 staining) and immune infiltration (F4/80+ macrophages and CD3+ T cells). Consistent with data presented in Fig 4D, histological analysis showed that IFNα-treated Ifnar1fl/fl mice did not develop liver metastases. Furthermore, metastatic lesions detected in VeCadIfnar1_KO mice treated or not with IFNα did not show significant differences in Ki67 positivity, CD34 staining or the amount of F4/80+ resident macrophages and CD3+ T cells. This further supports the view that the antimetastatic potential of IFNα therapy may primarily depend on the inhibition of hepatic trans-sinusoidal migration, a limiting step in the metastatic cascade that could secondarily influence colonization and outgrowth (Chambers et al, 2002). Corresponding text has been added at lines 248-252.

      1. Immune properties of LSECs are analysed in vivo by using a mouse CRE line that targets all endothelial cells, including those located in lymphoid organs, and evaluating T cell composition in the spleen. I found it difficult to conclude that these properties are exerted directly by LSECs and not by other endothelial cells in vivo. To clarify the local effect of LSECs in modulating anti-tumour immunity, T cell composition and activation should be checked in tumours shortly after tamoxifen administration.

      We thank the reviewer for pointing out this issue, which cannot be tested directly because - as also mentioned by reviewer 2 - LSEC-specific Cre-recombinant driver mice do not exist. As also indicated in the cited literature, central memory T cells accumulate after peripheral priming in secondary lymphoid organs such as the spleen (Sallusto et al, 2004; Stone et al, 2009; Yu et al, 2019). To this end, the generation and regulation of antitumor immunity is a highly orchestrated multistep process involving the uptake of tumor-associated antigens by professional APCs, their time-consuming migration to draining lymph nodes and the generation of protective T cells. Unlike other APCs, HECs/LSECs do not need to migrate to draining lymph nodes to activate effector T cells, leading to a rapid intrahepatic CD8+ T cell activation. In this context, LSECs must not only efficiently take up, process and present CRC-derived antigens coming from intravascularly contained tumor cells, but they also require the attraction and retention within the liver microvasculature of T cell populations necessary for the generation of effective antitumor immune responses, where chemokines play an important role (Lalor et al, 2002). As shown in Fig 6A-C, two prominent chemokines (Cxcl10 and Cxcl9) required for T cell recruitment to the liver are specifically upregulated only in HECs/LSECs from IFNα-treated Ifnar1fl/fl mice, whereas HECs from VeCadIfnar1_KO mice maintained low expression of these chemoattractants in both NaCl- and IFNα-treated mice.
These data are also consistent with the in vitro cross-priming results (see Fig 7A,B) showing that in the absence of IFNα, HECs have a low capacity to prime naïve T cells (Katz et al, 2004), indicating that LSECs primed by tumor-derived antigens coming from apoptotic intravascular CRC metastatic cells play an important role in inducing tolerance (Berg et al, 2006; Katz et al., 2004), especially when CRC cells quickly extravasate and position within the space of Disse, likely becoming less accessible to intravascular patrolling by naïve and effector T cells (Benechet et al, 2019; Guidotti et al, 2015). On the contrary, in IFNα-treated Ifnar1fl/fl mice, CRC cells are rapidly contained in the liver microvasculature (Fig 5A,B) with CRC-derived antigens that could be immediately taken up by LSECs due to their anatomical proximity and efficient endocytosis capacity, which is among the highest of all cell types in the body (Sorensen, 2020). Here, the continuous sensing of IFNα by LSECs upregulates several genes related to antigen processing and presentation pathways (Fig. 6B,D), leading to efficient cross-priming of tumor-specific CD8+ T cells to the same extent as professional APCs, such as splenic DCs (Fig 7B). Text has been added in the revised manuscript at lines 496-515. Finally, regarding the suggestion to analyze the role of HECs/LSECs in inducing antitumor T cell immunity shortly after tamoxifen administration, while we agree that it would be interesting to analyze HEC/LSEC-mediated T cell activation by treating NaCl- and IFNα-treated Ifnar1fl/fl and VeCadIfnar1_KO mice with tamoxifen after CRC cell injection, we would like to point out that tamoxifen treatment will not only induce Cre recombination and Ifnar1 loss on endothelial cells but may also induce several “off-target” effects complicating the interpretation of the results.
Indeed, tamoxifen is known to i) inhibit the in vitro proliferation of several CRC cell lines (Ziv et al, 1994), ii) impair the growth of CRC liver metastases in vivo (Kuruppu et al, 1998) and iii) modify matrix stiffness to reduce tumor cell survival (Cortes et al, 2019). Further, as IFNα modifies the hepatic vascular barrier and the accessibility of antigens by LSECs, the specific timing of tamoxifen treatment could also affect the immunological consequences of Ifnar1 deletion, making this experiment impractical. For these reasons, we would prefer not to perform the suggested experiment with tamoxifen.

      Reviewer #1 (Significance):

      The conclusions of this study are consistent with previously published literature and the biological insights are potentially useful to the cancer biology community.

      Reviewer #2 (Evidence, reproducibility and clarity):

      In this study Dr. Sitia's group investigated the effect of IFNα1 as perioperative agent preventing liver metastasis formation of colorectal carcinoma (CRC). To this end, various mouse models were used such as liver colonization models, i.e. intrasplenic and mesenterial injections of MC38 and CT26 CRC cell lines. Besides, spontaneous metastasis of CRC was analyzed by orthotopic injection of MC38 into the cecum. To study the influence of IFNα1 in these settings mini-osmotic pumps releasing IFNα1 were used. Moreover, conditional mouse models with a cell-type specific deficiency of Ifnar1 were compared. Altogether, the application of IFNα1 led to a reduction in liver colonization of CRC in all models studied. This was ascribed to decreased trans-sinusoidal migration of CRC and increased cross-priming by LSEC, resulting in T cell activation.

      Major comments:

      Overall the study is well performed and the major conclusions seem to be drawn well. However, there are certain points I like to address:

      • First, the authors started their experiments with MC38 and CT26 CRC cell lines. At the end they just applied MC38. The rationale behind this should be clearly stated. Second, as in their previous publication (Catarinella et al, 2016) F1 hybrids of C57BL/6 x BALB/c mice were used for the experiments. However, I believe that the genetic heterogeneity might be strongly increased by this approach which might lead to difficult reproducibility of the results.

      We thank the referee for raising this important issue; additional text describing the reason for our choice has been introduced at lines 203-205. We respectfully disagree with the comment that CB6F1 hybrids may increase genetic heterogeneity and impair reproducibility of our results. Each CB6F1 hybrid individual is genetically identical to its littermates, sharing 50% of genes of each parental mouse line and being tolerant to reciprocal MHC-I genes (thus permitting the correct engraftment of both cell lines). We agree that the use of mismatched backcrosses after the F1 generation would increase genetic heterogeneity and thus may affect outcome. This is also the reason why we could not perform experiments with CT26 in the Ifnar1fl/fl conditional lines, which are in the C57BL/6 background and would have needed at least 10 generations of backcrossing into the BALB/c background before being suitable for such experiments. Finally, all experiments described in Fig 4, 5, 6 and 7 were performed in C57BL/6 mice using MC38 CRC cells, with results that reproduced those obtained in CB6F1 hybrids and were very similar to what we have previously reported with MC38 in C57BL/6 mice (see Fig 5 (Catarinella et al., 2016)).

      • At page 16 the authors conclude that "patients suffering from chronic liver fibrotic disease... display lower incidence of hepatic metastases". In the community there is contradictory data (see Kondo et al, BJC, 2016, https://www.nature.com/articles/bjc2016155). This should be precisely discussed, otherwise this claim should be removed.

      We thank the referee for raising this issue and modified the discussion accordingly. Text has been added in the revised manuscript at lines 455-457.

      We agree with the reviewer's suggestion and added new text to recognize the interplay between different cell types such as dendritic cells within the hepatic niche (see new text at lines 505-515).

      • Last, multiple times the authors write about data that is "not shown". Please either include these data in the manuscript or delete corresponding phrases because it is not possible for the reader to scrutinize it.

      We fully agree with the referee’s concern and now display all “not shown” results in Fig S1E and Fig S9C-I.

      • Besides, I suggest additional experiments further substantiating the study:
      • To see if this effect of IFNα1 is cell type-specific liver metastasis of other solid tumors such as breast cancer or melanoma should be investigated.

      We agree with the reviewer's suggestion, as also indicated in our original discussion. We believe that additional experiments with other solid tumor cell lines would be important to generalize the potential of perioperative IFNα therapy. In particular, we believe that pancreatic ductal adenocarcinoma (PDAC), a highly lethal disease that most commonly metastasizes to the liver (Lambert et al, 2017), may benefit from our approach. It should be noted, however, that the pleiotropic nature of IFNα allows this cytokine to inhibit tumor growth by several mechanisms. Above all, the ability of IFNα therapy to directly reduce tumor growth depends on the relative surface expression of Ifnar1 on each tumor cell and the ability to maintain such expression in the harsh tumor microenvironment during IFNα therapy. As the degradation of Ifnar1 by CRC tumors has been well described (Katlinski et al., 2017), it is possible that CRC tumors thus escaping the antitumor properties of endogenous type I interferons may respond less efficiently to therapeutic IFNα regimens such as those herein described. This notion is consistent with our data on primary orthotopic tumors (Fig. 3D,E), which are no longer responsive to continuous IFNα therapy as early as 7 days after implantation of CT26LM3 cells. In addition, the definition of the HEC/LSEC antimetastatic barrier has been possible only because CRC cells are not directly susceptible to the IFNα antiproliferative activity, which we observed in vitro at extremely high IFNα dosages (Catarinella et al., 2016) but not in vivo (as formally demonstrated by using MC38Ifnar_ko cells, Fig 4A). At any rate, we followed the reviewer’s suggestion and performed an additional experiment in which we intramesenterically injected the PDAC cell line Panc02 (H-2b, C57BL/6-derived) (Soares et al, 2014) into C57BL/6 mice 7 days after initiation of NaCl or IFNα therapy.
As shown below, MRI analysis at day 21 showed that none of the IFNα-treated Panc02-challenged mice developed metastatic lesions, while NaCl controls displayed a high metastatic burden that required euthanasia of about 67% of these mice for ethical reasons shortly after MRI analysis. These data indicate that perioperative IFNα therapy completely curbs metastatic development in IFNα-treated PDAC animals. The notion that these cells may be more IFNα-susceptible than CRCs may well depend on the relative capacity of the former cells to maintain Ifnar1 expression, as suggested by others (Zhu et al, 2014). Properly addressing the reviewer’s comment would thus require extensive investigations involving the establishment of new mouse models of metastases from other solid tumors, starting from the in vitro and in vivo regulation of surface Ifnar1 expression in each tumor cell. We strongly believe that this work has merit but we think that it should be reported separately.

      • The authors applied a broad range of cell type-specific mice. However, a thorough characterization of the deletion of Ifnar1 in the corresponding cell types is missing. This is crucial for the manuscript.

      We fully agree with the referee’s concern and as previously mentioned, we have improved the characterization of Ifnar1 deletion (see response to the same critique received from reviewer 1, comment 3).

      • The capillarization of the hepatic vascular niche is a crucial point in this story. I believe that the hepatic endothelium should be further characterized by additional vascular markers.

      In response to the reviewer’s suggestion, we have included in our analysis the characterization of Lyve-1, a marker of hepatic capillarization (Pandey et al, 2020; Wohlfeil et al, 2019). Indeed, IFNα treatment of Ifnar1fl/fl mice significantly increased the expression of Lyve-1, whereas IFNα treatment of VeCadIfnar1_KO mice showed no effect (Fig S9A,B), further corroborating our findings. Text has been added in the revised manuscript at lines 291-294. To better aid readers, we have prepared high-resolution images for each IF channel and have provided these data as source data for Fig S9A.

      • Last, the data and methods appear adequately presented and experiments seem to be reproducible. Just in Figure 4 the exact number of mice and replicates are not clearly presented. Otherwise, everything is fine.

      We thank the reviewer for raising this issue, which apparently was not properly described in our original submission. We have now included the exact number of mice in each experimental group in the figure legend to Fig 4.

      Minor comments:

      Overall the text and figures are accurately presented. However, I would like to add further minor comments:

      • In Fig. 1 you present the IFNα dosing regimen. How do you explain the decrease in serum IFNα after day 2? Besides, the data points at day 0 should be excluded since measuring started from day 2! Why did you decide to treat for seven days until the start of the experiment? One could think 2 days might already be enough.

      We thank the reviewer for raising these important points. Regarding the pharmacokinetic-pharmacodynamic (PK-PD) behavior of our approach, we do not believe that MOP reduced its pumping efficacy after day 2 (Theeuwes & Yum, 1976), nor that counterregulatory mechanisms, such as the induction of anti-IFNα blocking antibodies, occurred in such a short time frame (Wang et al, 2001). Nor is it plausible that IFNα treatment significantly downregulated Ifnar1 in the liver (as demonstrated by pSTAT1 activation after MOP treatment in Fig S1E). Rather, our results reflect the PK-PD behavior of other long-lasting formulations of IFNα, which depend on intrinsic pharmacological properties of IFNα already described in (Jeon et al, 2013). Text has been added in the revised manuscript at lines 110-112. We also corrected the figures in which we quantified serum IFNα. Indeed, blood was drawn one day before MOP implantation rather than on the same day of surgery to avoid additional blood loss, which could be a source of unnecessary stress for the animals. Therefore, we corrected the results section and Fig S1A-C and Fig 1A,B. The decision to start treatment 7 days rather than 2 days before seeding was made for several reasons: i) this study follows our previous gene/cell therapy approach, in which the time interval between reconstitution of the transduced bone marrow with Tie2-IFNα and tumor challenge was at least 7-8 weeks. We therefore thought that 7 days might be a sufficient/necessary time period to induce similar phenotypes in the liver after continuous IFNα administration; ii) 7 days is a time frame compatible with the perioperative period in humans (Horowitz et al, 2015). Furthermore, the side effects that patients may experience after IFNα therapy are generally limited to the first few days after administration, allowing patients to benefit from IFNα-induced vascular antimetastatic barriers at the time of surgery without potential side effects of IFNα.
Because oncologic guidelines recommend starting adjuvant chemotherapy at least 4 weeks after surgery in stage 2-3 CRC patients at risk of later developing liver metastases (Engstrand et al, 2019; van Gestel et al, 2014), our proposed perioperative time frame does not even conflict with these indications (Van Cutsem et al, 2016). We have included additional text at lines 131-132 to motivate the timing of our regimens.

      • Fig. 2: Did you check for metastases in other organs than the liver at the timepoint of euthanization, e.g. lungs. In the discussion section you talk about a potential influence of IFNα1 on other organs. Therefore, I think that the mice should be thoroughly analyzed and the data presented. The manuscript will benefit from it.

      We thank the reviewer for this valuable comment. Indeed, we always check for dissemination of CRC metastases on MRI analysis and necropsy. As stated at lines 146-147 and 158, CRC tumors seeded in the liver vasculature after colonizing the liver do not spread to other organs such as the lungs. Indeed, CRC cells intravascularly seeded in the portal circulation are trapped at the beginning of hepatic sinusoids because their diameter is bigger than that of liver sinusoids (Fig S8A,B). These micro-anatomic peculiarities are also thought to impede the spreading of tumor cells from periportal to centrilobular areas and to the general circulation (Catarinella et al., 2016; Vidal-Vanaclocha, 2008), and this is consistent with studies showing that in CRC patients undergoing surgery the majority of CRC-derived circulating tumor cells are found in the portal vein (Deneve et al, 2013).

      • Overall, MRI pictures and pictures of IHC or IF are sometimes too small to see. Please provide pictures with larger magnification or enlarge the images.

      We thank you for this suggestion and we have indeed increased the size of all MRI, IHC, and IF images to the maximum that will fit within the figure. In addition, we presented the images at the highest magnification available, without making digital enlargements that would significantly reduce resolution.

      • Fig. 3 F, G: immune cell infiltration in the liver was analyzed. Please compare it to untreated, tumor-free wildtype liver tissue.

      We appreciate the reviewer's suggestion and have included the results of six sham mice for each marker in our analysis. Text was added to the figure legends of Fig 3H and Fig S4B,D.

      • Fig. 6: the graphs are too small to be read, especially the volcano plot and the gene names of the heatmap.

      We increased the font size of genes in the volcano plots and heatmap in Fig 6A,B, as suggested.

      • Fig. S6: Pictures of co-immunofluorescences are presented. For the reader it is really hard to distinguish the stainings and to identify colocalized areas. Please provide pictures with one channel to better compare the marker expression.

      We thank the reviewer for pointing this out and we have tried to make each panel as large as possible to fit into a two-column figure. We have also prepared high magnification images of each channel for all immunofluorescence images, which we provide as source data. We hope that this is sufficient to help readers to interpret our results without increasing the number of main or supplementary figures.

      • From page 8 onwards (section about transgenic mice) LSEC was used as kind of synonym for hepatic endothelial cells. Since there is still no LSEC-specific driver mouse, it should be stated "hepatic endothelial cells" instead.

      We agree with this suggestion and thus have indicated that the results refer to HECs but include a large majority of LSECs. Indeed, LSECs make up the majority (~89%) of the total HEC population (Su et al, 2021). In addition, some SEM and TEM analyses were performed only on LSECs, as well as the IF analyses. Therefore, we believe that LSECs play an important role in this process. Although not specifically suggested, we have also changed the title of our manuscript to reflect the reviewer's suggestion. Thus, we propose "Continuous sensing of IFNα by hepatic endothelial cells shapes a vascular antimetastatic barrier" as new title.

      • P. 11: there is a typo: Fig. Fig. S6G,H

      We corrected this typo.

      • P. 13: the authors describe Gata4 as inhibitor of subendothelial matrix deposition. This should be precisely written, since Gata4 originally is described as master-regulator of liver sinusoidal differentiation which leads to liver fibrosis development upon loss of Gata4. Besides, I came across a study of the same group that investigated the role of Notch signaling in hepatic CRC and melanoma metastasis (Wohlfeil et al, Cancer Res, 2019, https://aacrjournals.org/cancerres/article/79/3/598/638600/Hepatic-Endothelial-Notch-Activation-Protects). Similar to your study they tie the reduction in hepatic metastasis to capillarization of the hepatic microvasculature.

      We agree with this suggestion and modified the text accordingly. We are also glad that our results agree with previously reported literature, which has now been correctly cited at lines 351-356 and in the discussion at lines 474-476.

      • The discussion reads like paraphrasing the results section. The manuscript would clearly benefit if the discussion section had been rewritten short and concisely.

      We agree with this suggestion, and we have modified the discussion accordingly. We are also willing to shorten the discussion by removing the schematic model, which could possibly be used as a graphical abstract.

      References

      Benechet AP, De Simone G, Di Lucia P, Cilenti F, Barbiera G, Le Bert N, Fumagalli V, Lusito E, Moalli F, Bianchessi V et al (2019) Dynamics and genomic landscape of CD8(+) T cells undergoing hepatic priming. Nature 574: 200-205

      Berg M, Wingender G, Djandji D, Hegenbarth S, Momburg F, Hammerling G, Limmer A, Knolle P (2006) Cross-presentation of antigens from apoptotic tumor cells by liver sinusoidal endothelial cells leads to tumor-specific CD8+ T cell tolerance. Eur J Immunol 36: 2960-2970

      Boukhaled GM, Harding S, Brooks DG (2021) Opposing Roles of Type I Interferons in Cancer Immunity. Annu Rev Pathol 16: 167-198

      Catarinella M, Monestiroli A, Escobar G, Fiocchi A, Tran NL, Aiolfi R, Marra P, Esposito A, Cipriani F, Aldrighetti L et al (2016) IFNalpha gene/cell therapy curbs colorectal cancer colonization of the liver by acting on the hepatic microenvironment. EMBO Mol Med 8: 155-170

      Chambers AF, Groom AC, MacDonald IC (2002) Dissemination and growth of cancer cells in metastatic sites. Nat Rev Cancer 2: 563-572

      Cortes E, Lachowski D, Robinson B, Sarper M, Teppo JS, Thorpe SD, Lieberthal TJ, Iwamoto K, Lee DA, Okada-Hatakeyama M et al (2019) Tamoxifen mechanically reprograms the tumor microenvironment via HIF-1A and reduces cancer cell survival. EMBO Rep 20

      Deneve E, Riethdorf S, Ramos J, Nocca D, Coffy A, Daures JP, Maudelonde T, Fabre JM, Pantel K, Alix-Panabieres C (2013) Capture of viable circulating tumor cells in the liver of colorectal cancer patients. Clin Chem 59: 1384-1392

      Engstrand J, Stromberg C, Nilsson H, Freedman J, Jonas E (2019) Synchronous and metachronous liver metastases in patients with colorectal cancer-towards a clinically relevant definition. World J Surg Oncol 17: 228

      Guidotti LG, Inverso D, Sironi L, Di Lucia P, Fioravanti J, Ganzer L, Fiocchi A, Vacca M, Aiolfi R, Sammicheli S et al (2015) Immunosurveillance of the liver by intravascular effector CD8(+) T cells. Cell 161: 486-500

      Horowitz M, Neeman E, Sharon E, Ben-Eliyahu S (2015) Exploiting the critical perioperative period to improve long-term cancer outcomes. Nature reviews Clinical oncology 12: 213-226

      Jeon S, Juhn JH, Han S, Lee J, Hong T, Paek J, Yim DS (2013) Saturable human neopterin response to interferon-alpha assessed by a pharmacokinetic-pharmacodynamic model. Journal of translational medicine 11: 240

      Katlinski KV, Gui J, Katlinskaya YV, Ortiz A, Chakraborty R, Bhattacharya S, Carbone CJ, Beiting DP, Girondo MA, Peck AR et al (2017) Inactivation of Interferon Receptor Promotes the Establishment of Immune Privileged Tumor Microenvironment. Cancer cell 31: 194-207

      Katz SC, Pillarisetty VG, Bleier JI, Shah AB, DeMatteo RP (2004) Liver sinusoidal endothelial cells are insufficient to activate T cells. Journal of immunology 173: 230-235

      Kuruppu D, Christophi C, Bertram JF, O'Brien PE (1998) Tamoxifen inhibits colorectal cancer metastases in the liver: a study in a murine model. Journal of gastroenterology and hepatology 13: 521-527

      Lalor PF, Shields P, Grant A, Adams DH (2002) Recruitment of lymphocytes to the human liver. Immunol Cell Biol 80: 52-64

      Lambert AW, Pattabiraman DR, Weinberg RA (2017) Emerging Biological Principles of Metastasis. Cell 168: 670-691

      Pandey E, Nour AS, Harris EN (2020) Prominent Receptors of Liver Sinusoidal Endothelial Cells in Liver Homeostasis and Disease. Front Physiol 11: 873

      Sallusto F, Geginat J, Lanzavecchia A (2004) Central memory and effector memory T cell subsets: function, generation, and maintenance. Annu Rev Immunol 22: 745-763

      Schreiber G (2017) The molecular basis for differential type I interferon signaling. J Biol Chem 292: 7285-7294

      Soares KC, Foley K, Olino K, Leubner A, Mayo SC, Jain A, Jaffee E, Schulick RD, Yoshimura K, Edil B et al (2014) A preclinical murine model of hepatic metastases. J Vis Exp: 51677

      Sorensen KK, Smedsrod B (2020) The Liver Sinusoidal Endothelial Cell: Basic Biology and Pathobiology. In: The Liver: Biology and Pathobiology, Sixth Edition pp. 422-434. John Wiley & Sons Ltd.

      Stone JD, Chervin AS, Kranz DM (2009) T-cell receptor binding affinities and kinetics: impact on T-cell activity and specificity. Immunology 126: 165-176

      Su T, Yang Y, Lai S, Jeong J, Jung Y, McConnell M, Utsumi T, Iwakiri Y (2021) Single-Cell Transcriptomics Reveals Zone-Specific Alterations of Liver Sinusoidal Endothelial Cells in Cirrhosis. Cell Mol Gastroenterol Hepatol 11: 1139-1161

      Theeuwes F, Yum SI (1976) Principles of the design and operation of generic osmotic pumps for the delivery of semisolid or liquid drug formulations. Ann Biomed Eng 4: 343-353

      Van Cutsem E, Cervantes A, Adam R, Sobrero A, Van Krieken JH, Aderka D, Aranda Aguilar E, Bardelli A, Benson A, Bodoky G et al (2016) ESMO consensus guidelines for the management of patients with metastatic colorectal cancer. Ann Oncol 27: 1386-1422

      van Gestel YR, de Hingh IH, van Herk-Sukel MP, van Erning FN, Beerepoot LV, Wijsman JH, Slooter GD, Rutten HJ, Creemers GJ, Lemmens VE (2014) Patterns of metachronous metastases after curative treatment of colorectal cancer. Cancer Epidemiol 38: 448-454

      Vidal-Vanaclocha F (2008) The prometastatic microenvironment of the liver. Cancer microenvironment : official journal of the International Cancer Microenvironment Society 1: 113-129

      Wang DS, Ohdo S, Koyanagi S, Takane H, Aramaki H, Yukawa E, Higuchi S (2001) Effect of dosing schedule on pharmacokinetics of alpha interferon and anti-alpha interferon neutralizing antibody in mice. Antimicrob Agents Chemother 45: 176-180

      Wohlfeil SA, Hafele V, Dietsch B, Schledzewski K, Winkler M, Zierow J, Leibing T, Mohammadi MM, Heineke J, Sticht C et al (2019) Hepatic Endothelial Notch Activation Protects against Liver Metastasis by Regulating Endothelial-Tumor Cell Adhesion Independent of Angiocrine Signaling. Cancer research 79: 598-610

      Yu X, Chen L, Liu J, Dai B, Xu G, Shen G, Luo Q, Zhang Z (2019) Immune modulation of liver sinusoidal endothelial cells by melittin nanoparticles suppresses liver metastasis. Nat Commun 10: 574

      Zhu Y, Karakhanova S, Huang X, Deng SP, Werner J, Bazhin AV (2014) Influence of interferon-alpha on the expression of the cancer stem cell markers in pancreatic carcinoma cells. Exp Cell Res 324: 146-156

      Ziv Y, Gupta MK, Milsom JW, Vladisavljevic A, Brand M, Fazio VW (1994) The effect of tamoxifen and fenretinimide on human colorectal cancer cell lines in vitro. Anticancer Res 14: 2005-2009

      Reviewer #2 (Significance):

      • Since liver metastases of various tumors are tremendously hard to treat and mediate therapy resistance, the authors focus on a very important field of research - prevention of liver metastasis formation.
      • This study adds insights into the mechanisms of action of IFNα1 in the hepatic microenvironment. It extends previous findings of Toyoshima who described anti-tumoral effects of IFNα1 released by dendritic cells in the liver.
      • The study is well designed and will be of great interest for the scientific community. Besides, it will be appreciated by physicians. However, as mentioned in the discussion, further clinical studies by physicians are needed to translate its findings into the clinic.
      • The author of this review works as physician and often deals with liver metastasis. It is one field of focus of her/his research.
    1. Author response


      • A comment on the overall organization of the paper. Figure 2 has a major location in the paper, but it seems that its main takeaway is that these MAPs aren't really involved in the main process this paper is probing. While these are important findings, it might be more satisfying to move some of the central results earlier.

      We agree that this figure displays mostly negative results. However, most work on anaphase B microtubule dynamics from our group and others has focused on the effect that motors and MAPs may have on microtubule dynamics (EB1 and kinesin-8 in budding yeast, klp9 in fission yeast). Therefore, we consider it important to clearly show that previously proposed candidates are not required for the observed decrease in microtubule growth speed, prior to introducing the unexpected effect of the membrane.

      • A model schematic might drive home the main finding of the paper, and be particularly useful for readers who are not experts in microtubule or spindle dynamics. That said, the Discussion does an excellent job of summarizing the findings and explaining the takeaway message(s), even for the non-expert.

      We have added a model schematic and we have referred to it in the main text.

      Specific comments

      • ‘In higher eukaryotes’ - Suggest avoiding the terms higher and lower when describing organisms, and instead, directly defining which organisms, for instance in animals/metazoans that would be a better description.

      We have removed this terminology.

      • Figure 1 E-F - It is hard to see the difference in the distribution, maybe a different color could be used instead of stars.

      We have used a different color.

      • Figure 1 Data shown in pink in G comes from 832 midzone length measurements during anaphase, from 60 cells in 10 independent experiments - The pink here does not correspond to the pink coding in D, consider colour choice for clarity across panels.

      We have changed this.

      • Finally, yeasts undergo closed mitosis - How does this relate to the findings in the Dey paper (cited here) which shows it was somewhat semi-closed or semi-open. According to the Dey paper, the membrane disassembles locally twice, at the SPB and the bridge.

      Membrane disassembly at the nuclear membrane bridge occurs at late anaphase, and leads to the disassembly of the spindle, presumably by the action of cytoplasmic factors (Dey et al. 2020). We do not believe the membrane disassembly itself has a role in spindle elongation or microtubule dynamics, because the spindle is disassembled when it happens. However, the fact that les1D reduces the decrease in microtubule growth speed associated with internalisation of microtubules in the nuclear membrane bridge suggests that the organisation of the nuclear membrane bridge required for its local disassembly at late anaphase might affect microtubule growth (see section “Formation of Les1 stalks […]”).

      • ‘vertical comets in kymographs (Fig. 1C) do not correspond to non-growing microtubules, but rather microtubules that grow at a speed matching the sliding speed’- For clarity, it might be nice to add: "(as the SPB moves away from the plus end in the kymograph)".

      We have included this useful clarification.

      • ‘significantly shorter than in interphase, where growth events last more than 120 seconds on average [42, 43]. Microtubule shrinking speed did not change during anaphase either (Fig. 1-Supplement 1D), and was on average 3.56±1.75 μm/min, also lower than in interphase (~8 min/μm)’ - This comment concerns the comparison of growth and shrinking rate as well as growth duration. The authors did not measure microtubule dynamics in interphase in this manuscript but compared their numbers to literature values. The comparison raises some questions for three reasons: 1) the microscopy method used is different in this paper and the two references provided, 2) the sample is mounted differently compared to the two references provided - 1) and 2) combined could lead to different levels of stress on the cells which could affect MT dynamics-, 3) (probably the most important caveat) the experiments are done at different temperatures: 27C in this paper versus 25C in the references provided. Microtubule dynamics are sensitive to temperature so this could explain part of the differences observed. Also, there are multiple values published for MT dynamics in interphase depending on the strain used and the microscopy method used. Suggest that the authors measure microtubule dynamics in interphase cells at 27C in SIM to ensure that the differences are not due to the technical parameters employed. Small item - should ‘8 min/μm’ read “8 μm/min"?

      We have measured microtubule growth speed and growth event duration using GFP-Mal3 during interphase and anaphase B in the same conditions as proposed (see Figure 1 – Supplement 2). Unfortunately, shrinkage speed cannot be measured using GFP-Mal3, so we cannot confirm that the difference between our measurements and the literature values would be observed.

      • ‘we observed two populations of microtubules (fast and slow growing)’ - Does this statement about the fast and slow growing populations refer to the data in Fig. 1C and 2A?

      Yes, we have added a reference to these figures in the next sentence (mentioned below).

      • ‘In some cells, all microtubules seemed to switch to the slow growing phase simultaneously (Fig. 1C), while in others fast and slow growing microtubules co-existed (Fig. 2A)’ - This is a very interesting observation, could we know how many cells (%) were detected in each case? Is it that in 90% of the cells the switch is simultaneous, and hence the microtubule growth is somehow synchronized? Or is it more random, e.g. around 50%?

      This was just to point the reader to two kymographs and show that a clear point where all microtubules change speed is not present in all kymographs, as one may think from Fig. 1C. Later in the paper, we show that the change in growth depends on whether the microtubule rescue occurs inside or outside the nuclear membrane bridge, so it is a matter of where microtubules are rescued once the dumbbell transition occurs, which is a stochastic process. We have added another sentence pointing the reader to examples in the kymograph (see line 152, This representation captures…).

      • ‘On such a plot, the data points visibly cluster in two separate clouds and the variation of growth speeds can be fitted by an error function (Fig. 1F)’ - It is unclear that there are two distinct clusters, maybe the assertion should be toned down, or some sort of cluster analysis provided.

      We acknowledge that the data is widely spread across the y axis and that, because the quantity “distance to the closest pole at rescue” is continuous, the transition is not clear-cut. However, we consider the fact that the averaged curve closely matches the error function fit to be sufficient evidence for the existence of two populations of microtubule growth. Additionally, the R² of the fit is ~0.5, indicating that half of the variance is explained by this model. In any case, we show later that these two populations do exist (Fig. 3D), and why plotting microtubule growth against distance to the closest pole at rescue is a good way to segregate them (Fig. 3E).
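To make the quantities in this reply concrete, here is a minimal sketch of an error-function step model and of R² as "fraction of variance explained". It is not the authors' fitting code. The function names, the slow speed (0.7 μm/min), the transition position (3 μm), the width (1 μm) and the noiseless data are all hypothetical; only the fast speed of 1.6 μm/min is taken from the text.

```python
import math

def erf_step(d, v_fast, v_slow, d0, width):
    """Smooth step in growth speed: ~v_fast for d << d0, ~v_slow for d >> d0."""
    return v_slow + (v_fast - v_slow) * 0.5 * (1.0 - math.erf((d - d0) / width))

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def fit_transition(distances, speeds, v_fast, v_slow, width, candidates):
    """Grid search for the transition position d0 that minimises squared error."""
    def ss_res(d0):
        return sum((s - erf_step(d, v_fast, v_slow, d0, width)) ** 2
                   for d, s in zip(distances, speeds))
    return min(candidates, key=ss_res)

# Hypothetical, noiseless data: distance to the closest pole at rescue (um)
# and growth speed (um/min). Real data would be far more scattered.
distances = [i / 4 for i in range(25)]            # 0 to 6 um
speeds = [erf_step(d, 1.6, 0.7, 3.0, 1.0) for d in distances]

candidates = [i / 10 for i in range(61)]          # trial d0 values, 0 to 6 um
d0_hat = fit_transition(distances, speeds, 1.6, 0.7, 1.0, candidates)
fitted = [erf_step(d, 1.6, 0.7, d0_hat, 1.0) for d in distances]
print(d0_hat, r_squared(speeds, fitted))
```

With scattered measurements the same `r_squared` call would return a value below 1; a value near 0.5 means the step model accounts for about half of the spread in growth speed, which is the sense in which the reply uses it.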

      • ‘speed of interphase microtubules (~2.3 μm/min)’ - It would be interesting to see the dynamics in a les1 mutant (Dey Nature 2020) paper. Just as a control for presence/absence of the bridge?

      We thank the reviewers for kindly suggesting this interesting experiment. We have included it after the ase1 section. Les1 forms stalks at the edges of the nuclear membrane bridge that restrict nuclear membrane disassembly to the center of the bridge at the end of mitosis (Dey et al. 2020). While les1 deletion does not prevent the formation of the nuclear membrane bridge, it has been proposed that Les1 stalks may constitute sites of close interaction between the nuclear membrane and the spindle. Therefore, these sites may influence microtubule growth. Indeed, we have found that removing these Les1 stalks by either deleting les1 or nem1 leads to a smaller decrease in microtubule growth speed when plus ends enter the nuclear membrane bridge (see section “Formation of Les1 stalks […]”).

      • ‘Figure 2, Transition from fast to slow microtubule growth occurs in the absence of known anaphase MAPs’ - It looks like the overlap zone is larger on the mal3 kymograph. Is the size of the midzone changed in some of the mutants? It could be important to report. Related to it, is the spindle length changed in some of the mutants? (It does not look like it from the kymographs displayed).

      The midzone is indeed longer in mal3D strains; this can now be seen in Fig. 2 – Supp. 2 and is mentioned in the main text in line 272. As for the spindle length, diverse kinds of alterations in spindle length have been previously reported for the mutants that we used in this study. For instance, ase1D / cls1off cells have shorter spindles at anaphase onset (Loiodice et al. 2005 and data not shown), and klp5Dklp6D have longer spindles at anaphase onset (Syrovatkina et al. 2013). klp9D / clp1D / dis1D cells have lower spindle elongation velocity and may not reach the wild-type spindle length by the end of anaphase (Kruger et al. 2019). Despite these differences, the decrease in microtubule growth as a function of distance to the closest pole has a similar tendency across conditions, suggesting that the mentioned differences in spindle length are unlikely to have an important effect.

      • Additionally, adding the data about rescue localization in the mutant (equivalent of Fig 1 G) would be interesting to better describe the role of these different proteins. Figure 2, Panel G to L - Could the authors indicate the value for the average +/- error in each bin for the WT and the mutants? Also, it is hard to say from the plots, but it looks like the WT average speed in the first bin is different in every panel, that would be good to know to have an idea of the reproducibility/variability.

      We have added a figure with the rescue distribution (see Fig. 2 – Supp. 2). This apparent difference in the wt speed in different experiments might have come from looking at normalised data. The new way of representing the data in fig. 2H and J shows that the microtubule growth velocity in the wild-type is very consistent across experiments. We have added a table with microtubule growth velocity values (Table 1), and the source data is available.

      • The dots making up the "thick lines" are centered on 1.5/2.5/etc.. in some panels (G and K) and centered on 1/2/3/etc.. the others (I,J,L). Could the authors provide some clarification?

      We have fixed this inconsistency across the paper.

      • Figure 3 - Can the authors indicate the average values +/- error for each of the distributions in Fig. 3D? Maybe on the plot itself, in the legend or as a table. This would make them easily available without having to infer them from the Y axis. This comment is also valid for Fig 4I and 4J.

      We have added tables with average values and confidence intervals in the appendix.

      • Figure 3E ‘Distance from the plus-end to the nuclear membrane bridge edge at rescue as a function of distance from the plus-end to the closest pole at rescue’ - The Y axis reads as "distance to the bridge edge" but it shows negative values, could this be "position to the bridge edge" instead? (same item throughout the text).

      We have fixed this.

      • Figure 3 ‘Number of events: 442 (30 cells) wt, 260 (27 cells) klp9OE, 401 (35 cells) cdc25-22, from 3 independent experiments’ - P values this small raise a concern. Presumably the number of degrees of freedom in the regression analysis should not exceed the number of independent experiments. Instead, the DoF listed under "error" in the analysis output is hundreds or thousands instead of 3. To address this, the regression analysis should use either the "Error" function in R or a linear mixed-effects model to account for the nesting of the repeated measurements within each independent experiment. Alternatively, it is also possible to just calculate summary means for each independent experiment, and calculate p values based on that N=3. See: Lazic. Experimental Design for Laboratory Biologists. p. 157. and the supplemental file of: https://doi.org/10.1371/journal.pbio.2005282 and the additional file 1 of: https://doi.org/10.1186/s12868-015-0228-5 and this for an alternative plotting approach: https://doi.org/10.1083/jcb.202001064 Recommend either recalculating the p values by one of the methods above or removing the reported p values from the paper. The large effects observed in many cases are self-evident without a significance metric, so eliminating the p values would be acceptable here. (This comment applies to other figures through the paper that report p values based on number of cells or number of measurements instead of number of independent samples/experiments.)

      We thank the reviewers for suggesting the improvements to the statistical analysis, as well as for pointing us to useful resources that described the statistical methods and their implementation in detail. We have followed Aarts et al. 2015 and used a linear mixed effects model (see Methods>Statistical Analysis).

      Due to the change in statistical analysis method, to show that some of the differences we had reported previously were significant, we included more cells in the analysis from our existing data. We did this for klp5Dklp6D kymographs (Fig. 2I and Fig. 2 – Supp. 1), spindle dynamics in ase1D (Fig. 5D and Fig. 5 – Supp. 1) and klp9D (Fig. 2 – Supp. 3 A, C), and cell length (Fig. 3 – Supp. 1A).

      For the same reason, we measured anaphase spindle elongation velocity (Fig. 3 – Supp. 1C) from kymographs instead of measuring it from the 1 minute interval movies that we had used previously (from Fig. 3 – Supp 1B). We have reflected this in the methods (see added text in line 800 and deleted text in line 809 in the document with changes highlighted).

      None of these changes has altered our conclusions.
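The nesting issue raised above can be illustrated with the simpler alternative the reviewers mention: collapse repeated measurements to one mean per independent experiment, then compare conditions using those means only. The sketch below shows that alternative, not the linear mixed-effects model the authors actually used. The condition labels, experiment numbers and values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

def experiment_means(records):
    """Collapse (condition, experiment_id, value) records to one mean per experiment."""
    grouped = defaultdict(list)
    for condition, experiment, value in records:
        grouped[(condition, experiment)].append(value)
    per_condition = defaultdict(list)
    for (condition, _experiment), values in sorted(grouped.items()):
        per_condition[condition].append(mean(values))
    return dict(per_condition)

def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t, df

# Hypothetical growth speeds (um/min): two measurements per experiment,
# three independent experiments per condition.
records = [
    ("wt", 1, 1.5), ("wt", 1, 1.7), ("wt", 2, 1.6),
    ("wt", 2, 1.4), ("wt", 3, 1.8), ("wt", 3, 1.6),
    ("mutant", 1, 1.0), ("mutant", 1, 1.2), ("mutant", 2, 0.9),
    ("mutant", 2, 1.1), ("mutant", 3, 1.3), ("mutant", 3, 1.1),
]

means = experiment_means(records)
t_stat, dof = welch_t(means["wt"], means["mutant"])
print(means, t_stat, dof)
```

The point of the design is that the degrees of freedom now reflect the three experiments per condition (at most 4 here) rather than the hundreds of individual measurements, which is what keeps the resulting p values from being artificially small.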

      • Figure 4 - Nice experiment. It brings the question of how cell-shape affects all these dynamics (probably out of the scope of this work). But a for3 mutant for example?

      This is an interesting suggestion, to be tested in the future. Furthermore, we believe that nuclear shape should also have an important effect, since the spindle is confined inside the nuclear membrane. We would expect that mutants that perturb nuclear shape might have effects on microtubule growth. We have observed that the decrease in growth speed associated with internalisation of microtubules in the nuclear membrane bridge is reduced upon nem1 deletion, which increases nuclear membrane surface, and produces membrane ruffling (Fig. 4-Supplement 2). However, nem1 deletion also removes les1 stalks from the nuclear bridge (Dey et al. 2020). It would be interesting to find a perturbation of the nuclear membrane that does not remove the les1 stalks.

      • ‘Ase1 is required for microtubule growth speed to decrease during anaphase B, this is unlikely to be a direct effect’ - If it is unlikely to be a direct Ase1 effect is the title of the section accurate? "Ase1 is required for normal rescue distribution and for microtubule growth speed to decrease in anaphase B"

      Ase1 recruits multiple proteins to the spindle midzone, so the fact that ase1 deletion produces a given phenotype does not necessarily mean that this phenotype results from the absence of Ase1 protein activity. For instance, deleting ase1 perturbs rescue distribution, but it does not mean that Ase1 acts as a rescue factor itself, or at least to a relevant extent, given that deletion of cls1 completely prevents rescue, but ase1 deletion does not. In the discussion we propose some indirect effects of ase1 deletion that may produce this effect. In any case, upon more careful analysis we have found that ase1 deletion does not prevent the decrease in microtubule growth speed during anaphase B, but rather makes it smaller (see section “The decrease in growth speed associated with internalisation of microtubules in the nuclear membrane bridge is reduced upon ase1 deletion”).

      • Figure 5 - What about an ase1 lem1 double mutant?

      We suppose that the intended gene is les1. We have studied the effects of les1 deletion in the new version of the manuscript. However, we do not see what additional information a double deletion ase1D les1D would provide.

      • ‘In summary, Ase1 is required for rescue organisation and for microtubule growth speed to decrease during anaphase B ‘- In this context it could make sense to discuss the observations from this paper (doi:10.1371/journal.pone.0056808) about the role of Ase1 ortholog's MAP65-1 in coordinating MT dynamics within bundles.

      In the mentioned paper, the authors showed that the presence of PRC1 (ase1 orthologue) in bundles increases microtubule rescue rate, and that it slightly reduces microtubule growth speed.

      We observe a small increase in microtubule growth speed throughout anaphase upon ase1 deletion (Fig. 5), which is consistent with the in vitro observation that PRC1 decreases microtubule growth. However, once more this might not be a direct effect of Ase1, since less Cls1 is recruited if ase1 is deleted, and Cls1 reduces microtubule growth speed (Fig. 2). In addition, this can also be a result of a higher concentration of tubulin / MAPs resulting from less polymerised tubulin in ase1 deleted cells, which have fewer spindle microtubules on average.

      Regarding the increase in rescue rate produced by PRC1 in vitro, it is possible that Ase1 contributes to microtubule rescue in the spindle. However, given that no rescues occur upon inactivation of cls1 (Bratman et al. 2007), we believe Cls1 is the dominant factor, and Ase1 contribution is likely negligible.

      • ‘We initially set the microtubule growth velocity to 1.6 μm/min (early anaphase speed, Fig. 1F), and aimed to reproduce the experimental distribution of positions of rescue and catastrophe at early anaphase (spindle length < 6 μm’ - Kudos to the authors for detailing the model and its parameters in a way that even non-modelling experts can understand.

      • Discussion - ‘Our data suggests that microtubule growth speed is mainly governed by spatial cues’ - Is it right to assume that in the cases where fast and slow growing microtubules were simultaneously observed, the fast microtubules were not/had not yet reached the midzone?

      Our data suggests that it’s not about being inside the midzone, but rather inside the nuclear membrane bridge formed after the dumbbell transition. We have elaborated more on this in the main text, pointing the reader to examples in the kymograph, and giving a quantitative argument for distance to the closest pole being a better predictor than anaphase progression or position with respect to the center (which is equivalent to distance to the midzone), see line 152.

      • Methods - ‘PIFOC module (perfect image focus), and sCMOS camera’ - Is this Nikon's "Perfect Focus" autofocus, or some other manufacturer's system? And back-thinned sCMOS.

      We have clarified this in the Methods section.

    1. You may find this book in the “self-improvement” category, but in a deeper sense it is the opposite of self-improvement. It is about optimizing a system outside yourself, a system not subject to your limitations and constraints, leaving you happily unoptimized and free to roam, to wonder, to wander toward whatever makes you feel alive here and now in each moment.

      Some may categorize handbooks on note taking within the productivity space as "self-help" or "self-improvement", but still view it as something that happens outside of one's self. Doesn't improving one's environment as a means of improving things for oneself count as self-improvement?

      Marie Kondo's minimalism techniques are all external to the body, but are wholly geared towards creating internal happiness.

      Because your external circumstances are important to your internal mental state, external environment and decoration can be considered self-improvement.


      Could note taking be considered exbodied cognition? Vannevar Bush framed the Memex as a means of showing associative trails. (Let's be honest, As We May Think used the word trail far too much.)

      How does this relate to orality vs. literacy?

      Orality requires the immediate mental work for storage while literacy removes some of the work by making the effort external and potentially giving it additional longevity.

    1. Author Response

      Reviewer #1 (Public Review):

      1) In terms of the prior hypothesis here I think the authors justify a prior with respect to striatum and I think the most principled analysis of their hypothesis would be based on volumes of interest in striatum. Figure 1 does show difference in MTsat in striatum between neurotypicals and DLDs but the changes are all in the caudate I think- I cannot see anything in putamen. The authors actually describe changes in only one part of anterior caudate. The authors do describe a number of previous conflicting studies that examine caudate structural changes but that is not their hypothesis. The discussion goes into developmental changes affecting striatum at different times that might be relevant and would require a longitudinal study for a definitive study - as the authors acknowledge.

      The reviewer is correct that at this statistical threshold we only observe MTsat differences in the caudate nucleus. Changes in the putamen did not survive this threshold. Lowering the threshold for MTsat (our maps are openly available on Neurovault), or an ROI analysis (see https://osf.io/2ba57/) does not reveal significant statistical differences in the putamen. As we noted in the paper, there are differences in the putamen in R1 (these are also observed in the ROI analysis).

      2) There is a lot of overlap between the caudate signal in the two groups - although the correlation of individual differences is reasonable. The caudate signal would not allow group classification.

      Yes, it is clear that these differences would not be sufficient to allow for group classification of DLD. We have discussed this overlap in the discussion.

      3) Outside of the caudate they do show changes in left IFG and auditory cortex that are hypothesised. But there is a lot else going on - I was struck by occipital changes in figure 1 which are only mentioned once in the manuscript.

      We now discuss these differences in the discussion. Note that we did not have any a priori hypotheses about these regions; to our knowledge, they have not been previously described and are not predicted by any theoretical accounts of DLD.

      4) Should I be concerned by i) apparent signal changes in right anterior lateral ventricle from group comparison in figure 1 ii) signal change correlation in right anterior lateral ventricle in figure 4 (slice 22) and iii) signal change outside the pial surface of the occipital lobe in figure 1?

      No – these may be accounted for by smoothing during analyses. Note, these changes at tissue boundaries are fairly commonly seen in statistical maps following smoothing but are not evident when data are projected onto a 3D surface.

      Reviewer #2 (Public Review):

      This work demonstrates the value that multiparameter mapping imaging protocols can have in uncovering microstructural neural differences in populations with atypical development. Previous studies looking at differences in brain structure have typically used voxel based morphometry (VBM) approaches where differences in volumes can be hard to interpret due to complex tissue compositions. The imaging protocol outlined in this paper can specifically index different tissue properties e.g. myelin, giving a much more sensitive and interpretable measure of structural brain differences. This paper applies this methodology to a population of adolescents with developmental language disorder (DLD). Previous evidence of structural brain differences in DLD is very inconsistent and, indeed, using traditional VBM the authors do not find a difference between children with DLD and those with typical language development. However, they provide convincing evidence that despite no macrostructural differences, children with DLD show clear differences in levels of myelin in the dorsal striatum and in brain regions in the wider speech and language network. This can help to reconcile previous inconsistent findings and provide a useful springboard for both theoretical and empirical work uncovering the nature of the brain bases of language disorders.

      We are grateful for these comments, and to the reviewer for pointing out some key strengths of this work.

      Strengths:

      The imaging protocol is robust and is explained very clearly by the authors. It has been used before in other populations so is an established method but has not been applied to populations of children with DLD before, yielding novel and very interesting results. The authors demonstrate that this is a methodology which could have great value in other populations that display atypical development, increasing the impact of these findings.

      The sample size is large for research in this area which increases confidence in the results and the conclusions.

      Rather than relying solely on group differences in brain microstructure to draw conclusions about neural bases of language development, the authors correlated brain microstructural measures with performance on standardised language tests, allowing stronger inferences to be drawn about the relationships between structure and function. This is often an important omission from developmental neuroimaging work. It gave increased confidence in the finding that alterations in striatal myelin are linked to language difficulties.

      Weaknesses:

      The authors rightly use the CATALISE definition of developmental language disorder, which differs from much of the previous literature by not requiring that children with language difficulties have nonverbal ability that is in the normal range. As can be common when using this definition of DLD, the group with DLD have significantly weaker nonverbal ability than the typically developing group. The authors show that brain microstructural differences correlate with language ability but they don't rule out a correlation with nonverbal or wider cognitive skills. Given the widespread differences in myelination across areas of the brain, including those that weren't predicted e.g. medial temporal lobe, it is plausible that perhaps some of the brain microstructural differences are not linked directly to language impairment but a broader constellation of difficulties. Some of the arguments in the paper would be strengthened if this interpretation could be ruled out.

      To rule out the effect of nonverbal IQ or wider cognitive differences, we have conducted stepwise regression analyses on the quantitative data extracted from the statistical cluster covering the caudate nuclei, assessing the influence of factors such as language proficiency, verbal memory and IQ. We find that language status accounts for the most variance, rather than nonverbal IQ or verbal memory (details are included in the paper).

      We also discuss this point in the discussion, pointing to the presence of co-occurring differences in DLD and how these might account for some of the broader group differences we observe.

      The authors acknowledge in the limitations section that their data cannot speak to whether brain differences are a cause or consequence of language impairment. However, there are some implied assumptions throughout the discussion of the results that brain differences in myelination have functional consequences for language learning. A correlation between structure and function does not indicate this level of causality, particularly in an adolescent population - function could just as easily have had structural consequences or environmental differences could have influenced both structure and function. In my view, the speculations about functional consequences of myelin differences are not fully supported by the data collected.

      The reviewer is correct in saying that the myelin deficit could be either a cause or a consequence of DLD or even that both are caused by a third factor. We specifically address this in the discussion section, and note a longitudinal analysis would be the best way to address this question. Indeed, R3 notes about our paper, “…it does a very good job of avoiding the common trope of assuming neural differences play a causal role in DLD (when in fact, reduced atypical development could cause neural differences)”.

      The data suggest that there is much greater variability in left caudate nucleus MTsat values for the DLD group than the other two groups. The impact this may have on the results is not discussed in the interpretation and it is unclear whether this greater variability occurs throughout all of the key MPM measures for the DLD group.

      Thank you for raising this important issue. In figure 1, we only plot the MTsat values from the caudate nucleus for visualisation, and as you note, there is a considerable degree of variability within the DLD group. However, and crucially, this difference would not influence statistical interpretation of our results. The whole-brain analysis used involves permutation testing, and is robust to a difference in group variability. However, the issue of variability within DLD is important and we now highlight this in our discussion, noting that not every child with DLD will have reduced striatal myelin. Indeed, this variability is even more evident in figure 4. An important challenge for future studies is to understand the link between striatal myelination and the spectrum of language variability.
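For readers unfamiliar with permutation testing, the following is a minimal single-variable sketch of the idea: group labels are shuffled many times and the observed group difference is compared with the shuffled differences. It is not the authors' whole-brain pipeline, which this reply does not detail. The function name, the group values and the number of permutations are hypothetical.

```python
import random

def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p value for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = mean_diff(pooled)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)              # reassign group labels at random
        if mean_diff(pooled) >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p value is never exactly 0.
    return (extreme + 1) / (n_perm + 1)

# Hypothetical values: the second group is lower on average and more variable.
typical = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97, 1.00]
variable = [0.60, 0.95, 0.70, 0.85, 0.55, 0.90, 0.65, 0.80]
p_value = permutation_test(typical, variable)
print(p_value)
```

The null distribution here is built from the data themselves under the assumption that labels are exchangeable, so no normality assumption is made about either group.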

      Reviewer #3 (Public Review):

      Developmental Language Disorder (DLD) is observed in children who struggle to learn and use oral language despite no obvious cause. It is extremely wide-spread affecting 7-10% of children, and extremely consequential as it persists throughout life and has downstream effects on reading, academic outcomes, and career success. A large number of prior studies have attempted to identify the structural neural differences that are associated with DLD. These have generally shown mixed results, but support a number of candidate regions including left hemisphere language areas (particularly the inferior frontal gyrus), and striatal regions that are possibly linked to learning. However, these studies have suffered from small sample sizes and conflicting results. Part of this may be their reliance on traditional voxel-based-morphometric techniques which estimate cortical thickness and gray matter density. The authors argue that these measures are biologically imprecise; gray matter can be thinner for example, due to synaptic pruning or increased myelination.

      The authors of this study offer a powerful new tool for understanding these differences. Multi-Parameter Mapping (MPM) is based on standard MRI techniques but offers several measures with much greater biological precision that can be tied specifically to myelination, a key marker of efficient neural transmission. They test a very large number of children (>150) with and without DLD using MPM and show strong evidence for fundamental biological differences in these children.

      This study features a number of key strengths. First, at the level of neuro-imaging, the MPM technique is new in this population and offers fundamental insight that cannot be obtained by other measures. Indeed, the authors wisely use a traditional gray matter approach (voxel based morphometry) and find few if any differences between children with DLD and typical development. This offers a powerful proof of the sensitivity of this approach. Moreover, the authors analyze their data comprehensively, looking at two measures of myelin (MTsat and R1) and their convergence.

      However, at the most important level, I think structural approaches (like MPM, diffusion weighted imaging and so forth) offer tremendous promise for dealing with this as they avoid the ambiguity associated with interpreting functional MRI. Are children showing reduced BOLD because they are less good at language processing? Or do the differences in brain function cause poorer language processing? Structural approaches - and MPM in particular - offer tremendous promise as they unambiguously assess the fundamental neuro-biology.

      Beyond the neuro-imaging this study is also strong in their sample and the measurements of language. The sample size is very large and an order of magnitude larger than existing studies. It is well characterized, and the authors use a large set of well-motivated measures that capture the relevant dimensionality of language. Moreover, the authors treat language both as a clinical category and a continuous measure which is consistent with current thinking on the nature of DLD as potentially the low end of a continuous scale rather than a discrete disorder.

      Finally, the discussion of this paper for the most part does a good job of fitting these neurobiological findings into our broader understanding of DLD. It does an excellent job of mapping the observed brain differences onto functional differences in the child. Importantly, in doing this it does a very good job of avoiding the common trope of assuming neural differences play a causal role in DLD (when in fact, reduced atypical development could cause neural differences).

      We are very grateful to the reviewer for taking the time to read our work so closely and pointing out these strengths in the work.

      Despite these strengths, I have a number of substantive concerns that if addressed will improve the overall impact of this paper.

      First, as the authors are aware, there is a long running and active debate in DLD as to whether DLD is the tail end of continuous distribution of children or a unique disorder (Leonard, 1987, 1991; Tomblin, 2011; Tomblin & Zhang, 1999). The results here offer great promise for informing that debate. And in that vein the authors quite appropriately analyze their data in two ways: once using DLD as a categorical variable and once using continuous measures of language. However, they don't really attempt to wrestle with the differences between the models.

      We have now included a section on the implications of our results for DLD in the discussion.

      Second, I was a little surprised to see the authors highlight left IFG in the discussion to the degree they did. While there was clear evidence for reduced myelin there in the MTsat analysis, this did not hold up in R1 analysis, and even in the MTsat, IFG was clearly not the primary locus. Rather the areas of differences seemed to be centered at Pre- and Post-Central gyrus and extending ventrally (to IFG) and posteriorly from there. Given debate on the role of IFG in language specific processing in general (Diachek, Blank, Siegelman, Affourtit, & Fedorenko, 2020; Fedorenko, Duncan, & Kanwisher, 2013), it was not immediately clear to me why that area was important to highlight. For example, some of the posterior temporal areas (and motor areas) that were found were equally important for perceptual, lexical and phonological processing that are important for other theories of DLD.

      We do see group differences in left IFG in the R1 analysis (see Figure 2) and they were more extensive than those seen in the MTsat analysis with which they overlapped. The reviewer is correct that the differences were limited to the opercular part of the IFG in both analyses whereas they extended more dorsally in the R1 analysis. They also extended ventrally to the anterior insular cortex. We respectfully disagree with the reviewer about the importance of highlighting these differences, given the importance of this region for language processing, and our previous hypotheses about this region. Even so, we agree that the posterior temporal and motor areas are of equal importance and have highlighted these in the discussion.

      The authors rightly point to their differences in the striatum as supporting theories of DLD centered around differences in learning. However, as they discuss, there are also large differences throughout the brain in perceptual, motor and language areas. These would seem to support theories of DLD centered around processing and representation. In particular, the differences in myelination are likely linked to differences in the efficiency of neural coding. This would seem to favor two theoretical views that might be worth mentioning - speed of processing (Miller, Kail, Leonard, & Tomblin, 2001), and approaches based on lexical processing (McMurray, Klein-Packard, & Tomblin, 2019; McMurray, Samelson, Lee, & Tomblin, 2010; Nation, 2014). I was surprised these were not mentioned, given the clear link to the timecourse of processing. Does this then suggest that these theories might complement each other? It would be useful to see some more discussion of the implications of these findings for broader theories.

      We have now incorporated mention of these theories in the discussion and discuss implications. We agree with the reviewer that it would be interesting to see whether the different theories could be reconciled.

    1. Reviewer #2 (Public Review):

      Suvorov and colleagues present a well-supported genome-scale phylogeny for 149 Drosophila species based on thousands of single-copy-orthologs. They then use several approaches to estimate the extent of introgression across the phylogeny, and report that it is common both recently and deeper in the past.

      The main strength of this paper is that it uses a scale of sequencing that allows an assessment of genus-wide trends with reasonably good power. It also presents two new analysis approaches, but these represent fairly minor modifications of existing techniques to suit multiple gene alignments, and unfortunately their reliability is not evaluated in this paper. Nevertheless, the main finding that introgression is common appears to be well supported. This finding echoes those of similar recent studies on taxa such as cichlid fishes and Heliconius butterflies. The different approaches used, and different levels of sampling in these different studies do not allow for quantitative comparisons, leaving us with the somewhat vague conclusion that introgression is 'common' in all of these taxa. Perhaps most critically, the present paper does not delve any deeper into the evolutionary impacts of introgression, nor the factors at the species or genomic level that might determine its frequency. Below I describe some areas of concern in more detail.

      1. Extent of introgression

      Perhaps equally as interesting as the frequency of introgression per species across the phylogeny is the proportion of the genome of each species that is affected. Without such estimates, the full extent of introgression is difficult to assess.

      2. Sampling effects

      Since this paper is attempting to make an (admittedly crude) estimate of the extent of introgression in the entire genus, some discussion is needed to address the possible consequences of the fact that only around 10% of species in the genus are represented. For example, if sampling is very even, perhaps most ancient events would be detectable, but more recent events may tend to be missed simply because the species involved are not sampled.

      3. Ancestral structure

      The reasoning provided for dismissing the possible effect of ancestral population structure is unconvincing. First, the authors argue that it "seems less likely" that non-sister taxa would have bred more frequently in the ancestral population. However, this is the entire basis of the problem: it might be unlikely, but it can happen. Eriksson and Manica (2012 https://doi.org/10.1073/pnas.1200567109) provided a very reasonable scenario in which colonisation of a new region can lead to this pattern.

      Second, the authors argue that QuIBL "should not be impacted by ancestral structure because this method searches for evidence of a mixture of coalescence times: one older time consistent with ILS and one time that is more recent than the split in the true species tree and that therefore cannot be explained by ancestral structure." This argument needs clarification. My understanding is that the split in the "true species tree" would also be inflated if there was ancestral structure.

      My view is that ancestral structure leading to discordance between gene trees and species trees is itself an interesting phenomenon. In some ways, it is not conceptually distinct from introgression occurring soon "after" speciation if we consider ancestral structure as the beginning of a continuous speciation process, so I don't think it would weaken the paper to accept this as a possible contributing process.

      4. Discordant count test

      The statistical analysis in the DCT accounts for multiple testing of many triplets for introgression, but there is no mention of the fact that these triplets are non-independent. It is not clear to me whether this makes the correction used more or less conservative.

      If there are any cases where the internal branch is long and the number of ILS gene trees is very small or zero, use of a chi-squared test may not be appropriate.
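
      To illustrate the concern about small counts, the following is a minimal sketch of an exact alternative, assuming the test reduces to asking whether the two discordant topologies of a triplet occur equally often (as expected under ILS alone). It is not the authors' implementation; the function names and the tie-handling tolerance are hypothetical choices made for this example.

```python
from math import exp, lgamma, log


def exact_binomial_two_sided(k, n, p=0.5):
    """Two-sided exact binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k (the usual 'minimum likelihood' rule).
    """
    if n == 0:
        return 1.0  # no discordant trees at all: no evidence either way

    def log_pmf(i):
        # log of C(n, i) * p^i * (1 - p)^(n - i), computed in log space
        # so that large gene-tree counts do not overflow
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))

    pmf = [exp(log_pmf(i)) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def discordant_count_p(n_discordant_1, n_discordant_2):
    """P-value for asymmetry between the two discordant topology counts."""
    total = n_discordant_1 + n_discordant_2
    return exact_binomial_two_sided(n_discordant_1, total)
```

      Unlike a chi-squared approximation, this calculation remains valid when one or both discordant counts are very small or zero. It does not address the non-independence of triplets raised above.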

      5. Branch length test

      The authors acknowledge that the BLT is "conceptually similar" to that of Hahn and Hibbins 2019 https://doi.org/10.1093/molbev/msz178, but to me it seems that the only material difference is the statistical procedure for testing for a significant difference between branch lengths.

      An important consideration that appears to have been ignored is whether selection can impact the distribution of branch lengths, especially since many of the BUSCO genes used here will be under strong selective constraint.

      6. Intra-locus recombination

      The paper needs to address the possible impact of intra-locus recombination on all of the introgression tests. For the DCT, I imagine that counts would be biased toward the species tree topology if the inferred trees span multiple distinct genealogies (see for example simulations by Martin and Van Belleghem 2017 https://doi.org/10.1534/genetics.116.194720 Figure S7). This might reduce test sensitivity.

      Similarly, for the BLT, I would expect that true introgression would be more difficult to detect in the presence of recombination. It is possible that the block jackknife procedure of Hahn and Hibbins (2019, https://doi.org/10.1093/molbev/msz178) may be more suitable than the comparison of distributions of point estimates for genes used here.
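
      For concreteness, the block jackknife idea can be sketched as follows. This is a simplified illustration, not the procedure of Hahn and Hibbins or of the authors: it assumes per-locus pairwise distances are available as two equal-length lists ordered along the genome, uses equal-sized blocks, and adopts a D3-style statistic whose sign convention is an assumption here. All function names are hypothetical.

```python
from math import sqrt


def d3_statistic(d_ac, d_bc):
    """D3-style statistic: normalised difference between total B-C and A-C distances."""
    total_ac, total_bc = sum(d_ac), sum(d_bc)
    return (total_bc - total_ac) / (total_bc + total_ac)


def block_jackknife_d3(d_ac, d_bc, n_blocks):
    """Return (estimate, standard_error) from a delete-one-block jackknife.

    Loci are split into n_blocks consecutive, equal-sized blocks; any
    trailing loci that do not fill a complete block are dropped.
    """
    if len(d_ac) != len(d_bc):
        raise ValueError("distance lists must have the same length")
    size = len(d_ac) // n_blocks
    if n_blocks < 2 or size < 1:
        raise ValueError("need at least 2 blocks with at least 1 locus each")

    used = size * n_blocks
    d_ac, d_bc = list(d_ac[:used]), list(d_bc[:used])
    estimate = d3_statistic(d_ac, d_bc)

    # Recompute the statistic with each block left out in turn
    leave_one_out = []
    for b in range(n_blocks):
        lo, hi = b * size, (b + 1) * size
        leave_one_out.append(
            d3_statistic(d_ac[:lo] + d_ac[hi:], d_bc[:lo] + d_bc[hi:])
        )

    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out
    )
    return estimate, sqrt(variance)
```

      Dividing the estimate by its standard error gives an approximate Z-score. Because whole blocks of neighbouring loci are removed together, the standard error is less sensitive to correlation between nearby genealogies than a comparison of per-gene point estimates.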

    2. Reviewer #3 (Public Review):

      The authors compiled a collection of published and newly sequenced genomes to assemble the largest collection of Drosophila genomes to date. Using this dataset they extracted a set of single copy orthologs to use for phylogenomic analyses, with a focus on estimating a time-calibrated phylogeny and introgression.

      This new dataset is a valuable resource that will serve the broader community of Drosophila researchers opening many new avenues for future phylogenomics research. The workflow of focusing on BUSCO genes for all comparative analyses is simple in a good way -- it is easy to understand how the data were collected and it should be easily reproducible -- which makes it easy to read past the genomics details and focus on the analyses of these data.

      However, I feel this is an important aspect of the paper that should receive more details, perhaps in the supplement. I may have missed it, but I could not find statistics about this ortholog data set. On average, how long is each locus, how many variable sites are there, how many taxa are missing data for any given locus due to paralogy? Do the BUSCO genes include both introns and exons? It is also unclear from the description exactly how the BUSCO genes were extracted from genomes. Are they extracted from the final assembled genomes, or do you perform variant calling after identifying them to call heterozygous sites? If heterozygosity is excluded, how might this impact metrics such as the branch length tests, especially among close relatives? Does it impact node age estimates as well?

      The authors use this dataset to infer phylogenetic relationships among taxa using both ML concatenation (IQtree) and a two-step MSC approach (Astral) which yielded quite similar topologies, and they examined the impact of filtering loci with treeshrink, which had minimal impact. This new topology represents a substantial step forward for understanding the relationships among major Drosophila clades.

      One of the main results of this study is a new set of node age estimates on the tree. For this they estimated branch lengths in mcmctree from a concatenated matrix of 1000 loci in the presence of fossil calibrations. The fossil calibration scheme selected as the best option includes three fossils, one dating the divergence at the split from mosquitos (uniform 195-230Ma) and two ingroup calibrations (U(43,64) and U(15,43)). To me, the credible intervals on node ages seem incredibly narrow. The authors mention this as an improvement compared to earlier studies, but they also mention later that the total amount of sequence data does not greatly impact node dating. So I'm a bit confused why the node ages are expected to be more accurate here. It seems to me that time calibrations should be most accurate when the greatest number of fossils are available, and when very appropriate Bayesian priors are set on the analysis. The effect of sequence variation is then relatively small. But here there are very few fossils, one of which is hugely distant, and so I would not expect highly precise age estimates. So I guess my question to the authors is, what do you think is going on here? Perhaps further description in the supplement of how the mcmctree method implemented here differs from traditional node dating done in a program like BEAST would help to clarify.

      Considering that this paper aims to infer the new best time calibrated tree for the Drosophila community, I think that the current description of fossil calibration schemes, which primarily refers to other publication names in the supplement, is insufficient. Which fossils are used in those studies, are you using those fossils as calibrations here, or are you implementing secondary calibrations based on their phylogenetic results? The reader should not have to read every one of those papers to understand the basis of the calibrations in this paper.

      Fig.1 shows nodal age posterior probabilities. Are these 95% confidence intervals? The taxon labels are too small in this figure, both on the large tree and especially in the inset figure. The legend refers to fossil taxon names used for calibrations, but it is still unclear to me where the fossils are placed on the tree. Are the calibrations indicated somewhere in the figure?

      The authors demonstrate evidence of introgression by showing mostly overlapping evidence from two different types of tests. Together, these tests show that most major clades contain significant imbalanced discordance in gene tree counts or branch lengths. The taxon labels in Figure 2 are unfortunately quite unreadable, especially the matrix labels, which makes it difficult to interpret.

      I do not see a reason for presenting new names and acronyms for the introgression tests used in this study. The "DCT" is described as being similar to a suite of existing tests which are also based on comparison of rooted-triplet gene tree frequencies. These methods have been presented in many frameworks (BUCKy, D-stat, f4, etc.) and the only difference here seems to be the precise method used to determine significance. Similarly "the BLT is conceptually similar to the D3 test" could be replaced by just saying we implemented the D3 test which we refer to here as a 'branch length test (BLT)' to clarify that you have not in fact created a new test (e.g., you say "The first method we developed was the discordant-count test...")

      I am not very satisfied with the estimates of the "upper bounds" of introgression used here. It seems that there could possibly be many ways in which admixture edges could be drawn on the tree to explain the matrix of significant test results, and it is better to let formal network inference methods (e.g., SNAQ, Phylonet) infer these edges rather than guess at their placement. The current approach of "placing introgression events between pairs of branches for which most descendant extant taxa show evidence of introgression" leaves significant room for subjectivity.

      The authors did implement phylonet, but not very exhaustively. Why only fit a single edge on the tree instead of multiple? The authors state "networks with more reticulation events would most likely exhibit a better fit to observed patterns of introgression but the biological interpretation of complex networks with multiple reticulations is more challenging". I don't think this type of result is any more complicated to understand than the current approach used by the authors of drawing edges manually. And it is much less subjective. The authors say that it is computationally intractable, and this may be true for clades above ~15 tips, but testing on smaller trees by subsampling 10-12 tips seems feasible. From my experience network inference using pseudo-likelihood methods in SNAQ or phylonet takes a few minutes to fit 1 edge, and a few hours to fit 2-3 edges.

      Currently the two major results of the paper seem disjointed. The authors infer a time-calibrated tree, and they infer introgression events, but there is not much connection between the two. I applaud the authors on one hand for being cautious in interpreting their "upper bounds" of introgression to say too much about when they think introgression has occurred in the context of the time-calibrated tree. I think there is insufficient confidence in the introgression timing estimates to do that. But, what about the inverse relationships? Does this extent of introgression across the tree impact your confidence in the estimated timing of divergence events? One expectation would be that it is biasing all of the divergence times to appear younger. See my suggestions for addressing this.

      Overall, this study presents an impressive new dataset and important new results that greatly impact our understanding of the evolutionary history of Drosophila. Although the estimates of node ages and introgression events may be imperfect, they are clearly a step forward. It is clear from these results that introgression has occurred throughout the history of Drosophila, and this study paves the way for further investigation of these patterns, as the authors propose in their conclusions.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their careful and constructive analysis of our work. Our manuscript aims to exemplify the use of cryo-soft-X-ray tomography (cryoSXT) as a technique to study the dynamic changes to host-cell morphology that accompanies virus infection. This emerging method has several strengths when compared to other ultrastructural analysis techniques. Specifically, cryoSXT does not require the addition of contrast agents and therefore samples can be prepared via plunge cryopreservation alone, allowing us to capture them in a near-native state. Furthermore, the penetrating power of soft X rays and large field of view in cryoSXT allow rapid data acquisition, facilitating quantitative analysis of 10s to 100s of individual cells. We combined high-throughput cryoSXT data collection with semi-automated tomogram segmentation and fluorescence cryo-microscopy to study a recombinant herpes simplex virus (HSV)-1 that produces a pattern of fluorescence indicative of the stage of the infection in a single cell (‘timestamp’ HSV-1) and quantitatively monitored changes in lipid droplet, vesicle and mitochondrial morphology as HSV-1 infection progresses. In response to the reviewers’ comments, we have expanded our analysis of lipid droplet morphology, identifying a transient increase in the size of lipid droplets at early stages of HSV-1 infection, and completed additional fluorescence microscopy analysis to support our statements about the changes to microtubule, mitochondrial and Golgi morphology that accompany infection. Furthermore, we have included additional discussion on the relative merits of cryoSXT versus other ultrastructural analysis techniques like transmission electron microscopy, electron cryo-microscopy and electron cryotomography. We believe that our study serves as a powerful example of how cryoSXT can be used for quantitative cell biology and will be of broad interest to an audience of cell biologists and colleagues who study infection processes.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary

      The authors have performed an explorative study, investigating morphological changes that occur in cells upon infection with Herpes Simplex Virus 1 (HSV-1) by the use of cryo soft X-ray tomography (cryoSXT). cryoSXT is an emerging technique for imaging of biological material, that allows for 3D imaging of significant volumes of cells under near-native conditions, without the need for sectioning or sample preparation other than rapid freezing. Reference (Groen et al. 2019) provides a nice list of examples from various biological samples. By the use of cryoSXT, the authors confirm findings that they have previously published by use of light and expansion microscopy (ref 16 from manuscript), namely an enrichment of small vesicles close to the nucleus and elongation and branching of mitochondria into interconnected networks in infected cells.

      Infection experiments were done in two different cell types in this study (HFF and U2OS), and a timestamp reporter virus that allows early and late stages of infection to be distinguished was used to provide more context to the observed morphological changes in the cells.

      Major comments

      It is a bit difficult to follow the main message throughout the manuscript, as the topics brought up in the introduction, results and discussion sections are not very coherent. The introduction gives some background on the virus and the timestamp reporter system, and further focuses on cryoSXT as a method and how this can overcome sample preparation artefacts that might be introduced by chemical fixation and sample processing. The results do not contain any direct comparisons between cryoSXT and other methods or sample preparations (light microscopy or EM-based), and the discussion only to a small extent comes back to the advantages brought by cryoSXT compared to other methods. Rather the discussion largely revolves around the possible involvement of microtubules in generating the observed morphological changes, and the possible meaning of elongated mitochondria in infected cells. Both of these topics are barely introduced, and not at all experimentally interrogated in the case of microtubules. There is also some discussion about Golgi fragmentation, although this is also not directly interrogated by cryoSXT in the current manuscript.

      We thank the reviewer for these comments. We have:

      - Updated the introduction to enunciate more clearly the aims of our study
      - Included a substantial comparison of the relative merits of cryoSXT versus other ultrastructural analysis techniques (TEM, cryoEM and cryoET) in the discussion
      - Updated the introduction to introduce the concepts of microtubule and mitochondrial morphology changes during infection that are covered in depth in the discussion
      - Included additional microscopy experiments, including super-resolution structured illumination microscopy (SIM), to demonstrate the changes in Golgi (Figures 6 and 7), microtubule (Figure 8) and mitochondrial (Suppl. Figure 4) morphology that accompany HSV-1 infection.

      These additional experiments support the hypotheses presented in the submitted manuscript, namely that microtubule organising centres are disrupted, Golgi membranes dispersed, and mitochondria redistributed as HSV-1 infection progresses.

      The authors perform imaging with a 40nm or a 25nm zone plate, where the 25nm zone plate provides improved resolution of a smaller volume compared to the 40nm zone plate. The authors do not really make use of the improved resolution offered by the 25nm zone plate in the results, so the motivation for turning to this (and therefore also changing cell line) is a bit unclear. The reason for the U2OS cell line being better preserved during X-ray imaging is also not discussed; maybe it has to do with the thickness of the cells (as the U2OS cells are very flat). Furthermore, images from the 25nm zone plate are not compared side by side with either the 40nm zone plate or standard TEM, which makes it hard to judge what the increased resolution really brings.

      Only one zone plate can be installed at any one time in the microscope and altering the zone plates requires extensive hardware changes that are outside the control of beamline users. We agree that this was not clearly discussed in the text. We have included additional text in the results (lines 207–208) and methods (lines 633–638) explaining this operational limitation and clarifying which zone plate was used for which experiment. In this study we observed that tomograms acquired with the 25 nm zone plate did not provide significantly more biological information than with the 40 nm zone plate, and thus both are suitable for characterisation of overarching cellular ultrastructural changes that accompany infection. We have added a sentence to this effect to the discussion (lines 410–412). Like U2OS cells, HFF-hTERT cells are also very flat. They appear more robust compared to HFFs when used for protracted exposures to soft X-rays and less likely to suffer from heat deposition after an extensive data collection round. We can speculate at this point that this could conceivably be due to the particular chemical composition of the intracellular environment in different cell lineages but it is impossible to offer anything other than speculation and therefore we have refrained from commenting further on this in the manuscript.

      The switch from a 40nm to a 25nm zone plate required a switch in the model system, as mentioned above. The chosen cell types are not linked to biological relevance, however (neurons and epithelial cells are mentioned as relevant cell types in the introduction), and it is therefore a bit unclear what the relevance is of keeping results from both cell types and comparing the two, rather than sticking to the one that works with cryoSXT. The results from the U2OS cells could still be compared by LM to the HFF cells if this contributes to the aim of the study.

      U2OS cells were chosen because they have been used previously for studies of HSV-1 infection (references 55–56) and are known to be well suited to cryoSXT analysis (references 32–33). We have added a sentence to this effect to the results (lines 208–211).

      The distribution of the viral proteins of the timestamp reporter virus is used to categorize infected HFF cells into 4 infection stages. In the U2OS cells the protein distribution is a bit different, which only allows them to be categorized into early (stage 1+2) and late (stage 3+4) stages of infection. Although this is what the authors state in the text, all 4 stages are included in Fig.2 for the U2OS cells, so it is not clear how this subdivision is performed and it does not seem like an accurate representation of the data. Furthermore, the uninfected population is not included in the timecourse, and there is not really a gradual change in infection states over the different timepoints as one could have expected. Therefore it is a bit hard to see the relevance of the timecourse. In the paper where the reporter virus is published (ref 16), shorter infection times were used, which leads to a more gradual change in infection stages.

      We thank the reviewer for pointing out these omissions. We have updated Figure 2A to only show the categories early (stage 1+2) and late (stage 3+4) for the U2OS cells. Furthermore, we have repeated the infection time course experiment, quantitating uninfected cells in addition to infected cells and including additional time points (2-, 4- and 6-hours post-infection). This new data (Figure 2B) demonstrates that the temporal profiles of infection progression are similar in HFF-hTERT and U2OS cells. Furthermore, it supports our choice of 9 hours post-infection as a suitable time point for plunge freezing of samples in order to obtain a mixture of cells at early and late stages of infection.

      There is a lot of importance given to the morphological changes of mitochondrial networks in infected cells. However, the quantification represented in Fig.5B is a bit unclear. The mitochondria are classified into different groups, but there is no specific description of the definition and cutoff values of each group. The name of some groups is also confusing, such as "short and long" mitochondria. Furthermore, there are large differences between replicates (suppl. fig. 2). The authors state that some mitochondria are swollen, which they interpret as a sign of apoptosis. They find these swollen mitochondria in 75% of the tomograms of uninfected cells in replicate number 3. If this is indeed cell death this replicate is not healthy.

      We apologise that the categorisation of mitochondria was not sufficiently clear in the submitted manuscript. The categories were percentage of tomograms that had the different mitochondrial morphologies present, not percentages of mitochondria. Thus, tomograms with both short and long mitochondria were classified as “short and long”. We have re-generated Figure 5C and Suppl. Figure 2C as a Venn diagram to illustrate this point more clearly. We have also updated the legend of Figure 5C (lines 845–850) to state clearly that the diagram shows percentage of tomograms with the relevant mitochondrial morphologies. The categorisation was performed manually and we have included examples of each category in Figure 5A. Manual classification can be subjective but, given the large number of tomograms analysed and the clear distinction between morphology in uninfected vs early- and late-stage infected cells, we are confident that our results are robust. We note that we have deposited all of the source tomograms in the Apollo repository at the University of Cambridge (https://doi.org/10.17863/CAM.78593); the data we used for this analysis are thus freely available for inspection and re-analysis by interested colleagues. We note that the swollen mitochondria were observed in multiple samples of uninfected and infected cells. This suggests that, regardless of infection, this is a common phenotype of U2OS cells. Others have observed this morphology by EM in the context of apoptosis and suggest it may represent porous mitochondria (reference 61). Although the proportion of tomograms containing these swollen mitochondria were higher in the uninfected sample of replicate 3, the other 25% contained typical mitochondrial morphologies that we could include in our analysis. 

      The presence of inter-cell morphological variability such as this highlights the importance of imaging multiple cells within a population and performing several distinct biological replicates, as we have done in this study, to ensure project-relevant information is captured and delineated from the background structural variability inherent within a cell population. Previous cryoSXT studies had observed (but did not specifically comment on) a similar swollen mitochondrial morphology (reference 59). However, out of an abundance of caution we excluded all tomograms with swollen mitochondria from our analysis of mitochondrial branching (Figure 5C). Moreover, Tukey tests were performed per replicate for each pair of conditions in Figure 5C and statistical significance was reported only if it was observed independently in all three replicates. We are thus confident that any sampling error in replicate 3 that may arise from excluding tomograms will not have meaningfully altered our conclusions.
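
      The reporting rule described above (significance only when it is observed independently in every replicate) can be sketched as below. This is an illustration under stated assumptions rather than the analysis code used in the study: the per-replicate p-values for a given pair of conditions are assumed to have been computed already, the 0.05 threshold is an assumed default, and the function name and example values are hypothetical.

```python
def significant_in_all_replicates(p_values_by_replicate, alpha=0.05):
    """Report a pairwise difference only if every replicate is significant.

    p_values_by_replicate holds one p-value per biological replicate
    for the same pair of conditions.
    """
    if not p_values_by_replicate:
        return False  # no replicates, nothing to report
    return all(p < alpha for p in p_values_by_replicate)


# Hypothetical p-values for two comparisons across three replicates
comparison_a = [0.01, 0.03, 0.04]
comparison_b = [0.01, 0.20, 0.04]

reported_a = significant_in_all_replicates(comparison_a)
reported_b = significant_in_all_replicates(comparison_b)
```

      Requiring agreement across all replicates is conservative: a difference that fails in a single replicate, as in the second hypothetical comparison, is not reported.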

      Minor comments

      Results section 1, line 115-117: Where the authors state that it is unclear whether "naked" HSV-1 capsids would be visible by cryoSXT, it would be useful to refer to literature where these are observed by TEM, or to compare to TEM in their own experiments.

      We have included references to previous TEM studies in the results (lines 128–129), as requested. However, we note that TEM and cryoSXT are fundamentally different as TEM uses contrast agents whereas contrast in cryoSXT arises from differential elemental densities (in particular the density of oxygen versus carbon or phosphorus). We have updated the results (lines 129–131) to clarify this point.

      Results line 143: The authors state that it's hard to observe the perinuclear viruses with TEM, but there are several examples of this in the literature that could be referenced, e.g. (Skepper et al. 2001; Leuzinger et al. 2005; Baines et al. 2007; Johnson and Baines 2011), although this does not mean that they are not hard to find or that 3D is not advantageous.

      We thank the reviewer for these references and we have added them to the manuscript.

      Fig.4: It is unclear why all the vesicles are open-ended

      This is due to the differential path-length of carbon rich (and thus high contrast) membrane traversed by the X-rays for the membranes normal or parallel to the incident X-ray beam. We have clarified this point in the results (lines 290–301).

      Some places in the manuscript PFU per cell is used, other places MOI

      Thank you for pointing this out. For consistency, we have changed all instances of PFU per cell to MOI.

      If some specific adjustments to the methods had to be implemented for biosafety reasons (virus work), this should be stated in the methods.

      We have added a section on biosafety measures to the methods (lines 562–568).

      Access to the synchrotron should also be described

      We have expanded the synchrotron access attribution in the Acknowledgments section (lines 737–738).

      Discussion line 320: "consistent with previous research" - there is a reference missing.

      Thank you for spotting this. We have now added the reference.

      The quantifications are based on a limited number of tomograms, but there is no statement as to how the specific tomograms were selected. With a variability between replicates and tomograms, a random selection is important.

      We included all tomograms collected for the relevant experimental condition in all our analyses unless otherwise stated. For the vesicle segmentation we chose four reconstructed tomograms from each condition at random (lines 690–691). For lipid droplet volume analysis and mitochondrial branching analysis we included all tomograms that matched our quality-control criteria. We have added a few sentences to the Segmentation and Graphs and Statistics sections of the methods (lines 691–694 and 724–733) describing our selection criteria for the lipid droplet, vesicle and mitochondrial branching analysis, respectively.

      If gold fiducials are visible in the tomograms it could be useful to indicate, as they can look similar to lipid droplets to a non-expert reader.

      We have indicated gold fiducials in Figure 1H, the only figure in which they are visible, with a gold star as requested.

      Suppl. Fig.2: For clarity it would be good not to use the same color arrows to indicate different things in A and B.

      Suppl. Figure 2B has been removed in response to another reviewer request.

      Reviewer #1 (Significance):

      The authors of this study demonstrate that cells infected by HSV-1 virus can be investigated by the use of cryoSXT, and use this to show that infected cells have more elongated and interconnected mitochondria, and an enrichment of small vesicles close to the nucleus. They thereby also show that cryoSXT offers a nice resolution for characterizing morphological changes in significant volumes of near native-state cells, and that the method offers a promising throughput for screening of large amounts of cells. However, the study does not really present new biological or technical advances compared to previously published literature, see e.g. Müller et al. 2012, Duke et al. 2014, Perez Berna et al. 2016, Groen et al. 2019, Weinhardt et al. 2020, Loconte et al. 2021 (not cryo but demonstrates the advantage of capillaries), Kounatidis et al. 2020, Scherer 2021 (ref 16 from paper), some of which are also referenced in the current study. The study could thus have profited from a more defined focus and possibly further experiments (live-cell imaging, CLEM, TEM, microtubules or more mechanistically focused) depending on the main interest of the authors. The advantage with the current broad focus (assuming that the main concerns are addressed) is that the study could interest a larger audience, ranging from virology, cell biology and immunology to microscopy and methods development.

      We thank the reviewer for recognising the broad audience that will be interested in our manuscript. We believe that our analysis highlights the broad applicability of cryoSXT for analysing cell ultrastructure and changes that occur in response to infection. Furthermore, we think that our use of robust numerical analysis to quantitate the phenotypes we observe highlights the strength of cryoSXT as a high throughput technique for ultrastructural analysis. Our study is the first to investigate HSV-1 infection using cryoSXT and, in addition to confirming previous ultrastructural changes observed using other methods, we present new biological insight in organelle architecture and distribution such as that lipid droplets undergo a transient size increase during early stages of infection. We believe that we have demonstrated the robust utility of cryoSXT as a tool to study ultrastructural changes in response to insults, such as infection by intracellular pathogens, and hope that our manuscript will act as inspiration for others seeking to use cryoSXT to image cellular ultrastructure.

      Reviewer #2 (Evidence, reproducibility and clarity):

      The authors use soft X-ray tomography to examine cell structure following infection by herpes simplex virus-1 (HSV-1). This imaging method can provide 3D images of cryo-preserved intact cells without chemical fixation or staining. The authors find several morphological differences between uninfected and infected cells, including changes in the number and size of vesicles and in the size and shape of mitochondria.

      This is a well-done study with careful and extensive analysis that in general produces convincing images to support the authors' conclusions. The procedures are clearly described and reproducible, and the authors have examined an impressive number of images and have performed appropriate statistical analyses.

      We thank the reviewer for their positive comments.

      I had two comments / suggestions regarding the findings about changes in morphology after infection. First, in the Discussion, the authors consider the possibility of Golgi fragmentation. Can the authors test this by counting Golgi before and after fragmentation?

      We did not frequently observe well-defined Golgi apparatuses in our tomograms, consistent with previous cryoSXT studies (reference 61). We therefore performed new experiments using SIM microscopy to demonstrate the disruption of Golgi apparatus and trans-Golgi network in fixed U2OS cells stained with the markers GM130 and TGN46, respectively. These new results are presented in Figures 6 and 7 and in the results (lines 342–355).

      Second, in the Results the authors report that they did not observe a change in lipid droplets after infection. However, the late-stage image in Fig. 5A seems to show such a change, with the lipid droplets becoming larger and darker relative to the early stage or uninfected cells. Maybe this is just the particular image that was selected, but perhaps it is worth looking at more images by eye just in case the segmentation procedure somehow missed this change.

      We thank the reviewer for suggesting we re-visit the properties of lipid droplets. Based on this suggestion we segmented the lipid droplets from 94 tomograms and found a robust change in the median volume of lipid droplets at early stages of infection. We have included this new data in Figure 4C, Suppl Figure 2 and the text of the results (lines 302–312). The observation that lipid droplet volumes change is particularly interesting as another group recently observed similar changes in lipid droplets in response to HSV-1 infection of astrocytes and they postulate that this may modulate the cellular immune response (reference 85). Our data support and extend their conclusions, as described in the discussion (lines 476–494).

      Minor comments:

      Line 127 - As I understand it, the alignment by fiducial markers corrects primarily for small inaccuracies in tilting of the stage. Hopefully there are not significant vibrations in the microscope because this would also lead to loss of resolution during the exposure of each tilt angle.

      Thank you, we have corrected “vibrations” to “small inaccuracies in tilting of the microscope stage”.

      Line 145 - "electron light" Is this common usage? To me it seems more accurate to just say electrons because light to me means photons.

      Thank you, we have corrected “electron light” to “electrons”.

      Line 390 - detection OF ("of" is missing)

      Thank you, we have made the correction.

      Line 564 - Fig. 2 legend. "partial retention in the nucleus of U2OS cells". I am not sure where the nucleus is in the images. To me, it looks like there is almost no stain for ICP0 in hTERT at stage 1 and stage 3, and then cytoplasmic stain at stage 2 and stage 4. In contrast, for U2OS, the stain looks mostly nuclear until stage 4 when it is partially cytoplasmic. This all needs to be better explained, and perhaps arrows added to the images such that the reader does not have to guess.

      We agree and have added a silhouette around each nucleus in Figure 2 to make this clearer. We have also added arrows to indicate the gC-mCherry enriched juxtanuclear compartment in cells at stage 3 (HFF-hTERT) or a late stage (U2OS) of infection.

      Line 585 - The authors could consider rotating the images by 180° in panel A (late) in order to maintain the same orientation of nucleus and cytoplasm. This would make it easier for readers to see the point.

      Done as requested.

      Line 614 - I could not find the length of the scale bar in the legend.

      We apologise for omitting this – it has now been added.

      Reviewer #2 (Significance):

      The significance of the study is two-fold. First, it is a nice technical demonstration of what can be accomplished using soft X-ray tomography. I am qualified to evaluate this, since my expertise is in biological applications of this technique. The second significant aspect of the study is the demonstration of morphological changes in mitochondria and vesicles. I am not a virologist, so I do not know the literature on this point with regard to virus infection, but I find it interesting that the authors were able to detect such changes.

      We thank the reviewer for their positive assessment of our work.

      I believe the authors should cite a couple of papers:

      10.1016/j.cell.2015.11.029 which looks at HSV infection and reports viral particles between the inner and outer nuclear membrane.

      We have included a citation to this work as requested (lines 162–165).

      10.1016/j.jsb.2011.11.025 which also reports nuclear membrane separations or bulges by soft X-ray tomography.

      We have elaborated on this section and incorporated the reference as requested (lines 265–276).

      Regarding these nuclear membrane bulges, there are a number of papers that show they can also arise from mutations in nuclear-lamin associated proteins like nesprin and SUN (see for example https://doi.org/10.1093/hmg/ddm338). This is perhaps something interesting for the authors to think about, but not necessary for the current manuscript.

      Thank you for this comment. We did consider studying the breakdown of the nuclear lamina during HSV-1 infection, as this has been shown in previous studies [e.g. 10.1101/2021.06.02.446771]. However, we could not robustly resolve the nuclear lamina from the nuclear envelope in uninfected cells. The nuclear lamina is quite thin (30–100 nm in width) and this may have confounded its identification.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Summary:

      The manuscript by Nahas et al. describes the structural studies performed in U2OS cells infected with a recombinant HSV-1 virus that enables tracing the stage of the infection using fluorescent markers. This system was used to determine major structural changes in HSV-1 infected cells using cryo-soft X ray tomography (cryo-SXT) on near native-state samples. The data presented complement previous studies (particularly ref.16) using similar reagents but different microscopy techniques. While the data are generally well presented and discussed, they do not provide any substantially novel information on the structural changes in HSV-1. Nevertheless, they constitute an interesting technical achievement.

      We thank the reviewer for supporting the technical quality of the analysis. In response to the comments of another reviewer we have extended our analysis and documented new biological information for this system relating to lipid droplet re-shaping and distribution in response to HSV-1 infection; all our new findings are included in the updated manuscript.

      Major comments:

      There are no major concerns on the data, although some of the statements could be revised for a more realistic interpretation of the results.

      • In Figure 1F and lines 152-156 it is stated that a bulging of the nuclear envelope occurs around some of the putative particles, while in lines 243-244 and lines 625-628, it is stated that bulging occurs both in mock and infected cells. This should be clarified to avoid confusion. It is possible that authors differentiate both situations and this should be more clearly stated.

      Many thanks for identifying a possible area of confusion. We have updated the results to clearly distinguish the expansion of the perinuclear space that accompanies virus nuclear egress (lines 160–175) from the bulges of the nuclear envelope that are observed in uninfected and infected cells (lines 265–276).

      • The statistical tests are different for different hypothesis testing throughout the manuscript. The authors should justify in the methods section the use of one or another test. This will contribute to clarity in the hypothesis that is being tested and will clarify the reason for the selected test.

      We have significantly expanded the Graphs and Statistics section of the methods (lines 703–734) to further justify the statistical tests used throughout our study.

      • Sentence: "Our observation..." in lines 349-352. Even though the sentence is in the Discussion it is wildly speculative. The authors could use different approaches to tackle experimentally the question of whether active fusion or faulty fission is involved, but this is not the main subject of the manuscript. Please revise the sentence or address experimentally, this would provide new insight into the impact of HSV-1 infection on mitochondrial network morphology. This sentence could be qualified as "speculative".

      We agree that this section of the discussion strayed into speculative territory and have removed it from the updated manuscript.

      • Although ref.16 provides evidence supporting Golgi fragmentation and mitochondrial elongation after HSV-1 timestamp virus infection in HFF cells, it would be important to show confocal microscopy data in U2OS cells, which were used for cryo-SXT, particularly since the authors refer to differential virus kinetics and subcellular distribution of viral antigens in these cells. These would greatly contribute to support the statements regarding these two phenomena. It is very likely that the authors already have the data and could easily show them.

      We have included new microscopy experiments to demonstrate changes in mitochondrial (Suppl. Figure 4) and Golgi (Figures 6 and 7) morphology that accompany HSV-1 infection, and these new experiments are now included in the results (lines 335–310 and 342–355).

      • Line 269: Apposition of lipid droplets and mitochondria is not thoroughly described. This statement requires quantitation. Optimally, confocal imaging using Mitotracker and bodipy493/503 or superresolution imaging using specific antibodies may also contribute to strengthen the statement.

      We agree with the reviewer that we do not at this stage have adequate data to support this assertion and have therefore removed it from the manuscript.

      • It would be of great interest to document the budding events observed by cryo-SXT using higher resolution techniques and the kinetic resolution provided by the fluorescent infection fiducials. This would confirm the nature of the particles (using immunogold) and would demonstrate the usefulness of the cryo-SXT data. This by itself would justify the use of cryo-SXT to temporally locate events that are difficult to visualize otherwise (as stated by the authors).

      We agree with the reviewer that a correlative imaging strategy involving cryoSXT and fluorescence microscopy could aid in identifying features of infection, and have highlighted this interesting future direction in the discussion (lines 406–409). However, performing such analysis will be a substantial experimental commitment on its own and is outside the scope of our current manuscript.

      Minor comments:

      • Given that the software used for segmentation (Contour) is not published, a minimal comparative description between manual and semi-automated segmentation may be shown in the supplementary, to illustrate the robustness of the new method and the reliability of the measurements.

      We have now published a preprint (recently accepted in the journal Biological Imaging) that describes Contour in detail, which we have referenced in the updated manuscript: Nahas, K. L., Ferreira Fernandes, J., Crump, C., Graham, S. C. & Harkiolaki, M. (2021) Contour, a semi-automated segmentation and quantitation tool for cryo-soft-X-ray tomography. http://biorxiv.org/lookup/doi/10.1101/2021.12.03.470962

      • Lines 278-280: statistical test and p value are not shown.

      We have updated the text to include details of the statistical test and p value as requested (lines 326–330 of the updated manuscript).

      • After line 376: It would be interesting to mention that transient elongation of mitochondria is observed during dengue virus infection (https://doi.org/10.1016/j.chom.2016.07.008) and that this has also consequences for innate immunity against viruses.

      We thank the reviewer for this suggestion, which we have incorporated into the discussion (lines 522–523).

      • Given that HSV-1 is a BSL-2 level virus and that a recombinant version (GMO) has been used in the study, the authors should describe the biosafety measures taken to image non-inactivated infectious samples by cryo-SXT. The authors should state that a biosafety committee has reviewed these activities.

      We have included a Biosafety Measures section to the methods (lines 562–568) that details the biosafety measures used and their approval by the relevant committees.

      Reviewer #3 (Significance):

      This study constitutes an incremental technical advance in the study of HSV-1 infection. The broad context and the quasi-native structure of the cells enables documenting events that are difficult to observe in thin sections for TEM.

      This study is one of the few examples of the use of cryo-SXT for infected cell imaging. Other examples of the literature are cited as well as previous structural studies performed with higher resolution techniques.

      The manuscript may be suitable for HSV-1 specialists and cell biologists interested in using near-native samples for gross cellular imaging and documentation of low-resolution maps revealing alterations in large subcellular structures.

      We thank the reviewer for highlighting that ours is one of only a few comprehensive studies using cryoSXT, illustrating how it can be used to image cellular processes that are hard to ‘catch’ using techniques that require ultra-thin sectioning, and as such that it will be of interest to cell biologists studying infection processes in cellulo.

    1. Reviewer #3 (Public Review):

      The manuscript presents data that high expression of Protein Phosphatase 1 inhibitor in triple-negative breast cancer contributes to the poor outcome by downregulation of an important kinase, GSK3β. If substantiated, this would enhance our understanding of the pathophysiology of this important disease and might suggest new treatment options. Indeed, changes in PPP1R14C expression alter the behaviour of TNBC in cells and in mouse models, but the mechanistic links to GSK3 are not robustly established.

      Fig 1-2 identified the PPP1R14C as upregulated in TNBC and with a significant correlation with worse outcome. Fig 3 and 4 show in vitro and in vivo effects of changes in PP1R14C consistent with increased proliferation, migration and metastasis in vivo. These studies look very solid and appear to identify a role for this phosphatase regulator in TNBC.

      The weaker part of the manuscript is the mechanistic link to GSK3 regulation. Over-expression and knockdown of PPP1R14C have effects on GSK3β phosphorylation and downstream targets, but the direct connection is unclear and made challenging by a number of complex experimental issues.

      The big questions -

      1. Is GSK3 directly ubiquitylated by TRIM25 on K183? I don't think the data are strong here, for reasons elaborated on below.

      2. Is GSK3 really the important target of PPP1R14C/PP1 complex? The biological data are correlative and the direct experiment, does GSK3β (S9A/K183R) rescue PPP1R14C over-expression, would need to be done. But since I suspect K183R is kinase-dead, this may fail.

      3. The studies with C2 are confounded by the broad effects (including on PP2A) of treating cells with ceramide. Calling C2 a specific PP1 activator is I think unwarranted.

      Specific comments:

      Why is there a band in Fig 5D lane 2, the Flag-PPP1R14C lane, in the absence of Flag-PPP1R14C?

      Why in Fig 5E, F, G are there two bands in the pGSK3bS9 blot?

      The authors would need to show the total GSK3 coming down here too, and the total GSK3 present in Fig 5H as well.

      I have trouble understanding the result in Fig 5H. According to this, global PP1 phosphatase activity increases 3-fold when PPP1R14C is knocked down. First, there is no method noted for this assay. How do we know this is specific to PP1? Second, PPP1R14C is only one of many PP1 interactors. How can its knockdown change cellular PP1 activity 3-fold? I note the knockout mouse for PPP1R14C had a 15% increase in thalamus PP1 activity (see fig 3, https://doi.org/10.1016/j.neuroscience.2009.10.007). This experiment needs much more in the way of controls.

      Fig 6 evaluates the role of PPP1R14C in GSK3 protein stability. There is a fundamental weakness here - How do the authors know the ubiquitylated smear in the various Fig 6 assays is GSK3 versus a ubiquitylated protein that interacts with active GSK3? GSK3 phosphorylation directs many proteins (famously β-catenin and Myc) for ubiquitylation and degradation, so the co-IP of ubiquitylated proteins with GSK3 is to be expected if the IP stringency is not very very high. This is consistent with inactive pSER9 GSK3 not bringing down ubiquitylated proteins. An IP after for example boiling in SDS to break up large complexes would be needed to test if GSK3 itself, rather than associated substrates, is directly ubiquitylated.

      Is TRIM25 specific for GSK3? It's identified by mass spectrometry. However, when I plug TRIM25 into the CRAPome database (https://reprint-apms.org) I find it comes down in 136/716 (19%) of all MS IP studies, making it a very common contaminant in IP. Thus the bar is high to show this is specific. Here the interaction is validated with over-expression of various truncation mutants.

      Line 235: "K183 of GSK3β has been recognized as the ubiquitylation site". First, what is the reference for this statement? I found one paper (https://doi.org/10.1074/jbc.M116.771667) that claims this residue is important for FBXO17 K48 modification, not the K63 linkage associated with TRIM25. In the crystal structure of GSK3β, that K183 appears to coordinate the phosphates of ATP, so the effect of the K183R mutation may be to make the kinase inactive, which would confound their results. So an important experiment is, does K183R retain wildtype kinase activity? Or is it inactive, and so act like the phosphorylated S9 GSK3?

      The reference for ceramide as a PP1 activator is not a primary reference; it is to a paper in the Journal of Endodontics, which uses it. It would be important to cite primary literature for this usage of C2. I note that many papers cite C2 ceramide as a PP2A activator. What is the rationale for using it as a specific PP1 activator?

    1. I’d want to learn a lot from Professor Zimmerman so that I may obtain as much information as possible and use it in reality. It’s not about the work.

      This is a "free write" that we did in class recently to think on how we want our experiences in this class to play out during the rest of the semester. As you can see from the first few phrases, I explained how I wanted to learn as much as possible to help me in the future. I made it very obvious that "it wasn't about the work" and that it goes far deeper than that.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      We would like to thank the reviewers for their helpful and constructive comments.

      2. Point-by-point description of the revisions

      Reviewer #1

      This reviewer thought our findings would be of interest to a broad range of scientists from both the centrosome and mitosis fields, but noted some important aspects for improvements.

      Additional Experiments (we number these points for ease of discussion).

      1. Figure 3. The reviewer points out that because our analysis of Ana2-∆CC and Ana2-∆STAN mutant proteins was conducted in the presence of endogenous WT protein, we should be more cautious in our interpretation. We agree and apologise for overstating these findings. We have now rewritten the title and text of this section to be more cautious (p11, para.2).

      2. Figure 5A. The reviewer wonders whether the reduced recruitment of Sas-6 in the presence of Ana2(12A) is due to reduced binding, and they request we test this biochemically. This is our favoured interpretation, but we have been unable to test this biochemically for two reasons. First, although we have successfully purified several recombinant Sas-6 and/or Ana2 fragments (Cottee et al., eLife, 2015), the full-length proteins are poorly behaved (tending to precipitate, likely due to their inherent ability to self-oligomerise). Thus, we have been unable to reconstitute their interaction in vitro. Second, as we show here, the proteins are normally expressed in embryos at surprisingly low concentrations (~5-20nM), and we can detect no interaction between them in coimmunoprecipitation experiments from embryo extracts (not shown). Indeed, this concentration is so low that Sas-6 does not even appear to form a homo-dimer in the embryo, even though Sas-6 clearly functions as a homo-dimer in centriole assembly (new Figure S4A). We now explain these points, and state that our favoured hypothesis that Ana2(12A) has reduced affinity for Sas-6 (or other core duplication proteins) remains to be tested (p22, para.2).

      3. The Reviewer wonders if all 12 of the potential Cdk1 phosphorylation sites that we mutate in Ana2(12A) are important in vivo, and whether we have tested whether mutating fewer sites (e.g. the two sites [S284/T301] that we show are phosphorylated by Cdk1/Cyclin B in vitro) might be sufficient to recapitulate the Ana2(12A) phenotype. We have now tested this by mutating just the S284/T301 sites to Alanine [Ana2(2A)], but the results were not very informative (Reviewer Figure 1 [RF1]). Whereas Ana2(12A) is recruited to centrioles for a longer period and to higher levels than WT Ana2 (Figure 4A), Ana2(2A) is recruited to centrioles for a normal period but to lower levels (RF1A,B). The interpretation of this result is complicated because western blots show that Ana2(2A) is also present at lower-levels than normal (RF1B). Thus, it is clear that Ana2(2A) does not recapitulate well the behaviour of Ana2(12A). We have decided not to present this data as it is difficult to interpret and it does not change any of our conclusions.

      4. Figure 6. The reviewer asks whether the 12A mutations impair the interaction with Plk4, influence Plk4’s kinase activity or the ability of Plk4 to phosphorylate Ana2. These are excellent questions but, for the same reasons described in point 2 above, we cannot address them biochemically as we cannot purify well-behaved recombinant full-length Ana2 or active Plk4 in vitro, and both proteins are present at such low levels in the embryo that we cannot detect any interaction between them in embryo extracts. We are working hard to reconstitute in vitro systems to probe these important points, but it may be some time before we are able to do so.

      5. Figure 7. The reviewer suggests that the 12D/E phosphomimetic substitutions introduce more negative charge than the putative phosphorylation of Ser/Thr residues and they ask if the Ana2(2D/E) [stated as Ana2(3D/E)] is, like the Ana2(12D/E) mutant, not efficiently recruited to centrioles. This is a fair comment, but we have not analysed an Ana2(2D/E) mutant because, as described in point 3 above, the Ana2(2A) mutant did not recapitulate well the Ana2(12A) phenotype.

      Minor comments

        • Figure S1. The reviewer requests that we show that the mNG tag on its own is not recruited to centrioles. We do not show this (as it would create a lot of white space in this Figure), but now state that mNG and dNG do not detectably localise to centrioles (p7, para.1).
        • Figure S4C. We have included the missing error bars (now Figure S4B).
        • Figure S5A. The reviewer asks about the expression levels of the Ana2(12A) mutant, which are not shown in this Figure. They also state that the expression levels of the transgenes shown in Figure 5A are not similar. The expression level of Ana2(12A) is shown in Figure S9, as this data was analysed independently of the other mutant proteins shown in Figure S5. We agree that it was overly simplifying the situation to state that the expression levels of WT Ana2-mNG, eAna2(∆CC)-mNG and eAna2(∆STAN)-mNG were “similar” (Figure S5), and we now specifically mention the differences between them (p11, para.3).

      Reviewer #2

      This reviewer found this a rigorous study that advances our understanding of the regulation of centriole duplication, but raised some minor points.

      Minor Points

      The reviewer requests that we mention the literature describing how Ana2/STIL can influence the abundance and centriolar localisation of Plk4. We apologise for this omission, and have amended our description of this literature in the Introduction to include this point (p3, para.2).

      The reviewer notes that we interpret the ability of the Ana2(12A) mutant to keep incorporating into the centrioles for a longer period as being consistent with our idea that rising levels of Cdk activity during S-phase normally reduce the ability of WT Ana2 to bind to the centriole. They ask us to show how Cdk activity increases over this time-course, and to test whether dampening Cdk has the same effect on Ana2 recruitment (i.e. allows Ana2 to be recruited for a longer period). The time-course of Cdk activation in these embryos has been reported previously (Deneke et al., Dev. Cell, 2016; we present the relevant data from this paper in RF#2A [black line]). This reveals how Cdk activity rises throughout S-phase, which is crucial for our model. To assess the effect of dampening Cdk activity in these embryos we have now analysed the effect of halving the genetic dose of Cyclin B (RF#2B). This perturbation extends S-phase length, but has a complicated effect on the recruitment dynamics of Ana2 (RF#2B). As we would predict, Ana2 is recruited to centrioles for a longer period in these embryos, but it is also recruited more slowly (so it accumulates to lower levels). This is consistent with our hypothesis that Cdk1 activity might first stimulate and then ultimately inhibit the centriolar recruitment of Ana2. The interpretation of this experiment is not straightforward, however, as dampening Cdk1 activity alters Ana2 recruitment dynamics (and many other processes in the embryo) in complicated ways, so we have decided not to include it in the manuscript.

      The reviewer suggests that it would be valuable to show that all 12 of the potential Cdk1 phosphorylation sites in Ana2 can be phosphorylated by Cdk1 in vitro. We think this would not be particularly informative as our hypothesis does not rely on all 12 sites being phosphorylated to generate the Ana2(12A) phenotype. We simply mutate all 12 sites because we don’t know which, if any, are relevant. Thus, showing that some/all of the 12 sites can/cannot be phosphorylated in vitro does not test any hypothesis and would not change any of our conclusions. We now explain our thinking on this in more detail (p12, para.2).

      Other points

      Figure 3. We have corrected the amino-acid numbering mistakes.

      Figure 5Aii. We have changed the x-axis (time) labelling in this and all other Figures.

      Figure Legends. We have tried to eliminate the typos from the Figure legends, and apologise that these errors made it through to the final submitted version of our manuscript.

      Reviewer #3

      This reviewer thought our manuscript would be of great interest to not only the centrosome field but also to cell biologists more generally. Although they had no major concerns, they made a number of suggestions for improvements.

      1. As the reviewer suggests, we now explicitly state that although the Ana2(12A) mutant appears to be largely functional, the overall conformation of the protein may be altered, changing its function in ways we do not appreciate (p21, para.2).

      2. The reviewer suggests we include a multiple sequence alignment of Ana2/STIL proteins to provide more context about the distribution and conservation of the 12 S/T-P sites mutated in Ana2(12A). This is an excellent idea, and we now include this in a new Figure S6, where we also provide more information about which of these sites have been shown to be phosphorylated in embryo or S2-cell extracts.

      3. The reviewer is confused as to why the 12A and 12D/E mutants rescue the ana2-/- mutant flies so well, which suggests that the mechanism we propose here cannot be essential for centriole duplication. We understand this confusion and we now make this point more clearly and explain why we think this occurs in more detail (e.g. p22, para.1). We propose that Cdk normally phosphorylates Ana2 to inhibit its ability to promote centriole duplication, but this phosphorylation does not entirely block this function. So, if all other elements of the system are functional, Ana2(12A) is recruited to centrioles for longer than normal, but this does not dramatically perturb centriole duplication because the many other factors that regulate centriole duplication (such as the pulse of Plk4 recruitment to centrioles [Aydogan et al., Cell, 2020]) still occur normally and are sufficient to ensure that centrioles still duplicate normally. When Ana2 phosphorylation is mimicked [Ana2(12D/E)], the ability of Ana2 to promote centriole duplication is perturbed (but not abolished). This perturbation is lethal in the early embryo, where the centrioles must duplicate in just a few minutes to keep pace with the rapid nuclear divisions. In somatic cells S-phase is much longer, so these cells can still duplicate their centrioles (as we observe) even though Ana2(12D/E) does not function efficiently. As we now explain, this phenotype (being lethal in the early embryo, but not in somatic cells) is a common feature of mutations that influence the efficiency of centriole and centrosome assembly (p17, para.2).

      4A. The reviewer asks us to comment in more detail on why centrioles do not seem to be elongated in the Ana2(12A) mutant wing disc cells (now Figure S8C), even though we show that Ana2(12A) (Figure 4A), and also Sas-6 (Figure 5), are recruited to centrioles for an abnormally long period. This is an excellent question and, although we do not know the answer, we now discuss this interesting point in more detail (p16, para.1). We think this is likely due to the “homeostatic” nature of centriole growth: in our hands, almost any perturbation that makes centrioles grow for a longer/shorter period, also makes them grow more slowly/quickly, so that they tend to grow to a similar size (Aydogan et al., JCB, 2018; Cell, 2020). This is fascinating, but poorly understood. When we perturb the system by expressing Ana2(12A), both Ana2(12A) and Sas-6 incorporate into centrioles for a longer period, as we predict (Figure 4A and 5A). Unexpectedly, however, Sas-6 is also recruited to centrioles much more slowly. Thus, as so often happens, when we perturb the system so the centrioles grow for a longer time, the centrioles “adapt” by growing more slowly. We do not currently understand why this occurs (although we speculate that Ana2 may also be regulated by Cdk/Cyclins to help recruit Sas-6 to centrioles in early S-phase). In the embryo, where S-phase is very short, this homeostatic compensation is not perfect, and the centrioles appear to actually be shorter than normal. In somatic wing-disc cells, where S-phase is much longer, we suspect that there is more scope for homeostatic compensation and so the centrioles grow to the correct size.

      4B. In this point (also labelled [4] by the reviewer, so we have retained this numbering but labelled the points A and B) the reviewer asks why levels of Ana2(12A) eventually decline at centrioles once the embryos actually enter mitosis. The reviewer notes our rheostat theory, but suggests a discussion of other mechanisms might be interesting. This is a good point, and we agree that the observation that Ana2(12A) levels ultimately still decline at centrioles during mitosis is likely to be important in explaining why centriole duplication is not more dramatically perturbed by Ana2(12A). We now expand our discussion of this point, highlighting that other mechanisms must help to ensure that Ana2 is not recruited to centrioles during M-phase, and discussing the possibility that the receptors that recruit Ana2 to centrioles are themselves inactivated during mitosis by high levels of Cdk activity (p15, para.1). In such a model, the rapid drop in WT Ana2 centriolar levels is due to a combination of switching off Ana2’s ability to bind to centrioles (as we propose here) and switching off the ability of the centrioles to recruit Ana2. For Ana2(12A), only the latter mechanism would operate, so Ana2(12A) levels would start to drop later in the cycle (as the inflexion point at which Ana2 recruitment and loss balances out would be moved to later in the cycle), and these levels would drop more slowly—as we observe.

      • The reviewer is confused as to how the Ana2(12D/E) mutant can rescue the mutant phenotype when it is recruited to centrioles so poorly. Ana2(12D/E) is indeed recruited very poorly to centrioles in the experiment shown in Figure 7. However, this experiment had to be conducted in the presence of WT untagged Ana2, as the embryos do not develop in the presence of only Ana2(12D/E). We would predict that WT Ana2 would bind more efficiently to centrioles than Ana2(12D/E) (which appears to behave as if it has been phosphorylated by Cdk/Cyclins, and so cannot be recruited to centrioles efficiently). Thus, in the experiment we show in Figure 7, the Ana2(12D/E) protein is probably being “outcompeted” for binding to the centriole by the WT protein. In somatic cells expressing only Ana2(12D/E), presumably sufficient mutant protein can be recruited to centrioles to support normal centriole duplication (as it no longer has to compete with the WT protein). We now explain our thinking on this point (p18, para.1).

      • The reviewer wonders whether Ana2(12D/E) may be unable to homo-oligomerize, and this may explain why the protein is not recruited to centrioles efficiently even in the presence of WT protein. This is indeed a possibility, but we think it unlikely as it is widely believed that Ana2/STIL proteins must multimerize to be functional (Arquint et al., eLife, 2015; Cottee et al., eLife, 2015; Rogala et al., eLife, 2015; David et al., Sci. Rep., 2016). As Ana2(12D/E) strongly restores centriole duplication in ana2-/- mutant somatic cells, it seems unlikely that it cannot multimerize. Nevertheless, we now specifically highlight that the 12D/E (and 12A) mutations might alter the ability of Ana2 to multimerize (p21, para.2).

      We thank the reviewers again for their thoughtful and constructive comments. We hope they will agree that the revised manuscript is now improved and would be appropriate for publication in The Journal of Cell Biology.

      With best wishes,

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Point-by-point description of the revisions

      Black: Comments from reviewers

      Green: Answers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Yamamoto and colleagues have investigated the interplay between microtubules (MTs) and actin in positioning the MTOC at "the cell centre". They have developed a novel experimental setup akin to a synthetic cell to study this question. Essentially a cell-sized (15 µm) microwell that is coated in lipid and then tubulin/actin added and the positioning of a MTOC proxy is studied by microscopy. This is a well executed study. These complicated biochemical reconstitutions are the hallmark of Blanchoin and Théry's group, but even so, it's clear that the exact conditions (e.g. tubulin concentration) are fiddly and critical for these experiments to work. The data are clear, well analysed and presented. In brief, the conditions for centring a cytoskeletal network and decentring/polarising it are recapitulated. This is a short, straightforward paper and I found the results to be clear and the authors' interpretation to be well supported by the data.

      Two questions occurred to me as I read the paper: 1. While the setup is reminiscent of a cell, I suspect that the edge/wall of the microwell is much stiffer than the plasma membrane. So a MT that encounters the wall may behave differently in the cell. This would affect the non-actin conditions but possibly also the conditions where an actin mesh is present. Maybe my intuition is not even correct, but I think this issue should be discussed in the paper as a potential limitation of the system.

      Author response: We thank the reviewer for this wise comment. Indeed, the deformation of the container may impact the organization of the MT network, the force balance and the final position of the MTOC. We commented on this limitation in the revised discussion (page 10 line 31). However, it should be noted that in the presence of a cortical actin network, MTs are much less capable of deforming the cell than in a vesicle or in a cell treated with actin drugs, so our conditions with a cortical actin network are physiologically relevant although the container cannot be deformed.

      2. The graphs in 3C and 4G (lesser extent Fig 1) show nicely that the aMTOC position has apparently rested at a steady state. Some representative trajectories are shown in some figures, but not mentioned much in the text. How does the pathlength (cumulative distance) over time compare to the "distance to centre" measurement? Is there more or less travel under the different conditions? From the supplementary videos it looks like there is a difference. An apparent resting position may still represent significant motion, e.g. circling the centre. What does an analysis of tracklength tell us, if anything?

      Author response: We appreciate the reviewer’s comment and followed their advice. We measured the pathlength (cumulative distance moved) based on the data shown in Figure 3C and 4G. The analysis confirmed that the MTOC was static in the presence of a bulk actin network (shown in the new Supplementary Figure 6B). Interestingly, it also showed that the final position adopted by the MTOC in conditions where it could move more freely was also static, as revealed by the saturation of the pathlength after 1 hour. These analyses are shown in the new Supplementary Figure 6B for the centering in the absence of cortical actin, for the non-centering with long microtubules in Supplementary Figure 7E and for the centering with long MTs and a cortical actin network in Supplementary Figure 7E.
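
      The distinction the reviewer draws between "distance to centre" and pathlength can be illustrated with a minimal sketch. It assumes a tracked 2D trajectory and a known well centre; the function name `path_metrics` and the toy circular trajectory are hypothetical and are not taken from the manuscript's analysis pipeline.

```python
import numpy as np

def path_metrics(xy, center=(0.0, 0.0)):
    """Return (distance_to_center, cumulative_path_length) for each frame.

    xy: array of shape (n_frames, 2) with tracked positions (e.g. in micrometres).
    center: coordinates of the microwell centre (assumed known).
    """
    xy = np.asarray(xy, dtype=float)
    # Radial distance of the tracked object from the well centre at each frame.
    dist_to_center = np.linalg.norm(xy - np.asarray(center, dtype=float), axis=1)
    # Frame-to-frame step lengths, accumulated; the first frame has travelled 0.
    steps = np.linalg.norm(np.diff(xy, axis=0), axis=1)
    path_length = np.concatenate(([0.0], np.cumsum(steps)))
    return dist_to_center, path_length

# Hypothetical toy trajectory: an object circling the centre at radius 1.
theta = np.linspace(0.0, 2.0 * np.pi, 361)
circle = np.column_stack((np.cos(theta), np.sin(theta)))
dist, path = path_metrics(circle)
```

      In this toy case the distance to the centre stays constant while the pathlength keeps growing, which is the scenario the reviewer describes. A pathlength curve that saturates, as reported above, rules this scenario out.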

      Very minor clerical point: - the first two sentences of the abstract could be clearer. "The position of centrosome, the main microtubule-organizing center (MTOC), is instrumental in the definition of cell polarity. It is defined by the balance of tension and pressure forces in the network of microtubules (MTs)." In the second sentence, "it" and "defined" are confusing. Are you talking about the position of the centrosome or cell polarity?

      Author response: We thank the reviewer for this comment. As the reviewer suggested, this was a confusing description. Accordingly, we corrected the sentence in the abstract to:

      The orientation of cell polarity depends on the position of the centrosome, the main microtubule-organizing center (MTOC). It is determined by the balance of tension and pressure forces in the network of microtubules (MTs).

      Reviewer #1 (Significance (Required)):

      As I see it, the main advance here is in the novel experimental setup, which has real potential in the field. Existing methods such as MTs inside lipid bubbles are limited, whereas the microwell method with fabrication methods allows the shape of the "synthetic cell" to be carefully modulated. Tying the results together with cytosim simulations is also a powerful combination. There is a lot of interest in bottom-up reconstitution of cell biological phenomena, especially those that underlie specialised cell processes, e.g. polarity. My expertise: microtubules in a cellular context with limited experience of MT reconstitution assays.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: This manuscript describes the use of an elegant in vitro reconstitution system to study the effect of variations in the organization of the actin network on the positioning of a microtubule organizing center (MTOC) within the cell. By using a reconstituted system the authors are able to specifically study the contribution of the "pushing" forces generated by microtubule (MT) growth, without the confounding influence of other factors, like pulling forces from MT motors. The authors find that bulk actin networks at sufficient density can impair MTOC displacement, likely a result of the large viscous drag of the MTOC. Next they show that MTOC centering is more resilient to changes in microtubule length. Finally they show that an asymmetric actin network can cause asymmetric positioning of the MTOC.

      Major comments: 1) The model the authors put forth is that the growth of long MTs leads to decentering as a result of the MTs slipping along the well edge. The presence of a cortical actin mesh prevents this slipping. Their argument would be strengthened with an analysis of the MT behaviors in the various conditions. For example when discussing MTOC in wells without actin...

      "As they grew, they first ensured a proper centering but after an hour, MT elongation and slippage along microwell edges broke the network symmetry and MTs pushed aMTOC away from the center (Figure 1I, J and Supplementary Movie 2)"

      In this movie I don't see evidence of MTs hitting the cortex and sliding on the "short" side of the well relative to the MTOC. An analysis of the behavior of MTs in various circumstances would help link the behavior of MTs to the movement of the MTOC for all of their conditions. What fraction of MTs hit the cortex and remain relatively motionless, what fraction slide, what fraction catastrophe, what fraction turn and follow the curve of the well? And how does this behavior change for microtubules that end up on the short side vs. the long side of the MTOC? This type of analysis would solidify their model for how centering/decentering occurs in the various conditions they test.

      Author response: This is a fair criticism. The possibility to perform fine analysis of MT dynamics is technically limited by the fluorescent background due to free tubulin dimers. It is the reason why classical in vitro assays are monitored in TIRF microscopy, which is not possible here since MTOCs move in 3D in the microwells. In addition, working with higher laser power to increase the signal to noise ratio generates severe photodamage on MTs. Nevertheless, we could visualize MT dynamics and displacements near the edge of the microwells and describe their behavior more precisely than in the previous version of our manuscript. New images and tracking of MT behavior are now reported in the new Figure 4E, 4F and 5G, as well as the new supplementary Figure 4C, 4D, 7B, and 7C. We also replaced the supplementary movie 2 and Figure 1I in order to show more clearly MTs hitting and slipping along the well boundary. In addition, we also characterized the pivoting of MTs around the MTOC and near the edge of the microwell in order to better characterize the effect of cortical actin. This is now shown in the new Figure 4G and 4H as well as in the new Supplementary Figure 7C-D. We found that the changes in MT orientation and position, at the centrosome and at the contact with the microwell, were clearly prevented by the presence of cortical actin.

      2) The authors use simulations to support their in vitro findings. However, their simulations have many more microtubules emanating from the MTOC than their experiment (Looks like about 50 in the cytosim and they state they are aiming for 15-20 in the aMTOCs). Do the simulations still reproduce the behavior of the in vitro system with a similar number of MTs?

      Author response: This is another fair criticism. We addressed this point by performing simulations with 10–30 microtubules (the number of MTs is variable because of MT dynamics), which is closer to the number of MTs that we obtained in our experimental conditions. Results were consistent with previous simulations with a higher number of MTs and are now shown in the new supplementary figures 6E-F, 7G and 8I.

      3) When the actin networks are asymmetric, the authors see decentering of the MTOC towards the side with less actin. However there is still actin on the side where the MTOC will move to and in some of their images it looks pretty thick. Is the actin on that side not dense enough to prevent MT sliding along the "cortex"? If so, can they generate less dense, but uniform actin networks on the "cortex", where MTs can slide? Again descriptions of MT behaviors would be useful in understanding what is happening.

      Author response: We thank the reviewer for asking this important question. We followed the reviewer’s advice and generated a homogeneous and less dense cortex by working at a lower concentration of actin (0.5 mM). In such conditions, we could not see the centering effect that was observed with a dense cortex. These new data are now shown in the new Supplementary Figure 7I. This effect was also tested with numerical simulations (new Supplementary Figure 7J), which were consistent with the key role played by actin network density for MT network positioning by cortical friction.

      Minor Comments: 1) Title - the current title implies that actin is balancing the forces generated by the MTs. I'm not sure this is a good description of what is shown in the paper.

      Author response: We thank the reviewer for pointing out this issue. We revised the title to:

      Reconstitution of centrosome positioning by the production of pushing forces in microtubules growing against the actin network.

      2) The discussion would benefit from more explanation about how the results of this paper relate to the classic examples of MTOC positioning they cite. How do they envision the actin and MTs interacting in these systems and what new insight have we gained from the experiments in this manuscript?

      Author response: This is a good suggestion. We added some comments in our discussion about the actin network asymmetry in several classical examples of cell polarization and explained how our observations suggest a new interpretation of the role of this asymmetry in the reorganization of forces in the MT network and of the consequent peripheral positioning of the MTOC.

      Reviewer #2 (Significance (Required)):

      Overall, this work is a significant advance in our understanding of the potential mechanisms of MTOC movement in cells via pushing by MT growth. The experimental system they have developed is a powerful advance, allowing meaningful MTOC reconstitution experiments to be performed in chambers of approximately cellular size. This is an important contribution to understanding the interaction between microtubule pushing and the actin cortex.

      Reviewer expertise: Cell biology of MTOC assembly and positioning. I do not have the expertise to assess the parameters used to generate their cytosim models.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Review of "The architecture of the actin network can balance the pushing forces produced by growing microtubules" by Yamamoto et al.

      The means by which cells maintain their characteristic cytoskeletal architectures is not well understood. This is in part because there is considerable variation in such architectures among, for example, fibroblasts, neurons, and epithelial cells. It is also in part because microtubules, actin and intermediate filaments engage in a wide range of mechanical and signaling crosstalk mediated by a wealth of proteins and signaling networks, which further complicates the picture.

      In the current study, Yamamoto et al. take the welcome step of developing a simplified system for assessing the mutual contributions of microtubules and F-actin for general cytoskeletal organization in vitro (specifically, in lipid-lined microwells). This allows them to define basic principles of microtubule-F-actin interactions in the absence of the various confounding factors alluded to above. Using their model, they show that artificial MTOCs (aMTOCs) alone will center but as a complex function of microtubule length (controlled by varying tubulin concentrations). That is, the aMTOCs are randomly positioned with short microtubules, stably centered with intermediate length microtubules, and randomly oriented with very long microtubules (following symmetry breaking).

      They then assess the contributions of F-actin to the centering process. In low concentrations of "bulk" F-actin (ie F-actin distributed throughout the droplet) there is no effect on centering whereas at higher concentrations of bulk F-actin, centering is impaired as is the translocation of the aMTOCs. In the presence of uniform peripheral F-actin, in contrast, aMTOC centering is enhanced, and rendered less sensitive to variations in microtubule length. Finally, when the authors contrive a situation in which the peripheral F-actin is non-uniform (by lowering the concentration of actin and adding alpha-actinin, which creates a peripheral ring of F-actin with (I think) relatively less F-actin within the ring), the aMTOCs position themselves within the ring.

      Finally, the authors extend their results with simulations that indicate that the various behaviors can be explained by a combination of friction, pushing and slippage.

      This study is fascinating and will be of general interest to anyone who seeks to understand the contributions of mechanical forces to cytoskeletal organization in a minimal system. I have only minor concerns; these are listed below.

      1. Some of the terminology was a little confusing. The authors introduce the term "inner zone" (pg. 8) without defining it. From the context, it seems like they are talking about the approximate center of the ring of peripheral F-actin. If so, why not just do away with the term "inner zone" and refer to the ring center. If it isn't the ring center, then more explanation is needed as to what the inner zone actually is.

      Author response: We apologize for this confusion and appreciate the reviewer’s comment. We coined earlier the term “actin inner zone” to define the central cytoplasmic region in cells that is devoid of actin filaments (Jimenez et al., Current Biology, 2021). Because it was a confusing point, we clarified this in the revised version of the manuscript (Page 8, Line 20). What we would like to call the “inner zone” is the region inside of the actin cortex. The definition of this zone and of its geometrical reference points were also pictured more precisely in the new Supplementary Figure 9B.

      2. It is not clear from the text or the images if the region within the F-actin ring has less F-actin, more F-actin, or the same amount of F-actin as the region outside the F-actin ring. This point should be clarified, as it makes a big difference in the interpretation of the findings.

      Author response: We apologize for this lack of clarity. In the revised version of our manuscript, we plotted a line scan intensity profile of the actin fluorescence (new Supplementary Figure 9B). It showed that the actin inner zone contained much less actin than the cortex. This is consistent with our interpretation of a region-selective pattern of friction acting on microtubules.

      3. Ideally, the authors would include manipulations in which the high concentration of peripheral F-actin is combined with alpha-actinin because, as currently presented, the authors are drawing conclusions from changing two variables at once (ie going from a high concentration of peripheral F-actin to a lower concentration with added alpha-actinin). Thus, the authors cannot cleanly distinguish between effects that arise from F-actin asymmetry versus the presence of an F-actin crosslinker. Since the crosslinking is likely to change the mechanical properties of the peripheral F-actin network, this point should at least be addressed in the text, if not by experiments.

      Author response: We are not sure we fully understand the reviewer’s point. We don’t understand how the crosslinking of a symmetric actin network could break the symmetry of the MT network and force its off-centering. The opposite is clearer to us. A homogeneous and loose actin network can allow MT gliding and MTOC off-centering (like in Supplementary Figure 7J). The mechanical reinforcement of this network by crosslinkers could indeed resist gliding. But the consequence of this resistance would be similar to the consequence of a dense network: a more robust centering (like in Figure 4). So we don’t understand how the crosslinking by alpha-actinin, rather than the asymmetry of the actin network, could be at the origin of the off-centering we observed. In addition, the off-centering of the MTOC was systematically aligned with the asymmetry of the actin network, so both parameters were clearly connected.

      Reviewer #3 (Significance (Required)):

      This is an elegant, well-designed study that provides a clear description of how basic mechanical forces can contribute to cytoskeletal organization in a simplified model system.

    1. It's a little hard to tell if "IndieWeb" is in practice just its own community of people who like to talk about #indieweb things. (That's what gets surfaced when I try to learn more, but of course it is.) I like the idea more than most "fediverse" incarnations, though.

      The Logos, Ethos, and Pathos of IndieWeb

      Where is the IndieWeb?

      Logos

      One might consider the IndieWeb's indieweb.org wiki-based website and chat the "logos" of IndieWeb. There is a small group of about a hundred active to very active participants who hang out in these spaces on a regular basis, but there are also many who dip in and out over time as they tinker and build, ask advice, get some help, or just to show up and say hello. Because there are concrete places online as well as off (events) for them to congregate, meet, and interact, it's the most obvious place to find these ideas and people.

      Ethos

      Beyond this there is an even larger group of people online who represent the "ethos" of IndieWeb. Some may have heard the word before, some have a passing knowledge of it, but an even larger number have not. They all act and operate in a way that either seemed natural to them because they grew up in the period of the open web, or because they never felt accepted by the thundering herds in the corporate social enclosures. Many are not necessarily easily found or discovered because they're not surfaced or highlighted by the sinister algorithms of corporate social media, but through slow and steady work (much like the in person social space) they find each other and interact in various traditional web spaces. Many of them can be found in spaces like Tilde Club or NeoCities, or through movements like A Domain of One's Own, some can be found through a variety of webrings, via blogrolls, or just following someone's website and slowly seeing the community of people who stop by and comment. Yes, these discovery methods may involve a little more work, but shouldn't healthy human interactions require work and care?

      Pathos

      The final group of people, and likely the largest within the community, are those that represent the "pathos" of IndieWeb. The word IndieWeb has not registered with any of them and they suffer with grief in the long shadow of corporate social media wishing they had better user interfaces, better features, different interaction, more meaningful interaction, healthier and kinder interaction. Some may have even been so steeped in big social for so long that they don't realize that there is another way of being or knowing.

      These people may be found searching for the IndieWeb promised land on silo platforms like Blogger, Tumblr or Medium where they have the shadow on the wall of a home on the web where they can place their identities and thoughts. Here they're a bit more safe from the acceleration of algorithmically fed content and ills of mainstream social. Others are trapped within massive content farms run by multi-billion dollar extractive companies who quietly but steadily exploit their interactions with friends and family.

      The Conversation

      All three of these parts of the IndieWeb, the logos, the ethos, and the pathos comprise the community of humanity. They are the sum of the real conversation online.

      Venture capital-backed corporate social media companies have cleverly inserted themselves between us and our interactions with each other. They privilege some voices not only over others, but often at the expense of others and only to their benefit. We have been developing a new vocabulary for these actions with phrases like "surveillance capitalism", "data mining", and analogizing human data as the new "oil" of the 21st century. The IndieWeb is attempting to remove these barriers, many of them complicated, but not insurmountable, technical ones, so that we can have a healthier set of direct interactions with one another that more closely mirrors our in person interactions. By having choice and the ability to move between a larger number of service providers there is an increasing pressure to provide service rather than the growing levels of continued abuse and monopoly we've become accustomed to.

      None of these subdivisions---logos, ethos, or pathos---is better or worse than the others, they just are. There is no hierarchy between or among them just as there should be no hierarchy between fellow humans. But by existing, I think one could argue that through their humanity they are all slowly, but surely making the web a healthier, happier, fun, and more humanized and humanizing place to be.

    1. Author Response

      Reviewer #2 (Public Review):

      Schumacher and Carlson present volumetric data on the brain and main brain areas in several lineages of fish that have independently evolved electroreceptors and electrogenesis. The main question is if the evolution of this novel sensory system has led to similar changes in the brain. Previously, the same authors (Sukhum et al 2018) have shown an increase in the relative size of the cerebellum and hindbrain in mormyrid fishes, one group of electrogenic fish. Here they have collected data on South American weakly electric fishes (Gymnotiformes) and weakly electric catfishes (Synodontis spp.) as well as some outgroups (22 additional species). I think the question is very interesting, and the inclusion of electrogenic catfishes is particularly interesting as they are a largely understudied group. I do have some concerns about how the data has been analysed and presented.

      1) A first conclusion is that gymnotiform and siluriform brains are not as enlarged as mormyrid brains, and that this suggests that an increase in brain size is not directly tied to the evolution of an electrosensory system. I think the story here is more complicated than that. From the data presented, it seems that mormyrids have a different body size-brain volume slope than other groups, but it is unclear if this was tested in the PGLS model for brain vs body size, although mormyrids show different slopes than other groups in the scaling of the cerebellum to brain volume. This difference in slope for body-brain allometry has been confirmed by a manuscript published after the submission of this manuscript (Tsuboi 2021 BBE) with a large data set (~ 850 species, 21 of Osteoglossiformes). This steep slope close to one means that mormyrids with large body size have very large relative brain sizes but smaller mormyrids don't (this can be seen in figure 2). I think this needs to be addressed more carefully, first by testing in the PGLS for body size vs brain size whether mormyrids have a different slope, and then in the discussion. Why have mormyrids, but not other electrogenic fish, evolved such a unique brain scaling?

      We thank the reviewer for this suggestion. We combined our data with the data from Tsuboi 2021 and assessed how the brain-body allometry has changed across 870 actinopterygians. We identified 3 shifts in lineages with at least 3 descendants and 7 shifts total that were supported by both the OUrjMCMC and PGLS analyses. One of these identified shifts was along the branch leading to osteoglossiforms, with a secondary decrease in one lineage within mormyroids. A second identified shift was along the branch leading to Synodontis multipunctatus. However, we find no shifts along the branches leading to other electrosensory lineages. This suggests that although mormyroids do have a different brain-body allometry compared with other electrogenic fishes, this shift predates the origin of mormyrids as it is found in all osteoglossiforms and thus is unlikely to be related to the evolution of electrosensory systems. These changes are reflected in lines 778-826, 110-153, 513-528, 530-538, 569-575 and figure 3 and associated source data files. See also our detailed response to essential revision 1.

      2) I think the number of outgroup species used is too small, and they are spread among several different lineages of teleosts. I think this unfortunately tempers some of the conclusions. In particular, it seems to leave unanswered the question of whether other electrogenic fish have larger brains than non-electrosensory or non-electrogenic fish. A large data set of brain and body size data for teleosts has been published (Tsuboi et al 2018; 2021). Adding this data should allow testing for changes in body-brain size relationships in each electrogenic clade. It should also allow an accurate test for differences in relative brain size between and within electrogenic clades, and make it possible to test when exactly in the phylogeny of teleosts grade shifts in the body-brain allometry have happened.

      We thank the reviewer for this suggestion. We explicitly addressed this question by fixing shifts along the branches that evolved our three electrosensory phenotypes: evolution of electrogenesis, tuberous electroreceptors, and ampullary electroreceptors. After comparing these models to the unfixed shift model, a model where only osteoglossiforms have a shifted allometry (following the finding of Tsuboi 2021), a model where only intercept can shift, and a model with one shared allometry across all actinopterygians, we found that the unfixed shift model has a better fit than any of the electrosensory phenotype associated models. This further supports the conclusion that a shifted allometry/ large brain size is not necessary to evolve an electrosensory system. These additions are reflected in lines 778-826, 110-153, 513-528, 530-538, 569-575 and figure 3 and associated source data files. See also our detailed response to essential revision 1.

      3) Next, the authors use a principal component analysis and phylogenetic linear models to test how much of brain variation is explained by concerted vs mosaic evolution and where the mosaic changes have happened. Here, despite the few non-electrogenic/non-electroreceptive species, the differences are clearer. I do think that in the case of the linear models, the use of total brain volume as the independent variable is unnecessary. By regressing against total brain volume, the authors are regressing each structure partially against itself, and not surprisingly, this generates tight linear correlations. Further, this makes grade shifts (i.e. changes in relative size) less apparent. I think only brain volume minus the structure should be used and shown in all figures. This has been the standard in the field when testing for grade shifts.

      We thank the reviewer for this comment. There is much debate in the field regarding whether to use brain volume or brain volume – region of interest as the independent variable, and both are commonly used. Originally, we had looked at both and found qualitatively similar results, but only presented the ‘region x brain volume’ results in the main text for brevity. We have revised this to include the results of statistical analyses for ‘region x brain volume – region’ and the accompanying figures in the main text for both the electrosensory phenotype comparisons and the within electrosensory phenotype comparisons (broadly distributed throughout the results and figure 5—figure supplement 1, figure 5—source data 4-6, figure 7—figure supplement 1, figure 7—source data 2). All of the major findings of relative mosaic shifts between tuberous receptor taxa and non-electric taxa, between electrogenic + ampullary only and non-electric taxa for cerebellum and torus, and no mosaic shifts with electrosensory phenotype in telencephalon hold regardless of the method, and we only find minor differences between the analyses for comparisons that had p values near 0.05. These discrepancies do not change any major conclusions. However, we have kept the reporting of ‘region x total brain volume’ analyses in the main text figures to be consistent with other large comparative studies in the field and our group’s previous work (Yopak et al 2010, Sukhum et al 2018).
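
      The part-whole concern raised in point 3 can be illustrated with a small simulation. The sketch below is not the authors' analysis: it is not phylogenetically informed, the volumes are synthetic, and the function name and parameters are hypothetical. It only shows that when a region is regressed against a total that contains it, a positive correlation appears even if the region and the rest of the brain vary independently.

```python
import numpy as np

def part_whole_correlations(n_species=2000, sigma=0.5, seed=0):
    """Correlate a synthetic region with (a) total volume and (b) total minus region.

    Region and rest-of-brain volumes are drawn independently, so any
    correlation with the total arises only because the region is part of it.
    """
    rng = np.random.default_rng(seed)
    region = rng.lognormal(mean=0.0, sigma=sigma, size=n_species)
    rest = rng.lognormal(mean=0.0, sigma=sigma, size=n_species)
    total = region + rest

    # 'region x brain volume': the predictor contains the response.
    r_vs_total = np.corrcoef(np.log(region), np.log(total))[0, 1]
    # 'region x (brain volume - region)': predictor and response are disjoint.
    r_vs_rest = np.corrcoef(np.log(region), np.log(rest))[0, 1]
    return r_vs_total, r_vs_rest

r_vs_total, r_vs_rest = part_whole_correlations()
print(f"region vs total: r = {r_vs_total:.2f}")
print(f"region vs total minus region: r = {r_vs_rest:.2f}")
```

      In real data the regions do covary, so the inflation is smaller than in this extreme case, which is consistent with the response above reporting qualitatively similar results from both approaches.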

      4) Related to the previous point, the authors report significant decreases in electrogenic clades in the size of the olfactory bulb, rest of the brain and optic tectum. I think this is an artifact that results from including the cerebellum and other enlarged areas (TS and hindbrain) in the independent variable. Similarly, the authors state that they found no increase in the size of the telencephalon in electrogenic clades and that non-electric osteoglossiforms have a mosaic increase in telencephalon relative to non-electric otophysans. Again, I think this suffers from the same problem. Figure 4-figure supplement 2 actually provides some insight in this respect. When plotted against the rest of the brain, no apparent differences are found in the size of the optic tectum. In the case of the olfactory bulb, only two of the outgroup species seem to have a larger OB than all other species. Regarding the telencephalon, when plotted against RoB, all osteoglossiforms seem to have similar telencephalon size. These conclusions need to be carefully evaluated.

      We thank the reviewer for identifying this miscommunication. We have moved previous figure 4—figure supplement 2 to the main text (now figure 6) and have added the statistical analyses and discussion of this point to both the results and discussion. We have also clarified the distinction between relative and absolute shifts in region sizes throughout but see in particular lines 261-295, 307-317, 330-331, 473-499. See also our detailed response to essential revision 3.

      Reviewer #3 (Public Review):

      The authors use micro-CT scanning and sophisticated statistical techniques to compare the sizes of various major brain regions across a sample of 32 fish species, including lineages that have independently evolved passive electroreception and, in a smaller subset, the ability to generate and sense weakly electric fields. They found that most of the variation in brain region sizes is linked to variation in total brain size, indicating concerted evolution. However, the analysis also reveals that the electrogenic lineages/species have selectively enlarged the cerebellum, the midbrain torus semicircularis, and the hindbrain. These findings are interesting and usefully extend the last author's prior work on a subset of these species.

      A significant strength of the work is that it includes a relatively large number of species, makes a good attempt to understand how these species are related to one another (though the authors admit that the phylogeny is tentative), and that the analytical methods are quantitative and relatively sophisticated. It is also true that other researchers have long argued about the relative frequency and importance of concerted versus mosaic evolution. The present study is a valiant attempt to address this issue.

      However, some key results must be viewed cautiously. Most important is that the dramatic increase in the cerebellum (and torus semicircularis and hindbrain), relative to the rest of the brain, must necessarily lead to some other brain regions appearing to have decreased in size. Therefore, their absolute size may well have stayed the same or even increased in evolution; it's just that the enlarged brain regions decrease the proportions of at least some other regions. The authors mentioned this caveat in their previous paper on mormyroids (Sukhum et al., 2018), but not in the present manuscript. As a result of the problem, it is difficult to interpret the documented variation in olfactory bulb, optic tectum, or telencephalon size; is that variation "real" or just artifacts of major changes in the size of other brain regions (mainly cerebellum, torus, and hindbrain). The best way to address this problem would have been to repeat the analysis using a "reference" brain region that is thought not to vary dramatically in size across the species of interest (e.g., "rest of brain"). However, I acknowledge that this approach also has limitations. Still, the problem should be addressed somehow.

      We thank the reviewer for identifying this miscommunication. We have moved previous figure 4—figure supplement 2 to the main text (now figure 6) and have added the statistical analyses and discussion of this point to both the results and discussion. We have also clarified the distinction between relative and absolute shifts in region sizes throughout but see in particular lines 261-295, 307-317, 330-331, 473-499. See also our detailed response to essential revision 3.

      One strength of the manuscript is that it provides information about y-intercepts and slopes. Many other studies simply note increases or decreases in average volume (before or after correcting for absolute brain size). I like knowing which changes in relative brain region size are grade shifts (changes in intercept) versus changes in slope. However, the authors don't really do anything with those results. What do they mean? Are there different kinds of evo-devo mechanisms that underlie the two types of changes (slope versus intercept)?

      We thank the reviewer for this suggestion. We have added some discussion on potential mechanisms for evolutionary changes in intercept and slope (lines 543-559). Unfortunately, this topic is not well studied in fishes, which have extensive adult neurogenesis.

      On a related note, do the major brain regions vary in allometric slope within a given lineage? The realization that such differences do exist (at least in mammals and cartilaginous fishes) contributed much to the excitement around the concept of concerted evolution, since it means that evolutionary changes in absolute brain size can lead to major shifts in brain region proportions, but the authors seemingly ignore this point.

      We thank the reviewer for this suggestion. We do find variability in slope for different regions of each lineage. We reported these values (figure 5—source data 1, figure 7—source data 1) and added discussion of this point (lines 539-542).

      Finally, I must confess that some of the study's findings didn't surprise me. It is well known among fish neurobiologists that mormyrids have a dramatically enlarged cerebellum and that all electrogenic gymnotoids and mormyroids have a very large torus semicircularis and dorsal/alar hindbrain. One didn't need the fancy analytical techniques to confirm this. To be fair, however, it had not been clear whether the cerebellum is enlarged in gymnotoid electric fish and their non-electrogenic relatives (the authors report that it is). Nor was it known that the weakly electric catfishes have a larger cerebellum (not so much for the torus) than their non-electric relatives. This is new information that raises interesting questions about how the electric catfishes are using their electrosensory system (I would have liked to see some discussion of this).

      We thank the reviewer for this comment. We too agree that electric catfishes warrant further study into which species are electrogenic, whether their discharges are sporadic versus continuous, and how they are using their electrosensory systems. We have added further discussion on electric catfishes (lines 411-416, 425-437).

      On balance, I appreciate that the authors have provided a large and useful data set, which they used to address an interesting set of questions about how brain evolution "works." I'm just disappointed that, for me, there are relatively few significant, novel insights. For example, the notion that "selection can impact structural brain composition to favor specific regions involved in novel behaviors" (last sentence of the abstract) is one that I've accepted for a long time. Maybe the conclusion can be made more interesting by focusing more explicitly on changes in the size of major brain regions versus smaller cell groups (where mosaic evolution is widely accepted).

      We thank the reviewer for this suggestion. We agree that mosaic evolution is more readily detected in smaller subregions/ nuclei/ circuits and is found less so at the scale of major brain regions. We have adjusted the text throughout to further highlight this distinction, but see in particular lines 42-48, 500-528.

      Reviewer #4 (Public Review):

      The authors present a detailed and thorough comparative analysis of brain composition across 3 different lineages of weakly electric fish, and several non-electric fishes. The goal of this comparison was to determine whether the evolution of electrosensory systems is associated with common changes in brain composition across the three lineages. Several aspects of this research are highly novel, such as the use of μ-CT imaging and phylogeny-informed multivariate statistics. Overall, the authors show that cerebellar enlargement is key to the evolution of electrosensory systems of all three groups and the enlargement of the hindbrain and torus semicircularis varies depending on the types of electroreceptors and electrical signals produced. This is one of very few examples in evolutionary neuroscience of convergent evolution of brain anatomy and behaviour and sets the stage for future research on other sensory specialists and clades.

      Strengths

      The comprehensive analysis provided by Schumacher and Carlson has several strengths. First, the use of μ-CT scans to derive neuroanatomical measurements in fish is relatively novel and the detailed descriptions of brain region borders were greatly appreciated. Few papers that focus on comparative neuroanatomy put this degree of effort into describing how regions were differentiated and defined, but the level of detail provided here will allow other researchers to acquire data in an identical manner and is therefore an important resource.

      Second, the statistical analysis is phylogeny-informed and uses an array of approaches. Too many neurobiology papers either avoid phylogeny-informed statistics or execute them poorly. This paper is neither of those and should serve as a template for future studies in the field.

      Third, the inclusion of some recording data for Synodontis is an important contribution. I am not an expert on weakly electric fish, but I do know that the catfish are understudied compared with gymnotiforms and mormyroids. Hopefully, this will result in some well-deserved attention to the diversity of catfishes.

      Fourth, I found the manuscript as a whole well written and presented. In particular, the authors provided a novel way of incorporating additional statistical information into Figures 3 and 4.

      Last, the supplemental video was a great addition to the data presented.

      Weaknesses

      First, the Introduction was a bit brief for readers unfamiliar with weakly electric fishes. It would be helpful to provide a bit more information to a general audience. Including a figure depicting the phylogenetic relationships among some (not all) bony fish clades to illustrate the independent evolution of electrosensory systems across the three clades would be particularly helpful in this regard.

      We thank the reviewer for this comment. We have included more background on the evolution of electrosensory systems in actinopterygians and included a figure showing this (lines 76-83, figure 1).

      Second, I think it is important to determine if the principal component analysis changes if the volumetric data is scaled. One issue that can affect multivariate analyses is including variables that differ greatly in scale. For example, if one brain region varies between 0.5-1.2 mm³, but another varies from 10-50 mm³ across species, that difference in scale can sometimes affect the PCA. I suggest checking that the analyses are broadly the same if the volumetric data is scaled (e.g., converting to z-scores).

      We thank the reviewer for this suggestion. We z-score normalized the regions and repeated the pPCA and found nearly identical results (lines 175-177, figure 4—figure supplement 1).
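
      The scaling check suggested here can be sketched as follows. This is an ordinary PCA on synthetic data, not the phylogenetic PCA (pPCA) used in the manuscript, and the function names are hypothetical. The two synthetic regions use the ranges from the reviewer's example (0.5-1.2 mm³ and 10-50 mm³) and are drawn independently, so neither should dominate once the scale difference is removed.

```python
import numpy as np

def zscore(X):
    """Center each column and scale it to unit sample standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

rng = np.random.default_rng(1)
small_region = rng.uniform(0.5, 1.2, size=500)    # mm^3, synthetic
large_region = rng.uniform(10.0, 50.0, size=500)  # mm^3, synthetic
volumes = np.column_stack([small_region, large_region])

raw_ratio = explained_variance_ratio(volumes)
scaled_ratio = explained_variance_ratio(zscore(volumes))
print("unscaled:", raw_ratio.round(3))
print("z-scored:", scaled_ratio.round(3))
```

      On unscaled data the first component is driven almost entirely by the larger region. After z-scoring, both regions contribute comparably, so agreement between the two analyses suggests the result is not an artifact of scale.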

      Third, is there any information regarding malapterurid catfish? Are they similar enough to Synodontis or could they exhibit yet another brain type from that discussed in this study? The reason I ask is that the authors raise the issue of Torpedo, but do not discuss other strongly electric fish like Malapterurus (which is a siluriform related to Synodontis).

      We thank the reviewer for this comment. We too agree that they would be worthwhile species to add. Unfortunately, there is no data available on malapterurid catfish, and we were unable to sample any. We have added discussion of this point to lines 411-416.

      Last, some of the graphs in the supplemental material are too small with datapoints too crowded to effectively read them. Larger graphs would enable a more effective evaluation of how the various clades differ from one another.

      We thank the reviewer for this comment. We enlarged the region x region plots and plotted species means instead to make it easier to visualize these data (Figure 6, figure 7—figure supplement 2-4).

    1. Not at this time. You know, we believe that the way we collect images is just like any other search engine. And you know, this is stuff in the public domain. And for the purposes that it’s being used for I think, they can be very pro-social. I don't think we want to live in a world where any big tech company can send a cease and desist, and then control, you know, the public square. So, I think it's an issue that is really important because the issue of collecting publicly available online data is not just images, any kind of data. It affects researchers who may be, you know, studying things like discrimination or studying other things like misinformation, and it affects academics and a whole wide range of other types of use cases as well.

      the companies that have asked Clearview to delete these images, has Clearview done so?

      • Didn't delete anything
      • He thinks they collect data just like any other search engine
      • He believes the purposes the data is used for can be pro-social
      • He doesn't want big tech companies to control public data by just sending a cease and desist and then controlling everything
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Unlike other cell organelles, mitochondria contain a small fraction of their genetic information. However, most of the genetic information about mitochondrial proteins is still in the cell's nucleus and the localization of the respective proteins to mitochondria is facilitated by localized translation of their mRNAs. In turn, the mRNA localization to the mitochondria is partly due to the co-translational association, via the mitochondrial target sequence (MTS) of the nascent peptide.

      The manuscript "Mitochondrial mRNA localization is governed by translation kinetics and spatial transport" investigates the mechanisms of mRNA transport and attachment to mitochondria. Concerning mitochondria-localized mRNAs, two types of mRNAs have been distinguished before: mRNAs that are always attached to the mitochondrion (called "constitutively binding" by the authors) and mRNAs that become "sticky" only under certain conditions (called "conditionally binding" by the authors). Modeling the corresponding cellular processes biophysically, the authors infer that yeast cells exercise control over the localization of mRNA (and consequently over their metabolism) in two ways: via varying the mitochondrial volume fraction, and via varying the speed of translation elongation. Data from previously published genome-wide measurements of mRNAs that localize constitutively and conditionally via their MTS in budding yeast S. cerevisiae were used to investigate these mechanisms.

      The manuscript is very well written and the analysis is of high quality. It starts with an introduction that thoroughly reviews many facets around the conducted research and briefly, but self-consistently, summarizes the current knowledge regarding mitochondrial localization of mRNAs. Next, the consequences of the modeling work (presented in the "methods"-section) are explored in the "Results"-section, which contains meaningful and instructive figures and explanations. The manuscript concludes with a comprehensive evaluation of the consequences of the conducted research. All in all, there are only very few minor changes that could be considered.

      Content-wise, we suggest:

      The modeling of translation kinetics is pretty coarse-grained, using only an average elongation rate per amino acid. Much work in this field was done using totally asymmetric simple exclusion process (TASEP)-based models (e.g. MacDonald, J.H. Gibbs, A.C. Pipkin: Kinetics of biopolymerization on nucleic acid templates; Duc, Saleem, Song: Theoretical analysis of the distribution of isolated particles in totally asymmetric exclusion processes: Application to mRNA translation rate estimation). Perhaps this work can be mentioned, and furthermore, the consequences of inhomogeneity of elongation rate for different codons and amino acids could be explored or at least discussed. In particular, this could shed light on the question of whether ribosome interference and tRNA charging times have any impact on mitochondrial mRNA localization.

      Thank you to the reviewer for pointing us to these relevant papers. As suggested, we have added a paragraph to our Discussion that mentions this work and discusses the possible implications of inhomogeneous elongation along mRNA sequences. We find this suggestion (and the similar one made by the other reviewer) to explore inhomogeneous elongation particularly encouraging, because we are in the early stages of actively pursuing such work. We feel that beyond discussion, exploring the consequences of inhomogeneous elongation is beyond the scope of this work because significant further experimental work would be needed to quantify the impact of specific sequences on translation progress.

      To our Discussion, we have added the following paragraph.

      "In this work our quantitative model assumed uniform ribosome elongation rates along mRNA transcripts. In the presence of ribosome interactions, such dynamics can lead to both uniform and non-uniform ribosome densities and effective elongation rates along the transcript (MacDonald et al., 1968; Duc et al., 2018). With these uniform ribosome elongation rates, previous theoretical results suggest that collisions will be rare (Duc et al., 2018). However, elongation may not be homogeneous along an mRNA transcript, due to factors such as tRNA availability (Varenne et al., 1984), boundaries between protein regions (Thanaraj and Argos, 1996), amino acid charge (Charneski and Hurst, 2013), and short peptide sequences related to ribosome stalling (Sabi and Tuller, 2017). We have found that slow (homogeneous) elongation facilitates mitochondrial mRNA localization, by providing time for MTS maturation, diffusive search, and to maintain binding-competent MTS-mediated mRNA binding to mitochondria. We expect that inhomogeneities in elongation rate along mRNA could either enhance or reduce mitochondrial mRNA localization, controlled by whether slower elongation is in regions that favor longer MTS exposure. For example, a ribosome stall site following full MTS translation could provide more time for MTS maturation and facilitate mitochondrial localization. Future experimental work could identify such stalling sequences and point towards how modeling can improve understanding of sequence impact on localization."
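
      For readers unfamiliar with the TASEP picture referred to above, a minimal sketch follows. It is not the authors' model: each ribosome occupies a single lattice site, all rates are uniform and in arbitrary units, and the function name, parameters, and default values are hypothetical. It only illustrates that when initiation is slow relative to elongation, the lattice stays sparsely occupied, so ribosomes rarely block one another.

```python
import random

def simulate_tasep(n_sites=50, k_init=0.1, k_elong=1.0, k_term=1.0,
                   n_steps=200_000, seed=0):
    """Random-sequential-update TASEP with open boundaries.

    Returns the mean fraction of occupied sites, averaged over the second
    half of the run (the first half is discarded as burn-in).
    """
    rng = random.Random(seed)
    lattice = [0] * n_sites
    k_max = max(k_init, k_elong, k_term)
    n_bound = 0            # ribosomes currently on the transcript
    density_sum = 0.0
    n_samples = 0
    burn_in = n_steps // 2
    for step in range(n_steps):
        i = rng.randrange(-1, n_sites)   # -1 stands for an initiation attempt
        accept = rng.random()
        if i == -1:
            # Initiation: load a ribosome if the first site is free.
            if lattice[0] == 0 and accept < k_init / k_max:
                lattice[0] = 1
                n_bound += 1
        elif i == n_sites - 1:
            # Termination: release the ribosome at the last site.
            if lattice[i] == 1 and accept < k_term / k_max:
                lattice[i] = 0
                n_bound -= 1
        elif lattice[i] == 1 and lattice[i + 1] == 0 and accept < k_elong / k_max:
            # Elongation: hop forward only if the next site is empty (exclusion).
            lattice[i], lattice[i + 1] = 0, 1
        if step >= burn_in:
            density_sum += n_bound / n_sites
            n_samples += 1
    return density_sum / n_samples

sparse = simulate_tasep(k_init=0.1)
crowded = simulate_tasep(k_init=0.9)
print(f"slow initiation: mean density {sparse:.2f}")
print(f"fast initiation: mean density {crowded:.2f}")
```

      Inhomogeneous elongation, as discussed in the paragraph above, could be explored in such a sketch by replacing the single k_elong with a per-site list of rates.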

      Ribosome occupancy data from Arava were used to infer translation parameters. But there are more recent data sets based on ribosome profiling. Any reason for not using the more recent data?

      We thank the reviewer for bringing up this important point. Our text describing the origin of data for ribosome occupancy in the inset of Figure 2A lacked a citation to the dataset used, and we agree that more recent ribosome occupancy datasets are more appropriate. For the cumulative distributions of ribosome occupancy shown in the inset of Figure 2A, we used the ribosome occupancy data from Zid and O'Shea from 2014. The Arava data from 2003 was used for the cumulative distributions of Figure S1, to show that the similarity between conditional and constitutive genes in the inset of Figure 2A was present in more than a single dataset.

      We have clarified the origin of the ribosome occupancy data in the text.

      In the text description of the inset of Figure 2A, we now include a direct citation of Zid and O'Shea from 2014.

      "These measurements (Zid and O'Shea, 2014) indicate that conditional and constitutive genes have similar distributions of ribosome occupancy (Fig. 2A, inset; see Fig. S1 for similar distributions of conditional and constitutive gene ribosome occupancy derived from (Arava et al., 2003))."

      We also added a citation of Zid and O'Shea to the caption describing the inset of Figure 2A.

      "Inset is cumulative distribution of ribosome occupancy (Zid and O’Shea, 2014), showing ribosome occupancy and β have similar distributions."

      To determine the translation parameters in our quantitative model, we applied the datasets of Couvillion et al from 2016 for relative protein per mRNA measurements and Zid and O'Shea from 2014 for ribosome occupancy measurements, combined with individual measurements from Morgenstern et al from 2017 and Riba et al from 2019. How these datasets and measurements are used is described in the Methods subsection “Calculation of translation rates”. In addition to the citations in the methods, we have added citations to the briefer description in the Results section.

      "Using protein per mRNA and ribosome occupancy data (Couvillion et al., 2016; Morgenstern et al., 2017; Zid and O’Shea, 2014; Riba et al., 2019), we estimated the gene specific initiation rate kinit and elongation rate kelong for 52 conditional and 70 constitutive genes (see Methods)."

      The effect of the mitochondrial volume fraction on mRNA localization is investigated with a diffusive model. However, the authors make a two dimensional Ansatz for the cell and mitochondrion while it would seem more natural to assume diffusion in three spatial dimensions, as the cell and mitochondria are both three dimensional objects and diffusion strongly depends on the number of dimensions it occurs in. Why was that Ansatz made and why is it justified?

      Our diffusion model is in fact three-dimensional, rather than two dimensional. Specifically, we treat the search process as occurring in a three-dimensional cylinder, whose cross-section is shown in Figure 1D. We have added to Figure 1D to further describe how three-dimensional cylinders represent the mitochondrial proximity in the cell.

      In the Results, we now write:

      “Specifically, we treat the geometry as a sequence of concentric three-dimensional cylinders, each representing an effective region surrounding a tubule of the mitochondrial network. Figure 1D shows a two-dimensional cross-sectional view of these cylinders. The innermost cylinder represents a mitochondrial tubule…”

      We have also clarified the caption of Figure 1D to include:

      "Schematic of mRNA diffusion in spatial model, shown in cross-section. The cytoplasmic space is treated as a cylinder centered on a mitochondrial cylinder: the three dimensional volume extends along the cylinder axis (not shown)."
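
      The concentric-cylinder geometry described above links the mitochondrial volume fraction to the cylinder radii. The sketch below assumes two coaxial cylinders of equal length, so the fraction reduces to a ratio of squared radii. The function names and example numbers are hypothetical, and the authors' exact parameterization is not given in this response.

```python
import math

def tubule_volume_fraction(r_mito_um, r_outer_um):
    """Volume fraction of a tubule inside a coaxial cylinder of equal length.

    The shared length cancels, leaving (r_mito / r_outer) ** 2.
    """
    if not 0.0 < r_mito_um <= r_outer_um:
        raise ValueError("require 0 < r_mito_um <= r_outer_um")
    return (r_mito_um / r_outer_um) ** 2

def outer_radius(r_mito_um, volume_fraction):
    """Outer cylinder radius at which the tubule fills the given volume fraction."""
    if not 0.0 < volume_fraction <= 1.0:
        raise ValueError("volume_fraction must be in (0, 1]")
    return r_mito_um / math.sqrt(volume_fraction)

# Hypothetical example: a 0.15 um radius tubule filling 4% of the local volume.
r_outer = outer_radius(0.15, 0.04)
print(f"outer radius: {r_outer:.2f} um")
```

      In this picture, a larger mitochondrial volume fraction corresponds to a smaller outer radius, and therefore a shorter radial distance for a diffusing mRNA to cover before reaching the tubule.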

      The range of variability in the localized fraction +/- CHX is smaller in the experiment compared to the model (Fig. 4B, C). What could be the rationale?

      We agree that the variability in localized fraction from applying CHX is smaller in the experiment (Figure 4C) in comparison to the model (Figure 4B). Our model uses translation parameters (initiation and elongation rates) that are derived from experimental measurements that are expected to be quite noisy. We expect that this noise in the model parameters will expand the range of localization changes predicted by the model for CHX application.

      In l. 417, the authors remark that "constitutively localized mRNAs are on average longer [...] than conditionally localized mRNAs." Yet constitutively localized mRNAs seem to have higher localized fraction than conditionally localized mRNAs. This is somewhat surprising. While it's clear that a higher diffusivity would be compatible with a faster response time of shorter, conditionally-localized mRNAs, it is not clear how the longer, less diffusive mRNAs would have a higher localization fraction. Perhaps the authors can clarify this point.

      The reviewer is correct that experimental measurements show that constitutively-localized genes are, on average, longer than conditionally-localized genes. In our quantitative model, we assume the mRNA of all genes have the same diffusivity. We have used the same diffusivity for different genes because experimental measurements suggest that mRNA length and the number of translating ribosomes on an mRNA do not substantially impact mRNA diffusivity. In our Methods section, we have added citations to papers indicating lack of dependence of mRNA diffusivity on mRNA length.

      "Simulated mRNA have a diffusivity of 0.1 μm²/s. This diffusivity remains constant across genes and mRNA states, consistent with experimental measurements showing little dependence of mRNA diffusivity on mRNA length (Calderwood et al., 2016) or number of translating ribosomes (Wang et al., 2016)."

      We have additionally clarified the part of our Discussion where we explain the distinction of our results from proposals based on differential mRNA diffusion speed.

      "Lower occupancy was proposed to drive mRNA localization through increased mRNA mobility of a poorly loaded mRNA (Poulsen et al., 2019), as more mobile mRNA could more quickly find mitochondria when binding competent, increasing the localization of these mRNA. By contrast, our results imply an alternate prediction – that translational kinetics lead to enhanced localization of longer mRNAs, due to the increased number of loaded ribosomes bearing a binding-competent MTS. Indeed, constitutively localized mRNAs are on average longer than conditionally localized mRNAs."

      Minor formal changes would be:

      Setting the expressions of the fraction in the binding-competent state in l. 118 and the fraction of the mRNA-accessible volume in l. 123 in normal math-environments instead of the inline-environment since they are of key importance to the following discussion.

      These two equations (now equations (1) and (2)) are set as distinct equations that are now referred to by their equation numbers later in the manuscript.

      l. 414 contains the verb "vary" twice

      We thank the reviewer for pointing out this redundancy. The sentence now reads:

      "Translation kinetics can widely vary between genes ... "

      l. 438 lacks an "h" in the word mitochondria

      We thank the reviewer for pointing this out; the spelling error has been corrected. The sentence now reads "all mRNA transcripts studied would be highly localized to mitochondria in all conditions."

      Reviewer #1 (Significance (Required)):

      All in all, this is a strong manuscript that contains solid, simple but meaningful and by no means oversimplified models with impactful consequences on the understanding of mitochondrial mRNA localization. Furthermore, it is likely that the approach applies to other cellular compartments like the ER. The research is explained in a remarkably clear and focussed style which makes it easy to follow and meanwhile succeeds in not omitting any details.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Arceo et al. have developed a stochastic, quantitative model of mitochondrial targeting sequence (MTS)-mediated mRNA localization to mitochondria in yeast. They use this model to investigate the role of translation- and diffusion kinetics in controlling mitochondrial mRNA localization of conditional as well as constitutional genes.

      Most importantly, they find that neither mRNA diffusivity nor ribosome density alone is sufficient to account for the differences in localization that were experimentally observed for the two types of genes. Therefore, they implement an MTS maturation time into their model and find that they can now predict gene-specific localization rates. Based on these observations, the authors conclude that yeast cells can regulate the localization of mRNAs to mitochondria through (controlling mitochondrial volume fractions and) differences in translation kinetics, which adjust the exposure time and numbers of mature MTSs that are presented on the mRNP and convey binding-competence.

      Major comments:

      Overall, the manuscript is well written and the conclusions are convincing. The underlying assumptions of the model make sense, but I have no background in modelling and can therefore only comment on the RNA biology aspects and general comprehensibility of the work.

      • The authors calculate gene-specific translation initiation and elongation rates to model localization on different transcript classes. In this context,

      (i) They use a single decay rate to estimate trajectory lifetime and this decay rate is such (1 nt / 600 s) that it would take the average yeast mRNA (~ 1400 nt; Smith et al., JCB, 2015) 10 days to be turned over. This is not consistent with physiological decay rates and as a consequence, they are essentially not accounting for mRNA turnover. This should be explained in the Methods.

      The reviewer has highlighted a lack of clarity in our model description. The mRNA decay rate in the model is (1/600) inverse seconds per entire mRNA molecule, rather than (1/600) inverse seconds per nucleotide. This leads the typical mRNA lifetime to be 600 seconds. The sentence in the Methods section describing the decay timescale now reads "The mRNA decay rate is set to k_decay = 0.0017 s⁻¹ per mRNA molecule, such that the typical decay time for an mRNA molecule is 600 s. This decay time is consistent with measured average yeast mRNA decay times ranging from 4.8 minutes (Chan et al., 2018) to 22 minutes (Chia and McLaughlin, 1979)."
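
      The difference between the two readings of the decay rate can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the 1400 nt transcript length is the average quoted in the reviewer's comment.

```python
# Illustrative arithmetic only; names are hypothetical, not taken from the manuscript.
K_DECAY_PER_S = 1.0 / 600.0   # decay rate per mRNA molecule (s^-1), about 0.0017
MEAN_LENGTH_NT = 1400         # average yeast mRNA length quoted by the reviewer

# Per-molecule reading (the one used in the model): mean lifetime = 1 / k.
lifetime_per_molecule_s = 1.0 / K_DECAY_PER_S

# Per-nucleotide misreading: 600 s for each nucleotide of the transcript.
lifetime_per_nt_reading_s = 600.0 * MEAN_LENGTH_NT
lifetime_per_nt_reading_days = lifetime_per_nt_reading_s / 86400.0

print(f"per-molecule reading:   {lifetime_per_molecule_s:.0f} s")
print(f"per-nucleotide reading: {lifetime_per_nt_reading_days:.1f} days")
```

      Under the per-molecule reading the lifetime is 600 s (10 minutes), which lies between the 4.8-minute and 22-minute measurements cited above; the per-nucleotide reading gives the roughly 10-day figure that prompted the comment.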

      (ii) Translation and decay are intrinsically linked and translation machinery also recruits decay enzymes. What is more, decay rates differ greatly for different mRNA transcripts. I cannot judge how feasible this is, but it might benefit the model if variable decay rates (i.e. modelled based on translation efficiency?) could be included.

      We appreciate this suggestion from the reviewer. We have added a supplemental figure (Figure S4) to explore how mRNA decay rate can impact mitochondrial localization of mRNA. While longer decay rates have little impact on localization, if the decay rate is sufficiently high, the mRNA will have limited opportunity for translation to initiate and a binding-competent MTS to develop, substantially reducing localization. This analysis does not consider how the mRNA lifetime might be coupled with translational effects (such as ribosome stalling). Accounting for the impact of such more complex decay mechanisms would require substantial expansion of the model and extensive additional experiments to parameterize the coupling effects; we believe this extension would be beyond the scope of this manuscript.

      To our Discussion, we have added

      "While we have focused on how variation in translational kinetics between genes can impact mitochondrial mRNA localization, there is also significant variation in mRNA decay timescales (Chia and McLaughlin, 1979; Chan et al., 2018). Our model suggests (see Fig. S4) that the mRNA decay timescale has a limited effect on mitochondrial mRNA localization, unless the decay time is sufficiently short to compete with the timescale for a newly-synthesized mRNA to first gain binding competence. We leave specific factors thought to modulate mRNA decay, such as ribosome stalling (Mishima et al., 2022), as a topic of future study."

      (iii) Along the same lines: Rare codons as well as specific stalling sequences, are known to slow down translation elongation on many transcripts (and will effectively increase MTS exposure time). Can the authors identify transcripts with such signal sequences (on a global scale, apart from TIM50) and incorporate in their model?

      We find this suggestion (and the similar one made by the other reviewer) to explore stalling sequences particularly encouraging, because we are in the early stages of actively pursuing such work. We feel that beyond discussion, exploring the consequences of inhomogeneous elongation is beyond the scope of this work because significant further experimental work would be needed to quantify the impact of specific sequences on translation progress.

      To our Discussion, we have added the following paragraph.

      "In this work our quantitative model has applied uniform ribosome elongation rates along mRNA transcripts, which with ribosome interactions can lead to both uniform and non-uniform ribosome densities and effective elongation rates along the transcript (MacDonald et al., 1968; Duc et al., 2018). With these uniform ribosome elongation rates, previous theoretical results suggest that collisions will be rare (Duc et al., 2018). However, elongation may not be homogeneous along an mRNA transcript, due to factors such as tRNA availability (Varenne et al., 1984), boundaries between protein regions (Thanaraj and Argos, 1996), amino acid charge (Charneski and Hurst, 2013), and short peptide sequences related to ribosome stalling (Sabi and Tuller, 2017). We have found that slow (homogeneous) elongation facilitates mitochondrial mRNA localization, by providing time for MTS maturation and diffusive search, and by maintaining binding-competent, MTS-mediated mRNA binding to mitochondria. We expect that inhomogeneities in elongation rate along mRNA could either enhance or reduce mitochondrial mRNA localization, controlled by whether slower elongation is in regions that favor longer MTS exposure. For example, a ribosome stall site after the MTS is fully translated could provide more time for MTS maturation and facilitate mitochondrial localization. Future experimental work could identify such stalling sequences and point towards how modeling can improve understanding of sequence impact on localization."

      • Reduced mature MTS exposure time is presented as one of the determining factors that regulate mitochondrial localization of conditionally localized transcripts. For my background, the underlying mechanisms that determine MTS maturation are insufficiently explained. I understand how chaperone recruitment can contribute to MTS maturation. However, it is not obvious to me how receptor binding would account for such long maturation times as the 40 s used here (Fig. 3, 4). I would appreciate if the authors could elaborate and possibly point to directions that their model could be used to study those.

      We agree with the reviewer that the diffusive search time for a chaperone to find a newly-synthesized MTS would be very short (a small fraction of the proposed 40-second MTS maturation time), and we expect that this maturation period is largely controlled by chaperone and co-chaperone interaction timescales. There is a wide range of timescales for newly-synthesized (or misfolded) proteins to productively interact with a chaperone, and the literature provides examples of timescales comparable to 40 seconds, which we now cite.

      To our Discussion, we have added

      "While the diffusive search for a newly-synthesized MTS by chaperones is expected to be very fast ( 100 seconds for human chaperone-mediated folding (Wu et al., 2020)."

      We feel that modeling chaperone facilitation of MTS folding, to determine the timescale of this process, is very distinct from the topics covered in our manuscript, and thus beyond the scope of this work.

      • One of the two main conclusions (at least according to the abstract) from the work is that yeast cells modulate mitochondrial volume fractions to regulate mRNA localization to mitochondria. This is a fact, not a novel finding. The other main conclusion, which is that cells use different translation dynamics to control mRNA localization, is intriguing and deserves more attention. It would be great if the authors could suggest/discuss an experimental approach (i.e. a single mRNA imaging experiment quantifying mitochondrial co-localization and translation kinetics of different reporter constructs) to test this hypothesis.

      We appreciate the reviewer raising the point that yeast cells modulate mitochondrial volume fraction to regulate mitochondrial mRNA localization. While we previously showed this relationship between mitochondrial volume fraction and localization, we used experimental techniques (mutations, nutrient sources) that changed many other factors beyond mitochondrial volume fraction. In this work we have used a quantitative model, lacking those extraneous factors, to demonstrate that a change to mitochondrial volume fraction alone can lead to a change in mitochondrial mRNA localization. This work supports our interpretation of those previous experimental results.

      To our Discussion we have added the sentence

      "Previous experimental work suggested that changing mitochondrial volume fraction could control mitochondrial mRNA localization (Tsuboi et al., 2020) --- our quantitative modeling work provides further support for this mechanism of regulating mRNA localization."

      The reviewer also requests a discussion of an experimental approach to test how cells use translational dynamics to control mRNA localization. With the advent of combined mRNA imaging and live translational imaging it would be interesting to directly measure translation in live cells to correlate localization with a time delay. Unfortunately there are currently no published live translational imaging studies in yeast, and thus such a measurement would require the development of the technique in yeast.

      To our Discussion, we have added

      "Experimentally testing our proposal for translation-controlled localization would involve using combined mRNA and live translational imaging (as yet undeveloped in yeast), to directly measure translation and correlate localization with a time delay, presenting a fruitful pathway for future study."

      Minor comments:

      • Figure 1: X axis labels between panel E and F are not consistent. Inset in panel F is mainly and first discussed in text. Please do not show data as tiny inset but as separate panel.

      We have changed the axis label of Figure 1E to match the axis label of Figure 1G (previously Figure 1F). The inset of the old Figure 1F is now the new Figure 1F, and the old Figure 1F is now the new Figure 1G. We have adjusted the Figure 1 caption and the text description of Figure 1 to match these changes.

      Elongation rates of 250 aa per second are not physiological. In mammalian cells elongation has been quantified to proceed between 1 and app. 20 aa per second (Wang et al, 2016; Wu et al., 2016; Yan et al., 2016; Morisaki et al., 2016).

      The reviewer is correct that the elongation rates of 50/s and 250/s are too large to be physiological. These large values have been deliberately selected to probe the nonequilibrium behavior of the quantitative model and to test the prediction of the simpler four-state model, rather than to represent physiological behavior.

      To the text in the Results section discussing Figure 1F, we have added the following sentence.

      "We include unphysiologically high elongation rates to compare to the expected behavior from the 4-state model."

      Panel E: elongation rate range does not match Fig 1F nor median in Fig 3A.

      The reviewer is correct that the elongation rate parameter range of Figure 1E does not match the elongation rates of Figure 1F or the median in Figure 3A. In Figure 1E, we aimed to show that the physiological range of translation parameters can produce a wide range of both MTSs per mRNA and mRNA binding competence for mitochondria.

      We have expanded the description of Figure 1E in the text.

      "By exploring the physiological range of translation parameters, many orders of magnitude of the mean number of translated MTSs per mRNA (β, see Eq. 5) are covered, which also covers the full range of mRNA binding competence (Fig.1E). We find that, for any set of physiological translation parameters, the number of binding-competent MTS sequences (β) is predictive of the fraction of time (fs) that each mRNA spends in the binding competent state (Fig.1E)."

      • Figure 2A and S1: Please explain how ribosome occupancy is defined here and why it is so different between figures

      We have inserted a citation to Zid and O’Shea (2014) to make clear that the ribosome occupancy measurements in Figure 2A (Zid and O’Shea) and Figure S1 (Arava et al.) come from two different techniques. Zid and O’Shea used ribosome profiling to obtain a relative, rather than absolute, measurement. Arava et al. instead fractionated mRNAs based on the absolute number of ribosomes loaded across 14 fractions of a sucrose gradient, and measured the relative amount of mRNA in each fraction by microarray. So while ribosome occupancy was calculated in a very distinct manner in each paper, the comparison between conditionally and constitutively localized mRNAs shows a very similar trend, without significant differences in ribosome occupancy between these two classes of mRNAs with either measurement.

      To the caption of Figure S1, we have added

      "These ribosome occupancy values cover a distinct range, in comparison to those of Fig. 2A, due to distinct experimental measurement techniques."

      • Figure 2C: please show experimental data along with model prediction (in the same graph) so that conclusion becomes immediately apparent from figure not just main text. Label clearly (in figure) when experimental and when model data is shown (maybe by using consistent color scheme?)

      We have added experimental data to Figure 2C. Throughout the manuscript, we have kept a consistent color scheme for data for mitochondrial localization for ATP3, TIM50, conditional, and constitutive mRNA, whether from model or experimental data. We have applied distinct line types (e.g. solid for model vs. dot-dashed with circles for experimental).

      • Figure 4B and C: clearly indicate in figure which are experimental and which are modelled data

      In Figures 4B and 4C, we have clarified which data is experimental and which is modeled by adding to the labels for each violin plot. Violin plot labels for model data now read "Model Conditional" or "Model Constitutive" and labels for experimental data now read "Expt Conditional" or "Expt Constitutive".

      • Figure 4D: show experimental vs. model data in same graph (at same axis scaling) for comparability

      We have added the experimental data, previously in the inset of Figure 4D, to the main part of Figure 4D.

      • Line 305: "constitutive" mRNA

      Thank you to the reviewer for pointing out this redundancy, the sentence now reads

      "Figure 3C shows how the localization for the prototypical conditional and constitutive mRNA varies with the maturation time."

      • Line 334: "other changes, such as diffusivity, are unable to separate the two gene groups" - what other changes? The authors only show diffusivity (Fig S3).

      Thank you to the reviewer for pointing this out. We have revised this sentence to only refer to diffusivity changes.

      "While introduction of this maturation time distinguishes the mitochondrial localization of conditional and constitutive gene groups (Fig. 4A vs Fig. 2B), changes to diffusivity are unable to separate the two gene groups (Fig. S3)."

      • Line 403-405: maybe useful to argue against lower ribosome occupancies as drivers of nascent chain complex mobilities: Wang et al., Cell, 2016; single translation site imaging experiments indicating that ribosome occupancy is not the main determinant of mRNP mobility.

      We thank the reviewer for the direction to this paper, which indeed indicates that ribosome occupancy has limited impact on mRNA diffusivity.

      We now cite this paper in our Methods section.

      "Simulated mRNA have a diffusivity of 0.1 μm²/s. This diffusivity remains constant across genes and mRNA states, consistent with experimental measurements showing little dependence of mRNA diffusivity on mRNA length (Calderwood et al., 2016) or number of translating ribosomes (Wang et al., 2016)."

      • Line 601-607: include experimental references to explain how measures (25 nm vs 250 nm) were determined/selected.

      The reviewer raises a valuable point, as it is important to motivate these lengthscales used in the model.

      Microscopy with visible light has a lateral resolution limit of approximately 250 nm, often known as the Abbe limit. Accordingly, we assume that mRNA within 250 nm of mitochondria will be measured as adjacent to mitochondria. To the Methods section, we now include a short explanation and a citation.

      Unlike the 250-nm diffraction limit, there is no widely-used reaction range for mRNA binding to intracellular substrates, nor a measurement of the required proximity for an MTS-bearing mRNA to bind to mitochondria. We estimate the 25-nm distance for mRNA binding to mitochondria from the following contributions:

      • The yeast ribosome is 25 - 28 nm in diameter, or 13 - 14 nm in radius.
      • Yeast MTSs have a length of up to 70 amino acids, with 20 estimated yeast MTS lengths having a mean of 31 amino acids. The MTS forms an amphipathic helix (an alpha helix), which has a pitch of 0.54 nm and 3.6 amino acids per turn, so the 31 amino acids will be approximately 5 nm long.
      • The MTS will be attached to the ribosome/mRNA by other peptide regions, expected to typically be a few nanometers in length.

      So overall we estimate a 25 nm range for an MTS-bearing mRNA to bind to mitochondria.

      To our methods, we have added this reasoning and accompanying citations.

      "We estimate the 25-nm binding distance by combining several contributions. The yeast ribosome has a radius of 13 - 14 nm (Verschoor et al., 1998). The MTS region, up to 70 amino acids long, forms an amphipathic helix (Bacman et al., 2020), a form of alpha helix. With an alpha helical pitch of 0.54 nm and 3.6 amino acids per turn, a 31 amino acid MTS (the mean of 20 yeast MTS lengths (Dong et al., 2021)) is approximately 5 nm in length. An additional few nanometers of other peptide regions bridging the MTS to the ribosome provides an estimate of 25 nm for the range of an MTS-bearing mRNA to bind mitochondria. The 250-nm imaging distance is based on the Abbe limit to resolution with visible light (Georgiades et al., 2016)."
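
      The arithmetic behind this estimate can be sketched as follows. This is a minimal illustration rather than part of the manuscript: the function and constant names are hypothetical, and the 5 nm linker length is an assumed stand-in for "a few nanometers".

```python
# Back-of-the-envelope check of the binding-range estimate.
# All names are hypothetical; the numbers are those quoted in the response,
# except LINKER_NM, which is an ASSUMED value for "a few nanometers".
HELIX_PITCH_NM = 0.54            # rise per turn of an alpha helix
RESIDUES_PER_TURN = 3.6          # amino acids per helical turn
RIBOSOME_RADIUS_NM = (13.0, 14.0)
MEAN_MTS_LENGTH_AA = 31          # mean of 20 yeast MTS lengths
MAX_MTS_LENGTH_AA = 70           # upper bound on yeast MTS length
LINKER_NM = 5.0                  # ASSUMPTION: peptide bridging MTS and ribosome


def helix_length_nm(n_residues: int) -> float:
    """Length of an ideal alpha helix with the given number of residues."""
    return n_residues / RESIDUES_PER_TURN * HELIX_PITCH_NM


mean_mts_nm = helix_length_nm(MEAN_MTS_LENGTH_AA)
max_mts_nm = helix_length_nm(MAX_MTS_LENGTH_AA)

# Smallest and largest totals: ribosome radius + MTS helix + linker.
low_estimate_nm = RIBOSOME_RADIUS_NM[0] + mean_mts_nm + LINKER_NM
high_estimate_nm = RIBOSOME_RADIUS_NM[1] + max_mts_nm + LINKER_NM

print(f"mean MTS helix: {mean_mts_nm:.2f} nm, longest MTS helix: {max_mts_nm:.2f} nm")
print(f"binding range:  {low_estimate_nm:.1f} to {high_estimate_nm:.1f} nm")
```

      With the mean MTS length the sum comes to roughly 23 nm, and with the longest (70 amino acid) MTS to just under 30 nm, so the 25 nm value falls within this range under the stated assumptions.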

      Reviewer #2 (Significance (Required)):

      My field of expertise is the development of single mRNA imaging methods to quantify translation/decay dynamics in living mammalian systems. Thus, I cannot judge the significance of this work with respect to the modelling that is presented here.

      However, I do appreciate that one of the main conclusions of this work, which is that cells might use different translation dynamics to control mRNA localization, is truly exciting and could be applied to other types of transcripts (this is exactly what SRP does for ER-targeted mRNAs) as well. Because mechanisms that regulate translation in a transcript-specific manner and in different subcellular localizations have only been described for a handful of cases, I think that this observation is worth following up on and should be appreciated by a broad scientific audience.

    1. Reviewer #3 (Public Review):

      In their study, the authors set out to challenge the long-held claim that cortical remapping in hand-deprived territories of the somatosensory cortex follows somatotopic proximity (the hand region gets invaded by cortical neighbors), as classically assumed. In contrast to this claim, the authors suggest that remapping may not follow cortical proximity but instead functional rules as to how the effector is used. Their data indeed suggest that the deprived hand area is not invaded by the forehead, which is the cortical neighbor, but instead by the lips, which may compensate for hand loss in manipulating objects. Interestingly, the authors suggest this is mostly the case for one-handers but not for amputees, for whom the reorganization seems more limited in general (but see my comments below on this last point).

      This is a remarkably ambitious study that has been skilfully executed on a strong number of participants in each group. The complementarity of state-of-the-art uni- and multi-variate analyses are in the service of the research question, and the paper is clearly written. The main contribution of this paper, relative to previous studies including those of the same group, resides in the mapping of multiple face parts all at once in the three groups.

      In the winner takes all approach, the authors only include 3 face parts but exclude from the analyses the nose and the thumb. I am not fully convinced by the rationale for not including nose in univariate analyses - because it does not trigger reliable activity - while keeping it for representational similarity analyses. I think it would be better to include the nose in all analyses or demonstrate this condition is indeed "noisy" and then remove it from all the analyses. Indeed, if the activity triggered by nose movement is unreliable, it should also affect multivariate.

      The rationale for not including the hand is maybe more convincing as it seems to induce activity in both controls and amputees but not in one-handers. First, it would be great to visualize this effect, at least as supplemental material to support the decision. Then, this brings the interesting possibility that enhanced invasion of hand territory by lips in one-handers might link to the possibility to observe hand-related activity in the presupposed hand region in this population. Maybe the authors may consider linking these.

      The use of the geodesic distance between the center of gravity in the Winner Take All (WTA) maps for each movement and a predefined cortical anchor is clever. However, how the Center Of Gravity (COG) was computed on spatially disparate regions deserves more explanation. Moreover, imagine that for some reason the forehead region extends both dorsally and ventrally in a specific population (e.g., amputees): the COG would stay unaffected but the overlap between hand and forehead would increase. The analyses of the surface area within the hand ROI for lips and forehead nicely complement the WTA analyses and suggest higher overlap for lips and lower overlap for forehead, but none of the maps or graphs presented clearly show those results. Maybe the authors could consider adding a figure clearly highlighting that there is indeed more lip activity IN the hand region.

      In addition to overlap analyses between hand and other body parts, the authors may also want to consider Jaccard similarity analyses between the maps of the 3 groups to support the idea that amputees are more like controls than one-handers in their topographic activity, which again does not appear clear from the figures.
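
      To make the Jaccard suggestion concrete, a minimal sketch is given below. It assumes each group map has been reduced to the set of cortical vertex indices assigned to a body part in the WTA map; the function name and the vertex sets are hypothetical toy values, not data from the study.

```python
# Minimal sketch of a Jaccard comparison between group-level maps.
# ASSUMPTION: each map is represented as a set of vertex indices labelled
# with a given body part (e.g. "lips"). The sets below are made-up toy values.
def jaccard(map_a, map_b) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| of two vertex sets."""
    a, b = set(map_a), set(map_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty maps
    return len(a & b) / len(union)


# Toy vertex sets for the lips label in each group (illustrative only).
lips_controls = {1, 2, 3, 4}
lips_amputees = {2, 3, 4, 5}
lips_one_handers = {4, 5, 6, 7, 8}

print("controls vs amputees:   ", jaccard(lips_controls, lips_amputees))
print("controls vs one-handers:", jaccard(lips_controls, lips_one_handers))
```

      A higher index for the controls-amputees pair than for the controls-one-handers pair would support the claim that amputees resemble controls more closely; in practice the comparison would need to be made per participant with appropriate statistics.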

      This brings me to another concern, related to the claim that the change in cortical organization is mostly observed in one-handers. Most of this conclusion seems to rely on the fact that some effects are observed in one-handers but not in amputees when each group is compared to controls. However, no direct comparisons are made between amputees and one-handers, so we may be making an erroneous inference about an interaction that is not actually tested (Nieuwenhuis, 11). For instance, the shift of the forehead away from the hand/face border is also (mildly) significant in amputees (and more strongly so in one-handers), so the conclusion (e.g., from the subtitle of the results section) that it is specific to one-handers might not be fully supported by the data. The same holds for the invasion of the hand territory by the lips, which is significant in amputees in terms of surface area. Altogether, this calls for toning down the idea that plasticity is restricted to congenital deprivation (e.g., the last sentence of the abstract). Even if the effects are numerically stronger, if I am not wrong, there are no statistics showing that remapping is indeed stronger in one-handers than in amputees, and amputees actually show significant effects relative to controls along the same lines as those shown (even if more strongly) in one-handers. Also, the authors could explore whether there is a link between the number of years without a hand and the remapping effects.

      One hypothesis generated by the data is that lips remap in the deprived hand area because lips serve compensatory functions. Actually, also in controls, lips and hands can be used to manipulate objects, in contrast to the forehead. One may thus wonder if the preferential presence of lips in the hand region is not latent even in controls as they both link in functions?

    1. The biggest mistake—and one I’ve made myself—is linking with categories. In other words, it’s adding links like we would with tags. When we link this way we’re more focused on grouping rather than connecting. As a result, we have notes that contain many connections with little to no relevance. Additionally, we add clutter to our links which makes it difficult to find useful links when adding links. That being said, there are times when we might want to group some things. In these cases, use tags or folders.

      Most people born since the advent of the filing cabinet and the computer have spent a lifetime using a hierarchical folder-based mental model for their knowledge. For greater value and efficiency one needs to get away from this model and move toward linking individual ideas together in ways that they can more easily be re-used.

      To accomplish this many people use an index-based method that uses topical or subject headings which can be useful. However after even a few years of utilizing a generic tag (science for example) it may become overwhelmed and generally useless in a broad search. Even switching to narrower sub-headings (physics, biology, chemistry) may show the same effect. As a result one will increasingly need to spend time and effort to maintain and work at this sort of taxonomical system.

      The better option is to directly link related ideas to each other. Each atomic idea will have a much more limited set of links to other ideas which will create a much more valuable set of interlinks for later use. Limiting your links at this level will be incredibly more useful over time.

      One of the biggest benefits of the physical system used by Niklas Luhmann was that each card was required to be placed next to at least one card in a branching tree of knowledge (or a whole new branch had to be created.) Though he often noted links to other atomic ideas there was at least a minimum link of one on every idea in the system.

      For those who have difficulty deciding where to place a new idea within their system, it can certainly be helpful to add a few broad keywords of the type one might put into an index. This may help you in linking your individual ideas as you can do a search of one or more of your keywords to narrow down the existing ones within your collection. This may help you link your new idea to one or more of those already in your system. This method may be even more useful and helpful for those who are starting out and have fewer than 500-1000 notes in their system and have even less to link their new atomic ideas to.

      For those who have graphical systems, it may be helpful to look for one or two individual "tags" in a graph structure to visually see the number of first degree notes that link to them as a means of creating links between atomic ideas.

      To have a better idea of a hierarchy of value within these ideas, it may help to have some names and delineate this hierarchy of potential links. Perhaps we might borrow some ideas from library and information science to guide us? There's a system in library science that uses a hierarchical setup with the phrases: "broader terms", "narrower terms", "related terms", and "used for" (think alias or also known as) for cataloging books and related materials.

      We might try using tags or index-like links in each of these levels to become more specific, but let's append "connected atomic ideas" to the bottom of the list.

      Here's an example:

      • broader terms (BT): [[physics]]
      • narrower terms (NT): [[mechanics]], [[dynamics]]
      • related terms (RT): [[acceleration]], [[velocity]]
      • used for (UF) or aliases:
      • connected atomic ideas: [[force = mass * acceleration]], [[$$v^2 = v_0^2 + 2a\Delta x$$]]

      Chances are that within a particular text, one's notes may connect and interrelate to each other quite easily, but it's important to also link those ideas to other ideas that are already in your pre-existing body of knowledge.


      See also: Thesaurus for Graphic Materials I: Subject Terms (TGM I) https://www.loc.gov/rr/print/tgm1/ic.html

  2. Apr 2022
    1. Author Response

      Reviewer #1 (Public Review):

      Kwon, Huxlin and Mitchell compared motion perception and oculomotor responses in eight patients with post-stroke lesions in the primary visual cortex (V1). Motion perception was measured as peripheral motion discrimination thresholds (NDR) separately in the affected and the intact visual field. Due to restoration training, the NDR thresholds were below chance even in the affected visual field, indicating that some residual motion discrimination was possible. Oculomotor responses were measured as the gain of eye drifts (PFR) after saccades to dot patterns that are coherently drifting inside peripheral, stationary apertures. The authors distinguish between a predictive, open-loop component up to 100 ms after the saccade that is entirely based on presaccadic motion processing in the peripheral visual field and a visually-driven component from 100 ms after the saccade that is based on postsaccadic motion processing in the fovea. While the PFR gain of patients in the intact field was comparable to the data of healthy control subjects from a previous study (Kwon et al., 2019), the predictive, open-loop PFR gain of patients in the affected field was close to zero. This was not the case for the visually-driven PFR. The authors interpret their findings in terms of a dissociation between residual motion perception and absent predictive oculomotor control in patients with V1 lesions.

      Strengths:
      The study contains a rare and valuable set of perceptual and oculomotor data from eight patients with lesions in V1, who underwent restoration training. The direct comparison between peripheral motion discrimination and predictive oculomotor responses is interesting and innovative. Also, the distinction between the predictive, open-loop and the closed-loop component of PFR is important. A potential dissociation between motion perception and oculomotor control would be very relevant for the understanding of different pathways of motion processing for perception and oculomotor control and also for the understanding of the effects of restoration training after lesions of V1.

      Weaknesses:
      The dissociation between perception and oculomotor control in the affected field is primarily based on two results: first, the combination of low PFR gain (Figure 4A) on the one hand and low to medium NDR thresholds (Table 1) on the other hand; second, the absence of a correlation between NDR thresholds and PFR gain (Figure 4B). However, the data are not as clear-cut. The regression of PFR gain on NDR thresholds in the intact-field predicts that there should be a substantial PFR gain only at NDR thresholds below about 0.3. For the affected field this applies only to three data points, of which one shows a substantial PFR and is fully compatible with the data in the intact-field. Hence, the evidence of a dissociation between motion perception and oculomotor control is based on a very small number of data points. This also allows for a different interpretation: instead of assuming separate pathways for motion perception and oculomotor control in patients, the results might also be explained by a different read-out of the same motion signal for perception and oculomotor control, where oculomotor control applies a more conservative threshold and requires a higher internal signal strength than motion perception.

      The comparison of the patients' data to the data in the previous study (Kwon et al., 2019) is not very informative. First, the patients were considerably older than the participants in the previous study, and an age-matched control group would be favourable. That being said, the fact that the PFR gain was comparable for the intact-field of the patients and the previous study renders age-effects rather unlikely.

      Second, there is no control data for the motion discrimination task, so we don't know what the NDR thresholds and even more importantly what the relationship between NDR thresholds and PFR gain in healthy observers would be.

      We thank the reviewer for their evaluation. We have attempted to address concerns about sufficient sampling from blind-fields with recovery that reached the normal range by collecting additional data, doubling our sample size within that range. This is discussed above in “Essential revisions”, along with the alternative interpretation that perception and oculomotor control might rely on a different threshold in readout. The role of age differences was considered in the original manuscript, but this remains an unlikely factor, as the reviewer notes. With regard to normative NDR threshold data, surprisingly, this has not been published in visually-intact controls in a manner that is identical to that in the present study. However, prior work has established that performance in CB patients’ intact visual fields is normal across a wide range of behavioral measures that include luminance contrast sensitivity, processing of form, color and motion, as well as spatial and temporal frequencies (e.g. Barbur et al., 1980; Morland et al., 1999; Sahraie et al., 2006; Huxlin et al., 2009; Das et al., 2014; Levi et al., 2015). In the present study, we have thus used the intact-field as an internal control for blind-field performance in the same participant, as is standard in the field, expecting that intact-field NDR thresholds should be within the normal range. Verifying this is outside the scope of the present paper, but is now planned for our subsequent studies. Other detailed responses appear below, point by point, for the reviewer’s “Recommendations for authors”.

      Reviewer #2 (Public Review):

      This study addresses the oculomotor behaviour of cortically-blind patients (with lesions in V1) who are instructed to perform a saccade toward a cued target placed either in their intact or in the blind visual field. The saccadic target consists of an aperture containing random-dot motion at 75% direction discrimination threshold ("NDR"), and is presented with iso-eccentric similar distractor apertures: with this kind of stimulus, the gaze of normally-sighted participants drifts smoothly in the direction of the target random dot motion immediately after the end of the saccade. Importantly, for some patients, a perceptual training had led to a good recovery of perceptual performance in the blind-field, as documented by the reduction of motion direction discrimination threshold to levels similar to the control healthy participants. Cortically-blind (CB) patients are shown to perform very similarly to control participants in terms of saccade accuracy, but they have longer latency. As for the postsaccadic ocular following response ("PFR"), the eye velocity component projected on the random-dot motion direction is comparable to controls when the saccade was directed to the intact field, but the mean PFR is significantly lower for saccades directed toward the blind-field. The authors conclude that V1 lesions result in a previously ignored selective impairment of the automatic transaccadic transmission of visual information that drives the ocular following response. In the supplementary information, it is also shown that the shift of saccadic landing position which is induced by the presaccadic target motion is strongly reduced (yet different from zero) for saccades to the blind-field locations in CB patients.

      The manuscript is very well written and illustrated, and the addressed question is novel and highly interesting. The inclusion in the experiment of locations of the patients' blind-field for which some perceptual abilities had been recovered is particularly interesting. However, some major weaknesses weaken part of the results and undermine their interpretation (see below). I also list a series of other minor issues to be clarified or improved.

      Main weaknesses:
      1) Unfortunately, the present data do not strongly support the conclusion that the reduced PFR gain in patients is decorrelated from the motion discrimination performance. As a matter of fact, in Figure 4B the function describing the relation between PFR gain and NDR is reasonably linear in a very limited interval of NDR values (say <0.3), and it should rather be described as a decreasing exponential, or similar, approaching 0 already for NDR~0.3. On the other hand, it is presumably hard to appropriately fit a similar exponential function to the blind-field datapoints, as the majority of the latter lie in the range of NDR threshold (say > 0.4) where the PFR gain would in any case be flat and close to 0. In other terms, in my view there aren't enough blind-field datapoints with low NDR threshold to assess a quantitative difference in the relation between PFR and NDR between CB patients and Control participants.

      Finally, and probably just a misunderstanding of mine, shouldn't the empty circles in Figure 4A and 4B have the same y-coordinate (the PFR gain value)? It does not seem so when looking at these figures.

      2) A second weak point, in my opinion, concerns the interpretation of the results and in particular the exclusion of a role for presaccadic attentional mechanisms. The authors claim (lines 356-358): "That the FEF and its projections to area MT are intact in V1-stroke patients suggests preservation of presaccadic planning and attention selection for the saccade target even when visual input is weak or abnormal in a blind-field" and this is definitely a valuable point. However a number of other physiological mechanisms involving V1 could play a role in the spatially-selective processing of motion and the argument that (lines 368 and ff) "other aspects of saccade pre-planning related to perceptual shifts in the position of motion targets, remain in the blind-field" is not very robust here, considering that the reduction in the angular deviation is very strong in the blind-field (Supplementary Figure 2).

      Here is a speculative alternative interpretation: V1-lesioned patients suffer, among other things, from a specific impairment of spatially-selective motion processing. Unfortunately, the training in peripheral motion discrimination does not test this particular possibility, if I understand correctly, as there was no other distractor aperture containing distracting motion information (see Fig 2A). In contrast, in the main experiment, a lack of spatial selectivity for motion integration may have strongly affected the presaccadic motion discrimination (being more global than local) as well as PFR and postsaccadic landing position shift (although the latter was partly spared). According to this possibility, a simple prediction is that depending on the (randomly determined) motion direction in the distracting apertures, the PFR (the true eye movement, not the projection according to the stimulus motion axis) should be deviated in different directions, coherent with a global integration of motion. Do the available data allow this possibility to be verified? In general, I think that it would be interesting to analyse post-saccadic smooth eye velocity beyond the "projected" velocity.

      We thank the reviewer for their evaluation, several parts of which overlap with Reviewers 1 and 3. In particular, the concerns about sufficient sampling from blind-fields that recover motion integration (NDR < 0.35) have been addressed by collecting additional data and performing new analyses, and we have also addressed possible impairments to spatial attention (see above in “Essential revisions”). The discrepancy noted in the y-ordinate between 4A and B is related to those analyses being by subject (4A) versus by visual field location (4B), which we already addressed above, in response to Reviewer 1. Other detailed responses appear below.

      Reviewer #3 (Public Review):

      The human visual system comprises a tangle of neural pathways that subserve different perceptual, cognitive, and motor functions. Unfortunate cases of brain damage can reveal surprising dissociations between the functions of damaged and spared tissue. Perhaps the most famous example is blindsight, when damage to visual regions of occipital cortex leads to subjective blindness in parts of the visual field while sparing some visually-guided actions. Kwon, Huxlin and Mitchell had a rare opportunity to study eight individuals with that type of cortical blindness due to stroke, and put them through a carefully designed regimen of visual training and oculomotor testing.

      The main focus was a particular oculomotor behavior that they term the "post-saccadic following response": when a neurotypical person makes a saccade to an object moving in the periphery, their eyes immediately begin smoothly following the stimulus motion, due to an oculomotor plan made before the saccade began. In this case, the stroke patients were able to regain their ability to discriminate stimulus motion in the "blind" parts of the visual field, but upon saccading to those stimuli they did not show the immediate post-saccadic following response. This surprising result shows yet another splintering dissociation between perception and action, demonstrating that the effects of stroke can be very specific to certain motor actions.

      Strengths:
      - The authors masterfully combined several techniques in a rare and carefully chosen sample of participants: neuropsychiatric evaluations, rehabilitation training, psychophysics and eye-movement analyses.
      - The analyses that link all those measures together, while complicated and precise, are elegantly and clearly presented.
      - The study provides a twist on blindsight that is interesting philosophically, while also constraining our models of neural circuitry and informing approaches to rehabilitation after stroke.

      Weakness:
      - The unique nature of this study is a strength but also potentially limits its impact: the authors studied one particular type of eye movement with a complicated, unnatural stimulus arrangement. For example, the stimuli were groups of random moving dots windowed through static apertures. These stimuli, which move but also don't, are quite different from real moving objects that people track with their eyes (flying birds, for example). A related issue, which the authors briefly acknowledge, is that the training was specifically directed towards explicit perceptual reports. We therefore don't know if the oculomotor behavior (the PFR) could also be trained.
      - The authors rely on traditional null-hypothesis tests (t-tests and correlations) to make binary judgements of whether each effect or difference is "significant" (p<0.05). Some of the conclusions would be more convincing if supplemented with power analyses, bootstrapped confidence intervals, and Bayes factors to evaluate the strength of evidence.

      We thank Reviewer 3 for their evaluation. The choice of stimuli/task and their “naturalness” is addressed in our point by point responses to the “Recommendations for authors” below. We have also revised the manuscript to include boot-strapped confidence intervals, along with other statistics suggested by other reviewers, as noted under “Essential revisions for authors”. Other detailed responses appear below point by point.
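
      As a rough illustration of what a bootstrapped confidence interval for a correlation involves, here is a minimal percentile-bootstrap sketch. It is not the analysis code from the manuscript: the function names, the number of resamples, and the NDR-threshold and PFR-gain values are all hypothetical and chosen only to make the example runnable.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation; returns NaN if either variable has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5


def bootstrap_ci(xs, ys, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the correlation, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        r = pearson_r([xs[i] for i in idx], [ys[i] for i in idx])
        if r == r:  # drop degenerate resamples (NaN)
            stats.append(r)
    stats.sort()
    low = stats[int((alpha / 2) * len(stats))]
    high = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return low, high


# Hypothetical values for illustration only (not data from the study).
ndr_threshold = [0.10, 0.15, 0.20, 0.25, 0.30, 0.45, 0.60, 0.80]
pfr_gain = [0.30, 0.28, 0.22, 0.15, 0.10, 0.05, 0.02, 0.00]

r_observed = pearson_r(ndr_threshold, pfr_gain)
ci_low, ci_high = bootstrap_ci(ndr_threshold, pfr_gain)
print(f"r = {r_observed:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

      Resampling whole (threshold, gain) pairs keeps each observation's two measurements together; with samples as small as those in patient studies, the resulting interval is typically wide and should be read as a rough guide.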

    1. Author Response

      Reviewer #3 (Public Review):

      Phillips and colleagues present results obtained by generating loss-of-function mutations in the YAP/TAZ ortholog of the unicellular holozoan Capsaspora owczarzaki. In previous work published collaboratively by the Pan and Ruiz-Trillo labs, the authors had shown that Capsaspora has orthologs of yorkie (yki) and hippo (hpo) and that when these genes were expressed in Drosophila they functioned in a way that was consistent with the well-characterized function of the Hippo pathway in regulating cell proliferation.

      Characterizing the role of the pathway in Capsaspora required the ability to manipulate gene expression in that organism. In this manuscript, the authors describe remarkable progress in that area. They generate lines that stably express fluorescent proteins. Excitingly, they are able to use CRISPR/Cas9 and generate loss-of-function alleles using a donor-template strategy. These accomplishments pave the way for the study of Capsaspora using molecular tools.

      The authors then use these technologies to generate biallelic loss of function mutations in Capsaspora. They find no evidence of defects in cell proliferation either when these cells are cultured by themselves or when they are mixed with wild-type cells. However, they do find evidence of abnormalities in the cytoskeleton. They find that the cells themselves, and the multicellular aggregates that they form are more irregular in shape. The cells appear to adhere to substrates better than wild-type cells. They show surface blebbing that changes in the cell cortex with evidence for altered actin dynamics.

      From these experiments, the authors conclude that the ancestral function of the Hippo pathway is to regulate the cytoskeleton and that its ability to regulate cell proliferation was acquired more recently in evolution.

      The technical achievements are impressive, the experiments are well designed and executed, and are presented clearly. I have no issues with them. However, I feel that two of the main conclusions that the authors make are not justified by the results.

      1) The authors seem convinced that CoYki functions as a transcriptional regulator. They seem to suggest that it is primarily a regulator of cytoskeletal genes. There is a body of work from the Fehon laboratory showing that Yki has a function at the cell cortex in Drosophila that is independent of its function as a transcriptional regulator. See the work by Xu et al. 2018 (PMID: 30032991; not cited in this paper). In the absence of data that shows the localization of CoYki, I don't see how the authors can tell where it is working (in the nucleus or at the cell cortex) to regulate the cytoskeleton.

      To provide support for asserting that coYki is a transcriptional regulator, we have done the following:

      • We have cited previous results showing that coYki and its binding partner coSd can, when expressed together in the Drosophila eye, induce transcription of Hippo pathway genes, indicating a role for coYki in transcriptional regulation

      • We have examined the localization of fluorescent fusions of coYki and of a coYki mutant (coYki 4SA) predicted to be nonphosphorylatable by upstream Hippo pathway kinases. Enrichment of coYki at the cell cortex was not detected. However, the 4SA mutant showed increased localization in the nucleus relative to the WT coYki protein, arguing for a nuclear function of coYki.

      These data are therefore consistent with the prevailing view of Yki/YAP/TAZ as a transcriptional regulator in other species. Nevertheless, we cannot formally exclude the possibility that coYki may also affect the cytoskeleton in a non-transcriptional manner as described by Xu et al., which we have now stated in the Results section of our manuscript.

      2) Capsaspora and animals such as ourselves are equally separated by time from our last common ancestor. There is no reason to think that the function of signaling pathways in the Capsaspora lineage has been frozen in time while ours have evolved. Indeed, the amazing diversity of protists is consistent with lots of evolution in every lineage. One could easily argue from the same data that the ancestral function of the Hippo pathway was to regulate cell proliferation and that this was lost in the lineage that led to Capsaspora. As we learn more about the function of the Hippo pathway in diverse organisms, we will be in a better position to guess what the ancestral function was.

      We agree that the function of signaling pathways in modern protists and their ancestors may not necessarily be identical, and that studies of Hippo signaling in other organisms, especially unicellular holozoans, may clarify which functions may have been ancestral, as we make a point to state at the end of our discussion. However, given that in animals Hippo signaling regulates the cytoskeleton and proliferation, and we find that in Capsaspora coYki affects the cytoskeleton but apparently not proliferation, it seems reasonable to us to suggest a model where cytoskeletal regulation was an ancient function, and the pathway was later co-opted for regulation of proliferation. We have added a section in the Discussion pointing out that we cannot, from our results, definitively conclude an ancestral Hippo pathway function.

      In summary, this manuscript describes technological innovations that will have a big impact on those who want to study this organism. They also provide convincing data to show that the Capsaspora Yorkie ortholog regulates cytoskeletal dynamics and not cell proliferation. However, as described above, the authors would need to tone down some of their conclusions.

    1. Attribution Theory: Attribution theory (a process theory of motivation holding that people are motivated according to what they believe underlies other people’s actions and attitudes) holds that people’s behavior is motivated by how they interpret the behavior of others around them. For instance, we may think that what’s causing others to act as they do is a combination of internal, personal factors. On the other hand, we may think that their behavior is a product of environmental variables.

      impacts

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We thank the referees for their valuable suggestions. We have revised the text accordingly and already conducted most of the requested experiments.

      Reviewer #1

        1. The authors state that addition of mannan increases the length of Birbeck granules; however, no data are presented. The claim would be more convincing if the length were compared between conditions with and without mannan (as shown in Fig 4, where the condition without mannan is lacking).

      Reply: Thank you for pointing out the missing data. We added an EM image of Birbeck granules and quantification of Birbeck granules formation in the absence of mannan (Figure 4A-D).

      • Supp, fig 1B perhaps as a panel in main figure as this is an important control to show that Birbeck granules are isolated.

      Reply: We moved the supplemental figure 1B to main figure 1D.

        1. Only the (total) length of Birbeck granules is taken into account, but not the number of Birbeck granules. Is it possible to quantify the number of Birbeck granules?

      Reply: We added Figure 4D to show the number of Birbeck granules. Note that the difference in the number of Birbeck granules was less significant than that of total length because there were numerous short fragments in the mutant specimen.

      • Fig 5. Only the condition (ARGK) where there is virtually no Birbeck granule formation is included; however, is virus still internalized in the other conditions (MRGD or MRGK), where Birbeck granule formation was less effective but still present? It would be interesting to include those mutants. A more specific quantification would be by p24 ELISA. Is there a reason why immunoblotting has been chosen? In the supernatant condition, explain why the virus p24 seems less in the control condition whereas one would expect max concentration in that condition.

      Reply: Thank you for suggesting the use of ELISA. We chose immunoblotting because of its higher sensitivity and lower cost. But ELISA is advantageous when it comes to comparing a large number of samples. We performed p24 ELISA and quantified the virus internalization in all the mutants available (Figure 5C). As you pointed out, the transfer efficiency of the immunoblot in Figure 5A was not uniform across the membrane; Pr55 bands became denser toward the right, while p24 bands had a gradient in the opposite direction. The immunoblots and ELISA showed that about 1% of the viruses were attached or internalized and about 99% did not interact with the cells. Thus, the attached/internalized viruses did not affect the amount of virus in the supernatant. Results of ELISA also showed that the amount of virus in the supernatant was nearly equal among the samples (Figure S3B).

      • Abstract. First sentence: not mucosal tissue but mucosal epithelium. Last sentence: “Virual” should be “viral”.

      Reply: We corrected the typo. Thank you.

      • Discussion. The last section comparing DC-SIGN and langerin is not clear and some overstatements are made. "Considering that DC-SIGN serves as an attachment receptor for viruses but not as an entry receptor, the possible structural coupling of lateral ligand binding and internalization implies that langerin functions as a more efficient entry receptor for viruses than DC-SIGN or other C-type lectins." It is not correct that langerin but not DC-SIGN can function as an entry receptor. DC-SIGN has been shown to facilitate infection by different viruses such as DENV and ZIKV. In contrast, langerin can restrict viruses such as HIV-1 but also facilitate infection by, for example, Influenza A and DENV. So attachment or entry is more likely a consequence of the internalization and dependence on pH changes for fusion, as some viruses such as DENV fuse in acidic vesicles. This needs to be discussed more clearly.

      Reply: Thank you for pointing out our incorrect statement. We replaced the statement with a weakened one, as below:

      Page 13, line 213: “The difference in the ligand-binding manner between langerin and DC-SIGN may contribute to their different carbohydrate recognition preferences (Valverde et al., 2020; Takahara et al., 2004).”

      Reviewer #2

      1) Langerin can exist on the cell surface and in Birbeck granules. They should examine langerin cell surface expression in the 3 states, wildtype, mutated and lectin - . Do the mutations change cell surface expression?

      Reply: We performed surface labeling experiments and showed that those mutations did not affect surface expression of langerin (Figure S3A).

      2) Birbeck granules are present in the absence of mannan and pathogens (see Pena-Cruz JCI 2018, PMID: 29723162). Thus, this suggests that Birbeck granules are present even without langerin clathrin coated pit internalization from the cell surface. How does their model account for this observation?

      Reply: We think there are two possibilities:

      1. Birbeck granules were shown to stem from the endoplasmic reticulum (Valladeau et al Immunity 2000; Lenormand et al PlosONE 2013). Since the rER is the site of glycosylation, langerin is likely to capture the oligo-mannose-glycosylated proteins within the rER and form Birbeck granules.
      2. Blood plasma proteins such as immunoglobulin D, immunoglobulin E, and apolipoprotein B-100 are reported to carry high-mannose glycans (Clerc et al Glycoconj J. 2016). Those glycoproteins in the cell culture media can induce Birbeck granule formation.

        3) Different cell types can have varied Langerin levels (see Pena-Cruz JCI 2018, PMID: 29723162). Does Birbeck granule formation depend on a certain level of langerin expression? Do Birbeck granules form when Langerin is present at low as compared to high levels?

      Reply: In the course of the experiments, we isolated a cell line stably expressing langerin. However, the langerin-expressing cells proliferated extremely slowly and the expression levels were low. To answer this question, we recovered this “failed” stable cell line and found that the low langerin-expressing cells can form Birbeck granules, but with lower efficiency (Figure S3C-E).

      4) Authors use immunoblots to show that HIV is present in intra-cellular Langerin structures. It would be ideal to visualize HIV with presumably internal Birbeck granules using imaging techniques such as cryo-electron micrography or another form of high resolution imaging.

      Reply: We are currently working on ultra-thin section electron microscopy of HIV-infected langerin-expressing cells. Visualization of HIV-containing Birbeck granules using cryo-electron microscopy is highly challenging because the current precision of cryo-FIB-SEM milling technique is too low to target a specific intracellular structure. We believe conventional electron microscopy will provide sufficiently convincing evidence that HIV is present within Birbeck granules.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We are grateful for the referees' rigorous review of our manuscript and for their overall positive reception of our work. We have pasted below the entirety of the reviewers’ comments, interleaved with our responses.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Gama et al. use a biophysical assay DAmFRET, structural analysis, and optogenetic tools to uncover the nucleation mechanism of CBM signalosome. They performed experiments first in yeast cells that lack death folds or related signaling networks, then confirmed their discoveries in human cells. The results presented here are clear and convincing. The paper is very well presented and clearly written.

      They found it is the CARD domain of BCL10 that acts as a molecular switch that drives all-or-none activation of NF-kB. Monomeric BCL10 possesses an unfavorable conformation and serves as a nucleation barrier, keeping BCL10 in a supersaturated inactive state that allows for binary activation upon stimulation.

      They also characterized CARD9 CARD domain and a coiled-coil region. They reasoned that CARD9CARD functions as a polymer seed to nucleate BCL10, and that the coiled-coil region has multimerization ability to facilitate nucleation. Furthermore, they characterized that MALT1 activation doesn't depend on BCL10 polymers but its own proximity. And MALT1 induces graded NF-kB activation, thus further demonstrating the binary activation is conferred by BCL10.

      Major comments:

      1. Fig S1D and E, the authors used TNF-a to activate NF-kB independent of CBM signalosome and found the activation in each cell increased with dose. In contrast, CBM activation led to bimodal cell activation. The authors claim that this is evidence of positive feedback upstream of NF-kB. We do not believe this claim can be made from this comparative experiment alone. We agree that positive feedback is important for activating an NF-kB response, but the comparison between CBM and TNFa is inaccurate and glosses over published data. Specifically, there is published data that TNF-a does activate a 'switch-like' or digital response, as defined by the translocation of p65 (see (Tay et al. 2010) among other studies that have examined p65 translocation at the single-cell level). The difference in T-sapphire expression between CBM and TNF activation is most likely due to TNFa induced oscillations of p65 translocation (although this is speculation on our part). Therefore we suggest to the authors that the TNF-a data (Fig S1D and E) should be omitted, as the claim of switch or not-switch as pertains to TNF signaling is more complex and nuanced than presented here. We believe omitting this data will strengthen the manuscript and avoid confusion in the field. The bimodal expression of the T-sapphire NF-kB reporter driven by the CBM signalosome activation is sufficient to claim an all-or-none response.

      We thank the reviewer for this suggestion. We acknowledge that the activation of NF-κB by TNF-ɑ is more complex than we had presented, and agree that the differences in T-Sapphire reporter output could be attributed to p65 oscillations. We had not previously considered this interesting possibility, which is not addressed by the present data, and believe it is worth future investigation. As suggested by the reviewer, we have now omitted the TNF-a data, and agree that this change does not impact the overall claims of the paper.

      Fig 3B, the authors introduced CARD9CARD-µNS as a stable condensed seed for BCL10. However, considering CARD9CARD can form polymers at high concentration (Fig 3B and S3D), are these high expression levels of CARD9CARD able to induce BCL10-mEos3.1 assembly (as measured by DAmFRET in yeast cells)? Can the authors examine BCL10 FRET at these high expression levels of CARD9CARD? We assume that BCL10 will be assembled in these cells. This would provide a valuable control experiment and support the authors' conclusions.

      Indeed, this question is amenable to DAmFRET. Accordingly, we have now performed DAmFRET of yeast cells expressing BCL10-mEos3.1 in the presence of either CARD9CARD-mCardinal or mCardinal itself (see new Fig S6A and B, and associated results section). We confirmed that cells with high CARD9CARD-mCardinal expression had higher FRET on average than cells with low expression. Importantly, cells expressing high or low levels of mCardinal itself had the same FRET level (Fig S6).

      Fig 3C, the text said "Whereas WT CARD9CARD assembled into polymers at high concentration, the pathogenic mutants R18W, R35Q, R57H, and G72S failed to do so (Fig 3C and S7B,C), explaining why they cannot nucleate BCL10". This claim that these mutants can not nucleate BCL10 does not have a figure call out or a reference. The authors then show the results in Fig 3E which supports this claim. Even though they were done in the context of full-length CARD, all proteins contain the I107E mutation that releases autoinhibition. For clarity, the authors should consider rearranging the text to avoid explaining a phenomenon and making conclusions before showing the results.

      We have now rearranged this section to match the figures and claims.

      Fig 4D, E and Video 1, the authors showed that the nucleation of BCL10 into puncta within live cells is followed by p65 translocation to the nucleus. The authors claim that 'this result suggests that BCL10 is indeed supersaturated prior to stimulation' (paragraph 2 of the section titled 'BCL10 is endogenously supersaturated'). We fail to understand how this live-cell experiment leads to the conclusion that BCL10 is supersaturated before stimulation. We think this statement should be deleted, or put into context with the DAmFRET data that led the authors to make this claim. It would be interesting for the authors to define in the discussion what the golden criteria are for claiming that a protein exists in a supersaturated state in live cells (by microscopy or other methods). Adaptor protein assembly into puncta and the subsequent nuclear translocation of transcription factors is a common phenomenon across signalling pathways. Not all these pathways rely on signaling adaptors existing in a supersaturated state. The field of cell signaling (and cell biology in general) would benefit from a detailed definition of how these physical-chemical definitions of proteins are supported by experimental data. We believe that this paper will become a seminal paper in the field, and future work will benefit from a clear definition of how a claim of supersaturation is derived from the data.

      We appreciate that the concept of supersaturation will be foreign to many biologists, and welcome this opportunity to elaborate. We have now rephrased the corresponding results section for figure 4D, E, and have added new evidence to support our claim that BCL10 is supersaturated, as had been requested by reviewer 2 (see below in response to point 1). Supersaturation, as we (correctly) use the term, occurs when the concentration of a protein in solution exceeds its equilibrium solubility for the given conditions. The term is also sometimes used to describe global protein “concentrations” in excess of the solubility limit, even if a dense phase has already formed and potentially depleted the effective concentration (in solution) to the solubility limit. This is a key distinction, as only the former implies a high-energy out-of-equilibrium scenario that predetermines a future change -- release of the excess energy via phase separation.

      How does one experimentally determine if a protein is supersaturated? In theory, one may conclude that a protein is supersaturated if its assembly causes a net loss of energy from the system (i.e. exothermic). Unfortunately, it is likely not yet possible to perform such measurements with sufficient sensitivity inside a living cell. However, it is possible to infer that a protein is supersaturated if assembly can be shown to occur without a net input of energy to the system, i.e. without any change in thermodynamic control parameters such as temperature, pH, post-translational modifications, concentration of the protein, or concentration of any interacting factor. To do this, one introduces a substoichiometric amount of pre-assembled protein to the system. This manipulation will trigger assembly if the protein is supersaturated. If the protein is instead subsaturated, assembly will not occur and the exogenously added assemblies will simply dissolve. This phenomenon, known as “seeding” in the prion field, is considered a golden criterion sufficient to conclude that a protein has prion behavior. However, because bona fide prions additionally require a means for dissemination between cells, seeding analyzed at the cellular rather than population level is more appropriately considered a sufficient criterion for supersaturation (which is a prerequisite for classical prion behavior (Khan et al. 2018)). Our CARD9CARD-Cry2 experiment was designed to test this criterion. Specifically, it allowed us to introduce a seed independently of receptor activation, thereby precluding any orthogonal cellular response that might lower Bcl10 solubility through e.g. a post-translational change. That the seeds were substoichiometric is evidenced by the fact that Bcl10 polymerized homotypically following stimulation (i.e. it didn’t just bind to the CARD9CARD puncta, but went on to deposit onto itself).
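
      The logic of the seeding criterion described above can be summarized in a short sketch. This is a toy illustration only, not an analysis from the manuscript: the function name, the arbitrary concentration units, and the assumption that a single solubility value `c_sat` fully describes the conditions are all hypothetical simplifications.

```python
def seeding_outcome(c_solution: float, c_sat: float) -> str:
    """Toy prediction of what a substoichiometric seed does at fixed conditions.

    c_solution: concentration of protein still in solution (arbitrary units; hypothetical)
    c_sat:      equilibrium solubility under the same conditions (same units; hypothetical)

    No thermodynamic parameter (temperature, pH, modification state, expression)
    is changed; only a small amount of pre-assembled protein is introduced.
    """
    if c_solution <= 0 or c_sat <= 0:
        raise ValueError("concentrations must be positive")

    supersaturation = c_solution / c_sat  # S > 1 means supersaturated

    if supersaturation > 1.0:
        # Excess soluble protein deposits onto the seed without any energy input.
        return "seed grows: protein was supersaturated"
    if supersaturation < 1.0:
        # Below the solubility limit the introduced assemblies redissolve.
        return "seed dissolves: protein was subsaturated"
    return "seed persists: protein was exactly saturated"
```

      For example, `seeding_outcome(2.0, 1.0)` reports growth of the seed, whereas `seeding_outcome(0.5, 1.0)` reports that it dissolves. The sketch deliberately takes the soluble concentration as input, reflecting the distinction drawn above between concentration in solution and global concentration.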

      How does assembly under this scenario differ in principle from the many examples of puncta formed by other signaling proteins that occur upon stimulation of their respective pathways? Puncta formation that is induced by a thermodynamic change in the cell cannot be said to have resulted from pre-existing supersaturation. Rather, the stimulus may have caused some change that either increases the effective concentration of the protein (e.g. upregulates its expression, induces a post-translational change that activates it, or releases an inhibitory factor) or reduces solvent activity (e.g. change in pH).

      An additional requirement (necessary but not sufficient) is that the assembly must be regular with respect to some order parameter. That is to say, it must be a bona fide “phase”. At a minimum, this implies a uniform density. Additionally, for supersaturation to persist over biological timescales under physiological conditions and confinement volumes, the assembly (once formed) must also have structural repetition in at least two dimensions, i.e. crystallinity (Rodríguez Gama et al. 2021; Zhang and Schmit 2016). We know this to be true for Bcl10.

      Rodríguez Gama A, Miller T, Halfmann R. 2021. Mechanics of a molecular mousetrap-nucleation-limited innate immune signaling. Biophys J 120:1150–1160. doi:10.1016/j.bpj.2021.01.007

      Khan, T., Kandola, T.S., Wu, J., Venkatesan, S., Ketter, E., Lange, J.J., Rodríguez Gama, A., Box, A., Unruh, J.R., Cook, M., et al. (2018). Quantifying nucleation in vivo reveals the physical basis of prion-like phase behavior. Mol. Cell 71, 155-168.e7.

      Zhang L, Schmit JD. 2016. Pseudo-one-dimensional nucleation in dilute polymer solutions. Phys Rev E 93:060401. doi:10.1103/PhysRevE.93.060401

      Regarding the supersaturated state of BCL10, the authors convincingly use optogenetics to show how transient assemblies of CARD-Cry2 can template BCL10 assembly. This is a convincing experiment that shows templated nucleation of BCL10. To strengthen the claim that BCL10 is supersaturated endogenously, we suggest the authors quantify the expression of BCL10-mScarlet and CARD-Cry2 and ideally show that this phenomenon can be observed at expression levels equivalent to endogenous.

      As stated above, that BCL10-mScarlet formed polymers that we observed to elongate homotypically off of the CARD9CARD seeds indicates that the protein was supersaturated under the conditions of the experiment. The concentration of CARD9 is not a relevant parameter in this case. We had already compared the expression of BCL10-mScarlet to endogenous BCL10 in 293T, THP-1, and human fibroblast cells by quantitative immunodetection (Fig. S10D), revealing that the expression level of our BCL10-mScarlet constructs matched that of endogenous BCL10, which was approximately the same in all cell lines. We also compared the distribution of expression levels of BCL10-mScarlet versus that of endogenous BCL10 using antibody staining followed by flow cytometry, which confirmed that the range of expression levels of BCL10-mScarlet falls within that of endogenous BCL10 in 293T cells (Fig. S10F). Hence, we believe our data suffice to conclude that Bcl10 is supersaturated at endogenous levels of expression.

      Minor comments:

      1. Special character "delta" is not displayed in the text (instead only a space).

      This error occurred upon exporting the manuscript from our text editor to a PDF. We now have made sure all special characters are present in the PDF version.

      Several cell lines including mouse, human, and yeast lines were used across this manuscript. It would be clearer and more helpful if the exact cell type of the line could be indicated. Such as, "BCL10-mEos3.1 yeast cells" instead of "BCL10-mEos3.1 cells", "BCL10-mScarlet HEK293T cells" instead of "BCL10-mScarlet cells".

      We have now modified all instances to indicate the origin of the cell lines tested.

      Fig 5B, the authors indicated that BCL10 colocalized with CARD9CARD, then please show the merged image as well.

      We have now included the merged image to indicate colocalization in the inset images.

      Fig 6E, authors claimed that cells were stimulated with blue light for the indicated durations. The longest duration is 12 hours. Please specify if it was continuous exposure or several rounds of exposure in the indicated durations.

      We have now specified in the figure legends, text, and methods section, that this specific experiment used a continuous exposure of blue light.

      Reviewer #1 (Significance (Required)):

      This work used a combination of FRET and optogenetic tools to engineer CBM signaling and visualize the effects. They incorporated knowledge from structural biology, together with their results from mutations and truncations, dissected the significance of each protein in the CBM signalosome, and demonstrated in detail how higher-order assemblies make all-or-none cellular decisions. We believe this paper will be a seminal paper in the field of cell signalling and cytoplasmic organization. It defines a new paradigm of macromolecular assembly of signalling complexes as being dependent on proteins existing in a supersaturated state. Importantly, this paper opens up new questions regarding macromolecular signaling complexes (found in many innate immune signaling pathways): How is protein supersaturation maintained and used throughout evolution to construct biochemical signalling switches?

      This paper will be of particular interest to scientists working on immunity and cell signalling, especially in the field of higher-order assemblies. However, we feel the impact of this paper goes beyond these fields, and we believe this manuscript will be of broad interest to the cell biology and biophysics communities. For reference, our expertise is in innate immunity and cell biology.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In their manuscript entitled "A nucleation barrier springloads..." Rodriguez-Gama et al. dissect the assembly mechanism of the signalosome, composed of the proteins CARD9, BCL10 and MALT1, using a novel in-cell biophysical approach (DAmFRET). They first overexpressed fluorescently tagged versions of the proteins to promote their assembly in yeast and mammalian cells, finding that CARD9 forms higher order assemblies across a wide range of concentrations with no discontinuity in the DAmFRET profile. In contrast, the DAmFRET profile of BCL10 showed a clear separation between monomers and higher order assemblies, which started to form spontaneously only at higher BCL10 concentrations. Furthermore, the two states of the protein co-exist at all concentrations. These observations imply that there is a nucleation barrier to forming BCL10 assemblies. MALT1 showed no change in FRET regardless of its expression level. These observations, alongside fluorescence microscopy of the assemblies, and previous structural studies, suggest that BCL10 forms self-templating polymers that act as a switch for an all-or-nothing immune response, assayed in this case by monitoring the nuclear translocation of the NF-kB subunit p65. The authors also assessed the effects of known disease-causing mutations on the nucleation barrier, showing that changes in the strength of the nucleation barrier can have major effects on signalosome function. Finally, they used optogenetic methods to trigger assembly of individual signalosome components, providing insight into the minimal components/conditions required for signalosomes to work.

      Major comments

      Overall, the experiments by Rodriguez-Gama et al. offer convincing evidence that there is a nucleation barrier to BCL10 polymerisation, and that a CARD9 template is sufficient to overcome the barrier. Although the existence of a nucleation barrier had already been postulated, based on structural and other studies (referenced by the authors), it had lacked a rigorous demonstration. This work provides that demonstration, which is important for the signalosome field and more broadly applicable to researchers studying cellular decision making. The study further demonstrates that DAmFRET is an excellent tool to study protein assembly processes in their native environment, allowing the authors to tackle a question that would have been technically very difficult to address otherwise. The optogenetic experiments are a nice sufficiency test for their ideas.

      We feel there are a few key points to address before publication.

      1) One of the main conclusions is that spring-loading the nucleation barrier with high super-saturating BCL10 concentrations allows a decisive response. Although much of the data strongly imply this conclusion, the dependence of the immune response on BCL10 concentration was not tested directly. A key prediction of the nucleation barrier is that at concentrations below saturation, BCL10 should not be able to induce an all-or-nothing response when stimulated. At saturated/super-saturated concentrations BCL10 should be able to induce a response. At deeply super-saturated concentrations the response should start to be activated spontaneously in the absence of an external stimulus. These predictions could be tested using the doxycycline-inducible BCL10 system (Figure S2D), without establishing major new experimental avenues. We feel that such an experiment would strengthen the main conclusion. It might also help to shed light on whether being highly supersaturated enables a more decisive response than being just saturated.

      This is a great idea. As the reviewer suggested, our Doxycycline-inducible BCL10 system enables us to induce and track the state of BCL10 over time. We have now performed the requested experiments (Fig. S9D, E) and incorporated the results into the relevant section of the text. In short, our new analyses show that BCL10 indeed has a concentration threshold for activation by stimulation, and that it can also nucleate spontaneously when overexpressed. Note that our original analyses in Fig. 4B and C also demonstrate spontaneous BCL10 activation at high concentrations. With this new evidence and the orthogonal approaches used in Fig. 5, we believe our data definitively support our conclusion that BCL10 is supersaturated.

      2) Intuitively, readers might expect that if BCL10 is supersaturated then, once nucleated, it would rapidly assemble at the nucleation sites. In Figure 5B, CARD9CARD-miRFP670nano-Cry2 assemblies are optically induced throughout the cell. However, BCL10 appears to nucleate at just a few sites with a few minutes delay. More widespread nucleation and growth of BCL10 polymers seems to take longer (20-40 minutes, Figures 5B and 5C), after CARD9CARD-miRFP670nano-Cry2 has disassembled. Furthermore, in Figures 4D and 4E, very few BCL10 assemblies are visible/quantifiable after 70 minutes PMA exposure, but p65 has clearly entered the nucleus. It looks like BCL10 assembly slightly lags behind p65 nuclear entry. Can the authors provide a more detailed explanation of these kinetics?

      We do note that the number of CARD9CARD clusters formed upon opto-stimulation exceeds the apparent number of BCL10 nucleation sites. We believe this is consistent with nucleation-limited kinetics, where the clustering of CARD9-CARD increases the local probability of nucleation. As nuclei form and grow, they lower the probability of subsequent nucleation elsewhere in the cell. Additionally, it is possible that our artificial seeds do not perfectly mimic the native CARD9 seeds that form upon natural stimulation (e.g. due to potential steric interference from the fluorophore and Cry2). We also acknowledge that there is a slight delay in the visible appearance of BCL10 polymers relative to p65 nuclear translocation. We expect that MALT1 activates already when the polymers are still too small to see (sub-resolution), whereas the polymers only become microscopically visible once they’ve grown quite a bit more.
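
      The qualitative argument above, that early nuclei suppress later nucleation by depleting the soluble pool, can be illustrated with a minimal stochastic sketch. It is a toy model under stated assumptions and was not fitted to any data in the manuscript: all rate constants, the linear dependence of nucleation probability on excess concentration, and the names `simulate_cell`, `k_nuc`, and `k_grow` are hypothetical.

```python
import random


def simulate_cell(n_clusters=50, c0=3.0, c_sat=1.0,
                  k_nuc=0.002, k_grow=0.05, steps=400, seed=0):
    """Toy model of nucleation-limited polymerization inside one cell.

    n_clusters: number of seed clusters formed on stimulation (hypothetical)
    c0, c_sat:  initial soluble concentration and solubility (arbitrary units)
    k_nuc:      per-cluster, per-step nucleation probability per unit excess
    k_grow:     fraction of the excess consumed per polymer per step
    Returns the list of nucleation times and the final soluble concentration.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    c = c0
    nucleation_times = []
    for t in range(steps):
        excess = max(c - c_sat, 0.0)
        # Each cluster is an independent potential nucleation site whose
        # success probability scales with the remaining supersaturation.
        for _ in range(n_clusters):
            if rng.random() < k_nuc * excess:
                nucleation_times.append(t)
        # Existing polymers grow by consuming soluble protein above c_sat,
        # which lowers the nucleation probability at every other cluster.
        consumed = min(excess, k_grow * len(nucleation_times) * excess)
        c = max(c_sat, c - consumed)
    return nucleation_times, c
```

      Calling `simulate_cell()` returns the nucleation times and the final soluble concentration. With these illustrative parameters, the first polymers rapidly draw the soluble pool toward `c_sat`, so only a small subset of the clusters is expected to ever nucleate a polymer, in line with the interpretation given above.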

      3) Related to point 2 above, in Figure 5D, the leftmost cell in the field of view clearly contains CARD9CARD assemblies but there are no BCL10 assemblies and p65 is not imported into the nucleus (in contrast to the central cell in the field of view). How often does CARD9CARD optogenetic assembly lead to BCL10 assembly? In other words, can the authors quantify the cell-to-cell variability in this experiment?

      Throughout our experiments, whether analyzing BCL10 puncta formation, NF-kB transcriptional activity, or p65 translocation, we observed a persistent nonresponsive fraction of cells even at saturating levels of stimulation. Specifically, approximately 30% of THP-1 cells failed to acquire T-Sapphire fluorescence or form BCL10-mEos3.2 puncta when stimulated with high levels of β-glucan (Fig 1B and E, respectively), and approximately 25% of 293T cells failed to acquire T-Sapphire fluorescence or exhibit p65 nuclear translocation when stimulated with high levels of PMA (Fig 1C and Fig 4E, respectively). Because these numbers did not depend on whether BCL10 was endogenously or exogenously expressed, we know that the underlying cell-to-cell heterogeneity involves factors upstream of BCL10. Indeed, the fraction of recalcitrant cells drops to 10% in our optogenetic experiments that bypass upstream factors (Fig S11E). Possible sources of heterogeneity include different physiological states of the cells or fluctuations in the expression levels of any upstream factor in the signaling pathway. We believe that this phenomenon is not unique to the CBM signalosome, as we (unpublished) and others (Fernandes-Alnemri T et al, 2009, Dick M et al, 2016) have similarly observed a fraction of non-responding cells upon activation of the inflammasome, which involves nucleation-limited polymerization of the adaptor protein ASC. While this phenomenon is interesting and may be important to our understanding of the full complexity of signalosomes in vivo, we believe that identifying the source of heterogeneity would be outside the scope of the present manuscript. We now describe this phenomenon in the final paragraph of the “Endogenous BCL10 is constitutively supersaturated” section.

      Fernandes-Alnemri, T., Yu, JW., Datta, P. et al. AIM2 activates the inflammasome and cell death in response to cytoplasmic DNA. Nature 458, 509–513 (2009). https://doi.org/10.1038/nature07710

      Dick, M., Sborgi, L., Rühl, S. et al. ASC filament formation serves as a signal amplification mechanism for inflammasomes. Nat Commun 7, 11929 (2016). https://doi.org/10.1038/ncomms11929

      Minor comments

      While the work is scientifically well done, the text reads as though it is meant for experts rather than a broad audience. This is a pity because it risks alienating readers. We suggest that some adjustments to the text (mainly additional explanations and not ruling out alternative interpretations of the data) would widen the audience and increase the impact of this important study. Below are some suggestions that might help.

      1. In the first results section, the authors write: 'This suggests that Bcl10 but not CARD9 assembly occurs in a highly cooperative fashion that could, in principle (Koch, 2020), underlie the feed forward mechanism.' It isn't obvious how Figure 1 leads to this statement. Could the authors give a more detailed explanation?

      We have now revised the text to elaborate on this interpretation.

      One limitation of DAmFRET is that it can only detect a nucleation barrier where there is a difference in FRET between the monomer and the assembled form of the protein. However, it can't necessarily detect when there is not a nucleation barrier i.e. if there's no difference in FRET. The text seems to suggest that CARD9 and MALT1 don't have nucleation barriers to their assembly. While this might not be intentional, it would be helpful to explicitly state that CARD9 and MALT1 could also possess such barriers that are not detectable by this method. This wouldn't detract from the finding that BCL10 has a barrier that plays an important function.

      The reviewer is correct that DAmFRET would not be able to detect a nucleation barrier if the assembled phase does not condense the fluorophore to a sufficiently high density for FRET to occur. In our experience, this is only a concern for very large proteins whose bulk “dilutes” the fluorophores within the assembly. Death domains, on the other hand, are only ~ 3 nm in diameter, and FRET occurs within a range of ~10 nm; hence we think it very unlikely that the death domains could be forming cryptic polymers that escape our detection. In any case, when assembly does produce a change in FRET, we can with confidence determine how strongly that form of assembly is governed by concentration. Hence, for CARD9, which does produce a FRET signal upon assembly, we can say that assembly has a smaller intrinsic nucleation barrier than that of BCL10. We further eliminated the possibility of multi-step nucleation (which would reduce the apparent nucleation barrier relative to the one-step ideal case) for CARD9 by showing that artificial condensates of the protein expressed in trans do not influence the concentration-dependence of FRET (Fig. 4 B). Finally, under all conditions where CARD9 lacked FRET, it also lacked signaling activity, suggesting there is not a cryptic functional assembly that evades our assay. Likewise MALT1, which lacked FRET at all concentrations, was entirely unable to activate NF-kB upon overexpression (Fig. S8 A and B), suggesting that it too is not forming a cryptic functional assembly that evades our assay. We therefore feel confident in our conclusion that CARD9 and MALT1 lack nucleation barriers of a magnitude comparable to that of BCL10. Note that our claim is not that they entirely lack a nucleation barrier (CARD9 after all does form a multi-dimensionally ordered polymer), but rather that we fail to observe a nucleation barrier and hence any barrier that may exist is insufficient to manifest at the cellular level.
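
      The distance argument above can be made concrete with the standard Förster relation, E = 1 / (1 + (r / R0)^6). The sketch below is illustrative only: the Förster radius of 5 nm is an assumed, typical order-of-magnitude value and not a measured property of the fluorophores used in this study, and the function name is hypothetical.

```python
def fret_efficiency(r_nm: float, r0_nm: float = 5.0) -> float:
    """Förster transfer efficiency for a donor-acceptor separation r_nm.

    r0_nm is the Förster radius (separation giving 50% efficiency).
    The default of 5.0 nm is an assumption for illustration, not a
    value taken from the manuscript.
    """
    if r_nm <= 0 or r0_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


# Compare separations on the scale of a death domain (~3 nm) with the
# approximate upper range for FRET mentioned in the text (~10 nm).
for r in (3.0, 5.0, 10.0):
    print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r):.3f}")
```

      Because efficiency falls off with the sixth power of distance, fluorophores held a few nanometres apart by small, closely packed domains would be expected to transfer efficiently under this assumed R0, whereas separations approaching 10 nm would yield little signal.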

      In the final results section, the idea that MALT1 activation doesn't depend on BCL10 polymer structure doesn't necessarily follow from the data. An alternative interpretation is that optogenetic clustering of MALT1 causes it to recruit BCL10 and form BCL10-MALT1 filaments (structure solved by Schlauderer et al., 2018). Also, the optogenetic clustering of MALT1 may mimic some structure found in the BCL10 cluster. Therefore, we are neither convinced that the data unambiguously show that MALT1 activation strictly depends on multi-valency rather than an ordered structure of BCL10 polymers nor that this conclusion is truly necessary for the paper.

      We agree that the reviewer’s alternative interpretation of this experiment is possible. However, we consider it unlikely because we performed the experiment with MALT1 lacking its Death Domain (residues 126-824), which mediates its interaction with BCL10 (Schlauderer et al., 2018). Our experiments then suggest that MALT1 clustering is sufficient for activation independent of any structuring mediated by BCL10. Nevertheless, we have now performed an additional control in which we treated these cells with PMA to induce BCL10 polymerization. As expected, the NF-kB transcriptional reporter utterly failed to activate in this condition, indicating that MALT1 does not interact with BCL10 polymers when it lacks its death domain. This aspect has been further elaborated in our response to reviewer 3 point 5.

      What optical density do the yeast cells reach during the 16h induction in galactose? If they are in stationary phase, this could affect the assembly status of the proteins being expressed, as the cytoplasm becomes glassy when cells are starved, and this coincides with widespread protein aggregation/assembly (Joyner et al., 2016; Munder et al., 2016).

      In our DAmFRET strategy, we first dilute an overnight culture and regrow the cells to log phase prior to resuspending them in galactose media. Our strain is engineered to undergo cell cycle arrest upon protein induction, hence exponential growth is prevented and the cells do not deplete galactose during the 16 hr induction. We have also performed many time courses of DAmFRET following induction and generally find no qualitative difference between early and late times (unpublished). Early time points simply have lower expression and correspondingly fewer cells in the high FRET state. Importantly, all comparisons between proteins are made with the same 16 hr induction.

      Although these experiments show that thermodynamically lowering the BCL10 nucleation barrier (e.g. by post-translational modifications or protein expression levels) isn't required for a response, they don't rule it out. It would be good to state this in the discussion, as cells may have multiple mechanisms of switching on the signalosome.

      We thank the reviewer for this suggestion and have now explicitly stated in the discussion that our experiments do not argue against possible thermodynamic tuning of the nucleation barrier.

      The discussion compares signalosomes with condensates formed by liquid-liquid phase separation. This is an interesting comparison but it suggests that disordered assemblies would not be capable of performing signalosome-like functions. This needs to be explained more clearly. For example, non-amyloid prions seem to form gel-like assemblies with a high nucleation barrier that are capable of driving heritable traits, likely through self-templating (Chakravarty et al., 2020). Such examples could represent disordered assemblies with signalosome switch-like behaviour. Furthermore, there are examples of condensates that are induced by environmental changes e.g. Pab1 and Ded1 condensates (Riback et al., 2017; Iserman et al., 2020). This potentially allows the proteins to reach high concentrations and remain un-condensed until a change in heat or pH overcomes a nucleation barrier required for condensate formation. Although the condensates aren't self-templating, they seem to require energy for their disassembly. Combined, this also allows switch-like behaviour, where the switch is flipped back to the uncondensed off state once conditions return to normal. In general, crossing a phase boundary can represent a switch-like response. Finally, recent electron-tomography experiments show that ASC puncta comprise clusters of filaments (Liu et al., 2021, biorxiv). CARD9/BCL10 assemblies may have similar ultrastructures and liquid-liquid phase separation may well play a role in their assembly.

      Indeed, we explicitly maintain that liquid phases cannot themselves perform signalosome-like functions. Chakravarty et al. 2020 did not observe amyloids associated with their phenomena, but the relevant experiments were not designed to exhaustively exclude an underlying ordered phase. To the extent that gelation is involved, their observations are fully consistent with ours. IUPAC defines a “gel” as a colloidal network involving a solid phase and a dispersed phase. The existence of a solid phase necessarily implies an underlying disorder-to-order transition, even if limited to small length scales. In the case of gelation associated with liquid-liquid phase separation, nucleation of the ordered phase simply occurs in two steps (first condensation, then ordering). Note also that a liquid phase could in principle give rise to a heritable phenotype if it activates a positive feedback in a molecular biological process involving the protein of interest (e.g. upregulation of its expression or a change in interacting factors). Chakravarty et al. did not exclude such phenomena (it would be very difficult to do so); hence it cannot be concluded that phase separation is responsible for the sustained phenotypic changes.

      We do not fully follow the reviewer’s logic concerning the relevance of Pab1 and Ded1 condensates. These proteins only condense when their respective phase boundaries fall below the endogenous protein concentration, as upon thermal stress. The proteins are not supersaturated in the absence of such conditions (for example, they cannot be seeded), and it is incorrect to characterize the change in heat or pH as overcoming a pre-existing nucleation barrier. The concept of a nucleation barrier only applies under conditions where a phase is thermodynamically favored. It is also misleading to state that the Ded1 and Pab1 condensates require energy for disassembly. Rather, they require energy to disassemble rapidly. Unless the assemblies have accessed a more ordered phase as described above (two step nucleation), involving a lower phase boundary, they will inevitably dissolve after the conditions return to normal.

      We have much prior experience with ASC. Although it has not been explicitly shown, the fact that ASC forms ordered polymers and can behave as a prionoid in vivo suggests that it very likely operates the same way as BCL10 (i.e. is physiologically supersaturated). That full-length ASC forms clusters of filaments is not relevant (in our view) to the mechanism shown here, which only requires that filaments are indeed formed. Formally, the size of the relevant nucleus determines the minimum length scale at which ordering must manifest in our mechanism. Based on the structure of death domain filaments, this could be as small as tetramers or hexamers (a minimal but structurally complete “polymer”).

      As stated above, and now elaborated in the discussion, our data do not exclude a role of thermodynamic regulation, as could lead to liquid-liquid phase separation, in tuning the nucleation barrier of Bcl10. What they do exclude is that such changes are required for Bcl10 to activate in the first place.

      Can the authors comment on the loss of BCL10 in Echinodermata, Arthropoda, Nematoda? Is there another protein that plays a similar role? Could a CARD or PCASP protein possess self-templating properties? Could other methods of control be at play e.g. protein expression?

      This is a very interesting question! We think the reviewer’s suggested explanations for the loss of BCL10 in those lineages are valid and worthy of future exploration. Nematodes such as C. elegans have lost multiple components of innate immunity. They have very few pathogen recognition receptors and also lack NF-kB! They do, however, have other adaptor proteins that the literature and our unpublished data suggest may have self-templating ability, such as TIR-1. Drosophila also encodes multiple TIR-containing proteins that are essential for innate immunity. In short, it is possible that other proteins have acquired the hypothetically essential role of supersaturation and nucleation-limited signaling in these organisms.

      Figures 1B/1C: Can the authors comment on why the active cells plateau at about 70-75%? This is a striking feature of the plots, but the explanation may not be obvious to readers.

      See our response to major point 3, above.

      Figures 1D/1E: What was the concentration of β-glucan used in this experiment? This could be included in the figure legend. If greater than 1 μg/ml this means that the % of active cells in Figure 1B matches the % of cells with BCL10 assemblies in Figures 1D/1E, which is potentially an important point.

      We thank the reviewer for bringing this point to our attention. We have now indicated in the figure legend the concentration of β-glucan used in this experiment (10 μg/ml). That the percentage of active cells in Fig. 1B matches that of cells containing BCL10 polymers in Fig. 1D and E indeed strengthens the stated relationship between BCL10 assembly and NF-kB activation in THP-1 cells subjected to a relatively physiological stimulus. Additionally, we have performed experiments to measure the levels of p65 translocation in THP-1 cells treated with β-glucan that express BCL10-mEos3.2. These data are shown in Figs. S1D and E in response to reviewer 3.

      Use of both 'BCL10' and 'Bcl10' when referring to the protein.

      We have now replaced all instances of "Bcl10" with "BCL10", following the guidelines for human gene and protein nomenclature.

      Bruford EA, Braschi B, Denny P, Jones TEM, Seal RL, Tweedie S. Guidelines for human gene nomenclature. Nat Genet. 2020;52(8):754-758. doi:10.1038/s41588-020-0669-3

      In the supplementary figures there are some formatting problems/missing words in the figure legends. In Figure S11 there is a black box covering the lower part of the figure.

      We have now fixed these instances.

      References used in this review

      Chakravarty, A.K. et al. (2020) "A Non-amyloid Prion Particle that Activates a Heritable Gene Expression Program," Molecular Cell, 77(2), pp. 251-265.e9. doi:10.1016/j.molcel.2019.10.028.

      Iserman, C. et al. (2020) "Condensation of Ded1p Promotes a Translational Switch from Housekeeping to Stress Protein Production," Cell, 181, pp. 818-831.e19. doi:10.1016/j.cell.2020.04.009.

      Joyner, R.P. et al. (2016) "A glucose-starvation response regulates the diffusion of macromolecules," eLife, 5. doi:10.7554/eLife.09376.

      Munder, M.C. et al. (2016) "A pH-driven transition of the cytoplasm from a fluid- to a solid-like state promotes entry into dormancy," eLife, 5. doi:10.7554/eLife.09347.

      Riback, J.A. et al. (2017) "Stress-Triggered Phase Separation Is an Adaptive, Evolutionarily Tuned Response," Cell, 168(6), pp. 1028-1040.e19. doi:10.1016/j.cell.2017.02.027.

      Schlauderer, F. et al. (2018) "Molecular architecture and regulation of BCL10-MALT1 filaments," Nature Communications, 9(1), pp. 1-12. doi:10.1038/s41467-018-06573-8.

      Reviewer #2 (Significance (Required)):

      Although the existence of a nucleation barrier had already been postulated, based on structural and other studies (referenced by the authors), it had lacked a rigorous demonstration. This work provides that demonstration, which is important for the signalosome field and more broadly applicable to researchers studying cellular decision making. The study further demonstrates that DAmFRET is an excellent tool to study protein assembly processes in their native environment, allowing the authors to tackle a question that would have been technically very difficult to address otherwise.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The study by Rodriguez Gama et al. addresses the molecular function of CBM complex-forming proteins CARD9, BCL10 and MALT1 in the activation of myeloid cells, using optogenetic tools, transcriptional reporters and biochemical approaches. It is known from previous studies that Bcl10 oligomerizes into filamentous oligomeric structures incorporating Malt1, and that these structures are nucleated by receptor-induced activation of CARD proteins such as CARD11 (in lymphocytes) or CARD9 (in myeloid cells), but the mechanism underlying the assembly of the resulting CBM complexes remains incompletely understood.

      The authors develop beautiful optogenetic tools to address this question, and convincingly demonstrate that CARD9-mediated nucleation of BCL10 triggers a binary cellular NF-kB response in a spring-load-like fashion, and identify mutants of BCL10 and CARD9 that impact this capacity. Unfortunately, however, the authors do not do a good job of simplifying this complex problem so it can be easily understood. In particular, the choices of mutants, models and experiments are not consistent between figures, and some data seem to be arbitrarily added or omitted. Complex hybrid constructs are also used, without assessing whether these are indeed functional in the corresponding ko cells. The paper would therefore benefit from a major overhaul. We also noticed that the literature is often not cited adequately and have included a (non-exhaustive) list of examples of wrong, incomplete, or erroneous citations below.

      1. The initial observations of binary signaling are derived from a reporter system. Although there are controls to show that the reporter used does not function intrinsically cooperatively, it would be nice to see additional data to show that cooperativity occurs also at the level of endogenous response systems, for instance by qPCR-based assessment of a natural NF-kB target gene (induced for example by TNFa versus B-glucan in THP-1 cells, and by TNFa versus PMA in 293T cells).

      As detailed in the introduction, NF-kB has been shown by multiple labs to activate in a binary fashion. Our manuscript shows that NF-kB activation occurs in a binary fashion both at the level of transcription and at the level of nuclear translocation (upstream of any transcriptional output). While we do agree that additional data could further illustrate the biological significance of our findings, we do not feel it is necessary for our conclusions. Note also that because NF-kB activation occurs in a binary fashion per cell, a simple qPCR experiment would not suffice to extend our findings to the broader NF-kB regulon. Instead, one would have to use e.g. RNA-FISH or single cell RNA-seq, nontrivial experiments that would take months to complete.
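
      The point about bulk measurements can be illustrated with a minimal sketch. It is not part of the authors' analysis; the function names, the population size, the 30% responder fraction and the 0.1/0.9 thresholds are all hypothetical values chosen for illustration. It compares a population in which 30% of cells are fully "on" with one in which every cell responds at 30% of maximum: a population-averaged readout (as in qPCR on a lysate) is the same for both, whereas a per-cell readout separates them.

```python
from statistics import mean


def binary_population(n_cells, fraction_on, on_level=1.0):
    """All-or-none model: a fraction of cells is fully on, the rest are off."""
    n_on = round(n_cells * fraction_on)
    return [on_level] * n_on + [0.0] * (n_cells - n_on)


def graded_population(n_cells, level):
    """Graded model: every cell responds at the same intermediate level."""
    return [level] * n_cells


def bulk_signal(cells):
    """Population-averaged readout, analogous to a measurement on pooled cells."""
    return mean(cells)


def fraction_intermediate(cells, low=0.1, high=0.9):
    """Per-cell readout: share of cells that are neither clearly off nor clearly on."""
    return sum(low < c < high for c in cells) / len(cells)


# Hypothetical populations with the same average activity.
binary = binary_population(1000, 0.3)
graded = graded_population(1000, 0.3)

print("bulk (binary):", bulk_signal(binary))
print("bulk (graded):", bulk_signal(graded))
print("intermediate cells (binary):", fraction_intermediate(binary))
print("intermediate cells (graded):", fraction_intermediate(graded))
```

      Under these assumptions the two bulk values coincide, while the fraction of intermediate cells is zero for the all-or-none population and one for the graded population, which is why a single-cell method would be needed to distinguish the two scenarios.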

      The cell lines in Figures 1D-E (and also some of the BCL10 mutants used later on) would have been better run in the assays in the early parts of Figure 1. The final conclusion prior to the section "The adaptor protein BCL10 is a nucleation-mediated switch" is otherwise not justified. This is a central tenet of the paper, that is referred to again, with some other ancillary data to support it. These mutants reappear later in the paper, but it would have been better, and easier, to make rescue lines of BCL10 KO in Figure 1; otherwise the logic is lost, and the models seem chosen arbitrarily.

      The choice of experiments in different panels of Fig. 1 resulted from a chronological progression of reagent construction as the project evolved. We do appreciate that switching between the assays may lead readers to doubt one or the other. Therefore, we have now immunostained for endogenous p65 in the same experiment as for Fig. 1D and confirmed that p65 translocated to the nucleus only in THP-1 BCL10-KO cells that have been reconstituted with WT BCL10-mEos3.2, but not E53R. We think this additional evidence along with our orthogonal measurements in other reporter systems confirms our findings that BCL10 nucleation determines NF-kB activity.

      Expression with microNS is not well controlled and gives little real evidence for what is occurring. It is unclear what the concentration of the protein expressed was, but certainly the relative expression of the CARD9(CARD) and the microNS version should be assessed.

      We believe these concerns result from a misunderstanding. We assume the reviewer is referring to the experiment in Fig. 3B. Expression of muNS on its own has no effect on the DAmFRET of other proteins, and we have previously used it in exactly the same way as here (Holliday M et al. 2019 and Kandola T et al. 2021). Please note that muNS fusion proteins in our experiment have an orthogonal fluorescent protein whose spectra do not significantly overlap with those of mEos3.1. The experiment evaluates a protein’s ability, when condensed via its fusion to muNS, to nucleate an mEos3.1-fused protein that is expressed in trans. Fusion of proteins to muNS does not affect their expression levels, as we now show for CARD9CARD-muNS-mCardinal versus CARD9CARD-mCardinal (Fig. S6D).

      Also, the AmFRET profile of CARD9CARD looks very weird, it cannot be compared to BCL10.

      We are unsure in what way the AmFRET profile of CARD9CARD is “weird”. It is fully consistent with expectations and has been thoroughly explained in the text. We suspect the reviewer was bothered by the sharp acquisition of FRET at approximately 100 μM. As explained in the text, this represents the phase boundary, also known as the solubility line, for CARD9CARD polymers, which we previously showed in vitro (Holliday M et al. 2019). Above this concentration, the protein self-assembles without a nucleation barrier, hence the sharp but continuous change in FRET. BCL10 plots, in contrast, show a discontinuous acquisition of FRET, which indicates a nucleation barrier. In order to highlight that the CARD9CARD transition is understood and expected, we have also now added a line to the plot to demarcate the phase boundary.

      We are not convinced of the usefulness of the introduction of a slew of disease-causing CARD9 mutations that may or may not be relevant to the authors' point. The fact that they do or do not function in a specific sub portion of an assay that may or may not be relevant to biological activity seems to be of interest but without biochemical understanding, little is clear.

      While several reports have shown the clinical importance of these CARD9 mutations for susceptibility to fungal infections, little was known about the molecular mechanism underlying their effects. The inclusion of the disease-causing mutants in this paper is justified for the following reasons. First, they demonstrate the relevance of our work to disease. Second, they build on our findings to provide an otherwise unknown molecular mechanism for these mutants. We showed using independent methods that CARD9CARD mutations disrupt the ability to nucleate BCL10, via two different mechanisms. Finally, validating the disease-causing mutations allowed us to use them as controls for subsequent experiments demonstrating that BCL10 is supersaturated.

      The optogenetic experiments are interesting, but difficult to interpret without evidence that these MALT1 constructs are indeed still functional when expressed in MALT1-deficient THP-1 cells. We do not therefore think that this experiment shows a necessity for clustering to signal, just a sufficiency, and in a highly artificial construct.

      We welcome the opportunity to elaborate on the optogenetic experiments. Since BCL10 and MALT1 are expressed ubiquitously across cell types, the validity of our findings should not depend on the cell type used. Indeed, much of what we already know about innate immunity signalosomes comes from work in HEK293T cells. Our optogenetic experiments using MALT1 were performed in 293T MALT1-KO cells in Figures 6E and F, and employed two distinct functional assays (p65 nuclear translocation and a transcriptional reporter). While our approach employs light to control clustering, similar approaches using (no less-artificial) chemically induced dimerization domains have been used to study caspase activation (Oberst A et al, 2010, Boucher D et al, 2018). Our use of light affords higher specificity, reversibility, and spatial and temporal control over MALT1 assembly than does chemically induced dimerization.

      To demonstrate the necessity of clustering, we have now performed an experiment with MALT1(126-824)-miRFP670-Cry2 expressed in 293T MALT1 KO cells that contain a transcriptional reporter of NF-kB, as in Figures 6E and F. We added PMA to the cells and found that it failed to activate NF-kB (Fig. 6), confirming that the interaction of MALT1 (via its death domain) with polymerized BCL10 is required for activation. Note that MALT1 and BCL10 exist as a soluble heterodimer prior to BCL10 polymerization; hence it is polymerization, rather than the interaction itself, that activates MALT1. That artificial clustering rescues this defect strongly suggests that the effect of polymerization can be attributed to increased proximity rather than some allosteric effect communicated from BCL10 polymers through the MALT1 DD to its caspase-like domain.

      Oberst, A., Pop, C., Tremblay, A.G., Blais, V., Denault, J.-B., Salvesen, G.S., and Green, D.R. (2010). Inducible dimerization and inducible cleavage reveal a requirement for both processes in caspase-8 activation. J. Biol. Chem. 285, 16632–16642.

      Boucher, D., Monteleone, M., Coll, R.C., Chen, K.W., Ross, C.M., Teo, J.L., Gomez, G.A., Holley, C.L., Bierschenk, D., Stacey, K.J., et al. (2018). Caspase-1 self-cleavage is an intrinsic mechanism to terminate inflammasome activity. J. Exp. Med. 215, 827–840.

      In the introduction and other parts of the paper, there are numerous instances where the previous literature in the field is not adequately cited. Examples include:

      • In the introduction, it is weird to cite one original paper (a MALT1 ko study by Ruland et al., 2001; there are several other studies of ko papers for CBM components that would merit being cited along with this study) together with two reviews on that topic (Ruland and Hartjes 2019 and Gehring et al. 2018)

      • In the introduction, the original study by Wang et al., 2002 should be cited together with Rebeaud et al., 2002; the two studies on the same topic were published back-to-back

      • In the introduction, the statement "CARD10 and CARD14 are expressed in nonhematopoietic cells including intestinal and skin epithelia, respectively" should be supported by citations.

      • Still in the introduction, the 2 references for the statement "... CARD14 gain of function mutations cause psoriasis (Howes et al., 2016; Jordan et al., 2012)" are not appropriate. There are several reports of patients with CARD14 mutations (the study by Jordan et al is only one of them) and several CARD14 mouse models that provoke a psoriasis-like phenotype, which would merit being cited.

      • In the following sentence: "Point mutations and translocations involving BCL10 and MALT1 cause immunodeficiencies (Ruland and Hartjes, 2019), testicular cancer (Kuper-Hommel et al., 2013), and lymphomas (Zhang et al., 1999).", the citation style also seems completely random, combining the citation of a single original paper for lymphomas (Zhang et al. 1999) (there are several other important original studies on that topic or recent reviews that could be cited instead), together with a review on immunodeficiencies (Ruland and Hartjes, 2019) and then another single example for a role of BCL10 and MALT1 in carcinoma (the study by Kuper-Hommel et al. is one, but several other original publications exist on the latter topic, showing for example a role in breast carcinoma or glioblastoma).

      • In the first section of the results, the reference cited for endogenous CARD10 expression in 293T cells (Ruland et al., 2001) is wrong; no endogenous CARD10 expression was assessed in that study.

      We have now revised the citations mentioned above and other instances to ensure adequate citations in each case.

      Reviewer #3 (Significance (Required)):

      The paper deals with a complex question, namely how the CBM signalosome assembles and functions to stimulate NF-kB signaling. This question is important to the understanding of pro-inflammatory immune responses and basic life sciences in general. As the focal point of the paper is complex, and tools to study such phenomena are at the limit of technical capabilities, this further increases the potential impact of the work.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      The characterization of open-ended signalosomes in a number of innate-immunity and cell-death pathways, in particular formed by domains from the death-fold family, has led to the suggestions that these complexes allow a switch-like signalling response suitable for these pathways. It appears that this has been widely accepted. However, these suggestions are based largely on indirect observations and speculation.

      Rodriguez-Gama and coworkers have decided to test these suggestions more directly. Their results confirm the suggestions. Based on my own experience, papers that validate widely adopted suggestions are often not considered seriously by top journals, who are looking for hot topics/paradigm-changing/surprising type results. I would urge the editors to consider seriously work such as in this paper, which directly tests important suggestions and does so at a technically high standard. The authors use a range of ingenious approaches, both with recombinant proteins and in cells, and including proteins from organisms from different parts of the evolutionary tree, to support their interpretations, so it is an extensive and high-quality study. I am impressed that so many different fusion proteins with fluorescent tags continued to function as expected, but I guess the authors controlled for this as much as they could.

      Having said all this, I do get the feeling the authors are "over-selling" the nucleation barrier aspect of these signalling mechanisms. It is clearly an important and critical aspect of signalling in many systems, but then it is not the only important aspect; a number of other regulatory inputs play a role in different systems. So the statement "Our findings introduce a novel structure-function paradigm" in my view is overstretching things somewhat. Further in the Discussion section, the authors state "Existing explanations for the preponderance of ordered polymers in immune cell signalosomes have centered on the functions of multivalency at steady state, such as scaffolding and sensitivity enhancement resulting from the cooperativity of homo-oligomerization". They cite a small (and non-exhaustive) number of papers discussing this topic; all these include "seeding" or "nucleation" as an important part of the proposed mechanism. So I suggest the authors provide a more balanced discussion of this aspect. Different pathways appear to display a different level of switch-like behaviour, and one thing that the current version of the manuscript is missing is more discussion of other death fold-based systems and how the results on the CBM signalosome apply to these, and also other systems such as TIR domain-based ones, which currently get no mention whatsoever. In the CBM system, there seems to be one main nucleation barrier; can there be more than one in others?

      We appreciate the reviewer’s perspective and have now acknowledged in the introduction and discussion additional prior literature that has paved the way for our study. Nevertheless, we maintain -- as now stated in the abstract -- that “our results defy the usual protein structure/function paradigm, and demonstrate that protein structure can evolve via selection for energetic maxima in addition to minima”. We have elaborated in the introduction and discussion how immune signaling provides the functional context in which such a paradigm can evolve, and how our findings uniquely support the paradigm.

      One other aspect I need to express some criticism about is attention to detail - especially with a paper focusing on the physics behind biological processes, I would expect a higher standard of getting the terminology and units correct - see specific examples below. This can obviously be fixed easily.

      Specific points are listed below. No page or line numbers are provided so I have done my best to make it clear what the comments refer to.

      1. Abstract line 6 and throughout: in "NF-kB", the "k" is supposed to be "kappa" (Greek letter) - it stands for "nuclear factor kappa-light-chain-enhancer of activated B cells", not fully defined in the manuscript as far as I can see. Occasionally, small k is also used instead of the small cap K or whatever the authors used most of the time, but I don't think any of them use the Greek letter.

      We had indeed used a version of the small “kappa” κ. We have now fixed the cases where we mistakenly used k instead of κ.

      Page 2 (Introduction) paragraph 2 line 9: period missing at the end of sentence. Same Page 4 (Results: Assembly) paragraph 4 line 3.

      This is now fixed.

      Page 2 (Introduction) paragraph 2 line 15 and throughout: in long sentences, more commas can help readability, for example before "leading" here. Similar page 15 paragraph 2 line 3 after "Additionally", paragraph 4 line 2 before "which".

      We have now included more commas and tried to improve readability throughout.

      Page 4 (Results: Assembly) paragraph 2 line 2: is "positive feedback" different from "cooperativity"? Is it a broader term that includes cooperativity, nucleation and other mechanisms? It may be useful to introduce some of these terms to avoid confusion by the readers.

      “Positive feedback” is the broadest term as it is agnostic to mechanism. “Nucleation” refers to the initiation of a first order phase transition, which is one mechanism of positive feedback. Nucleation involves “cooperativity”, in that a higher order species is more stable than smaller species. However, cooperativity can occur for oligomers of finite size, whereas nucleation is reserved for phase transitions to species of infinite size. We appreciate that the use of so many related terms may have created more confusion than necessary. Hence, we have now revised the text to omit the more general terms -- “positive feedback” and “cooperativity” where possible.

      Page 4 (Results: Assembly) paragraph 2 line 3: please define "TNF".

      We have now fixed this and other acronyms.

      Page 4 (Results: Assembly) paragraph 3 line 2: the use of size-exclusion chromatography to follow the size of complexes would assume that they are irreversible or very stable. It appears this may be the case here, but some discussion may be warranted.

      We have now explained that SEC is appropriate for this experiment because large nucleation barriers generally imply stable assemblies.

      Page 4 (Results: Assembly) paragraph 3 line 4 and throughout: the symbol for "kilodalton" is "kDa".

      We have now fixed this mistake.

      Page 4 (Results: Assembly) paragraph 3: I am not sure how the results discussed in this paragraph demonstrate that assembly occurs in cooperative fashion - just that there is a change in oligomeric states upon stimulation.

      Cooperativity is implied by the absence of oligomer sizes between monomer and the large assembly. Nevertheless, we realized this can only be concluded in the case of homotypic assembly, which we cannot yet assume at this point in the paper. Therefore, we have revised this paragraph to say that the distribution is “consistent with” an underlying phase transition (which we then go on to prove).

      Page 4 (Results: Assembly) paragraph 4 line 2: "WT" is not defined. Wild-type what? I presume "protein"?

      We refer here to the wild-type protein. We have now fixed this mistake.

      Page 4 (Results: Assembly) paragraph 4: it may be worth pointing out here that the wild-type and mutant proteins are expressed at similar levels; clearly the outcomes will depend on protein concentration in the cell. I believe the supplementary figure shows this to a large extent.

      Indeed, our supplementary figure shows that the WT and mutant proteins are expressed at comparable levels. We have now pointed this out in the text.

      Page 4 (Results: The adaptor) paragraph 1 line 4: "CARD domain" would stand for "caspase activation and recruitment domain domain". Please check throughout (including Supplementary Material).

      We have fixed this mistake.

      Page 4 (Results: The adaptor) paragraph 1 line 9: "expressed over a range of concentrations in cells" - this would imply the authors controlled expression - please rephrase to explain what exactly was done.

      We have now rephrased this sentence to indicate that the range of expression results from the use of a genetic construct with cell-to-cell variation in copy number.

      Page 5 (Results: The adaptor) paragraph 2 line 3 and throughout (including Supplementary Material): please use the Greek letter rather than "u" for micro.

      We have now fixed this mistake.

      Page 5 (Results: The adaptor) paragraph 3: this analysis is rather simplistic, it is not just the RMSD value, it is the nature of conformational change that is important? Please elaborate, I would think the papers presenting structural work have already discussed this to some extent?

      The reviewer is correct; it is the nature of the conformational change that is most important. We are unsure how to accurately estimate the energy barrier separating the two conformations for each protein. However, we have now undertaken a collaboration to attempt to do so via FAST molecular simulations (Zimmerman and Bowman 2015). In lieu of the results of these ongoing studies, we have modified the text to acknowledge that RMSD does not necessarily relate to nucleation barriers.

      Maxwell I. Zimmerman and Gregory R. Bowman. Journal of Chemical Theory and Computation, 2015, 11 (12), 5747-5757 DOI: 10.1021/acs.jctc.5b00737

      Page 5 (Results: The adaptor) paragraph 4 line 5 and further in this section: some symbol(s) do not show in the pdf - before "(delta)", next page line 3-5 after "higher" and "both".

      We have fixed this issue that resulted from exporting to a PDF file from our text editor.

      Page 6 (Results: The adaptor) paragraph 4: interface IIa and IIIb are not introduced, and there is not even any reference provided here.

      We have now added a reference for these mutations and elaborated on the interfaces IIa and IIIb.

      Page 6 (Results: Pathogenic) paragraph 1 line 12: "FL" is not introduced.

      We have now fixed this mistake.

      Page 8 (Results: Pathogenic) paragraph 7: the text "absent the pathogenic mutations" is missing something.

      We have now reworded this section.

      Page 10 (Results: BCL10) paragraph 3: why does CARD9 CARD clustering peak and then disassemble (I guess "clustering" doesn't disassemble, please rewrite as well).

      We have now fixed this mistake.

      Page 11 (Results: MALT1) paragraph 1: I presume dimerization doesn't achieve the same level of proximity as higher-order multimerization?

      Our interpretation here is that for MALT1, activation requires close proximity of more than two molecules. Although our dimerization module did not activate the caspase-like domain of MALT1, we know that it achieves close enough proximity to activate the caspase domain of CASP8. Hence we believe the MALT1 mechanism has a stoichiometry requirement in addition to a proximity requirement. This is, of course, consistent with the fact that activation normally occurs in the context of polymers rather than dimers.

      Page 11 (Results: Ancient) paragraph 1 line 4: is this AlphaFold2?

      That is correct, we used AlphaFold2. We have added that detail.

      Page 12 (Discussion) paragraph 4: not sure if "molecular examples of evolutionary spandrels" will be clear to most readers.

      We have now explained what evolutionary spandrels are, and elaborated on the relationship to our findings.

      Page 14 (Materials: Plasmid) line 2 and throughout: "Golden Gate" is usually capitalized. Similar for "Gibson" further in the paragraph. The English in this paragraph is not up to standard in general; for example "Then placing..." is not a complete sentence, and a number of sentences ending with "via gibson" need to be rewritten.

      We have now rewritten this paragraph.

      Page 16 (Materials: Cell) line 4 and throughout: "2" in "CO2" should be subscripted.

      This is now fixed.

      Page 16 (Materials: Transient) line 6 and throughout (including Supplementary Material): please use a space between number and unit ("35 mm").

      This is now fixed.

      Page 16 (Materials: Generation) line 4 and throughout: to distinguish from "gram", please italicize "g" and/or use "x g".

      We have now fixed this.

      Page 17 (Materials: Yeast) line 3: please specify which table is "table X".

      We have now fixed this mistake.

      Page 17 (Materials: Mammalian) line 1: please provide full reference. Same next paragraph line 2.

      We have now fixed this.

      Page 17 (Materials: DAmFRET) line 3: "SSC" and "FSC" are not defined.

      We have now fixed this.

      Page 18 (Materials: Fluorescence) line 10: "Coefficient" does not have to be capitalized. It does not have to be defined again in the next paragraph.

      We have now fixed this.

      Page 19 (Materials: Optogenetic) line 1: "performed" rather than "made"?

      We have now fixed this.

      Page 19 (Materials: Protein) line 12: the Compass software doesn't have a reference?

      We have now added the reference to the software.

      References: please make format consistent: articles titles in sentence or title case.

      We have now formatted all references to be consistent.

      Legend to Fig. 1: I suggest "Schematic diagram"; and "h" rather than "hrs"; please check throughout (including Supplementary Material).

      We agree with this suggestion.

      Legend to Fig. S1: is "TNF-a" supposed to be "TNF-alpha"?

      We have fixed this.

      Legend to Fig. S7: please capitalize "Figure 2H".

      We have fixed this.

      Legend to Fig. S10F: please move "Dox" behind the concentration.

      We have fixed this.

      Fig. S14B: the colours in the superposition make it difficult to see the differences.

      We have used a different color now.

      Legend to Fig. S14: I suggest "structure...predicted by AlphaFold" (2?) and include the reference.

      We agree with this suggestion.

      Reviewer #4 (Significance (Required)):

      As argued above, the significance of this paper is that it tests directly important hypotheses proposed or assumed previously, and does so at a technically high standard. No published report has done so to a similar extent.

      The paper should be of interest to a broad audience from cell biologists and immunologists to biochemists, biophysicists and structural biologists.

      My expertise is in structural biology of systems similar to the one studied here.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The study by Rodriguez Gama et al. addresses the molecular function of CBM complex-forming proteins CARD9, BCL10 and MALT1 in the activation of myeloid cells, using optogenetic tools, transcriptional reporters and biochemical approaches. It is known from previous studies that Bcl10 oligomerizes into filamentous oligomeric structures incorporating Malt1, and that these structures are nucleated by receptor-induced activation of CARD proteins such as CARD11 (in lymphocytes) or CARD9 (in myeloid cells), but the mechanism underlying the assembly of the resulting CBM complexes remains incompletely understood.

      The authors develop beautiful optogenetic tools to address this question, and convincingly demonstrate that CARD9-mediated nucleation of BCL10 triggers a binary cellular NF-kB response in a spring-load-like fashion, and identify mutants of BCL10 and CARD9 that impact this capacity. Unfortunately, however, the authors do not do a good job of simplifying this complex problem so it can be easily understood. In particular, the choices of mutants, models and experiments are not consistent between figures, and some data seem to be arbitrarily added or omitted. Complex hybrid constructs are also used, without assessing whether these are indeed functional in the corresponding ko cells. The paper would therefore benefit from a major overhaul. We also noticed that the literature is often not cited adequately and have included a (non-exhaustive) list of examples of wrong, incomplete, or erroneous citations below.

      1) The initial observations of binary signaling are derived from a reporter system. Although there are controls to show that the reporter used does not function intrinsically cooperatively, it would be nice to see additional data to show that cooperativity occurs also at the level of endogenous response systems, for instance by qPCR-based assessment of a natural NF-kB target gene (induced for example by TNFa versus B-glucan in THP-1 cells, and by TNFa versus PMA in 293T cells).

      2) The cell lines in Figures 1D-E (and also some of the BCL10 mutants used later on) would have been better run in the assays in the early parts of Figure 1. The final conclusion prior to the section "The adaptor protein BCL10 is a nucleation-mediated switch" is otherwise not justified. This is a central tenet of the paper that is referred to again, with some other ancillary data to support it. These mutants reappear later in the paper, but it would have been better, and easier, to make rescue lines of BCL10 KO in Figure 1; otherwise the logic is lost, and the models seem chosen arbitrarily.

      3) Expression with microNS is not well controlled and gives little real evidence for what is occurring. It is unclear what the concentration of the protein expressed was, but certainly the relative expression of the CARD9(CARD) and the microNS version should be assessed. Also, the AmFRET profile of CARD9CARD looks very weird; it cannot be compared to BCL10.

      4) We are not convinced of the usefulness of the introduction of a slew of disease-causing CARD9 mutations that may or may not be relevant to the authors' point. The fact that they do or do not function in a specific sub portion of an assay that may or may not be relevant to biological activity seems to be of interest but without biochemical understanding, little is clear.

      5) The optogenetic experiments are interesting, but difficult to interpret without evidence that these MALT1 constructs are indeed still functional when expressed in MALT1-deficient THP-1 cells. We therefore do not think that this experiment shows a necessity for clustering to signal, just a sufficiency, and in a highly artificial construct.

      6) In the introduction and other parts of the paper, there are numerous instances where the previous literature in the field is not adequately cited. Examples include:

      • In the introduction, it is weird to cite one original paper (a MALT1 ko study by Ruland et al., 2001; there are several other studies of ko papers for CBM components that would merit being cited along with this study) together with two reviews on that topic (Ruland and Hartjes 2019 and Gehring et al. 2018).
      • In the introduction, the original study by Wang et al., 2002 should be cited together with Rebeaud et al., 2002; the two studies on the same topic were published back-to-back
      • In the introduction, the statement "CARD10 and CARD14 are expressed in nonhematopoietic cells including intestinal and skin epithelia, respectively" should be supported by citations.
      • Still in the introduction, the 2 references for the statement "... CARD14 gain of function mutations cause psoriasis (Howes et al., 2016; Jordan et al., 2012)" are not appropriate. There are several reports of patients with CARD14 mutations (the study by Jordan et al is only one of them) and several CARD14 mouse models that provoke a psoriasis-like phenotype, which would merit being cited.
      • In the following sentence: "Point mutations and translocations involving BCL10 and MALT1 cause immunodeficiencies (Ruland and Hartjes, 2019), testicular cancer (Kuper-Hommel et al., 2013), and lymphomas (Zhang et al., 1999).", the citation style also seems completely random, combining the citation of a single original paper for lymphomas (Zhang et al. 1999) (there are several other important original studies on that topic or recent reviews that could be cited instead), together with a review on immunodeficiencies (Ruland and Hartjes, 2019) and then another single example for a role of BCL10 and MALT1 in carcinoma (the study by Kuper-Hommel et al. is one, but several other original publications exist on the latter topic, showing for example a role in breast carcinoma or glioblastoma).
      • In the first section of the results, the reference cited for endogenous CARD10 expression in 293T cells (Ruland et al., 2001) is wrong; no endogenous CARD10 expression was assessed in that study.

      Significance

      The paper deals with a complex question, namely how the CBM signalosome assembles and functions to stimulate NF-kB signaling. This question is important to the understanding of pro-inflammatory immune responses and basic life sciences in general. As the focal point of the paper is complex, and tools to study such phenomena are at the limit of technical capabilities, this further increases the potential impact of the work.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Two reviewers commented on the smeared appearance of Tae1 bands in our Western blot analyses (Figure 4F and 5B) and asked us to improve their technical quality.

      -We agree and will repeat these experiments with more careful attention to lysate preparation, using a higher percentage SDS gel for better separation of low molecular weight proteins as suggested.

      Reviewer 2 requested that we assess how Tae1 variants impact interbacterial competition outcomes.

      -We agree that this would be interesting to take a look at. While this will not be feasible for every variant we examine in the paper, we can conduct comparative interbacterial assays between P. aeruginosa and E. coli using P. aeruginosa strains with a tae1 point mutation for C110S. Given that our biochemical experiments show that this hyperactive variant evades inhibition by the cognate immunity protein, we expect that this may decrease P. aeruginosa fitness, even in the context of competition.

      More generally, we think that examining Tae1 variants in the context of interbacterial competitions would be a critical orthogonal approach in order to validate that the DMS results have any bearing on competition outcomes. However, we feel that the major focus of this paper is on the more molecular and biophysical insights that our approach can offer. Our study tests our assumptions about the kinds of features and surfaces that are important for proteins that engage with non-canonical complex substrates. It is, of course, interesting to think about the implications of this for physiological phenotypes and the drivers of toxin evolution. It is also exciting to imagine how this kind of information could be used to one day engineer certain interbacterial outcomes. We hope that others in the field will push our efforts into these directions, but we do not feel that these directions are essential for our conclusions. However, our conclusions on the molecular and biophysical aspects have helped generate interesting hypotheses in microbial ecology that could be largely followed up on by others.

      In order to conduct well-controlled P. aeruginosa:E. coli competition assays for more Tae1 variants, we would need to generate a significant number of new P. aeruginosa strains encoding point mutations for each of our variants across several genetic backgrounds. The competitions themselves also require a considerable amount of work to optimize and quantify. We are able to do this for one of the variants as previously mentioned (C110S). It’s important to note that the first author of this paper, who was the primary driver of this work, is no longer in my lab or in academia. As for myself, I am also in the middle of a transition out of academia and am actively ramping down my lab at UCSF. I no longer have the space or appropriate set-up to support this longer-term effort.

      Reviewer 2 asked that we examine Tae1 (WT and C110S) expression levels in vivo to more precisely examine whether increased self-intoxication by Tae1C110S in P. aeruginosa was due to differences in toxin activity or toxin levels.

      We agree with this suggestion and will look at toxin protein levels by Western blot analysis in the context of P. aeruginosa cells grown 1) alone on solid media and 2) together with E. coli on solid media during interbacterial competition using conditions that match our other competition assays.

      All 3 reviewers asked us to provide more experimental evidence addressing the hypothesis that differential peptidoglycan (PG) affinity across Tae1 variants could explain variation in toxic activity.

      -We agree that this is an interesting point to follow up on further. To be clear, we also do not know whether this hypothesis is true at this stage, and the answer is not necessarily critical for our central advance, but we would like to give it a try! We have devised an approach to ask the question experimentally across a subset of our deep mutational scanning (DMS) variants.

      Reviewer 1 suggested that we quantify in vitro binding affinities for PG using isothermal titration calorimetry (ITC). However, given that ITC requires high concentrations of well-defined homogeneous substrates, which we are not able to generate for more complex higher order structures of cell wall PG, we propose a pull-down based approach.

      Briefly, we plan to conduct pull-downs using insoluble, purified cell wall sacculi from E. coli grown under our two conditions as bait for recombinant Tae1 proteins. Given that intact sacculi are inherently insoluble, we can simply collect bound Tae1 through centrifugation of sacculi pellets and examine the amount of Tae1 associated by Western blot analysis. These analyses will need to be conducted across a titration of Tae1 concentrations and also with catalytic activity inhibited to avoid solubilization of sacculi. We will block Tae1 hydrolysis by carrying out pull-downs in the presence of a general commercially available cysteine hydrolase inhibitor, E64. If there is indeed differential affinity for PG underlying lytic differences across Tae1 variants, we would expect to see greater relative association of Tae1 variants with the type of cell wall sacculi that they more effectively lyse in our DMS screen. We would expect the reverse trend to also be true (lower affinity for less active variants).

      Reviewer 1 would like to know if we have done lysis experiments with any E. coli mutants that only impact PG density but not PG polymer structure. If we haven’t tested any E. coli mutants, have we done lysis experiments using drugs that have a similar impact on PG? Even if we don’t include these data in the paper, the reviewer would like us to comment on the trends we have observed.

      We have not done experiments in any mutants or chemical backgrounds known to only impact PG density but not polymer structure. We think this would be a very interesting angle! But unfortunately this is outside the scope of this study. It would require that we first experimentally confirm that the restrictive effect on only density is clearly demonstrated using a variety of techniques, including microscopy, chemical analyses, and biophysical probing of sacculi.

      Reviewer 1 asked for additional DMS screens in more conditions.

      We love this idea! In fact, we hope that others are motivated to adopt our workflow to run many more DMS screens for T6S toxins, as we believe these screens provide a lot of useful and sometimes surprising insights that could be of great interest to others. However, we believe that the primary goal of this paper is to establish this methodology as a compelling approach for studying toxins and, more generally, proteins with complex cellular substrates. It does not necessarily fall within the scope of this paper to fully assess the mechanistic implications of cell wall diversity across a wide range of conditions.

      In our experience, rigorously conducting DMS screens requires a significant amount of effort and resources to establish consistent experimental conditions. Also, a non-trivial number of costly sequencing-based experiments are required across controls and variables for the results to be statistically sound and meaningful. Furthermore, experimental validation of results is ultimately important for our ability to confidently generate hypotheses stemming from these datasets. As stated above, the first author of this paper, who was the primary driver of this work, is no longer in my lab or in academia. As for myself, I am in the middle of a transition out of academia and am actively ramping down my lab at UCSF. I no longer have the space or appropriate set-up to support this longer-term effort.

    1. Design is hope made visible. You can live your life as the result of history and what came before, or you can live your life as the cause of what’s to come. You choose. When talent doesn’t hustle, hustle beats talent. But when talent hustles, watch out. When you work only for money, without any love for what you do in and of itself, your work will lack energy. People will feel that. So give every project everything you’ve got, at every moment, every time. A good philosopher will say: “Know thyself.” A good shopkeeper will say: “Know thy customer.” A good designer will say: “Know both.” Listen for when someone is dismissing your ambitions. Only the petty do that. Avoid them. Instead, seek out those much better than you; they’ll make you feel that you can achieve your dreams, as theirs are probably even larger. They’ll wave you on to the finish line. A brand is always answering two questions. The first one internally facing: What do we believe? The second, externally: How do we behave? You must remain authentic to yourself, your core values, and what you stand for. If you’re not, people will sniff you out. But your brand must maintain cultural congruence — remaining relevant to the times, always evolving to inspire people at large. The answers to these two curiosities must always be aligned. Find a way to connect every project to something much bigger: a higher order value, a truth, a courageous goal, or a larger question. Then, if your efforts start to lag or feel mundane, return to that larger ideal that inspired you in the first place. It works. Put this over your desk: “You never change things by fighting the existing reality. To change something, build a new model that makes the existing model obsolete.” Buckminster Fuller knew stuff. A good designer will help a company get to where they want to go. A great designer will push a company to where they should go. Are you going to tell a story? Then tell a big story. An enormous story. An epic story. 
Or tell no story at all. The role of creative leadership is to create more leaders — not more followers. This view is more uncommon than I’d like. I’ve learned that there are only two kinds of people: 1.) People who do exactly what they say they’ll do. 2.) People who are full of shit. Form follows fantasy. Every good idea comes from a spark of imagination, not pragmatism. Facts are important. But possibility creates futures. Never take an unpaid internship. Ever. It is unethical to be offered one, and in many places, it is illegal. But more importantly, what kind of people would refuse to pay you? Oh yeah, really shitty people. If you lose the desire to be silly, the power to laugh, and the ability to poke fun at yourself, you will lose the power to think. All work and no play makes Jack a dull boy for one reason: It kills off his imagination. Stuck on a problem you can’t solve? Go bigger. Expand it. Make it giant. Do not try to contain it, or simplify it, or reduce it. Make it so large that you can begin to see a new pattern. Solve the larger problem and the smaller one will get solved along the way. Always begin in mythology. It’s good fuel. Fables and fantasy don’t age or grow stale for one reason: They are a step into a dimension beyond the reach of time itself. Build with them. When I turned 35, I shifted my desire to be happy to a desire to be useful. It made all the difference. There are only two kinds of leaders: 1.) Those in the engine room helping the crew shovel the coal. 2.) Those who sit on top of the train and wave at the crowds as they pass by. Learn from ad agencies. They say yes to everything, even when they can’t do it. But they try. Designers say no all too often: “Oh, no. We don’t do that!” That’s shortsighted. Instead, say yes to everything. But always add “yes…if.” Then define your terms. I was on a board with the esteemed educator Sir Ken Robinson. 
At one meeting where a pompous guest was droning on, he turned to me and whispered, “What we do for ourselves dies with us when we leave this planet. What we do for other people can live on forever.” The opposite of courage is not cowardice. The opposite of courage is conformity. Ubiquity = Invisibility. What we’re overly familiar with, what becomes common, we stop seeing. One function of design is to restore our perception, renew our understanding, and invite us to be more alert. Seek simplicity only on the far side of complexity. Do the work, the research, the understanding, and discover the unseen, surprising, unanticipated insight before you start crafting your solution. A celebrated designer I admire once said “Style = Fart.” I disagree. I believe “Style = Accuracy.” It gives focus and timely relevance to ideas. If you want to make people like things, work in advertising. If you want to make things people like, work in design. Both are valid ways to build a brand, but the second way pays off better in the long run. You can always pull a good story out of a successful product or service. You can’t always pull a good product out of a story. Hire gifted people your clients would never let in their front door. Give them influence. Clear the runway. Provide sandwiches. And stand back. When designers get overwhelmed we can retreat into passivity. We pull back. This gives us an illusion of control. The less we try, the less our chances to fail. We make it look like we’re not responsible for what happens to us. But never give up. Move in closer, instead. Try. Make a mistake. Apologize quickly. And keep trying. Never be boring. Be ridiculous. Absurd. But never be boring. (Yes, this rule will get you in trouble.) Push. Push harder. The goal is to make the complicated simple — not the other way around. The best ideas are often expressed as simple ideas. They’ll have power because they’ll feel inevitable. 
Looking backward from the end of a project, it will have the appearance of inevitability. But when you began, you had no idea you’d end up there. What dullards suggest at this point is dangerous: “This creative process is too messy and too complicated. It needs efficiency since this solution was so logical. We should apply more logic throughout the process!” That’s the beginning of the end of creativity. Resist this urge. It destroys spontaneity, originality, serendipity, and unintentionality, which is where the biggest ideas are waiting for you. Do you find yourself surrounded by people who whine that “clients don’t understand what we do”? Those people will never have good clients. A designer’s first job is to articulate the tangible value we bring to every situation. It’s not the clients’ job to try to guess it. Average designers hit the brakes when they feel fear. But when the talented get frightened, they hit the pedal, accelerate, and drive headlong into the unknown. I’ve taught students for 20 years. In that time I’ve seen self-confidence, persistence, and desire play a much larger role in growth and achievement than talent. Passive? Whining? Waiting for orders? You won’t get off the ground. Energized? Enthused? Curious? The sky’s your limit. If you want to teach design, first read “Teaching to Transgress” by bell hooks. Your whole mindset will change. If it doesn’t, please do not teach. Seeking mastery in design means being comfortable with making your own path. Forge the new road. Others will question it and doubt it. But that path will eventually come to fit your soul. It will not only lead you into deeper parts of your craft, but to hidden parts of yourself. There may come a time when someone publicly attacks you or your work. If that happens, remember this: Those who attack are the ones who fear you the most. They’ll suspect that your talents might be greater than theirs. They, in fact, become your most sincere believers. 
It’s a proof point when they start showing up. Watch for them. Then thank them when they arrive. “Always think with your stick forward.” Amelia Earhart painted that on her plane. She meant, I imagine, to seize the moment when it arrives. Refuel as necessary. Don’t wait for any damn kind of “inspiration.” Punch the throttle. Get back in the air. Keep flying. Are you at an agency that habitually recruits outside industry hotshots to lead instead of promoting potential hotshots from the ranks? Run. Now. It will never become what it wants to become. Separate talkers from doers. For someone to score an interview, I suggest a good book — on anything — to read in advance. “After you finish it, call me, and we’ll schedule some time.” 90% drop off. There are exceptions, but I hire from the remaining 10%. Be careful of doing too much work that copies the people you admire. Start out that way to see what feels right. But aim to seek what they were seeking instead of doing what they were doing. Stay away from people who confuse pomposity for profundity. Articulate incompetency is contagious. When you’re out-gunned, out-staffed, and out-equipped in a competition, what are the things you’ve got left to use? Kindness and imagination. When someone disagrees with you, do not defend yourself. Instead, listen. Ask them to explain, validate their concern, expand on it, and affirm their point of view. Only then will anyone listen to anything you have to say. I wish someone had told me this in my teens. We don’t create fantasy worlds to escape reality. We create them so we can better see, understand, and reshape reality. Seek ambition. Hire character. Train talent. When I hear the word “iterate” more than three times in three minutes, I fear there will be a Post-It® fiesta within three minutes. Fair warning. A story is not just a tale of conflict. It can be a well of shared values. 
If you shift the story people tell about themselves and their communities, you can not only shift those people, you can shift an entire culture. Build a library for yourself, and read John Milton. He had profound respect for books and human thought. “For books are not absolutely dead things, but do contain a potency of life in them to be as active as that soul whose progeny they are; nay, they do preserve as in a vial the purest efficacy and extraction of that living intellect that bred them.” A better definition about the sanctity of books was never written. Notice someone doing something cruel for the first time? Never wait for a second time. Address it fast, or cut them out. Either way, do not “wait and see.” It leaves you and your team vulnerable. What they showed you is who they are. Move fast. Mastery is not gained from intellect. Mastery is not gained from talent. Mastery is not gained from ambition. Mastery is only gained from time and focus applied to your craft over many, many years. Do not conflate it with fame. Try absolutely everything. Then try it all again. And then, one more time. Accept compliments gracefully. Treat flatterers with suspicion. Listen to your complainers and cynics — not because you might learn from them, but because they secretly care. Design ain’t what the thing looks like. Design is what the thing does. Smartphoning has supplanted daydreaming. Fixated on our little, lit-up screens, dusty old thoughts no longer slip out of our brains as easily, so no new, silly, absurd thoughts slip back in. And all good ideas start out as silly, absurd thoughts. Turn off your phone. Daydream. Fart around. Ponder. Let something odd fly in that’s floating around, hoping for an open mind to land in. If an idea doesn’t scare you in some way, it’s not really a good idea. A strong, sincere voice is like a clear bell—when rung, it travels far, across fields, mountains, and rivers. Ring it. And teach others to. 
Ignore those who tell you to “only focus on your strengths.” Nonsense. Your strengths never go. Build them, hone them, and add muscle to them. But also focus on what you need to move into new and larger worlds. Become a shocking triple threat, not just a shiny, one-trick pony. Failures are not always mistakes. It just might have been the best you could do at that point. Okay, fine. Apologize quickly. The real failure is to beat yourself up and not take the opportunity to learn. Never hire people for “cultural fit.” What a pernicious term. Instead, hire insanely talented people for their “cultural contribution.” For how unique they are. For why they are different from you. For what they will add that you do not have. People who use the word “lifestyle” don’t have one. Big agency order of importance: Clients –> Work –> People. Ours: People –> Work –> The client’s customers –> Clients. It’s easy. Good people do good work that customers love so clients succeed. T’was ever thus. Don’t work with clients to help them become the best. Work with clients to help them become the only. Hire Tigger. Never Eeyore. Surround yourself with optimists. They will build futures into existence. Read a good book every week. After a year your brain will be fueled like a rocket and your mind will naturally start going to new places, connecting new ideas, and thinking in ways you never have before. Never create and edit at the same time. Get all the sloppy, ugly roughs and first drafts out. Quantity is more important than quality at the start. Mess is more. All ideas are bad ideas. They only become good through craft and love. Clients want you to succeed like crazy. That’s why they hired you. Show them how. That’s your damn job. Do it. We perceive through images. We think in metaphors. We learn by stories. We create with fantasy. When you find yourself on the horns of a dilemma, always do the honest thing. This will shock people. And you’ll come out better, anyway. Perhaps. Maybe. Possibly. 
Someday. These are among the most damaging words a creative person can use. Lose them. Everybody starts out with good intentions. Not everybody finishes with them. This has been the most painful thing I’ve ever learned. People already know what advice they need to hear. They just need to hear it told to them by someone else. There is no such thing as “The Future.” There is only and always “The Futures“—and they are all in competition with each other, fighting for dominance. Which future will you feed? When asked for a definition of “brand,” I use this: A brand is a promise performed consistently over time. It’s held up for a while now. Brands are mentors of things to come. The best ones anticipate, create, and move us into tomorrow. Companies are no longer in competition with each other. They are—we all are—in competition with the future itself. The era of human-centered design is now gone. Our existence was never human-centered, anyway. Covid-19 proved that to be nonsense. It’s time for environment-centered. Not sustainability. Regeneration Design where we create not apart from Nature but as a part of Nature. It is never about winning. It is never about losing. It is only about contributing. It is only about learning. I’m tired of talks from “designers” who never design anything beyond their keynotes. I’m tired of talks from “entrepreneurs” who never build anything beyond themselves. I’m tired of talks from “thought leaders” who lead nothing but the perpetuation of their own fame. When you submit a fee for your work, someone will always ask, “Is this negotiable?” Answer with this: “Yes. Up.” In the end, it’ll not be what you took. It’ll be what you gave away. Do not worry about your competition. You’re not in competition with them anymore. You’re only in competition with the future itself. So don’t look over your shoulder. Look two, three, five years down the road and invent backward from there. 
Design is the bridge that gets us from where we are to where we should be. It is future-making. And it’s our job to get our clients into the best futures for themselves as quickly and effectively as possible. Skip the whole “Minimal Viable Product” thing. It leads to incrementalism. Try “Maximum Fucking Love.” It leads to something that someone else might actually care about. Be aware that every choice you make comes down to two options: Feeding grievance or creating hope. In the end, it is that simple. The era of problem-solving is gone. It’s too reactive in a world where the future arrives too fast. Designers must now be problem seekers, finding and anticipating problems before they arrive on our desks, because at that point, it’s already too late. We must now all build bridges, not walls. The rest is detail. Design Thinking gives a definition of romantic expression as found in timely historical contexts. Design says: “I’ll be upstairs.” In first creative presentations, to ensure your creative work has time and space to land, ban all of the devil’s advocates from the room before you show a thing. Then say this: “We are here to create something new. New ideas can be fragile because they are unfamiliar. You may not like something you see here, but you are not allowed to say that for now. We’ll have to edit and remove some of this work later, but for now, everything will be in play. So find something, anything, you like in every idea. A color. A word. An image. A sentence. Anything. In the end, we find what we look for. And today we are going to look for the new.” In the end, there are only two key questions the world asks of us: 1.) Who are you? 2.) Where are you going? These questions are the same ones we ask our clients. The first is about authenticity; the second is about relevance. Asking them will keep the world wide open in front of you. 
Whether you like it or not, your brand’s story already exists, so you should manage it as you would any other powerful company asset. After your product, your means to deliver it, and your audience, your story will be the most potent tool you have to build with. Be very. For a very long time, it took a very long time for anything to change. If you found an answer that worked, you could count on it being the answer for ages. But those days are over. Being an answer is not the answer. Or even an option. Unless, of course, you’re very curious. Or very focused. Very gay. Very straight. Very caring. Very prickly. Very visual. Very verbal. Very brash. Very funny. Very heady. Very anything. Everyone at COLLINS is very something. If I took any lessons from Ogilvy, it was these two: 1) Think bigger. And then, think bigger still. 2) Take every chance while you can. Grab them. And go all in. You never know if they’ll ever come again. Experience. Don’t observe. Inhale. Don’t read. Transfigure. Don’t shift. Advocate. Don’t ponder. Prove. Don’t promise. Encourage. Don’t cut. Imagine. Don’t worry. Do. Don’t analyze. Hear. Don’t listen. Show. Don’t tell. Give. Don’t take. Design is not what we make. Design is what we make possible.

      Some great design principles and wisdom.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors sought to investigate the diet of the early fossil bird Jeholornis and its implications for bird-plant interactions in early bird evolution.

      Major strengths were: 1) an exquisite near-complete cranial reconstruction of the early fossil bird Jeholornis from the Early Cretaceous of China, 2) a large sample of extant bird skulls (160) for the geometric morphometric analysis, and, 3) qualitative description of alimentary contents of extant birds.

      Major weaknesses were: 1) restriction of diet consideration to only granivory and frugivory, 2) under-detailed comparisons between the extant and extinct alimentary contents, 3) unclear explanation of the connection between early fossil birds and seed dispersal.

      Thanks for the summary of our work! To briefly reply to the weaknesses mentioned here (more details are provided in the following reply to the reviewer’s comments and suggestions):

      1) We have added supplementary analyses according to the reviewer’s suggestions, so this should have been addressed now. Our morphometric analyses attempt to explain the presence of seeds in the gut contents of some individuals of Jeholornis. We believe there are only two possible explanations of the presence of these seeds: granivory or frugivory. Therefore, we were initially motivated by the need to rigorously rule out a granivorous explanation of the presence of seeds in the gut of Jeholornis, which then would demonstrate the partially frugivorous diet of Jeholornis - it doesn’t have to be a specialist frugivore and its supplementary diet components don’t influence the inference that the presence of seeds results from fruit-consumption. Fruit-consumption is the key mechanism that we provide evidence of for the first time in early birds, and is central to the potential for mutualisms between plants and early birds. However, our supplementary geometric morphometric analyses do indicate some clues about its supplementary diets that are useful. In particular, they rule out some other diets e.g. piscivory or a probing diet.

      2) Our work is the first we know of to provide comparative data on the seed-containing gut contents of extant birds, as a tool to interpret fossil gut contents. For granivores and frugivores, we have done detailed 3D comparisons among several species. We think this is important, and we have done our best to document them clearly. However, for now, we have further clarified the images that we have presented, in response to a comment by referee 3 (see below). We hope that this also addresses the concerns of referee 1 here.

      3) By providing direct evidence of fruit-consumption in early birds, we provided evidence of the mechanism for potential bird-plant co-evolutionary mutualism during the Early Cretaceous. We are not showing the direct evidence of the mutualism, although note that plants invest energy in fruit production specifically to attract fruit-eating animals to act as seed dispersers. Therefore, the inference of mutualism is not far-fetched and is very likely, even if direct evidence is almost impossible to preserve in fossils - so that we tend to tone down this statement rather than making it too strong. More detailed analyses based on more new fossil discoveries in the future are expected to further explore the role of birds in the Cretaceous Terrestrial Revolution. However, our study is the first step to evidence and discuss this ecological topic and the furthest we could go based on the current fossil discoveries. Nevertheless, this seems important and will be the base of future studies.

      The authors did not yet achieve their full aims because their methods limited the scope of their conclusions. Specifically, a third hypothesis that Jeholornis was neither granivorous nor frugivorous was not addressed in the study. This is especially poignant as the PCA data show overlap between the granivory and frugivory data points and the 'other diet' data points. If it is assumed that Jeholornis must be a granivore or a frugivore, then the results support frugivory over granivory for Jeholornis. However, as explained above, this assumption is not supported by the data provided so the third hypothesis needs to be tested.

      Thank you very much for stating the concern of our study. It seems that there is some misunderstanding here about our study. Our analyses attempt to explain how seeds entered the gut content of Jeholornis, not to predict diet in the absence of evidence from gut content. That is why we tested between just two alternative explanations of the gut contents in our original analyses: (1) That seeds entered the gut through granivory (seed-consumption); and (2) That seeds entered the gut through frugivory (fruit-consumption). Based on this combined evidence of seeds in the gut, comparative study of the gut contents of extant birds, plus morphometrics of the skull and mandible, we claimed partial (possibly seasonal) frugivory - a form of facultative frugivory for this lineage. Therefore, we are not claiming specialised frugivory in Jeholornis as the reviewer might think. However, we acknowledge that the word 'frugivorous' might be misleading to some readers, who could interpret it as meaning 'specialised frugivorous'. To avoid this misunderstanding, we did consistently use adjectives such as 'partial', 'seasonal' and 'opportunistic' in our initial submission. And we have tried to reinforce this in our revised manuscript. For example, we converted some instances of ‘frugivory’ to ‘fruit-consumption’ to indicate the act of consuming fruit rather than a perceived idea of specialised frugivory.

      We may also need to emphasize here that, the seed dispersal and frugivore ecology studies of the modern taxa show that, for most frugivores, fleshy fruits are a non-exclusive food resource, which is supplemented with other foods like animal prey and plants (Howe, 1986; Corlett, 1998; Jordano, 2000; Wilman et al., 2014). In addition, plants usually bear fruits only in certain seasons rather than being available throughout the year, which makes strictly specialized frugivores very rare. Therefore, avian frugivores occupy a wide range of diet space that is highly overlapping with some other diets. However, to reply to the comment from the reviewer and also make this clearer to some other readers, we conducted supplemental analyses by dividing 'other diets' further to test what diets Jeholornis possibly/impossibly had as supplements of frugivory. The results are now shown in Figure 2 - figure supplements 3, Figure 2 - figure supplements 4 and Figure 2 - figure supplements 5. We revised and added these texts into the manuscript to describe the added supplemental analyses:

      “Our main analysis is intended to test why seeds entered the gut of Jeholornis by distinguishing between two hypotheses, either (i) fruit consumption or (ii) seed consumption (Figure 2, Figure 2 - figure supplements 2).”

      “Our supplemental analysis includes a further split of “Other diets”, separating the “Other diets” category into: (1) Probing for invertebrates; (2) Grabbing/pecking for invertebrates (Figure 2 - figure supplements 3); (3) Piscivores; (4) Animal-dominated omnivores; (5) Carnivores (Figure 2 - figure supplements 4); (6) Nectarivores; (7) Omnivores; (8) Plant-dominated omnivores (Figure 2 - figure supplements 5). Our prior expectation is that these analyses will not provide an unambiguous classification of the diet of Jeholornis on their own, because craniomandibular shape data does not completely differentiate among diets in birds (Navalon et al., 2019), but that they may be capable of ruling out the occurrence of some diets.”

      The results of these supplemental analyses are described in the text we added to the manuscript:

      “Our supplemental analyses exclude Jeholornis from possessing a probing diet, which occupy negative PC1 values (Figure 2 - figure supplements 3), as well as being a piscivore, which occupy positive PC2 values (Figure 2 - figure supplements 4). However, it cannot be distinguished from other diets such as the grabbing/pecking for invertebrates and omnivory (Figure 2 - figure supplements 3, 4, 5). Euclidean distances in the full multivariate shape space suggest that the mandible of Jeholornis is relatively similar to those of various omnivorous (e.g. Podica), seed-grinding (e.g. Calandrella), frugivorous (e.g. Crax), and invertebrate pecking (e.g. Picus) birds (Figure 2 - Source data 3).”

      “Similar to the results of the mandible analyses, the results of the supplemental analyses of cranial shape also exclude Jeholornis from possessing a probing diet, which occupy negative PC1 values (Figure 2 - figure supplements 3), as well as being a piscivore, which occupy positive PC2 values (Figure 2 - figure supplements 4). The other diets are also indistinguishable in the supplemental analyses of cranial shape (Figure 2 - figure supplements 3, 4, 5). Euclidean distances in the multivariate shape space, excluding PC3 (which describes the large-scale differences between stem- and crown-group birds) suggest that the cranium of Jeholornis is relatively similar to those of various frugivorous (e.g. Manucodia), seed-grinding (e.g. Pedionomus) and invertebrate pecking (e.g. Hymenops) birds (Figure 2 - Source data 4).”
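
      The distance comparison described in the quoted passages can be illustrated with a minimal sketch. It assumes PC scores are already available as a NumPy array (rows are specimens, columns are PC1, PC2, ...); the function name `nearest_by_shape`, the specimen labels, and the toy scores are hypothetical and are not the authors' data or code. It ranks specimens by Euclidean distance to a target, optionally leaving out chosen components, as the authors describe doing with PC3 for the cranium.

```python
import numpy as np

def nearest_by_shape(scores, labels, target, drop_pcs=()):
    """Rank specimens by Euclidean distance to `target` in PC-score space.

    scores   : (n_specimens, n_pcs) array of PC scores (assumed precomputed)
    labels   : list of specimen names, one per row of `scores`
    target   : name of the specimen to compare against
    drop_pcs : zero-based column indices to exclude (PC3 is index 2)
    """
    dropped = set(drop_pcs)
    keep = [i for i in range(scores.shape[1]) if i not in dropped]
    sub = scores[:, keep]
    t = labels.index(target)
    # Straight-line distance from every specimen to the target
    dist = np.linalg.norm(sub - sub[t], axis=1)
    order = np.argsort(dist)
    return [(labels[i], float(dist[i])) for i in order if i != t]

# Toy example (made-up numbers): the third column separates "A" from the fossil
scores = np.array([[0.0, 0.0, 5.0],
                   [1.0, 0.0, -5.0],
                   [3.0, 4.0, 5.0]])
labels = ["fossil", "A", "B"]
print(nearest_by_shape(scores, labels, "fossil"))                 # all PCs
print(nearest_by_shape(scores, labels, "fossil", drop_pcs=(2,)))  # without PC3
```

      In this toy case the nearest neighbour changes once the third component is dropped, which is why it matters to state which axes a distance ranking includes.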

      These results are briefly merged into the discussion part:

      “Mandibular and cranial shape excludes Jeholornis from having a probing/piscivorous diet, and is consistent with omnivory, grabbing/pecking for invertebrates, or processing foliage (using the gastric mill).”

      The existing main morphometric analyses show that a seed-cracking diet can be ruled out as an explanation of the presence of seeds in the gut of Jeholornis, which is their primary goal. In addition, our intention in this study is to show evidence for at least seasonal fruit consumption in some of the earliest birds (not specialised frugivory), which all three reviewers seem to agree is a well-founded conclusion, and the bigger picture insights of our paper arise from that. With the new supplementary analyses inspired by the reviewer, the diet of Jeholornis is described in more detail in our study, which may interest more readers concerned with the diet components of early birds.

      The cranial reconstruction of Jeholornis and the alimentary content data for extant birds would be invaluable to the community. The geometric morphometric data are presented in a way that obscures how much overlap there is between dietary categories (non-frugivore and non-granivore diets are grouped as 'other diets'), so the utility of these data is unclear. This aspect has hampered the ability of the authors to reconstruct diet in Jeholornis and, thus, the bigger picture insights that can be drawn from these results, limiting the likely impact of the work.

      Thank you very much for the positive comments about our cranial reconstruction of Jeholornis and the alimentary content data for extant birds.

      It was not our intention to obscure the overlaps between the mandible/cranial shape of frugivorous birds, and those of other birds. In fact, we believed that this was clear from the plots, and from the way we described results in the text that various birds with ‘other diets’ could have similar mandible/cranial shape to Jeholornis. This degree of overlap is also expected based on recent studies that found evidence for only quite diffuse relationships between cranial form and diet in birds (Navalón et al., 2019). However, we also see the point that some readers might be curious about the nature of particular datapoints and it would be useful to clarify this. We therefore added supplementary analyses according to the reviewer’s comment/suggestion by dividing the 'other diets' category into several much more detailed categories, so the concern of the reviewer here that “the non-frugivore and non-granivore diets are grouped as 'other diets'” is expected to have been addressed here.

      Jeholornis is one of the earliest fossil birds, so understanding its diet and ecological role is important for understanding Mesozoic ecosystems and the emergence of modern ones.

      Thank you very much for this good explanation of the importance of this study, and it also is what we believed when we wrote the manuscript. We hope that the referee will be satisfied with the efforts we made to address their initial comments and that our paper on the ecology and morphology of Jeholornis can be published in an appropriate venue.

      Reviewer #3 (Public Review):

      Hu et al. reported on a new specimen of the early bird Jeholornis, including a nearly complete skull. Using geometric morphometrics data collected from 3D and 2D retro-deformed reconstructions of its skull, the authors convincingly dismiss a seed-cracking feeding strategy for the taxon. They then use comparisons of 3D reconstructions of ingested seeds to extant birds with known feeding strategies to convincingly argue that Jeholornis was likely at least partially frugivorous. As such, this study provides the strongest evidence yet that early birds such as Jeholornis may have played a role in bird-mediated seed dispersal strategies in the Mesozoic.

      Generally, the data presented in this paper support the authors' interpretations. The specimen at the core of this study is truly spectacular, and the authors' retro-deformation of its skull is skilled. The results of the authors' geometric morphometric analyses support their inference that Jeholornis was likely not a seed-cracker. Their comparisons of ingested seed shapes also convincingly supported a partially frugivorous diet. I especially applaud the authors' detailed description of their process of retro-deformation of the fossil skull (an example many should follow, including myself) as well as making both their raw data and their reconstructed surfaces available online.

      Thank you very much for the summary of our work!

      However, there are a few major and several minor issues that I believe need to be addressed.

      1. The implications for possible bird-mediated seed dispersal are clear in this study, but they are not conclusive. Rather, the authors (convincingly) demonstrate that Jeholornis was at least partially frugivorous -- a necessary component of such a mutualistic interaction. The authors do not demonstrate that such an interaction actually occurs. These results are nonetheless exciting and important, but I think certain statements in the paper are too strong. A notable example is the title - "Earliest evidence for frugivory and seed dispersal by birds." I would strongly urge the addition of a single word to better reflect the data presented: "Earliest evidence for frugivory and possible seed dispersal by birds." Similarly, in lines 328-329 -- "Strong indications for at least seasonal frugivory in Jeholornis provides direct evidence of [specialised seed-dispersal by animals during the Early Cretaceous] for the first time" -- is not true. This paper does not provide direct evidence for this, but does provide a mechanism consistent with this. There are a handful of other statements in the paper that I think should be toned down to account for this.

      Thanks for the helpful suggestions! We have revised the title to be “Earliest evidence for frugivory and potential seed dispersal by birds”, and revised this sentence to be “Evidence for at least seasonal frugivory in Jeholornis provides direct evidence of fruit-consumption by early birds, long before the origin of the bird crown-group. This provides an important indication of the likelihood that birds were recruited by plants for seed-dispersal very early in their evolutionary history, during the Early Cretaceous” now. We also revised the manuscript throughout to tone down some similar statements about the seed dispersal, such as “…indicating that birds may have been recruited for seed dispersal during the earliest stages of the avian radiation.”

      2. Much more information should be given about the new Jeholornis specimen. In the supplement, the authors state that "a few post cranial elements" (p. 17, line 352) are preserved along with the skull -- which elements? They should be figured and briefly described in the supplement. This is of relevance to the core assumption of the paper, namely that this individual belonged to Jeholornis -- the taxonomic assignment is based partially on the tail morphology -- which I assume means that, minimally, a complete tail is preserved. The authors also mention the pelvic morphology of the new specimen, so I assume at least some part of the pelvis is preserved. These should all be figured. Most anatomical discussion is limited to the skull (and especially the palate), which is understandable, given the focus of the paper. However, with that in mind, more attention should be paid to the retro-deformation of the skull. Figure 1 is quite attractive, but I'm confused by the differences in depicted preservation between the 3D (Fig. 1C, D) and 2D (Fig. 1E, F) reconstructions. For example, the braincase is not shown in panel C but is in panel E -- why? Is its shape inferred from other specimens for panel E? Again, I very much appreciate the inclusion of near step-by-step description of how the rostrum was retro-deformed. Minimally, a few comments on what isn't preserved would be useful.

      1) We added the photograph of the whole slab of Jeholornis STM 3-8 as Figure 1 - figure supplements 1 here (the eLife format for supplementary figures), and revised this sentence to be “…and a few postcranial elements including the vertebral column, the pelvic girdle and fragmentary hindlimbs.” now. As you can see from the photograph, very little valid information can be extracted from the incompletely preserved postcranial elements. Considering this paper is focusing on the skull, we only mentioned the relatively better-preserved tail and pelvis in the taxonomic part.

      2) We added “Dashed-lines indicate the elements not preserved but suspected to exist.” in the legend of Figure 1, and added the details of reconstructions of unpreserved elements at the end of the “CT scans and digital reconstructions” section of the Materials and Methods: “However, since the braincase is too flattened to be used as the reference for 3D retrodeformation, it was omitted in Figure 1C and reconstructed according to its common shape in early birds in Figure 1E. The ectopterygoid is not preserved but suspected to exist as discussed in the Cranial Anatomy part, therefore it was reconstructed according to the shape of this element among other stem birds e.g. Archaeopteryx and Sapeornis (Elzanowski and Wellnhofer, 1996; Hu et al., 2019).”

      3. The figures are visually attractive but I found some of them confusing or unclear. See my comments above regarding Figure 1. Despite the red arrows in Figure 4 and the supplemental figure, I was hard pressed to understand precisely what set the indicated seeds apart from the rest. In some cases I could see slight "dents" where one or two of the arrows indicated, but it was hard for me to see, even when I zoomed in on my screen. I think inset panels featuring zoom-ins on the indicated regions would be very useful in making the point the authors intend. Also, I don't know if the supplemental image naming/number scheme was imposed by the journal or is a choice by the authors, but I found it baffling. Something more traditional (like "Fig. S1" or "Supplemental Figure 1") would be much more efficient.

      1) We have clarified the points of confusion in Figure 1 as suggested. For Figure 4 and related supplementary figures, the 3D reconstructed seeds are pretty clear, such as the broken ones in Figure 4B. The broken seeds in the scanning slices are more difficult to observe as the reviewer said, since the seed husks are very thin so that they are only slightly brighter, and that’s why we put the red arrows indicating the breakages there. To help readers observe them more easily, we have now added some zoom-in panels and line drawings for the representative ones (not all of them, since otherwise there would be too many), as suggested by the reviewer;

      2) The supplementary image naming/number scheme was imposed by the journal, and it will be clearer when the paper is digitally published, since these supplementary images will be connected to links in the legends of the main figures.

    1. Author Response:

      Reviewer #2 (Public Review):

      This is an interesting and scientifically rigorous report documenting atypical, dendritic locations for the emerging axon of pyramidal neurons. This is not an entirely new observation (the authors cite relevant publications, including Kole and Brette, 2018 and Mendizabal-Zubiaga et al., 2007), but still important, as a relatively overlooked fact with functional implications. A main feature of the present report is an exceptionally thorough cross-species survey, from which the authors conclude that, as compared with non-primates, the macaque and human brains have a lower proportion of neocortical pyramidal neurons with axon carrying dendrites. The results might be further supported by additional experiments, especially ultrastructural data, or by including more extensive developmental data. There is a section on Development, but there is hardly any Discussion. However, these matters are raised and adequately treated by reference to the existing literature.

      We cannot do EM with frozen material or DEPEX-cleared sections. The developmental aspects have been more extensively discussed now, but we refrained from speculating too much, since we do not have physiological data.

      Reviewer #3 (Public Review):

      The authors used neuroanatomical techniques to study neocortical pyramidal neurons from several different mammalian species. Their message is that primate neocortex differs from that of other mammals in having substantially fewer cells with axons emanating from dendrites, rather than the canonical route from the soma. The authors employed a range of standard methods, ranging from tracer injection to Golgi impregnation to immunocytochemistry. The feature the authors report is undeniable; there clearly are axons that emanate from dendrites of neocortical pyramidal neurons. Prior studies have reported that these axons are more excitable, thus leading to the intriguing possibility of a fundamental architectural (and thus presumably functional) feature in how primate neocortex operates.

      This is a provocative narrative, that leads to a number of interesting questions. However, I have reservations that the authors must address before I believe the claim that primates are really fundamentally different from other mammals in this respect. A strength but also a central limitation of this study is that different species were compared using different methods, and different areas were studied in different species. The authors make the implicit assumption that the prominence of this feature does not differ among cortical areas.

      We initially considered it a strength of the study – looking into many areas with many methods in many species. However, it seemed a bit like cherry-picking, and we now enlarged the data sets for a more systematic analysis. Please note, we assessed archived material. We are bound to what we have available. We now delivered areal comparisons, and I am afraid, the answer is NO, no remarkable differences in the areas that we assessed in monkey and cat.

      However, it is entirely plausible that the proportion of neurons with axon-carrying dendrites does differ among cortical areas. The authors also group neurons into 2 large populations: infra- and supragranular. But again, layers 2 and 3 differ from one another (as do layers 5 and 6) in the specific populations of pyramidal cells they contain (morphological and neurochemical types, inputs and outputs, etc.). Certainly many studies do group neurons into these broad populations, but for this kind of comparison relevant differences or similarities could have been lost. Comparisons among species ideally would have all been in the same layer and area.

      As said, we are bound to what we have available. And this is more than what has ever been published on this question so far. The graph and the Tables to Figure 3B allow species to be compared across the layers.

      We are aware that pyramidal cells in the layers can differ. Looking into RNA seq papers, up to 19 types exist in mouse. How many could potentially then exist in human? There is no way of pulverizing our kind of analysis down to the level of 19 pyramidal cell types differing by some unexplained RNA signatures which so far exist only for mouse. The SMI-32 staining already “selects” for one subtype in that it stains preferentially so-called type 1 pyramidal cells (Molnar et al., 2006).

      Another limitation is that the same method was not employed in different species. The reader needs to know that different methods reveal the same proportion of axon-carrying dendrites in a given area of a certain species. This should have been stated more clearly and earlier in the text; it took examination of the data tables to see this. The tables show that measurements were made in several different cortical areas. Can the authors provide any evidence that the proportion of neurons with axon-carrying dendrites does not differ in any one species among cortical areas?

      We now provide areal comparisons for 5 fields in monkey (new Figure 4A) and visual fields in cat (new Fig. 4B), both with the same methods. We can even provide a within-individual comparison of brain areas and of methods. Another three areal values for the infant macaque have been plotted in Figure 3B.

      Figure 3 description and/or legend needs to state clearly that different species' neocortex was studied in different areas (and if all Fig3 samples shown are from same layers).

      Figure 3A is total cortex, Figure 3B is by layers. Counting strategies are now described in detail in methods.

      Supplementary Excel file suggests that for humans Golgi-Kopsch reveals fewer infragranular AcD-cells than Golgi-Cox (4.43 vs 1.39), while for adult macaques Golgi-Kopsch revealed fewer than biocytin injection or SMI-32/BetaIV-spectrin immunofluorescence (13.34 vs 7.98 vs 6.29). Since the human data relies on Golgi methods, the authors must reassure the readers that the comparison of species is validated by direct comparison of different methods.

      The message that primates have fewer cells with axon-carrying dendrites than other mammals might therefore certainly be interesting but far less compelling. The message might be that primate neocortex is not qualitatively different from that of other species; instead they simply have somewhat fewer AcD-bearing neurons than other mammalian species. But even that more modest conclusion is suggested but not fully proven by the data here.

      The referee was right at this point. Having doubled our data sets with more human data we now agree: the Golgi method underestimates the AcD neurons simply because of optical limitations. We now extensively discuss the issue and we no longer do statistical analysis on human. The issue needs further investigation with more methods.

      I was puzzled by Fig 4 not including primate tissue. If the message is that spine density does not differ in dendrites with and without axons, surely it would be important to include primate tissue in this comparison; the comparison between primates and non-primates is after all the core message of this study. I also do not think the values for each species for non-AcD and shared root should be connected by a line; I suggest instead there should simply be a scatter of values for each group with a large symbol indicating mean or median value of each group. This would facilitate comparison.

      First to the graph on spines, now Figure 6. You have to connect the individual neurons by a line, otherwise the major point can no longer be seen: the dendrites differ in spine counts, sometimes the AcD is higher than the other basals of the very same neuron, in the next cell the AcD had a lower count. Statistics did not even suggest a trend. We agree that things may differ in immature neurons. Possibly, during early development the AcD gains advantages by means of its higher excitability.

      Please read the methods part to this point, eligible neurons had to fulfil a number of criteria. We fully exploited the available material of rat and ferret; no more eligible neurons. We indeed tried the same in macaque. Section thickness 50 µm. We found exactly two neurons which fulfilled the criteria. We had no chance with this material given the enormous dimension of the pyramidal cell dendritic trees in monkey. They were simply cut. For this type of classical tracing studies, non-alternating section series were prepared and submitted to different types of staining. Section spacing was several hundred µm in each individual. No chance to “reconstruct” dendrites from adjacent sections, since there were no adjacent sections.

      The core message of the study is still valid, also without the spine analysis in monkey.

    1. solo thinking is rooted in our lifelong experience of social interaction; linguists and cognitive scientists theorize that the constant patter we carry on in our heads is a kind of internalized conversation. Our brains evolved to think with people: to teach them, to argue with them, to exchange stories with them. Human thought is exquisitely sensitive to context, and one of the most powerful contexts of all is the presence of other people. As a consequence, when we think socially, we think differently—and often better—than when we think non-socially.

      People have evolved as social animals and this extends to thinking and interacting. We think better when we think socially (in groups) as opposed to thinking alone.

      This in part may be why solo reading and annotating improves one's thinking because it is a form of social annotation between the lone annotator and the author. Actual social annotation amongst groups may add additional power to this method.

      I personally annotate alone, though I typically do so in a publicly discoverable fashion within Hypothes.is. While the audience of my annotations may be exceedingly low, there is at least a perceived public for my output. Thus my thinking, though done alone, is accelerated and improved by the potential social context in which it's done. (Hello, dear reader! 🥰) I can artificially take advantage of the social learning effects even if the social circle may mathematically approach the limit of an audience of one (me).

    2. Humans’ tendency to “overimitate”—to reproduce even the gratuitous elements of another’s behavior—may operate on a copy now, understand later basis. After all, there might be good reasons for such steps that the novice does not yet grasp, especially since so many human tools and practices are “cognitively opaque”: not self-explanatory on their face. Even if there doesn’t turn out to be a functional rationale for the actions taken, imitating the customs of one’s culture is a smart move for a highly social species like our own.

      Is this responsible for some of the "group think" seen in the Republican party and the political right? Imitation of bad or counter-intuitive actions outweighs scientifically proven better actions? Examples: anti-vaxxers and coronavirus no-masker behaviors? (Some of this may also be about or even entangled with George Lakoff's (?) tribal identity theories relating to "people like me".)

      Explore this area more deeply.

      Another contributing factor for this effect may be the small-town effect as most Republican party members are in the countryside (as opposed to the larger cities which tend to be more Democratic). City dwellers are more likely to be more insular in their interpersonal relations whereas country dwellers may have more social ties to other people and groups, which may therefore make them more tribal in their social interrelationships. Can I find data to back up this claim?

      How does this link to the thesis put forward by Joseph Henrich in The WEIRDest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous? Does Henrich have data about city dwellers to back up my claim above?

      What does this tension have to do with the increasing (and potentially evolutionary) propensity of humans to live in ever larger and denser cities versus maintaining the smaller historic numbers of the pre-agricultural time period?

      What are the biological effects on human evolution as a result of these cultural pressures? Certainly our cultural evolution is affecting our biological evolution?

      What about the effects of communication media on our cultural and biological evolution? Memes, orality versus literacy, film, radio, television, etc.? Can we tease out these effects within the socio-politico-cultural sphere on the greater span of humanity? Can we find breaks, signs, or symptoms at the border of mass agriculture?


      total aside, though related to evolution: link hypercycles to evolution spirals?

    1. Author Response

      Reviewer #1 (Public Review):

      In the present study, the authors first analyzed simultaneously recorded human EEG-fMRI data and found the fMRI signatures of burst-suppression. Then, they reported such burst-suppression fMRI signatures in the other three species examined: macaques, marmosets, and rats. Interestingly, their results indicated an inter-species difference: the entire neocortex engaged in burst-suppression in rats, whereas most of the sensory cortices were excluded in primates. The fMRI signatures of burst-suppression were confirmed in several species, suggesting that such signature is a robust phenomenon across animals. These findings warrant further investigation into its neural mechanisms and functional implications.

      Major Issues

      1) One of the major findings is that burst-suppression in primates appeared to largely spare sensory cortices, especially V1. However, as seen in the tSNR maps for macaques and marmosets (Figures 3 and 4–figure supplement 4), the tSNR around the primary visual cortex was much weaker than in other cortices. Moreover, in marmosets, the EPI slices did not cover the entire brain and actually left most of V1 uncovered, as seen in Figure 4. If so, the authors should draw their conclusions very carefully when talking about the differences in V1 across species. It would be better to analyze and discuss how the tSNR differences affect their findings. For example, the authors may consider including the tSNR as a covariate in their map analysis.

      The tSNR in the occipital cortex—especially in the macaque V1—is indeed lower than in more anterior parts of the brain. The higher noise in V1 may have obscured the burst-suppression signal and hindered its detection. That said, we think that burst-suppression would still be detectable at such low tSNR values. We base this claim on our analysis of another macaque brain region—area TE of the inferior temporal cortex (see our additions to Figure 3–figure supplement 4). The tSNR in areas TE and V1 is comparably low, and yet TE is significantly correlated with asymmetric PCs while V1 is not. Therefore, if the burst-suppression fluctuation was present in V1 we should have still detected it.

      Regarding the marmoset data, part of V1 was indeed left out of our field of view, as explicitly shown in our figures (Figure 4 and Figure 4–figure supplement 3). Though we cannot exclude the possibility that the omitted posterior V1 engages in burst-suppression, we think that it is unlikely to behave any differently to more anterior visual areas. We sought more support for this view by obtaining full-brain fMRI data in one additional marmoset. We present this analysis in a new paragraph of the relevant Results section and in the new Figure 4–figure supplement 5. The asymmetric PC map in this individual showed widespread correlation across the neocortex, extending slightly further caudally compared with the group map presented in Figure 4. However, nearly all of V1—including the occipital pole—was still uncorrelated. Considering both the new full-brain marmoset data and the results from area TE in macaques, we think that our conclusion about the uncoupling of primate V1 during burst-suppression is still justified. That said, we have now explicitly included the relevant concerns in the manuscript text.

      2) To confirm their findings, it would be great to look into the EEG signals around the sensory cortex (e.g., V1) to see whether the findings in fMRI could be also confirmed with EEG.

      EEG signals around V1 were already examined during the previous analysis of the human dataset (Golkowski et al., 2017). As reported there, the EEG signal of the occipital electrodes did contain bursts, which could not be differentiated from bursts detected by more anterior electrodes in terms of onset timing, duration, or spectral content. This might mean that the BOLD signal in V1 is truly uncoupled from electrical activity. However, we should also consider that EEG may lack the spatial resolution to detect a different activity originating from V1. As seen in the human map (Figure 3), the external cortical surface is almost exclusively covered with areas engaging in burst-suppression, whereas the ‘uncoupled’ V1 represents a small patch by comparison. Therefore, EEG cannot safely determine the nature of electrical activity in V1. We have added the above arguments to the last section of Results. We expect a conclusive answer to come from future electrophysiological recordings in nonhuman primates. The larger proportional size of visual areas in macaques and marmosets as well as the possibility of invasive intra-cranial recordings make these animals attractive models for addressing this question.

      3) As seen in Figure 2-figure supplement 2, there was a significant anticorrelation with burst-suppression at the ventricular borders. It is unclear whether the authors have done physiological or white matter/CSF/global nuisance regression as most of the rest-fMRI studies did. Please make it clear. If not, please explain why and discuss whether it would affect their results.

      We chose to analyze the data without CSF or global signal regression. CSF regression typically requires extracting the signal of a few voxels within the ventricles. Accurately placing such voxels is feasible in the human brain but challenging in small animal brains, especially in rodents. Rodent ventricles are very thin, making it difficult to place a CSF voxel that will not overlap with surrounding brain tissue. Since we had prioritized making the analysis as similar as possible across species, we decided to also forgo CSF regression in humans. While this was our original motivation for omitting CSF regression, we later came across an even more important concern. As we show in Figure 2–figure supplement 2, the CSF signal is not ‘noise’; rather, it is directly related to burst-suppression, and most likely caused by it. Regressing it out would remove much of the variance explained by burst suppression. The coherence between neural, hemodynamic, and CSF oscillations that we see in burst-suppression likely also occurs in other states characterized by global synchrony, as has been shown for non-rapid eye movement sleep (Fultz et al., 2019).

      We think that global signal regression makes no sense in our case, given that our goal was to study a nearly global signal fluctuation. Global signal regression relies on the assumption that neuronal activity is variable across brain regions while many non-neuronal sources contribute globally to the brain signal (Murphy and Fox, 2017). This assumption does not hold true in cases where the neuronal activity itself is global.
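
      The argument above can be illustrated with a toy simulation. The following is only a minimal sketch on synthetic data, not the analysis used in the study: all names, sizes, and noise levels are assumptions chosen for illustration. It shows that when the fluctuation of interest is itself shared by every voxel, regressing out the mean signal removes almost all of it.

```python
import numpy as np

rng = np.random.default_rng(0)
n_timepoints, n_voxels = 300, 50  # hypothetical sizes

# Hypothetical slow, nearly global on/off fluctuation (a stand-in for burst-suppression)
t = np.arange(n_timepoints)
global_fluctuation = np.sign(np.sin(2 * np.pi * t / 60.0))

# Every voxel carries the shared fluctuation plus independent noise
data = global_fluctuation[:, None] + 0.5 * rng.standard_normal((n_timepoints, n_voxels))


def regress_out_global_signal(x):
    """Remove the mean-across-voxels time course from each voxel by least squares."""
    g = x.mean(axis=1, keepdims=True)
    design = np.hstack([g, np.ones_like(g)])  # global signal plus intercept
    beta, *_ = np.linalg.lstsq(design, x, rcond=None)
    return x - design @ beta


def mean_abs_corr(x, ref):
    """Mean absolute Pearson correlation between each column of x and ref."""
    xc = x - x.mean(axis=0)
    rc = ref - ref.mean()
    r = (xc * rc[:, None]).sum(axis=0) / (
        np.linalg.norm(xc, axis=0) * np.linalg.norm(rc)
    )
    return np.abs(r).mean()


cleaned = regress_out_global_signal(data)
corr_before = mean_abs_corr(data, global_fluctuation)
corr_after = mean_abs_corr(cleaned, global_fluctuation)
print(f"before regression: {corr_before:.2f}, after regression: {corr_after:.2f}")
```

      In this toy setting the voxel-wise correlation with the shared fluctuation is high before regression and close to zero afterwards, which is the situation the response describes for a nearly global neuronal signal.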

      4) Three different concentrations of the anesthetic sevoflurane were chosen for human participants. The authors found that the high concentration (3.9-4.6%) induced burst-suppression much better than the other two lower concentrations, as expected. However, in rats, almost all asymmetric PCs were found at an intermediate concentration (2%) of isoflurane, with fewer at the low (1.5%) or high (2.5%) concentration in Rat 1. At the same time, all fMRI runs from Rat 2 with a 1.3% concentration of isoflurane had a prominent asymmetric PC. That is, it seems that only the high concentration of isoflurane could not induce burst-suppression well in rats, which was opposite to the findings in humans. The authors may explain what reasons may cause such differences and whether such differences may affect the major findings on differences between primates and rodents.

      The three sevoflurane concentrations (‘high’, ‘intermediate’, ‘low’) used in humans do not necessarily correspond to the three isoflurane concentrations used in rats (2.5%, 2.0%, 1.5%). Comparing anesthetic concentrations across our datasets is challenging, since anesthetic potency is expected to vary depending on the drug (sevoflurane or isoflurane), animal species, age, and the co-administration of other drugs. Nevertheless, we may estimate equivalent concentrations across species by expressing them as multiples of the minimum alveolar concentration (MAC), i.e. the concentration that produces immobility in 50% of subjects undergoing a standard surgical stimulus.

      For humans, we can use available age-related MAC charts (Nickalls and Mapleson, 2003) to express the three sevoflurane levels as follows: ~1 MAC (2%), 1.5 MAC (3%), 2.2–2.3 MAC (3.9–4.6%). For rats, we can rely on the previously reported isoflurane MAC value of 1.35% (Criado et al., 2000) to derive the following levels: 1.2 MAC (1.5%), 1.6 MAC (2%), 1.9 MAC (2.5%), and ~1 MAC (1.3%, Rat 2 dataset). According to these conversions, fMRI-detectable burst-suppression occurred in humans at ~2 MAC (with some cases at 1.5 MAC), in the Rat 1 dataset at 1.2–1.6 MAC, and in the Rat 2 dataset at 1 MAC. There seems to be a difference between rats and humans as well as a discrepancy between the two rat datasets. The latter discrepancy could have arisen from differences in the calibration of isoflurane vaporizers at the two research sites (direct measurements of end-tidal anesthetic concentration were not obtained in rats).

      In order to better interpret the observed human-rat difference we tried to also compute the multiples of MAC values for our nonhuman primate data, but this proved to be hard. For common marmosets, we are not aware of any published isoflurane MAC values. For long-tailed macaques, a value of 1.28% has been reported (Tinker et al., 1977), which gives a range of 0.7 – 1.2 MAC for our macaque dataset. However, that probably underestimates the actual depth of anesthesia in our experiments, since many of our macaques were old and MAC is known to decrease with age (Nickalls and Mapleson, 2003). Moreover, the administration of medetomidine during anesthesia induction may have further reduced the MAC (Ewing et al., 1993). Consequently, we cannot provide good MAC estimates for the nonhuman primate data and thus have no reference for comparison with other species.

      Even if we knew the correct MAC value in all cases, it may be an inappropriate means of standardizing anesthetic concentrations for burst-suppression. The endpoint measured by MAC (immobility) is mainly mediated by anesthetic effects on the spinal cord and may not be a good predictor for effects on the brain (Rampil et al., 1993). In fact, burst-suppression itself has been proposed as a more appropriate endpoint for measuring anesthetic potency. The proposed metric (MACBS) is defined as the concentration that produces suppressions longer than 1 s in 50% of subjects and is not linearly related to MAC (Pilge et al., 2014).

      In conclusion, if we reference anesthetic concentrations against the MAC, humans and rats indeed seem to exhibit burst-suppression at different concentration ranges. We are unable to perform the same referencing for non-human primates, due to lack of accurate MAC values. Moreover, it is unclear whether MAC is the appropriate reference to begin with. Discussing all these nuances would make the manuscript too long. That said, we have now added a new paragraph to the Discussion section, drawing attention to the fact that anesthetic concentrations are not standardized across species.

      Reviewer #2 (Public Review):

      The strong point in their manuscript is the originality of their results. Using the fMRI's spatial resolution, they can successfully reveal that not all brain areas are synchronized during the burst suppression. Furthermore, they can find that the difference is the most obvious when comparing primates with rats, which makes sense considering the distance on the phylogenetic tree. As far as I know, this manuscript first reports these points.

      On the other hand, there is a weak point in their method. As they've already discussed this point, they needed to use arbitrary thresholds to evaluate whether there is burst suppression or not. Furthermore, this study cannot reject the possibility of spatial inhomogeneity and/or anesthesia-specific modulation in hemodynamic response. If there is such a mechanism, one can find different results from those obtained through electrical measurements.

      1) The authors found that some sensory areas in primates are excluded from those highly synchronized during the burst suppression. While it is true, I wonder if each voxel in such areas shows burst suppression-like activity that is not synchronized with others. If this is the case, burst suppression can still be a global phenomenon. Though authors seem to investigate this point, they used in-ROI averaged time-series so that it cannot reject the possibility that each voxel inside the ROI is not synchronized but shows burst suppression in its manner. I recommend the authors look into each voxel if this is the case or not.

      The reviewer raises an interesting point by proposing that it is possible for sub-regions within the excluded areas (e.g. within V1) to exhibit burst-suppression out-of-phase with each other, thus cancelling out in the mean V1 BOLD signal. We do not think this is the case, for several reasons. Firstly, we can exclude the possibility that any part of V1 exhibits burst-suppression in-phase with the rest of the cortex. The original first-level GLM analysis was a voxel-based univariate analysis. If any voxels within V1 were correlated with the global burst-suppression pattern, we would have seen it on the maps. We saw no such effect, except for some subjects in which a subset of V1 voxels was anti-correlated with the asymmetric PC (the effect was not significant in our group analysis). This anticorrelation was mostly located close to the ventral horns of the two lateral ventricles, and thus could have arisen by the same cycle of ventricular shrinkage-expansion that we describe in Figure 2–figure supplement 2. Secondly, no large clusters of V1 voxels exhibited burst-suppression out-of-phase with the dominant asymmetric PC. If this were the case, we would have seen a phase-shifted version of the fluctuation on the carpet plots. This still leaves the theoretical possibility that individual V1 voxels (or a few at a time) exhibit transitions between burst and suppression epochs out-of-phase with each other. In our response to the next point, we will explain why there is no way of detecting this with fMRI and we discuss whether such a possibility would even fit the label of burst-suppression.
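
      The cancellation scenario raised by the reviewer can be made concrete with a toy example. This is a hypothetical sketch only: the square-wave fluctuation, the period, and the voxel count are assumptions, not values from the study. It compares the ROI-averaged signal when all voxels fluctuate in phase with the case where each voxel has its own random phase.

```python
import numpy as np

rng = np.random.default_rng(4)
n_timepoints, n_voxels, period = 600, 200, 60  # hypothetical values
t = np.arange(n_timepoints)


def roi_mean_sd(phases):
    """SD of the ROI-averaged signal when each voxel shows the same
    square-wave fluctuation shifted by its own phase (in samples)."""
    voxels = np.sign(np.sin(2 * np.pi * (t[:, None] + phases[None, :]) / period))
    return voxels.mean(axis=1).std()


in_phase_sd = roi_mean_sd(np.zeros(n_voxels))
out_of_phase_sd = roi_mean_sd(rng.uniform(0, period, n_voxels))
print(f"in phase: {in_phase_sd:.2f}, random phases: {out_of_phase_sd:.2f}")
```

      With random phases the averaged fluctuation is much smaller than that of any single voxel, which is why an ROI mean alone cannot rule out voxel-level fluctuations that are out of phase with each other.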

      2) The other but similar point is about their way to detect burst suppression. Why did they use the principal component? By definition, burst suppression should be defined by the existence of burst and suppressed periods. I cannot understand why they did not simply use this definition to check whether each voxel shows such an intermittent activity to evaluate whether it is a global phenomenon or not.

      Burst-suppression on EEG is characterized by quasi-periodic suppressions of activity, during which the EEG signal drops close to being isoelectric. We cannot apply the same definition to fMRI, because the BOLD signal only represents relative changes and thus has no natural baseline equivalent to isoelectricity. Hence there is no way of telling whether a BOLD signal decrease corresponds to a complete activity cessation (suppression) or simply a relative decline. Therefore, we instead decided to rely on another defining feature of burst-suppression—synchrony. We knew that burst-suppression appears simultaneously across EEG electrodes, which means that large parts of the cortex (the major contributor to EEG signal) would have to be synchronized. Moreover, we knew that transitions between burst and suppression epochs occur on a very slow timescale and would be resolvable at a TR of 2 s. PCA allowed us to isolate the large slow synchronous component in the cortical BOLD signal, though this is hardly the only approach that would work. We chose PCA because it is a simple, deterministic, and easily interpretable algorithm.
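
      As an illustration of the idea, the sketch below applies PCA to synthetic "cortical" time series in which most voxels share a slow on/off fluctuation and a minority do not. It is a simplified example under assumed parameters (voxel counts, amplitudes, the shape of the slow fluctuation), not a reproduction of the pipeline used in the study. It then checks whether the voxel-wise correlations with the first PC are shifted away from zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n_timepoints, n_voxels, n_coupled = 400, 100, 80  # hypothetical sizes

# Hypothetical slow on/off fluctuation shared by the "coupled" voxels
t = np.arange(n_timepoints)
slow = (np.sin(2 * np.pi * t / 80.0) > 0).astype(float)

# Independent noise everywhere; the first n_coupled voxels also carry the slow signal
data = rng.standard_normal((n_timepoints, n_voxels))
data[:, :n_coupled] += 2.0 * slow[:, None]

# z-score each voxel, then obtain principal components via SVD
z = (data - data.mean(axis=0)) / data.std(axis=0)
u, s, vt = np.linalg.svd(z, full_matrices=False)
pc1 = u[:, 0] * s[0]  # time course of the first principal component

# Pearson correlation between the first PC and every voxel
pc1_z = (pc1 - pc1.mean()) / pc1.std()
r = (z * pc1_z[:, None]).mean(axis=0)

# The sign of a PC is arbitrary; orient it so that the median correlation is positive
if np.median(r) < 0:
    r, pc1 = -r, -pc1

median_r = np.median(r)
uncoupled_mean_abs_r = np.abs(r[n_coupled:]).mean()
print(f"median r: {median_r:.2f}, uncoupled voxels mean |r|: {uncoupled_mean_abs_r:.2f}")
```

      In this toy case the median correlation is far from zero because most voxels follow the first PC, while the voxels without the shared fluctuation stay near zero.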

      On a related note, even if we could identify complete cessation of activity in the BOLD signal of a single voxel, it is unclear whether that would qualify as burst-suppression per the EEG definition. EEG electrodes pick up activity from areas much larger than a voxel, and thus the very presence of an EEG fluctuation presupposes synchrony on a larger spatial scale. If individual voxel-sized brain areas engaged in burst-suppression out-of-phase, that would probably not register as burst-suppression on an EEG electrode.

      3) Why is there no synchronization during the slow-wave states under light anesthesia? During the slow-wave sleep, it is shown that the entire cortical network is decomposed into a modular-like network structure. Is there synchronization inside each module while no synchrony between modules?

      We do not claim that there is no synchrony in the slow-wave state. We simply state that this state lacks the nearly global cortex-wide fluctuation that is produced by the abrupt transitions between burst and suppression epochs. In fact, the very presence of slow waves on EEG requires synchrony. However, this slow-wave synchrony occurs at a timescale too fast for fMRI to capture, and thus would not directly translate into a global BOLD fluctuation, as burst-suppression does.

      Though the slow-wave state lacks global synchrony on fMRI, it may well exhibit within-module synchrony, as the reviewer suggests. Modules resembling the resting-state networks of wakefulness and sleep have been detected during isoflurane anesthesia in primates (Hori et al., 2020; Hutchison et al., 2011). These experiments were presumably conducted during the slow-wave state: burst-suppression would generate a global network, while the isoelectric state would erase any modular structure. We suspect that functional networks during the anesthetized slow-wave state resemble those present in slow-wave sleep. However, we have not assessed that in our study, since our primary goal was to map burst-suppression.

      Reviewer #3 (Public Review):

      The authors present a multicenter, multimodal rs-fMRI study of the spatial signature of burst suppression in the brain of humans, non-human primates and rats. They have used EEG to identify burst suppression activity in human data from simultaneous EEG-rs-fMRI measurements of subjects under sevoflurane anesthesia. After having identified a (neurovascular) rs-fMRI representation of burst activity, the authors show that bursts can equally be identified from MR data alone. After a principal component analysis, bursts and their spatial signature were identified by an asymmetry of the correlation coefficients. Across species the authors identified similar spatial signatures, which were conserved for all (investigated) primates, but differed for rats. While rats showed a pan-cortical involvement, signatures in primates were more complex, e.g., not including the visual cortex.

      In this study, the authors have presented a novel purely MR-based method to identify burst suppression and its spatial signature. Their method may be used to readily identify burst suppression in fMRI data. However, no general threshold for the median of the cortex-wide correlation could be identified. The authors also establish a conserved signature of burst suppression for primates and reveal subtle but important differences to rodents. Both achievements are novel and represent a major advance in the field of neuroimaging.

      The study was well designed, including important control data to rule out artefacts as source of the observed burst suppression patterns. The particular strengths of this study are: (1) including multicentre data (although only rats were scanned at two different sites); and (2) including four species from humans to rats.

      The manuscript was very carefully and well written (I did not even notice a single typo) and the figures were carefully devised, comprehensively illustrating the large amount of data. The authors further provide a comprehensive account of the relevant literature. Towards the end of their discussion they also clarify the difference in terminology used for burst suppression in some recent rodent studies.

      The only (and in my opinion notable) weakness is the lack of a general threshold for the asymmetry of the median of the cortex-wide correlation coefficients. With such a threshold, rs-fMRI could be readily used to automatically detect burst suppression across species. However, the authors clearly state this shortcoming and openly discuss its implications. I do not think that an altered experimental design or additional data could provide further remedy.

      To conclude: This very comprehensive study was very well designed, extremely carefully performed, presents a novel tool for identification of burst suppression, and provides insight across species. It clearly has translational potential, which, however, is limited by the lack of a general threshold for burst suppression detection.

      I congratulate the authors for this very nice piece of work, and the most typo-free manuscript I have ever read.

      We thank the reviewer for the positive and detailed feedback.

    1. Author Response

      Reviewer #1 (Public Review):

      When theta phase precession was discovered (O'Keefe & Recce, 1993; place cell firing shifting from late to early theta phases as the rat moves through the firing field, averaged over many runs), it was realized that, correspondingly, firing moves from cells with firing fields that have been run through (early phase) to those whose fields are being entered (late phase), with the consequence that a broader range of cells will be firing at this late phase (Skaggs et al., 1996; Burgess et al., 1993; see also Chadwick et al., 2015). Thus, these sweeps could represent the distribution of possible future trajectories, with the broadening distribution representing greater uncertainty in the future trajectory.

      Using data from Pfeiffer and Foster (2013), they examine how neurons could encode the distribution of future locations, including its breadth (i.e. uncertainty), testing a couple of proposed methods and suggesting one of their own. The results show that decoded location has increasing variability at later phases (corresponding to locations further ahead), and greater deviation from the actual trajectory. Further results (when testing the models below) include that population firing rate increased from early to late phases; decoding uncertainty does not change within-cycle, and the cycle-by-cycle variability (CCV) increases from early to late phases more rapidly than the trajectory encoding error (TEE).

      They then use synthetic data to test ideas about neural coding of the location probability distribution, i.e. that: a) place cell firing corresponds to the tuning functions on the mean future trajectory (w/o uncertainty); b) the distribution is represented in the immediate population firing as the product of the tuning functions of active cells or c) (DDC) the distribution is represented by its overlap with the tuning curves of individual neurons; d) (their suggestion) that different possible trajectories are sampled from the target distribution in different theta cycles.

      The product scheme has decreasing uncertainty with population firing rate, so would have to have maximal firing at early phases (corresponding to locations behind the rat), contradicting what was observed in the data, so this scheme is discarded.
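
      The relation between firing and uncertainty in a product scheme can be checked numerically. The sketch below is a deliberately simplified 1D illustration: identical Gaussian tuning curves, one factor per spike, and no prior or silent-cell terms are all assumptions, and the grid and tuning width are hypothetical. Under these assumptions the product of n tuning curves has a standard deviation of roughly the tuning width divided by the square root of n.

```python
import numpy as np

x = np.linspace(-50.0, 50.0, 2001)  # hypothetical 1D position grid (cm)
tuning_width = 10.0                 # assumed Gaussian tuning width (cm)


def product_decoder_sd(centers):
    """SD of the distribution given by the normalised product of
    Gaussian tuning curves centred at `centers` (one factor per spike)."""
    log_p = np.zeros_like(x)
    for c in centers:
        log_p += -0.5 * ((x - c) / tuning_width) ** 2
    p = np.exp(log_p - log_p.max())  # subtract the max for numerical stability
    p /= p.sum()
    mean = (p * x).sum()
    return np.sqrt((p * (x - mean) ** 2).sum())


# More spikes (higher population firing) give a narrower decoded distribution
sds = {n: product_decoder_sd(np.linspace(-5.0, 5.0, n)) for n in (1, 4, 16)}
for n, sd in sds.items():
    print(n, round(float(sd), 2), "vs", round(tuning_width / np.sqrt(n), 2))
```

      The decoded width shrinks as the number of contributing spikes grows, which is the property that ties represented uncertainty to population firing rate in this scheme.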

      The DDC scheme has an increased diversity of cells firing as the target distribution gets wider within each cycle, whereas the mean and sampling schemes do not have increasing variance within-cycle (representing a single trajectory throughout). The decoding uncertainty in the data did not vary within-cycle, so the DDC scheme was discarded.

      The mean and sampling schemes are distinguished by the increase in CCV vs TEE with phase, which is consistent with the sampling scheme.

      The analyses are well done and the results with synthetic data (assuming future trajectories are randomly sampled from the average distribution) and real data match nicely, although there is excess variability in the real data. Overall, this paper provides the most thorough analyses so far of place cell theta sweeps in open fields.

      We thank the Reviewer for the accurate summary and the encouragement.

      I found the framing of the paper confusing in a way that made it harder to understand the actual contribution made here. As noted in the discussion, the field has moved on from the 1990s and cycle-by-cycle decoding of theta sweeps has consistently shown that they correspond to specific trajectories moving from the current trajectory to potential future trajectories, consistent with continuous attractor-based models (in which the width of the activity bump cannot change, e.g. Hopfield, 2010). Thus it seems odd to use theta sweeps to test models of encoding uncertainty - since Johnson & Redish (2007) we know that they seem to encode specific trajectories (e.g. either going one way or the other at a choice point) rather than an average direction with variance covering the possible alternatives.

      We thank the reviewer for emphasising the connections to earlier work on theta sweeps during decision making, which suggests that alternative options before a decision point are assessed individually by hippocampal neuron populations in a simple maze. However, as also noted by the reviewer below, previous analyses of theta sweeps in the hippocampus were limited to discrete decisions in a linear maze, which only permits a limited exploration of the alternative hypotheses an animal might experience in a planning situation.

      In particular, the dominant source of future uncertainty in a binary decision task is the chosen option (left or right), providing a distinctly bimodal predictive distribution. Bimodal distributions cannot be easily approximated by variational methods (which include the DDC and product schemes) but can be efficiently approximated by sampling. In contrast, in an open field the available options (changes in direction and speed) are not restricted by the geometry of the environment, and the predictive distribution is relatively similar to a Gaussian distribution, which can be efficiently approximated by all of the investigated encoding schemes.

      Moreover, it has been widely reported that the hippocampal spatial code has somewhat different properties in linear tracks, where the physical movement of the animal is restricted by the geometry of the environment, than in open field navigation. Specifically, in linear tracks most neurons develop unidirectional place fields and the hippocampal population uses different maps to represent the two opposite running directions, whereas a single map and omnidirectional place fields are used in open fields (Buzsaki, 2005). In terms of representing future alternatives, it remains an open question whether the scheme that is compatible with planning in a 1D environment generalises to 2D environments. Our detailed comparison of the alternative encoding schemes provides an opportunity to demonstrate that a sampling scheme can be applied as a general computational algorithm to represent quantities necessary for probabilistic planning, while also demonstrating that alternative schemes are incompatible with it.

      Moreover, these previous studies did not rule out the possibility that, in addition to alternating between discrete options, specific features of the population activity might also represent uncertainty (conditional on the chosen option) instantaneously, as in the product or the DDC schemes.

      We added a new paragraph (lines 74-88) to the introduction to clarify that one of the novel contributions of the paper is the generalisation of previous intuitions, largely based on work on binary decision tasks in mazes, to unrestricted open field environments.

      The point that schemes that assume a varying-width activity distribution might be unfit for modelling hippocampal theta activity is an interesting insight. Let us note that new results have pointed out that the fixed-width activity bump is not a necessary feature of attractor networks. It has recently been shown that in continuous attractors (modelling head direction cells in the fly) the amplitude of the bump can change and the changes can be consistent with the represented uncertainty (Kutschireiter et al., 2021, bioRxiv; https://doi.org/10.1101/2021.12.17.473253). We believe that similar principles also apply to higher-dimensional continuous attractor networks and therefore it is entirely possible to represent uncertainty via the amplitude of the bump (equivalent to the population gain) in the hippocampus.

      Thus, the main outcomes of the simulations could reasonably be predicted in advance, and the possibility of alternative neural models of uncertainty explaining firing data remains: in situations where it is more reasonable to believe that the brain is in fact encoding uncertainty as the breadth of a distribution.

      Having said that, most previous examples of trajectory decoding of theta sweeps have not been for navigation in open fields, and the analysis of Pfeiffer and Foster (2013; in open fields) was restricted to sequential 'replay' during sharp-wave ripples rather than theta sweeps. This paper provides the nicest decoding analyses so far of place cell theta sweeps in open field data. However, there are already examples of theta sweeps in entorhinal cortex in open fields (Gardner et al., 2019) showing the same alternating left/right sweeps as seen on mazes (Kay et al., 2020). Such alternation could explain the additional cycle-by-cycle variability observed (cf random sampling).

      We thank the reviewer for encouraging us to test more directly the idea that alternating left-right sweeps could explain the increased cycle-to-cycle variability in the data. We thoroughly analysed the data (see our answer to essential revisions 1.) and found that trajectories in subsequent theta cycles are strongly anticorrelated (Fig. 7, Fig. S11, lines 375-415).
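
      The difference between independent sampling and alternating sweeps can be seen in a toy example. The sketch below is not the analysis reported in the manuscript: the lateral offsets are synthetic, and the use of a lag-1 correlation as the summary statistic is an assumption made for illustration. Independent samples should give a correlation near zero between successive cycles, whereas strict left/right alternation gives a clearly negative one.

```python
import numpy as np

rng = np.random.default_rng(3)
n_cycles = 2000  # hypothetical number of theta cycles


def lag1_correlation(v):
    """Pearson correlation between the value in one cycle and in the next."""
    return np.corrcoef(v[:-1], v[1:])[0, 1]


# Independent sampling: the lateral offset of each sweep is drawn afresh every cycle
independent = rng.standard_normal(n_cycles)

# Alternating sweeps: random magnitude, but the side flips on every cycle
side = np.where(np.arange(n_cycles) % 2 == 0, 1.0, -1.0)
alternating = np.abs(rng.standard_normal(n_cycles)) * side

r_independent = lag1_correlation(independent)
r_alternating = lag1_correlation(alternating)
print(f"independent: {r_independent:.2f}, alternating: {r_alternating:.2f}")
```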

      Reviewer #2 (Public Review):

      This study investigates how uncertainty about spatial position is represented in hippocampal theta sequences. Understanding the neural coding of uncertainty is an important issue in general, because computational and theoretical work clearly demonstrates the advantages of tracking uncertainty to support decision-making, behavioural work in many domains shows that animals and humans are sensitive to it in myriad ways, and signatures of the neural representations of uncertainty have been demonstrated in many different systems/circuits.

      We thank the reviewer for the comment.

      However, the question of whether and how uncertainty is signalled in the hippocampus has remained understudied. The question of how spatial uncertainty is represented is already interesting, but recent interest in interpreting hippocampal sequences as important for planning and decision-making provides additional motivation.

      A variety of experimental paradigms such as recordings in light vs. darkness, dual rotation experiments in which different cues are placed in conflict with another, "morph" and "teleportation" experiments and so on, all speak to this issue in some sense (and as I note below, could nicely complement the present study); and a number of computational models of the hippocampus have included some representation of uncertainty (e.g. Penny et al. PLoS Comp Biol 2013, Barron et al. Prog Neurobiol 2020). However, the present study fills an important gap in that it connects a theory-driven approach of when and how uncertainty could be represented in principle, with experimental data to determine which is the most likely scheme.

      The analyses rely on the fundamental insight that states/positions further into the future are associated with higher uncertainty than those closer to the present. In support of this idea, the authors first show that in the data (navigation in a square environment, using the wonderful data from Pfeiffer & Foster 2013), decoding error increases within a theta sequence, even after correcting for the optimal time shift.

      The authors then lay out the leading theoretical proposals of how uncertainty can be represented in principle in populations of neurons, and apply them to hippocampal place cells. They show that for all of these schemes, the same overall pattern results. The key advance of the paper seems to be enabled by a sophisticated generative model that produces realistic probability distributions to be encoded (that take into account the animal's uncertainty about its own position). Using this model, the authors show that each uncertainty coding scheme is associated with distinct neural signatures that they then test against the data. They find that the intuitive and commonly employed "product" and "DDC" schemes are not consistent with the data, but the "sampling" scheme is.

      The final conclusion that the sampling scheme is most consistent with the data is perhaps not surprising, because similar conclusions have been reached from showing alternating representation of left and right at choice points cited by the authors (Johnson and Redish 2007; Kay et al. 2020; Tang et al. 2021) and "flickering" from one theta cycle to the next (Jezek et al. 2011). So, the most novel parts of the work to me are the rigorous ruling out of the alternative "product" and "DDC" schemes.

      We thank the reviewer for helping us to clarify the main novelty of our work compared to previous studies. We have updated the introduction (lines ~74–88) to state more clearly how our analysis extends previous work largely restricted to binary decision tasks in mazes and not explicitly considering alternative probabilistic representations.

      Overall I am very enthusiastic about this work. It addresses an important open question, and the structure of the paper is very satisfying, moving from principles of uncertainty encoding to simulated data to identifying signatures in actual data. In this structure, the generative model that produces the synthetic data is clearly playing an important role, and intuitively, it seems the conclusions of the paper depend on how well this testbed maps onto the actual data. I think this model is a real strength of the paper and moves the field forward in both its conceptual sophistication (taking into account the agent's uncertainty) and in how carefully it is compared to the actual data (Figures S2, S3).

      We thank the reviewer for the encouraging words.

      I have two overall concerns that can be addressed with further analyses.

      First, I think the authors should test which of the components of this model are necessary for their results. For instance, if the authors simply took the successor representation (distribution of expected future state occupancy given current location) and compressed it into theta timescale, and took that as the probability distribution to be encoded under the various schemes, would the same predictions result? Figuring out which elements of the model are necessary for the schemes to become distinguishable seems important for future empirical work inspired by this paper.

      The crucial part of our generative model is its probabilistic nature. Explicit formulation of the generative model under different coding schemes enables us to quantitatively account for the different factors contributing to the variability in the data. Specifically, when we compared sampling and mean codes, we partitioned variability of the represented locations across theta cycles into specific factors related to 1) decoding error; 2) difference between the true position of the animal and its own location estimate; 3) the animal’s own uncertainty about its spatial location; 4) updating this estimate in each theta cycle. This enabled us to derive quantities (CCV, TEE and EV-index) that can discriminate between sampling and mean schemes, and that could be directly measured experimentally. This would not be possible in a simpler model lacking an explicit representation of the animal’s internal uncertainty.

      We believe that the assumptions of the model are rather general and do not limit its scope. Here we list the specific features of the model for clarity (Fig S1a):

      1) Planned position (Fig S1a, left): the planned position is required to guide movements in the model. The specific way we generated the planned position was not essential for the simulations but we tuned the movement parameters to generate trajectories matching the real movement of the animal. It is defined as a random walk process for velocity which is the simplest model for smooth trajectories.

      2) The inference part (Fig S1a, middle) is crucial for the model since we believe that hippocampal population activity is driven by the animal’s own beliefs about its position, which tells our approach apart from earlier studies (see paragraph around line 466). If the animal represents its predictions optimally then the predictions should be consistent with its movement within the environment. Thus, the consistency of the inference is a critical statistical property of the model, which can be guaranteed if the predictions are generated by the same model that is used for inferring the animal’s position. The simplest model that can be used for inference and predictions is the Kalman filter, which we opted for in our simulations.

      3) The assumptions of the encoding model (Fig S1a, right and Fig 1b) are solely determined by the representational scheme being tested. All of the schemes rely on encoding the result of inference in population activity during theta cycles and the scheme determines how this encoding happens. This part of the model is clearly necessary for the analysis.
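
      The inference step in item 2 can be illustrated with a minimal one-dimensional Kalman filter. This is only a sketch under stated assumptions, not the model used in the manuscript: the state is (position, velocity) with a random walk on velocity, and the function names and the parameters `dt`, `accel_var` and `obs_var` are hypothetical choices made for illustration. It shows the property the analyses rely on: when the same model is used for inference and for prediction, the predicted position variance grows with the prediction horizon.

```python
import numpy as np

def make_model(dt=0.1, accel_var=1.0, obs_var=0.5):
    """Linear-Gaussian model: state = [position, velocity] (illustrative parameters)."""
    F = np.array([[1.0, dt], [0.0, 1.0]])              # position integrates velocity
    Q = np.array([[0.0, 0.0], [0.0, accel_var * dt]])  # random walk on velocity
    H = np.array([[1.0, 0.0]])                         # only position is observed
    R = np.array([[obs_var]])                          # observation noise variance
    return F, Q, H, R

def predict(mu, P, F, Q, steps=1):
    """Propagate the belief (mean mu, covariance P) `steps` time steps ahead."""
    for _ in range(steps):
        mu = F @ mu
        P = F @ P @ F.T + Q
    return mu, P

def update(mu, P, z, H, R):
    """Condition the belief on a noisy position observation z (array of shape (1,))."""
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    mu = mu + K @ (z - H @ mu)
    P = (np.eye(len(mu)) - K @ H) @ P
    return mu, P
```

      Calling `update` once and then `predict` with increasing `steps` yields a position variance (`P[0, 0]`) that increases monotonically with the horizon.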

      Alternatively, we could use the above-mentioned successor representation (SR) framework (Dayan 1993) to represent possible trajectories and their associated uncertainty in our models of hippocampal population activity. However, this option introduces extra challenges. First, in the SR framework (Stachenfeld et al., 2017) neuronal firing rates are proportional to the discounted expected future number of times a particular location is going to be visited given the current policy and position. Thus, the SR sums over all possible future visits and does not specify when exactly a particular state might be reached in the future, which is inconsistent with the idea that trajectories are represented during theta sequences. Second, the SR represents the probability of occupying all future states in parallel without providing possible trajectories defining specific combinations of future state visits. This property is consistent with the product and the DDC encoding schemes but not with the other two. These two properties of the SR imply that this framework per se does not provide a fine-scale temporal description of how expected future state probabilities are related to the dynamics of the hippocampal population activity during theta oscillation.
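
      For reference, the standard definition of the SR makes the first point explicit. The symbols below are notation assumed here for illustration and do not appear in the manuscript: $M$ is the SR, $T$ the state-transition matrix under the current policy, and $\gamma \in [0, 1)$ the discount factor.

```latex
M(s, s') \;=\; \mathbb{E}\!\left[\,\sum_{t=0}^{\infty} \gamma^{t}\,\mathbb{1}[s_t = s'] \;\middle|\; s_0 = s\right],
\qquad
M \;=\; \sum_{t=0}^{\infty} \gamma^{t} T^{t} \;=\; (I - \gamma T)^{-1}.
```

      Because the sum runs over all future time steps $t$, $M(s, s')$ pools early and late visits to $s'$ into a single number, so the time at which a state is expected to be reached cannot be read out from $M$ alone.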

      Taken together, implementing theta time-scale dynamics using the SR framework would also require several additional model choices to generate consistent temporal trajectories from the expected future state occupancies, and even in this case the subjective uncertainty of the animal would not be consistently represented in the simulated data. Representing the animal’s subjective uncertainty in our model was an important component in contributing to the EV-index and had profound implications on the signatures of generative cycling in a two dimensional arena.

      We have to note that on a slower time scale (calculating the average firing rate over multiple theta cycles) all of our encoding schemes are consistent with the SR framework (line 548).

      Second, the analyses are generally very carefully and rigorously performed, and I particularly appreciated how the authors addressed bias resulting from noisy estimation of tuning curves (Figure S7). However, the conclusion that the "sampling" scheme is correct relies on there being additional variance in the spiking data. This is reminiscent of the discussions about overdispersion and how "multiple maps" account for it (Jackson & Redish Hippocampus 2007, Kelemen & Fenton PLoS Biol 2010), and the authors should test if this kind of explanation is also consistent with their data. In particular, the task has two distinct behavioral contexts, when animals are searching for the (not yet known) "away" location compared to returning to the known home location, which extrapolating from Jackson & Redish, could be associated with distinct (rate) maps leading to excess variance.

      We thank the reviewer for this constructive comment. We note that the signature of the sampling scheme is variability in the decoded trajectory across subsequent theta cycles while overdispersion is usually defined as the supra-Poisson variability in the spiking of individual neurons evaluated across multiple runs or trials. Nevertheless, we tested the existence of multiple maps corresponding to the two distinct task phases and found that the maps representing the two task phases are very similar (Fig S11).

      Such an analysis could also potentially speak to an overall limitation of the work (not a criticism, more of a question of scope) which is that there are no experimental manipulations/conditions of different amounts of uncertainty that are analyzed. Comparing random search (high uncertainty, I assume) to planning a path to a known goal (low uncertainty) could be one way to address this and further bolster the authors' conclusions.

      We agree with the reviewer that the proposed framework provides additional insights into the way the population activity should change with specific experimental manipulations and can therefore inspire further experiments. In particular, a hallmark of probabilistic computations is that experimental manipulations that control the uncertainty of the animal should be reflected in population responses. In visual processing such manipulations are indeed reflected in changing response variability, as predicted by sampling (Orban et al., Neuron 2016). In the current experimental paradigm there was no direct manipulation of uncertainty (we discuss this around lines 573-576). While one might argue that there are differences in the planning strategy between trials where the animal was heading for the away reward and those where it was heading for home, this is not a very explicit test of the question. Still, to check whether we can find traces of changes in uncertainty between the two conditions, we analysed the EV-index separately on home and away trials (Fig. S11e). We did not find systematic differences in the EV-index across these trial types.

      Reviewer #3 (Public Review):

      Summary of the goals:

      The authors set out to test the hypothesis that neural activity in hippocampus reflects probabilistic computations during navigation and planning. They did so by assuming that neural activity during theta waves represents the animal's location, and that uncertainty about this location should grow along the path from the recent past to the future. They next generated empirical signatures for each of the main four proposals for how probabilities may be encoded in neural responses (PPC, DDC, Sampling) and contrasted them with each other and a non-probabilistic representation (scalar estimate of location). Finally, the authors compared their predictions to previously published neural activity and concluded that a sampling-based representation best explained neural activity.

      Impact & Significance: This manuscript can make a significant impact on many fields in neuroscience from hippocampal research studying the functions and neural coding in hippocampus, through theoretical works linking the representation of uncertainty to neural codes, to modeling experimental paradigms using navigation tasks. The manuscript provides the following novel contribution to cognitive neuroscience:

      • It exploits the inherent change in uncertainty about a parsimonious internal variable over time during planning to test hypotheses about probabilistic computations.
      • A full model comparison of competing hypotheses for the neural implementation of probabilistic beliefs. This is a topic of wide interest and direct comparisons using data have been elusive.
      • The study presents substantial empirical evidence for a sampling-based neural representation of the probability distribution over trajectories in the hippocampus, a finding with potential implications for other parts of neural processing.

      Strengths:
      • Creative exploitation of a naturally occurring change in uncertainty over a parsimonious latent variable (location).
      • Derivation of three empirical signatures using a combination of analytical and numerical work.
      • Novel computational modelling & linking it to neural coding using 4 existing implementational models
      • Comprehensive and rigorous data analysis of a large and high-quality neural dataset, with supplemental analyses of a second dataset
      • Mostly very clear and high quality presentation

      We thank the Reviewer for the summary and for the positive feedback on the manuscript.

      Weaknesses:
      • It is unclear to what degree the "signatures" depend on the details of the numerical simulation used by the authors to generate them. At least two of them (gain for the product scheme and excess variability for the sampling scheme) appear very general, but the degree of robustness should be discussed for all three signatures.

      The generality of the signatures follows from the fact that we derived them from the fundamental properties of the encoding schemes. We tested their robustness using both idealised test data (Fig. S6c-d, Fig. S7b) and our simulated hippocampal model (Fig. 4c, Fig. 5b-c, Fig. 6b-g).

      The reviewer is right that sensitivity and robustness are potential issues. These schemes were originally proposed to encode static distributions, i.e., the neuronal activity was supposed to encode a specific probability distribution for an extended period of time. Therefore, when we test the signatures we make the simplifying assumption that a static distribution is encoded in each of the three separate phases of the theta cycle. It is currently unknown whether during theta sequences the trajectories are represented via discrete jumps in position or as continuously changing locations. Therefore we used our numerical simulations to test whether the proposed signatures are sufficiently sensitive to discriminate the encoding schemes using the limited amount of data available and in the face of biological noise, but also robust to the parameter choices and modelling assumptions.

      Regarding the product code, the inverse relationship between the gain and the variance has been previously derived analytically for special cases (Ma et al., 2006). In the manuscript we show numerically that the same relationship holds for general tuning curve shapes (Fig. S6d). Finally we demonstrate that the gain is a robust signature that changes systematically along the theta cycles in the case of a product coding scheme.
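
      One such special case can be sketched as follows. The assumptions are made here for illustration and are not taken from the manuscript: independent Poisson neurons with spike counts $r_i$, Gaussian tuning curves $f_i(x)$ of common width $\sigma_{\mathrm{tc}}$ and gain $g$ with centres $x_i$ tiling space densely and uniformly (so that $\sum_i f_i(x)$ is approximately constant), and a flat prior over position $x$.

```latex
f_i(x) = g \exp\!\left(-\frac{(x - x_i)^2}{2\sigma_{\mathrm{tc}}^2}\right),
\qquad
\log p(x \mid \mathbf{r}) = \sum_i r_i \log f_i(x) - \sum_i f_i(x) + \mathrm{const}
\approx -\sum_i r_i \frac{(x - x_i)^2}{2\sigma_{\mathrm{tc}}^2} + \mathrm{const},

p(x \mid \mathbf{r}) = \mathcal{N}\!\left(x;\; \frac{\sum_i r_i x_i}{\sum_i r_i},\; \frac{\sigma_{\mathrm{tc}}^2}{\sum_i r_i}\right),
\qquad
\mathbb{E}\Big[\sum_i r_i\Big] \propto g .
```

      Under these assumptions the posterior variance is $\sigma_{\mathrm{tc}}^2 / \sum_i r_i$, so it scales inversely with the total spike count and hence with the gain.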

      Second, in the case of the DDC code we used the decoded variance of the posterior as the signature. Since the DDC code relies on the overlap between the target distribution and the neuronal basis functions, potentially the most important source of error is overestimating the size of the encoding basis functions. To control for this factor, we first explored this effect in an idealised setting (Fig S7) and found that the decoded variance correlates with the encoded uncertainty whether we used the estimated basis functions or the empirical tuning curves for decoding. Next we performed the analysis in our simulated dataset in 4 different ways: using either the empirical tuning curves (Fig 5c-d) or the estimated basis functions (Fig S8a-b), and either focusing on high spike count theta cycles or including all theta cycles. The fact that all these analyses led to similar results confirms the robustness of this signature.

      Our third measure, the EV-index, measures the variability of the encoded trajectories across theta cycles. The cycle-to-cycle variability (CCV) is also affected by factors independent of whether a randomly sampled trajectory or the posterior mean is encoded. In particular, the encoded trajectory can start at different distances in the past and can be played at different speeds in different theta cycles. These factors are probably present in the data and all inflate the CCV. Another factor is the start and end time of the trajectories, which we may not be able to find accurately in the real data; confusing the end of a previous trajectory with the start of a new one can also inflate the CCV. In our simulations we tested how these potential errors influence our analysis, and found that the EV-index is surprisingly robust to such changes (Fig 6f-g). An additional factor that the EV-index is sensitive to is the specific sampling algorithm used to sample the posterior: an algorithm that produces correlated samples is hard to distinguish from the MAP scheme. Our newly introduced analysis (Fig 7b) demonstrates this and explores the level of correlation between subsequent trajectories, providing evidence that trajectories decoded during exploration reflect the properties of anticorrelated samples, also a signature of efficient inference.

      • The claims about "efficiency" lack a definition of what exactly is meant by that, and empirical support.

      We thank the reviewer for pointing out this inconsistency in our terminology. What we generally meant by efficiency was a claim that pertains to the computational level, according to Marr’s classification, i.e. that computations are probabilistic: the representation in the hippocampus takes uncertainty into account by representing a full posterior distribution. We performed an additional test, which concerns the algorithmic-level efficiency of the computations. We explored the efficiency of the sampling process by assessing a signature of efficient sampling, the expected number of sampled trajectories required to represent the distribution of possible future locations. We found that subsequent samples tended to be anti-correlated, which is a signature of efficient sampling algorithms (Fig 7). In the revised manuscript we thus use the word efficient solely when we refer to the anticorrelated samples.
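
      Why anticorrelation indicates efficiency can be illustrated with a generic example that is not the analysis of Fig 7. For two samples with equal variance σ² and correlation ρ, the variance of their average is σ²(1 + ρ)/2, so negatively correlated samples estimate the mean more precisely than independent ones. The sketch below checks this numerically; the function name and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def pair_mean_variance(rho, sigma=1.0, n_pairs=50_000, seed=0):
    """Empirical variance of the average of two Gaussian samples with correlation rho."""
    rng = np.random.default_rng(seed)
    cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])
    pairs = rng.multivariate_normal(np.zeros(2), cov, size=n_pairs)
    return pairs.mean(axis=1).var()
```

      With σ = 1, the theoretical values are 0.25 for ρ = -0.5, 0.5 for independent samples and 0.75 for ρ = 0.5, so fewer anticorrelated samples are needed to reach a given precision.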

    1. Author Response:

      Reviewer #2:

      The authors investigated changes in the unstressed and stressed oligomeric states of the mammalian endoplasmic reticulum (ER) stress sensor, IRE1a. Previous biochemical and microscopy studies in mammalian cells and studies of the related protein Ire1 in yeast, describe an increase in oligomerization of the stress sensor upon treatment of cells with chemical agents that impair the ER protein folding environment. The general view has been that IRE1 in unstressed cells is a monomer and varying degrees of misfolded protein stress stimulate dimerization, activation, and higher order oligomerization. Distinguishing between monomers and dimers, as well as tetramers or other small oligomers is technically challenging, especially for integral membrane proteins. To address this challenge, the authors turned to single particle tracking fluorescence microscopy of Halo-tagged endogenous IRE1. Using a clever combination of random labeling with two fluorescent dyes and oblique angle illumination to visualize single molecules, as well as dimers, the authors surprisingly find that their endogenous IRE1 reporter appears to be dimeric in homeostatic cells. This observation challenges the predominant model in which IRE1 is monomeric in unstressed cells and that even dimerization represents a switch into an active state. The authors claim to detect evidence for higher order oligomers following treatment with stressors. The authors then use a series of IRE1 mutants to identify how oligomerization is regulated and present a new model to reconcile the different models of IRE1 activation in the literature.

      The authors have extensively characterized their novel experimental system in terms of protein expression levels, functionality, and ability to distinguish monomers and dimers. The data are well presented and the authors are clearly familiar with the arguments that have surrounded the IRE1 oligomer question. That the authors observe the characteristic XBP1 mRNA splicing activity in the absence of visible large IRE1 clusters may suggest that the large clusters reported by others may have distinct roles, perhaps in more permissive mRNA cleavage.

      The present study is undermined by two major weaknesses. First, while the authors persuasively demonstrate that they can detect IRE1a dimers, a major claim of the manuscript rests upon detection of tetramers and possibly higher order oligomers. Unfortunately, the authors provide no independent controls to show what tetramer or higher order oligomer data would look like. Thus, the authors can only infer that higher order oligomers are detected, based on modest shifts in the percent of correlated particle trajectories observed in some cells. More robust evidence is needed to make claims of oligomerization. Tools have been developed by others that can induce reversible oligomerization of proteins. Application of these tools would provide powerful controls for tetramers or even higher order oligomers in this study.

      The second, deeper concern is the discrepancy between the Halo Tag clustering results in this study and studies by this lab and several other labs that report a distinct stress phenotype. In mammalian cells and yeast, IRE1 and Ire1, tagged with different fluorescent proteins or even a small HA peptide epitope tag, undergo quantitative visible formation of puncta or clusters upon treatment with stressors. The small number of bright clusters that form effectively deplete the rest of the ER of IRE1 signal. In the present study, the authors observe no visible change in IRE1-Halo localization in stressed cells. The authors do not investigate the cause of this difference. While one might argue that the presence of stress-inducible IRE1 activity is sufficient to argue that the reporter in this study is functional, IRE1 reporters (that do cluster) described in previous studies by the Walter lab and other groups are also demonstrably functional. Does IRE1 normally cluster? Is it cell-type dependent? Tag-dependent? Notably, the Pincus et al. PLoS Biology paper from the Walter lab used two different fluorescent protein tags that do not heterozygously dimerize. Robust colocalization and FRET signals were detected upon treatment of cells with stressors and clustering was subsequently observed. A 2007 Journal of Cell Biology study from Kimata et al. reported clustering in yeast with an Ire1 tagged with an HA epitope peptide. The HA peptide seems unlikely to be prone to any oligomerization propensities that GFP tagged reporters might experience. Importantly, a 2020 PNAS paper from the Walter lab (Belyy et al.) studied clustering of a robustly monomeric mNeonGreen-tagged IRE1 in U2-OS cells and mouse embryonic fibroblasts and this construct readily clustered following stress induction.

      When evaluated against the backdrop of the extensive literature describing the visual behavior of IRE1a in live cells, the absence of stress-induced clustering is both puzzling and disconcerting. Given that the focus of this study is to use visual techniques to study IRE1a interactions, the burden of proof is on the authors to resolve this significant discrepancy with the rest of the IRE1a literature. One can easily imagine that incorporation of the majority of the pool of IRE1a into 10-100 clusters could produce very different correlated trajectory behavior. Until the authors can determine why their reporters behave differently from other IRE1a reporters and establish which version accurately reflects physiologic IRE1a behavior, the potential impact of the findings of this manuscript is of unknown value.

      We thank the reviewer for this detailed assessment of our work. We agree that the question of apparent discrepancy in the formation of observable IRE1 clusters between this manuscript and earlier work is important. We have now addressed this issue both in the revised version of the manuscript and in specific point-by-point responses to reviewers’ comments. As a brief summary, we addressed the reviewer’s first concern (lack of controls larger than dimers) by cloning and validating a tetrameric HaloTag construct, the measurements from which were entirely consistent with the model we presented in the original version of the manuscript. To address the reviewer’s second concern, we present several lines of evidence showing that the discrepancy between the formation of microscopically visible IRE1 clusters in earlier studies and the absence of such clusters in the present work almost certainly results from differences in expression levels. First, our IRE1-HaloTag construct is perfectly capable of forming stress-induced clusters, as we show in the new Figure 1 – Figure Supplement 3. Second, we point to a parallel study by Gómez-Puerta et al., who demonstrate that a more “conventional” IRE1-GFP construct does not form visible stress-dependent puncta when it is expressed at a low level comparable to that of untagged IRE1 in HeLa cells, despite being fully active. Third, our earlier work in the 2020 PNAS paper referenced by the reviewer actually showed that even in the overexpression context, IRE1-mNeonGreen only forms visible puncta in just over half of all cells, despite the fact that XBP1 processing is nearly 100% effective in bulk assays. Furthermore, in the same paper we show that, rather than all IRE1 molecules being sequestered in clusters, only a small fraction (~5%) of IRE1-mNeonGreen assembles into large puncta while the remaining 95% of IRE1 stays uniformly distributed throughout the ER.

      Taken together, we believe that IRE1 does have the propensity to assemble into larger clusters when its expression levels are high (regardless of the tag used), but that these clusters are not strictly required for its activation. We have made significant changes to the discussion section of the manuscript to clarify the above points and directly address the apparent discrepancy between the present work and earlier studies.

      Reviewer #3:

      In this paper, the authors' aim was to test how IRE1's oligomerization state relates to its activation status without relying on ectopic overexpression. The principle underlying the work is a rather simple one, which is that, if the population of IRE1 can be labeled stochastically with either of two different fluorescent probes, then if the protein dimerizes, presuming single molecules can be visualized, correlated migration of a spot of each fluorophore should be observed for some of those dimers. Any correlated migration, maintained for long enough, will by necessity be some sort of dimer or multimer. In principle, if my math is right, the correlation should be 50% of spots of each color, assuming all the molecules are in a dimer, all molecules are labeled with one fluorophore or the other, and the koff of the fluorophores is very low. In practice, the correlation appears closer to 10%, which the authors establish using a control molecule that should not dimerize except by chance, and another for which pseudo-dimerization is enforced due to the two HALO domains used to bind the fluorophores being conjugated to the same molecule in cis. Much of the paper is devoted to establishing the fundamentals of the system. For these experiments, the authors replaced endogenous IRE1 with the HALO-tagged version to generate near-normal expression and show that the IRE1-HALO behaves similarly to endogenous. They also show that correlated migration is observed in the dimer control to a much greater extent than in the monomer.
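
      The 50% figure can be reproduced with a small idealised calculation. The sketch below is not the authors' model: it assumes that every protomer in an oligomer independently carries the other color with a fixed probability, and it ignores unlabeled molecules, imperfect detection, photobleaching and chance colocalization. The function name and its arguments are hypothetical.

```python
def expected_colocalized_fraction(n_protomers: int, p_other_color: float) -> float:
    """Fraction of spots of one color that co-migrate with at least one
    protomer of the other color, for oligomers of size n_protomers."""
    if n_protomers < 1:
        raise ValueError("n_protomers must be at least 1")
    if not 0.0 <= p_other_color <= 1.0:
        raise ValueError("p_other_color must lie in [0, 1]")
    # The spot's own protomer is fixed; each of the other n - 1 protomers
    # independently carries the other color with probability p_other_color.
    return 1.0 - (1.0 - p_other_color) ** (n_protomers - 1)
```

      With equal labeling (`p_other_color = 0.5`) this gives 0 for monomers, 0.5 for dimers and 0.875 for tetramers, which is the ideal upper bound against which the roughly 10% observed in practice can be compared.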

      Using these findings, they demonstrate, in my mind quite conclusively, that IRE1 exists as a dimer even in the unstimulated state. During ER stress, the authors observe a state that is more highly ordered. Mathematical modeling suggests a transition from predominantly dimers to a mix of dimers and something more highly ordered, with tetramers being the simplest explanation. Satisfyingly, a mutation that breaks the known dimer interface causes the protein to exist solely in monomers, as does deletion of the IRE1 lumenal domain, while disrupting the oligomerization interface keeps the protein as dimers. Mutation or deletion of the kinase and RNase domains does not affect higher order status, suggesting that activation of these domains is not a prerequisite for assembly. It is clear from this that the central claims of the paper, which are that IRE1 exists in a dimer in the basal state and transitions to a higher ordered structure in the activated state, are supported. Moreover, the general approach is likely to be appealing to the study of other molecules activated by multimerization.

      We thank the reviewer for this thoughtful and helpful analysis of our work.

      The principal advance of the paper is the technological approach for tracking IRE1 (and, presumably, other molecules whose activity is regulated by dimerization). The approach is quite elegant for that purpose. Its impact in terms of conclusions about IRE1 is perhaps less clear. The authors rationalize their endogenous-replacement approach by describing how their previous efforts and those of others relied on ectopic overexpression of GFP-tagged IRE1. The authors take great pains to claim that the observed multimerization status of the IRE1-HALO constructs is not a function of expression level, which would imply then that expression level alone is not responsible for the previously observed IRE1 oligomeric puncta. It is not clear why exactly the authors' results differ from this group's previous studies on the topic nor where the truth lies, including whether something inherent to the GFP-tagged overexpression approach favors non-physiologic structures, whether the difference is fundamentally one of cell type, or whether multimerization and activation are correlated but not causally related, with multimer-breaking mutations killing IRE1 by some other mechanism.

      The question of reconciling our present data with earlier work (including work from our group) is clearly and understandably a central question for all three reviewers. As we detailed above in our responses to reviewers 1 and 2, we are convinced that the formation of large IRE1 clusters is largely dependent on expression level rather than the differences between fluorescent protein tags and the HaloTag. We added new supplementary figures and substantially revised the text of the manuscript to address this question directly.

      Interpreting the data is also complicated by the fact that, while the authors point out that the percent of correlated trajectories (i.e., the measurement of multimerization state) does not itself correlate with expression level (using trajectories-per-movie as a proxy), the proper conclusion from that lack of correlation is not that variance in expression level does not account for the changes in apparent multimerization status, but instead that it cannot be the only factor. In some sense, the authors are attempting to play the argument both ways, by arguing that expression level matters for IRE1 activation (from previous studies) and that it doesn't (from this study). I think to address this the authors will need to better account, one way or another, for why the findings presented here differ from their previous findings and why these are the more salient (if in fact they are).

      This is a very important point, and we thank the reviewer for raising it. We are not arguing that expression levels do not matter for the formation of oligomers; quite the contrary, as detailed above and in the revised version of the text, we believe that the formation of massive IRE1 oligomers observed in previous studies and in the new Figure 1 – Figure Supplement 3 is mainly a function of elevated concentration. What we do claim is that our approach can reliably pick out oligomeric differences within the relatively narrow range of concentrations used for single-particle tracking experiments in this paper. We are using the very weak truncated CMVd3 promoter in all transient transfection experiments, and we are only analyzing data from cells that have a comparable density of single-molecule spots to the density we observe in endogenously tagged IRE1-HaloTag cells. In fact, the metric of “trajectories per movie” used as a proxy for expression levels in Figure 5 – Figure Supplement 1 is an overestimation of the true variability of expression levels, since each movie only covers a small fraction of each cell’s area and the number of observed molecules varies depending on cell morphology. Practically speaking, all cells that we image have expression levels that are clustered together rather narrowly, roughly within differences of no more than a factor of 3. These levels, in turn, are significantly lower than the expression levels used in earlier papers by our group and others.

      The other somewhat substantial issue is that there is no control for what higher order structures look like. The authors give no sense for the dynamic range of the multimerization assay. I would presume that tetramers would show a higher percentage of correlated trajectories than dimers, and octamers higher still, and that the mathematical model accounts for this theoretical possibility in calculating an average protomer number of 2.7 in the stress condition, but it would be better to see that in practice; at first glance it would seem that engineering a tetrameric and/or higher order control and validating it would be straightforward.

      This is another great point raised by all reviewers. In the revised version of the manuscript, we engineered a new tetrameric control construct (see Figure 2 – Figure Supplement 1), the results from which agree remarkably well with the mathematical model we developed in the original version of the manuscript (see Figure 2 – Figure Supplement 3).
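
      The reviewer's intuition that tetramers should give more correlated trajectories than dimers, and octamers more still, can be illustrated with a toy calculation. This is a minimal sketch and not the mathematical model from the manuscript: it assumes a two-color labeling scheme in which every protomer independently carries the second dye with probability `p_second_dye`, and it asks how often an oligomer tracked in the first channel also contains a protomer visible in the second channel. The function name and the labeling probability are hypothetical.

```python
def correlated_fraction(n_protomers: int, p_second_dye: float) -> float:
    """Chance that an oligomer seen in channel A also carries a dye-B protomer.

    Toy assumption: the tracked protomer carries dye A, and each of the other
    (n_protomers - 1) protomers independently carries dye B with probability
    p_second_dye.
    """
    if n_protomers < 1:
        raise ValueError("an oligomer needs at least one protomer")
    if not 0.0 <= p_second_dye <= 1.0:
        raise ValueError("p_second_dye must be a probability")
    return 1.0 - (1.0 - p_second_dye) ** (n_protomers - 1)


if __name__ == "__main__":
    # Hypothetical labeling probability, chosen only for illustration.
    for size in (1, 2, 4, 8):
        print(size, round(correlated_fraction(size, 0.3), 3))
```

      Under these assumptions the expected correlated fraction rises monotonically with oligomer size, which is the behavior an engineered tetrameric control would be expected to confirm in practice.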

      Lastly, the data analysis lacks statistical justification for its conclusions. I presume given the high number of readings that the observed changes are all statistically significant, but that should be indicated, as in most cases the 95% confidence intervals shown are overlapping.

      This is another excellent point. The reviewer is correct that all relevant conclusions are statistically supported by the data, and our analysis code immediately calculates pairwise p-values for every plot using one of several relevant tests. Our preferred test is the permutation test, since it makes no assumptions about the underlying distributions being compared. To avoid cluttering the main plots, we have included tables of pairwise p-values for each plot in the revised version of the manuscript.
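
      For readers unfamiliar with the permutation test mentioned above, the idea can be sketched as follows. This is an illustrative implementation, not the analysis code used for the manuscript; the function name, the choice of the difference in means as the test statistic, and the number of permutations are all assumptions.

```python
import random
from statistics import fmean


def permutation_p_value(a, b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose absolute mean difference is at least as large as the
    observed one (with the usual +1 correction so it is never exactly 0).
    """
    rng = random.Random(seed)
    a, b = list(a), list(b)
    observed = abs(fmean(a) - fmean(b))
    pooled = a + b
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:len(a)]) - fmean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up measurements for two conditions, for illustration only.
    control = [0.10, 0.12, 0.09, 0.11, 0.13, 0.10]
    stressed = [0.18, 0.21, 0.17, 0.20, 0.19, 0.22]
    print(permutation_p_value(control, stressed))
```

      Because the null distribution is built from the data themselves, no normality or equal-variance assumption is needed, which is the property the response appeals to.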

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Responses to reviewers’ comments are in blue text, original reviewers’ comments in black text.

      Response to Reviewer 1.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this manuscript Neiro et al. aim to expand our knowledge on the regulation of gene expression in stem cells of the planarian model organism. As a first step the authors used published available data to expand the repertoire of the planaria transcriptome. By combining 183 RNAseq datasets the authors were able to identify thousands of new coding and non-coding transcripts. They then screened for TF motifs in the new annotations, identifying 551 putative TFs, of which 248 were already described in the planarian literature. The most substantial contribution of this work to the field of stem cells and planaria biology is the characterization of new putative enhancers that were identified by performing H3K27ac ChIP-seq and ATAC-seq and combining these data with previously published H3K4me1 ChIPseq dataset.

      We thank the reviewer for their careful assessment of our work. We agree that the identification of likely enhancers genome-wide is a substantial contribution. Equally, the improved annotation of all genes, including the transcription factors we chose to focus on here, is a substantial step forward for the planarian research community.

      By overlapping H3K27ac and H3K4me1, the authors find 5,529 new enhancers, for which they report a higher chromatin accessibility than random points in the genome as assessed by ATAC-seq. By using ATAC-footprints Neiro et al. refined the subset of TFs that have binding motifs in the predicted enhancer-like regions and present a list of 22,489 such factors. The manuscript is well written and organized and overall, the reported data will provide an important resource to study gene expression regulation in planaria's stem cells. However, this manuscript would greatly benefit from some functional validation to support the predicted gene regulatory networks. One option would be to use a CRISPR-dCas9-KRAB system to silence the putative enhancers identified in the manuscript and check by qPCR the expression of nearby genes.

      Currently, mis-expression technologies that would allow enhancer elements to be tested directly for their ability to drive expression are still not available in planarians. This also precludes us from using the suggested silencing system, which is used in mammals and other animals with robust mis-expression tools.

      If this type of experiment is not feasible in planaria (I am not an expert in this model organism) another simple but key experiment would be to perform a knockdown of one (or more) putative enhancer-bound TFs identified in this study followed by RNA-seq. This would allow the authors to verify what are the target genes of the putative enhancer-bound TFs and if they correspond to the predicted gene networks they identified. Simultaneously, this experiment would allow the authors to verify if there are any changes in the expression of differentiation/pluripotency markers as a result of the knockdown of the putative enhancer-bound TF.

      These experiments are possible, but they would be the work of many labs in the future, each expert in studying those TFs and their roles in planarian stem cells and regeneration. However, what we can do is analyze existing RNA-seq data further. There are a number of studies in which TFs have been studied and RNA-seq performed after RNAi. Although these studies were performed in specific experimental regenerative contexts, and not specifically in stem cells, it will be possible to look at expression changes of genes with predicted enhancers bound by these TFs. We propose to execute this analysis and add it to the manuscript, rather than perform further TF RNAi experiments. This analysis is feasible within a 3-month revision time. We would add that currently there are no genes implicated in controlling pluripotency in the same way we might consider, for example, OSKM in mammals. Our identification of the TFs enriched in stem cell expression and implicated in binding predicted enhancers suggests future candidates.

      Minor revision: • The authors have mostly focused on the identification of enhancer-bound TFs. However, it would be interesting to look at differential enrichment of TFs in promoters versus enhancers and identify if there are specific factors that are enriched specifically at the planarian newly identified enhancer regions.

      We have not looked at potential TF binding sites near promoters/transcriptional start sites. We will try to add an analysis that considers this in our revision.

      • All tornado plots are missing a colorbar (Fig3 and FigS2)

      We will fix this error.

      • There is a typo in the discussion: "the combined use of chip-seq data, RNAi of a histone methyltransferase combines with chip-seq" should be changed to "combined".

      We will fix this and other typographical errors.

      Reviewer #1 (Significance (Required)):

      The manuscript is well written and organized and overall the reported data will provide an important resource to study gene expression regulation in planaria's stem cells.

      We thank the reviewer for their appreciation of our work.

      **Referees cross-commenting**

      I agree with the other reviewers that additional functional data should be added to support the authors' claims (such as knockdown of potential TFs that are identified by computational analyses and assessing the impact on gene expression).

      See response above, with regard to adding further analysis for testing this possibility.

      In addition, as noticed by the third reviewer, all data should be made publicly available to the scientific community.

      We have made all data publicly available and will submit all relevant data to public database repositories in advance of final publication after final peer review.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This manuscript aims at identifying enhancers in the planarian Schmidtea mediterranea. The authors start with the integration of transcriptome with genome sequencing data to more precisely annotate the genome of the planarian Schmidtea mediterranea. The second part of the manuscript actually then deals with the identification of potentially active enhancer elements in adult stem cells of this regenerating organism using genomic techniques like ATAC-seq and ChIP-seq of histone marks combined with motif searches and in silico footprint analysis. Using these data, the authors predict regulatory interactions potentially critical for pluripotency and regeneration in planarian adult stem cells.

      MAJOR COMMENTS:

      • Are the key conclusions convincing? 1) The authors claim (already in the abstract) that their study identifies enhancers regulating adult stem cells and regenerative mechanisms. This is an over-statement found throughout the manuscript, as none of these enhancers are functionally tested nor is it shown that target gene expression changes when transcription factors predicted to interact with such enhancers are knocked down.

      We agree, and it was not our intention to overstate our results; this is why we have tried to refer to putative enhancers, enhancer-like elements, etc. in the manuscript from the title onwards. Only once we have demonstrated a set of elements with key conserved and widely supported characteristics do we suggest that we have a set of higher-confidence enhancers to study. However, we will adjust the manuscript to reflect that our claims await direct testing, as is the case for all enhancers implicated with the approaches used here.

      Another example is at the end of paragraph 1 of section 2.4. Here the authors claim that identifying many fate-specific transcription factor genes in the vicinity of potential enhancers is a further proof that the identified regions represent "real enhancers". This strongly supports the hypothesis, but it is not evidence of real enhancer activity.

      We agree the total body of evidence strongly supports that we have identified enhancer elements, but as above will adjust the language to suggest further directed functional work will follow from many groups.

      Thus, although the authors state that the regulatory interactions and networks they predict from their data can be studied now in future, they should be more careful with their wording and correct these over-statements. Therefore, the key conclusion is that they identified by various techniques potential enhancers, which are close to genes controlling adult stem cells and potentially controlling these genes, which has to be shown by further analyses.

      We agree.

      Thus, also the title needs to be changed.

      We propose changing ‘enhancer-like’ to “predicted enhancers” in the title, and "defines" to "predicts" as well as broadly adjusting the text to caveat that further work will clarify their functions and roles.

      The authors have no proof that the networks are active in planarian adult stem cells, as they do not show that the predicted networks are active in the presented way.

      We agree; see comments above. It was not our intention to claim that we are showing pathways that were definitely active, but rather pathways predicted by our experiments and by our analyses of the data from these experiments.

      2) Similarly, the identification of TF motifs within these potential motifs strongly suggests but does not show that these factors are binding, even when these sites were found to be bound by a protein using the ATAC-seq footprinting analysis. Thus, the authors need to be careful with their wording. One example is in the second paragraph of section 2.5, where the authors write that "We found that numerous FSTFs were binding to putative intronic enhancers ... ". The motif suggests that these factors bind; however, they have no experimental confirmation that these sequences are indeed bound by the planarian TFs.

      We agree. We will clarify that ATAC footprinting is the only data suggestive of these motifs being bound and that further experiments will be required for more evidence. We will state this in the results section and add it explicitly to the discussion.

      In sum, this manuscript uses existing genomic tools to define potential enhancer regions in the planarian Schmidtea mediterranea. The manuscript is informative yet descriptive, as it presents no functional evidence for any of the predictions. If further toned down, the key conclusions are valid.

      Future functional experiments to test the roles of all TFs and enhancers are now possible due to our work. The combination of data and analyses provides strong support for the activity of enhancer elements in stem cells across the genome.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? The experiments performed are well designed and in line with what is known in the field about enhancer architecture. However, as this model system is not very well characterized on that level and the authors do not provide real experimental evidence that any of the identified regions has really enhancer activity and that any of the identified motifs binds indeed the predicted TF, the authors need to be very careful with their statements. The authors should maybe emphasize even stronger that all the GRNs predicted under section 2.6 are really preliminary and need to be validated.

      Yes, we are happy to be even clearer about this, as the reviewer suggests.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. One experiment that could provide more evidence for their predicted regulatory interactions is to knock down one of the FSTFs for which motifs have been identified in potential enhancer regions and to study expression of associated genes (to confirm that the enhancers potentially bound by these TFs control the expression of associated genes) or to analyze the chromatin status of selected chromatin regions (by Q-PCR). These experiments would strongly support the claims of the authors. However, it also depends strongly on the journal whether I would consider these experiments essential or "nice to have".

      This suggestion of possible extra experiments is very similar to that of Reviewer 1. We are copying our earlier comment as this also addresses this point.

      “These experiments are possible, but they would be the work of many labs in the future, each expert in studying those TFs and their roles in planarian stem cells and regeneration. However, what we can do is analyze existing RNA-seq data further. There are a number of studies in which TFs have been studied and RNA-seq performed after RNAi. Although these studies were performed in specific experimental regenerative contexts, and not specifically in stem cells, it will be possible to look at expression changes of genes with predicted enhancers bound by these TFs. We propose to execute this analysis and add it to the manuscript, rather than perform further TF RNAi experiments. This analysis is feasible within a 3-month revision time. We would add that currently there are no genes implicated in controlling pluripotency in the same way we might consider OSKM in mammals. Our identification of the TFs enriched in stem cell expression and implicated in binding predicted enhancers suggests future candidates.”

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. This reviewer is not an expert in Schmidtea mediterranea, thus it is hard to judge how time consuming these experiments would be. Cost-wise they should be feasible, as it would include primarily Q-PCR experiments. And some functional back-up of their claims would be very helpful.

      See previous comment regarding additional analysis.

      • Are the data and the methods presented in such a way that they can be reproduced? For the parts I can judge, yes.

      • Are the experiments adequately replicated and statistical analysis adequate? It is not clear from the manuscript how many replicates of the ChIP-seq experiments were done.

      A description of the ChIP-seq replicate data will be explicitly added to the methods.

      MINOR COMMENTS:

      • Specific experimental issues that are easily addressable.

      • Are prior studies referenced appropriately? For the literature I can judge, yes.

      • Are the text and figures clear and accurate? The figures are clear, the text (besides over-statements) is clear. However, the writing can be improved. A few examples: section 2.2 paragraph 1: "... we found 248 to be described in the planarian literature in some way." In which way described?; same paragraph: "... but significantly we could identify new homologs of ..." what does significantly mean? Which test etc? section 2.2, last paragraph: "Most TFs assigned to the X1 and Xins compartments and the least to the X2 compartment", "Very few TFs had expression in X1s and Xins to the exclusion of X2 expression as would be expected by overall lineage relationships"; what do these sentences mean?

      We thank the reviewer for paying careful attention to the language in our manuscript throughout. We will provide clearer explanations of the sentences indicated. We will better explain terms specific to the planarian model system that are obviously not intuitive.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? No over-statements.

      See previous comments agreeing with the need to carefully adjust our language to avoid this.

      Reviewer #2 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. This manuscript identifies genome-wide potential enhancers in adult planarian stem cells, and thus represents a very valuable resource for the community to study these enhancers and the gene regulatory networks they control in the future.

      • Place the work in the context of the existing literature (provide references, where appropriate). As I am not a planarian scientist, it is hard to judge this part.

      • State what audience might be interested in and influenced by the reported findings. In my opinion, this work will be primarily interesting for people working with planarian. When functional data exist, this might be also interesting for researchers working generally on regeneration.

      Given the nature of our data, we also think all groups working on animal stem cells would be interested in our data and analyses.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. My field of expertise is transcriptional regulation using genomic techniques, however I am not familiar with the model Schmidtea mediterranea.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Neiro et al. capitalize on existing genomic data for the planarian Schmidtea mediterranea and new ChIP-seq and ATAC-seq data to use computational approaches to identify putative enhancers in the planarian genome. They integrate analysis of enhancers with transcription factor binding sites to generate testable hypotheses for the regulatory function of transcription factors active in stem cells or control of cell lineage trajectories. Their work creates an excellent resource for future work to resolve the regulatory logic underpinning stem cell biology and tissue regeneration in planarians.

      We are glad the reviewer likes our research.

      Major: Overall, the work in this manuscript and methodology are well executed and presented. However, the authors should consider the following comments to improve the clarity and accessibility of the data and interpretations.

      1) The new transcriptome does not appear to be publicly accessible. The links to Github resources are broken, and there is nothing on Neiro's Github page. Will the new transcriptome be integrated with Planmine?

      The new annotation has been available for over a year as we wished the community to have access to it ASAP (see Garcia Castro, 2021, Genome Biology https://doi.org/10.1186/s13059-021-02302-5). We tested the links in the paper before depositing our preprint and after review and they seemed to work for us both within and outside our institutional network. We can only apologize if they were broken or have not worked for the reviewer. We are unclear if this new annotation will be included in Planmine, but we will ask the colleagues maintaining this database to consider including it.

      2) Figure 1: Ternary plot in 1F. The legend is not clear or could be explained better. What is the metric? It could be my misunderstanding, but I didn't find the ternary plots insightful or necessary. Perhaps the authors can expand on what they are showing.

      These plots are important in demonstrating the distribution of mRNA expression of all genes across cell-sorted compartments. Given the broad lineage relationship between sorted cell compartments, this analysis allows us to identify genes expressed predominantly in one cell compartment or another, or across a specific transition. For example, genes enriched in X2 cells and Xins, but not X1, are likely to be enriched in post-mitotic differentiating progeny and differentiated cells. In contrast to single-cell data, where expression data can be sparse, this analysis with bulk data allows identification and assignment of lowly expressed genes, like transcription factors. We will provide further explanation of this in the revised text.

      1I is a map of exons, not alternative splicing. So, it isn't clear what the authors intend to show. Are there specific exons that are more likely to be spliced? Is the figure necessary?

      We wish to demonstrate the power of our annotation approach and the richness of the annotation for looking at alternative splicing. We propose to add a more informative figure that indicates the variety of splice forms. We apologize for this oversight.

      3) Figure 2: 2A labels Xins as irradiation responsive. Is this the case (just making sure)?

      The reviewer is correct, this is wrong! The label should read “irresponsive” or “irradiation resistant” in Figure 1A. We thank the reviewer for spotting this error. We will fix this.

      2F-G: Ternary plot in F seems redundant with G, but that could be my lack of understanding. In 2G, what is represented on the plots on the right of the hierarchical clusters?

      The ternary plot (2F) and heatmap of hierarchical clustering (2G) are complementary ways to visualize the proportional expression values of transcription factors. The ternary plot (2F) allows an overview of all the proportional expression values, while the heatmap (2G) shows how the proportional values may be grouped into clusters of similar expression profiles and displays the relative size of these clusters. For example, the heatmap shows that the clusters of X1 and Xins are more prominent than X2, suggesting that there are relatively few X2-specific transcription factors. We will add text to better explain this difference.
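
      To make the notion of "proportional expression values" concrete, the quantity plotted in a ternary diagram can be sketched as below. This is an illustration only: the exact normalization used in the manuscript is not stated in this response, so the three-way proportions, the function names, and the 0.5 threshold for calling a gene compartment-specific are assumptions.

```python
COMPARTMENTS = ("X1", "X2", "Xins")


def proportional_expression(x1, x2, xins):
    """Scale expression in the three sorted compartments so the shares sum to 1.

    The resulting triple is the coordinate of one gene in a ternary plot.
    """
    total = x1 + x2 + xins
    if total <= 0:
        raise ValueError("gene has no expression in any compartment")
    return (x1 / total, x2 / total, xins / total)


def dominant_compartment(x1, x2, xins, threshold=0.5):
    """Name the compartment holding at least `threshold` of the expression.

    Returns "mixed" when no single compartment reaches the threshold.
    """
    shares = dict(zip(COMPARTMENTS, proportional_expression(x1, x2, xins)))
    name, share = max(shares.items(), key=lambda item: item[1])
    return name if share >= threshold else "mixed"


if __name__ == "__main__":
    # Hypothetical expression values for two genes.
    print(proportional_expression(8.0, 1.0, 1.0), dominant_compartment(8.0, 1.0, 1.0))
    print(proportional_expression(3.0, 3.0, 3.0), dominant_compartment(3.0, 3.0, 3.0))
```

      A heatmap with hierarchical clustering would then group genes whose triples are similar, whereas the ternary plot shows every triple at once.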

      4) Figure 3: The heat maps need a legend (i.e., please define the colors). In addition, labeling the figures could help the reader. For example, in G-J, a header about the different experiments above each map, such as "enhancers" and "random," etc., would make the figure more accessible.

      We agree. We will label the figures so that they are more easily interpretable and provide an independent scale and legend for the heatmaps.

      5) Figure 5: Although it is in the figure legend, the authors could label the 6th track as "RNA-seq in X1."

      We will add this to the figure.

      6) Section 2.6 second page last sentence of the first paragraph "GRN of asexual reproduction is not active in neoblasts" data in the supplement? Is it not shown?

      We apologize for this poorly written sentence. In line with Reviewer 2's comments, this statement needs to be toned down and clarified. The raw information is included in the general table of enhancers (Supplementary Table 2), but the genomic tracks visually highlighting the motifs at the promoters of lox5b and post2b were not included. We will add these to the Supplementary information and clarify Supplementary Table 2.

      7) Discussion: The discussion about pluripotency factors in planarians could be expanded. The authors could contrast the study's findings with Önal et al. 2012.

      We agree. We will expand our discussion to compare with previous studies and also summarize what is available from other animals with pluripotent adult stem cells.

      Minor: The manuscript has no page numbers or line numbers, so I'll provide a general location of the potential issues.

      1) Section 2 - newly identified isoforms are shorter (1656 vs. 1618). Is the order of the median length reversed?

      Yes, we will correct this.

      2) No mention of Figure S1B in the text.

      It is mentioned in the paragraph regarding splicing, but perhaps not in a useful context. We will add a correct reference to this figure in the presentation of transcript diversity.

      3) Figure 1H should be 1I in the text?

      Yes, we will correct this.

      4) The discussion contains some minor typos and grammatical errors.

      We will address with careful rereading.

      We thank the reviewer for spotting these errors and we will fix them in revision.

      Reviewer #3 (Significance (Required)):

      Neiro et al. provide an excellent resource for the planarian community. The paper is generally very well written and easy to read. The new transcriptome described, which improves the annotation of the planarian genome, should be made readily available. It would be excellent if the transcriptome could be incorporated in Planmine.

      We will ask Planmine and the Rink lab to consider this. The annotation (without broad analysis) has been available since the pre-print for Garcia Castro, 2021, Genome Biology was deposited in BioRxiv.

      Furthermore, the authors provide a comprehensive list of transcription factors in the planarian Schmidtea mediterranea. Their work provides insight into which factors are highly expressed in the stem cell compartment. Their computational identification of transcription factors and putative enhancers will be helpful to the growing community of researchers studying stem cell and regenerative biology using planarians. In addition, the large dataset generated in this study could inform studies in the evolution of regulatory sequences and transcription factor function.

      **Referees cross-commenting**

      The data presented are well supported by previous studies. As noted by the authors, it is not possible to make transgenic planarians, and thus the field needs to rely on indirect methods. The authors focus on using the stem cell population, which can be isolated from the animals. Overall, I don't think additional experiments are necessary. Additional RNAi experiments combined with RNA-seq (using the stem cells) could take 6-12 months to complete. I believe this is a solid contribution that should be framed as a resource paper. The authors should pay close attention to Reviewer #2's suggestions and edit the paper accordingly.

      I have 20 years of experience in the field. It would be unreasonable to ask the authors to do more experiments, especially in this post-pandemic environment. I hope this helps.

      We thank the reviewer for the comments.

    1. Reviewer #3 (Public Review): 

      In this paper, Troendle et al investigate changes in alpha oscillation across childhood and adolescence. The main goal of this investigation is to examine how alpha oscillations change across these age ranges, by investigating a large open dataset and adopting new methods that should help to address methodological limitations of many previous analyses. In particular, a key goal is to examine changes in periodic alpha power, and control for potential confounds due to changes in peak frequency and/or aperiodic activity. To do so, they employ a novel spectral parametrization method, and systematically compare measures of isolated periodic alpha activity to conventional measures. Overall, they find that they can replicate the age-related decrease of total alpha power when using conventional methods. However, when explicitly measuring and controlling for aperiodic activity, they find that periodic alpha activity actually increases with age. They suggest this discrepancy can be explained by changes in aperiodic activity, as the aperiodic slope and intercept are found to systematically change across age, in a way that likely drives the observed decrease of total alpha power, while the periodic alpha power actually increases. There are also some follow-up analyses, including relating alpha power to anatomical measures of the thalamus, and to performance on an attention task.
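
      The discrepancy summarized above can be illustrated with a toy spectrum. The sketch below is not the authors' simulation or the spectral parametrization method itself; it simply assumes that log power is the sum of an aperiodic component (`offset - exponent * log10(frequency)`) and a Gaussian alpha peak, and all parameter values are made up for illustration.

```python
import math


def log10_power(freq, offset, exponent, peak_height, peak_center=10.0, peak_width=1.5):
    """Log10 power at `freq`: aperiodic 1/f-like part plus a Gaussian alpha peak."""
    aperiodic = offset - exponent * math.log10(freq)
    periodic = peak_height * math.exp(-((freq - peak_center) ** 2) / (2 * peak_width ** 2))
    return aperiodic + periodic


def total_alpha_power(params, band=(8.0, 13.0), step=0.5):
    """Conventional measure: mean linear power across the alpha band."""
    n_steps = int(round((band[1] - band[0]) / step))
    freqs = [band[0] + i * step for i in range(n_steps + 1)]
    powers = [10 ** log10_power(f, **params) for f in freqs]
    return sum(powers) / len(powers)


# Hypothetical parameters: the older spectrum has a lower offset and a flatter
# exponent, but a LARGER periodic alpha peak above the aperiodic component.
younger = {"offset": 1.5, "exponent": 1.8, "peak_height": 0.6}
older = {"offset": 0.5, "exponent": 1.4, "peak_height": 0.8}

if __name__ == "__main__":
    print("total alpha power, younger:", total_alpha_power(younger))
    print("total alpha power, older:  ", total_alpha_power(older))
    print("periodic peak height, younger vs older:",
          younger["peak_height"], older["peak_height"])
```

      With these made-up values the conventional band-power measure falls in the older spectrum even though the periodic peak is larger, which is the kind of confound the paper sets out to control for.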

      Strengths of this investigation include that it analyzes multiple, large datasets with well motivated methods. I think the goal of this paper addresses an important question, in terms of seeking to clarify some basic patterns of oscillation changes across development, and doing so in a rigorous way, both in terms of employing methods that are robust to estimating different features of the data, and in terms of using multiple, large datasets, including an internal replication of the main findings. I find the main goal and analysis compelling in terms of examining how alpha activity changes across this age range. 

      I also find some limitations to some aspects of this paper and analysis that could be improved, as they do not always clearly describe the context or support the claims that are made for some of the follow-up analyses, as described in the following. 

      1. Framing and prior literature 

      I find some limitations in the organization of this paper and its relationship to prior work that could be improved, as I find that the paper could do better at situating the analyses here with prior work, in particular in relation to the methodological issues it is addressing, and prior work on aperiodic activity.

      For example, in the abstract it is stated that "simulations in this study show that conventional measures of alpha power are confounded". Despite this statement, simulations are not a core feature of this study. There are a couple of simulated examples in the supplement, which are referred to in lines 89-95. However, it's worth noting that while this section does not include any citations, the described issues, and related simulations, are very similar to points that have been made previously in the literature, which seem like they should be cited here:
      - Donoghue, T., Dominguez, J., & Voytek, B. (2020). Electrophysiological Frequency Band Ratio Measures Conflate Periodic and Aperiodic Neural Activity. ENeuro, 7(6), ENEURO.0192-20.2020. https://doi.org/10.1523/ENEURO.0192-20.2020
      - Donoghue, T., Schaworonkow, N., & Voytek, B. (2021). Methodological considerations for studying neural oscillations. European Journal of Neuroscience, ejn.15361. https://doi.org/10.1111/ejn.15361

      The paper also understates previous work on aperiodic activity, and the degree to which it is known to vary with age, in line 116-117 stating "there is insufficient evidence for the reported significant association between age and aperiodic signal components". This seems to ignore the large number of studies that have replicated this finding, including (some non-exhaustive examples):
      - Thuwal, K., Banerjee, A., & Roy, D. (2021). Aperiodic and Periodic Components of Ongoing Oscillatory Brain Dynamics Link Distinct Functional Aspects of Cognition across Adult Lifespan. Eneuro, 8(5), ENEURO.0224-21.2021. https://doi.org/10.1523/ENEURO.0224-21.2021
      - Voytek, B., Kramer, M. A., Case, J., Lepage, K. Q., Tempesta, Z. R., Knight, R. T., & Gazzaley, A. (2015). Age-Related Changes in 1/f Neural Electrophysiological Noise. Journal of Neuroscience, 35(38), 13257-13265. https://doi.org/10.1523/JNEUROSCI.2332-14.2015

      Perhaps this claim is supposed to more specifically reflect the age-range analyzed here, in which case recent studies examining this (in relatively large datasets) are also not mentioned here, including, for example:
      - Donoghue, T., Dominguez, J., & Voytek, B. (2020). Electrophysiological Frequency Band Ratio Measures Conflate Periodic and Aperiodic Neural Activity. ENeuro, 7(6), ENEURO.0192-20.2020. https://doi.org/10.1523/ENEURO.0192-20.2020
      - Hill, A. T., Clark, G. M., Bigelow, F. J., Lum, J. A. G., & Enticott, P. G. (2022). Periodic and aperiodic neural activity displays age-dependent changes across early-to-middle childhood. Developmental Cognitive Neuroscience, 54, 101076. https://doi.org/10.1016/j.dcn.2022.101076

      The notes above do not undermine the utility of examining alpha oscillations in detail, but I think the specific contribution of this work could be better contextualized in terms of other existing work. In the introduction, for example, the following review is an important piece of work that could be cited when introducing aperiodic activity:
      - He, B. J. (2014). Scale-free brain activity: Past, present, and future. Trends in Cognitive Sciences, 18(9), 480-487. https://doi.org/10.1016/j.tics.2014.04.003

      2. Model quality control 

      A limitation to the methods employed in this study is a lack of description of if and how model fit quality was evaluated. For the method of parametrizing neural power spectra that is employed, it is important to validate that models fit the data well, otherwise the estimated parameters may be unreliable. This is especially important in developmental and clinical data, as analyzed here, as this data can be quite noisy, and differences in levels of noise across ages or between clinical groups could plausibly lead to differences in model fit quality. Useful quality checks for this kind of analysis would be to report the average r-squared (or model error) for the parametrized data, and to examine whether model fit quality is significantly related to age, or clinical status. 
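      The quality checks suggested above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the function names, the example values, and the use of a simple Pearson correlation are all assumptions made here, and the 0.9 cut-off is included only because it is the value quoted from the pre-registration.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def summarize_fit_quality(ages, r_squared, threshold=0.9):
    """Summarize spectral model fit quality and its relation to age.

    `threshold` mirrors the R-squared < 0.9 exclusion rule quoted from the
    pre-registration; whether it suits a given dataset is an open question.
    """
    flagged = [i for i, r2 in enumerate(r_squared) if r2 < threshold]
    return {
        "mean_r2": mean(r_squared),              # average fit quality
        "n_flagged": len(flagged),               # fits below the threshold
        "flagged_idx": flagged,                  # which subjects they are
        "r_age_vs_r2": pearson_r(ages, r_squared),  # does fit track age?
    }


# Hypothetical values, for illustration only (not data from the study).
ages = [6, 8, 10, 12, 14, 16]
r2 = [0.88, 0.91, 0.93, 0.95, 0.96, 0.97]
summary = summarize_fit_quality(ages, r2)
print(summary)
```

      A clear correlation between age and fit quality in such a summary would suggest that age effects on the fitted parameters might partly reflect differences in how well the model fits, which is the concern raised above. The same check could be repeated with clinical status in place of age.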

      Note that there is also a detailed guide for how best to apply spectral parametrization to developmental datasets, including notes on quality control, that may be useful:
      - Ostlund, B., Donoghue, T., Anaya, B., Gunther, K. E., Karalunas, S. L., Voytek, B., & Pérez-Edgar, K. E. (2022). Spectral parameterization for studying neurodevelopment: How and why. Developmental Cognitive Neuroscience, 54, 101073. https://doi.org/10.1016/j.dcn.2022.101073

      Not reporting any quality control metrics of the model fits also deviates from the analysis of the validation dataset as described in the pre-registered analysis (https://osf.io/7uwy2), which includes the note that the plan is for data to be excluded from the analysis if there is a bad model fit (R-squared < 0.9). It is unclear from the manuscript if this was done at all - and if so, why it was not described, and if not, why this deviates from the pre-registration. Note that though examining and reporting model fit quality is important, it is unclear where the value of 0.9 in the pre-registration came from, and it is unclear if this is an appropriate threshold for these specific datasets. 

      3. The analysis of the relationship between the aperiodic intercept and aperiodic exponent 

      There is an analysis in this paper that attempts to evaluate whether the change in aperiodic intercept that is observed is more than expected due to the measured change in aperiodic exponent. The approach taken for this analysis is ill-posed, and the interpretations made of this analysis are not supported. The issue is that the degree to which the intercept changes due to a change in exponent depends on the rotation frequency, which is not acknowledged or addressed in the analysis employed here.

      For example, for spectra rotated at 0 Hz, there is no measured change in offset from a change in exponent, whereas for a rotation at 100 Hz, there is a large influence of exponent on the change in offset, with different degrees of impact in between. The results of this analysis are therefore heavily influenced by the rotation frequency that is used. The analysis by the authors uses a rotation frequency of 19 Hz, however, there is no justification provided for this value. It is noted as being the middle point of the analyzed range, however, this itself is unrelated to whether it is an appropriate rotation frequency (since which frequency the spectrum rotates at is unrelated to the experimenter's decision of which frequency range to analyze). 

      In real data, we don't know a priori what the rotation frequency is; in general it need not be a single, consistent point, and it is difficult to measure between subjects. To get a sense of what it might be, anecdotally, we can see in Figure 2C that in this particular subset, the rotation point is not at 19 Hz, and appears to be at a higher frequency. If the rotation point is actually higher than 19 Hz, then the analysis employed will systematically under-estimate the impact of the measured exponent change, leading to the conclusion that the intercept is changing over and above the influence of the exponent. However, this conclusion is only valid if the rotation point of 19 Hz is accurate, and we would likely arrive at a different conclusion by picking a different rotation point. This analysis, by itself, is therefore invalid. To be interpretable, such an analysis would need to show that the correct rotation frequency had been measured.
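      The dependence on rotation frequency described above can be made concrete with a small sketch. It assumes an aperiodic component of the form log10 P(f) = offset - exponent * log10(f) and a rotation that holds power fixed at the rotation frequency; under those assumptions the implied offset change is delta_exponent * log10(f_rot). The function name and the exponent change used here are hypothetical and are not taken from the paper. Note that in this parametrization the point of zero influence is where log10(f) = 0.

```python
import math


def offset_change(delta_exponent, rotation_freq_hz):
    """Offset change implied by an exponent change for a given pivot.

    Assumes log10 P(f) = offset - exponent * log10(f), with power at
    `rotation_freq_hz` held fixed while the exponent changes, so that
    delta_offset = delta_exponent * log10(rotation_freq_hz).
    """
    return delta_exponent * math.log10(rotation_freq_hz)


# Hypothetical flattening of the spectrum, not a value from the paper.
delta_exponent = -0.3

# The same exponent change implies different offset changes depending on
# where the spectrum is assumed to rotate.
for f_rot in (1, 10, 19, 30, 100):
    print(f_rot, round(offset_change(delta_exponent, f_rot), 3))
```

      Under these assumptions, choosing 19 Hz when the true pivot is higher attributes less of the offset change to the exponent than is warranted, which is the direction of bias the comment above describes.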

      4. Flanker Analysis 

      Also relating to organization (similar to point 1), it is unclear why the analysis of the Flanker task, which is alluded to in the abstract, is only mentioned in the Discussion section. Given that this appears to be a key analysis, it is unclear why it is not presented in detail in the Results. The Flanker task and analysis are also not described in much detail in the methods. An issue with the Flanker analysis only being mentioned in the Discussion, with a link to a supplemental table, is that the details of the results are somewhat obfuscated from the reader. When looking at these results, two key features seem notable: first, though there is a significant effect of aperiodic-adjusted alpha power, the beta value is very small (many times smaller than the coefficients for age and gender), and second, although it doesn't quite pass significance, the estimated beta value for the total alpha power has the same magnitude as for the individualized alpha power. Between these two features, it is not clear if the relationship between aperiodic-adjusted alpha power and the Flanker performance is of sufficient magnitude to interpret that alpha power is related to attentional performance, and it's not clear that aperiodic-adjusted alpha power is more related to attentional performance than total alpha power (since a difference in significance does not necessarily imply a significant difference in the parameters). I think this analysis, as presented, therefore does not clearly support the claim made in the abstract that alpha power is found to relate to improved attentional performance.
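      The point that a difference in significance is not a significant difference can be illustrated with a short sketch. It uses a simple z-test on the difference between two coefficients and assumes the two estimates are independent and approximately normal, which is a simplification for coefficients estimated in the same sample. All numbers are hypothetical and are not the values from the supplemental table.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def coef_difference_test(b1, se1, b2, se2):
    """z-test for b1 - b2, assuming independent, normal estimates."""
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return z, two_sided_p(z)


# Hypothetical coefficients with equal magnitude but different precision.
b_adjusted, se_adjusted = 0.10, 0.04   # individually significant
b_total, se_total = 0.10, 0.06         # individually not significant

p_adjusted = two_sided_p(b_adjusted / se_adjusted)
p_total = two_sided_p(b_total / se_total)
z_diff, p_diff = coef_difference_test(b_adjusted, se_adjusted,
                                      b_total, se_total)
print(p_adjusted, p_total, z_diff, p_diff)
```

      In this constructed case one coefficient passes the conventional 0.05 threshold and the other does not, yet the test on their difference gives no evidence that they differ.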

    1. Discussion, revision and decision


      Author response


      To: Adam Marcus, co-founder Retraction Watch & Alison Abritis, PhD, researcher at Retraction Watch

      Major Problems: I found serious deficits in both for this article, and thus I have serious concerns as to the usefulness of this article. Therefore, I have not proceeded in a line-by-line, as I consider the overall problems to be grave enough to require attention and revision before getting to lesser items of clarity.

      I would like to point out that the authors show a marvelous attention to their work, and they have much to contribute to the field of retraction studies, and I do honestly look forward to their future work. However, in order for the field to move ahead with accuracy and validity, we must no longer just rely on superficial number crunching, and must start including the complexities of publishing in our analyses, as difficult and labor-intensive as it might be.

      We do not consider that our article presents serious problems nor that it would be useless.

      It is possible that a different view on the subject, some tendency to forbearance (understandable) for the difficult life of the publishing industry, along with some difficulties in understanding the ideas presented in the article, may have led to a series of points of view that we would like to comment on below.

      We would first like to thank the reviewers for their comments, some of which will allow us to improve and nuance, using objective elements, the analysis of this bumpy field represented by the ecosystem of retracted publications. Because we have based our study on data from freely accessible sources of information, we will not insist too much on commenting on this issue.

      The authors stated that they used the search protocol (and therefore presumably the same dataset) as described in Toma & Padureanu, 2021, and do not indicate any process to compensate for its weaknesses. In the referenced study, the authors (same as for this article) utilized a PubMed search using only “Retracted Publication” in Publication Type. This search method is immediately insufficient, as some retracted articles are not bannered or indexed as retracted in PubMed. This issue is well-understood among scholars who search databases for retractions, and by now one would expect that these searches would strive to be more comprehensive.

      A better method, if one insists on restricting the search to PubMed, would have been to use Publication Type to search for “retracted publication,” and then to search for “retraction of publication,” and to compare the output to eliminate duplications. There are even more comprehensive ways to search PubMed, especially since some articles are retitled as “Withdrawn” – Elsevier, for example, uses the term instead of “Retracted” for papers removed within a year of their publication date – but do not come in searches for either publication type. Even better would have been to use databases with more comprehensive indexing of retractions.

      In an ideal world, if any effort were to be made, it would be aimed at better indexing and managing existing databases, not at generating query strategies to make up for their shortcomings.

      Thank you very much for the suggestions on the search strategy. We do not consider that the use of "Retracted Publication [PT]" should be compensated in any way but, if it should be compensated, we wouldn't want to add "Retraction of publication". We consider that using a search protocol more specific to systematic reviews is not very useful in our case: data are added/updated continuously (sometimes late), incorrect indexing can be corrected, the number of retracted articles increases from month to month; the same strategy can give different results at different times regardless of its complexity. Putting extra effort into detecting problematic articles, expecting a benefit without knowing what it is, only highlights issues that can be improved at the publisher/editor level (content delivery) and the database level (indexing).

      The dataset analyzed is a snapshot of a particular time interval and nothing more. Even during the analysis we found, in the case of one publisher, the addition of details to the initially incomplete retraction notes. Hence the need for follow-up studies. Therefore in the case of retractions, unlike the reviewer, we prefer an approach based on simple and easily reproducible strategies, widely accessible sources of information, and several steps. The first step in this strategy is the "number crunching" stage which includes this article.

      1. The authors are using the time from publication to retraction based on the notice dates and using them to indicate efficacy of oversight by publishers. However, this approach is seriously problematic. It takes no notice of when the publisher was first informed that the article was potentially compromised. Publishers who respond rapidly to information that affects years/decades old publications will inevitably show worse scores than those who are advised upon an article’s faults immediately upon its publication, but who drag their heels a few months in dealing with the problem.

      Indeed, the article uses the time between publication and retraction (exposure time – ET) as one of the SDTP score components for assessing editorial/publisher performance. Data on when a publisher or editor has been informed of problems with an article, apart from being relatively rare, is not a substitute for a retraction note. Moreover, the use of such information may induce a risk of bias.

      We mention in the article the need to use reporting standards for retraction notes, and one element that might be useful is, indeed, the date on which the publisher or editor was informed of problems with an article. Unfortunately, as the author of this review knows very well, information precedes investigation; the retraction note contains (or should contain) much more data than the initial information about the quality problems of an article.

      Our article aims to suggest a score for measuring publication performance in the context of retracted articles that would also allow an assessment of the dynamics of the activity of correcting the scientific record and, more importantly, how publishers engage in post-publication quality control. ET is only one component of this score.

      It is quite clear from the data presented in the article that a publisher/journal that emphasizes systematic back-checking will have an increasingly longer average lifespan of retracted articles, logically higher than one that does not do this type of checking. We don't see precisely where the reviewer thinks there is a problem: once the checking is done, the ET will decrease, and a publisher that takes concrete steps to correct the literature will ultimately have a better reputation. This does not mean that a higher ET is laudable, it suggests that there is a post-publication quality control but also that the peer review process has let problematic articles through and that the control of these articles has been carried out late. This is an argument for more active involvement of publishers (as potential generators of editorial policies) in post-publication control.

      Second, there is little consistency in dealing with retractions between publishers, within the same publishers or even within the same journal. Under the same publisher, one journal editor may be highly responsive during their term, while the next editor may not be. Most problems with articles quite often are first addressed by contacting the authors and/or journal editors, and publishers – especially those with hundreds of journals – may not have any idea of the ensuing problem for weeks or months, if at all. Therefore, the larger publishers would be far more likely to show worse scores than publishers with few journals to manage oversight.

      It is exactly this inconsistency that we highlight in the article. Differing policies, attitudes, and responsiveness do not mean that a publisher cannot/should not ask questions about the effectiveness of internal processes and resources used for post-publication quality control or the implementation of uniform measures across journals in its portfolio.

      Third, the dates on retraction notices are not always representative of when an article was watermarked or otherwise indicated as retracted. Elsevier journals often overwrite the html page of the original article with the retraction notice, leaving the original article’s date of publication alone. A separate retraction notice may not be published until days, weeks or even years after the article has been retracted. Springer and Sage have done this as well, as have other publishers – though not to the same extent (yet).

      Historically, The Journal of Biological Chemistry would publish a retraction notice and link it immediately to the original article, but a check of the article’s PDF would show it having been retracted days to weeks earlier. They have recently been acquired by Elsevier, so it is unknown how this trend will play out. And keep in mind, in some ways this is in itself not a bad thing – as it gives the user quicker notice that an article is unsuitable for citation, even while the notice itself is still undergoing revisions. It just makes tracking the time of publication to retraction especially difficult.

      We used the same date for all articles in our study (the one listed in PubMed), thus ensuring a uniform criterion for all publishers. If this date was not in PubMed we used the date from the retraction notes on the journal website but this was for a small number of articles. How different publishers handle retraction processes or the delay with which these are published is primarily related to internal editorial procedures, and these delays are reflected in the ET. In our experience, most articles retracted by Elsevier are available online, supplemented, and not replaced by retraction notes, which we think is an excellent policy.

      1. As best as can be determined, the authors are taking the notices at face value, and that has been repeatedly shown to be flawed. Many notices are written as a cooperative effort between the authors and journal, regardless of who initiated the retraction and under the looming specter of potential litigation.

      Shown to be flawed by who? Indeed, in our study, we refer to the retraction notes published by the journals. The fact that they are incomplete or formulated under the threat of litigation only supports our view that publishers and editors need to make a more significant effort to correct the biomedical literature, including avoiding litigation when the retraction note clearly describes the reasons for retraction. The way the retraction note is worded should be an editorial prerogative and should primarily aim at correcting scientific literature, not at appeasing egos, careers, or financial interests.

      Trying to establish who initiated a retraction process strictly by analyzing the notice language is destined to produce faulty conclusions. Looking just at PubPeer comments, questions about the data quality may be raised days/month/years before a retraction, with indications of having contacted the journal or publisher. And yet, an ensuing notice may be that the authors requested the retraction because of concerns about the data/image – where the backstory clearly shows that impetus for the retraction was prompted by a journal’s investigation of outside complaints. As an example, the recent glut of retractions of papers coming from paper mills often suggest the authors are requesting the retraction. This interpretation would be false, however, as those familiar with the backstory are aware that the driving force for many of these retractions were independent investigators contacting the journals/publishers for retraction of these manuscripts.

      Once again, the author of this review does not seem to fully understand our study, apparently favouring information published on third-party websites over that the journals officially assumed. The retraction notes represent the material available to a researcher doing documentation on a particular topic. The clarity and information contained in the note is the editor's or publisher’s responsibility, reflecting their performance and concern for the integrity of the science. Interpretation of a retraction note/analyzing an article occurs in this context. Not everyone has time for further investigation or to search third-party sites for information that is, with a notable exception, the result of a selection bias.

      Assigning the reason for retraction from only the text of the notice will absolutely skew results. As already stated, in many cases, journal editors and authors work together to produce the language. Thus, the notice may convey an innocuous but unquestionable cause (e.g., results not reproducible) because the fundamental reason (e.g., data/image was fabricated or falsified) is too difficult to prove to a reasonable degree. Even the use of the word “plagiarism” is triggering for authors’ reputations – and notices have been crafted to avoid any suggestion of such, with euphemisms that steer well clear of the “p” word. Furthermore, it has been well-documented that some retractions required by institutional findings of misconduct have used language in the notice indicating simple error or other innocuous reasons as the definitive cause.

      We understand your point of view and the situations presented may be accurate. However, from our point of view, the only valid reference remains the retraction note published on the journal's website. The existence of wording difficulties and various other problems that may arise more likely has to do with a tendency of the reviewer to make excuses for journals reluctant to indicate precisely what the reasons for retracting the article are. There are plenty of retraction notes in which the images with problems (including whether they were plagiarized, reused, manipulated, fabricated, etc.) are indicated with great precision; there are equally plenty of notes in which the word plagiarism is used without hesitation, indicating the sources, how they were informed, and what was plagiarized. No matter how many hesitant publishers/editors there are, it should not be forgotten that there are many journals/publishers who take their role seriously, acknowledge and learn from their mistakes, thus providing a real service to the scientific community.

      The authors also discuss changes in the quality of notices increasing or decreasing in publishers – but without knowing the backstory. Having more words in a notice or giving one or two specific causes cannot in itself be an indicator of the quality (i.e., accuracy) of said notice.

      "Knowing the backstory" is not part of our objectives, and neither is assessing the quality of the retraction notes. This is also very difficult to do due to the lack of an accepted standard format. We are trying to propose a score composed of several parameters resulting from existing (or non-existing) data in the retraction notes so that we can have a picture of retractions at publisher level. Knowing the backstory is not relevant, reading and interpreting the official retraction note is relevant.

      1. The authors tend to infer that the lack of a retraction in a journal implies a degree of superiority over journals with retractions. Although they qualify it a bit ( “Are over 90% of journals without a retracted article perfect? It is a question that is quite difficult to answer at this time, but we believe that the opinion that, in reality, there are many more articles that should be retracted (Oransky et al. 2021) is justified and covered by the actual figures.”), the inference is naive. First, they have not looked at the number of corrections within these journals. Even ignoring that these corrections may be disproportionate within different journals and require responsive editorial staff, some journals have gone through what can only be called great contortions to issue corrections rather than retractions.

      We believe that this is a case of reviewer confusion generated either by the insufficiently precise wording of the text or a lack of understanding of our study objectives. We are trying to point out that more than 90% of the journals in the NLM catalogue-PubMed subset have not retracted a single article. We are not trying to say that journals without retracted articles are superior to the others. As explained in the article, we referred to retraction notes, not corrections.

      Second, the lack of retractions in a journal speaks nothing to the quality of the articles therein. Predatory journals generally avoid issuing retractions, even when presented with outright proof of data fabrication or plagiarism. Meanwhile, high-quality journals are likely to have more, and possibly more astute, readers, who could be more adept at spotting errors that require retraction.

      Of course, the quality level of articles in a journal is not determined by the number of articles removed.

      Third, smaller publishers/journals may not have the fiscal resources to deal with the issues that come with a retraction. As an example, even though there was an institutional investigation finding data fabrication, at least one journal declined to issue a retraction for an article by Joachim Boldt (who has more than 160 retractions for misconduct) after his attorneys made threats of litigation.

      Threats of lawsuits are instead a failure of a publisher/journal to adapt to the realities of the publishing business or to the risk of misconduct. This is something that needs to change.

      Simply put, the presence or lack of a retraction in a journal is no longer a reasonable speculation about the quality of the manuscripts or the efficiency of the editorial process.

      We have not attempted to suggest this, we have only analyzed the retracted articles and their associated retraction notes. On the other hand, the way a journal/publisher handles the retraction of problematic articles still reflects, to some extent, the quality/performance of the editorial processes.

      1. I am concerned that the authors appear to have made significant errors in their analysis of publishers. For example, they claim that neither PLOS nor Elsevier retracted papers in 2020 for problematic images. That assertion is demonstrably false.

      This is wrong. In our dataset, there are eleven PLOS articles related to human health with the publication years 2019 and 2020. None of these have images as retraction reasons.

      Regarding the 21 Elsevier articles published in 2020, there is nothing in the retraction notes to indicate that the article was retracted because of the images. In 2 retraction notes there is mention of the comments made by Dr. Bik (The Tadpole Paper Mill - Science Integrity Digest) but the text of these (retraction notes) stops at the authors' inability to provide the raw data underlying the article.

      Our study is based only on the content of the retraction notes published and assumed by the journal, not on opinions/comments appearing on other sites, which, for unknown/unmentioned reasons, are not officially assumed in the retraction note. Therefore, we consider the statement in the review to be questionable at best, as the use of material other than the retraction notes has severe implications for the internal and external validity of the study and the suggestion to use such methods is, in our opinion, wrong. We would also like to draw attention to the fact that many retraction notes explicitly mention the request to provide raw images and the authors' inability to provide them.

      Anyway, as far as images are concerned, our article suggested that there are publishers which seem to adopt image analysis technologies faster than others. The numbers are not really relevant in this case but the trend is: it describes the publishing activity complexity better than the numbers.

      Reviewer response

      We appreciate the authors’ zeal in standing by their work.

      In regard to the deficits in the search process, the author states, “We do not consider that the use of ‘Retracted Publication [PT]’ should be compensated in any way but, if it should be compensated, we wouldn't want to add ‘Retraction of publication’”

      There is a lack of appreciation for the complexities of indexing retracted materials in an indexing site such as PubMed. To have a comprehensive search, one should not be choosing to use either “Retracted Publication [PT]” OR “Retraction of Publication [PT].” One would use both, and then filter out the duplicates, because some retractions are indexed by retraction notices, some only have “Retracted” added to the indexed title and the publication type changed to “Retracted Publication.” Use of only one or the other guarantees that the search is far less comprehensive than it should be.
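      The two-query approach described above could be sketched as follows. This is only an illustration: it builds request URLs for the NCBI E-utilities `esearch` endpoint without sending them, and the helper names `build_esearch_url` and `merge_pmids` are hypothetical. The sketch takes a plain union of PMIDs; linking each retraction notice to the article it retracts is a separate step that is not shown.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The two publication types named in the review.
QUERIES = (
    '"Retracted Publication"[PT]',
    '"Retraction of Publication"[PT]',
)


def build_esearch_url(term, retmax=100):
    """Build (but do not send) an esearch request URL for PubMed."""
    params = {"db": "pubmed", "term": term,
              "retmode": "json", "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def merge_pmids(*result_lists):
    """Union of PMID lists in first-seen order, with duplicates dropped."""
    seen = set()
    merged = []
    for pmids in result_lists:
        for pmid in pmids:
            if pmid not in seen:
                seen.add(pmid)
                merged.append(pmid)
    return merged


urls = [build_esearch_url(q) for q in QUERIES]

# Placeholder identifiers standing in for the two result sets.
merged = merge_pmids(["1", "2"], ["2", "3"])
print(urls)
print(merged)
```

      Because the index changes over time, as the authors note, the date of each query would need to be recorded alongside the merged list for the search to be reproducible.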

      The authors state, “In an ideal world, if any effort were to be made, it would be aimed at better indexing and managing existing databases, not at generating query strategies to make up for their shortcomings.”

      There is at least one database (http://retractiondatabase.org) that has a far more comprehensive indexing of retractions and is publicly available for use.

      In Item 3, where it is pointed out that retraction notices themselves are inaccurate and cannot be taken at face value as to the reason behind the retraction, the authors responded, “Shown to be flawed by who?” — By an article cited in the manuscript:

      Fang, Ferric C.; Steen, R. Grant; Casadevall, Arturo (2012): Misconduct accounts for the majority of retracted scientific publications. In Proceedings of the National Academy of Sciences of the United States of America 109 (42), pp. 17028–17033. DOI: 10.1073/pnas.1212247109.

      “To understand the reasons for retraction, we consulted reports from the Office of Research Integrity and other published resources (7, 8), in addition to the retraction announcements in scientific journals. Use of these additional sources of information resulted in the reclassification of 118 of 742 (15.9%) retractions in an earlier study (4) from error to fraud.” Followed by “These factors have contributed to the systematic underestimation of the role of misconduct and the overestimation of the role of error in retractions (3, 4), and speak to the need for uniform standards regarding retraction notices (5).”

      The authors then choose to state that it is the “editorial prerogative” – and that when notices “are incomplete or formulated under the threat of litigation [it] only supports our view that publishers and editors need to make a more significant effort to correct the biomedical literature, including avoiding litigation when the retraction note clearly describes the reasons for retraction.”

      Following our attempt to explain why understanding the real reason behind a retraction is important to study the publication of notices, the authors respond: “Once again, the author of this review does not seem to fully understand our study, apparently favouring information published on third-party websites over that the journals officially assumed.”

      First, yes, we do understand the study. We read a lot of these. Second, the “third-party websites” we prefer include the Office of Research Integrity and the Retraction Watch blog, where background investigations into the causes of retraction notices are described. If the authors are challenging the reference to PubPeer, keep in mind that journals initiate investigations based on comments on that website, and have taken to citing the website in their notices.

      Had the authors not chosen to categorize the reasons for retraction, their reasoning may have had more support – but they did, and in doing so, by just using the notice with no further review, their findings address only the notice itself, with no context.

      We recommend that the manuscript be substantially revised with strong attention to the comments we made in our original review.

    2. Discussion, revision and decision


      Author response


      To: Adam Marcus, co-founder Retraction Watch & Alison Abritis, PhD, researcher at Retraction Watch

      Major Problems: I found serious deficits in both for this article, and thus I have serious concerns as to the usefulness of this article. Therefore, I have not proceeded in a line-by-line, as I consider the overall problems to be grave enough to require attention and revision before getting to lesser items of clarity.

      I would like to point out that the authors show a marvelous attention to their work, and they have much to contribute to the field of retraction studies, and I do honestly look forward to their future work. However, in order for the field to move ahead with accuracy and validity, we must no longer just rely on superficial number crunching, and must start including the complexities of publishing in our analyses, as difficult and labor-intensive as it might be.

      We do not consider that our article presents serious problems nor that it would be useless.

      It is possible that a different view on the subject, some tendency to forbearance (understandable) for the difficult life of the publishing industry, along with some difficulties in understanding the ideas presented in the article, may have led to a series of points of view that we would like to comment on below.

      We would first like to thank the reviewers for their comments, some of which will allow us to improve and nuance, using objective elements, the analysis of this bumpy field represented by the ecosystem of retracted publications. Because we have based our study on data from freely accessible sources of information, we will not insist too much on commenting on this issue.

      The authors stated that they used the search protocol (and therefore presumably the same dataset) as described in Toma & Padureanu, 2021, and do not indicate any process to compensate for its weaknesses. In the referenced study, the authors (same as for this article) utilized a PubMed search using only “Retracted Publication” in Publication Type. This search method is immediately insufficient, as some retracted articles are not bannered or indexed as retracted in PubMed. This issue is well-understood among scholars who search databases for retractions, and by now one would expect that these searches would strive to be more comprehensive.

      A better method, if one insists on restricting the search to PubMed, would have been to use Publication Type to search for “retracted publication,” and then to search for “retraction of publication,” and to compare the output to eliminate duplications. There are even more comprehensive ways to search PubMed, especially since some articles are retitled as “Withdrawn” – Elsevier, for example, uses the term instead of “Retracted” for papers removed within a year of their publication date – but do not come in searches for either publication type. Even better would have been to use databases with more comprehensive indexing of retractions.
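
      The two-search approach described above can be sketched as follows. This is a minimal illustration rather than the reviewers' or the authors' actual workflow: the function name, the input shapes, and all identifiers are hypothetical, and it assumes that each retraction notice has already been linked to the PMID of the article it retracts.

```python
from typing import Dict, Iterable, List


def merge_retraction_hits(
    retracted_publication_hits: Iterable[str],
    retraction_notice_hits: Dict[str, str],
) -> Dict[str, List[str]]:
    """Combine both result sets, keyed by the PMID of the retracted article.

    retracted_publication_hits: PMIDs found via "Retracted Publication" [PT].
    retraction_notice_hits: hypothetical mapping from the PMID of a notice
        (found via "Retraction of Publication" [PT]) to the PMID of the
        article that the notice retracts.
    """
    merged: Dict[str, List[str]] = {}

    def record(article_pmid: str, source: str) -> None:
        # Keep one entry per article and remember which search found it.
        sources = merged.setdefault(article_pmid, [])
        if source not in sources:
            sources.append(source)

    for pmid in retracted_publication_hits:
        record(pmid, "retracted_publication")
    for _notice_pmid, article_pmid in retraction_notice_hits.items():
        record(article_pmid, "retraction_notice")
    return merged


# Made-up identifiers, for illustration only.
articles = ["1001", "1002", "1003"]
notices = {"2001": "1002", "2002": "1004"}
combined = merge_retraction_hits(articles, notices)
only_via_notice = [p for p, s in combined.items() if s == ["retraction_notice"]]
print(sorted(combined), only_via_notice)
```

      In this toy input, article 1004 is reachable only through its notice, which is the kind of record a single-publication-type search would miss.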

      In an ideal world, if any effort were to be made, it would be aimed at better indexing and managing existing databases, not at generating query strategies to make up for their shortcomings.

      Thank you very much for the suggestions on the search strategy. We do not consider that the use of "Retracted Publication [PT]" should be compensated in any way but, if it should be compensated, we wouldn't want to add "Retraction of publication". We consider that using a search protocol more specific to systematic reviews is not very useful in our case: data are added and updated continuously (sometimes late), incorrect indexing can be corrected, and the number of retracted articles increases from month to month, so the same strategy can give different results at different times regardless of its complexity. Putting extra effort into detecting problematic articles, expecting a benefit without knowing what it is, only highlights issues that can be improved at the publisher/editor level (content delivery) and at the database level (indexing).

      The dataset analyzed is a snapshot of a particular time interval and nothing more. Even during the analysis we found, in the case of one publisher, the addition of details to the initially incomplete retraction notes. Hence the need for follow-up studies. Therefore in the case of retractions, unlike the reviewer, we prefer an approach based on simple and easily reproducible strategies, widely accessible sources of information, and several steps. The first step in this strategy is the "number crunching" stage which includes this article.

      1. The authors are using the time from publication to retraction based on the notice dates and using them to indicate efficacy of oversight by publishers. However, this approach is seriously problematic. It takes no notice of when the publisher was first informed that the article was potentially compromised. Publishers who respond rapidly to information that affects years/decades old publications will inevitably show worse scores than those who are advised upon an article’s faults immediately upon its publication, but who drag their heels a few months in dealing with the problem.

      Indeed, the article uses the time between publication and retraction (exposure time, ET) as one of the SDTP score components for assessing editorial/publisher performance. Data on when a publisher or editor has been informed of problems with an article, apart from being relatively rare, is not a substitute for a retraction note. Moreover, the use of such information may induce a risk of bias.

      We mention in the article the need to use reporting standards for retraction notes, and one element that might be useful is, indeed, the date on which the publisher or editor was informed of problems with an article. Unfortunately, as the author of this review knows very well, information precedes investigation; the retraction note contains (or should contain) much more data than the initial information about the quality problems of an article.

      Our article aims to suggest a score for measuring publication performance in the context of retracted articles that would also allow an assessment of the dynamics of the activity of correcting the scientific record and, more importantly, how publishers engage in post-publication quality control. ET is only one component of this score.

      It is quite clear from the data presented in the article that a publisher/journal that emphasizes systematic back-checking will have an increasingly longer average lifespan of retracted articles, logically higher than one that does not do this type of checking. We don't see precisely where the reviewer thinks there is a problem: once the checking is done, the ET will decrease, and a publisher that takes concrete steps to correct the literature will ultimately have a better reputation. This does not mean that a higher ET is laudable, it suggests that there is a post-publication quality control but also that the peer review process has let problematic articles through and that the control of these articles has been carried out late. This is an argument for more active involvement of publishers (as potential generators of editorial policies) in post-publication control.

      Second, there is little consistency in dealing with retractions between publishers, within the same publishers or even within the same journal. Under the same publisher, one journal editor may be highly responsive during their term, while the next editor may not be. Most problems with articles quite often are first addressed by contacting the authors and/or journal editors, and publishers – especially those with hundreds of journals – may not have any idea of the ensuing problem for weeks or months, if at all. Therefore, the larger publishers would be far more likely to show worse scores than publishers with few journals to manage oversight.

      It is exactly this inconsistency that we highlight in the article. Differing policies, attitudes, and responsiveness do not mean that a publisher cannot or should not ask questions about the effectiveness of the internal processes and resources used for post-publication quality control, or about the implementation of uniform measures across the journals in its portfolio.

      Third, the dates on retraction notices are not always representative of when an article was watermarked or otherwise indicated as retracted. Elsevier journals often overwrite the html page of the original article with the retraction notice, leaving the original article’s date of publication alone. A separate retraction notice may not be published until days, weeks or even years after the article has been retracted. Springer and Sage have done this as well, as have other publishers – though not to the same extent (yet).

      Historically, The Journal of Biological Chemistry would publish a retraction notice and link it immediately to the original article, but a check of the article’s PDF would show it having been retracted days to weeks earlier. They have recently been acquired by Elsevier, so it is unknown how this trend will play out. And keep in mind, in some ways this is in itself not a bad thing – as it gives the user quicker notice that an article is unsuitable for citation, even while the notice itself is still undergoing revisions. It just makes tracking the time of publication to retraction especially difficult.

      We used the same date for all articles in our study (the one listed in PubMed), thus ensuring a uniform criterion for all publishers. If this date was not in PubMed we used the date from the retraction notes on the journal website but this was for a small number of articles. How different publishers handle retraction processes or the delay with which these are published is primarily related to internal editorial procedures, and these delays are reflected in the ET. In our experience, most articles retracted by Elsevier are available online, supplemented, and not replaced by retraction notes, which we think is an excellent policy.
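
      The date rule described above can be written down as a short sketch. It is illustrative only, not the code used in the study: the function and parameter names are hypothetical, and expressing ET in days is an assumption, since the unit is not stated here.

```python
from datetime import date
from typing import Optional


def exposure_time_days(
    published: date,
    pubmed_retraction: Optional[date],
    journal_retraction: Optional[date],
) -> Optional[int]:
    """Exposure time (ET): days between publication and retraction.

    The retraction date listed in PubMed is preferred; the date on the
    journal website is used only when PubMed has none.
    """
    retraction = pubmed_retraction if pubmed_retraction is not None else journal_retraction
    if retraction is None:
        return None  # No usable retraction date, so ET cannot be computed.
    return (retraction - published).days


# Made-up dates, for illustration only.
et_pubmed = exposure_time_days(date(2020, 1, 1), date(2020, 1, 31), date(2020, 3, 1))
et_fallback = exposure_time_days(date(2020, 1, 1), None, date(2020, 1, 11))
et_missing = exposure_time_days(date(2020, 1, 1), None, None)
print(et_pubmed, et_fallback, et_missing)
```

      Applying one rule to every record keeps the criterion uniform across publishers, although any delay between the actual retraction and the recorded date is still absorbed into the ET.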

      1. As best as can be determined, the authors are taking the notices at face value, and that has been repeatedly shown to be flawed. Many notices are written as a cooperative effort between the authors and journal, regardless of who initiated the retraction and under the looming specter of potential litigation.

      Shown to be flawed by who? Indeed, in our study, we refer to the retraction notes published by the journals. The fact that they are incomplete or formulated under the threat of litigation only supports our view that publishers and editors need to make a more significant effort to correct the biomedical literature, including avoiding litigation when the retraction note clearly describes the reasons for retraction. The way the retraction note is worded should be an editorial prerogative and should primarily aim at correcting scientific literature, not at appeasing egos, careers, or financial interests.

      Trying to establish who initiated a retraction process strictly by analyzing the notice language is destined to produce faulty conclusions. Looking just at PubPeer comments, questions about the data quality may be raised days/months/years before a retraction, with indications of having contacted the journal or publisher. And yet, an ensuing notice may be that the authors requested the retraction because of concerns about the data/image – where the backstory clearly shows that impetus for the retraction was prompted by a journal’s investigation of outside complaints. As an example, the recent glut of retractions of papers coming from paper mills often suggests the authors are requesting the retraction. This interpretation would be false, however, as those familiar with the backstory are aware that the driving force for many of these retractions was independent investigators contacting the journals/publishers for retraction of these manuscripts.

      Once again, the author of this review does not seem to fully understand our study, apparently favouring information published on third-party websites over that the journals officially assumed. The retraction notes represent the material available to a researcher doing documentation on a particular topic. The clarity and information contained in the note is the editor's or publisher’s responsibility, reflecting their performance and concern for the integrity of the science. Interpretation of a retraction note/analyzing an article occurs in this context. Not everyone has time for further investigation or to search third-party sites for information that is, with a notable exception, the result of a selection bias.

      Assigning the reason for retraction from only the text of the notice will absolutely skew results. As already stated, in many cases, journal editors and authors work together to produce the language. Thus, the notice may convey an innocuous but unquestionable cause (e.g., results not reproducible) because the fundamental reason (e.g., data/image was fabricated or falsified) is too difficult to prove to a reasonable degree. Even the use of the word “plagiarism” is triggering for authors’ reputations – and notices have been crafted to avoid any suggestion of such, with euphemisms that steer well clear of the “p” word. Furthermore, it has been well-documented that some retractions required by institutional findings of misconduct have used language in the notice indicating simple error or other innocuous reasons as the definitive cause.

      We understand your point of view and the situations presented may be accurate. However, from our point of view, the only valid reference remains the retraction note published on the journal's website. The wording difficulties and various other problems that may arise more likely have to do with a tendency of the reviewer to make excuses for journals reluctant to indicate precisely what the reasons for retracting the article are. There are plenty of retraction notes in which the images with problems (including whether they were plagiarized, reused, manipulated, fabricated, etc.) are indicated with great precision, and equally plenty of notes in which the word plagiarism is used without hesitation, indicating the sources, how the journal was informed, and what was plagiarized. No matter how many hesitant publishers/editors there are, it should not be forgotten that there are many journals/publishers who take their role seriously, acknowledge and learn from their mistakes, thus providing a real service to the scientific community.

      The authors also discuss changes in the quality of notices increasing or decreasing in publishers – but without knowing the backstory. Having more words in a notice or giving one or two specific causes cannot in itself be an indicator of the quality (i.e., accuracy) of said notice.

      "Knowing the backstory" is not part of our objectives, and neither is assessing the quality of the retraction notes. This is also very difficult to do due to the lack of an accepted standard format. We are trying to propose a score composed of several parameters resulting from existing (or non-existing) data in the retraction notes so that we can have a picture of retractions at publisher level. Knowing the backstory is not relevant, reading and interpreting the official retraction note is relevant.

      1. The authors tend to infer that the lack of a retraction in a journal implies a degree of superiority over journals with retractions. Although they qualify it a bit ( “Are over 90% of journals without a retracted article perfect? It is a question that is quite difficult to answer at this time, but we believe that the opinion that, in reality, there are many more articles that should be retracted (Oransky et al. 2021) is justified and covered by the actual figures.”), the inference is naive. First, they have not looked at the number of corrections within these journals. Even ignoring that these corrections may be disproportionate within different journals and require responsive editorial staff, some journals have gone through what can only be called great contortions to issue corrections rather than retractions.

      We believe that this is a case of reviewer confusion generated either by the insufficiently precise wording of the text or a lack of understanding of our study objectives. We are trying to point out that more than 90% of the journals in the NLM catalogue-PubMed subset have not retracted a single article. We are not trying to say that journals without retracted articles are superior to the others. As explained in the article, we referred to retraction notes, not corrections.

      Second, the lack of retractions in a journal speaks nothing to the quality of the articles therein. Predatory journals generally avoid issuing retractions, even when presented with outright proof of data fabrication or plagiarism. Meanwhile, high-quality journals are likely to have more, and possibly more astute, readers, who could be more adept at spotting errors that require retraction.

      Of course, the quality level of articles in a journal is not determined by the number of articles removed.

      Third, smaller publishers/journals may not have the fiscal resources to deal with the issues that come with a retraction. As an example, even though there was an institutional investigation finding data fabrication, at least one journal declined to issue a retraction for an article by Joachim Boldt (who has more than 160 retractions for misconduct) after his attorneys made threats of litigation.

      Threats of lawsuits are instead a failure of a publisher/journal to adapt to the realities of the publishing business or to the risk of misconduct. This is something that needs to change.

      Simply put, the presence or lack of a retraction in a journal is no longer a reasonable speculation about the quality of the manuscripts or the efficiency of the editorial process.

      We have not attempted to suggest this; we have only analyzed the retracted articles and their associated retraction notes. On the other hand, the way a journal/publisher handles the retraction of problematic articles still reflects, to some extent, the quality/performance of the editorial processes.

      1. I am concerned that the authors appear to have made significant errors in their analysis of publishers. For example, they claim that neither PLOS nor Elsevier retracted papers in 2020 for problematic images. That assertion is demonstrably false.

      This is wrong. In our dataset, there are eleven PLOS articles related to human health with the publication year 2019 and 2020. None of these have images as retraction reasons.

      Regarding the 21 Elsevier articles published in 2020, there is nothing in the retraction notes to indicate that the article was retracted because of the images. In 2 retraction notes there is mention of the comments made by Dr. Bik (The Tadpole Paper Mill - Science Integrity Digest) but the text of these (retraction notes) stops at the authors' inability to provide the raw data underlying the article.

      Our study is based only on the content of the retraction notes published and assumed by the journal, not on opinions/comments appearing on other sites, which, for unknown/unmentioned reasons, are not officially assumed in the retraction note. Therefore, we consider the statement in the review to be questionable at best, as the use of material other than the retraction notes has severe implications for the internal and external validity of the study and the suggestion to use such methods is, in our opinion, wrong. We would also like to draw attention to the fact that many retraction notes explicitly mention the request to provide raw images and the authors' inability to provide them.

      Anyway, as far as images are concerned, our article suggested that there are publishers which seem to adopt image analysis technologies faster than others. The numbers are not really relevant in this case but the trend is: it describes the publishing activity complexity better than the numbers.

      Reviewer response

      We appreciate the authors’ zeal in standing by their work.

      In regard to the deficits in the search process, the author states, “We do not consider that the use of ‘Retracted Publication [PT]’ should be compensated in any way but, if it should be compensated, we wouldn't want to add ‘Retraction of publication’”

      There is a lack of appreciation for the complexities of indexing retracted materials in an indexing site such as PubMed. To have a comprehensive search, one should not be choosing to use either “Retracted Publication [PT]” OR “Retraction of Publication [PT].” One would use both, and then filter out the duplicates, because some retractions are indexed by retraction notices, some only have “Retracted” added to the indexed title and the publication type changed to “Retracted Publication.” Use of only one or the other guarantees that the search is far less comprehensive than it should be.

      The authors state, “In an ideal world, if any effort were to be made, it would be aimed at better indexing and managing existing databases, not at generating query strategies to make up for their shortcomings.”

      There is at least one database (http://retractiondatabase.org) that has a far more comprehensive indexing of retractions and is publicly available for use.

      In Item 3, where it is pointed out that retraction notices themselves are inaccurate and cannot be taken at face value as to the reason behind the retraction, the authors responded, “Shown to be flawed by who?” — By an article cited in the manuscript:

      Fang, Ferric C.; Steen, R. Grant; Casadevall, Arturo (2012): Misconduct accounts for the majority of retracted scientific publications. In Proceedings of the National Academy of Sciences of the United States of America 109 (42), pp. 17028–17033. DOI: 10.1073/pnas.1212247109.

      “To understand the reasons for retraction, we consulted reports from the Office of Research Integrity and other published resources (7, 8), in addition to the retraction announcements in scientific journals. Use of these additional sources of information resulted in the reclassification of 118 of 742 (15.9%) retractions in an earlier study (4) from error to fraud.” Followed by “These factors have contributed to the systematic underestimation of the role of misconduct and the overestimation of the role of error in retractions (3, 4), and speak to the need for uniform standards regarding retraction notices (5).”

      The authors then choose to state that it is the “editorial prerogative” – and that when notices “are incomplete or formulated under the threat of litigation [it] only supports our view that publishers and editors need to make a more significant effort to correct the biomedical literature, including avoiding litigation when the retraction note clearly describes the reasons for retraction.”

      Following our attempt to explain why understanding the real reason behind a retraction is important to study the publication of notices, the authors respond: “Once again, the author of this review does not seem to fully understand our study, apparently favouring information published on third-party websites over that the journals officially assumed.”

      First, yes, we do understand the study. We read a lot of these. Second, the “third-party websites” we prefer include the Office of Research Integrity and the Retraction Watch blog, where background investigations into the causes of retraction notices are described. If the authors are challenging the reference to PubPeer, keep in mind that journals initiate investigations based on comments on that website, and have taken to citing the website in their notices.

      Had the authors not chosen to categorize the reasons for retraction, their reasoning may have had more support – but they did, and in doing so, by just using the notice with no further review, their findings address only the notice itself, with no context.

      We recommend that the manuscript be substantially revised with strong attention to the comments we made in our original review.

    1. Author Response

      Reviewer #1 (Public Review):

      Liu et al investigated the role of the Wnt/β-catenin pathway in the genesis of thermogenic adipocytes. Their study shows that some adipocytes exhibited Wnt/β-catenin signaling ("Wnt+ adipocytes") in interscapular brown adipose tissue (iBAT), inguinal white adipose tissue (iWAT), epididymal WAT (eWAT), and bone marrow (BM). The proportion of Wnt+ adipocytes differed between depots: 17% in iBAT, 6.9% in iWAT, and the lowest, 1.3%, in eWAT. These adipocytes were noted on embryonic day 17.5 and were present in a higher percentage in female mice compared to male mice and in younger mice compared to older mice, which aligns with their observation that Wnt+ adipocytes are thermogenic.

      The authors also noted that Wnt+ adipocytes can differentiate from human stromal cells. With regard to the pathway, Wnt/β-catenin adipocytes are distinct from classical brown adipocytes at molecular and genomic levels. It was noted that Tcf7L2 was largely expressed in Wnt+ adipocytes but other Tcf proteins (Tcf 1, Tcf 3, and Lef1) were not. Wnt- cells showed a reversible delay in maturation with LF3; however, no cell death was noted. Wnt/β-catenin adipocytes seem to depend on AKT/mTOR signaling. It was further shown that insulin is a key factor in mTOR signaling and Wnt+ adipocyte differentiation.

      Upon cold exposure, UCP1+/Wnt- beige fat emerges largely surrounding Wnt+ adipocytes, implying that Wnt+ adipocytes serve as a "beiging initiator" in a paracrine manner. Lastly, mice with implanted Wnt+ adipocytes had significantly better glucose tolerance, which suggests that Wnt+ adipocytes have a beneficial impact on whole-body metabolism. I found no major flaws in the methods, and the data largely support their conclusion that Wnt+ adipocytes have a significant role (at least in part) in thermogenesis/metabolism, which I think is a very impressive and innovative finding.

      Thanks so much for the outstanding summary of our manuscript. We apologize that the original manuscript did not make clear that the percentage of Wnt+ adipocytes is higher in male mice than in females.

      Reviewer #2 (Public Review):

      Liu et al present evidence for the surprising finding of Tcf/Lef-active, "Wnt+" mature adipocytes. They report that Wnt+ adipocytes arise during embryogenesis and regulate cold-induced beiging in surrounding adipocytes. Tcf/Lef transcriptional activity in these cells is Wnt-ligand independent and instead appears to be stimulated by insulin-dependent AKT/mTOR signaling. Using a diphtheria toxin inducible depletion mouse model, the authors show that Wnt+ cells play an important role in glucose homeostasis.

      As the authors have acknowledged, proper assignment of adipocyte nuclei is a notoriously difficult histological challenge. Mesenchymal cells sit directly adjacent to the adipocyte plasma membrane and their nuclei are often incorrectly assigned to the adipocyte both in vivo and in vitro. Pparg nuclear co-staining is helpful, however, Pparg is very highly expressed by endothelial cells and Col15a1+ committed preadipocytes, which are intercalated throughout the adipose. The authors have made an impressive attempt to address this concern by generating a Tcf/Lef-CreER mouse line to fluorescently label Wnt+ adipocytes, however, it is not entirely clear if the images presented support the conclusion that mature adipocytes are being labeled. Given that Wnt+ mature adipocytes are the core conclusion of this manuscript, and because this hypothesis runs counter to a large body of literature concluding that Wnt signaling inhibits adipogenesis, the authors have assumed a very high burden of proof that these are indeed Wnt+ mature adipocytes in vivo.

      Thanks for the outstanding summary of our manuscript.

      To address these concerns, the authors could utilize the specificity of in vivo single-nuclei RNA-Seq. Several data resources have been published (https://singlecell.broadinstitute.org/single_cell/study/SCP1376/a-single-cell-atlas-of-human-and-mouse-white-adipose-tissue), and the authors should re-analyze these data for subpopulations of mature adipocytes that express a transcriptional signature of active Tcf/Lef signaling. It is unfortunate that the authors were unable to successfully perform single-nuclei analysis of the Wnt+ adipocytes as this would significantly enhance this manuscript. The physiologic relevance of the single-cell analysis of immortalized, in-vitro differentiated clonal cell lines is questionable.

      We took Reviewer 2's advice and intersected our scRNA-seq data on Wnt+ adipocytes with the published single-nucleus sequencing (sNuc-seq) dataset of mouse iWAT (Emont et al., 2022). Because the activation of Tcf/Lef signaling in Wnt+ adipocytes relies on AKT/mTOR signaling rather than on the conventional Wnt ligands and receptors, traditional downstream markers of Wnt signaling, such as Axins, were not found to be specifically enriched in Wnt+ adipocytes. Therefore, the AKT/mTOR-dependent Wnt signaling in Wnt+ adipocytes appears to regulate the expression of genes distinct from those controlled by the conventional Wnt signaling pathway. This conclusion is supported by our recent studies showing that inhibition of this AKT/mTOR-dependent Wnt signaling by LF3 in Wnt+ adipocytes negatively impacts pathways implicated in “PI3K/Akt signaling”, “insulin signaling”, “thermogenesis”, and “fatty acid metabolism”, among others (see below for details). However, we found that one cluster (mAd3) of the sNuc-seq dataset, which is relatively enriched in Tcf7l2, expresses remarkably high levels of Cyp2e1 as well as Cfd, which encodes Adipsin. These genes, regarded as hallmarks of the mAd3 cluster, are also uniquely or highly expressed in Wnt+ adipocytes. Interestingly, the percentage of mAd3 among total iWAT adipocytes in the chow-fed male group is about 5%, which is very close to that of Wnt+ adipocytes in vivo (~7%). Thus, mAd3 possibly represents Wnt+ adipocytes in iWAT. These analyses are included in the revision.

      Reviewer #3 (Public Review):

      It is becoming increasingly clear that adipocytes are not homogenous, but rather comprise several distinct subtypes with specific physiological functions. The mechanisms that underlie the development and distinct roles of each adipocyte subtype are of great interest for understanding the biology of metabolic regulation and its impairments in metabolic disease. In this manuscript, the authors describe a previously unknown population of adipocytes in mice, which are characterized by a special form of beta-catenin signaling. They perform a comprehensive series of experiments in cultured cells, in mouse models of in-vivo lineage tracing, and transplantation experiments to define the origin and function of these adipocytes. They find that the formation of these Wnt+ adipocytes is dependent on insulin signaling, and find possible roles in thermogenic adipose tissue development. Overall, the conclusions of this study are very convincing in their identification of a subpopulation of adipocytes displaying non-canonical Wnt signaling. The proposed role of these adipocytes as regulators of thermogenesis is more ambiguous, and their physiological function remains unclear.

      Thanks for the good comments. To distinguish this AKT/mTOR-dependent intracellular Wnt signaling in Wnt+ adipocytes from conventional non-canonical Wnt signaling, we feel that it would be appropriate to call it atypical Wnt signaling.

      • The new adipocyte types are identified through expression of a reporter for TCF/Lef signaling. This reporter is classically activated by Wnt/beta-catenin and using both siRNA depletion of beta-catenin as well as an allele lacking its transcriptional activation domain, the authors confirm the reporter expression is dependent on the presence of beta-catenin and TCF7L2, but independent of canonical Wnt signaling.

      • The involvement of TCF7L2 is also probed using a specific inhibitor of the beta-catenin/TCF7L2 interactions, LF3, which inhibited reporter expression. Inhibition of canonical Wnt signaling was without effect.

      • The authors isolate clonal lines of precursor cells that give rise to Wnt+ or Wnt- adipocytes from mouse brown adipose tissue. They find that Wnt+ adipocytes are dependent on the Wnt pathway, as inhibition by LF3 induces cell death.

      • To further probe the nature of Wnt+ and Wnt- adipocytes, the authors perform scRNASeq on cells after 7 days of adipose induction and find 2 distinctive cell populations. The finding of 2 distinct populations is expected, given the a priori separation of cells as a function of GFP expression. It is not clear why scRNASeq was chosen over RNASeq on the population, since the fat content of adipocytes may preclude full characterization of the most differentiated cells.

      With scRNA-seq, it would be more convincing to identify specific subpopulations of cells, as adipocytes are well known to be heterogeneous.

      Overall, this experiment is less informative on the mechanisms by which Wnt+ adipocytes display Wnt signaling dependency for viability, and what their functional role might be.

      Yes, these are major questions to be addressed in our future studies.

      • The non-canonical nature of Wnt signaling in Wnt+ adipocytes prompted the authors to explore the role of the insulin/PI3K/AKT/MTOR pathway. They find enhanced basal activity of this pathway in Wnt+ adipocytes. It was not explored whether this enhanced activity persists under insulin stimulation; this is relevant as feedback mechanisms within the signaling pathway may result in lower signaling under stimulated conditions.

      • To test the relevance of insulin signaling in-vivo on non-canonical Wnt signaling in adipocytes the authors use the Akita mouse, which lacks the insulin-2 gene and find a marked decrease in reporter activity, confirming the requirement for insulin signaling for expression of this non-canonical Wnt pathway.

      • To determine the functional role of Wnt+ adipocytes, the authors explore their relationship to mitochondrial respiratory activity and thermogenesis. They perform experiments to monitor mitochondrial membrane potential and oxygen consumption rate and find higher overall O2 consumption, and lower membrane potential in adipocyte populations vicinal to Wnt+ adipocytes. Overall these results are not fully convincing: The traces are highly variable from cell to cell, and rigorous quantification of uncoupled respiration is limited by the small number of cell lines analyzed; only one cell line of Wnt- and two Wnt+ adipocytes are analyzed. In situ differences in membrane potential would be more convincing if performed on homogenous collections of Wnt- and Wnt+ adipocytes to better understand stochastic variance.

      Thanks for the suggestions. Actually, the results of mitochondrial membrane potential assay on mixed adipocyte culture gave us the initial hint of the potential paracrine effect of Wnt+ adipocytes.

      • To determine the role of Wnt+ adipocytes in-vivo thermogenesis, the authors expose mice to cold temperature and monitor the proportion of UCP1+ adipocytes in relation to Wnt signaling. They find a proportion of Wnt+ adipocytes expressing UCP1. Whether this proportion is higher or lower than that of Wnt- adipocytes is not quantified, so it is unclear whether Wnt+ adipocytes preferentially develop beige characteristics. The authors find that UCP1+, Wnt- adipocytes are topologically close to Wnt+ adipocytes, and hypothesize a paracrine signaling role. However, this correlation may be explained by known topological biases in inguinal fat pad beiging, where adipocytes closer to lymph node preferentially induce UCP1. The Wnt+ adipocyte population may coincidentally be present in this region.

      As shown in Figure 5-figure supplement 1E, while all Wnt+ adipocytes were co-stained with UCP1, the percentage of Wnt+ adipocytes did not increase after cold challenge. As shown in Figure 5-figure supplement 1C, the initial beiging response is closely associated with Wnt+ adipocytes, but not topological bias.

      • To functionally determine the role of Wnt+ adipocytes in thermogenesis, the authors ablate the Wnt+ lineage through expression of diphtheria toxin using a Fabp4-Flox-DTA mouse crossed to Tcf/Lef-CreERT2 mice. Less than 50% of these mice displayed impaired thermogenesis upon cold exposure. The authors interpret this finding to signify a partial role for Wnt+ adipocyte beiging in thermogenic regulation. This conclusion is not fully supported, as Fabp4 is expressed in many cells other than adipocytes, and therefore the phenotype of the affected mice is not unambiguously attributable to loss of Wnt+ adipocytes. An additional concern is that diphtheria toxin-induced cell death will lead to tissue inflammation, with potential functional effects on thermogenesis. The degree of cell death and inflammation should be measured and reported.

      While Fabp4 is expressed in some SVFs, the Fabp4-Flox-DTA allele is not activated by the Tcf/Lef-CreERT2 allele, as the T/L-GFP reporter is not seen in freshly isolated SVFs of iWAT (Figure 2-figure supplement 1A). To avoid potential side effects of DTA-induced cell death on adipose tissues, we combined the Tcf/Lef-rtTA allele with TRE-Cre and floxed Pparg alleles (PpargF/F) to prevent the differentiation of Wnt+ adipocytes. These new results are included in the revision as supplemental results (Figure 5-figure supplement 2G).

      • The finding that Akita mice lack Wnt+ adipocytes was used to determine whether these mice are susceptible to cold-induced challenges. The authors report a decrease in cold-induced UCP1 expression in these mice. This conclusion, derived from a single immunofluorescence image, is not fully convincing in the absence of additional metrics.

      Additional analyses are included in the revision, as Figure 5-figure supplement 3.

      • To further explore the role of Wnt+ adipocytes in systemic metabolism, the authors conduct implantation studies of Wnt+ adipocytes and measure effects on glucose tolerance. They show a significant difference in glucose excursions in mice harboring fat pads developed from Wnt+ adipocytes. These results are convincing, but the conclusion may be due to enhanced volume of additional functional fat developing from Wnt+ adipocytes.

      In this experiment, unbiased mBaSVF adipocytes were used in parallel as control.

    1. Author Response

      Reviewer #2 (Public Review):

      1. The manuscript seems to claim that the study shows that S4 is the voltage sensor and S4 moves in KCNQ2. This has been repeated in Abstract, Introduction and Results. However, by this time S4 movements as a voltage sensor are well accepted mechanisms. The importance of the work is actually that it defines parameters of the VSD movement in KCNQ2 such as the stretch of S4 in and out of the membrane, and the relationship between VSD activation and pore opening. These points should be brought out as the rationale and significance of this work, rather than the well-known S4 function.

      We thank Reviewer #2 for this important comment, which was also raised by Reviewer #3. We apologize for overemphasizing that the fourth transmembrane segment is the voltage sensor and that S4 moves in KCNQ2 channels. This might be the result of the authors’ past struggle to convince earlier reviewers that the fluorescence signals at a given position are not an experimental artifact but reflect S4 moving during channel opening. We are very happy to learn that this is now a well-accepted mechanism.

      In the revised version, we now state:

      Abstract: “Here, we define parameters of voltage sensor movements in wt-KCNQ2 and channels bearing epilepsy-causing mutations using cysteine accessibility and voltage clamp fluorometry (VCF).”

      Introduction: “Similar to that seen in other Kv channels, the fourth transmembrane segment contains several highly conserved positively charged amino acid residues that move in response to changes in membrane voltages that functions as the voltage sensor(25-28)[…]Although these studies provided insight into S4 rearrangements, they did not define parameters of S4 movement, such as the dynamic relationship between S4 activation and pore opening during voltage-controlled gating of KCNQ2 channels.”

      Results: We deleted: “Collectively, these close correlations in time (Figure 3) and voltage dependence (Figure 2C) of fluorescence and current suggest that the environmental changes around labeled F192C at the outer end of S4 rendered fluorescence signals that seem to report on S4 motion associated with the opening and closing of the channel gate.”

      And simply state: “The close correlations in time (Figure 3) and voltage dependences (Figure 2G) of S4 motion (fluorescence) and activation gate (ionic current) resemble those observed for homologous KCNQ1 (without KCNE1)(42) and KCNQ3 channels(41, 43)”

      We also rewrote in its entirety the subsection: “Disease-causing mutations differentially affect S4 and gate domains” (Pages 10-11).

      1. The closeness of fluorescence and current traces and FV and GV curves led to the conclusion that the movement of a single VSD could trigger channel opening. The rationale for connecting the experimental observations to this conclusion needs to be well explained when the conclusion is first made. References that have made similar arguments such as Osteen et al PNAS 2010; Westhoff et al PNAS 2019 should be cited. In addition, as the authors recognized in Discussion, the same observations can also lead to an alternative conclusion such that the movements of four VSDs highly cooperative to all activate and then open the pore. However, this alternative mechanism is not mentioned until at the end of the manuscript, while "the movement of a single VSD opening the pore" is firmly claimed in Abstract and Results. Some justifications need to be provided for this.

      Thank you for this important observation; the wording we used was clumsy. Since we removed the kinetic model (Figure 6 in the original manuscript), we have also deleted any sentences that discuss concerted or independent S4 movement from the Abstract and Results sections. We discuss only that these alternatives, concerted or independent S4 movement, might explain our VCF data, which show that both the steady-state voltage dependence and the kinetics of S4 transitions closely follow those of the ionic currents. Both references (Osteen et al., PNAS 2010 and Westhoff et al., PNAS 2019) have also been added, as recommended by the reviewer, and we apologize for overlooking them in the original manuscript.

      1. An explanation is needed for how the same covalent MTS modification of N190C at two voltages resulted in different GV relations (Fig 1E).

      Thank you for raising this important point. We have spent a good deal of time since receiving the reviews addressing it, as it was also raised as a concern by Reviewer #1. To that end, we have included additional data that support the idea that N190C channels are accessible in both the open and closed states. This is now addressed in Recommendations for the Authors, under the first Specific Suggestion from Reviewer #1 (see our response above, Pages 2-5).

      In the original submission, we only used the protocols shown in the old Figure 1. We applied MTSET only at +20 mV for the open state and -80 mV for the closed state. We used -100 mV and -120 mV for the closed state of A193C and S199C, respectively, because, compared to the wt channels, these cysteine mutants shifted the GV relationship to negative voltages.

      In the revised version, to further strengthen our conclusions, we have used a new protocol: For each cysteine mutant, we have designed a protocol in which we first apply MTSET at hyperpolarized voltages (closed) before switching to depolarized voltages (open) on the same cell, in a pairwise manner.

      This is now described in the Results subsection “State-dependent external S4 modifications consistent with S4 as voltage sensor”, Pages 6-8 of the revised manuscript, and in the new Figure 1 and Figure 1-figure supplements 3 and 4.

      We also apologize for the lack of clarity in citing reference 40 in the original submission. This reference is deleted in the revised version, in light of our new data on N190C (new Figure 1 and Figure 1-figure supplements 3 and 4), which strengthen our claim that N190C modification occurs in both states (open and closed).

      1. The model in Fig 6F raises several concerns including vertical transitions having the rates of VSD activation and detailed balance is violated.

      The reviewer raises an important concern about our original Figure 6F (model). Based on the Editor’s and reviewers’ comments, we have removed Figure 6 from the original manuscript to eliminate any potential misunderstanding about the data presented. In future studies, we will gather additional fluorescence and current data using different protocols and dimer constructs to provide a more in-depth description of KCNQ2 gating.

      1. Discussion. The argument of no intermediate open state based on K/Rb permeability ratio assumes that the pore properties such as ion selection and permeability of KCNQ2 are the same as that of KCNQ1. The evidence for this assumption is not provided or discussed. On the other hand, some evidence suggests that the VSD of KCNQ2 may activate in two steps. For instance, the time course of VSD activation can be fitted with two exponentials, and the fluorescence increases after a plateau at voltages > 0 mV in FV curves (Fig 2C). How these results affect the conclusion should be discussed.

      We agree with the reviewer that the claim of a lack of an intermediate open state in KCNQ2 channels, based on the Rb/K data provided in the original submission, assumed that the pore properties of KCNQ2 are the same as those seen in KCNQ1 channels. Since we did not show sufficient experimental evidence to prove this point, we have removed Figure 6 (the model) from the revised manuscript. In the future, we will provide more evidence to build stronger support for the potential existence of intermediate and active open states in KCNQ2 channels. Future studies will be performed to refine the KCNQ2 model, including the use of mutations that can lock S4 in the intermediate or activated states in KCNQ2, as has been performed in the KCNQ1 channel (Zaydman et al.; PMID: 25535795). These experiments will provide more conclusive results regarding the different S4 states.

      We have now re-analyzed the data and concluded that, while the time course of the fluorescence appeared to have multiple exponentials, our fluorescence data lacked sufficient resolution to reliably estimate the first (fast) component. This might be because of the low signal-to-noise ratio of our VCF and/or because the filtering might have limited the tau-on from the optical signal (shown to be 20 ms in Figure 3C of the original submission).

      As suggested by Reviewer #3, we have removed the kinetics comparison of fluorescence and current in the revised version of Figure 3, and simply state: “There is a close correlation between the time course of fluorescence signals and ionic currents at all the voltages tested (Figure 3B, D). The close correlations in time (Figure 3) and voltage dependences (Figure 2G) of S4 motion (fluorescence) and activation gate (ionic current) resemble those observed for homologous KCNQ1 (without KCNE1)(42) and KCNQ3 channels(41, 43).”

      As for the last part of the reviewer's comments, the apparent increase in fluorescence after a plateau at voltages > 0 mV has now also been re-examined. We attempted new VCF recordings at voltages more positive than +40 mV to probe whether a putative second fluorescence component develops after the plateau phase or whether it is just an artifact of the experimental system. To obtain reliable fluorescence signals, we need very high expression of labeled KCNQ2* channels (often producing currents larger than 100 µA). It is considerably more difficult to properly clamp these high-expressing cells, especially at extreme voltages. This experimental limitation makes it challenging to draw conclusions about the occurrence of a second fluorescence component. It may be possible to perform the cut-open technique coupled with VCF to shed light on this issue, but these experiments would require a significant upgrade of our setup that we currently do not have in place.

      Reviewer #3 (Public Review):

      1. I am convinced that the fluorescence signals reflect the voltage sensor conformation in the system. The authors focus quite a lot of attention on demonstrating that the fluorescence signals are not an experimental artifact, which is fine.

      We thank Reviewer #3 for this observation. We apologize for overemphasizing that the fluorescence signals reflect the voltage sensor conformation in the system. As stated above in response to a similar comment from Reviewer #1, this might be the result of the authors’ past struggle to convince earlier reviewers that the fluorescence signals at a given position are not an experimental artifact but reflect S4 moving during channel opening. This has been amended in the revised version.

      However, I feel the authors could be more cautious in terms of describing how the mutations or dye conjugation may alter some of the gating properties. A place where this may be very important is in the description or characterization of activation kinetics as lacking sigmoidicity, which is part of the argument that these channels may open with only a fraction of voltage sensors activated. This may be correct in the modified (dye-conjugated) channel recordings, but many other recordings of unmodified channels (Figure 1) or WT KCNQ2 or 3 channels exhibit some sigmoidicity. I wonder if this difference may arise because the dye labeling may prevent complete VSD deactivation or interfere with gating in some other way. I would also add that this comment isn't meant to diminish the importance of the findings, I just think it would be wise to qualify some of the description of data with these possible caveats.

      We thank the reviewer for this suggestion, which we believe improves the flow and description of data considering all possible limitations. The reviewer is right. The mutation F192C on its own accelerates the kinetics of activation and causes a leftward shift in the GV curve of KCNQ2 channels. Moreover, labeling F192C with either fluorophore further shifts the GV towards negative potentials.

      In the revised version, we have rewritten the Results subsection ‘Tracking S4 movement of KCNQ2 channels using voltage-clamp fluorometry (VCF)’ almost in its entirety. In this subsection, we now bring to the forefront the changes in gating properties caused by the mutations or dye conjugation, which we agree helps with data interpretation. We made a direct comparison of voltage dependence and kinetics between wt, unlabeled KCNQ2-F192C, and labeled KCNQ2-F192C channels (new Figure 2 and Figure 2-figure supplement 1).

      These differences are also discussed on Pages 12-13 of the revised manuscript. See also below response to Recommendations for the authors:

      1. A brief aside on this point is that a lack of sigmoidicity does not necessarily imply a single transition required for opening - it can also arise if there is a rate-limiting step during a sequence of pre-open transitions.

      Thanks, good point. We will keep this possibility in mind for future studies in which the model will be developed.
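      The reviewer's point about sigmoidicity can be illustrated with a small numerical sketch. It is not the authors' kinetic model: it assumes a hypothetical irreversible chain of closed states leading to one open state, with made-up rate constants (per ms), and compares a chain of three equally fast steps with a chain in which one step is much slower than the others. The ratio of the time to reach 10% open to the time to reach 50% open is used here as a crude sigmoidicity index; for a pure single exponential this ratio is ln(10/9)/ln(2), about 0.15.

```python
def open_probability(rates, t_end, dt=0.001):
    """Forward-Euler integration of an irreversible chain C0 -> C1 -> ... -> O.

    rates[j] is the (hypothetical) rate constant, per ms, for leaving state j.
    Returns (times, p_open), two lists of equal length.
    """
    p = [1.0] + [0.0] * len(rates)  # all channels start in C0
    times, p_open = [0.0], [0.0]
    for step in range(1, int(round(t_end / dt)) + 1):
        # Compute all fluxes from the current occupancies before updating.
        flux = [k * p[j] for j, k in enumerate(rates)]
        for j, f in enumerate(flux):
            p[j] -= f * dt
            p[j + 1] += f * dt
        times.append(step * dt)
        p_open.append(p[-1])
    return times, p_open


def first_crossing(times, p_open, level):
    """First simulated time at which the open probability reaches `level`."""
    for t, value in zip(times, p_open):
        if value >= level:
            return t
    raise ValueError("level not reached within the simulated window")


def delay_ratio(times, p_open):
    """t(10%) / t(50%): larger values indicate a more sigmoidal rise."""
    return first_crossing(times, p_open, 0.1) / first_crossing(times, p_open, 0.5)


if __name__ == "__main__":
    cases = [("three equal steps", [1.0, 1.0, 1.0]),
             ("one rate-limiting step", [50.0, 50.0, 1.0])]
    for label, rates in cases:
        t, p = open_probability(rates, t_end=10.0)
        print(label, round(delay_ratio(t, p), 2))
```

      Under these assumptions, the chain with one rate-limiting step rises almost like a single exponential even though three transitions precede opening, which is why a lack of sigmoidicity alone does not establish that a single transition opens the channel.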

      1. The generation of a quantitative model is a useful application of the data. It was not clear to me whether there was a benefit to using multiple-exponential components to fit the fluorescence signals and generate a more complex model. This may add complexity where it may not be necessary, as it is not clear whether the fluorescence signals require multiple components for an adequate fit.

      Thank you for your comment. We agree with the reviewer that our model is underdeveloped and needs additional VCF data to better describe KCNQ2 gating. Based on the concerns of all three reviewers, and as suggested by the Reviewing Editor in his summary, we removed the kinetic model from this manuscript and will work to refine it in our future studies.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper addresses the "origins and drivers of Neotropical diversity." The Neotropics have high diversity of plants and animals relative to other global regions. There are also many hotspots of global biodiversity (species richness) within the Neotropics.

      This paper aggregates 150 time-calibrated phylogenies from different groups of plants and animals that occur predominantly in the Neotropics. They analyze the diversification dynamics of these clades over time primarily using the method of Morlon et al. (2011; PNAS) as implemented in RPANDA (Morlon et al. 2016). The authors find that most clades have constant rates of speciation and extinction over time.

      Thank you for having reviewed our study and for your feedback.

      The strength of the paper is that it aggregates many previously published phylogenies of Neotropical organisms. However, it is unclear whether the method used gives meaningful inferences about diversification dynamics over time (e.g. Burin et al. 2019; Syst. Biol.). Therefore, the overall contribution of the study is somewhat questionable.

      This is a legitimate comment, and we understand the skepticism on a study that relies on macroevolutionary models of questionable robustness (e.g. Kubo & Iwasa 1995 - Evolution; Rabosky & Lovette 2008 - Evolution; Crisp & Cook 2009 - Evolution; Quental & Marshall 2010 - TREE; Burin et al. 2019 - Syst. Biol.; Louca & Pennell 2020 - Nature; Pannetier et al. 2021 - Evolution).

      The methodology used here has been thoroughly tested with both simulations (e.g. Morlon et al. 2011 - PNAS; Lewitus & Morlon 2018 - Syst. Biol.; Condamine et al. 2019 - Ecol. Lett.) and empirical cases (e.g. Lewitus et al. 2018 - Nat. Ecol. Evol.; Condamine et al. 2019 - Ecol. Lett.). We cannot claim that such a methodology is fully free from the issues that affect all birth-death models, which raises the question: are we able to reliably infer the diversification model and identify the parameter values of this model (Louca & Pennell 2020 - Nature)? These concerns are not likely to be resolved in the short term, although many studies are making progress in understanding the behavior of diversification rate functions, showing, for example, that equally likely diversification functions (i.e. the congruent parameter space of Louca & Pennell 2020 - Nature) can share common features, with diversification rate patterns being robust despite non-identifiability (Höhna et al., 2022 - bioRxiv; Morlon et al., 2022 - TREE).

      Being aware of these concerns, we also relied on the recently developed Pulled Diversification Rates method (Louca & Pennell 2020 – Nature; Louca et al., 2018 - PNAS) that is supposed to correct for the identifiability issue raised by recent studies. Hence, applying both traditional and pulled birth-death models to all phylogenies, we have shown a good consistency in the inferred models, which suggests that our study can provide meaningful estimates of diversification. Our empirical study is also one of the first to perform such a large-scale methodological comparison in diversification analyses (pulled vs. traditional birth-death models) while addressing a key question in evolutionary biology. We have now emphasized this point in the conclusions of our study: “To the extent possible, these results are based on traditional diversification rates, and on the recently developed Pulled Diversification Rates method that is supposed to correct for the identifiability issue raised by recent studies associated with traditional diversification rates (71). Hence, applying both traditional and pulled birth-death models to all phylogenies, we have shown a good consistency in the inferred models, which suggests that our study can provide meaningful estimates of diversification”.

      The design of the study is also somewhat problematic. There is no comparison to other regions outside the Neotropics, so the study cannot address why the Neotropics are so diverse relative to other continental regions. Similarly, within the Neotropics, the authors do not find significant differences in diversification rates or dynamics among regions. As far as I can tell, they do not attempt to relate patterns of diversification to patterns of species richness among regions within the Neotropics (and presumably they would find no significant patterns if they did).

      We agree with this remark. We are sorry for this confusion. Our study does not aim at addressing why the Neotropics are more diverse than other regions in the world. We simply wanted to establish that the Neotropics are the richest region in the world based on previous studies, and that we are interested in understanding what are the patterns/drivers behind such a diversity. In the Introduction, we state that such diversity is not evenly distributed within the Neotropics, and that some regions are richer (e.g. Andes) than others (e.g. southern cone of South America). Diversity models, from Stebbins (1974), have long been proposed to explain this unbalanced diversity. Our study has then defined different bioregions within the Neotropics in which we have looked for differences in diversification patterns. In other words, we do “attempt to relate patterns of diversification to patterns of species richness among regions within the Neotropics”, although we were not able to explain the observed differences in species richness by differences in diversification dynamics (i.e. diversification dynamics are similar across regions). Please, see our response to the essential revision point 1 addressing this comment.

      In the revised version, we have changed the title of the study to: “Diversification dynamics of plants and tetrapods in the Neotropics through time, clades and biogeographic regions”. We hope you will find that this new title better fits the content of the article. In addition, to avoid any confusion in light of your comment, we have deleted the following sentence from the introduction: “But such an assessment is required to understand the origin of Neotropical diversity and why the Neotropics are more diverse than other regions in the world”.

      The authors set up their study by claiming that most previous attempts to explain Neotropical diversity relied on two evolutionary models: cradles vs. museums of diversity. The justification cited for this thinking comes mostly from papers from the last century or before. I do not think that this represents the cutting edge of modern thinking about this topic. Many researchers moved on from this dichotomy long ago.

      Thank you for this interesting comment. You are right. The cradle and museum models of diversity are indeed old definitions (Stebbins 1974 - Flowering Plants: Evolution Above the Species Level), but they were convenient to formulate clear and testable hypotheses on the processes underlying the observed patterns of diversity that Stebbins described. We agree that Stebbins’ view is likely outdated, and that is why we took advantage of these models to draw a series of hypotheses relying on evolutionary processes, which has been argued as a “cutting edge of modern thinking about this topic” (Vasconcelos et al. 2022 - Am. Nat.). In the revised version, we have extended the explanation for our rationale to rely on Stebbins’ models and propose process-based hypotheses to explain diversity patterns. We also cite Vasconcelos et al. (2022 - Am. Nat.). We have modified the introduction as follows: “Although the concepts of cradle and museum have contributed to stimulate numerous macroevolutionary studies, a major interest is now focused on the evolutionary processes at play rather than the diversity patterns themselves (23). Four alternative evolutionary trajectories of diversity dynamics could be hypothesized to explain the Neotropical diversity observed today: …”.

      However, we would also argue that some contemporary studies still rely on the cradle and museum framework, for example: McKenna et al. (2006 - PNAS), Couvreur et al. (2011 - BMC Biol.), Condamine et al. (2012 - BMC Evol. Biol.), Moreau & Bell (2013 - Evolution), Dornburg et al. (2017 - Nat. Ecol. Evol.). A search in Google Scholar with "Neotropic AND cradle AND diversif*" returns 1,700 results since 2010. That is why we would like to emphasize that this framework should be abandoned, because it does not rely on evolutionary processes and does not consider the full spectrum of hypotheses explaining Neotropical diversity. In the revised version, we have qualified our assertion that most studies are based on these models, which we agree is not entirely true. We have modified the corresponding paragraph as follows: “Attempts to explain Neotropical diversity traditionally relied on two evolutionary models. In the first, tropical regions are described as a “cradle of diversity”, [...] Although not mutually exclusive (15), the cradle vs. museum hypotheses primarily assume evolutionary scenarios in which diversity expands through time without limits (16). However, expanding diversity models may be limited in their ability to explain the entirety of the diversification phenomenon in the Neotropics. For example, expanding diversity models cannot explain the occurrence of ancient and species-poor lineages in the Neotropics (17–19) or the decline of diversity observed in the Neotropical fossil record (20–22). Although the concepts of cradle and museum have contributed to stimulate many macroevolutionary studies, the major interest is now focused on the evolutionary processes at play rather than the diversity pattern (23)”. We hope you will find that this new paragraph better represents current thinking in the field.

      There are potentially interesting differences in the diversification dynamics of plants and animals, but this depends on whether we can believe the inferences of the diversification dynamics or not.

      Thank you for pointing this out. We understand the concern because of the general (not new) skepticism on macroevolutionary models (e.g. Kubo & Iwasa 1995 - Evolution; Rabosky & Lovette 2008 - Evolution; Burin et al. 2019 - Syst. Biol.; Louca & Pennell 2020 - Nature; Pannetier et al. 2021 - Evolution). Unfortunately, the study of PDR did not help to confirm/reject this particular conclusion.

      We thus remain cautious with our results, and we have acknowledged several caveats that should be kept in mind when interpreting them. Here, the same methodological treatment has been applied to both animals and plants, and yet the results indeed indicate different diversification patterns. In addition, our results remained stable to AIC variations (Figure 5 - figure supplement 1), and regardless of the paleo-temperature curve considered for the analyses. Still, we do not “believe” the inferences made with birth-death models in general are accurate, but as long as these models are applied in a well-defined framework and thoroughly performed with a hypothesis-driven approach, recent studies have shown that one can interpret the results and draw conclusions (Helmstetter et al. 2021 - Syst. Biol.; Morlon et al. 2022 - TREE).
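      For readers unfamiliar with AIC-based model comparison, the following is a minimal sketch of how candidate diversification models could be ranked. It is not the authors' pipeline: the model names, log-likelihoods and parameter counts are invented for illustration, and the choice of sample size for the small-sample correction (AICc) is an assumption that varies between studies. The formulas themselves are the standard ones: AIC = 2k - 2 ln L and AICc = AIC + 2k(k + 1)/(n - k - 1).

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def aicc(log_likelihood, n_params, n_obs):
    """AIC with the standard small-sample correction.

    n_obs is the sample size; what counts as a sample for a phylogeny
    (e.g. the number of tips) is an assumption of this sketch.
    """
    correction = (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return aic(log_likelihood, n_params) + correction


def akaike_weights(scores):
    """Convert a {model: AIC} mapping into relative model weights."""
    best = min(scores.values())
    relative = {name: math.exp(-0.5 * (s - best)) for name, s in scores.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}


if __name__ == "__main__":
    # Hypothetical fits for one clade: (log-likelihood, number of parameters).
    fits = {"constant": (-100.0, 1),
            "time-dependent": (-99.5, 2),
            "temperature-dependent": (-97.0, 2)}
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    for name, weight in sorted(akaike_weights(scores).items()):
        print(name, round(weight, 3))
```

      Checking whether the preferred model changes when a stricter AIC difference is required before a more complex model is accepted is one way to probe the stability of such a ranking.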

      For this new version of the manuscript, and following the suggestions of reviewer 3, we have conducted new analyses to assess whether the contrasted diversification dynamics found here between plants and tetrapods could be explained by differences in their datasets (i.e. differences in tree size, crown age, or sampling fraction of the phylogenies). We found that the higher proportion of increasing dynamics observed in plants cannot be explained by significant differences in these factors, strengthening our conclusions.

      Reviewer #2 (Public Review):

      In this study, the authors explored the evolution dynamics of Neotropical biodiversity by analyzing a very large data set, 150 phylogenies of seed plants and tetrapods. Furthermore, they compared diversification models with environment-dependent diversification models to seek potential drivers. Lastly, they evaluated the evolutionary scenarios across biogeographic regions and taxonomic groups. They found that most of the clades were supported by the expansion model and fewer were supported by saturation and declining models. The diversity dynamics do not differ across regions but differ substantially across taxa. The data set they compared is impressive and comprehensive, and the analysis is rigorous. The results broadened our understanding of the evolutionary history of the Neotropical biodiversity which is the richest in the world. It will attract broad interest to evolutionary biologists as well as the public interested in biodiversity.

      Thank you very much for your review and the positive input.

      Reviewer #3 (Public Review):

      This manuscript seeks to address a series of questions about lineage diversification in the Neotropics. The authors first fit a range of lineage diversification models to over 150 neotropical seed plant and tetrapod phylogenies to characterize diversification dynamics. Their work indicates that a constant diversification model was most frequently the best fit model, while time-, temperature- and Andean uplift-dependent models were far less frequently favored. The authors then attempted to determine whether distinct biogeographic clusters existed by using clade abundance patterns as a proxy for long-term diversification within regions. They found that while clades were widespread across ecoregions, regional assemblages could be binned into five clusters reflecting clade endemism. Finally, they asked whether diversification dynamics of individual lineages varied by parent clade, by environment (temperature through time, and Andean uplift) and by biogeographic region, finding that diversity trajectories were best explained by environmental drivers and parent clade identity, while no significant association was detected with biogeographic region. I especially appreciated the detailed model-testing procedure, the inclusion of pulled rates, tests for phylogenetic signal in the results, and the acknowledgment of caveats. By using a massive dataset and a battery of cutting-edge analyses, the authors provide new insight into questions that have intrigued biologists for decades.

      Thank you for reviewing our study and for your positive feedback.

      1. The neotropics, as defined here, extends from Tierra del Fuego to Central Florida, rather than from the Tropic of Cancer-Capricorn. I was confused by this broad circumscription, and wondered whether the findings presented here could be biased by the inclusion of these exclusively or primarily extra-tropical regions (such as "elsewhere" and "Chaco+Temperate south America") and lineages.

      Thank you for this comment, which is also in line with the second comment of Reviewer 1. We understand the confusion. The Neotropics, as originally defined by Alfred Wallace, represent a broad region including many types of ecosystems and biomes (not only tropical ones): i.e. the Neotropical realm. It also has a paleobiogeographic significance, as the whole South American continent was isolated for tens of millions of years (Simpson 1983). This definition is well accepted in the field of biogeography and evolutionary biology and we followed it to avoid adding a new definition. A Google Scholar search with keywords “Neotropic AND phylogen AND diversificat*” returns >24,000 hits. Our biogeo-regionalization and clustering results also corroborate the strong connection between South American temperate and tropical biotas: very few clades were restricted or exclusive to a single region, and in most cases, clades comprised species from tropical regions (Cerrado, Caatinga) together with species from the temperate South America zones (Chaco, Temperate South America; Figure 6, Source Data 1).

      That being said, we did not find significant differences in diversification rates (or diversity dynamics) across temperate and tropical regions (indeed, between any regions), even when temperate regions were analyzed separately (Figure 6 - figure supplement 2), suggesting that our results would have been similar if we had confined the Neotropics to tropical latitudes, as in a more climatic circumscription. However, had we restricted the Neotropics to the tropical latitudes, many of the 150 clades would not have been selected. Our study would then have offered fewer insights into the diversification processes explaining Neotropical biodiversity in the broad sense.

      2. Model categories and clade diversification dynamics were also linked to the size and age of the phylogeny, such that small and young clades tended to exhibit constant diversification, while exponential and declining dynamics were linked to more diverse and older clades. As one of the main conclusions is that seed plant diversification is more frequently characterized by constant diversification (relative to that of tetrapods), I cannot help but wonder if seed plant phylogenies tend to also be younger and less diverse than those of tetrapods. Figure S1 shows an overview of the distributions but lacks a formal, statistical comparison.

      This is a very good point. We agree this comparison is relevant to support our conclusions, but it was missing from our results. We have now compared tree size, crown age and sampling fraction across taxonomic groups, and found that the higher proportion of increasing dynamics, characteristic of plants, cannot be explained by significant differences in these factors. As can be seen in the new Figure 2 - figure supplement 2 of the manuscript, tree size does not differ among plants, mammals, birds and squamates. Crown age does not differ among plants, mammals and birds. Groups do differ in sampling fraction: plant (p < 0.01) and squamate (p < 0) phylogenies are significantly worse sampled than the phylogenies of other groups. Yet plants show a higher frequency of increasing dynamics than squamates and other tetrapods (Figure 4). Incomplete taxon sampling has the effect of flattening out lineages-through-time plots towards the present, and thus artificially increasing the detection of diversification slowdowns rather than diversification increases (Cusimano & Renner 2010 – Syst. Biol.).

      We have included this important piece of information in the results: “In our dataset, amphibian phylogenies are significantly larger than those of other clades (p < 0.05) (Figure 2 - figure supplement 2). Amphibian and squamate phylogenies are also significantly older (p < 0). Groups also differ in sampling fraction: plant (p < 0.01) and squamate (p < 0) phylogenies are significantly worse sampled than phylogenies of other groups.”; and in the discussion section: “Differences in the phylogenetic composition of the plant and tetrapod datasets do not explain this contrasted pattern. On average, plant phylogenies are not significantly younger or species-poorer than tetrapod phylogenies (Figure 2 - figure supplement 2). Yet, the proportion of clades experiencing increasing dynamics is significantly higher for plants (Figure 4). Plant phylogenies are significantly worse sampled than those of most other tetrapods, though, as explained above, incomplete taxon sampling has the opposite effect: flattening out lineages-through-time plots towards the present (83).”

      3. I wondered whether it was possible to disentangle time-dependent decreasing diversification from decreasing temperature in young trees? I raise this because it appears that (generally speaking) most of the clades have diversified over periods in which temperature has generally been declining.

      This is also a very good point. It is common to observe that two different models are equally likely or close in terms of statistical support. Previously, Condamine et al. (2019 - Ecol. Lett.) reported that the ΔAIC between the best and second-best diversification model was often below the threshold of 2, which is typically chosen to statistically distinguish models (see Fig. 3 and Fig. S5 in Condamine et al. 2019). Simulation analyses confirmed that it was not enough to distinguish the best and second-best models with confidence (see Fig. S6 in Condamine et al. 2019). This applies to any kind of clade.

      However, in the case of time-dependent decreasing diversification and temperature-dependent decreasing diversification, one can further test the effect of past temperatures by smoothing the temperature curve more strongly, so that its ups and downs are removed. Previously, Condamine et al. (2019 - Ecol. Lett.) found that smoothing strongly decreased the support for temperature-dependent models (Fig. S13a) to the point where it was lost (Fig. S13b), showing that the support for temperature-dependent models was not simply due to a temporal trend in diversification rates potentially unlinked to temperature.
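      The ΔAIC criterion discussed above can be illustrated with a minimal sketch. The model names and AIC values below are hypothetical and are not taken from the manuscript; the cutoff of 2 is the conventional threshold mentioned in the response, and treating a difference of exactly 2 as distinguishable is an assumption made here for illustration.

```python
def delta_aic(aic_values):
    """Return (best model, second-best model, AIC difference between them)."""
    ranked = sorted(aic_values.items(), key=lambda item: item[1])
    (best_name, best_aic), (second_name, second_aic) = ranked[0], ranked[1]
    return best_name, second_name, second_aic - best_aic


def is_distinguishable(aic_values, threshold=2.0):
    """True if the best model beats the runner-up by at least `threshold` AIC units."""
    return delta_aic(aic_values)[2] >= threshold


# Hypothetical AIC scores for one clade (illustrative values only).
example = {"constant": 100.0, "time-dependent": 101.2, "temperature-dependent": 104.5}
best, second, gap = delta_aic(example)
```

      With these illustrative numbers the two top models are separated by less than 2 AIC units, which is the situation the response describes as too close to call with confidence.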

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Major comments:

      Are the key conclusions convincing?

      We discuss 4 key conclusions.

      # 1 A PRC of the segmentation clock was constructed.

      Although the authors have produced an interesting phase map, the regulation function F(\phi) of the circle map does not give the phase response curve (PRC) (Hoppensteadt & Keener 1982, Guevara & Glass 1982). This holds only when the system is stimulated with very short pulses (ideally Dirac delta), but the experimental pulses here are a quarter of the intrinsic period.

      There are several definitions of the PRC (Dirac pulse PRCs, linear PRCs, etc.). We use the general definition from Izhikevich, 2007: “In contrast to the common folklore, the function PRC (θ) can be measured for an arbitrary stimulus, not necessarily weak or brief. The only caveat is that to measure the new phase of oscillation perturbed by a stimulus, we must wait long enough for transients to subside“.

      The corresponding equation from Izhikevich (section 10.1.3) is

      PRC(θ) = θ_new - θ

      which is equivalent to our Equation 1.

      Hence, the key assumption we make is that after perturbing the system, we are back on the limit cycle, as pointed out by Izhikevich. We think this is a reasonable assumption, because the perturbation we impose is relatively weak, despite pulsing for almost one quarter of the intrinsic period. The concentrations of DAPT we used in this current study are just enough to elicit a measurable response, and further lowering the concentration does not result in entrainment within our experiment time (0.5uM, Figure S7B in submitted version of the manuscript). Additionally, we previously reported that periodic pulsing with 2uM DAPT did not result in a change of the Notch signaling activity with respect to control samples (Sonnen et al., 2018). Along similar lines, the DAPT drug concentrations we used are much lower than those used in previous studies aiming to perturb signaling levels, e.g. 100uM and 50uM used in studies of the segmentation clock in zebrafish embryos (Özbudak and Lewis, 2008 and Liao et al., 2016, respectively), and 25uM used in a study of the segmentation clock in mouse PSM cells (Hubaud et al., 2017). Combined, we reason that we apply weak perturbations that allow us to extract the PRC of the segmentation clock during entrainment. Additional evidence that we have indeed revealed a meaningful PRC is provided below, please see our response to point #3.
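      As an illustration of the definition PRC(θ) = θ_new - θ quoted above, the following minimal sketch computes a phase shift in units of phase. The function names, the wrapping interval [-π, π) and the numerical values are assumptions made for this example; this is not the analysis pipeline of the manuscript.

```python
import math

TWO_PI = 2.0 * math.pi


def time_to_phase(t, period):
    """Convert a time within a cycle to a phase in radians, in [0, 2*pi)."""
    return (TWO_PI * t / period) % TWO_PI


def wrap_phase(x):
    """Map an angle in radians to the interval [-pi, pi)."""
    return (x + math.pi) % TWO_PI - math.pi


def phase_response(theta_old, theta_new):
    """Phase shift theta_new - theta_old, wrapped so that small advances
    and small delays across the 0 / 2*pi boundary stay small."""
    return wrap_phase(theta_new - theta_old)


# Illustrative values only: a quarter of a 140 min cycle is a phase of pi/2.
quarter_cycle = time_to_phase(35.0, 140.0)
small_advance = phase_response(6.2, 0.1)   # crosses the 2*pi boundary
small_delay = phase_response(0.1, 6.2)
```

      Wrapping the difference matters because phases live on a circle: without it, a small advance across the 0 / 2π boundary would be misread as a delay of almost a full cycle.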

      # 2 Furthermore, in eq. 1 T_ext must be the winding number, and the modulus must be in units of phase, either one or two pi, for the circle map to be correct. Thus, calling the measured response of the system a PRC is not convincing.

      We thank the reviewer for pointing this out. We indeed rescaled everything to express the PRC in units of phase. We made this more explicit and updated equations throughout the text.

      # 3 The system is being entrained. Technically, it would also be easier to get the stroboscopic maps in the quasi-periodic regime since all the points in the circle will be sampled. Since no quasi-periodic response was demonstrated, the claim of entrainment is not convincing.

      While, in principle, the PRC can indeed be obtained from responses in the “quasi-periodic” regime, such an approach is, in practice, challenging due to the intrinsic noise. The closest approximation to this is the phase response after the first pulse, which we reproduce below and compare to our inferred PRC, where we indeed clearly see a high noise level. Nevertheless, the PRC based on the first pulse is also in agreement with the PRC we derived from the entrainment data.

      In the entrained regime, one can get a much more reliable estimate of the phase response despite the noise. The level of noise in the stroboscopic map lowers as the samples approach entrainment (Figure S12), and the entrainment phase itself is a reliable statistical quantity that can be used to infer regions of the PRC as the detuning is varied.

      In addition, and maybe even more importantly, we identify several key features characteristic of entrainment, such as the change of entrainment phase as a function of detuning (Figure 7, Figure S6-S7 in submitted version of the manuscript) and the dependency of the time to entrainment on the initial phase (Figure 6). While additional features can, in theory, be linked with entrainment, i.e. period-doubling, higher harmonics (Figure 5), quasi-periodicity, we do not agree with the reviewer that all of these need to be, or in fact can be, found in the experimental data, in particular because of the influence of the noise. Conversely, the positive experimental evidence that we provide for the presence of entrainment, combined with the theoretical framework we develop, justifies, in our view, the conclusions we make.
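      The link between entrainment phase and detuning described above can be sketched with a toy stroboscopic map. The PRC used here, -strength * (1 - cos φ), is a hypothetical constant-sign curve chosen only for illustration and is not the PRC fitted in the manuscript; the periods and the strength value are likewise assumptions.

```python
import math

TWO_PI = 2.0 * math.pi


def toy_prc(phi, strength):
    """Hypothetical constant-sign PRC: never positive, zero only at phi = 0."""
    return -strength * (1.0 - math.cos(phi))


def stroboscopic_step(phi, t_zeit, t_osc, strength):
    """Oscillator phase at the next pulse, given its phase at the current pulse."""
    free_run = TWO_PI * t_zeit / t_osc  # phase advance over one zeitgeber period
    return (phi + toy_prc(phi, strength) + free_run) % TWO_PI


def entrainment_phase(phi0, t_zeit, t_osc, strength, n_pulses=200):
    """Iterate the map; the result is meaningful only if the map converges."""
    phi = phi0
    for _ in range(n_pulses):
        phi = stroboscopic_step(phi, t_zeit, t_osc, strength)
    return phi


# Illustrative numbers only: 140 min intrinsic period, two zeitgeber periods.
phase_150 = entrainment_phase(0.5, 150.0, 140.0, 1.0)
phase_170 = entrainment_phase(0.5, 170.0, 140.0, 1.0)
```

      At a 1:1 entrained fixed point φ*, the phase shift must cancel the detuning, PRC(φ*) = 2π(1 - T_zeit/T_osc) modulo 2π, so in this toy model each zeitgeber period probes one point of the PRC and the entrainment phase moves as the detuning is varied.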

      # 4 The response of the system to external pulses is compatible with a SNIC. This is compatible, but it is equally compatible with other explanations. Assuming that the PRC is the same as the regulation function F(\phi), the PRC in Kotani 2012 (PRL 2012 fig. 3C) would be a similar shape as that shown by the authors. Similar models to that in Kotani et al., have been studied, but a SNIC has not been found (an der Heiden & Mackey 1982). It is relatively straightforward to construct a phenomenological model with a SNIC, but having underlying biological insight is not guaranteed. No argument for choosing a SNIC is given, so this emphasis of the paper is not convincing.

      It is true that the mapping of PRCs to oscillators is undetermined, in the sense that many systems could potentially give rise to similar PRCs. That said, there is value in parsimonious models, which often generalize very well despite their simplicity. This explains why in neuroscience, constant sign PRCs are generally associated with SNIC. There is a mathematical reason for this: 1-D oscillators with resetting (such as the quadratic integrate-and-fire model) are the simplest models displaying constant sign PRCs, and are the “normal” form for SNICs. In other words, SNIC bifurcations are among the simplest ones compatible with constant sign PRCs, and we think it is informative to point this out. In our manuscript, we go one step further by actually fitting the experimental PRC with a simple, analytical model that allows us to compute Arnold tongues for any value of the perturbation (contrary to more complex models).

      Other models such as Kotani 2012 can display similar PRC shapes, but they are of mathematically higher complexity, and furthermore it is not clear how such systems might behave when entrained. For instance, that model in particular uses delayed differential equations, and as such contains long term couplings, so that a perturbation might have effects over many cycles, which is not consistent with the hypothesis we make here of a relatively rapid return to the limit cycle. Furthermore, for more complex models, PRCs are analytical only in the linear regime, while our model is analytical for all perturbations. That said, we agree that other types of oscillators can be associated with constant sign PRCs, and we have given more details in this part. In particular, we better emphasize the Class I vs Class II oscillators as a way to broaden our discussion on PRC, and emphasize the “infinite period” bifurcation category, which is more intuitive and further includes saddle node homoclinic bifurcations.
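      For readers less familiar with the argument, the constant-sign PRC of the quadratic integrate-and-fire oscillator mentioned above follows from a short textbook calculation. The sketch below uses the standard normal form with an input parameter I and an infinitesimal voltage perturbation; these symbols are assumptions of this illustration and do not correspond to the model fitted in the manuscript, and the sign of the resulting curve depends on the direction of the perturbation.

```latex
\begin{align}
\dot V &= V^2 + I, \quad I > 0, \qquad V \to +\infty \ \text{is reset to}\ V = -\infty, \\
V(t) &= \sqrt{I}\,\tan\!\left(\sqrt{I}\,t - \tfrac{\pi}{2}\right), \qquad T = \frac{\pi}{\sqrt{I}}, \\
Z(t) &= \frac{1}{\dot V(t)} = \frac{1}{V(t)^2 + I} = \frac{\sin^2\!\left(\sqrt{I}\,t\right)}{I}
      = \frac{1 - \cos\phi}{2I}, \qquad \phi = \frac{2\pi t}{T}.
\end{align}
```

      Here Z(t) is the time advance per unit perturbation of V. It never changes sign, and the period T diverges as I approaches 0, which is the “infinite period” behavior associated with a SNIC.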

      # 5 The work demonstrates coarse graining of complex systems.

      This conclusion is correct, but coarse graining theory-driven analysis and control of dynamical systems has been established for many years. What is new here is that it is applied specifically to the in vitro culture system of the mouse segmentation clock.

      We agree it is new to successfully apply coarse-graining analysis and, importantly, control, to the in vitro culture system of the mouse segmentation clock. We also agree that such an approach has been pioneered and established for many years, especially in (theoretical) physics, but indeed, the key question is whether and how this can be applied to complex biological systems. Insights coming from theoretical considerations on idealized physical systems might not necessarily apply to biology, as already pointed out by Winfree.

      There are still very few examples in biology with coarse graining similar to what we do here. We think there is immense value in demonstrating that quantitative insights, and control of biological systems, can be obtained without precise knowledge of molecular details, which is still counter-intuitive to many biologists. In this sense, we think our report will be of interest both to colleagues within the field of the segmentation clock and to anyone interested in the question of how theory- and physics-guided approaches can enable novel insight into biological complexity.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      Following on the points above, each of these needs to be corrected or re-done, and/or the conclusions need to be modified accordingly.

      We have modified the manuscript in response to all those points.

      # 6 Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. If the authors wish to make the strong claim of determining a true PRC, Dirac delta-like perturbation needs to be applied, or approximated by short time duration pulses compared to the intrinsic period.

      Please refer to our responses to points #1 and #3.

      # 7 Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      It's not clear to this reviewer if it is feasible to deliver a very short pulse and record a response. But this may not be relevant, see above.

      Please refer to our responses to points #1 and #3.

      Are the data and the methods presented in such a way that they can be reproduced?

      Yes.

      Are the experiments adequately replicated and statistical analysis adequate?

      Yes.

      Minor comments:

      Specific experimental issues that are easily addressable.

      No issues.

      Are prior studies referenced appropriately?

      Yes.

      # 8 Are the text and figures clear and accurate?

      Figure 1D illustrates how a PRC should be obtained, but doesn't show the experimental protocol applied in the paper.

      Figure 1D is a general introduction on the phase description of oscillators and phase response. It demonstrates how a perturbation can change the phase and is not supposed to represent the experimental protocol. We describe how data are analyzed and how phases are extracted in Supplementary Note 1.

      # 9 In Figure 5B, 10 uM DAPT, the traces are already synchronized before the pulse train starts, which makes the subsequent behavior difficult to interpret.

      It appears here that, by chance, the samples were already almost synchronized. We note, however, that the establishment of a stable rhythm with the pulses (which here is not a multiple of the natural period) supports entrainment, and is already evident when looking at the timeseries with respect to the perturbation. The temporal evolution of the instantaneous period further confirms this, showing a change in period close to ½ zeitgeber period (which is very different from the natural period of ~140 mins). This also relates to point #35. In reply to both comments, we have further expanded this figure to better show the 2:1 entrainment, adding statistics on the measured period and period evolution for a zeitgeber period of 300 mins.

      # 10 Do you have suggestions that would help the authors improve the presentation of their data and Conclusions? The text includes several paragraphs reviewing broad principles of coarse graining and making general conclusions. This is confusing, because, as mentioned above, there is no new general advance in this paper. The interesting contributions here are specific to the applications to the segmentation clock, and the text should be focused on this aspect.

      As commented above for #3, we respectfully disagree that there is no “new general advance” in this paper. It is far from obvious that a complex ensemble of coupled oscillators implicated in embryonic development would be amenable to such coarse-graining theory. Of note, we still do not have a full understanding of either the core oscillators in individual cells or what slows these down and eventually stops the oscillations, and multiple recent works suggest that both phenomena are under transient nonlinear control (e.g. our own work in Lauschke 2013). It is remarkable that despite this lack of detailed mechanistic insight, general entrainment theory can be applied to the segmentation process at the tissue level. We further show that classical entrainment theory alone is not sufficient to account for the experimental findings. Specifically, we need to account for a period change that we interpret as an internal feedback, an insight that would be impossible without our coarse-graining approach. While the results might of course be specific to the segmentation process, we think our approach motivated by coarse-graining theory and leading to new insights into the process is of general interest. We tried to make these points explicit in our conclusion.

      Reviewer #1 (Significance (Required)):

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Description of the complex mouse segmentation clock in terms of a simple model and its PRC is an interesting, original and non-trivial result. The proposal that the segmentation clock is close to a SNIC bifurcation provides a consistent dynamical explanation of slowing behavior that has been recognized for some time, but not fully understood. This proposal also raises a hypothesis about the behavior of the underlying molecular regulatory networks, which may be tested in the future. The increase or decrease of the intrinsic period due to the zeitgeber period is not expected from theory, pointing to structures in internal biochemical feedback loops, an idea which again may be tested in the future. Also surprising from a theoretical perspective, the spatial gradient of period in the system persisted after entrainment. Although the categorization of the generic behavior is interesting, by its nature there is little from this that might give a typical developmental biologist any conclusions about pathways or molecules. The successes and limits of the theoretical description do nevertheless focus future attention on interesting behaviors.

      # 11 Place the work in the context of the existing literature (provide references, where appropriate).

      Such an analysis of the segmentation clock is based strongly on the experimental system and results in Sonnen et al., 2018, and goes well beyond it in terms of the dynamical analysis. It provisionally categorizes the mouse segmentation clock as a Class I excitable system, allowing its dynamics at a coarse grained level to be compared to other oscillatory systems. In this aspect of simplification, it is similar to the approach of Riedel-Kruse et al., 2007, who used a mean-field model of oscillator coupling to explain the synchrony dynamics observed in the zebrafish segmentation clock in response to blockade of coupling pathways, thereby allowing a high-level comparison to other synchronizing systems.

      It is interesting that the reviewer sees similarities with the work of Riedel-Kruse et al., which uses a mean-field variable Z that corresponds to a classical approach, as described in Pikovsky’s textbook, to quantify synchronization of oscillators. In our view, while of course we work in the same context of coupled oscillators in the PSM, our approach based on perturbing and monitoring the system’s PRC in real-time provides a novel strategy to gain insight. This is evidenced by the fact that our quantifications of synchronization and insight into the PRC are the basis for exerting precise control of the pace and rhythm of segmentation.

      State what audience might be interested in and influenced by the reported findings.

      Developmental biologists, biophysicists

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Developmental biology, somitogenesis, dynamical systems theory, biophysics, cell signaling


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: This is a beautifully elegant study that tests how previously published theoretical predictions about entraining nonlinear oscillators apply to a biological oscillator, the segmentation clock. The authors use a combination of state-of-the-art experimental techniques, signal processing and analytical theory to reach a series of interesting and novel conclusions.

      They show that the segmentation clock period can be entrained through Notch inhibitor (DAPT) pulses acting as an external clock (referred to as zeitgeber) using a previously developed and sophisticated microfluidic perfusion system. Pulsing DAPT every 120 to 180min can change the internal clock period while entrainment beyond this range leads to higher order coupling to the zeitgeber period, i.e. entrainment of every other pulse. They then perform entrainment experiments where the concentration of DAPT is changed to elicit a change in the strength of interaction between the internal clock and the external stimulus (referred to as zeitgeber strength); interestingly at low strength response to entrainment is more variable leading to entrainment occurring in some samples while others remain unaffected (Figure 4A); overall, higher concentration leads to faster entrainment (Figure 4C). The experimental data is then analysed using stroboscopic maps to reveal that a stable entrainment phase shift is achieved between the internal clock and the external zeitgeber. Phase response curve (PRC) analysis indicates that the system response is not sinusoidal but predominantly characterised by negative PRC, a behaviour consistent with saddle-node on invariant cycle (SNIC); it also reveals that the intrinsic period changes in a non-linear way and that this effect is reversible when external stimulation stops. Finally, a theoretical model is proposed to represent the segmentation clock as a dynamical system; this is based upon Radial Isochron Cycle with Acceleration (ERICA), an extension motivated by the PRC analysis results which are incompatible with a Radial Isochron Cycle (RIC); this model has predictive capability and could be used to design new control strategies for entrainment of the segmentation clock.

      This study makes a series of key conclusions which are of particular importance in understanding the dynamic response of biological oscillators. Firstly, given the characteristics of the dynamic response to entrainment, the segmentation clock is likely close to a SNIC bifurcation and this can explain the tendency for relaxation of the period over time. Secondly, the clock period was changed in a non-linear way in the direction of the zeitgeber period, a finding which is interpreted to indicate the presence of feedback of the segmentation clock onto itself, potentially via Wnt. This makes an excellent prediction that, if tested experimentally, would greatly improve the impact of the study. It is also noted that the entrainment of the segmentation clock does not abolish spatial periodicity and phase wave emergence, suggesting that single cell oscillators can adjust to periodic perturbation while maintaining emergent properties. This is also a significant result that would need to be followed up with experiments and computation; however, this would be best suited to a separate study.

      Major comments:

      # 12 The coarse graining is a major point that would need to be clarified since the rest of the analysis and theoretical modelling in the paper flow from this. Firstly, the interpretation of the schematic in Figure 1A on experimental data collection is not immediately obvious to the reader, lacks a clear flow between the different panels or steps (which could be numbered for example) and does not have a legend to indicate the different colour mapping.

      We are grateful to the reviewer for this comment. We have implemented in Figure 1A all the changes suggested by the reviewer: we numbered the different steps and have added a colour mapping. In addition we have rephrased the caption of Fig 1A to better connect the experimental steps.

      # 13 Secondly, Figure 2A which explicitly addresses coarse graining is not clear enough. Is the message here that by excluding the inner parts of the sample with a radial ROI, a similar dynamic response is observed over time?

      Yes, indeed this is the point and we have adjusted the figure and text to explain this better. Our goal is to focus on the quantification of segmentation pace and rhythm. This is best captured by reporters such as LuVeLu, which has maximum intensity in regions where segments form, and whose dynamics are known to be strongly correlated to segmentation (Aulehla et al., 2007; Lauschke and Tsiairis et al., 2013). The global ROI is thus expected to precisely capture these segmentation and clock dynamics and we have now included more validation data and have also edited the text to make this very important point clearer:

      “To perform a systematic analysis of entrainment dynamics, we first introduced a single oscillator description of the segmentation clock. We used the segmentation clock reporter LuVeLu, which shows highest signal levels in regions where segments form (Aulehla et al., 2007). Hence, we reasoned that a global ROI quantification, averaging LuVeLu intensities over the entire sample, should faithfully report on the segmentation rate and rhythm, essentially quantifying 'wave arrival' and segment formation in the periphery of the sample.”

      Figure 2A indeed shows that the dynamics (from the timeseries) is very similar when considering the entire field of view (global ROI) or when considering only the periphery of the 2D-assay (excluding central regions). We modified Figure 2A to clarify this point by indicating each measurement as either global ROI or global ROI minus the diameter of the excluded circular region (e.g. global ROI - 50px). We also emphasized in the caption that timeseries are obtained using global ROI, unless otherwise specified. We included a link (https://youtu.be/fRHsHYU_H2Q) in the caption to a movie of 2D-assay subjected to periodic pulses of DAPT (or DMSO) and corresponding timeseries from global ROI.

      Since the inner part of the sample corresponds to the posterior side how do we interpret similarities and differences between signals with different ROIs?

      As stated above, the global ROI measurements essentially capture the signal at the periphery where segments form and faithfully mirror segmentation rate and rhythm. We have now included a comparison to the center ROI, also in response to the reviewer’s comments; see our response #34.

      The result shows that the period and PRC in the center matches the one found in the periphery, i.e. global ROI. We have shown previously that center and periphery differ in their oscillation phase by 2pi, i.e., one full cycle (Lauschke et al., 2013). We interpret these findings as confirmation of our analysis strategy, i.e. the global ROI allows a very reproducible, unbiased quantification that reports on segmentation clock and period.

      # 14 A quantitative analysis of essential coarse-grained properties such as period and amplitude should be performed for different ROIs and across multiple samples. As this effectively masks any spatial differences, limitations of this approach should be clearly stated in the Discussion. For example, in lines 466-470 it is difficult to interpret the slowing down tendency and relate it back to the single cell level.

      As outlined in our response to comment #13 and also #34, we chose an analysis that allows us to determine the segmentation pace and rhythm, i.e. segment formation, which is well captured by LuVeLu signal and a global ROI analysis. We agree that a spatially resolved analysis of dynamic behaviour is important (and indeed a gradient of amplitude might be relevant in such context), but we think this is beyond the scope of the current study, which is focused on the system level segmentation clock behaviour. We have revised the discussion as suggested by the reviewer to make this approach and the need for future studies clearer.

      # 15 The functional characterisation of the sample using LFNG, AXIN2 and MESP2 is unclear. The images included in Figure 2D representing expression observed when tissue explants are grown within the microfluidic chip are difficult to interpret and would require a more detailed description of anterior-posterior, pillars etc; it is also difficult to view the bright-field since it is presented as a merged image.

      It is particularly difficult to see the somite boundaries for the same reason. In lines 113-117 the authors state that the global oscillation period matches the periodic boundary formation. How do we reach this conclusion from these images? What is the variability between samples?

      If these two issues would be addressed it would increase confidence in the coarse graining argument and thus would strengthen the importance of the findings in the study.

      We thank the reviewer for this feedback, and we have added more quantifications to address this point directly in the modified Figure 2. Importantly, we added the quantification of the rate of segmentation in multiple samples based on segment boundary formation (new Figure 2D) and compared this to the global ROI quantifications using the reporter lines LuVeLu. This data provides clear evidence that the quantification of global ROI reporter intensities closely matches the rate of morphological segment boundary formation. In addition, we show that segment formation and also Wnt-signaling oscillations (Axin2-Achilles) and the segmentation marker Mesp2 (Mesp2-GFP) are all entrained to the zeitgeber period. We have also revised the text to clarify this important validation of our quantitative approach.

      In addition, we provide, in the revised Figure Suppl. 2, details of entrained samples, focusing on the segmenting regions. The brightfield and reporter channels were separated, emphasizing the segment boundaries and the expression pattern of the reporters. For ease of visualization, these samples were also re-oriented so that the tissue periphery (corresponding to anterior PSM) is at the top while the tissue center (corresponding to the posterior PSM) is at the bottom. This now additionally better shows the localization of the different reporters with respect to the segment boundary. We also included supplementary movies showing timelapse of samples expressing either Axin2-GSAGS-Achilles or Mesp2-GFP that were subjected to periodic DAPT pulses, with their respective controls.

      Several minor points could be addressed to improve the manuscript and are listed below:

      # 16 Figure 1 A the colormap and axes for the oscillatory traces should be defined

      We thank the reviewer, and we have modified the figure accordingly (related to point # 12). A colormap and axes for the illustrated timeseries are now included.

      # 17 Strength of zeitgeber is not defined and there is no analytical expression provided; how does it relate to DAPT concentration? Is the fact that low DAPT concentration corresponds to weak strength expected or is it a result?

      Zeitgeber strength generally refers to the magnitude of the perturbation periodically applied to an oscillator. With DAPT pulses, our expectation was that both the duration of the pulse and the drug concentration could influence the strength. Practically, the pulse duration was kept constant for all experiments and the concentration was varied. We thus expected that DAPT concentration would indeed be correlated to zeitgeber strength. We have discussed multiple lines of evidence supporting this assumption in the main text, and this is indeed a result. In particular, as explained in the section “The pace of segmentation clock can be locked to a wide range of entrainment periods”, higher DAPT concentration gives rise to faster and better entrainment, as expected from classical theory. In the context of Arnold tongues, weaker zeitgeber strength corresponds to a narrower entrainment region, which is experimentally observed (Fig 8F, showing regions where the clock is entrained).

      From a modelling standpoint, zeitgeber strength corresponds to parameter A, which is the amplitude of the perturbation. Possible zeitgeber strength was inferred from the model by matching the experimental entrainment phase with that obtained from the model isophases. As explained in Supplementary Note 2, we tested four concentrations of DAPT (0.5, 1, 2, and 3 uM), corresponding respectively to A values of 0.13, 0.31, 0.43, and 0.55. As we can see, those A values are not linear in DAPT concentration, which is expected since multiple effects (such as saturation) can occur.
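The non-linearity mentioned above can be checked directly from the four quoted pairs. The following is a minimal sketch, not part of the original analysis: it only uses the concentrations and A values listed in the response, and the variable names are illustrative.

```python
# DAPT concentrations (uM) and the inferred zeitgeber strengths A quoted above.
concentrations_uM = [0.5, 1.0, 2.0, 3.0]
strength_A = [0.13, 0.31, 0.43, 0.55]

# If A were proportional to concentration, A / concentration would be constant.
ratios = [a / c for a, c in zip(strength_A, concentrations_uM)]

# Incremental slope between consecutive concentrations (change in A per uM).
slopes = [
    (strength_A[i + 1] - strength_A[i])
    / (concentrations_uM[i + 1] - concentrations_uM[i])
    for i in range(len(strength_A) - 1)
]

for c, a, r in zip(concentrations_uM, strength_A, ratios):
    print(f"{c:>4} uM  A = {a:.2f}  A/conc = {r:.3f}")
print("incremental slopes:", [round(s, 2) for s in slopes])
```

Neither the ratio nor the incremental slope is constant across the tested range, which is what one would expect if effects such as saturation are present.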

      # 18 In some figures it looks like the amplitude of oscillations may change with DAPT concentration and hence zeitgeber strength? Is this expected?

      We have not systematically analyzed the amplitude effect and have, intentionally, focused on the period and phase readout as most robust and faithful parameters to be quantified. Regarding the amplitude of LuVeLu reporter, we are cautious given that it is influenced, potentially, by the (artificial) degradation system that we included in LuVeLu, i.e. a PEST domain. This effect concerns the amplitude, but not the phase and period, explaining our strategy.

      That said, we agree with the referee that DAPT concentrations might change the amplitude of oscillations. Such change could even play a role in the change of intrinsic period (in fact a similar mechanism drives overdrive suppression for cardiac oscillators, Kunysz et al., 1995). But since the change of period can be more easily measured and inferred, we prefer to directly model it instead of introducing a new hypothesis on amplitude/period coupling, at least for this first study of entrainment.

      # 19 Figure 2A including the black area creates confusion and it is unclear which ROI is used in the rest of the study; consider moving this to a supplementary figure perhaps

      We thank the reviewer for this feedback (related to point #13), and we have modified the figure accordingly. As we responded to point # 13: We modified Figure 2A, by indicating each measurement as either global ROI or global ROI minus the diameter of the excluded circular region (e.g. global ROI - 50px). We also emphasized in the caption that timeseries are obtained using global ROI, unless otherwise specified.

      # 20 What type of detrending is used in Figure 2 and throughout (include info in the figure legend)?

      We used sinc-filter detrending, described and validated in detail previously (Mönke et al., 2020), as specified in Supplementary Note 1: Materials and methods > H. Data analysis > Monitoring period-locking and phase-locking: In this workflow, timeseries was first detrended using a sinc filter and then subjected to continuous wavelet transform. We thank the reviewer for pointing out that this detail is lacking in the figure captions, and we have modified the captions accordingly.

      # 21 Figure 2D merged images are difficult to read/interpret (see major comments)

      We thank the reviewer for this comment, and we have modified the figure accordingly (please see response to related point #15).

      # 22 Kuramoto order parameter is used to quantify the level of synchrony across the different samples; however, it is not defined in the text. Is it also possible to assess variability in each sample? For example, how quickly does entrainment occur in each sample? How faithfully do the peaks of expression beyond 80 min (to exclude the initial unsynchronised state) match with zeitgeber time? This would help make the point that weak strength leads to a more variable response, which is an interesting finding.

      We have now added a mathematical definition of the Kuramoto parameter in Supplementary Note 1.

      A high order parameter corresponds to coherence between samples, as also elaborated in respective figure captions (e.g. in the caption for polar plots in Figure 4D).
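For readers who want a concrete picture of the order parameter, the sketch below uses the standard textbook definition, R = |mean(exp(i·φ))|, computed over the phases of several samples at one time point. The authors' own definition is in Supplementary Note 1; the function name and the example phases here are illustrative assumptions.

```python
import cmath
import math


def kuramoto_order_parameter(phases):
    """Return R = |mean(exp(i*phi))| for a list of phases in radians.

    R is close to 1 when all phases coincide and close to 0 when
    they are spread evenly around the circle.
    """
    if not phases:
        raise ValueError("need at least one phase")
    mean_vector = sum(cmath.exp(1j * phi) for phi in phases) / len(phases)
    return abs(mean_vector)


# Hypothetical phases (radians) of four samples at a single time point.
coherent = [1.0, 1.05, 0.95, 1.02]
scattered = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

r_coherent = kuramoto_order_parameter(coherent)
r_scattered = kuramoto_order_parameter(scattered)
print(f"coherent samples:  R = {r_coherent:.3f}")
print(f"scattered samples: R = {r_scattered:.3f}")
```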

      In terms of variability in response to entrainment, we thank the reviewer for the comments, which have prompted us to perform an additional analysis, now included as Figure S13 in the Supplement.

      Briefly, the figures below show how different samples get synchronized with the zeitgeber. To do this, we first represent the zeitgeber signal as a continuous uniformly increasing phase (“zeitgeber time”) with period : . The initial condition for is chosen so that the zeitgeber phase at the moment of the last pulse matches the experimental entrainment phase for each . We plot for each sample (dotted lines) and the zeitgeber phase (magenta line). To quantify how well each sample is following the zeitgeber time, we compute the Kuramoto parameter: . By the end of the experiment most samples reach , indicating entrainment. Most samples need zeitgeber cycles to become entrained. For min the entrainment takes much longer (edge of the Arnold tongue). For min there is much variability, which can be explained by the horizontal region in the PRC around the entrainment phase. As suggested by the referee, synchronization is faster for higher DAPT concentration. These dynamics are indeed consistent with the expectation from classical PRC theory.
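The procedure described above can be sketched as follows. The exact expressions are not reproduced in this text, so two forms are assumed here: a zeitgeber phase that grows linearly and equals the entrainment phase at the last pulse, and a two-oscillator Kuramoto parameter R = |exp(i·φ_sample) + exp(i·φ_zeit)| / 2, which reaches 1 only when the sample phase matches the zeitgeber phase. All names and numbers are hypothetical.

```python
import cmath
import math

TWO_PI = 2 * math.pi


def zeitgeber_phase(t, t_zeit, t_last_pulse, phi_ent):
    """Uniformly increasing phase with period t_zeit (assumed form).

    The offset is chosen so that the phase equals phi_ent at t_last_pulse.
    """
    return (phi_ent + TWO_PI * (t - t_last_pulse) / t_zeit) % TWO_PI


def pair_synchrony(phi_sample, phi_zeit):
    """Assumed two-oscillator Kuramoto parameter, between 0 and 1."""
    return abs(cmath.exp(1j * phi_sample) + cmath.exp(1j * phi_zeit)) / 2


# Hypothetical experiment: 170-min zeitgeber, last pulse at 850 min.
times = [10.0 * k for k in range(103)]  # 0 to 1020 min in 10-min steps
zeit = [zeitgeber_phase(t, 170.0, 850.0, math.pi) for t in times]

# A sample locked to the zeitgeber with a small residual phase offset.
locked = [(z + 0.1) % TWO_PI for z in zeit]
# A sample that keeps running at a 140-min period.
free = [(TWO_PI * t / 140.0) % TWO_PI for t in times]

r_locked = [pair_synchrony(s, z) for s, z in zip(locked, zeit)]
r_free = [pair_synchrony(s, z) for s, z in zip(free, zeit)]
print(f"locked sample: min R = {min(r_locked):.3f}")
print(f"free-running sample: min R = {min(r_free):.3f}")
```

With this choice, a sample that follows the zeitgeber stays near R = 1, while a free-running sample drifts through all phase differences and R repeatedly drops towards 0.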

      # 23 Do samples change period to Tzeit in similar ways, i.e. with similar patterns over time? It looks like the Kuramoto order parameter and period drop initially - why?

      We do not have a direct answer as to why the Kuramoto first order parameter and the period drop for the condition the reviewer specified. It has to be noted, though, that because of how wavelet analysis is done (cross-correlation of the timeseries with wavelets), the period and phase determination at the boundaries of the time series are less reliable (edge effects, see Mönke et al., 2020). Because of this, we should be cautious when considering data before the first pulse and after the last pulse. This was explicitly stated in the generation of stroboscopic maps: “As wavelets only partially overlap the signal at the edges of the timeseries, resulting in deviations from true phase values (Mönke et al., 2020), the first and last pulse pairs were not considered in the generation of stroboscopic maps.”

      # 24 In Figure 4C why is the Kuramoto order parameter already higher in the 2uM DAPT conditions at the start of the experiment?

      Samples can, by chance, start synchronously and this results in a high Kuramoto first order parameter. Because of this likelihood, it is thus important to interpret the entrainment behaviour of multiple samples using various readouts, in addition to a high Kuramoto first order parameter. We investigated entrainment of the samples based on several measures: multiple samples remaining (or becoming more) synchronous (because each sample actively synchronizes with the zeitgeber), period-locking (where the pace of the samples match the pace of the zeitgeber, which can be distinct from natural pace), and phase-locking (where there is an establishment of a stable phase relationship between the samples and the zeitgeber).

      # 25 Figure 3C and Figure S2 require statistical testing between CTRL and DAPT in each condition

      p-values were calculated for the specified conditions and were added in the caption of the figures. These values are enumerated here:

      • Figure 3C
        • 170-min 2uM DAPT (vs DMSO control): p
      • Figure S2
        • 120-min 2uM DAPT (vs DMSO control): p = 0.064
        • 130-min 2uM DAPT (vs DMSO control): p = 0.003
        • 140-min 2uM DAPT (vs DMSO control): p = 0.272
        • 150-min 2uM DAPT (vs DMSO control): p = 0.001
        • 160-min 2uM DAPT (vs DMSO control): p

      To calculate p-values, a two-tailed test for the absolute difference between medians was done via a randomization method (Goedhart, 2019). This confirms that the period of samples subjected to pulses of DAPT is not equal to the controls, except for the 140-min condition (where the zeitgeber period is equal to the natural period, i.e. 140 mins).
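A randomization test on the absolute difference between medians can be sketched as below. This is a generic permutation test written for illustration, not the tool used by the authors; the function name, the resampling count, and the period values are all hypothetical.

```python
import random
import statistics


def median_difference_pvalue(group_a, group_b, n_resamples=5000, seed=0):
    """Two-tailed randomization test on |median(a) - median(b)|.

    Group labels are shuffled repeatedly; the p-value is the fraction of
    shuffles whose absolute median difference is at least as large as the
    observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    observed = abs(statistics.median(group_a) - statistics.median(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(
            statistics.median(pooled[:n_a]) - statistics.median(pooled[n_a:])
        )
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


# Hypothetical periods in minutes, for illustration only.
ctrl = [138, 141, 139, 143, 140, 137]
dapt = [168, 171, 169, 172, 170, 166]
ctrl_repeat = [139, 140, 142, 138, 141, 140]

p_shifted = median_difference_pvalue(ctrl, dapt)
p_similar = median_difference_pvalue(ctrl, ctrl_repeat)
print(f"ctrl vs dapt:        p = {p_shifted:.4f}")
print(f"ctrl vs ctrl_repeat: p = {p_similar:.4f}")
```

Taking the absolute value of the difference makes the test two-tailed, and fixing the seed keeps the resampling reproducible.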

      # 26 Figure 3A gray shaded area not clearly visible on the graph

      We have decided to remove the interquartile range (IQR) in the specified figure as it does not serve a crucial purpose in this case. By removing it in Figure 3A, the timeseries of individual samples are now clearer.

      # 27 Figure 6C colour mapping of time progression is not clearly visible on the graph; the interpretation of this observation is unclear in the text and the figure

      We agree that the low quality of the image is unfortunate, and it seems that our file was greatly compressed upon submission. We have checked the proper quality of figures in the resubmitted version of the manuscript.

      Regarding the interpretation of Figure 6C, we conclude that in our experiments the entrainment phase is an attractor or stable fixed point, in line with theory (Granada and Herzel, 2009; Granada et al., 2009). We had elaborated this in the text (lines 248-252 of the submitted version of the manuscript): at the same zeitgeber strength and zeitgeber period, faster (or slower) convergence towards this fixed point (i.e. entrainment) was achieved when the initial phase of the endogenous oscillation (φinit) was closer to (or farther from) φent.

      # 28 Figure 7A circular spread not clearly visible on the graph

      Similar to point #27, we have provided a high resolution graph for the re-submission and hopefully resolved this issue.

      # 29 Figure S7A difficult to see the difference between colours

      See point #28.

      # 30 Is it possible to compare the PRC and the plots of period over time during entrainment? The PRC is mainly negative (Fig 8A1,A2), in my understanding this means a delay, however the periods seem to decrease over time before entraining to the Tzeit (Fig 3B). Is this reflective of a decrease in Kuramoto parameter and potential de-synchronisation of single cells before re-synchronisation at Tzeit?

      To address this question, we now plot the Phase response with colors indicating pulse number in new Supplementary Figure S13. While capturing the entire PRC as a function of time would require many more experiments (in particular to sample the phases far from entrainment phase), we still clearly see that the PRCs appear to translate vertically as the oscillator is being entrained, i.e. the latter time points are shifted up (down) for T_zeit = 120 (170) min, respectively.

      # 31 Fig 8A What is the importance/meaning of the PRC being similar shape between different entrainment periods? Does this reflect that the underlying gene network is the same?

      If one single gene network is responsible for oscillations, we expect from dynamical systems theory that the PRCs are not only of similar shape but actually the same, independent of the entrainment period. What is surprising is that the PRCs for different entrainment periods do not overlap, and the simplest explanation for this is that the intrinsic period changes with entrainment, all things being kept equal (including the underlying gene networks). This relates to the previous point since we indeed observe that the PRC “translates” vertically with the pulse number for longer periods. The change of period might be due to a long-term regulation as detailed in the discussion.

      # 32 The spatial period gradient and wave propagation under DAPT (Figure S8) should be included in the results and not just the discussion.

      We fully agree with the reviewer that both the establishment and the maintenance of a spatial phase gradient is of great interest. However, many more experiments would be required to fully quantify and understand the processes at play here, which we believe to be out of the scope of the current manuscript. To keep the focus of the paper on the global segmentation clock itself, we prefer to keep this figure in Supplement.

      Reviewer #2 (Significance (Required)):

      We currently do not have a detailed understanding of how biological oscillators integrate local signals from their neighbours as well global external signals to give rise to complex patterning that is important for embryonic development. Main bottlenecks that hinder our understanding are lack of real-time endogenous dynamic response together with known global inputs as well as comprehensive models that can explain emergent behaviour in a variety of tissues.

      This study goes a long way in addressing these bottlenecks in the embryonic tissue responsible for somite formation, a dynamical and oscillatory system also known as the segmentation clock. Firstly, they rely on a state-of-the-art previously developed system to entrain endogenous response in live tissue explants using precise microfluidic control. They test the complete range of exogenous perturbation periods and use an existing live reporter (LuVeLu) to monitor endogenous response. They also identify higher order coupling relationships whereby every other LuVeLu peak is entrained through external stimulation.

      As the stimulation system does not control but rather perturbs the endogenous response, the observations from LuVeLu provide a unique opportunity in understanding input-output relationships and thus describing the dynamic response of the segmentation clock. Authors propose to study dynamic behaviour of the clock using coarse-graining and focus on describing the overall response over time while amalgamating spatial information. Appropriate coarse-graining is an important strategy in addressing complex problems and is widely used. They use sophisticated methodology such as phase response curves and Arnold tongue mapping to make several important observations. For example, the nonlinear shortening and elongation of the period in response to stimulation is particularly interesting since this may indicate a feedback of the clock onto itself, potentially via Wnt. Another key observation is that the spatial periodicity and phase wave activity persist in the perturbed conditions, suggesting that individual single cell oscillators can adjust their behaviour to external input while retaining coordination with their neighbours. Finally, the authors go on to construct a general dynamical model of the segmentation clock and use this to conclude that the intrinsic period of the oscillator is altered and that the oscillator can be considered excitable.

      This work sheds light onto mechanisms of coordination of Notch activity in assemblies of cells observed in living tissue, an area of research that is important not only for somitogenesis but also for understanding gene expression patterning in many other tissues where Notch plays a critical role, for example in the development of the neural system and organs. As a study of a real-world nonlinear oscillator this work is directly of interest to theoreticians and synthetic biology experts interested in understanding complex patterning and emergence.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors studied the system-level responses of the somite segmentation clock by a coarse-grained theoretical-experimental approach, applying the theory of entrainment to understand the phase responses of mouse pre-somitic mesoderm (PSM) tissues in the presence of periodic perturbation by the Notch inhibitor DAPT, generated by a microfluidics technique. It was demonstrated that the segmentation clock is responsive to a diverse range of perturbation periods from 120 to 180 min, can be period- and phase-locked, and that the efficiency is dependent on the DAPT concentration (input strength). The authors also observed two cycles of the segmentation clock ticking in single cycles of 300 or 350 min period-perturbation, suggesting higher order (2:1 mode) entrainment. They also applied stroboscopic maps to the analysis and found that entrainment phases are dependent on the period of DAPT pulses, which recapitulates theoretical predictions. The estimation of the phase response curve (PRC) of the segmentation clock revealed that the inferred PRC is an asymmetrical and mainly negative function, which represents characteristic features of oscillators that emerge after a saddle-node on invariant cycle (SNIC) bifurcation. These results also indicated that the segmentation clock changed its intrinsic period during entrainment.

      Major comments:

      # 33 I have major concerns about the relevance of the global time-series analysis proposed in Fig.2 and conclusion about the changes of the intrinsic period during entrainment. The validity of the global time-series analysis should be carefully analyzed, because it could bring artifacts in estimated values of the intrinsic period. The authors concluded (page 3, line 172) that the period calculated by the global analysis represents similar values with the rate of segment formation, but there is no data about the quantification of the periods of segmentation, such as the frequency of Mesp2 reporter expression.

      We thank the reviewer for this feedback. We have now added the quantification of the period of segment formation (new Figure 2E) and show its strong correspondence to the dynamics of reporters used (Lfng, Axin2, and Mesp2). Please see also our response to point #15 with additional comments regarding the validation of the global time-series analysis.

      # 34 Another related issue is the presence of spatial period gradient as mentioned (page 13, line 524). One possible approach to circumvent this issue would be "local" time-series analysis; for instance, just focusing on the "putative posterior" regions that are close to source-positions of waves. Authors can re-compute and estimate PRCs by using such a method.

      We thank the reviewer for this suggestion and have accordingly now included the analysis of a localized ROI at the center (center ROI) of the 2D-assays (new Figures S5-S6). We also computed the PRC from center ROIs as shown below. We note strong correspondence between the global ROI and the center ROI.

      # 35 I have another major concern about the evidence of higher order entrainment shown in Fig.5. If the 1:2 entrainment is successful, we can expect that the observed period is close to half of the period of pulses; however, the period shown in Fig.5B looks like 185 min, longer than half of 350 min. Is this gap due to the temporal accuracy of time-lapse movies?

      We do not think the discrepancy comes from a problem of temporal accuracy, as the temporal accuracy is the same for all movies and there is no reason why there would be a specific issue for this set of experiments. In addition, we have re-analyzed the data to calculate the period from the stroboscopic maps. Mathematically speaking, we take the stroboscopic map as (see PDF) and use this to estimate the period of oscillation in entrained samples; in particular, inverting the formula for 1:2 entrainment, we have: see PDF.

      The advantage of this method is that it gives a more “instantaneous” estimation of the period.
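To illustrate the idea of reading a period off the stroboscopic map, here is a hedged sketch. The authors' formula is given in their PDF and is not reproduced here; the version below assumes that between two consecutive pulses the oscillator advances by a whole number of cycles plus the wrapped change in its phase at the pulses, so that period = 2π·T_zeit / (2π·n + Δφ). The function and parameter names are hypothetical.

```python
import math


def period_from_strobe(phi_n, phi_next, t_zeit, cycles_per_pulse=1):
    """Estimate the oscillator period between two consecutive pulses.

    phi_n, phi_next: oscillator phases (radians) at pulses n and n+1.
    t_zeit: zeitgeber period in minutes.
    cycles_per_pulse: 1 for 1:1 locking, 2 for 1:2 locking (assumption:
        the oscillator completes about this many cycles per pulse).
    """
    # Wrap the pulse-to-pulse phase change into (-pi, pi].
    dphi = math.atan2(math.sin(phi_next - phi_n), math.cos(phi_next - phi_n))
    total_advance = 2 * math.pi * cycles_per_pulse + dphi
    return 2 * math.pi * t_zeit / total_advance


# Perfect 1:2 locking to 350-min pulses: same phase at every pulse.
period_locked = period_from_strobe(1.0, 1.0, 350.0, cycles_per_pulse=2)
# The phase at the next pulse lags slightly: the oscillator ran slower.
period_lagging = period_from_strobe(1.0, 0.8, 350.0, cycles_per_pulse=2)
print(f"locked: {period_locked:.1f} min, lagging: {period_lagging:.1f} min")
```

Because the estimate uses only one pair of consecutive pulses, it does not average over the whole recording, which is the sense in which such an estimate is more instantaneous than a wavelet-based one.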

      The results are as follows:

      350 10uM: 187 ± 8 min (average across entrained samples from the last zeitgeber period)

      350 5uM: 193 ± 13 min (average across entrained samples from the last zeitgeber period)

      300 2uM: 148 ± 8 min (averaged across entrained samples and from two last periods)

      This additional analysis is in agreement with the wavelet analysis.

      The reviewer is right that for 350 minutes, entrained samples show an observed period that is higher than expected, also based on this new additional analysis. The reason for this is not known. One explanation is the relatively short observation time, especially for pulses separated by as much as 350 minutes, i.e. only 3 pulses are applied. [We notice that for 300-minute pulses, the period converges to 150 mins between the 3rd and the 4th pulse.] We have adjusted the text in the results section to reflect that for 350-min entrained samples, the observed period ‘approaches’ the predicted value, while for 300-min entrained samples, the observed period is very close to it, i.e. 147 mins. In addition, we comment that the phase distribution narrows with time, another indication supporting higher order entrainment.

      # 36 Also, authors showed the period evolution towards 1:2 locking with just one condition (350 min). Authors can show the data for multiple conditions as in Fig. 3D, at least for 300 min and 325 min pulses and add the data about final entrained period with statistic analysis that supports the difference between the entrained period and the natural period (140 min).

      We thank the reviewer for this feedback and have modified the figure accordingly. In particular, in Figure 5A, we have added the period evolution plot for samples subjected to 300-min periodic pulses of 2uM DAPT (or DMSO for control). Additionally, we have added Figure 5D, which plots the average period in the 300-min and 350-min conditions. We summarize the median average period here with computed p-values:

      • 300-min pulses of 2uM DAPT (or DMSO for control): p-value = 0.191
        • CTRL: 130.39 mins
        • DAPT: 146.45 mins
      • 350-min pulses of 5uM DAPT (or DMSO for control): p-value = 0.049
        • CTRL: 127 mins
        • DAPT: 174.86 mins
      • 350-min pulses of 10uM DAPT (or DMSO for control): p-value = 0.016
        • CTRL: 142.82 mins
        • DAPT: 185.12 mins

      Minor comments:

      # 37 The authors can draw vertical lines indicating the T_zeit in Fig.3B, Fig.4B and Fig.5B in order to help comparisons between T_zeit and patterns of period (solid lines).

      We thank the reviewer for this comment. We have accordingly added a horizontal line indicating Tzeit in Figures 3B, 4B, S4A, and S5A (figure panel numbers based on the submitted version of the manuscript). We similarly added a horizontal line indicating 0.5Tzeit in the period evolution plots of 300-min and 350-min conditions in Figures 5A and 5B, respectively.

      # 38 In Fig.5A, the authors can show period evolution in the case of 300 min DAPT-pulses as shown in Fig.5B.

      We thank the reviewer for this feedback (related to point #36), and we have modified the figure accordingly.

      # 39 In Fig.6B DAPT panel, the authors can draw the points of phi_ent as shown in Fig.7A.

      We thank the reviewer for this comment, and we have modified the figure accordingly.

      # 40 In Fig. 8F, authors can put the information about DAPT concentration at the right y-axis.

      This is a similar comment to point #17, see above. In brief, we do not know the precise relation between the strength of the perturbation in our model and DAPT concentration; zeitgeber strength was inferred from the model by matching the experimental entrainment phase with that obtained from the model isophases.

      # 41 In Fig. 8G, the PRC in the panel "170 mins" does not have any fixed point (cross sections with horizontal lines of "0" phase response). If entrainment is successful, there should be stable and unstable fixed points, but those are absent, although 170 min pulses succeeded in the entrainment as shown in Fig.3D. Authors can explain where the fixed points are.

      The fixed points are indeed defined by the intersection with a horizontal line, but not with the ‘0’ line. They are found where the phase response compensates for the detuning/period mismatch, not at ‘0’ phase response. (See PDF for more details).

      Note however on Fig 8G that we further observe a vertical shift of the PRC, which prompted us to propose a change of the intrinsic period with (as explained in the text when we introduce Figs 8A1-2).

      Another way to visualize fixed points is offered in Fig 16 D-E, where we plot the inferred corrected PTC and the stroboscopic maps: there, fixed points correspond to intersections with the diagonal.
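The statement that fixed points sit where the phase response compensates the period mismatch can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the stroboscopic map is taken as φ_{n+1} = φ_n + 2π·T_zeit/T + PRC(φ_n) (mod 2π), so fixed points satisfy PRC(φ*) = 2π·(T − T_zeit)/T, and the PRC used is a hypothetical purely negative curve, −A·(1 − cos φ), not the one inferred in the paper.

```python
import math


def prc(phi, amplitude=1.0):
    """Hypothetical, purely negative PRC: -A * (1 - cos(phi))."""
    return -amplitude * (1.0 - math.cos(phi))


def detuning_level(t_intrinsic, t_zeit):
    """Phase response needed at each pulse to cancel the period mismatch."""
    return 2 * math.pi * (t_intrinsic - t_zeit) / t_intrinsic


def fixed_points(t_intrinsic, t_zeit, amplitude=1.0, n_grid=20000):
    """Return (phase, is_stable) pairs where the PRC crosses the detuning level.

    Stability uses the slope of the map phi -> phi + PRC(phi) + const,
    i.e. 1 + PRC'(phi) = 1 - A*sin(phi); |slope| < 1 means stable.
    """
    level = detuning_level(t_intrinsic, t_zeit)
    points = []
    for k in range(n_grid):
        a = 2 * math.pi * k / n_grid
        b = 2 * math.pi * (k + 1) / n_grid
        if (prc(a, amplitude) - level) * (prc(b, amplitude) - level) < 0:
            phi = 0.5 * (a + b)
            slope = 1.0 - amplitude * math.sin(phi)
            points.append((phi, abs(slope) < 1.0))
    return points


# Intrinsic period 140 min, zeitgeber period 170 min: crossings are at a
# negative phase-response level, not at zero.
level_170 = detuning_level(140.0, 170.0)
points_170 = fixed_points(140.0, 170.0)
# A much longer zeitgeber period leaves no crossing for this toy PRC.
points_200 = fixed_points(140.0, 200.0)
print(f"required phase response: {level_170:.3f} rad")
print("fixed points (phase, stable):", points_170)
print("fixed points for 200 min:", points_200)
```

In this toy setting one crossing is stable and one is unstable, and both lie on a horizontal line below zero. Shifting the PRC vertically, or changing the intrinsic period, moves that line relative to the curve and therefore moves the fixed points.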

      Reviewer #3 (Significance (Required)):

      Although phase analysis has been widely applied to various biological systems, such as circadian clocks, cardiac tissues and neurons, this paper represents the first detailed experimental analysis of the segmentation clock based on the theory of phase dynamics. The major results are in line with theoretical predictions, whereas the suggestion about the SNIC bifurcation is attractive not only to theoretical researchers but also to experimental biologists; it has been believed that the segmentation clock consists of a negative-feedback oscillator that emerges by Hopf bifurcation, whereas this paper proposes another possibility for the molecular network structure of the clockwork. This issue is related to a recently proposed hypothesis about an excitable system in the segmentation clock based on Yap signaling (Hubaud et al. Cell 171, 668 (2017)). However, unfortunately, discussion of detailed molecular networks is limited.

      # 42 Thus, maybe the main readers are computational biologists and systems biologists.

      We thank the reviewer for his/her significance comment. We have added comments on the bifurcation structure of the segmentation clock and on excitable systems in the discussion. While our focus is on coarse-graining so that we do not and cannot infer precise molecular details, we can still infer some properties of the underlying networks. In particular we now cite several papers explaining how systems with tunable periods or excitability are indicative of the interplay between positive and negative feedback. We think those considerations are of interest to a broad range of biologists interested in connecting experiments to theory.

    1. SAMSON CARRASCO

      Samson is extremely important to Don Quixote. At first glance we think of him as the antagonist; however, as the story progresses, we find that he is trying to help the Don. "[He is] a key figure that fulfills a double function: to cheer up Don Quijote so that he may go out for the third time and also to induce him to return home." This makes him a pivotal part of this story.

      Presence and Sense of Sanson Carrasco | Request PDF. https://www.researchgate.net/publication/298984686_Presence_and_sense_of_Sanson_Carrasco.

    1. “My reasons for marrying are, first, that I think it a right thing for every clergyman in easy circumstances (like myself) to set the example of matrimony in his parish; secondly, that I am convinced that it will add very greatly to my happiness; and thirdly—which perhaps I ought to have mentioned earlier, that it is the particular advice and recommendation of the very noble lady whom I have the honour of calling patroness. Twice has she condescended to give me her opinion (unasked too!) on this subject; and it was but the very Saturday night before I left Hunsford—between our pools at quadrille, while Mrs. Jenkinson was arranging Miss de Bourgh’s footstool, that she said, ‘Mr. Collins, you must marry. A clergyman like you must marry. Choose properly, choose a gentlewoman for my sake; and for your own, let her be an active, useful sort of person, not brought up high, but able to make a small income go a good way. This is my advice. Find such a woman as soon as you can, bring her to Hunsford, and I will visit her.’ Allow me, by the way, to observe, my fair cousin, that I do not reckon the notice and kindness of Lady Catherine de Bourgh as among the least of the advantages in my power to offer. You will find her manners beyond anything I can describe; and your wit and vivacity, I think, must be acceptable to her, especially when tempered with the silence and respect which her rank will inevitably excite. Thus much for my general intention in favour of matrimony; it remains to be told why my views were directed towards Longbourn instead of my own neighbourhood, where I can assure you there are many amiable young women. 
But the fact is, that being, as I am, to inherit this estate after the death of your honoured father (who, however, may live many years longer), I could not satisfy myself without resolving to choose a wife from among his daughters, that the loss to them might be as little as possible, when the melancholy event takes place—which, however, as I have already said, may not be for several years. This has been my motive, my fair cousin, and I flatter myself it will not sink me in your esteem. And now nothing remains for me but to assure you in the most animated language of the violence of my affection. To fortune I am perfectly indifferent, and shall make no demand of that nature on your father, since I am well aware that it could not be complied with; and that one thousand pounds in the four per cents, which will not be yours till after your mother’s decease, is all that you may ever be entitled to. On that head, therefore, I shall be uniformly silent; and you may assure yourself that no ungenerous reproach shall ever pass my lips when we are married.”

      this seems unnecessary

  3. inst-fs-iad-prod.inscloudgate.net inst-fs-iad-prod.inscloudgate.net
    1. It’s not such a difficult process . . . to start with. . . . If they [Latinos] really wanted to do it, they would just go out and fill out the application and ask the teacher for details. . .

      I don't think it is as easy as she makes it sound. As we learned in last week's article, there are some kids who don't even understand the order of taking pre-algebra before algebra simply because they don't have people in their lives to explain that to them. It may seem easy to her because it is a path a lot of people in her family have taken or that her family holds strong values to school so she knows more about it.

    1. Within the field of instructional design, we have sometimes observed a hesitation to dwell on visual aesthetics (Parrish, 2009). This hesitation may stem from concern that artistically-approached designs will lack the ability to be replicated (Merrill & Wilson, 2006) or that the artistic elements will serve merely as window dressing—or worse, distraction—that provides no educational benefit to the learner.

      I find this to be true in my experience. I have worked with some professors who baulk at the idea of spending time creating or searching for a course banner image. There are other examples related to this, but I personally think that something as simple as finding or creating a course banner image can excite students. Or, if it's a corporate training hosted through Rise or Storyline, this may just be little visual elements and images that add a little something to the visual experience.

    1. Author Response

      Reviewer #1 (Public Review):

      This study provides data suggesting that tonic presynaptic a7 nicotinic receptor activity enhances corticostriatal input-mediated excitation of striatal medium spiny neurons; the data also suggest that tonic a4b2 nicotinic receptor activity on PV-fast spiking GABA interneurons inhibits striatal medium spiny neurons. These data advance our understanding about the complex cholinergic regulation of striatal neuronal circuits.

      The presented data are generally clean and high quality; but there are some problems that require the authors' attention.

      We thank the Reviewer for their insightful comments. We have addressed each point below with additional data and/or text. We believe these revisions have made the manuscript significantly stronger.

      1. In this study, ADP is a key parameter manipulated by several pharmacological treatments. But it is not clearly defined. The authors indicate EPSP and ADP are distinct by stating "LED pulse of increasing intensity generates excitatory postsynaptic potentials (EPSPs), or an AP followed by an after depolarization (ADP)." But the data (e.g. Fig. 1B) indicates that much of the ADP is probably EPSP. Please clarify. If much of the ADP is indeed EPSP, how are the data interpretation and the overall conclusion affected?

      We apologize for the oversight. The main focus of our study is on how tonic nAChR activation controls the timing of striatal output; our justification for including the ADP in our experimental analysis was simply corroborative, in that it represents an additional, easily measured parameter of the postsynaptic response to convergent cortical stimulation that 1) can be modulated by similar local inhibitory circuits that we show to mediate the effect of tonic nAChR activation and 2) is positioned (as opposed to EPSPs) to influence subsequent spiking, should the appropriate synaptic cues be present (which are deliberately omitted in our study). That said, under our experimental conditions EPSPs and ADPs were similar in both their kinetics and modulation by mecamylamine, suggesting that they represent mechanistically similar responses to cortical afferents. The defining difference (besides ADPs exhibiting larger amplitudes) is that they appear either in the absence of or following a spike. For these reasons we ultimately decided that reporting changes in both ADPs and EPSPs would be redundant, and limited our analyses to ADPs. Text has been added to the first paragraph of the results section to address these points.

      In Fig. 1F, ADP is absent. Why? Please clarify.

      Figure 1F shows an example of an SPN held at a mimicked ‘up-state’, achieved by injecting positive somatic current to produce a ‘resting’ membrane potential of -55 to -50 mV. In this scenario, the ‘up-state’ membrane potential is higher than what would be reached during most ADPs evoked from Vrest, preventing the observation of ADPs in many trials. Text has been added to the end of the first paragraph in the results section to clarify this point.

      If ADP is distinct from EPSP here in MSNs, has it been reported in the literature, and how is it generated?

      Under our experimental conditions, we do not see any major differences between EPSPs and what we term ADPs (other than amplitude), at least in terms of kinetics and modulation by mecamylamine. That said, we have added text to the first paragraph of the results section that references previous work (Flores-Barrera et al.) describing suprathreshold depolarizations proceeding SPN spikes, which shaped our reasoning for including this measure in our study.

      1. In Fig. 1F, the holding potential for mecamylamine is a few mV more negative than the control, but the spike latency is shorter under mecamylamine. This is hard to understand because membrane potential (current-injection-induced depolarization + EPSP) determines spike firing and latency. If the holding potential is the same, then it's easy to understand (larger EPSP under mecamylamine).

      Thanks for pointing this out! We agree that this might seem counter-intuitive in terms of Vrest and EPSP amplitude only. Given that mecamylamine reduces GABAergic inputs to SPNs, the reduction in spike latency in this case is consistent with a reduction of GABA receptor mediated shunting. We have added this point to the text in the 3rd paragraph of the results section, which we think strengthens our justification to look at GINs as the potential mediators of mecamylamine’s effect on spike latency.

      1. Data in Fig. 2D, E are weak. The spiking ability of whole-cell recorded neurons often declines over time (evidence: the AP duration for the red trace is longer); recovery/partial recovery from MLA is needed for the data to be reliable. Fig. 2E shows 8 cells: 6 had no response, 2 increased. Sample size needs to increase.

      We appreciate this comment. Our initial justification for this experiment was from previous reports that alpha-7 nAChRs reduce corticostriatal glutamate release probability. We have now added additional data (Figure 2 supplemental data) showing that blockade of tonically activated alpha-7 nAChRs with the more specific antagonist MLA was not sufficient to change corticostriatal synaptic strength or release probability. In parallel, as we began increasing the sample size of the experiment testing the effect of MLA on spike latency, we noticed that the effect size became smaller than what we initially reported, which was already modest. Given the modest effect size of MLA on spike latency (with no presynaptic mechanism to offer), we reason that it would likely have minimal impact compared to the larger effect of mecamylamine. For this reason, we have backed off our conclusion that TONIC activation of presynaptic alpha-7 nAChRs on corticostriatal axon terminals will have a meaningful physiological impact on SPN spike timing. Accordingly, we removed previous figure 2D/E, but supplemented Figure 2A/B/C with new data (figure 2 supplement) demonstrating the lack of effect of tonic nAChR activation on corticostriatal synapse release probability. The title of the manuscript has been altered to reflect this.

      1. Fig. 7: the data on DhbE increasing AP duration is not convincing: no effect in 4 neurons, increase in 4 other neurons, and decrease in other neurons. Data is more important than p<0.05. How do you interpret DhbE increasing AP duration?

      Point taken. We shouldn’t let a statistical calculation dominate the interpretation of a mostly mixed population result. Furthermore, upon revisiting this figure we realized that the main points pertinent to our conclusions (mecamylamine hyperpolarizes PV-FSI Vrest) were obscured by data that were of limited relevance. We have re-focused this figure to highlight data that are directly pertinent to our interpretation. This included removing the AP duration data set in question, which does not add to or inform our conclusions. We have further strengthened our conclusion that PV-FSIs are a primary mediator of the effect of tonic nAChR activation on spike latency by adding new data showing that pharmacologically blocking cortical activation of PV-FSIs occludes the effect of mecamylamine (new figure 8, see comments to Reviewer 2).

      Fig. 7F shows AP duration for PV-FSI is around 1.75 ms (some are over 2 ms, recorded at 35 C). This is unusually long. Also, the AP rise time is around 1.4 ms, very long. 1.75 ms total rise time vs. 1.4 ms for just rise: they do not add up?

      Please see our response to the above point.

      Reviewer #2 (Public Review):

      This manuscript examines one aspect of how acetylcholine influences striatal microcircuit function. While striatal cholinergic interneurons are known to be engaged in key events and tasks related to the basal ganglia in vivo, and pharmacological studies indicate cholinergic signaling is complex and critical to striatal function, the mechanistic details by which acetylcholine regulates individual cell types within the striatum, as well as how these integrate to shape striatal output, remain largely unknown. This work thus addresses an important problem in the basal ganglia field, with likely relevance to both normal function and disease-related dysfunction. The authors used a brain slice preparation in which a large number of excitatory cortical inputs to the striatum are activated, and they could measure the resulting activation of striatal projection neurons (SPNs). Their primary finding was that in this preparation, blocking nicotinic acetylcholine signaling resulted in more rapid activation of SPNs. They then explored some of the potential mechanisms for this phenomenon, and conclude that in their preparation, cholinergic interneurons are engaged both tonically and phasically, resulting in recruitment of local GABAergic interneurons that provide feedforward inhibition onto SPNs. They show that one striatal GABAergic interneuron subclass, PV-FSI, are modestly excited by tonic nicotinic signaling, and suggest this may be one contributor to their primary finding.

      Strengths of the study include the focus on cholinergic signaling across multiple striatal cell types, careful and clearly displayed slice electrophysiology, good writing, and a methodical approach to pharmacology.

      Weaknesses include reliance on the Thy1-ChR2 line to activate excitatory cortical inputs to the striatum (this line may be less specific to cortical pyramidal neurons than a specific Cre recombinase mouse line used with Cre-dependent ChR2, and thus have unintended influences on the results), and despite a strong start, a fairly weak mechanistic exploration of what GABAergic neuron subclasses might contribute to their original phenomenon.

      We thank the Reviewer for their thoughtful and constructive comments. The Reviewer identified two weaknesses of our study, as presented. The first weakness was our reliance on a transgenic mouse line (Thy1-ChR2) to activate cortical inputs to the striatum. Specifically, how a potential lack of specificity/ectopic expression of ChR2 in non-glutamatergic cortical neurons may impact our interpretation of the data. The second is that we did not make an effort to identify the specific subclass(es) of GINs that contribute to the phenomenon we describe. We have addressed both of these comments with new experiments, which we will describe individually below.

      1) Specificity of corticostriatal afferent activation in Thy1-ChR2 mice. As the Reviewer keenly points out, although Thy1-ChR2 mice are often used as a tool to specifically activate excitatory corticostriatal nerve terminals with optogenetic stimuli, there is concern that ChR2 expression is not exclusively limited to glutamatergic cortical neurons. If present, direct optogenetic activation of non-cortical striatal afferents would influence our results and impact our interpretation. We have addressed this issue experimentally by adding two new types of experiments (and related text, pages 7-8).

      We have added new data using immunohistochemical staining to survey for ectopic expression of ChR2 in the cortex. Staining for GAD, to broadly identify GABAergic neurons, displayed no overlap with ChR2-expressing cortical neurons in Thy1-ChR2 mice. Since a population of GABAergic somatostatin-expressing cortical neurons (particularly in the auditory cortex) has been shown to directly innervate the striatum (Rock et al., 2016), we also show that we found no evidence for somatostatin-ChR2 colocalization in our mice. Furthermore, we report no evidence for somatic expression of ChR2 in the striatum. We do report somatic expression of ChR2 in a population of globus pallidus somata, and add text to describe the above data (figure 3 supplement) as well as published data identifying ChR2 in axons of the substantia nigra. Together, these data suggest that cortical expression of ChR2 is limited to non-GABAergic neurons, though they do not eliminate the possibility of a direct monosynaptic GABAergic input to the striatum from non-cortical (and extrastriatal) brain regions. We describe newly added experimental data below to address this possibility.

      We have added new data to directly test if the optogenetic stimulation protocol used in this study induces a monosynaptic GABAergic current in SPNs (figure 3 supplement). We report that an optogenetically-evoked monosynaptic GABAergic current is indeed detected in SPNs, though it is unlikely to affect our results or interpretations for two reasons. First, based on the newly added histological data, the source of this GABAergic current is non-cortical and extrastriatal. Second, and more importantly, this input is insensitive to mecamylamine (new data, figure 3 supplement) and as such would not be modulated by the key manipulations presented in this study. Finally, experiments described below – instructed by a suggestion made by Reviewer 2 (see below) – show that blocking glutamatergic synaptic activation of a class of striatal GINs eliminates the effect of mecamylamine on SPN spike latency, ruling out the involvement of a monosynaptic GABAergic input in mediating the phenomenon.

      2) Identification of the key GIN subclass that mediates the phenomenon. Our initial manuscript included data demonstrating the feasibility of PV-FSIs in participating in the phenomenon we described, but we agree with the Reviewer that we stopped well short of identifying the class of GINs that are actually involved. We have added two new data sets to the manuscript that now corroborate both the involvement and necessity of PV-FSIs in mediating this phenomenon. First, we have added data showing that striatal SOM+ interneurons respond to mecamylamine differently than PV-FSIs do: while mecamylamine hyperpolarizes PV-FSIs, it depolarizes the average membrane potential of SOM+ interneurons and has no effect on their spontaneous firing frequency, making them unlikely candidates to mediate the phenomenon we describe. Second, we have added data showing that pharmacologically preventing cortical activation of PV-FSIs both mimics and occludes the effect of mecamylamine on spike latency and ADP amplitude (new figure 8). This data also rules out the involvement of certain other classes of GINs, such as PLTS interneurons, as the pharmacological manipulation we performed (blockade of calcium-permeable GluA2-lacking AMPA receptors) does not affect their response to cortical inputs (Gittis et al., 2010).

      Reviewer #3 (Public Review):

      The manuscript by Matityahu et al. investigated the role of tonic activation of AChRs on the spike timing of striatal spiny projection neurons (SPNs) in acute striatal slices. By selective activation of corticostriatal projections using optogenetic tools (ChR2), they find that pharmacological blockade of presynaptic α7 nAChRs delays SPN spikes, whereas blockade of α4β2 nAChRs on GABAergic interneurons advances SPN spikes. The work is carefully done with proper control experiments, and the main conclusions are mostly well supported by data.

      Although they only constitute ~1% of the total striatal neurons in rodents and humans, cholinergic interneurons (ChINs) are gatekeepers of striatal circuitry because of their extensively arborized axons and varicosities which tonically release ACh. Whereas the role of muscarinic AChRs (mAChRs) in modulating striatal output has been well established, the role of nAChRs (especially the tonic activation) remains to be elucidated. The study is solid and the results are new and convincing. The data suggest that tonic activation of nAChRs may place a "brake" on SPN activity, and the lift of this brake during pauses of ChIN firing in response to salient stimuli may be critical for striatal information processing and learning. The findings from this study will enhance our understanding of the role of tonic nAChR activation in controlling SPNs and striatal output.

      We thank the reviewer for their careful reading of our manuscript and for their kind words and helpful suggestions.

      Unjustified Conclusions and Suggestions:

      1) The change of the SPN spike timing by AChR modulation is on a few milliseconds time scale. To make the current study more significant, the authors should design and perform additional experiments to demonstrate the functional consequence in controlling striatal output and learning. For example, will activation or blockade of nAChRs have effects on striatal STDP?

      We too would be thrilled to see the results of such experiments. Unfortunately our early attempts to perform such tests (e.g., crossing Thy1-ChR2 mice with ChAT-Cre mice to selectively express halorhodopsin in CINs, and combine cortical excitation with silencing of CINs) have been plagued by technical challenges, and would require time and resources that we feel are pragmatically beyond the scope of this study. That said, we’ve included new text (particularly, page 15) discussing how our results may fit with a newly published study on the role of CINs in corticostriatal LTP (Reynolds et al., 2022).

      2) Modulation of striatal circuitry is complex. The addition of a diagram illustrating the hypothesis and key results would help.

      Excellent suggestion. We have added a summary diagram, which is now figure 9.

    1. Rintze (December 5, 2011): With regard to broken translators, do the Zotero clients phone home any details on save failures? (there is a preference checkbox "Report broken site translators" which suggests they do) I don't mind fixing up a few more translators, but it would be nice to know which translators fail most often.

      ajlyon (December 5, 2011): It does phone home, but I'm afraid those reports are going into a black hole for now; I've noticed the requests in various logs, but I've never been notified of a failing translator by the Zotero team. It'd be great if the translator list / status page integrated explicit tests and such error reports.

      adamsmith (December 5, 2011): there is, of course, also a good number of translators who don't trigger any errors, because they don't detect.

      Rintze (December 5, 2011): Yes, but I would argue that non-detecting translators are less frustrating to users.

      dstillman (December 7, 2011): Here's a start: https://repo.zotero.org/errors The actual error reports aren't public for privacy reasons (and we're not displaying absolute numbers), but we can provide example error strings and URLs on request. We also might be able to have this automatically display error strings that show up across many reports (e.g., "TypeError: scisig is null" for Google Scholar), since short of major site breakages it will probably be hard to debug many of these without examples. Note that the Google Scholar results are greatly skewed by Retrieve Metadata attempts, and DOI is also showing mostly "could not find DOI" errors. I'm hoping detection can be tightened on those (e.g., to remove the folder icon on a Google Scholar search with no results), which would allow this to better show actual error frequency.

      ajlyon (December 7, 2011): I'll try to work on detection. Automatic display of common error strings would be very useful, as well as some general idea of how many errors we're talking about -- for something like ScienceDirect, are we talking about 10 errors? 100? 1000? Also, does this filter out data from clients with out-of-date translators or Zotero versions? Thanks for putting this up! It's sure to be useful in the coming weeks and years.

      Rintze (December 7, 2011): Like ajlyon, I think some indication of the number of errors per translator would be very useful. And could the list be expanded to show more than the top 10 translators (say the top 50)? Also, would it be possible to create somewhat comprehensive reports with, say, 10 error strings and URLs for each translator to send to ajlyon, adamsmith and me, so we don't have to submit individual requests per translator? I'd hope we have established ourselves as at least somewhat trustworthy (and I assume all three of us would be more than willing to sign any privacy agreement).

      ajlyon (December 8, 2011): Thanks for upping the number visible. What's going on with the outdated translators? There are people out there with three different ScienceDirects, two DOIs... Is that just people with updating off? Or something else?

      dstillman (December 8, 2011): OK, updated again with absolute numbers and per-error breakdowns. Hover over each segment for error details. I don't think any page data will make it into the errors, but to be safe I'm displaying only errors coming from at least three addresses that don't include the string "http" in them; the rest get lumped together at the end in blue. If you notice anything that shouldn't be in there, let me know. We might be able to display URLs that show up across enough addresses, though there may not be enough of those. "What's going on with the outdated translators?" Those are all <2.1.9. Not much we can do for those folks.
      • ABOUT property "Report broken translators"
    1. It is not just that trans women are not really women; even females who self-identify as women are not really women.

      I think that Barnes would not agree with this characterization. Barnes's idea is simply that there is no single group corresponding to the term "woman". Instead, there are multiple groups that may be the semantic value of "woman". Some of them are much more gerrymandered. I think the idea does not imply that no one is really a woman. Instead, the upshot is simply that when we consider whether one is really a woman, we must attend to the meaning of "woman".

    1. Author Response:

      We largely agree with the assessment of the Reviewers. Indeed, as noted by Reviewer #2, under the urgent conditions of our experiment, the onset of the cue modulates competing saccade plans that are already ongoing. The reviewer is correct in considering that the initial motor plans are endogenously generated, as they favor one location or the other based simply on the subject's internal bias or preference. We would just note that the endogenous signal that we focus on refers to a later modulation which, based on the perceived cue location and the task rules, directs the motor plans to the correct target location. According to our findings, this endogenous modulation occurs after the exogenous response and acts in the opposite way, boosting the anti-saccade plan and curtailing the activity that would otherwise trigger an erroneous pro-saccade. Thus, three things may happen in each trial: (1) initial, uninformed motor plans are endogenously generated, (2) the cue onset exogenously reinforces the plan toward the cue, and (3) an informed endogenous signal suppresses the plan toward the cue and boosts the plan toward the anti location. We think the novelty here is in being able to characterize these distinct events, which unfold within a few tens of milliseconds of each other.

      Reviewer #3 considered our conclusion that the exogenous response "is entirely insensitive to behavioral context" too strong, and that is a fair point. Conclusions apply to the degree that experimental conditions are valid in general, and furthermore, the deviations from the idealized predictions were small but not zero. However, we do not consider the assumption noted by the reviewer, that saccade-related neural activity ramps up before the saccade goal is known, as a weakness. We have, in fact, recorded such activity in several oculomotor areas using similar urgent-choice designs (Stanford et al., Nat Neurosci 13:379, 2010; Costello et al., J Neurosci 33:16394, 2013; Costello et al., J Neurophysiol 115:581, 2016; Scerra et al., Curr Biol 29:294, 2019; Seideman et al., bioRxiv, 2021, https://doi.org/10.1101/2021.02.16.431470), and the responses in the frontal eye field (FEF) in particular conform quite closely with those assumed by the model (Stanford et al., Nat Neurosci 13:379, 2010; Costello et al., 2013; Salinas et al., Front Comput Neurosci 4:153, 2010). Rather than a potential liability, we think the early ramping activity is a key constraint for any model of urgent choice performance.

    1. Author Response

      Reviewer #1 (Public Review):

      In this article, Miettinen and colleagues exploit the suspended microchannel resonator developed in their lab and optimize the method to be able to record single live mammalian cells for very long periods of time, across several cell division cycles, while performing a double measure of their buoyant mass in media of different densities (H2O and D2O). Because water exchanges fast enough inside the cell, it allows them to define a dry mass and a dry volume, and thus a density of dry material for single cells along the entire cell division cycle. These measures lead them to confirm and clarify some points from previous studies from their lab and others, such as exponential growth also in dry mass and the fact that buoyant mass and this new dry mass are the same thing in interphase cells. They then find that this is not true during mitosis, mostly because dry mass density increases in early mitosis (dry mass decreases and dry volume decreases even more, suggesting that there is a loss of material of density lower than the average dry mass density). The authors rule out a number of potential mechanisms and give evidence for a role of exocytosis, more precisely exocytosis of lysosomal content. Blocking this phenomenon prevents the change in dry mass density but does not affect cell division. They propose some potential function for this phenomenon, including the interesting hypothesis that this helps cleaning the lysosomal content which might contain some toxic components, so that daughter cells are born with 'clean' lysosomes. Cool idea! It is also quite amazing that the precision of their method allows them to detect this event.

      The main question I have concerns the definition of dry mass and dry volume. The authors should discuss in more detail what it represents physically. Technically, this is defined by their equation 1, which relates their measure of buoyant mass to a dry mass and a volume of water as parameters to fit from the buoyant mass data. One gets to this equation by writing the definition of buoyant mass as the mass of the cell minus the mass of the equivalent volume of the surrounding medium. But then, to get what the authors find, one has to write that the cell mass is the sum of the dry mass and the mass of water contained in the cell (which makes the dry mass easy to understand) and then to write that the cell volume is the sum of a volume of water and of a volume of dry material. This then defines a dry volume, as the difference between the volume of the cell and the volume of the water contained in the cell (which is the parameter Vwater in the equation 1). At least this is how I got to this equation. The question I asked myself then is: what is this dry volume? Is it really the volume occupied by the dry mass in the cell? This is probably not the case, since dry mass is solvated in the cell. One can estimate this solvated volume using the van't Hoff/Ponder relation, which can be found changing the osmolarity of the external medium. It defines an excluded volume, which is the total volume excluded by macromolecules (like for a van der Waals gas) - it is usually between 25 and 30% of the cell volume. This volume contains the dry mass plus a certain fraction of the water, so it is not exactly the dry mass volume as defined here by the authors. I am worried that this dry mass volume, which is mathematically defined here and calculated from the fit of the equation, is not a standard physical quantity and so it is not easy to relate it to standard biophysical theories (e.g. equations of state), and its behavior could be very unintuitive even for simple systems. This makes the variation in this quantity not easy to interpret, and thus also the variation in dry mass density is not easy to interpret in physical terms.

      That being said, it is still clear that whatever this is, it changes in early mitosis, and it seems to be related to exocytosis, so I am not saying that the authors are wrong here. They potentially indeed detect this increase of exocytosis. But they should discuss more what they think this quantity is, either in the methods or in the discussion of the article. In particular, the sentence at the bottom of page 5, line 104, is not clear ('We are not aware of any other single cell methods capable of quantifying this biophysical feature of a cell'), since this measure is not really clearly a biophysical feature of a cell, but is defined a bit artificially from the equation which defines the dry mass volume from the measures of buoyant mass.

      Thank you for the detailed and very constructive feedback. As stated above in the Essential Revisions section, we have now clarified the terminology we use and made the terminology more consistent with existing literature. We have also better defined the concept behind our method. Our updated Measurement Method section now states (page 3) that: “In our approach, we consider the buoyant mass of a cell to be dependent on two distinct physical “sections” of the cell, the dry content and the water content. To measure the cell’s dry content independently of the water content, we measure the cell’s buoyant mass in H2O and D2O-based solutions. Under these conditions, the influence of the water content on buoyant mass can be excluded, because the intracellular water is exchanged with extracellular water, making the intracellular water content neutrally buoyant with extracellular solution. This allows us to detect the cell’s dry mass (i.e. total mass – water mass), dry volume (i.e. total volume – water volume) and dry mass density (i.e. dry mass / dry volume).”
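      The two-fluid idea described above can be sketched numerically. The sketch assumes complete water exchange, so that intracellular water contributes nothing to buoyant mass and each measurement reduces to buoyant mass = dry mass - fluid density × dry volume. The function name, argument names and units (pg, fL, g/mL) are illustrative assumptions, and this is not necessarily the exact form of equation 1 in the manuscript.

```python
def dry_mass_and_volume(mb_h2o, mb_d2o, rho_h2o, rho_d2o):
    """Solve two buoyant-mass measurements for dry mass and dry volume.

    Assumed model (intracellular water fully exchanged, hence neutrally buoyant):
        mb = dry_mass - rho_fluid * dry_volume
    With mass in pg and volume in fL, densities are in pg/fL (equal to g/mL).
    """
    # Subtracting the two measurements eliminates dry_mass.
    dry_volume = (mb_h2o - mb_d2o) / (rho_d2o - rho_h2o)
    # Substitute back into the H2O measurement to recover dry_mass.
    dry_mass = mb_h2o + rho_h2o * dry_volume
    return dry_mass, dry_volume, dry_mass / dry_volume


# Hypothetical cell used only to exercise the function.
true_mass, true_volume = 100.0, 70.0   # pg, fL (placeholder values)
rho_light, rho_heavy = 1.0, 1.1        # g/mL (placeholder fluid densities)
mb_light = true_mass - rho_light * true_volume
mb_heavy = true_mass - rho_heavy * true_volume
print(dry_mass_and_volume(mb_light, mb_heavy, rho_light, rho_heavy))
```

      Under this model, any water that is not exchanged would be counted as part of the dry content, which is why the completeness of the exchange matters for interpreting dry volume.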

      The reviewer is also correct that our method measures a dry volume which is, by our model’s definition, the volume occupied by the dry mass independently of water. In other words, our method and measurement model assume that the intracellular water exchange is 100% complete. The reviewer is correct that some water may be retained, and we cannot directly measure the amount of H2O left inside the cell after immersion in D2O-based media. However, our results indicate that our dry volume measurements are not limited by the water exchange time that the cell experiences (Figure 1–figure supplement 2). In other words, in our measurements, cells exchange all the water they can exchange, be that 100% or 98%. This is further supported by our new estimations of the time needed to transport all water in and out of the cell (see above, other comments section #1, and our updated manuscript page 5). Note that, as our method only exchanges H2O to D2O instead of removing all water from the cell, dry mass will always remain solvated in either H2O or D2O, which makes it plausible that 100% of the water content is exchanged.

      As the reviewer keenly points out, our measured dry volume is biophysically distinct from the more classically measured excluded cell volume (or dehydrated cell volume), which still includes some water in the excluded cell volume quantifications. Consistently, our method measures dry volumes that are smaller (~15%) than what the excluded volumes typically are (~25-30%). We do not consider this a limitation of our method, but rather an opportunity for new measurements. That being said, we completely agree with the reviewer that this may cause confusion among readers. To address this point, our Measurement Method section now states (page 4) that: “Importantly, our approach assumes that all water within the cell is exchangeable between H2O and D2O. Accordingly, our dry volume measurement is distinct from the excluded cell volume detected by measuring cell volume following strong hyperosmotic shocks, which does not remove all water from the intracellular space.”

      Finally, we have also changed the sentence “We are not aware of any other single cell methods capable of quantifying this biophysical feature of a cell” (page 5) so that it only refers to a metric, which hasn’t been quantified before on a single-cell level. We believe that this minor change will avoid the suggestion that dry volume is of biophysical importance on its own.

      Reviewer #2 (Public Review):

      The new suspended microchannel resonator (SMR)-based method described in this paper enables high precision and high temporal resolution single-cell measurements of key physical properties: cell dry mass and the density of cell dry mass, which depends on the macromolecular composition of the cell. The validity of the method is rigorously tested with several convincing control experiments. This method will be useful for future studies investigating cell size and growth regulation and the coordination of mass, volume and density in animal cells.

      Using their method, the authors report two important results. First, they confirm that buoyant mass measurement is a valid proxy for cell mass in interphase, an important finding given that SMR measurements have been one of the best and most productive approaches to investigating cell mass growth regulation. Second, they provide evidence that some cell types lose dry mass during metaphase by a mechanism that involves exocytosis, emphasizing how mass, volume, and density dynamics are more complex than during the rest of the cell cycle.

      While this paper presents very interesting results, it would benefit significantly from two main improvements. First, the different physical variables studied here (dry mass, dry density, dry mass density, dry volume) should be better defined, and the terminology revised to provide a more straightforward and intuitive description of their biological meaning. Several sections of the paper (especially the introduction and the discussion of Fig. 2-4) should be re-written to help the reader understand the message. Second, some of the drug treatments require more replicates to provide more conclusive answers.

      Thank you for this constructive feedback. As stated above in the Essential Revisions section, we have now changed our terminology to increase clarity. Our new density measurement in this manuscript (dry mass divided by dry volume) is now defined as ‘dry mass density’. This change has been applied throughout our manuscript, including our manuscript title. In addition, we have added clearer definitions of each term to our Introduction and Measurement Method sections. Furthermore, we have minimized the use of the term ‘dry composition’ throughout our manuscript, as we now realize this may cause confusion to some readers.

      More specifically, our introduction (page 3) now states: “Here, we introduce a new approach for monitoring single cell’s dry mass (i.e. total mass – water mass), dry volume (i.e. total volume – water volume), and density of the dry mass (i.e. dry mass / dry volume), which we will refer to as dry mass density.” These definitions are also repeated in our Measurement Method section (page 4), as many readers may look for the definitions in that section. We have also done many other minor modifications to our main text throughout the manuscript to help the readers understand our message.

      In addition, as detailed above in the Essential Revisions section 3, we have adjusted the writing of our manuscript to avoid overly strong claims where our replicate numbers are insufficient. More specifically, we now avoid conclusions where we claim that inhibition of cytokinesis has no influence on dry mass and dry mass density changes in mitosis.

      Reviewer #3 (Public Review):

      In this manuscript, the authors extend the Manalis lab's vibrating cantilever approach by adding the ability to rapidly exchange media with heavy water. This allows the authors to measure dry mass and its density in growing and proliferating cells. This resolves a previous discrepancy between the cantilever approach and quantitative phase imaging and shows that cells in early mitosis likely increase lysosomal exocytosis. This is an interesting piece of work.

      The authors report that: "On average, the FUCCI L1210 cells lost ~4% of dry mass and increased dry density by ~2.5%, and these changes took place in approximately 15 minutes (Figure 3C). In extreme cases, cells lost ~8% of their dry mass while increasing dry density by ~4%". Although these changes may sound small, I believe they would require significant changes to the cell composition. I.e., to increase the overall dry mass density by 4% while losing 8% of the cell's dry mass, the cell would need to lose almost exclusively low-density components, which may not be typical for exocytosis. Moreover, even if all of those lost 8% of cell dry mass are exclusively lipids (or other low-density components), it is not intuitively obvious that such a loss would be sufficient to cause a 4% change to the dry density. To make this more convincing, the authors should provide a simple mathematical model that would roughly estimate how the cell composition (e.g., the contents of lipids vs proteins) needs to change and what the composition of the lost (secreted) components needs to be to provide the observed changes to the dry mass and density, given the existing information on average cell composition and the densities of different biomolecules (lipids, sugars, proteins, etc).

      Thank you for this comment. The reviewer is correct that significant changes to the cell composition are needed to explain the phenotypes we observe. As stated above in the Essential Revisions section, we fully agree that such calculations could be very useful in interpreting our results. Our manuscript now contains a new paragraph (discussion section, page 13), where we state: “The magnitude of dry mass density increase in mitosis was large. We have previously observed similar magnitude changes in dry mass density when perturbing proliferation in mammalian cell (Feijo Delgado et al., 2013). To provide some rough estimates of what kind of compositional changes would be required to achieve the dry mass loss and dry mass density increase, we carried out a back-of-the-envelope calculations. Assuming a typical mammalian cell composition and typical macromolecule dry mass densities (Alberts, 2008; Feijo Delgado et al., 2013), we calculated the degree of lipid loss needed to increase dry mass density by 2.5%. This suggested that cells would have to secrete ~1/3 of their lipid content in early mitosis. This could be achieved via lysosomal exocytosis of lipids. Lipid droplets, the main lipid storages inside cells, are frequently trafficked into and degraded in lysosomes (Singh et al., 2009), and lipid droplets can also be secreted via lysosomal exocytosis (Minami et al., 2022). However, it seems likely that the mitotic dry mass density increase also involves secretion of other low dry mass density components (e.g. lipoproteins, specific metabolites) and/or a minor, transient increase in high dry mass density components (e.g. RNAs, specific proteins) in early mitosis. Indeed, CDK1 activity has been suggested to drive a transient increase in protein and RNA content in early mitosis (Asfaha et al., 2022; Clemm von Hohenberg et al., 2022; Miettinen et al., 2019; Shuda et al., 2015).”
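      A back-of-the-envelope estimate of this kind can be sketched as follows. The sketch treats dry mass density as total dry mass divided by total dry volume and asks what share of a low-density pool must be lost to raise that density by a given fraction. The two-component composition and the densities below are placeholder assumptions chosen for illustration, not the values used in the manuscript, so the output is not expected to match the ~1/3 figure quoted above.

```python
def dry_mass_density(masses, densities):
    """Total dry mass divided by total dry volume for a mix of components."""
    total_mass = sum(masses.values())
    total_volume = sum(masses[name] / densities[name] for name in masses)
    return total_mass / total_volume


def lipid_loss_for_density_gain(masses, densities, gain, lipid="lipid"):
    """Return (fraction of lipid pool lost, fraction of dry mass lost)
    needed to raise dry mass density by `gain` when only lipid is removed."""
    total_mass = sum(masses.values())
    rho_start = dry_mass_density(masses, densities)
    rho_target = (1.0 + gain) * rho_start
    # Removing mass m at density rho_lipid gives
    #   (M - m) / (V - m / rho_lipid) = rho_target, with V = M / rho_start,
    # which rearranges to m = gain * M / (rho_target / rho_lipid - 1).
    lost_mass = gain * total_mass / (rho_target / densities[lipid] - 1.0)
    return lost_mass / masses[lipid], lost_mass / total_mass


# Placeholder composition (mass fractions) and densities in g/mL: assumptions only.
example_masses = {"lipid": 0.15, "non_lipid": 0.85}
example_densities = {"lipid": 1.0, "non_lipid": 1.4}
lipid_fraction, mass_fraction = lipid_loss_for_density_gain(
    example_masses, example_densities, gain=0.025
)
print(f"lipid pool lost: {lipid_fraction:.2f}, dry mass lost: {mass_fraction:.3f}")
```

      The estimate is sensitive to the assumed lipid share and to the density contrast between lipid and the remaining dry material, which is consistent with the authors' caveat that other low-density components are probably involved as well.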

    1. The illustrations below (pp. 224 ff.) show the course of the reaction time in hysterical individuals. The light cross-hatched columns denote the locations where the test person was unable to react (so-called failures). The first thing that strikes us is the fact that many test persons show a marked prolongation of the reaction time. This would make us think at first of intellectual difficulties, - wrongly, however, as we are often dealing with very intelligent persons of fluent speech. The explanation lies rather in the emotions.

      This makes sense. Some words may lead a person to recall a certain incident, time, or place, which slows an otherwise quick response. They are distracted and taken back to the thought that is associated with that word.

    1. Author Response:

      Reviewer #1:

      This study reports on the inference of the evolutionary trajectory of two specialist species that evolved from one generalist species. The process of speciation is explained as an adaptive process and the changing genetic architecture of the process is analyzed in great detail. The genomic dataset is big and the inference from it solid. The authors reach the conclusion that introgression and de novo mutations, but not standing genetic variation, are the main players in this adaptive process.

      I would avoid the term adaptive radiation for the group of fish studied here. It is misleading. It is generally accepted to use the term adaptive radiation when a fairly large number of new species originate from a common ancestor (cichlids in big African lakes, gammarids in Lake Baikal, etc). Here are only 2 new lines that evolved from a common ancestor. Furthermore, I do not see much parallel between the ideas and concepts used when people study real adaptive radiations and the one studied here. I actually believe that the term adaptive radiation even distracts from the beauty of the current study.

      We would like to acknowledge that the usage of the term “adaptive radiation” has a long, rich history of debate in the literature over how it should be applied to empirical systems. Some example definitions of adaptive radiation are listed below:

      1) “The evolution of ecological and phenotypic diversity within a rapidly multiplying lineage” - Schluter, 2001 (The ecology of adaptive radiation). This definition implies that abundant ecological and morphological diversity that arose in a single lineage over a short time are the hallmarks of adaptive radiation and has been frequently applied to stickleback species pairs. The pupfishes of San Salvador Island meet these criteria (two trophic specialists arose from a generalist ancestor within 10,000 years). Importantly, please note that in this foundational textbook on adaptive radiation, no statement is made about the number of species necessary to be considered an adaptive radiation.

      2) “The evolutionary divergence of members of a clade to adapt to the environment in a variety of different ways.” – Losos, 2009 (Lizards in an evolutionary tree: Ecology and adaptive radiation of Anoles). Here again, the pupfish system described meets the definition. Unlike the previous definition, no statement about the rate of diversification (species or morphological/ecological) is made.

      3) “The rise of a diversity of ecological roles and attendant adaptations in different species within a lineage” – Givnish, 1997 (Adaptive plant evolution on islands: classical patterns, molecular data, new insights. Evolution on islands). As with the previous definition, no qualification is made with respect to rates of diversification. The pupfishes again meet the definition.

      As discussed by Givnish in 2015 (“Adaptive radiation versus ‘radiation’ and ‘explosive diversification’: why conceptual distinctions are fundamental to understanding evolution” – New Phytologist), few of the early definitions of adaptive radiations contained any reference to the rapidity of speciation – Simpson (1953) perhaps being the only notable exception. However, despite this, no definition states that the application of “adaptive radiation” to a given system is contingent upon a given number of species having arisen by the present day.

      The pupfishes of San Salvador Island meet all definitions of adaptive radiation – exceptional rates of morphological diversification and ecological diversification, as well as truly exceptional rates of speciation – focusing just on the three species here, two species have arisen within the last 10,000 years – this roughly translates to a speciation rate of 200 species per million years. While this pace is highly unlikely to be maintained, we feel that every line of evidence points towards the pupfishes of San Salvador Island as an adaptive radiation at the earliest stages of the process. We disagree that an adaptive radiation must be ‘complete’ or nearly so, for it to be deemed as such.

      Finally, we have also discovered a fourth pupfish species on the island (Richards and Martin 2016; Richards et al. 2021), and even more undiscovered species may exist there. Thus, this is an adaptive radiation of four sympatric species, not two as suggested.

      The "Result and discussion" section has rather little discussion. There is not much about other systems or studies, neither in concepts nor in biology. The results are not linked to the bigger questions and the larger field. The same is true for the conclusion, which is very strongly centered on the here reported study. What can we learn from this study for other systems? Is there a generalizable take-home message? How do the findings relate to commonly held ideas/theory on how adaptive speciation works? Without this, it reads like a report of a case study, disconnected from the larger field. To achieve this aim, it may be good to split the main section into a result and a discussion section, but this is only a suggestion.

      We followed this helpful suggestion and have split the results and discussion section and significantly expanded and revised our discussion section. We now relate our findings to the broader fitness landscape theory literature and emphasize how our findings inform the process of speciation. We conclude by emphasizing that our findings point to a process in which adaptive introgression and de novo mutation not only provide diversity that is useful in reaching novel fitness peaks on a static landscape but alter the shape of the landscape itself.

      Reviewer #2:

      This is a really interesting and challenging question the authors are addressing here. I enjoyed reading the manuscript and a few comments below:

      One major concern I have concerns the analysis of the two treatments (low and high density, l411). I believe that the two treatments should be analyzed separately as the authors are estimating two different fitness landscapes. When conducting their analysis, experiment is treated as a single factor. Yet, in Martin and Wainwright (2013), it was established that the fitness landscapes were quite different between the two treatments (Figure S7 of said paper), meaning that different phenotypes (and therefore genotypes) were affected differently. I do not think that the complex effect described there can be captured by a single factor as done here.

      We examined this concern further and now include new analyses of only data from the second field experiment to address these concerns (described in more detail below), resulting in qualitatively similar conclusions to those conducted using all samples.

      Please also note that only the high-density treatments from the 2013 study were included in the current study due to the low sample sizes of the original low-density treatments. In the 2020 fitness landscape study, we found no evidence of a treatment effect (frequency-manipulation) on the curvature of the fitness landscape. In all our analyses, we do include the effect of lake accounting for environmental differences between lake replicates.

      While the two high-density treatments in Martin and Wainwright 2013 were analyzed and visualized in some cases as distinct adaptive landscapes as pointed out by the reviewer, many aspects of stabilizing and disruptive selection were comparable between the lake environments and detected in similar regions of morphospace as described in Table 1 in that paper. All statistical analyses of the second field experiment (e.g. Figure 5A of Martin & Gould 2020 Evol. Letters) indicated no effect of the frequency treatment between the two field enclosures in each lake; accounting for treatment did not improve model fit to the data. In the second field experiment, the authors found that the two frequency treatments in each lake could in fact be summarized by a single fitness landscape accounting for lake-specific effects, which was the best-fitting GAM model. This surface bore remarkable similarities to the high-density fitness surfaces of the 2013 study in the placement of fitness peaks and valleys on the morphospace (Martin and Gould 2020). Thus, we tend to view the fitness landscape of interest to us as a single landscape connecting the fitness of different species phenotypes while treating lake-specific environmental effects on this landscape as background noise.

      Unfortunately, we do not have sufficient resequenced samples to analyze data from the first experiment alone (Martin and Wainwright 2013); fewer than half of our samples come from the 2013 study – the remainder come from the second field experiment. Therefore, we now include a second set of analyses focused on just the subset of resequenced fish from the second field experiment (Figure 5—figure supplement 1-2, Appendix 1—table 18-19). Our primary goal was to assess whether our major findings held within a single field experiment by focusing on the latter, more data-rich experiment.

      Because we believe the most significant analyses from our paper are those pertaining to genotypic fitness landscapes and accessibility, using the subset of data from the second field experiment we performed 1) analyses of models fit between ancestry proportion and fitness (i.e. Figure 1—figure supplement 3), and 2) analyses estimating accessibility between generalists and either trophic specialist (reported in Appendix 1—table 19).

      Overall, we found qualitatively similar results between analyses conducted using either all samples or only those in the second experiment. As a result, we report results for all samples in the main text while referencing the analyses of the second field experiment alone which are presented in the supplementary material.

      A second major concern I have is in the use of the Admixture software (Figure 1 and l152). The generalist type is assumed to be the ancestral type. Yet, a unique group was not assigned to it. This is a known problem for Admixture (Lawson et al. 2018). Groups that are under-sampled are far more likely to be considered a mixture of different ancestry groups even when this is impossible (Rasmussen et al 2010, Skoglund et al 2012). While this in itself is not problematic, I am concerned about the use the authors are making of these ancestry proportions (l 156-165). The authors analyzed how ancestry of scale eater or molluscivore affect survival probability, growth, or the hybrid composite fitness. However, the ancestry values are partly generated due to an artefact, so I wonder how modelling the ancestral type as a group, and therefore acknowledging some amount of shared ancestry between the three species, may further affect this analysis.

      We agree that the ancestries estimated for the generalists by our unsupervised admixture analyses appear to be confounded and we briefly allude to this in the text. In our original submission, we focused exclusively on molluscivore and scale-eater ancestry, which appear less biased by this artifact. To address this concern, we ran new admixture analyses using a supervised analysis, a priori assigning generalists, molluscivores, and scale-eaters to one of three populations. Ancestry proportions of hybrids were then inferred for each of three clusters. We now include new analyses of fitness by ancestry associations using these admixture proportions and found qualitatively similar results. We report these new analyses in the results and supplemental material.

      We also conducted analyses using only samples from the second field experiment (related to the first concern raised by the reviewer). In all, we now include the following analyses of the extent to which the three fitness measures are associated with each of the three ancestry proportions using:

      1) an unsupervised admixture analysis (Appendix 1—table 2), 2) all samples using a supervised admixture analysis (i.e. model is informed a priori which samples are known to belong to either of the three assumed populations/parental species: Appendix 1—table 3), 3) only samples from the second field experiment (Martin & Gould 2020) in which lake was not found to significantly affect fitness using an unsupervised analysis (Appendix 1—table 4).

      Importantly, results are qualitatively the same; ancestry proportions do not strongly influence fitness in this system. There is one exception – generalist ancestry appears to positively predict growth when modeled using all samples and the supervised admixture analysis (Appendix 1—table 3). However, the inconsistency of this result across the three analyses leads us to interpret this exception cautiously.

      I understand the need to use subsets of a network, due to impossibly large dimension size of the network in the first place. However, subsetting said network may give the wrong impression of the whole network (Fragata et al 2019). I wish this point was further discussed here.

      We have followed this suggestion. In our now-expanded and significantly revised discussion, we include discussion of this limitation, citing Fragata et al (2019) as well as related works. We also discuss how estimation of combinatorially complete fitness landscapes may be misleading, as their topography is determined in part by epistasis that occurs among loci that are not segregating in natural populations. We also suggest that the ‘realized epistasis’ that occurs among only those loci that are naturally segregating in a population may be why the shape of the fitness landscape, and thus accessibility of fitness peaks, changes upon the appearance of adaptive introgression and de novo mutations.

      L 294-295: I wonder whether the results here could be used to discuss the geometry of the different fitness peaks. The small number of steps within molluscivores suggest a rather narrow peak, while the rather large ones within the generalist suggest a rather flat fitness peak. The shape of the peak can be linked to the amount of genetic variation that can be maintained within populations, as well as the mutational load of said populations.

      This is an excellent suggestion and led us to consider the ruggedness of our fitness landscapes as an additional factor affecting evolutionary accessibility. We now interrogate the geometry of the fitness landscape further, asking for each specialist, how many local peaks exist on their respective landscapes (i.e. the ruggedness), how far specialists are from these peaks, and how accessible these peaks are to specialists. We elaborate on these findings in the discussion as recommended.

      These expanded analyses further led us to similarly investigate the influence of each source of genetic variation on the ruggedness of the fitness landscape. Consequently, we now discuss in more detail the interplay between fitness landscape ruggedness and accessibility of interspecific genotypic paths, in the context of what sources of genetic variation are available. We show that the presence of adaptive introgression and de novo mutations both increase the accessibility of interspecific genotypic paths, while decreasing fitness landscape ruggedness. We now discuss how this finding makes sense in light of epistasis; changes to the pool of segregating genetic variation alter the ‘realized epistasis’ in natural populations, thus altering the shape of the fitness landscapes and ultimately the evolutionary outcomes favored by natural selection.
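      One simple way to make "ruggedness" concrete is to count local peaks on a genotype network: genotypes whose fitness exceeds that of every neighbor one mutational step away. The sketch below uses a hypothetical two-locus network with made-up fitness values; it illustrates the concept only and is not the authors' analysis pipeline.

```python
def local_peaks(fitness, edges):
    """Return genotypes that are fitter than all of their one-step neighbors."""
    neighbors = {genotype: set() for genotype in fitness}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Note: a genotype with no neighbors counts as a peak under this definition.
    return sorted(
        genotype
        for genotype in fitness
        if all(fitness[genotype] > fitness[other] for other in neighbors[genotype])
    )


# Hypothetical two-locus, two-allele example with sign epistasis.
example_fitness = {"00": 1.0, "01": 0.8, "10": 0.7, "11": 1.3}
example_edges = [("00", "01"), ("00", "10"), ("01", "11"), ("10", "11")]
print(local_peaks(example_fitness, example_edges))
```

      In this toy network both single mutants are less fit than either end point, so the landscape has two peaks; removing or adding alleles changes which edges exist and can therefore change the peak count.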

      L74-75 I would suggest being more cautious in the phrasing here. While this is true within Fisher's geometric model, where populations are assumed monomorphic and infinite, this is not true in general. Deleterious mutations can fix within populations, especially when drift is non negligible. Crossing fitness valleys has been quite widely investigated (see Weissman et al 2010 for example). Even the authors themselves mention it later (l 108).

      We tempered these statements as recommended and expanded our references to include Weissman et al. 2010 and additional references describing these caveats.

      Lastly, I would be more cautious about the conclusion. In lines 373-374, the authors mention that "de novo mutations may enable the crossing of a large fitness valley". Given that the authors focus only on adaptive walks (fitness always has to increase between each mutational step), there is no crossing of fitness valleys. Switching from one fitness peak to another is simply a matter of walking along a (very) narrow ridge.

      We revised our language as recommended, emphasizing that our results support an interpretation in which apparent phenotypic fitness valleys are crossed along narrow fitness ridges, which are not observed in a three dimensional morphospace, to reach new fitness optima.
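      The distinction between crossing a valley and walking along a ridge can be illustrated with a small path search. The sketch enumerates paths between two genotypes along which fitness strictly increases at every mutational step, which is the adaptive-walk condition the reviewer describes. The network, genotype labels and fitness values are hypothetical, and the authors' own accessibility criterion may differ in detail.

```python
def accessible_paths(fitness, edges, start, goal):
    """List all paths from start to goal with strictly increasing fitness."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    paths = []

    def walk(node, path):
        if node == goal:
            paths.append(path)
            return
        # Strictly increasing fitness means a node can never be revisited,
        # so the recursion always terminates.
        for nxt in sorted(neighbors.get(node, ())):
            if fitness[nxt] > fitness[node]:
                walk(nxt, path + [nxt])

    walk(start, [start])
    return paths


# Hypothetical two-locus example: one single mutant is deleterious, one is not.
example_fitness = {"00": 1.0, "01": 0.8, "10": 1.1, "11": 1.3}
example_edges = [("00", "01"), ("00", "10"), ("01", "11"), ("10", "11")]
print(accessible_paths(example_fitness, example_edges, "00", "11"))
```

      Here only the route through "10" is accessible: the walk never passes through a less fit intermediate, so it follows a ridge rather than crossing the valley at "01".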

      Reviewer #3:

      This paper uses sophisticated regression methods and numerical experiments to produce a genotype-fitness relationship for three closely related sympatric pupfish species, forming an adaptive radiation. In addition to providing insights into the genetic targets of selection, this paper goes further in attempting to tease out what types of genetic variation were most likely to have played key roles in this radiation.

      Strengths:

      The idea behind this study is excellent, and clearly a large amount of thought and effort went into collecting the underlying data. The attention paid to linking evolutionary dynamics with the fitness results is laudable. The system is extremely exciting and I think an experiment and analysis of this sort could potentially be interesting to a broad audience within evolutionary biology.

      Weaknesses:

      The claim that this is the first genotypic fitness network in a vertebrate needs additional qualifiers: as far as I can tell, the claim to novelty is based on the inclusion of multiple species, the number of alleles, and measuring fitness in the field. I can't fully assess this claim but I would urge the authors to avoid staking a stronger claim to priority than is really needed, as it might be a lightning rod for criticism and hair-splitting that would distract from the contents of the paper.

      We tempered this claim as suggested, removing it from the title, and de-emphasizing or removing this claim elsewhere throughout the manuscript.

      One of my major questions while reading this was whether these three species were better or worse adapted to subenvironments within the lakes. This is partially answered in a few places in the manuscript, but I think that resolving this point more precisely would help interpret if positioning all three species on the same fitness landscape is fair.

      We have added more description and discussion of the ecological differences between species to the manuscript, particularly of their habitats within the lake. We now point out that all three species coexist within the benthic littoral zone of each lake. No habitat segregation among these species has been observed in 13 years of field studies, suggesting that it is reasonable to position all three species within the same fitness landscape. Their foraging also occurs within the same benthic microhabitat throughout each lake; indeed, the scale-eaters target their generalist neighbors for scale attacks. This thinking also underlies much of the theory of speciation and adaptive radiation. We now include these qualifiers in the text as well.

      I find it a little hard to follow the construction of the landscapes in Fig. 2 B and C. I am not clear why the landscapes don't cover the location of the molluscivore population.

      We now include a brief statement that estimated values of fitness are only plotted for samples within the observed morphospace in the hybrids. That is, because none of the hybrid phenotypes were morphologically similar to the most divergent molluscivore phenotypes, we could not measure fitness values for this region of morphospace. However, there were hybrid phenotypes that fell within the 95% confidence ellipse of the lab-reared molluscivore population, suggesting that we have good power to detect adaptive walks to this region of the morphospace.

      I think the fitnesses predicted for the main bulk of the generalists and scale-eaters are the same across the two landscapes (as I expect they would be), but this is obscured by the differing fitness ranges of the two landscapes. I would suggest using a single color-fitness relationship for the two panels to aid cross-comparison.

      We re-plotted these landscapes using a uniform color scheme across panels as recommended.

      Also, two salient features of the landscape (the major peak at the top center and the deep pit at the bottom center) seem to be supported by few fish in each case. I would imagine that something like bootstrapping could be done for fitness landscapes, where the support for each feature of the landscape could be judged by how often it appears in subsets of the data (or in inferred models with nearly as high support as the best model), but I acknowledge that might be very hard to do. Still, I think some statement of uncertainty should be prominently included.

      We followed this suggestion and now more explicitly address uncertainty in our estimation of three-dimensional fitness landscapes, with particular focus on the landscape we devote the most attention to (Fig. 2c-d: composite fitness + genotypes).

      To quantify uncertainty, we conducted a bootstrap procedure as suggested in which we resampled hybrids with replacement, re-estimated the fitness landscape, and compared the topology of the predicted fitness landscapes to that of the observed fitness landscape (Figure 2—figure supplement 7). Even across the bootstrap replicates, we still recovered the same general features – a peak localized near generalists, a fitness valley near scale-eaters, and a fitness ridge/modest peak near molluscivores.
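
      The resampling logic described above can be sketched as follows. This is only an illustrative sketch, not the authors' code: the per-region mean in `fit_landscape` is a stand-in assumption for their fitness-landscape model, and the `hybrids` data, region labels, and function names are hypothetical.

```python
import random

def fit_landscape(samples):
    """Stand-in landscape model (assumption): mean fitness per morphospace region."""
    sums, counts = {}, {}
    for region, fitness in samples:
        sums[region] = sums.get(region, 0.0) + fitness
        counts[region] = counts.get(region, 0) + 1
    return {r: sums[r] / counts[r] for r in sums}

def peak_support(samples, n_boot=1000, seed=0):
    """Return the observed peak region and the fraction of bootstrap
    replicates in which the same region is again the fitness peak."""
    rng = random.Random(seed)
    observed = fit_landscape(samples)
    observed_peak = max(observed, key=observed.get)
    hits = 0
    for _ in range(n_boot):
        # Resample individuals with replacement, keeping the sample size fixed.
        resample = rng.choices(samples, k=len(samples))
        landscape = fit_landscape(resample)
        if max(landscape, key=landscape.get) == observed_peak:
            hits += 1
    return observed_peak, hits / n_boot

# Hypothetical (region, fitness) records for a handful of hybrids.
hybrids = [
    ("near_generalist", 0.90), ("near_generalist", 0.80), ("near_generalist", 0.85),
    ("near_scale_eater", 0.20), ("near_scale_eater", 0.30),
    ("near_molluscivore", 0.60), ("near_molluscivore", 0.50),
]
peak, support = peak_support(hybrids, n_boot=200)
```

      A feature that recurs in most replicates (support close to 1) is well supported by the data; a feature that appears in only a few replicates is not.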

      Furthermore, we emphasize more strongly in the revised manuscript our point that three-dimensional representations of the fitness landscape may in fact mislead interpretations of how evolution proceeds. In that respect, even though we recover the same features of the landscape when accounting for uncertainty, we articulate that these inferred peaks and valleys separating populations may be bridged in multidimensional genotype space.

      More generally, the landscapes reconstructed in Fig. 2 do not show very clear evidence that the M or S types are separated by valleys from the G type. Close inspection of the figure suggests a very shallow valley might be present between G and M, but the overall trend is declining fitness; between G and S, fitness appears to simply decline. While peaks may occur within the landscapes composed of limited sets of loci, the overall pattern seen in Fig. 2 doesn't seem conducive to analyzing how adaptive evolution in generalists crossed valleys to reach the putatively higher peaks of the two specialists. As such, I find the connection between these phenotypic-fitness landscapes and the later genotypic fitness landscapes quite confusing.

      We thank the reviewer for this comment. The apparent disconnect noted by the reviewer is in fact a point that we would like to draw more attention to. Thus, we have revised much of the discussion of these results to address this.

      As discussed in our response to the reviewer’s previous comment, the three dimensional landscape contrasts with our inferences from genotypic fitness landscapes. This incongruence demonstrates, through example, how three-dimensional fitness landscapes may in fact mislead our intuition about how evolution proceeds.

      As has been discussed extensively in the fitness landscape literature (e.g. Kaplan et al. 2008; Gavrilets 2010; Fragata et al. 2019), reduction of the fitness landscape, which is inherently highly multidimensional (as originally recognized by Wright), to only three dimensions can mask viable evolutionary trajectories, underestimate the number of peaks, and oversimplify our understanding of how populations evolve. We now attempt to better clarify and discuss this in the revised manuscript.

      I also had trouble understanding the role of fitness in the analysis of mutational distances in a subset of loci between the three species (lines 282-296). While the illustration in Fig. 3C uses directed edges to capture fitness data, this framework doesn't seem to be applied in Fig. 3d or the resulting analyses in 3e. As such, I don't see how this section is about genotypic fitness landscapes at all.

      We followed this suggestion and have rearranged our figures and their constituent panels to provide a more coherent illustration of our results and analyses. Figure 3 now serves to describe 1) the focal loci used to construct genotypic networks and 2) the general structure of genotypic networks constructed using loci sampled across all three species. What is now Figure 4 is dedicated explicitly to the investigation of genotypic fitness landscapes, describing how we incorporated fitness measures into these networks to identify accessible paths. This figure also serves to describe the fitness landscapes for each specialist, quantifying accessibility of interspecific genotypic trajectories, and landscape ruggedness. Our discussion of these sections similarly attempts to distinguish their respective focus, emphasizing that investigation of the general isolation of each species on genotypic networks will help provide context for our later focused investigation of fitness landscapes.
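
      To make the notion of an accessible path concrete, here is a minimal sketch. It assumes that genotypes are tuples of per-locus allele counts, that two genotypes are adjacent when they differ by one step at exactly one locus, and that a path is accessible when fitness strictly increases at every step. These definitions, the function names, and the toy fitness values are illustrative assumptions, not the authors' implementation.

```python
def is_neighbor(a, b):
    """True if two genotypes differ by one mutational step at exactly one locus."""
    diffs = [abs(x - y) for x, y in zip(a, b) if x != y]
    return diffs == [1]

def accessible_paths(fitness, start, end):
    """List every path from start to end along which fitness strictly increases."""
    paths = []

    def walk(node, path):
        if node == end:
            paths.append(path)
            return
        for nxt in fitness:
            # Strictly increasing fitness rules out cycles, so recursion terminates.
            if is_neighbor(node, nxt) and fitness[nxt] > fitness[node]:
                walk(nxt, path + [nxt])

    walk(start, [start])
    return paths

# Hypothetical two-locus network: one route is uphill, the other crosses a valley.
fitness = {(0, 0): 0.5, (1, 0): 0.7, (0, 1): 0.4, (1, 1): 0.9}
paths = accessible_paths(fitness, (0, 0), (1, 1))
```

      In this toy network only the route through `(1, 0)` is accessible, because the route through `(0, 1)` requires a step down in fitness.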

      The final part of the conclusion sketches a story in which de novo and introgressed alleles reduce the accessibility of reverse evolution, back to a generalist. I think this is conceptually confusing because we don't expect evolution to favor paths toward lower fitness, even if those paths do not pass through a valley. Again, the framing here, that generalists are less fit than either specialist, is hard to square with the facts that generalists seem to be coexisting with the specialists, and much closer to the hypothesized fitness peak than is either specialist.

      We agree and have completely rewritten this section and removed this framing. We omitted this part of the conclusion entirely, as we felt it too speculative, and as noted by the reviewer, difficult to square with some of the rest of our findings. Instead, we now devote more focus on other aspects and implications of our findings in a new discussion section as requested by reviewer 1.

      This is a complicated and ambitious paper, on an exciting system and aiming at important questions. I think the main results about genotypic-fitness networks are hard to relate back to the other major analyses in the paper due to the points raised above. Moreover, using fitness measurements of three coexisting species to infer how they evolved faces a major obstacle: if fitnesses are frequency-dependent, then the actual trajectory of an initially rare variant will be completely obscured post-invasion. This possibility, as well as the potential issue that data on reproductive success might change these findings, need to be discussed, especially in light of the puzzling fact that the specialists appear less fit than their ancestor in at least one of the paper's major analyses.

      We now emphasize the apparent disconnect between three-dimensional fitness landscapes and the highly dimensional genotypic fitness landscapes as noted by the reviewer (see above). We hope to demonstrate through example how highly dimensional genotypic fitness landscapes may harbor numerous viable evolutionary trajectories (e.g. fitness ridges) on rugged fitness landscapes that are unobservable on low-dimensional representations. Additionally, we expand our discussion of the caveats in our analyses pertaining to the use of data on contemporary species to infer historical dynamics on the fitness landscape as recommended by the reviewer.

      We also now note that no evidence for frequency-dependent selection has been found in this system (Martin and Gould 2020; Martin 2016). We previously explicitly manipulated the frequency of rare phenotypes between treatments and found no effect of treatment across lake populations. Rather, these fitness peaks and valleys appear surprisingly stable across lakes, treatments, and years.

      Regardless, we now include in the discussion that we necessarily have taken a ‘birds-eye view’ of evolution here, describing the influences of different sources of genetic variation on the fitness landscape, after these have already undergone selective sweeps. Likewise, we acknowledge that it is impossible to quantify reproductive success in this system using field enclosures due to the very small size of newly hatched fry and continuous egg-laying life history of pupfishes. This is a limitation of our system. We take this opportunity to emphasize that other experimental or simulation studies would be invaluable to quantify the changing influence of these different sources of genetic variation on the fitness landscape as a function of time, during the process of selective sweeps.

    2. Reviewer #3 (Public Review): 

      This paper uses sophisticated regression methods and numerical experiments to produce a genotype-fitness relationship for three closely related sympatric pupfish species, forming an adaptive radiation. In addition to providing insights into the genetic targets of selection, this paper goes further in attempting to tease out what types of genetic variation were most likely to have played key roles in this radiation. 

      Strengths: 

      The idea behind this study is excellent, and clearly a large amount of thought and effort went into collecting the underlying data. The attention paid to linking evolutionary dynamics with the fitness results is laudable. The system is extremely exciting and I think an experiment and analysis of this sort could potentially be interesting to a broad audience within evolutionary biology. 

      Weaknesses: 

      The claim that this is the first genotypic fitness network in a vertebrate needs additional qualifiers: as far as I can tell, the claim to novelty is based on the inclusion of multiple species, the number of alleles, and measuring fitness in the field. I can't fully assess this claim but I would urge the authors to avoid staking a stronger claim to priority than is really needed, as it might be a lightning rod for criticism and hair-splitting that would distract from the contents of the paper.

      One of my major questions while reading this was whether these three species were better or worse adapted to subenvironments within the lakes. This is partially answered in a few places in the manuscript, but I think that resolving this point more precisely would help interpret if positioning all three species on the same fitness landscape is fair. 

      I find it a little hard to follow the construction of the landscapes in Fig. 2 B and C. I am not clear why the landscapes don't cover the location of the molluscivore population. I think the fitnesses predicted for the main bulk of the generalists and scale-eaters are the same across the two landscapes (as I expect they would be), but this is obscured by the differing fitness ranges of the two landscapes. I would suggest using a single color-fitness relationship for the two panels to aid cross-comparison. Also, two salient features of the landscape (the major peak at the top center and the deep pit at the bottom center) seem to be supported by few fish in each case. I would imagine that something like bootstrapping could be done for fitness landscapes, where the support for each feature of the landscape could be judged by how often it appears in subsets of the data (or in inferred models with nearly as high support as the best model), but I acknowledge that might be very hard to do. Still, I think some statement of uncertainty should be prominently included.

      More generally, the landscapes reconstructed in Fig. 2 do not show very clear evidence that the M or S types are separated by valleys from the G type. Close inspection of the figure suggests a very shallow valley might be present between G and M, but the overall trend is declining fitness; between G and S, fitness appears to simply decline. While peaks may occur within the landscapes composed of limited sets of loci, the overall pattern seen in Fig. 2 doesn't seem conducive to analyzing how adaptive evolution in generalists crossed valleys to reach the putatively higher peaks of the two specialists. As such, I find the connection between these phenotypic-fitness landscapes and the later genotypic fitness landscapes quite confusing. 

      I also had trouble understanding the role of fitness in the analysis of mutational distances in a subset of loci between the three species (lines 282-296). While the illustration in Fig. 3C uses directed edges to capture fitness data, this framework doesn't seem to be applied in Fig. 3d or the resulting analyses in 3e. As such, I don't see how this section is about genotypic fitness landscapes at all. 

      The final part of the conclusion sketches a story in which de novo and introgressed alleles reduce the accessibility of reverse evolution, back to a generalist. I think this is conceptually confusing because we don't expect evolution to favor paths toward lower fitness, even if those paths do not pass through a valley. Again, the framing here, that generalists are less fit than either specialist, is hard to square with the facts that generalists seem to be coexisting with the specialists, and much closer to the hypothesized fitness peak than is either specialist.

      This is a complicated and ambitious paper, on an exciting system and aiming at important questions. I think the main results about genotypic-fitness networks are hard to relate back to the other major analyses in the paper due to the points raised above. Moreover, using fitness measurements of three coexisting species to infer how they evolved faces a major obstacle: if fitnesses are frequency-dependent, then the actual trajectory of an initially rare variant will be completely obscured post-invasion. This possibility, as well as the potential issue that data on reproductive success might change these findings, need to be discussed, especially in light of the puzzling fact that the specialists appear less fit than their ancestor in at least one of the paper's major analyses.

    1. Author Response:

      Reviewer #1 (Public Review):

      The observation that the cells are able to steadily move along the light axis but perpendicular to their long axis is very interesting considering the T4P appear to be bipolarly localized. There is some discussion on the micro-optic effect in single cells but it does not include the observation that the negative phototaxis to green light occurs no matter where the direction of blue light comes from or the micro-optic effect in a microcolony.

      We have added the following sentences in the Discussions part (p16 L363-372) in the Related Manuscript File: “The focused green light would excite yet unknown photosensory molecules to induce spatially localized signalling, whereas the position of the focused blue light is not crucial for directional switching. As we showed, the direction of blue light illumination did not influence directionality of movement, because cells do not move in random orientation (Figure 2 – figure supplement 6). Thus, blue light does not control the directional light-sensing capability, instead it provides the signal for the switch between positive and negative phototaxis. This is very similar to the situation in Synechocystis where the blue light receptor PixD controls the switch between negative and positive phototaxis independently of the position of the blue-light source (Sugimoto et al., 2017).”

      Reviewer #2 (Public Review):

      I- The authors attribute the defect of negative phototaxis observed in the SesA mutant to the level of c-di-GMP in the cell, mainly because a SesA mutant shows a two-fold decrease in c-di-GMP concentration upon blue light treatment. However, this measurement was made in a batch culture and normalised to dry cell mass. In contrast, the negative phototaxis observed at the single-cell level occurs within less than a minute (Figure 2). It would therefore be important for the authors to strengthen the evidence implicating c-di-GMP in phototaxis regulation. For example, the authors could modulate the level of c-di-GMP in the cell via ectopic expression of a diguanylate cyclase or phosphodiesterase, and observe its effect on phototaxis.

      We highly appreciate your evaluation and comments. As we pointed out in our response to reviewer 1, utilizing heterologous expression systems in T. vulcanus is challenging, possibly because the cells are cultivated at 45°C. However, we were fortunate to isolate a spontaneous mutant (named WT_N) that shows constitutive negative phototaxis under lateral light illumination. By comparative genomics, we identified the frameshift mutation that confers an increase in the intracellular concentration of c-di-GMP and that was accompanied by negative phototaxis under the condition where the WT cells showed positive phototaxis (Figure 4). We have added a paragraph in the Results part for these experiments on p9-10 (L201-219). See also our comments to the other reviewers and the editor concerning these new experiments, which support the role of c-di-GMP in directional switching. In addition, the figure formerly assigned as Figure 3 – figure supplement 1 was moved to the main manuscript as Figure 3C, because we think that the data on the intracellular concentration of c-di-GMP are very important to support our conclusions.

      II- The authors used fluorescent beads to visualize T4P dynamics. As previously described, the authors show that this is specific to T4P activity and can also reveal T4P retraction. The authors then used this method to convincingly show that cells moving perpendicular to the light source have active pili on only one half of both cell poles (Fig. 6). It is an interesting observation, but again it is short on details.

      -The manuscript would definitely benefit from a more general analysis of T4P dynamics during phototaxis, for example during the switch from positive to negative phototaxis. What are the behaviours (T4P pole activation) of cells parallel to the light source?

      -Besides, as suggested by the authors in the discussion, having the intracellular localisation of the ATPase PilB would definitely be a plus.

      -Moreover, in the Discussion section the authors proposed the existence of "a specific signalling system with high special resolution" to explain the asymmetric polar T4P activation. Why could it not be a molecular mechanism similar to the one observed in round cells such as Synechocystis, where the light receptor PixD regulates T4P function in some part of the cell according to the direction of the light?

      In order to get more direct insights into T4P dynamics, we have performed additional experiments, which are summarized in Figure 8 and Movies S17-20. Importantly, we succeeded in visualizing T4P filaments by PilA1 labelling using live cells. The T4P filaments were bipolarly localized and showed dynamics of assembly and retraction at both cell poles. When the cells moved perpendicular to their long axis, the T4P filaments at both poles showed biased distribution towards the same direction of cellular movement. These results support our idea that T4P are asymmetrically activated within a single cell pole. This asymmetric activation can rely on the localization of PilB ATPase. We would like to address how a molecular machinery such as PilB governs directional switching events. However, GFP-tagging has not been established in thermophilic cyanobacteria so far. We have added a chapter in the Results part for these experiments p13-14 (L296-322) in the Related Manuscript File. Please, also pay attention to our answers to similar comments of the other reviewers.

      Our results suggest that the T. vulcanus cell can actuate the spatially resolved signaling even within a cell pole to activate the pilus activity at only one side of a cell pole to enable biased cellular movements. This finding means that the cell harnesses "a specific signalling system with high special resolution" compared to other rod-shaped bacteria showing pole-to-pole regulation of cell polarity. We do not exclude that a system which works similar to the PixD/PixE complex in Synechocystis contributes to the asymmetric localization of the pili in Thermosynechococcus motility. Thermosynechococcus encodes a PixD protein but no PixE homolog. For Synechocystis, it was shown very recently that PATAN domain response regulators (including PixE) bind PilB1 and PilC and can switch the direction of movement (Han et al. Mol. Microbiol. 2021). Thermosynechococcus encodes homologs of such PATAN-domain response regulators, but at the moment, we do not know whether they have a similar function in both cyanobacteria.

      III- The link between the c-di-GMP concentration and T4P dynamics during the switch from positive to negative phototaxis is absent. The authors proposed in the discussion a potential binding of c-di-GMP to PilB, as previously shown for some T4P. Could this be tested here by the authors, since they seem to be able to handle c-di-GMP?

      The experimental verification of the binding of c-di-GMP to PilB is ongoing work, but it seems that direct binding of c-di-GMP to PilB is either very weak or does not happen in our setup. Thus, detailed molecular events of c-di-GMP signaling are out of the scope of the current study. However, we do show in the revised version of the manuscript that pilus extension and retraction dynamics are not different between positive and negative phototaxis (Figure 7 − figure supplement 2), suggesting that c-di-GMP most probably does not affect the activity of the PilB protein. Therefore, we have modified the sentence about the binding of c-di-GMP to PilB in the Discussion part as follows. See p17 L391-394: “Since we did not observe a change in pilus dynamics under green and green/blue light illumination (Figure 7 − figure supplement 2), the T4P regulation in T. vulcanus may not be explained simply by a specific activation of PilB (Floyd et al., 2020, Hendrick et al., 2017).”

      In addition, we have performed experiments to show additional data that the c-di-GMP levels switch the direction of T4P-dependent phototaxis (new Figure 4). We also performed additional experiments to visualize T4P dynamics by PilA labeling (new Figure 8), which suggest asymmetric activation of pili and most probably of the motor ATPases as well.

    1. Reviewer #1 (Public Review):

      This is an interesting manuscript providing important new information on the mechanism of action of EROS in the generation of superoxide by the NADPH oxidase of neutrophils. The authors have shown in previous publications that EROS deficiency results in defective NOX2 activity and thus represents a hitherto unrecognised, rare form of chronic granulomatous disease. They now show how EROS is involved in oligosaccharide transfer during the maturation of gp91phox and also extend what is known about the role for EROS in regulating expression of the P2x7 ion channel.

      The results presented in the manuscript are supported by findings from a variety of techniques and for the most part, are convincing and well presented. However, I do have queries about certain aspects of the manuscript.

      1. Figure 1. The much lower EROS expression when gp91phox is expressed warrants a comment. Fig 1 G. Please explain what fold change represents. From F, zero time expression appears much more than the 1.5 fold higher shown in G for the EROS-expressing cells. This needs explaining. With the very high error bars (presumably for the EROS sample although this is not clear) overlapping zero I find it hard to conclude anything from this figure.

      2. P 9 line 9 states that Fig 1H shows that cycloheximide increases expression. Yet it appears from the legend that cycloheximide is present in all samples and it is EROS that increases expression. Please clarify.

      3. Fig 3A&B and p12 1st para. The identification of OST as a binding partner is interesting and a significant novel finding. However, the presentation of this information appears to me to be unduly complex and more information is required. Not all the readers will be familiar with the details of SAINTexpress methodology and more explanation of what is being shown would be helpful. At the least, a supplementary Table of the 59 identified proteins would be helpful, plus information on controls to establish selective pull down by EROS and on how the blue spots in A relate to the proteins. Also please make it clearer which of the proteins in B were identified and the relevance of showing all the steps in the pathway.

      4. Figure 6. This contains a large amount of information. Although interesting, I am concerned that the authors may be trying to include too much at the expense of the necessary detail for some of the experiments. For example, the EROS -/- +ATP scattergram on the left of Fig 6E does not seem to agree with the right hand graph. I would also like to see the mean values for the 5 experiments in Fig 6G shown. Most importantly, insufficient information is given for Fig 6H. I don't think I missed it but I could find no details about the experiment in the Methods section. We need to know more about exactly how many animals were in each group (death of 1 animal appears to equate to 5% of total - how does this relate to >10 in total), how signs of illness were monitored and related to death, and generally more about the conditions of the experiment. Alternatively, this may be better left to a more detailed study.

  4. inst-fs-iad-prod.inscloudgate.net
    1. Thus the strongest research evidence appears to indicate that money matters, in a variety of ways, for children's long-term success in schoo

      Money suggests resources: things kids can obtain, chances they can have, and people they can meet. I had a former roommate who said that we come from different hierarchies because her family income is more than 10 times mine. I do see some differences between us, but I think the difference is not as big as that between the poor and the rich. Middle-class families can basically make sure that their children get enough resources. Richer families may have better resources, but that gap is not very big. The problem for now is how to help kids from poor families get resources; no matter how good the resources are, I hope they can at least have their basic needs met.

    1. Author Response:

      Reviewer #1 (Public Review):

      The manuscript by Liu et al investigates how MRI can be used to detect the earliest stages of CNS infections and how MRI can also be used as a surrogate readout for treatment efficacy. Authors demonstrate convincingly that microbleeds, as evidenced by unusual dark spots in the brain of mice infected with a virus that infects the brain, occurred at the earliest stages of viral infection. Authors also convincingly demonstrate that the infusion of virus-specific immune cells, when delivered at the right time and at the right dose, could reduce these microbleeds. Importantly, authors showed that the wrong dose could be detrimental.

      The authors cast this study as a method for improving research and discovery in immunotherapy context and the study is convincing in its conclusions regarding imaging microbleeds and the immunotherapy tested herein. While authors do not directly suggest so, these findings extend the significance of this work beyond research and development of immunotherapies by providing a potential early detection mechanism for viral infection in the brain. This may be feasible as the MRI methodologies for detecting these phenomena are generally translatable to clinical imaging scenarios, though the imaging resolution may not.

      Weaknesses in the report revolve around the value of and the ability to image magnetically labeled T cells in the presence of microbleeds.

      1) Authors developed a magnetic particle coated with fluorescent molecules and antibodies specific for CD8+ T cells. They labeled these T cells with particles for detection by MRI. They then wanted to follow the accumulation of these cells in the brain following infusion and viral infection by performing MRI using parameters that amplify the signal of the attached label. The rationale for these experiments was to determine if immune cell infiltration preceded vascular compromise. This suggests the expectation for active chemotactic migration or other signaled accumulation rather than leakage. When authors tested their magnetically labeled T cells for functional impairment due to the presence of attached magnetic particles, they did not test for deficits to migratory capabilities, such as in standard transwell migration assays. Others have shown (see https://doi.org/10.1038/nm.2198 for example) that T cell migration is very sensitive to the type of attached nanoparticle as well as the surface coverage. Perhaps authors should temper their claims that magnetically labeling of T cells does not alter T cell function without at least an assay of this critical function. Further, the fluorescence microscopy shown in Figure 7D is of insufficient resolution to claim that MPIOs are inside cells. Electron microscopy should be used to determine this.

      We thank this Reviewer for the comments. In this Revision, we added EM data to confirm the cellular location of MPIOs (Fig 7D and S7D). The EM experiment also added another layer of information for improving our cell isolation method. We improved our FACS experiment by narrowing down the MPIO-positive gating to exclude the T cell population that was labeled with high numbers of MPIO particles, which may affect T cell functions, and some crosslinked MPIO particles that formed during conjugation (Fig 7B and S7A). The yield of FACS of MPIO-labeled T cells is ~8.3%. As quantified from EM images, 91% of MPIOs were localized intracellularly (Fig 7E). We agree that labeling T cells with nanoparticles might alter key T cell functions. We have improved the manuscript by adding this caution and the reference. We also added T cell migration assay results (Fig 7G). Labeling CD8 T cells with MPIO did not affect T cell migration. This adds to our other in-vitro assays showing that T cell function is not significantly affected. There is in-vivo evidence as well that labeled T cells are functional. In Fig 8E-I, MPIO-labeled T cells were found in the brain, which showed that labeled T cells can migrate into the brain. In addition, a key phenotype of virus-specific CD8 T cells in this model is the therapeutic function described in the manuscript. Labeling virus-specific CD8 T cells with MPIO did not affect their therapeutic function. Quantification of bleeding in the OB and brain on day 6 and 11 verified the therapeutic effects of MPIO-labeled OT-I T cells (Fig 1E and 2C vs Fig S9C and D). We added discussion of these points in this Revision.

      2) Regarding the use of imaging the accumulation of magnetically labeled T cells, authors show evidence that magnetically labeled T cells accumulate in areas of the brain that as yet do not present with microbleeds but do have the histological hallmarks of vascular inflammation. This corroboration is intriguing but only provable with a serial imaging study in the same animal, which was not performed. Authors are also encouraged to report on the frequency in which a magnetically labeled T cell was present in a pre-vascular compromised inflammatory environment. The bulk of the results on imaging magnetically labeled T cells essentially show that the accumulation of magnetically labeled T cells enhances the ability to detect microbleeds that otherwise were perhaps too small to detect (Sup Fig 8). Given the lack of data supporting the retained migratory capacity of magnetically labeled T cells, one wonders then, whether magnetically labeled T cells are indeed trafficking to the brain or are passively arriving in the brain, and might some vascular magnetic particle accumulate in an early inflammation or leak into the microbleed on its own and similarly enhance the ability to detect the otherwise undetectable microbleed. A series of controls would be useful to answer these questions, perhaps testing the administration of magnetic particles alone, and/or magnetically labeled non-CD8+ T cells. Authors are also encouraged to report on the frequency in which a magnetically labeled T cell was present in a pre-vascular compromised inflammatory environment versus in the microbleed, as measured by MRI and histology.

      Distinguishing bleeding from T cells is a key challenge for doing a serial MRI study in the same animal. In the new Fig 8I and Fig S8, we did a study using time-lapse MRI on the same mouse from 20 to 24 hr post-infection. We observed the appearance of hypointensities at the center of the bulb at 22 hr, which is prior to bleeding in this area. Bleeds were observed at the GL, but not at the center of the bulb by IHC. Thus, we were able to time the entrance of T cells in this area of the brain. We were not able to find migration tracks of T cells from the outer GL layer into the center of the bulb. This is consistent with the idea that T cells infiltrate directly into areas with virus prior to vessel breakdown and microbleeds. We did not observe a significant change in the location of T cells from 22 to 24 hr on the distance scale of MRI. There are two possibilities to explain our inability to detect T cell movement over a 2 hr time interval: 1) the T cells under investigation may have been attached to the blood vessel surface due to inflammation and required more time to extravasate, or 2) although T cell velocities in the CNS have been clocked at ~10 µm/min (Herz et al., 2015), their paths are often tortuous and influenced by antigen-presenting cells displaying cognate peptide MHC as well as local chemokine gradients. Thus, upon entering a site of viral infection, the labeled T cells may not have traveled far enough in 2 hr for us to detect their movement by MRI. We did not image mice beyond 24 hr post-infection due to the possibility of bleeding. We added this discussion. Quantification of the frequency with which an MPIO-labeled T cell was present in a region where no bleeding was detected versus in a region with a microbleed was added in Fig 8H. In the ONL/GL, 85% of MPIO-labeled T cells were in the region with microbleeds and 15% were in a region where no tissue bleeding was detected. In the MCL/GCL areas, no evidence for bleeding was detected. Magnetic labeling of CD8 T cells does not reduce their migratory capacity in an in vitro migration assay (Fig 7G). This adds to other in vitro assays showing that the labeled T cells are functioning. Labeled T cells had therapeutic efficacy like unlabeled T cells, and labeled T cells were found at the center of the bulb (Fig 8F-I) with no bleeds as well as in other brain regions. Based on these observations, we think that MPIO-labeled T cells are functioning and trafficking in the brain. A previous study showed that cells other than CD8 T cells, such as monocytes/macrophages, CD4 T cells, and neutrophils, also migrate into the OB and are involved in the immune responses in this model [(Moseman et al., 2020), Fig 2E].
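
      The second possibility above can be put in rough numbers. The sketch below takes the ~10 µm/min speed and the 2 hr window from the response and compares a straight-line upper bound with the net displacement of a simple uncorrelated random walk. The 2 min persistence time is an assumption chosen only for illustration, not a measured value, and whether either distance is resolvable depends on the voxel size of the MRI protocol, which is not stated here.

```python
import math

SPEED_UM_PER_MIN = 10.0  # approximate CNS T cell speed quoted above (Herz et al., 2015)
WINDOW_MIN = 120.0       # 2 hr between the 22 hr and 24 hr scans
PERSISTENCE_MIN = 2.0    # ASSUMPTION: time a cell keeps one heading; illustrative only


def straight_line_um(speed, minutes):
    """Upper bound on displacement: the cell never changes direction."""
    return speed * minutes


def random_walk_rms_um(speed, minutes, persistence):
    """RMS net displacement of an uncorrelated random walk with fixed-duration steps."""
    n_steps = minutes / persistence
    step_len = speed * persistence
    return step_len * math.sqrt(n_steps)


print(straight_line_um(SPEED_UM_PER_MIN, WINDOW_MIN))  # 1200.0
print(round(random_walk_rms_um(SPEED_UM_PER_MIN, WINDOW_MIN, PERSISTENCE_MIN), 1))  # 154.9
```

      Under these assumptions a tortuous path reduces the expected net displacement from about 1.2 mm to roughly 0.15 mm, which illustrates why movement over 2 hr could be hard to see; a shorter or longer persistence time would change the second figure.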

      Reviewer #2 (Public Review):

      [...]

      Weaknesses:

      • Individuals with systemic infections or other underlying conditions may have microbleeds due to inflammation or hypertension. The etiology of microbleeds is thus not necessarily tied to CNS infections. Investigation of potential cerebrovascular microbleeds following systemic or respiratory infections not affecting the CNS may shed light on this possibility, which may also provide an alternative interpretation of neurological symptoms associated with non-CNS-invasive infections.

      This is an important issue. Prior work has shown that virus in this model is cleared quickly (2 to 3 days) from the periphery (Ramsburg et al., 2005; Roberts et al., 1999). This is likely due to the fact the virus is inoculated through the nose. It is clear in this model that virus infects the brain, that bleeding corresponds to sites of high viral load, and bleeding can be modulated by blocking immune infiltration into the brain. However, the quantitative role of peripheral influences such as high blood pressure could be important and will be checked as this work proceeds.

      • Representative colocalization of virus infected endothelial cells with red blood cells (RBCs) is shown in Fig 4. However, a more quantitative assessment indicating how many areas or hypointensities were evaluated for virus-localization with RBCs, and how many of these revealed colocalization versus virus or RBC only would strengthen interpretation.

      Fig 4 shows that VSV can infect vascular endothelial cells and cause bleeding. Hypointensities were not measured in this Figure. We quantified the numbers of VSV infected vessels, colocalizing and not colocalizing with bleeds. Fig 4D was added with this new data.

      • A limitation clearly acknowledged by the authors is that hypointensity spots detected by MRI cannot distinguish microbleeds from MPIO-labeled T cells.

      As in our response to Reviewer 1, this is a critical next step since bleeding so often occurs with immune cell infiltration in the brain. We have discussed potential approaches and have added the idea that development of more sensitive MRI contrast agents and quantitative T2* analysis especially at different magnetic field strengths may be approaches to accomplish this. It will be crucial for MRI cell tracking under the condition of bleeding, which is one common pathology associated with many diseases.

    1. Author Response:

      Reviewer #1 (Public Review):

      In this manuscript, the authors exploit retinal cell proliferation and neurogenesis in zebrafish to study banp, a protein that is essential in humans and embryonic lethal in mice. The authors performed large-scale mutagenesis and identified a mutant known as "rw337", in which the eyes and optic tectum are smaller than in WT zebrafish. They found that the retinas of these mutants accumulate mitotic-like round cells, indicating mitotic arrest. Sequencing of these mutants identified that the rw337 mutant gene encodes a truncated banp protein. Expression of WT Banp occurs primarily in retinal and neuronal cells in zebrafish. Interestingly, rw337 showed a significant decrease in the number of retinal photoreceptors, and neuronal formation within the OPL and IPL was morphologically disrupted, with fewer cells. The authors found that rw337 cells have increased numbers of DSBs in the retina over time (via TUNEL assays). They found that mitotic defects and apoptosis occur in spatially and temporally distinct regions of the retina: prolonged phosphorylation of histone H3, which indicates a problem in exit from mitosis, occurred at the apical surface of the neural retina, whereas apoptosis occurred in retinal progenitor cells (via Caspase 3 staining). The authors then went on to examine the role of replication stress regulators like p53, atm, and atr and showed that protein and RNA levels of banprw337 were increased and upregulated. As p53 binds banp in zebrafish, it was not surprising that regulators of p53 were enhanced in banprw337 mutants. Intriguingly, the authors found that two genes which are essential for chromatin segregation, cenpt and ncapg, were downregulated in banprw337 mutants and banp morphants as a result of decreased chromatin accessibility near their TSSs, resulting in decreased transcriptional activity of these genes. Finally, the authors temporally monitored mitosis in banprw337 mutants and found that chromosomal segregation is abnormal and takes longer. The authors have performed a thorough analysis of the impact of the banp gene on retinal biology and its importance in regulating the replication stress response and cenpt and ncapg expression. This paper is important to the retinal biology, genome stability, and replication stress response fields and requires minor revision.

      Strengths:
      • These studies exploit zebrafish retinal development and its cell-cycle regulation, as Banp/SMAR1 is an essential gene in human cells and its knockout is embryonic lethal in mice.
      • The authors show that this gene is involved in replication stress responses involving p53, atm, and atr signaling.
      • The authors show that banp is required for expression of chromatin segregation factors and for chromatin accessibility, by binding to banp sequences (TCTCGCGAGA) upstream of cenpt and ncapg specifically. Interestingly, the mutant rw337 had decreased chromatin accessibility near the transcription start sites of these genes. This is an elegant study of how a gene regulates the transcription of two genes essential for chromatin segregation.

      Weaknesses:
      • The authors could highlight the protein names of both zebrafish and humans throughout the text using standard nomenclature, with human proteins all capitalized, etc. This will enable the reader to understand their findings in the context of fascinating biology and human disease/cancer.

      We have revised nomenclature of genes and proteins throughout the text, consistent with nomenclature conventions as follows.

      species / gene / protein
      zebrafish / banp / Banp
      mouse / Banp / BANP
      human / BANP / BANP

      In the revised manuscript, we have used human/mouse/zebrafish nomenclature in sentences relating findings that were achieved using human/mouse/zebrafish samples, respectively.

      • As banprw337 mutants show such severe morphological disruption a discussion on the impact of this work for the vision community could strengthen the importance of understanding how this gene functions.

      We appreciate this suggestion. In response to comments from the editor and reviewer #2, we have revised the Introduction to mention that vertebrate retina is an excellent model system to dissect mechanisms of cell-cycle regulation and DNA damage response-mediated neuronal cell death. We believe that our banp paper will have an impact on the retinal community. Furthermore, in addition to the role of Banp in cell-cycle regulation, most photoreceptors fail to differentiate in banp mutants, whose phenotypes are more severe than other retinal cell-types. Nuclear architecture, especially heterochromatin and euchromatin patterns, are quite differently organized in photoreceptor neurons and dynamically changed during rod photoreceptor differentiation, so we suspect that Banp may be important for photoreceptor differentiation through regulation of its nuclear organization. In the future, we will investigate this underlying mechanism. There are very interesting perspectives on retinal phenotypes in banp mutants, which may attract retinal and vision community researchers. However, these are diverse topics. So, in the current manuscript, we have limited the discussion to within cell-cycle regulation.

      • Gamma H2AX phosphorylation is a global marker of DSBs and stalled forks. The authors did not note that H2AX phosphorylation is present and is a marker of stalled replication forks (PMID: 11673449, PMID: 20053681, doi:10.1101/gad.2053211, https://doi.org/10.1016/j.cell.2013.10.043, etc.).

      We appreciate this suggestion. We have added a statement on gamma-H2AX and cited appropriate references.

      • As gamma H2AX phosphorylation recruits DNA repair factors like BRCA2, speculation of importance of these genes may be of interest to the DNA repair community.

      We agree that, to clarify which step or steps of DNA replication stress and the DNA repair mechanism are direct targets of Banp, it is important to consider how DNA repair factors are affected in banp mutants. Among Banp transcriptional target genes, we found that wrnip1 mRNA expression is significantly reduced in banp mutants. We have added these data to a new Figure 6-figure supplement 2. Wrnip1 protects stalled replication forks from degradation and promotes fork restart during replication stress by cooperating with BRCA2. It was recently reported that WRNIP1 functions in translesion synthesis (TLS) and template switching (TS) at stalled forks, and also in interstrand crosslink repair (ICR). It is possible that the loss of Wrnip1 causes defects in fork stabilization for restart and in ICR, leading to genomic instability. We have added this material to the Discussion and have revised a summary figure (Figure 7).

      Reviewer #2 (Public Review):

      Babu et al report the role of the zebrafish banp gene in the developing retina. They find that banp is required for faithful S-phase as well as mitosis.

      Manuscript strengths: 1- The authors performed a large-scale mutagenesis screen and successfully identified a causative banp gene mutation from these efforts, which represent a significant amount of work. 2- The authors provide a substantial amount of cellular-level analysis of a host of cell cycle-related phenotypes in the banp mutant retina. The data are of high technical quality and the experiments are well-executed. For the most part, the data support the conclusions.

      We are grateful for the reviewer’s high estimation of our work.

      Manuscript weaknesses: 1- Banp mutants have numerous defects, and perhaps this is not unexpected for a nuclear matrix protein. I'm left wondering what insights are gained from the study beyond that the nuclear matrix is required for numerous cell cycle events?

      As we mentioned in the Introduction, BANP was originally identified as a nuclear protein that binds matrix-associated regions (MARs). MARs are regulatory DNA sequences mostly present upstream of various promoters. MAR-binding proteins interact with numerous chromatin-modifying factors and regulate gene transcription. In addition, it was reported that BANP suppresses tumor growth, and that loss of BANP heterozygosity is associated with several cancers in humans. So, before we started this banp mutant analysis, we expected that loss of Banp might cause defects in the cell cycle. However, because the majority of prior studies on BANP have been done using in vitro systems, its physiological function was still ambiguous. Very recently, it was reported that BANP functions as a transcription factor that binds to Banp motifs and regulates essential metabolic genes. In this study, rather than focusing on the MAR domain, we used this Banp motif to search for direct transcriptional targets of Banp that may function in cell proliferation and differentiation in zebrafish retina. Our study provides the first in vivo evidence that Banp serves as an essential transcription activator of cell cycle genes, including cenpt, ncapg, and wrnip1 via Banp motifs. We believe that such a list of Banp direct target genes provides a new research avenue to discover more precisely how Banp functions in tumor suppression and that it will contribute to medical research on cancer therapy.
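
      As an illustration of what a motif-based search for candidate targets involves, the following minimal sketch scans a DNA string for exact matches to the Banp motif quoted by Reviewer #1 (TCTCGCGAGA). It is not the pipeline used in the study: the toy sequence is hypothetical, and a real search would run over annotated promoter regions and would probably allow degenerate matches. Because this motif equals its own reverse complement, scanning one strand is sufficient for exact matches.

```python
BANP_MOTIF = "TCTCGCGAGA"  # Banp motif as quoted in the review above

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif=BANP_MOTIF):
    """Return 0-based start positions of exact motif matches (overlaps allowed)."""
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


# HYPOTHETICAL toy upstream sequence, built only to exercise the function.
toy_upstream = "GGATCC" + BANP_MOTIF + "AATT" + BANP_MOTIF + "CG"

print(reverse_complement(BANP_MOTIF) == BANP_MOTIF)  # True: the motif is palindromic
print(find_motif(toy_upstream))                      # [6, 20]
```

      Exact string matching is the simplest possible choice here; position weight matrices are the usual alternative when a binding site tolerates mismatches.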

      Our study did not investigate how the nuclear matrix itself is involved in Banp mutant phenotypes. However, since it is likely that the interaction between MAR domains and nuclear matrix may influence chromatin organization in the nucleus, BANP functions must depend on nuclear matrix configuration. So, while this question is interesting, we think it is beyond the scope of our current study. In addition, we are afraid that the term “matrix-associated nuclear protein” might mislead people to think that Banp is a regulator of nuclear matrix. To better clarify the relationship between Banp and nuclear matrix, we have revised “nuclear matrix-associated protein” -> “nuclear matrix associated region-binding protein” in the text.

      2- Why did the authors focus on the eye? It is unclear whether this study revealed a sensitivity to eye development regarding nuclear matrix function specifically, or it was just a convenient place in the animal to look.

      Historically, molecular and cellular mechanisms that regulate cell proliferation and differentiation in the nervous system have been intensively studied using the vertebrate retina, because the retina has fewer neuronal cell types and simpler neural circuits than other brain regions. Furthermore, many research groups, including ours, have identified zebrafish retinal mutants, including mutants that show defects in cell-cycle regulation and DNA damage response. Indeed, our group has investigated this topic using retinal apoptotic mutants for the last 20 years. Thus, we focus on the zebrafish retina, because the retina is an excellent in vivo model system to dissect mechanisms of cell-cycle regulation and DNA damage response. To emphasize the importance of this excellent in vivo model system to researchers beyond the retinal community, we have revised the Introduction as follows. "The developing retina is a highly proliferating tissue, in which a spatiotemporal pattern of neurogenesis is tightly coordinated by cell-cycle regulation. So, vertebrate retina provides a great model for studying how cell-cycle regulation, including DNA damage response ensures neurogenesis and subsequent cell differentiation."

      3- I found the conclusions regarding mitosis to be contradictory. The authors at first emphasize mitotic arrest, but then characterize chromosome segregation defects. How can chromosomes segregate if cells are arrested in mitosis?

      We apologize for the confusion due to our incorrect usage of the term “mitotic arrest.” Mitotic arrest was one of possibilities that we considered when first examining banp mutant phenotypes, in which we just observed accumulation of mitotic (pH3+) cells. However, when we examined mitosis in Banp morphants using live imaging, we found that mitosis duration is significantly prolonged because of chromosome segregation defects in Banp morphants, but that all 28 mitoses we examined eventually completed cytokinesis. Thus, we finally concluded that mitotic cells are not permanently arrested in M phase, but that mitosis is prolonged. To prevent confusion, we have changed “mitotic arrest” to “mitotic cell accumulation” or simply “mitotic defects” in the Results section on banp mutant phenotype analysis (shown in Figures 2 and 4).

      4- It would be important to know whether the authors can rule out that S-phase defects cause the M phase defects, or vice versa. Could there be a primary defect, rather than multiple independent defects as the authors conclude?

      We thank reviewer #2 for this suggestion. Interdependence between S phase defects and M phase defects is important to correctly interpret the data on cell-cycle regulation, especially cell-cycle checkpoint and DNA damage response. Indeed, there are interesting reports using in vitro cell culture systems indicating that replication stress induces mitotic death through specific pathways (for example, Masamsetti et al., 2019, Nat. Comm. 10.4224). However, this topic is still challenging to dissect in vivo. In terms of our findings on Banp functions in zebrafish, we found that two chromosome segregation regulators, ncapg and cenpt, are direct transcription targets of Banp, and that it is likely that loss of Banp causes mitotic defects through downregulation of cenpt and ncapg. From this point, we conclude that mitotic defects are primary effects of the loss of Banp. The next question is how the loss of Banp stalls DNA replication forks and causes subsequent cell death. To address this question, we examined whether Banp direct targets include cell-cycle regulators, especially in S phase. We found that wrnip1 is an interesting candidate, because Wrnip1 reportedly protects stalled replication forks and promotes fork restart after DNA replication stress. In addition, Wrnip1 functions in interstrand crosslink repair (ICR). We found that the mRNA expression level of wrnip1 is markedly decreased in banp mutants, suggesting the possibility that DNA replication stress may be caused by reduction of wrnip1 expression in banp mutants. We present these data in new Figure 6-figure supplement 2. We have revised the possible role of Banp in cell-cycle regulation in new Figure 7. Under this scenario, we consider it likely that loss of Banp may cause DNA replication stress through downregulation of S phase regulators, independent of mitotic defects. However, we cannot exclude the possibility that DNA replication stress causes mitotic defects in banp mutants. Masamsetti et al., 2019, Nat. Comm. 10.4224, revealed that replication stress induces spindle assembly checkpoint (SAC)-dependent mitotic arrest and subsequent mitotic death when tp53 activity is inhibited. We showed that cell death in zebrafish banp mutant retinas was fully suppressed by tp53-MO at 48 hpf, but still occurred at 72 hpf, although there was no significant difference between wildtype and banp mutants (Figure 3GH). In the manuscript, we mentioned the possibility that some tp53-independent mechanism induces retinal apoptosis in banp mutants after 48 hpf. An alternative possibility is that most cell death in banp mutants depends on tp53; however, replication stress persisting in banp mutants injected with MO-tp53 may cause SAC-mediated mitotic death, as reported by Masamsetti et al., 2019. Future studies will be necessary to clarify this possibility.

      Reviewer #3 (Public Review):

      Babu and colleagues demonstrate that banp is expressed in the retina progenitor cells among other locations, and mutational loss of it results in increased mitosis, increased apoptosis, increased DNA damage, and the failure to differentiate photoreceptors. Importantly, these phenotypes are seen at a time period when retina progenitors undergo rapid cell cycles and differentiate into multiple cell types that make up the fully developed retina. Rescue with the wild type and phenocopy with another mutant allele provide strong support that the phenotypes result from loss of banp. Mutant animals show elevated p53 protein and reduction of p53 delays the onset of apoptosis by 24 hours. Mutant animals show an altered transcriptional profile, with increased p53 expression and decreased expression of two genes that encode proteins needed for chromosome segregation. The authors propose that loss of banp results in defective DNA replication and DNA damage as well as mitotic chromosome segregation failures, all of which contribute to p53-dependent apoptosis to reduce cell number and cause developmental defects.

      Banp is a very interesting protein. Also known as Scaffold/matrix attachment region binding protein 1, it is known to regulate the transcription of a number of genes including those important in oncogenesis. In vivo function of Banp, especially in the context of normal development, remains to be better understood. The current study fills this knowledge gap but I have some concerns about the interpretation of the data, the presentation and the potential impact. Specifically:

      We are very pleased that reviewer #3 understood and appreciated the significance of our study.

      Increased expression of atm and atr is observed and the authors suggest that replication stress and DNA damage activate the checkpoints to cause cell cycle arrest. There are several problems with this conclusion, which is depicted in Fig. 4G. Checkpoint activation occurs via phosphorylation changes in ATM/ATR and not through their transcriptional upregulation, which would take too long for a response that occurs within minutes.

      We agree with the referee that upregulation of ATR/ATM mRNA expression may represent chronic activation of DNA replication stress and DNA damage response. In addition to ATR/ATM mRNA upregulation, RNA-seq analysis revealed that exo5 is one of the top 15 upregulated genes in banp mutants (Fig. 3B). exo5 plays a critical role in ATR-dependent replication restart (Hambarde et al., 2021), suggesting that chronic replication stress occurs in banp mutants. We have mentioned exo5 upregulation in the Results section. As Referee 1 suggested, phosphorylation of H2AX is induced by ATR prior to DSBs, indicating that gamma-H2AX is a marker of DNA replication stalling as well as of DSBs. We showed that gamma-H2AX+ cells are more numerous in banp mutants (Figure 4CF) and morphants (Figure 4-figure supplement 1AB) and in S phase banp mutant cells (Figure 4-figure supplement 1CDEFF’), suggesting that DNA replication stress and subsequent DNA damage linked to fork breakage are induced in banp mutants. We have revised the text by adding this statement in the Results section. In addition, we have revised Fig. 4G and its legend, in order to more clearly show the role of ATR and ATM in DNA replication fork repair and HR-mediated DNA repair in response to DSBs, and tp53-mediated regulation of cell survival and death.

      ATM/ATR-dependent checkpoints arrest cells in G1 or G2 so you would expect reduced S and M phases. Yet, the authors saw increased M and no change in S.

      It is puzzling that BrdU+ cell number does not change because if cells are indeed arrested in mitosis, they should be prevented from going into S phase and BrdU+ cell numbers should decrease.

      There is no significant difference in the BrdU+ fraction of total retinal cells between wild-type and banp mutants at 48 hpf (Fig. 2-figure supplement 1AC), suggesting that cell-cycle arrest in S phase does not occur at significant levels in banp mutants at 48 hpf. At present, we have no good tool to detect G1 phase in zebrafish developing retina, because the Cdt1 fluorescent protein of the FUCCI zebrafish line cannot be stably driven in highly proliferating tissues such as zebrafish retina due to its very short G1 duration. Thus, we cannot determine whether G1 arrest occurs in banp mutant retina. However, we found that mRNA expression of p21 cdk inhibitor is upregulated in banp mutants, using bulk RNA-seq (Figure 3AB) and RT-PCR (Figure C), so it is still possible that banp mutant retinal cells are (probably partially) arrested in G1 phase. We have added this possibility to the Discussion. Further study is necessary to evaluate this point.

      It is not addressed whether cenpt and ncapg are expressed in the retina and whether their expression is decreased in banp mutants. The RNAseq data is from whole animals.

      RNA-seq data (Fig 3AB) were obtained from embryonic heads, but not whole bodies (see Materials and Methods). In accordance with this suggestion, to examine whether cenpt and ncapg mRNAs are expressed in retina, we performed in situ hybridization. We confirmed that these mRNAs are expressed in proliferative cells in zebrafish retina and have added these data to new Figure 5-figure supplement 1. In addition, we also confirmed that cenpt and ncapg mRNA expression is absent in banp mutants (see panels at 48 hpf in Fig. 5-figure supplement 1).

      The rescue by banp-EGFP in Fig. 1G is very nice. But it looks like there is partial rescue also with EGFP-banp(rw337) in the same panel. The defects in the last panel do not seem as severe as in non inj controls. There are fewer pyknotic nuclei and the cell layers lack gaps. Quantification of the extent or reproducibility of the rescue is lacking.

      We conducted acridine orange (AO) staining of retinas of wild-type, banp mutants, and banp mutants injected with banp(wt)EGFP and with EGFP-banp(rw337). We confirmed that banp(wt)EGFP significantly suppressed apoptosis in banp mutant retinas, whereas EGFP-banp(rw337) did not. We have added these data to new Figure 1-figure supplement 5. So, there is no partial rescue by EGFP-banp(rw337).

      Some of the conclusions lack supporting data. For example, line 99: "Thus, Banp is required for integrity of DNA replication and DNA damage repair." There are no data for the integrity (meaning 'fidelity'?) of DNA replication and there are no DNA repair assays.

      We are grateful for this suggestion. We understand that the term “integrity” could be too strong and have changed it to “regulation.”

      In another example, non-overlap of pH3 (M phase) and caspase+ cells is interpreted to mean that cells are dying in S phase (Figure 2 supplement 1). But the data are equally consistent with cells dying in G1 and G2.

      In addition to non-overlap of the pH3+ and caspase+ areas along the apico-basal axis of the retina (Fig.2-figure supplement 1DG), we did not observe mitotic death in our live imaging of mitosis in banp morphant retinas. Considering the very short G2 phase of retinal cells in zebrafish, we conclude that apoptosis occurs mostly in retinal progenitor cells undergoing G1 or S phase, or differentiating neurons. However, we cannot exclude the possibility that apoptosis occurs in G2 phase. So, we have revised the text. Furthermore, caspase 3+ cells were mostly located in the intermediate zone of the neural retina along the apico-basal axis, whereas pH3+ cells were localized at the apical surface of the neural retina (Fig. 2-figure supplement 1G), suggesting that apoptosis occurs mostly in retinal progenitor cells during G1, S or G2 phase, or in differentiating neurons. Accordingly, we have revised Fig. 2-figure supplement 1L, to suggest that apoptosis may be induced in G1, S, or G2 phase.

      The model in Figure 7 includes components without accompanying supportive data. For example, the arrow from Banp to DNA repair that indicates a direct role and the arrow from tp53 to delta113 tp53 that indicates direct activation.

      We appreciate this suggestion. We have revised Figure 7 and its legend. In new Figure 7, we used solid arrows for regulatory pathways confirmed by us and by other groups previously, and dotted arrows for proposed regulatory pathways. We already cited a reference (Chen et al., 2009) indicating direct activation of ∆113 tp53 by FL tp53.

      The data that together support a single point are often split up among figures. For example, increased pH3+ cells are shown in Fig. 2 and interpreted as mitotic arrest. But it is equally possible that cells are undergoing extra divisions (and then dying). Support for mitotic arrest is provided by live imaging of mitosis, which is not presented until the last figure (Fig. 6). There are many such instances in the manuscript.

      A similar concern was raised by reviewer #2. Please see our response.

      Banp is already known for roles in p53-dependent transcription and in apoptosis (e.g. Sinha et al papers cited in the manuscript). Banp is also known to bind to the promoter regions of cenpt and ncapg (Grand et al and Mathai et al papers cited in the manuscript). These genes are known to be involved in mitosis in zebrafish (Hung et al and Seipold et al papers cited in the manuscript). In terms of what is new about banp function in this report, the requirement for banp in a critical phase of retina development and spontaneous induction of DNA damage come to mind. Unfortunately, how loss of banp leads to this defect remains to be addressed.

      A related concern was raised by the editors and also by reviewer #2. Please see our responses. We found that wrnip1 mRNA expression is drastically reduced in banp mutants, which may cause DNA replication stalling and abnormal phenotypes.

    1. Author Response:

      Reviewer #3 (Public Review):

      Two cell types in the parasubthalamic nucleus (a region of the posterior hypothalamus) are activated following food intake. The authors determine that the Tac1 expressing population is sufficient to suppress food intake and the Crh population does not influence food intake. Further, the authors demonstrate that only the Tac1 population projects to the PBN. The Tac1 neurons are transiently activated following food presentation or satiation hormones (for about 1 minute). This transient change in activity is interesting and fits into a lot of other recently published work showing transient neural activity changes that are involved in longer term behavior. Longer term activation of these neurons reduces food intake and the authors begin to explore the circuits/networks that these neurons influence. Overall, the work is well done and the experiments support the conclusions. Some minor clarifications could enhance the manuscript and could be addressed through further analysis or adding in text.

      1. What % of the overall PSTN neurons are tac1/crh (ie, how many other cell types are there?). Or what % of the vglut2 neurons do they make. This just requires further analysis of the current dataset. And, are there any GABAergic cells (like are the PV GABAergic)?

      We thank the Reviewer for suggesting this analysis because it is interesting and other readers are likely to ask the same questions. In our original submission we were hesitant to report these values because they ultimately represent an approximation. Because the neurons that surround the PSTN are also glutamatergic (including the subthalamic nucleus and the lateral hypothalamic area), it is impossible to precisely delineate the border of the PSTN using Slc17a6 as a marker. However, this is an important question and we feel that reporting these values while qualifying them as an estimation will be impactful. Therefore, in the revised manuscript, we now include the following statement:

      “Although it is impossible to delineate a precise border for the PSTN using Slc17a6 because adjacent regions are also glutamatergic, we estimate that ~22% of Slc17a6-expressing neurons within the PSTN region do not express either Tac1 or Crh, indicating the presence of glutamatergic PSTN cell types that may express other unique genetic markers.”

      We did not examine GABAergic expression in the PSTN because the Allen Brain Atlas and recent RNA-Seq studies (e.g., Wallén-Mackenzie et al., 2020) found an almost complete absence of Gad1- and Gad2-expressing cells in the PSTN region. We report this previous finding within the Results:

      “Expression of the GABAergic markers Gad1 and Gad2 is notably absent from the PSTN region (Shah et al., 2022).”

      2. The 60 second increase in tac1 neuron activity is interesting. In the discussion, the authors present some plausible arguments for how that may affect feeding for hours. Additionally, it would be nice to point out that this is a recurring theme. This occurs in other neuron populations that influence food intake. Although this is seemingly counterintuitive, I think it is good to mention as these short-term neural activity changes are clearly having large effects on behavior and it is important for everyone to realize this.

      This point is an excellent observation and we agree that we could highlight other studies showing transient activation of neural activity controlling food intake. Therefore, we added to our Discussion:

      “Indeed, many other neural populations that regulate food intake behavior also show a transient increase in neural activity on the timescale of seconds (Berrios et al, 2021; Luskin et al., 2021; Mohammad et al., 2021; Wu et al., 2022).”

      3. Something a little strange with the meal frequency. I thought CCK reduced meal size not frequency. Why does the rescue then increase frequency? Could it be that the rescue to the CCK is by a different means than just blocking the effect of CCK? Adding some language to the discussion about how to interpret the satiation peptide data would be useful.

      We thank the Reviewer for bringing up this interesting point. Previous studies do indicate that CCK (and also amylin, to a large extent) reduces meal size and does not have much of an effect on meal frequency. We therefore added a paragraph to the Discussion to note and discuss this point:

      “It is also noteworthy that chemogenetic inhibition of PSTN^Tac1 neurons attenuates the effects of amylin, CCK, and PYY by decreasing the frequency of meals as opposed to meal size or meal duration (Figure 5). Previous studies of these anorexigenic hormones, especially amylin and CCK, indicate that they affect food intake primarily by decreasing meal size as opposed to meal frequency (Drazen and Woods, 2003; Lutz et al., 1995; West et al., 1987). Therefore, inhibition of PSTN^Tac1 neurons might attenuate the effects of these hormones indirectly, perhaps by reducing activity in downstream populations such as the NTS or PBN. In this model, infusion of anorexigenic hormones activates PSTN^Tac1 neurons that, in turn, cause sustained activation of downstream populations. Without this sustained activity, downstream populations may not have sufficient activity to cause a reduction in the intermeal interval, leading to increased bouts of feeding. The mechanism by which anorexigenic hormones activate PSTN^Tac1 neurons, as well as how decreases in PSTN^Tac1 neuronal activity affect downstream populations, are important topics for future investigation.”

      4. The axonal stimulation data needs qualification - as axons could project to multiple target regions (like the projections to the PVT could also have a collateral to the CEA). For this type of experiment, I prefer to use the phrase "neurons with a projection to region X do behavior Y". Otherwise, the implication in reading the results is that the particular projection is mediating the behavior. Also, the collateral issue, which is qualified in the discussion, should be mentioned here.

      We see the Reviewer’s point and have revised the language to highlight this important qualification of our results. Specifically, we added text in the Results section in regard to Figure 8:

      “Because it is unknown whether PSTN neurons send collateral projections to multiple brain regions, it is possible that stimulation in a single projection target causes antidromic activation of one or more other target areas. Therefore, these results indicate that PSTN^Tac1 neurons with projections to the CeA, PVT, PBN, and NTS can suppress food intake, although the exact functional role of each downstream target region on food intake behavior remains undetermined.”

  5. Mar 2022
    1. Author Response:

      Reviewer #2 (Public Review):

      In their supplementary section A.3-1.5 the authors perform QTL simulations to assess the performance of their analysis methods. Of particular interest is the performance of their cross-validated stepwise forward search methodology, which was used to identify all the QTL. However, a major limitation of their simulations was their choice of genetic architectures. In their simulations, all variants have a mean effect of 1% and a random sign. They also simulated 15, 50, or 150 QTL, which spans a range of sparse architectures, but not highly polygenic ones. It was unclear how the results would change as a function of different trait heritability. The simulations should explore a wider range of genetic architectures, with effect sizes sampled from normal or exponential distributions, as is more commonly done in the field.

      As suggested, we have expanded the range of simulations we explore in the revised manuscript. We note that the original simulations discussed in the manuscript involve exponentially distributed effect sizes (with a mean of 1% and random sign) at multiple different heritability values. These are described in Figures A3-4 and A3-5. We also simulated epistatic terms (Figure A3-3.3). In the revision, we have broadened the simulations to add more ‘highly polygenic’ architectures (1000 QTL). We find that the algorithm still performs well, though worse than when 150 QTL are simulated. The forward search behaves in a fairly intuitive way: QTLs get added when the contribution of a true QTL to the explained phenotypic variance overcomes the model bias and variance. QTLs are only missed if their effect size is too low to contribute significantly to phenotypic variance, or if they are in strong linkage and thus their independent discovery barely increases the variance explained (which is all finally controlled by the trait heritability). At much higher polygenicity, composite QTL can be detected as a single QTL when their sum contributes to phenotypic variance, and get broken up if and only if independent sums also contribute significantly to phenotypic variance. Of course, there are many ways to break up composite QTL, but the algorithm proceeds in a greedy fashion focusing on unexplained variance. We have also explored cases with multiple QTL of the same effect, and with different mean effects or different numbers of epistatic terms, but we found these results were largely redundant. To summarize these conclusions, we have added the following discussion at the end of the results section: “The behavior of this approach is simple and intuitive: the algorithm greedily adds QTL if their expected contribution to the total phenotypic variance exceeds the bias and increasing variance of the forward search procedure, which is greatly reduced at large sample size. Thus, it may fail to identify very small effect size variants and may fail to break up composite QTL in extremely strong linkage.”

      We have also added additional clarification in the Appendix: “These results allow us to gain some intuition for how our cross-validated forward search operates. […] However, while our panel of spores is very large, it remains underpowered in several cases: 1) when QTL have very low effect size, therefore not contributing significantly to the phenotypic variance, and 2) when composite QTL are in strong linkage and few spores have recombination between the QTL, then the individual identification of QTL only contributes marginally to the explained variance and the forward search may also miss them.”
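      The greedy behavior described above can be illustrated with a minimal sketch. It is not the pipeline used in the manuscript: it is a simplified forward stagewise search that adds one SNP at a time, does not jointly re-optimize previously added effects, and uses a single held-out split instead of full cross-validation. The function names (`forward_search`, `simulate`), the ±1 genotype coding, and all simulation parameters are assumptions made for illustration.

```python
import random


def slope(x, r):
    """Least-squares slope of residual r on a single genotype column x."""
    sxx = sum(v * v for v in x)
    return sum(a * b for a, b in zip(x, r)) / sxx if sxx else 0.0


def mse(r):
    """Mean squared residual."""
    return sum(v * v for v in r) / len(r)


def forward_search(train_cols, train_y, val_cols, val_y, max_qtl=10):
    """Greedy search; returns {snp_index: estimated_effect} in order of addition."""
    mean_y = sum(train_y) / len(train_y)
    res_tr = [y - mean_y for y in train_y]
    res_val = [y - mean_y for y in val_y]
    selected = {}
    best_val = mse(res_val)
    while len(selected) < max_qtl:
        best = None  # (training variance explained, SNP index, effect)
        for j, col in enumerate(train_cols):
            if j in selected:
                continue
            b = slope(col, res_tr)
            gain = b * b * sum(v * v for v in col)
            if best is None or gain > best[0]:
                best = (gain, j, b)
        if best is None:
            break
        _, j, b = best
        new_val = [r - b * x for r, x in zip(res_val, val_cols[j])]
        # Stop as soon as the candidate no longer improves held-out error.
        if mse(new_val) >= best_val:
            break
        selected[j] = b
        res_tr = [r - b * x for r, x in zip(res_tr, train_cols[j])]
        res_val, best_val = new_val, mse(new_val)
    return selected


def simulate(n, n_snps, effects, noise_sd, rng):
    """Hypothetical unlinked SNPs coded -1/+1 with additive effects plus noise."""
    cols = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(n_snps)]
    y = [sum(b * cols[j][i] for j, b in effects.items()) + rng.gauss(0, noise_sd)
         for i in range(n)]
    return cols, y


rng = random.Random(0)
cols, y = simulate(400, 20, {2: 1.0, 7: -0.8, 15: 0.6}, 0.5, rng)
found = forward_search([c[:300] for c in cols], y[:300],
                       [c[300:] for c in cols], y[300:])
print(sorted(found.items()))
```

      In this toy setting the SNPs are unlinked, so the sketch does not reproduce the composite-QTL behavior discussed above; it only shows how a held-out stopping rule limits the greedy additions.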

      In this simulation section, the authors show that the lasso model overestimates the number of causal variants by a factor of 2-10, and that the model underestimates the number of QTL except in the case of a very sparse genetic architecture of 15 QTL and heritability > 0.8. This indicates that the experimental study is underpowered if there are >50 causal variants, and that the detected QTL do not necessarily correspond to real underlying genetic effects, as revealed by the model similarity scores shown in A3-4. This limitation should be factored into the discussion of the ability of the study to break up "composite" QTL, and more generally, detect QTL of small effect.

      We agree with some aspects of this comment, but the details are a bit subtle. First, we note that the definition of underpowered depends on the specifics of the QTL assumed in the simulation. In addition, many of the simulations were performed at 10,000 segregants, not at 100,000, with no effort to enforce a minimum effect size, or minimum distance between QTL. For example, if 100 QTL are all evenly spaced (in recombination space) and all have the same effect such that they all contribute the same to the phenotypic variance, then the algorithm is in principle maximally powered to detect these. This is why our algorithm is capable of finding >100 QTL per environment. On the other hand, just 2 QTL in complete linkage cannot be distinguished and no panel size will be able to detect these.

      However, we do agree with the general need to discuss the limitations in more detail and have clarified these concerns in the ‘Polygenicity’ result section. We have also reiterated the limitations of the LASSO approach within the simulation section. The motivation for an L0 normalization in this data was first discussed in the section A3-1.3: “Unfortunately, a harsh condition for model consistency is the lack of strong collinearity between true and spurious predictors (Zhao & Yu, 2006). This is always violated in QTL mapping studies if recombination frequencies between nearby SNPs are low. In these cases, the LASSO will almost always choose multiple correlated predictors and distribute the true QTL effect amongst them.”

      In section A3-2.3, the authors develop a model similarity score presented in A3-4 for the simulations. The measure is similar to R^2 in that it ranges from 0 to 1, but beyond that it is not clear how to interpret what constitutes a "good" score. The authors should provide some guidance on interpreting this novel metric. It might also be helpful to see the causal and lead QTLs SNPs compared directly on chromosome plots.

      We agree that this was unclear, and have added additional discussion in the main text describing how to interpret the model similarity score. Essentially, the score is a Pearson’s correlation coefficient on the model coefficients (as defined in section A3-2.3, after equation A3-28). However, given a single QTL that spans two SNPs in close linkage, a pure Pearson’s correlation coefficient would have high variance, as subtle noise in the data could lead to one SNP being called the lead SNP vs the other, and two models that call the same QTL might have either 100% correlation, or 0% correlation. Instead, our model similarity score ‘aligns’ these predicted QTL before obtaining the correlation coefficient. The degree to which QTL are aligned is based on penalties with respect to collinearity (or linkage) between the SNPs, and the maximum possible score is obtained by dynamic programming. Similar to sequence alignments between two completely unrelated sequences, a score of 0 is unlikely to occur on sufficiently large models as at least a few QTL can usually be paired (erroneously). We have also added a mention in the main text referring to Figures A3-3, A3-7, A3-8, A3-9, which show the causal and lead QTL SNP directly on the chromosome plots.

      The authors performed validation experiments for 6 individual SNPs and 9 pairs of RM SNPs engineered onto the BY background. It was promising that the experiments showed a positive correlation between the predicted and measured fitness effects; however, the authors did not perform power calculations, which makes it hard to evaluate the success of each individual experiment. The main text also does not make clear why these SNPs were chosen over others: was this done according to their effect sizes, or was other prior information incorporated in the choice to validate these particular variants? The authors chose to focus mostly on epistatic interactions in the validation experiments, but given their limited power to detect such interactions, it would probably be more informative to perform validation for a larger number of individual SNPs in order to test the ability of the study to detect causal variants across a range of effect sizes. The authors should perform some power calculations for their validation experiments, and describe in detail the process they employed to select these particular SNPs for validation.

      We agree with the thrust of the comment, but some of the suggestions are impossible to implement because of practical constraints on the experimental methods (and to a lesser extent on the model inference). First, we chose the SNPs to reconstruct based on three main factors: (a) to ensure that we are validating the right locus, the model must have a confident prediction that that specific SNP is causal, (b) the predicted effect must be large enough in at least one environment that we would expect to reliably measure it given the detection limits of our experimental fitness measurements, and (c) the SNP must be in a location that is amenable to CRISPR-Cas9 or Delitto Perfetto reconstruction. In practice, this means that it is impossible to validate SNPs across a wide range of effect sizes, as smaller-effect SNPs have wider confidence intervals around the lead SNP (violating condition a) and have effects that are harder to measure experimentally (violating condition b). In addition, because the cloning constraints mentioned in (c) require experimental testing for each SNP we analyze, it is much easier to construct combinations of a smaller set of SNPs than a larger set of individual SNPs. Together, these considerations motivated our choice of specific SNPs and of the overall structure of the validation experiments (6 individual and 9 pairs, rather than a broader set of individual SNPs).

      In the revised manuscript, we have added a more detailed discussion of these motivations for selecting particular SNPs for validation, and mention the inherent limitations imposed by the practical constraints involved. We have also added a description of the power and resolution of the experimental fitness measurements of the reconstructed genotypes (we can detect approximately 0.5% fitness differences in most conditions). We are unsure if there are any other types of power calculations the reviewer is referring to, but we are only attempting to note an overall positive correlation between predicted and measured effects, not making any claims about the success of any individual validation (these can fail for a variety of reasons including experimental artifacts with reconstructions, model errors in identifying the correct causal SNP, unresolved higher-order epistasis, and noise in our fitness measurements, among others).

      In section A3-1.4, the authors describe their fine-mapping methodology, but as presented it is difficult to understand. Was the fine-mapping performed using a model that includes all the other QTL effects, or was the range of the credible set only constrained to fall between the lead SNPs of the nearest QTL or the ends of the chromosome, whichever is closest to the QTL under investigation? The methodology presented on its face looks similar to the approximate Bayes credible interval described in Manichaikul et al. (PMID: 16783000). The authors should cite the relevant literature, and expand this section so that it is easier to understand exactly what was done.

      We have attempted to clarify section A3-1.4. As the reviewer correctly points out, the fine mapping for a QTL is performed by scanning an interval between neighboring detected QTL (on either side) and using a model that includes all other QTL. For example, if a detected QTL is a SNP found in a closed interval of 12 SNPs produced by its two neighboring QTL, 10 independent likelihoods are obtained (re-optimizing all effect sizes for each), and a posterior probability is obtained for each of the ten possible positions. We have cited the recommended paper, as our approach is indeed based on an approximate Bayes credible interval similar to the one described in that study (using all SNPs instead of markers). We have added the following sentence to the A3-1.4 section at the end of the second paragraph (similar to the analogous paragraph in Manichaikul et al): “[…] as above by obtaining the maximum likelihood of the data given that a single QTL is found at each possible SNP position between its neighboring QTL and given all detected other QTL (thus obtaining a likelihood profile for the considered positions of the QTL). We then used a uniform prior on the location of the QTL to derive a posterior distribution, from which one can derive an interval that exceeds 0.95.” Some typos referring to a ‘confidence’ interval were also changed to ‘credible’ interval.
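      As an illustration of the last step only, the sketch below turns a profile of maximized log-likelihoods (one per candidate SNP position) into a posterior under a uniform prior, then collects positions in decreasing posterior order until the requested mass is reached. The function name `credible_interval`, the example log-likelihood values, and the choice to report the span of the collected positions are assumptions for illustration, not the manuscript's implementation.

```python
import math


def credible_interval(log_likelihoods, mass=0.95):
    """Posterior over candidate positions (uniform prior) and a credible interval.

    log_likelihoods[i] is the maximized log-likelihood of the data with the
    QTL placed at candidate position i, all other detected QTL included.
    Returns (posterior, (first_index, last_index)).
    """
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_likelihoods)
    weights = [math.exp(ll - top) for ll in log_likelihoods]
    total = sum(weights)
    posterior = [w / total for w in weights]

    # Add positions from most to least probable until the mass is covered.
    order = sorted(range(len(posterior)), key=lambda i: -posterior[i])
    chosen, covered = [], 0.0
    for i in order:
        chosen.append(i)
        covered += posterior[i]
        if covered >= mass:
            break
    return posterior, (min(chosen), max(chosen))


# Hypothetical profile over 7 candidate positions between two neighboring QTL.
profile = [-10.0, -5.0, -1.0, 0.0, -1.0, -6.0, -12.0]
posterior, interval = credible_interval(profile)
print(interval)
```

      With a uniform prior the posterior is simply the normalized likelihood, so the width of the interval is driven entirely by how sharply the likelihood profile falls away from the lead SNP.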

      The text explicitly describes an issue with the HMM employed for genotyping: "we find that the genotyping is accurate, with detectable error only very near recombination breakpoints". The genotypes near recombination breakpoints are precisely what is used to localize and fine-map QTL, and it is therefore important to discuss in the text whether the authors think this source of error impacts their results.

      This is a good point. We have added a reference in the main text to the Appendix section (A1-1.4) that has an extensive discussion and analysis of the effect of recombination breakpoint uncertainties on fine-mapping.

      The use of a count-based HMM to infer genotypes has been previously described in the literature (PMID: 29487138), and this should be included in the references.

      We now also add this citation to our text on the count-based HMM.

    1. Author Response:

      Reviewer #1 (Public Review):

      Major Comments

      I am concerned that a lot of these studies had relatively low n numbers (n=5 in some cases) and that some of the studies may have been underpowered. Given the variability with in vivo studies, some endpoints may have been significant with more numbers. Along these lines, what is the justification for using the (parametric) ANOVA test? I'm not a statistician but I thought that the rule of thumb was that non-parametric tests should be used if n<12 since you cannot verify that the data is normally distributed. In this case, I would recommend having a statistician look at it and/or increasing some of the N's, or using the non-parametric Kruskal-Wallis test. Indeed, in some cases, the variation is quite large (e.g., Fig 6, 7). Whilst I do not think that the low N's change the ultimate conclusions, more rigor (i.e., more N's) would help solidify the paper given that it will likely be of great interest and scrutinized by the scientific community.

      We conducted power analyses prior to the start of the studies to identify the number of animals per group to use, based on our past studies of inflammatory changes induced by inhalants, infections and asthma. We set the target number of mice (n) at that time, such that these studies would be powered to detect a 25% change in cytokine expression. We reviewed all of the data with our biostatisticians and came to the conclusion that it would not be statistically appropriate to run more mice to increase the n when our primary outcome remains the same. We double-checked that the ANOVAs with corrections for multiple comparisons were correct for each particular experiment. Discussion with our statistician confirmed that ANOVA is correct as long as the data passed normality testing, which was done. An additional point, and most relevant to this specific recommendation, is that JUUL Mint and JUUL Mango flavors are no longer on the market, such that extensive further studies are not feasible. While these two flavors are not available anymore, they were composed of an array of chemicals commonly found in other flavors (but in different combinations), such that we believe that these data are most likely relevant to other vapes. In particular, JUUL Mint shares chemical features with JUUL Menthol, which took its place as one of the most popular JUUL flavors. The discontinuation of these flavors has been added as a limitation within the Discussion.
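      For readers unfamiliar with this kind of calculation, the sketch below shows one common way to estimate group size for detecting a given relative change between two groups. It is not the authors' power analysis: it uses the normal approximation for a two-sided, two-sample comparison, which tends to underestimate n for very small groups, and the coefficient of variation values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(rel_change, cv, alpha=0.05, power=0.8):
    """Approximate animals per group for a two-sided two-sample comparison.

    rel_change: smallest relative difference to detect (0.25 = 25%).
    cv: assumed coefficient of variation (SD / mean) within each group.
    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) * cv / rel_change) ** 2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * cv / rel_change) ** 2)


# Hypothetical within-group variability of 15% versus 25%.
for cv in (0.15, 0.25):
    print(cv, n_per_group(0.25, cv))
```

      The required group size grows with the square of the ratio of variability to effect size, which is why endpoints with large within-group variation can remain underpowered even when the design target was met on paper.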

      Fig S3. For the lung histology, please quantify the mean linear intercept per ATS guidelines and show representative BAL images.

      We have conducted the mean linear intercept (MLI) measurements on e-cigarette aerosol exposed lungs and controls per ATS guidelines and have added these data to the manuscript (new Appendix 1 – Figure 4M). We paired these data with the original histology images (Appendix 1 – Figure 4A-4L). We have added appropriate methods (pages 21-22) and results (page 9) as well. Of note, the MLI data match our original physiologic assessments of lung function (Appendix 1 – Figure 2A-2J), including elastance and compliance, which are known to change in the setting of emphysema. MLI, lung elastance and compliance were no different across inhalant groups and controls. Further, we have taken representative images of Giemsa Wright stained BAL samples, and have added these to the manuscript (new Appendix 1 – Figure 3E-3J and 3O-3T) paired with BAL cell count data.

      One of the most novel conclusions from this paper is increased inflammation in the brain which the authors speculate could lead to altered moods and or change the addiction threshold. I would tend to agree with this conclusion, but could the authors perform additional mouse psychological tests to confirm this? Also, were there observable physiological responses in the vaped mice that could be reported which may correlate this conclusion, ie changes in grooming, fur ruffling or other behavioral changes?

      We are thrilled that the Reviewer is as interested in these implications as we are, because we believe the neuroinflammation detected is quite frightening, particularly because it is likely to impact both behavior and mood. We have added further discussion regarding the potential consequences of inflammation in each of the organs (pages 13-19), with an emphasis on the effects of neuroinflammation on behavior and psychology. We have subdivided the Discussion section to highlight potential effects on each distinct organ.

      While we are not a behavioral lab, and thus running behavioral studies in mice is beyond the scope of both our lab and this manuscript, we agree that the neuroinflammation is of great interest and further studies are needed to best assess potential psychological and behavioral changes. Of note, we did not observe any overt behavioral changes - we closely observe the mice both during and after exposures and make notes regarding grouping, fur, and activity level - none of which were changed by the different vaping exposures. We have added the lack of dedicated behavioral and psychological evaluations as a limitation of this work and as an opportunity for discovery in future studies (pages 19-20).

      Minor comments

      Change title to state "in mouse". That this study was performed in rodents should be apparent from the outset.

      Actually, our original title does contain “in mice” at the end. Apologies if these words were cut off on your end. We do agree that the title should make it apparent that the study was conducted in mice. We wanted to make the title even clearer, so we replaced the brand name JUUL with the type of e-device. The title is as follows: “Effects of Mango and Mint pod-based e-cigarette aerosol inhalation on inflammatory states of the brain, lung, heart and colon in mice”

      No changes in collagen deposition were detected using basic histology. Have the reviewers considered performing immunohistochemistry and staining for alpha-smooth muscle actin which may be a more sensitive assay?

      We agree with the reviewer that there are more sensitive tools that can be used. We believe that, in our system, and at 3 months of exposure, JUUL Mint and Mango are not very likely to induce fibrosis, since our data on inflammatory markers and fibrosis-associated genes (in homeostatic conditions, Figure 3) show that there are no significant differences, and for some markers, JUUL Mint and Mango exposed mouse lungs even show less inflammation than Air controls. In addition, we also showed that no differences were observed in physiological assessments (heart rate, heart rate variability or blood pressure, Appendix 1 – Figure 1). Thus, we do not expect to find significant differences even with additional assays. We are planning on challenging mice with bleomycin in the future, as it may be possible to detect differences in fibrosis in the setting of this pro-fibrotic challenge.

      "Thus long term exposure to Juul does not lead to significant changes...". I would argue that 1-3 months is not long term. Indeed, other researchers have performed 6-12 month ecigarette exposures and it takes a lifetime in humans to develop lung disease after smoking. Since you can detect pro-inflammatory changes but no altered physiology, it may be that alterations in airway physiology are only just beginning.... The authors should modify this sentence and maybe not call their studies "long term".

      We agree with the reviewer and have modified the sentence as follows for a more accurate interpretation of our results (page 9): “Thus, 1 and 3 month exposure to JUUL Mint and Mango aerosols may not cause significant changes in airway physiology, but this does not preclude the possibility that changes may occur with longer exposures, such as 6-12 months.” We have also gone through the entire manuscript to focus on describing our exposure in terms of months instead of the descriptive terms acute / sub-acute / chronic, and we have removed the word chronic from the title.

      "Differences in LPS induced cytokine levels were no longer observed after 3 month JUUL exposure versus Air control groups". As per the major comments, this might be a power issue - there is certainly a trend for some cytokines.

      It has been seen in prior studies that chronic inhalant use (including and most notably cigarette smoke) can lead to proinflammatory changes in the first days to weeks, but opposite effects thereafter. For example, cigarette smoke inhalation leads to inflammatory changes at 4 weeks that resolve by 12 weeks. Thus, we feel that some of the cytokine findings are not unusual or surprising versus other patterns of inhalant use. However, we agree with the reviewer that IL-1b in cardiac tissue trends in the same direction at 3 months in both JUUL Mint and JUUL Mango exposed mice (Figure 8C and 8D). As per one reviewer’s comments, we combined 1 and 3 month data for merged graphs (Appendix 1 – Figure 4) and when analyzed together (data passed normality testing) further differences at 3 months were identified (see IL-1b in Appendix 1 – Figure 4 panel 4B). We have included these additional figures for each dataset in the Appendix 1 files.

      Of note, because some JUUL flavors are no longer on the market, including JUUL Mint and JUUL Mango, we are unable to run additional studies with these flavors. We are running new studies of the impact of JUUL Tobacco and JUUL Menthol, the two remaining JUUL flavors on the market. However, these studies will take an additional 1-2 years and thus are beyond the scope of this manuscript. We have expanded the limitation section within the discussion with regards to power, in order to clarify to the readers that some findings are limited by the number of subjects.

      Reviewer #2 (Public Review):

      Under homeostasis conditions, the authors observed signs of inflammatory responses in the brain, the heart and the colon, while no inflammation was detected in the broncho-alveolar lavage fluid of the mice following exposures to JUUL aerosols. Also, JUUL aerosol exposures mediated airway inflammatory responses in the acute lung injury model (LPS). Further, this infection affected the inflammatory responses in the cardiac tissue. Most of the biological adverse effects induced by JUUL aerosols were flavor-specific.

      Strengths include evaluating inflammation in multiple organs, as well as assessing the physiological responses in the lungs (lung function) and cardiovascular system (heart rate, blood pressure), following exposures to JUUL aerosols. Weaknesses include the fact that only female mice were used in this study. Further, the daily exposures to either air or to the JUUL aerosols lasted only 20 min per day. It is unclear how a 20-min exposure is representative of human vaping product use. Also, although daily exposures were conducted for a duration of both 1 and 3 months, time-course effects associated with JUUL aerosols are barely addressed.

      We would like to thank the reviewer for their positive comments on our manuscript. We apologize for our error; in reality we exposed mice for 20 minutes three times daily, so one hour in total per day. We have corrected this error within our Methods. We designed the exposures this way to better mimic human e-cigarette use throughout the day (instead of one intense vaping session per day, which is not the norm). We agree that using only female mice is a limitation of the study in case there are sex-dependent effects, which is definitely an interesting question. We typically start with one sex of mice and then run repeat experiments with the other sex. Unfortunately, this study faced problems beyond our control that prevented us from performing further experiments. In late 2019 the FDA was moving to ban specific flavors for pod devices, including Mint and Mango. In anticipation of the new regulations, JUUL ultimately decided to discontinue JUUL Mint and Mango, and soon they were off the market. The same process occurred with the other popular JUUL flavors such as Crème Brûlée and Cucumber. We have expanded the limitation section within the Discussion, and have pointed out that because these studies were conducted in female mice alone, the results may not represent effects in males.

      Although there are a few limitations related to this study, which should be included in the manuscript, overall, the authors' claims and conclusions are based on the data that is presented through multiple figures.

      We appreciate the Reviewer's comments and have added limitations about the study size, power, lack of male subjects, etc. to the discussion section.

      Reviewer #3 (Public Review):

      Weaknesses

      1. The authors observed neuroinflammation in brain regions responsible for behavior modification, drug reward and formation of anxious or depressive behaviors after exposure to JUUL. The importance of the neuroinflammation is still unclear. It would help demonstrate the pathogenic role of the neuroinflammation by testing animal behaviors. Similar issue for other organ inflammation.

      We are an immunology, inflammation, and lung physiology lab; thus, behavioral studies are beyond the scope of both our lab and this manuscript. However, we agree that the neuroinflammation is of great interest and is highly likely to impact behavior and mood. Further studies are needed to best assess potential psychological and behavioral changes. We believe this work is important to share such that dedicated behavioral science labs can undertake these important studies. We have added these important limitations to the discussion.

      1. Majority of the data are inflammatory cytokine mRNA expression. Other methods would be needed to confirm their expression.

      Of note, in the original submission, we included protein quantification data for both the brain and the lung. We have taken the reviewer's comments to heart and have conducted protein-level assays on the cardiac tissues as well, yielding additional data (new Figure 4) that have been added to the methods, results, figures and discussion. Unfortunately, we do not have any additional colonic tissue for protein-level assessments, as all of the tissue was used for the gene transcription and histologic studies. But to take a step back, these studies were originally intended to examine the broad-reaching impact of e-cigarette aerosols across the body. This work, and thus this manuscript, was designed to highlight changes at the gene expression level, to demonstrate that e-cigarette use is not benign and does have broad-reaching effects on gene expression. We agree that more work is needed to fully define the impact of e-cigarette use at the protein, cellular, and organ level, but the majority of that work is beyond the scope of this manuscript. To bring the focus back to gene expression, we have conducted RNAseq on the lungs of JUUL exposed mice, and have included those data herein to highlight the effects of e-cigarette aerosols on gene expression in the lung, with a particular focus on differences between Mint and Mango flavors (the most popular JUUL flavors at the time of this study). These new data (new Figure 6) support the hypothesis that e-cigarette aerosol inhalation fundamentally alters the lung, which raises the specter of downstream health effects.

      1. The author seemed to assume the difference between JUUL Mango and JUUL Mint is flavor and then came up with the conclusion regarding flavor-dependent changes in several inflammatory responses. Evidence is needed to approve the assumption.

      Although the formulation of JUUL e-liquids is proprietary, their website claims simplicity (https://www.juul.com/learn/pods) in that they use pharmaceutical grade propylene glycol and glycerol (which make up the majority of their e-liquids) to form an aerosol that carries pharmaceutical grade nicotine and benzoic acid (which, when combined, create a nicotine salt), and flavors (which can be a mixture of natural and artificial ingredients). Thus, according to their website, the only difference among the different JUUL pods would be the flavoring components. Hence, we concluded that the differences observed in our study between Mint and Mango are most likely due to flavor-dependent effects, since the base components should be the same. In support of this flavor-dependent effect, a study by Omaiye et al. in 2019 (PMID: 30896936) showed the variety of flavoring chemicals in all JUUL flavors and how the different JUUL vapors induce different levels of cytotoxicity in BEAS-2B cells in vitro based on their flavor. We have added relevant discussion to the manuscript.

      1. In most cases, the change of inflammatory cytokines is mild ~2 fold. The author should demonstrate how these marginal changes could affect pathophysiology.

      We agree with the reviewer that the majority of changes in cytokines were relatively small. However, the fact that multiple cytokines are changing in concert indicates a significant shift in immunophenotype across organs. We are most concerned about how these shifts in the inflammatory state will alter an e-cigarette vaper’s response to common clinical challenges. In Dr. Kheradmand’s recent work, mice exposed to e-cigarette aerosols with and without nicotine were much more susceptible to acute lung injury in the setting of viral pneumonia. In our work, we utilized the LPS model of acute lung injury to take a first look at the potential impact of JUUL inhalation in particular on susceptibility to lung inflammation. Further work is needed to truly define how the subtle, broad shifts in the cytokine milieu across organs will impact the health of e-cigarette vapers. We have added relevant discussion to the manuscript.

      1. To fully evaluate the health impact of evolving cigarette, it would be informative to included other tobacco or vaping device as control.

      We agree that such comparisons are likely to provide insight into the differences between devices and formulations and versus cigarette smoke, and thus will be incredibly important for the field. However, these comparisons were beyond the scope of this study, whose main goal was to assess the inflammatory and physiological aspects of JUUL in particular. We believe this to be important because JUUL e-cigarettes are the most popular of all e-cigarette devices, and many young users do not use other e-devices or conventional tobacco. Thus, the primary objective of this work was to assess the safety or risk of this device in particular (versus not using any inhalant at all). However, because we have run parallel studies in the past with vape pens, box mods, and conventional tobacco, we hope to start combining data to look for trends and differences across inhalant exposures. For example, we recently published our work on differences in metabolites in the circulation of mice exposed to a wide variety of e-cigarette-based inhalants (Moshensky et al. Vaping induced metabolomic signatures in the circulation of mice are driven by device type, eliquid, exposure duration and sex. ERJ Open. July 2021 PMID: 34262972). This study is one of the few that have employed animal models to test JUUL devices and the only one assessing their effects in different organs, and although we agree that comparisons with other devices are important, they were not the goal of this study.

      1. The longest exposure in the study is 3 months. It is not convicting to come up with conclusions regarding chronic exposure. Some organ showing no difference may be due to the timing.

      We have altered the wording throughout the manuscript to clarify that the 3-month duration is equivalent to 10 to 20 years of inhalant use versus 40 to 50 years for a 6 to 12 month model. We have also removed many instances of the descriptive terms acute, sub-acute and chronic across the manuscript, and focused on using the absolute duration of exposure instead, to avoid accidental extrapolation to longer exposures. Because we utilized cellular and molecular based assays, we were not relying on identifying organ level pathology such as fibrosis, emphysema, and organ dysfunction, all of which would require longer exposures.

    1. Mauro's solicitation

      I think this website is a great way for educators to communicate and share their work and ideas. However, it is important to note that all information may not be as reliable as we expect!

    1. United States, researchers have long found that echo chambers are smaller and less prevalent than commonly assumed

      research continues to show that echo chambers are not as prevalent or important as we may think

      • reminds me of how facebook is known for this (Zucked)
  6. tandfbis.s3.amazonaws.com tandfbis.s3.amazonaws.com
    1. Cognitive flexibility is the ability to change how we think about something—to see things from another person’s point of view, consider multiple options, think of several ways to respond, and seek information that may not be readily available

      I think cognitive flexibility is a very interesting concept. It requires personal effort because it encourages us to change our mentality about something, which is important because we need to be able to think differently. It helps us to think of new ideas. This skill is very useful in academic and work environments, as it allows us to think while keeping another person's point of view in mind.

    1. Author Response:

      Reviewer #1

      1. “A major weakness was that the simulation algorithm was both highly complex, but insufficiently explained. As a consequence, it was not clear what the underlying assumptions of the simulations were and how these assumptions were based on and/or constrained by the experiments.”

      We have revised the section related to the simulation algorithm. This reviewer also raised a similar issue and suggested adding pseudocode or explaining it in plain language. We have therefore included two sections, “Cell-fate simulation algorithm” and “Cell-fate simulation options with Operation data”, as well as Figure 7, Figure 8 and Supplementary Figure 9.

      In our previous version of the manuscript, we named the data used for the simulation as “Source data”. However, we realize that this journal uses this term for other purposes. We have therefore changed “Source data” to “Operation data” to avoid confusion.

      1. “The single-cell analysis, including measuring lineages, by itself is not cutting-edge and has been done before, and so the novelty should be in the analysis.”

      We agree that single-cell tracking per se is not a new technology, and was carried out as early as 1989 using 16 mm film. However, it has not been used frequently in the field of cell biology because of its extremely laborious nature. Our focus was thus on the development of a single-cell tracking technique that could be used routinely in cell biological research. We therefore computerized the analysis (preprint, bioRxiv 508705; doi: https://doi.org/10.1101/508705 (2018)) to allow the generation of large amounts of single-cell tracking data for bioinformatics analysis. We have mentioned this in the Results (“System to investigate the functional implications of maintaining low levels of p53 in unstressed cells”).

      1. “However, in many cases, the resulting data is presented in a manner that does not rely on the single-cell tracking (e.g. total cell number vs time in Fig. 2, average frequency of events in Fig. 4).”

      We realize that we did not adequately explain the data relating to Figure 2. Counting experiments were performed to validate the results of single-cell tracking data, because such verification has not previously been performed. We therefore intended to produce a figure including both the actual counting data and single-cell tracking data together, to allow the readers to compare the results obtained by the different approaches. Although this reviewer commented that some data did “not rely on the single-cell tracking”, we would like to stress that the counting data were only used for the purpose of comparison. We have thus rewritten the “Effect of silencing the low levels of p53 on cell population expansion” in the Results, to clarify this.

      1. “The impact of p53 was only assessed on level of differences between experimental conditions (p53 siRNA or not), but p53 levels themselves were not measured and therefore not incorporated in the single-cell analysis.”

      To the best of our knowledge, there are currently no techniques that allow the expression levels of proteins or genes of interest to be determined in individual live cells that are being tracked, and which could thus be used to generate data for bioinformatics analysis. It may be possible to use cells expressing a fluorescence-tagged protein, but as noted by this reviewer, frequent excitation of fluorophores in cells could affect cell growth (phototoxicity). We have thus been searching since 2012 for a suitable technique that could be combined with single-cell tracking. If it becomes possible to perform an experiment similar to that suggested by this reviewer, it could potentially reveal many unknown cellular characteristics. We have revised the Discussion to consider this matter.

      1. “In general, differences between wild-type and p53 siRNA data were small, while cell-to-cell variability in p53 knock-down appears high (as judged by Supplementary Fig. 4). This leaves open whether the relatively minor difference between wild-type and p53 siRNA cells reflects variability in p53 knockdown between cells, which is currently not directly assessed.”

      With regard to the comment that “differences between wild-type and p53 siRNA data were small”, we would like to explain why the difference is small. In a typical study of p53, a lethal dose of an agent that kills a majority of growing cells within, e.g., 24-48 hrs is used to detect a difference from control cells. One reason for using a lethal dose is to make the status of the cells homogeneous, so that any alteration of interest can be detected using average-based techniques, which represent the alteration that occurred in a majority of cells. On the other hand, when lower doses of agents are used, cell-to-cell heterogeneity has to be taken into account, as only a certain group of cells in a cell population may respond to the agents. In this case, average-based analyses may detect only a small difference, or none at all, if only a small number of cells in the population respond. In contrast to average-based analysis, single-cell tracking is a technique that allows quantitative analysis of alterations that occurred in individual cells in a cell population. By western blotting, which is an average-based assay (Supplementary Fig. 4), the level of p53 in unstressed cells was reduced to 30%. As the levels of p53 in unstressed cells are already low, a 70% reduction in the amount of p53 may be considered small. However, at the individual-cell level, it was sufficient to increase cell death, multipolar cell division, and cell fusion (Fig. 4). Thus, analysis at the single-cell level can provide information that is difficult to obtain by average-based analysis.

      The comment related to “reflects variability”, however, made an important point. It is currently technically difficult to determine the expression levels of p53 or other proteins in individual live cells that are being tracked by long-term live-cell imaging. We therefore assumed that silencing reduced the levels of p53 in all the tracked cells. However, it is reasonable to expect variations in the silencing levels of p53 among individual cells, and it may be possible that cells in which p53 levels were reduced, e.g. to 0%, underwent cell death, while cells in which expression was only reduced to 50% underwent cell fusion, etc. Information on the levels of silencing in each cell would allow us to evaluate the relationship between p53 levels and the type of induced events. However, this analysis is currently technically difficult, as explained above. Nevertheless, the fact that silencing induced changes in cell fate suggested that the low background levels of p53 may have some functions. We have revised “Silencing of p53 and single-cell tracking” in the Results.

      Reviewer #2

      “The study's main weakness is the lack of empirical evidence from the simulation predictions of biology, and that the cellular consequences of p53 function were predictable and mostly confirmatory.”

      We appreciate these interesting comments regarding the similarities and differences of the empirical and simulation approaches. In empirical studies, a model or hypothesis is often based on the results of an analysis that aims to reveal characteristics of interest e.g. of cells. However, such a model or hypothesis generally needs to be confirmed or tested independently. We therefore considered simulation as a tool to build a model or hypothesis, which also needed to be confirmed or tested.

      Simulation could thus be considered as an additional tool, e.g. in addition to western blotting and DNA sequencing, which could generate different types of data than other existing techniques. We therefore think that such simulations could provide new options for cell biological studies. Regarding its “confirmatory” use, we think that simulation can be used to confirm existing models, but may also be used as a discovery tool. For example, p53-knockout cells are known to produce tetraploid cells, but how such cells are formed remains unclear. Single-cell tracking analysis can be used to fill the gap between the loss of p53 and tetraploid cell formation, and simulation can then be used to simulate the fate of cells generated by this loss.

      Although we focused on describing our approach using single-cell tracking and cell-fate simulation in our manuscript, we believe these methods could be used in combination with empirical studies, to widen the cell biological research options.

      We have discussed these issues in “Cell fate simulation and its applications” in the Discussion.

      Reviewer #3

      "Yet it is unclear how these results can be generalized because the authors only studied one cell line."

      The current work focused on addressing a biological question using single-cell tracking and cell-fate simulation; however, it will also be interesting to see if the proposed models can be generalized. Given that HeLa cells, in which p53 function is neutralized by papillomavirus E6 protein, also frequently undergo cell fusion followed by multipolar cell division and cell death (Sato, Rancourt, Sato and Satoh Sci Rep (2016) 6:23328), we believe that the low levels of p53 may also play a similar role in suppressing those events in many other types of cells.

      "The results are not compared to other cell lines or primary cells, in terms of baseline expression of p53. "

      We agree that it will be interesting to apply the methods in various types of cells and primary cell lines. However, there are significant variations in growth profiles among cell types. We have created live-cell imaging videos for > 30 cell lines, and found that each cell type showed unique characteristics in terms of growth patterns, frequencies of cell death, cell fusion, and multipolar cell division, and in the degree of cell-to-cell heterogeneity, implying that each cell type must be characterized using single-cell tracking analysis before moving on to studies using those cells, given that no such data are currently available. We believe that establishing a public data archive of single-cell tracking data will be useful for cell biological research, as well as for testing the current model.

      "In addition, it is unclear how this model is superior to testing homeostatic p53 compares to models that use mutated p53.”

      Most cancer cells carrying p53 gene mutations still express mutant p53 in the cytoplasm, and mutant p53 is suggested to confer gain-of-function in cancer cells. The characteristics of the cells used in the current study were related to the p53 null phenotype, but it will be interesting to determine if cancer cells carrying mutant p53 have a null+gain-of-function phenotype, or if gain-of-function alters the null phenotype, in order to further understand the role of p53 in tumorigenesis. Such a study will require a large amount of work, but is probably feasible.

      In addition to our responses, we would like to take this opportunity to discuss the cell biological meaning of “generality”. For example, if a response is detected in cell types A, B, and C by e.g. enzymatic assay, quantitation of protein expression levels, and staining of cells, it is often concluded that the response is commonly induced in those cells (generalized). However, as noted by this reviewer, the levels of responses may vary among cells, and commonly induced responses may thus only occur in a specific group of cells in the A, B, and C cell populations. In this case, such responses may not be generally induced in cell types A, B, and C, but only in certain subpopulations of these cell populations. In the current study, cell death etc. were induced in the A549 cell population following p53 silencing, but not in the majority of A549 cells, indicating that this might not be “general” for A549 cells, according to the definition of “generality” used for classical experimental approaches. We have thus been considering the meaning of the term “general”. Each cell in a cell population may have a different status, and without knowing the context affecting the status of each cell, it is not possible to establish “generality”. Information regarding the context of each cell in various types of cell populations is currently lacking, and we do not know how many contexts exist. In the current study, we described one context related to A549 cells, but there will be many other contexts, which may be similar to or distinct from A549 cells. We therefore consider that we are still at the stage of revealing such contexts, e.g. contexts for cancer cells carrying p53 mutation and for metastatic cells, and some commonality may begin to emerge after more contexts have been revealed. However, revealing these contexts will require extensive work, and we hope that other groups will also show an interest in this type of study.

      We have addressed some of these points in the revised Discussion.

      “The tools described, including the DIC tracking software and the simulation algorithms would be useful additions to the biologist's toolkit. The direct visualization of siRNA transfection agents through DIC, and its integration with western blotting is novel, and the authors may consider preparing a protocol or methods paper that describes this in more detail, as it may be useful for trouble-shooting when encountering difficulties with siRNA transfections. ”

      We appreciate the encouraging comments and would be happy to publish a protocol.

      “The use of white-light imaging is refreshing, as many of us in the field default to fluorescence imaging, which has the potential to interfere with cell proliferation. Overall, the approach is innovative by extracting the most information possible from optical imaging data sets, in the less invasive way possible.”

      We have been working on live-cell imaging since 2000 and had difficulty maintaining cell viability using fluorescent imaging. We therefore tried various light sources and found that near-infrared light (not white light) was less toxic to the cells, allowing us to maintain cell cultures for at least a month on a microscope stage. We mentioned that near-infrared light was used in the current study (“System to investigate the functional implications of maintaining low levels of p53 in unstressed cells” in the Results).

    1. g. 8) . The power of the photographs Spiegelman includes in Maus lies not in their evocation of memory, in the connection they can establish between present and past, but in their status as fragmen

      Indeed, the power of photographs lies in the fragments of history that we cannot take in. On December 16, 2014, the Taliban stormed a children's school in Peshawar, where more than 100 children were killed. The photographs of the bloodbath and massacre at the school still evoke the most horrible memory our city, Peshawar, has ever witnessed. It is that fragment of history we cannot take in. In contrast, photographs of those young children dressed in their uniforms, kept in memory of those who died in the attack, evoke a different kind of meaning. A photograph freezes the moment between life and death. In that very moment, when a child was posing in his school uniform, he was very much alive, unaware of what would happen to him. Seeing their parents today holding these photographs while protesting on the roads for justice hurts even more. After reading this, I think there is a need for similar work which emphasizes that those killed in wars were human too. For instance, the pictures of people could be placed in a seminar project where visitors could come and see who died in drone strikes or military operations. It may evoke anti-war sentiment by showing that those killed were not just numbers or stats but human beings.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      The authors present further investigation of the Sox transcription factors in the model Cnidarian Hydractinia. They showcase Hydractinia as now a relatively technically advanced model system to study animal stem cells, regeneration and the control of differentiation in animal cells. In this study they characterise the neural cells in Hydractinia using FACS and single-cell transcriptome sequencing, investigate the sequential expression of SoxB genes in the i-cells and presumptive lineage giving rise to i-cells and investigate the neuronal regeneration making good use of transgenic tools. Finally, they investigate the role of SoxB genes in embryonic neurogenesis.

      There are no major or minor issues affecting the conclusions.

      Reviewer #1 (Significance):

      This study helps to confirm the role of an important group of transcription factors is conserved across the metazoan as well as showcasing an exciting model organism for regeneration and stem cell biology. This will be of interest to a broad audience of developmental and biologists.

      My own research is in the same field, using a different model system

      Referees cross-commenting

      I agree with the comments from the other reviewers, and am sure the authors can address these adequately with further explanation.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary

      Chrysostomou et al. investigate the role of three putative SoxB genes in embryonic neurogenesis in the colonial hydrozoan Hydractinia. They show that SoxB1 is co-expressed with Piwi in the multipotent i-cells and, using transgenics, they show that these Piwi/SoxB1 cells become neurons and gametes, consistent with the cell types that differentiate from i-cells. They further suggest that SoxB2 and SoxB3 are expressed downstream of SoxB1 in the progeny of the i-cells and, using shRNAs, investigate the role of SoxB genes on embryonic neurogenesis. The primary conclusions center on the similarity between neural differentiation in humans and Hydractinia as both systems pattern neurons using sequential expression of SoxB genes during the differentiation of neurons. The manuscript presents a large and diverse set of data derived from analysis of transgenic animals, single-cell sequencing, and investigation of gene function; despite this, the conclusions are either not particularly novel or not well-supported. The co-expression of SoxB1 in Piwi-expressing i-cells appears to be both novel and significant but the implications are not clearly indicated. Additional specific concerns are detailed below.

      Major comments

      1. SoxB genes act sequentially

      Knockdown of SoxB2 has already been shown to result in the loss of SoxB3, so the sequential action of SoxB genes in this animal does not seem to be a terribly novel conclusion.

      Sequential expression of Soxb1-Soxb2 has not been demonstrated previously. Flici et al. did show some data on Soxb1 expression but these were not detailed. Furthermore, they have not shown in vivo transition to Soxb2. Our new single-molecule fluorescence in situ hybridization and the transgenic reporter animals have been developed to address these issues.

      While this manuscript does appear to report the most comprehensive analysis of SoxB1 expression, the evidence for sequential activation of SoxB1 and then SoxB2 in the same lineage (Figure 4) is a bit troubling. Panel A of this figure appears to show complete overlap between SoxB1 and SoxB2, suggesting all the cells in this field are synchronously passing through the transition point from SoxB1 to SoxB2 expression. While this may reflect reality, it would be more convincing to see adjacent cells expressing SoxB1 only or SoxB2 only, reflecting the dynamic progression of cell type specification along the main body axis.

      As shown in Figure 1, Soxb1 is expressed by i-cells (together with Piwi1) in the lower body column of feeding polyps and in germ cells in sexual polyps. These cells do not express Soxb2. Figure 2 shows that Soxb2 is expressed more orally in a population of putative i-cell progeny as they migrate towards the head. These cells still express Soxb1. In the upper part of the body column, just under the tentacle line, there are Soxb2+ cells that do not express Soxb1. Therefore, cells expressing Soxb1 but not Soxb2 are present in the basal part of the polyp, Soxb1+/Soxb2+ double positive cells in the mid body region (i.e., the interface between the two domains where Soxb1+ cells start to express Soxb2 and downregulate Soxb1), and cells expressing Soxb2 but not Soxb1 in the upper part of the polyp, just under the tentacle line. In Figure 4, we show the interface between these two domains using in vivo imaging of double transgenic reporter animals to visualize the Soxb1 to Soxb2 transition. Indeed, in the mid body area, most Soxb1+ cells also express Soxb2 (Figure 2). Hence, Figure 4 should be seen keeping Figure 2’s data in mind. At the mRNA level, the overlap between the Soxb1 and Soxb2 domains is smaller (Figure 2) than the one shown in Figure 4 because the latter constitutes a lineage tracing, showing fluorescent proteins with a long half-life. Therefore, when i-cells downregulate Soxb1 while starting to express Soxb2, the long half-life of tdTomato results in red fluorescence persisting longer than the mRNA encoding it. We have added cartoons to Figure 4 to indicate the positions along the main body axis that are depicted.

      Panel B is more concerning; while the authors have highlighted a cell that does appear to transition from SoxB1+ to SoxB1+/SoxB2+, there are several cells in the background that appear to gain SoxB2 expression without first expressing SoxB1. Do these cells constitute a fundamentally different, SoxB1-independent, lineage of SoxB2+ cells? This would be noteworthy but is not mentioned or characterized.

      The panels included in Figure 4 constitute selected confocal slices of stacks acquired in vivo. During imaging, cells move in three dimensions, making them appear and disappear in given optical planes over time. In other words, the individual time frames shown (T0-T5) were not always found in the same plane due to cell migration in the Z dimension. The cells that appear to gain Soxb2 expression without having expressed Soxb1 first are an example of such cells. They are probably Soxb2+ cells that had already downregulated Soxb1 and migrated into the respective imaging plane. We have added the explanation to Figure 4's legend.

      Figure 7 shows the effect of SoxB1 knockdown (by shRNA) on the number of Piwi-expressing cells, nematocytes, etc but why not show that SoxB2 and SoxB3 are also knocked down in these experiments? Figure S11 shows no effect of SoxB2 and SoxB3 knockdown on SoxB1 expression but why wasn't the reciprocal experiment performed? If SoxB2 and SoxB3 are really downstream of SoxB1, the authors should demonstrate that with the shRNA experiments.

      Our data show that Soxb1 is expressed in i-cells and its KD reduces the number of these stem cells (assessed by expression of Piwi1, an i-cell marker). Because i-cells give rise to all Hydractinia somatic lineages (and to germ cells), focusing specifically on Soxb2+ cells would provide no further insight because all cell types are expected to be affected. Indeed, injection of shRNA targeting Soxb1 resulted in smaller animals with multiple defects, including but not limited to the neural lineage.

      1. Knockdown of SoxB genes resulted in complex defects in embryonic neurogenesis

      The manuscript aims to detail the roles of SoxB1, SoxB2, and SoxB3 in embryogenesis but only one of the main figures even shows pre-polyp life stages (Figure 7) and the results presented in this figure are confusing. The authors suggest that knockdown of SoxB3 had no effect on embryonic neurogenesis but another interpretation of these data is that the SoxB3 shRNA simply did not work. The authors should provide additional support to show that this reagent is working as expected.

      This information is included in Figure S11. Using mRNA in situ hybridization, we show that injection of shRNA targeting Soxb3 causes transcriptional downregulation of Soxb3 but not of Soxb2. The figure also shows the specificities of the shRNAs targeting Soxb1 and Soxb2.

      Further, the results for SoxB1 and SoxB2 knockdown do not support the previous investigation of the role of SoxB2 in neurogenesis (Flici et al 2017). If SoxB1 is upstream of SoxB2, how does knockdown of SoxB1 have such a dramatic effect on RFamide neurons and nematocytes but knockdown of SoxB2 has an effect only on RFamide neurons? Is it possible the SoxB2 shRNA also wasn't working as expected? Can the results of the Flici et al 2017 paper showing SoxB2 knockdown in polyps be recapitulated using these shRNAs? If the point is to argue that embryos and adults (polyps) use fundamentally different mechanisms to drive neurogenesis, then the results presented in Figures 1-6 (which investigate SoxB genes in polyps) can't really be used to make inferences about embryonic neurogenesis. I think the authors have more work to do to demonstrate that embryonic and adult neurogenesis fundamentally differ.

      The Soxb2 shRNA specificity is shown in Figure S11 (i.e., it knocks down Soxb2 but not Soxb1). We were equally surprised to discover that Soxb2 KD resulted in somewhat different phenotypes than the ones obtained by Flici et al. (2017) in polyps. At this stage, we cannot explain the difference. However, one could speculate that it resulted from slightly different regulation logic between embryonic and adult neurogenesis. More specifically, we propose different priorities for generating neural subtypes as an explanation. Unfortunately, shRNAs work only with embryos, and long dsRNA mediated KD works only with polyps. CRISPR/Cas9-mediated KO is feasible in Hydractinia, but knocking out developmental genes, such as these Sox genes, would likely cause embryonic lethality. Other conditional KO/KD approaches are not available for Hydractinia. We believe we have made all possible efforts to clarify the roles of these genes using currently available techniques. Neurogenesis is a complex process that is only partially conserved among different animals and poorly studied in non-bilaterians. Furthermore, it is not possible to answer all questions in one study. Like many studies before it, our work contributes to the understanding of neurogenesis but also raises new questions. Addressing them is a matter for future research. We have toned down the statement in the last sentence of the results and in the discussion and do not claim that embryonic and adult neurogenesis are fundamentally different.

      Minor comments

      Methods: A large bit of data from this manuscript relies on quantitative analysis of cell number but there's not enough information in the methods to understand how quantification was performed. How many slices from the z-stack were analyzed? Were counts made relative to the total tissue area in the X/Y dimension or relative to the number of total nuclei in the same section? How many individuals were examined for each analysis?

      All cell counting analysis was performed using ImageJ/Fiji software. Counts were made relative to the total tissue area in the X/Y dimension (for the shRNA experiments). A Z-stack covering the whole depth of each larva was obtained. Counting was performed on cells positive for the respective cell type marker based on antibody staining and numbers were compared between shControl and shSoxb1/2/3 animals. At least 4 animals were counted per condition.

      Page 11 - "Piwi2low cells, which are presumably i-cell progeny" - how were "high" and "low" quantified?

      “High” and “low” were not quantified. This is because i-cells progressively downregulate Piwi genes (i.e., Piwi1 and Piwi2) as they differentiate, but this is a continuous process. Hence, it is difficult to set a threshold of Piwi1/Piwi2 protein level below which a cell ceases to be an i-cell and becomes a committed progeny. A similar process is well documented in other animals, where stemness markers are gradually downregulated during differentiation.

      Page 13 - "a role in maintaining stemness" - this comment is not totally clear to me. Why would the number of EdU+ cells increase if the role of SoxB1 is to maintain stemness? Wouldn't SoxB1 knockdown then force stem cells to exit their program, resulting in early differentiation of i-cell progeny? This should be clarified.

      KD of Soxb1 resulted in a decrease in the number of i-cells (i.e., Piwi1+ ones), suggesting that the gene is required for stemness maintenance. The increase in the numbers of cells in S-phase in this context was not related to i-cells because most of them were Piwi1-negative (Figure 7B). The identity of the cells in S-phase remains unknown, but a plausible explanation is that i-cell progeny (e.g., nematoblasts; see also next comment) increase their proliferative activity when i-cell numbers are low as a compensatory mechanism. This is merely a speculation. We have rephrased the paragraph to increase clarity.

      Page 13 - "if progenitors are limiting" - if progenitors are limited why would there be an increase in nematocytes?

      We do not have a definitive answer to this question but speculate that nematoblasts (i.e., stinging cell progenitors) account, at least in part, for the excessive proliferation seen under Soxb1 KD. This may constitute a mechanism allowing a depleted i-cell population to recover by self-renewal (instead of differentiation), moving temporarily the proliferation task to committed progeny (e.g., nematoblasts) until i-cell numbers return to normal. However, in the absence of evidence we refrain from expanding on this in the text.

      Figures 1 and 2 claim to show "partial overlap" but they look perfectly overlapping to me. This makes the situation in Figure 4B difficult to interpret.

      Figure 1 shows full overlap between Piwi1 and Soxb1 expression and this is reflected in the text. Figure 2 shows no overlap between Soxb1 and Soxb2 in the lower body column (where only Soxb1 is expressed), overlap in the mid body region, and Soxb2 only expressing cells in the upper part of the body, just under the tentacle line. Similarly, the figure shows overlap between Soxb2/Soxb3 under the tentacle line, and predominantly Soxb3 above it in the head region. The small cartoons at the left side of each panel indicate its position along the oral-aboral axis. See also our reply to the second part of comment #1.

      Figure 4 - No indication of which part of the animal or which stage is shown in these images.

      We have added cartoons to indicate the area in the polyp from where the images were taken.

      Figure 5 - No indication of where these dissociated cells came from - polyps? Larvae?

      All tissue samples were taken from feeding polyps; this is now mentioned in the Materials and Methods section.

      Panel D is a bit perplexing - what are the "progeny" of Piwi+ cells if not SoxB2+ cells and their derivatives?

      In Panel D, we show three cell fractions. One constitutes i-cells, based on high Piwi1 expression (green fluorescence of the Piwi1::GFP reporter transgene) and morphology; one fraction includes nematocytes, based on the characteristic nematocyst capsule, and one constitutes a mixture of other i-cell progeny. The latter includes different cell types, given that i-cells are thought to contribute to all lineages. They have only dim GFP fluorescence because the Piwi1 promoter-driven GFP shuts down upon i-cell differentiation. Soxb2+ cells are also among them but are not the only i-cell progeny.

      Why are nematocytes but not neurons indicated?

      Neurons are shown on Panels E & F. See also next comment.

      Piwi seems to be maintained in Ncol-expressing cells but not in SoxB2- or RFamide-expressing cells? Does this suggest that Piwi is turned on in i-cells, off in SoxB2-expressing cells, and on again in terminally differentiating nematocytes? This would be quite surprising and should be verified with antibody labeling/imaging in Piwi transgenics to confirm the result. The resolution for Panel M is too low to evaluate this part of the figure.

      The Piwi1 gene is downregulated upon i-cell differentiation. In the Piwi1::GFP reporter animal, residual GFP fluorescence persists post differentiation due to GFP's long half-life, and its brightness depends on the time elapsed since differentiation. Because nematocytes are short living cells with high turnover, most nematocytes have recently differentiated and are therefore relatively bright green in the Piwi1::GFP animal. Neuron turnover is lower, making most neurons in the same transgenic animal appear dim. The resolution of the imaging flow cytometer is limited because the machine images 1000s of cells per second through all optical channels. However, it is high enough to allow the identification of features such as cell shape, some organelles (e.g., nematocytes), nuclear size and shape, and fluorescence intensity.

      Figure 7 - the low magnification images provide nice overall context but the authors should also provide high magnification panels for the same images. Without them it is not possible to assess "defects in ciliation" or to determine if there are defects in GLWamide neurons from these knockdowns (e.g., neurite vs cell body defects). There's no mention of the fact that SoxB1 knockdown resulted in complete loss of RFamide cells, which is strange. Are there SoxB2-independent populations of RFamide? Panel B could be interpreted multiple ways - downregulation of Piwi in SoxB1 shRNA or upregulation in SoxB2/B3. The authors should provide an image of control shRNA-injected larvae with the same co-labeling of Piwi/EdU for context. From the images, it's not clear that there were differential effects of SoxB2 and SoxB3 on nematocytes.

      The resolution of the images is, in fact, high, allowing them to be blown up on the screen. Even higher magnification of ciliation can be seen in Figure S12. KD of Soxb1 resulted in complete or nearly complete loss of RFamide+ neurons. We have added this statement to the text as requested. Panel B shows the relative difference in Piwi1+ and S-phase cells between shSoxb1, shSoxb2, and shSoxb3-treated animals. The quantification relative to the control is presented in Figure 7C.

      Figures 6 and S9 - why piwi2 and not piwi1?

      In Figure 6, we co-stained the regenerates with two antibodies: one was a rabbit anti-GFP (to visualize the RFamide+ neurons), and the other was a guinea pig anti-Piwi2 (to visualize i-cells). The anti-Piwi1 antibody that was used in other images to visualize i-cells was raised in rabbit and could not be used in conjunction with the anti-GFP one.

      Figure S1 - Kayal et al 2018 is the most recent phylogeny of cnidarians and should probably be cited in place of Zapata throughout the manuscript. Independent of this, the polytomy in Figure S1 panel A is not supported by either Zapata or Kayal and should be fixed.

      We have cited Kayal et al. 2018 and revised the tree in Figure S1 as pointed out.

      Figure S3 - is this mRNA? Protein? Panels E-G are too small to interpret. Please provide stage/time for cartoons in panel H.

      As per the legend, Panels A, B, D, E, F refer to protein; C is lectin staining (DSA), and G is EdU. The resolution of Panels E-G is actually high, allowing blowing up of the images on the screen to view the details. The stages of the cartoon in Panel H are now provided in the figure legend.

      Figure S11 - please provide images of whole larvae as shown for Piwi knockdown in Fig S9 and some additional support (e.g., qPCR) to demonstrate the shRNAs are actually working.

      Figure S9 represents immunostaining using the anti-Piwi1 antibody. In Figure S11, we show the specificity of the shRNA treatments; we used highly sensitive single-molecule mRNA in situ hybridization. Whole animal imaging is not informative due to the punctate nature of the single-molecule staining.

      Figure S12 - it's not clear what ciliary "defects" are being shown.

      In the control, cilia are uniformly distributed along the oral-aboral axis whereas in the shSoxb1-injected animals, the pattern is patchy. Additionally, shSoxb1-injected larvae could not swim (planulae swim by coordinated ciliary beating).

      Reviewer #2 (Significance):

      Generally, the results are either equivocal or the conclusions are not well supported by the results (as detailed above). The significance of this work to vertebrate neurobiology is somewhat weak. (Especially considering the orthology of these genes to bilaterian SoxB genes is not well supported.) Why not compare these results to other cnidarians - the expression patterns of SoxB1 and SoxB2 in corals and sea anemones seem to differ quite a lot (Shinzato et al 2008; Magie et al 2005), suggesting these genes are almost certainly not behaving in the same way across cnidarians. This is exciting! What's happening in Hydra? Seems like it should be possible to mine the single-cell data set from Siebert et al to test these hypothesized relationships between the Sox genes in another hydrozoan which constantly makes new neurons.

      We have modified the concluding section in the discussion, in line with this comment. See also comment to Reviewer #3.

      Reviewer #3 (Evidence, reproducibility and clarity):

      This paper characterizes the role of Soxb genes in neurogenesis in Hydractinia. The authors use cutting edge approaches including FISH, transgenics, image flow cytometry, FACS and shRNA knock downs to characterize SoxB in Hydractinia. The images are beautiful, the data is sound and the interpretation of the data is appropriate.

      I have only minor suggestions, listed by section below:

      Abstract

      • The abstract and introduction should make clear that this is a colonial animal and the cell migration occurs from the aboral to the oral end of the polyp (not the animal, as there are many oral ends). This is relevant to the interpretation of the data as the polyps do not act in isolation: they are interconnected and may communicate via the stolonal network that connects the polyps in the colony.

      We have added a section to the Introduction to address the reviewer's comment. The Abstract, however, is too short to include this explanation.

      • The human disease justification is a relatively weak one and does not need to be included. Using Hydractinia to understand the role of SoxB in the evolution of neurogenesis in animals is enough justification for the study.

      We have adopted the reviewer's comment and modified the statement in the discussion (see also comment to Reviewer #2).

      Introduction

      • Instead of "Sox phylogenies" (the term phylogeny is more appropriate for species trees), consider substituting "Sox gene trees". And instead of "phylogenetic relation", use the term "orthology".

      This has been done.

      • The number of sentences with the sentiment "....remain unknown." "....little is known.." "...unclear..." , "....difficult to establish...." etc. is distracting and detracts from what IS known about these genes. It is not necessary to continually justify the study throughout the introduction. Instead, a clearer description of the background and setting up the question/hypothesis of SoxB paralog subfunctionalization in space and time would be more informative to the reader.

      We have reduced the number of occasions as recommended.

      • The authors state that there are three SoxB genes in the Hydractinia genome. What genome? For several years, multiple papers published by subsets of these authors have used unpublished genome data, but the complete genome has yet to be released to the public. This is especially egregious because they cite their NSF funded EDGE proposal to CEF and UF, which is supposed to develop tools for the community, and yet the community at large doesn't have access to the genome. If these data came from the genome, then the genome should be released. If these data came from a previously published transcriptome as in the previous SoxB paper then this should be stated explicitly.

      The Hydractinia genome assembly, annotation, RNA-seq data, and genome browser are now available in the Hydractinia genome project portal at the National Human Genome Research Institute (NIH) website (https://research.nhgri.nih.gov/hydractinia/). The raw data have been deposited in the NCBI Sequence Read Archive (SRA) under BioProject PRJNA807936. This information has been added to the 'Resource availability' section.

      Results

      • I assume there was no expression of Soxb2 and Soxb3 in the reproductive polyps? This should be stated explicitly.

      Soxb2 expression in sexual polyps was consistent with the nervous system and with maternal deposition in oocytes. It was not detected in male germ cells. We have added a new in situ hybridization image of Soxb2 to Figure 12.

      • The word "progeny" is used throughout to describe terminally differentiated cells. However, progeny implies offspring, but these are actually later stages of differentiation in a cell's ontogeny, thus the term should be changed to "differentiated cells".

      We used "progeny" to indicate that the corresponding cells derived from a specific progenitor cell type. We did try replacing it with "differentiated cells" but this completely changes the meaning of the sentence: first, it does not include the cell of origin info and second, not all progeny are already fully differentiated.

      • Typo on page 11 "This predictable generation of many new neurons provides an opportunity to study neurogenesis in [a ]regeneration." - Remove the "a"

      Corrected.

      • While the regeneration study is interesting, there is nothing revealed about the role of Soxb and there is not a lot of new information revealed about regenerations. Authors should better justify this section or consider omitting.

      These sections demonstrate de novo neurogenesis in head regeneration. This was not known in this animal before.

      Discussion

      • The authors assume that in the transgenic lineage, the fluorescent marker in differentiated cells is due to retention of fluorescence, but it is unclear if they can rule out that Soxb2 is still being expressed in those cells. Please clarify.

      We conclude this by comparing the mRNA expression (Figures 1 & 2) with the fluorescent proteins (Figure 3).

      • How did the authors determine that the shSoxb3 knockdown worked? Please discuss relevant controls and validation (either in discussion or methods). This is particularly important given that it didn't have an apparent phenotypic effect.

      The efficacy of all shRNAs was determined by in situ hybridization, showing that each shRNA downregulates its own target mRNA but not the others (Figure S11).

      • Again, the connection to human health is a bit of a stretch. Instead, what is most interesting is the similarity of Soxb paralogs acting sequentially as has been found in vertebrates. This suggests a highly conserved mechanism of subfunctionalization following gene duplication at the base of animals.

      We agree. This is now also better highlighted in the discussion.

      Figures

      • It's very hard to distinguish the overall abundance of Soxb2 and Soxb3 expression along the polyp body axis from the panels in Figure 2. A lower magnification or larger area in each region would be helpful.

      In Figure 2, we performed single-molecule in situ hybridization. While highly sensitive, this method generates spotty images because it highlights single molecules and is not coupled to an enzymatic reaction as in other methods. Such images mostly look poor at low magnification. Because a previous study (Flici et al. 2017) has already shown the general expression pattern, we aimed at providing the details of the transition.

      • Figure 4 - either the figure is upside down or the text is upside down. It is also difficult to see the double staining (if any).

      The figure is oriented to position the oral end up. The resolution of the panels is high, enabling blowing-up on the screen. The quality of in vivo time lapse images cannot match that of fixed and antibody stained ones, or of single in vivo images. This is because the animals are imaged for many hours during which they tend to bleach.

      • Figure 5M is difficult to read due to the small print. Consider enlarging and moving it to Supplementary Material

      The size of the text is small but the resolution is very high, enabling blowing up the image on the screen. We thought that the information was important enough to be presented in the main text and, given that most readers would use the electronic version, we preferred this option over adding another supplemental figure on top of the 12 we already have.

      Reviewer #3 (Significance):

      This is an interesting and important study because although it is well known that SoxB genes function in neurogenesis in animals, it is unclear how and if subfunctionalization occurs outside of vertebrates. Hydractinia is an excellent model to study SoxB genes because of its colonial organization and continuous development of nerve cells throughout the life of the animal. In addition, it is part of the early diverging cnidarian lineage and thus can provide insight into the relative conservation of SoxB genes across animals.

  7. learn-us-east-1-prod-fleet02-xythos.content.blackboardcdn.com
    1. It is an insurrection. It may be that in this presentation of a dreadful event we will sometimes speak of rioting, but merely to describe what was happening on the surface and always maintaining the distinction between form and essence, riot and insurrection. In the sudden outbreak and grim suppression of this 1832 uprising there was so much grandeur that even those who see it as mere riot cannot speak of it without respect.

      Hugo places a certain scrutiny onto riots, and here argues that, even those who think that the June 1832 insurrection was a riot, "cannot speak of it without respect." I understand Hugo's distinction between riots and insurrections, but is a riot against injustice not a noble act to him? Why must an uprising be an insurrection in order to gain our respect? Yes, riots are violent and seemingly random, but they also serve a purpose in society and are one viable option for the oppressed.

    1. I saw many technologies used in unequal ways

      I haven't read on yet, but I wonder if there are any biases that we are unaware of that contribute to this mistreatment. We did a study similar to this, on race and age, using the tool "Implicit Association Test", which is said to unveil hidden biases that we may have. I have added the link if you want to try this out for yourself. Personally, I believe that there are many things that are unconscious to us, and we try to avoid negative biases, but sometimes they can be a part of our nature based on the way we were raised and the environment we are exposed to.

      https://implicit.harvard.edu/implicit/selectatest.html

      It is really sad to hear the data on this as we think of teachers to be loving caretakers.

    1. Author Response:

      Reviewer #1 (Public Review):

      This article focuses on a quantitative description of airineme morphology and its consequences for contact and communication between cells via these long narrow projections. The primary conclusions are

      1) Airineme shapes are consistent with a persistent random walk model (analogous to a wormlike polymer chain), unhindered by the presence of other cells.

      The authors convincingly demonstrate, using analysis of the mean-squared-displacement along the airineme contour, that the structures cannot be described by a diffusive growth process (ie: a Gaussian chain) as would be expected if there were no directional correlations between consecutive steps. Furthermore, by observing the airineme growth and looking at the distribution of step-sizes, they show that these steps do not exhibit the expected long-tail distributions that would imply a Levy-walk behavior. The persistent random walk (PRW) is presented as an alternative that is not inconsistent with the data. However, given the high level of noise due to low sampling, the claimed scaling behavior of the MSD at long lengths is not fully convincing. Nevertheless, the PRW provides a plausible potential description of the airineme shapes.

      To reiterate the comment: the MSD analysis allows us to reject the simple random walk model, and it is consistent but alone is not strongly supportive of the PRW model, especially at high time of around 15 minutes (long lengths of around 65 microns). As the Reviewer points out, this is due to low numbers of long airinemes.
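
      The distinction drawn here between a simple random walk and a PRW can be made concrete with the standard closed-form MSD of a two-dimensional persistent random walk with constant speed and angular diffusion. The sketch below is illustrative only: the function names and parameter values are hypothetical placeholders, not the authors' code or fitted values. It shows that the PRW exponent moves from about 2 (ballistic) at short times to about 1 (diffusive) at long times, which is why sparse long-length data make the long-time scaling hard to confirm from the MSD alone.

```python
import math

def prw_msd(t, speed, d_theta):
    """Analytic MSD of a 2D persistent random walk.

    Assumes constant speed and a heading that diffuses with angular
    diffusion coefficient d_theta:
        MSD(t) = (2 v^2 / D) * (t - (1 - exp(-D t)) / D)
    """
    if d_theta == 0.0:
        return (speed * t) ** 2  # perfectly straight path
    return (2.0 * speed ** 2 / d_theta) * (
        t - (1.0 - math.exp(-d_theta * t)) / d_theta
    )

def simple_rw_msd(t, diff_coeff):
    """MSD of an uncorrelated (diffusive) 2D random walk: 4 D t."""
    return 4.0 * diff_coeff * t

def local_exponent(msd, t, rel=1e-3):
    """Log-log slope of an MSD curve at time t (centred finite difference)."""
    t_lo, t_hi = t * (1.0 - rel), t * (1.0 + rel)
    return (math.log(msd(t_hi)) - math.log(msd(t_lo))) / (
        math.log(t_hi) - math.log(t_lo)
    )

if __name__ == "__main__":
    # Placeholder parameters in arbitrary units (not values from the study).
    speed, d_theta = 1.0, 0.1
    for t in (0.01, 1.0, 10.0, 100.0, 1000.0):
        slope = local_exponent(lambda s: prw_msd(s, speed, d_theta), t)
        print(f"t = {t:8.2f}  PRW MSD exponent = {slope:.3f}")
```

      A simple random walk would give an exponent of 1 at every time, so the short-time regime is where the two models separate most clearly.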

      This prompted us to investigate the long-length data using multiple analysis approaches. In the new manuscript, new Fig 2B, we took all airinemes whose growth time was greater than 15 min, and plotted their final angle, i.e., the angle between the tangent vector at their point of emergence from the source cell and the tangent vector at their tip. At long times (>1/D_theta), the PRW model predicts that the angular distribution should become isotropic.

      In new Fig 2B, we find that the angular distribution is uniform, i.e., isotropic, using a Kolmogorov-Smirnov test (p-value 0.37, N=26).

      Since there are relatively few data points, we repeated this analysis under various airineme selection criteria, and in all cases found the final angular distribution to be consistent with uniformity (new Supplemental Data Figure 1). For example, if we set the threshold at 10min, which includes N=49 airinemes, the Kolmogorov-Smirnov test against a uniform angular distribution gives a p-value of 0.32.

      We here add a few additional notes:

      ● Note that there is significantly less data used in this test than in the MSD analysis or the autocorrelation function maximum likelihood analysis. In order to perform a hypothesis test, we wanted to be sure that the data points are independent, so we take only one from each airineme (unlike MSD and autocorrelation analyses, for which we take every interval of a particular length, whether in the same airineme or not).

      ● Finally, although the >10min KS test has more data than the >15min KS test (N=49 compared to N=26), we have chosen to present the >15min KS test in the Main Text. As we mentioned above, the conclusion is unchanged for >10min (see Supporting Data). The reason is that >15min is the first test we ran to check angular distribution against a uniform (-pi,pi) distribution, and we did not want to bias our testing.
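
      For readers who want to see the shape of such a test, the following is a minimal sketch of a one-sample Kolmogorov-Smirnov test of final angles against a uniform distribution on (-pi, pi), with one angle per airineme as described above. It is an assumption-laden illustration, not the authors' analysis: the function names are hypothetical, the two samples are synthetic, and the p-value uses the asymptotic Kolmogorov series with a common small-sample correction rather than whatever software the authors used.

```python
import math

def ks_uniform_statistic(angles, lo=-math.pi, hi=math.pi):
    """One-sample KS statistic D of the angles against Uniform(lo, hi)."""
    xs = sorted(angles)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = (x - lo) / (hi - lo)  # uniform CDF at x
        # Compare the CDF with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

def ks_pvalue(d, n, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution.

    Uses the small-sample correction lam = (sqrt(n) + 0.12 + 0.11/sqrt(n)) * d.
    """
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.05:
        return 1.0  # the series is not needed here; the p-value is ~1
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))

if __name__ == "__main__":
    n = 26  # same sample size as the >15 min test; the angles are synthetic
    evenly_spread = [-math.pi + (i + 0.5) * 2.0 * math.pi / n for i in range(n)]
    clustered = [-0.1 + 0.2 * i / (n - 1) for i in range(n)]
    for name, sample in (("evenly spread", evenly_spread), ("clustered", clustered)):
        d = ks_uniform_statistic(sample)
        print(f"{name}: D = {d:.3f}, p = {ks_pvalue(d, n):.3g}")
```

      A large p-value means uniformity cannot be rejected, which is the sense in which the reported tests are "consistent with" an isotropic angular distribution.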

      Taken together, the data are even more strongly supportive of the PRW model. We are grateful to the Reviewer for encouraging us to further explore the high-time data.

      2) The flexibility (ie: persistence length) of the airineme shapes is one that maximizes the probability of a given airineme (of fixed length) contacting the target cell.

      This optimum arises due to the balance between straight-line paths that reach far from the source but cover a narrow region of space and diffusive paths that compactly explore space but do not reach far from the starting point. Such optimization has previously been noted in unrelated contexts both for search processes of moving particles and for semiflexible chains that need to contact a target. The authors present a compelling case (Fig 4B) that the measured angular diffusion of the airinemes falls close to the predicted optimum. Furthermore, the measured probability of hitting the target cell also lies close to the model prediction, providing a strong test of the applicability of their model.
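
      The trade-off described above can be explored with a small Monte Carlo sketch. Everything in it is a simplifying assumption made for illustration rather than the authors' model: the source is a point at the origin, the target is a disk of radius `r_targ` centred at distance `d_targ`, the initial direction is uniform, the heading diffuses per unit arc length with coefficient `d_theta`, and each path has fixed contour length `l_max`. All names and parameter values are hypothetical.

```python
import math
import random

def contact_probability(d_theta, d_targ, r_targ, l_max,
                        n_walks=2000, ds=0.05, seed=0):
    """Fraction of simulated persistent random walks that touch the target.

    d_theta is the angular diffusion per unit arc length (an assumption of
    this sketch); ds is the arc-length step of the discretised path.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * d_theta * ds)  # std of the heading change per step
    n_steps = int(round(l_max / ds))
    hits = 0
    for _ in range(n_walks):
        x = y = 0.0
        theta = rng.uniform(-math.pi, math.pi)
        for _ in range(n_steps):
            x += ds * math.cos(theta)
            y += ds * math.sin(theta)
            if math.hypot(x - d_targ, y) <= r_targ:
                hits += 1
                break
            theta += rng.gauss(0.0, sigma)
    return hits / n_walks

if __name__ == "__main__":
    # Sweep the flexibility; values are placeholders in arbitrary units.
    for d_theta in (0.0, 0.05, 0.2, 1.0, 5.0):
        p = contact_probability(d_theta, d_targ=5.0, r_targ=1.0,
                                l_max=10.0, n_walks=1000)
        print(f"D_theta = {d_theta:5.2f}  P(contact) ~ {p:.3f}")
```

      With `d_theta = 0` the paths are straight rays, so the hit probability is set purely by the angle the target subtends; very large `d_theta` keeps paths close to the source. Whether and where an interior maximum appears depends on the geometry chosen, so the sweep should be read as a way to probe the argument, not as a reproduction of Fig 4B.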

      3) Airineme flexibility engenders a tradeoff between contact probability and directional information (ie: the extent to which the target cell can determine the position of the source).

      This calculation proposes an alternative utility metric for communication via airinemes. The observed flexibility is shown to be at a Pareto optimum, where changes in either direction would decrease either the probability of contact or the directional information. Again the absolute value of the metric (Fisher information for angular distribution) is within the predicted order of magnitude from the model. Thus, while the importance of maximizing this metric remains speculative, its quantitative value provides an additional test for the applicability of the PRW model.

      Overall, this paper provides an interesting exploration of optimization problems for communication by long thin projections. A particular strength is the quantitative match to experimental data -- indicating not just that the experimental parameters fall along a putative optimum but also that the metrics being optimized are well-predicted by the model. Defining an optimization problem and showing that some parameter sits at the optimum is a common approach to generating insight in biophysical modeling, albeit invariably suffering from the fact that it is difficult to know which optimization criteria actually matter in a particular cellular system. The authors do an excellent job of exploring multiple optimization criteria, quantifying the balance between them, and pointing out inherent limitations in knowing which is most relevant.

      A minor weakness of the manuscript is its focus on a very narrowly defined cellular system, with the general applicability of the results not being highlighted for clarity. For example, the fact that the same flexibility optimizes contact probability and the balance between contact and directional information is an interesting conclusion of the paper. Is this true in general? Is it applicable to other systems involving a semiflexible structure reaching for a target or a moving agent executing a PRW?

      The Reviewer raises an excellent question: is the trade-off between contact and directional information a general property of searchers that obey persistent random walks? To address it, we now extend the analysis previously contained in Figure 5D to a full parameter-space exploration, presented in the new Figure 5 Supplemental Figure 1. In doing so, we found fascinating behavior that sheds some light on the loop in Fig 5D.

      At low d_targ, the trade-off is amplified, and the parametric curve resembles bull's horns with two tips representing the smallest and largest D_theta in our explored range, pointing outward so the shape is concave-up. Intuitively, we understand this as follows: since the target is fairly close (relative to l_max), contact is easy. The only way to get directional specification is by increasing D_theta to be very large, effectively shrinking the search range so it only reaches (with significant probability) the target at the near side (“3-o-clock” in Fig. 5A). At low d_targ, the parametric curve is concave-up, and there is no Pareto optimum.

      At high d_targ, the searcher either barely reaches (when D_theta is high), and does so at 3-o-clock, therefore providing high directional information, or D_theta is low, and the searcher fails to reach, and therefore also fails to provide directional information. So, at high d_targ, there is no trade-off.

      At intermediate d_targ, the curve transitions from concave-up bull's horn to the no-tradeoff line. To our surprise, it does so by bending forward, forming a loop, and closing the loop as the low-D_theta tip moves towards the origin. At these intermediate d_targ values, the loop offers a concave-down region with a Pareto optimum.

      So, to answer the specific question of the Reviewers: No, the Pareto optimum is not a general feature of persistent random walk searchers. It only exists in a particular parameter regime, sandwiched between a regime where there is a strict trade-off with no Pareto optimum and a regime in which there is no trade-off.

      All of these results are now discussed in the main text.

      (Note that although we do not explicitly explore l_max, since these plots have not been nondimensionalized, the parametric curve for a different l_max can be obtained by rescaling the results.)

      Reviewer #2 (Public Review):

      Signalling filopodia are essential in disseminating chemical signals in development and tissue homeostasis. These signalling filopodia can be defined as nanotubes, cytonemes, or the recently discovered airinemes. Airinemes are protrusions established between pigment cells with the help of macrophages. Macrophages take up a small vesicle from one pigment cell and carry it over to the neighbouring pigment cell to induce signalling. However, the vesicle maintains contact with the source cell through a thin protrusion, the airineme. In support of these data, the authors find that the extension progress of the airinemes fits an "unobstructed persistent random walk model" as described for other macrophages or neutrophils.

      The authors describe the characteristics of an airineme as if it were a signalling filopodium, e.g. a nanotube or a cytoneme, which is sent out to target a cell. An airineme, however, is fundamentally different. Here, a macrophage approaches a pigment cell and binds to the airineme vesicle. Then, the macrophage approaches a target pigment cell and hands over the airineme vesicle. During this process, the airineme vesicle maintains a connection to the source pigment cell by a thin protrusion. Then, the macrophage leaves the target cell, but the airineme vesicle, including the protrusion, is stabilized at the surface and activates signalling. Indeed, nearly all airinemes observed have been associated with macrophages (Eom et al., 2017).

      Therefore, it is essential to focus on the "search-and-find" walk of the macrophage and not the passively dragged airineme. In the light of this discussion, I am not sure if statements like "allow the airineme to hit the target cell" are helpful as it would point towards an actively expanding protrusion like a filopodium.

      We have added a new paragraph in the Introduction emphasizing the role of the macrophage, and we have changed the language. In particular, we want to remove agency from the airineme, since it is indeed moving with the macrophage. In the mathematical sections, we opt for the phrase “search process”.

      We have also clarified that, in the biological system, the details of contact are unclear (e.g., what mechanism in the macrophage-airineme-vesicle is responsible for distinguishing the target cell). Therefore, in the model, we have clarified that contact is declared when the airineme tip arrives at a distance r_targ from the center of the target cell, and this critical distance might be larger than the size of the target cell, since it might include part or all of the macrophage.
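      The contact rule described above (contact is declared once the tip comes within r_targ of the target centre) can be illustrated with a minimal Monte Carlo sketch of a two-dimensional persistent random walk. This is not the authors' code. The function name, the default values of d_targ, r_targ, l_max and the step length ds are illustrative placeholders, and the walk is parameterized by arc length, so `d_theta` here plays the role of D_theta divided by the extension speed.

```python
import math
import random


def contact_probability(d_theta, d_targ=50.0, r_targ=15.0, l_max=250.0,
                        ds=1.0, n_trials=2000, seed=0):
    """Estimate the fraction of finite-length PRW searches that make contact.

    All default values are hypothetical placeholders, not fitted parameters.
    d_theta is the angular diffusion per unit arc length (assumption).
    """
    rng = random.Random(seed)
    n_steps = int(l_max / ds)
    # Standard deviation of the heading change accumulated over one step.
    sigma = math.sqrt(2.0 * d_theta * ds)
    hits = 0
    for _ in range(n_trials):
        x, y = 0.0, 0.0
        # Initial orientation is uniformly random in (-pi, pi).
        theta = rng.uniform(-math.pi, math.pi)
        for _ in range(n_steps):
            x += ds * math.cos(theta)
            y += ds * math.sin(theta)
            # Contact: tip within r_targ of the target centre at (d_targ, 0).
            if math.hypot(x - d_targ, y) <= r_targ:
                hits += 1
                break
            theta += rng.gauss(0.0, sigma)
    return hits / n_trials
```

      Sweeping `d_theta` over several orders of magnitude and plotting the returned fraction is one way to look for the kind of intermediate optimum discussed in the reviews. With `d_theta = 0` the paths are straight, and the estimate should approach the geometric value asin(r_targ/d_targ)/pi, which gives a simple sanity check.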

      Reviewer #3 (Public Review):

      This paper studies statistical aspects of the role of long-range cellular protrusions called airinemes as a means of intracellular communication. The mean square distance of an airineme tip is found to follow a persistent random walk with a given velocity and angular diffusion. It is argued that this distribution with these parameters is the one that optimises the probability of contact with the target cell. The authors then evaluate the directional information (where in space did the airineme come from) and find that, again, the measured diffusion coefficient optimises the trade-off between high directional information (small diffusion) and large encounter probability.

      I found this paper well written and clear, and addressing an interesting problem (long-range intracellular communication) using rigorous quantitative tools. This is a very useful approach, which appears to have been appropriately done, that in itself makes this paper worthy of interest.

      1) The main conclusion of this paper is that the airineme properties optimise something that has to do with their function. Although rather appealing, I find this kind of conclusion often questionable considering the large uncertainty surrounding many parameters.

      We agree that conclusions about optimality need to be expressed carefully, to avoid teleological statements and overstating our knowledge about the constraints and variability faced by the living system. In the revised manuscript, we strive to use language to point out that the parameter extracted from data (an average) and the parameter predicted to be optimal (on average) are approximately equal, and avoid speculation about the evolutionary process that may have led to these parameters.

      Here, optimality is shown from a practical perspective, using measured parameters. For instance, the optimal diffusion coefficient for hitting the target varies by 2 orders of magnitude when the distance between cells is varied (Fig.3A). The measured coefficient is optimal for cells about 25 µm distant. Does this reflect anything about the physiological situation in which these airinemes operate?

      To find the physiological regime in which the airinemes operate, we extracted distance-to-target measurements from imaging data, and found an average distance of 51 microns (note possible typo in referee comment), with a range of 33–84 µm (N = 70). We report this in the updated Table 1. The optimum we find is in the average number of attempts before success (so, a single instance of an airineme may either succeed or fail, stochastically), when the distance to the target is 50 microns. These are both averages, across an entire fish epithelium (which contains ~10^5 source cells). So, for a particular cell generating airinemes, there may be different optimal parameters given a priori knowledge of its environment, but, across the whole fish epithelium, we assume the overall success corresponds to the average single-cell success we simulate.

      Another rather puzzling claim is that the diffusion coefficient is optimised both for finding the target, AND for finding the best compromise between finding the target and providing directional information, while the latter must necessarily require weaker diffusion. Hence the last paragraph of p.6 ("the data is consistent with either conclusion that the curvature is optimized for search, or it is optimized to balance search and directional information"), although quite honest, gives the feeling that the conclusions are not very robust. I would welcome a discussion of these points.

      We have clarified the result about directional information in the new manuscript.

      First, it is not optimized for maximal directional information, in the sense that there are other parameters that would give more directional information – we apologize for the lack of clarity. Rather, the parameters observed are such that changing them would either reduce search success or directional information. In the study of multiple optimization, this property is called “Pareto optimality”.

      Second, the Reviewer’s intuition is that weaker diffusion (straighter airinemes) would provide more directional information. This was indeed our intuition as well, prior to this study. To our surprise, we found that very weak diffusion or very strong diffusion both give local maxima of directional information. The intuitive explanation is that the searchers are finite-length, and high diffusion leads to a smaller search extent which only reaches the target cell at its very nearest region. We provide this intuitive explanation (which was indeed a surprise to us) in the Results section.

      Third, the Reviewer asks about the generality of the result about directional information. This is an excellent question. The comment, and similar comments from other Reviewers, prompted us to perform a parameter exploration study. This is contained in a new Supplemental Figure and new paragraphs in the Results section.
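      The notion of Pareto optimality invoked in the first point can be made concrete with a short sketch. Given pairs of (contact probability, directional information), one pair per value of the angular diffusion, a point is Pareto optimal if no other point is at least as good in both objectives and strictly better in one. The function name and the sample numbers below are hypothetical and are not taken from the manuscript.

```python
def pareto_optimal(points):
    """Return indices of points that no other point dominates.

    Each point is a (contact_probability, directional_information) pair,
    and both objectives are treated as quantities to maximise.
    """
    front = []
    for i, (a_i, b_i) in enumerate(points):
        dominated = any(
            a_j >= a_i and b_j >= b_i and (a_j > a_i or b_j > b_i)
            for j, (a_j, b_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Made-up values for illustration only.
example = [(0.1, 0.9), (0.5, 0.5), (0.4, 0.4), (0.9, 0.1)]
front_indices = pareto_optimal(example)
```

      In this made-up example the third point is dominated by the second, so it is excluded. Moving away from any of the remaining points sacrifices one objective for the other, which is the property described above.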

      2) on p.4: "the airineme tips (which are transported by macrophages [30]) appear unrestricted in their motion". I don't understand what it means that the airineme tips are transported by macrophages, and I missed the explanation in the cited article. Is airineme dynamics internally generated (i.e. by actin/microtubule polymerisation) or does it reflect the motility of cells dragging the airineme along? This is discussed in passing in the Discussion, but I think that this should be explained in more detail right from the start. Also, if a cell is indeed directing the tip, what does contact mean? Does it mean that the driving macrophage must contact the target cell and somehow attach the airineme to it? If yes, that means that the airineme tip has a large spatial extent, which will certainly affect the contact probability.

      These are very good questions. Airinemes have been characterized in a few studies since their discovery in 2015. We are saddened (and excited) to say that: the answers to all of these questions are currently unknown. To paraphrase the Reviewer, the questions are: First, what is the force generation mechanism that leads to airineme extension (additionally, if there are multiple coordinated force generators, e.g., the airineme’s internal cytoskeleton and the macrophage, how are these forces coordinated)? And second, what are the molecular details of airineme tip contact establishment upon arrival at a target cell?

      We present an extended biological background discussion addressing these questions, including what is known and what remains unknown. We have incorporated a shortened version of this as a new paragraph in the introduction.

      Airinemes are produced by xanthophore cells (also called yellow pigment cells) and play a role in the spatial organization of pigment cells that produce the patterns on zebrafish skin. Xanthophores have bleb-like structures at their membrane, and those blebs are the origin of the airineme vesicles at the tip. Those blebs express phosphatidylserine (PtdSer), an evolutionarily conserved ‘eat-me’ signal for macrophages. Macrophages recognize the blebs, ‘nibble’ them, and ‘drag’ them as they migrate around the tissue, with the filaments trailing and extending behind. Airineme lengths have a maximum, regardless of whether they reach their target. If the airineme reaches a target before this length, the airineme tip complex recognizes target cells (melanophores), and the macrophage and airineme tip disconnect.

      The airineme tip contains the receptor Delta-C, which activates Notch signaling in the target cell. The mechanism by which a macrophage hands off the airineme tip is still mysterious, due to temporal and spatial resolution limits. It is also unknown what other signals, if any, are carried by the airineme. If no target cell is found by the maximum length, the macrophage and airineme disconnect, and the airineme switches from extension to retraction. Thus, macrophages do not keep dragging the airineme vesicles until they find the target melanophores. However, how macrophages determine when to engulf the untargeted airineme vesicles is not understood.

      Regarding the Reviewer’s specific question about the implications for the macrophage on how we model contact establishment: This would indeed change the interpretation of the model parameter r_targ. Specifically, contact is declared when the airineme tip arrives at a distance r_targ from the center of the target cell, and this critical distance might be larger than the size of the target cell, since it might include part or all of the macrophage. We have added this to the first part of Results, when the parameter is introduced.

      3) Fig. 2A shows the airinemes MSD and the fit using the PRW model. I don't find the agreement so good. The power law t^2 seems good almost up to 10 minutes, and the scaling above that, if there is one, is clearly larger than linear. So I would say that the apparent agreement with the PRW model reflects the fact that there is a crossover from a ballistic motion to something else, but that this something else is not a random walk. The MSD does look quite strange at long time, where it apparently decays. This made me wonder whether there might be a statistical bias in the data, for instance, the longest living airinemes are those who didn't find their target and hence those who travel less far, on average. I tried to get more information on the data from the ref.[29,30], but could not find anything. The authors should discuss these data and possible bias in more detail. For instance, do the data mix successful and unsuccessful airinemes? This is somewhat touched upon in Fig.s$, but I did not gain any useful information from it, except that the authors find the agreement "good" while it does not look so good to me.

      To reiterate the comment, which is closely related to comments from other Reviewers: the MSD analysis allows us to reject the simple random walk model, and it is consistent but alone is not strongly supportive of the PRW model, especially at high tau of around 15 minutes (long lengths of around 65 microns). As the Reviewer points out, this is due to low numbers of long airinemes.
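      For reference, the ballistic-to-diffusive crossover under discussion follows from the textbook mean squared displacement of a two-dimensional persistent random walk with constant speed and angular diffusion D_theta. The expression below is the standard result and is given as a sketch. The symbol v for the extension speed is an assumption here, and the manuscript's fitted form may be parameterized differently.

```latex
\mathrm{MSD}(\tau) = \frac{2 v^{2}}{D_\theta^{2}}
  \left( D_\theta \tau - 1 + e^{-D_\theta \tau} \right),
\qquad
\mathrm{MSD}(\tau) \approx
\begin{cases}
  v^{2} \tau^{2}, & \tau \ll 1/D_\theta \\[4pt]
  2 v^{2} \tau / D_\theta, & \tau \gg 1/D_\theta
\end{cases}
```

      The short-time limit gives the t^2 scaling noted by the Reviewer, and the long-time limit is linear in tau, so a scaling steeper than linear at long lags is not predicted by this form.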

      We agree, and have performed new analysis. The following is repeated here for convenience:

      This prompted us to investigate the long-length data using multiple analysis approaches. In the new manuscript (new Fig 2B), we took all airinemes whose growth time was greater than 15 min and plotted their final angle, i.e., the angle between the tangent vector at their point of emergence from the source cell and the tangent vector at their tip. The PRW model predicts that, for long times >1/D_theta, the angular distribution should become isotropic. In new Fig 2B, we find that the angular distribution is consistent with a uniform, i.e., isotropic, distribution according to a Kolmogorov-Smirnov test (p-value 0.37, N=26).

      Since there are relatively few data points, we repeated this analysis under various airineme selection criteria, and in all cases found the final angular distribution to be consistent with uniformity (new Supplemental Data Figure 1). For example, if we set the threshold at 10min, which includes up to N=49 airinemes, the Kolmogorov-Smirnov test against a uniform angular distribution gives a p-value of 0.32.

      We add a few additional notes here:

      ● Note that there is significantly less data used in this test than in the MSD analysis or the autocorrelation function maximum likelihood analysis. In order to perform a hypothesis test, we wanted to be sure that the data points are independent, so we take only one from each airineme (unlike MSD and autocorrelation analyses, for which we take every interval of a particular length, whether in the same airineme or not.)

      ● Finally, although the >10min KS test has more data than the >15min KS test (N=49 compared to N=26), we have chosen to present the >15min KS test in the Main Text. As we mentioned above, the conclusion is unchanged for >10min (see Supporting Data). The reason is that >15min is the first test we ran to check angular distribution against a uniform (-pi,pi) distribution, and we did not want to bias our testing.
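      The uniformity test described above can be sketched in a few lines. The response does not state which software was used, so the following is an assumption-laden illustration rather than the authors' procedure: a one-sample Kolmogorov-Smirnov statistic against a uniform distribution on (-pi, pi), with the p-value taken from the asymptotic Kolmogorov series and a common small-sample correction. The function name is hypothetical, and exact small-sample p-values from statistical packages may differ slightly.

```python
import math


def ks_uniform(angles, lo=-math.pi, hi=math.pi):
    """One-sample KS test of `angles` against Uniform(lo, hi).

    Returns (D, p), where D is the KS statistic and p is an approximate
    p-value from the asymptotic Kolmogorov distribution.
    """
    xs = sorted(angles)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = (x - lo) / (hi - lo)
        # Largest gap between the empirical and the uniform CDF.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    # Effective argument with a widely used small-sample correction.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is essentially 1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)
```

      The input should contain one final angle per airineme, matching the independence requirement in the first note. A large p-value means uniformity cannot be rejected, which is weaker than demonstrating that the distribution is uniform.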

      Taken together, the data are even more strongly supportive of the PRW model. We are grateful to the Reviewer for encouraging us to further explore the high-time data.

      4) Regarding the directionality discussion, some aspects are a bit vague so that we are left to guess the assumptions made. For instance, the source cell is placed at \theta=0 "without loss of generality" (p.6). Apparently (sketch Fig.5A) this also means that the airineme starting point from the source is at \theta=0, which clearly involves loss of generality, since the airineme could start from anywhere, its path could be hindered by the body of the source cell, and its contact angle would then be much less likely to be close to 0. It might be that in practice, only those airinemes starting close to theta=0 do in fact make contact, but this should be discussed more thoroughly. Also, why are there two maxima in the Fisher information (Fig.5C) for very high and very low diffusion coefficient at short distance?

      The sketch was indeed not clear about generality, so we have edited it so that the angles are no longer perpendicular. We now also clarify in the Main Text that, in all simulations (both measuring contact probability and directional sensing), the airineme begins at a specified point in an orientation uniformly random in (-pi,pi). We apologize that this was not clear in the previous sketch.

      Regarding hindrance by the source cell: While the tissue surface is crowded, the airineme tips appear unrestricted in their motion on the 2d surface, passing over or under other cells unimpeded (Eom et al., 2015, Eom and Parichy, 2017). We therefore do not consider obstacles in our model. This includes the source cell, i.e., we allow the search process to overlie the source cell. We now state this explicitly in the Main Text.

      Regarding two maxima in Figure 5C (which was a surprise to us): We understand it with the following intuitive picture. For low D_theta, i.e., for very straight airinemes, the allowed contact locations are in a narrow range (by analogy, imagine the day-side of the planet Earth, as accessible by straight rays of sunlight), resulting in high directional information. For high D_theta, i.e., for very random airinemes, we initially expected low and decreasing directional information, since there is more randomness. However, these are finite-length searches, and the range of the search process shrinks as D_theta increases. This results in a situation where the tip barely reaches only the closest point on the target cell, resulting again in high directional information. We have added this intuitive reasoning in the Main Text.

    1. In constructing personas, we had to be cognizant of inadvertently creating stereotypes as humans naturally stereotype as a way of categorizing conceptions of others

      In addition to this inadvertent tendency to create stereotypes, I think that in only making a couple of personas to represent the learning audience you may fall into stereotyping just by lack of a sufficient sampling. How do you determine how many personas would be a representative enough sample? If you are looking at a diverse group of learners, you need more personas, and you need to have instructional materials that cover diverse needs. In larger groups would you break the group into sections to better address individual needs? Or have additional instructors?

    2. One way to enhance the socio-technical design of learning environment is by espousing a human-computer interaction perspective, which allows us to not only consider what the s/he is learning, but the unique interactions that impact their learning process.

      This is such an important point! I think that often designers, SMEs, coders...everyone involved in the design team can become so enthralled with and focused on their design that they lose sight of the learner experience. It seems to me that true LXD requires that the design team set their egos aside and be flexible and open to change in order to provide the best and most effective learning experience for the learner. If the design itself induces frustration, the learner may give up and never get to the actual learning process. Designers need to strive for ease of use and provide design with limited barriers for the learner.

      I have seen the role of designer-ego play out in the real world. In my role as a virtual math teacher, my colleagues and I regularly reached out to the curriculum team to request a change to the virtual book, activities, or assessments in order to enhance our students' experiences. Too often, we were told no, with no regard for the learner.

      My frustration with this led me to want to move into curriculum so that the learner's point of view would be better understood and represented. That is part of what led me to my current role in Gifted and to the ID program at UF.

    3. We put ourselves in his or her shoes.

      This week really put into perspective the idea of empathy vs. sympathy. It's much easier to sympathize with someone because you're still creating understanding from your personal perspective, while empathy requires what Baaki & Maddrell suggest: putting ourselves in someone else's shoes. However, I think that's a lot more difficult than these activities suggest. In Dr. Schmidt's example of creating a course for parents dealing with a child's diagnosis, while we may sympathize with them as instructional designers and human beings, it's far more difficult to truly understand the depths and nuances of their experiences. While empathy interviews certainly help, they can never replace the experience. I saw this as a mother whose son recently received a diagnosis and who is dealing with the feelings and thoughts associated with it. I don't know if there's truly an empathy interview that an instructional designer can use to gauge and truly understand and feel what I do.

    1. Accommodations alone are not enough to achieve inclusion; when we go beyond accommodations, we create paths that help and support many learners, not just those who need or want accommodations.

      I think this idea is so important! Creating accommodations or new accessibility features is not just helpful for people with disabilities. They can serve as useful tools for anyone regardless of their abilities. If the accessible features can make everyone's time using a tool easier, why is there a lack of emphasis on creating these features? Tool creators may not prioritize them to begin with because they do not value people with disabilities as much as they should, but they need to realize these features can help everyone.

    2. They require constant reevaluation of the design choices we make in order to recognize how each choice can open up new forms of exclusion and barriers for learners.

      This is something that I think is very important to recognize. In regards to inclusion, it is essential that we are constantly aware of how or whom we may be excluding, even if we do not realize it. Nobody can be perfect 100% of the time, but as long as we are making an effort to respect others, that's all that we can ask for.

    1. Author Response:

      Reviewer #1 (Public Review):

      In this detailed study the authors show that in isolated islets the polarity of the secretory apparatus is largely lost while it is preserved in slices where the capillary network remains intact. The authors then go on to show that the integrin/FAK pathway appears to be responsible for inducing and maintaining polarity, which involves concentration of active zone proteins and calcium channels at the contact sites and a higher sensitivity and potency of insulin secretion to glucose stimulation.

      Generally, the data appear to be of high quality, being carried out with state-of-the-art technology, and the manuscript is lavishly illustrated. Since as a neuroscientist I am not sufficiently familiar with the field of the cell biology of insulin release it is difficult for me to judge whether there is sufficient advance in knowledge. A higher degree of organization of release sites including a role of active zone proteins was previously demonstrated from other endocrine organs involving the release of large dense-core vesicles such as chromaffin cells. Thus, the differences between the highly organized and rapidly responding exocytotic sites in neurons and the slower reacting release sites of peptide/protein containing granules are not fundamental but rather gradual, despite the principal cell biological differences between the biogenesis and recycling pathways of the secretory organelles.

      In summary, the work adds new aspects to the understanding of the regulation of exocytosis in pancreatic beta cells. Aside from corrections of figure descriptions and experimental details, my only major comment relates to the data shown in Fig. 4. It appears that the difference in the time-to-peak between the two preparations is mainly caused by a (rather variable?) delay between glucose addition and the onset of the rise since the rate of increase is apparently not different between the preparations. Is this due to a delay in depolarization, i.e. a delay in the closure of the ATP-K channels? This should be clarified. Also, the authors should show a comparative histogram of the delay times (between glucose addition and the inflection point at the onset of the rise).

      The delay observed is due to a slower response in islets vs slices, which given the potentiating effects we show of the KATP channel drugs (diazoxide and now glibenclamide) is likely explained by a delay in KATP closure. However, since we are measuring the Ca2+ response we cannot directly prove this. We feel this is adequately discussed with reference to glucose-dependent triggering (where the KATP channel is a key component). In direct response to the referee’s comment about variability, we have re-expressed the data to show frequency histogram comparisons of the delay to peak (new Fig 4J).

      Reviewer #2 (Public Review):

      1) The authors present an investigation of subcellular distribution and dynamics of known presynaptic proteins in a relatively new approach, pancreatic slices, mastered by a limited number of laboratories, and which is currently the best method to largely preserve capillary networks. They demonstrate the advantage of this method by detailed cellular and subcellular optical analysis comparing isolated islets, islets in pancreatic slices, isolated islet cells and isolated islet cells on ECM (laminin) covered surfaces. This work provides good proof that preservation of capillary networks and corresponding distribution of proteins (laminin, liprin, integrin beta1 etc) is required for insulin secretion at the apical surface of islet cells. Moreover, in these pancreatic slices they observe a restriction of exocytotic sites at the vascular surfaces. The role of the extracellular matrix is also well investigated here by experiments on dispersed or single beta cells attached either to a glass-BSA interface or to a glass-laminin interface. However, the authors already published in 2014 a restricted polarized insulin secretion in cultured islets as well as the preservation of localized liprin and laminin distribution (as well as RIM2 and piccolo; DOI 10.1007/s00125-014-3252-6). It is not clear why these data cannot be reproduced now again in isolated islets (see Fig. 1 and 2).

      We thank the referee for their comments. To clarify the specific issue around our past work: all our live sub-cellular resolution experiments have previously been performed with isolated islets, as we have not, until recently, been able to reliably get the slice to work. In contrast, our work with immunofluorescence of active zone proteins has been performed with fixed slices (including DOI 10.1007/s00125-014-3252-6, Low et al 2014).

      2) The authors try to gain insight which mechanisms control this specific spatial restriction and they provide evidence that Focal Adhesion kinase activity is implicated in glucose-induced calcium fluxes and insulin secretion by the use of a small molecule antagonist and the use of a purified monoclonal antibody. They conclude that FAK is a master regulator of glucose induced insulin secretion that controls positioning of presynaptic scaffold proteins and the functioning of calcium channels. Although FAK may be a regulator, the claim that FAK controls functioning of calcium channels can certainly not be made. Ratio measurements of cellular calcium levels do not suffice for that (patch or sharp would be required). Moreover, the fact that KCl-induced insulin secretion (which bypasses nutrient metabolism and leads directly to opening of voltage-dependent calcium channels) is not altered by the FAK antagonist strongly argues against a role of FAK in calcium channel regulation. Indeed, the presented data suggest that FAK may intervene far more upstream from exocytosis such as in nutrient metabolism or granule mobility/maturation.

      Our data clearly shows that integrin/FAK activation is part of the glucose dependent control of Ca2+ and insulin secretion. It is not relevant to this conclusion how we measure Ca2+ responses – they are obviously affected by all manipulations of integrin/FAK. We note that the referee is specifically correct in saying that we do not have evidence that Ca2+ channel function is a direct target of integrins/FAK and we have reworded the text to make this clear.

      Further, our work does not define where in the glucose pathway integrin/FAK are acting. The referee is correct in saying the KCl data suggest that they act upstream of the final stages of Ca2+ channel activity and exocytosis. Consistent with this, we see effects of integrin/FAK manipulation on ELKS and liprin positioning (Figs 7 and 8) and, given the published data showing that ELKS enhances Ca2+ channel current (Ohara-Imaizumi et al 2019), we think it is plausible that integrin/FAK intersect with this pathway to regulate Ca2+ channel activity. With reference to the high K responses, KCl rapidly depolarises the cells to recruit Ca2+ channels, whereas glucose depolarises cells slowly. This difference will affect Ca2+ channel behaviour, and altered CaV1.2 function, such as a lowered voltage threshold, might only be apparent in the glucose responses.

      3) The authors present data that islets in pancreatic slices are considerably more sensitive to glucose, inducing a response already at basal glucose levels (2.8 mM). In the same vein the authors observe a considerably shortened delay between stimulus and response (this delay is generally due to nutrient metabolism and initial filling of intracellular calcium stores). The authors take these phenomena as evidence for a superior and more physiological quality of their islet slices as compared to conventional purified islets.

      However, contrary to their interpretation, these observations considerably question whether the slice preparation used here in this work has physiological qualities. Indeed, the authors observe considerable activity of islet beta-cells already far below the set-point of around 6 or 7 mM in rodents, very well characterized through a number of studies in-vivo, in-vitro and even in-situ (10.1113/jphysiol.1995.sp020804), and their preparations reach almost full activity around the set-point. This is also surprising as such a hypersensitivity has not been reported by several other groups using the same preparation, i.e. pancreatic slices (10.1152/ajpendo.00043.2021; 10.1371/journal.pone.0054638; 10.3389/fphys.2019.00869; 10.1371/journal.pcbi.1009002; 10.1038/nprot.2014.195) even using patch clamp (10.3390/s151127393). Moreover, even human islets, known for a lower set-point, are inactive in slices at 3 mM (10.1038/s41467-020-17040-8) in line with the physiological requirement to avoid insulin secretion in low glucose states so as to avoid life-threatening hypoglycaemia. The same applies for the shortened delay between application of a stimulus (glucose) and start of the response, which has also not been observed by other groups in pancreatic slices (refs see above).

      We are cognisant that our data challenge the dogma and addressed this point in the discussion. Evidence that our findings might be correct includes the responses seen by Henquin to glucose concentrations below 6 mM (Gembal et al 1992) and the long-standing evidence of heterogeneous responses in isolated cells that show responses to very low glucose concentrations (Van Schravendijk et al 1992). As such, our data are not as unusual as they might initially appear. Furthermore, as discussed in detail below, the findings from others using the slice preparation are not directly or easily compared to our work.

      In general, such an increased glucose sensitivity is observed in prediabetic states or experiments mimicking such a condition. To the best of my recollection such an apparently increased sensitivity can also be observed in brain slices due to leakage. Unfortunately, no independent measures of islet quality in slices are provided.

      We have previously characterised increased insulin secretion in “prediabetes” in mice and demonstrated a clear effect on the mechanisms of granule fusion such as an increase in compound exocytosis (Do et al 2016). We do not think this is relevant to this slice preparation where normal mice were used for both the slice and the islet experiments and our data in slices and islets both show normal granule fusion and not compound exocytosis.

      Within the same vein the comparison between slices and islets (Fig 5) is not in favour of a more physiological aspect of slices and the different cell morphology and small number of observations shed more doubt, especially in view of the well known normal beta-cell heterogeneity (which may explain differences and may have been missed here due to a small sample size).

      We acknowledge that beta cell heterogeneity is a potential confounding factor. However, our sample sizes are not small: in each islet or slice we record Ca2+ responses from ~10 cells (see Fig 3), and we have repeated preparations from each mouse, with the total dataset from >3 mice. It is true that the sample size for Ca2+ waves is small for the isolated islets, but this is because these are such rare events, which is explained by the fragmented capillaries and compromised cell structure (eg Fig 1) in isolated islets.

      In a larger context this glucose supersensitivity may also shed doubts on the proposed important role of FAK as its role may be far less preponderant in preparations corresponding to physiological criteria.

      We agree that the relative importance of FAK might be different in different in vitro models. But it is clear that FAK plays an important role in vivo and the data from FAK KO mice show both defective glucose homeostasis and lower insulin secretion (Cai et al 2012) directly demonstrating physiological relevance.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This manuscript documents a very thorough biophysical, structural and functional dissection of interactions between the RNA-binding protein Rrm4 and the endosomal adaptor Upa1 in the filamentous fungus Ustilago maydis. It has been shown previously that the Rrm4-Upa1 interaction is critical for mRNA transport in this system as mRNAs hitchhike on motor-associated endosomes. Here, the authors reveal using modelling that Rrm4 has three MLLE domains, including a cryptic one that had not been identified previously. They then report the crystal structure of MLLE2 and analyze the distribution and arrangement of the MLLE domains in the protein using SAXS. They then show using pulldowns and isothermal titration calorimetry that MLLE3 is critical for the Upa1 interaction (via the PAM2L domains of Upa1) and that MLLE2 contributes to Rrm4 localization in vivo when the MLLE3-Upa1 interaction is partially impaired. The study suggests that Rrm4 has a platform of MLLE domains for orchestrating Rrm4 function. Overall, this is technically a high quality study. However, a number of points (mostly minor) should be addressed.

      Major comments:

      A key part of the study is the in vivo work illustrating a role for MLLE2 in regulating Rrm4 localization when the system is sensitized. Some aspects of this part of the work need clarifying.

      a) The authors should show that the aberrant staining is indeed microtubule-related with the benomyl experiment that they used in Jankowski et al. 2019.

      We included this important control in Figure EV5F, demonstrating that the aberrant staining is no longer visible after treatment with the microtubule inhibitor benomyl.

      b) The authors claim from these experiments that MLLE2 contributes to endosomal targeting (as there is ectopic protein on other structures (presumptive microtubules)). However, to make this claim, the authors would need to measure the intensity of the mutant Rrm4 protein on endosomes and/or the colocalization of these Rrm4 variants with endosomes, as they do in other experiments in this paper. Otherwise, it is possible that the MLLE2 deletion has another effect, e.g. increasing protein stability, and thus increasing the likelihood of binding to structures other than endosomes. If available, data on the relative abundance in the cell of the protein expressed from the wild-type control (rrm4-kat) and MLLE2 deletion constructs (e.g. rrm4-m1,2delta-kat) should be provided.

      As indicated by the reviewer, a critical point is identifying a function of MLLE2. Surprisingly, although the domain is conserved in evolution, we do not see a mutant phenotype under optimal culture conditions. Therefore, we challenged the system and observed the mislocalisation of Rrm4 if the MLLE2 domain is deleted. However, the overall amount of shuttling Rrm4-positive endosomes was not strongly affected according to our kymograph experiments. We observe aberrant staining, which is not seen with the Rrm4 wild-type protein. Thus, under challenging conditions, we do see a function of MLLE2.

      To address the valid point of the reviewer, we quantified the signal intensities in kymographs of the most important Rrm4 variants. As indicated in Figure 5E, we observed that the maximum fluorescence intensity in kymograph signals was reduced when Rrm4 variants are mislocalised to microtubules while the minimum intensities were comparable in all strains. This underlines that a subset of Rrm4 molecules are no longer shuttling through the cell and most likely are attached to microtubules (to prove the involvement of microtubules, we did benomyl treatment which is now shown in Figure EV5F). We also included a Western Blot experiment (Figure EV5G) demonstrating that neither MLLE1 nor MLLE2 deletion impacts the total protein amount of Rrm4. These data support the notion that MLLE2 contributes to endosomal targeting.

      c) Was the data in Figure 5D scored blind of the identity of the samples? Given that the classification has to be done manually, it is important to confirm the phenotypes are robust to blinding (at least for the key comparisons).

      We agree entirely that manual evaluation of microscopic images has to be carried out with utmost care. The phenotype of aberrant microtubule staining is not easily detectable, and it needs an experienced person to quantify this. To validate the system’s robustness, the data were analyzed by a second experimentalist with experience in evaluating microscopy images. Notably, the key findings were confirmed: in both cases, aberrant microtubule staining was only observed when the MLLE domain was mutated. However, the second person, who had less experience with Rrm4 movement, reported difficulties in differentiating a bundle of Rrm4 signals from stained microtubules and therefore quantified higher values. In essence, we can rely on the key findings. We included the information in the section “Materials and methods” and gave the comparison in Figure EV5H.

      If points b and c are addressed, it should be possible to draw an arrow between the gray question mark protein in Figure 6 and the endosome surface, which is what I assume the authors believe to be the case based on their discussion.

      Having addressed both points, we have also improved the model. To this end, we added a second unknown protein component (grey oval with a question mark) that interacts with MLLE2 and the endosomal surface. Thereby the hierarchical order with the accessory role of MLLE2 during endosomal attachment is stressed.

      Minor comments:

      1. The first line of the abstract is quite bold. It is hard to quantify the role of transport vs RNA stability for example, so I suggest this sentence is toned down.

      Correct, the first line now reads, “Spatiotemporal expression can be achieved by transport and translation of mRNAs at defined subcellular sites”.

      Line 269: change "amount of motile Rrm4-M12delta-Kat positive signals" to "number of motile Rrm4-M12delta-Kat positive signals".

      Changed as mentioned above.

      Figure 3 legend: Insert "Variant" before "amino acids of the FxP and FxxP..." to indicate what is labeled in gray. Change "fond" to "font" in the same sentence.

      Corrected as mentioned above.

      The cartoons of the different protein variants are very helpful but I had problems spotting the Upa1-Pam2L deletions due to the similar gray to the background of the protein. This would perhaps be clearer if the gray used for the background was lighter than it currently is.

      We improved the contrast by reducing the background of Upa1 to a lighter grey tone in all the corresponding figures.

      The residual motility of wild-type Rrm4 when PAM2L1 and PAM2L2 are both mutated (Figure 5C) is reminiscent of what is seen in a complete Upa1 deletion in the group's previous work. It would be helpful to point this out to the reader, as well as the implication that other proteins are contributing to Rrm4's linkage to endosomes. After all, some of these other adaptors might contact MLLE2 of Rrm4.

      We addressed this point by referring to our previous publication with the following sentence: “Comparable to previous reports, we observed residual motility of Rrm4-Kat on shuttling the endosomes if both PAM2L motifs are mutated or if upa1 is deleted. This indicates that additional proteins besides Upa1 are involved in the endosomal attachment of Rrm4 (Pohlmann et al., 2015).”

      Some of the y-axes of the charts should be more descriptive so that the reader can understand the plots even before they consult the legends. For example, in Figure EV4A and EV5D and E, which protein is being referred to in each 'number of signals' plot should be included. In Figure 5D, 'Hyphae [%]' would be clearer as 'Hyphae with MT staining of Rrm4 [%]'.

      We improved this in Figures EV4, 5D and EV5.

      Figure EV5 legend title: this could be misleading as the authors are seeing ectopic MT localization rather than a deficit in microtubule association.

      Corrected to “Deletion of MLLE1Rrm4 and -2 cause aberrant staining of microtubules”.

      Reviewer #1 (Significance (Required)):

      The Feldbrugge group has previously mapped interactions between Upa1 and Rrm4 (Pohlmann et al., 2015) and some conclusions are corroborated in the paper by Boehm et al. The paper under review is, however, a significant advance due to the identification of the third MLLE domain, detailed biophysical characterization of the interactions, the structural insights, and evidence of a subsidiary role of MLLE2. The work would of course be stronger if the target of MLLE2 had been identified but I think this is beyond the scope of this initial work. To my knowledge, this is one of the most extensive analyses of the interactions mediated by MLLE and PAM domains and will be of interest to others working on these protein features. The work will also appeal to those interested in the links of localizing mRNAs with motor-associated membranes, which is an emerging field.

      Reviewer expertise: I have a long-standing interest in molecular analysis of mRNA trafficking mechanisms. I do not have experience in fungal genetics.

      **Referee Cross-commenting**

      It seems that we are in agreement that this is solid work and that biochemical and biophysical analysis of the MLLE-PAM interactions will be of significant interest to those working on those domains (or proteins containing those domains). I agree with the comments of the other reviewers and there are clearly some essential minor revisions needed to strengthen the evidence for their conclusions and some clarifications. I think it is a long shot that RNA binding to the RRMs will affect the MLLE-PAM interactions and would require quite a lot of work to show this conclusively. The study would, however, be more impactful if this was shown to be the case, or the target of MLLE2 was found. Nonetheless, I would not say these new avenues of research are necessary to find a home in one of the Review Commons journals.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Devan, Schott-Verdugo et al.

      Summary

      In this study the putative MLLE RNA-binding motifs of the endosomal RNA-binding protein, Rrm4, from Ustilago maydis were examined using structural and genetic analyses. MLLE motifs are conserved in polyA-binding proteins (Pab1/PABPC1) and found also in Rrm4, which was shown to reside on motile endosomes and deliver septin mRNAs for endosome-localized translation during polarized growth. Upa1 on the endosome interacts with Rrm4 via its PAM2L domain that itself interacts with the MLLE domains of proteins like Pab1. Mutations in the known MLLE domain of Rrm4 were earlier shown to affect localization to endosomes. Here, the C-terminal domain of Rrm4 was revealed to have three divergent MLLE motifs using comparative modeling; only two of which were previously predicted. Crystallization and X-ray diffraction analysis of a truncated version of bacterially produced Rrm4, showed MLLE2 is most similar to that of PABPC1 and UBR5, although MLLE1 and 2 are somewhat divergent in the key region of PAM2 binding. Small angle X-ray scattering of recombinant full-length or truncated Rrm4 revealed that the MLLE domains might form a platform that could allow for multiple contacts with different binding partners. In vitro binding studies with different N-terminal GST-tagged versions of the Rrm4 were used to examine for interactions with PAM2 sequences of Upa1 using N-terminal hexa-histidine-SUMO fusions. It was found that Pab1-MLLE interacts with the PAM2, but not PAM2L, domain of Upa1. In contrast, the complete Rrm4 MLLE region (G-Rrm4-NT4) interacted with the PAM2L domain, but not the PAM2 of Upa1. Notably, the interaction with PAM2L required the third MLLE and neither MLLE1 nor MLLE2, nor both. No significant differences in affinity were observed and were similar to that of the Pab1 MLLE. The results also show that the MLLE3 has a higher affinity for the PAM2L2 than PAM2L1 of Upa1.

      To examine the biological role of the Rrm4 MLLEs, U. maydis strains bearing deletions in the domains of Rrm4 were examined for hyphal growth and endosomal transport (latter using Upa1-GFP and Rrm4-mKate2). Only the loss of the MLLE3 domain inhibited polarized growth (as seen with the full deletion of RRM4) and not the deletion of either MLLE1 or 2. Similar results were obtained regarding endosome shuttling. Thus, in line with the biochemical experiments performed the MLLE3 domain alone (of the three identified) is necessary for the biological actions of Rrm4. This suggested the MLLE1 and 2 are not necessary for function under these conditions.

      To examine this further, Upa1 carrying mutations in the PAM2L1 or PAM2L2 domains were examined. It was found that the deletion of both PAM2L domains affected unipolar growth resulting in bipolar growth similar to the deletion of UPA1 alone. This phenotype was observed even upon the deletion of Rrm4 MLLE1 and 2 in the same background as the PAM2L mutants. The mutation of both PAM2L domains led to a reduction in Rrm4-labeled shuttling endosomes, which suggests that these domains help anchor Rrm4 to endosomes. When only the PAM2L1 domain is present in Upa1 there was a larger increase in hyphae with aberrant microtubule staining than upon the loss of PAM2L1. The authors suggest that this indicates PAM2L2 is more important and prescribes an accessory role for MLLE2 in endosome association.

      Comments: Overall, the study seems well conducted. We cannot comment on the structural aspect of the work since this is not our field of expertise. That said, the biochemical and genetic/functional studies appear solid, well thought-out, and clearly presented. No new experiments are necessary to support the general claims of the paper, however, experiments suggested below might make it more revealing with regards to the connection between RNA binding and MLLE-PAM2L interactions (i.e. endosome localization and RNA binding functions).

      1. Line 286 - It reads: "Next, we investigated the association of Rrm4 -M12D-Kat in strains expressing PAM2L1. Thus, the endosomal attachment was solely dependent on the interaction of MLLE3 with the PAM2L2 sequence of Upa1." Unclear - wouldn't lacking PAM2L1 (and not expressing) fit the logic of the sentence?

      We corrected this with the sentence, “Next, we investigated the association of Rrm4-M1,2D-Kat in strains expressing Upa1 with mutated PAM2L1”.

      Several questions regarding the specificity of PAM2 vs. PAM2L domains. What happens when you switch/replace the PAM2L1 or 2 of Upa1 with Upa1 PAM2 domains? Are they exclusive? What happens when the MLLE3 of Rrm4 is switched with that of Pab1? And if one does both - does that restore functionality to Rrm4?

      These are very interesting suggestions. Previously, we have shown that a single PAM2L1 or PAM2L2 sequence of Upa1 is sufficient for unipolar growth and recruitment of Rrm4 to endosomes. Please note that Upa1 with mutated PAM2L1 and L2 still contains a PAM2 motif. Furthermore, mutating the PAM2 motif of Upa1 did not affect Rrm4 shuttling or unipolar growth. Thus, switching the domains would mostly address whether the precise location within Upa1 is important. This is interesting but, unfortunately, very labour-intensive and beyond the manuscript’s current scope.

      Switching MLLE3 with MLLE of PAB1 is an interesting approach. One might expect that Rrm4 can be recruited to endosomes again. However, Rrm4 would also interact with numerous other proteins containing PAM2 motifs like deadenylase Not4. Here it would compete with the MLLE of Pab1. Thus, it would be expected that Rrm4 is on the surface, but the protein will be mistargeted to other proteins causing pleiotropic alterations. It will be difficult to judge whether Rrm4 functionality is restored or whether other processes are disturbed. In essence, these are stimulating ideas, but we believe that these experiments are beyond the scope of the current study. In the future, we might address this point by using a heterologous peptide-binding pocket or tethering approach.

      Likewise, what happens if Upa1 only has PAM2L2 instead of only PAM2L1 domains? Does that alter function - perhaps now one can observe a contribution of MLLE1? If it's there it's likely to have function. Anything known about the post-translational modification of these MLLE or PAM domains? Does it change during unipolar vs. bipolar growth? Perhaps the different MLLE domains are regulated in such a fashion?

      Again, these are very valid points. Upa1 with two PAM2L2 motifs might interact more strongly. The problem is that one PAM2L motif is sufficient for interaction, and we do not see a strong phenotype.

      Currently, we do not know if post-translational modifications regulate the MLLE domains. This could alter the binding affinity or specificity, and by expressing fungal proteins in E. coli, we might have missed this type of regulation. However, we addressed the function of MLLE1 and MLLE2 in U. maydis using a genetic approach. We deleted the corresponding domains and interfered with potential regulation by posttranslational modification. Thus, we cannot exclude post-translational modification, but it appears to be not essential for function. We will address the posttranslational regulation of Rrm4 in more detail in the future.

      Can the authors show whether the binding of mRNA cargo (e.g. Cdc3 mRNA) to the RRM motifs of Rrm4 affects the interaction between any of the MLLE-PAM2L pairs, or vice versa (i.e. does the MLLE-PAM2L interaction affect mRNA binding)?

      In previous studies, we have investigated a version of Rrm4 carrying a mutation in the first RRM motif of Rrm4. According to RNA live imaging, the respective strains exhibit a loss of function phenotype and mRNA transport is strongly affected. However, the endosomal association of Rrm4-mR1-Gfp is not affected, indicating no direct cross-talk between RNA-binding via RRM1 and endosomal attachment via MLLE3. Also, a version of Rrm4 carrying a deletion of all three RRM domains is still shuttling on endosomes. The two functions, i.e. RNA binding and endosomal binding, appear to be carried out by two independent platforms, i.e. three RRMs and three MLLEs, respectively. The overall structure of the protein also reflects this. The RRM domains are structurally clearly separated from the flexible MLLE domains.

      Discussion line 311 It is written that the three MLLE domains "collaborate for optimal functionality..." Perhaps there's a misunderstanding here, but the authors show that MLLE3 domain alone is necessary & sufficient for function, so where is the collaboration? MLLE2 may have an accessory role according to the authors, but we do not know if it is in collaboration with MLLE3 or independent thereof. Since the KD of MLLE3 is not affected by the presence or absence of MLLE1,2 in vitro at least, it may be that they have independent, and not collaborative, roles.

      Correct, we rephrased this more carefully. We omitted the collaboration aspect. It now reads, “but a sophisticated binding platform consisting of three MLLE domains with MLLE2 and MLLE3 functioning in linking the key RNA transporter to endosomes.”

      Reviewer #2 (Significance (Required)):

      This paper concerns functional domains found in an endosome-localized RNA binding protein, U. maydis Rrm4, which is necessary for localized translation on endosomes and subsequent unipolar growth. Here the authors show using structural, biochemical, and genetic studies that instead of one or two MLLE protein-protein interacting domains in Rrm4 there are three, although one (MLLE3) is necessary and sufficient for full function. This work is of interest to those studying RNA trafficking and its role in cell physiology, which is our expertise. The work is interesting, but it could be made more so especially if a connection was established between the RNA-binding function of the RRM domains and the MLLE-PAM2L interaction(s). At this point it is solid technical work and could be published after minor revisions.

      **Referee Cross-commenting**

      I concur with the comments of the other reviewers in that the work is solid and necessitates minor revisions in order to be published. Clearly, establishing a connection between the RNA-binding function and the MLLE-PAM interactions of Rrm4 would be an interesting and worthy pursuit that might enhance the novelty of the work, but I agree that it could belong to future studies.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: Long-distance subcellular transport of mRNAs is achieved through selective and dynamic interaction with the transport machinery. Using the highly polarized hyphae of Ustilago maydis, the authors previously showed i- that mRNAs can hitchhike on actively transported endosomes for proper distribution, and ii- that the connection between mRNAs and endosomes is mediated by the interaction between a C-terminal MademoiseLLE (MLE) domain of the RNA binding protein Rrm4 and the Upa1 adapter protein. In this study, the authors aimed at more precisely characterizing the structural and molecular bases underlying the Rrm4-Upa1 interaction. Combining structural modeling and X-ray analyses, they discovered a non-canonical, and previously missed, MLE domain (MLE1) in Rrm4, and characterized the structure of the second MLE domain (MLE2) of Rrm4. Through binding assays, they showed that the three MLE domains exhibit different binding properties, and that MLE3 is the only domain capable of binding to the PAM2 domain of Upa1. Consistent with this finding, functional assays performed in U. maydis revealed that MLE3 is the main domain involved in interaction with endosomes and trafficking, MLE1 and 2 having either no or minor functions in this process.

      The manuscript is very well written, the data are of high quality and clearly presented. A wide range of complementary approaches has been used to molecularly and functionally characterize the different MLE domains of Rrm4. From an "RNA transport" perspective, this manuscript falls short of a main novel finding as the domains characterized in this study (MLE1 and 2) don't have a clear function in connecting mRNAs to the transport machinery. From an "MLE domain" perspective, this work however provides interesting information about non-canonical domains and structures, and about binding and function specificity. As described below, my major concern relates to the role played by the ML2 domain of Rrm4, a role referred to as "accessory" by the authors.

      Major comments:

      The authors conclude from their results that ML2 has an accessory role in promoting association with endosomes.

      1- This conclusion is made based on in vivo experiments showing that a form of Rrm4 lacking the M2 domain, in contrast to wild-type Rrm4, aberrantly attached to MTs in a context where the Rrm4-Upa1 interaction mediated by MLE3Rrm4 has been weakened (Upa1-pl2m). Although the results are convincing, their interpretation is less so. The authors, indeed, claim that the observed phenotype results from "the static accumulation of Rrm4" due to reduced interaction with endosomes. Why then don't they see a decrease in the motility/transport properties of Rrm4-M2Δ in this context? Also, do the authors see a decrease in the co-localization of Rrm4-M2Δ with endosomes (which would be expected if the interaction is decreased)? Can the authors perform IP or co-sedimentation experiments to strengthen their hypothesis?

      This is a fair criticism that was also raised by reviewer 1. In the improved version of the manuscript, we now include important control experiments demonstrating that (i) the aberrant localisation is microtubule-dependent (Fig. EV5F), (ii) the mutations do not cause differences in protein amounts of Rrm4 (Fig. EV5G), (iii) the key findings of the aberrant microtubule staining, which were scored manually in microscopic images, were verified independently by two persons (Fig. EV5H), and (iv) most importantly, Rrm4 signal intensity is decreased in processive signals of our kymograph analysis (Fig. 5E). We firmly believe that this set of experiments strengthens our conclusion that MLLE2 plays an accessory role in the endosomal attachment (Fig. 6).

      2- Whether MLE2Rrm4 mediates interaction with endosomes through association with Upa1 is unclear, as the binding assays performed in Figure 3 test for association of Rrm4 variants with single isolated domains of Upa1, not with the full-length protein. Assessing the binding of Rrm4-M2Δ variants with Upa1-PL2m would help interpreting the phenotypes described in Figure 5.

      Unfortunately, it is difficult to express full-length Upa1 protein in E. coli due to the presence of extended unstructured regions. To overcome this limitation, we performed yeast two-hybrid experiments with full-length proteins of Rrm4 and Upa1. We were able to recapitulate qualitatively the results observed in vitro using the individual domains.

      Notably, the Rrm4 version carrying a deletion in MLLE1 and MLLE2 interacted with Upa1 versions carrying mutations in PAM2L1 or PAM2L2 (Fig. EV3C), suggesting that both MLLE domains of Rrm4 are dispensable for interaction with Upa1. MLLE3 is sufficient to interact with a single PAM2L sequence of Upa1. This suggests the presence of additional interaction partners for MLLE1 and MLLE2 and is entirely consistent with our genetic and cell biological analysis described in Fig. 5.

      Minor comments:

      1- The authors have previously characterized the effect of a C-terminal deletion of Rrm4 on Rrm4 motility and binding to Upa1 (Becht et al., 2006; Pohlmann et al., 2015). How their previously-described construct compares to the Rrm4-M3Δ used in this study is unclear (is it the same?).

      It is the identical mutation to allele rrm4GPD from Becht et al. 2006. We indicate the information in the text “(Fig. 4B-C; mutation identical to allele rrm4GPD in Becht et al., 2006).”

      2- page 6, line 141: refer to Fig. 1B rather than Fig. EV1A ?

      We included the reference to Fig. 1B.

      3- page 10, line 274: "Rrm4-Kat was found"

      We corrected this.

      4- page 11, line 286: "in strains expressing Upa1-PAM2L1", replace by "in strains expressing Upa1 with mutated PAM2L1"?

      We corrected this.

      5- The Figures and accompanying legends are overall very clear and detailed. In Figures EV4A and EV5D-E, it would however help if the authors would indicate on the Figure itself, left to each panel which markers/signals is being analyzed (e.g Rrm4-Kat (top) and Upa1-GFP (down) for Figure EV4).

      We clarified this.

      Reviewer #3 (Significance (Required)):

      Active transport of mRNAs along microtubule tracks has been shown to play a key role in the spatio-temporal control of gene expression in various cell types and species. How specific mRNAs mechanistically connect to molecular motors for their transport to their subcellular destination has however for long remained largely unclear. Recent work, including work from the authors, has uncovered that RNAs can hitchhike on membranous organelles through adapter proteins linking mRNAs and RNA binding proteins with trafficking membrane-bound organelles.

      This study aimed at investigating the structural and molecular bases underlying the interaction between RNA binding proteins and endosomes. While their identification and characterization of the MLE1 and MLE2 domains of Rrm4 did not provide significant new insight into the mechanisms involved in the endosome-mediated transport of mRNAs, it uncovered interesting new properties of MLE domains, including structural variations, selective binding and functional specificity. This work should thus be of interest for structural biologists and researchers interested in protein-protein interaction platforms.

      **Referee Cross-commenting**

      Our comments all converge to the idea that this study is solid as it is and requires only minor revision work to support the authors conclusions. Although characterizing further MLE/PAM2 binding specificity and MLE2 interactors would be of great interest and indeed provide a more complete understanding of interaction networks at play, I feel that this is beyond expected revision work.

    1. Author Response:

      Reviewer #1:

      Hu and colleagues employ computed-tomography methods and provide a detailed description of and inferences about the dental system in three early-diverging ceratopsian dinosaur genera represented by rare specimens from China. Their study identifies nuanced tooth replacement rates and patterns. Furthermore, combined with the analysis of dental wear patterns, their study not only elucidates ontogenetic aspects of these early ceratopsians but also explores the implication of such patterns for dietary adaptations among these taxa. The manuscript, therefore, provides unique insights into the anatomical and ecological contexts of ceratopsians in such deep time.

      The manuscript is rich in data that are summarized in multiple tables and figures. It is also well-written and easy to follow. The inference and conclusions made are also overall well supported by the data presented.

      Thank you for your positive comments!

      The only main comment I have concerns the inference made about the dietary adaptation of Yinlong, which is inferred to be characterized by "feeding strategies other than only grinding food with their teeth." I think that this could be expanded a bit more to incorporate dietary breadth as an additional possible explanation, particularly given the lack of conclusive evidence for the predominance of a single plant species. As it stands, the inference (made across lines 475 through 485) may only imply processing the same food resource using non-chewing methods (e.g., gastroliths to triturate fern). Could the incorporation of other, less abrasive plant foods--in addition to the fibrous ferns--in the diet of Yinlong be a possible, additional explanation for the relatively slow tooth replacement and lack of a heavy tooth wear from chewing-related stress?

      We have expanded the explanation and discussion of feeding strategies by analysing both the environmental conditions and the animal's own features. First, we analysed the flora of the Shishugou Formation and the environment in which Yinlong lived. Its feeding strategy can then be inferred from its body size and tooth characters. The relatively small body length implies that Yinlong likely fed on low-growing plants. The morphology of the dentition, the primitive jaw morphology, and the low tooth replacement rate suggest that Yinlong was unlikely to grind tough foods as derived ceratopsians did. Yinlong possibly had other feeding strategies, such as processing foodstuffs with gastroliths, which have been found in some other dinosaurs. We have added more comparisons with other dinosaurs (i.e., an armoured dinosaur with preserved stomach contents and gastroliths). We suggest that ferns such as Angiopteris, Osmunda, and Coniopteris would have been suitable food choices for Yinlong. Low, tender leaves and other less abrasive plant foods are also possible.

      Reviewer #2:

      The authors of the present work aimed to describe tooth replacement in early ceratopsian species from the Lower Jurassic of China, and with this novel information, discuss new hypotheses of successive changes in jaw evolution that led to the highly specialized replacement and jaw function of derived ceratopsids. Major strengths of this study include not only the use of microCT-scans and 3D reconstructions to address tooth replacement in three different species of early ceratopsians (Yinlong, Hualianceratops, and Chaoyangsaurus), but also the observation of wear development, pulp cavity development, zahnreihen, and z-spacing and replacement rate to compare between taxa and address the succession of mandibular and replacement changes in the phylogeny of ceratopsian dinosaurs. The aims were achieved and the conclusions are strongly supported by the evidence discussed and the cited bibliography. Figures are clear and captions are concise. The presented information gives evidence for the comparison and discussion of the order of acquisition of different craniomandibular adaptations that lead to a specialized herbivorous diet, useful not only for ceratopsians and ornithischians, but also for other lineages of dinosaurs in the Mesozoic, and further for comparing with extant and extinct lineages of mammals. Dinosaurs not only were fantastic creatures from the past but also achieved different morphologic, physiologic, and behavioral traits unknown to any other creature, even mammals. For ceratopsians, the appearance of dental batteries corresponds to a unique trait only functionally similar to that in hadrosaurs and some sauropods, and understanding the steps that led to that specialized structure allows us to also understand the drivers that later guided their diversification during the Late Cretaceous.

      Thank you for your positive comments!

      Reviewer #3:

      The major strengths of the paper are its thorough level of detail, rich dataset, and easy readability. The figures are excellent and clear.

      One shortcoming of the paper is the lack of measurements -- a table of measurement for each functional and replacement tooth's length, mesiodistal width, and linguolabial width should be provided.

      We thank the reviewer for pointing this out. We have provided the total height, maximum mesiodistal width, and maximum labiolingual width of each functional and replacement tooth for all specimens in Table S1. These data help to support our conclusions.

      Unfortunately the manuscript is not publishable in its current form because the conclusions are not testable based on the limited data provided. The authors stated "All data generated or analysed during this study are included in the manuscript and supporting file." This is not true. Only the 3D models derived from segmentations are provided, not the raw scans. Segmentation-derived models are interpretations, akin to publishing a drawing of a fossil instead of a photograph, which is not generally acceptable under today's publishing standards (drawings can be published alongside photographs). Please upload the raw scans to an appropriate repository such as Morphosource, Dryad, or Morphobank. Scans can be cropped to the dentigerous regions only, so long as scaling information is preserved.

      We have added raw micro-CT scans of all scanned specimens (all cropped to the dentigerous regions) in Dryad as .TIF or .BMP file format. The file object details are also provided in a TXT file ‘README_file.txt’ saved in Dryad, at https://doi.org/10.5061/dryad.9ghx3ffk0.

    1. This behavioral data is fed to machine learning systems that provide predictions about what people will do in the future. She documents how surveillance capitalists have gained immense wealth through the trading of “prediction products,” as companies profit from laying accurate bets on people’s future behaviors. These systems tend to reward the privileged while entrapping the underprivileged, whose choices are particularly constrained.

      Indeed. The machine learning system will tend to learn the most from people's initial wealth. I think we could add other factors as inputs (such as educational background, occupation, etc.) to the machine learning systems to weaken the effect of initial wealth.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1: General comments:

      Fujimoto and collaborators use Nanopore-based cDNA sequencing for genome-wide transcriptome analysis of a collection of hepatocellular carcinomas (HCCs) and matched normal liver tissues. To improve detection of alternatively spliced isoforms and hybrid transcripts potentially deriving from genomic rearrangements, they develop a dedicated pipeline SPLICE, which they benchmark against available software used for the same analysis. Besides having dual functionality (calls both alternative transcripts and fused transcripts), SPLICE seems to outperform previous software in calling alternative/fused transcripts and accuracy. They use the SPLICE pipeline to call isoforms and gene fusions in normal liver cells and HCCs and perform basic functional validations on novel fusions identified. The manuscript is well written, and the analyses are well performed. Perhaps the benchmarking of the SPLICE pipeline could have been more extensive (i.e., performed on additional independent datasets).

      Major points: 1. Line 149-150: "We compared the results of mapping to the reference genome and the reference transcriptome sequences, and removed candidates if both were inconsistent (removal of mapping errors). " Please specify what "both were inconsistent" means.

      Our reply; Thank you for this comment. The accuracy of fusion gene detection is influenced by mapping errors. To remove possible mapping errors, SPLICE aligned reads to the reference genome and the reference transcriptome sequences and compared the results. If the results are inconsistent (for example, GeneA-GeneB in the reference genome and GeneA-GeneB in the transcriptome genome, or GeneA-GeneB in the reference genome and GeneA in the transcriptome genome), SPLICE considers the candidates as false positive and removes them from the analysis.

      We changed the sentence “We compared the results of mapping to the reference genome and the reference transcriptome sequences, and removed candidates if both were inconsistent (removal of mapping errors).” to “we compared the results of mapping to the reference genome and the reference transcriptome sequences, and removed candidates if both results did not detect same fusion genes (removal of mapping errors).” (line 150-152).
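The consistency filter described in this reply can be sketched as follows. This is a minimal illustration, not the SPLICE implementation: the function name `keep_consistent_fusions`, the candidate identifiers, and the dictionary layout are all hypothetical, and it assumes each candidate has already been reduced to an ordered tuple of gene names per alignment.

```python
def keep_consistent_fusions(genome_calls, transcriptome_calls):
    """Keep fusion candidates whose two alignments agree.

    Both arguments map a (hypothetical) candidate ID to the ordered tuple
    of genes the candidate was assigned to by that alignment. A candidate
    is kept only if both alignments report the same gene pair.
    """
    kept = {}
    for candidate_id, genome_genes in genome_calls.items():
        transcriptome_genes = transcriptome_calls.get(candidate_id)
        if transcriptome_genes is None:
            continue  # no transcriptome alignment: treat as inconsistent
        if len(set(genome_genes)) < 2:
            continue  # not a fusion of two distinct genes
        if tuple(genome_genes) == tuple(transcriptome_genes):
            kept[candidate_id] = tuple(genome_genes)
    return kept


# Illustrative input only; gene names follow the placeholders in the reply.
genome = {
    "c1": ("GeneA", "GeneB"),
    "c2": ("GeneA", "GeneB"),
    "c3": ("GeneC", "GeneD"),
}
transcriptome = {
    "c1": ("GeneA", "GeneB"),   # same pair: kept
    "c2": ("GeneA",),           # single gene: removed
    "c3": ("GeneC", "GeneE"),   # different partner: removed
}
consistent = keep_consistent_fusions(genome, transcriptome)
```

Requiring exact agreement is the strictest reading of "did not detect same fusion genes"; a real pipeline may tolerate, for example, overlapping gene annotations.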
      
      • Concerning TE-derived novel exons, in principle, this may lead to altered expression of the TE-transcript (as the Authors report for L1-MET) or to altered splicing of the transcript (i.e., other exon/introns could be retained or excluded). Can the Authors assess whether the inclusion of the TE in a transcript enhances its expression or affects the splicing of the "parental" transcript? If so, can they verify if the position of the insertion of the TE has any effect on expression and splicing?

      Our reply; Thank you very much for this important comment. As the reviewer mentioned, exonization of TE may affect the splicing patterns and gene expression levels of transcripts. To determine the effect of TE on expression levels, we compared the expression levels of transcripts with TE-derived novel exons with those of known transcripts of the gene. We found that the expression levels of transcripts with TE-derived novel exon were lower than those of known transcripts (Figure 1 in the reply). Since the same results were observed in all novel transcripts (Fig. 1E,F), most TE exonization would not affect the expression level of transcripts.

      We then analyzed the effect of TEs on splicing by comparing the numbers of novel splicing junctions between transcripts with TE-derived novel exons and other transcripts in each gene. The proportions of genes with novel splicing junctions were not significantly different between the two groups (transcripts with TE-derived novel exons, 9.1%; others, 11.9%) (Figure 2 in the reply). As observed for L1-MET and L2-RHR1, transposons can affect the expression levels and structures of transcripts; however, their effect appears to be limited to a subset of genes.
      

      Figure 1

      Comparison of expression levels of transcripts with TE-derived novel exons and known transcripts. Only transcripts derived from genes with TE-derived novel exons were compared. The total number of transcripts is shown below the plot. Transcript abundance was measured in reads per million reads (RPM), and log10-converted RPM values are shown in the violin plot. P-values were calculated by the Wilcoxon rank-sum test.

      Figure 2

      Comparison of the percentage of novel splicing junctions in transcripts with novel TE-derived exons and in other transcripts. The total number of genes is shown below the plot. Transcripts with TE-derived novel exons and other transcripts were compared. The P-value was calculated by Fisher’s exact test.
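For readers unfamiliar with the test named in this legend, a two-sided Fisher’s exact test on a 2×2 table can be sketched with the standard library alone. The function name is hypothetical, and the table used below is an arbitrary illustration: the reply reports only percentages (9.1% and 11.9%), not the underlying gene counts, so no attempt is made to reproduce the authors' P-value.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def table_probability(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)  # tolerance for floating-point ties
    )
    return min(1.0, p_value)


# Arbitrary illustrative table, not data from the manuscript.
p = fisher_exact_two_sided(8, 2, 1, 5)
```

In the setting of Figure 2, the rows would be the two transcript groups and the columns would count genes with and without novel splicing junctions.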

      • Can the Authors explain why the NBEAL1-RPL12 was not detected by SPLICE?

      Our reply; Thank you for this comment. Although the NBEAL1-RPL12 fusion was detected by SPLICE, the mapping results to the reference genome and the reference transcriptome were inconsistent, so it was removed from the final result. As NBEAL1-RPL12 was not validated by PCR (Supplemental Fig. S4B) (line 183-184), we consider that this fusion gene is a false positive, and that the SPLICE filtering successfully removed it.

      • Line 332: Can the Authors explain how the total amount of HBV mRNA was determined in each sample? Is it a relative amount calculated from the sequencing data? If so, it should be made clear in the text that this is a fractional measure.

      Our reply; Thank you very much for this comment. Expression levels were calculated by log10 converted reads per million reads (log10(RPM)) for each sample. We added the following sentences to the "Expression from HBV" subsection in the Results (line 337-338); “Expression levels were estimated by log10 converted support reads per million reads (log10(RPM)) for each sample.”.
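The normalization described in this reply amounts to a short calculation, sketched below. The function name and the read counts are hypothetical; the reply does not say how zero counts are handled, so this sketch simply rejects them rather than assuming a pseudocount.

```python
import math


def log10_rpm(support_reads, total_reads):
    """Return log10 of reads per million (RPM) for one transcript in one sample.

    RPM = support_reads / total_reads * 1,000,000, so the value is a
    fraction of the library rather than an absolute amount.
    """
    if support_reads <= 0 or total_reads <= 0:
        # Assumption: no pseudocount; log10 is undefined for zero support reads.
        raise ValueError("support_reads and total_reads must be positive")
    rpm = support_reads * 1_000_000 / total_reads
    return math.log10(rpm)


# Hypothetical example: 50 support reads in a library of 2 million reads.
value = log10_rpm(50, 2_000_000)  # RPM = 25
```

Because the denominator is the library size, the measure is relative, which is the point the reviewer asked to have stated explicitly.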

      • Fig4a: please specify if the y-axis "number of support reads" reports library normalized values.

      Our reply; Thank you for this comment. The values on the y-axis are raw read counts. We added the following sentence to the Figure legend (line 348); “Y-axis shows the total number of support reads (raw counts).”.

      • HCCs have more HBV-human genome fusion transcripts than normal liver. Could the authors clarify if these HCC transcripts are selectively found in tumors? or whether they are also expressed in normal liver samples? The paragraph starting from line 356 is confusing, and it is difficult to retrieve the above information for both HBs and HBx fusions.

      Our reply; We apologize for the confusing description. All HBV-human genome fusion transcripts were selectively expressed in tumor or normal liver. We added the following sentence to the "Expression from HBV" subsection in the Results (line 365-366); “All of these HBV-human genome fusion transcripts were selectively expressed in the HCCs and the livers.”.

      • Figure 4C: what was the control used to calculate the relative viability in these analyses?

      Our reply; Thank you for this comment. Fig. 4C shows the number of HBV-human fusion transcripts in the six categories. If this comment refers to Fig. 4H, cell lines transfected with the empty vector (pIRES2-AcGFP1-Nuc) was used as controls. This has been described in the "Gene overexpression" subsection of Methods (line 716-717).

      • MYT1L: the Authors report the identification of a novel MYT1L transcript downregulated in HCC, and argue it may have a potential tumor-suppressive function. For the sake of clarity, it will be advisable to show also the differential expression (HCC vs. Liver) of the other transcripts expressed from the same locus.

      Our reply; Thank you for this important comment. In HCCs and normal livers, only the novel MYT1L transcript was expressed from this locus, and no known transcript of MYT1L was expressed. We changed the sentence “In the MYT1L gene, a highly-conserved novel exon was detected (Fig. 2E), and this transcript was significantly down-regulated in the HCCs” to “In the MYT1L gene, a highly-conserved novel exon was detected (Fig. 2E), and only a transcript with the novel exon was expressed.” (line 471-472).


      Minor points: 1. Table S4: there is a typo, correct “secific” in “specific”

      Our reply; Thank you very much for this comment. We corrected the typo of Table S4.


      Reviewer #2: General comments:

      Summary: This is both a presentation of a pipeline for analysis of Nanopore RNA-seq data, as well as an analysis of a cohort of 44 hepatocellular carcinomas against matched-normal liver tissue. It presents a number of quite intriguing results from the long-read RNA analysis, and suggests potential new targets for study in HCC. It is also worth noting that the current version of guppy (6) has functionality to detect primer sequences in the middle of reads and split those reads, which may obviate one of the steps in SPLICE.

      Major comments:

      1) The work done in this study used data that was basecalled using guppy 3.0.3. Since that version, I am aware of at least two major upgrades to the base caller accuracy, which would likely also improve the accuracy of isoform resolution. Given that the data is relatively low-coverage and that you have an automated workflow for the analysis, I would recommend re-basecalling using an updated basecaller and re-running your analysis using that. This is especially important given your comments in the paper about splice site misalignment.

      Our reply; Thank you very much for this important comment. We basecalled the MCF-7 sequence data using the latest guppy (v6.0.6) and compared the result with that from guppy v3.0.3. We randomly extracted 1 M reads from the MCF-7 reads that passed qscore filtering in the guppy basecaller. The same reads were extracted and basecalled by guppy v3.0.3. These two datasets were analyzed by SPLICE.

      The average error rate was 4.6 % for v6.0.6 and 6.8 % for v3.0.3. The number of transcripts was 9,674 for v6.0.6 and 9,329 for v3.0.3. Of these, the number of novel transcripts was 446 and 410, respectively. The number of fusion genes was 2 (BCAS3-BCAS4, and BCAS3-ATXN7) by v6.0.6 and one (BCAS3-BCAS4) by v3.0.3. As the reviewer mentioned, we found that using the latest version of guppy improved the accuracy and detected a larger number of transcripts.

      We added the results to Supplemental Table S12. We also changed the sentences from “Second, our analysis removed the change of splicing sites within 5 bp to remove alignment errors (Fig. 1B). We consider that this cutoff value is necessary due to currently available high-error reads (Supplemental Data S2). However, sequencing technologies and basecallers are improving, and in the near future, we should be able to use a smaller cutoff value and identify larger numbers of splicing changes.” to “Second, the accuracy of the analysis depends on the sequencing error rate. Although several filters are used for currently available high-error reads (Fig. 1B and Supplemental Fig. S1), sequencing errors would affect the accuracy of the result. Sequencing technologies and basecallers are improving, and in the near future, we should be able to identify larger numbers of splicing changes with high accuracy (Supplemental Table S10).” (line 538-542).

      2) You have compared your software to another tool for isoform analysis on Nanopore sequencing data, TALON. But a number of other tools exist for this purpose, including stringtie2, flair and bambu. My own testing has shown that stringtie2 outperforms TALON in terms of concordance with Illumina RNA-seq. It is quite important that you perform a complete comparison of your software to the state of the art for this purpose.

      Our reply; Thank you very much for this important comment. We compared our tool with four tools (TALON, FLAIR, StringTie, and bambu). For this comparison, we used sequence data of MCF-7 and HCC (RK107C). We randomly extracted 1 M reads from MCF-7 and HCC (RK107C) sequence data using Seqtk (v1.3) (params: sample -s1 1000000). Reads were mapped to the reference genome sequence (hg38) with minimap2 (v2.17) (params: -ax splice --MD), and the output SAM files were converted to BAM files and sorted with samtools (v1.7) (Li et al. 2009).
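The seeded random extraction of a fixed number of reads can be illustrated with reservoir sampling. This is a generic sketch of the idea, not a reimplementation of Seqtk: the function name is hypothetical, the records are stand-ins for reads, and the output will not match what Seqtk selects for the same seed.

```python
import random


def reservoir_sample(records, k, seed=1):
    """Draw k records uniformly at random from an iterable in one pass.

    A fixed seed makes the subsample reproducible, which matters when the
    same subset must be fed to several tools for benchmarking.
    """
    rng = random.Random(seed)
    sample = []
    for index, record in enumerate(records):
        if index < k:
            sample.append(record)  # fill the reservoir first
        else:
            # Replace an existing entry with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                sample[slot] = record
    return sample


# Stand-in "reads": integers instead of FASTQ records.
subset = reservoir_sample(range(1000), 10, seed=1)
```

Reusing the same seed yields the same subset, so every benchmarked tool sees identical input.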

      For benchmarking of TALON (v5.0), we corrected aligned reads with TranscriptClean (v2.0.3) (Wyman and Mortazavi 2018). Next, we ran the talon_label_reads module to flag reads for internal priming (params: --ar 20). The TALON database was initialized by running the talon_initialize_database module (params: --l o --5p 500 --3p 300). Then, we ran the talon module to annotate the reads (params: --cov 0.8 --identity 0.8). To output transcript abundance, we first obtained a whitelist using the talon_filter_transcripts module (params: --maxFracA 0.5 --minCount 5), and then quantified transcripts using the talon_abundance module based on the whitelist. For FLAIR (v1.5), the sorted BAM file was converted to BED12 using bin/bam2Bed12.py. We then corrected misaligned splice sites with the flair-correct module. High-confidence isoforms were defined from the corrected reads using the flair-collapse module (params: -s 3 --generate_map). For benchmarking of StringTie (v2.2.1), StringTie was run with input files consisting of long-read alignment and reference annotation (params: -L -c 3). For benchmarking of bambu (v2.0.0), bambu was run with input files consisting of long-read alignment, reference annotation and reference genome (hg38) (params: min.readCount = 3). Candidates with low expression levels (support reads As a result, SPLICE identified the third-highest number of transcripts followed by FLAIR and StringTie (Supplemental Fig. S3A). In MCF-7 the concordance rate with IsoSeq MCF-7 transcriptome data was the highest in SPLICE for known transcripts and the second highest in SPLICE for novel transcripts (Supplemental Fig. S3B). These results indicate that SPLICE has sufficient accuracy for analyzing transcript aberrations.

      We added the text to the "Comparison of SPLICE method with other tools" subsection of the Results (line 165-177) and the "Benchmarking" subsection of the Methods (line 640-679). We added the results to Supplemental Fig. S3.

      3) Likewise, for fusion detection, you compare to LongGF. You should also compare to (and cite) JAFFAL.

      Our reply; Thank you very much for this important comment. We compared our tool with the two tools (LongGF and JAFFAL). We used 1 M reads randomly extracted from MCF-7 and HCC (RK107C) sequence data as described above.

      For benchmarking of LongGF (v0.1.2), reads were mapped to the reference genome sequence (hg38) with minimap2 (v2.17) (params: -ax splice --MD), and the output SAM files were converted to BAM files and sorted with samtools (v1.7). We then ran the longgf module and obtained the list of fusion genes (params: min-overlap-len 100 bin_size 50 min-map-len 200 pseudogene 0 secondary_alignment 0 min_sup_read 3). For benchmarking of JAFFAL (v2.2), we ran the JAFFAL.groovy module with zipped fastq files.

      In this comparison, close gene pairs (We added the text to the "Comparison of SPLICE method with other tools" subsection in the Results (line 178-186) and the "Benchmarking" subsection in the Methods (line 667-679). We showed the results in Supplemental Fig. 4.
      

      4) In terms of the source code, I have questions. Why did you use BASH to run the Python code, instead of making this into a Python package? Why did you not use the functionality already available in BioPython for a number of basic sequence data handling tasks? Why is there not even a single function defined anywhere, let alone classes?

      At some level, if it works, it works. But I have serious concerns about the long-term maintainability of the code in its current state.

      Our reply; Thank you very much for this critical comment. As the reviewer mentioned, we agree that making a Python package and using BioPython would be better for the long-term maintainability of the code. We have been building our analysis pipeline by trial and error, and at this stage, the current scripts are convenient for us (our group may need to learn software development). We provided a Docker package (see the reply to comment 5)), and this would promote usability.

      5) Also related to the code, it is generally the standard now to create a BioConda package or Docker container for a bioinformatics package. BioConda has the advantage that the BioContainers project automatically generate Docker and Singularity containers from it. Please provide one of these.

      Our reply; Thank you very much for this critical comment. We made a Dockerfile and provide it on our GitHub page. It is available from the "Installation and usage via Docker" section.

      6) There is some quite nice functional validation work done on some of the DE transcripts that would have been hidden in a gene-level analysis. There is also some nice work on detecting HBV fusion genes. These both contain important results which are not mentioned at all in the abstract. I feel like the abstract as it stands is selling the paper short.

      Our reply; Thank you very much for this important comment. We added the following sentences to the abstract; “Comparison of expression levels identified 9,933 differentially expressed transcripts (DETs) in 4,744 genes. Interestingly, 746 genes with DETs, including LINE1-MET transcript, were not found by the gene-level analysis. We also found that fusion transcripts of transposable elements and hepatitis B virus (HBV) were overexpressed in HCCs. In vitro experiments on DETs showed that LINE1-MET and HBV-human transposable elements promoted cell growth.”.

      7) Fig 5C shows a Venn diagram of fusions detected by short-read vs long-read sequencing, in which there is quite low overlap between these. You make the statement in the paper that "a combination of short- and long-reads can detect more fusion genes". I find it more likely that the short-read ICGC data had much greater depth of coverage than the MinION data you produced, which allowed for the detection of fusions that were expressed at much lower levels. This could be easily tested by downsampling the ICGC data to the same amount of sequence data as was generated on the MinION, and re-creating the Venn diagram with the fusions detected that way.

      Our reply; Thank you very much for this very important comment. We compared the amount of data between our long reads and the previous short reads; the amounts did not differ greatly (Supplemental Fig. S14A). Therefore, differences in depth are unlikely to be the cause of the low overlap. We consider that two factors could explain the low overlap. First, most of the fusion genes missed by short reads had very low expression levels, less than 1 read per million reads (RPM) (Supplemental Fig. S14B); many fusion genes are thus expressed at low levels and are difficult to detect. Second, 28.9 % of transcripts in long reads lacked the 5' region (Supplemental Fig. S5 and Supplemental Fig. S14C,D). Therefore, fusion genes whose breakpoints are located in the 5' region were difficult to detect by long reads.

      We added the following sentences to the "Fusion genes" subsection in the Results (line 400-405); “We considered that two possibilities could explain the low overlap. Since the most of the fusion genes missed by short-reads had very low expression levels (Supplemental Fig. S14B), many fusion-genes with low expression levels would be missed by a single approach. In addition, 28.9 % of transcripts in long-reads lacked 5' region (Supplemental Fig. S5 and Supplemental Fig. S14C, D). Therefore fusion-genes whose breakpoints are located in the 5' region would be difficult to detect by long-read.”. We also added a figure on the amount of data to Supplemental Information (Supplemental Fig. S14A).

      8) Figure 5D is very interesting. What do you conclude from that result? Please comment in the manuscript.

      Our reply; Thank you very much for this important comment. We used samples that had been used for whole-genome sequencing in our previous study, so a list of SVs is available. We classified fusion genes into those supported by SVs (SV detected fusion-genes) and others (no SV detected fusion-genes), and compared their expression levels (Figure 5D).

      Whole-genome sequencing can accurately identify clonal (high frequency) SVs, however, would miss sub-clonal (low frequency) SVs. Therefore, we considered that no SV detected fusion-genes were generated by sub-clonal SVs. This result suggests that there are a lot of sub-clonal fusion genes, and their expression levels are lower than clonal fusion genes. Although the functional importance of sub-clonal fusion genes is currently unknown, deeper RNA sequencing would detect a larger number of fusion genes.

                We added the following sentences to the “Fusion genes” subsection in the Results (line 410-412); “This result suggests that there are a lot of sub-clonal fusion genes, and their expression levels are lower than clonal fusion genes. Although the functional importance of sub-clonal fusion genes is currently unknown, deeper RNA sequencing would detect a larger number of fusion genes.”.
      

      *Minor comments:

      1) The manuscript has many small errors in English grammar, spelling and style. I would strongly recommend sending it for copy editing before submitting it to a journal.*

      Our reply; Thank you very much for this comment. Due to time limitations, the current version has not been proofread by a native English speaker. We plan to have the English reviewed by a native English speaker.

      2) Neither the results section nor the methods section describing the sequencing that was performed specify whether it was done on a MinION or PromethION (or flongle). While this is implied elsewhere in the paper, it should definitely be specified in the methods at a minimum.

      Our reply; Thank you for this comment. We used a MinION for sequencing. We added the following sentences to the Method section (line 579-580); “Libraries were sequenced on a SpotON FlowCell MKⅠ(R9.4) (Oxford Nanopore), using the MinION sequencer (Oxford Nanopore)”.

      3) You also write in the introduction that your method, SPLICE, was developed for the MinION specifically. Please comment on its applicability to data generated on the PromethION and flongle Nanopore sequencers.

      Our reply; Thank you very much for this comment. We consider that our method is applicable to data from MinION, PromethION, and flongle. We added the following sentence to the Methods section (line 592-593); “In the present study, we analyzed sequence data from MinION. We consider that our method is applicable to data from MinION, PromethION, and flongle.”.

      4) The volcano plot in Fig 3A is missing its dots.

      Our reply; Thank you very much for this comment. We modified Fig. 3A.

      *Reviewer #3: General comments:

      Summary: In this manuscript, Kiyose et al have developed and tested a novel methodology for identifying splicing alterations, and fusions, from full-length transcript or long read sequencing data. They apply this approach to liver cancer and paired, non-cancerous liver tissue from a prior publication, and use wet-lab/experimental methods to validate their in silico findings. They conclude that their new methodology, SPLICE, outperforms one existing method, and is uniquely suitable to identifying fusion genes.*

      Major Comments: 1) Figure 1B shows a schematic of common error patterns from MinION cDNA sequencing, and the text of the manuscript describes how the authors' new approach (SPLICE), overcomes several of these, e.g. sequencing errors, artificial chimeras, and mapping errors of highly homologous genes. However, there is a fundamental disconnect between the text and the graphic in Figure 1B. This should either be revised for clarity, or an additional graphic or flowchart placed in the supplementary materials to clearly show *how* SPLICE overcomes each of these limitations.

      Our reply; We apologize for the insufficient explanation in Figure 1. We showed a detailed explanation of the data analysis procedure in Supplemental Fig. S1.

      2) Why was TALON the only alternative approach chosen for validation of SPLICE performance? There are a number of other, more advanced pipelines such as SUPPA2, and IsoformSwitchAnalyzeR. It would strengthen the manuscript, and its conclusions, to incorporate at least one of these methods as a second comparator. This is particularly true for IsoformSwitchAnalyzeR, since Kiyose et al identify a number of differentially expressed transcripts (DETs) for genes that are not differentially expressed.

      Our reply; Thank you very much for this important comment. Another reviewer also requested additional benchmarking; therefore, we performed an additional performance comparison for the revised manuscript. Because SUPPA2 and IsoformSwitchAnalyzeR are used to analyze annotated output GTF files, direct comparison with SPLICE is difficult. Since IsoformSwitchAnalyzeR recommends StringTie as annotation software, we compared against StringTie instead.

      We compared the performance of SPLICE with that of four other methods (TALON, FLAIR, StringTie and Bambu) for splicing variant detection. SPLICE identified the third-highest number of transcripts followed by FLAIR and StringTie (Supplemental Fig. S3A). For MCF-7, the concordance rate with IsoSeq MCF-7 transcriptome data was the highest for SPLICE among known transcripts and the second highest for SPLICE among novel transcripts (Supplemental Fig. S3B).

      We added the text to the "Comparison of SPLICE method with other tools" subsection of the Results (lines 165-177) and the "Benchmarking" subsection of the Methods (lines 640-665). We added the results to Supplemental Fig. S3.

      3) The Venn diagram in Figure 5C appears to show that conventional short read sequencing identifies 46 fusion genes that are not also detected by long read sequencing. However, this result, and its implications are never addressed in the text.

      Our reply; Thank you very much for this important comment. We apologize for the insufficient explanation. We considered two possible explanations for the low overlap. First, most of the fusion genes missed by short-read sequencing had very low expression levels, less than 1 read per million reads (RPM) (Supplemental Fig. S14B); fusion genes with such low expression levels are numerous and difficult to detect. Second, 28.9% of transcripts in the long-read data lacked the 5' region (Supplemental Fig. S5 and Supplemental Fig. S14C,D). Therefore, fusion genes whose breakpoints are located in the 5' region were difficult to detect by long-read sequencing.

                We added the following sentences to the "Fusion genes" subsection in the Results (line 400-405); “We considered that two possibilities could explain the low overlap. The most of the fusion genes missed by short-reads had very low expression levels (Supplemental Fig. S14B). This result suggests that there are many missed fusion-genes with low expression levels. In addition, 28.9 % of transcripts in long-reads lacked 5' region (Supplemental Fig. S5 and Supplemental Fig. S14C, D). Therefore fusion-genes whose breakpoints are located in the 5' region would be difficult to detect by long-read.”. We also added a figure on the amount of data to Supplemental Information (Supplemental Fig. S14A).
      

      Minor Comments: 1) On pages 20-21, the language used to describe the HBV and/or HCV positive vs negative materials is very confusing. Please clarify that by "HBV- and HCV-related tissues" you in fact mean "HBV- and HCV-infected samples."

      Our reply; We apologize for the confusing wording. We changed "HBV and HCV-related tissues" to "HBV and HCV-infected samples" in the manuscript.

    1. There is no one way of practicing CSP — this would go against the very idea of sustaining students’ cultures! — but there are ways to understand what a CSP approach may require from a teacher

      I believe multilingualism and multiculturalism define today's societies. Being able to speak more than one language is a necessity, since there is so much language contact around us. I believe that as teachers we have to recognize, respect, and protect the different cultures present in our classroom. Some examples that I can think of are: reading a legend or myth from different cultures, learning about a holiday from different countries, and having students share with each other their country's traditional food, games, music, etc. Finally, I believe CSP practices are about creating a welcoming and safe space for all students.

    1. As we may think

      Consider a future device... in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with extraordinary speed and flexibility. It is an enlarged intimate supplement to his memory.

    1. The spread of misinformation online is a global problem that requires global solutions. To that end, we conducted an experiment in 16 countries across 6 continents (N = 33,480) to investigate predictors of susceptibility to misinformation and interventions to combat misinformation. In every country, participants with a more analytic cognitive style and stronger accuracy-related motivations were better at discerning truth from falsehood; valuing democracy was also associated with greater truth discernment whereas political conservatism was negatively associated with truth discernment in most countries. Subtly prompting people to think about accuracy was broadly effective at improving the veracity of news that people were willing to share, as were minimal digital literacy tips. Finally, crowdsourced accuracy evaluation was able to differentiate true from false headlines with high accuracy in all countries. The consistent patterns we observe suggest that the psychological factors underlying the misinformation challenge are similar across the globe, and that similar solutions may be broadly effective.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      Overall we were elated to have received such positive comments on the manuscript, with requests for only minor changes. We have made all suggested changes to clarify or tone down the language as suggested.

      We would like to thank each of the three reviewers for their assessment of our work. We note that all three reviewers agreed the phylogenetic analysis was interesting and convincing. Two of the three reviewers felt the study sufficiently demonstrated roles for Baramicin in the nervous system. We have responded to comments from Reviewer 2 to draw attention to some aspects of the data that may have been overlooked, which we hope reassures them that our proposal of BaraB and BaraC involvement in the nervous system is robust, coming from different approaches that show consistent results.

      Reviewer 1 and Reviewer 3 compliment the study as being very worthwhile, and for suggesting concrete routes for how an AMP evolved non-immune functions. Both compliment its comprehensiveness, and describe the study as having striking findings that should have broad appeal to audiences interested in the crosstalk between the nervous system and the innate immune system.

      2. Point-by-point description of the revisions

      In the revised manuscript file, we have highlighted all text where changes were made.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors provide convincing evidence for an evolutionary scenario in which duplications of an AMP gene with ancestral immune function led to paralogs specialised for neural functions. They focus on the Baramicin genes, coding for major Toll signalling targets in the context of antifungal defence. Their study uses infection experiments in several Drosophila species, a careful annotation of the Baramicin genes of D. melanogaster, the demonstration of neural expression of BaraB and BaraC, the KD analysis of BaraB revealing lethality and neurological phenotypes, a reconstruction of the evolutionary history of Baramicin genes in Drosophilids and an analysis of the sequence evolution of the IM24 domain providing the neural functions. In general the paper is well written. There are a few places in the manuscript where the language can be improved and one point which needs clarification:

      - line 297: ...,which did not present with...

      - line 314/315: ...to just 14% that of...to 63% that of

      - line 459: ..., we this motif...

      - line 518: What does "... genomic relatedness (by speciation and locus)..." mean?

      - line 527/528: ...drive behaviour or disease through interactions...

      - line 532: ... ancestrally encodes distinct peptides involved with either the nervous system or the immune response... line 535: ...with either the nervous system (IM24) or.... Do the data provide enough evidence suggesting that IM24 had a neural function in the ancestor? Ideally the authors should look at neural expression of the Baramicin gene in the outgroup, S. lebanonensis. The authors later (line 571) admit that they cannot rule out that IM24 is also antimicrobial.

      We thank reviewer #1 for drawing attention to these points. We have made changes to each line to be more concise, clarify our meaning, or fix typos.

      Reviewer #1 (Significance (Required)):

      This is a very comprehensive study, which, to my knowledge for the first time, suggests concrete routes of how an AMP evolved non-immune functions. One of the striking findings of this paper is that duplications and subsequent truncations of the ancestral Baramicin locus linked to specialisation for neural functions occurred independently in different Drosophila lineages.

      We thank reviewer #1 for their very positive comments. We also agree with all suggested changes, including more careful phrasing to emphasize that we have not described a mechanism, just an involvement in the nervous system. For instance, lines 556-568 are reworked to soften the language and explicitly state that the ancestral function of IM24 is unknown, and that our suggestion that IM24 could underlie Dmel\BaraA interactions with the nervous system is speculation that should be tested.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Hanson and Lemaitre present a genomic and phylogenetic characterization of the Baramicin family of antimicrobial peptide genes in different species. They discover new Baramicin paralogs, united by the presence of an IM24 domain at the N-terminus. They show that among Baramicins, those that are not inducible by infection (which they improperly call non-immune since a protein can be non-inducible by infection and have very important immune functions), are truncated. They propose that an ancestor peptide with immune functions evolved into a neuronal regulator/effector via truncation.

      Although the hypothesis is interesting, the data do not really support it. This manuscript is rather descriptive at this point. The demonstration that IM24 is necessary for neural function is very tenuous. For example, in the paragraphs titled Dmel\BaraB is required in the nervous system during development and Baramicin B plays an important role in the nervous system, I did not find convincing data demonstrating that BaraB is required in the nervous system. The only data that links BaraB to the nervous system is a weak locomotion defect observed in the BaraB mutant. But how many genes, when inactivated, give a locomotion defect? This remains totally unexplained at the molecular level. The authors also mentioned that BaraB is expressed in a subset of mechanosensory neuron cells in the wing. What is the link between this expression and the nubbin phenotype? The authors also mention that data in the literature indicate that BaraC is expressed in glial cells but also in other tissues. Finally, we have no idea what role, if any, these peptides have in the nervous system.

      While the characterization of the Baramicin gene family and its evolution across species is convincing, the link between these AMPs and the nervous system is really too preliminary to be convincing. The manuscript would greatly benefit from being more concise.

      Reviewer #2 (Significance (Required)):

      see above

      We thank reviewer #2 for their fair assessment. We have made edits to soften our phrasing, and to emphasize that we have not described a mechanism, just an involvement, in the nervous system.

      Examples:

      line 270: “integral development role” -> “important for development”

      line 277: “Baramicin B plays an important role in the nervous system“ -> “Baramicin B suppression in the nervous system mimics mutant phenotypes”

      line 532: “Here we demonstrate that the Baramicin antimicrobial peptide gene of Drosophila ancestrally encodes distinct peptides involved with either the nervous system or the immune response.“ -> “Here we demonstrate that the Baramicin antimicrobial peptide gene of Drosophila ancestrally encodes distinct peptides that may interact with either the nervous system (IM24) or invading pathogens (IM10-like, IM22).”

      line 562 new text: “Thus while our results suggest that IM24 of different Baramicin genes might underlie Baramicin interactions with the nervous system, we cannot exclude the possibility that IM24 is also antimicrobial, or even that antimicrobial activity is IM24’s ancestral purpose. Future studies could use tagged IM24 transgenes or synthetic peptides to determine the host binding partner(s) of secreted IM24 from the immune-induced Dmel\BaraA, and/or to see if IM24 binds to microbial membranes.”

      We have also changed all instances of “non-immune Baramicins” to “Baramicins lacking immune induction” or something to that effect (e.g. new lines 25, 464, 469, 478-82).

      We also made some small changes to be more concise (e.g. line 387, 447, cut lines 492-495 from previous version, cut lines 506-507 from previous version).

      We have responded below in the reviewer-to-reviewer comments for a few of the specific points raised there, which we hope further assuage some of Reviewer 2’s concerns.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Antimicrobial peptides are main effectors in (insect) immune defenses. It is becoming more and more clear that AMPs can have pleiotropic effects or even acquire new functions. In the present paper, the authors investigate Baramicin, an antifungal AMP that they first described in a publication last year. Here they show that in Drosophila melanogaster Baramicin A, which they described before, has paralogs that are not immune-inducible. They then show that these paralogs, named BarB and BarC, which are truncated versions of BarA, are expressed in the head and neural tissues. That they have neural functions is supported by targeted gene-silencing experiments. They go on to show, using a comparative approach across Drosophila, that Baramicin A with its antimicrobial function constitutes the ancestral state. Moreover, Baramicin is also enriched in head samples of some of the other Drosophila species they study. This manuscript, which according to the acknowledgements has already been seen by reviewers, is in very good shape.

      I have only a number of minor points, that might help to clarify the presentation.

      Lines 34-36: I would delete this sentence and replace it with a statement based on the main findings of the manuscript

      We now conclude the abstract with “As many AMP genes encode polypeptides, a full understanding of how immune effectors interact with the nervous system will require consideration of all their peptide products.”

      Lines 56-60. Maybe tone down a bit. Anti-inflammatory activities of AMPs have been known for a long time. I think the next paragraph makes a very good case for what is already known and is hence a nice motivation for the current study.

      Toned down. This part now reads: “However AMPs and AMP-like genes in many species have recently been implicated in non-immune roles in flies, nematodes, and humans, suggesting non-immune functions might help explain AMP evolutionary patterns.”

      Line 125: classical instead of classically

      done

      Line 200: what is a 'novel' time course? I would just describe what has been done.

      Now reads: “We next measured Baramicin expression over development from egg to adult.”

      Line 268: hypomorph, I guess in the literature usually hypomorphic is used.

      done

      Line 279: I would suggest to tone this headline down. This is not a criticism of the paper, but the actual mechanisms of the roles in the nervous system are not studied here.

      Done. Now reads: “Baramicin B suppression in the nervous system mimics mutant phenotypes”

      Line 505: what does not really become clear is whether IM24 plays an important role in the nervous system of fly species that only have BarA.

      Edits from lines 556-568 now help highlight this question.

      Line 540-549. This comparison I find a bit far-fetched, or maybe it needs clarification how doublesex expression is related to Baramicins.

      Being completely honest: the doublesex discussion was requested during previous review at another journal. We agree that it is a bit of a tangent, and so we have removed these sentences.

      Line 584-585. I think that this has been known for much longer from studies in frogs and beetles.

      Our use of “in vivo” might have been a bit squishy here. We have edited this to reflect endogenous loss-of-function study, rather than simply “in vivo,” to clarify our intended sentiment.

      Reviewer #3 (Significance (Required)):

      Overall, I think that this is a very worthwhile and convincing story about the evolution of AMPs and how they can acquire new functions. All the main statements are supported by careful experiments and data analysis. The paper does not go into any detail of how the neurological role of BarB and BarC is achieved, but I think this is beyond the scope of the current manuscript. In short, this is a very worthwhile contribution to the growing literature on the role of AMPs in the nervous system. The authors provide the context of the main published papers in the area in the introduction. As opposed to most papers on this so far, the current manuscript also provides very interesting data on the evolutionary history of the Baramicin genes, both within the main study species, and within other Drosophila species. This paper should appeal to a rather broad audience of researchers interested in innate defenses, AMPs and the crosstalk between the nervous system and the innate immune system.

      My background is insect immunology with a focus on AMPs and an evolutionary approach.

      We thank reviewer #3 for their very positive comments. We agree with all suggested changes.

      **Referees cross-commenting**

      This section contains the comments of all reviewers

      Reviewer 3

      Reviewer 2 and I share the view, that the evidence for the effects of BarB and C on the nervous system is rather limited. But I still think, that the paper provides enough new and interesting data that make it a very useful contribution. Though not a neurobiologist, I would assume that providing functional evidence for the role of BarA and B in the nervous system would justify a paper on its own. I agree though, that the relevant sections should be toned down.

      Reviewer 2

      As I mentioned in my review, I found the genomic and phylogenetic analysis interesting and convincing. I therefore totally agree with reviewers 2 and 3 on that. Whether BarA and B play a role in the nervous system, and how, remains speculative. BaraB mutants show locomotion defects. But mutants in mitochondrial genes have locomotion defects. Can we conclude that mitochondria play a role in the nervous system? If I understand correctly, downregulating Bara in neurons only (with the Elav-Gal4 driver) does not show the locomotion phenotype; it induces early lethality. How many genes when inactivated in neurons will give rise to such a phenotype? A lot. I really think that the implication of Bara in the nervous system should be seriously toned down and presented more as a hypothesis than a validated fact.

      We would like to note for Reviewer 2 here that it is specifically elav>BaraB-IR that results in lethality, and in weaker gene silencing experiments, adult elav>BaraB-IR flies emerge, and they do suffer locomotor defects. Often, they got stuck in the food shortly after emerging, or would move haphazardly (which was common in flies with nubbin-like wings). We have added explicit mention that elav>BaraB-IR also results in locomotor defects (lines 288-289).

      Our private speculation is that the reason flies fail to emerge from their pupae is because they are so uncoordinated that they sometimes cannot wriggle out of the pupal case before their cuticle hardens. In some instances, both using mutants and RNAi, we observed fully developed adults with mature abdominal pigmentation that died trapped inside their pupal cases.

      We’d also like to emphasize here that despite testing many other Gal4 drivers, including mef2-Gal4 (muscle/myocytes), nubbin-like wings and lethality were only found using elav-Gal4. A role interacting with mitochondria would likely have been revealed using mef2-Gal4, given the importance of mitochondrial function in muscle.

      For BaraC: expression in other tissues (like the rectal pad) could nevertheless be from e.g. nerves innervating the muscles controlling the sphincter. Or it could indeed be entirely unrelated to the nervous system. However we feel the nearly perfect overlap with Repo-expressing cells is a strong argument for a neural role. We also made an effort using RNAi to validate this pattern suggested by scRNAseq, which confirmed a strong knockdown of BaraC-IR with Repo-Gal4 (Fig. 3, Fig. S4).

      We hope these comments clarify for Reviewer 2 why we feel confident in proposing a role for Baramicins in the nervous system, even if we do not investigate a mechanism in this study.

      Reviewer 1

      I agree with reviewer 3 that the main message of the paper, providing a concrete scenario of how non-immune functions of AMPs may evolve, is an important contribution. A deep investigation of the neural function definitely goes beyond the scope of the paper. Indeed this might be quite tricky. But it would help if the authors could clarify their idea about the ancestral condition. Is there the possibility that IM24 already had a non-immune function ancestrally? They are not really clear about this point.

      Reviewer 2

      I agree with the other reviewers that determining the exact role of Bara peptides could be complicated. I just ask that the authors limit themselves to proposing that the peptides have lost their immune function. I stress that this argument is not very strong. It relies solely on the lack of inducibility of these peptides following infection. I still think that the demonstration of the role of Bara in the nervous system is not provided.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Antimicrobial peptides are main effectors in (insect) immune defenses. It is becoming more and more clear that AMPs can have pleiotropic effects or even acquire new functions. In the present paper, the authors investigate Baramicin, an antifungal AMP that they first described in a publication last year. Here they show that in Drosophila melanogaster Baramicin A, which they described before, has paralogs that are not immune-inducible. They then show that these paralogs, named BarB and BarC, which are truncated versions of BarA, are expressed in the head and neural tissues. That they have neural functions is supported by targeted gene-silencing experiments. They go on to show, using a comparative approach across Drosophila, that Baramicin A with its antimicrobial function constitutes the ancestral state. Moreover, Baramicin is also enriched in head samples of some of the other Drosophila species they study. This manuscript, which according to the acknowledgements has already been seen by reviewers, is in very good shape.

      I have only a number of minor points, that might help to clarify the presentation.

      Lines 34-36: I would delete this sentence and replace it with a statement based on the main findings of the manuscript

      Lines 56-60. Maybe tone down a bit. Anti-inflammatory activities of AMPs have been known for a long time. I think the next paragraph makes a very good case for what is already known and is hence a nice motivation for the current study.

      Line 125: classical instead of classically

      Line 200: what is a 'novel' time course? I would just describe what has been done.

      Line 268: hypomorph, I guess in the literature usually hypomorphic is used.

      Line 279: I would suggest to tone this headline down. This is not a criticism of the paper, but the actual mechanisms of the roles in the nervous system are not studied here.

      Line 505: what does not really become clear is whether IM24 plays an important role in the nervous system of fly species that only have BarA.

      Line 540-549. This comparison I find a bit far-fetched, or maybe it needs clarification how doublesex expression is related to Baramicins.

      Line 584-585. I think that this has been known for much longer from studies in frogs and beetles.

      Significance

      Overall, I think that this is a very worthwhile and convincing story about the evolution of AMPs and how they can acquire new functions. All the main statements are supported by careful experiments and data analysis. The paper does not go into any detail of how the neurological role of BarB and BarC is achieved, but I think this is beyond the scope of the current manuscript.

      In short, this is a very worthwhile contribution to the growing literature on the role of AMPs in the nervous system. The authors provide the context of the main published papers in the area in the introduction. As opposed to most papers on this so far, the current manuscript also provides very interesting data on the evolutionary history of the Baramicin genes, both within the main study species, and within other Drosophila species.

      This paper should appeal to a rather broad audience of researchers interested in innate defenses, AMPs and the crosstalk between the nervous system and the innate immune system.

      My background is insect immunology with a focus on AMPs and an evolutionary approach.

      Referees cross-commenting

      This section contains the comments of all reviewers

      Reviewer 3

      Reviewer 2 and I share the view, that the evidence for the effects of BarB and C on the nervous system is rather limited. But I still think, that the paper provides enough new and interesting data that make it a very useful contribution. Though not a neurobiologist, I would assume that providing functional evidence for the role of BarA and B in the nervous system would justify a paper on its own. I agree though, that the relevant sections should be toned down.

      Reviewer 2

      As I mentioned in my review, I found the genomic and phylogenetic analysis interesting and convincing. I therefore totally agree with reviewers 2 and 3 on that. Whether BarA and B play a role in the nervous system, and how, remains speculative. BaraB mutants show locomotion defects. But mutants in mitochondrial genes have locomotion defects. Can we conclude that mitochondria play a role in the nervous system? If I understand correctly, downregulating Bara in neurons only (with the Elav-Gal4 driver) does not show the locomotion phenotype; it induces early lethality. How many genes when inactivated in neurons will give rise to such a phenotype? A lot. I really think that the implication of Bara in the nervous system should be seriously toned down and presented more as a hypothesis than a validated fact.

      Reviewer 1

      I agree with reviewer 3 that the main message of the paper, providing a concrete scenario of how non-immune functions of AMPs may evolve, is an important contribution. A deep investigation of the neural function definitely goes beyond the scope of the paper. Indeed this might be quite tricky. But it would help if the authors could clarify their idea about the ancestral condition. Is there the possibility that IM24 already had a non-immune function ancestrally? They are not really clear about this point.

      Reviewer 2

      I agree with the other reviewers that determining the exact role of Bara peptides could be complicated. I just ask that the authors limit themselves to proposing that the peptides have lost their immune function. I stress that this argument is not very strong. It relies solely on the lack of inducibility of these peptides following infection. I still think that the demonstration of the role of Bara in the nervous system is not provided.

    1. Before we start talking about how to choose search terms and where to search for sources, it can help to get a sense of what we’re hoping to get out of the research. We might think that in order to support a thesis we should only look for sources that prove an idea we want to promote. But since writing academic papers is about joining a conversation, what we really need is to gather the sources that will help us situate our ideas within that ongoing conversation. What we should look for first is not support but the conversation itself: who is saying what about our topic? The sources that make up the conversation may have various kinds of points to make and ultimately may play very different roles in our paper. After all, as we have seen in Chapter 2, an argument can involve not just evidence for a claim but limits, counterarguments, and rebuttals. Sometimes we will want to cite a research finding that provides strong evidence for a point; at other times, we will summarize someone else’s ideas in order to explain how our own opinion differs or to note how someone else’s concept applies to a new situation.  As you find sources on a topic, look for points of connection, similarity and difference between them. In your paper, you will need to show not just what each one says, but how they relate to each other in a conversation.  Describing this conversation can be the springboard for your own original point.

      Arguments involve not only evidence for a claim but also limits, counterarguments, and rebuttals.

    1. First I menaced thee with a feigned one, and hurt thee not for the covenant that we made in the first night, and which thou didst hold truly. All the gain didst thou give me as a true man should. The other feint I proffered thee for the morrow: my fair wife kissed thee, and thou didst give me her kisses–for both those days I gave thee two blows without scathe–true man, true return. But the third time thou didst fail, and therefore hadst thou that blow. For ’tis my weed thou wearest, that same woven girdle, my own wife wrought it, that do I wot for sooth. Now know I well thy kisses, and thy conversation, and the wooing of my wife, for ’twas mine own doing. I sent her to try thee, and in sooth I think thou art the most faultless knight that ever trode earth. As a pearl among white peas is of more worth than they, so is Gawain, i’ faith, by other knights. But thou didst lack a little, Sir Knight, and wast wanting in loyalty, yet that was for no evil work, nor for wooing neither, but because thou lovedst thy life–therefore I blame thee the less.”

      The Green Knight is informing Gawain that none of the strikes were due to the covenant. Instead, he explains that he pretended to strike Gawain the first two times because "Gawain gave him the gifts he received from the lady" (Sparknotes summary part 4 page 1). He then goes on to say that he hurt Gawain on the third strike because Gawain was dishonest about the girdle from Bertilak's wife. However, the Green Knight adds on that he did not kill Gawain because Gawain valued his life, which the Green Knight understood.

      "Sir Gawain and the Green Knight," Sparknotes.com. www.sparknotes.com/lit/gawain/section4/ <accessed 18 May 2022>

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for carefully reading our manuscript. We found their comments to be incredibly thoughtful and constructive and greatly appreciate their feedback. We are confident that addressing the reviewers’ concerns has strengthened our manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Camuglia, Chanet and Martin investigate the mechanisms that control cell division orientation in vivo, using the mitotic domains (MDs) in the head of the Drosophila embryo as their main model system. They find that cells in the head mitotic domains rotate and align their spindles within 30 degrees of the anterior-posterior axis of the embryo. The Pins protein, implicated in spindle orientation in other systems, is planar polarized in mitotic cells. Pins polarization precedes spindle rotation and is correlated with the division angle (but cell shape is not, violating Hertwig's rule). Overexpression of myristoylated Pins results in uniform Pins distribution on the membrane and affects spindle orientation. alpha-catenin RNAi (but not canoe RNAi) disrupts Pins polarity and spindle orientation in MDs 1, 3 and 5. Low dose CytoD injections (which should disrupt force transmission) also result in defective Pins polarity and spindle orientations. Finally, mechanical isolation by laser ablation also disrupts spindle orientation. The authors find that preventing mesoderm invagination by snail dsRNA disrupts Pins polarity and spindle orientation in the head.

      MAJOR 1. Is there a certain chirality in the rotation of the spindles? From Movie 1, it seems like in MDs 1 and 3 at least, a majority of spindles on the right side of the embryo rotate clockwise, while spindles on the left side rotate counter-clockwise? Is that so, and in that case, are there geometric/molecular considerations that could explain that chirality?

      We thank the reviewer for pointing this out. They are correct in that there is a tilt to the spindle orientation relative to the AP axis. To illustrate this tilt, we performed our spindle analysis separately on the right and left sides of MD1 and found that spindles on the left side align with an average division angle of about 30° from the AP axis, whereas spindles on the right side align with an average division angle of about -30° from the AP axis. When we tested whether spindles on either side rotated with a certain chirality, we found no preference for rotating clockwise or counterclockwise on the left and right sides (on the left side of MD1, 53% of measured spindles rotated counterclockwise and 47% rotated clockwise; on the right side, 46% rotated counterclockwise and 54% clockwise). We have added these data as Fig. 1I-J and discussed them in the Results (lines 134-145).
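
      As an illustration only, and not the authors' analysis code, a division angle relative to the AP axis could be computed along the following lines. The sketch assumes that spindle poles are given as (x, y) coordinates, that the AP axis points along +y, and that y increases upward; in image coordinates where y increases downward the sign of the angle flips. The function name and arguments are hypothetical.

```python
import math


def division_angle_deg(pole_a, pole_b, ap_axis=(0.0, 1.0)):
    """Signed angle (degrees) between a spindle and the AP axis.

    The spindle is treated as an undirected axis, so the result is folded
    into the interval (-90, 90]. Positive values mean the spindle is tilted
    counterclockwise from the AP axis (assuming y increases upward).
    """
    sx, sy = pole_b[0] - pole_a[0], pole_b[1] - pole_a[1]
    ax, ay = ap_axis
    # atan2(cross, dot) gives the signed angle from the AP axis to the spindle.
    angle = math.degrees(math.atan2(ax * sy - ay * sx, ax * sx + ay * sy))
    # A spindle has no head or tail: angles 180 degrees apart are equivalent.
    if angle > 90.0:
        angle -= 180.0
    elif angle <= -90.0:
        angle += 180.0
    return angle
```

      Folding the angle reflects the fact that swapping the two poles describes the same division axis, which is why averages such as +30° and -30° on the two sides of a domain remain interpretable.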

      1. The authors are experts in mesoderm invagination, and understandably concentrate on the role that forces from that process may have in the orientation of head MD divisions. However, the cephalic furrow forms much closer to the head MDs, and in an orientation that might also explain the alignment of spindles in the head. Is cephalic furrow formation important for Pins polarity and spindle orientation in the head MDs?

      This was certainly a possibility, but our experimental results strongly argue that mesoderm invagination is most relevant.

      1) Perturbing the ventral furrow (e.g. by Snail depletion) does not block the cephalic furrow (Vincent et al., 1997; Leptin and Grunewald, 1990), but does block mesoderm invagination. Snail depletion strikingly disrupted spindle orientation and Pins localization, which suggests mesoderm invagination is most important.

      2) In addition, depletion of α-catenin blocks ventral furrow invagination but not cephalic furrow formation. We see a disruption in spindle orientation and Pins localization in α-catenin RNAi, which suggests the cephalic furrow itself cannot orient spindles.

      3) Furthermore, light sheet imaging of the Drosophila embryo has shown that the head region of the embryo undergoes tissue movement in the direction of the cell division and that this is associated with mesoderm invagination (Streichan et al., 2018; Stern et al., 2022).

      See movies here: https://www.youtube.com/watch?v=kC11Upr30JY

      To further test the importance of mesoderm invagination, we will perform additional ablation experiments trying to disrupt forces transmitted to the mitotic domains from distinct directions. Once we get this experimental result we will include language in the Discussion that will summarize the experimental results and the weight of the evidence for the roles of either ventral or cephalic furrow.

      1. Does expression of myristoylated Pins affect mesoderm invagination (or cephalic furrow formation)? From Table S1 it seems that a maternal Gal4 driver was used to express myristoylated Pins, which could affect other tissues in the embryo. So it is in principle possible that effects of myristoylated Pins on mesoderm internalization/cephalic furrow formation could affect cell division orientation much like sna loss of function does, but in a mechanism that does not depend on Pins polarity. There is definitely an effect on mesoderm invagination in alpha-catenin RNAi (but not in canoe RNAi) embryos, so I wonder if the effect could be consistently through defects in mesoderm invagination (or cephalic furrow formation), and Pins polarity is really dispensable for spindle orientation. Are there head-specific Gal4 drivers that could be used to drive myristoylated Pins exclusively in the head?

      We apologize that we did not clarify this in the text. Maternal overexpression of myr-Pins does not obviously disrupt mesoderm internalization/cephalic furrow formation. But, we do see that targeted disruption of mesoderm internalization via a Snail depletion affects the orientation of division. Note that our paper demonstrates the effect of force transmission on Pins polarity and division orientation, which is new and the main conclusion. The role of these divisions in morphogenesis is more complicated and is beyond the scope of this study.

      In response to this comment we: 1) added language in the Results that states that gastrulation proceeds in myr-Pins expressing embryos (lines 206-208), 2) added to the Discussion of the role of these oriented divisions in morphogenesis (lines 443-449), and 3) will add a figure showing ventral furrow and cephalic furrow formation in embryos ectopically expressing myr-Pins.

      1. Related to the previous point, does mechanical isolation by laser ablation (Figure 6I-N) affect Pins polarity? This experiment could alleviate some of my concerns above, as it certainly does not (should not?) disrupt either mesoderm invagination or cephalic furrow formation.

      We agree that it would be useful to look at Pins polarity in laser ablated embryos. Currently, we have been unable to analyze Pins polarity after laser ablation, because the ablation to fully isolate the mitotic domain has bleached our Pins::GFP signal. Also, we have shown that Pins polarity is disrupted by 1) alpha-catenin-RNAi, 2) low dose CytoD injection, and 3) Snail depletion, all of which are expected to disrupt force generation and transmission through tissues.

      In response to the reviewer comment, we will determine if Pins::GFP can be analyzed in less aggressive (directional) laser ablations. Again, remember that myr-Pins does not affect mesoderm internalization and that Snail depletion affects Pins polarity.

      MINOR 1. Figure S5: I am a bit confused about the role of Toll 2, 6, 8 in orienting spindle orientation. In Figure S5D it seems that dsRNA treatment against these genes does not disrupt spindle orientation, but Figure S5F shows quite a significant (p=0.0057) effect in triple mutants. The authors favor the idea that Toll receptors do not affect spindle orientation, but the difference with the mutant should be addressed. Furthermore, what happens in MDs 3, 5 and 14 (if the germband extension defect does not affect those divisions)? Is there a difference between dsRNA and triple mutant embryos in these other MDs?

      We think this is a great point. We stated in the text that TLRs are not solely responsible (line 247) for spindle orientation as they do not recapitulate the random pattern of division seen in the myr-Pins expression condition. We acknowledge the differences between the dsRNA injection and TLR triple mutant in the manuscript (lines 242-247), but our data show a greater importance for the role of force transmission. We favor the idea that other mechanisms contribute to spindle orientation because of the small effect of mutating all three Tolls and the dramatic effects of depleting AJs, inhibiting actin (with CytoD), laser ablation, and blocking mesoderm invagination. The planned laser ablation experiments (described above) will also contribute to addressing this point.

      1. No statistical analysis is provided for any of the differences in polarity between Pins and Gap43, and this should be done to demonstrate the significance of the polarization of Pins. Also, particularly for MD14, they should compare anterior vs. posterior polarity, as based on the images in Figure 2H it is not clear that there is a difference between the anterior and posterior side of cells.

      We thank the reviewer for this point. We have added the statistical comparison.

      1. Figure 2A-D: the authors propose that Pins localizes preferentially to the posterior end of cells (instead of both anterior and posterior ends) in MDs 1, 3 and 14 (and anterior in MD 5). How is the asymmetry in the distribution of Pins along the AP axis accomplished, and is there any significance to it? This should be discussed in a bit more detail (currently no potential mechanisms provided in the discussion, just an acknowledgment of the question).

      We agree the localization of Pins to the posterior end of cells in MDs 1, 3, and 14 and anterior end in MD 5 is of great interest. The details and further mechanism of this preferential localization are beyond the scope of this paper, but we have added an acknowledgment of the question and discuss possible models that could explain the result (lines 458-460).

      TYPOS 1. Line 49: "one daughter cells" should be "one daughter cell". 2. Line 193: "rotation. (Figure 3E-F)." should be "rotation (Figure 3E-F)." 3. Lines 232-237: please review. 4. Line 238: "epithelia cells" should be "epithelial cells".

      We thank the reviewers for carefully reading our manuscript. We have fixed the typos mentioned.

      Reviewer #1 (Significance (Required)): This is the first study to my knowledge that demonstrates the role of mechanical forces in polarizing Pins, and provides a nice model to further investigate how mechanical forces generated in one tissue may affect cell division orientation in distant ones. The paper is clear, well written, and quantitative analysis is present for most results. I have some issues with the statistics (or lack thereof) for a couple of results, and potential alternative interpretations for some experiments that in my opinion should be addressed prior to publication. Specifically, it is not clear to me if Pins polarity is at all necessary for spindle orientation in any of the examined MDs.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Overview: In this manuscript, Camuglia et al. show Pins/LGN, which is understood to drive spindle orientation, can localize asymmetrically (with respect to the tissue plane) in the Drosophila embryo. Experimental work (including drug treatments, laser ablation, and knockdowns) lead the authors to propose that this asymmetry is driven by tissue-level tension. The findings are quite interesting and the manuscript is well-written overall. Major Comments: • The authors propose that localization is driven by tissue-level tension, but the direction of the tension isn't clear from the experimental work. For example, the laser ablation experiments cut around the entire perimeter of the mitotic domain, rather than along just one tension axis. Similarly, the finding that disruption of the ventral furrow (by Snail RNAi) interferes with spindle orientation in the head is very puzzling; the furrow is A) outside the embryonic head and B) runs in the parallel direction to the divisions considered. The authors need to address the directionality of tension experimentally.

      We thank the reviewer for this comment and agree that better defining the direction of tension would strengthen our manuscript. We showed that blocking mesoderm invagination with Snail depletion disrupts spindle orientation, despite Snail not being required for cephalic furrow formation (refs). Recent light sheet data has shown that mesoderm invagination is associated with global movements throughout the embryo. Furthermore, the ventral furrow extends into the head region just past the anterior of MD5. To address the reviewer’s comments, we plan to: 1) Perform directional laser ablations to determine the directionality of the tension that orients the spindle, 2) Analyze strain rates in the mitotic domains prior to and during division, and 3) Add to our Discussion more about what is said in the literature about the movements that occur in the head during mesoderm invagination.

      • As acknowledged in the text, the asymmetric enrichment of Pins in MD14 is fairly weak. Since the cells being examined here border a divot in the tissue, and might therefore be curving relative to the focal plane, it would be good to rule out the possibility that some of the asymmetry in Pins intensity is just a consequence of cell/tissue geometry. One way this could be achieved is by showing multiple focal planes.

      Good point. We do not think that the asymmetric Pins enrichment in MD14 is due to tissue geometry or junction tilt. 1) MD14 divides ~10-15 minutes after mesoderm invagination is completed, so the cells do not border a divot (as seen with Gap43::mCh, Fig. 2I). The cells do round up, which can be seen as gaps between cells (Fig. 3E). 2) We compare Pins to GapCh and only see an enrichment with Pins (Fig. 2H-K). If the enrichment was due to tissue curvature or junction orientation relative to imaging axis, we would see the same enrichment in GapCh. 3) Expression of myr-Pins randomizes spindle orientation in MD14 (Fig. 3M, N).

      • In Figure 3I (and 3M?), it appears that there are fewer cell divisions in the presence of myr-Pins. Is this the case? Since cell shapes change during division, and cell shapes influence tissue tension, an increase in cell divisions could lead to a change in tissue tension. This would be important to address, since tissue tension plays an important role in the proposed model.

      These images were not taken at the same point in the MD1 division ‘wave’; the number of divisions is the same in each condition. These mitotic domains exhibit a ‘wave’ of cell division (Di Talia and Wieschaus, 2012), so the number of divisions in each image reflects the timing at which we captured the image. Quantifications involved divisions throughout this wave, but we have chosen images for figures that are most representative of what we see. We will add this to the text in the final version of the manuscript.

      • The alpha-catenin and Canoe results are a bit confusing: - The rose plot in Figure 4D doesn't show a random distribution of spindle angles, but rather a modest change; most spindles still orient in the normal range. The p value in the figure legend (0.0012) is very different from the one in the figure (5.8284e-04). - Alpha-catenin is the strongest way to disrupt AJs, but A) the epithelium appears to be intact in the knockdown condition and B) spindle orientation is impacted but not randomized. Does this mean that the knockdown is incomplete? Or is Cadherin-mediated adhesion (in which alpha-catenin participates) only partially responsible for force transduction?

      We acknowledge that perturbation using alpha-cat RNAi does not recapitulate the complete disruption of division orientation seen in embryos expressing myr-Pins. This is likely due to the variability in the strength of RNAi knockdown, which is observed for most RNAi lines that we use. To address the reviewer’s comment, we have added rose plots for individual embryos showing extremes in the severity of division orientation disruption (Fig. 4E and F). For the main plot (Fig. 4D), we have included all the data that we collected because we did not want to pick and choose which embryos were used for analysis. Fig. 4D therefore includes all the variability.

      • Given that previous studies implicate Canoe in Pins localization, it seems important to lock down the question of whether Canoe is participating in the mechanism described in this paper. How do the authors know the extent of Canoe knockdown? As suggested by the alpha-catenin results (described above), is it possible that Canoe knockdown is simply not strong enough to impact spindle orientation? Aren't there genetic nulls available?

      We thank the reviewer for bringing these points to our attention. There are certainly genetic nulls available (Sawyer et al., 2009), but the experiment suggested by the reviewer would not establish the necessity of Canoe in mitotic domain cells. This is because Canoe nulls severely disrupt mesoderm invagination (Sawyer et al., 2009; Jodoin et al., 2015) and also affect junctions in the ectoderm during germband extension (Sawyer et al., 2011). Therefore, using a null mutation, we would not be able to distinguish which effect of Canoe is responsible for the spindle orientation phenotype. Instead, we used 1) a mutant that specifically compromises mesoderm invagination (snail), 2) laser isolation to show the importance of external force transmission in orienting mitotic domain divisions, and 3) RNAi to deplete Canoe so that mesoderm invagination initiates and pulls on the ectoderm, but Canoe function is clearly compromised. This treatment did not affect spindle orientation, arguing against a role of Canoe in this case. In response to the reviewer's comment, we added language to the Results to indicate that it is possible that the Canoe knockdown is not strong enough, along with our rationale for not performing the experiment in a Canoe null (lines 279-282).

      Minor Comments:

      • It can be difficult to interpret some of the spindle orientation data since the AP axis is vertical in the diagrams but horizontal in the rose plots. Can one of these be flipped so they go together?

      We thank the reviewer for this suggestion and have flipped the rose plots so they match the images. Note that because of the large size of the figures, we have had to consistently orient anterior towards the top, which we establish at the beginning of the Results.

      • Figure S3 is important information for the reader and should be ideally moved into the main paper. - Protein localizations referred to in text should be annotated on images, as they can be hard to see.

      We disagree that S3 should be included in the main paper. The myr-Pins reagent has been used previously so the information in S3 is not new (Chanet et al., 2017).

      • There are some discrepancies between figures, legends and text. - p-values differ between figures, legends, and/or text. - Fluorescent markers are labelled differently in figures and legend (CLIP170 in Figure 1) - Graphs appear to show that MD3 polarizes on posterior side, but figure legend says anterior in Figure S1. Vice versa for MD5.

      We thank the reviewer for catching these typos. We have fixed these issues.

      • Ideally, multichannel image overlays should be shown along with individual channels (b/w). However, it is appreciated that the fluorescent signals are exceptionally weak in this study, presenting a challenge to presentation and to quantification.

      We agree the overlays would be nice. However, the Pins::GFP signal is weak compared to the tubulin and Gap43 signals, the merge does not provide more clarity, and the figures are already quite large. Therefore, we have only included the separated images.

      • Graph axes depicting spindle orientation would be more clear if shown in degrees, instead of normalized or in radians.

      We thank the reviewer for this suggestion. We have changed the graph axes to be in degrees.

      Reviewer #2 (Significance (Required)): Several recent studies have demonstrated that division orientation (in the tissue plane) is governed by tissue level tension. Remarkably, it appears that diverse mechanisms link tension with spindle orientation. Here the authors provide the first in vivo evidence connecting tension to the asymmetric localization of Pins, an important and evolutionarily conserved spindle orientation factor.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): This beautiful manuscript uncovers a role for planar polarized PINS/LGN in orienting the mitotic spindle in Drosophila epithelia. In response to morphogenetic forces acting on adherens junctions, PINS/LGN localises to junctions in a planar polarized fashion to orient the spindle, and de-polarization of PINS/LGN prevents planar spindle orientation. The experiments are very well performed and the findings are robust. The conclusions are well supported by the data.

      Reviewer #3 (Significance (Required)): These important findings mirror previous work in human cell culture, but crucially reveal that the same phenomenon occurs in vivo in the Drosophila embryo. Thus, the findings underscore the highly conserved nature and in vivo relevance of this phenomenon.

      We thank this reviewer for reading the manuscript and their encouraging words.

    1. Author Response:

      Reviewer #1 (Public Review):

      This paper is of potential interest to researchers performing animal behavioral quantification with computer vision tools. The manuscript introduces 'BehaviorDEPOT', a MATLAB application and GUI intended to facilitate quantification and analysis of freezing behavior from behavior movies, along with several other classifiers based on movement statistics calculated from animal pose data. The paper describes how the tool can be applied to several specific types of experiments, and emphasizes the ease of use - particularly for groups without experience in coding or behavioral quantification. While these aims are laudable, and the software is relatively easy to use, further improvements to make the tool more automated would substantially broaden the likely user base.

      In this manuscript, the authors introduce a new piece of software, BehaviorDEPOT, that aims to serve as an open source classifier in service of standard lab-based behavioral assays. The key arguments the authors make are that 1) the open source code allows for freely available access, 2) the code doesn't require any coding knowledge to build new classifiers, 3) it is generalizable to other behaviors than freezing and other species (although this latter point is not shown) 4) that it uses posture-based tracking that allows for higher resolution than centroid-based methods, and 5) that it is possible to isolate features used in the classifiers. While these aims are laudable, and the software is indeed relatively easy to use, I am not convinced that the method represents a large conceptual advance or would be highly used outside the rodent freezing community.

      Major points:

      1) I'm not convinced by one of the key arguments the authors make: that the limb tracking produces qualitatively/quantitatively better results than centroid/orientation tracking alone for the tasks they measure. For example, angular velocities could be used to identify head movements. It would be good to test this with their data (could you build a classifier using only the position/velocity/angular velocities of the main axis of the body?).

      2) This brings me to the point that the previous state-of-the-art open-source methodology, JAABA, is barely mentioned, and I think that a more direct comparison is warranted, especially since this method has been widely used/cited and is also aimed at a non-coding audience.

      Here we address points 1 and 2 together. JAABA has been widely adopted by the Drosophila community with great success. However, we noticed that fewer studies use JAABA to study rodents. The ones that did typically examined social behaviors or gross locomotion, usually in an empty arena such as an open field or a standard homecage. In a study of mice performing reaching/grasping tasks against complex backgrounds, investigators modified the inner workings of JAABA to classify behavior (Sauerbrei et al., 2020), an approach that is largely inaccessible to inexperienced coders. This suggested to us that it may be challenging to implement JAABA for many rodent behavioral assays.

      We directly compared BehaviorDEPOT to JAABA and determined that BehaviorDEPOT outperforms JAABA in several ways. First, we used MoTr and Ctrax (the open-source centroid tracking software packages that are typically used with JAABA) to track animals in videos we had recorded previously. Both MoTr and Ctrax could fit ellipses to mice in an open field, in which the mouse is small relative to the environment and runs against a clean white background. However, consistent with previous reports (Geuther et al., Comm. Bio, 2019), MoTr and Ctrax performed poorly when rodents were in fear conditioning chambers, which have high-contrast bars on the floor (Fig. 10A–C). These tracking-related hurdles may explain, at least in part, why relatively few rodent studies have employed JAABA.

      We next tried to import our DeepLabCut (DLC) tracking data into JAABA. The JAABA website instructs users to employ Animal Part Tracker (https://kristinbranson.github.io/APT/) to convert DLC outputs into a format that is compatible with JAABA. We discovered that APT was not compatible with the current version of DLC, an insurmountable hurdle for labs with limited coding expertise. We wrote our own code to estimate a centroid from DLC keypoints and fed the data into JAABA to train a freezing classifier. Even when we gave JAABA more training data than we used to develop BehaviorDEPOT classifiers (6 videos vs. 3 videos), BehaviorDEPOT achieved higher Recall and F1 scores (Fig. 10D).
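
      To make the two quantities in this paragraph concrete, the sketch below shows one way a centroid could be estimated from keypoints and how Recall and F1 are computed from frame-wise labels. It is an illustration under stated assumptions, not the authors' code: the per-frame dictionary layout (body part name mapped to x, y, likelihood), the likelihood cutoff, and both function names are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, likelihood), assumed layout


def centroid_from_keypoints(frame: Dict[str, Keypoint],
                            min_likelihood: float = 0.9) -> Tuple[float, float]:
    """Average the (x, y) of keypoints whose likelihood passes the cutoff."""
    kept = [(x, y) for (x, y, p) in frame.values() if p >= min_likelihood]
    if not kept:
        raise ValueError("no keypoint passed the likelihood cutoff")
    n = len(kept)
    return (sum(x for x, _ in kept) / n, sum(y for _, y in kept) / n)


def precision_recall_f1(predicted: Sequence[bool],
                        truth: Sequence[bool]) -> Tuple[float, float, float]:
    """Frame-wise precision, recall and F1 against human annotations."""
    if len(predicted) != len(truth):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

      Recall measures how many annotated behavior frames the classifier recovers, while F1 balances recall against precision, so a classifier that labels every frame as positive scores high recall but a lower F1.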

      In response to point 1, we also trained a VTE classifier with JAABA. When we tested its performance on a separate set of test videos, JAABA could not distinguish VTE vs. non-VTE trials. It labeled every trial as containing VTE (Fig. 10E), indicating that a fitted ellipse is not sufficient to detect fine angular head movements. JAABA has additional limitations as well. For instance, JAABA reports the occurrence of behavior in a video timeseries but does not allow researchers to analyze the results of experiments. BehaviorDEPOT shares features of programs like Ethovision or ANYmaze in that it can classify behaviors and also report their occurrence with reference to spatial and temporal cues. These direct comparisons address some of the key concerns centered around the advances BehaviorDEPOT offers beyond JAABA. They also highlight the need for new behavioral analysis software targeted towards a noncoding audience, particularly in the rodent domain.
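
      For readers unfamiliar with the quantity at issue, the following sketch shows how a head angle and its angular velocity could be derived from two tracked keypoints. It is a generic illustration, not the VTE classifier described in the manuscript; the choice of a nose and a neck keypoint and the function names are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def head_angles(nose: Sequence[Point], neck: Sequence[Point]) -> List[float]:
    """Head direction per frame (radians), from the neck toward the nose."""
    return [math.atan2(ny - cy, nx - cx)
            for (nx, ny), (cx, cy) in zip(nose, neck)]


def angular_velocity(angles: Sequence[float], fps: float) -> List[float]:
    """Frame-to-frame angular velocity (radians per second).

    Each difference is wrapped into [-pi, pi) so that a small turn across
    the +pi/-pi boundary is not mistaken for a near-full rotation.
    """
    out = []
    for a0, a1 in zip(angles, angles[1:]):
        delta = (a1 - a0 + math.pi) % (2.0 * math.pi) - math.pi
        out.append(delta * fps)
    return out
```

      A fitted ellipse yields only the orientation of the whole body, whereas separate head keypoints allow an angle of this kind to be followed independently of the trunk.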

      3) Remaining on JAABA: while the authors' classification approach appeared to depend mostly on a relatively small number of features, JAABA uses boosting to build a very good classifier out of many not-so-good classifiers. This approach is well-worn in machine learning and has been used to good effect in high-throughput behavioral data. I would like the authors to comment on why they decided on the classification strategy they have.

      We built algorithmic classifiers around keypoint tracking because of the accuracy, flexibility, and speed it affords. Like many behavior classification programs, JAABA relies on tracking algorithms that use background subtraction (MoTr) or pattern classifiers (Ctrax) to segment animals from the environment and then abstract their position to an ellipse. These methods are highly sensitive to changes in the experimental arena and cannot resolve fine movement of individual body parts (Geuther et al., Comm. Bio, 2019; Pennington et al., Sci. Rep. 2019; Fig. 10A). Keypoint tracking is more accurate and less sensitive to environmental changes. Models can be trained to detect animals in any environment, so researchers can analyze videos they have already collected. Any set of body parts can be tracked and fine movements such as head turns can be easily resolved (Fig. 10E).

      Keypoint tracking can be used to simultaneously track the location of animals and classify a wide range of behaviors. Integrated spatial-behavioral analysis is relevant to many assays including fear conditioning, avoidance, T-mazes (decision making), Y-mazes (working memory), open field (anxiety, locomotion), elevated plus maze (anxiety), novel object exploration, and social memory. Quantifying behaviors in these assays requires analysis of fine movements (we now show Novel Object Exploration, Fig. 5 and VTE, Fig. 6 as examples). These behaviors have been carefully defined by expert researchers. Algorithmic classifiers can be created quickly and intuitively based on small amounts of video data (Table 4) and easily tweaked for out-of-sample data (Fig. 9). Additional rounds of machine learning are time-consuming, computationally intensive, and unnecessary, and we show in Figure 10 that JAABA classifiers have higher error rates than BehaviorDEPOT classifiers, even when provided with a larger set of training data. Moreover, while JAABA reports behaviors in video timeseries, BehaviorDEPOT has integrated features that report behavior occurring at the intersection of spatial and temporal cues (e.g. ROIs, optogenetics, conditioned cues), so it can also analyze the results of experiments. The automated, intuitive, and flexible way in which BehaviorDEPOT classifies and quantifies behavior will propel new discoveries by allowing even inexperienced coders to capitalize on the richness of their data.

      Thank you for raising these questions. We did an extensive rewrite of the intro and discussion to ensure these important points are clear.

      4) I would also like more details on the classifiers the authors used. There is some detail in the main text, but a specific section in the Methods section is warranted, I believe, for transparency. The same goes for all of the DLC post-processing steps.

      Apologies for the lack of detail. We included much more detail in both the results and methods sections that describe how each classifier works, how they were developed and validated, and how the DLC post-processing steps work.

      5) It would be good for the authors to compare the Inter-Rater Module to the methods described in the MARS paper (reference 12 here).

      We included some discussion of how the BehaviorDEPOT Inter-Rater Module compares to MARS.

      6) More quantitative discussion about the effect of tracking errors on the classifier would be ideal. No tracking is perfect, so an end-user will need to know "how good" they need to get the tracking to get the results presented here.

      We included a table detailing the specs of our DLC models and the videos that we used for validating our classifiers (Table 4). We also added a paragraph about designing video ‘training’ and test sets to the methods.

      Reviewer #2 (Public Review):

      BehaviorDEPOT is a Matlab-based user interface aimed at helping users interact with animal pose data without significant coding experience. It is composed of several tools for analysis of animal tracking data, as well as a data collection module that can interface via Arduino to control experimental hardware. The data analysis tools are designed for post-processing of DeepLabCut pose estimates and manual pose annotations, and include four modules: 1) a Data Exploration module for visualizing spatiotemporal features computed from animal pose (such as velocity and acceleration), 2) a Classifier Optimization module for creating hand-fit classifiers to detect behaviors by applying windowing to spatiotemporal features, 3) a Validation module for evaluating performance of classifiers, and 4) an Inter-Rater Agreement module for comparing annotations by different individuals.

      A strength of BehaviorDEPOT is its combination of many broadly useful data visualization and evaluation modules within a single interface. The four experimental use cases in the paper nicely showcase various features of the tool, working the user from the simplest example (detecting optogenetically induced freezing) to a more sophisticated decision-making example in which BehaviorDEPOT is used to segment behavioral recordings into trials, and within trials to count head turns per trial to detect deliberative behavior (vicarious trial and error, or VTE.) The authors also demonstrate the application of their software using several different animal pose formats (including from 4 to 9 tracked body parts) from multiple camera types and framerates.

      1) One point that confused me when reading the paper was whether BehaviorDEPOT was using a single, fixed freezing classifier, or whether the freezing classifier was being tuned to each new setting (the latter is the case.) The abstract, introduction, and "Development of the BehaviorDEPOT Freezing Classifier" sections all make the freezing classifier sound like a fixed object that can be run "out-of-the-box" on any dataset. However, the subsequent "Analysis Module" section says it implements "hard-coded classifiers with adjustable parameters", which makes it clear that the freezing classifier is not a fixed object, but rather it has a set of parameters that can (must?) be tuned by the user to achieve desired performance. It is important to note that the freezing classifier performances reported in the paper should therefore be read with the understanding that these values are specific to the particular parameter configuration found (rather than reflecting performance a user could get out of the box.)

      Our classifier does work quite well “out of the box”. We developed our freezing classifier based on a small number of videos recorded with a FLIR Chameleon3 camera at 50 fps (Fig. 2F). We then demonstrated its high accuracy in three separately acquired data sets (webcam, FLIR+optogenetics, and Minicam+Miniscope, Fig. 2–4, Table 4). The same classifier also had excellent performance in mice and rats from external labs. With minor tweaks to the threshold values, we were able to classify freezing with F1>0.9 (Fig. 9). This means that the predictive value of the metrics we chose (head angular velocity and back velocity) generalizes across experimental setups.

      Popular freezing detection software including FreezeFrame, VideoFreeze as well as the newly created ezTrack also allow users to adjust freezing classifier thresholds. Allowing users to adjust thresholds ensures that the BehaviorDEPOT freezing classifier can be applied to videos that have already been recorded with different resolutions, lighting conditions, rodent species, etc. Indeed, the ability to easily adjust classifier thresholds for out-of-sample data represents one of the main advantages of hand-fitting classifiers. Yet BehaviorDEPOT offers additional advantages over FreezeFrame, VideoFreeze, and ezTrack. For one, it adds a level of rigor to the optimization step by quantifying classifier performance over a range of threshold values, helping users select the best ones. Also, it is free, it can quantify behavior with reference to user-defined spatiotemporal filters, and it can classify and analyze behaviors beyond freezing. We updated the results and discussion sections to make these points clear.
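
      As an illustration of quantifying classifier performance over a range of threshold values, the following is a minimal Python sketch. It is not BehaviorDEPOT code (which is MATLAB-based); the function names `f1_score_binary` and `sweep_threshold`, the toy velocity trace, and the rule that a frame counts as freezing when velocity falls below the threshold are all assumptions made for the example.

```python
import numpy as np

def f1_score_binary(pred, truth):
    """F1 score for two boolean per-frame label vectors."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)    # frames both label as behavior
    fp = np.sum(pred & ~truth)   # classifier-only frames
    fn = np.sum(~pred & truth)   # human-only frames
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0

def sweep_threshold(metric, truth, thresholds):
    """Score each candidate threshold against reference annotations.

    Hypothetical rule: a frame is 'freezing' when metric < threshold.
    Returns the first best-scoring threshold and all scores.
    """
    metric = np.asarray(metric, dtype=float)
    scores = [f1_score_binary(metric < t, truth) for t in thresholds]
    best = int(np.argmax(scores))
    return thresholds[best], scores

# Toy data: per-frame velocity and human freezing annotations (1 = freezing).
velocity = np.array([0.1, 0.2, 0.1, 2.0, 3.0, 0.3, 2.5, 0.1])
human = np.array([1, 1, 1, 0, 0, 1, 0, 1])
best_t, scores = sweep_threshold(velocity, human, [0.05, 0.5, 1.0, 5.0])
```

      Plotting `scores` against the candidate thresholds gives the kind of performance curve a user could inspect before settling on a value.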

      2) This points to a central component of BehaviorDEPOT's design that makes its classifiers different from those produced by previously published behavior detection software such as JAABA or SimBA. So far as I can tell, BehaviorDEPOT includes no automated classifier fitting, instead relying on the users to come up with which features to use and which thresholds to assign to those features. Given that the classifier optimization module still requires manual annotations (to calculate classifier performance, Fig 7A), I'm unsure whether hand selection of features offers any kind of advantage over a standard supervised classifier training approach. That doesn't mean an advantage doesn't exist: maybe the hand-fit classifiers require less annotation data than a supervised classifier, or maybe humans are better at picking "appropriate" features based on their understanding of the behavior they want to study.

      See response to reviewer 1, point 3 above for an extensive discussion of the rationale for our classification method. See response to reviewer 2 point 3 below for an extensive discussion of the capabilities of the data exploration module, including new features we have added in response to Reviewer 2’s comments.

      3) There is something to be said for helping users hand-create behavior classifiers: it's easier to interpret the output of those classifiers, and they could prove easier to fine-tune to fix performance when given out-of-sample data. Still, I think it's a major shortcoming that BehaviorDEPOT only allows users to use up to two parameters to create behavior classifiers, and cannot create thresholds that depend on linear or nonlinear combinations of parameters (e.g., Figure 6D indicates that the best classifier would take a weighted sum of head velocity and change in head angle.) Because of these limitations on classifier complexity, I worry that it will be difficult to use BehaviorDEPOT to detect many more complex behaviors.

      To clarify, users can combine as many parameters as they like to create behavior classifiers. However, the reviewer raises a good point and we have now expanded the functions of the Data Exploration Module. Now, users can choose ‘focused mode’ or ‘broad mode’ to explore their data. In focused mode, researchers use their intuition about behaviors to select the metrics to examine. The user chooses two metrics at a time and the Data Exploration Module compares values between frames where behavior is present or absent and provides summary data and visual representations in the form of boxplots and histograms. A generalized linear model (GLM) also estimates the likelihood that the behavior is present in a frame across a range of threshold values for both selected metrics (Fig. 8A), allowing users to optimize parameters in combination. This process can be repeated for as many metrics as desired.

      In broad mode, the module uses all available keypoint metrics to generate a GLM that can predict behavior. It also rank-orders metrics based on their predictive weights. Poorly predictive metrics are removed from the model if their weight is sufficiently small. Users also have the option to manually remove individual metrics from the model. Once suitable metrics and thresholds have been identified using either mode, users can plug any number and combination of metrics into a classifier template script that we provide and incorporate their new classifier into the Analysis Module. Detailed instructions for integrating new classifiers are available in our GitHub repository (https://github.com/DeNardoLab/BehaviorDEPOT/wiki/Customizing-BehaviorDEPOT).

      MoSeq, JAABA, MARS, SimBA, B-SOiD, DANNCE, and DeepEthogram are among a group of excellent open-source software packages that already do a great job detecting complex behaviors. They use supervised or unsupervised machine learning to detect behaviors that are difficult to see by eye including social interactions and fine-scale grooming behaviors. Instead of trying to improve upon these packages, BehaviorDEPOT is targeting unmet needs of a large group of researchers that study human-defined behaviors and need a fast and easy way to automate their analysis. As examples, we created a classifier to detect vicarious trial and error (VTE), defined by sweeps of the head (Fig. 9). Our revised manuscript also describes our new novel object exploration classifier (Fig. 5). Both behaviors are defined based on animal location and the presence of fine movements that may not be accurately detected by algorithms like MoTr and Ctrax (Fig. 10). As discussed in response to reviewer 1, point 3, additional rounds of machine learning are laborious (humans must label frames as input), computationally intensive, harder to adjust for out-of-sample videos, and are not necessary to quantify these kinds of behaviors.

      4) Finally, I have some concerns about how performance of classifiers is reported. For example, the authors describe a "validation" set of videos used to assess freezing classifier performance, but they are very vague about how the detector was trained in the first place, stating "we empirically determined that thresholding the velocity of a weighted average of 3-6 body parts ... and the angle of head movements produced the best-performing freezing classifier." What videos were used to come to this conclusion? It is imperative that when performance values are reported in the paper, they are calculated on a separate set of validation videos, ideally from different animals, that were never referenced while setting the parameters of the classifier. Otherwise, there is a substantial risk of overfitting, leading to overestimation of classifier performance. Similarly, Figure 7 shows the manual fitting of classifiers to rat and mouse data; the fitting process in 7A is shown to include updating parameters and recalculating performance iteratively. This approach is fine, however I want to confirm that the classifier performances in panels 7F-G were computed on videos not used during fitting.

      Thank you for pointing this out. We have included detailed descriptions of the classifier development and validation in the results (149–204) and methods (789–820) sections and added a table that describes videos used to validate each classifier (Table 4).

      To develop the freezing classifier, we explored linear and angular velocity metrics for various keypoints, finding that angular velocity of the head and linear velocity of a back point tracked best with freezing. Common errors in our classifiers were identified as short sequences of frames at the beginning or end of a behavior bout. This may reflect failures in human detection. Other common errors were sequences of false positive or false negative frames that were shorter than a typical behavior bout. We included the convolution algorithm to correct these short error sequences.
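
      The two-metric thresholding described here can be sketched in a few lines of Python. This is an illustrative assumption about the form of such a classifier, not the BehaviorDEPOT implementation: the function name `classify_freezing`, the threshold values, and the rule that both metrics must fall below their thresholds are hypothetical.

```python
import numpy as np

def classify_freezing(back_velocity, head_angular_velocity,
                      vel_thresh, ang_thresh):
    """Label a frame as freezing when both metrics are below threshold.

    Assumed rule for illustration: low linear velocity of a back point
    AND low absolute angular velocity of the head.
    """
    back_velocity = np.asarray(back_velocity, dtype=float)
    head_angular_velocity = np.asarray(head_angular_velocity, dtype=float)
    slow_body = back_velocity < vel_thresh
    still_head = np.abs(head_angular_velocity) < ang_thresh
    return slow_body & still_head

# Toy per-frame metrics (arbitrary units).
back_v = [0.1, 0.2, 3.0, 0.1, 0.1]
head_w = [1.0, -2.0, 0.5, 40.0, -1.5]
freezing = classify_freezing(back_v, head_w, vel_thresh=0.5, ang_thresh=10.0)
```

      In this toy trace the third frame fails the body-velocity test and the fourth fails the head-movement test, so only the remaining frames are labeled as freezing.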

      When developing classifiers (including adjusting the parameters for the external videos), videos were randomly assigned to classifier development (e.g. ‘training’) and test sets. Dividing up the dataset by video rather than by frame ensures that highly correlated, temporally adjacent frames are not split across the training and test sets, which can cause overestimation of classifier accuracy. Since the videos in the test set were separate from those used to develop the algorithms, our validation data reflects the accuracy levels users can expect from BehaviorDEPOT.
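
      A minimal Python sketch of splitting by video rather than by frame is given below. The helper name `split_by_video`, the test fraction, and the video identifiers are hypothetical; the sketch only illustrates the principle that every frame of a given video lands in exactly one set.

```python
import random

def split_by_video(video_ids, test_fraction=0.25, seed=0):
    """Assign whole videos (not individual frames) to train or test.

    video_ids: one entry per frame, naming the video the frame came from.
    Returns frame indices for each set plus the held-out video names.
    """
    unique = sorted(set(video_ids))
    rng = random.Random(seed)          # seeded for a reproducible split
    rng.shuffle(unique)
    n_test = max(1, round(len(unique) * test_fraction))
    test_videos = set(unique[:n_test])
    train_idx = [i for i, v in enumerate(video_ids) if v not in test_videos]
    test_idx = [i for i, v in enumerate(video_ids) if v in test_videos]
    return train_idx, test_idx, test_videos

# Toy example: ten frames drawn from four videos.
frames = ["vidA"] * 3 + ["vidB"] * 3 + ["vidC"] * 2 + ["vidD"] * 2
train_idx, test_idx, test_videos = split_by_video(frames)
```

      Because membership is decided per video, neighboring frames from the same recording can never appear on both sides of the split.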

      5) Overall, I like the user-friendly interface of this software, its interaction with experimental hardware, and its support for hand-crafted behavior classification. However, I feel that more work could be done to support incorporation of additional features and feature combinations as classifier input. It would be great if BehaviorDEPOT could at least partially automate the classifier fitting process, e.g. by automatically fitting thresholds to user-selected features, or by suggesting features that are most correlated with a user's provided annotations. Finally, the validation of classifier performance should be addressed.

      Thank you for the positive feedback on the interface. We addressed these comments in response to points 3 and 4. To recap, we updated the Data Exploration Module to include Generalized Linear Models that can suggest features with the highest predictive value. We also generated template scripts that simplify the process of creating new classifiers and incorporating them into the Analysis Module. We also included all the details of the videos we used to validate classifier performance, which were separate from the videos that we used to determine the parameters (Table 4).

      Reviewer #3 (Public Review): There is a need for standardized pipelines that allow for repeatable robust analysis of behavioral data, and this toolkit provides several helpful modules that researchers will find useful. There are, however, several weaknesses in the current presentation of this work.

      1) It is unclear what the major advance is that sets BehaviorDEPOT apart from other tools mentioned (ezTrack, JAABA, SimBA, MARS, DeepEthogram, etc). A comparison against other commonly used classifiers would speak to the motivation for BehaviorDEPOT - especially if this software is simpler to use and equally efficient at classification.

      We also address this in response to reviewer 1, points 1–3. To summarize, we added direct comparisons with JAABA to a revised manuscript. In Fig. 10, we show that BehaviorDEPOT outperforms JAABA in several ways. First, DLC is better at tracking rodents in complex environments than MoTr and Ctrax, which are the most used JAABA companion software packages for centroid tracking. Second, we show that even when we use DLC to approximate centroids and use this data to train classifiers with JAABA, the BehaviorDEPOT classifiers perform better than JAABA’s.

      In a revised manuscript, we included more discussion of what sets BehaviorDEPOT apart from other software, focusing on these main points:

      BehaviorDEPOT vs. commercially available packages (Ethovision, ANYmaze, FreezeFrame, VideoFreeze)

      1) Ethovision, ANYmaze, FreezeFrame, and VideoFreeze cost thousands of dollars per license while BehaviorDEPOT is free.

      2) The BehaviorDEPOT freezing classifier performs robustly even when animals are wearing a tethered patch cord, while VideoFreeze and FreezeFrame often fail under these conditions.

      3) Keypoint tracking is more accurate and flexible, and can resolve more detail, than methods that use background subtraction or pixel change detection algorithms combined with center of mass or fitted ellipses.

      BehaviorDEPOT vs. packages targeted at non-coding audiences (JAABA, ezTrack)

      1) DLC keypoint tracking performs better than MoTr and Ctrax in complex environments. As a result, JAABA has not been widely used in the rodent community. Built around keypoint tracking, BehaviorDEPOT will enable researchers to analyze videos in any type of arena, including videos they have already collected. Keypoint tracking also allows for detection of finer movements, which is essential for behaviors like VTE and object exploration.

      2) Hand-fit classifiers can be created quickly and intuitively for well-defined laboratory behaviors. Compared to machine learning-derived classifiers, they are easier to interpret and easier to fine-tune to optimize performance when given out-of-sample data.

      3) Even when using DLC as the input to JAABA, BehaviorDEPOT classifiers perform better (Figure 10).

      4) BehaviorDEPOT integrates behavioral classification, spatial tracking, and quantitative analysis of behavior and position with reference to spatial ROIs and temporal cues of interest. It is flexible and can accommodate varied experimental designs. In ezTrack, spatial tracking is decoupled from behavioral classification. In JAABA, spatial ROIs can be incorporated into machine learning algorithms, but users cannot quantify behavior with reference to spatial ROIs after classification has occurred. Neither JAABA nor ezTrack provides a way to quantify behavior with reference to temporal events (e.g. optogenetic stimuli, conditioned cues).

      5) BehaviorDEPOT includes analysis and visualization tools, providing many features of the costly commercial software packages for free.

      BehaviorDEPOT vs. packages based on keypoint tracking (SimBA, MARS, B-SOiD)

      Other software packages based on keypoint tracking use supervised or unsupervised methods to classify behavior from animal poses. These software packages target researchers studying complex behaviors that are difficult to see by eye, including social interactions and fine-scale grooming behaviors, whereas BehaviorDEPOT targets a large group of researchers that study human-defined behaviors and need a fast and easy way to automate their analysis. Many behaviors of interest will require spatial tracking in combination with detection of specific movements (e.g. VTE, NOE). Additional rounds of machine learning are laborious (humans must label frames as input), computationally intensive, and are not necessary to quantify these kinds of behaviors.

      2) While the idea might be that joint-level tracking should simplify the classification process, the number of markers used in some of the examples is limited to small regions on the body and might not justify using these markers as input data. The functionality of the tool seems to rely on a single type of input data (a small number of keypoints labeled using DeepLabCut) and throws away a large amount of information in the keypoint labeling step. If the main goal is to build a robust freezing detector then why not incorporate image data (particularly when the best set of key points does not include any limb markers)?

      While one main goal was to build a robust freezing detector, BehaviorDEPOT is a general-purpose software. BehaviorDEPOT can classify behaviors from video timeseries and can analyze the results of experiments similar to Ethovision or FreezeFrame. BehaviorDEPOT is particularly useful for assays in which behavioral classification is integrated with spatial location, including avoidance, decision making (T maze), and novel object memory/recognition. While image data is useful for classifying behavior, it cannot combine spatial tracking with behavioral classification. However, DLC keypoint tracking is well-suited for this purpose. We find that tracking 4–8 points is sufficient to hand-fit high performing classifiers for freezing, avoidance, reward choice in a T-maze, VTE, and novel object recognition. Of course, users always have the option to track more points because BehaviorDEPOT simply imports the X-Y coordinates and likelihood scores of any keypoints of interest.

      3) Need a better justification of this classification method

      See response to reviewer 1, points 1–3 above.

      4) Are the thresholds chosen for smoothing and convolution adjusted based on agreement to a user-defined behavior?

      Yes. We added more details in the text. Briefly, users can change the thresholds used in both smoothing and convolution in the GUI and can optimize the values using the Classifier Optimization Module. Smoothing is performed once at the beginning of a session and has an adjustable span for the smoothing window. The convolution is a feature of each classifier, and thus can be adjusted when adjusting the classifier. When developing the freezing classifier, we started with a smoothing window that had the largest value that did not exceed the rate of motion of the animal and then fine-tuned the value to optimize smoothing. In the classifiers we have developed, window widths that are the length of the smallest bout of ‘real’ behavior and count thresholds approximately 1/3 the window width yielded the best results.
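
      One way to picture the convolution step is the short Python sketch below. It is an assumption about how a window width and count threshold could be combined, not the BehaviorDEPOT source: the function name `smooth_labels` and the toy label vector are hypothetical, and the window of 5 frames with a count threshold of 2 is chosen only to approximate the 1/3 ratio mentioned above.

```python
import numpy as np

def smooth_labels(raw, window_width, count_threshold):
    """Clean up per-frame binary labels with a sliding-window count.

    Each frame is kept as 'behavior' when at least count_threshold of the
    frames in the surrounding window were labeled as behavior.
    """
    raw = np.asarray(raw, dtype=int)
    kernel = np.ones(window_width, dtype=int)
    # mode="same" keeps the output aligned with the input frames.
    counts = np.convolve(raw, kernel, mode="same")
    return counts >= count_threshold

# Toy labels: one isolated false positive (frame 2) and one bout with a
# single-frame dropout (frame 10).
raw = [0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0]
smoothed = smooth_labels(raw, window_width=5, count_threshold=2)
```

      With these settings the isolated frame is removed and the dropout inside the bout is filled. A low count threshold can also widen a bout by a frame at each edge, which is one reason the two values are worth tuning against reference annotations.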

      5) Jitter is mentioned as a limiting factor in freezing classifier performance - does this affect human scoring as well?

      We were referring to jitter in terms of point location estimates by DeepLabCut. In other words, networks that are tailored to the specific recording conditions have lower error rates in the estimates of keypoint positions. Human scoring is an independent process that is not affected by this jitter. We changed the wording in the text to avoid any confusion.

      6) The use of a weighted average of body part velocities again throws away information - if one had a very high-quality video setup with more markers would optimal classification be done differently? What if the input instead consisted of 3D data, whether from multi-camera triangulation or other 3D pose estimation? Multianimal data?

      From reviewer 2, point 3: MARS, SimBA, and B-SOiD are excellent open-source software packages that are also based on keypoint tracking. They use supervised or unsupervised methods to classify complex behaviors that are difficult to see by eye including social interactions and fine-scale grooming behaviors. Instead of trying to improve upon these packages, which are already great, BehaviorDEPOT is targeting unmet needs of a large group of researchers that study human-defined behaviors and need a fast and easy way to automate their analysis. Additional rounds of machine learning are laborious (humans must label frames as input), computationally intensive, and are not necessary to quantify these kinds of behaviors. However, keypoint tracking offers accuracy, precision, and flexibility that are superior to behavioral classification programs that estimate movement based on background subtraction, center of mass, ellipse fitting, etc.

      7) It is unclear where the manual annotation of behavior is used in the tool as currently stands. Is the validation module used to simply say that the freezing detector is as good as a human annotator? One might expect that algorithms which use optic flow or pixel-based metrics might be superior to a human annotator, is it possible to benchmark against one of these? For behaviors other than freezing, a tool to compare human labels seems useful. The procedure described for converging on a behavioral definition is interesting and an example of this in a behavior other than freezing, especially where users may disagree, would be informative. It appears that manual annotation doesn't actually happen in the GUI and a user must create this themselves - this seems unnecessarily complicated.

      Manual annotation of behavior is used in the four classifier development modules: inter-rater, data exploration, optimization, and validation. The inter-rater module can be used as a tool to refine ground-truth behavioral definitions. It imports annotations from any number of raters and generates graphical and text-based statistical reports about overlap, disagreement, etc. Users can use this tool to iteratively refine annotations until they converge as closely as possible. The inter-rater module can be used to compare human labels (or any reference set of annotations) for any behavior. To ensure this is clear to the readers, we added more details to the text and a second demonstration of the inter-rater module for novel object exploration annotations (Fig. 7). The validation module imports reference annotations, which can be produced by a human or another program, and benchmarks classifier performance against them. We added more details to this section as well.

      Freezing is a straightforward behavior that is easy to detect by eye. Rather than benchmark against an optic flow algorithm, we benchmarked against JAABA, another user-friendly behavioral classification software that uses machine learning algorithms. We find that BehaviorDEPOT is easier to use and labels freezing more accurately than JAABA. We also made a second freezing classifier that uses a changepoint algorithm to identify transitions from movement to freezing that may accommodate a wider range of video framerates and resolutions.

      We plan to incorporate an annotation feature into the GUI, but in the interest of disseminating our work soon, we argue that this is not necessary for inclusion now. There are many free or cheap programs that allow framewise annotation of behavior including FIJI, Quicktime, VLC, and MATLAB. In fact, users may already have manual annotations or annotations produced by a different software and BehaviorDEPOT can import these directly. While machine learning classifiers like JAABA require human annotations to be entered into their GUI, allowing people to import annotations they collected previously saves time and effort.

      8) A major benefit of BehaviorDEPOT seems to be the ability to run experiments, but the ease of programming specific experiments is not readily apparent. The examples provided use different recording methods and networks for each experimental context as well as different presentations of data - it is not clear which analyses are done automatically in BehaviorDEPOT and which require customizing code or depend on the MiniCAM platform and hardware. For example - how does synchronization with neural or stimulus data occur? Overall it is difficult to judge how these examples would be implemented without some visual documentation.

      We added visual documentation of the experimental module graphical interface to Figure 1 and added more detail to the results, methods and to our GitHub repository (https://github.com/DeNardoLab/Fear-Conditioning-Experiment-Designer). Synchronization with stimulus data can occur within the Experiment Module (designed for fear conditioning experiments) or stimuli timestamps can be easily imported into the Analysis Module. Synchronization with neural data occurs post hoc using the data structures produced by the BehaviorDEPOT Analysis Module. We include our code for aligning behavior to Miniscope data on our GitHub repository (https://github.com/DeNardoLab/caAnalyze).

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In their manuscript, Hattori et al. put forward evidence that the knock-out of CD38 expression in astrocytes at approximately post-natal day 10 (referred to as CD38 AS-cKO P10) leads to a specific deficit in social memory in adult mice, while other types of memory remain unaltered. Using immunohistochemistry (IHC), the authors found a reduced number of excitatory synapses in the medial prefrontal cortex (mPFC) of CD38 AS-cKO P10 mice. Switching to in vitro primary cell culture models, the authors identify the astrocyte-secreted protein SPARCL1 as a relevant synaptogenic factor. Using pharmacological dissection of relevant signaling pathways, Hattori et al. propose that cADPR formation and calcium release from intracellular stores are essential for SPARCL1 secretion from astrocytes. Finally, the authors analyzed the transcriptome of primary CD38 KO astrocytes using bulk mRNA sequencing, and found that genes related to calcium signaling were downregulated in these cells.

      Major comments:

      • Are the key conclusions convincing?
        1. From a global perspective, the multiple lines of evidence provided by the authors strongly suggest that expression of CD38 in astrocytes is important for synaptogenesis in the mPFC of P10 mice, with ablation of CD38 and reduced synapse formation leading to social memory deficits at P70. However, the data concerning the role of astrocyte-secreted SPARCL1 is not particularly strong: further experiments are needed to support this claim (see below).
      • Are the claims preliminary or speculative?
        1. As it stands, there is no proof that the claimed astrocyte-specific deletion of CD38 is actually astrocyte specific. This evidence is crucial: without it the reported effects could be due to non-specific CD38 knock-out in other CNS cells. In this respect, the Western Blot in Supplementary Figure 1A does not provide information on astrocyte-specific deletion, merely that CD38 was globally reduced in the mPFC. Interestingly, the authors have previously published data (Hattori et al., 2017, 10.1002/glia.23139) showing that CD38 expression is mostly astrocyte-specific, peaking at P14, which coincides with the peak period of synaptogenesis. The degree of CD38 heterogeneity is also an issue that I think the authors need to consider. Do they have information on this? Is CD38 expressed in every astrocyte of the CNS, or are there some astrocytes that are CD38 negative at P14? Is the mPFC a region specifically enriched in CD38 positive astrocytes and does this explain the observed behavioral deficit? I think if this is known, the authors should mention it in the "Introduction" or "Discussion". If this is not known, maybe the authors could provide data addressing the issue.
        2. I think the authors should take more caution in claiming that SPARCL1 is the main factor secreted through the CD38 signaling pathway and responsible for increased synaptogenesis. This is for several reasons, all centered on data displayed in Figure 4 and Supplementary Figure 6:
          • a) Western Blot (WB) data: The "Materials and Methods" section for WB does not indicate how protein loading and transfer efficiency were controlled for. Normalizing to β-Actin levels is an acceptable way to control for loading and transfer efficiency when using cell lysates. However, in the absence of such an abundant structural protein in conditioned media it is unclear how loading and transfer were controlled for under these conditions. Did the authors normalize the CD38 KO AS ACM data by expressing protein levels relative to those from WT AS ACM? Is BDNF being used as a control, based on proteomics data? If so, why is proteomics data not given in the manuscript and why is this control not shown for all ACM blots? I realize that (quantitative) blotting using ACM is difficult, but I am also not convinced that the methodology used is sufficiently rigorous. Simple steps to give confidence would be Coomassie staining of gels both before and after membrane transfer, to show that i) the total protein amount loaded was the same in each lane of the gel and ii) the transfer to the nitrocellulose membrane was complete. In addition, Ponceau S staining of the nitrocellulose membrane should also have been performed and displayed, to show (roughly) equal amounts of protein were transferred for each lane. In summary, the WB data quantification needs to be better controlled. The values of the Y axis in these graphs (and throughout the manuscript) are simply too small to be read properly. Finally, I want to highlight the general lack of precision regarding the nature of the replication unit (the "n"). For example, the legend of Figure 4C-D states "n = 6", but we have no idea if these are 6 independent primary cultures originating from 6 mice, 6 independent cultures from the same mouse, 6 repeats of the Western Blot using the same sample etc. This issue is valid for the whole manuscript: in my opinion, the authors should be much more careful when it comes to these crucial elements of scientific reporting.
          • b) While the data hint at an important role of SPARCL1 in synapse formation, when the authors tested if ACM from CD38 KO astrocytes supplemented with exogenous SPARCL1 could rescue synapse formation, the effect was incomplete, with only a trend to an increase in synapse number (Figure 4J-K). Perhaps the authors simply forgot to indicate the statistical significance of differences between the experimental groups (Figure 4K)? However, if there really were no statistically significant differences observed, the authors should reduce the strength of their conclusions regarding SPARCL1. This protein may well be pro-synaptogenic but, as it stands, other factors could well be in play. Perhaps the authors should have tried higher concentrations of SPARCL1 to further boost synaptogenesis? In this respect, the SPARCL1 knockdown (KD) experiment in Supplementary Figure 6B-D is an important addition, but should be supplemented by rescue with an siRNA-resistant recombinant SPARCL1? If SPARCL1 is a major player in synaptogenesis, the prediction is that synapse numbers would be close to wild type levels with this approach.
          • c) In my opinion, there are also issues with the data displayed in Figure 4H-I. The authors want to convince the reader that SPARCL1 is mostly an astrocytic protein using immunohistochemistry on mouse mPFC sections, co-labelled with antibodies against neuronal and astrocytic markers. In these panels, we are presented with images showing a few cells, in which it seems SPARCL1 is absent from NeuN positive cells, present in WT astrocytes and reduced in CD38 AS-cKO P10 astrocytes. However, the small number of cells shown and the lack of quantification severely weaken this conclusion. In my opinion, the authors should have quantified their IHC data by counting cells and establishing the ratios of SPARCL1 positive over NeuN or S100β positive cells, in both control and CD38 AS-cKO P10 animals. This experiment would provide critical information that the conditional gene targeting strategy is robust. The authors should also consider quantifying the intensity of the SPARCL1 signal in astrocytes. This is recommended as the image displayed in Figure 4I for the CD38 AS-cKO is problematic: are the authors really claiming that the reduction in SPARCL1 expression following cKO of CD38 in astrocytes is at best only partial? Is 11 days between the first tamoxifen injection and tissue fixation actually sufficient to allow for CD38 turnover? With low levels of protein turnover, the possibility exists that residual levels of CD38 are still sufficient to impact SPARCL1 levels. What would happen if there is a greater interval between tamoxifen administration and tissue recovery? Would levels of synaptogenesis be further reduced? Is this an issue of production versus secretion or a combination of factors?
        3. The heatmap (Figure 5E-F) is simply too small to interpret. The color choice is also not accessible for colorblind readers. The authors might consider displaying this heatmap in a separate figure. The authors should also provide a supplementary table where all the genes detected are listed along with their respective counts. Furthermore, it is surprising that the authors only found genes being downregulated in CD38 KO astrocytes. Were there really no genes up-regulated? The authors might also want to indicate which genes belong to each of the ontological categories listed in Figure 5F. On p. 11, Figure 5E: The authors should indicate in the main text that they performed bulk RNA-sequencing and not another type of RNA sequencing (like single cell RNA sequencing for instance). The authors indicate n = 2 but we have no indication of the nature of the replicate (also see earlier comments). Please amend.
      • Are additional experiments necessary? I think supplementary experiments are essential to support the claims of the paper. Most are described in the section above, but to summarize:
        1. Show data to prove that the CD38 AS-cKOP10 model is astrocyte-specific and leads to a total loss of CD38 in these cells.
        2. WB data: The issue of protein loading and transfer efficiency should be dealt with. Quantifications should be revisited.
        3. The authors should quantitatively analyze the different IHC performed in Figure 4H-I.
        4. The authors should provide more information on their RNA sequencing data: list of genes detected with their FPKM values etc. The authors should display the RNA sequencing data in a separate figure, allowing the heatmap to be enlarged.
        5. LC-MS/MS data: the authors should provide the list of all the proteins they identified in their LC-MS/MS experiment, for instance as a supplementary table. The majority of these experiments should be able to be performed with pre-existing samples/tissue slices. If not, the necessary experimental pipeline exists and these supporting experiments should not be too burdensome.
      • Data and methods presentation Methods: The authors need to work on this aspect of the manuscript. Most of the important details are already described, but some crucial ones are missing, while the phrasing used to describe methods is sometimes misleading. I will give some examples here, but this is not an exhaustive list. The fact that the manuscript is riddled with small mistakes, inconsistencies and/or oversights makes it difficult to read and creates a negative impression. The whole manuscript would benefit from a thorough proof-reading, preferably by a native speaker.
        1. in the "Immunohistochemistry and Synaptic Puncta Analysis" section on p. 21-22, we have no indication of which antibodies against "GFAP, NDRG2, VGlut1, PSD95, S100β, NenN(?) and SPARCL1" were used. It is standard practice to indicate the company, product number and lot number. The authors must also indicate the dilution at which they use these antibodies. On p.22, the authors write the cells were incubated with "Alexa- or Cy3-conjugated secondary antibodies". The excitation wavelengths of the Alexa dyes used need to be given.
        2. The authors need to provide more details on the microscope they used. Merely writing "using a 63× lens on a fluorescence microscope" (p.23) is insufficient.
        3. In the "LC-MS/MS" method the authors wrote: "Briefly, these proteins were reduced, alkylated, and digested by trypsin". I think that in the reduction and alkylation steps, chemicals other than trypsin were actually used. This sentence should be modified to reflect this.
        4. p.19: "uM" is written when the authors very likely mean "µM". Please check the whole manuscript for repeat examples. I know this is often lab "short-hand", but it should be avoided in scientific publications.
        5. The authors should be careful when describing their data to always indicate whether they are referring to experiments performed using cultured astrocytes or not. As it stands, the text is confusing: for instance, when describing RNA-sequencing data in Figure 5, the main text appears to indicate that these astrocytes were acutely isolated from adult mice, when in fact they were obtained from primary cultures. Given concerns in the literature about potential differences between acutely isolated and cultured astrocytes (Foo et al., Neuron, 2011), this is essential. Data presentation: The figures appear to have been produced in a rush - and almost have a "screenshot" feel to them. This is not a scientific issue per se, but does impact the overall impression given by the manuscript. The following is a non-exhaustive list of issues with the figures. I list the major ones that the authors should correct.
        6. Almost all Y axis labels are too small. The authors should comply with the basic journal requirements in terms of font sizes. Some axes do not end on a tick (e.g. Figure 3R). This is not dramatic, but should be corrected. Globally, the authors need to display bigger bar plots - most of them are extremely hard to read. Labeling should also be checked: in Figure 4K, the Y axis label indicates that the values displayed are in %, when I think the axis graduation displays ratio values. Some of the IHC pictures are also too small to be easily interpreted.
        7. The heatmap in Figure 5E is impossible to read and, as such, has little or no value for the manuscript.
        8. Scale bars: where is the scale bar in Figure 2A? Figure 3A-H: Is the scale bar really representing 10 millimeters? Supplementary Figure 3A: scale bar is missing. Please check for similar issues throughout the manuscript.
        9. Figure Legends are problematic, and often contain incorrect or incomplete information. Examples include: Supplementary Figure 1: The description of panels J, L and N appears to be missing. Please also use the Greek letter beta and not 'b' for S100β. Supplementary Figure 5: I think the term "KO" is missing after CD 38 in the legend title. Figure 3: why state that nuclei were counterstained with DAPI in Figure 3P,Q, when this precision is not given for panels Figure 3A-H? Figure 3A-H: If the authors choose to explicitly state PSD95 is a post-synaptic marker, why not indicate that VGlut1 is a pre-synaptic marker? Same issue in Supplementary Figure 4.
        10. There are multiple instances of panels being wrongly referred to in the main text. On p.10, Figure 4H is referenced, when I think the authors mean Figure 4I; on p.10, Figure 4I-J are referred to when the authors clearly describe data found in Figure 4J-K. These types of mistakes are problematic and recur throughout the manuscript.
      • Statistical analysis As mentioned above, the exact nature of the replicates is often not stated, when the "n" number is indicated. The authors must correct this issue and give the information either at the appropriate point in the main text or in the figure legend.
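
      To make the point about the replication unit concrete, the following is a minimal sketch of collapsing technical replicates so that "n" counts animals rather than individual measurements. The animal identifiers, genotype labels and values are invented for illustration, and the helper `per_animal_means` is hypothetical; none of this is taken from the manuscript under review.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical measurements: (animal_id, genotype, value). All values invented.
measurements = [
    ("m1", "WT", 1.00), ("m1", "WT", 1.10), ("m1", "WT", 0.90),
    ("m2", "WT", 1.20), ("m2", "WT", 1.30),
    ("m3", "KO", 0.60), ("m3", "KO", 0.70),
    ("m4", "KO", 0.50), ("m4", "KO", 0.55), ("m4", "KO", 0.45),
]

def per_animal_means(rows):
    """Collapse technical replicates so each animal contributes one value."""
    by_animal = defaultdict(list)
    for animal, genotype, value in rows:
        by_animal[(genotype, animal)].append(value)
    groups = defaultdict(list)
    # Sort for a deterministic order of animals within each genotype.
    for (genotype, _animal), values in sorted(by_animal.items()):
        groups[genotype].append(mean(values))
    return dict(groups)

groups = per_animal_means(measurements)
for genotype, values in groups.items():
    print(f"{genotype}: n = {len(values)} animals")
```

      Here the ten raw measurements reduce to n = 2 animals per genotype. Reporting this explicitly in each legend is what allows a reader to judge whether a statistical test was run on independent observations.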

      The authors should also be more consistent in the way they indicate which statistical tests were performed. This should also be indicated either at the appropriate point in the main text or in the figure legend. Furthermore, care should be taken to ensure statistics are presented in an appropriate manner: at the end of legend for Figure 4, it is indicated #p < 0.05 vs. CD38 KO ACM. This hashtag symbol is completely absent from the figure. In Figure 4F-G, the lack of statistical symbols seems to indicate no statistical tests were performed on these data, when the legend covering these panels states "*p < 0.05 versus P70", indicating some tests were done. We cannot interpret this panel without knowing which comparisons were done exactly and which were significant.

      In the "Materials and Methods", the authors give no indication that the assumptions of the statistical test they used were met (normality of data distribution for t-tests, homogeneity of variances for ANOVA...). This needs to be checked, and if not met, appropriate non-parametric tests should be used instead.
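
      As one example of a test that does not assume normality, a two-sample exact permutation test can be sketched with the Python standard library alone. This is an illustrative assumption about what a suitable non-parametric alternative could look like, not a prescription for the authors' data; the function name and the example counts are invented.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_test(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    n_splits = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        n_splits += 1
    return extreme / n_splits

# Hypothetical synapse counts, one value per independent culture (invented).
wild_type = [12, 15, 14, 16, 13, 15]
knock_out = [9, 10, 8, 11, 10, 9]
p_value = exact_permutation_test(wild_type, knock_out)
print(f"p = {p_value:.4f}")
```

      Full enumeration is only practical for small groups (924 splits for six versus six); for larger samples a random subset of permutations, or a rank-based test, would be the usual choice.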

      Minor comments:

      • Specific experimental issues that are easily addressable. Most of the experimental issues that need to be addressed are given in previous sections and should be easily addressable.
      • Citation of previous studies? Adequate
      • Clarity and accuracy of text and figures There are issues with the clarity and accuracy of text and figures - which are described above. The text is also often problematic in its phrasing and other, more fundamental aspects. For instance, the authors spent a considerable amount of time speaking about the role of oxytocin, when they only performed one measurement of oxytocin levels in mice.
      • Suggestions to improve the presentation of data and conclusions? All my suggestions to improve the presentation of data can be found in previous sections. As for improving the authors' presentation of their conclusions, the authors should make a considerable re-drafting effort, particularly for the "Discussion", which lacks clarity in how supporting arguments are built and presented. For example, on p.13, I am confused by the argument made by the authors. Their data are focused on synapses onto pyramidal neurons of the mPFC, but here the discussion states that the behavioral phenotype they observed in CD38 AS-cKOP10 might be explained by a lack of mPFC neurons synapsing onto neurons in the Nucleus Accumbens (assuming that "NAc" really refers to this brain region, as the definition is missing from the text). I think the authors should make it clear if this is their interpretation of their own result, which essentially renders their focus on mPFC pointless, or a speculation on possible other mechanisms that could also explain their behavioral results. Personally, given the data shown, I believe the authors should focus on explaining how their data in mPFC might explain the behavioral output observed. The authors could also provide perspectives on how the hypothesis laid down in this paragraph would be tested. When the authors write on p.14 "We identified SPARCL1 as a potential molecule for synapse formation in cortical neurons" why use the word "potential"? Does this mean the authors consider their data on SPARCL1 (one of the key messages of the paper) invalid? If the authors themselves think the role of SPARCL1 is ambiguous based on their own data, they should perform further experiments. On p.13, the authors write: "Moreover, many studies have shown that astrocyte-specific molecules, including extracellular molecules such as IL-6, are involved in memory function"; Interleukin 6 (IL-6, abbreviation not defined in the manuscript) is definitely not an astrocyte-specific molecule (see, for example, Erta et al., 2021 10.7150/ijbs.4679).

      Significance

      NATURE AND SIGNIFICANCE OF THE ADVANCE: I think that despite the issues described above, this manuscript, once revised, could have a strong impact in the field. It would fuel the current paradigm shift which puts astrocytes at the forefront of neuronal circuit wiring during development with links to adult behavior. By identifying clear molecular targets involved in astrocyte-driven synaptogenesis, this article could help the clinical field to find new druggable targets, which may help reverse aging-related cognitive decline.

      COMPARISON TO EXISTING PUBLISHED KNOWLEDGE: This work adds new data to the specific and growing line of research that studies how astrocytes control synaptogenesis. Recent reviews have summarized advances in this field (Shan et al., 2021, 10.3389/fcell.2021.680301; Baldwin et al., 2021, 10.1016/j.conb.2017.05.006).

      AUDIENCE: Neuroscientists in general, clinicians interested in cellular and molecular causes of neurodevelopmental disorders leading to social dysfunctions.

      REVIEWER EXPERTISE: Astrocyte biology; Astrocyte-neuron interactions and synapse assembly; Neuronal circuit formation and plasticity

      Referees cross-commenting

      After careful reading of the other comments, I feel that there is considerable agreement/overlap between the reviewers on the main issues with this manuscript. Perhaps the major difference relates to the amount of further work necessary for the manuscript to be publication ready.

      As Reviewer 3 rightly points out, this is always a moot point: how much is it reasonable for reviewers to ask authors to do? While I agree with all of Reviewer 1's comments regarding the rigour of the mass-spec/western blot analysis, it seems to me that from a molecular/cell biological point of view, the key issue is whether Sparcl1 is a synaptogenic factor released from astrocytes following CD38/cADPR/calcium signaling (irrespective of whether other factors may be in play); and whether raising Sparcl1 levels is sufficient to recover spine morphology and synapse numbers. Of course, if these experiments were performed in vivo using AAV-mediated overexpression of Sparcl1, it is also reasonable to think that the deficit in social memory may be reversed on testing.

      The issue of whether there is a difference in observable behavioral phenotypes between the astrocyte-specific and constitutive CD38 knock-outs is an interesting one, as is why there is only a deficit in social memory seen following astrocyte-specific CD38 ablation. These issues should at least be discussed.

    1. Author Response:

      Evaluation Summary:

      This study adds to the considerable, but often conflicting, work on how neurotransmitter systems contribute to auditory processing dysfunction. The paper details a thorough and careful analysis of an important hypothesis from the point of view of schizophrenia research: do muscarinic and dopaminergic receptors contribute to mismatch negativity effects? The answers could be useful for future treatment allocation in psychosis. The analysis was pre-registered and departures from the planned analysis were well-motivated and clearly described.

      Thank you for this positive statement. We would like to make sure that the nature of our pre-registration is fully understood: we did not formally pre-register our study (i.e., there was no independent peer review). Instead, we defined an analysis plan ex ante (i.e., before beginning the data analysis for examining drug effects), and time-stamped and uploaded this plan on our institutional Git repository, prior to the unblinding of the analysing researcher. This a priori analysis plan is publicly available as well as our analysis code, and we report any departures from the analysis plan in our manuscript.

      Reviewer #1 (Public Review):

      The reduced amplitude of the mismatch negativity (MMN) in schizophrenic patients has been associated with NMDA receptor malfunction. Weber and colleagues adjusted the systemic levels of two neurotransmitters (acetylcholine and dopamine) that are known to modulate NMDA receptor function, and examined the effects on mismatch-related ERPs. They examined mismatch-related ERPs elicited during a novel passive auditory oddball paradigm where the probability of hearing a particular tone was either constant for at least 100 trials (stable phases) or changed every 25-60 trials (volatile phases). Using impressive statistical testing, the authors find that mismatch responses are selectively affected by reduced cholinergic function, particularly during stable phases of the paradigm, but not by reduced dopamine function. Interestingly, neither enhanced cholinergic nor enhanced dopamine function affected mismatch responses at all. While the presented data support the main conclusions mentioned above, there are some claims in the abstract and text that are not supported by the results.

      1) The authors state in the abstract that "biperiden reduced and/or delayed mismatch responses......", while the results (Figure 2) support the statement that biperiden delayed mismatch responses, the claim that biperiden reduced mismatch responses is misleading as on P13 the authors actually report that "mismatch signals were stronger in the biperiden group compared to the placebo group at right central and centro-parietal sensors" around 200ms. This is close both in time and spatially to the traditional temporal and spatial locations of the MMN component. If one were to only read the abstract they would take away the result that the muscarinic acetylcholine receptor antagonist biperiden has an attenuative effect on MMN which is not what the results show.

      Thank you for this comment. We agree that the description in the abstract might be misleading and have changed our wording there. We now say (in the overall shortened abstract):

      “We found a significant drug x mismatch interaction: while the muscarinic acetylcholine receptor antagonist biperiden delayed and topographically shifted mismatch responses, particularly during high stability, this effect could not be detected for amisulpride, a dopamine D2/D3 receptor antagonist.”

      2) The conclusion that biperiden reduced mismatch responses may be due to the finding that at pre-frontal sensors mismatch responses were significantly smaller in the biperiden group than in the amisulpride (a dopaminergic receptor antagonist) group (P9) around 164ms. However, it is difficult to interpret if this is a meaningful result as amisulpride was found not to significantly alter mismatch responses in any way compared to placebo. It would be more convincing if the significant difference here were between biperiden and placebo groups. Or are we to think of amisulpride as being comparable to a placebo?

      We agree with your previous point and have adjusted our wording in the abstract accordingly (see response to previous comment).

      Furthermore, we have included an additional section in the Discussion in which we address the points you raise:

      "One might wonder whether the early difference between the biperiden and the amisulpride group at pre-frontal sensors is difficult to interpret, given the lack of differences of either drug group compared to placebo. However, given our research question – i.e., whether auditory mismatch signals are differentially susceptible to muscarinic versus dopaminergic receptor status – showing a significant difference between biperiden and amisulpride is critical.

      Clearly, such a differential effect would be even more compelling if biperiden differed significantly from amisulpride and placebo at the same time (and in the same sensor locations). While we do not find this in our main analysis, we do see it for the analysis using the alternative pre-processing pipeline and the trial definition (Figure 2—figure supplement 3) that was also specified a priori in our analysis plan. In this alternative analysis, mismatch responses under biperiden did differ significantly from both placebo and amisulpride."

      We suspect this difference in results between the analysis pipelines might partly be due to the different re-referencing. Compared to the average reference used in the main analysis, the linked mastoid reference in the alternative pre-processing pipeline subtracts the effects at sensors which show positive mismatch signals from those at fronto-central channels (with opposite sign), effectively enhancing the signal at the fronto-central channels (for evidence of this effect see also current Figure 3—figure supplement 1) but weakening it at temporal and pre-frontal sensors.
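
      A toy numerical sketch can make this re-referencing argument concrete. The channel names and amplitudes below are invented for illustration and are not taken from the study's data or analysis code; the helper `rereference` is hypothetical and simply subtracts the mean of the chosen reference channels from every channel.

```python
def rereference(data, ref_channels):
    """Subtract the mean of the reference channels from every channel."""
    ref = sum(data[ch] for ch in ref_channels) / len(ref_channels)
    return {ch: value - ref for ch, value in data.items()}

# Hypothetical deviant-minus-standard amplitudes (microvolts) at one time
# point, relative to an arbitrary recording reference. All values invented.
mismatch = {
    "Fz": -2.0,   # fronto-central: negative deflection
    "Cz": -1.5,
    "Fp1": 1.0,   # pre-frontal: positive deflection
    "M1": 1.5,    # mastoids: opposite sign to fronto-central channels
    "M2": 1.5,
}

average_ref = rereference(mismatch, list(mismatch))    # average reference
linked_mastoid = rereference(mismatch, ["M1", "M2"])   # linked mastoids

for ch in mismatch:
    print(f"{ch}: average = {average_ref[ch]:+.2f}, "
          f"linked mastoid = {linked_mastoid[ch]:+.2f}")
```

      In this toy example the linked mastoid reference increases the magnitude of the effect at Fz and Cz, sets the mastoids to zero by construction, and shrinks the effect at Fp1, which is the qualitative pattern described above.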

      We now discuss the question of sensitivity of both our paradigm and processing strategy in the discussion.

      3) The authors use the words mismatch negativity (MMN) and mismatch responses interchangeably however in some cases it is clearly mismatch responses being described and not the classical MMN ERP component. This occurs especially in the Introduction where the authors describe the study and that they plan to focus on the MMN but in the results section, since the initial analysis focuses on all sensors, other mismatch responses are consistently discussed. These differences in wording need to be precisely defined and used consistently in the text.

      We agree that it is important to use precise definitions of the terms and be consistent in their use. The dipole source signal of mismatch detection shows up with different signs across different sensor locations, and “MMN” traditionally refers to the effect in fronto-central channels, where it is a deviant-induced negativity. However, even when we constrain the use of “MMN” to the (difference in) negative deflection at fronto-central channels between 100 and 250ms (or similar) there remains some ambiguity due to the choice of reference. A common choice in MMN research is a linked mastoid reference. Because the mismatch signal shows up at the mastoids with opposite sign to fronto-central channels, this reference maximizes the observed difference at fronto-central channels (see also our Figure 3—figure supplement 1 and our reply to the previous comment) and minimizes it elsewhere, effectively forcing all (drug or other) effects to show up at frontocentral channels. This demonstrates that we typically think of the effects at different sensor locations as (caused by) one and the same (dipole source) signal. In our average referenced data (our main analysis), we observe some effects at fronto-polar sensors, where they are expressed as a modulation of a positive deflection, however, we think of these as being part of what is typically referred to as “MMN” for the above reasons.

      However, to avoid any confusion that this may cause, we have adapted the wording in our manuscript everywhere and mention this distinction in the methods section:

      “To avoid confusion, we will only use the term “MMN” when we talk about effects in the classical time window (100-200ms) and sensor locations (frontocentral sensors) for the MMN, and use “mismatch responses” for all other effects.”

      4) A weakness of the paper would be that the authors offer no prediction in the Introduction about what the expected effects of these specific neurotransmitter modulations would be on mismatch responses.

      Thank you for this suggestion and apologies for this oversight. We have now added a sentence to the Introduction, describing the effects we expected based on previous literature.

      “Based on previous literature, one would expect mismatch responses in our paradigm to be sensitive to (1) volatility, with larger mismatch amplitudes during more stable phases (Dzafic et al., 2020; Todd et al., 2014; Weber et al., 2020), and (2) cholinergic manipulations, with galantamine increasing and biperiden reducing mismatch amplitudes (Moran et al., 2013; Schöbi et al., 2021). Furthermore, we expected a differential effect of cholinergic (muscarinic) and dopaminergic receptor status on mismatch responses, as postulated by initial work on MMN-based computational assays (Stephan et al., 2006). Our results suggest that muscarinic receptors play a critical role for the generation of mismatch responses and their dependence on environmental volatility, whereas no such evidence was found for dopamine receptors.”

      5) A nice aspect of this paper is that the authors re-analyzed their data using pre-processing settings identical to those used in comparable research papers examining the effect of cholinergic modulation on MMN. The main findings did not differ following this re-analysis.

      Reviewer #2 (Public Review):

      The authors found that Biperiden (M1 antagonist) delayed and altered the topography of MMN responses, particularly in the stable condition. Amisulpride did not do so, and neither did Galantamine or L-DOPA. The analysis using an ideal Bayesian observer (the HGF) detailed in the Appendix showed that Biperiden reduced the representation of lower-level prediction errors and increased that of higher-level prediction errors (about volatility).

      The methods were rigorous (including obtaining drug plasma levels and detailing alternative preprocessing techniques) and I have no suggestions for improvement from that point of view.

      I only have one main comment that I think could be discussed. I'm not an expert on this but as I understand it, Olanzapine is most selective for M2 receptors rather than M1 (https://www.nature.com/articles/1395486), although Clozapine metabolites do have some M1 selectivity (https://www.pnas.org/content/100/23/13674) - I'm not sure about Clozapine itself. So Biperiden (very M1 selective) might not be the ideal drug to use to explore a treatment allocation paradigm, at least for Olanzapine? I suspect the options are quite limited but it would probably be worth commenting on this.

      Thank you for pointing this out, this is indeed an important point for the discussion.

      First, clarifying the pharmacodynamics of psychopharmacological drugs and their relative affinity to different receptor subtypes is notoriously difficult as this depends on many methodological factors. The seminal paper on the binding profile of olanzapine (which, at the same time, also examined clozapine) is (Bymaster et al., 1996). Using in vitro assays, this study found that both olanzapine and clozapine showed by far the greatest affinity for the M1 receptor (see their Table 5). By contrast, using SPECT data from seven patients with schizophrenia treated with olanzapine, the paper you mentioned (Raedler et al., 2000) estimated the affinity of olanzapine to the M2 receptor as being roughly twice as high as to the M1 receptor. Both studies have methodological pros and cons (as discussed by (Raedler et al., 2000)). From our view, an important limitation of the study by (Raedler et al., 2000) is that they used the ligand [I-123]IQNB which is not selective and "does not allow discrimination between the different subtypes of the muscarinic receptors" (Raedler, Knable, Jones, Urbina, Gorey, et al., 2003). Instead, the M1/M2 comparison by (Raedler et al., 2000) rested on conclusions from a mathematical approximation – under various assumptions and with only 7 data points available. We note that subsequent studies by the same group on muscarinic receptors in schizophrenia (Raedler, Knable, Jones, Urbina, Egan, et al., 2003; Raedler, Knable, Jones, Urbina, Gorey, et al., 2003) no longer used this approach and refrained from making statements about relative selectivity of olanzapine and clozapine with regard to M1/M2 receptors. Furthermore, the results by (Raedler et al., 2000) are potentially confounded by the fact that they were not obtained from healthy controls, but from patients with schizophrenia. This is potentially problematic: if schizophrenia is characterised by an aberration related to M1 receptors (see below), this would affect the interpretability of the results by (Raedler et al., 2000). Overall, the relative affinity of olanzapine and clozapine to M1/M2 receptors remains a matter of debate, but it seems safe to say that both drugs affect both receptors.

      Second, we would like to explain that we think of biperiden as a model of a (potential) impairment, rather than a treatment. A series of studies have provided compelling evidence for a role of muscarinic (M1) receptor dysfunction in the pathophysiology of schizophrenia. In particular, there is compelling evidence for a subgroup of patients with markedly decreased M1 availability in the prefrontal cortex ((E. Scarr et al., 2009); see also (Gibbons et al., 2013) and (Elizabeth Scarr et al., 2018)). Moreover, multiple studies have found antipsychotic effects of xanomeline, an M1/M4 agonist (Bodick et al., 1997; Shekhar et al., 2008).

      Against this background, clozapine and olanzapine may seem counterintuitive as treatment options since they antagonize muscarinic receptors. However, the muscarinic system is complex, and the mechanisms by which muscarinic receptors are involved in the therapeutic effects of clozapine and olanzapine are far from being understood. One interesting observation is that both clozapine and olanzapine have been found to elevate extracellular acetylcholine concentrations in cortical regions (Ichikawa et al., 2002; Shirazi-Southall et al., 2002), potentially by blocking muscarinic autoreceptors (Johnson et al., 2005), although this is debated (Tzavara et al., 2006). There is clinical evidence that clozapine or its metabolites may exert their pro-cognitive effects by increasing the release of acetylcholine (Weiner et al., 2004), and preclinical evidence that clozapine is able to normalize M1 receptor availability in cortex (Malkoff et al., 2008).

      Irrespective of the exact mechanism by which clozapine and olanzapine exert their antipsychotic effects, their much higher affinity to muscarinic cholinergic receptors compared to dopaminergic receptors sets them apart from other antipsychotics. If a functional readout of the relative contribution of cholinergic versus dopaminergic deficits could be obtained in individual patients, this might be predictive of whether this patient would profit from clozapine, olanzapine, or, in the future, potential new treatments targeting the muscarinic system specifically.

      Given the above considerations, we have amended the relevant paragraph in the discussion to state this rationale more clearly.

      Notably, there is compelling evidence for a subgroup of patients with markedly decreased M1 availability in the prefrontal cortex ((E. Scarr et al., 2009); see also (Gibbons et al., 2013) and (Elizabeth Scarr et al., 2018)). This is consistent with the possibility that a key pathophysiological dimension of the heterogeneity of schizophrenia derives from a differential impairment of cholinergic versus dopaminergic modulation of NMDAR function (Stephan et al., 2006, 2009). Distinguishing these potential subtypes of schizophrenia could be highly relevant for treatment selection, as some of the most effective neuroleptic drugs (e.g., clozapine, olanzapine) differ from other atypical antipsychotics (e.g., amisulpride) in their binding affinity to muscarinic cholinergic receptors. The exact mechanisms by which muscarinic receptors are involved in the therapeutic effects of clozapine and olanzapine are still under debate and include, for example, elevation of extracellular levels of acetylcholine in cortex (Ichikawa et al., 2002; Shirazi-Southall et al., 2002; Weiner et al., 2004), possibly via blocking presynaptic muscarinic autoreceptors (see (Johnson et al., 2005; Tzavara et al., 2006) for conflicting data), and normalization of M1 receptor availability in cortex (Malkoff et al., 2008). Irrespective of the exact mechanism by which clozapine and olanzapine exert their antipsychotic effects, their much higher affinity to muscarinic cholinergic receptors compared to dopaminergic receptors sets them apart from other antipsychotics. If a functional readout of the relative contribution of cholinergic versus dopaminergic deficits could be obtained in individual patients, this might be predictive of whether this patient would profit from clozapine, olanzapine, or, in the future, potential new treatments targeting the muscarinic system specifically. Indeed, muscarinic receptors have become an important target of drug development for schizophrenia (Yohn & Conn, 2018).

    1. Author Response:

      Reviewer #2 (Public Review):

      The manuscript addresses an important question regarding sensory processing related to self-motion. The main experiment is clearly described and demonstrates that neurons display a diversity of responses from purely reflecting vestibular input (head-in-space motion) to predominantly body motion, and any combination between. Of particular interest is that the responses of the Purkinje cells are profoundly different from those of their downstream targets, the fastigial neurons, which signal only head-in-space or body motion. This substantive difference in neural representations between these two connected brain regions is surprising.

      The manuscript also provides a simple population model to show that fastigial responses could be generated from Purkinje cell activity, but only from combining at least 40 neurons. While the model provides some insight on the potential interaction between Purkinje cells and fastigial neurons, I think the model assumes no other input to the fastigial neurons. However, I would assume that there is likely a strong input from mossy fibers onto the fastigial neurons that also target the Purkinje cells. This mossy fiber input will certainly provide vestibular and neck proprioceptive input to the fastigial nucleus. Thus, the Purkinje cell input may be essential for countering the mossy fiber input leading to separate representations for head and body motion in the fastigial nucleus.

      We agree this is an important point. To address the reviewer’s concern, we performed additional modeling in order to consider the influence of mossy fiber inputs. Specifically, following the reviewer’s suggestion below, mossy fiber input was modeled using random patterns of vestibular and neck proprioceptive input. Prior studies have shown that the dynamics of vestibular nuclei neuron responses strongly resemble those of unimodal fastigial neurons in rhesus monkeys (i.e., they encode vestibular input and are insensitive to neck proprioceptive inputs, Roy & Cullen, 2001). In contrast, the responses of reticular formation neurons to such yaw head and/or neck rotations have not yet been described. We therefore simulated mossy fiber input first as a summation of vestibular and neck proprioceptive inputs, for which the gains and phases were randomly drawn from a distribution comparable to that previously reported (Mitchell et al. 2017) in the vestibular nuclei (Fig. 7-figure supplement 3). We then further explored the effect of systematically altering this simulated mossy fiber input - relative to the reference distribution of mossy fiber inputs - by i) doubling the gain, ii) reducing the gain by half, iii) doubling the phase, and iv) reducing the phase by half (Fig. 7-figure supplement 4). Overall, we found that the addition of such simulated mossy fiber input did not dramatically alter our estimate of the Purkinje cell population size required to generate rFN neuron responses (~50 versus 40; Fig. 7-figure supplement 3&4).
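
The mossy fiber manipulation described above can be illustrated with a short sketch. It is not the authors' code: the function name `simulate_mossy_fiber`, the choice of normal distributions, and every numerical parameter are assumptions made for the example, since the response states only that gains and phases were drawn at random from a distribution comparable to published vestibular nuclei data. Each sinusoidal response is represented as a phasor (gain and phase), and one mossy fiber input is the sum of a vestibular and a neck proprioceptive phasor. The two scale factors correspond to the four manipulations listed (doubling or halving the gain or the phase).

```python
import cmath
import math
import random


def simulate_mossy_fiber(rng, gain_scale=1.0, phase_scale=1.0,
                         gain_mean=0.3, gain_sd=0.1,
                         phase_mean=10.0, phase_sd=20.0):
    """Return one simulated mossy fiber input as a complex phasor.

    All distribution parameters are hypothetical placeholders, not
    values from the manuscript. gain_scale and phase_scale implement
    the x2 / x0.5 manipulations of gain and phase.
    """
    total = 0j
    # One phasor for the vestibular input, one for the neck input.
    for _ in ("vestibular", "neck"):
        gain = gain_scale * max(0.0, rng.gauss(gain_mean, gain_sd))
        phase_deg = phase_scale * rng.gauss(phase_mean, phase_sd)
        total += cmath.rect(gain, math.radians(phase_deg))
    return total


# Same seed, so both calls use identical random draws.
reference = simulate_mossy_fiber(random.Random(1))
doubled_gain = simulate_mossy_fiber(random.Random(1), gain_scale=2.0)
print(abs(reference), math.degrees(cmath.phase(reference)))
print(abs(doubled_gain), math.degrees(cmath.phase(doubled_gain)))
```

Because the gain manipulation is a pure rescaling, doubling the gain doubles the amplitude of the summed input and leaves its phase unchanged; the phase manipulations, by contrast, change how the two components add.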

      Another issue is the limited number of neurons recorded in the secondary experiment with only 12 bimodal neurons and 5 unimodal (although there appears to be only 4 neurons in Figure 5C). Such a small sample impacts the estimated tuning properties of Purkinje neurons in Figure 5D and the results from the population model. This needs to be clearly recognized.

      We have revised the RESULTS to clarify the numbers of Purkinje cells that were tested (13 bimodal and 4 unimodal Purkinje cells). For comparison, in our Brooks and Cullen study, tuning curves were computed for 10 bimodal and 12 unimodal rFN. We note that i) unimodal Purkinje cells make up a relatively small percentage of anterior vermis Purkinje cells and ii) similar to unimodal rFN, our small sample of unimodal Purkinje cells did not demonstrate significant tuning. In contrast, all bimodal Purkinje cells in our sample demonstrated significant tuning. To simulate responses for the bimodal Purkinje cells that were not held long enough to test during the gain-field paradigm (i.e., Fig 5), we generated tuning curves drawn from a normal distribution estimated from 13 bimodal Purkinje cells. We appreciate this was not clear in the original submission and have revised the METHODS section to clarify our approach. Overall, while we recognize that our sample size is small, we nevertheless found it interesting that including our results from this protocol did not increase the estimated population size relative to that estimated using our other dynamic protocols.

      Reviewer #3 (Public Review):

      In this study, the authors characterize the simple spike discharges of Purkinje cells in the anterior vermis of the macaque during passive vestibular and neck proprioceptive stimulation. The activity of most Purkinje cells encoded both vestibular (whole-body rotation) and proprioceptive (body-under-head rotation) stimuli. Although the vestibular and proprioceptive responses were, on average, antagonistic in the preferred direction, consistent with a partial transformation from head to body coordinates, response properties for both modalities were highly variable across neurons. Most cells responded under combined vestibular and proprioceptive stimulation (head-on-body rotation), and these responses were well-approximated by the average of the responses to each modality individually. Vestibular responses exhibited gain-field-like tuning with changes in head-on-body position, though these changes were significantly smaller than the shifts observed for neurons downstream in the rostral fastigial nucleus. Finally, a weighted average of the responses of approximately 40 Purkinje cells provided a good fit to the responses of postsynaptic fastigial neurons.

      Overall, these results provide important and novel insights into the implementation of coordinate transformations by cerebellar circuitry. The experiments are well-designed, the data high quality, the analyses reasonable, and the conclusions justified by the data. The manuscript is clear and well-written, and will be of interest to a broad neuroscientific audience. I have no major concerns. I have a few minor suggestions for improving this manuscript, described below.

      1 - The authors may wish to discuss earlier work in the decerebrate cat by Denoth et al. (1979, Pflügers Archiv), which provided evidence that the responses of Purkinje cells in the anterior vermis to head-on-body tilt are relatively well-approximated by averaging the responses to neck and macular stimulation alone.

      We thank the reviewer for bringing this reference to our attention and have revised the INTRODUCTION and DISCUSSION to include the early work of Denoth et al., 1979.

      2 - To better convey the heterogeneity of responses across the sample of Purkinje cells, two additional supplemental figure panels might be useful: (1) the vestibular, proprioceptive, summed, and combined sensitivities in each direction (as in the Fig. 3C insets) for each individual neuron (perhaps as a series of subpanels), and (2) scatterplots of response phase for proprioceptive vs vestibular stimulation for bimodal neurons (with separate panels for preferred and non-preferred directions).

      We agree that this is a useful way to emphasize the heterogeneity of bimodal Purkinje cell responses and have added the requested response phase scatterplots for proprioceptive vs vestibular stimulation (Fig 2 - figure supplement 2C&D). We have also made a figure showing the summation model for each individual neuron. However, because our Purkinje cell population included 73 neurons, this figure comprises 73 × 2 = 146 polar plots (i.e., two plots per cell, one for ipsilateral and one for contralateral motion). Given the immense size of this figure, we elected not to include it in the supplementary material of the revised manuscript.

      3 - Can the authors provide additional information on the approximate location of the recorded neurons (lobule and zone or mediolateral position)? Is it possible that some project to the vestibular nuclei, rather than the rFN? This consideration seems especially relevant for the interpretation of the pooling analysis in Fig. 6, which seems to assume that Purkinje cells are sampled from a sagittal zone with overlapping projections in the rFN (or, at least, that the response properties of the sampled neurons are representative of the properties in a corticonuclear zone). Some additional discussion on this point would be helpful.

      The recorded neurons were located in the lobules II-V of the anterior vermis, ~0 to 2 mm from the midline. We now include this information in the revised METHODS. As noted by the reviewer, Purkinje cells in this region of the anterior vermis project to the vestibular nuclei as well as to the rFN (Voogd et al. 1991). Nevertheless, using comparable stimulation protocols, we have previously shown that the responses of vestibular nuclei neurons are comparable to those of unimodal rFN neurons (Brooks et al., 2015). Specifically, both vestibular nuclei and unimodal rFN neurons are insensitive to proprioceptive stimulation and demonstrated comparable responses to vestibular stimulation. Thus, our present modeling results regarding the population convergence required to account for unimodal rFN neurons can be directly applied to vestibular nuclei neurons. We have revised the DISCUSSION to consider this point.

      4 - When weighted averages of Purkinje cell responses are used to model rFN responses, my intuition would be that w_i is near zero for v-shaped and rectifying Purkinje cells. That is, the model would mostly ignore them, as data from both directions appear to be included. Is this the case? A more detailed description of the fitting procedure would also be helpful.

      To address the reviewers’ concerns regarding the Purkinje cell weights, we have added a new inset to Fig 7C. As can be seen, model weights are well distributed across different Purkinje cells. Further, to confirm that the Purkinje cell input weights are distributed across different classes of Purkinje cells, we now illustrate the weight distributions for (a) linear vs. v-shaped vs. rectifying Purkinje cells, (b) bimodal vs. unimodal Purkinje cells, (c) Type I vs. Type II Purkinje cells and (d) Purkinje cells with agonistic vs. antagonistic vestibular and proprioceptive sensitivities. These results are shown in Fig 7- figure supplements 1&2. Overall, we found that the weights were not biased towards linear cells but were similarly distributed across all three groups. This was true for our modeling of both bimodal and unimodal rFN cells (compare Fig 7- figure supplement 1 vs. Fig 7- figure supplement 2). As can be seen in these figures, we likewise found comparable results for the weights of Type I vs. Type II Purkinje cells, unimodal vs. bimodal Purkinje cells, and vestibular/proprioceptive agonist vs. antagonist bimodal neurons. Finally, as detailed above in our response to the reviewers’ consensus comments, we have also revised the METHODS section to provide a more detailed description of the linear regression method.
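
As a rough illustration of how per-cell weights w_i can be obtained by linear regression, the sketch below fits a target response as a weighted sum of a few synthetic "Purkinje cell" responses. It assumes ordinary least squares solved through the normal equations; the response does not specify the fitting details, so this is one plausible reading rather than the authors' procedure. The helper names (`solve_linear_system`, `fit_weights`) and the three toy response shapes are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_weights(cell_responses, target):
    """Least-squares weights so that sum_i w_i * cell_i approximates target."""
    n = len(cell_responses)
    gram = [[sum(p * q for p, q in zip(cell_responses[i], cell_responses[j]))
             for j in range(n)] for i in range(n)]
    rhs = [sum(p * q for p, q in zip(cell_responses[i], target))
           for i in range(n)]
    return solve_linear_system(gram, rhs)


# Toy data: one stimulus cycle sampled at 50 points.
angles = [2.0 * math.pi * k / 50 for k in range(50)]
cells = [
    [math.sin(a) for a in angles],            # linear cell
    [math.cos(a) for a in angles],            # phase-shifted linear cell
    [max(0.0, math.sin(a)) for a in angles],  # rectifying cell
]
true_weights = [0.5, 0.3, 0.2]
target = [sum(w * c[k] for w, c in zip(true_weights, cells))
          for k in range(len(angles))]
weights = fit_weights(cells, target)
print(weights)
```

In this noise-free toy case the rectifying cell receives a non-zero weight because the target was built to contain it; with real data, whether such cells are weighted near zero is an empirical question, which is what the weight distributions in the supplementary figures address.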

      5 - Another potential interpretive issue in the averaging analysis concerns the presence of noise on single trials. The authors could briefly comment on whether more Purkinje cells might be needed to predict rFN responses on a single trial in real time.

      This is an interesting question; we have revised the DISCUSSION to consider this point.

    1. Author Response:

      Reviewer #1:

      The aim of this paper to reveal the mechanisms that establish the Wnt gradient combining a mathematical model and experiments is of general importance. The results of computer simulations and biological experiments are interesting because they consider multiple extracellular components. They successfully demonstrated that the ligand/receptor feedback and the other extracellular components shape the morphogen gradient of Wnt ligand so that the fine patterning found in heart development can be explained. However, I feel that quantification of the experimental data, explanation of the mathematical model and discussion of the results are not sufficient in the current manuscript.

      Major points:

      1. Experimental validation of the results of computer simulations is very important in this study. However, many of experimental data were not properly quantified or statistically tested. The authors would need to quantify the experimental results when appropriate and perform statistical tests (e.g. Figs. 1E, 2A, 4A-B, Supplemental Figs. 6, 7).

      We are sorry for the lack of quantitative and statistical analyses in many experiments. We revised all the points (graphs and statistical analyses in Figs 1, 2, 4; Figure 1-figure supplement 1; Figure 3-figure supplement 7; Figure 4-figure supplement 1, 2).

      2. Design of the mathematical model is not sufficiently explained in the main text. Besides details in the method section, the basic design of the model and simulation should be briefly explained. For example, initial distribution of Fzd7, regions that produce Wnt6 and sFRP1, and interpretation of the simulation results should be added for Fig. 3 (page 10, line 11-16).

      We are sorry for the inconvenience. In this revision, we describe the basic design of the model and simulation in the main text.

      As an interpretation of the simulation results, we added an explanation as follows:

      The Wnt signaling gradient became steeper with increased feedback strength. Considering a threshold of signal activation (Fig. 3A, dashed line), feedback results in restriction of the Wnt-activated region.

      3. The authors demonstrated the roles of Wnt6/Fzd7 feedback and sFRP/Heparan sulfate binding. A typical simulation data showing the roles of sFRP and Heparan sulfate would need to be shown in the main figure.

      Thank you for your suggestions. We moved a typical result of sFRP/HS simulation from the original supplemental figure to a main figure (Fig. 4G).

      Unfortunately, they did not sufficiently discuss their actions using the mathematical model. They would need to at least qualitatively discuss these points. How do they control the Wnt gradient? What are the roles of these two mechanisms? What are the differences? How do they influence each other? Simplified models may be necessary to reveal the relationship between these two mechanisms and to gain mechanistic insights.

      Thank you for pointing out these critical points.

      For Wnt gradients, receptor feedback, sFRP, and HS act synergistically to restrict the signal-activated region (steep gradient).

      However, there are some differences. The receptor feedback can overcome the variation of Wnt production but sFRP1 and HS cannot because sFRP1 expression is inhibited by Wnt, which forms a positive feedback loop for Wnt signaling (Gibb et al., 2013). Thus, sFRP1/HS cannot buffer the variation of Wnt production.

      In this revision, we added these explanations.

      [They will influence each other] Because sFRP1 inhibits Wnt signaling, sFRP1 reduces fzd7 expression. This occurs mainly in the right side (because sFRP1 is expressed in the right side), resulting in a short-range activation of Wnt signaling.

      Deeply considering your comments, we recognized that we did not describe sFRP1/HS function in the title of the previous version. We revised it as follows:

      Previous) Positive Feedback Regulation of fzd7 Expression Robustly Shapes Wnt Signaling Range in Early Heart Development

      Current) Positive feedback regulation of fzd7 expression robustly shapes a steep Wnt gradient in early heart development, together with sFRP1 and heparan sulfate

      Additionally, the situation studied in this paper would need to be compared with the other examples of ligand/receptor feedback, and the similarity and difference should also be discussed (e.g. Hedgehog/Patched and Wingless/Frizzled2 in the fly wing).

      Thank you for your helpful comments.

      As you mentioned, the gene regulatory circuit of our Wnt6/Fzd7 is similar to that of Hedgehog (Hh)/Patched (Ptc): both morphogens undergo self-enhanced degradation via induction of receptor expression (Eldar et al., 2003; Hh induces Ptc expression, and this increases Hh degradation). In the case of Wingless/Frizzled2, the gene regulatory circuit is different from that of Wnt6/Fzd7: Wingless undergoes self-enhanced degradation via repression of receptor expression. Wingless inhibits Fzd2 expression, and Fzd2 inhibits Wingless degradation. Both gene regulatory circuits function as systems that are robust to variations in morphogen level (Alon, 2006).

      There is also a small difference between Wnt6/Fzd7 and Hh/Ptc. In the Hh pathway, the receptor Ptc inhibits downstream signaling. Thus, the Hh network restricts the ligand distribution as is the case with Wnt, but the signal activity is not as steep as for Wnt (high Ptc expression inhibits the signaling).

      We added these explanations.

      Reviewer #2:

      In this work, the authors tried to understand the effect of receptor and diffusible inhibitors on the Wnt morphogen gradient during heart development by combining experiment and computational modeling. The experimental part seems to be a solid contribution to this academic field, and I appreciate the interdisciplinary attempt to combine the results with the computational model. However, their results may be interpreted more clearly using classical mathematical models.

      First of all, we greatly thank you for evaluating our manuscript. And thank you very much for explaining classical models in detail.

      1. Classical models may be enough.

      Previous mathematical models provided stronger predictions than numerical simulations, and I am not sure numerical results provided by the authors give us new insights. For example, Eldar et al. (2003) have provided analytical results on why the concentration becomes robust. In the normal SDD model

      u'(x,t) = -d_1 u(x,t) + d_u \Delta u(x,t),

      the steady-state solution is an exponential function,

      u_s(x) = u_0 \exp(-\sqrt{d_1/d_u} x),

      and the amount of morphogen production at the boundary critically affects the result (if the production is halved, the concentration is halved everywhere). On the other hand, if the degradation is promoted by the morphogen itself (in this case, by the upregulation of the receptor expression), the governing equation becomes

      u'(x,t) = -d_2 u(x,t)^2 + d_u \Delta u(x,t),

      the solution is

      u_s(x) =A/(x+x_b)^2

      ($A$ and $x_b$ are constants determined by $d_u$ and $d_2$). It converges to

      u_s(x) =A/x^2

      and the morphogen gradient profile does not change much when the morphogen production is relatively high (that means there is a condition to be robust).

      Similarly, a linear approximation is enough to understand the diffusion length change - the diffusion length of the morphogen gradient (the length over which the morphogen concentration falls to 1/e) is in general $\sqrt{d_u/d_1}$, and the feedback mechanism should increase $d_1$ in a first-order estimation, hence decreasing the diffusion length. Binding to HSPG may have a similar effect (in the case of FGF, HSPG is necessary for binding to FGFR, and the situation is very different).
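
The robustness argument in this comment can be checked with a few lines of arithmetic. The sketch below compares how far a threshold position moves when morphogen production is halved, for the exponential profile and for the A/(x + x_b)^2 profile. It rests on one assumption not stated in the text: production enters as a flux boundary condition, -d_u u'(0) = eta, at x = 0. Under that assumption u_0 = eta * sqrt(d_u/d_1) / d_u, A = 6 d_u/d_2 (from substituting the power law into d_u u'' = d_2 u^2), and x_b^3 = 2 A d_u / eta. The parameter values and function names are illustrative only.

```python
import math


def threshold_position_linear(eta, d_u, d_1, threshold):
    """Position where u_s(x) = u_0 exp(-x / lam) equals the threshold."""
    lam = math.sqrt(d_u / d_1)   # decay length
    u_0 = eta * lam / d_u        # from the assumed flux condition at x = 0
    return lam * math.log(u_0 / threshold)


def threshold_position_quadratic(eta, d_u, d_2, threshold):
    """Position where u_s(x) = A / (x + x_b)^2 equals the threshold."""
    a = 6.0 * d_u / d_2                         # from d_u u'' = d_2 u^2
    x_b = (2.0 * a * d_u / eta) ** (1.0 / 3.0)  # from the flux condition
    return math.sqrt(a / threshold) - x_b


eta, threshold = 10.0, 0.01  # illustrative values, arbitrary units
shift_linear = (threshold_position_linear(eta, 1.0, 0.01, threshold)
                - threshold_position_linear(eta / 2, 1.0, 0.01, threshold))
shift_quadratic = (threshold_position_quadratic(eta, 1.0, 1.0, threshold)
                   - threshold_position_quadratic(eta / 2, 1.0, 1.0, threshold))
print(shift_linear, shift_quadratic)
```

For the exponential profile the shift is always lam * ln 2, independent of how strong production is. For the power-law profile the shift is x_b * (2^(1/3) - 1), which shrinks as eta grows, in line with the statement that robustness holds when production is relatively high.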

      Thank you again for your explanations. Our explanations in the previous manuscript were not enough.

      –Difference of our computational simulation and the classical analysis:

      We think we need numerical simulation to consider points not addressed with previous analytical methods. The following two points are the new points that are too complicated to handle with analytical methods.

      1. Transient state is considered, which is hard to analyze without computer simulation.

      Considering the in vivo situation, we cannot determine whether the fate determination takes place at a transient or steady state (as described in page 7, line 14). We therefore did not limit our simulation to the steady state but also included the transient state.

      2. Receptor has multiple functions in interaction with multiple molecular species: it (i) binds to the ligand and restricts the ligand spreading, (ii) activates the intracellular signaling, and (iii) degrades the ligand (new Supplementary Fig. 1A). We would like to include these different functions separately in the simulation. In addition, we considered sFRP1 and N-acetyl-rich HS. Thus, we need a multivariate nonlinear reaction-diffusion equation, which is hard to handle without computer simulation.

      To clarify these points, we added an explanation of the multiple receptor functions with a schematic figure (Supplementary Fig. 1A).
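
To make the transient-versus-steady-state distinction concrete, the following sketch integrates the single-species SDD equation quoted by the reviewer, u_t = d_u u_xx - d_1 u, with an explicit finite-difference scheme and compares an early profile with a late one. It is deliberately much simpler than the multivariate model described in this response: there is no receptor, sFRP1, or HS, and the boundary conditions (fixed concentration at x = 0, zero at the far end), grid, and parameter values are all assumptions chosen for the example.

```python
import math


def simulate_sdd(d_u=1.0, d_1=0.1, u0=1.0, length=40, dx=1.0,
                 dt=0.2, t_end=200.0, snapshots=(5.0, 200.0)):
    """Explicit Euler integration of u_t = d_u u_xx - d_1 u.

    Assumed boundary conditions: u = u0 at x = 0 and u = 0 at x = length.
    Returns a dict mapping each requested snapshot time to its profile.
    """
    n = int(length / dx) + 1
    u = [0.0] * n
    u[0] = u0
    saved = {}
    steps = int(round(t_end / dt))
    snapshot_steps = {int(round(t / dt)): t for t in snapshots}
    for step in range(1, steps + 1):
        new = u[:]
        for i in range(1, n - 1):
            laplacian = (u[i + 1] - 2.0 * u[i] + u[i - 1]) / dx ** 2
            new[i] = u[i] + dt * (d_u * laplacian - d_1 * u[i])
        u = new
        if step in snapshot_steps:
            saved[snapshot_steps[step]] = u[:]
    return saved


profiles = simulate_sdd()
early, late = profiles[5.0], profiles[200.0]
# Analytic steady state at x = 5 for the continuous equation.
steady = math.exp(-5.0 * math.sqrt(0.1 / 1.0))
print(early[5], late[5], steady)
```

At the early time point the concentration at x = 5 is still well below its steady-state value, so a threshold read out at that moment would place the boundary closer to the source than the steady-state analysis predicts.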

      –Importance/significance of our simulation:

      We first confirmed that our simulation reached a similar conclusion as the classical simulation at a certain time point (~ 1 day after the onset of simulation): the network was robust against variation of Wnt production. In addition, examining the time change of activation level, we have found that this network is robust against changes in speed of the differentiation. We added these explanations.

      2. Biological example of Wnt fluctuation

      The authors examine the effect of Wnt production fluctuation, but their motivation is not clear. Eldar et al. (2003) is motivated by the fact that the Shh heterozygote knockout has no phenotype, although the amount of mRNA is halved. Theoretically, it should have a major effect on the organs utilizing the Shh morphogen gradient (actually, haploinsufficiency is observed, but the phenotype is mild). The authors would need to provide some argument why they are interested in the robustness to the Wnt expression fluctuation.

      We agree with your opinion. Compared with Eldar et al. (2003), our motivation for setting a 50% variation in ligand production was not clear.

      It is generally accepted that gene expression differs between individuals. In contrast, the proportions of the patterned tissues are almost the same among individuals.

      We examine this general question in our specific example of Wnt production. Here we focused on an extreme example (50% increase) among the possible magnitudes of variation in gene expression.

      We added a phrase “as an extreme case” to clarify that it is an example in the revised manuscript.

      3. Wnt signal distribution

      It is difficult for general readers to understand why the Wnt signal distribution in the simulations (0 around 0-10 µm, Sudden disappearance at 40 µm) is appropriate. The authors can provide the profile plot of the actual measurement, which corresponds to the modeling result.

      Sorry for this inconvenience. As indicated in Figure 1-figure supplement 1B, Fzd7 shows limited expression in the pericardium. Fzd7 expression was not detected in the epidermis (Figure 1-figure supplement 1B), which is the Wnt source (Lavery et al., 2008), indicating that the sudden increase of Fzd7 expression near the Wnt source (at x = 10 μm) is reasonable (because the amount of Wnt at x = 10 μm is considered to be above the threshold for Fzd7 expression). In the prospective myocardium region, Fzd7 expression also disappeared suddenly (Figure 1-figure supplement 1B), suggesting that Wnt signaling activity also disappears suddenly in this region. We added these explanations.

      In addition to the indirect estimation of Wnt signaling from Fzd7 expression, we tried the following three approaches to directly confirm the “sudden disappearance” of Wnt signaling, but all of them failed. We examined (i) a transgenic reporter line of Wnt signaling (TCF-promoter-driven GFP), (ii) immunohistochemistry (IHC) of beta-catenin (nuclear localization of beta-catenin is an indicator of the activation of Wnt signaling) and (iii) IHC of active beta-catenin (which only detects the active form of beta-catenin), expecting a more gradual signal distribution than the readout of Fzd7 expression, which may have an expression threshold. However, (i) the background signal was high in the transgenic line; (ii) the background signal was also high with IHC, perhaps because beta-catenin is also abundant in the cytoplasm in the heart region; and (iii) the signal of active beta-catenin was not changed by Wnt addition in Xenopus.

      In addition, about the width of wnt6 and fzd7 expression, we measured the actual size of the fzd7-expressed region (Figure 1—figure supplement 1B), which was around 32 μm. It was almost the same as that in the model (30 μm). The width of Wnt6-expressed region was set to be 10 μm following a previous report (Lavery et al., 2008). We added explanations for the width of the expressions.

      4. Variable "Wnt signal"

      It is not clear what the variable "Wnt signal" means. As far as I understand, the signal inside the cell changes quickly (in the case of FGF, the ERK phosphorylation state changes within a minute). The author should provide a concrete example of this "Wnt signal" (maybe mRNA expression of some marker gene?).

      We agree with your opinion. As an indicator of Wnt signal activation, we think of the translocation of β-catenin (a transcriptional regulator) into the nucleus. Indeed, the translocation is observed at least in a 15 min and concurrently the transcription of the target gene is observed (Kafri et al., 2016), suggesting this translocation (the activation of the signal in the cells) is recognized enough by the cells within a minute. We added this explanation.

      5. Use of BMP measurement values.

      In addition, I am not sure whether using BMP values for the estimate of Wnt dynamics is appropriate. I have an impression that BMP is a fast-diffusing molecule that has a less binding affinity to ECM compared to FGFs. Although I have not dealt with Wnts, they are reported to bind strongly to ECM.

      Thank you for the comments. In this revision, we used all of the reported Wnt values. With these new parameters, we reran the computer simulations. None of the conclusions changed.

      Reviewer #3:

      A summary of the study and the strengths of this manuscript: The authors found several new molecular interactions that may be essential for understanding the mechanism of steep gradient formation of Wnt ligands in the prospective cardiac field.

      One of the new findings is that expression of a Wnt receptor, Frizzled7, in the prospective heart field is activated by Wnt/b-catenin signaling, as well as by Wnt6 ligands, which is involved in the patterning of this field. They also found that the diffusing Wnt6 ligand is trapped at the surface of cells in which Frizzled7 is ectopically expressed. It seems reasonable that the combination of signal-dependent receptor expression and receptor-dependent ligand capture would result in a steep gradient of morphogen molecules. In fact, this idea is supported by mathematical modeling. In addition, this modeling suggests that the receptor feedback mechanism provides robustness to morphogen-mediated patterning against fluctuations in morphogen production.

      Another highlight of their study is that the soluble Wnt antagonist, sFRP1, specifically binds to N-acetyl HS, and this modification of HS is specifically detected in the outer region of the cardiogenic field. The localized N-acetyl HS may also be involved in Wnt gradient formation by inhibiting Wnt signaling around the myocardium region.

      The weaknesses of this manuscript: Although the issue they address in this manuscript is very important for understanding the mechanism of morphogen-based tissue patterning, most of the experimental data presented in this manuscript are preliminary.

      We added and revised many experiments (including computational analysis) in this revision. In particular, in Figs 1, 2, 4; Figure 1-figure supplement 1; Figure 3-figure supplement 7; Figure 4-figure supplement 1, 2.

      Therefore, interpretations other than the ones they have argued for in this manuscript are quite possible.

      For example, the authors argue that receptor feedback is essential for the formation of steep Wnt gradients (lines 8-9 in the abstract), but their model does not rule out an alternative possibility that high levels of receptor expression in the cardiogenic field form steep gradients.

      We agree.

      As you mentioned, high levels of receptor expression can form steep gradients. However, when the distributions are similar with and without feedback, the boundary position shifted less in response to a change in Wnt production with feedback than without it (Fig. 3B), suggesting that feedback confers greater robustness to such variation.

      These points were poorly explained in the previous version. We have added an explanation.

      In addition, forming the gradient through high initial receptor expression alone would be a waste of energy, because too much receptor expression is needed. If the initial expression of the receptor (not the receptor feedback) is critical for the patterning, its amount and area should be tightly controlled by an additional mechanism.

      We added these explanations to the result and discussion sections.
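
      The robustness argument above can be illustrated with a minimal steady-state sketch. This is not the model used in the manuscript: it assumes a one-dimensional tissue, a fixed ligand flux at the source, clearance proportional to receptor level, and a signal-induced receptor level of the form `basal + gain * c`. All parameter values and function names are hypothetical. The constant-receptor case is matched to the feedback case only at the nominal boundary position, which is a simplification of "similar distributions".

```python
import math

# All values below are hypothetical, in arbitrary units.
D = 1.0          # ligand diffusion coefficient
K = 1.0          # clearance rate per unit receptor
THRESHOLD = 0.1  # ligand level that defines the pattern boundary
FLUX = 10.0      # nominal ligand flux entering at x = 0


def boundary_no_feedback(flux, receptor):
    """Boundary position for a constant receptor level (exponential profile)."""
    decay_length = math.sqrt(D / (K * receptor))
    c_source = flux * decay_length / D  # from the flux condition -D*c'(0) = flux
    return decay_length * math.log(c_source / THRESHOLD)


def clearance_integral(c, basal, gain):
    """Integral from 0 to c of K*(basal + gain*s)*s ds."""
    return K * (basal * c**2 / 2.0 + gain * c**3 / 3.0)


def boundary_with_feedback(flux, basal, gain, steps=20000):
    """Boundary position when receptor = basal + gain*c (signal-induced)."""
    # Steady state obeys (D/2)*c'^2 = clearance_integral(c), so the flux
    # condition at the source is sqrt(2*D*clearance_integral(c0)) = flux.
    lo, hi = 0.0, 1.0
    while math.sqrt(2.0 * D * clearance_integral(hi, basal, gain)) < flux:
        hi *= 2.0
    for _ in range(100):  # bisection for the source concentration c0
        mid = 0.5 * (lo + hi)
        if math.sqrt(2.0 * D * clearance_integral(mid, basal, gain)) < flux:
            lo = mid
        else:
            hi = mid
    c_source = 0.5 * (lo + hi)
    # Boundary = integral over u = ln(c), from ln(THRESHOLD) to ln(c_source),
    # of c / sqrt(2*clearance_integral(c)/D), evaluated by the midpoint rule.
    u_min, u_max = math.log(THRESHOLD), math.log(c_source)
    du = (u_max - u_min) / steps
    total = 0.0
    for i in range(steps):
        c = math.exp(u_min + (i + 0.5) * du)
        total += c / math.sqrt(2.0 * clearance_integral(c, basal, gain) / D)
    return total * du


def matched_receptor(target_boundary, flux):
    """Constant receptor level giving the same boundary as the feedback model."""
    lo, hi = 1e-3, 1e3  # boundary decreases with receptor level on this range
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if boundary_no_feedback(flux, mid) > target_boundary:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


BASAL, GAIN = 0.05, 1.0  # hypothetical basal receptor level and feedback gain
x_fb = boundary_with_feedback(FLUX, BASAL, GAIN)
r_const = matched_receptor(x_fb, FLUX)
x_nf = boundary_no_feedback(FLUX, r_const)

# Double ligand production and compare how far the boundary moves.
shift_fb = boundary_with_feedback(2 * FLUX, BASAL, GAIN) - x_fb
shift_nf = boundary_no_feedback(2 * FLUX, r_const) - x_nf

print(f"boundary at nominal flux: {x_fb:.3f} (feedback), {x_nf:.3f} (constant)")
print(f"shift after doubling flux: {shift_fb:.3f} (feedback), {shift_nf:.3f} (constant)")
```

      In this toy setting the boundary moves less under feedback because clearance rises with ligand level near the source, which damps the effect of extra ligand production. Whether the same holds quantitatively for the manuscript's model depends on its actual equations and parameters.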

      Furthermore, they have not succeeded in directly examining the effect of receptor feedback on Wnt6 gradient formation. Although the data shown in Supplementary Figure 6E appear to support the contribution of feedback mechanisms to patterning, the results do not exclude another interpretation that an increase in Wnt trapper molecules simply inhibits the receptor-mediated clearance of Wnt ligands from the extracellular space in the pericardial region, resulting in an increase of extracellular Wnt ligands and their long-range transport.

      Thank you for your comment. As you mentioned, the Wnt trapper inhibits clearance. However, at the same time as it inhibits clearance, it also inhibits diffusion of Wnt. These two inhibitions happen simultaneously for the same duration. Thus, the trapper will not promote long-range transport via competitive inhibition of the Wnt clearance.

      Thus, from the results using the trapper, we can conclude that the receptor expressed after the activation of Wnt signal (not the initial amount of receptor) is critical for determining the range of Wnt signaling (e.g. the width of the resulting pericardium).

      We added these explanations in the new text.
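
      The argument above can be made concrete with a minimal steady-state sketch. It assumes a one-dimensional diffusion-clearance model and a trapper that sequesters a fraction $\phi$ of extracellular Wnt, with the bound fraction neither diffusing nor being cleared. The symbols $D$, $k$, $\phi$ and $\lambda$ are illustrative and are not taken from the manuscript's model.

$$
D\,\frac{\partial^2 c}{\partial x^2} - k\,c = 0
\;\;\Longrightarrow\;\;
c(x) = c_0\,e^{-x/\lambda}, \qquad \lambda = \sqrt{D/k}
$$

$$
D_{\text{eff}} = (1-\phi)\,D, \quad k_{\text{eff}} = (1-\phi)\,k
\;\;\Longrightarrow\;\;
\lambda_{\text{eff}} = \sqrt{D_{\text{eff}}/k_{\text{eff}}} = \lambda
$$

      Under these assumptions the decay length is unchanged, because trapping slows diffusion and clearance by the same factor, so trapping alone would not extend the range of the ligand.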

      With regard to the restriction of sFRP1 diffusion, no evidence has been presented to show that N-acetyl modification of HS is actually involved in the restriction of sFRP1 diffusion, the formation of Wnt gradient, and the patterning of prospective cardiac fields. This lack of data significantly undermines the credibility of the conclusions presented in this paper.

      We performed a new experiment.

      We overexpressed Ndst1, the enzyme that converts N-acetyl HS to N-sulfo HS, to eliminate N-acetyl HS, and analyzed whether heart patterning was altered. Ndst1 expression resulted in a reduced pericardium but an enlarged myocardium region, suggesting that N-acetyl HS promotes pericardium differentiation and inhibits myocardium differentiation.

      We added these explanations and figures (Fig. 4F; Figure 4-figure supplement 2A-C).

    1. A list of all the questions that Vannevar Bush poses in the piece:

      • What are the scientists to do next?
      • Of what lasting benefit has been man's use of science and of the new instruments which his research brought into existence?
      • Is this all fantastic?
      • Will there be dry photography?
      • What would it cost to print a million copies?
      • The preparation of the original copy?
      • To consider the first stage of the procedure, will the author of the future cease writing by hand or typewriter and talk directly to the record?
      • Is it not possible that some day the path may be established more directly?
      • Might not these currents be intercepted, either in the original form in which information is conveyed to the brain, or in the marvelously metamorphosed form in which they then proceed to the hand?
      • Is it not possible that we may learn to introduce them without the present cumbersomeness of first transforming electrical vibrations to mechanical ones, which the human mechanism promptly transforms back to the electrical form?
      • True, the record is unintelligible, except as it points out certain gross misfunctioning of the cerebral mechanism; but who would now place bounds on where such a thing may lead?
      • Must we always transform to mechanical movements in order to proceed from one electrical phenomenon to another?
    1. Author Response:

      Evaluation Summary:

      This work provides new insights into how surface-exposed lipoproteins of Gram-negative bacteria reach their destination in the outer membrane. Authors find that the outer membrane protein complex Slam serves as a translocon for the lipoproteins and the periplasmic chaperone Skp mediates their targeting to Slam. This work may contribute to the elucidation of host invasion mechanisms by pathogenic bacteria, in which surface lipoproteins play an important role.

      Reviewer #1 (Public Review):

      Previously, using rigorous genetic, bioinformatic and cell-based biochemical analyses, the same group discovered SLAM1, an outer membrane protein in Neisseria spp., which mediates the membrane translocation of surface lipoproteins (SLPs) (Hooda et al. 2016 Nature Microbiology 1, 16009). Here, authors reconstituted this system in proteoliposomes using minimal purified components including the translocon Slam1 and the client lipoprotein TbpB. Authors further coupled the system to TbpB-expressing E. coli spheroplasts and LolA, the Slam1-specific periplasmic shuttle system. Using the digestion pattern of TbpB by Proteinase K as a readout, authors confirmed that Slam1 indeed served as a translocon for SLPs. As a step forward, authors found that Skp, a periplasmic chaperone (holdase), was critical to the membrane-assembly and translocation of TbpB.

      Strengths: Overall, this is a solid biochemical study that demonstrates the role of Slam1 as a translocon for SLPs. The experimental design is neat and straightforward. The specific role of Skp in SLP translocation is interesting. This reconstituted system will serve as a novel platform for further elucidation of the Slam1-mediated SLP translocation mechanisms. The manuscript is overall well written.

      Weakness: There are several major concerns, however.

      1) It is not fully convincing whether these findings are novel and significantly advance the field. Identification of minimal components in a biological process and their reconstitution are always challenging and thus, this study is an achievement. Nonetheless, I am not sure whether we have learned novel molecular insights besides the confirmation of the group's previous discovery. The specific role of Skp in translocation is interesting but not surprising, considering that periplasmic holdases are already known to be extensively involved in the biogenesis of periplasmic and outer membrane proteins.

      We thank the reviewer for their time and thorough review of the manuscript. In the previous paper (Hooda et al. 2016 Nature Microbiology 1, 16009), we discovered that the outer membrane protein Slam is “important/responsible” for the surface display of SLPs (TbpB, LbpB, fHbp). In this mechanism-focused manuscript, we were able to demonstrate Slam’s role as an outer membrane translocon. One of the achievements of this paper is to demonstrate that Slam is an autonomous translocon. Importantly, this is unlike the two-partner secretion systems, as Slam does not require the Bam complex for the translocation of TbpB.

      2) Although authors developed nice assays (Figs. 1 and 2), it was not verified whether TbpB protected from Proteinase K digestion had "correct" conformation and membrane-topology. Authors performed a functional assay on TbpB (Fig. 5a), but this result was obtained from a cell-based assay, not from the reconstituted system.

      We have performed a pulldown assay on the TbpB that has been translocated into Slam-proteoliposomes, using human transferrin-conjugated beads, to show that this TbpB protein is correctly folded and functional. Blots and explanations are included in the revised manuscript (see new Figure 2 – figure supplement 2 and lines 197-207). (As addressed in major scientific concerns point 2-i).

      Although the data in Figs. 1 and 2 clearly show that the membrane association of TbpB depends on Slam1, it does not mean that the "translocation" has actually occurred in the proteoliposomes. Probably, more rigorous analysis on the Proteinase K-protected portion of TbpB (for example, mass spec) seems necessary (that is, whether the proteolytic product is expected based on the predicted topology).

      TbpB is Flag-tagged at its C-terminus, and the protected band on our blots (detected by α-flag antibody) corresponds to the expected Mw (~75 kDa) for Flag-tagged Mcat TbpB. Therefore, we believe the band at 75 kDa is our full-length processed TbpB. Moreover, we have confirmed that TbpB can be detected at the top of the sucrose gradient with our Slam-proteoliposomes in this assay. This would only occur if TbpB was actually translocated inside the intact liposomes; otherwise we should not see any TbpB in the top layer of the sucrose gradient (Figure 4d). Furthermore, we have performed a pulldown assay for TbpB in proteoliposomes to check for functional binding to human transferrin beads after translocation. These results are explained in the new Figure 2 – figure supplement 2 and lines 197-207.

      3) The manuscript has a couple of missing supporting data. 3a) Lines 87-89: "From our analysis, we found that the Slam1 from Moraxella catarrhalis (or Mcat Slam1) expressed well and the purified protein was more stable than other Slam homologs." I cannot find the expression and stability data of various homologs supporting this sentence.

      What we meant was that we chose Mcat Slam as the target of this study because it is more stable during purification and gave a higher yield of protein. We needed higher yields of Slam to be able to reconstitute the protein into the liposomes for the translocation assay. We have purification data for Mcat Slam1, Nme Slam1 and Ngo Slam2, but we do not think it is necessary to include them in the supplementary material. We have rewritten the section dedicated to Mcat Slam1 purification (Figure 1 – figure supplement 1 and 2).

      3b) "Lines 216-219: Furthermore, the processing of TbpB by signal peptidase II and subsequence release from the inner membrane was unaffected suggesting the defect in surface display by Skp occurs after the release of TbpB from the inner membrane (Fig. 4a)." The result supporting this sentence seems missing or this sentence points to a wrong figure.

      Yes, this sentence is misleading. TbpB runs as two bands: unprocessed TbpB (upper band) and signal peptidase-processed, lipidated TbpB (lower band). What we meant was that the processed TbpB band is similar in all samples, indicating that the knockout of Skp did not affect the expression of TbpB or the processing of its signal peptide up to the point at which it is ready (processed and lipidated in the periplasm) for translocation by Slam to the surface. We have added an explanation in the figure legend of Figure 4a (lines 267-269).

      4) Some statistical analysis results are not clear, making some conclusions not convincing. 4a) Figure 4a top "Exposure of TbpB on the surface of K12 E. coli" Apparently, all three data points for (Delta_DegP+Slam1+TbpB) are very closely distributed. Accordingly, (WT+Slam1+TbpB) vs (Delta_DegP+Slam1+TbpB) data look significantly different (difference is ~0.2). But the two data were assigned as "Not Significant". Similarly, in the comparable in vitro data (Figure 4b), the intensity for Slam1 (WT+Proteinase K - Triton) looks larger than that for Slam1 (Delta_DegP + Proteinase K - Triton). So, the DegP contribution should not be ignored.

      For Figure 4a, a one-way ANOVA test was performed using Prism with 4 biological replicates (we can include the analysis report in the revised submission if requested). We have updated the figure to include the data points. In general, both our in vitro liposome translocation assay and our in vivo surface exposure assay for TbpB showed that delta-DegP only slightly reduces the translocation of TbpB to the surface, and we could not detect statistically significant differences.
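
      For readers unfamiliar with the test, the following is a minimal sketch of how a one-way ANOVA F statistic is computed. The analysis in the manuscript was run in Prism; the function name, strain labels and replicate values here are hypothetical and are not taken from Figure 4a.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual replicates around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical normalized surface-exposure values, four replicates per strain.
surface_exposure = {
    "WT": [1.00, 0.95, 1.05, 0.98],
    "delta_DegP": [0.85, 0.80, 0.92, 0.78],
    "delta_Skp": [0.40, 0.35, 0.45, 0.38],
}
f_stat, df_b, df_w = one_way_anova_f(list(surface_exposure.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

      The p-value follows from the F distribution with these degrees of freedom (for example via `scipy.stats.f_oneway`). A significant overall F does not by itself show which pair of strains differs, so a post hoc comparison is needed for statements about a specific pair such as WT versus delta-DegP.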

      4b) Figure 5a top "Exposure of TbpB on the surface of N. meningitidis" What is the p-value for WT vs Delta_Skp data? Are the two data significantly different? The p-value range for (*) is not shown.

      We have included the p-value range for (*) in the revised manuscript, figure 5a.

      Reviewer #2 (Public Review):

      The article addresses the function of SLAM, a protein which the authors have shown previously to be involved in the traffic of lipoproteins to the bacterial surface. The authors have performed a series of experiments to assess the impact of SLAM on the delivery into proteoliposomes of the model lipoprotein TbpB either added exogenously or presented by E. coli spheroplasts. They identify a periplasmic chaperone, Skp, which enhances transport of TbpB and other lipoproteins to proteoliposomes, and show the contribution of endogenous Skp to lipoprotein transport in Neisseria meningitidis. The authors set up in vitro translocation assays using purified components from different bacteria. This is reasonable as the assays can be challenging to establish and require proteins that can be expressed and are stable. It would be helpful however if the sources of the proteins and how they are tagged (for their detection) were clearly documented in the article and the figures. In keeping with this, the figures describing the assays could be improved (i.e. 1A, 2A, 3A and C). Despite this, the results presented in Fig 1 and 2 clearly demonstrate the role of SLAM as a translocase, and the authors have included appropriate controls for their assays; the translocation of OmpA to demonstrate that the Bam complex is functional in their hands is an important control and should be included in the main figures. Experiments outlined in Figure 3 and Table 1 demonstrate the specific interaction of TbpB and another lipoprotein, HpuA, with Skp, a previously characterised periplasmic chaperone. This is performed by pull-downs and MS as well as immunoblotting. A critical result is shown in Figure 4 in which SLAM and TbpB are introduced into E. coli, and the role of endogenous Skp is assessed. Importantly, the absence of Skp reduces but does not eliminate TbpB surface expression. The authors could speculate on the nature of Skp-independent surface expression of TbpB, as this result mirrors what they find in a meningococcal strain lacking Skp (Figure 5A). It appears that Skp might be required for the correct insertion/folding of lipoproteins given their result in Figure 5B (currently, this could be changed into 5C) which tests the binding of transferrin to the bacterial surface. Clearly this could be influenced by an effect of Skp on TbpA, which acts as a co-receptor with TbpB. In summary, the authors have used appropriate assays to reach their conclusions about the role of SLAM as a translocase and the contribution of Skp to the localisation of lipoproteins to the surface of bacteria. The findings presented are robust and shed new insights into the sorting of proteins in bacteria, an incompletely understood process which is central to microbial physiology, virulence and vaccines.

      Reviewer #3 (Public Review):

      Slam was identified as an outer membrane protein involved in the translocation of certain lipoproteins to the cell surface in Neisseria meningitidis. Slam homologs were also identified in other proteobacteria. However, direct evidence that Slam is an outer membrane translocation device is still missing. In this paper, the authors set up an in vitro translocation assay to probe the role of Slam proteins in the translocation of the lipoprotein TbpB. Although they provide strong data supporting the role of Slam in lipoprotein translocation, further molecular dissection is required to unambiguously establish Slam as a lipoprotein translocator. The work is interesting and the paper clearly written. The authors also discovered a functional link between the periplasmic chaperone Skp and Slam-dependent lipoproteins, which is a novel and interesting finding.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2021-01015

      Corresponding author(s): Jordan, Raff

      1. General Statements [optional]

      We thank the reviewers for their thoughtful and constructive comments and have now revised our manuscript accordingly. We apologise that it has taken so long to send in these revisions, but this is in part because both first authors have now left the lab.

      2. Point-by-point description of the revisions

      Reviewer #1

      This reviewer was generally supportive. They note that it is unfortunate that our data suggests the CP110/Cep97 complex does not play a major part in controlling daughter centriole growth—although we believe that this is an important negative result—but feel that other aspects of our data are interesting. They requested no further experiments, but did comment that it would be interesting to determine when g-tubulin is incorporated into growing centrioles. Unfortunately, we cannot test this as the centrioles in these embryos recruit large amounts of g-tubulin to their PCM, so we cannot specifically assay the small amount of protein in the centriolar fraction.

      Reviewer #2

      Major Points:

      Figure 1: The reviewer notes that Sas-4 and CP110 have antagonistic roles in promoting/repressing centriole growth and asks if Sas-4 is involved in promoting centriole elongation and whether it also oscillates. It is unclear if Sas-4 directly promotes centriole elongation in flies. We have previously shown that centriolar Sas-4 levels do oscillate during S-phase, but with a timing that is distinct from CP110/Cep97 (Novak et al., Curr. Biol., 2014). These observations do not shed much light on the potential antagonistic relationship between CP110/Cep97 and Sas-4, so we do not comment on this here.

      Figure S1B: The reviewer requests that we image the centrioles with greater laser intensity to test whether some residual CP110 or Cep97 protein can be recruited in the absence of the other protein. The quantification of this data suggests that some residual CP110 or Cep97 can still be recruited to centrioles in the absence of the other (Graphs, Figure S1B,C), so we do not think it necessary to repeat this experiment at higher laser intensity to further test this point. We now state that the centriolar recruitment of one protein may not be completely dependent on the other (p6, para.2).

      Figure 3: The reviewer questions whether the reduction in CP110/Cep97 levels at the mother centriole that we observe during S-phase could be due to photobleaching. This is an interesting point that we now analyse in more detail (p8, para.2). We do not think the decrease in mother centriolar CP110/Cep97 levels is due to photobleaching as our new analysis (which includes more data points during mitosis) strongly suggests that centriolar levels on the mother rise again at the start of the next cycle (New Figure 3C,D).

      The reviewer asks whether the CP110/Cep97 oscillations occur at the tip of the growing centriole, and whether we can use super-resolution imaging to address this. A large body of evidence indicates that CP110/Cep97 are highly concentrated at centriole distal tips, and all our experiments suggest that it is this fraction that is oscillating. In Figure 3, for example, we use Airy-scan super-resolution imaging to follow the oscillation on Mother and Daughter centrioles in living embryos. Although the resolution in these experiments is not as high as we can achieve using 3D-SIM in fixed cells, it seems reasonable to assume that the dots of fluorescence we observe oscillating on these centrioles (Fig. 3) are the same fluorescent dots we observe localised at the distal tips of the mother and daughter using 3D-SIM in fixed cells (Fig. 1A).

      The reviewer requests additional quantification of the western blots shown in Figure S1 that we use to judge relative expression levels. As we now describe in more detail in the M&M, these ECL blots are very sensitive, but highly non-linear, so we usually estimate relative expression levels by comparing serial dilutions of the different fractions (see, for example, Figure 1B, Franz et al., JCB, 2013). As we now clarify, the key point is not precisely by how much these proteins are over- or under-expressed, but that we observe a similar oscillatory behaviour when they are either over- or under-expressed.

      The reviewer points out that our statement that the CP110/Cep97 oscillation is entrained by the Cdk/Cyclin oscillator (CCO) is too strong as it is based only on a correlation. We agree and apologise for this overstatement. To address this, we have now perturbed the CCO by halving the dose of Cyclin B (New Figure 5E-H). This extends S-phase length and we now show that the period of the CP110/Cep97 oscillation is also extended. This suggests that the CCO directly influences the period of the CP110/Cep97 oscillation.

      The reviewer notes that our conclusion that the centriole cartwheels are longer or shorter when CP110 or Cep97 are absent or overexpressed, respectively, is based only on Sas-6-GFP fluorescence intensity. They ask if this fluorescence intensity perfectly reflects cartwheel length, and if we can confirm these conclusions using EM. Sas-6 is the main structural component of the cartwheel, so the amount of Sas-6 at the centriole should be proportional to cartwheel length, and we have published two papers that support this conclusion and that use the incorporation of Sas-6 as a proxy to measure cartwheel length (Aydogan et al., JCB, 2018; Aydogan et al., Cell, 2020). Importantly, our previous EM studies support our conclusions about the relationship between cartwheel length and CP110/Cep97 levels: the centrioles in wing-disc cells are slightly longer in the absence of CP110 and slightly shorter when CP110 is overexpressed (Franz et al., JCB, 2013). The new findings reported here provide a potential explanation for this EM data, which was puzzling at the time.

      Minor Points:

      Figure 1C: The reviewer noted that our schematic illustrations in this Figure could be misleading. We agree and have now redrawn them.

      Reviewer #3

      Major points:

      The reviewer requested that we clarify our use of the term oscillation, pointing out that oscillations are repetitive variations in levels/activity over time, whereas the “oscillations” we describe here occur during each round of centriole assembly. This is a fair point, and one that is often debated in the oscillation field, with many believing that too many biological processes are termed “oscillations”, when they are not truly driven by the passage of time. To avoid any ambiguity, we now no longer describe the behaviour of CP110/Cep97 as an oscillation (although, for ease of discussion, we still use the term in this letter).

      The reviewer thought that the data we show in Figure 1 was not relevant as we largely analyse centrioles in living embryos whereas the data in Figure 1 is derived from fixed wing-disc cells, and similar fixed-cell data has been shown in previous studies. The reviewer suggests we use super-resolution methods to analyse CP110/Cep97 dynamics in the syncytial embryo, and show this relative to Sas-6 and Plk4. They ask if Plk4 and CP110/Cep97 colocalise at any time. While CP110/Cep97 localisation has been analysed by super-resolution microscopy previously (e.g. Yang et al., Nat. Comm., 2018; LeGuennec et al., Sci. Adv., 2020), CP110/Cep97 was a minor part of these studies and our data is the first to show that this complex sits as a ring on top of the centriole MTs in fly centrioles (that lack the complex distal and sub-distal appendages present in the previously analysed systems). As this localisation is important in thinking about how CP110/Cep97 might influence centriole MT growth, we would like to include it. We cannot show this detail in living embryos as the movement of the centrioles reduces resolution and we cannot observe the ring structure.

      Although we do use Airy-scan super-resolution microscopy to study CP110/Cep97 dynamics in living embryos (Figure 3), we cannot do this in two colours (to compare these dynamics to Sas-6 or Plk4 dynamics) as red-fluorescent proteins bleach too quickly. We now show the relative dynamics of CP110/Cep97 and Plk4 recruitment using standard resolution microscopy (New Figure S2). While it is well established that Plk4 and CP110/Cep97 are concentrated at opposite ends of centrioles, they are all recruited to the nascent site of daughter centriole assembly, effectively “colocalising” at this timepoint. This could provide an opportunity for the crosstalk we observe here, and we now mention this possibility (p17, para.1).

      The Reviewer questioned whether the loading of Sas-6-GFP onto centrioles can be used as a proxy for cartwheel length, pointing out that Sas-6 could load into centrioles in a way that does not change the cartwheel structure, and that EM is required to test this. As described in our response to Reviewer #2, Sas-6 is the main structural component of the cartwheel, and we have published two papers that use the incorporation of Sas-6 into the cartwheel as a proxy to measure cartwheel length (Aydogan et al., JCB, 2018; Aydogan et al., Cell, 2020). While we cannot exclude that Sas-6 might also associate with the cartwheel in a way that does not involve its incorporation into the cartwheel, it is not clear how EM might address this question. Moreover, even if such a fraction existed, it should not affect our conclusions—as long as Sas-6 is binding to the cartwheel in some way, then the amount bound should remain proportional to the length of the cartwheel. Perhaps the reviewer is suggesting that we perform an EM time course of cartwheel growth to back up our conclusions from the Sas-6 incorporation assay? If so, we think this impractical. The changes in cartwheel length shown in Figure 6 are revealed from analysing several thousand images of centrioles compared at precise relative time points. Such an analysis cannot be done in fixed embryos by EM.

      Similar to the point above, the reviewer notes that we use the length of the cartwheel to infer centriole MT length, but we never directly measure MT length. They suggest we perform either an EM analysis or use MT markers to directly measure the kinetics of centriole MT growth. In flies (and many other organisms), the centriole MTs grow to the same length as the centriole cartwheel (Gonzalez, JCS, 1998), so we can be confident that the final length of the cartwheel reflects the final length of the centriole MTs. Moreover, we previously measured the distance between the mother centriole and the GFP-Cep97 cap that sits at the distal tip of the centriole MTs as a proxy for centriole MT length, and found that the inferred kinetics of MT growth were similar to the kinetics of cartwheel growth (inferred from Sas-6 incorporation) (Aydogan et al., 2018). This manual analysis was very time consuming, and we have tried to implement computational analysis methods, but so far without success. For similar reasons to those described in the point above, it is not feasible to accurately measure centriole MT growth kinetics by EM (nobody has been able to do this). Moreover, the centrosomes in these embryos are associated with too much tubulin and the centriole MTs are not yet modified (e.g. by acetylation) as the cycles are so fast—so we cannot directly stain the centriole MTs in fixed embryos. We have now toned down our conclusions about MT length throughout the paper, and we make it clear that we cannot directly measure this.

      All of the experiments shown here are performed in the presence of endogenous untagged proteins, and the reviewer wonders if recruitment dynamics might be influenced by competition for binding from the endogenous protein. We have compared the behaviour of many centriole and centrosome proteins in the presence and absence of the untagged WT protein. In all cases, less tagged-protein binds to centrioles/centrosomes in the presence of untagged protein, presumably due to competition. Apart from this, however, we usually observe no real difference in overall dynamics and in Reviewer Figure 1 (see below) we show that CP110-GFP and GFP-Cep97 both oscillate even in the absence of any endogenous protein. As we feel this result is not very surprising, we do not show it in the manuscript.

      The reviewer correctly noted that our data was not strong enough to conclude that the CP110/Cep97 oscillation is influenced by the CCO. This was also raised by Reviewer #2 and, as described above (p2, para.3 above), we have now performed additional experiments to more directly demonstrate this point (new Figure 5G—H).

      The reviewer requests more discussion of why our conclusion that CP110/Cep97 levels oscillate on the growing daughter centrioles during S-phase is different to that reached by Dobbelaere et al, (Curr. Biol., 2020), who conclude that Cep97-GFP only starts to incorporate into the new daughter centrioles late in S-phase when the daughters are fully grown. We have discussed this discrepancy with these authors and they kindly shared their reagents with us (so our endogenous Cep97-GFP oscillation data comes from the same line they used in their experiments), but we have not come to a clear conclusion on this point. We have shown robust oscillations for CP110 and Cep97 by quantifying many hundreds of centrioles using multiple transgenes (both over- and under-expressed) in multiple backgrounds. Cep97 dynamics were a very minor part of the Dobbelaere et al., study, and they analysed a much smaller number of centrioles. We now briefly mention this discrepancy (p9, para.1), but do not discuss it in detail as we have no definitive explanation for it.

      The reviewer requests more experiments or more discussion to address the mechanism(s) of crosstalk between CP110/Cep97 and Plk4, and they suggest several avenues for further investigations. These are excellent ideas, and we are working hard on these approaches. These are all long-term experiments, however, and we feel it is important that the field be made aware of these surprising findings as soon as possible, as others may be better-placed to provide mechanistic insight into how this system ultimately works. We now briefly mention some of the future directions the reviewer highlights in the Discussion.

      The reviewer thought we should highlight the previous publications showing that Plk4-induced centriole amplification requires CP110 and that Plk4 can phosphorylate CP110. These studies (Kleylein-Sohn et al, Dev. Cell, 2007; Lee et al., Cell Cycle, 2017) were mentioned, but we now discuss them more prominently (p17, para.2).

      Minor Points:

      The reviewer raised a number of minor concerns that we have now addressed: (1) We discuss the model the reviewer suggests; (2) we no longer state that the crosstalk between CP110/Cep97 and Plk4 is unexpected; (3) We have clarified our description of the shift in timing of the peak levels of CP110/Cep97, which we no longer refer to as an oscillation; (4) We define mNG as monomeric Neon Green; (5) We have changed our schematics in Figure 1 as suggested by the reviewer; (6) We have corrected the mistake in the legend to Figure 8.

      Reviewer #4

      Major points:

      1. The reviewer noted that the amplitude of the CP110/Cep97 oscillations depended on protein expression levels, so the oscillations might not reflect the behaviour of the endogenous proteins. They requested that we either repeat our experiments with CRISPR knock-in alleles, or conduct experiments with the lines driven by the endogenous promoters but in their respective mutant backgrounds. We have not generated CRISPR knock-ins for CP110/Cep97, but have done so for many other centriole/centrosome proteins (>8) and found that most such lines are expressed at higher or lower levels than the endogenous allele (and sometimes very significantly so). This is also true for our standard transgenic lines, where genes are expressed from their endogenous promoters, but are randomly integrated into the genome. The blots in Figure 4 show that CP110-GFP and GFP-Cep97 expressed from a ubiquitin (u) promoter or from their endogenous promoters (e) are expressed at ~2-5X higher or ~2-5X lower levels than the endogenous proteins, respectively. As we observe CP110/Cep97 oscillations in all cases, it seems unnecessary to generate new CRISPR knock-ins (that are also likely to be somewhat over- or under-expressed) to show this again. As the reviewer asks, we show that Cep97-GFP and CP110-GFP still oscillate in the absence of the endogenous proteins (Reviewer Figure 1). As this does not seem a surprising result, we do not show this in the main manuscript. In the same point the reviewer requests that we use antibody staining in fixed embryos to show that the untagged proteins also oscillate. Analysing protein dynamics is much harder in fixed embryos, as the levels of fluorescent staining are more variable and we can only approximately infer relative timing, rather than precisely measuring it (as we can in living embryos). Moreover, as both proteins in the CP110/Cep97 complex exhibit a very similar oscillatory behaviour when tagged with either GFP or RFP (e.g. Figure 2C), and this behaviour is distinct to that observed with several other GFP- or RFP-tagged centriole proteins (e.g. Novak et al., Curr. Biol., 2014; Conduit et al., eLife, 2015; Aydogan et al., JCB, 2018; Aydogan et al., Cell, 2020) it seems very unlikely that this behaviour is induced by the GFP (or RFP) tag.

      The reviewer also suggests that we show the data with the endogenous promoter before we show the data with the ubiquitin promoter. As we now explain better (and show in Figure 4), this seems unnecessary as the proteins expressed from the ubiquitin promoter are probably actually expressed at levels that are more similar to the endogenous protein.

      The reviewer questions whether the oscillations we observe might be due to the centrioles simply moving up and down in the embryo during the cell cycle, and they suggest we monitor Asl behaviour to rule this out. We have previously shown that Asl-GFP levels do not oscillate; they remain constant throughout the cell cycle on old-mother centrioles, and grow approximately linearly throughout S-phase on new-mother centrioles (see Figure 1D in Novak et al., Curr. Biol., 2014).

      We were not sure we understood this point properly, so we copy the reviewer's comment in full here: "The authors mention (for instance on p. 3) that the inner cartwheel and the surrounding microtubules assemble at opposite ends of the daughter centriole. However, my understanding is that the short centrioles present in the fly embryo have an inner cartwheel that extends throughout the organelle, such that it might be moot to make a distinction between the two ends in this case. Moreover, it is also my understanding that this inner cartwheel is itself surrounded by microtubules, so that microtubule assembly might not be expected to occur strictly at the distal end no matter what." The reviewer is correct that Drosophila centrioles are short (~150nm) and that the cartwheel extends throughout the centriole. We think the reviewer is suggesting that it may not be relevant therefore whether the cartwheel and centriole MTs grow from opposite ends, as the activities that govern their growth may not be spatially separated? However, because cartwheels grow preferentially from the proximal-end (Aydogan et al., JCB 2018) while centriole MTs are assumed to grow preferentially from the distal (plus) end, there is an intrinsic problem in ensuring they grow to the same size, no matter how short or long the centrioles are. The reviewer is correct that one possible solution to this problem is that the centriole MTs actually grow from their minus ends, but this is not widely accepted (or even proposed). We have tried to explain this issue more clearly throughout the revised manuscript.

      The reviewer points out that the schematic illustrations in Figure 1A and 1C are inaccurate and unhelpful. We agree and have now redrawn these.

      The reviewer asks that we provide information about the eccentricities of the centrioles in the different datasets used to calculate the protein distributions shown in Figure 1, particularly as the data for Sas-4-GFP and Sas-6-GFP were obtained previously using a different microscope modality, making comparisons complicated. The point that comparing distance measurements across different datasets is difficult is an important one, and we now state that such comparisons should be treated with caution. However, we have not provided information on the distribution of centriole eccentricities in the different experiments as it wasn’t clear to us how this information could be used to make such comparisons more accurate (presumably the reviewer is suggesting we could apply a correction factor to each dataset?). The very tight overlap in the positioning of CP110/Cep97 fusions (Figure 1C) strongly suggests that any difference in the average centriole eccentricities of the different populations of centrioles analysed, which are already tightly selected for their en-face orientation (i.e. eccentricity

      The reviewer requested that we show the “noisy data” we obtained during mitosis that we excluded from our analysis in Figure 3. As we now explain in more detail (p8, para.2), there are two reasons why the data for mitosis in this experiment is “noisy”: (1) The protein levels on the centrioles are low in mitosis and the centrioles are more mobile, so they are hard to track; (2) The Asl-mCherry marker used to identify the mother centriole starts to incorporate into the daughter (now new mother) centriole during mitosis, making it difficult to unambiguously distinguish mothers and daughters. As a result, we cannot track and assign mother/daughter identity to very many centrioles during mitosis—although we now include some extra data-points during mitosis for the centrioles where we could do this (revised Figure 3C,D). Importantly, it is clear that this “noisy” data hides no surprises: one can see (Figure 3C,D) that the signal on the centrioles is simply low during mitosis and then starts to rise again as the embryos enter the next cycle. This is confirmed in the normal resolution data (Figure 2B,C; Movies S1 and S2) where we can track many more centrioles due to the wider field of view and because we do not have to discard centrioles in mitosis that we cannot unambiguously assign as mothers or daughters.

      The reviewer requests that we conduct a super-resolution Airy-scan analysis of CP110/Cep97 driven from their endogenous promoters (eCP110 or eCep97) to ensure that the oscillations we see with these lines (shown in Figure 4C,D) are also occurring at the daughter centriole—as we already show for the oscillations observed with the uCP110 and uCep97 lines (shown in Figure 4C,D, and analysed at super-resolution on the Airy-scan in Figure 3). This is technically very challenging as super-resolution techniques require a lot of light and the centriole signal in the eCP110/Cep97 embryos is very dim compared to uCP110/Cep97 embryos (Figure 4C,D). We have managed to do this for eCep97-GFP and confirmed that—even in these embryos that express Cep97-GFP at much lower levels than the endogenous protein (Figure 4A)—the “oscillation” is primarily on the daughter (Reviewer Figure 2). As this data is very noisy, and as the ubiquitin uCP110/Cep97 lines express these fusions at levels that are closer to endogenous levels (Figure 4A,B), we do not show this data in the main text.

      The reviewer also asks for clarification as to why we use the Airy-scan for some experiments and 3D-SIM for others. As we now explain (p8, para.1), 3D-SIM has better resolution than the Airy-scan, but it takes more time and requires more light—so we cannot use it to follow these proteins in living embryos. Thus, for tracking CP110/Cep97 throughout S-phase in living embryos we had to use the Airy-scan.

      The reviewer questions why in some experiments we analyse the behaviour of 100s of centrioles, whereas in others the numbers are much smaller (1-14 in Figure 3—note, the reviewer quoted this number as coming from Figure 4, but it actually comes from Figure 3, so we have assumed they mean Figure 3). We apologise for not explaining this properly. The super-resolution experiments in Figure 3 are performed on a Zeiss Airy-scan system, which has a much smaller field of view than the conventional systems we use in other experiments. Thus, we inherently analyse a much smaller number of centrioles in these experiments. In addition, as explained in point 6 above, in these experiments we need to analyse mother and daughter centrioles independently, and in many cases we cannot unambiguously make this assignment, so these centrioles have to be excluded from our analysis.

      The reviewer questions why we selected the 10 brightest centrioles for the analysis shown in Figure S1B,C (note, the reviewer states Figure S2 here, but it is the data shown in Figure S1B,C that is selected from the 10 brightest centrioles, so we assume this is the relevant Figure). We apologise for not explaining this properly. In these mutant embryos very little CP110-GFP localises to centrioles in the absence of Cep97, and vice versa, so we cannot track centrioles using our usual pipeline and instead have to select centrioles using the Asl-mCherry signal. As the difference between the WT and mutant embryos is so striking, we simply selected the brightest 10 centrioles (based on Asl-mCherry levels) in both the WT and mutant embryos for quantification. We could select more centrioles, or select centrioles based on different criteria, but our main conclusion—that the centriolar localisation of one protein is largely dependent on the other—would not change.

      The reviewer also questioned why we performed the analysis shown in Figure S2 (new Figure S3) during S-phase of nuclear cycle 14, when the rest of the manuscript focuses on nuclear cycles 11-13. We apologise for not explaining this properly. In cycles 11-13 centriolar CP110/Cep97 levels rise and fall during S-phase, whereas both proteins reach a sustained plateau during the extended S-phase (~1hr) of nuclear cycle 14—making it easier to analyse CP110/Cep97 levels in embryos when their centriole levels are maximal. We now explain this.

      The reviewer requests that we quantify the western blots shown in Figure 4 in the same way we do in Figure 8. To do this we would need to perform multiple repeats of these blots and we did not perform these because the blots shown in Figure 4 largely recapitulate already published data (Franz et al., JCB, 2013; Dobbelaere et al., Curr. Biol., 2020). Moreover, as described in our response to Reviewer #2, these ECL blots are very sensitive, but highly non-linear, so we always compare multiple serial dilutions of the different extracts to try to estimate relative levels of protein expression. We now explain this in the M&M.

      The reviewer suggests the data shown in Figure 8 is a “straw man”: we really want to test whether modulating CP110/Cep97 levels modulates centriolar Plk4 levels, but instead we test how they modulate cytoplasmic Plk4 levels. The language here is harsh, as it suggests that our intention was to mislead readers into thinking that we have addressed a relevant question by addressing a different, irrelevant, one. We apologise if we have missed something, but we believe we do perform exactly the experiment that the reviewer thinks we should be doing—quantifying how centriolar Plk4 levels change when we modulate the levels of CP110 or Cep97 (Figure 7). It is clear from this data that modulating the levels of CP110/Cep97 does indeed modulate the centriolar levels of Plk4. In Figure 8 we seek to address whether this change in centriolar Plk4 levels occurs because global Plk4 levels in the embryo are affected—a very reasonable hypothesis, which this experiment addresses quite convincingly (although negatively).

      Minor Points:

      The reviewer highlights a small number of mistakes and omissions, all of which have been corrected.

      Finally, we would like to thank the reviewers again for their detailed comments and suggestions. We hope that you and they will agree that the changes we have made in response to these comments have substantially improved the manuscript and that it is suitable for publication in The Journal of Cell Science.

      Sincerely,

      Jordan Raff

      __Reviewer Figure 1. CP110/Cep97 dynamics remain cyclical even when Cep97-GFP and CP110-GFP are expressed from their endogenous promoters in the absence of any endogenous protein.__ Graphs show how the levels (Mean±SEM) of centriolar CP110/Cep97-GFP change during nuclear cycle 12 in (A) Cep97-/- embryos expressing eCep97-GFP or (B) CP110-/- embryos expressing eCP110-GFP. CS=Centrosome Separation, NEB=Nuclear Envelope Breakdown. N≥11 embryos per group, average of n≥15 centrioles per embryo.

      __Reviewer Figure 2. The cyclical recruitment of Cep97-GFP expressed from its endogenous promoter occurs largely at the growing daughter centriole.__ The graph quantifies the fluorescence intensity (Mean±SD) acquired using Airy-scan microscopy of eCep97-GFP on mother (dark green) and daughter (light green) centrioles in individual embryos over Cycle 12. CS = Centrosome Separation, NEB = Nuclear Envelope Breakdown. Data was averaged from 3 embryos as the number of centriole pairs that could be measured was relatively low (total of 2-8 daughter and mother centrioles per time point; in part due to the much dimmer signal of eCep97-GFP in comparison to uGFP-Cep97).

    1. This work has been peer reviewed in GigaScience (see paper https://doi.org/10.1093/gigascience/giac011), which carries out open, named peer-review.

      These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 2: Gregg Thomas

      This paper presents 17 new insect genomes from the order of caddisflies (Trichoptera). The authors combine these genomes with 9 previously sequenced genomes to analyze genome size evolution across the order. They find that genome size tends to correlate with evolution of repeat elements, specifically expansion of transposable elements (TEs). Interestingly, the authors notice that TE expansions also correlate with gene copy-number (or gene fragment copy-number), even of highly conserved genes used to assess genome completeness. Overall, I find this paper very well written and easy to follow. The genomic resources and analyses presented provide novel resources and findings for insects in the order Trichoptera, with potential implications beyond. I have only minor suggestions before publication, outlined below.

      1. Regarding the TE and BUSCO gene fragment associations, while I think this is a really interesting analysis, I found the underlying models a bit difficult to understand. Line 236 reads, "To test whether repetitive fragments were due to TE insertions near or in the BUSCO genes or, conversely, due to the proliferation of 'true' BUSCO protein-coding gene fragments…" Is the idea that a BUSCO gene has been duplicated itself and then one copy is either fragmented by a TE insertion or hitch-hikes with a TE (as mentioned on line 501)? Or are these fragments only of BUSCO genes that didn't match a full BUSCO gene at all, but the fragments that did match had unexpectedly high coverage? I guess I'm just confused as to whether a gene duplication needs to precede the TE insertions/hitch-hiking, which is subsequently pseudogenized either prior to or because of the TE activity, or if these are gene losses. I understand how the TE could inflate the coverage of these fragments, but I guess I'm still not clear on how these fragments arise in the first place. Any clarification would be helpful! Also, if the case is that these are fragments of BUSCO genes that have no full matches in the genome, how might assembly contiguity or quality be affecting these matches?

      2. One thing that I noticed throughout the figures is that branch B1, leading to A. sexmaculata, the branch leading to clade A, and the branch leading to clade B (as labeled in Figures 1 and 2) appear to form a polytomy. I don't find this mentioned in the text and am wondering why this relationship remains unresolved with these data. I don't think this has any bearing on the results, since all analyses are done on the tips of the tree, but I think readers looking at these trees will want to know what is going on at that node.

      3. The authors use custom scripts for their BUSCO-TE correlation analysis and provide a link to a Box folder on line 514. I would request that these scripts be put somewhere more stable and accessible (e.g., GitHub). Not only was I asked to log in when clicking the link, but after I had done so that link didn't seem to exist.

      Minor/editorial points

      1. Would the authors be able to report concordance factors for the species tree? I think this should be easy enough with IQ-tree and is something I ask everyone to do. This may also help answer my question about the polytomy.

      2. The authors do a good job of mentioning and citing programs used throughout the manuscript but seem to skip this in the Assembly section (starting on Line 398). "First, we applied a long-read assembly method…" Which one? Same for "de novo hybrid assembly approaches." I see that assembly is covered in detail in the Supplement, but I think naming the main programs used (wtdbg2 and MaSuRCA) should be in the main text.

      3. Line 281-282: I think some of the brackets and parentheses here are mismatched or un-closed.

    1. This work has been peer reviewed in GigaScience (see paper https://doi.org/10.1093/gigascience/giac005), which carries out open, named peer-review.

      These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1: Paul Stewart

      Fahrner et al have produced a very nice manuscript and corresponding pipeline. They describe a collection of DIA tools in the Galaxy framework for reproducible and version-controlled data processing. These DIA tools are an excellent addition to the growing number of proteomics-centric tools already available in Galaxy. The reviewer could find no major revisions needed and therefore only requests a few minor revisions before this is ready for publication:

      Please include page numbers in the revised manuscript to make referencing the text easier.

      Page 6

      OpenSwath and PyProphet are cited and are also used in the manuscript. Please cite one or two alternatives.

      Please consider citing a tool each time it is used in a new paragraph (e.g. MSstats).

      There is heavy reliance on conjunctive adverbs (However, ...; Thus, ...) on this page and throughout the manuscript. These can make passages a bit hard to read. Please consider rephrasing.

      Page 7

      Why "so-called histories"? Aren't they simply "Histories"?

      Page 14

      'To decrease the analysis time of the semi-supervised learning, the merged OSW results can be first subsampled using the PyProphet subsample tool and subsequently scored using the PyProphet score tool. '

      The reviewer is not familiar with this approach. Can you please give additional justification (maybe under methods?) or provide a citation that this is a reasonable approach?

      Page 15

      Please check your reference software and/or work with the journal to ensure that the web addresses are linked properly. For example, the reviewer tried copying the link "https://training.galaxyproject.org/training- %20material/topics/proteomics/tutorials/DIA_lib_OSW/tutorial.html" but a "%20" (or a space) is inserted into the URL after "training-" so the link as it appears did not work until this was removed. A less technically savvy reader may think the links are broken and will not be able to access the materials.
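The repair described above can be sketched in a few lines of Python. This is only an illustration: the helper name `repair_url` is hypothetical, and it assumes the sole defect is stray whitespace plus an encoded space ("%20") that does not belong in the address.

```python
def repair_url(raw: str) -> str:
    """Drop stray whitespace and encoded spaces ("%20") from a copied URL.

    Assumption: the URL has no legitimate spaces, so every literal or
    percent-encoded space is an artifact of the reference software.
    """
    # Remove literal whitespace first, then any percent-encoded space.
    no_whitespace = "".join(raw.split())
    return no_whitespace.replace("%20", "")


broken = (
    "https://training.galaxyproject.org/training- %20material/"
    "topics/proteomics/tutorials/DIA_lib_OSW/tutorial.html"
)
print(repair_url(broken))
```

A URL that legitimately contains an encoded space would be damaged by this helper, so it is suited to checking a reference list by hand rather than to unattended use.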

      Page 16

      'We identified and quantified between 25.000 to 27.000 peptides ...'

      Please be consistent with number formatting (25000 vs 25.000). Other values in the tables did not use this formatting. Please check with journal editor for convention.

      Figures

      Please be consistent with axes labels. Some are upper case and some are lower case.

      Figure 2B

      Please round R2 to 2 or 3 decimals.

      Figure 3

      Please change the red-green color scheme to a more color-blind friendly color scheme (e.g. red-blue).

    1. This work has been peer reviewed in GigaScience (see paper https://doi.org/10.1093/gigascience/giac001), which carries out open, named peer-review.

      These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1: Bo Li

      Single-cell RNA-seq has revolutionized our ability to investigate cell heterogeneity in complex tissues. Generating a high-quality gene count matrix is a critical first step for single-cell RNA-seq data analysis. Thus, a detailed comparison and benchmarking of available gene-count matrix generation tools, such as the work described in this manuscript, is a pressing need and has the potential to benefit the general community.

      Although this work has a great potential, the benchmarking efforts described in the manuscript are not comprehensive enough to justify its publication at GigaScience unless the authors address my following major and minor concerns.

      Major concerns:

      1) The authors should discuss related benchmarking efforts and the differences between previous work and this manuscript in the Background section instead of the Discussion section. For example, Du et al. 2020 G3: Genes, Genomes, Genetics and Booeshaghi & Pachter bioRxiv 2021 should be mentioned and discussed in the Background section. In addition, the STARsolo manuscript (https://www.biorxiv.org/content/10.1101/2021.05.05.442755v1), which contains a comprehensive comparison of CellRanger, STARsolo, Alevin and Kallisto-Bustools, should be cited and discussed. Zakeri et al. 2021 bioRxiv (https://www.biorxiv.org/content/10.1101/2021.02.10.430656v1) should also be included and discussed in the Background section.

      2) Benchmark with the latest versions of the software. The choice of Cell Ranger, STARsolo, Alevin and Kallisto-BUStools is good because they are four major gene count matrix generation tools. However, I urge the authors to also include CellRanger v6 and Alevin-fry (Alevin_sketch/Alevin_partialdecoy/Alevin_full-decoy, see STARsolo manuscript), which are currently lacking, in their benchmarking efforts. The authors may also consider adding STARsolo_sparseSA to the benchmark. Since single-cell RNA-seq tool development is a fast-evolving field, benchmarking up-to-date versions of the tools is super critical for a benchmarking paper.

      3) Conclusions. The authors summarized the observed differences between tools based on the benchmarking results. This is good, but not yet very helpful for end-users. I recommend that the authors state their recommendations for end-users more clearly in the discussion/results section. For example, do the authors recommend one tool over the others under certain circumstances? If so, which tool and which circumstance and why? I like Figure 5 a lot and hope the authors can summarize this figure better in the manuscript.

      4) This manuscript concluded that differential expression (DEG) results showed no major differences among the alignment tools (Figure 4). However, the STARsolo manuscript suggested DEG results are strongly influenced by quantification tools (Sec. 2.6, Figure 5). Please explain this discrepancy.

      5) This manuscript suggested simulated data is not as helpful as real data. However, the STARsolo manuscript reported drastic differences between tools using simulated data. Please comment on this discrepancy.

      6) I have big concerns regarding the filtered vs. unfiltered annotation comparison. In particular for pseudogenes, we know that many of them are rarely transcribed or transcribed at low levels. As a result, many of these pseudogenes would not be captured by the single-cell RNA-seq protocol. At the same time, because these pseudogenes share sequence similarities with functional genes, they would cause trouble for read mapping. This is one of the main reasons for using a carefully filtered annotation. Actually, whether and how to filter annotation is under active debate in big cell atlas consortia such as the Human Cell Atlas. Thus, I would be super careful about describing results comparing filtered vs. unfiltered annotation. For example, in Suppl. Figure 8D, there are 6 mitochondrial genes that have 100% sequence similarity to their corresponding pseudogenes. It is impossible to distinguish whether a read comes from a gene or a pseudogene for these 6 genes, and it is also not necessary; the transcribed RNA should also be exactly the same. Thus, I encourage the authors to remove their pseudogenes from the annotation, and I suspect the mouse data results should then look similar to the human data in Suppl. Figure 8A.

      7) The endothelial dataset was only run on CellRanger 3 because the UMI sequence is one base shorter. Could the authors augment the UMI sequence with one constant base and run this dataset through CellRanger 4/5/6?

      8) I think it is more appropriate to call the tools benchmarked as "gene count matrix generation tools" instead of "alignment tools".
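The padding suggested in point 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not a tested Cell Ranger workflow: the function name `pad_umi_reads` is hypothetical, the input is assumed to be standard 4-line FASTQ records, and the UMI is assumed to be the last segment of the barcode read, so that appending one base lengthens the UMI by one.

```python
from typing import Iterable, Iterator


def pad_umi_reads(
    lines: Iterable[str], pad_base: str = "A", pad_qual: str = "I"
) -> Iterator[str]:
    """Yield FASTQ lines with one constant base appended to every read.

    Assumes 4-line FASTQ records whose sequence ends with the UMI.
    The quality string is padded too, so both stay the same length.
    """
    for index, line in enumerate(lines):
        text = line.rstrip("\n")
        position = index % 4
        if position == 1:      # sequence line
            yield text + pad_base
        elif position == 3:    # quality line
            yield text + pad_qual
        else:                  # header or '+' separator, unchanged
            yield text


# One toy record: 16-base cell barcode followed by a 9-base UMI.
record = ["@read1", "ACGTACGTACGTACGT" + "TTGCATGCA", "+", "I" * 25]
padded = list(pad_umi_reads(record))
print(padded[1], len(padded[1]))
```

Because the same base is appended to every read, two UMIs that differ before padding still differ afterwards, so UMI collapsing should be unaffected. Whether a given Cell Ranger version accepts the padded reads would still need to be checked.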

      Minor concerns:

      1) The Suppl Table 2 mentioned in the main text corresponds to Suppl. Table 3 in the attachment. In addition, there is no reference to Suppl Table 2.

      2) Suppl Table 3 PBMC, why do I see endothelial cell markers in PBMC dataset?

      3) Suppl Figure 7 is never referenced in the main text.

      4) Suppl Figure 8D is never referenced in the main text.

    1. This work has been peer reviewed in GigaScience (see paper https://doi.org/10.1093/gigascience/giab099), which carries out open, named peer-review.

      These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 3: idoia ochoa

      The authors present a novel tool for the compression of collections of bacterial genomes. The authors present sound results that demonstrate the performance gain of their tool, MBGC, with respect to the state-of-the-art. As such, I do not have concerns about the method itself. My main concerns are with respect to the description of the tool, and how the results are presented. Next I list some of my suggestions (in no particular order):

      Main Paper:

      • Analysis section: Before naming MBGC specify that it is the proposed tool.

      • Analysis section: Reference for HRCM. Mention here also that other tools such as iDoComp, GDC2, etc. are discussed in the Supplementary (this way the reader knows more tools were analyzed or at least tried on the data).

      • Analysis section: The paragraph "Our experiments with MBGC show that... " is a little misleading, since it seems that the tool has the capacity to compress a collection and just extract a single genome from it. This becomes clear later in the text when it is discussed how the tool could be used to speed up the download of a collection of genomes from a repository. So maybe explain that in more detail here, or mention that it could be used to compress a bunch of genomes prior to download. And then point to the part of the text where this is discussed in more detail.

      • Analysis section: The results talk about the "stronger MBGC mode", the "MBGC max", but in the tables it reads "MBGC default" or "MBGC -c 3". I assume "MBGC -c 3" refers to "MBGC max", but it is not stated anywhere. Maybe better to call it "MBGC default" and "MBGC max".

      • Analysis section: Although the method is explained later in the text, it would be a good idea to give a sense of the difference between the default and max modes of the tool. Or some hints on the trade-off between the two. Also, the parameter "-c 3" is never explained.

      • Analysis section: Figures, it is difficult to see the trade-off between relative size and relative time, can you use colored lines? such that the same color refers to the same set of genomes. Also, in the caption, explain if we want small or high relative size and time. it may be clear, but better to clearly state it.

      • Analysis section: there is a sentence that says "all figures w.r.t. the default mode of MBGC". It would be good also to state that in the caption, so that the reader knows which mode of the tool is being used to generate the presented results, and whether the input files are gzipped or not. For example, for the following paragraph that starts with Fig. 1, it is not clear if the files are gzipped or not.

      • Analysis section: First time GDC2 is mentioned, the first thing that comes to mind is why it was not used for the bacterial experiments. See my previous point on having a couple of sentences about the other tools that were considered, and why they are not included in the main tables/figures.

      • Methods:

      -- Here I am really missing a diagram explaining the main steps of the tool. It seems the paper has been rewritten slightly to fit the format of the journal and some things are not in the correct order. For example, it says the key ideas are already sketched, but i do not think that is true.

      -- (offset, length) i assume refers to the position of the REF where the match begins, and the length of the match, but again, not really explained. A diagram would help. Also, when it is time to compress the pairs, are the offset delta encoded? or encoded as they are with a general compressor?

      -- How are the produced tokens (offset, length, literals, etc.) finally encoded?

      -- First time parameter "k" is mentioned, default value? Also, how can you do a left extension and "swallow" the previous match? Is it because the previous match could have been at another position? Otherwise if it was in that position it would have been already extended to the right, correct? I mean, it would have generated a longer match.

      -- The "skip margin" idea is not well explained. not sure why the next position after a match is decreased by m. please explain better or use a diagram with an example.

      -- when you mention 1/192, maybe already state that this is controlled by the parameter u. otherwise when you mention the different parameters is difficult to relate them to the explanation of the algorithm.
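The question above about delta-encoding the offsets can be made concrete with a small sketch. This is not MBGC's actual token encoding, which the review notes is not described; it only illustrates what delta-encoding a stream of (offset, length) pairs would mean. The names `delta_encode` and `delta_decode` are hypothetical, and each offset is stored relative to the position where the previous match ended.

```python
from typing import List, Tuple

Match = Tuple[int, int]  # (offset into the reference, match length)


def delta_encode(matches: List[Match]) -> List[Match]:
    """Store each offset as its distance from the end of the previous match."""
    encoded = []
    expected = 0  # where the next match would start if the genomes were collinear
    for offset, length in matches:
        encoded.append((offset - expected, length))
        expected = offset + length
    return encoded


def delta_decode(encoded: List[Match]) -> List[Match]:
    """Invert delta_encode and recover the absolute offsets."""
    matches = []
    expected = 0
    for delta, length in encoded:
        offset = expected + delta
        matches.append((offset, length))
        expected = offset + length
    return matches


matches = [(1000, 50), (1052, 30), (1082, 200), (40, 25)]
encoded = delta_encode(matches)
print(encoded)  # [(1000, 50), (2, 30), (0, 200), (-1242, 25)]
```

When consecutive matches are nearly collinear in the reference, most deltas are small numbers, which a general-purpose back-end compressor can represent more cheaply than large absolute offsets. That trade-off is the reason the distinction matters for the description of the method.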

      Availability of supp...

      -- from from (typo) Tables

      -- Specify the number of genomes in each collection.

      -- change MBGC -c 3 to MBGC max or something similar. (see my previous comment -c flag is not explained!)

      Supplementary Material

      -- move table 1 after the text for ease of reading

      -- It is not clear whether the tool has random access. The supplementary material discusses the percentage of time (w.r.t. decompressing the whole collection, I believe) that it would take to decompress one of the first genomes vs one of the last ones. This should be better explained. For example, if we decompress the last genome of the collection we will spend 100% of the time, right? This is because previous genomes are (potentially) part of REF. Please explain better and discuss this point in the analysis part, not only in the supplementary. It seems like an important aspect of the algorithm.

      -- I assume this is not possible, but it should be discussed as well: can you add a genome to an already compressed collection? This, together with the random access capabilities, would better highlight the main possible uses of the tool.

      -- section 4.3: here HT is used, and then HT is introduced in the next paragraph. please revise the whole text and make sure everything is in the right order.

      -- parameter m, please explain better.

      -- add colors to figures, it will be easier to read them.

      Overall, as I mentioned before, I believe the tool offers significant improvements with respect to the competitors for bacterial genomes, and performs well on non-bacterial genomes as well. What should be improved for publication is the description of the method, since at the end of the day it is the main contribution, and how the text is presented.

    1. One of the most challenging aspects of the Pandemic for dual-income parents is the school and daycare closures. (Note: Whereas the first support focused on gender roles, the second paragraph focuses on the particular challenges for parents during the Covid-19 epidemic.) These dual-earner parents should find a way to split children’s needs during the shelter-in-place. If they do not balance paid work and child care, both sides will feel the consequences. To emphasize these consequences, Lewis humorously says “Dual-income couples might suddenly be living like their grandparents, one homemaker, and one breadwinner.” (Note: Drawing on evidence from the text, this passage shows how gender roles relate to the challenges of Covid-19 for working parents and families.) Instead of splitting the housework, women take the role of “homemaker” so the author implies here that this regresses gender dynamics two generations backward. It obviously demonstrates that nothing much has changed over time and the mentality remains. While many couples are trying to find a middle way, others think that women have to suck it up and sacrifice their jobs. In reference to school closures, Lewis brings up the Ebola health crisis which occurred in West Africa in the time period of 2014-2016. (Note: The following paragraph cites a historical precedent for the Covid-19 outbreak as a basis for comparison.) According to Lewis, during this outbreak, many African girls lost their chance at education; moreover, many women died during childbirth because of a lack of medical care. Mentioning these elaborations proves once again that not only coronavirus but also many other outbreaks have caused a disaster for feminism. Pandemics, in other words, pile yet another problem on women who always face an uphill battle against patriarchal structures. (Note: This passage ties this observation about the Ebola outbreak in West Africa to a greater observation about Pandemics and gender roles overall.) 
I started reading her article with a feeling of frustration. While the main topic of the article is feminism, Lewis gives a couple of male examples from the past, such as William Shakespeare and Isaac Newton. (Note: The author makes a personal note here, marking an emotional connection and reaction to the text.) She seems at times to attribute their success to their masculinity. They both lived in times of plague, demonstrating that despite all our progress, the human species is still grappling with the same issues. According to Lewis, neither Newton nor Shakespeare had to worry about childcare or housework. Even though her comparison seemed odd to me, she managed to surprise me that in over 300 years many gender inequities remain the same. This is actually very tragic. It is hard to acknowledge that women are still facing gender inequality in almost every area even 300 years after the time of these great English thinkers. (Note: The author cites historical precedent again: this passage argues that the relationship between plagues and gender roles has not changed much in centuries.) Assuming housework is the natural place of women without asking women if they want to do it is asking for too big a sacrifice. Since couples have the option to split the housework and childcare, why should only women have to shoulder most of the burden? This is a question that I might never be able to answer, even if I search my whole life. It is unacceptable that there is pressure on women to conform to gender roles, such as cultural settings and expectations. (Note: The author uses a rhetorical question to segue into a new supporting argument.) Women should not have to sacrifice their leisure time completing unpaid work. I agree with Lewis when she mentions the “second shift” situation. When we consider women’s first shift as their paid work, the second shift represents the time that they spend working in the home. In this case, there is apparently no shift for leisure time. 
Lewis also supports this by saying “Across the world, women—including those with jobs—do more housework and have less leisure time than their male partners.” Additionally, it seems like economic recovery is going to be long-lasting because of the Coronavirus. As a solution, if men and women have equal housework responsibilities, women may spend more of their time completing paid work. (Note: The author makes a call to action near the end of the essay.) In this way, they can contribute to the economy while they are socializing. Especially after the Pandemic is over, we will need a greater workforce, so hopefully both men and women can equally participate in the economy. (Note: Much like the first sentence of the essay, the last sentence speaks to a greater, big-picture context: the need for equality in a post-pandemic world.)

      Many schools and daycares are sadly closed at the moment because of the COVID-19 pandemic.

    1. I think that the students’ voice is not always heard entirely, even through dialogue. I feel that by doing this journal we can make a difference with our personal experience and touch the heart of someone who is willing to stand by us. I also wanted to get the attention of other students who may be feeling the same frustration I have felt

      Rashida, as an SLA student, described an issue she had encountered before and explained why this process can help. Her letter is strong evidence that our action is effective.

    1. Author Response:

      Reviewer #1:

      The authors of this study carried out two carefully designed field experiments and a glasshouse experiment simulating effects of rapid warming on soil carbon loss. They did this by transplanting alpine turfs from their cold environment to a warm lowland environment. They found that when lowland plants were inserted into alpine turfs under these lowland climatic conditions (referred to as warming treatment combined with warm-adapted plant introduction) they rapidly increased soil microbial decomposition of carbon stocks due to root exudates feeding the microbes.

      The question is how well this experimental setup mimics what would happen if lowland plants would be inserted into alpine turfs in situ (which have already experienced considerable warming over the past decades), perhaps with an additional warming treatment there.

      The Reviewer alludes to two pertinent points here. The Reviewer’s first point considers whether lowland plants would function similarly (and, by extension, have the same effect on the soil system) if moved from the warmer lowland site to the cooler alpine site. This is a fascinating question in its own right, in that it raises questions about how migrations of non-adapted genotypes far beyond range edges (e.g. via human activity) impact recipient ecosystems. However, although we agree that alpine ecosystems have warmed considerably in recent decades, we cannot be confident that the high elevation sites in our study are already within the climate niche of the lowland focal species. As such, to address our research questions in situ at the high sites would have required additional warming treatments, which come with their own set of disadvantages (see our second point to this comment, below). We also refer the Reviewer to specific questions about adaptation below (see R6), although we see that we were not careful enough about the rationale for our design in the previous version of the manuscript. We have therefore added a clarifying sentence to the Main Text as follows:

      L101: “In short, the experiments used here examined how the arrival of warm-adapted lowland plants influences alpine ecosystems in a warmed climate matching lowland site conditions (i.e. turf transplantation to low elevation plus lowland plant addition) relative to warming-only (i.e. turf transplantation to low elevation) or control (i.e. turf transplantation within high elevation) scenarios.”

      Second, the Reviewer implicitly raises a point about whether our chosen approach of simulating warming plus lowland plant arrival (i.e. transplantation plus addition of lowland plants) is the most appropriate, specifically by suggesting an alternative option of adding lowland plants to (possibly experimentally-warmed) alpine turfs at the high elevation origin site. Here, it was essential to create a climate scenario in which lowland plants would survive and operate within their climatic niche (i.e. relative to their home conditions) once planted into alpine turfs, rather than perform sub-optimally (e.g. be in a potentially inferior competitive position) or be unable to persist at all. The most parsimonious and reliable way to ensure this was to transplant alpine turfs to a site with a lowland temperature regime, with transplantations also being shown to outperform other methods when novel species interactions are involved (Yang et al. 2018). Most importantly, it was crucial to select a method that warmed the entire plant-soil system rather than only the air (e.g. open-top chambers, IR lamps; Marion et al. 1997; Aronson et al. 2009) or soil (e.g. heating cables; Hanson et al. 2017), and did so realistically throughout the year regardless of the weather (e.g. open-top chambers only work on sunny days in the summer; Marion et al. 1997) or a power supply (e.g. IR lamps, heating cables). Transplantation remains the only way to achieve this (Hannah 2022; Shaver et al. 2000). We now clarify our logic in the manuscript as follows:

      L91: “Elevation-based transplant experiments are powerful tools for assessing climate warming effects on ecosystems because they expose plots to a real-world future temperature regime with natural diurnal and seasonal cycles while also warming both aboveground and belowground subsystems. This is especially true if they include rigorous disturbance controls (here, see Methods) and are performed in multiple locations where the common change from high to low elevation is temperature (here, warming of 2.8 ºC in the central Alps and 5.3 ºC in the western Alps). While factors other than temperature can co-vary with elevation, such factors either do not vary consistently with elevation among experiments (e.g. precipitation, wind), are not expected to strongly influence plant performance (e.g. UV radiation) or in any case form part of a realistic climate warming scenario (e.g. growing-season length, snow cover).”

      A further question is if alpine plants inserted in turfs at alpine climatic conditions would have a similar effect as lowland plants inserted in turfs at lowland climatic conditions.

      We interpret “turfs” to mean “lowland turfs” here, since we did insert lowland plants into alpine turfs under lowland climatic conditions (i.e. the WL treatment). We found that adding alpine plants to alpine turfs in alpine climatic conditions (i.e. planting disturbance control, see Methods) had no effect on alpine soil carbon content. By extension, we would expect that adding lowland plants to lowland turfs in lowland climatic conditions would have no effect on lowland soil carbon content. While not explicitly tested, including this treatment would not change our finding that adding lowland plants to alpine turfs causes a reduction in soil carbon content relative to adding alpine plants to alpine turfs. Given this, we have left the text as is, but are happy to revisit this issue based on further discussion with the Reviewer/Editor.

      I suggest that the authors consider these questions when they draw conclusions about the results from their experiments. It would also be interesting to discuss the relevance of sudden strong warming effects relative to slower warming, potentially allowing ecosystems to adjust via changes in genetic composition of species (i.e. evolution) or species composition of communities (i.e. community assembly).

      Thank you for this excellent suggestion. We absolutely agree that anything short of a decadal experiment is unable to detect the role of longer-term evolutionary or community processes on soil carbon dynamics. While this doesn’t eliminate the need for experiments that consider shorter timescales, it is important to explicitly state this limitation. As suggested, we have added a sentence discussing this possibility in the concluding paragraph:

      L387: “While our findings demonstrate that lowland plants affect the rate of soil carbon release in the short term, short-term experiments, such as ours, cannot resolve whether lowland plants will also affect the total amount of soil carbon lost in the long term. This includes whether processes such as genetic adaptation (in both alpine and lowland plants) or community change will moderate soil carbon responses to gradual or sustained warming.”

      We also agree that it is extremely challenging to undertake warming experiments that do not initially “shock” the system through a sudden change in temperature. Having said this, alpine ecosystems are adapted to rapid within- and between-season temperature changes, making such shocks less relevant here.

      Reviewer #2:

      The authors were trying to test whether the migration of lowland plants into alpine ecosystems affects the warming impact on soil carbon. To achieve this goal, the authors first did two field experiments (moving intact turf from high-elevation to low-elevation to simulate warming) in the Alps, and then did a greenhouse pot study to explore the potential mechanisms for the results observed in the field experiments.

      The main strengths of this work are the combination of a field experiment (conducted at two sites) and a greenhouse pot experiment (to explore the detailed mechanisms). Moreover, a number of techniques were used to measure plant traits, soil DOM and microbial properties (e.g. CUE, growth), which help to find the potential mechanisms.

      We thank the Reviewer for this positive comment.

      The main weaknesses of this work are below:

      1) The two field experiments are very short-term (<1 year), but the results were that warming and/or warming+lowland plants led to very high amount of soil C loss (up to ~40%, Fig. 1). I was shocked to see these results as many field warming studies have shown undetectable change in SOC even after years or decades. The authors did not provide a good explanation for this rapid and large change in SOC.

      We apologise for the confusion. We’re unsure where “up to ~40%” comes from here, so we have taken the Reviewer’s later suggestion of changing the annotation on Fig. 1 to contrast C versus WL treatments (Western Alps = 25.6 ± 7.2 mg g-1; Central Alps = 25.3 ± 8.6 mg g-1) rather than W versus WL treatments.

      With regards to the magnitude of soil carbon loss observed, we express soil carbon content in mg g-1 (i.e. mass-based per-mil), not cg g-1 (i.e. mass-based percent). This is so that we could use percent changes in the text to highlight the numeric magnitude of differences between treatments without confusing them with mass-based percent soil carbon – although we appreciate that this also caused confusion. To clarify, converting the above C versus WL treatment contrasts from mg g-1 to mass-based percent yields 2.56% ± 0.72% for the Western Alps experiment and 2.53% ± 0.86% for the Central Alps experiment. While it is striking that the WL treatments lost ~2.5% (~25 mg g-1) soil carbon in one year, such a loss is not extraordinary. To avoid future confusion, we have clarified the units in the Fig. 1 caption as follows:

      L77: “Mean ± SE soil carbon content (mg C g-1 dry mass; i.e. mass-based per-mil) in alpine turfs transplanted to low elevation (warming, W; light grey), transplanted plus planted with lowland plants (warming plus lowland plant arrival, WL; dark grey) or replanted at high elevation (control, C; white). Data are displayed for two experiments in the western (left) and central (right) Alps, with letters indicating treatment differences (LMEs; N = 58).”
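
      The unit conversion described in the response above, from mg g-1 (mass-based per-mil) to mass-based percent, is a division by ten. The short sketch below reproduces the figures quoted in the response; the function name is a hypothetical helper, not something from the manuscript.

```python
def mg_per_g_to_percent(value_mg_per_g: float) -> float:
    """Convert a mass fraction from mg g^-1 (per-mil) to mass-based percent.

    1 mg g^-1 is 1 part per thousand, i.e. 0.1 percent.
    """
    return value_mg_per_g / 10.0


# C versus WL contrasts quoted in the response (mean, SE), in mg g^-1.
contrasts = {
    "Western Alps": (25.6, 7.2),
    "Central Alps": (25.3, 8.6),
}

for site, (mean, se) in contrasts.items():
    print(f"{site}: {mg_per_g_to_percent(mean):.2f}% ± {mg_per_g_to_percent(se):.2f}%")
```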

      2) The greenhouse experiment was used to explore the potential reasons for the amplified loss of soil C in the field experiment. However, a key result was based on incubation of disturbed soils (8 g) and a two-pool modeling of the respiration data from the short-term incubation. This may not provide a good estimate of the true turnover rate of SOC under different plant species (even in the greenhouse condition). If rhizosphere priming was the proposed mechanism (as hinted by the authors), a better approach (such as 13C labeling) is needed to measure microbial respiration from intact soils (with plant/root presence).
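
      For readers unfamiliar with the "two-pool" approach mentioned in this comment, the sketch below shows one common form: respiration modelled as the sum of two first-order decay terms, one for a small fast-cycling pool and one for a large slow-cycling pool. The exact formulation used in the manuscript is not given here, so the functional form, the function name, and every parameter value are assumptions chosen only for illustration.

```python
import math


def two_pool_respiration(t_days: float, fast_c: float, k_fast: float,
                         slow_c: float, k_slow: float) -> float:
    """Respiration rate at time t from two first-order carbon pools.

    fast_c and slow_c are initial pool sizes; k_fast and k_slow are
    decay constants per day. Each pool contributes k * C * exp(-k * t).
    """
    fast_flux = k_fast * fast_c * math.exp(-k_fast * t_days)
    slow_flux = k_slow * slow_c * math.exp(-k_slow * t_days)
    return fast_flux + slow_flux


# Hypothetical parameters: a small fast pool and a large slow pool.
params = dict(fast_c=1.0, k_fast=0.3, slow_c=50.0, k_slow=0.001)
rates = [two_pool_respiration(t, **params) for t in range(0, 30, 5)]
```

      In a fit to incubation data, the early part of the curve is dominated by the fast pool, which is why the reviewer questions how well a short incubation of disturbed soil can constrain the turnover of the bulk soil carbon.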

      We agree with the Reviewer that using an approach such as 13C-labelling would have provided more direct evidence that lowland plants cause a rhizosphere priming effect. However, although some of our evidence comes from disturbed soils (i.e. microbial respiration), some (i.e. soil pore water) also comes from intact pots prior to harvest and we now also include another line of evidence from plant root biomass. In short, we draw on multiple lines of evidence suggesting that root exudates were involved, and note that Reviewer #3 thought our approach and interpretation on this aspect of the study was robust.

      Having said this, we acknowledge that we were too confident in our interpretation here, so we have added caveats to the text as follows:

      L207: “While not directly measured here, a nine-day decay period corresponds to the time expected for newly photosynthesised CO2 to be released through root exudation and respired by soil microbes, suggesting that this carbon pool was mostly root exudates.”

      L215: “While further directed studies are required to resolve whether root exudates are truly involved, our findings collectively suggest that lowland plants have the capacity to increase total root exudation into alpine soil relative to resident alpine plants.”

      3) Some details of the sampling or measurement are very crucial and affect the results/interpretations. For example, in the field experiment, the soil core was only 1 cm in diameter. Considering the spatial heterogeneity of soil carbon in field plots, this small volume may not represent the true soil condition well. Moreover, in the field plots, did soil bulk density change after planting of lowland plants or warming? This will affect the measured SOC concentration (mg/g) even if the SOC stock (g/m2) did not change.

      We agree with the Reviewer that taking a single soil core of 1 cm diameter in each plot would not have been robust. We did not do this. While we used 1 cm diameter cores to minimise disturbance, we took three cores per plot to account for within-plot heterogeneity and combined them into a composite sample. This is stated in the Methods as follows:

      L523: “In each plot, we created a composite sample from three cores (ø = 1 cm, approx. d = 7 cm) no closer than 7 cm from a planted individual and from the same quarter of the plot used for ecosystem respiration measurements (see below; Supplementary Fig. S1).”

      We also agree that bulk density measurements were an important omission in the initial submission. We note that this point was fleshed out by Reviewer #3, below, so we refer the Reviewer to our response to that comment for further details.

      Reviewer #3:

      The authors investigated the effect of warming and herbaceous plant migration on soil carbon (C) content using an ecosystem monolith transplant experiment along an elevation gradient in the Swiss Alp mountains. They observed, approximately 1 year after the transplant, that warming alone had little effect on soil carbon content (monoliths transplanted to a lower elevation with higher temperature remained unchanged in C content) but that the presence of lowland (warm-adapted) herbaceous plants in combination with warming had a negative effect on soil C content. The authors then conducted a glasshouse experiment and used a series of field and laboratory measurements to explore potential mechanisms explaining the observed changes in soil C content in the field. They concluded that soil C losses under lowland plant migration were likely mediated via increased microbial activity and CO2 release from soil C decomposition.

      The research questions are extremely relevant to our understanding of the feedback between soil C dynamics and climate warming and remain an unexplored part of this debate. Moreover, both field and laboratory experimental designs are robust, with all the relevant and necessary validation checks needed for transplant experiments; the laboratory techniques employed to measure the range of microbial and plant variables potentially explaining soil C dynamics are adequate and modern; and the statistical analyses are appropriate. These elements make the present data set very relevant and valuable. The manuscript is also very well and clearly written.

      We thank the Reviewer, and are delighted that they think the study is extremely relevant, novel, experimentally robust, cutting-edge and valuable.

      However, I have two major concerns, casting doubt respectively on the main field results and on the proposed explanatory mechanisms.

      First, at no point is bulk density mentioned and it does not appear to have been measured. This is critical because changes in soil C concentration (which was measured and reported here, in mg C g-1 soil) do not necessarily indicate an actual change in the quantity of C present in the soil (C stock, in unit mass C per unit soil volume, or per unit surface area to a constant depth) if this is accompanied by a change in bulk density: if less C per unit mass of soil (lower C concentration) is concurrent with more mass of soil in a constant volume (higher bulk density), this could mean that no change in C stocks actually occurs (or that even an increase occurs). In the present study, it is possible that the presence of lowland plants increased bulk density as compared to only alpine plants, compensating the lower C concentration and resulting in no change in C stocks. This is perhaps not likely, but it is too critical an issue not to be quantified (or at the very least discussed).
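
      The reviewer's point can be checked with the standard relation between concentration and stock: stock per unit area equals concentration times bulk density times sampling depth. The sketch below uses invented numbers (they are not data from the study) to show how a lower concentration and a higher bulk density can cancel out.

```python
def carbon_stock_g_per_m2(c_mg_per_g: float, bulk_density_g_per_cm3: float,
                          depth_cm: float) -> float:
    """Soil carbon stock to a fixed depth, in g C m^-2.

    mg C g^-1 * g cm^-3 * cm gives mg C cm^-2; multiplying by
    10,000 cm^2 m^-2 and dividing by 1,000 mg g^-1 leaves a factor of 10.
    """
    return c_mg_per_g * bulk_density_g_per_cm3 * depth_cm * 10.0


# Hypothetical plots sampled to the same 7 cm depth.
alpine_only = carbon_stock_g_per_m2(100.0, 0.8, 7.0)   # higher concentration, looser soil
with_lowland = carbon_stock_g_per_m2(80.0, 1.0, 7.0)   # lower concentration, denser soil

print(alpine_only, with_lowland)
```

      With these made-up values the concentration differs by 20% while the stock is the same, which is the scenario the reviewer asks the authors to rule out.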

      This is an excellent point, and one also raised by Reviewer #2. To clarify, we initially decided against measuring bulk density because it is destructive and the experiments were still being used for other studies. Having said this, we agree with the Reviewer that more consideration of soil bulk density was needed, so we have rectified this in three ways. First, although the western Alps experiment has now been taken down, to address this comment we took new soil cores to measure bulk density in the central Alps experiment in 2021 to indirectly confirm that no changes occurred in the presence versus absence of lowland plants. They did not, and we now include these data in the Methods as follows:

      L539: “It was not possible to take widespread measurements of soil bulk density due to the destructive sampling required while other studies were underway (e.g. ref 28). Instead, we took additional soil cores (ø = 5 cm, d = 5 cm) from the central Alps experiment in 2021 once other studies were complete to indirectly explore whether lowland plant effects on soil carbon content in warmed alpine plots could have occurred due to changes in soil bulk density. We found that although transplantation to the warmer site increased alpine soil bulk density (LR = 7.18, P = 0.028, Tukey: P < 0.05), lowland plants had no effect (Tukey: P = 0.999). It is not possible to make direct inferences about the soil carbon stock using measurements made on different soil cores four years apart. Nevertheless, these results make it unlikely that lowland plant effects on soil carbon content in warmed alpine plots occurred simply due to a change in soil bulk density.”

      Second, in the Main Text we now caution readers against translating soil carbon content changes to soil carbon stock in absence of coupled measurements of soil bulk density as follows:

      L113: “We caution against equating changes to soil carbon content with changes to soil carbon stock in the absence of coupled measurements of soil bulk density (Methods). Nevertheless, these findings show that once warm-adapted lowland plants establish in warming alpine communities, they facilitate warming effects on soil carbon loss on a per gram basis.”

      Finally, we have altered the language throughout the manuscript (including the title) to make it clearer that we focussed on soil carbon content/concentration – not stock.

      Second, even assuming that no changes in bulk density occurred and that indeed soil C stocks decreased under warming combined with lowland plant migration, the interpretation of the results is, in my view, at least incomplete. Certainly, the results do not support the claim that soil C losses were mediated via increased microbial decomposition of soil C with the certainty suggested by the authors. Generally speaking, I see three issues with the interpretation:

      • Very schematically, increased microbial respiration and soil C losses from decomposition is only one of two equally likely pathways potentially explaining soil C losses (the other being decreased C inputs to the soil from the plant community). The possibility that decreased soil C content was simply mediated by decreased inputs of C to the soil is hardly explored at all in the study (there is a quick mention of it (L155), but differences in plant biomass are interpreted only for their correlations with microbial activity (L160-166), not as a component of the C balance). Plant traits are measured and analysed but not in a way that can be used to test the hypothesis of changing C inputs. The presence of "more productive traits" (L141) for the lowland plants does not directly relate to differences in the quantity of C inputs to the soil, nor is it interpreted in relation to inputs. Even the interpretation of changes in ecosystem respiration seems to omit the possibility of changes in plant respiration (L208): "depressed microbial respiration per unit of soil was also evident at the ecosystem scale in that warming accelerated total ecosystem respiration but its effect was dampened in plots containing lowland plants". This statement was made despite no significant differences in microbial respiration per unit soil in the field data, and disregards the possibility that the dampened effect in plots with lowland plants could be due to lower plant respiration.

      This is an excellent point. We have performed new analyses of the plant trait/biomass data from the field experiment, included additional measurements/analyses of NEE and GPP from the field experiments (originally omitted due to space, which was a mistake!) and have rewritten all relevant sections in the manuscript to change the focus to a shifting balance between soil carbon inputs and outputs. Importantly, our original interpretation remains robust – i.e. that lowland plants most likely operate by accelerating soil carbon outputs, not decelerating soil carbon inputs – but we are careful to present our conclusions with an appropriate level of caution.

      • For the glasshouse experiment, I agree that the results indicate that (L115): "lowland plants accelerated microbial activity by increasing the quantity of root exudates", but not that (L112): "these findings together imply that lowland plants accelerate alpine soil C loss" because stimulating microbial activity is not per se an indicator of soil C loss. It is now well-known that the activity of microbes is not only a motor for soil C losses, but also a key mechanism leading to transformation of C inputs from plants that leads to the subsequent stabilisation of C in the soil. This is actually clearly stated further down in the manuscript when interpreting the field microbial data (L190). Furthermore, there is no direct evidence that the pots with lowland plants were losing more C than those without. Therefore, results from the glasshouse experiment could be interpreted differently: a larger fast cycling pool of soil C constituted of recently photosynthetically fixed exudates associated with higher microbial activity could well be interpreted as an early indicator of more C stabilisation, particularly since the absorbance index seems to indicate more microbially derived product in the DOC. It would have been great to measure microbial biomass C over time (as well as CUE, and mass specific growth and respiration), to see if higher respiratory activity was associated with higher biomass. The lack of differences in microbial biomass between the plant community treatments at the end of the 6 weeks does not show that the quantity of microbial biomass produced over the whole incubation period remained constant. In a word, more respiration of a larger fast cycling pool is not an indicator of future soil C loss (in the presence of plants).

      We thank the Reviewer for raising this important point. On reflection, we agree that the previous version of the manuscript did not give sufficient consideration to the possibility for increased microbial activity (and, indeed, respiration) in the glasshouse experiment to signal soil carbon accumulation via increased microbial growth. Having said this, all pots began with the same soil and microbial biomass remained unchanged between alpine and lowland plant treatments at the end of the six-week experiment. By extension, no net microbial growth occurred during this timeframe, making it unlikely that the accelerated respiration observed under lowland plants was indicative of soil carbon accumulation. Sadly, while we can deduce that intrinsic rates of respiration were higher, we can only speculate that growth remained unchanged (no new measurements can be done since growth measurements require fresh soil). We have rewritten the respective section in the manuscript in light of this and the Reviewer’s other comments, which includes the following caveat:

      L181: “These findings support the hypothesis that lowland plants have the capacity to increase soil carbon outputs relative to alpine plants by stimulating soil microbial respiration and associated CO2 release. While accelerated microbial respiration can alternatively be a signal of soil carbon accumulation via greater microbial growth, such a mechanism is unlikely to have been responsible here because it would have led to an increase in microbial biomass carbon under lowland plants, which we did not observe.”

      • The interpretation of the microbial variables measured in the field line up better with current conceptualisations of the role of microbes in C cycling (but overall interpretation still lacks consideration for plant C inputs). However, interpreting those data measured once 1 year after the transplant to explain the changes that happened gradually over this whole year is a risky and difficult exercise. How do we know that CUE, Rmass, Gmass etc… measured then represent what they were a day, a week, a month before? There is an attempt to deal with this timing issue by comparison with the glasshouse experiment, but only Cmic and Rmass can really be compared and it only very partially fills in the gap in time. Besides, the interpretation of this comparison can be questioned: in the glasshouse, Rmass was higher for the lowland plant pots (as compared to alpine plant at constant temperature) but actually remained constant between the comparable treatments W and WL in the field (Fig 2m). The results from the field, therefore, do not "support observations from the glasshouse experiment" in this context (L197) and neither do they "confirm (…) that this persists for at least one season" (L199). Finally, the thinking around the pulsed nature of C losses seems misplaced because there is no evidence that soil C losses had stopped after a year in the field (no measurements of soil C content are presented after that year).

      With regards to plant carbon inputs, we refer the Reviewer to their previous comment for corresponding revisions. With regards to specific comparisons between the glasshouse and field experiments, we have now deleted the sentences in question and have interpreted our results as follows:

      L329: “Thus, despite lower rates of ecosystem respiration overall, alpine soil microbes still respired intrinsically faster in warmed plots containing lowland plants. Moreover, accelerated microbial respiration, but not growth, implies that alpine soils had a higher capacity to lose carbon under warming, but not to gain carbon via accumulation into microbial biomass, when lowland plants were present. These findings align with observations from the glasshouse experiment that lowland plants generally accelerated intrinsic rates of microbial respiration (Fig. 3), although in field conditions this effect occurred in tandem with warming.”

      With regards to soil carbon loss being pulsed, while there is support for such a mechanism, we agree that this is one of several hypotheses and with only two timepoints we were too confident about it in the original submission. We have now reshaped this section of the manuscript entirely to be more cautious about the temporal dynamics involved. For instance, the section title now reads “Lowland plant-induced soil carbon loss is temporally dynamic”. Some other notable changes are:

      L286: “Importantly, lowland plants had no significant bearing over net ecosystem exchange (Fig. 5a), implying that although lowland plants were associated with soil carbon loss from warmed alpine plots (Fig. 1), this must have occurred prior to carbon dioxide measurements being taken and was no longer actively occurring.”

      L293: “By contrast, ecosystem respiration in warmed alpine plots was depressed in the presence versus absence of lowland plants (Fig. 5c). These findings generally support the hypothesis that lowland plants affect the alpine soil system by changing carbon outputs. However, they contrast with expectations that lowland plants perpetually increase carbon outputs from the ecosystem and thus raise questions about how soil carbon was lost from warmed plots containing lowland plants (Fig. 1).”

      L320: “Carbon cycle processes are constrained by multiple feedbacks within the soil system, such as substrate availability and microbial acclimation, that over time can slow, or even arrest, soil carbon loss. We thus interrogated the state of the soil system in the field experiments in the western Alps experiment to explore whether such a feedback may be operating here, in particular to limit ecosystem respiration once soil carbon content had decreased in warmed alpine plots containing lowland plants.”

      L354: “Taken together, one interpretation of our findings is that the establishment of lowland plants in warming alpine ecosystems accelerates intrinsic rates of microbial respiration (Fig. 3, Fig. 6a), leading to soil carbon release at baseline levels of microbial biomass (Fig. 1, Fig. 3c), a coupled decline in microbial biomass (Fig. 6c) and a cessation of further carbon loss from the ecosystem (Fig. 5a, Fig. 6d).”

      L358: “Although such a mechanism has been reported in other ecosystems, applying it here is speculative without additional timepoints because field soil measurements came from a single sampling event after soil carbon had already been lost from the ecosystem. For instance, an alternative mechanism could be that soil microbes acclimate to the presence of lowland plants and this decelerated microbial processes over time.”

      L368: “Beyond the mechanism for lowland plant effects on alpine soil carbon loss, it is conceivable that soil carbon loss is not isolated to a single season, but will reoccur in the future even without further warming or lowland plant arrival. This is especially true in the western Alps experiment where warming yielded a net output of carbon dioxide from the ecosystem (Fig. 5a). Moreover, in our field experiments we simulated a single event of lowland plant establishment and at relatively low abundance in the community (mean ± SE relative cover: 4.7% ± 0.7%), raising the possibility that increases in lowland plant cover or repeated establishment events in the future could facilitate further decreases in alpine soil carbon content under warming.”

      Reviewer #4:

      This manuscript took alpine grasslands as a model system and investigated whether lowland herbaceous plants contributed to the short-term dynamics of soil carbon under the context of climate warming. The authors find that warming individually does not render significant changes in alpine soil carbon, but corporately causes ~52% of carbon loss with lowland herbaceous plants in two short periods of field experiments. They further show that alpine soil carbon loss is likely mediated by lowland herbaceous plants through root exudation, soil microbial respiration, and CO2 release. This work adds in an interesting way to the ongoing debate on whether a positive climate feedback will be mediated by plant uphill range expansion in alpine grasslands, where climate warming may lead to a rapid loss of soil carbon.

      The claims of this manuscript are well supported, but some aspects of background information in the studied alpine systems and field experiment design need to be clarified.

      1) There is an extremely high level of carbon stored in the alpine soils (Figure 1). Climate warming will certainly lead to a great loss of soil carbon in the study systems that could contribute to the positive climate feedback. However, it is unclear to me how the effects of climate warming on soil carbon are relevant to the ongoing climate change in the studied alpine grasslands. It is therefore reasonable to provide more background information about ongoing climate change, and whether the simulated climate warming (i.e., 2.8 °C in the central Alps and 5.3 °C in the western Alps, Line 328-329) is realized as real-world climate change in the local systems. In addition, it seems that the manuscript aims to address a question that is of global concern, but my concern is about how the findings could be generalized to other regions.

      We thank the Reviewer for pointing this out. With regards to the amount of soil carbon stored in the alpine soils, we refer the Reviewer to comments from Reviewer #2. With regards to the magnitude of warming expected in mountain regions, we agree with the Reviewer that the original submission lacked context. We have therefore added specific values as suggested:

      L59: “They are experiencing both rapid temperature change (0.4 to 0.6 °C per decade) and rapid species immigration…”

      With regards to how findings could be generalised to other regions or ecosystems, this is an important point that requires further research – and which we raise in the concluding paragraph. However, we see that we could have been more explicit about validating our findings in other mountain regions, so we have amended the sentence in question as follows:

      L400: “Future work should focus on testing the conditions under which this feedback could occur in different mountain regions, as well as other ecosystems, experiencing influxes of range expanding plant species, on quantifying how deeply it occurs in shallow alpine soils, and on estimating the magnitude of the climate feedback given both ongoing warming and variation in rates of species range shifts.”

      2) I understand that the manuscript considers elevation as a natural gradient of climate change, which makes it possible to compare soil carbon dynamics in lowlands with alpine grasslands under climate warming. I also understand that the authors have done everything they can to control for the disturbances caused by transplanting that has been well justified by the supplementary data (e.g., Figure S6). However, it is unclear how the authors controlled for the influences of other factors given there are huge differences between lowlands and alpine grasslands, such as differences in wind, solar radiation, humidity, and the length of growing season.

      This is an excellent point. We note that Reviewer #1 also raised this point, so we refer the Reviewer to our response to that comment for further details.

      3) It is generally known that different species respond to climate warming differently. Some species may be sensitive to climate warming and have traits that aid dispersal and could expand their ranges to some degree, while others may adapt to climate warming in place and may not migrate to alpine systems. It is therefore unwise to assume that all the lowland species have the same dispersal ability. In other words, it is unclear how lowland plant species are selected for the field transplanting experiment (Line 284-290). Do all the lowland plant species selected have the potential to migrate to alpine systems?

      This is an excellent question. In short, the specific dispersal abilities of lowland species used are currently unknown and will certainly vary. However, all are widespread and we assume have the capacity to migrate to higher elevations, given that horizontal distances between high and low elevation sites were in both cases less than 2 km. We now clarify this in the manuscript as follows:

      L433: “While exact dispersal distances for selected lowland species are unknown, all species are widespread and are expected to migrate uphill under warming and the horizontal distance between high and low sites in the field experiments was always less than 2 km.”

      4) The authors acknowledge that "we did not perform a reverse transplantation (that is, from low to high elevation), so we cannot entirely rule out the possibility that transplantation of any community to any new environment could yield a loss of soil carbon" (Line 318-320). When I read the title "lowland plant migrations into alpine grasslands …", I assumed that lowland plant species were transplanted from low to high elevation. In fact, the design is just the opposite of what I expected. Without performing a reverse transplantation experiment, I am not sure the conclusion will stand that "lowland plant migrations into alpine grasslands amplify soil carbon loss under climate warming". In addition, it is unclear whether lowland plant effects stand alone or depend on climate warming, because a lowland-plant-only treatment is missing from the results in Figure 1 and it is therefore impossible to test the interactions between lowland plants and climate warming.

      We apologise for the confusion. This comment echoes other comments from Reviewer #1 asking us to be more explicit about the treatments used when interpreting findings, to caveat the step in logic from transplantation to warming and to acknowledge throughout the manuscript that lowland plant effects were dependent on transplantation in the field experiment. We therefore refer the Reviewer to our responses to those comments for details on how we resolved this. We have also modified the title and abstract to more accurately represent the experimental design, as follows:

      Title: “Lowland plant arrival in alpine ecosystems facilitates soil carbon loss under experimental climate warming”

      L30: “Here we used two whole-community transplant experiments and a follow-up glasshouse experiment to determine whether the establishment of herbaceous lowland plants in alpine ecosystems influences soil carbon content under warming. We found that warming (transplantation to low elevation) led to a negligible decrease in alpine soil carbon content, but its effects became significant and 52% ± 31% (mean ± 95% CIs) larger after lowland plants were introduced at low density into the ecosystem.”

      With regards to testing the interaction between warming and lowland plants, while we acknowledge that not performing a fully-factorial design limited our ability to explicitly separate lowland plant versus warming effects on alpine soil, both are occurring simultaneously due to climate warming and we thus focussed effort on simulating such a scenario with greater experimental replication and at multiple locations. We note that Reviewers #1, #2 and #3 thought that this approach was robust. Importantly, the statistical analyses performed are valid for such an experimental design, and we have clarified and nuanced our interpretation throughout to avoid reaching beyond it.

    1. Author Response:

      Reviewer #1 (Public Review):

      This paper uses a combination of confocal and electron microscopy to localize gap junctions in the outer retina. Electrical coupling between photoreceptors is an important aspect of retinal function, and past work provides (often indirect) evidence for rod-rod, rod-cone and cone-cone coupling. The work described here indicates that rod-cone coupling dominates. The combination of techniques is quite convincing and very elegant. My concerns are primarily about the appeal of the work to non-retina readers. Some of these concerns could be mitigated by a more accessible presentation of some of the results. Suggestions along these lines, and a few other minor issues, follow.

      Introduction:

      The introduction is a bit retina-centric. I think more needs to be done to explain how each type of coupling (rod-rod, rod-cone, cone-cone) could impact retinal processing, and why it is important to resolve which are present or dominant. One issue that could get emphasized is the difference between gap junctions between like cell types (presumably involved in lateral spread of signals, averaging, etc) and between unlike cells (potentially providing an alternate path for signal flow - as in the secondary rod pathway).

      We have included new text in the introduction to address this issue. We have tried to provide background material of a general nature and we have included some introductory text about different types of gap junctions, as requested. We thank reviewer 1 for this helpful suggestion.

      Cone-cone coupling:

      It would be helpful to put the conclusions about rod-cone and cone-cone coupling together. The paragraph starting on line 585 is a bit confusing that way. It starts by summarizing evidence that blue cones are not coupled with red/green cones. But then (in mouse) all the cones are coupled to rods, so that specific exclusion of blue cones seems unlikely to hold. You come back to this a bit later in the discussion, and there indicate that there appears to be weak cone-cone coupling. Merging the text in those two locations might help. It might also help to make the (seemingly clear) prediction that blue and green cone signals in mouse will get mixed.

      Thank you for pointing out that this section is not clear. It seems two different points are muddled: 1) Blue cones do not make gap junctions with other cones, perhaps to minimize spectral mixing: the evidence from primate and ground squirrel suggests that blue cones are not coupled to red/green cones or green cones. 2) In contrast, we find no evidence of color selectivity in rod/cone coupling: green cones and blue cones are both coupled to all nearby rods. Thus, rod signals can be injected into the downstream pathways of both blue and green cones.

      We have rewritten the text and separated these points into separate paragraphs for clarity, as below.

      Revised Text:

      Blue cone pedicles are also coupled to rods.

      In the cone networks of primate and ground squirrel retina, there is good evidence that blue cones are not coupled to neighboring red/green (primate) or green cones (ground squirrel) (Hornstein et al., 2004; Li and DeVries, 2004; O’Brien et al., 2012). In the primate retina, the telodendria of blue cones are few in number and too short to reach the neighboring red/green cones (O’Brien et al., 2012). Thus, blue cones appear to be electrically separated from other cones in these two species, perhaps to maintain spectral discrimination (Hsu et al., 2000). In the mouse retina, although the blue cones were identified by Behrens et al., (2016), we were unable to find any cone to cone gap junctions, regardless of color (see below).

      In contrast to the selective connections between cones in some species, rods were coupled to both blue and green cones indiscriminately in the mouse retina (present work) and in primate retina (O’Brien et al., 2012). Blue cones, identified in confocal work by the presence of S-cone opsin, and in SBF-SEM by their connections with blue cone bipolar cells (Behrens et al., 2016; Nadal-Nicolás et al., 2020), and green cones both made telodendrial contacts at Cx36 clusters with all nearby rod spherules (Fig. 4). Thus, we find no evidence for color specificity in rod/cone coupling. In fact, a single rod spherule may be coupled to both blue and green cones (Fig. 5, supplement 5). Therefore, rod signals can pass via the secondary rod pathway into both blue and green cones and their downstream pathways. Considering blue cone circuits specifically, rod input to blue cone bipolar cells and downstream circuits is predicted via the secondary rod pathway, in addition to the previously reported primary rod pathway inputs from AII amacrine cells to blue cone bipolar cells (Field et al., 2009; Whitaker et al., 2021).

      Relation to other circuits:

      Are there implications of the present results for gap junctional coupling in other circuits that could be emphasized? Things like the open probability and how strongly it can be modulated seem like points of general interest - but I don't have enough expertise to know if those are established facts on other systems. Some of that is touched on in the Discussion, but quite briefly.

      In an effort to keep the discussion short, we have perhaps been too abrupt. We have added text to the discussion to include some general issues concerning gap junctions.

      Location of Cx36:

      Can you speculate on why Cx36 is generally located at the mouth of the synaptic opening in the rod spherule? This was a very clear result, but it was unclear (at least to me) if it was important.

      This is an interesting topic and we have expanded the discussion to consider potential functions and mechanisms.

      Added to discussion:

      The position of rod/cone gap junctions, at the base of the rod spherule, close to the opening of the post-synaptic cavity, appears to be systematic in that the vast majority of rod/cone gap junctions occur at this site. We may speculate that gap junctions are localized with some of the same scaffolding proteins that occur at the rod synaptic terminal, but the functional significance of this repeated motif is unknown. In mutant mouse lines, where Cx36 has been deleted from either rods or cones, cone telodendria are still present and they still reach out to contact nearby rod spherules in the absence of rod/cone gap junctions. Therefore, the specificity of synaptic connections is not determined or maintained by the presence of Cx36 gap junctions.

      Reviewer #2 (Public Review):

      Previous studies demonstrate that modulation of gap junctional coupling in the outer plexiform layer of the mouse retina regulates the balance between sensitivity and resolution. The authors use optical and electron microscopy to structurally characterize this coupling. They find that gap junctional coupling in mouse OPL is produced by a dense meshwork of cone photoreceptor telodendrions that selectively innervate the rim surrounding the synaptic openings of rod photoreceptor spherules. The density of this coupling network is such that each cone is coupled to dozens of rods and each rod is coupled to multiple cones. Rod/rod and cone/cone gap junctions were not detected.

      The combination of antibody labeling, reconstruction of the photoreceptor terminal network, and ultrastructural analysis provides a remarkably clear view of the gap junctional connectivity that constitutes the first stage of visual processing. A few results are only weakly supported due to sample size or technical limitations. However, the overall conclusions are well supported and the data is presented with unusual transparency. The map of the network organization of photoreceptor coupling generated here is an important contribution to visual science.

      Optical imaging:

      The quality of the confocal imaging is high and the images of the Cx36 distribution relative to rod spherules are convincing. There does seem to be a significant amount of processing in the images and a lack of background signal in antibody images. Whether this processing is due to the airy scan software or additional filtering and thresholding, it can be difficult to judge the distribution of signal in several images.

      In general, there was no filtering or processing of any confocal images, except for adjusting brightness and contrast. However, we may have been over-zealous in reducing the background. Therefore, we have adjusted Figures 1 and 2 to include more background as requested, to enable the reader to better judge the specificity of the immunolabeling. In addition, we have prepared supplementary figures to show the individual channels with background, as well as the combined images, to be absolutely clear and transparent. Finally, for each confocal image, the confocal series from which it was derived has been archived and is publicly accessible.

      Former Figure 1D, now Fig. 2D is an exception because it shows a 3D projection of the colocalization between a single EGFP labeled cone pedicle and Cx36. We have revised this figure, providing new 2D optical sections to show how the image was prepared, in addition to revising the final 3D projection, labeling it as a 3D projection with colocalized Cx36.

      Electron microscopy:

      The authors perform annotations on two previously acquired volume EM datasets. The first serial blockface EM dataset is relatively low resolution and lacks ultrastructural labeling but is used effectively to reconstruct the terminal morphology and points of contacts between photoreceptors. The second EM data set uses FIB SEM to obtain smaller voxel sizes from tissue stained in such a way that the darkened membranes of putative gap junctions are distinct from surrounding membrane. Most measures of gap junction number come from the ultrastructure free dataset. In isolation, counting of gap junctions in this type of image volume could be unreliable. However, comparing the putative gap junctions in this dataset to the morphology and distribution of Cx36 antibody clusters in the confocal imaging and the darkened plaques in the FIB SEM images greatly increases confidence that the network description of rod/cone gap junctional coupling is accurate.

      Quantification:

      Most quantification is presented with an unusually high degree of transparency, with scatterplots showing all data points, data source files showing the animals that data came from, and standard deviations being supplied in descriptive statistics. There are a few places where Ns are difficult to determine or the analysis is not quite clear. For several results, claims are made when the sample size is too small to be sufficiently confident. The reconstruction of 5 blue cones suggests that, overall, blue cones are not radically different from other cones in their terminal morphology or gap junctional coupling to rod spherules. Claims that the blue cones are identical to other cones in most measures or that their telodendrions are smaller, but not statistically smaller, are not well supported by the sampling. Similarly, the fact that the 6 nearby cones closely analyzed for cone/cone gap junctions yield no junctions strongly suggests that the vast majority of gap junctions are cone/rod gap junctions. However, the sample is too small to argue that there could not be infrequent, atypical, or region-specific cone/cone gap junctions.

      We have addressed the issues of blue cones and cone/cone coupling to soften our conclusions and explicitly point out the small numbers.

      Estimate of open channels:

      The authors estimate that 89% of gap junction channels are open during times of maximum rod/cone coupling and point out that this number is surprisingly high relative to previous estimates. However, this estimate appears to be subject to many significant potential errors. The estimate combines previous freeze fracture studies of the density of gap junctions from various species and various parts of the retina with the measurements of the length and width of the gap junctions in the current study. Differences in tissue processing, density variation within and between systems, reconstruction error, and variation and error in the inputs to the model could all contribute to an underestimate of the total number of channels linking mouse rods and cones. Moreover, without an accounting of these issues, the real error bars on the range of possible open channels would seem to include both surprising and less surprising estimates of open gap junction fractions.

      This is a major issue. In short, for the calculations of open probability, we have estimated the cumulative errors, added these numbers to the text and attached an appendix showing the statistical analysis. We have also added a section to the discussion to address the possible sources of error enumerated by reviewer 2.
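
      As an illustration of the kind of calculation discussed in this exchange, the sketch below computes an open-channel fraction and a first-order cumulative error. It assumes that the channel count is the product of channel density, gap junction length and gap junction width, that the open fraction is the measured junctional conductance divided by the conductance expected if every channel were open, and that the input errors are independent. The function names and all numerical values are hypothetical placeholders, not figures from the study or its appendix.

```python
import math

def open_fraction(g_junction_pS, unitary_pS, density_per_um2, length_um, width_um):
    """Measured conductance divided by the conductance if all channels were open."""
    n_channels = density_per_um2 * length_um * width_um  # assumed rectangular plaque
    return g_junction_pS / (n_channels * unitary_pS)

def relative_error(*rel_errors):
    """First-order relative error of a product or quotient of independent inputs."""
    return math.sqrt(sum(e * e for e in rel_errors))

# Placeholder inputs (hypothetical; chosen only to make the arithmetic easy to follow)
f = open_fraction(g_junction_pS=1000.0, unitary_pS=10.0,
                  density_per_um2=1000.0, length_um=0.5, width_um=0.4)

# One assumed relative error per input, in the same order as the arguments above
rel = relative_error(0.10, 0.05, 0.20, 0.10, 0.10)

print(f"open fraction = {f:.2f} +/- {f * rel:.2f}")
```

      Because the relative errors add in quadrature, the least certain input (here the assumed 20% error on channel density) dominates the width of the final interval.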

      Reviewer #3 (Public Review):

      In the presented work, Ishibashi and colleagues combine immunohistochemistry, analysis of a publicly available large scale 3D EM dataset and smaller but more detailed newly acquired EM datasets to qualitatively and quantitatively study gap junctions of mouse rod and cone axon terminals. The existence of rod-to-cone gap junctions has been known before, but the use of larger 3D EM data allows one to determine an average number of contacts as well as an estimate of the strength of gap junctions. This as well as the (very likely) exclusion of direct cone-to-cone coupling in the mouse as opposed to some other mammals are the main contributions of this paper and one more puzzle piece of the big picture of mouse retinal connectivity. However, while the findings are a valuable addition towards a complete picture of the connectivity in the mouse retina, the novelty of the findings is limited to the number of contacts per photoreceptor and gap junction sizes.

      In my opinion, while the authors present a thorough analysis of their data, the manuscript in its current state has stylistic flaws on the motivational side. To me, abstract and introduction lack a motivation or stronger statement of relevance for this analysis. Similarly, while each individual analysis is discussed one by one, I'm missing a broader discussion of the implications of the findings for the field and possible directions for future research to highlight relevance for a broader readership.

      Thank you for the positive comments. We have rewritten and added material to the Abstract, Introduction and Discussion in an attempt to explain the reasoning for this study and to explain the findings to a broader audience.

    1. Backward design (or backward planning/mapping) is about designing with the end in mind. Where do you want students to end up after a lesson? What knowledge and skills do they need to showcase? What are the desired results of the lesson?

      I believe these questions are so important to consider when creating any lesson! As someone who needs time adjusting to new tools or apps, I have many times had to focus more on how to use the tool than on the content we were using it for. I think when you consider these questions, it helps make sure the students will all benefit from the tool. It may take some extra time during planning, but it will be more beneficial in the long run.

    1. Author Response:

      First we would like to thank the reviewers for their very kind words regarding our manuscript and for their helpful suggestions for how to improve our paper. We believe their suggestions have helped to strengthen the paper as a whole. We will address below the specific weaknesses that the reviewers have brought up and describe how we have modified our manuscript in response to these suggestions.

      Reviewer #1:

      This is an interesting study of the relation between vividness of visual imagery and the pupillary light response that can result from it. The authors collected data in two experimental paradigms, which they ran in two independent samples. One of these samples was a larger group of psychology students; the other a self-reported group of people with aphantasia. In a first paradigm, the authors show that a lack of vivid imagery is associated with a smaller (or even absent) pupillary light response. Using a second paradigm, binocular rivalry, they show that the degree to which imagery primes binocular rivalry is correlated (to a degree that is quite striking) with the magnitude of the pupillary light response to imagined stimuli. These results were obtained both for low-scoring individuals in the large sample as well as for the aphantasics. The study provides objective evidence for the absence of imagery in individuals that self-report as aphantasic.

      The paper is well written and all the necessary controls for potentially confounding variables are in place. For instance, age or visual persistence are discussed and excluded as alternative explanations based on convincing analyses. A particular strength of the manuscript is that the authors report positive results for pupillary responses in the group with aphantasia. That is, these individuals show regular pupillary responses to changes in physical stimulus brightness as well as to cognitive load. Another strength is that the group of aphantasics was invited separately and not determined post-hoc in the initial sample.

      In summary, there is a lot to like about this paper. I have three comments / questions that I think should be addressed, however.

      1. A point that I would like to see analyzed and discussed is the role of eye movements. The authors do not report any analyses of fixation behavior or the frequency of saccades in the two groups. These should be analyzed and reported. The only mention of fixation control is in lines 423-424, but the authors remain at a very superficial level, stating that footage from this scene camera of the pupil labs eye tracker was "assessed to ensure fixation on the computer monitor". Does this mean that participants could look anywhere provided they looked at the monitor?



      We have now analysed the eye movements of participants to assess whether they might be driving some of our findings. We agree this is a very important additional analysis to confirm that our findings are not driven by eye movements. When analysing both eccentricity and the number of saccades participants made, there were no differences between the two groups when imagining the triangles (see supplementary figures s7 and s11). There was also no correlation between the eccentricity data and either the imagery pupillary light reflex or binocular rivalry priming. Taken together, it seems unlikely that the observed pupillary light response during imagery is driven by eye movements.

      1. In Figure 1D (also lines 120-124), the authors show a correlation between vividness ratings and the pupillary light response. I assume that participants differ substantially in their distributions of responses. So these correlations could be a consequence of individual differences or they could provide evidence for trial-by-trial variation. There might be ways to find out. For instance, is there evidence for these correlations at the level of individuals? Does the correlation persist if individual vividness-response distributions are normalized to span the same range for each observer?

      We would like to clarify the analysis we ran. Figure 1D shows the results of a 2 x 4 linear mixed-effects analysis, not correlations. This model included subject identity as a random effect (see the Methods section of our paper), and therefore the reported effects were computed at the subject level. In the text, we report effects that are significant at the level of the sample. This does not exclude the possibility of inter-individual differences, but we are not sure how interpretable a single-subject analysis is in the current study.

      1. In lines 314-315, the authors state that the pupillary light response to imagined stimuli may serve as an objective indicator of aphantasia. I think this is taking the interpretation of the data too far, mainly for two reasons. First, the authors haven't shown that low pupillary light response predicts aphantasia in a group of people that does not self-report as aphantasics before the test. Second, the absence of a pupillary light response (in a new sample with no additional controls) could also indicate a lack of motivation to engage in imagery. The authors should thus clarify that such tests would always have to be combined with positive tests that show the commitment of participants to the task instructions.

      We agree that it is very important to include positive controls in not only pupillary light response imagery tasks, but any task that measures imagery or any other internal experience. We have now expanded on this point in our discussion as well as reporting on the mock binocular rivalry trials that were included in the priming imagery task as a control for potential response biases.

      Reviewer #2:

      Kay et al. investigated visual mental imagery in the general population and the lack thereof in individuals with aphantasia by measuring the pupillary light response to imagined light and dark shapes. Their findings are twofold. First, they show a link between pupil size change and perceived vividness of imagery and corroborate this finding using another established objective measure of vividness. Secondly, they found a lack of such a pupillary light response in a group of individuals who maintain no visual experience of imagery. This demonstrates the usefulness of using the pupillary light response as a measure of subjective vividness of imagery and potentially demonstrates the first physiological finding in aphantasia.

      Strengths

      The experiment incorporates several different dimensions into a single clean design that is useful for isolating and tracking multiple relevant measures. First, by having the brightness of the perceived and imagined shapes vary across trials, the authors could show that changes in the pupillary light response correspond to changes in imagined brightness. The authors also added in an independent number-of-objects dimension since pupil size also varies with cognitive effort. This provided evidence that aphantasic subjects were attempting to imagine, since the pupil size did change with set size, even when it didn't change with brightness. Finally, by having subjects report the perceived vividness of each imagined image, the authors could link subjective experience of imagery to the pupillary light response.

      The authors also strengthen their findings by comparing changes in pupil size to an objective measure of imagery vividness. By leveraging the fact that imagery mimics vision's ability to bias a perception during binocular rivalry, the authors avoid the severe limitations present in measures that rely on introspection only.

      Weaknesses

      Due to the inherently private nature of mental imagery, ruling out fabrication or demand characteristics is extremely difficult. This is especially true in aphantasia research, as we are often looking for the absence of an effect rather than an enhancement. Readers should keep in mind that, while the authors made some effort to confirm that the aphantasic subjects were attempting to imagine, the potential for this and other biases were not ruled out. Without the use of probes to test subjects on the remembered/imagined objects and reporting the outcomes of catch trials, it is difficult to tell whether subjects were fully engaging with the stimuli.

      Readers should also take the pupillary light response as a tool to add to the battery of assessments for aphantasia, not as one that a diagnosis can be based on alone. While the authors do show a group level difference in pupil size in response to imagined shapes and claim it as a "new low-cost objective measure for aphantasia", it should be remembered that this manuscript does not demonstrate the tool's efficacy in identifying individual subjects with aphantasia. The absence/presence of an imagery pupillary light response does not confirm/rule out aphantasia.

      Overall, the manuscript helps characterize an intriguing condition that until relatively recently received little empirical attention. These findings support the internal experiences described by aphantasic individuals, experiences that are often met with skepticism. Importantly, the authors have also offered the field a new objective physiological approximation of imagery vividness which can be incorporated into a number of study designs examining changes in imagery. The majority of previous measures relied on self-report alone and often suffered from the limitations of language (e.g., what it means for something to be "like vision" can be very different for different people). This manuscript also adds to the growing body of evidence of the power of internally generated signals, which can apparently reach all the way down the visual hierarchy to the eyes themselves.

      We are in full agreement that when we investigate the internal contents of the mind we need to be mindful of the many caveats that exist when relying on people’s ability to introspect. We agree that future studies should expand on our research by adding further controls, such as having participants report at the end of the trial which item they were asked to remember. However, researchers should also keep in mind that changing the demands of a task can alter how participants undertake it. For example, by emphasising remembering the items, rather than creating detailed vivid images in mind, participants may revert to a non-visual strategy to remember the items, such as labelling them. This may be particularly easy to do in the current study, as the items being imagined are simple geometric shapes. Indeed, it was important to avoid this potential pitfall with our aphantasic population, as we have previously shown that aphantasic individuals can perform a wide array of visual working memory tasks despite their lack of visual imagery. We believe that the addition of a set-size/cognitive load condition, plus our added reporting on mock trials, helps to address some of these potential response bias issues, but future research can and should investigate these potential biases in greater detail.

      The second point Reviewer #2 brings up is a very good one: no single measure in isolation can, at this point in time, be used to ‘diagnose’ aphantasia. The field is very young and we are still in the process of understanding exactly what aphantasia is. For example, there may be many subtypes of aphantasia, with previous work from our group and others showing that aphantasic individuals are heterogeneous in their reporting of how other imagery modalities are affected. We agree with Reviewer #2’s point that a battery of tests, potentially comprising questionnaires (e.g. VVIQ), psychophysical tasks (e.g. binocular rivalry paradigm) and physiological measures (e.g. skin conductance, pupillometry), should be aimed for where possible when testing aphantasic populations. The pupillary light response is a new tool that can be added to this arsenal.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript by Wang and colleagues, the authors analyse single-cell RNA-seq (scRNAseq) data by applying transition path theory to infer gene regulatory network (GRN) changes along the transition (reaction coordinate, trajectory) between free energy stable states (i.e. cell types). The work aims to understand how stable cell types, and their regulatory programs (combination of active and repressed genes) switches during differentiation/reprogramming/response (i.e. cell phenotypic transition/CPT). The premise of the work is to assess whether genes within GRNs undergo step-wise repression, state-change and activation (& vice-versa; analogous to SN1) or concurrently regulate gene expression (analogous to SN2). The GRNs are inferred based on highly variable genes and their expression dynamics from RNA velocity over CPT, across 3 scRNA-seq datasets.

      The authors first analyse a public scRNA-seq dataset of 3003 human A549 adenocarcinomic basal epithelial cells treated with TGF-β for 0 hrs, 8 hrs, 1 day and 3 days (4 timepoints). The authors select two stable states (Day 0-untreated; Epithelial and Day 3-treatment; Mesenchymal) using local kernel densities and set transition paths using Dijkstra shortest path, dividing state space into Voronoi cells (i.e. reaction coordinate value), and construct single-cell GRNs based on RNA velocity differences (n=500 genes) and a linear model (from Qiu et al). This GRN is based on expression and velocity estimates, and does not distinguish direct from indirect regulation. Calculating interaction frequency (edges) across two stable states over 4 Louvain clusters, the authors find a global increase in effective edges that correlates with increased active genes, but with a variable trend within inter-cluster edges. To quantify the concerted GRN changes between clusters, the authors utilise a "frustration" score (from Tripathi et al 2020). The average frustration score increases and peaks at day 1 treatment, followed by a decline towards the terminal stable state (day 3-treatment), similar to the interaction frequency trends. The authors also separately measure network heterogeneity and repeat the analysis using an alternative transition matrix. The authors conclude that EMT proceeds through concerted regulation of multiple genes, first with an increase in inter-cluster edges, frustration and heterogeneity, followed by a decrease into the final stable state. The authors apply the analysis to scRNA-seq data from (i) pancreatic endocrine differentiation, where Ngn3-low progenitors give rise to Ngn3-high, then Fev-high and into glucagon-producing α-endocrine cells; (ii) dentate gyrus: radial glial cell differentiation into nIPCs, neuroblast, immature granule and mature granule cells. In both cases, the authors observe concerted regulation with an initial increase in inter-community edges and heterogeneity during differentiation, followed by a decrease towards the final stable state.

      The study and ideas in the manuscript are interesting and the methods would potentially be useful. However, there are a few specific and general comments stated below, which the authors should try to address.

      1 • P4: "RC increases first and reaches a peak when cells were treated with TGF-β for about one day, then decreases (Fig. 1G)". It would be better to label the figure with the treatment information.

      Reply: Thanks for your advice. In the revised manuscript, we analyzed two additional datasets and moved the EMT result to supplementary Fig. EV8. In the new Fig. 1d, we marked the cell types along the reaction coordinate.

      *2 • Fig. 1G and EV1D: Why are the trends different?*

      Reply: In the original figures, Fig. 1g is the frustration score and EV1D shows the variation of the pseudo-Hamiltonian along the reaction coordinate. The frustration score is the focus of this work. We also calculated the pseudo-Hamiltonian since it has been used in the literature. However, we realized that showing both results might lead to confusion, so we deleted all pseudo-Hamiltonian results in the revised manuscript.

      *3 • How is the appropriate community/cluster/Louvain resolution selected? This can have a major impact on number of cell states, types and transition path from initial to final state.*

      Reply: The number of cell states and types, and the transition path from the initial to the final state, are not determined from the community/cluster/Louvain analyses. For the EMT data, we assume most cells at the initial treatment time point are epithelial cells, and those at the final time point are mesenchymal cells. For the other datasets, we followed the original publications to assign cell types based on known marker expression.

      The Louvain method was applied to coarse-grain the gene regulation network, and it does not affect the number of cell states, types and transition paths, which were determined separately. To address the reviewer’s question, we also used the Leiden method to adjust the resolution (1). The resolution does not affect the result. The results are added to Fig. EV12. We tried three different resolution values: 0.8, 1.0 and 1.2. The number of inter-community edges consistently shows the same trend: it increases first and then decreases.
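
      As an illustration of the quantity being compared across resolutions, the sketch below counts inter-community edges for a toy effective interaction matrix under two partitions. It is a minimal example under stated assumptions, not the authors' implementation: the function name `count_inter_community_edges`, the matrix `F_toy`, the tolerance `tol`, and both partitions are hypothetical, and the community labels are supplied by hand rather than produced by Louvain or Leiden.

```python
import numpy as np

def count_inter_community_edges(effective, communities, tol=0.0):
    """Count nonzero edges whose two genes lie in different communities.

    effective   : (n_genes, n_genes) array of interaction weights (hypothetical)
    communities : length n_genes sequence of community labels
    tol         : weights with absolute value <= tol are treated as absent
    """
    effective = np.asarray(effective, dtype=float)
    communities = np.asarray(communities)
    # True where the row gene and the column gene have different labels.
    different = communities[:, None] != communities[None, :]
    # True where an edge is present.
    present = np.abs(effective) > tol
    return int(np.sum(present & different))

# Toy network of four genes with four edges.
F_toy = np.array([
    [0.0, 0.5, 0.0, 0.2],
    [0.0, 0.0, 0.0, 0.0],
    [-0.3, 0.0, 0.0, 0.7],
    [0.0, 0.0, 0.0, 0.0],
])
coarse_partition = [0, 0, 1, 1]   # two communities
fine_partition = [0, 1, 2, 3]     # every gene in its own community

n_coarse = count_inter_community_edges(F_toy, coarse_partition)
n_fine = count_inter_community_edges(F_toy, fine_partition)
print(n_coarse, n_fine)
```

      In this toy case the absolute count differs between the two partitions, which is why a comparison across resolutions is most naturally made on the trend along the transition rather than on the raw numbers.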

      Figure EV12 Cell-specific variation of the number of effective inter-community edges calculated with different resolution parameter values for dentate gyrus neurogenesis (a), pancreatic endocrinogenesis (b), and bone marrow hematopoiesis (c). Each dot represents a cell and the color represents the number of inter-community edges.

      • *What effect does the Louvain resolution have on e.g. frustration scores?* Reply: The resolution of the community division algorithm does not affect the frustration scores, since the frustration score is based on the gene-gene interactions rather than on the community assignment.
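
      To make the independence from community assignment concrete, here is a small sketch of a frustration score. It assumes a definition in the style of the Tripathi et al. 2020 score mentioned in the review: an edge with weight J_ij is counted as frustrated when J_ij * x_i * x_j < 0, where x = 2s - 1 maps binary states to -1/+1, and the score is the fraction of frustrated edges. Whether the manuscript uses exactly this form is an assumption, and the function name `frustration_score` and the toy matrix are hypothetical.

```python
import numpy as np

def frustration_score(J, s):
    """Fraction of frustrated edges for a binary state vector s.

    J : (n_genes, n_genes) signed interaction matrix (hypothetical)
    s : length n_genes vector of 0/1 gene states
    Note that no community labels appear anywhere in the computation.
    """
    J = np.asarray(J, dtype=float)
    spin = 2 * np.asarray(s) - 1          # map {0, 1} to {-1, +1}
    # Elementwise J[i, j] * spin[i] * spin[j]; negative means frustrated.
    product = J * np.outer(spin, spin)
    n_edges = np.count_nonzero(J)
    if n_edges == 0:
        return 0.0
    return np.count_nonzero(product < 0) / n_edges

# Two genes that activate each other.
J_toy = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
score_both_on = frustration_score(J_toy, [1, 1])   # states agree with both edges
score_mixed = frustration_score(J_toy, [1, 0])     # states conflict with both edges
print(score_both_on, score_mixed)
```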

      • *The authors match resolution to samples/timepoints/known prior cell types i.e. 3-4 communities. However it is unclear whether this is enough to describe entire differentiation/transition process.* Reply: This is a good question. In a reply above we explained how the cell types were determined. We also agree with the reviewer that these coarse-grained communities cannot reflect the overall heterogeneity and dynamics of the whole process. Note that in most of our analyses (e.g., reaction coordinate and transition paths), we treated the transition as continuous, and the single-cell data points in all datasets cover the whole space involved in the cell phenotype transition. The coarse-grained analyses provide further mechanistic insight into how gene regulatory networks are reorganized during the transition process.

      • *Gene selection: The selection based on minimum 20 counts as highly expressed genes is arbitrary and dependent on sequencing depth. Perhaps the authors could show distribution of gene counts for the datasets and have a data-driven filtering criterion.* Reply: Thanks for the advice. The number 20 is a default value suggested in the package we use (scVelo); in another package, dynamo, the default number is 30. Following the reviewer’s suggestion (together with the next question on the influence of all highly variable genes), we looked for a data-driven filtering criterion. The method has been described in different tools (2-4). We first grouped the genes into 20 bins by their mean expression values, and scaled their dispersions by subtracting the mean of the dispersions and dividing by the standard deviation of the dispersions. Figure EV9 shows the distribution of the minimum shared counts. As one can see, most gene counts are larger than 10, and using a smaller value causes errors in the subsequent velocity analysis. Therefore we set the minimum shared counts to 10 in the new results.
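
      The binning and scaling step described in this reply can be sketched as follows. This is a simplified illustration and not the code of scVelo, dynamo or the authors: the function name `normalized_dispersion` is hypothetical, the dispersion values are taken as given, the use of equal-frequency bins is an assumption (the reply does not say how bin edges are chosen), and the scaling is assumed to be applied within each bin.

```python
import numpy as np

def normalized_dispersion(mean_expr, dispersion, n_bins=20):
    """Z-score each gene's dispersion within its mean-expression bin."""
    mean_expr = np.asarray(mean_expr, dtype=float)
    dispersion = np.asarray(dispersion, dtype=float)
    # Equal-frequency bin edges on mean expression (assumption).
    edges = np.quantile(mean_expr, np.linspace(0.0, 1.0, n_bins + 1))
    # Interior edges give bin indices 0 .. n_bins - 1.
    bin_idx = np.digitize(mean_expr, edges[1:-1], right=True)
    scaled = np.zeros_like(dispersion)
    for b in np.unique(bin_idx):
        members = bin_idx == b
        std = dispersion[members].std()
        # Bins with a single gene or constant dispersion are left at zero.
        if std > 0:
            scaled[members] = (dispersion[members] - dispersion[members].mean()) / std
    return scaled, bin_idx

# Synthetic example: dispersion drifts with mean expression before scaling.
rng = np.random.default_rng(0)
toy_mean = rng.gamma(2.0, 1.0, size=400)
toy_disp = toy_mean + rng.normal(size=400)
toy_scaled, toy_bins = normalized_dispersion(toy_mean, toy_disp, n_bins=20)
print(toy_scaled[:5])
```

      After scaling, genes can be ranked on a common scale regardless of expression level, which is the usual motivation for this kind of per-bin normalization.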

      Figure EV9 Shared counts distribution of the datasets. (a) Dentate gyrus neurogenesis; (b) Pancreatic endocrinogenesis; (c) Bone marrow hematopoiesis.

      • *The choice of 500 variable genes (for human A549 cells) is also quite arbitrary. Perhaps, the authors could compare how additional genes (all highly variable genes) affects their analysis and interpretation.* Reply: Thanks. Following the previous question on shared counts and data-driven filtering criteria, we now take all the highly variable genes into consideration. The details of gene selection and binarization are given in the Materials and Methods section (Materials and Methods 2).

      • *How do other factors (sequencing depth, genes detected, # of cell types, multiple branches) affect the connectivity between communities at different phases of transition/development?* Reply: This is a good question. The A549 EMT dataset has a sequencing depth of 40,000-50,000. The dentate gyrus neurogenesis dataset has a sequencing depth of 56,700 reads. A saturation depth would be close to 1,000,000, but there is a compromise between cell number and depth. Some genes are not detected even at saturating read depth, which is why preprocessing is needed. On the other hand, the network we inferred includes both direct and indirect interactions, so the influence of sequencing depth and the number of detected genes can be reduced to a certain extent. We used a random subset of the selected genes and performed the same analyses. The results are consistent with what we obtained using all the genes (Fig. EV11b). With the new gene selection criteria (Materials and Methods 2), our analyses do not depend on the number of cell types.

      We also analyzed another branch, the beta branch, of the pancreatic endocrinogenesis data. This branch shows the same results (Fig. EV4). There are two additional branches in the pancreatic endocrinogenesis dataset. It has been reported that the RNA velocity estimation for the epsilon branch is incorrect (3), and there are too few cells in the delta branch for reliable analyses. Therefore we did not present results for these two branches.

      Figure EV4 Analyses on the branch of insulin-producing β-cells in pancreatic endocrinogenesis.

      (a) Transition graph based on RNA velocity.

      (b) The RCs and corresponding Voronoi cells. The large colored dots represent the RC points (starting from blue and ending in red). The small dots represent cells, colored by cell type.

      (c) Frustration score along the RCs.

      (d) Cell-specific variation of effective intercommunity regulation. Each dot represents a cell. Color represents the number of effective intercommunity edges within each cell in the GRN.

        • Are the velocity graph, transition matrix and further shortest path estimation derived in a reduced latent space, and if so, how many PCs (nPCs) are used and what impact does this have? Presumably, the density estimation is not performed in expression space. Reply: Yes. The calculation of the transition matrix is based on neighbor information, and the neighbors were computed in the reduced latent space in scVelo and Dynamo. We performed the same analysis while varying the number of principal components. The results are similar because the first several components account for a large proportion of the variance. Figure R1 shows the results for dentate gyrus neurogenesis with 10, 20 and 30 principal components, respectively. In the revised manuscript, we removed the density-estimation constraint step to simplify the procedure.

      Figure R1 Frustration score along RCs (left) and cell-specific variation of the number of effective intercommunity edges in the GRN within each cell (right; each dot represents a cell and color represents the number of effective intercommunity edges) when using different numbers of PCs in the analyses (dentate gyrus neurogenesis): (a) number of PCs is 10; (b) number of PCs is 20; (c) number of PCs is 30.

      - *The figure legends and labels were hard to read. These should be improved for better readability.*

      Reply: Thanks. We modified the figure legends and labels.

      - *A suggestion would be to move the initial results section to methods and highlight the biological interpretation.*

      Reply: Thanks for your advice. We moved a large part of this section to the Materials and Methods.

      *The authors could highlight which GRNs and representative gene/edge pairs are highest ranked within inter-community edges and in the overall final stable states.*

      Reply: Thanks. We list some representative gene pairs for the different datasets in Tables EV2, EV3 and EV4. We also performed gene enrichment analysis for each community.

      - *How does the GRN inference compare to current state-of-the-art GRN inference scRNA-seq methods?*

      Reply: We used the GRISLI method to perform the same analysis (5). The results are similar to those obtained with our current method (Figure EV6). We want to emphasize that the focus of this work is not on another GRN inference method, but on discussing some general principles of GRN reorganization during a cell phenotypic transition process.

      Figure EV6 Analyses of datasets of dentate gyrus neurogenesis (a), pancreatic endocrinogenesis (b), and hematopoiesis (c) based on GRN inferred with GRISLI.

      (a) Frustration score along the RCs of dentate gyrus neurogenesis (left) and cell-specific variation of the number of inter-community edges (right). Each dot represents a cell and color represents the number of inter-community edges in GRN within each cell.

      (b) Same as in panel (a), except for pancreatic endocrinogenesis.

      (c) Same as in panel (a), except for hematopoiesis.

      - *How do extremely noisy/stochastic genes vary in metrics between final stable states? How are the metrics affected by number of cells and stochasticity of expression within a given cluster/community?*

      Reply: To address this question, we selected two genes with high variance, Id2 and Cdkn1c, and compared their distributions in the initial and final states. The gene distributions show a significant shift between the Ngn3 low EP cells and the Alpha cells (Fig. R2 a & b, left). We then randomly selected a subset (half) of the cells and compared the distributions of these high-variance genes in the sub-population (Fig. R2 a & b, right). The results are similar to the full-set results.

      Fig. R2 Comparison of gene distribution in the initial and final states in pancreatic endocrinogenesis. (a) Comparison of the distribution of gene Id2 at the initial and final states (left), and in the randomly selected sub-population at the initial and final states (right). (b) Comparison of the distribution of Cdkn1c at the initial and final states (left), and in the randomly selected sub-population at the initial and final states (right).

      - *Given that the authors' approach includes both direct and indirect gene effects, the authors could further prune genes based on existing TF databases or protein-protein validated networks.* Reply: This is a good suggestion. We will work on this idea in future work. As we mentioned, due to constraints of data quality, only tens of transcription factors can be analyzed in these datasets. We list some transcription factor regulations inferred with the current method in Table EV1.

      • *It is unclear which GRNs are already known and which ones are novel and biologically relevant.* Reply: We compared some of the regulations inferred with the method against published references in Table EV1.

      - *It would be good for authors to comment when there are multiple bifurcations instead of A-B transitions. Particularly in datasets with multiple discrete stable states.* Reply: This is a good question. In our analysis, we focus on the transition from one stable state to another stable state. For a transition process with multiple bifurcations, like pancreatic endocrinogenesis, the results are similar across different branches. For a transition that goes through multiple discrete stable states, for example a transition from state A → B → C, we expect to observe two peaks in the frustration score and in the number of inter-community edges. We added some discussion of this in the Discussion section.

      • *Another suggestion would be to highlight gene expression of selected markers based on f-regression and mi over the trajectory.* Reply: As we modified the criteria of gene selection, we plotted trajectories of some high-variance genes versus the reaction coordinate obtained with different datasets in Fig. EV10 based on the current criteria.

      Figure EV10 Typical trajectories of high-variance genes versus RCs of dentate gyrus neurogenesis (a), pancreatic endocrinogenesis (b) and bone marrow hematopoiesis (c).

      - *If possible, a proof of principle could be re-analysis of a perturbation scRNA-seq dataset (e.g. where one path/transition path is stalled).*

      Reply: Thanks. This is a really good suggestion. We will perform more systematic studies in future work.

      Reviewer #1 (Significance (Required)): *Nature and significance of advance: The study and ideas in the manuscript are interesting and the methods would potentially be useful to the community. Compare to existing published knowledge:*

      *Audience: Predominantly computational audience*

      *Your Expertise: PI with background in experimental, computational biology and expertise in single-cell genomic tools and developmental biology*

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Understanding the cellular and molecular basis of cell type or cell state transitions occurring during development or reprogramming is a fundamental challenge. scRNA-seq has provided a window into gene expression programs across thousands of cells undergoing such transitions. Wang and colleagues leverage scRNA-seq and develop an approach to reverse engineer gene regulatory network underlying cells along a path from one cell type/state to another, and characterize community-level properties of this network associated with various stages of the cell phenotype transition. The study is innovative and rigorous, and their results point to how intercommunity interactions increase and then decrease, indicating a concerted regulatory rewiring that orchestrates transitions. Application of their approach to three different datasets also shows that this trend is consistent across three different transitions and maybe a general trend. However, there are some major and minor concerns that need to be addressed.

      **Major comments and questions**

      1. The analogy to SN1 and SN2 mechanisms of chemical bond formation is very nice.
      2. What is the basis for the two statements made in paragraph 3 of Introduction (beginning with "A question arises ...") about transitions being sequential or concurrent? Please:

      Reply: Thanks. We added references in this paragraph.

      *2.1. Provide references to previous experimental and computational studies that have investigated developmental and reprogramming gene expression programs.*

      Reply: Thanks. We added a paragraph in the Introduction.

      *2.2. Describe specific examples of findings that support the two possible transitions highlighted here. Why couldn't transitions happen through an entirely gradual process involving changes to overlapping subsets of genes?*

      Reply: Thanks. In their review paper, Moris et al. proposed the hypothesis that a cell phenotype transition is similar to a chemical reaction (6). We extrapolate this hypothesis and test it in our study. As an example of the SN1 mechanism, Kalkan et al. showed that mouse embryonic stem cells can exit from naïve pluripotency but remain uncommitted (7).

      Just as the SN1 and SN2 mechanisms are two extremes in chemical reactions, with cases that lie in between, we agree with the reviewer that such a gradual process may exist for cell phenotypic transitions. In fact, the result in Fig. EV4d shows that the frustration score remains flat for the Fev+ → Beta transition, suggesting a possible gradual process. With the analyses provided in this work, such as the reaction coordinate, frustration score, heterogeneity, and inter-/intra-community edges, one may perform more systematic studies on a larger number of datasets and enumerate/classify possible patterns of transitions.

      • *Please make plots of the number of effective intra-community edges vs. number of active genes to support the statement that these two numbers are correlated.*

      Reply: We plotted the number of active genes within each community and calculated its correlation coefficient with the number of effective intra-community edges in dentate gyrus neurogenesis (Fig. EV1d). The correlation coefficients are 0.91, 0.96, 0.99 and 0.96 for communities 0, 1, 2 and 3, respectively.

      *A bunch of notations are not clear:*

      *4.1. What is the "r" in "strongest intercommunity interactions at r = 10 (Fig. 1F)"? Is it the same as the "r" mentioned in the Methods section?*

      Reply: r is the index of the discretized reaction coordinate. We now introduce it where we define the reaction coordinate. We also resolved the conflicting usage of r in Materials and Methods 4.

      *4.2. What is "s_i" in "cell-specific effective matrix, Fbar_ij = (2*s_i - 1)*F_ij"? Also, that description of F_ij, f_ij, and H should be moved to the Methods section, and a more high-level, intuitive description should instead be included in this Results paragraph.*

      Reply: s_i represents the binarized expression state of gene i: s_i is 0 when the gene is at a low expression level (silent) and 1 when the gene is at a high expression level (active). We modified this part following your advice.
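
      The formula quoted in this comment, Fbar_ij = (2*s_i - 1)*F_ij, can be written out as a short sketch. It only restates the quoted expression; the function name `effective_matrix` and the toy values are hypothetical, and no assumption is made here about whether index i denotes the regulator or the target gene.

```python
import numpy as np

def effective_matrix(F, s):
    """Apply Fbar_ij = (2 * s_i - 1) * F_ij for a binary state vector s."""
    F = np.asarray(F, dtype=float)
    s = np.asarray(s)
    # Map binary states {0, 1} to signs {-1, +1}.
    sign = 2 * s - 1
    # Broadcasting multiplies every entry of row i by the sign of gene i.
    return sign[:, None] * F

F_example = np.array([[1.0, -2.0],
                      [3.0, 4.0]])
s_example = np.array([1, 0])   # gene 0 active, gene 1 silent
Fbar_example = effective_matrix(F_example, s_example)
print(Fbar_example)
```

      Row 0 is unchanged because gene 0 is active, while row 1 flips sign because gene 1 is silent.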

      *How were the h_f and h_m thresholds chosen?*

      Reply: h_f and h_m are based on the distribution of each dataset. Following suggestions from another reviewer, we modified this part. All the highly variable genes were selected, and the genes were binarized with Silverman’s bandwidth method and K-means (Materials and Methods 2).
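
      As a rough illustration of data-driven binarization, the sketch below splits one gene's expression values into a low (0) and a high (1) group with a plain two-cluster, one-dimensional K-means. It is not the authors' procedure: the reply does not specify how Silverman’s bandwidth and K-means are combined, so the Silverman step is omitted entirely, and the function name `binarize_two_means`, the initialization at the data extremes, and the toy values are assumptions.

```python
import numpy as np

def binarize_two_means(x, n_iter=100):
    """Label each value 0 (low cluster) or 1 (high cluster) via 1-D 2-means."""
    x = np.asarray(x, dtype=float)
    # Start the two centers at the extremes of the data (assumption).
    lo, hi = x.min(), x.max()
    for _ in range(n_iter):
        # Assign each value to the nearer center; ties go to the low cluster.
        active = np.abs(x - hi) < np.abs(x - lo)
        if active.all() or not active.any():
            break
        new_lo, new_hi = x[~active].mean(), x[active].mean()
        if new_lo == lo and new_hi == hi:
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return active.astype(int)

expr = np.array([0.1, 0.0, 0.2, 5.0, 5.2, 4.9])
states = binarize_two_means(expr)
print(states)
```

      A gene whose expression is constant across cells is labeled 0 everywhere by this sketch, since no high cluster can be formed.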

      * What is the "density of each single cell" ("_t")? The formulation of the penalty of the distance between cells i and j (the expression with -logP_ij...) is unclear. What is the intuition behind it? What is r? How were the values of r (0.5 and 0.8) chosen? *

      Reply: The probability density of cells in the expression space is based on kernel density estimation. Intuitively, a region of the expression space that contains more cells is more likely to be traversed by more cell trajectories. The values of r are based on the distribution of the kernel density estimates in the different datasets.

      In the revised manuscript, we used trajectory simulation and removed this assumption for simplicity.

      * One of the reasons the authors state to justify the choice of PLSR is "In the scRNA dataset, the number of genes is often comparable to or larger than the number of cells." This is not true most of the time. In nearly all recent studies, the number of cells is way larger than the number of genes measured. *

      Reply: The PLSR method can certainly be used for data in which the number of cells is larger than the number of genes. In addition, PLSR was applied to the cells that are the k nearest neighbors of each reaction coordinate, which form a subset of the whole dataset (Materials and Methods 5). While we mainly present results obtained with PLSR, in this revised manuscript we also added results obtained with another method, GRISLI (Materials and Methods 9). The results are similar to those obtained with PLSR.

      * There is a fleeting reference to a nice previous finding that supports their observations: "several lines of evidence support that EMT proceeds through a concerted mechanism. Indeed, both in vivo and in vitro studies have identified intermediate states of EMT that have co-expressed epithelial and mesenchymal genes (Pastushenko et al, 2018; Zhang et al, 2014)". The authors should thoroughly survey the literature related to EMT transition, development of pancreatic endocrine cells, and development of the granule cell lineage in dentate gyrus, to find more previously identified molecular/cellular features relevant to cell state/type transitions, compared and contrasted with findings from this study. *

      Reply: Thanks. We added references on these cell phenotype transitions and modified the corresponding part. We do want to point out that the main focus of this work is that all these processes share a common feature of transient increase of intercommunity interactions.

      What is the "dynamo" package, which is supposed to contain a Python notebook? As of now, the code and data have not been made available. Both need to be released along with thorough documentation on how to run the code to reproduce the analyses described here.

      Reply: Thanks. Dynamo is a Python package accompanying our recent publication (8). We uploaded the code to GitHub and added a link to Dynamo.

      **Minor comments and questions**

      1. Replace "confliction" throughout the manuscript with "conflict" or "conflicting" as appropriate.

      Reply: Thanks. We modified them.

      Paragraph two of the Introduction (beginning with "Another example of transitions ...") is missing multiple references, esp. for the last four sentences.

      Reply: Thanks. We added references.

      There are direct quotes from previous papers like "predicts the future state of individual cells on a timescale of hours". The authors are highly encouraged to check for usage of exact phrasing using available text software such as iThenticate.

      Reply: Thanks a lot for pointing out this serious mistake. We re-edited the manuscript and checked it with iThenticate.

      • "Each community contains both E and M genes": what does this mean?

      Reply: The E (M) genes are defined as those genes that are active or have high expression levels in the epithelial (mesenchymal) state or sample. As we reorganized the manuscript, we added this explanation for all datasets in the caption of Fig. 1i.

      • Reference to Qiu 2021 is missing in the "Path analysis" subsection under Methods.

      Reply: We added it in the Methods.

      Fix: "transition between the cells that their sample time points are successive" in Methods.

      Reply: Thanks. We modified it.

      * In Methods, under "Network inference", it is "partial least square regression" (not *least* s square). *

      Reply: Thanks. We modified it.

      Figure 1: The cyan, magenta, and lime in 1C are very hard to see and, perhaps, the grey of the points can be made lighter. Also, change the red and green colors for the arrows in 1I to something else. These colors are not colorblind-friendly.

      Reply: Thanks. We re-plotted the figures and changed the colormap.

      • Periods and commas are missing at several places.

      Reply: Thanks. We fixed these and re-edited the manuscript.

      Reviewer #2 (Significance (Required)):

      The study uses RNA-velocity calculated from scRNA-seq data in an inventive way to characterize paths that reflect cell phenotype transitions. Then, a sparse gene regulatory network is reverse engineered from the data and the community structure within this network is examined at various stages along the transition to make observations about inter- and intra-community regulation and network "frustration". However, the study lacks the context of existing literature in terms of previous work studying cell transitions both experimentally and computationally. Adding this context (as suggested in the comments) will considerably improve the utility and significance of the findings. Overall, this study will be of broad interest to researchers interested in development and reprogramming as well as computational scientists developing and applying methods for scRNA-seq data analysis, trajectory inference, and network reconstruction. All the comments and questions raised here are based on my background and expertise in omics data (including scRNA-seq) analysis and network biology.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The authors analyze three datasets of Single cell RNA velocity measured during phenotypic transition. They infer the gene regulatory network in each case and characterize the transition between the initial and final expression states (in which different sets of genes are expressed). Their motivating question was to find whether during such transitions first genes characterizing the initial state are no longer expressed and only then the genes associated with the final state start expressing or alternatively there is gradual transition through an intermediate state in which subsets of both initial and final state genes are transiently expressed.

      They define a measure of regulatory frustration representing the mismatch between regulatory signals a gene receives and its current expression state. They conclude that phenotypic transitions involve transient interactions between otherwise non-interacting gene modules and a temporary increase of gene frustration, which is relaxed once the final expression state is reached.

      The study uses advanced inference and machine learning methods.

      I find the question studied in this manuscript interesting, opening avenues to further questions and studies, and relevant to different scientific communities. Personally, I think that the focus of the paper should be the exposition of the methods used; this manuscript would benefit from a longer format, but that depends of course on the journal they are aiming at.

      Statistical analysis is missing. Especially since the authors mention the potential of over-fitting due to large number of genes (on the order of the number of cells) - I think the authors should provide a sensitivity analysis testing how sensitive are the conclusions to the choice of cells or genes by applying the methods to subsets of the cells / genes.

      Reply: Thanks. For the subset of cells, we randomly selected cells from the dataset and performed the analyses (Fig. EV11a). For the subset of genes, we randomly selected a subset of genes and performed the analyses (Fig. EV11b). We found that the results are not affected. We also performed another sensitivity analysis by varying the resolution parameter of the community detection algorithm, and found that the conclusion on the variation of inter-community edges is not affected (Fig. EV12).

      Figure EV11 Statistical analyses of dentate gyrus neurogenesis. Each dot represents a cell and color represents the number of inter-community edges.

      (a) Frustration score along the RCs (left) and cell-specific variation of the number of inter-community edges (right) of a randomly selected sub-population of 2000 cells (from a total of 3184 cells);

      (b) Frustration score along the RCs (left) and cell-specific variation of the number of inter-community edges (right) of cells on the space of 400 randomly selected genes (from a total of 678 genes).
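      The subsampling step behind this kind of sensitivity analysis can be sketched as follows. This is a minimal illustration, assuming a cells-by-genes matrix; the matrix here is random placeholder data with the dimensions quoted in the caption (3184 cells, 678 genes), and the function name `subsample` is hypothetical rather than part of the authors' pipeline.

```python
import numpy as np

def subsample(X, n_cells=None, n_genes=None, seed=0):
    """Randomly keep n_cells rows and/or n_genes columns of X (no replacement).

    Returns the reduced matrix plus the sorted indices that were kept.
    """
    X = np.asarray(X)
    rng = np.random.default_rng(seed)
    n_total_cells, n_total_genes = X.shape
    cells = np.arange(n_total_cells)
    genes = np.arange(n_total_genes)
    if n_cells is not None:
        cells = np.sort(rng.choice(n_total_cells, size=n_cells, replace=False))
    if n_genes is not None:
        genes = np.sort(rng.choice(n_total_genes, size=n_genes, replace=False))
    return X[np.ix_(cells, genes)], cells, genes

# Placeholder expression matrix (random counts, for illustration only).
counts = np.random.default_rng(1).poisson(1.0, size=(3184, 678))

cell_subset, kept_cells, _ = subsample(counts, n_cells=2000)   # as in panel (a)
gene_subset, _, kept_genes = subsample(counts, n_genes=400)    # as in panel (b)
```

      The downstream analyses (network inference, community detection, frustration score) would then be rerun on each subset and compared with the full-data result.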

      What is the meaning of the distribution in the frustration plots?

      Reply: For each cell we calculated a frustration score. Therefore, for the cells in each Voronoi cell (a geometric cell, not to be confused with the biological cells) along the reaction coordinate (Fig. 1d, Fig. 2b and 2g), we obtained a distribution of frustration scores.

      In general, the conclusions are well-justified, but I think some statements in the discussion are inaccurate: "intercommunity interactions of a GRN are indeed minimized' - are they minimal or are they only lower at the stable states? There are two stable states - for which of them is intercommunity interaction lower? *

      Reply: Thanks. We agree with the reviewer and modified the writing. Compared with the transition state, the stable states have fewer intercommunity interactions. The quality of the datasets is not high enough for us to investigate whether "intercommunity interactions of a GRN are indeed minimized".

      It is written in the discussion that 'for all three datasets frustration decreases with differentiation', but then Fig. 1g shows the opposite (final state is more frustrated than initial state). It is interesting to discuss the differences between the datasets analyzed in that respect and what could cause transition to a more frustrated state. I suggest that the authors also refer in the discussion to related questions and possible follow-up studies, such as: what determines the duration of the phenotypic transition? A relevant number is the switching time of a single gene. *

      Reply: Good suggestion. Compared to the other datasets, we found that the result for EMT shows larger variances. The relative difference of the frustration score is also affected by the GRN inference algorithm. For example, the difference between the initial and final frustration scores of pancreatic endocrinogenesis is more significant when using the GRISLI method (Figure EV6b). Despite these differences, the trend that the frustration scores transiently increase in the transition states remains consistent.

      Our conclusion is limited by the quality of the data, so we deleted this part of the discussion in the manuscript.

      Qiu et al. have shown that splicing-based RNA velocities are relative, while metabolic-labeling-based RNA velocities are more quantitative and accurate (8). We will re-analyze this problem if data with metabolic labeling become available.

      * The authors mention at the end that the networks can often reach multiple final states from a common initial states. Do such transitions share some of their path (and in particular the intermediate frustrated state)? Given the intermediate connected state, it would be interesting to characterize the network stability to perturbations. *

      Reply: This is a very important question. To reliably address these questions, we need higher quality data. We plan to characterize the network stability to perturbations in future studies, while in our recent paper using a full nonlinear modeling framework (8), we performed in silico perturbations.

      * While interesting, the manuscript itself is unfortunately hard to read and would benefit from major editing, including better exposition of the science and language editing. *

      Reply: Thanks. We revised the manuscript extensively.

      Methods: Description of PCA and 'revised finite temperature string method' are missing in the Methods section.

      Reply: Thanks. PCA is used in RNA velocity analysis for dimension reduction. We added this in Materials and Methods 3. The revised string method is in Materials and Methods 4.

      Some examples:

      Figure captions are very short and often non-informative. Some variables are not defined (or only defined later on) and the reader then needs to guess their meaning: it took me a while to understand what is 'r' in Fig. 1f and what 'r=10' (p. 4) means.

      Reply: Thanks. r represents the index number of reaction coordinates. We added this in the manuscript where we define reaction coordinates.

      p. 4: what are 'f' (as opposed to F) and 's_ij' and 's_j' (expression states?) Or is fs_ij one variable? What does a Hamiltonian of a cell mean (p. 4, bottom)?

      Reply: F_ij is the regulation of gene j on gene i, and s_i is the expression state of gene i (0 for silent, and 1 for active expression). f_ij is the frustration value of the regulation from gene j to gene i.

      The pseudo-Hamiltonian value was proposed in the literature by analogy with magnetic systems, following the work on a Boolean model of EMT (9). A high Hamiltonian value indicates that the cell is in an unstable state. In the original manuscript we included this quantity since it has been discussed in the literature. However, we found that it causes confusion and is not necessary for our discussions, so we removed the pseudo-Hamiltonian results from the revised manuscript.

      P. 4: how are 'E and M genes' defined?

      Reply: The E (M) genes are defined as those genes that are active or have high expression levels at the epithelial (mesenchymal) state or sample. We explained our general strategy in the caption of Fig. 1i.

      What does 'network heterogeneity' (p. 5) mean?

      Reply: Network heterogeneity measures how homogeneously the connections are distributed among the genes (10). A high heterogeneity means that some genes have a high degree of connectivity (the so-called hubs), while others have a low degree of connectivity.
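      To make the intuition concrete, here is a minimal sketch of one simple degree-based heterogeneity proxy: the variance of the node degree divided by its mean. This proxy is an assumption chosen for illustration and may differ from the exact definition used in reference (10) or in the manuscript; the function name is hypothetical.

```python
import numpy as np

def degree_heterogeneity(A):
    """Variance-to-mean ratio of total (in + out) degree for adjacency matrix A.

    Any nonzero entry A[i, j] is counted as one edge. Returns 0.0 for an
    empty network. This is an illustrative proxy, not a reference definition.
    """
    adj = np.asarray(A) != 0
    degree = adj.sum(axis=0) + adj.sum(axis=1)
    mean = degree.mean()
    return float(degree.var() / mean) if mean > 0 else 0.0

n = 5
# Directed ring: every node has the same degree, so heterogeneity is zero.
ring = np.zeros((n, n))
for i in range(n):
    ring[i, (i + 1) % n] = 1.0

# Star: node 0 is a hub connected to all others in both directions.
star = np.zeros((n, n))
star[0, 1:] = 1.0
star[1:, 0] = 1.0

h_ring = degree_heterogeneity(ring)
h_star = degree_heterogeneity(star)
```

      A hub-dominated network such as the star scores higher than the ring, matching the description of hubs given in the reply.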

      Fig. 1 is too tiny and hard to read and details are missing.

      Reply: Thanks. We modified this figure and caption.

      A glossary for all the acronyms used would be very helpful.

      Reply: Thanks. We added a glossary to the manuscript.

      Language (some examples):

      p. 5 bottom: Another system is on development... invitro -> in vitro

      p. 6: 'measure on developmental potential' -> measure of...

      Reply: Thanks. We modified these and re-edited the whole manuscript.

      Reviewer #3 (Significance (Required)):

      This study presents a methodological advance in demonstrating the application of data analysis methods to study developmental phenotypic transitions. High throughput measurements and computation power available today enable putting to test theoretical conjectures, as made by Waddington. I think this is a promising line of research, which could be used to further develop the computational methods as well as to further our understanding of developmental transitions and potentially develop associated mathematical modeling frameworks.

      This study should be of interest to a diverse readership composed of developmental biologists as well as to quantitative biologists and CS researchers applying optimization techniques and data analysis methods to high-throughput biological data.

      I am not an expert on the computational methods applied in this manuscript and hence cannot assess their correct use and statistical analysis.

      1. Traag VA, Waltman L, & van Eck NJ (2019) From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports 9(1):5233.
      2. Stuart T, et al. (2019) Comprehensive Integration of Single-Cell Data. Cell 177(7):1888-1902.e1821.
      3. Bergen V, Lange M, Peidli S, Wolf FA, & Theis FJ (2020) Generalizing RNA velocity to transient cell states through dynamical modeling. Nature Biotechnology 38(12):1408-1414.
      4. Wolf FA, Angerer P, & Theis FJ (2018) SCANPY: large-scale single-cell gene expression data analysis. Genome Biology 19(1):15.
      5. Aubin-Frankowski P-C & Vert J-P (2020) Gene regulation inference from single-cell RNA-seq data with linear differential equations and velocity inference. Bioinformatics (Oxford, England) 36(18):4774-4780.
      6. Moris N, Pina C, & Arias AM (2016) Transition states and cell fate decisions in epigenetic landscapes. Nature reviews. Genetics 17(11):693-703.
      7. Kalkan T, et al. (2017) Tracking the embryonic stem cell transition from ground state pluripotency. Development 144(7):1221-1234.
      8. Qiu X, et al. (2022) Mapping Transcriptomic Vector Fields of Single Cells. Cell 185(4):690-711.
      9. Font-Clos F, Zapperi S, & La Porta CAM (2018) Topography of epithelial–mesenchymal plasticity. Proceedings of the National Academy of Sciences 115(23):5902-5907.
      10. Gao J, Barzel B, & Barabási A-L (2016) Universal resilience patterns in complex networks. Nature 530(7590):307-312.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We are very grateful to the three referees for their constructive comments and suggestions which have helped improve the quality of our manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In the publication HAT-field: a very cheap, robust and quantitative point-of-care serological test for Covid-19 by Joly and Ribes, the authors describe an adaptation of, and an improved protocol for, their previously published haemagglutination-based test to detect antibodies to SARS-CoV-2 in patient blood (Townsend et al., 2021). In detail, they analyzed the effect of several adaptations, including buffer optimization, plate coating, and the use of patient whole blood instead of washed RBCs and plasma. Additionally, they tested different temperatures and the stability of the reagents, namely the nanobody-RBD construct IH4-RBD. For validation, they compared their optimized HAT-field assay with Jurkat-S&R as a FACS-based assay.

      Major comments:

      Introduction: This section is rather short and could benefit from a broader overview of currently established methods and assays to detect appropriate immune responses against SARS-CoV-2. The authors are advised to summarize the current literature in the field more comprehensively and not only focus on their own work.

      Response: Hundreds of different tests to monitor immune responses against SARS-CoV-2 have been described to date, and the literature on these various tests is vast, with new articles coming out almost on a daily basis. We feel neither that the introduction of our rather technical paper would benefit from being lengthened by such a review of the current literature, nor that we are competent to carry out such a summary. Following the referee's suggestion, we have, however, introduced a new sentence and given three references providing relatively recent overviews on the subject of immune-monitoring.

      Cross-reactivity with IH4-RBD. In Figure 6, the authors highlight the samples in red and orange that showed cross-reactivity with IH4-RBD. In their discussion, however, the authors state that only 2 of 60 (3%) were cross-reactive. In making this statement, they ignore the proportion of cross-reactive samples that were also positive in the Jurkat S&R assay. Therefore, the authors should acknowledge in the discussion that the actual number of cross-reactive samples was higher.

      Response: The statement in the discussion about 2 cross-reactive samples out of 60 concerns the results obtained after an incubation of one hour under normal gravity, and not the two red dots in each of the three graphs of Figure 6. Those red dots correspond to the two negative samples that gave false-positive results in HAT plasma titrations after spinning (Figure 6C), for which we correctly state in the discussion that 12 samples showed cross-reactivity on IH4 alone. The data presented in Figure 6B correspond to HAT-field after spinning, for which we correctly state in the discussion that 5 out of 60 showed cross-reactivity (4 orange dots + 1 red dot, the second red dot having a score of 0, in accordance with the fact that this sample showed no cross-reaction on IH4 alone in HAT-field after spinning).

      To try to prevent this possible confusion, we have now clarified what data we are referring to at the start of that paragraph in the discussion.

      Quantitative Assay. Since the HAT assay does not allow determination of the absolute number of antibodies reactive to SARS-CoV-2 in the blood samples, the authors should refrain from claiming that the HAT-field is a quantitative assay.

      Response: Since immune sera are inherently polyclonal, they contain a multitude of different types of antibodies of different affinities and avidities, and we are not aware of any technique that allows the "absolute number" of antibodies directed against a given antigen to be determined in such samples.

      For many serological tests, including ELISA and the initial protocol of HAT, serum or plasma titrations are used as a means to obtain what is widely considered a quantitative evaluation of the amounts of antibodies in blood samples. Even FACS-based assays such as the Jurkat-S&R-flow test we have used are commonly considered quantitative, but they only provide relative results and not absolute numbers.

      We consider that the close correlations we find between the results of the HAT-field protocol and those of the Jurkat-S&R-flow test, as well as with serum titrations using the standard HAT protocol, warrant considering the results of HAT-field to be as quantitative as those obtained with all those other tests.

      Morphological read out. For field application, the morphological description of the observed deposits ("teardrop" vs. "button") could be problematic and might lead to bias depending on the user. Thus, the authors should provide a clearer description for phenotype classification.

      Response: We have now introduced a specific paragraph detailing how to score HAT assays in the Methods section, as well as a new figure providing a graphic description of positive, partial and negative RBCs deposits.

      Minor comments: Title: the authors should remove "very"

      Response: We have now removed the word 'very' from the title, and thank the referee for this helpful suggestion.

      By the way: What are the costs of IH4-RBD for a 96 well plate? Who will produce this reagent? Is the sequence of the IH4 fully disclosed?

      Response: As specified in our original paper (see Townsend et al. 2021), the plasmid coding for the IH4-RBD is available upon request from Alain Townsend (Oxford, UK). Furthermore, his laboratory funded the production of 1 gram of the IH4-RBD reagent by a commercial company, and Professor Townsend has been graciously sending aliquots of 1 mg of this reagent, which suffice for several thousand tests, to all the laboratories that have requested it from him.

      In its initial format, HAT only required 100 ng of IH4-RBD per well, corresponding to a cost of 0.0027 £ per well. For the HAT-field protocol, 5 times more reagent is needed, thus bringing the cost of the reagent to 1.5 cts per test, to which one would have to add a similar cost for the IH4-reagent alone. This would thus bring the cost of the two reagents to approximately 3 cts, which is still lower than the price of any of the cheap disposable plasticware necessary for the test (lancet, pipet, plastic tube and portion of a plate).
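      The arithmetic behind these figures can be laid out as a short sketch. It uses only the quantities quoted above (100 ng and 0.0027 £ per well for standard HAT, a five-fold increase for HAT-field, 1 mg aliquots); variable names are illustrative, and no currency conversion or rounding to the "cts" figures is attempted here.

```python
NG_PER_MG = 1_000_000                 # nanograms in one milligram

standard_ng_per_well = 100            # IH4-RBD per well, standard HAT
standard_cost_gbp_per_well = 0.0027   # quoted reagent cost per well
field_factor = 5                      # HAT-field uses 5 times more reagent

# Reagent amount and cost per HAT-field test (IH4-RBD only).
field_ng_per_test = standard_ng_per_well * field_factor
field_cost_gbp_per_test = standard_cost_gbp_per_well * field_factor

# Number of wells or tests covered by a single 1 mg aliquot.
standard_wells_per_mg = NG_PER_MG // standard_ng_per_well
field_tests_per_mg = NG_PER_MG // field_ng_per_test

print(f"HAT-field: {field_ng_per_test} ng and about "
      f"{field_cost_gbp_per_test:.4f} GBP of IH4-RBD per test")
print(f"1 mg covers {standard_wells_per_mg} standard wells "
      f"or {field_tests_per_mg} HAT-field tests")
```

      Under these numbers a 1 mg aliquot covers a few thousand HAT-field tests, in line with the "several thousand tests" mentioned in the response.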

      The sequence of the IH4 nanobody is indeed fully disclosed (see figure 1 of Townsend et al. 2021), and has actually been protected by a patent ( US9879090B2 ). Whilst IH4 can be used freely for research purposes, licensing rights would have to be taken into consideration by any health authority wishing to use the technique broadly, or for any commercial distribution.

      The usage of the CR3022 as positive control for neutralizing antibodies should be reconsidered since this antibody does not confer viral neutralization. Other well-described antibodies blocking the ACE2:RBD interface might be better suited.

      Response: CR3022 was the one that we had at our disposal, but other mAbs can certainly be used instead as positive controls, and this is actually indicated in the detailed HAT-field protocol provided. Since the positive control only serves to ensure that the IH4-RBD has not been degraded and works as well as expected, and that any negative samples are not due to a very rare glycophorin mutation that could prevent IH4 from binding to it at the surface of RBCs, we are not sure why using a mAb with neutralizing activity would necessarily be better than the CR3022 mAb.

      Figure 2: Please state the concentration of IH4-RBD used. As stated in the figure legends for Figure 2 B, the authors should show the result of all 4 replicates (incl. SD).

      Response: The concentration of IH4-RBD was 1 µg/ml, i.e. the normal concentration for standard HAT tests. This was already indicated in the Methods section, but has now been added to the legend of Figure 2.

      Whilst 4 experiments were indeed carried out, which all gave similar results, i.e. showed that using PBS-N3 or PBN did not hinder HAT performance, but could instead result in a slight increase in HAT sensitivity, those various experiments were not all exact replicates of the experiment shown in Figure 2. Furthermore, those experiments were spread over a period of more than a year and used different reagents, thus precluding numerical comparisons between the various results. We have clarified this issue by rewording the final statement to "Comparable results were obtained in four similar experiments."

      Figure 3: Although the authors showed stability of IH4-RBD at 2 µg/ml they do not provide data for the stabilities at higher dilutions. As the authors suggest to predistribute the IH4-RBD in plates they should at least discuss this issue.

      Response: We thank the referee for raising this valid point, which has now been discussed in the paragraph entitled "Practical considerations for performing HAT assays" in the Methods section: "One aspect that will have to be considered for the design and use of such individual strips of wells will be to ensure that, upon storage, the various dilutions of IH4-RBD are as stable in such strips as the working stocks of IH4-RBD (2 µg/ml) tested in Figure 3."

      Figure 6/Supplementary Figure 1 and 3. The presentation of the data is not accurate, as many of the points (samples) are obviously identically positioned in the graph. The authors should choose a different representation of their data. E.g. they could adjust the size of the points to the number of overlapping samples.

      Response: We thank the referee for raising this issue, which was also pointed out by referee #2. This apparent inaccuracy is due to the fact that, on these plots, the scales for both the X and Y axes use discrete values, which indeed results in multiple points overlapping on top of one another. This was resolved by adding numbers next to the positions where several dots overlapped.

      Wording / text length. In the current manuscript the text is very long. Thus, the authors should shorten it to report the essential findings more appropriately. Additionally they should check for correct English wording.

      Response: We thank the referee for this remark, which helped us realize that the excessive length of the manuscript was mostly due to an extensive discussion of highly technical and practical points. The corresponding paragraphs were indeed out of place in the general discussion. They have not been deleted but have been moved to the Methods section, since we feel that they contain very important information for people who would actually start performing HAT assays.

      Reviewer #1 (Significance (Required)):

      In summary, the authors describe the HAT-field test as a simple PoC test for the detection of SARS-CoV-2 antibodies in patients. Because of its ease of use and robustness, the test appears to be particularly well suited for use in countries with underdeveloped health care or limited testing facilities, as also reported previously. The value of this manuscript lies mainly in the detailed description of the protocol and its validation. In this context, the adaptations described are certainly useful and helpful from a practical point of view, but do not provide significant new scientific insights. In light of these considerations, we recommend that this work be submitted to an appropriate journal specializing in the publication of such methods.

      Expertise: The reviewers have established and published different serological assays to monitor immune responses against SARS-CoV-2.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this paper, the authors developed a feasible protocol for an affordable point-of-care serological test for SARS-CoV-2. This method was adapted from the HAT plasma titration test that the authors previously published. Specifically, the test utilizes a 96-well plate pre-coated with the RBD of the SARS-CoV-2 spike glycoprotein fused to a red-blood-cell-targeting nanobody (IH4). By adding microliter amounts of the blood or plasma samples to the plate, it allows the detection of antibodies against RBD by measuring the level of hemagglutination. In the current upgraded protocol (so-called HAT-field), the authors made major modifications including optimizations of buffer and experimental protocol and the use of pre-titrated IH4-RBD on the plate, which collectively helped to lower the sample consumption, improved the stability and the sensitivity of detection, and made the test more user-friendly under non-clinical settings.

      Major comments: My major concerns are related to the robustness and quantitative capability of this approach. Specifically: It seems that multiple variables may impact the results. These include volume of droplets, the presence/absence of serum IH4 or BSA cross-reactive antibodies, and the amount (%) of red blood cells which may vary substantially among samples. Could you find a way to normalize the results (e.g., the discrepancy shown in Figure 6) instead of only leaving them as false positives or false negatives?

      Response: Regarding the volume of the droplets, in other words, the amount of blood collected and used in an assay, two sentences in the manuscript underline the fact that this is not a critical variable:

      In the Results section “the precise volume of blood collected is not critical; it may vary by as much as 30% with no detectable influence on the results.”

      In the discussion: “On this subject, we have found that increasing the amount of whole blood per well (in other words using blood that is less dilute) has very little influence over the HAT-field results, and, if anything, adding more blood can sometimes reduce the sensitivity, albeit never by more than 1 dilution.”

      Consequently, the % of RBCs in samples seems unlikely to influence the HAT-field scores significantly. This is supported by the fact that, although men tend to have higher hematocrits than women, we have not noticed any detectable difference between men and women in the correlation of the HAT-field scores with those of the Jurkat-S&R-flow test.

      We are not sure that we fully understand which discrepancy in Figure 6 the referee is pointing to. If it concerns the increase in the number of samples found to cross-react with IH4 alone when the sensitivity increases, we propose in the discussion to perform tests using titrations of the IH4 nanobody alone in parallel with the IH4-RBD reagent, so as to minimize the number of samples that would be identified as false positives if only one concentration of IH4 alone were used as a negative control. Comparing the titers obtained with IH4-RBD and IH4 alone will then provide some level of normalization for the samples cross-reacting with IH4. As for the hypothetical presence of antibodies cross-reacting with BSA alluded to by the referee, since such antibodies would not bind to RBCs, we do not think they would affect the HAT results.

      Second, the score of the HAT-field ranges from 0 - 8. However, based on the current manuscript, it is not clear how the scoring and scaling works. How is the noise (non-specific antibody signal) defined here?

      Response: We have now introduced a specific paragraph and a new figure detailing how to score HAT assays in the Methods section.

      In addition, it is unclear how to translate the HAT-field score into a meaningful measure of protection by serum antibodies.

      Response: Documenting the correlation between HAT-field scores and levels of protection against SARS-CoV-2 infections and/or Covid-19 severity would indeed be extremely interesting. This would, however, require setting up a large-scale clinical trial carried out over several months. This type of work could only be carried out by a large consortium including clinicians or, preferably, a national health agency. This was, however, far beyond the reach of this initial project, which was based on the work of a single person on a shoestring budget.

      Can you provide more evidence to demonstrate that the test is quantitative? For example, performing additional orthogonal experiments to better validate the scoring and generate a correlation function?

      Response: Although it would have been very interesting to perform additional serological tests from commercial sources on the samples of our cohort, such tests are all very expensive (e.g. ca. 500 € for one ELISA plate). This was in fact the main reason for developing the Jurkat-S&R-flow test in the first place, since it is much cheaper, more modular, and at least as sensitive as ELISA (see Maurel Ribes et al. 2021). The funds for this whole project came from a single 15 k€ grant obtained from the ANR, and we simply did not have access to the funds or the human resources to carry out such experiments based on commercial serological tests.

      Minor comments: Figure 6: are all results included? To me, it does not seem that all 60 samples data were included in the plot.

      Response: We thank the referee for raising this issue, which was also pointed out by referee #1. This apparent inaccuracy is due to the fact that the scales for both the x and y axes use discrete values, which results in multiple points overlapping on top of one another. This was resolved by adding numbers next to the positions where several dots overlapped.
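
      The overlap problem described in this response can be illustrated with a minimal sketch. It counts how many samples share each (x, y) position on two discrete scales, so that a count can be printed next to stacked dots. The function name `overlap_labels` and the score pairs are hypothetical and are not taken from the manuscript.

```python
from collections import Counter

def overlap_labels(points):
    """Return {(x, y): count} for every position shared by more than one sample."""
    counts = Counter(points)
    return {pos: n for pos, n in counts.items() if n > 1}

# Hypothetical (HAT-field score, reference test score) pairs on discrete scales.
scores = [(0, 0), (0, 0), (3, 4), (3, 4), (3, 4), (5, 6)]

# Positions that would need a number next to the dot in the scatter plot.
labels = overlap_labels(scores)
print(labels)
```

      Positions occupied by a single sample are left out, so only stacked dots receive a label.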

      There are several redundant statements in the discussion and results section. Please make the text more concise.

      Response: The discussion has now been shortened considerably, mostly by moving the paragraphs pertaining to technical considerations to the Methods section.

      Reviewer #2 (Significance (Required)):

      The current paper builds upon and improves previously published work. In addition, similar approaches have been published. It is unclear whether the current method is superior to other works.

      Response: Whilst we have made no statement regarding whether the method we describe is superior to other methods, we are pretty confident that very few alternatives will be as frugal and simple as the HAT-field protocol described here. As alluded to in the final paragraph of the discussion, two recent reports have described that HAT could be performed on cards rather than in V-shaped wells, with semi-quantitative results being obtained in minutes. If such card-based approaches turn out to provide sensitivity and reliability comparable to those of the HAT-field protocol, they will certainly represent very interesting alternatives. As stated in our manuscript, we would be very interested if the comparative evaluation of the two approaches could be carried out by one or several independent third parties.

      My research involves the development of antiviral antibody therapeutics. This method may be used as a point-of-care tool for the measurement of serologic response to RBD in less developed countries. However, due to the high vaccination rate and large infected populations, the overall needs for such detection drastically decrease. The significance of the work and utilities of the test may expand with more experiments related to the variants.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This paper describes a low-cost, robust and quantitative serological test based on haemagglutination, which could be used in resource-limited settings for evaluating population-based and vaccine-induced immunity. Neutralising antibodies to the receptor binding domain (RBD) on the SARS CoV-2 spike protein are an immunological correlate of protection. The HAT has a single reagent: the RBD of SARS CoV-2 linked to a monomeric anti-erythrocyte single-domain nanobody. When human polyclonal serum antibodies bind to the RBD, they cross-link and agglutinate human red blood cells, resulting in haemagglutination which can be read visually.

      This paper thoroughly evaluates the stability of the HAT reagents used to measure human and monoclonal antibodies, examining the robustness of the HAT reagent. It provides a comprehensive protocol for conducting field-based HAT with limited reagents. The test can evaluate whether subjects have been infected, using a simple finger prick to detect RBD-specific antibodies. The field HAT can also be used to identify people who may be susceptible to reinfection or in need of vaccination. With the use of RBDs from the variants of concern, the test can be rapidly adapted to evaluate antibodies as new variants arise, to evaluate surrogate correlates of protection, to allow timely evaluation of vaccine effectiveness, and to predict the need for vaccine booster doses. The data are very comprehensively presented, with good figures demonstrating the most appropriate buffer to store the IH4-RBD reagent and the robustness of the HAT over time at different temperatures. No additional experiments are needed and suitable numbers of replicates are included. All data, methods and reagents are comprehensively described.

      Minor comments: The paper is well written but rather long in places and may have benefited from being more succinct.

      Response: The excessive length of the manuscript was mostly due to an extensive discussion of highly technical and practical points. The discussion has now been shortened considerably, mostly by moving the paragraphs pertaining to technical considerations to the Methods section.

      Panels in figures could be labelled as A, B, C etc. to help in identifying the correct panel.

      Response: We thank the referee for this helpful suggestion, which we have followed.

      I would avoid the use of "experiment" and "project" and refer instead to "next we confirmed...", "in this paper" or "our results show". Please make sure all abbreviations are defined upon first use. Perhaps mention early in the paper that most of the work was conducted with the Wuhan RBD.

      Response: We thank the referee for these helpful suggestions, which we have followed to the best of our abilities. The abstract now contains a mention of the fact the work on optimizing the protocol was carried out with the IH4-RBD carrying the Wuhan version.

      Figure 2: I would suggest placing either a solid line between the two halves of the plates to make it easier for the reader to differentiate between the two antibodies. It also would have been easier to read if the bottom PBS, PBS-N3 and PBN were at 45 degree angle. In B include the serum name (e.g. serum 197).

      Response: We thank the referee for these helpful suggestions, which we have followed.

      Legend to figure 4: please include the serum numbers after covid-19 patients. Perhaps include arrows to demonstrate the dilutions of serum and IH4-RBD in the figure.

      Page 6 it might be easiest to use the same times as in figure 6 and use for example more than one year in the discussion

      Response: We thank the referee for these helpful suggestions, which we have all followed.

      Legend figure 6 perhaps replace dots with circles page 10 include the R values from figure 6 in the description of results.

      Response: We are grateful to the referee for these helpful suggestions, but have not followed them since we do not feel that these changes would be real improvements.

      Page 12 of note perhaps this can be moved to the methods ?

      Response: This, and several other paragraphs of the Discussion, have now been moved to the Methods section.

      Supplementary figure 2 A can be seen, is something missing here?

      Response: An "s" was indeed missing: “A can be seen” has been corrected to “As can be seen”.

      Reviewer #3 (Significance (Required)):

      This paper describes a simple, rapid field test for evaluating antibodies to the receptor binding domain of the spike of SARS CoV-2 using the Wuhan and delta variants. Whilst high-income countries can provide booster doses, extensive testing (either lateral flow or RT-PCR based) and contact tracing to control the waves of the pandemic, low-income countries have had limited access to Covid vaccines, and the extent of previous waves of the pandemic in their populations is unknown.

      This paper describes a robust and simple test for investigating human antibodies to SARS-CoV-2 which could be performed in resource-limited settings, providing a very useful tool for monitoring infection in the community and potentially for prioritising the scarce COVID-19 vaccines available.

      This study builds upon the work conducted on the HAT and has extensively studied and optimised the test so that it could be used globally. This paper provides a comprehensive protocol and has simplified the test to ensure it could be used in LMICs.

      This paper would be of great interest to a wide scientific audience who are interested in a rapid low-cost test to evaluate population based and vaccine induced immunity.

      Reviewer: serological assays for use in virology and vaccinology. Suitable competence to review the whole paper.

    1. Author Response:

      Evaluation Summary

      Challa and Ryu et al. systematically evaluated various combinations of ADP-ribose-binding modules to make sensors detecting poly(ADP-ribose). They developed and tested two indicator designs optimized for analyses in cell culture (dimerization-dependent GFP-based) or intact tissues (split Nano luciferase-based). Overall, with further experimental controls and quantification, this timely set of cell biology probes will be useful to study the biological functions of ADP-ribosylation in cultured cells and whole organisms.

      We appreciate the positive and encouraging words from the reviewer. We also appreciate the helpful comments, criticisms, and suggestions, which we have endeavored to address fully.

      Reviewer 1 (Public Review):

      While these tools are more sensitive than existing tools, it is unclear whether a dynamic range of 6-fold (GFP) and 3-fold (luciferase) provide sufficient sensitivity for properly understanding the PAR dynamics (which was thought to increase as much as 100-fold in DNA damage settings). In addition, it is unclear whether the fold increases in both fluorescence and luminescence linearly correlate with the traditional measures by western blot.

      We are pleased that the reviewer found our sensors to be potentially useful. The reviewer provided a number of excellent comments and suggestions that have served as a useful guide for improving our paper. We have carefully considered all of the comments, insights, and suggestions from the reviewer and revised the manuscript accordingly. We think this has strengthened our conclusions and improved the paper considerably. We thank the reviewer for the careful and thorough review of our paper.

      Figure 1F indicates on the western blot that there was a precipitous drop of PARylation after 5 min, but the GFP signal indicated a linear drop. It will be important to quantify the signals on western blots and test how they correlate with the GFP/luciferase data in scatter plots for the various sets of data. Would this system under-estimate the changes and not be sensitive enough to subtle changes, such as the 1-2 fold changes measured by traditional means?

      We agree with the reviewer that a comparison with existing PAR detection technologies will improve the manuscript. We now performed a comparative analysis of ELISA, Western blot, and immunofluorescence assays with live cell imaging using PAR-T GFP (Figures 6A, 6C, 6D). The results indicate that the detection range of PAR-T ddGFP is comparable to the established PAR detection assays. In addition, we also compared the live cell luciferase assays using PAR-T NanoLuc to Western blotting (Figure 6B) and found that these two assays are able to detect PAR changes at comparable levels. We would also like to emphasize that these sensors were developed to improve our ability to detect PAR changes in living cells and animals, which the existing techniques are not capable of doing.
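
      As a hedged illustration of the scatter-plot correlation requested by the reviewer, the sketch below computes a Pearson correlation coefficient between two sets of paired fold changes. The function `pearson_r` and the numerical values are hypothetical placeholders, not data from the manuscript, and a linear correlation is only one of several possible ways to compare the assays.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)

# Hypothetical fold changes relative to untreated cells (illustrative only).
western_blot = [1.0, 2.1, 3.9, 8.2]
sensor_signal = [1.0, 1.8, 3.1, 5.9]

r = pearson_r(western_blot, sensor_signal)
print(f"Pearson r = {r:.3f}")
```

      A value of r close to 1 would indicate that the two readouts scale together linearly, although it would not by itself show that they span the same dynamic range.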

      Similarly, how is their quantitation in Figure 2 compared with traditional immunofluorescence?

      We performed this comparison and observed that the changes in PAR levels as detected by live cell imaging using PAR-T ddGFP are comparable to the changes detected in immunofluorescence assays using the WWE-Fc reagent (Figure 6D and 6E).

      Lastly, for the luciferase signal in Figure 3B and C, the corresponding signals in western blots are missing. Therefore, it is difficult to estimate the background signal. If Niraparib, as in other figures, eliminates PAR signals on western blot, these data would indicate that half of the basal signal is background, which is rather high. Having said that, tool development is an evolutionary process. These tools will provide a good foundation for future development. Therefore, understanding these limitations (dynamic range, quantitative sensitivity correlation, and background) will provide a better assessment of the utility of these new tools for investigating PAR biology.

      We appreciate the reviewer’s concern about the high background signal in Niraparib-treated samples. To answer this concern, we compared the dynamic range of PAR-T NanoLuc to Western blotting (Figure 6B) and found that the results from live cell luciferase assays using PAR-T NanoLuc are comparable to Western blotting using WWE-Fc. Of note, we were able to detect decreases in PAR levels with Niraparib using live cell luciferase assays with PAR-T NanoLuc, but not using Western blotting. Based on these analyses, we can conclude that the changes in PAR levels at the basal level are very minimal, leading to only a 50% decrease in PAR-T NanoLuc signal with Niraparib treatment (Figure 6B, Figure 5A-5C). Note that the decrease in PAR-T NanoLuc signal is greater when UV-treated cells were pre-treated with Niraparib, which is consistent with the results from Western blot analysis (Figure 5A).

      Reviewer 2 (Public Review):

      In this study, the authors attempted to extend their own work and that of others in the field in developing probes to detect the signaling molecule, poly-ADPribose (PAR) that can be used in the test tube, in cells and in tumor models. Major strengths include the development of a set of probes with data demonstrating utility and efficacy. Further, the authors show the assay to be useful in cell models and tumor models. Some weaknesses include what appears to be a high level of background in the assay. Further, regarding methods, the exact probes (sequences) being evaluated are not defined. This is one of several new PAR probes being developed over the last few years but may have widespread utility due to the quantitative nature of the bioluminescent assay.

      We thank the reviewer for these thoughtful and encouraging comments, as well as the interesting, thought-provoking, and constructive criticisms that have prompted us to dig deeper and provide more evidence to support our claims.

      Reviewer 3 (Public review):

      The major drawback is that, while the authors demonstrated some applications of these PAR trackers (PAR-T) in both cultured cells and in animals, the data on PAR-T ddGFP in cancer cells and the data on PAR-T Nano luciferase may not be sufficient to support the authors' claim that the new tool can detect spatial and temporal dynamics of PAR in cells and in animals. That said, the new tools can potentially expand the capability of cell biologists to visualize and study the PAR production process in both normal and disease states with improved sensitivity and tissue compatibility.

      We thank the reviewer for appreciating the potential utility of the PAR-T sensors, as well as the detailed and constructive criticisms that have prompted us to provide more evidence to support our claims. Addressing these comments has helped us to improve the paper.

      One of the major issues of this manuscript is the lack of time-course data for PAR-T luminescent sensors to demonstrate temporal monitoring of PAR levels in animals. If the binding of the two split Nano Luciferase parts is irreversible, the application might be limited. However, according to the literature (Scientific Reports volume 11, Article number: 12535 (2021)), the split Nanoluc technology should be able to detect dynamic changes. Either way, a set of time-course data would be necessary. The authors need to provide evidence to support their statement "The high sensitivity and low signal to noise ratios of the PAR-Trackers described here enable spatial and temporal monitoring of PAR levels in cells and in animals."

      We agree with the reviewer’s comment that the original manuscript did not demonstrate that the PAR-T sensors can be used to detect spatio-temporal changes in PAR. To demonstrate that PAR-T NanoLuc can be used to detect time-dependent changes in PAR levels, we performed a time course of UV-mediated PARP-1 activation (Figure 5D). The results from this assay demonstrated that the dynamic changes in PAR in live cells, in response to DNA damage, can be recaptured using the PAR-T NanoLuc sensors. In addition, we also measured PARGi-mediated PAR accumulation in vivo in xenograft tumors (Figure 8 - figure supplement 1B-1D). We found that PAR can be detected readily in breast cancer cells when injected into mice. Upon treatment with PARGi, the luminescence from PAR-T NanoLuc increased significantly by 6 hours and then diminished by 24 hours. These data demonstrate that PAR-T NanoLuc can be used to track dynamic changes in PAR levels both in cells and in animals. While not in vivo, our work with spheroids also addresses this concern. See our response to the next comment below.

      Figure 2- figure supplement 2. For the detection of spatial dynamics of PAR signals in cancer spheroids, the authors did not provide sufficient evidence as only static images of different spheroids in different conditions were provided. And 2 out of 3 fields of view only include one spheroid. In addition, there is no time-course image data showing the spatial patterns of PAR in cancer cells are dynamic.

      We have now performed a quantitative analysis of multiple spheroids. As indicated in Figure 3B, we observed a significantly higher GFP fluorescence signal in spheroids derived from PAR-T ddGFP expressing cells compared to those expressing ddGFP or those treated with Niraparib. To address the reviewer’s concern about using PAR-T ddGFP for spatio-temporal changes in cells, we included a video for live cell imaging of H2O2-mediated increase in PAR-T ddGFP (Figure 2 - figure supplement 2, video). We also developed an analysis approach that allows us to quantify the signals from the core of the spheroids separately from the periphery of the spheroids. We also performed a time course in 3D cancer spheroids to visualize the spatio-temporal changes in PAR levels (Figure 3C and 3D). The results from this experiment demonstrate that the PAR levels in cells at the core of the spheroids are relatively resistant to Niraparib treatment, as the PAR levels in cells at the core of the spheroid decrease at a lower rate when compared to PAR in the cells at the outer layer of the spheroid.

      In the caption of Figure 2 -figure supplement 1 (B and C), it states "Immunofluorescence assay to track PAR formation in response to H2O2.", but there is no evidence showing any antibodies were used there.

      We thank the reviewer for pointing out this error. It should have been written as live cell imaging, not immunofluorescence assay. We made this correction.

      It seems that Figure 3B and C do not support the statement "we observed specific detection of firefly luciferase with D-Luciferin and NanoLuc with furimazine with no cross-reactivity", and it is unclear why the authors refer to Fig. 3B and C after that statement, as those data do not seem to support this claim. Similarly, the statement "Moreover, the luminescence of PAR-T Luc is only 30-fold lower than intact firefly luciferase." was not supported by Fig. 3B. In fact, the differences between PAR-T Luc and intact firefly luciferase were ~1000 fold in vivo, judging from Fig 5B. It is also unclear which data of the construct was used to plot Fig. 3C.

      We thank the reviewer for this comment. We changed the scale bar to represent the true scale for the luminescence from Nano luciferase and Firefly luciferase. This indicates that the brightness of PAR-T NanoLuc is 30-fold lower than intact firefly luciferase. In Figure 3C, we plotted the ratio of PAR-T NanoLuc to firefly luciferase.

      In Fig. 4C, it seems that Firefly luciferase was consistently brighter with PARGi, and I wonder if such a difference is statistically significant. The authors did not perform a two-way ANOVA test for the firefly luciferase dataset.

      We included the statistics to indicate that these changes were not significant.

      The statement "Moreover, none of these sensors can detect PAR accumulation in vivo." seems to lack support. Have the authors proved that with evidence? I would recommend using the following statement instead: "Moreover, none of these sensors has yet demonstrated detection of PAR accumulation in vivo."

      We made this change.

      For the in vivo experiment, it is unclear about the benefits of normalizing the PAR-T radiance to the Firefly luciferase since the signals from Firefly luciferase did not overlap well with that from the PAR-T nano luciferase, which may cause bigger variations.

      We thank the reviewer for raising this point. We normalize the luminescence from PAR-T NanoLuc to that from firefly luciferase to account for the variability in tumor size between the mice. We think this is an important control in the analysis. The luminescence from firefly luciferase represents the differences in tumor size between the mice. Hence, that signal is greater than the signal from PAR-T NanoLuc and is spread over a larger area.
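
      The normalization described in this response amounts to a per-mouse ratio of the two luminescence readouts. The following is a minimal sketch under that assumption; the function name, the mouse identifiers, and the radiance values are hypothetical and chosen only to show that two tumors of different size can yield the same normalized signal.

```python
def normalized_radiance(nanoluc, firefly):
    """Ratio of sensor (PAR-T NanoLuc) radiance to the firefly luciferase control."""
    if firefly <= 0:
        raise ValueError("firefly luciferase radiance must be positive")
    return nanoluc / firefly

# Hypothetical radiance values per mouse: (PAR-T NanoLuc, firefly luciferase).
# mouse_2 carries a tumor half the size of mouse_1, so both readouts are halved.
mice = {
    "mouse_1": (2.0e5, 4.0e8),
    "mouse_2": (1.0e5, 2.0e8),
}

ratios = {name: normalized_radiance(n, f) for name, (n, f) in mice.items()}
for name, ratio in ratios.items():
    print(name, ratio)
```

      In this toy case the raw NanoLuc signals differ two-fold, while the normalized ratios are equal, which is the effect the tumor-size control is meant to achieve.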

      Judging from the data of Fig 3 supplement 1E, the signal intensity from the split firefly luciferase-based PAR-T sensors was ~10000 fold less than intact firefly luciferase, not ~1000 fold. It makes more sense to give up the split firefly luciferase for ~10000 fold differences since the signal intensity from the split nano luciferase was ~1000 fold less than intact firefly luciferase (Fig 5B).

      We noted the reviewer's concern about the split firefly luciferase PAR-T. We agree with the reviewer that the split nano luciferase is brighter than the split firefly luciferase (Figure 4C and Figure 4 - figure supplement 1E). Although split nano luciferase is 1000-fold dimmer than the intact firefly luciferase in vivo (Figure 8B and Figure 8 - figure supplement 1A), this difference is only 30-fold in in vitro assays (Figure 4C). Hence, the comparison of sensors based on split firefly luciferase to split nano luciferase highlights our efforts to make a brighter sensor. Moreover, we included the split firefly luciferase data to compare the performance of WWE vs macrodomain in the development of the PAR-T NanoLuc sensor. Since firefly luciferase is frequently used for sensor development, we believe that it is important to include the results obtained from this sensor.

      "Therefore, developing tools to measure ADPR dynamics in cells and in vivo is critical for better understating the various biological processes mediated by ADPR". "understating" should be "understanding".

      We corrected this error.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Estrach and colleagues seek to identify the ECM components that are key to regulating hair follicle stem cell (HFSC) activation using the highly characterized mouse hair follicle as a model. They first use a targeted approach to examine key ECM components expressed by HFSC and find that Fibronectin (FN) is highly expressed. Further, wholemount analysis of the hair follicle reveals a meshwork of FN enveloping the hair follicle. They hypothesize that FN is a fundamental regulator of hair follicle (HF) cycling and then proceed to carry out the long-term studies required to examine hair follicle cycling, knocking out FN with two different HFSC Cre lines (Lrig1 and Krt19), as well as the integrin coreceptor SLC3A2. They clearly show that absence of FN and SLC3A2 is detrimental to hair follicle stem cell activation and cycling (FN) and hair follicle identity (SLC3A2).

      Overall comments:

      The authors use the tail hair follicles as a model similarly to the highly characterized, synchronous back skin hair follicles. However, the tail hair follicles are asynchronous (Braun et al. 2003, PMID: 12954714), thus reporting the age of the mouse from which the tail whole mounts came is not sufficient to claim a HF cycle disorder - HF should be imaged in an unbiased manner and subsequently quantified for phase. The manuscript would greatly benefit from including more information in the figure legends, such as age of mice, number of mice and HF quantified, as well as what the error bars represent. Further, in samples where many HF were counted per mouse, these should be averaged and then the average per mouse displayed; super plots would be great to use here.
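
      The per-mouse averaging suggested above can be sketched as follows. This is an illustrative example only: the function `per_mouse_means`, the mouse identifiers, and the measurement values are hypothetical, and the point is simply that each mouse contributes one value to the plot and to the statistics, regardless of how many hair follicles were measured in it.

```python
from collections import defaultdict
from statistics import mean

def per_mouse_means(measurements):
    """Collapse (mouse_id, value) pairs into one mean value per mouse."""
    by_mouse = defaultdict(list)
    for mouse_id, value in measurements:
        by_mouse[mouse_id].append(value)
    return {mouse_id: mean(values) for mouse_id, values in by_mouse.items()}

# Hypothetical hair follicle measurements (arbitrary units), several per mouse.
data = [
    ("ctrl_1", 40.0), ("ctrl_1", 44.0),
    ("ctrl_2", 50.0),
    ("ko_1", 30.0), ("ko_1", 34.0),
]

means = per_mouse_means(data)
print(means)
```

      In a super plot, the individual follicle values would be shown as small points and these per-mouse means as larger points, with statistics computed on the means.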

      Major comments:

      1. In Figure 1, the use of tail whole mount images indeed provides striking display of the fibronectin meshwork that envelops the hair follicle. However, addition of a marker of the regenerative phase (e.g. proliferation) and resting phase would provide more convincing evidence that this is the particular phase of the hair cycle that you have captured, especially given my overall comment regarding the asynchronous nature of the tail HF cycle.
      2. The authors show that FN is expressed in early-mid anagen and conclude that FN is a regenerative signal. This claim should be substantiated with FN staining on more time points across the HF cycle to substantiate the argument that it is a regeneration-specific signal, found only in the telogen-anagen transition.
      3. Lrig1-cre and K19-cre-mediated FN knockout result in HF that are thinner at D158 - this is not immediately apparent from histological sections. Can you use your thick sections to give better perspective?
      4. The authors measure the width of the infundibulum from lightsheet microscope images. It is a bit difficult to position whole tissues using this technique, and the images that are shown are not from the same perspective, and thus measurement of the width is not accurate from these images. I suggest either removing this analysis or using more comparable images. Further, if this is a true phenotype, can you speculate on what the thickened infundibulum might mean?
      5. The authors then show mislocalization of Lrig1+ cells to the infundibulum in the absence of FN. Are other stem cell markers localized to the infundibulum or outside of the bulge? Further, what might the mislocalization of Lrig1+ cells mean?
      6. Please explain your conclusion after Figure 3i and at the end of the manuscript that states that FN is required for stem cell anchorage. I think that a very plausible explanation is that FN is required for stem cell function and identity, but anchorage of the SC lacks sufficient evidence. Further, your only evidence to support the anchoring theory only comes from expression of Lrig1 in FN knockout and no other markers. Are they also mislocalized? Please either tone down this conclusion on SC anchorage or provide stainings for more SC markers to show mislocalization in absence of FN.
      7. In Figure 3l-o, you examine proliferation on the control vs the conditional deletion of FN in D30 and D158. However, in D30, these tissues are not at all directly comparable since one is obviously in anagen and the knockout in telogen. You must compare the anagen knockout sample, although this occurs a bit later than the control. Further, how was the infundibulum distinguished from the bulb in these control images?
      8. In Figure 3P, you carry out RT-qPCR on whole skin to detect HFSC markers. This should have been carried out on sorted epithelial cells, as isolation of whole back skin introduces bias to the system: the number of stem cells may artificially look different in skin that is in anagen vs skin that is in telogen, because anagen skin has a different proportion of SC to progenitor cells to dermal cells. This concern is also similar to point 7 - the control and FN knockout at D30 are not comparable given that they are in different phases of the hair cycle.
      9. Figure 4a these images need to be of the whole mouse - it is not possible to determine what we are looking at or where - there is not even a scale bar.
      10. After Figure 4, you argue that the resolution of fibronectin expression from the healing dermis is the reason that hair follicles do not form, and cite Dekoninck and Blanpain (PMID: 30602767) - however this review makes no mention of the dynamics of fibronectin in wound healing. Further, evidence from Driskell et al (2013, PMID: 30602767) would suggest that it is the fibroblast population that responds to the wound that determines whether HF regenerate. And further, very large wounds do regenerate HF (Ito et al PMID: 30602767). In addition, this would all be fibroblast-derived FN, as opposed to the current study, which examines keratinocyte-derived FN. Please reconsider this argument.
      11. The authors knock out SLC3A2, an integrin coreceptor that is localized to the plasma membrane. They show a very similar, yet more severe, phenotype to the Lrig1- and K19-mediated knockout of FN. Given the bidirectional communication that SLC3A2 is responsible for, can you reconcile whether the defects in the HF cycle and the HFSC are a result of outside-in or inside-out signaling? Further, is it possible that integrin function regulated by SLC3A2 is necessary for more than FN assembly? This could be especially relevant given that your targeted screen also identified Col17A1, which is well known to be required for HFSC function (Matsumura et al., PMID: 26912707).
      12. It is intriguing that in the absence of HFSC-derived SLC3A2 that no FN network forms. Is FN expressed or is the assembly perturbed in the absence of properly functioning integrins? The authors conclude that the signaling cascade flows from fibronectin to integrin to SLC3A2, but do not test where the FN phenotype arises in the SLC3A2 knockout - is it due to aberrant assembly of the FN meshwork or a change in transcriptional or translational levels?
      13. In the grafting assay in Supplemental Figure 3, keratinocytes undergo a de novo hair follicle morphogenesis - is Lrig1 expression maintained in order to carry out cre-mediated deletion? Further, the fibroblasts in this assay may adopt a wound-like phenotype, expressing FN, which you earlier claim to be required for hair follicle production in wounds. Yet in the absence of epithelial FN, no HF form. Can the authors reconcile this?

      Minor comments:

      1. In Figure 1a, the two populations are Lgr5+ and basal; please define what the basal population is in this experiment.
      2. Significative is not a word.
      3. In Figure 4 figure legend, there is reference to a grafting experiment but no experiment shown.
      4. The authors delete FN in Lrig1+ or K19+ cells starting at D19 and harvest at D30, and conclude that the hair follicles do not enter anagen after the second telogen. Please include the data supporting the statement that mutant HF did not reenter the hair cycle after D65.

      Significance

      The authors show for the first time that fibronectin is expressed during cutaneous homeostasis and that it is required for normal function of the hair follicle stem cells. This is a significant conceptual advance for the field of skin biology because fibronectin is thought to be present only in wounds: derived first from infiltrating serum and second from fibroblasts to act as provisional dermal ECM to support epithelialization during the wound response, which is ultimately resolved upon the conclusion of wound healing (reviewed in: Singer and Clark, PMID: 10471461). Further, FN has also been characterized as an EMT marker during cancerous progression (Lamouille et al, PMID: 24556840). Estrach and colleagues show that fibronectin is actually expressed by hair follicle stem cell keratinocytes and is then assembled into a meshwork that envelops the hair follicle and is in fact necessary for hair follicle stem cell homeostasis. This work would be broadly interesting to the field of stem cell biology as well as to those working on extracellular matrix signaling. My field is epithelial stem cells and more specifically hair follicle development and cycling.

      Referee Cross-commenting

      I have no disagreement with any of the points raised by the other reviewers. In fact, we seem to agree on the majority of the concerns. This includes the use of the tail wholemount model, the use of Lrig1-cre, selection of timepoint vs phase of the hair cycle, the appropriateness of the link between Fibronectin and SLC3A2, and further significant issues related to display of data and their reproducibility. Further, all of the major comments raised need to be addressed in order to properly evaluate the conclusions that the authors make. In my opinion, none of the comments raised here are unreasonable.

    1. Authors Response:

      Reviewer #2 (Public Review):

      The authors use representational similarity analysis on a combination of behavioral similarity ratings and EEG responses to investigate the representation of actions. They specifically explore the role of visual, action-related, and social-affective features in explaining the similarity ratings and brain responses. They find that social-affective features best explain the similarity ratings, and that visual, action-related, and social-affective features each explain some of the variance in the EEG responses in a temporal progression (from visual to action-related to social-affective).

      The stimulus set is nicely constructed, broadly sampled from a large set of naturalistic stimuli to minimize correlations between features of interest. I'd like to acknowledge and appreciate the work that went into this in particular.

      The analyses of the behavioral similarity judgments are well executed and interesting. The subject exclusion criteria and catch trials for online workers are smart choices, and the authors have tested a good range of models drawn from different categories. I find the case that the authors make for social features as determinants of behavioral similarity ratings to be compelling.

      I have a few questions and requests for additional detail about the EEG analyses. I appreciate that the authors have provided the code they used for all the analyses, and I'm sure that the answers to many if not all of my questions are there, but I don't have access to a Matlab license to run the code. Also, since the code requires familiarity with not just Matlab but with specific libraries to understand, I think that more description of the analysis in the paper would be appropriate.

      Some more detail is needed in the description of the multivariate classifier analysis. The authors write (line 597-599): "The two pseudotrials were used to train and test the classifier separately at each timepoint, and multivariate noise normalization was performed using the covariance matrix of the training data (Guggenmos et al., 2018). "

      I suspect I'm missing something here, because as written this sounds as if there was only one trial on which to train the classifier, which does not seem compatible with SVM classification. If only one trial was used to train the classifier, that sounds more like nearest-neighbor classification (or something else). Alternatively, if all different pseudo-trial averages - each incorporating a different subset of trials - were used for training, then that would seem to mean that some of the training pseudo-trials contained information from trials that were also averaged into the pseudo-trials used for testing. I don't know if this was done (probably not) but if it was it would constitute contamination of the test set. I think this part of the methods needs more detail so we can evaluate it. How many trials were used to train and to test for each iteration?

      Thank you for raising this issue; we agree that our Methods section was unclear on this point. We used split-half cross-validation. There was one pseudotrial for training per condition (which was obtained by averaging trials). There was no contamination between the training and test sets, because the data was first divided into separate training and test sets, and only afterwards averaged into pseudotrials for classification. This procedure was repeated 10 times with different data splits to obtain more reliable estimates of the classification performance. We rewrote the corresponding section to make this clearer:

      “Split-half cross-validation was used to classify each pair of videos in each participant’s data. To do this, the single-trial data was divided into two halves for training and testing, whilst ensuring that each condition was represented equally. To improve SNR, we combined multiple trials corresponding to the same video into pseudotrials via averaging. The creation of pseudotrials was performed separately within the training and test sets. As each video was shown 10 times, this resulted in a maximum of 5 trials being averaged to create a pseudotrial. Multivariate noise normalization was performed using the covariance matrix of the training data (Guggenmos et al., 2018). Classification between all pairs of videos was performed separately for each time-point. […] The entire procedure, from dataset splitting to classification, was repeated 10 times with different data splits.”
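
      As an illustration only (this is not the authors' Matlab code), the split-then-average order described above can be sketched in Python. The function name, array shapes and the random data are hypothetical; the point is that trials are divided into training and test halves first, and pseudotrials are averaged within each half afterwards, so no single trial contributes to both sets.

```python
import numpy as np

def split_half_pseudotrials(trials, rng):
    """Split one video's trials into two halves, then average each half.

    trials: array of shape (n_trials, n_channels) at a single time point.
    Returns the training pseudotrial, the test pseudotrial, and the
    trial indices that went into each.
    """
    n_trials = trials.shape[0]
    order = rng.permutation(n_trials)          # random split for this repetition
    half = n_trials // 2
    train_idx, test_idx = order[:half], order[half:]
    # Averaging happens only after the split, so the halves share no trials.
    train_pt = trials[train_idx].mean(axis=0)
    test_pt = trials[test_idx].mean(axis=0)
    return train_pt, test_pt, train_idx, test_idx

rng = np.random.default_rng(0)
trials = rng.normal(size=(10, 64))             # hypothetical: 10 repeats, 64 channels
train_pt, test_pt, train_idx, test_idx = split_half_pseudotrials(trials, rng)
```

      With 10 repeats per video, each pseudotrial in this sketch averages 5 trials, matching the maximum stated in the revised Methods text. Repeating the call with fresh random splits would correspond to the 10 repetitions the authors describe.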

      We also performed the decoding procedure with a higher number of cross-validation folds and found very similar results.

      I think a bit more detail is also necessary to clarify the features used for the classification. My understanding is that each timepoint was classified as one action vs each other action on the basis of all the electrodes in the EEG for a given temporal window. Is this correct? (I'm guessing / inferring more than a little here.)

      This is correct, and we agree that further clarification was needed in text. We have added this:

      “Classification between all pairs of videos was performed separately for each time-point. Data were sampled at 500 Hz and so each time point corresponded to non-overlapping 2 ms of data. Voltage values from all EEG channels were entered as features to the classification model.

      The entire procedure, from dataset splitting to classification, was repeated 10 times with different data splits. The average decoding accuracies between all pairs of videos were then used to generate a neural RDM at each time point for each participant. To generate the RDM, the dissimilarity between each pair of videos was determined by their decoding accuracy (increased accuracy representing increased dissimilarity at that time point).”
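
      A minimal sketch of how pairwise decoding accuracies could be arranged into a neural RDM at one time point, assuming the accuracies have already been averaged over repetitions. The dictionary layout, function name and accuracy values are hypothetical and are not taken from the authors' code.

```python
import numpy as np
from itertools import combinations

def rdm_from_accuracies(pair_accuracy, n_videos):
    """Build a symmetric RDM whose entry (i, j) is the decoding accuracy
    for the video pair (i, j); higher accuracy means more dissimilar."""
    rdm = np.zeros((n_videos, n_videos))
    for i, j in combinations(range(n_videos), 2):
        rdm[i, j] = rdm[j, i] = pair_accuracy[(i, j)]
    return rdm

# Hypothetical accuracies for three videos at a single time point.
pair_accuracy = {(0, 1): 0.5, (0, 2): 0.8, (1, 2): 0.65}
neural_rdm = rdm_from_accuracies(pair_accuracy, 3)

# At a 500 Hz sampling rate, each sample spans 1000 / 500 = 2 ms.
ms_per_sample = 1000 / 500
```

      One such matrix would be produced per time point and per participant, giving the time-resolved RDMs that are later compared with the feature models.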

      It would be useful to know how many features constituted each feature space. For example, was motion energy reduced to one summary feature (total optic flow for whole sequence?) For "pixel value", is that luminance? (I suspect so, since hue is quantified separately, but I don't think this was specified).

      For motion energy, we used the magnitude of the optic flow, and calculated Euclidean distances between the vectorized magnitude maps rather than reducing it to summary features. We have included the dimensionality of each feature in Supplementary File 1b and we now refer to it in text:

      “These features were vectorized prior to computing Euclidean distances between them (see Supplementary File 1b for the dimensionality of each feature).”
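
      The vectorize-then-distance step can be sketched as follows, under the assumption that each video yields one feature map (for example, an optic flow magnitude map). The function name and toy maps are hypothetical.

```python
import numpy as np

def feature_rdm(feature_maps):
    """Flatten each feature map to a vector and return the matrix of
    pairwise Euclidean distances between those vectors."""
    vectors = np.stack([np.asarray(m, dtype=float).ravel() for m in feature_maps])
    diffs = vectors[:, None, :] - vectors[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))

# Toy 2x2 "maps" for three videos, chosen so the distances are easy to check.
maps = [np.zeros((2, 2)), np.ones((2, 2)), np.full((2, 2), 2.0)]
feat_rdm = feature_rdm(maps)
```

      Keeping the full vectorized map, rather than a single summary value, preserves spatial differences between videos, which is the choice the authors describe for motion energy.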

      Pixel value was indeed the luminance, and we have clarified this in text.

      More broadly, I would appreciate a bit more discussion of the role of time in these analyses. Each clip unfolds over half a second, so what should we make of the temporal progression of RDM correlations? Are the social and affective features correlated with later responses because they take more time to compute (neurally speaking), or because they depend on longer temporal integration of information? These two are not even exactly mutually exclusive, and I realize that it may be difficult to say with certainty based on this data, but I think some discussion of this issue would be appropriate.

      This is a great point, although it is difficult to speculate based on this data. One way to get at this would be to examine how much social-affective processing relies on previously extracted features. Future work could look at the causality between early and later-stage EEG features (unfortunately our post-hoc attempts to address this via Granger-causal analysis were unsuccessful, likely due to insufficient SNR with our specific experimental design). Alternatively, this could be investigated in a follow-up experiment that varies how social information unfolds over time (e.g., images vs. videos or varying video duration). We now discuss this possibility in the manuscript:

      “Given the short duration of our videos and the relatively long timescale of neural feature processing, it is possible that social-affective features are the result of ongoing processing relying on temporal integration of the previously extracted features. However, more research is needed to understand how these temporal dynamics change with continuous visual input (e.g. a natural movie), and whether social-affective features rely on previously extracted information.”

    2. Reviewer #2 (Public Review):

      The authors use representational similarity analysis on a combination of behavioral similarity ratings and EEG responses to investigate the representation of actions. They specifically explore the role of visual, action-related, and social-affective features in explaining the similarity ratings and brain responses. They find that social-affective features best explain the similarity ratings, and that visual, action-related, and social-affective features each explain some of the variance in the EEG responses in a temporal progression (from visual to action-related to social-affective).

      The stimulus set is nicely constructed, broadly sampled from a large set of naturalistic stimuli to minimize correlations between features of interest. I'd like to acknowledge and appreciate the work that went into this in particular.

      The analyses of the behavioral similarity judgments are well executed and interesting. The subject exclusion criteria and catch trials for online workers are smart choices, and the authors have tested a good range of models drawn from different categories. I find the case that the authors make for social features as determinants of behavioral similarity ratings to be compelling.

      I have a few questions and requests for additional detail about the EEG analyses. I appreciate that the authors have provided the code they used for all the analyses, and I'm sure that the answers to many if not all of my questions are there, but I don't have access to a Matlab license to run the code. Also, since the code requires familiarity with not just Matlab but with specific libraries to understand, I think that more description of the analysis in the paper would be appropriate.

      Some more detail is needed in the description of the multivariate classifier analysis. The authors write (line 597-599): "The two pseudotrials were used to train and test the classifier separately at each timepoint, and multivariate noise normalization was performed using the covariance matrix of the training data (Guggenmos et al., 2018). "

      I suspect I'm missing something here, because as written this sounds as if there was only one trial on which to train the classifier, which does not seem compatible with SVM classification. If only one trial was used to train the classifier, that sounds more like nearest-neighbor classification (or something else). Alternatively, if all different pseudo-trial averages - each incorporating a different subset of trials - were used for training, then that would seem to mean that some of the training pseudo-trials contained information from trials that were also averaged into the pseudo-trials used for testing. I don't know if this was done (probably not) but if it was it would constitute contamination of the test set. I think this part of the methods needs more detail so we can evaluate it. How many trials were used to train and to test for each iteration?

      I think a bit more detail is also necessary to clarify the features used for the classification. My understanding is that each timepoint was classified as one action vs each other action on the basis of all the electrodes in the EEG for a given temporal window. Is this correct? (I'm guessing / inferring more than a little here.)

      It would be useful to know how many features constituted each feature space. For example, was motion energy reduced to one summary feature (total optic flow for whole sequence?) For "pixel value", is that luminance? (I suspect so, since hue is quantified separately, but I don't think this was specified).

      More broadly, I would appreciate a bit more discussion of the role of time in these analyses. Each clip unfolds over half a second, so what should we make of the temporal progression of RDM correlations? Are the social and affective features correlated with later responses because they take more time to compute (neurally speaking), or because they depend on longer temporal integration of information? These two are not even exactly mutually exclusive, and I realize that it may be difficult to say with certainty based on this data, but I think some discussion of this issue would be appropriate.

    1. To treat the clot postpartum, the doctors wanted to prescribe an FDA Category X drug to treat the clot -- it's so dangerous for pregnancy that women often choose to be sterilized before they take it. They told me that my clotting disorder means I should not have any more children, because of the risk that pregnancy poses to my health. I didn't want them to think I was religious for fear of what they'd think of me, but when I hinted at the question of using Natural Family Planning (a method for spacing children that the Church deems morally acceptable), they laughed. Someone with my condition had to use contraception, they said. There was no choice. Fatigued by the constant pain, overwhelmed by medical bills that were piling up by the thousands, I began to slide back away from this religion, tumbling down a slope that ended back in atheism. I hadn't minded changing in the sense of not using the f-word so much, but this was a whole different ballgame. To stick with the Church now would be to lose my life as I knew it, and to set out down an unfamiliar, frightening path. Not knowing what else to do, I went back to the basics of the way I'd been taught to work through problems since childhood. My dad, my parent from whom I got my religious views (or lack thereof), had not raised me to be an atheist as much as he'd raised me to seek truth fearlessly. "Never believe something because it's convenient or it makes you feel good," he'd always say. "Ask yourself: 'Is this true?'" And so I set everything else aside, and clung to the simple question: What is true? I quickly realized then that that was not in question, and hadn't been for a while. For weeks now, I had known on an intellectual level that I believed what the Church taught. What stalled me had not been a hesitation of whether or not it was true; it had been a hesitation of not wanting to sacrifice too much. I had no idea how things would work out. 
I thought there was a fair chance that this step would lead us to financial ruin, and may even take a serious toll on my health. But I decided, for the first time in a long time, to choose what was true instead of what was comfortable. Joe and I signed up to begin the formation process at our parish church. And, in the first statement of faith I'd ever made, I told my doctors that I would not use contraception, because I was Catholic. ### After that moment, a bunch of fortuitous events occurred that smoothed the way for us to become Catholic. A series of windfalls gave us the money we needed to manage our medical bills. After they got over their initial shock at encountering someone who wouldn't contracept, my doctors came up with creative solutions to keep me healthy.

      What "creative solutions" did her doctors come up with?

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements [optional]

      This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

      We are grateful to the reviewers for their honest opinion regarding this work and plan to address the majority of the comments in a revised version, either through new analysis or revision to the text, as we believe these will improve the manuscript by making some of the details clearer. There were few suggestions that will lead to substantive changes to the findings. Here, we address the most salient critiques, the primary one being related to novelty.

      We respectfully disagree, as our detailed analysis of the DNA methylome in Octopus bimaculoides represents a significant advance in understanding how the epigenome is patterned in non-model invertebrates in general, and cephalopods in particular. We acknowledge the previous report that the octopus methylome resembles those of the few other invertebrates in which low DNA methylation has been found; however, that finding was part of a multi-organism study last year (de Mendoza et al., 2021), which lacked any detailed investigation. Our study provides the first in-depth analysis of methylation patterning and its relationship with transposons and gene expression, and reports the finding of other key epigenetic marks in O. bimaculoides and in other cephalopods.

      In short, we believe our study to be highly novel and that it represents the first analysis of this kind in cephalopods and one of the few existing in non-model invertebrate organisms. In addition, we identify the conservation of the histone code in cephalopods. While this may be expected, this is the first experimental evidence in this class and represents an important step forward in understanding the epigenetic regulation of genes and transposons in invertebrates. Finally, we plan to provide an updated transcriptome annotation for O. bimaculoides that will be available to the scientific community as a valuable new resource. We believe these features will make this study highly cited.

      We believe that findings like ours will complement several recent studies that extend the epigenetics field out of the current narrow focus on model organisms to understand how epigenetic mechanisms function in diverse animals. This provides new insights regarding the epigenetic mechanism of gene regulation in an emerging invertebrate model.

      2. Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.

      Reviewer 1 raised the following points that we are planning to address:

      *- It is unclear why the authors did not use the original gene models of O. bimaculoides or tried to improve them. By only relying on adult tissue (but the relatively late hatchling stage), they would have omitted most developmentally expressed genes, that are incidentally also the ones that are subjected to extensive spatiotemporal gene regulation (which is also a problem to assess the role of methylation). I think more comparisons with existing gene models and how the newly generated stringtie models should be provided. *

      We agree that using as many tissues and developmental stages as possible will expand the octopus transcriptome.

      We plan to:

      • Add RNA-seq data from stage 15 embryos to improve this.
      • Compare the gene model used in the original version of the manuscript (Stringtie model to use in Trinotate for improving the annotation of the genes) to the existing annotation model and report on which has superior performance for annotating the O. bimaculoides transcriptome.
      • Extend the annotation of the transcriptome which we undertook in a focused fashion in the first iteration of this manuscript.

      Reviewer 2 raised the following points that we are planning to address:

      *- It is not exactly clear to me why the authors look for expression clusters in the first part of the manuscript? This information, while interesting, does not seem to be used in the methylation analysis. It is also somewhat contradictory because the authors first claim that, based on their GO-term enrichment analysis, that different expression clusters are associated with "complex regulatory mechanisms, potentially based in the epigenome". Yet at the end they conclude that, due to the global and tissue-overarching nature of methylation, this "argues against this epigenetic modification as a player in the dynamic regulation of gene expression". *

      We thank the reviewer for pointing out this issue and we plan to clarify the point through changing the text and additional analysis. Since we found that the methylation pattern was stable across tissues, and that it corresponded to gene expression levels regardless of tissues, we concluded that the methylation pattern is not likely relevant for the tissue-specific gene expression pattern reported in Figure 1.

      We plan to:

      • Ask whether there is a correlation between the gene clusters generated in Figure 1 and the DNA methylation patterns identified in Figure 4.

      *- At least for the trees that are shown in the main figures it would be great to show support values. *

      We thank the reviewer for this request.

      We plan to:

      • Add full Supplementary information regarding the support values in Supplemental Files for all the trees present in the main Figures.

      Reviewer 3 raised the following points that we are planning to address:

      *- It would be great to see more data on cephalopod TET and MBD structure. For example, it would be interesting to know whether octopus TETs have a CxxC domain or whether MBD proteins harbor functional 5mC - binding domains. *

      We agree that it would be of interest to examine the conservation of TET genes to expand upon the initial analysis by Planques et al 2021 showing that O. bimaculoides have one TET homolog, one MBD4 homolog and one MBD1/2/3 homolog. Detailed analysis of MBD4 protein has been already performed in de Mendoza et al. 2021 by using the protein sequence of O. vulgaris, as the MBD4 gene in the O. bimaculoides genome appears truncated.

      We plan to:

      • Add the PFAM domain analysis for TET proteins. This will be added as a new figure panel.
      • Update the text to include the reference to the identification of MBD4/MECP2 as the invertebrate homologs of vertebrate MBD4.

      *- Even though RRBS provides limited insight into DNA methylation patterns, the authors could have done more to explore read-level 5mC information. For example, by studying single reads, the authors could deduce the numbers of fully methylated, unmethylated or partially methylated reads. Such analyses might provide valuable insight into potentially different modes of epigenetic inheritance in different tissues i.e are there tissues that favor fully methylated or unmethylated stretches of DNA vs tissues that favor partial methylation? *

      We think this is a really interesting point. This has been partially addressed in a previous work (de Mendoza et al., 2021) which found limited to no partially methylated reads in whole-genome bisulfite sequencing from O. bimaculoides brain.

      We plan to:

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      Reviewer 1 raised the following points that we have already addressed:

      We addressed all the comments raised by this Reviewer by revising the text, fixing references, typos and improving clarity.

      Reviewer 2 raised the following points that we have already addressed:

      We addressed all the minor Comments raised by this Reviewer regarding spelling errors and Supplementary Figures.

      - The finding that less than 10% of all possible sites are methylated is surprising. I could not (easily) find statistics of RRBS experiment read mapping to the genome.

      We have now provided this data in a new Supplemental Table 1 (referred to in the text as Table S1).

      *- It is very exciting to see methylation of gene bodies and some correlation to their expression levels, but the authors may need to include a disclaimer that the methylation of TEs may go undetected due to the gapness of the genome. In fact, the authors may try to map their data onto a somewhat closely related Octopus sinensis genome sequenced with long reads available at NCBI to confirm overall pattern. It is likely though that due the evolutionary distance only gene bodies will have mapping. *

      We thank the reviewer for this suggestion and have included a sentence in the Results section indicating that methylation of TEs may go undetected due to the poor annotation of the octopus genome.

      *- The statistical reasoning (and methodology) behind how clusters in Figures 1 and 4 were defined is unclear. In particular, in Figure 4, it seems that the authors had asked the program to give four clusters in total - why was this number chosen? It seems that using the same generic clustering approach as in Figure 1 may benefit or confirm the results in Figure 4. *

      We clarified the rationale in the Materials and Methods section describing the bioinformatic analysis. We will put the full code used in the manuscript on our GitHub page (https://github.com/SadlerEdepli-NYUAD/) so that readers can gain a more comprehensive understanding of the methods used.

      Reviewer 3 raised the following points that we have already addressed:

      We addressed all the minor comments in the text and figures raised by this reviewer regarding typos and clarity.

      *- There is little info on the generated 5mC data. To bolster its value as a resource, the manuscript should have a link to the table describing RRBS metrics. This should include: non-conversion rates, numbers of sequenced and mapped reads, read length and other info that the authors deem useful. *

      We have now provided this data in a new Supplemental Table 1 (referred to in the text as Table S1).

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      Reviewer 1 raised the following points that we are not planning to address:

      *- The newly sequence RNA-seq samples are using a ribodepletion protocol (RiboZero) while the other ones are using a polyA selection. This might be a slight problem to compare them quantitatively. Actually in the Figure 1, all 4 newly generated samples group together in the hierarchical clustering. *

      We acknowledge the reviewer’s point here and agree that heterogeneity in library prep and batch is a common issue when comparing publicly available with newly generated datasets. This could account for the clustering of the ribosomal RNA-depleted (i.e. RiboZero) libraries separately from the polyA-selected RNA libraries. While this could potentially introduce bias, we do not believe that it substantially alters any of the main findings or the interpretations of this data. Our purpose in carrying out the cluster analysis of transcriptomic data from multiple tissues was to identify distinct gene patterns that defined different tissue types. This was accomplished regardless of the potential confounding variable introduced by different library preparations. In addition, we used TPM, which seems to help in the comparison across different samples when used for qualitative analyses such as PCA and cluster analysis (Zhao et al. 2020; DOI: http://www.rnajournal.org/cgi/doi/10.1261/rna.074922.120). Therefore, even if it is not ideal, we think that this approach is still valuable.

      *- I am not so sure about the way the authors used z-score normalized logTPMs and applied hierarchical clusters, this most likely would not fully alleviate the impact of expression level on the outcome compared to more advanced form of normalization and clustering. *

      We agree with the reviewer that applying a z-score or a logTPM normalization would not fully resolve the technical variance in the direct comparison of libraries generated with different RNA selection methods. We did not apply the z-score to logTPMs; rather, these two methods were applied separately: z-score on TPMs in Figure 1B to define the gene clusters, and log2(TPM+1) in Figure 4E. We have clarified the text to reflect this.
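
      To make the distinction concrete, the two transformations can be sketched as separate operations on a genes-by-samples TPM matrix. This is an illustrative sketch, not the authors' pipeline; the function names and toy values are hypothetical, and the per-gene (row-wise) direction of the z-score is an assumption.

```python
import numpy as np

def zscore_per_gene(tpm):
    """Row-wise z-score: each gene is centered and scaled across samples."""
    mean = tpm.mean(axis=1, keepdims=True)
    std = tpm.std(axis=1, keepdims=True)
    std[std == 0] = 1.0                  # constant genes map to 0, not NaN
    return (tpm - mean) / std

def log2_tpm(tpm):
    """log2(TPM + 1): compresses dynamic range but keeps expression level."""
    return np.log2(tpm + 1.0)

# Toy matrix: 2 genes (rows) by 3 samples (columns).
tpm = np.array([[0.0, 3.0, 15.0],
                [7.0, 7.0, 7.0]])
z = zscore_per_gene(tpm)
logged = log2_tpm(tpm)
```

      The z-score removes each gene's overall expression level and keeps only its pattern across samples, which suits clustering by tissue profile, whereas log2(TPM+1) retains the level itself, which is what a comparison against methylation requires.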

      *- I am not convinced that differences in western blot for histone modification could really provide a clear insight into their regulatory role. *

      We agree with the reviewer that Western blotting for histone modifications does not provide deep insight into their regulatory role. However, this is the first description of these marks in any cephalopod, and we believe that reporting a finding from experimental evidence is important, even if the result is aligned with the existing paradigm. Moreover, the marked difference in levels of distinct histone marks across tissues supports the hypothesis that they play a regulatory role. We observed this in mice, where differences in abundance by Western blot correspond to differences in abundance and enrichment by ChIP-seq (Zhang et al., 2021 DOI: https://doi.org/10.1038/s41467-021-24466-1). Considering the limited tools available in this species, we still consider this an important finding.

      Reviewer 2 raised the following points that we are not planning to address:

      *- The finding that less than 10% of all possible sites are methylated is surprising. I could not (easily) find statistics of RRBS experiment read mapping to the genome. I also wonder how much the gap-richness of the genome may affect the overall methylation estimate. If assembly permits, would it make sense to limit the sampled sites to areas where no flanking gaps are present (and sufficient scaffold length is available, maybe excluding very short scaffolds)? *

      We added all the statistical values regarding the RRBS in a NEW Supplemental Table 1. We used a single base pair analysis approach (not tiling windows), so the data we extracted are not biased by the length of the scaffolds. This is confirmed by the fact that the DNA methylation value obtained in our RRBS data matches the findings observed in Whole Genome Bisulfite Sequencing (WGBS). Moreover, global DNA methylation values assessed by slot blot analysis, a technique independent of genome assembly, confirmed what was observed with RRBS.

  8. Feb 2022
    1. Well-paid and well-treated non-tenure track faculty are more likely to have the necessary time and be granted the backing required to lecture a first-rate class (Edmonds, 2015). [Edmonds, D. (2015, May 28). More than half of college faculty are adjuncts: Should you care? Retrieved from http://www.forbes.com/sites/noodleeducation/2015/05/28/more-than-half-of-college-faculty-are-adjuncts-should-you-care/#6ff634541d9b]

      I think that this is possible, and we could see it at more progressive schools. It is just like the banking industry: one bank got rid of overdraft fees, and slowly we are seeing major banks do away with the fees as well. All it takes is for one brave institution and then another to change the landscape of things.

    1. Fast Car
      • Fast Car is a song written and performed by American folk rock singer Tracy Chapman in 1986.
      • As quoted by Tracy Chapman in an interview in 1986 with Canadian radio station CIDR, “I think that it was a song about my parents…..a new life together and my mother was anxious to leave home. [They]tried to make a life for themselves and it was very difficult going….my mother didn’t have a high school diploma and my father was a few years older….hard for him to create the ideal life that he wanted…in a sense I think they came together thinking that they would have a better chance of making it”
      • The song became popular around the 1980s after Tracy Chapman performed it at the 70th birthday Nelson Mandela tribute.
      • The song is narrative, and tells the story from the point of view of a woman who wants to get out of her hometown to leave her problems behind, such as taking care of her alcoholic father, after her mother left the family. She finds the addressee, a lover with the titular fast car, and sees it as an opportunity to get out of her hometown at last and start life anew. However, the reality is far from that, as the song continues, and reveals how even after moving to the city, she is still working a dead-end job and living in a homeless shelter, while her lover is unemployed. It parallels her own parents’ relationship, as her lover becomes a father and an alcoholic, and their relationship grows rocky. Eventually she decides that she does not need him anymore, and tells him to either clean up his act or leave.
      • Hope and resilience are major themes in this song, as the persona continues to persevere despite the many challenges that are thrown at her in life, from her upbringing to her newfound problems. However, she remains optimistic and presses on, and we find hope in how she eventually achieves independence and supposedly leaves her problems behind.
      • One can interpret the title ‘Fast Car’ as a symbol of escapism. However, as the song goes on, it is revealed that getting out of a situation is far from easy and even if you start anew, there will always be problems and challenges in life to overcome. Additionally, sometimes, you will find that people who are close to you make irresponsible and selfish decisions which affect you. In order to carry on a meaningful and purposeful life, you may have to give up on them. To prevent this from happening, it is essential to become independent and not place all your faith in someone else, just as the persona has done in Fast Car.
    1. As a result, teachers who choose a variety of assistive technologies for their classrooms may want to ensure that students and parents are fully aware of any privacy or security issues that could arise.

      I think that this is so important. Everyone should be aware of the different privacy or security issues that may arise when using different technologies in order to stay safe. We have learned that not all tools are safe and that there are many times when passwords and personal information can be stolen. Also, students may not fully research the privacy of a website, so having a parent also research the tool/site can help ensure that all information will be safe.

    1. Even if you as an individual user may be okay with sharing your data for “free” tools, when you assign a tool to students you are asking them to share their data, whether they want to or not.

      It is so easy to click "accept" online, which may share your information; we often don't think twice, and we should be paying closer attention to this.

    2. However, it is important to note, even when there is an accessibility statement or VPAT, these are often self-reported by the company and can be limited by the knowledge of accessibility of the person(s) creating it.

      I think this is a huge "however" when looking at how helpful an accessibility statement is. If there are huge gaps then the statement is not as helpful but also if it is quite extensive you have to remember that it is still the company writing it up, and they may be lacking in knowledge. I think one day regulations will be put in place for the VPAT but that is not where we are yet.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Formation of tubes in a developing organism may arise from the closure of a pre-existing polarized epithelium or from de novo polarization and cavity formation in a group of dividing cells. The concept of the apical membrane initiation site (AMIS) refers to the fact that polarity proteins such as PAR3 accumulate at a point where the apical membrane will be created. This accumulation occurs as early as the two-cell stage. Previous reports have demonstrated the importance of the division process in defining this AMIS; however, in the present work the authors use in vitro 3D cultures of mESC to report a mitosis-independent mechanism that creates an AMIS, induces the polarization of groups of two or more cells, and permits the formation of a central cavity. The report shows that the mechanism is fully dependent on the polarized accumulation of E-cadherin at the cell membrane in contact with the other cells. Moreover, the mechanism does not require mitosis or interaction with the extracellular matrix.

      Major comments:

      The main objective of the work is to demonstrate that AMIS creation and cavity formation can be mitosis independent and that they depend on the accumulation of E-cadherin at the midline between two cells in contact. To this end, the authors perform 3D cultures of mESC. To rule out the requirement for mitosis, the authors treat cultures with mitomycin C and then purify single cells that are cultured again. The authors show time-lapse experiments demonstrating that individual cells that do not divide create an AMIS when they contact each other. With these cultures they demonstrate that the process does not require an interaction with ECM (provided by the Matrigel) but requires E-cadherin; to demonstrate that, they use E-cadherin KO cells (the same line in which E-cadherin has been deleted). The work is well written and the objectives are very clear. The technology used and the experiments done are adequate and sufficient to accomplish the proposed objectives, and the results obtained clearly support the conclusions reached. The methods are well explained and transparent enough to be reproduced elsewhere, and the number of replicates and the statistical methods applied seem correct to me, although I am just a biologist, not a mathematician. Although the objectives of the work (to demonstrate that AMIS formation can be independent of mitosis and that AMIS requires E-cadherin) are met, there are parts of the results that could be further studied or at least discussed more thoroughly.
Firstly, the authors show that in non-dividing cells an AMIS is formed at the first contact site between the two cells. They also show that in the absence of E-cadherin the cell maintains the polarization of centrioles and Golgi apparatus, even though no AMIS is formed. This indicates that the deposition of E-cadherin at the midline membrane is part of a more global polarization event that is most likely initiated by a directional activity of the Golgi apparatus, which may direct the delivery of mature E-cadherin in that particular direction, initiating or maintaining the basis for an AMIS. Since recent work (already cited in the manuscript) has demonstrated the importance of cadherin maturation for polarity establishment and maintenance (Herrera et al, 2021), the actual results should be further discussed in this context. Secondly, it was previously shown that in different epithelia, upon cell-cell contact, the aPKC complex (which includes Par3 and Par6) is recruited early to the contact site, where, with the participation of Cdc42, aPKC is activated, generating initial spot-like adherens junctions (AJs) (Suzuki et al., 2002). In that case it is thought to be mediated by a direct interaction between the first PDZ domain of PAR-3 and the C-terminal PDZ domain-binding sequences of immunoglobulin-like cell adhesion molecules: JAM-1 and nectin-1/3 (Fig. 3) (Ebnet et al., 2001; Itoh et al., 2001; Takekuni et al., 2003). Thus it would be interesting to know whether AMIS formation in the absence of cell division depends on JAM-1 or nectin, whether JAM-/Nectin signalling is sufficient to initiate the Golgi and centriole polarization, and which mechanism governs it.

      Minor comments:

      As I mentioned before, the paper is well presented and very clear; yes, it is simple, but simple is always better: no complicated graphics or lettering, thank you. Although in my opinion the work is very well written, I have to admit that I am not qualified to evaluate the literary style of the work since English is not my mother tongue. I have also not reviewed typographical errors, since I think that is the work of the editorial staff, not of scientific reviewers. Please include the full reference of all the antibodies used, including the company and not just the catalog number.

      Quoted references:

      Ebnet, K., Suzuki, A., Horikoshi, Y., Hirose, T., Meyer Zu Brickwedde, M. K., Ohno, S. and Vestweber, D. (2001). The cell polarity protein ASIP/PAR-3 directly associates with junctional adhesion molecule (JAM). EMBO J. 20, 3738-3748.

      Itoh, M., Sasaki, H., Furuse, M., Ozaki, H., Kita, T. and Tsukita, S. (2001). Junctional adhesion molecule (JAM) binds to PAR-3: a possible mechanism for the recruitment of PAR-3 to tight junctions. J. Cell Biol. 154, 491-497.

      Takekuni, K., Ikeda, W., Fujito, T., Morimoto, K., Takeuchi, M., Monden, M. and Takai, Y. (2003). Direct binding of cell polarity protein PAR-3 to cell-cell adhesion molecule nectin at neuroepithelial cells of developing mouse. J. Biol. Chem. 278, 5497-5500.

      Suzuki, A., Ishiyama, C., Hashiba, K., Shimizu, M., Ebnet, K. and Ohno, S. (2002). aPKC kinase activity is required for the asymmetric differentiation of the premature junctional complex during epithelial cell polarization. J. Cell Sci. 115, 3565-3573.

      Significance

      The paper describes for the first time that, contrary to what was previously believed, an AMIS can be generated without a cell division. This is very important because it opens the possibility that the mechanisms that give rise to biological cavities are in fact not what we believed. The work is of interest to all cell biologists, especially those working in developmental biology and cancer research.

      My particular field of expertise is cell biology and signaling, always applied to particular events such as nervous system development or cancer; in particular, I am interested in the Wnt/b-catenin and Sonic Hedgehog pathways.

    1. I think that we both know the way that this story ends

      The speaker foreshadows how their relationship will undoubtedly end. The word ‘story’ symbolises their relationship, which, like a story, has a start and an end. The different parts of a story can resemble life in a relationship with someone you love. The ‘introduction’ resembles how they meet, the ‘rising action’ resembles how they take things further and start to date, and the ‘climax’ can resemble the fights or arguments that start to take place, as inferred from the previous line. It carries on until the ‘conclusion’, where they separate. From this, I can reflect on the theme of relationships: a relationship may last only for a brief moment and then shatter, which can hurt so much that it is hard to let go. I feel proud of the speaker: even though he repeatedly fails to let go, he tries his best to do so.

    1. Reviewer #1 (Public Review):

      In this manuscript, Yang et al. trained monkeys to play the classic video game Pac-Man and fit their behavior with a hierarchical decision making model. Adapting a complex behavior paradigm, like Pac-Man, in the testing of NHP is novel. The task was well-designed to help the monkeys understand the task elements step-by-step, which was confirmed by the monkeys' behavior. The authors reported that the monkeys adopted different strategies in different situations, and their decisions can be described by the model. The model predicted their behavior with over 90% accuracy for both monkeys. Hence, the conclusions are mostly supported by the data. As the authors claimed, the model can help quantify the complex behavior paradigm, providing a new approach to understanding advanced cognition in non-human primates. However, several aspects deserve clarification or modification.

      1. The results showed that the monkeys adopted different strategies in different situations, which is also well described by the model. However, the authors haven't tested whether the strategy was optimal in a given situation.

      Our approach to analyzing the monkeys’ behavior is not based on optimality. Instead, we centered our analysis on the strategies and showed that they described the monkeys’ behavior well. The model and its fitting process do not assume the monkeys were optimizing for something. Nevertheless, the fitting results suggested that the strategies that the monkeys chose were rational, which supports the validity of our model. As we have pointed out above, optimality is hard to define in such a complex game. In particular, because most of the game is about collecting pellets, strategies that are only used in a small portion of the game can be ignored when searching for optimal solutions. We feel that further analyses on the issue of optimality would dilute the central message of the paper and choose not to include them here.

      According to the results, the monkeys didn't always perform the task in an optimal way, as well. Most of the time, the monkeys didn't actively adopt strategies in a long-term view. They were "passively" foraging in the task: chasing benefit and avoiding harm when they were approached. This "benefit-tending, harm-avoiding" instinct belongs to most of the creatures in the world, even in single-cell organisms. When a Paramecium is placed in a complex environment with multiple attractants and repellents, it may also behave dynamically by adopting a linear combination of basic tending/avoiding strategies, although in a simpler way. In other words, the monkeys were responding to the change of environment but not actively optimizing their strategy to achieve larger benefits with fewer efforts. The only exception is the suicides. Monkeys were proactively taking short-term harms to achieve large benefits in the future.

      One possible reason is that the monkeys didn't have enough pressure to optimize their choices since they will eventually get all the rewards no matter how many attempts they make. The only variable is the ghosts. Most of the time, the monkeys didn't really choose between different targets/ strategies. They were making choices between the chasing order of the options, but not the options themselves. It is similar to asking a monkey to choose either to eat a piece of grape or cucumber first, but not to choose one and give up the other one. A possible way to avoid this is to stop the game once the ghost catches the Pac-Man or limit each game's time.

      The game is designed to force the players to make decisions quickly to clear the pellets; otherwise the ghosts would catch Pac-Man. Even in the monkey version of the game, where the monkeys always get another chance, Pac-Man deaths lead to long delays with no rewards. They will not be able to complete the game if they do not actively plan their route, especially in the late stage when they must reach the sparsely placed dots while escaping from the ghosts. In addition, we provided additional rewards when a maze is cleared in fewer rounds (20 drops if in 1 to 3 rounds; 10 drops if in 4 to 5 rounds; and 5 drops if in more than 5 rounds), which added motivation for the monkeys to complete a game quickly.
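
      For readers who want the bonus schedule in one place, it can be written as a small lookup. This is a sketch only: the function name `bonus_drops` is hypothetical, and only the three tiers quoted above come from the response.

```python
def bonus_drops(rounds_used: int) -> int:
    """Bonus reward (in drops) for clearing a maze in `rounds_used` rounds."""
    if rounds_used < 1:
        raise ValueError("a maze takes at least one round to clear")
    if rounds_used <= 3:
        return 20  # 1 to 3 rounds
    if rounds_used <= 5:
        return 10  # 4 to 5 rounds
    return 5       # more than 5 rounds
```

      The step structure means that each additional Pac-Man death near a tier boundary can halve the bonus, which is the incentive for quick completion described above.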

      The monkeys’ behavior also suggested that they did not just adopt a passive strategy. Our analyses of the planned attack and suicide behavior clearly demonstrated that the monkeys actively made plans to change the game into more desirable states. Such behavior cannot be explained with a passive foraging strategy.

      2. It is well known that the value of an element is discounted by time and distance. However, in the model, the authors didn't consider it. A relevant problem will be the utility of the bonus elements, including the fruits and scared ghosts. Their utilities were affected not only by their value defined by the authors but also by effects, including their novelty and sense of achievement when they were captured, as the ghosts attracted relatively much more attention than the other elements (considering the number is 2 for them, see in figure 3E).

      These are good points, and our strategies could be built with more complexity to account for other potential factors. However, we focused our investigation on how to account for monkeys’ behavior with a set of strategies. A set of simple strategies with a small number of parameters would make a strong argument.

      Using a complex game such as Pac-Man allows us to investigate all of these interesting cognitive processes. We can certainly look at them in the future.

      3. The strategies are not independent. They are somehow correlated to each other. It may result in, in some conditions, false alarming of more strategies than the real, as shown in figure 2A.

      We have computed the Pearson correlations between the action sequences chosen with each basis strategy within each coarse-grained segment determined by the two-pass fitting procedure. As a control, we computed the correlation between each basis strategy and a random strategy, which generates actions randomly, as a baseline. Most strategy pairs' correlations were lower than the random baseline. The results are now included in the Supplementary material (Appendix Figure 3).
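
      One way such a comparison could be set up is sketched below. The response does not state how actions were encoded before correlating them, so the one-hot encoding of four movement directions, the function names, and the toy sequences are all assumptions made for illustration.

```python
import numpy as np

N_ACTIONS = 4  # assumed action set: up, down, left, right

def one_hot_flat(actions, n_actions=N_ACTIONS):
    """Encode a sequence of action indices as a flattened one-hot vector."""
    actions = np.asarray(actions)
    encoded = np.zeros((actions.size, n_actions))
    encoded[np.arange(actions.size), actions] = 1.0
    return encoded.ravel()

def action_correlation(a, b):
    """Pearson correlation between two equally long action sequences."""
    return float(np.corrcoef(one_hot_flat(a), one_hot_flat(b))[0, 1])

def random_baseline(actions, n_draws=200, seed=0):
    """Mean correlation between a sequence and uniformly random sequences."""
    rng = np.random.default_rng(seed)
    actions = np.asarray(actions)
    draws = [
        action_correlation(actions, rng.integers(0, N_ACTIONS, size=actions.size))
        for _ in range(n_draws)
    ]
    return float(np.mean(draws))

# Hypothetical action sequences from two basis strategies in one segment.
local = [0, 0, 1, 1, 2, 3, 3, 0]
energizer = [0, 0, 1, 2, 2, 3, 1, 0]
pair_r = action_correlation(local, energizer)
baseline_r = random_baseline(local)
```

      Comparing `pair_r` with `baseline_r` for every strategy pair and segment gives the kind of pair-versus-baseline summary described above.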

      Sometimes two strategies may give exactly the same action sequence in a game segment. To deal with this problem, we now include an extra step when we fit the model to the behavior, which is described in Methods:

      “To ensure that the fitted weights are unique (Buja et al., 1989) in each time window, we combine utilities of any strategies that give exactly the same action sequence and reduce multiple strategy terms (e.g., local and energizer) to one hybrid strategy (e.g., local+energizer). After MLE fitting, we divide the fitted weight for this hybrid strategy equally among the strategies that give the same actions in the time segments.”
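
      The bookkeeping in this step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the helper names are hypothetical, "combining" utilities is taken to mean summing them, strategy names are assumed not to contain "+", and the MLE fit itself is replaced by made-up weights.

```python
from collections import defaultdict
import numpy as np

def merge_identical(actions_by_strategy, utilities_by_strategy):
    """Merge strategies with identical action sequences into hybrid terms."""
    groups = defaultdict(list)
    for name, actions in actions_by_strategy.items():
        groups[tuple(actions)].append(name)
    merged = {}
    for names in groups.values():
        # Assumption: utilities of indistinguishable strategies are summed.
        merged["+".join(names)] = sum(
            np.asarray(utilities_by_strategy[n], dtype=float) for n in names
        )
    return merged

def split_weights(hybrid_weights):
    """Divide each fitted hybrid weight equally among its member strategies."""
    weights = {}
    for hybrid, w in hybrid_weights.items():
        members = hybrid.split("+")
        for name in members:
            weights[name] = w / len(members)
    return weights

# Hypothetical segment: "local" and "energizer" choose identical actions.
actions = {"local": [0, 1, 1], "energizer": [0, 1, 1], "global": [2, 2, 1]}
utilities = {"local": [1.0, 2.0, 1.0], "energizer": [2.0, 1.0, 2.0],
             "global": [0.5, 0.5, 0.5]}
merged = merge_identical(actions, utilities)
# Stand-in for the MLE step: pretend these weights were fitted.
fitted = {"local+energizer": 0.6, "global": 0.4}
weights = split_weights(fitted)
```

      Merging before fitting removes the collinearity that would otherwise make the individual weights unidentifiable; the equal split afterwards is a convention, since the data in that segment cannot distinguish the merged strategies.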

      Moreover, as the reviewer correctly reasoned, correlations between the strategies would yield possibly more strategies. However, our finding is that the monkeys were using a single strategy most of the time. This possible false alarm would go against our claim. Our conclusions stand despite the strategy correlations.

      It is hard to believe that a monkey can maintain several strategies simultaneously since it is out of our working memory/attention capacity.

      Exactly, and we are among the first to quantitatively demonstrate that the monkeys mostly relied on single strategies to play the game.

      Reviewer #2 (Public Review):

      In this intriguing paper, Yang et al. examine the behaviors of two rhesus monkeys playing a modified version of the well-known Pac-Man video game. The game poses an interesting challenge, since it requires flexible, context-dependent decisions in an environment with adversaries that change in real time. Using a modeling framework in which simple "basic" strategies are ensembled in a time-dependent fashion, the authors show that the animals' choices follow some sensible rules, including some counterintuitive strategies (running into ghosts for a teleport when most remaining pellets are far away).

      I like the motivation and findings of this study, which are likely to be interesting to many researchers in decision neuroscience and animal behavior. Many of the conclusions seem reasonable, and the results are detailed clearly. The key weakness of the paper is that it is primarily descriptive: it's hard to tell what new generalizable knowledge we take away from this model or these particular findings. In some ways, the paper reads as a promissory note for future studies (neural or behavioral or both) that might make use of this paradigm.

      I have two broad concerns, one mostly technical, one conceptual:

      First, the modeling framework, while adequate, is a bit ad hoc and seems to rely on many decisions that are specific to exactly this task. While I like the idea of modeling monkeys' choices using ensembling, the particular approach taken to segment time and the two-pass strategy for smoothing ensemble weights is only one of many possible approaches, and these decisions aren't particularly well-motivated. They appear to be reasonable and successful, but there is not much in the paper to connect them with better-known approaches in reinforcement learning (or, perhaps surprisingly, hierarchical reinforcement learning) that could link this work to other modeling approaches. In some ways, however, this is a question of taste, and nothing here is unreasonable.

      Thanks for the suggestion. In the new revision, we include a linear approximate reinforcement learning model (LARL) (Sutton, 1988; Tsitsiklis & Van Roy, 1997). The LARL model shares the same structure as a standard Q-learning algorithm but uses the monkeys’ actual joystick movements as the fitting target. The model, although computationally more complex than the hierarchical model, achieves a worse fitting performance.

      Second, there is an elision here of the distinction between how one models monkeys' behavior and what monkeys can be said to be "doing." That is, a model may be successful at making predictions while not being in any way a good description of the underlying cognitive or neuroscientific operations. More concretely: when we claim that a particular model of behavior is what agents "actually do," what we are usually saying is that (a) novel predictions from this model are borne out by the data in ways that predictions from competing models are not (b) this model gives a better quantitative account of existing data than competitors. Since the present study is not designed as a test of the ensembling model (a), then it needs to demonstrate better quantitative predictions (b).

      We concede the point that our model, while fitting the behavior well, does not directly prove that the monkeys actually solved the task in this way. The eye movement and pupil dilation analyses partly addressed this issue, as their results were consistent with what one would expect from the model. We also hope future recording experiments will provide neural evidence to support the model.

      But the baselines used in this study are both limited and weak. A model crafted by the authors to use only a single, fixed ensemble strategy correctly predicts 80% of choices, while the model with time-varying ensembling predicts roughly 90%. This is a clear improvement and some evidence that *if* the animals are ensembling strategies, they are changing the ensemble weights in time. But there is little here in the way of non-ensemble competitors. What about a standard Q-learning model with an inferred reward function (that is, trained to replicate monkeys' data, not optimal performance). The perceptron baseline as detailed seems very poor as a control given how shallow it is. That is, I'm not convinced that the authors have successfully ruled out "flat" models as explanations of this behavior, only found that an ensembled model offers a reasonable explanation.

      We hope the new LARL model provides a better baseline control as a flat model. It performs better than the perceptron, yet much worse than our hierarchical model. Still, we must point out that any hierarchical model can in theory be matched in performance by a flat model (Ribas-Fernandes et al., 2011). The advantage of hierarchical models mainly lies in their smaller computational cost for efficient planning. Even in a much simpler task such as a four-room navigation task, a hierarchical model can plan much faster than a flat model, especially under conditions with limited working memory (M. Botvinick & Weinstein, 2014). Our Pac-Man task contains an extensive feature space while requiring real-time decision-making. The result is that a reasonably performing flat model would go beyond the limits of the cognitive resources available in the brain. Even for a complex flat model such as Deep Q-Network (it can be considered similar to a flat model since it does not explicitly plan with temporally extended strategies (Mnih et al., 2015)), the game performance is much worse than that of a hierarchical model (Van Seijen et al., 2017). The performance of the monkeys was unlikely to be achieved with a flat model. In addition, we trained the monkeys by introducing the game concepts gradually, with each training stage focusing on certain game aspects. The training procedure may have encouraged the monkeys to generalize the skills acquired in the early stages and use them as the basis strategies in the later training stages when the monkeys faced the complete version of the Pac-Man task.

      Reviewer #3 (Public Review):

      Yang and colleagues present a tour de force paper demonstrating non-human primates playing a full on pac-man video game. The authors reason that using a highly complex, yet semi controlled video game allows for the analysis of heuristic strategies in an animal model. The authors perform a set of well motivated computational modeling approaches to demonstrate the utility of the experimental model.

      First, I would like to congratulate the authors on training non-human primates to perform such a complex and demanding task and demonstrating that NHP perform this task well. From previous papers we know that even complex AI systems have difficulty with this task and extrapolating from my own failings in playing pac-man it is a difficult game to play.

      Overall the analysis approach used in the paper is extremely well reasoned and executed but what I am missing (and I must add is not needed for the paper to be impactful on its own) is a more exhaustive model search. The deduction the authors follow is logically sound but builds very much on assumptions of the basic strategy stratification performed first. This means that part of the hierarchical aspect of the behavioral strategies used can be attributed to the heuristic stratification nature of the approach. I am not trying to imply that I do not think that the behavior is hierarchically organized but I am implying that there is a missed opportunity to characterize that hierarchical-ness (maybe in a graph theoretical way, think Dasgupta scores) further.

      All in all this paper is wonderful. Congratulations to the authors.

      We thank the reviewer for the encouraging comments. We have included a new flat model in the new revision for comparison against our hierarchical model and discussed other experimental evidence to support our claim.

    1. To create trails  When we are studying a text we need to take the time to understand more than just the storyline. During your second reading, any comments made during the first reading (marginal comments or summaries) will quickly give you the gist of your first reading, so that you can take advantage of your second.

      While multiple readings of a text in antiquity may have been rarer, due to the cheap proliferation of books, one can more easily "blaze a trail" through their reading to make it easier or quicker to rebuild context on subsequent readings.


      Look at history of reading to see which books would have been more likely re-read, particularly outside of one's primary "area" of expertise.

      Link to the trails mentioned by Vannevar Bush in As We May Think.

    1. Author Response:

      Reviewer #1:

      In this work Warneford-Thomson et al. developed an approach for surveillance screening for SARS-CoV-2, which involves the isothermic amplification of a region of the SARS-CoV-2 nucleocapsid gene using RT-LAMP, followed by detection with deep sequencing. High-throughput and cost effectiveness is achieved by two sets of barcodes that allow up to about 37,000 samples to be combined into one deep sequencing run. Moreover, the authors demonstrate they can do the detection from saliva collected on paper, which should make sample collection easier.

      The main strength of the work lies in the technical aspects, including setting up multiple controls such as a detection of a human gene, and multiplexing with detection of the influenza virus.

      The main weakness is that there are multiple other papers either published or archived that use RT-LAMP for SARS-CoV-2 detection, deep sequencing for SARS-CoV-2 detection, or both. These are cited in the current work, which is very well written and presented. Whether this method is better than the others which have the same aim of developing cost-effective and high-throughput detection is not conclusively demonstrated as only 8 clinical saliva samples are examined.

      We do not wish to claim that our method is better than the others. We think it has advantages and disadvantages and certainly it should be further optimized before scaling it up to population level. We have added these considerations to the text (lines 376–80).

      Furthermore, the requirement for deep sequencing and batching many samples for cost-effectiveness will, in most situations, greatly increase turn-around time. This will make surveillance much less effective, since by the time results are fed back, the asymptomatically infected individual would have had more opportunity to transmit the infection to others.

      We argue that time from sample to result is mostly a function of logistics and not of the method. With proper setups, the time from sample collection to results could be < 16 hours, which would be compatible with population-level surveillance. We added these considerations to the text.

      However, the deep sequencing step may be very useful for surveillance of circulating SARS-CoV-2 spike sequences to detect emerging variants within a population, provided this method can be modified to do it.

      We agree and we mention this possibility in the discussion.

      Reviewer #2:

      In 'COV-ID: A LAMP sequencing approach for high-throughput co-detection of SARS-CoV-2 and influenza virus in human saliva', Warneford-Thomson et al. present a novel methodology to perform large numbers of COVID-19 tests in parallel. Their approach takes unprocessed saliva and requires only a small number of experimental steps before the results are sequenced overnight to generate many thousands of results. This straightforward experimental design should allow the protocol to be expanded to a number of settings where population-level monitoring is required in order to contain outbreaks and reduce transmission. In this paper, the authors demonstrate the efficacy of their approach and perform a large number of benchmarking experiments to quantify its sensitivity, specificity and limitations of detection. They are able to detect artificially created infections (spike-ins) with as low as 5 virions per µL and all clinically available samples agreed with the standard RT-qPCR test. This method can detect both SARS-CoV-2 and Influenza infection and can also be applied to saliva samples which have been collected on filter paper, a strategy which will further simplify the testing regime.

      The authors have spent much time testing this approach but these have largely been limited to analysing artificially created infections. The only results which were obtained were from eight clinically derived samples which are presented in Figure 2E. Although all results from this approach agreed with the standard clinical test this is a small number of tests compared to the total number of tests which are reported in this paper. It is also only a small proof-of-principle experiment to justify a quick rollout of this technology.

      We have now performed COV-ID on 120 additional patient samples (new Figure 2-figure supplement 2). These new results are described in the text.

      The potential for this technology to perform rapid, high-throughput SARS-CoV-2 testing alongside the potential for very low sequencing costs (Figure 4G) is impressive. It is noted in the manuscript that this will require 96 unique barcodes but only 32 are tested here. All but three of these 32 work for the SARS-CoV-2 N2 primers and required STATH control but how will the remaining 67 primers be derived (i.e. is it realistic that this can be made to work to deliver the promise of this approach)?

      The current COV-ID patient barcodes are 5 base pairs long. This allows for 4^5 = 1,024 combinations. Out of an abundance of caution, we excluded barcodes with homology to the reverse complement of the RT-LAMP primers used in any of the experiments (i.e. primers for SARS-CoV-2 N2, STATHERIN, ACTIN, and influenza virus) and then selected a set of 32 with Hamming distances of at least 2 from each other. This is now described in more detail in the methods.

      Regarding the numbers, out of 1,024 5-bp barcodes, 404 were removed due to homology, leaving 620. Of these, we could find at least 163 with Hamming distance ≥ 2 from each other. Even with a substantial failure rate, this should allow for 96 working barcodes. If we had only considered clashes with N2 and STATHERIN primers, the number of available barcodes would be substantially higher.
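      The barcode arithmetic above can be illustrated with a minimal sketch. It is not the authors' pipeline: the primer sequence is a made-up placeholder, "homology" is simplified to an exact substring match against the reverse complement of each primer, and the greedy selection is just one simple way to obtain a set with pairwise Hamming distance of at least 2. All function names are hypothetical.

```python
from itertools import product

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def candidate_barcodes(length=5):
    """Enumerate every barcode of the given length (4**length sequences)."""
    return ["".join(p) for p in product("ACGT", repeat=length)]


def filter_homology(barcodes, primers):
    """Drop barcodes that occur in the reverse complement of any primer.

    Assumption: homology is modelled as an exact substring match, which is
    a simplification of whatever criterion was actually applied.
    """
    rc_primers = [reverse_complement(p) for p in primers]
    return [b for b in barcodes if not any(b in rc for rc in rc_primers)]


def greedy_select(barcodes, min_distance=2):
    """Greedily keep barcodes at least min_distance from all kept so far."""
    chosen = []
    for b in barcodes:
        if all(hamming(b, c) >= min_distance for c in chosen):
            chosen.append(b)
    return chosen


if __name__ == "__main__":
    placeholder_primers = ["AACGTAAAAA"]  # hypothetical, not a real primer
    pool = filter_homology(candidate_barcodes(), placeholder_primers)
    selected = greedy_select(pool)
    print(len(candidate_barcodes()), len(pool), len(selected))
```

      With real primer sequences and a real homology criterion the counts would differ from this toy run. The sketch only shows why a pool of a few hundred surviving 5-bp barcodes can still yield a sizeable set that is mutually separated by at least two mismatches.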

      Overall, this is an interesting paper which has very clear real-world application to helping to defeat the ongoing COVID-19 pandemic, but some extra validations are needed to fully demonstrate its performance in clinical and/or public health settings.

    1. Author Response:

      Reviewer #1:

      The experiments are well designed, generally well controlled, and carefully conducted, and are thoughtfully and appropriately discussed. The authors make conclusions that are well supported by their results.

      When describing the aptamer knockdown of the PPS, the authors explain that the western blot was too noisy for monitoring the knockdown, which is frustrating for the reader and must have been frustrating for the authors. The authors instead counter-intuitively use qRT-PCR to monitor the transcript abundance of the PPS transcript in the aptamer system - this aptamer system is thought to be a modifier of protein, not transcription or transcript abundance. The authors describe that this has been seen once before (using aptamer knockdown of PfFis1), and the authors of that study speculate that the TetR-DOZI aptamer might be degrading the target mRNA. This is a plausible explanation, but it isn't quite clear from the description how this experiment was performed. The authors explain that the knockdown parasites grew normally for three days, but the parasites may be becoming sicker over this period. It's therefore possible that the decrease in PPS mRNA abundance is a product, rather than a cause of the growth defect. Sick or dying parasites could plausibly impact the PPS differently to the two chosen controls, particularly since both control genes chosen have substantially longer half-lives than the PPS mRNA (according to the Shock and DeRisi datasets). I therefore suggest that this experiment be performed in an IPP rescue scenario (where the parasites aren't dying) with biological replicates. There is no explanation of the replicates here, but the error bars in 6C are implausibly small for real biological replicates.

      To address these concerns, we have added western blot data showing down-regulation of PPS expression in -aTc +IPP conditions, relative to a loading control. We have also repeated the growth assay and RT-qPCR experiment (in biological triplicate) under IPP-rescue conditions. Parasite samples harvested on day 3 of the IPP-rescue assay were analyzed by RT-qPCR and show reduced PPS mRNA abundance that is similar to (and slightly lower than) that observed without IPP supplementation. This similarity is not surprising to us, since the day 3 harvest in the original growth assay (without IPP) was 3 days before observing a parasite growth defect in -aTc conditions. With respect to the mechanism of transcript loss in the aptamer/TetR-DOZI system, the fate of transcripts in this system has not been investigated in depth. However, DOZI is believed to target bound mRNA to P-bodies, which are a known site of mRNA degradation in cells. We have unpublished data with multiple parasite proteins tagged with the aptamer/TetR-DOZI system. In all cases, we see strong reductions in mRNA abundance in -aTc conditions, suggesting that such decreases are a general property of this knockdown system.

      Line 342 "These results directly suggest that apicoplast biogenesis specifically requires synthesis of linear polyprenols containing three or more prenyl groups." - I think that this might be overinterpreting those results - there could be a number of different reasons why polyprenols of different sizes do or don't rescue, including different solubility, diffusion, availability of transporters, predisposition to break down to useable subunits. Perhaps this needs a caveat.

      We have modified the text here to remove “directly” and to acknowledge uncertainty in beta-carotene uptake: “Although it is possible that β-carotene is not taken up efficiently into the apicoplast, rescue by decaprenol, which is similar in size and hydrophobicity to β-carotene, suggests that apicoplast biogenesis specifically requires synthesis of linear polyprenols containing three or more prenyl groups.” We have also added the statement that “this hypothesis is further supported by additional results described in the next two sections”, referring to our identification of an apicoplast-targeted polyprenyl synthase.

      Line 361 " the cytosolic enzyme, PF3D7_1128400" - I don't think we know the localisation of this protein based on the published data. The Gabriel et al study makes it clear the protein isn't apicoplast or mitochondrial, but it is punctate at stages in a pattern that doesn't look to me to be a straightforward cytosolic localisation (and the original authors don't describe it as cytosolic).

      We agree that the localization of PF3D7_1128400 requires further investigation. The Gabriel study, which (surprisingly) is the only study we found that has examined localization of this protein by microscopy, observed diffuse signal in trophozoites consistent with cytoplasmic localization, in addition to focal, punctate signals in schizonts that were distinct from the apicoplast or mitochondrion. The authors described their results as, “Analysis by fluorescence microscopy of live parasites confirms expression along the intra-erythrocytic cycle and shows FPPS/GGPPS localization throughout the cytoplasm and also forming spots, which increase in number as parasites mature from trophozoite to schizont stages.” For simplicity we referred to FPPS/GGPPS localization as cytoplasmic but agree that available data suggest a more complex localization that requires further studies to understand. We have modified the text to indicate that available data suggest a complex cellular distribution that includes both the cytoplasm and additional sub-cellular foci outside the apicoplast and mitochondrion.

      Line 423 "with strong prediction of an apicoplast-targeting transit peptide but uncertainty in the presence of a signal peptide". I don't think this describes well the bioinformatic analysis of the N-terminus. Although the experimental data are convincing that this is an apicoplast-targeted protein, bioinformatically this would not be predicted as an apicoplast protein. There is no obvious signal peptide, and "uncertainty" is too vague a descriptor. None of the versions of signalP, nor psort, predict this as possessing a signal peptide (which by definition means that PlasmoAP absolutely rejects it), and there is no obvious hydrophobic segment at the N-terminus that we would normally expect of a signal peptide. The toxoplasma hyperlopit doesn't suggest that the Toxoplasma orthologue is apicoplast, and the protein isn't found in the Boucher et al apicoplast proteome. This is somewhat of a mystery. It doesn't diminish the solid localisation data, with the excellent complementary data from IFA as well as the doxycycline+IPP experiment, but it should be pointed out clearly that this localisation isn't to be expected from the sequence analysis.

      We thank the reviewer for this perspective and agree that SignalP is unable to identify a signal peptide at the N-terminus of PPS. We have modified the text to remove our description of “uncertainty” and explicitly state that SignalP is unable to identify a canonical signal peptide at the N-terminus of PPS.

      We note that multiple proteins detected in the Boucher et al. apicoplast proteome also lack an identifiable signal peptide by SignalP yet are clearly imported into the apicoplast. These proteins include the key MEP pathway enzymes DXR (PF3D7_1467300) and IspD (PF3D7_0106900), holo ACP synthase (PF3D7_0420200), FabB/F (PF3D7_0626300), and the E1 subunit of pyruvate dehydrogenase (PF3D7_1446400). Thus, apicoplast import despite lack of identifiable signal peptide by SignalP is not unique to PPS but general to multiple (if not numerous) apicoplast-targeted proteins. These observations suggest to us that protein N-termini in Plasmodium can have sequence properties compatible with ER targeting that are broader and more heterogeneous than those of the other eukaryotic organisms that comprise the training sets upon which SignalP is currently based. It remains a future challenge to fully understand these properties.

      With respect to the lack of PPS detection in the Boucher et al. apicoplast proteome, PPS appears to have a very low expression level and unusual solubility properties that require overnight extraction of parasite pellets in 2% SDS (or LDS) for detection. In our experience, the RIPA extraction conditions (which contained 0.1% SDS) used in the Boucher et al. study are insufficient to solubilize PPS, which may explain lack of PPS detection in their study.

      To explicitly address these questions regarding PPS targeting to the apicoplast, we have added a new section to the Discussion to explore PPS targeting in the absence of a recognizable signal peptide, its unusual solubility properties and lack of detection in the Boucher et al. proteome, and planned future studies to further test, refine, and understand targeting determinants.

      With respect to Toxoplasma, T. gondii appears to also express two polyprenyl synthase homologs, TGME49_224490 and TGME49_269430, that are ~30% identical (in homologous regions) to PF3D7_1128400 (FPPS/GGPPS) and PF3D7_0202700 (PPS), respectively. TGME49_224490 appears to be targeted to the mitochondrion in T. gondii (based on MitoProt and HyperLOPIT analysis), in contrast to its P. falciparum homolog, PF3D7_1128400, which localizes to the cytoplasm and other cellular foci outside the mitochondrion. TGME49_269430 does not appear to target the apicoplast in T. gondii (based on HyperLOPIT data), which contrasts with our determination of apicoplast targeting for the P. falciparum homolog, PF3D7_0202700. These differing localizations may suggest distinct cellular roles for these homologs in T. gondii compared to P. falciparum. We are also aware of a recent study (Pubmed 34896149) showing that loss of MEP pathway activity in T. gondii (due to loss of apicoplast ferredoxin) does not impact apicoplast biogenesis, in contrast to our observations in P. falciparum based on FOS treatment, DXS deletion, and PPS knockdown. These distinct phenotypes further suggest differences in isoprenoid utilization and metabolism between T. gondii and P. falciparum that remain to be understood. We have added a new section to the Discussion to address these considerations.

      The section after line 344 "Iterative condensation of DMAPP with IP…", up until line 377 doesn't sit well within the section that has the heading "Apicoplast biogenesis requires polyprenyl isoprenoid synthesis". I suggest either creating a separate subheading for this material, or moving it into the start of the subsequent section "Localization of an annotated polyprenyl synthase to the apicoplast.".

      We thank the reviewer for this suggestion, which we have followed. We have moved the referenced text to the beginning of the subsequent section to better align the text with that section heading.

      Reviewer #2:

      Minor comments:

      The authors emphasize that this study reveals a previously unnoted interconnection between apicoplast maintenance and pathways that produce an output from the apicoplast to serve the cell. But is the prevailing view really that these two are separate? Isn't the interconnection already clear from many other studies and observations? E.g., the fatty acids produced inside the apicoplast provide membrane- and lipid- precursors for the rest of the cell as well as for the apicoplast itself (Botte et al., PNAS, 2013) (although not essential in Plasmodium blood stages). Other pathways that function inside the apicoplast such as the Fe-S cluster synthesis are critical to support enzymes that provide exported metabolites (e.g., IPP synthesis, IspG/H) and function in maintenance (e.g., MiaB) (Gisselberg et al., PLoSPath, 2013). Perhaps the authors could tone this conclusion down and acknowledge that maintenance and output are interconnected in other cases, which have been acknowledged in the literature.

      We thank the reviewer for this perspective and agree that in Toxoplasma as well as in mosquito- and liver-stage Plasmodium there are multiple apicoplast outputs (i.e., metabolic products exported from the apicoplast) that contribute to parasite fitness, including IPP, fatty acids, and coproporphyrinogen III. To clarify, we are specifically referring to blood-stage Plasmodium in our manuscript, when heme and fatty acid synthesis are dispensable and where the prior literature has intensely focused on IPP as the key essential output of the blood-stage apicoplast and consistently stated that IPP is not required for organelle maintenance.

      We agree that prior work has firmly established that apicoplast housekeeping functions (e.g., synthesis of proteins and Fe-S clusters) are required for organelle maintenance and to support IPP synthesis. However, our work is the first to demonstrate in blood-stage Plasmodium that the reverse is also true: IPP, as an essential apicoplast output, is also required for organelle maintenance, and apicoplast maintenance and IPP synthesis are thus reciprocally dependent. We have modified the Discussion section to clarify these points and to explicitly acknowledge that apicoplast maintenance and other metabolic outputs may also be interdependent in Toxoplasma and other Plasmodium stages.

      Could the authors elaborate more on the leader sequence predicting apicoplast localization for the PPS characterized here and discuss why it might have been missed in previous detailed study of apicoplast localised proteins (Boucher et al., PlosBiol, 2018)?

      Please see our response above to Reviewer #1.

      Could the authors discuss conservation of the PPS gene(s) in other Apicomplexa with (e.g., T. gondii) and without (e.g., Cryptosporidium spp.) an apicoplast? This could be relevant for other people in the field and could give further insights into the enzyme's role in apicoplast maintenance.

      Please see our response above to Reviewer #1. Polyprenyl synthases are diverse enzymes that perform a variety of cellular functions, whose specific roles can differ between organisms. Although the two Plasmodium prenyl synthases show preferential homology with each of two different prenyl synthase homologs in Toxoplasma and Cryptosporidium (CPATCC_003578 and CPATCC_001801), the differing localizations of these homologs in each parasite suggest differing cellular roles. The differing dependence of apicoplast biogenesis on MEP pathway activity in T. gondii and P. falciparum and the absence of an apicoplast in Cryptosporidium further support differences in isoprenoid utilization and metabolism in these organisms. We have added a new section to the Discussion to address these considerations.

      Reviewer #3:

      The paper is very nicely written and was a true pleasure to read. The introduction is concise yet dense with all relevant background of our current understanding of functioning of the apicoplast in relation to IPP production and utilization. The rationale of the experiments and the interpretation of the results are presented clearly and everything is discussed well in the context of the current understanding of the field. The main conclusion of the paper that isoprenoid is not solely essential for critical functions elsewhere in the cell, such as prenylation-dependent vesicular trafficking but also for apicoplast biogenesis via its processing by an essential polyprenyl synthase conserved with plants and bacteria is well substantiated and very exciting. The authors demonstrate an equally beautiful and clever use of available and newly generated genetic mutants in combination with complementary pharmacological interventions and metabolic supplementation. There are no true major weaknesses that could jeopardize the conclusions or change the interpretation of the results. However, the authors do consistently perform statistical analyses on data obtained from individual cells obtained in no more than two independent experiments, which in my humble opinion does not qualify for statistical analysis. That said, the results are so clear-cut that no statistics are required to convince me, or to quote Ernest Rutherford: "If your experiment needs statistics, you ought to have done a better experiment."

      We thank the reviewer for these positive comments and suggestions. For growth assays, we have performed a third biological replicate and updated those figures and the indicated statistical analyses. For microscopy experiments, we have removed p values.

    2. Reviewer #1 (Public Review): 

      The experiments are well designed, generally well controlled, and carefully conducted, and are thoughtfully and appropriately discussed. The authors make conclusions that are well supported by their results. 

      When describing the aptamer knockdown of the PPS, the authors explain that the western blot was too noisy for monitoring the knockdown, which is frustrating for the reader and must have been frustrating for the authors. The authors instead counter-intuitively use qRT-PCR to monitor the transcript abundance of the PPS transcript in the aptamer system - this aptamer system is thought to be a modifier of protein, not transcription or transcript abundance. The authors describe that this has been seen once before (using aptamer knockdown of PfFis1), and the authors of that study speculate that the TetR-DOZI aptamer might be degrading the target mRNA. This is a plausible explanation, but it isn't quite clear from the description how this experiment was performed. The authors explain that the knockdown parasites grew normally for three days, but the parasites may be becoming sicker over this period. It's therefore possible that the decrease in PPS mRNA abundance is a product, rather than a cause of the growth defect. Sick or dying parasites could plausibly impact the PPS differently to the two chosen controls, particularly since both control genes chosen have substantially longer half-lives than the PPS mRNA (according to the Shock and DeRisi datasets). I therefore suggest that this experiment be performed in an IPP rescue scenario (where the parasites aren't dying) with biological replicates. There is no explanation of the replicates here, but the error bars in 6C are implausibly small for real biological replicates.

      Line 342 "These results directly suggest that apicoplast biogenesis specifically requires synthesis of linear polyprenols containing three or more prenyl groups." - I think that this might be overinterpreting those results - there could be a number of different reasons why polyprenols of different sizes do or don't rescue, including different solubility, diffusion, availability of transporters, predisposition to break down to useable subunits. Perhaps this needs a caveat. 

      Line 361 " the cytosolic enzyme, PF3D7_1128400" - I don't think we know the localisation of this protein based on the published data. The Gabriel et al study makes it clear the protein isn't apicoplast or mitochondrial, but it is punctate at stages in a pattern that doesn't look to me to be a straightforward cytosolic localisation (and the original authors don't describe it as cytosolic). 

      Line 423 "with strong prediction of an apicoplast-targeting transit peptide but uncertainty in the presence of a signal peptide". I don't think this describes well the bioinformatic analysis of the N-terminus. Although the experimental data are convincing that this is an apicoplast-targeted protein, bioinformatically this would not be predicted as an apicoplast protein. There is no obvious signal peptide, and "uncertainty" is too vague a descriptor. None of the versions of signalP, nor psort, predict this as possessing a signal peptide (which by definition means that PlasmoAP absolutely rejects it), and there is no obvious hydrophobic segment at the N-terminus that we would normally expect of a signal peptide. The toxoplasma hyperlopit doesn't suggest that the Toxoplasma orthologue is apicoplast, and the protein isn't found in the Boucher et al apicoplast proteome. This is somewhat of a mystery. It doesn't diminish the solid localisation data, with the excellent complementary data from IFA as well as the doxycycline+IPP experiment, but it should be pointed out clearly that this localisation isn't to be expected from the sequence analysis. 

      The section after line 344 "Iterative condensation of DMAPP with IP...", up until line 377 doesn't sit well within the section that has the heading "Apicoplast biogenesis requires polyprenyl isoprenoid synthesis". I suggest either creating a separate subheading for this material, or moving it into the start of the subsequent section "Localization of an annotated polyprenyl synthase to the apicoplast.".

    1. Author Response:

      Reviewer #1:

      Significance: A central puzzle in evolutionary biology (and philosophy of biology) is the evolution of new (collective) entities that can evolve on their own right (e.g. the evolution of multicellular organisms from single cells). These evolutionary transitions are often conceptualized in terms of fitness decoupling (a fitness increase of the collective even as the fitness of the component particles decreases). Using a life-history model, the authors show that fitness decoupling is not possible when the conditions for fitness are the same. Thus, this paper has the potential to change how we think about the evolution of new collective entities.

      Strengths: This paper is conceptually rich and the overall argument is clear. Re-analyzing previous data/models using their new framework highlights new patterns of fitness change in these transitions of individuality, and as such, it provides novel and exciting avenues of research.

      Weaknesses: While the overall argument is clear, some of the details can be hard to follow (even as someone familiar with the literature). The initial description of their model is fairly clear, but given its conceptual novelty, the paper does not spend enough time developing the different concepts of fitness at the particle level.

      Moreover, it is not entirely clear what is at stake: what is the role of fitness decoupling in our understanding of fitness transitions? And how does the proposed mechanistic ("trade-off breaking") model serve as a replacement? It seems to me like trade-off breaking is a characteristic of many evolutionary innovations, not only of major transitions. It seems even possible to envision groups that allow for an escape in a trade-off without leading to the evolution of a new "Darwinian" individual.

      For example, one could conceive of a trade-off in zebras between time spent foraging and protection against predators. Coming together temporarily as a group is likely to allow for values outside this trade-off space (similar to those in Fig. 6). One could even imagine a new mutation that makes zebras switch activities (foraging/watching) depending on their position within the group. This mutation is only available to zebras that form groups (the phenotype does not exist in the absence of a group). But I would still want to argue that there is more to the evolution of new levels of individuality. Trade-off breaking seems (potentially) a necessary, but not sufficient step in these transitions.

      And while the language of the authors is careful to not suggest sufficiency, it is not entirely clear how this approach helps us understand the particularity of these transitions.

      Reviewer #1 asks first to clarify the stakes: what is the role of fitness decoupling in the explanation and how does tradeoff-breaking replace or supplement it? Second, they requested us to make a statement about the necessary or sufficient nature of tradeoff-breaking.

      With respect to the second point, we argue that tradeoff breaking is not sufficient, but is probably necessary for an ETI to occur.

      Let us now clarify the role of fitness decoupling and tradeoff breaking in the explanation of ETIs. It must be stressed that tradeoff breaking does not “replace” fitness decoupling; rather, tradeoff breaking is an event that cannot be understood readily in the framework of fitness decoupling. Thus, we claim that ETIs are better understood when seen through the lenses of traits and the evolutionary constraints that link them (i.e., tradeoffs) than via the export-of-fitness model (i.e., fitness decoupling). To illustrate this, we use the zebra herd example proposed by the reviewer. Coming together temporarily as a collective does not, in itself, constitute a tradeoff-breaking event, but rather simply a collective-formation event (similar to the first ace2 mutation in snowflake yeast or the first WS mutation in the Pseudomonas system). From this starting point, a number of mutations (i.e., change in traits values) can be fixed in the population that improve the performance of zebras within this environment. This is the “fast” part of the evolutionary trajectory that occurs on the ancestral tradeoff, which we called “low hanging fruit mutations” in the manuscript. As a consequence, “optimal herds within the ancestral tradeoff” evolve. As stated in the manuscript, if we assume that the tradeoff on traits is identical for lone zebra and zebra herd and also assume that the ancestral lone zebra exhibit trait values that are optimal (within these constraints) for lone zebras, it follows that the low-hanging fruit mutations that improve the zebra herd will probably reduce counterfactual fitness. This lowering of counterfactual fitness is not due to a “transfer” between real and counterfactual fitness (because there is nothing to transfer between real and counterfactual worlds), but is a consequence of the differential contribution of the traits involved in the tradeoff to the two fitness quantities. 
      However, this specificity of the tradeoff might be significant because it could lead to stabilisation of the new collectives through ratchetting.

      There is, indeed, “more to the evolution of new levels of individuality,” as pointed out by Reviewer #1. We claim that it involves rare mutations that would overcome the ancestral constraint and call them “tradeoff breaking mutations”. Tradeoff-breaking mutations are not bound by ancestral tradeoff; therefore, there is no a priori theoretical or biological reason to think they would have any positive or negative effect on counterfactual fitness. Here, we must stop using the zebra herd example because no tradeoff-breaking mutation occurred. However, the tradeoff-breaking lineages in the Pseudomonas example exhibit an improvement of both counterfactual and within-collective fitness. This observation does not fit within an export-of-fitness framework, but makes perfect sense in a traits-based view of ETIs—as a tradeoff-breaking mutation.

      Reviewer #2:

      This work reviews the influential "fitness decoupling" heuristic for understanding evolutionary transitions in individuality (ETIs), describes some of its limitations, and clarifies its interpretation. The review of the fitness decoupling account capably describes an interpretation of this framework that has frequently occurred in the literature, for example in Okasha 2006, Godfrey-Smith 2011, Hammerschmidt et al. 2014, Black et al. 2019, and Rose et al. 2020. However, it does not address the interpretation advanced by its authors, Richard Michod and colleagues, which they have clarified in several papers cited in the present work. Michod and colleagues have argued that the fitness decoupling account describes a changing relationship between the fitness of groups and the "counterfactual" fitness of their component cells, that is, the fitness the cells would have if they were removed from the group. This point is made explicitly in Shelton & Michod 2014 and Shelton & Michod 2020 and was present (though perhaps not as obvious) in Michod 2005 and later works, in contrast to the claim in the Glossary that this is a "relatively recent development of the fitness decoupling literature." The interpretation that Michod embraces is similar to what is here described as f2, the fitness of a "theoretical mono-particle collective", but that interpretation is not mentioned in the present work until Section 2.3. It is possible that an argument could be made that Michod and colleagues have not consistently interpreted fitness decoupling this way, or have made statements inconsistent with this interpretation, but no such argument is present in this work. Thus the impression conveyed is that Michod and colleagues consider decoupling of "commensurably computed fitnesses" possible, which is counter to their explicit statements on the topic.

      The description of the limitations of the fitness decoupling heuristic (Section 2) is useful and goes a considerable distance toward clarifying the ways in which fitness decoupling can rigorously be interpreted. However, the final assessment (Section 2.3) does not make a compelling case for its central argument, the lack of utility of the fitness decoupling concept. Elsewhere in the work, the ratcheting model of Libby and colleagues is referenced in comparison to the tradeoff-breaking approach, but Section 2.3 does not acknowledge the relationship between Libby and colleagues' model and the counterfactual interpretation of the fitness decoupling heuristic. For example, the argument in Libby and Ratcliff 2014 that "If any of the yeast that evolved high rates of apoptosis within clusters were to leave the group and revert to a unicellular lifestyle, they would find themselves at a competitive disadvantage relative to other, low-apoptosis unicellular strains." and in Libby et al. 2016 that "…if G cells were to revert to unicellular I cells, they would be quickly outcompeted" are counterfactual fitness arguments essentially similar to that of Shelton and Michod 2020 that "the fitness a cell would have on its own declines as the transition progresses." Section 2 makes a convincing case that commensurable fitnesses cannot be decoupled, but by fixating on commensurability, which is not relevant to the counterfactual interpretation of fitness decoupling, Section 2.4 fails to make a convincing case that "fitness-decoupling observations do little to clarify the process of an ETI." That is, "because they are not commensurable" does little to explain why the counterfactual interpretation of fitness decoupling "does little on its own to clarify the process of an ETI," since commensurability is not a claim that the counterfactual interpretation of fitness decoupling makes.

      We agree with the reviewer on two essential points: (1) the decoupling of commensurably computed fitness is impossible when collectives have a finite size and (2) counterfactual fitness is not commensurable to particle or collective fitness.

      While we recognise that Michod and collaborators did clarify that fitness decoupling referred to counterfactual fitness (although, to us, this becomes clear from 2015 onward), we argue that the fitness transfer (or export-of-fitness) metaphor implies (by its wording) a commensurability of fitnesses that undermines this welcome clarification.

      Indeed, for a quantity to be transferred from one place—or component—to another, the source and destination must be commensurable. It is incorrect to talk about a transfer between counterfactual and actual quantities. A better choice of words to discuss the relative change of counterfactual and actual quantities would avoid the physical transfer metaphor and focus instead on the correlation of the two quantities. It must be noted that, despite the clarification of counterfactual fitness, the word “transfer” continues to be used in recent work (Davison & Michod, 2021).

      This may seem like nitpicking; however, there is a real advantage in being careful about this. We do agree that, under some assumptions, counterfactual fitness would decrease while whole–life cycle particle fitness (or collective fitness) increases. From there, one might ask: what needs explaining? If one assumes an export-of-fitness framework, the transfer of fitness explains why it cannot be otherwise. If fitness decreases on one side, it must increase on the other. In other words, the existence of a tradeoff is taken for granted based on the improper physical metaphor. While there are strong reasons to think that such tradeoffs exist, they should be assessed in their own right and on a case-by-case basis rather than being assumed to hold. Otherwise, there is no way to make sense of the tradeoff-breaking scenario described in Section 4.

      By the same token, the metaphor of “decoupling” often associated with the export-of-fitness model is misleading because it is used to describe a part of the evolutionary dynamics where counterfactual particle fitness and whole–life cycle particle fitness are strongly dependent on one another (even if their changes are anticorrelated), through the existence of the tradeoff.

      Nevertheless, we welcome the reviewer’s urge to clarify our position and how this relates to Michod and colleagues’ counterfactual fitness proposal.

      The model based on trade-offs and trade-off breaking is useful and likely to be of interest to theorists interested in ETIs. The observation that this model can reproduce the (counterfactual) fitness-decoupling observation is useful in showing how the two models relate. The result that counterfactual fitness decoupling is a consequence rather than a cause of the evolutionary dynamics is an important point (though perhaps obvious in retrospect, since counterfactuals, things that do not happen, can't be the causes of anything).

      The caution in Section 3.3 that "the same [counterfactual fitness decoupling] observation will be made in any situation in which short-term costs are compensated by long-term benefits, not solely during ETIs" is a good point, and it sets up the argument that trade-off breaking is a "genuine marker for an ETI". However, no convincing case is made that the same criticism, that the observed phenomenon is not unique to ETIs, is not equally true of trade-off breaking. Some nice examples of trade-off breaking in the context of ETIs are given, but these do not amount to an argument that trade-off breaking is only observed during ETIs. The life history literature includes examples of trade-off breaking that are not related to ETIs, so it is not clear that trade-off breaking is either a reliable indicator of ETIs or superior in this respect to counterfactual fitness decoupling.

      This point is in line with one of the points made by Reviewer #1. We have now clarified our position with respect to the generality of the tradeoff-breaking approach.

      In the Discussion, the "inconveniences" associated with the fitness decoupling are cogent limitations of this heuristic. The "impossibility of decoupling between commensurable measures of fitness" is an important result, but it is not new and should thus probably not be presented as "[o]ur first main finding". Shelton and Michod 2014 includes a mathematical proof in the appendix that, given the model assumptions, "consideration of the births and deaths of colonies gives us exactly the same bottom line (fitness) as consideration of the births and deaths of lone cells." The second main finding, that "fitness decoupling observations cannot be reliably used as a marker for ETIs," is valid, but as described above, a convincing case is not made that trade-off breaking can be reliably used in this manner, either. Trade-off breaking may, however, be a useful way to think about ETIs in the other ways that are suggested, for example as key events and as stepping stones to new hypotheses.

      We have now clarified our position.

    1. A node may have several applications running on it (several sensors) each of which is an application. These application instances on a node are said to be endpoints, where messages can originate and terminate.

      I think here we are calling sensors/actuators/transducers "applications". But this may be saying that the endpoint is where the data comes from, and not the hardware device. The question is, can application endpoints be at coordinator and router devices?

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The paper tackles an important problem regarding the effect of demographic dependent vaccination protocols on the reduction in the number of deaths with respect to the situation of no vaccination (say J). A compartmental SIRD model with reinfection Y is proposed, stratified in two (age dependent) groups, based on a binary reduction of a given contact map, and given infection fatality risk (IFR). Several countries are then analyzed.

      As far as I understand we have a control variable v, parameters of the stratified model (i=1,2) tuned to match IFRi, and a control objective, i.e. minimization of J over one year.

      The paper is well written. The final message and some theoretical passages are not completely clear, at least to me. I have the following observations that the authors may want to consider.

      We thank the referee for the revision and are very glad that the overall evaluation is positive. Comments and suggestions have been thoroughly addressed, as we discuss in the following.

      1) The study of stability of infection free and endemic equilibria should be better developed. The 5 equations can be reduced to 4 (neglecting D) and the characteristic polynomial of the reduced Jacobian used to characterize the local asymptotic stability of equilibria, instability, bifurcation points etc... Alternatively, one can use a co-positive Lyapunov function (LF). For instance, if we take the LF V=S+I+Y+R, we get $\dot V=-\mu_I I-\mu_Y Y \le 0$. If $\mu_I$ and $\mu_Y$ are strictly positive all equilibria are characterized by (S*,0,0,R*) and D=1-S*-R*. So, I don't understand the phrase after (7,8), notice that Y cannot be zero in finite time. For $\mu_Y=0$, Y* can be nonzero. I guess that closed-form computation of S* and R* is possible as function of the parameters at least in the case v=0. The stability result should be cast as a function of the current reproduction number (not made explicit) wrt S and R.

      The authors are invited to have a look at

      1.1) Pagliara et al, "Bistability and Resurgent Epidemics in Reinfection Models", IEEE CSLetters, 2018,

      for a theoretical analysis of stability on a similar (just a little bit simpler) model.

      We appreciate the suggestions of the referee for improvement of this material. We have carried out an in-depth revision of the stability analysis and significantly extended it. The major addition has been, as suggested, a section relating the current reproductive number at equilibrium (we call it the asymptotic reproductive number in the text) to the fixed points of the dynamics for three different scenarios: general model, no vaccination, and zero mortality of reinfected individuals. As Pagliara et al. show in their paper, the connection between the fixed points and the reproductive number is not trivial, but it is possible to derive it through the next-generation matrix technique, as we now do. Additional references regarding this technique have been added. We have included a Table summarizing the stability analysis (page 2 in SI 3) at the end of this new section.

      Other modifications include the reduction of 5 equations to 4 for the stability analysis and a clarification of possible equilibria (page 1 of SI 3), rephrasing and correcting our sentence after eqs. (7) and (8). We also attempted to obtain a closed-form computation of S* and R* but, to the best of our knowledge, concluded that it is not possible. We would be happy to pursue any insight in this respect the referee may have.

      What was said above should also be extended to the stratified model, where a "network" Rt could be defined, see for instance

      1.2) L. Stella et al, "The Role of Asymptomatic Infections in the COVID-19 Epidemic via Complex Networks and Stability Analysis", SIAM J Cont. Opt., 2021, (arxiv.org/pdf/2009.03649.pdf)

      We thank the referee for pointing out this reference. Following the analysis in Stella et al., we have carried out a stability analysis for the stratified model as well. The results are included in a new section (pages 7-10 in the SI 3).

      2) It is not clear whether the free contagion parameters of the model have been fitted on real data (identification from infection and reinfection data). Notice that the interplay between vaccination strategies and NPI is important, see e.g.

      2.1) Giordano et al, "Modeling vaccination rollouts, SARS-CoV-2 variants and the requirement for non-pharmaceutical interventions in Italy", Nature Medicine 2021,

      where progressive vaccination in reverse age order is considered together with different enforced NPI countermeasures.

      In the first part of our study, parameters are intentionally left free because we aim at describing the generic behavior of the model. Still, we derive several inequalities and relationships between parameter ratios that seem sensible given what the different classes in the model stand for. This is described in the sections on model parameters where the two generic models (SIYRD and S2IYRD) are introduced. The aim is to represent both the generic dependence on some variables and a broad class of contagious diseases, so parameters are mostly free. In agreement with this approach, parameters can also be freely varied in the companion webpage.

      In the second part of our study, the model is applied to COVID-19. In that case, we have used parameter values in agreement with observations, as (admittedly poorly) explained in pages 9-10 of the main text. Indeed, not enough information on parameter estimation was provided in the main text, and the SI 2 also needed some additional information. This has been amended. Let us explicitly mention that we have not fitted the dynamics of the model to any actual data set to fix specific values, as Giordano et al. do. In our case, we have first used different demographic data sets to evaluate contact rates and IFRs of the two population groups (these are parameters Mij and Ni in eqs. (7-10)). Secondly, recovery and death rates are estimated through the IFRi values for each age group i and the infectious period of COVID-19, that we fix at dI=13 days. Third, infection rate βSI=R0/dI has been estimated fixing R0=1, since the reproductive number of COVID-19 all over the world fluctuates around this value (Arroyo-Marioli et al. (2020) Tracking R of COVID-19: A new real-time estimation using the Kalman filter, PLoS ONE 16(1):e0244474). The reinfection rate is defined through its relationship with the infection rate, βRI= α1 βSI, where α1 was in the range 0-0.011 at early COVID-19 stages (Murchu et al. (2022), Quantifying the risk of SARS‐CoV‐2 reinfection over time, Rev Med Virol 32:e2260) and seems to be about 3-4 fold larger for the omicron variant (Pulliam et al., Increased risk of SARS-CoV-2 reinfection associated with emergence of the Omicron variant in South Africa, www.medrxiv.org/content/10.1101/2021.11.11.21266068v2). Given the relationships derived among parameters, our only free parameter was α2RY= α2 βRI, and we fixed it to α2=0.5 (i.e., reinfected individuals recover twice as fast as individuals infected for the first time).

      Once more, it was not our goal to precisely recover specific trajectories of COVID-19 or to point at possible future scenarios, but to illustrate the dependence of major trends with model parameters. Also, the appearance of new variants requires the reevaluation of parameters. For example, omicron has different IFR (therefore different mortality and recovery rates), a different infectious period, and higher infection and reinfection rates. In this context, the interactive webpage (where we will update demographic profiles and IFR data as they become available) is a useful resource to simulate any situation different from current or past ones.
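
      As a rough illustration of the parameter relationships quoted in this response (βSI = R0/dI and βRI = α1 βSI), the arithmetic can be sketched as follows. The variable and function names are ours and purely illustrative; only the numerical values (R0 = 1, dI = 13 days, α1 between 0 and 0.011) are taken from the text, and nothing here is the authors' code.

```python
# Illustrative arithmetic only: names are hypothetical, values are those
# quoted in the author response above.

R0 = 1.0     # reproductive number, fixed at 1 in the response
d_I = 13.0   # infectious period in days

# Infection rate per day, from beta_SI = R0 / d_I
beta_SI = R0 / d_I


def reinfection_rate(alpha_1, beta_si=beta_SI):
    """Reinfection rate beta_RI = alpha_1 * beta_SI."""
    return alpha_1 * beta_si


# Early-stage range quoted for alpha_1: 0 to 0.011
beta_RI_low = reinfection_rate(0.0)
beta_RI_high = reinfection_rate(0.011)

print(f"beta_SI = {beta_SI:.4f} per day")
print(f"beta_RI between {beta_RI_low:.5f} and {beta_RI_high:.5f} per day")
```

      Changing `R0`, `d_I` or the α1 range (for example, for a new variant) only requires editing the constants at the top; the relationships themselves stay the same.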

      3) In the model the immunity waning is not explicitly considered (flux from R to S or better from a vaccinated compartment to S). It is clear that this complicates the model. Please discuss why the indirect way the waning is considered here is justified.

      3.1) Batistela et al, "SIRSi compartmental model for COVID-19 pandemic with immunity loss", Chaos Soliton and fractals, 2021.

      3.2) McMahon et al, "Reinfection with SARS-CoV-2: Discrete SIR (Susceptible, Infected,Recovered) Modeling Using Empirical Infection Data", JMIR Public health and surveillance, 2020.

      Though the model does not consider an incoming flux of individuals to compartment S, the existence of a "backward" flux from R to Y yields a transient phenomenology analogous to models with increases in the S class. Indeed, it is these fluxes that cause persistent endemic states; otherwise, the S class is monotonically depleted until infection extinction.

      In the work of Batistela et al., the possibility that individuals become reinfected is effectively implemented through a flux between the R and S classes, since only one class of infected individuals is considered and recovered individuals cannot be infected again. In our case, feeding back to S would mean that previous immunity is completely lost or that vaccines are not effective at all for some individuals. This is neither what McMahon et al. conclude when evaluating real data nor what more recent surveys indicate (see for instance the Science Brief published in October 2021 by the CDC, SARS-CoV-2 Infection-induced and Vaccine-induced Immunity, https://www.cdc.gov/coronavirus/2019-ncov/science/science-briefs/vaccine-induced-immunity.html).

      Nonetheless, complete immunity waning (feedback to the S class) and reinfections (feedback to a partly immune class experiencing overall lower severity of the disease) are equivalent to a large extent: the trend of COVID-19 seems to indicate that our Y class will be the "new S", and that fully naive individuals would arrive mostly due to demographic dynamics (birth and death processes, as also implemented by Batistela et al.). In summary, complete immunity waning is rare in the time scales considered in our simulations, while partial immunity that decreases the severity of the disease (after infection or vaccination) is the rule, in agreement with our choices.

      4) Reduction of deaths wrt no vaccination is of course important, but so is reduction of stress in hospitals. This is particularly important now with the advent in Europe of the omicron variant. Please discuss the real message you want to convey to policy makers in the actual scenario of the pandemic.

      The model in this work is deliberately simple. Our main goal was to explore the qualitative effects of demographic structure and disease parameters in protocols for vaccine administration. This was the reason to consider a mean-field model in a population structured into two groups. The main conclusion is that optimal vaccination protocols are demography- and disease-dependent. If this is so in our streamlined model, it will be even more so in more realistic models, where one should include a finer stratification and, in all likelihood, heterogeneity in contagions. Our main message, therefore, is that there is no unique protocol for vaccine roll-out, valid for all populations and diseases. The abstract has been modified to highlight this conclusion.

      Some qualitative considerations also allow us to draw preliminary conclusions on the reduction of stress in hospitals. Since the number of hospital admissions is proportional to the incidence of the disease, the number H of hospitalized individuals can be represented as H=a I + b Y, with a>>b due to the partial immunity of vaccinated or recovered individuals (which belong to class Y upon (secondary) contagion). Therefore, minimizing the burden on the healthcare system amounts to minimizing the number of individuals in the I class. Beyond non-pharmaceutical measures, I is minimized when individuals are transferred as fast as possible to the Y class, that is, maximizing vaccine supply and acceptance. In terms of our model parameters, this entails maximizing v and also θ (the maximum fraction of individuals eventually vaccinated), for instance through devoted awareness campaigns. These ideas have been included in the Discussion section.
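
      The qualitative argument above, H = a I + b Y with a >> b, can be made concrete with a small numerical sketch. The weights `a` and `b` and the compartment fractions below are hypothetical values chosen only to illustrate the direction of the effect; they are not estimates from the manuscript.

```python
def hospital_load(I, Y, a, b):
    """Hospitalised fraction H = a*I + b*Y, as written in the response.

    a and b are hypothetical hospitalisation weights for first infections (I)
    and for infections of partly immune individuals (Y).
    """
    return a * I + b * Y


# Hypothetical weights satisfying a >> b
a, b = 0.10, 0.01

# Same total number of infected individuals (I + Y = 0.05) in both scenarios;
# only the split between the I and Y classes changes.
before = hospital_load(I=0.04, Y=0.01, a=a, b=b)  # most infections are first infections
after = hospital_load(I=0.01, Y=0.04, a=a, b=b)   # most infections occur in class Y

print(f"H before = {before:.4f}, H after = {after:.4f}")
```

      With a >> b, shifting infections from the I class to the Y class lowers H even when the total number of infected individuals is unchanged, which is the point made in the text.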

      Reviewer #1 (Significance (Required)):

      The final message and some theoretical passages are not completely clear, at least to me.

      Please discuss the real message you want to convey to policy makers in the actual scenario of the pandemic.

      As discussed above, we have modified the manuscript following the advice given by the Reviewer. We think that both the presentation and the theory are clearer now.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this paper, a compartmental model of the propagation of an infection with vaccination and reinfection is studied. The impact that changes in the rates of these two processes have on disease progression and on the number of deaths is analyzed. In order to highlight the overall effect of the demographic structure of populations and the propagation of a given disease among different groups, the population is divided into two subpopulations and the model is extended to the two-dimensional case. In addition to the study of equilibria and their relative stability, the model is then applied in the case of COVID-19. Different vaccination strategies are studied using real demographic data and with a population split between under 80 and over 80 individuals. It is observed that for low vaccination rates, the advisable strategy is to vaccinate the most vulnerable group first, in contrast to the case of sufficiently high rates, where it is appropriate to vaccinate the most connected group first. The simulations show also that with a low fatality ratio, the strategy that yields the greatest reduction in deaths is vaccination of the group with the most contacts, while the situation is reversed for higher fatality ratio.

      The model and simulations presented are interesting and valuable. The comparison of the behavior of the model in the 4 different countries is very interesting, as well as the webpage created by the authors.

      We thank the referee for the very positive evaluation and are very glad that the study is found interesting and valuable.

      As a minor comment, I think that the introduction of the model needs a more extensive literature review. For example, there is no mention of the classic SIR model of Kermack and McKendrick (1927) and other works on the introduction to epidemic models, which form the basis of the model presented by the authors.

      The referee is right. There is a long history of extensions and applications since Kermack & McKendrick introduced the SIR model, which we had omitted. This has been amended by adding an introductory paragraph with several new references at the beginning of the Models section, page 3 in the main text.

      Reviewer #2 (Significance (Required)):

      The model presented by the authors is quite original and simple enough to be suitable to different contexts and scenarios.

      Compared to previous work, this paper makes a twofold contribution, as explained by the authors. First, the introduction of reinfections shows the existence of long transients (or quasi-endemic states) that may precede the transition to a truly endemic state predicted for COVID-19. Second, the simplicity of the model allows the characterization of systematic effects due to, at least, group size, demographic composition, and IFRs.

      I am involved in the study and analysis of epidemic models accompanied by network effects. I think this paper is a good contribution, although preliminary, in the analysis of the vaccination process and in the search for the optimal strategy.

      We thank the Reviewer and are glad that our goal, offering a model as simple as possible to obtain meaningful conclusions, is appreciated.

    1. Author Response:

      Reviewer #1 (Public Review):

      In this paper, Qin et al. investigated the molecular mechanism of phospholamban (PLN) linked dilated cardiomyopathy (DCM), using structural approaches combined with biophysical measurements. Structures of the catalytic domain of protein kinase A (PKAc) in complex with PLN peptides (both wild-type and the R9C and A11E DCM mutants) provide insights into the mechanism of substrate recruitment and how it is perturbed in the disease state. Qin et al. show convincingly that the mutant peptides all have lower affinity for PKA than the wild-type peptide, suggesting models in which heterozygous DCM mutations act via sequestering PKA and thereby preventing phosphorylation of the wild-type peptide may be incorrect.

      The authors highlight significant differences between their structure of the WT-PLN:PKAc complex, which has a 1:1 stoichiometry, and a previous structure of the complex (PDB 3O7L), which has 1 PLN bound between two PKAc monomers (a 1:2 complex). The authors posit that the stoichiometry observed in 3O7L is an artifact of the crystal lattice, and does not occur in solution, supporting this with analysis of the elution volumes of the peptide complexes on size exclusion chromatography compared to PKAc alone. They further suggest that the AMP-PNP ligand included in the 3O7L structure is not bound, based on analysis of Fo-Fc maps calculated from the deposited coordinates. Inspecting 3O7L I am not convinced of this last point - it seems more likely that a technical error was made in assigning or refining the B-factor of the ligand in 3O7L, because there is clearly density present in SA-omit maps for the nucleotide.

      Taking these results together, the authors suggest a mechanism for DCM, whereby mutations in PLN result in lower affinity for PKA, and consequently reduced phosphorylation. This seems plausible and well supported by the data, although in the ADP-Glo assay used here, the reductions in phosphorylation observed for some of the mutant peptides are rather modest. However, as the authors state, it is plausible that even relatively subtle changes in PLN phosphorylation could have substantial effects on Ca2+ homeostasis via increasing SERCA inhibition.

      We thank the reviewer for the appreciation of our work.

      Reviewer #2 (Public Review):

      Strengths:

      The authors presented new high-resolution 3D crystal structures of the PKA catalytic domain (PKAc) in complex with PLN WT or mutant peptides (residues 8-22) containing the DCM-associated PLN mutations (R9C or A11E). These are novel and important data given that the present structures are dramatically different from those reported previously. The authors made a convincing argument that the 3D model reported previously may result from a crystallization artifact.

      By characterizing the interactions between the PKAc domain and PLN WT or DCM-associated mutant peptides using surface plasmon resonance (SPR) analysis, the authors convincingly showed that the DCM-associated PLN mutations at positions 9, 14, and 18 alter the conformation of the PLN peptide and reduce the binding affinity of the PLN peptide with PKAc. These data provide an explanation of how some DCM-associated PLN mutations at these positions reduce the level of PKA-dependent phosphorylation of PLN.

      The authors also performed nuclear magnetic resonance (NMR) to determine the structural dynamics of PLN WT, R9C, P-Ser16, and P-Thr-17 peptides. These NMR structures combined with the SPR analysis also support their conclusion that PLN phosphorylation and DCM-associated PLN mutations have an impact on its conformation.

      We thank the reviewer for the comments.

      Weakness:

      The present study used PLN-derived peptides (aa 8-22). Although technically challenging, it is important to consider if the full-length WT or mutant PLN will behave the same as those observed with the peptides. This is especially crucial in light of the prior work showing substantially different structures using a different segment of PLN.

      We are fully aware of the potential risk of drawing conclusions from an isolated peptide instead of the full-length PLN as a transmembrane protein. Previous studies showed that the PLN peptide could be used as a good model substrate that gets phosphorylated as efficiently as the full-length PLN protein (L. R. Masterson et al., Dynamics connect substrate recognition to catalysis in protein kinase A. Nat Chem Biol 6, 821-828 (2010); D. K. Ceholski, C. A. Trieber, C. F. Holmes, H. S. Young, Lethal, hereditary mutants of phospholamban elude phosphorylation by protein kinase A. The Journal of biological chemistry 287, 26596-26605 (2012)). These results, together with our biochemistry results, suggest the tail peptides are indeed active substrates of PKA. Due to the technical difficulty, we were not able to crystallize PKAc in complex with the full-length PLN. To explain the potential difference between the peptides and the full-length PLNs, we added more text in the discussion section: “Additionally, the trend of the reduced phosphorylation by DCM mutations can be significantly affected by the oligomerization state of PLN. Ceholski et al. showed that R9C severely inhibits PKA phosphorylation in the context of full-length pentameric PLN, but has a much milder effect in the context of full-length monomeric PLN or an isolated tail peptide [41].”

      Although it is convincing that DCM-associated PLN mutations likely reduce the interaction between PKAc and PLN (assuming that the peptides behave the same as the full-length PLN with respect to interaction with PKA) and, as a result, the PKA dependent phosphorylation of the mutant PLN, it is unclear how this impaired interaction between PKA and PLN mutant could explain the effects of the DCM-associated PLN mutations on SERCA function (either reduced or enhanced PLN-dependent inhibition of SERCA, as proposed previously). In this regard, can the authors predict if the DCM-associated PLN R9C mutation reduces or increases SERCA inhibition based on the results of their present study?

      It is indeed controversial how PLN mutations cause DCM. Previous studies have shown that the DCM mutations in PLN might change this regulation in either a phosphorylation-dependent or phosphorylation-independent manner. Our results show that the mutations may act in both ways: 1) the mutations reduce the phosphorylation level of PLN, which has been shown to enhance the inhibition of SERCA and inhibit the uptake of Ca2+; 2) the mutations change the conformation of PLN before binding to PKA or SERCA, which could have additional consequences, such as altered assembly state of PLN, phosphorylation of PLN by CaMKII, or changes in interactions of PLN with the lipid membrane. These effects could act in either direction, reducing or increasing SERCA inhibition, which is difficult to predict based on our data. We added the explanation in the discussion: “While decreased PLN phosphorylation is likely an important contributor to the physiological dysfunction associated with familial DCM, disease-causing mutations in PLN may have additional consequences, such as altered assembly state of PLN, phosphorylation of PLN by CaMKII, or changes in interactions of PLN with the lipid membrane. The influence of such factors on SERCA inhibition are unclear. In principle, they might further increase inhibition of SERCA and act in conjunction with lower PKA-mediated phosphorylation to manifest the disease symptoms. Conversely, it is possible that these factors could decrease the inhibition of SERCA, partially compensating for the decreased phosphorylation level, and mitigating the symptoms.”

      It is also unclear how reduced PKA phosphorylation of mutant PLN could lead to DCM. PLN is unlikely to be significantly phosphorylated by PKA at rest (in other words, PLN is likely to be phosphorylated by PKA during stress, i.e. during the adrenergic fight-or-flight response). Therefore, it is puzzling how such reduced PKA-dependent phosphorylation of PLN would significantly affect the PLN function in the absence of the fight-or-flight response.

      As explained above, we think that this regulation could occur in both phosphorylation-dependent and phosphorylation-independent manners. Even considering only the phosphorylation-dependent manner, the DCM phenotype could be due to an accumulation of the Ca2+ imbalance in the cell over repeated cycles of cardiac muscle contraction upon chronic accumulation of the sporadic phosphorylation events. It is also possible that the mutations affect the CaMKII-dependent regulation of PLN, which leads to DCM.

      Given that the DCM-associated PLN mutations have significant effects on the conformation of PLN itself, at least in the form of short-peptides, it is possible that these mutations could affect the folding, oligomerization, trafficking, degradation, etc., in addition to PKA-dependent phosphorylation. The relevance and contribution of reduced PKA-dependent PLN phosphorylation to DCM remain unresolved.

      We agree with the reviewers that both phosphorylation-dependent and phosphorylation-independent manners could contribute to the DCM disease phenotype. It remains unresolved which factor is the major contributor. We have added a statement in the discussion (see point above).

      Reviewer #3 (Public Review):

      This manuscript describes an elegant study utilizing crystal structures to elucidate the disease mechanism of familial dilated cardiomyopathy. It has been known for decades that mutations in PLN are associated with DCM, but the underlying mechanism remains controversial. In my opinion, Prof Yuchi and co-authors did an excellent job of revealing the high-resolution crystal structures of PKA-phospholamban complexes, representing both the native and diseased states. Using a variety of biophysical and biochemical methods, including SPR, ADP-glo, thermal melts, NMR, etc., the authors systematically investigated the correlations between the PLN conformation, the binding affinity, and the phosphorylation level. The mechanism of PKA phosphorylation on another related substrate, ALN, was also convincingly revealed. The results are very helpful for understanding the pathological mechanism of PLN-related DCM. More importantly, the atomic structures of PKA-phospholamban complexes lay a solid foundation for the structure-based rational design of therapeutic molecules that can reverse the effects of the DCM-causing mutations in the future, e.g. by stabilizing the interactions between PLN and PKA.

      We thank the reviewer for the appreciation of our work.

    1. Author response:

      Reviewer #2 (Public Review):

      This work by Castledine et al. addresses the important question of whether results from in vitro (laboratory-based) evolution studies may be useful for predicting evolution during phage therapy in a clinical setting. In order to explore this question, the authors cultured a set of bacterial isolates from a patient pre- and during phage therapy, as well as phages from several time points during therapy. They then experimentally evolved (in vitro) a mixture of the bacterial isolates from the patient in the absence of phage, or in the presence of phage using two different treatments (phage added once or added repeatedly). Overall, they observed similarities between the evolutionary outcomes (genomic and phenotypic) in vitro and in the patient. Resistance evolved rapidly in the patient and in vitro under phage selection, and similar genomic changes were observed in both environments. The approach of using bacterial isolates directly from the patient (as well as the phages used for therapy) in vitro is clever, and the observed similarities are compelling.

      We thank the reviewer for appreciating the novelty in our results and methodology.

      However, I think there are some limitations with the study that should be addressed in the text.

      In particular, (1) While the similarities in vitro and in the patient are quite interesting, there are some differences that were dismissed as being minor without justification. Calling the results "highly parallel" is a bit subjective - in vitro in the repeated phage treatment (which is suggested to be most similar to the clinical context), there did appear to be phage coevolution that was not observed in vivo. The tradeoffs/relationships between traits (as shown in Fig. 3) also differed to some extent.

      We agree this could have been more objectively phrased at the start of the discussion, and the text has been edited accordingly. We have highlighted the differences between in vivo and in vitro treatments with respect to phage evolution. Moreover, we have also highlighted that the observed trade-offs had different underlying mechanisms, which may not always result in parallel evolutionary changes between in vivo and in vitro environments.

      Additionally, for the genomic results only a subset of variants were plotted (those in genes of known function), but there were far more significant variants in genes of unknown function that were not included. It is difficult to assess whether the genomic findings are truly similar across environments if only a fraction of those results were presented in the manuscript.

      We chose to concentrate only on genes of known function so that we could better understand their potential significance, and also because the figures and analyses (Figures 4 and 5) would become extremely large, complex and uninterpretable with genes of unknown function included. This is especially true for Figure 5, which would have required us to show 284 rows if all genes had been included. Ultimately, whichever way we do this exploratory analysis, it is going to be difficult to see if findings are truly similar across environments because we only have a single patient who had phage therapy.

      However, we have redone the analysis with all of the significant genetic changes (SNPs and indels from both known and unknown genes) included.

      Figure 5 has been recreated and is now included as "Figure 5 - Figure supplement 1". All of the statistical analysis on (a) the number of SNP/indels seen (b) genetic distance from ancestor and (c) alpha diversity give quantitatively similar results. That is, although all the estimates are generally much higher after including many more genetic variants, all of the significant results from both the overall model fit and post-hoc multiple comparisons remain the same. One interesting result that came out of looking at all the genetic changes was that for genetic variants occurring in a gene of known function, 56% (28 out of 50) were de novo mutations, whereas this value was only 42% (98 out of 234) for variants in genes of unknown function.

      We then looked at the proportion of genetic variants (both in known and unknown genes) found in vitro that were also found in vivo. For genes of known function, 62% of genetic variants were found in vivo (31 of 50) and this was comparable to the 65% of genetic variants in genes of unknown function (153 of 234). Of the 26 genes of known function with differences identified in the in vitro analysis, 16 (61%) were also found to have genetic changes in vivo. The equivalent metric for genes of unknown function was 86% (85 of 99). Similar to in vitro, variants occurring in a gene of known function were more likely to be de novo mutations (77%) compared to variants occurring in a gene of unknown function (46%).
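      The overlap proportions quoted above can be re-derived from the stated counts. The following is only a minimal arithmetic sketch: the dictionary labels and the helper name `percent` are illustrative assumptions, and only the counts are taken from the text. Values are printed to one decimal place, whereas the text reports whole-number percentages.

```python
# Counts copied from the response text: (found in vivo, total found in vitro).
counts = {
    "variants, known-function genes": (31, 50),
    "variants, unknown-function genes": (153, 234),
    "genes, known function": (16, 26),
    "genes, unknown function": (85, 99),
}


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Proportion of in vitro findings that were also seen in vivo.
shared_pct = {label: percent(part, whole) for label, (part, whole) in counts.items()}

for label, value in shared_pct.items():
    print(f"{label}: {value:.1f}%")
```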

      While these patterns and exploratory analyses are interesting, they have extremely limited statistical power and therefore do not alter the conclusions or results of the work presented. For these reasons, we have chosen not to include these results in the already long manuscript. We have added a line to say we have done it both ways:

      “We performed all downstream statistical analyses on (a) only genetic variants in genes of known function and (b) all genetic variants.”

      And we also added a line at the beginning of the genomic analysis results section:

      “Results were not affected whether we included only genetic variants occurring in genes of known function or all genetic variants (Figure 5-Figure supplement 1). As we were interested in attributing potential functions to the variants identified, we only present the results for genetic variants occurring in genes of known function.”

      (2) Much of the text is framed around whether in vitro outcomes are predictive of those in vivo, but this study only included results from a single patient. Thus, it is impossible to know whether these findings are by chance or representative of a more general relationship between in vitro and in vivo evolution.

      We agree that having a single patient for our in vivo comparison limits the generalisability of our results. We have highlighted this in the revised manuscript. However, the fact that our replicated in vitro experiments agreed broadly with our in vivo results and those of other studies (finding resistance-virulence trade-offs) suggests that, at least in some circumstances, in vitro dynamics are predictive of in vivo dynamics. Further studies are clearly needed (and hopefully will arise as a consequence of this work) to determine the generalisability of this finding and the circumstances where this parallelism might break down.

      (3) Although the evolutionary outcomes appear to be similar, the pathogen was successfully cleared from the patient but persisted throughout experimental evolution. Whether the pathogen is successfully eliminated or not is presumably the most important clinical outcome, and while this difference is not surprising, it is an important one to point out to the reader. Essentially, evolution was similar to some extent but the consequences of evolution for bacterial persistence in each environment were quite different.

      We have now highlighted this difference to the reader in the revised manuscript.

    1. Author Response:

      Reviewer #1 (Public Review):

      • Line 141: It would be beneficial to better understand how the sequenced sample of the population corresponds to the PCR confirmed sample of the population, in order to understand possible selection biases in the sequence data. Could you elaborate on how the composition of sequenced PCR confirmed cases matches the composition of PCR confirmed cases, by the demographic characteristics listed in Table 1?

      Early in the pandemic (March-April), we tried to sequence every SARS-CoV-2 positive case diagnosed in our KWTRP laboratory from Coastal Kenya. However, with the sharp increase in the number of identified cases from the month of May 2020 onwards, and a limited in-house sequencing capacity, we changed strategy to sequence only a sub-sample of the identified positives. The criteria for sub-sampling included having a cycle threshold of < 30.0, spatial representation (at county level) and temporal representation (at month level). The consequent number and proportion of samples sequenced across the study period months and across the counties is summarized in Fig. 2C-E with the sample flow provided in Figure 2-figure supplement 1.

      In the revised manuscript we have provided a comparison of the demographic characteristics of the sequenced cases versus non-sequenced cases (shown as Table 2). The participants providing the sequenced and non-sequenced positive samples had a similar gender distribution and similar probabilities of being from either Wave one or Wave two. However, the sequenced and non-sequenced cases differed significantly in age distribution, nationality and travel history. Specifically, compared to the non-sequenced sample, the sequenced sample had more participants in the 30–39 years age bracket and a disproportionate representation of non-Kenyan nationals and persons with a recent international travel history.

      • Line 283: I am particularly interested in the observed inter county flows, but it is hard to interpret the numbers. Considering population sizes in each county, what are the phylogenetically observed import rates per 100,000? What are the rate ratios? Based on the observed data, is there any evidence that imports into coastal Kenya occurred statistically significantly through Mombasa?

      We thank the reviewer for these comments.

      In the revised manuscript we have added two new tables (1 & 4) which detail the population size in each of the six Coastal Kenya counties, population density and estimated import/export rates (per 100,000) for the counties.

      The alluvial plots are descriptive regarding genome flows. The underlying data on the pattern of virus movement is inferred using ancestral state reconstruction, which is an established phylogenetic approach that has been applied elsewhere to infer SARS-CoV-2 local and global movement (Wilkinson et al, Science 2021, Tegally et al, Nature, 2021).

      The result we obtained from ancestral state reconstruction, that Mombasa is a major gateway for variants entering the coastal region of Kenya, is consistent with (a) the county showing the highest number of circulating lineages (n=28) compared to the other five counties of Coastal Kenya, (b) approximately half (n=21, 49%) of the detected lineages in coastal Kenya having their first case identified in Mombasa and (c) Mombasa having an early wave of infections compared to the other Coastal counties.

      We are not aware of an approach to consider statistical significance on these plots. The graphical display is based on the observed number of events, and we would argue this is more appropriate than presenting absolute rates, which would be susceptible to sampling bias.
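      Once every node of a dated tree carries an inferred location, movement events of the kind shown in such flow plots are commonly tallied as location changes along branches. The sketch below illustrates that counting step only; it is not the pipeline used in the study. The toy tree, the node names and the function `count_transitions` are hypothetical, and the locations are assumed to have been inferred beforehand by an ancestral state reconstruction tool.

```python
from collections import Counter

# Hypothetical toy tree stored as child -> parent.
parent = {
    "n1": "root", "n2": "root",
    "t1": "n1", "t2": "n1",
    "t3": "n2", "t4": "n2",
}

# Hypothetical locations, assumed already inferred for internal nodes and tips.
location = {
    "root": "Global", "n1": "Mombasa", "n2": "Global",
    "t1": "Mombasa", "t2": "Kilifi", "t3": "Global", "t4": "Mombasa",
}


def count_transitions(parent, location):
    """Count branches whose parent and child locations differ, keyed by (source, destination)."""
    flows = Counter()
    for child, par in parent.items():
        src, dst = location[par], location[child]
        if src != dst:
            flows[(src, dst)] += 1
    return flows


flows = count_transitions(parent, location)

# Imports into the study region are the transitions whose source lies outside it.
imports = sum(n for (src, _dst), n in flows.items() if src == "Global")
print(dict(flows), imports)
```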

      Is it possible to account for potential bias in sequence sampling in these calculations, perhaps as done in Bezemer et al AIDS 2021? It should be possible to adjust for the proportion of sequenced individuals in PCR confirmed individuals, and it might also be possible to back calculate infected cases from cumulative reported deaths and to adjust for the proportion of sequenced individuals in infected individuals?

      The reviewer suggests helpful methods to examine sampling bias, but we found this beyond scope here. Our method was based on ancestral location state reconstruction of the dated phylogeny. The approach has been used elsewhere to answer similar questions (Wilkinson et al, Science 2021, Tegally et al, Nature, 2021). The Bezemer paper uses a maximum parsimony ancestral state reconstruction algorithm implemented in phyloscanner, and the Bayesian method applied to impute incomplete sampling is applicable to chains of transmission, which we have not tried to reconstruct in our analysis.

      Considering my earlier recommendation to document sequence sampling representativeness in Table 1, if Mombasa is found to be oversampled relative to infections, then it might also be helpful to perform sensitivity analyses in which sequences from over-represented locations are down-sampled. Another option might be to consider the approaches considered in de Maio PLOS Comp Bio 2015, or Lemey Nat Comms 2020. Thank you for investigating potential caveats and substantiating your findings in more detail.

      In the revised manuscript we have clarified that our sequenced sample was proportional to the number of positive cases reported in the respective Coastal Kenya counties (see Fig. 2E and Table 1).

      The De Maio method implements BASTA (BAyesian STructured coalescent Approximation) in BEAST for phylogeographic analysis, comparing the ability to discriminate a zoonotic reservoir versus the implausible alternative of cryptic human transmission. Analyses developed from these methods would be valid and interesting to apply to our dataset but would be a major new analysis and beyond the scope of the present paper. We have therefore taken the approach of: a) more clearly acknowledging sampling bias (see below) and b) undertaking sensitivity analyses (Supplementary File 5, see below). Using the larger global background sequence sets selected in a different way (more geographically balanced relative to the first round, which was random), we still find that most of the virus introductions into coastal Kenya occurred via Mombasa, consistent with our previous analysis.

      The results are consistent with the case numbers in that Mombasa (i) experienced an earlier peak during wave one relative to other counties, (ii) had in total more cases than each of the other five counties, and (iii) was commonly the first county of detection for many of the identified lineages in the region. However, relative to its population, the border county of Taita Taveta had a higher import rate (13.5 per 100,000 people) compared to that of Mombasa (11.6 per 100,000 people) (Table 4).

      Observations from our sensitivity analyses (Supplementary File 5) are included in the revised manuscript. We found that the absolute number of estimated viral imports/exports and intercounty transmission events fluctuated depending on the number of Coastal Kenya sequences and the size of the global comparison dataset, but with a clear pattern of (a) counted events increasing with sample size and (b) Mombasa County consistently leading in the number of events, whether imports or exports.
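      For readers who want to see how a per-100,000 import rate and a rate ratio are formed, a minimal sketch follows. The county names, event counts and populations in `counties` are hypothetical placeholders, not values from Table 1 or Table 4; only the two rates 13.5 and 11.6 are taken from the text, and the function names are illustrative.

```python
def rate_per_100k(events, population):
    """Events per 100,000 residents."""
    return events / population * 100_000


def rate_ratio(rate_a, rate_b):
    """How many times larger rate_a is than rate_b."""
    return rate_a / rate_b


# Hypothetical counties, used only to show the calculation.
counties = {
    "County A": {"imports": 120, "population": 1_200_000},
    "County B": {"imports": 45, "population": 300_000},
}

rates = {
    name: rate_per_100k(c["imports"], c["population"])
    for name, c in counties.items()
}

# County B has fewer imports in absolute terms but a higher rate per resident.
hypothetical_ratio = rate_ratio(rates["County B"], rates["County A"])

# Ratio of the two rates quoted in the text (Taita Taveta vs Mombasa).
reported_ratio = rate_ratio(13.5, 11.6)

print(rates, hypothetical_ratio, round(reported_ratio, 2))
```

      The hypothetical example mirrors the pattern described above: a county can lead in absolute import counts while a smaller county has the higher population-adjusted rate.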

      • Line 292: The results are of course subject to differences in sequencing rates in each of the countries listed, and differences in reporting of these data.

      This is a valid concern. To mitigate the bias that arises from these differences, in the revised analysis we have done the selection at country level, unlike in the previous comparison dataset, where we randomly selected a specified number of samples per month for each continent. We limited the comparison data to a maximum of 30 genomes per country per month per year. In this way, countries with high sequencing rates do not become overrepresented in our comparison dataset.
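      The capping rule described above can be sketched as follows. This is an illustration under stated assumptions, not the sub-sampling script used in the study: the record fields `country` and `month`, the function name `cap_per_country_month`, the use of uniform random selection within each group, and the toy data are all hypothetical. Only the cap of 30 genomes per country per month comes from the text.

```python
import random
from collections import defaultdict


def cap_per_country_month(records, cap=30, seed=0):
    """Keep at most `cap` records for each (country, year-month) group."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["country"], rec["month"])].append(rec)

    selected = []
    for key in sorted(groups):
        members = groups[key]
        if len(members) > cap:
            # Assumption: genomes within an oversized group are drawn uniformly at random.
            members = rng.sample(members, cap)
        selected.extend(members)
    return selected


# Toy data: one heavily sequenced country and one sparsely sequenced country.
records = (
    [{"id": f"X{i}", "country": "X", "month": "2020-05"} for i in range(100)]
    + [{"id": f"Y{i}", "country": "Y", "month": "2020-05"} for i in range(5)]
)

subsample = cap_per_country_month(records)
print(len(subsample))
```

      Using a year-month key such as "2020-05" keeps the same calendar month in different years in separate groups.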

      Some of these biases could be elicited through comparison to international travel data. For example, are the US and England also the top two countries from which most travellers arrive into Kenya? If such additional analyses are out of scope, it seems warranted to either strongly point to the substantial limitations of this analysis, or remove it altogether.

      We concur with the reviewer on the potential bias that could exist in conclusions that arise from inferring sources of importations based on genomic data alone, available from only a few countries. However, vital quality and curated international travel data into Kenya during the study period was not available to us at the time of this analysis. We have therefore agreed to remove the previous analysis on potential origins and destinations of observed Kenya lineages from the revised manuscript.

      What is perhaps striking is that Tanzania is entirely missing from this list, given extensive spread there. Another analysis that could be useful is a comparison of country specific lineage compositions, which might bypass some of the difficulties associated with substantial differences in sequence sampling/reporting rates.

      SARS-CoV-2 genomic data from Tanzania has not been publicly shared to date, and hence is not included. And as indicated above, we have removed the analysis that was trying to infer sources of SARS-CoV-2 importations into Kenya.

      To hypothesize on the potential lineages circulating in Tanzania, we have added a sentence detailing that 5 Pango lineages were identified among the 34 Tanzanian nationals who provided samples that were sequenced: B.1 (n=10), B.1.1 (n=10), B.1.351 (n=8), A (n=5) and A.23.1 (n=1).

      • Line 536: it seems problematic that the data used in the import/export analysis did not contain all available African sequences. Can these be included in the corresponding analysis please.

      In the revised manuscript we have included all accessible, good quality and contemporaneous African genomes (n=21,150). However, due to the huge computational processing power needed to run the phylogenetic analysis on such large sequence data sets, we split the analysis into two parts, each with approximately 10,000 genomes (see Figure 3-figure supplement 1).

      Notably with the increased sample size (including the analysis of 390 more genomes from coastal Kenya), we detected far more imports of SARS-CoV-2 into Coastal Kenya compared to our previous analysis (n=280 vs n=69) but only a modest change in exports (n=95 vs n=105) and inter-county virus movement events (239 vs 190).

      Reviewer #2 (Public Review):

      Agoti et al. analyzed SARS-CoV-2 samples collected from infected patients in coastal Kenya, collected between March 2020 and February 2021. This period spans the first two waves of COVID-19 in Kenya, and the authors aimed to understand the lineages circulating throughout the region, in comparison to the virus circulating elsewhere in Kenya and in the world. The manuscript is clearly written, and the figures and results are thorough and well described throughout. These data add to our understanding of COVID-19 in Kenya and in East Africa, and the discussion of how different lineages spread in Kenya (single clusters versus dispersed over several regions) is both interesting and potentially useful for informing public health measures.

      The analyses are well done and excellently presented, but this paper is significantly lacking in a discussion of how sampling bias may affect the stated conclusions. Additionally, the paper focuses almost exclusively on genomic data and fails to closely examine epidemiological factors that may better contextualize the results presented.

      We thank the reviewer for bringing this to our attention, we have added the paragraph below to the revised manuscript.

      “Sampling bias is a potential limitation of this study arising from the fact that (a) demographic characteristics (age distribution, travel history and nationality) of the sequenced versus non-sequenced sub-sample differed significantly, (b) <10% of confirmed SARS-CoV-2 infections in Coastal Kenya were sequenced, prioritizing samples with a Ct value of <30.0 (Table 1); (c) the Ministry of Health case identification protocols were repeatedly altered as the pandemic progressed (Githinji et al., 2021) and (d) sampling intensity across the six Coastal counties differed, probably in part due to varied accessibility of our testing center that is located in Kilifi County (Figure 1A and Table 1). This may have skewed the observed lineage and phylogenetic patterns. To better contextualize the genomic analysis results, close examination of the case metadata is important, but unfortunately there was a lot of the metadata was missing (e.g., travel history, nationality, Table 2) which made it hard to integrate genomic and epidemiological data in an analysis. Although all analyzed genomes had > 80% coverage, very few were complete or near complete (>97.5%, n=344) due to amplicon drop-off or low sample quality and this may have reduced the overall phylogenetic signal.”

      Specifically:

      1) The authors do not discuss the potential effects of sampling on their import/export analyses. For example, they find that the USA and England are in the top six country sources of SARS-CoV-2 importation into coastal Kenya, as well as in the top six country destinations of viral export from the region. These two countries have generated huge numbers of sequences compared to the rest of the world, which may clearly bias these findings. While the authors do evaluate the sensitivity of their analyses by repeating them with different global subsamples, it is unclear if these subsamples corrected for large discrepancies in available data from different parts of the world.

      We concur and appreciate that sampling bias is indeed a common limitation in the type of analysis we have undertaken given the variation in data collection across geographies. Some of the approaches we took to correct for this have been highlighted in our responses to reviewer #1.

      In the revised manuscript, we have undertaken a reanalysis with a larger and more representative dataset at all scales of observation (Figure 3-figure supplement 1). Specifically, for the global dataset, we have revised our sub-sampling script to pick the comparison dataset uniformly across months and countries for non-African countries. All the available African genomes have been included in our analysis, including 605 collected in Kenya outside the coastal region.

      Similarly, the authors find that new variant introductions were mainly through Mombasa city, but most of the Kenyan sequences were from this region, so it is perhaps unsurprising that more lineages were found there. The authors should repeat their analyses with a more representative global subsample, or at the very least discuss these caveats in the discussion and discuss what other evidence there may be to support their findings.

      Our sequencing rate by county is approximately proportional to the total number of cases seen in the county (Table 1 and Figure 2E). The revised manuscript includes 389 additional genomes from coastal Kenya that became available while the manuscript was under review.

      Thus, in the revised manuscript, we have addressed the valid sampling bias concerns of the reviewers and editor by: (i) increasing the number of analyzed genomes in our dataset for previously under-represented periods and regions, (ii) including contemporaneous Kenyan genomes from outside the coastal counties in our import/export analysis, (iii) including all available African genomes in the analysis and selecting a balanced global sub-sample for inclusion in the analysis. In addition, we have also provided a paragraph in the discussion section highlighting sampling bias as a caveat to interpretation of the findings of the current study:

      “The accuracy of the inferred patterns of virus importations to and exportations from coastal Kenya are in part dependent on both the representativeness of our sequenced samples for Coastal Kenya and the comprehensiveness of the comparison data from outside Coastal Kenya. Our sequenced sample was proportional the number of positive cases reported in the respective Coastal Kenya counties (Figure 2E and Table 1). Also, we carefully selected comparison data to optimize chances of observing introductions occurring into the coastal region (e.g. by using all Africa data). But still there remained some important gaps e.g. non-coastal Kenya genomic data was limited (n=605). Despite this, we think the results from ancestral state reconstruction indicating that Mombasa is a major gateway for variants entering coastal Kenya is consistent with (a) the county showing the highest number lineages circulating (n=28) during the study period compared to the other five remaining Coastal counties Kenya, (b) approximately half (n=21, 49%) of the detected lineages in coastal Kenya had their first case identified in Mombasa and (c) Mombasa had an early wave of infections compared to the other Coastal counties and (d) is the most well connected county in the region to the rest of the world (large international seaport and airport and major railway terminus and several bus terminus).”

      2) Restriction measures enforced by the Kenyan government are briefly introduced at the very beginning of the manuscript and then mentioned at the very end as a possible explanation for observed transmission patterns. However, there is very limited discussion of the potential effect of restriction measures throughout, and no formal analyses are presented using this kind of epidemiological information. Adding formal analyses to back up the hypothesis that relaxation of interventions may have driven the second wave of infections would make this paper much stronger and potentially more interesting.

      In the revised manuscript, we have detailed the restriction measures the government of Kenya put in place in the introduction, methods, and results sections and discussed, where appropriate, how we think they impacted the observed transmission patterns. We have added Supplementary Table 1, which provides the dates the various measures took effect or were relaxed.

      In a separate piece of work (Brand et al, 2021, published in Science, 10.1126/science.abk0414), we investigated the potential drivers of the first three waves of infection observed in Kenya, and we have appropriately referenced this in the revised manuscript.

      We feel that additional analyses on the impact of the restriction measures on SARS-CoV-2 epidemiology and the lineage patterns observed are beyond the scope of this work, whose focus was primarily genomic epidemiology.

      3) Generally, the text of the manuscript focused on waves of SARS-CoV-2 transmission, while the analyses presented data aggregated by month. A clearer connection between month and wave (particularly visually, on the figures themselves) would aid in interpretation of the data presented.

      This is a valid concern and a good suggestion. In the revised manuscript, for all temporal plots, we have added a line to demarcate the switch from the wave one to the wave two period. Similarly, for several analyses, we have provided aggregations by wave period rather than by month.

      4) One of the strengths of this manuscript is the depth to which the authors discuss the detection of specific lineages in coastal Kenya. However, there is limited discussion of these results in the context of when various lineages appeared or disappeared globally, though these details are presented in a table. Discussing the appearance of the various lineages (was it surprising to see a particular lineage at a certain time or in a certain place?) would also improve this manuscript.

      In the revised manuscript, we have compared the patterns of lineage detection locally with those for all of Kenya and for all continents in the newly added Figure 3. We have also discussed this aspect for the four most frequent lineages in both Wave one and Wave two.

    1. Author Response:

      Reviewer #1:

      Hauser et al. analyze two large datasets of GPCR-G protein interactions/couplings ("Inoue" and "Bouvier"), comparing and combining them with the widely-used literature-based Guide to Pharmacology (GtP) database. As the Inoue and Bouvier datasets were based on different experimental setups, this enables the identification of which couplings are supported by more than one method. The authors also establish a normalization protocol that makes it possible to move from qualitative to quantitative comparisons and identify couplings that might be either below or above a rigid threshold. Overall, the paper describes a new resource and the methodologies used to build this resource. The resulting coupling map is available through the GPCRdb website, a widely used resource in the field.

      The authors have thus improved the ability of researchers to assess prior results and compare them to their own new data. This resource clearly and significantly upgrades options currently available and will likely be of interest and prove quite useful to scientists both in academia and in industry.

      We thank the reviewer for so nicely describing the study and its prospective application.

      Weaknesses include:

      • The data is described mostly by broad numbers, such as the number of receptors or coupling in a subset, or percentages. While this is helpful to understand the data, this reviewer found it hard to follow the mountain of numbers. A suggestion would be to add a section where the authors pick selected examples of particular experimental data and show how their combine database can resolve previously unanswered (or wrongly answered) questions of GPCR/G protein coupling.

      We have removed numbers in several places throughout Results where we had included multiple measures e.g., absolute numbers and percentages. Furthermore, where an overall number has been broken down into distributions, e.g., across different G proteins of families thereof, we moved other numbers to parentheses.

      The sections of Results that answer questions of GPCR-G protein coupling are now presented more clearly: we have updated their headings and grouped them in a subsection of Results called “Research Advances – Insights on GPCR-G protein selectivity”. These sections are all based on our “combined database”/coupling map. In each such section, we start at the overall level, covering all GPCRs and/or G proteins, and then give selected examples that are woven into and exemplify the text. This approach has also been used in the new Results section “Differential tissue expression gives G proteins in the same family large spatial selectivity”, which gives selected examples of G proteins with specific tissue expression profiles.

      Given that the paper has already exceeded the maximum of 5,000 words by quite a bit, we think that this approach of weaving selected examples into each selectivity insight section is the most appropriate, and that it brings most clarity. Furthermore, we hope that readers will be inspired to use our coupling map to generate additional questions for future experiments.

      • The paper does not reveal new biological findings. For example, while some emphasis is placed on new data on G15, it would be helpful to take the extra step and use this to suggest new biological insights.

      eLife’s author guidelines (https://reviewer.elifesciences.org/author-guide/types) state that “Tools and Resources articles do not have to report major new biological insights or mechanisms, but it must be clear that they will enable such advances to take place, for example, through exploratory or proof-of-concept experiments.” In case this manuscript is published as a Tools and Resources paper, it may therefore be sufficient to provide the foundation for future studies to reveal new biological findings.

      Nevertheless, the coupling map led to biological findings relating to patterns and mechanisms of GPCR-G protein selectivity that were not described in the original studies. I.e., while this study did not generate new data, it arrived at new insights based on published data. This seems to be in line with eLife’s publication format “Research Advances” (https://reviewer.elifesciences.org/author-guide/types), and the Analysis format of several other journals. Some insights described herein have not been presented before while others have been updated in scope and precision. Furthermore, we have added a new section of Results with insights on G protein expression profiles and co-expression.

      We have clarified this by updating the headings of the sections that present these insights, and grouped them under a common subheading of Results termed “Research Advances – Insights on GPCR-G protein selectivity”. However, in case we have overlooked very recent studies describing some of the same biological insights, we would be grateful for their references and would be more than willing to revise the manuscript again to incorporate them. Furthermore, if the Reviewer finds that a particular analysis critical to understanding GPCR-G protein coupling is missing, please let us know.

      • The authors cautiously label couplings supported by only one dataset as "unsupported". It would seem more helpful to grade couplings by a reliability scale, providing users with a wider set of data. Perhaps only couplings that are directly conflicted by negative data should be labeled as unsupported?

      We understand that the term “unsupported” has been used in a confusing way. We have now replaced this term with “unique” and explained all terms in Table 1 of the revised manuscript.

      To address the need for a means to grade or filter couplings by reliability, we have added the following paragraph to the manuscript:

      “To enable any researcher to use the coupling map, we have made available a “G protein couplings” browser (https://gproteindb.org/signprot/couplings) in GproteinDb (2). By default, this browser only shows “supported” couplings with evidence from two datasets, but there is an option (first blue button) to change the level of support to only one (for most complete coverage of GPCRs) or to three (for the highest confidence) sources. We propose a standardized terminology to describe couplings based on their level of experimental support from independent groups (Table 1). The criterion of supporting independent data, and the terms “proposed” and “supported”, are already used by the Nomenclature Committee of the International Union of Basic and Clinical Pharmacology (NC-IUPHAR) for GPCR deorphanization. Furthermore, the online coupling browser allows any researcher to use only a subset of datasets, or to apply filters to the Log(Emax/EC50), Emax, and EC50 values. Finally, users can filter datapoints based on a statistical reliability score in the form of the number of SDs from basal response.”
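      As a rough illustration of the support-level filter described in the paragraph above, the following Python sketch counts how many datasets report each coupling and keeps those at or above a chosen threshold. It is not the GproteinDb implementation; the `Coupling` class, the `filter_by_support` function, and the receptor and G protein labels are hypothetical placeholders, and the records are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coupling:
    receptor: str        # placeholder receptor label (hypothetical)
    g_protein: str       # placeholder G protein label (hypothetical)
    sources: frozenset   # names of the datasets that report this coupling


def filter_by_support(couplings, min_sources=2):
    """Keep couplings reported by at least `min_sources` datasets."""
    return [c for c in couplings if len(c.sources) >= min_sources]


# Made-up records for illustration only; not taken from any dataset.
records = [
    Coupling("receptor_A", "G_x", frozenset({"GtP", "Inoue", "Bouvier"})),
    Coupling("receptor_A", "G_y", frozenset({"Inoue", "Bouvier"})),
    Coupling("receptor_B", "G_x", frozenset({"Bouvier"})),
]

default_view = filter_by_support(records)        # two or more sources
widest_view = filter_by_support(records, 1)      # most complete coverage
strictest_view = filter_by_support(records, 3)   # highest confidence
```

      Representing the sources as a set keeps the threshold independent of which particular datasets agree, which matches the idea of grading couplings by the number of independent groups supporting them.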

      Furthermore, we have added references to the online G protein coupling browser in the:

      (1) Introduction ending: “On this basis, we develop a unified map of GPCR-G protein couplings that can be filtered or intersected in GproteinDb …”

      (2) Fig. 2 legend ending: “Note: Researchers wishing to use this coupling map, optionally after applying own reliability criteria or cut-offs, can do so for any set of couplings in GproteinDb (1).”

      (3) Fig. S2 ending: “Unique couplings are hidden by default in the online G protein couplings browser in GproteinDb, as they await the independent support by a second group.”

      To many scientists the most reliable option is to involve NC-IUPHAR. Gloriam is a corresponding member of NC-IUPHAR, which has mentioned the possibility of involving its many worldwide pharmacological experts to update GtP on a case-by-case basis for receptors. For example, many of the “novel” couplings jointly supported by Bouvier and Inoue may be added. This option is advantageous as it involves experts in each receptor system (often with knowledge of other relevant studies) and is backed by the authoritative organization.

      • Given that this manuscript includes authors from both the Inoue and Bouvier studies, I can understand why they are not directly assessing which of the two datasets (in relation to the GtP) might be more accurate. Nevertheless, I believe this assessment should be done and that the advantages and disadvantages of the two experimental systems discussed clearly.

      We believe that the three-way intersection of couplings is the most informative and is therefore preferable to an individual comparison of each of the Inoue and Bouvier datasets to GtP. GtP is unfortunately not suitable as a stand-alone resource, neither to contradict nor to support couplings (on the G protein subtype level). This is because GtP is incomplete (especially for G12/13) and does not provide any information on the level of G protein subtypes, only families. The three-way intersection always uses GtP but adds a second dataset on top of this when validating a third dataset. Our manuscript already included a three-way intersection of datasets, allowing readers to conclude which dataset might be more accurate (then Fig. 3 and Spreadsheet 3) on a per-G protein basis.

      In the revised manuscript, we have rewritten this section, which now has the heading “Bouvier’s and Inoue’s biosensors appear more sensitive for G15, and Gs and G12, respectively”. We have also made a completely new figure, Fig. 7, which more clearly illustrates for which G proteins Bouvier and Inoue may have overrepresented or underrepresented couplings. This section specifically investigates the question of whether differential sensitivity can explain “unique” couplings. However, such unique couplings can either be due to overrepresentation or instead be true positives that are missing in GtP because of incompleteness and in the other biosensor due to lower sensitivity. Unfortunately, we will not be able to distinguish these possibilities until the research community has gained additional datasets from independent biosensors with equally high sensitivity.

      Whereas our study compares datasets rather than experimental systems, we have added a paragraph in the Discussion describing which aspects should be considered when choosing a biosensor. There, we reference a review from last year dedicated to biosensors and describing their pros and cons (3), and the accompanying paper by Bouvier et al. (4), comparing several aspects of the experimental system used by Inoue et al. It is also important to note that the most advantageous biosensor may not be one of the two for which data are analyzed in our paper. For many studies, researchers may instead be better off with another biosensor, for example those from Lambert/Martemyanov (5), Roth (2) (Gαβγ sensors first described in (6-11)) or Inoue (unpublished dissociation assays using wt G proteins fused with LgBit and HiBit). These are all referenced in the Discussion.

      References:

      1. Pandy-Szekeres G, Esguerra M, Hauser AS, Caroli J, Munk C, Pilger S, et al. The G protein database, GproteinDb. Nucleic Acids Res. 2022;50(D1):D518-D25. 10.1093/nar/gkab852
      2. Olsen RHJ, DiBerto JF, English JG, Glaudin AM, Krumm BE, Slocum ST, et al. TRUPATH, an open-source biosensor platform for interrogating the GPCR transducerome. Nat Chem Biol. 2020;16(8):841-9. 10.1038/s41589-020-0535-8
      3. Wright SC, Bouvier M. Illuminating the complexity of GPCR pathway selectivity – advances in biosensor development. Curr Opin Struct Biol. 2021;69:142-9. 10.1016/j.sbi.2021.04.006
      4. Avet C, Mancini A, Breton B, Gouill CL, Hauser AS, Normand C, et al. Effector membrane translocation biosensors reveal G protein and β-arrestin profiles of 100 therapeutically relevant GPCRs. bioRxiv. 2021:2020.04.20.052027. 10.1101/2020.04.20.052027
      5. Masuho I, Martemyanov KA, Lambert NA. Monitoring G Protein Activation in Cells with BRET. Methods Mol Biol. 2015;1335:107-13. 10.1007/978-1-4939-2914-6_8
      6. Gales C, Rebois RV, Hogue M, Trieu P, Breit A, Hebert TE, et al. Real-time monitoring of receptor and G-protein interactions in living cells. Nat Methods. 2005;2(3):177-84. 10.1038/nmeth743
      7. Gales C, Van Durm JJ, Schaak S, Pontier S, Percherancier Y, Audet M, et al. Probing the activation-promoted structural rearrangements in preassembled receptor-G protein complexes. Nat Struct Mol Biol. 2006;13(9):778-86. 10.1038/nsmb1134
      8. Schrage R, Schmitz AL, Gaffal E, Annala S, Kehraus S, Wenzel D, et al. The experimental power of FR900359 to study Gq-regulated biological processes. Nat Commun. 2015;6:10156. 10.1038/ncomms10156
      9. Breton B, Sauvageau E, Zhou J, Bonin H, Le Gouill C, Bouvier M. Multiplexing of multicolor bioluminescence resonance energy transfer. Biophys J. 2010;99(12):4037-46. 10.1016/j.bpj.2010.10.025
      10. Bunemann M, Frank M, Lohse MJ. Gi protein activation in intact cells involves subunit rearrangement rather than dissociation. Proc Natl Acad Sci U S A. 2003;100(26):16077-82. 10.1073/pnas.2536719100
      11. Janetopoulos C, Jin T, Devreotes P. Receptor-mediated activation of heterotrimeric G-proteins in living cells. Science. 2001;291(5512):2408-11. 10.1126/science.1055835

      Reviewer #2:

      This study is a meta-analysis of previously reported studies on G protein-coupled receptor (GPCR) coupling to G proteins. The data sets are from three distinct sources: a compendium compiled by the International Union of Basic & Clinical Pharmacology (IUPHAR), and two data sets compiled by two separate laboratories. Each of these data sets describes the coupling of members of the superfamily of non-sensory GPCRs (~200 genes) to the large family of G protein alpha subunits (~20 genes). The authors try to arrive at a consensus for receptor-G protein coupling from the three data sets, as well as identify and highlight differences or incongruencies. Compiling these vast data sets into a unified format will be extremely useful for investigators to understand receptor and effector relationships. The meta-analysis will help to deconvolute the complex physiology and pharmacology underlying hormone or drug actions acting on receptor superfamilies. A better understanding of receptor-G protein selectivity and/or promiscuity will ultimately help in identifying safer therapeutics.

      We appreciate the summary and the explanation of the usefulness of our meta-analysis and its potential impact.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This article focuses on one possible outcome of protein sequence evolution after duplication, in which the residue distribution at specific positions of a multiple sequence alignment becomes uncoupled from the distribution expected from the phylogeny of the protein family. The authors call these events "residue inversions" and interpret them as the result of functional pressures on family members with diverging cellular roles. Based on a theoretical model of residue evolution after duplication of the coding gene, the authors describe the criteria for categorizing a particular position in a protein as a "residue inversion" and develop an algorithm to identify such events in a multiple alignment. They then apply their approach to the family of Epidermal Growth Factor Receptors in Teleost fishes and identify 19 EGFR positions in a dataset of 88 fish genomes, which satisfy the criteria of "residue inversions". They provide support to the scoring scheme used in their approach through a simulated evolution run and conclude from a comparison of their positions to the ones predicted by SPEER to represent Specificity Determining Sites that the two are largely orthogonal and may therefore complement each other in sequence-based function prediction.

      Major comments: 1. Throughout the paper, the functional involvement of positions subject to "residue inversions" is indirect, inferred from the literature, and in parts sparse and tenuous. It therefore remains unclear to what extent the interpretation that "residue inversions" represent functional adaptations is correct. The authors acknowledge this uncertainty in several places, including the Conclusions.

      We agree with the reviewer that, without experimental validation, uncertainty about the interpretation of the data remains; however, testing protein function on a large scale and in non-model organisms is extremely challenging. Being aware of this obstacle, we validated our conclusions in three ways: (1) the theoretical model and the simulated MSA both show a lower chance of observing residue inversions than what we detected in the teleost fish EGFR example; (2) previous literature highlighted an identified inverted residue as the possible cause of sub-functionalization of teleost fish EGFR; (3) we generated AlphaFold models of teleost fish EGFR and performed molecular dynamics simulations of the two copies in complex with the ligand. In our simulations, we see the same trend that we observe with the inter-paralog inversions at the functional level. The new results have been integrated in lines 692-706.

      "Residue inversion" is a very unintuitive term, which took me several readings to penetrate and made reading the article difficult. The authors may wish to reconsider this term. Naively, a residue inversion would be the swapping of residues between two positions, such that a residue expected in position A is found in position B, while the residue expected in B is found in A. That is what I suspect most readers will think.

      We acknowledge that the terminology might be confusing. We therefore decided to refer to it as “inter-paralog inversion of amino acids” throughout the text.

      Is the phenomenon described here just a curiosity, or an important aspect of divergent evolution after duplication? The authors seem to be of two minds about it, calling the phenomenon "rare" in the Abstract, but an "important and understudied outcome of gene duplication" in the Introduction, then hedging again that it "might be rare" in the Conclusions. The benefits of recognizing such positions are also formulated with great caution, for example in lines 309-311: "In summary, the identification of residue inversion event has the potential to improve functional residue predictions".

      We agree with the reviewer that we have not yet tested the recurrence of this event on a large scale; however, this does not exclude the possibility that the event is frequent. This work is focused on the observation, characterization, and implications of this event. Considering this comment and the one below, we decided to perform a further analysis (see below for more details).

      Additionally, the analysis of the frequency of this event at the whole-organism scale on multiple organisms, while interesting, would be out of the scope of this paper, if only because it requires a totally different (large-scale) approach from the one used here. This type of analysis is also limited by the absence of a database collecting intermediate knowledge that would speed up the initial step of ortholog classification over a broad range.

      Finally, by rarity we mean the statistical chance of the event, not the actual chance of observing it in real data. We have corrected the text in light of the reviewer’s observation.

      OLD VERSION (ppXX):

      Our work uncovers a rare event of protein divergence that has direct implications in protein functional annotation and sequence evolution as a whole.

      NEW VERSION:

      Our analysis shows a new way to investigate an important and understudied outcome of gene duplication.

      It would probably strengthen the article substantially if the authors would (I) use their program to scan a large number of multiple alignments in order to establish more reliably how frequent this phenomenon actually is, and whether it is universal or a specific aspect of eukaryotic, maybe even only vertebrate evolution; and then (II) map the positions identified on structural models for the proteins, obtained by homology modeling or AlphaFold prediction, in order to substantiate their potential origin as functional adaptations.

      We thank the reviewer for the thoughtful suggestions. (I) We tested the inter-paralog inversion score at the proteome level using a reduced dataset (70) of reference teleost fish proteomes from UniProt. We obtained 54 proteins that duplicated in the teleost-specific whole-genome duplication, and then ran our pipeline on them. We found that the overall distribution of scores is more similar to the simulated evolution experiment than to the EGFR test case. We integrated the new results and discussion in a new paragraph and a new figure in lines 708-716.

      (II) We also considered the analysis requested in the second point. Unfortunately, we could not extract any meaningful data from the AlphaFold models.

      Reviewer #1 (Significance (Required)):

      A method to improve the functional annotation of proteins in a paralogous family would be very useful, given the abundance of sequence data.

      We thank the reviewer for acknowledging the importance of the question that we have addressed.

      I am knowledgeable in various aspects of molecular evolution and functional annotation. I am neither a mathematician, nor a developer of phylogenetic methods, so I cannot judge these aspects of the paper.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Review of Pascarelli and Laurino titled “Identification of residue inversions in large phylogenies of duplicated proteins”

      I find the topic of the paper very exciting and long overdue. Indeed, I was under the impression that the question of parallel evolution in paralogous copies must have been addressed long ago: to my surprise, having looked in depth at the literature, that is only partially so. The manuscript, therefore, addresses a relatively novel and fundamental question of broad interest.

      We thank the reviewer for his positive comment.

      Having said this, I also found the manuscript to suffer from an identity problem, which in many places encroaches on the underlying quality of the science. I will structure my review into three concerns: the identity issues, the novelty issue and the emergent quality issues from the two.

      Identity issues:

      The manuscript is primarily dealing with an evolutionary issue – or I am biased to see it this way as an evolutionary researcher myself. Nevertheless, much of the language and terminology of the paper either misuses evolutionary terms or invents new ones in its place with a bias towards a protein chemistry perspective. Specifically, what the authors call “residue inversions” is called “parallel evolution” or “convergent evolution” in the literature. Also, "residues" are typically used for physical amino acids in a structure. If we are talking about the sequence level, “amino acid” would be a better term. The issue is further confounded by the meaning of “inversion” in genetics as a single mutation that inverts the position of nucleotides (i.e. an “AT” becomes “TA”).

      I strongly recommend for the authors to become familiarized with the common usage of existing and widely used terms in evolutionary biology that describe the phylogenetic patterns they see: parallel evolution, convergent evolution, homoplasy, etc, and to use them consistently throughout the manuscript.

      The same goes for "mutation", which the authors confuse on two levels: evolutionary and biochemical. Sometimes the authors refer to “mutation” of amino acids (which can be entertained at some level, but from a genetic perspective only nucleotides mutate – in the protein biochemistry field this term is frequently applied to amino acid residues, which is the basis of the identity issue). However, since the authors also use “mutation” to refer to a “substitution” (which is what we call a mutation that has become fixed in evolution) this creates another level of confusion. I urge the authors to change this aspect of the language of the manuscript to better reflect evolutionary concepts.

      As part of the language issues I am not sure how meta-functionalization in the author’s view differs either from neofunctionalization or specialization of duplicated genes.

      We thank the reviewer for pointing out the terminology issue; addressing it will also help us reach a broader audience. We resolved the confusion surrounding the terms “mutation” and “residue inversion” by changing the former to “substitution” and the latter to “inter-paralog inversions” (see also the other reviewer’s comments).

      We understand the importance of using the correct terms to describe this event in protein sequence evolution. Therefore, we used “convergent evolution” and “parallel evolution” accordingly when we discussed the nuances between metafunctionalization and parallel evolution in the text, in lines 188 and 399.

      Novelty issues:

      As I mentioned, the issue of parallel evolution of gene duplications is an extremely interesting topic. I was sure that the people who studied parallel evolution, or those interested in gene duplications, must have published extensively on this. However, my search of the literature revealed only a modest pre-existing effort. Nevertheless, previous efforts are not entirely non-existent and should be cited and discussed in this paper too. The most pertinent example is

      https://bmcecolevol.biomedcentral.com/articles/10.1186/s12862-020-01660-1

      which has an identical setup from what I can tell (compare Figure 1 in each paper).

      This paper was not hard to find using "parallel evolution", thus my focus on the language issues in the previous section.

      We thank the reviewer for his suggestion; we included the relevant papers in the text in lines 520-523. Interestingly, the cited paper shows that a comprehensive analysis of the fate of duplicated genes at the sequence level has been done. However, in that paper, the ‘fate’ of a paralog is determined by counting the number of sites that support one or the other fate, independently of the orthologous relationship. In our study, we start from the orthologous relationship to pre-determine the fate of the paralogous protein, then we identify the sites that break this assumption. Our type of analysis is expected to work only where the orthologous relationship is unequivocal. That is why we chose an example with relatively short branch lengths after duplication (the teleost-specific duplication). Our rationale is that, with higher genome coverage across organisms, resolving the orthologous relationship will become easier over time. However, our study focuses on a distinct case (asymmetric divergence) where the diverging paralogs converge to the same phenotype. In such a case, neutral substitutions related to the ancestral relationship of a protein can be filtered out to better search for functional adaptations.

      Content issues:

      The lack of attention to evolutionary concepts, in my opinion, provided some missed opportunities for the authors to attack the problem in a more convincing fashion. Specifically, in the setup to distinguish between parallel evolution of paralogues versus orthologues ("inversion" versus "species-specific adaptation" in the author's text) one must be able to distinguish between the two copies and assign true evolutionary relationship. In practice, that is not always possible based on tree lengths or topologies alone because of confounding factors such as independent duplications or gene conversion events.

      I would feel better about the results of this study if the following two things were integrated.

      The use of synteny to better determine homologous relationships (declare copies to be true paralogues if they occupy the same syntenic region). To compare the frequency of parallel evolution of paralogues versus orthologues as a null model of the expected number of parallel events in paralogous copies.

      We agree that a synteny analysis has to be included. We tested it for the EGFR proteins in fish and the results support the orthologous relationship of EGFRa and EGFRb in the two groups compared (Cypriniformes versus other teleosts). The results were included in the text and in the Supplementary figure in lines 303-305.

      The second point targets the way the model derives the expectations: by the authors' own admission the model makes a number of unrealistic assumptions, "1) equal branch length between the two paralogs; 2) only zero to one mutation can occur in each of the six branches; 3) after a mutation, each residue is equiprobable; 4) no selective pressure; 5) the probability of a mutation on a branch solely depends on the branch length (mutation rate)". The authors do not really test the resulting tree on deviation from these assumptions (I am sure that it does not conform) but essentially comparing the occurrence of parallel events in paralogues versus orthologues may solve the problem with a less restrictive set of assumptions (that one expects an equal number of parallel events in paralogues and orthologues unless there is some paralogue-specific selection pressure, which is what the authors are looking for).

      We compared the occurrence of the two outcomes in both the simulation and the real data. In all cases, the two score distributions have a very similar shape, with 99th percentile scores of 0.062 and 0.113, respectively. Most sites in an alignment (>99%) are not expected to be inverted and will have scores very close to 0, making the identification of inversions a search for outliers. Furthermore, in the case of the real data, each distribution can be independently affected by different selective pressures that might bias the background distribution. While inversion in paralogs is expected to involve a few functional residues, inversion in orthologs is expected to have a broad effect. For example, a temperature adaptation might shift the number of polar residues on the protein surface (see for example: https://academic.oup.com/peds/article/13/3/179/1466666). Also, a different protein chosen for analysis might generate a different background distribution of the two events. In the larger dataset, the two distributions are even more similar (99th percentiles of 0.07 and 0.08). Because of the similarity of the two event distributions, and the possible issues with different selective pressures, we leave the analysis suggested by the reviewer as a post-processing step that the user can perform. We report a summary of this result, which arose from the reviewer’s observation, in line 478.
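      To make the outlier framing in the response above concrete, here is a minimal Python sketch of a percentile-based cutoff on per-site scores. It is an assumption-laden illustration rather than the authors' pipeline: the function names `percentile` and `outlier_sites`, the nearest-rank percentile definition, and the synthetic scores are all hypothetical choices.

```python
def percentile(values, q):
    """Nearest-rank percentile of a non-empty sequence; q is an integer 0-100."""
    ordered = sorted(values)
    # Integer ceiling of q * n / 100, kept at 1 or above so q = 0 is valid.
    rank = max(1, (q * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def outlier_sites(scores, q=99):
    """Return indices of alignment sites whose score exceeds the q-th percentile."""
    cutoff = percentile(scores, q)
    return [i for i, score in enumerate(scores) if score > cutoff]


# Synthetic scores: 199 sites at 0 and a single high-scoring site.
scores = [0.0] * 199 + [0.9]
flagged = outlier_sites(scores)
```

      Applying the same function separately to the paralog and ortholog score lists would give two cutoffs that can be compared side by side, in the spirit of the post-processing left to the user.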

      In summary, I believe that the topic is very interesting, the authors potentially found a new aspect of evolution of a specific gene family. However, in my opinion a major revision is needed to unite this text with the terms in the field, the previous publication and to integrate the two additional analyses I suggested.

      Minor Comments:

      I started adding these specific comments before generalizing the broader deviation from the common evolutionary language. There are more further along in the manuscript, but in the interest of time I will not articulate them here hoping that the authors will first try a major revision targeting these issues.

      Line 64: While neutral mutations help to determine the phylogenetic position of a protein, mutations of functional residues are a signal of functional shifts that might occur independently of the phylogeny. - this is quite misleading. All substitutions (neutral or beneficial) have a phylogenetic signal. In any case, this is discussed here in phylogenetic terms: https://pubmed.ncbi.nlm.nih.gov/10742039/

      We corrected the sentence to refer to divergence time instead of phylogenetic signal.

      OLD VERSION:

      While neutral mutations help to determine the phylogenetic position of a protein, mutations of functional residues are a signal of functional shifts that might occur independently of the phylogeny.

      NEW VERSION:

      While neutral substitutions are directly proportional to the time of divergence, a change in functional residues could be a signal of a functional shift that might occur independently of the divergence time.

      Line 107: "under high evolutionary pressure" - I do not know what evolutionary pressure is nor why it can be high or low.

      We corrected the term to “selective pressure”.

      OLD VERSION:

      Lorin et al. showed that both copies of EGFR might have been retained because they are involved in the complex process of skin pigmentation (40), which is under high evolutionary pressure in most fish.

      NEW VERSION:

      Lorin et al. showed that both copies of EGFR might have been retained because they are involved in the complex process of skin pigmentation (40), a trait that is under selective pressure in most fish.

      Line 112 "linearly inherited across orthologs" - linear is a poor choice of a word here. The first thing that comes to my mind is quadratic inheritance as an alternative. Perhaps the authors are looking for "vertical" versus "horizontal" - these are established terms in phylogenetics (think "horizontal gene transfer").

      We corrected the term to “vertically inherited”.

      OLD VERSION:

      Therefore, the power to predict functional residues is limited by our ability to track protein function on the phylogenetic tree when it is not linearly inherited by orthologs.

      NEW VERSION:

      Therefore, the power to predict functional residues is limited by our ability to track protein function on the phylogenetic tree when it is not vertically inherited by orthologs.

      It is my invariant practice to reveal my identity to the authors,

      Fyodor Kondrashov

      Reviewer #2 (Significance (Required)):

      Addressed in the above

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Review of Pascarelli and Laurino titled "Identification of residue inversions in large phylogenies of duplicated proteins"

      I find the topic of the paper very exciting and long overdue. Indeed, I was under the impression that the question of parallel evolution in paralogous copies must have been addressed long ago: to my surprise, having looked in depth at the literature, that is only partially so. The manuscript, therefore, addresses a relatively novel and fundamental question of broad interest.

      Having said this, I also found the manuscript to suffer from an identity problem, which in many places encroaches on the underlying quality of the science. I will structure my review into three concerns: the identity issues, the novelty issue and the emergent quality issues from the two.

      Identity issues:

      The manuscript is primarily dealing with an evolutionary issue - or I am biased to see it this way as an evolutionary researcher myself. Nevertheless, much of the language and terminology of the paper either misuses evolutionary terms or invents new ones in their place with a bias towards a protein chemistry perspective. Specifically, what the authors call "residue inversions" is called "parallel evolution" or "convergent evolution" in the literature. Also, "residues" are typically used for physical amino acids in a structure. If we are talking about the sequence level, "amino acid" would be a better term. The issue is further confounded by the meaning of "inversion" in genetics as a single mutation that inverts the position of nucleotides (i.e. an "AT" becomes "TA").

      I strongly recommend for the authors to become familiarized with the common usage of existing and widely used terms in evolutionary biology that describe the phylogenetic patterns they see: parallel evolution, convergent evolution, homoplasy, etc, and to use them consistently throughout the manuscript.

      The same goes for "mutation", which the authors confuse on two levels: evolutionary and biochemical. Sometimes the authors refer to "mutation" of amino acids (which can be entertained at some level, but from a genetic perspective only nucleotides mutate - in the protein biochemistry field this term is frequently applied to amino acid residues, which is the basis of the identity issue). However, since the authors also use "mutation" to refer to a "substitution" (which is what we call a mutation that has become fixed in evolution) this creates another level of confusion. I urge the authors to change this aspect of the language of the manuscript to better reflect evolutionary concepts.

      As part of the language issues I am not sure how meta-functionalization in the author's view differs either from neofunctionalization or specialization of duplicated genes.

      Novelty issues:

      As I mentioned, the issue of parallel evolution of gene duplications is an extremely interesting topic. I was sure that the people who studied parallel evolution, or those interested in gene duplications, must have published extensively on this. However, my search of the literature revealed only a modest pre-existing effort. Nevertheless, previous efforts are not entirely non-existent and should be cited and discussed in this paper too. The most pertinent example is

      https://bmcecolevol.biomedcentral.com/articles/10.1186/s12862-020-01660-1

      which has an identical setup from what I can tell (compare Figure 1 in each paper).

      This paper was not hard to find using "parallel evolution", thus my focus on the language issues in the previous section.

      Content issues:

      The lack of attention to evolutionary concepts, in my opinion, provided some missed opportunities for the authors to attack the problem in a more convincing fashion. Specifically, in the setup to distinguish between parallel evolution of paralogues versus orthologues ("inversion" versus "species-specific adaptation" in the author's text) one must be able to distinguish between the two copies and assign true evolutionary relationship. In practice, that is not always possible based on tree lengths or topologies alone because of confounding factors such as independent duplications or gene conversion events.

      I would feel better about the results of this study if the following two things were integrated.

      The use of synteny to better determine homologous relationships (declare copies to be true paralogues if they occupy the same syntenic region). To compare the frequency of parallel evolution of paralogues versus orthologues as a null model of the expected number of parallel events in paralogous copies.

      The second point targets the way the model derives the expectations: by the authors' own admission the model makes a number of unrealistic assumptions, "1) equal branch length between the two paralogs; 2) only zero to one mutation can occur in each of the six branches; 3) after a mutation, each residue is equiprobable; 4) no selective pressure; 5) the probability of a mutation on a branch solely depends on the branch length (mutation rate)". The authors do not really test the resulting tree for deviation from these assumptions (I am sure that it does not conform), but simply comparing the occurrence of parallel events in paralogues versus orthologues may solve the problem with a less restrictive set of assumptions (that one expects an equal number of parallel events in paralogues and orthologues unless there is some paralogue-specific selection pressure, which is what the authors are looking for).
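      The comparison suggested above can be sketched as an exact binomial test. This is only an illustration, not the authors' or the reviewer's method. It assumes that paralogous and orthologous comparisons offer the same opportunity for parallel events (same number of sites and comparable branch lengths), so that under the null each observed event is equally likely to fall in either class. The function name and the counts in the usage lines are hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k out of n trials with success chance p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small tolerance so that ties with the observed outcome are included.
    threshold = pmf[k] * (1 + 1e-9)
    return min(1.0, sum(q for q in pmf if q <= threshold))


# Hypothetical counts of parallel events, for illustration only.
paralogue_events = 18
orthologue_events = 6
p_value = binomial_two_sided_p(
    paralogue_events, paralogue_events + orthologue_events
)
print(f"two-sided p-value under the equal-expectation null: {p_value:.4f}")
```

      If the two classes do not offer equal opportunity, the null proportion `p` would need to be adjusted to the ratio of comparisons available in each class rather than left at 0.5.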

      In summary, I believe that the topic is very interesting, the authors potentially found a new aspect of evolution of a specific gene family. However, in my opinion a major revision is needed to unite this text with the terms in the field, the previous publication and to integrate the two additional analyses I suggested.

      Minor Comments:

      I started adding these specific comments before generalizing the broader deviation from the common evolutionary language. There are more further along in the manuscript, but in the interest of time I will not articulate them here hoping that the authors will first try a major revision targeting these issues.

      Line 64: While neutral mutations help to determine the phylogenetic position of a protein, mutations of functional residues are a signal of functional shifts that might occur independently of the phylogeny. - this is quite misleading. All substitutions (neutral or beneficial) have a phylogenetic signal. In any case, this is discussed here in phylogenetic terms: https://pubmed.ncbi.nlm.nih.gov/10742039/

      Line 107: "under high evolutionary pressure" - I do not know what evolutionary pressure is nor why it can be high or low.

      Line 112 "linearly inherited across orthologs" - linear is a poor choice of a word here. The first thing that comes to my mind is quadratic inheritance as an alternative. Perhaps the authors are looking for "vertical" versus "horizontal" - these are established terms in phylogenetics (think "horizontal gene transfer").

      It is my invariant practice to reveal my identity to the authors,

      Fyodor Kondrashov

      Significance

      Addressed in the above

    1. Is this all right, or are there other people that, in this case, you would rather be paired with for whatever reason—even if that reason is only for breaking up the appearance of possible racism; since the appearance of possible racism can be just as much a factor in reproducing and promoting racism as anything else: Racism is as much about accustoming people to becoming used to certain racial configurations so that they are specifically not used to others, as it is about anything else. Indeed, we have to remember that what we are combatting is called prejudice: prejudice is pre-judgment—in this case, the prejudgment that the way things just happen to fall out are “all right,” when there well may be reasons for setting them up otherwise.

      I guess in an SF world the thought of having a diverse appearance is important in order to combat prejudice, but in the world we live in today I don't know if that's always the case. How many ads do you see today with people of color for companies who in the past haven't always been the most diverse (the film industry)? It's like white people have found a way to hide their red hands behind our colored faces for the perception of diversity. Who is diversity really helping if white people are the ones cashing in? Here is a list of companies that appear to promote diversity and inclusion but actually don't: Amazon, Apple, Bank of America, Cisco, Facebook, KFC, Lego, L'Oreal, Louis Vuitton. All of these companies have spent a lot of money to appear one way, but when you look at who is really profiting on their boards of directors, they are all white and mostly men. To be clear, I think Delany's point is valid and I agree with it, but a lot has happened since and we can't continue to be fooled by face value. Forget having a seat at the table if it's only for perception.

    1. Reviewer #1 (Public Review):

      The premise of this paper is that a significant amount of microbial diversity might be maintained not purely through resource partitioning, as has been the thrust of multiple recent papers over the last few years, but perhaps also through "physical" differences between organisms---here manifested by the detachment rate of heterotrophic bacteria from resources in the form of particulate matter. I completely agree with that premise, and agree that this is an underexplored niche axis that is important to account for when seeking to understand coexistence and diversity.

      As with any mathematical model, the assumptions made are critical to get right, and different assumptions about the details of resource uptake, dispersal, and competition may lead to different conclusions. So my comments primarily relate to some of these mathematical choices, as well as to their explanation in the text.

      -- In framing the paper, I think the authors are right to focus on dispersal and detachment as under-explored mechanisms. But readers will benefit from reference to other work (even on particle-associated microbes) related to resource diversity, succession, and crossfeeding. That can only help put the current study in context with other mechanisms for the maintenance of microbial diversity.

      -- There is a population growth process when a cell settles on a new particle. This is assumed to be logistic growth, though in the end, it seems likely that the precise dynamics of the growth process don't matter so much as the final abundance (carrying capacity). However, this seemed subtle to me for three reasons:

      (i) Will detachment rate directly affect carrying capacity?

      (ii) Is carrying capacity occurring when microbes fill out the surface of a particle, or when they have eaten the entire volume of a particle?

      (iii) If the former, will cells continue to be shed from the particle as growth continues approximately linearly?

      It's possible that none of this matters too much if all that's important is a final population size. However, it might help to clarify the process for readers if we have a conceptual picture of what this final population size represents (surface of particle being filled? or volume of particle entirely eaten up) and if there is a truer picture of the dynamics than logistic growth.
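      Question (i) can be made concrete with a small worked equation. This is a sketch under an assumption that is not taken from the manuscript: detachment is treated as a per-capita loss rate $\delta$ added to logistic growth with intrinsic rate $r$ and carrying capacity $K$. Under that assumption the equilibrium population on a particle is lowered by detachment:

```latex
\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right) - \delta N
\quad\Longrightarrow\quad
N^{*} = K\left(1 - \frac{\delta}{r}\right), \qquad 0 \le \delta < r .
```

      In this toy form the final population size depends on the detachment rate even when $K$ itself is fixed, which is why it matters whether $K$ stands for a filled surface or a consumed volume.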

      -- The relationship between the trade-off (between different detachment rates) derived in Eq 2 versus the optimal detachment rate (derived in the methods) is framed a little confusingly. If I understand correctly, the "trade-off" actually comes from the condition that a population will have net non-negative growth rate in the absence of other populations with different strategies. So it may be reasonable to frame this as a threshold---a necessary condition rather than a sufficient condition for a given population to persist. The reason I say this is that it is a bit confusing to have a trade-off that suggests a range of detachment rates can coexist so long as they differ in their carrying capacities, since it is then stated that the optimal detachment rate outcompetes all the others. Maybe I misunderstood something important being assumed about the carrying capacity for the optimal case, but a trade-off that also has an optimum is an odd outcome.

      -- In the end, it seems critical, for multiple strategies to be maintained in the population, that there is not only whole-particle mortality (which in effect is highly correlated catastrophic dynamics for an individual microbial population), but also that the inflow of resources itself fluctuates. Did I interpret that correctly? Readers may appreciate a slightly clearer description of how this environmental stochasticity differs from the previous possibility of whole-cell mortality, and this also left me wondering how to quantify the kind of environmental stochasticity that will generally lead to multiple strategies coexisting.

      -- In summary, I think this is a terrific idea and promising analysis that will bear fruit. But I also wanted to understand how robust is the outcome of coexistence to the various assumptions in the model.

    1. As we research, we may find ourselves returning to and changing our question, or we may near the end of a project and think we’re done but discover we need to go back to find more or better sources. The messiness of research requires us to be flexible,

      This passage shows how it is acceptable and even accepted to have to change your research as you go along. This makes research feel more free and creative rather than strict and boring.

    2. Like a daisy’s petals, research is described as cyclical and fluid. As we research, we may find ourselves returning to and changing our question, or we may near the end of a project and think we’re done but discover we need to go back to find more or better sources. The messiness of research requires us to be flexible, often modifying our approaches along the way.

      I can attest to this as many times before I have changed my original question or approach after uncovering some bits of research.

    1. Author Response

      Reviewer #1 (Public Review):

      This is a very solid and exciting study.

      We thank the reviewer for finding our study to be very solid and exciting.

      I have several suggestions, comments and questions:

      1. The authors focused on examining the role of C129 as a regulator of PTPN22 redox sensitivity based on a published crystal structure of the catalytic domain. It would be great if they could demonstrate the existence of the disulfide bond between C129 and C227 also experimentally (in T cells).

      As we understand it, it is requested that the disulfide bond between C227 and C129, as previously suggested by Tsai et al. (2009) (1) with pure protein, should be documented to actually occur in the activated T cells. We fully agree that this would improve the study and we have therefore made several attempts to demonstrate this oxidation, or the oxidation state of the active site Cys residue in PTPN22 in situ. However, as we had also expected, it has proven to be technically very challenging. Nevertheless, as the functional consequence of the PTPN22 oxidation and the effect of the C129S mutation is clearly documented in the mouse, using in vivo experiments, we still think it is valid to conclude that the reversible oxidation state of PTPN22 as well as the involvement of the Cys129 residue regulates the function of PTPN22 in vivo, which is the main conclusion of our study.

      1. To this end, there are other cysteine residues in the vicinity of C227 such as the C231 that might be involved in the redox regulation of PTPN22. The authors should at least discuss their possible involvement.

      It is correct that Tsai et al. (2009) (1) found that mutating C231 to serine dramatically reduced phosphatase activity, thus suggesting its importance in catalysis. Reactivation assays showed higher reactivation rates for C231S mutants, and they suggested that C231 suppresses reactivation in a reducing environment by competing with C227 for reduction in the catalytic pocket. Therefore, C231 could also be a target for negative regulation of PTPN22. However, our project was from the start limited to the intention of studying whether PTPN22 could be shown to be redox regulated in vivo through modification of key cysteine residues, and the aim has not been to give the full picture of how the molecule is regulated. We have now extended this point in the discussion in the paper.

      1. How is mutation of C227 affecting T cell function? Are the effects similar to those of C129S?

      This would be interesting, but creating another transgenic C227S mouse to analyze whether the cysteine at position 227 also regulates T cell activation is outside the scope of the study. As said above and clearly described in the study, we have focused on the redox-mediated effects through C129 and hope that the reviewer can agree with us that this rather focused study is solid and fully sufficient for publication on its own merits.

      1. Although the in vitro evaluation of the PTPN22 activity is of highest quality, it would be good to demonstrate that C227 redox status is modified under physiological conditions. 25-100 µM H2O2 is a high concentration that might not be reached within a cell and might be lethal for T cells.

      See response to point 1.

      1. C129 seems not to be mutated in patients with autoimmunity but is an excellent tool to test the importance of C227 redox regulation and the findings of this study suggest that its over-oxidation will support autoimmune responses. When considering the clinical relevance of the study, a drug that will protect the oxidation of the catalytic cysteine and/or stabilize the disulfide bond would have beneficial effects. The authors could test such pharmacological modulators in isolated T cells.

      Indeed, such modulators would be very interesting to test; however, developing such drugs can hardly be demanded to be within the scope of this study. We have however included a statement on this topic in the Discussion of the manuscript.

      1. The authors discuss that NOX2-derived ROS most likely originate from antigen presenting cells. I fully agree with this discussion. However, some studies have proposed that NOX2 plays an important role also in T cells, a finding which was not confirmed by other following studies. It would be great if the authors could address this controversial issue in regards to their findings.

      We believe that the finding that the ROS that modify PTPN22 come from the interacting APC rather than from the T cell itself is very important. However, we have not made a major point of this as we have shown that aspect before in other studies, and we wanted in the current paper to focus on the take-home message that PTPN22 could hereby be shown to be redox regulated in vivo. However, the last word about the source of ROS has not been said. The controversy whether the Ncf1-containing NOX2 complex is functionally expressed in T cells stems from the paper by Jackson et al. in Nat Immunol 2004 (2). We have not been able to reproduce those findings and in addition we have never detected a NOX2-dependent response in pure T cells, which has also been shown in several of our papers. There are certainly many pitfalls: contaminating NOX2-expressing cells, NOX2-containing exosomes and peroxides, and even NOX2 complexes picked up by interactions with antigen presenting cells. However, it is dangerous to completely exclude that Ncf1 could be expressed at minimal levels or to exclude that functional NOX2 complex can indeed be formed in T cells, and we all know that minute levels of any peroxide as produced by cells could have an impact on cellular functions. But, based on the present knowledge, we conclude that T cells do not functionally express Ncf1-containing NOX2 complexes. We have now added two references to clarify this point (3, 4; refs. 38 & 39 in the manuscript).

      1. Fig. 1: Is the addition of bicarbonate affecting the pH and thus the activity of PTPN22?

      No, we believe that addition of bicarbonate is not acting by an altered pH but is instead required for formation of peroxymonocarbonate when reacting with H2O2, which is subsequently the molecular species that bypasses the cellular antioxidant systems in order to oxidize the active site Cys residues of target PTPs. This was shown by us in an earlier publication (Dagnell et al, ref. 11 in the manuscript) (5) and a sentence has now been added in the Discussion to further emphasize this point.

      1. The H2O2 concentration dependence of PTPN22_C129S should also be shown as for WT (see Fig. 1B)

      We agree with the reviewer that titration of the mutant with additional H2O2 concentrations could potentially have been done, but we thought that the comparison of WT and C129S enzyme side-by-side using either 0 µM, 25 µM or 50 µM as in Fig. 1D was a sufficient comparison in H2O2 sensitivity. Unfortunately, we do not have the possibility to analyze more purified C129S mutant protein at the moment and it would require a major effort to run those additional experiments. We therefore hope that the reviewer will agree that the data as currently presented are sufficient.

      1. Quantification of the slope based on only 3 measuring points is not accurate (Fig. 1D).

      Each data point in those curves represents the mean ± S.D. derived from duplicate samples run three different times, with clearly very low standard deviations. Thus, we believe that the data are reliable and that the statistically significant difference when comparing the slopes between WT and the C129S mutant, as shown in the figure, should be trustworthy.

      1. The pinna thickness measurements shown in Fig. 3B and C suggest that in NCF1 mice C129S has no effect. However, the thickness in NCF1 mice is already much higher than in WT mice (compare B and C). Does this mean that NOX2-derived ROS are the only factor that affects C227 redox properties?

      The effects of the decreased ROS due to the Ncf1 mutation are likely to have consequences for the functions of many proteins, in different pathways, and not only of PTPN22. The sum effect is that the Ncf1 mutated mice respond more strongly than the wild type, which explains the difference. However, the main message here is that if there is no ROS from the NOX2 complex, the effect of the PTPN22 mutation is lost.

      1. The results shown in Fig. 5D could be moved to a supplementary figure.

      We prefer to keep it within Fig 5 as it is more logical in the context of the other parts of this figure. Of course, if there is a space layout problem, we can consider moving it.

      1. The calcium measurements are not convincing and the differences are rather small. The y axis labels show 50K, 100K etc. Are these ratio values? If yes, the imaging settings need to be optimized. Why is the mutant labeled as Pep? How is the C129S affecting calcium signaling? These observations need to be examined in more detail, or maybe calcium is not playing an important role.

      We agree that the differences in calcium measurements are not very large but have nevertheless been repeated several times, and there is a significant difference as shown. The calculation is done on the slope of the curve, which is independent of the absolute values given on the y-axis. We agree that the figure was not properly labeled and have now changed this.

      1. I would suggest a more extensive evaluation of the proteomic data presented in Fig. 6D. The results might be very exciting and can further increase the impact of this study.

      We fully agree with this. We have chosen not to go into details of the results of the proteomic analysis. The data shown confirms our conclusion and we did not plan to identify the downstream targets of the PTPN22 oxidative regulation. Highlighting some of these targets will require biological confirmation, which can be done but must await future work. The full dataset has however been deposited in PRIDE for any reader interested to analyze the results further.

      1. Is 24h BSO treatment not toxic for the T cells (ferroptosis)?

      We have not seen any evidence for toxicity upon the BSO treatment of T cells in vitro, which however has been more thoroughly checked by others. Gringhuis et al (JI, 2000) (6) have shown immunofluorescence staining on T cells 72 hours post BSO treatment with intact cell membranes. Additionally, Carilho et al. (Chem. Cent. J., 2013, 7:150) (7) noted no changes in Jurkat T cell viability after 24 hours at a maximum dose of 100 µM BSO.

      Reviewer #3 (Public Review):

      The manuscript by James, Chen Hernandez et al. reveals a novel function for PTPN22 oxidation in T-Cell activation. The authors used a broad array of methods to demonstrate that PTPN22 is catalytically impaired in addition to being more sensitive to reversible oxidation in vitro. In the characterization process, the authors found that PTPN22 could be directly reduced by Thioredoxin Reductase and that oxidation of PTPN22 could be easily monitored by the appearance of a faster migrating band in non-reducing gels. Supporting the hypothesis that the catalytic Cysteine forms a disulfide with a backdoor Cysteine (Cys129), the authors found that this C129S mutant is prone to oxidation and cannot be reduced back to its active form by Thioredoxin Reductase. Using a new mouse model in which this key Cysteine of PTPN22 is mutated to a Serine residue (PTPN22C129S mutant) and can presumably not form a stabilizing redox intermediate between the catalytic Cys residue and this backdoor Cys (C227-C129), the authors study how the oxidation prone mutant affects T-Cell activation. The authors find that the C129S mutant mouse showed an increased T-Cell dependent inflammatory response that was dependent on activation of the reactive oxygen species-producing enzyme NOX2. This data adds an interesting redox twist to the function of PTPN22 in T-Cells that contributes to the conversation on the protective effects of reactive oxygen species against inflammatory diseases in vivo.

      Strengths:

      The in vitro characterization of the WT and C129S mutant form of PTPN22 is very thorough. Determination of the Km and Kcat highlights the differences between the two enzymes that go beyond redox regulation of the phosphatase. The reduction studies are masterfully done and highlight a novel reduction mechanism that merits to be further studied in cells. Demonstrating that PTPN22C129S is prone to oxidation in vitro is a key and technically challenging result that may be applicable to other members of the PTP family that also form disulfides with a backdoor cysteine. Showing that PTPN22C129S mice (backcrossed to B6Q mice making them susceptible to autoimmune arthritis) displayed higher T cell activation in two models (DTH and GPI), in addition to studies in T cells stimulated with collagen, increased this reviewer's confidence that the PTPN22C129S mouse exhibited a T-cell-dependent inflammatory response phenotype similar to the PTPN22 knockout phenotype. Validation of T-cell signaling events in PTPN22C129S T cells was in line with the in vitro characterization of the phosphatase.

      We thank the reviewer very much for the detailed summary of our findings and the appreciative words.

      Weaknesses: Although the paper has many strengths, some important weaknesses need to be addressed by the authors. In particular, the authors need to characterize better their mouse model and determine if PTPN22 is reversibly oxidized following TCR activation. If PTPN22 is oxidized, does it form an intramolecular disulfide between C227 and C129? The proposed model, that PTPN22C129S is more prone to oxidation, also has to be validated in vivo. Although this could be technically challenging in theory, the authors have shown that the migration pattern of the oxidized enzyme is different from that of the reduced enzyme. Another major issue is that PTPN22 does not appear to be expressed in CD4+ T cells unless these cells are activated in vitro with anti-CD3/CD28 for 24 hours. This makes acute CD3-stimulation of CD4+ T cells studies - such as the measurement of acute calcium influx in Fig. 5E - very difficult to interpret. Perhaps the authors should explain why acute signal transduction studies in Figure 6 were performed in lymph node cells. If the reason is that PTPN22 (WT and C129S mutant) expression is higher, the authors should provide immunoblots for PTPN22 in these cells. Since the PTPN22C129S mouse model has not been sufficiently validated, the claims of the authors are unfortunately weakened and the underlying molecular mechanisms do not completely support their conclusions. However, given the clear in vitro work provided in figures 1 and 2, it is this Reviewer's opinion that the authors can address the issues related to the oxidation status of PTPN22 and of PTPN22C129S in vivo, support their claims, and make a significant contribution to the field.

      We again thank the reviewer for the detailed summary of our findings and for the suggestions. With regards to the in vivo oxidation status of PTPN22, please see the discussion above.

      1. Tsai SJ, Sen U, Zhao L, Greenleaf WB, Dasgupta J, Fiorillo E, et al. Crystal structure of the human lymphoid tyrosine phosphatase catalytic domain: insights into redox regulation. Biochemistry. 2009;48(22):4838-45.
      2. Jackson SH, Devadas S, Kwon J, Pinto LA, Williams MS. T cells express a phagocyte-type NADPH oxidase that is activated after T cell receptor stimulation. Nat Immunol. 2004;5(8):818-27.
      3. Gelderman KA, Hultqvist M, Holmberg J, Olofsson P, Holmdahl R. T cell surface redox levels determine T cell reactivity and arthritis susceptibility. Proc Natl Acad Sci U S A. 2006;103(34):12831-6.
      4. Gelderman KA, Hultqvist M, Pizzolla A, Zhao M, Nandakumar KS, Mattsson R, et al. Macrophages suppress T cell responses and arthritis development in mice by producing reactive oxygen species. J Clin Invest. 2007;117(10):3020-8.
      5. Dagnell M, Cheng Q, Rizvi SHM, Pace PE, Boivin B, Winterbourn CC, et al. Bicarbonate is essential for protein-tyrosine phosphatase 1B (PTP1B) oxidation and cellular signaling through EGF-triggered phosphorylation cascades. J Biol Chem. 2019;294(33):12330-8.
      6. Gringhuis SI, Leow A, Papendrecht-Van Der Voort EA, Remans PH, Breedveld FC, Verweij CL. Displacement of linker for activation of T cells from the plasma membrane due to redox balance alterations results in hyporesponsiveness of synovial fluid T lymphocytes in rheumatoid arthritis. J Immunol. 2000;164(4):2170-9.
      7. Carilho Torrao RB, Dias IH, Bennett SJ, Dunston CR, Griffiths HR. Healthy ageing and depletion of intracellular glutathione influences T cell membrane thioredoxin-1 levels and cytokine secretion. Chem Cent J. 2013;7(1):150.
    1. Author Response

      Reviewer #1 (Public Review):

      Dias et al proposed a new method for genotype imputation and evaluated its performance using a variety of metrics. Their method consistently produces better imputation accuracies across different allele frequency spectrums and ancestries. Surprisingly, this is achieved with superior computational speed, which is very impressive since competing imputation software has had decades of experience in optimizing performance.

      The main weakness in my opinion is the lack of software/pipeline descriptions, as detailed in my main points 3-6 below.

      We have made the source code and detailed instructions publicly available on GitHub. The computational pipeline for autoencoder training and validation is available at https://github.com/TorkamaniLab/Imputation_Autoencoder/tree/master/autoencoder_tuning_pipeline.

      1. In the neural network training workflow, I am worried it will be difficult to compute the n by n correlation matrix if n is large. If n=10^5, the matrix would be ~80GB in double precision, and if n=10^6, the matrix is ~2TB. I wonder what is n for HRC chromosome 1? Would this change for TOPMed (Taliun 2021 Nature) panel which has ~10x more variants? I hope the authors can either state that typical n is manageable even for dense sequencing data, or discuss a strategy for dealing with large n. Also, Figure 1 is a bit confusing, since steps E1-E2 supposedly precede A-D.

      We included more details in the Methods section to address this question. Computing the entire matrix is indeed computationally intensive, so to avoid this complexity we calculated the correlations in a sliding box of 500 x 500 common variants (minor allele frequency (MAF) >=0.5%). In other words, no matter how dense the genomic data is, the n x n size will always be fixed at 500 x 500. Larger datasets will not influence this, as the additional variants fall below the MAF>=0.5% threshold. Thus, memory utilization will be the same regardless of chromosome length or database size. Please note that this correlation calculation is not necessary for the end-user to perform imputation, since we already provide the genomic coordinates of the local minima, or “cutting points”, of the genome. This computational burden remains on the developer side. The reviewer is right to point out that Figure 1 is misleading in its ordering; we have corrected this in the revision.
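
      The memory argument above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes genotypes are coded as 0/1/2 alternate-allele dosages, and the function names, the non-overlapping step size, and the convention of returning 0.0 for constant variants are all hypothetical choices made for the example.

```python
from math import sqrt

def dense_matrix_bytes(n, bytes_per_value=8):
    """Bytes needed to hold a dense n x n matrix (8 bytes = double precision)."""
    return n * n * bytes_per_value

def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def minor_allele_frequency(dosages):
    """MAF of one variant from 0/1/2 alternate-allele dosages."""
    alt = sum(dosages) / (2 * len(dosages))
    return min(alt, 1 - alt)

def windowed_correlations(variants, window=500, step=500, min_maf=0.005):
    """Yield (first_index, matrix) for each window of common variants.

    variants: list of per-variant dosage lists, in genomic order.
    first_index counts positions among the common variants only.
    Each matrix is at most window x window, whatever len(variants) is.
    """
    common = [v for v in variants if minor_allele_frequency(v) >= min_maf]
    for start in range(0, len(common), step):
        block = common[start:start + window]
        yield start, [[pearson(a, b) for b in block] for a in block]
```

      With these definitions, `dense_matrix_bytes(500)` is 2,000,000 bytes (about 2 MB) per window, compared with `dense_matrix_bytes(10**5)`, which is 80 GB for a full matrix over 10^5 variants.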

      2. I have a number of questions/comments regarding equations 2-4. (a) There seems to be no discussion of how the main autoencoder weight parameters were optimized. Intuitively, I would think optimizing the autoencoder weights is conceptually much more important than tuning hyper-parameters, for which there is plenty of discussion.

      These parameters are optimized through the training process described in “Hyperparameter Initialization and Grid Search / Hyperparameter Tuning”, where both the hyperparameters and edge weights are determined for each autoencoder for each genomic segment. There are 256 genomic segments in chromosome 22, and each segment has a different number of input variables, sparsity, and correlation structure. Thus, there is a unique autoencoder model that best fits each genomic tile (e.g., each autoencoder has different weights, architecture, loss function, regularizers, and optimization algorithms). Therefore, while there are some commonalities across genomic tiles, there is not a single answer for the number of dimensions of the weight matrix, or for how the weights were optimized. Instructions on how to access the parameters and hyperparameters of each of the 256 autoencoders are now shared through our source code repository at https://github.com/TorkamaniLab/imputator_inference.

      We included an additional explanation clarifying this point in the Hyperparameter Tuning subsection of the Methods.

      (b) I suppose t must index over each allele in a segment, but this was not explicit.

      That is correct, t represents the index of each allele in a genomic segment. We included this statement in the description of equation 2.

      (c) Please use standard notations for L1 and L2 norms (e.g. ||Z||_1 for L1 norm of Z). I also wonder if the authors meant ||Z||_1 or ||vec(Z)||_1 (vectorized Z)?

      We included a clarification in the description of equation 3. ||W||_1 and ||W||_2 are the standard L1 and L2 norms of the autoencoder weight matrix (W).
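
      The reviewer's distinction between ||Z||_1 and ||vec(Z)||_1 can be made concrete with a small sketch. The response does not say which reading equation 3 uses; entrywise norms are the usual convention for weight regularization, but treating them as the intended ones here is an assumption. The function names and the example matrix are hypothetical.

```python
from math import sqrt

def entrywise_l1(W):
    """||vec(W)||_1: sum of absolute values of all entries (lasso-style penalty)."""
    return sum(abs(w) for row in W for w in row)

def induced_l1(W):
    """||W||_1 read as an operator norm: the largest absolute column sum."""
    return max(sum(abs(row[j]) for row in W) for j in range(len(W[0])))

def frobenius(W):
    """||vec(W)||_2: square root of the sum of squared entries (ridge-style penalty)."""
    return sqrt(sum(w * w for row in W for w in row))

W = [[1.0, -2.0],
     [3.0, 4.0]]
print(entrywise_l1(W))  # 10.0
print(induced_l1(W))    # 6.0
print(frobenius(W))     # square root of 30
```

      The two readings of the L1 norm already differ on this 2 x 2 example (10.0 versus 6.0), which is why stating the convention explicitly matters.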

      (d) It would be great if the authors can more explicitly describe the auto-encoder matrices (e.g. their dimensions, sparsity patterns if any...etc).

      As we answered in comment 2.a, each of the 256 autoencoders is unique to its genomic segment, so it would be infeasible to describe the architecture, parameters, optimizers, loss function, and regularizers of each one. We realized it would be more suitable to share this information in a software repository and have now done so.

      3. It is not obvious if the authors intend to provide a downloadable software package that is user-friendly and scalable to large data (e.g. HRC). For the present paper to be useful to others, I imagine either (a) the authors provide software or example scripts so users can train their own neural network, or (b) the authors provide pretrained networks that are downloaded and can be easily combined with target genotype data for imputation. From the discussion, it seems like (b) would be the ultimate goal, but is only part dream and part reality. It would be helpful if the authors can clarify how current users can benefit from their work.

      We have now shared the pre-trained autoencoders (including model weights and inference source code) and instructions on how to use them for imputation. These resources are publicly available at https://github.com/TorkamaniLab/imputator_inference. We have added this information to the Data Availability subsection of the Methods.

      4. Along the same lines, I also found the description of the software/pipeline to be lacking (unless this information is available on the online GitHub page, which is currently inaccessible). For instance, I would like to know which of the major data imputation formats (VCF, BGEN, etc.) are supported. Which operating systems (Windows/Linux/macOS) are supported? I also would like to know if it is possible to train the network, or run imputation given pre-trained networks, if I don't have a GPU.

      We have now made the GitHub repository publicly available. The description of the requirements and steps performed in the hyperparameter tuning pipeline is available at https://github.com/TorkamaniLab/Imputation_Autoencoder/tree/master/autoencoder_tuning_pipeline.

      5. Typically, imputation software supplies a per-SNP imputation quality score for use in downstream analysis. This is important for interpretability as it helps users decide which variants are confidently imputed and which ones are not. For example, such a quality score can be estimated from the posterior distribution of an HMM process (e.g. Browning 2009 AJHG). Would the proposed method be able to supply something similar? Alternatively, how would the users know which imputed variants to trust?

      We included further clarification in the Data Availability section of the Methods: Imputation data format. The imputation results are exported in variant call format (VCF) containing the imputed genotypes and imputation quality scores in the form of class probabilities for each of the three possible genotypes (homozygous reference, heterozygous, and homozygous alternate allele). The probabilities can be used for quality control of the imputation results.

      We included this clarification in the manuscript and in the readme file of the inference software repository https://github.com/TorkamaniLab/imputator_inference.
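
      A minimal sketch of how such class probabilities could be used for quality control follows. It assumes the three probabilities have already been parsed from the VCF into tuples ordered as (homozygous reference, heterozygous, homozygous alternate); the function names, the variant identifiers, and the 0.9 cutoff are hypothetical and are not taken from the inference software.

```python
def expected_dosage(probs):
    """Expected alternate-allele count from (P(hom ref), P(het), P(hom alt))."""
    p_ref, p_het, p_alt = probs
    return p_het + 2.0 * p_alt

def best_call(probs):
    """Return (genotype, probability) for the most likely genotype (0, 1 or 2)."""
    genotype = max(range(3), key=lambda g: probs[g])
    return genotype, probs[genotype]

def confident_calls(records, min_prob=0.9):
    """Keep (variant_id, genotype) pairs whose top class probability meets min_prob."""
    kept = []
    for variant_id, probs in records:
        genotype, prob = best_call(probs)
        if prob >= min_prob:
            kept.append((variant_id, genotype))
    return kept

# Made-up records for illustration only.
records = [
    ("variant_a", (0.98, 0.02, 0.00)),
    ("variant_b", (0.40, 0.35, 0.25)),
    ("variant_c", (0.01, 0.04, 0.95)),
]
print(confident_calls(records))  # [('variant_a', 0), ('variant_c', 2)]
```

      The cutoff trades call rate against accuracy, so an appropriate value would need to be chosen per study; the expected dosage is an alternative that keeps every variant while carrying the uncertainty forward.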

      6. I think the authors should clarify whether input genotypes must be prephased. That is, given a trained neural network and genotype data that one wishes to impute, does the genotype data have to be phased? The discussion reads "our current encoding approach lacks phasing information..." which can be understood both ways. On a related note, I hope the authors can also clarify if the validation and testing data (page 7 lines 14-23) were phased data, or if they were originally unphased but computationally phased via software such as Eagle 2 or Beagle 5.

      The input genotypes are not phased, and no pre-phasing was performed before imputation. We included further clarification in the Methods section, stating “All input genotypes from all datasets utilized in this work are unphased, and no pre-phasing was performed.” We also included further clarification in the Discussion section.

      7. It is unclear if the reported run times (Figure 6) include model training time, or if they are simply imputing the missing genotypes given a pre-trained autoencoder. For the latter, I think the comparison may still be fair if users never have to train models themselves. However, if users currently have to train their own network, I feel it is imperative to also report the model training time, even if in another figure/table.

      The end-users do not have to train the models; the computational burden of training remains on the developer side, so the runtimes refer to the task of imputing the missing genotypes given a pre-trained autoencoder set. This allows for distribution without reference datasets. We included further clarification in the Performance Testing and Comparisons subsection of the Methods.

      Reviewer #2 (Public Review):

      In this manuscript the authors introduce a segment based autoencoder (AE) to perform genotype imputation. The authors compare performance of their AE to more traditional HMM-based methods (e.g. IMPUTE) and show that there is a slight but significant improvement on these methods using the AE strategy.

      In general the paper is clearly presented and the work is timely, but I have some concerns with respect to the framing of the advances presented here along with the performance comparisons.

      Specific Points:

      1. The authors aren't doing a good enough job presenting the work of others in using deep neural networks for imputation or using autoencoders for closely related tasks in population genetics. For instance, the authors say that the RNN method of Kojima et al. 2020 is not applicable to real world scenarios; however, they seem to have missed that in that paper the authors are imputing based on Omni 2.5 at 97% masking, right in line with what is presented here. It strikes me that the RNN-IMP method is a crucial comparison here, and the authors should expand their scholarship in the paper to cover work that has already been done on autoencoders for popgen.

      This is an important comparison that we misrepresented. We have now separated out this particular application of RNN-IMP in the introduction of the manuscript. The major difference is that RNN-IMP needs to be retrained on different input genetic variants, much like a standard HMM-based method, so its computational burden remains on the end-user side. The computational complexity of this model appears to be tremendous, given that the only example the authors provided with their software consists of 100 genomes from the 1000 Genomes Project, used to perform imputation on Omni by de novo training. Given that their approach does not achieve the benefits of distributing a generalizable pre-trained neural network, and given the computational burden associated with training these models on the 60K+ genomes we use in our manuscript, we have opted for stating the benefits and downsides of their approach in the introduction.

      2. With respect to additional comparisons: Kenneth Lange's group recently released a new method for imputation which is not based on HMMs but is extremely fast. The authors would be well served to extend their comparisons to include this method (MendelImpute). It should be favorable for the authors, as MendelImpute is less accurate than HMMs but much faster.

      We appreciate the reviewer pointing out this additional method. However, its parent manuscript clearly shows substantially inferior imputation performance relative to BEAGLE, Minimac, and the other tools we already compare against, so there is not much to gain by performing this comparison. Our autoencoder-based approach is already generating results that are competitive with the best and most cited imputation tools, which are all HMM-based and outperform MendelImpute. The outcome of this comparison can be forecast from the parent manuscript.

      3. The description of HMM-based methods in lines 19-21 isn't quite correct. Moreover, what is an "HMM parameter function"?

      Thank you for catching this. We were referring to parameter estimation and have corrected this in the manuscript.

      4. Using tiled AEs across the genome makes sense given the limitations of AEs generally, but this means that tiling choices may affect downstream accuracy. In particular, how does the choice of the LD threshold determine the accuracy of the method? E.g., if the SNP correlation threshold were 0.3 rather than 0.45, how would performance change?

      This choice is driven by the limitations of cutting-edge GPUs. A threshold of 0.45 returns the minimum number of tiles spanning chromosome 22 with an average size per tile that fits into the video memory of GPUs. While developing the tiling algorithm, we tested lower thresholds, which made the tiles smaller and more abundant, and thus made the GPU memory workload less efficient (e.g. many tiles resulted in many autoencoders per GPU, which caused a CPU-GPU communication overhead). Due to the obstacles related to computational inefficiency, CPU-GPU communication overheads, and GPU memory limits, we did not proceed with model training on tiles generated with other correlation thresholds. We’ve added a paragraph explaining this choice in the manuscript.

      5. How large is the set of trained AEs for chromosome 22? In particular, how much disk space does the complete description of all AEs (model + weights) take up? How does this compare to a reference panel for chr22? The authors claim that one advance is that this is a "reference-free" method (it's not) and that as such there are savings in that a reference panel doesn't have to be used along with the genome to be imputed. While the latter claim is true, a reference panel is instead swapped out for a set of trained AEs, which might take up a lot of disk space themselves. This comparison should be given and perhaps extrapolated to the whole genome.

      This is an interesting point. For comparison, the total combined uncompressed size of all pre-trained autoencoders together is 120GB, or 469MB per autoencoder. The size of the reference data, HRC chromosome 22 across ~27,000 samples is 1GB after compression – or nearly 10X the autoencoder size. Moreover, unlike in HMM-based imputation, the size of the pre-trained autoencoders does not increase as a function of the reference panel sample size. The size of the autoencoders remains fixed since the number of model weights and parameters remains the same regardless of sample size – though it will expand somewhat with the addition of new genetic variants. Another point to consider is that privacy concerns associated with distribution of reference data are mitigated with these pretrained autoencoders.

      6. The results around runtime performance (Figure 6) are misleading. Specifically, HMM training and decoding is being performed here, whereas for the AE only prediction (equivalent to decoding) is being done. To their credit, the authors do mention a bit of this in the discussion; however, a real comparison should be done in Figure 6. There are two ways to proceed in my estimation: 1) separate training and decoding for the HMM methods (Beagle doesn't allow this, I'm not sure of the other software packages), or 2) report the training times for the AE method. I would certainly like to see what the training times look like given that the results as presented require 1) a separate AE for each genomic chunk, 2) a coarse grid search, 3) training XGBoost on the results from the coarse grid search, 4) retraining of the individual AEs given the XGBoost predictions, and 5) finally prediction. This is a HUGE training effort. Showing prediction runtimes and comparing those to the HMMs is inappropriate.

      We consider only prediction in the runtime comparisons because only the prediction step is done by the end-user, whereas the computational burden of training remains on the developer side. For the HMMs, we included only the prediction time as well (excluding the time for data loading/writing, computing model parameters, and HMM iterations). The pre-trained autoencoders, when distributed, can take as input any set of genetic variants to produce the output without any additional training or fine-tuning required.

      7. One well known problem for DNN based methods including AEs is out-of-sample prediction. While Figure 5 (missing a label by the way) sort of gets to this, I would have the authors compare prediction in genotypes from populations which are absent from the training set and compare that performance to HMMs. Both methods should suffer, but I'm curious as to whether the AEs are more robust than the HMMs to this sort of pathology.

      Our test datasets in Figures 4 and 5 are independent of the reference dataset. MESA, Wellderly, and HGDP are all independent datasets, never used for training, nor model selection. Only HRC was used as reference panel or for training, and ARIC was used for model selection during tuning. We included a statement in the methods clarifying this point.

      Reviewer #3 (Public Review):

      Over the last 15 years or so genotype imputation has been an important and widely-used tool in genetic studies, with methods based on Hidden Markov Models (HMMs) and reference panels emerging as the dominant approach. This paper suggests a new approach to genotype imputation based on denoising autoencoders (DAE), a type of neural network. This approach has two nice advantages over existing methods based on Hidden Markov Models (HMMs): i) once the DAE is trained on a reference panel the reference panel can be discarded, and users do not need access to the reference panel to use the DAE; ii) imputation using a DAE is very fast (training is slow, but this step is done upfront so users do not need to worry about it). The paper also presents data showing that the tuned DAE is competitive in accuracy with HMM methods.

      I have two main concerns.

      First, it is unclear to me whether the accuracy presented for the tuned DAE (eg Figure 3, Table 4) is a reliable reflection of expected future accuracy. This is because the tuning process was quite extensive and complex, and involved at least some of the datasets used in these assessments. While the paper correctly attempts to guard against overfitting and related issues by using separate Training, Validation and Testing data (p7), it seems that the Testing data were used in at least some of the development of the methods and tuning (eg p14, "A preliminary comparison of the best performing autoencoder..."; Figure 2 and Table 2, all involve the Testing data). Because of the complexity of the process by which the final DAE was arrived at it is unclear to me whether there is a genuine concern here, but it would seem safest and most convincing at this point to do an entirely independent test of the methods on genotype data sets that were not used at all up to this point.

      MESA, Wellderly, and HGDP were not used for training or for tuning, so all the results shown for these datasets are completely independent. Only HRC and ARIC were used for training and validation/tuning, respectively. We included a statement in the Methods section clarifying this point.

      Moreover, HGDP in particular includes 828 samples from 54 different populations representing all continental populations and including remote populations like Siberia, Oceania, etc. This reference panel is described in more detail in the reference below and likely represents the most diverse human genome dataset available. Thus, we have externally validated generalizability on a dataset with much greater diversity than our training dataset:

      Bergström A, et al. Insights into human genetic variation and population history from 929 diverse genomes. Science. 2020 Mar 20;367(6484):eaay5012.

      Second, there is a potentially tricky issue of to what extent distributing a black box DAE trained on a reference sample is consistent with data sharing policies. Standards of data sharing have evolved over the last decade. Generally there currently seems to be little hesitation to publicly share "single-SNP summary data" such as allele frequency information from large reference panels, whereas sharing of individual-level genotype data is usually explicitly forbidden. It is not quite clear to me where sharing the fit of a DAE falls here, or how much information on individual genotypes the trained DAE contains. The current manuscript does not adequately address this issue.

      Currently there are no official data sharing restrictions on deep learning data. We are aware that future policies may arise, and we have started a collaboration with Oak Ridge National Laboratory to explore differential privacy techniques and privacy concerns for these autoencoders. Another point to consider is that the autoencoders segment the genome, making reconstruction of an individual genome impossible even if reference data were somehow recoverable from the neural networks. Regardless, this is an interesting and important point that should be addressed in the manuscript, and we have added a paragraph discussing it.

      Reviewer #4 (Public Review):

      In this manuscript, Dias et al proposed a novel genotype imputation method using autoencoders (AE), which achieves comparable or superior accuracy relative to the state-of-the-art HMM-based imputation methods after tuning. The idea is innovative and provides an alternative solution to the important task of genotype imputation. The authors also conducted some experiments using three different datasets as targets to showcase the value of their approach. The overall framework of the method is clearly presented but more technical details are needed. The results presented showed slight advantage of AE imputation after tuning but more comprehensive evaluations are needed. In particular, the authors didn't consider post-imputation quality control. The reported overall performance (R2 in the range of 0.2-0.6) seems low and inconsistent with the imputation literature.

      Overall, the method has potential but is not sufficiently compelling in its current form.

      We show average accuracy of 0.2-0.6 in Table 4, but that is the average R2 per variant across all variants (no MAF filtering or binning applied). The reviewer points out that the accuracy should be R2>0.8, but this R2>0.8 refers to common variants only (allele frequency >1%), and we have shown R2>0.8 for these variants (Figure 4). The aggregate accuracy displayed in Table 4 is lower because the vast majority of variants fall below the 1% allele frequency threshold.

      The references below demonstrate this issue and agree with our results:

      References:

      Rubinacci S, Delaneau O, Marchini J. Genotype imputation using the positional Burrows Wheeler transform. PLoS genetics. 2020 Nov 16;16(11):e1009049.

      McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR, Teumer A, Kang HM, Fuchsberger C, Danecek P, Sharp K, Luo Y. A reference panel of 64,976 haplotypes for genotype imputation. Nature genetics. 2016 Oct;48(10):1279.

      Vergara C, Parker MM, Franco L, Cho MH, Valencia-Duarte AV, Beaty TH, Duggal P. Genotype imputation performance of three reference panels using African ancestry individuals. Human genetics. 2018 Apr;137(4):281-92.
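
      The difference between an aggregate per-variant R2 and a MAF-binned R2, which the response above relies on, can be illustrated with a short sketch. The function names are hypothetical, the 1% cutoff follows the response, and the numbers in the usage example are made up to show the averaging effect; they are not results from the manuscript.

```python
def r_squared(truth, imputed):
    """Squared Pearson correlation of true vs imputed dosages; 0.0 if either is constant."""
    n = len(truth)
    mean_t, mean_i = sum(truth) / n, sum(imputed) / n
    sxy = sum((t - mean_t) * (i - mean_i) for t, i in zip(truth, imputed))
    sxx = sum((t - mean_t) ** 2 for t in truth)
    syy = sum((i - mean_i) ** 2 for i in imputed)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)

def mean_r2_by_bin(variants, threshold=0.01):
    """Average R2 over rare, common, and all variants.

    variants: iterable of (maf, r2) pairs, one per variant.
    Variants with maf > threshold are counted as common.
    """
    groups = {"rare": [], "common": [], "all": []}
    for maf, r2 in variants:
        groups["common" if maf > threshold else "rare"].append(r2)
        groups["all"].append(r2)
    return {name: sum(v) / len(v) for name, v in groups.items() if v}

# Made-up example: eight rare variants imputed poorly, two common ones imputed well.
toy = [(0.001, 0.3)] * 8 + [(0.20, 0.9)] * 2
summary = mean_r2_by_bin(toy)
print({name: round(value, 2) for name, value in summary.items()})
```

      In this toy case the common-variant mean stays at 0.9 while the mean over all variants drops to 0.42, simply because rare variants dominate the count.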

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript by Rekler and Kalcheim examines the role of neural tube-derived retinoic acid (RA) in neural crest development. They observe that the onset of expression of the RA-synthesizing enzyme RALDH2 in the dorsal neural tube coincides with the end of neural crest production. The authors propose that this local source of RA is essential to activate the transcription of Bambi and other BMP inhibitors, leading to the disruption of BMP signaling. Loss of BMP activity at the dorsal neural tube would halt neural crest production, leading to the establishment of the definitive roof plate. Thus, precise temporal regulation of RALDH2 in the dorsal neural tube would dictate the timing of neural crest production and the segregation of PNS and CNS progenitors.

      Previous studies have already identified a role for RA in the control of the timing of neural crest production. Martinez-Morales et al (JCB 2011) have shown that during early trunk development, mesoderm-derived RA works with FGF signaling to jumpstart the BMP/Wnt cascade that drives neural crest migration in the trunk. Rekler and Kalcheim chose to focus on a distinct function of RA at a later timepoint. The main contribution of the present study is the demonstration that, at later stages, RA produced by the neural tube has the opposite effect, acting to inhibit the BMP/Wnt cascade and halt neural crest production. Thus, RA would be a major regulator of the timing of neural crest production, acting to both trigger and repress neural crest migration.

      The study's strengths lie in an experimental strategy that allows the authors to manipulate RA function in a stage-specific manner and therefore uncover a later role for the signaling system in neural crest production. The authors also show that RA inhibition results in an incomplete fate switch and in the generation of cells that share regulatory features of neural crest and roof plate cells. A significant limitation of the study is that the molecular mechanisms that endow RA signaling with stage-specific functions remain unknown. This is particularly important since early vs. late RA seems to have opposing effects, acting to either promote or terminate neural crest production.

      We thank this referee for her/his positive comments on our manuscript. We agree with the referee that a key question is understanding how RA signaling is differentially interpreted over time given its multistage activity in dorsal NT development.

      This is based on the following findings: Years ago, we uncovered that the balance between activities of BMP/Wnt and noggin in the dorsal NT triggers the onset of NC EMT. Martinez-Morales et al. strengthened our findings by reporting that a balance between somitic RA and FGF works on the reported BMP/Wnt modules to initiate the process. This group found that at gastrulation stages, RA is required for NC specification, as revealed by analysis of VAD quail embryos. Next, during somite formation, somitic RA is necessary for the onset of emigration of specified NC progenitors, but at advanced somite stages it is dispensable for the subsequent maintenance of cell emigration. Presently, we find that RP-derived RA ends NC production. Together, this highlights a dynamic behavior of RA at 4 sequential stages of NC ontogeny. Clearly, the first two effects are mediated by an influence of RA on dorsoventral patterning of the early NT, as distribution of ventral NT markers was strongly affected. In our case, RA from the nascent RP has no such effects, suggesting that RP-derived RA acts at a post-patterning phase to specifically affect the dorsal NT.

      All things considered, we think that the problem is not simply a binary question of “opposing functions of RA signaling in starting or terminating NC production”. Instead, it is a matter of understanding the differential interpretation of the same morphogen by progenitor cells with changing states and at sequential stages.

      In response to the referee’s request, we began addressing the question of how RA inhibits BMP signaling close to the RP stage. To this end, we decided first to examine the temporal regulation of Raldh2 expression, which is restricted to the RP stage and is therefore a prerequisite for the late activity of RA. Whereas repressing RA activity extends the NC phase, including the continuous transcription of Foxd3, Sox9 and Snail2 (Fig.3), we now found that extending the activity of each of these transcription factors close to the RP stage represses the onset of Raldh2 transcription in the nascent RP (new Fig. 9). We interpret these results to mean that as long as NC genes are active in the dorsal NT (NC stage), local Raldh2 expression and consequent RA synthesis in the NT do not take place, so Raldh2 in the RP is repressed by NC-specific traits. The significance of these data is twofold: first, they explain the late onset of Raldh2 production at the RP stage. Second, since we also report the reciprocal result, that RA represses NC genes (Fig.3), we conclude that a cross-repressive interaction exists between NC and RP-specific genes downstream of RA, being an emerging temporal property of the network. These data further indicate that the changing roles of RA throughout development of the dorsal neural primordium largely depend on a different interpretation of the signal mediated by changing and mutually repressive codes.

      We have now presented these data in Fig.9. To clarify our thoughts further, we now provide a working model summarizing the effects of RA in NC to RP transition (Fig.10B).

      Our article uncovers for the first time, and thoroughly documents, a role of local RA activity in the end of NC production and ensuing RP architecture. We believe that a comprehensive elucidation of the molecular mechanism responsible for inhibition of BMP signaling by local RA is the next obligatory step. We show in this study the selective activation of BMP inhibitors by endogenous RA and previously found that one of them, Hes/hairy, indeed inhibits BMP signaling and NC EMT (Nitzan et al, 2016). Therefore, we propose that upregulation of BMP inhibitors by RA is a possible mechanism. However, we also predict that this is not the only one, and a deeper understanding of this problem is beyond the scope of the present study.

      Additional possibilities that fit with our data are now discussed: RA expression in somites vs. RP can be regulated by different enhancers and thus have distinct functions. For example, a specific enhancer driving expression of Raldh2 was found to be activated only at the definitive RP stage (Castillo et al., 2010). This enhancer contains Tcf binding sites and thus may be activated by Wnt signaling. In turn, as we show, RP-derived Raldh2 and resulting RA could negatively feed back on Wnt signaling in the formed RP, either directly or through BMP acting upstream of Wnt (now presented in Fig. 10B).

      Another possible scenario is that RA represses BMP signaling by inactivating Smad proteins via ubiquitination, as shown to be the case in selected cell lines (Sheng et al., 2010). These possibilities are discussed and await systematic exploration.

      Comments:

      Previous studies have demonstrated that early RA production (presumably from the mesoderm) is necessary for the expression of early dorsal neural tube / neural crest genes like Pax7, Msx, Wnt1, and even BMP ligands. This is in contrast to the local source of RA, which presumably would be silencing these genes. Thus, mesoderm-derived RA would have the opposite effect in these progenitors than the RA synthesized in the neural tube. The study does not provide a mechanism that explains these stage-specific effects of the morphogen.

      As elaborated in our reply above to the general comment, we believe that RA, whether emanating from somites or nascent RP, provides an initial signal that is later relayed upon target factors unique to each stage. It is possible that the precise source of the factor plays a role; along this line, we showed that somitic RA is dispensable for late events and, reciprocally, there is no RA synthesis in the early NT that could affect NC cells. Having said that, there is RA activity in the NT at both stages and the output is still different. Hence, there should be more to this: in the revised version, we report that NC and RP-specific genes stand in a mutually repressive interaction downstream of RA, and this may contribute to the stage-specific effects of the morphogen.

      The effects of RA manipulation are often examined with non-quantitative techniques, like in situ hybridization (Fig. 2, 3). The incorporation of quantitative approaches (e.g., qPCR) would allow for the precise characterization of phenotypes (and better estimation of penetrance, etc.). Furthermore, the study lacks molecular/biochemical strategies to define the regulatory linkages between genes and pathways. This is a considerable limitation of the study since it prevents the establishment of a regulatory axis that would directly connect RA signaling to the BMP pathway.

      As the referee may notice, most genes examined are not restricted solely to the dorsal NT/RP domains. Since it is technically not feasible to accurately isolate only the regions of interest for qPCR analysis, collecting entire NTs following unilateral or bilateral electroporations for qPCR would be highly inaccurate. In situ hybridization and immunohistochemistry provide a precise tool to assess the spatial localization of the transcripts/proteins of interest. Note that in all cases examined, the color reactions were developed for the same length of time for control and experimental cases, and photography was performed under identical conditions. Furthermore, in most cases, effects between treatments were dramatic, readily apparent at a qualitative level and easily quantifiable from ISH or fluorescent images.

      As to regulatory linkages between genes and pathways, the referee is correct; we do not demonstrate direct molecular interactions between the different players at the biochemical level. The present study provides a wealth of novel data connecting morphogens such as RA with BMP and Wnt activities, and those with a variety of downstream genes specific for either NC or RP stages. The next step will be to ask about the precise nature of the linkages between specific molecules/pathways.

      The function (and the regulation) of RALDH2 at the dorsal neural tube has been studied thoroughly, and RA is a known player in the dorsal-ventral patterning of the CNS. It is not clear to what extent the phenotypes observed by the authors are due to the disruption of a neural crest-intrinsic mechanism or if they are secondary to the overall changes in the cellular organization of the neural tube caused by loss of RA.

      This is a good point as RA is known to have multiple effects on NT development whose nature changes with stage. Available data emanating from young caudal neural plate explants and from VAD embryos that lack RA, showed that early RA signaling from developing somites is required for ventral patterning of the neural tube (motoneurons and V1, V2 interneurons) and for neuronal differentiation (Diez del Corral et al, 2003, Sockanathan and Jessel, 1998, Liu et al, 2001, Maden et al, 1996). These effects were shown to depend, at least partly, on antagonistic activities of RA and FGF in mesoderm which affect ventral, but not dorsal NT patterning (Diez del Corral et al, 2003). Our study focuses on a later stage when D-V neural tube patterning is already established.

      To address the referee’s comment, we now examined the effects of RA attenuation on expression of Pax7, a dorsal factor, and Hb9, a motoneuron-specific protein. We found that RARa403 does not affect the localization and/or extent of expression of Hb9, and causes only a mild 12% increase in the area of expression of Pax7. Consistent with these results, we also show in several figures that in the absence of RA signaling pSmad and Wnt activities, Foxd3, Snai2 and Sox9 expression patterns are prolonged in time but not in D-V extent.

      These data corroborate that the effects documented are directed to the dorsal NT and do not result from overall changes in D-V patterning. The data were now added as Fig.7 Supplementary 1.

      The authors rely solely upon overexpression constructs to manipulate the activity of the RA signaling pathway, which may be prone to artifacts. Furthermore, both overexpression constructs aim at inhibiting RA activity. This limits the impact of the work since there is no demonstration that RA is sufficient to activate BMP inhibitors and halt neural crest production.

      The tools we used to repress RA signaling consist of RARa403, which acts as a pan-dominant negative construct to abrogate receptor activity, and Cyp26A1, an enzyme that degrades RA. To activate RA signaling in a ligand-independent manner, we have now implemented VP16-RAR-alpha in the revised version of this manuscript. All these tools are extensively and routinely employed in the literature in a variety of animal species and were shown to act in vivo as expected, both by others and by us in the present study. Having said that, we are currently optimizing the CRISPR-Cas9 method for gene editing of RA-specific genes and hope to succeed in the near future.

      We have now performed experiments to address the sufficiency of RA. Data were added as Fig. 5 Supp. 2 and 3 and Fig. 6 Supp. 2.

      As we expected, gain of RA function at NC stages is not sufficient to prematurely activate BMP inhibitors like BAMBI, to prematurely end BMP signaling (pSMAD) or NC EMT, to alter the dynamics of expression of NC-specific genes, or to cause an earlier appearance of RP-specific traits. This is fully consistent with RA being active at NC stages when BMP/Wnt signaling, NC EMT, etc. are operational. The fact that RA is necessary but not sufficient for these processes further suggests that the key is how NC cells at various stages of their ontogeny, and then RP cells, differentially interpret the signal given the profound changes in cellular and molecular landscapes apparent between these stages.

      Reviewer #2 (Public Review):

      The manuscript presents a novel role for RA signaling during development as the mediator of the switch that occurs in the dorsal neural tube after the neural crest cells have migrated and the roof plate forms. The finding is interesting and novel as the events that take place at the end of neural crest stage are poorly understood. The strengths of the manuscript are that the study is well planned and executed to show the interesting phenotype of delayed/disturbed roof plate formation accompanied with prolonged neural crest stage caused by inhibition of RA signaling in the dorsal neural tube. The results also show that RA signaling marks the RP territory and inhibits the DI1 interneurons from invading the region. The results bring novel information to the field. The original finding of the involvement of RA in the process was revealed in a RNAseq screen comparison between the neural crest and the roof plate (which was recently published by the same lab). However, the current study doesn't use any new technology such as high throughput screens or high resolution or live imaging etc., but rather relies mainly on "old fashioned" techniques: electroporation to induce transient inhibition of RA signaling in the dorsal neural tube followed by analysis of the phenotype by using chromogenic in situ hybridization. The chosen techniques are sufficient to convincingly show the point the authors want to make and the study serves as a reminder that fancy new techniques are not necessarily a requirement for creating a solid story. The manuscript is also well written and easy to follow.

      We thank this referee for a very positive feedback on our study. Although we are always motivated by the implementation of new techniques, we agree that the primary goal is to answer a biologically meaningful question with suitable means.

      Finally, the manuscript links the activation of RA signaling to the decline of BMP signaling and specifically the upregulation of BMP inhibitors in the dorsal neural tube at the end of the NC stage, but in its current form the proof of this proposed link remains weak.

      Our article uncovers for the first time, and thoroughly documents, a role of local RA activity in the end of NC production and the ensuing RP architecture. We believe that a comprehensive elucidation of the molecular mechanism responsible for inhibition of BMP signaling by local RA is the next obligatory step. We show in this study the selective activation of BMP inhibitors by endogenous RA and previously found that one of them, Hes/hairy, indeed inhibits BMP signaling and NC EMT (Nitzan et al., 2016). Therefore, we propose that upregulation of BMP inhibitors by RA is a possible mechanism. However, we also predict that this is not the only one, and a deeper understanding of this problem is beyond the scope of the present study.

      Additional possibilities that fit with our data were now discussed: RA expression in somites vs. RP can be regulated by different enhancers and thus have distinct functions. For example, a specific enhancer driving expression of Raldh2 was found to be activated only at the definitive RP stage (Castillo et al., 2010). This enhancer contains Tcf binding sites and thus may be activated by Wnt signaling. In turn, as we show, RP-derived Raldh2 and resulting RA could negatively feed-back on Wnt signaling in the formed RP either directly or through BMP acting upstream of Wnt (this was now presented in a working model in Fig. 10B).

      Another possible scenario is that RA represses BMP signaling by inactivating Smad proteins via ubiquitination, as shown to be the case in selected cell lines (Sheng et al., 2010). These possibilities are now discussed and await systematic exploration.

      Similarly, the manuscript does not address the consequences of exposure of RA to the dorsal neural tube during NC stage and it thus remains unknown whether RA signaling is sufficient to end the NC stage and activate roof plate formation prematurely. Additional experiments of this kind would help clarify the role of RA in the dorsal neural tube and the reciprocal roles of the two signaling pathways (RA and BMP).

      We have now performed experiments to address the sufficiency of RA. Data were now added as Fig.5 Supp.2 and Supp.3, and Fig.6 Supp.2, and discussed.

      As we expected, gain of RA function at NC stages is not sufficient to prematurely activate BMP inhibitors, to end prematurely BMP signaling (pSMAD) or NC EMT, to alter the dynamics of expression of NC-specific genes, or to cause an earlier appearance of RP-specific traits.

      This result is understandable given that RA is already active (but not produced) in the NT at NC stages (original Fig. 1), when BMP/Wnt signaling, a NC-specific gene network, and NC EMT are operational.

      The fact that RA is necessary but not sufficient for these processes further suggests that the key lies in the following, perhaps complementary, mechanisms: 1) a different interpretation of the same signal by NC progenitors at sequential stages of their ontogeny and then by RP cells, accounted for by the profound changes in cellular and molecular landscapes apparent between these stages; 2) the possibility that somite-derived and RP-derived RA are differentially interpreted by the dorsal NT cells owing, for example, to a distinctive mode of ligand presentation (e.g., by CRABP1, which is expressed in RP but not NC).

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Note: Figures we made to respond to the referee comments do not appear to be supported by the Review Commons system.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The authors characterized a new lncRNA locus named FLAIL that controls flowering time in Arabidopsis thaliana. The functional validation of this locus is strongly supported by the use of several different tools (CRISPR-Cas9 deletions, T-DNA insertion, amiRNA gene silencing, and transgene complementation of KO lines). It is also suggested that FLAIL lncRNA works in trans but not in cis. There are strong observations supporting that FLAIL works in trans.

      Moreover, it is suggested that FLAIL regulates gene expression by interacting with distant chromatin loci. This was assessed using RNA-Seq and ChIRP-Seq. Yet, the overlap between DEGs in the flail mutant and FLAIL binding sites at the chromatin is very small, with only 12 genes. From those, only 2 flowering genes' expression was rescued by FLAIL transgene complementation. The final conclusion that FLAIL lncRNA represses flowering by direct inhibition of the 2 flowering genes expression is correlative, and lacks genetic validation.

      #1.1 We plan to support the conclusions in the manuscript genetically, as the reviewer suggests. We have started these experiments, but they will require the timeframe of the full revision.

      In addition inspection of the supplementary file shows that the ChIRP analysis was done without filtering for the FDR so that some of the positive hits have an FDR of 0,232.

      #1.2 We strengthened the manuscript by implementing an FDR filter on the ChIRP-seq results. The distribution of FLAIL binding sites in Fig. S7B and Table S4, and the numbers of overlapping genes between DEGs and FLAIL-ChIRP in Fig. S8A, were updated accordingly.
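
      The filtering and overlap steps mentioned in #1.2 can be illustrated with a minimal sketch. This is not the pipeline used for the manuscript: the 0.05 cutoff, the `Peak` record, and the gene identifiers are hypothetical placeholders chosen only to show the logic (the 0.232 value echoes the FDR quoted by the reviewer).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    gene_id: str  # gene assigned to the peak (hypothetical identifiers below)
    fdr: float    # false discovery rate reported by the peak caller


def filter_peaks(peaks, fdr_cutoff=0.05):
    """Keep peaks whose FDR is at or below the cutoff (0.05 is an assumed value)."""
    return [p for p in peaks if p.fdr <= fdr_cutoff]


def overlap_with_degs(peaks, deg_ids):
    """Return sorted gene IDs that carry a peak and are differentially expressed."""
    bound = {p.gene_id for p in peaks}
    return sorted(bound & set(deg_ids))


# Toy data: geneB mimics a hit with an FDR of 0.232, which the filter removes.
peaks = [Peak("geneA", 0.001), Peak("geneB", 0.232), Peak("geneC", 0.04)]
degs = ["geneA", "geneB", "geneD"]

kept = filter_peaks(peaks)
shared = overlap_with_degs(kept, degs)
```

      In this toy case the overlap shrinks once the filter is applied, which is why the overlap counts in a figure would need to be recomputed after filtering.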

      In addition, many of the peaks land in intergenic regions with is not mentioned in the text a graph with the position of the peaks in respect to nearby genes would help.

      #1.3 Thank you for the suggestion; we strengthened the manuscript with the requested analysis. After implementing the FDR filter, we used "tssRegion" in ChIPseeker to set the distance to the nearest TSS to (-1000, 1000). With these settings, most peaks were located in promoter regions (67.24%), and 16.38% were located in intergenic regions. Since many papers present peak positions using ChIPseeker (PMID: 32338596, PMID: 28221134, PMID: 31081251, PMID: 32012197, PMID: 31649032, PMID: 32633672), we applied a similar method to display the distribution of FLAIL binding loci relative to the distance from the nearest TSS in Fig. S7C.
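
      The idea behind a (-1000, 1000) TSS window can be sketched in a few lines. The code below is a simplified Python stand-in, not ChIPseeker itself (an R package) and not the analysis used for Fig. S7C: it assumes a single chromosome, uses hypothetical gene coordinates, and labels everything outside the window as "distal" rather than reproducing the full set of annotation categories.

```python
def signed_distance_to_tss(peak_mid, tss, strand):
    """Distance from the TSS to the peak midpoint; negative means upstream of the gene."""
    d = peak_mid - tss
    return d if strand == "+" else -d


def nearest_tss(peak_mid, genes):
    """genes: list of (gene_id, tss, strand). Return (gene_id, signed distance) of the closest TSS."""
    gene_id, tss, strand = min(genes, key=lambda g: abs(peak_mid - g[1]))
    return gene_id, signed_distance_to_tss(peak_mid, tss, strand)


def classify(peak_mid, genes, window=(-1000, 1000)):
    """Label a peak 'promoter' if it falls inside the window around the nearest TSS."""
    gene_id, dist = nearest_tss(peak_mid, genes)
    label = "promoter" if window[0] <= dist <= window[1] else "distal"
    return gene_id, dist, label


# Hypothetical genes on one chromosome: (id, TSS coordinate, strand).
genes = [("geneA", 5000, "+"), ("geneB", 20000, "-")]
calls = [classify(mid, genes) for mid in (4500, 20800, 12000)]
```

      Collecting the signed distances over all peaks gives the kind of distribution relative to the nearest TSS that a distance plot would display.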

      In one sentence, the authors used the right model system and methodology, including advanced techniques, to characterize a new trans-acting lncRNA important for controlling the flowering time in Arabidopsis but lack evidence supporting a mechanism of action that goes beyond the interaction with several chromatin loci.

      **minor points:**

      line#63-64 the authors say the COLDAIR and ASL work on FLC in cis in my view the original papers suggested/showed they work in trans.

      #1.4 We increased precision by changing this sentence to ‘Vernalization-induced flowering associates with several lncRNAs such as COOLAIR, COLDAIR, ANTISENSE LONG (ASL), and COLDWRAP that in cis or in trans locally repress gene expression of FLOWERING LOCUS C (FLC), a key flowering repressor at different stages of vernalization’.

      Fig 1B please add some more protein-coding RNAs for the bio-info analysis for comparison

      #1.5 Done.

      Order of Supplementary Fig citation is mixed with S2 coming before S1B

      #1.6 Thank you, we ordered all figures by appearance in the text.

      It would help the reader to have a schematic of the crisper deletions, T-DNA insertion, and position of primers used for the RT-qPCR.

      #1.7 We enhanced our presentation of Fig. 1A. It shows a schematic of the CRISPR deletions and the T-DNA insertion, as well as the positions of the primers.

      In the supplementary PDF file, some of the text is missing on page 3 beginning and end of lines.

      #1.8 We ensured that all text is present in the new submission.

      Reviewer #1 (Significance (Required)):

      The use of several different tools to validate the biological function of FLAIL locus is a major strength of this work.

      The authors propose that flowering time and its gene regulation are controlled by sense FLAIL lncRNAs. However, the sense transcription of FLAIL locus is not detected in wild-type plants by TSS-Seq, TIF-Seq, or plaNET-Seq.

      #1.9.1 There appears to be some confusion. Transcription of sense FLAIL can be observed by chr-DRS, TSS-seq, and TIF-seq in wild type, and even by plaNET-seq in NRPB2-FLAG nrpb2-1 plants. We enhanced the presentation of Fig. 1 and provided a clearer description in lines 81-99.

      If the authors would have explored further the expression of FLAIL transcripts in different stages of development (vegetative and non-vegetative) and in response to different conditions, it would make their claims on the function of FLAIL lncRNAs more convincing. Additionally, flail mutants could have been obtained in the hen-2 background, since it's there where we can observe FLAIL transcription.

      #1.9.2 Thank you for the suggestion. We included additional analyses in Fig. S2 of FLAIL transcription levels in different tissues and under different abiotic stress conditions, based on 20,000 publicly available RNA-seq libraries (PMID: 32768600). Although many libraries are non-stranded, this analysis determined that sense FLAIL or total FLAIL (including sense and antisense) is broadly expressed across many tissues and induced in response to many abiotic stresses (Fig. S2A-B), suggesting that FLAIL may be needed broadly in Arabidopsis.

      FLAIL locus lays on the proximal promoter region of PORCUPINE (PCP), an important regulator of plant development. As flail mutants, pcp mutants display an early flowering phenotype. The authors show no link between FLAIL and PCP from the overlap between re-analysis of published RNA-Seq data for pcp and RNA-Seq and ChIRP-Seq from the authors. This analysis is not enough to exclude the involvement of PCP from the FLAIL function. PCP expression using RT-qPCR should be performed in flail mutants to further support that FLAIL works independently from PCP.

      #1.10 We strengthened this conclusion by adding the requested experiment. PCP transcription level in flail3 mutant was provided by RT-qPCR and RNA-seq in Fig. S11A-B.

      This work does not hypothesize any molecular mechanism besides the interaction of FLAIL lncRNAs with several chromatin loci. It was recently proposed in Arabidopsis that a trans-acting lncRNA interacts with distant loci via the formation of R-loops. The authors do not comment on that. This work would benefit in correlating FLAIL binding sites with R-loop-forming regions mapped in Arabidopsis, regardless of the results from this analysis. Additionally, the authors could attempt to look for a motif responsible for FLAIL binding.

      Check R-loop forming data R-loops (Santos-Pereira and Aguilera, 2015) in Arabidopsis, determined by DRIP-seq (Xu et al., 2017).

      #1.11 Thanks very much for these excellent suggestions.

      First, we searched for consensus DNA motifs in FLAIL binding regions using Homer. We determined four commonly enriched DNA sequence motifs among FLAIL target genes (Fig. 4G). Notably, the target genes CIR1 and LAC8 contained consensus sequences that matched all FLAIL binding motifs (Fig. 4G). These data are consistent with a model where FLAIL binds DNA targets through a sequence complementarity mechanism. Because functionally important sequences are frequently conserved among evolutionarily distant species, we note that three motifs appeared to be conserved across species (Fig. S9), suggesting a potential evolutionarily constrained role.

      Second, we indeed identified R-loop peaks at several FLAIL binding sites by DRIP-seq (Xu et al., 2017). For example, we observed R-loop formation over three FLAIL binding motifs at the CIR1 locus and one at LAC8 (Fig. R1), indicating that R-loop formation may also be a factor determining FLAIL binding. Even though R-loop peaks are present at several FLAIL targets, fully elucidating whether R-loop formation determines FLAIL targeting requires further experimental evidence and is beyond the scope of the current manuscript.

      Fig. R1 Representative tracks at LAC8 and CIR1 showing R-loop formation by DRIP-seq on Watson strand (w-R loops), Crick strand (c-R loops). Undetectable R-loops after RNAse-H treatment was shown as negative control. Four conserved sequence regions of FLAIL binding motifs were indicated by red arrows at LAC8 and CIR1 loci. Gene annotation was shown at the bottom.

      Most of the key conclusions are convincing, except for the flowering time control directly through CIR1 and LAC8, which should be mentioned as speculative

      #1.12 Thank you for finding most key conclusions convincing. We plan to strengthen the manuscript with additional genetic evidence as part of the full revision.

      The words locus and loci are latin and they should be written in italic. The word Brassicaceae, referring to the family should be in italic, and should not be "Brassicaceaes". The word analysis has the wrong spelling.

      #1.13 We follow the conventions given in Scientific Style and Format: The CBE Manual for Authors, Editors and Publishers (1994) Cambridge University Press, Cambridge, UK, 6th edn. According to these conventions, the words locus and loci are common Latin terms and should not be italicized. However, should the final format require these words in italics, we will change them later. We improved the consistency of our use of italics. “Brassicaceaes” was changed to “Brassicaceae”.

      "How much time do you estimate the authors will need to complete the suggested revisions: this is difficult to answer as it depends to which level the author would like to take their work. In my view, if all new experiments would have to be started from scratch it is too far away to be estimated.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this ms, the authors identified the FLAIL lncRNA that represses flowering in Arabidopsis from a locus producing sense and antisense transcripts. They use an allelic series involving T-DNA insertions, CRISPR/Cas9 and artificial miRNAs to study the role of FLAIL in flowering. A complementation series of constructs of the flail3 allele allowed them to show that the sense FLAIL lncRNA can act in trans. RNAseq revealed a small group of genes linked to the regulation of flowering whose expression is affected in the mutant and restored in the complementation line. To gain further insight into FLAIL function, the authors used a ChIRPeq approach to test whether the lncRNA can recognize potential target genes along the genome and they could show that FLAIL binds specific genomic regions. Clearly, this paper shows very nice evidence that the FLAIL lncRNA can act in trans to regulate gene expression. Nevertheless, there are certain points that need to be clarified to further support the action of the sense FLAIL transcript.

      1.According to Fig. 1 A, the antisense FLAIL is "internal" to the DNA genomic area spanning the sense FLAIL. Hence, with direct RT-qPCR is very difficult to distinguish between these molecules as a minor "RT" activity of the Taq polymerase may lead to detection of low levels of antisense, idem if RDRs may generate low antisense levels. Although I think that the plaNET seq brings strong evidence about the start and ends of these molecules, to measure them by RT-qPCR is not trivial and requires the use of strand-specific RT-PCR using a 5' extension of the oligo and amplification with one oligo of the FLAIL sequence (sense or antisense) and the added oligo.

      #2.1.1 Thanks for this good suggestion. We tested both sense and antisense FLAIL transcription using oligo-linked gene-specific reverse primers for RT, and the linked oligo paired with a gene-specific forward primer for qPCR. Primer locations are shown in Fig. 1A, and the new data are in Fig. 1C-D, Fig. 2B-C, and Fig. S4B-C.

      It is not clear how they could distinguish precisely sense and antisense particularly when both RNAs correlate as it is the case here in all alleles (Fig. 1 C and 1D). This should be more explicitly mentioned in the materials and methods section.

      #2.1.2 We gave a detailed description of the strand-specific RT-qPCR method in lines 397-402.

      2.In Fig. 2, what are the levels of antisense in the complementing lines with the sense transcript? And reciprocally sense levels in antisense constructs?

      #2.2 We added these data in Fig. 2B-C and described them in lines 136-143. We indeed observed that the levels of sense FLAIL transcripts in plants transformed with the asFLAIL construct, and of asFLAIL transcripts in plants transformed with the sense FLAIL construct, were similar to those of the 35S:GUS control (Fig. 2B-C), validating that the NOS terminator inhibits antisense transcripts. We also noted that plants transformed with 35S:GUS or the sense FLAIL construct expressed higher asFLAIL levels than the flail3 mutant (Fig. 2C). This may be caused by a T-DNA insertion in the resulting transgenic plants.

      This will definitively demonstrate the assumption that the T-NOS termination will not allow any expression on the other strand. At present, only one of the lncRNAs is measured in each experiment?

      #2.3 We appreciate the next-level reflection of this reviewer. With so many regions initiating cryptic antisense transcription, it is an interesting challenge to identify a 3´-terminator that initiates little or no antisense transcription.

      First, previously published data argue that the NOS terminator largely abolishes initiation of antisense transcription (PMID: 33985972, PMID: 30385760, PMID: 27856735). All these studies address the roles of antisense transcription by generating mutations that abolish antisense lncRNA transcription using NOS terminator sequences.

      Second, to satisfy the curiosity of this reviewer, we provide data below from another manuscript in preparation in the lab. It is a screenshot of plaNET-seq in a fas2-4 NRPB2-FLAG nrpb2-1 mutant carrying a pROK2 construct. The pROK2 T-DNA coincidentally carries a NOS terminator. We mapped plaNET-seq reads to the pROK2 scaffold to display the reads. In pROK2, a NOS promoter activates NPTII expression (red), with the NOS terminator as the terminator sequence. No antisense transcription (blue) is detectable by this sensitive method for detecting nascent transcripts. Taken together, the selection of the NOS terminator as a region suppressing initiation of antisense transcription represents a valid choice.

      Fig. R2 Genome browser screenshot of plaNET-seq at the NPTII locus of the pROK2 T-DNA vector in the fas2-4 NRPB2-FLAG nrpb2-1 mutant. This mutant carries a pROK2 construct, in which a NOS promoter activates NPTII expression with the NOS terminator as the terminator sequence. The sense strand is shown in red and the antisense strand in blue. The pROK2 annotation is shown at the bottom.

      3.In Fig. 3, it will be important to also show the FLAIL locus in the flail3 mutants (in comparison to the wt) as well as the transgene locus. Here the reads will be strand specific and furthermore this will allow to show that the transgene is not generating antisense transcripts (through RDRs for gene silencing?) and confirm that the sense FLAIL is required for the complementation.

      #2.4 Thank you very much for this suggestion. NGS reads for endogenous FLAIL and transgenic FLAIL both map to the FLAIL locus, so we show the FLAIL locus in Fig. 3B. This representation shows that sense FLAIL transcripts were significantly reduced in flail3 and rescued in the complementation line compared to wild type. These data argue against the idea of gene silencing and linked antisense production from the transgene. However, RNA-seq suggests that an isoform of asFLAIL appears to accumulate in flail3. Since we fail to detect this accumulation by strand-specific RT-qPCR in flail3 and in the CRISPR-deletion lines, this may be an asFLAIL isoform resulting from the T-DNA insertion.

      4.In Fig. S5, the expression of FLAIL is shown in the artificial miRNA lines. Is the antisense FLAIL affected "indirectly" by the cleavage of the amiRNA or remains constant? This is likely the case but should be shown.

      #2.5 We added this result in Fig. S4C; the expression level of asFLAIL remains constant compared to the empty vector control.

      5.The ChIRPseq data adds major novelty to the ms and brings new ideas about the way of action of FLAIL. However, are there any common epigenetic states between ChIRP targets (e.g. histone modifications, antisense RNA production, homologies "detected" in the conserved regions between Camelina and Arabidopsis and the target loci? Or others) that may highlight potential mechanisms leading to repression mediated by FLAIL of these loci? There are many databases that could be explored (even during flowering) to search for potential relationships. Although precise description of the mechanism is out of the scope of this ms, this can be discussed in more detail to further expand on the nice data obtained.

      #2.6 We searched for consensus DNA motifs in FLAIL binding regions using Homer. We determined four commonly enriched DNA sequence motifs in target genes. Notably, the target genes CIR1 and LAC8 contained consensus sequences that matched all FLAIL binding motifs (Fig. 4G). These data are consistent with a model where FLAIL binds DNA targets through a sequence complementarity mechanism. Because functionally important sequences are frequently conserved among evolutionarily distant species, we note that three motifs appeared to be conserved across species (Fig. S9), suggesting a potential evolutionarily constrained role.

      **Minor comments:**

      6.In Fig. S3, a global alignment between FLAIL and two loci in Arabidopsis and Camelina is sown. What is the extent of homology? How conserved is this sequence at nucleotide level (small or very long?) to support the conservation of this lncRNA. Are there potential structures conserved among these lncRNAs?

      #2.7 Two consensus regions of FLAIL sequences among eleven disparate Brassicaceae genomes are shown in Fig. S9. Camelina sativa shares 98 nucleotides of conserved sequence with Arabidopsis thaliana. In the future, it will be interesting to explore evolutionarily conserved structures among Brassicaceae genomes. However, these analyses are beyond the scope of the current manuscript.

      7.In Fig. S4B, arrows may help to understand which seeds were selected.

      #2.8 Thanks. Arrows were included.

      Reviewer #2 (Significance (Required)):

      This paper is a very nice piece of work and demonstrate the action of a long non-coding RNA (lncRNA) in trans on specific targets involved in the regulation of a developmental process, flowering. There is growing evidences that the non-coding genome hides large number of lncRNAs and there is little detailed genetic support for the action of lncRNAs globally. In contrast to many descriptive papers in the field, this ms demonstrates genetically, through an allelic series and complementation experiments, that this lncRNA locus is involved in flowering regulation and that its sense lncRNA recognizes target loci genome-wide, bringing interesting perspectives on potential new mechanisms of transcriptional regulation mediated by non-coding RNAs.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In the manuscript by Jin et al authors characterize the FLAIL DNA locus in Arabidopsis (using a wide array of publicly available datasets), which produces a set of sense and anti-sense lncRNAs.

      While our work on the FLAIL manuscript was ongoing, we published the manuscripts in which we presented these novel genomics methods and related data to capture nascent transcription and cryptic isoforms. We shared most data with TAIR, so we are happy to hear that these data are considered publicly available.

      Authors determined that the sense FLAIL lncRNA (or a set of sense lncRNAs, which isn't fully clear from the way the data are presented) is involved in flowering time in Arabidopsis based on the fact that the several flail mutants lead to the early flowering phenotype and this flowering defect is complemented by transgenic FLAIL DNA, meaning that FLAIL lncRNA acts in trans.

      A series of experiments lead us to conclude that the sense isoform of FLAIL is responsible for the effect. We improved the data representation and writing of the manuscript to enhance accessibility.

      The T DNA flail3 - mutant results in expression changes (up or down) of 1221 genes, including twenty genes linked to flowering in various ways. Expression of a group of these flowering-related genes could be either fully (for eight genes) or partially (for five genes) rescued by transgenic FLAIL. Authors also conducted the ChIRP-seq to determine which genes are physically bound by FLAIL lncRNA genome-wide. It was found that 210 genes in the genome are bound by FLAIL lncRNA. Comparison of the dataset of differentially expressed genes in the T-DNA flail3 mutant with the ChIRP-seq dataset of genes that are bound by FLAIL lncRNA revealed the 12 overlapping genes.

      Among these twelve overlapping genes, four were found to be functionally connected to flowering, with expression of these four genes being down in the flail3 T-DNA mutant. Two out of these four genes were ruled out from being involved in the regulation of flowering by FLAIL. Authors conclude that the two other genes (Cir1 and Lac8) are responsible for the late flowering phenotype of flail mutants based on three lines of evidence: (i) expression of these genes is reduced in the flail mutant, (ii) FLAIL lncRNA directly interacts with the chromatin of these genes, (iii) mutants of these genes were previously reported by others to display early flowering phenotypes too. While I find many of the findings reported in the manuscript very interesting, building a good foundation on which to expand the study and providing very good leads for follow-up experiments, I also have serious concerns about the manuscript in its current form.

      Most importantly, this reviewer doesn't think that the mechanism of FLAIL lncRNA action was convincingly demonstrated. The main question would be how FLAIL lncRNA works and this question wasn't fully answered. It is great that FLAIL lncRNA binds directly to the two flowering-related genes, but what does it mean? Does it change any chromatin context of these genes quantitatively or qualitatively to affect the transcription? Or does it bind any components of transcriptional machinery and thus controls the transcriptional output?

      #3.1 This manuscript addresses an important question in the field: what is the evidence for functional elements in non-coding regions of genomes? Despite many efforts, convincing genetic support for these functions has often remained limited. In addition to our strong genetic data, we provided new evidence that FLAIL recognizes targets with evolutionarily conserved sequence motifs as part of the revision in Fig. 4F and Fig. S9. Additionally, we plan to do ChIP-qPCR to identify histone modifications on FLAIL targets.

      Additionally, the flail3 T-DNA mutant affects the expression of 1221 genes and FLAIL lncRNA physically interacts with 210 genes, so how can authors be fully sure that FLAIL lncRNA has only a direct effect on these two genes and doesn't also contribute to the regulation of genes upstream of Cir1 and Lac8, or even components of the transcriptional machinery that regulate these genes?

      #3.2 We agree with this opinion. It is the reason why we felt stating this exact conclusion in our previous manuscript was justified. In the revision, we improved the accessibility of our manuscript to clarify our model: the trans-acting sense lncRNA FLAIL can interact with the chromatin regions of its target genes to directly or indirectly regulate gene expression changes involved in flowering (Line 274).

      Theoretically, doing RNA-seq in the amiR-FLAIL sense lncRNA mutant might have a chance of reducing the number of affected DEGs, making it easier to analyze the FLAIL targets, even if the allele can't be used for complementation experiments.

      #3.3 Thanks for this suggestion. We plan to confirm key gene expression changes using amiRNA-FLAIL in full revision.

      Also, authors totally neglect putting the Cir1 and Lac8 genes into the context of flowering regulation, but it is something that needs to be done.

      #3.4 We discussed the roles of CIR1 and LAC8 in flowering regulation in Line 260-272. Flowering is fine-tuned to maximize reproductive success and seed production by endogenous genetic cues and external environmental stimuli such as photoperiod. Nevertheless, many details of the flowering pathways and their integration remain to be investigated. CIR1 is a circadian clock gene, induced by light and involved in a regulatory feedback loop that controls a subset of the circadian outputs and thus determines flowering time. Our GO analysis supports that a subset of DEGs is connected to the response to red or far red light, which includes, among others, key flowering genes such as phytochrome interacting factor 4 (PIF4) and CONSTANS (CO). FLAIL also binds the chromatin region of LAC8. LAC8 is a laccase family member that mainly modulates the phenylpropanoid pathway for lignin biosynthesis. Similar to flail, lac8 mutants flower early. While intermediates in this pathway or dysregulation of lignin-related genes could promote flowering in plants, the molecular connections of reduced LAC8 expression to effects on flowering time will require further investigation.

      Lastly, the paper needs to be totally rewritten to be even properly evaluated. In its current state it reads like a very short draft.

      #3.5 We reorganized the structure of manuscript, improved clarity and provided new mechanistic evidence in Fig. 4G and Fig. S9 to present a more complete manuscript.

      The Abstract is weak, the Introduction is written in such a telegraphic style that it is barely readable, and in many places there are no connections between sentences, so the information appears to be presented at random, even if it isn't.

      #3.6 We strengthened the Abstract by providing new evidence and improved the Introduction.

      The Results section is written rather rudimentary with information not being sufficiently provided to describe the results but rather scattered between the Results and Figure legends.

      #3.7 Thanks for your suggestions. We described each FLAIL length and all constructs in detail in Results, put a schematic of T-DNA and CRISPR mutants in Fig. 1A, moved comparative genomics data to the end of Results and ensured all figures are in order.

      The Discussion is the best written part of the manuscript.

      Thanks for your appreciation of the Discussion.

      The Conclusion section carries no specific information and reads more like a little summary suitable for a review article rather than experimental paper.

      #3.8 We agree with this opinion. This paragraph fits the Discussion better, and the Conclusion section was removed.

      Therefore, this reviewer thinks that regardless of how authors will choose to proceed with the current experimental version of the manuscript, it'd be in the authors' best interests to at least fully revise the paper before resubmitting anywhere. I'd also advise authors to seek professional editorial help specifically using an editor with the background in the plant sciences.

      Authors might also want to consider moving Fig.3 into the Suppl. as it doesn't carry much weight or significance and perhaps make existing figures more meaningful and comprehensive and by including a better diagram of the locus (e.g., Fig. S1), etc.

      #3.9 We thank the reviewer for this helpful suggestion. Fig. 3 represents the RNA-seq data. In combination with supporting data in the supplementary material, it gives an easy visual readout of the reproducibility of the findings in replicates of stranded RNA-seq. In the new submission, we moved it to Fig. S5B and highlighted in Fig. 3A the 13 differentially expressed flowering genes, as well as sense FLAIL, in flail3 that were rescued in the complementation line. Moreover, we gave screenshots of FLAIL itself and four flowering-related FLAIL targets in RNA-seq with a clear schematic representation of each locus. We believe these revisions improve Fig. 3.

      It's not practical to list all issues with the writing as the paper requires total re-writing, so I can just make a few suggestions without any specific order to help authors improve the paper:

      We are happy to improve our manuscript with the help of the reviewers. We addressed all comments including from reviewer #3 with a constructive spirit. However, since colleagues and reviewers #1 and #2 found the manuscript comprehensible to the point where they could make expert-level comments that illustrate understanding of the manuscript, a total re-writing did not feel like the most constructive suggestion to improve the manuscript.

      --There is no statement anywhere that states the goal of the study.

      #3.10 We stated the goal of the study in Line 50-69, and we think this is a misunderstanding. We summarized three issues that currently exist in the characterization of functional lncRNAs in the last sentences of the first three paragraphs of the Background: (1) in Line 50, the broad range of candidate hypotheses by which lncRNA loci may play functional roles calls for multiple approaches to distinguish alternative molecular mechanisms; (2) in Line 59, functional characterization of trans-acting lncRNAs remains a key knowledge gap in understanding the regulatory contributions of the non-coding genome; (3) in Line 69, the contribution of trans-acting lncRNAs to the regulation of distant flowering genes is currently unclear. In the last paragraph of the Background, we therefore stated that our goal is to address these questions through characterization of the functional FLAIL lncRNA in flowering repression using multiple genetic approaches and various genomic data.

      --No rationale is provided on why authors decided to examine this specific genomic locus.

      #3.11 For several years, our lab has studied the rules and roles of non-coding transcription. We have characterized, and are still characterizing, several loci with evidence of non-coding transcription in a range of species. Early experiments suggested that FLAIL functions in flowering; this manuscript clarifies that the function is executed by the sense FLAIL isoform acting as a trans-acting lncRNA.

      --Typically, the significance is in studying the function of lncRNA or a group of lncRNAs produced from a genomic locus, I don't think I ever encountered the instances when it was exciting to study just a specific genomic locus. If the locus does indeed have any significance for initiating the study, it needs to be explained.

      #3.12 This study is remarkable in many aspects. We fully discuss key strengths in the Discussion. First, we present a trans-acting lncRNA, FLAIL, that represses flowering by promoting the expression of floral repressor genes, as discussed in Line 247-281. Second, in Line 284-306, we explain that this study provides a compelling model of how to apply a series of convincing genetic data to functionally characterize lncRNA loci. Third, in Line 307-312, we note that evolutionarily conserved FLAIL sequences across species are key to characterizing functional microhomology in other Brassicaceae.

      --The locus can produce lncRNAs, but it can't harbor them.

      #3.13 We clarified this confusion by enhancing the presentation of Fig. 1 and providing a clearer description of each sequencing method and its results in Line 81-99. Although we provided evidence that both sense and antisense FLAIL transcripts are more stable in hen2-2, they were clearly observed in chr-DRS in wild type and in plaNET-seq in NRPB2-FLAG nrpb2-1, and sense FLAIL was even detected in TSS-seq and TIF-seq in wild type.

      --No length of FLAIL lncRNAs or their range is provided in the first section of Results.

      #3.14 We gave the length of sense FLAIL in Line 82 and antisense FLAIL in Line 86.

      --On many occasions authors don't state the rationale for doing experiments, which leads to information often flowing as random.

      #3.15 We enhanced the clarity of the rationale for each experiment and added connections between sentences to make the text more fluent, for example in the sentences in Line 99, Line 113, Line 126, Line 159, Line 183, Line 214, and Line 219.

      --What do authors mean by the subtitle "FLAIL characterizes a trans-acting lncRNA repressing flowering"? How can lncRNA FLAIL or FLAIL locus characterize lncRNA?

      #3.16 We changed it to “FLAIL represses flowering as trans-acting lncRNA” in Line 112.

      --Check all figures. E.g., Fig. 3B-E mentions only accession numbers for the genes.

      #3.17 The systematic gene IDs are a valid way to represent data, in particular for genomics data since it facilitates cross-comparisons. To make it more accessible we also show systematic names of each gene in Fig. 3A-F, Fig. S6 and Table S3.

      --It is not clear where exactly the T-DNA insertion is located relative to sense FLAIL in flail3 mutant (Fig. S4).

      #3.18 We moved the schematic to clarify this to revised Fig. 1A and the exact T-DNA insertion site is mentioned in the legend.

      --- What is the length of the complementing sense FLAIL lncRNA?

      #3.19 We now include the length of the complementing sense and antisense FLAILs in Line 351-352.

      --Check the description of each and every construct used and provide explanation for each in the Results. E.g., the pFLAIL:gFLAIL18/88 and pasFLAIL:gasFLAIL18/39 constructs aren't explained in Results, and can only be found in Fig. 2 legends.

      #3.20 We described each construct including pFLAIL:gFLAIL18/88 and pasFLAIL:gasFLAIL18/39 constructs in Line 133, amiR-FLAIL-11 and amiR-FLAIL-11 in Line 149.

      Reviewer #3 (Significance (Required)):

      Tens of thousands of lncRNAs have been identified in various eukaryotes, but their biological roles have been shown only for a small fraction of them, and the mechanisms of their action are delineated for only a very few of them. Most of the advances in the field of lncRNAs are reported in metazoans, while the field of lncRNAs in plants is lagging far behind in terms of knowledge about lncRNAs with assigned biological functions or lncRNAs with delineated mechanisms of action. From this point of view, this reviewer is always excited to see any new functional plant lncRNAs for which either biological or mechanistic functions have been determined, and deems the information on this subject significant. The manuscript's findings are potentially very interesting and present a decent body of work that lays a very solid groundwork for future experiments. My main concern about the manuscript's significance in its current form is the fact that no real solid mechanism of action for the described lncRNA or a set of lncRNAs (?) has been demonstrated. The best mechanistically studied lncRNAs in Arabidopsis are involved in the regulation of flowering time, particularly those that function in the vernalization flowering pathway and to a lesser extent in the autonomous pathway. The new FLAIL lncRNA or lncRNAs (?) described in this manuscript also appear to regulate the flowering time in Arabidopsis; however, more experiments would be needed to provide a definite conclusion about how direct FLAIL's effect is and how exactly it functions. That unfortunately diminishes the significance of the manuscript and makes it potentially interesting only to researchers studying flowering in Arabidopsis, and even then the manuscript's results would be too incomplete to draw a solid conclusion.

      Lots of functional phenotype have

      Additionally, the manuscript requires complete re-writing.

      We thank this reviewer for the appreciation of a decent body of work and a very solid groundwork for future experiments. We are confident that our revisions make the manuscript more comprehensible and highlight its qualities more accessibly.

    1. We need to be open to what takes place and able to change our plans and go with what might grow at that very moment both inside the child and inside ourselves.

      Isn't this the way children are, though? The way things are born inside them, and it may be because of something we take for granted as ordinary: something small or something big can cause a tremble or ripple, and it can grow into something exciting. I like to think it is that way for us too.

    1. Author Response

      Reviewer #1 (Public Review):

      Yang, Bhoo-Pathy, Brand et al detail their investigation of a large Swedish cohort compared with age matched controls to estimate the risk of short- and long-term cardiotoxicities of breast cancer therapies in a general breast cancer patient population. They find that breast cancer patients are at significantly increased risk of developing arrhythmia and heart failure both within the first year of cancer diagnosis as well as at least 10 years after. Interestingly, they find that there is an increased risk of ischemic heart disease within the first year after diagnosis, but no increased risk of ischemic heart disease in the long term.

      The authors should be commended for this large cohort study that achieves its goal of identifying the incidence and hazard ratio of cardiotoxicity associated with breast cancer treatment within a general breast cancer population. Their findings of increased risk of heart failure in patients treated with anthracyclines and trastuzumab is consistent with multiple prior studies in the field of cardio-oncology and adds to the validity of the data.

      The finding that there is only a slightly increased (and statistically insignificant) risk of ischemic heart disease after left sided radiotherapy is quite interesting, and as noted by the authors, differs from prior understandings about risk of ischemic heart disease associated with breast radiation therapy. Without data on mean heart dose or total radiation administered the results are hypothesis generating, but should not be utilized to guide medical decision making.

      One of the major limitations of this study is that the authors' goal is to identify the incidence and risk of cardiotoxicity associated with the various breast cancer treatment regimens and determine these risks over time, and as noted by the authors, the registry utilized only includes planned treatment not whether patients did receive this therapy (and what dose of therapy). This is a key point that should be emphasized when interpreting the results.

      As noted by the reviewer, the Stockholm-Gotland Breast Cancer Register only included the intended treatment, without detailed dosage of the therapy. However, the agreement between intended and administered treatment was about 95% in Sweden (Löfgren L, et al. BMC Public Health. 2019). We have now further explained this in the discussion section.

      In Discussion: “Overall, our results indicate only small risk of heart disease due to radiotherapy in women treated in Sweden after year 2000. Further studies with detailed information on the mean heart dose of radiation or total cumulative radiation dose administered are therefore needed to confirm and provide more context to this finding.”

      In Discussion: “Besides, the Stockholm-Gotland Breast Cancer Register only records intended treatment, not whether patients actually received these therapies. However, the agreement between the intended and administered breast cancer treatment in Sweden has been previously reported to be about 95% (Löfgren et al., 2019).”

      There are several conclusions included in the discussion section that are not supported by the data from the results section and the authors should be careful to suggest mechanisms of cardiotoxicity from an observational population-based study. Examples include suggesting anthracyclines cause cardiotoxicity of the myocardium but not the cardiac vessels; attributing early increased risk of ischemic heart disease to emotional distress alone; and that inhibition of HER2 receptors in myocytes may explain cardiotoxicity caused by trastuzumab. These are interesting hypotheses that would be better supported by references to lab/animal model studies.

      We thank the reviewer for the suggestions and have now added the reference for the suggested mechanisms of cardiotoxicity with lab/animal model studies in the discussion section.

      In Discussion: “As the long-term risk was observed for heart failure but not ischemic heart disease, the cardiotoxic effect of chemotherapy might be mainly on the myocardium mediated by the effect of DNA double-strand breaks through topoisomerase (Top) 2β, but not the cardiac vessels. (Lyu et al., 2007)”

      In Discussion: “The finding that risk of ischemic heart disease in breast cancer patients was only transiently elevated after diagnosis is not unexpected, considering the emotional distress of dealing with a new cancer diagnosis in the patients, which may lead to higher short-term rates of ischemic heart disease (Fang et al., 2012; Schoormans, Pedersen, Dalton, Rottmann, & van de Poll-Franse, 2016). In addition, surgery after breast cancer diagnosis might increase the risk of arterial thromboembolism (Gervaso, Dave, & Khorana, 2021), which includes myocardial infarction, and the effect appears to attenuate one year after diagnosis. (Navi et al., 2017; Navi et al., 2019).”

      In Discussion: “The cardiotoxic effect of trastuzumab meanwhile may be explained by inhibition of the HER2 receptors in myocytes, that activates the mitochondrial apoptosis pathway through modulation of Bcl-xL and -xS, which regulates cell development and growth (Grazette et al., 2004; Yeh & Bickford, 2009)”

      The authors succeed in highlighting the increased risk of cardiotoxicity associated with breast cancer treatment in the observed patient population. Rather than exploring the mechanism of cardiotoxicity for the treatment regimens observed, the data presented may be more useful to propose a longitudinal cardiac monitoring schedule for patients who have been treated for breast cancer, and who the current data suggest, are at long term risk for heart failure and arrhythmia.

      As we found an increased long-term risk of heart failure in breast cancer patients, especially in those treated with anthracyclines + taxanes and trastuzumab, we suggest a prolonged longitudinal cardiac monitoring schedule of ten or more years for these treated patients. We have added the suggestion in the discussion section.

      In Discussion: “Analysis by time since diagnosis revealed long-term increased risks of arrhythmia and heart failure following breast cancer diagnosis, suggesting that a longitudinal cardiac monitoring schedule might be helpful to improve cardiac health in breast cancer patients.”

      Reviewer #2 (Public Review):

      This is a registry-based study in which patients diagnosed with locoregional breast cancer (stages I-III) from 2001-2008, between the ages of 25-75, were compared to a randomly sampled cohort of 10 women matched by the year of birth and for three specific cardiac conditions as outlined in the key objective. Data was gathered by cross-referencing subjects' unique identification numbers in the Swedish Cancer Register, Patient Register, Cause of Death, and Migration Register. The Prescribed Drug Register was reviewed to gather information about prescribed medication, perhaps to infer the comorbid medical conditions for which medication was prescribed. Breast cancer treatment-specific information was missing in some cases, and use of anti-HER2 therapy was presumed based on HER2/neu status in some cases. While the primary objective of the study, to show evidence of increased risk primarily of heart failure and arrhythmias, seems to have been met in this patient registry-based study, there is some question about the specificity of the data, since it was gathered from the various registers and is subject to operator-dependent biases.

      Strengths: Study is a long term follow up of patients treated with potential cardiotoxic drugs, confirming the previously known association of specific heart disease to the use of these drugs. Longest follow up seems to be for 16 yrs for the earliest cohort of 2001 and minimum approximately 10 yrs for the cohort of 2008. This study does confirm that long term risk that remains even after the treatment is completed and potentially suggests that more robust cardiac function monitoring guidelines for survivors may be warranted.

      Weaknesses: This is a patient register based study. As outlined above, data was extracted by cross referencing various patient registers. Since the data was dependent on the ICD codes entered in the patient register, there seems to be potential for missed information.

      The Swedish Patient Register has quite high validity for the heart diseases analyzed in this study, with a positive predictive value between 88%-98%, by using the main diagnosis in the register. However, it is still possible that we have missed some information for heart disease and we have emphasized this limitation in the discussion section.

      In discussion: “The Swedish Patient Register has high validity for heart failure, arrhythmia and ischemic heart disease (with positive predictive value between 88%-98%) (Hammar et al., 2001; Ludvigsson et al., 2011), by analysing main diagnoses only. However, misclassification of heart diseases may still have occurred.”

      Preexisting comorbidities were also extracted through Patient Registers hence may be subject to same potential for missed information.

      The Swedish Patient Register has relatively high validity for the majority of comorbid diseases. However, patients without severe symptoms of the diseases might be treated in the primary health care centers, which were not included in the patient register. We have therefore pointed out this limitation in the discussion section.

      In discussion: “In addition, preexisting comorbidities extracted from the patient registers may not include those patients with slight symptoms.”

      In addition, information for use of Trastuzumab was extrapolated from the Her2neu status of the patient when such information may not have been accessible through Prescribed Drug Registers.

      As the majority of HER-2 positive patients were treated in the clinics, the Swedish Prescribed Drug Register does not record their treatment information. Because ~90% of HER-2 positive cancers were treated with trastuzumab between 2005 and 2008 in the Stockholm-Gotland region, we used HER-2 positivity as a proxy for trastuzumab treatment. We have now further explained this in the methods section.

      In Materials and Methods: “As ~90% of HER-2 positive cancers were treated with trastuzumab between 2005 and 2008 in the Stockholm-Gotland region and the Swedish Prescribed Drug Register does not cover data on treatment with trastuzumab, HER-2 positivity was used as a proxy when no registry data on trastuzumab was available during this time period (30% of the HER-2 positive patients had missing information on trastuzumab).”

      It is also unclear if there was any protocol in place for cardiac monitoring for patients receiving cardiotoxic chemotherapy or Anti Her2neu agents.

      In Sweden, there is no cardiac monitoring for chemotherapy in routine clinical practice. For HER2-therapy, cardiac monitoring with a thorough cardiac assessment prior to treatment, including history, physical examination, and determination of left ventricular ejection fraction before, during and right after treatment has been mandatory since introduction in clinical routine. We have now added this information to the discussion.

      In discussion: “As there is no cardiac monitoring for chemotherapy in routine clinical practice and cardiac assessment is only performed prior to and during the treatment period for HER-2 positive patients in Sweden, a longer-term cardiac monitoring program might be helpful for these patients.”

      Reviewer #3 (Public Review):

      This matched analysis uses data from patients newly diagnosed with breast cancer in the Stockholm-Gotland Breast Cancer Register and data from patients in the general female population in Sweden to ask whether breast cancer diagnosis (and subsequent treatment of breast cancer) is associated with an increased rate of heart disease after treatment. It is impossible to answer this question in a randomized controlled setting, and it would be unethical to randomize patients to not be treated for their cancer; thus a matched approach would seem to make sense at face value. However, I have some concerns about the analysis that I believe impede answering the research aims.

      1. With regard to the matched analysis of time to heart disease diagnosis, I have several critiques/questions. First, for the breast cancer cohort, were patients with a diagnosis of heart disease prior to cancer diagnosis included in the analysis? If so, how was the event (which precedes time = 0) incorporated into the analysis? If not, please make sure to make note of this important restriction. I think the latter approach is the better / correct.

      As suggested by Referee 3, we have now excluded those patients with a diagnosis of heart disease prior to cancer diagnosis. We have updated the results and the methods section accordingly.

      In Materials and Methods:

      “We included all patients diagnosed with non-metastatic breast cancer (stages I-III) and without prior diagnosis of heart disease at age 25 to 75 years (N = 8015).”

      Second, for the matched cohort, what is time = 0 for these persons? i.e. how does one interpret "Time since diagnosis" on Figure 1 for a patient who has not been diagnosed with breast cancer?

      We apologize for this misunderstanding and have revised it to “Time since index date (= date of diagnosis, which is the same date for corresponding matched individual from the general population) ” in Figure 1.

      Third, how was the matching incorporated into the FPM? Presumably there should be a frailty term of some sort to indicate the matched groups, within which there is expected to be correlation.

      In the flexible parametric survival model for matched cohort data, a shared frailty term was incorporated into the model to indicate the matched cluster. The maximum (penalized) marginal likelihood method is used to estimate the regression coefficients and the variance for the frailty. We have added this explanation in the methods part.

      In Materials and Methods: “Considering the correlation within the matched clusters, a shared frailty term (as random effects) was incorporated into the model and the maximum (penalized) marginal likelihood method was used to estimate the regression coefficients and the variance for the frailty.”

      2. It is noted that Kaplan Meier curves were used to estimate the cumulative incidence of heart disease. How was death of the patient prior to diagnosis of heart disease handled? I do not think that Kaplan Meier is the correct approach here but rather an Aalen-Johansen-type estimator that treats death as a competing event. See e.g. https://pubmed.ncbi.nlm.nih.gov/10204198/ A Kaplan Meier will tend to overestimate the event rate when competing events are counted as censoring.

      As suggested by the reviewer, we have now used the Aalen-Johansen method to estimate the cumulative incidence of heart disease and revised the text in the Methods, as well as the tables and figures in the supplement.

      In Materials and Methods: “Aalen-Johansen estimation was used to assess the cumulative incidences of heart diseases in breast cancer patients and matched reference individuals, while other causes of death were considered as competing events.”
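
      The difference between the two estimators discussed in this point can be illustrated with a minimal sketch. It is not the authors' analysis code: the ten records, the event coding, and the function name `cumulative_incidence` are all hypothetical and chosen only to show why censoring competing deaths inflates the Kaplan-Meier complement relative to the Aalen-Johansen cumulative incidence.

```python
from typing import List, Tuple

# Hypothetical event coding (an assumption for this sketch):
# 0 = censored, 1 = heart disease, 2 = death from other causes (competing event)
Record = Tuple[float, int]


def cumulative_incidence(data: List[Record], horizon: float, cause: int = 1):
    """Return (Aalen-Johansen CIF, 1 - Kaplan-Meier) for `cause` at `horizon`."""
    event_times = sorted({t for t, e in data if e != 0 and t <= horizon})
    overall_surv = 1.0  # probability of being free of any event just before t
    cif = 0.0           # Aalen-Johansen cumulative incidence of `cause`
    km_surv = 1.0       # Kaplan-Meier survival with competing events censored
    for t in event_times:
        at_risk = sum(1 for u, _ in data if u >= t)
        d_cause = sum(1 for u, e in data if u == t and e == cause)
        d_any = sum(1 for u, e in data if u == t and e != 0)
        # Aalen-Johansen: weight the cause-specific hazard by event-free survival
        cif += overall_surv * d_cause / at_risk
        overall_surv *= 1.0 - d_any / at_risk
        # Kaplan-Meier: competing deaths only shrink the risk set
        km_surv *= 1.0 - d_cause / at_risk
    return cif, 1.0 - km_surv


# Made-up follow-up times (years) and event codes for ten subjects
toy = [(1, 2), (2, 1), (3, 2), (4, 1), (5, 0),
       (6, 2), (7, 1), (8, 0), (9, 2), (10, 1)]

aj, km_complement = cumulative_incidence(toy, horizon=9)
print(f"Aalen-Johansen: {aj:.3f}, 1 - Kaplan-Meier: {km_complement:.3f}")
```

      On these toy data the Kaplan-Meier complement exceeds the Aalen-Johansen estimate, which is the overestimation the reviewer describes; the size of the gap in real data depends on how common the competing event is.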

      3. The sentence "Missing indicators were included for the analysis of these covariates in the model" and the results in Table 3 suggest that some missing values were analyzed 'as is', meaning that missingness was used as a category itself. This of course is not desirable and there exists methodology+software for more appropriately handling these data, e.g. multiple imputation with chained equations. For example, how does one interpret that 'unknown chemotherapy' status is positively associated with heart failure but less so than anthracycline based chemo?

      Missingness of the type of adjuvant treatment was considered as a category in the previous version of our manuscript. To address potential biases resulting from missing data, we have now used multiple imputation with chained equations and revised the methods and Table 3 accordingly.

      In Materials and Methods: “Multiple imputation with chained equations was used to deal with the treatment categories with missing information. We replaced the missing data with 10 rounds of imputations and all the covariates were included in the imputation model.”
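After multiple imputation, the estimates from the imputed data sets are usually combined with Rubin's rules. The Python sketch below shows that pooling step only; it is not the authors' code, the function name `pool_rubin` is hypothetical, and the numbers in the usage example are invented (the response reports 10 imputations but not the per-imputation estimates).

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates (e.g. log hazard ratios) with Rubin's rules.

    estimates: one point estimate per imputed data set
    variances: the squared standard error of each estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Invented example: log hazard ratios from three imputed data sets.
pooled, total = pool_rubin([0.50, 0.62, 0.56], [0.040, 0.042, 0.041])
print(f"pooled log HR = {pooled:.3f}, standard error = {total ** 0.5:.3f}")
```

The pooled estimate would be exponentiated to report a hazard ratio; the between-imputation term is what carries the extra uncertainty due to the missing treatment data.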

      1. The reported HRs at the top of page 10 seem incongruous with the FPM model demonstrated in Figure 1, since there is clearly a non-linear relationship between the hazard and the outcome. In other words, there is little sense in which the hazards are proportional at all time points.

      As the FPM in Fig. 1 shows, the HRs were not constant over time since the index date. Therefore, in the revised version, we report the HRs separately for <1, 1-2, 2-5, 5-10 and 10-17 years after diagnosis. We have revised the abstract, methods, and Table 2.

      In Abstract: “Time-dependent analyses revealed long-term increased risks of arrhythmia and heart failure following breast cancer diagnosis. Hazard ratios (HRs) within the first year of diagnosis were 2.14 (95% CI = 1.63-2.81) for arrhythmia and 2.71 (95% CI = 1.70-4.33) for heart failure. HR more than 10 years following diagnosis was 1.42 (95% CI = 1.21-1.67) for arrhythmia and 1.28 (95% CI = 1.03-1.59) for heart failure. The risk for ischemic heart disease was significantly increased only during the first year after diagnosis (HR=1.45, 95% CI = 1.03-2.04).”

      In Materials and Methods: “We compared the risk of heart diseases in breast cancer patients with that observed in the matched cohort, using flexible parametric model (FPM) with time since index date as underlying time scale.”

      In Results: “A short-term increase in risks of arrhythmia and heart failure was found in breast cancer patients (Table 2, Figure 1, HR at first year for arrhythmia= 2.14; 95% CI = 1.63-2.81, for heart failure =2.71; 95% CI = 1.70-4.33, respectively).”
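A quick arithmetic check on the hazard ratios quoted above: if the intervals are Wald-type intervals that are symmetric on the log scale (an assumption, since the response does not say how they were computed), each HR should sit at the geometric mean of its confidence limits, and the standard error of log(HR) can be recovered from the interval width. The Python sketch below applies this to the values quoted from the Abstract; the dictionary labels are added here for readability.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Return (SE of log HR, geometric midpoint) implied by a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    midpoint = math.exp((math.log(lower) + math.log(upper)) / 2)
    return se, midpoint


# (HR, lower, upper) as quoted in the revised Abstract.
reported = {
    "arrhythmia, <1 year": (2.14, 1.63, 2.81),
    "heart failure, <1 year": (2.71, 1.70, 4.33),
    "arrhythmia, >10 years": (1.42, 1.21, 1.67),
    "heart failure, >10 years": (1.28, 1.03, 1.59),
    "ischemic heart disease, <1 year": (1.45, 1.03, 2.04),
}

for label, (hr, lower, upper) in reported.items():
    se, midpoint = log_scale_summary(lower, upper)
    print(f"{label}: HR={hr}, implied midpoint={midpoint:.3f}, SE(log HR)={se:.3f}")
```

Under this assumption, all five quoted estimates agree with their intervals to within rounding.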

      1. It seems unlikely that breast cancer diagnosis could ever be 'protective' for ischemic heart disease. A more constrained model that does not allow for the possibility of HR < 1 could provide a more sensible estimate of this time-dependent HR.

      To the best of our knowledge, an inverse association between breast cancer and the long-term risk of ischemic heart disease is plausible, because some reproductive risk factors for breast cancer have a protective effect on the risk of ischemic heart disease. We now address this in the Discussion.

      In Discussion: “The long term lower risk of ischemic heart disease in breast cancer patients compared to age-matched women might be explained by the opposite role of reproductive factors in breast cancer and ischemic heart disease. Women with younger age at menarche and older age at menopause were associated with increased risk of breast cancer, while decreased risk of ischemic heart disease were found among these women (Collaborative Group on Hormonal Factors in Breast, 2012; Okoth et al., 2020).”

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2021-01118

      Corresponding author(s): Jun, Nakayama and Kentaro, Semba

      1. General Statements

      We are grateful to all of the reviewers for their critical comments and insightful suggestions that have helped us considerably improve our paper. As indicated in the responses that follow, we have taken all of these comments and suggestions into account in the revised version of our paper, including the supplementary information.

      In the revised manuscript, we focus on the existence of two cancer stem cell-like populations in TNBC xenograft model and patients. The response to each reviewer is described below.

      Sincerely,

      Jun Nakayama

      Kentaro Semba

      Department of Life Science and Medical Bioscience

      School of Advanced Science and Engineering

      Waseda University

      E-mail: junakaya@ncc.go.jp or jnakayama.re@gmail.com to JN

      ksemba@waseda.jp to KS

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): * **Summary:** Nakayama and colleagues use their previously developed automated tissue microdissection punching platform to perform spatial transcriptomics on a breast cancer xenograft model. Using transcriptomics on multiple clumps of 10-30 cells from different regions in a tumor and a lymph node metastasis they identified different cell-type clusters. Two of these clusters expressed different cancer stem cell markers. This led the authors to suggest that two distinct cancer stem cell(-like) populations may exist within one (breast) tumor, which could potentially make tumors more drug-resilient.

      **Major comments:** While the quality of the presented sequencing data is good and the manuscript is mostly written in a clear and accessible style, there are some concerns that limit the impact of this story. Most importantly, the manuscript in its present form does not convince me that the MDA-MB-231 xenografts indeed contain two distinct populations of cancer stem(-like) cells.

      1.The data obtained are not single cell data, which makes it difficult -if not impossible- to draw conclusions about presence of cancer stem cells. Each data point is the average of 10-30 cells, and the interpretation of the data is severely limited by this. How can the quantification of expression of CD44/MYC/HMGA1 in clumps of 10-30 cells teach us something about the stemness of tumor cells? *

      Answer: We thank the reviewer for this comment. The point is important; however, it reflects a technical limitation of spatial transcriptomics technology. Most advanced spatial transcriptomics technologies, e.g. Visium (10x Genomics), have the same limitation. That is, both our technology and these advanced technologies analyze gene expression and tissue characteristics from 10-30 cells in each spot. Although high-resolution spatial transcriptomics was developed in 2021 [1], it is not yet in general use, as noted in Reviewer 1's comment under Significance.

      From our spatial analysis, we found that CD44, MYC, and HMGA1 were expressed in human cancer cells, and that their expression profiles differed among specific parts of the tumor section. To validate the existence of two types of cancer stem-like cells in TNBC tumors, we additionally analyzed a public scRNA-seq dataset of the highly metastatic MDA-MB-231-LM2 xenograft model (GSE163210) [2], which profiled primary tumors and circulating tumor cells. We analyzed the data with Seurat in R (Figure A-1). The reanalysis confirmed HMGA1 and CD44 expression at single-cell resolution (Figure A-2, A-3). These results verified the existence of two cancer stem cell-like populations (HMGA1-high, CD44-high) in the MDA-MB-231 xenograft. Hence, this xenograft dataset supports our findings from spatial transcriptomics.

      Additionally, we immunostained sections with anti-CD44 and anti-HMGA1 antibodies, as suggested in the reviewer’s comment 5. CD44 and HMGA1 were both detected in primary tumor sections: some cells expressed either CD44 or HMGA1, and others co-expressed both (Figure B). We believe our findings are robust because they were also validated by other methods.

      In the revised manuscript, Figure A is incorporated as Figure 3B-E and Figure B is incorporated as Figure 3A. We hope that the Reviewer and Editor will find these new results acceptable.

      Figure A-1. Reanalysis of scRNA-seq of metastatic MDA-MB-231 xenograft

      Flowchart of the public single-cell RNA-seq (scRNA-seq) reanalysis using GSE163210 datasets.

      Figure A-2. UMAP plots of xenograft and CD44/HMGA1 expression

      UMAP plot of MDA-MB-231-LM2 xenograft tumors and circulating tumor cells (Left). Expression of CD44 and HMGA1 in the UMAP plot (Right).

      Figure A-3. Pie chart of CD44/HMGA1 positive cancer cells in MDA-MB-231 xenograft

      Pie chart of cancer stem cell-like population ratio in MDA-MB-231-LM2 xenografts.

      Figure B. Fluorescent immuno-staining of MDA-MB-231 primary tumor

      Representative images of primary tumor sections of the MDA-MB-231 xenograft model immunostained for CD44 and HMGA1. Red: HMGA1, Green: CD44, and Blue: Nucleus. Scale bars, 20 μm (left), 10 μm (right). White arrows indicate cancer cells that expressed one marker alone or co-expressed both.

      *2. Furthermore, the authors should better explain their data analysis strategy with identification of gene expression profiles. It is unclear how they found CD44, MYC, and HMGA1 other than by cherry-picking from the list of cluster markers.*

      Answer: To characterize the clusters, we identified differentially expressed genes (DEGs) with the ‘FindAllMarkers’ function of Seurat. ‘Cluster 0’ significantly expressed HMGA1, and ‘cluster 1’ significantly expressed CD44. HMGA1 and CD44 are widely used cancer stem cell markers in triple-negative breast cancer [3, 4]. Because this study focuses on metastasis-related genes and cancer stem cell markers (as described in the Introduction), and because cancer stemness is an important concept in cancer metastasis [5-7], we concentrated on these markers. These results suggest that the existence of two cancer stem cell-like populations could make tumors more drug-resilient in xenograft models and clinical patients.

      To improve the manuscript, we revised the description in the revised manuscript (Pages 5-6, Lines 97-105).

      *3. Following up on the above point: I looked in the supplementary tables, but couldn't find MYC. How did the authors conclude that MYC is involved in cluster 1? In fact, when I ran a quick analysis in EnrichR, I saw that putative MYC target genes were strongly enriched among the markers in the HMGA1 cluster, but not the CD44/MYC. That's opposite to what I would expect.*

      Answer: We apologize for the confusing data and description. We first found the expression of CD44 and HMGA1 in the respective clusters. We then performed upstream enrichment analysis in Metascape using the gene signatures from FindAllMarkers. This analysis indicated MYC activation in the CD44-high cluster; therefore, we named it the “CD44/MYC-high” cluster.

      To improve the manuscript, we revised Figure 2, Supplementary Table S3, and the manuscript text (Pages 5-6, Lines 103-106).

      *4. All data were produced from 1 primary tumor and 1 metastasis. Thus, reproducibility and robustness of the methodology cannot be evaluated. The interpretation of the data could be strengthened when xenografts from at least 3 different mice are shown.*

      Answer: We thank the reviewer for the suggestion. As the reviewer notes, we analyzed 1 primary tumor and 1 metastatic lesion from a single transplanted mouse. Because this experiment takes a long time, we sought to validate the findings by other methods (Figure A: scRNA-seq analysis of MDA-MB-231 xenografts; Figure B: immunostaining of an MDA-MB-231 primary tumor; Figure C: scRNA-seq analysis of TNBC patients).

      First, we reanalyzed the public single-cell RNA-seq dataset of MDA-MB-231 xenograft tumors and circulating tumor cells in immunodeficient mice, as shown in the answer to comment 1 (Figure A). Next, we immunostained sections with anti-CD44 and anti-HMGA1 antibodies, as suggested in the reviewer’s comment 5. CD44 and HMGA1 were detected in primary tumor sections: some cells expressed either CD44 or HMGA1, and others co-expressed both (Figure B). We then reanalyzed 19 scRNA-seq samples integrated from 3 TNBC cohorts (Figure C-1). In the UMAP plot, CD44-positive and HMGA1-positive cancer cells differed, but they did not visually form distinct clusters (Figure C-2). CD44 and HMGA1 were expressed throughout the UMAP plot, although CD44 formed some specific clusters (cluster on the right side). Additionally, following the comment, we performed a population analysis in each patient (Figure C-3 and C-4). The detection of a double-positive population in TNBC patients suggests that this population may consist of more undifferentiated cancer stem cells that divide into both CD44-positive and HMGA1-positive cells.

      In addition, we reanalyzed primary tumors and metastatic lesions from other mice as test trial samples (Figure D-1). The microspots, including those from the test trial samples, formed 3 human clusters, which were classified as CD44/MYC, HMGA1, and Marker-low. We believe our findings are robust because they were also validated by other methods.

      In the revised manuscript, Figure A is incorporated as Figure 3B-E, Figure B as Figure 3A, and Figure C as Figure 5. Figure D is shown only in this response to the reviewer’s comment. We hope that the Reviewer and Editor will find these new results acceptable.

      Figure C-1. Reanalysis of integrated TNBC patients scRNA-seq

      A flowchart of the reanalysis of public scRNA-seq datasets. We downloaded GSE161529, GSE176078, and GSE180286 (scRNA-seq data of 19 TNBC patients). The integrated datasets were analyzed with Seurat. Log normalization, scaling, PCA and UMAP visualization were performed following the basic protocol in Seurat. To extract the cancer cells, cells expressing EPCAM/KRT8 (epithelial markers) were selected. A UMAP plot of cancer cells from 19 TNBC patients (right).
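The log-normalization step named in this caption can be illustrated for a single cell. The Python sketch below reproduces the arithmetic of Seurat's default `LogNormalize` method (counts divided by the cell total, multiplied by a scale factor of 10,000, then `log1p`); it is not the authors' R code, the function name is hypothetical, and the counts are invented.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """Log-normalize one cell's raw counts: log1p(count / total * scale_factor)."""
    total = sum(counts.values())
    if total == 0:
        # A cell with no counts has nothing to normalize.
        return {gene: 0.0 for gene in counts}
    return {
        gene: math.log1p(count / total * scale_factor)
        for gene, count in counts.items()
    }


# Invented raw counts for one cell.
cell = {"CD44": 5, "HMGA1": 0, "EPCAM": 95}
print(log_normalize(cell))
```

Because every cell is rescaled to the same total before the log transform, a fixed cut-off on the normalized value is comparable across cells with different sequencing depth.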

      Figure C-2. CD44/HMGA1 expression in TNBC patients

      Expression analysis of CD44 (Expression level > 2) and HMGA1 (Expression level > 2) with UMAP plots.

      Figure C-3. CD44/HMGA1-positive cancer cell with UMAP plot

      UMAP plots of CD44-high, HMGA1-high, HMGA1/CD44-high, and Negative cancer cells.

      Figure C-4. Ratio of CD44/HMGA1-positive cancer cell in each patient

      The bar plot showed the ratio of cancer cells that expressed CD44 and HMGA1.
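The four groups shown in Figures C-3 and C-4 follow from thresholding two genes. A minimal Python sketch of that classification and of the per-patient ratios is given below. It assumes the cut-off of expression level > 2 stated in the Figure C-2 caption; the function names and the toy cells are hypothetical, and this is not the authors' code.

```python
from collections import Counter, defaultdict


def classify(cd44, hmga1, threshold=2.0):
    """Assign a cell to one of the four groups by normalized expression."""
    cd44_pos = cd44 > threshold
    hmga1_pos = hmga1 > threshold
    if cd44_pos and hmga1_pos:
        return "HMGA1/CD44-high"
    if cd44_pos:
        return "CD44-high"
    if hmga1_pos:
        return "HMGA1-high"
    return "Negative"


def ratios_by_patient(cells):
    """cells: iterable of (patient_id, CD44 expression, HMGA1 expression)."""
    counts = defaultdict(Counter)
    for patient, cd44, hmga1 in cells:
        counts[patient][classify(cd44, hmga1)] += 1
    return {
        patient: {label: n / sum(c.values()) for label, n in c.items()}
        for patient, c in counts.items()
    }


# Invented cells for two hypothetical patients.
toy_cells = [
    ("P1", 3.0, 0.5),
    ("P1", 0.1, 2.5),
    ("P1", 2.4, 2.2),
    ("P1", 0.0, 0.0),
    ("P2", 2.0, 2.0),  # exactly at the threshold counts as negative
]
print(ratios_by_patient(toy_cells))
```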

      Figure D-1. Analysis of microspots of MDA-MB-231 xenografts including test trial samples

      UMAP plots of CD44-high, HMGA1-high, and Marker-low clusters with test trial samples (2 primary tumors and 1 lung metastasis). ‘Primary tumor 1’ has 20 microspots, ‘Primary tumor 2’ has 24 microspots, and ‘lung metastasis’ has 7 microspots. RNA extraction failed for most lung metastasis microspots; therefore, these spots were classified into the Marker-low cluster.

      Figure D-2. Expression analysis of CD44, HMGA1, and MYC

      Feature plot of CD44-high, HMGA1-high, and Marker-low clusters with test trial samples.

      *5. The only methodology is single cell RNA-sequencing. Immuno-staining on relevant markers such as CD44, MYC, HMGA1 plus human epithelium and cell cycle markers would provide strong additional support for the claims made by the authors, because it's a complementary technique and it allows quantification at single cell resolution.*

      Answer: We thank the reviewer for the comment. As described in our responses to comments 1 and 4, we immunostained sections with anti-CD44 and anti-HMGA1 antibodies. CD44 and HMGA1 were detected in primary tumor sections: some cells expressed either CD44 or HMGA1, and others co-expressed both (Figure B).

      In the revised manuscript, Figure B is incorporated as Figure 3A.

      *6. Line 173-175. The marker-low cluster looks to me simply like spots containing a relatively high amount of dead/dying (tumor) cells. The identity/state of cells in the marker-low cluster should be characterized and discussed more extensively.*

      Answer: We thank the reviewer for this important comment. Indeed, the total RNA count in the Marker-low cluster was lower than in the HMGA1-high and CD44/MYC-high clusters (Supplementary Figure S1B). The Ttr-high mouse cluster also has a low total RNA count (Supplementary Figure S1C).

      Following the comment, we now state that the Marker-low and Ttr-high clusters may include dead/dying cells (Page 13, Lines 268-279).

      *7. Figure 5 and accompanying text in line 182-194; the authors try to infer cell-to-cell interactions using a previously published tool. However, any biological interpretation is lacking. What can be concluded from this analysis?*

      Answer: Algorithms for cell-to-cell interaction analysis were first reported as previously published tools [8, 9]; in this manuscript, however, we wrote our own code for cell-to-cell interaction analysis using the interaction database of the Bader laboratory at the University of Toronto (https://baderlab.org/CellCellInteractions#Download_Data), as previously described [10, 11]. We aimed to estimate the cell-to-cell interactions within each spot (10-30 cells). We think that this analysis will be helpful for discovering the cancer stem cell niche and the metastatic niche [6].

      However, in the revised manuscript, we focused on the existence of two cancer stem cell-like populations in the TNBC xenograft and patients. Therefore, the CCI analysis in previous Figure 5 was moved to Supplementary Figure S7, and previous Figure 6 was removed from the revised manuscript.

      *8. Figure 6. Can the authors please explain more clearly what they mean by "PT" and "Mix" groups? I had a very hard time to understand what the data in figure mean. Again, an overall interpretation at the end (line 211) is lacking.*

      Answer: We apologize for the confusing result. We examined the combinations of human cancer cell clusters and mouse stromal cell clusters; in total, there are 10 combinations in the MDA-MB-231 xenograft. Combination groups found only in the primary tumor were named “PT”, whereas combination groups found in both the primary tumor and the lymph-node metastasis were named “Mix”. This CCI analysis focused on the cluster types of cancer cells and stromal cells. However, the revised study mainly focuses on the existence of two types of cancer stem cell-like populations in the TNBC xenograft and patients. Therefore, the CCI analysis by cluster type was deleted from the revised manuscript.

      In the revised manuscript, we focused on the existence of two cancer stem cell-like populations in TNBC xenograft and patients. Previous Figure 6 was removed from the revised manuscript.

      *9. Figure 7. I like the effort to align the results with public scRNA-seq data. But although the expression of the cluster-signatures is heterogeneous, there is no evidence for distinct (CSC-like) cell populations. Why don't these HMGA1 vs CD44 signature cells cluster away from each other in the UMAPs? Perhaps the patient-to-patient heterogeneity overwhelms differences within tumors, but in that case the authors could re-run their analysis for each patient separately, to make 6 patient-specific UMAPs. In its present form, this analysis does not convince me that two distinct CSC(-like) populations within one TNBC exist.*

      Answer: We thank the reviewer for the comment. To improve the reanalysis of clinical cohorts, we reanalyzed 19 scRNA-seq samples integrated from 3 TNBC cohorts (Figure C-1). In the UMAP plot, CD44-positive and HMGA1-positive cancer cells differed, but they did not visually form distinct clusters (Figure C-2). CD44 and HMGA1 were expressed throughout the UMAP plot, although CD44 formed some specific clusters (cluster on the right side). Additionally, following the comment, we performed a population analysis in each patient (Figure C-3 and C-4). A double-positive population was present in TNBC patients, suggesting that it may consist of more undifferentiated cancer stem cells that divide into both CD44-positive and HMGA1-positive cells.

      In the revised manuscript, Figure C is incorporated as Figure 5.

      * **Minor comments:** 10.In the Supplemental table 2 noticed that many of the marker genes have adjusted P values well above 0.05 (and even above 0.1). That makes the statistical analysis rather weak. This could especially be problematic since the authors entirely base their main claims on this marker analysis, and I recommend that the authors use more stringent P-value cut-offs in the cluster analysis. *Answer: We would thank the comment. We reshaped the list of differentially expressed genes (DEGs). Significantly expressed genes (adjusted p-value In mouse clusters, the enrichment analysis using significantly DEGs showed that only Tcell-like clusters had a lot of enriched terms. Citric acid (TCA) cycle, chemical stress response, and fatty acid oxidation were enriched in Tcell-like populations (Page 7, Lines 141-144).

      In the revised manuscript, the enrichment analyses are shown as Supplementary Figures S2 and S3B. We revised the sentences on the enrichment analyses (Page 6, Lines 114-121; Page 7, Lines 141-144). The network visualization of the enrichment analysis was removed from the revised manuscript because it did not support the conclusions of the study.

      *11. Line 129/130. If I look at figure 3A, I don't see this tendency that the authors describe. Can the authors provide statistical support or visual aid to make their claim more apparent to the reader?*

      Answer: We thank the reviewer for the suggestion. Following the comment, we performed a statistical analysis of spot position. Spots in the primary tumor section were categorized as outer (tumor edge) or inner (center of tumor) (Figure E-1, top). We counted the spots of each cluster (Figure E-1, table) and applied a chi-square test. CD44/MYC clusters were significantly enriched at the outer side of the primary tumor (Figure E-1, bar plot). In the lymph-node metastasis, by contrast, spots could not readily be defined as outer or inner. In addition, we performed a cell cycle analysis in the primary tumor and lymph node metastasis with a statistical test. The HMGA1-high and CD44/MYC-high clusters proliferated significantly in the lymph node metastasis section (Figure E-2).

      Therefore, in the revised manuscript, we revised the sentence on spot position in the lymph-node metastasis (Pages 8-9, Lines 159-172). Figure E-1 is incorporated as Figure 4D and Figure E-2 as Figure 4F. We hope that the Reviewer and Editor will find these new results acceptable.
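The spot-position test described in this answer is a chi-square test of independence on a cluster-by-position table of counts. The Python sketch below shows the calculation; it is not the authors' R code. The counts are invented (the real ones are in the Figure E-1 table, which is not reproduced here), and the layout of three clusters by two positions is an assumption. The closed-form p-value used here holds only for 2 degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df2(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2)


# Invented counts; rows are clusters, columns are (outer, inner) spots.
toy_table = [
    [12, 3],   # hypothetical CD44/MYC-high spots
    [4, 11],   # hypothetical HMGA1-high spots
    [5, 5],    # hypothetical Marker-low spots
]
stat, df = chi_square_independence(toy_table)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value_df2(stat):.4f}")
```

With spot counts this small, some expected cell counts may fall below 5, in which case an exact test is often preferred over the chi-square approximation.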

      Figure E-1. Statistical analysis of spot position

      Chi-test was performed by R. *p

      Figure E-2. Statistical analysis of cell cycle index

      Fisher’s exact test was performed by R. *p

      *12. Line 217; shouldn't this be 6 patients? I see six clusters and in the original paper six patients are mentioned.*

      Answer: We thank the reviewer for the comment. ‘6 patients’ is correct, and we have revised it. However, in the revised manuscript, we added an integrated analysis of TNBC as shown in the answer to comment 9.

      Previous reanalysis of clinical scRNA-seq (previous Figure 7) was removed from the revised manuscript. The reanalysis using 3 integrated TNBC cohorts (Figure C) is incorporated as Figure 5.

      Reviewer #1 (Significance (Required)): * Conceptual/biological impact: Showing the existence of distinct populations of CSCs within one (breast-)tumor potentially has a high impact on the field of fundamental and translational cancer research. As the authors state, it could be one key reason underlying drug resistance. However, the technology used by the authors does in my view not allow to make such a claim. First and foremost because the technology does not allow analysis at single cell resolution.

      Technical impact: The platform used by the authors can be of interest for some applications, but they already published this in Scientific Reports a few years ago. I'm afraid that with the rapid recent developments in the field of spatial single cell transcriptomics (See for example Srivatsan et al Science 2021; 373: 111-117), the technical impact on the field is relatively low.

      Audience: Researchers in the field of cancer biology with an interest to perform low-cost molecular analysis at low-resolution spatial-resolved tissue specimens (transcriptomics, but perhaps expanded with bisulfite sequencing, or ATAC sequencing) could be interested in the technology presented in this manuscript.

      My expertise: single cell transcriptomics, (cancer) cell cycle, cancer drug resistance, cell plasticity, mouse models. *

      **Referee Cross-commenting** I have read the comments and align mostly with reviewer #2. The authors need to improve this manuscript a lot before it's suitable for publication in any of the Review Commons journals.

      Answer: We are grateful to the reviewers. As indicated in the responses that follow, we have taken all of these comments and suggestions into account in the revised version of our paper, including the supplementary information.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): * This manuscript uses spatial transcriptomics to perform single cell-like expression analysis between a breast cancer cell line and tumor microenvironment in mice xenografted with these cells. Unfortunately, from the title, abstract, and introduction, it is difficult to understand exactly what the authors are focusing and discussing. It is also unclear the advantage of their technique for evaluating the populations observed within this manuscript. Furthermore, there is very little explanation of the results, and it does not appear to be a scientific logical structure. Hence, this manuscript is not suitable for acceptance in the journal. In order to improve the scientific quality of this study, the following concerns are presented.

      **Major concerns:** 1. Is cell-cell interaction (CCI) analysis a novel method? If so, please specify the details in the manuscript. If the basic concept and the principle of CCI analysis have not been published, please mention in the discussion section as a limitation that a manuscript on CCI analysis is under submission to the preprint. In addition, please revise the abstract and related text. *

      Answer: Algorithms for cell-to-cell interaction analysis were first reported as previously published tools [8, 9]; in this manuscript, however, we wrote our own code for cell-to-cell interaction analysis using the interaction database of the Bader laboratory at the University of Toronto (https://baderlab.org/CellCellInteractions#Download_Data), as previously described [10, 11]. We aimed to estimate the cell-to-cell interactions within each spot (10-30 cells). We think that this analysis will be helpful for discovering the cancer stem cell niche and the metastatic niche [6].

      However, in the revised manuscript, we focused on the existence of two cancer stem cell-like populations in the TNBC xenograft and patients. Therefore, the CCI analysis in previous Figure 5 was moved to Supplementary Figure S7, and previous Figure 6 was removed from the revised manuscript. We revised the description in the manuscript (Page 18, Lines 385-387).

      *2. The reviewer thinks that spatial transcriptomics plays an important role in your manuscript. Please describe the technique in the introduction.*

      Answer: We thank the reviewer for the comment. Following it, we now describe spatial transcriptomics techniques in the Introduction. We revised the manuscript (Page 4, Lines 63-65; Page 12, Lines 250-253).

      *3. The classification by expression profile (HMGA1, CD44/MYC and marker-low) lacks an explanation. Authors should mention in detail how these populations were extracted from breast cancer cell lines.*

      Answer: To characterize the clusters, we identified differentially expressed genes (DEGs) with the FindAllMarkers function of Seurat. ‘Cluster 0’ significantly expressed HMGA1, and ‘cluster 1’ significantly expressed CD44. Next, we performed upstream enrichment analysis in Metascape using the gene signatures from FindAllMarkers. This analysis indicated MYC activation in the CD44-high cluster; therefore, we named it the “CD44/MYC-high” cluster.

      HMGA1 and CD44 are widely used cancer stem cell markers in triple-negative breast cancer [3, 4]; therefore, we focus on cancer stem cell markers in the present study. Cancer stemness is an important concept in cancer metastasis [5-7]. These results suggest that the existence of two cancer stem cell-like populations could make tumors more drug-resilient in xenograft models and clinical patients.

      To improve the manuscript, we revised Figure 2, Supplementary Tables S2 and S4, and the manuscript text (Pages 5-6, Lines 97-106).

      *4. The description of the results is back and forth and confusing. Please reconsider the flow of the analysis.*

      Answer: We thank the reviewer for the comment. We reconsidered the description and structure of the manuscript. The revised manuscript focuses on the existence of two cancer stem cell-like populations in the TNBC xenograft and patients.

      To improve the manuscript, we revised Figure 2 to examine cluster characteristics by clustering and gene expression profiling. Figure 3 was revised to validate the two cancer stem cell-like populations in the TNBC xenograft model, Figure 4 to elucidate the spatial characteristics of each cluster, and Figure 5 to validate the two cancer stem cell-like populations in TNBC patients.

      *5. How did you evaluate the outsides of the samples with very different spot positions in Figure 3A? Please mention your evaluation method in a scientific manner. In particular, authors should clearly indicate the outer evaluation for the metastatic case.*

      Answer: We thank the reviewer for the suggestion. Following the comment, we performed a statistical analysis of spot position. Spots in the primary tumor section were categorized as outer (tumor edge) or inner (center of tumor) (Figure E-1, top). We counted the spots of each cluster (Figure E-1, table) and applied a chi-square test. CD44/MYC clusters were significantly enriched at the outer side of the primary tumor (Figure E-1, bar plot). In the lymph-node metastasis, by contrast, spots could not readily be defined as outer or inner. In addition, we performed a cell cycle analysis in the primary tumor and lymph node metastasis with a statistical test. The HMGA1-high and CD44/MYC-high clusters proliferated significantly in the lymph node metastasis section (Figure E-2).

      Therefore, in the revised manuscript, we revised the sentence on spot position in the lymph-node metastasis (Pages 8-9, Lines 153-172). Figure E-1 is incorporated as Figure 4D and Figure E-2 as Figure 4F. We hope that the Reviewer and Editor will find these new results acceptable.
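The Figure E-2 caption names Fisher's exact test for the cell cycle comparison. For a 2x2 table it can be computed directly from the hypergeometric distribution, as in the Python sketch below. This is not the authors' R code; the table layout (proliferating versus non-proliferating spots in two tissue sections) and the counts are assumptions for illustration, and the two-sided p-value uses the common convention of summing every table that is no more probable than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)  # tolerance for floating-point ties
    )
    return min(1.0, p)


# Invented counts: rows are tissue sections, columns are
# (proliferating spots, non-proliferating spots).
p = fisher_exact_two_sided(9, 2, 3, 8)
print(f"two-sided p = {p:.4f}")
```

Because the test conditions on the table margins and uses exact probabilities, it remains valid for the small spot counts typical of a single tissue section.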

      Figure E-1. Statistical analysis of spot position

      Chi-test was performed by R. *p

      Figure E-2. Statistical analysis of cell cycle index

      Fisher’s exact test was performed by R. *p

      *6. The spots in primary tumor have few counts derived from mouse stromal/immune cells, as shown in Figure S1A. Nevertheless, Figure 3C shows that mouse stromal/immune cells are evaluated in the same way in primary and metastatic sites. The reviewer thinks that the regions identified as Tcell-like in the metastatic site, where there are many mouse-derived counts, and in the primary, where there are few mouse-derived counts, do not have the same characteristics. If many mouse-derived counts were detected in a spot using the spatial transcriptomics, then there must be many mouse-derived cells in the spot. Please discuss how this expression is evaluated on this technique, which is not a single cell analysis.*

      Answer: We thank the reviewer for the comment. The point is important; however, it reflects a technical limitation of spatial transcriptomics technology. Most advanced spatial transcriptomics technologies, e.g. Visium (10x Genomics), have the same limitation. That is, both our technology and these advanced technologies analyze gene expression and tissue characteristics from 10-30 cells in each spot.

      In the spatial transcriptome analysis of mouse genes, we first performed log normalization and scaling. Because Seurat clusters single cells or spots on the features that vary among samples, we extracted the variable features with the ‘FindVariableFeatures’ function. PCA and clustering were then performed using only mouse genes to detect neighboring samples. After clustering the mouse spots, we characterized each cluster by identifying its gene signature. As the reviewer indicates, the detected RNA counts and features differ between the sites, so it is difficult to define the exact character and cell type of the stromal cells. In principle, spatial transcriptomics can only detect that a spot contains some kind of stromal cell expressing T-cell marker genes. We therefore named the cluster “Tcell-like”. Not all spots in the Tcell-like cluster have the same characteristics or cell types, but they do express T-cell marker genes. This is also a technical limitation of spatial transcriptomics. Higher-resolution spatial transcriptomics, such as the method developed in previous research [1], could probably detect stromal cells at single-cell resolution.

      In the revised manuscript, we focused on the two types of cancer stem cell-like populations that were validated by other methods (scRNA-seq and immunostaining). Because the method cannot define the exact character of each cluster, we moved the CCI analyses to the supplementary figures or removed them in part.

      We also revised the discussion in the revised manuscript (Pages 13-14, Lines 279-283).
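To make the normalization and feature-selection steps in this answer concrete, here is a minimal sketch in plain Python. The log normalization follows the usual Seurat convention (counts divided by the spot total, multiplied by a scale factor of 10,000, then `log1p`). The variance ranking is a deliberate simplification and is not the method used by ‘FindVariableFeatures’; the gene names, counts, and function names are hypothetical.

```python
import math

def log_normalize(counts, scale_factor=10_000):
    """Per-spot normalization: log1p(count / total_counts * scale_factor)."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(c / total * scale_factor) for gene, c in counts.items()}

def top_variable_genes(normalized_spots, n_top):
    """Rank genes by variance across spots (simplified stand-in for feature selection)."""
    genes = sorted({g for spot in normalized_spots for g in spot})
    variances = {}
    for gene in genes:
        values = [spot.get(gene, 0.0) for spot in normalized_spots]
        mean = sum(values) / len(values)
        variances[gene] = sum((v - mean) ** 2 for v in values) / len(values)
    return sorted(genes, key=lambda g: variances[g], reverse=True)[:n_top]

# Hypothetical raw counts for three spots; gene names are placeholders.
spots = [
    {"GeneA": 10, "GeneB": 0, "GeneC": 90},
    {"GeneA": 12, "GeneB": 50, "GeneC": 38},
    {"GeneA": 9, "GeneB": 1, "GeneC": 90},
]
normalized = [log_normalize(s) for s in spots]
variable = top_variable_genes(normalized, n_top=1)
print(variable)
```

Because each spot is normalized by its own total, spots with few mouse-derived counts and spots with many are placed on a comparable scale, but the estimates from low-count spots remain noisier, which is the limitation acknowledged in the answer.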

      * 7. Please explain how the gene symbols listed in Figure 4A were selected. Also, please indicate the characteristics of the gene groups that are not listed. *

      Answer: We selected the gene signature list from the results of the ‘FindAllMarkers’ function in Seurat, which extracts the genes significantly expressed in each cluster. The heatmap in the previous Figure 4A was drawn using these marker genes (Adjusted p-value 0.1). The genes highlighted in the heatmap have been reported as cancer-related or cell cycle-related genes.

      The genes used to draw the heatmap are listed in Supplementary Tables S2 and S4.

      * 8. Please describe the details of the division and cycle index in lines 141-142. *

      Answer: The cell cycle index is computed with a standard Seurat function [12] (https://satijalab.org/seurat/archive/v3.1/cell_cycle_vignette.html). A list of cell cycle markers is distributed with Seurat and can be split into G2/M-phase markers and S-phase markers. We applied this function to our spatial transcriptomics data to estimate the cell cycle phase of each spot.

      We revised the description in the manuscript (Page 16, Lines 331-332).
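The idea behind this kind of cell cycle scoring can be sketched as follows. This is a simplified illustration, not Seurat's implementation: the score here is the mean expression of a marker set minus the mean of all other genes, whereas Seurat uses binned control genes. The phase rule (G1 when both scores are negative, otherwise the phase with the higher score) reflects our understanding of the Seurat convention and should be treated as an assumption. The expression values are hypothetical and only a few example markers are used.

```python
def module_score(expression, markers):
    """Mean expression of marker genes minus mean expression of all other genes."""
    in_set = [v for g, v in expression.items() if g in markers]
    background = [v for g, v in expression.items() if g not in markers]
    if not in_set or not background:
        return 0.0
    return sum(in_set) / len(in_set) - sum(background) / len(background)

def assign_phase(s_score, g2m_score):
    """Assign a phase from the two scores (assumed rule, see lead-in)."""
    if s_score < 0 and g2m_score < 0:
        return "G1"
    if s_score == g2m_score:
        return "Undecided"
    return "S" if s_score > g2m_score else "G2M"

# A few example markers; the full lists shipped with Seurat are much longer.
S_MARKERS = {"MCM5", "PCNA"}
G2M_MARKERS = {"TOP2A", "CDK1"}

# Hypothetical normalized expression for one spot.
spot = {"MCM5": 2.0, "PCNA": 3.0, "TOP2A": 0.5, "CDK1": 0.5, "ACTB": 1.0, "GAPDH": 1.0}
s_score = module_score(spot, S_MARKERS)
g2m_score = module_score(spot, G2M_MARKERS)
phase = assign_phase(s_score, g2m_score)
print(s_score, g2m_score, phase)
```

Applied to a spot of 10-30 cells, the resulting phase describes the dominant signal in the spot rather than the state of any individual cell.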

      * 9.In Line 148-151, the expression and prognosis of TMSB10, CTSD, and LGALS1 is mentioned based on the previous reports. Aren't these findings the result of bulk? Is the HMGA1 cluster that the authors found involved in the prognosis of mice? Please clarify, as it is unclear what you want to discuss. *

      Answer: We apologize for the confusing data and description. The highlighted genes (TMSB10, CTSD, LGALS1, CENPK, and CENPN) were extracted as DEGs of the human cancer clusters (Supplementary Table S2). These genes have previously been reported as cancer-related or cell cycle-related genes, as described in the manuscript (Page 6, Lines 107-110). We focused on them in the manuscript to show which other genes are expressed in each human cluster.

      We extracted gene signatures from the DEGs and showed that the signature of the HMGA1-high cluster correlated with poor prognosis in TNBC patients. Our data suggest that the HMGA1 signature obtained at microspot resolution has the potential to be a novel biomarker for diagnosis, and that HMGA1-high cancer stem cells may contribute to poor prognosis.

      In this revision, because we repeated the DEG analysis with a significance threshold, the survival analysis was repeated with the new gene signatures in the METABRIC TNBC cohorts (Figure F).

      To improve the manuscript, we revised the description of DEG extraction and the heatmap (Page 6, Lines 106-112). We hope that the Reviewer will approve the revised text.

      Figure F. Survival analysis with gene signatures of HMGA1-high and CD44/MYC-high

      Survival analysis of TNBC patients (claudin-low subtype and basal-like subtype) in METABRIC cohorts by the Kaplan-Meier method. (Left) Survival analysis with the expression of the HMGA1 signatures (High = 151, Low = 247). Shading along the curve indicates the 95% confidence interval. Log-rank test, p = 0.012. (Right) Survival analysis with the expression of the CD44/MYC signatures (High = 333, Low = 65). Log-rank test, p = 0.079.
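For readers unfamiliar with the Kaplan-Meier method named in this legend, the following minimal sketch shows how the survival curve for one group is estimated. The follow-up times and event indicators are hypothetical and are not METABRIC data; the function name is an assumption. The log-rank comparison between the High and Low groups and the confidence band are not implemented here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time for each patient
    events: 1 if the event was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= leaving
    return curve

# Hypothetical follow-up data for five patients.
hypothetical_times = [2, 3, 3, 5, 8]
hypothetical_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(hypothetical_times, hypothetical_events)
print(curve)
```

Censored patients contribute to the risk set up to their last follow-up but never lower the curve, which is why the estimate differs from a simple fraction of surviving patients.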

      * 10. Please provide details of all statistical tests used in this manuscript and describe significance levels used in the p-values and FDR. *

      Answer: We extracted differentially expressed genes (DEGs) with the ‘FindAllMarkers’ function using the MAST method. MAST identifies differentially expressed genes between two groups of cells using a hurdle model tailored to scRNA-seq data [13]. The adjusted p-value is calculated by Bonferroni correction using all features in the dataset. For the spatial spot analysis, statistical analyses were performed with the chi-squared test and Fisher’s exact test.

      We revised the Materials and Methods section in the manuscript (Page 19, Lines 391-394).
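The Bonferroni adjustment mentioned in this answer multiplies each raw p-value by the number of tests (here, the number of features in the dataset) and caps the result at 1. A minimal sketch, with hypothetical p-values and a hypothetical feature count:

```python
def bonferroni_adjust(p_values, n_tests=None):
    """Bonferroni correction: multiply each p-value by the number of tests, cap at 1."""
    n = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * n) for p in p_values]

# Hypothetical raw p-values for three genes, corrected against a
# hypothetical total of 20,000 features in the dataset.
raw_p = [1e-5, 0.004, 0.2]
adjusted = bonferroni_adjust(raw_p, n_tests=20_000)
print(adjusted)
```

Correcting against all features, rather than only the genes that were tested, makes the adjustment conservative: a gene needs a very small raw p-value to remain significant.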

      * 11. Please mention CCI score (line 198). *

      Answer: As described in the answer to comment 1, the CCI score calculation follows the algorithms of previously published tools [8, 9]; however, we wrote our own code for cell-to-cell interaction, using the interaction database of the Bader laboratory at the University of Toronto (https://baderlab.org/CellCellInteractions#Download_Data). We extracted the genes whose expression value was greater than 2 and selected the ligand-receptor pairs in which both the ligand gene and the receptor gene were expressed in the same spot.

      We revised the Materials and Methods section in the manuscript and the Supplementary Legends (Page 18, Lines 385-387).
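The selection rule described in this answer (expression greater than 2, with ligand and receptor both expressed in the same spot) can be sketched as below. This covers only the selection of candidate pairs, not the CCI score itself, whose formula is not given here. The gene names, expression values, and function names are hypothetical placeholders rather than entries from the Bader laboratory database.

```python
def expressed_genes(spot_expression, threshold=2.0):
    """Genes whose expression value is strictly greater than the threshold."""
    return {gene for gene, value in spot_expression.items() if value > threshold}

def active_interactions(spot_expression, ligand_receptor_pairs, threshold=2.0):
    """Ligand-receptor pairs for which both genes are expressed in the same spot."""
    expressed = expressed_genes(spot_expression, threshold)
    return [
        (ligand, receptor)
        for ligand, receptor in ligand_receptor_pairs
        if ligand in expressed and receptor in expressed
    ]

# Hypothetical ligand-receptor table and expression values for one spot.
pairs = [("LigandA", "ReceptorA"), ("LigandB", "ReceptorB"), ("LigandC", "ReceptorC")]
spot = {
    "LigandA": 3.5, "ReceptorA": 2.4,
    "LigandB": 5.0, "ReceptorB": 1.0,
    "LigandC": 2.0, "ReceptorC": 4.0,
}
selected = active_interactions(spot, pairs)
print(selected)
```

In a xenograft setting the ligand and receptor can come from different species (human cancer cells and mouse stromal cells), so a real implementation would also need to map genes between the two annotations before matching pairs.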

      * 12. Lines 204-206 and Figure 6G show specific interaction of ITGB1 and CST3, but it is unclear why only these molecules were extracted. What about the other molecules? At least ITGB1 is not scored in mix5. *

      Answer: We selected genes that have been reported as cancer-related in breast cancer in order to discuss the interactions in the primary tumor and lymph-node metastasis. In this revision, however, the study focuses mainly on the existence of two types of cancer stem cell-like populations in TNBC xenografts and patients. The CCI analyses by cluster type were therefore moved to the supplementary figures, and some are no longer shown.

      In the revised manuscript, the previous Figure 6 has been removed.

      * 13. HMGA1 signature appears in Line 214, please explain in detail. *

      Answer: As described in the answer to comment 7, we selected the gene signature list from the results of the ‘FindAllMarkers’ function, which extracts the genes significantly expressed in each cluster. The HMGA1 signature genes were selected from the significantly differentially expressed genes of the HMGA1-high clusters.

      We revised the description in the revised manuscript (Pages 9-10, Lines 190-193).

      * 14. Authors should discuss how the previously reported bulk expression data used in Figure 7E can be linked to the single-cell-like analysis in this study. *

      Answer: Previous research reported that gene signatures extracted from specific clusters in an scRNA-seq study have the potential to be prognostic markers [14]. We showed that the gene signature of the HMGA1-high cluster correlated with poor prognosis in TNBC patients. Our results suggest that gene signatures obtained at microspot resolution (10-30 cells) have the potential to be prognostic markers. The punching microdissection system makes it possible to extract only the parts of a section that are needed for cancer diagnosis and to analyze them at low cost. It could be applied to diagnostics in place of laser-capture microdissection methods.

      We performed an additional survival analysis with the METABRIC cohorts. As described in this revision, because we repeated the DEG analysis with a significance threshold, the survival analysis was repeated with the new gene signatures in the METABRIC TNBC cohorts (Figure F).

      In the revised manuscript, Figure F is incorporated as Figure 6. The usefulness of gene signatures obtained at microspot resolution is additionally discussed (Page 12, Lines 242-245, 250-253).

      **Minor concerns:**

      * 15. Please describe how the normalized centrality was calculated in UMAP algorithm and explain what this means in the results. *

      Answer: The data show the expressional diversity in each cluster, based on the network centrality of a correlation network analyzed with graph theory. Differences in centrality among the clusters indicate differences in expressional diversity (Supplementary Figure 4): higher centrality represents lower expressional diversity, and vice versa. The detailed method for calculating centrality was described previously, where it was used to reveal differences between smokers and never-smokers [10, 11].

      We added the description in the Legend (Pages 7-8, Lines 145-150).

      * 16. Please mention an explanation for the red X in Figure 1B to the legend. *

      Answer: The red X marks a spot where RNA extraction failed. We added this description to Figure 1B.

      * 17. Please spell out the abbreviations in all figure legends. *

      Answer: We spelled out the abbreviations in the legends of all figures.

      * 18. Please explain what is meant by the color of the lines and the size of the circles in Figure 4D. *

      Answer: The network analysis was performed with Metascape (https://metascape.org/gp/index.html#/main/step1) [15]. The node size is proportional to the number of genes belonging to the term, and the node color represents the identity of the cluster. However, as described in the answer to the reviewer’s comment 9, we repeated the enrichment analysis with the significant DEGs. As a result, only the CD44/MYC cluster had many enriched terms.

      Therefore, network visualizations were removed from the revised manuscript.

      * 19. Please mention an explanation for the color of the spots in Figure 5D and 5F to the legend. *

      Answer: The colors indicate the spots assigned to the selected group.

      In the revised manuscript, the previous Figure 5 is incorporated as Supplementary Figure S7. We added the description to the legends of Supplementary Figures S7 and S8.

      * 20. Is "S51" in Line 148 a typo for "S5A"? *

      Answer: Thank you. We corrected it to “S5A”.

      * 21. Please mention an explanation for the bars in Figure 6D and 6F to the legend. *

      Answer: The bars show relative CCI scores. As described above, we removed the results of the CCI analysis by cluster group (previous Figure 6) from the revised manuscript.

      * 22. Please mention an explanation for the colors in Figure 7E to the legend. *

      Answer: The colors indicate the patient groups defined by the expression level of the gene signatures. We added this description to the legend of Figure 6.


      Reviewer #2 (Significance (Required)):

      * The approach in Figure 5 is interesting, but the rest of the results do not take full advantage of the technology developed by the authors. The structure of the manuscript should be re-examined and new perspectives added. I look forward to the future of the authors' research. *

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      * Microtissue transcriptome analysis of triple-negative breast cancer cell line MDA-MB-231 xenograft model using automated tissue microdissection punching technology revealed the existence of three cell-type clusters in the primary tumor and axillary lymph node metastasis. The CD44/MYC-high cluster showed aggressive proliferation with MYC expression, the HMGA1-high cluster exhibited HIF1A activation and upregulation of ribosomal processes. The cell-cell-interaction analysis revealed the interaction dynamics generated by the combination of cancer cells and stromal cells in primary tumors and metastases. The gene signature of the HMGA1-high cancer stem cell-like cluster has the potential to serve as a novel biomarker for diagnosis. The key conclusions are convincing. The data and methods are presented in a reproducible way. The experiments are adequately replicated and statistical analysis is adequate. Prior studies are appropriately referenced. The text and figures are clear and accurate. *

      Answer: We thank the reviewer for the valuable comments. As the reviewer mentioned, our findings show that the existence of two cancer stem cell-like populations has the potential to make tumors more drug-resilient. Our results suggest that gene signatures obtained at microspot resolution (10-30 cells) have the potential to be prognostic markers. The punching microdissection system makes it possible to extract only the parts of a section that are needed for cancer diagnosis and to analyze them at low cost. It could be applied to diagnostics in place of laser-capture microdissection methods.

      In this revision, we focused on the existence of two cancer stem cell-like populations in TNBC xenografts and patients. Following the other reviewers’ comments, we extracted DEGs with a significance threshold and revised the results of the enrichment analysis accordingly; this did not affect our main findings.

      To validate the existence of two types of cancer stem-like cells in TNBC tumors, we performed additional analyses (reanalysis of public scRNA-seq datasets and immunostaining of the MDA-MB-231 primary tumor). These results verified two cancer stem cell-like populations (HMGA1-high, CD44-high) in the MDA-MB-231 xenograft and in TNBC patients. We believe that our findings are solid because they were also validated by other methods.

      Again, we thank the reviewer for kindly reviewing our manuscript.

      Reviewer #3 (Significance (Required)):

      * In the past several studies showed the heterogeneity of cell-cell interactions between cancer cells and stromal cells in situ (Andersson et al, 2021; Wu et al, 2021) and tumor microheterogeneity (Jiang et al, 2016; Liu et al, 2016; Zhang et al, 2020). Spatial transcriptomics methods are important to reveal microheterogeneity of cancer. As a physician working in gynecology and obstetrics in my opinion the results of the study and spatial transcriptomic methods could be relevant to detect new biomarkers for diagnosis and prognosis of breast cancer in future and to find novel therapeutic targets to overcome drug resistance and facilitate curative treatment of breast cancer. *

      References in response letter

      1. Srivatsan SR, Regier MC, Barkan E, Franks JM, Packer JS, Grosjean P, et al. Embryo-scale, single-cell spatial transcriptomics. Science. 2021;373(6550):111-7. Epub 2021/07/03. doi: 10.1126/science.abb9536. PubMed PMID: 34210887.
      2. Moravec JC, Lanfear R, Spector DL, Diermeier SD, Gavryushkin A. Cancer phylogenetics using single-cell RNA-seq data. bioRxiv. 2021:2021.01.07.425804. doi: 10.1101/2021.01.07.425804.
      3. Liu H, Patel MR, Prescher JA, Patsialou A, Qian D, Lin J, et al. Cancer stem cells from human breast tumors are involved in spontaneous metastases in orthotopic mouse models. Proc Natl Acad Sci U S A. 2010;107(42):18115-20. Epub 2010/10/06. doi: 10.1073/pnas.1006732107. PubMed PMID: 20921380; PubMed Central PMCID: PMC2964232.
      4. Pegoraro S, Ros G, Piazza S, Sommaggio R, Ciani Y, Rosato A, et al. HMGA1 promotes metastatic processes in basal-like breast cancer regulating EMT and stemness. Oncotarget. 2013;4(8):1293-308. Epub 2013/08/16. doi: 10.18632/oncotarget.1136. PubMed PMID: 23945276; PubMed Central PMCID: PMC3787158.
      5. Weiss F, Lauffenburger D, Friedl P. Towards targeting of shared mechanisms of cancer metastasis and therapy resistance. Nat Rev Cancer. 2022. Epub 2022/01/12. doi: 10.1038/s41568-021-00427-0. PubMed PMID: 35013601.
      6. Oskarsson T, Batlle E, Massagué J. Metastatic Stem Cells: Sources, Niches, and Vital Pathways. Cell Stem Cell. 2014;14(3):306-21. doi: https://doi.org/10.1016/j.stem.2014.02.002.
      7. Turdo A, Veschi V, Gaggianesi M, Chinnici A, Bianca P, Todaro M, et al. Meeting the Challenge of Targeting Cancer Stem Cells. Front Cell Dev Biol. 2019;7:16. Epub 2019/03/06. doi: 10.3389/fcell.2019.00016. PubMed PMID: 30834247; PubMed Central PMCID: PMC6387961.
      8. Armingol E, Officer A, Harismendy O, Lewis NE. Deciphering cell-cell interactions and communication from gene expression. Nat Rev Genet. 2021;22(2):71-88. Epub 2020/11/11. doi: 10.1038/s41576-020-00292-x. PubMed PMID: 33168968; PubMed Central PMCID: PMC7649713.
      9. Kumar MP, Du J, Lagoudas G, Jiao Y, Sawyer A, Drummond DC, et al. Analysis of Single-Cell RNA-Seq Identifies Cell-Cell Communication Associated with Tumor Characteristics. Cell Rep. 2018;25(6):1458-68.e4. Epub 2018/11/08. doi: 10.1016/j.celrep.2018.10.047. PubMed PMID: 30404002; PubMed Central PMCID: PMC7009724.
      10. Watanabe N, Nakayama J, Fujita Y, Mori Y, Kadota T, Shimomura I, et al. Single-cell Transcriptome Analysis Reveals an Anomalous Epithelial Variation and Ectopic Inflammatory Response in Chronic Obstructive Pulmonary Disease. medRxiv. 2020:2020.12.03.20242412. doi: 10.1101/2020.12.03.20242412.
      11. Nakayama J, Yamamoto Y. Single-cell meta-analysis of cigarette smoking lung atlas. bioRxiv. 2021:2021.12.09.472029. doi: 10.1101/2021.12.09.472029.
      12. Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM, 3rd, et al. Comprehensive Integration of Single-Cell Data. Cell. 2019;177(7):1888-902.e21. Epub 2019/06/11. doi: 10.1016/j.cell.2019.05.031. PubMed PMID: 31178118; PubMed Central PMCID: PMC6687398.
      13. Finak G, McDavid A, Yajima M, Deng J, Gersuk V, Shalek AK, et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 2015;16:278. Epub 2015/12/15. doi: 10.1186/s13059-015-0844-5. PubMed PMID: 26653891; PubMed Central PMCID: PMC4676162.
      14. Cheng S, Li Z, Gao R, Xing B, Gao Y, Yang Y, et al. A pan-cancer single-cell transcriptional atlas of tumor infiltrating myeloid cells. Cell. 2021;184(3):792-809.e23. Epub 2021/02/06. doi: 10.1016/j.cell.2021.01.010. PubMed PMID: 33545035.
      15. Zhou Y, Zhou B, Pache L, Chang M, Khodabakhshi AH, Tanaseichuk O, et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10(1):1523. Epub 2019/04/05. doi: 10.1038/s41467-019-09234-6. PubMed PMID: 30944313; PubMed Central PMCID: PMC6447622.
    1. Author Response

      Reviewer #1 (Public Review):

      The authors use ribosome profiling (RiboSeq) and RNA sequencing (RNASeq) to characterise the transcriptome and translatome of two PRRSV species as well as the host in response to infection. One particularly exciting feature of the study is that the analysis is carried out at different times of infection, which shows how both the virus and the host regulate their gene expression. The authors identify several new regulatory mechanisms of virus gene expression. Unexpectedly, they also find that the frameshifting efficiency at the ORF1ab frameshifting site changes with time. This contradicts the dogma in the field, which states that frameshifting is constant and has evolved to be constant to produce a particular ratio of the two protein isoforms. The strength of the paper is in its comprehensible analysis. The paper is extremely rich in data, with 12 main and 23 Supplemental Figs and 11 Supplemental Tables, all of them rather complex. The main weakness is that it is written in a technical language that will be hardly readable by a non-specialist readership. Unfortunately, the authors do not make a good job in guiding the reader through their findings and hardly identify the most important findings, while leaving the details to the specialists. This is particularly exemplified in Fig. 12, which should present the summary of the findings and would be extremely helpful, but hardly provides any text at all. This is potentially a very interesting paper, but the impact on the field could be increased considerably by better presentation of the work.

      We would like to thank this reviewer for the positive comments about the scientific findings, and for their suggestions for improving the presentation of the work. This outside perspective was very useful in helping us see which parts of the paper required clearer explanation or less detail, which can be hard to discern when very close to the work. We have incorporated all of this reviewer’s suggestions and we think this has improved the manuscript and made it easier to follow.

      Reviewer #2 (Public Review):

      The authors used the ribosome profiling technique to study gene expression at transcriptional and translational levels in the cells infected with porcine reproductive and respiratory syndrome virus (PRRSV-1 and PRRSV-2) using ribosome profiling. The ribosome profiling was carried out on the cells at different time points within the first 12 hours of infection, thus providing information on gene expression changes during the time of infection.

      The analysis of ribosome profiling data is exceptionally detailed and includes scrupulous characterization of footprint read lengths, de novo prediction of translated ORFs, characterisation of local pauses and differential gene expression of host and viral genes. The RNA-seq analysis is on par with that, the authors did a superb job at characterising the composition of the viral transcriptome that included identification of heteroclite RNAs and defective interfering RNAs. This provided the authors with reliable information for the interpretation of translational mechanisms responsible for the translation of ORFs discovered with ribosome profiling data.

      A specific focus of the manuscript was placed on the characterisation of two instances of ribosomal frameshifting occurring in PRRSVs. In addition to "canonical" -1 frameshifting at a slippery sequence stimulated by downstream RNA secondary structure (common to many viruses), PRRSVs genome contains an additional frameshifting site whose efficiency is stimulated by a viral protein. The authors demonstrated that the efficiency of this frameshifting is increasing over time which is expected since the concentration of stimulating protein is increasing. Furthermore, the authors found that the efficiency of "canonical" frameshifting is also changed. The authors describe this as surprising since it directly contradicts the common description of its function as "setting the fixed ratio" between the synthesized products upstream and downstream of the frameshift site. Perhaps it is not so surprising in the hindsight, given that the frameshifting is dependent on so many different factors, folding states of RNA pseudoknots which are dynamic, ribosome density upstream, etc. it would be more surprising if the efficiency of frameshifting were indeed fixed. I think the "fixed ratio" was proposed mainly to draw a difference to ribosomal frameshifting occurring in cellular genes (like antizyme or bacterial release factor 2) where there seems to be only one functional product, but its synthesis level depends on the efficiency of frameshifting sensing certain conditions. It is great though that the authors observed such changes and I agree with the authors' speculations that this is unlikely to be unique to PRRSVs.

      While I found the work to be largely descriptive, the authors did not shy away from speculating about potential mechanisms responsible for observed regulation. The manuscript is hard to get through simply due to its large length and a lot of data, but reading it is rewarding.

      Again, we would like to thank this reviewer for their positive comments about the work, and to reiterate that hopefully the revised version of the manuscript will be easier to read.

      Reviewer #3 (Public Review):

      The manuscript by Cook et al. describes the first comprehensive gene expression analysis of two species of PRRSV, an important agricultural pathogen. Using ribosome profiling and RNA-sequencing, the authors systematically analyze the transcriptome of the virus and its translation, and their temporal kinetics. The analysis revealed non-canonical RNA species that are suggested to contribute to translation of parts of ORF1ab, changing the stoichiometry between the NSPs. In addition, the authors use the ribosome profiling data to identify novel overlapping ORFs, including a conserved uORF in the 5' leader, and to analyze the efficiency of frame-shift in two sites in the viral genome, one of which is trans-regulated by the viral nsp1β. The frame-shift efficiency in both sites is presented to be increasing late in infection. The authors also present conservation analysis from hundreds of available genomes. Finally, analysis of host gene expression uncovers a pattern suggesting translation inhibition of induced transcripts, and by comparing a WT virus to a mutant virus lacking the nsp2 site frame-shift, the authors identify a gene (TXNIP) whose expression is affected by nsp2TF.

      In this rigorous work, the authors uncover new insights on an important pathogen, which can be of value to the wider field of virology. However, due to technical issues a few of the authors claims may require reconsideration.

      We are grateful to this reviewer for their comments on the rigour and the impact of the work, as well as the suggestions for improvement which they included in their more detailed review. Within the detailed review, this reviewer expressed some concerns that ribosome run-off (seen in Figure 1—figure supplement 1 [formerly Supplementary Figure 1]) might confound the comparison of ribosome densities in different regions of the viral genome (particularly ORF1ab). However, this run-off only noticeably affects the first ~100 nt of host CDSs, which is very small compared to the ~12,000 nt total length of ORF1ab. The regions of ORF1ab in which we compare ribosome density in our study are almost all > 1,000 nt downstream of this ~100 nt run-off region and will therefore not be significantly affected by run-off. The exception to this is our assessment of heteroclite sgRNA translation, where the “heteroclite” region does include the first ~100 nt of ORF1a. As such, run-off may have a slight effect on this analysis, but we expect this to be minor, as the ~100 nt run-off region represents only a small proportion of the 1,550-nt “heteroclite” region. Further, any such effect would actually lead to under-estimation of heteroclite sgRNA translation, by artefactually reducing the relative RPF density in the heteroclite region. This would therefore strengthen our conclusion that our data provide evidence for heteroclite sgRNA translation.
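The size of the run-off effect argued above can be checked with simple arithmetic. The sketch below uses the approximate lengths quoted in the response (about 100 nt affected by run-off, about 12,000 nt for ORF1ab, and 1,550 nt for the “heteroclite” region); the variable names are assumptions made for this example, and the calculation gives only an upper bound on how much of each region could be affected.

```python
# Approximate lengths quoted in the response, in nucleotides.
RUNOFF_NT = 100          # region noticeably affected by ribosome run-off
ORF1AB_NT = 12_000       # total length of ORF1ab
HETEROCLITE_NT = 1_550   # length of the "heteroclite" region

# Fraction of each region that overlaps the run-off zone.
fraction_orf1ab = RUNOFF_NT / ORF1AB_NT
fraction_heteroclite = RUNOFF_NT / HETEROCLITE_NT

print(f"ORF1ab:      {fraction_orf1ab:.1%} of the region is within the run-off zone")
print(f"Heteroclite: {fraction_heteroclite:.1%} of the region is within the run-off zone")
```

Even if read density were fully depleted across the whole run-off zone, the mean density of the heteroclite region would fall by no more than this fraction (roughly 6 to 7%), and any such depletion would lower, not raise, the estimate of heteroclite sgRNA translation.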

    1. Author Response

      Reviewer #2 (Public Review):

      Romand et al investigates the role of hyperphosphorylated guanosine nucleotides (ppGpp) in acclimation of plant chloroplasts to nitrogen limitation. The signaling role of ppGpp as alarmone is well established in the stringent response of bacteria. The stringent response allows bacteria to adapt to amino acid or carbon starvation and other acute abiotic stress conditions by downregulation of resource-consuming cell processes. A series of studies, including the current one, have demonstrated the retention of the bacterial-type ppGpp-mediated signaling response in plant and algal chloroplasts. The current study convincingly demonstrates the involvement of ppGpp in remodeling of photosynthetic machinery under nitrogen limitation. Using three Arabidopsis RSH lines (two underaccumulators and one overaccumulator of ppGpp), the authors show that the ppGpp is required for preventing excess ROS accumulation, oxidative stress and death of cotyledons under nitrogen limiting condition. The authors show a transient accumulation in ppGpp upon nitrogen limitation, which is followed by a sustained increase in the ratio of ppGpp to GTP. There is a prompt decline in maximum photochemical efficiency of photosystem II (PSII) and linear electron transport under nitrogen deficiency in wild type and ppGpp overaccumulator plants. However, mutants with low amount of ppGpp have a delayed decrease in these photosynthetic parameters. PpGpp is further shown to decrease (or degrade) photosynthetic proteins, and a remodeling of PSII that involves uncoupling of LHC II from the reaction center core has been suggested to occur under nitrogen starvation. The authors also show a ppGpp-mediated downregulation of chloroplast gene transcription and a coordinated plastid-nuclear gene expression under nitrogen deficiency.

      Strengths:

      1. The conclusions of this paper are mostly well supported by data. With three different RSH lines, there is a convincing demonstration of the specific involvement of ppGpp in nutrient acclimation. The line carrying conditional overexpression of Drosophila ppGpp hydrolase (MESH) nicely complements the RSH lines and strengthens many of the conclusions. This is a detailed analysis of ppGpp function in a plant species. The data supplement accompanying each main figure is extensive and helpful.
      2. The genomic analysis in nitrogen replete and deplete wild type uncovers an interesting regulation of RSH enzymes at the transcriptional level. This is likely to be part of a signaling response that works in conjunction with allosteric modulation of RSH activity under nitrogen limitation.
      3. The large-scale analysis of plastid and nuclear gene transcripts supports the involvement of ppGpp in coordinated repression of plastid and nuclear gene transcription.
      4. By the inclusion of mitochondrial genes and proteins in their analysis, the authors clearly show that the ppGpp action is limited to plastids and does not extend to mitochondria, which like chloroplasts, have a bacterial ancestry.
      5. The thorough demonstration of the involvement of ppGpp in low nitrogen acclimation of photosynthetic metabolism adds greatly to the understanding of plant abiotic stress tolerance mechanisms and ppGpp function in both plants and bacteria.

      We thank the reviewer for these observations on our work.

      Weaknesses: 1. With two earlier reports from a different laboratory (Maekawa et al 2015 and Honoki et al 2018) showing the involvement of ppGpp in acclimation to nitrogen deficiency, the novelty of the current study is diminished. The authors mention that the double mutant (rsh2 rsh3) used by Honoki et al does not show a clear phenotype other than a delay in Rubisco degradation. It is not clear to me why the lack of two major RSH isoforms, involved in synthesis of ppGpp under light, would not produce any phenotype. This discrepancy should be discussed further in the manuscript.

      The work of Maekawa et al., 2015 and Honoki et al., 2018 was indeed important for highlighting the potential involvement of ppGpp in the acclimation to nitrogen deficiency. However, these studies were based on the constitutive overaccumulation of ppGpp. Here, we demonstrate a physiological requirement for ppGpp signalling by the plant to allow acclimation to an abiotic stress. We consider this to be a major step forwards in understanding the role of ppGpp in plants, and one of the few examples of a physiological requirement for ppGpp in plants.

      We mention the use of an RSH2 RSH3 mutant by Honoki et al. 2018 while putting our results into the context of previous findings in the discussion. We draw the reviewer’s attention to our analysis of an RSH2 RSH3 mutant in this study: in our hands, the mutant phenotype was indistinguishable from that of the RSH quadruple mutant (rshQM) (Figure 2- figure supplement 1 panel B). Therefore, we do indeed consider that RSH2 and RSH3 are the main RSH isoforms involved in ppGpp-mediated acclimation to nitrogen deficiency, and we state this (see p7 l161-164 in original manuscript). As we explain in the discussion, there are probably technical reasons for the discrepancy with the results reported by Honoki et al. 2018. We also note here that the RSH2 RSH3 mutants used in our study and by Honoki et al. 2018 are not identical: the same SAIL insertion, SAIL_305_B12, was used for rsh2, whereas the rsh3 allele was the GABIkat insertion GABI129D02 in Honoki et al. 2018 and SAIL_99_G05 here. We now add this difference in the genetic identity of the mutants as an additional potential explanation for the different findings in the two studies.

      1. The authors at times show a tendency to overinterpret their results. A ppGpp-mediated repression of chloroplast transcription and translation is sufficient to explain most of the observations in this study. However, the authors seem to go beyond this simple explanatory framework by invoking specific roles for ppGpp in remodeling of PSII antenna-core interaction and in blocking of PSII reaction center repair. There is no data in the manuscript in support of these two propositions. A coordinated decrease in synthesis of most chloroplast proteins, including the D1 reaction center protein of PSII, is sufficient to explain the decrease in Fv/Fm. There is no evidence in the manuscript for "photoinactivation gaining an upper hand via ppGpp-mediated signaling"

      The circuit breaker analogy of PSII photoinhibition that the authors discuss in support is just an interpretation. The remodeling of PSII antenna-core interaction, likewise, could be a simple consequence of the ppGpp-mediated decrease in D1 protein synthesis. The high antenna-core ratio under nitrogen starvation likely reflects the lag in the decrease of LHCB1 (which eventually decreases significantly by day 16).

      Since ppGpp-signaling primarily affects plastid transcription and translation, there is a rapid decrease in plastid psbA gene product (D1) relative to the nuclear-encoded LHCB1. The unconnected LHCII might simply be a result of the mismatch in antenna-core stoichiometry rather than an active regulation of PSII functional assembly by ppGpp.

      We have re-worked the discussion to make these points more clearly, and also to tone down certain points where we may have over-stretched our interpretation.

      We think that our interpretation is essentially the same as the reviewer’s: the ppGpp-mediated inhibition of chloroplast translation and transcription is sufficient to explain the majority of our results. In the discussion we also discuss the possibility that ppGpp stimulates the active degradation of some chloroplast proteins, and put this in the context of studies showing that N-starvation activates the specific proteolysis of certain photosynthetic proteins in Chlamydomonas and has an effect on the half-lives of different chloroplast proteins in plants. We do not propose or present data suggesting that ppGpp has any other specific targets/effectors (for example within the PSII repair cycle or in remodelling PSII stoichiometry), although we also cannot exclude the possibility of targets in these processes.

      We think that the ppGpp-dependent change in PSII stoichiometry during N-starvation is not just a side effect of a general downregulation or a temporary mismatch as suggested. Given its size, persistence and effect on photosynthesis, it is likely to be part of the acclimation process. For example, the ppGpp-dependent drop in Fv/Fm is maintained at day 16 and even beyond (Fig 2D). We also see that photosynthetic proteins are still degraded in low ppGpp mutants (Fig. 3A), but that the high Fv/Fm is maintained throughout. The fact that the alteration of PSII stoichiometry is caused not by the direct action of ppGpp on PSII but via transcription/translation does not mean that it is unimportant or plays no role in acclimation. Other studies report that PSII RC inactivation can protect PSI (e.g. Tikkanen et al. 2014) and ppGpp may be working in a similar fashion here by reducing the flow of energy into the photosynthetic electron transport chain. This interpretation is consistent with our results showing that wild-type plants and high ppGpp plants (rsh1-1) accumulate less ROS and ROS-related damage than plants defective in ppGpp biosynthesis (Fig. 1).

      1. The work is mostly descriptive of the involvement of ppGpp in low nitrogen tolerance without any data on how the nitrogen deficiency is sensed by the RSH enzymes and how ppGpp orchestrates the multi-faceted acclimatory response. Perhaps, these aspects are beyond scope of the current manuscript, but they could be discussed more.

      We agree that these are very important questions, and also that they are out of the scope of the current work. We think that our work goes beyond the descriptive by demonstrating the physiological functions of ppGpp signalling during nitrogen deficiency and by providing a framework for how it occurs (i.e., downregulation of chloroplast function and avoidance of excess oxidative stress).

      Reviewer #3 (Public Review):

      The manuscript by Romand et al. explores the role of guanosine penta- and tetraphosphate, ppGpp, in the acclimation of plants to nitrogen limitation. It shows that an early and transient ppGpp accumulation - and a controlled ppGpp/GTP ratio - is necessary for a proper acclimation of plants to such stress. The pathway is shown to act on remodeling the photosynthetic machinery and downregulating photosynthesis during stress, thus limiting ROS damage to the plants. This regulation most likely takes place by affecting chloroplast transcription, maintaining the balance between nucleus- and chloroplast-encoded proteins.

      The manuscript proposes a thorough analysis of the ppGpp-induced response including extensive wild type and mutant analyses at the gene and protein expression level as well as at the physiological level under nitrogen limitation together with heterologous expression of ppGpp hydrolase from Drosophila. The conclusions are carefully backed by the data (but for the lack of gene expression analysis in the high ppGpp line, rsh1-1), the figures and text clear, well-written and easy to follow. Altogether it represents a solid new step in improving the comprehension of plant response to nitrogen limitation, as well as on the role of ppGpp in plants and possibly throughout the green lineage. An alternative hypothesis to ppGpp photoprotective role could be discussed in that photoprotection may be an indirect effect due to photosynthetic protein degradation enabled by ppGpp, possibly through modulation of ppGpp/GTP ratio affecting chloroplast protease activity.

      On this last point we agree with the reviewer: our data indicate that the photoprotective role of ppGpp is via the ppGpp-dependent control of the abundance of photosynthetic proteins. This is indirect in the sense that we have no evidence that ppGpp itself interacts with components of the photosynthetic machinery. However, as discussed below, we do not think that photoprotection is just a side-effect of ppGpp’s action: we show that the capacity to synthesise ppGpp is required for avoiding the generation of ROS and tissue death.

    1. Author Response

      Reviewer #1 (Public Review):

      In this paper, the authors examine the role of feedback from primary visual cortex (V1) to the dorsolateral geniculate nucleus of the thalamus (dLGN) under a variety of visual stimulus conditions. This is a well-defined circuit originating from a specific population of Layer 6 cells in the cortex, and the authors test the role of this projection by recording in dLGN during silencing of V1 via ChR2 expression in PV inhibitory cells. This is a well-established technique for strong silencing of cortex. However, because there are other disynaptic pathways from V1 to thalamus, they also perform a similar set of experiments using more targeted optogenetic inhibition of a genetically-defined class of Layer 6 (NTSR1) cells that make up most of the L6 corticothalamic projections. The fact that these experiments elicit similar results supports their interpretation that these direct projections are largely responsible for the observed results. While previous studies have manipulated corticothalamic projections pharmacologically, via V1 lesions, or via optogenetics, the authors rightly point out that most previous studies have focused on simple parametric stimuli and/or have been performed in anesthetized animals. The results of this study suggest feedback during natural visual stimuli and locomotion reveal effects that are distinct from these previous studies.

      Overall, these are important and carefully-performed experiments that significantly advance our understanding of the role of corticothalamic feedback to the dLGN.

      We thank the reviewer for the appreciation of our methods and results.

      The authors' suggestion that the different effects observed during simple and complex stimuli may be due to increased surround suppression during the full-field gratings seems reasonable, but I didn’t understand how the analysis of blank periods during these two conditions supported this argument. It wasn’t clear to me what mechanisms would be expected to support the alternative outcome, where suppressing feedback during the blank periods interleaved with the two different stimuli would have different effects, unless they are testing whether natural movies elicit some longer-lasting state change that would change the results observed during blank periods. This seems somewhat implausible, and unless the authors wish to expand the study to include different stimulus sizes, I think the interpretation regarding surround suppression is best left to the discussion, where it is already treated well.

      We thank the reviewer for the recommendation. We fully agree that explaining the difference in CT feedback across blanks, gratings, and movies will require more experiments. We have followed the recommendation of the reviewer and removed the interpretation related to differences in surround suppression from the results section and treat it now in the discussion only.

      The paper would benefit from more clearly highlighting results that agree or disagree with previous studies, with a brief mention of how the authors interpret these similarities or differences. For example the results of Olsen et al 2012 seem to be consistent with what the authors observe here with gratings but not with natural movies, and although Olsen et al performed some awake recordings, I think the LGN recordings were all under anesthesia. Specifically highlighting these differences (and suggesting an interpretation for them) would help emphasize the novelty of the study.

      We thank the reviewer for the recommendation and now highlight throughout the results and discussion where our results agree or disagree with previous studies. As mentioned by the reviewer, we have similar results for gratings to the results obtained by Olsen et al. (2012), although in our study we have not explicitly centered the full-field gratings on the RFs and we have not measured surround suppression. The results for the blank stimuli and the movies, however, are different, at least in terms of how CT feedback affects firing rate. A key insight of our study, at least in our view, is that CT feedback effects might well differ for different stimuli, and understanding the underlying mechanism (e.g., differential engagement of the excitatory and indirect inhibitory CT feedback pathway) will be an important avenue of research in the future.

      The authors should comment more on the spatial extent of V1 silencing and potential effects of the variability observed across mice, especially given that they appear to have made only a single injection of ChR2 to label PV cells. While silencing with this method extends beyond the injection site, it probably doesn’t cover all of V1. Was any analysis done of variability across mice based on the size or location of the ChR2 expression measured post-hoc?

      Unfortunately, we did not preserve enough slices to precisely quantify the extent of expression across animals. However, visual inspection of the slices revealed that even a single injection typically resulted in a widespread pattern of expression. In fact, we think that the spatial extent of PV neuron activation was determined not so much by the virus expression as by the photoactivation light. With a distance of 0.5 0.1 mm of the optical fibre from the cortical surface, most of V1 was covered by light. A previous study performing a quantitative characterization of the lateral spread of optogenetic suppression by PV activation demonstrates that pyramidal neuron firing can be suppressed 2 3 mm from the laser center (Li et al., 2019). Hence, we think that variability in opsin expression across mice is unlikely to have a substantial impact on our results.

      The decrease in reliability and sparseness during running is attributed partially to increased eye movements. In cortex this has been studied in awake animals with natural movies in a variety of studies where the opposite effects are observed including Froudarakis et al 2014 where there was a small increase in both metrics during running, and Reimer et al 2014 where reliability strongly increased during pupil dilation. If there is enough data to condition on running periods where eye movements are stable or dilation outside of running to measure the effects of feedback suppression during these periods, this would be useful information.

      We thank the reviewer for bringing up this interesting issue. We fully agree that our results recorded in dLGN are different from those measured by Froudarakis et al. (2014) and Reimer et al. (2014) in V1.

      As suggested by the reviewer, we have repeated the analysis proposed by Reimer et al. (2014) to identify periods in the movie with the most rapid pupil dilation / constriction in the face of continuous changes in overall luminance. Besides the effects of pupil dilation / constriction on firing rate, we have computed reliability both according to what we had used throughout our manuscript and in the way proposed in Reimer et al. (2014), which resembles our measure of SNR. We find that both measures of reliability are unaffected by pupil dilation.

      Interestingly, in the meantime other studies have also reported that reliability might be differently affected by behavioral state in V1 compared to dLGN. For instance, Nestvogel and McCormick (2022) found that, consistent with our results, variability of membrane potential in visual thalamic neurons was not significantly altered by locomotion or whisker movement.

      Reviewer #2 (Public Review):

      Spacek et al. study the corticothalamic feedback of different visual stimuli on visual thalamus. With optogenetic suppression of visual cortex feedback and simultaneous multi-channel recordings in visual thalamus, the authors succeeded to acquire important data about this essential feedback loop in awake, behaving animals. The authors show in detail that the cortical feedback acts as a gain factor in thalamus for the transmission of signals from retina to cortex. They also show that naturalistic scenes result in robust feedback from cortex. As expected from anatomy, the authors find that modulatory feedback from cortex and modulatory input from brain stem act rather independently on thalamus. The paper is technically very impressive and the results are important for a wide range of readers.

      We thank the reviewer for the positive feedback.

      It is advisable to revise the Introduction and Discussion to better integrate the new findings into the existing literature.

      We thank the reviewer for this advice, and have revised the title, abstract, introduction and discussion to better integrate our new findings into the existing literature, and highlight our advances in relation to previous findings.

      The authors distinguish between awake, resting state and running state. However, the awake, resting state in mice comprises a wide range of alertness levels. This range of alertness will most likely affect the bursting probability of thalamocortical neurons.

      We thank the reviewer for this comment. So far, our manuscript had only taken locomotion as a proxy for behavioral state, as locomotion typically goes along with increased pupil size (Erisken et al., 2014; McGinley et al., 2015) and increased levels of arousal (McGinley et al., 2015; Vinck et al., 2015). To also study the effects of locomotion-independent arousal, we have now applied the analysis mentioned by the reviewer: following methods originally suggested by Reimer et al. (2014), we identified periods of the movie presentation without locomotion that corresponded to the upper or the lower quartile of pupil size change. Similar to the results that Reimer et al. (2014) found for primary visual cortex, we observed that firing rate in dLGN is enhanced during times when the pupil was dilating faster than usual vs. when it was constricting faster than usual. Like the effects of running, the modulations by pupil-indexed arousal persisted even with V1 suppression. We present these new results in Figure 5 - Supplement 2.
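
      The quartile split described above can be sketched as follows. This is only an illustration under stated assumptions, not the analysis code used in the study: the function name `split_by_pupil_change`, the per-sample input layout, and the use of the default quartile estimator in Python's `statistics` module are all hypothetical choices.

```python
from statistics import quantiles


def split_by_pupil_change(pupil_change, running, firing_rate):
    """Group firing rates by pupil-change quartile, using stationary samples only.

    All three arguments are equal-length sequences with one entry per time bin
    (a hypothetical layout chosen for this sketch):
      pupil_change -- rate of change of pupil size (positive = dilating)
      running      -- True where the animal was locomoting
      firing_rate  -- firing rate in that bin
    Returns (dilating, constricting): firing rates from the upper and the
    lower quartile of pupil size change, respectively.
    """
    # Discard every bin recorded during locomotion.
    still = [(dp, fr)
             for dp, run, fr in zip(pupil_change, running, firing_rate)
             if not run]
    # Quartile cut points of pupil change among the stationary bins.
    q1, _median, q3 = quantiles([dp for dp, _ in still], n=4)
    dilating = [fr for dp, fr in still if dp >= q3]
    constricting = [fr for dp, fr in still if dp <= q1]
    return dilating, constricting
```

      Comparing the two returned groups, with and without V1 suppression, would then correspond to the comparison summarized in the response.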

    1. Author Response

      Reviewer #2 (Public Review):

      The visual system must extract two basic features of visual stimuli: luminance, which we perceive as brightness, and contrast, the change in luminance over space or time (this paper focuses on changes over time). Contrast is separately processed by ON and OFF pathways, which encode luminance increments or decrements, respectively. Contrast must be robustly detected even if the overall luminance changes rapidly, as might occur if an animal is moving in and out of shadows. This paper addresses how such a luminance correction occurs in the fly.
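
      To make the distinction between luminance and contrast concrete, here is a minimal sketch using Weber contrast, one common definition. That this is the definition used in the manuscript is an assumption, and the function name `weber_contrast` is hypothetical.

```python
def weber_contrast(stimulus, background):
    """Weber contrast: luminance change relative to the background luminance.

    Positive values correspond to increments (ON), negative values to
    decrements (OFF).
    """
    if background <= 0:
        raise ValueError("background luminance must be positive")
    return (stimulus - background) / background


# The same 100% contrast step at two background luminances: the absolute
# luminance change differs 100-fold, but the contrast is identical.
dim = weber_contrast(2.0, 1.0)
bright = weber_contrast(200.0, 100.0)
```

      Under this definition, a contrast-driven behavior that is to stay stable across backgrounds cannot depend on the absolute luminance change alone, which is the problem the review refers to.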

      In the fly, three types of first-order interneurons - L1, L2, and L3 - transmit information from photoreceptors to the medulla, where ON and OFF encoding emerges. Previous work suggested that all three interneurons primarily encode contrast signals and that they project to distinct pathways: L1 to the ON pathway and L2 and L3 to the OFF pathway. Ketkar et al. show that, contrary to this model, these interneurons encode both contrast and luminance in specific ways and are not cleanly segregated into ON versus OFF inputs.

      This study reveals several new insights into early visual processing that are interesting and well-supported by the data:

      1) The authors show that behavioral responses to ON stimuli can compensate for rapid changes in luminance. However, the purported sole input to the ON pathway, L1, shows activity that is highly dependent on luminance. This suggests that a luminance correction must arise downstream of L1. These results are analogous to findings previously made by the same group regarding the OFF pathway (Ketkar et al., 2020). The previous paper showed that L2 provides contrast information to the OFF pathway, and L3 provides luminance information to allow for a luminance correction in downstream contrast encoding. But unlike the multiple inputs to the OFF pathway, the ON pathway was thought to only receive input from L1, provoking the question of whether L1 is able to provide both contrast and luminance information.

      2) Using well-designed calcium imaging studies, the authors surveyed the responses of the three interneurons and found that they encode different stimulus features: L1 encodes both contrast and luminance, L2 purely encodes contrast, and L3 purely encodes luminance (with a different dependence than L1). These are interesting and important findings revealing how both contrast and luminance encoding are distributed across the three interneurons.

      3) Using neuronal manipulations, the authors dissected the contributions of the three interneurons to ON and OFF behavior under changing luminance. These experiments showed that L1 and L3 are required for the luminance correction in the behavior. Moreover, the finding that all three interneurons contribute to both ON and OFF behavior contrasts with the existing model of segregated pathways. Thus, this paper could change the way we think about early visual processing in the fly: rather than relaying similar information to distinct downstream pathways, first-order interneurons relay distinct information to common pathways.

      Overall, the major claims of this paper are important and supported by the experiments. There are just a few concerns that I would note:

      Thank you for the overall positive evaluation of our work, as well as for the constructive criticism, which we are going to address below.

      1) The authors state that they have shown luminance invariance in ON behavior (e.g. line 376-377 of the Discussion), but this is not entirely accurate: the ON behavior decreases as luminance increases. This is still an interesting effect since it's the opposite of what L1 activity does, so it's clear that the circuit is implementing a luminance correction, but it is not "luminance invariance".

      As pointed out in response to essential comment #2, we carefully edited the manuscript to talk about ‘near’ luminance invariance, or data approaching luminance invariance. More prominently, we rephrased the text to highlight the need for a luminance gain to scale behavioral responses to contrast, even if the resulting behavior is not entirely luminance invariant.

      2) The visual stimuli presented for most imaging experiments (full-field) are not the same as those presented for behavior (moving edges). It is possible neuronal responses and their encoding of luminance and contrast may differ if tested with the moving edge stimuli (if so, this would be concerning). The authors did image L1 with both types of stimuli and could compare these responses. Also, testing behavior at 34º and imaging at 20º presents a possible discrepancy in comparing these data.

      We use moving ON edges in Figure 1, and these data suggest that the transient response of L1 scales with step changes in luminance, consistent with data in Figure 2B. Although we did not point this out in the paper, the L1 responses in Figure 1 also decay to different response levels, consistent with the luminance-sensitive component that static stimuli reveal in Figure 2. Furthermore, for other ongoing projects in the lab, we have, for example, measured physiological responses in L2 with the same stimuli used in behavior, and there is no discrepancy with the data reported here. Overall, given the vast literature on Drosophila and other flies, there is no reason to believe that LMCs would respond any differently to moving vs. static stimuli.

      We can additionally point out that the behavioral data of L3 silencing (at 34ºC) nicely correlate with physiological contrast responses of L1 and L2 (at 20ºC, predicted from electrophysiological recordings for LMCs in Ketkar et al. 2020, measured for L1 here). Many previous studies, for example in motion detection, have linked data from physiological recordings at room temperature with behavioral experiments done at higher temperature (e.g., Ammer et al., 2015; Clark et al., 2011; Creamer et al., 2019; Fisher et al., 2015; Leonhardt et al., 2017; Salazar-Gatzimas et al., 2016; Serbe et al., 2016; Silies et al., 2013; Strother et al., 2017). We therefore do not think that these are major concerns.

      3) I find it puzzling that silencing L1 has little effect on ON behavior at 100% contrast and varying luminance (Figure 3A), but severely affects ON behavior to 100% contrast (and lower values) when different contrasts are interleaved (Figure S1). The authors note this but do not provide a clear explanation of why this might be the case. Aside from mechanism, it is not clear whether the difference is due to varying luminance in the first experiment or varying contrast in the second one (e.g. they could test 100% contrast without varying luminance).

      The two stimulus sets used here do not allow us to pinpoint why the L1 silencing phenotype differs between them, since they comprise more than one difference as discussed above (see point 4) in “Essential Revisions”). We now include two additional experiments that dissect the role of different stimulus parameters (Supp. Figure 2). To understand whether the difference is due to varying luminance, we tested responses to ON edges of fixed (100%) contrast and luminance at the same stimulus parameters (motion duration, speed) as used in Figure 3, and did not find reduced turning responses when silencing L1. Thus, varying luminance does not change the effect of L1 on ON behavior. However, when repeating this experiment with a bright inter-stimulus interval, L1 silencing led to a strong response deficit. Therefore, differences in the interval luminance explain the differences in the L1 silencing phenotype observed not only in this study but also across studies. Although we hypothesize a role of contrast adaptation that may function differently with altered contrast statistics, a more detailed investigation would be necessary to understand the mechanism. Nevertheless, our experiments allow us to conclude that L1 is not the sole major input to the ON pathway, even though it is required under certain stimulus conditions.

      4) I do not entirely agree with the authors' interpretation of the L1 ort rescue experiment for OFF behavior. They state that rescue flies "responded similarly to positive controls". However, the graph shows that the rescue flies generally fall in between the mutant and heterozygote control flies; they resemble the controls at low luminance but resemble the mutants at high luminance. One may conclude that L1 is sufficient to enhance OFF behavior at low luminance, but it is a stretch to say it's a complete rescue.

      Sorry, we just meant to say that they “responded similarly to positive controls (...) at low luminance”, but the sentence was badly written. We corrected this to: “L1 ort rescue flies responded similarly to positive controls at low luminances, rescuing responses to OFF edges at dim backgrounds.”

      5) The authors typically use t-tests to analyze experiments with 2 variables (genotype and luminance) and 3 or more conditions per variable. This is not the most appropriate statistical test; typically one would use a two-way ANOVA. At the least, it should be clear whether they are performing corrections for multiple comparisons if performing many t-tests on the same dataset.

      Thank you for the suggestion. We now use a two-way ANOVA followed by corrected pairwise comparisons and state this clearly in the figure captions (also addressed above in essential comment #5).
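
      As an illustration of what correcting pairwise comparisons involves, the sketch below applies the Holm-Bonferroni step-down adjustment to a list of p-values. The response does not say which correction was used, so the choice of Holm here is an assumption, and the function name `holm_adjust` is hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values, in the input order.

    The k-th smallest of m p-values (k = 0, 1, ...) is multiplied by (m - k),
    capped at 1, and adjusted values are forced to be non-decreasing along
    the sorted order.
    """
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

      A comparison is then called significant when its adjusted p-value falls below the chosen threshold, which keeps the family-wise error rate across all pairwise tests on the same dataset at that threshold.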

      Reviewer #3 (Public Review):

      Ketkar et al combine calcium imaging and behavioral experiments to investigate the encoding of luminance and contrast in 3 first-order interneurons in the Drosophila lamina: L1, L2, and L3, as well as the role of these signals in moving ON edge behavior across luminance. The behavioral experiments are well performed. The rescue experiments are particularly interesting. Together with silencing they support and nicely extend previous work showing that L1/2/3 are not simply segregated between ON and OFF pathways. My main issue is the link that the authors make between the cellular responses and the behaviors performed and therefore the overall conclusions and claims of the paper about the roles of contrast vs luminance encoding of each neuron type (particularly L1) in the behaviors.

      Major concerns:

      1) The authors state that the main behavior they study, namely optomotor response to moving light edges at 100% contrast, is "luminance invariant". A strict definition of this would be that behavioral responses are constant with increasing luminance. However, there are very few plots in this paper where this is the case. In almost all examples, the response is decreasing with respect to increasing luminance. The authors do qualify a "nearly" invariant behavior, but this does not change the fact that interpretation of the data in the context of the framing of the paper is often problematic.

      We thank the reviewer for this critical comment. The main point (that we apparently failed to make clear enough) is that there is a clear requirement for a luminance gain. Physiological LMC responses measured using calcium imaging to ON stimuli in Figure 1, or predicted from previous electrophysiological recordings to OFF stimuli in (Ketkar et al., 2020) cannot account for any of the (control) behavioral data. We now edited the text to tone down statements about luminance invariance, and instead highlighted the need for a luminance gain.

      2) The manuscript would benefit from clear definitions of luminance and contrast, as well as an explanation of how contrast and luminance sensitivity can be inferred from experiments. In particular, the authors use transient vs. sustained response properties in L1, L2, and L3 as indicators of contrast and luminance sensitivity, but this is not stated clearly. It would be important to explain this to the reader early on.

      We now added definitions of general terms to the introduction and added data and analysis to the manuscript (Figure S1, and Figure 2B-D) to more clearly test which component of the neurons’ responses encode contrast or luminance.

      3) In the manuscript, it is often stated that "calcium imaging experiments reveal that each first order interneuron is unique in its contrast and luminance encoding properties" (line 110). This was shown clearly for L2 and L3 in their previous work in Ketkar et al. 2020, with a well-designed two-step stimulus that was able to tease apart contrast vs. luminance invariance. Unfortunately it does not seem that this level of experimental detail and analysis is applied to L1 here. In particular, the authors state "L1 encodes both contrast and luminance in distinct response components" (line 112) in the summary of their findings. I would not agree that the authors have actually shown this properly in this manuscript.

      Addressed above, in point 6 of “Essential Revisions”

      4) The results, as they are stated, are at times not well supported by the data. The manuscript would benefit from a careful assessment of the accuracy and precision of the language used to interpret the data. Sometimes just moving some conclusions to the discussion and explaining the assumptions made to reach a particular conclusion would be enough. A few examples:

      We carefully edited the entire manuscript, in addition to addressing the specific points below.

      o Figure 2: "Lamina neuron types L1-L3 are differently sensitive to contrast and luminance". It is overall true that, from the raw traces, the responses are different. However, the quantification in C-E only pertains to luminance.

      As stated above, we now did further analysis on the contrast encoding properties of L1 and L2 and pointed out the major differences between these neurons (Figure 2B-D).

      o Figure 3: "L1 is not required but sufficient for ON behavior across luminance". The data convincingly shows this. I would however point out that the statement "this data [..] highlights its behavioral relevant role of its luminance component" line 231 is an overstatement.

      We deleted this statement at the end of the paragraph.

      o Figure 6: "L1 luminance signal is required and sufficient for OFF behavior" the data presented shows convincingly that when L1 is inactive the behavior becomes (more) intensity variant. However, it does not show that it is the "luminance signal" in L1 that is required for this effect. In general, because L1 has a sustained and a transient response, it is difficult to strictly implicate one or the other in supporting any behavior, short of manipulating L1 to make it fully transient or fully sustained.

      We agree. The figure title now reads “L1 function is required and sufficient for OFF behavior”.

      o It is often not clear which conclusions stem from this work and which from their previous work Ketkar et al. 2020, or even other previous work on contrast sensitivity in particular. Clarifying this might help with my concern about statements not well supported by the data in this paper, and also justify their overall novelty. In general the manuscript assumes familiarity with this previous work, which is not always helpful for the reader.

      As stated above, we now more clearly separate previous findings from novel findings in the abstract, and throughout the text. We also expanded the introduction to better explain the core concepts that are needed to understand this work, without having read Ketkar et al. 2020.

    1. Reviewer #3 (Public Review):

      I think the framing could be improved to better reflect the contribution of the work. From the abstract, for example, it's unclear to me what the authors think is the most meaningful conclusion. Is it the observations about the finer details of TF regulation (bursting dynamics), the fact that Bcd is probably the sole source of "positional information" for hb-p2, that Bcd exists in active/inactive form, or the fact that an equilibrium model probably suffices to explain what we observe? The first sentence itself seems to suggest this paper will discuss "dynamic positional information", in which case it's somewhat misleading to say this kind of work is "largely unexplored"; Johannes Jaeger in particular has been a strong proponent of this view since at least 2004. On that note some particularly relevant recent papers in the Drosophila early embryo include:

      1) Jaeger and Verd (2020) Curr Topics Dev Biol
      2) Verd et al. (2017) PLoS Comp Biol
      3) Huang, Amourda, et al. and Saunders (2017) eLife
      4) Yang, Zhu, et al. (2020) eLife [see also the second half of Perkins (2021) PLoS Comp Biol for further discussion of that model]

      Some reviews from James Briscoe also discuss this perspective.

      I would also recommend modifying the title to reflect the biology found in the new results.

      A major point that the authors should address is the design of the synthetic constructs. From table S1, the sites are often very closely linked (4-7 base pairs). From the footprint of these proteins, we know they can cover DNA across this size (see https://pubmed.ncbi.nlm.nih.gov/8620846/). As such, there may be direct competition/steric hindrance (see https://pubmed.ncbi.nlm.nih.gov/28052257/). What impact does this have on their interpretations? Note also that the native enhancer has spaced sites with variable identities.

    1. Author Response:

      Evaluation Summary:

      This paper will be of interest to researchers who perform single-molecule fluorescence imaging experiments as well as those who want to include machine learning in their data analyses. The authors have developed a machine learning algorithm that addresses some of the data analysis challenges in the field of single-molecule fluorescence imaging. The methods are rigorously benchmarked using simulated data and tested using real data. There are some concerns whether Tapqir is general enough for use by the broader community of single-molecule fluorescence researchers.

      We thank the reviewers for their thorough review of the manuscript. In response to the reviewer comments, we posted to bioRxiv a revised manuscript with new data and edits to text. Concerns about generality are addressed in the revised manuscript and in the responses to specific reviewer comments below.

      Reviewer #1 (Public Review):

      "Bayesian machine learning analysis of single-molecule fluorescence colocalization images" by Ordabayev, et al. reports the development, benchmarking, and testing of a Bayesian machine learning-based method, which the authors name Tapqir, for analyzing single-molecule fluorescence colocalization data. Unlike currently available, more conventional analysis methods, Tapqir attempts to holistically model the microscopy images that are recorded during a colocalization experiment. Tapir uses a physics-based, global model with parameters describing all of the features of the experiment that are expected to contribute to the recorded microscopy images, including shot noise of the spots and background, camera noise, size and shape of the spots, and specific- and non-specific binders. Based on benchmarking on simulated data with widely varying properties (e.g., signal-to-noise; amounts, rates, and locations of specific and non-specific binders; etc.), Tapqir generally does as well and, in some cases, better than currently existing methods. The authors also test Tapqir on real microscopy images with similarly varying properties from studies that have been previously published by their research group and demonstrate that their Tapqir-based analysis is able to faithfully reproduce the previously published results, which were obtained using the more conventional analysis methods available at the time the data were originally published. This is a well-designed and executed study, Tapqir represents a conceptual and practical advance in the analysis of single-molecule fluorescence colocalization experiments, and its performance has been comprehensively and rigorously benchmarked on simulated data and tested on real data. The conclusions of this study are well supported by the data, but some of the limitations of the method need to be clarified and discussed in more depth, as outlined below.

      1. Given that the AOI is centered at the target molecule and there is a strong prior for the binder also being located at the center of the AOI, the performance of Tapqir is dependent on several variables of the microscopy/optical system (e.g., the microscope point-spread function, magnification, accurate alignment of target and binder imaging channels, accurate drift correction, etc.). Although this caveat is mentioned and some of these factors are listed in the main text of the manuscript, the authors could have expanded this discussion in order to clarify the extent to which the performance of Tapqir depends on these factors.

      We added relevant new data to the revised manuscript in Table 5. The question about alignment accuracy is now discussed in the Materials and Methods:

      “Tests on data simulated with increasing proximity parameter values σxy (true) (i.e., with decreasing precision of spatial mapping between the binder and target image channels) confirm that the cosmos model accurately learns σxy (fit) from the data (Figure 3–Figure Supplement 3D; Table 5). This was the case even if we substituted a less-informative σxy prior (Uniform vs. Exponential; Table 5).

      The CoSMoS technique is premised on colocalization of the binder spots with the known location of the target molecule. Consequently, for any CoSMoS analysis method, classification accuracy will in general decline when the images in the target and binder channels are less accurately mapped. However, for the Tapqir cosmos model, low mapping precision has little effect on classification accuracy at typical non-specific binding densities (λ = 0.15; see MCC values in Table 5).”

      The more general point about priors is now addressed in the Materials and Methods as follows:

      “All simulated and experimental data sets in this work were analyzed using the prior distributions and hyperparameter values given above, which are compatible with a broad range of experimental conditions (Table 1). Many of the priors are uninformative and we anticipate that these will work well with images taken on a variety of microscope hardware. However, it is possible that highly atypical microscope designs (e.g., those with effective magnifications that are sub-optimal for CoSMoS) might require adjustment of some fixed hyperparameters and distributions (those in Eqs. 6a, 6b, 11, 12, 13, 15, and 16). For example, if the microscope point spread function is more than 2 pixels wide, it may be necessary to increase the range of the w prior in Eq. 13. The Tapqir documentation (https://tapqir.readthedocs.io/en/stable/) gives instructions for changing the hyperparameters.”

      2. The Tapqir model has many parameters, each with its own prior. The majority of these priors are designed to be uninformative and/or weak and the only very strong prior is the probability that a specific binder is located at or very near the center of the AOI. The authors could have tested and commented on how the strength of the prior on the location of a specific binder affects the performance of Tapqir.

      The revised manuscript includes new data on and expanded discussion of this point. In our model, the position of a target-specific spot relative to the target position has a prior distribution illustrated as the green curve in Figure 2-Figure supplement 2. Importantly, the peak in this distribution does not have an a priori set width. Instead, the width of the peak is a model hyperparameter, σxy, that is learned from the image data set without user intervention. To make sure that this point is understood, we expanded and clarified the relevant Methods section and modified the legend of Figure 2-Figure supplement 2.

      To address the reviewers’ specific question, we constructed simulated data sets with different mapping precision values and analyzed them; the results are presented in the (new) Table 5 and discussed:

      “The CoSMoS technique is premised on colocalization of the binder spots with the known location of the target molecule. Consequently, for any analysis method, classification accuracy declines when the images in the target and binder channels are less accurately mapped. For the Tapqir cosmos model, low mapping precision has little effect on classification accuracy at typical non-specific binding densities (λ = 0.15; see MCC values in Table 5).”
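The responses above report classification accuracy as MCC (Matthews correlation coefficient) values. As a reminder of what that number measures, here is a minimal sketch of how an MCC could be computed from ground-truth and predicted spot labels in a simulated data set. The function names and the tiny label lists are hypothetical illustrations and are not part of Tapqir.

```python
import math


def confusion_counts(truth, predicted):
    """Count true/false positives and negatives for binary labels (1 = specific spot)."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical example: four frames, one specific spot missed by the classifier.
truth = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
mcc = matthews_corrcoef(*confusion_counts(truth, predicted))
```

An MCC of 1 corresponds to perfect classification, 0 to chance-level agreement, and -1 to complete disagreement, which is why it is a convenient single summary when specific spots are much rarer than spot-free frames.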

      3. Given the priors and variational parameters they report, the authors show that Tapqir performs robustly and seems to require no experiment-to-experiment optimization. This is expected to be the case for the simulated data, since they were simulated using the same model that Tapqir uses to perform the analysis. With regard to the real data, however, it is quite likely that this is due to the fact that the analyzed data all come from the same laboratory and, therefore, likely the same microscope(s). It would therefore have been very useful if the authors had listed and discussed which microscope settings, experimental conditions, and/or other considerations, beyond those described in point 1 above, would result in a need for re-optimization of the priors and/or variational parameters.

      As noted above, we now address this point in the Materials and Methods as follows:

      “All simulated and experimental data sets in this work were analyzed using the prior distributions and hyperparameter values given above, which are compatible with a broad range of experimental conditions (Table 1). Many of the priors are uninformative and we anticipate that these will work well with images taken on a variety of microscope hardware. However, it is possible that highly atypical microscope designs (e.g., those with effective magnifications that are sub-optimal for CoSMoS) might require adjustment of some fixed hyperparameters and distributions (those in Eqs. 6a, 6b, 11, 12, 13, 15, and 16). For example, if the microscope point spread function is more than 2 pixels wide, it may be necessary to increase the range of the w prior in Eq. 13. The Tapqir documentation (https://tapqir.readthedocs.io/en/stable/) gives instructions for changing the hyperparameters.”

      4. Based on analysis of the simulated data shown in Figure 5, where the ground truth is known, the use of Tapqir to infer kinetics is less accurate than the use of Tapqir to infer equilibrium binding constants. The authors do a great job of discussing possible reasons for this. In the case of the real data analyzed in Figure 6 and in Figure 6 - Figure Supplements 1 and 2, the kinetic results obtained using Tapqir have different means and generally larger error bars than those obtained using Spot-Picker. To more comprehensively assess the performance of Tapqir versus Spot-Picker, the authors could have used the association and dissociation rates to calculate the corresponding equilibrium binding constants and then compared these kinetically calculated equilibrium binding constants to the population-calculated equilibrium binding constants that the authors calculate and report in the bottom plot in Panel D of Figure 6 and Figure 6 - Figure Supplements 1 and 2. This would provide some information on the accuracy of the kinetics in that the closer the kinetically and population-calculated equilibrium binding constants are to each other, the more accurately the kinetics have been estimated. Performing this type of analysis for the kinetics obtained using Tapqir and Spot-Picker would have allowed a more comprehensive comparison of the two methods.

      This comment seems to reflect a misunderstanding. Fig. 6 and its figure supplements do not report any dissociation kinetics or binding equilibrium constants. Instead, they report ka (pseudo first-order target-specific association rate constant), kns (pseudo first-order target non-specific association rate constant), and Af (the active fraction, i.e., the fraction of target molecules capable of association with binder). ka and Af values from the two methods agree within experimental uncertainty for all four data sets analyzed. kns values differ, but as we point out:

      “We noted some differences between the two methods in the non-specific association rate constants kns. Differences are expected because these parameters are defined differently in the different non-specific binding models used in Tapqir and spot-picker (see Materials and Methods).”

      (There is additional discussion of this point in Materials and Methods). The reviewer is correct that the estimated uncertainties (i.e., error bars in panels D) in ka and Af are generally larger for Tapqir than for spot-picker. This is expected, for the reasons that we explain:

      “In general, previous approaches in essence assume that spot classifications are correct, and thus the uncertainties in the derived molecular properties (e.g., equilibrium constants) are systematically underestimated because the errors in spot classification, which can be large, are not accounted for. By performing a probabilistic spot classification, Tapqir enables reliable inference of molecular properties, such as thermodynamic and kinetic parameters, and allows statistically well-justified estimation of parameter uncertainties. This more inclusive error estimation likely accounts for the generally larger kinetic parameter error bars obtained from Tapqir compared to those from the existing spot-picker analysis method (Figure 6, Figure 6–Figure Supplement 1, Figure 6–Figure Supplement 2, and Figure 6–Figure Supplement 3). ”

      Reviewer #2 (Public Review):

      The work by Ordabayev et al. details a Bayesian inference-based data analysis method for colocalization single molecule spectroscopy (CoSMoS) experiments used to investigate biochemical and biophysical mechanisms. By using this probabilistic framework, their method is able to quantify the colocalization probabilities for individual molecules while accounting for the uncertainty in individual binding events, and accounting for camera and optical noise and even non-specific binding. The software implementation of this method, called Tapqir, uses a Python-based probabilistic programming language (PPL) called pyro to automate and speed up the optimization of a variational Bayes approximation to the posterior probability distribution. Overall, Tapqir is a powerful new way to analyze CoSMoS data.

      Tapqir works by analyzing small regions (14x14 pixels) of fluorescence microscopy images surrounding previously identified areas of interest (AOI). The collection of images of these AOIs through time is then analyzed collectively using a probabilistic model that accounts for each time frame of each AOI and is able to determine whether up to K "binders" (K=2 here) are present and which of them is specifically bound. This approach of directly modeling the contents of the image data is relatively novel, and few other examples exist. The details of the probabilistic model used incorporate an impressive amount of physical insight (e.g., camera gain) without overparameterization.

      We thank the reviewer for these positive comments.

      The gamma-distributed noise model used in Tapqir captures quite a lot of physics and, given the analyses in Figs. 3-6, clearly works, but might be limited to certain types of cameras used in fluorescence microscopy (e.g., EMCCDs). For instance, sCMOS cameras have pixel-dependent amplification and noise profiles, rather than a single gain parameter, and are sometimes approximately modeled as normal distributions with both mean and variance having an intensity-dependent and independent contribution that is different for each pixel on the camera. It is unclear how Tapqir performs on different cameras.

      In the revised manuscript, we expanded the discussion of the Image likelihood component of our model to emphasize that 1) all data sets we analyze are experimental or simulated EMCCD images, 2) sCMOS images have the different noise characteristics alluded to by the reviewer, and 3) optimal sCMOS image analysis might require a modified model, possibly including the ability to use per-pixel calibration data as a prior as was done in super-resolution work (now cited) that uses sCMOS data.

      sCMOS cameras have in recent years become very popular for some kinds of single-molecule imaging (e.g., PALM/STORM or live-cell single-particle tracking). However, for the low-background/low-signal in vitro single-molecule TIRF that is our target application for the approach described in the manuscript, EMCCD is still preferable over sCMOS for many, but not all, imaging conditions (see https://andor.oxinst.com/learning/view/article/what-is-the-best-detector-for-single-molecule-studies). Thus, we think there will be plenty of interest in the approach we describe in the manuscript even if (which is not certain) the program functions better with EMCCD than with sCMOS images.

      Going forward to develop and test an sCMOS-targeted version of the model, as we have done for EMCCD, will require revised model and code, but will also necessitate accurately simulating sCMOS CoSMoS images, obtaining experimental sCMOS CoSMoS images reflecting a broad range of realistic experimental conditions, and using the simulated and experimental images to test the new model. These may well be useful things to do in the future but would be a considerable step beyond the scope of the present manuscript.

      The variational Bayes solution used by Tapqir provides many computational benefits, such as numerical tractability using pyro and speed. It is possible that the exact posterior, e.g., as obtained using a Markov chain Monte Carlo method, would be insignificantly different with the amount of data typical for CoSMoS experiments; however, this difference is not explored in the current work.

      We agree. However, since we have not done any analyses using MCMC, there is nothing in particular that we can say about it in the context of CoSMoS data analysis. Implementation of an MCMC approach using our model will be easier in the future because the Pyro developers are currently working to optimize the implementations of MCMC methods in their software.

      The intrinsic use of prior probability distributions in any Bayesian inference algorithm is extremely powerful, and in Tapqir offers the opportunity to "chain together" subsequent analyses by using the marginalized posteriors from one experiment as the basis for the priors for subsequent experiments (e.g., in \sigma^{xy}) for extremely high accuracy inference. While the manuscript discusses setting and leveraging the power of priors, it does not explore the power of such "chaining" and the positive effects upon accuracy.

      Chaining is beneficial in principle. However, in practice it will help significantly only if the uncertainty in the posterior parameter values from the non-chained analysis is larger than the experiment-to-experiment variability in the “true” parameter values. For σxy we obtain very narrow credible intervals without chaining (Table 1). In our judgement, these are unlikely to be made more accurate by using prior information from another experiment where such factors as microscope focus adjustment may be slightly different.
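The argument about chaining can be made concrete with a toy calculation. The sketch below assumes a purely Gaussian setting, which is an illustrative simplification and not the Tapqir model: each experiment on its own determines a parameter with standard deviation `single_sd`, and the true value drifts between experiments with standard deviation `between_sd`. A prior carried over from a previous experiment must then be broadened by the drift before it is combined with the new data. All names here are hypothetical.

```python
import math


def chained_posterior_sd(single_sd, between_sd):
    """Posterior SD for experiment 2 when experiment 1's posterior is used as a prior.

    Gaussian toy model (an assumption made for illustration):
      - each experiment alone yields uncertainty `single_sd`;
      - the true parameter shifts between experiments with SD `between_sd`,
        so the carried-over prior has variance single_sd**2 + between_sd**2.
    Precisions (inverse variances) of prior and likelihood add.
    """
    prior_var = single_sd ** 2 + between_sd ** 2
    precision = 1.0 / prior_var + 1.0 / single_sd ** 2
    return 1.0 / math.sqrt(precision)


# No experiment-to-experiment drift: chaining shrinks the SD by a factor of sqrt(2).
no_drift = chained_posterior_sd(single_sd=1.0, between_sd=0.0)

# Drift ten times larger than the single-experiment uncertainty:
# chaining changes the result by well under 1 %.
large_drift = chained_posterior_sd(single_sd=1.0, between_sd=10.0)
```

Under these assumptions the benefit of chaining disappears once the between-experiment variability dominates the single-experiment uncertainty, which is the situation described for σxy.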

      A significant number of CoSMoS experiments use multiple, distinct color fluorophores to probe the colocalization of different species to the target. The current work focuses only upon analyzing data with a single color-channel. Extensions to multiple independent wavelengths are computationally trivial, given the automated variational inference ability of PPLs such as pyro, and would increase the impact of the work in the field.

      Our current approach can be used to analyze multi-channel data simply by analyzing each channel independently. However, we agree that there would be advantages to joint analysis of multiple wavelength channels (especially if there is crosstalk between channels) and that implementing multi-channel analysis is a logical extension of our study. It is straightforward (though not trivial, in our experience) to implement such multi-wavelength models. However, testing the functioning of candidate models and validating them using simulation and experimental data would require extensive work that in our view goes beyond what is reasonable to include in the present manuscript.

      Tapqir analysis provides time series of the probability of a specific binding event, p(specific), for each target analyzed (c.f., Fig. 5B), and kinetic parameters are extracted from these time series using secondary analyses that are distinct from Tapqir itself.

      The method reported here is well designed, sound, and its utility is well supported by the analyses of simulated and experimental data sets reported here. Tapqir is a cutting-edge image analysis approach, and its proper treatment of the uncertainty inherent to CoSMoS experiments will certainly make an impact upon the analysis of CoSMoS data. However, many of the (necessary) assumptions about the data (e.g., fluorescence microscopy) and desired information (e.g., off-target vs on-target binding) are quite specific to CoSMoS experiments and therefore limit the direct applicability of Tapqir for the analysis of other single-molecule microscopy techniques. With that in mind, the direct Bayesian inference-based analysis of image data, as opposed to integrated time series, as demonstrated here is very powerful, and may encourage and inspire related methods to be developed.

      Our approach is a powerful way to analyze CoSMoS data in part because it is specific to CoSMoS – it is premised on a physics-based model that incorporates known features of CoSMoS experiments. We agree that the general approach could be adapted to other image analysis applications.

      Reviewer #3 (Public Review):

      In this manuscript, the authors seek to improve the reproducibility and eliminate sources of bias in the analysis of single molecule colocalization fluorescence data. These types of data (i.e., CoSMoS data) have been obtained from a number of diverse biological systems and represent unique challenges for data analysis in comparison with smFRET. A key source of bias is what constitutes a binding event and if those events are colocalized or not with a surface-tethered molecule of interest. To solve these issues, the authors propose a Bayesian-based method in which each image is analyzed individually and locally around areas of interest (AOIs) identified from the surface tethered molecules. A strength of the research is that the approach eliminates many sources of bias (i.e., thresholding) in analysis, models realistic image features (noise), can be automated and carried out by novice users "hands-free", and returns a probability score for each event. The performance of the method is superb under a number of conditions and with varying levels of signal-to-noise. The analysis on a GPU is fairly quick (overnight) in comparison with by-hand analysis of the traces, which can take days or longer. Tapqir has the potential to be the go-to software package for analysis of single molecule colocalization data.

      The weaknesses of this work involve concerns about the approach and its usefulness to the single-molecule community at large as well as a lack of information about how users implement and use the Tapqir software. For the first item, there are a number of common scenarios encountered in colocalization analysis that may exclude use of Tapqir including use of CMOS rather than EM-CCD cameras, significant numbers of tethered molecules on the surface that are dark/non-fluorescent, a high density/overlapping of AOIs, and cases where event intensity information is critical (i.e., FRET detection or sequential binding and simultaneous occupancy of multiple fluorescent molecules at the same AOI). In its current form, the use of Tapqir may be limited to only certain scenarios with data acquired by certain types of instruments.

      In the following paragraphs, we address 1) concerns about application to CMOS, 2) dark target molecules, 3) overlapping AOIs, and 4) application to methods (e.g., smFRET) that require extraction of both colocalization and intensity data.

      1) Application to CMOS images.

      In the revised manuscript, we expanded the discussion of the Image likelihood component of our model to emphasize that 1) all data sets we analyze are experimental or simulated EMCCD images, 2) sCMOS images have the different noise characteristics alluded to by the reviewer, and 3) optimal sCMOS image analysis might require a modified model, possibly including the ability to use per-pixel calibration data as a prior as was done in super-resolution work (now cited) that uses sCMOS data.

      sCMOS cameras have in recent years become very popular for some kinds of single-molecule imaging (e.g., PALM/STORM or live-cell single-particle tracking). However, for the low-background/low-signal in vitro single-molecule TIRF that is our target application for the approach described in the manuscript, EMCCD is still preferable over sCMOS for many, but not all, imaging conditions (see https://andor.oxinst.com/learning/view/article/what-is-the-best-detector-for-single-molecule-studies). Thus, we think there will be plenty of interest in the approach we describe in the manuscript even if (which is not certain) the program functions better with EMCCD than with sCMOS images.

      Going forward to develop and test an sCMOS-targeted version of the model, as we have done for EMCCD, will require revised model and code, but will also necessitate accurately simulating sCMOS CoSMoS images, obtaining experimental sCMOS CoSMoS images reflecting a broad range of realistic experimental conditions, and using the simulated and experimental images to test the new model. These may well be useful things to do in the future but would be a considerable step beyond the scope of the present manuscript.

      2) Dark target molecules.

      In their detailed comments, the reviewers suggested a “no target molecules in sample” (NTIS) control instead of the “no fluorescent target molecules in control AOIs” (NFTICA) design that we illustrate in Fig. 1. Both types can be used as a Tapqir control dataset without any modification of the program or model. We have edited the Fig. 1 caption to explain that either type is acceptable. The reviewers are correct that, all else being equal, NTIS may be better if the target molecules are incompletely labeled. However, in practice experimenters usually know the fraction of molecules that are labeled and reduce the fluorescent target molecule surface density to hold the fraction of spots with two or more coincident target molecules (fluorescent or not) below a chosen threshold (typically 1 % or less), negating the possible advantage of NTIS (but at the expense of collecting less data per sample). On the other hand, NFTICA has the practical advantage that it is a control internal to the sample and is thus immune to problems caused by temporal or sample-to-sample variability (e.g., of surface properties).
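The trade-off between surface density and coincident target molecules can be illustrated with a short calculation. The sketch below assumes that the number of target molecules (fluorescent or not) falling within one resolvable spot is Poisson-distributed with mean `mean_per_spot`. That distributional assumption and the function name are ours, introduced only for illustration, and are not taken from the manuscript.

```python
import math


def coincident_fraction(mean_per_spot):
    """Fraction of occupied spots that contain two or more target molecules.

    Assumes a Poisson-distributed number of molecules per spot
    (an illustrative assumption):
      P(>=2 | >=1) = (1 - e^-m - m*e^-m) / (1 - e^-m)
    """
    if mean_per_spot <= 0:
        raise ValueError("mean_per_spot must be positive")
    p_occupied = -math.expm1(-mean_per_spot)  # 1 - e^-m
    p_two_or_more = p_occupied - mean_per_spot * math.exp(-mean_per_spot)
    return p_two_or_more / p_occupied


# A mean of 0.02 molecules per spot keeps coincidences just under 1 %,
# whereas a tenfold higher density gives roughly 10 %.
sparse = coincident_fraction(0.02)
dense = coincident_fraction(0.2)
```

For small densities the coincident fraction is approximately half the mean occupancy, so holding it near 1 % requires a sparse surface, consistent with the remark that the threshold is met at the expense of collecting less data per sample.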

      3) Overlapping AOIs.

      The method does not require non-overlapping AOIs – we used partially overlapping AOIs in the experimental data analyzed in the manuscript. Even though our analysis used larger AOI sizes (and hence, more overlap) than the spot-picker method, there was good agreement in the results, indicating that overlap does not cause any undue problems.

      In the revised manuscript Results section we added the following discussion of the effect of AOI size:

      “Since target-nonspecific spots are built into the cosmos model, there is no need to choose excessively small AOIs in an attempt to exclude non-specific spots from analysis. We found that reducing AOI size (from 14 x 14 to 6 x 6 pixels) did not appreciably affect analysis accuracy on simulated data (Table 2). In analysis of experimental data, smaller AOI sizes caused occasional changes in calculated p(specific) values reflecting apparent missed detection of a few spots (Figure 3–Figure supplement 4). Out of caution, we therefore used 14 x 14 pixel AOIs routinely, even though the larger AOIs somewhat reduced computation speed (Table 2 and Figure 3–Figure Supplement 4).”

      4) Methods requiring extraction of intensity data.

      The cosmos model we describe in the manuscript does not incorporate phenomena where the spot intensity at a single target changes, such as when there is FRET or multiple binders. As we point out in the final paragraph of the Discussion, more elaborate versions of the cosmos model that incorporate these phenomena could be developed. This would entail implementation, optimization, and validation with simulations and real data of the new model, which is beyond the scope of the present manuscript.

      Second, for adoption by non-expert users, information is missing in the main text about practical aspects of using the Tapqir software, including a description of inputs/outputs, the GUI (I believe Tapqir runs at the command line but the output is in a GUI), and whether Tapqir integrates the kinetic modeling or not.

      This information is given in the online Tapqir documentation. The kinetic analysis (as in Fig. 6) is a simple Python script that is run after Tapqir; the instructions for using it are included in the documentation. Tapqir runs can be initiated using either a CLI or GUI. Output can be viewed in Tensorboard, in a Tapqir GUI, and/or passed to a Jupyter notebook or Python script for further analysis, plotting, etc.

      Given that a competing approach has already been published by the Grunwald lab, it would be useful to compare these methods directly in both their accuracy, usefulness of the outputs, and calculation times.

      The reviewer does not explain why comparing with the Grunwald method would be preferable to the comparison with spot-picker that is included in the manuscript. To be sure there is no misunderstanding, the following are the same for the two methods and therefore are not reasons to prefer one or the other of these methods for the comparison in Fig. 6 (see also Discussion):

      1) Like Tapqir, both spot-picker and Grunwald methods analyze 2-D images, not integrated intensities.

      2) Unlike Tapqir, neither spot-picker nor Grunwald is fully objective; both require subjective selection of classification thresholds by the analyst in order to tune the algorithm performance for analysis of a particular dataset.

      3) Neither spot-picker nor Grunwald is a Bayesian method. “Bayesian” in the Grunwald paper title refers to their excellent work on a separate analytical method (described in the same paper) for evaluating the number of binder molecules colocalized with a target spot; this method is not relevant to a comparison with the model presented in our manuscript.

      4) Unlike Tapqir, neither spot-picker nor Grunwald estimates classification probabilities. Instead, they simply assign binary spot/no-spot classifications that do not convey to downstream analyses the extent of uncertainty in each classification.

      5) Neither spot-picker nor Grunwald has been validated previously using simulated image data. Consequently, the validity of image classification has not been established for either.

      The comparison of Fig. 6 and supplements does not claim to and is not intended to show that Tapqir is better than spot-picker for real experimental data; we cannot make such a claim for these or any other methods because we do not know the true kinetic process and rate constants that generated the experimental data. Instead, our comparison uses experimental data sets with a broad range of characteristics (Table 1) to show that Tapqir yields similar association rate constants to those produced by spot-picker even though the former is objective and automatic while the latter requires subjective tuning by an analyst. Our choice to use spot-picker over Grunwald for this comparison was dictated by the fact that among the co-authors we have such an expert in the use of spot-picker, whereas we lack comparable expertise with Grunwald. We have little doubt that Grunwald would also produce results similar to the other methods in the hands of an expert user who is able to subjectively adjust classification parameters.

      Along these lines, the utility of calculating event probability statistics (Fig. 6A) is not well fleshed-out. This is a key distinguishing feature between Tapqir and methods previously published by Grunwald et al. In the case of Tapqir, the probability outputs are not used to their fullest in the determination of kinetic parameters. Rather a subjective probability threshold is chosen for what events to include. This may introduce bias and degrade the objective Tapqir pipeline used to identify these same events.

      This comment reflects a misunderstanding. No probability threshold is used in the kinetic analyses (Figs. 5 and 6). Instead, we make full use of the p(specific) probability output using the posterior sampling strategy that is illustrated in Fig. 5B and is described in the Results and in Materials and Methods. In the revised manuscript we modified the Results section to further emphasize this point.
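
      The difference between thresholding and posterior sampling can be illustrated with a minimal sketch. It is not the script distributed with Tapqir and does not reproduce the analysis of Fig. 5B. It assumes, for illustration only, that frames are independent, that `p_specific` holds per-frame p(specific) values for each target, and that binding is summarized by a single time-to-first-binding rate with right-censoring. The function names `sample_rate_constants` and `summarize` are hypothetical.

```python
import random


def sample_rate_constants(p_specific, frame_time=1.0, n_draws=500, seed=0):
    """Draw binary traces from per-frame probabilities and estimate a rate per draw.

    p_specific: list of traces, one per target; each trace is a list of
                per-frame probabilities that a specific spot is present.
    Returns one rate estimate (events per unit time) for each posterior draw.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_draws):
        n_events = 0
        total_time = 0.0
        for trace in p_specific:
            first = None
            for i, p in enumerate(trace):
                # Bernoulli draw: no probability cutoff is applied anywhere.
                if rng.random() < p:
                    first = i
                    break
            if first is None:
                # No binding sampled: the whole trace is right-censored.
                total_time += len(trace) * frame_time
            else:
                n_events += 1
                total_time += (first + 1) * frame_time
        if total_time > 0:
            # Maximum-likelihood rate for exponential waiting times with censoring.
            rates.append(n_events / total_time)
    return rates


def summarize(rates):
    """Return the mean and an approximate central 95% interval of the draws."""
    if not rates:
        raise ValueError("no rate estimates to summarize")
    ordered = sorted(rates)
    n = len(ordered)
    mean = sum(ordered) / n
    lower = ordered[int(0.025 * (n - 1))]
    upper = ordered[int(0.975 * (n - 1))]
    return mean, lower, upper


if __name__ == "__main__":
    # Made-up probabilities for three targets, five frames each.
    demo = [
        [0.02, 0.05, 0.60, 0.97, 0.99],
        [0.01, 0.01, 0.02, 0.03, 0.02],
        [0.90, 0.98, 0.99, 0.99, 0.99],
    ]
    print(summarize(sample_rate_constants(demo, frame_time=1.0)))
```

      Because each draw yields its own rate estimate, the spread of the draws carries the classification uncertainty into the kinetic parameter, which a single thresholded trace cannot do.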

      Finally, the manuscript could be improved by clearly distinguishing between the fundamental approach of Bayesian image analysis from the Tapqir software that would be used to carry this out.

      We have revised the manuscript to adopt this recommendation. We now call the mathematical model “the cosmos model” and use “Tapqir” to refer to the software.

      A section devoted to describing the Tapqir interface and the inputs/outputs would be valuable. In the manuscript's current form, the lack of information on the interface along with the potential requirement for a GPU and need for the use of a relatively new programming language (Pyro) may hamper adoption and interest in colocalization methods by general audiences.

      Description of the interface and inputs/outputs is given in the online Tapqir documentation.

      Users do not need to own a GPU; they can instead run the program on a readily available cloud computing service. We have now added to Table 1 data showing that computation time on the Google Colab Pro cloud service is actually faster than that on our local GPU system. Colab Pro is inexpensive, readily accessible, and user friendly. We have added to the user manual a tutorial that shows how to run a sample data set using Tapqir on Colab.

      Users do not need any knowledge of Pyro to use Tapqir; Pyro is merely used internally in the coding of Tapqir.

    1. thinking out loud Maybe we found love right where we are

      The main phrase “thinking out loud” is repeated throughout the entire song. It means to share one's thoughts so that other people can hear them. This can be seen throughout the poem as the speaker repeats the idea of displaying affection for his partner. “Thinking out loud” happens when people need help working through their thoughts. Likewise, the speaker has a lot of pent-up feelings and emotions that prevent him from thinking clearly, and this causes him to think out loud. The one thought that always becomes apparent to him is “Maybe we found love right where we are”, which shows that the speaker thinks he has found the true meaning of love because of his partner. This causes me to feel delight for the speaker: no matter what doubts he may have about their relationship, he is convinced that he has found his true love. This also pushes me to reflect on the theme of love, and on the idea that true love will find its way and remain strong no matter the circumstances or the difficult situations the couple are put through.

    1. Author Response:

      Reviewer #2 (Public Review):

      The authors have developed a new method that allows for two-color STED imaging. They have applied this method to measure spine head size and PSD95 changes following exposure to an enriched environment.

      Strengths

      -The new method is well-described and seems to have considerably less crosstalk than previous attempts at in vivo two-color STED imaging. The analyses and controls of the method are compelling. I think that this method could be valuable for examining how different components of the synapse are changing in response to sensory or environmental changes.

      -The method is appropriate for measuring the size of PSD95 and spine head size in the enriched environment paradigm they use here. They find that in the short-term spine head size and PSD95 size are not always correlated.

      -They also find that there is less variability in the spine head size in animals in an enriched environment.

      Weaknesses

      -The authors use an enriched environment plasticity paradigm to showcase the method and measure spine head and PSD95 size and how they change over short periods of time. This particular biological study is not well-motivated and there is not a stated reason for studying the short-term (30-120 minutes) dynamics of PSD95 and spine head size, and their correlations. They also show that the variability in spine head size is decreased with the enriched environment, but do not show what the implications of that change would be from a biological point of view for synaptic dynamics or synaptic function.

      -The authors show that there are differences in the morphology of PSD95 between mice reared in enriched environments and those in control environments. While this quantification is done blindly by three different analysts, it is not done in a quantitative way. Also the authors do not show or explain the biological relevance of differences in the morphologies of PSD95, thus it is not clear what this measure means for synaptic plasticity or function.

      -The authors use a cranial window preparation, which is commonly used in the literature. However, it is not clear how long they wait to image the mice after the cranial window. Previous work from Xu et al. (PMID: 17417634) suggests that there is an increase in glial activation for a period of up to a month after surgery. The authors have not shown the degree of glial activation that follows after their surgeries and if they have not waited a month, there may be upregulation of microglia, which may alter synaptic stability (also demonstrated in the same paper). The authors have not discussed this point or the implications for their findings.

      We thank the reviewer for his/her valuable input.

      The time-scale we study is similar to what is known from structural changes after LTP and thus we wanted to study the same time scale in vivo. We revised the motivation and explained better the biological relevance of the observed changes. We absolutely agree with the reviewer on his/her concern for chronic imaging. However, we performed acute experiments and imaged directly after implanting the window in the same session. After imaging the mice were sacrificed.

      Reviewer #3 (Public Review):

      Wegner et al. use two-color STED to follow spines and their PSDs in layer1 of mouse visual cortex over 2 hours under anesthesia. They compare mice that were kept in an enriched environment (EE) to control mice housed in standard laboratory cages. Spines in EE mice are larger and show larger fluctuations in size. PSDs in EE mice shrink during anesthesia and tend to change their nanostructure. Very importantly, changes in spine size were not driven by PSD size changes, or vice versa. Technologically, this is a landmark study, as tracking two different labeled structures in individual synapses at the nanoscale can obviously be applied to a large number of synaptic proteins and organelles, two at a time. Single-color superresolution microscopy is much less useful, as 'puncta in space', without cellular context, are difficult to interpret. This pioneering work is the first proof-of-concept of two-color in-vivo STED and of major importance for the community. Although stochastic processes seem to drive much of the synaptic dynamics under anesthesia, the environment shapes the spine size distribution and affects synaptic dynamics in a lasting fashion.

      One major comment:

      l.259: "These results suggest that Ctr housed mice undergo stronger morphological changes." This I find a bit misleading. What about: These results suggest that anesthesia induces stronger morphological changes in Ctr housed mice? Altogether, a discussion of the potential effects of anesthesia on spine/PSD dynamics is missing (see e.g. Yang et al., DOI: 10.1371/journal.pbio.3001146). The fact that there was weak correlation between spine head and PSD fluctuation could have something to do with the state of suppressed activity the system was in during imaging. Under conditions of intense processing of visual information, changes might have been more rapid and more tightly correlated. This could be mentioned as a perspective for the future - to visually stimulate the anesthetized animal.

      We agree with the reviewer that it should be mentioned here that the morphological change was observed under anesthesia. However, the sentence suggested by the reviewer is also a bit misleading since it suggests that the anesthesia has triggered the change. We think that anesthesia might affect the amplitude and dynamic of the observed changes but does not induce the change. Thus we rephrased as follows: These results suggest that Ctr housed mice undergo stronger morphological changes under anesthesia.

      We absolutely agree about the potential influence of the anesthesia on the spine and PSD95 nanoplasticity and added the following comment. Of course, we would like to perform the measurement in the future also in awake mice and after visual stimulation.

      Added to discussion: However, it was shown that MMF anesthesia reduces spiking activity and mildly increases spine turnover in the hippocampus (Yang et al., 2021). Thus, the plasticity of spine heads and PSD95 assemblies might be different in the awake state and under intense processing of visual information.

    1. Author Response:

      Reviewer #2 (Public Review):

      The reported study includes an overall well-conducted and well-presented set of experiments. Ample data are reported and a clear and conclusive picture of the findings is portrayed.

      1. The Introduction falls short of providing the background needed for fully appreciating the current findings and their importance. The authors don't present the current understanding regarding the role of 4-vinylanisole in locusts (mostly their own work). Nor do they present the accepted knowledge of the control of sexual maturation in locusts (mostly several decades-old work). Moreover, the importance of reproductive synchrony in the life history of gregarious locusts, including its tentative roles in maintenance of the homogeneity and integrity of the swarm, in ensuring high density conditions for the next generation, and more, is also not adequately presented.

      We appreciate the reviewer’s helpful comments. Following these comments, we have revised the Introduction to elaborate on the significance of reproductive synchrony for the ecological adaptation of gregarious locusts and on research progress in the control of sexual maturation in locusts. The revised text reads: “Depending on population density, locusts display striking phenotypic plasticity, with a cryptic solitarious phase and an active gregarious phase (Wang and Kang, 2014). Gregarious locusts, compared to solitarious conspecifics, show much higher synchrony in physiological and behavioral events, such as egg hatching and sexual maturation, as well as synchronous feeding and marching behaviors (Norris, 1954, Uvarov, 1977). Reproductive synchrony in gregarious locusts provides benefits for individuals in several respects, such as a more favorable microenvironment, lower risk of predation, more efficient foraging, and more encounters with mates; it therefore ensures high-density conditions for the next generation and is essential for maintenance of the locust swarm (Beekman et al., 2008, Maeno et al., 2021). Some form of vibratory stimulus, maternal microRNAs, and SNARE protein play important roles in the egg-hatching synchrony of gregarious locusts (Chen et al., 2015b, He et al., 2016, Nishide and Tanaka, 2016). The presence of mature male adults has been shown to accelerate and synchronize the sexual maturation of immature male and female conspecifics in two locust species, Schistocerca gregaria and Locusta migratoria (Norris, 1952, Loher, 1961, Guo and Xia, 1964, Norris, 1964). The accelerating effects of several prominent volatiles released by gregarious mature males on male maturation have been demonstrated in the desert locust. Four volatile pheromones (benzaldehyde, veratrole, phenylacetonitrile, and 4-vinylveratrole) significantly stimulate sexual maturation of male adults, with phenylacetonitrile having the most pronounced effect (Mahamat et al., 1993, Assad et al., 1997). However, how conspecific interaction affects female sexual maturation remains unclear, and the pheromones that contribute to maturation synchrony of females have not been determined so far”. In the current study, we identify 4-vinylanisole as a key pheromone promoting sexual maturation synchrony by validating the roles of five gregarious male-abundant volatiles one by one, rather than by following up our previous work on 4-VA. Accordingly, we have fully elaborated the multiple functions of 4-VA as both an aggregation pheromone and a maturation-accelerating pheromone in the formation and maintenance of locust swarms in the Discussion.

      2. Research on pheromonal signaling in locusts have traditionally focused on compounds with a putative role in density-dependent phase-specific behaviors. Hence, it is common to compare the response of crowd-reared vs. solitary locusts to applied chemicals. The challenge, however, is maintaining the density context, while attempting to conduct controlled similar experiments with locusts of the two phases (i.e. keeping the solitary phase locusts isolated, while the gregarious locusts must always be crowded). This is even more challenging when studying reproductive physiology. By the basic nature of the two phases, there can be a multitude of interacting factors (behavioral and/or physiological) affecting the much-desired reproductive synchronization in gregarious locusts, while such synchronization is not expected at all in solitary ones (it may even be claimed to have no fitness-related advantage).

      3. In general, the authors of the current report have dealt well with these challenges, taking extra care to conduct multiple controls and making an effort to specifically test all the possible factors. However, there are several points that raise some uncertainties. For example:

      o If I am not mistaken, females of both phases were included in the study only if already mated by day A+7 (LL355-357). While this is reasonable for gregarious locusts, it may not be suitable for the solitary locusts, imposing an undesired and unequal selection criterion.

      We thank the reviewer for this comment. We do not think the criterion (mated at PAE 6-7 days) causes significant bias in either gregarious or solitarious locusts. In fact, the requirement of mating before PAE 7 days is used to rule out effects on oviposition synchrony caused by differences in mating age among individuals. This criterion was applied only in the analysis of the first oviposition date. Given consistent mating times, oviposition consistency in gregarious female adults may largely reflect the sexual maturation synchrony among individuals (Figure 1A). For subsequent experiments, we concentrated on the regulation of sexual maturation and used only virgin females.

      o In the test of the effects of conspecifics interactions, 10 gregarious locusts provided stimulation to the tested gregarious female, while only one insect was the stimulating factor for the solitary female.

      In fact, we carried out two independent experiments to test the effects of conspecific interactions. Population densities were kept in a solitarious context for the comparison of female sexual maturation synchrony between typical gregarious and solitarious phases (Figure 1D). For the locust emission treatments, ten solitarious locusts were used to ensure stimulation at the same density level (Figure 1F). Both experiments suggested that solitarious male adults had no effect on female sexual maturation.

      o It is not clear how were egg pods attributed to specific gregarious females (maintained in groups of 10)

      We thank the reviewer for this comment. To monitor the oviposition activity of each gregarious female in a group, locusts were individually marked, and their first oviposition times were determined by collecting egg pods every 4 hours each day after mating. Females that had laid new eggs could be easily distinguished by a much thinner abdomen with white foam around the ovipositor. We have provided the method details in the revised manuscript.

      Overall, since the focus of this study is actually not on the comparison between the phases, it might have been beneficial to the readers if the focus was on the gregarious locusts only, with maybe a couple of experiments conducted on solitary insects and presented separately.

      We understand the reviewer’s concern. The aim of this study is to explore the mechanism underlying sexual maturation synchrony by comparing phase- and sex-dependent conspecific interactions in locusts. The reproductive synchrony of gregarious locusts, in both first oviposition time and sexual maturation, might not stand out without a comparison with solitarious locusts, although the mechanistic studies were mostly performed in gregarious locusts. Moreover, the phase-dependent comparison of volatile contents helps us to screen candidate volatiles responsible for accelerating sexual maturation synchrony in females.

      4. Assuming that within a locust group there is overall agreement in the age of males and females, there seems to be a not-fully-explained mismatch between the age of max 4-VA release by males (linearly increasing with age) and the age of max effect in females (critical period at A+3-4)

      We appreciate the reviewer’s query. We have added discussion of the “mismatch” between the age-dependent release of 4-VA by males and the age of maximal effect in females (PAE 3-4 days). The added text reads: “We find that the release of 4-VA by gregarious males continuously increased after adult eclosion, with maximal 4-VA release at PAE 8 days. The age of maximal 4-VA production outwardly seems to be unmatched with the developmental stage at which females are sensitive to 4-VA (PAE 3-4 days). In insects, it is very common for males to mature earlier than females (Alonzo, 2013). In the locust, male adults also reach sexual maturity several days earlier than females. In a given locust population, individuals emerge as adults successively over a couple of days rather than in complete synchrony. Therefore, the age-dependent increase in 4-VA release in gregarious male adults presents a persistent stimulus for less-developed young female adults, and thus maximizes synchronous maturation of female locusts, which could reduce male competition for mate selection”.

      5. Similar to the introduction, the discussion section also does not present comprehensive arguments regarding the importance of reproductive synchronization in female locusts. Points that could have been discussed include: females' oviposition disrupting migration, synchronization affecting sexual selection, accelerating intra-sex competition over mates as well as oviposition sites, and more.

      We appreciate the reviewer’s helpful suggestions and have added discussion of this point accordingly. The added text reads: “Reproductive synchrony involves consistency in maturation, mating, and egg laying, among which sexual maturation synchrony serves as the most foundational step for oviposition uniformity (Hassanali et al., 2005). The extremely high energy cost of female reproduction could restrict migration to pre-, post-, or inter-oviposition periods in locusts, and thus have crucial effects on the collective movement of local populations (Min et al., 2004). Given this, balancing the timing of sexual maturation among female members is essential for the maintenance of locust swarms. We here demonstrated that young female adults reared with older gregarious male adults show faster and more synchronous sexual maturation in the migratory locust, supporting the accelerating role of crowding in the sexual maturation of females (Guo and Xia, 1964, Norris and Richards, 1964). Together with the previously reported acceleration of immature male sexual maturation by older gregarious male adults (Torto et al., 1994, Mahamat et al., 2000), this indicates that young adults of both sexes living in gregarious conditions mature more synchronously than individuals reared in isolation. Consistent maturation in both sexes will greatly reduce intra- and inter-sex competition for mate selection and thus ensure reproductive synchrony in whole locust populations. We demonstrated that a single minor component (4-VA) of the volatiles abundantly released by gregarious male adults is sufficient to induce the maturation synchrony of female adults. By comparison, four volatiles (benzaldehyde, veratrole, phenylacetonitrile, and 4-vinylveratrole) showed stimulatory effects on male maturation (Mahamat et al., 2000). Thus, there might exist sex-dependent modes of action of maturation-accelerating pheromones: multi-component pheromones for males and a single active component for females, possibly due to different selective pressures on the two sexes in response to social interaction. Future work will test this hypothesis by determining whether 4-VA has maturation-accelerating effects on male adults in the migratory locust”.

      Reviewer #3 (Public Review):

      Strengths: Grouping behavior for marching, sexual maturation, swarming, oviposition and egg hatching in gregarious locusts is complex and it's mediated by a combination of cues-olfactory, tactile, and visual cues to ensure synchronous behavior. The authors show that only olfactory cues released by gregarious adult males mediates maturation synchrony of females. This finding is a confirmatory result of a well-established phenomenon for maturation synchrony in both sexes of adult locusts, although in this study, the authors focused on only females. Further, the authors validated their findings using gene editing techniques to show that maturation synchrony was diffused in Or35-/- mutant adult females but not in wild type females exposed to adult male volatiles and the individual component identified as 4-vinylanisole among five male-abundant volatiles as promoting synchronous sexual maturation in only post adult eclosion females (PAE) 3-4 days old. Use of molecular and single sensillum recordings, followed by physiological experiments focused on the interaction between this specific adult pheromone and juvenile hormone to validate the behavioral results found for females add scientific value to the study.

      Weaknesses: Firstly, synchronous and accelerated sexual maturation of young adults by older pheromone-producing ones, is a primer effect driven by males and this facilitates 'integration and cohesion' of both sexes of adults. In my view, the fact that this study focused on only females but not on both sexes, weakens the contribution of the study towards increased understanding of the biology/ecology of locusts.

      We accept the reviewer’s comment that synchronous and accelerated sexual maturation of young adults by older pheromone-producing ones occurs in both sexes. In fact, early studies have reported that mature males can accelerate sexual maturation of young males through several candidate compounds (Mahamat et al., 1993, Chemoecology; Mahamat et al., 2000, International Journal of Tropical Insect Science). However, the effects of conspecific interaction on sexual maturation of females have rarely been reported. Moreover, distinct volatiles that can accelerate female sexual maturation had not been characterized before this work. Therefore, we focus on female sexual maturation synchrony in the current study. A comparison of the regulatory mechanisms underlying sexual maturation synchrony in males and females is discussed in the revised manuscript.

      There are also weaknesses in the methods, such as focusing on only the five-abundant male volatiles based on heat maps. Basically, the decision as to which components in adult male volatiles may be contributing to sexual maturation should be made by antennae of different ages of PAE females and males to avoid selecting only abundant compounds based on artificial intelligence (AI). Since most studies in this subject area have demonstrated that there is no direct correlation between volatile abundance and detection at the periphery or central nervous systems of an insect, I believe that the authors will agree with me that often some of the minor volatile components tend to contribute more to the chemical ecology of an insect than the more abundant components. Without testing minor components identified in male volatiles as a blend or individually, as additional controls to increase the robustness of the study, I am not convinced that the authors have fully achieved their aim in identifying a male-produced volatile that promotes sexual maturation in females.

      We agree with the reviewer that the activities of volatiles are not always determined by their absolute contents. In our work, the selection of candidate compounds for female sexual maturation did not rely on the absolute content of these volatiles, but was mainly based on comparative analysis of their relative contents between gregarious and solitarious male adults, because only volatiles from gregarious male adults could accelerate sexual maturation of females (Figure 1C-F). During revision, given that the volatiles released by gregarious males, rather than those of gregarious females or solitarious males, accelerate female sexual maturation, we performed further comparative analysis of volatile contents among these three groups (G-males, G-females, and S-males). Compared with G-females and S-males, only five volatiles display significantly higher emission in G-males (PAN, guaiacol, 4-VA, veratrole, and anisole). The roles of the five candidate volatiles in female sexual maturation were individually validated by removing each volatile from the stimulation blend one at a time. The results showed that only the omission of 4-VA from the blend abolished the accelerating effect on sexual maturation synchrony of gregarious females (Figure 2B). Based on these findings, we inferred that 4-VA plays the major role in promoting female sexual maturation synchrony.

      JH experiments- My main concern is the lack of proper controls to fully investigate the interactive effect of the male-produced pheromone promoting sexual maturation and juvenile hormone production. JH titers were not measured in females exposed to the other male-abundant compounds including PAN, guaiacol, veratrole and anisole or blend/individual minor components.

      We understand the reviewer’s query. The potential role of the JH pathway was first inferred from the RNA-seq analysis of CC-CA, which showed that the expression levels of JH metabolism-related genes were significantly affected by 4-VA treatment at PAE 3-4 days. The JH titer after 4-VA treatment was then measured to support the involvement of JH in 4-VA-accelerated sexual maturation in female adults. Because the other male-abundant compounds (PAN, guaiacol, veratrole, and anisole) were excluded by the omission experiments, in which removing any one of these four volatiles did not abolish the accelerating effect (Figure 2B), we do not think it is necessary to test their effects on JH titers in females.

      Another notable weakness is the 'JH Rescue Experiment'. The authors did not inhibit JH synthesis in the corpora allata (allatectomized locusts) in treated locusts before injecting the JH-analog methoprene to accelerate maturation and reproduction in females.

      We thank the reviewer for this comment. The JH rescue experiments in Figure 4D-F were performed in Or35 female mutants, which showed lower JH levels and a lower sexual maturation rate. Thus, the JH analog was applied to Or35^-/- females to test whether activation of the JH pathway could restore the sexual maturation rate and Vg expression. To provide additional evidence, we performed additional rescue experiments in WT females by inhibiting JH synthesis with Precocene (PI) before JH treatment. The results showed that PI treatment significantly reduced the sexual maturation rate and Vg expression in 4-VA-exposed WT females, whereas JH treatment after PI application clearly restored both (Figure 4G-I).

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Review of 'Mother centrioles generate a local pulse of Polo/PLK1 activity to initiate mitotic centrosome assembly' from Wong et al.

      In this paper, Wong et al address the mechanisms of centrosome assembly in flies. They start with the interesting observation that Polo localized at centrosomes oscillates before cells enter mitosis, while Cnn (and with it centrosome maturation) either increases or reaches a plateau. The phenomenon is local, since Polo levels in the cell are high during mitosis. They propose that the oscillation is driven by a negative feedback loop whereby Polo inhibits its own binding to the centrosome, Ana1 being the most likely relevant receptor. Finally, they discuss the possible meaning of this oscillatory behavior, in the light of the rapidity of the early embryonic cell cycles.

      Major comments

      1- One can imagine different reasons for the fact that the model displays different dynamics for Cnn and Spd-2/Polo. For example, a major difference may be due to the different dissociation rates of the clusters Cstar and Shat. These are governed by different laws and different parameters (kdis vs kdis Cstar^(1/n)). If I understand, both parameters and dependency on Cstar^2 are assumptions. Hence, it would be important to pinpoint which component of the model is more directly responsible for the observed behavior. The analysis should not be limited to the dissociation, but should be extended to the whole model. To this aim, one could test the robustness of the model's parameters. The results of this analysis will also be a prediction of the model.

      2- The presence of a positive-feedback loop involving Cnn could offer an alternative and more robust explanation for the slower dynamics of Cnn. Such a loop between Cnn and Spd-2 was proposed by the authors (Conduit, eLife, 2014). I think some comment on this point would be interesting (eg, could the Cnn/Spd-2 loop proposed earlier work in this context? If not, why? If yes, should not this option be explored?).

      3- The prediction presented in Figure 6 is very relevant. I wonder how robust this behavior is to changes in parameters values.

      4- Additional testing of the model would be important to confirm that the negative feedback loop is actually in place, although I understand the experiments may be difficult to perform. Possible examples: constantly high levels of Polo are expected to decrease its centrosomal localization; is that correct and, if so, testable? Is it possible to delay one cycle, and then observe the decay in Cnn values? This latter experiment, for example, could help to distinguish positive feedback from slow decay rates. If the experiments are not possible, it may still be worth presenting some predictions to test.

      5- The difference between Models 2 and 3 is not clear to me. In mathematical terms, they seem to be basically the same thing: reaction (50)=(33), (51)~(34) given (40) and (52)~(35) again given (40). This is precisely since the model comes with the assumption of a well-stirred system, and thus adding P in solution is not so different from assuming P=Rphat (40). I would have imagined that also Model 2 accounts for the fact that in Spd-2-S16T and Ana1-S347T Polo is recruited slower and for a longer period. Is it not true? If so, is model 3 really needed? More in general, assuming a role for an increase of local concentration of P* is quite a jump, especially given the small distances involved, and the fast diffusion occurring within cells.

      Minor points

      1-Could the authors use the FRAP data to estimate the different kdis? If so, a comparison with the 20-fold difference used in the model would be useful.

      2- p. 6, The authors should state clearly for the worm-uneducated like me whether the fusions were done with the endogenous proteins or not.

      3- p.7 Figure 1B, in the text it is referred to display 'levels of peaks' and in the figure and legend we find 'growth period'. Not clear how the two refer to the same quantity.

      4- Spd2-mCherry is present in both Figure 1C and D, but with very different amplitudes. Why is that the case?

      5- The fact that Polo peaks in mitosis is a key observation. Unfortunately, this is often reported as a personal communication. The authors never tried to produce this piece of data?

      6- p.11 It is explained that NM and OM differ in their initial values because the OM starts with some PCM from the previous cycle. However, in Figure 3A, for example, the values of Polo at the end of the cycle are identical in the two. Is this not in contrast with the explanation?

      Still p11, there is reference to Figure 3C,D, but Figure 3D does not exist, I guess it should be 3A,C.

      7- In the formulation of the model (page numbers in Suppl Mat are unfortunately missing..), one citation for the total amount of Polo being large is needed.

      8- I do not understand this point: scaled c output is 1, and the initial condition for c=1 also?

      9- It has been shown in different systems (from yeast -- Haase, Winey and Reed, NCB, 2001 -- to worms -- McCleland and O'Farrell, CB, 2008) that centrosome duplication can occur independently from the cell cycle oscillator. I was wondering whether the proposed negative feedback loop may play a role in this phenomenon. This is only a curiosity, which does not need to be addressed.

      Significance

      The new observation and hypotheses presented in the paper provide a sizeable advance. The presence of an oscillation in Polo, uncoupled from cellular levels, is new, and the model proposes a testable hypothesis to explain it. Some additional experiments to verify the model would strengthen the manuscript.

      The work is probably more appropriate for experts in the centrosome field. My primary expertise for this review was in mathematical models.

    1. A single mom on disability struggles to provide food and clothing for her teenage daughter. A recent college graduate forgoes therapy. A young professional puts off buying a home and taking the next steps in his life. And a 74-year-old in a senior living community knows her monthly Social Security budget down to the cent.

      The lives of these Utahns are all being shaped by spending around 50% or more of their income on housing each month.

      Their challenges are part of a larger statewide housing crisis, one that is being blamed on both a shortage of homes and sluggish income growth that isn’t keeping pace with soaring real estate prices.

      While past spikes in housing costs have priced people out of home ownership in Utah, the current affordability crisis is more all-encompassing — so it’s also stretching renters to the breaking point, said James Wood of the University of Utah’s Kem C. Gardner Policy Institute.

      “I speak from personal experience,” said Wood, a senior fellow at the Gardner Institute. “I have people in my basement, and I’ve tried to help them find places. It’s really tough.”

      Nearly one in five renters in Utah is severely cost-burdened, meaning they spend at least half their income on housing and often struggle to pay for food, transportation and other bills, according to federal data for 2013 to 2017. And more than 63% of the state’s lowest-income residents fall into this category, this data shows.

      [Read more: Do you spend more than half your income on rent? Here are resources that can help.]

      The disparities are particularly acute for Utahns of color, with a recent Gardner analysis showing that Black and Hispanic renters are more likely to face severe housing cost burdens. The research found that 32% of Black renters in the state spend more than half of their income on housing, making them almost twice as likely to face severe cost burden as white renters.

      For a minimum wage worker in Utah, a rental home would have to cost $377 per month or less in order to be affordable, according to an analysis by the National Low Income Housing Coalition. But the average rent for a one-bedroom Salt Lake City apartment is nearly triple that, at $1,099 a month, according to a June report from the popular rental website, Zumper.

      Wood said some of the state’s lowest-income residents receive public housing assistance, but there’s not enough money to reach everyone who needs help. Without government support, this group of Utahns lives on the brink of homelessness, with any additional hardship potentially pushing them over the edge.

      “Whether it’s domestic violence, or whether it’s the loss of job or a health incident or a traffic accident,” he said. “That’s a disaster.”

      Tara Rollins, executive director of the Utah Housing Coalition, notes that it’s easier to prevent people from losing their housing than it is to get them off the streets. The coalition advocates for increased wages and additional units of deeply affordable housing, to help people in this position before they’re pushed into homelessness.

      Even those who are moderately cost-burdened — meaning they spend more than a third of their income on housing — face challenges, Rollins noted. But she doesn’t think that many Utahns and policy makers are paying enough attention to this swath of Utahns who are barely keeping their heads above water.

      “Unless they feel it, see it, they don’t get it,” she said. “You can’t see somebody’s wallet and how empty it is.”

      (Rick Egan | The Salt Lake Tribune) Jan Aus, a 74-year-old apartment resident in Sandy, says her rent keeps rising.

      ‘Nobody has my back’

      When Jan Aus, 74, moved into a senior living community in Sandy seven years ago, she was shelling out $720 for rent each month.

      “And then they raised it $10 two or three years after that,” she recounted. “And then, bing, they hit me with $65 a year.”

      Today, Aus is paying $925 to live in her one-bedroom apartment ― a figure that sucks up the bulk of her Social Security check. She knows the amount she has to budget each month down to the cent: $1,251.80.
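
      The cost-burden thresholds quoted in the article can be checked against the figures given for Aus. The following is a minimal sketch, not part of the original reporting: the function names are hypothetical, and it assumes her $1,251.80 monthly budget is her entire income, which the article does not state outright.

```python
def housing_share(rent: float, income: float) -> float:
    """Return the fraction of monthly income that goes to rent."""
    if income <= 0:
        raise ValueError("income must be positive")
    return rent / income


def burden_category(share: float) -> str:
    """Classify a housing share using the thresholds quoted in the article."""
    # "Severely cost-burdened": at least half of income goes to housing.
    if share >= 0.5:
        return "severe"
    # "Moderately cost-burdened": more than a third of income goes to housing.
    if share > 1 / 3:
        return "moderate"
    return "not cost-burdened"


# Figures from the article: $925 rent against a $1,251.80 monthly budget
# (assumed here to be her total income).
aus_share = housing_share(925.00, 1251.80)
print(f"{aus_share:.1%} -> {burden_category(aus_share)}")  # 73.9% -> severe
```

      Under that assumption, roughly 74% of her monthly budget goes to rent, well past the 50% line for a severe cost burden.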

      Aus said there are people who are “worse off than I am,” noting that she’s receiving government assistance available to low-income Utahns to help pay for electricity and food. She owns her car and considers her health insurance “good,” as long as she makes sure to get generic prescriptions.

      Still, she said the amount of money she’s putting toward rent has become stressful, especially as she waits to see whether the apartment complex where she lives will raise her rent again this fall.

      “It scares me,” she said. “And like I said, they’re going to hit me in September ... and it scares me to think they’re going to raise it again. I just feel like nobody has my back.”

      If not for the federal pandemic stimulus checks, Aus said, she wouldn’t have any kind of savings, money she’s socked away in hopes that she can put a security deposit down on a more affordable apartment soon.

      The problem, she said, is that there’s very little available in her price range of $800 to $900 a month, other than a room in someone else’s house.

      “I don’t see myself going that way,” she said. “That to me is kind of scary. I think we need more affordable housing, I really do. Because it’s not going to get any better. To me, it’s going to get worse.”

      ‘I want to take care of more of my health’

      Jazmin May has cut back her therapy sessions from once a week to once a month. The 24-year-old Salt Lake City resident can’t go in for an eye exam as soon as she’d like. And she’s had to ask her parents to chip in some money when her car needed repairs.

      That’s all because rent claims about 50% of her income, and she has to stretch the rest to pay her other bills.

      “For right now, it works for me,” she said. “I do wish I had more money left over in my paycheck just to be able to afford other things. I want to take care of more of my health.”

      May says many of her other friends from college are also struggling, as they strain their early-career salaries to cover the cost of housing. Some have found it impossible and have gone back to live with their parents, she said.

      She and a friend signed a lease for their two-bedroom apartment near Liberty Park in 2019, but the pandemic that arrived just months later quickly jeopardized the living arrangement. Her friend lost a retail job and had to move back to her family home in Ogden.

      May said she considered looking for another roommate and decided it would be better for her mental health if she lived alone for a while.

      Taking on the entire rental payment meant accepting a job at an area museum rather than continuing to hop between political campaigns — work that she loves but is too unreliable for her right now.

      “I feel like I have sacrificed, in a way, my passion, to be able to afford housing, because I love campaigns and politics and outreach,” she said. “But campaigns are also not a stable job, and often you don’t get benefits. So I decided to just take a break from politics for a little.”

      Even with the more predictable salary, May said home ownership seems like an unattainable goal at this point, especially since she’s worried that housing costs will always be one step ahead of her income growth.

      She peruses rental listings for fun sometimes, but she’s not convinced she could find something cheaper, especially considering the pet fees she’d have to pay for her cat, Lilith. Her only other option, she said, would probably be to move into her parents’ home, as some of her friends have done.

      ‘You just kind of want to be an adult’

      Fresh out of college 10 years ago, Orem resident Eric Wilson set a long-term goal of saving enough for a down payment on a home.

      He’s passed up concerts he wanted to attend, vacations he wanted to take and movies he wanted to see. He’d love to buy the latest tech gadgets and the newest iPhone, but he’s socking away every extra penny in his investment portfolio instead.

      Still, the 31-year-old marketing specialist said he doesn’t feel much closer to buying a house than he did a decade ago — and perhaps even further away, as he watches home values grow at warp speed compared to his slow-and-steady savings. So he can’t help but wish Utah’s economy would hit the tiniest snag.

      “Just a little bit. Not enough to hurt anybody,” he half-jokes. “Just to make house prices go down.”

      Making it especially hard to save is his current rent, which eats up nearly half of his salary.

      Wilson has lived in the two-bedroom unit since shortly after he graduated from Utah Valley University. He had a roommate initially but opted not to get another one after his friend married and moved out.

      “You just kind of want to be an adult and go off and do your own thing and have your own space and not have to worry about marking your milk,” he said. “But at a certain point, if prices keep rising, it’s not really feasible.”

      Wilson had gotten about a fifth of the way to his goal of saving $100,000 when COVID-19 struck and his marketing agency had to cut jobs, including his. His unemployment lasted six months, forcing him to deplete the nest egg he’d spent so long accumulating.

      He keeps browsing online real estate listings, despite knowing how far away he is from becoming a homeowner. The hobby is becoming increasingly demoralizing, though, he said.

      Three years ago, he toured a modest home in a nice neighborhood that was pretty affordable for him, priced at a bit less than $200,000. Recently, he saw the same place had sold again for $415,000.

      Wilson said he likes his apartment and knows he’d have to pay much more if he relocated. But he’s also weary of renting. He’s tired of feeling like he has to put off his life — and delay buying a dog or becoming a foster parent.

      “I’m just kind of not at that point where I can do that space-wise,” he said. “But I would love to do that. And having a home would help make that possible.”

      (Christopher Cherrington | The Salt Lake Tribune)

      ‘This is where you belong’

      Anna, 50, is a single mother supported by disability payments from Social Security and living in an income-restricted apartment complex in Holladay — but with around $700 left over each month after she pays her rent, she said, she’s still struggling.

      While her monthly housing costs have ballooned from $850 when she first moved into the two-bedroom apartment in 2016 to $1,077 now, her disability income hasn’t increased at the same rate.

      “It’s a huge stress,” Anna said. She fears she might be pushed out of the unit for speaking out about her rent increase, and The Salt Lake Tribune is not publishing her surname.

      Among her biggest challenges is making sure her 13-year-old daughter can access nutritious foods, a goal she said is easier thanks to assistance from The Church of Jesus Christ of Latter-day Saints.

      “Otherwise, my daughter probably would just be eating rice,” Anna said.

      She’s also struggled at times to supply her daughter with new clothes that fit as she outgrows old ones.

      Anna said she’s trying to save money, but “life keeps happening,” such as a car problem earlier this year that claimed everything she had saved and more. Sometimes, she worries that one misstep could land her and her daughter on the streets.

      Drowning in monthly housing costs, Anna said she’s been searching online for a more affordable apartment in the hopes that she wouldn’t have to stretch so much to make ends meet — but she’s growing increasingly discouraged.

      “I do keep on looking,” she said. “I keep hoping maybe I’ll find somewhere that is rent manageable as well as safe that I can move my daughter and I to so we can be able to provide for ourselves without having to rely on governmental programs completely. And basically feel like we’re being pushed into that hole that, well, if you can’t work for yourself, then this is where you belong. That’s how it feels.”

      Hours after speaking with The Tribune, Anna received a notice taped to her door that her rent was being increased once again: to $1,122 starting July 1.

      Credit: Taylor Stevens, Bethany Rodgers

      Word count: 2156 Copyright The Salt Lake Tribune Jun 10, 2021


  9. data-ethics.jonreeve.com data-ethics.jonreeve.com
    1. Do numbers speak for themselves? We believe the answer is ‘no’. Significantly, Anderson’s sweeping dismissal of all other theories and disciplines is a tell: it reveals an arrogant undercurrent in many Big Data debates where other forms of analysis are too easily sidelined. Other methods for ascertaining why people do things, write things, or make things are lost in the sheer volume of numbers. This is not a space that has been welcoming to older forms of intellectual craft. As Berry (2011, p. 8) writes, Big Data provides ‘destablising amounts of knowledge and information that lack the regulating force of philosophy’. Instead of philosophy – which Kant saw as the rational basis for all institutions – ‘computationality might then be understood as an ontotheology, creating a new ontological “epoch” as a new historical constellation of intelligibility’ (Berry 2011, p. 12)

      Big data can provide a lot of information, and analyzing it will eventually give us results. But does a huge amount of data necessarily give us the right result? I don't think so. Excessively large data sets not only increase the amount of calculation and the difficulty of analysis, but can also supply repetitive, complicated and useless information. That information may lead us away from the correct results, or toward results that depend too heavily on one particular setting. If we want to reach a general conclusion, we cannot rely on the analysis of these numbers alone; we also need prior knowledge or more efficient data processing methods.

    1. Both novels tackle the issue of racism and were removed after parents complained of “profanity”.

      I really think that by banning books that address racism, we are limiting our progress as a society. If we can't learn about the problem, it won't ever be addressed, and true change cannot occur. I was thinking about this in my Spanish class, as this week we were reading about bullfighting and the running of the bulls. While it made me feel uncomfortable, I noted the importance of reading about both sides of the issue. If I remained in ignorance, how could I add my voice to changing customs? I may not live in Spain, and may not be able to change much with regard to bullfighting practices, but I can help make a difference in the racism embedded in our society by learning more about the issue through reading books on the subject, especially the personal experiences of others.

    1. In view of all this we may say, not, I think, that psychology is all there is of philosophy, as Wundt does, nor even that it is related to the systems as philosophy to theology, nor that it is a philosophy of philosophy, implying a higher potence of self-consciousness, but only that it has a legitimate standpoint from which to regard the history of philosophy,-- a standpoint from which it does not seem itself a system in the sense of Hegel, but the natural history of mind, not to be understood without parallel [p. 131] study of the history of science, religion, and the professional disciplines, especially medicine, nor without extending our view from the tomes of the great speculators to their lives and the facts and needs of the world they saw. It strives to catch the larger human logic within which all systems move, and which even at their best they represent only as the scroll-work of an illuminated missal resembles real plants and trees, in a way which grows more conventionalized the more finished and current it becomes. In a word, it urges the methods of modern historic research, in a sense which even Zeller has but inadequately seen, in the only field of academic study where they are not yet fully recognized.

      Hall states his belief that psychology is much more than philosophy. Psychology is its own science, just like medicine. Despite the contributions of theology and philosophy, psychology is scientific and researchable.

    1. The Self by Soul, not trample down his Self, Since Soul that is Self’s friend may grow Self’s foe. Soul is Self’s friend when Self doth rule o’er Self, But Self turns enemy if Soul’s own self Hates Self as not itself. The sovereign soul Of him who lives self-governed and at peace Is centred in itself, taking alike Pleasure and pain; heat, cold; glory and shame.

      This excerpt from the passage might seem a bit overwhelming, as the phrasing is odd to a modern reader, but I think I arrived at a general sense of what Krishna is saying at the beginning of this chapter. For many, the practice of Yoga correlates directly with harmony and control, but Krishna considers another meaning, one that might surprise many. Yoga is learning to let go: it is to detach oneself from one's desires and so come to the realization that desire is directly linked to the pain we all face in life. By letting go of those desires you break the tether that binds you to worldly things. One's soul can turn into an enemy if hatred takes over.

      Source: V, Jayaram. Descriptions of Soul or Atman In The Bhagavad Gita. Hinduwebsite.com. https://www.hinduwebsite.com/soul.asp

    1. If you consume too little, you could be leaving potential gains on the table and missing out on fat loss, just because you didn’t want to eat an extra chicken breast or protein shake. In this sense, we think of having a high protein intake as a sort of anabolic insurance. It covers you in a similar way as car insurance in that you may not necessarily need it, but it’s a good idea to have it just in case.

      It's clear that they are writing the book from the perspective of maximizing body recomposition in whatever way possible. I believe that this is a little misguided for most people, as I personally don't feel that good if I eat a ton of protein and nothing else. It makes my stomach feel a little poopy.

    Annotators

    1. “You don’t want all of your Hispanic kids looking up to a bunch of white teachers and that’s basically what we have so, yeah, it’s an issue,”

      I think it's extremely important that children in the education system have someone from their cultures who they can look towards and relate to. However, it is equally as important to provide children with exposure to different cultures that they may not interact with or know much about. It's so important that school districts start hiring a diverse range of teachers in their schools instead of it being primarily white teachers.

    1. Our brains work not that differently in terms of interconnectedness. Psychologists used to think of the brain as a limited storage space that slowly fills up and makes it more difficult to learn late in life. But we know today that the more connected information we already have, the easier it is to learn, because new information can dock to that information. Yes, our ability to learn isolated facts is indeed limited and probably decreases with age. But if facts are not kept isolated nor learned in an isolated fashion, but hang together in a network of ideas, or “latticework of mental models” (Munger, 1994), it becomes easier to make sense of new information. That makes it easier not only to learn and remember, but also to retrieve the information later in the moment and context it is needed.

      Our natural memories are limited in their capacities, but it becomes easier to remember facts when they've got an association to other things in our minds. The building of mental models makes it easier to acquire and remember new information. The downside is that it may make it harder to dramatically change those mental models and re-associate knowledge to them without additional work.

      The mental work involved here may be one of the reasons for some cognitive biases and the reason why people are more apt to stay stuck in their mental ruts. An example would be people not changing their minds about racism and inequality, because it's easier to keep their pre-existing ideas and biases than to do the work needed to change them. Similar things come into play with respect to tribalism and political party identification as well.

      This could be an interesting area to explore more deeply. Connect with George Lakoff.

    1. Author Response:

      Reviewer #2:

      Weaknesses:

      The competition assay used in this study may not truly reflect the competitiveness of SSIMS males. The mating assay used 20 virgin WT females and 4 males (including both WT and SSIMS), resulting in a 5:1 sex ratio, so the males are not really competing for females. A more competitive ratio (such as WT females: WT males: SSIMS males at 1:1:1) should be designed to address this. Also, the sperm competition assay mixed the mated WT females with SSIMS males for 12 days, allowing plenty of time for the females to remate with these males. Therefore, it is more like a sperm replacement assay than a competition assay. The authors should either repeat it with a strict time control, or soften their statements on sperm competitiveness.

      We have repeated the experiment at a 1:1:1 ratio as suggested. The new results are reported in the revised Figure 3. It is not clear to us how the timing of the mating experiments differentiates sperm competition versus sperm displacement, but we agree that sperm displacement is a better term to describe what we did. We have repeated the sperm displacement experiment with strict time control based on several published literature precedents and describe the results in the revised manuscript.

      Some necessary information or statistics are not shown or mis-presented. For example, the alternative splicing diagram in Figure 1c likely was taken from the original transformer gene, but here it's the tTA gene so the male intron should be removed since it's not in the construct;

      We have revised text in the manuscript to clarify some of these points. First of all, the male intron is still in the construct, even though we fused the intron to the tTA gene. The alternative splicing between males and females is caused by use of alternative 5' splice sites, which means the intron that is spliced out in males is just a smaller section of the intron that is spliced out in females. Use of an alternative 5' splice site in males means that a protein-coding sequence with multiple stop codons is incorporated into the mature mRNA. We do not support the precise splicing mechanism with empirical data in this paper, but this has been done in a number of previous publications (https://doi.org/10.1016/j.ibmb.2014.06.001; https://doi.org/10.1371/journal.pone.0056303).

      Because the construct works as predicted (100% female lethality in the absence of tetracycline), and we did not change the genetic design in a way that would impact the mechanism of female lethality, we think there is little reason to believe that the splicing is occurring in a different way.

      the panels of Figure 2 were not consistent with the legend and were confusing; the statistics for the different tetracycline concentration tests were not shown in Figure 2 or in the text to answer their hypothesis "(to) optimize rearing of SSIMS stock, …..we titrated Tet in the food";

      We re-wrote the text describing Figure 2 to make the results clearer. We clarified in the legend that the symbol signifies p<0.0001 (we were not trying to imply that all experiments had this level of significance, only the ones marked with the symbol in the figure). We removed the word ‘optimize’ from the main text. Optimization was not the true aim of the experiment, and as the reviewer points out, we did not statistically determine an optimal concentration of Tet. Our main goal was to show a dose-dependent response in the number of females surviving on Tet-free medium, which the data support and which does not require statistical support.

      Figure 3b shows 5-8 day old females were used but in the text it's 5-6 day, and it didn't mention the duration of the first crossing and time lag until the second crossing which are critical in such experiments; the conclusion and statistics for Figure 3c among tests with mixed males should also be mentioned.

      We have corrected the figure (now Figure 3c) to indicate that the females were 5-6 days old. The first mating was for 5-6 days and there was no lag time between being co-housed with different males. We have performed multiple new experiments in revision that have been added to Figure 3. We have revised the discussion of these new experiments (and how they relate to the originally performed experiments) in the revised submission.

      The discussion is largely devoted to the merits of SSIMS but misses some key points that might decide how it can be translated into applications or transferred to other species. First, the actual basis for the tTA lethality employed in this study is still unknown, and it is subject to suppression by pre-existing inherent variation in the targeted field population. The same phenomenon may also hold for any gene-overexpression-based lethality, including the EGI lines generated here. Second, the complete penetrance observed from the relatively small sample size here can hardly be used to predict field or mass-rearing conditions. A previous study showed that mutations in such a lethal construct could occur at a frequency of one in 10,000, and a typical SIT program releases millions of sterile insects every week. Third, while the authors claimed SSIMS is "one of the most complex engineered systems in insects", they also proposed that "the genetic design is likely to be portable to other species" without mentioning any potential obstacles along the way. Therefore, efforts should be made to give a full picture of SSIMS, including rain and sunshine.

      We have added discussion of possible failure modes for this genetic biocontrol approach to the discussion section. We have also added text to discuss how the complexity of SSIMS is a potential obstacle to its translation to non-model organisms.

    1. Author Response:

      Reviewer #1 (Public Review):

      This manuscript is a follow-up of an earlier manuscript using the LRET technology, but extends the study by identifying a new "open" state and using experimental distance constraints to provide molecular models of the different states. All in all, the manuscript is well written, the experiments are described in sufficient detail and experiments are done to high quality with the appropriate controls. The data corroborate the partially open state as published earlier, but extend the study to a second, open state. It is very good to see that the observed states are not only present in the catalytic head but the authors also use the full-length protein and find similar states. However, in the present manuscript, I find the conceptual advance with respect to the mechanism of MR somewhat limited. The authors curiously do not include any DNA in their structural studies, so the observed states are only relevant for the free MR complex, but not the complex "in action" bound to DNA where quite different conformations might occur. As one consequence, the structurally proposed states do not directly correlate with the functional nuclease states that are necessarily bound to DNA. Perhaps as a consequence, in the authors' model, Rad50 is merely a gate-keeper for Mre11, but this is not the case as recent structural work shows that Rad50 forms a joint DNA binding surface with Mre11. Likewise, biochemical studies are done with physiologically unclear/less relevant 3' exonuclease activity only, but not with the physiologically important 5' endonuclease activity. In my opinion, it is important for a publication in a journal with the scope of eLife and addressed to a broad audience to provide structural analysis in the presence of DNA and validate the structures using the endonuclease activity.

      We thank the reviewer for these comments.

      Specific recommendations:

      1) Instead of using the physiologically unclear exo activity, I suggest using the more relevant endonuclease activity to validate the mutants.

      We now include plate- and gel-based endonuclease activity assays, using a variety of DNA substrates, for all of the validation mutants. We have expanded Fig. 3 and included a new Supplemental Fig. S4 to show this data. We have expanded the Results section of the modified manuscript to present and discuss these findings.

      2) Since the authors mutated one side of newly identified/proposed salt-bridges, I also suggest testing whether a charge reversal on both sides of the salt bridge rescues the phenotype. I find this important because MR has quite many conformations, and mutating a single residue might not unambiguously validate the proposed conformation; a rescue by a charge-reversed salt bridge is much stronger.

      We thank the Reviewer for this suggested experiment, and we tried to do it. Although we were successful in generating each of the charge reversal mutations in full-length Rad50, all of the mutants unfortunately had issues with either expression or purification. For example, the 6x His-tag for several of the new Rad50 mutants was not accessible to the TEV protease for cleavage indicating that the mutated proteins were mis-folded (the His-tag of the WT full-length Rad50 is readily cleaved off by TEV). As such, we did not feel confident using these proteins in subsequent MR activity assays.

      3) Since all LRET experiments are done without DNA, the authors do not capture relevant DNA processing states and comparison of structural (w/o DNA) and biochemical data (w/ DNA) is not really justified, in my opinion. Also, they might miss critical conformations. Is there a technical reason for not including DNA in the LRET studies?

      We have collected LRET data on ATP-bound MRNBD in the presence of a hairpin DNA or a ssDNA as substrates. We still observe three states in the presence of both DNAs; however, the open conformation appears to be slightly more compact (i.e., closer distance between Rad50NBD protomers) in the presence of ssDNA. As described above, we have added to the Results section of the modified manuscript and included a new figure (Fig. 4) describing these data.

      4) If the authors want to claim processive movement coupled to partially open/open state interchanges, they should provide experimental evidence. Where would the energy come from for such a movement, this is not clear from the model?

      On the surface, ATP hydrolysis by Rad50 would seem to be the perfect source of energy for the conformational changes that drive the sequential and/or processive nuclease functions of the MR complex. However, the D313K mutant is not as good at ATP hydrolysis as the wild type enzyme (Fig. 3E), and the data in Fig. 3 and Supplemental Fig. S4 clearly demonstrate that D313K is by far the best nuclease. If the free energy for the movement does not come from ATP hydrolysis, where else could it come from? Richardson and co-workers measured a release of -5.3 kcal mol⁻¹ (-22.17 kJ mol⁻¹) of free energy for the hydrolysis of a DNA phosphodiester bond (Dickson, K.S. et al. 2000 J. Biol. Chem. 275:15828–15831). Thus, the free energy released from the Mre11 nuclease activity could be the driving force for the conformational changes we propose. We have made this point in the Discussion of the revised manuscript.
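      The unit conversion quoted in this response can be checked with a short sketch. It assumes the thermochemical calorie (1 kcal = 4.184 kJ); the function name is illustrative and not taken from the manuscript. The product is -22.1752 kJ/mol, which matches the quoted figure to within rounding at the second decimal place.

```python
# Conversion factor for the thermochemical calorie (assumed definition).
KJ_PER_KCAL = 4.184


def kcal_to_kj(kcal_per_mol: float) -> float:
    """Convert a molar energy from kcal/mol to kJ/mol."""
    return kcal_per_mol * KJ_PER_KCAL


# Free energy of DNA phosphodiester bond hydrolysis quoted in the response.
delta_g_kj = kcal_to_kj(-5.3)
print(f"{delta_g_kj:.4f} kJ/mol")
```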

      5) The SAXS data for the "open" state do not validate the model, in my opinion. Experimental data and model are not inconsistent, but the curve looks to me as if the open state is perhaps much more flexible (i.e. an ensemble) or extended? Please comment.

      We agree with the Reviewer on this point. We have updated Fig. 5A (original Fig. 4) to include the two-state fits to the experimental SAXS data. Although the multi-state fit to the apo MR SAXS data is better than any of the single model fits (χ² = 1.05 vs. 1.26, respectively), the χ² is still larger than that of the multi-state fits to the ATP-bound MR SAXS data. Thus, an additional unobserved conformation (perhaps the so-called “extended”) might be present in solution for apo MRNBD. We have added a sentence to the revised manuscript with this point.

      To explore the possibility that the previously described “extended” structure might be contributing to the SAXS data, we built a model of the extended conformation of Pf MRNBD based on the Tm MRNBD structure (PDB: 3QG5) and used Rosetta to connect the coiled-coils and add the linker to the Mre11 HLH. When this model was used in the FoXS calculations for the apo SAXS data, the χ² was 4.77 (versus χ² of 1.26 for the “open” model). The MultiFoXS two-state fit gave 90% open + 10% closed (χ² of 1.04), whereas the three-state fit gave 65% open + 20% extended + 15% part open (χ² of 0.84). Thus, there is some improvement when using the extended model, but since that model is not measurable in our LRET experiments and we are unsure of its validity as we have modeled it for Pf MR, we have chosen to omit it from the analysis.

      6) Distance errors for the full complex are much smaller than those for the catalytic module only (Fig. 1d). Does that mean that the full complex is more rigid, please comment?

      From looking at the data presented in Fig. 1D, it is logical to suggest that the full-length complex may be more rigid or better defined by the LRET data. However, we note that there are nearly as many distance errors which are similar between MRNBD and MR as there are MR errors less than MRNBD. And although many are not identical, most are of a similar magnitude. Because of this, we do not think the variations in LRET errors are systematic (i.e., related to a more rigid full-length complex).

    1. assumptions are evident in the thinking that assumes that implied consent will reach the parts that generic consent does not reach; but proponents of specific consent procedures also assume that consent travels beyond the propositions to which it is explicitly and literally given in signing a consent form. Yet strictly speaking, consent (like other propositional attitudes) is not transitive. I may consent to A, and A may entail B, but if I am blind to the entailment I need not consent to B. Consent is said to be opaque because it does not shadow logical equivalence or other logical implications: when I consent to a proposition its logical implications need not be transparent to me. Transitivity fails for propositional attitudes. Consent and other propositional attitudes also do not shadow most causal connections. I may consent to C, and it may be well known that C causes D, but if I am ignorant of the causal link I need not consent to D. Again, transitivity fails for propositional attitudes. When I consent to a proposition describing an intended transaction, neither its logical implications nor the causal links between transactions falling under it and subsequent events need be transparent to me; a fortiori I may not consent to them. Events at Alder Hey illustrate the opacity of consent. Some parents consented to removal of tissue, but objected that they had not consented to the removal of organs, although, of course, organs are composed of tissues. They did not agree that their consent to removal of tissue implied their consent to the removal of organs. As a point of logic the parents were right. These simple facts create a dilemma. The real limits of patient and donor comprehension suggest that it is unreasonable to seek consent for every detail of a proposed treatment, or of a proposed research protocol, or of a proposed use of tissues. 
Yet the logic of propositional attitudes suggests that we cannot simply assume that implied consent will spread from one proposition to another, or from one proposition to the expected consequences of that which it covers, making any further consent unnecessary. There are many ways of skinning this cat. I conclude by sketching one approach that I think plausible.

      propositional attitude SHOULD ONLY BE LEFT PARAGRAPH. Also, there's a bug in the code here.

    Annotators

    1. Then, after speaking with the person about the ways in which they don’t hold privilege, I ask in what ways they do. (I’ll use myself as an example: while I am a woman, dyslexic, and have a chronic medical condition, I ALSO have the privilege of being upper-middle class, living in the United States, holding a graduate degree, having financial resources, and being white.)

      This makes you think about privilege in another light. Although someone may have privilege, they could also have some disadvantages. That isn't a bad thing, but we have to ask ourselves these questions.

    1. A couple of weeks ago I did a mock interview with an executive I’m coaching. One of the interview questions I posed was this: “You have employees, external customers, internal customers (stakeholders or peers), and your boss. Put them in order of priority in terms of serving their needs.” Regardless of the type of company or organization, here’s the answer and why: 1. External customers: The purpose of any company or business is to win and keep customers. Without customers, there’s no business, no shareholder value, and no jobs. Since there are a finite number of customers, in practical terms, they are irreplaceable. They’re always the highest priority. 2. Your boss: Your boss is more important to the success of the company than you and your peers. You may not like hearing that, but in just about every case, it’s true. You may think you’re more competent than your boss and you might even be right. But that doesn’t change the fact that his function incorporates yours and is higher up on the org chart so, by definition, his needs top yours or your peers. 3. Internal customers (stakeholders or peers): Each and every one of you has peers, stakeholders, internal customers whose functions are intertwined with yours and whose needs are important. Marketing folks, for example, should count product groups and sales as their stakeholders. You should make it a priority to meet with them periodically and ask them how you’re doing. Next to paying customers and your boss, their needs matter most. 4. Employees: So, here we are. The dirty little secret no executive, business leader, or manager ever wants to admit. Nevertheless, it’s true. Employees are at the bottom of the totem pole in terms of how important their needs are to their management. That’s all there is to it. Don’t get me wrong. Creating a culture where employees are empowered, challenged, and supported, where they can really make a difference, should be huge for any company. 
But all things being equal, as priorities go, employees come in dead last on that list. Sobering as that sounds, it’s entirely as it should be.

      This really gets to the heart of the matter: it is justifiable that employees are the lowest of the priorities for an executive.

      Based on the article, the priorities are: 1. External customers - They bring money into the company 2. Your boss - They bring money to you 3. Internal customers (stakeholders or peers) - They make things work for external customers and your boss 4. Employees - They are paid to work for the company and are the lowest of the four priorities if you have to stack rank

    1. SciScore for 10.1101/2022.01.30.22270029:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      NIH rigor criteria are not applicable to paper type.

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text-align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Materials: The video-conferencing sessions took place using ZOOM software (Zoom Video Communications, Inc., Version 4.4; https://zoom.us/).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>ZOOM</div><div>suggested: (ZOOM, RRID:SCR_002175)</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Statistical analyses were conducted using GraphPad Prism v.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>GraphPad Prism</div><div>suggested: (GraphPad Prism, RRID:SCR_002798)</div></div></td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">8 and SPSS v.</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>SPSS</div><div>suggested: (SPSS, RRID:SCR_002865)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:


      Limitations and future clinical considerations: The main limitation of this study was also what made it possible - the unexpected circumstance of supporting families during home-confinement orders. It was not possible to randomise groups or complete formal measures of child and parent outcome, and our satisfaction questionnaires needed to be created very quickly. This was a period of great uncertainty, and relative solidarity, where parents seemed open to try new modes of communication and were motivated to keep a sense of continuity for their child’s program, all of which may have impacted their level of participation and satisfaction. It should also be noted that the parents in our study had all met or worked with their therapists in-person prior to being asked to meet online, which may have increased their willingness to take part in the new approach.1,30 This study did not examine whether a family without previous experience in early intervention would have the same level of engagement with the remote delivery of services. In considering ideal sessions frequency, the current study did not compare parent experience between varying durations of sessions (30 vs 60 vs 90 minutes), which would be important to take into account in future research. The COVID-19 pandemic disrupted our intervention program for children on the autism spectrum, forcing us to re-think our service provision model and giving us the chance to experience very frequent interactions with the families. It r...


      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We found bar graphs of continuous data. We recommend replacing bar graphs with more informative graphics, as many different datasets can lead to the same bar graph. The actual data may suggest different conclusions from the summary statistics. For more information, please see Weissgerber et al (2015).


      Results from JetFighter: We did not find any issues relating to colormaps.

      Results from rtransparent:


      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.

      Results from scite Reference Check: We found no unreliable references.


      <footer>

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

      </footer>
    1. analysis is to make us better producers of persuasion, the immediate purpose here is to see the tools available for analysis, as this brief consideration of two opposing audiences illustrates. LOGOS The third kind of proof, according to Aristotle, that rhetors may use to appeal to their audiences is logos. You may readily associate the term with “logic,” and while there is some reason for doing so, we shouldn't think too narrowly about logic when conceiving logos as a mo

      The most interesting/helpful idea here is logos = logic. It's not about emotion or facts; it's using logic to inform the audience/reader. The four parts help me understand logos the best. They are the claim, data to support it, a warrant connecting the data to the claim, and backing. This is a solid structure to follow when using logos.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      The manuscript reports the identification of a novel protein complex involved in denervation-induced desmin degradation. The first protein to be identified was the ATPase Atad1. A clever isolation strategy was based on the fact that the ATPase p97/VCP is involved in the extraction of ubiquitinated myofibrillar proteins but is not required for the removal of ubiquitinated desmin filaments. The authors reasoned that a related ATPase might be specifically required for desmin filaments. Atad1 was identified by treating desmin filaments with a nonhydrolyzable ATP analog and looking for ATPases that are associated with desmin filaments by proteomics. Knockdown of Atad1 causes a loss of desmin degradation and led to a loss of denervation-induced muscle atrophy. It seems that Atad1 binds desmin in a phosphorylation-dependent manner, although the binding may be mediated by a protein that hasn't yet been identified. The authors went on and identified two additional proteins which together with Atad1 form a protein complex involved in recruiting calpain for desmin degradation. Overall, this study is very convincing, providing novel important insight. I have only some minor comments.

      Minor comments

      1. I wondered whether Atad1 is expressed at higher-than-normal levels in muscle and heart. I looked that expression pattern up and it seems that it is especially abundant in muscle and heart, expressed at lower levels in smooth muscle, and overall has a restricted expression.

      We have now analyzed ATAD1 levels in various tissues by Western blotting, and the new data are presented as Fig. S2. ATAD1 is present in many tissues and thus may have many cellular roles.

      Maybe you have some data on their expression in muscle tissue. Did you perform some staining of muscle tissue at baseline and after denervation with regard to the protein localization by immunostaining?

      The new associations between ATAD1 and its protein partners reported herein were further validated by an immunofluorescence staining of longitudinal sections from 7 d denervated muscles and super-resolution Structured illumination microscopy (SIM). The new data presented as Fig. 3E demonstrate colocalization of ATAD1 with calpain-1, PLAA and UBXN4. To confirm that these proteins in fact colocalize, we measured the average colocalization of ATAD1 with calpain-1, PLAA and UBXN4 using the spots detection and colocalization analysis of the Imaris software (Fig. 3E). Only spots that were within a distance threshold of less than 100 nm were considered colocalized (Fig. 3E, graph).
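      The distance-threshold criterion described in this response can be illustrated with a minimal sketch. It is not the Imaris algorithm, whose internals are not described here; it only assumes that each detected spot is reduced to a 3D centre coordinate in nanometres and that a spot counts as colocalized when at least one spot of the other channel lies closer than the threshold. The function name and the coordinates are hypothetical.

```python
import math


def colocalized_fraction(spots_a, spots_b, threshold_nm=100.0):
    """Fraction of channel-A spots with a channel-B spot closer than threshold_nm."""
    if not spots_a:
        return 0.0
    hits = sum(
        1
        for a in spots_a
        if any(math.dist(a, b) < threshold_nm for b in spots_b)
    )
    return hits / len(spots_a)


# Hypothetical spot centres (x, y, z) in nm, made up for illustration only.
atad1_spots = [(0.0, 0.0, 0.0), (500.0, 500.0, 0.0), (1000.0, 0.0, 200.0)]
calpain_spots = [(60.0, 0.0, 0.0), (500.0, 650.0, 0.0), (1000.0, 50.0, 200.0)]

fraction = colocalized_fraction(atad1_spots, calpain_spots)
print(f"{fraction:.2f} of ATAD1 spots have a calpain-1 spot within 100 nm")
```

      A brute-force pairwise search is adequate for a sketch; for the spot counts of a real image a spatial index would be the usual choice.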

      1. The string data presented in Figure 3C needs some further explanation with regard to the colors used for the different proteins. While the authors explained the meaning of the proteins labeled in red, there is no explanation for the other colors.

      These were arbitrary colors assigned to protein nodes by the STRING database. The current color code we use is only meant to group the UPS enzymes based on function (e.g. E2s, E3s, DUBs etc). This information has now been added to figure legend.

      1. Molecular weights in Fig. 2E, 3D need to be 'repaired' and additional MW information is required in the case of the ubiquitin blot shown in 3D.

      All molecular weight values and protein ladders have been added.

      1. Fiber size distributions shown in Fig. 1D and 4F. Have the differences been statistically tested?

      We thank the reviewer for raising this important point because we just established an approach to quantitate these effects statistically using the Vargha-Delaney A-statistics test and the Brunner-Munzel test. Our new paper on this topic entitled “A semi-automated measurement of muscle fiber size using the Imaris software” by Gilda et al. was recently published in AJP Cell Physiol. As requested by the reviewer, we now also apply the A-statistics test and the Brunner-Munzel test to the fiber size measurements presented in our current manuscript (Figs. 1C, 4F and Table I), which show a significant difference in size distributions of fibers expressing shAtad1 vs. adjacent non-transfected fibers. As indicated in our paper (Gilda et al, 2021), the A-statistics is a direct measure of the fiber size effect, and it shows significant beneficial effects on cell size by shAtad1 (Table I). Such effects can be simply missed by traditional measurements of median, average, and Student’s t-test.
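      For readers unfamiliar with the A-statistic, the following is a minimal sketch of the standard Vargha-Delaney estimate, A = P(X > Y) + 0.5 P(X = Y), computed over all pairs of observations. It is not the authors' pipeline, and the fiber areas are made-up values used only to show the calculation. A value of 0.5 means no tendency for either group to be larger; values near 1 mean the first group is almost always larger.

```python
from itertools import product


def vargha_delaney_a(group_x, group_y):
    """Estimate A = P(X > Y) + 0.5 * P(X = Y) from all cross-group pairs."""
    if not group_x or not group_y:
        raise ValueError("both samples must be non-empty")
    wins = 0
    ties = 0
    for x, y in product(group_x, group_y):
        if x > y:
            wins += 1
        elif x == y:
            ties += 1
    return (wins + 0.5 * ties) / (len(group_x) * len(group_y))


# Hypothetical fiber cross-sectional areas (square micrometres), illustration only.
sh_atad1_fibers = [1850, 2100, 1990, 2300, 2050]
non_transfected_fibers = [1500, 1720, 1990, 1600, 1810]

a_hat = vargha_delaney_a(sh_atad1_fibers, non_transfected_fibers)
print(f"A = {a_hat:.2f}")
```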

      1. For my taste the referral to the individual data (Fig. numbers) in the discussion section is too detailed and becomes a second results section. This should be substituted by a summary paragraph before the implications are discussed.

      We agree and revised the discussion section accordingly.

      1. The summary slide is very good. However, could you please add information, which protein of the three in the Atad1 complex is depicted by each symbol?

      The model slide has been revised to include all enzymes studied in this paper, and a legend to improve clarity.

      Reviewer #1 (Significance)

      Novel insight into the proteins involved in desmin filament degradation. Since this is an important subject both in muscle and heart and plays an important role in muscle and heart disease, it is of significant clinical importance. Currently it has only been implicated in denervation-induced skeletal muscle atrophy, but it is likely that desmin filament metabolism is also similarly regulated in the heart.

      I am a researcher mainly focusing on cardiac biology with some expertise also on muscle, however no specific knowledge about desmin filament biology.

      Referee Cross-commenting

      Overall, I think all three reviewers agree that this is a significant and important paper. I think that the comments made by the reviewers are fair and probably add to the quality of the manuscript.

      We are pleased that the reviewers found our paper novel and important.

      Thus, both myself and reviewer 2 agree that it would be useful to visualize Atad1 and partners localization in muscle fibers by immunofluorescence. These data would provide independent support to the model the authors are proposing, which currently is only based on biochemical analysis.

      These data have been added as new Fig. 3E.

      I also support the proposed use of proximity ligation to provide further evidence of the presence of the Atad1, Ubxn4 and PLAA in a complex. However, this experiment depends on the quality of the available antibodies and I would consider this not absolutely required.

      Because our antibodies are not suitable for proximity ligation assay (PLA), we used a super-resolution SIM microscope, immunofluorescence, and the spots detection and colocalization analysis of the Imaris software to confirm colocalization of ATAD1 and its partners (new Fig. 3E). Similar to PLA (where signal is generated only if two antibodies used for staining are 100nm apart), only spots that were within a distance threshold of less than 100 nm were considered colocalized (Fig. 3E, graph). In addition, we present immunoprecipitation (Fig. 3D) and use three independent mass spectrometry-based proteomic approaches to validate these new associations.

      I also agree that some further information on the proteomics data (as suggested by reviewer 3) is required with regard to how the filtering for UPS components was performed.

      We agree and thank the reviewer for this comment. More information on the proteomics data has been added to the text and to the legend of Table II.

      The proposed request for further information on the electroporation approach is a valid comment and if the authors have this information, it would be good to provide. However, I do not recommend further experiments as overall the data are very consistent and the findings are very significant and represent a major advance in our understanding of desmin degradation.

      With regard to the electroporation approach, i) representative images have been added to Figs. 1C and 4F, ii) a statement was added to Methods under “in vivo electroporation” about the percent of transfection routinely used in our experiments (60-70%), iii) we determine transfection efficiency by dividing the number of transfected fibers (which also express GFP) by the total number of fibers in the same muscle cross section (using the Imaris software). This approach was fully validated in our recent papers by Goldbraikh et al EMBO Rep, 2020 (see supplementary material) and Gilda et al AJP-Cell Physiol, 2021.

      Reviewer #2 (Evidence, reproducibility and clarity)

      In their manuscript the authors show the involvement of the AAA ATPase Atad1 in Desmin degradation. They identify PLAA and Ubxn4 as partners of Atad1 that participate in its function in desmin degradation. A general comment is that some conclusions are overstated. The authors mention several times that Atad1 depolymerises desmin filaments. The data show that Atad1 participates in the degradation of Desmin and in its solubilization. "Depolymerisation" should be kept for the model presented in figure 8 but not used in the result section.

      We respectfully disagree with the reviewer that our conclusions are overstated. Early studies from Fred Goldberg’s group showed that filaments are not accessible to the catalytic core of the proteasome (Solomon and Goldberg, JBC, 1996), and therefore must depolymerize before degradation. Accordingly, more recent studies by us and others identified distinct enzymes and cellular steps promoting disassembly and subsequent degradation of ubiquitinated desmin filaments (Cohen, JCB, 2012; Aweida, JCB, 2018) and myofibrils (Cohen, JCB, 2009; Volodin, PNAS, 2017). In the current manuscript, we employed a similar approach as we used before to analyze disassembly of filamentous myofibrils by p97/VCP (Volodin, PNAS, 2017), and demonstrate a critical role for ATAD1-PLAA-UBXN4 complex in promoting desmin IF disassembly and loss (figures 2C, 3D, 3G, 4C, 4G, 4H). We show that ATAD1 binds intact insoluble desmin filaments in an early phase during atrophy (3 d after denervation)(figures 2B, 2F) and later accumulates in the cytosol bound to soluble ubiquitinated desmin (figure 3D). Moreover, downregulation of ATAD1, PLAA or UBXN4 in mouse muscles prevents the solubilization of desmin IF (figures 2C, 3G, 4C) because in these muscles desmin accumulates as ubiquitinated insoluble filaments. Based on these data we conclude that Atad1 complex promotes desmin IF disassembly and subsequent loss.

      Major comments:

      1) It would be useful to visualize the localization of Atad1 and its partners in muscle fibers by immunofluorescence. Do they colocalize with desmin filaments, with calpain?

      As requested, the new associations between ATAD1 and its protein partners reported herein were further validated by an immunofluorescence staining of longitudinal sections from 7 d denervated muscles and super-resolution Structured illumination microscopy (SIM). The new data presented as Fig. 3E demonstrate colocalization of ATAD1 with calpain-1, PLAA and UBXN4. To confirm that these proteins in fact colocalize, we measured the average colocalization of ATAD1 with calpain-1, PLAA and UBXN4 using the spots detection and colocalization analysis of the Imaris software (Fig. 3E). Only spots that were within a distance threshold of less than 100 nm were considered colocalized (Fig. 3E, graph). Given the antibodies in hand and new ones that we purchased, as well as the species of the antibodies, we were able to perform and optimize the staining only for the presented combinations of antibodies.

      2) In the same line, interactors were obtained from large crosslinked complexes. It would make the model more convincing if direct interactions with Atad1 were shown, for example using Proximity Ligation Assays.

      Because our antibodies are not suitable for proximity ligation assay (PLA), we used a super-resolution SIM microscope, immunofluorescence, and the spots detection and colocalization analysis of the Imaris software to confirm colocalization of ATAD1 and its partners (new Fig. 3E). Similar to PLA (where signal is generated only if two antibodies used for staining are 100nm apart), only spots that were within a distance threshold of less than 100 nm were considered colocalized (Fig. 3E, graph). In addition, we present immunoprecipitation (Fig. 3D) and use three independent mass spectrometry-based proteomic approaches to validate these new associations.

      3) Evaluation of atrophy is made on cross-sections of muscles electroporated with shRNAs. Histology pictures should be shown.

      As requested, representative images of transfected muscles were added to figures 1C and 4F.

      4) What is the percentage of electroporated fibers? To evaluate the effect of shRNAs it is important to have this information. For example, if the efficiency is 50% it means that the reduction in expression of the target in electroporated fibers is twice the value reported for the whole muscle. Alternatively, immunofluorescence could be provided to see the decrease in targeted proteins in electroporated fibers.

      We determine transfection efficiency by dividing the number of transfected fibers (which also express GFP) by the total number of fibers in the same muscle cross section (using the Imaris software). This approach is fully validated in our recent papers by Goldbraikh et al EMBO Rep, 2020 (see supplementary material) and Gilda et al AJP-Cell Physiol, 2021. For our biochemical studies we always analyze muscles that are at least 60-70% transfected (added to methods).

      As shown in figures 1B, 3F, and 4A-B, our shRNAs reduced gene expression by at least 40-50%, which in a whole muscle was sufficient to promote the beneficial effects on muscle (as mentioned in the text, shCAPN1 was validated in Aweida, JCB, 2018). A similar reduction in gene expression is commonly seen after in vivo electroporation of fully developed mouse muscles because transfection efficiency is never 100%. This means that the beneficial effects of the electroporated shRNA on muscle must underestimate the actual protective effects of gene downregulation. To prove that these beneficial effects on muscle result from specific gene downregulation, we compare and analyze in parallel in each experiment muscles transfected with the shLacZ scrambled control.
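
      The arithmetic behind this exchange can be sketched as follows. The efficiency calculation follows the definition given above (transfected fibers divided by total fibers); the per-fiber estimate follows the reviewer's reasoning and rests on the assumption that non-transfected fibers retain full expression and that all fibers contribute equally to the whole-muscle signal. The function names and the fiber counts are hypothetical.

```python
def transfection_efficiency(n_transfected, n_total):
    """Fraction of fibers in a cross section that are transfected (GFP-positive)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_transfected / n_total


def knockdown_in_transfected_fibers(whole_muscle_knockdown, efficiency):
    """Estimate the knockdown within transfected fibers from a whole-muscle value.

    Assumes non-transfected fibers keep 100% expression and every fiber
    contributes equally to the measured signal. Capped at 1.0 (complete loss).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return min(whole_muscle_knockdown / efficiency, 1.0)


# Hypothetical counts: 130 GFP-positive fibers out of 200 in one cross section.
efficiency = transfection_efficiency(130, 200)
# A 45% whole-muscle reduction at this efficiency.
per_fiber = knockdown_in_transfected_fibers(0.45, efficiency)
# The reviewer's example: 50% efficiency doubles the whole-muscle value.
reviewer_case = knockdown_in_transfected_fibers(0.40, 0.50)

print(f"efficiency = {efficiency:.2f}")
print(f"estimated knockdown in transfected fibers = {per_fiber:.2f}")
print(f"reviewer's example = {reviewer_case:.2f}")
```

      Under these assumptions, a 40-50% whole-muscle reduction at 60-70% efficiency corresponds to a larger reduction within the transfected fibers themselves, which is consistent with the statement that whole-muscle measurements underestimate the effect.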

      5) The same is true for all the experiments quantifying the effect of shRNAs in western blot. Since quantifications are probably made on whole muscles (ie a mix between electroporated and non electroporated fibers) and since the percentage of electroporated fibers is not given it is not possible to estimate the efficiency of the shRNAs in electroporated fibers.

      As mentioned above and now also in the text, for our biochemical studies we always analyze muscles that are ~60-70% transfected. This methodology is very well established in our lab, and a reduction of 40-50% in gene expression by our shRNAs is sufficient to promote the beneficial effects on mouse muscle (see our papers in JCB, PNAS, Nat Comm, EMBO rep).

      6) Figure 2C: by decreasing solubilization of desmin, one would expect a decrease in the levels of soluble desmin. Conversely the authors observe an increase in both insoluble and soluble desmin. Of course, this can be explained by reduced desmin degradation once solubilized but this should be demonstrated at least by showing that UPS inhibitors induces an increase in soluble ubiquitinated Desmin.

      The reviewer raises an important point that we now discuss in the text. The soluble pool of desmin, its homolog vimentin, and other type III IF proteins is small, as these proteins mostly exist in the cell assembled within filaments (see papers by RA Quinlan and WW Franke). This soluble pool of desmin may function either as precursors to the mature filament or as components released during filament turnover. Because we block desmin IF disassembly by downregulating Atad1, the soluble desmin that accumulates in the cytosol likely represents new precursors whose degradation also requires ATAD1. Therefore, we conclude that ATAD1 promotes degradation of desmin filaments and of soluble proteins (see also figures 2E and 4D).

      As requested by the reviewer, we inhibited proteasome activity by injecting mice with bortezomib and measured the effects on desmin content in denervated muscle (new figure 2D). Our new data clearly demonstrate accumulation of ubiquitinated desmin in atrophying muscles where proteasome activity was inhibited, indicating that in denervated muscles desmin is degraded by the proteasome.

      7) Figure 2E: the levels of Atad1 in the insoluble fraction seem to be the same in the shLacZ and GSK3DN conditions, whereas the phospho-Ser is different. In other words, there should be more Atad1 in the insoluble fraction with shLacZ than with GSK3DN since the phosphorylation level with shLacZ is significantly higher.

      To quantitate the changes in ATAD1 association with desmin and avoid confusion by the reader, we performed densitometric measurements of ATAD1 and desmin, and depict in a graph the ratio of ATAD1 to desmin in the insoluble fraction. The new data were added to figure 2F and clearly demonstrate that ATAD1 association with desmin is significantly reduced in muscles expressing GSK3b-DN. These findings further support our conclusion that Atad1 association with desmin IF requires desmin phosphorylation.

      8) Figure 4E: the authors state that phosphorylation decreases because of increased degradation (lanes 6-8). However, Calpain also increases degradation and phosphorylation is increased (lanes 2-4), so increasing degradation does not systematically cause a decrease in phosphorylation. Similarly, lane 5 Atad1 induces less degradation than Calpain, however, it causes a decrease in phosphorylation. Explain.

      Here we use a cleavage assay, which was established and validated in our recent JCB paper (Aweida 2018). Desmin filaments were isolated from mouse muscle and the obtained preparation was divided between 9 tubes (hence there is no situation for “increase in phosphorylation” as indicated by the reviewer). Recombinant calpain-1 was then added to the tubes and cleavage of phosphorylated desmin was analyzed over time. Because the substrate for calpain-1 is phosphorylated desmin, we measured the content of both desmin and its phosphorylated form in the tube throughout the duration of the experiment. Only when cleavage of phosphorylated desmin by calpain-1 was accelerated (i.e., in the presence of Atad1), a rapid reduction in the amount of phosphorylated desmin could be detected (compare lanes 6-8 with 5) concomitantly with accumulation of small desmin fragments in short incubation times (compare lanes 6-7 with 2-3).

      With respect to the reviewer’s comment that “Atad1 induces less degradation than Calpain” in lane 5, please note that Atad1 is not a protease and cleavage of desmin occurs in this experiment only in the presence of calpain-1. However, if there is a slight reduction in phosphorylated desmin, it likely reflects the ability of ATAD1 to slowly disassemble desmin IF (as our in vivo data with shATAD1 show).

      9) The AAA ATPase VCP shares partners with Atad1 and is involved in muscle atrophy. It would greatly add to the manuscript if the authors inhibited VCP to compare its effect to Atad1

      As stated in the text, we previously demonstrated that p97/VCP is not required for desmin filament loss: “the AAA-ATPase, p97/VCP disassembles ubiquitinated filamentous myofibrils and promotes their loss in muscles atrophying due to denervation or fasting (Piccirillo and Goldberg, 2012; Volodin et al., 2017). However, desmin IF are lost by a mechanism not requiring p97/VCP (Volodin et al., 2017). We show here that their degradation requires a distinct AAA-ATPase, ATAD1”. Therefore, our current studies were undertaken to specifically identify the AAA-ATPase that is involved in desmin filament disassembly and loss. Accordingly, p97/VCP was not detected by our mass spectrometry-based proteomic analyses presented here (stated in the discussion).

      We did identify PLAA and UBXN4 as ATAD1 partners and show they are required for desmin loss, and therefore state in the text that “PLAA and UBXN4 are also known cofactors for p97/VCP (Liang et al., 2006; Papadopoulos et al., 2017), a AAA-ATPase that was not in our datasets, indicating that p97/VCP adaptors can bind and function with other AAA-ATPases”.

      Minor comments:

      1) The soluble fraction contains a large number of ubiquitinated proteins. Please explain how it can be stated that an increase in total soluble polyubiquitinated proteins corresponds to an increase in ubiquitinated desmin.

      We do not state in the text that “an increase in total soluble polyubiquitinated proteins corresponds to an increase in ubiquitinated desmin”. We state that “stabilization of desmin filaments attenuates overall proteolysis. The reduced structural integrity of desmin filaments on denervation is likely the key step in the destabilization of insoluble proteins (e.g. myofibrils) during atrophy, leading to the enhanced solubilization and degradation in the cytosol”. We invite the reviewer to read our papers about this topic by Cohen 2012, Volodin 2017, and Aweida 2018. Using a dominant negative of desmin polymerization we show that disassembly of desmin filaments is sufficient to trigger myofibril destruction and consequently overall proteolysis (because myofibrils comprise ~70% of muscle proteins).

      2) Page 11: the authors conclude that denervation enhance the interactions with Atad1. Figure 3D indeed show an increase for Ubxn4, but it is not clear for the other proteins.

      Figure 3D shows that in 7 d denervated muscles there is an increase in associations between ATAD1 and ubiquitinated desmin, UBXN4, PLAA and calpain-1.

      3) Figure 4 F: show muscle sections

      A representative image was added as requested.

      4) Page 21 in vivo transfection: it is stated "see details under immunofluorescence" but there is no immunofluorescence section in materials and methods.

      Thank you. An immunofluorescence section has been added to Methods.

      5) The authors show that Atad1 inhibition in innervated muscle is sufficient to induce muscle hypertrophy (Figure 4E). They conclude that the hypertrophic effect of Atad1 is due to the inhibition of Desmin degradation. However, this hypertrophic effect could be independent of the action of Atad1 on Desmin.

      We believe the reviewer refers to figure 4F-H, where we show that downregulation of ATAD1 prevents the basal turnover of desmin and of soluble proteins and causes muscle fiber growth. Based on this data we speculate in the text that “ATAD1 attenuated normal muscle growth most likely by promoting the loss of desmin filaments and of soluble proteins … Thus, ATAD1 seems to function in normal postnatal muscle to limit fiber growth, and suppression of its activity alone can induce muscle hypertrophy”. We agree with the reviewer that in addition to these beneficial effects on desmin and soluble proteins, ATAD1 downregulation may contribute to muscle growth by additional mechanisms.

      Reviewer #2 (Significance)

      This is new information in the field since calpain cannot hydrolyze desmin insoluble filaments and that the mechanisms that give calpain access to desmin are not known.

      The authors already made important contribution in the study of muscle atrophy and especially in desmin degradation. This work constitutes a new advance in their attempts to understand the molecular mechanisms leading to desmin degradation and muscle atrophy.

      Audience: desmin is the main intermediate filament in skeletal muscle. This work will therefore interest scientists working on skeletal muscle.

      Expertise of the reviewer: molecular and cellular biology of skeletal muscles, muscle atrophy.

      Referee Cross-commenting

      I fully agree with reviewer 1.

      Reviewer #3 (Evidence, reproducibility and clarity)

      Summary:

      The manuscript by Aweida & Cohen introduces a novel complex formed by the AAA-ATPase ATAD1 and its interacting partners PLAA and UBXN4 as initiator of calpain-1-mediated disassembly of ubiquitylated desmin intermediate filaments (IF) during muscle atrophy. The authors use a denervation model of murine tibialis anterior muscles as their main resource for experimentation. They apply a kinase trap-assay and co-immunoprecipitation method followed by mass spectrometry as starting point for identifying novel interactors of desmin IF (Aweida et al. 2018 in JCB). They continue to analyze their candidates using immunoblotting, co-immunoprecipitation, shRNA-mediated intramuscular knock-down, gel filtration, mass spectrometry, and enzyme assays. In their experiments, the authors show an accumulation of ATAD1 in the insoluble desmin filament fraction of denervated muscle fibers together with an increase in ubiquitylation of desmin filaments. Both proteomics experiments of size-exclusion chromatography of denervated muscles and ATAD1 immunoprecipitation identify several components of the ubiquitin-proteasome system as novel interactors of ATAD1, that are also bound to insoluble desmin filaments after muscle denervation. Following additional co-immunoprecipitation and knock-down experiments, the authors confirm PLAA and UBXN4 as novel cofactors of Atad1 that help in extracting previously GSK3-β-phosphorylated and TRIM32-ubiquitylated (Aweida et al. 2018 in JCB, Volodin et al. 2017 in PNAS) desmin from desmin IF. The authors further show that ATAD1 encourages calpain-1-dependent proteolysis of soluble desmin after extraction from the desmin IF in an in vitro enzymatic proteolysis assay.

      Major comments:

      The authors present clear and convincing arguments from in vivo and in vitro experiments for their proposed model of ATAD1/PLAA/UBXN4-aided calpain-1-mediated proteolysis of desmin IF.

      In my opinion, no additional experimental evidence is essential to underlining their statement.

      Data and methods are presented clearly and understandably to allow for the reproduction and the reapplication of the utilized methods for verifying the presented data and analyzing complementary aspects in a similar fashion.

      A concern is with the presentation of mass spectrometry results, particularly regarding Table I: I am wondering whether the presented UPS components were the only proteins found in the proteomics screens or whether any filtering has taken place to only show UPS components in this manuscript. If so, please note the total number of proteins identified in the respective proteomics analyses and explain how filtering for UPS components was performed. This comment goes in line with the first minor comment on Figure 1A, see below.

      We thank the reviewer for this valuable comment, as it helps clarify a point that was not completely lucid in the previous version of this manuscript. Because our paper focuses on protein degradation, we extracted from our datasets only UPS components that were identified with ≥ 2 unique peptides using DAVID annotation tool-derived categories (Table II). Column 1 includes UPS components that were co-purified with ATAD1 by size exclusion chromatography (SEC) (20 out of 427 total proteins), and column 2 includes UPS components that were co-purified with ATAD1 by immunoprecipitation from muscle homogenates (17 out of 592 total proteins). These two proteomics experiments were oriented specifically towards identifying ATAD1-binding partners. To further validate our observations, we compared these lists of ATAD1-interacting components to our previous kinase-trap assay dataset (Aweida 2018, 1552 total proteins were identified) and included in column 3 only the proteins that overlapped with the other two proteomics approaches. The kinase trap assay was used to identify proteins that utilize ATP for their function and act on desmin, and as mentioned in the text, ATAD1 was one of the most abundant proteins in the sample. Of note is UBXN4, which was identified only by our kinase trap assay, and accumulated on desmin after denervation. These interactions between active enzymes in vivo must be transient and very dynamic, hence using three approaches did not identify the exact same subset of putative adaptors (see “discussion”). These points are now further elaborated in the text and the legend for Table II.

      The relatively small number of individuals analyzed per experiment is owing to the limiting nature of mouse research and therefore acceptable. The observed alignment of the individual results is commendable, underlines the experimenter's ability, and strengthens the reached conclusion of the study.

      We thank the reviewer for this comment.

      Minor comments:

      Figure 1A seems redundant, since the experimental approaches are described in the text and the Venn diagram does not integrate the identification of ATAD1 into the setting of the conducted screens, e.g. by showing how many additional proteins were identified in these two screens before the authors tended to their candidate ATAD1.

      We agree and therefore removed Fig. 1A.

      Word order mistake on page 6 in the sentence: "To test whether Atad1 is important for atrophy, we suppressed...".

      Corrected.

      Figure 1D: statistical analysis of the significance of the fiber area difference missing

      Statistics for these effects are now included in new Table I. We quantitated the effects statistically using the Vargha-Delaney A-statistic and the Brunner-Munzel test, based on our recent methodology paper in AJP Cell Physiol: “A semi-automated measurement of muscle fiber size using the Imaris software” (Gilda et al. 2021). The new statistical analyses show a significant difference in size distributions of fibers expressing shAtad1 vs. adjacent non-transfected fibers (Table I). As indicated in our paper (Gilda et al, 2021), the A-statistic is a direct measure of the fiber size effect.
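
      For readers unfamiliar with the A-statistic, a minimal sketch of its standard definition follows: the probability that a value drawn from one group exceeds a value drawn from the other, with ties counted as one half. This is not the code used in Gilda et al. 2021; the function name and the fiber areas are hypothetical.

```python
def vargha_delaney_a(x, y):
    """Vargha-Delaney A: P(X > Y) + 0.5 * P(X == Y), estimated over all pairs.

    A = 0.5 means no effect; values near 1 mean x tends to be larger than y.
    """
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for xi in x for yi in y if xi > yi)
    ties = sum(1 for xi in x for yi in y if xi == yi)
    return (greater + 0.5 * ties) / (len(x) * len(y))


# Hypothetical cross-sectional areas (in square micrometres); not measured data.
transfected = [1200, 1500, 1700, 1900]
non_transfected = [1000, 1200, 1400, 1600]

a_stat = vargha_delaney_a(transfected, non_transfected)
print(f"A = {a_stat:.5f}")
```

      Here 12 of the 16 pairs favour the transfected group and one pair is tied, giving A = 12.5/16 = 0.78125, which would indicate that transfected fibers tend to be larger in this toy input.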

      Figure 2A: desmin ubiquitylation is not shown in these samples by immunoblotting against (poly-)ubiquitin, but only by the identification of high molecular weight bands of the desmin blot. I wonder about the specificity of the desmin antibody in this case and about the manner of sample extraction/isolation for this particular blot, as a detailed description is missing. There seems not to have been any muscle tissue fractionation beforehand, if I am correct?

      This blot presents an analysis of desmin filaments isolated from mouse muscle, which are purified with associated proteins. In order to specifically detect ubiquitinated desmin filaments we must use a specific desmin antibody (antibody and methodology are validated in Cohen 2012 JCB, Volodin 2017 PNAS, and Aweida 2018 JCB). An antibody against ubiquitin conjugates will detect all proteins that are ubiquitinated in this insoluble preparation (e.g. proteins that bind desmin).

      Orthography mistake "demin" instead of "desmin" on page 7 in sentence "It is noteworthy that the amount of ubiquitinated demin..."

      Corrected.

      Figure 3C: image quality is insufficient; some protein names are rather difficult to decipher

      The figure has been revised to improve clarity.

      Word missing on page 13 in sentence "In addition, by 10 minutes of incubation, phosphorylated ... due to their processive cleaveage by calpain-1 ..."

      We thank the reviewer for reading the paper thoroughly and carefully. The missing word was added to the text.

      Figure 4F: statistical analysis of the significance of the fiber area difference missing

      Statistics are now included in new Table I. As mentioned above, we quantitated the effects statistically using the Vargha-Delaney A-statistic and the Brunner-Munzel test, based on our recent methodology paper in AJP Cell Physiol: “A semi-automated measurement of muscle fiber size using the Imaris software” (Gilda et al. 2021).

      "ug" on page 21 in "Briefly, 20ug of plasmid DNA..." is probably supposed to be "µg". In general, please be aware of correct unit declaration and space character usage before units.

      Corrected.

      Please be aware of the usage of correct nucleic acid and protein nomenclature and style: When referring to gene or transcript levels mark the candidate characters in italic, e.g. Atad1 mRNA levels, shUbxn4, versus ATAD1 protein etc. In addition, please be aware to use the correct gene and protein name styles: e.g. shCapn1 instead of shCAPN1 for shRNA targeting the murine Capn1 transcript in Figure 4 in comparison to CAPN1 the protein. Helpful link: https://www.biosciencewriters.com/Guidelines-for-Formatting-Gene-and-Protein-Names.aspx

      We thank the reviewer for this comment. The nomenclature for all genes and proteins has been revised accordingly.

      Reviewer #3 (Significance)

      Aweida & Cohen present evidence for the involvement of the AAA-ATPase ATAD1 not only in regulation of synaptic plasticity and the extraction of mislocalized proteins from the mitochondrial membrane, but also in a collaboration with the ubiquitin-binding proteins PLAA and UBXN4 in the disassembly of desmin intermediate filaments in muscle atrophy. The authors compare this newly discovered function of the AAA-ATPase ATAD1 to the numerous functions of the AAA+ ATPase p97/VCP and raise compelling arguments for their statement. Previously, E3 ligases that ubiquitylate sarcomere components in muscle atrophy have been identified, such as MuRF1 (Bodine et al. 2001 in Science) and TRIM32 (reviewed in Bawa et al. 2021 in Biomolecules), but the complete extraction mechanism of monomers from the diverse macromolecular fibrillary structures in muscle has been lacking.

      Both researchers of general proteostasis mechanisms, in particular their impact on muscle function and metabolism, and medical researchers investigating therapeutic roads may appreciate the authors' work. This study opens up various roads to follow with complementing investigations on the many functions of the UPS in the regulation of muscle fiber architecture and functionality.

      I am working on proteostasis and particularly the UPS. I have a long-standing track record on muscle assembly mechanisms, the regulation of E3 ligases and p97/VCP functions.

    1. Background

      Reviewer 2. Dean Giustini This is a well-written manuscript. The methods are well-described. I've confined my comments to improving the reporting of your methods, some comments about the paper's structure, and a few about the readability of the figures and tables (which I think in general are too small, and difficult to read). Here are my main comments for your consideration as you work to improve your paper:

      1) Title of manuscript - the title of your paper seems inadequate to me, and doesn't really convey its content. A more descriptive title that includes the idea of the "first wave" might be useful from my point of view as a reader who scans titles to see if I am interested. I'd recommend including words in the title that refer to your methods. What type of research is this - a quantitative analysis of citations? Title words say a lot about the robust nature of your methods. As you consider whether to keep your title as is, keep in mind that title words will aid readers in understanding your research at a glance, and provide impetus to read your abstract (and one hopes the entire manuscript). These words will help researchers find the paper later as well via the Internet's many search engines (e.g., Google Scholar).

      2) Abstract - The abstract is well-written. Could the aims of your research be more obvious? and clearly articulated? How about using a statement such as "This research aims to" or similar? I also don't understand the sentence that begins with "Using references as a readout". What is meant by a "readout" in this context? Do you mean to read a print-out of references later? Lower down, you introduce the concept of Wikipedia's references as a "scientific infrastructure", and place it in quotations. Why is it in quotations? I wondered what the concept was on first reading it. A recurring web of papers in Wikipedia constitutes a set of core references - but would I call them a scientific infrastructure? Not sure; they are a mere sliver of the scientific corpus. Not sure I have any suggestions to clarify the use of this phrase.

      3) Introduction - This is an excellent introduction to your paper, and it provides a lot of useful context and background. You make a case for positioning Wikipedia as a trusted source of information based on the highly selective literature cited by the entries. However, I would only caution that some COVID-19 entries cite excellent research but the content is contested, and vice versa. One suggestion I had for this section was the possibility of tying citizen science (part of open science) to the rise of Wikipedia's medwiki volunteers. Wikipedia provides all kinds of ways for citizens to get involved in science. As an open science researcher, I appreciated all of the open aspects you mention. Clearly, open access to Wikipedia in all languages is a driving force in combatting misinformation generally, and the COVID "infodemic" specifically. I admit I struggled to understand the point of the section that begins, "Here, we asked what role does scientific literature, as opposed to general media, play in supporting the encyclopedia's coverage of the COVID-19 as the pandemic spread." The opening sentence articulates your a priori research question, always welcome for readers. Would some of the information that follows in this section around your methods be better placed in the following section under the "Material and Methods"? I found it jarring to read that "....after the pandemic broke out we observed a drop in the overall percentage of academic references in a given coronavirus article, used here as a metric for gauging scientificness in what we term an article's Scientific Score." These two ideas are introduced again later, but I had no idea on reading them here what they signified or whether they were related to research you were building on. You might consider adding a parenthetical statement that they will be described later, and that the idea of a score is your own.

      4) Material and methods - Your methods section might benefit from writing a preamble to prepare your readers. As already mentioned, consider taking some of the previous section and recasting it as an introduction to your methods. Consider adding some information to orient readers, and elaborating in a sentence or two about why identifying COVID-19 citations / information sources is an important activity.

      By the way, what is meant by this: "To delimit the corpus of Wikipedia articles containing DOIs"? Do you mean "identify" Wikipedia articles with DOIs in their references? As I mentioned (apologies in advance for the repetition), it strikes me as odd that you don't refer to this research as a form of citation analysis (isn't that what it is?). Instead you characterize it as "citation counting". If your use of words has been intentional, is there a distinction you are making that I simply do not understand? Also: bibliometricians and/or scientometricians might wonder why you avoid the phrase citation analysis. Further to your methods which are primarily quantitative and statistical - what are the qualitative methods used throughout the paper to analyze the data? How did you carry out this qualitative work? (On page 10, you state "we set out to examine in a temporal, qualitative and quantitative manner, the role of references in articles linked directly to the pandemic as it broke.") That part of your methods seems to be a bit under-developed, and may be worth reconsidering as you work to improve your reporting in the manuscript.

      5) Table 1. I am not sure what this table adds to the methods given it leads off your visuals. Do you really need it? It doesn't reveal anything to me and could be in a supplemental file. I also have difficulties in properly seeing table 1; perhaps you could make it larger and more readable?

      6) Figure 1. This is the most informative visual in the paper but it is hard to read and crowded. It deserves more space or the information it provides is not fully understood.

      7) Figure 3. This is very bulky as a figure, although informative. Again, I'm not sure all of it needs inclusion. Perhaps select part of it, and include other parts in a supplement.

      7) Limitations - The paper does not adequately address its limitations. A more fulsome evaluation of limitations would be beneficial to me as a reader, as it would place your work in a larger context. For example, consider asking whether the results are indicative of Wikipedia's other medical or scientific entries? Or are the results not generalizable at all? In other words, are they indicative of something very limited based on the timeframe that you examined? I found myself disagreeing with: "....the mainstream output of scientific work on the virus predated the pandemic's outbreak to a great extent". Is this still true? and what might its significance be now that we are in 2021? Would it be helpful to say that most of the foundational research re: the family of coronaviruses was published pre-2020, but entries about COVID-19 disease and treatment are now distinctly different in terms of papers cited, especially going forward. Wiki editors identify relevant papers over time but are not adept at identifying emerging evidence in my experience, or at incorporating important papers early; it's strange given that recency is one of its true calling cards. For me, the most confounding aspect of the infodemic is the constant shifts of evidence, and how to respond in a way that is prudent and evidence-based. As you point out, Wikipedia has an 8.7-year latency in citing highly relevant papers - and it seems likely that many important COVID-19 papers were neglected in Wikipedia in the first wave, especially about the disease. As you point out, this will form part of future research, which I hope you and your team will pursue.

      8) Reference 31 lacks a source: Amit Arjun Verma and S. Iyengar. Tracing the factoids: the anatomy of information reorganization in wikipedia articles. 2021.

      Good luck with the next stages in improving your manuscript for publication. I believe it adds to our understanding of Wikipedia's role in promoting sources of information.

    1. Author Response:

      Reviewer #1 (Public Review):

      This is an interesting study looking at the evolution of ageing in social insects using ants as a model. As I haven't seen the initial submission, I have looked at the manuscript and the response to reviewers and I base my suggestions on both documents.

      Evolution of ageing remains only partially understood and this field seems to be experiencing a sort of renaissance in recent years with a surge of theoretical advances and new empirical findings. Queens of social insects, and ant queens in particular, have remarkable lifespans and understanding the biology of their long life can help in understanding the biology of ageing in a more general sense.

      In this study, the authors focus on following quite a large number of ant (C. obscurior) colonies and provide intriguing data in relation to age-specific mortality and reproduction. The gist of their argument is that the mortality is decreasing with age while reproduction (production of sexuals) is increasing with age, such that there is little evidence of ageing in this species.

      Overall I think this is an interesting dataset that provides important information that will advance the field. However, I think the manuscript currently lacks clarity, structure and suffers from poor formulation of ideas in places, and is rather difficult to follow even for an expert in the field. I think that it requires quite a bit of work to sort this out. However, I also have a methodological question (#15) which could be key for the interpretation of the results.

      We hope that this manuscript is clearer now, especially with the additional data.

      My understanding is that queens live for 40-50 weeks max (Fig. S3). Fig. 4 suggests that from week 30 onwards the production of eggs, worker pupae and queen pupae decline. This suggests that while queen mortality declines in late life, so does queen reproduction. So, do queens of this species show reproductive senescence?

      Yes, they do experience reproductive senescence.

      The data do suggest that relative investment into reproduction (queen worker ratio) increases with age, but the absolute number of queens declines with age. This suggests an interesting result from the life-history theory perspective - increased investment in reproduction with reduced residual reproductive value, but not necessarily the absence of reproductive senescence. Please clarify.

      We hope this new version of the manuscript makes clear that ant queens do experience reproductive and actuarial senescence, but only late in life (after the peak of sexual investment is reached). Therefore, we state that senescence is delayed.

      Reviewer #2 (Public Review):

      The authors investigated the evolutionary drivers of delayed senescence in ant queens by carefully observing the survival and productivity of C. obscurior colonies that were maintained at 10, 20, or 30 workers. They show that the 10 worker treatment produces fewer new queens, and lower quality workers, indicating low colony efficiency under a reduced workforce. The authors focused their conclusions on the observation of a hump-shaped relative mortality curve, with queens having a higher than average mortality around 30 weeks and then a lower than expected mortality around 40 weeks. The colonies produced more queens at the end of their lifespan, so the authors conclude that high fitness gains at the end of life select for minimal senescence in ant queens, thus generating the drop in mortality they observed at 40 weeks.

      There is a large body of research focused on the early life stage and establishment of ant colonies, but relatively little that follows their worker and reproductive trajectory to the end of life. Partially, this is because many commonly studied ant species have a lifespan too long to feasibly track, and partially because most ant species do not readily produce sexual queens or males in the lab setting. For this alone, the study provides valuable insight into the ant lifecycle and demonstrates that C. obscurior is an ideal species for future study. The experimental design and analyses are sound, and I must acknowledge the incredible amount of work that must have gone into the data collection. However, I have some serious concerns about how the results are interpreted, and what is left out of the discussion on ant colony structure and limitations that are crucial to reaching accurate conclusions.

      One issue is that the conclusions hinge on the observation that relative queen mortality decreases at the latest observational period, around 40 weeks. The authors raise this as evidence that queens are under selection for reduced senescence, as they also conclude that fitness gains (queen production) are highest late in life. The problem is that according to figure S3, only a handful of queens survive past week 40, and they all manage to hang on for another month or two before dying out. I cannot be sure how many colonies survive to this period from how the data is presented, but I worry that the authors are resting their conclusion on a low number of particularly tenacious queens. These colony numbers should be provided, and the authors should demonstrate that the drop in mortality is observable even if these outliers are excluded.

      Fitness gains are highest late in life, and this is shown for all queens, regardless of whether they are short- or long-lived. Therefore, selection is maintained until late in life. We calculate relative mortality as a function of age as in Jones et al. (2014) (Fig. 4). As suggested by the first reviewer, we also now include the age-specific mortality of the best model fitted using BaSTA and the estimated parameters in the supplement (Figure 4 - Figure supplement 1, Supplementary File 8 and 9). We have also included RNAseq data of queens near and middle-aged queens. The data support our conclusion of a delayed selection shadow, as signs of aging were not obvious in the middle-aged queens. This is in line with two studies (Wyschetzki et al. MBE 2015; Harrison et al. GBE 2021), where no signs of aging were found in middle-aged queens of the same species.

      It also appears that the queen pupae production drops off precipitously during the end of the observational period, according to figure 4A, which runs counter to the argument that selection is reducing senescence in these older queens because they have high reproductive output at this stage. The authors put a lot of emphasis on the queen/worker ratio being highest at the end of the observational period, but this doesn't necessarily mean queens are receiving the highest fitness during this period. A queen would have a high queen to worker production ratio if she lays one worker and one queen, but she would have higher fitness if she lays 100 workers and 10 queens. Figure 2A indicates that the highest overall queen pupae laying occurs around 30 weeks, which actually corresponds with the highest level of relative queen mortality. The question of fitness gains at advanced queen age would be better answered by just analyzing which stage in their life they produced the most queen pupae. Does the queen laying rate reach a maximum and remain stable for the rest of a queen's life, or does it decrease along with worker production as they reach end of life? Figure 4A makes it appear that it decreases towards end of life, but I'm not sure if that is only because so few colonies lasted until the end of the observational period.
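      The distinction drawn above between the queen/worker ratio and absolute queen output can be made concrete with a minimal sketch. It uses only the illustrative numbers from the example in the preceding paragraph (not data from the study), and the function name `caste_ratio` is a hypothetical helper introduced here for illustration.

```python
# Illustrative numbers from the reviewer's example only; these are NOT study data.
def caste_ratio(queens: int, workers: int) -> float:
    """Queen-to-worker production ratio (hypothetical helper)."""
    return queens / workers


# (queens produced, workers produced) for the two scenarios in the example.
scenarios = {
    "one worker, one queen": (1, 1),
    "100 workers, 10 queens": (10, 100),
}

for label, (queens, workers) in scenarios.items():
    ratio = caste_ratio(queens, workers)
    print(f"{label}: ratio = {ratio:.2f}, absolute queens = {queens}")
```

      The first scenario has the higher ratio but the lower absolute number of queens, which is why a rising ratio alone does not establish rising fitness returns.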

      We have included that “This caste ratio shift does not occur because a drop of pupae production at the end of life. Actually, pupae production is at its highest just before death (Figure 2 - Figure supplement 1).” We added a figure with raw numbers of pupae produced at the end of life for the 99 tracked queens.

      Another factor that should be discussed is sperm depletion. The authors state that each queen mated with a single male when they set up the colonies, so sperm depletion may be more important than senescence for determining the reproductive lifespan of these queens. I'm not sure if this species is normally single mated in the wild, or the length of their natural colony lifespan, but this is important information to provide in order to dismiss issues of sperm depletion in this study. Without this information it is impossible to determine if the decrease in egg laying towards the end of the study is due to senescence or sperm depletion.

      Taken together, it could be argued that these data better support selection on an optimal lifespan, around 30 weeks, as opposed to selection for directional extended lifespan and reduced senescence. If the reproductive benefits of an extended lifespan are capped by sperm depletion, the alternative strategy would be to produce a robust workforce as quickly and efficiently as possible, and then produce as many sexual offspring as possible with the remaining sperm. Perhaps selection has determined that the optimal length of this cycle is around 30 weeks, with variation dependent on the amount of sperm transferred during mating and the condition of the queen. This possibility should be addressed, and if possible additional data should be provided on sperm depletion in C. obscurior, and the colonies that survived to the end of the observation period. Without these additions, the conclusions on senescence and lifespan remain tenuous.

      We now discuss in the manuscript that sperm depletion is not commonly seen in this species and occurred only once in this study (in 1 of the 99 colonies). All colonies were tracked until death. Therefore, there is no evidence of stabilizing selection for a lifespan of 30 weeks based on sperm depletion. This manuscript addresses the “shape” of aging in this species rather than the “pace” (lifespan extension), but gives a hint as to why extended lifespans should be favored.

  10. Jan 2022
    1. It’s important to understand – just because we don’t have certain kinds of privileges, it doesn’t mean that we don’t benefit from other kinds of privileges.

      I connected strongly with this quote. As a white woman I don't face much discrimination, and I realize that I benefit from the privileges I do have. There are probably several instances where those privileges have benefitted me. Coming from a small town, one privilege I see is that everyone knows my mother; others may be in the same situation, but for them it may lead to discrimination.

    1. as not increased the amount of pleasurable satisfaction which they may expect from life and has not made them feel happier. From the recognition of this fact we ought to

      Freud makes a fascinating point, suggesting that no matter what society does, it will never be enough. I feel that is true if we think about how the world operates today. For example, Apple comes out with a new phone, a new PC, new headphones, and new tech gear every year. Every year there is always something "new and better" that just slightly enhances what was already there. As humans, we're just slightly enhancing civilization in hopes the void of unhappiness will someday fill itself, but I think all the advances are just distractions from the fact that happiness is unattainable within these conditions.

    1. [Highlighted text: the entire ESPN home page, including the subscription banner, quick links, site navigation, top headlines, live scores and betting lines, event promotions, trending stories, and footer links.]

      The website as a whole is extremely cluttered, with updates, subscription options, promotions, game scores, and articles all on the same page. This is a bad example of website accessibility: it may cause sensory overload for some individuals, and it can be difficult to follow, especially for people using screen-reading software that reads the contents of the page aloud.

    1. SciScore for 10.1101/2022.01.24.22269714: (What is this?)

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Ethics</td><td style="min-width:100px;border-bottom:1px solid lightgray">IRB: Ethical considerations: The Ethics Committee of the MIBS approved the VE study on June 21, 2021.<br>Consent: All participants signed the informed consent upon referral to the LDCT triage.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">Sample size of 1,198 cases and 2,747 controls, and 1,175 patients with the complete vaccination status (exposure level of 29.8% for Sputnik V) provides 80% power to detect an odds ratio of 0.80 (or the VE of 20%) at the 5% alpha level.</td></tr></table>
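      The power analysis row equates an odds ratio of 0.80 with a VE of 20%. A minimal sketch of that conversion follows, assuming the standard case-control relation VE = (1 − OR) × 100%; the function name `vaccine_effectiveness` is a hypothetical helper introduced here for illustration and is not part of SciScore or of the study's code.

```python
def vaccine_effectiveness(odds_ratio: float) -> float:
    """Vaccine effectiveness in percent, assuming VE = (1 - OR) * 100."""
    return (1.0 - odds_ratio) * 100.0


# Odds ratio quoted in the power analysis row above.
ve_percent = vaccine_effectiveness(0.80)
print(f"OR = 0.80 corresponds to VE of about {ve_percent:.0f}%")
```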

      Table 2: Resources

      No key resources detected.


      Results from OddPub: Thank you for sharing your code and data.


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      The self-reported vaccination status is an important limitation of our study. Several survey participants included in the control group have not reported the exact date of vaccination. While the overall number of such individuals was low, we assumed that the vaccination date for such individuals is likely to be several months from the interview date. However, we assigned them a “non-vaccinated” status in our sensitivity analysis, and the estimates were only slightly affected. Our definition for full vaccination status was also very conservative, as we decided to accept a minimum of six days between the second vaccine dose and study inclusion. While our decision was driven by the idea that we should not exclude participants without an exact date of vaccination, we do not think that this assumption would significantly bias the results. However, most of the studies choose 14-day period [5], and that should be taken into account when comparing our results to other studies. We have undertaken additional attempts to identify cases (patients with symptomatic SARS-CoV-2 in October, 2021) who had the history of confirmed COVID-19 more than two months before the current episode. We were able to identify only two cases of re-infection. While underreporting may occur, it is also likely that a patient with re-infection that requires additional diagnostic followup is an infrequent event. Absolute risks of re-infection, especially of severe disease, are low for the Alpha, Beta, and Delta VO...

      Results from TrialIdentifier: We found the following clinical trial numbers in your paper:<br><table><tr><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Identifier</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Status</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Title</td></tr><tr><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">NCT04981405</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Active, not recruiting</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Real-world Evidence of COVID-19 Vaccines Effectiveness</td></tr><tr><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">NCT04406038</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Active, not recruiting</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Study of the Spread of COVID-19 in Saint Petersburg, Russia</td></tr><tr><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">ISRCTN11060415</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">NA</td><td style="min-width:95px; border-right:1px solid lightgray; border-bottom:1px solid lightgray">NA</td></tr></table>


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.

      Results from scite Reference Check: We found no unreliable references.


      <footer>

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

      </footer>

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2021-01129

      Corresponding author(s): Koji Kikuchi


      Reviewer #1

      Evidence, reproducibility and clarity (Required):

      In this manuscript, Kikuchi et al describe the characterization of MAP7D2 and MAP7D1, two MAP7 family members in mouse with specific expression patterns. Focusing mostly on MAP7D2, they assess its expression pattern across the body and find that it is mostly expressed in certain neuronal subsets. They then characterize the MT-related properties of MAP7D2 based on previous knowledge of other MAP7 family members. They show that MAP7D2 binds MTs (via the N-terminus), determine the binding affinity, and show that it can stimulate MT polymerization (or stabilization) both in vitro and in vivo. Using a specific antibody, they localize MAP7D2 to centrosomes, midbody and neurites in N1-E115 cells. Functionally, they show that loss of MAP7D1/2 mildly affects microtubule stability as judged by acetyl-tubulin staining, and properties of these cells that rely on cytoskeletal elements such as cell migration and neurite growth. Interestingly, there might be a feedback loop regulating MAP7D1/2 expression, as knockdown of MAP7D1 upregulates MAP7D2.

      Overall, the experiments and conclusions are very solid and convincing, such that I would not ask for further experiments. This is in part because the experiments are largely based on previous characterizations of other MAP7 family members, which are largely confirmed. The presentation of the data is also very clear.

      Significance (Required):

      I see the value of the study in the fact that it provides solid and specific research tools for MAP7D1/2 which could be very useful for the microtubule/neuronal cytoskeleton community.

      Response: We thank the reviewer very much for appreciating the content of our manuscript.

      Referees cross-commenting

      Reviewers 2 and 3 criticize that the evidence for an effect of MAP7D1/2 on MT dynamics is weak. I would agree in that ac-tub stainings and in vitro experiments are rather indirect. The experiments suggested by reviewer 2 should clarify this (esp. nocodazole should be easy). I also agree that an experiment addressing the potential involvement of kinesin-1 would help, the involvement of which seems to have been omitted by the authors. A kinesin-binding deficient mutant would add another MAP7D1/2 tool and increase the value for the community.

      Response: As for the reviewer’s suggestions listed above, please refer to our responses to the comments of Reviewer #2.

      Reviewer #2

      Evidence, reproducibility and clarity (Required):

      In this study, the authors investigate 2 members from the MAP7 family Map7D2 and Map7D1. They first address the tissue distribution of Map7D2, by northern blotting using a variety of rat tissues. To complement their analysis, they also raised an antibody to look at the protein distribution. From their studies, they concluded that Map7D2 is abundantly expressed in the brain and testis. The authors went on to perform a series of functional assays. First, they biochemically demonstrated that rat Map7D2 directly binds to MTs by MT co-sedimentation assay. The MT binding domain was mapped to the N-terminal half. They performed MT turbidity assay to demonstrate enhanced MT polymerisation in the presence of Map7D2, suggesting that this Map stabilises MTs. The authors went on to characterise in detail the subcellular localisation of Map7D2 which was predominantly present in the centrosome and partially localised to MTs including within neurites from N1-E115 cells. Kikuchi et al. further revealed the overlap in expression between Map7D2 and another family member, Map7D1. The authors continued these studies by a series of functional studies in N1-E115 cells where they performed single or combined knock-downs of Map7D2 and Map7D1 and studied the levels of acetylated and detyrosinated tubulins and the effect of the knock-downs on migration and neurite extension. The main conclusion from this work was that Map7D2 and Map7D1 facilitate MT stabilization through distinct mechanisms which are important in controlling cell motility and neurite outgrowth. Map7D2 is proposed to stabilise MTs by direct binding whereas Map7D1 does it indirectly by affecting acetylation.

      Major comments:

      The main conclusion from this work that Map7D2 and Map7D1 facilitate MT stabilization and that this is necessary for correct migration and neurite extension has not been convincingly demonstrated. In my opinion, a more detailed study of MT properties to demonstrate a role in MT stabilisation would greatly benefit the work, eg. experiments using MT destabilising agents such as nocodazole. In addition, a series of experiments aiming to study MT dynamics would help to understand the function of these MT regulators. The authors proposed an elevation in microtubule dynamics to explain the increase in migration and neurite extension but no experimental proof was provided.

      Response: According to the reviewer’s suggestion, we plan to assess the role of MT stabilization in greater detail by analyzing the sensitivity to the MT-destabilizing agent, nocodazole.

      To study MT dynamics, methods such as analyzing the velocity and direction of an EB1-GFP comet are commonly used. We have previously analyzed the roles of Map7 and Map7D1 in MT dynamics using HeLa cells stably expressing EB1-GFP (Kikuchi et al., EMBO Rep., 2018). However, no such tools have been developed for analyzing MT dynamics in N1-E115 cells, which were used in this study. In addition, it is difficult to analyze MT dynamics by transient expression of EB1-GFP because of the low plasmid transfection efficiency. Therefore, we instead plan to assess the effect on MT dynamics by measuring the EB1 comet length by immunofluorescence, referring to Fig. 7D in EMBO J. 32:1293–1306, 2013.

      Moreover, considering the possibility that the Map7D2 dynamics are altered when MT stability is changed, e.g., before and after differentiation induction, we analyzed the Map7D2 dynamics at the centrosome by fluorescence recovery after photobleaching (FRAP) using N1-E115 cells stably expressing EGFP-rMap7D2. We found that the dynamics were altered between the proliferative and differentiated states (see the figure below). Compared to the proliferative state, the recovery rate of EGFP-Map7D2 was reduced (lower left panel), and the immobile fraction of Map7D2 was increased in the differentiated state (lower right panel). As these data suggest that the increase in immobile Map7D2 may enhance MT stabilization, we will present them in a new figure in our manuscript along with the results of the above two experiments.
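      The relationship between a FRAP recovery plateau and the immobile fraction mentioned above can be illustrated with a minimal sketch. It assumes the common three-point definition (pre-bleach, immediately post-bleach, and plateau intensities); the function name and the readings are hypothetical placeholders, not values from the manuscript, and the authors' actual fitting procedure is not stated in the response.

```python
def frap_fractions(pre_bleach, post_bleach, plateau):
    """Return (mobile, immobile) fractions from three intensity readings.

    Assumes the simple definition:
        mobile = (plateau - post_bleach) / (pre_bleach - post_bleach)
    """
    bleach_depth = pre_bleach - post_bleach
    if bleach_depth <= 0:
        raise ValueError("post-bleach intensity must be below pre-bleach intensity")
    mobile = (plateau - post_bleach) / bleach_depth
    # Clamp to [0, 1] so noisy readings cannot yield impossible fractions.
    mobile = min(max(mobile, 0.0), 1.0)
    return mobile, 1.0 - mobile


# Placeholder readings in arbitrary units (illustrative only).
mobile, immobile = frap_fractions(pre_bleach=100.0, post_bleach=20.0, plateau=80.0)
print(f"mobile={mobile:.2f}, immobile={immobile:.2f}")
```

      Under this definition, a lower recovery plateau in the differentiated state corresponds directly to a larger immobile fraction.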

      It has been previously demonstrated that loss of MAP7D2 leads to a decrease in axonal cargo entry to axons resulting in defects in axon development and neuronal migration. The C-terminus is necessary for this function as it mediates interaction with Kinesin-1 (Pan et al., 2019). Such mechanisms could also explain the defects in migration and neurite growth that the authors observed. This possibility has not been considered but instead, the subtle changes in total α-tubulin led to suggest MT stabilisation as a key function without proof of causation. Could the authors provide some further experimental evidence to demonstrate that stability is the main contributor to the phenotypes observed? Eg. by rescuing migration and neurite phenotypes with a variant of MAP7D2 which cannot bind kinesin1.

      Response: The reviewer states “Such mechanisms could also explain the defects in migration and neurite growth that the authors observed;” however, our results showed that loss of Map7D2 elevated the rates of both cell motility and neurite outgrowth (original Fig. 5). In contrast, it has been reported in several papers that when Kinesin-1 function is impaired, both cell motility and neurite outgrowth are reduced (Curr. Biol., 23: 1018–1023, 2013; Mol. Cell. Biol., 39: e00109–19, 2019; etc.). Therefore, it is likely that the phenotypes we observed are independent of the functions associated with Kinesin-1 in N1-E115 cells. It is indeed possible that the experiment suggested by the reviewer may reveal relationships between Map7D2 and Kinesin-1 in terms of cell motility and neurite outgrowth; however, it is difficult to conduct such an experiment because transient expression of Map7D2 induces MT bundling, as shown in original Fig. 2F. Based on the above, we plan to add a discussion of the relationship between Map7D2 and Kinesin-1.

      A key conclusion proposed by the authors is that Map7D2 and Map7D1 facilitate MT stabilization through distinct mechanisms. Such different roles in MT stabilisation are important in controlling cell motility and neurite outgrowth. In my opinion, their data does not fully support this statement and the findings using MT readouts do not match the defects in migration and neurite growth. Loss of Map7D2 leads to a very subtle phenotype on α-tubulin, while Map7D1 decreases both α-tubulin and acetylated tubulin, but Map7D1 seems to have a milder or similar effect on migration and neurite growth than Map7D2. Furthermore, it would be expected that the combined loss of function would lead to a stronger phenotype in cell migration when compared to the single loss of functions due to their distinct roles on MT stability, however, this seems not to be the case.

      Response: The fact that no stronger phenotype was observed may be because, besides Map7D2 and Map7D1, other molecules are involved in MT stabilization. Another possible explanation is that the increases in both cell motility and neurite outgrowth caused by decreased MT stabilization are offset by Kinesin-1 dysfunction. We plan to add a discussion of the above two possibilities.

      Minor comments:

      1) In the first result section, the author refers to Fig. S3 to suggest the expression of MAP7D2 in the cerebral cortex, however, there are no transcripts in the cerebral cortex according to the figure. Similarly, the immunofluorescence analysis done by the authors shows marginal expression of MAP7D2 in the cerebral cortex.

      Response: According to the reviewer’s comment, we have changed the order of the data shown in Fig. 1C, top panels. The data from the olfactory bulb, cerebellum, and hippocampus, in which Map7D2 expression was detected in the database, were arranged in the top three rows, and the data from the cerebral cortex, in which Map7D2 expression was not detected in the database, were moved to the bottom row as a negative control. In addition, we have revised the relevant part of the Results section as follows: “Based on RNA-seq CAGE, RNA-Seq, and SILAC database analysis (Expression Atlas, https://www.ebi.ac.uk/gxa/home/), Map7D2 expression was detected in the cerebellum, hippocampus, and olfactory bulb, and not in the cerebral cortex (Fig. S3). We further confirmed Map7D2 expression in the above four brain tissue regions of postnatal day 0 mice by immunofluorescence. Among these regions, Map7D2 was the most highly expressed in the Map2-negative area of the olfactory bulb, i.e., the glomerular layer (Fig. 1C). Weak signals were detected in the cerebellum, and marginal signals were observed in the hippocampus and cerebral cortex (Fig. 1C).” (page 5, lines 4–11)

      2) The authors use γ-Tubulin as a housekeeping gene in Fig. 3D, since Map7D2 is enriched in centrosomes this may not be the most appropriate choice.

      Response: γ-Tubulin is abundant in both the cytosol and the nuclear compartments of cells (Sig. Transduct. Target Ther. 3: 24, 2018). As it has been used for similar purposes in several other studies (Cancer Res., 61: 7713–7718, 2001; J. Biol. Chem., 291: 23112–23125, 2016; etc.), we considered it acceptable for use as a loading control for immunoblotting.

      3) According to the authors, knockdown of Map7D2 leads to a decrease in the intensity of α-tubulin and Map7D1 (Fig. 4C and D). This data doesn't agree with the previous statement made by the authors where they show that Map7D2 knockdown or knockout did not affect Map7D1 expression by Western Blot Analysis (Fig. S2C and S5B)

      Response: The immunoblotting results indicate that the total amount of Map7D1 in the cells is not affected by loss of Map7D2. In contrast, the immunofluorescence results indicate that the amount (distribution) of Map7D1 localized around the centrosome is decreased by loss of Map7D2, presumably due to a reduction in the number of MT structures that can serve as scaffolds for Map7D1. We plan to add this interpretation in the Results section.

      4) Line 6 page 7 "Endogenous Map7D2 expression is suppressed in N1-E115 cells stably expressing EGFP-rMap7D2 and was restored by specific knock-down of EGFP-rMap7D2 using gfp siRNA (Fig. 3D)". No quantifications and stats are shown. Also, endogenous Map7D2 after knock-down of EGFP-rMap7D2 is not comparable to the control.

      Response: According to the reviewer’s suggestion, we have quantified the amount of endogenous Map7D2 or EGFP-rMap7D2, normalized it to the amount of γ-tubulin, and calculated values relative to endogenous Map7D2 in the parental control. The amount of endogenous Map7D2 was decreased to 53% in N1-E115 cells stably expressing EGFP-rMap7D2, suggesting that EGFP-rMap7D2 expression suppressed endogenous Map7D2 expression. In this cell line, the total amount of Map7D2 (EGFP-rMap7D2 + endogenous Map7D2) was increased; however, when EGFP-rMap7D2 was depleted using sigfp in this cell line, endogenous Map7D2 was expressed to the same level as EGFP-rMap7D2 before knock-down. Together with the finding that Map7d1 knock-down increased the amount of Map7D2, these findings indicate that the amount of Map7D2 in the cells is regulated in response to the amount of Map7D1 and exogenous Map7D2. We have added this interpretation in the Results section. (page 7, lines 8–15)

      In addition, we have changed the legend of the original Fig. 3D to clarify the quantification method, as follows: “(D) Generation of N1-E115 cells stably expressing EGFP-rMap7D2. To check the expression level of EGFP-rMap7D2, lysates derived from the indicated cells were probed with anti-GFP (top panel) and anti-Map7D2 (middle panel) antibodies. The blot was reprobed for γ-tubulin as a loading control (bottom panel). The amount of endogenous Map7D2 or EGFP-rMap7D2 was normalized to the amount of γ-tubulin, and the value relative to endogenous Map7D2 in the parental control was calculated.” (page 22, lines 18–20)
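      The quantification described in this legend (normalize each band to the γ-tubulin loading control, then express it relative to the parental control) can be sketched as follows. This is only an illustration of the arithmetic; the function name and the densitometry readings are hypothetical placeholders and are not data from the manuscript.

```python
def relative_expression(band, loading, control_band, control_loading):
    """Band intensity normalized to its loading control, relative to the control lane."""
    if loading <= 0 or control_band <= 0 or control_loading <= 0:
        raise ValueError("loading and control intensities must be positive")
    normalized_sample = band / loading                    # sample lane
    normalized_control = control_band / control_loading   # parental control lane
    return normalized_sample / normalized_control


# Placeholder densitometry readings in arbitrary units (illustrative only).
ratio = relative_expression(band=30.0, loading=60.0,
                            control_band=50.0, control_loading=50.0)
print(f"relative expression = {ratio:.2f}")
```

      A value below 1 indicates less protein per unit of loading control than in the parental lane.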

      5) Line 8 page 7 "These results suggest that the expression of Map7D2 was influenced by changes in that of Map7D1" This statement seems in the wrong place, after the Map7D2 and EGFP-rMap7D2 experiment. Instead for clarity, it would be better placed after line 5 where the authors explain the effect of Map7D1 knock-down on the levels of Map7D2.

      Response: According to the reviewer’s suggestion, we have rephrased the relevant sentence as “Interestingly, Map7d1 knock-down upregulated Map7D2 expression, as confirmed with three different siRNAs (Fig. S2C), suggesting that Map7D2 expression is affected by changes in Map7D1 expression, not by off-target effects of a particular siRNA.” (page 7, lines 7, 8)

      6) Line 8 page 8 "Although the physiological role of the C-terminal region of Map7D2 is currently unknown..." This statement seems not adequate as there are several studies reporting the role of the C-terminal region of Map7D2 in Kinesin1- mediated transport. The authors mention such studies in the discussion.

      Response: According to the reviewer’s suggestion, we plan to add a discussion of the relationship between Map7D2 and kinesin-1.

      7) Line 6 page 9 " Further, the knock-down of either resulted in a comparable reduction of MT intensity (Fig. 4C and D) ..." This is not visible and/or justified by the images provided and would benefit from some sort of quantification at other regions such as neurites.

      Response: Considering the cell motility, quantification of α-tubulin/Ace-tubulin/Map7D1/Map7D2 intensities in neurites is not appropriate. Instead, we have added arrowheads indicating α-tubulin/Ace-tubulin/Map7D1/Map7D2 in Fig. 4C, for better understanding.

      8) In Fig. 2B, a band corresponding to his6-rMAP7D2 of molecular weight >97 kDa co-sedimented with the microtubules. However, the cloned rMAP7D2 had a molecular weight of 84.82 kDa and the addition of 6XHis-Tag would add another 2-3 kDa, therefore, the final protein band observed should be less than 90 kDa. It would be beneficial if the authors could specify the molecular weight of the purified protein after the addition of the V5-his tag and/or if there was addition of amino acids due to cloning strategy.

      Response: In Fig. 2B, we used full-length GST-tagged rMap7D2, as in Fig. 2E and D; therefore, we have corrected His6-rMap7D2 to GST-rMap7D2. We apologize for the mistake.

      9) In Fig. 2C, there is misalignment of the western blot with the panel or text underneath.

      Response: We thank the reviewer for pointing this out; we have corrected the misalignment of the CBB staining in Fig. 2C.

      10) In Fig. 3C the inset from the first panel seems to correspond to a different focal plane than the main image.

      Response: We have revised the relevant part of the figure legend as follows: “In C, images of differentiated cells were captured by z-sectioning, because the focal planes of the centrosome and neurites are different. Each inset shows an enlarged image of the region indicated with a white box at each focal plane. Arrowheads indicate the centrosomal localization of Map7D2.”

      11) In Fig. 4A, the cell type is not specified and is referred as "indicated cells", also the material and methods section seems to omit the specific cells used.

      Response: We have added “in N1-E115 cells treated with each siRNA” in the legend of Fig. 4A.

      12) Fig. S6 is not mentioned in the results.

      Response: We apologize for having referred to Fig. S6 only in the Discussion section in the original manuscript. We plan to describe the findings shown in the original Fig. S6 in the Results section and renumber the figures accordingly.

      Significance (Required):

      MTs play essential roles in practically every cellular process. Their precise regulation is therefore crucial for cellular function and viability. MAPs are specialised proteins that interact with MTs and regulate their behaviour in different manners. Understanding their precise function in different cellular contexts is of utmost importance for many biological and biomedical fields.

      MAPs are well known for their ability to promote MT polymerization, bundling and stabilisation in vitro (Bodakuntla et al., 2019). Several members of the Map7 family have been shown to regulate microtubule stability. For instance, MAP7 can prevent nocodazole-induced MT depolymerization and maintain stable microtubules at branch points in DRG neurons (Tymanskyj & Ma, 2019). Ensconsin, the Drosophila Map, is required for MT growth in mitotic neuroblasts by regulating the mean rate of MT polymerization (Gallaud et al., 2014). However, this family of Maps seems to have diverse functions encompassing a variety of mechanisms, as exemplified by a series of studies demonstrating the involvement of MAP7 family proteins in the recruitment and activation of kinesin1 (Hooikaas et al., 2019; Pan et al., 2019) and in microtubule remodelling and Wnt5a signalling (Kikuchi et al., 2018). Further understanding of this family of Maps and how its members differ in their function is important and will help to advance the field.

      Response: We appreciate the reviewer’s comments. We believe that our revision plan will greatly improve the quality of our manuscript.

      Reviewer #3

      Evidence, reproducibility and clarity (Required):

      Summary:

      Microtubule Associated Proteins (MAPs) are important regulators of microtubule dynamics, microtubule organization and vesicular transport by modulating motor protein recruitment and processivity. In the current manuscript the authors have characterized 2 members of the MAP7 protein family, MAP7D1 and MAP7D2. The authors characterized MAP7D2 expression pattern in the brain and its microtubule binding properties in vitro and in cells. In cells both proteins localize to the centrosome and to microtubules and upon depletion centrosome localized microtubules seem reduced, and cell migration and neurite outgrowth are increased. Surprisingly, they find that microtubule acetylation (a common marker for stable microtubules) is reduced upon MAP7D1 depletion but not MAP7D2 depletion. Based on this finding the authors conclude that these proteins have a distinct mechanism in stabilizing MTs to affect cell migration and neurite outgrowth; MAP7D2 stabilizes by binding to MTs, whereas MAP7D1 stabilizes MTs by acetylation.

      Main comments:

      - Both MAP7 proteins show strong localization to the centrosome and to a lesser degree to MTs. Knockdown of either protein leads to reduced MTs around the centrosome, which led the authors to conclude the MAP7s are stabilizing the MTs. However, the effect could just as well be an indirect effect due to a function of these MAPs at the centrosome. To address this, the authors could e.g. quantify microtubule properties in postmitotic cells. In addition, antibody specificity should be tested using knockdown or knockout cells, as this centrosome localization was not observed in HeLa cells (Hooikaas, 2019; Kikuchi, 2018). Maybe this localization is specific to rat MAP7s or to the cell line used.

      Response: We think that this comment partly overlaps with the comments raised by Reviewer #2. We plan to assess the role of MT stabilization in greater detail by analyzing the sensitivity to the MT-destabilizing agent, nocodazole, and the effect on MT dynamics by measuring the EB1 comet length by immunofluorescence.

      Regarding the reviewer’s concern about antibody specificity, we had carefully confirmed the antibody specificity, as shown in Fig. S2 of the original manuscript. Subsequently, Map7D2 localization was confirmed in N1-E115 cells stably expressing EGFP-rMap7D2, as shown in Fig. 3D, E of the original manuscript. In addition, we are currently conducting analyses using Map7d1-egfp knock-in mice, which confirmed that Map7D1 localizes around the centrosome in cortical neurons, as shown below (we would like to disclose these unpublished data to the reviewers only). Therefore, it is thought that the localization pattern of Map7D2 and Map7D1 differs depending on the cell type and cell line. We plan to add this interpretation to the Results section.

      - Centrosome nucleated microtubules are typically highly dynamic and little modified. Therefore is the Ac-tub staining at the centrosome really MTs? I cannot identify MTs in the fluorescent images in 4C. Maybe authors could consider ac-tub/alpha-tub ratio in non centrosomal region (e.g. neurites). Moreover, as both Acetylation and detyrosination are associated with long-lived/stable MTs, it is surprising that only acetylated tubulin goes down on WB. Does this suggest that long-lived MTs are still present to normal level? If so, can one still argue that the loss of acetylation is the cause of the lower MT levels? This should at least be discussed.

      Response: As for the reviewer’s statement “Centrosome nucleated microtubules are typically highly dynamic and little modified. Therefore is the Ac-tub staining at the centrosome really MTs?”, it has been previously reported that tubulin acetylation is observed around the centrosome in some cell lines (J. Neurosci., 30: 7215–7226, 2010; PLoS One, 13: e0190717, 2018; etc.). N1-E115 is one of the cell lines in which tubulin acetylation is observed around the centrosome.

      It is not surprising that “only acetylated tubulin goes down on WB,” as it has been previously reported that acetylated and detyrosinated tubulins are sometimes not synchronous (J. Neurosci., 23: 10662–10671, 2003; J. Neurosci., 30: 7215–7226, 2010; J. Cell Sci., 132: jcs225805, 2019; etc.). For instance, Montagnac et al. (Nature, 502: 567–570, 2013) showed that defects in the α-tubulin acetyltransferase αTAT1-clathrin-dependent endocytosis axis reduce only tubulin acetylation, resulting in a shift from directional to random cell migration. Although the details of the molecular function of Map7D1 are beyond the main purpose of this study, we plan to add a discussion of the reduced tubulin acetylation by Map7d1 knock-down based on the above.

      - MAP7D1 and MAP7D2 depletion leads to subtle defect in cell migration and neurite outgrowth, which the author suggest is caused by reduced MT stability. However, MAP7 proteins have well characterized functions in kinesin-1 transport, and thus the phenotypes may well be caused by defects in kinesin-1 transport. Ideally the authors would do rescue experiments with FL or just the MT binding N-termini to separate these functions. Moreover this is needed to substantiate the claim of the authors that MAP7D1 effect on MT stability is not mediated by direct binding.

      Response: As this comment largely overlaps with the comments raised by Reviewer #2, please refer to our responses to the comments of Reviewer #2.

      - The authors do not refer well to published work. Several papers have published very similar work (especially to Fig1+2) and it would help the reader much if this would be discussed/compared along the results section and not briefly mention these in the results section. In addition, authors overstate the novelty of their results e.g. page 3: these proteins are not "functionally uncharacterized" nor are their expression pattern and biochemical properties analyzed for the first time in this manuscript; page 8 "Although the physiological role of the C-terminal region of Map7D2 is currently unknown, ..." There is a clear function for the C-terminus for the recruitment/activation of kinesin-1.

      Response: According to the reviewer’s suggestion, we plan to add a comparison with data on the Map7 family members presented in previous papers in the Results section and rephrase the relevant part regarding the physiological role of the C-terminal region of Map7D2.

      Minor comments

      - P6 Map7D3 also binds with its N-terminus to MTs, like other MAP7s (Yadav et al)

      Response: According to the reviewer’s comment, we have revised this as “Map7D3 binds through a conserved region on not only the N-terminal side, but also the C-terminal side (Sun, 2011; Yadav et al., 2014).” (page 6, lines 4, 5)

      - P7 "As Map7D2 has the potential to functionally compensate for Map7D1 loss" where is this based on?

      Response: For clarity, we have rephrased this as “As Map7D2 expression was upregulated upon suppression of Map7D1 expression, Map7D2 has the potential to functionally compensate for Map7D1 loss.” (page 7, lines 17, 18)

      - Fig2F quality of black-white images is low potentially due to conversion issues

      Response: We thank the reviewer for pointing out these conversion issues, and we have made the necessary corrections.

      Significance (Required):

      At this stage the conceptual advance is limited. Part of the findings are not novel. The finding that MAP7s depletion have a different effect on MTs acetylation may be interesting to cytoskeleton researchers, although the potential mechanism has not been addressed experimentally or textually.

      However, their conclusion that this leads to reduced MTs and then to cellular migration and neurite formation defects is not sufficiently supported by experimental evidence.

      Response: We appreciate the reviewer’s comments. We believe that our revision plan will greatly improve the quality of our manuscript.

      Referees cross-commenting

      I completely agree with reviewer #2: At this stage the paper's conclusions are not sufficiently supported by the data. Important will be to further characterize the effect on the MTs (do they really have a different effect) and to look at the possible involvement of the motor recruitment. Maybe that a 3 to 6 months revision time would have been more accurate.

      Response: Please refer to our responses to the comments of Reviewer #2.

    1. Author Response:

      Reviewer #2 (Public Review): Gaffield and Christie trained mice to an interval task of self-initiate bouts of licking to understand how the cerebellar activity relates to the organization of well-timed transitions to motor action and inaction during discontinuous periodically performed movements. Recording and optogenetically stimulating the activities of Purkinje cells, they concluded that the cerebellum encodes and influences the motor transitions, initiation and termination of discontinuous movements. The conclusion of the paper is very interesting and potentially provides insights on the neural mechanism of the previously proposed principle that the cerebellum controls the timings of discrete movements (Ivry et al. 2002). However, in the logic and interpretation to the conclusion I have concerns which they need to address. [Major comments]

      We thank the reviewer for their positive evaluation of our work and their helpful comments. We have substantially altered our manuscript to address their concerns, including an entirely new figure as well as additional supplemental figures.

      First, the activity of Purkinje cells can largely encode each bout of licking movements, in addition to initiation and termination of movements. Figure 2BCEF displays the peak of neural activity around the water time and Figure 2DG indicates the relationship between the neural activity and lick rate. The encoding of the initiation and termination alone cannot explain these observations. Related to this, none of the panels Figure 2BCEF shows a lead of the onset of neural activity to that of the lick rates (around -5 sec to water time). This looks inconsistent with the lead shown in Figure 3. The authors need to explain why such an inconsistency can happen.

      We agree that Crus I and II PCs encode parameters of licking bouts in addition to movement initiation and termination and deeply apologize for not making this point more clearly. To address this concern, we have extensively edited the text in several sections and have added an additional figure to emphasize the richness of the PC representation of behavioral attributes, beyond just initiation and termination alone. We disagree that there is an inconsistency in the lead-time differences in our datasets. As the reviewer points out, the water-delivery-aligned firing rate z-scores do not seem to lead the licking rate (Fig. 2B-E). However, these data are averaged across trials with a high variance in the timing of lick initiation relative to water delivery; consequently, it is not possible to assess the timing of PC activity relative to lick bout initiation from these panels. When, by contrast, data are aligned to well-defined licking bouts (i.e., bouts with no licking in the preceding 2 s), it becomes clear that PC firing ramps up in advance of the bouts (Fig. 4C-D). We have edited the text, explaining this rationale, as requested by the reviewer.
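      The bout-selection criterion described above (no licking in the preceding 2 s) can be sketched as a simple filter on lick timestamps. This is a minimal illustration, not the authors' analysis code; the function name, the treatment of the first lick, and the timestamps are assumptions made for the example.

```python
def bout_onsets(lick_times, quiet_window=2.0):
    """Return lick times that start a bout, i.e. follow >= quiet_window s without licks.

    Assumption: the recording starts at t = 0, so the first lick only counts
    as a bout onset if it occurs at least quiet_window seconds into the session.
    """
    onsets = []
    previous = None
    for t in sorted(lick_times):
        if previous is None:
            if t >= quiet_window:
                onsets.append(t)
        elif t - previous >= quiet_window:
            onsets.append(t)
        previous = t
    return onsets


# Hypothetical lick timestamps in seconds (illustrative only).
licks = [0.5, 0.7, 3.0, 3.1, 3.3, 6.0]
print(bout_onsets(licks))
```

      Aligning firing rates to these onsets, rather than to water delivery, avoids averaging across trials in which licking began at widely different times.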

      Second, the positive sign of neural modulation indicates biased recording sites. So far, many studies have been indicating the increasing firing modulation at the deep cerebellar nuclei in cerebellar timing tasks and motor tasks (e.g. Ten Brinke et al. 2017 eLIFE for the eyeblink conditioning; Ohmae et al. 2017 JNS for a self-initiate timing task; Becker and Person 2019 Neuron). Ramping-up modulation of Purkinje cells is not able to activate the deep cerebellar nuclei. When the motor-driving module generates negative modulation of Purkinje cells, the neighboring modules can generate positive modulation (e.g. Ten Brinke et al. 2017 eLIFE; De Zeeuw 2021 Nat Rev; Ohmae and Medina 2014 Soc. Neurosci. Abstr.). Because the neighboring modules are much wider than the motor-driving module, recording without identifying the driving modules, as in this study, will result in the recording being biased toward the adjacent modules.

      We too were surprised that we did not observe more negatively modulating PCs. However, our craniotomy was relatively large (>2 mm square) exposing an area over Crus I and II that encompassed zebrin bands 7+, 6-, and 6+. We randomly sampled PC activity within this region, so we don’t think our recordings were necessarily “biased”. We are unaware of any definitive experiments showing whether positively and negatively modulating PCs form separate, or convergent, channels of output onto their postsynaptic targets in the cerebellar nuclei. If convergent, then the response of the nuclear neurons will be determined by an ensemble of PCs with time-varying signs of activity, in addition to the integration of the activity from pontine collaterals.

      We thank the reviewer for highlighting the developing idea of motor and non-motor cerebellar modules and the loops formed by their connectivity. We have edited our text to address how our recordings could fit into such an organizational scheme and have cited their recent unpublished preprint on this topic, now available on bioRxiv (Ohmae et al. 2021). However, we believe several considerations suggest that both positive and negative modulation of Purkinje cell firing rates will impact movement. (1) Large regions of the cerebellar cortex are capable of evoking or modulating movements when microstimulation is applied. Similarly, optogenetic suppression of IntA activity increases the outward velocity of reaching movements in mice (Becker & Person 2019). (2) In contrast with delay eyeblink conditioning, in which the motor output is an impulse-like twitch, rhythmic movements of the tongue (or, similarly, the limbs) require alternating recruitment and de-recruitment of muscles. Thus, motor commands will necessarily be multiphasic in time, and will tend to be out of phase for populations controlling antagonistic muscles. (3) Excitation of the DCN by collaterals of mossy fibers will likely modulate, and perhaps override, Purkinje cell inhibition. Therefore, further work will certainly be necessary to decipher exactly how potential antagonistic cerebellar modules participate in organizing complex motor actions.

      Third, the authors used z-scores as the unit of spike rate, but it is more appropriate to use spikes per second, as in Figure 3CD. In particular, I do not understand the meaning of a difference in spike rate expressed in units of z-score in Figure 3E. The spike rate modulation in Figure 4E looks small and should also be evaluated in spikes per second. For the analysis of the last lick, the spontaneous spike rates should be displayed instead of (or in addition to) the spike rate in the middle of lick bouts, which should be much higher than the spontaneous spike rate according to Figure 2.

      We appreciate the reviewer’s input regarding style, but the current standard in the neurophysiology field is to report firing rate comparisons from a neural population as z-scores. Z-scoring is particularly useful because this metric provides the probability of an individual score occurring within a normal distribution, as well as comparisons of scores from different normal distributions; it also indicates how far the raw score differs from the mean, information that isn’t available in spike rate comparisons alone. For these reasons, we elect not to change how we represent our data. However, we have modified our figures to report firing rates for traces from individual example cells, as z-scoring is not appropriate for this purpose.
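
      To make the z-score argument above concrete, the following is a minimal sketch, not the authors' analysis code. It assumes each cell's rate is normalized by the mean and standard deviation of a baseline window; the response does not state which reference window was used, and the function name `zscore_rate` and the two example cells are hypothetical.

```python
from statistics import fmean, pstdev

def zscore_rate(rates_hz, n_baseline):
    """Z-score a firing-rate trace (spikes/s) against its first n_baseline bins.

    Assumption: the baseline is the leading n_baseline bins of the trace.
    """
    baseline = rates_hz[:n_baseline]
    mu = fmean(baseline)        # baseline mean rate
    sigma = pstdev(baseline)    # baseline (population) standard deviation
    if sigma == 0:
        raise ValueError("baseline has zero variance; z-score is undefined")
    return [(r - mu) / sigma for r in rates_hz]

# Two hypothetical cells with a tenfold difference in absolute firing rate
# but the same relative modulation in the final bin.
cell_a = [40.0, 42.0, 38.0, 40.0, 60.0]
cell_b = [4.0, 4.2, 3.8, 4.0, 6.0]

z_a = zscore_rate(cell_a, n_baseline=4)
z_b = zscore_rate(cell_b, n_baseline=4)
```

      In this sketch the two cells yield matching z-scored traces although their raw rates differ tenfold, which is why z-scores ease pooling across a population, and also why the absolute spikes-per-second values the reviewer asks about cannot be read off a z-scored plot.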

      Fourth, I did not understand the conclusion for the optogenetic perturbation. In the Results section for Figure 7, I think there is a logical gap between the last concluding sentence and the sentences before it. The suppression of lick bouts in Figure 7D and the rebound induction in Figure 7G can be explained by the cerebellar contribution to each bout of lick movement (shown in Figure 2). I do not understand whether these observations indicate a cerebellar contribution to the initiation and termination of a sequence of lick movements. Also, I have a concern about the location of stimulation sites. The stimulation may cover both the motor-driving module and neighboring modules, which makes the observations difficult to interpret because the stimulation is not specific to the positively modulating Purkinje cells.

      A lick bout is composed of a sequence of tongue protrusions and retractions performed at a highly regular rhythm. Apart from the first lick (Bollu et al., 2021), the motor command for this behavior is under the control of central pattern generators in the brainstem. Said another way, a lick bout is a continuous movement rather than a series of discrete actions that are repeatedly started and stopped (licks are like steps during locomotion in some animals). Lick bout initiation and directional control of the bout can be commanded by the cerebral cortex. Given this organization, we do not believe our optogenetic experiment can be interpreted as an effect on the initiation and termination of individual licks because licks are not discrete actions when performed in a consummatory bout. However, based on the reviewer’s recommendation, we investigated how PCs encode information pertinent to individual licks in a bout (Figure 3). Although there was entrainment to individual lick cycles, there were no time-locked responses apparent in their average activity. Instead, there was a continuous mapping of the lick cycle across their population. Notably, licking rhythmicity was disrupted by the optogenetic perturbation, consistent with the influence of PC output on this movement parameter. We have edited the text to address these concerns.

      Fifth, for Figure 8, I had difficulty understanding what kind of Purkinje cell activity can explain the shift in the peak timing of lick rate, because in the Results sections for Figures 2-6 I could not find any activity encoding the peak timing of lick rate. For Figure 8EFG, the analysis may not be correct. Because lick onset can be delayed by the photostimulation, in Figure 8E the onset boundary corresponding to 1 s in control should be 1+alpha in stimulation trials to correctly pick up the corresponding trials. Because we do not know the exact value of alpha, I think this analysis is not possible.

      PC ramping activity may contribute to the vigor of the ensuing licking response, which would dictate the timing of the peak licking rate. In fact, in many individual PCs, we observed correlations between PC firing and lick rate, indicating a relationship. However, this was not borne out in the population response, so we did not pursue it further.

    2. Review #2 (Public Review):

      Gaffield and Christie trained mice on an interval task of self-initiated bouts of licking to understand how cerebellar activity relates to the organization of well-timed transitions to motor action and inaction during discontinuous, periodically performed movements. By recording and optogenetically stimulating the activity of Purkinje cells, they concluded that the cerebellum encodes and influences motor transitions, that is, the initiation and termination of discontinuous movements. The conclusion of the paper is very interesting and potentially provides insights into the neural mechanism of the previously proposed principle that the cerebellum controls the timing of discrete movements (Ivry et al. 2002). However, I have concerns about the logic and interpretation leading to the conclusion, which the authors need to address.

      Major comments:

      First, the activity of Purkinje cells can largely encode each bout of licking movements, in addition to the initiation and termination of movements. Figure 2BCEF displays the peak of neural activity around the water time, and Figure 2DG indicates the relationship between the neural activity and lick rate. The encoding of initiation and termination alone cannot explain these observations. Related to this, none of the panels in Figure 2BCEF shows the onset of neural activity leading that of the lick rate (around -5 sec relative to water time). This looks inconsistent with the lead shown in Figure 3. The authors need to explain why such an inconsistency can arise.

      Second, the positive sign of neural modulation indicates biased recording sites. So far, many studies have reported increasing firing modulation in the deep cerebellar nuclei in cerebellar timing tasks and motor tasks (e.g. Ten Brinke et al. 2017 eLife for eyeblink conditioning; Ohmae et al. 2017 JNS for a self-initiated timing task; Becker and Person 2019 Neuron). Ramping-up modulation of Purkinje cells is not able to activate the deep cerebellar nuclei. When the motor-driving module generates negative modulation of Purkinje cells, the neighboring modules can generate positive modulation (e.g. Ten Brinke et al. 2017 eLife; De Zeeuw 2021 Nat Rev; Ohmae and Medina 2014 Soc. Neurosci. Abstr.). Because the neighboring modules are much wider than the motor-driving module, recording without identifying the driving modules, as in this study, will result in the recording being biased toward the adjacent modules.

      Third, the authors used z-scores as the unit of spike rate, but it is more appropriate to use spikes per second, as in Figure 3CD. In particular, I do not understand the meaning of a difference in spike rate expressed in units of z-score in Figure 3E. The spike rate modulation in Figure 4E looks small and should also be evaluated in spikes per second. For the analysis of the last lick, the spontaneous spike rates should be displayed instead of (or in addition to) the spike rate in the middle of lick bouts, which should be much higher than the spontaneous spike rate according to Figure 2.

      Fourth, I did not understand the conclusion for the optogenetic perturbation. In the Results section for Figure 7, I think there is a logical gap between the last concluding sentence and the sentences before it. The suppression of lick bouts in Figure 7D and the rebound induction in Figure 7G can be explained by the cerebellar contribution to each bout of lick movement (shown in Figure 2). I do not understand whether these observations indicate a cerebellar contribution to the initiation and termination of a sequence of lick movements. Also, I have a concern about the location of stimulation sites. The stimulation may cover both the motor-driving module and neighboring modules, which makes the observations difficult to interpret because the stimulation is not specific to the positively modulating Purkinje cells.

      Fifth, for Figure 8, I had difficulty understanding what kind of Purkinje cell activity can explain the shift in the peak timing of lick rate, because in the Results sections for Figures 2-6 I could not find any activity encoding the peak timing of lick rate. For Figure 8EFG, the analysis may not be correct. Because lick onset can be delayed by the photostimulation, in Figure 8E the onset boundary corresponding to 1 s in control should be 1+alpha in stimulation trials to correctly pick up the corresponding trials. Because we do not know the exact value of alpha, I think this analysis is not possible.

    1. SciScore for 10.1101/2022.01.23.22269214:

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      <table><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Ethics</td><td style="min-width:100px;border-bottom:1px solid lightgray">Consent: Conventionally, the ethical approval and consent were obtained from the CRSTRA and all participants.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Sex as a biological variable</td><td style="min-width:100px;border-bottom:1px solid lightgray">= Other, 2=African / Afro-American, 3= Caucasian, 4= Arabic, 5= Asian, 6= Latino) - Gender (0=Not precise, 1= Male, 2= Female, 3= Other).</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Randomization</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Blinding</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr><tr><td style="min-width:100px;margin-right:1em; border-right:1px solid lightgray; border-bottom:1px solid lightgray">Power Analysis</td><td style="min-width:100px;border-bottom:1px solid lightgray">not detected.</td></tr></table>

      Table 2: Resources

      <table><tr><th style="min-width:100px;text-align:center; padding-top:4px;" colspan="2">Software and Algorithms</th></tr><tr><td style="min-width:100px;text-align:center">Sentences</td><td style="min-width:100px;text-align:center">Resources</td></tr><tr><td style="min-width:100px;vertical-align:top;border-bottom:1px solid lightgray">Statistical analysis: Statistical analyses were performed using SAS® (version 9.4).</td><td style="min-width:100px;border-bottom:1px solid lightgray"><div style="margin-bottom:8px"><div>SAS®</div><div>suggested: (SASqPCR, RRID:SCR_003056)</div></div></td></tr></table>

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
      4.3 Methodological limitations: Our analysis was limited to a number of factors considered as relevant based on the literature review, and which could be ascertained using an online questionnaire. However, different studies also pointed out a number of other potential risk factors. Objectively, it is an almost impossible challenge to know exactly the factors responsible for infection and transmission of COVID-19. Sources may be incomplete; apart from the factors discussed previously, even the meteorological ones were considered a potential explanation [164]. A study in Korea demonstrated that the environment plays a significant role in the spread of COVID-19, but like any factor, it may have also been impacted by various additional features [165]. Hence, further studies are needed to protect people from COVID-19 transmission, specifically on infection dynamics and the mode of transmission, e.g., cluster spaces, closed spaces, and indoor environments [166]. At the individual level, everyone must take the maximum possible precautions. It should also be remembered that no less than 10 reasons supporting airborne transmission were phrased recently by Greenhalgh et al. [167]. The long-term health consequences of COVID-19 remain unclear and continue to be studied [168]. Therefore, it is preferable to avoid any form of infection, even mild. Another factor that we do not necessarily think about and which may be important is the wastewater treatment and disinfection strategies with ch...

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • No funding statement was detected.
      • No protocol registration statement was detected.

      Results from scite Reference Check: We found no unreliable references.


      <footer>

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

      </footer>

    1. Mechanization may yet force the issue, especially in the scientific field; whereupon scientific jargon would become still less intelligible to the layman.

      Just detailing the process of mechanization.