  1. Jan 2021
    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 5 2021, follows.

      Summary

      In this manuscript, Olive and colleagues used a genetic screen to identify Complex I (CI) of the electron transport chain (ETC) as a regulator of IFNg-mediated gene expression in macrophages. They attribute this role of CI to effects on the activity of the JAK-STAT pathway downstream of the IFNg receptor.

      While a potential link between CI activity and the activity of the JAK-STAT pathway would be interesting, the reviewers think that additional analyses are needed to substantiate this claim and rule out alternative interpretations.

      Essential Revisions

      1) Lines 204-205: The authors find that sgRNAs targeting other complexes of the ETC, including CIII and CIV, had no effect on the ability of IFNg to stimulate expression of cell surface markers. How do the authors interpret these findings, since CI does not work in isolation in the ETC and is rather dependent on CIII and CIV activity?

      2) How does IFNg stimulation affect oxidative metabolism as assessed by Seahorse? In order to corroborate the authors' conclusions regarding activity of individual ETC complexes (point 1 above), Seahorse analysis of individual complexes is also advised.

      3) The authors do some limited analyses in human MDMs to suggest that their findings in the mouse macrophage cell line can be generalized to other macrophage populations. It would be great if the analyses in the human MDMs could be extended to further strengthen the generality of their central findings.

      4) Fig 6D: Not clear whether similar exposures were used in different panels. Would be better to load samples in the same gel so that the same exposure can be used and a direct comparison between conditions can be made.

      5) Fig 6D: Does acute treatment with rotenone (but not inhibitors of other ETC complexes) have similar effects in reducing JAK-STAT signaling as knockdown of CI subunits? If not, then stable, long-term knockdown of CI subunits may have some effect independent of respiration in influencing JAK-STAT signaling (for example, on expression of some component of the JAK-STAT pathway). This interpretation could also explain why knockdown of other components of the ETC does not have similar effects to knockdown of CI. Rotenone treatment could be tried (and compared with inhibitors of other ETC complexes), and if the data differ from those obtained with knockdown of CI subunits, then related data in the study could be re-interpreted and conclusions modified.

      6) In Fig. 3H a key control is missing. What about survival of the cells when the import of the only energy substrate is blocked?

      7) The authors could consider placing their findings in the context of the broader literature. (As just one example, Ivashkiv Nat Imm 2015 described a role for mTORC1 and metabolism in IFNg-mediated transcriptional and translational regulation in macrophages.) This would increase the impact of their findings.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 7 2021, follows.

      Summary

      In this study, Olive and colleagues used a genetic screen to identify new regulators underpinning the ability of the cytokine IFNg to upregulate MHC class II molecules, of relevance to our understanding of how macrophages are activated by IFNg to confer host defense during microbial infection. They identified the signaling protein GSK3b, and MED16, a subunit of the Mediator complex previously implicated in gene induction.

      Essential Revisions

      1) Experimental treatment with IFNg may not be physiological. In key experiments, authors should try co-culture with activated NK cells +/- IFNg neutralization. A dose and time response curve of IFNg treatment may be valuable in key experiments.

      2) Comparison to cells not stimulated with IFNg is needed in key experiments. Comparison to WT cells is needed in Fig 5A,B.

      3) Stimulation with Type I IFN and other PAMPs in key experiments, as comparison to the effects of IFNg and to broaden the relevance of their findings.

      4) More insight into how IFNg signaling interfaces with GSK3 and MED16 is needed (e.g. role of mTORC1 pathway in regulating GSK3).

      5) Can the authors extend their data to an in vivo setting?

      6) Can the authors clarify the relative roles of GSK3a and GSK3b? For example, how do the authors explain the lack of a robust phenotype in Fig 3B-F?

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 17 2020, follows.

      Summary

      In the paper, the authors used metabolomics to identify valine and TDCA as metabolites depleted in diet-induced obesity (DIO) and replenished after sleeve gastrectomies (SGx) in mice. Intraperitoneal injection of these two metabolites mimics many of the benefits of SGx, including weight loss, reduced adipose stores and improved insulin sensitivity. These benefits are related to Val/TDCA's ability to reduce food intake without altering locomotor activity, leading to a negative energy balance. Val/TDCA injection eliminated the fasting-associated rise in hypothalamic MCH expression in obese mice, and central injections of recombinant MCH blunted weight loss induced by Val/TDCA. Overall, this paper reports interesting and surprising observations related to the impact of metabolomic disturbances in obesity, and suggests a role for Val and/or TDCA in regulating food intake through MCH.

      Essential Revisions

      1) It is unclear from the data whether the effects are derived from valine, TDCA, or both. Both reviewers felt that any reader would want to see experiments where either of these metabolites is injected alone.

      2) No quantitative metabolite concentration values are provided anywhere, making it difficult to evaluate the robustness of the data. How much do the levels of TDCA and valine change with SGx in mice and humans, and what levels are achieved with the injections of these metabolites in the mice?

    1. Reviewer #2:

      In this manuscript, Knight et al. examine the genetic diversity in >12,000 publicly available C. difficile genomes in order to characterize genomic evidence of taxonomic incoherence within this genomically diverse pathogen. Their primary analysis employs average nucleotide identity thresholds to identify species boundaries, with secondary analyses examining core genome size changes, gene content, and estimated emergence dates. The authors' main conclusion is that the previously identified C. difficile cryptic clades CI-III are genomically divergent enough from the main clades C1-5 to warrant classification as different genomospecies. This paper is a useful contribution in benchmarking our understanding of the genetic diversity of C. difficile using all currently publicly available genomes, but the results are largely unsurprising given previous phylogenetic analyses involving clades 1-5 and CI-III, and the paper is therefore probably best suited for a specialty journal. Additionally, in some instances, the methods lack detail, reducing their interpretability and reproducibility.

      Major Comments:

      1) Some claims are too strong and not supported by the data or literature, including the claim that the rise of community-associated CDI is likely due to the presence of C. difficile in livestock (Lines 53-54 - far too little evidence to make such a sweeping claim), the statement of apparent rapid population expansion into clades C1-4 (Lines 278-279 - only shown for certain sequence types and greatly impacted by observation bias), and the statement that these findings "impacts the diagnosis of CDI worldwide" (Lines 37-38 - too grandiose given limited evidence of the clinical importance of the cryptic clades).

      2) Generally, it is hard to discern which sets of genomes and variants were used for each of the bioinformatic analyses that are described. If there are a limited number of genome sets it might be useful to define them in the results to allow the reader to more easily follow along and understand the scope of different analyses.

      3) The dated phylogenomic analyses methods would benefit from a more thorough assessment of model assumptions along with more description of the sources of bias and uncertainty at play. Specific questions are:

      • Was the temporal signal in the data evaluated?

      • What are the potential impacts of using a single clock model and demographic prior for such a diverse set of taxa?

      • Was the clock rate restricted to the cited 2.5 x 10^-9 to 1.5 x 10^-8 range? What clock prior distribution was applied?

      • Were relaxed clock priors explored?

      • What went into the selection of the demographic model prior in BEAST? Were alternative models evaluated?

      • The significant uncertainty in the divergence estimates should be emphasized/listed as a limitation.

      4) Similarly, the pangenome analyses could be more thoroughly described, and the relevance of the core-genome size changes more robustly explored. Specifically:

      • How did the core genome change when excluding any of C1-5? Were these changes much different than when excluding CI-III?

      • The differences between Roary and Panaroo are notable, and potentially important for the microbial genomics community. More details should be provided on these results and how sensitive they are to the input parameters of the respective programs (e.g. collapsing paralogs in Roary and percent identity for orthologs). In addition, it is important to know if any filtering was done with respect to the quality of assemblies, which could have a significant impact on Roary's behavior.

    2. Reviewer #1:

      General Assessment:

      The work presented by Knight et al. in "Major genetic discontinuity and novel toxigenic species in Clostridioides difficile taxonomy" is of excellent quality and spans several of the themes of eLife. The manuscript provides a thorough and robust examination of publicly available C. difficile genomes, to deliver a much-needed update of C. difficile phylogeny, in particular the cryptic clades of C. difficile. However, some further clarifications could be included to confirm whether the cryptic clades of C. difficile, and the 26 unclassified STs (which seemingly form 4 distinct clusters), should indeed be assigned to the Clostridioides genus, distinct from both C. mangenotii and C. difficile.

      Specific comments:

      Lines 96-97 and Figure 2: Figure 2 suggests the 26 unclassified STs form at least 4 distinct clusters, yet these STs are classified as outliers. Could you please comment on why these are considered outliers? Or do these STs represent new cryptic clades? C-IV, C-V etc.? And do these unclassified STs also fit into the criteria for the novel independent Clostridioides genomospecies?

      Lines 161-162; Table 1: C. mangenotii is referred to as Clostridioides mangenotii on lines 161-162, but has been listed as Clostridium mangenotii in table 1. Was this intentional? Or should this be Clostridioides mangenotii as C. difficile is also listed as Clostridioides difficile?

      Figure 6: Many of the numbers and symbols on the figure are difficult to see e.g. Figure 6A the values listed above each data point are extremely small. Can these values/symbols be increased?

      Lines 224-225: Given that C. difficile strains lacking tcdA and tcdB can still cause infections, consider rephrasing "indicating their ability to cause CDI".

      Figure 7: As with Figure 6, many of the numbers and symbols on the figure are difficult to see. Can these values/symbols be increased?

      General comments:

      Were the unclassified STs included in the species wide ANI analyses in Figure 3? If similar analyses were performed for these STs and given the clusters that are presented in Figure 2 would this support the idea that they may also fit into the criteria for the novel independent Clostridioides genomospecies?

      Similarly, were these same unclassified STs included in the BactDating and BEAST analyses? Or the pairwise ANI and 16S rRNA value comparisons in Figure 5? Or the pangenome and toxin gene analysis also presented in Figures 6 and 7? And would this add further strength to the idea that these "outliers" could be the first typed representatives of additional genomospecies?

      Lastly, your conclusions are a little too on the fence. You have presented sufficient evidence to suggest that the cryptic clades of C. difficile likely represent novel independent Clostridioides genomospecies, but dilute out the importance of this throughout the discussion and conclusions. Although controversial, the evidence provided gives credence to these claims, and the text should be changed to reflect this.

    1. Reviewer #2:

      Recombinant antibodies are the most common and powerful reagents in life science research to identify and study proteins. Yet, every single antibody should always be validated and carefully tested for its relevant application, to ensure constructive and reproducible scientific endeavor. I was thus extremely pleased to review the manuscript of Terkild Buus et al., as it provides a careful assessment of oligo-conjugated antibody signal in CITE-seq. The authors tested four variables (antibody concentration, staining volume, cell numbers and tissue origin) and clearly showed that antibody titration is a crucial step in optimizing a CITE-seq panel. The authors found that, as a general rule, concentrations in the 0.625 to 2.5 µg/mL range provide the best results, while the concentrations recommended by vendors, in the 5 to 10 µg/mL range, increase background signal.

      In my opinion, the study is well-performed and may serve as a guideline to accurately validate antibodies for CITE-seq; as a consequence, I have only minor comments.

      • As stated by the authors, the starting concentration used for each antibody was based on historical experience and assumptions about the abundance of the epitopes. This approach may not be ideal, and the optimal concentration may have been missed. Do the authors think that a proper titration would be an advantage? Maybe this could be discussed in the text.

      • By testing four variables (see above), the authors showed that they could define optimal conditions to reduce background signal and increase antibody sensitivity, thereby improving CITE-seq outcomes. Nevertheless, the authors rely on the assumption that all antibodies used in their panel are specific for their targeted antigens. I am not asking here to test the specificity of every single antibody used in the study, as this would be a colossal amount of work. But I feel that this aspect should be discussed in the manuscript, especially when an "uncommon" antibody is intended to be used in the CITE-seq panel; the specificity of such an antibody should indeed be tested prior to its use.

    2. Reviewer #1:

      In the study by Buus et al., the authors set out to address an important need to understand how oligo-conjugated antibodies should be optimally utilized in droplet-based scRNA-seq studies. These techniques, often referred to as CITE-seq, complement techniques such as flow cytometry and mass cytometry yet also further extend them by the ability to jointly measure intra-cellular RNA-based cell states together with antibody-based measurements. As is the case with flow cytometry, manufacturers provide staining recommendations, yet encourage users to titrate antibodies on their specific samples in order to derive a final staining panel. Based on the ability to stain with hundreds of antibodies jointly, few studies to date have assessed how the antibodies present in these pre-made staining panels respond to a standard titration curve. In order to address this point, this study tests two dilution factors, staining volume, cell count, and tissue of origin to understand the relationships between signal and background for a commercially available antibody panel. They arrive at the general recommendation that these panels could be improved, grouping various antibodies into distinct categories.

      This study is of general interest to the scRNA-seq and CITE-seq communities as it draws attention to this important aspect of CITE-seq panel design. However, it would stand to be substantially improved by not only providing suggestions but also testing at least one, if not more, of their suggestions from Supplementary Table 2, and preferably performing experiments using more technical replicates or biological replicates. As it stands now, the study is largely based on one PBMC and one lung sample, that were stained once with each manipulation as far as can be gathered from the Methods.

      Major comments:

      1) Given the title is improving oligo-conjugated antibody... it would be important to functionally test one of the suggestions. We would suggest a full titration curve of selected antibodies, perhaps one from each of the categories, but if cost is a concern at least two or three antibodies, to identify how titration impacts antibodies, and especially those in categories labeled as in need of improvement. Relatedly, if the idea is that antibodies (such as gD-TCR) lacking a cognate receptor lead to general background spread, does spiking in a known-positive cell type at increasing ratios remedy this issue by acting as a target for the antibodies? Does adding extra washes help to remedy these issues of background?

      2) Another way of improving these panels is through reducing the costs spent on both staining and, perhaps more importantly, the sequencing-based readouts. Several times in the manuscript (at line 77 or line 277, for example) it is alluded to that the background signal of antibodies can make up a substantial cost of sequencing these libraries. However, no formal data on cost are presented, which would be important to formalize the authors' points. It would be important to provide cost calculations and recommendations on sequencing depth of ADT libraries based on variation of staining concentration. Relatedly, in the methods, the sequencing platform and read depth for ADT libraries were not discussed, nor are the RNA-seq quality control metrics provided other than a mention of ~5,000 reads/cell targeted. This is important to report in all transcriptomic studies, and especially in a methods development study.

      3) One of the powerful elements of joint multi-modal profiling, as mentioned in the title, is to be able to measure protein and RNA from a single cell. This study does not formally look at correlation of protein and RNA levels, and whether a decrease in concentration of antibody either improves or diminishes this correlation. This would be important to test within this study to ensure that decreasing antibody levels does not then adversely affect the power of correlating protein with RNA, and whether it may even improve it.
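
      The comparison requested in point 3 can be sketched as follows. This is a minimal illustration, assuming per-cell RNA and ADT counts for a single marker are available at each staining concentration; the function names and all numbers are hypothetical and are not taken from the manuscript.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def correlation_by_concentration(rna_counts, adt_table):
    """Return {concentration: Pearson r between RNA and ADT counts}."""
    return {conc: pearson(rna_counts, adt) for conc, adt in adt_table.items()}


# Hypothetical per-cell values for one marker (made-up numbers, not study data):
# RNA counts, and ADT counts measured at two staining concentrations.
rna = [0, 1, 0, 5, 7, 6, 0, 8]
adt_by_concentration = {
    "high": [40, 55, 38, 90, 120, 100, 45, 130],
    "low": [2, 4, 1, 60, 95, 80, 3, 110],
}

result = correlation_by_concentration(rna, adt_by_concentration)
```

      Comparing the values in `result` across concentrations would indicate whether reducing antibody concentration weakens or strengthens the protein-RNA relationship. A rank-based coefficient may be preferable for sparse count data; Pearson is used here only to keep the sketch short.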

      4) How was the lack of antibody binding determined for Category E? CD56 is frequently detected on NK cells in peripheral blood, CD117 should be detected on mast cells in the lung, and CD127 should be found on T cells, particularly CD8+ T cells. From inspecting Figure 1E, it appears as if all three of these markers are detected on small but consistent cell subsets. As the clusters are only numbered and no supplementary table is provided to help the reader in their interpretation, it is difficult to determine if these represent rare but specific binding, or have not bound with any specificity.

      5) References: At 14 references, the paper overall could benefit from a more comprehensive citation of related literature including flow cytometry and/or CyTOF best practices for antibody staining and dealing with background, and joint RNA and protein measurement from single cells.

    1. Reviewer #2:

      In this paper, Numssen and co-workers focus on the functional differences between hemispheres to investigate the "domain-role" of IPL in different types of mental processes. They employ multivariate pattern-learning algorithms to assess the specific involvement of two IPL subregions in three tasks: an attentional task (Attention), a semantic task (Semantics) and a social task (Social cognition). The authors describe how, when involved in different tasks, each right and left IPL subregion recruits a different pattern of connected areas.

      The employed tasks are "well established", and the results confirm previous findings. However, the novelty of the paper lies in the fact that the authors use these results as a tool to observe IPL activity when involved in different domains of cognition.

      The methodology is sound, well explained in the method section, the analyses are appropriate, and the results clear and well explained in the text and in graphic format.

      However, a solid experimental design is required to provide strong results. In the reviewer's view, the employed design can provide interesting results about functional connectivity, but not about the functional role of IPL in the investigated functions.

      I think the study would be correct and much more interesting if only based on functional connectivity data. Note that rewriting the paper accordingly would lead to a thorough discussion about how anatomical circuits are differently recruited based on different cognitive demands and about the variable role of cortical regions in functional tasks. This issue is neglected in the present discussion, and this concept is in disagreement with the main results, suggesting (probably beyond the intention of the authors) that different parts of the right and left IPL are the areas responsible for the studied functions.

      Major points:

      1) The 3 chosen tasks explore functions that are widespread in the brain, and are not specifically aimed at investigating IPL. The results (see, e.g., Fig. 1) confirm this idea, but the authors specifically focus on IPL. This seems a rather arbitrary and unjustified choice. If they want to explore the lateralization issue, they should consider the whole set of involved areas or use tasks that all show their maximal activation in IPL.

      2) The authors aim to study lateralization using an attentional task, considering the violation of a prediction (invalid>valid), a linguistic task, looking for an activation related to word identification (word>pseudoword) and a social task, considering correct perspective taking (false belief>true belief), but they do not consider that in all cases a movement (key press) is required. It is well known that IPL is also a key area for creating motor commands and guiding movements. Accordingly, the lateralization bias observed could be due more to the imbalance between effectors while issuing the motor command than to a different involvement of IPL regions in the specific task functions.

      3) As with point 2, the position of the keys is also crucial if the authors want to explore lateralization. This is especially important if one considers that IPL plays a major role in spatial attention (e.g. Neglect syndrome). In the Methods, the authors simply say "Button assignments were randomized across subjects and kept identical across sessions"; this should be explained in more detail.

      4) The authors clearly know the anatomical complexity of IPL well; however, their results refer to two large multi-areal regions. This seems at odds with all the descriptions related to Fig. 2. If they don't find any more subtle distinction within these 2 macro-regions, they should at least discuss this discrepancy.

      5) The part about Task-specific network connectivity is indeed very interesting, and I would suggest that the authors focus exclusively on this part. (Note that the results of this part seem to confirm that only the linguistic task is able to show a clear lateralization.)

    2. Reviewer #1:

      The authors have performed a rare feat in the study of the posterior parietal cortex, which is to achieve a functional parcellation of this crucial area on the basis of its response during a diverse set of tasks. The variety of tasks and the analytical approach married to it are very strong and lead to a division that agrees well with data from patients with lesions and studies in homologous areas of non-human primates.

      Readers are encouraged to note the analytical approach, with particular regard for the permutation testing that establishes the differences between the tasks in the functional connectivity of the area.
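
      For readers unfamiliar with the general technique, a label-shuffling permutation test can be sketched as below. This is a generic two-sample test on a difference of means, offered only as an illustration under assumed inputs; it is not the procedure used in the manuscript, and the function name and example values are hypothetical.

```python
import random


def permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference of group means.

    Returns the observed difference and a p-value estimated by shuffling
    group labels. Generic sketch, not the manuscript's actual procedure.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(perm_a) / n_a - sum(perm_b) / len(perm_b))
        # Small tolerance guards against floating-point rounding in the sums.
        if diff >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical connectivity values for one region pair under two tasks.
task_one = [0.10, 0.20, 0.15, 0.12, 0.18, 0.14, 0.16, 0.11]
task_two = [0.80, 0.85, 0.90, 0.78, 0.82, 0.88, 0.79, 0.86]
observed_diff, p = permutation_test(task_one, task_two)
```

      The p-value is the fraction of label shufflings that produce a difference at least as large as the observed one, so it makes no distributional assumption about the connectivity values.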

      Conceptually, this paper is another strong argument for understanding the broad role of the posterior parietal cortex across tasks and points to the flexibility of its functional response in supporting those roles.

      This manuscript lays out a series of fMRI investigations and analyses centered on examining the response of the IPL during three different tasks (attention, semantics, social cognition). The analyses are largely data-driven and examine functional response and connectivity, to make the argument for a functional parcellation of the IPL into at least two distinct subregions. The manuscript is well-written and the analyses well described. There are some concerns about the analyses that dampen enthusiasm slightly and a lack of consideration of the associated literature in non-human primates, but these problems seem eminently correctable.

      The analyses begin with a data-driven cluster analysis across an anatomically constrained IPL ROI, searching for cluster solutions that efficiently parcellate IPL on the basis of the response of voxels across the three tasks. This analysis is fine, but does constrain the average activity in the identified clusters to differ across the tasks. That makes the univariate activation in 3b a bit circular and hard to interpret. Either the error bars should be removed and a note added that the univariate activity is purely descriptive or the univariate data should be displayed from a slice of the data that did not contribute to the derivation of the clusters. The strongest version of this analysis would hold out entire participants.

      The predictive coding analysis is potentially informative but the details were a bit unclear. In the one versus rest analysis the strongest test would be to build the model on the data from n-1 participants and then test it on the trials of the held-out participant. If this was not done, some justification for not doing it would be in order.
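
      The leave-one-participant-out scheme suggested above can be sketched as follows. This is a minimal illustration under stated assumptions: each trial is reduced to a single hypothetical feature value, and a nearest-centroid rule stands in for whatever classifier the authors actually used. All names and numbers are invented for the example.

```python
def leave_one_participant_out(participant_ids):
    """Yield (held_out_id, train_indices, test_indices) for each participant."""
    for held_out in sorted(set(participant_ids)):
        train = [i for i, p in enumerate(participant_ids) if p != held_out]
        test = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train, test


def nearest_centroid_fit(features, labels):
    """Return {label: mean feature value}; a stand-in for the real classifier."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def nearest_centroid_predict(centroids, x):
    """Predict the label whose centroid lies closest to x."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))


def cross_validated_accuracy(features, labels, participant_ids):
    """Train on n-1 participants, test on the held-out one, for every participant."""
    accuracies = {}
    for held_out, train, test in leave_one_participant_out(participant_ids):
        model = nearest_centroid_fit(
            [features[i] for i in train], [labels[i] for i in train]
        )
        hits = sum(
            nearest_centroid_predict(model, features[i]) == labels[i] for i in test
        )
        accuracies[held_out] = hits / len(test)
    return accuracies


# Hypothetical trials: one feature per trial, labelled one-versus-rest.
participants = ["s1", "s1", "s2", "s2", "s3", "s3"]
labels = ["attention", "rest", "attention", "rest", "attention", "rest"]
features = [1.0, 0.1, 0.9, 0.2, 1.1, 0.0]
accuracy = cross_validated_accuracy(features, labels, participants)
```

      Because no trial from the held-out participant contributes to model fitting, the per-participant accuracies estimate how well the model generalizes to unseen individuals rather than to unseen trials from already-seen individuals.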

      Finally, the authors should also consider integrating some of the non-human primate literature as it only strengthens their case. In the human literature the IPL has proved a tough nut to crack, but the single unit physiology has revealed strong differences in the homologous areas of macaque, some of which directly map onto the division argued for here.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 10 2021, follows.

      Summary

      The reviewers agree that this is an interesting and useful contribution for understanding LQ extinctions, and that it is generally well-presented. It shows that the factors that increase extinction risk are de-coupled from the factors that eventually lead to extinction and thus determine its timing. However, the reviewers also note that although the modelling approach is novel, it is reliant on datasets that are biased and at times these biases are not well-accounted for. Because many of the conclusions drawn from the modelling could already be drawn from existing records and from literature that is glossed over here, attention to that literature should be improved and the contributions beyond the megafauna debate should be emphasized. Furthermore, the authors should take care to improve clarity in the framing of the models, the presentation and interpretation of results, figures, and discussion.

      Essential Revisions

      1) ADDITIONAL ANALYSES (no additional data collection). The reviewers had specific concerns about the effects of sampling on the extinction chronology and the influence of body mass on a number of things (recovery potential, life history/demographic correlates, etc). Specifically, the analytical issues that present the biggest problems revolve around sampling uncertainty and body mass correlation. The former could be addressed by introducing some sensitivity tests. These could be directed towards chronological biases (how does removing one date affect the confidence intervals?), as well as geographical sampling biases (how does removing a region affect the trends?). The latter in particular would be important in the claims of a continental trend. It is also possible that biases are a function of taxon sampling. There are an increasing number of small mammal Pleistocene extinctions being recognized in Australia, and it is unclear if these follow the same trends as the megafauna. If so, that would indeed remove the body size issues.
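
      The chronological sensitivity test suggested above can be sketched as a leave-one-out loop. This is only an illustration: the mean with a normal-approximation interval is a placeholder statistic, since the estimator used for extinction timing is not specified here, and the site labels and ages are hypothetical.

```python
from statistics import mean, stdev


def mean_with_interval(values, z=1.96):
    """Mean and a normal-approximation 95% interval (an illustrative statistic)."""
    m = mean(values)
    half_width = z * stdev(values) / len(values) ** 0.5
    return m, m - half_width, m + half_width


def leave_one_out_sensitivity(records):
    """Recompute the interval with each record dropped in turn.

    `records` maps a record label to its age; the result maps each dropped
    label to the (mean, low, high) obtained from the remaining records.
    """
    results = {}
    for dropped in records:
        kept = [age for label, age in records.items() if label != dropped]
        results[dropped] = mean_with_interval(kept)
    return results


# Hypothetical dated records (thousands of years before present); not real data.
dates_ka = {
    "site_A": 46.0,
    "site_B": 44.5,
    "site_C": 47.2,
    "site_D": 45.1,
    "site_E": 39.0,
}

full = mean_with_interval(list(dates_ka.values()))
sensitivity = leave_one_out_sensitivity(dates_ka)

# The record whose removal shifts the estimate the most.
most_influential = max(sensitivity, key=lambda k: abs(sensitivity[k][0] - full[0]))
```

      The same loop applies to geographical sampling bias if the keys are regions and each iteration drops all records from one region before re-fitting.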

      2) BETTER FRAMING OF THE FIVE PUTATIVE DRIVERS OF EXTINCTIONS:

      (i) appears to assume that only human hunting will differentially affect demographically sensitive species. However, novel or extreme climate change can also affect such species (e.g. Selwood, K.E., McGeoch, M.A. and Mac Nally, R., 2015. The effects of climate change and land‐use change on demographic rates and population viability. Biological Reviews, 90(3), pp.837-853.)

      (ii) this mechanism is predicated on using a modelling result [ref. 25] as data. It also makes the bold claim that species inhabiting certain habitats are less accessible to human hunters without any consideration of the archaeological or modern record on this point (e.g. Roberts, P., Hunt, C., Arroyo-Kalin, M., Evans, D. and Boivin, N., 2017. The deep human prehistory of global tropical forests and its relevance for modern conservation. Nature Plants, 3(8), pp.1-9; Fa, J.E. and Brown, D., 2009. Impacts of hunting on mammals in African tropical moist forests: a review and synthesis. Mammal Review, 39(4), pp.231-264).

      (iv) many of the supporting references here do not seem like logical choices for this argument. E.g. [28] refers to coral-reef fishes. Moreover, this hypothesis conflicts with much modern data showing that extinction risk and body size are correlated under climate and environmental change (e.g. Cardillo, M., Mace, G.M., Jones, K.E., Bielby, J., Bininda-Emonds, O.R., Sechrest, W., Orme, C.D.L. and Purvis, A., 2005. Multiple causes of high extinction risk in large mammal species. Science, 309(5738), pp.1239-1241. Liow, L.H., Fortelius, M., Bingham, E., Lintulaakso, K., Mannila, H., Flynn, L. and Stenseth, N.C., 2008. Higher origination and extinction rates in larger mammals. Proceedings of the National Academy of Sciences, 105(16), pp.6097-6102. Tomiya, S., 2013. Body size and extinction risk in terrestrial mammals above the species level. The American Naturalist, 182(6), pp.E196-E214.)

      3) MORE NUANCED INTERPRETATION OF MODEL OUTPUT.

      The major weakness in this manuscript is in the discussion. The authors should be very clear in their discussion that their model does not indicate that demographic factors had no part in extinction events per se, but rather that they don't explain extinction chronology. Extinction chronologies reflect a number of different factors and processes, but they don't take away from the fact that certain life history traits can make a species more likely to go extinct from those factors.

      The authors seem to argue that demographics don't explain the megafaunal extinction in the Sahul, but in fact, their results suggest that they do; the only thing demographics by themselves don't explain is the chronology. Extinction risk as determined by demographic susceptibility is highly related to body mass and generation time (which in turn is also affected by body mass) but differential survival (timing of extinction) is determined by factors such as geographic range size, dispersal ability, access to refugia, and behavioral and morphological adaptations against hunting, and the ability to survive catastrophic events. A reiteration of this point would be beneficial to the clarity of this otherwise well written manuscript.

      The authors clearly (and elegantly) show that extinct species, which were all large and had long generation times, had demographic traits that made them more susceptible to extinction. This is evident in figures 3 and 4. However, in the discussion, in lines 301-303, they state that no demographic trends explain the extinction. This is not supported by the results. While the timing of when species go extinct doesn't correlate with demographic susceptibility, the peculiar nature of the extinction (a large-size-biased extinction) is explained by demographic factors, and is a phenomenon that has been explored in a global analysis by Lyons et al. 2016 Biol. Lett. Therefore, demographic trends DO explain why certain species go extinct, while others survive. The authors should be careful when they say that "no obvious demographic trends can explain the great Sahul mass extinction event"; instead, they should reiterate that no obvious demographic trend explains the extinction chronology.

      4) MORE CAREFUL DISCUSSION OF RESULTS RELATIVE TO LITERATURE. The authors further suggest that their results indicate that the extinctions were random, but the size-selectivity clearly shows that the extinctions were in fact not random with respect to body size. Their analyses do show that the rate of extinction doesn't exceed background to the degree suggested in prior studies, and this is something that researchers need to explore further. Also, the authors raise an important point in lines 309-311 that human hunting could have interacted with demographic susceptibility, something that Lyons et al. 2016 Biol. Lett. show, and the results of the present study should be discussed in light of the 2016 paper.

      They also raise an important point in lines 312-320 that behavioral or morphological adaptations may have allowed some seemingly "high risk" species to persist despite anthropogenic pressure. These model "mis-matches" have been reported by Alroy 2001 Science as well in a multispecies overkill simulation. It would be beneficial to discuss the present results within the context of other examples of model mismatches, such as those from Alroy 2001.

      In lines 353-358, the authors once again state that their results show no clear relationship between body mass and demographic disadvantage, despite clearly showing these relationships in Figures 3 and 4, and even stating as much in the beginning of the discussion. The plots clearly show that large-bodied taxa were at a demographic disadvantage. There is a difference between explaining why certain taxa go extinct vs. why they go extinct at a certain point in time, and this should be made clear. The authors are correct in stating that demographic factors don't explain the relative extinction chronology, i.e. when species go extinct relative to each other, but they do explain why large species go extinct, and why these extinctions take place after human arrival. Moreover, generation length, which is also correlated with demographic susceptibility, is highly correlated with body mass (Brook and Bowman 2005 Pop. Ecol), once again showing that body mass-related effects do help explain the extinctions.

      The authors rightfully point out earlier in the discussion that spatial variation, local climates, ecological interactions, etc. all influence how and why a particular population disappears. Extinction chronologies reflect a number of different factors and processes, but they don't take away from the fact that certain life history traits can make a species more likely to go extinct from those factors. Large proboscideans like mammoths had a high risk of extinction based on life history traits, but managed to survive on island refugia into the mid-Holocene. Similar other examples exist, and show that extinction chronologies can vary vastly.

      Therefore, the lack of correlation can be explained by these factors, and the authors need to expand on these in their discussion, perhaps if possible, by giving specific examples. They should be more careful in their discussion by clearly distinguishing drivers of extinction risk, and how these drivers can be de-coupled from timing, but at the same time providing a good explanation for the biological factors leading to the extinction. Here again the authors should consider the work of Brook and Bowman and Lyons et al.

    1. Reviewer #3:

      The authors present a simple model that explains important outstanding controversies in the field of long-range gene regulation. These controversies include the fact that insulation boundaries tend to be weak; that acute inactivation of CTCF or cohesin (which leads to inactivation of insulation boundaries) leads to only minimal gene expression changes; and that in live cells enhancer-promoter contacts appear not to be correlated with transcriptional bursting. The model involves a futile cycle of tag addition and removal from promoters, stimulation of more tag addition when tag is already present, and stimulation of tag addition by contacts with distal enhancers. The authors show that such a model explains all the above controversies, and indicate that the controversies are not inconsistent with mechanisms where long-range gene activation is driven by physical contacts with distal regulatory elements.

      The authors have explained and explored the properties of the model well. I have only minor comments.

      1) An alternative explanation for TAD-specific enhancer action is that an E-P interaction within a TAD (between two convergent CTCF sites), one that is brought about by extruding cohesin, is not equivalent to an interaction that occurs between two loci on either side of a CTCF site and that can be a random collision that is not mediated by extruding cohesin. In other words, two interactions can be of the same frequency but can be of a very different molecular nature. I agree that this model would not explain the results of the experiment where cohesin is acutely removed.

      2) In the beginning of the introduction the authors introduce TADs. I recommend that the authors present this in a more nuanced way: compartment domains also appear as boxes along the diagonal, an issue that has led some in the chromosome folding field to be confused. This reviewer believes TADs are those domains that strictly depend on cohesin-mediated loop extrusion, whereas compartment domains do not. If the authors agree, perhaps they can rewrite this section?

      3) If I understand the model correctly, the nonlinearity arises because of the increased rate of tag addition when tag is already present. The authors then speculate histone modifications can be one such tag. However, there are only so many sites of modification at a promoter. Can the authors analyze how the possible range of tag densities affects performance of the model? Is the range required biologically plausible?

      4) Can the authors do more analysis to explore how rapid changes in gene expression may occur (e.g. upon signaling a gene may go up within minutes)? How much more frequent does the E-P interaction need to be for rapid switch to the active promoter state? Can the authors do an analysis where they change the rates of the futile cycle upon some signal: at what time scale does transcription then change (keeping E-P frequency the same)?

    2. Reviewer #2:

      The main analyses of the study compare previously published experimental observations from Hi-C and ORCA to predictions of the authors' "futile cycle" model. The predictions are derived from simulations and differential equations analysis of the model as a dynamical system. Given its centrality to the manuscript, we recommend describing this overall strategy in more detail in Results. For example, at line 124 (Pg. 4) the authors could talk about how the simulations are done, including where the variability comes from (e.g., random starting conditions vs. probabilistic events vs. different parameters).

      Xiao et al. make several key assumptions to dramatically simplify their model. Namely, it is assumed that promoter modification and transcription are equivalent and that enhancer-promoter contact influences transcription instead of transcription influencing structure. Steady-state equilibrium must also be assumed. It would be helpful if the authors explicitly stated these assumptions and provided references to support their being reasonable.

      It is not totally clear why the authors decide to call their proposed approach the futile cycle model. There are similarities to other well-known models in biochemistry and biophysics that should be noted. It might make sense to simply call this a mechanistic model of cooperative promoter activation. If the authors stick with "futile cycle", the relationship between promoter activation through tags and metabolic signaling should be described in more detail.

      There is also an opportunity to emphasize that the proposed model is not necessarily absolutely correct, but one of many plausible models that can produce a non-linear relationship between genome structure (enhancer-promoter contact) and transcription. Any thoughts on other models that could generate similar dynamics would be a useful discussion point. There are parallels to both sigmoidal dose-response curves, where drug concentration is plotted against response, and transcription factor binding curves, where free ligand concentration is plotted against the fraction bound. We recommend providing background context on these types of models or the Hill equation to illustrate why non-linear behavior is or is not surprising given the proposed model.

      For clarity, it would be helpful to discuss model parameters in greater detail. First, we suggest noting which parameters shift the location of the curve and which increase the steepness of the curve. Second, we recommend including a phase diagram exploring when sigmoidal behavior and any other key model predictions arise across parameter space. In what circumstances does hypersensitivity or time lag emerge? The authors demonstrate that a narrow set of parameters is sufficient to produce a super-linear relationship between enhancer-promoter contact and transcription in Figure 6. One potential dilemma is this model's ability to explain many experimental observations by indicating that minimal changes all occur in the sub-linear regime while observable changes occur in the super-linear regime. Given that one needs specific parameters to replicate an example of the hyper-linear regime (including at least three degrees of stimulation and increasing stimulation of the successive states), it could be valuable to demonstrate how large the plausible parameter space is. Without an exhaustive search across the space of minimal parameters, it is not clear when this property emerges or how common it is within the full parameter space. The authors could vary model parameters and plot a grid visualizing behavior (e.g., steepness of the curve or Hill coefficient).
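
      As an illustration of the kind of grid suggested above, the following is a minimal sketch that uses a generic Hill function as a stand-in; it is not the authors' futile cycle model. The names `hill` and `local_log_slope`, the grid values, and the probe point `x` are all hypothetical choices made for this example. In a Hill function the half-saturation constant `k` shifts the location of the curve, while the coefficient `n` sets its steepness; a log-log slope above 1 marks a super-linear regime.

```python
import math

def hill(x, k, n):
    """Generic Hill function: fraction of maximal response at input x."""
    return x**n / (k**n + x**n)

def local_log_slope(x, k, n, eps=1e-4):
    """Numerical d(log y)/d(log x); values above 1 indicate super-linearity."""
    lo, hi = x * (1 - eps), x * (1 + eps)
    dy = math.log(hill(hi, k, n)) - math.log(hill(lo, k, n))
    dx = math.log(hi) - math.log(lo)
    return dy / dx

# Hypothetical grid (assumption): k shifts the curve, n controls steepness.
ks = [0.1, 0.5, 1.0]
ns = [1, 2, 4]
x = 0.05  # probe point (assumption), e.g. a low contact frequency

grid = {(k, n): local_log_slope(x, k, n) for k in ks for n in ns}
for (k, n), slope in sorted(grid.items()):
    print(f"k={k:<4} n={n}  log-log slope at x={x}: {slope:.2f}")
```

      For a Hill function the log-log slope never exceeds `n`, so `n = 1` cannot give a super-linear response at any input. A comparable scan over the parameters of the actual model would show how much of parameter space supports the reported behavior.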

      Images throughout the manuscript are low resolution, making the figures difficult to read. Increase the resolution of figures throughout, especially those containing text (Fig 6A).

    3. Reviewer #1:

      Xiao et al. describe a kinetic model of enhancer-promoter interactions, which the authors use to explain the changes in transcription levels upon disruption of genomic contacts within topologically associated domains (TADs). The model uses the law of mass action to describe activity of promoters and enhancers, which are proposed to be able to accommodate multiple transcription activation tags. The authors use the model to explain the nonlinear relationship between the genomic contact frequencies within TADs and their corresponding transcription rates. They recapitulate the superlinear relationship between the changes in genomic contact probabilities and transcription rates within TADs observed in their recent experiments (Mateo et al, 2019). Inspired by the futile cycle of cell signaling, their model incorporates multiple tagging of promoters allowing for transient amplification of transcription rates.

      Conceptually, this work is interesting and the model suggests possible reconciliation of seemingly contradictory experimental observations reported earlier.

      However, the manuscript in its current form fails to substantiate many of its claims.

      Here are my major concerns:

      1) The presentation of the model is unclear. It is currently presented in the text, lines 110-122, as a purely qualitative description. The authors define only rates in the text; definitions of other model parameters are not present. For example, E and a are not specifically defined in the text or Methods section. Since both terms "enzyme" and "enhancer" are being used, and in fact "enzyme tagging" and "enhancer tagging" occur simultaneously in the model, it is not possible to say for sure which one the authors are referring to at a given point, and thus the Methods section can be interpreted in different ways. Moreover, the cartoon is missing a legend confirming which molecular player is which. The figure caption mentions only green triangles being the tags, but no other parts of the cartoon are explained. Taken together, this makes it very difficult to verify the mechanics of the model.

      • The authors should provide a detailed technical description of their model directly in the text, including description of their parameters, list their constitutive equations and identify all parameters in their cartoon Fig. 1C.
      • Axes labels in all figures should be expressed in the parameters/variables of the model (as in Fig. 6C-D) directly connecting to inputs/outputs of the model.

      2) Due to the lack of description, in many sections it is not clear what are the specific inputs and outputs of the model (e.g. Fig. 2).

      3) The Methods section describes the chemical kinetics of the suggested reactions and the insulation score calculations. But it is not clear how these inform each other, or how the contact-frequency maps are chosen/computed and cross-referenced with the local E-P kinetics.

      4) In the Methods section, it appears that in lines 577-580 of the model description, the mass is not conserved.

      5) In lines 587-588, the index of k is 2(n+1), which equals 2n+2, but then in the next line the following assumption is made: 2n+1 → n+1.

      6) The authors make assumptions that their kinetic considerations hold for n>2. What is the evidence?

      7) The authors observe hysteresis in median transcription rate as a function of enhancer contact frequency. However, the presented violin plots suggest a presence of two states, one with low and one with high transcription rates. In the intermediate regime of enhancer contact frequency, where authors report hysteresis, the violin plots show bimodal distributions suggesting coexistence of these two states. This would suggest that the system exists in and switches between two distinct states with a discontinuous transition, instead of a continuous hysteretic behavior as suggested by the median behavior.

      8) The language of the paper is often not technically precise with qualifiers missing, which could lead to ambiguities and misinterpretations. Here are some examples:

      • *p. 1, line 10, "difference in contact across TAD borders is usually less than twofold"
      • *p. 1, line 17, "results from recent cohesion disruption"
      • *p. 2, line 71, "A simple model of hypersensitivity to changes in contact frequency"

      9) On p. 13, line 483, authors define Ostwald ripening as given by weak multivalent interactions; however, Ostwald ripening is a thermodynamic process. In addition, they propose that liquid condensates become larger due to Ostwald ripening, but there are also other processes that may occur, such as coalescence of condensates, which would also lead to larger condensates.

      10) At the beginning of the Discussion section authors state they will propose future experiments in each section. However, in some of the sections it is not clear what specifically authors are proposing. These suggestions should be made clearer.

    1. Reviewer #2:

      This manuscript by Diamanti et al. describes their study on how visual neurons responded to identical visual stimuli at two different locations along a virtual linear track. Extending their previous result that spatial location modulates the neuronal activities in the primary visual cortex (V1), they now demonstrate that similar spatial modulation also occurred in the higher visual areas (HVAs), but not so much in a lower visual area, the lateral geniculate nucleus (LGN). In addition, they show that the modulation, measured by a spatial modulation index (SMI), was stronger when animals had more experience in the track and when the animals were actively performing a task rather than passively viewing the same virtual track. The authors have been responsive to comments by previous reviewers at a different journal. Data are appropriately analyzed and clearly presented.

      Since the finding that visual neurons are spatially modulated similarly to hippocampal place cells in spatial navigation tasks (Ji and Wilson, 2007; Haggerty and Ji, 2015; Fiser et al., 2016; Saleem et al., 2018), there has been increasing interest in identifying the source(s) of this modulation. This study adds new evidence to this puzzle, suggesting that it is more likely either generated within the visual cortex or top-down propagated from higher brain areas, rather than bottom-up propagated from the thalamus. This is an important contribution. However, there are concerns, mainly on the data interpretation and the clarification of the main conclusion, as elaborated below.

      1) Because experience and task engagement enhanced spatial modulation, the authors concluded in the abstract that "Active navigation in a familiar environment, therefore, determines spatial modulation...". This conclusion is too strong and not well-supported by the data. First, spatial modulation on Day 1, when the task was novel, was lower than on later days, but it was already much higher than 0 (Fig. 1h). Also the individual neuron data (Fig. 1e) display clear spatial modulation on Day 1. Therefore, "familiar environment" is not a requirement. Second, spatial modulation during passive viewing was much higher than 0 and was correlated with that during active navigation, as shown in Fig. 4e - Fig. 4l. Therefore, "active navigation" is not a requirement either. It is true that both active navigation and familiar environment enhanced spatial modulation. They did not "determine" spatial modulation.

      2) Related to the point above, the presence of spatial modulation in passive viewing reminds us that these cells in the visual system were still mainly driven by visual stimuli. The data in Fig. 4e,f are especially telling: the modulation in V1 was similar and highly correlated between active navigation and running replay. In addition, it is clear from all the raw traces in Fig. 1 and Fig. 2 that these cells did respond to the two segments with identical stimuli reliably with two peaks. The spatial modulation was just a change in one of the peaks. So the nature of the modulation is a "rate remapping" of the expected, classical visual responses. I believe, in order to maintain the big picture of what drives the activities of these neurons, it is beneficial to clarify that the "spatial modulation" is a modulation on top of the expected visual responses. This message is not explicitly conveyed in the current manuscript.

      3) The authors stated that spatial modulation is "largely absent in the main thalamic pathway into V1". This was based on the significantly weaker SMIs in LGN than those in V1 and HVAs. However, it is unclear whether the SMIs in LGN were still significant. The SMI values for both LGN boutons (Line #100) and LGN units (Line #130) might be significantly different from zero. The statistical comparison p-values should be given in both cases. Second, Figure 3 - figure supplement 1 b,f show that the SMI values in LGN could be predicted by spatial modulation, but not by visual stimuli alone or behavioral variations, just like those in V1 and HVAs. This seems to me good evidence for the presence of spatial modulation in LGN. Therefore, it is my opinion that the data do not support the complete lack of spatial modulation in LGN, but do clearly demonstrate weaker spatial modulation in LGN than in V1 and HVAs.

    2. Reviewer #1:

      This paper investigates the modulation of spatial signals in higher order visual areas. A number of the findings are novel and interesting, including that signals in higher visual areas are not more influenced by spatial position than signals in V1, that this modulation is not a general feature of the entire visual circuit (i.e. LGN boutons in L4 of V1, as well as LGN units, show very little spatial modulation), and that spatial modulation decreases when mice are watching a replay of tunnel traversals. Overall, I think this paper provides new insight regarding position coding in visual systems. However, there are some points that should be addressed.

      1) The imaging data are from mice with different genetic backgrounds, as well as a mixture of GCaMP6f and GCaMP6s. In addition, different reward protocols were used for different mice. Although the authors state in the methods that none of these factors impact their results, it would be good to include some quantifications to this effect (e.g. they could show the distribution of SMI for 6f data vs 6s data). While I don't expect the major observations to change if it turns out that some of these factors have a systematic effect, it could affect portions of the results where the dataset is split up - for example in the comparison between different higher visual areas, and the observation that spatial modulation appears to vary with receptive field location.

      2) The authors state that it is to be expected that LGN neurons respond more strongly in the first half of the corridor due to contrast adaptation mechanisms. However, I did not see any quantification that could support this statement.

      3) When looking at the spatial modulation index, the authors switch between using median (e.g. Fig 1 and 2) and mean (Fig 4), t-test and rank-sum - and sometimes there is missing information regarding which (mean or median) they are reporting. The authors need to include more detail regarding these statistics.

      4) It was not clear to me if the authors are only imaging from layer 2/3 or if they also attempted to image deeper layers.

      5) Throughout the paper, the authors use 'firing rate' to refer to deconvolved calcium signal. Although this is stated in the methods, this wording can be misleading, especially since the paper also contains extracellular recordings of spiking activity.

      6) It was not clear to me how the dotted lines (e.g. Fig 1 b) were calculated.
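
      To illustrate why point 3 above matters, here is a minimal sketch showing how far the mean and the median can diverge on a skewed sample. The values are made up for this example and are not taken from the manuscript; the variable names are likewise hypothetical.

```python
import statistics

# Made-up SMI-like values with a single high outlier (assumption, not real data).
values = [0.05, 0.08, 0.10, 0.12, 0.15, 0.90]

mean_value = statistics.mean(values)      # pulled upward by the outlier
median_value = statistics.median(values)  # robust to the outlier

print(f"mean={mean_value:.3f} median={median_value:.3f}")
```

      With a skewed distribution the two summaries can differ by a factor of two or more, so each reported value should state which statistic and which test were used.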

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 7 2021, follows.

      Summary

      This manuscript describes a detailed investigation of the sigma-1 receptor, with an emphasis on the effects of membrane cholesterol content. The authors report that sigma-1 receptor clusters in cholesterol-rich microdomains in the endoplasmic reticulum (ER), contributing to its previously-described localization at mitochondria-associated ER membranes. A series of reconstitution experiments show cholesterol-dependent clustering of the sigma-1 receptor, an effect which is modulated by membrane thickness and drug-like ligands of the receptor. These findings are supplemented by an investigation of the effects of sigma-1 receptor on IRE1a signaling, leading to the finding that sigma-1 knockout attenuates IRE1a function.

      Essential Revisions

      The reviewers agreed that the manuscript was likely to be of broad interest and addresses important biological questions surrounding the poorly understood sigma-1 receptor. However, concerns were raised regarding a number of points that need to be addressed in order for the manuscript to be suitable for publication. Specifically:

      Most of the imaging experiments throughout the manuscript are interpreted only qualitatively, and many of these show relatively minor differences. See "MINOR POINTS" below for a list of specific examples. Objective quantitative analysis should be provided wherever possible. Any subjective assessments should be conducted using blinding to avoid introduction of bias.

      The connection between the biological effects on IRE1a activation and cholesterol-dependent clustering is relatively indirect. The reviewers agree that additional experimental data should be provided to further assess the validity of the authors' proposed model. For example, inclusion of rescue experiments in sigma-1 knockout cells using the cholesterol-binding mutants would help to strengthen the connection between IRE1a function and membrane cholesterol content. Similarly, disruption of cholesterol-rich domains by addition of beta-cyclodextrin could provide additional evidence to support the model. In addition, testing the effects of ligands in the cellular imaging experiments would strengthen the link between in vitro biophysical experiments and cellular physiology.

      A related issue is that cholesterol binding is not tested explicitly for certain sigma-1 receptor mutants, potentially confounding interpretation of experimental data. These include experiments where alterations were made to the S1R sequence, with results interpreted in light of S1R no longer being able to bind cholesterol. Two specific places where this issue arises are:

      1) Studies described on pages 6-7 and shown in Figure 3B where wild-type sigma-1 receptor is compared to S1R-Y201S/Y206S, S1R-Y173S, S1R-4G, and S1R-W9L/W11L. These mutations had differential effects on receptor distribution that were attributed to alterations in cholesterol binding without confirming the changes in cholesterol binding. This is particularly relevant for the explanation given for why S1R-W9L/W11L fails to cluster in both cells and the cholesterol supplemented GUV system, while the S1R-4G mutant exhibited cholesterol-induced clustering in the GUV system but not in cells (page 7, lines 27-31).

      2) Another example is the membrane thickness experiment described at the top of page 8 and shown in Figure 4A. Shortening the S1R by deletion of 4 aa in the TM region produced a sigma-1 receptor that exhibited a more diffuse distribution when expressed in HEK293 cells. The authors appear to be attributing this only to the decreased length of the sigma-1 receptor transmembrane domain. However, it seems feasible (based on their other data) that if this construct fails to bind cholesterol, the same result would be observed. Confirming that the truncated sigma-1 receptor does in fact bind cholesterol would strengthen the argument being made here.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 5 2021, follows.

      Summary

      Your work analyzes the impact of the INPP5E inositol lipid-5 phosphatase on immune synapse formation and function. INPP5E is a cilium-enriched protein. Although T cells do not display primary cilia, previous work by several laboratories showed that several ciliary proteins are involved in immunological synapse formation and in T cell activation, and your work intends to further this view. Although the work has potential for publication in eLife, it requires essential additional data to support the central claims of the paper. Each reviewer raised substantive concerns (see below) that need to be resolved experimentally. For instance, experiments involving knockout in primary T cells will need to be performed. A better time series will also help determine which process the INPP5E protein is involved in. Moreover, imaging data should be quantified more precisely to assess spatial and dynamic differences.

      Reviewer #1:

      An important aspect of mature synapse formation is signal termination and ... effector responses, such as secretion of cytokines, exosomes and CD40L on synaptic ectosomes (Huse et al, 2006; Mittelbrunn et al, 2011). The demonstration of ESCRT function in both TCR signal termination and CD40L release to B cells on synaptic ectosomes likely involves inositol lipids that lack phosphorylation on the 5' position.

      It might make sense for the authors to investigate a synapse effector function, such as degranulation in CD8 T cells or CD40L transfer on synaptic ectosomes in CD4 T cells, as these effector functions link to synapse formation more directly than bulk IL-2 secretion.

      The ESCRT machinery is also highly entwined with ciliary biology and several ESCRT components important for signal termination and effector function will also require PIP metabolism.

      Reviewer #2:

      Interesting similarities between the primary cilium and the immunological synapse have been noted and investigated extensively over the last few years. In this context and beyond, the role of phosphatidylinositol lipids in the organisation of the immunological synapse and T cell function has been extensively investigated. Here Chiu et al. add to these topics by investigating INPP5E, a primary cilium-associated 5' phosphatidylinositol lipid phosphatase that can use PIP3, PI(4,5)P2 and PI(3,5)P2 as substrates, in T cell activation. The authors show that INPP5E is recruited to the interface of a T cell with an activating antigen presenting cell. INPP5E binds to TCRzeta, ZAP-70 and Lck. INPP5E knockdown reduces TCR recruitment to the T cell/APC interface, clearance of PI(4,5)P2 from the centre of the interface, and TCR and ZAP-70 phosphorylation. These findings are consistent with the large body of existing work on the role of phosphatidylinositol lipids in the organisation of the immunological synapse and T cell function and, therefore, don't constitute a conceptual advance. Nor do they provide new mechanistic insight into phosphatidylinositol lipids in T cell activation. The data add another molecule to the existing body of work.

      In the first two figures Chiu et al. show that a number of cilium-associated proteins, including INPP5E, are recruited to the interface of a Jurkat cell with a Raji B cell presenting superantigen. Such recruitment is not surprising. On the contrary, because of the reorientation of the MTOC to the centre of the cellular interface and the accompanying shift of the nucleus to the back of the T cell to create more cytoplasmic space at the interface, most proteins associated with vesicular trafficking shift their subcellular distribution towards the interface. Only data showing spatial or temporal distinctions in such recruitment within the small cytoplasmic space underlying the T cell/APC interface could provide interesting new insight. Reduced detection of INPP5E interface recruitment after INPP5E knockdown could be trivially caused by a worse ratio of signal to background staining noise (Fig. 2A-E). The STORM data showing that INPP5E interface recruitment occurs in the T cell, not the APC, are welcome. However, spatial and temporal features provided by the higher resolution of these experiments are not explored.

      In the investigation of the contribution of different INPP5E domains to its interface recruitment the representative imaging data in Fig. 3A suggest that substantial quantitative differences exist. The '% conjugate with recruitment' metric doesn't capture such differences. Some form of a recruitment index as used in other parts of the manuscript would be more powerful. A more complex picture of INPP5E domain contributions to INPP5E interface recruitment is likely to emerge.

      The immunological synapse is a highly dynamic structure. TCR interface recruitment and PI(4,5)P2 clearance in response to various manipulations of PI turnover are only analysed at a single time point. A dynamic picture should provide more insight. For example, interface recruitment of the TCR may be consistently impaired, delayed or shifted in time. Reduced interface recruitment of the TCR upon overexpression of PIP5Kgamma (Fig. 5D, E) has already been described in the cited Sun et al. reference. This should be acknowledged.

      In Fig. 6E, the authors show a small reduction in IL-2 secretion in Jurkat cells stimulated with anti-CD3/CD28 upon knockdown of INPP5E. As INPP5E is expected to exert its functional effects through the control of the spatiotemporal organisation of the immunological synapse, activation of Jurkat cells with APCs would be more appropriate.

      The knockdown efficiency of INPP5E should be quantified.

      Reviewer #3:

      The work is fully performed in Jurkat cells, which are a very good and widely used model to investigate T cell activation, yet not a perfect one. In the case of events related to phosphoinositide function, Jurkat cells present a strong caveat: these cells lack the phosphoinositide phosphatase PTEN and therefore have altered phosphoinositide turnover.

      Therefore, as a first critical point, the authors should confirm most of the central data of this work in primary T cells. They should also discuss this point, since it might bias some of their data.

      Additional points needing attention are detailed below.

      1) Regarding data in Fig 1D, the authors say that they find INPP5E localized with the centriole in the absence of SEB stimulation. The pattern shown in the picture is very diffuse and blurry, not showing a centriole pattern at all.

      It seems to be more visible in Fig S1. The authors should replace Fig1D panel by a better "quality" picture if they wish to convey that message.

      2) What do the authors mean by "number of events" in the figures? Please explain or replace by another term or means of quantification. If it means counting conjugates with INPP5E recruited "by visual observation", it would be much more appropriate to quantify fluorescence enrichment at the synapse as a ratio.

      It is also bizarre to plot "pairs" which are all at 100%. What does that mean?
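
      The ratio-based quantification suggested in point 2 could look roughly like the sketch below. It is only an illustration: the function name, the pixel lists and the background value are hypothetical, and how the synaptic and non-synaptic regions are segmented is left to the image-analysis pipeline.

```python
from statistics import mean


def enrichment_ratio(synapse_pixels, rest_pixels, background=0.0):
    """Background-subtracted mean intensity at the synapse divided by the
    background-subtracted mean intensity in the rest of the cell.

    All inputs are hypothetical: lists of pixel intensities taken from two
    regions of interest that the user has already segmented.
    """
    synapse_signal = mean(synapse_pixels) - background
    rest_signal = mean(rest_pixels) - background
    if rest_signal <= 0:
        raise ValueError("non-synaptic intensity must exceed background")
    return synapse_signal / rest_signal


# Made-up intensities for one conjugate; values above 1 indicate enrichment.
ratio = enrichment_ratio([30, 34, 32], [12, 10, 14], background=2)
```

      A per-conjugate ratio of this kind gives a continuous readout, so partial recruitment is not collapsed into a yes/no "event".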

      3) In Fig 2 D, E the authors observe by TIRF the presence of INPP5E at the planar pseudosynapse. They image TCRz in parallel. It would be interesting to take better advantage of this type of microscopy image to also quantify the impact of INPP5E on TCRz recruitment and to assess co-localization between INPP5E and TCRz using Pearson correlation on images with very good resolution. From that image they look like they do not co-localize at all.

      4) The reasoning of the authors in Fig 2 H is somewhat strange: "Since the distribution of INPP5E signals mostly appear at the T cell-APC contact site, it was necessary to examine whether INPP5E belonged to T or B cells". Although they use dSTORM, the resolution of the image is not single molecule as they claim, but relatively large clusters. Moreover, they say that INPP5E is inside the T cell while TCRz is at the plasma membrane. In that image there are spots labelled far into the B cell. Moreover, it has been shown by several authors that TCRz largely occupies intracellular vesicular compartments. So the conclusion is not accurate. Finally, they claim that the overlap in some regions is suggestive of possible interactions. The overlap is really minimal and confined to zones of clustering. So the comment is far from accurate. A proper colocalization analysis in TIRF-dSTORM images of INPP5E and TCRz quantified by Pearson correlation would be much more appropriate and accurate.
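
      For reference, the Pearson-based colocalization requested in points 3 and 4 amounts to correlating the pixel intensities of the two channels over the same region. The sketch below is a minimal, hypothetical illustration: the function name and the flattened pixel lists are assumptions, and a real analysis would normally add background subtraction and restrict the calculation to a cell mask.

```python
from math import sqrt


def pearson_colocalization(channel_a, channel_b):
    """Pearson correlation coefficient between two equally sized lists of
    pixel intensities (e.g. INPP5E and TCRz channels, flattened)."""
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("channels must have the same length (>= 2 pixels)")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no correlation")
    return covariance / sqrt(var_a * var_b)


# Made-up pixel values: the second channel scales with the first.
r_overlapping = pearson_colocalization([1, 2, 3, 4], [2, 4, 6, 8])
# Made-up pixel values: the second channel is bright where the first is dim.
r_excluded = pearson_colocalization([1, 2, 3, 4], [8, 6, 4, 2])
```

      Coefficients near 1 indicate that the two signals vary together across pixels, values near 0 indicate no linear relationship, and negative values indicate mutual exclusion.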

      By the way, the authors could use panel F of T cells transfected with Flag-INPP5E that relocalizes to the synapse to say that INPP5E in T cells relocalizes to the synapse.

      5) Fig 4A: The strongest interactor with INPP5E seems to be Lck, rather than TCRz. It would be interesting to also assess the effect of INPP5E silencing on Lck recruitment at the synapse.

      Is there a mistake in labeling IP on the horizontal axis and IP on the vertical axis? I guess one of them should be IB (immunoblotted / Western blot). Please clarify and correct if necessary. The same applies to B, where IP-Flag is labelled everywhere; is one of them input? Please clarify/correct if mistaken.

      The statement in the text that INPP5E "interacted" with TCRz, ZAP and Lck (lines 168-169) is not fully correct here, since these molecules form complexes during TCR activation. The term "co-immunoprecipitated" would be more accurate here.

      Fig 4D: It is not clear why the authors use cells transfected with TCRz-GFP to conclude that INPP5E is required for exogenous CD3z clustering, when they could simply stain for endogenous TCR.

      Fig 6B: If the authors normalized the pProtein band density with respect to the total same protein, the Y axis should be expressed as band density ratio rather than "optical intensity (a.u.)"

    1. Reviewer #3:

      The manuscript explores ageing-associated changes in the Drosophila escape-response (Giant Fiber, GF) circuit and the circuits converging onto the GF. This is a convenient system amenable to detailed physiological analyses and the authors made a good effort in extracting a large amount of useful information using a wide range of electrophysiological readouts. The authors identified several physiological parameters that are potentially useful for indexing ageing progression in flies such as ID spike generation and ECS-evoked seizure threshold. The host lab is well-known for its expertise in the field of GF physiology; consequently, the experiments were done with a high level of technical competence and presented (mostly) in a clear and informative manner. There is, however, one major issue that could restrict the usefulness of the data presented in the manuscript (please see major comment 1).

      Major comments:

      1) Standards for conducting ageing studies in Drosophila and other model systems have gone significantly up in the last ~15 years following experimental evidence that genetic background can (and does) have a significant effect on the outcome of 'ageing' experiments (see Partridge and Gems, Nature, 2007). Today, 'backcrossing' relevant lines into a reference wild-type strain multiple times (to remove any second-site mutations) is a gold standard for virtually all ageing studies in Drosophila. Furthermore, this approach is being widely adopted even in the studies investigating physiological properties in developing flies (for example, in Imlach, Cell, 2012, the authors obtained very different electrophysiological results after 'isogenizing' the genetic background via backcrossing, and concluded that "the previous finding may have been due to a second site mutation"). As this important step is not mentioned in either the main text or in 'Methods' section, it is reasonable to conclude that the authors did not perform this step prior to conducting the experiments. Recent papers, one of which was referenced by the authors (Augustin et al PloSBiol 2017 and NeuroAging 2018) repeatedly demonstrated a significant, age-associated increase in the short-response (TTM and DLM) latency in the GF circuit following a strong stimulation of the GF cell bodies in the brain. It is likely that these age-related changes in the GF circuit remained undetected in the flies with non-uniform genetic background likely used in this work. The same problem affects the paper (Martinez, 2007) referenced by the authors throughout the manuscript.

      It is difficult to say which of the findings reported here are most affected by the variability in the genetic background, but any kind of correlation between the lifespans (Figure 1B) and physiological parameters should be taken with a high dose of scepticism.

      2) The manuscript is entirely 'phenomenological' in the sense that it does not investigate the causes of the observed physiological changes. The manuscript (with minor exceptions) does not discuss the possible reasons behind the functional readouts or speculate about what makes the (sub)circuits differentially susceptible to the effect of ageing. For example, when mentioning the effects of temperature and Sod mutation on the fly physiology, the authors limit their comments to generic and obvious statements such as 'oxidative stress exerts strong influences differentially on some of the physiological parameters and the outcomes are distinct from the consequences of high-temperature rearing'. Some of the possible questions the authors could ask are: could changes in the kinetics of relevant ion channels explain some of the results obtained under different temperatures; could the previously demonstrated effect of ROS on voltage-gated sodium channels explain some of the Sod1 phenotypes, etc?

    1. Summary: This work synthesizes bioinformatics, in vivo, and in vitro transport assays to understand the molecular basis for substrate selection and promiscuity of the mitochondrial carrier family (SLC25). This comprehensive work will be of interest to the fields of mitochondrial physiology, transporter specificity and evolutionary dynamics. However, in its current form, it lacks some critical controls for protein expression and some important details about the methodology.

      Reviewer #1 and Reviewer #2 opted to reveal their name to the authors in the decision letter after review.

      Public review:

      This paper takes a novel and comprehensive approach to understand the molecular basis for substrate selection and promiscuity of the mitochondrial carrier family (SLC25). Informed by a deep assessment of evolutionarily conserved features, mutants that selectively impair Pi flux, but retain Cu2+ transport for the mammalian transporter SLC25A3 were established using a variety of in vitro and in vivo transport assays. In addition to providing a molecular perspective on substrate specificity in mitochondrial carrier proteins, this paper provides interesting and convincing insight into how subfamilies of transporters evolved by juggling substrate specificity. However, in its current form, it lacks some critical controls for protein expression and some important details about the methodology, which are enumerated below:

      1) This manuscript does not report any controls for expression levels or membrane localization of the mutants analyzed in Figure 6. These controls are essential to fairly compare the growth phenotypes/transport capacity of the assorted mutants relative to WT Pic2.

      2) The methods section lacks details required to fully understand several different experiments.

      -Figure 1G shows an analysis with reconstituted proteins, but the methods contain no information about purification or reconstitution of the transporters, or the origin of the CuL fluorescent reporter, so it is difficult to evaluate this line of evidence.

      -The methods do not contain information about the NMR experiment shown in Figure S4, and the interpretation of this data as containing a benzene ring is probably not obvious to a broader scientific audience.

      -The details are also sparse regarding the preparation of the homology model. How much sequence similarity do the ATP/ADP translocase and PIC2 share? How large are the insertions and deletions that were addressed by manual alignment? Was an ensemble of models calculated? It is likely that a number of plausible models could be produced - were any alternative models considered? The clustering of the conserved residues shown in Figure 4 is a nice way to validate the model. It would also be nice to analyze whether the homology model shows the expected pattern of hydrophobic residues facing the membrane.

      -The authors should include all details of the bioinformatic pipeline as supplementary data, including the list of gene ids and/or sequences, phylogenetic tree of the initial 2445 sequences (neighbour joining tree) to show PIC2/MIR1 clusters, and the 92 final sequences (gene IDs, multiple sequence alignment). In addition, the authors should show the entire tree of the superfamily.

      3) The manuscript would be strengthened by additional discussion about what is known (if anything) about the functions of other transporters in the PIC2/MIR1 family. Much of the interpretation of the phylogeny regarding the outcomes of gene duplication seems to depend critically on whether the functions and substrate specificities of the yeast and mammalian homologues described here are representative of the entire clade. Likewise, the authors do not indicate whether there is evidence that neighboring sequences outside the core PIC2/MIR1 cluster are not functionally homologous (promiscuous Cu and/or phosphate transport) to PIC1/MIR1.

    1. Reviewer #2:

      In this manuscript, the authors set out to measure participants' decisions about when an item occurred in a short list of 3 or 4 items, where the first and last items were always at the beginning and end, respectively. They report two behavioral studies that examine time judgments to items in the intermediate positions. They show that time judgments (when did you see X item using a continuous line scale) are always a little off but, more importantly, they tend to be anchored to other items presented. The results are interesting and add to our knowledge of the representation of time in the brain mainly by introducing a new paradigm with which to study time. Within the broader context of research on timing capacities, it should not be surprising that participants do not have a continuous representation of time that lasts beyond traditional time interval training of a few hundred milliseconds to a few seconds. Furthermore, research has also shown that 'events' that require attentional resources do morph our perception and memory for time. So while the paradigm is worth expanding on, the behavioral results are not surprising given this past literature. I do feel, however, that this work is an important first step in developing a firmer model of memory for time.

    2. Reviewer #1:

      This manuscript reports the results of two timing experiments. The experimental paradigm asks participants to judge the time of target items in an unfilled interval between two landmark stimuli. In experiment 1, there is one item that must be judged. In experiment 2, there are two items to be judged. The basic empirical result is that relative order judgments in experiment 2 are more accurate than one might expect from the absolute timing judgments of experiment 1. A model is presented.

      My overall reaction is that this paper does not present a sufficiently noteworthy empirical result. I can't imagine that there is a cognitive psychologist studying memory who would be surprised by the finding that relative order judgments in the second experiment are more accurate than one might expect from the absolute judgments in experiment 1. On the encoding side, in these really short lists (with no secondary task), there is nothing preventing the participant from noting and encoding the order as the items are presented (not unlike the recursive reminding). On the retrieval side, we've known for a very long time that judgments of serial position use temporal landmarks (see for instance a series of remarkable studies by Hintzman and colleagues circa 1970).

      Methodologically, this paper falls short of the standards one would expect for a cognitive psychology paper. There are basically no statistics or description of the distribution of the effect across participants. Although I'm pretty well convinced of the basic finding (distributions in experiment 2 are different from experiment 1), I could not begin to guess at an effect size. The model is not seriously evaluated. The bimodal distributions are a large qualitative discrepancy that is not really discussed.

      Although the title of the paper invites us to understand these results as telling us something about episodic memory, the empirical burden of this claim is not carried. Amnesia patients (and animals with hippocampal lesions) show relatively subtle differences in timing tasks. There is no evidence presented here, nor literature review, to convince the reader of this point.

    1. Reviewer #3:

      In this interesting paper the authors compare MEG recordings of svPPA patients and 44 healthy controls during living vs. non-living categorization tasks. Both patients and the control group performed this task with similar accuracy. In addition, svPPA patients showed greater activation over bilateral occipital cortices and superior temporal gyrus, and inconsistent engagement of frontal regions. The authors conclude that patients with svPPA compensate for their semantic deficit by recruiting regions involved in perceptual processing.

      This is a well written study and the results are presented clearly. The findings are novel and interesting.

      1) One question for clarification is whether the recruitment of the occipital areas in semantic PPA is truly "compensatory" - does it indicate a shift of resources due to the anterior temporal atrophy? Is the recruitment of the parieto-occipital regions associated with more accurate performance?

      2) The main results concentrate on the differences between patients and controls in the low gamma range. There are also significant effects in the other frequency bands (e.g., high gamma, beta and alpha). Could the authors discuss the functional significance of these effects?

    2. Reviewer #2:

      Borghesani and colleagues aimed to understand how dysfunction in the ATL alters the dynamic activity during semantic categorization. To achieve this, they contrast MEG responses between patients with svPPA and age-matched healthy controls. Both groups show similar profiles of behavioural performance on the task, and broad similarities in MEG responses. Critically, svPPA patients show enhanced gamma synchronization in the occipital lobe compared to controls, while gamma synchronization was correlated to task RTs.

      In general, I found the manuscript interesting, and the major strength being the application of MEG analyses to a clinical population during a cognitive task. In terms of improvements, I think the results could be more fully characterized, which would allow for more expansive interpretations and inferences.

      Major comments:

      1) As the paper is about 'Neural dynamics', I felt this aspect could be developed, with the timing of the effects characterized further, and considered more in relation to the conclusions. For example, the main finding is the increased occipital gamma response in svPPA compared to controls. Looking at Figure 3, there is a peak in the svPPA group near 200 ms, and very little synchronized activity in the control group. This is interesting as there are many ways we could have seen svPPA > controls, but this suggests that the gamma synchronization response associated with compensation is specific to the svPPA group (and largely absent from controls - also from Supp fig 1), and is distinguished from an initial visual evoked response (peaking ~100 ms). I would recommend discussing and characterizing the dynamics of this effect more, such as what a later occipital effect could tell us about dynamics given ATL dysfunction? Is this increase a result of a lack of top-down effects from ATL? I think these kinds of issues could be explored and discussed more.

      2) The occipital gamma effect looks like it is in the primary visual cortex, which might suggest the effects are not related to higher-level perceptual features (such as has eyes, teeth) as the authors suggest, but rather to low-level visual effects. Do the authors think the effects could relate to enhanced processing of visual details (as related to the ideas of Hochstein and Ahissar's reverse hierarchy), or to additional visual input following a visual saccade?

      3) The VBM results for the svPPA patients were surprising given that all the atrophy appeared in the left hemisphere. There can be hemispheric differences in svPPA, but is this a true lateral pattern (meaning the right ATL is intact) or a product of VBM being run so that the most atrophied hemisphere is shifted to the left side? If the VBM maps are correct, and the svPPA patients are only showing left hemisphere atrophy, then what does this suggest about the role of the right ATL, and the bilateral nature of the occipital increases in svPPA?

      4) Both svPPA patients and healthy controls achieved around 80% accuracy in the categorization task. This seems surprisingly low given, (1) the task (living vs. nonliving after seeing the image for 2 seconds), (2) that all the images were pretested and had high name agreement, and (3) that items were repeated on average 2.5 times. Is there something that explains this low performance for all individuals?

    3. Reviewer #1:

      This study examines MEG activity in a picture categorization task (decide living or non-living) in a sample of 18 patients with semantic variant PPA, compared to 18 controls. As svPPA is a rare (but scientifically informative) disorder, the sample size is impressive, and given that relatively few MEG studies exist in PPA at all, this is an interesting dataset. The authors show differences in engagement of oscillatory activity, specifically increased low-gamma ERS in occipital cortex and increased beta ERD in the superior temporal gyrus. The authors interpret this as reflecting increased engagement of / reliance on early perceptual mechanisms for completing the task, as opposed to semantic identification of the picture.

      Major concerns:

      1) My biggest methodological issue with this paper relates to a very old debate in neuroimaging that still comes up all the time: the choice of statistical threshold. Using a high threshold prevents false positives, but may also lead to false negatives, and I fear that is the case here, with the high threshold contributing to an unrealistic impression of spatial specificity in MEG. It is obvious from the average responses in both groups that these oscillatory responses are widespread through the brain. Indeed the alpha and beta responses are significant in the majority of cortical voxels. This basic property of the responses should be presented clearly and prominently in the paper - I don't think it's appropriate to put it in supplementary information where only a minority of readers will even see it. The authors then use what I think is an extremely high and conservative statistical threshold to contrast differences between the two groups. P<.005 uncorrected is a highly conservative threshold already, even before cluster-thresholding is added (although with data as smooth as MEG beamforming solutions, cluster-thresholding is unlikely to change anything). Basically this makes only the strongest part of the activation survive, and it is valid to conclude that a significant group difference exists there (protected from Type 1 error), but this can give a false impression that the difference is specific to that region. I think a more realistic characterization of the results would involve measuring differences in the strength of the responses between groups on a broader level, possibly the sensors or in large ROIs - and not ROIs pre-selected to show a dramatic difference by first searching the whole brain for the most significant effects - that is the classic "double-dipping" fallacy in neuroimaging.

      2) Similarly, the ERD/ERS in each frequency band is treated as a separate entity, ignoring the fact that these bands are arbitrary and frequency is a continuous quantity. This matters because much is made of the fact that PPA participants exhibited greater ERS in the low-gamma range, and that this was correlated with reaction time. Supplementary figure 1 shows that both groups had strong occipital ERS in the high-gamma range, but only PPA showed it in the low-gamma range as well. This suggests that the ERS in the PPA group may simply have been shifted to a lower frequency range. A fuller characterization of these group differences via time-frequency analysis and/or power spectral analysis would help clarify what is going on here.

      3) It is surprising that PPA participants only exhibited increased MEG responses compared to controls - assuming that both gamma ERS and beta ERD can be interpreted as increased neural activation, which is a reasonable assumption based on the literature. No decreases in the PPA group are found, and thus the observed increases can be plausibly attributed to compensatory processes as framed by the authors. However, I am concerned about the role of certain analysis choices in producing this data pattern. In particular, the authors state (line 611): "To remove potential artifacts due to neurodegeneration or eye movement (lacking electrooculograms), we masked statistical maps using patients' ATL atrophy maps (see section MRI protocol and analyses), as well as a ventromedial frontal mask."

      It is not clear whether this masking was done in group space from average atrophy maps, or on an individual level. In either case, I don't think this is well justified. I don't know any physical mechanism by which tissue undergoing neurodegeneration can be said to generate an artifactual signal. Atrophied tissue still contains living neurons with ionic currents; these are real signals not artifacts, and furthermore, atrophy is a continuous process with tissue further from the epicenter also undergoing similar neurodegenerative mechanisms. Atrophied tissue may well generate electromagnetic signals that are different from healthy tissue, and such differences should be included in this paper. I think that there may be regions of hypoactivation as well as hyperactivation in this PPA group. If the hypoactivation localizes to atrophied tissue and the hyperactivation to other regions, that will bolster the case that we are seeing compensatory processes, but it isn't certain with half the story masked. I also don't really see statistical masking of the frontal region as a valid solution to eye movement artifacts. The authors would have to present evidence that the region that they masked corresponds to the region potentially affected by eye movements. However, many studies have found that beamforming already does a pretty good job of removing ocular artifacts from estimated brain signals, except for very close to the eyes.

      4) The correlation with reaction time in the occipital cortex is consistent with the idea that the ERS there may reflect compensatory overreliance on perceptual information, but it isn't conclusive. The authors suggest that PPA patients are able to categorize the stimuli correctly based on visual features, but are unable to name them. What about testing for correlations with the out-of-scanner behavioural measures that established that the patients have a naming deficit? It would strengthen the case if atrophy or hypoactivation (see comment above) correlated with the naming deficit.

    1. Reviewer #3:

      Neuronal ensembles have been shown by this lab and others to constitute one basic functional unit for the representation of information in cortical circuits. It is therefore important to determine how stable these blocks of representation might be. If these ensembles were preserved across time and sensory stimuli, this would indicate a significant degree of structure underlying cortical representations. In a first attempt to address these important issues, this manuscript analyzes the long-term stability of ensembles of coactive neurons in layer 2/3 of mouse visual cortex across several days. Ensembles were recorded during periods of spontaneous activity as well as during visual stimulation (evoked). For this, the authors record spontaneous and evoked activity using two-photon calcium imaging one, ten and 40 days after the first recording session. In order to maximize overlap between successive imaging sessions, the authors record three planes separated by 5 microns almost simultaneously (9 ms interval) using an electrically-tunable lens. They show that ensembles extracted during visual stimulation periods are more stable on days 2 and 10 than those computed during spontaneous activity. Stable ensembles display a higher "robustness" (a parameter that quantifies how many times a given ensemble is repeated and how similar these repeats are). Neurons displaying stable membership are more functionally connected than unstable ones. It is concluded that such observed stability of spontaneous and evoked ensembles across weeks could provide a mechanism for memories. Long-term calcium imaging within the same population of neurons is a real challenge that the authors seem to overcome in the study. The conclusions are important; my main concern relates to the number of experiments and analyses supporting these findings, as detailed below.

      Number of experiments and statistics: According to Table 1, two mice with GCaMP6f have been through the complete imaging protocol (days 1, 2, 10 and 43) but none with the 6s, since 3 missed the intermediate measure (day 10) and one the last point (day 40+). Therefore five mice have been recorded over weeks with two different indicators, but only two were sampled on day 10. One mouse was only recorded until day 10. Altogether, this is quite a low sampling, but the experiments are certainly difficult. However, the total number of experiments analyzed is higher, due to the repeat of 3 sessions on the same mouse per day. This certainly contributes to reaching significance. However, the three samples from the same mouse are not independent points. Are the FOVs different for each session in the same mouse? If they are the same, then the statistics should be repeated but treating all experiments from the same mouse as single experiments. I would suggest repeating the analysis but using only one data point per mouse per day. Also, given that two different indicators were used (6s and 6f), one would need to see whether the statistics are the same in the two conditions.
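
      The "one data point per mouse per day" suggestion can be sketched as a simple aggregation step before any test is run. Everything in the example is hypothetical (mouse labels, days and values are made up); averaging is only one possible way to collapse repeated sessions.

```python
from collections import defaultdict
from statistics import mean


def one_value_per_mouse_day(records):
    """Collapse repeated sessions into a single value per (mouse, day).

    `records` is a hypothetical iterable of (mouse_id, day, value) tuples,
    with one tuple per imaging session.
    """
    grouped = defaultdict(list)
    for mouse_id, day, value in records:
        grouped[(mouse_id, day)].append(value)
    # Sessions from the same mouse on the same day are not independent,
    # so they are averaged into one data point.
    return {key: mean(values) for key, values in grouped.items()}


# Made-up stability scores: three sessions for mouse "m1", one for "m2".
sessions = [("m1", 1, 0.6), ("m1", 1, 0.8), ("m1", 1, 0.7), ("m2", 1, 0.5)]
per_mouse = one_value_per_mouse_day(sessions)
```

      With this grouping the sample size entering the statistics equals the number of mice recorded on a given day, not the number of sessions.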

      Robustness: the authors compute this metric, as the product of ensemble duration and average of the Jaccard similarity and find that stable ensembles display higher robustness: isn't it expected that robustness is higher in stable ensembles given that stable ensembles should be observed more often?
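
      To make the concern concrete, here is a minimal sketch of a robustness score built as described above (number of activations multiplied by the mean pairwise Jaccard similarity). It is an assumption-laden reading of the metric, not the authors' implementation: the function names and the representation of each activation as a set of neuron indices are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of active neuron indices."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def robustness(activations):
    """Number of activations times their mean pairwise Jaccard similarity.

    `activations` is a hypothetical list of sets, one per frame in which
    the ensemble was detected.
    """
    if len(activations) < 2:
        return 0.0
    pairs = list(combinations(activations, 2))
    mean_similarity = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return len(activations) * mean_similarity


# The same perfectly repeatable pattern, observed twice versus four times.
seen_twice = robustness([{1, 2}] * 2)
seen_four_times = robustness([{1, 2}] * 4)
```

      Under this reading the score grows with the number of detections even when similarity is unchanged, which is why higher robustness in frequently observed (stable) ensembles may be expected by construction.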

      Evoked ensembles: It seems to me that evoked ensembles are ensembles extracted during continuous imaging periods that include stimulation. However, one would expect evoked ensembles to be the cells activated time-locked to the visual stimulation. This notion only appears at the end of the paper with "tuned" neurons in Fig. 4. In the discussion, the authors conclude (lines 205-207) that "sensory stimulus reactivate existing ensembles". I do not think this is supported by the analysis performed here. For this, I believe that one would need to compare, within the same mouse, the amount of overlap between spontaneous ensembles and "tuned neurons".

      How representative are the illustrated examples in Figs. 2&3? The authors report that about 20 neurons remain active from day 1 to 46 but their main figures display example rasterplots with more than 60 neurons, which is three times more than the average. Is this example representative? Which indicator was used? Is there a difference in stability between 6f and 6s?

      Rasterplot filtering: The authors chose to restrict their ensemble analysis to frames with "significant coactivation". Why not use a statistical threshold to determine the number of cells above which a coactivation is significant instead of arbitrarily setting this number to three coactive neurons? In cases of high activity this number may be below significance.
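
      One way to obtain such a statistical threshold is a shuffle test, sketched below. This is a hypothetical illustration, not the authors' method: the function name, the circular-shift null model and the default parameters are all assumptions.

```python
import random


def coactivation_threshold(raster, n_shuffles=200, alpha=0.05, seed=0):
    """Smallest number of coactive cells that exceeds the (1 - alpha)
    quantile of a null distribution built from shuffled data.

    `raster` is a hypothetical list of binary lists (one per neuron, one
    entry per frame). Each neuron's train is circularly shifted by a random
    offset, which keeps its activity level but breaks timing across cells.
    """
    if not raster or not raster[0]:
        raise ValueError("raster must contain at least one neuron and frame")
    rng = random.Random(seed)
    n_frames = len(raster[0])
    null_counts = []
    for _ in range(n_shuffles):
        shuffled = []
        for train in raster:
            shift = rng.randrange(n_frames)
            shuffled.append(train[shift:] + train[:shift])
        for frame in range(n_frames):
            null_counts.append(sum(train[frame] for train in shuffled))
    null_counts.sort()
    index = min(len(null_counts) - 1, int((1 - alpha) * len(null_counts)))
    return null_counts[index] + 1


# Made-up extremes: a silent population and five always-active neurons.
quiet_threshold = coactivation_threshold([[0, 0, 0, 0]] * 3)
busy_threshold = coactivation_threshold([[1, 1, 1, 1]] * 5)
```

      In the highly active example the threshold exceeds the number of recorded cells, so no frame would count as a significant coactivation, whereas a fixed cut-off of three coactive neurons would accept every frame.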

      Demixing neuronal identity: The authors assign a neuron to an ensemble if it displays at least one functional connection with another neuron. They use reshuffling to test the significance of functional links, but it still seems that highly active neurons are more likely to display a high functional connectivity degree and therefore to be stable members of a given ensemble under that definition of ensemble membership. What is the justification for defining membership based on pairwise functional connectivity? The finding that core ensemble members display a high functional degree may simply reflect a property of highly active neurons (as previously described by Mizuseki et al. 2013).

      Type of neurons imaged: The authors use Vglut1-Cre mice, therefore they are excluding GABAergic cells from their study, this should be clearly mentioned and even discussed.

      Volumetric imaging: I am not sure one can say that "volumetric imaging" was performed here, rather this is multi-plane imaging.

      Mouse behavior: there is little detail concerning mouse behavior, are mice allowed to run? What is the correlation between ensemble activation and running?

      Abstract: the authors should say that 46 days is the longest period they have been recording, otherwise it gives the wrong impression that after 46 days ensembles are no longer stable. Also "most visually evoked ensembles" should be replaced by "ensembles observed during periods of visual stimulation" (see above). "In stable ensembles most neurons still belonged to the same ensemble after weeks": how could ensembles be stable otherwise?

      Discussion: I found the discussion quite succinct. It lacks discussion of the circuit mechanisms for assembly stability and plasticity (role of interneurons for example?), the limitations and possible biases in the analysis and the placing of the results in the perspective of other studies analyzing the long-term stability of neuronal dynamics.

    2. Reviewer #2:

      Overall I think the authors collected an interesting dataset. Analyses should be adjusted to include all cells rather than sub-selecting for stability. Additionally, the language needs to be adjusted to better reflect the data. I wish there was any behavioral data included, but if the authors compare their data to publicly available data in V1 for a single recording session during a visually guided task, these concerns could be quelled a bit.

      1) In general the language of this paper and title seem to mismatch the results. The fraction of cells that were 'stable' as the authors say on line 112 was very small, however the authors focus extensively on this small subset for the majority of analyses in the paper. Why ignore the bulk of data (line 119)? What happens if you repeat the same analysis and keep all cells in the dataset? The general language around stability of neural ensembles should be adjusted to better reflect the data (ex: lines 157, 225).

      2) There are claims in this paper about how ensembles 'implement long-term memories' in the introduction and conclusion and yet the authors never link the activity of ensembles to any behavioral or stimulus dependent feature. This language reaches far beyond the evidence provided in this paper. The introduction could provide some better framing for expectations of stability vs. drift in neural activity rather than focus on the link between ensembles and memory given that there isn't much focus on the ensembles' contribution to memory throughout. For example, the last sentence of the paper is not supported by data in the paper. Where is the link between ensembles and memory in the data? What is the evidence that transient ensembles are related to new or degraded memories? This reads as though it was the authors' hypothesis before doing the experiments and was not adjusted in light of the results.

      3) There is no discussion around the alternative to stability of neuronal ensembles. What are the current theories about representational drift? For example, in Line 34 the authors present an expectation for stability without any reasoning for why there need not be stability. This lack of framing makes their job of explaining results in line 217 more difficult. There is a possibility that the most stable cells aren't more important - what is the evidence that they are? Does an ensemble need a core? Would be interesting to include some discussion on the possibility of a drifting readout (Line 223). [https://doi.org/10.1016/j.conb.2019.08.005]

      4) How do activations in V1 in this dataset compare to other data collected from V1 while the animal is performing a task (where for example the angle of the gratings is relevant to how the mouse should respond)? I would be interested to know if the authors compared statistics of their ensembles to publicly available data recorded in V1 during a visually guided behavior. Are the ensembles tuned to anything in particular? Could they be related to movement? [http://repository.cshl.edu/id/eprint/38599/]

      5) The authors provide some hypotheses as to why fewer cells are active in the later imaging sessions (dead/dying cells?). This is worrisome in regards to how much it might have affected the imaged area's biology. One alternative hypothesis is that the animal is more familiar with the environment/ not running as much etc. Have the authors collected any behavioral data to compare over time?

      6) How much do the results change when you vary the 50% threshold of preserved neurons within an ensemble (Line 146)? Does it make sense to call an ensemble stable when 50% of the cells change? Especially given that the cells analyzed as contributing to an ensemble are already sub-selected to be within the small population of stable cells (Line 119)?
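
      The sensitivity check asked for here could look roughly like the following. The sketch assumes that "preserved" means the fraction of day-1 members still present in the later ensemble; the names and the example ensembles are hypothetical.

```python
def preserved_fraction(day1_members, later_members):
    """Fraction of day-1 ensemble members still present at the later session."""
    day1_members = set(day1_members)
    if not day1_members:
        return 0.0
    return len(day1_members & set(later_members)) / len(day1_members)


def stable_count_by_threshold(ensemble_pairs, thresholds):
    """Number of ensembles classified as stable at each preservation threshold."""
    fractions = [preserved_fraction(a, b) for a, b in ensemble_pairs]
    return {t: sum(f >= t for f in fractions) for t in thresholds}


# Hypothetical matched ensembles (day 1, later day), illustration only.
pairs = [
    ({1, 2, 3, 4}, {1, 2, 3, 9}),          # 75% preserved
    ({5, 6, 7, 8}, {5, 6, 30, 31}),        # 50% preserved
    ({10, 11, 12, 13}, {10, 20, 21, 22}),  # 25% preserved
]
counts = stable_count_by_threshold(pairs, [0.25, 0.5, 0.75])
```

      Reporting the count of stable ensembles across a range of thresholds would show how much the conclusion depends on the 50% cutoff.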

      7) Cells are referred to as 'stable' when they're active on 3 different sessions that are separated in time. However, the authors find a smaller number of cells are stable over extended time (43-46 days later). If we extrapolate this over more time, would we expect these cells to continue to be stable? Given these concerns, it might make more sense to qualify the language around stability by the timespan over which these cells were studied.

      8) Filtering frames to only coactive neurons for ensemble identification seems strange to me. Authors may be overestimating the extent of coactivation. What happens when you don't do this? How much do the results change when you don't subselect for Jaccard similarity? I would be interested to see how the results vary as you vary this threshold (Line 136).

      9) The term 'evoked activity' is misleading because the authors don't link these activations to the visual stimulus. There's no task, so the mice could be paying little attention to the stimulus. Should we really consider this activity to be visually driven? Could the authors provide any evidence of this?

      10) A method like seqNMF could reveal ensembles that are offset in time. This looser temporal constraint could potentially reveal more structure. This should be run on the entire dataset (without stability sub-selection). I suggest this as a potential alternative or supplement to the method described by the authors. [https://elifesciences.org/articles/38471]

    3. Reviewer #1:

      Perez-Ortega and colleagues performed rigorous experiments to determine if the activity of neurons in the visual cortex is similar across days, in particular comparing spontaneous activity in the absence of visual stimuli across days, which was previously not examined to my knowledge. The paper claims that evoked ensembles are more stable than spontaneous ensembles, but more convincing quantitative analyses are required to support these claims.

      Major Comments:

      1) There is only one mention of prior work with multi-day imaging in the visual cortex (Ranson 2017). Another related study to cite and compare your results to would be Jeon, ..., Kuhlman 2018 (and I think a comment about how similar/different your results are from this study + Ranson would be useful for the reader). I would also recommend mentioning that there are studies that have observed differences in evoked activity across learning in V1 (e.g. Poort, Khan et al 2015; Henschke, Dylda et al 2020). Do you think there was adaptation across days to the stimulus that you repeated?

      2) Some GCaMP6f mice have aberrant cortical activity (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5604087/). In the raw data (Fig 1F) it doesn't look present, but it would be useful to show more time and sort the neurons by their first PC weights perhaps to see the activity structure.

      3) The approach of 3 plane imaging taking the maximum projection seems useful for tracking cells across days. There is a claim that some cells are no longer found / no longer active. Based on Fig 1G it appears there may have been some Z-movement from day 10 to day 46. This Z movement may explain some of the lost active cells. As a sanity check I would recommend plotting the Z-plane on which the cells were maximally active on day 1 vs the Z-plane on which the cells were maximally active on day n.

      4) There is an emphasis on analyzing the data as ensembles but I think this may be missing other slow, gradual changes. The definition of stable is at least 50% of neurons were preserved across days. However, the fitting procedure of finding ensembles may produce different ensembles even if those neurons are still correlated to each other. I would recommend two possible additional analyses: 1) compare the correlation matrices for common neurons across days (unless there are too few neurons for this); 2) look at changes in single neuron statistics across days. For 2) this may include reliability of neural responses to the visual stimuli, the weights of the neuron onto the first principal component of spontaneous activity, or the correlation of a neuron with running speed. I think these results may solidify your ensemble result (evoked-related statistics change less across time).
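
      As a sketch of suggestion 1), the pairwise correlation structure of neurons tracked on both days can be compared directly, without fitting ensembles. The data layout (a dict from neuron id to activity trace) and the similarity measure (Pearson correlation between the two vectors of pairwise correlations) are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 for constant inputs (an assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(traces, neuron_ids):
    """Upper-triangle correlations, in a fixed order, for the given neurons."""
    return [pearson(traces[i], traces[j])
            for i, j in combinations(neuron_ids, 2)]


def correlation_structure_similarity(day_a, day_b):
    """Correlate the pairwise-correlation vectors of neurons common to both days."""
    common = sorted(set(day_a) & set(day_b))
    return pearson(pairwise_correlations(day_a, common),
                   pairwise_correlations(day_b, common))


# Hypothetical traces. The later day holds time-reversed copies, which keep
# all pairwise correlations, plus one neuron that was not tracked on day 1.
day1 = {"n1": [0, 1, 0, 1, 0, 1], "n2": [0, 1, 0, 1, 1, 0], "n3": [1, 0, 1, 0, 0, 1]}
day_n = {k: v[::-1] for k, v in day1.items()}
day_n["n9"] = [1, 1, 0, 0, 1, 0]
similarity = correlation_structure_similarity(day1, day_n)
```

      A similarity near 1 would indicate that the correlation structure is preserved even if an ensemble-fitting procedure returned different groupings on the two days.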

    1. Reviewer #3:

      From the technical perspective this manuscript provides clear results that are consistent with, but do not prove, what this reviewer believes is the main objective of the work; to establish the relevance of the open structure of the eukaryotic cysteine desulfurase complex. This reviewer has no good basis to either accept or reject the open structure as having physiological relevance. This could well be the case but it is not clear from my (limited) knowledge of the published literature that the relevance of the open structure is generally accepted. From this perspective I believe the manuscript is sound from the technical approach and experimental implementation but suffers from a lack of clarity about the case for and against the relevance of the open structure. If this is a point of controversy in the field the topic should be discussed in depth and the position of the authors more clearly articulated.

    2. Reviewer #2:

      In this manuscript, Barondeau and co-workers test a hypothesis for the role of the protein frataxin in iron-sulfur cluster assembly, seeking, inter alia, to explain the observation that mutations in the gene encoding this protein are associated with the incurable neurodegenerative disease, Friedreich's ataxia. Their notion is that, whereas the bacterial versions of the sulfur-providing cysteine desulfurase are stable homodimers - in which the interactions between the monomers help to organize the mobile loop harboring the key cysteine residue that serves as general acid and nucleophile in the C-S-cleavage reaction that mobilizes the sulfur for incorporation into the cluster - the human enzyme (i) has a dimer interface that has been weakened through evolution, (ii) can be monomeric or form non-optimal dimeric forms, and (iii) can be driven to adopt the optimally active dimer form by intervention of accessory proteins (e.g., frataxin). Their approach was to perturb a bacterial (E. coli) cysteine desulfurase (IscS) by structure-guided mutagenesis in an attempt to introduce into it the behavior of the human enzyme, specifically its activation by accessory proteins (here CyaA and FXN). The experiments were successful in this goal. I like this paper and believe that it is interesting and important. I would point out two aspects that perhaps leave room for improvement.

      1) In principle, it would have been a more powerful test of their hypothesis had they been able to perturb the human enzyme to get a constitutively active form, no longer dependent on the binding of the accessory proteins, either instead of, or in addition to, the converse perturbation of the bacterial system. Perhaps this approach was precluded by difficulties associated with the human enzyme?

      2) The second criticism is that the effects on quinonoid form decay and activity are rather modest. However, I believe that important biological effects can arise from even such modest regulation of enzyme activity levels.

    3. Reviewer #1:

      This study presents a detailed and focused study of the structural basis for a regulation strategy used by a human iron-sulfur cluster biosynthesis system, elucidated by artificial installation of new amino acids into a bacterial system that lacks the allosteric elements of the human enzyme. The work includes quaternary structure analysis and activity assays of variant bacterial proteins. It is performed competently and supports the conclusions. But the focus may be too narrow for a general audience. To bring the work over the bar, the authors could test whether installing the bacterial residues into human NFS1 restores activity without frataxin (inactivated in the human genetic disorder Friedreich's ataxia). Furthermore, some elements of the study could be presented more clearly/rigorously to communicate the significance of the work to a general audience. These suggestions are listed below.

      1) It would be useful for an unfamiliar reader to include a diagram of the bacterial and human iron-sulfur cluster biogenesis pathway. It would also be helpful to depict the mechanism of the IscS/NFS1 cysteine desulfurase reaction - essentially a picture to go along with the description of the PLP-dependent transformations described in paragraph 2.

      2) In the first paragraph of the results section - I would be interested to see more details about the selection of the three residues targeted for mutagenesis. For example, did the authors inspect the interfaces of existing crystal structures of these complexes? Did they create sequence alignments for multiple eukaryotic/prokaryotic cysteine desulfurases and select sites conserved in bacterial proteins but not eukaryotic ones? More description of the experimental or bioinformatics basis for selecting these three sites would be important for convincing the reader that the basis for this work is sound.

      3) The structural basis for the dimer interaction and the enhanced activity isn't completely clear - how do the changed interactions enhance the enzyme activity? A good description of the different quaternary forms and why they are more/less active is given on page 4-5 - but perhaps another link could be made between the exact residues targeted for substitution and the features of the system important for catalysis.

      4) On page 10, the authors describe changes in IscS quaternary structure as a function of concentration. What is the estimated copy number or concentration inside the cell? Which concentration ranges would be most physiologically relevant?

      5) Addition of any helper protein appears to increase the proportion of variant IscS dimer and activity. Is there any reason to believe that this phenomenon is simply a crowding effect? If the same amount of an unrelated protein is added - does the activity/dimer fraction change compared to variant IscS alone?

      6) I found the color scheme in Figure 1 hard to follow - could the authors keep the subunit colors consistent and use text labels directly on the figure panels for the subunits and forms (open, ready, etc). I also don't think the "Clash!!" labels are necessary. A more effective approach might be to use zoomed-in insets for each clash.

      7) In Figures 4-6 - could the authors include a more complete description of the error bars? What kind of error is shown? Are the replicates different experiments done on different days? These presentations might also benefit from showing the actual data points on top of the bars/error bars.

    1. Reviewer #3:

      This study combines two cutting-edge approaches for the study of polyclonal antibody responses to understand the molecular profiles of antibodies elicited by HIV envelope trimer immunization in a rabbit model. In one arm of the study, the authors performed mutational profiling of serum antibody neutralization escape variants, and in the second arm they used electron microscopy polyclonal epitope mapping (EMPEM) to track antibody binding sites. These authors performed large-scale data collection and present high-quality validation data and explorations of the resulting datasets that compare antibody binding and virus neutralization profiles. These approaches provide a comprehensive window into the molecular specificity and performance of HIV immunization and are expected to inform advanced HIV-1 vaccine designs.

      Summary of any substantive concerns:

      The authors have done a nice job validating the integrity of the NGS data, and the strong data in Figs 4/5/2B show the power of the NGS-based neutralization mapping assays. This adds a solid confirmation of the study findings and demonstrates the quality of the techniques. Overall this is a solid study and the findings are informative. I see just a few methods updates and analyses that would help finalize the presentation of methods and data.

      1) Additional information on the bioinformatic methods for data analysis is needed. How did the authors handle discrepancies in data across replicates or libraries, for example if a mutation was enriched in one library or replicate, but deleted in another? Were there any quality filters or metrics used to estimate true signal vs. noise?

      2) Differential selection statistics are mentioned briefly, along with citations to prior publications. Prior citations are definitely helpful. I think it is still important to state the key steps used in processing NGS data and the statistical techniques and quality metrics that were used. The authors should also state any criteria for acceptance or rejection or binning of individual data points, or acceptance/rejection of datasets or replicates, if quantitative criteria or metrics were used.

      3) Several replicates showed a low percentage infectivity (Fig S1, e.g. animals 5724 and 2124), but the text indicates averages between 0.3% and 2.7% infectivity. Were some groups omitted from analysis, or were all groups included?

      4) How well did the mutational profiles correlate between different libraries or replicates of the same samples?

    2. Reviewer #2:

      This manuscript by Dingens et al. develops a novel application of mutational antigenic scanning to identify dominant neutralizing antibody epitopes in polyclonal sera from vaccinated animals, and compares the findings of such techniques with those from cryo-EM based unbiased mapping of binding antibodies and from conventional mutational mapping of neutralizing epitopes. Overall, I find the experiments and analyses to be of high quality, thorough and of sound reasoning, and the manuscript to be well written. I also commend the authors for the development of a facile and easy-to-use interactive viewer for exploring the mutational scanning data. I think the dual approach of mutational scanning and cryo-EM based mapping has the potential to be a powerful approach for dissecting antibody content of polyclonal sera post-vaccination or in infected hosts.

      The only major concern I could identify is the following. One of the main advantages of the mutational scanning approach is that it can identify novel epitopes targeted by antibody responses in a high-throughput manner. It is a little disappointing that this advantage was not leveraged in the current manuscript, perhaps due to the choice of the vaccine (BG505 SOSIP trimers where the epitopes have been thoroughly mapped in the literature) and the selection of vaccinated animals. Looking at Fig. 2, animal 5727 was the only animal whose serum showed some selection signatures outside of the regions considered in depth (at sites 507 and 509) - have the authors analyzed these escape mutations? If not, and only if possible within reasonable workload, I urge the authors to pursue this example or any other example where a potential novel epitope discovery could be possible.

    3. Reviewer #1:

      Dingens et al. report a timely complementary study to map neutralizing and binding responses in polyclonal rabbit sera induced by immunization with the BG505 SOSIP Env trimer. Neutralizing responses are mapped using libraries of replication-competent HIV expressing all mutants of the BG505 Env, an approach developed in the Bloom laboratory. Binding responses were mapped using an EM-based method, EMPEM, developed in the Ward laboratory. The Env mutations that affect neutralization of the autologous BG505 strain in the BG505-SOSIP-immunized animals were largely known from other studies, as were the binding (not necessarily neutralizing) responses - the strength of this study is the combination of the two approaches. It is especially useful that the complex datasets have been deposited on-line where they can be interactively explored, including mapping onto Env trimer and monomer structures. Although results were anticipated, it is very nice to directly compare the neutralization epitopes to the binding epitopes determined by EMPEM. This is a well-written and beautifully illustrated paper.

    1. Reviewer #3:

      The authors probe mechanosensory processing in Hydra by measuring calcium activity in neurons and muscles in response to precise mechanosensory stimulation in whole and resected animals. The authors' claims are well supported by the evidence. The development of a mechanosensory delivery system for Hydra is also a significant methodological advancement. Taken together, the work advances our understanding of the Hydra nervous system and is a needed step towards developing Hydra as a powerful model for systems neuroscience.

      Substantive concerns:

      1) One weakness is that different measures of "mechanosensory response" are used at different places in the manuscript. In some contexts, a response is defined as calcium activity in neurons (Fig 2), and elsewhere as calcium activity in muscles (Fig 3 and 4). And in Fig2 SuppFig2 muscle contractions are also measured using MeKs. The relation between neural activity, muscle activity and body movement is of course of high interest, and the paper explores this. But, if technically possible, it would be helpful to report a single metric of behavior that could be used in all experiments. For example, it might be possible to use video of the animal's pose or body length to measure contractions in all experiments. At a minimum the reasoning behind choice of measurement of response for each experiment could be discussed explicitly.

      2) Related: Without a consistent measure of behavior, it will be important to further clarify figures so that a reader can tell at-a-glance how contraction probability is being measured.

    2. Reviewer #2:

      The Hydra, in the phylum cnidaria, is a near microscopic freshwater animal that has recently resurfaced as an attractive model organism in neuroscience due to its optically accessible transparent body, sparsely distributed neural network, and simple behaviors. In this manuscript, Badhiwala and colleagues use calcium imaging of the Hydra neural network, combined with surgical resection and microfluidics pressure stimulation to identify body regions indispensable for mechanosensory activity. They report that while resection of the aboral region did not abolish the mechanical response, resection of the oral region attenuated this response, while combined resection of oral and aboral regions showed the greatest effect. They also find a correlation between reduced stimulated activity and spontaneous activity, suggesting a common mechanism that gives rise to both activities. While this study takes on an innovative approach by using a microfluidics device to mechanically stimulate the hydra under optical recording there are a number of conceptual and technical limitations. Perhaps my biggest reservation is that despite real potential, the data are rather low resolution (body transections and bulk calcium responses) and as such the conclusions that can be reasonably drawn do not extend what is known in a significant way.

      Major comments:

      1) The authors have designed a microfluidic device that allows them to simultaneously mechanically stimulate, monitor movement and functionally image a hydra. The highly quantifiable nature of the microfluidic device is a great asset, although this potential is not deeply explored. While I can see how the microfluidic stimulation could offer benefits over fluid jet or blunt probe, more in-depth characterization is needed.

      2) What is the spatial distribution of the pressure pulse stimulus on the Hydra body? How far does the mechanical force spread from the region directly touching the pressure valve?

      3) The use of the microfluidic device was limited. Have the authors attempted to map mechanical sensitivity across the Hydra body by stimulating different sites?

      4) The authors have not attempted to record calcium responses from single neurons, but rather spatially average a population response from a large region of interest. This should be specifically stated in the results section. More importantly, to provide insight into network function much smaller ROIs over multiple sites are needed instead of the bulk activity of the entire peduncle. This seems like a real lost opportunity as the lure of the optically clear and small Hydra is that neural representation and coding can be tracked over large portions of the network at cellular resolution.

      5) It is unclear where the recorded signals are coming from and if movement is creating artifacts. Have authors made any attempts to correct for movement? The supplemental movies show a stationary region of interest and moving animal, in some cases parts of animal moving in and out. Furthermore, is background subtracted and how? There is a large fluorescent signal coming from the entire body/ middle columnar part of the body and spontaneous firing that makes interpretation of the data difficult.

      6) Contraction is a behavioral response of the animal; however, the authors use 'contraction' to describe calcium imaging responses throughout the figures and text. This should be avoided.

      7) I am unsure if the title of the paper is accurate. I do not think this work has demonstrated "multiple nerve rings" are important for coordinating mechanosensory behavior.

      8) Furthermore, the claim that the observed "linear relationship" between the spontaneous contraction probability and resection type is evidence for shared neural pathways is a stretch. These data are fairly coarse resolution and include only 3 animals in each group with highly variable responses (Figure 4C). Additionally, they do not provide evidence to distinguish the motor circuits they hypothesized these neural nets converge upon.

    3. Reviewer #1:

      The manuscript by Badhiwala et al. is an interesting study using the emerging model system Hydra, which has many advantages for studying the entire nervous system of an animal during simple behavior. Some of the foundational neuroscience papers in this field have only come out in the past few years, and new studies such as the one here might have the potential to contribute to an important early literature. Despite clear reasons for enthusiasm, the many shortcomings in this work greatly diminished my enthusiasm and support for this study. Although I appreciate building the microfluidic device with simultaneous pan-neuronal imaging, the nature of the new biological insights provided here seems quite limited and easily predicted based on prior studies in hydra and other model systems. Moreover, the crude nature of some experiments inhibits my ability to make fair judgement of potential findings.

      Major concerns:

      1) The pressurized stimulation of the hydra appeared to be specific to the center of the body. The authors don't mention why this region was chosen, which seems critical to this study. Relatedly, why didn't they test multiple areas across the hydra with this system? Might we expect to see different sensorimotor behaviors, and thus different neural outputs?

      2) The authors reference a recent single cell study characterizing multiple neuronal cell types in hydra. This work would greatly benefit by using some cell-type resolution studies to determine the functional nature of the neurons being activated as opposed to solely using pan-neuronal GCAMP imaging. If they can put GCAMP in all neurons, why not put it in specific subsets of neurons based on cellular identity? This point becomes more salient because a major take-home from this paper is that the spontaneous behavior and firing patterns is nearly identical to the stimulus evoked patterns, except for an apparent increase in firing rate. The true nature of the mechanosensory response might be revealed with cell-type specific experiments.

      3) Although the authors reference whole animal imaging, they focus imaging analysis on peduncle and hypostomal nerve rings, despite the videos showing calcium activity in other areas throughout the body. Moreover, are the authors certain their pan-neuronal genetic strategy equally samples neurons throughout the body? In other words, is the apparent increase in activity in the nerve ring over other areas driven by a technical artifact of these neurons being labeled better?

      4) While I appreciate the resection studies to get at "loss-of-function" experiments, this approach seems rather crude, and potentially confounding to clear interpretation. Exactly which neurons are killed and to what extent, and how many, if any, began to regenerate throughout this process? My alarm here is raised especially in light of the authors' surprising finding that "footless" animals show that the aboral nerve ring is not required for spontaneous or mechanosensory responses. What if residual activity from neurons not ablated is driving this response?

    1. Summary: This work assesses the role of within-host viral shedding dynamics and contact heterogeneity on distribution of transmission events in SARS-CoV-2 and influenza. Using multi-scale modeling, with similar resulting generation time and serial interval distributions to published work, predictions are made on the manner and contribution of super spreading to transmission. Distinctions are seen when comparing to applying a similar modeling framework to influenza.

      Essential revisions:

      1) Statistical analysis: The model parameters are estimated using an exhaustive grid search, which yields good fits for the best-fit values, but there is no assessment of statistical certainty in the parameter values. The authors essentially adopted a strategy in the spirit of approximate Bayesian computation (ABC), by proposing parameter values, simulating from a model, and comparing summary statistics of the simulated output to known values from the literature. The analysis would be helped by doing a more formal ABC analysis, as this would provide a better sense of how narrowly constrained the parameter values are given the available data. At minimum, it would be more convincing to consider additional parameter sets gridded across a narrowed region of parameter space before selecting an optimal fit.
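
      For orientation, a rejection-ABC loop of the kind referred to above can be sketched in a few lines. The toy model below (one parameter, exponential draws, the sample mean as summary statistic) is a stand-in chosen for illustration and has nothing to do with the authors' transmission model; the prior range, tolerance and observed value are all hypothetical.

```python
import random
from statistics import mean


def abc_rejection(observed_summary, simulate, prior_sample, distance,
                  tolerance, n_proposals, seed=0):
    """Keep proposed parameters whose simulated summary is close to the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_proposals):
        theta = prior_sample(rng)
        if distance(simulate(theta, rng), observed_summary) <= tolerance:
            accepted.append(theta)
    return accepted


# Toy stand-in model: theta is the mean of an exponential distribution and
# the summary statistic is the mean of 50 simulated draws.
def simulate(theta, rng):
    return mean(rng.expovariate(1.0 / theta) for _ in range(50))


def prior_sample(rng):
    return rng.uniform(0.1, 5.0)


def distance(simulated, observed):
    return abs(simulated - observed)


posterior = abc_rejection(2.0, simulate, prior_sample, distance,
                          tolerance=0.2, n_proposals=2000)
```

      The spread of the accepted values, rather than a single best-fit point, is what would convey how tightly the data constrain each parameter.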

      2) Model validation: The state of our knowledge about these infections is limited, both by the short time during which this research has been conducted, and the paper's need to rely on data taken from before the introduction of confounding factors such as social distancing and widespread mask usage. For this reason, in addition to the included sensitivity analysis for the model parameters, a sense of the sensitivity of the model's conclusions to the data set to which it is being fitted is needed. How much would these results change if there are errors in our understanding of the distribution of individual R0 values, or serial intervals?

      3) Distinction in assumptions for flu and COVID: The populations on which the histograms for the two diseases are based are quite different. For SARS-CoV-2, the studies are from China (Shenzhen, Tianjin and Hong Kong), while those for influenza are from Switzerland. Could cultural differences be relevant? What about seasonal differences, as the time during which the early SARS-CoV-2 studies occurred was necessarily restricted?

      Furthermore, the explanation for the difference between influenza and COVID is based primarily on differences in contact patterns. However, the discussion (L. 511-523) clarifies this to be based on the efficiency with which exposures lead to infections (and pre-symptomatic transmission), which sounds like a viral parameter rather than a social one. These viral factors do seem more believable than having to explain why the patterns of social contact exhibited by influenza patients would differ from those of SARS-CoV-2 patients. More focus on possible mechanistic explanations is warranted.

    1. Reviewer #3:

      The manuscript describes interesting experimental and modelling results of a novel study of human navigation in virtual space, where participants had to move towards a briefly flashed target using optic flow and/or vestibular cues to infer their trajectory via path integration. To investigate whether control dynamics influence performance, the transfer function between joystick deflection and self-motion velocity was modified trial-by-trial in a clever way. To explain the main result that navigation error depends on control dynamics, the authors propose a probabilistic model in which an internal estimate of dynamics is biased by a strong prior. Even though the paper is clearly written and contains most of the necessary information, the study has several shortcomings, as outlined below, and an important alternative hypothesis has not been considered, so that some of the conclusions are not fully supported by results and modelling.

      Substantive concerns

      1) The main idea of the paper for explaining the influence of control dynamics is that for accurate path integration performance participants have to estimate dynamics. This idea is apparently inspired by studies on limb motor control. However, tasks in these studies are often ballistic, because durations are short compared to feedback delays. In navigation, this is not the case and participants can therefore rely on feedback control (see point 2 below for another reason why reliance on sensory feedback is a good idea in the present study). This means that the task can be solved, even though not perfectly, without actually knowing the control dynamics. Thus, an alternative hypothesis for explaining the results that has not been considered is that the dependence of error on control dynamics is a direct consequence of feedback control. Feedback control models have previously been suggested for goal-directed path integration (e.g., Grasso et al. 1999; Glasauer et al. 2007).

      To test this assumption, I modelled the experiment assuming a simple bang-bang feedback control that switches at a predefined and constant perceived distance from the target from +1 to -1 and stops when perceived velocity is smaller than an epsilon. Sensory feedback is perceived position, which is assumed to be computed via integration of optic flow. This model predicts a response gain of unity, a strong dependence of error on time constant (slope similar to Fig. 3) or of response gain on time constant (Eqn. 4.1) with regression coefficients of 0.8 and 0.05 (cf. Fig. 3D), and a modest correlation between movement duration and time constant (r approximately 0.2, similar to Fig. 3A). Thus, a feedback model uninformed about actual motion dynamics and without any attempt to estimate them can explain most features of the data. Modifications (velocity uncertainty, delayed perception, noise on the stopping criterion, etc.) do not change the main features of the simulation results.

      Accordingly, since simple feedback control seems to be an alternative to estimating control dynamics in this experiment, the authors' conclusion in the abstract "that people need an accurate internal model of control dynamics when navigating in volatile environments" is not supported by the current results.
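      The bang-bang feedback scheme described in point 1 can be sketched as follows. This is an illustrative reconstruction, not the reviewer's actual simulation: the first-order low-pass dynamics between joystick command and velocity, the fixed maximum velocity, perfect position perception, and all parameter values and names are assumptions.

```python
def simulate_trial(tau, target=4.0, switch_dist=0.5, vmax=1.0,
                   eps=0.01, dt=0.001, t_max=60.0):
    """Bang-bang feedback control that knows nothing about the time constant tau.

    The command is +1 until the perceived distance to the target falls to
    switch_dist, then -1 until the perceived velocity drops below eps.
    Perceived position is taken to equal true position (assumption).
    Returns (stopping position, trial duration).
    """
    x, v, t = 0.0, 0.0, 0.0
    braking = False
    while t < t_max:
        if not braking and target - x <= switch_dist:
            braking = True
        if braking and v < eps:
            break  # stopping criterion reached
        u = -1.0 if braking else 1.0
        v += dt * (vmax * u - v) / tau  # assumed low-pass joystick-to-velocity dynamics
        x += dt * v
        t += dt
    return x, t
```

      Under these assumptions, running the function for a range of tau values gives a stopping position that shifts with the time constant even though the controller never estimates it, which is the qualitative point of the alternative hypothesis.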

      2) Modelling: the main rationale of the model (line 173 ff: "From a normative standpoint, ...") is correct, but an accurate estimate of the dynamics is only required if the uncertainty of the velocity estimate based on the efference copy is not too large. Otherwise, velocity estimation should rely predominantly on sensory input. In my opinion that's what happens here: due to the trial-by-trial variation in dynamics, estimates based on efference copy are very unreliable (the same command generates a different sensory feedback in each trial), and participants resort to sensory input for velocity estimation. This results in feedback control, which, as mentioned above, seems to be compatible with the results.

      3) Motion cueing: Motion cueing can, in the best case, approximate the vestibular cues that would be present during real motion. Furthermore, it is not clear whether the applied tilt is really perceived as linear acceleration, or whether the induced semicircular canal stimulus is too strong so that subjects experience tilt. Participants might have used the tilt as indicator for onset or offset of translational motion, specifically because it is self-generated, but the contribution of the vestibular cues found in the present experiment might be completely different from what would happen during real movement. Therefore, conclusions about vestibular contributions are not warranted here and cannot solve the questions around "conflicting findings" mentioned in the introduction.

      4) Methods: I was not able to find an important piece of information: how many trials were performed in each condition? Without this information, the statistical results are incomplete. It was also not possible to compute the maximal velocity allowed by joystick control, since for Eqn. 1.9 not just the displacement x and the time constant are required, but also the trial duration T, which is not reported. One can only guess from Fig. 1D that vmax is about 50 cm/s for tau=0.6 s and therefore the average T is assumed to be around 8.5 s.

      5) Results: information that would be useful is not reported. On page 6 it is mentioned that the "effect of control dynamics must be due to either differences in travel duration or velocity profiles", it is then stated that both are "unlikely", but no results are given. It turns out that in the supplementary Figure 4A the correlation between time constant and duration/velocity is shown, and apparently the correlation with duration is significant (but small) in the majority of cases. Why is that not discussed in the results section? Other results are also not reported, for example, what was the slope of the dependence between time constant and error? Why is the actual control signal, the joystick command, not shown and analyzed?

    2. Reviewer #2:

      The authors asked how the brain uses different sensory signals to estimate self-motion for path integration in the presence of different movement dynamics. They used a new paradigm to show that path integration based on vision was mostly accurate, but vestibular signals alone led to systematic errors particularly for velocity-based control.

      While I really like the general idea and approach, the conclusions of this study hinge on a number of assumptions for which it would be helpful if the authors could provide better justifications. I also have some clarification questions for certain parts of the manuscript.

      1) Lines 26-7: "performance in all conditions was highly sensitive to the underlying control dynamics". This is hard to really appreciate from the residual error regressions in Fig 3 and seems to be contradicting Fig 5A (for vestibular condition). A more explicit demonstration of how tau affects performance would be helpful.

      2) One of the main potential caveats I see in the study design is the fact that trial types (vest, visual, combined) were randomly interleaved. In the combined condition, this could potentially result in a form of calibration of the vestibular signal and/or a better estimate of tau that then is used for a subsequent vestibular-only trial. As such, you'd expect a history effect based on trial type more so than (or in addition to) simple sequence effects. This is particularly true since you have a random walk design for across-trial changes of tau. In other words, my question is whether in the vestibular condition participants simply use their previous estimate of tau, since that would be on average close enough to the real tau?

      3) I thought the experimental design was very clever, but I was missing some crucial information regarding the design choices and their consequences. First, has there been a psychophysical validation of GIA vs pure inertial acceleration? Second, were GIAs always well above the vestibular motion detection threshold? In other words could the worse performance in the vestibular condition be simply related to signal detection limitations? Third, how often did the motion platform enter the platform motion range limit regime (non-linear portion of sigmoid)?

      4) Lines 331-345: it's unclear to me why you did not propose a more normative framework as outlined here. Especially, a model that would "constrain the hypothesized brain computation and their neurophysiological correlates" would be highly desirable and really strengthen the future impact of this study.

      5) I would highly recommend all data to be made available online in the same way as the analysis code has been made available.

    3. Reviewer #1:

      The authors investigated the importance of visual and vestibular sensory cues and the underlying motion dynamics to the accuracy of spatial navigation by human subjects. A virtual environment coupled with a 6-degrees of motion platform, as described in prior studies, allowed precise control over sensory cues and motion dynamics. The research builds on previous work in several important ways: 1) the authors demonstrate that reliance on vestibular cues leads to an undershooting of trajectories to hidden goal locations, 2) manipulation of the underlying motion dynamics (the time constant) during navigation alters the accuracy of trajectories particularly when subjects are reliant on vestibular cues, 3) probabilistic models were used to demonstrate that path integration errors can be explained by mis-estimates of the underlying motion time constants, and 4) time constant estimates were improved when visual cues were available. Overall, the analyses are appropriate, the conclusions are judicious, and the authors provide an important contribution to understanding the sensory mechanisms underlying human spatial navigation.

      1) Some minor methodological clarifications: how many trials were performed per subject? How many of the trials were performed in each condition (visual, vestibular, combined)?

      2) The study tested performance by both male and female subjects. Could the authors comment as to whether sex differences were observed across performance measures? Perhaps sex can be indicated in some of the scatter plots.

      3) Figure 2A. It would be helpful if the authors identified the start-point of the trajectory and also provided more explanation of the schematic in the caption.

      4) Figure 2B-C. It would be helpful if the authors could expand this section to show some example trajectories and the relationship between examples and plotted data points. This could be done by presenting measures (radial distance, angular eccentricity, gain) for each example trajectory.

      5) Because the range of sampled time-constants can vary across subjects, it would be nice to show plots as in Figure 3B for each subject (i.e., in supplementary material).

      6) Discussion. The broader implications of the findings from the models are not sufficiently discussed. In addition, some comparison could also be made to other recent efforts to model path integration error (e.g., PMC7250899).

    1. Reviewer #3:

      This is an outstanding work from the lab of Dr. Stains establishing rapid post-translational regulation of sclerostin, a robust inhibitor of bone formation. They carefully and clearly establish that sclerostin is rapidly degraded by lysosomes in response to mechanical loading, and further link lysosomal abnormalities, using Gaucher iPSCs, to sclerostin levels.

    2. Reviewer #2:

      The article by Gould et al breaks new ground by demonstrating a role for lysosomal-mediated degradation in the mechanosensitive repression of Sclerostin levels in bone. Though the post-translational repression of Sclerostin has long been apparent, no one has yet unraveled the mechanisms. Therefore, this discovery is important to the skeletal biology community - both because of the findings themselves, and because the conditions/models used by this team to make these discoveries will be useful for other investigators, including their ability to manipulate and observe the rapid lysosome-dependent control of Sclerostin levels in vitro and in vivo in response to PTH or mechanical stimulation. In addition to the importance within this field, the work has broad impact on multiple levels including a) the clinical relevance for understanding and potentially treating osteoporosis and the skeletal phenotypes in individuals with lysosomal disease, and b) the mechanoregulation of lysosomal function and its relationships to crinophagy, which has implications not only for the regulation of Sclerostin, but also for other factors in and beyond the skeleton (RANKL, insulin).

      The study is elegantly designed, clearly communicated, and rigorously conducted. The conclusions drawn in the manuscript are mostly supported by the data provided. In general, it is important to elaborate on what gives the authors confidence that the inhibitors were effective and act as expected throughout the study - but especially Bafilomycin A1 and Apocynin in vivo. If BafA1 and Apocynin treatment in vivo work as expected, they should prevent the rapid load-dependent repression of Sclerostin levels (shown in Figure 1D).

      Other revisions or additions, described below, would improve the quality of the study:

      1) Are Sclerostin levels insensitive to FSS or PTH in Gaucher cells (though it understandably may not be feasible to differentiate these cells in microfluidic devices)?

      2) Since a sex-specific effect of exercise on bone anabolism has previously been described, and TRPV4 also has a sexually dimorphic effect on bone, were any differences observed between male and female animals here?

      3) Can the authors discuss where the pathway used by PTH diverges from that activated by FSS/load?

      4) Is it possible to detect load dependent changes in sclerostin localization in lysosomes in vivo?

      5) Given the non-specific effects of hydrogen peroxide, Figure 6D may not add a great deal in light of the other data that was gathered with more rigorous approaches. Additional controls would give more confidence in the efficacy/specificity of this approach.

      6) Please include how long the OCY454 cells were differentiated prior to the treatments applied.

      7) Please identify the route by which inhibitory agents were administered to the mice (i.e. subcutaneous, intraperitoneal).

      8) Please increase the N for experiments in Figure 4A and 5D, or remove these data and the corresponding conclusions.

    3. Reviewer #1:

      This manuscript by Gould et al presents highly novel data which is logically presented and is likely to have both clinical and fundamental implications. Of relevance to the bone field, it defines a new mechanism by which one of the most important clinical targets for the treatment of osteoporosis is endogenously regulated. Beyond bone, I am not aware of any other examples of stimulus-directed acute lysosomal degradation of a secreted canonical Wnt antagonist as a mechanism to provide rapid de-repression. What seems lacking is a careful analysis of the physiological consequences of the acute degradation of sclerostin.

      1) A landmark paper which convinced many in the field that sclerostin down-regulation is necessary for osteoanabolic responses to loading was based on a transgenic model from the Bellido lab (Tu et al, Bone, 2012). In that study, expression of Sost from the DMP-1 promoter precluded its transcriptional and protein-level down-regulation at late time points. That was sufficient to largely prevent bone gain following loading. Several other groups interpreted this as indicating Sost transcript regulation is required for bone's adaptation to loading, calling into question the physiological relevance of transient post-translational degradation described here. Can the authors reconcile that study with their own?

      2) One way the authors attempt to demonstrate in vivo relevance is through western blotting of mechanically loaded mouse ulnas, showing previously-undocumented acute reductions in lysate sclerostin levels. It is standard practice in the field to quantify sclerostin positive osteocytes histologically, rather than by western blotting. This is because mechanical loading can rapidly increase blood flow to the limb (even in this study, the authors implicate the vasodilator NO) as well as having inflammatory effects, diluting the proportion of osteocyte-specific proteins in the lysate. Demonstrating protein-level sclerostin down-regulation specifically in osteocytes rapidly following loading would be a major addition to this study.

      3) A long-standing, reproducible finding which has always been very perplexing is that the largest transcriptomic responses to osteogenic mechanical loading occur very quickly, within an hour of loading, before Sost is down-regulated. Even in UMR106 cells in vitro, B-catenin is stabilised before Sost is down-regulated following exposure to substrate strain. The current findings may explain this temporal discrepancy. The authors should assess responses to sclerostin degradation, for example by quantifying Wnt target genes, to provide physiologically-relevant readouts of their findings.

      4) Figure 3 shows co-localisation of endogenous or over-expressed sclerostin with lysosomal markers. Does this co-localisation change following FSS or PTH?

      5) It is not clear whether early lysosomal degradation which transiently decreases sclerostin is triggered by the same mechanoresponsive pathways which subsequently down-regulate its RNA levels, or whether the two responses are distinct. Can the authors clarify this? For example, does Sost decrease in the BafA1-treated cells 8 hours after FSS or PTH treatment?

      6) Discussion "that the rapid and transient nature of sclerostin degradation may be critical to the precise anatomical positioning of new bone formation following an anabolic stimulus" is very unclear. How do the authors propose that lysosomal sclerostin degradation produces regionalised responses to a greater degree than the previously-reported transcriptional mechanisms?

      7) The evidence of lysosomal involvement in sclerostin down-regulation is largely based on pharmacological compounds of limited selectivity. A degree of genetic evidence is indirectly provided by the Gaucher cell line, but this is based on a single patient line. Can the authors provide direct genetic evidence that lysosomal function is necessary for sclerostin down-regulation, and ideally for bone formation?

      8) References to previous studies which described mechanisms and relevance of Sost down-regulation are sparse. For example, see previous implications of NO signalling from the Vanderschueren lab (Callewaert et al, JBMR, 2010), protein-level down-regulation of sclerostin in the context of ageing from the Price lab (Meakin et al, JBMR, 2014) relevant to the discussion in the current manuscript, as well as work from the Ferrari lab on sclerostin regulation following both PTH and mechanical loading (e.g. Bonnet et al, JBC 2009; Bonnet et al, PNAS 2012).

    1. Reviewer #3:

      In this work, Feilong and colleagues use Human Connectome Project fMRI data to investigate the degree to which the strength of functional connectivity is predictive of general intelligence, and the degree to which that predictive power is improved using the hyperalignment procedures their lab has previously developed. I am broadly very supportive of the goals of improving prediction of individual behavioral differences via improved, functionally-based cross-subject registration, and I have always felt that the hyperalignment procedure is one of the most promising approaches for improving cross-subject functional registration. Overall I feel that this paper is an important next step in the development and maturation of the hyperalignment technique.

      However, I do have two significant concerns with the predictive modeling presented in this work. I note that I am not an expert in these techniques, so these concerns may be due to my own ignorance; however, I would like to see the authors at least better explain these issues to non-experts like myself.

      First, the authors employed a leave-one-family-out cross-validation scheme for their predictive modeling. My understanding is that the field has generally moved away from leave-one-out or leave-few-out cross-validation, as that approach consistently overestimates the predictive power of generated models. The HCP is a large dataset. Can the authors employ a more robust approach of using fully split halves?
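      A split-half scheme that still respects family structure could be set up along these lines. This is a minimal sketch under assumptions: the `family_of` mapping, the greedy balancing rule, and the function name are hypothetical and are not taken from the authors' code.

```python
import random
from collections import defaultdict


def split_half_by_family(family_of, seed=0):
    """Split subjects into two halves so that no family appears in both.

    family_of maps a subject ID to a family ID (hypothetical input format).
    """
    families = defaultdict(list)
    for subject, family in family_of.items():
        families[family].append(subject)
    family_ids = sorted(families)
    random.Random(seed).shuffle(family_ids)
    half_a, half_b = [], []
    # Greedy balancing: each whole family goes to the currently smaller half.
    for fam in family_ids:
        smaller = half_a if len(half_a) <= len(half_b) else half_b
        smaller.extend(families[fam])
    return half_a, half_b
```

      A model would then be trained on one half and evaluated on the other, with the split repeated over several seeds to gauge the variability of the estimate.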

      Second, the authors make the claim that fine-grained (vertex-wise) connectivity has substantially better predictive power than coarse-grained (parcel-wise) connectivity, based on the variance in intelligence explained by the predictive models. However, the models based on fine-grained connectivity also have many, many more variables being used to make the prediction. Is this not a confound?

    2. Reviewer #2:

      Summary:

      This paper predicts intelligence using either coarse-grained functional connectivity (based on 360 ROIs) or fine-grained functional connectivity (vertex-wise) after hyperalignment. The results show a two-fold increase of variance explained in general intelligence between coarse-grained and fine-grained connectivity.

      General:

      This is a very clearly-written paper that presents an important result, which has the potential of great impact on the field of behavioral prediction. My comments below are relatively minor and primarily aimed at clarifying a few details in the article. Please find my detailed comments below, approximately in order of importance.

      Major comments:

      1) The fine-grained functional connectivity has richer features than coarse-grained, leading to higher dimensionality in the PCA step (supplementary figure S5). I wonder if this might contribute to improved prediction accuracy. Related to this, it appears that there may also be a relationship between PCA dimensionality and regularization parameter, such that more regularization may be needed when more PCs are used in the model. It would be interesting to test the effect of fixing the PCA dimensionality (and perhaps also the regularization) across all models to control model complexity.

      2) The Glasser 360 parcellation was used throughout this work. There are subject-specific parcels and group-level parcels available for this parcellation. Please clarify which of these were used. If the group-level parcels were used, it might be interesting to see how the coarse-grained prediction accuracies might improve when using subject-specific parcels.

      3) The residuals of fine-grained connectivity profiles were obtained after subtracting coarse-grain connectivity. Why was subtraction used here, rather than regressing out (i.e., orthogonalizing with respect to) the coarse-grained connectivity?
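      The difference between the two options can be seen in a small sketch. It treats each connectivity profile as a plain list of numbers and regresses without an intercept; both simplifications, along with the function names, are assumptions for illustration only.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def subtract(fine, coarse):
    """Residual by plain subtraction (implicitly fixes the coefficient at 1)."""
    return [f - c for f, c in zip(fine, coarse)]


def regress_out(fine, coarse):
    """Residual after least-squares projection onto coarse (no intercept)."""
    beta = dot(fine, coarse) / dot(coarse, coarse)
    return [f - beta * c for f, c in zip(fine, coarse)]
```

      Only the regressed residual is guaranteed to be orthogonal to the coarse-grained profile; the subtracted residual can remain correlated with it whenever the best-fitting coefficient differs from 1.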

    3. Reviewer #1:

      In this study, Feilong and colleagues showed that hyper-aligned fine-grained cortical connectivity profiles can be used to strongly predict general intelligence in individual participants. This is an important study demonstrating the utility of previously developed connectivity hyperalignment and highlighting the behavioral importance of fine-grained connectivity which is typically ignored in more standard functional connectivity analysis.

      1) How does the bootstrapping handle the family structure in the data? More details are needed.

      2) The authors mentioned that "the code for performing hyperalignment and nuisance regression was adapted from PyMVPA". One of the most important contributions of this study is the impressive demonstration of prediction performance improvement using hyperalignment and fine-grained connectivity profiles. Therefore, it is important that the adapted code and code utilized for the current study be made publicly available. While connectivity hyperalignment code from the previous study is available in PyMVPA, my experience is that it is not easy to use. If no code from the current study is made available, I believe it will be very difficult to replicate this study.

    1. Reviewer #3:

      The manuscript "High-quality carnivore genomes from roadkill samples enable species delimitation in aardwolf and bat-eared fox" is mostly well written and demonstrates an interesting and useful method for sequencing genomes from low-quality samples. They also provide a comprehensive overview of the state of genomics across the Carnivora clade, with some improved species/subspecies designations. I think the work is of broad interest. The analyses are mostly clear and I think a few additional analyses and small improvements could be made prior to publication, but otherwise have no issues.

      The additional analyses/clarification I would recommend concern the genetic differentiation estimate. This is a really interesting statistic! It seems that for some of the species you have multiple individuals? Can you explain this a little more in the text? I am just not entirely convinced that the statistic is robust, but I think it would be with a few more analyses. My concern is primarily due to having only two individuals in some of your comparisons: because of population structure/relatedness, the random regions you sample could have correlated histories. I think this could be addressed by varying window sizes and replicates across comparisons where you have multiple individuals for both the intraspecific and interspecific calculations.

    2. Reviewer #2:

      This manuscript from Allio et al. is an interesting mix of approach demonstration (population genomic sampling via roadkill) and application (demographic analyses, questions about taxonomic status, and phylogenomics). There are some valuable results from the application component of the paper. In particular, I appreciate the comparative approach for studying patterns of intra- vs. inter-species genetic diversity. However, I feel that some rework is required to fully normalize those comparisons.

      I would suggest the authors be more immediately forthcoming about the sizes of their samples, and perhaps consider changes to the introductory text to avoid giving any mis-impressions to readers about what data are ultimately presented in the manuscript. I had envisioned more of a landscape genetics-level sample and analyses, rather than n=3 individuals per each of the two species. Furthermore, while I think the reporting of the genome assembly qualities is important from confirmatory and quality control perspectives, and while presenting the new assemblies, in my view this shouldn't be set up to be a surprising result. These are very high-quality DNA samples, so we expect to be able to achieve DNA assembly qualities to whatever the invested level using current best-practices data generation and analytical methods.

      On the general genetic diversity and taxonomic questions, from my own experience I know that genetic differentiation metrics are not necessarily precisely comparable between a new study based on genome sequence data and an existing published dataset. Sample size affects false positive and false negative SNP calling error rates and sequence coverage and the variation among samples can also make a difference. Thus, especially since this leads to a key result/conclusion (i.e. that the two subspecies of aardwolf may deserve species status), it isn't sufficient that "similar individual sampling was available" for the carnivoran comparative datasets. The datasets should be equalized with sample number and individual sequence coverage (using downsampling) and then SNPs re-called using the same approach, before making the comparison. From the methods it wasn't clear to me the extent to which this all was done. It does appear that the same number of samples were used, and that the SNP calling approach was likely re-done from the read data (although please be more explicit about this, in the description). However, it doesn't appear that the sequence read data were subsampled for equivalency across the samples, which should be incorporated. Hopefully the results are similar, but there can be big changes that affect interpretation, so a careful approach is required.
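      The coverage equalization requested above amounts to choosing, per sample, the fraction of reads to retain. A minimal sketch follows; the sample names, coverage values, and function name are hypothetical, and the choice of the lowest-coverage sample as the default target is an assumption.

```python
def downsampling_fractions(coverage_by_sample, target=None):
    """Fraction of reads to keep per sample so that all reach a common coverage.

    If no target is given, the lowest observed coverage is used (assumption).
    Samples already at or below the target are kept whole.
    """
    if target is None:
        target = min(coverage_by_sample.values())
    return {sample: min(1.0, target / cov)
            for sample, cov in coverage_by_sample.items()}
```

      The resulting fractions would be passed to whichever read-subsampling tool is in use, after which SNPs are re-called with one pipeline across all datasets.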

      The study design and proposed expanded use of roadkill samples in population genomics led me to think of this study, one of my all-time favorites: Brown & Bomberger Brown (2013), Where has all the road kill gone?, Current Biology. For the present study, the question of potential biases in the sample for similar or related reasons is beyond the scope of investigation; this is not relevant for the sample sizes collected and analyses conducted. However, the importance of keeping this possibility in mind should at least be noted given the more expansive promotion of the wider inclusion of roadkill samples in population genomic studies. E.g. could the sample be biased towards individuals with genetically-mediated and/or culturally learned behavioral tolerance of human-disturbed habitats, etc., rather than a truly random sample representative of the overall landscape.

      In the methods section, the collection process and permits for the four samples from South Africa are described in detail. (Could you preemptively explain that the IUCN status for these two species is Least Concern, and thus that CITES permits are not required for the international transport of the samples?). However, the same information is not provided for the two East African samples that were included in the study (also, I think that there should not be two separate sampling sections in the manuscript). Please provide these details or expanded explanations.

    3. Reviewer #1:

      The manuscript by Allio et al. tries to justify that roadkill can be a useful source for genomic sequencing and even genome assembly level data. The authors cover all aspects of using this resource, from a new protocol to extract DNA, through generating a hybrid short- and long-read genome assembly and to various applications, showing that this data can be used in phylogenomic and population genomic analyses. Although I think that the manuscript is useful in highlighting how this resource can be analysed, it covers a lot of different topics and covers them in varying depth, which makes it difficult to follow and understand the real importance of the different sections.

      Major comments:

      -Overall, this manuscript left me a bit confused about its main scope. It covers a lot of different topics, from the laboratory end of the spectrum, e.g. the protocol used to get good DNA out of roadkill and how to assemble these genomes with a hybrid genome assembly, through to a phylogenomic analysis making taxonomic suggestions, an analysis of the complete Carnivora group, and a demographic analysis showing the changes of population size over time.

      I was left with an impression that the authors tried to cover a lot of different topics but did not go deep enough in any of those. As a consequence, the results section ends up sounding somewhat shallow, while the discussion takes up a lot of space.

      What I would suggest, if this manuscript is indeed to serve as a roadmap to roadkill genomics, is to add a figure showing the pipeline and then adjust the structure of the manuscript accordingly. For example, one box in the figure would correspond to one heading in the results/methods, where the DNA analysis is explained: why a special protocol is needed, what the main difference from existing methods is, how the yield compares to other methods, etc.

      The different topics explored in this manuscript could then be shown as different examples of the application: taxonomic questions at the intra-/inter-species level, higher-level taxonomic analyses (of the whole Carnivora), population genomic analyses, etc. Highlighting these as examples of the potential use of roadkill genomes would make it understandable why this paper is trying to cover aardwolf and bat-eared fox genomics from so many angles.

      -Although the manuscript shows that roadkill samples can be useful for analysing particular species for which obtaining samples is difficult in other ways, I'm missing a discussion of how difficult it is to obtain roadkill samples and what the ramifications are. Can this approach be generally applied given legal restrictions, do you need permits, do you find enough roadkill to rely on this source, or do you only see it as an opportunistic sampling scheme?

      -Genome assembly is not exactly my field of expertise; therefore, I would like the authors to better explain how their hybrid, short- and long-read genome assembly approach is novel. My impression was that such hybrid assemblies are now a rather common and well-established practice. But a lot of space is dedicated to explaining this topic in the introduction and again in the discussion, which to me is something obvious and reads more like a review than a research article. But maybe I'm missing something obvious here, in which case I'd like the authors to make it clearer.

    1. Reviewer #3:

      In this manuscript, Kim et al. show that, beyond its role in preventing somatic differentiation in the germline of embryos, the Zn-finger protein PIE-1 also functions in the adult germline, where it is both SUMOylated and interacts with the SUMO-conjugating machinery, and promotes SUMOylation of protein targets. They identify HDA-1 as a target of PIE-1-induced SUMOylation. Here too, I find the claims interesting; however, data is sometimes missing or does not fully support the claims.

      Main concerns:

      1) A key claim of novelty over previously proposed "glue" functions of SUMO is based on the fact that they find that temporally regulated SUMOylation of a very specific residue in a specific protein is affecting protein activity: The observation that "SUMOylation of HDA-1 only appears to regulate its functions in the adult germline" and not in the embryo together with the finding that "other co-factors such as MEP-1 are SUMOylated more broadly, these findings imply that SUMOylation in the context of these chromatin remodeling complexes, does not merely function as a SUMO-glue (Matunis et al., 2006) but rather has specificity depending on which components of the complex are modified and/or when."

      I find this claim poorly supported by the data. In fact, I find that the data supports that multiple SUMOylations contribute to formation of larger complexes: The His-SUMO IP (Fig 2B) brings down far more un-SUMOylated HDA-1 than SUMOylated. This argues for the presence of large complexes with different factors being SUMOylated and many bringing down unmodified HDA-1. The chromatography experiments (Fig 3B-C) also provide hits that are in complex and not direct interactors. Finally, HDA-1 SUMOylation is indicated to regulate MEP-1 interaction with numerous factors (Fig 3D). If all these factors are in one complex, it is hard to imagine how a single SUMO residue would mediate all of these simultaneously. It is quite likely (and not tested) that loss of HDA-1 SUMOylation leads to (partial?) dissociation of a large complex, rather than loss of individual interactions with the SUMO residue of HDA-1. Contrary to the authors' claim, there is no evidence that the "activity" of HDA-1 is regulated by SUMO modification.

      2) Based on loss of MEP-1/HDA-1 interaction upon pie-1 RNAi and smo-1 RNAi (Fig 4B), the authors conclude that "SUMOylation of PIE-1 promotes the interaction of HDA-1 with MEP-1 in the adult germline".

      The evidence that it is PIE-1 SUMOylation that is affecting MEP-1/HDA-1 interaction is fairly weak. In fact, based on Fig 4A, MEP-1 and HDA-1 interact without expression of PIE-1, and in PIE-1 K68R (sumoylation-deficient), although due to poor labeling of the panel it is not clear whether lanes 1 and 4 refer to the WT pie-1 locus without tag or lack of pie-1.

      In 4B the HDA-1 band that is present in L4440 but not in pie-1 or smo-1 RNAi is very faint, and in our experience such weak signal is not linear i.e., bands can disappear or appear depending on the exposure. Importantly, according to the data, seemingly unmodified HDA-1 immunoprecipitated with MEP-1 (Fig 4B). This data contradicts the authors' claim that "These findings suggest that in the adult germline only a small fraction of the HDA-1 protein pool, likely only those molecules that are SUMOylated, can be recruited by MEP-1 for the assembly of a functional NURD complex".

      Furthermore, the fact that pie-1 and smo-1 depletion eliminate the interaction between HDA-1 and MEP-1 doesn't mean that the SUMOylation of PIE-1 specifically is required for the interaction: perhaps un-SUMOylated PIE-1, and SUMOylation of something else, are both necessary for the interaction. The authors show that MEP-1 is also SUMOylated (Fig 3C). When IP-ing GFP-MEP-1, they precipitate all its modified forms and associated factors. One alternative possibility for why smo-1 RNAi abolishes MEP-1/HDA-1 interaction is that MEP-1 SUMOylation is needed for interaction with HDA-1 (independently of PIE-1). (On a side note, why are the authors not including MEP-1 SUMOylation in the model?)

      3) On page 13 the authors write: "These findings suggest that SUMOylation of PIE-1 on K68 enhances its ability to activate HDA-1 in the adult germline" and "We have shown that PIE-1 is also expressed in the adult germline where it engages the Krüppel-type zinc finger protein MEP-1 and the SUMO-conjugating machinery and functions to promote the SUMOylation and activation of the type 1 HDAC, HDA-1 (Figure 6)". The term "activation" of HDA-1 is misleading, since activation was never tested. If they do not perform in vitro assays for HDAC activity, the authors at least need to look at whether pie-1 loss (degron) leads to acetylation of genomic HDA-1 targets and whether it affects HDA-1 (and/or MEP-1) recruitment to these sites. This could be done by ChIP-seq of HDA-1 and H3K9ac in WT and pie-1 degron animals.

    2. Reviewer #2:

      In their manuscript, Kim et al. address the role of PIE-1 sumoylation during C. elegans oogenesis. The authors favour a model in which sumoylated PIE-1 acts as a sort of E3-like factor 'enhancing' HDA-1 sumoylation. While the results are indeed very interesting, it is unclear to me whether there is enough data to support the authors' model. I have a list of comments, suggestions, questions, and concerns, given below, which I hope will help the authors strengthen the manuscript:

      Figure 1)

      I) As with the accompanying manuscript, the extremely low level of SUMO modification should be factored in the model.

      II) Is sumoylation also observed in untagged pie-1? As judged by figure 3A, the authors have a very good antibody to test this.

      III) While the authors claim that PIE-1 sumoylation is not observed in embryos, that panel shows a lower exposure than the corresponding one in Adult (as judged by the co-purified unmodified PIE-1::FLAG). A longer exposure and/or more loading would be helpful.

      IV) Their strategy and optimisation for purification of sumoylated proteins is excellent and will be useful for future research (along with other reagents the authors developed here). Is the 10xHis::smo-1 functional? Could this be tested in vitro and/or in vivo?

      V) In vitro PIE-1 sumoylation would be a desirable addition to this figure.

      VI) In addition to germline PIE-1 localisation, it would be interesting to see embryos and PIE-1(K68R).

      VII) MW markers are missing in the blots.

      Figure 2)

      I) The generation of the ubc-9 ts allele is an exceptional tool. Could the authors show SUMO conjugation levels at permissive vs restrictive temperature? Just out of curiosity, is this a fast-acting allele?

      II) The authors mention that gei-17 alleles are viable, could the authors mention any thoughts on why the tm2723 allele is lethal/sterile?

      Figure 3)

      I) Panel C is mentioned in the text in the wrong place. Also in C, what do the authors think about the big increase in MEP-1 sumoylation in the PIE-1(K68R) background?

      II) I have the same comment for panel D as I had for figure 1 comment III: the exposure/loading for the embryo WB seems lower, as judged by the co-purifying, unmodified HDA-1. A positive control for sumoylated protein coming from embryos would be nice.

      III) In general, the model of PIE-1 acting as a SUMO machinery recruiter should be tested with recombinant proteins. Even if compatible with some results in vivo, showing that this is a plausible mechanism in vitro would be extremely helpful and greatly support the authors' claim.

      Figure 4)

      I) The authors make a quantitative comparison of the HDA-1/MEP-1 interaction in the text. I think this is not correct. Even if these have been run in the same gel, this could just be a lower exposure. In this line, the HDA-1 blot in the 'Adult' IP would benefit from a longer exposure to better appreciate what seems a rather small difference between PIE-1 and PIE-1(K68R).

      II) Since there still seems to be interaction between MEP-1 and HDA-1 in the PIE-1(K68R) background, does smo-1(RNAi) or ubc-9(G56R) reduce this further?

      III) In panel B, the LET-418 blot on the right is massively overexposed.

      IV) Once again, in vitro binding experiments to get some indication that the authors' model is plausible would be a great addition.

      Figure 5)

      I) Could the authors make some quantitation of the immunofluorescence data?

      Overall, I think this manuscript proposes a very interesting model and the results support this model, although I am not convinced these are sufficient to strongly back the authors' claims. I would very much like to see a revised version with some in vitro data backing the authors' model.

    3. Reviewer #1:

      The evidence that sumoylation of K68 in the PIE-1 zinc finger protein is important for HDA-1 type 1 histone deacetylase association and sumoylation seems reasonable. It is also important because, as shown in the co-submitted paper, HDA-1 sumoylation leads to its association with MEP-1 and the LET-418/NuRD complex, thus accelerating H3K9ac deacetylation and silencing gene expression.

      The evidence that PIE-1 is needed for sumoylation of HDA-1, presumably through association of PIE-1 with the UBC-9 SUMO E2, is reasonable. However, several aspects of the authors' model remain unclear, and there is an absence of biochemical assays to establish the role of sumoylated PIE-1 in HDA-1 sumoylation, and the effects of sumoylation on HDA-1 HDAC activity.

      1) How sumoylation of K68 in PIE-1 affects its function was not worked out. Can the deleterious effect of the K68R mutation on PIE-1 function be reversed by generating a SUMO-PIE-1 fusion, as was done for HDA-1 in the co-submitted paper? K68 maps to the N-terminal side of ZF1 in the PIE-1 protein in what appears to be an unstructured region. Does the SUMO residue play a role in the interaction of PIE-1 with HDA-1? Are the zinc fingers required for PIE-1 interaction with HDA-1 or UBC-9? No zinc finger mutations were tested. Does HDA-1 have a SIM that would allow it to interact selectively with sumoylated PIE-1? Another possibility is that the PIE-1 SUMO moiety is important because it interacts with the non-covalent SUMO-binding site on the backside of UBC-9 (Capili and Lima, JMB 369:606, 2007), which might stabilize the interaction. The backside interaction of SUMO with UBC-9 is proposed to promote UBC-9-mediated sumoylation of target proteins with SUMO consensus sites that are directly recognized by UBC-9. In this scenario, SUMO-PIE-1 would in effect be acting as an E3 SUMO ligase for HDA-1 by serving as a recruitment "factor". In this regard, the authors could test biochemically whether recombinant PIE-1 or K68SUMO-PIE-1 stimulates sumoylation of HDA-1 by UBC-9, using recombinant WT and KKRR mutant HDA-1 as substrates. These issues deserve discussion.

      2) What is the SUMO E3 ligase that sumoylates PIE-1? Is it possible that through association with UBC-9, perhaps through its zinc fingers, PIE-1 is sumoylated in cis within a PIE-1/UBC-9 complex?

      3) In many places, including the title, the authors make the claim that PIE-1 promotes sumoylation and activation of HDA-1. While it is clear that PIE-1 does increase sumoylation of HDA-1, in a manner requiring K68, and that H3K9ac levels are decreased as a result, the authors do not provide any direct evidence that this process increases HDA-1 catalytic activity, as is implied in the title and elsewhere. As indicated in the review of the co-submitted paper, this would need to be established by carrying out an HDAC assay on control and sumoylated HDA-1 in vitro. Instead of enzymatic activation, it is possible that the PIE-1 interaction and HDA-1 sumoylation results in relocalization of HDA-1 within the nucleus to facilitate more efficient H3K9ac deacetylation.

    1. Reviewer #3:

      This manuscript by Kim et al. describes a role of SUMOylation in Argonaute-directed transcriptional silencing in C. elegans. The authors found that SUMOylation of the histone deacetylase HDA-1 promotes its interaction with both the Argonaute target recognition complex as well as the chromatin remodeling NuRD complex. This enables initiation of target silencing. Impaired SUMOylation of HDA-1 leads to loss of interactions with several protein complexes, reduced silencing of piRNA targets, and reduced brood size. While the findings and claims are interesting, some of the novelty is overemphasized and some of the claims are not fully supported by the data.

      Main concerns:

      1) The importance of HDA-1 SUMOylation for transcriptional repression. The title "HDAC1 SUMOylation promotes Argonaute directed transcriptional silencing in C. elegans" implies a central role of SUMOylation in piRNA-mediated transcriptional silencing. The Argonaute HRDE-1/WAGO-9 targets countless transposons as shown previously and also in this manuscript (Fig S3), and so do the HDA-1 degron and Ubc9 mutant, indicating that histone deacetylation and protein SUMOylation are essential processes in TE silencing. However, the HDA-1 SUMOylation mutant (KKRR) only slightly affects 6 TE families (Fig S3), indicating that SUMOylation of HDA-1 might not be a key mediator of this process. Furthermore, the authors write that "Our findings suggest how SUMOylation of HDAC1 promotes the recruitment and assembly of an Argonaute-guided chromatin remodeling complex to orchestrate de novo gene silencing in the C. elegans germline.", but then they also state that "Comparison with mRNA sequencing data from auxin-treated degron::hda-1 animals revealed an even more extensive overlap with Piwi pathway mutants (Figure S2B), indicating that HDA-1 also promotes target silencing independently of HDA-1 SUMOylation." Based on their results and their own interpretations, I find that the importance of HDA-1 SUMOylation in piRNA-dependent transcriptional silencing is overemphasized.

      Additionally, the model (Fig 7) implies that for initiation of silencing WAGO recruits HDA-1 to targets. This should be tested by analyzing HDA-1 distribution over WAGO targets in WT and upon loss of WAGO.

      2) The mechanistic role of HDA-1 SUMOylation. On page 17 (amongst other places) the authors claim that "The SUMOylation of HDA-1 promotes its activity, while also promoting physical interactions with other components of a germline nucleosome-remodeling histone deacetylase (NuRD) complex, as well as the nuclear Argonaute HRDE-1/WAGO-9 and the heterochromatin protein HPL-2 (HP1)".

      -Regarding activity: Loss of deacetylation/silencing in the SUMO mutant might be due to loss of enzymatic activity, but it might also be due to defects in recruitment/complex formation. There is no data that proves altered enzymatic activity. In fact, Fig 6 indicates SUMO-dependent interaction of WAGO-9 with HDA-1, implying that recruitment is affected. To distinguish between activity and recruitment, at the very least, the authors would need to show that HDA-1 localization to its genomic targets is unaltered upon mutating its SUMOylation site (ChIP-seq of wt and KKRR mutant), while H3K9ac is increased (K9ac ChIP-seq in wt and KKRR mutant) in the mutant. This, in combination with HDA-1 localization in wt and WAGO-9 loss would imply whether complex formation to recruit HDA-1 or HDA-1 enzymatic activity is mostly affected by SUMOylation.

      -Regarding physical interactions: Fig 3D shows that if we fuse a SUMO residue to HDA-1, it will interact with MEP-1, while SUMOylation deficient HDA-1 mutant doesn't interact. However, for the WT HDA-1 control, we only see unSUMOylated protein interacting with MEP-1. Furthermore, in the MEP-1 IPs of samples that should contain SUMO-fused HDA-1, the authors detect a lot of "cleaved", unSUMOylated HDA-1. Unless cleavage happened after IP, during elution (unlikely, and there is "cleaved" HDA-1 in the inputs), these findings argue that the interaction with MEP-1 is not mediated by HDA-1 SUMOylation. An interaction between MEP-1 and unmodified HDA-1 is also shown in the accompanying manuscript, which appears to be dependent on Pie-1 SUMOylation. Thus, SUMOylation of HDA-1 alone seems unlikely to be the major factor necessary for silencing complex assembly. (as a side question: Does the protease inhibitor cocktail used inhibit de-SUMOylation enzymes? I am concerned that deSUMOylating enzymes might compromise some result interpretations.)

      -Regarding functional relevance of HDA-1 acetylation: On pages 12/13 authors claim that because "HDA-1(KKRR) animals and mep-1-depleted worms revealed dramatically higher levels of H3K9Ac compared to wild-type" and "HDA-1, LET-418/Mi-2, and MEP-1 bind heterochromatic", "SUMOylation of HDA-1 appears to drive formation or maintenance of germline heterochromatin regions of the genome." These correlations do not prove function. The authors have performed H3K9me2 (although not H3K9-ac) ChIP-seq in WT, KKRR mutant and HDA-1 degron worms, yet do not analyze globally whether acetylation is lost on genes that are affected (change in RNA-seq vs. change in K9me2 or acetyl). To support the claim that SUMOylation of HDA-1 drives deacetylation and heterochromatin formation, it would be important to show changes in H3K9Ac levels (or other acetyl marks) and potentially NuRD component occupancy between control and HDA-1 SUMOylation-deficient animals at specific targets (i.e. genes derepressed upon loss of SUMOylation identified in RNA-seq, and the reporter locus).

      3) The authors claim (p17) that "initiation of transcriptional silencing requires SUMOylation of conserved C-terminal lysine residues in the type-1 histone deacetylase HDA-1". I do not see any supporting data that has separately looked at formation/initiation and maintenance of silencing (a technically challenging experiment).

      4) The authors repeatedly claim that gei-17 does not play a role in piRNA target silencing, based on loss of gei-17 not affecting the piRNA reporter (Fig 1B). At the same time, they claim that pie-1 plays a role, even though it likewise does not affect the piRNA reporter (it affects the reporter only in F3; data on gei-17 effect in F3 is not present). In the accompanying paper, the authors show that while gei-17 loss by itself causes only moderate effect on extra intestine cells, combined with Pie-1 loss the effect is more severe than when Pie-1 loss is combined with Ubc9 or smo loss. This to me indicates an important role of gei-17 in inhibiting differentiation of germline stem cells to somatic tissues, but these effects are likely synergistic and thus masked by Pie-1. Individually neither Gei-17 nor Pie-1 show an effect on piRNA reporter in P0, but to confirm lack of synergy, their effects should be tested together. Although possible, the present data is insufficient to rule out gei-17 involvement.

    2. Reviewer #2:

      In their manuscript, Kim et al. describe a role for HDAC1 (HDA-1) sumoylation in Argonaute-directed transcriptional silencing. The authors suggest that sumoylation of HDA-1 is important for proper assembly of the NuRD deacetylase complex. The role of SUMO modification in heterochromatin has been extensively documented and it is a very interesting topic. The current manuscript provides a very interesting set of results on this topic. I have a list of comments, suggestions, questions, and concerns, given below, especially related to the first half of the results:

      1) A general question would be how can HDA-1 sumoylation, which is barely detectable, account for such a big 'positive' effect on complex assembly? HDA-1 SUMO modification seems around 10% after enriching for SUMO-modified proteins, which means that stoichiometry will be way lower than this. While this is common for SUMO-modified proteins, it does make it difficult to associate with a 'simple' model.

      2) In Figure 1, a schematic of the sensor used throughout the study would benefit the reader.

      3) In Figure 1, have the authors checked if the 10xHis::tagged smo-1 has the same effect as the 3xflag::smo-1 (i.e. is it also a partial loss of function allele)?

      4) In Figure 1 it would be nice to see the global SUMO conjugation levels in the different conditions, particularly in the smo-1(RNAi), 3xflag::smo-1, and ubc-9(G56R).

      5) Also in Figure 1, was gei-17 depletion/deletion checked in any way (i.e. WB)? Did the authors consider other SUMO E3 ligases, such as the mms-21 orthologue?

      6) While I am not a big fan of fusing SUMO to proteins, in this case it seems like a very reasonable thing to do, considering the modification sites are located very close to the C-terminal end of the protein. Did the authors check an N-terminal fusion?

      7) In Figure 2B, it becomes very clear that the level of SUMO modification of HDA-1 is extremely small, barely detectable after an enrichment method. I also wonder why the gels were cropped so tightly, especially considering that in Figure 3 there is an additional band corresponding to ubiquitylated, sumoylated HDA-1. In vitro modification assays would be helpful. HDA-1 alongside a known and characterised SUMO substrate would indicate how good a substrate HDA-1 is.

      8) In Figure 2D, is the difference between HDA-1(KKRR)::SUMO and HDA-1::SUMO significant?

      9) In Figure 3A-C, it would be useful to control whether the GFP::HDA-1 fusion behaves as the untagged one in the sensor assay (wt vs. KKRR).

      10) I have a few questions regarding Figure 3D:

      I. Considering the extremely low level of HDA-1 sumoylation, did the authors detect SUMO- and ub-conjugated HDA-1 (not the SUMO fusion)?

      II. Is ub conjugated to SUMO or to HDA-1?

      III. Does MEP-1 contain any obvious SIMs and or UIMs?

      IV. To make a stronger case for the SUMO-dependent interaction model, in vitro interaction assays with recombinant proteins would be extremely useful.

      11) In the discussion, the authors compare the lack of requirement for GEI-17 in their manuscript with the requirement for Su(var)2-10 in flies. To back this claim, it is very important that the authors control for GEI-17 depletion (as pointed out in point 5).

      Overall, I think this manuscript provides a very interesting set of results and I believe that, with the addition of some simple biochemical experiments, the quality and impact of the overall work would be much greater.

    3. Reviewer #1:

      The evidence that sumoylation of HDA-1, a type 1 HDAC, plays a key role in establishing transcriptional silencing of piRNA-regulated genes in C. elegans is quite convincing. The genetic analysis demonstrating that the SUMO pathway is involved in piRNA silencing is strong, and the mutational evidence that this involves sumoylation of two Lys in the tail of HDA-1 is reasonable. Likewise, the finding that HDA-1 sumoylation promotes association with NuRD complex components and association of MEP-1, an HDA-1 interactor, with chromatin regulators is convincing. In addition, the evidence that HDA-1 sumoylation increases H3K9ac deacetylation in vivo, leading to negative regulation of hundreds of target genes, and plays a role in the inherited RNAi pathway is solid.

      While the overall conclusion provides an interesting advance in understanding mechanisms of piRNA-mediated gene silencing in C. elegans, the paper is lacking any biochemical analysis of the effects of sumoylation on HDA-1 activity and its association with other transcriptional regulators.

      1) The authors mapped two sumoylation sites close to the C terminus of HDA-1, K444 and K459, based on extremely weak homology with two established sumoylation sites in human HDAC1 that are reported to be important for transcriptional repression (N.B. the authors should indicate here that David et al. reported that K444/476R HDAC1 had reduced transcriptional repression activity in reporter assays.). While the two human sites conform to the sumoylation site consensus, ψKXE, neither K444 nor K459 in HDA-1 fits this consensus (possibly one could argue that K444 is in an inverted motif). The fact that the KKRR mutant HDA-1 is no longer sumoylated is consistent with these two Lys being sumoylated, but it would be reassuring to have direct MS evidence that K444 and K459 are indeed sumoylated, which could be achieved using a SUMO Thr91Arg mutant that generates a GlyGly stub upon trypsin digestion, among other methods.

      2) It remains unclear how sumoylated HDA-1 is recognized by MEP-1 for assembly into the NuRD complex. Does MEP-1, or another NuRD subunit, have a SIM that could facilitate direct interaction of MEP-1 and sumoylated HDA-1?

      3) As the authors discuss, it is surprising that the HDA-1(KKRR)::SUMO protein, which in effect is a constitutively sumoylated form of HDA-1 that will interact constitutively with MEP-1/NuRD, does not have more deleterious effects on the organism, since according to the data in Figure 2B, the stoichiometry of endogenous HDA-1 sumoylation was extremely low. Of course, low sumoylation stoichiometry, which is a general issue with sumoylation studies, means that only a very small fraction of the HDA-1 endogenous population will be able to engage with the silencing complexes at any one time. This point is also worth discussion.

      4) Page 5: Here, and elsewhere, the authors claim that sumoylation of the two C-terminal Lys activates HDA-1 histone deacetylase activity, but provide no direct evidence for this statement. There are no HDAC assays, and it is unclear how C-terminal SUMO residues distant from the catalytic domain would alter its enzymatic activity, unless there is a SIM motif in HDA-1 that might allow for intramolecular interaction with SUMO residues at the tail leading to a conformation change. Did the authors check for a SIM motif in HDA-1? The fact that SUMO was added to the C-terminus rather than to one or both of the two Lys would also have to be taken into account in determining how sumoylation might "activate" HDA-1. To demonstrate that sumoylation activates HDA-1, in vitro deacetylation assays would need to be carried out comparing the activities of unmodified and sumoylated HDA-1. Instead of enzymatic activation, it is possible that the PIE-1 interaction and HDA-1 sumoylation result in relocalization of HDA-1 within the nucleus to facilitate more efficient H3K9ac deacetylation.

    1. Reviewer #3:

      In this manuscript, Soucy et al. describe a new technique that involves a 3D co-culture system that allows the analysis of the regulation of the sympathetic adrenomedullary system. The data demonstrate the advantage of such compartmentalized 3D systems relative to the 2D system for long-term studies. The findings also show the usefulness of this system for understanding the control by preganglionic sympathetic neurons of catecholamines released by the adrenal gland cells.

      The main concern with the work relates to the uncertain physiological relevance of the co-culture system developed by the authors. Although I appreciate the utility of such reductionist techniques to understand how preganglionic sympathetic neurons regulate catecholamines released by the adrenal gland cells, this is too removed from a physiological setting.

      1) It is difficult to judge the level of novelty of the MPS technique reported in this manuscript relative to what is in the previous paper (Ref 36) which is not available.

      2) The innervation of tissues including heart and adrenal gland is highly specific. In addition to the circulating catecholamines secreted by the adrenal glands, cardiomyocytes are tightly controlled by direct innervation. Thus, whether co-culturing PNS with other cells mimics what happens in vivo is not clear.

      3) The number of AMMCs displayed in figure 2B seems minimal as only very few cells were stained with cardiomyocyte markers. It would be interesting to know how many of these AMMCs receive innervation (Fig. 3E).

      4) It is not clear how primary cardiomyocytes were exposed to the catecholamines emanating from the AMMCs. Were these co-cultured or were the cardiomyocytes exposed to the media of AMMCs?

      5) Does the "n" in each figure represent cells or experiments (repeats)?

      6) There is no description of the method used to quantify the immunofluorescent signal.

      7) The Introduction is too long. It can easily be shortened to focus on the literature related to the topic.

    2. Reviewer #2:

      Soucy and colleagues developed a thermoplastic microphysiological system (MPS) to investigate the mechanisms regulating adrenomedullary innervation. This system consists of 3D cultures of adrenal chromaffin cells and preganglionic sympathetic neurons within a contiguous bioengineered microtissue. Using this model, they report that adrenal chromaffin innervation is critical for hypoxia-induced catecholamine release. They also show that opioids and nicotine affect adrenal chromaffin cell response to hypoxia without impairing neurogenic control mechanisms. In addition to providing mechanistic insights on adrenomedullary catecholamine release, this study represents an elegant proof-of-concept that the MPS have the potential to become useful tools to study organ innervation.

      Disclosure: I do not have the expertise to review the engineering aspects of this manuscript and will therefore share some concerns I have regarding the accuracy of this technology to mimic native tissues.

      I understand that one advantage of the MPS over microfluidic devices using micro-posts or micro-tunnels is the presence of an unobstructed interface between the compartments that is similar to tissue interfaces. However, how much better is it than other organ-on-chip constructs at reaching the biological complexity of an intact organ?

      The system consists of adrenal chromaffin cells and preganglionic sympathetic neurons. I wonder if in this format it could lack the normal cellular heterogeneity of the adrenals. Can the absence of adrenal cortex cells producing aldosterone, androgens and glucocorticoids with important autocrine functions on chromaffin cells interfere with the ability of chromaffin cells to respond normally to a stimulus? The authors discuss that future efforts will incorporate additional adrenal cortical cell populations to better mimic the native physiology. Could they extend this discussion by highlighting the potential weaknesses of the model in its current format? Were any observations made that would suggest caveats?

      In vivo, do all fibers innervating the medulla target the chromaffin cells or do some/most innervate the blood vessels or pericytes? If a majority of the innervation is to blood vessels, how does this system take into account potential changes in blood flow and perfusion of the adrenals that could occur and affect the oxygenation?

      Early work suggests that adrenergic terminals innervate chromaffin cells and that the adrenal medulla receives a sympathetic and parasympathetic efferent and an afferent innervation (J Anat. 1993 Oct; 183: 265-276). How would this system allow such complex innervation to be studied? Is it possible to add additional neuronal types to this MPS?

      In addition to the nicotinic cholinergic receptors, chromaffin cells express muscarinic receptors that may also be involved in catecholamine release. A quick profiling and comparison of the expression of the different receptors could reinforce the representative nature of the technology to model a biological system.

      One important caveat of MPS is the challenge of delivering a drug in a physiologically realistic manner. Could the authors comment on the doses of the different drugs used and how they are representative of what a chromaffin cell would normally "see" in vivo?

      Could the authors comment on the culture media/conditions and how they are representative of a biological system? Would the use of blood or blood components be a better alternative to the system?

    3. Reviewer #1:

      In this manuscript, Soucy and colleagues present a novel innervated system which they use to model the effects of prenatal nicotine and opioid exposure. Using the system they provide potentially interesting insights on how prenatal nicotine and opioid exposure could impact release of catecholamines. However, following careful review of the manuscript, I recommend that the authors provide substantial additional data and evidence to support the biological relevance of their findings.

      Major points:

      1) A main pillar of this manuscript is the assumption that the adrenal medulla is innervated. To substantiate their claims the authors cite books/book chapters, rather than citing convincing primary evidence. In fact, other than old EM images showing vesicular densities akin to synapses, I have not found published images of convincing axonal arborization in the adrenal medulla. If such images exist, the authors should at least try to reproduce them for internal consistency of their study. This is particularly relevant if they wish to draw parallels between in vitro and in vivo systems. As this is a major pillar upon which this research stands, the lack of supporting histological evidence, which could easily be obtained, undermines the validity of this manuscript. Presenting primary evidence (i.e. not a textbook diagram) is essential.

      2) Multiple experiments lack appropriate controls. See comments on Figure 2B, 2D, Supplementary Figure 2.

    1. Reviewer #3:

      Lee and Daunizeau formulate a model of the effects of mental effort on the precision and mode of value representations during value-based decision-making. The model describes how optimal levels of effort can be determined from initial estimates of precision and relative value difference between competing alternatives, accounting for the subjective cost of incremental effort investment, as well as its impact on precision and value differences. This relatively simple model is impressive in its apparent ability to reproduce qualitative patterns across diverse data including choices, RTs, choice confidence ratings, subjective effort, and choice-induced changes in relative preferences successfully. The model also appears well-motivated, well-reasoned, and well-formulated.

      I have two sets of concerns. The first set relates to model fitting and validation. The model appears to do fairly well in predicting aggregate, group-level data, but does it predict subject-level data? Or does it sometimes make unrealistic predictions when fitted to individual subjects? The Authors should provide evidence of whether it can or cannot describe subject-level choices, confidence ratings, subjective effort, etc.

      Also, I think the Authors should do more to demonstrate that their model is an advance on simpler variants. The closest thing to model comparison is an exercise where the authors show that, relative to when their model is fit to random data, their model explains more variance in dependent variables when fit to real data. This exercise uses a straw man as a baseline because almost any model which systematically relates independent variables to dependent variables would explain more variance when fit to real data than to data for which, by definition, independent and dependent variables do not share variance. It would be more useful to know whether (and if so, by how much) their model explains the data better than, e.g., a model where effort only affects precision (beta efficacy), or a model in which effort only impacts value mode (gamma efficacy). Since the Authors pit their model against evidence accumulation models, it would be yet more useful to ask whether their model predicts these diverse data better than standard evidence accumulation model variants.
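
      To illustrate the kind of comparison meant here, the following is a minimal sketch on a toy regression problem, not on the Authors' model. It fits a "full" variant with two effects and two restricted variants with one effect each, then ranks them by BIC. The data, effect sizes and variant names are all hypothetical, and Gaussian noise is assumed.

```python
import math
import random

def bic(rss, n, k):
    """BIC for a Gaussian-noise model (lower is better).
    The noise variance is a shared extra parameter and is left out of k."""
    return n * math.log(rss / n) + k * math.log(n)

def rss_one(x, y):
    """Residual sum of squares for the least-squares fit of y = b * x."""
    b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))

def rss_two(x1, x2, y):
    """Residual sum of squares for y = b1 * x1 + b2 * x2 (2x2 normal equations)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(x1, x2, y))

# Hypothetical data in which both effects are truly present.
rng = random.Random(0)
n = 200
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 * a + 0.8 * b + rng.gauss(0, 0.5) for a, b in zip(x1, x2)]

scores = {
    "both effects": bic(rss_two(x1, x2, y), n, 2),
    "effect 1 only": bic(rss_one(x1, y), n, 1),
    "effect 2 only": bic(rss_one(x2, y), n, 1),
}
best = min(scores, key=scores.get)
print(best, scores)
```

      The same logic would apply to restricted variants of the Authors' model, with whatever model evidence or information criterion their fitting procedure provides.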

      My second set of concerns relates to the assumed effect of mental effort on the mode of subjective values. First, is it reasonable to assume that variance would increase as a linear function of resource allocation? It seems to me that variance might increase initially, but then each increment of resources would add diminishing variance to the mode since, e.g., new mnesic evidence should tend to follow old mnesic evidence. How sensitive are model predictions to this assumption? What if each increment of resources added to variance in an exponentially decreasing fashion? Also, what about anchoring biases? Because anchoring biases suggest that we estimate things with reference to other value cues, should we always expect that additional resources increase the expected value difference, or might additional effort actually yield smaller value differences over time? If we relax this assumption, how does this impact model predictions?

    2. Reviewer #2:

      The manuscript introduces a computational account of meta-control in value-based decision making. According to this account, meta-control can be described as a cost-benefit analysis that weighs the benefits of allocating mental effort against associated costs. The benefits of mental effort pertain to the integration of value-relevant information to form posterior beliefs about option values. Given a small set of parameters, as well as pre-choice value ratings and pre-choice uncertainty ratings as inputs to the model, it can predict relevant decision variables as outputs, such as choice accuracy, choice confidence, choice induced preference changes, response time and subjective effort ratings. The study fits the model to data from a behavioral experiment involving value-based decisions between food items. The resulting behavioral fits reproduce a number of predictions derived from the model. Finally, the article describes how the model relates to well-established accumulator models, such as the drift diffusion model or the race model.

      Before I get into more detailed comments, I would like to highlight that this work addresses a timely and heavily debated subject, namely the role of cognitive control (or mental effort) in value-based decision making (see Shenhav et al., 2020). While there are plenty of models explaining value-based choice, and there is a growing number of computational accounts concerning effort-allocation, little theoretical work has been done to relate the two literatures (but see Major Comment 1). This work contributes a novel and interesting step in this direction. Moreover, I had the impression that the presented model can account for a broad range of behavioral phenomena and that the authors did a commendable amount of work to validate the model (but see Major Comments 2 and 3). The manuscript is also well written in that it seems accessible to a broad audience, including non-technical readers. However, while I remain curious about what the other reviewers have to say, the manuscript fails to address a few issues that I elaborate on below.

      Major Comments:

      1) Model Comparison(s): While the manuscript compares the presented computational approach to existing accumulator models, it could situate itself better in the existing literature, ideally in the form of formal model comparisons. For instance, as someone less familiar with choice-induced preference changes in value-based decision making, I wonder how the model compares to existing computational work on this matter, e.g. the models described in Izuma & Murayama (2013) or the efficient coding account of Polanía, Woodford, & Ruff (2019). I do understand that the presented model can account for some phenomena that the other models cannot account for, at least without auxiliary assumptions (e.g. subjective effort ratings), but the interested reader might want to know how well the presented model can explain established decision-related variables, such as decision confidence, choice accuracy or choice-induced preference changes compared to existing models, by having them contrasted in a formal manner. Finally, it would seem fair to compare the presented account to emerging, more mechanistically explicit accounts of meta-control in value-based decision making (e.g. Callaway, Rangel & Griffiths, 2020; Jang, Sharma, & Drugowitsch, 2020). As these approaches are still in preprint, it may not be necessary to relate them in a formal model comparison. However, the manuscript might benefit from discussing how these approaches differ from the presented model in the text.

      2) Fitting Procedure: This comment concerns the validation of the described model based on its fits to behavioral data. If I understand correctly, the authors first fit the model to each participant while "[a]ll five MCD dependent variables were [...] fitted concurrently with a single set of subject-specific parameters" and then evaluate whether model fits match the predicted qualitative relationship between experimental variables (e.g. pre-choice value ratings and pre-choice confidence ratings) and dependent variables (e.g. choice accuracy). I'm happy to be convinced otherwise, but it appears that the model's predictions could be tested in a more stringent manner. That is, it doesn't appear compelling to me that the model, once fitted, matches the behavior of participants -- please note that this is not to diminish the value of the results; I still think that these results are valuable to include in the manuscript. Instead, rather than fitting the model to all dependent variables at once, it would be more compelling to fit the model to a subset of established decision-related variables (e.g. accuracy, choice confidence, choice induced preference changes) and then evaluate how the fitted model can predict out-of-sample variables related to effort allocation (e.g. response time and subjective effort ratings). Again, I am happy to be convinced otherwise but the latter would seem like a much more stringent test of the model, and may serve to highlight its value for linking variables related to value-based decision making to variables related to meta-control.

      3) Parameter Recoverability: Given that many of the results rely on model fits to human participants, it would seem appropriate to include an analysis of parameter recoverability. That is, how well can the fitting procedure recover model parameters from data generated by the model? I apologize if I missed this, but the manuscript doesn't appear to report this kind of analysis.
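
      As an illustration of the recoverability analysis requested in comment 3, here is a minimal sketch using a deliberately simple one-parameter toy model (y = theta * x + noise) in place of the MCD model. It simulates synthetic subjects with known parameters, refits each one, and correlates true with recovered values. The model, sample sizes, noise level and function names are assumptions made purely for illustration.

```python
import math
import random

def simulate(theta, x, noise_sd, rng):
    """Generate data from the toy model y = theta * x + Gaussian noise."""
    return [theta * xi + rng.gauss(0, noise_sd) for xi in x]

def fit(x, y):
    """Least-squares estimate of theta for the toy model."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

rng = random.Random(0)
n_subjects, n_trials = 40, 50

true_thetas, recovered_thetas = [], []
for _ in range(n_subjects):
    theta = rng.uniform(0.5, 2.0)                    # known "ground truth" parameter
    x = [rng.gauss(0, 1) for _ in range(n_trials)]   # hypothetical trial-wise inputs
    y = simulate(theta, x, noise_sd=0.5, rng=rng)
    true_thetas.append(theta)
    recovered_thetas.append(fit(x, y))

recovery_r = pearson(true_thetas, recovered_thetas)
print(f"true vs recovered correlation: {recovery_r:.3f}")
```

      For the actual model, the same loop would draw parameter sets from a plausible range, simulate all five dependent variables, and rerun the fitting procedure used in the manuscript. A low correlation for any parameter would indicate that its estimates should be interpreted with caution.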

      References:

      Callaway, F., Rangel, A., & Griffiths, T. L. (2020). Fixation patterns in simple choice are consistent with optimal use of cognitive resources. PsyArXiv: https://doi.org/10.31234/osf.io/57v6k

      Izuma, K., & Murayama, K. (2013). Choice-induced preference change in the free-choice paradigm: a critical methodological review. Frontiers in psychology, 4, 41.

      Jang, A. I., Sharma, R., & Drugowitsch, J. (2020). Optimal policy for attention-modulated decisions explains human fixation behavior. bioRxiv: 2020.2008.2004.237057.

      Polania, R., Woodford, M., & Ruff, C. C. (2019). Efficient coding of subjective value. Nature neuroscience, 22(1), 134-142.

      Shenhav, A., Musslick, S., Botvinick, M. M., & Cohen, J. D. (2020, June 16). Misdirected vigor: Differentiating the control of value from the value of control. PsyArXiv: https://doi.org/10.31234/osf.io/5bhwe

    3. Reviewer #1:

      The authors report a model about the confidence-effort tradeoff; explaining how subjects invest effort depending on how confident they want to be in their decision (and how costly this is). They fit their model to behavioural data and report qualitative similarities between model and data.

      I find this an interesting model, with interesting links between timely topics of interest, such as confidence, effort, and cost optimisation. But I have several requests for clarification.

      Major Comments:

      Line 274: "Without loss of generality": what does this mean here? I guess that with a different cost function, not all conclusions remain the same?

      The model assumes that it is "rewarding" to choose the correct (highest-value) option (B = R*P). But is this realistic? If the two options have approx the same value, then R should be small (it doesn't matter which one you choose); if the options have different values, it is important to choose the correct one. Of course, the probability P_c continuously differentiates between the two options, but that is not the same as the reward. Can the predictions generalise toward a more general R that depends on value difference?

      In Figure 2, I guess that the important quantity to decide is a standardised delta-mu (similar to d' in signal detection theory). It might be useful to also plot that (essentially combining the current two plots). Or alternatively, plot P_c(z), which relates more directly to the theory.
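
      To make the suggested quantity concrete, here is a minimal sketch of one possible standardisation. It assumes independent Gaussian value beliefs for the two options and divides the mean difference by the standard deviation of that difference. The function names are hypothetical, and the mapping to a probability uses the exact normal CDF, which may differ from the approximation used in the manuscript.

```python
import math

def standardised_difference(mu1, mu2, var1, var2):
    """Absolute mean difference in units of the SD of the difference (d'-like)."""
    return abs(mu1 - mu2) / math.sqrt(var1 + var2)

def prob_best_is_best(mu1, mu2, var1, var2):
    """Probability that the option with the higher mean truly has the higher value,
    assuming independent Gaussian beliefs: Phi(standardised difference)."""
    z = standardised_difference(mu1, mu2, var1, var2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical example: means 1.0 and 0.0, variances 0.36 and 0.64 (SD of difference = 1).
z = standardised_difference(1.0, 0.0, 0.36, 0.64)
p = prob_best_is_best(1.0, 0.0, 0.36, 0.64)
print(z, p)
```

      Plotting either quantity on a single axis would combine the two current panels, since both depend on the means and variances only through this ratio.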

      The section Probabilistic model fit is unclear. Are the MCD variables y the 5 variables mentioned above? Do different y's share the same alpha, beta, gamma? Are different transformation parameters a and b fitted for each y? Is estimation done per subject? It is mentioned that VBA is used, but what distribution is approximated exactly using VBA? Is it a mean-field approximation, optimised with gradient descent? Is the goal function a posterior across the 5 parameters? It would also be good then to have an intuition on the estimated model parameters (e.g., their standard error or Bayesian equivalent). Is there an estimate of model fit (in addition to checking qualitative predictions)? Figure S3 is a good start (and I think it is worth putting in the main MS), but it would be nice, for example, to see model comparisons where one or more parameters are restricted.

      Figures 4, 5 and 6 should be better annotated. I have a hard time working out what exactly is plotted (e.g., the scale of the color bar). Why are the data grouped in percentiles? Also, in the Figure 4 legend, I guess that "beta" does not refer to the MCD model parameter? Please avoid overloading definitions.

      Figure 7: It seems that "spreading" of alternatives occurs in the model only for alternatives that are initially close together? Is this consistent with their discussion around equation (14)? (I may be overlooking something; if so, consider making this more explicit.)

      I find it a really interesting feature of the model that it can explain spreading of alternatives from a statistical perspective. So I think it's worth commenting on it in the Discussion. For example, does the current model capture trends in the literature? To what extent is the effect (also in empirical data) dependent on initial value differences?

    1. Reviewer #2:

      This is a nice study that is clearly written and makes use of several datasets. The authors show that a gene signature associated with increased myelopoiesis in utero is associated with increased risk of pediatric asthma. Furthermore, they show that cord blood serum PGLYRP-1 is associated with reduced risk of pediatric asthma and increased FEV1/FVC. Interestingly, sIL6Ra, which is derived from neutrophils but not associated with neutrophil granules, did not show any association with pulmonary outcomes. This suggests that the association lies with the neutrophil granules rather than with the neutrophils per se. The following should be addressed:

      1) While the manuscript is clearly written, the message regarding PGLYRP-1 is at times confusing. The manuscript is clear that PGLYRP-1 is inversely associated with mid-childhood asthma risk. The discussion, however, refers to animal models where PGLYRP-1 is proinflammatory and is associated with increased airway resistance and allergen sensitization. The apparent disparity should be clarified.

      2) What is the proposed role of neutrophil degranulation in the pathogenesis or long term susceptibility to asthma?

      3) While it was not the focus of the current study and may be beyond the scope of the data, it would be interesting to know if there is any association with the subsequent development of adult asthma.

    2. Reviewer #1:

      This paper attempts to explain perinatal risk factors and the associated risk of developing pediatric asthma in the mid-childhood and early teenage years. The authors found that some maternal characteristics such as atopy, BMI, race/ethnicity and demographics such as newborn sex, and birth characteristics such as birthweight, gestational age, and mode of delivery were associated with risks of subsequent asthma development in the pediatric population. The paper then goes on to demonstrate the differences in immune response during the different time frames of pregnancy. Throughout the majority of the pregnancy, fetal hematopoiesis generates mostly lymphoid and erythroid lineages. Towards term, the immune cells are predominantly neutrophils and monocytes. Pre-term is characterized primarily by lymphocytes. It was seen during term deliveries that the myeloid response produces several cytokines that shift CD4+ T cells away from the Th2 response. Enhanced production of IFN gamma upon leukocyte stimulation early in life is associated with reduced susceptibility to infections. However, the authors state that these findings do not extend to asthma diagnosis in childhood.

      Major comments:

      I would have liked the paper to readjust the introduction; a lot of emphasis is placed on IFN/infection/asthma, but after this fact, it seems neglected going forward and the paper explores another topic. Instead, the paper's focus was on determining the biological nature, serologically, with a granulocytic luminal marker (PGLYRP-1) and a membrane-bound marker (sIL6Ra) and its association to pediatric asthma.

      The take-home message for the paper - that there appears to be an inverse relationship between serum levels of PGLYRP-1 and overall risk for pediatric asthma - should be explored in relation to whether a therapeutic role for such proteins is possible since they can accurately predict risk factors for disease and assess pulmonary function. Other proteins, like the sIL6Ra, have no association with disease predictability and have no association with predicting pulmonary outcomes. This should be explored/explained in greater detail.

      Minor comments:

      As part of the validation efforts of the study - the rationale for using three different cohorts to assess pediatric asthma risk was not clearly explained.

      One of the main findings of the analysis was the conclusion that patients with higher levels of myeloid cells in their CBMCs are at lower risk of developing pediatric asthma, and vice versa. Furthermore, CBMC neutrophil abundance was negatively associated with the number of risk factors (patients with more risk factors, as mentioned above, were found to have lower levels of neutrophils in their CBMC, and to be more at risk of pediatric asthma). This was further elucidated by measuring CBMC plasma levels of PGLYRP-1 alongside levels of mRNA and correlating them with risk of developing pediatric asthma. Increased levels of mRNA for the PGLYRP-1 protein were associated with an increased serum concentration of the protein. However, this was inversely correlated with risk factors. Patients with fewer risk factors for development of pediatric asthma were found to have increased levels of the protein and its mRNA.

      The correlation of PGLYRP-1 (present in neutrophil specific granules) and sIL6Ra (derived from neutrophils, but not present in granules) with pediatric asthma at mid-childhood and early-teen years was determined. There were two follow-up points where asthma outcomes and pulmonary function by way of the FEV1/FVC ratio were determined. It was found that increased levels of PGLYRP-1 were significantly associated with current asthma at mid-childhood. However, there was no association between levels at the early-teen follow-up.

      In terms of correlations between each protein level and pulmonary function - the sIL6Ra protein was NOT associated with the FEV1/FVC ratio or a bronchodilator response at either age group. However, it was found that increased levels of PGLYRP-1 were associated with an INCREASED FEV1/FVC ratio (not indicative of asthma) and reduced odds of developing pediatric asthma at each age group.

      This analysis makes sense as increased production of the neutrophil granule protein PGLYRP-1 serves a protective effect against infection, reducing incidence of disease states. The paper, however, should explore the rationale behind the lack of association for the sIL6Ra protein. Since this protein is NOT associated with neutrophilic granules, it can be inferred that it may not have a role in protecting against infection. However, this could have been explored in more detail in the paper.

    1. Reviewer #3:

      The results of this study suggest that maternal loss alters the HPA stress axis in wild chimpanzees, but these effects are transient and are not evident later in life.

      Overall the study is the result of much careful fieldwork. The number of cortisol samples is impressive and these are robustly analysed. The conclusions are carefully and thoroughly discussed.

      I have very few comments, in part because I am not a specialist in stress hormones and so cannot fully assess the laboratory analysis or interpretation, but in part because my view is that this is a high-quality thorough study and a well-written manuscript.

      My only major point is that I am aware that measurement of cortisol is difficult in the wild. It is possible to inadvertently measure metabolites other than cortisol, and the most robust way to measure cortisol is using a challenge and subsequent measurements. While I cannot adequately assess this aspect of the manuscript, I think it is important that the other reviewers/editor ensure the hormone measurements are appropriate.

    2. Reviewer #2:

      The paper submitted by Girard-Buttoz and colleagues asks whether and how early maternal loss affects cortisol levels and diurnal slopes among wild chimpanzees at Tai Forest, Côte d'Ivoire. The major claim of the paper is that, like humans, chimpanzees experience altered HPA functioning after maternal loss, including alterations to both diurnal slope and overall cortisol levels. However, their chimpanzee orphans exhibited patterns in diurnal slope that were opposite to their predictions (predicted blunted slopes, observed steeper slopes). The authors should be commended for their efforts in collecting a large number of samples for this analysis. However, I am not convinced that it is sufficient for investigating the hypotheses put forth here and, therefore I am also not convinced that their results are solid. I also have concerns about the theoretical grounding for the paper.

      1) My principal concerns with this paper, as written, revolve around the methods/results. First and foremost, I am not convinced that the authors have a sufficient sample size to evaluate the predictions/hypotheses outlined in the introduction. While 849 urine samples is a large number, and again, their efforts here should be commended, the sample spread is actually quite thin once it is split up into appropriate categories, especially considering how many samples were collected per individual year, on average. As the authors indicate throughout and especially when describing their modeling approach, cortisol is inherently a very noisy hormone impacted by myriad factors, including age in at least one other densely-sampled chimpanzee community. I'm also surprised that time of day was modeled quadratically. It is my understanding that humans, other populations of chimpanzees, and other mammals follow a sigmoidal curve which should be modeled with a third-order term as well. For these reasons, it's difficult to tell whether model 1A is not significant because of insufficient sample or a true lack of predictive power. Additionally, I'm concerned that the paper seems to focus so much on the results from a single model term in a model that did not reach significance.

      2) Despite acknowledging that the "significance of these predictors should be interpreted with caution" because model 1a did not reach significance, the authors make very strong claims about the results in the discussion and also feature the finding of that model in the title of the paper. That seems problematic to me, especially because the non-significant model results (more intense diurnal slopes among immature orphans) diverge from the expectations set forth by other works in humans and non-humans. The finding that this has to do with higher-than-expected morning cortisol is puzzling given that evening levels are generally considered more responsive or plastic. However, this could also be an artefact of fitting the models without the third-order term for time.

      3) The introduction needs refinement to help clarify and specify the authors' arguments.

      (a) Does the biological embedding model always lead to negative fitness outcomes? Or is it possible that phenotypic adjustments might be adaptive, or even just making the best of a bad job (e.g. earlier death, but not death today)?

      (b) Throughout the introduction it is unclear whether and where the authors refer to the human clinical literature as opposed to animal literature. It is also unclear how human patterns are similar versus different from those observed in animals. Further, I would recommend that the authors include a deeper review of the animal literature (e.g. early experimental work with macaques, cortisol at other chimpanzee field sites/captivity). It's also unclear whether and where the authors refer more broadly to early life adversity (and what this means for humans vs. animals) versus more specifically to maternal loss. Additionally, there should be further discussion specifically related to early maternal loss (rather than "early life adversity" which can include a lot of different factors) focused on the nutritional and social obstacles associated with early maternal loss, how these relate to HPA functioning, and how these effects are expected to change during development (Plasticity? Flexibility? The role of HPA in responding to changing environmental conditions?). What about the adaptive calibration model which posits that the HPA can readjust during particular periods of developmental reorganization?

      4) It is difficult to assess the discussion without first dealing with the problems in the introduction/methods. However, despite their claims in the results section, it does not seem that the authors interpreted the results of model 1a with caution.
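
      To illustrate the concern in points 1 and 2 about the missing third-order term for time of day, the following is a minimal sketch on simulated data. It generates a hypothetical sigmoidal decline, fits quadratic and cubic polynomials by ordinary least squares, and compares them by AIC. It ignores the random effects and covariates of the authors' mixed models, and all values are invented.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def poly_rss(t, y, degree):
    """Least-squares polynomial fit of the given degree; returns the residual sum of squares."""
    feats = [[ti ** p for p in range(degree + 1)] for ti in t]
    k = degree + 1
    xtx = [[sum(f[i] * f[j] for f in feats) for j in range(k)] for i in range(k)]
    xty = [sum(f[i] * yi for f, yi in zip(feats, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * fi for b, fi in zip(beta, f))) ** 2
               for f, yi in zip(feats, y))

# Hypothetical data: time of day rescaled to [-1, 1], response following a sigmoidal decline.
rng = random.Random(1)
t = [rng.uniform(-1, 1) for _ in range(300)]
y = [1.0 / (1.0 + math.exp(4.0 * ti)) + rng.gauss(0, 0.05) for ti in t]

n = len(t)
rss2, rss3 = poly_rss(t, y, 2), poly_rss(t, y, 3)
aic2 = n * math.log(rss2 / n) + 2 * 3   # quadratic: 3 coefficients
aic3 = n * math.log(rss3 / n) + 2 * 4   # cubic: 4 coefficients
print(f"AIC quadratic = {aic2:.1f}, AIC cubic = {aic3:.1f}")
```

      If the real diurnal pattern is sigmoidal, a quadratic fit would leave systematic residuals at particular times of day, which could distort group differences in estimated morning levels and slopes.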

    3. Reviewer #1:

      A very interesting paper testing the biological embedding model in a wild long-lived mammal using an impressive dataset. However, the results for immature orphans are not entirely straightforward. The effect on the HPA axis is in the opposite direction to humans and there seems to be no significant increase in cortisol compared to non-orphans overall - it depends on time since maternal loss. The paper would be improved by communicating this more clearly and discussing exactly why this pattern may be different to that in humans. Some of the evolutionary ideas discussed in the paper also need to be more clearly conveyed or thought through.

      Substantive concerns:

      1) There are important sections in the introduction (L125-128 particularly) and discussion (L403-409) about the evolution of the HPA response and differences between humans and other mammals that are unclear. Greater detail on the evolutionary logic being used, the precise hypotheses being suggested and references to back the ideas up are required (further details in minor comments).

      2) Table 2/Model 1a doesn't directly test whether orphans have higher cortisol than non-orphans (or no p-value is reported in table 2) and CIs in table 1 suggest that there is not a significant difference. Therefore, categorical statements that orphans have higher cortisol levels don't seem to be entirely justified. However, model 1B demonstrates that cortisol declines with years since maternal loss and figure 3 supports the idea that orphans do have higher cortisol than non-orphans in the first 2 years following maternal loss but that this declines to levels similar to those of non-orphans after 2 years. Could a statistical test be run to back this up? Perhaps instead of using a binary variable for orphan status (yes/no) it could be analysed as categories (orphaned within 2 years, orphaned more than 2 years ago, not orphaned as an immature) which could be used to directly test this and back up statements, e.g. recently orphaned immatures had higher cortisol levels than non-orphans. A broader concern is why likelihood ratio tests have been used to calculate p values (and for only some of the predictors) rather than reporting the output from the models themselves. Could you explain what the benefit of this is over reporting values from the actual models and/or also provide the model outputs?

      3) The effect on cortisol slopes found in this study is in the opposite direction to that in humans. This is discussed in some detail but is lacking clarity in places and I think it would help to make this difference more obvious - it is really a key finding of the paper not a secondary point. The expected pattern is very nicely set out in the introduction so it would be good to format the discussion so there is a paragraph that outlines exactly how the results differ from hypothesized:

      (a) that the effect on cortisol slopes is in the opposite direction

      (b) that only the cortisol levels of recently orphaned immatures are significantly different to those of non-orphan immatures, and then brings in the ideas discussed about why these differences may be present. I think this would really help communicate the findings more clearly, bringing the discussion more in line with what is set out in the introduction.

    1. Reviewer #2:

      The manuscript addresses an interesting question: whether genetic effects of common variants on educational attainment (EA) differ between individuals with and without psychiatric diagnoses. The dataset they use is ideally suited for such an analysis. The authors find evidence that the influence of common variants on EA is attenuated in individuals with a diagnosis of autism spectrum disorder (ASD) or ADHD.

      My main concern with the paper is the statistical analyses used to support the authors' conclusions. The authors draw conclusions from dividing individuals into subgroups and comparing the R^2 of the EA PGS between those subgroups. This analysis is liable to bias due to range restriction: if the subgroups have been selected based on low/high education, then the R^2 of a predictor will tend to be lower in the subgroups than in the overall sample. Furthermore, here the selection into the subgroup (here diagnosis with ASD or ADHD) itself is related to both education and the EA PGS, which could be contributing to the differences in R^2 the authors observe between subgroups.
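
      The range-restriction concern above can be illustrated with a small simulation. This is a minimal sketch that is not taken from the manuscript or the review: the effect size (`beta`), the selection cutoff, and the sample size are all assumed values, and selection is applied directly to the outcome for simplicity.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def simulate_range_restriction(n=20000, beta=0.35, cutoff=-0.5, seed=1):
    """Compare R^2 in the full sample with R^2 in a subgroup selected on low outcome.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    pgs = [rng.gauss(0, 1) for _ in range(n)]
    noise_sd = (1 - beta ** 2) ** 0.5  # keeps the outcome variance close to 1
    ea = [beta * g + rng.gauss(0, noise_sd) for g in pgs]

    # Subgroup chosen purely because the outcome is low (range restriction).
    low = [(g, y) for g, y in zip(pgs, ea) if y < cutoff]

    r2_full = r_squared(pgs, ea)
    r2_low = r_squared([g for g, _ in low], [y for _, y in low])
    return r2_full, r2_low


if __name__ == "__main__":
    full, low = simulate_range_restriction()
    print(f"R^2 full sample: {full:.3f}; R^2 in low-outcome subgroup: {low:.3f}")
```

      In this sketch the predictor has the same true effect on everyone, yet the subgroup R^2 is smaller simply because the outcome range has been truncated.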

      A more powerful and robust analysis would be to fit an interaction model in the full sample. The authors could regress individuals' EA jointly onto their EA PGS, their diagnoses coded as binary variables, and the interactions between the EA PGS and the diagnosis codings. The authors could do this jointly for all diagnoses in the full sample, which would account for comorbidities between psychiatric disorders. If the influence of the EA PGS is truly weaker in ASD and ADHD cases, there should be a negative interaction effect between the EA PGS and ASD and ADHD diagnoses, which can be tested with a simple statistical test for a non-zero interaction effect.
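
      As a rough sketch of what such an interaction model looks like, the following fits EA on the PGS, one binary diagnosis, and their product using ordinary least squares on simulated data. Everything here is an assumption made for illustration: the data are synthetic, only one diagnosis is included, diagnosis is simulated as independent of the PGS, and the true coefficients (0.4, -0.5, -0.2) are invented. A real analysis would use a statistics package that also reports standard errors for the interaction term.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def simulate_interaction(n=20000, seed=7):
    """Return [intercept, pgs, diagnosis, pgs x diagnosis] estimates.

    Hypothetical truth: PGS slope 0.4 in controls and 0.2 in cases.
    """
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        pgs = rng.gauss(0, 1)
        dx = 1.0 if rng.random() < 0.2 else 0.0  # assumed 20% prevalence
        ea = 0.4 * pgs - 0.5 * dx - 0.2 * pgs * dx + rng.gauss(0, 1)
        rows.append([1.0, pgs, dx, pgs * dx])
        y.append(ea)
    return ols(rows, y)


if __name__ == "__main__":
    names = ["intercept", "pgs", "diagnosis", "pgs_x_diagnosis"]
    for name, coef in zip(names, simulate_interaction()):
        print(f"{name}: {coef:.3f}")
```

      A negative `pgs_x_diagnosis` coefficient corresponds to the attenuated PGS effect in cases; extending the design matrix with further diagnoses and their interactions would give the joint model described above.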

      It could also be worth first regressing the EA PGS onto the psychiatric diagnoses, and taking the residuals before assessing whether there are interactions between the EA PGS and ADHD/ASD diagnosis. It is possible that correlation between the EA PGS and ADHD/ASD diagnosis could generate a spurious interaction effect in the above analysis.

      It is interesting that controlling for SES appears to mediate the (potential) interaction between EA PGS and ADHD diagnosis. However, I worry again that this could be a function of SES influencing ADHD diagnosis. SES and its interaction with both EA PGS and ADHD diagnosis could also be included in a full interaction model that could help interpret this finding.

      The authors construct the PGS by using a pruning and thresholding approach. This is known to be suboptimal, which may explain why their R^2 is lower than in other studies. The authors could use LD-pred or other methods that account for linkage disequilibrium and non-infinitesimal genetic architectures. In the EA GWAS from which the score was constructed, the best R^2 was found by applying LD-pred to all variants without p-value thresholding.

      The hypothesis that indirect genetic effects differ between psychiatric cases and controls is interesting. Do the authors have sufficient sibling data within their samples to test this?

      Line 581: Closely related individuals were removed from the analysis. Why? How many were removed? Could inclusion of these help assess the hypothesis about indirect genetic effects and improve power? The authors could use a mixed model regression to control for relatedness without having to throw individuals out of their sample.

      The grammar in the writing of the paper is a little odd at times. Often, definite or indefinite articles are omitted preceding nouns, such as in 'association of EA-PGS' in the abstract, which should be 'association of the EA-PGS'.

      Line 54: 'strongly influences' is, I think, a little overconfident in its assignment of causality to highest level of education; perhaps 'strongly associated' would be better.

      Paragraph 3 of the introduction: the authors should mention population stratification and assortative mating as possible mediators of the association between EA PGS and EA, especially when referencing the drop in association strength in within-family designs.

      I found the decile based analyses a bit pointless. By arbitrarily dividing a continuous outcome into discrete subgroups, the authors are losing power and not gaining much compared to simply performing linear regression, which they already do. I would relegate these to supplementary figures.

      Line 452: I think that the stated equivalence between low EA PGS and learning difficulties goes a bit too far here. I understand the point the authors are trying to make, but I think it should be phrased more carefully.

      The authors used an MAF threshold of 5% for construction of the score. Typically, a threshold of 1% is used for construction of PGS from summary statistics by software such as LD-pred.

      Line 580: the authors state that an EA PGS based on summary statistics from European samples cannot be used to predict EA in non-European samples. This is not true. It is true that the prediction accuracy is attenuated, but it is not zero.

    2. Reviewer #1:

      This is overall a well written and methodologically sound study researching how educational achievement can be predicted using genomic data when the sample is stratified to those without and those with diagnoses of common psychiatric disorders. I think that it is a very important study area, the study is well powered using a fantastic representative sample and offers some insights into aetiology of associations between psychiatric traits and educational achievement.

      I suggest some minor adjustments for the authors to consider, mainly addressing the conclusions and implications of the findings. I also recommend some clarifications in the methods and the results sections; these suggestions might require some very modest additional analyses and rethinking/rewording some of the conclusions.

      • The major issue I have is that you discuss family SES as a purely environmental factor throughout the manuscript. However, we know that this is not the case and that there is substantial heritability for SES. It follows from what SES composite is made out of, in your case parental education and occupation, both of which are highly heritable (as you rightly note in the manuscript yourself). This needs to be addressed and discussed throughout the manuscript.

      • The major conclusion in the manuscript, even if you acknowledge that this is speculation, is that the attenuation of the association between EA-PGS and school grades after correcting for SES can be explained by genetic nurture. I agree that this can be one of the explanations; however, here you also control (partially) for transmitted genes, that is, educationally related genetic variants present in both generations (so without genotyped trios you cannot distinguish between direct and indirect genetic effects). In addition, this attenuation can also be explained by gene-environment correlation: not only passive rGE, which is addressed by the genetic nurture hypothesis, but also active and evocative rGE. Furthermore, in your design, you need to consider assortative mating. I suggest directly addressing this in the manuscript.

      • I also think that you should address that you are dealing with diagnosed disorders only. It is a great strength of the paper, and you are using a fantastic resource, but we know that these disorders are quantitative traits and your study does not allow you to take that into account, so individuals with high ADHD symptoms are possibly included in the control group; similarly, you cannot take into account the symptom severity. In terms of symptom level data, I see you have referenced the Selzam et al., 2019 paper that, among other things, related EA-PGS to ADHD symptoms and vice versa, and also controlled for SES.

      • In the introduction, you rightly state that individual differences are explained by genetic and environmental factors and the interplay between them, however, I suggest rephrasing it, because "much of the variance can be explained" is incorrect, all of the individual differences can be explained by the combination of these factors.

      • You report low rG between schizophrenia and E1; can you specify how this was calculated?

      • You state that your prediction in the control sample is lower than in the other studies and offer as a possible explanation the inclusion or exclusion of 23andMe data in the summary statistics. Please note that other studies have not used 23andMe statistics either (for example TEDS publications). You also discuss genetic heterogeneity; I think that the difference can be explained by both genetic and environmental heterogeneity. What is the rG between EA in your sample and the GWAS sample?

      • I think that the conclusion that the impact of low EA-PGS is comparable to the impact of ADHD is too strong, your data does not support this strong conclusion. I suggest rephrasing it, especially as we're not aware of the associated mechanisms. Note that people with ADHD in your sample also have lower EA-PGS compared to control conditions. In addition, symptom severity of ADHD varies greatly.

      • I also do not agree with the statement that having wealthy parents does not boost the performance as much for children with ADHD as compared to children without for the reasons mentioned above.

      • I think that you have fantastic data, and you have data available about how many of your participants have multiple diagnoses. I suggest adding a stratified group with multiple diagnoses to the analyses, that is adding groups with 2, 3 or 4 and more psychiatric diagnoses and checking their polygenic score prediction to EA.

      • I suggest making it clearer what covariates were used in every analysis (you say first that you added psychiatric diagnoses as covariate among the usual covariates, but later only that covariates were included 'as before', I assume you did not include diagnoses in later analyses, but this is not clear). In addition, it is not clear to me why you control for psychiatric diagnoses in the first set of analyses, I would have wanted to see full results without this covariate.

      Overall, this is a beautiful study and it was a pleasure to read/review it.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 4 2021, follows.

      Summary

      In this manuscript, the authors have generated a new mouse model for the severe disease, Ataxia Telangiectasia (A-T). They introduce null mutations in Atm onto the background of mice that are somewhat sensitized since they also harbor mutations in the Aptx gene. The outcome is that the mice show a set of phenotypes that are strikingly similar to symptoms seen in human patients. These include cerebellar degeneration, cancer, and immune system abnormalities. They also deliver small molecule readthrough (SMRT) compounds into tissue explants and show that such a manipulation can restore the production of ATM protein. The success in producing an Atm model with cerebellar degeneration is a compelling advance as this particular phenotype has been incredibly difficult to reproduce in animal models. The authors perform an interesting set of analyses to confirm that the other important features of the disease are also present in their mice. This paper has broad interest to multiple fields including neuroscience, cancer, and immunology.

      Essential Revisions

      1) It is not clear how progressive the cerebellar degeneration is. What is the spatiotemporal pattern of degeneration? Please consider the lobule by lobule effects over time.

      2) For the electrophysiology, what stage cells have you recorded from? That is, what was the structure of the Purkinje cells that you recorded? If the cells look really "normal" but fire abnormally, then please comment on how they are being affected. If the morphology is abnormal, then please explain what defects you see and how they might impact function. Essentially, the authors need to disentangle cell autonomous effects and non-cell autonomous effects with more clarity. That is, are you studying the "dying" cells or the cells that escaped the genetic defect?

      3) Are both the Atm and the Aptx genes expressed in all (or the same) Purkinje cells? What is the experimental evidence?

      4) Please provide more context and rationale for Aptx in the abstract. As it stands, its mention comes out of nowhere.

      5) In the Introduction, please provide more information as to why previous studies/models might have failed to produce severe Atm-related cerebellar phenotypes.

      6) In the Introduction, the rationale for the choice of pairing the Atm mutations with defects in the Aptx gene is unclear. Are they in the same pathway? Are the genes located in close proximity to one another? There are many issues that need to be discussed.

      Related to above, ATM and APTX, while involved in DDR, are involved in parallel pathways: ATM in DNA double stranded break repair, and APTX in single stranded break repair. Homozygous mutations in APTX cause human ataxia (AOA1), but there is nothing to indicate an intersection mechanistically between AT and AOA1. One could just as well call the AT-APTX double mutation a model of AOA1. As indicated above, please expand on the rationale of the experimental design.

      Also, are there more single stranded DNA breaks? Double stranded DNA breaks? Is there a sequestration of SS DNA break repair components including PARP1? How are the changes in PC firing related to DDR? It would be worthwhile for the authors to examine the following papers (Hoch et al. Nature. 2017 Jan 5;541(7635):87-91; Stoyas et al. Neuron 2020 Feb 19;105(4):630-644) for insight into studies that can explore mechanisms linking DDR to changes in cerebellar morphology/function.

      Therefore, the authors need to address whether single vs double stranded break repair is present and the authors could do a better job of linking the change in PC firing to DNA damage.

      7) Figure 2B: Apologies if I am missing something, but I do not understand the reason or explanation for what determines the probability of survival for the green, gold, and orange traces (the three severe cases in the graph). That is, why is the gold so strong?

      8) How come rotarod was not used as a test? This is a standard motor behavior test that is useful for comparing across animal models and studies.

      9) Related to above, why not use in vivo recordings? I can understand using slice recordings to tackle the biophysical and intrinsic mechanisms, although the authors did not do that. It seems to me that extracellular recordings would have been more informative in the in vivo, awake context.

      10) The authors picked specific regions of the cerebellum to target their slice recordings, which is perfectly reasonable. But why did you pick these regions? Please provide a full justification and discussion for the importance of these particular lobules in relation to what you are trying to solve.

      11) Given the use of slice recordings and that Purkinje cell degeneration is a key aspect of the phenotype, it would be very compelling if the authors showed some filled cells. As it stands, it is very hard to appreciate what the severity of neuropathology actually looks like, especially in relation to what the functional defects are teaching us.

      12) The authors state that "The largest differences were detected in the anterior [38.6±3.4 Hz (n=187) vs. 88.1±1.8 Hz (n=222)] and posterior [46.9±1.9 Hz (n=175) vs. 84.1±2.4 Hz (n=219)] medial cerebellum [1-way ANOVA, p<0.0001; Fig. 4B]." Okay, but what does this mean? What is your interpretation for why these regions were more heavily impacted (cell sensitivity based on circuit architecture, gene expression and protein make-up, neuronal lineage?) and how might it impact the phenotype?

      13) The authors state and reference "Previous studies in mouse models of heritable ataxia indicate that physiological disruption in PN firing not only includes changes in frequency but also affects its regularity (Cook, Fields, and Watt 2020)." I agree with having this reference, but what about other models of ataxia? There are a number of other excellent models that should be discussed.

      14) Purkinje cell firing data (figure 4B) should not be averaged across all of the ages, as this is not standard practice, and would be akin to averaging all behavior across ages. I think the data in fig. 4C suffices. If you want to compare across lobules on one graph, simply choose a particular age (perhaps when behavioral changes are first observed?) or at the oldest age.

      15) Why examine Purkinje cell firing deficits in different lobules but not make that distinction for Purkinje cell loss? The Purkinje cell loss analysis focussed on the areas with most pronounced firing deficits but this means that we don't know whether the cells that fire abnormally are the only ones that die. Also see point #2 above.

      16) Figure 4E and related text: Please provide a much more extensive set of images to show the cerebellar pathology. 1) Please show views of the different lobules to demonstrate the pattern of degeneration. 2) Please show different ages to show the progression of degeneration. 3) Please show higher power images of the Purkinje cells to clearly demonstrate their morphology.

      17) The authors need to provide more data for what is actually happening in relation to cell death. Why not perform TUNEL or caspase staining etc.? The authors must show that there are actually acellular gaps where cells have died, or some other indication that cell death has occurred or is occurring.

      18) Also in relation to the Purkinje cell degeneration, what do the dendrites look like? What about the axons? Do you see any torpedoes or axonal regression?

      19) In regards to the cerebellar degeneration, what happens to the other cell types in the cerebellar cortex? Are they intact? What about the cerebellar nuclei?

      20) The authors state "Of interest, APTX deficiency by itself had the greatest effect on the loss of DN4 cells...". Okay, but it is hard to see what this means for A-T as a disease. Interesting as it is, what is the relevance of this gene and these findings to the actual disease?

      21) Please provide a more extensive description and rationale for why this explant system was chosen.

  2. Dec 2020
    1. Reviewer #3:

      General assessment:

      This paper applies a sophisticated psychophysical paradigm to assess the effect of prior choices on perceptual decisions in a group of 17 high functioning (but not mild cases) children and teenagers (8-17 years) with ASD. Using a model that is assumed to dissociate the contribution of prior stimuli and choices, the study found a strong effect of prior choices not stimuli, which is stronger in ASD than controls. Similar results from another data set are also reported. There was no convincing evidence found for a correlation between the effect of the priors and the ASD severity.

      Overall, this is an impressive study with a sophisticated paradigm, elaborate data analysis, ASD participants who were tested on a large battery, in-depth analysis of the literature with interesting insights, convincing results (but see below) and a well written manuscript.

      Major issues:

      1) The finding from the model that the prior stimuli did not have a positive impact (and even negative) on the decision bias is counter-intuitive and needs explanation (I apologize if there is one and I missed it). There were typically 5 prior trials, ~4 of them on one side, e.g. right, resulting in a higher rate of right presses on the test (because the test was unbiased, and the results showed a bias). Assuming the prior trials were mostly replied correctly, there should be a correlation between the stimuli and the choices. I see 2 possible reasons why the model produced negative weights - one is that indeed the choices were different from the stimuli, in which case we need to know the performance of the participants on the prior trials (which would be useful anyway). The other possibility is that the choices for the model were binary and the stimuli were continuous. If the stimuli had been coded as binary, it would have been difficult to dissociate between the stimuli and the choices. In this case, the conclusion should be that the prior stimulus laterality could have impacted the test choices, but not their magnitude. This issue should be explained in the text.

      2) The performance on the test trials staircase procedure is not reported, only the PSE difference. It would be useful to know if the groups differed on this, as the example psychometric curves shown seem shallower in ASD. Biases are likely to push the staircase procedure to higher laterality discrimination thresholds. I suspect (but without proof) that worse performance (more errors) on the staircase procedure may amplify (but not create) the bias. It would be useful to show the performance data and discuss this issue.

      3) The paradigm used is quite complex and complex paradigms are more difficult to fully understand, so I wonder about the justification for it. Why is it better or different from testing SDT shift of criterion by change in target probability? For example, in a Yes/No experiment for contrast detection set around 70% correct, the criterion may shift when there are more Yes or No trials. What would the authors expect in such an experiment? It would be useful to discuss this for the wondering reader.

      4) About the interpretation: the word "perseveration", i.e. a tendency to repeat the last key or recent keys is not mentioned. The authors conducted a "response invariant" experiment which showed significant but much smaller biases (Figure 7). Are these significantly smaller than the 1st experiment (as seems from the plots)? If so, one cannot rule out a major contribution of repeating the recent keys, i.e. perseveration. It would be useful to see the raw data in this case, e.g. what is the %trials of pressing right when the priors were biased to the right. My understanding is that it must be high given that the staircase was symmetric (50/50 trials on left and right) and that a bias emerged from the data.

      5) I wonder if the data could be analyzed to reveal the different contribution of preceding trials, i.e. the details of the serial dependency. Currently, all previous trials are treated equal in the model, but their contribution is not necessarily equal.
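
      For readers less familiar with the signal detection theory (SDT) measures referred to in point 3 above, the standard equal-variance Gaussian definitions of sensitivity and criterion can be sketched as follows. The hit and false-alarm rates in the example are invented for illustration and do not come from the manuscript.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Equal-variance Gaussian SDT: return (d_prime, criterion).

    d' = z(H) - z(FA) and c = -(z(H) + z(FA)) / 2, where z is the inverse
    of the standard normal cumulative distribution function.
    """
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -0.5 * (z(hit_rate) + z(false_alarm_rate))
    return d_prime, criterion


if __name__ == "__main__":
    # Hypothetical rates: an unbiased observer, then one who says "yes" more
    # often after the proportion of target-present trials is increased.
    for label, h, fa in [("balanced", 0.70, 0.30), ("more yes trials", 0.80, 0.42)]:
        d, c = sdt_measures(h, fa)
        print(f"{label}: d' = {d:.2f}, c = {c:.2f}")
```

      In this framing, a change in target probability that leaves d' roughly constant while moving c away from zero would indicate a shift of criterion rather than a change in sensitivity.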

    2. Reviewer #2:

      General assessment and major comments:

      The study addresses a timely and important question of the role of potential modulations in perceptual decision-making in the atypicalities observed in perceptual processing of individuals diagnosed with autism. The manuscript is important, and the methods used are sound.

      There are however some issues to consider:

      Thresholds, or other indications of sensitivity and precision of performance in the task, are not detailed (although judging by the individual psychometric functions presented in the figures, slopes seem less steep in ASD). Was sensitivity considered in any way in the analysis? I wonder what the model fitting would look like and how it would interact with the biases. Bias magnitude could vary as a function of noise or sensitivity.

      Also, could larger consistency bias in the ASD group result from weaker performance, more lapses of attention etc.?

      Age range is quite large. Did you check for age-related differences? I understand the sample size is not big enough to analyze data across different age groups but maybe as a covariate? (there is also the problematic issue of determining sample size of children based on the study in young adults).

      Not sure why the effects of prior stimuli are considered adaptation effects, particularly in the first experiment where stimuli were briefly presented. Also, regarding the argument in the Introduction about Bayesian priors producing positive effects -- there are other prior effects that may cause 'negative effects' in relation to prior expectations (for example, in perceptual illusions such as the weight-brightness illusion).

      Can you think of a reason why controls did not show significant consistency bias in their responses in the heading discrimination?

      There is some wording in the reports of the statistics such as 'more significant' or 'more marginally' that needs to be rephrased.

      Were the analyses corrected for multiple comparisons?

      Usually RTs in this sort of perceptual task are longer in ASD. Wonder how this is not the case here, although instructions for the subjects emphasize speed and accuracy.

      I agree with the authors. It is interesting to look at correlations between the effects of prior choices and clinical scores of repetitiveness and flexibility in ASD. Did you look at the correlation between the effects of prior choices and SCQ scores across the two groups? Previous work documenting correlation between autistic traits (AQ) and modulated perception provided important information about the generalization of the findings to the broader spectrum of autism in the wider, nonclinical population (see Lawson, Mathys, & Rees, 2017; Hadad, Scwartz, & Binur, 2019).

    3. Reviewer #1:

      General Assessment:

      I found the studies to be well motivated and thoughtfully designed to disentangle competing interpretations in the extant literature on visual perception in ASD. The first two experiments provided compelling evidence that prior choices affect perceptual decision making in ASD, but the outcome of the response invariant condition suggests that the authors' interpretation goes beyond the data.

      Substantive Concerns:

      "In summary, we found here that individuals with ASD demonstrated an increased influence of recent prior choices on perceptual decisions (vs. controls),..." is the major finding in the paper, quoted here in the concluding paragraph. It seems, however, that the data support a narrower (and potentially less interesting) conclusion that individuals with ASD demonstrated an increased influence of recent button presses/motor responses, as the finding which forms the basis of the summary went away when different keys were used to report prior vs. test responses (i.e., in the response invariant condition). I understand that the authors present these data as challenges to theories of attenuated priors in ASD, but they seem to sidestep the issue that these data make their general conclusion more complicated.

      For completeness, it would be helpful to present some information on the stimulus values for the test stimuli, as these were set individually using a staircase. Where did these staircases converge? Were there group differences?

    1. Reviewer #3:

      In this paper the authors have developed a system to simultaneously generate two-, three- and four-photon fluorescence excitation from a single laser line and then proceed to apply this system to a number of turbid biological imaging applications to highlight its capabilities. Using a customised commercial La Vision BioTec Trimscope, they have incorporated a high powered fiber laser source with an optical parametric amplifier and dispersion compensation to generate either 1330nm or 1650nm laser lines with high peak pulse energies at low pulse repetition rates. They then compare the relative capabilities of each laser line in terms of number of fluorescence emission channels measured (skin tumour xenografts), fluorescence bleaching analysis and functional toxicity thresholds and fluorescence signal attenuation (excised murine bone).

      Whilst the paper is well written, the concept of utilising high laser peak powers at low repetition rates to generate 3PE and 4PE at spectral excitations at 1300nm and ~1650nm is not new and has been presented previously (Cheng et al. 2014), as referenced by the authors. The authors have however gone into more detail and presented a number of comparative excitation approaches to compare and contrast low-duty-cycle high pulse-energy infrared with the more common high-duty-cycle low pulse energy near-infrared alternative. The benefit of higher order multiphoton microscopy, when combined with higher wavelength excitation, is that it allows deeper imaging and more localised fluorescence excitation with reduced phototoxic and photobleaching effects per excitation pulse. One of the major issues associated with generating 4PE is that, since higher pulse energy is required, the repetition rate of the laser source must be reduced further to keep the average laser power low enough to avoid sample heating effects. This in turn leads to much longer acquisitions and is limited by the fluorophore saturation, particularly since they are using single beam excitation.

      Major comments:

      1) It seems as though when you take into consideration duty cycles, fluorescence saturation, water absorption effects and longer acquisition times, which lead to greater phototoxicity, 4-PE at 1700nm excitation is not appropriate for most dynamic biological applications where acquisition speed and/or continued image acquisitions are the key factors. Could the authors comment on this?

      2) How long does it take to acquire a single frame with four-photon excitation at 1700nm? Frame time was not mentioned for any of the data sets, in particular for the acquired 3D data sets. Can the authors ensure that these times are mentioned both in the main text and in the figures containing images?

      3) In line 131 and figure 3d the authors present data showing relative axial resolution measurements. Are the measured features diffraction limited, and how do they know? They are clearly not measuring like-for-like structures (different fluorescent species), so I do not think this can be used as a measure of resolution. Can the authors provide other resolution measurements?

      4) In line 140 - 142 the authors present data showing the advantages of THG at 1650nm over other excitation lines. Aside from the excitation wavelength could this data be explained by the greater absorption and scattering at the emission wavelengths generated at these laser lines?

      5) In figures 3A and 3C the SNR for 1650nm excitation increases whilst for 1300nm and 1180nm excitation it decreases. Is this simply due to more of the excited fluorophore species residing deeper in the tissue?

    2. Reviewer #2:

      Nonlinear microscopy is in the unique position that high-resolution images of cells and other tissue components can be obtained in live tissue. However, scattering and absorption limit the penetration depth. The impact of nonlinear microscopy in biomedicine and biology would be much improved if higher imaging depths can be achieved. Lately a few key studies have appeared achieving this. This manuscript contains a well-motivated extension of this research, in particular on the benefits of high-pulse-energy low-duty-cycle infrared excitation near 1300 and 1700 nm over 2-photon excitation, in heterogenous and dense tissue. The authors compare three types of excitation, at 1650 and 1300 nm at 1 MHz and at 1100 (or 1270) nm at 80MHz. They characterize photodamage in the tissue and determine the limits for power densities to stay below that. They study the achieved resolution at high depth for each of the processes and show a deeper imaging depth is resolved in bone and tumor core with 3P and 4P than with 2P. The article is a very solid and extensive study.

      Though I have no major concerns with the article, I do have some minor points:

      l.57: Are the resolutions reported for 2-, 3-, or 4-photon processes? Do you not expect these to differ for the different processes? l.60: It is not explained that power is increased from X to Y, and the peak power of 87 nJ in l.67 is not found in fig. S2.

      L. 103 Given is the power at the sample surface, after which the readout for cell stress via Ca imaging is done (very elegant). Is not the imaging depth of the readout relevant too, as it is probably the power density at the focus which matters? What imaging depths can be reached with this low power? This comes back later, but would be good to mention here.

      L.110 The phrase 'Furthermore' confuses me. I guess the authors mean to say that with their 2.8-8.7 nJ of power they were well below the 100 mW level? Which is kind of obvious at 1 MHz?

      L. 126 Some words are missing, 'but 1180'.

      Why do some signals show a peak in intensity in fig. 3C and G rather than a slope?

    3. Reviewer #1:

      In this manuscript, the authors show they can accomplish imaging in complex specimens using 3- and 4-photon excitation, deeper in the specimen than comparable optics can accomplish with 2-photon excitation laser scanning microscopy. This is a clear advantage for imaging optically hostile specimens such as cultured organoids or spheroids, or in challenging in vivo settings. I am excited about these findings, but I am not at all supportive of the current version of the manuscript being used to present these lovely findings.

      There are two strong reasons for my opinion:

      i. The manuscript presents the findings in a manner that will only be understandable by the readers who are familiar with the topic, and who are likely to already have heard of the capabilities of 3- and 4-photon excitation to image deeper into specimens.

      ii. The results are not presented in a way that the large body of potential readers can understand. They will be unable to grasp the way that the experiments were performed, or understand what the figures are showing, or critically evaluate the results that are presented.

      Thus, there is a disconnect between the quality of the work and the quality of the presentation. There are many areas of quantitative imaging and intravital imaging that are well known to those that know about them (or use them), and that are a complete mystery to the vast majority of those that don't know about the tools or use them. The authors must take this as an opportunity to reach the many workers that could benefit from this powerful approach, rather than writing for the group that already knows (and even uses) the approaches presented.

      1) Provide needed background and present important things first. The authors should give the reader a clear view into the issues in imaging biological tissues with longer wavelengths than are used for confocal laser scanning microscopy (CLSM) and for two-photon laser scanning microscopy (TPLSM). There are several factoids presented, all seemingly true, but not presented in an accessible manner. Rather than starting with a mention of the expected temperature rise due to the dramatically higher absorbance by water of 1300nm and 1700nm light, the paper first presents the major absorbance of the light (~2/3 loss) and that this isn't a problem because there is sufficient laser power. For most readers, the need for a larger laser won't be their first question; instead it will be the viability after/during the imaging session. The expected temperature rise, and an indirect mention of burn marks (!), comes at the end of the section.

      2) Explain and perform cell viability tests. Calcium imaging for assessing tissue viability is not the technique of choice for most readers, and is presented in a way that assumes general knowledge that simply does not exist. Membrane patency assays using membrane-impermeant DNA dyes, or other live-dead assays are far more common, but not presented in this study. I am not insistent that the authors use any particular assay, but I am insistent that the authors present the need for viability assay(s), teach the reader the principles of the assay(s) used, and present the results in an understandable manner.

      3) Present the finding and the figures in an accessible manner. The figures are simply not digestible by the readers who do not perform this sort of work, and the legends do not help sufficiently. For those of us who do perform work of this sort, the figures are not as convincing as they should be, or presented in a way that they can be critically evaluated.

      Consider the legend for Figure 1: "Microscopy with simultaneous 2-, 3- and 4 photon processes excited in fluorescent skin tumor xenografts in vivo. Representative images were selected from median-filtered (1 pixel) z-stacks, which were taken in the center of fluorescent tumors through a dermis imaging window. a) Excitation at 1300nm (OPA) in day-10 tumor at 145 μm imaging depth with a calculated 3.3 nJ pulse energy at the sample surface, 24 μs pixel integration time and 0.36 μm pixel size. For calculation of pulse energy at the sample surface see Figure S3. b) Excitation at 1650 nm (OPA) in day-13 tumor at 30 μm depth with a calculated 6.3 nJ pulse energy at the sample surface, 12 μs pixel integration time and 0.46 μm pixel size. c) Excitation at 1650 nm (OPA) in day-14 tumor at 85 μm depth, with a calculated 5.4 nJ pulse energy at the sample surface, 12 μs pixel integration time and 0.46 μm pixel size. Cell nuclei containing a mixture of mCherry and Hoechst appear as green."

      If I gave any of the figures and legends to the people in my lab, the half that don't do multiphoton imaging (but that have sat through many lab meetings) would just hand them back to me with quizzical expressions on their faces.

      The figures are not as compelling as the results, and defer to the body of the paper to explain what was done or what was shown, and assume that the average reader remembers the differences between OPO and OPA, for example (which they won't). The power plots showing nJ and mW in Figure 3 are inaccessible to most readers, and not well described.

      I should mention that the figures, legends and text are not satisfying for the readers who are familiar with 2-, 3- and 4-photon imaging either. These are fantastic findings, and deserve figures that are as lovely as the results, and are compelling. Some of these issues are due to typos: "Consistently, multiparameter recordings were achieved inside the tumor at 350 μm depth using excitation at 1650 nm and 1300 nm, but 1180 nm (Figure 3b)."

      However, the greater problem is that the text doesn't present the findings in a straightforward, convincing fashion and then interpret them. Instead, the conclusion often leads the evidence: "In line with an improved depth range, the signal-to-noise ratio (SNR) of 3PE TagRFP outperformed the SNR of 2PE TagRFP at depths beyond 150 μm (Figure 3c). Because H2B-eGFP expression in HT1080 tumors was very high, 3PE eGFP emission reached the highest SNR."

      The legend and figure that it describes should be able to stand on their own, and convince a skeptical reader with the help of the text in the body of the manuscript.

      In summary, these are lovely and important results that I am excited about. They are presented in a fashion that will make it difficult for most to appreciate because the body of the paper is not fashioned to teach the reader, and the figures themselves are challenging, and the legends inadequately present what is shown in the figures. Careful expansion and editing should resolve all of these issues and make the manuscript into the presentation these excellent findings deserve.

    1. Reviewer #2:

      The authors quantify virulence factors in Cryptococcus neoformans and C. gattii in a large number of clinical isolates and correlate these virulence factors to survival in a G. mellonella infection model and to the clinical outcome. The authors found a correlation between secreted laccases and disease outcome in patients. In addition, the authors show that a faster melanization rate in C. neoformans correlated with phagocytosis evasion, virulence in the G. mellonella model and worse prognosis in humans.

      The manuscript is well structured with an appropriate abstract summarizing the main findings, a clear introduction, a well-described methods section and an appropriate number of figures and tables. The results are clearly described.

      1) The authors identify and acknowledge the most important limitation of the study (lines 365-366): the patients were treated with different regimens in distinct health services. This reviewer agrees this is a limitation. However, to get a feeling for the impact of these differences, the authors should indicate how the patients were treated and whether there were differences between patients that died and those that survived. Without this information clearly presented, I cannot interpret the correlations between virulence factors and outcome found in this study. Perhaps the authors can show how many of the patients included in the phenotype-survival analysis, among those that died and those that survived, were treated according to Brazilian guidelines.

      2) The melanin production evaluation assay is an important tool that the authors use in this study, and the measurements from these assays were correlated with G. mellonella and patient survival and thus are essential to the conclusions of the study. The method is well standardized, and the authors show elegantly that the outcomes are highly reproducible. Can the authors describe when melanization occurs: does it occur in mature colonies, and might growth rate itself influence the measurements? Do isolates with a high growth rate/colony maturation have a low T-HMM or high melanization Top? Do the final colonies of different species have a different final cell number after 7 days of incubation, and how does this correlate with melanization? And how does the growth rate/budding rate/colony maturation correlate with G. mellonella survival?

      3) Figures 1-5 give a clear picture of the wide distribution and variation of virulence parameters, e.g. the distribution of melanization kinetics parameters, the distribution of capsule sizes, GXM secretion and LC3 phagocytosis. But what does this distribution mean? It only shows that the isolates are not the same but does not contribute majorly to the final conclusion. Can the authors think of a way to give more meaning to these figures, e.g. indicate with colors which isolates were retrieved from patients that eventually died and which survived (although this may be inappropriate as not all clinical information is available)? Figure 6 really gives meaning to the numbers displayed in figures 1-5. Perhaps move some figures to the supplementary file.

    2. Reviewer #1:

      The manuscript describes the characterization of a total of 85 Cryptococcus spp. clinical isolates with regard to virulence phenotypes including a Galleria mellonella infection model for cryptococcosis. The authors determined the melanization kinetics of all strains, measured the whole-cell and extracellular laccase activity, the capsule thickness, and the concentration of the cell wall polysaccharide glucuronoxylomannan. In addition, during macrophage interaction the proportion of Cryptococcus-containing LC3-positive phagosomes for each strain was determined as well as the survival of G. mellonella after infection with selected Cryptococcus strains. Finally, regression analyses were performed to estimate the relationship between the risk of death in cryptococcosis patients and the phenotypes of the isolated Cryptococcus strains. A major finding was that the risk of death in patients with disseminated cryptococcosis increased with the level of extracellular laccase activity and the time for half-maximum melanization in the Cryptococcus isolates. This suggests that the melanization rate, more than the total amount of melanin, impacts the outcome of a Cryptococcus infection.

      General assessment:

      The study is based on carefully performed experiments. However, the scientific significance of this work is moderate. Melanin and the laccases that are involved in its synthesis have been known as virulence factors of Cryptococcus spp. for many years, and similar studies have already been published elsewhere (e.g. Samarasinghe et al. 2018). The major new finding of the presented work is that the speed of melanization has an impact on the virulence of Cryptococcus spp. rather than the total amount of melanin. The shortcoming of the manuscript is that the authors' hypothesis is mainly based on regression analyses, but the final proof based on a genetically well-defined background is missing. Therefore, the study provides only little new insight into fundamental mechanisms of Cryptococcus virulence but includes associations with patients and therefore might be more suited for a journal specialized in pathogenic fungi.

      The following points should be considered:

      1) The authors show the association between faster Cryptococcus melanization and more effective evasion from host immunity. However, the authors cannot totally exclude other factors that are associated with host evasion. It would be more appropriate either to create a mutant (e.g. overexpression of LAC1) that shows faster melanization in comparison to a wildtype strain, or to perform multilocus sequence typing (including the LAC1 locus) to capture the genetic variation of the clinical isolates and to find some correlations with the speed of melanization. The interesting question is which genetic factors contribute to the difference in the melanization rate.

      2) The authors should critically discuss the suitability of their Galleria mellonella infection model. It is a known fact that temperature has an influence on the melanization in Cryptococcus spp. Laccase activity is significantly inhibited at temperatures of 37°C and higher. The Galleria model can only be used at lower temperatures.

    1. Reviewer #2:

      The topic of this manuscript is the basis of continuous and episodic bursting electrical activity in developing spinal cords. The approach used is to employ a simple mathematical model as a representation of the central pattern generator underlying the bursting pattern, and examine how the properties of bursting change with variation in three key system parameters. Some of the model predictions are tested in an actual in vitro spinal cord preparation. Although I enjoyed reading the manuscript, I have some serious concerns about the model that is employed, which I discuss below.

      Major concerns:

      1) The model is a half-center oscillator (HCO) in which one cell inhibits the other, resulting in anti-phasic electrical activity of the two cells. (Each "cell" actually represents a cell population, so the model is a mean field model.) This is certainly one way to get electrical bursts. However, it is not at all clear that such a HCO structure exists in the developing spinal cord, or that there are neural populations with this anti-phasic activity. If such data exists, it is not mentioned in the paper or cited. Indeed, the recordings in Supp. Fig. 1 show extracellular neurogram recordings from ventral roots in different lumbar segments and in which the bursting appears to be synchronous. So I see no evidence that the HCO model reflects the actual neural circuit, other than the fact that it can produce bursting and episodic bursting. This does not mean that such a phenomenological model is without value, but it should be made clear to the reader that that is what the model is. Also, the next two points below do appear to cast doubt on the utility of this model.

      2) In Fig. 3 it is shown that the inter-episode interval (IEI) is increased in the model when the conductance g_h is reduced. Because of this, the episode period (EP) also increases. The data, also in Fig. 3, show the opposite. They show that blocking the h-type current decreases the EP. This seems like a flaw in the model, since it is the h-type current that is responsible for episode production (at least I think it is, see point 4 below). The discrepancy is mentioned in the manuscript, but only briefly and it should be fully addressed.

      3) In Fig. 5 it is shown that, in the model, there is a very small interval of g_NaP where episodic bursting is produced. Otherwise, the model produces continuous bursting (for larger g_NaP values) or silent cells (for smaller g_NaP values). However, the data that is also shown in the figure indicates that blocking the NaP channels has little effect on episodic bursting. This is another serious discrepancy between the model and the experimental data.

      Points for clarification:

      4) It appears from Fig. 1 that episodes stop when h-type current activation slowly moves to an insufficient level to kick off a new burst. Logically, a new episode would start once that activation grows back to a sufficiently large value. Is this right? The mechanism for episode production is never discussed, and it should be.

      5) The model is deterministic, yet there is variation in burst duration and episode duration (see Fig. 3). What is the source of the variation? Does this mean that the episodes are not periodic?

      6) The model has a multistable region in parameter space, and much is made of this in the Results and the Discussion. In Fig. 6, it was demonstrated that hyperpolarizing pulses could switch the system from one behavior to another. Can this be done experimentally in the in vitro prep? If so, was it tried?

      Other:

      7) Discussion is too long and touches on things that were far from the focus of the manuscript. For example, there is about a page and a half of text discussing short term motor memory (STMM) although the Results section did not focus at all on homeostatic functions of the circuit or STMM. Furthermore, some points were made several times during the Discussion, where one time would have been sufficient.

      8) Almost two pages of the Discussion were dedicated to multistable zones, yet in the model the multistable zone was tiny, and there was no evidence that the experimental prep lies in or near that zone. The authors state that in actual neural circuitry there could be a much larger multistable zone, which is true, but there also may be none at all. This discussion appears irrelevant.

    2. Reviewer #1:

      The present paper addresses the very topical problem of understanding of dynamic switching in central pattern generators. The paper investigates switching between bursting and spiking modes in spinal cord neurons. This is modelled using a multichannel HCO that identifies narrow regions in parameters where the system is bistable. It is argued that neurotransmitters drive invertebrate CPGs to favourable bistable regimes that allow rapid switching from one oscillatory state to another (e.g. foraging to escape) to be enacted by fast electrical stimuli. The paper is generally well-written and does a good job at interpreting observations.

      I have two major comments:

      1) The authors seem to ignore the switching between phasic and antiphasic oscillatory states, even though this is shown in Fig.1, and more generally between the polyrhythms that would occur in larger inhibitory networks. The latter switching may be at least as relevant to gait generation as the switching from bursting to spiking. Polyrhythms have also been shown experimentally and theoretically to produce robust multistable states that overlap over a wide parameter space. It would therefore be useful if the authors could comment on the relative robustness of spiking/bursting multistability vs polyrhythm multistability.

      2) It is argued that a hyperpolarizing Ip pulse will induce a transition from continuous spiking to bursting and, conversely, that a depolarizing pulse induces the reverse transition from bursting to continuous spiking. Transitions are a dynamic process which will depend, among other things, on the timing when the pulse is applied during the heteroclinic cycle. In the absence of more information on the dynamics of the system, such claims look over-simplistic.

    1. Reviewer #3:

      General Assessment:

      The manuscript is well written and the methods are sound. The strengths of this manuscript are that this study is the first to systematically perform detailed electrophysiological measurements on inhibitory interneurons (INTs), in particular RC and non-RC INTs using the SOD1 mouse model for ALS. It is very interesting that they showed a dichotomy between reduced excitability in RC neurons (which could lead to an indirect increase in overall excitability of MNs) and non-RC INTs, which actually showed an increase in excitability which would have the opposite effect on MNs.

      Main comments:

      1) Most electrophysiological studies have focused on motor neurons and showed that they become hyperexcitable at very young ages, although there is controversy as to whether the hyperexcitability persists and is causative or compensatory to disease progression.

      2) The dichotomy observed between RC and non-RC inhibitory neurons is interesting. Given that many of the glycinergic non-RC interneurons are Ia-inhibitory interneurons responsible for reciprocal inhibition, their effects on the target motor neurons have opposite effects on MN excitability. At this point it is mere speculation as to how these changes actually exacerbate the progression of the disease and affect circuit function.

      3) This paper is mainly descriptive with no specific hypothesis other than what has been discussed often in the literature: motor neuron hyperexcitability occurs from intrinsic alterations in MN ion channels, increased excitatory synaptic activity, or a decrease in inhibitory activity, or all of the above. Although the authors are most likely the first to demonstrate changes in inhibitory interneuron excitability with direct electrophysiological recordings, it is unlikely that these findings will significantly move the field forward presently. The authors suggest that biomarkers could be developed, but this is just a broad statement without a concrete proposal for implementation. It would be useful to show a specific target that could be modified pharmacologically in animals over time to see if this changes the progression/survivability of the ALS animals.

      4) Furthermore, the functional significance of early hyperexcitability as either a cause or compensation of ALS is controversial at present. Numerous studies have addressed hyperexcitability, yet we are still far from understanding the bases for this disease, and one cannot help but question whether this avenue of investigation is fruitful.

      5) Does this change in interneuron excitability and the dichotomy between RC and non-RC demonstrated persist over the course of the disease? How relevant are these changes to disease progression?

      6) It will be necessary to use other available animal models for comparison since SOD1, although historically a well-studied mouse model, is an ectopic overexpresser, and is not the predominant mechanism for ALS in humans. There are other, probably more pertinent models, e.g. C9ORF72. Whether such changes in inhibitory interneurons occur in those other models and in humans remains to be determined.

    2. Reviewer #2:

      Amyotrophic lateral sclerosis (ALS) used to be considered primarily a disease of motoneurons. Recent work using mouse models of ALS has revealed that pathological changes can also be detected in spinal interneurons, particularly inhibitory interneurons, and that some of these changes can be detected before birth. The present paper is the first to directly examine the electrical properties of spinal inhibitory interneurons in a mouse model of ALS and show that some of these are altered in the neonatal period well before the mice start to exhibit symptoms. The authors show that SOD1 Lamina IX neurons are smaller than the Lamina IX WT neurons whereas no differences were found between WT and SOD1 neurons outside Lamina IX. They also use whole cell recordings to reveal that putative 'Renshaw cells' are less excitable in SOD1 mice than wild type animals whereas non-Renshaw inhibitory SOD1 neurons are more excitable.

      Major Comments

      1) The authors claim that Renshaw cells are in lamina IX, when they have been shown to be located mostly in the ventral part of lamina VII, ventromedial to the motor nucleus (Alvarez and Fyffe, 2007). In addition, not all calbindin+ neurons in lamina VII are Renshaw cells. From the location of the whole cell recordings shown in fig.2, it seems likely that most of the recorded neurons are not Renshaw cells because they are outside the classical 'Renshaw' area. It is not clear why the authors are focusing on glycinergic neurons in lamina IX, as there is no evidence that they belong to a unique class or that they are presynaptic to motoneurons.

      2) The concern about the identity of the Renshaw cells obviously undermines the statistical modeling to segregate Renshaw versus non-Renshaw cells. Furthermore, it was not clear from the text whether the model used both WT and SOD1 calbindin-positive neurons to define 'Renshaw cells'. Assuming it did, and given that there were changes in the electrical and morphological properties of the calbindin+ SOD1 neurons, is it not surprising that they could be grouped with the WT 'Renshaw cells'?

      In addition, the characteristics of the 'Renshaw cell' population used for the model are not clear. On line 186 it states that 15/23 of the whole cell recorded interneurons were positive for calbindin. Does this refer to 15 WT and 23 SOD1 neurons? Thus 38 neurons were calbindin positive. Of the remaining 21 neurons how many were calbindin-negative and how many were not tested? How many of the 38 calbindin-positive neurons had their dendrites reconstructed sufficiently from the intracellular fill to be used in the model? The model predicted that 80% of the 59 patched interneurons were Renshaw cells. How many of these were in the calbindin-negative group and how many were in the not-tested group? The spatial distribution of these groups should also be plotted. However, it seems very unlikely that 80% of the recorded cells are Renshaw based on their location as shown in fig.2B.

      Second, it would have been useful to apply the model to known non-Renshaw cells, to establish that it was not generating too many false positives. Another way the authors could test the model is to establish if it could distinguish WT and SOD1 neurons based on their morphology.

      3) The authors suggest that the reduced excitability of 'Renshaw cells' might contribute to the excitability changes seen in motoneurons. However, based on their own data, this is not a straightforward conclusion. They find that 'non-Renshaw cells' are hyperexcitable and since this population would include 1a inhibitory interneurons and other premotor inhibitory interneurons, it is not clear what the overall effect on motoneuron excitability would be. Additionally, because the authors suggest that 'Renshaw cells' are less excitable this would presumably lead to reduced inhibition of 1a inhibitory interneurons counteracting a potential loss of inhibition onto motoneurons from Renshaw cells.

    3. Reviewer #1:

      In this study, the authors investigated whether the morphological and electrophysiological properties of glycinergic interneurons in the spinal ventral horn of GlyT2eGFP SOD1 G93A mice are altered compared with GlyT2eGFP WT mice at P6-P10 (the SOD1 G93A mouse is the classic mouse model of amyotrophic lateral sclerosis). Such an investigation has never previously been done. The main body of results relies on a sample of 34 WT and 25 SOD1 patched interneurons located throughout the ventral horn. The authors found that soma sizes of patched interneurons are not significantly different in SOD1 animals than in WT animals but their dendrites are larger. The onset and the peak of persistent inward currents (PICs) are more depolarized in SOD1 interneurons, suggesting that they are less excitable than in WT. Immunohistochemistry for Calbindin was performed in a subset of the patched interneurons to identify Renshaw cells (7 cells in WT animals and 6 cells in SOD1 animals). Calbindin-positive cells display more depolarized PIC onset and peak in SOD1 than in WT animals. A predictive statistical analysis was then performed in order to include in the Renshaw cell sample cells that were not tested for calbindin. This analysis suggested that the predicted Renshaw cells are less excitable in SOD1 mice than in WT mice whereas the predicted non-Renshaw cells are more excitable. The implications of these findings for the ALS pathophysiology are discussed.

      However, a number of major concerns substantially weaken the findings:

      1) Morphological properties. Texas red allowed the authors to localize the patched cells in the ventral horn, to measure the soma and the dendrites and to investigate whether the patched cells were immunopositive to Calbindin. It appears that the soma volumes of the patched neurons are on average 2-3 times larger than the soma of the general population of GlyT2-GFP neurons in the ventral horn or in lamina IX (Table 1). No explanation is provided for this discrepancy. Does it mean that there is a systematic recording bias towards the largest interneurons? Alternatively, is there a systematic swelling of the patched cells or a shrinking in the fixed spinal sections? Also, it is not clear what the dendritic parameters are. It is necessary in Figure 2 to show a reconstruction of dendrites in order to figure out which dendritic length, surface and volume are reported in Table 1.

      2) Electrophysiological properties. The shift in the onset of the persistent inward currents is taken as an important indicator of a reduced excitability in SOD1 interneurons. However, the measurement of the PIC onset is problematic. It is claimed in the Material and Methods section that "PIC onset was defined as the voltage at which the current began to deviate from the horizontal, leak substracted trace" (lines 374-375), which seems reasonable. However, in Figure 3A, the arrow for the PIC in the SOD1 motor neuron (red trace) does not point to the initial deviation from the horizontal which actually occurred at about -60mV, i.e. close to the PIC onset for the WT motoneuron (blue trace), in contradiction with the authors' claim. The arrow points to a second component whose onset appears at a more depolarized voltage. The net current is therefore likely to be complex, and a pharmacological dissection of the currents at work is required both in WT and SOD1 neurons. Indeed, the net inward current might result from the summation of inward and outward currents. Are there outward currents at work? Are the inward currents Na+ or Ca++ currents? In the absence of such a pharmacological "dissection" it is difficult to fully interpret the data.

      3) Identification of the Renshaw cells. The authors identified a subset of GlyT2 neurons as Renshaw cells because they expressed Calbindin-D-28K. This sole criterion does not allow a proper identification of Renshaw cells, particularly in P6-P10 mice. Indeed, many non-Renshaw cells in the ventral horn are calbindin-immunopositive during this post-natal maturation period in addition to the Renshaw cells (Siembab et al, J Comp Neurol, 2010). One distinguishing feature of Renshaw cells is that they are excited by recurrent motor axon collaterals. The presence of VAChT boutons on the GlyT2 cells would therefore have been an interesting additional identification criterion. However, there is another source of VAChT boutons than motor axon terminals in the spinal cord (Zagoraiou et al, Neuron 2009). Since this is an electrophysiological work, the authors had the possibility to unambiguously identify Renshaw cells: the presence of synaptic excitations in response to the stimulation of motor axons in a ventral rootlet (using oblique spinal cord slices, see for instance: Lamotte d'Incamps and Ascher, J Neurosci 2008; Bhumbra et al, J Neurosci 2014). The authors are advised to perform such an electrophysiological identification of Renshaw cells.

      4) Statistics and predictive model. The number of patched cells identified as "Renshaw cells" on the basis of their Calbindin immunopositivity is low (7 WT and 6 SOD1). I do not see any reason why the authors did not repeat the experiments in order to gather a more reasonable number of cells. Statistical analysis was performed on these small cell samples in order to investigate whether each property under investigation differs between WT and SOD1 animals, as reported in Table 3 (normality of the distribution was tested for each property and either ANOVA or Kruskal-Wallis analysis was performed). The validity of statistics on such small cell samples is questionable. The analysis was then extended to all patched cells using sophisticated random forest and principal components analysis in order to check whether some cells among those not tested for calbindin display enough similarities with the calbindin-positive cells to be considered as putative Renshaw cells. The model predicted that 80% of the 59 patched cells were "Renshaw cells", a percentage astonishingly larger than the percentage of calbindin-positive cells in the ventral horn (65%). This prediction is doubtful since the number of calbindin-positive cells is already higher at P6-P10 than the number of Renshaw cells (see bullet point 3). Nevertheless, the authors made statistics (not shown in the paper) on the basis of this prediction, and they found that the predicted Renshaw cells are less excitable in SOD1 mice than in WT mice whereas the predicted non-Renshaw cells are more excitable.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 14 2020, follows.

      Summary

      This study addresses an important topic of broad ecological interest and provides important insights into the role of local-scale processes in shaping patterns of species diversity, aiming to (i) assess if there is a global latitudinal diversity gradient (using alpha diversity) of rocky shore organisms and its functional groups and, (ii) whether there are any large scale or local environmental predictors of richness patterns. The strength of this paper is the global coverage of studies analyzed, showing for the first time that rocky shore richness does not appear to peak in the tropics - in contrast to many other studies of marine and terrestrial systems. These outcomes are not specific for rocky intertidal systems, with an increasing number of studies showing that the search for global ecological patterns may be elusive. While sampling in the tropics and the polar regions is poor (acknowledged by the authors), this should be viewed as a call for further research in these regions - not as a weakness of the paper per se. There are also some reservations on how the analysis has been conducted, including the lack of standardization of sampling effort and other details (e.g., size of sampling units) to derive a comparable measure of diversity across sites.

      Public Review

      The latitudinal gradient of diversity has been studied and confirmed in many aquatic and terrestrial habitats and species across the globe. In the vast majority of cases, richness increases towards the tropics. Using an impressive global dataset of latitudinal diversity gradients in 433 rocky intertidal assemblages of algae and invertebrates from the Arctic to the Antarctic, Thyrring and Peck show that rocky shore ecosystems may not follow this general pattern. The authors show that there is no clear latitudinal gradient for rocky shore organisms using alpha diversity - as posited by prevailing theories - although some functional groups exhibit contrasting patterns. Diversity within functional groups of predators, grazers and filter-feeders decreased towards the poles, whereas the opposite was observed for macroalgae. Correlation with environmental drivers highlighted the importance of local-scale processes in driving spatial patterns of diversity in rocky intertidal assemblages. The paper is well written and many of the analyses are well done, but there is the concern, which the authors acknowledge, that sampling within tropical latitudes is sparse and needs to be carefully considered when interpreting the results of this paper.

      The work can be improved in the following manner:

      1) The relevant data to standardize species richness may not be available from the primary literature. However, it should be possible to employ relevant standardization methods within the 5° latitudinal bands in which the data have been aggregated. An analysis based on standardized data, at least for the more data-rich latitudinal bands, must be added.

      2) Employ models that allow assessing unimodality, which is stated but untested. At the bare minimum, a quadratic relationship with latitude should be included in the GLMM. As implemented here, the GLMM employed to relate diversity to latitude can only detect linear trends, but not unimodal patterns and the mid-latitude peak suggested by LOESS for the northern hemisphere. To provide a formal test for unimodality, models with or without a quadratic term could be contrasted using standard model comparison procedures. Alternatively, GAM could be used to evaluate nonlinear effects.

      3) Clarify whether p-values are relevant or not. As is, it is confusing. For example, the legend of Table 1 mentions p-values, but these are not reported. Materials and Methods indicate that 95% confidence intervals are used to take decisions on null hypotheses, suggesting that p-values are not used in the analysis (lines 436-439). Nevertheless, p-values are reported in Table 2.

      4) Provide a rationale for distinguishing between canopy and other algal forms (the distinction is compelling, but it is not explained).

      5) We like the conclusion on the importance of local-scale processes. This should be placed in the context of previous studies that have quantified patterns and processes at multiple scales reaching the same conclusion.
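
      The model comparison requested in point 2 can be illustrated with a minimal sketch. It is not the authors' analysis: it uses ordinary least squares rather than a GLMM (no random effects, no count error distribution), the latitude and richness values are invented for illustration, and the function names are hypothetical. It only shows how a model with a quadratic latitude term can be contrasted with a linear-only model using AIC.

```python
import math

def fit_polynomial(x, y, degree):
    """Ordinary least squares fit of y on powers of x (0..degree) via the normal equations."""
    n_terms = degree + 1
    a = [[sum(xi ** (i + j) for xi in x) for j in range(n_terms)] for i in range(n_terms)]
    c = [sum((xi ** i) * yi for xi, yi in zip(x, y)) for i in range(n_terms)]
    # Gaussian elimination with partial pivoting.
    for col in range(n_terms):
        pivot = max(range(col, n_terms), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for row in range(col + 1, n_terms):
            factor = a[row][col] / a[col][col]
            for k in range(col, n_terms):
                a[row][k] -= factor * a[col][k]
            c[row] -= factor * c[col]
    # Back substitution.
    coef = [0.0] * n_terms
    for i in reversed(range(n_terms)):
        rest = sum(a[i][j] * coef[j] for j in range(i + 1, n_terms))
        coef[i] = (c[i] - rest) / a[i][i]
    return coef

def aic(x, y, coef):
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2k."""
    n = len(x)
    rss = sum((yi - sum(b * xi ** p for p, b in enumerate(coef))) ** 2
              for xi, yi in zip(x, y))
    k = len(coef) + 1  # regression coefficients plus the residual variance
    return n * math.log(rss / n) + 2 * k

# Invented example data: absolute latitude (degrees) and richness with a mid-latitude peak.
latitude = [0, 10, 20, 30, 40, 50, 60, 70, 80]
richness = [11, 16, 23, 24, 27, 24, 23, 16, 11]

linear_coef = fit_polynomial(latitude, richness, 1)
quadratic_coef = fit_polynomial(latitude, richness, 2)
aic_linear = aic(latitude, richness, linear_coef)
aic_quadratic = aic(latitude, richness, quadratic_coef)

print("AIC linear:", round(aic_linear, 2))
print("AIC quadratic:", round(aic_quadratic, 2))
print("quadratic coefficient:", quadratic_coef[2])
```

      A lower AIC for the quadratic model together with a negative quadratic coefficient would be consistent with a unimodal pattern; in a real analysis the same contrast would be made between mixed models fitted to the actual data, or with a GAM as the letter suggests.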

    1. Reviewer #4:

      General assessment of the work

      Gene drives can be used for sustainable control of disease vectors, and there is a need for different gene drive strategies that can be tailored to the particular species, timescale, and desired spatial spread. Kandul and colleagues present a welcome new addition to the growing number of strategies for gene drive, called HomeR, that combines elements of killer-rescue and homing-based drive to exert spatiotemporal control over its spread, whilst counteracting the rise of resistant mutations. Whilst it is extremely promising, some major claims of this manuscript are inaccurate or unsupported by the evidence. The authors could easily address the most important concerns by expanding their sequencing analysis to better detect and quantify resistant mutations, paying careful attention not to overstress the potential of this drive to mitigate resistance, and by comparing the relative strengths of different drive strategies instead of focussing only on features that are most flattering to the HomeR strategy.

      Numbered summary of any substantive concerns

      1) The drive release strategy of Fig 4A + 4C is primed to underestimate and potentially mask resistance. In Fig 4A, where the authors search for signs of resistance, the population was seeded with males that were all homozygous for the drive, meaning that 100% of their G0 progeny will inherit it. As the rate of homing is close to 99%, only a small fraction of their G1 could have inherited a non-drive (potentially resistant) allele. In a realistic release scenario, resistant alleles will have ample opportunity to be generated and subsequently selected. Though still far from adequate, resistance testing would have been better performed on samples collected from the lower frequency releases in panel C. This experiment should not be used to draw strong conclusions about resistance to pHomeR, but should be used to make broader observations regarding the spread and stability of the construct.

      2) The strategy for sampling resistance will obscure almost all resistance in the population, and would fail to detect even a strong selection for it. Flies were only selected for resistance genotyping if they lacked GFP, meaning they carry two non-HomeR alleles (i.e. homozygous for the R1 allele or transheterozygous with another R1/R2/WT). One would expect most resistant alleles to be heterozygous in a population that was seeded with almost complete drive homozygosity. The authors could, and should, have done more to identify and quantify these. Amplicon sequencing was used to sample the full diversity of alleles in a larger pool of individuals (including GFP+ flies) collected at G10; why was this approach not used throughout? By adopting the approach earlier they would have been able to track the changing frequencies of R1 and R2 alleles over time.

      3) The impression given in the figure and main text is that R1 alleles were rare (or entirely absent), when they were not. In spite of the incredible advantage given to the drive, and a bias in sampling method that would mask the presence of resistant alleles, resistance was observed in every generation tested (G2, G3 and G10). The authors claim that because GFP-negative individuals were not observed in later generations, the resistant alleles had not come under positive selection. This logic is flawed, and indeed their own amplicon sequencing analysis performed on G10 flies revealed several resistant alleles, including an R1 present in 80% of non-drive alleles. The two most frequent mutant alleles detected were in frame, and I do not agree that these are likely to be deleterious recessive (as the authors speculated). These could be functionally resistant mutations. I believe there were many more R1 alleles in heterozygosity with the HomeR allele; these alleles could have been spreading, but were excluded from the genotyping analysis. Could these putative R1 individuals not have been specifically tested to see whether or not they confer resistance?

      4) The modelling takes a very limited approach to comparing different drive strategies, and by comparing proof-of-principle designs, important differences are obscured. For example, simple modifications that would mitigate resistance are likely to be included in many designs - such as multiplexing gRNAs. The nuances of each design are lost in a discussion focused on the rate of spread, which is largely irrelevant now because all of the drives are predicted to spread well.

      5) The authors did not discuss the relevance of having performed releases in a population that was already homozygous for Cas9. Do the release experiments and model really suggest the drive could spread if released into an otherwise WT population? I'm not sure the data presented in this manuscript can support that claim.

    2. Reviewer #3:

      The authors are to be commended for the effort put into careful experimental design and clear presentation of methods and results.

      My main concern with the manuscript is that the claim about their specific polymerase gene being "ultraconserved" is not backed up with their own data or by citations from the literature. If the gene sequence was ultra-conserved, I wouldn't have expected the authors to be able to do so much recoding of the gene without fitness consequences. Furthermore, it is clear that homozygous-viable NHEJ mutations did develop in the experiment. Without explanation, this seems to be a fatal flaw in the design.

      This manuscript describes a modification of the general homing gene drive concept by use of a split drive system that increases the frequency of a recoded polymerase gene that replaces a cleavage susceptible, naturally occurring, haplosufficient, conserved polymerase gene. This approach is taken in order to limit the evolution of cleavage resistance in the naturally occurring gene.

      I am not convinced that the research presented achieves the intended goals. I did a quick look for literature on the "ultraconserved" polymerase pol-y35 gene and could find none. I am not sure if the conservation is at the DNA sequence level or at the amino acid level. If at the amino acid level, then it makes sense that resistance alleles can form at the DNA level that don't impact the protein at all. Figure 2a shows the 22 and 27 recoded nucleotides for the two guide RNA sites. The authors say that these changes to the sequences didn't seem to impede fitness. Did the authors try many other recodings and finally decide on these because all others caused loss of fitness, or is it just that this gene is robust to substitutions even though the protein is conserved?

      Figure 4C shows that the frequency of flies with at least one copy of the pol-y35home R1 increased from about 25% to about 50% between the parental and F0 generation when there was no Cas9 present. As long as the transgenic males were competitive with the wild flies this makes sense, because the released flies were homozygous for that allele and their offspring should all have inherited one copy of the gene. What doesn't make sense is that when the work was done with all flies harboring Cas9, the frequency of flies with the pol-y35home R1 increased less from the parental generation to F0 than in the former case. In some replicates the frequency of such flies didn't increase at all. It should be noted that the parents were always homozygous. This certainly indicates a fitness cost to the flies with a combination of Cas9 and the homing construct.

      In this same figure, results from the model are plotted. It seems like the model assumes no fitness cost because it shows an exact increase from 25% to 50% flies carrying at least one copy of the pol-y35home R1 theoretical construct. In later generations the experimental results outperform the model. Presumably, this model is used to construct figure 6. This mismatch needs to be addressed in the manuscript.

      The fact that in all three replicates of the experiment without Cas9, the F0 is above 50% indicates that something else may be going on that is unrelated to gene drive. It could be due to heterosis between the two slightly different strains of flies. When wildtype males mate with wildtype females, the offspring are more inbred than when a transgenic male mates with a wildtype female. Just a hypothesis.
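
      The arithmetic behind the 25% to 50% expectation can be made explicit with a small sketch. It rests on assumptions that are not stated in the review: a 1:1 sex ratio, all females wild type, half of the males homozygous for the transgene (hence about 25% carriers in the parental population), no Cas9 and therefore no homing, and a single hypothetical parameter for the relative mating success of transgenic males. The function names are illustrative only.

```python
def expected_carrier_fraction(released_male_fraction, relative_mating_success=1.0):
    """Expected fraction of offspring carrying at least one transgene copy.

    Assumes all females are wild type, released males are homozygous for the
    transgene, and inheritance is Mendelian (no Cas9, no homing), so every
    offspring of a transgenic father is a heterozygous carrier.
    """
    transgenic_share = released_male_fraction * relative_mating_success
    wild_type_share = 1.0 - released_male_fraction
    return transgenic_share / (transgenic_share + wild_type_share)

def implied_mating_success(released_male_fraction, observed_carrier_fraction):
    """Relative mating success of transgenic males that would yield the observed fraction."""
    observed_odds = observed_carrier_fraction / (1.0 - observed_carrier_fraction)
    released_odds = released_male_fraction / (1.0 - released_male_fraction)
    return observed_odds / released_odds

# Half of the males are transgenic and equally competitive: 50% carriers expected.
print(expected_carrier_fraction(0.5))

# A hypothetical observed carrier fraction of 0.75 would imply a threefold mating advantage.
print(implied_mating_success(0.5, 0.75))
```

      Under these assumptions, any F0 carrier frequency consistently above 50% in the Cas9-free controls would correspond to a relative mating success greater than 1, or to some other advantage of hybrid offspring such as the heterosis hypothesized above.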

    3. Reviewer #2:

      Kandul et al. present an interesting study that could lead to important improvements on the use of homing-based gene drives. However, there are a number of things that should be addressed to improve the manuscript for better comprehension by readers.

      Overall the manuscript presents a load of data. But these data could be presented in a more digestible way. The authors should go over their manuscript with a reader in mind who is interested but does not necessarily know all the relevant literature in detail.

      Abstract (line 18): Please remove "inherently confinable" from the abstract. The drive is indeed designed as a split drive; however, all the experiments were done in a homozygous Cas9 background. Therefore, there are no experimental data for a split drive provided in this manuscript. The split design seems to be here more for the practical reason of allowing the experiments to be done in a less stringent laboratory environment. Thus there are no experimental data that would support the confinable nature of this drive. Actually there are not even modelling data for this. Thus, such a statement should not be put in the abstract. This manuscript is not a demonstration of a confinable drive.

      Results (line 124): How was Pol-gamma35 identified? It would be interesting for the reader to learn the exact reasoning why this gene was chosen. Or were several genes chosen at first, and this one turned out to work best or was the easiest to design? These could be very interesting considerations that are important to the field.

      Results (lines 147-148 Fig. 1B; lines 155-156 Fig. 1C) and Methods (lines 698 and 706) and Figure 1 (both Figure and legend): The addressing of the Figure panels and the writing to it don't fit! Has there been a rearrangement of the Figure that was not worked through the text? When referring to "B" in the text, it is still about Act5C-Cas9 and the nos-Cas9 data are in the text referred to Fig1C! But Fig1C is BLM! In current panel Fig1B, what does "all" mean below the X-axis? This is not comprehensible. Panel C is not really described in the Figure legend!

      Results (line 253), Discussion (lines 526-527), and supplementary Figure 1 (line 1101). "converting recessive non-functional resistant alleles into dominant deleterious /lethal mutations" is completely misleading! There is no "conversion" and how should that be done molecularly. There is a continuous removal of such alleles from the population because of lethal transheterozygous conditions caused in the drive. However, there is no active conversion of such alleles into dominant lethal ones. This needs to be clearly rewritten to avoid the misleading idea. Supplementary Figure 1 also seems to have a slight conceptual problem. What are "cells" (rectangles) with a red frame and a green core? Green means at least one wt allele (this must include the recoded rescue allele!). Red means biallelic knock-out: thus a red cell cannot have a wt allele. Thus what is a red-framed green core cell? To explain the removal of R2 alleles, a depiction of yellow framed red core cells in the germ line would be helpful, since this would explain how R2 alleles are selected against and might be continuously removed from the population!

      Results (from line 424 to end of results): Before going into the modelling, the reader should be clearly informed about all the different approaches that are now to be compared. This is currently not done well, if at all! Thus moving current Fig 6 before current Fig. 5 might clearly help! Also a better explanation of the panels in Figure 6 is necessary as well as a correction of Fig6 Panel E! A comparison of a great number of the currently approached toxin-antidote (gene destruction - rescue, but not killer-rescue!) systems is greatly appreciated. However, the authors cannot expect the general reader to know about the small detailed differences between the systems that are compared here. Thus the authors need to provide some explanations and categorization of the different approaches here and also cite all the respective literature.

      -First subdivision: Non-homing (interference-based drives) VERSUS Homing (thus overreplication-based drives). This will also help them to better understand why the interference-based drives (TARE and ClvR) are more sensitive to fitness parameters than overreplication drives.

      -Second subdivision: same-site VERSUS distant site. This is important to understand the difference between the here modelled TARE and the ClvR. Actually ClvR is a TARE, but you use TARE here more specifically as the results in the respective paper are demonstrating only a same-site TARE! But this needs to be clearly stated here!

      -Third subdivision: viable VERSUS haplosufficient VERSUS haploinsufficient. This also needs to be clearly depicted in labelling panels C to F of Figure 6, whose essential differences are currently hard to grasp before looking at the panels in detail:
      C: HGD of viable gene (HGD)
      D: HGD of viable gene with rescue (HGD+R)
      E: HGD of haploinsufficient gene with rescue (HGD-hi+R). THIS PANEL NEEDS MAJOR CORRECTION!!
      F: HGD of haplosufficient (essential) gene with rescue (HomeR)

      -Fourth subdivision: split VERSUS non-split. Here for the split HGD situation, the respective papers of which the current authors are co-authors should be cited: Kandul et al. 2020 (actually published end of 2019 and still cited as biorxiv Archives article 2019a!?) and Li et al. 2020, Elife. In addition, it is also important to state clearly that "split or two locus" is completely independent of the "distant site" concept! The reader needs to understand the differences of the systems that are compared here, without having to go to the respective publications themselves and then try to find out what the differences really are. This is not so obvious and the current authors have a clear chance here to do that and help the reader in the midst of all these similar but still distinct approaches.

      Figure 6 Panel E: This depiction is not consistent within itself, not consistent with the legend, and not consistent with the cited literature!

      -Why should the rescuing drive construct over the wt allele be lethal as indicated in the right two boxes?

      -The cited paper Champer et al. 2020b, which is by now also published in PNAS! clearly states that there is maternal carry over, which actually makes it so hard to use and is probably only working via male propagation. In the Figure legend, it is said that "maternal carryover and somatic expression ... are empirically unavoidable", which is in contrast with the depiction! The legend then also states that this is "unachievable". This should be better replaced by "hard to achieve", since the approach is published and seems to drive, even though probably just via the males! Thus the depiction of panel E needs to be thoroughly revised.

      Discussion (lines 499-500): The haplolethal HGD works (admittedly poorly) despite the maternal carryover (Champer et al 2020b). Therefore, your statement needs to be refined or deleted: "requires germline-specific promoter that lacks maternal carryover" is not consistent with the published paper! The drive could go via the males because then you do not have maternal carry over! And homing based drives can go via males and do not necessarily have to be promoted through females, see also KaramiNejadRanjbar et al. 2018.

      Discussion (lines 540 to 543). This sentence is based on an old but clearly overruled idea! NHEJ repair is not restricted to a time before the fusion of the paternal and maternal genetic material. It has been clearly demonstrated that R1 and R2 alleles are also generated in the early embryo after the zygote stage (Champer et al 2017, KaramiNejadRanjbar et al. 2018). Actually, all of the authors' Figure 1C and Supplementary Figure 1 are about NHEJ mutation in the early embryo causing "BLM". Thus this sentence is inconsistent with current beliefs and also with the authors' own writing!

      Figure 4: Panel C graph: Why, in the controls, is the transgene consistently and significantly more often inherited by the next generation (0)? About 75% of the progeny appear to be sired by the transgenic fathers rather than the wild type fathers. Was there an age advantage of the transgenic ones or some other fitness factor? This is surprising and no explanation is given at all! In contrast, in the Cas9 background in generation 0, less than 50% carry the drive allele, which is probably due to induced lethality. But this should also be commented upon. In the legend it is stated that 7 of 9 flies carried an R1 allele heterozygous to an R2 allele. What about the other two?

    4. Reviewer #1:

      This paper shows that Cas9-mediated homology directed repair can be used to insert a synthetic rescue gene into an essential gene; here mitochondrial Pol-gamma35 was chosen. The insertion is marked by an eyeless-GFP reporter and also contains the gRNA (gene drive) but not the Cas9 (considered as a safe split gene drive). 'Homing' of the eye-GFP is assayed to detect insertion at the homologous locus by HDR when Cas9 is present. The authors show that this works well in the female germline with various tested Cas9 lines (vas, nos, Act5C and ubiq-Cas9). In all cases close to 100% transmission to the homologous locus on the homologous chromosome is achieved when an effective guide RNA is used. Hence, eye GFP transmits ('homes') in a 'super-Mendelian' ratio at the chosen target. A male specific transmission works less well (exuL-Cas9). The reason why it works well appears to be that the chosen target is an essential gene (Pol-gamma35) in which small changes caused by NHEJ that result in homing 'resistant' alleles will be loss of function alleles and hence will not spread in the population. Unfortunately, the authors did not test how the drive could spread in a wild type population (no Cas9 expression). I am also missing a test relevant for pest studies that would achieve the spread of a potentially deleterious or beneficial insertion that could kill a population or make it resistant to a disease.

      1) This paper is very hard to read. Sentences are excessively long and complicated. References to the Figures appear not always correct.

      2) Figure 1. Genotypes in Figure 1A are unreadable in the print version because of the small font. Are the 2 crossing schemes required that only differ in gRNA1 or gRNA2? The surviving progeny should be quantified as in Fig 1B. Figure 1B shows nos-Cas9 and not act-Cas9 results (several typos in lines 148-155). Figure 1C: the incidence of heterozygous, homozygous and 'resistant' cells is schematic and not supported by data, hence it is questionable whether Figure 1C should be shown in the results.

      3) Figure 2. Genotypes not readable in print. Is it necessary to show schemes of the procedure how transgenic flies were generated and how the Pol-gamma 35 HomeR were made with all chromosomes detailed (Fig 1D)? This could move to the methods as it is standard and we learn not much new.

      4) More typos: line 286: Fig2B is the wrong reference; line 295 should read Actin 5C. Figure 4B GGG codes for Gly (not Gla). lines 576 to 592 should refer to Figure 6?

      5) Figure 5 - as figures 1+ 2, only readable on the computer.

      6) It would be interesting to see how the gene drive would spread if HomeR and Cas9 were introduced in a competitive way into wild type populations. This is similar to Fig 4C, but only the HomeR males or females would carry the Cas9. This would be a more realistic test of how the gene drive could spread in a wild population that obviously does not express Cas9.

    1. Reviewer #3

      This manuscript by Beier et al. has used an impressive array of genetically modified mouse lines to study which retinal circuits are responsible for driving the pupillary light reflex (PLR). These mouse lines are validated by direct electrophysiological recordings from rods, rod bipolar cells, and ON and OFF cone bipolar cells. The manuscript makes two key conclusions based on measurements of PLR from darkness to 100 lux and 1 lux light steps: 1) the ON but not the OFF pathways drive PLR, 2) PLR relies on the most sensitive rod pathway - the primary rod pathway. My main concern is that the data shown in the paper do not uniquely support these two key conclusions. There are many issues, some of which may be fixed by better explanations, some of which may require more complete measurements. I outline my main concerns below:

      1) The manuscript uses an incoherent terminology of the retinal pathways. For example, the beginning of the second paragraph of the introduction states that the ON and OFF pathways split in the first synapse, which is not true, for example, for the primary rod pathway (rod bipolar pathway). The latter segments of the same paragraph lay out more clearly the conventional definition of the primary, secondary and tertiary rod pathways. In short, it would be important to use a coherent and conventional terminology of the retinal pathways and relate the experiments and conclusions to these. It would also be important to correlate the used stimuli to the light levels defined to drive signals across different retinal pathways in image forming vision (see Grimes et al. 2014 & Grimes et al. 2018). At present, the light levels for physiological studies are expressed in R / rod (see Supplementary Table), whereas lux is used as the unit for PLR. Comparison to the previous literature would require a unified intensity space (preferably Rs or both luxes and Rs). It would also be important to relate the sensitivity of the primary rod pathway (which the authors claim is driving the PLR) to the signaling levels (extremely low light levels, <10 R/rod/sec) where this pathway is supposedly driving image forming vision (Murphy & Rieke, 2006). It seems that the current PLR experiments probe much higher light levels than these papers in relation to the primary rod pathway. A wider stimulus space should be tested and/or at least a clear explanation would be needed for the choices made.

      2) One of the two main conclusions of the paper is that the retinal ON pathway drives the PLR and the OFF pathways do not contribute to the PLR. The authors state (see abstract): "The OFF pathway, which mirrors the ON pathway in image forming vision, plays no role in the PLR". The data in Figs. 2A & B and 3 A & B indeed give strong support for the notion that light steps from darkness to 100 lux and 1 lux drive light responses through the ON pathway. However, this finding is not in conflict with the image forming vision. In fact, both the classic papers (Schiller, 1982, photopic ) as well as more recent results (Smeds et al. 2019; scotopic) support the notion that light increments are coded by the ON pathway. Now the circuits controlling PLR seem to fall exactly in this picture. However, the classic papers based on image forming vision (see e.g. Schiller, 1992) propose that the OFF pathways would drive light decrement stimuli. To justify the conclusion that "OFF pathways do not contribute to PLR" the authors should test a wider stimulus space including light decrements across scotopic and photopic light levels or limit their conclusions to light increments and in line with current notion for image forming vision. The reason that OFF pathways do not play a role may just reflect a limitation in the stimulus space probed.

      3) The authors appear to ignore that the division into ON and OFF pathways occurs only after the AII cells along the primary rod pathway. The fact that Cx36 KO mice exhibit a normal PLR thus seems to invalidate the main claim of the paper that the primary ON pathway drives the PLR. The authors state: "These results imply that either the rod to rod bipolar cell pathway, independent of the AII ON pathway, is capable of driving pupil constriction or that cones are playing a role". Both of these conclusions are in contradiction with the main conclusion that the primary rod pathway as defined conventionally would be the underlying mechanism. If indeed cones are driving the PLR in Cx36 KO mice, that would be in contradiction with the previous literature (Keenan et al. 2016). It would be important to test this, perhaps by using a different mouse line that allows the cone contribution to be eliminated. Alternatively, showing data on Cx36 KO mice at lower light levels could help, but this dataset is missing from Fig. 3. Similarly, the Cone Cx36 KO dataset seems too sparse (n = 3) to justify the current conclusions in Fig. 3D and for some reason, the corresponding data trace is missing completely from Fig. 3C. In fact, as they speculate (see Discussion), the authors might have uncovered an entirely novel mechanism controlling the PLR. However, this has been left untested, even though it could be the most interesting new discovery if properly tested and shown.

    2. Reviewer #2:

      This work from Beier et al. elegantly dissects the rod circuits contributing to the mouse pupillary light reflex through ipRGCs. The authors combine multiple genetic mouse models with electrophysiology and behavior to demonstrate that the primary rod pathway is the primary driver of the dim light pupillary light reflex, and that the secondary rod pathway cannot effectively compensate for this pathway if it is lost. My technical comments are minor. This will be a welcome addition to the field of ipRGC research. My main concern, which I will leave to the editors, is that the actual advance may not be substantial enough.

      This is the first study to attribute the rod contribution to the PLR to the primary rod pathway. Though elegant, the fact that the primary rod pathway through ipRGCs is the major contributor in low light and that both primary and secondary pathways contribute to the photopic light PLR is not particularly surprising given the previous clear demonstration by the Hattar group that the rod pathway itself is required for the pupillary light reflex (Keenan et al., 2016).

      The authors do convincingly show that the OFF pathway cannot drive the PLR, but again this is in agreement with data showing ipRGCs are the sole conduit for light to drive the PLR (Güler 2008; Chew 2015) and that all ipRGCs get info primarily or solely from the ON pathway (Dumitrescu et al. 2009 and Schmidt 2010, etc.).

      Is 1 lux of mixed wavelength light truly in the scotopic regime? How was this calculation/determination made?

      Was there any difference in the dark adapted pupil diameter between each of the mouse lines?

    3. Reviewer #1:

      This paper uses a variety of mouse lines to investigate what retinal circuits control the pupillary light reflex (PLR). Recordings from rods and bipolar cells confirm that the manipulations work as expected - at least at the level of the bipolars. Measurements of the PLR in these mice then are used to draw inferences about the relevant pathways. The main conclusions are that cones contribute little to the PLR across light levels, that signaling in Off retinal circuits contributes little, and that both primary and secondary rod pathways contribute.

      I have several concerns about the work as presented:

      1) Use of mouse lines. The mouse lines are interpreted as cleanly dissecting different retinal pathways, but this may not be the case. For example, deletion of one pathway may alter signaling in another pathway - either through compensatory effects, or from interactions between the pathways that are missing when one is removed. One way to address this concern would be to record from RGCs to test for such effects. For example, the cone sensitivity in the RGCs in Cx36-/- mice should not be altered. The bipolar recordings are helpful in this regard, but they do not represent the circuit output and hence could miss key interactions or compensation.

      2) Interpretation. The results are interpreted in the context of a standard model of retinal circuitry. Yet several aspects of the results suggest that such a model is incomplete. One example mentioned in the text is the possibility of direct RBC to RGC connections. A specific concern in this regard is that it is unclear how the secondary pathway could control the PLR but cones could not - since rod and cone signals are mixed in the secondary pathway. Accounting for the results in the paper would appear to require revisiting our understanding of retinal circuits - but more direct tests of the circuits are needed for such a conclusion.

      3) Relation with past work. The paper is short and suffers from short or missing descriptions of related past work. For example, a good deal is known about how signals from the primary and secondary pathways modulate cone bipolar and RGC responses. This is directly relevant to what is expected and unexpected in the present work. Recent work (Lee et al., 2019) also shows a contribution of melanopsin to ipRGC responses at low light levels - but this is mentioned only in passing in the present paper. This work appears highly relevant to the present study.

    1. Reviewer #3:

      The manuscript by Shi and colleagues delineates an approach for labeling newly synthesized lipids, thereby providing a method to examine how lipids move throughout the cell. The premise of this technical approach is that fluorescently labeled fatty acids are fed to a cell in the presence of another lipid, which will incorporate the fluorescent acyl tail using the endogenous cellular acyltransferases. Cellular imaging is paired with this approach to show the subcellular accumulation of the lipid. As presented, the data are intriguing, but there are some concerns and questions about the technique that limit the interpretation of the data and could impact the overall utility of this approach. The authors should provide the additional requested data, and resolve the issues raised below to increase confidence that this labeling approach allows for the monitoring of physiologic lipid trafficking pathways.

      Specific concerns and questions are delineated below.

      1) The authors initially exploit the remodeling of PLs as described in figure 1a. This involves the addition of lyso-PL and NBD-labeled palmitoyl-CoA. The authors imply from their schematic in Fig 1a that they are using lyso-PLs that are being remodeled at the sn1 position by NBD-labeled palmitoyl-CoA. Unless I am missing something, lyso-PA and other related lyso-PLs are generally remodeled at the sn2 position. Additionally, there is specificity for PUFA acylation of the lyso-PL. So I am a bit confused about the enzymes that are working in this system. I tried to determine which lyso-PLs the authors are using, but the methods did not specify whether they are using 1- or 2-lyso PLs. This should be clarified so that we can understand the enzymes the authors think are underlying the labeling reaction. On a minor, but related note, the lyso-PL in Figure 1a is missing an -OH group at the sn1 position.

      2) The authors use a cell system where the cells are starved of lipids and other metabolites for 1 hour and then fed a large bolus of lipids as substrates. It appears that the cells can remodel and label some PLs under these conditions, but it is not clear to me that this represents physiologic labeling that can be used to track the de novo labeling and trafficking into subcellular compartments. Nor can it be used to draw strong conclusions about required trafficking or enzymatic pathways under normal conditions. What happens if labeling occurs in complete media or defined media? This might help to resolve this.

      3) The labeling looks non-uniform in mitochondria as evidenced by the images in figure 2a. Why is the labeling only at the outer edge of the mito in these cells in this figure? What happens if labeling goes longer? Similarly, the authors quantify "30 cell images" or the like in the figures for Pearson correlations. How were the 30 cells selected, and since labeling across the mitochondria is not uniform, how were images selected? A much larger number of images scanned in an unbiased manner would increase confidence.

      4) Likewise, what happens if the labeling is allowed to proceed beyond 15 min? Can the authors provide a 30 min and 1 hr image?

      5) There are a number of conclusions drawn about specific pathways required for the trafficking or accumulation of labeled lipids. I realize that some of these studies are used as a specific proof-of-concept for the approach. However, there are many studies that go beyond proof-of-concept and draw conclusions about biology. Many of the studies are somewhat superficial and the conclusions reached by the authors should be tempered given that they have not deeply investigated this new biology.

    2. Reviewer #2:

      In this study, Zhang et al. use a semi-novel method to track acyltransferase activity using fluorescently labeled palmitic acid (NBD-16:0) to track where specific lipids are incorporated with subcellular specificity. They show that NBD-16:0 can be incorporated into different lipid classes that segregate based on previously known subcellular specificity. While this is an interesting technique, it is difficult to determine how much fidelity this method has in recapitulating biological function without additional experimentation, orthogonal measurements, and a more descriptive methods section.

      Comments:

      1) The authors do not specify which lysophospholipids were used in their study. In the method section they specify that they came from Avanti, but there are >100 different lPLs in their catalog. Also, the authors give a range of lPL concentrations in the methods, but do not specify which concentration was used for which experiment. Without this information and other unspecified aspects of their studies, interpreting subsequent experiments is difficult.

      2) One potential advantage of this method is that it tracks endogenous lipids in live cells; however, the authors show NBD-16:0 being transported to lipid species where palmitate is almost never measured. For example, they use the transport of NBD-16:0 to CL as evidence that the method is working. However, natural cardiolipins are almost completely devoid of 16:0. In mammalian cells >80% of the fatty acids in CL are 18:2, with most of the remainder being 18:1 and 18:3. Similarly (assuming you are using sn-1 lPL-16:0), phospholipids with two 16:0 are extremely rare in mammalian cells with the exception of lung surfactant. Further, 16:0 composes <5% of cholesteryl esters in typical cells. The authors should be clearer about how this discrepancy between natural sorting of palmitate and the sorting of NBD-16:0 supports this as an accurate model of acyltransferase activity and intracellular transport.

      3) The authors state that PA is primarily remodeled in the ER and transported to the mitochondria as a precursor to CLs (lines 108-111). This statement needs a source. In most studies I am aware of, the vast majority of both PA and 16:0 are primarily converted to TGs or PC/PE with only a small fraction going towards the CDP-DAG pathway required for CL biosynthesis. Are C2C12 cells unique in this regard? Does lPA stimulation specifically induce CL production? Does any of the NBD get into the TG or phospholipid fractions?

      4) This study would be much stronger if another fluorescently labeled fatty acid was added. A comparison of the sorting of 20:4 and 16:0 would be very informative. This is especially true if the studies were done in the context of a known acyltransferase, for example LPCAT3.

      5) This study would also be strengthened by an orthogonal technique showing similar sorting. For example, separation of the organelles and measurement of labeled fatty acids by MS or nanoSIMS analysis would greatly strengthen these studies.

      6) In figure 1A, the authors draw a schematic with an sn-2 lyso-PL in the figure. Sn-2 lyso-PLs are labile and will acyl migrate to the sn-1 position without careful handling of the PL in a basic solution. The authors make no mention of this type of handling in the method section. This figure should either be corrected or more details of how they handled their lysophospholipids should be provided.

    3. Reviewer #1:

      My general assessment of this work is that it is full of good ideas and presents a novel and general approach to examine lipid remodeling in cells and perhaps subsequent transport of lipids, mainly to mitochondria, but it lacks the scientific rigor necessary to be fully confident that the data firmly support the authors' claims. Often, insufficient information about the methods is provided and the manuscript is hard to follow critically.

      More specific comments:

      1) I am surprised that acyl-CoAs are transported into cells. I don't know of any precedent for this. Usually fatty acids are imported into cells and then converted to acyl-CoAs as part of the mechanism of import. Could it be that the acyl-CoAs are hydrolysed before uptake only to be reformed inside the cells? I would suggest feeding the NBD-palmitate plus the lysolipids to the cells as a control to see whether this is the case.

      2) In fig 1 as an example they choose a region to blow up. As one can see there is a large variation, even in the blowups of mitochondrial labeling and if one looks at the originals the variation is confirmed. How have they chosen these areas? Furthermore, in figure 1 there is quite a bit of label with MLCL outside of the mitochondria, in particular in regions that they did not choose to blow up. What are these structures? Remodeling of MLCL is thought to take place in mitochondria.

      3) They speak of transport of lipids from ER to mitochondria, but in fact the demonstration of this is very weak from what they show in the time course in supp fig 1. I am also disturbed by the difference in patterns of the NBD-PA patterns in a and b. They should be the same, but there are problems, maybe focus? I would say anyway that there is no clear evidence that the NBD PA first appears in the ER then goes to mitos. It could be synthesized in both compartments from their data.

      4) The product characterization by TLC is insufficient. There are no standards, no characterization. Would they have seen the free NBD-palm by their methods?

      5) When they use mutants and find less "transport" the mitochondrial signal as seen by mitotracker is always more diffuse. This indicates to me that there is another problem.

      6) In fig 3 the fluorescent pictures do not correspond to what is seen in the quantification. There is more yellow in e than in h.

      7) How did they add cholesterol at 50 or 100 micromolar? It is soluble at less than 1 micromolar in aqueous solution. The cholesterol experiments are puzzling. From what we know about StAR protein it recognizes cholesterol not esters. There is no precedent for cholesterol ester transport into mitochondria. Can they rule out that the esters are transported to the surface of the mitochondria and the NBD-Palm cleaved off and transported into the mitochondria?

      8) The MAG and DAG experiments are overinterpreted. It could just be a kinetic problem since the MAG gets converted to DAG before TAG.

      9) They compare to externally added NBD lipids, but we don't know which ones they used. Are they using short chain NBD phospholipids? I could not find this in their manuscript. If they do not have the same NBD-palm in the sn-2 position then the comparison is meaningless.

      10) The excitation and emission spectra of their probes are sometimes overlapping. How did they deal with this? Are they sure that they are not seeing FRET?

    1. Reviewer #3:

      In this manuscript, Icke and colleagues show that the secreted protein CexE/Aap from enterotoxigenic E. coli is acylated at an N-terminal glycine and suggest that acylation is required for secretion via a Type I Aat secretion system to the cell's surface or into the environment. The key finding is the identification of an N-acyltransferase (AatD) encoded near cexE/aap and the demonstration that this enzyme is required for acylation.

      There is a concern about the novelty of the findings. The publication by Belmont-Monroy et al. (PLoS Pathogens, August 2020) cited by the authors is very similar to the current manuscript. That publication demonstrated that N-acylation of Aap (a CexE homolog) occurs at its N-terminal glycine (made available after signal peptide cleavage), that acylation is dependent on the acyltransferase AatD, that acylation is required for Aap secretion, and that N-terminal residues are sufficient for acylation of a heterologous protein (though this was poorly analyzed in that paper). Almost all of those findings are shown in this current manuscript by Icke et al., independently confirming the acylation reaction.

      This Icke et al. study is well done and convincing on the AatD-dependent acylation of CexE/Aap. Overall, the same conclusions are drawn as Belmont-Monroy et al., 2020. The major new advance (not previously described) is the observation that the N-terminal glycine is required for N-acylation by AatD.

      As described in my comments (below), the manuscript could be improved in a few instances by including key controls to support the conclusions. In other instances, broad conclusions are made from narrowly focused data and the text should be revised.

      Major comments:

      1) "To our knowledge this is the first report of enzyme mediated N-palmitoylation in nature". This statement is not correct. The lipoprotein N-acyltransferase Lnt (used as a reference for AatD analysis in this manuscript) performs N-palmitoylation (C16:0) in E. coli and distantly related bacteria such as mycobacteria/corynebacteria. See Jackowski & Rock 1986 (JBC 261, 11328-11333), Hillman et al. 2011 (JBC 286, 27936-27946), Brulle et al. 2013 (BMC Microbiology 13, 223).

      2) The conclusion that "we reveal a new function for acylation - protein secretion" is not fully supported. The authors do not directly show that the secreted CexE is acylated (Fig 2A) or that acylation is required for secretion. The use of 17-ODYA is innovative and could be used to show that secreted supernatant CexE is acylated. The CexE N-terminal substitution mutants that are not acylated (Fig 7C) could be used to test if acylation is required for secretion.

      3) If the secreted CexE is acylated, some discussion is needed. How is the acylated form sometimes secreted into the aqueous environment but sometimes embedded in the outer membrane as shown in the model?

      4) Can the authors show/detect CexE acylation in the native system that doesn't rely on overproduction of the CfaD transcription factor? Is the observed acylation physiological or a consequence of strong overexpression?

      5) Claims of novelty in text should be altered following Belmont-Monroy et al., 2020.

    2. Reviewer #2:

      I think this is a superb manuscript - it is written in a clear way, such that the story starts at the historical understanding of lipoprotein trafficking and builds up convincingly using various experimental methods to show that a new class of lipoproteins is trafficked via acylation of glycine, through the Aat secretion system.

      It is highly exciting that a protein that does the acylation AND the secretion from the periplasm to the cell surface has been identified! Next step is to get a structure.

      The data are convincing and the paper is extremely well-written. My one comment is that I am not convinced by the argument that CexE would not be recognised by the Lol system. When it is acylated it likely would be, as the hydrophobic pockets of LolA and LolB are fairly indiscriminate - see e.g. the binding of small hydrophobic molecules to these proteins. The authors should comment on this aspect.

      It is intriguing how glycine in particular is recognised for acylation.

      Overall, a great paper - authors should be commended.

    3. Reviewer #1:

      This study from the Henderson laboratory describes the identification of a hybrid secretion system involved in the acylation and trafficking of a conserved class of bacterial lipoproteins. Spurred by the serendipitous observation of posttranslational modification, Icke et al. identify the AatD protein as the factor responsible for CexE acylation. Combining alignment of conserved sequences and structural data, the authors isolate the site of acylation on the CexE polypeptide and identify AatD residues responsible for catalysis. Overall this is a strong manuscript, densely packed with supporting data and extremely well written.

      My only significant concern is the issue of novelty. Although the authors seem to imply they are the first to report this type of system, they cite a 2020 PLOS Pathogens paper by Belmont-Monroy detailing nearly identical results in Enteroaggregative E. coli. Given the significant amount of overlap between these two manuscripts, it would seem prudent for the authors to spend some time in the introduction and discussion highlighting open questions that this paper addresses.

    1. Reviewer #2:

      The manuscript prepared by Kim and colleagues provides a solid attempt at understanding the neural correlates associated with self-reassurance and self-criticism in relation to what they term neural pain. While it is well written and there is a clear story presented here, there appears to be insufficient detail in the introduction and discussion. The methodology appears sound for the most part, but I have some concerns relating to stimuli and gender effects that I believe would make the findings more compelling if addressed.

      Criticisms:

      1) The example items of the neutral statements appear to involve an external agent (i.e., a reference to a friend), while the neural pain is purely about the self. Are there also references to other people in the neural pain condition? If not, how have the authors ensured that the neutral condition is actually neutral? It seems likely that the inclusion of an external agent for many of the neutral statements could pose problems with interpretation, especially when talking about self-criticism and self-reassurance. The presence of an external agent in the neutral statements changes the meaning from a purely self-oriented experience to a shared experience.

      2) I am curious as to why the inverse contrasts (i.e., reassurance - criticism) were not run. Knowing whether there was a unique network associated with self-reassurance would provide a more comprehensive understanding of the authors' findings.

      3) I am wondering why the authors did not account for gender differences in their study. Given recent evidence (see citation), it seems likely that this may play a part in self-compassion. The authors report an almost equal distribution of males and females, so it should be possible. If the authors did explore this and found no difference with gender as a regressor then this should be noted in the manuscript. Mercadillo, R. E., Díaz, J. L., Pasaye, E. H., & Barrios, F. A. (2011). Perception of suffering and compassion experience: brain gender disparities. Brain and cognition, 76(1), 5-14.

      4) It seems as though a whole body of literature is being very lightly touched on here but would benefit from inclusion. I think it would be useful to have some information in the introduction regarding moral emotions (i.e., compassion) and the link with empathy and emotion regulation (see work by Jean Decety). This would also be beneficial for the discussion as the authors are in essence describing empathy.

    2. Reviewer #1:

      This is a potentially interesting analysis, but there is a lack of framing, details, and specificity that dampens my enthusiasm for the work.

      1) As far as I can tell, the authors do not really demonstrate that "markers of negative emotion and pain" can be down-regulated during self-reassurance. They simply show that regions surviving multiple comparisons change depending on condition, but they don't show data supporting their hypothesis. How much do regions activated during criticism actually change during reassurance? What is the time course of these differences?

      2) Behaviorally, the neutral statements from the two "conditions" appeared to have distinct intensity levels. Specifically the "intensity" for neutral trials during criticism blocks appears significantly lower than neutral trials for reassuring blocks. Because of this behavioral effect, within their design it is difficult to identify the cause of the brain changes.

      3) How were subjects trained in self-criticism vs. reassurance? Is there any way to confirm that they were in fact doing the "task"? Further, at what point in the 2-week compassion training paradigm were FMRI data collected?

      4) Figure 2 is quite confusing to me: (1) the authors refer to brain maps as "neural pain". I would strongly advise against this as it is very reverse-inferency, and I would recommend against using this phrase throughout the paper. (2) How would one interpret the phrase "neural pain during self-reassurance"? Is this emotional > neutral during reassurance?

      5) Figure 3 refers to "trial by trial ratings of intensity" but if I am understanding the figure, this is not an accurate description. The authors are reporting the mean across subjects for each condition. It is unclear in fact how much variability there is on a trial-by-trial level within persons for the intensity of each condition. One idea is to use an amplitude modulation analysis to scale FMRI parameter estimates by the intensity rating on a per-trial basis. That would be an interesting analysis, IMO.

      6) It is unclear from this paper what was done previously. It appears that the authors examined physiological data (e.g., HRV) in their previous report but don't talk about other measures that were collected here. It would be useful to know the extent to which they buttress the authors' findings (or if they do not).
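
      The amplitude modulation analysis suggested in comment 5 could be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the function names, the scan-indexed onsets, and the toy signal are all assumptions, and a real fMRI analysis would also convolve each regressor with a haemodynamic response function. The idea is to model each condition with an unmodulated regressor plus a second regressor scaled by the mean-centered per-trial intensity rating, so that the second coefficient captures trial-by-trial variation.

```python
import numpy as np

def modulated_design(onsets, ratings, n_scans):
    """Build [main effect, rating-modulated effect, intercept] columns.

    onsets  : scan indices at which trials of one condition start (hypothetical input)
    ratings : per-trial intensity ratings, one per onset (hypothetical input)
    """
    ratings = np.asarray(ratings, dtype=float)
    # Mean-center so the modulator captures variation around the average trial.
    centered = ratings - ratings.mean()
    main = np.zeros(n_scans)
    modulated = np.zeros(n_scans)
    main[onsets] = 1.0
    modulated[onsets] = centered
    return np.column_stack([main, modulated, np.ones(n_scans)])

def fit_glm(design, signal):
    """Ordinary least-squares estimates of the regression coefficients."""
    beta, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return beta

# Toy example: a voxel whose response scales with the trial rating.
onsets = [2, 6, 10, 14]
ratings = [1, 3, 2, 4]
design = modulated_design(onsets, ratings, n_scans=20)
signal = design @ np.array([2.0, 0.5, 1.0])  # noiseless synthetic signal
beta = fit_glm(design, signal)
```

      A non-zero coefficient on the modulated column would indicate that the response tracks the rating within subjects, which the condition means alone cannot show.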

    1. Reviewer #3:

      The manuscript "Ex vivo observation of granulocyte activity during thrombus formation", submitted by Morozova and colleagues, tries to demonstrate the involvement of two different types of granulocytes in thrombus formation. The authors study thrombus formation in anticoagulated whole blood from healthy donors and Wiskott-Aldrich patients in parallel-plate flow chambers over type I collagen at a low shear rate (100 s-1). They identified a CD66/CD11 cell population, defined as granulocytes, that is able to interact with the growing thrombus. Two types of granulocytes were observed and differentiated by their fluorescence patterns: type A (uniform DiOC6 staining) and type B (cluster-like DiOC6 staining). The authors studied granulocyte behavior in the presence of several kinds of inflammation mediators. The manuscript should be improved; please see my comments below.

      1) The authors should clarify the technical part of the manuscript and figure 1, essentially the use of anticoagulant in the flow chamber experiments. It is not obvious which anticoagulant was used to perform the flow chamber experiments: citrate, heparin, or hirudin. Was recalcification performed in all experiments?

      2) The authors should explain why figure 1 demonstrates that granulocytes need free calcium ions to adhere to the growing thrombus. This is not the conclusion of figure 1. Moreover, all the growing thrombi seem different (more compact in citrate than with hirudin; without granulocytes in citrate and with granulocytes in hirudin). The authors should discuss this point.

      3) The following sentence is confusing (last sentence of 3.1): “Hirudin- and heparin-anticoagulated blood was used in all further experiments because citrated blood recalcification causes local fibrin formation and platelet activation.” Platelet activation is essential to thrombus growth.

      4) The authors hypothesize that type B cells are more activated than type A cells, essentially based on cell crawling and velocity. Could they perform supplemental experiments to prove this point (an increase in the active form of CD11) and to differentiate neutrophils from eosinophils and basophils?

      5) It would be valuable to perform a competition experiment to prove that platelets interact with granulocytes through CD11.

      6) Did the authors find NETs in this setting?

      7) In all pictures platelets seem poorly represented, with only two or three platelets in figure 2. How can the authors be sure that granulocytes interact with platelets and not collagen?

      8) Some platelets seem inactivated (round form) and annexin V positive. Could the authors discuss this point?

      9) Concerning the last figure, it would be valuable to use healthy platelets and WAS granulocytes to conclude that crawling is altered.

    2. Reviewer #2:

      The authors report on a model system to study the infiltration of growing thrombi by leukocytes using fluorescent microscopy. Such a system will facilitate studies on the role of leukocytes in the context of immuno-thrombosis and thrombus growth. I like the proposed model and I'm convinced about the utility of such a system. However, although the model is sound, I do have a couple of major concerns:

      1) The authors describe crawling neutrophils; which methodology has been applied to guarantee that the crawling of the neutrophil is a general phenomenon in this system and not a feature of selected neutrophils? Was a method of quantification used (e.g. 86% of the neutrophils are crawling)?

      2) The authors identify two types of granulocytes (uniform DiOC6-type A vs cluster-like DiOC6-type B granulocytes). Although the pictures provided (fig 2) are nice and clear examples of type A vs type B granulocytes, the experiments will surely have revealed staining patterns that were less clear. What kind of systematic methodology has been applied to delineate type A from type B neutrophils?

      3) Figures 3 and 4 are nice figures showing differences in granulocytes/FOV and velocity, but it is not clear to the reader what percentage of the neutrophils visible in the FOV move. Was there a minimal number of neutrophils in motion?

      4) The authors want to point out the synergistic effects of neutrophils and platelets in thrombus growth. To make the point, they use whole blood of a WAS patient. However, since WAS affects neutrophil motility as well as platelet morphology/number, the individual roles of platelets and neutrophils in this process remain open. Therefore, control experiments targeting platelets (whole blood of a thrombocytopenic patient and/or platelet depletion) may help to identify the role of platelets in the model. On the other hand, inhibition of neutrophil activation/motility may illustrate the individual contribution of neutrophils in this setting.

    3. Reviewer #1:

      The authors utilize a novel ex vivo system to visualize granulocytes migrating within experimentally induced blood clots. Granulocytes are labeled using CD11b and CD66b and visualized over time using fluorescence microscopy as they move within the ex vivo formed clots, which have been prepared under different anticoagulant conditions to generate varied clot structure. Leukocytes are activated using various agonists and behavior classified into 2 different phenotypes based on the staining pattern of DiOC6 as either diffuse or punctate. The experimental system allows for measure of individual cell velocity and clear images of cells showing changes in cell body structure. The use of cells from WAS patients provides a nice validation of the model system being presented in the study.

      1) The authors state that citrated blood results in local fibrin formation and platelet activation; would it not be relevant to compare granulocyte behavior in such a setting to the hirudin- or heparin-anticoagulated samples? This could also provide a valuable setting to study platelet-granulocyte interactions.

      2) The further elimination of heparin-anticoagulated blood in favor of hirudin is also not clear. How does heparin pre-activate granulocytes, and what experimental evidence is seen by the authors other than an increase in number?

      3) Are type A and type B leukocytes defined only by the DiOC6 staining pattern, or also by their velocity? Please clarify in the text.

      4) The choice of "leukocyte priming agents" is not clear, in particular myeloperoxidase and lactoferrin. Leukocyte activation caused by these agents should be validated and the rationale more clearly defined (i.e. by referencing previous work that provides the mechanism of binding for these leading to neutrophil activation).

      5) Regarding the Annexin V stained smaller structures presented in Figure 4; have the authors ruled out that these could be procoagulant extracellular vesicles from other leukocytes, i.e. by performing a co-staining with a platelet marker for positive identification?

      6) Have the fibrinogen (Sigma) and von Willebrand factor (an in-house preparation from an academic lab collaborator) been tested for endotoxin levels prior to use in this system?

    1. Reviewer #3:

      In this manuscript, Dempster et al. analysed the predictability of cell viability from baseline genomics and transcriptomics based features. They did a comprehensive analysis across feature and perturbation types, which makes a valuable contribution to the field. The main findings of the paper (gene expression based features outperform genomics based ones) are not necessarily new, but the authors also show the interpretability of gene expression based features, which clearly helps to place these machine learning (ML) models into biological context. This is especially important for possible translatability, as small (low number of features), interpretable models are generally preferred over large, "black box" models.

      The study is very nicely constructed from both a machine learning and a cancer biology perspective. My only major comments are regarding some (potential confounding) factors related to tissue-type and feature filtering.

      Major comments:

      1) A well-known phenomenon in the field is the tissue-type specificity of drug sensitivity, which is a major confounding factor in several ML-based studies. The authors, absolutely correctly, use tissue-type as features in their models to overcome this problem. However, as RF models (individual trees) do not use all features at the same time, it is possible that some genomics based models are not using information about tissue-type, even if tissue-type was selected in the 1,000 features. On the other hand, for gene expression based models (based on the "tissue specificity of gene expression"), tissue-type information is probably always available. This could (partially) cause the better performance of gene expression features. Could the authors do some additional controls (e.g.: providing "multiple copies" of tissue-type features for genomics based models) to overcome this potential confounding factor?

      2) The authors use a Pearson correlation filter (mainly) to decrease computational time. In Figure 4 (and also in Figure 2 - supplement 3) they show that in case of "combined" features, the feature sets including gene expression based features had the best performance. When did they use the Pearson filter in case of combined features, before or after combining them? I.e. in case of expression + mutation, did they select the top 1,000 expression and top 1,000 mutation features, combine them and train RF models with 2,000 features, or did they combine expression and mutation features, select the top 1,000 features, and train the models with them? If the latter, it would be important to see how many features of each class (e.g.: mutation and expression in my example) are included in the top 1,000 features. This is especially important, as Pearson correlation as a filter is probably more suitable for continuous (expression) than binary (mutation) features, so it is possible that the combined features use mostly expression based features. In this case, it is not so surprising that the performance of combined feature models is closer to that of expression based models.
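The two filtering orders raised in comment 2 can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`filter_then_combine`, `combine_then_filter`, `class_counts`), the toy data, and the choice to rank by absolute Pearson correlation are all assumptions made for illustration. The point is only that the two orders can select different feature sets, and that counting the feature classes among the selected features shows how much each class contributes.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def top_k(features, response, k):
    """Return the k feature keys with the largest |Pearson r| against the response."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], response)),
        reverse=True,
    )
    return ranked[:k]


def filter_then_combine(classes, response, k):
    """Option A: keep the top k features of each class, then pool them."""
    selected = []
    for cls, feats in classes.items():
        selected += [(cls, name) for name in top_k(feats, response, k)]
    return selected


def combine_then_filter(classes, response, k):
    """Option B: pool all classes first, then keep the top k features overall."""
    pooled = {
        (cls, name): values
        for cls, feats in classes.items()
        for name, values in feats.items()
    }
    return top_k(pooled, response, k)


def class_counts(selected):
    """Count how many selected features come from each feature class."""
    counts = {}
    for cls, _ in selected:
        counts[cls] = counts.get(cls, 0) + 1
    return counts


# Hypothetical toy data: six cell lines, two continuous and two binary features.
response = [1, 2, 3, 4, 5, 6]
classes = {
    "expression": {"e1": [1, 2, 3, 4, 5, 6], "e2": [2, 1, 4, 3, 6, 5]},
    "mutation": {"m1": [0, 0, 0, 1, 1, 1], "m2": [0, 1, 0, 1, 0, 1]},
}

print("filter then combine:", class_counts(filter_then_combine(classes, response, 2)))
print("combine then filter:", class_counts(combine_then_filter(classes, response, 2)))
```

Under option A every class is guaranteed k slots, whereas under option B one class can crowd out the others, which is the situation the comment asks the authors to quantify.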

    2. Reviewer #2:

      Summary and comments:

      This study presents an analysis of five large datasets of cancer cell viability including both genetic and chemical perturbations and finds that RNA-seq expression outperforms DNA-derived features in predicting cell viability response in nearly all cases, and the best results are typically driven by a small number of interpretable expression features. The authors suggest that both existing and new cancer targets are frequently better identified using RNA-seq gene expression than any combination of other cancer cell properties.

      Overall, none of the main conclusions in the paper are surprising, which raises the question of whether sequencing more cancer exomes is really meaningful. This is a question that deserves serious debate as major resources are being diverted to large-scale exome sequencing projects with low information content returns.

      The paper is well written. And at first glance, the results seem to support the provocative title. Improved clarity around what the predictor and response variables are earlier in the paper would improve readability, particularly around what a "genomic" variable is. Most of the helpful details are buried in the methods.

      The main benefits of the manuscript: (1) emphasis on simple (i.e. few features) predictors that are themselves easily interpretable; and (2) the choice of random forest classifiers also makes interpretation of the predictions pretty straightforward. One concern is whether the breadth or depth (i.e. completeness) of the genomic predictor variables somehow unfairly biases the findings against the ability to predict with those variables compared to expression variables, which are quite easy to encode and interpret. This concern is alluded to in the discussion when reviewing the findings of previous, related publications, and could be further explored. For instance, while variations in mutant RAS (H, K or N) or B-RAF were the only dependencies noted to be predicted better by genomics (i.e. mutations), are all driver mutations known and represented in the data? One would expect that amplified EGFR or HER2 would be predicted well from genomics, but these are notably missing, presumably because they do not meet the filtering criteria.

      A notable finding was that a single gene's expression data produced notably better results than gene set enrichment scores overall, despite having many more presumably irrelevant features. Predictive models for many vulnerabilities exhibit relationships expected to be specific to a single gene's expression (e.g. loss of a paralog's expression predicts dependency on its partner). There is no biological validation for any of the predictions in the manuscript.

      Specific comments:

      What was the thought process for choosing 100 perturbations in each dataset to label as SSVs? Why not 82, or 105? Was there a systematic analysis done to pick this number (e.g. harmonic mean)?

      Did the authors estimate the effect size they are measuring across the 100 selected SSVs? In other words, was there an estimation of fitness effect, single mutant fitness or degree of essentiality for the 100 SSVs and what range of effects are they exploring? One possible way to measure the fitness effect of each of the 100 vulnerabilities is to examine the dropout rate in pooled screens at the guide RNA level, looking for consistency in gRNA behaviour.

      Did the authors include essential and non-essential genes as reference points in their analyses? This wasn't clear from the methods.

      The authors describe a clear gradation of response to either TP53 or MDM2 knockout according to the magnitude of EDA2R expression observed in multiple datasets (i.e. Achilles, Project Score, RNAi, GDSC17, PRISM). Using EDA2R expression to infer TP53 activity could have clinical benefit and deserves more attention (i.e. validation).

    3. Reviewer #1:

      General assessment:

      Since precision oncology is becoming increasingly important, understanding the relationship between cancer type-specific vulnerabilities and their biomarkers is a major challenge of personalized therapy. Previously, genomic signatures such as mutation and copy number variation were favored for predicting cancer vulnerabilities. Dempster et al. presented a systematic comparison of predictions with or without gene expression features using five major screen data sets, suggesting that gene expression would better predict cancer vulnerabilities. Although the interpretable models suggested in the last part of the paper are questionable, the main message and the supportive comparisons are clear.

      Major comments:

      1) RNA expression cannot be separated from cell lineage bias. For example, the ESR1 gene is also relatively overexpressed in normal female tissues. I'm wondering how overexpression-specific dependency can be separated from the tissue bias.

      2) Predicting drug response by expression signature might be risky if there is no clear copy number amplification signature or reasonable causality. Is it possible to find causal features of why a gene is overexpressed?

      3) In this paper, the authors presented that EDA2R expression is the top feature of predicting the TP53 dependency and MDM2 inhibitors' response as an example of interpretable models. However, many studies have confirmed MDM2 phenotype depends upon TP53 genomic status. Similarly, the response of MDM2 inhibitors can be explained by TP53 mutational status. I'm curious whether the prediction of MDM2 dependency using EDA2R expression status shows a better prediction than the prediction using TP53 mutational status in statistics.

    1. Reviewer #3:

      Three different anti-asprosin mAbs were produced and tested in different metabolic syndrome animal models. Beneficial effects were noted on body weight, food intake and blood glucose and insulin levels. The effects were modest, but seemed to be relevant to elevated asprosin levels, as the AB blocked the effects of adenoviral overexpression of the hormone. Some issues require attention:

      1) Additional characterization of the asprosin-neutralizing effect of the AB is required. It will be helpful to show the endogenous free asprosin levels at different time points after a single or repeated mAb injection. This result is also important to tell whether this mAb will cause other immune responses and side effects that might confound interpretation of the results.

      2) In Figure 3 (a, e, j) and Figure 4 (a, e, i, m), please show body weight to rule out the stress or side effects caused by virus injection. For DIO mice, 14 days IgG injection also caused weight loss; for db/db mice, IgG injection increased body weight. Please discuss.

      3) Although adenovirus and AAV are widely used for in vivo protein overexpression, it is important to show here that endogenous asprosin levels were increased after virus injection and decreased after antibody neutralization.

      4) In Figure 5, more data on liver weight, histology, etc. are required to support their conclusion on liver health. The current data from three different mouse models are very contradictory; this could be caused by a side effect or off-target effect of this mAb.

      5) In Figure 6, it is important to demonstrate the neutralizing effect of the mAbs.

    2. Reviewer #2:

      Asprosin, as identified by the authors' group, is reported to stimulate glucose release from the liver and also centrally act as an orexigenic hormone. The present study developed monoclonal antibodies for asprosin and demonstrated that antibody-based asprosin depletion lowered food intake, prevented diet-induced body weight gain, and lowered blood glucose levels in mice. Overall the data are supportive of the conclusion; however, several concerns were identified as follows:

      1) One of the central issues is the specificity of the antibody action. The authors should demonstrate if the effect of the asprosin antibodies is blunted in mice that lack either asprosin or its receptor OR4M1.

      2) Previous studies from the authors' group show that asprosin acts on hepatocytes and triggers cAMP signaling. The authors should examine if the neutralizing antibodies blunt the cAMP signaling in DIO mice.

      3) Similarly, asprosin was shown to stimulate AgRP+ neurons. The authors need to demonstrate the effect of asprosin antibodies on AgRP+ neuronal activity.

      4) A recent paper (von Herrath et al. Cell Metabolism 2019) challenged the authors' observation of the metabolic action of asprosin. The authors claim that this is "due to use of poor quality recombinant asprosin". However, no scientific evidence was presented. This study needs a more rigorous assessment of data reproducibility.

      5) Most of the bodyweight data are presented as "body-weight change". However, the authors should present them as whole-body weight.

      6) Some of the data points and statistical analyses require further clarification, e.g. lack of SE in Fig. 1c, statistical analysis of Fig. 3, Sup Fig. 1.

    3. Reviewer #1:

      The study is interesting and does have potential translational relevance. There are some concerns, however: (1) in Figure 1 the blood glucose drops independent of food intake; is this all related to decreased hepatic glucose output, or are there any effects on urine? Was urinary glucose measured? Is there increased glycosuria?; (2) In previous papers you discuss the increased lean body mass when asprosin is not present. There is no body composition data in this study. Were there any body composition differences with the antibody among the different mouse models (e.g., DIO vs NASH diet)?; (3) were any changes in lean body mass with the antibodies associated with increases in strength?; (4) several mouse ages are discussed in the Methods section: 12 weeks, 16 weeks, 12 weeks of high fat diet or 24 weeks of NASH diet. It is not clear from the description if mice were matched for age. Please clarify; (5) In Figure 5 there are a number of inflammatory markers which can vary according to the model. What about anti-inflammatory markers (cortisol, IL-10, etc.)? These would be helpful to get a better picture of physiologic changes.

    1. Reviewer #3:

      This paper shows that during a second-order conditioning (SOC) task, a conditioned outcome is represented in the lateral orbitofrontal cortex (lOFC). The BOLD signal in this region shows increased functional coupling with the amygdala for second-order conditioned stimuli that indirectly predict a negative outcome. The authors suggest these findings reflect a mechanism by which value is conferred to stimuli that were never paired with reinforcement.

      The paper tackles an interesting question concerning the neural mechanisms that support second order conditioning. The task design includes relevant controls and, on the whole, the findings support the claims made by the authors. I have a few questions about interpretation of the data, but my main suggestion would be to revise the framing of the article. There are many previous studies that have investigated the mechanisms that support second order conditioning which are not always given due credit. I believe this paper would benefit from placing the hypotheses and findings more firmly within the context of previous literature.

      Comments:

      1) The authors test the hypothesis that CS2 is directly paired with a neural representation of the US. They state that this hypothesis 'has never been tested to date'. However, a number of studies have shown evidence for and against this hypothesis (for example: Wimmer and Shohamy 2012; Wang et al., 2020; Barron et al., 2020). Can the authors clarify how the hypothesis tested here differs from those investigated previously? In addition, it is not clear to me how the four potential mechanisms they propose are really distinct from each other?

      2) Relatedly, given the authors use an SOC paradigm that differs from sensory preconditioning studies used by many previous authors, does the difference in task paradigm provide new insight? Do the authors expect the neural mechanism to be the same or different between their version of SOC and sensory preconditioning?

      3) Why is the behavioural data in Figure 1F bimodal for CS1 and CS2? i.e. what does choice probability of 0 for CS2+ vs CS2- mean for a given subject?

      4) To test the authors' hypothesis, is it not necessary to assess evidence for US in response to CS2? They instead report reactivation of US in response to CS1, and for the PPI it is not clear to me how the authors distinguish between CS1 and CS2 given the temporal proximity in their presentation (Figure 1D).

      5) For the PPI, is there a main effect of CS- and CS+ versus CSn in lOFC? If not, how does this affect interpretation of the PPI? On a separate note, is the effect reported in Figure 3 really in the hippocampus? Does it survive small volume correction using a hippocampal mask?

      6) The following is stated as a premise: "To form an associative link between CS2 and US, the reinstated US patterns need to be projected from their cortical storage site to regions like amygdala and hippocampus, allowing for convergence of US and CS2 information." This potentially seems fair for the hippocampus, with added reference to relevant literature (e.g. publications from Shohamy and Preston labs), but in my opinion the jury is still out on this one. It is not clear to me why we necessarily expect amygdala here.

      7) There are various strong statements that in my opinion need to be toned down in light of existing literature. For example, the paper claims this study is the first to show evidence for implicit inference. However, as far as I'm aware, Wimmer & Shohamy 2012 also found no evidence for explicit memory of stimulus-stimulus associations with no relationship between measures of explicit memory and decision bias. Similarly, the authors claim this paper is 'the only report so far of behavioral evidence for associative transfer of motivational value during human second-order conditioning', overlooking a large number of other studies that have shown similar behavioural effects.

    2. Reviewer #2:

      The authors investigate the neural correlates of second order conditioning in carefully designed behavioural experiments coupling multivariate fMRI and functional connectivity. They found that the lateral OFC, in connection with the amygdala, plays an important role. I think the paper represents a valuable addition to the human cognitive literature, where second order conditioning is surprisingly under-investigated. I have only a few suggestions to make.

      I encourage the authors to complement the multivariate analyses with a standard univariate analysis. To be clear, I do see the added value of the multivariate approach; however, given the extensive literature on the neural bases of conditioning using univariate analyses and the strong prediction about the directionality of the effects in the OFC (which should positively encode expected values and rewards), I think the paper would definitely benefit from including the univariate results for the main contrasts / variables.

      I am also curious to see the reaction times in the attentional control task analyzed to check if they were affected by the underlying conditioning procedure. Following the Pavlovian-to-Instrumental transfer theory, we should observe that the reaction times are slower for negative (aversive) stimuli and faster for positive (appetitive) stimuli.

    3. Reviewer #1:

      This manuscript by Luettgau et al. describes a study of second-order conditioning in humans. The behavioral task involved visual first- (CS1) and second-order cues (CS2) and gustatory outcomes (US). Behavioral results show that subjects preferred both the CS1+ and CS2+ over the CS1- and CS2-, respectively. MVPA shows that the CS1 evokes US representations in the lateral OFC, and that US representations in the amygdala increase over second-order conditioning. This study addresses an important and novel question. However, I have several major concerns regarding the study design and data analysis:

      1) I do not see how it would be possible to disentangle responses to the CS1 and CS2 in this task. The delay between the CS2 and CS1 is only 500 ms, which is not long enough to disentangle fMRI responses to the two CS.

      2) For the main "reinstatement" analysis, activity was averaged across both CS2 and CS1, and so it is unclear whether reinstatement is driven by the CS1 or CS2. The authors argue that "US reinstatement during SOC could only be faithfully attributed to the respective CS1, but not to CS2, since only CS1 had been directly paired with the US, and CS2 had not previously been experienced." However, this is only strictly true for the very first trial during which the CS2 could have gained full access to the US representation.

      3) In this regard, it is unclear why the authors did not use data from the first-order conditioning phase to test for US reinstatement. Although the 4-second delay between CS1 and US is still quite short, TR-wise MVPA could provide evidence that signals are related to the CS1 and not the US itself.

      4) Relatedly, the authors perform analyses suggesting that, from early to late phases of second-order conditioning, representations of CS2 in the amygdala became more similar to US representations. Although here they attempt to model fMRI responses to the CS1 and CS2 separately, there is no evidence that this was indeed successful. As I see it, the delay between the two CS is just not long enough to dissociate these responses.

      5) Is there evidence for a CS1 evoked reinstatement of the US in the amygdala, and a CS2 evoked reinstatement of the US in the lateral OFC? In theory these signals should exist, but independently testing for activity related to the two CS requires a task design where the two CS are presented in isolation or with long enough delay between them.

    1. Reviewer #3:

      This manuscript examines data from the Young Adult Human Connectome Project's 900-subject release to compare both structural and functional connections between iso-eccentricity bands in striate cortex and the fronto-parietal, cingulo-opercular, and default mode networks. The authors find that central vision is most strongly connected to the fronto-parietal network, which is associated with attention, while the far periphery is more strongly connected to the default mode network. The questions asked in this manuscript are of considerable interest to the field, and this study has the potential to be impactful. However, substantial work is needed to make the methods and results sufficiently clear and reproducible to the reader.

      Major Comments:

      A major problem throughout this paper is that the authors have not been very careful in documenting their methods, what they are plotting, or how they are supporting their assertions. This is a major shortcoming of the work. I do not believe there is sufficient detail in this paper as is to reproduce the methods, nor was I able to understand what precisely was calculated in the statistical tests reported.

      The amount of work that has been put into this project's quality control (at minimum, visual inspection and filtering of 900 MR images) is very impressive! This information should really be shared with the broader research community in order to make this manuscript more reproducible and in order to ensure that other researchers can simply use and cite the authors' efforts rather than repeating them. This could be as simple as a supplemental table or text-file that includes the subject IDs of those HCP subjects that were included in all analyses.

      It should be crystal-clear from the Methods section whether the manuscript's data were collected or reanalyzed by the authors. My understanding is that all of this manuscript's analyzed data are from the HCP database. However, had I read only the "Data Acquisition" section I would have been left with the strong impression that the authors collected the data themselves using the same kind of scanner and the same analysis pipelines as the HCP. Unless this is the case, the opening sentence of this section should probably be something like "All data were acquired and preprocessed by the Human Connectome Project (Van Essen et al., 2013)" [10.1016/j.neuroimage.2012.02.018]. It may also be wise to reference the HCP in the Acknowledgements section. Further information: https://www.humanconnectome.org/study/hcp-young-adult/document/hcp-citations. This should apply equally to the data and the preprocessing methods; i.e., if the quality control mentioned in the above comment was performed by the HCP and not the authors, that should be made explicit.

      P3, ❡6. This paragraph is critical to the methods but is not at all clear. In particular, the paragraph eventually describes seven eccentricity segments per subject, yet it does not explain what the eccentricity boundaries of these segments are, nor does Figure 2 show these segments. It isn't clear from the manuscript if these are ever used (rather than the 3 central/mid-peripheral/far-peripheral segments) or exclusively used.

      In looking at Figure 4, my first and strongest impression is that the central connectivity is very similar to the far-peripheral connectivity, and the z-score differences are incredibly small. Additionally, the legend does not make the quantities plotted very clear (these are based on the averaged z-scores across subjects?) so I'm left wondering how to assess any sort of significance. I have a similar reaction to Figure 5. More help is needed to understand these results.

      Given that this paper consists of a large analysis of a large existing dataset, it would be especially nice if the authors would make their source code and intermediate analysis files publicly available. Having access to the source code directly is virtually a requirement of making this kind of study reproducible and would mitigate many of my concerns about the ambiguities of the methods.

    2. Reviewer #2:

      In this work, Sims and colleagues use resting-state functional connectivity and diffusion tractography in human connectome project data to examine the connectivity of the central and peripheral aspects of the primary visual cortex. They find that central V1 connects more strongly to regions of the prefrontal cortex interpreted as the Fronto-parietal network than does peripheral V1.

      The idea that central V1 may be directly connected to control-related networks is an interesting one, and has fascinating implications for the study of top-down modulation of visual cortex function. However, I must say I am somewhat skeptical of these findings, for several reasons.

      First, I find the a priori anatomical basis for these proposed connections to be dubious. The authors themselves describe how Markov et al. explicitly conducted tract tracing with central V1 and found connections with posterior frontal and parietal cortex, but nothing with areas classically associated with the fronto-parietal cortex. The authors propose that the inferior fronto-occipital fasciculus may connect V1 with lateral prefrontal regions only in humans. However, they provide no evidence for this suggestion. Indeed, my understanding of the iFOF is that it connects to inferior and lateral occipital cortex (see e.g. figures from the Takemura study cited in this work). Can the authors better support the idea that the iFOF might be the route of connection between V1 and frontal cortex?

      Second, I am concerned that both 1) the Central V1 ROI employed in this work and 2) the inferior frontal cortex region showing strong FC with that Central V1 ROI overlap very closely with regions where we have seen poor BOLD signal in our own fMRI data (I would like to attach a figure if possible).

      We are not confident what the source of the poor signal might be in posterior occipital or inferior frontal cortex; we suspect the presence of large veins (possibly the transverse sinus in V1; see Winawer et al., 2010, Journal of Vision). In any case, the data quality is low enough that we believe our data should not be considered to represent actual neural function in those regions. Can the authors demonstrate convincingly that this is not the case in their HCP data?

      Third, I have an issue with the localization of effects in this paper. The paper describes effects in the fronto-parietal network throughout the manuscript, including the title. How surprising, then, that the strongest effects are not in the FP network at all! Figure 4A makes it very clear that the largest effects are in the IFG, which is outside the green outlines describing the extent of the fronto-parietal network, but inside the Default network. Figure 3A also supports this Default-centric localization, with Central V1 effects in posterior lateral parietal, medial parietal, and superior frontal cortex, all outside FP but inside Default. Since the FC effects are not actually primarily in FP, I see no reason why FP should be used as a mask in Figure 5. Indeed, the authors should show the localization of SC effects throughout the cortex, not just in FP. I also see no reason why these V1-Default connections should be characterized in any way as "attention" or "control".

      Fourth, I feel that these FC and SC differences are wildly over-interpreted. From the scale, the actual strength of FC and SC between central V1 and lateral parietal cortex is extremely weak (around Z(r) = .1 for FC and p-track = .1 for SC). Under no circumstances would I believe that either of those values represents any sort of real connection. Cortical regions with direct structural connections have much stronger FC values, as do regions that influence each other indirectly via multi-step connections. Further, very large portions of the brain probably have both stronger FC and SC to central V1 than these FP regions (the authors show this for FC but exclude this info for SC). Most glaringly, I certainly don't believe there is a "direct structural connection" as is claimed in the discussion--a claim based, strangely, on the spatial correspondence between the structural and functional maps, which really has nothing to do with any evidence for a direct connection.

      Finally, the authors must note that p values may not be used for spatial correlations between brain maps. This is because these maps are always highly autocorrelated, which violates the independence assumption of the correlation procedure.
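The concern about p values for autocorrelated maps can be illustrated with a toy sketch. It uses a one-dimensional circular shift as the null model, which is only a simplified stand-in for the autocorrelation-preserving nulls used on cortical surfaces (often called spin tests); it is not a method from the manuscript or the review. The function name `circular_shift_pvalue`, the synthetic smooth maps, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def circular_shift_pvalue(map_a, map_b, n_perm=1000, seed=0):
    """Two-sided p value for corr(map_a, map_b) under a circular-shift null.

    Shifting map_b around a ring keeps its autocorrelation intact while
    breaking its alignment with map_a, unlike a naive shuffle of values.
    """
    rng = random.Random(seed)
    n = len(map_a)
    observed = pearson(map_a, map_b)
    exceed = 0
    for _ in range(n_perm):
        shift = rng.randrange(1, n)  # never the identity shift
        shifted = map_b[shift:] + map_b[:shift]
        if abs(pearson(map_a, shifted)) >= abs(observed):
            exceed += 1
    # Add-one correction so the estimated p value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)


def smooth_noise(rng, n, width):
    """Moving average of Gaussian noise: a crude, strongly autocorrelated 'map'."""
    raw = [rng.gauss(0.0, 1.0) for _ in range(n + width)]
    return [sum(raw[i:i + width]) / width for i in range(n)]


demo_rng = random.Random(1)
map_a = smooth_noise(demo_rng, 200, 20)
map_b = smooth_noise(demo_rng, 200, 20)  # generated independently of map_a
r, p = circular_shift_pvalue(map_a, map_b)
print(f"r = {r:.3f}, circular-shift p = {p:.3f}")
```

Because smoothing leaves far fewer effectively independent samples than the nominal number of points, a parametric p value for the same `r` would tend to overstate significance; the shift-based null is one simple way to account for that in this toy setting.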

    3. Reviewer #1:

      This manuscript extends prior work by the authors (Griffis et al, 2017), which originally reported eccentricity-dependent differences in resting state connectivity between V1 and regions brain-wide. This study builds on that work by expanding the pool of participants, using the HCP dataset, as well as investigating any eccentricity-dependent effects that may emerge with tractography. Interestingly, both measures find that foveal areas in V1 are more strongly connected to frontoparietal networks. The study is interesting, but I have a few remaining points.

      1) While during the resting state scans, there was, in theory, no 'task', participants were asked to maintain fixation on the cross in the center of the screen throughout the scan. I think it would be important for the authors to note that there is a possibility that the resting state correlations observed wherein foveal areas were more correlated with frontoparietal regions (and far periphery with DMN areas) could be due to attention directed towards the fixation cross, and away from the periphery. While I acknowledge the authors have no way to test this with this data set, it is possible that if participants had been asked to covertly attend to a ring in their far periphery the entire time instead, the correlations might have been flipped, with frontoparietal connectivity highest in the periphery towards the attended eccentricity. The authors should either explain why this is not a concern, or acknowledge it in the manuscript.

      2) Related to the last point, what was the size of the screen used during the connectivity data acquisition? I ask because the far eccentricity bands determined using Benson et al's technique are very eccentric. And if participants had their eyes open and were fixating, was that eccentricity outside the outer edge of the screen? Because then it would be encouraged to be 'unattended', thereby potentially influencing connectivity results.

      3) Was there any attempt at replicating these results in extrastriate cortex? Are these patterns still there, both in structural and functional connectivity, for V2 or V3?

    1. Reviewer #4:

      PREreview of "Structural characterization of an RNA bound antiterminator protein reveal a successive binding mode" Authored by James L. Walshe et al. and posted on bioRxiv DOI: 10.1101/2020.09.27.315978

      Review authors in alphabetical order: Monica Granados, Katrina Murphy

      This review is the result of a virtual, live-streamed journal club organized and hosted by PREreview and eLife. The discussion was joined by ten people in total, including researchers and publishers from several regions of the world and the event organizing team.

      Overview and take-home message

      In this preprint, Walshe et al. use a structural approach to examine a bacterial RNA-binding ANTAR protein, EutV, including how EutV's antitermination mechanism works to prevent transcription termination and thus regulate gene expression. In addition, the team examined how a single hexaloop with the conserved G4 is recognized in succession by conserved residues in the ANTAR domains, how conserved A1 helps with proper RNA folding, and how these interactions support RNA binding. Although this research is of interest in the field, there are some concerns that could be addressed in the next version. These are outlined below.

      Positive Feedback:

      -I appreciate the comment on how crucial it is to understand the system and structure of these proteins for therapeutic purposes. It helps exemplify the relevancy for people outside of this field.

      -I think it's interesting that there is potential for a new current model for ANTAR-mediated antitermination.

      -I found it interesting that the two domains of the dimer cannot bind to the P1 and P2 helices of the same RNA.

      -New data is used in this preprint and displayed openly in Supplementary Table 1.

      -This research is novel because it's the start of looking at specifics of the mechanisms ANTAR domain proteins use to prevent termination.

      -It will be interesting to look at bioinformatic analyses for the ANTAR domain across diverse bacterial strains. Especially in diverse ecological niches such as host-pathogen.

      -It would be interesting to look at the structure in the context of an RNA construct that includes the P1, P2, and all of the T-loop.

      -I am outside of this field of study; however, the paper contains a lot of detail, which seems to be enough to reproduce the work. Others who may be in the field have said, though, that reproducibility is less likely in this type of work.

      -I'm outside of the field, but it is nice that they deposited the atomic sequences on a public repository. I wonder whether this is mandatory for acceptance?

      -Yes [the results are likely to lead to future research], now that there is more interest in mechanisms that ANTAR domain proteins use for antitermination.

      -Are these findings applicable for similar ANTAR proteins (homologues/orthologues) in other bacteria? What about more complex organisms?

      -Interesting topic!

      -First RNA bound!

      -Yes [I would recommend this manuscript to others and peer review], I think this is a promising manuscript.

      Major Concerns:

      -A lot of the details are included [in the preprint]; lacking, however, is information in the methods section about the modeling of the RNA using RNAComposer. It is mentioned in the results section, but not in the methods section.

      -It's not clear where the EMSA assay is used in the paper. It's mentioned in the methods section, but not anywhere else.

      -I think it would be helpful to see whether ANTAR mutants have anti-termination defects in a transcription reaction. Authors might consider being cautious talking about anti-termination without functional studies.

      Acknowledgments:

      We thank all participants for attending the live-streamed preprint journal club. We especially thank those that engaged in the discussion.

      Below are the names of participants who wanted to be recognized publicly for their contribution to the discussion:

      Aaron Frank | University of Michigan | Assistant Professor, Biophysics and Chemistry | Ann Arbor, MI

      Monica Granados | PREreview | Leadership Team | Ottawa, ON

      Katrina Murphy | PREreview | Project Manager | Portland, OR

    2. Reviewer #3:

      General assessment:

      Antitermination (AT) is a widespread mechanism to regulate transcription and can be mediated by ANTAR domains, which prevent the formation of the terminator hairpin by binding to and stabilising a dual hexaloop motif in the nascent RNA. In the submitted manuscript, Walshe and coworkers address the molecular basis of this AT mechanism, which is largely unknown. They report two crystal structures of the dimeric ANTAR protein EutV from E. faecalis, one of EutV alone and one in the presence of a 51 nt long RNA containing the dual hexaloop motif, and combine these structural data with biochemical and biophysical data.

      The study

      -Reveals structural rearrangements that occur upon RNA binding and provides molecular insights into the RNA binding mode

      -Shows for the first time that a Met residue is obligatory for RNA binding

      -Redefines the minimal ANTAR domain binding motif

      -Suggests a new model for ANTAR-mediated AT

      Thus, the study is a comprehensive work, the experiments are performed thoroughly, and the conclusions are supported by the data. The results are of interest to a broad audience, ranging from the field of transcription in all domains of life to protein:nucleic acid interactions in general.

      However, the authors should address the following concerns:

      1) p 5, lines 15-17: The interactions should be described more clearly, i.e. are the hydrogen bonds between main chain atoms or between side chains? Which atoms/functional groups are involved (e.g. the carboxy group of the side chain of Glu161)?

      2) p 8, line 1-2: The SEC-MALS data indicates that the sample is not homogeneous and the authors suggest that this might be a concentration-dependent effect. This hypothesis is, however, not supported by the data. First, there is no information provided about the concentration used in the SEC run. Second, the SEC run was carried out on a S200 column. The experiment should be repeated on a S75 column, which has a better resolution in the range of interest. Furthermore, the SEC runs should be performed with different concentrations to check if the oligomerization is indeed concentration-dependent, and they could also be used to check if the oligomerization is reversible (i.e. by collecting the "dimeric" form, re-running the solution, and seeing if there is an equilibrium). Finally, as the authors discuss the dimerization behavior/mechanism, they might check if/how phosphorylation influences the oligomerization. These tests are important as this sample was used for the SPR experiments. If the sample, however, is not homogeneous, interpretation of the data might be compromised due to a mixture of different oligomeric states, so that concentrations are not correct or a 1:1 binding model cannot be used (most probably, the concentration of EutV is higher in the SPR experiments than in the SEC run, and if there is concentration-dependent oligomerization this might be a significant issue).

      3) p 8: the chronology of Fig. 2 does not correspond to the chronology of the panels mentioned in the text.

      4) p 11, line 20: the authors state that G4 makes the only base-specific interaction between the protein and the RNA hairpins. However, the details of the interactions are discussed only later in the manuscript, so this conclusion cannot be drawn at this stage. Thus, the authors should present the interaction analysis earlier or adapt their argumentation (maybe by pointing to Fig. 3).

      5) Fig. 3: The interaction network between RNA (bases) and the protein is a very important point in the manuscript. In order to emphasize that only one of the bases, G4, makes base-specific contacts and is thus, most probably, responsible for sequence-specific read-out, a 2D representation of the interaction network should be provided as a Figure Supplement (e.g. using LigPlot).

      6) p. 14: alanine mutagenesis. In order to confirm the importance of G4, the authors might substitute the base with another base and repeat the SPR measurements. Moreover, the quality of the protein samples should be checked (and data should ideally be provided as supplemental material), i.e. are the samples homogeneous (see comment on SEC runs) and are the samples free of nucleic acid contamination (what is the A260/A280 ratio?).

      7) p. 14: EutV binding to P1 and P2 RNA tested by SPR: was the sample homogeneous? (see comment above on SEC runs).

      8) p 14: The authors should comment on the differences in the CD spectra in the region around 220 nm.

      9) p 20, lines 14-23: G4 plays a critical role in sequence-specific recognition. This recognition mode is reminiscent of the mechanism that an operon-specific transcription factor, RfaH, uses. Here, RNA polymerase pauses at a pause site and exposes the nontemplate strand, which forms a hairpin. This hairpin stabilizes the flipping-out of a base in the loop region and allows sequence-specific read-out. Similar to EutV, sequence-specific recognition relies on very few base-specific interactions. However, RfaH binds to DNA. Moreover, the sigma factor also uses a flipped-out residue for recognition, although applying a different mode of stabilization. Thus, a comparison of these recognition modes might be of interest.

      10) p. 22: revised AT mechanism: The proposed model is reasonable and fully supported by the data. Is there a possibility to check the role of the two hairpins in vivo? I.e. if there is a possibility/assay to distinguish between recruitment and AT efficiency, the proposed model could be tested.

    3. Reviewer #2:

      In the manuscript, "Structural characterization of an RNA bound antiterminator protein reveals a successive binding mode," the authors present the solved crystal structure of Enterococcus faecalis EutV by itself as well as bound to its RNA substrate. In previous work, the RNA substrate was proposed to consist of a dual hairpin and the genetics strongly suggested that both hairpins of this feature were crucial to functional antitermination in vivo. The finding revealed by the crystal structure in this work is that the EutV dimer does not appear to bind both hairpins simultaneously. The structure shows one EutV chain binding a hairpin in one RNA molecule and the second binding a second hairpin in a separate RNA molecule. The orientation of the two ANTAR domains is such that it is not possible to bind one RNA molecule simultaneously. Based on their findings, the authors propose a model of successive antitermination in which EutV binds to the first hairpin as it is generated by RNA polymerase and then this somehow favors binding to the second hairpin overlapping the terminator sequence as soon as it is made to prevent terminator formation. My overall assessment is that this is potentially an important and interesting contribution to the fields of transcription termination/antitermination and RNA/protein structural biology. However, there are concerns with how conclusive the data is, how exactly the model can work, and a lack of experimental evidence for the model.

      Major Comments:

      1) One major concern about the structure is that it is of non-phosphorylated EutV bound to its RNA substrate. Two-component system regulators almost always undergo conformational changes upon phosphorylation and therefore I think it is still an open question whether the structure truly represents active EutV bound to RNA. Perhaps the ANTAR binding domains of the EutV dimer change orientation upon phosphorylation such that binding to both hairpins can occur.

      2) If binding does only occur with one hairpin, then why are two necessary for activation? If it is impossible for one dimer to bind both hairpins simultaneously, how does binding to the first hairpin help binding to the second? This is not clearly explained. Also, no experimental evidence is presented to support the model.

      3) Wording of the abstract does not well reflect the final model presented. The abstract makes it sound like the second hairpin is not important, which is not what is shown here or in the previous work. I think the authors should say a bit more about what the actual model is in the abstract to eliminate this misconception.

      4) Ramesh et al. (2012) observed that EutV bound the eutP UTR with a higher KD (less efficiently) when just the P1 loop was used in an EMSA assay compared to P1/P2. This study found the same KD whether P1, P2, or both are used in an SPR assay. Could the difference in these findings be related to the different techniques or the fact that slightly different versions of the EutV protein were used?

    4. Reviewer #1:

      This paper looks at the mechanism of transcription regulation by the ANTAR domain protein, EutV. ANTAR domain proteins are an evolutionarily widespread family of RNA-binding regulators in bacteria. EutV has been proposed to regulate expression of target genes by binding two RNA loops in a 5' UTR, leading to a change in the RNA structure that modulates premature transcription termination. The current study determines the structure of dimeric EutV bound to an RNA target with two binding sites. Surprisingly, the interactions between the ANTAR domains in each monomer and each of the two RNA loops are incompatible with simultaneous binding of one EutV dimer to both loops. Hence, the authors propose a model in which EutV is "handed off" from one loop to the other as the RNA is transcribed.

      The structural information regarding the interaction between the ANTAR domain and RNA is an important advance, although there is very little comparison to previous studies, including a study that identified many of the same residues as being required for RNA binding (reference 33). The evidence that a EutV dimer cannot bind both RNA loops simultaneously is strong, and inconsistent with a previously proposed model of regulation. However, other than the structure, there are no data that support the authors' proposed hand-off model. In fact, as it is drawn in Figure 6D, I don't think the model is possible based on the same structural constraints that prevent simultaneous binding of the EutV dimer to both RNA loops. Without further experiments, I don't think the authors can conclude much about the mechanism other than it being unlikely that a single EutV protein binds both RNA sites simultaneously.

      Major comments:

      1) Throughout the paper, there is insufficient description of previous work on ANTAR domain proteins. In particular, there is little comparison to published structural data, including modeled RNA-bound structures. There is also very little discussion of the mutagenesis in reference 33 that identified many of the same residues as being required for RNA binding. There is no doubt that the structural work in the current study represents a substantial advance over previous studies, but it is important to describe the similarities and differences to prior work.

      2) Discussion, second paragraph. The evidence for a conformational shift in EutV upon phosphorylation is weak. This hypothesis is based on structural modeling from a homologous protein that has only 37% sequence similarity.

      3) The structure does appear to rule out the possibility of EutV binding both RNA hexaloops simultaneously, but the hand-off model is still rather speculative, and not supported by any additional experimental data; binding of two EutV dimers to the same nascent RNA would seem just as likely. There is insufficient discussion of how the hand-off model fits with previous mutagenesis studies (e.g. reference 25), and no follow-up experiments designed to test the model. If EutV is unable to bind both hexaloops simultaneously due to spatial constraints, how is it able to transition from one hexaloop to the other, as depicted in Figure 6D? I would expect the same spatial constraints to apply.

    1. Reviewer #3:

      This study shows how well-mixed populations of yeast cells initially expressing both an anticompetitor toxin and resistance to it first lose toxin production (because there is a cost but no benefit to toxin production when all cells are resistant) and then lose resistance (because there is a cost but no benefit to resistance when no cells produce toxin). Consequently, these evolved sensitive populations have lower fitness than their own toxin-producing (resurrected) ancestors, but only if the toxic ancestors are introduced at a high enough frequency; that is, there is positive frequency-dependent selection. These results are quite intuitive and satisfying, and are well supported by rigorous experiments determining the causal mutations and their selective advantages both within intra-cellular populations of the virus, and between cells in the evolving populations. This was really nice, thorough, and interesting work. However, the overall result is not really surprising, as much similar work has been done before (and is properly cited) in which three types of competitors show non-transitive pairwise fitness relationships.

      The main claim to originality is that the three types here are generated sequentially by two rounds of mutation, natural selection, and replacement/fixation: that is, there is genealogical nontransitivity between ancestors and descendants, rather than just ecological nontransitivity between contemporary co-existing variants. This demonstrates an important principle: that natural selection can produce a decline in overall relative fitness in a lineage over multiple rounds of mutation and fixation. The only other reported example of this in experimental evolution is the work of Paquin and Adams (1983), but the authors here argue convincingly that Paquin and Adams, lacking the benefit of sequencing to identify mutations and their frequencies, inadvertently competed ecological types that were co-existing in their evolving populations and had not fixed.

      My only criticism, then, is that the example of non-transitivity demonstrated here is rather "obvious"; the result is entirely predictable, given the amount of previous work in similar microbial systems. However, this is countered by the fundamental nature of the question for evolutionary biology, and the lack of specific experimental examples, apart from the very old Paquin & Adams. Overall, then, I am satisfied that this paper is a significant step forward. I found it well written, interesting, and the conclusions were well supported by careful and thorough experiments.

    2. Reviewer #2:

      The findings presented in this manuscript are really exciting. They show that selection is happening at multiple scales - among viruses within a cell - and between their host cells within a population. The conflict between these levels of selection results in evolved populations that are less fit than the ancestors. This result is exciting because it happens repeatedly in independently-evolving populations, showing that it can be a general result. It is also an example of how a non-transitive interaction can evolve de novo, as the authors claim in the manuscript. The experiments seem to rule out most alternative hypotheses. However, the authors could explain their reasoning more clearly in some cases.

      1) In particular, I found it difficult to understand some of their conclusions on page 9, in the first paragraph around lines 210 - 219, without rereading, rewriting results, and lots of thinking. On lines 211-213, they state that "production of active toxin or maintenance of the virus has no detectable fitness cost to the host". There are a lot of comparisons to think through here to get to that conclusion, and I think the average reader needs to be taken through that. Even though I have some experience thinking about costs and how they can be estimated, I still spent quite a lot of time trying to follow the logic from figure A to that statement. In fact, I still do not understand how they are distinguishing between 'production of active toxin' and 'maintenance of the virus'. I also had to spend a lot of time thinking through the results in figure 3 and the conclusion stated on line 217.

      2) I think it would be helpful to the reader, and interesting, if there were more of an explanation about WHY K+|+ cells have positive frequency-dependent fitness relative to K-|- cells. Why is the presence of an active virus and immunity more beneficial at higher frequencies?

    3. Reviewer #1:

      Buskirk et al. examined the evolution of nontransitive fitness effects in yeast. They showed that during evolution in rich glucose medium, a late clone (1000 generations) outcompeted an intermediate clone (300 generations), but lost in direct competition with the ancestor (in a frequency-dependent fashion: late clone when rare loses to ancestor and when abundant outcompetes ancestor). This is due to adaptation in the nuclear genome and intracellular killer virus. Essentially, the ancestor expresses both killing and immunity phenotypes (K+I+), the intermediate clone expresses immunity (K-I+), and the late clone expresses neither (K-I-). This trend is observed in many evolving populations. In the absence of the killing interaction, virus does not affect host fitness. That is, when killing interactions are absent, fitness changes are due to mutations in the nuclear genome. Changes in killing and immunity phenotypes are driven by intracellular competition of viruses where viruses defective in killing and/or immunity have an advantage over functional viruses.

      This work demonstrates that evolution may not be a simple linear march of progress. Rather, progress over short time scales can sometimes lead to a reduction of fitness over the longer time scale due to ecological interactions. I find the work quite interesting, although I also find it a bit incomplete.

      What are the nuclear mutations that made intermediate clones more fit than ancestor and late clones more fit than intermediate clones? I think that giving one example for both cases will be helpful.

      A schematic summary figure will be helpful.

    1. Reviewer #3:

      The manuscript by Morcom et al. describes mechanisms of corpus callosum dysgenesis in mice and how they relate to humans. It will be of interest to the field. It explains the spectrum of disorders of the corpus callosum in humans. It is an important study that sets the focus on midline populations and away from axonal navigation as the main source of corpus callosum dysgenesis.

      The authors found that a mutation in Draxin carried by certain mouse strains is responsible for the heterogeneity of corpus callosum phenotypes found in these mice. Draxin mutations interrupt the normal remodeling (closing) of the interhemispheric fissure necessary for callosal axons to cross. The phenotypes in the mouse are very similar to what is found in humans, and also variable, perhaps related to stochasticity in the mechanisms involved, or to the dependency on other allelic variants. The findings are important to understand what mutations cause CCD in humans and how, mechanistically, it occurs. The authors found that Draxin mutation misregulates astroglial and leptomeningeal proliferation. Mechanistically, how this more precisely affects interhemispheric remodeling is still unclear. Addressing this point may reinforce the work.

      Major concerns:

      1) The authors have done an excellent job identifying the mutation and characterizing and comparing in detail the phenotypes in mice and humans. They also provide very interesting hints about how Draxin regulates the remodeling of the interhemispheric fissure. But mechanistically, their findings only offer an incomplete view. In my opinion, the findings would be reinforced by digging deeper into how, cellularly or molecularly, Draxin makes glial and leptomeningeal cells remodel the interhemispheric fissure. Proliferation by itself does not seem to explain the phenotypes. The model that they are proposing is not fully clear. Does it affect cell-cell adhesion, cell-cell signaling, membrane processes, metalloproteinase activity? Perhaps they could further characterize the morphology and junctions of the affected cells or perform some studies in acute models or in vitro.

      Minor comments:

      Fig 4C: the expression patterns of Draxin mRNA in C57 and BTBR do not seem as similar as stated in the description of the results.

      Fig 4D: the full version of the western blots shown in the supplementary material, showing all forms, is more informative than the cropped versions shown in the main figure. Please indicate molecular weights.

    2. Reviewer #2:

      This is an interesting study that provides convincing evidence that a Draxin mutation underpins forebrain commissure phenotypes in BTBR mice and crosses.

      The use of BTBR x C57 N2 crosses where commissure phenotype is correlated with the Draxin mutation (Figure 5) is a nice illustration of unpicking variable penetrance. The phenocopy of the BTBR/C57 phenotype to Draxin mutants is a nice confirmatory experiment.

      Further, analysis of midline fusion shows that problems in MZG proliferation and hemisphere fusion are prevalent in BTBR mice supporting the hypothesis that Draxin is needed for midline fusion.

      MRI scans of human subjects with a spectrum of CC abnormalities show that commissure abnormalities correlate with midline fusion defects.

      Major comments.

      1) As a central contention of this study is that variable penetrance of the commissure phenotypes in the BTBR x C57 mice stems from an earlier midline fusion phenotype, it would have been useful to see if the (embryonic) midline fusion phenotype also showed the same partial penetrance in BTBR x C57 mice, perhaps also correlated with the WT/MUT Draxin alleles (as in Figure 5). This would be a testable prediction of the hypothesis that midline fusion (and not something else) mediates the Draxin phenotype.

      2) I am not sure the human data adds substantially to the paper as it is not related to Draxin mutations. It is already well known that corpus callosum phenotypes are variable in humans (and mice).

      Minor comments:

      Some of the data are not normally distributed (particularly clear for pink data points in Fig 5a,e,i,m), so it is not appropriate to show standard errors (the SEM bars could simply be removed). A non-parametric Kruskal-Wallis ANOVA has been used, which is appropriate.

    3. Reviewer #1:

      This is an interesting translational and comprehensive study which examines cellular and genetic mechanisms involved in the diversity of corpus callosum dysgenesis (CCD) phenotypes. Using mouse models and human cohorts with a spectrum of CCD, it is found that the extent of aberrant interhemispheric fissure (IHF) remodeling predicts commissure dysgenesis severity. Elegant neuroanatomical experiments show that abnormal proliferation/migration of midline zipper glia (MZG) progenitors underlies aberrant IHF remodeling. Thus, in addition to genetic perturbations linked to aberrant callosal axon guidance in humans and mice (i.e. variants in DCC guidance cue receptor gene), disruption to IHF remodeling also causes CCD. Indeed, an 8-base pair deletion in the DCC receptor ligand, Draxin, which is expressed in MZG, associates with CC malformations in mice. The findings are novel and important to both basic and clinical scientists.

      Below are comments and suggestions that need to be addressed:

      1) Introduction:

      -More detailed information about the BTBR mouse line and the rationale for using the BTBR x C57 mouse cross should be provided.

      -The main question addressed in the study should be clearly stated.

      2) Methods:

      The Statistical analysis section needs to provide a more detailed description of the statistical tests that were used and the reason why these tests were chosen.

      3) Results:

      In general, the description of the statistical results lacks important details. For example:

      -For figure 1, there is very little information about statistical analysis. For figure 1 C, it needs to be explained why a Welch test was used instead of a one-way ANOVA. The error bars do not seem to correspond to SEM; this needs to be clarified.

      -For figures 3 G and H, if the data are presented in single graphs, it is not clear why unpaired t tests or Mann-Whitney tests were conducted (instead of ANOVAs). Why a non-parametric test was used is not explained.

      -The description of the findings that prompted the authors to investigate the role of Draxin in CCD needs to be clearer.

      -The references to the different panels of Figures 5 and 4 need to be revised in the Results section.

      -It is not clear what the impact of the Draxin deletion on IHF remodeling is. There seems to be an effect shown in one of the supplementary figures (in BTBR mice), but there is no discussion in this regard. This is particularly important considering that Draxin is expressed by MZG.

      -It seems that the Draxin deletion does not affect HC formation. However, at some point in the Results section it is stated "To investigate how DRAXIN regulates CC and HC formation...". This is confusing. It seems that the effect varies between BTBR mice and the BTBR x C57 cross, but this is not discussed clearly.

      -Figure 7 should indicate the mouse genotype on the actual figure to avoid confusion.

      -The study by Vosberg et al, 2019 in Annals of Neurology needs to be included when referring to studies linking DCC variants and CC dysgenesis in humans.

      Minor Comments:

      The organization of the manuscript could be improved to increase its clarity. The authors may want to consider moving the Draxin findings to the last part of the Results.

    1. Reviewer #3:

      This work by Katada and colleagues uses M4 and 5B transgenic lines to express ChR2 in starburst amacrine cells (SACs) and retinal ganglion cells (RGCs). It finds that ChR2 activation in SACs improves the ChR2 response in RGCs. Thus, in a gene therapy strategy that expresses optogenetic proteins in RGCs, SACs may be considered as a helpful additional target. The rationale of the manuscript basically regards RGCs as a uniform population and disregards all amacrine cells except SACs. If differences in RGC and amacrine subtypes are taken into consideration, some conclusions of this manuscript should be revised.

      Major comments:

      1) This manuscript makes one assumption: that the RGCs in M4-ChR2 and 5B-ChR2 have comparable ChR2-evoked responses if activated alone, and thus the difference between their ChR2 responses is entirely attributed to the activation of extra SACs in the M4 line. Yet there is no experimental evidence to support this assumption. Both M4-YC and 5B-YC label ~35% of the RGCs, consisting of multiple subtypes, but the subtype compositions of the two populations are not shown. ChR2 response properties of a neuron may be influenced by its own ion channel composition, which differs between cell types. The authors need to either a) show the 2 mouse lines label identical subsets of RGCs (unlikely, given FigS6E), or b) compare M4 line with or without coactivation of SACs to single out the effect of SACs.

      2) The experiment results using rAAV (Fig4) are hard to interpret:

      a) CAG promoter directs expression in most cell types. So other amacrine (Fig4D) and RGC cell types in addition to SACs and M4/5B RGCs are also infected. Comparison between rAAV/M4/5B retinas cannot provide clean insight into the effect of SAC.

      b) The manuscript makes comparisons within the rAAV experiments (Fig4I-K FigS8F-H), trying to link induction efficiency into SACs with visual restoration. However, it is a given that higher infection in RGCs also leads to better visual restoration. So SAC effect cannot be separated from RGCs (Fig4J-K FigS8G-H).

      c) The one exception shown in Fig4I and FigS8F, where SAC infection rate is linked to maintained/peak ratio while RGC infection is not, has two caveats. First, the authors acknowledge that higher maintained response may not causally link to better restoration (line 235). Second, the same correlational analysis for other AC types is missing.

      d) At this stage, a simpler interpretation of the results is equally plausible: that higher infection in all retinal neurons (regardless of type) is correlated with better restoration.

      3) M4-ChR2 retina has a very weak OFF response to regular light stimulus, but 5B has a normal ON/OFF ratio. The authors speculate that SACs are responsible for this difference. But one observes that M4 labels mostly OFF RGCs while 5B labels equal numbers of ON and OFF RGCs (S3 and S6E, lamination patterns of M4 and 5B), so there is a simpler explanation: RGCs that express tet-ON ChR2 are no longer very responsive to regular light stimuli. If that is true, and these cells are very unhealthy, then comparison of their ChR2 responses becomes less meaningful. The authors need to address the cell health problem caused by tet-ON ChR2 expression.

      4) Only a few RGC subtypes form synaptic connections with SACs in the rodent retina. Thus, the effect of SACs would be limited. In the case of primate retina, ChAT-positive neurons are much fewer, so their effect in ChR2 gene therapy is likely even more limited.

      5) Lines 154-155: an equally likely explanation: M4 contains ON and ON-OFF DSGCs, which are known to be important for OKR, whereas 5B does not. This possibility needs to be considered.

    2. Reviewer #2:

      This paper presents the results of a study of optogenetic visual restoration. ChR2 was targeted either to a subset of ganglion cells (GCs) or to a subset of ganglion cells (not necessarily the same ones) plus starburst amacrine cells (SACs) using an intersectional genetic strategy. Photoreceptors were ablated using MNU in animals expressing ChR2, and then retinal and whole animal responses to visual stimuli were assessed. Interestingly, co-expression of ChR2 in SACs and GCs resulted in different, potentially more "naturalistic" responses than expression in GCs alone. This is an interesting result, but given the number of possible explanations for it, the lack of any rigorous investigation of the underlying mechanism is problematic. Results presented by the authors indicate that ACh release from stimulated SACs acts upon some network(s) containing electrical synapses and presynaptic to the GCs to alter GC responses, but the identities of these network(s) remain unknown. Given that ACh is considered to act in a paracrine manner within the retina, the affected cells could be any number of amacrine or bipolar cells.

      There are a number of lines of investigation that the authors could pursue to identify, or at least rule out, specific presynaptic networks. While too numerous to discuss individually, potential lines of investigation could differentiate nicotinic from muscarinic effects and effects on inhibitory and excitatory inputs to ganglion cells. As well, it would be important to express ChR2 in SACs alone to see if this drives changes in GC spiking.

      In all, the authors here have the opportunity to examine the effects of paracrine signaling by SACs on inner retinal network excitability and function using a nice model system, and they should take advantage of it.

    3. Reviewer #1:

      The authors use a tetracycline-controlled gene expression system to compare the effectiveness of two different promoters to express channelrhodopsin in different populations of retinal neurons with the goal of rescuing visual function in mouse models of photoreceptor degeneration. The expression patterns of two promoters were compared: the first, a muscarinic AChR promoter (referred to as M4 in the manuscript), led to expression in a subset of RGCs and a subset of amacrine cells, while the second, a 5-HT receptor promoter (5B), led to expression in a subset of RGCs only. In the M4 line, the labeled amacrine cells were starburst amacrine cells located in the INL, and only a subset of these INL-SACs; the SACs displaced in the GCL were not labeled. To assess the impact of these different expression patterns on vision restoration, mice expressing ChR under these two different promoters were treated with MNU to induce PR degeneration. The light responses restored by ChR were assessed with MEA recordings, cortical VEPs, and behavior. The M4 line showed stronger light-evoked responses. The authors used pharmacology to assess how the M4 retinal circuit might explain the enhanced light response.

      There were several fundamental problems with the manuscript that need to be addressed. These problems range from experimental design and interpretation of findings to mistakes in the description of retinal circuits. Moreover, there is no context given comparing these results to the multiple other studies on the impact of vision restoration on visually guided behaviors. These problems are listed here:

      1) The choice of promoters and expression patterns needs to be further explored. The motivation for choosing a particular subtype of mAChRs and 5-HT receptors is not given. Though M4 and 5B drive expression in roughly the same percentage of total RGCs, there is no way to know whether they drive expression in the same subtypes of RGCs. Hence differences in firing patterns are not likely to be fully explained by the fact that the M4 promoter also drives expression in a subset of INL-SACs.

      2) The observation that M4 drives expression in a subset of OFF SACs was quite intriguing. Though there are ways to distinguish ON from OFF SACs, this is the first example of which I am aware in which a subset of OFF-SACs is labeled. Does this mean only a subset of OFF-SACs have mAChRs? Or was this reflective of the partial expression induced by Tet? It is worth the authors quantifying the percentage of OFF-SACs labeled in the M4 mouse line.

      3) The observation that they are able to rescue the OKR in MNU-treated mice using the M4 promoter is impressive. Again, the authors conclude that this is due to the presence of ChR in INL-SACs, but it could be that they also have ChR expression in direction-selective ganglion cells themselves. Hence, although the rescue is impressive, it is difficult to interpret. Also, this important behavior is confined to a supplemental figure.

      4) The authors conclude that M4-driven expression of ChR rescues the OKR in MNU-treated mice and not rd-mice because the rd mice have a "thinner INL" and therefore may have a depletion of INL-SACs. This appears to be an easy test for the authors using immunofluorescence.

      5) The authors do some pharmacology to test whether SACs are the basis of the larger sustained response observed in M4 vs 5B. However, the assumptions/interpretations for these experiments are based on some mistakes regarding retinal circuits. SACs release GABA and acetylcholine. However, the pharmacology they do is quite limited. Namely, they use TPMPA, which blocks GABA-C receptors; these are found on a subset of bipolar cell terminals and by no means represent the major source of GABAergic signaling in the retina, which is via GABA-A and GABA-B receptors. Similarly, the authors assess the impact of ACh release by using atropine, which blocks muscarinic receptors but not nicotinic ACh receptors. Finally, the authors use MFA, a blocker of gap junctions, which does have a clear impact on sustained responses. However, SACs are not thought to be gap junction coupled to anything. So, it is more likely that MFA is acting via RGC-RGC gap junction coupling or having an off-target effect. Much more needs to be done to have a complete understanding of the circuits that mediate the ChR-mediated light responses.

      6) 226-227 - what is the conclusion - results suggest not due entirely to gene transfer? This needs further explanation.

      7) The comparison of light-induced responses in MNU-treated vs non-MNU-treated conditions is rather confusing. The authors should consider revising this point.

    1. Reviewer #3:

      In this manuscript, Santos and Sirota demonstrated that the fast in vivo choline dynamics detected using choline oxidase-based biosensors are strongly correlated with, and likely caused by, phasic oxygen dynamics in vivo. The authors developed a novel tetrode-based amperometric choline oxidase (ChOx) sensor that can simultaneously measure ChOx and O2 levels within the same tetrode, which enabled the authors to observe strong correlations between ChOx and O2 levels in vivo (in behaving rats and mice, and under several distinct behavioral contexts). To dissect the causal relationship and determine the role of phasic O2 transients, the authors further combined in vivo as well as in vitro perturbation experiments to demonstrate that phasic fluctuations in O2 concentration can lead to fluctuations in ChOx measurements. Moreover, mathematical modeling recapitulates the systemic relationship between ChOx and O2, suggesting the source of this coupling stems from non-steady-state enzyme kinetics. Together, these findings challenge the long-held belief that ChOx sensors can measure sub-second temporal dynamics of choline concentrations in vivo, and also call for critical re-evaluation of all oxidase-based biosensor literature to determine the contribution of phasic O2 dynamics in vivo.

      The study provides extensive evidence to support their claim: correlational, causal, analytical and modeling. The authors employed multiple levels of approaches, from the development of novel biosensors that leads to the observed correlation, to careful in vivo and in vitro perturbation experiments to demonstrate causal relationship. The data is carefully analyzed, and elegantly matched with modeling results. The results of this study have broad implications beyond the ChOx literature and in fact challenge the entire literature on oxidase-based biosensors.

    2. Reviewer #2:

      This is an important piece of work addressing in-vivo measurements where two coupled components are to be measured ideally in the same time and space components. Unfortunately, the impact of this work is likely to be minimized due to its poor organization and the attempt to deal with a number of separate but related issues in the same manuscript. Accordingly, it is suggested that this work be divided into two manuscripts to be published together. The focus of the two might be:

      A) A Tetrode-Based Microsensor (TACO) - This work would focus on the criteria for performance that would be expected for a new sensor, presumably with new and unique properties. This work would include differential plating of electrodes (it is unclear whether some or all of this work has been previously reported), dimension considerations, and the simulation of sensor response. (An important but frequently overlooked consideration is that a sensing element with a 17 µm diameter will exhibit hemispherical diffusion (Eq. 4).) Other issues such as interferences, stability, sensor response time and linearity need to be more fully explained. Presumably such a sensor configuration would be useful in other applications involving oxidase-based sensors.

      B) Effects of Local Field Potential and Oxygen-evoked ChOx transients in the In-Vivo Measurement of Acetylcholine in Freely Moving Rodents (A better title can surely be presented!) - Here the focus should be on the in-vivo measurements, including a qualitative explanation of the LFP and O2 response and how the TACO sensor corrects for this. Presumably REM and NREM sleep will be detected by EEG. This is not well explained. Also unclear is how the improved performance of the sensor affects the conclusions of the in-vivo studies.

    3. Reviewer #1:

      Santos and Sirota developed a novel Tetrode-based Amperometric ChOx (TACO) sensor. This multichannel configuration can simultaneously measure the ChOx activity (COA) and O2 in the same brain spot. Using the TACO sensor in freely-moving and head-fixed rodents, they found that COA and O2 dynamics follow locomotion in the active state and hippocampal sharp-wave/ripple (SWR) complexes during the quiescent state. It's interesting that the COA signal can be calibrated by subtracting the pseudo-sentinel signal from the Ch-sensing site signal of the TACO sensor. However, the COA signal is confounded by phasic O2 fluctuations, so the authentic changes in COA are obscured by O2-evoked enzymatic responses. This question isn't addressed in this paper.

      Major concerns:

      1) The authors found that the COA readout is confounded by phasic O2 fluctuations in in vitro and in vivo experiments. These results cast doubt on the validity of the cholinergic responses measured in freely-moving or head-fixed rodents. These findings seem to generalize to other oxidase-based biosensors, and the authors include some discussion of how to address this question. However, authentic cholinergic dynamics cannot be obtained in vivo with the TACO biosensor unless its O2 dependence is resolved. So, the authors should try to address this question.

      2) The authors should demonstrate how to correct for movement artifacts in freely-moving rodents.

      3) Concerns about selectivity. Figure 1E shows that the TACO sensor also responds to dopamine and ascorbate. The authors should demonstrate the selectivity of the TACO sensor for different monoamines at different concentrations.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 4 2020, follows.

      Summary

      This study combines high resolution imaging experiments with mechanical modeling to elucidate the energetics of flagellar propulsion and understand the role of internal dissipation in this system. The experiments use mouse sperm cells that are chemically tethered to a glass slip. For each cell, the flagellum shape is imaged over time and segmented into a mathematical curve. This data is analyzed based on a planar Kirchhoff rod model that includes hydrodynamic drag forces (based on resistive force theory), bending elasticity, and an unknown active moment density. An energy balance is written that also includes internal viscous dissipation generated inside the flagellum, with an ad hoc internal friction coefficient. By calculating the various terms in the energy balance based on the reconstructed filament shapes, the authors are able to estimate the active power density along the flagellum. This calculation leads to two unexpected findings: (1) the authors find that the active power density can be negative along some portions of the flagellum, meaning that along these portions the dynein motors act against the local deformation of the structure, and (2) the main origin of dissipation in the system comes from internal dissipation, which exceeds viscous dissipation in the fluid in magnitude.

      Essential Revisions

      1) It is not completely clear from the manuscript what the configuration of the sperm is with respect to the glass slide where the head is tethered. What is the orientation of the cells with respect to the slide, and in which plane are the deformations measured? (from above or from the side?) We would expect that different configurations may lead to slightly different waveforms. In particular, we are surprised that the mean shapes shown in figure 2(a) have a net asymmetry which is observed in nearly all the cells: could this have to do with the relative configuration of the flagellum with respect to the surface?

      2) The experiments are done with flagella very near a no-slip surface, since the cells are chemically adhered to the chamber boundary. Yet, the authors use resistive force theory for filaments in free space, without any reference to the nearby no-slip surface. As the rate of energy dissipation near the surface will be considerably larger than estimated by RFT, it is possible that some (or much, or perhaps all) of the additional dissipation found by the authors is actually within the fluid and simply not accounted for by RFT. Thus, all of the calculations must be redone with the appropriate Blake tensor for stokeslets near a no-slip wall before the results can be considered definitive. The paper must also more carefully illustrate and quantify the proximity of the flagella to the surface in order to make these calculations precise. Absent this analysis, the claims of the paper do not stand up to scrutiny.

      A related point is the need to understand the effect of tethering the cell on its kinematics and energetics. In other words, do the conclusions still hold for freely swimming cells?

      3) Is there any evidence of 3D dynamics? Some recent experiments with human sperm have suggested that sperm beats can take place in 3D (Gadelha et al., Science Advances 2020). As the model in the paper is 2D, this could also affect the energy balance.

      4) The authors should examine the work of K.E. Machin ["The control and synchronization of flagellar movement", Proc. Roy. Soc. B 158, 88 (1963)], which provided the first theoretical formalism to study active moment generation within beating flagella based on examining the difference between known force contributions from viscous dissipation and elastic bending. It seems that this same kind of analysis could be done here to identify directly the non-viscous contribution, rather than having to postulate a particular form.

      Stated another way: Why not try to estimate the active power density directly from the active moment density, which could be calculated from the moment balance of equation (4) where all the other terms are known? This would provide a direct estimate of the active power. The force balance could then be used to estimate the internal friction, which would then no longer rely on an assumed value for the internal friction coefficient. In fact, this could be used to obtain an estimate for that coefficient.

      5) The paper addresses in detail the use of Chebyshev fitting methods for the filaments, but does not appear to address the physical boundary conditions one would expect on elastic objects (particularly at the free end), involving the vanishing of moments and forces. Unlike, for example, the biharmonic eigenfunctions of simple elastic filament dynamics which are tailored to those boundary conditions [see, e.g. Goldstein, Powers, Wiggins, PRL 80, 5232 (1998)], it is not clear how the Chebyshev functions satisfy those conditions. Some explanation is needed.

      6) If indeed internal dissipation dominates, that would suggest that essentially all prior theoretical approaches to calculating sperm waveforms must be quantitatively in error by very large factors. It would be very appropriate for the authors to examine some of those theoretical works to determine if this is the case.

      7) The authors note in the Discussion that the beating waveform changes dramatically in fluid with higher viscosity. Yet, if external dissipation plays such a small role how can this be rationalized?
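
      Point 2 above turns on how the fluid dissipation term is computed. As a rough illustration only, and not the authors' code, the sketch below evaluates the resistive-force-theory dissipation rate for a discretized planar filament. The function name, the drag coefficients `xi_par` and `xi_perp`, and the toy geometry are all assumptions made for this example. Because the estimate is linear in the drag coefficients, any underestimate of the near-wall drag carries over proportionally to the fluid dissipation. A proper treatment would use the wall-corrected hydrodynamics requested above rather than a simple rescaling.

```python
import math

def rft_dissipation(points, velocities, xi_par, xi_perp):
    """Viscous dissipation rate of a planar filament under resistive force theory.

    points:     list of (x, y) centerline positions
    velocities: list of (vx, vy) velocities at the same positions
    xi_par, xi_perp: hypothetical tangential and normal drag coefficients
                     (force per unit length per unit velocity)
    """
    power = 0.0
    for i in range(len(points) - 1):
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        ds = math.hypot(x1 - x0, y1 - y0)          # segment length
        tx, ty = (x1 - x0) / ds, (y1 - y0) / ds    # unit tangent
        # Segment velocity taken as the average of its two end points.
        vx = 0.5 * (velocities[i][0] + velocities[i + 1][0])
        vy = 0.5 * (velocities[i][1] + velocities[i + 1][1])
        v_par = vx * tx + vy * ty                  # tangential component
        v_perp = -vx * ty + vy * tx                # normal component
        power += (xi_par * v_par ** 2 + xi_perp * v_perp ** 2) * ds
    return power

# Toy case: a straight filament of unit length moving sideways.
points = [(0.1 * i, 0.0) for i in range(11)]
sideways = [(0.0, 2.0)] * 11
free_space = rft_dissipation(points, sideways, xi_par=1.0, xi_perp=2.0)
# Same motion with both coefficients scaled by a hypothetical factor of 1.5.
scaled = rft_dissipation(points, sideways, xi_par=1.5, xi_perp=3.0)
print(free_space, scaled)
```

      In this sketch the ratio of the two printed values equals the scaling factor applied to the coefficients, which is why the choice of drag model matters for the conclusion that internal dissipation dominates.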

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on November 8 2020, follows.

      Summary

      The manuscript from Perez-Garcia et al. follows up on a prior study by the same authors in which they identified the tumor suppressor BAP1 as a regulator of mouse placentation and trophoblast stem cells (TSCs) (Perez-Garcia et al., Nature, 2018). In their preceding work the authors showed that CRISPR-mediated knockout of BAP1 in TSCs results in upregulation of key stem cell markers Cdx2 and Esrrb and biased differentiation towards trophoblast giant cells at the expense of the syncytiotrophoblast lineage. Here the authors have expanded on these observations by demonstrating that BAP1 modulates the epithelial-to-mesenchymal transition (EMT) in TSCs and that a similar phenotype can be obtained by genetic deletion of Asxl1/2. Declining protein levels of BAP1 during differentiation of human TSCs into extravillous trophoblast suggest that the role of BAP1 may be conserved in humans. As the molecular mechanisms of trophoblast development, including EMT and invasive behaviors of trophoblast giant cells in the mouse and extravillous trophoblast cells in human are only beginning to be understood, this study provides an important advance.

      This is a well-written and technically sound study that clarifies the role of BAP1 in trophoblast development. Overall, the work presented is very important to the fields of EMT and trophoblast stem cell biology, and it warrants publication in eLife in principle. However, the claims in the abstract, the model in Figure 5, and the conclusions in the discussion are not well-supported. Therefore, additional experimental work will be essential for the manuscript to become suitable for publication in eLife.

      Essential Revisions

      1) There was consensus that the current manuscript lacks functional data to demonstrate conservation of BAP1/ASXL1/2 function in human TSCs. These are crucial claims in the abstract that are not supported, and some elements of these claims are necessary for the manuscript to have impact beyond the previous Nature 2018 publication. Currently, the studies in human TSCs are purely observational (Fig. 6D-E). The authors should employ genetic approaches to interrogate whether the functions of BAP1 in TSC self-renewal and differentiation are truly conserved between mouse and human.

      2) The main takeaway from Figure 2 is that BAP1 is dispensable for mouse TSC maintenance and that BAP1 knockout results in increased expression of stem cell markers Cdx2 and Esrrb. Both of these findings were previously reported in the authors' 2018 paper (see Fig. 4b in the Nature paper). Therefore, the statement that "BAP1 deletion does not impair the stem cell gene regulatory network" is not surprising and the authors should state clearly that these experiments confirm their prior observations.

      3) The overexpression data in Figure 4 is difficult to interpret. Vector transduced TSCs show a tight, epithelial morphology (Figure 3A), whereas the NT-sgRNA control cells look like they are undergoing EMT (Figure 4C). Why does the introduction of the NT-sgRNA induce EMT characteristics? Bap1 sgRNA1 cells seem less epithelial than the Vector transduced cells. Do NT-sgRNA TSCs have less BAP1 than Vector transduced TSCs?

      4) Moreover, all the data in Figure 4 are based on a single sgRNA that could activate BAP1 expression. To exclude off target effects, the authors should confirm the effect of BAP1 overexpression using another sgRNA or cDNA overexpression system.

      5) The authors need to examine the gene expression data more closely as well as the functional consequences of BAP1 overexpression on TSC proliferation and differentiation. In particular it would be important to compare the lists of DEGs in the BAP1 KO and overexpression conditions. Are they mirror images or are there differences? For example, Zeb2 expression is strongly upregulated in the BAP1 mutant line but not significantly altered in cells overexpressing BAP1. This should be discussed.

      6) In the abstract, the authors state that BAP1 function during trophoblast development is dependent on its binding to Asxl1/2/3. However, the data presented in this manuscript do not address whether BAP1 and Asxl1/2/3 are indeed part of the same complex in TSCs. Furthermore, the fact that Asxl1/2 KO increases expression of syncytial genes (Fig. 5) does not provide direct evidence of functional synergy between these proteins and BAP1. This conclusion could be strengthened by demonstrating that Asxl1 and BAP1 indeed have a protein-protein interaction in TSCs and/or by deleting the BAP1 binding domain in Asxl1/2. It would also be instructive to examine whether the phenotype of BAP1 overexpression in TSCs (e.g. gain of epithelial features and reduced invasiveness) is dependent on Asxl1. This could be examined by overexpressing BAP1 in Asxl1-deficient TSCs.

      7) In some cases, experiments are carried out to "confirm" and "corroborate" hypotheses rather than test them. For example, the similarity between the gene expression signature of Bap1 mutant murine TSCs and that of Bap1 mutant melanocytes and mesothelial cells is shown and emphasized. One wonders how unique this similarity is. Is Bap1 expression modulation observed in other EMT processes during development or in cancer? This should be explored and discussed.

    1. Reviewer #3:

      This manuscript attempts to address a timely question about animal social networks - what is their functional resilience to human-induced disturbance? The authors use association data from savanna elephants to construct empirical and virtual networks and assess how these change after virtual removal of individuals based on their age or network position (to simulate poaching events as real-world data were not available). Simulation studies require clear statements of caveats for interpreting the results as they only predict potential direct responses of a network and cannot account for the dynamic and indirect responses that are more likely to occur in nature. Here various network metrics are used to infer functionality, but critically, these are not supported by field data or citations (either from elephants or other study systems), and furthermore the relevance of the metrics to address structure vs. function is unclear to readers less familiar with SNA. Secondly, the motivation for the study is deeply embedded in elephant biology and would benefit a broader audience with a clear introduction to structural vs. functional resilience.

      1) Applicability of simulation studies

      The study sets out to test the functional resilience of elephant networks after simulated poaching events because real-world data were not available (to the authors). There are many caveats for applying the results of network simulations to real-world data because they rarely can take indirect and dynamic responses into account (unless these data are used to inform the simulation), see Shizuka & Johnson Behav Ecol 2020 for a nice review of this point. The authors allude to this in the discussion when they discuss the need for more dynamic models, but conclude by stating the need to work more collaboratively - this is a good point and I'm sure it's true, but there really needs to be a clear statement about the applicability of these simulated results in the introduction and upfront in the discussion. This is essential to avoid inadvertently misleading readers less familiar with these methods.

      2) Network measures need greater empirical support and explanation

      As this is a simulation exercise, it is essential that the network metrics are meaningful in this context. This is especially important given recent discussion of metric hacking in social network analysis studies (e.g. Webber et al. Anim Behav 2020). At present, some of the metrics are presented in a paragraph in the Introduction with vague support e.g. line 281 - "Each of these heuristics... SHOULD change drastically...", and all 7 are in table 1 but there are no references (either from elephants or even broadly-speaking from studies on networks) to support the major assumptions of the study. Refs are given in the table caption but it is unclear what these relate to. There have been some very interesting experimental studies on functional resilience which might help in this regard. E.g. Maldonado-Chaparro et al. 2018 PRSB used captive zebra finches to experimentally test foraging efficiency (i.e. functionality) of social groups after repeated disturbances to their networks, and as here, focused on functional change immediately after disturbance (e.g. line 172-73).

      More importantly, it is unclear which of the 7 metrics are supposed to inform us explicitly about structure vs. function or whether these can even be unambiguously disentangled - e.g. is clustering coefficient structure or function? It is used in both this study and by Goldenberg et al. 2016 that is introduced here as focusing only on structural resilience. It would be very helpful to have clear statements about the metrics and predictions regarding structural vs. functional resilience. At the moment they vary throughout the manuscript, e.g. referred to as metrics of social competence in the discussion (line 543). Sorry for my confusion, but there are so many different ways that we can derive metrics from networks that justifying these clearly is critical for the conclusions of the study.

      3) More succinct presentation of the knowledge gap and its broader implications beyond elephant biology

      At present, the study is presented with elephant biology and conservation as the core motivation, yet the concept of functional resilience is fundamental for studies of any species where social connections influence the flow of information (and presumably fitness of individuals). The introduction is extremely long (10 paragraphs over 6.5 pages) and functional resilience is not introduced and defined until the end of the Introduction's 4th paragraph and its link to broader literature is confusing. Focusing the introduction on how/why structural and functional resilience may vary in networks (and how this can be inferred from network metrics), and then using elephant biology as an example for why this is relevant to study, might make it much easier to follow.

    2. Reviewer #2:

      The manuscript represents a lot of hard work on an interesting topic. Understanding how threatened populations are impacted by human-derived processes is critical, and requires more study. However, as it stands, the study suffers from some logical flaws that detract from the scientific insights that can be gained from this study. These are:

      1) The authors argue that older individuals are important repositories of ecological knowledge, which is now well-established knowledge. However, the authors then build their study around the consequences of poaching in terms of the effects on network metrics that are assumed to correspond to transmission properties. The logical problem here is that removing ecological knowledge from a network leaves nothing to transmit; hence the transmission properties of the network are inconsequential.

      2) Linked to this point is the issue that the results and discussion focus a lot on the concept of network transmission, but the study uses network metrics (e.g. diameter) as proxies of transmission properties. It is pretty well known that there are many factors (e.g. clustering coefficient) that contribute to transmission dynamics, and it is unlikely that any one network metric alone can capture the ability for a network to transmit information.

      3) The authors note that continuous data on the reorganization of the network after poaching do not exist, and they justify using a static approach (i.e. the network does not change after a removal/simulated poaching event) by focusing on the consequences immediately after deletion. However, the simulations involve removing up to 20% of the individuals in the population, meaning that their model assumes that poaching events are occurring substantially faster than the network is reorganizing itself. This seems too unrealistic an assumption.

      4) A further issue with using a static approach is that the networks captured in the study may not represent the network structure that is in place when an event takes place in which ecological knowledge is important. For example, studies from other multilevel societies, e.g. hamadryas baboons (from Kummer's work), suggest that units come together when conditions necessitate forming larger groups. So, the network measured in the empirical data may not be the network through which ecological knowledge is transferred when an event necessitates it.

      5) Finally, the results and the conclusions drawn from the study seem in conflict. On the one hand, the main summary of the results is that removing older individuals has little, if any, impact on the network's capacity to transmit information. On the other, the conclusions seem to be slanted towards removal of older individuals as a conservation issue (e.g. L662). Thus, there is tension in the manuscript that, unfortunately, reduces both the clarity of the findings and the clarity of the take-home messages.

      Overall, the study was enjoyable to read, with lots of biology, which is a strength for a modelling study. However, some of its construction, and the reliance on simple node deletions, really limits the capacity to gain substantial new insights from this study.

    3. Reviewer #1:

      Using a simulation approach, the authors investigate the impact of removing group members likely to possess key social or ecological information on the topology of elephant social networks in order to better understand how poaching pressure may influence their resilience and functionality. Removals were based on three metrics thought to correlate with an individual's knowledge (age, degree, betweenness centrality) and compared to random removals for both an empirical network and virtual networks. Whereas targeted removals based on age had relatively limited impact on networks characteristics, removal of socially central individuals led to less integrated networks with potential consequences for the spread of adaptive information.

      The manuscript was generally clear and well-written. The introduction nicely laid out the rationale for this study and the authors do a nice job walking the reader through the steps of the simulation (how the networks were constructed, how deletions were performed, etc.). I also appreciated the discussion given to the limitations of their approach, such as the lack of network restructuring in response to removals.

      1) My main critique is that I believe the authors should be more cautious in attributing functional meaning to their network metrics, particularly given that data was unavailable to allow them to simulate a transmission process. For example, at L461-463, it is stated that targeted removal of individuals with high betweenness decreased the speed of information flow, but what was actually found was that values for weighted diameter increased. Put another way, weighted diameter provides an indication of how rapidly information could potentially flow, but not whether it in fact does so. The actual dynamics of information flow are going to depend on the nature of the information and how it is transmitted among individuals, as the authors note in the discussion (L627-640). I believe that the results should be reworded to focus more on what was actually found (i.e. changes in network metrics), with the potential functional relevance of those changes then examined in the Discussion.
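
      The distinction drawn in this comment can be illustrated with a minimal sketch, assuming a toy weighted network in which edge weights are treated as path costs. The node names, the weights and the helper `weighted_diameter` are hypothetical and are not the authors' data or code. The sketch only shows that removing a central node changes a topological metric; it does not simulate any transmission process.

```python
import math
from itertools import product

def weighted_diameter(nodes, edges):
    """Longest shortest-path cost among reachable node pairs (Floyd-Warshall)."""
    dist = {(u, v): (0.0 if u == v else math.inf)
            for u, v in product(nodes, repeat=2)}
    for (u, v), w in edges.items():
        # Ignore edges that touch a removed node; the network is undirected.
        if u in nodes and v in nodes:
            dist[u, v] = dist[v, u] = min(dist[u, v], w)
    # product() varies the last item fastest, so k is the outermost loop.
    for k, i, j in product(nodes, repeat=3):
        if dist[i, k] + dist[k, j] < dist[i, j]:
            dist[i, j] = dist[i, k] + dist[k, j]
    return max(d for d in dist.values() if d < math.inf)

# Hypothetical network: C is a hub bridging A, B and D.
edges = {("A", "C"): 1, ("B", "C"): 1, ("D", "C"): 1,
         ("A", "B"): 4, ("B", "D"): 4}
full = weighted_diameter(["A", "B", "C", "D"], edges)
without_hub = weighted_diameter(["A", "B", "D"], edges)
print(full, without_hub)
```

      In this toy case the metric increases after the hub is removed, which is a statement about network structure. Whether information would actually travel more slowly depends on a transmission model that the sketch does not include.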

      2) In addition, I couldn't see if this was addressed anywhere, but is there empirical evidence to suggest that the mature elephants that possess high-quality information are those characterized by high degree or betweenness?

      Thank you for the interesting read!

    1. Reviewer #3:

      Quiroga et al. studied the molecular function of the mechanosensitive ion channel protein Piezo1 during mouse primary myoblast differentiation in culture. The authors measured myoblast proliferation and differentiation after either knockdown of Piezo1 or chemical activation of Piezo1 protein. Overall, the study is significant given that its conclusion directly contradicts a recent study by Masaki Tsuchiya et al., Nature Communications (2018), in which knockout of Piezo1 produced opposite effects. However, major concerns were identified and need to be addressed to strengthen the authors' claim.

      1) It is unfortunate that the authors have confused "fusion index" with "differentiation index". According to the description in the Methods, they actually measured the differentiation index, although they refer to it as the "fusion index". The commonly used fusion index is the number of nuclei in myocytes with ≥3 nuclei, normalized to the total number of nuclei in MyHC+ myocytes. Therefore, it appears that what the authors described as a "fusion defect" was actually a differentiation defect. These errors need to be corrected.
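
      The difference between the two indices can be made concrete with a small sketch. The fusion index follows the definition given in this comment. The differentiation index is assumed here to be the fraction of all nuclei that lie within MyHC+ cells, which is a common convention but is not spelled out in the review. The function names and the counts are hypothetical.

```python
def differentiation_index(nuclei_per_myhc_cell, total_nuclei):
    """Assumed definition: nuclei inside MyHC+ cells / all nuclei in the field."""
    return sum(nuclei_per_myhc_cell) / total_nuclei

def fusion_index(nuclei_per_myhc_cell, min_nuclei=3):
    """Nuclei in MyHC+ cells with >= min_nuclei nuclei / all nuclei in MyHC+ cells."""
    total = sum(nuclei_per_myhc_cell)
    fused = sum(n for n in nuclei_per_myhc_cell if n >= min_nuclei)
    return fused / total if total else 0.0

# Hypothetical field: six MyHC+ cells (nuclei counted per cell), 20 nuclei overall.
counts = [1, 1, 1, 1, 4, 2]
print(differentiation_index(counts, total_nuclei=20))
print(fusion_index(counts))
```

      With these made-up counts, half of all nuclei sit in MyHC+ cells, yet only the single four-nucleus cell contributes to the fusion index, so the two measures can diverge substantially.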

      2) Following comment 1, the authors need to evaluate whether or not differentiation is affected when Piezo1 is knocked down or activated. It is suggested to run a panel of qPCR assays for myogenic markers including myosin genes (Myh3, Myh8). Western blots of myosin using the MF20 antibody will also need to be performed and quantified.

      3) The authors discussed the potential off-target effects of the siRNA from the previous study. Although it is comparatively more convincing that this manuscript tested 4 siRNAs, for scientific rigor the authors still need to clarify whether the study by Tsuchiya et al. is reproducible. As such, the authors should measure myoblast fusion using the same siRNAs as listed in Tsuchiya et al. In addition, the authors should also characterize the myoblast fusion phenotype of a Piezo1 gene knockout generated by CRISPR treatment of primary myoblasts.

      4) To rule out any off-target effects of the chemical activator of Piezo1, the authors should test whether this drug's effect on myoblast fusion/differentiation can be negated when Piezo1 is knocked down.

      5) Concerning the role of the myomixer gene in the Piezo1 KD phenotype, the authors should use another set of primers for qPCR. The current forward primer only detects a predicted longer transcript isoform of Mymx but not its predominant isoform (NM_001177468).

      6) For Fig. 6, the details of the experimental procedure, e.g. the timing of drug treatment in relation to differentiation timing, need to be provided.

      7) The authors should cite references that are consistent with their descriptions (for instance, at lines 528 and 1011). In addition, the writing needs to be improved for better readability.

    2. Reviewer #2:

      In this study, Ortuste Quiroga et al. showed that the mechanosensitive ion channel Piezo1 promotes myoblast fusion during the formation of multinucleated, mature myotubes. The authors show that Piezo1 knockdown suppressed myoblast fusion during myotube formation and maturation. This was accompanied by a decrease in Myomaker expression. In addition, Piezo1 knockdown lowered Ca2+ influx in response to stretch. In contrast, the agonist (Yoda1)-mediated activation of Piezo1 increased Ca2+ influx and enhanced myoblast fusion, but only under certain conditions. Over-activation of Piezo1 resulted in the loss of myotube integrity. Surprisingly, the myotubes were thinner in Yoda1-treated cells compared to the control. Furthermore, the authors showed that Piezo1 activation enhanced Ca2+ influx in cultured myotubes and the influx of Ca2+ increased in response to stretch. However, it is unclear how this is related to myoblast fusion.

      Overall, the authors made several interesting observations in this study, such as Piezo1's role in myoblast fusion and Piezo1-mediated Ca2+ influx. However, how these phenomena are linked and what is causal remain largely unclear. Another issue is the discrepancy between this study and Tsuchiya et al., Nature Communications (2018), on the function of Piezo1 in myoblast fusion.

      Major comments:

      1) In this study, the authors uncovered a positive role for Piezo1 in myoblast fusion. This is in contrast to Tsuchiya et al., which demonstrated an inhibitory role of Piezo1 in this process. While this study used an RNAi approach to knock down Piezo1 and found a decrease in myoblast fusion, Tsuchiya et al. used CRISPR/Cas9 to knock out Piezo1 in muscle cells and observed a significant increase in myoblast fusion. These two opposite results are difficult to interpret and made the role of Piezo1 in myoblast fusion confusing. It is necessary that the authors make some effort to bring clarity to this issue. First, the authors need to perform rescue experiments in their RNAi cells to make sure that the fusion defect is not due to off-target effects caused by the siRNAs. Second, the authors should design an siRNA that causes a more significant knockdown of Piezo1 than the current siRNAs and test if myoblast fusion is enhanced as in the knockout cells (Tsuchiya et al.). Third, the authors could make their own CRISPR/Cas9 knockout cells and examine the resulting fusion index.

      2) How does Ca2+ influx regulate fusion? Tsuchiya et al. provided evidence that Piezo1-mediated Ca2+ influx activates actomyosin activity and inhibits myoblast fusion. This current study suggests that Ca2+ influx increases fusion, but without providing mechanistic explanations. What are the effects of Ca2+ influx that lead to an increase in myoblast fusion? Does it cause more IL4 secretion? Or transcription upregulation of Myomaker? How? Does the Ca2+ influx level correlate with Myomaker expression level? If Ca2+ influx indeed leads to upregulation of Myomaker, why would Piezo1 knockout cells (low Ca2+ influx) show increased levels of fusion (Tsuchiya et al.)?

      3) Is Piezo1 required in myoblasts or myotubes or both cell types for fusion? Is it localized to the fusion sites?

    3. Reviewer #1:

      The manuscript from Quiroga and colleagues reports a function for the mechanosensor Piezo 1 in myocyte fusion. The manuscript concludes via a series of in vitro experiments that Piezo 1 knockdown results in decreased myotube formation.

      While overall the manuscript reports some potentially interesting observations, the main conclusion seems preliminary and the work would benefit from substantial additional validations in multiple models to strengthen the tie between myomaker and Piezo1 functions.

      Major Comments:

      1) siRNA reduces gene expression in a transient manner and it is unclear for how long there is significant silencing of Piezo1 RNA during differentiation. Therefore, a more stable model that expresses consistent amounts of Piezo1 might be beneficial. Importantly, a more stable mutant form of Piezo 1 (generated with CRISPR/Cas9) was generated in a previous study (Tsuchiya et al, 2018, ref. 17). The analysis of the long-term consequences of loss of Piezo 1 expression for differentiation/fusion of myogenic cells in the Tsuchiya study reached opposite conclusions to the current study. These findings raise concerns that are not clearly addressed in the present study. While the authors attempt to explain the opposite findings by the use of a different Piezo 1 silencing model, it is difficult to reconcile these opposite findings with the present data.

      2) Figure 3A and C have duplicated images showing siRNA of Piezo 1 in EDL and Soleus. The correct images need to be inserted.

      3) Quantification of protein levels downstream of Piezo silencing should be corroborated by western blot analyses. These include data presented in Figures 2 and 3.

      4) In Figure 4, it would be helpful to include a graph illustrating the amount of Piezo1 silencing and the corresponding decrease in Myomaker expression.

      5) In Figure 6, expression of myomaker and myomixer should be monitored following administration of Yoda1. If Yoda1 increases fusion at low concentrations, the fusion genes should be upregulated in expression.

      6) In Figure 7 the myotube width should also be accompanied by quantifications of numbers of nuclei fused in the myotubes. This data will address whether cell fusion changes following Yoda1 treatment.

      7) While the present work explores the function of Piezo 1 in myogenesis in vitro, no experiments address a potential parallel function of Piezo1 in vivo. Supporting data using injured/regenerating muscle should strengthen the overall message.

      8) Figure 9 proposes an interesting hypothesis linking Piezo 1 to FSHD. However, the hypothesis is not supported by experimental data and remains rather exploratory in its current form.

    1. Reviewer #3:

      In this manuscript, Naetar et al. investigate the role of LAP2α binding to A-type lamins in the nucleoplasm. LAP2α was already thought to be important for maintaining the nucleoplasmic pool of soluble A-type lamins, because knockout of LAP2α has previously been shown to reduce nucleoplasmic signal from an antibody that recognizes the lamin-A/C amino terminus. However, by directly tagging A-type lamins with fluorescent proteins and by using an alternative antibody to stain them, Naetar et al. find that the presence of LAP2α does not appreciably affect the pool of soluble lamins in the nucleoplasm. Instead, they find that LAP2α affects the assembly state of soluble lamins within the nucleoplasm, preventing formation of higher order A-type lamin structures that impede the mobility of telomeres within the nucleus.

      There is a lot to like about this paper. I admire the authors' mechanistic approach to studying lamin assembly state. The complementary cell biology/microscopy approaches paired with the biochemical approaches in figure 5 lead to an overall convincing story. And finally, I appreciate the efforts the authors made to "show their work," including their genome editing quality control measures.

      Major comments:

      1) Although I appreciate the transparency of the authors in demonstrating their workflow and quality control measures (see above), some of the terminology makes the manuscript difficult to read. At times it feels more like reading a lab notebook than reading a manuscript. For example, the manuscript would be easier to understand if cell lines were given descriptive names (e.g. LAP2α KO, or mEos3.2-lmna instead of "WT#21") rather than continuing to refer to them by the small guide RNA that was used to generate them. A second example: it is nice to show biological replicate data as in figure 1, but it took me a while to figure out that the second and third columns in panels A and B were biological replicates; I spent some time trying to determine which experimental condition was different. Perhaps one biological replicate could be displayed in the main text and the second could be moved to the supplement, especially considering that it appears that only one of the clones was used for the quantifications shown in the bottom panels.

      2) Why was the choice made to disrupt LAP2α at the beginning of exon 4? How large are exons 1 and 2, which are not shown in the schematic in the supplemental figures? What percentage of the LAP2α peptide primary sequence is affected by a frameshift mutation at the start of exon 4? Why was this approach preferable to introducing a frameshift mutation closer to the 5' end of the gene? I am concerned that the "LAP2α KO" cells used in the experiments may have some partially functional truncated LAP2α protein.

      3) On page 16, the authors describe a set of experiments that are meant to demonstrate that their failure to see a difference in nucleoplasmic A-type lamins in LAP2α mutants is not due to the fluorescent protein tag used. However, instead of looking at untagged lamins, they elect to look at a cell line that has all lmna alleles tagged. Wouldn't it be better to use the LAP2α KO cells from figure 1 and stain with both the 3A6 antibody and the N18 antibody to determine whether untagged lamins behave the same way as tagged lamins? Perhaps this experiment could be added along with the current data, as it would be nice to compare directly between a cell line with all lmna alleles tagged and a cell line with no lmna alleles tagged.

      This experiment would also give the authors a chance to compare morphology and overall fitness of cells with all untagged lmna with cells with all tagged lmna, to determine whether the tagged proteins are fully functional. Even if the tagged protein is fully functional, it would be appropriate to add a brief discussion of the possibility that fluorescent tags do perturb lamin-A/C function. After all, many lamin mutations do not cause obvious phenotypes in tissue culture cells, but defects can still emerge during development and aging in the context of an animal.

      4) The authors build a convincing case that binding to A-type lamins by LAP2α influences their ability to assemble. But how do cells leverage this relationship for biological functions? Do cells tune the amount of fully soluble vs. partially assembled A-type lamins in the nucleoplasm in order to control nuclear structure or function in response to certain stimuli? Have the A-type lamins in the nucleoplasm been found to be in a different assembly state in different cell types? As the study is currently written, it presents an interesting molecular mechanism but no biological mechanism.

    2. Reviewer #2:

      Naetar et al. studied the effect of Lap2a on lamin A/C assembly dynamics and mobility as well as telomere movements. This study indicated that lamin A/C are first assembled into the lamina, before some of the lamin A/C is re-localized to the nucleoplasm. Interestingly, the amount of nucleoplasmic lamins is independent of Lap2a although their physical properties are different. The results indicated that Lap2a contributes to the dynamics of lamin A/C in the nucleoplasm while its absence reduces nucleoplasmic lamin and telomere dynamics. These results reveal the function of Lap2a as a regulator of lamin anchorage in the nucleoplasm but it has no major role in recruiting lamins into the nucleoplasm. Since the impact of lamins on the nuclear organization is critical for nuclear functions and important for nuclear integrity, these results are fundamental for the understanding of both lamin A/C and Lap2a.

      The authors also identified two pathways in which nucleoplasmic lamin emerged. First, lamin can be localized to the lamina and then relocated to the nucleoplasm, and second, from the pool of mitotic lamins which are not associated with the lamina.

      The authors may consider some textual changes, in particular regarding the state of nucleoplasmic lamin polymerization:

      1) The nuclear lamina filaments are typically 200-400 nm in length, but they are very flexible. A 200 nm filament would have a molecular weight of <1.4 MDa (~50% of a ribosome) and can be bent and curved. That would mean that a single filament has a reasonably high diffusion coefficient. At the lamina, lamins are less mobile; however, this is likely to be due to binding partners that anchor lamins to the INM and chromatin (e.g. emerin is a membrane protein that binds lamin A), since the diffusion of 1.4 MDa protein complexes is quite fast. The above is mentioned because nucleoplasmic lamins may be polymerized but more mobile (less anchored) than the lamina-hosted lamin population.

      2) The authors show that nucleoplasmic lamins are first localized to the lamina, where they can polymerize. Isn't it possible that filaments can be released into the nucleoplasm?

      3) In vitro assembly assays of lamin A in the presence of Lap2a indicated that lamin A assembly is inhibited by Lap2a. Based on these results the authors suggest that Lap2a keeps lamin in a less polymerized state. Previous work by Zwerger et al. 2015 showed that inhibitors of in vitro lamin A assembly have no impact on incorporation and localization of lamin A into the lamina, while incorporation of lamin A into the nuclear lamina was abolished when other lamin binders that have no effect on lamin assembly in vitro were used. That would suggest that either in vitro assembly does not represent cellular lamin assembly or assembly of lamin into the lamina is independent of the polymerization state of lamins. The authors may want to discuss these views.

    3. Reviewer #1:

      Taken collectively, the findings described in the manuscript provide a new perspective on how LAP2alpha influences the state of A-type lamins. By extension, one impact of the findings is that they provide a mechanism by which A-type lamin state is distinct within the nucleoplasm and at the nuclear lamina. The authors also arrive at some additional insights that are valuable. For example, the data supporting the initial peripheral localization of what is argued to be pre-lamin A during processing rather than filament assembly was interesting and, although indirect, largely convincing. I would encourage the authors to address the fact that this work drives a reinterpretation of their prior findings early in the paper. I also have some concern that the impact of the findings is somewhat narrow.

      Major points:

      1) Given that a major focus of the paper is to explain conflicting results with (the same group's) prior published data on the effect of LAP2alpha depletion, it would have helped to lay this out more clearly from the outset of the paper. As written, the reader is confused until arriving at Figure 3. I appreciate that resolving this conflict leads to a new perspective - namely that LAP2alpha influences the state of the lamin assembly in a way that disrupts its detection by the N18 antibody, but structuring the manuscript to get to this point as quickly as possible would improve its accessibility.

      2) I found the plots in Fig. 1A and B confusing. Can the authors clarify how the measurements are achieved - through ROIs for the entire nucleoplasm/periphery? How do they capture the diffuse versus focal signal within the nucleoplasm? There is also some concern that the nucleoplasmic signal may simply be too low to detect robustly at early time points (leading to an increase at later time points as the protein accumulates). Line profiles (which are useful in Fig. 3) would be very helpful if used more broadly for assessing the data particularly for Figure 1.

      3) Related to Figure 1 - the results for the deltaK32 mutant are essential for the interpretation and should be included in the primary figures.

      4) The authors make no comment on the functionality of the mEos-tagged lamin A/C CRISPR lines. However, the comment suggesting that some clones could have altered nuclear morphology (line 225) raises some questions. How did the authors interpret this? Were these clones in which there were indels in some lmnA alleles affecting the levels? Or is this a consequence of the fusion? How do the authors explain the relatively low expression level of the mEos fusion relative to the untagged? If the MDFs are diploid, presumably we would expect this to be one allele tagged and one allele untagged. Given that the expression ratio is very different from this, could the tagged lamin A/C be targeted for degradation? As these cell lines are critical for the rest of the study, this information is important.

      5) How does the deltaK32 mutation affect the ability to detect lamin A/C with the N18 antibody? Could this provide further insight into the impact of LAP2alpha by extension?

      6) Greater explanation for the apparent paradox between the increase in immobile fraction by FRAP and the increased diffusion coefficient by FCS in the LAP2alpha-depleted condition is needed. The authors suggest that the latter is due to the loss of LAP2alpha binding (line 395), but some modeling would go a long way here. What form are the lamins thought to be in, and how does the bulk that LAP2 alpha would bring match the apparent changes in diffusivity?

      7) One prediction that arises from the proposed model is that regulation of LAP2alpha levels will modulate the relative pool of A-type lamins at the nuclear interior versus the nucleoplasm. Beyond the knock-out cells, is there any other evidence of this relationship?

      8) Much of the biochemical characterization seems confirmatory - e.g. the binding and gradients in Fig. 5A and B. Use of the assembly mutants of lamin here could be informative and may be essential to interpret the changes induced by addition of LAP2alpha.

      9) With regards to the effects on chromatin mobility - over what time interval was the volume of movement observed? This is important because more fluctuations in nuclear position, for example, could influence this measure. In addition, telomeres are a confusing choice, given abundant evidence that there is crosstalk between the state of the nuclear lamina and telomere biology (e.g. lamin mutants affecting telomere homeostasis, etc.). At a minimum, acknowledging that telomeres may not reflect the effect on chromatin globally is important. Examples of the raw mean squared displacements would be more informative. Is the difference between lmna KO and lmna/Lap2alpha DKO (Fig. 6 right panel) significant?

      10) How do the authors think the membrane integrated LAP2beta fits into the story?

    1. Reviewer #3:

      In the current manuscript (De novo learning and adaptation of continuous control in a manual tracking task), Yang et al. aim to demonstrate that motor adaptation to a mirror reversal perturbation to visual feedback is de-novo learning of a movement controller in contrast to the adaptation of an existing controller with rotation to visual feedback. The authors examine two different experimental paradigms, (1) continuous tracking of a cursor (trajectories generated by different sum-of-sinusoid functions) and (2) point to point movements, under two different visual manipulations of the cursor feedback: a 90 deg rotation and mirror reversal. Importantly, the authors set the motion of the cursor under the continuous tracking case as a sum of sinusoidal trajectories in order to perform frequency analysis of the motion tracking. The authors then examine the behavior in the time domain, and dissect the responses at individual frequencies in the frequency domain to determine the response of learning observed in each condition to the fast and slow changing components of the perturbation. There are two major reported results: (1) Participants learn both the mirror reversal and the rotation, but mirror reversal learning shows little to no aftereffect, whereas rotation learning shows an ~25° aftereffect from ~70° of learning. The authors argue that this suggests that mirror-reversal learning arises from a de-novo controller that is not engaged during baseline or washout (Lines 199-200). (2) Learning in the continuous tracking task shows a gradation in performance over frequencies (i.e., higher frequencies demonstrate lower learning). These are interesting experiments, with a well-defined motivation/question and (mostly) clear presentation of results. The figures and results largely support the hypothesis. My specific comments are shown below:

      1) In the abstract, the last line says 'Our results demonstrate that people can rapidly build a new continuous controller de novo and can flexibly integrate this process with adaptation of an existing controller'. It's not clear if the authors have shown the latter definitively. What is the reasoning for this statement, "flexibly integrate this process with adaptation of an existing controller"? It would seem you would need the same subjects to perform both experimental tasks (mirror reversal and VMR) concurrently to make this claim.

      2) It would be helpful if the authors could provide more background/context on their view of de novo learning and explanations on the relationship between de novo learning and the adapted controller model. For example, why does the lack of aftereffects under the mirror-reversal imply that the participants did not counter this perturbation via adaptation and instead engaged the learning by forming a de novo controller (Line 199)? Is the reasoning purely behavioral observations, or is there a physiological basis for this assertion?

      3) Details about frequency analysis are buried deep in the methods (around line 711), especially how the hand-target coherence (shown in 4B) is calculated. It would be helpful to include some of these details in the main text. For example, it is currently very difficult to understand the relationship when moving from Figure 4A to 4B.

      4) Lines 197-199: The reason for the lack of after-effects in the mean-squared error analysis is a little vague. It took a few tries to understand the reasoning. It would be good to spell this out a little more clearly.

      5) Lines 223-225: The logic behind why coupling across axes is not nonlinear behavior seems to be missing. It's quite unclear and currently difficult to understand. It would be very helpful to spell this out too.

      6) Surprisingly, there is no measurement of aiming in the learning to VMR. Several motor learning studies (several the authors cite) show that learning in VMR is a combination of implicit and explicit. I understand that this is not possible in the continuous tracking task, but can certainly be done in the point to point task. Is there a reason this was not done? Wouldn't this have further supported the authors' claim of an existing controller?

      7) Figure 2C: the data for mirror-reversal seems to have a weird uptick in the error. Why would that be? Is there an explanation for this?

      8) Lines 339-342: the results show that mirror-reversal learning is low at high frequencies (Fig 5B). The authors interpret this as reason to believe that this is actually de-novo learning and not adaptation of an existing controller. This seems somewhat unfounded. Could it be that de novo learning performs well at low frequency, through 'catch-up' movements, but not at high frequencies? Do the authors have a counter argument for this explanation?

      9) Lines 343 - 350: The authors ascribe the difference between after-effects and end of learning to be due to de-novo learning even in the rotation group. However, that difference would likely be due to the use of explicit strategy during learning and its disengagement afterwards, or perhaps a temporally labile learning. Can the authors rule these possibilities out? What were the instructions given at the end of the block and how much time elapsed?

      10) Line 787: Outlier rejection based on some subjects who had greatly magnified or attenuated data seems like it might be biasing the data. Also, the outlier rejection criterion used (>1.5 IQR) seems very stringent. Furthermore, it appears there was no outlier rejection in the main experiment. It would be good to be consistent across experiments.
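
      For reference, a criterion of this kind is often implemented as Tukey's fences. The following is a minimal sketch, assuming the rule rejects values lying more than k × IQR below the first quartile or above the third quartile. The helper name `iqr_outliers`, the quartile method and the data are hypothetical, and the manuscript's exact procedure may differ.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Split values into (kept, rejected) using fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)  # default 'exclusive' quartile method
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    rejected = [v for v in values if v < low or v > high]
    return kept, rejected

# Hypothetical per-subject gains; one subject shows a greatly magnified response.
gains = [1.0, 1.1, 0.9, 1.2, 0.8, 1.0, 3.0]
kept, rejected = iqr_outliers(gains)
print(kept, rejected)
```

      Making the multiplier `k` an explicit parameter allows the same rule to be applied consistently across experiments and its sensitivity to be reported.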

      11) Figure 4: The authors show the tracking strategies participants applied by investigating the relationship between hand and target movement. A linear relationship would suggest that participants tracked the target using continuous movements. In contrast, a nonlinear relationship would suggest that participants used an alternative tracking strategy. The authors state that this relationship is based on Figure 4, but they do not seem to provide any proof of linearity. It would be more convincing to provide an analysis showing whether the relationship is indeed linear or nonlinear.

    2. Reviewer #2:

      This manuscript asks how learners solve the problem of continuous motor control. The authors find qualitatively distinct components of learning under continuous tracking conditions: the adaptation of a baseline controller and the formation of a new task-specific continuous controller. These learning components were differentially engaged for rotation-learning and mirror-reversal. Further, the authors present a methodological advance in motor control and learning analysis that relies on frequency-based system identification techniques.

      Overall, this paper presented a valuable third perspective on the learning processes that underlie motor performance and provided an impressive analysis of continuous control data. Furthermore, the system identification technique that they developed will likely be of great value to the study of motor learning. However, I believe that there are some issues with the framing of the de novo learning mechanism and in their interpretation of the results.

      1) Positing a de novo learning mechanism as the absence of established learning process signatures.

      The authors introduce the concept of de novo learning in contrast to both error-driven adaptation and re-aiming: 'a motor task could be learned by forming a de novo controller, rather than through adaptation or re-aiming.' However, the discussion reframes de novo learning as purely in contrast with implicit adaptation: '[...] de novo learning refers to any mechanism, aside from implicit adaptation, that leads to the creation of a new controller'. While this apparent shift in perspective is likely due to their results and realistically represents the scientific process, this shift should be more explicitly communicated.

      As explicitly raised in the discussion and suggested in the introduction, the authors have categorized any learning process that is not implicit adaptation as a de novo learning process. To substantiate this conceptual decision, the authors should further explain why motor learning unaccounted for by established learning processes should be accounted for by a de novo learning process.

      2) The distinction between de novo learning and re-aiming is unclear.

      Participants could not learn mirror-reversal under continuous tracking without the point-to-point task, which the authors interpret to mean that re-aiming is important for the acquisition of a de novo controller. This suggests that re-aiming may not be important for the execution of a de novo controller.

      However, the frequency-based performance analysis presented in the main experiment would seem to suggest otherwise. As mentioned in the introduction, low stimulus frequencies allow a catch-up strategy. Both rotation and mirror groups were successful at compensating at low frequencies but the mirror-reversal group was largely unsuccessful at high frequencies. Assuming that higher frequencies inhibit cognitive strategy, this suggests to me that catch-up strategies might be essential to mirror-reversal, possibly not only during learning but also during execution.

      Further, the authors note that, in the rotation group, aftereffects only accounted for a fraction of total compensation, then suggest that residual learning not accounted for by adaptation was attributable to the same de novo learning process driving mirror reversal. This framing makes it unclear to me how the authors think re-aiming fits into the concept of a de novo learning process (e.g. Is all learning not driven by implicit adaptation de novo learning? What about the role of re-aiming?)

      3) Interpretation of spectral linearity as support for the absence of a catch-up strategy.

      Using linearity as a metric for mechanistic inference has limitations.

      • The absence of learning (errors) would present as nonlinearity.
      • The use of cognitive strategy could present as nonlinearity.
      • It doesn't seem possible to parse the two mechanisms, especially as you might expect both an increase in error at the beginning of learning and possibly an intervening cognitive strategy at the beginning of learning.

      Given these issues, a more grounded interpretation is that linearity simply represents real-time updating. If the relationship between the cursor and the hand is nonlinear, then updating is not in real time.

      The data shown in Fig 4B do not appear to provide clear evidence that the relationship between the cursor and the hand was approximately linear. Currently, it seems equally plausible to say that the data are approximately non-linear. Establishing a criterion for nonlinearity would be useful (e.g. shuffling a linear response for comparison).

      4) The presentation of mean-squared error in Figure 2 seems to have limited utility. As the authors mention, it does not arbitrate between mechanisms or represent the aftereffects observed in rotation learning. I suggest removing panel 2C altogether and magnifying panel 2B so that the reader can better appreciate the raw data.
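
      One way to read the shuffling criterion suggested in point 3 above is as a permutation test. The sketch below is an assumption about how such a criterion could be set up, not the authors' analysis: it measures the fraction of response power that falls on the target frequencies (`linearity_index`, a hypothetical name) and compares it with the same statistic for time-shuffled copies of the response. The toy signals, the frequency bins, and the number of shuffles are illustrative only.

```python
import cmath
import math
import random


def linearity_index(response, input_bins):
    """Fraction of non-DC response power found at the input (target) frequency bins.

    `input_bins` are integer DFT bins strictly between 0 and len(response) / 2.
    """
    n = len(response)
    mean = sum(response) / n
    centered = [x - mean for x in response]
    total = sum(x * x for x in centered)
    if total == 0.0:
        return 0.0
    at_input = 0.0
    for k in input_bins:
        # Direct DFT projection onto bin k.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        at_input += abs(coeff) ** 2
    # Parseval for a real signal: bins k and n - k carry equal power.
    return 2.0 * at_input / (n * total)


def shuffle_p_value(response, input_bins, n_shuffles=200, seed=0):
    """Share of time-shuffled surrogates whose index reaches the observed one."""
    rng = random.Random(seed)
    observed = linearity_index(response, input_bins)
    surrogate = list(response)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(surrogate)
        if linearity_index(surrogate, input_bins) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)


# Toy data (hypothetical): a purely linear response and a quadratically distorted one.
n = 128
bins = [3, 7]
linear = [sum(math.sin(2 * math.pi * k * t / n) for k in bins) for t in range(n)]
distorted = [x + 0.8 * x * x for x in linear]
```

      In this toy case the linear response puts essentially all of its power on the input bins, while the distorted response leaks power to sum and difference frequencies, so its index is clearly lower. A threshold on the index, or on the shuffle-based p-value, would be one explicit criterion for calling a response "approximately linear".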

    3. Reviewer #1 (Timothy Verstynen):

      This work looks at "de novo learning" in the context of fast continuous tasks, i.e., shifts of control policies (or controllers), rather than parameter changes in existing policies that occur with visuomotor adaptation. In a set of 2 experiments, using a mixture of discrete point-to-point movement trials and continuous tracking of moving target trials, the authors set out to determine whether the structure of shifts between visual and proprioceptive information determines whether learning relies on adaptation or shifts in control policies. Using both the presence of post-shift aftereffects and trialwise model fitting, the authors find that simple rotations of visual inputs of the hand lead primarily to changes in control parameters, while mirror reversals lead to changes in the control policy itself, although there was evidence for a mixture of adaptation and de novo learning in both conditions. The authors infer from this evidence that humans can rapidly and flexibly shift control policies in response to environmental perturbations.

      In general this was a very cleverly designed and executed set of studies. The theoretical framing and experimental design are clean and clear. The data is compelling on the existence of condition differences. However, there are some concerns that temper my acceptance of the key inferences being made about de novo policy shifts.

      Major concerns:

      1) Inferential logic

      There are two key parts to the analyses used to infer that mirror reversals lead to de novo policy shifts while rotations lead to adaptation. The first is the presence of post-perturbation aftereffects. The second is the set of alignment matrices (in both immediate hand position and movement frequency spaces) that are estimated from model fits to the data. I'll consider both in turn.

      First, while we clearly see stronger aftereffects in the rotation condition than in the mirror reversal condition, suggesting a difference in fundamental control mechanisms, it is not clear why control policy shifts are the only alternative explanation for attenuated aftereffects. I'm pretty sure that this is just a confusion based on how the problem is posed in the paper.

      Second, and perhaps more problematically, the alignment matrices (Fig. 3A) and vectors (Fig. 3A, 5B, 6B), based on the model fits, show a very high degree of variability across conditions and do not perfectly align to the simple predictions shown in Fig. 3A. I do agree that if you squint at the mean vector direction they look consistent with the models, but only qualitatively. In fact, the fits to the "ideal" shifts or rotations (Fig. 5C, 6C) suggest only partial alignment to the pure models. How are we sure that this isn't reflecting an alternative mechanism, instead of partial de novo learning?

      In both the aftereffect and alignment fit analyses, the inference for de novo learning seems to be based on either a null (i.e., no aftereffect in mirror-reversal) or partial fits to a specific model. This leaves the main conclusions on somewhat shaky ground.

      2) Linearity analysis

      I had a really hard time understanding the analysis leading to the conclusion that there is a linear relationship between target motion and hand motion. The logic of the spectral analysis was not clear to me and the results shown in Figure 4 were not intuitive. In addition, there was no actual quantification used to make a conclusion about linearity. Thus it was difficult to determine whether this aspect of the authors' conclusion (a critical inference for them to justify their main conclusion) was correct.

      3) Statistical results

      Many of the key statistical results were buried in the main text and some were incompletely reported. Can the authors provide a table (or set of tables) of the key statistics, including at least the value of the statistical test itself and the p-value, if not also estimates of confidence on the estimates?

      4) Experiment 2

      The intention for experiment 2 is to see how much training on the point-to-point task influenced adaptation mechanisms during the tracking task. Yet this experiment still included extensive exposure to the point-to-point task, just not as much as in experiment 1. Given this, how can an inference be cleanly made about the influence of one task on the other? Wouldn't the clean way to ask this question be to just not run the point-to-point task at all?

      5) Frequency analysis

      The authors state that "The failure to compensate at high frequencies ... is consistent with the observation that people who have learned to make point-to-point movements under mirror-reversed feedback are unable to generate appropriate rapid corrections to unexpected perturbations." This logic is not clear. How does the pattern of which movement frequencies show an effect, and which do not, lead to this conclusion?

      Minor comments:

      Pg. 10, line 330: The authors report that "compensation for the visuomotor rotation resulted in reach-direct aftereffects of similar magnitude to that reported in previous studies". Please cite those studies here.

      Pg. 18, lines 661-668: There is only a description of the first experiment but not the second.

      Figure 5, supplement 1 seems to be a critical image for understanding the different dynamics of realignment between the rotation and mirror-reversal tasks. It seems better to have it be a main figure instead of a supplement.

    1. Reviewer #2:

      While much independent progress has been made in the development of RL models for learning and DDM-like models for decision-making, only recently have people begun to combine the two (e.g. Pedersen et al., 2017). In this paper, Miletić et al. develop a new set of combined reinforcement learning (RL) and evidence-accumulation models (EAM) in an attempt to account for learning/choice data and reaction time data in a series of probabilistic selection tasks (Frank et al., 2004). While previous developments have provided proof-of-concept that these models can be joined, here the authors present a new model, Advantage Racing Diffusion, which additionally captures stimulus difficulty, speed-accuracy trade-offs, and reversal learning effects. Using behavioral experiments and Bayesian model selection techniques, the authors demonstrate a superior fit to choice/RT data with their model relative to similar alternatives. These results suggest that the Advantage framework may be a key element in capturing choice/RT behavior during instrumental learning tasks.


      I think this paper asks some really interesting questions, the methods are quite sound, and it is written nicely. I do think that the central focus of the Advantage learning element is key to the study's novelty. However, I feel that the framing of the paper and the implementation are somewhat at odds, and thus additional experiments (or re-analyses of extant data sets) may be needed to transform the paper from a welcome, modest incremental improvement to a qualitative theoretical advance. I outline my major concerns/suggestions below:

      Major Points:

      In the abstract, the authors allude to both learning tasks with >2 options and to the role of absolute values of choices in characterizing the limitations of the typical DDM. However, in the manuscript the former is not addressed (and actually does not appear to be amenable to the current model implementation; see below), and the latter is addressed via modest improvements to model fits rather than true qualitative divergence between their model and other models' ability to capture specific behavior effects. Thus, I think the authors could greatly strengthen their conclusions if they extend their model to RL data sets with a) >2 options, and b) variations in the absolute mean reward across blocks of learning trials. For instance, does their model predict set size effects during instrumental learning? Does their model predict qualitative shifts in choice and RT when different task blocks have different µ rewards? At the moment the primary results are improved fits, but I think it would be important to show their model's unique ability to capture more salient qualitative behavior effects.

      Moreover, I'm not sure I understand how the winning model would easily transfer to >2 options. As depicted in Equation 1, the model depends on the difference between two unique Q-values (weighted by w-d). How would this be implemented with >2 options? I see some paths forward on this (e.g., the current Q relative to the top Q-value, the current Q minus the average, etc.) but they seem to require somewhat arbitrary heuristics? Perhaps the authors could incorporate modulation of drift rates by policies? Or use an actor-critic approach? I may be missing something, but I think if the model in its current form doesn't accurately transfer to >2 options, the primary contribution is the utility of urgency, which has been presented in earlier studies.
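
      To make the two heuristics mentioned above concrete, here is a minimal sketch. It covers only the difference term weighted by w-d; the function name `advantage_terms`, the `reference` argument, and the example Q-values are hypothetical and are not taken from the manuscript.

```python
def advantage_terms(q_values, w_d, reference="top"):
    """Return the w_d-weighted advantage term feeding each option's accumulator.

    reference="top":  each Q-value relative to the highest Q-value.
    reference="mean": each Q-value relative to the average Q-value.
    Both are illustrative heuristics, not the authors' model.
    """
    if reference == "top":
        baseline = max(q_values)
    elif reference == "mean":
        baseline = sum(q_values) / len(q_values)
    else:
        raise ValueError("reference must be 'top' or 'mean'")
    return [w_d * (q - baseline) for q in q_values]


# Hypothetical Q-values for a three-option learning problem.
three_options = advantage_terms([0.8, 0.5, 0.2], w_d=2.0, reference="mean")
```

      With two options, the "mean" heuristic reduces to the pairwise difference scaled by one half, so it is equivalent to the two-option case up to a rescaling of w-d. The "top" heuristic does not reduce to it, because the best option always receives a term of zero. This illustrates why the choice among such heuristics looks somewhat arbitrary.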

      I appreciate the rigorous parameter recovery experiments in the supplement, but I think the authors could also perform a model separability analysis (e.g., plot a confusion matrix) - it seems several of the models are relatively similar and it could be useful to see if they're confusable (though I imagine they're mostly separable).

      I may be missing something, but I do not think the authors are implementing SARSA. SARSA is: Q(s[t],a[t]) <- Q(s[t],a[t]) + lr * (r[t+1] + discount * Q(s[t+1],a[t+1]) - Q(s[t],a[t])). However, this is a single-step task... isn't it just 'SAR' (aka, the standard Rescorla-Wagner delta rule)?
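
      The point can be checked with a short sketch. The function names, the dictionary-based Q-tables, and the numbers below are illustrative assumptions; the convention that a terminal next state contributes a bootstrap value of zero is the standard one.

```python
def sarsa_update(q, s, a, r, s_next, a_next, lr, discount):
    """One SARSA step; a terminal next state (None) contributes no bootstrap value."""
    next_value = 0.0 if s_next is None else q[(s_next, a_next)]
    q[(s, a)] += lr * (r + discount * next_value - q[(s, a)])


def delta_rule_update(q, a, r, lr):
    """Rescorla-Wagner delta rule for a single-step choice."""
    q[a] += lr * (r - q[a])


# Single-step task: every trial ends after the reward, so there is no next state.
q_sarsa = {("trial", "A"): 0.5}
q_delta = {"A": 0.5}
sarsa_update(q_sarsa, "trial", "A", r=1.0, s_next=None, a_next=None,
             lr=0.1, discount=0.9)
delta_rule_update(q_delta, "A", r=1.0, lr=0.1)
```

      Because the bootstrap term vanishes at the end of every trial, the two updates give the same value regardless of the discount factor.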

    2. Reviewer #1:

      This is a rigorous and very interesting study on a timely topic: combining modeling traditions of (reinforcement) learning and decision-making. The central claim of the paper is that the often-used combination of reinforcement learning with the drift diffusion model does not provide an adequate model of instrumental learning, but that the recently proposed "advantage accumulation framework" does. This claim will likely be of interest for anyone studying learning and decision-making, ranging from mathematical psychologists to neuroscientists running animal labs. I have a number of concerns regarding this paper.

      1) I think the basic behavior and model fit quality should be better described. The reinforcement-learning + evidence accumulation models (RL-EAM) are fitted to choices and reaction times (RTs). I find it therefore odd that we don't get to see any actual RT distributions, but only the 10th, 50th and 90th percentile thereof. What did the grand average RT distribution and model predictions look like (pooled across subjects and trials)? How much variability was there across subjects? I understand that the model was fit hierarchically, but it would be nice to (i) see a distribution of fit quality across subjects, to (ii) see RT distributions of a couple of good and bad fits, and to (iii) check whether the results hold after excluding the subjects with worst fits (if there are any outliers). Related, in the RT percentile plots (Figures 3 & 4), it would be nice to see some measure of variability across subjects.

      2) The authors pit four competing RL-EAMs against one another. I have a number of issues with the way this is done:

      -The qualitative model fits presented in Figure 3 are potentially misleading, as the competing models have different numbers of free parameters: DDM, 4; RL-RD, 5; RL-IARD, 5; RL-ARD, 6. RL-ARD has the most free parameters, which might trivially lead to the best visual fit. For this reason, I find the BPIC results more compelling, and I think these should feature more prominently (perhaps even as bars in the main figure?).

      -All three racing diffusion models implement an urgency signal. Why did the authors not consider a similar mechanism within the DDM framework? Here, urgency could be implemented either as (linearly or hyperbolically) collapsing bounds, or as self-excitation (inverse of leak); both require only one extra parameter.

      3) I could imagine a scenario in which the decision-making process becomes progressively biased toward the more rewarding stimulus. In fact, this can be observed in Figure 7. Therefore, I wonder if the authors have considered RL-EAMs in which the choice boundaries do not correspond to correct vs. error, but instead to the actual choice alternatives (stimulus A vs. B). In such an implementation one can fit bias parameters like starting point and/or drift bias.

      4) The authors write that RL-EAMs assume that "[...] a subject gradually accumulates evidence for each choice option by sampling from a distribution of memory representations of the subjective value (or expected reward) associated with each choice option (known as Q-values)." Sampling from a distribution of memory representations is a relatively new idea, and I think it would help if the authors would be more circumscribed in the interpretation of these results, and also provide more context and rationale both in the Introduction and Discussion. For example, an interesting Discussion paragraph would be on how such a memory-sampling process might actually be implemented in the brain.
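
      The collapsing-bound idea raised in point 2 above can be sketched as follows. The two bound functions each add one parameter `k`, but their exact parameterization, the Euler simulation step, and all names and values are assumptions made for illustration, not a specification of how the authors would implement urgency in the DDM.

```python
import math
import random


def linear_bound(t, b0, k):
    """Bound that starts at b0 and falls linearly with slope k until it reaches zero."""
    return max(b0 - k * t, 0.0)


def hyperbolic_bound(t, b0, k):
    """Bound that starts at b0 and decays hyperbolically at rate k."""
    return b0 / (1.0 + k * t)


def simulate_trial(drift, bound_fn, rng, dt=0.001, noise_sd=1.0, max_t=10.0):
    """Euler simulation of one diffusion trial with symmetric, time-varying bounds.

    Returns (choice, decision_time); choice is +1 or -1, or None if no bound
    is reached before max_t.
    """
    x, t = 0.0, 0.0
    while t < max_t:
        x += drift * dt + noise_sd * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
        if abs(x) >= bound_fn(t):
            return (1 if x > 0 else -1), t
    return None, t


rng = random.Random(1)
choice, decision_time = simulate_trial(
    drift=0.5, bound_fn=lambda t: linear_bound(t, b0=1.0, k=1.0), rng=rng)
```

      With the linear form the bound reaches zero at t = b0 / k, so every simulated decision terminates by then. This is the sense in which a collapsing bound acts like an urgency signal.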

    1. Reviewer #2:

      In this manuscript, Lee and Usher study choices between two options, and model how such choices are affected by the certainty with which the decision-maker evaluates the two options. They insist that this value certainty should be incorporated in current models, and compare ways to do so within the framework of the drift-diffusion model (DDM).

      My main concern is that I find the main contribution a bit light. Mathematically, we know that in a DDM higher noise leads to shorter RTs. Empirically, we already know that options rated with low certainty lead to longer RTs (e.g. as demonstrated by the first author in Lee & Coricelli, 2020). So it is not surprising that low certainty cannot correspond to higher noise in a DDM, and might be captured by a lower drift instead. Then, the specific way it can be done deserves to be investigated, but the authors should explore in more detail the different classes of models, and the ways in which value certainty could affect other parameters of the model as well.
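
      The mathematical claim can be illustrated with the closed-form mean decision time of the simplest DDM: an unbiased start, symmetric bounds at plus and minus `bound`, constant drift, and no non-decision time or across-trial variability. Those simplifications, and the parameter names, are assumptions of this sketch and need not match the models in the manuscript.

```python
import math


def mean_decision_time(bound, drift, noise_sd):
    """Mean decision time of a simple DDM: (a / v) * tanh(a * v / sigma**2)."""
    if drift == 0.0:
        # Limit of the expression above as the drift goes to zero.
        return (bound / noise_sd) ** 2
    return (bound / drift) * math.tanh(bound * drift / noise_sd ** 2)


baseline = mean_decision_time(bound=1.0, drift=1.0, noise_sd=1.0)
noisier = mean_decision_time(bound=1.0, drift=1.0, noise_sd=2.0)
weaker_drift = mean_decision_time(bound=1.0, drift=0.5, noise_sd=1.0)
```

      Under these assumptions, raising the noise shortens the mean decision time while lowering the drift lengthens it, which is why longer RTs for low-certainty options point toward a drift effect rather than a noise effect.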

      Suggestions:

      I would suggest presenting in the introduction more details about how DDM is currently used in studies of value based decisions, to better explain the context of the present work and highlight the specific contribution of the study.

      The authors consider a number of models in the discussion (effects of uncertainty on the bounds, balance of evidence, collapsing bounds, etc.) but do not give the full details of these models. I would suggest including these models in the analyses presented in the Results section. Maybe the authors could capitalize on the amount of data they have to do some model fitting, to estimate how the parameters of the DDM would change with value certainty. Parameters of interest are the drift and the drift variability (in the extended version of the DDM) but the authors could also explore the bounds and the variability in the starting point. A basic approach would be to split the data based on value certainty: using a median-split for both options, they could fit separately the choices between 2 options rated with high certainty, and the choices between 2 options rated with low certainty, etc. A more involved approach would be to estimate the effect of value certainty on the parameters in a single analysis across all the data (e.g. using a hierarchical DDM).

      Minor points:

      The motivation for model 5, which includes an additional component for accumulating certainty, should be more detailed. This approach is not standard, and would deserve more details and some references to prior work offering the same approach, if it exists.

      A figure presenting the typical experimental paradigm, including the notation for the variables, would be helpful.

      In Figure 2, the variables C1 and C2 are not properly defined.

    2. Reviewer #1:

      This article investigates how uncertainty about the value of alternatives affects the decision process through the lens of the drift diffusion model. The article proposes several models for how uncertainty might affect the drift rates or diffusion variance, and tests those models on four different food-choice datasets. The authors conclude that the best model is one in which the drift rate depends on the values of the options divided by their degree of uncertainty.

      I think the article is pursuing an interesting question. The core set of results are perhaps not as surprising or as puzzling to a DDM audience as the introduction might have you believe, but from there the paper does a nice job of exploring different ways in which uncertainty might affect the choice process. This seems like a good set of models to consider, as they cover the obvious ways in which one might consider incorporating uncertainty into the DDM, and each one, except for the favored Model 4, has a clear inability to capture a facet of the data.

      1) I could quibble about why the authors don't explore more variants of the favored Model 4, for example ones where the values are divided by non-linear functions of the uncertainty measure (e.g. squared or square root)? The results in Figure 4 are not a slam dunk for Model 4, as the effect of dC seems to outweigh C, while in the data it is the opposite. I don't think this is critical, but the authors might try an extra exponent parameter on uncertainty in Model 4. At minimum, the authors should discuss how they might modify Model 4 to better match the data.

      2) As I alluded to above, I think the article somewhat mischaracterizes the DDM by saying that "the most straightforward way to include option-specific noise in the preferential DDM - by assuming that noise increases with value uncertainty - leads to the wrong qualitative predictions..." "Most straightforward" is subjective. The standard diffusion model sets the diffusion noise variance to a constant, and so no, adjusting the noise is not "straightforward"; in many DDM software packages it is not even an option. Instead the effect of uncertainty would show up in the drift rate (or boundaries), as it does here. So, I would urge the authors to temper their claims in the introduction and discussion about what the "straightforward" model would be. Many researchers who use the DDM think about the drift rate as a signal-to-noise ratio, and for them Model 4 would have been the straightforward model.

      3) This isn't to say that what the article does isn't interesting or important. A standard DDM analysis would just fit different drift-rate and boundary parameters to high and low uncertainty conditions and then report the differences. This article takes a more elegant approach by explicitly modeling uncertainty in the DDM components. This is why I would urge the authors to do a bit more with that aspect of the paper, to try to better understand how uncertainty impacts the drift rates.

      4) On Page 16 - the authors write "in line with the best fit parameters". What exactly do they mean here? Did they use the best-fitting parameters or not? Could the authors add a table to the supplements with the average best-fitting parameters for each model, for each dataset? That would greatly help in understanding the results.

      5) Figure 4 - how were the experimental data and model simulations combined to generate these figures? For the data, was this one big mixed-effects regression including all datasets? How did the authors handle the random effects in this case, given the multiple datasets? The simulations are also vaguely described. How "similar" were the input values to the data; how exactly were these input values generated? Again, how were the simulations from different subjects/studies combined to generate a single plot per model? It would be useful, though not strictly necessary, to see the basic behavioral results broken down by study (in the supplements). It is unclear how consistent the patterns in Figure 2/4 are across the studies.

    1. Reviewer #3:

      This manuscript is a detailed analysis of the molecular mechanism for ISW2 recruitment in yeast. It not only delineates the binding interface between ISW2 and the transcription factor Ume6, but also finds similar interactions between ISW2 and Swi6. The authors take a systematic and rigorous approach in finding that a 27 amino acid region of Ume6 and the WAC domain of Itc1, an accessory subunit of ISW2, are responsible for recruiting ISW2 to Ume6 binding sites. The strength of this paper is that they focus on examining these interactions in vivo and use MNase-seq to show changes in nucleosome positioning upon mutation of Itc1, Ume6 and Swi6. The conclusions are well supported by the data and are compelling. In addition, they use the SpyTag approach to show these regions alone are capable of recruiting Isw2 to genomic target sites. They also show that amino acids 1-73 of Itc1 alone are sufficient for binding to the correct genomic sites, which is compelling evidence of their specificity. The authors, by comparing the sequence composition of the WAC domain in ISW2 orthologs from flies to humans, are able to explain a long-standing contradiction in this field about the apparently different roles of yeast ISW2 and its Drosophila homolog ACF/ISWI. The Drosophila ISWI complex appears to have a more global role in chromatin organization, whereas yeast ISW2 is more specialized or targeted. The WAC domain in ISWI is defective for recruitment by transcription factors such as Ume6 and Swi6, unlike that observed for ISW2. The other interesting finding or correlation derived from their results is that the recruitment of ISW2 by Ume6 and Swi6 may not only work to recruit ISW2 but may also regulate ISW2 activity, as the same region of Itc1 shown to bind to these transcription factors is also shown to regulate the activating function of the H4 tail on Isw2. The paper is well written, clear and nicely organized.

      I did have one question for the authors, as it seems that this type of recruitment may not be universal: only a subset of Ume6 sites behave as expected in their mutational analysis. Do the authors have any idea why that is the case and what makes this subset of sites behave differently?

    2. Reviewer #2:

      Chromatin remodelers use the energy derived from ATP hydrolysis to reposition or evict nucleosomes, thus shaping the chromatin landscape of the cell. In this study, the McKnight lab use creative genetic and genomic approaches to understand how the apparently nonspecific biochemical activity of one such chromatin remodeler, Isw2, is targeted to specific nucleosomes in the budding yeast genome. The use of an isw1/chd1 mutant is a nice approach to remove the effects of spacing factors, and the SpyTag/SpyCatcher approach is a novel idea for artificial recruitment of factors. The bottom line of the study is that small, conserved epitopes in transcription factors act as recruiting elements for Isw2, allowing precise targeting of a nonspecific biochemical activity to specific genomic loci. From a larger perspective, the results lend support to an interacting barrier model of nucleosome positioning, wherein positioning of specific nucleosomes defines the borders of nucleosomal arrays. The data appear to be of high quality and soundly interpreted, and I believe that the results will be of great interest to those interested in chromatin and transcription. There are many questions raised by the results that I believe will drive further investigation into specificity in chromatin remodeling. My one major criticism (not that major in the scheme of things) is that the authors analyze only the interesting subsets of their sites, as detailed below. One example is the analysis of the Isw2/Itc1 co-bound sites to the exclusion of the Isw2-alone sites. I think some exploration of these sites would be warranted, as discussed below.

      1) In Fig. S1C, there is nice correspondence between strong Isw2 K215R binding and Isw2-dependent nucleosome remodeling. However, at PICs where there is no apparent Isw2 remodeling, there does seem to be some Isw2 K215R ChIP-seq signal, albeit at a lower level. Does this potentially represent capture of transient sampling-type interactions, or something else?

      2) In Fig. S1D, Ume6 ChIP (WT and DBD alone) is shown at 202 intergenic Ume6 motifs. It is stated that the rows are linked with Fig. 1B - it would be nice to see the nucleosome data next to the ChIP data in this panel, as it appears that Ume6 is bound at some level to the majority of these 202 sites, while Isw2 seems only to be active at the 58 sites of cluster 1. Germane to this point, I of course understand why the authors focused on the cluster 1 sites, but it would be nice to have some speculation on why Isw2 only seems to function at a fraction of Ume6-bound loci. Also, the lengths of the cluster-denoting bars appear to be off here relative to Fig. 1B.

      3) In Fig. 5C, it appears that only a subset of Isw2 sites are bound by Itc1 as well. Again, as with the selection of the 58 Ume6 sites, I understand why the Isw2/Itc1 co-bound sites are selected for further analysis, but the Isw2 sites without Itc1 could be discussed as well. Are these sites non-functional? How does Itc1 ChIP-seq data compare to the Isw2 remodeling activity shown in Fig. 1A? How does it compare to Ume6 binding? Does it specify the Isw2-remodeled nucleosomes?

      4) Did the authors perform western blots to ensure that their various truncation constructs were stable? This is important for interpretation of the results vs deletions.

      5) To summarize the above points, a major thing missing from the discussion is why only subsets of TF binding sites recruit Isw2. For instance, as mentioned above, 58 Ume6 sites seem to specify Isw2 remodeling - what is special about those sites versus the other ~150 sites that appear to be bound by Ume6? It's mentioned briefly in the discussion that only three Swi6 sites were identified as Isw2-recruiting and that this may be tuned by cellular context, but this is quite vague and superficial. More speculation on what differentiates these sites from the TF-bound but non-Isw2 recruiting sites could be included.

    3. Reviewer #1 (Jerry Workman):

      This is a paradigm-shifting study which demonstrates targeting of the Isw2 complex by a sequence-specific DNA binding protein, Ume6. Previously the Isw2 complex was thought to be a promiscuous nucleosome sliding ATPase that would globally space nucleosomes like Chd1 or Isw1. However, the current study demonstrates that Isw2 primarily targets a single nucleosome adjacent to Ume6 binding sites.

    1. Reviewer #3:

      The authors report results of an MEG analysis deploying a cognitive paradigm in which participants engage in a source memory task characterized by the appearance of three images in succession and are then tested via a cue (the first of the three images) followed by a choice of responses for a two dimensional pattern and then a choice (out of three images) of a photographic scene.

      The principal finding is that (via MEG sensor level data) there is a widespread 8-15 Hz power decrease that is correlated with the number of recalled items (from 0 to 2) on a given trial. In the hippocampus (via MEG source reconstruction), the magnitude of phase amplitude coupling observed as participants are told to associate the items is correlated with memory performance. The 8-15 Hz power decrease/memory correlation (as estimated by beta coefficients in a model described in Figure 1) is larger (across individuals) during moments when subjects are viewing the stimulus items as opposed to during the "associate" period. The novelty in the result is related to the experimental task that attempts to dissociate memory-related effects related to perception from those related to binding which putatively occurs when subjects are given the "associate" instruction.

      My main conceptual concern is related to the design of the experimental task. I am not sure that the perception/binding framing is appropriate, since there is no reason to think that subjects are not associating/binding items during the periods when the items are being shown on the screen. I suppose this may partly explain the lack of a significant difference in PAC/memory beta coefficients observed in the hippocampus when contrasting these two epochs (Figure 4). But the corollary is that the alpha power-related beta coefficients are observed while binding is likely also occurring within the paradigm (especially since each image is shown for 1.5 seconds, it would seem). Is the alpha power effect seen in the hippocampus? The plots in 3a suggest there is an oscillation present in the relevant frequency range, and the time course of alpha power differences seen in Figure 2 suggests that they occur relatively late after onset of the images, which may fit better with some contribution for this pattern to the forming of associations rather than perception.

      I understand that the paradigm was constructed in an attempt to temporally dissociate memory effects attributable to perception versus those attributable to binding. But given the temporal resolution available using MEG, I would imagine that the authors could differentiate an earlier perception-related effect from a later PAC binding effect in the time series if the associated images were presented in conjunction. Is it correct to frame the alpha results as related to "perception"? In my interpretation, the beta coefficients used for analysis reflect a "memory related effect observed when visual stimuli are present on the screen," but not necessarily improved memory predicated on more accurate perception. I would think that a perception/binding distinction requires operationalizing perception as activity that doesn't vary with later associative memory success, and binding as activity that does. The notion of perception used by the authors here seems slightly different. The authors can perhaps comment on this concern.

      The authors report PAC results for other regions on page 6, but claiming that PAC is a hippocampal-specific effect would require showing that the PAC-related beta coefficients are significantly greater than the other regions, rather than simply the absence of a significant effect in these regions. The authors should also clarify if they combined locally measured PAC over several ROIs into an average for these other regions? It seems unlikely to detect PAC if a single theta/gamma time series were extracted over such a large area of cortex.

      The interaction effect reported at the end of the results (ANOVA model) is interpreted such that the cortical alpha effect is stronger when the visual items are presented, while the hippocampal PAC effect is stronger when no items appear on the screen, but these recordings are made in different regions (hippocampus versus the entire cortex). If my understanding is correct, a result in line with the model the authors suggest (cortical alpha power decrease/hippocampal PAC) would show a region (hipp v cortex) x task (images on screen vs "associate" command) x metric (PAC vs alpha) interaction. Can the authors clarify if the cortical data entered into this model includes only those regions that showed a significant effect initially, or just all the sensors? The former would seem to introduce bias.

      Similarly, the different visual classes are always presented in the same order, which may give rise to the strong disparity in recall fraction between the pattern and scene images. I understand the linear model incorporates predictor variables for scene/pattern recall, but given that scene recall is driving a significant amount of the overall recall number observed as the main variable of interest, I would wonder if the alpha/beta power effects are related to the relative complexity of the scene images as compared to the patterns. Given the analysis schematic the authors report, I assume the authors have analyzed whether the same effects occur when contrasting scene versus no recollection and pattern vs no recollection. If the same effects are observed regardless of type of image (when compared with no recollection) this may help address this concern.

      My second conceptual question is related to MEG data. It appears to me that the authors use MEG sensor-level data for the alpha-related effect in the cortex (Figure 2), but MEG beamformer reconstructed data (localized to the hippocampus) for the PAC effect. Is there a reason the authors did not use MEG data localized to specific cortical regions rather than sensor data? This may reflect confusion on my part, but I don't understand why they would use qualitatively different types of data for these two aspects of the analysis that are then combined (in the ANOVA, for example).

      The authors should also engage with concerns regarding the validity of localizing MEG signals (especially for an analysis such as PAC) to deep mesial temporal structures such as the hippocampus. I understand that MEG systems with greater than 300 sensors are more reliable for this purpose, but I think a number of readers would still have doubts about MTL localization of signal. Also, my understanding is that such deep source localization requires around 100 trials per class, which I think fits with what the subjects completed, but the authors may include references related to this issue.

      I think the signal processing steps are overall quite reasonable. I would ask the authors to clarify if they limited their analysis of cortical alpha/beta oscillations to those in which a peak exceeded the 1/f background, as they report for the PAC analysis on page 5. Also, it would be helpful to show that the magnitude of the MI values in the hippocampus exceed those observed by chance (using a shuffle procedure) in addition to showing that there is a memory-related association reflected in the beta coefficients.
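
      As an illustration of the shuffle procedure requested above, the following is a minimal sketch and not the authors' pipeline. It assumes that theta phase and gamma amplitude have already been extracted as equal-length lists, that the modulation index is a Tort-style, entropy-based MI, and that surrogates are built by circularly shifting the amplitude series. The function names, the bin count and the number of surrogates are all hypothetical choices.

```python
import math
import random


def modulation_index(phases, amplitudes, n_bins=18):
    """Entropy-based modulation index (assumed Tort-style definition).

    phases: theta phases in radians; amplitudes: gamma envelope values.
    Returns a value near 0 when amplitude is flat across phase bins.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for ph, amp in zip(phases, amplitudes):
        # Map the phase onto [0, 2*pi) and then onto a bin index.
        frac = ((ph + math.pi) % (2 * math.pi)) / (2 * math.pi)
        idx = min(int(frac * n_bins), n_bins - 1)
        sums[idx] += amp
        counts[idx] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    total = sum(means)
    if total == 0:
        return 0.0
    probs = [m / total for m in means]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return (math.log(n_bins) - entropy) / math.log(n_bins)


def surrogate_p_value(phases, amplitudes, n_surrogates=200, seed=0):
    """Compare the observed MI with MIs from circularly shifted amplitudes.

    Returns (observed MI, one-sided p-value with the +1 correction).
    """
    rng = random.Random(seed)
    observed = modulation_index(phases, amplitudes)
    n = len(amplitudes)
    exceed = 0
    for _ in range(n_surrogates):
        shift = rng.randrange(1, n)  # hypothetical choice: any non-zero lag
        shifted = amplitudes[shift:] + amplitudes[:shift]
        if modulation_index(phases, shifted) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_surrogates + 1)
```

      A circular shift keeps the autocorrelation of the amplitude series intact while breaking its alignment with phase. It is only informative when the phase series is not perfectly periodic, which is the case for real recordings but not for an idealised sine wave.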

    2. Reviewer #2:

      In this manuscript, the authors examine the neural correlates of perception and memory in the human brain. One issue that has plagued the field of memory is whether the neural processes that underlie perception can be dissociated from those that underlie memory formation. Here the authors directly test this question by introducing a behavioral paradigm designed to dissociate perception from mnemonic binding. In brief, while recording MEG data, they present subjects with a sequence of visual stimuli. Following the sequence, the subjects are instructed to bind the three stimuli together into a cohesive memory, and then are tested on their memory for which pattern was associated with an object, and which scene. The authors investigate changes in alpha/beta power and theta/gamma phase amplitude coupling during two separate epochs - perceptual processing and mnemonic binding. Overall, this is a well written and clear manuscript, with a clear hypothesis to be tested. Using MEG data enables the authors to draw conclusions about the neurophysiological changes underlying both perception and memory, and establishing this dissociation would be an important contribution to the field. I think the conclusions are justified, but there are several issues that should be addressed to improve the strength and clarity of the work.

      The fundamental premise of the task design is that subjects view a sequence of stimuli, and then separately at a later time actively try to bind those visual stimuli together as a memory. However, it is entirely possible, and even likely, that memories are being formed and even bound together as the subjects are still viewing the sequences of objects. How would the authors account for this possibility? One possible way would be if there were a control task where subjects were just asked to view items and not remember them.

      Another possibility would be to examine the trials that the participants failed to remember correctly. Presumably, one would still see the same decreases in alpha power. Yet it seems from the data, and the correlations, that during those trials that were not remembered properly, alpha power changed very little. Of course, it is unclear in these trials if failed memory is due to failed perception, but one concern would be that this would imply that decreases in alpha power are relevant for memory too. It would be helpful to see how changes in alpha power break down as a function of the number of actual items remembered. It would also be helpful to know how strong these correlations actually are.

      A related issue is with respect to hippocampal PAC. The authors investigate this during the mnemonic binding period. Yet they also raise the possibility in discussion that this could also be happening during perception, which goes back to the point above. Did they analyze these data during perception, and are there changes with perception that correlate with memory? This would suggest that binding is actually occurring during this sequence of visual stimuli.

      The authors perform a whole-brain analysis examining the correlation between alpha power and memory to identify cluster-corrected regions of significance. However, the PAC analysis focuses only on the hippocampus, raising the question of whether these results can account for the possible comparisons one could make in the whole brain. They do look at four other brain regions for PAC, which it would be helpful to account for. In addition, are there other measures of mnemonic binding that are significant? For example, theta power, or even gamma power?

      The authors note in the discussion that the magnitude of hippocampal gamma synchrony has been shown to be related to the decreases in alpha power. Is this also true in their data?

    3. Reviewer #1:

      This MEG study by Griffiths and colleagues used a sequence learning paradigm which separates information encoding and binding in time to investigate the role of two neural indexes - neocortical alpha/beta desynchronization and hippocampal theta/gamma oscillation - in human episodic memory formation. They employed a linear regression approach to examine the behavioral correlates of the two neural indexes in the two phases, respectively, and demonstrated an interesting dissociation, i.e., decreased alpha/beta power only during the "sequence perception" epoch and increased hippocampal theta/gamma coupling only during the "mnemonic binding" phase. Based on the results, they propose that the two neural mechanisms separately mediate two processes - information representation and mnemonic binding. Overall, this is an interesting study using a state-of-the-art approach to address an important question. Meanwhile, I have several major concerns that need more analysis and clarification.

      Major comments:

      1) The lack of theta-gamma coupling during the stimulus encoding period is possibly due to the presentation of the figure stimuli, which would elicit strong sensory responses that mask the hippocampal activity. How could the authors exclude this possibility? In other words, the dissociated results might derive from different sensory inputs during the two phases.

      2) About the hippocampal theta/gamma phase-power coupling analysis. I understand that this hypothesis derives from previous research (e.g., Heusser et al., 2018) as well as from the group itself (Griffiths et al., PNAS, 2019). Meanwhile, MEG recording, especially with gradiometers, is known to be relatively insensitive to deep sources. Therefore, the authors should provide more direct evidence to support this approach. For instance, the theta/gamma analysis relies on the presence of theta-band and gamma-band peaks in each subject. Although the authors have provided two representative examples (Figure 3A), it remains unknown how reliably the theta-band and gamma-band peaks are present in individual subjects.

      3) Related to the above comment, the theta-gamma coupling is a brain-wide phenomenon including both cortical and subcortical areas and not limited to just hippocampus. Although the authors have performed a control analysis to assess the behavioral correlates of the coupling in other regions, the division of brain region is too coarse and I am not convinced that this is a fair comparison, since they differ from hippocampus at least in terms of area size in the source space. The authors could consider plotting the power-phase coupling distribution in the source space and then assessing their behavioral correlates, rather than just showing results from hippocampus. This result would be important to confirm the uniqueness of the hippocampus in this binding process.

      4) About behavioral correlates. The current behavioral index confounds encoding and binding processes. Is there any way to separate the encoding and binding performance from the overall behavioral measurements? It would be more convincing for me to find that the two neural indexes at the two phases predict the two behavioral indexes, respectively.

      5) The authors' previous work has elegantly shown the two neural indexes during fMRI and intracranial recording in episodic memory. The current work, although providing an interesting view about their possible dissociated functions, only focuses on the memory formation period (information encoding and binding). Given previous work showing an interesting relationship between encoding and retrieval (Griffiths et al., PNAS, 2019), I would recommend that the authors also analyze the retrieval period and see whether the two indexes show a consistent functional dissociation there as well.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 1 2020, follows.

      Summary

      The study isolates bacteria from diverse Antarctic samples which utilise DMSP as the sole carbon source. It initially focuses on a Gammaproteobacterium, Psychrobacter sp. D2, which the authors establish lacks a known DMSP lyase enzyme despite having DMSP lyase activity (this needs to be quantified). Through RNA-seq and bioinformatics, they identify the gene cluster responsible for this activity and identify a novel DMSP lyase somewhat related to DddD in that it involves CoA, but critically also ATP, which distinguishes it from the pack of other known Ddd enzymes. This enzyme is an ATP-dependent DMSP CoA synthase required for growth on DMSP, and its transcription is upregulated by DMSP availability. The novel mechanism of this enzyme is proposed from a strong structural component to the study. The authors propose the downstream pathway for DMSP catabolism, which we find to be oversold and requiring gene mutagenesis to confirm, and to be preliminary in comparison with the authors' other findings. Finally, the study attempts to show how widespread the enzyme is in sequenced bacteria, confidently showing it to be functional in other related Gammaproteobacteria and some Firmicutes.

      Essential Revisions

      1) Title: "a missing route", was it really missing? We would suggest a more precise title. Would be better to say "that releases DMS" or an alternative.

      2) This is a Ddd enzyme by definition and should be named as such.

      Line 27 - We disagree with the use of a new gene prefix when there is a strong precedent for the use of Ddd for "DMSP-dependent DMS". If this enzyme is a DMSP lyase and is in bacteria then its naming should follow protocol and be called Ddd"X", X being a letter not currently utilised in known systems. Deviating from this convention causes confusion and is not appropriate. Furthermore, AcoD is already assigned in some bacteria to acetaldehyde dehydrogenase II.

      3) As presented, the bioinformatics-based evidence regarding the broad distribution of this enzyme (as claimed e.g. in the Abstract, line 33) does not stand up. Currently as presented in the manuscript, especially Fig 6, we are led to believe the enzyme is more widespread than can be demonstrated based on the authors' evidence (i.e., the authors allow a very low threshold of sequence identity and claim function outside of the groups they have tested). Either more work is needed to show that claims of such a wide distribution are merited, or the authors should limit their claims to what can be substantiated by their work. Specifically, the authors cannot comment on the "functional" enzyme being widespread outside of the Gammas and Firmicutes that were tested, let alone the importance of the role in DMSP cycling. Only three "AcoD" enzymes were ratified in this study, which are relatively closely related to each other (Psychrobacter sp. D2, Sporosarcina sp. P33 and Psychrobacter sp. P11G5, which are > 77% identical to each other). As can be seen in Fig 6, these three proteins cluster together and are far removed from all the other sequences on the figure, for which we have no evidence of their function (i.e., nothing can realistically be said on Deltas, Actinos or Alphas or the MAGs). Just to be clear, these other proteins shown in clades above and below the functional "AcoDs" in Fig 6 are only ~30% identical to ratified "AcoD". Furthermore, only strain D2 was shown to make DMS; none of the other strains were tested. Far more testing of the diverse enzymes and strains is needed to make these statements, as this study only tests one strain and three of the closely related enzymes (defined on Fig 6). Additional specific comments on this issue:

      Line 280. The sentence on MAGs and the environments containing them does not stand up for reasons summarised above. None of the MAGs shown on Fig 6 is similar enough to "AcoD" to be termed a functional Ddd enzyme. More work has to be done on the strains and enzymes that are more divergent from true "AcoDs" before such a statement is supported. Please delete.

      Line 509 - We agree with what the authors write about stringency. However, these parameters do not seem to have been utilised as stated here. Their stringency statement holds up for comparison between the D2 "AcoD" and two other tested "AcoD" enzymes and all those in the middle clade on Fig 6. But this is not the case for the proteins shown above and below this "AcoD" clade in Fig 6, which have at best around 30% identity to characterised enzymes. See below for examples. As the authors state in their methods, high-stringency methods are needed to exclude other acetyl-CoA synthetase family proteins. Thus, most of the genes shown on Fig 6 cannot be taken as having this Ddd activity.

      "To further validate that these AcoD homologs" the authors examined the activity of two closely related enzymes from a group of nine homologs with > 65 % sequence identity (starting line 283, Figure 6). It is not surprising that these enzymes have the same activity. Homologs outside this group of nine (Figure 6) are far less related to the characterized AcoD (< 32 % seq. identity). Conservation of the phosphate-transferring His (His292) and an active site Trp (Trp391) does not seem to be strong evidence for functional conservation. The manuscript does not provide any additional evidence that these less related enzymes also degrade DMSP. Either more experimentation is necessary, or the paragraph on the "Distribution of the ATP DMSP lysis pathway in bacteria" must be revised.

      For example: Psychrobacter AcoD (WP_068035783.1) is 31% identical to Bilophila sp. 4_1_30 (WP_009381183.1) in the lower group of bacteria on Fig 6. Psychrobacter AcoD (WP_068035783.1) is 29% identical to Thermomicrobium roseum (WP_041435830.1) in the upper group of bacteria on Fig 6.

      Line 283. This is not the case! The two sequences that were chosen to "validate" are far too close to the D2 "AcoD" compared with the MAGs and other potential "AcoDs" shown above and below the functional Ddd clade on Fig 6. This section design is weak and does not lend weight to the expansiveness of this family. More work on the more diverse enzymes and bacteria is needed to support the authors' claims. Please delete or study the activity of the more diverse strains and their candidate "AcoDs".

      Fig. 6. This is a nicely presented figure that unfortunately slightly deceives the reader. The authors need to clearly show which strains they have shown to have Ddd activity (currently one as I understand it) and which enzymes they have shown to have the appropriate activity (currently three closely related enzymes as I understand it). If I am not wrong these are all confined to the middle clade of Gammas and Firmicutes. These stand clearly apart from the other strains (above and below) which have not been studied and which are only ~30% identical to "AcoD" at the protein level. This is not clear on the figure and definitely misleads in the abstract and throughout the manuscript.

      4) We expect to see kinetics done on the new enzyme in line with what the authors have done in other related studies on Ddd and Dmd enzymes.

      This is important to place the work in context with previously identified Ddd and Dmd enzymes, many of which have been analysed by these authors in previous publications. The characterization of the AcoD activity remains entirely qualitative. The authors only provide relative activities measured at a single substrate concentration. This data does not support the following statement: "Mutations of these two residues significantly decreased the enzymatic activities of AcoD, suggesting that these residues play important roles in stabilizing the DMSP-CoA intermediate" (l.223-225).

      5) The manuscript does provide unambiguous evidence for the activity of AcoD and its function during growth on DMSP. On the other hand, the description of the "ATP DMSP lysis pathway" is less clear.

      Transcriptomics analysis (Figure 2C) suggests that growth on DMSP upregulates the genes 1696 (BCCT), 1697 (AcoD), 1698 and 1699. The functions of the third and fourth proteins remain unclear (line 253). Instead, a reductase (AcuI) encoded somewhere else on the same genome was shown to transform the acryloyl-CoA to propionate-CoA. What was the transcription profile of acuI and acuH in the RNA-seq? Were they induced by growth on DMSP? Is the 1696-1697-1698-1699 gene cluster conserved? What is the function of 1698 and 1699? These questions are only relevant if the authors plan to maintain the claim of having identified a new pathway. This pathway prediction component is very weak and could be supplemented by KO mutagenesis of dddCB and acuI. Without such work this is speculation and needs to be written as such.

      6) Appropriate controls, units and quantification should be used:

      Line 102- Please give a normalised value for the level of DMS produced from DMSP per time and protein/cells.

      Figure 2. A. One would expect to see a growth curve of D2 on DMSP compared to acrylate, a conventional carbon source (e.g. pyruvate, glycerol or succinate) and a no carbon control. As "AcoD" is predicted to ligate CoA to DMSP it would be good to know if the strain grows on acrylate. It might be predicted to have different properties to e.g. Halomonas which does grow on acrylate. At least a no carbon and conventional carbon source should definitely be included.

      B. The units for this figure are not appropriate. It would be more appropriate to show the actual amount of DMS that is produced by the strain, ideally normalised to protein, cells or absorbance and time. Detail in the figure what the control is.

      C. Would like to see error bars on this figure. Also would have been sensible to colour code these to match panel D.

      Figure 3. B and C. as with Figure 2 we need to see levels of DMS normalised to cells/protein and time.

      Line 374 - No controls. Please include these as detailed above. No carbon, conventional carbon source, acrylate?

      Quantitative data supporting Supplementary Fig. 12 would be helpful. After all this route would have to explain that the bacteria can use acrylate CoA as sole carbon source (or at least alternatives would have to be discussed). Is the identified activity sufficient for this task?

      Line 388 - This method is/should be quantitative. It is standard practice to report DMS production normalised to time and cells/protein. Here we are only given peak area.
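
      The normalisation requested in point 6 amounts to a simple calculation, sketched below. The sketch assumes that GC peak areas have already been converted to nmol DMS with a calibration curve, and that total protein in the assay is known. The function name, the units and the example numbers are hypothetical and are not taken from the manuscript.

```python
def normalised_dms_rate(dms_nmol, minutes, protein_mg):
    """Return DMS production in nmol DMS per minute per mg protein.

    dms_nmol:   DMS produced in the assay, already converted from peak area
    minutes:    incubation time
    protein_mg: total protein in the assay (cell counts could be used instead)
    """
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("incubation time and protein amount must be positive")
    return dms_nmol / (minutes * protein_mg)


# Hypothetical assay: 120 nmol DMS in 60 min from 0.5 mg protein.
rate = normalised_dms_rate(120.0, 60.0, 0.5)
print(f"{rate:.1f} nmol DMS min^-1 mg protein^-1")
```

      Reporting rates in this form would allow direct comparison between strains and with previously characterised Ddd enzymes.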

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on November 30 2020, follows.

      Summary

      The exact relationship between G6PD deficiency and malaria protection remains uncertain. This study provides evidence that the G6PD Med mutation (563 C>T) protects against clinical Plasmodium vivax disease. It uses a Bayesian statistical approach which specifically elucidates the particular protection which female heterozygotes versus male hemizygotes (or female homozygotes) for the Med mutation may experience. This is an important contribution to our understanding of the relationship between G6PD deficiency and P. vivax.

      Overall, the reviewers were positive about the work and its potential, but have some clear concerns that will require additional data, analyses, and interpretation. Below are the main points raised by the reviewers that would need to be addressed in a revised manuscript.

      Essential Revisions

      1) The presence of mixed infections: although the work is focused on P. vivax, the majority (95%) of malaria in Afghanistan is caused by P. falciparum, which means mixed-species infections are likely common and P. falciparum infections may be obscuring P. vivax infections. It is not clear to what extent G6PD deficiency may impact the chance of being coinfected with both falciparum and vivax. Ideally, PCR verification of these samples would be performed to confirm the species for samples included in the analysis. Without these molecular data, the overall assessment of susceptibility to vivax malaria in association with G6PD Med is incomplete.

      2) The analysis relies on a number of assumptions made about Pashtun population genetics (e.g. is it reasonable to assume the same frequency of the relevant mutation throughout all the tribes in the study, and should this be at Hardy Weinberg equilibrium?) and it is not clear to what extent these assumptions are justified since little evidence/support is provided. In particular, the assumptions about Hardy Weinberg equilibrium of G6PD Med within the Pashtun population need to be justified and supported since the analysis is highly reliant on this assumption.

      3) The exclusion criteria do not appear to have been uniformly applied - in particular, anemia was an exclusion criterion for only part of the data. This was not clear and may impact the overall significance of statistical results.

      4) While the manuscript makes a number of conclusions about female homozygotes, these are not strongly supported by the evidence. In particular, the study is likely under-powered with regard to clinical associations among female homozygotes with G6PD Med, but this is not addressed and the stated conclusions are likely stronger than what can be supported by the data/analyses provided.
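
      To make concrete what the Hardy-Weinberg assumption in point 2 implies, and why point 4 raises a power concern, here is a minimal sketch of the expected genotype frequencies at an X-linked locus such as G6PD. It assumes random mating and equal allele frequencies in males and females. The function name and the allele frequency used in the example are hypothetical and are not estimates from the study.

```python
def x_linked_hwe(q):
    """Expected genotype frequencies at an X-linked locus under
    Hardy-Weinberg equilibrium, given variant allele frequency q."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    p = 1.0 - q
    return {
        "male_hemizygote": q,            # males carry a single X
        "male_wild_type": p,
        "female_homozygote": q * q,
        "female_heterozygote": 2.0 * p * q,
        "female_wild_type": p * p,
    }


# Hypothetical allele frequency of 10%, for illustration only.
for genotype, freq in x_linked_hwe(0.10).items():
    print(f"{genotype}: {freq:.3f}")
```

      Under these assumptions female homozygotes occur at frequency q squared, so they are far rarer than male hemizygotes (frequency q) or female heterozygotes (frequency 2pq). Any departure from equilibrium, for example through population substructure between tribes, would change these expected proportions.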

  3. Nov 2020
    1. Reviewer #3:

      This is a very thorough study giving new insight into a non-cell autonomous mechanism for DCC in axon guidance in midline fusion important for corpus callosum axon guidance.

      I have no substantive concerns.

    2. Reviewer #2:

      This paper is the second in a series of landmark studies from the Richards lab that re-assess the molecular and cellular mechanisms that permit the corpus callosum (CC) to cross the interhemispheric midline in the telencephalon. The Richards lab previously showed a key role for a specialized population of fetal astrocytes, the midline zipper glia (MZG), in establishing this substrate when the MZG migrate into the interhemispheric fissure (IHF), intercalate with one another and degrade the intervening leptomeninges. In this manuscript, the authors now assess the requirement for the Ntn1/Dcc pathway in remodeling the IHF. In an elegant series of experiments, they show that Ntn1/Dcc regulate the migration pathway of the MZG, potentially by directly controlling cytoskeletal dynamics. This mechanism is conserved between humans and rodents; the authors show that Dcc mutations that cause CC dysgenesis in humans cause striking changes in the morphology of astroglial-like cells, consistent with the regulation of MZG migration. Thus, Dcc appears to have two roles: first, remodeling the MZG, and then guiding CC axons towards the telencephalic midline. Together, these studies continue the ongoing re-evaluation of the role of netrin1/Dcc in establishing neural circuitry, and shed further light on a fascinating and beautiful piece of biology.

      This is a very beautiful manuscript, the authors are to be congratulated for the very high quality of their images, and detailed quantifications. Would that all studies were so thorough! These studies will be of great interest to the developmental neuroscience research and clinical communities.

      Major comments

      The authors should be congratulated for including what was clearly a difficult conditional analysis to assess whether Dcc is required in the callosal axons, or in the MZG radial fibers. This analysis was confounded a) by the low efficiency of the shRNA to knock down Dcc and b) the mosaic nature of the Emx::cre line, which appears to be variably expressing cre in both callosal neurons and MZG, given that TDT/Dcc are present in both axons (Fig 5B), and the MZG (Fig 5O) in the less severely affected animals.

      As currently presented, however, the analysis (sadly) does not greatly add to the paper, since technical issues beyond the authors' control have made it difficult to assess specifically where Dcc is required with much confidence. Could the authors consider removing the shRNA approach from the manuscript, and re-focusing the cKO data on a description of a Dcc phenotypic series? This analysis might fit better with the initial description of the lack of interhemispheric remodeling observed in Dcc/Ntn mutant mice, and how it relates to (variable?) phenotypes observed in patients.

      Minor Points:

      1) Fig 3C, D. The failure of the MZG radial fibers to extend along the IHF in Dcc mutant at E15 is very striking, and well described in the text. However, there appears to be an additional more punctate Glast/Nestin signal immediately above the radial fibers in IHF in the E15 mutants, what is that?

      2) Fig 4E. Could the increased numbers of migrating MZG cells seen on the surface of the IHF in E16 Dcc mutants be because there is no "stop" signal created when the IHF is remodeled?

      3) Fig 5B. The failure of the GFAP cells to move away from the third ventricle in Dcc mutants seems profound in both the figures and the quantification. Can the authors elaborate more on why the 0-400 um measurement doesn't rise to being significant in the Dcckanga mutants? Perhaps spell out (p=0.0?) where the trend lies on Fig 5B?

    3. Reviewer #1:

      In this manuscript, the authors revisit DCC and NTN1 mutants in order to better define the basis for midline crossing defects. This group recently demonstrated that midline zipper glia (MZG) must migrate along the interhemispheric fissure (IHF) and intercalate across the midline while remodeling the meningeal basement membrane to provide a substrate for callosal axons to cross the midline. In this study, they show that DCC and its ligand NTN1 are required for proper MZG distribution/morphology along the IHF, proper remodeling of the basement membrane, and subsequent corpus callosum (CC) formation. The data in figures 2 and 3 generally do a nice job of supporting the model that DCC and NTN1 are expressed in MZG and that the morphology and distribution of MZG are affected in DCC/NTN1 mutants. There appear to be some defects in MZG migration that may account for this (Figure 4). Due to technical limitations, the authors' attempt to use a conditional knockout of DCC to genetically dissect whether CC formation defects are due to defects in MZG or callosal axons is a bit inconclusive (Figure 6). Finally, the paper ends with experiments showing that mutations in DCC identified in acallosal patients are loss-of-function using an in vitro cell morphology assay (Figure 7 and 8).

      The authors are commended for the quality of their imaging data and for being as quantitative as possible when measuring their in vivo phenotypes, which is not often done with these types of studies. There are a few issues that need to be addressed.

      Major points:

      1) In Figure 4, in addition to the migration defects of Sox9+ MZG, there seems to be a rather large increase in the total number of Sox9+ cells along the IHF by E16 (more than 2 fold, Figure 4G). The authors show there is no change in cell cycle or apoptosis of these cells in the supplemental data (Figure S4), so what accounts for this increase? Is this also seen with NFIA/B staining at E16?

      2) Regarding the attempt to distinguish between DCC in MZG versus callosal axons (Figure 6), the incomplete deletion/loss of DCC protein (Figures 6C, I, J) is a bit concerning. It's not clear to me why this would happen, but it confounds the interpretation of the results. While the authors state "The severity of callosal agenesis was associated with the extent to which the IHF had been remodeled" (pg 15), they don't actually quantify this. It might be informative to generate scatterplots of IHF length vs. CC/HC length to determine if there is a significant correlation between the two. This might lend more evidence to a causal relationship between IHF remodeling and CC/HC formation.

      3) At the end of the result section, the authors state: "mutations that affect the ability for DCC to regulate cell shape (Figure 8F), are likely to cause callosal agenesis through perturbed MZG migration and IHF remodelling." (pg. 19). While the authors nicely show that patient mutations in DCC affect the morphology of cells in cell lines (Figure 7-8), it is not clear why simply transfecting WT DCC into cell lines results in such a dramatic change in morphology, or why addition of NTN1 doesn't increase this. The authors mention that the cell lines could express NTN1 or that NTN1 is not required for the effect. This seems an important distinction. Did the authors check this? Could they use a function blocking antibody or a soluble fragment of the NTN1 binding domain of DCC to block NTN1:DCC interactions? DCC has been shown to function as a "dependence receptor" that can induce apoptosis in the absence of ligand; are the authors certain that the morphology changes they are seeing in DCC transfected cells aren't cytoskeletal changes resulting from caspase activation?
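
      The correlation suggested in major point 2 could be computed along the following lines. This is a minimal sketch that assumes one IHF length and one CC/HC length per animal, measured in the same units. The function name is hypothetical and no data from the manuscript are used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples,
    e.g. IHF length (xs) and CC/HC length (ys), one pair per animal."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three paired measurements")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

      Plotting the paired values as a scatterplot alongside the coefficient would show whether any association is driven by a few severely affected animals. A significant correlation would support, but not by itself establish, a causal link between IHF remodeling and CC/HC formation.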

      Minor points:

      1) The authors should mention recent work showing Netrin localization to basement membranes during axon guidance (Varadarajan et al., Neuron 2017). The data in Figure 2 are very much in agreement with this previous work, and it should be mentioned in this context.

      2) Figure S5A: Representative images from each genotype don't look comparable, even though there's no difference in quantification.

      3) Did the authors check whether the cell lines they used in Figure 7-8 express DCC?

    1. Reviewer #3:

      Substantive concerns:

      1) Regarding hypothesis 4, the authors test whether or not desiccating species have lower TE loads than non-desiccating species, but in my opinion the logic outlined in lines 114-124 suggests that the relationship between desiccation and TE load may be more nuanced than overall TE load. It could be possible that DSB repair associated with desiccation removes only recent insertions if homologous pairing is involved, or high-copy TEs if ectopic recombination has occurred. The authors already test recent TE activity elsewhere in the manuscript, so they could compare signatures of recent activity in desiccating vs non-desiccating species to see if there are fewer recently active TEs in desiccating species. Similar comparisons could easily be made for abundance of high-copy TEs (regardless of length).

      2) Additionally, regarding the signatures of recent transposition, the authors have done a thorough job comparing TE divergences and LTR insertions, but since transcriptomes for some species are available, presence of transcribed TEs could provide further support for recent and ongoing TE activity.

    2. Reviewer #2:

      This manuscript represents a very considerable amount of work, both wet lab and analytic, constituting excellent science. This may be the best paper yet produced on Bdelloids. Despite this glowing recommendation I have some very significant concerns about certain parts, their conclusions section, and the evidence for "enhanced cellular defence mechanisms" in the abstract. Some parts are very rigorous, but others give in to excess speculation. This paper does not really need additional work, it needs some re-writing. Afterwards this important manuscript would be a welcome addition to the field, even without the supposedly unique defence mechanisms.

      Substantive concerns:

      1) Line 273 onwards: There is a comparison in the manuscript between Bdelloids and Monogononts. It wasn't clear to me, however, that these groups had been sampled sufficiently. The Monogononts are represented by 5 species (8 genomes) within a single genus, in no way representing the diversity of Monogononts, and the sampling of Bdelloids is also small. The authors should take a more cautious tone in any conclusions.

      2) Lines 276-278: The rationale for focussing on this specific group of TEs did not appear robust. The authors say "this class of TEs is thought to be least likely to undergo horizontal transfer and thus the most dependent on sex for transmission". But other groups are not evolving predominantly by horizontal transfer, and transposons can change without meiotic sex; this section needs to be written a little more clearly. The following lines make a case that some transposon groups increase, and some decrease, in frequency. The obvious hypothesis is drift, but the writing was unclear: I always felt that some other mechanism was being proposed but never really stated clearly.

      3) Lines 288-300, comparison of TE abundances across animals: this section was very poorly done. I thought the authors could delete this comparison and have a better manuscript. How were these other species chosen? Is C. elegans a good representative of the entire phylum Nematoda? Are the tardigrades representatives of their phylum? Assembly and annotation methods vary enormously across datasets, so what can the authors conclude without standardising assembly and annotation for these other animal groups? The authors say "as expected, both the abundance and diversity of TEs varied widely across taxa". This was indeed expected; Figure 2b seems to show noise, and suggests to me that the inclusion of these data was not a good idea. I suggest it is removed, or that a very substantive analysis and discussion of the way in which it is an accurate and representative sample of animal transposon loads is written.

      4) Line 350-353: This section is weak and needs to be improved. The authors need to make it very clear that this is not a test, it is a single observation. The phrase "as predicted by theory for elements dependent on vertical transmission" seems rather unsupported. Does this relate to the argument put forward in lines 276-278? It was unconvincing at this point also. The current description that some families increase and some decrease is couched in what sounds like too meaningful sounding language, which could be improved to be more consistent with the results. Lines 353-355 here seem to make an argument that the variation of TEs in bdelloids is purely a phylogenetic effect variably present in some bdelloid lineages and related groups. If this is their view (and it seems very reasonable indeed) then the manuscript would be improved considerably if they stated it more clearly.

      5) Lines 533-535 "consistent with a high fit of the data to the phylogeny under a Brownian motion model as would be expected if TE load evolves neutrally along branches of the phylogeny." I felt that this was a truly excellent result that needed to be put forward more strongly in other areas of the manuscript. In this area, and some others in this manuscript, the authors have truly unique data dramatically improving our understanding of bdelloids. The manuscript would be improved if the authors concentrated much less throughout on the idea that these data are exceptional and different from other animals, and instead followed their own analysis showing that they fit with current biological thought.

      6) Lines 621-632: "no significant difference between monogononts and bdelloids, or between desiccating and non-desiccating bdelloids" It is not clear to me here what statistical test is being carried out. All tests require phylogenetic control of course. I do agree that they are quite similar, perhaps this should be rewritten to reflect only that?

      7) Lines 705-706: The authors look at 3 gene families concerned with transposon control to examine copy number. In one of them they say "the RdRP domain in particular is significantly expanded". I am unclear what test of significance was carried out and where to find this analysis. Unlike the query concerning desiccating and non-desiccating above, I think this analysis is essential. The authors make a really big thing about the expansion of this gene family, including it in the abstract. If they wish to keep its prominence then they need to clearly show whether there is evidence that the size of this domain family is significantly expanded along the branch leading to bdelloids. I understand that this is illustrated in Figure 7 but this is not a test. This needs to be made much clearer in a quantitative rather than descriptive way. There is a need for broad taxonomic sampling, standardisation of assembly and annotation, and a phylogenetic design for this analysis. Otherwise it should be removed or at the least described more conservatively.

      8) Line 725: "Why do bdelloids possess such a marked expansion of gene silencing machinery?" There is no evidence presented that they do. There may be a hypothesis that they do it differently, rather than more, but that also needs testing. There is a lot of speculation in this paragraph, and I think removing this whole paragraph would improve the manuscript.

      9) If there is an expansion of this family what can we then conclude? The authors say in the abstract "bdelloids share a large and unusual expansion of genes involved in RNAi-mediated TE suppression. This suggests that enhanced cellular defence mechanisms might mitigate the deleterious effects of active TEs and compensate for the consequences of long-term asexuality" yet they also review that animal groups can utilize different gene families for transposon control. Is there evidence that clade 5 nematodes with PIWI have a quantitatively different transposon defence mechanism? No, they just use a different pathway to some other groups, and the default position surely has to be the same for bdelloids, there is no evidence presented that their defence is enhanced. I would strongly recommend that the authors reduce the strength of their claims about the significance of bdelloid transposon control gene families in this manuscript.

      10) I felt that the Conclusions (and Abstract) were too speculative and not fully supported by the existing data, though this can easily be addressed by a substantial re-write.

    3. Reviewer #1:

      This manuscript investigates TE diversity and variation across several clades of bdelloid rotifers, which are particularly interesting from an evolutionary perspective since they reproduce asexually. As stated by the authors, theory predicts that asexuality may lead to two opposite outcomes in terms of TE content. In the absence of sex, TEs may not easily jump into new genomic backgrounds where they are not repressed, leading to a decline in TE content. On the other hand, there is no recombination without sex, which removes the selective pressure against TEs due to their involvement in ectopic recombination. The authors show that despite these extreme expectations, asexual rotifers do not seem to display any of these patterns, although recent insertions seem rare and possibly brought through horizontal transfers. They do not observe any clear effect of adaptation to desiccation on TE content, which seems to exclude any effect of enhanced DNA repair mechanisms in controlling TEs. They observe fewer LINEs and more (recent) DNA transposons in bdelloid rotifers, which is consistent with the absence of sex (limiting LINE spread) and horizontal transfers (more frequent for DNA transposons). The expansion of RNAi gene silencing pathways suggests that asexuality comes at a cost, such as the proliferation of TEs, the accumulation of genetic load, and the control of horizontal gene transfers that might be deleterious. I think this supports the hypothesis of strong TE activity associated with the onset of asexuality, leading to a strong evolutionary response. This suggests that these clades survived the arms race with TEs. This work shows how intricate the coevolutionary dynamics between TEs and their hosts can be. The manuscript is well written, and the analyses are sound and detailed.

      I have a few general comments/questions that I detail below.

      Horizontal gene transfer: given the abundance of recent DNA transposons in some clades (class I), it may be worth discussing this possibility a bit more (at this stage it is mostly discussed in the Conclusion).

      If my understanding is correct, there is no assessment of TE or SNP heterozygosity for each individual. This might be interesting to explore. If TEs are deleterious and recessive, one might observe them more frequently in the heterozygous state. For intraspecific data, it may be interesting to look at how nucleotide diversity varies along the genome. Since variable recombination may be associated with diversity due to the effects of selection at linked sites, checking diversity along the genome may bring another layer of information about the frequency of sexual reproduction and its effects on TE diversity. I acknowledge that this would be a rather exploratory analysis, and am not asking the authors to carry it out, but I am curious to know how methods designed to estimate effective recombination rates perform on these data (e.g. LDHat, or more recently iSMC for a single diploid genome).

      Question related to demography and selection: would it be possible to obtain estimates of the effective population size for these clades? It would be interesting to have such an estimate to get an idea of the efficiency of purifying selection against TEs, and whether Muller's ratchet could explain the current abundance of TEs (in the case of moderate/small effective population sizes). I liked the idea of using ABC to test for consistency with asexuality, but am wondering to what extent it is biased by non-constant transposition rates, which cannot be properly modeled by the coalescent simulation. I would also assume these simulations do not take into account past changes in demography (I believe this option has not been included in the software yet). This is not necessarily a major issue for me, as long as these limitations are mentioned. When presenting the ABC framework in the Methods section, you may want to give more details about the part carried out with the abc package itself (e.g. which regression/rejection algorithms were used, etc.).

      A few other comments linked to specific paragraphs/sentences:

      • L419: why choose LTR-Rs in particular (abundance, and the fact they are not class I, I guess)?
      • L450: Would it be possible to obtain a time in generations from, e.g., an approximate mutation rate?
      • L455: Would it be possible to call heterozygote SNPs/elements?
      • L550-656: do you examine the most recent elements only? It may be interesting to check these correlations for elements of different ages, since selection may have had the time to act on the most ancient TEs.
      • L642: It might also be that longer elements display functional regulatory/promoter regions, and have a stronger impact on fitness.
      • L725: I liked this part, but wondered if a slightly more detailed discussion was possible. As the authors state, the expansion of RNAi pathways is consistent with a control mechanism against TEs. It is important to detail alternative explanations since there is no functional evidence in this model that this expansion actually controls TEs proliferation (unless I missed something). Given the rather unique properties of these organisms, it may be worth discussing.
    1. Reviewer #3:

      This paper compares two methods for assessing the effect of luminance on visual processing speed. One method represents conventional methodology, using a forced choice button push approach to assess the Pulfrich effect (whereby delayed processing of horizontal motion in one eye creates a percept of motion in depth). The other, more novel method uses a continuous (monocular) tracking task to assess relative delays in signal processing caused by luminance changes. The authors show that the two approaches yield remarkably close agreement (to within a few milliseconds) in their estimates of the relative processing delays caused by luminance differences across eyes. The authors go on to establish Pulfrich-like effects in a binocular tracking task.

      The paper is very clearly written, and the experiments and analyses have been meticulously conducted. The technical quality of the work is excellent. Scientifically, the paper does not really contribute any novel insights about the nature of perceptual processing. Rather, the paper represents more of a methodological manifesto advocating for the power of tracking-based psychophysics approaches. The experiments serve as a powerful illustration of how well tracking tasks can work in practice, validated by more conventional approaches. The paper makes a compelling case that tracking tasks are able to reproduce existing findings, and can do so significantly more efficiently (i.e. in much less time).

      The novelty of the approach is a bit overstated. On the first page, the authors suggest that continuous target tracking is "a new stimulus-response data collection technique". This is a bit much. People have been doing manual tracking tasks for decades, in many cases with quite sophisticated analysis and an emphasis on elucidating perceptual processing, in a similar spirit to this paper. Studies of eye movement and postural control have also employed related approaches. See, for example, the work of John Jeka, Tim Kiemel, Chris Miall, Otmar Bock, Noah Cowan - as well as the likes of Jex and McRuer in the 70s. Perhaps the authors were not aware of this substantial body of work. It seems appropriate to offer some acknowledgement and discussion of this prior work that has also recognized the power of such methods and employed them very effectively.

      A significant weakness of the paper is the small number of participants who performed the tasks: only five, two of whom were authors of the paper. While the within-participant comparisons are compelling, the broader agenda of advocating for wide adoption of these tracking tasks for scientific and potentially clinical applications will need more extensive validation on much broader populations. I do share the authors' optimism about the use of tracking tasks, but broad adoption for probing perceptual processing will require demonstrations that these approaches can be robust across much larger cohorts.

    2. Reviewer #2:

      This is a beautiful and clever paper, expanding the authors' tracking method for fast psychophysics to the domain of interocular delay. They find that it is possible to measure interocular delay quite accurately by comparing 1D tracking (in x) in each eye. The tracking technique is exciting because it potentially makes psychophysics much more accessible, and this paper demonstrates that it can be used to measure interocular timing differences.

      The authors also examine whether it's possible to estimate interocular delay in a single binocular experiment where people track in depth (x and z). The answer at this point is no - while some aspects of the depth tracking are beautifully accounted for in this way, other factors clearly contribute.

      I don't have any substantive concerns at all but I would be interested to see some quantification of the advantage of tracking over button-press psychophysics. It's clear from the error bars in Fig 6B that button-press results are considerably more precise, but presumably they take a lot longer. Could the authors quantify this for us? E.g. button-press psychophysics: 95% confidence interval is 1ms after 100 minutes of experimentation; tracking : 95% CI is 5ms after 10 minutes, or similar.

      Could you select a subset of the button-press psychophysics (fewer trials per data point) in order to say what precision could be achieved after the same time as the tracking? This would really help readers assess the costs & benefits of the two approaches.
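
      The time-matched comparison requested here could be approximated by resampling. The following is a minimal sketch under strong simplifying assumptions: it treats each experimental run as yielding one delay estimate (real button-press data would require refitting a psychometric function for each resample), the data are synthetic, and `bootstrap_ci_width` is a hypothetical helper name.

```python
import random
import statistics

def bootstrap_ci_width(samples, n, n_boot=2000, seed=0):
    """Width of the 95% percentile bootstrap interval of the mean
    when only n measurements (drawn with replacement) are used."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(samples, k=n)) for _ in range(n_boot)
    )
    lo = means[int(0.025 * n_boot)]
    hi = means[int(0.975 * n_boot) - 1]
    return hi - lo

# Synthetic per-run delay estimates in ms, for illustration only.
rng = random.Random(1)
delays_ms = [rng.gauss(10.0, 5.0) for _ in range(400)]

# Precision with the full data set versus a tenth of the experimental time.
width_full = bootstrap_ci_width(delays_ms, n=400)
width_tenth = bootstrap_ci_width(delays_ms, n=40)
print(f"400 runs: {width_full:.2f} ms wide, 40 runs: {width_tenth:.2f} ms wide")
```

      Plotting interval width against experiment duration for both methods would put the cost and benefit of each approach on a common axis.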

    3. Reviewer #1:

      This paper presents a very interesting set of techniques (monocular and binocular visuomotor tracking) to evaluate subtle differences in visual processing as a function of luminance.

      Despite some technical caveats I'll explain below, the paper fairly convincingly demonstrates that the monocular visuomotor tracking task can be used to identify millisecond-scale differences in visual processing lags, e.g. those caused by different levels of luminance. The basic experimental analysis and comparison to traditional approaches were fairly thorough and convincing.

      The binocular tracking component was less convincing, and the data were messy (which the authors acknowledge). Unfortunately, the very small sample size (N=5), lack of attention to trial order effects and learning of this new task, etc, reduce enthusiasm about this part of the paper.

      While this seems like a solid paper in most respects, it seems its primary focus is to demonstrate that a 'new' technique, visuomotor tracking (which is not new per se, but may be new in this field), gives results on delay estimation that are indistinguishable from traditional psychophysical techniques. This new approach requires fewer experiments and uses the richness of the full time series for analysis. The basic approach is near and dear to my heart in that it uses continuous-time system identification to really extract rich information.

      However, while I think the technique (which I quite like) is promising, I do not know what the new finding is. The analysis also only scratches the surface. I think this is a solid, field-specific paper that verifies a new method and, despite its technical contributions, may be suitable for a field-specific readership, with modest effort to address or at least acknowledge the technical limitations.

      Technical Limitations:

      1) The visuomotor behavior is not new; continuous tracking of moving stimuli is an age-old process. What is potentially new here is the use of this behavior for identifying subtle differences in delay. For a fairly old review with several papers cited in this area, see:

      E. Roth, S. Sponberg, and N. J. Cowan, "A Comparative Approach to Closed-Loop Computation," Curr Opin Neurobiol, vol. 25, pp. 54-62, 2014

      But there are many (much older) papers, dating back for example to McRuer, on visuomotor tracking tasks for identifying control systems in human visuomotor control, including careful analysis of visuomotor delay.

      For a recent paper (in a non-human system) for detecting differences in delay, see:

      Sponberg et al. (2015). Luminance-dependent visual processing enables moth flight in low light. Science, 12 Jun 2015: 1245-1248.

      2) There are no error bars. With 40 trials per condition, a simple SEM may be sufficient.

      3) The binocular data highlights a general problem which is that people need to learn this task, and if you are doing system identification during learning, you are doing system ID on a time varying system. This sounds like a confusing task and I agree with the authors that "higher level cognitive processes" are probably taking place but more importantly the learning system is not in steady state even after that many trials.

      4) Very importantly, unlike the traditional psychophysics trials (which are based on perception not motor output), this data must be analyzed as a closed-loop system. There are now two pieces of visual information: exogenous reference and self-movement feedback. It is extremely likely that these are processed differently, via feedforward and feedback controllers. See these papers ... These are very new, so I wouldn't have expected the authors to know about them, but they will still be useful for understanding this concept and improving your analyses:

      Yamagami, M., Howell, D., Roth, E., & Burden, S. A. (2019). Contributions of feedforward and feedback control in a manual trajectory-tracking task. IFAC-PapersOnLine, 51(34), 61-66.

      Yamagami, Momona, et al. "Effect of Handedness on Learned Controllers and Sensorimotor Noise During Trajectory-Tracking." bioRxiv (2020). https://www.biorxiv.org/content/10.1101/2020.08.01.232454v1

      That said, the highest-frequency responses - those picked up in the earliest moments of the impulse response function - are largely "open-loop", a fact that can be verified by noting that in the frequency domain, there is a very low gain (which is almost surely true with this data as it is in all other visuomotor tracking data across species that I am aware of, and that fundamentally must be true to ensure stable tracking!). So, the observations about short-time-scale (i.e., high-frequency) differences being attributed to differences in the visual processing are likely substantiated. But a more nuanced and accurate description of the theoretical basis for this is warranted.

      5) One second is not steady state in human visuomotor tasks. Tracking bandwidth for visuomotor behavior is in the ballpark of 0.5-2 Hz, which means there is still significant phase lag at 1 Hz. So the 11-second trials, with the first second thrown away, do not necessarily "erase" initial conditions. As one example, see a recent paper (again I wouldn't have expected you to know this, but it still shows 1 second is not long enough):

      Zimmet, A. M., Cao, D., Bastian, A. J., & Cowan, N. J. (2020). Cerebellar patients have intact feedback control that can be leveraged to improve reaching. eLife, 9, e53246.

      In Fig 4S2 in that paper you see that the phase lag at 1Hz is well over 90 degrees. Always wait 10 seconds to be certain, since at 0.1Hz, the phase lag is very low.

      6) Perhaps most fundamentally, lag and delay are not the same thing. Delay induces a very specific time shift, but it should be noted that in a closed-loop system one can NOT just shift the closed-loop cross-correlation function (equivalent to the impulse response in this case due to the noise input). If the delay were only on the measured target signal, and not on the feedback of self-motion, then indeed a simple time shift would be adequate; but there is a complex and subtle "compounding" of the feedback delay in closed-loop that leads to a distortion, not a simple shift, of the impulse response function. These papers show different ways to estimate delay differences in closed loop correctly:

      Sponberg et al. (2015). Luminance-dependent visual processing enables moth flight in low light. Science, 12 Jun 2015: 1245-1248.

      Zimmet, A. M., Cao, D., Bastian, A. J., & Cowan, N. J. (2020). Cerebellar patients have intact feedback control that can be leveraged to improve reaching. eLife, 9, e53246.

      I love the first paper's method, but it is not always applicable. I think it may be applicable in this case where one may be able to assume nothing changes but the delay.
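
      The distinction drawn in points 4 and 6 can be written out as a short derivation. This is a minimal sketch, not the model used by the authors or by the cited papers: it assumes a single linear controller $C(s)$ (a symbol introduced here for illustration) acting on the visually sensed tracking error, a visual delay $\tau$, target position $R(s)$ and hand or cursor position $Y(s)$.

```latex
% Case 1: delay on the target signal only (self-motion feedback undelayed).
% Sensed error: E(s) = e^{-s\tau} R(s) - Y(s), with Y(s) = C(s) E(s).
H_{\text{target}}(s) = \frac{Y(s)}{R(s)} = e^{-s\tau}\,\frac{C(s)}{1 + C(s)}
% The delay-free closed-loop response, shifted in time by \tau.

% Case 2: delay inside the loop (target and self-motion both seen through the delay).
% Sensed error: E(s) = e^{-s\tau}\,\bigl(R(s) - Y(s)\bigr), with Y(s) = C(s) E(s).
H_{\text{loop}}(s) = \frac{Y(s)}{R(s)} = \frac{C(s)\,e^{-s\tau}}{1 + C(s)\,e^{-s\tau}}
% \tau now also appears in the denominator, so changing \tau reshapes
% the impulse response instead of merely shifting it.

% High-frequency limit, where the loop gain is small, |C(j\omega)| \ll 1:
H_{\text{loop}}(j\omega) \approx C(j\omega)\,e^{-j\omega\tau}
% Approximately a pure delay of the open-loop response.
```

      Under these assumptions, a change in $\tau$ acts as a simple shift only in the early, high-frequency part of the impulse response, which is consistent with the argument made in point 4.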

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on October 14 2020, follows.

      Summary

      Injuries of the meniscus are associated with the future development of articular cartilage damage and ultimately osteoarthritis (OA). Prior work in this field has suggested that there are undifferentiated progenitor cells residing in the meniscus, and it has been hypothesized that these cells could be harnessed to aid in meniscal healing after injury. The authors provide evidence that Hedgehog activation promotes meniscal healing and identify a population of cells that are positive for Gli1, an effector protein in the Hedgehog signaling pathway. Using a combination of approaches, including lineage tracing, in vitro cell culture approaches and cell transplantation experiments, they show that this Gli1 population contains putative meniscal progenitors involved in meniscus development and healing. Overall, the reviewers found this paper to have high potential and were particularly enthusiastic about the therapeutic potential of Purmorphamine to promote meniscal healing. However, all reviewers felt that the conclusions (and title) of this paper overstated the utility of Gli1 as a marker of de facto meniscal progenitor cells. It is specifically requested that the title of this manuscript be reworded in light of the previous work showing that Gli1 can be found in a number of cell types. In several places, additional work was requested to support the conclusions made in this manuscript. Please see the detailed comments below.

      Essential Revisions

      1) The authors describe many of their results as "novel". Gli1 reporter mice have been used extensively in other tissues to non-specifically describe progenitor cells (bone marrow, periosteum, peri-vascular spaces and others). Further, the role of Gli1+ cells in enthesis and periodontal ligament (PDL) formation and healing has been previously explored. Gli proteins, which have a half-life of minutes-to-hours, may be a relatively unstable foundation for defining cellular identity. While the value of Gli1 as a general Hh reporter is clear, its utility as a putative stem cell marker (Title) does not seem adequately substantiated. The authors must temper their statements on novelty, exclusivity and utility of Gli1. The title of this paper also should be reworded.

      2) The Hedgehog (Hh) signaling manipulation conducted is rather straightforward and some overlapping studies have been performed in murine joints. Many of the experimental results could have been predicted. Other elements that contribute to the superficial nature of the studies are that Gli1 reporter activity is the only marker of Hh signaling examined (for example, Gli2/Gli3 are not), and that the abundance and cellular source of an Hh ligand during development or repair are never considered. Of note, reporters for Ihh and Shh are available.

      3) It is a stretch to say that Gli1;tdTom labels meniscus progenitor cells (Lines 268-271). There is relative enrichment of Sca1/CD90/CD200/PDGFRa in Gli1+ cells (Fig 2B), yet the vast majority of cells positive for those markers are Gli1-negative (Fig S5). Positive outcomes during in vitro differentiation and scratch assays may primarily result from increased Hh-mediated proliferation. This logic extends all the way through the in vivo experiments (which are quite promising, translationally).

      4) The spatial profile of Gli1-expressing cells in the meniscus is beautifully described, however an interpretation for the superficially restricted zonation of Gli1 reporter activity is not given. Do these superficial cells have more or less cartilage antigen expression? Is there something clearly physiologically different in the Gli1-rich superficial layers that could be determined? Line 401 cites an osteoblast paper to set up the relevance of Gli1+ cells in development of musculoskeletal tissues. However, the meniscus is much more similar to the enthesis and the PDL. The authors should therefore lead with that literature. The PDL literature in particular is not cited and should be added. Also missing are recent enthesis development/regeneration papers (PMID: 30504126, 26141957, and 28219952).

      5) The characterization of Gli1+ and Gli1- FAC sorted cells could be expanded on a bit.

      6) CFU-F images should be provided in addition to quantification. The differentiation studies in Fig 2E are non-quantitative and not convincing. Further, it is a little contradictory that under certain contexts Gli1+ cells form more cartilage (2E), but under other culture conditions they have reduced cartilage markers (2F). These points need to be clarified.

      7) In Fig 5, changes in distribution or survival of Gli1+/- cells may underlie the difference, but neither survival nor Gli1- cell distribution was assessed.

      8) Cartilage differentiation within the meniscus appears to be promoted with Gli1+ cell therapy and Purmorphamine. This could be assessed. Similarly, Hh signaling is known to induce osteogenesis. Osteoblastic antigens and/or presence of osteophytes should be assessed in purmorphamine-treated joints.

      9) One topic that is not covered in the paper is the role of Hh signaling in chondrocyte mineralization. This has been well studied in the growth plate (esp. related to PTHrP / IHH feedback loop) and may have relevance to the meniscus as well. The healing studies should consider this carefully, as ectopic mineralization is a possible negative side effect of Hh treatment.

      10) There are a number of places in the results where it is unclear if the authors are talking about Gli1+ cells or Gli1-lineage cells. This should be clarified throughout, perhaps with specific nomenclature that defines "Gli1+" as cells that are positive for Gli1 and "Gli1-lineage" as cells that are descendants of Gli1+ cells. Supplemental Figure 1A should be in the main document. Similar schematics in other figures are very useful for understanding the experiment.

      11) What are the temporal expression patterns of Gli1 and other Hh related genes during development and healing? It would be informative to see localized expression (e.g., in situ hybridization) or qPCR expression for healing tissues.

      12) The authors should clarify a number of things with meniscal cell isolation: (a) There are clearly differences in cell phenotype between superficial and deep areas and between attachment and midsection; was this considered for cell isolation? (b) I assume TAM injections were performed and then cells were isolated a few days later via FACS; please clarify details to show that Gli1+ (not Gli1-lineage) cells were characterized. (c) Fig 2: 3-month old mice were used, but again, whether Gli1+ or Gli1-lineage cells were examined is not indicated.

      13) The mechanisms by which Gli1+ and Hh treatments work are not explored. Some of the results are counter-intuitive. For example, why would Hh stimulate proliferation of Gli1+ cells if these are thought to be slow-turnover resident stem cells? Furthermore, why would Hh stimulation lead to proliferation rather than differentiation, in contrast to what is known in growth plate biology?

      14) The assessment of healing is qualitative/semi-quantitative (histomorphometry). The authors should perform a more rigorous assessment of healing to demonstrate the effectiveness of the Gli1+ cell and Hh therapies. This should include quantitative outcome(s) such as qPCR, mechanics, etc.

      15) The Gli1+ cell therapy histologic results are impressive. This is surprising because the delivery method was relatively simple. How much cell engraftment was there? Can the authors comment further (or add experiments to elucidate) on how long the cells were present and what their direct involvement was in healing?

      16) The authors show that native Gli1+ cells expand after injury. If this is the case, what is the rationale for adding more Gli1+ cells? Is the idea that the tissue has the capacity to heal but there aren't enough native Gli1+ cells to do the job?

      17) Figures and text jump between methodologies, making interpretation of results difficult. Fig 1 shows that superficial cells of the meniscus generally have active Hh signaling 24 hours prior to a variety of postnatal-to-adult timepoints (A, B, E, F), and that postnatal Hh signaling drives proliferation of early meniscus cells (C, D). It does not appear to report any long-term pulse/chase lineage tracing experiment as suggested in the text (Lines 223+). If this interpretation is incorrect, perhaps this could be addressed by increased clarity of figures and text (Methods, Results, Figure organization and captions).

    1. Reviewer #3:

      Whole genome sequence data from a geographically broad set of 86 Brachypodium distachyon samples are presented and combined with previous data. In addition, flowering time data collected under both field and controlled conditions are presented. The manuscript has many interesting aspects and ideas, but the main agenda is not clear. The authors mention selfing, seed dispersal, coalescence theory, microevolution, plasticity and frequency dependent selection in the abstract, but none of those topics is explored in depth in the manuscript. Multiple points, e.g. in the methods, need clarification. The manuscript would benefit from focusing on one or two aspects and making strong cases for them.

      Main comments:

      1) It is an overstatement to claim that this dataset covers the region from Iberia to Iraq, when already previous datasets covered Iberia and Iraq. Here French and Italian samples are added to previous data.

      2) The connection between the heterozygosity, structural variation and assembly issues due to paralogy should be more clearly presented. For example, in r. 130-134, it is not obvious what mapping BdTR7a against itself and identifying fewer heterozygous sites proves. In addition, the procedure for masking the fake heterozygosity should be more explicitly described. Inspection by IGB and defining thresholds by "trial and error" are not reproducible methods. Also, wouldn't one want to take into account the overall level of diversity in a given region instead of setting a threshold such as "ten or more SNPs along a distance of at least 300 bp"?

      3) Sympatry issue: The different lineages are described as sympatric, so it would be important to be really specific about the sampling locations. How close are the closest sympatric samples representing different lineages? Is that truly a sympatric setting? Further, in r. 176-181, how does plotting ancestry components on the map prove that there has not been gene flow between sympatric lineages? There seems to be shared ancestry, but it is a known issue that shared ancestry and admixture are not easy to separate. This aspect is central to the paper and would need more rigorous analysis with e.g. forward or coalescent simulations. The reasoning continues in rows 344-352, but is not really backed up by any analysis other than plotting ancestry components on the map. Or if it is, it should be more precisely expressed.

      4) R. 301-303: this statement sounds as if the authors are suggesting that selfing and dispersal are actively (or as a result of selection) interacting and maintaining the diversity. I did not see convincing evidence that the distribution of lineages is not just a combination of drift, selfing and random dispersal events. Maybe this is what the authors mean, but it should be more clearly stated.

    2. Reviewer #2:

      Generally, this paper is excellent. It explores many characteristics of Brachypodium distachyon population genetics and demography, many of which have been assumed or hypothesised by less data-rich papers over the last two decades. The authors do so with whole-genome sequencing of both a pre-existing global collection and some novel "gap-filling" sampling. The authors appear to have conducted all analyses using best practices, and the conclusions are largely not over-interpreted. I have only a few minor comments.

      L68: Ideally a more detailed summary of the work summarised in Supp File 1 would be brought into the main manuscript. The introduction in and of itself largely skims over the quite large amount that is already known or assumed about the population genomics and dynamics of B. distachyon, especially the ~4 other recent WGS popgen papers which cover adjacent/overlapping collections and topics to this manuscript.

      L165: with regard to the random sequence subsets for BPP: does this include sequence only from genes, or also from intergenic space? What about TE or other repeat loci? How do you ensure subset regions are single-copy orthologs in all accessions? I'm no expert on BPP, but I'm largely aware of BPP being used on exon capture data (i.e. genic sequence and flanking introns), admittedly at different evolutionary scales with a greater expectation that assumptions of orthology are not met.

      L338: the speculation about heterozygosity being induced "in the lab" is very interesting. If you have the data which allows investigating this, could you test if the maternal/paternal haplotypes in heterozygous regions match implausibly distant accessions, suggesting in-lab outcrossing?

      L364-365: wouldn't a decrease in diversity as one moves east imply an eastwards migration? I'm not sure if I'm misreading this sentence or there is a typo which switches the direction of the decrease. In any case perhaps reword this sentence for clarity.

      L403: typo: distance is week -> distance is weak

      L405: typo: descent -> descend. Also, a suggestion: did not descend from a single recent colonization (add "single")

      L410: Seed dispersal then ensures OR "would then ensure" (delete would, or ensures -> ensure).

      L421: While human-commensal seed dispersal likely explains most recent migration, surely the estimated branch times (fig 5) predate significant human movement? Or, phrased alternatively, were there other/additional historical agents of migration?

      L433: are pathogens not a potentially strong selective pressure on (nearly) all plants? How then do pathogens relate uniquely to the reproductive strategy/population structure and dynamics of B. distachyon?

      L435: Is a concluding paragraph required? I feel the discussion ends somewhat abruptly.

      L539: (optional suggestion): given the non-linearity in the IBD plots you present, it would be interesting to apply Generalised Dissimilarity Modelling to test for/examine IBD.

      L567: Please give light measurements in uE PAR (umol photons/m2/sec; 400-700 nm) in addition to/instead of klux.

    3. Reviewer #1:

      The manuscript describes analyses of genomic data to study the population structure and demographic history of Brachypodium distachyon, a selfing Mediterranean grass species. Major findings include the existence of large-scale population structure (3 lineages), discordance between geographical occurrence and genetic relatedness (clades within the lineages), and at shorter scales, signs of dispersal without interbreeding. These patterns are explained by a combination of near-complete selfing and seed dispersal. The methods are appropriate, the results are well reported, and the writing is good. As such, the paper provides interesting insights into the evolutionary history of B. distachyon, but due to its descriptive nature, I somewhat question the paper's value for a wider audience (i.e. people not directly working with B. distachyon). At points, the authors also engage in speculation (not supported by data) where I feel that simpler population genetic processes are ignored.

      In my opinion, the biggest weakness is the descriptive nature of the paper: it describes the genetic structure and demographic history of B. distachyon, but potential processes giving rise to the structure are only speculated upon. In particular, the authors invoke pre- and post-zygotic reproductive isolation (lines 384 - 387) and pathogen-driven frequency-dependent selection (lines 431 - 435) as potential causes for the observed structure. However, as the paper provides no evidence for such processes, it's not clear to me why they need to be invoked in the first place. Evidence for seed dispersal over relatively short spatial scales is shown (within populations in Italy, Fig 4), but to my reading the results suggest little dispersal/gene flow over long distances (only a few individuals with increased heterozygosity or signs of admixture). Therefore, I believe that the simplest explanation for the genetic structure is founder effects (perhaps human-induced, given the peculiar differences within the A and B lineages) combined with the near-complete selfing. This would explain the emergence of the genetic lineages and the lack of interbreeding. Furthermore, I would imagine that the genetic groups are locally adapted (e.g. there's extensive local adaptation among the selfing populations of A. thaliana), which would ensure that one lineage/accession doesn't take over when otherwise feasible (e.g. within the B lineage). If the authors argue otherwise, I would like to see more convincing evidence and/or discussion supporting the invoked processes.

      Below I list a few more specific comments:

      Lines 26 - 27: "[our study] identifies adaptive phenotypic plasticity and frequency-dependent selection as key themes to be addressed with this model system". While reading the abstract this sentence got me interested and I expected at least some analyses addressing these topics. However, the only place where they are mentioned again are two highly speculative sentences at the end of the discussion (lines 427 - 435). Although the authors write "themes to be addressed", I think that the complete lack of evidence for adaptive plasticity or pathogen-driven frequency-dependent selection in the current study makes this sentence too misleading to be left in the abstract.

      Lines 51 - 53: "For plants, genome-wide coalescence approaches have therefore been largely restricted to domesticated species and Arabidopsis thaliana". This might have been true some years ago, but not anymore. Just to highlight a few wild plant species (and studies) where demographic history has been studied using whole-genome data: A. lyrata (Mattila et al. 2017 MBE), A. arenosa (Monnahan et al. 2019 Nat Ecol Evol), Capsella genus (Douglas et al. 2015 PNAS, Koenig et al. 2019 eLife), Boechera stricta (Wang et al. 2019 Genome Biol), Populus genus (Wang et al. 2016 MBE, Hou & Li 2020 Front Plant Sci), Cochlearia genus (Bray et al. 2020 bioRxiv), and many more.

      Lines 383 - 387: "Flowering time differences are at best part of an explanation for genetic structure. In the scenario of subsequent lineage expansions we propose here, reproductive isolation might have evolved when the lineages were geographically isolated; and it might include other pre- and post-zygotic barriers in addition to flowering time, namely niche differentiation or genomic incompatibilities". These sentences come out of nowhere. First, I don't fully understand the distinction between genetic structure and lineage expansions. If the latter is a process beyond population structure (i.e. incipient speciation), the paper shows no evidence of that. In fact, as I outlined above, I would imagine that founder effects and near-complete selfing are enough to cause and maintain population differentiation without reproductive isolation.

      Lines 389 - 390: "Furthermore, differences observed in the greenhouse are most likely exaggerated through artificially short vernalization times. As our outdoors experiment shows, all accessions produced flowers within two weeks when they went through prolonged vernalization during winter". How representative are these vernalization times of the natural growing conditions? Large differences were observed in the greenhouse experiment, but the authors argue that these are not meaningful because the outdoor experiment showed little differences. However, a single experiment conducted in Zurich certainly does not capture environmental variation existing across the Mediterranean, so I'm not convinced that the role of flowering time can be ruled out so strongly based on these results. That said, the near-complete selfing suggests to me that flowering time is likely not a major factor underlying the genetic structure, and founder effects are a better explanation for it.

      Line 548: Only one species (B. stacei) was used to define ancestral alleles in the fastsimcoal2 analysis. There are multiple studies showing that the use of a single outgroup, especially based on parsimony, leads to unreliable inferences of ancestral and derived alleles (e.g. Keightley et al. 2016 Genetics, Keightley & Jackson 2018 Genetics). In particular, this leads to overestimation of high-frequency derived variants, distorting the shape of the unfolded SFS. As the observed SFS has more shared high-frequency variants than predicted by the demography model (Fig S5), I imagine that this is an issue. FSC2 also works with the folded SFS, so I wonder why the authors chose to use the unfolded SFS. Unless there is a compelling reason, I suggest either adding more outgroups or simply folding the SFS.
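
      To illustrate what folding involves, here is a minimal sketch. It assumes the unfolded SFS is a plain list of site counts indexed by derived-allele count (0..n); the function name and the example counts are hypothetical, and this is not fastsimcoal2's own input format. Folding sums each class i with its mirror class n - i, so that misassigned ancestral states no longer affect the spectrum.

```python
def fold_sfs(unfolded):
    """Fold an unfolded SFS (counts for derived-allele counts 0..n)
    into a minor-allele-count spectrum (classes 0..n//2)."""
    n = len(unfolded) - 1  # number of sampled chromosomes
    folded = []
    for i in range(n // 2 + 1):
        j = n - i  # mirror class: derived count n - i has minor count i
        # The middle class (i == j, only when n is even) is its own mirror.
        folded.append(unfolded[i] if i == j else unfolded[i] + unfolded[j])
    return folded


# Made-up counts for n = 6 chromosomes (hypothetical, for illustration only).
example = [50, 20, 10, 6, 4, 8, 2]
print(fold_sfs(example))  # [52, 28, 14, 6]
```

      The total number of sites is unchanged by folding; only the information about which allele is derived is discarded.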

    1. Reviewer #2:

      This is a fascinating study demonstrating the role of KIF21B in control of the T cell microtubule network required for T cell polarization during immunological synapse formation. The authors show that knockout of KIF21B results in longer microtubules, which leads to an inability to polarise the MT network, consistent with a mechanism in which dynein motors at the immunological synapse capture long MTs and center the MT aster at the synapse. They use the Jurkat cell line, which is a classical model for this step in immune synapse function and fully appropriate. They show that KIF21B-GFP can rescue the knockout phenotype and then use this as a way to follow KIF21B dynamics in the Jurkat cells. KIF21B works by binding to the + end and inducing pausing and catastrophe; thus, MTs are more numerous and shorter when it is present. They also rescue the defect in the KIF21B KOs with 0.5 nM vinblastine, which directly increases catastrophes, shortens the MTs and restores MT network polarization to the synapse. As a functional surrogate they investigate lysosome positioning at the synapse, which is one of the proposed functions of this cytoskeletal polarization. The use of expansion microscopy in this system is relatively new and clearly very powerful. The modelling component adds to the story and supports the sliding model proposed by Poenie and colleagues in 2006, but cannot rule out a component of end capture and shrinkage as proposed by Hammer and colleagues more recently.

    2. Reviewer #1:

      This is an excellent study of centrosome polarization in the process of establishing immunological synapse and the effect of kinesin-4 on this process. The authors use a variety of microscopy techniques and controlled perturbations of the cell to obtain beautiful images that clearly suggest that kinesin-4, by increasing frequency of pauses and subsequent MT catastrophes, limits MT length, which assists dynein pulling in polarizing the centrosome. They complement the experiments with modeling based on Cytosim; the model supports the conclusions from the data, and suggests some interesting ideas.

      I am not an expert in experimental techniques, though I understand what's been done, and in my limited opinion, the results are first-rate. The paper is well written and accurate. Modeling, which I know intimately, is done very well. I have just a few minor comments:

      1) It was not quite clear to me what the modeling says about the centrosome sometimes being in the apical position, and sometimes half-way between the apical and basal positions.

      2) I understand that 2d modeling cannot address this issue explicitly, but can the authors speculate about the apparent ring of MTs along the periphery of the synapse in the non-polarized case?

      3) My perhaps most significant comment: the model nicely integrates and explains the data, but is it predictive? A detailed model like that clearly can generate some nontrivial prediction that could be experimentally tested.

      4) "Interestingly, in our simulations, a small number of KIF21B motors was sufficient to prevent the overgrowth of the MT network." - this is a bit counter-intuitive: if the motor number is less than MT number, how would this work? Or, by a "small number of KIF21B motors" you mean still greater than ~ 100?

    1. Reviewer #2:

      This short study highlights the complexity of the octopaminergic system in insect behavior. This aspect of neuromodulation has received little attention in comparison with the role of dopamine in learning and motivation. The main question being addressed is whether, how and where octopamine modulates the generation of rhythmic behavior (peristalsis) upon noxious sensory stimulation (touch and pain). Using a combination of functional imaging and behavioral inspections, the authors explore the role of octopamine released by the VUM neurons on the escape crawling behavior of the Drosophila larva.

      The specific observations reported in the study are:

      1) Isolated larval CNS preparations that do not receive sensory input (deafferented preps) show spontaneous rhythmic wave patterns of neuronal activity in the octopaminergic VUM neuron cluster.

      2) In vivo preps that receive sensory input did not show spontaneous rhythmic patterns in the neural activity of the VUM neuron cluster.

      3) The VUM neurons show weaker responses in clusters that get sensory input from physically stimulated body segments and stronger responses in clusters that get input from segments further away from stimulated segments.

      4) In functional (GCaMP) imaging experiments, repeated gentle (rod) touch stimulations led to decreased VUM response intensities. Repeated harsh (brush) stimulations resulted in increasing VUM intensities. The authors correlate these physiological observations of the VUM activity with an increase in crawling speed upon repeated harsh stimulations, and a decrease in crawling speed upon repeated gentle touch stimulations.

      Based on observation (4), the authors propose that the differences in the behavior elicited by series of gentle touch and harsh stimulations are due to differences in adaptation of two classes of mechanosensory neurons. The class III da neurons responsible for detecting gentle touch would quickly adapt, whereas the class IV da neurons responsible for detecting harsh touch would integrate neural activity over time. The authors also conclude that (i) the octopaminergic system is strongly coupled to the CPG underlying peristalsis and (ii) "it is simultaneously activated by physical stimulation, rather intensity than sequential coded" (line 53). The first conclusion is supported by observations (1-2). While the involvement of octopamine in the modulation of a key CPG of the larva is certainly an interesting result, it represents the starting point of a mechanistic inspection. The problem is that the rest of the study falls short of testing or establishing any concrete mechanism.

      Although the topic of this study is exciting and its results are generally promising, the work is largely inconclusive. In addition, some conclusions are phrased in a way that is cryptic. For instance, I found it difficult to decipher the meaning of "the octopaminergic system is simultaneously activated by physical stimulation, rather intensity than sequential coded" (line 53). This conclusion appears to contradict the observation that repeated gentle touch stimulations produce a gradual decrease in the overall activity of VUM neurons. In the discussion section, the authors nicely refer to published findings in stick insects, honey bees and locusts. Compared to these systems, the advantage of Drosophila is that it offers the neuro-genetic tools to gain mechanistic insight into the molecular and cellular bases of neuromodulation.

      Questions and mechanisms that the authors might have wanted to address at a mechanistic level:

      Re. observations (1-2): What explains the observation that sensory inputs present in in-vivo preps abolish the spontaneous rhythmic pattern in the VUM activity? How does this relate to the VUM activity elicited by the tactile stimulations presented in Fig 3?

      It would be important to establish the importance of the VUM activity for peristalsis through loss-of-function experiments. Expression of Tdc2 could be restricted to the VNC by using tshirt-Gal4. These experiments would support the authors' proposal that octopamine is released to facilitate motor coordination (in lines 474-478).

      Technical concerns:

      -How can you rule out that the mini-stage featured in the in-vivo prep (Fig 2A) severs nerve fibers innervating the VNC? The plate placed under the CNS is very large. It is difficult to believe that this plate can be inserted while leaving all nerves (afferent and efferent neurons) intact on both sides. The integrity of the preparation should be controlled anatomically.

      -In Fig 2, a statistical analysis should be performed to establish a lack of correlation between the VUM activity and patterns of crawling. Trial 2.2 suggests the existence of some correlation. This correlative analysis would be important to back up the statement that "unstimulated larvae showed no consistent VUM neuron responses correlated to crawling movements" (lines 228-229; see also lines 235-236).

      -Lines 234-236: How can "movements" be assessed in an isolated deafferented prep?

      Re. observation (3): Do the mechanosensory inputs have an inhibitory effect on the VUM activity patterns? If so, how does the inhibition come about?

      How do you explain that harsh stimulation at the posterior end inhibits activity of both the most abdominal and thoracic segments? Does this imply that the t1 and a8 segments are somehow coupled?

      In line 400, the authors propose that "VUM neurons as one possible system to modulate either indirectly the endogenous input or directly the central pattern generating neurons as a response to external tactile stimulation of the body wall." How does this model and subsequent discussion fit with the observations of Fig 3? It would be helpful to test the validity of the two alternatives described in line 400.

      Technical concerns:

      -Line 292: The segments displaying highest activity upon tactile stimulations are said to be consistent across consecutive stimulations. Are they consistent across preparations as well? Were the data of Fig 3 generated on more than one prep?

      -Are the results of Fig 3 dependent on the strength of the tactile stimulations? More than one intensity should be tested to rule out intensity coding, as is stated in the abstract (lines 53 and 55).

      Re. observation (4): One of the observations reported in Fig 3 is that posterior harsh stimulations produce an overall increase in VUM activity whereas anterior harsh stimulations produce a decrease in activity. In Fig 4, larvae undergo harsh physical stimulations. However, it is unclear whether the harsh stimulations are applied to the posterior or anterior end of the larva. Based on the physiological results of Fig 3, wouldn't the authors expect that harsh stimulations of the head/neck region should lead to a deceleration of the larva, as was observed for gentle touch? Couldn't this prediction be tested experimentally? For the same reason, stating in line 512 that the same stimulation is used to activate the VUM neurons in Fig 3 and Fig 4 is misleading.

      The discussion about the adaptive nature of the class III and IV da neurons is compelling. However, it ought to be supported by more direct experimental evidence that could be collected in the Drosophila larva.

    2. Reviewer #1:

      In this work the authors measure the activity of the octopaminergic VUM neurons that arborize throughout the somatic body wall muscles in the Drosophila larva. They use three different larval preparations: isolated CNS (no sensory afferents), semi-intact (CNS exposed while maintaining sensory input), and intact. They find that isolated CNS has rhythmic waves of activity in the VUM neurons, but that semi-intact preparations do not show rhythmic VUM activity. They also show that "harsh" or "gentle" touch elicits different responses in VUM neurons.

      There are several interesting findings. The ability of VUM neurons to show rhythmic activity in the isolated CNS is a novel finding. It would be even more interesting to register these waves to that of the glutamatergic body wall motor neurons that drive locomotion. It is also interesting that touch applied to an anterior segment results in elevated VUM activity in a posterior segment, and conversely posterior touch leads to elevated VUM activity in an anterior segment, suggesting that sensory input dampens VUM activity.

      There are also issues that need to be addressed, which are listed below.

      1) The function of the VUM neurons in locomotion was not tested, e.g. by silencing or activating them. These experiments would greatly strengthen the paper.

      2) The three larval preparations are poorly described. (a) The fictive preparation is clearest but still should have a citation to Pulver 2015 at first use, as that paper provides a detailed description of the isolated CNS prep. (b) The semi-intact prep is not well described: is the CNS pulled from the body? How can this be done without ripping the nerves? How can the intactness of the nerves be validated? (c) The intact prep sounds simple, but how is VUM GCaMP3 fluorescence measured in an intact larva as shown in Figure 4? Is the "intact" prep the same as the "in vivo" prep? One name should be used throughout for clarity.

      3) The semi-intact prep showed Ca++ signals in only 5% of the preps. This makes me worried that the prep is unhealthy, and that the data from the 5% are not physiological.

      4) Experiment 1 shows four individuals, but population data for all larvae were not shown. Selecting only a subset of the analyzed larvae is not appropriate; data from all should be shown.

      5) Experiment 2 shows low resolution data (left) that is not interpretable. The data highlighted in the right panel is much better but again, only three examples are presented; no population data or statistics are shown.

      6) It is also unclear how many larvae were analyzed in Experiment 2. Line 163 says "...~5% of the in vivo preparations (n=27)..." but is that 1/27 or 27/540? In addition, are the different stimulation patterns done sequentially on the same larva, or independently on different larvae?

      7) The prep used for Experiment 3 is not mentioned. Not in the text, not in the figure legend.

      8) The prep for Experiment 4 appears to be the intact larva, but if so, how were GCaMP signals measured? How were movement artifacts handled?

      9) In Experiment 4, the term "crawling frequency" is not defined. Is it the frequency at which locomotion is initiated?

      10) How do the authors standardize harsh and gentle touches?

      11) It says "in very rare cases..." on line 246. Please give actual numbers.

      12) The figures are cited out of order (1, 3, 2, 4).

      13) Many references are missing in the first part of the Introduction, e.g. lines 64, 65, 73, 78, and 83.

    1. Reviewer #3:

      This manuscript investigated the interactions of the SARS-CoV-2 S protein and its RBD domain with the ACE2 protein of host cells, mainly using the HDX-MS approach. The results revealed dynamic information about the interactions and how ACE2 binding at the RBD domain primes enhanced proteolytic processing at the S1/S2 site of the S protein, and are potentially useful for relevant research, e.g., therapeutic development. This is a rather straightforward study, without further biological validation of the major conclusions. Detailed comparison and integration of the HDX-MS results with those from cryo-EM were not provided in the manuscript either. Some details of the manuscript also need further clarification.

      Major comments:

      1) Fig. S1: The SDS-PAGE showed around 90 kDa for the molecular weight of RBDisolated, which should be around 25 kDa based on its sequence (318-547). Please check and clarify.

      2) The oligomeric forms of the S protein and ACE2 and their binding stoichiometry are unclear, given statements such as "we measured dynamics of a trimer of this near-full length S protein..." (Page 4, line 87), "we performed HDXMS experiments of monomeric ACE2..." (Page 10, line 220-222), "...were pre-incubated at 37°C for 30 min in a molar ratio of 1:1 to achieve >90% binding..." (Page S2, line 65-66). Please confirm whether the expressed ACE2 is dimeric and the S protein is trimeric or not, and whether their binding stoichiometry is 1:1 or 2:3. Please also provide the concentration and calculation details for ensuring the >90% binding. If only one ACE2 in the ACE2 dimer and one S protein in the S protein trimer are involved in the binding, how sensitively and accurately could the HDX-MS results reflect the binding, since no HDX difference would be observed for the other ACE2 and the other 2 S proteins?

      3) Page 2, line 33-35: Other studies (e.g., Ref. 11) have shown that ACE2 binding can enhance S1/S2 cleavage by furin and S1/S2 cleavage site could be possible targets for small molecule inhibitor/antibody development. It would be helpful if further evidence could be provided to support that the stalk hinge regions could also be the targets for that.

    2. Reviewer #2:

      This is a super interesting exploration of the dynamic allosteric changes in the SARS-CoV-2 S protein upon engagement with the angiotensin-converting enzyme 2 (ACE2) receptor (and vice versa). It also represents a tour de force for HDX-MS since the S protein is almost 1200 amino acids long and ACE2 is also very large. The data are beautiful and the analysis is well done. The S protein consists of two subunits, S1 and S2, with S1 needing to be cleaved off so that S2 can become the fusion protein responsible for getting SARS-CoV-2 into the cell. Structures are available but they do not shed light on how the protease furin can access the cleavage site between S1 and S2 in order to begin the process of fusion. In this paper, the Anand group shows that when ACE2 binds to the S protein, a conformational change occurs near the S1/S2 cleavage site, exposing it and likely making it more susceptible to furin cleavage. Binding also dampens exchange in the stalk region. They call these regions "dynamic hotspots in the pre-fusion state".

      There are some things that need to be addressed:

      1) The manuscript appears to have been hastily written; it would benefit from a scientific editor making it more readable. For example, line 90 ff "Average deuterium exchange at these 91 reporter peptides was monitored for comparative deuterium exchange analysis of S protein, ACE2 receptor and S:ACE2 complex, along with a specific ACE2 complex with the isolated RBD." Presumably "reporter peptides" refers to the 321 peptides mentioned two sentences earlier... Why is the ACE2 complex with the isolated RBD qualified as "specific" while none of the others are? Then the article continues with more information about glycosylation…

      2) In Figure S1B, the concentrations should be reported in molar units, not ng/ml.

      3) Line 90 and Figure S2: A bit more should be said about the glycosylation sites. If only non-glycosylated peptides are observed in the pepsin digestion, the coverage map (Fig. S2) shows the expected lack of coverage for only a few sites (17, 122, 149, 165, 234, 282, 709, 1134), whereas many other sites are covered by peptides. Does this indicate that these sites are mostly not glycosylated?

      4) Fig. S3 legend seems to indicate that uptake of each peptide is plotted, whereas uptake per residue should be plotted because overlapping peptides make these data misleading. The peptides are shown in the other relative uptake graphs, but then there is more than one data point per peptide. Can the authors explain a bit more in the legend how they got the data in these figures?

      5) Fig. S4 seems to indicate that the cleavage site is already very dynamic. Can the authors explain this?

      6) Line 98-99 "... Mapping the relative deuterium exchange across all peptides onto this S protein model showed the greatest deuterium exchange at the stalk region" seems to contradict lines 105-106 "The deuterium exchange heat map showed the highest relative exchange in the S2 subunit (Fig. S3) and helical segments". Please clarify.

      7) Fig. 2 A and B look like the same molecular structure (nice that they are in the same orientation) but the domains are labeled differently. Yet a third domain listing is used in panel E. Comparing panels A and B, it's a little strange that some of the least dynamic spots in the Head/ECD are the highest exchanging. Do the authors want to comment on this?

      8) I thank the authors for the details provided in the Methods section regarding the HDX-MS data. If it wouldn't slow things down too much, it would be great if the RFU data were calculated after back exchange correction. Even an imperfect correction (such as a global correction for the back exchange during analysis) would make the data more meaningful.

      9) Fig. 3C and 3D look remarkably different considering that they both are reflecting the RBD:ACE2 interaction. Did the authors attempt to find a convergent set of peptides to do this analysis? Perhaps if the binding site were labeled it would help make the differences look less important (overall the top part of the molecule is blue and the bottom more-or-less has some red and if that's all we are supposed to get out of this figure then it is ok).

      10) Fig. 4. The authors state that the significance cut-off for difference in deuterium exchange is 0.3 D but I don't see where they explain how they derived this value.
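
      Points 2 and 8 above both come down to short calculations, sketched here under stated assumptions. The first function converts a mass concentration in ng/ml to nanomolar given a molecular weight; the second applies the commonly used back-exchange correction that scales observed uptake by a fully deuterated control. The function names, and every numerical value in the example calls, are hypothetical and are not taken from the manuscript.

```python
def ng_per_ml_to_nanomolar(conc_ng_per_ml, mol_weight_da):
    """Convert a mass concentration (ng/mL) to nanomolar.

    1 ng/mL equals 1e-6 g/L, so nM = (ng/mL) * 1e3 / MW, with MW in g/mol (Da).
    """
    return conc_ng_per_ml * 1e3 / mol_weight_da


def back_exchange_corrected(m_obs, m_undeut, m_full, n_exchangeable):
    """Deuterium uptake scaled by a fully deuterated control.

    m_obs, m_undeut and m_full are centroid masses of the same peptide
    (labelled sample, undeuterated control, fully deuterated control);
    n_exchangeable is the number of exchangeable backbone amides.
    """
    return (m_obs - m_undeut) / (m_full - m_undeut) * n_exchangeable


# Hypothetical values, for illustration only.
print(ng_per_ml_to_nanomolar(100.0, 100_000.0))            # 100 ng/mL of a 100 kDa protein: prints 1.0
print(back_exchange_corrected(1003.0, 1000.0, 1006.0, 8))  # prints 4.0
```

      A global correction, as suggested in point 8, would apply one average recovery factor to all peptides instead of a per-peptide fully deuterated control.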

    3. Reviewer #1:

      The authors have used hydrogen-deuterium exchange mass spectrometry and molecular dynamics simulations to study the interaction between the SARS-CoV-2 spike protein and the ACE2 protein. The results suggest that the protein-protein interaction induces extremely long-range allosteric effects on the spike protein, triggering the proteolysis of the spike protein. The results of this work have implications for the development of small molecule inhibitors.

      In general, the manuscript is written extremely well. The work is timely, and the results will be of interest to many. The major conclusions of the work are generally supported by the results. However, there are several key - generally minor - details, enumerated below, the authors should provide in order to strengthen the manuscript and validity of the results.

      1) The authors should provide more technical details of the molecular dynamics simulations in the supplementary materials. Could the authors provide more details on the equilibration protocol? Was there any analysis done or metric used to assess whether the system was properly equilibrated? How often were snapshots of the trajectory saved for analysis? How many Na+ and Cl- ions were added to achieve 0.15 M of salt concentration? Also, how many water molecules were added? These details are relevant to the non-casual readers.

      2) The authors should probably include the techniques used to study the systems in the abstract section of the manuscript.

      3) The authors should probably also state in the last paragraph of the introduction that they performed molecular dynamics simulations. This is not apparent until toward the end of the first paragraph of the results and discussion sections.

      4) Page 7; line 147: Figure 4 is introduced before Figure 3. The authors should switch the order or modify accordingly.

      5) Figure S1: Could the authors elaborate on Figure S1B in the figure legend? Is (i) measuring the binding of ACE2 to the S protein? Is (ii) measuring the binding of RBD to the ACE2 protein? The distinction between (i) and (ii) is not made in the figure legend.

      In summary, the work is interesting and timely, and the manuscript will be of interest to many in the field. The authors should address the aforementioned points.
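
      As an illustration of the bookkeeping requested in point 1 of this review, the number of Na+/Cl- pairs needed to reach a target salt concentration can be estimated from the simulation box volume. This is a minimal sketch, not the authors' protocol: the function name and the 100 Å cubic box are hypothetical, the full box volume is used rather than the solvent-accessible volume, and any counter-ions added to neutralize the protein are ignored.

```python
AVOGADRO = 6.02214076e23           # particles per mole
LITRES_PER_CUBIC_ANGSTROM = 1e-27  # 1 Å^3 = 1e-30 m^3 = 1e-27 L


def ion_pairs_for_concentration(molarity, box_x, box_y, box_z):
    """Estimate the number of salt ion pairs for a rectangular box.

    molarity is in mol/L; box edge lengths are in angstroms.
    """
    volume_litres = box_x * box_y * box_z * LITRES_PER_CUBIC_ANGSTROM
    return round(molarity * AVOGADRO * volume_litres)


# Hypothetical 100 Å cubic box at 0.15 M NaCl.
print(ion_pairs_for_concentration(0.15, 100.0, 100.0, 100.0))  # prints 90
```

      Reporting the final ion and water counts alongside the box dimensions lets a reader check the stated 0.15 M directly.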

    1. Reviewer #3:

      Thank you for inviting me to review this manuscript by Guell and colleagues, in which the authors conduct an interesting study into the hemispheric symmetry (or lack thereof) between low-dimensional resting state functional connectivity gradients in key structures within the subcortex. In a large cohort of individuals, the authors demonstrate interesting asymmetries in the thalamus and pallidum, along with the cerebellum and striatum. They then survey a broad anatomical literature in search of a parsimonious explanation for their observed results.

      Overall, I found the manuscript to be interesting, well-documented and well-reasoned. I have only minor comments that I hope will help the manuscript.

      • My only slightly major concern is in the section titled 'Projection of subcortical functional gradients to cerebral cortex'. Specifically, I'm worried that multiplying each subcortical voxel by the absolute value of its eigenvalue may remove the effects of interest. For instance, in the raw eigenvalue, there is an interpretable (and important) difference between loadings of +1 and -1; however, these two scores would be equivalent when the absolute value is taken. The authors mention that "Absolute functional gradient values were used in order to specifically observe the relationship between subcortical regions with strong IHFaS as indexed by asymmetric functional gradients and cerebral cortical connectivity", but I don't see how this follows.

      • Is it perhaps surprising that there is strong IHFaS between first order thalamic regions but not between the cortical regions providing modulatory inputs to those regions?

      • Do the authors predict that these patterns will be similar for task-based data analyses?

      • The thalamic patterns appear to overlap with Ted Jones' concept of 'core' and 'matrix' thalamic nuclei (doi: 10.1016/s0166-2236(00)01922-6). Although these terms loosely overlap with 'first-order' and 'higher-order' thalamus, they are defined by the mode of thalamic projection to the cerebral cortex (targeted, granular vs. diffuse, supragranular, respectively), rather than the projection from cortex (as in the case of first- and higher-order).

      • I couldn't find any information about whether the resting state fMRI data were filtered prior to the calculation of voxelwise cosine similarity. It could be interesting to determine whether the observed patterns are associated with broad-band patterns or more specific frequencies.

      • The large sample size is a strength of the approach, but I did not see this leveraged anywhere in the manuscript. For instance, was there strong split-half reliability, or were some patterns more variable across subjects?
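
      The concern in the first bullet, that taking absolute gradient values makes loadings of +1 and -1 indistinguishable, can be shown with a toy weighted projection. This is a sketch under simplifying assumptions: two subcortical voxels, two cortical targets, made-up connectivity values, and a hypothetical function name. It is not the authors' pipeline.

```python
def project_to_cortex(gradient, fc_maps, use_abs):
    """Weighted sum of per-voxel cortical connectivity maps.

    gradient: one gradient value per subcortical voxel.
    fc_maps:  one list of cortical connectivity values per subcortical voxel.
    use_abs:  weight by |gradient| (True) or by the signed gradient (False).
    """
    weights = [abs(g) if use_abs else g for g in gradient]
    n_targets = len(fc_maps[0])
    return [sum(w * fc[j] for w, fc in zip(weights, fc_maps))
            for j in range(n_targets)]


# Two voxels at opposite gradient extremes with opposite cortical profiles.
gradient = [1.0, -1.0]
fc_maps = [[0.8, 0.1],
           [0.1, 0.8]]

signed = project_to_cortex(gradient, fc_maps, use_abs=False)
absolute = project_to_cortex(gradient, fc_maps, use_abs=True)
print(signed)    # roughly [0.7, -0.7]: the two poles stay distinguishable
print(absolute)  # roughly [0.9, 0.9]: the two poles are pooled together
```

      In this toy case the signed projection preserves the contrast between the two poles, whereas the absolute-value projection returns the same value for both cortical targets.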

    2. Reviewer #2:

      General assessment:

      Using rsfMRI data, the authors showed that unlike the cortex, cerebellum, and caudate, the thalamus and the pallidum of the lenticular nucleus have strongly asymmetric principal functional gradients across the two hemispheres. Using a laterality metric and confirmed with seed-based rsfMRI, they showed that these thalamic and lenticular asymmetries correspond with hemispheric laterality. They report that the cerebellum and caudate have asymmetric secondary and tertiary gradients. Finally, by summing cortical connectivity maps weighted by the functional gradients, the authors show that the asymmetric functional gradients of the cerebellum and caudate are associated with the default network, while those of the thalamus and lenticular nucleus are associated with the ventral attention network. The Discussion argues for an anatomy-informed model explaining these results.

      These observations and the posited model are very interesting, but I have a serious concern with grouping the putamen with the pallidum as the lenticular nucleus, and drawing conclusions based on this. Also, more work needs to be done to rule out technical artifacts and improve the writing.

      List of substantive concerns:

      1) Why did you group the putamen and globus pallidus together into the lenticular nucleus? The globus pallidus is equally connected to the caudate as to the putamen. There's nothing special functionally between the putamen and pallidum; they were called lenticular nuclei by early anatomists based on their lens-like shape. In fact, I would have grouped the caudate and putamen together as the striatum, and considered the pallidum separately. Grouping the putamen and pallidum together creates a false sense of variability in the lenticular nucleus (Table 1). Based on that, the inferences resting on observations with the lenticular nucleus do not hold in the Discussion. The manuscript should be re-written to address the results of the pallidum specifically, rather than lenticular nucleus. Critically, how would this change the authors' interpretations and dichotomous model in the Discussion?

      2) Another problem with the pallidum is that this is adjacent to the thalamus and may suffer from signal bleeding. Work needs to be done, perhaps by regressing out each signal from the other, to show that the pallidal results are not due to signal bleeding from the thalamus.

      3) As the authors state, a known asymmetry in the brain is the lateralization of certain heteromodal cortical networks, yet these "positive controls" appear highly symmetric (Supp Fig. 1A), at least in comparison to the asymmetry of the thalamus and pallidum. Is this surprising to the authors?

      4) My first-order interpretation of the results (that there's greater functional asymmetry/lateralization for the pallidum and thalamus than other brain structures) would be that these structures simply have preferentially ipsilateral connections. The pallidum in particular is a middle link in cortico-basal ganglia-thalamic circuits; it could simply have asymmetry because its connections are mostly with the ipsi basal ganglia and thalamus. A simpler test would be to see whether these results correspond to anatomical connectivity strength. What are the ipsi versus contra connections of these thalamic nuclei to cortical regions?

      5) What does it mean that the asymmetric (sensorimotor?) parts of thalamus are associated with the ventral attention cortical network?

      6) In the Discussion, my first order prediction of the rsfMRI reflections of indirect/direct and driver/modulatory connections would be that direct or driver connections lead to a stronger "influence" of the cortex's properties to the downstream subcortical region. Thus, regions receiving direct or driver connections would be symmetric or asymmetric in a manner consistent with the cortical regions they are connected to. Wouldn't you expect the "influence" of the cortex to be stronger for the regions receiving driver versus modulatory or direct versus indirect inputs?

      7) What other connectional differences explaining these results did you consider and rule out (and for what reason), in addition to cortical inputs?

      8) The dichotomous model interpretation is very interesting, but as there is no direct evidence presented by this paper, I would state these interpretations more speculatively in the Abstract and throughout the paper.

    3. Reviewer #1:

      This study investigates asymmetry in functional gradients in human subcortical structures (thalamus, striatum and cerebellum). The authors found that the 1st principal gradient of thalamus and pallidum are asymmetric, while that's not the case for caudate, putamen and the cerebellum. In the case of the caudate and cerebellum, their 2nd and 3rd gradients were asymmetric. Further analyses suggest that these differences arise based on connectivity between subcortical structures and the cerebral cortex. In the case of the thalamus and lenticular nuclei, asymmetry is stronger in regions with no direct or driver cerebral cortical afferent connections. In the case of the cerebellum and caudate, asymmetry is stronger in regions linked to cortical regions with higher inter-hemispheric asymmetry. The writing style of this paper is quite different from the usual papers. I actually quite enjoy this conversational/didactic style. Please see my major and minor concerns below.

      1) The computation of the laterality index is not clear to me. In the methods section, it's defined as "(left_score - right_score) / (left_score + right_score), where left_score and right_score correspond to the sum of all functional connectivity values for each left and right structure (for example, in the case of thalamus, functional connectivity values in left and right thalamus)". This sounded like they were summing across all voxels within a structure, for example across all thalamic voxels. But in Figure 2, I assume each dot represents a thalamic voxel. So what are the authors summing over? Indeed, in the results section, the authors said "We then computed a laterality index that quantified the degree of asymmetry in each functional connectivity map from each seed (see methods), and plotted laterality index scores for each voxel in thalami and lenticular nuclei against their corresponding functional gradient value." So for each thalamic voxel, did the authors compute the correlation of the voxel's time course with all brain voxels, or something else? This was also not clear. After obtaining the correlation map for a thalamic voxel, how do the authors then compress that map into either "left_score" or "right_score"? That was not really explained. Furthermore, in order to compute the laterality index, the authors need to define a homologous thalamic voxel in the other hemisphere. How was this done? Did the authors use a symmetric MNI template? Which one? This was also not explained.

      2) "Projection of subcortical functional gradients to cerebral cortex" does not quite make sense to me. According to the authors, basically FC maps of voxels are weighted by the absolute gradient values of the voxels. Essentially this means that voxels with extreme gradient values are weighted more. In the case of the thalamus, lenticular nuclei and caudate, voxels with extreme gradient values are indeed voxels with high inter-hemispheric functional asymmetry (IHFaS), so this is ok. However, in the case of the cerebellum, motor regions in lobules I-IV have extreme gradient values as well. As such, these regions would also be weighted more. Thus the resulting projected subcortical gradients might not simply reflect gradient asymmetry. Perhaps it would make more sense to compute a laterality index based on the gradient scores (i.e., left score and right scores are gradient values), and then use the absolute value of the laterality index as the weight rather than the absolute gradient values.

      3) The analysis level in Figure 5 is too coarse. By performing a weighted average of thalamic voxels' FC maps (or caudate or lenticular or cerebellum), the authors are ignoring variation in functional connectivity patterns across thalamic (or cerebellar or caudate or lenticular) voxels. A more direct test of the authors' hypothesis should be as follows. According to the authors' hypothesis, cerebellar/caudate voxels that exhibited greater gradient asymmetry should be more strongly correlated with cortical vertices with strong absolute laterality index. Then there should be strong positive correlations between the absolute laterality index of cerebellar/caudate voxels and the absolute laterality index of the cortical locations most strongly correlated with the corresponding cerebellar/caudate voxels. On the other hand, there should be weak correlations for thalamic and lenticular nuclei.

      4) The authors suggest that no p value is necessary with a 1000-subject dataset. That might be true for certain things like functional connectivity maps, but a number of analyses, such as Figures 2, 4 and 5 do require supportive inferential statistics.

      5) "IHFaS is more prominent in first order nuclei (compared to higher-order nuclei)" is not really quantified. The authors should specify in Figure S2, which nuclei are first order nuclei and which are non-first order nuclei. Perhaps the labels on the x-axis could be colored differently for first order and non-first order nuclei.
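
      For reference, the laterality index quoted in point 1 can be written out as below. This is a minimal sketch of the quoted formula only: which connectivity values enter left_score and right_score is exactly what point 1 asks the authors to specify, so the inputs here are hypothetical placeholders.

```python
def laterality_index(left_values, right_values):
    """(left_score - right_score) / (left_score + right_score).

    left_values and right_values are the functional connectivity values
    summed into left_score and right_score respectively.
    """
    left_score = sum(left_values)
    right_score = sum(right_values)
    total = left_score + right_score
    if total == 0:
        raise ValueError("laterality index is undefined when the scores sum to zero")
    return (left_score - right_score) / total


# Hypothetical connectivity values for one seed voxel.
print(laterality_index([1.0, 2.0], [0.5, 0.5]))  # prints 0.5 (left-lateralized)
print(laterality_index([1.0, 1.0], [1.0, 1.0]))  # prints 0.0 (symmetric)
```

      Note that the index is bounded between -1 and 1 only when all connectivity values are non-negative; with negative correlations the denominator can approach zero, so the handling of negative values should be stated as well. The alternative proposed in point 2 would feed left and right gradient values into the same formula and use the absolute value of the result as the weight.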

    1. Reviewer #3:

      This work started from the notion that Alzheimer's disease (AD) pathology spreads through connected regions, and investigated whether the level of AD pathology in specific regions relates to the integrity of the fiber bundles connecting them, in 126 elderly individuals with normal cognition at risk of AD. Specifically, AD pathology was quantified by beta-amyloid (Aβ) and tau protein levels from positron emission tomography (PET). Three fiber bundles, the cingulum, the fornix, and the uncinate fasciculus, were a priori selected, and six measures were derived from free-water corrected diffusion tensor imaging. The authors hypothesized that Aβ levels would relate to the integrity of (i) the (anterior) cingulum, and (ii) the uncinate, and (iii) that tau levels would relate to fornix integrity. The direction of the relations was not specified. The authors find support for particularly the second hypothesis (Aβ levels and the uncinate), but also for the first (Aβ levels and anterior cingulum). They also find relations between tau levels and uncinate integrity, and Aβ levels and right fornix integrity. The relations were consistently in a direction the authors refer to as "unanticipated", that is, more restricted diffusion with the presence of pathology. The authors conclude that the result "suggests more restricted diffusion in bundles vulnerable to preclinical AD pathology".

      The work addresses important topics (early detection and spreading of AD pathology) of great interest to people from several disciplines. The sample is interesting with both regional Aβ and tau measurements, and the imaging processing methods used are advanced. The paper is clearly written and nicely illustrated.

      My main concern relates to the main conclusion of "more restricted diffusion in bundles vulnerable to preclinical AD pathology". Although this result is discussed as "unanticipated", I think the centrality of this point makes more scrutiny warranted.

      1) Direction of relationship. The authors state that "[..]the directionality of the observed pattern of association opposes the classical pattern of degeneration. The classical degeneration pattern accompanying disease progression is characterized by lower anisotropy and higher diffusivity, representing loss of coherence in the white matter microstructure with AD progression", and further: "[..] more restricted diffusion with the presence of pathology was unanticipated [..]".

      Indeed, their results were unanticipated based on the literature, as highlighted by the authors. As this is the central point of the work, I believe it is important to do additional analyses to try to shed light on the results and the suggestion of a biphasic relation. I understand that the authors have done a lot of work already, but here are some fairly simple and not too time-consuming suggestions which might be informative (please feel free to ignore these suggestions and instead follow other paths to show the reader more results to evaluate the unexpected direction of the relations):

      (i) A simple start could be to assess the relationship with age, how strong this relationship is, and what the residuals look like when regressing out age (and bundle volume).

      (ii) As the authors mention, a reduction in crossing fibers might lead to "more restricted diffusion" but be a sign of deterioration. Analyses undertaken to assess this point would be valuable. For instance, one could test if the relations are similar in regions of the bundles where there are few crossing fibers and in regions with more crossing fibers.

      (iii) The authors state that "[...] we estimated that 20% of the participants would be considered Aβ-positive". Were a majority of these also tau-positive? If so (or if participants exist in the larger PREVENT-AD sample who were not "cognitively normal at the time they underwent diffusion-weighted MRI"), one could create a group with high AD pathology: are the relations between Aβ/tau and diffusivity similar in this group of high Aβ and tau compared to a similar-sized and, if possible, age-matched group with (very) low Aβ and tau levels?

      2) Hypotheses. As mentioned, the authors state in the Discussion that the directionality of the observed pattern of association was unanticipated. I was therefore somewhat surprised that the directionality of the hypothesized relations was not included in the hypotheses presented in the Introduction. I think it would increase the readability of the Results section if this point was made explicit earlier in the text, and the non-expected direction mentioned in the Results.

      3) Number of tests. The authors state that "Associations with a p-value < 0.05 were considered significant, but we also report associations that would survive false-discovery rate (FDR) correction for each bundle with q-value of 0.05, accounting for 6 tests (i.e. the number of diffusion measures assessed per bundle)." I find this somewhat problematic (at least without further justification). First, I think the authors should only consider corrected p-values significant. Second, these 6 measures are tested per hemisphere, and across at least 3 fiber bundles (for cingulum, it seems the authors have done separate analyses for the anterior and posterior part), making the total number of tests higher. Correcting for the number of diffusion measures per bundle might be too strict, but I think the total number to correct for should be higher than 6. Whether any correction has been applied is also difficult to grasp while reading the Results section, as it seems like p-values are not FDR-corrected in Tables 2 and 3 (mentioned only in Table 4). I think the total number of bundles assessed, and the correction, should be made explicit when introducing Figure 2 and Table 2.
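
      To make point 3 concrete, the sketch below implements the Benjamini-Hochberg procedure, which is the usual method behind "FDR correction" (whether the authors used exactly this variant is an assumption). The adjusted value of each test depends on the size of the family it is corrected within, which is why the choice between 6 tests and a larger total matters. The p-values in the example are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented p-values for a family of four tests.
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p = {p:.3f} -> adjusted = {q:.3f}")
```

      Running the same function on the full set of tests (measures, hemispheres and bundles together) rather than on 6 tests per bundle would show directly which associations survive the broader correction.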

    2. Reviewer #2:

      Here the authors show interesting, seemingly counter-intuitive, associations between key Alzheimer's pathological hallmarks (Aβ and tau) and free-water corrected diffusion measures in a large cohort of cognitively healthy older adults with family history of Alzheimer's. They show direct associations between amyloid (and tau in some cases) and increased FA and decreased MD/RD in key white matter bundle cortical endpoints. Whilst for some tracts this association is only just 'statistically significant' at p<0.05, results for the uncinate fasciculus are very convincing. Overall, this paper is an interesting, well-written and potentially highly impactful piece of work with robust methodology, in which the authors should take pride.

      I have no major concerns to raise regarding this paper. However, I will mention for the authors' interest that the principle of a biphasic change in quantitative MRI measures (an initial decrease due to water mobility restriction, followed by a later increase in the symptomatic phase) is one discussed in a recently published paper (rdcu.be/b62Yp). A linear change across the course of the disease (which the authors here say would be impossible to detect in slowly progressing individuals) may be brought about by studying the changing and increasing distribution width, rather than averaging across a region of interest. I am not suggesting the authors change their analyses to reflect this; it is merely food for thought, or worth a mention in the paper as an avenue of future research.

    3. Reviewer #1:

      The manuscript reports the results of a study examining the linear correlation between white matter tracts and AD-related pathology in the grey matter regions connected by the white matter tracts. The integrity of the tracts was measured using FA, MD, AD, RD (corrected for free water), free water index (FW) and apparent fiber density (AFD). The white matter tracts examined were the cingulum (main and posterior branch), uncinate fasciculus, and fornix. The population studied was older healthy subjects at risk (based on family history) for developing AD. The AD-related pathology comprised tau and amyloid measured using PET. The study was very well done and it addresses key questions in regards to the preclinical phase of AD.

      Questions:

      a) It would be very helpful to the reader to understand the distribution of the global Aβ SUVR and temporal tau SUVR. Given that studies dichotomise study participants based on high and low deposition, it would help readers better understand the context of the results. The mean and range given in Table 1 are not enough.

      b) Related to the previous question, I would suggest that the same graphs be made for the ROIs at the ends of the tracts. Again, it would help a reader understand the context of the study.

      c) I am surprised that APOE e4 allele was not included as a covariate in the statistical model. Why not? Given that APOE increases risk of developing AD, it would seem to be a relevant parameter. Amyloid positivity has been shown to be associated with age, sex and APOE e4 status.

      d) The negative results of the posterior cingulate and yet statistically significant results for the uncinate fasciculus are an interesting contrast. Both tracts connect regions with presumably high Aβ and high tau deposition. Have there been studies that have compared the amyloid deposition in posterior cingulate cortex and anterior cingulate/anterior frontal regions? It might be supportive of the idea that posterior cingulate is further along the disease progression compared to the anterior frontal regions. Having the data plots as described in (a) and (b) could help in supporting the points made in the discussion.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on November 19 2020, follows.

      Summary

      Kang et al. eloquently describe the active suction organ that the larvae of aquatic insects of the dipteran family Blephariceridae use to adhere robustly to complex surfaces. While the morphology of the mechanism has been reported previously, its biomechanical adhesion function and performance across different substrates are unknown. The authors present three advances. First, they quantify the adhesion performance on rough, micro-rough, and smooth surfaces using an effective centrifugal setup. The ultimate adhesion tests show the larvae can resist shear forces up to 1100 times their body weight on smooth surfaces. Second, they visualize the suction function in vivo using interference reflection microscopy. This reveals that small hair-like microtrichia can enter gaps in the surface. Because the microtrichia are angled inward, the authors surmise that the microtrichia's angle and small size help increase adhesion contact area on rough surfaces. Finally, they compare the adhesion performance of the Blephariceridae larvae to other species, showing it is 3-10 times greater than found in stick insects. The finding that the larvae have such high attachment forces is impressive and the study offers new biological insights that may inspire engineers to invent new underwater suction mechanisms.

      Essential Revisions

      Although the reviewers were generally appreciative of the well-written manuscript and the remarkable performance reported for the active suction mechanism, the consensus is that the mechanism itself is not described in sufficient detail for the reader to fully appreciate the advance. Hence the main critiques focus on helping the authors to further flesh out the mechanism and report it in more mechanistic detail, as other adhesion mechanisms are described functionally across the biomechanical literature. Further, the presentation of the figures does not meet the graphic design clarity standards essential to inform eLife's broad readership. To provide guidance, we list the following essential revisions.

      1) The introduction states that the suction organs have been observed; however, the manuscript does not communicate the observed mechanism as one would expect in the biomechanical adhesion literature. Instead it reports the measurements of the force and a suggestion that the microtrichia may be involved. We were hoping to find a quantitative report of the mechanism integrating the force data and microscopy images into biomechanical diagrams and, to the extent possible, equations that capture and communicate the mechanism as quantitatively as possible. Whereas we are not requesting further measurements, because the performance of the mechanism is well documented, we do ask for a more in-depth biomechanical analysis that spells out the mechanism in a way that allows comparison with the other classic mechanisms the authors compare to. If this requires some additional measurements to inform the model, those efforts would be well worth it. In case the authors can use a mechanistic analysis lead, we recommend reviewing a couple of papers. E.g. Jeffries, Lindsie, and David Lentink. "Design Principles and Function of Mechanical Fasteners in Nature and Technology." Applied Mechanics Reviews 72.5 (2020). Or any other review or research paper that the authors find more useful.

      2) Please clarify if the experiments are done in air or underwater. We consider underwater as most appropriate; at minimum the surface should be wetted. The authors mention that the Stefan adhesion forces underwater would be higher than in air, but it's not clear if that statement pertains to the experiment. Please provide a full clarification, and in case the experiments were performed in air we would prefer to see them performed in water. If this is not possible, the manuscript should be entirely transparent on this matter so the reader can evaluate the precise merit of this study and its limitations fully.

      3) We found the images confusing at times. To resolve this we would like to see clear schematics (avatars) that ground the reader's perspective in all figures.

      4) Considering eLife's broad multidisciplinary readership and the appeal of this study for bioinspired designers and engineers, Fig 1d,e has to provide better anatomical readability. Please assume a Biology and Engineering undergrad level for the first figure, ensuring all definitions and anatomical names can be fully comprehended without reference to other literature. Please provide clear connections to the different views and perspectives presented in the panels leveraging graphic design to the benefit of the interested reader not familiar with insect morphology.

      5) Likewise, Fig 2 is also confusing. A schematic is in order to show the reader what they are looking at, how the images relate, and why they matter (significance) for understanding the main findings reported in this manuscript.

      6) Fig 3 clearly shows that coarse-rough surfaces provide far less adhesion force. We wonder, are there any images similar to Fig 6 showing that the microtrichia cannot enter the gaps? To comprehend what causes the differences, we would like to see a report of the length scale of the microtrichia compared to that of the gap's dimensions, both for the rough and micro rough surfaces. To clarify this in a universal fashion, please consider reporting gap size non-dimensionally based on the relevant microtrichia length scale. More discussion of the relevant length scales would help bring the force measurements and the observations of the microtrichia together.

      7) Fig 6 is an important figure, so it would help the reader to more easily grasp the viewing perspective using diagrams and avatars. In panel a, a schematic should clearly define the suction disc fringe and the perspective shown. What part is the suction disc and what is the length scale of this image compared to the suction disc? Also, it would be useful if the columns of the microstructure could all be aligned for clarity.

      8) Currently, the authors provide an estimate of the shear stress. It would be helpful to also include the normal stress based on the normal force data on smooth surfaces for lugubris. It would be informative for the reader to know if it exceeds 1 atm. If so, that is a very interesting finding. Please report and discuss what you find in the revised manuscript.

      9) Discussion: Please include a comparison of the magnitude of shear and normal stress that this suction mechanism creates with that of other organisms. Currently the comparison is done with force per body weight, which is biologically relevant. However, reporting stress provides an objective bio-mechanistic perspective on adhesion performance.

      10) Discussion, Ln 300: The suggestion that the inward-facing microtrichia may function to prevent inward slipping of the suction cup is interesting. Please discuss the tradeoff between smooth and micro-rough surfaces: is it possible that on micro-rough surfaces the microtrichia are better able to resist slip, but on smooth surfaces, the seal is better? And if so, this would suggest the effect of a better seal is more important than preventing slip, since performance is better on smooth surfaces? In-vivo visualization during failure would be very informative (in future work).

      11) Please discuss why there may be an intricate branching of the fan-fibres into the microtrichia. E.g. in the gecko, the branched tendons insert into the lamella, supporting the large tensile loads applied to the adhesive. However, here it is less clear if large tensile loads would be applied to the microtrichia. It seems logical that applying large normal loads to the suction cup should be done at its centre, resulting in decreased pressure if no slip occurs (as opposed to applying the normal force to the rim, which would not decrease pressure). So, this would not explain the intricate network of fan-fibres. However, for shear loads, it could make more sense: pulling in shear would engage the microtrichia on the far side of the cup, and the fan-fibres could help transmit this tension. It might be worth thinking this through and discussing the outcome in the paper to strengthen the mechanistic analysis.

      12) We would be excited to learn if the authors have thoughts on the slight curvature of the microtrichia and how it may be involved in the adhesion mechanism. In case this is purely speculative, this could go into the last paragraph of the paper, alternatively it could go into the biomechanical model of the mechanism.
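
      Points 8 and 9 above ask for stresses rather than forces, and point 6 asks for gap size relative to microtrichia length. A minimal Python sketch of those conversions follows. The function names and all numerical values are hypothetical placeholders, not measurements from the manuscript; the only fixed constant is the standard atmosphere (101.325 kPa).

```python
ATM_KPA = 101.325  # standard atmosphere in kPa


def stress_kpa(force_mn: float, area_mm2: float) -> float:
    """Stress in kPa from a force in mN acting on an area in mm^2.

    1 mN / 1 mm^2 = 1e-3 N / 1e-6 m^2 = 1e3 Pa = 1 kPa, so the ratio
    of the two inputs is already expressed in kPa.
    """
    if area_mm2 <= 0:
        raise ValueError("contact area must be positive")
    return force_mn / area_mm2


def exceeds_one_atm(normal_force_mn: float, disc_area_mm2: float) -> bool:
    """True if the normal stress on the disc is larger than 1 atm."""
    return stress_kpa(normal_force_mn, disc_area_mm2) > ATM_KPA


def relative_gap(gap_um: float, microtrichia_length_um: float) -> float:
    """Non-dimensional gap size: surface gap divided by microtrichia length."""
    if microtrichia_length_um <= 0:
        raise ValueError("microtrichia length must be positive")
    return gap_um / microtrichia_length_um


# Hypothetical placeholder inputs, chosen only to exercise the functions.
example_normal_stress = stress_kpa(force_mn=50.0, area_mm2=0.5)
example_above_atm = exceeds_one_atm(normal_force_mn=50.0, disc_area_mm2=0.5)
example_relative_gap = relative_gap(gap_um=30.0, microtrichia_length_um=10.0)
```

      Reporting the same stress function for both shear and normal loads keeps the comparison with other organisms on a common footing, under the assumption that the projected disc area is the relevant contact area.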

    1. Reviewer #3:

      Carreño-Muñoz et al. describe a piezoelectric sensor-based approach to quantify rodent behavior. Piezoelectric sensors convert pressure, acceleration, strain, and even temperature and sound into an electrical charge. They are exquisitely sensitive and have a wide range of functionalities. The paper describes an open field arena that sits on top of three sensors on an air table that is able to detect animal movement. The authors use several behavioral paradigms and genetic models to validate their system. Overall, the piezo and pressure/force/vibration based systems have been well established for rodent behavior. Some examples of commercial systems are the Laboras (Metris BV) and PiezoSleep (Signal Solutions), along with many papers that describe similar systems. The advantage of the system described in this paper (Phenotypix) is that it encompasses a large open field which allows the mouse to carry out naturalistic behavior. It also sits on top of an air table which allows more sensitive measurements. Although the system described has some advantages, the manuscript does not describe a system that leads to a significant enough advance. The manuscript does not offer a thorough solution for any one problem in biology and does not make a convincing case for adoption of this platform. The figures and experimental description are also lacking, leading to unclear interpretation of data.

      One of the major issues with this paper is that it does not adequately describe the Phenotypix platform to allow for replication. This may be fine if the platform is commercially available, which seems to be the goal, but when I searched for the "Phenotypix, Roddata", I did not find a commercial supplier. Thus, it is unclear how this data can be replicated. Another major issue is that it is never clear whether behavior state determination is based on mechanoelectrical signal, video data, or both. Ideally, one would use the video data to train classifiers that only use the mechanoelectrical data. However, it is not clear that this was done in most of the experiments. Without the hardware specifications and classifiers for the behaviors, replicability is an issue. The fact that the apparatus needs to be placed on a 250 kg air table brings its practical utility and scalability into question. Systems such as Laboras can be obtained with readily available classifiers for numerous behaviors (https://www.metris.nl/en/products/laboras/laboras_specs/) and allow for long term monitoring in a home cage environment, which calls into question the claim of "A novel device for behavioural phenotyping of freely moving laboratory animals (rats and mice) now allows to detect behavioural components out of reach of existing systems."

      One issue that is not addressed for the various behaviors is how body weight affects the spectral properties of behaviors. How can we compare the same behavior between two animals of differing sizes? Since this is a pressure sensor, this is important.

    2. Reviewer #2:

      General assessment of the work:

      The authors present the Phenotypix, a device that uses piezoelectric pressure-sensors, in combination with video recording and signal analysis, to observe physiological states within a subject mouse. Using computational approaches, they show that this device can detect locomotion, and even sub-components of locomotion such as grooming. Similarly, they show the device can detect heart rate and breathing rate in both anesthetized and awake (but immobile) subjects. Next, in a series of proof-of-concept experiments they show that differences in pain, fear, and gait responses can be detected between control and experimental subjects.

      Numbered summary of substantive concerns:

      1) The anti-vibrational setup that the system is located on appears to be critical to successful use of the system. Please provide some parametric data showing how different degrees of damping influence system performance. This will be critical for replication of results in different labs.

      2) How does the device account for changes in the environment, such as bedding moving around or the animal defecating/urinating? Is this system compatible with behavioral enrichment like cotton bedding, etc?

      3) Is it possible to track multiple subjects in a single chamber? This seems like it should be feasible with the inclusion of video data in the analysis.

      4) It appears that only locomotion related data can be reliably recorded while the subjects are moving, and that features such as heart rate and respiration rate are limited to immobile states. Is this correct? If so, a discussion of potential ways to overcome this confound would be welcomed.

      5) The lack of publicly available code and data is not compatible with the mission of supporting the open science environment. It has also made evaluating the technical merit of the work in this manuscript difficult.

    3. Reviewer #1:

      The manuscript by Carreño-Muñoz et al. seeks to tackle an important problem in behavioral neuroscience, that is, classifying behavior at fine resolution during free exploration in rodents. Though the goals of this study are lofty, this platform, in my opinion, isn't a substantive step forward in relation to other tools currently available.

      Major concerns:

      1) What is presented in this work is a piezoelectric based sensor to detect rodent movements. My main criticism with this work is that the behaviors were coded by hand. If the authors had developed a way to automatically measure spontaneous behaviors of interest, or even train a machine to detect behavioral signatures after some human input, this system would have broader appeal. As is, the experimenter uses standard whole animal tracking with EthoVision, then observes what the animal is doing by hand, then quantitation is added to certain movements. This, I believe, is not a major advance, as current weight bearing devices already have this capacity.

      2) For the breathing and heartbeat studies in figure 2, I am not convinced that this approach is more beneficial than the standard EEG approaches.

      3) Figure 3 is poorly developed and the biology is very questionable. "Shaking" after surgery as a read-out of pain is not a measurement currently used or seen in the pain field. Although the authors report that this measurement is reduced with BPN, there are other trivial or purely coincidental explanations for this unusual finding. This reviewer tends to believe that the anesthesia or some other surgical by-product, rather than pain, is driving this phenotype. I don't believe the authors have discovered a new post-op pain behavior. If so, substantial data needs to be added to be convincing.

    1. Reviewer #3:

      The work by Münch et al addresses an important problem of modeling data that originates from multiple channels (100s-1000s) by establishing a Bayesian inference-based framework to extend an existing Kalman filter-based method. They convincingly demonstrate that their approach is much more accurate at quantifying channels than previous approaches, and is impressively able to combine multiple experimental modalities. Most importantly, as a Bayesian method, this approach allows the incorporation of prior information such as the diffusion limit or previous experiments, and also allows one to perform model selection to select the best kinetic model of the data (although this aspect is less developed). In particular, the Bayesian approach of this work is an important advance in the field.

      1) The manuscript needs line editing and proofreading (e.g., on line 494, "Roa" should be "Rao"; missing an equals sign in equation 13). Additionally, in many paragraphs, several of the sentences are tangential and distract from communicating the message of the paper (e.g., line 55). Removing them will help to streamline the text, which is quite long.

      2) Even more emphasis on the approximation of n(t) as being distributed according to a multivariate normal, and thus being continuous, should be placed in the main text. To my understanding, this limits the applicability of the method to data with > ~100s of channels; although the point is not investigated that I could find. In Fig. 3, it seems the method is only benchmarked to a lower limit of ~500 channels. Although an investigation of performance below that point would be interesting, it is only necessary to discuss the approximate lower bound cutoff.

      3) The methods section should include information concerning the parameter initialization choices, HMC parameters (e.g. number of steps) and any burn-in period used in the analyses shown in Figs. 3-6.

      4) In the section on priors, the entire part concerning the use of a beta distribution should be removed or replaced, because it is a probabilistic misrepresentation of the actual prior information that the authors claim to have in the manuscript text. The max-entropy prior derived for the situation described in the text (i.e., an unknown magnitude where you don't know any moments but do have upper and lower bounds; the latter could be from the length from the experiment) is actually P(x) = (ln(x_{max}) - ln(x_{min}))^{-1} * x^{-1}. I'm happy to discuss more with the authors.

      5) Achieving the ability to rigorously perform model selection is a very impressive aspect of this work and a large contribution to the field. However, the manuscript offers too many solutions to performing that model selection itself along with a long discussion of the field (for instance, lines 376-395 could be completely cut). Since probabilistic model selection is an entire area of study by itself, the authors do not need to present underdeveloped investigations of each of them in a paper on modeling channel data (e.g., of course WAIC outperforms AIC. Why not cover BIC and WBIC?). The authors should pick one, and maybe write a second paper on the others instead of presenting non-rigorous comparisons (e.g., one kinetic scheme and set of parameters). As a side note, it is strange that the authors did not consider obtaining evidences or Bayes factors to directly perform Bayesian model selection - for instance, they could have used thermodynamic integration since they used MC to obtain posteriors anyway (c.f., Computing Bayes Factors Using Thermodynamic Integration by Lartillot and Philippe, Systematic Biology, 2006, 55(2), 195-207. DOI: 10.1080/10635150500433722)
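
      The bounded prior named in point 4 above is the log-uniform (reciprocal) distribution, which places equal probability on every decade between the two bounds. A minimal Python sketch of its density, cumulative distribution, and inverse-CDF sampler is given below for illustration. The function names and the example bounds are assumptions made here; they are not taken from the manuscript or its code.

```python
import math
import random


def _check_bounds(x_min: float, x_max: float) -> None:
    if not (0.0 < x_min < x_max):
        raise ValueError("bounds must satisfy 0 < x_min < x_max")


def log_uniform_pdf(x: float, x_min: float, x_max: float) -> float:
    """P(x) = 1 / (x * (ln(x_max) - ln(x_min))) inside the bounds, else 0."""
    _check_bounds(x_min, x_max)
    if x < x_min or x > x_max:
        return 0.0
    return 1.0 / (x * (math.log(x_max) - math.log(x_min)))


def log_uniform_cdf(x: float, x_min: float, x_max: float) -> float:
    """Integral of the density from x_min to x, clipped to [0, 1]."""
    _check_bounds(x_min, x_max)
    if x <= x_min:
        return 0.0
    if x >= x_max:
        return 1.0
    return (math.log(x) - math.log(x_min)) / (math.log(x_max) - math.log(x_min))


def log_uniform_sample(x_min: float, x_max: float, rng: random.Random) -> float:
    """Inverse-CDF sampling: x = x_min * (x_max / x_min) ** u, u ~ U[0, 1)."""
    _check_bounds(x_min, x_max)
    return x_min * (x_max / x_min) ** rng.random()


# Hypothetical bounds for a rate constant spanning four decades.
rng = random.Random(0)
draws = [log_uniform_sample(1e-2, 1e2, rng) for _ in range(1000)]
```

      Under this prior the geometric midpoint of the bounds, not the arithmetic one, is the median, which is the property that makes it suitable for a parameter whose order of magnitude is unknown.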

    2. Reviewer #2:

      Extracting ion channel kinetic models from experimental data is an important and perennial problem. Much work has been done over the years by different groups, with theoretical frameworks and computational algorithms developed for specific combinations of data and experimental paradigms, from single channels to real-time approaches in live neurons. At one extreme of the data spectrum, single channel currents are traditionally analyzed by maximum likelihood fitting of dwell time probability distributions; at the other extreme, macroscopic currents are typically analyzed by fitting the average current and other extracted features, such as activation curves. Robust analysis packages exist (e.g., HJCFIT, QuB), and they have been put to good use in the literature.

      Münch et al focus here on several areas that need improvement: dealing with macroscopic recordings containing relatively low numbers of channels (i.e., hundreds to tens of thousands), combining multiple types of data (e.g., electrical and optical signals), incorporating prior information, and selecting models. The main idea is to approach the data with a predictor-corrector type of algorithm, implemented via a Kalman filter that approximates the discrete-state process (a meta-Markov model of the ensemble of active channels in the preparation) with a continuous-state process that can be handled efficiently within a Bayesian estimation framework, which is also used for parameter estimation and model selection.

      With this approach, one doesn't fit the macroscopic current against a predicted deterministic curve, but rather infers - point by point - the ensemble state trajectory given the data and a set of parameters, themselves treated as random variables. This approach, which originated in the signal processing literature as the Forward-Backward procedure (and the related Baum-Welch algorithm), has been applied since the early 90s to single channel recordings (e.g., Chung et al, 1990), and later has been extended to macroscopic data, in a breakthrough study by Moffatt (2007). In this respect, the study by Münch et al is not necessarily a conceptual leap forward. However, their work strengthens the existing mathematical formalism of state inference for macroscopic ion channel data, and embeds it very nicely in a rigorous Bayesian estimation framework.

      The main results are very convincing: basically, model parameters can be estimated with greater precision - as much as an order of magnitude better - relative to the traditional approach where the macroscopic data are treated as noisy but deterministic (but see my comments below). Estimate uncertainty can be further improved by incorporating prior information on parameters (e.g., diffusion limits), and by including other types of data, such as fluorescence. The manuscript is well written and overall clear, and the mathematical treatment is a rigorous tour-de-force.

      There are several issues that should be addressed by the authors, as listed below.

      1) I think packaging this study as a single manuscript for a broad audience is not optimal. First, the subject is very technical and complex, and the target audience is probably small. Second, the study is very nice and ambitious, but I think clarity is a bit impaired by dealing with perhaps too many issues. The state inference and the Bayesian model selection are very important but completely different issues that may be better treated separately, perhaps for a more specialized readership where they can be developed in more detail. Tutorial-style computational examples must be provided, along with well commented/documented code. The interested readers should be able to implement the method described here in their own code/program.

      2) The authors should clearly discuss the types of data and experimental paradigms that can be optimally handled by this approach, and they must explain when and where it fails or cannot be applied, or becomes inefficient in comparison with other methods. One must be aware that ion channel data are very often subject to noise and artifacts that alter the structure of microscopic fluctuations. Thus, I would guess that the state inference algorithm would work optimally with low noise, stable, patch-clamp recordings (and matching fluorescence recordings) in heterologous expression systems (e.g., HEK293 cells), where the currents are relatively small, and only the channel of interest is expressed (macropatches?). I imagine it would not be effective with large currents that are recorded with low gain, are subject to finite series resistance, limited rise time, restricted bandwidth, colored noise, contaminated by other currents that are (partially) eliminated with the P/n protocol with the side effect of altering the noise structure, power line 50/60 Hz noise, baseline fluctuations, etc. This basically excludes some types of experimental data and experimental paradigms, such as recordings from neurons in brain slices or in vivo, oocytes, etc. Of course, artifacts can affect all estimation algorithms, but approaches based on fitting the predicted average current have the obvious benefit of averaging out some of these artifacts.

      The discussion in the manuscript is insufficient in this regard and must be expanded. Furthermore, I would like to see the method tested under non-ideal but commonly occurring conditions, such as limited bandwidth and in the presence of contaminating noise. For example, compare estimates obtained without filtering with estimates obtained with 2, 3 times over-filtering, with and without large measurement noise added (whole cell recordings with low-gain feedback resistors and series resistance compensation are quite noisy), with and without 50/60 Hz interference. How does the algorithm deal with limited bandwidth that distorts the noise spectrum? How are the estimated parameters affected? The reader will have to get a sense of how sensitive this method is to artifacts.

      3) A better comparison with alternative parameter estimation approaches is necessary. First of all, explain more clearly what is different from the predictor-corrector formalism originally proposed by Moffatt (2007). The manuscript mentions that it expands on that, but exactly how? If it is only an incremental improvement, a more specialized audience is more appropriate.

      Second, the method proposed by Celentano and Hawkes, 2004, is not a predictor-corrector type but it utilizes the full covariance matrix between data values at different time points. It seems to me that the covariance matrix approach uses all the information contained in the macroscopic data and should be on par with the state inference approach. However, this method is only briefly mentioned here and then it's quickly dismissed as "impractical". I am not at all convinced that it's impractical. We all agree that it's a slower computation than, say, fitting exponentials, but so is the Kalman filter. Where do we draw the line of impracticability? Computational speed should be balanced with computational simplicity, estimation accuracy, and parameter and model identifiability. Moreover, that method was published in 2004, and the computational costs reported there should be projected to present day computational power. I am not saying that the authors should code the C&H procedure and run it here, but should at least give it credit and discuss its potential against the KF method.

      The only comparison provided in the manuscript is with the "rate equation" approach, by which the authors understand the family of methods that fit the data against a predicted average trajectory. In principle, this comparison is sufficient, but there are some issues with the way it's done.

      Table 3 compares different features of their state inference algorithm and the "rate equation fitting", referencing Milescu et al, 2005. However, there seems to be a misunderstanding: the algorithm presented in that paper does in fact predict and use not only the average but also - optionally - the variance of the current, as contributed by stochastic state fluctuations and measurement noise. These quantities are predicted at any point in time as a function of the initial state, which is calculated from the experimental conditions. In contrast, the KF calculates the average and variance at one point in time as a projection of the average and variance at the previous point. However, both methods (can) compare the data value against a predicted probability distribution. The Kalman filter can produce more precise estimates but presumably with the cost of more complex and slower computation, and increased sensitivity to data artifacts.

      Fig. 3 is very informative in this sense, showing that estimates obtained with the state inference (KF) algorithm are about 10 times more precise than those obtained with the "rate equation" approach. However, for this test, the "rate equation" method was allowed to use only the average, not the variance.

      Considering this, the comparison made in Fig 3 should be redone against a "rate equation" method that utilizes not only the expected average but also the expected variance to fit the data, as in Milescu et al, 2005. Calculating this variance is trivial and the authors should be able to implement it easily (and I'll be happy to provide feedback). The comparison should include calculation times, as well as convergence.

      4) As shown in Milescu et al, 2005, fitting macroscopic currents is asymptotically unbiased. In other words, the estimates are accurate, unless the number of channels is small (tens or hundreds), in which case the multinomial distribution is not very well approximated by a Gaussian. What about the predictor-corrector method? How accurate are the estimates, particularly at low channel counts (10 or 100)? Since the Kalman filter also uses a Gaussian to approximate the multinomial distribution of state fluctuations, I would also expect asymptotic accuracy. Parameter accuracy should be tested, not just precision.

      5) The manuscript nicely points out that a "rate equation" approach would need 10 times more channels (N) to attain the same parameter precision as with the Kalman filter, when the number of channels is in the approximate range of 10^2 ... 10^4. With larger N, the two methods become comparable in this respect.

      This is very important, because it means that estimate precision increases with N, regardless of the method, which also means that one should try to optimize the experimental approach to maximize the number of channels in the preparation. However, I would like to point out that one could simply repeat the recording protocol 10 times (in the same cell or across cells) to accumulate 10 times more channels, and then use a "rate equation" algorithm to obtain estimates that are just as good. Presumably, the "rate equation" calculation is significantly faster than the Kalman filter (particularly when one fits "features", such as activation curves), and repeating a recording may only add seconds or minutes of experiment time, compared to a comprehensive data analysis that likely involves hours and perhaps days. Although obvious, this point can be easily missed by the casual reader and so it would be useful to be mentioned in the manuscript.

      6) Another misunderstanding is that a current normalization is mandatory with "rate equation" algorithms. This is really not the case, as shown in Milescu et al, 2005, where it is demonstrated clearly that one can explicitly use channel count and unitary current to predict the observed macroscopic data. Consequently, these quantities can also be estimated, but state variance must be included in the calculation. Without variance, one can only estimate the product i x N, where i is unitary current and N is channel count. This should be clarified in the manuscript: any method that uses variance can be used to estimate i and N, not just the Kalman filter. In fact, the non-stationary noise analysis does exactly that: a model-blind estimation of N and i from non-equilibrium data. Also, one should be realistic here: in some circumstances it is far more efficient to fit data "features", such as the activation curve, in which case the current needs to be normalized.

      7) I think it's great that the authors develop a rigorous Bayesian formalism here, but I think it would be a good idea to explain - even briefly - how to implement a (presumably simpler) maximum likelihood version that uses the Kalman filter. This should satisfy those readers who are less interested in the Bayesian approach, and will also be suitable for situations when no prior information is available.

      8) The Bayesian formalism is not the only way of incorporating prior knowledge into an estimation algorithm. In fact, it seems to me that the reader would have more practical and pressing problems than guessing what the parameter prior distribution should be, whether uniform or Gaussian or other. More likely one would want to enforce a certain KD, microscopic (ir)reversibility, an (in)equality relationship between parameters, a minimum or maximum rate constant value, or complex model properties and behaviors, such as maximum Popen or half-activation voltage. A comprehensive framework for handling these situations via parameter constraints (linear or non-linear) and cost function penalty has been recently published (Salari et al/Navarro et al, 2018). Obviously, the Bayesian approach has merit, but the authors should discuss how it can better handle the types of practical problems presented in those papers, if it is to be considered an advance in the field, or at least a usable alternative.

      9) Discuss the practical aspects of optimization. For example, how is convergence established? How many iterations does it take to reach convergence? How long does it take to run? How does it scale with the data length, channel count, and model state count? How long does it take to optimize a large model (e.g., 10 or 20 states)? Provide some comparison with the "rate equation method".

      10) Here and there, the manuscript somehow gives the impression that existing algorithms that extract kinetic parameters by fitting the average macroscopic current ("fitting rate equations") are less "correct", or ignorant of the true mathematical description of the data. This is not the case. Published algorithms that I know of clearly state what data they apply to, what their limitations are, and what approximations were made, and thus they are correct within that defined context and are meant to be more effective than alternatives. Some quick editing throughout the manuscript should eliminate this impression.

      11) The manuscript refers to the method where the data are fitted against a predicted current as "rate equations". I don't actually understand what that means. The rate equation is something intrinsic to the model, not a feature of any algorithm. An alternative terminology must be found. Perhaps different algorithms could be classified based on what statistical properties are used and how. E.g., average (+variance) predicted from the starting probabilities (Milescu et al, 2005), full covariance (Celentano and Hawkes, 2004), point-by-point predictor-corrector (Moffatt, 2007).
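
      Point 6 above states that any method using the variance can separate unitary current i from channel count N. The following Python sketch illustrates that statement with the textbook non-stationary noise relation for identical, independent two-state channels, variance = i*I - I^2/N + background noise, where I is the mean current. It is not the Kalman filter of the manuscript nor the algorithm of Milescu et al, 2005; the function names, the least-squares fit, and all numerical values are assumptions chosen for illustration.

```python
def predicted_mean_and_variance(i_unit, n_channels, p_open, noise_var=0.0):
    """Mean and variance of the macroscopic current for N independent
    two-state channels with open probability p_open (binomial statistics)."""
    mean = i_unit * n_channels * p_open
    variance = i_unit ** 2 * n_channels * p_open * (1.0 - p_open) + noise_var
    return mean, variance


def fit_unitary_current_and_count(means, variances, noise_var=0.0):
    """Least-squares fit of variance - noise_var = a*I + b*I^2.

    With a = i and b = -1/N, the fit returns (i, N). The 2x2 normal
    equations are solved in closed form.
    """
    ys = [v - noise_var for v in variances]
    s11 = sum(m ** 2 for m in means)
    s12 = sum(m ** 3 for m in means)
    s22 = sum(m ** 4 for m in means)
    t1 = sum(m * y for m, y in zip(means, ys))
    t2 = sum(m ** 2 * y for m, y in zip(means, ys))
    det = s11 * s22 - s12 ** 2
    if det == 0.0:
        raise ValueError("need at least two distinct non-zero mean currents")
    a = (t1 * s22 - s12 * t2) / det
    b = (s11 * t2 - s12 * t1) / det
    if b >= 0.0:
        raise ValueError("no downward curvature; N is not identifiable")
    return a, -1.0 / b


# Hypothetical noise-free example: 1.5 pA unitary current, 1000 channels.
TRUE_I, TRUE_N = 1.5, 1000
pairs = [predicted_mean_and_variance(TRUE_I, TRUE_N, p)
         for p in (0.1, 0.3, 0.5, 0.7, 0.9)]
i_hat, n_hat = fit_unitary_current_and_count(
    [m for m, _ in pairs], [v for _, v in pairs])
```

      Fitting the mean alone would constrain only the product i x N, since the mean depends on the two quantities only through that product; the quadratic term in the variance is what makes N separately identifiable.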

    3. Reviewer #1:

      The authors develop a Bayesian approach to modeling macroscopic signals arising from ensembles of individual units described by a Markov process, such as a collection of ion channels. Their approach utilizes a Kalman filter to account for temporal correlations in the bulk signal. For simulated data from a simple ion channel model where ligand binding drives pore opening, they show that their approach enhances parameter identifiability over an existing approach based on fitting average current responses. Furthermore, the approach can include simultaneous measurement of multiple signals (e.g. current and fluorescence) which further increases parameter identifiability. They also show how appropriate choice of priors can help model and parameter identification.

      The application of Bayesian approaches to kinetic modeling has recently become popular in the ion channel community. The need for approaches that inform on parameter distributions and their identifiability, as well as allow model selection, is unquestioned. Also, it is ideal to use as much information in the experimental data as possible, including temporal correlations. As such, the authors’ addition is a valuable contribution.

      Comments:

      I note that my comments are restricted largely to the results rather than the mathematical derivation of the authors' approach.

      1) I understand that this is somewhat secondary to the paper's intellectual contribution. However, one thing that would be enormously useful is accompanying software usable by others. The supplied code is not well commented, and it is unclear whether it is applicable beyond the specific models examined in the paper. It was supplied as .txt files, but looks like C code. I did not spend the time to get it working, so an accompanying GitHub page or some such with detailed instructions for how to apply this approach for one's own model of interest would make this contribution infinitely better. Even better if there was a GUI, although easily adaptable code is of primary importance.

      2) What are the temporal resolutions of the current and fluorescence simulations shown in Fig 1? I assume that they are the same. However, most current recordings are much higher temporal resolution than fluorescence recordings. If you were to reduce the sample rate of the binding fluorescence relative to current simulations to something experimentally reasonable, how would the resulting time averaging of the binding signal impact its enhancement of parameter identifiability?

      3) For comparison, it would also be nice to see how addition of the binding signal in the data helps the RE approach. i.e. Is addition of the binding signal more important than choice of RE vs KF, or is optimization method still an important factor in terms of correctly identifying the model's rate constants or in selecting the true model?

      4) Fig 7: For PC data, why does RE model BC appear to be better than KF model BC if the KF model does a better job at estimating the parameters and setting non-true rates to zero? Doesn't this suggest that RE with cross validation is better than the proposed KF approach? In terms of parameter estimates (i.e. as shown in Fig. 3), how does RE + BC stack up?

    1. Reviewer #3:

      In this manuscript, Robert et al. demonstrated that medial SuM sends glutamatergic projections to the hippocampal CA2 region, and that stimulation of these projections exerts mixed excitatory and inhibitory responses in CA2 pyramidal neurons. Furthermore, they showed that SuM-CA2 circuits recruit local PV basket cells to provide feedforward inhibition to CA2 pyramidal cells, which increases the precision of action potential firing in conditions of low and high cholinergic tone. Finally, they performed in vivo electrophysiology recordings to show that stimulation of SuM-CA2 projections can influence CA1 activity. Overall, this is a well-designed study, and the quality of the data is high. The authors performed an impressive amount of electrophysiology recording in acute slices and provided detailed information on how long-distance SuM projection neurons regulate CA2 pyramidal cell activity. These findings provide insights into how SuM activity directly acts on the local hippocampal circuit to modulate social memory encoding. However, there are some concerns that need to be addressed.

      1) The authors performed CAV-based retrograde tracing and demonstrated that medial SuM sends glutamatergic projections to CA2. These results are in contrast to a recent study (Li et al., eLife 2020) showing that lateral SuM neurons send dense projections to both CA2 and DG, and that the SuM-DG projections release both glutamate and GABA onto dentate granule cells. Based on the results from this study and the study from Li et al., does that mean medial SuM neurons differ from lateral SuM neurons in terms of the neurotransmitters they release? The authors need to clarify this point and provide additional ephys data, using a high-chloride internal solution to reveal IPSCs, to show that pyramidal cells do not receive direct GABAergic inputs upon stimulation of SuM-CA2 projections.

      2) The authors claim that SuM-CA2 circuits recruit local PV basket cells to provide feedforward inhibition to CA2 pyramidal cells. While the data presented are supportive, they are not entirely convincing. Specifically, the MOR agonist DAMGO is not specific to PV BCs. Though DAMGO has a preferential effect on PV cells over CCK cells, other interneuron types have been shown to be sensitive to DAMGO manipulation. Therefore, these results are subject to the alternative interpretation that other types of CA2 local interneurons may be involved. To show whether PV BCs are the sole interneuron subtype involved, the authors may use a P/Q type calcium channel blocker, ω-agatoxin-TK, as P/Q Ca2+ channels are unique to PV BCs. In addition, chemogenetic inhibition of PV BCs was used, but light-evoked IPSCs are not completely blocked. The authors claimed this could be due to partial silencing of PV BCs. However, there is no evidence showing the efficacy of 10µM CNO application in suppressing CA2 PV basket cell activity. These data should be provided in order to draw such conclusions.

      3) CCK basket cells are known to excite PV basket cells (Lee et al 2011) via a pertussis toxin-sensitive pathway. Is it possible that SuM-CA2 mediated excitation of PV basket cells includes a CCK intermediary? This point should be discussed.

      4) The in vivo recording data showed that SuM-CA2 circuit stimulation decreases the firing rate of CA1 pyramidal cells followed by increased firing rate in these cells. Then the authors performed slice recording and showed that the reduced firing rate of CA1 neurons in vivo is likely caused by increased inhibitory inputs onto CA1 pyramidal cells. Figure 7G-H seems to explain the reduced events in the first phase of the tetrode recordings, but not the rebound part. Is there some circuit component that is lost when making slices? Furthermore, what does SuM-CA2 circuit stimulation do to theta/gamma rhythms in CA1? These data should be available in the tetrode recordings.

    2. Reviewer #2:

      The article brings to light the functional consequences of the activity of SuM afferents terminating at CA2 neurons in the hippocampus using a combination of a variety of methods like whole-cell voltage clamp and optogenetics. In addition, the authors provide evidence that modulation of the CA2 neurons by SuM afferents affects the activity pattern of CA1 neurons. Specifically, the study reveals that the 'functional' connectivity between SuM and CA2 is mainly mediated by the regulation of PV+ basket cells that are involved in the feed forward inhibition of CA2 principal neurons. This study is also relevant in the context of neuropsychiatric disorders where PV+ IN density in the CA2 area is preferentially reduced.

      It would be good if some results and implications are further clarified for better understanding in the discussion section:

      1) The results indicate that SuM recruits a feed forward inhibition onto CA2 PNs, which contributes to the shaping of CA2 AP firing. However, it is not entirely intuitive how the feed forward inhibition of CA2 PNs by SuM also reduces CA1 activity, as CA2 has also been known to recruit strong feed forward inhibition onto CA1. This would intuitively suggest that decrease in CA2 activity by photostimulation of SuM afferents will in turn decrease the feed forward inhibition by CA2 onto CA1, and thereby increase CA1 activity. However, the results suggest otherwise. Would this be suggestive of a stronger direct excitatory projection from CA2 to CA1 PNs that is more dominant than the feed forward inhibition of CA1 PNs by CA2? This may be a good point to further elaborate on in the discussion section, so that the effect of SuM-CA2 connectivity on CA1 output becomes clearer.

      2) In the introduction section line 44, it is written that 'CA2 neurons do not undergo NMDA-mediated synaptic plasticity'. This may not always be the case; rather it may be better to rephrase 'NMDA-mediated' as 'high frequency stimulation-induced'. It has been shown previously that NK1 receptor activation by pharmacological application of substance P in hippocampal slices triggers a slow onset NMDA-dependent LTP in CA2 neurons by high frequency stimulation of CA3 afferents to CA2 (Dasgupta et al., 2017).

      3) Line 250: "BC transmission is insensitive to MOR activation (Glickfeld et al., 2008)."

      Was the Glickfeld study done in CA2 neurons? If not, it would be good to show that PV+ CA2 BCs are also sensitive to DAMGO and to what degree? The experiment shows that IPSC in PNs are inhibited by DAMGO that should have enhanced light induced EPSCs if PV+ BCs are responsible for feed forward inhibition. But it seems that has not been observed. What are direct EPSCs - electrical stimulation of CA3-CA2 synapses?

      4) Overall, the results seem to suggest that SuM stimulation would induce a net inhibition (?) of CA2 PNs by recruiting interneurons (INs). However, the role played by the direct glutamatergic connections from SuM to CA2 PNs is not entirely clear. Is it less prominent due to sparse SuM-PN projections compared to SuM-IN connections in the CA2 area? It may be good to elaborate on this a bit in the discussion.

    3. Reviewer #1:

      In this study Robert et al. describe the properties of long-range projections from the SuM to the CA2 area of the hippocampus. The authors identified direct excitatory and indirect inhibitory drive from SuM inputs on CA2 pyramidal neurons and showed that direct excitatory drive impinges on PV-positive basket cells. The overall effect of the input on CA2 activity was an increased precision of APs. The study also suggests that the input from the CA2 drives inhibition in the CA1 area. The study provides very interesting and new information about the cellular properties of SuM input in the CA2 area. This is an important question given the increasing importance of SuM inputs in social memory encoding. The study is timely; currently we have very limited data about the features and exact cellular profile of this input. The study uses elegant technical approaches to answer its central question. While the study addresses an important question and provides novel data, the authors' central claim about the role of feed-forward inhibition would need to be strengthened by the addition of experiments addressing how E-I balance changes in trains in individual neurons and how this can be linked to changes in the temporal precision of synaptically evoked APs.

      Action potentials are evoked with a current step. Since the study is focused on the network effects of feed-forward inhibition, it would be useful to see how the properties of synaptically evoked action potentials change. In the cortex and in CA1, feed-forward inhibition was shown to limit the temporal summation of excitatory inputs, which leads to a decrease in AP jitter (Gabernet et al., 2005, Pouille and Scanziani 2001). In order to map these dynamics, APs should be evoked via synaptic stimulation and not through current injection.

      The authors show recordings of monosynaptic EPSCs in pyramidal cells and interneurons. It would be important to know how inhibitory and excitatory PSCs change in a train. Recordings from single cells held at E-GLUT and E-GABA would allow the authors to monitor excitatory and inhibitory events in a train and map how their balance changes. Can the change in E-I balance explain the change in AP jitter?

      What are the characteristics of the SuM-driven inhibitory currents? Do the latency and jitter of monosynaptic EPSCs and disynaptic IPSCs differ? If one is monosynaptic and the other is disynaptic, one would expect significant differences in both of these parameters.

      How do the authors exclude the contribution of feed-back inhibition? Feed-forward and feed-back inhibition both could have an impact on the temporal precision of APs.

    1. Reviewer #2:

      In this manuscript, the authors combine genetic/hormonal manipulation of expansin expression, localization studies, and mechanical measurements of root cell walls to study how this family of cell wall-loosening proteins influences root growth and development. This is an exciting topic, since expansins have a long history of in vitro characterization, but their characterization in living plants has lagged behind. The localization patterns of EXPA1, EXPA10, EXPA14, and EXPA15 are depicted using mCherry fusion proteins, and are shown to be distinct from one another. Despite the wide range of interesting approaches described here, I have some important concerns about the work as it stands, in terms of providing new insights into how expansins actually influence root growth.

      Major Comments:

      One major concern is the lack of appropriate controls, statistical appropriateness, and reporting (e.g., defining "n" clearly in all cases) in this work. All comparisons should include wild type and no-treatment controls; for example, in Figure 8, no AFM images are shown for wild type or EXPA1 overexpression cells.

      Figure 1-S1: there is no change in pEXPA1::nls:3xGFP - why is there this discrepancy with the EXPA1 qPCR result? This is not explained.

      Figure 3-S1: The finding of a lack of colocalization between EXPA10 and CFW staining is not convincing, due to a lack of a control showing positive colocalization and a lack of quantification of the degree of colocalization (e.g., Pearson correlation coefficient between red/blue pixels). The authors use these data as a lynchpin for part of their discussion, but this lack of colocalization could simply be an artifact of chromatic aberration, etc.
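
      As an illustration of the quantification requested above, the following is a minimal sketch of a Pearson correlation coefficient computed between the pixel intensities of two channels. It assumes the two channels have already been extracted as flat, equally sized lists of intensities from the same region of interest; the function name and the example values are hypothetical and are not taken from the manuscript.

```python
import math


def pearson_colocalization(red, blue):
    """Pearson correlation coefficient between two channels.

    `red` and `blue` are flat sequences of pixel intensities taken from the
    same pixels of the same region of interest (an assumption of this sketch).
    """
    if len(red) != len(blue) or len(red) < 2:
        raise ValueError("channels must have the same length (at least 2 pixels)")
    n = len(red)
    mean_r = sum(red) / n
    mean_b = sum(blue) / n
    # Sum of products of deviations from the mean, and per-channel sums of squares.
    cov = sum((r - mean_r) * (b - mean_b) for r, b in zip(red, blue))
    var_r = sum((r - mean_r) ** 2 for r in red)
    var_b = sum((b - mean_b) ** 2 for b in blue)
    if var_r == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / math.sqrt(var_r * var_b)
```

      Running the same function on a positive control (two labels known to colocalize) and on the EXPA10/CFW images would give two coefficients that can be compared directly. Such a coefficient does not correct for chromatic aberration, so channel registration would still have to be checked separately.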

      L256: This statement is not supported by the statistical comparisons shown in Figure 5B-C. In Figure 5B, why does the WT show higher MOC with Dex than without? In Figure 5B-C, you do not compare 8-4 + Dex with WT + Dex statistically, which is the salient comparison, and instead compare each genotype with vs. without Dex. In addition, the fact that the pRPS5A>GR>EXPA1:mCherry line does not show a significant difference in BLS signal with Dex addition (Figure 5-S1) argues against a clearly established relationship between expansin expression and BLS signal. The data in Figure 5D-E are more informative, but there is no wild type control for these experiments.

      In Figure 8, the AFM color code scales do not seem to match the graphs, in that the color scales range from 0-2 MPa, whereas the graph Y axes range from 0 to 3 e6 MPa (unless that is supposed to be 0-3 MPa, or 0 to 3 e6 Pa!). No-Dex controls are missing from 8B.

      In the Discussion, the authors use the words "unclear" and "elusive", and "remains to be identified" to sum up their work, and this to me is an indication of the state of this work overall. Although some of the data are intriguing, they are neither conclusive nor explanatory in revealing the mechanisms of expansin-mediated growth control in roots.

      Finally, the manuscript needs to be revised for proper English grammar, syntax, and style.

    2. Reviewer #1:

      Expansins are mysterious cell wall proteins because they lack known hydrolytic activity but are somehow correlated with acid-induced cell wall loosening/extension and cell expansion. Here the authors catalog the tissue expression of several native promoter driven expansin-FP fusions (EXPA1, 10, 14, 15) and find partially overlapping expression patterns and evidence that some expansins are restricted to particular cell wall regions (e.g. tricellular junctions; Figs 1-4). Using Brillouin light scattering (BLS) microscopy they find that, contrary to several previous reports for EXPA1, EXPA1 overexpression induces tissue stiffening that is relatively independent of extracellular pH (Fig 5, 7). They corroborate these data using AFM of different cell walls in a similar tissue (Fig 8). They also find that EXPA1 overexpression results in shorter roots (Fig 9). While BLS seems like an interesting technique for studying cell walls, essential controls are missing, making it difficult to interpret these results.

      Major Comments:

      1) Expansins have traditionally been identified with promoting cell wall extension by loosening the cell wall under acidic conditions. Recent reports have corroborated this: Ramakrishna et al., 2019 showed decreased lateral root initiation in mutants, implying EXPA1 plays a role in loosening, while Pacifici et al., 2018 showed decreased cell elongation in expa1 mutants and increased cell elongation in EXPA overexpression lines, but only when grown on low pH (pH 4) media. All of these results are consistent with EXPAs playing a role in cell wall loosening. By contrast, the authors here find that EXPA1 overexpression causes cell wall stiffening and reduced root growth, and that low pH (pH 4) media decreases this stiffening (Fig 5). Their discussion of these discrepancies is insufficient. For example, how do their levels of EXPA1 overexpression compare to Pacifici et al., 2018? How can they reconcile the results in these previous papers with their study?

      2) Since the authors only really see changes in BLS of their EXPA1 line with over 10,000x overexpression (their inducible EXPA1-mCherry line with "only" >100x expression relative to wild type does not cause significant changes to cell wall "stiffness"), it is unclear how sensitive this technique is to cell wall changes. Controls are required to interpret these BLS experiments. For example, a known mutant or overexpression line with increased cell wall stiffness and another with decreased cell wall stiffness.

      3) It will also be important to document whether the authors can replicate the lack of changes to cell wall stiffness in the expa1 mutant using AFM.

      4) It would be helpful to see a detailed correlation analysis between the new technique (BLS) and an established cell wall analysis technique (AFM) across multiple data points (i.e. positive and negative controls for cell wall stiffness changes).

      5) These AFM values are also presented on a scale that is almost 7x higher than previous data from the authors (e.g. Peaucelle 2014 JoVE). Please discuss.

      6) The authors are comparing BLS data from the inner longitudinal cell wall versus AFM data from the outer longitudinal cell wall, which have very different properties. Please discuss.

      7) EXPA1 gene overexpression is determined 7 days after Dex induction, but BLS experiments are conducted on plants that have been induced for a much shorter time (e.g. 3h). What is the expression of the EXPA1 gene over this timeframe of induction? Ideally, the authors would also use an EXPA1 antibody to monitor protein levels, since this is what is actually relevant.

      8) It is difficult to see from the BLS shift maps provided (e.g. Fig 5A) where in the root the authors are imaging. Given that this is a relatively new technique to the cell wall field, it would be helpful to provide additional images to provide context to readers.

      9) "Data not shown" (e.g. trans-zeatin treatments, line 149; EXPA1 protein levels, line 360) must be included as supplemental figures or the claims removed from the manuscript.

    1. Reviewer #3:

      In the paper titled "Auditory detection is modulated by theta phase of silent lip movements", the authors investigate visual entrainment to lip movement using behavior (exp 1) and non-invasive physiology (EEG; exp 2).

      In the first experiment participants engage in the detection of a brief tone embedded in noise. Critically, the tone appears whilst subjects are viewing a silent movie clip. Tones are timed with respect to the phase of the theta rhythm prevalent in the lip action trajectory (and its relation to the original audio track). Each trial includes 0, 1 or 2 tones and subjects provide a speeded response when the tone is detected. Tones are also presented either during the first half of the clip or the second half of the clip (or both or neither). This latter timing parameter is designed to probe the possibility of an increasing degree of entrainment to visual lip movement as the clip evolves. In the second experiment the findings demonstrated in exp 1 are met with an analysis of visual entrainment and its impact on auditory sources using EEG and source estimation on data obtained while observers viewed the same silent movie clips passively. The paper is well written, the premise is clear and the findings are interesting and timely. In what follows I outline some questions and concerns that come to mind when assessing the validity of the interpretation of the findings. These span the experimental and stimulus design as well as the analysis choices made.

      1) The behavioral procedure suggests that the tones were pseudo-randomly positioned w/ respect to the quantified theta phase of the lip movement. It would be interesting to understand whether any care was taken to exhaustively sample the different phases of the rhythm of interest in the lip movement. It might be important, therefore, to demonstrate that phases were equivalently sampled by chance in the first and second half trials and over the different clips. An inset in figure 1 would make for a good spot to demonstrate the descriptive statistics of target positioning (as a function of phase).

      2) Second and somewhat related, wouldn't it make more sense to quantify accuracy based on phase bins? This way no division to subpopulation would be required since each individual could be aligned to their best phase. The methods leave it somewhat unclear whether this was a possibility in terms of the stimulus design (i.e., were there enough phases to accomplish this in the stimulus/tone timing; see previous point).

      In addition, the subject mean phase of the correctly detected target provides little insight into the periodic nature of performance. Analyzing whether there is a periodic modulation of the pattern of responses over phase would provide richer, more nuanced evidence for the claims.

      3) It would be important and interesting to learn whether the first and second parts of the trial have the same MI profile at theta b/w lip movement and audio track. Currently, the characterization of MI was done on the whole movie clips. This is crucial for the interpretation of both Experiment 1 and Experiment 2.

      4) The distinction b/w the first and second half -- indicating that entrainment takes time to build up is somewhat overstated in the context of this paper seeing that the literature suggests that by 0.5 s entrainment is fully arrived at (among others -- the authors themselves say so in the TINs piece). Other processes such as calibration to a given speaker might take longer, and those might justify (or account for?) the result showing that early vs. late targets differ in the degree to which the phase of the lip action affects performance.

      Important details over the stimuli need to be clarified:

      5) Did every clip introduce a new speaker to the subject? Thus, does time on clip also amount to degree of familiarity with the speaker?

      6) Did each clip have the same degree of MI b/w audio and lip movement or were there better (more pronounced) lip clips than others when considering their link to the audio? Would it make sense to add these measures as covariates in the analysis?

      7) Is the same target timing used for the same clip for all subjects? Or are the tones truly randomly placed and matched onto clips such that a given clip could appear w/ tones at different times for different subjects?

      At the risk of somewhat repeating point #2 above -- within the analysis the following should be considered:

      8) The authors establish that in second-half performance there are, in fact, two subpopulations in the sample. Wouldn't this post hoc grouping factor, which isn't obviously motivated, be better described by properly delineating performance as a function of phase? I can readily understand that the authors might not have a clear hypothesis over what might be the better phase for performing on an irrelevant tone probe. Nonetheless, if a periodic process is entraining performance, then once a best phase is identified adjacent phase bins should demonstrate this circular relationship. This would allow for a direct quantification of ALL data together after aligning performance to the best phase bin, per subject.
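
      The per-subject alignment suggested in this point can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes each target has a lip-movement phase in radians and a hit/miss outcome, and the function names and the default of 8 bins are hypothetical choices.

```python
import math


def hit_rate_by_phase_bin(phases, hits, n_bins=8):
    """Hit rate in each of `n_bins` equal-width phase bins covering one cycle.

    `phases` are target phases in radians, `hits` the matching booleans.
    Bins that received no target are reported as None.
    """
    width = 2 * math.pi / n_bins
    counts = [0] * n_bins
    hit_counts = [0] * n_bins
    for phase, hit in zip(phases, hits):
        # Wrap the phase into [0, 2*pi) and guard against rounding at the edge.
        b = int((phase % (2 * math.pi)) / width) % n_bins
        counts[b] += 1
        hit_counts[b] += 1 if hit else 0
    return [h / c if c else None for h, c in zip(hit_counts, counts)]


def align_to_best_bin(rates):
    """Rotate the per-bin rates so that the subject's best bin comes first."""
    best = max(range(len(rates)), key=lambda i: -1 if rates[i] is None else rates[i])
    return rates[best:] + rates[:best]
```

      After alignment, the curves of all subjects can be averaged bin by bin. Because the best bin is a maximum by construction, it is usually left out of the statistical test, and the periodic pattern is assessed on the remaining bins (for example, whether performance falls toward the opposite phase and recovers on either side of the best bin).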

      Finally, the following points pertain mostly to the contextualization of this work and the discussion:

      9) While the authors discuss at least two mechanisms relating to how entrainment effects grow by the second part of the clip, it would be nice to relate the concrete reading of this effect to cognitive processes that may evolve within these timescales. In other words, learning that tracking takes 0.5 s or learning that visual inputs to frontal cortex take a given time scale to exert impact on auditory sensory regions is another description of the finding. What might these time scales buy me as a speaker and as a listener? What processes might be reflected by arriving at these states of synchrony and top-down control for speech comprehension?

      10) The post hoc description of the subpopulations' preferred phases is interesting and could relate interestingly to the entrainment literature (from Spaak 2014 in vision through Hickok 2015 in audition and others). Might the authors speculate on what part of speech is characterized by one phase vs. another?

      11) One additional point regarding the authors' conjecture in the discussion of this topic: there are recent papers by Assaneo et al. (Poeppel as PI, Nat Neurosci, 2019) that show bi-modal behavior in a spontaneous synchronization task (motor to auditory), which was found to be related to morphological differences in frontal-to-auditory white matter pathways, functional differences AND better learning in a statistical learning paradigm. How do the two sets of bi-modal populations interact? The authors' discussion of the motor cortex suggests they would.

      Methods section:

      The paper by and large is well written. An exception to this would be the methods section. Currently, the methods do not comply with best practices that would make the work reproducible by others.

    2. Reviewer #2:

      This study performs behavioral assessment of the impact of watching lip movements on tone detection in noise and EEG recordings from passive observers of the same movies. The basic paradigm is that listeners watch a silent movie of lip movements (selected to be at ~theta rate) while listening for tone bursts that occur most commonly twice in a trial (early and late). The key findings are that perceptual sensitivity is higher when tones are in the second half of the trial, when hits align at a particular phase angle of the visual stimuli. Brain signals were also observed to entrain through the course of the trial. The authors conclude that visual modulation of auditory excitability explains these effects.

      The stimulus design is elegant and, if taken at face value, the results are a nice demonstration that visual stimuli can modulate auditory perception in a temporally specific manner. However, I have concerns with the interpretation of the data, while also feeling to some extent that these findings are expected: stimulating AC with a speech envelope modulates speech perception (Wilsch et al., 2018), silent speech modulates human auditory cortex (Calvert 1999), and visual stimuli modulated at theta rates directly entrain auditory cortical phase in animals (Atilgan et al., 2018), as do audiovisual speech stimuli in humans (Zion-Golumbic et al., 2013). This study is a further piece of evidence along these lines, but it's hard to be certain of a causal relationship when the behaviour and neurophysiology are in different listeners. I also have some concerns about the current interpretation, some of which are addressable with additional analysis.

      I'm not convinced that the authors have sufficiently ruled out the possibility that the first tone causes a phase reset in AC that causes detected second tones to be entrained to a particular stimulus phase. In theory this should be easily addressed by looking at the 1 tone trials where the tone is in the second half of the stimulus. These data are in the supplemental material but are not particularly reassuring: while the d' is higher for the second tone, the phase angles are uniformly distributed across participants, in contrast to the clustering observed in the 2-tone data. This finding calls into question the causal link between the phase relationship and performance. The authors note that there are relatively few trials (50% of those available in the 2 tone data). The contribution of the trial count could be addressed by subsampling half the trials from the 2 tone dataset and re-estimating the phase modulation, to assess whether the single tone condition is any different. Another analysis that could be enlightening/reassuring would be to compute the phase of the hits to tone 2 relative to the onset of tone 1 using the modulation rate of the clip (or 6 Hz, if clips were selected to be that anyway).
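
      The subsampling control described above can be sketched as follows. This is a minimal illustration under stated assumptions: phase modulation is summarised here by the mean resultant vector length of the hit phases, which may differ from the statistic the authors used, and the function names, the default of 1000 resamples and the fixed seed are hypothetical.

```python
import cmath
import random


def resultant_length(phases):
    """Mean resultant vector length of phases in radians.

    Returns 0 when phases cancel out and 1 when they are all identical.
    """
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def subsampled_resultant_lengths(phases, fraction=0.5, n_resamples=1000, seed=0):
    """Resultant lengths obtained from random subsets of the trials.

    Each resample draws `fraction` of the trials without replacement, which
    mimics the smaller trial count of the single-tone condition.
    """
    rng = random.Random(seed)
    k = max(1, int(len(phases) * fraction))
    return [resultant_length(rng.sample(phases, k)) for _ in range(n_resamples)]
```

      The resultant length observed in the single-tone condition can then be placed within the distribution returned by `subsampled_resultant_lengths` for the 2 tone data. If it falls well below that distribution, the weaker clustering is unlikely to be explained by trial count alone.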

      I would like to see the distribution of the tones w.r.t. the phase of the lip movement (all tones, not just hits) to be reassured that there is nothing inherent in the movies that causes the phase alignment.

      The neurophysiology does not demonstrate a significant increase in entrainment from early to late windows, only that there is a different phase angle. Doesn't this also call into question the conclusion that performance is better in the second half due to better entrainment? While the phase in the second might be 'more efficient' if the entrainment is equivalent shouldn't there be a behavioural relationship in both cases? This is where performing both behaviour and EEG simultaneously (or at least in the same listeners) may have proved enlightening.

    3. Reviewer #1:

      In this manuscript, the authors report on two separate experiments designed to understand the relationship between lip-movement induced theta phase and auditory processing. In the first experiment, subjects detected tones embedded in noise while viewing silent videos. The results demonstrate that tone detection performance improved when tones are presented later relative to earlier in a trial. It was also demonstrated that correct detection, for tones that occurred later in the trial, was systematically linked with the phase of the theta oscillatory activity conveyed by the lip movements. In the second experiment EEG was recorded while participants viewed the silent videos and performed an emotion judgement task. Theta phase coupling was demonstrated between auditory and visual areas such that oscillations in the visual cortex preceded those in the auditory cortex.

      The authors conclude that these results demonstrate that lip movements directly affect the excitability of the auditory cortex. However, due to the indirect nature of the reported effects, I do not believe this conclusion is justified. I elaborate on this concern below:

      1) In experiment 1, the main finding that performance is better later in the trial could arise from many factors including non-specific attentional effects.

      2) The analysis reported in the bottom of page 5 (comparing vector lengths for hits vs misses) is critical to the argument but the results are inconclusive (significant interaction, but subsequent comparisons not quite significant. Likely because the experiment is underpowered?).

      3) In Experiment 2: the task performed by the listeners might have biased them towards speech imagery leading to the pattern of effects observed. Indeed, the observed involvement of the left hemisphere may be consistent with the involvement of speech imagery. This would render the observed link between visual and auditory cortices as somewhat trivial and not new (such links have been reported in many previous studies as acknowledged by the authors).

      4) Most importantly, the authors do not provide any direct evidence that the auditory effects observed in Experiment 2 are related to those observed in experiment 1.

      Other comments:

      1) For the analyses in Figure 2A, were the number of trials over which the analysis is conducted adjusted for "first tone" vs "second tone"? Since the hit rate is higher for the second tone, there may be a concern that including more trials in the analysis would result in better SNR and hence a more robust effect.

      2) In Experiment 2 the analysis is focused on phase effects. Can you report whether there are any power differences in the delta band in the "early" vs "later" time windows?

      3) Line 176, the authors write "these results established that entrainment of theta lip activity increased in time". It is not clear to me which aspect of the results supports this statement.

      4) Line 405: "any lag between visual and auditory stimuli onsets was later compensated...". I could not find mention of this elsewhere (i.e. how lags were compensated, how large they were). This is critical for interpreting the results and therefore should be described in detail.

      5) Line 430-437 why did you choose to quantify the envelope in this way rather than just taking the wide band envelope?

      6) Figure S3 is important and should be in the main text.

      7) Line 473 "auditory pure tones"

      8) The description in lines 478-481 doesn't make sense. It is unclear how the loudness reported in line 480 (91 dB SPL; incidentally, this is very loud) relates to the later reported value of 72 dB SPL.

      9) Line 485 "embedded"

      10) Please clarify whether, in your loudness adjustment procedure, you were adjusting the loudness of the tone, the noise, or the SNR (and thus keeping the overall loudness of the stimulus fixed).

      11) Line 537 "preceding"

    1. Reviewer #3:

      In this manuscript, the authors test the long-standing and long overdue "evolution-on-demand" hypothesis of integrons. Using a combination of genetic construction work, experimental evolution, and WGS, the authors present a convincing body of work favoring the presented hypothesis. The paper is clear and well written, and the authors should be given credit for including experimental data from an integron-containing clinical plasmid that carries resistance cassettes to the last-resort antibiotics, carbapenems. This is largely missing in the field.

      My overall assessment of the manuscript is very positive. The "evolutionary ramp" approach is an elegant way to test the "evolution on demand" hypothesis, and the authors provide compelling evidence favoring the evolutionary effects of an active class 1 integrase. However, reading through the manuscript, I have three major questions/comments regarding the mechanistic aspects and conclusions of the paper. Regarding the last two points, I believe a discussion that includes other possible explanations (such as experimental conditions) would add balance to the Conclusion chapter and improve the manuscript.

      Major Comments:

      1) Based on WGS, the authors characterize evolved populations and claim to demonstrate extensive integrase-driven rearrangements in combination with chromosomal mutations underpinning the adaptations towards both constant sub-MIC and 2-fold increments of gentamicin concentrations.

      My first concern regards the crucial control in Figure S2, where control PCRs confirm data from Illumina short-read sequencing on whole populations. It is hard for me to follow and understand this figure. I suggest preparing a schematic figure of each combination of cassettes, primer positions, and expected band lengths, combined with proper lane descriptions.

      2) Surprisingly, and in contrast to integron structures from environmental and clinical samples, the authors provide evidence for a strong predominance of "copy and paste" as opposed to the emblematic "cut and paste" insertions of the gentamicin resistance cassette during experimental evolution. They argue that their data suggest that intI1 has a bias towards "copy and paste" cassette rearrangements.

      First, I find the term "copy and paste" somewhat confusing. I cannot see that the underlying mechanism of cassette excision differs between the two outcomes in integron structure. The cassette is in both cases excised (cut) from the ancestral integron before it is inserted (paste) into either array. I may have missed something here, but why "copy", and how is this novel?

      Second, I am not convinced that the presented evidence provides sufficient support for the proposed "copy and paste" bias of IntI1. As the authors discuss thoroughly, the presence of multiple copies of the ancestral structure provides more "ancestral" integration targets for the excised cassettes. The authors exclude the alternative hypothesis that a second copy of aadB increased fitness as compared to a single copy (as expected from copy and paste). Fitness effects of different arrays are discussed solely on the basis of retrospective analyses of populations that did not go extinct. I would have been more convinced if this was backed by some measure of fitness, for example MIC values of integron arrays containing two aadB cassettes. From Fig 1C it is not unlikely that it could be increased.

      3) The authors highlight in the abstract and in the Conclusion section that they found no evidence of deleterious off-target integrase effects. They suggest that integrase activity instead purges deleterious chromosomal mutations and enables more targeted beneficial adaptive responses.

      The authors present cases where likely beneficial off-target recombination events occurred. To what extent do the authors think the absence of deleterious off-target effects is due to the experimental conditions (continuous increments in gentamicin concentrations combined with strong bottlenecks)?

    2. Reviewer #2:

      This manuscript addresses the evolutionary benefits of integrase activity using experimental evolution of integrons in the presence of antibiotics. The authors demonstrate that activity increases survival of populations at high gentamicin concentrations, by shuffling a gentamicin resistance cassette towards the start of the integron.

      The paper is very well written and interesting, and demonstrates neatly the benefits of integron shuffling. I am suggesting a few additional assays in order to measure the phenotypic effects of evolved integrons. However, if these are not possible to perform, the main conclusions could be slightly altered instead to focus more on the genomics.

      Major Comments:

      1) The paper would benefit from MIC assays (or any other resistance measure) using evolved clones, to properly demonstrate and quantify the evolution of increased resistance associated with the different integron arrays. For now, the only phenotypic data measured from the evolution experiment is survival of populations during the experiment itself. I was first going to say that this is a minor comment, as the genetic / genomic data is very interesting and solid on its own, but the paper is still framed around evolution of increased antibiotic resistance, which is not directly quantified. Survival of populations might be influenced by other factors, including the chromosomal mutations described in the manuscript but also non-genetic effects, for instance population density effects, with populations that grow slightly more at a given time point then having a higher inoculum for the next step.

      2) MIC assays could even be done with no need for further sequencing, using clones from the populations in which integrons are not polymorphic (Fig 3B). Comparing resistance levels for the aadB-blaVEB-1-dfrA5-aadB array, and the aadB-aadB and aadB arrays with the ancestral array would allow the authors to link genotype and phenotype, and to demonstrate more directly the selective advantages (or absence of, for some of the arrays) that they suggest. Effects of plasmid evolution could also be separated relatively easily from chromosomal mutations that contribute to gentamicin resistance by transferring evolved plasmids to an unevolved host.

      3) I don't actually think anything else is happening than the evolution of increased resistance via shuffling that the authors are suggesting - and they are very careful in stating clearly that increased resistance is only 'suggested' whenever they discuss the genomic results directly. But I am still a bit uneasy about drawing conclusions of increased antibiotic resistance (in the title, end of introduction, and conclusion) when the only phenotypic data is survival at the population level. Alternatively, this text could be reformulated to focus clearly on the genetics and not on phenotypic resistance.

    3. Reviewer #1:

      The manuscript 'Integron activity accelerates the evolution of antibiotic resistance' by Souque et al. investigates the genetic variations created by a class 1 integron during antibiotic exposure. In the study, the authors examine the evolution of an integron encoded on an R388 plasmid; they introduce three antibiotic gene cassettes into the integron and follow its evolution in the presence of one corresponding antibiotic, here gentamicin. They find that antibiotic exposure leads to a rapid re-shuffling of the integron cassette. The re-shuffling favors the aadB gene in the first position downstream of the integron promoter while mainly keeping the (original) last position in the integron. The study represents an interesting example of rapid adaptation to increasing concentrations of an antibiotic that is facilitated by mobile elements. While the experiments are overall interesting and very well designed, the study lacks a certain depth, in the sense that the results might equally well be explained by random mutations (genetic diversity). In addition, the two parts of the experiments (integron analysis & chromosomal evolution) need to be connected, as it is so far unclear what role the chromosomal mutations have in the integron-facilitated evolution.

      Major Comments:

      1) The authors don't mention whether they detected re-arrangements in the negative control that was evolved without antibiotics. Furthermore, re-arrangements might appear but at a very low frequency. What is the sequence coverage used in the study? How can the authors ensure they don't miss a low frequency of re-arrangements? It might be possible that random re-arrangements appear at a very low frequency that are only fixed under changing conditions (similar to mutations). The authors should clarify this point.

      2) Did the authors measure the integrase expression levels? This could ensure that there is no expression without stress to the cell.

      3) Regarding the mutational analysis: Is there any sign of a cost to the integrase activity? The authors conduct an intensive analysis on chromosomal and plasmid mutations. Nonetheless, it is unclear how these mutations are generally connected to the integrase activity (and not only to the AB treatment).

      4) The authors call the integrase activity 'adaptation on demand'. It would be interesting to know how fast a potential reversal would appear in the integron in the populations. Is there any evidence for a deletion of the duplication of the aadB gene after removal of the antibiotic? In the same line of thought, do the authors expect the other AB resistance genes to follow the same path when incubated in the corresponding antibiotic? It would be interesting to know how antibiotic 'type' dependent the experimental result might be.

    1. Reviewer #2:

      In this work, the authors analyze fungal and bacterial communities in 49 host species and find evidence of phylosymbiosis, a correlation between these microbiomes and host that suggests host recruitment of specific microbial communities. They further carry out a network analysis that suggests co-occurrence of fungal and bacterial communities across hosts. While host recruitment has been shown previously for bacteria, the authors here include a broad survey of mycobiomes and based on their analysis conclude that fungal communities are also critical to interactions and host health.

      This descriptive study provides important insight regarding the general characteristics of the mycobiome and its relationship to the bacterial communities and the host. Although the work is consistent with these fungal communities being important for host function and health, it does not provide direct information on these communities, their interactions, or possible effects on the host.

      The overall presentation of the results is geared towards a focused readership.

      The authors could be more explicit regarding the value behind the modularity of networks for a given host (in mammals) and what exactly is the significance of this finding in the broad context of microbiomes.

      Some groups of samples are obtained from very varied sources (amphibia) but others are not. Beyond sample type being important, what other effects could these sampling differences have on the final conclusions, for example in their network analysis?

      What is the significance of having some species with more negative interactions? Are there any ideas how a negative interaction can be sustained over time?