  1. Oct 2020
    1. Reviewer #3:

      Wang et al. describe a tissue-specific knockout system to target neutrophil-specific genes. Tissue-specific knockout systems are an important tool for studying gene function in specific tissues. To the best of my knowledge, there are only four major publications describing tissue-specific knockouts in zebrafish, two of which are not acknowledged by the authors.

      In this manuscript, the authors used a neutrophil-specific promoter to drive the expression of Cas9 and a ubiquitous promoter for sgRNA expression, or vice versa. The authors have previously published a similar paper describing a transgenic construct (Tg(lyzC:nls-cas9-2A-mCherry/U6a:polg sgRNA)) expressing Cas9 as well as sgRNAs from a single construct. The authors claim that the knockout efficiency drops significantly when the knockout line is crossed with other lines that use the neutrophil-specific promoter, possibly because another construct driven by the same promoter competes for the transcription factors needed for Cas9 expression and reduces Cas9 protein to a level that is insufficient for efficient knockout.

      In this manuscript, the authors created an sgRNA-resistant rescue construct, incorporated biosensors into the knockout line for live imaging in the context of the cell-specific knockout, and studied the function of Rac2 and Cdk2.

      This manuscript does not offer any further advances other than showing the tissue-specific rescue and the subcellular localization of Rac activation in wild-type and rac2-knockout neutrophils.

      There is no evidence that this strategy is better than the previously published method, and the quantification of knockout efficiency is absent.

    2. Reviewer #2:

      In the manuscript "A CRISPR/Cas9 vector system for neutrophil-specific gene disruption in zebrafish" by Wang et al, the authors describe methods for targeted inactivation of genes in a cell-type specific fashion, in this case in neutrophils in zebrafish embryos, and use this tool to examine the role of rac2 in neutrophil motility. The overall goal of broadening the ability to target tissue-specific gene inactivation is laudable and an ongoing need in the zebrafish toolbox, as is the goal of developing an increased understanding of motility regulation in neutrophils, as evaluated here in a series of quite stunning motion-tracking videos. Unfortunately, the current manuscript does not appear to advance the technology, nor evaluate it in sufficient depth, nor reveal sufficient new biology in regards to neutrophils/rac2.

      Major Points:

      1) With the title "A CRISPR/Cas9 vector system for neutrophil-specific gene disruption in zebrafish", the manuscript seems to be targeting a "technology" aim. As the authors cite, they have already published a neutrophil-specific CRISPR/Cas9-based knockout tool in their DMM, 2018 manuscript. The addition of the crystallin reporter in the current manuscript is a convenient method for tracking the cas9 portion of the transgene, but this is a modest alteration to the existing technology.

      2) While billed as a neutrophil-specific gene-disruption technology, the authors do not show genome sequence of a mutated/disrupted rac2 gene. They have previously done this in the DMM 2018 paper, so should be feasible. Disruption of neutrophil motility is being used as a proxy read-out for rac2 disruption, but it seems that, as currently billed, the study should show neutrophil-specific disruption of rac2. The neutrophil-specific rescue experiments are very nice, but fail to show that the targeted gene disruption is limited to neutrophils, only that the gene disruption includes neutrophils. This could be of concern in a stable transgene context as well since transgenes can exhibit ectopic gene expression (i.e. not limited to neutrophils), and this cannot be tracked with the un-tagged CAS9 in the construct.

      3) At the outset, it is expected that disruption of rac2 would lead to neutrophil motility disruption and changes in F-actin dynamics using this tool as previously described in Deng et al, "Dual roles for Rac2 in neutrophil motility and active retention in zebrafish hematopoietic tissue", Dev Cell, 2011. As a proof of concept for the ability of targeting a gene in neutrophils, this makes sense to evaluate a well-studied pathway, but it is not clear if this expands on the understanding of rac2/control of actin dynamics and neutrophil motility, or if the newly described targeting vectors allow for an analysis that was not previously possible.

      4) The ribozyme approach described in Figure 6 seems perhaps most novel as an approach to target tissue-specific inactivation of a gene, but to truly nail down the technology, this would seem to require again some analysis of (a) the specific genomic lesions induced by the combination of ubiquitous CAS9 and tissue-specific gRNA and (b) some assessment of the specificity to neutrophils (i.e. are these mutations generated in other cell types?).

    3. Reviewer #1:

      Summary:

      Wang et al. use two transgenic lines to knock out the rac2 gene specifically in neutrophils. While CRISPR-Cas9 is technically well established, tissue-specific knockouts in zebrafish are missing in the field. Therefore, the manuscript of Wang et al. is highly timely and would help advance the field further; however, the manuscript and figures would greatly benefit from thorough editing and rewriting as outlined below.

      Major comments:

      Wang et al. base all their conclusions on observations of the targeted cells, and do not show any sequenced alleles of the neutrophil cells to verify that indels occurred. To go forward with the results, including sequences of the targeted alleles is crucial. Therefore, the manuscript would greatly benefit from including these basic allele confirmations, before drawing scientific conclusions about the efficacy of the system.

      1) Line 100 onwards. "To test the efficiency of the gene knockout using this system, we injected the F2 embryos of the Tg(lyzC:cas9, cry:GFP) pu26 line with the plasmids carrying rac2 sgRNAs or ctrl sgRNAs for transient gene inactivation. The sequences of the sgRNAs are described in Fig. 1C, D. A longer sequence with no predicted binding sites in the zebrafish genome was used as a control sgRNA (Fig. 1D). As expected, we observed significantly decreased neutrophil motility in larvae of Tg(lyzC:cas9, cry:GFP) pu26 fish transiently expressing sgRNAs targeting rac2 (Fig. 1E, F and Movie S1), indicating that sufficient disruption of the rac2 gene had been achieved."

      Please include sequenced alleles from rac2 in neutrophil cells. "Significantly decreased neutrophil motility" is not an indicator that rac2 in neutrophil cells is mutated. Only sequenced alleles are.

      2) Line 107 onwards. "To test the knockout efficiency in stable lines, we generated transgenic lines of Tg(U6a/c: ctrl sgRNAs, lyzC:GFP) pu27 or Tg(U6a/c: rac2 sgRNAs, lyzC:GFP) pu28, crossed the F1 fish with Tg(lyzC:cas9, cry:GFP) pu26 and quantified the velocity of neutrophils in the head mesenchyme of embryos at 3 dpf. A significant decrease of motility was observed in the neutrophils expressing Cas9 protein and rac2 sgRNAs (Fig. 1G, H and Movie S2)."

      Also here, "a significant decrease of motility" doesn't mean the rac2 gene in neutrophils is mutated. See point 1.

      In summary, the authors are advised to include this basic but necessary and very important information in their manuscript instead of drawing conclusions from their observations alone. Otherwise, it remains unclear whether everything Wang et al. observe is really due to indels in the rac2 gene and not to some other side effect of the system.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      As you can see from the reviewer comments attached below, all reviewers appreciated the approach you took for neutrophil-specific gene disruption, as such tissue-specific tools remain greatly missing in the field. Nonetheless, the reviewers all agreed that your phenotype description is insufficient to warrant the claims of the study. In particular, the lack of sequence verification of the claimed Cas9-induced mutagenesis has been picked up by all reviewers. We hope the reviewer comments are instrumental for refining your work.

    1. Reviewer #3:

      In the current study, baseline samples of salivary and plasma oxytocin were assessed in 13 and 16 participants, respectively, to assess intra-individual reliability across four time points (separated by approximately 8 days). The main results indicate that, while average salivary and plasma samples at the group level were not significantly different across time points, the within-subject coefficient of variation (CV) and intra-class correlation coefficient (ICC) showed poor absolute and relative reliability of plasma and salivary oxytocin measurements over time. Also, no association was established between plasma and salivary levels, either at baseline or after administration of oxytocin (either intranasally or intravenously). Further, salivary and plasma oxytocin were enhanced only after intranasal and intravenous administration, respectively.
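      To make the two reliability measures named above concrete, the following is a minimal sketch of a within-subject CV and a one-way random-effects ICC computed from a subjects x sessions table. The manuscript's exact ICC model is not stated in this review, so the one-way ICC(1,1) form is an assumption, and the function names and the numbers are purely illustrative.

```python
from statistics import mean, stdev

def within_subject_cv(rows):
    """Mean of each subject's CV (SD / mean across sessions), in percent."""
    return 100 * mean(stdev(r) / mean(r) for r in rows)

def icc_oneway(rows):
    """One-way random-effects ICC(1,1); rows = subjects, columns = sessions."""
    n, k = len(rows), len(rows[0])
    grand = mean(v for r in rows for v in r)
    subj_means = [mean(r) for r in rows]
    # Between-subject and within-subject mean squares from a one-way ANOVA
    ms_between = k * sum((m - grand) ** 2 for m in subj_means) / (n - 1)
    ms_within = sum((v - m) ** 2
                    for r, m in zip(rows, subj_means)
                    for v in r) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical values (not data from the study): 3 subjects x 4 sessions
stable = [[10.0, 10.0, 10.0, 10.0],
          [20.0, 20.0, 20.0, 20.0],
          [30.0, 30.0, 30.0, 30.0]]
noisy = [[10.0, 30.0, 20.0, 40.0],
         [25.0, 15.0, 35.0, 20.0],
         [30.0, 10.0, 25.0, 35.0]]

print(icc_oneway(stable), within_subject_cv(stable))
print(icc_oneway(noisy), within_subject_cv(noisy))
```

      In this sketch, subjects who differ from each other but are stable across sessions give an ICC of 1 and a CV of 0, whereas large session-to-session swings within each subject push the ICC towards or below zero.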

      The study addresses an important topic and the paper is clearly written. While the overall multi-session design seems solid, sample collections were performed in the context of larger projects and therefore there appear to be several limitations that reduce the robustness of the presented results and consequently the formulated conclusions.

      General comments

      1) A main conclusion of the current work is that 'single measures of baseline oxytocin concentrations in saliva and plasma are not stable within the same individual'. It seems, however, that the study did not adhere to a sufficiently rigorous approach to put forward this conclusion. It lacks a control for several important factors, such as the time of day at which saliva/plasma samples were obtained, as well as sample volume. In particular, while it is indicated that all visits were identical in structure, important information is missing with regard to whether sampling took place consistently at a particular time each day, to minimize the influence of circadian rhythm. Without this information it is not possible to draw any firm conclusions on the nature of the intra-individual variability demonstrated in the salivary and plasma sampling. Correspondingly, a deeper discussion is needed on the reason why ICCs were considerably variable across pairs of assessment sessions, with some pairs yielding good reliability, whereas others yielded (very) poor reliability. More detailed descriptions of sampling procedures (timing and sampling intervals) are necessary. Also, more information is needed on the volume of saliva collected at each session, to control for possible dilution effects.

      2) It is indicated that the initial sample would allow detection of intra-class correlation coefficients (ICC) of at least 0.70 (moderate reliability) with 80% power. Is this still the case after the drop-outs/outlier removals? Since the main conclusions of the work rely on negative results (conclusions drawn from failures to reject the null hypothesis), it is important to establish the risk of false negatives within a design that is possibly underpowered.

      3) Did the authors also assess within-session reliability? For example, by assessing ICC between pre and post-measurements in the placebo session.

      4) It is indicated that the intra-assay variability of the adopted radioimmunoassay constitutes <10%. Were analyses of the current study run on duplicate samples? Was intra-assay variability assessed directly within the current sample?

      Introduction & Discussion

      5) The introduction and discussion are missing a thorough overview of previous studies assessing intra-individual variability in oxytocin levels.

      6) The paper misses a discussion of previous studies addressing links between salivary/ plasma levels and central oxytocin (e.g. in cerebrospinal fluid). I understand the claim that salivary oxytocin cannot be used to form an estimate of systemic absorption, although technically, a lack of a link between salivary and plasma levels, does not necessarily imply a lack of a relationship to e.g. central levels. The lack of effect is limited to this specific relationship.

      Methods

      7) Related to the general comment, the variability in days between sessions is relatively high (average 8.80 days apart; SD 5.72; range 3-28). However, it appears that no explicit measures were taken to control the conducted analyses for this variability.

      8) A rationale for the adopted dosing and timing (115 min post administration) of the sample extraction is missing. Additionally, it seems that intravenous administrations were always given second, whereas intranasal administrations were given third, with a small delay of approximately 5 min. Hence, it seems that the timing of 115 min post-administration is only accurate for the intranasal administration.

      9) Since the ICC of baseline samples showed poor reliability, it seems suboptimal to pool across sessions for assessing the relationship between salivary and blood measurements. It should be possible to perform e.g. partial correlations on the actual scores, thereby correcting for the repeated measure (subject ID). Further, since the sample size is relatively small (13 subjects), it might be advisable to use non-parametric correlations (e.g. Spearman) instead of Pearson. The additional reporting of the Bayes factor is appreciated; it is very informative.

      10) Now, the authors only compared relationships between salivary and plasma levels, either at baseline or post administration. I'm wondering whether it would be interesting to explore relationships between pre-to-post change scores in salivary versus plasma measures.

      11) Please provide more information on the outlier detection procedure (outlier labelling rule).

      12) Please indicate how deviations from a Gaussian distribution were assessed.
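      As an illustration of the non-parametric alternative suggested in point 9, the following is a minimal sketch of a Spearman rank correlation, computed as a Pearson correlation on average ranks. The function names and the values are hypothetical and are not taken from the manuscript; the sketch does not address the repeated-measures structure, which would need a separate correction for subject ID.

```python
from statistics import mean

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical saliva/plasma values with one extreme observation
saliva = [1.0, 2.0, 3.0, 4.0, 5.0]
plasma = [1.0, 4.0, 9.0, 16.0, 1000.0]

print(spearman(saliva, plasma))
print(pearson(saliva, plasma))
```

      Because only ranks enter the calculation, the single extreme value does not affect Spearman's rho here, whereas it pulls the Pearson coefficient below 1; this is the property that makes the rank-based measure attractive for small samples.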

      Results

      13) Please verify the degrees of freedom for the post-hoc tests performed to assess pre-post changes at each treatment level (e.g. baseline vs post administration: Spray - t(122) = 7.06, p < 0.001). Why is this 122? Shouldn't this be a simple paired-sample t-test with 13 subjects?
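      For reference, a paired-sample t-test on n subjects has n - 1 degrees of freedom, so 13 subjects would give 12. The following minimal sketch shows this with made-up pre/post values (not data from the study); `paired_t` is an illustrative helper name.

```python
from statistics import mean, stdev

def paired_t(pre, post):
    """Paired-sample t statistic and its degrees of freedom (n - 1)."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / n ** 0.5)
    return t, n - 1

# Hypothetical pre/post measurements for 13 subjects
pre = [5.1, 4.8, 6.0, 5.5, 4.9, 5.2, 6.1, 5.0, 4.7, 5.8, 5.3, 4.6, 5.9]
post = [7.9, 7.1, 9.4, 8.0, 7.7, 8.8, 9.0, 7.2, 7.5, 9.1, 8.3, 6.9, 9.6]

t_stat, df = paired_t(pre, post)
print(f"t({df}) = {t_stat:.2f}")
```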

    2. Reviewer #2:

      Summary:

      To test whether salivary and plasma oxytocin at baseline reflect the physiology of the oxytocin system, and whether salivary oxytocin indexes plasma levels, the authors quantified baseline plasma and/or salivary oxytocin using radioimmunoassay in two independent datasets. Dataset A comprised 17 healthy men sampled on four occasions at approximately weekly intervals. In dataset A, oxytocin was administered intravenously and intranasally in a triple-dummy, within-subject, placebo-controlled design, and baseline levels and the effects of the routes of administration were compared. Dataset A was also used to test whether salivary oxytocin can predict plasma oxytocin at baseline and after intranasal and intravenous administration of oxytocin. Dataset B comprised baseline plasma oxytocin levels collected from 20 healthy men sampled on two separate occasions. In both datasets, single measurements of plasma and salivary oxytocin showed insufficient reliability across visits (intra-class correlation coefficient: 0.23-0.80; mean CV: 31-63%). Salivary oxytocin increased after intranasal administration of oxytocin (40 IU) but did not change significantly after intravenous administration (10 IU). Saliva and plasma oxytocin did not correlate at baseline or after administration of exogenous oxytocin (p>0.18). The authors suggest that the use of single measurements of baseline oxytocin concentrations in saliva and plasma as valid biomarkers of the physiology of the oxytocin system is questionable in men. Furthermore, they suggest that saliva oxytocin is a weak surrogate for plasma oxytocin and that the increases in saliva oxytocin observed after intranasal oxytocin most likely reflect unabsorbed peptide and should not be used to predict treatment effects.

      General comments:

      The current study tested research questions relevant to the field. The analysis of two independent datasets with different routes of oxytocin administration is a strength of the current study. However, the novelty of the findings is limited and several limitations are noted in the current report, as described below.

      Specific and major comments:

      1) A previous study with similar results has already shown that saliva oxytocin is a weak surrogate for plasma oxytocin, and that increases in salivary oxytocin after intranasal administration of exogenous oxytocin most likely represent drip-down transport from the nasal to the oral cavity and not systemic absorption (Quintana 2018 in Ref 13). Therefore, the novelty of the current findings is limited. The authors should state more clearly which of the current results are novel and which replicate previous findings.

      2) As the authors discuss in the limitations section of the discussion, the current study has several limitations, such as analyses only in male participants and non-optimized timing of saliva and blood collection due to the other experiments. These limitations are understandable, because the current study is a secondary analysis of data from other studies with different aims. However, these limitations significantly limit the interpretation of the findings.

      3) As reported on page 6, dataset A comprises administration of approximately 40 IU of intranasal oxytocin and 10 IU of intravenous oxytocin. The rationale for these doses should be described. Since 40 IU differs from the 24 IU employed in most previous publications in the field, the potential influence of dose should be tested and discussed.

      4) It is difficult to understand why no significant elevations in plasma oxytocin levels were observed after administration of oxytocin by intranasal spray or nebuliser. In figure 4A, the differences between levels at baseline and post administration are similar for nebuliser, spray, and placebo. Please discuss the potential interpretation of this result.

      5) On page 12, the reason for not applying any correction for multiple comparisons in the statistical analyses should be clarified.

    3. Reviewer #1:

      This article describes the investigation of a valuable research question, given the interest in using salivary oxytocin measures as a proxy of oxytocin system activity. A strength of the study is the use of two independent datasets and the comparison between intranasal and intravenous administration. The authors report poor reliability for measuring salivary oxytocin across visits, that intravenous delivery does not increase concentrations, and that salivary and blood plasma concentrations are not correlated.

      Line 77-78: While it's true that saliva collection provides logistical advantages, there are also measurement advantages (e.g., relatively clean matrix) that are summarised in the MacLean et al (2019) study, which has already been cited.

      Line 86: It is important to note that the 1IU intravenous dose in this study led to equivalent concentrations in blood compared to intranasal administration.

      Line 158: When using both ELISA and HPLC-MS, extracted and unextracted samples are correlated when measuring oxytocin concentrations in saliva, at least in dogs. (https://doi.org/10.1016/j.jneumeth.2017.08.033).

      Statistical reporting: I ran the article through the statcheck R package (a web version is also available) and found a number of inconsistencies between the reported statistics and their p values. For example, on Line 302 the authors reported: t(123) = 1.54, p = 0.41, but this should yield a p value of 0.13. The authors should do the same and fix these errors.
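      The kind of consistency check described above can be reproduced by recomputing the two-sided p value from the reported t statistic and degrees of freedom. The following is a minimal standard-library sketch that integrates the Student t density numerically with Simpson's rule; it is not the statcheck implementation, and the function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p value: 1 minus the probability mass on [-|t|, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_mass = total * h / 3  # integral of the density over [0, |t|]
    return 1 - 2 * half_mass

# Values quoted in the review: t(123) = 1.54
p = two_sided_p(1.54, 123)
print(round(p, 2))  # 0.13
```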

      Line 305: The confidence intervals for these correlations should be reported.

      Line 348: This is an important point, but it's important to note that the vast majority of these studies use plasma or saliva measures. Perhaps CSF measures are more reliable, but the question wasn't assessed in the present study, and I'm not sure if anyone has looked at this question.

      Line 423: I broadly agree with this conclusion, but it should be added that "single measurements of baseline levels of endogenous oxytocin in saliva and plasma are not stable under typical laboratory conditions" Perhaps these measures can be more stable using other means (i.e., better standardising collection conditions). But the fact remains, under typical conditions these measures do not demonstrate reliability.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The strengths of the study are the findings that a single oxytocin level measured from saliva or plasma is not meaningful in the way that the field might currently be measuring. The reviewers appreciated this finding, and the careful attention to detail, but felt that the results fell short.

    1. Reviewer #3:

      This work by Sacchetti et al. describes how phenotypic plasticity contributes to local invasion and metastasis formation in colon cancer cells. Based on human classical colon carcinoma cell lines and cell sorting they identified a subpopulation of colon cancer cells that are CD44hi/EpCAMlo cells which have enhanced phenotypic plasticity that underlies enhanced invasion and metastatic behavior. In these EpCAMlo cells elevated ZEB1 expression has been identified. Increased WNT signaling results in elevated expression of the EMT associated transcription factor ZEB1. The EpCAMlo expression status is linked with the CMS4 subgroup of human malignant colon cancer. Overall this is an interesting and well written paper for which I offer a few supportive questions/remarks.

      Major comments:

      1) Page 6: The miR-200 family of miRNAs targets the mRNA of the transcription factors ZEB1 and ZEB2 in epithelial cells, but this is not transcriptional regulation.

      2) A clear association of EpCAMlo cells and elevated ZEB1 expression is identified. Conditional knockdown of ZEB1 results in a strongly decreased number of EpCAMlo cells. For now, it is not clear if ZEB1 KD results in the death of these EpCAMlo cells or that the mesenchymal gene signature is controlled by ZEB1. The functional contribution of ZEB1 as an EMT inducer should be experimentally proven as for now the role of ZEB1 is not clear.

      3) The importance of EMT in relation to resistance to chemotherapeutic drugs and metastasis is not yet well established in the manuscript. Conditional KD of ZEB1 in metastasis and therapy resistance assays should be added; otherwise the title and the claims made in the abstract should be toned down.

      4) The use of AKP organoids brings further relevance to this research manuscript. Are these EpCAMlo cells also present in the AKP organoids and what is the endogenous expression status of ZEB1 in the AKP organoids?

      5) Why have the authors maintained the conditional expression of ZEB1 induced in the AKP-Z organoid transplantation experiments? This is driving the epithelial cells in a locked mesenchymal state - which is not compatible with the earlier observed plasticity with the EpCAMlo cells in SW480 and HCT116 cells. Also mesenchymal to epithelial transition is generally believed to be essential for metastasis formation. The experimental outcome of these experiments is not relevant and the authors should consider temporal ZEB1 expression control in transplanted AKP-Z organoids.

      6) The data depicted in Fig 10A & B are confusing and deserve a better explanation. How is it possible that EpCAMlo and EpCAMhi sorted cells show overlapping single-cell expression profiles upon t-SNE plotting, in particular for the SW480 cells? This is contradictory, as the authors claim earlier in the manuscript that EpCAMlo cells have a more mesenchymal gene expression profile, which is then confirmed with the 'EMT signature' analysis. Is there a difference between EpCAM protein expression and EpCAM mRNA expression?

      7) Which cell line does the heatmap of the EMT signature shown in figure 10B represent?

      Overall, the authors link the gene expression signature of EpCAMlo cells with the colon cancer consensus molecular subtype CMS4, which has the worst relapse-free and overall survival (Dienstmann R et al. 2017; Nat Rev Cancer 17, 79-92). There are multiple lines of evidence that the mesenchymal signature in CMS4 colon cancers is due to profound infiltration of stromal cells (CAFs, immune cells), extracellular matrix remodeling, and TGF-beta pathway activation, and is not the consequence of EMT in cancer cells (e.g. Calon et al. 2015; DOI: 10.1038/ng.3225). It is of course possible that a few epithelial cells in this inflammatory context are undergoing a partial EMT, but there is little evidence for this and it will likely happen in a minority of cells. Together, the authors should revise their manuscript regarding (partial) EMT and CMS4 and put their findings in a more critical context.

    2. Reviewer #2:

      The manuscript by Fodde et al investigates the presence of a population of colorectal cancer cells within commonly used human cell lines that have a propensity to form metastasis to the liver and lung. These cells are marked as being CD44HiEpCamlo and have increased expression of the EMT marker Zeb1. They show that this population of EpCam-low cells is able to drive metastatic colonisation and that this is likely due to levels of Zeb1. These cells have a signature similar to the CMS4 group of colorectal cancers, which are highly invasive.

      The manuscript is generally well written and presented in a stepwise and straightforward manner so is relatively easy to follow.

      There is a lot of data presented in this paper, with 10 primary figures and a number of supplementary figures. I would encourage the authors to look at which data need presenting and ask whether some of the earlier figures in particular could be combined and the paper streamlined. By the time you get to the really interesting data in the organoid transplantation and scRNA-seq, there has been a lot to get through already.

      There are some questions I have about the experimental data and presentation:

      1) Whilst the authors investigate the expression of EpCam and CD44 in cell lines, is there any evidence of this EpCam-low population in primary human tumours? or primary tumours in the mouse? I appreciate that finding these cells in human could be rate limiting, but what about in tumours that are generated in mice and are metastatic - specifically I am thinking about the recent work in colon showing that Notch signalling drives colonic to liver metastasis (Jackstadt et al 2019) - do the Notch active cells in this model have lower EpCam levels?

      2) For the FACS plots could the authors include their complete gating and FMO control gating strategy in the supplementary. It would be helpful to be able to confirm that the shifts the authors are describing are real.

      3) In figure 2, can the authors quantify the protein expression of Ecad and Zeb1? In one of the panels of the CD44 high EpCam low (SW480 cells) there seems to be cells with quite high levels of EpCam - having a quantified measure of these proteins in the two populations would be important here.

      4) It was very interesting that the different populations gave rise to different metastatic rates following injection through the spleen. Do the authors have information on whether this is because the different populations move out of the spleen and into the liver at different rates (so initiation/seeding is different), or is this a consequence of proliferation, i.e. both cell populations colonise the liver, but only the EpCam-low population sticks around and colonises the tissue? Further to this, can the authors delete Zeb1 in the EpCam-low cells (as they have done in vitro) and show that colonisation is Zeb1 dependent? This latter point would not be considered essential given the following overexpression experiments.

      5) Much of the metastatic quantification is done through IVIS imaging (from what I can see). Have the authors pathologically quantified the number and size of tumours following ZEB1 overexpression in AKP-derived metastasis with histology?

      6) The authors concede that the continuous activation of Zeb1 following transplantation of AKP organoids (pg9 of the PDF) could be the reason that metastatic colonisation is not as impressive as hoped. Have the authors considered pulsing Dox to initiate metastatic colonisation of the liver and then withdrawing Dox to favour proliferation following metastatic seeding? It would be interesting to know whether the timing of Zeb1 expression is important for this phenotype.

      7) As Wnt signalling is important in the establishment of the EpCam-low population, have the authors inhibited this pathway (either at the ligand level or through inhibiting b-cat transcription) to confirm that the population is Wnt responsive?

      8) Finally, linked to point 7. In the scRNA sequencing, in the populations that have increased EMT and EMT-gene expression, does this correlate to a Wnt/B-catenin signature on a single cell level?

    3. Reviewer #1:

      Sacchetti and co-workers have employed established human colorectal cancer cell lines to identify a subpopulation of colorectal cancer (CRC) cells (CD44 high/EpCAM low) which represent cells with high tumorigenicity and malignancy in vitro and in vivo. These cells can also be found in patient-derived tumor organoids and in patient samples. Using bulk and single cell RNA sequencing and subsequent functional validation they go on to demonstrate that enhanced canonical Wnt signaling mediates the expression of the EMT transcription factor ZEB1 and with it an EMT-like process. Consistent with this observation, this cell population exhibits higher drug resistance as compared to the parental cells or to CD44 high/EpCAM high cells. They finally employ a number of cutting-edge computational analyses to classify several subgroups within the EMT cell subpopulation which seem to represent various stages of the EMT continuum, and thus may exhibit various degrees of cell plasticity. The particular gene expression signatures of the identified subpopulations also correlate with poor clinical outcome and with the CMS4 subclass of poor prognosis CRC.

      Overall, the manuscript is presented in a straightforward and concise manner, and the experimental approaches are thoughtfully designed and appropriately controlled. However, some of the results, in particular in the first part, are not specifically novel. The correlation between CRC invasion and nuclear β-catenin and ZEB1 has been reported before, as appropriately cited by the authors. Moreover, the migratory, invasive, pro-metastatic and drug-resistant phenotype of ZEB1-expressing, EMT-like cancer cells has been shown before and is as expected. Finally, as detailed below, the mechanisms regulating the homeostasis of the EpCAM-low and EpCAM-high cells in cell culture and in organoids in vitro and in cancers in vivo remain elusive. While the novel insights into the potential trajectories of the genesis of the various subpopulations and the respective gene signatures are exciting, the functional validation of these signatures for the definition of cell plasticity and the actual establishment and functional validation of an identifiable gene signature for cell plasticity have not been directly addressed. Along these lines, the report goes with the mainstream literature in using the term "cell plasticity" with a rather vague description. Is it defined by EMT in general or only by a specific hybrid stage of EMT, by therapy resistance, by differentiation potential, by the reversibility of processes, by stemness, etc.? How can it be functionally tested? The manuscript, as it stands, does not add tangible data and information on how to identify cell plasticity and what it means in terms of identifying and assessing novel therapeutic targets.

      Specific comments:

      Introduction: the literature on the role of Prrx1 in EMT/MET and the need of MET for metastatic outgrowth should be mentioned already in the Introduction. The discovery and functional characterization of the various EMT stages should also be mentioned already in the Introduction, not only in the Discussion. Finally, the term cell plasticity should be defined in the Introduction, at least how it is used in the following chapters.

      Figure 1/Suppl.1: "similarly variable"? There is a variability of 0 - 99.6% for the levels of the CD44-high/EpCAM-low subpopulation in the different CRC cell lines. Notably, there is no correlation of the levels of this subpopulation with the CMS classification of CRC origin, as is claimed later with CMS4.

      Why do the EpCAM-low cells get lost during long-term culture and turn into EpCAM-high, E-cadherin-high cells? How then is the homeostasis between the EpCAM-high and low populations maintained in the parental cells which have been cultured for decades? Also, almost all single cell clones of EpCAM-low cells turn into EpCAM-high over time. Why are some maintaining the EpCAM-low status? Is there a difference in gene expression or epigenetic imprints? Has the fetal calf serum been stripped of TGFβ or does it still contain TGFβ which could induce an EMT?

      Figure 5E, text: the reversibility of EMT by a MET is here used as equal to cell plasticity. Is this a correct definition of cell plasticity (see also above)? The EpCAM-low status seems rather unstable and not metastable in vitro and in vivo, this may not represent the homeostasis of EMT induction and its reversion and thus not true cell plasticity.

      Figure 6: The induction of an EMT by ZEB1 is not new or unexpected as is the increase of metastasis, even though the latter is not statistically significant here. The "excuse" that the incidence of metastasis could be higher, when ZEB1 expression would have been stopped by removing Dox, could have been actually tested. This would be a more meaningful experiment.

      Figure 7: RNA sequencing identifies Wnt signaling to be enhanced in EpCAM-low cells. GSK inhibition induces the expression of ZEB1 (as known before), yet this works only in HCT116 and not in SW480 cells, which actually show an induction of Wnt signaling. The results seem to indicate that there is not just a mere enhancement of Wnt signaling and that other changes/pathways are required as well. What about other cell lines?

      Is the prognostic and predictive value for the gene signature only true for CMS4 CRCs or for all subtypes? Does the EpCAM-low signature and the signatures of the various EMT stages correlate with CMS subtypes, therapy resistance and clinical outcome? This is not really clear from the data presented.

      The scRNA sequencing seems to reflect the EMT full and hybrid stages. The computational analysis is impressive and exciting, the potential trajectories offer a working model which could be experimentally tested by functional validation of the subgroups to finally pinpoint the cell populations with the highest cell plasticity. And most importantly, what defines cell plasticity at the molecular and cellular level? Is it Wnt signaling or something in addition? Here, the reader is left without a clear picture (see also comment on Discussion, below).

      Text: Seurat33 = Stuart33.

      Discussion: What is the mechanistic basis for the "further enhancement" of Wnt signaling? Is it the dose of Wnt signaling or is it the combination with other signaling pathways which cooperate with Wnt transcriptional control, such as Hippo or TGFβ signaling? There could be a hint from the RNA sequencing data to distinguish these possibilities. Do the target gene lists change with the enhancement of Wnt signaling?

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      This manuscript is in revision at eLife.

      Summary:

      While all reviewers see merit in aspects of the work, and indeed the consensus was that there were elements of novelty and interest in this manuscript, they felt that novel advances were limited as presented. Briefly, the manuscript falls into two parts; it is too long with too much data presented and we recommend focus on potentially the most exciting/novel part, i.e. the RNAseq / sc and computational analyses, and extending this to provide further functional validation. Some of the earlier figures reflect quite well understood biology (EMT, Zeb1, Wnt etc in EMT), and would require much more work to tighten up the conclusions; therefore, it was felt that even if these were improved, the data would likely confirm a lot of what we know already. It is true that the role of EMT is controversial - but what is presented in the first part of the manuscript does not add much definitive new data to inform that debate, and indeed the authors' submission letter refers to their 'confirmatory' nature.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to the reviewers’ comments:

      We thank both reviewers for their critical evaluation and excellent suggestions to improve the manuscript. We are making all the changes suggested by both reviewers and performing the experiments to address all the concerns, specifically those from reviewer #1. Please find below our response to the reviewers’ comments:

      Reviewer #1:

      This is an interesting study from the Rahaman group that identifies cardiolipin (CL) as a potential binding target for Drp6 recruitment to the nuclear membrane in Tetrahymena (that has a unique nuclear remodeling program). In addition, they identify a residue, I553 in the DTD region, which they claim is a key residue involved in specific CL interactions. While the experiments themselves are technically sound, and are well performed and controlled, I don't find the major conclusion that I553 is involved in direct CL interactions justified or well rationalized. By their own admission (in the discussion), the conservative mutation I553M may perturb local folding and may indirectly affect CL interactions. There is no test of DTD folding with and without the I553M mutation, nor are there other mutations (e.g. I553A and in the vicinity) tested. CD experiments in the absence and presence of CL-containing membranes will likely yield information on the impact of the I553 mutations, while DLS experiments would inform on the hydrodynamic properties (overall 3D fold) of the DTD and the impact of these mutations. CL interactions generally involve a combination of electrostatic and hydrophobic forces. Where do the electrostatic interactions come from? Why would an Isoleucine to Methionine mutation affect the hydrophobic component, even if I553 is the key hydrophobic residue?

      Response:

      We thank the reviewer for the comments that the experiments are sound and well performed with appropriate controls. While we agree that the exact mechanism of how I553 provides specificity to cardiolipin binding is not addressed in the present manuscript, our study clearly demonstrates that the isoleucine at 553 plays an important role in determining cardiolipin specificity and nuclear recruitment. As pointed out by the reviewer, it is possible that changing isoleucine to methionine may affect the local conformation. However, there is no major conformational change in the DTD due to this mutation. This conclusion is based on the clear loss of nuclear localization and cardiolipin interaction for the mutant without affecting other properties. The in vitro floatation assay clearly establishes that the effect is direct, through inhibition of the interaction specifically with cardiolipin-containing membranes. It should be further noted that the same domain (DTD) interacts with two other lipids (PS and PA) and the mutant retains interaction with them, arguing that the conformation of this domain is not significantly changed by the I to M mutation. Consistent with these results, the I553M mutant could be targeted to the nuclear membrane as a complex with wildtype Drp6, further confirming that the I553M mutant could form the correct self-assembled structure with wildtype protein required for association with the nuclear membrane. This is further substantiated by comparing all the known biochemical properties, including GTPase activity, membrane binding via the other two lipids, and formation of helical spirals and ring structures. Hence it is clear that I553 provides specificity for cardiolipin binding and recruitment to the nuclear membrane. We will further test whether there is any local conformational change due to the I to M mutation by fluorescence quenching experiments, and the results will be incorporated in the revised manuscript.

      Regarding overall folding of the mutant, this is an excellent suggestion by the reviewer. We are planning to perform CD experiments on the I553M mutant and wildtype proteins to test whether there is any change in overall folding due to the mutation. This result will be incorporated in the revised manuscript.

      The reviewer is right to point out that both electrostatic and hydrophobic interactions are important for interaction with cardiolipin. Electrostatic interaction is important for all phospholipids when interacting with protein and is expected to come from other amino acid residues that are positively charged. Electrostatic interaction may contribute to the affinity of the interaction by providing additional binding energy. But considering that it is a universal feature of interaction with all phospholipids, it cannot give specificity for a particular lipid and hence would not discriminate among different phospholipids.

      Regarding the hydrophobic component, the reviewer is correct that both are strongly hydrophobic amino acids, and the loss of interaction of I553M with cardiolipin may not be due to a change in hydrophobicity.

      To establish that the loss of cardiolipin interaction is not specific to methionine and is due to the absence of isoleucine, the suggestion from the reviewer to replace I553 with A (alanine) is an excellent one. We are doing the experiments and anticipate incorporating these results in our revised manuscript.

      Reviewer #1 (Significance (Required)):

      The addressed phenomenon is restricted to Tetrahymena and may not have far reaching implications. Regardless, the identification of CL as a binding target for Drp6 at the nuclear membrane of this organism is in itself significant. The conclusion that I553 is the key CL binding residue is however not warranted. Additional experiments are needed to dissect how this residue impacts CL interactions and examine whether the observed effect is direct or indirect.

      Response:

      We thank the reviewer for appreciating the significance of this work. We agree that our data are Tetrahymena specific. However, we believe that the study is relevant for all proteins whose association with target membranes depends on cardiolipin, including many cardiolipin-interacting DRPs (such as DRPs involved in biogenesis and maintenance of mitochondria).

      We really appreciate the reviewer's excellent suggestions. Based on these, we are performing the following experiments.

      1. CD experiments to assess overall folding of the I553M and wildtype proteins.
      2. Fluorescence quenching of the tryptophan residue (at amino acid position 548) in the vicinity of I553 to compare the conformation of the mutant with that of the wildtype protein.
      3. Evaluation of I553A in nuclear localization and cardiolipin binding.

      We anticipate that these results will further clarify whether I553 is the key CL binding residue and whether the effect is direct.

      The writing is not clear in some parts and may require a round of language editing. There are no issues with reproducibility.

      Response:

      We thank the reviewer for pointing out the need for language editing. We will edit the language wherever appropriate. We would highly appreciate it if the reviewer could indicate the portions that need special attention.

      Reviewer #2:

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Dynamin is a GTPase superfamily protein involved in membrane fusion and division. This paper focused on Drp6, one of the eight dynamin superfamily proteins of Tetrahymena, and analyzed its nuclear envelope localization mechanism by a combination of in vivo cytogenetical analysis and in vitro biochemical analysis for the various mutant Drp6 proteins. Results showed that a specific amino acid residue (isoleucine at the 553rd) in the membrane binding domain of Drp6 was required for its nuclear membrane localization, but this residue is not required for ER/endosome localization and GTPase activity. Furthermore, in vitro floating analysis using centrifugation indicated that Drp6 specifically bound to the cardiolipin at the 553rd isoleucine residue and this binding was required for Drp6's nuclear membrane localization. Finally, removal of cardiolipin from the conjugating cells using inhibitor treatment showed that cardiolipin was required for the new macronucleus formation (including the expansion of macronuclear envelope) through the function of Drp6. Based on these results, authors concluded that cardiolipin targets Drp6 to the nuclear membrane in Tetrahymena.

      Major comments:

      The experimental data presented in this paper are reasonable and the results are solid, and therefore I think the deduced conclusions are convincing. However, to improve this paper, I have several minor comments to be revised before publication.

      Minor comments:

      1. In the previous paper, it has been shown that GFP-Drp6 is localized in the inner nuclear membrane of both macronucleus and micronucleus. In this paper, however, this point is not clearly stated and is not shown in the figures --- I could not understand such localization pattern of GFP-Drp6 in Fig. 1C and Fig. 3b and the statements in the text. I suggest adding such statements somewhere in Introduction or Result section. Also, add adequate references to the corresponding statements in the text.
      2. Related to the comment 1, I suggest replacing Fig. 1C (images of fixed cells) with Fig. S1B (images of live cells) because nuclear localization of GFP-Drp6 are much clearer in Fig. S1B (live cell) than Fig. 1C (fixed cell), and because fixation may cause artificial redistribution of the proteins. Please add arrows in those figures to point out the position of micronucleus in those figures if necessary.
      3. Similarly, I suggest replacing images of Fig. 5B (fixed cells) with those of Fig. S3 (live cells).
      4. page 7, line 224: GFP-Nup3 is used as a marker protein of the nuclear pore complex (NPC). However, there is no description of how GFP-Nup3 is obtained or made. Add description how this DNA plasmid was obtained or generated.
      5. Related to the comment 4, "Nup3" is first discovered in Malone et al., Eukaryotic Cells, 2009, but also soon after discovered as the name of "MicNup98B" in Iwamoto et al., Curr Biol, 2009 and used in several papers including Iwamoto et al., Genes Cells, 2010; JCS, 2015; JCS 2017; and more. Because Nup3 is the Tetrahymena paralogs of human Nup98 and the name of "Nup98" is well established to call these homologs in various eukaryotes, I suggest adding the name of "MacNup98B" after the word of "Nup3" for reader's better understanding. I also suggest adding appropriate references to refer to this protein as follows: Add Malone et al. 2009 for "Nup3" and Iwamoto et al., 2009 for "MacNup98B."
      6. page 9, line 295: I wonder if "Fig. 3b" may be a mistake of "Fig. 5C." If so, please correct this.
      7. page 10, the second paragraph (lines 311-322): This paragraph discussed the possible involvement of Drp6 in the nuclear envelope expansion of the post-zygotic nucleus. It may be interesting to point out that large-scale nuclear envelope reorganization including the formation of the redundant nuclear envelope and the type-switching of the NPC (from the MIC-type NPC to the MAC-type one) has been reported at this developmental stage (Iwamoto et al., JCS 2015). For example, the peculiar shaped nuclear envelope with the redundant/overlapping nuclear envelope structure can be seen and the MAC-type NPCs rapidly assembles to the expanding nuclear envelope. It may be interesting to point out that cardiolipin and Drp6 may be involved in these phenomena. But it is too speculative and therefore consider adding such a discussion as an option.
      8. page 13, line 412: Is the word "GFP-drp6-I553M" written in italics intended for the gene for the GFP-drp6-I553M protein? If so, protein may be acceptable here. Make sure there are no problems with italicized characters. Also, check if the lowercase letter "d" in "drp6" is OK because large letters are used in other cases.
      9. page 20, figure 1: I recommend switching the positions of HDyn1 and Drp6 in Figure 1a to keep the order in Figure 1b.
      10. page 21, line 671: Add the word "Tetrahymena" before "Drp 6" to pair with the word "human dynamin 1".
      11. page 23, line 729: Remove "and."
      12. page 23, lines 729 and 731: Unify the expression of "cardiolipin" and "Cardiolipin"
      13. page 23, line 732: Add "or" before "10% Phosphatidylserin."
      14. page 24, Figure 3a: Please mark the position of I553M in the figure if possible. Alternatively, indicate the range of amino acid residues after the words "red" and "green" in the figure legend.

      Response:

      We thank the reviewer for the excellent comments that “the experimental data presented in this paper are reasonable and the results are solid, and therefore I think the deduced conclusions are convincing.” We also thank the reviewer for the minor comments, which are thorough and very insightful. They will improve the manuscript substantially. We will incorporate all the changes in the revised manuscript.

      Reviewer #2 (Significance (Required)):

      The corresponding author and his colleagues have reported that Tetrahymena Drp6 is localized to the outer nuclear membrane of both macronucleus and micronucleus of Tetrahymena (Elde et al., 2005) and that Drp6 is required for the formation of new macronuclei during nuclear differentiation (Rahaman et al., 2008). Therefore, these parts are not novel.

      The novelty of this study is as follows:

      (1) The discovery of a specific amino acid residue (isoleucine at the 553rd) of Drp6 that is required for its nuclear membrane localization.

      (2) the discovery of a lipid molecule, cardiolipin, as a critical partner for Drp6's nuclear membrane targeting.

      (3) Discovery of involvement of cardiolipin in the new macronucleus formation (the expansion of macronuclear envelope) through the function of Drp6.

      I think their findings are highly novel and will provide new insight into a field of cell biology. Especially, their findings will contribute to understanding how specific proteins are targeted to the specific intracellular membranes. In addition, their methods (such as floatation assay) for analyzing the interaction between the protein of interest and lipid/liposomes will become an important tool.

      Response:

      We are very happy to note that the reviewer has pointed out the significance of the present study. We fully agree with the reviewer and appreciate the thorough analysis and excellent conclusion.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Dynamin is a GTPase superfamily protein involved in membrane fusion and division. This paper focused on Drp6, one of the eight dynamin superfamily proteins of Tetrahymena, and analyzed its nuclear envelope localization mechanism by a combination of in vivo cytogenetical analysis and in vitro biochemical analysis for the various mutant Drp6 proteins. Results showed that a specific amino acid residue (isoleucine at the 553rd) in the membrane binding domain of Drp6 was required for its nuclear membrane localization, but this residue is not required for ER/endosome localization and GTPase activity. Furthermore, in vitro floating analysis using centrifugation indicated that Drp6 specifically bound to the cardiolipin at the 553rd isoleucine residue and this binding was required for Drp6's nuclear membrane localization. Finally, removal of cardiolipin from the conjugating cells using inhibitor treatment showed that cardiolipin was required for the new macronucleus formation (including the expansion of macronuclear envelope) through the function of Drp6. Based on these results, authors concluded that cardiolipin targets Drp6 to the nuclear membrane in Tetrahymena.

      Major comments:

      The experimental data presented in this paper are reasonable and the results are solid, and therefore I think the deduced conclusions are convincing. However, to improve this paper, I have several minor comments to be revised before publication.

      Minor comments:

      1. In the previous paper, it has been shown that GFP-Drp6 is localized in the inner nuclear membrane of both macronucleus and micronucleus. In this paper, however, this point is not clearly stated and is not shown in the figures --- I could not understand such localization pattern of GFP-Drp6 in Fig. 1C and Fig. 3b and the statements in the text. I suggest adding such statements somewhere in Introduction or Result section. Also, add adequate references to the corresponding statements in the text.
      2. Related to the comment 1, I suggest replacing Fig. 1C (images of fixed cells) with Fig. S1B (images of live cells) because nuclear localization of GFP-Drp6 are much clearer in Fig. S1B (live cell) than Fig. 1C (fixed cell), and because fixation may cause artificial redistribution of the proteins. Please add arrows in those figures to point out the position of micronucleus in those figures if necessary.
      3. Similarly, I suggest replacing images of Fig. 5B (fixed cells) with those of Fig. S3 (live cells).
      4. page 7, line 224: GFP-Nup3 is used as a marker protein of the nuclear pore complex (NPC). However, there is no description of how GFP-Nup3 is obtained or made. Add description how this DNA plasmid was obtained or generated.
      5. Related to the comment 4, "Nup3" is first discovered in Malone et al., Eukaryotic Cells, 2009, but also soon after discovered as the name of "MicNup98B" in Iwamoto et al., Curr Biol, 2009 and used in several papers including Iwamoto et al., Genes Cells, 2010; JCS, 2015; JCS 2017; and more. Because Nup3 is the Tetrahymena paralogs of human Nup98 and the name of "Nup98" is well established to call these homologs in various eukaryotes, I suggest adding the name of "MacNup98B" after the word of "Nup3" for reader's better understanding. I also suggest adding appropriate references to refer to this protein as follows: Add Malone et al. 2009 for "Nup3" and Iwamoto et al., 2009 for "MacNup98B."
      6. page 9, line 295: I wonder if "Fig. 3b" may be a mistake of "Fig. 5C." If so, please correct this.
      7. page 10, the second paragraph (lines 311-322): This paragraph discussed the possible involvement of Drp6 in the nuclear envelope expansion of the post-zygotic nucleus. It may be interesting to point out that large-scale nuclear envelope reorganization including the formation of the redundant nuclear envelope and the type-switching of the NPC (from the MIC-type NPC to the MAC-type one) has been reported at this developmental stage (Iwamoto et al., JCS 2015). For example, the peculiar shaped nuclear envelope with the redundant/overlapping nuclear envelope structure can be seen and the MAC-type NPCs rapidly assembles to the expanding nuclear envelope. It may be interesting to point out that cardiolipin and Drp6 may be involved in these phenomena. But it is too speculative and therefore consider adding such a discussion as an option.
      8. page 13, line 412: Is the word "GFP-drp6-I553M" written in italics intended for the gene for the GFP-drp6-I553M protein? If so, protein may be acceptable here. Make sure there are no problems with italicized characters. Also, check if the lowercase letter "d" in "drp6" is OK because large letters are used in other cases.
      9. page 20, figure 1: I recommend switching the positions of HDyn1 and Drp6 in Figure 1a to keep the order in Figure 1b. 
      10. page 21, line 671: Add the word "Tetrahymena" before "Drp 6" to pair with the word "human dynamin 1".
      11. page 23, line 729: Remove "and."
      12. page 23, lines 729 and 731: Unify the expression of "cardiolipin" and "Cardiolipin"
      13. page 23, line 732: Add "or" before "10% Phosphatidylserin."
      14. page 24, Figure 3a: Please mark the position of I553M in the figure if possible. Alternatively, indicate the range of amino acid residues after the words "red" and "green" in the figure legend. 

      Significance

      The corresponding author and his colleagues have reported that Tetrahymena Drp6 is localized to the outer nuclear membrane of both macronucleus and micronucleus of Tetrahymena (Elde et al., 2005) and that Drp6 is required for the formation of new macronuclei during nuclear differentiation (Rahaman et al., 2008). Therefore, these parts are not novel.

      The novelty of this study is as follows: (1) The discovery of a specific amino acid residue (isoleucine at the 553rd) of Drp6 that is required for its nuclear membrane localization. (2) the discovery of a lipid molecule, cardiolipin, as a critical partner for Drp6's nuclear membrane targeting. (3) Discovery of involvement of cardiolipin in the new macronucleus formation (the expansion of macronuclear envelope) through the function of Drp6.

      I think their findings are highly novel and will provide new insight into a field of cell biology. Especially, their findings will contribute to understanding how specific proteins are targeted to the specific intracellular membranes. In addition, their methods (such as floatation assay) for analyzing the interaction between the protein of interest and lipid/liposomes will become an important tool.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      This is an interesting study from the Rahaman group that identifies cardiolipin (CL) as a potential binding target for Drp6 recruitment to the nuclear membrane in Tetrahymena (that has a unique nuclear remodeling program). In addition, they identify a residue, I553 in the DTD region, which they claim is a key residue involved in specific CL interactions. While the experiments themselves are technically sound, and are well performed and controlled, I don't find the major conclusion that I553 is involved in direct CL interactions justified or well rationalized. By their own admission (in the discussion), the conservative mutation I553M may perturb local folding and may indirectly affect CL interactions. There is no test of DTD folding with and without the I553M mutation, nor are there other mutations (e.g. I553A and in the vicinity) tested. CL interactions generally involve a combination of electrostatic and hydrophobic forces. Where do the electrostatic interactions come from? Why would an Isoleucine to Methionine mutation affect the hydrophobic component, even if I553 is the key hydrophobic residue? Additional experiments are therefore essential to identify the actual residues involved in specific CL interactions. CD experiments in the absence and presence of CL-containing membranes will likely yield information on the impact of the I553 mutations, while DLS experiments would inform on the hydrodynamic properties (overall 3D fold) of the DTD and the impact of these mutations.

      The writing is not clear in some parts and may require a round of language editing. There are no issues with reproducibility.

      Significance

      The addressed phenomenon is restricted to Tetrahymena and may not have far reaching implications. Regardless, the identification of CL as a binding target for Drp6 at the nuclear membrane of this organism is in itself significant. The conclusion that I553 is the key CL binding residue is however not warranted. Additional experiments are needed to dissect how this residue impacts CL interactions and examine whether the observed effect is direct or indirect.

    1. Reviewer #3:

      This is a great paper that takes a modelled somatosensory microcircuit and, without parameter adjustment, asks whether stimulus-specific adaptation is capable of emerging. The ability to remove synaptic depression and stimulus-frequency adaptation in both thalamo-cortical and cortico-cortical populations was a definite highlight for me. Primary negatives were minimal mention of certain aspects of connectivity, and a complete lack of any mention of interneuron processing and its known role in SSA.

      Major Comments:

      1) The NMC model is derived from somatosensory cortex. It's not really discussed at all in the paper, but is the assumption that auditory cortex is similar enough in structure that it is valid to model it with a somatosensory model? Although I'm not a somatosensory expert, there are certainly numerous connectivity differences between auditory and visual cortices (interactions between L6 CT neurons, and the local cortical column for example).

      2) It was not immediately clear to me how exactly the MGB->ACtx was wired up, and consequently, how this wiring affected tuning bandwidth in ACtx. I don't think it was a one-to-one mapping that was used, because there is talk of multiple TC afferents innervating a single cell, but this should be described in detail. How do these connectivity choices affect bandwidth, at a layer-specific level? (i.e. one could imagine a broadly tuned neuron being so because it's integrating auditory information from heterogeneously tuned thalamic neurons).

      3) Related to points 1&2, it looks from Figure 1C, that the TC input is generating a tonotopically ordered map in ACtx? Is this the case? If so, in light of many recent papers that have shown substantial local heterogeneity in ACtx frequency tuning, this is not particularly plausible.

      4) I appreciate that this is not the focus of the paper, but it wasn't clear to me whether the NMC model consisted primarily of excitatory neurons, or whether there were inhibitory neurons that were included in the analysis. If the population is mixed, then this will affect interpretation of the depression experiments. In some sense, this is also my biggest negative about the paper - there is almost no mention of interneurons at all, even though interneurons also play an important role in SSA (given that they shape frequency-dependent responses) - this has been the focus of several publications from the Geffen Laboratory.

      5) It was mentioned in the discussion that the model was not capable of replicating layer-specific SSA values. Related to this, does the model capture layer-specific changes in frequency tuning properties (i.e. layer 5b pyramidal cells have far broader tuning than other cell-types). And if not, might this affect the SSA differences, especially given how important bandwidth in shaping SSA (TC afferents responding to both deviant and standard).

      6) Were there any layer-specific effects on removal of thalamo-cortical vs cortico-cortical that could be linked to the fact that different excitatory cell-types in ACtx have vastly different laminar connectivity patterns (L6 CT translaminar inhibition, L5 PT vs IT, for example)?

      7) How does the model connectivity map onto the distinct morphology of heterogeneous cell-types throughout the cortex, and does this morphology affect the SSA? (The large apical dendrites of L5b neurons, for example, will play a huge role on how they integrate ascending sensory input).

    2. Reviewer #2:

      In this study, the authors aim to explain the mechanisms responsible for the induction of stimulus specific adaptation (SSA). As the model system, the authors pick the auditory cortex, where this phenomenon has been well explored. But the mechanisms they identify (synaptic depression, spike frequency adaptation, and recurrent connectivity) are general. It is thus plausible that their conclusions generalize beyond the auditory modality. I think the study is well conceived, its message well communicated, and the specific conclusions the authors make are well supported by the (model) data. The study demonstrates how high biological fidelity modeling, which has been gaining traction in neuroscience, can serve as a testbed for rapid evaluation of hypotheses and elucidation of the mechanisms behind brain computation.

      That said, I have several major comments:

      1) I am concerned about the novelty/impact of the study. The impact of the present study can be viewed through two lenses:

      (a) The novelty and added value of the modelling approach itself. While I am very enthusiastic about the merits of the high fidelity modeling used in the present study, this modeling approach has now been well established across multiple manuscripts. The cortical model itself is already published, while I do not think the MGB extension of the model itself represents a significant advancement.

      (b) The impact of the findings of the study itself. The study claims one main novel finding: contribution of the SFA in combination with recurrent cortical connectivity to the SSA. The contribution of SFA to SSA doesn't seem particularly surprising, and as authors write it indeed has already been proposed. Also impact of recurrent connectivity on SSA has already been explored by a previous model (Yarden et al. 2014). Furthermore, my understanding is that the model was for the first time able to replicate the weaker presence of SSA in thalamo-cortical layers, and the dependence of SSA on frequency preference of the neuron. It is my understanding that all other replicated phenomena have already been demonstrated in previous models.

      2) I was surprised no comment was made on (a) the potential difference between the anatomy of the auditory cortical column in comparison to the somatosensory column, which the present model has been designed around, and (b) the lack of functionally specific connectivity, which at least in other sensory cortices (e.g. V1) has been shown to play an instrumental role in shaping the computation. This is particularly surprising in the context of the inability of the model to reproduce some of the interesting findings on SSA (distribution of SSA values in different cortical layers, specific deviance sensitivity), and on the other hand the level of optimism on the future of the model expressed in the last paragraphs of the discussion. I think for the modelling approach to fulfill such optimistic goals in future, both these major problems will have to be addressed, which represents a major body of new work - this should be acknowledged.

      3) I am concerned about the lack of functional verification of the model. Do, for example, the cortical neurons have frequency tuning curve characteristics that match auditory data well? Unfortunately, I am not an A1 expert, but I would expect that a wealth of data on elementary functional properties of A1 neurons exists. This represents somewhat of a paradox, where the model is at some level extremely detailed and well matched to experimental data, which (justifiably) the authors sell as a major advantage. But it is surprisingly poorly validated against the elementary computations that A1 performs, which in the context of this study is just as important as, if not more important than, the anatomical fidelity. I feel that, at minimum, this issue warrants thorough discussion, both in the context of the SSA, and the modelling approach itself.

    3. Reviewer #1:

      This study investigates whether a detailed biophysical model of a cortical column, simulating more than 30,000 fully detailed neurons, is able to reproduce a well known property of the auditory cortex: stimulus specific adaptation or SSA. SSA has been successfully reproduced in a simplistic model which shows that adaptation mechanisms explain the qualitative phenomenology of this effect (decreased responsiveness for repeated stimuli, specific to the repeated sound and to sounds whose representation overlaps with the repeated sound). Here the authors aimed at testing whether, without any parameter optimization, a detailed biophysical model is able to reproduce the observed phenomenon. As the model contains two well-known adaptation mechanisms, synaptic depression and spike frequency adaptation, unsurprisingly, a qualitative match between natural SSA and modeled SSA is observed. Moreover, effects related to representation overlap are found by including a mostly data-driven representation model and without fine tuning. Finally, the biophysical model suggests that both synaptic depression and spike frequency adaptation (SFA) contribute to SSA and that SFA exclusively contributes to the asymmetry of cross frequency adaptation with respect to the preferred frequency, which is observed both in the model and in the data, and can be explained by asymmetry of cochlear representations.

      This is a nice and important exercise to test the efficiency of a so-called detailed model at reproducing basic experimental observations. Unfortunately, here the model performs very well qualitatively but not quantitatively, as little quantitative match is observed with spike data from auditory cortex (Figure 5). In fact there is little comparison with actual data, and this is disappointing. One of the purposes of detailed models is to identify their limitations and thereby identify useful details that may have been missed or incorrectly measured. Unfortunately, the quantitative mismatch in Fig. 5 is not mentioned in the results and no attempt is made to fill the gap. Hence, the conclusions of the paper do not go much beyond the well known role of adaptation and representation overlap. The identification of a measure to separate the two components, depression and SFA, is a nice contribution, but it is not tested experimentally, so experiments (e.g. suppressing recurrencies by tetanus toxin light chain) remain to be done to validate this hypothesis.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      All reviewers have acknowledged the value of a detailed model of auditory cortex, and expressed their support for an integrative approach building the link between neural circuits details and observables. It was found particularly interesting that two complementary mechanisms could play a role in stimulus specific adaptation (SSA). Nevertheless, while the reviewers recognized that the simulations were technically sound and that the conclusions represent interesting hypotheses to pursue about the mechanisms of SSA in auditory cortex, they all felt that the precision to which the specificity of auditory cortex circuits were modeled or to which the SSA observables were captured was not sufficient to demonstrate the advantage of the detailed modeling approach with respect to previous simpler models which reached similar conclusions.

    1. Reviewer #2:

      The authors describe the dependence of the p-value on sample size (which is true by definition) and offer a solution, using simulated data and an applied example.

      I'm not sure that the introduction successfully motivates the paper. It is unclear whether this is due to misunderstandings by the authors of some key points, or rather is a matter of awkward communication, such that the authors' intentions are not accurately conveyed.

      The authors note the link between the p-value and sample size. In particular, the authors suggest that statistical significance can be achieved by using a sufficiently large sample size, and they call this 'p-hacking'. I certainly don't recognise use of a large sample size as an example of p-hacking. Instead, this term refers to analytical behaviours which cause the p-value to lose its advertised properties (advertised type 1 error rate). Examples would include taking repeated looks at data without making any appropriate adjustment, trying tests on different groupings of data (and selecting results on the basis of significance), or trying different definitions of an outcome measure. The key point is that, when these actions are performed, reported p-values are no longer valid p-values - they do not behave as they are supposed to. So straight away the authors' argument becomes confusing. Are they criticising the behaviour of the valid p-value? Or are they trying to criticise behaviours that cause the p-value to lose its stated properties? This point remains very unclear. I believe the authors are attempting the former, but wrongly describe this as an example of p-hacking.

      But other statements in the introduction invite further confusion. The authors say "even when comparing the mean value of two groups with identical distribution, statistically significant differences among the groups can always be found as long as a sufficiently large number of observations is available using any of the conventional statistical tests (i.e., Mann Whitney U-test (Mann and Whitney, 1947), Rank Sum test (Wilcoxon, 1945), Student's t-test (Student, 1908)) (Bruns and Ioannidis, 2016)." Again, it is unclear what the authors are trying to say here, and the statement is clearly false under the most obvious interpretation. If the authors are saying that significance will always be found when the null is true and model assumptions are correct provided that the sample size is large, then this is clearly false. In this case, the test will reject the null 5% of the time, using a significance threshold of 5%. The authors can easily confirm this for themselves with a simple simulation. Are the authors trying to make the point that the error rate is conditional not only on the null, but also on the test assumptions (and so when they are violated the test may reject erroneously)? They certainly do not state this, and the fact that they refer to 'identical distribution' suggests otherwise. Another way the test assumptions could be violated is if actual p-hacking (see examples above) were present, such that the reported p-values were no longer valid. Again, the authors do not tell us that this is what they mean, if they in fact do, and this would be a criticism of p-hacking behaviours rather than of the p-value.
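
      The simulation alluded to above could look roughly like the following. This is a minimal sketch, not the authors' analysis: it assumes both groups are drawn from N(0,1), uses a two-sample z-test with known unit variance (chosen only so that the example needs nothing beyond the Python standard library), and the function names `two_sample_z_pvalue` and `rejection_rate` are illustrative.

```python
import random
from statistics import NormalDist, fmean


def two_sample_z_pvalue(a, b, sigma=1.0):
    """Two-sided p-value for equal means, assuming a known common SD `sigma`."""
    se = sigma * (1.0 / len(a) + 1.0 / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def rejection_rate(n, n_sims=1000, alpha=0.05, seed=0):
    """Fraction of simulated null datasets (both groups N(0,1)) with p < alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sample_z_pvalue(a, b) < alpha:
            rejections += 1
    return rejections / n_sims


# Under a true null the rejection rate should stay near alpha at every sample size.
rates = {n: rejection_rate(n) for n in (10, 100, 1000)}
for n, rate in rates.items():
    print(f"n = {n:>4} per group: rejection rate = {rate:.3f}")
```

      If the test is valid, the printed rates should hover around 0.05 regardless of n, rather than drifting towards 1 as the sample grows.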

      When they write "big data can make insignificance seemingly significant by means of the classical p-value" they might be thinking of confusion between statistical and practical significance, which is a common misinterpretation made in the presence of large data size, but again, if this is what the authors are thinking of they should say it. The discussion by Greenland (Valid P-Values Behave Exactly as They Should: Some Misleading Criticisms of P-Values and Their Resolution With S-Values, especially section 4.3) seems to address the concerns raised by the authors fairly decisively. For a given parameter size, increasing sample size should produce stronger evidence against the null. The p-value does not tell you about the size of the parameter directly - it measures the discrepancy between the data and the null - interpreted correctly, there is no problem.

      So, with apologies to the authors, I don't think they are successful in convincing the reader that there is a problem to be solved, and the manner of presentation (which may just be an issue of communicating the authors' intentions) is such that it causes doubt about the authors' handling of the relevant concepts. Throughout the text, there are other confusing presentations around fundamental concepts. E.g. the authors write things like "Hence, we claim that whenever there exist real statistically significant differences between two samples..." I know what a real difference is, but what is a real statistically significant difference? There are no statistically significant differences in nature. Are the authors trying to refer to instances where the null is false and is rejected? Or, are they trying to say that a 'real significant difference' is where the difference exceeds some magnitude?

      For example - the authors write things such as "When 𝑁(0,1) is compared with 𝑁(0,1), 𝑁(0.01,1) and 𝑁(0.1,1), 𝜃 is null; so those distributions are assumed to be equal. In the remaining comparisons though, 𝜃 = 1, thus there exist differences between 𝑁(0,1) and 𝑁(𝜇,1) for 𝜇 ∈ [0.25,3]", highlighting the fact that perhaps the authors really want to address the practical significance vs statistical significance issue (although again, this is not explicitly stated). If the authors are interested in size of effect/ difference, then it is not clear that this proposal offers any advantage in that regard over the p-value (which, as noted, does not tell us about the size of a parameter). If interest is in size, then it is unclear why the authors do not direct the reader to consider the estimate and confidence interval, so that they may consider this explicitly in terms of magnitude and precision.

      With apologies to the authors, who have clearly spent a large amount of time on this - I would think that the best way forward here would be to post this as a preprint and to try to invite as much feedback as possible. The authors have lofty ambitions with this work. Maybe there is a good underlying idea here, obscured by the presentation? Unfortunately, it is difficult to assess this at present.

    2. Reviewer #1:

      The paper sets out to confront p-hacking and address the dependence of the p-value on the sample size. The paper sets out the motivation behind the problem and then proposes a solution using three examples.

      I have a major problem with this work in that I do not understand the motivation and hence cannot judge the value of the proposed solution.

      The authors need to set out some definitions, which might help them frame the context. I outline below what I understand as the context and hence why I do not understand how their proposal will address the problem.

      Firstly, 'p-hacking' is the term usually reserved for when researchers do not follow a pre-specified protocol on how a research question will be answered through the statistical analysis of a resource, single study or experiment, but instead analyse the data in many ways. Maybe they use slightly different assumptions, adjust the definition of an outlier or who is eligible for inclusion, or switch to a different outcome variable. In this manner they select to report the analysis that gives the smallest p-value (Ioannidis referred to some of this as vibration effects). This is a major problem in science, but it is not only a problem of the size of the data available, although the bigger the dataset, the more subgroups can be analysed. The main problem here is that we do not know how many ways the data have been analysed; we only know what researchers have selected to report. The manuscript does not address this problem at all.

      The p-value is defined as the probability of observing a result as or more extreme when the null hypothesis is true. In most settings the 'null' is that there are no differences between two or more groups, for example that all the means are the same or equal. Often this translates into the statement that we expect p-values under the null to be uniformly distributed on [0,1]. This can be demonstrated or checked by simulation. In the hypothesis testing framework we usually power our studies so we will be able to detect a (true) difference between two groups with some high probability. The specific difference we are interested in would be called the alternative hypothesis. Hence the p-value is used to reject the null, but under the alternative hypothesis the p-value will not be uniform on [0,1]. It is well known that the larger your sample size, the more precise the estimates you will obtain and the smaller the differences you will be able to detect. Sample size calculations require a specific alternative to be stated (e.g. a difference in means of 0.5 of a standard deviation); then a sample size that guarantees a specific power for the specific type 1 error can be calculated.
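
      To make the sample size calculation concrete, here is a minimal sketch using the usual normal-approximation formula for comparing two means, n per group = 2 (z_{1-alpha/2} + z_{power})^2 / d^2, where d is the difference in SD units. The choice of 80% power and a two-sided 5% type 1 error are assumptions for illustration, and `n_per_group` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_sd, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    where d is the true difference in means expressed in standard deviations.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = std_normal.inv_cdf(power)
    return math.ceil(2.0 * (z_alpha + z_power) ** 2 / effect_size_sd ** 2)


# The 0.5 SD alternative mentioned above, and a far smaller 0.01 SD difference.
print(n_per_group(0.5))
print(n_per_group(0.01))
```

      The contrast between the two calls illustrates the point about precision: a very small difference needs a very large sample to be detected, and a large dataset will detect it.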

      This manuscript confuses the properties of the p-value when there are no differences with its properties when there are minimal differences between the two groups. I think the authors are trying to make the point that a statistically significant result is not necessarily a clinically or biologically meaningful result. They have done some simulations to show the distribution of the p-value when the true difference between the two means is 0.01. This is an example of an 'unimportant' difference, but it is not the null. This problem is best addressed by reporting effect sizes and 95% confidence intervals for quantities of interest rather than trying to adjust p-values in some way. Obviously, when we have access to large datasets we may have a much larger sample than we needed to detect a meaningful effect, and so we may find small p-values. Adjusting the p-values will not really help, as it is the effect sizes that are of interest.
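
      As an illustration of reporting an estimate with its confidence interval, the sketch below works from summary statistics only. The numbers (a mean difference of 0.01, unit SDs, one million observations per group) are hypothetical values chosen to mirror the 0.01 scenario, the interval uses a large-sample normal approximation, and `mean_difference_summary` is an illustrative name.

```python
from statistics import NormalDist


def mean_difference_summary(mean_a, mean_b, sd_a, sd_b, n_a, n_b, level=0.95):
    """Difference in means with a normal-approximation CI, Cohen's d and a z-test p-value."""
    std_normal = NormalDist()
    diff = mean_a - mean_b
    se = (sd_a ** 2 / n_a + sd_b ** 2 / n_b) ** 0.5
    z_crit = std_normal.inv_cdf(0.5 + level / 2.0)
    pooled_sd = (((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)) ** 0.5
    return {
        "difference": diff,
        "ci": (diff - z_crit * se, diff + z_crit * se),
        "cohens_d": diff / pooled_sd,
        "p_value": 2.0 * (1.0 - std_normal.cdf(abs(diff) / se)),
    }


# Hypothetical summary statistics: a tiny true difference measured in a very large sample.
summary = mean_difference_summary(0.01, 0.0, 1.0, 1.0, 1_000_000, 1_000_000)
print(summary)
```

      Here the p-value is very small, yet the interval shows directly that the difference is confined to roughly one hundredth of a standard deviation, which is the information a reader needs to judge practical importance.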

      I feel the manuscript needs to be redrafted to be more clear about the problem they are trying to fix.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary:

      The authors describe the dependence of the p-value on sample size (which is true by definition) and offer a solution, using simulated data and an applied example. Unfortunately, both reviewers found it difficult to understand the motivation for the work and hence both had difficulty judging the value of the proposed solution. Detailed comments and suggestions are provided below.

    1. Reviewer #3:

      This is a manuscript by Karimi-Rouzbahani et al, about the neural encoding of facial familiarity using EEG and MVPA.

      I essentially found the article interesting, clear and using solid methods. Besides a few minor comments, which I list below, I found only one major issue which has to be addressed.

      Major comment:

      My only major problem with the results lies in the simple interpretation of anterior contributions to the encoding of familiarity as feed-back. You find, using a clever partialling out method, that eliminating the occipital contributions from the frontal (or rather anterior, as it involves temporal cortex too) electrode pattern familiarity decoding reduces the encoded familiarity information more strongly, earlier, and for longer than the opposite manipulation, in which you partial out the frontal information from the occipital/posterior electrode pattern. The former is interpreted as a signal of feed-back, and the latter as feed-forward information flow. This makes sense, but only if the frontal cortex does not play a role, in its own right, in face processing. However, the inferior frontal face area (see e.g. Collins and Olson, 2014) is known to be associated with the STS and to play a role in social, dynamic and eye-movement related information processing. If we assume that these tasks are more related to the frontal than to the posterior areas, as for example Duchaine and Yovel, 2015 do, then the results of the partialling out analysis merely mean that the functions of the frontal areas are modulated more by the posterior areas (in other words, in those functions the parietal areas also play a role) than the other way around. The lower-level functions of the posterior sites are, on the other hand, modulated less, for a shorter time, and later by the removal of frontal areas; in other words, the frontal cortices do not play much of a role in them.

      This is different from your conclusion, where you state feed-forward vs feed-back connections. I don't see any good way to rule out this alternative (and simpler) conclusion in favor of your assumption about connectivity. Time would be a potential factor to resolve it, feed-back being later, but in your figures it is clear that the two periods overlap entirely and the peaks also fall into almost identical windows.

      Unless I overlooked something and you can give a convincing way to exclude this possibility, I would recommend that you a) discuss this in the paper and b) tone down your respective conclusions throughout the manuscript.

    2. Reviewer #2:

      The authors employed a clever experimental paradigm to investigate how the brain integrates visual information to reach a decision on the familiarity of a presented face. Eighteen subjects performed an EEG experiment while they were presented with images of themselves, close friends, famous individuals, or unfamiliar individuals. They were required to perform a 2AFC task to decide on the familiarity of the image (familiar/unfamiliar). The authors report behavioral differences in accuracy and reaction times depending on the task difficulty (more or less degraded images) and depending on the familiarity of the face, with self and personally familiar faces being recognized more easily and faster. Some of these behavioral differences were reflected in brain activity as evaluated by ERPs, decoding, and RSA analyses. Adopting a novel RSA-based connectivity method, the authors claim that under conditions with limited visual information (more degraded images), top-down effects from frontal areas to occipital areas are stronger than in conditions with increased visual information (less degraded images).

      The main question of this work is of interest and important in the face processing literature. The paradigm is clever and has the potential to address the question of interest. However, I have strong concerns about the methods, as well as some issues with the interpretation and framework in which the authors place the results of this work.

      Methods:

      1) There is little information about single-subject results or effect sizes, except for behavioral results. Only the mean values across subjects are reported with significance values (however, the reader cannot be sure about this as it is not explicitly mentioned anywhere). It's unclear from the description of the methods how data from different subjects were pooled for group analysis. Similarly, it's unclear how the null distributions were generated across subjects for permutation testing.

      2) Different analyses use either correct trials only or both incorrect and correct trials, without any clear rationale of why this is warranted. This is especially important in a task with highly different accuracy values depending on the conditions of interest. Figure 1B shows different levels of behavioral accuracy depending on coherence levels, while Figure 1D shows different levels of accuracy depending on familiarity type. This is very interesting, but it creates challenges for the analysis of brain data.

      On the one hand, if only correct trials are selected for the analysis (as in the decoding results), then different conditions will have a different number of trials. In turn, this will change the distribution of samples into classes, it will change the theoretical chance level, and it will change the levels of noise for estimates of central tendency. For example, the difference in decoding results between different familiarity types in Figure 3B could potentially be driven by a different number of trials belonging to each of the subclasses of familiarity.

      On the other hand, if both correct and incorrect trials are selected for the analysis (as in the RSA analysis), then results are confounded by potentially different brain processes that take place for correct and incorrect trials. Consider that in a 2AFC task, participants can be correct in one way only (correct classification), while they can be incorrect in many ways (slow RT, low attention level, or true misclassification). Given this experimental paradigm, I think the more straightforward approach would be to analyze correct and incorrect trials separately for all analyses and report both results. This would limit confounding effects in the interpretation of the data.

      3) For the decoding analyses, I find it suboptimal (and potentially problematic) to use a binary classifier (familiar vs. unfamiliar) to investigate a multiclass problem (levels of familiarity). A better approach would be to run a 4-way classification from the beginning, and then use this classifier to generate a 2-way classifier. This approach would preserve the actual structure of the data, which is divided into four classes of interest and not only two. In addition, I cannot tell from the methods whether the labels were permuted appropriately for permutation testing. Since there is a different number of trials in each class, the label permutation should maintain the same proportion of trials in each class to preserve the original structure and generate an appropriate null distribution (Etzel, 2015; Etzel & Braver, 2013; Nichols & Holmes, 2002).
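
      A label permutation that keeps the class proportions fixed can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the data are simulated with unbalanced classes, a simple difference in class means (`mean_gap`) stands in for decoding accuracy, and shuffling the existing label vector is what guarantees that every permuted labelling has the same number of trials per class.

```python
import random
from collections import Counter
from statistics import fmean


def mean_gap(values, labels):
    """Toy statistic standing in for decoding accuracy: difference in class means."""
    familiar = [v for v, lab in zip(values, labels) if lab == "familiar"]
    unfamiliar = [v for v, lab in zip(values, labels) if lab == "unfamiliar"]
    return fmean(familiar) - fmean(unfamiliar)


def permutation_p_value(values, labels, statistic, n_perm=2000, seed=0):
    """Two-sided permutation p-value; shuffling the label vector keeps class counts fixed."""
    rng = random.Random(seed)
    observed = abs(statistic(values, labels))
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # same labels, new order: class proportions are unchanged
        if abs(statistic(values, shuffled)) >= observed:
            exceed += 1
    # The +1 counts the observed labelling as one of the permutations.
    return (exceed + 1) / (n_perm + 1)


# Simulated, deliberately unbalanced data: 30 "familiar" and 10 "unfamiliar" trials.
data_rng = random.Random(1)
labels = ["familiar"] * 30 + ["unfamiliar"] * 10
values = [data_rng.gauss(1.5 if lab == "familiar" else 0.0, 1.0) for lab in labels]
p = permutation_p_value(values, labels, mean_gap)
print(Counter(labels), p)
```

      With cross-validated decoding, the same idea would be applied to whatever labels enter the classifier, so that each permuted dataset has the class balance of the real one.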

      4) It's unclear to me what the brain-behavior correlation analysis is meant to represent (Figure 3C) when the decoding analysis is performed on correct trials only, while behavioral accuracy is (necessarily) computed on all trials. In addition, I am left to wonder whether the overall within-subject behavioral accuracy is predicted by (or correlates with) the overall decoding accuracy across timepoints based on within-subject brain data. If such an effect exists, then the more complicated, time-varying analysis would be warranted. However, this analysis should be reported with individual subjects' results to highlight the effect size of such a correlation. Finally, I would suggest the authors move some of the text describing this analysis from the methods to the main text. I find the description in the main text to be particularly opaque and much clearer in the methods section.

      5) It's unclear how the RSA results were pooled across subjects. In addition, these analyses used both correct and incorrect trials. I don't see why these analyses cannot be performed on correct and incorrect trials separately by sub-selecting rows and columns of the RDMs for each subject. This would make the interpretation of the results much more straightforward. These results are now confounded by whether the image was correctly or incorrectly classified by the participant.
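
      Sub-selecting rows and columns of an RDM by trial outcome is a small operation. The sketch below uses a made-up 4 x 4 dissimilarity matrix and a made-up vector of trial outcomes; `subselect_rdm` is an illustrative helper, not a function from the authors' code.

```python
def subselect_rdm(rdm, keep):
    """Return the sub-matrix of `rdm` restricted to the row/column indices in `keep`."""
    return [[rdm[i][j] for j in keep] for i in keep]


# Hypothetical RDM over four trials (symmetric, zero diagonal) and their outcomes.
rdm = [
    [0.0, 0.2, 0.7, 0.9],
    [0.2, 0.0, 0.5, 0.6],
    [0.7, 0.5, 0.0, 0.3],
    [0.9, 0.6, 0.3, 0.0],
]
correct = [True, False, True, True]

correct_idx = [i for i, ok in enumerate(correct) if ok]
incorrect_idx = [i for i, ok in enumerate(correct) if not ok]

rdm_correct = subselect_rdm(rdm, correct_idx)      # rows/columns 0, 2 and 3
rdm_incorrect = subselect_rdm(rdm, incorrect_idx)  # row/column 1 only
print(rdm_correct)
print(rdm_incorrect)
```

      The two sub-matrices can then be carried through the same RSA steps separately, which keeps correct and incorrect trials from being mixed in one estimate.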

      6) I'm not convinced the partial correlation results with low-level visual features are sufficient to account for the effect of visual differences. These differences necessarily exist when using pictures of famous people with less staged pictures of friends and other individuals. I'd like to know how much each image class can be predicted by image statistics alone either by mimicking the experiment using a classifier or by training a classifier to distinguish familiarity type on the actual images. This would quantify whether the familiarity of the person can be decoded simply based on low-level visual properties (such as luminance values from pixel intensities), or from more biologically inspired features that simulate early visual cortex, such as HMAX features or the first layer of a general recognition visual DNN.

      7) I find the proposed connectivity method quite interesting, but I'm highly concerned whenever a method is developed and tested in a single dataset to support the main hypothesis. I realize it is hard to obtain a real "ground truth" dataset to test this method, especially in our global condition. However, I would be more confident in this method if it were applied to some simulated data to show that it can recover the simulated feedforward/feedback dynamics with different amounts of noise in the dataset. In addition, especially for this analysis, differences between correct and incorrect trials should be analyzed. Otherwise, the interesting findings in Figure 4D could be confounded by a different number of correct trials in each of the coherence levels (with more incorrect trials for the 22% condition).

      Interpretation:

      8) Throughout the manuscript, I find the description of the visual pathway and the face processing network to be too simplified. It is described with a simple distinction into "peri-occipital" and "peri-frontal" areas, and a dichotomy between feed-forward/feed-back connection. While EEG cannot afford a more precise spatial resolution, I think both the introduction and the discussion should place the results of this manuscript within the broader and more precise knowledge we have about the visual system and the face processing system. For example, how do these results fit within the framework of (familiar) face processing (Duchaine & Yovel, 2015; Freiwald et al., 2016; Haxby et al., 2000; Visconti di Oleggio Castello et al., 2017)?

      While I agree that the evidence for top-down effects from frontal areas in visual recognition is substantial (as the seminal work by Moshe Bar and others has shown), recurrent and feedback connections exist much earlier in the pathway (Kravitz et al., 2013). These recurrent connections have been shown to play a role in tasks with occluded images as well (Tang et al., 2018), which has similarities with the task presented in this manuscript. Thus, for this task, do we really need to assume a contribution from frontal areas? Could it be more easily explained by these recurrent connections in occipital and temporal areas alone? I think the discussion should present a more precise (and nuanced) description of the visual pathway and the face processing network, rather than a simplified dichotomy between frontal/occipital areas.

      References:

      Duchaine, B., & Yovel, G. (2015). A Revised Neural Framework for Face Processing. Annual Review of Vision Science, 1(1), 393-416.

      Etzel, J. A. (2015). MVPA Permutation Schemes: Permutation Testing for the Group Level. 2015 International Workshop on Pattern Recognition in NeuroImaging, 65-68.

      Etzel, J. A., & Braver, T. S. (2013). MVPA Permutation Schemes: Permutation Testing in the Land of Cross-Validation. 2013 International Workshop on Pattern Recognition in Neuroimaging, 140-143.

      Freiwald, W., Duchaine, B., & Yovel, G. (2016). Face Processing Systems: From Neurons to Real-World Social Perception. Annual Review of Neuroscience, 39(1), 325-346.

      Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural system for face perception. Trends in Cognitive Sciences, 4(6), 223-233.

      Kravitz, D. J., Saleem, K. S., Baker, C. I., Ungerleider, L. G., & Mishkin, M. (2013). The ventral visual pathway: an expanded neural framework for the processing of object quality. Trends in Cognitive Sciences, 17(1), 26-49.

      Nichols, T. E., & Holmes, A. P. (2002). Nonparametric permutation tests for functional neuroimaging: a primer with examples. Human Brain Mapping, 15(1), 1-25.

      Tang, H., Schrimpf, M., Lotter, W., Moerman, C., Paredes, A., Ortega Caro, J., Hardesty, W., Cox, D., & Kreiman, G. (2018). Recurrent computations for visual pattern completion. Proceedings of the National Academy of Sciences of the United States of America. https://doi.org/10.1073/pnas.1719397115

      Visconti di Oleggio Castello, M., Halchenko, Y. O., Swaroop Guntupalli, J., Gors, J. D., & Gobbini, M. I. (2017). The neural representation of personally familiar and unfamiliar faces in the distributed system for face perception. In Sci. Rep. (Issue 1, p. 138297). https://doi.org/10.1038/s41598-017-12559-1

    3. Reviewer #1:

      In this manuscript the authors report a study investigating the "neural familiarity spectrum" of face recognition. The authors used a paradigm via which stimuli (i.e. facial identities with varied levels of familiarity) were gradually revealed. In general, I entirely agree that the previous overemphasis of and/or arguing "for a dominance of feed-forward processing" ought to be replaced by a more "nuanced view". In my opinion, the constraints imposed by our methodological choices, which ultimately determine the nature of our observations, also need to be humbly considered. I commend the authors for their efforts and their well-written, interesting manuscript, which I believe represents a valuable and needed contribution to the field of face cognition and beyond.

      Major Points:

      Throughout the manuscript references are warranted to a number of studies that have:

      (i) Used similar approaches to a) decelerate the categorization process and b) investigate representations across time by applying uni-/multivariate analyses that were stimulus onset and/or reaction time aligned (eg, Carlson et al., 2006; Jiang et al., 2011; Ramon et al., 2015; Quek et al., 2018)

      (ii) Reported findings related to frontal contributions towards familiar face recognition (numerous EEG studies by Caharel and colleagues, and Ramon et al., 2010, 2015).

      What I am missing is an explicit discussion of the challenging effect of expectations related to identities (as well as specific images, since observers provided stimuli themselves). The authors discuss the role of perceptual difficulty and familiarity level, but the latter is in fact confounded with expectations of the specific to-be-presented identities, which moreover appear in the context of the active (vs. orthogonal) task; both factors increase signal strength. (Note: this is not a critique and applies to all studies using personally familiar identities, especially those that have used a relatively small number of identities.)

      In light of this, I believe that statements related to the dominance of "feed-forward flow" in relation to perceptual difficulty should be more nuanced. Examples include:

      -"perceptual difficulty and the level of familiarity influence the neural representation of familiar faces and the degree to which peri-frontal neural networks contribute to familiar face recognition"

      -"We observed that the direction of information flow is influenced by the familiarity of the stimulus"

      Level of familiarity and perceptual difficulty are correlated in the present study, as well as most studies precisely because observers know who will be seen. Therefore, one could argue that the expectations, not the level of familiarity per se determine "the involvement of peri-frontal cognitive areas in familiar face recognition". (cf. Huang et al., (2017) and Ramon & Gobbini (2018) for a discussion).

      Related to this aspect and relevant for the analyses is the different number of trials across categories (3x as many unfamiliar face trials vs. each of the familiar ones). How was this dealt with statistically (cf. also stats reported in Figure 2), and were participants informed about the ratio beforehand? Given the provision of self and personally familiar images, the task could also be considered an n-identity search task (cf. Besson et al., 2017), as participants match sensory inputs to one of n possible known vs. an unknown number of unfamiliar identities / events. (To illustrate, the effects of expectations can determine the degree to which recovery from neural adaptation is observed across different face-preferential regions using the same task; e.g. Rotshtein et al., 2005, Nat Neurosci vs. Ramon et al., 2010, EJN.)
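
      To make the trial-imbalance question concrete, one common remedy is to subsample every category down to the size of the smallest one before analysis. The sketch below is only an illustration of that idea and is not the authors' procedure; the category labels, trial counts, and the function name `balance_by_subsampling` are all hypothetical.

```python
import random

def balance_by_subsampling(trials_by_class, seed=0):
    """Randomly subsample every class down to the size of the smallest class."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    n_min = min(len(trials) for trials in trials_by_class.values())
    return {
        label: rng.sample(trials, n_min)
        for label, trials in trials_by_class.items()
    }

# Hypothetical trial indices: three times as many unfamiliar trials
# as in each of the other (also hypothetical) categories.
trials = {
    "unfamiliar": list(range(0, 90)),
    "familiar": list(range(90, 120)),
    "self": list(range(120, 150)),
}
balanced = balance_by_subsampling(trials)
```

      In practice the subsampling would typically be repeated over several random draws and the results averaged, so that no single draw of unfamiliar-face trials drives the outcome.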

      The authors list "levels of categorization [...], task difficulty [...] and perceptual difficulty [...]" as potentially affecting "the complex interplay of feed-forward and feedback mechanisms in the brain" (l.442). I agree and point towards further relevant papers to be cited that additionally investigate the impact of expectations or "decisional space" on categorical decisions in the healthy as well as impaired brain (eg Ramon, 2018, Cogn Neuropsychol; Ramon et al., 2019, Cognition; Ramon et al., 2019, Cogn Neuropsychol).

      To summarize, can the "accumulation of sensory evidence in the brain across the time course of stimulus presentation" (l.267) and "the strength of incoming perceptual evidence and the familiarity of the face stimulus", which are considered to determine the direction of information processing, be distinguished from the effect of expectations, which potentially increases over time? (This effect is naturally non-existent for unfamiliar stimuli, for which no "domination of feed-forward flow of information" was found.)

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      The reviewers appreciated the clever paradigm and the focus on top-down influences during familiar face recognition. However, the reviewers also raised several serious methodological concerns. For example, they noted that the familiarity conditions cannot be easily compared, considering that these conditions differed in multiple ways beyond the level of familiarity (e.g., staged vs supplied photos, one vs many identities).

    1. Reviewer #3:

      The manuscript by Mioka et al. is the synthesis of a lot of well executed experiments examining a "void" zone in the plasma membrane of yeast cells lacking phosphatidylserine. The authors demonstrate that this is a specialized micron-size domain with many intriguing properties. However, there are several issues that limit my enthusiasm. Some of the experiments are misinterpreted, and there are also inconsistencies and inaccuracies in the text. In my opinion, Figure 6 and Figure 7 add little to the primary findings of the paper.

      Other concerns:

      1) The void zones shown are more prevalent at 37C than 30C. This is opposite to the other micron sized phase separation in the yeast vacuole (Rayermann et al., 2017). If this is a Lo domain then rapid oscillations in temperature should control the reversible assembly and disassembly. This should be examined.

      2) It's odd to me that the filipin signal has "thickness" beyond what you would expect if it was confined to a bilayer. In other experiments it appears that the cytosolic fluorescence is also quenched in the vicinity of the voids. This is problematic as every GFP construct examined on the cytosolic side of the PM is excluded. Perhaps these cells actually have ergosterol crystals (a 3D structure) rather than a Lo domain within the bilayer. Given the importance of cholesterol crystals in being a "danger" signal and activating inflammasomes it could be worth examining. This would require specialized imaging techniques.

      3) Spira et al. (2012, NCB) highlighted the patchwork nature of the plasma membrane, with Pma1 and Ras2 being excluded from one another and proteins with similar TMDs tending to colocalize. This article should be included in the discussion to help place these findings in a greater context. Yet here all of the constructs that are examined are excluded from the void zones. This again suggests to me that this is different from an Lo domain. In the cho1 cells that do not have obvious voids, what is the localization and overlap of a few of the well characterized markers Ras2, Pma1, Sur7, Bio5?

      4) Figure 1B shows 40% of cells grown overnight at 37C have voids but Figure 2C shows that they are lost after ~15h. This seems inconsistent.

      5) The authors state that psd1 psd2 are PE-deficient and cho2 opi3 are PC-deficient in the figure. This is incorrect.

      6) Figure 3C is not convincing. Images on the right have substantially more red pixels and so positions where there were voids at 0 min now have a bit of green at 25 min. I also don't understand how the ergosterol rich region is able to quench signal in the cytosol. Is this an extended focus representation of multiple slices?

      7) GPI-linked proteins are crosslinked to the cell wall. The authors' conclusions cannot be drawn from this experiment. The authors could potentially do the same experiment in spheroplasts.

      8) Alternatively, adding rhodamine-PE to the cells could be used to assess the partitioning in the outer leaflet.

      9) The significance of the vacuole - void contact is unclear. Typically, ~50% of the PM is in close apposition to cER in yeast. In mammalian cells it is known that cortical actin can restrict ER-PM contact sites formation. Thus, it could simply be that in the absence of cER that the Vacuole will come in close proximity to the PM. This can be tested by using a strain deficient in reticulons or the so-called delta tether or delta super-tether cells. If these cells also display Vac - PM contacts, then I don't see the relevance of including this figure in this study.

      10) Vacuole - void contacts are seen in roughly 50% of the cells with voids. In the cells that don't have this V-V contact do they have the nucleus or nER in contact with the PM? This is related to the above point. Is this simply a result of removing the cER and making the PM available?

      11) Figure 7 is unnecessary and just makes things more complicated. It actually detracts from the main findings since it is just a collection of observations. For instance, how would loss of the HOPS complex prevent Lo phase separation in the plasma membrane? Do these cells have less total cellular or plasmalemmal ergosterol? Do the levels of complex sphingolipids change?

      12) Provide a reference or a direct measurement showing that growing cells in pH7.0 medium impacts the cytosolic pH.

    2. Reviewer #2:

      This study shows that plasma membrane (PM) voids, regions devoid of proteins, form in cells lacking phosphatidylserine (PS). It argues these regions are enriched in ergosterol and are liquid ordered. Domain formation is reversible and may require ergosterol and sphingolipids. A number of genes that disrupt void formation are also identified. The study proposes that PS prevents the formation of void zones by interacting with ergosterol. Overall, the study is well done and makes a persuasive case that protein-free voids form in the PM and do not seem to affect cell growth; a fascinating discovery. There are, however, two weaknesses in the study that reduce its impact. One is that it does not show PS is directly involved in void formation or that void zone formation is driven by PS-ergosterol interactions, as stated in the abstract and elsewhere. This could be addressed in vitro using GUVs or supported bilayers. I realize these experiments are challenging, but they could add significant mechanistic insight. The second major weakness of the study is that it does not demonstrate PM void zones occur in wild-type cells in response to stress or in some growth conditions. There are other, more minor concerns.

      1) There is no direct demonstration that the void domains are ordered. This could be shown using order sensitive dyes like Laurdan. Further evidence could be provided by directly measuring diffusion rates of fluorescent lipids in the void zones compared to the rest of the PM. In addition, if the void domains are ordered, it should be possible to show they melt and reform as cells are heated and cooled.

      2) The role of Osh6 and Osh7 in void formation should be assessed since these proteins are thought to be necessary to maintain PS enrichment in the PM, at least in some growth conditions.

      3) The investigation of void zone-vacuoles (V-V) contact sites is not well explained. It is not clear what is being proposed. How would contact sites promote void zone formation? Are they sites of lipid transfer and, if so, how would that affect void-zone formation? Or is some other mechanism being proposed?

      4) It is not clear what the mutant analysis adds to the story. Do the mutations affect PS levels in the PM? If that is what is being proposed it should be tested. Or do the authors think the mutants affect void zone formation by some other mechanism?

    3. Reviewer #1:

      The manuscript by Mioka et al. presents an interesting and puzzling observation. The authors showed the existence of a so-called "void zone" in PS-deficient cho1∆ cells. This void zone is a membrane region devoid of proteins and with a specific lipid composition, which the authors suggest to be a microscopic liquid-ordered domain. They also tested different stress conditions and found some that prevented void zone formation in cho1∆ cells. The authors propose that PS is a key lipid in preventing macroscopic raft-like domain formation in WT cells. Although it is unclear whether such PM void zones can appear in WT cells under any stress conditions (hence a note of caution on the physiological relevance of the findings presented here), the authors' proposal that PS in WT cells can suppress the formation of macroscopic lipid domains is an interesting hypothesis that, in my opinion, deserves to be followed up. Finally, the authors start a search for genes required for void zone formation, which is interesting in my opinion. Although only partial conclusions can be drawn from it at the moment, I think this is a promising way to study the mechanisms and perhaps the physiological relevance of void zone formation in the future.

      I have some concerns, especially about the apparent claim that the void zone is a liquid-ordered domain (if so, it should look more circular, unlike the shapes shown).

      Major concerns:

      1) The authors say that Lo domains are completely depleted of transmembrane (TM) proteins. However, there are many reports (e.g. from the Levental lab), where TM proteins with "raft" affinity have been shown. The authors should express some of these raft TM markers and check whether they partition or not into the void zone.

      2) I do not think there is enough experimental evidence for the claim that the void zone is a liquid-ordered (Lo) domain. In particular:

      -Line 82: doesn't the fact that the domains are not circular argue against a Lo phase and favor a more gel/solid phase? Have the authors seen fusion of void zone domains in live cells?

      -Line 84: does FM4 partition equally to Lo and Ld (liquid-disordered) domains in vitro? What about gel-like domains?

      -Lines 304-307: along the same lines, this is true for some proteins, although there are TM proteins that have been shown to be targeted specifically to Lo regions in GPMVs.

      -The fact that the void zone appears at high temperature is puzzling if compared to standard liquid-ordered domains.

      -Line 687: these observations are also compatible with gel-like domains.

      -Is it possible to do some dynamic measurements of dye diffusion in void zones? FRAP? Single particle tracking?

      3) Many trafficking routes/genes are required for void zone formation. What about for the stability/maintenance? Could the authors provide dynamic anchor-away or degron-tagging of some of these candidates to test whether void zones disappear upon depletion of these proteins?

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      This manuscript shows the interesting observation that plasma membranes in yeast cells lacking phosphatidylserine (PS) present differentiated regions, the so-called "void zones". Void zones are devoid of proteins and have a specific lipid composition (are enriched in ergosterol), which the authors suggest to be a microscopic liquid-ordered domain. Void zone formation is reversible and may require ergosterol and sphingolipids for its formation. They also tested different stress conditions and found some that prevented void zone formation in cho1∆ cells. The authors propose that PS is a key lipid in preventing macroscopic raft-like domain formation in WT cells, in particular by interacting with ergosterol. Finally, a study for genes that disrupt void formation is also presented.

      As you will see all the reviewers acknowledge that the manuscript presents high quality experiments and potentially very interesting discoveries. However, they all coincide in that the story has some weaknesses.

    1. Reviewer #3:

      In the manuscript, Polaski et al. compared the reported UPF1 mutations with a collection of three databases and found that 42.5% of these mutations are identical to germline genetic variation. However, most of these overlapping mutations are located within introns and are only present in the Exome Aggregation Consortium (ExAC) database (Figure 2). This raises some concerns, since the ExAC database mainly reports exon variants rather than intron variants; the authors need to provide other information, such as allele frequency, to examine whether these intronic mutations are rare or low-frequency variants. Another suggestion is that the authors cross-reference the UPF1 mutations with the recent gnomAD v3 database (Nature 2020), which provides non-coding genetic variants at much better resolution. In addition, most of the other UPF1 exon mutations are indeed novel, as they are not present in any database (Figure 2 - figure Supplement 1). The authors need to provide some additional analysis, such as separating these two types of variants (exon/intron variants) and analyzing the frequency of the overlapping UPF1 mutations.
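
      The requested breakdown could be tabulated along the following lines. This is a minimal sketch under stated assumptions: the variant records are invented for illustration, and the cutoffs (rare below 1%, low-frequency below 5%) are a commonly used convention assumed here, not values taken from the manuscript or from any database.

```python
from collections import Counter

def frequency_class(allele_freq, rare_cutoff=0.01, low_cutoff=0.05):
    """Classify a variant by population allele frequency.

    None means the variant was not found in the reference database.
    The cutoffs are assumed conventions, not values from the manuscript.
    """
    if allele_freq is None:
        return "novel"
    if allele_freq < rare_cutoff:
        return "rare"
    if allele_freq < low_cutoff:
        return "low-frequency"
    return "common"

def tabulate(variants):
    """Count variants per (region, frequency class) pair."""
    return Counter((region, frequency_class(af)) for region, af in variants)

# Hypothetical records: (region, allele frequency or None if absent).
variants = [
    ("intron", 0.12),
    ("intron", 0.03),
    ("intron", 0.004),
    ("exon", None),
    ("exon", None),
    ("exon", 0.0005),
]
table = tabulate(variants)
```

      A table of this kind would show directly whether the overlap with germline variation is concentrated among common intronic variants or extends to rare and exonic ones.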

    2. Reviewer #2:

      This paper aims to resolve the disparity between one report (Liu et al., 2014), which described somatic mutations in pancreatic adenosquamous carcinoma (PASC) that did not typify normal pancreatic tissue of the patients, and other reports (Witkiewicz et al., 2015; Fang et al., 2017; Hayashi et al., 2020), which did not find these mutations. The authors show here that many (40%) of the mutations described by Liu et al. typify genetic variations in the human population at large, and they suggest that these mutations are not pathogenic, e.g. are not drivers of PASC, and also not somatic but, rather, are genetic in origin.

      The authors use CRISPR-Cas9 to generate, in mouse pancreatic cancer (KPC) cells, which harbor Kras and Tp53 gene mutations as do PASC patients, a Upf1 gene (and thus its product mRNA) lacking exons 10 and 11, a deletion that Liu et al. reported not only inhibits NMD by disrupting UPF1 helicase activity but also promotes tumorigenesis. After injection into mice, the authors found no detectable effects on pancreatic cancer growth compared to the injection of control cells.

      The authors acknowledge that mice may differ from humans. Thus next, rather than using mini-UPF1 genes, as did Liu et al., the authors introduced two of the Liu et al. mutations separately into the UPF1 gene of HEK293T cells. In contrast to Liu et al., the authors found modestly increased NMD efficiency and no evidence of UPF1 pre-mRNA mis-splicing. The authors note that this makes sense since these mutations are found in people not as somatic mutations but genetic mutations, and thus would not be expected to inhibit NMD given the importance of NMD to aspects of human development in utero and beyond.

      This is a very well-written paper describing carefully executed experiments that lead the reader to discount three claims made about UPF1 gene mutations in PASC as described by Liu et al., namely, that these mutations: (i) have a somatic origin, (ii) lead to UPF1 pre-mRNA mis-splicing so as to inhibit NMD, and (iii) promote tumorigenesis. The authors are careful not to over-interpret their data.

      Specific comments:

      Page 4, in reference to Figure 1f. It is unexpected that the variations in UPF1 protein levels were "uncorrelated with NMD efficiency". Possibly, this reviewer doesn't understand what the authors mean. Please clarify.

      Additionally, in this regard, it is better to draw conclusions about NMD efficiency by measuring more than just the efficiency with which mRNA from a reporter construct is targeted for NMD. It is recommended that the authors assay the levels of a few (e.g. three) cellular NMD targets, normalized to the level of their pre-mRNA to control for any changes to gene transcription.

    3. Reviewer #1:

      This manuscript identifies that the UPF1 variants previously reported as frequent somatic mutations in pancreatic adenosquamous carcinoma are actually germline genetic variants with no clear effects on UPF1 splicing, protein splicing, or nonsense mediated decay. Given that the manuscript challenges a striking finding from a prior study that has not been validated in subsequent studies, it is important to publish to correct the literature. At the same time, several points should be clarified to make sure the data are as comprehensive as possible:

      1) In the experiments evaluating the effect of skipping exons 10-11 of UPF1, it is surprising that this genetic perturbation in UPF1 is actually tolerated in these cells, as UPF1 is an essential gene in most cancer cell lines (this point has also likely motivated the current study). Also, the Western blots for UPF1 protein are not particularly clear (Supplementary Figure 1c), and the fact that this perturbation does not affect the growth of KPC cells does not prove that UPF1 alterations are not tumorigenic. Have the authors checked whether UPF1 is still downregulated and mis-spliced in the cells following in vivo growth? A simple in vitro competition assay between UPF1 exon 10-11 targeted cells and control sgRNA cells would also be helpful. It would also be helpful to evaluate whether NMD is altered in these cells given these issues.

      2) Although it is clear that the authors have used similar minigene assays as were used in the original publication, a more systematic evaluation for potential alteration in NMD with UPF1 variants (via RNA-seq) would be helpful given that this work questions the prior publication.

      3) Do the authors believe that the UPF1 variants reported as mutations initially in PASC are actually SNPs? The terminology describing what these variants are could be a little clearer in the Abstract and Discussion.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript. Eric J Wagner (University of Texas Medical Branch) served as the Reviewing Editor.

      Summary:

      The authors have sought to address what has become a considerably debated topic of whether mutations in Upf1 are tumorigenic in pancreatic adenosquamous carcinoma. Specifically, the authors introduced Upf1 mutants found in pancreatic tumors into pancreatic adenosquamous carcinoma cells, and found they did not provide significant advantage for tumor progression. Moreover, the authors described how a significant percentage of Upf1 mutants observed in pancreatic carcinoma are also present as variants in the human population, raising further doubts about their potential role as cancer drivers. Altogether, this work provides further evidence as to whether Upf1 disruptive mutations represent driving factors in pancreatic adenosquamous carcinoma.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      This is a fascinating and beautifully written article about the possible evolutionary relationship between two major protein superfamilies - the P-loop NTPases and the Rossmanns. Both are ancient and highly diverse superfamilies, containing a significant proportion of all extant domain sequences, and were probably amongst the earliest enzyme superfamilies to emerge in evolution. No major evolutionary classification of proteins, such as SCOP, reports evolutionary relationships between them.

      Both share the same structural architecture of a beta-alpha-beta 3-layer sandwich and have an intriguing number of other shared structural features including the location of the binding site for phospho-ligands. However, whilst both bind phosphorylated ribonucleosides, the mode of binding differs and also the manner in which these compounds are exploited. Furthermore, there are differences in the topologies of the folds possibly suggesting distinct evolutionary trajectories. The Rossmanns appear to be more structurally conserved, whilst the P-Loops vary more in their topologies and possibly represent less stable arrangements of beta-sheets and alpha-helices. The authors have brought together several strands of evidence to explore possible evolutionary relationships. Detailed structural analyses allow the authors to explicitly detail the significant shared structural features, for example similarities in the mode of binding the phosphate moiety in the ligand. The structural features are well described and there are appropriate illustrations visualising key differences and similarities. The shared features of the phosphate binding site likely emerged and were favoured early in evolution, as supported by other analyses reported by Longo et al. However, as the authors point out, there are other compelling similarities, including the equivalent location of this site in the first beta-loop-alpha element in both superfamilies, which is not a necessary constraint of phosphate binding; the authors support this by giving examples of phosphate binding at the tip of alpha-4. In addition, they provide evidence supporting the common involvement of beta-2, which contains the conserved Asp in the Rossmann common ancestor. The Walker-B Asp in the P-loops is also at the tip of the beta-strand adjacent to beta-1, as in the Rossmanns, although this is an inserted strand relative to the Rossmann topology.

      The authors propose feasible evolutionary scenarios for how the P-Loops and Rossmanns may have diverged to acquire additional secondary structure elements extending the common beta-PBL-alpha-beta-Asp feature present in both superfamilies. Further compelling evidence is given by detection of a bridging protein - Tubulin - linking the two superfamilies. This has the distinct Rossmann topology but binds GTP in the P-loop NTPase mode. Furthermore, the GTP is hydrolysed by water activated by a ligated metal dication. Final support is given by reporting common sequence themes between the P-loop enzyme HPr kinase/phosphatase and some Rossmann proteins. The authors present further interesting and detailed analyses of similarities between the proteins sharing this unusual theme. The evidence provided by the authors for the shared beta-PBL-alpha-beta-Asp fragment seems very strong to me and has been presented in an interesting and informative way. Of course, it is not possible to know the subsequent evolutionary trajectories but the scenarios presented seem plausible.

      We thank the reviewer for their encouraging remarks on our manuscript.

      **I only have minor comments**

      1) SCOP2 provides information on links between superfamilies based on rare sequence or structural features. Have the authors checked this resource for any details on the beta-PBL-alpha-beta-Asp fragment? Or perhaps consulted with Alexey Murzin about this feature?

      The classification of Rossmann and P-Loop proteins in SCOP2 is consistent with the ECOD classification scheme. For further confirmation, we wrote to Alexey Murzin, and he replied that Rossmanns and P-Loops are annotated as two separate evolutionary lineages, termed “hyperfamilies” in SCOP2. He found our new evidence compelling, but noted that, given the current criteria for shared ancestry, P-loops and Rossmanns remain separate lineages.

      2) I was rather confused by the way in which EC annotations were collected for the two superfamilies, i.e. via Pfam. Wouldn’t it be better to use SUPERFAMILY, as the domain structures would map directly to these sequence relatives? I’m also surprised that they only took the common EC from a Pfam family, since the aim of this analysis was to identify how many different enzyme functions the two superfamilies supported. Pfam does not classify by function and so inevitably groups functionally diverse relatives. However, to get the full range of enzyme functions supported by these superfamilies, I would have thought all non-redundant EC functions across these constituent Pfam families should be counted. Perhaps I have misunderstood.

      We have updated the analysis to make use of the SUPERFAMILY database and, as per your suggestion, we now count all non-redundant EC numbers. Although the EC number counts have somewhat changed, the major point – that these are exceptionally diverse evolutionary lineages – has not.
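
      The difference between the two counting schemes raised in this comment can be illustrated with a small sketch. The family names and EC assignments below are hypothetical, and "common EC" is read here as the most frequent EC number within a family, which is an assumption about the original procedure.

```python
from collections import Counter

# Hypothetical data: each family maps to the EC number annotated
# on each of its member sequences.
family_members = {
    "family_A": ["1.1.1.1", "1.1.1.1", "1.1.1.2"],
    "family_B": ["1.1.1.1", "1.1.1.1", "2.7.1.1"],
    "family_C": ["3.6.1.3", "3.6.1.3", "3.6.4.12"],
}

def count_most_common_ec(families):
    """Distinct ECs when each family contributes only its most frequent EC."""
    return len({Counter(ecs).most_common(1)[0][0] for ecs in families.values()})

def count_all_nonredundant_ec(families):
    """Distinct ECs when every EC seen in any family is counted once."""
    return len({ec for ecs in families.values() for ec in ecs})

per_family = count_most_common_ec(family_members)
non_redundant = count_all_nonredundant_ec(family_members)
```

      With these toy data the per-family scheme yields 2 distinct EC numbers while the non-redundant scheme yields 5, which is why the choice of scheme changes the absolute counts even if the qualitative conclusion about diversity is unaffected.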

      3) The authors refer to a set of previously curated ‘themes’ and allude to a methodology that will be reported in a forthcoming manuscript. The idea of identifying rare themes and then using them to locate very distant homologues is appealing. However, I think some details should be provided here. For example, some brief details on the technology for detecting the themes and thresholds on significance. How rare are they and how conserved do these fragments need to be between superfamilies to join their curated list? Furthermore, how many of these curated themes are similar to the one reported in their article and do they get crosslinks to other superfamilies based on closely related themes? ie how unique is this theme to the P-loop and Rossmanns and are there closely related themes linking these two superfamilies to other superfamilies? I would imagine it is quite a distinct theme but I would have liked to see a few more details on this to reassure that there are no closely related themes.

      We have updated the manuscript to include a more detailed description of the methods used to detect bridging themes shared between the Rossmann and P-Loop evolutionary lineages. In addition, we now include a supplemental table (Table S2) with all of the initial hits from the theme analysis.

      4) The authors have built model structures to allow them to estimate ligand location in proteins with no structural characterisation. It would be helpful if they reported the degree of sequence similarity between the query and template proteins and also the model quality.

      We have updated this section to include more details. In addition, we have identified a structure from the same T-group to serve as our ligand donor. The updated ligand donor is more closely related to 1ko7 than the previous ligand donor, though the positioning of the ligand is effectively unchanged. We note that the global sequence identity to both the previous and new ligand donor is low (less than 30% sequence identity). However, the phosphate binding loops align well in both sequence and structure, as is detailed in the revised Methods section.


      The study by Longo et al. was devoted to evolutionary history of P-loop NTPases and Rossmann fold proteins. Although not related in sequence, the two protein families share some structural features that imply that they could be diverged from a common ancestor. Using bioinformatic analyses, the study under review identified some bridge proteins (of tubulin family) that share themes of both P-loops and Rossmanns, offering a possible support for the common ancestry. A minimum ancestral peptide structure is proposed based on the analysis and its possible diversification trajectory is hypothesized. Even though the divergence scenario is clearly outlined, the authors do not over-interpret the observations and admit that convergence could still explain the scenario. The methodology and results are sufficiently described and conclusions are explained in detail. Although it would be really interesting to design an experimental study to support the conclusion (and I suppose that the authors will do that), that is clearly outside the scope of this bioinformatic study.

      Obtaining experimental evidence for our hypothesis is far from trivial. Modern proteins, including the bridging ones identified here, may not be amenable to exchange due to differing contexts (epistasis). Still, we agree that highlighting experimental directions is a good idea. We have updated the sections From an ancestral seed to intact domains and Conclusion to include a brief discussion of experiments that may help test our hypotheses about the evolution of these protein lineages.

      I would not propose any major changes to the manuscript as I think that the message is very clear. **Minor comments:** (1) In the results section, the text is very clear but tends to be repetitive in places. I think the manuscript would be more readable if some sections were more to the point.

      We have edited the manuscript to remove cases of unnecessary repetition in the results section and throughout.

      (2) There are probably a few typos or unclear sentences, e.g. pg 5, mid-page, "The core, most common topology..."; pg 12, three lines from the bottom "(where this element in canonical", probably should be "is canonical"; pg 11, mid page "the mode of binding of the catalytic dication of tubuling (often Ca2+)" - all the structures listed in Table S1 list Mg2+, so "often" is a bit misleading.

      We have corrected the unclear sentences and typos noted above, as well as a few others.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The study by Longo et al. was devoted to evolutionary history of P-loop NTPases and Rossmann fold proteins. Although not related in sequence, the two protein families share some structural features that imply that they could be diverged from a common ancestor. Using bioinformatic analyses, the study under review identified some bridge proteins (of tubulin family) that share themes of both P-loops and Rossmanns, offering a possible support for the common ancestry. A minimum ancestral peptide structure is proposed based on the analysis and its possible diversification trajectory is hypothesized.

      Even though the divergence scenario is clearly outlined, the authors do not over-interpret the observations and admit that convergence could still explain the scenario. The methodology and results are sufficiently described and conclusions are explained in detail. Although it would be really interesting to design an experimental study to support the conclusion (and I suppose that the authors will do that), that is clearly outside the scope of this bioinformatic study.

      I would not propose any major changes to the manuscript as I think that the message is very clear.

      Minor comments:

      (1) In the results section, the text is very clear but tends to be repetitive in places. I think the manuscript would be more readable if some sections were more to the point.

      (2) There are probably a few typos or unclear sentences, e.g. pg 5, mid-page, "The core, most common topology..."; pg 12, three lines from the bottom "(where this element in canonical", probably should be "is canonical"; pg 11, mid page "the mode of binding of the catalytic dication of tubuling (often Ca2+)" - all the structures listed in Table S1 list Mg2+, so "often" is a bit misleading.

      Significance

      I think this is a very interesting analysis of the evolutionary history of the P-loop and Rossmann fold families, which are considered among the most ancient and abundant protein folds. That also makes them of high interest for the origins of protein structure. The results are not firmly conclusive (because of the limits of such analyses), making the outcomes of the study partly hypothetical. I think that outlining suggestions for future experiments that could test the hypothesis would make the study more valuable to a broader audience.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This is a fascinating and beautifully written article about the possible evolutionary relationship between two major protein superfamilies - the P-loop NTPases and the Rossmanns. Both are ancient and highly diverse superfamilies, containing a significant proportion of all extant domain sequences, and were probably amongst the earliest enzyme superfamilies to emerge in evolution. No major evolutionary classification of proteins, such as SCOP, reports evolutionary relationships between them.

      Both share the same structural architecture of a beta-alpha-beta 3-layer sandwich and have an intriguing number of other shared structural features including the location of the binding site for phospho-ligands. However, whilst both bind phosphorylated ribonucleosides, the mode of binding differs and also the manner in which these compounds are exploited. Furthermore, there are differences in the topologies of the folds possibly suggesting distinct evolutionary trajectories. The Rossmanns appear to be more structurally conserved, whilst the P-Loops vary more in their topologies and possibly represent less stable arrangements of beta-sheets and alpha-helices.

      The authors have brought together several strands of evidence to explore possible evolutionary relationships. Detailed structural analyses allow the authors to explicitly detail the significant shared structural features, for example, similarities in the mode of binding the phosphate moiety in the ligand. The structural features are well described and there are appropriate illustrations visualising key differences and similarities.

      The shared features of the phosphate binding site likely emerged and were favoured early in evolution, as supported by other analyses reported by Longo et al. However, as the authors point out, there are other compelling similarities, including the equivalent location of this site in the first beta-loop-alpha element in both superfamilies. This location is not a necessary constraint of phosphate binding, and the authors support this by giving examples of phosphate binding at the tip of alpha-4. In addition, they provide evidence supporting the common involvement of beta-2, which contains the conserved Asp in the Rossmanns' common ancestor. The Walker-B Asp in the P-loops is also at the tip of the beta-strand adjacent to beta-1, as in the Rossmanns - although this is an inserted strand relative to the Rossmann topology. The authors propose feasible evolutionary scenarios for how the P-Loops and Rossmanns may have diverged to acquire additional secondary structure elements extending the common beta-PBL-alpha-beta-Asp feature present in both superfamilies.

      Further compelling evidence is given by detection of a bridging protein - Tubulin - linking the two superfamilies. This has the distinct Rossmann topology but binds GTP in the P-loop NTPase mode. Furthermore, the GTP is hydrolysed by water activated by a ligated metal dication. Final support is given by reporting common sequence themes between the P-loop enzyme HPr kinase/phosphatase and some Rossmann proteins. The authors present further interesting and detailed analyses of similarities between the proteins sharing this unusual theme.

      The evidence provided by the authors for the shared beta-PBL-alpha-beta-Asp fragment seems very strong to me and has been presented in an interesting and informative way. Of course, it is not possible to know the subsequent evolutionary trajectories but the scenarios presented seem plausible.

      I only have minor comments

      1) SCOP2 provides information on links between superfamilies based on rare sequence or structural features. Have the authors checked this resource for any details on the beta-PBL-alpha-beta-Asp fragment? Or perhaps consulted with Alexey Murzin about this feature?

      2) I was rather confused by the way in which EC annotations were collected for the two superfamilies, i.e. via Pfam - wouldn't it be better to use SUPERFAMILY, as the domain structures would map directly to these sequence relatives? I'm also surprised that they only took the common EC from a Pfam family, since the aim of this analysis was to identify how many different enzyme functions the two superfamilies supported. Pfam does not classify by function and so inevitably groups functionally diverse relatives. However, to get the full range of enzyme functions supported by these superfamilies I would have thought all non-redundant EC functions across these constituent Pfam families should be counted. Perhaps I have misunderstood.

      3) The authors refer to a set of previously curated 'themes' and allude to a methodology that will be reported in a forthcoming manuscript. The idea of identifying rare themes and then using them to locate very distant homologues is appealing. However, I think some details should be provided here. For example, some brief details on the technology for detecting the themes and thresholds on significance. How rare are they and how conserved do these fragments need to be between superfamilies to join their curated list? Furthermore, how many of these curated themes are similar to the one reported in their article and do they get crosslinks to other superfamilies based on closely related themes? I.e., how unique is this theme to the P-loop and Rossmanns and are there closely related themes linking these two superfamilies to other superfamilies? I would imagine it is quite a distinct theme but I would have liked to see a few more details on this to reassure that there are no closely related themes.

      4) The authors have built model structures to allow them to estimate ligand location in proteins with no structural characterisation. It would be helpful if they reported the degree of sequence similarity between the query and template proteins and also the model quality.

      Significance

      This article presents compelling new evidence on the evolutionary relationship between two major, ancient enzyme superfamilies. As far as I'm aware these insights are novel and the detection of the bridging protein relative and the common 'theme', i.e. beta-PBL-alpha-beta-Asp fragment, is a new discovery.

      This work makes an important contribution to understanding the evolution of two major enzyme superfamilies and the insights can guide future evolutionary studies and protein design studies.

      The audience will be structural and evolutionary biologists, both experimental and computational.

      My expertise is in protein evolution and protein structure analyses and I have published a number of reviews and articles analysing and discussing Rossmann-like superfamilies.

    1. Reviewer #3:

      Kinsler et al measure the fitness of 292 mutants, which were recovered from a previously performed experimental evolution in glucose-limited batch culture conditions, using barseq in 45 different conditions. They analyze the matrix of individual fitness measurements in different conditions using dimensionality reduction (singular value decomposition) and then study the explanatory power of the matrix decomposition. Although 95% of the variance is explained by the first vector, they identify 7 additional orthogonal vectors that explain a significant fraction of the remaining 5% of variance. They find that this reduced dimensionality representation of fitness profiles is able to predict mutant fitness in conditions similar to that in which the evolution experiment was performed and in environments that differ from the original selection experiment. They observe that different adaptive mutations have different effects across environments despite having similar fitness effects in the selective environment. From these findings the authors conclude that adaptive mutations affect a small number of phenotypes in the condition in which they are selected, but that they have the potential to affect additional phenotypes across conditions, concluding that adaptive mutations are locally modular, but globally pleiotropic.
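
      For readers less familiar with this kind of analysis, the following is a minimal sketch of how the variance explained by each singular vector of a mutant-by-condition fitness matrix can be computed. It is not the authors' pipeline: the data are synthetic, the function names `variance_explained` and `low_rank_approximation` are hypothetical, and only the matrix dimensions (292 mutants, 45 conditions) are taken from the summary above.

      ```python
      import numpy as np

      def variance_explained(fitness):
          """Fraction of the total squared signal captured by each singular value."""
          s = np.linalg.svd(fitness, compute_uv=False)
          return s**2 / np.sum(s**2)

      def low_rank_approximation(fitness, k):
          """Reconstruct the fitness matrix from its first k singular components."""
          u, s, vt = np.linalg.svd(fitness, full_matrices=False)
          return (u[:, :k] * s[:k]) @ vt[:k, :]

      # Synthetic stand-in data (an assumption, not the published measurements):
      # three latent "phenotypes" plus a small amount of measurement noise.
      rng = np.random.default_rng(0)
      n_mutants, n_conditions, true_rank = 292, 45, 3
      latent_mutant = rng.normal(size=(n_mutants, true_rank))
      latent_env = rng.normal(size=(true_rank, n_conditions))
      noise = 0.01 * rng.normal(size=(n_mutants, n_conditions))
      fitness = latent_mutant @ latent_env + noise

      frac = variance_explained(fitness)
      approx = low_rank_approximation(fitness, true_rank)
      print("variance explained by first 3 components:", frac[:3].sum())
      ```

      In a sketch like this, the number of components retained is a modelling choice; in practice it would need to be justified against the measurement noise, which is the issue raised in the comments below.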

      This experimental study is well performed and the data analysis is clear and comprehensive. The authors have done an exemplary job in describing their study with clear and scholarly writing.

      However, the central question is whether the conclusions of the study are justified. The authors' goal is to establish a "genotype-phenotype-fitness" map, but as they state, "our phenotypic dimensions are not necessarily comparable to what people traditionally think of as a 'phenotype'". Indeed, I agree that what the authors have identified are not phenotypes at all but are instead properties of the genotype-fitness map assayed in different conditions. These properties are themselves interesting; however, describing them as phenotypes (observable and measurable traits of an organism), or even inferring the number of phenotypes they represent, is incorrect. Therefore, I am not convinced that the authors have achieved their goal of defining a genotype-phenotype-fitness map.

      Key points that the authors should consider:

      -The central conclusion is not supported. The authors claim that adaptive mutations affect a small number of phenotypes in the evolved conditions, but many phenotypes over different conditions. But this conclusion cannot be drawn from the results. Why is the following scenario incompatible with the results: hundreds of "phenotypes" (e.g. the expression of 100 genes) underlie enhanced fitness in the adapted environment, but a change in the environment means that only 10 of those genes are expressed (i.e. fewer "phenotypes"), and thus the fitness effect is different in that environment? In that scenario the overall conclusion would be completely the opposite. Perhaps constructing a mechanistic model and performing simulations that explore these different possibilities would strengthen the argument.

      -A primary result of the study is that mutations that are beneficial in one condition are frequently deleterious in other conditions. This phenomenon of antagonistic pleiotropy has been described innumerable times in the experimental evolution literature - indeed, it seems to be the rule rather than the exception - and these prior observations should be more clearly described.

      -The extent to which the results are dependent on the number of environments is not investigated. For example, reducing the number of "similar" environments would likely decrease the variance explained by the first singular value as would increasing the diversity of environments that are studied. How does this variation impact the results and interpretation?

      -In figure 2, it looks like fitness is defined relative to the most fit genotype. Typically, in experimental evolution fitness is defined relative to the ancestor. Perhaps defining ancestral fitness as zero for the SVD is necessary, but this is atypical based on similar studies and may be a source of confusion for readers.

      -In figure 2C an idea of the variance is given for the EC conditions, but not for the other conditions. Some measure of uncertainty for fitness in each condition would help (given the 2-4 replicates of each).

      -Why not use an ancestral strain without a barcode for competition assays, rather than having to digest the ancestral barcode with restriction enzymes?

      -A cutoff of 1000 reads for a time point with 400 strains seems really low (or is it supposed to be reads/strain?).

      -The arrows in figure 2C are unexplained.

    2. Reviewer #2:

      In the manuscript titled "A genotype-phenotype-fitness map reveals local modularity and global pleiotropy of adaptation," the authors describe an approach for uncovering the phenotypic complexity that underlies fitness by tracking hundreds of experimentally-evolved adaptive mutants across a range of environments. This approach yields a genotype-phenotype-fitness map without actually naming and measuring the phenotypes themselves. Instead, by perturbing environmental conditions and measuring mutant fitness across environments, the authors develop a model that reveals a collection of abstract phenotypes that contribute significantly to fitness. The authors find that a low-dimensional phenotypic model is sufficient for capturing fitness of the panel of mutants across subtle environmental perturbations - which suggests that only a few phenotypes contribute to fitness near the evolution conditions. Further, the model accurately predicts fitness in environments that deviate from the evolution condition, often through components that contribute little to fitness near the evolution condition - which suggests that adaptive mutants have latent phenotypic effects that only impact fitness in distant environments. These findings lead the authors to conclude that adaptive mutations are locally modular yet globally pleiotropic, thereby lending valuable insight into our understanding of how adaptive mutations affect the complex physiological interconnectedness of the cell.

      Overall, I am very impressed with the work described in the manuscript. The manuscript is well-written, especially considering the conceptual depth of the topic and novelty of the approach. The experiments were elegantly designed and adopt a variety of molecular tools developed recently within the field. The figures are appealing and present the data in a clear manner. The conclusions are justified by the data, and the findings represent a significant contribution to the field.

    3. Reviewer #1:

      The distribution of pleiotropic effects of mutations selected in a particular environment is of broad and fundamental significance. We've known for a while from large and even larger-scale screens of beneficial genetic variation that the rising tide of these mutants in the focal environment often lifts other boats in neighboring conditions, but not in orthogonal conditions, where outcomes are unpredictable. This beautifully written, executed, and analyzed study shows that we actually can gain predictability if the number of environments scales to dozens, mutants scale to hundreds, and most importantly, multidimensional analyses are taken seriously enough to derive the most salient predictor variables. Here, the magic number is 8 parameters, and the authors do a great job of justifying this decision given the noise of batch effects and the surprising power of the few, less explanatory parameters in the selective environment to explain variation in the more foreign environments.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary:

      The distribution of pleiotropic effects of mutations selected in a particular environment is of broad and fundamental significance. We've known for a while from large and even larger-scale screens of beneficial genetic variation that the rising tide of these mutants in the focal environment often lifts other boats in neighboring conditions, but not in orthogonal conditions, where outcomes are unpredictable. This well written, executed, and analyzed study shows that we actually can gain predictability if the number of environments scales to dozens, mutants scale to hundreds, and most importantly, multidimensional analyses are taken seriously enough to derive the most salient predictor variables. The authors find that a low-dimensional phenotypic model is sufficient for capturing fitness of the panel of mutants across subtle environmental perturbations - which suggests that only a few phenotypes contribute to fitness near the evolution conditions. Further, the model accurately predicts fitness in environments that deviate from the evolution condition, often through components that contribute little to fitness near the evolution condition - which suggests that adaptive mutants have latent phenotypic effects that only impact fitness in distant environments.

    1. Reviewer #3:

      This article asks whether within-trial (present) and ITI (past) task parameters are encoded in mPFC, and how the encoding during these two trial epochs is related. They claim that firing in mPFC reflects past and present, but population encoding of past and present are independent. Further, they show that the present is reactivated during sleep, not the past.

      On the face of it, this seems like an interesting paper. It is novel in that ITI encoding would be highly related to what was going on in the trial. The sleep finding is also interesting but I don't quite get the distinction between present and past for sleep. That could use some clarification.

      1) I'm not an expert in regards to this type of analysis, but throughout I was left with the feeling that I would prefer at least some single neuron data and firing rate analysis to complement the highly computational analysis, which frankly, was difficult to understand or critique by somebody who is not an expert.

      2) I would have liked to see more analysis of firing correlations with behavior. It seems to me if animals were doing different things during the trial and the ITI, then it might not be a surprise that there is independent encoding.

      3) I also wonder if the finding is solely dependent on the task (which is poorly described). It seems like there should be independent coding of past and present in this circumstance because they do not feed into each other, and behavior during one is independent of behavior in the other.

      4) Relatedly, the authors suggest that independent encoding can explain how the brain resolves interference between past and present, but in this task there was no interference between past and present, and the authors do not show that when there is more or less dependent encoding that there is more or less interference. Without this, it is unclear how important this finding is as it relates to performance and general mPFC function.

      5) Could activity reflect what the animal predicts will happen on the next trial, or what they are planning to do? It wasn't clear if that was examined.

      6) I have some issue with the definition of past and present in the context of this task. More justification should be provided.

    2. Reviewer #2:

      The study by Maggi and Humphries re-examines data by Peyrache et al. (2009), which the authors have themselves analysed previously (Maggi et al., 2018), recorded in rat prelimbic/infralimbic cortex (see comment below on terminology). In particular, they look at the relationship between decoding of task events during performance of a trial, and during the subsequent intertrial interval. (n.b. in this study, unlike in many studies, the ITI is considerably longer than the trial period). They find that although task-relevant information can be decoded during these two periods, the information is encoded in orthogonal subspaces during trials ('the present') and ITIs ('the past'). They build on this to examine how information is encoded during sleep following training (vs a pre-training control period). They find that only the trial subspaces are reactivated during sleep, not the ITI subspaces, and more so if the rat received a higher rate of average reward.

      On the whole, I found this an interesting paper with a clear set of findings, and well-analysed data. Although the advance is in some ways an incremental one on previous studies of sleep/replay, and on the authors' previous analyses of this dataset, the study will undoubtedly be of interest to researchers who are interested in consolidation of past experience during sleep. In particular, the study benefits from being able to look for two different types of information ('past' and 'present' decoders) in the same sleep recording sessions. There were a few things that I felt the authors could address:

      1) For the cross-decoding analysis in figure 2 b, it is not entirely clear from the main text which part of the trial and ITI coding is being used here. It seems to me like a more useful way of showing the cross-decoding analysis would be to show the 10x10 matrix of cross decoding accuracy for each of the 5 maze positions in both trials and ITIs. This is, I think, different from what the analysis in figure 3g is trying to show (which plots the classification error after dimensionality reduction to a 2D space).

      2) It was surprising to me that the authors do not mention the finding in figure 4e anywhere in the abstract or introduction. It makes the reactivation story far more compelling if it can be linked to a change in behaviour during the preceding trials. I think this finding would benefit from not being buried deep in the results section.

      3) The finding in figure 5 seems slightly extraordinary. It suggests that reactivation decoding during sleep is reliable even if very long bins of activity are used to calculate the firing rate (e.g. up to 10s). Does this relationship ever break down? Presumably with the sleep data, it would be possible to extend bins up to 1 minute, 5 minutes, etc. If there is still more reactivation at these extremely long time-bin lengths, does this mean that these neurons are essentially more persistently active? One possible way to test for this might be to project the data recorded during sleep through the classifier weights, and then calculate the autocorrelation function of this projected data (e.g. Murray et al., Nat Neuro 2014) - if this activity becomes more persistent, the shape of the ACF may change post-training.

      4) I disagree with the use of the term 'medial prefrontal cortex' to describe this area of the rodent brain. Although this is the term used in the original paper by Battaglia et al. (2009), I would suggest the authors use the more anatomically precise description of 'prelimbic/infralimbic cortex', and mention that the recordings are ~2.7mm anterior to bregma (see supplementary figure 1 of Battaglia 2009 paper; see Laubach et al., eNeuro 2018 for further discussion on terminology). Also, when the authors discuss these recordings in the context of the wider literature, it is difficult to know how to relate activity in this dysgranular region of the rodent brain to regions of granular prefrontal cortex in the primate brain - given the anatomical correspondence between rodents and primates is very uncertain for these granular regions (e.g. citations to Schuck et al., 2015; Averbeck and Lee, 2006; etc). It would be good to acknowledge this somewhere.

    3. Reviewer #1:

      Maggi and Humphries examined how the coding of the present and past choices in the medial prefrontal cortex (mPFC) of the rats during a Y-maze task overlaps and whether they can be reliably distinguished. They found that the neural signals related to the animal's choice in the present and past are distinct and as a result they can be recalled separately, for example, during post-training sleep. Although these are very important questions and an interesting set of analyses have been applied, the results in this report are not entirely convincing, because the analyses did not successfully exclude some alternative hypotheses.

      1) The authors analyzed the signals related to the choice, light cue, and outcome separately, and this is possible because the relationship between the animal's choices and cues were decoupled by testing the animals under at least two different rules. There were a total of 4 alternative rules and different sessions included different subsets of these rules. It is possible that at least some results reported in this paper might vary depending on which of these rules were tested. For example, rules might affect how the animals learned the task. Therefore, the authors should provide more detailed information about how often different rules were used to collect the neural data reported in this paper, and whether any of the results change according to the rules used in a given session.

      2) The authors claim that the neural coding identified in this study does not depend on the signals in individual neurons by showing comparable results after removing the neurons with significant modulations. This logic is flawed, because the neurons without "significant" modulations might still include meaningful signals due to type II errors. Furthermore, if individual neurons carry absolutely no signals, how can a population of neurons still encode any signals? This might suggest some kind of joint coding, and the authors should not merely implicate such a possibility without more thorough tests.

      3) The authors analyzed the activity divided into 5 different epochs, where the position #3 corresponds to a choice point and #5 corresponds to the reward site. Therefore, it is surprising that the reliable outcome signals begin to emerge from the position #3 (i.e., choice point). Is this a false positive?

      4) The authors report that there is no retrospective coding, i.e., no coding of the choice in the previous trial. By contrast, during the intertrial interval (while the animal is returning to the start position), the signals related to the "past" choice were still present but different from how this information was coded earlier during the trial. This is not surprising since during the intertrial interval, the animal's movement direction is opposite compared to that during the trial, so this coding change could reflect the animal's sensory environment. Whether the brain encodes the past and present events using different coding schemes or not cannot be tested with such confounding.

      5) The authors tested whether the coding of present and past events is consistent using a transfer (cross-decoding) analysis. However, this is based simply on correlation, and does not exclude the possibility that neurons changing their activity similarly according to (for example) the animal's choice might also change their baseline activity between the two periods (as revealed by the analysis of "population activity" in Figure 3) or might additionally encode different variables. In this case, decoding based on simple correlation might not reveal consistent coding that might be present.

      6) Given the length of the inter-trial interval, it might be informative to examine whether neurons active during the early part of the inter-trial interval might be reactivated differently during sleep compared to those becoming active later during the inter-trial interval.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript. Daeyeol Lee (Johns Hopkins University) served as the Reviewing Editor.

      Summary:

      Although the reviewers have acknowledged the significance of better understanding how neurons in the prefrontal cortex can simultaneously encode signals related to the animal's present and past behaviors, they were concerned that the findings reported in this paper did not control for potential confounding of behavioral variables during the epochs analyzed in this manuscript. They also raised several concerns about the analytical methods used. In the consultation, the reviewers all agree that the advance represented here is not to the level that would be expected by readers and the overall enthusiasm was limited.

    1. Reviewer #3:

      In this manuscript, Schorscher-Petcu et al. describe a very exciting new approach combining precise optogenetic stimulation of cutaneous nerve terminals with high-speed imaging for machine-guided behavior analysis. This work is timely, and there are many clear applications to understand peripheral somatosensory encoding using this strategy. More thorough methodology and guidance for future end users could be provided. However, I am much less enthusiastic about the conclusions drawn for a sparse neural coding hypothesis, based on the data presented. Significant support for this hypothesis would require more substantial revisions, including testing in mouse lines to target other specific sensory modalities, innervation regions, and possibly pain states.

      Substantive concerns:

      1) A major strength would be the ability to combine precise optogenetic stimulation with other behavioral assays. Can this be used in combination with existing nociceptive tests? For example, does the NIR-FTIR allow for tracking of spontaneous pain behaviors after intraplantar formalin or CFA? And can this then also be used to assess sensitization of genetically-identified fibers using scanned optogenetics?

      2) What is the rationale for varying the pulse-widths rather than light intensity for these experiments? Increasing light intensity will generally lead to larger ChR2 photocurrents, while changing light duration generally affects deactivation and desensitization kinetics. At a peripheral terminal, the effects of subthreshold depolarization may in fact mimic the physiological activation of endogenous receptors, like TRP channels. This level of fine-tuned control would be a significant advancement for understanding how information from different somatosensory modalities is processed and integrated.

      3) It would be useful to have more thorough characterization of the strengths and limitations of the optical system. For example, how quickly are the spatially patterned stimuli able to be moved? What is the maximal area for a single spot or array of spots, and how long does this take to scan? Does the time between patterned stimuli, both in a single spot or when spatially distributed, alter withdrawal responses? How quickly can the beam spot size be altered? These will be important points that potential users will need to consider before building this system.

      4) It would also be extremely helpful to provide more thorough details and discussion of implementing DeepLabCut analysis with this system.

      5) The proposed activation of myelinated A fibers is very surprising given the opsin expression patterns in TRPV1:ChR2 mice. The authors cite Arcourt et al.; however, that study did not find any expression of TRPV1 in its genetically defined A-fiber nociceptors. Given this breeding strategy, can the authors please clarify and provide support for this apparent discrepancy?

      6) The response latencies in Figure 3 fit well with the hypothesis that fibers with different conduction velocities are activated by changing pulse areas. Do different stimulus intensities (or durations) preferentially activate A vs C-fiber afferents akin to electrical stimulation of dorsal roots in spinal cord recordings? Or does the larger stimulation area merely increase the probability that an A nerve ending is in the illuminated region? Could this alternatively be explained by additive depolarization or more complex spike interference at these axon collaterals that branch extensively in the skin? Also, do the response profiles vary after activation of a presumptive A vs C-fiber?

      7) Is the pain-related behavior in response to single or patterned optogenetic stimulation reduced by analgesics acting centrally or peripherally? This could reveal important differences between rapid reflex or protective behaviors and more complicated nocifensive responses, and support the authors' claims of true pain-related behaviors.

    2. Reviewer #2:

      The manuscript by Schorscher-Petcu et al. describes a method/system for scanned optogenetic activation of nociceptors on the paw in freely behaving TrpV1-Cre::ChR2 mice, with concurrent measurement of both paw responses (using near-infrared frustrated total internal reflection to measure paw/floor contacts) and full body responses (scored using DeepLabCut). Using this approach, they showed that the number of activated nociceptors governs the timing and magnitude of rapid protective pain-related behavior. The detailed description of how to construct the setup, and the open availability of the software, are useful for other labs to apply this method.

      I have three points that I would like the authors to address:

      1) I have a hard time evaluating the hierarchical bootstrap procedure, which references a pre-print. Is this method really ensuring that the results are more rigorous? Or is it needlessly complicating the reporting of fairly simple metrics for what appear to be obvious phenomena (Figure 3) like paw rise time?

      2) I have an issue with the word "sparse code". In neuroscience in general, sparse code refers to the phenomenon that a given stimulus only activates a very small percentage of neurons in a population. Here the authors refer to a single action potential elicited by optogenetic stimulus. Some other term should be used.

      3) For Figure 4 (whole body movement), the analysis should use a vector instead of a scalar. The example in Figure 4D clearly shows directionality, i.e. the nose moves toward the stimulated paw. But the authors only analyzed maximum distance (a scalar, not a vector). So the correlation in Figure 4F is showing "when body part A moves a lot, does body part B also move a lot". Instead, I think the analysis more in line with the examples would ask whether, when body part A moves in one direction, the direction of movement of body part B is correlated. In other words, the analysis needs to treat distance as some kind of vector, for example movement toward or away from the stimulated paw.
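
      The distinction in point 3 can be illustrated with a minimal sketch. It assumes each body part is tracked as a list of 2D (x, y) coordinates over time; the function names, the toy coordinates, and the choice of projecting net displacement onto the direction of the stimulated paw are all hypothetical illustrations, not the authors' analysis.

```python
import math

def max_distance(track):
    """Scalar measure: largest displacement from the starting position."""
    x0, y0 = track[0]
    return max(math.hypot(x - x0, y - y0) for x, y in track)

def signed_displacement_toward(track, target):
    """Directional measure: net displacement projected onto the unit vector
    pointing from the starting position to the target (e.g. the stimulated paw).
    Positive means the body part moved toward the target, negative means away."""
    x0, y0 = track[0]
    x1, y1 = track[-1]
    tx, ty = target[0] - x0, target[1] - y0
    norm = math.hypot(tx, ty)
    if norm == 0:
        return 0.0
    return ((x1 - x0) * tx + (y1 - y0) * ty) / norm

# Toy data (made up): the paw sits at the origin, the nose starts 10 units away.
paw = (0.0, 0.0)
nose_toward = [(10.0, 0.0), (8.0, 0.0), (6.0, 0.0)]   # nose approaches the paw
nose_away = [(10.0, 0.0), (12.0, 0.0), (14.0, 0.0)]   # nose retreats from the paw

for name, track in (("toward", nose_toward), ("away", nose_away)):
    print(name, max_distance(track), signed_displacement_toward(track, paw))
```

      In this toy case the scalar measure is identical for the two trajectories, while the signed projection separates approach from retreat, which is the information a correlation on maximum distance alone would discard.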

    3. Reviewer #1:

      The manuscript by Schorscher-Petcu is a very innovative study addressing an important problem in pain and somatosensory neuroscience - precise and remote delivery of sensory stimuli. The strength of this work is the experimental paradigm, as the biological insight seems quite weak and not more expansive than previous work from the authors and others in the field. One has to ask, is this work being sold on the tool or new biology? If it were the latter, this work could easily benefit by comparing the data with Trpv1-ChR2 with other sensory neuron populations - as the authors mention in the discussion. Nonetheless, the rationale for such a tool developed here is widely agreed upon in the field, and if others can easily adopt this strategy, this could become the standard for peripheral optogenetic stimulation of the hind paw.

      Major comments:

      1) It remains unclear to me how one actually remotely aims at the hind paw of interest. Is there a joystick where one aims at the paw? Relatedly, are there ever any misfires where one intends to aim at the paw but hits another area? Or does the mouse sometimes move when you intend to hit one area thus causing an unintended stimulus delivery?

      2) In Figure 2 the authors cite their previous studies which demonstrate that a brief optogenetic stimulus to the paw elicits a single action potential which is capable of causing a behavioral response. The authors then infer here that their nanosecond manipulation of light also influences single action potentials. However, without verifying that in this new experimental context, simply citing the older work is insufficient evidence to draw any correlation to action potentials.

      3) In Figure 3 the authors mention that in a fraction of trials (presumably ~35%) the paw moved but did not withdraw, and that this was detected by the acquisition system and not by eye. I am confused about what the authors are considering a paw withdrawal. Is not any paw lift also a withdrawal? Additionally, how can the acquisition system see things that cannot be seen by the experimenter? Could this point towards an error of the system? Is there an independent validation of how well the system is working compared to some benchmark?

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The manuscript by Schorscher-Petcu is a very innovative study approaching an important problem in pain and somatosensory neuroscience - precise and remote delivery of sensory stimuli. This work is timely, and there are many clear applications to understanding peripheral somatosensory encoding using this strategy. The rationale for such a tool developed here is widely agreed upon in the field, and if others can easily adopt this strategy, this could become the standard for peripheral optogenetic stimulation of the hind paw.

    1. Reviewer #3:

      The study by Mangeol et al. aims to dissect the localisations, interactions and hierarchical order of apical protein complexes crucial to the generation and maintenance of epithelial polarity in epithelial tissues.

      They analyse by super-resolution microscopy (STORM) three different mature epithelia, human and mouse intestine as well as mature Caco-2 cells in culture. Using immunofluorescence labeling of endogenous proteins, they compare individual components to markers of tight junctions, to each other and to the actin cytoskeleton. They identify defined clusters in defined sub regions of the apical domain of the analysed cells, raising interesting questions for future analyses.

      The subject matter of the study, the generation and maintenance of epithelial polarity and the role of apical polarity complexes, is clearly a very important one, especially as most organ systems are epithelial in nature. And despite decades of study, many questions are still unresolved.

      The imaging performed in this study is skilful and beautifully presented. The imaging achieving, according to the authors, an isotropic resolution of about 80nm is impressive. Because of this great gain in resolution compared to other studies of similar components I have a couple of technical questions or comments:

      1) I would very much appreciate some comments or thoughts on the fact that polarity proteins were revealed using antibodies. Antibodies are in the range of 10-15nm in length, so with an isotropic resolution of 80 nm, this might have to be taken into account when using primary and secondary antibodies to reveal proteins. In particular, monoclonal versus polyclonal antibodies might have differing effects on localisation precision.

      2) The authors use rather high concentrations of detergent (1% SDS or 1% Triton X-100) for permeabilisation according to their protocols. Are they not worried that this might affect tissue integrity and protein distribution?

      The authors rightly point out where their study fits within what has been attempted by other labs previously in order to understand and dissect apical polarity complex function. They clearly define interesting aspects, such as PALS1-PATJ and aPKC-PAR6 forming independent clusters, and the lack of colocalisation and thus maybe association with Crumbs3. In contrast to the last sentence of their abstract, 'This organization at the nanoscale level significantly simplifies our view on how polarity proteins could cooperate to drive and maintain cell polarity.', I cannot yet see what these results simplify about our understanding of apical polarity complexes and, even more so, what the authors' new model is of how the complexes work. This needs to be spelt out more clearly, please. And I would also point out that, in part, other studies have pointed in the same direction. The recent paper by the Ludwig lab (Tan et al. 2020 Current Biology 30, 2791-2804) points in part in a similar direction, identifying a vertebrate 'marginal zone' similar to the one already known from invertebrate epithelia, as well as identifying basal to this an apical and basal tight junction area. Furthermore, as the authors themselves note in the discussion, the 'splitting away' of Par3 has been observed in Drosophila epithelia (embryonic, follicle cells and eye disc), and should maybe be introduced already at an earlier point of the paper. Furthermore, papers by Wang et al. and Dickinson et al. that also analyse PAR complex clustering should be cited and mentioned in the introduction/discussion (Wang, S.-C., Low, T. Y. F., Nishimura, Y., Gole, L., Yu, W., & Motegi, F. (2017). Cortical forces and CDC-42 control clustering of PAR proteins for Caenorhabditis elegans embryonic polarization. Nature Cell Biology, 19(8), 988-995. http://doi.org/10.1016/S0960-9822(99)80042-6; Dickinson, D. J., Schwager, F., Pintard, L., Gotta, M., & Goldstein, B. (2017). A Single-Cell Biochemistry Approach Reveals PAR Complex Dynamics during Cell Polarization, 1-42. http://doi.org/10.1016/j.devcel.2017.07.024).

      I am also a bit confused by the analysis presented in Figure 5 with regard to colocalisation of components with apical F-actin structures, and by the deduction from these and the EM data that some components, aPKC/Par6, localise to 'the first row of' microvilli near junctions whilst PALS1-PATJ localise near the base of said microvilli. How would localisation to the apical plasma membrane outside of or within microvilli be restricted to only the ones near junctions? There is F-actin not only in microvilli but also all over and near the apical cortex, so what distinguishes the ability of aPKC/PAR6 to bind to actin in microvilli? The PATJ knock-down results are interesting and, I agree, suggestive of some interaction between the complexes and actin organisation. But without further analyses as to what other components might be affected in their localisation in this situation, it is hard to judge whether the effect on actin is a direct or rather an indirect one, so I am unsure what these images add without more in-depth follow-up.

      Some more specific comments:

      Figure 1: It would be good to show and demonstrate that Occludin and ZO-1 labeling are completely interchangeable in terms of localisation precision.

      Figure 3: I do understand the authors' rationale for analysing the localisation in the orientation (planar versus apical-basal) that reveals the largest distance, but it would be good to nonetheless show the other orientation for completeness (maybe as supplementary).

    2. Reviewer #2:

      The manuscript addresses a fundamental problem: the organisation of epithelial polarity determinants at the apical domain of human epithelial cells. The authors use STED microscopy to examine antibody-stained fixed Caco2 cells. My major concern is that the process of fixation and immunostaining may introduce artefacts that are causing the segregated dots to appear. This issue could be addressed by using CRISPR-knockin GFP versions of some of the proteins studied, which is technically straightforward to perform these days, and would allow the conclusions to be drawn with full confidence.

    3. Reviewer #1:

      Mangeol et al investigate the nanoscale organization of apical-basal polarity complexes using super-resolution microscopy approaches (STED) in polarized intestinal epithelial cells, both in culture and from in vivo tissue samples. They provide a careful characterization of Par3-Par6-aPKC and Patj-Pals1-Crb3a localization relative to tight junctions in both planar and apical-basal axes. They find that each protein localizes in the near vicinity of the tight junction, in a clustered organization. Through pairwise colocalization analyses, they observe significant separation of polarity proteins that are generally considered to be part of the same molecular complex based on biochemical assays. Specifically, PAR3 is not associated with aPKC or PAR6, and CRB3a colocalizes poorly with all other polarity proteins.

      Overall, this paper provides a thorough description of polarity protein localization at the submicron scale. The data are presented in a clear and convincing manner and the conclusions are largely consistent with the data. The unexpected separation of polarity proteins suggests that some of the previously described biochemical interactions may be transient, warranting further investigation comparing different stages of polarization. These findings will be of interest to those in the field of cell polarity.

      Comments/concerns:

      1) All of the results depend on antibody quality, specificity, and antigenicity, but no antibody validation is provided (with the exception of PATJ). If one primary antibody is less specific than the others, the colocalization data will be heavily skewed, and the proteins will appear not to colocalize. Perhaps this can explain why Crb3a fails to colocalize with the other proteins? Validating the results with a second primary antibody or an endogenously tagged GFP-fusion protein would alleviate this concern.

      2) The authors show that CRB3a doesn't colocalize with PALS1 or PATJ, suggesting that another transmembrane protein recruits them to the membrane. Could this function be provided by another CRB family member, or is CRB3a the only one expressed in intestinal epithelia?

      3) The super-resolution characterization of actin organization is not as extensive or convincing as the description of polarity protein localization. A closer examination of actin organization relative to PATJ and aPKC at junctional, apical, and villi positions would strengthen the findings in Figure 5.

      4) In some cases the number of biological replicates is small. Only one mouse sample was used, and the quantifications of junctions are performed across just 1 or 2 cell culture replicates (although more replicates were performed, just not used for quantification). Therefore, the data reflect the variability across junctions (violin plots in Figs 1-2) but they don't reflect the variability across biological replicates. This also means the p-value in Figure 5 was calculated using n=number of junctions rather than n=experimental replicates, which would be a more appropriate comparison of means. Quantifying the data across 3 biological replicates to show the variability across experiments would greatly strengthen the results and conclusions.
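
      The replicate concern in point 4 can be made concrete with a minimal sketch. It assumes measurements are stored per biological replicate as lists of per-junction values; the data are made up and the helper names are hypothetical. It contrasts the standard error obtained by pooling all junctions with the one obtained from replicate means.

```python
import math
from statistics import mean, stdev

def sem(values):
    """Standard error of the mean for a list of at least two values."""
    return stdev(values) / math.sqrt(len(values))

def summarize(condition):
    """condition: dict mapping replicate id -> list of per-junction values."""
    pooled = [v for junctions in condition.values() for v in junctions]
    replicate_means = [mean(junctions) for junctions in condition.values()]
    return {
        "n_junctions": len(pooled),            # n if every junction is treated as independent
        "sem_junctions": sem(pooled),
        "n_replicates": len(replicate_means),  # n if the experiment is the unit of replication
        "sem_replicates": sem(replicate_means),
    }

# Made-up example: junctions within a replicate agree closely,
# but the replicates differ from one another.
control = {
    "rep1": [1.0, 1.1, 0.9, 1.0],
    "rep2": [1.5, 1.6, 1.4, 1.5],
    "rep3": [0.5, 0.6, 0.4, 0.5],
}

summary = summarize(control)
for key, value in summary.items():
    print(key, value)
```

      In this toy data set the junction-level standard error is smaller than the replicate-level one, because pooling treats correlated junctions from the same experiment as independent. A p-value computed with n = number of junctions can therefore look more convincing than the between-experiment variability supports.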

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      This manuscript is in revision at eLife.

      Summary:

      The manuscript addresses a fundamental problem: the organisation of epithelial polarity determinants at the apical domain of human epithelial cells. Mangeol et al investigate this question using super-resolution microscopy approaches (STED) in polarised intestinal epithelial cells. Using immunofluorescence labeling of endogenous proteins, they provide a careful characterization of Par3-Par6-aPKC and Patj-Pals1-Crb3a localization relative to tight junctions. They find that each protein localizes in the near vicinity of the tight junction, in a clustered organization. Through pairwise colocalization analyses, they observe significant separation of polarity proteins that are generally considered to be part of the same molecular complex based on biochemical assays. Specifically, PAR3 is not associated with aPKC or PAR6, and CRB3a colocalizes poorly with all other polarity proteins, raising interesting questions for future analyses.

      The imaging performed in this study is skillful and beautifully presented and, achieving an isotropic resolution of about 80nm, is impressive. However, because of this great gain in resolution compared to other studies of similar components, the major concern of all three reviewers is that the process of fixation and immunostaining may introduce artefacts that are causing the segregated dots to appear. Variable antibody quality and insufficient validation of antibody specificity raise additional concerns about the observed patterns of localization.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      RESPONSE TO REVIEWER #1

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Ishihara et al. investigate and compare microtubule polymerization/depolymerization dynamics inside vs. at the periphery of microtubule asters in a cell-free Xenopus egg extract system. By tracking EB comets, which localize to growing microtubule ends, they find that the microtubule growth rates and EB comet lifetimes (interpreted as an indicator of microtubule catastrophe rates) are similar between the two spatially-distinct microtubule populations. However, using a tubulin-intensity-difference image analysis, the authors are also able to measure local microtubule depolymerization rates, and they find a significant difference in depolymerization rates of the two populations. Specifically, the authors report that the microtubule depolymerization rates measured within asters are faster than those measured at the periphery.

      Specific comments:

      Figure 2.

      In the text, the authors report: "The depolymerization rate was 36.3 ± 7.9 μm/min (mean, std) in the aster interior, compared to 29.2 ± 8.9 μm/min (mean, std) at the aster periphery." This difference is certainly not two-fold (as stated in the abstract). It would also be useful to mark the mean rates on the graph in 2B.

      We removed the words ‘almost two-fold’ in the abstract. In the revision, we will mark the mean rates on Fig. 2B (using vertical lines).
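
      As a quick check of the size of this difference, the following minimal sketch computes the fold change from the two mean rates quoted above. Only the two means come from the text; the variable names are illustrative.

```python
# Mean depolymerization rates quoted in the text (µm/min).
interior_mean = 36.3   # aster interior
periphery_mean = 29.2  # aster periphery

fold_change = interior_mean / periphery_mean
percent_faster = (fold_change - 1.0) * 100.0

print(f"fold change (interior / periphery): {fold_change:.2f}")
print(f"interior is about {percent_faster:.0f}% faster")
```

      The ratio of the means is roughly 1.24, i.e. the interior rate is about a quarter faster than the peripheral rate, which is well short of two-fold.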

      The bimodal shape of the depolymerization rate distributions in 2B is very interesting. This definitely warrants further investigation. At a minimum, the depolymerization rates should be determined at 50 µm intervals, as done for other parameters in Figure 1. Could it be that there are two coexisting populations of microtubules at the same location? Or is there a clear spatial compartmentalization of the two that is not obvious here because the distance interval used for the measurements is too large? This is a very important distinction for the claims of the paper.

      We understand the reviewer’s concern. There are some technical limitations that make the depolymerization measurement more challenging. While we use widefield imaging of EB1-GFP comets to obtain polymerization rates from a field of view spanning 500 microns, we may only use TIRF imaging for depolymerization measurements. In this method, we are limited to observing microtubules very close to the cover slip in a small field of view of 80x80 microns at 500 ms time intervals (movies span 1-2 minutes). One would need to move the TIRF field every 1-2 minutes at 50 micron intervals, but the aster periphery would be changing during this time, so the exact location of the measurement is hard to define. Thus, we opted to image the two spatial extremes: interior (close to the MTOCs) and the very periphery (where MT density is still sparse.)

      Perhaps the largest limitation of this approach is the choice of peripheral regions based on the apparent sparsity of MTs in the TIRF field of view. Indeed, when we examine the depolymerization rate distributions for individual movies separately (see figure below, periphery #1-3 are three individual movies), we observe that some movies have rates as low as 20 µm/min, while others have higher values with a center around 36 µm/min. The depolymerization rates for the interior also vary, with mean values ranging from 34.8 to 43.2 µm/min (interior #1-3 are three individual movies). In general, the spread of depolymerization rate within a field of view as well as across different fields of view is much larger than for polymerization. It is possible that this is partly explained by the lack of a precise definition of interior vs. periphery in this TIRF-based measurement approach.

      Our data still supports the spatial regulation of depolymerization rate. However, there is no clear evidence for a bimodal distribution of depolymerization rate in any given field of view (80x80 micron square region). To clarify this point, we have removed the language “bimodal” in the main text. In the revisions, we will provide this figure as a supplement.

      We thank reviewers #1 and #2 for their critical feedback, which allowed us to clarify this issue of apparent bimodality of the depolymerization rates.

      The authors make a point here that the distribution of measured polymerization rates is fairly narrow. This appears to be in contrast with Figure 1B, where polymerization rates take on a wide range of values. How do the two distributions of polymerization rates obtained by these two methods compare?

      To address this point, we directly compare the standard deviation of the polymerization rate measurements. For Fig. 1B EB1 tracking measurements, std ranges from 7.7-10.5 µm/min for a given spatial bin (as stated in Fig. 1B legend), while for Fig. 2A TIRF measurements std is 4.0 (periphery) and 4.5 µm/min (interior) as stated in the main text. Given that the mean values of polymerization rates are similar, this suggests that the TIRF measurements are less noisy. This further highlights the relative pros and cons of the two measurement methods. To discuss these issues, we have added a new paragraph in the discussion section.

      Figure 3.

      The laser ablation figure and movies are beautiful, but don't seem to add support to the story. Importantly, the authors do not confirm any spatial variability in depolymerization rate with these experiments. As a matter of fact, although the laser ablation experiments are only performed in the aster interior, the measured depolymerization rates appear to be just as consistent with the periphery rates in Figure 2 as they are with the interior rates in Figure 2. (They span quite a large range of values, with the average right in the middle between what was measured for the two areas in Figure 2.)

      Indeed, the values obtained with laser ablation are quite variable, even compared to the physiological depolymerization rate measured via TIRF microscopy. This perhaps reflects the variability of biology as well as the nature of the laser ablation which measures depolymerization rate at the level of microtubule populations. We hope our paper will increase interest in this rarely measured parameter, and perhaps invention of new probes to measure it more accurately and conveniently.

      Given the variability of our measurements, we conclude that the results between the TIRF based approach vs. laser ablation based approach of depolymerization rates are indistinguishable. We agree with the reviewer that the data does NOT argue that laser ablation results are more consistent with the interior TIRF measurements than peripheral TIRF measurements.

      To clarify this point, we remove the following clause “, which was comparable to the modal value of the depolymerization rates in the aster interior (Fig. 2).”

      We change the concluding sentence of our laser ablation paragraph from

      “Overall, these observations suggest that depolymerization dynamics are similar for plus ends following a natural catastrophe vs. ablation in the aster interior.”

      to

      “Overall, these observations confirm that depolymerization rates are variable, and we find no statistical distinction of rates between plus ends following a natural catastrophe vs. ablation.”

      Although the authors report they don't see any correlation between the distance and depolymerization rate, they should still plot the rate as a function of initial cut positions (Figures 3D, 3E).

      To address this concern, we plan to provide a supplemental figure in the revision. Please see the preliminary figure below. Due to technical limitations with the laser ablation system (field of view for 60x magnification), we only have measurements that span 15-100 microns from the center.

      From the single decaying inward wave the authors conclude that microtubules depolymerize fully to their minus ends which are distributed throughout the aster. Can the possibility that depolymerization is stopped by microtubule lattice defects/islands be excluded by these observations?

      The existence of microtubule lattice defects is a recent development in the field and much remains unknown. If we assume that defects are structurally unstable, we predict that the episode of depolymerization will continue even when reaching a defect. If defects are stable and lead to instantaneous rescue of plus ends, we cannot distinguish the defects from minus ends. In this latter scenario, the interpretation of the decaying inward wave requires caution.

      What are the effects of the local increase in tubulin concentration due to the subunit release by depolymerization? What about the release of other lattice-binding MAPs (stabilizers)?

      We are interested in these questions as well. Soluble GDP-bound tubulin, released by depolymerization, is thought to exchange its nucleotide to GTP without need of a GEF, and no GEF is known. The dissociation rate of GDP is ~0.1 [1/sec], for a half-life of ~5 sec (Brylawski and Caplow, 1983, J. of Biol. Chem.), so we believe the tubulin subunits are recycled relatively quickly. It is not entirely obvious whether this necessarily results in a significant increase in ‘soluble’ tubulin concentration given tubulin diffusive transport. We hypothesize the main effect of stabilizing MAPs is on the depolymerization rate as discussed in our model in Fig. 5.

      Figure 4.

      Is the local depletion of tubulin/EB1 thought to be only within the narrow annulus at ~100 um distance, or is it not measurable on the inside due to the polymer signal? Can the two be separated? Such a sharp transition within a discrete annular region doesn't speak to the relative effects on the inside vs. the outside of the aster?!

      Yes, we also believe the soluble tubulin levels are even lower in the more inner regions of the aster. However, polymerized tubulin accounts for a large part of the fluorescence intensity in these inner regions, and our method does not faithfully reflect the soluble fraction. It will be important for future studies to employ specific methods that may unequivocally distinguish polymer vs. soluble tubulin concentrations (see below).

      More importantly, the local depletion of either tubulin or EB1 is not a good representation of a depletion of a MAP component that associates with the microtubule lattice. Both tubulin and EB1 bind preferably to microtubule ends, not lattice. Thus showing a profile of slight local tubulin and/or EB depletion does not seem to be relevant for the proposed model. Rather, overall microtubule polymer mass/density as a function of distance may be more relevant?

      Reviewer #1 makes a valid point that tubulin and EB1 are specifically incorporated at plus ends and not along the entire lattice, as we assume for the MAPs in our theoretical model. To address this issue, we analyzed the fluorescence intensity of images obtained for a MAP that associates with the MT lattice, Tau-mCherry (Mooney et al. 2017). This quantification shows a depletion pattern similar to tubulin and EB1. Thus, we believe the local depletion is a general feature. For the revision, we plan to incorporate these Tau-mCherry data in Fig. 4.

      Figure 5.

      The toy model is intuitive and clear, but not sufficient without any experimental investigation. An attempt to quantify the actual distributions of at least one or a few selected proposed MAPs is needed. Is the depletion strongest where microtubule density is highest? What is the ratio of a MAP intensity to microtubule polymer density as a function of distance? How does that relate to local depolymerization rates? What are other testable model predictions that can show support for the proposed mechanism?

      We understand that our proposal is rather speculative, and the goal of this manuscript was to propose a hypothesis that may inspire others working on the assembly of intracellular organelles. Although Tau is not an endogenous component of the egg extract system, we believe that our new quantification of Tau-mCherry depletion adds more credibility to our general proposal.

      Microtubule density is roughly uniform within the interior of the aster according to our current understanding (Ishihara et al. 2016 eLife). So the MAP:MT ratio is relatively uniform throughout the aster except at the very periphery where there are very few MTs assembled (i.e. “depletion is weakest where MT density is lowest.”)

      In the future, we may perform (1) FCS measurements of candidate MAPs to directly measure the concentration profile of each candidate MAP in soluble form and (2) depletion/addback experiments to show which MAP most affects the depolymerization rate. Although these experiments are appealing, they require generation of new molecular reagents as well as calibration of a highly specialized optical method. Therefore, we decided to limit this paper to the unusual observation of the variation in depolymerization rate and to speculate on the underlying mechanism.

      Also, the table is insufficiently described. Are any or all of these MAPs known to be specific regulators of microtubule depolymerization rates, but not other dynamics parameters?

      There are a large number of MAPs in Xenopus eggs, as there are in all cells, and the degree to which their effects on microtubules have been characterized is variable. To address this comment, we include in the revised manuscript a list of known MAPs that are present in Xenopus egg extract, along with their estimated concentrations from a published proteomic study. We annotate each MAP as to whether it increases or decreases microtubule stability, acknowledging that these data are very incomplete, that in some cases there is disagreement in the literature, and that we are combining pure-protein and whole-cell analyses. This table illustrates the challenge of associating dynamics regulation with any one MAP, since the behavior of microtubules is regulated by all these factors operating in parallel. That said, certain MAPs jump out as candidate depolymerization regulators that have been little studied for effects on dynamics, for example, MAP7.

      In the revision, we propose to add this expanded table as a supplementary table in addition to Table 1.

      | Protein Description | Gene Symbol | Est. Conc. (nM) | MT polymerization/nucleation/rescue? | MT depolymerization/catastrophe? | Lead reference |
      |---|---|---|---|---|---|
      | Microtubule-associated protein RP/EB family member 1 | MAPRE1 | 1800 | Increase | Decrease | PMID: 18364701 |
      | Stathmin | STMN1 | 1600 | Decrease | Increase | PMID: 11792540 |
      | MAP4 | MAP4 | 960 | Increase | Decrease | PMID: 7962090 |
      | Echinoderm microtubule-associated protein-like 2 | EML2 | 580 | Decrease | Increase | PMID: 11694528 |
      | EML4 protein | EML4 | 500 | Increase | Decrease | PMID: 17196341 |
      | Disks large-associated protein 5 | DLGAP5 | 380 | Increase | Decrease | PMID: 16631580 |
      | Cytoskeleton-associated protein 5 | CKAP5 | 300 | Increase | Increase | PMID: 23666085 |
      | Kinesin-like protein KIF2C | KIF2C | 200 | Decrease | Increase | PMID: 12620232 |
      | CAP-Gly domain-containing linker protein 1 | CLIP1 | 190 | na | na | |
      | Cytoskeleton-associated protein 4 | CKAP4 | 160 | Increase | Decrease | PMID: 9799226 |
      | Echinoderm microtubule-associated protein-like 1 | EML1 | 140 | na | na | |
      | Ensconsin | MAP7 | 91 | na | Decrease | PMID: 31391261 |
      | Targeting protein for Xklp2 | TPX2 | 91 | Increase | Decrease | PMID: 26414402 |
      | Microtubule-associated protein 1B | MAP1B | 85 | Increase | Decrease | PMID: 7664878 |
      | MAP1S | MAP1S | 66 | Decrease | Decrease | PMID: 25300793 |
      | Hyaluronan mediated motility receptor | HMMR | 61 | na | na | |
      | MAP7 domain-containing protein 1 | MAP7D1 | 47 | na | na | |
      | Cytoskeleton-associated protein 2 | CKAP2 | 46 | Increase | Decrease | PMID: 15504249 |
      | Microtubule-associated tumor suppressor 1 | MTUS1 | 43 | na | na | |
      | Kinesin-like protein KIF2A | KIF2A | 37 | Decrease | Increase | PMID: 29980677 |
      | CLIP-associating protein 1 | CLASP1 | 30 | Decrease | Decrease | PMID: 29937387 |
      | Microtubule-associated protein RP/EB family member 3 | MAPRE3 | 21 | Increase | Decrease | PMID: 20850319 |
      | MAP7 domain containing 2 protein variant 2 (Fragment) | MAP7D2 | 8 | na | na | |
      | CAP-Gly domain-containing linker protein 4 | CLIP4 | 2 | na | na | |

      Minor comments:

      Figure 1.

      typo in the figure legend: "interior (distance>300 μm) vs. periphery (50 μm<distance<280 μm)" There appears to be a clear dip in EB1 density at 100 um (Figure 1C). What could be the cause of that?

      Thank you for catching the typo. We corrected this to “periphery (distance>300 µm) vs. interior (50 µm<distance<280 µm)”.

      Figure 2.

      Note that the distances used in Figure 2. to define 'interior' and 'periphery' are completely different than those in Figure 1. (Interior in Figure 1 is defined to be between 50 and 280 um from the MTOC, and exterior larger than 300 um. However, in Figure 2. interior is defined as less than 100 um, and exterior as larger than 200 um.) Given that the asters are actively growing, it would be good to clearly explain how these intervals were defined in each case.

      For both experiments, we had clearly stated the definitions of interior and periphery, either in the figure legends or in the methods section. We have added a new paragraph explaining why we could not choose exactly the same quantitative definitions for these two methods (please also see our reply to Reviewer #2 comment 1).

      In the periphery movie, there are several notable examples of apparent minus-end depolymerization and treadmilling. The authors state these are very rare - perhaps a quantification would be useful here?

      Thank you for pointing this out. We modified the sentence to reflect the outward depolymerization events in the periphery. “We observed few outward-moving depolymerization events (

      Reviewer #1 (Significance (Required)):

      The observation of distinct depolymerization rates within vs. at the periphery of microtubule asters is novel and interesting. However, the manuscript in its current form is rather preliminary. The observation can be significantly strengthened by additional experiments/analysis that would characterize the effect in more detail. Even more importantly, the authors propose a highly speculative (although compelling) mechanism, but make no attempt to test it in any way. This is a major deficiency of the current manuscript that should be addressed prior to publication.

      REFEREES CROSS COMMENTING

      I agree with Reviewer #2 that our comments are both overlapping and complementary. I also find Reviewer #2's comments fair and reasonable and see no need for further adjustments.

      RESPONSE TO REVIEWER #2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      SUMMARY

      This paper reports measurements of microtubule dynamics in interphase asters nucleated in Xenopus egg extracts. Dynamics are measured using two methods. First tracking of GFP tagged EB1 protein forming comets at the tips of growing microtubules, as used in other studies, which can only measure growth rates. Second using a recently developed automated tracking based on subtractive difference images of fluorescently labelled microtubules, which can measure both growth and shrinkage rates. The main and novel observation of this paper, using difference image tracking, is that the MT shrinkage rate is ~2 fold faster in the interior of the aster compared with the periphery of the aster, whilst rates of MT polymerisation and catastrophe vary only slightly, if at all. The authors speculate that this might be due to a reduced MAP concentration and occupancy in the aster interior. They also discuss the role of a depletion-dependent increased shrinkage rate as a feedback mechanism to maintain a low MT polymer density in the aster interior.

      MAJOR COMMENTS

      The movies are startling in their beauty and clarity and the key conclusion that the shrinkage rate is significantly faster in the interior compared to the periphery of the aster is convincing.

      The observation that the net MT plus end growth rate is ~10% faster at the periphery compared to the interior of the aster is only supported by the EB1 tip tracking method. The difference imaging method shows no significant difference in rates. The authors need to discuss this discrepancy between the established and new methods of analysis. It is insufficient to state that the growth rates obtained by the two methods are "consistent".

      This comment prompts a comparison of the two methods (EB1 tracking vs. TIRF difference imaging). On one hand, EB1 tracking is more sensitive in detecting plus ends and allows a large number of observations, so it is more likely to show statistical significance. On the other hand, the EB1 tracking method is noisier (higher standard deviation) than the TIRF-based measurements (see our response to Reviewer #1). In TIRF difference imaging, the exact location of the periphery (relative to the center as well as the overall microtubule density profile) is hard to evaluate.

      What is consistent between the two methods is the approximate mean value of polymerization rates. The 10% faster polymerization velocity is only suggested by the EB1 tracking method, calling for caution/further investigation. However, the potential relatively small difference in polymerization rate is not the main point of this paper.

      We deleted the sentence in the results section for the TIRF method: “These values of polymerization rates are consistent with EB1 comet tracking (Fig. 1). ” We have added a new paragraph discussing the discrepancies between the methods in reporting polymerization rate.

      The discussion proposing MAP depletion-dependent increased shrinkage rate as a feedback mechanism to limit MT polymer density is reasonable.

      The model and discussion of the role of MAPs might be criticised as highly speculative and unsupported by any experimental data. The authors do acknowledge this. Whether the ratio of data to speculative interpretation is appropriate will be an editorial decision for whichever journal ultimately hosts this.

      Thank you. This is exactly the kind of comment that we wanted to hear from an initiative like Review Commons. It helps us gauge how our work is received and decide to which journal to submit it.

      In particular, since the aster forms by growth from the nucleating bead, early in its formation the final interior MTs must have first formed the peripheral MTs and could therefore enter fresh media and bind MAPs. The authors show by calculation that as the aster expands, these MTs and MAPs become isolated from mixing with the external media. This isolation would then suggest that any MAPs released by dissociation or MT depolymerisation must remain in the interior, and are therefore available to rebind to newly formed MTs. So, it is unclear why the MAPs should be depleted in the interior compared to the periphery, unless expansion of the aster is slowed, in which case additional MAPs could diffuse into the stationary periphery from the surrounding media. The kinetics of MT growth, MAP binding and aster expansion would then also be expected to have an effect on the outcome beyond a simple "depletion" of the internal MAP concentration.

      We use the term “depletion” to mean a significant decrease of MAP from the cytoplasm. As outlined in our toy model, more MTs lead to more MAP binding and depletion of soluble MAPs. Note that the total local abundance of MAP is constant unless there is significant diffusive transport of MAP from one region to another. We argue this transport is ineffective for the large length scale of interphase asters.
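
      The mass-conservation argument above can be illustrated with a minimal equilibrium-binding sketch. It assumes a single MAP species binding reversibly to lattice sites with dissociation constant Kd, a fixed total MAP concentration in each region (no diffusive exchange), and a number of lattice sites that scales with local polymer. All names and numbers are hypothetical and are not taken from the manuscript's model.

```python
import math

def free_map_conc(total_map: float, lattice_sites: float, kd: float) -> float:
    """Soluble MAP concentration at binding equilibrium.

    Solves bound^2 - (M + S + Kd) * bound + M * S = 0 for the bound complex,
    where M is total MAP and S is total lattice sites (same units as Kd).
    """
    b = total_map + lattice_sites + kd
    bound = (b - math.sqrt(b * b - 4.0 * total_map * lattice_sites)) / 2.0
    return total_map - bound

TOTAL_MAP_UM = 1.0  # hypothetical total MAP concentration (uM)
KD_UM = 1.0         # hypothetical dissociation constant (uM)

# More polymer means more binding sites and less soluble MAP,
# even though the total local MAP abundance is unchanged.
for sites_um in (0.0, 1.0, 5.0, 20.0):
    free = free_map_conc(TOTAL_MAP_UM, sites_um, KD_UM)
    print(f"sites = {sites_um:5.1f} uM -> soluble MAP = {free:.3f} uM")
```

      In this picture "depletion" refers only to the soluble pool; the bound pool grows by the same amount.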

      It is also not clear how the authors' preferred model would account for the suggestion of bimodal shrinkage rates. It is not clear if this is a simplification (binning things into external and internal) applied for the purposes of discussion.

      Please see our response to Reviewer #1. We now believe there is no evidence for bimodality of depolymerization rates. The spread of the data reflects the variability of depolymerization rates in a given field of view as well as the variability across multiple fields of view.

      MINOR COMMENTS

      Line 71

      Authors reference Gardner et al 2011, when discussing depolymerisation as a zero order process, as showing a free tubulin dimer concentration effect on shrinkage rates. However, the results in Gardner refer to the off rate during MT polymerisation, and measurements of rapid small scale events during overall growth phases and would be applicable to GTP-heterodimers, whereas the extended shrinkage events measured in this paper would presumably apply to post-catastrophe GDP-heterodimer dissociation and may not be comparable. The reference should be omitted or a further explanation given.

      Thanks, good point. We wanted to cite Gardner et al (2011) to make the point that classic assembly models may not always hold, but the reviewer is correct, that paper only looked at concentration dependence of depolymerization at growing ends. The text was changed to:

      “This assumption has been questioned for growing ends (Gardner 2011), but not for shrinking ends to our knowledge.”

      Line 89

      States "density of plus ends is approximately homogenous within interphase asters"

      However, in results section it is stated Line 111 that "the plus end density is lower at the periphery compared to the aster center".

      Please clarify

      The plus end density is approximately homogenous from the center to the periphery of the aster. However, only at the most peripheral region, where there are few microtubules, the density drops.

      Line 135

      The distances given for the interior and periphery appear to be mixed up.

      Thank you, we corrected this.

      Line 277

      "approximately consistent with our Peclet number estimate". 50µm gives a Pe value of 2.8. The Peclet number "significance" is earlier given in terms of "Pe>>1" (Line 255). Please clarify what range of experimental values is required for the argument to hold.

      Our statement was unclear. We modified the sentence in the following way to clarify our point: “The half-width of the depleted zone extended ~50 microns beyond the growing aster periphery, which is smaller than the typical aster radius. This analysis indicated that soluble protein levels may vary between subregions of growing asters due to subunit consumption.”

      Line 404

      needs details of the GFP-EB1 and fluorescent tubulin used in this experiment.

      The detailed concentrations are described for each method in the subsequent sections. To avoid confusion, we removed the sentence in line 404, which omitted details.

      The tubulin depletion measurements detect a 4% reduction in tubulin concentration in the interior versus the exterior, and the same for eGFP-EB1 (Fig.4B). This observation provides important support for the depletion proposal. But the experiments apparently lack a control for potential reduction of fluorescence excitation intensity with depth in these deep specimens (equivalent to the inner filter effect in spectroscopy). Is there a component whose apparent concentration (fluorescence emission intensity) does not decrease by 4% in the interior of the aster?

      Indeed, fluorescence intensity measurements require special attention. Our samples are made by squashing 4 µl of extract under an 18 mm x 18 mm coverslip, and the resulting thickness is 10 microns, which we believe is too small a distance to produce an inner filter effect.

      In response to Reviewer #2’s request for an example of a component whose fluorescence intensity is uniform, we provide the intensity profile of the inert 10kDa Dextran labeled with Alexa568. This serves as a control for the reviewer’s specific concern with our method. We will incorporate this as a supplementary figure in the revision.
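
      The sample thickness quoted above can be checked with simple geometry, assuming the extract spreads evenly under the whole coverslip. The function name is illustrative; the volume and coverslip size are those stated in the response.

```python
def film_thickness_um(volume_ul: float, side_mm: float) -> float:
    """Thickness of a liquid film spread evenly under a square coverslip.

    Uses 1 ul = 1 mm^3, so thickness (mm) = volume (mm^3) / area (mm^2).
    """
    area_mm2 = side_mm ** 2
    thickness_mm = volume_ul / area_mm2
    return thickness_mm * 1000.0  # convert mm to um

# 4 ul of extract under an 18 mm x 18 mm coverslip
print(f"{film_thickness_um(4.0, 18.0):.1f} um")
```

      This gives roughly 12 microns, in line with the ~10 micron thickness stated in the response.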

      There is no direct discussion of the relative lifetime of MTs in the interior compared to the exterior of the aster. Catastrophe rates and growth rates are essentially invariant, I think this implies that MT lifetimes are essentially the same in the interior versus the exterior? Please confirm and estimate the lifetime. This could exclude a maturation process whereby one set of MAPs got replaced by another over time?

      Indeed, MT lifetime is a function of four rates: polymerization, depolymerization, catastrophe, and rescue. The figure below shows the MT lifetime as a function of depolymerization rate, assuming the other parameters are fixed at the values reported in our previous study (Ishihara et al. 2016). In regions with a fast depolymerization rate of 40 µm/min, the microtubule lifetime is 0.98 min. As the depolymerization rate decreases to 30 and 25 µm/min, the lifetime increases to 1.5 and 2.4 min. This implies that the microtubules at the aster periphery are longer lived than those in the interior.
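
      The dependence of lifetime on the four rates can be sketched with the standard two-state dynamic instability model, in which a microtubule nucleated at zero length in the bounded regime has mean lifetime (v_pol + v_depol) / (v_depol * f_cat - v_pol * f_res). This may not be the exact calculation behind the figure mentioned above, and the polymerization, catastrophe, and rescue values below are illustrative assumptions rather than values listed in the response.

```python
def mean_mt_lifetime_min(v_pol: float, v_depol: float,
                         f_cat: float, f_res: float) -> float:
    """Mean lifetime (min) of a microtubule in the bounded regime.

    Velocities are in um/min and transition frequencies in 1/min.
    Valid only when v_depol * f_cat > v_pol * f_res (bounded growth).
    """
    drift = v_depol * f_cat - v_pol * f_res
    if drift <= 0:
        raise ValueError("parameters are in the unbounded growth regime")
    return (v_pol + v_depol) / drift

# Assumed, illustrative parameters (not stated in the response)
V_POL = 30.0  # um/min
F_CAT = 3.3   # 1/min
F_RES = 2.0   # 1/min

for v_depol in (40.0, 30.0, 25.0):  # depolymerization rates from the response
    t = mean_mt_lifetime_min(V_POL, v_depol, F_CAT, F_RES)
    print(f"v_depol = {v_depol:4.1f} um/min -> lifetime = {t:.2f} min")
```

      With these assumed values the lifetimes come out at roughly 1.0, 1.5, and 2.4 min, reproducing the trend described above: slower depolymerization gives longer-lived microtubules.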

      Association and dissociation rate constants have not been measured for most MAPs, but in general we expect them to be fast compared to the MT lifetime of ~1 minute. Most MAPs bind with affinities in the low micromolar or high nanomolar range, which implies dissociation times of seconds or less. MAP4 and MAP7 were both shown to bind and dissociate rapidly in living cells (PMID: 16714020, PMID: 11719555).
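
      The link between affinity and dissociation time can be made explicit through Kd = k_off / k_on. The sketch below assumes an association rate constant of 1e6 1/(M s), a common order of magnitude for protein binding that is not given in the response; the Kd values are likewise hypothetical examples from the "low micromolar or high nM" range.

```python
def k_off_per_s(kd_molar: float, k_on_per_molar_s: float) -> float:
    """Dissociation rate constant from Kd = k_off / k_on."""
    return kd_molar * k_on_per_molar_s

K_ON = 1e6            # 1/(M s), assumed association rate constant
MT_LIFETIME_S = 60.0  # ~1 minute MT lifetime quoted in the response

for kd in (0.5e-6, 2e-6):  # hypothetical affinities (M)
    residence_s = 1.0 / k_off_per_s(kd, K_ON)
    print(f"Kd = {kd * 1e6:.1f} uM -> residence time = {residence_s:.1f} s "
          f"({residence_s / MT_LIFETIME_S:.3f} of the MT lifetime)")
```

      Under these assumptions a MAP exchanges many times during the lifetime of a single microtubule; a slower assumed k_on would lengthen the residence time proportionally.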

      Reviewer #2 (Significance (Required)):

      This paper is significant as it is the first observation of spatial variation in MT shrinkage rates in an aster. It proposes the broad shape of an underlying mechanism (depletion of stabilising MAPs in the aster interior) and presents sound quantitative arguments, but the experiments do not directly test this mechanism. Aster formation in Xenopus egg extracts is widely used as a model system, and if indeed the spatial variation turns out to be due to spatial depletion of components then this will become a landmark paper. The paper may promote wider use of this method of automated analysis and encourage study of shrinkage rate mechanisms in other systems.

      REFEREES CROSS COMMENTING

      In my opinion the comments of reviewer #1 are fair and reasonable and overlap with and complement my own. In my opinion there is zero conflict requiring adjustment.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      SUMMARY

      This paper reports measurements of microtubule dynamics in interphase asters nucleated in Xenopus egg extracts. Dynamics are measured using two methods. First tracking of GFP tagged EB1 protein forming comets at the tips of growing microtubules, as used in other studies, which can only measure growth rates. Second using a recently developed automated tracking based on subtractive difference images of fluorescently labelled microtubules, which can measure both growth and shrinkage rates. The main and novel observation of this paper, using difference image tracking, is that the MT shrinkage rate is ~2 fold faster in the interior of the aster compared with the periphery of the aster, whilst rates of MT polymerisation and catastrophe vary only slightly, if at all. The authors speculate that this might be due to a reduced MAP concentration and occupancy in the aster interior. They also discuss the role of a depletion-dependent increased shrinkage rate as a feedback mechanism to maintain a low MT polymer density in the aster interior.

      MAJOR COMMENTS

      The movies are startling in their beauty and clarity and the key conclusion that the shrinkage rate is significantly faster in the interior compared to the periphery of the aster is convincing.

      The observation that the net MT plus end growth rate is ~10% faster at the periphery compared to the interior of the aster is only supported by the EB1 tip tracking method. The difference imaging method shows no significant difference in rates. The authors need to discuss this discrepancy between the established and new methods of analysis. It is insufficient to state that the growth rates obtained by the two methods are "consistent".

      The discussion proposing MAP depletion-dependent increased shrinkage rate as a feedback mechanism to limit MT polymer density is reasonable.

      The model and discussion of the role of MAPs might be criticised as highly speculative and unsupported by any experimental data. The authors do acknowledge this. Whether the ratio of data to speculative interpretation is appropriate will be an editorial decision for whichever journal ultimately hosts this.

      In particular, since the aster forms by growth from the nucleating bead, early in its formation the final interior MTs must have first formed the peripheral MTs and could therefore enter fresh media and bind MAPs. The authors show by calculation that as the aster expands, these MTs and MAPs become isolated from mixing with the external media. This isolation would then suggest that any MAPs released by dissociation or MT depolymerisation must remain in the interior, and are therefore available to rebind to newly formed MTs. So, it is unclear why the MAPs should be depleted in the interior compared to the periphery, unless expansion of the aster is slowed, in which case additional MAPs could diffuse into the stationary periphery from the surrounding media. The kinetics of MT growth, MAP binding and aster expansion would then also be expected to have an effect on the outcome beyond a simple "depletion" of the internal MAP concentration.

      It is also not clear how the authors' preferred model would account for the suggestion of bimodal shrinkage rates. It is not clear if this is a simplification (binning things into external and internal) applied for the purposes of discussion.

      MINOR COMMENTS

      Line 71 Authors reference Gardner et al 2011, when discussing depolymerisation as a zero order process, as showing a free tubulin dimer concentration effect on shrinkage rates. However, the results in Gardner refer to the off rate during MT polymerisation, and measurements of rapid small scale events during overall growth phases and would be applicable to GTP-heterodimers, whereas the extended shrinkage events measured in this paper would presumably apply to post-catastrophe GDP-heterodimer dissociation and may not be comparable. The reference should be omitted or a further explanation given.

      Line 89 States "density of plus ends is approximately homogenous within interphase asters" However, in results section it is stated Line 111 that "the plus end density is lower at the periphery compared to the aster center". Please clarify

      Line 135 The distances given for the interior and periphery appear to be mixed up.

      Line 277 "approximately consistent with our Peclet number estimate". 50µm gives a Pe value of 2.8. The Peclet number "significance" is earlier given in terms of "Pe>>1" (Line 255). Please clarify what range of experimental values is required for the argument to hold.

      Line 404 needs details of the GFP-EB1 and fluorescent tubulin used in this experiment.

      The tubulin depletion measurements detect a 4% reduction in tubulin concentration in the interior versus the exterior, and the same for eGFP-EB1 (Fig.4B). This observation provides important support for the depletion proposal. But the experiments apparently lack a control for potential reduction of fluorescence excitation intensity with depth in these deep specimens (equivalent to the inner filter effect in spectroscopy). Is there a component whose apparent concentration (fluorescence emission intensity) does not decrease by 4% in the interior of the aster?

      There is no direct discussion of the relative lifetime of MTs in the interior compared to the exterior of the aster. Catastrophe rates and growth rates are essentially invariant, I think this implies that MT lifetimes are essentially the same in the interior versus the exterior? Please confirm and estimate the lifetime. This could exclude a maturation process whereby one set of MAPs got replaced by another over time?

      Significance

      This paper is significant as it is the first observation of spatial variation in MT shrinkage rates in an aster. It proposes the broad shape of an underlying mechanism (depletion of stabilising MAPs in the aster interior) and presents sound quantitative arguments, but the experiments do not directly test this mechanism. Aster formation in Xenopus egg extracts is widely used as a model system, and if indeed the spatial variation turns out to be due to spatial depletion of components then this will become a landmark paper. The paper may promote wider use of this method of automated analysis and encourage study of shrinkage rate mechanisms in other systems.

      REFEREES CROSS COMMENTING

      In my opinion the comments of reviewer #1 are fair and reasonable and overlap with and complement my own. In my opinion there is zero conflict requiring adjustment.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Ishihara et al. investigate and compare microtubule polymerization/depolymerization dynamics inside vs. at the periphery of microtubule asters in a cell-free Xenopus egg extract system. By tracking EB comets, which localize to growing microtubule ends, they find that the microtubule growth rates and EB comet lifetimes (interpreted as an indicator of microtubule catastrophe rates) are similar between the two spatially-distinct microtubule populations. However, using a tubulin-intensity-difference image analysis, the authors are also able to measure local microtubule depolymerization rates, and they find a significant difference in depolymerization rates of the two populations. Specifically, the authors report that the microtubule depolymerization rates measured within asters are faster than those measured at the periphery.

      Specific comments:

      Figure 2. In the text, the authors report: "The depolymerization rate was 36.3 ± 7.9 μm/min (mean, std) in the aster interior, compared to 29.2 ± 8.9 μm/min (mean, std) at the aster periphery." This difference is certainly not two-fold (as stated in the abstract). It would also be useful to mark the mean rates on the graph in 2B.

      The bimodal shape of the depolymerization rate distributions in 2B is very interesting. This definitely warrants further investigation. At the minimum, the depolymerization rates should be determined at 50 um intervals, as done for other parameters in Figure 1. Could it be that there are two coexisting populations of microtubules at the same location? Or is there a clear spatial compartmentalization of the two that is not obvious here because the distance interval used for the measurements is too large? This is a very important distinction for the claims of the paper.

      The authors make a point here that the distribution of measured polymerization rates is fairly narrow. This appears to be in contrast with Figure 1B, where polymerization rates take on a wide range of values. How do the two distributions of polymerization rates obtained by these two methods compare?

      Figure 3. The laser ablation figure and movies are beautiful, but don't seem to add support to the story. Importantly, the authors do not confirm any spatial variability in depolymerization rate with these experiments. As a matter of fact, although the laser ablation experiments are only performed in the aster interior, the measured depolymerization rates appear to be just as consistent with the periphery rates in Figure 2 as they are with the interior rates in Figure 2. (They span quite a large range of values, with the average right in the middle between what was measured for the two areas in Figure 2.)

      Although the authors report they don't see any correlation between the distance and depolymerization rate, they should still plot the rate as a function of initial cut positions (Figures 3D, 3E).

      From the single decaying inward wave the authors conclude that microtubules depolymerize fully to their minus ends which are distributed throughout the aster. Can the possibility that depolymerization is stopped by microtubule lattice defects/islands be excluded by these observations?

      What are the effects of the local increase in tubulin concentration due to the subunit release by depolymerization? What about the release of other lattice-binding MAPs (stabilizers)?

      Figure 4. Is the local depletion of tubulin/EB1 thought to be only within the narrow annulus at ~100 um distance, or is it not measurable on the inside due to the polymer signal? Can the two be separated? Such a sharp transition within a discrete annular region doesn't speak to the relative effects on the inside vs. the outside of the aster?!

      More importantly, the local depletion of either tubulin or EB1 is not a good representation of a depletion of a MAP component that associates with the microtubule lattice. Both tubulin and EB1 bind preferably to microtubule ends, not lattice. Thus showing a profile of slight local tubulin and/or EB depletion does not seem to be relevant for the proposed model. Rather, overall microtubule polymer mass/density as a function of distance may be more relevant?

      Figure 5. The toy model is intuitive and clear, but not sufficient without any experimental investigation. An attempt to quantify the actual distributions of at least one or a few selected proposed MAPs is needed. Is the depletion strongest where microtubule density is highest? What is the ratio of a MAP intensity to microtubule polymer density as a function of distance? How does that relate to local depolymerization rates? What are other testable model predictions that can show support for the proposed mechanism?

      Also, the table is insufficiently described. Are any or all of these MAPs known to be specific regulators of microtubule depolymerization rates, but not other dynamics parameters?

      Minor comments:

      Figure 1. typo in the figure legend: "interior (distance>300 μm) vs. periphery (50 μm<distance<280 μm)" There appears to be a clear dip in EB1 density at 100 um (Figure 1C). What could be the cause of that?

      Figure 2. Note that the distances used in Figure 2 to define 'interior' and 'periphery' are completely different from those in Figure 1. (Interior in Figure 1 is defined to be between 50 and 280 um from the MTOC, and exterior larger than 300 um. However, in Figure 2, interior is defined as less than 100 um, and exterior as larger than 200 um.) Given that the asters are actively growing, it would be good to clearly explain how these intervals were defined in each case.

      In the periphery movie, there are several notable examples of apparent minus-end depolymerization and treadmilling. The authors state these are very rare - perhaps a quantification would be useful here?

      Significance

      The observation of distinct depolymerization rates within vs. at the periphery of microtubule asters is novel and interesting. However, the manuscript in its current form is rather preliminary. The observation can be significantly strengthened by additional experiments/analysis that would characterize the effect in more detail. Even more importantly, the authors propose a highly speculative (although compelling) mechanism, but make no attempt to test it in any way. This is a major deficiency of the current manuscript that should be addressed prior to publication.

      REFEREES CROSS COMMENTING

      I agree with Reviewer #2 that our comments are both overlapping and complementary. I also find Reviewer #2's comments fair and reasonable and see no need for further adjustments.

    1. Reviewer #3:

      This manuscript reports a series of unique experiments with a single human participant, using an electrode array implanted in the left posterior parietal cortex several years after high-level spinal cord injury. There is a small but increasing number of groups capable of performing this type of research in humans. Most of this work has been focused on the motor system, but studies like this one, characterizing the somatosensory system (touch, in particular), have been increasingly common in the past five years. However, this is the only group focusing on this higher-level, multimodal association area of the cortex.

      Most of the recorded neurons were activated bilaterally, which is consistent with earlier monkey work from this lab. Probably the most important component of the work is the analysis of the modest activation in this area that occurs simply when the participant imagines different places on her body being touched - even the insensate arm. This work is virtually impossible to do in monkeys. There are extensive and overlapping analyses of the relation between actual and imagined activation, and the activation arising from inputs (or imagined inputs) from the two sides of the body. Eliminating a number of these and clarifying the remainder may improve the impact.

      1) 63: "in a tetraplegic human subject recorded with an electrode array implanted in the left PPC" I am curious why the array was placed in the left PPC, given the clinical evidence for the greater role of the right side in the formation of internal, multi-modal maps. Some comments would be useful.

      2) Fig 1: It would be good to show a panel of representative spikes, perhaps with their single-trial raster responses. This could be in a new figure that includes panel 1D, which is presented in a bit of an odd order as it now stands, coming in the midst of higher-level analyses. Indicate how many trials went into the averages in 1D.

      3) 146: "we computed a cross-validated coefficient of determination (R^2 within) to measure the strength of neuronal selectivity for each body side." Even after reading the methods (further comments below) it is difficult to figure out what all these related measures reveal. At this point in the text it is very difficult to intuit how R^2 would measure selectivity.

      4) Fig 4: Several panels would be more effective if plotted as a function of distance rather than a category. 4E: This panel is borderline too small. 4F: Definitely too small. Enlarge, perhaps with fewer examples. The curves drawn on the panels do not appear to be Gaussian, but neither are they just connected points. Show whatever it was you actually used. The Gaussian assumption does not appear to be very good for the edge cases (first two, last two), which is not terribly surprising.

      5) What is added by including both classification and Mahalanobis distance?

      6) 354: "information coding evolves for a single unit. Two complementary analyses were then performed." In what sense are they complementary? What is added (besides complexity) by including both cluster analysis and PCA?

      7) Fig 8C: Despite my best efforts, I have no idea what this is showing.

      8) 753: "Classification was performed using linear discriminant analysis with the following assumptions:"

      "One, the prior probability across tested task epochs was uniform;" It is not clear what prior probability this refers to. Just stimulus site?

      "Two, the conditional probability distribution of each unit on any epoch was normal;" Is this a reference to firing rate probability conditioned on stimulus site?

      "Three, only the mean firing rates differ for unit activity during each epoch (covariance of the normal distributions are the same for each);"

      "Four, firing rates for each input are independent (covariance of the normal distribution is diagonal)."

      Does this refer to independent firing rates of neurons across stimulus sites? This seems very unlikely, given everything we know about dimensionality of cortex. Perhaps it refers to something else. Cannot all of these assumptions be tested? Were they?

      9) 768: "we computed the cross-validated coefficient of determination (R2 within) to measure how well a neuron's firing rate could be explained by the responses to the sensory fields." This needs a better description, and I may be missing the point entirely. I assume it is an analysis of mean firing rate (which should be stated explicitly) and that it uses something like the indicator variable of the linear analysis of individual neuron tuning above. In this case is this a logistic regression? As it is computed for each side independently, it would appear that there are only four bits to describe the firing of any given neuron. This would seem to be a pretty impoverished statistic, even if the statistical model is accurate.

      10) 786: "The purpose of computing a specificity index was to quantify the degree to which a neuron was tuned to represent information pertaining to one side of the body over the other." This is all pretty hard to follow. The R2 metric itself is a bit mysterious, as noted above. Within and across R2 is fairly straightforward, but adds to the complexity, as does SI, which makes comparisons of three different combinations of these measures across sides. Aside from R2 itself, the math is pretty transparent. However, a better high-level description of what insight all the different combinations provide would help to justify using them all. As is, there is no discussion and virtually no description of the difference across these three scatter plots. The critical point, apparently, is that "nearly all recorded PC-IP neurons demonstrate bilateral coding". There should be a much more direct way to make this point.

      11) Computing response latency via RF discrimination is rather indirect and assumes that there is significant classification in the first place. I suspect it will add at least some delay beyond more typical tests. Why not a far simpler and more direct test of means in the same sliding window? Alternatively, a change point analysis?
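
      As an illustration of the simpler latency test suggested in comment 11, the following is a minimal sketch, not the authors' method. It assumes a hypothetical array of binned firing rates (trials x time bins) aligned to stimulus onset at t = 0, and the names `onset_latency`, `t_crit` and `n_consecutive` are placeholders chosen here. Each post-stimulus bin is compared with the trial's own pre-stimulus baseline using a paired t statistic, and the latency is taken as the first bin that starts a run of consecutive significant bins.

```python
import numpy as np

def onset_latency(rates, times, t_crit=3.0, n_consecutive=3):
    """Estimate response latency from a sliding test of means.

    rates: (n_trials, n_bins) firing rates; times: (n_bins,) bin times in
    seconds with stimulus onset at t = 0 (both hypothetical inputs).
    Returns the first post-onset time that starts n_consecutive bins whose
    mean differs from baseline, or None if no such run exists.
    """
    rates = np.asarray(rates, dtype=float)
    times = np.asarray(times, dtype=float)
    n_trials = rates.shape[0]
    # Per-trial baseline: mean rate over all pre-stimulus bins.
    baseline = rates[:, times < 0].mean(axis=1)
    post = np.flatnonzero(times >= 0)
    # Paired difference from each trial's own baseline, bin by bin.
    diff = rates[:, post] - baseline[:, None]
    sem = diff.std(axis=0, ddof=1) / np.sqrt(n_trials)
    t_stat = diff.mean(axis=0) / np.maximum(sem, 1e-12)
    significant = np.abs(t_stat) > t_crit
    # Require a run of significant bins to limit isolated false alarms.
    for i in range(len(significant) - n_consecutive + 1):
        if significant[i:i + n_consecutive].all():
            return float(times[post[i]])
    return None
```

      The threshold and run length here are arbitrary and would need to be justified (or replaced by a corrected p-value) for real data; a change-point analysis would be an alternative to the fixed-threshold rule.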

    2. Reviewer #2:

      General assessment:

      The study by Chivukula et al. explored a unique (n=1) dataset of multi-unit neuron recordings collected in the postcentral-intraparietal area (PC-IP) of a tetraplegic human subject taking part in a brain machine interface clinical trial. The recordings were collected across a set of tasks designed to investigate neuronal responses to both experienced and imagined touch.

      Overall I found the manuscript to be well-written, the study to be interesting, and the analysis reasonable. I do, however, think the manuscript would benefit by addressing two main, and a number of minor, issues.

      Major comments:

      1) The methods would benefit from additional rationale / supporting references throughout. Whereas it is generally clear what was done, it is sometimes less clear why certain choices were made. Perhaps some of the choices are "standard practice" when working with single unit recordings, but I was left in want of a bit more reasoning (or at least direction to relevant literature). Some examples are below:

      For the population correlation (line 723): why was the correlation computed 250 times or why were the two distributions shuffled together 2000 times?

      For the decode analysis (line 752): consider providing a reference for those interested in better understanding the "peeking" effects mentioned.

      Response latency (line 798): how were window parameters determined (for both visualization and the latency calculation). And what was the rationale for them being different - especially given that the data used for the response latency calculation was still visualized (at least in part)? Relatedly, I'd be curious to see the entire time-course for that data rather than just the shaded region of the "visualization" data. Also, it would be nice if a comment (or some data) could be provided regarding how much the latency estimates change based on these parameter choices.

      Temporal dynamics of population activity (line 830): why use a 500 ms window, stepped at 100 ms intervals instead of something else?

      Temporal dynamics of single unit activity (line 887): it is stated that the neurons were restricted to those whose 90th percentile accuracy was at least 50% to ensure only neurons with some degree of significant selectivity were used for the cluster analysis. But why these particular values? Are the results sensitive to this choice? In this section, I'd also suggest providing references for those interested in better understanding the use of Bayesian information criteria. Similarly, it is stated that PCA is a "standard method for describing the behavior of neural populations" - as such it would be nice to provide some relevant references for the reader.

      2) The manuscript would benefit from additional context in the intro as well as a more thorough discussion - particularly with respect to the imagination aspect of the experiment.

      Intro: The second paragraph did well in establishing why one might be interested in examining somatosensory processing in the PPC. It was however, less clear why the particular questions at the end of the paragraph were being posed. Perhaps an extra paragraph could be added to bridge the notion that a sizeable body of literature has been developed around somatosensory representation within the PPC and the several "fundamental" questions remaining that are of interest here.

      Discussion: The manuscript would benefit from a more thorough discussion of "imagination per se" and the various top-down processes that might be involved - as well as better positioning with respect to previous studies investigating top-down modulation of the somatosensory system. The authors state that the cognitive engagement during the tactile imagery may reflect semantic processing, sensory anticipation, and imagined touch per se - which I would not argue. But I would also expect some explicit mention of processes like attention and prediction. Perhaps these are intended to be captured by "sensory anticipation" - but, for example, attention can be deployed even if no sensation is anticipated. Importantly, it seems that imagining a sensation at a particular body site might well involve attending to that body part. That is, one may first attend to a body part before "imagining" a sensation there - and then even continue to attend there the entire time the imagining is being done. Because of this, perhaps the authors are considering attention to be a part of "imagination per se". But since attention has been shown to modulate somatosensory cortex without imagination, how can one exclude the possibility that the neuronal activity measured here simply reflects this attention component? Regardless, I think the discussion would benefit from a more explicit treatment of these top-down processes - especially given the number of previous studies showing that they are able to modulate activity throughout the somatosensory system. Some literature that may be of interest include:

      Roland P (1981) Somatotopical tuning of postcentral gyrus during focal attention in man. A regional cerebral blood flow study. Journal of Neurophysiology 46 (4):744-754

      Johansen-Berg H, Christensen V, Woolrich M, Matthews PM (2000) Attention to touch modulates activity in both primary and secondary somatosensory areas. Neuroreport 11 (6):1237-1241

      Hamalainen H, Hiltunen J, Titievskaja I (2000) fMRI activations of SI and SII cortices during tactile stimulation depend on attention. Neuroreport 11 (8):1673-1676. doi:10.1097/00001756-200006050-00016

      Puckett AM, Bollmann S, Barth M, Cunnington R (2017) Measuring the effects of attention to individual fingertips in somatosensory cortex using ultra-high field (7T) fMRI. Neuroimage 161:179-187. doi:10.1016/j.neuroimage.2017.08.014

      Yu Y, Huber L, Yang J, Jangraw DC, Handwerker DA, Molfese PJ, Chen G, Ejima Y, Wu J, Bandettini PA (2019) Layer-specific activation of sensory input and predictive feedback in the human primary somatosensory cortex. Sci Adv 5 (5):eaav9053. doi:10.1126/sciadv.aav9053

    3. Reviewer #1:

      In this study Chivukula, Zhang, Aflalo et al. report on an extensive set of neural recordings from human PPC. It is found that many neurons are responsive to touch in specific locations. Interestingly, a considerable fraction of the neurons displayed symmetric bilateral receptive fields. Furthermore, these neurons also became active during imagined touches. The study paves the way for a deeper understanding of the role of the human PPC.

      The paper presents a wealth of analysis on an extensive set of recordings. It is generally well written and the analyses are well thought out. My main concerns are regarding missing information and unclear descriptions of some of the analyses undertaken, which are detailed below.

      1) At the start of the results section it is stated that the recordings were from "well-isolated and multi-unit neurons". This seems to contradict the Methods section, which only talks about "sorted" neurons. This needs to be clarified, and if multi-units were included, it should be stated which sections this concerns as it will have implications for the results (e.g. for selectivity for different body parts). In any case, the number of neurons included in different analyses should be evident. There are some numbers in the Methods and sprinkled throughout the Results section, but for some of the analyses (e.g. clustering analysis, which was run only on a responsive subset of neurons) no numbers are provided.

      2) The linear analysis section needs further details. The coefficients are matched to "conditions" but it is not explained how. I am assuming that each touch location is assigned to a condition c; however, the way the model is described suggests that the vector X can in principle have multiple conditions active at the same time. Additionally, could the authors confirm whether it is the significance of the coefficients that determined whether a neuron was classed as responsive as shown in Figure 1? This analysis states a p-value but gives no further information on which test was run and on what data.

      3) Figure 1 C could be converted into a matrix that lists all combinations of RF numbers on either side of the body to highlight whether larger RFs on one side of the body generally imply larger RFs on the other side.

      4) I am confused about the interpretation of the coefficient of determination as shown in Figure 2A. In the text this is described as testing the "selectivity" of the neurons. To clarify, I am assuming that the "regression analysis" is referring to the linear model described in a previous section. The authors then presumably took the coefficients from this model for a single side only and tested how well they could predict the responses to the opposite side, as assessed by R^2 (Fig 2C,E). Before that in Fig 2A, they tested how well each single-side model could predict the responses. This is all fine, but the "within" comparison then simply tests how well a linear model can explain the observed responses, and has nothing to do with the selectivity of the neuron. For example, the neuron might be narrowly or broadly selective, but the model might fit equally well.

      5) Regarding the timing analysis, it is not clear to me how the accuracy can top out at 100% as shown in the figure, when the control conditions were included. Additionally, the authors should state the p value and statistic for the comparison of latencies.

      6) Spatial analysis. Could the authors provide the size of the paintbrush tip that was used in this analysis? Furthermore, as stimulation sites were 2 cm apart, it is not appropriate to specify receptive fields down to millimeter precision.

      7) Imagery: how many neurons were responsive to both imagery and real touch? Were all neurons that were responsive to imagery also responsive to actual touch? This is left vague and Figure 5 only includes the percentages per condition, but gives no indication of how many neurons responded to several conditions. Whether and how many neurons were responsive to both conditions also determines the ceiling for the correlation analysis in Figure 5D (e.g. if most neurons are responsive only to actual but not imaginary touch, this will limit the population correlation).

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript. Tamar R Makin (University College London) served as the Reviewing Editor.

      Summary:

      Chivukula and colleagues report an extensive set of multi-unit neural recordings from PPC of a tetraplegic patient taking part in a brain machine interface clinical trial. The recordings were collected across a set of tasks, designed to investigate neuronal responses to both experienced and imagined touch. It was found that many neurons are responsive to touch in specific locations. Most of the recorded neurons were activated bilaterally, which is consistent with earlier monkey work from this lab. Probably the most important component of the work is the analysis of the modest activation in this area that occurs simply when the participant imagines different places on her body being touched - even the insensate arm. This work is virtually impossible to do in animals, and as such offers a unique opportunity to describe neural properties for higher-level representation of touch. The study therefore paves the way for a deeper understanding of the role of the human PPC in the cognitive processing of somatosensation.

      Overall, we found the manuscript to be well-written, the study to be interesting, and for the most part the analyses are well thought out. But at the same time, the reviewers raised multiple main concerns regarding missing information and unclear descriptions of some of the analyses undertaken, which are detailed below over many major and minor comments. In addition, it was felt that there was unnecessary overlap across analyses - the first part especially contains a number of analyses that seem to make very similar points repeatedly or where it is not entirely clear what the point is in the first place. As such, there is a need to identify and cut a lot of the duplicative analyses/results and explain both the essential methods and the interpretation of the remaining results more succinctly and clearly. The key analyses could then be streamlined and better justified, ideally with an eye towards a consistent approach in both parts of the paper. There are also some major considerations regarding the contextualisation and interpretation of the key imagery results, as detailed in the first major comment below.

    1. Reviewer #3:

      The manuscript by Schonhaut et al. presents novel analyses of an impressive dataset of more than 1200 neurons across diverse areas of the human brain to investigate their modulation by hippocampal theta oscillations. They found a substantive proportion of cells phase-locked to hippocampal activity, mainly in the theta frequency band, in several areas known to be functionally related to the hippocampus, some of them receiving monosynaptic hippocampal inputs but others only indirect ones. These results extend previous reports in humans showing hippocampal interactions with these structures but at the level of mesoscopic activity and highlight the ubiquity of spike-theta timing and the importance of single-unit studies in humans. Additional analyses, detailed below, will help to give a better description of the data, provide stronger support for some of the authors' claims and clarify some issues.

      1) I assume that the dataset also includes hippocampal units; why then exclude them from the analysis? Although the main novelty is in the coupling of cells in other structures with hippocampal LFPs, it would be useful to also compare it with the coupling of local hippocampal cells.

      2) Include average power spectrum of hippocampal LFPs. Additional examples of raw LFP traces overlaid to spectrograms (perhaps in Supplementary) will help to illustrate the nature of hippocampal oscillations.

      3) The authors compared fractions of significantly modulated units and their preferred frequencies across regions. While very informative, these analyses are not sufficient to capture the richness of spike-LFP interactions likely existing in the dataset. Were there differences in the strength of phase-locking across regions? (This analysis could be added to Figure 2.) Studies in rodents have shown that theta phase-locked units in different structures have characteristic preferred firing phases (when hippocampal LFP is used as a reference). The authors can easily check whether this is also the case in their data. They should include both pooled data statistics of mean phases across regions and single neuron examples of firing probability by LFP phase (such examples could be added to the single unit plots in Figure 1).

      4) Did phase-locked and non phase-locked units have different properties? The authors can compare if they differ in basic properties such as mean firing rate, waveform width, inter-spike intervals, burstiness, etc., as it has been reported in other studies in non-human primates and rodents. These analyses could be extended to show if units with different properties also differ in their preferred phase-locking frequency, or phase. It would be very interesting if these analyses reveal the existence of heterogeneous cellular populations with different relation to hippocampal theta, even if the single-unit isolation quality is limited due to the low density recordings. In relation to this, authors should also plot unit auto-correlograms. ACGs can be computed for all the spikes, but also only for the strongly phase-locked spikes, to show if, at least during periods of strong oscillatory activity, some units show rhythmicity.

      5) To better interpret the results in Figure 4, it would be important to know whether the recording sites in both hippocampi were from the same sub-region and a similar location along the longitudinal hippocampal axis in each subject, and what the degree of synchrony was between the LFPs in both hemispheres. Coherence or phase-locking between LFPs across hemispheres should be computed, and the power spectra for both should also be shown.

      6) In Figure 4C-D it seems that phase-locking strength across hemispheres was not correlated but preferred frequency was. This should be quantified and mentioned in text before moving to the correlation in Figure 4E.

      7) The analysis in Figure 5D should be complemented by also checking the LFP-LFP phase-locking between the local region and the hippocampus. Did periods of high LFP power correlation also reflect enhanced phase-phase coupling? Were the structures also more phase-synchronous during periods of stronger spike-LFP coupling? These analyses could provide more direct support for the authors' interpretation in line with the CTC hypothesis.

      8) Was there any relation of the "strongly phase-locked" periods with global variables reflecting brain state (e.g. drowsiness versus attention to the task, etc.) or with the firing dynamics of the units (instantaneous firing rate or inter-spike intervals)?

    2. Reviewer #2:

      In this study, Schonhaut et al. describe the phase locking statistics of cortical and subcortical neurons with respect to hippocampal local field potential (LFP) recorded in 18 epilepsy patients undergoing seizure monitoring. Nearly 30% of extrahippocampal neurons showed phase locking to some bandpassed hippocampal signal. Amygdalar and entorhinal neurons were more likely to be phase locked, as compared to neurons recorded in other neocortical sites. Most neurons showed the strongest phase locking to hippocampal theta (2-8 Hz), though neocortical and amygdalar neurons tended to phase lock to lower theta bands. Spikes that were phase locked to hippocampal rhythms occurred during local LFP-states that showed moderate correlations with the spectral patterns observed in the hippocampus. These data are interpreted within the broader "communication through coherence" hypothesis.

      Large N, multi-region, single unit studies from humans are rare and the kind of mesoscopic descriptive analyses provided here serve as an important bridge between the large rodent literature on hippocampal physiology and human physiology and cognition. That said, there are some weaknesses in the analyses that could be addressed. Also, a deeper discussion of the biological origin of human theta is merited in the discussion to address alternate explanations - beyond communication through coherence - of the data.

      A similar statistical mistake was made several times. The authors' logic goes like this: find the argmax in one sample, take the argument that generated that max, use that to sample in another condition, and report that the max is higher in the first condition than the second. For example, on pg. 6: "This is difficult to reconcile with our results, in which 248/362 neurons (68.5%) phase-locked more strongly to hippocampal LFPs than to locally-recorded LFPs at their preferred hippocampal phase-locking frequency." The same flaw can be seen in Figure 5, where the spikes are sub-sampled to occur during strong phase locking in one condition, thus almost guaranteeing high power in the frequency bands that generated that strong phase locking (which was observed). This is a case in which cross-validating the data may be useful. The authors could take a subset of the hippocampal data to define the preferred frequency, and then test phase locking on the held out data from the hippocampus and cortex.
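
      To make the held-out procedure concrete, here is a minimal sketch under stated assumptions; it is not the authors' pipeline. It assumes hypothetical arrays of LFP phase at each spike time (frequencies x spikes) for the hippocampal and the local reference, and it uses the mean resultant length (MRL) as a stand-in for whatever phase-locking statistic the manuscript uses. The function names are placeholders. The preferred frequency is chosen on one half of the spikes, and both references are then evaluated only on the other half.

```python
import numpy as np

def mrl(phases):
    """Mean resultant length of phases (radians) along the last axis."""
    return np.abs(np.exp(1j * np.asarray(phases)).mean(axis=-1))

def crossval_preferred_freq(hpc_phases, local_phases, seed=0):
    """Pick the preferred frequency on training spikes, compare on held-out spikes.

    hpc_phases, local_phases: (n_freqs, n_spikes) phase of the hippocampal and
    local LFP at each spike time (hypothetical inputs).
    Returns (preferred_freq_index, held-out hippocampal MRL, held-out local MRL).
    """
    hpc_phases = np.asarray(hpc_phases, dtype=float)
    local_phases = np.asarray(local_phases, dtype=float)
    n_spikes = hpc_phases.shape[1]
    # Random split of spikes into equal-sized training and test halves.
    order = np.random.default_rng(seed).permutation(n_spikes)
    train, test = order[: n_spikes // 2], order[n_spikes // 2:]
    # The argmax is taken on training spikes only ...
    pref = int(np.argmax(mrl(hpc_phases[:, train])))
    # ... and both references are scored on spikes that played no part in it.
    hpc_test = float(mrl(hpc_phases[pref, test]))
    local_test = float(mrl(local_phases[pref, test]))
    return pref, hpc_test, local_test
```

      Because both held-out values are computed from the same spikes, the spike-count bias of the MRL affects them equally. The selection step no longer favours the hippocampal reference in the comparison.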

      The relationship between power and phase locking is not fully controlled in this paper. The phase seems to be calculated irrespective of whether there is any instantaneous power at that frequency band, introducing noise. This will bias away from finding significant phase locking to frequency bands that occur transiently. Therefore, I recommend defining some threshold of the existence of the spectral signal prior to using that signal to calculate phase.
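
      One simple way to implement such a threshold is sketched below, again only as an illustration under assumptions. It takes hypothetical per-spike phase and amplitude values for a single frequency band, plus the band's amplitude envelope over the whole session, and it keeps only spikes that fall in periods where the envelope exceeds a chosen percentile. The percentile default and the function name are placeholders, not values from the manuscript.

```python
import numpy as np

def power_gated_mrl(spike_phase, spike_amp, session_amp, pct=50.0):
    """Phase locking restricted to spikes emitted while the rhythm is present.

    spike_phase: LFP phase (radians) at each spike time.
    spike_amp: band amplitude envelope at each spike time.
    session_amp: band amplitude envelope over the whole recording.
    pct: percentile of session_amp used as the presence threshold (assumed).
    Returns (mean resultant length, number of spikes retained).
    """
    spike_phase = np.asarray(spike_phase, dtype=float)
    spike_amp = np.asarray(spike_amp, dtype=float)
    # Threshold defined from the session as a whole, not from the spikes.
    threshold = np.percentile(np.asarray(session_amp, dtype=float), pct)
    keep = spike_amp >= threshold
    n_kept = int(keep.sum())
    if n_kept == 0:
        return float("nan"), 0
    strength = np.abs(np.exp(1j * spike_phase[keep]).mean())
    return float(strength), n_kept
```

      Since the gated and ungated estimates use different numbers of spikes, any comparison between them should match spike counts or use a bias-corrected statistic.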

      A related point has to do with the nature of the theta rhythm in the human. There has been considerable controversy over the years as to whether this is a comparable signal to that studied in the rodent. Based on the citations in this manuscript, and the nomenclature of the spectral band, the authors seek to make explicit the commonality of the underlying physiology, or function. Rodent theta is a sustained rhythm, while primate theta seems to come in bouts, perhaps even related to sampling statistics, such as saccades, leading to the suggestion that the apparent theta may be better thought of as semi-rhythmic evoked responses. How long were the bouts of high theta power? Was eye movement tracked? If so it would be important to relate the signal to eye movements. If the low frequency signal is phase locked to eye movement and potentially reflects semi-rhythmic information arriving to (from?) the hippocampus, then a stronger case could be made that hidden "third parties" synchronize the apparent communication through coherence observed here, and in fact there may be no communication at all.

      The authors dedicate much of their discussion to relating the current result to the communication through coherence analyses. Oddly, LFP coherence was never addressed. A strong prediction of the current framing would be that: when coherence is high, phase locking should be high, and higher than other moments when power in either region is high but coherence is not observed. The authors should directly measure how phase locking is modulated by coherence.

      The authors also lump together biological entities that should have different phase locking behaviors. The amygdala is not a monolithic region, does phase locking differ by nucleus? Also, do fast spiking inhibitory cells differ from excitatory cells? The authors should relate their phase locking measure to mean firing rate to show that it is insensitive to lower level cell statistics. This is important since the conclusions of the study would be quite different if neurons in the entorhinal cortex had high rates which artifactually drove up phase locking values.

    3. Reviewer #1:

      Hippocampal theta oscillations are among the most prominent rhythms in the mammalian brain. Extensive research in rodents has shown that neurons not only within the hippocampus but in widespread cortical areas can be phase-locked to hippocampal theta. Such cross-regional communication within theta frames has been postulated to be the foundation of many hippocampal operations. While previous studies in humans have documented the relationship between LFP theta and spiking in the hippocampus, coupling between hippocampal LFP and more remote cortical areas has not been demonstrated in human subjects. This is the topic of the present work. The authors show that spikes of single (and mostly multi-unit) neurons in multiple cortical regions both in the same and opposite hemispheres are phase locked to transient occurrence of hippocampal theta LFP in the 2-6 Hz range. However, phase-locking is stronger in structures known to be part of the 'limbic system', such as the amygdala and entorhinal cortex. Theta phase locking was stronger to hippocampal than to local LFP and the magnitude of spike phase locking increased when the power of theta increased, associated with increased high frequency power. The results are straightforward and the analysis methods are reliable. The novel information is limited but informative and documents a missing aspect of theta communication in the human brain.

      Comments:

      1) Given the simple message, the text is a bit long with many repetitions and loose ends. This applies to both Introduction and Discussion. Potential implications to learning, etc are interesting but the findings do not provide additional clues, thus those aspects of the discussion are mainly distractions. Instead, perhaps the authors would like to discuss potential mechanisms of remote unit entrainment. They are talking about multi-synaptic pathways but these are unlikely to be a valid conduit. Instead, the septum, entorhinal cortex or retrosplenial cortex, with their widespread projections, may be responsible for coordinating both hippocampal and neocortical areas.

      2) Arguably, the weakest part of the manuscript is the lack of hippocampal neurons. The authors refer to their own previous papers, but in a story which compares hippocampal theta oscillations with remote unit activity, it is strange that the magnitude of theta phase-locking to local hippocampal neurons is not available for comparison.

      3) How was the hippocampal LFP reference site chosen and did it vary substantially from subject to subject? Anterior or posterior locations?

      4) The authors list 1233 single neurons but in the discussion they make it clear that most of them were multiple neurons. This should be emphasized up front and may be used as an excuse why the authors did not attempt to separate pyramidal cells from interneurons (interneurons have a much higher propensity to be entrained by projected rhythms).

      5) Given that units were mixed, a logical extension would be to examine how hippocampal theta phase modulates high gamma in neocortical areas. This could potentially yield a much larger data base, targeting the same question.

      6) In the Discussion, the authors suggest that cross-regional theta phase coupling could be related to learning and other cognitive performance. However, spike-LFP coupling and coherence is confounded by LFP power increase and the authors cite Herweg et al., 2020 which did not find a relationship between theta power and memory performance. Is it then not logical to assume that cross-regional coupling may also not be related to memory?

      7) Line 36. "Long-term potentiation and long-term depression in the rodent hippocampus are also theta phase-dependent (Hyman et al., 2003)." Pavlides et al. (Brain Res 1988) or Huerta and Lisman (Neuron 1995) are perhaps more relevant references here.

      8) Line 82: "significant neocortical and contralateral phase-locking suggests". This is a strange phrase. Perhaps "significant phase locking of neurons in the neocortex in both hemispheres" or similar would be a better formulation.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      This is a very intriguing paper showing how hippocampal local field potentials couple with the activity of other cortical regions. This mechanism has been and continues to be extensively studied in other mammals, and thus its existence and relevance in humans is exciting.

    1. Reviewer #2:

      This human psychophysics study claims to provide more evidence in support of the popular notion that visual processing of faces may involve partially independent processes for the analysis of static information such as facial shape versus dynamic information such as facial expression. In this respect the scientific hypotheses and conclusions are not novel, although some of the methods (parametric variation of facial expression dynamics using computer-generated animations) and analyses (Bayesian generative modeling of expression dynamics) are relatively new. Although the science is rigorously conducted, the paper currently feels heavy on statistics and technical details but light on data, compelling results and clear interpretation. However, the main problem is that the study fails to provide sufficient controls to support its central claims as currently formulated.

      Concerns:

      1) A central claim of the paper and the first words in the title are that the behavior studied (categorization of facial expression dynamics) is "shape-invariant". However, the lack of variation in facial shapes (n = 2) used here limits the strength of the conclusions that can be drawn, and it certainly remains an open question whether representations of facial expression dynamics are truly "shape-invariant". A simple control would have been to vary the viewing angle of the avatars, in order to dissociate 3D object shapes from their 2D projections (images). The authors also claim that "face shapes differ considerably" (line 49) amongst primate species, which is clearly true in absolute terms. However, the structural similarity of simian primate facial morphology (i.e. humans and macaques used here) is striking when compared to various non-primate species, which naturally raises questions about just how shape-invariant facial expression recognition is. The lack of data to more thoroughly support the central claim is problematic.

      2) As the authors note, macaque and human facial expressions of 'fear' and 'threat' differ considerably in visual salience and motion content - both in 3D and their 2D projections (i.e. optic flow). Indeed, the decision to 'match' expressions across species based on semantic meaning rather than physical muscle activations is a central problem here. Figure 1A illustrates clearly the relative subtlety of the human expression compared to the macaque avatar's extreme open-mouthed pose, while Fig 1D (right panels) shows that this is also true of macaque expressions mapped onto the human avatar. The authors purportedly controlled for this in an 'optic-flow equilibrated' experiment that produced similar results. However, this crucial control is currently difficult to assess since the control stimuli are not illustrated and the description of their creation (in the supplementary materials) is rather convoluted and obfuscates what the actual control stimuli were.

      The results of this control experiment that are presented (hidden away in supplementary Fig S3C) show that subjects rated the equilibrated stimuli at similar levels of expressiveness for the human vs macaque avatars. However, what the reader really needs to know is whether subjects rated the human vs macaque expression dynamics to be similarly expressive (irrespective of avatar)? My understanding is that species expression (and not species face shape) is the variable that the authors were attempting to equilibrate for.

      In short, the authors have not presented data to convince a reader that their equilibrated stimuli resolve the obvious confound in their original stimuli (namely the correlation between low level visual salience - especially around the mouth region- and the species of the expression dynamics).

      3) This paper appears to be the human psychophysics component of work that the authors have recently published using the macaque avatar. The separate paper (Siebert et al., 2020 - eNeuro) reported basic macaque behavioral responses to similar animations, while the task here takes advantage of the more advanced behavioral methods that are possible in human subjects. Nevertheless, the emphasis of the current paper on cross-species perception begs the question: how do macaques perceive these stimuli? Do the authors have any macaque behavioral data for these stimuli (even if not for the 4AFC task) that could be included to round this out? If not, I recommend rewording the title since its current grammatical structure implies that the encoding is "across species", whereas encoding of species (shape and expression) was only tested in one species (humans).

    2. Reviewer #1:

      Overall assessment:

      The strengths of this paper are the novel cross species stimuli and very interesting behavioural findings, showing sharper tuning for recognising human expression sequences compared to monkey expressions. Technically, the paper is of a very high quality, both in terms of stimulus creation, but also in terms of analysis. Appropriate control experiments have been run, and in my view, the only concern is the way the results are presented, which I believe can be dealt with by restructuring the text. Other than that, I feel this would make a very nice contribution to the field.

      Concerns:

      The only major concern that I have is that the main take-home messages do not come through clearly in the way the Results section is currently structured. I found there was still too much technical detail - despite considerable use of Supplementary Information (SI) - which made extracting the empirical findings quite hard work. The details of the multinomial regression, the model comparisons (Table 1) and even the Discriminant Functions (Fig 2), for example, could all be briefly mentioned in the main text, with details provided in Methods or SI. These are all interesting, but I feel the focus should be on the behavioural findings, not the methods.

      I would suggest using the Discussion as a guide (this clearly states the key points) making sure the focus is more on Figure 3 and then working through the points more concisely.

      Obviously, this can be achieved simply by re-writing and does not take away from the significance of the work in any way. While the quality of the English is generally very high, some very minor wording issues could also be dealt with at this stage.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The paper employs novel cross species stimuli and a well-designed psychophysical paradigm to study the visual processing of facial expression. The authors show that facial expression discrimination is largely invariant to face shape (human vs. monkey). Furthermore, they reveal sharper tuning for recognising human expressions compared to monkey expressions, independent of whether these expressions were conveyed by a human or a monkey face. Technically, the paper is of a very high quality, both in terms of stimulus creation, but also in terms of analysis.

    1. Reviewer #3:

      This paper investigates the role of Sox2 in early hippocampal development. Previously the authors investigated conditional knockout mice using a Nestin-Cre line and found few phenotypes. The authors hypothesised that Sox2 may have greater impact on earlier developmental stages. The authors used a similar approach in a previous paper (Ferri et al., 2003) studying the ventral forebrain. To test this in the dorsal telencephalon they generated conditional knockout mice using both Emx1-Cre and FoxG1Cre driver lines. These lines displayed more significant phenotypes in the hippocampus, particularly in the cortical hem and dentate gyrus, and were most severe in the FoxG1Cre cross.

      The study is well executed and carefully thought through. Appropriate controls have been included for all experiments.

      In Figure 6, the data on Gli3 has been verified with additional luciferase data. The data on Cxcr4 has been previously published and has not been further verified with luciferase analysis. Including panel C in the figure may not be justified unless additional data is included to verify the result. It could be referred to in the discussion.

      In addition, related to Figure 6, Bertolini et al., 2019, identified a number of Sox2 responsive enhancers expressed in the dorsal telencephalon, but it is not clear why these are not incorporated into the model. Further justification in the discussion would be helpful. The authors may also consider discussing how Emx2 fits into their model, since they previously showed it was a negative regulator of Sox2 (Mariani et al., 2012) and it is required for hippocampal development (e.g. Pellegrini et al., 1996; Yoshida et al., 1997; Zhao et al., 2006).

      Regarding the interpretation of the results in Figure 7, previous work by the authors showed that early deletion of Sox2 using a Bf1Cre driver line resulted in severe developmental defects of the ganglionic eminences and therefore GABAergic interneurons. Is the development of GABAergic interneurons affected in the FoxG1Cre cross? It would be preferable to include some analysis of this, or at least a discussion of this issue in the context of the electrophysiology results.

      The authors use an eYFP reporter line in Figure 1- supplement. If they have similar data demonstrating Cre activity with the eYFP reporter crossed into the FoxG1CreXSox2flox/flox and Emx1CreXSox2 flox/flox it would be good to add this. It would demonstrate cell autonomous knockout versus non-cell autonomous knockout of Sox2 and may help with the interpretation of Sox2 function.

    2. Reviewer #2:

      This study examines the phenotype of early deletion of Sox2 and shows that there is a major dentate phenotype when fl-Sox2 mice are crossed to Foxg1-Cre when compared to Emx1-Cre or Nestin-Cre. This is a novel phenotype, but I don't think the authors have investigated it adequately to understand its basis. In addition, I am concerned about a major confounding issue (see below). I believe significant additional studies are needed to establish the specific role of Sox2 here. Below I list the major concerns.

      1) The authors rely on Foxg1-Cre for their main evidence that very early deletion of Sox2 leads to near loss of the dentate. However, it doesn't appear that the authors are aware that Foxg1 het mice have a fairly significant dentate phenotype (see this paper). The Foxg1-Cre line generated by Hebert and used by the authors is a knock-in allele that inactivates the endogenous Foxg1 gene. The authors need to address whether the phenotype they observe is actually due to loss of Sox2 alone at E9.5 vs the combined loss of Sox2 and a copy of Foxg1. In particular, could this explain the difference between Emx1-Cre and Foxg1-Cre lines? If this is the explanation for the difference, it isn't clear to me that the story really holds together without bringing in far more complex compound mutant explanations.

      2) The phenotype as described by the authors appears to be most compatible with the published Wnt3a mutant phenotype - perhaps a hypomorphic version makes the most sense or a near phenocopy of the Lef1 mutant. Given this, it appears to me this is really a hem phenotype and is likely explained by the loss of Wnt3a predominantly. Yet the authors don't show direct regulation of Wnt3a by Sox2 - the study would be dramatically enhanced by addressing the mechanism of loss of Wnt3a expression. In addition, examining the expression of Lef1 might reveal the more proximal mechanism of loss of DGC than simply less Wnt3a. This might also be another potential direct target of Sox2 since Lef1 expression is regulated by Wnt signaling but also by other morphogenic signals and could be a Sox2 target.

      3) The authors provide little specific analysis of hippocampal subfield specific markers. Their assumption is that the cells that are in the malformed dentate are granule neurons but they don't use any specific markers of DGC (eg Prox1). Instead they rely on cell position and expression of NeuroD (which is nonspecific). Similarly, it would make sense to examine other markers of mossy cells and CA3, which are also in the same region as DGC and made by adjacent neuroepithelium.

      4) Much of the study relies on the assumption that Nestin-Cre is an efficient deleter in the entire hippocampus yet there is no direct evidence of this. The authors could easily determine when Sox2 expression is lost in the various Cre-deleter lines using antibodies.

      5) I think the electrophysiology section isn't very useful or important. We know that mice with major developmental defects in the DG and hippocampus will have changes in circuit physiology. There is nothing specific about this phenotype, nor does it shed light on the important biology here.

      6) The only two direct targets they find don't seem likely to be important players in the phenotypes they describe, thus, it seems that they don't necessarily address the biology here. The Gli3 phenotypes that have been published are quite distinct from this.

      7) Some of the dentate phenotype is no doubt due to defects in CR cell production or development and this indirect effect has been seen in many other mutants that affect CR cell production (ie a disorganized dentate). It is hard to see how this part of the phenotype, which is likely due to the hem defects (the neuroepithelium that makes the CR cells) is helping us to understand the fundamental aspects of this phenotype.

    3. Reviewer #1:

      In the paper by Mercurio et al, the authors examine the role of SOX2 in the development of mouse hippocampal dentate gyrus. Using conditionally mutant SOX2 mice the authors show that early, but not late, deletion of SOX2 leads to developmental impairments of the dentate gyrus. A drawback of their study is that these findings have been reported previously by the group (Favaro et al. 2009; Ferri et al. 2013). In the current study the authors show additional examples of SOX2 target genes, which are dysregulated in the cortical hem upon early SOX2 deletion. However, as no mechanistic insights into how this may affect dentate gyrus development are provided, the general novelty of the study is limited.

      Comments:

      1) The language of the manuscript needs to be improved.

      2) Using different Cre-drivers the authors aim to delete SOX2 at different developmental stages. What references demonstrate that EMX-Cre first deletes SOX2 after E10.5? I cannot find where in Tronche et al. 1999 it is shown that Nes-Cre deletes after E11.5.

      3) At line 149 the authors state "...remarkably, in the FoxG1-Cre cKO, the DG appears to be almost absent (Figure 2A).". The question is why this finding is remarkable as it already was published in (Ferri et al. 2013).

      4) Line 154 "In the FoxG1-Cre cKO, Reelin expression (marking CRC) is greatly reduced, and a HF is not observed (Figure 2D);...". This statement has no support from Figure 2D.

      5) Some of the images presented in Figure 4 are of such poor quality that they are hard to evaluate.

      6) In Figure 6 the authors show that SOX2 interacts with the promoter region of the Cxcr4 gene and that the SOX2-bound enhancer is active in the developing zebrafish brain. These data can be removed as they have been published previously in Bertolini et al. 2019.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript. Joseph G Gleeson (Howard Hughes Medical Institute, The Rockefeller University) served as the Reviewing Editor.

      Summary:

      The positive aspects of the paper are the examination of the role of SOX2 in the development of mouse hippocampal dentate gyrus. Using conditionally mutant SOX2 mice the authors show that early, but not late, deletion of SOX2 leads to developmental impairments of the dentate gyrus. There were substantial criticisms of the work, most importantly that the work does not advance the field as much as is expected for a high-ranking journal, considering prior publications, and that there could be some difficulties interpreting the data as presented.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to express our appreciation for both the Editors’ and Reviewers’ efforts as essential contributions to the peer review process. We highly value the Reviewers’ constructive critique of our manuscript #RC-2020_00434R entitled “A drug repurposing screen identifies hepatitis C antivirals as inhibitors of the SARS-CoV2 main protease.”

      We appreciate the Reviewers’ thoughtful consideration of our work and feel their critiques and recommendations have significantly improved our manuscript. Taken together, we believe the additional data, clarification of data presentation, and revised discussion address the heart of the Reviewers’ previous concerns. Thus we feel the work is ready for reconsideration and will be an impactful addition to the literature appropriate for publication. Below we provide a breakdown and a point by point response to previous review critiques.

      Thank you for your attention. We look forward to your response.

      Best Wishes,

      Brian Kraemer, PhD ▪ Associate Director for Research Geriatric Research Education and Clinical Center ▪ Veterans Affairs Puget Sound Health Care System ▪ Research Professor ▪ Departments of Medicine, Psychiatry and Behavioral Sciences, and Pathology ▪ University of Washington ▪ 1660 South Columbian Way ▪ Seattle, WA 98108 ▪ Phone 206-277-1071 ▪ www.kraemerlab.uw.edu

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Baker et al. report the screening of a collection of ~6,070 drugs for their inhibitory activity against the enzymatic activity of the SARS-CoV-2 Mpro protein in vitro using two peptide substrates. 50 compounds with activity against Mpro were identified and tested for their dose-dependent effect in the same assay. Several hits were identified, among which are approved drugs that target the HCV protease.

      Indeed, there is an urgent need for effective drugs for SARS-CoV-2 infection, and high throughput screenings can discover novel candidates. However, the novelty of this work is quite limited, as former screens have been published with the same target using the same substrates. Moreover, as discussed below the translational impact of the hits discussed is also quite limited, particularly in the absence of antiviral data. Lastly, there are several overstatements in the write up and it will require major editing.

      **Major comments:**

      1. Were there any positive controls previously shown to potently inhibit the SARS-CoV-2 Mpro included in the screen (e.g. ebselen)? How did these perform in this assay?

      When first designing our protease assay, we did use ebselen as the initial control. Ebselen showed low potency in all our assays and was not considered as a positive control subsequently. It should be noted that ebselen failed to work against multiple substrates. It is possible that our buffer conditions prevented ebselen activity. See data plotted below. After identifying boceprevir as a potent inhibitor, it was used in all subsequent assays as a positive control.

      It will be helpful if the authors would provide info re the 50 hits from prior screens conducted with this library of compounds - how promiscuous are they across screens? How toxic in cell based assays?

      We have updated the table to provide additional useful information as well as a footnote explaining statuses. The compounds in the Broad repurposing library are generally non-toxic and information about them can be found here: https://clue.io/repurposing

      The translational potential of the findings appears to be limited. The calculated IC50s for these drugs in the Mpro assay are very high (10-1000 fold higher) relative to their IC50 in an enzymatic assay involving the HCV protease: Boceprevir (IC50 = 0.95 μM vs. 0.084 μM in HCV), Ciluprevir (IC50 = 20.77 μM vs. 0.0087 in HCV), Telaprevir (IC50 = 15.25 μM vs. 0.050 μM in HCV) (https://aac.asm.org/content/aac/57/12/6236.full.pdf ). In the absence of antiviral data, the main statement of the manuscript that "the work presented here supports the rapid evaluation of previous HCV NS3/4A inhibitors for repurposing as a COVID-19 therapy." is thus an overstatement. Even if there is some activity, it is likely to be limited, as with the HIV protease inhibitors, and its chances of eliciting a meaningful clinical effect are low. Moreover, when used in monotherapy, some of these protease inhibitors have a very low genetic barrier to resistance.
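
      As a quick check of the fold differences discussed in this comment, the ratios can be computed directly from the IC50 values it quotes. This is a minimal sketch: the variable names are illustrative, and the HCV value for ciluprevir (0.0087) is assumed to be in μM because its unit is not stated in the comment.

```python
# IC50 values in micromolar, copied from the reviewer's comment:
# (IC50 against SARS-CoV-2 Mpro, IC50 against HCV protease).
# Assumption: the ciluprevir HCV value (0.0087) is in micromolar; its unit is not stated.
ic50_um = {
    "boceprevir": (0.95, 0.084),
    "ciluprevir": (20.77, 0.0087),
    "telaprevir": (15.25, 0.050),
}

# Fold difference = IC50 against Mpro divided by IC50 against the HCV protease.
fold_change = {drug: mpro / hcv for drug, (mpro, hcv) in ic50_um.items()}

for drug, fold in fold_change.items():
    print(f"{drug}: about {fold:.0f}-fold higher IC50 against Mpro")
```

      With these inputs the ratios are roughly 11-fold for boceprevir, 305-fold for telaprevir and about 2,400-fold for ciluprevir.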

      We have reworked the discussion to incorporate these concerns and limitations of our results.

      There are additional inaccurate or overstatements - e.g. line 61 "Probably the most successful approved antivirals are protease inhibitors such as atazanavir for HIV-1 and simeprevir for hepatitis C. [reviewed in 10 and 11]."

      We have reworded this statement: (Page 4, Lines 61-62)

      “There is precedence for targeting the protease, as this approach has been successful in treating both HIV-1 and hepatitis C (10,11).”

      The manuscript requires editing - e.g. structure of sentences, commas, spacing (including in the abstract) etc.

      The manuscript has been re-proofed throughout (see tracked changes version of manuscript)

      What is the take home message? The statement "Taken together this work suggests previous large-scale commercial drug development initiatives targeting hepatitis C NS3/4A viral protease should be revisited because some previous lead compounds may be more potent against SARS-CoV-2 Mpro than Boceprevir and suitable for rapid repurposing." is unclear.

      The take home message of the manuscript is that HCV-targeting protease inhibitors have potential in blocking the SARS-CoV-2 protease and a more thorough analysis of the space is needed. As the reviewer pointed out, the identified hits boceprevir and narlaprevir are less potent when targeting the SARS-CoV-2 protease as compared to the HCV protease. However, we believe this work does show the potential for screening HCV-targeting protease inhibitors that may not have made it to the clinic. For instance, boceprevir or narlaprevir analogs may be even more potent against Mpro. Further, we believe that these compounds would benefit from further optimization through medicinal chemistry.

      We have expanded the discussion to incorporate issues brought up here and in point 3.

      Reviewer #1 (Significance (Required)):

      Limited. As discussed above

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The SARS-CoV-2 pandemic is causing a serious health crisis globally. There are currently no specific medicines or vaccines to contain this virus. To address this issue, the authors developed an efficient fluorescent Mpro assay system and screened ~6070 previously used drugs. Several compounds with activity against SARS-CoV-2 Mpro in vitro were found. Most hits are hepatitis C NS3/4A protease inhibitors with fair IC50 values. In addition, the authors found that most compounds identified in in silico screens lack activity against Mpro in kinetic protease assays.

      These results are well supported and reproducible, but I have two minor questions:

      1. In your Mpro assay optimization process you said substrate MCA-AVLQSGFR-K(Dnp)-K-NH2 had drastically lower rates of Mpro catalyzed hydrolysis and was not considered further in your assay development. And in your Fig.1 I saw extremely low RFU changes. But several nice inhibitors were screened using this substrate, as reported in April. Can you explain this result?

      The substrates used in our assay appear to be much more efficiently cleaved, at least with the buffer conditions and Mpro concentrations we tested. Variables including recombinant Mpro purity and activity, differences in assay buffer, and reader sensitivity may all play a role, but our best guess is that the substrate identified by Marcin Drag’s group (https://doi.org/10.1101/2020.04.29.068890) is more readily cleaved by Mpro. Although screening with other reported substrates is feasible given previous results, we believe Ac-Abu-Tle-Leu-Gln-AFC to be superior for use in high throughput screening because its cleavage kinetics yield an improved signal to background ratio for HTS.

      To exclude inhibitors possibly acting as aggregators, a detergent-based control should be run alongside the IC50 measurements.

      Compound aggregation is a concern, and our assays were all run with detergent in the buffer. Our buffer composition was 20mM Tris pH 7.8, 150mM NaCl, 1mM EDTA, 1mM DTT, 0.05% Triton X-100.

      Reviewer #2 (Significance (Required)):

      Nice work, but the significance of this article is diminishing. Most of the screened hits have been reported in the last several months. Some inhibitor complex structures have been published or released in the Protein Data Bank. The novelty is missing. I suggest the authors add more results and resubmit.

      **Referees Cross-commenting**

      I agree with the other two reviewers' comments. The significance of this work is diminishing, but it is still of some interest. I think it can be published in a lower-impact journal if the authors address our suggestions.

      We concur with both reviewers that demonstration of antiviral activity would strengthen the impact of the manuscript. However, this work remains outside of the scope of feasibility at our institution. We believe that our screen and hit identification can stand on their own until further translational work can be completed.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this report, Baker et al. show that four inhibitors of hepatitis C virus (HCV) NS3/4 protease (ciluprevir, boceprevir, narlaprevir and telaprevir) are also effective inhibitors of the SARS-CoV-2 main protease (Mpro) in enzymatic assays, with lower IC50 values for narlaprevir and boceprevir (around 1 µM in their assay conditions). HCV NS3/4 inhibitors were identified after screening a library of >6,000 compounds of the Broad Institute, including approved drugs. Screening was done with fluorometric proteolytic assays.

      Experiments have been apparently well-done and results are sound. The manuscript needs editing.

      Reviewer #3 (Significance (Required)):

      Experiments have been apparently well-done and results are sound. However, this is a limited study since there are no data obtained in cell culture and a comparison of IC50 values of the selected drugs against HCV and SARS-CoV-2 proteases is missing. It is difficult to infer whether the drugs would be as effective against SARS-CoV-2 as against HCV and, if not, by how much the doses should be increased in order to have a therapeutic effect.

      The manuscript needs editing (see below) and the Discussion is poor. The results reported by authors are not new, and a discussion of the effects of HCV inhibitors on SARS-CoV-2 replication, based on previous publications is necessary to provide the appropriate context for the study.

      Here are some references on Covid-19 and HCV inhibitors that, in my opinion, should be considered for discussion and proper citation. As correctly pointed out by Baker and co-workers, docking studies should be considered with caution, though.

      We appreciate the feedback and have now reworked and expanded the discussion to incorporate reviewer #1 and #3 comments and suggestions.

      1: Ghahremanpour MM, Tirado-Rives J, Deshmukh M, Ippolito JA, Zhang CH, de Vaca IC, Liosi ME, Anderson KS, Jorgensen WL. Identification of 14 Known Drugs as Inhibitors of the Main Protease of SARS-CoV-2. bioRxiv [Preprint]. 2020 Aug 28:2020.08.28.271957. doi: 10.1101/2020.08.28.271957. PMID: 32869018; PMCID: PMC7457600.

      2: Sacco MD, Ma C, Lagarias P, Gao A, Townsend JA, Meng X, Dube P, Zhang X, Hu Y, Kitamura N, Hurst B, Tarbet B, Marty MT, Kolocouris A, Xiang Y, Chen Y, Wang J. Structure and inhibition of the SARS-CoV-2 main protease reveals strategy for developing dual inhibitors against Mpro and cathepsin L. bioRxiv [Preprint]. 2020 Jul 27:2020.07.27.223727. doi: 10.1101/2020.07.27.223727. PMID: 32766590; PMCID: PMC7402059.

      3: Ma C, Sacco MD, Hurst B, Townsend JA, Hu Y, Szeto T, Zhang X, Tarbet B, Marty MT, Chen Y, Wang J. Boceprevir, GC-376, and calpain inhibitors II, XII inhibit SARS-CoV-2 viral replication by targeting the viral main protease. Cell Res. 2020 Aug;30(8):678-692. doi: 10.1038/s41422-020-0356-z. Epub 2020 Jun 15. PMID: 32541865; PMCID: PMC7294525.

      4: Ke YY, Peng TT, Yeh TK, Huang WZ, Chang SE, Wu SH, Hung HC, Hsu TA, Lee SJ, Song JS, Lin WH, Chiang TJ, Lin JH, Sytwu HK, Chen CT. Artificial intelligence approach fighting COVID-19 with repurposing drugs. Biomed J. 2020 May 15:S2319-4170(20)30049-4. doi: 10.1016/j.bj.2020.05.001. Epub ahead of print. PMID: 32426387; PMCID: PMC7227517.

      5: Elzupir AO. Inhibition of SARS-CoV-2 main protease 3CLpro by means of α-ketoamide and pyridone-containing pharmaceuticals using in silico molecular docking. J Mol Struct. 2020 Dec 15;1222:128878. doi: 10.1016/j.molstruc.2020.128878. Epub 2020 Jul 10. PMID: 32834113; PMCID: PMC7347502.

      Additional computational studies:

      1: Hosseini FS, Amanlou M. Anti-HCV and anti-malaria agent, potential candidates to repurpose for coronavirus infection: Virtual screening, molecular docking, and molecular dynamics simulation study. Life Sci. 2020 Aug 8;258:118205. doi:10.1016/j.lfs.2020.118205. Epub ahead of print. PMID: 32777300; PMCID:PMC7413873.

      2: Hakmi M, Bouricha EM, Kandoussi I, Harti JE, Ibrahimi A. Repurposing of known anti-virals as potential inhibitors for SARS-CoV-2 main protease using molecular docking analysis. Bioinformation. 2020 Apr 30;16(4):301-306. doi:10.6026/97320630016301. PMID: 32773989; PMCID: PMC7392094.

      3: Chtita S, Belhassan A, Aouidate A, Belaidi S, Bouachrine M, Lakhlifi T. Discovery of Potent SARS-CoV-2 Inhibitors from Approved Antiviral Drugs via Docking Screening. Comb Chem High Throughput Screen. 2020 Jul 30. doi:10.2174/1386207323999200730205447. Epub ahead of print. PMID: 32748740.

      4: Alamri MA, Tahir Ul Qamar M, Mirza MU, Bhadane R, Alqahtani SM, Muneer I, Froeyen M, Salo-Ahen OMH. Pharmacoinformatics and molecular dynamics simulation studies reveal potential covalent and FDA-approved inhibitors of SARS-CoV-2 main protease 3CLpro. J Biomol Struct Dyn. 2020 Jun 24:1-13. doi:10.1080/07391102.2020.1782768. Epub ahead of print. PMID: 32579061; PMCID:PMC7332866.

      5: Bafna K, Krug RM, Montelione GT. Structural Similarity of SARS-CoV2 Mpro and HCV NS3/4A Proteases Suggests New Approaches for Identifying Existing Drugs Useful as COVID-19 Therapeutics. ChemRxiv [Preprint]. 2020 Apr 21. doi: 10.26434/chemrxiv.12153615. PMID: 32511291; PMCID: PMC7263768.

      6: Eleftheriou P, Amanatidou D, Petrou A, Geronikaki A. In Silico Evaluation of the Effectivity of Approved Protease Inhibitors against the Main Protease of the Novel SARS-CoV-2 Virus. Molecules. 2020 May 29;25(11):2529. doi:10.3390/molecules25112529. PMID: 32485894; PMCID: PMC7321236.

      7: Wang J. Fast Identification of Possible Drug Treatment of Coronavirus Disease-19 (COVID-19) through Computational Drug Repurposing Study. J Chem Inf Model. 2020 Jun 22;60(6):3277-3286. doi: 10.1021/acs.jcim.0c00179. Epub 2020 May 4. PMID: 32315171; PMCID: PMC7197972.

      8: Chen YW, Yiu CB, Wong KY. Prediction of the SARS-CoV-2 (2019-nCoV) 3C-like protease (3CL pro) structure: virtual screening reveals velpatasvir, ledipasvir, and other drug repurposing candidates. F1000Res. 2020 Feb 21;9:129. doi: 10.12688/f1000research.22457.2. PMID: 32194944; PMCID: PMC7062204.

      Minor comments:

      We appreciate the time that the reviewer has taken to address grammatical changes and have addressed each throughout the manuscript with tracked changes.

      p.2, line 26: > appears as an attractive

      Manuscript edited

      p.2, line 27: > we show that the existing

      Manuscript edited

      p.2, line 33: > separate numbers and units, e.g. 1.10 µM (this is a persisting error that should be corrected throughout the whole ms)

      Manuscript edited

      p.4, line 44: SARS virus should be referred to as SARS-CoV-1 throughout the whole manuscript. MERS-CoV is the name of the virus causing MERS

      Manuscript edited

      p.4, lines 61-62: > the selection of the specific compounds seems to be arbitrary... why atazanavir and not darunavir or other? The sentence should be rewritten.

      Rewritten as: “There is precedence for targeting the protease, as this approach has been successful in treating both HIV-1 and hepatitis C.”

      p.6, line 100: Citing Fig. 2B before completing the description of Fig. 1 is distracting. Authors should think of a better way to describe their results.

      This was a mistake and should have cited Fig 1B. Thank you for catching this.

      p.7, line 116: It is not clear what "10m-20,810" means

      This has been clarified to state: “ΔRFU at 10 minutes = 20,810 relative fluorescence units”

      p.7, lines 125-126: These sentences belong to an introduction, not appropriate in results section.

      We have removed these sentences.

      Figure 2. Part A is not necessary in results (ok for introduction). Black and purple dots in part B are not a good choice since they are difficult to distinguish; maybe orange and black would be better.

      We have removed panel A, expanded the size of panel B and changed the color.

      Table 1: Status should be explained in a footnote (i.e., the distinction between launched, P2/P3, phase 2, preclinical is not clear).

      The one compound indicated in P2/P3 development is now Phase 3 and the table has been updated. We have added a footnote:

      *Launched = compound approved for humans, though may only be approved for veterinary use in some countries

      Discussion. I think that subheadings are not necessary.

      Subheadings have been removed from the discussion.

      **Referees cross-commenting** I agree with reviewer no. 1 on the limited interest of the study. However, it could be published in a specialized lower-impact journal after addressing issues raised by reviewers 2 and 3 (likely to be completed in less than a month)

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this report, Baker et al. show that four inhibitors of hepatitis C virus (HCV) NS3/4 protease (ciluprevir, boceprevir, narlaprevir and telaprevir) are also effective inhibitors of the SARS-CoV-2 main protease (Mpro) in enzymatic assays, with lower IC50 values for narlaprevir and boceprevir (around 1 µM in their assay conditions). HCV NS3/4 inhibitors were identified after screening a library of >6,000 compounds of the Broad Institute, including approved drugs. Screening was done with fluorometric proteolytic assays.

      The experiments appear to have been well done and the results are sound. The manuscript needs editing.

      Significance

      The experiments appear to have been well done and the results are sound. However, this is a limited study since there are no data obtained in cell culture and a comparison of IC50 values of the selected drugs against HCV and SARS-CoV-2 proteases is missing. It is difficult to infer whether the drugs would be as effective against SARS-CoV-2 as against HCV and, if not, how much the doses would need to increase in order to have a therapeutic effect. The manuscript needs editing (see below) and the Discussion is poor. The results reported by the authors are not new, and a discussion of the effects of HCV inhibitors on SARS-CoV-2 replication, based on previous publications, is necessary to provide the appropriate context for the study. Here are some references on COVID-19 and HCV inhibitors that, in my opinion, should be considered for discussion and proper citation. As correctly pointed out by Baker and co-workers, docking studies should be considered with caution, though.

      1: Ghahremanpour MM, Tirado-Rives J, Deshmukh M, Ippolito JA, Zhang CH, de Vaca IC, Liosi ME, Anderson KS, Jorgensen WL. Identification of 14 Known Drugs as Inhibitors of the Main Protease of SARS-CoV-2. bioRxiv [Preprint]. 2020 Aug 28:2020.08.28.271957. doi: 10.1101/2020.08.28.271957. PMID: 32869018; PMCID: PMC7457600.

      2: Sacco MD, Ma C, Lagarias P, Gao A, Townsend JA, Meng X, Dube P, Zhang X, Hu Y, Kitamura N, Hurst B, Tarbet B, Marty MT, Kolocouris A, Xiang Y, Chen Y, Wang J. Structure and inhibition of the SARS-CoV-2 main protease reveals strategy for developing dual inhibitors against M<sup>pro</sup> and cathepsin L. bioRxiv [Preprint]. 2020 Jul 27:2020.07.27.223727. doi: 10.1101/2020.07.27.223727. PMID: 32766590; PMCID: PMC7402059.

      3: Ma C, Sacco MD, Hurst B, Townsend JA, Hu Y, Szeto T, Zhang X, Tarbet B, Marty MT, Chen Y, Wang J. Boceprevir, GC-376, and calpain inhibitors II, XII inhibit SARS-CoV-2 viral replication by targeting the viral main protease. Cell Res. 2020 Aug;30(8):678-692. doi: 10.1038/s41422-020-0356-z. Epub 2020 Jun 15. PMID: 32541865; PMCID: PMC7294525.

      4: Ke YY, Peng TT, Yeh TK, Huang WZ, Chang SE, Wu SH, Hung HC, Hsu TA, Lee SJ, Song JS, Lin WH, Chiang TJ, Lin JH, Sytwu HK, Chen CT. Artificial intelligence approach fighting COVID-19 with repurposing drugs. Biomed J. 2020 May 15:S2319-4170(20)30049-4. doi: 10.1016/j.bj.2020.05.001. Epub ahead of print. PMID: 32426387; PMCID: PMC7227517.

      5: Elzupir AO. Inhibition of SARS-CoV-2 main protease 3CLpro by means of α-ketoamide and pyridone-containing pharmaceuticals using in silico molecular docking. J Mol Struct. 2020 Dec 15;1222:128878. doi: 10.1016/j.molstruc.2020.128878. Epub 2020 Jul 10. PMID: 32834113; PMCID: PMC7347502.

      Additional computational studies:

      1: Hosseini FS, Amanlou M. Anti-HCV and anti-malaria agent, potential candidates to repurpose for coronavirus infection: Virtual screening, molecular docking, and molecular dynamics simulation study. Life Sci. 2020 Aug 8;258:118205. doi:10.1016/j.lfs.2020.118205. Epub ahead of print. PMID: 32777300; PMCID:PMC7413873.

      2: Hakmi M, Bouricha EM, Kandoussi I, Harti JE, Ibrahimi A. Repurposing of known anti-virals as potential inhibitors for SARS-CoV-2 main protease using molecular docking analysis. Bioinformation. 2020 Apr 30;16(4):301-306. doi:10.6026/97320630016301. PMID: 32773989; PMCID: PMC7392094.

      3: Chtita S, Belhassan A, Aouidate A, Belaidi S, Bouachrine M, Lakhlifi T. Discovery of Potent SARS-CoV-2 Inhibitors from Approved Antiviral Drugs via Docking Screening. Comb Chem High Throughput Screen. 2020 Jul 30. doi:10.2174/1386207323999200730205447. Epub ahead of print. PMID: 32748740.

      4: Alamri MA, Tahir Ul Qamar M, Mirza MU, Bhadane R, Alqahtani SM, Muneer I, Froeyen M, Salo-Ahen OMH. Pharmacoinformatics and molecular dynamics simulation studies reveal potential covalent and FDA-approved inhibitors of SARS-CoV-2 main protease 3CL<sup>pro</sup>. J Biomol Struct Dyn. 2020 Jun 24:1-13. doi:10.1080/07391102.2020.1782768. Epub ahead of print. PMID: 32579061; PMCID:PMC7332866.

      5: Bafna K, Krug RM, Montelione GT. Structural Similarity of SARS-CoV2 M<sup>pro</sup> and HCV NS3/4A Proteases Suggests New Approaches for Identifying Existing Drugs Useful as COVID-19 Therapeutics. ChemRxiv [Preprint]. 2020 Apr 21. doi: 10.26434/chemrxiv.12153615. PMID: 32511291; PMCID: PMC7263768.

      6: Eleftheriou P, Amanatidou D, Petrou A, Geronikaki A. In Silico Evaluation of the Effectivity of Approved Protease Inhibitors against the Main Protease of the Novel SARS-CoV-2 Virus. Molecules. 2020 May 29;25(11):2529. doi:10.3390/molecules25112529. PMID: 32485894; PMCID: PMC7321236.

      7: Wang J. Fast Identification of Possible Drug Treatment of Coronavirus Disease-19 (COVID-19) through Computational Drug Repurposing Study. J Chem Inf Model. 2020 Jun 22;60(6):3277-3286. doi: 10.1021/acs.jcim.0c00179. Epub 2020 May 4. PMID: 32315171; PMCID: PMC7197972.

      8: Chen YW, Yiu CB, Wong KY. Prediction of the SARS-CoV-2 (2019-nCoV) 3C-like protease (3CL<sup>pro</sup>) structure: virtual screening reveals velpatasvir, ledipasvir, and other drug repurposing candidates. F1000Res. 2020 Feb 21;9:129. doi: 10.12688/f1000research.22457.2. PMID: 32194944; PMCID: PMC7062204.

      Minor comments:

      p.2, line 26: > appears as an attractive

      p.2, line 27: > we show that the existing

      p.2, line 33: > separate numbers and units, e.g. 1.10 µM (this is a persisting error that should be corrected throughout the whole ms)

      p.4, line 44: SARS virus should be referred to as SARS-CoV-1 throughout the whole manuscript. MERS-CoV is the name of the virus causing MERS

      p.4, lines 61-62: > the selection of the specific compounds seems to be arbitrary... why atazanavir and not darunavir or other? The sentence should be rewritten.

      p.6, line 100: Citing Fig. 2B before completing the description of Fig. 1 is distracting. Authors should think of a better way to describe their results.

      p.7, line 116: It is not clear what "10m-20,810" means

      p.7, lines 125-126: These sentences belong to an introduction, not appropriate in results section.

      Figure 2. Part A is not necessary in results (ok for introduction). Black and purple dots in part B are not a good choice since they are difficult to distinguish; maybe orange and black would be better.

      Table 1: Status should be explained in a footnote (i.e., the distinction between launched, P2/P3, phase 2, preclinical is not clear).

      Discussion. I think that subheadings are not necessary.

      Referees cross-commenting

      I agree with reviewer no. 1 on the limited interest of the study. However, it could be published in a specialized lower-impact journal after addressing issues raised by reviewers 2 and 3 (likely to be completed in less than a month)

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The SARS-CoV-2 pandemic is causing a serious global health crisis. There are currently no specific medicines or vaccines to contain this virus. To address this issue, the authors developed an efficient fluorescent Mpro assay system and screened ~6,070 previously used drugs. Several compounds with activity against SARS-CoV-2 Mpro in vitro were found. Most hits are hepatitis C NS3/4A protease inhibitors with fair IC50 values. In addition, the authors found that most compounds identified in in silico screens lack activity against Mpro in kinetic protease assays.

      The results are well supported and reproducible. However, I have two minor questions:

      1. In your Mpro assay optimization, you state that the substrate MCA-AVLQSGFR-K(Dnp)-K-NH2 had drastically lower rates of Mpro-catalyzed hydrolysis and was not considered further in your assay development, and in your Fig. 1 I saw extremely low RFU changes. However, several good inhibitors were identified in a screen using this substrate that was reported in April. Can you explain this result?

      2. To exclude inhibitors possibly acting as aggregators, a detergent-based control should be run alongside the IC50 measurements.

      Significance

      Nice work, but the significance of this article is now diminishing. Most of the screened hits have been reported in the last several months. Some inhibitor complex structures have been published or released in the Protein Data Bank. The novelty is missing. I suggest the authors add more results and resubmit.

      Referees Cross-commenting

      I agree with the other two reviewers' comments. The significance of this work is diminishing, but it is still of some interest. I think it can be published in a lower-impact journal if the authors address our suggestions.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Baker et al. report the screening of a collection of ~6,070 drugs for their inhibitory activity against the enzymatic activity of the SARS-CoV-2 Mpro protein in vitro using two peptide substrates. 50 compounds with activity against Mpro were identified and tested for their dose-dependent effect in the same assay. Several hits were identified, among which are approved drugs that target the HCV protease.

      Indeed, there is an urgent need for effective drugs for SARS-CoV-2 infection, and high throughput screenings can discover novel candidates. However, the novelty of this work is quite limited, as former screens have been published with the same target using the same substrates. Moreover, as discussed below, the translational impact of the hits discussed is also quite limited, particularly in the absence of antiviral data. Lastly, there are several overstatements in the write-up and it will require major editing.

      Major comments:

      1. Were there any positive controls previously shown to potently inhibit the SARS-CoV-2 Mpro included in the screen (e.g. ebselen)? How did these perform in this assay?

      2. It would be helpful if the authors provided information regarding the 50 hits from prior screens conducted with this library of compounds: how promiscuous are they across screens? How toxic are they in cell-based assays?

      3. The translational potential of the findings appears to be limited. The calculated IC50s for these drugs in the Mpro assay are very high (10-1000 fold higher) relative to their IC50 in an enzymatic assay involving the HCV protease: Boceprevir (IC50 = 0.95 μM vs. 0.084 μM in HCV), Ciluprevir (IC50 = 20.77 μM vs. 0.0087 in HCV), Telaprevir (IC50 = 15.25 μM vs. 0.050 μM in HCV) (https://aac.asm.org/content/aac/57/12/6236.full.pdf). In the absence of antiviral data, the main statement of the manuscript that "the work presented here supports the rapid evaluation of previous HCV NS3/4A inhibitors for repurposing as a COVID-19 therapy." is thus an overstatement. Even if there is some activity, since it is likely to be limited, as with the HIV protease inhibitors, its chances of eliciting a meaningful clinical effect are low. Moreover, when used in monotherapy, some of these protease inhibitors have a very low genetic barrier to resistance.

      4. There are additional inaccuracies or overstatements - e.g. line 61 "Probably the most successful approved antivirals are protease inhibitors such as atazanavir for HIV-1 and simeprevir for hepatitis C. [reviewed in 10 and 11]."

      5. The manuscript requires editing - e.g. structure of sentences, commas, spacing (including in the abstract) etc.

      6. What is the take home message? The statement "Taken together this work suggests previous large-scale commercial drug development initiatives targeting hepatitis C NS3/4A viral protease should be revisited because some previous lead compounds may be more potent against SARS-CoV-2 Mpro than Boceprevir and suitable for rapid repurposing." is unclear.
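
      The fold differences cited in major comment 3 can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the IC50 values quoted in that comment, assumes every value is in μM (the ciluprevir HCV figure is quoted without a unit), and the function name `fold_change` is hypothetical.

```python
# IC50 values as quoted in major comment 3: (SARS-CoV-2 Mpro, HCV protease).
# Assumption: all values are in µM; the ciluprevir HCV value is quoted without a unit.
IC50_UM = {
    "boceprevir": (0.95, 0.084),
    "ciluprevir": (20.77, 0.0087),
    "telaprevir": (15.25, 0.050),
}


def fold_change(mpro_ic50: float, hcv_ic50: float) -> float:
    """Return how many times higher the Mpro IC50 is than the HCV IC50."""
    return mpro_ic50 / hcv_ic50


# Ratio per drug: a larger number means weaker inhibition of Mpro relative to HCV protease.
ratios = {drug: fold_change(*pair) for drug, pair in IC50_UM.items()}

for drug, ratio in ratios.items():
    print(f"{drug}: {ratio:.0f}-fold higher IC50 against Mpro")
```

      With these inputs the ratios come out at roughly 11-fold for boceprevir, 305-fold for telaprevir and about 2,400-fold for ciluprevir.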

      Significance

      Limited, as discussed above.

    1. Reviewer #3:

      This fMRI study examines an interesting question, namely how computer code - as a "cognitive/cultural invention" - is processed by the human brain. However, I have a number of concerns with regard to how this question was examined in terms of experimental design, including the choice of control condition (fake code) and the way in which localiser tasks were utilised. In addition, the sample size is very small (n=15) and there appear to be large inter-individual differences in coding performance (in spite of the recruitment of expert programmers). In summary, while promising in its aims, the study's conclusions are weakened by these considerations related to its execution.

      1) The control condition

      The experiment contrasted real Python code with fake code in the form of "incomprehensible scrambled Python functions". Real and fake code also differed in regard to the task performed (code comprehension versus memory) and were distinguished via colour coding. There is a lot to unpack here in regard to how processing might differ between the two different conditions. For example, the real code blocks required code comprehension as well as computational problem solving (which does not necessarily require the use of code), while the control task requires neither. As a result of the colour coding, it also appears likely that participants will have approached the fake code blocks with a completely different processing strategy than the real code blocks. These are just a few obvious differences between the conditions but there are likely many more given how different they are. This, in my view, makes it difficult to interpret the basic contrast between real and fake code.

      2) Use of localiser tasks

      A similar concern as for point 1 holds in regard to the localiser tasks that were used in order to examine anatomical overlap (or lack thereof) between code comprehension and language, maths, logical problem solving and multiple-demand executive control, respectively. I am generally somewhat sceptical in regard to the use of functional localisers in view of the assumptions that necessarily enter into the definition of a localiser task. This concern is exacerbated by the way in which localisers were employed in the present study. Firstly, in addition to the definition of the localiser task itself, this study used localiser contrasts to define networks of interest. For example, the contrast language localiser > maths localiser served to define the "language network". Thus, assumptions about the nature of the localiser itself are compounded with those regarding the nature of the contrast. Secondly, particularly with regard to language, the localiser task was very high level, i.e. requiring participants to judge whether an active and a passive sentence had the same meaning (with both statements remaining on the screen at the same time). While of course requiring language processing, this task is arguably also a problem solving task of sorts. It is certainly more complex than a typical task designed to probe fast and automatic aspects of natural language processing.

      In addition, given that reading is also a cultural invention, is it really fair to say that coding is being compared to the "language network" here rather than to the "reading network" (in view of the visual presentation of the language task)? The possible implications of this for the interpretation of the data should be considered.

      More generally, while an anatomical overlap between networks active during code comprehension and networks recruited during other cognitive tasks may shed some initial light on how the brain processes code, it doesn't support any particularly strong conclusions about the neural mechanisms of code processing in my view. While code comprehension may overlap anatomically with regions involved in executive control and logic, this doesn't mean that the same neuronal populations are recruited in each task nor that the processing mechanisms are comparable between tasks.

      3) Sample size and individual differences

      At n=15, the sample size of this study is quite small, even for a neuroimaging study. This again limits the conclusions that can be drawn from the study results.

      Moreover, the results of the behavioural pre-test - which was commendably included - suggest that participants differed considerably with regard to their Python expertise. For the more difficult exercise in this pre-test, the mean accuracy score was 64.6% with a range from 37.5% to 93.75%. These substantial differences in proficiency weren't taken into account in the analysis of the fMRI data and, indeed, it appears difficult to meaningfully do so in view of the sample size.

    2. Reviewer #2:

      The goal of this fMRI study was to determine which brain systems support coding, by way of the extent of overlap of univariate maps with localizer tasks for language, logic, math, and executive functions. The basic conclusion is one we could have anticipated: coding engages a widespread frontoparietal network, with stronger involvement of the left hemisphere. It overlaps with all of the other tasks, but most with the map for logic. This doesn't seem too surprising, but the authors argue convincingly that others wouldn't have predicted that.

      It's unfortunate that there are differences in task difficulty among the tasks - in particular, that the logic task was the most difficult of all (both in terms of accuracy and response times), since that happens to be the one that had the largest number of overlapping voxels with the coding task. We can't know whether coding and language task voxels would have overlapped more if the language task had been more difficult.

      It seems a shame to present data only from highly experienced coders (11+ years of experience); I can imagine that the investigators are planning to write up another study examining effects of expertise, in comparison with less experienced coders. This seems like an initial paper that's laying the groundwork for a more groundbreaking one.

    3. Reviewer #1:

      This manuscript is clearly written and the methods appear to be rigorous, although the number of subjects (15) is a bit low; however, this does not appear to critically limit interpretation of the results. I appreciated the focused inclusion on expert coders to make a clear comparison to language. I also thought that the inclusion of multiple domains for comparison (logic, math, executive function, and language) was quite informative. The laterality covariance between code and language was also quite interesting. I do have some concerns with the literature review and discussion of present and previous results.

      1) My main concern with this paper is that it does not clearly review previous fMRI studies on code processing. How do the present results compare with previous studies (e.g. Castelhano et al., 2019; Floyd et al., 2017; Huang et al., 2019; Krueger et al., 2020; Siegmund et al., 2014, 2017)? It seems like the localization/lateralization obtained in the present study is largely similar to these previous studies (e.g. Siegmund et al., 2017). If so, this should be discussed: a convergence across multiple methods/authors is useful to know. Any discrepancies are also useful to know. The authors suggest that "Moreover, no prior study has directly compared the neural basis of code to other cognitive domains." However, Krueger et al. (2020) and Huang et al. (2019) appear to have done this.

      2) The authors should point out and discuss the difficulty of understanding the psychological and neural structure of coding in the absence of a clear theory of coding, as is the case for language (e.g. Chomsky, 1965; Levelt, 1989; Lewis & Vasishth, 2005). On this point, I appreciate the reference to Fitch et al. (2005) regarding recursion in coding, but I think it would be most helpful to have a clear example of recursion in Python code. However, the authors at least focus their results on neural underpinnings without attempting to make strong claims about cognitive underpinnings.

      3) The authors report overlap between code comprehension and language in the posterior MTG and IFG. They note that these activations were somewhat inconsistent; yet, they did observe this significant overlap. However, the paper discusses the results as if this overlap did not occur, e.g. "We find that the perisylvian fronto-temporal network that is selectively responsive to language, relative to math, does not overlap with the neural network involved in code comprehension." This is not accurate, as there indeed was overlap. It is important to point out that among language-related regions, these two regions are the most strongly associated with abstract syntax (Friederici, 2017; Hagoort, 2005; Tyler & Marslen-Wilson, 2008; Pallier et al., 2011; Bornkessel-Schlesewsky & Schlesewsky, 2013; Matchin & Hickok, 2019), which very well could be a point of shared resources among code and language (as discussed in Fitch, 2005).
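
      As an illustration of the kind of example requested in comment 2, here is a minimal sketch of recursion in Python. It is not taken from the manuscript or its stimuli; the function name `nesting_depth` and the nested-list input are hypothetical, chosen only because nested structure is a standard case in which a function calls itself.

```python
def nesting_depth(item) -> int:
    """Return how deeply lists are nested inside `item` (0 for a non-list)."""
    if not isinstance(item, list):
        # Base case: a plain value contains no nesting, so recursion stops here.
        return 0
    # Recursive case: count this list, then add the depth of its deepest child.
    # `default=0` handles the empty list, which has no children to inspect.
    return 1 + max((nesting_depth(child) for child in item), default=0)


print(nesting_depth([1, [2, [3, 4]], 5]))  # prints 3
```

      The function is defined in terms of itself applied to smaller parts of the input, which mirrors the embedding of a structure within a structure of the same kind.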

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      This was co-submitted with the following manuscript: https://www.biorxiv.org/content/10.1101/2020.04.16.045732v1

      Summary:

      The remit of the co-submission format is to ask if the scientific community is enriched by the data presented in the co-submitted manuscripts together more so than it would be by the papers apart, or if only one paper was presented to the community. In other words, are the conclusions that can be made stronger or clearer when the manuscripts are considered together rather than separately? We felt that despite significant concerns with each paper individually, especially regarding the theoretical structures in which the experimental results could be interpreted, that this was the case.

      We want to be very clear that in a non-co-submission case we would have substantial and serious concerns about the interpretability and robustness of the Liu et al. submission given its small sample size. Furthermore, the reviewers' concerns about the suitability of the control task differed substantially between the manuscripts. We share these concerns. However, despite these differences in control task and sample size, the Liu et al. and Ivanova et al. submissions nonetheless replicated each other - the language network was not implicated in processing programming code. The replication substantially mitigates the concerns shared by us and the reviewers about sample size and control tasks. The fact that different control tasks and sample sizes did not change the overall pattern of results, in our view, is affirmation of the robustness of the findings, and the value that both submissions presented together can offer the literature.

      In sum, there were concerns that both submissions were exploratory in nature, lacking a strong theoretical focus, and relied on functional localizers on novel tasks. However, these concerns were mitigated by the following strengths. Both tasks ask a clear and interesting question. The results replicate each other despite task differences. In this way, the two papers strengthen each other. Specifically, the major concerns for each paper individually are ameliorated when considering them as a whole.

      The concerns of the reviewers need addressing, including, specifically, the limits of interpretation of your results with regard to control task choice, the discussion of relevant literature mentioned by the reviewers, and most crucially, please contextualize your results with regard to the other submission's results.

    1. Reviewer #2:

      This carefully designed fMRI study examines an interesting question, namely how computer code - as a "cognitive/cultural invention" - is processed by the human brain. The study has a number of strengths, including: use of two very different programming languages (Python and Scratch Jr.) in two experiments; direct comparison between code problems and "content-matched sentence problems" to disentangle code comprehension from problem content; control for the impact of lexical information in code passages by replacing variable names with Japanese translations; and consideration of inter-individual differences in programming proficiency. I do, however, have some questions regarding the interpretation of the results in mechanistic terms, as detailed below.

      1) Code comprehension versus underlying problem content

      I am generally somewhat sceptical in regard to the use of functional localisers in view of the assumptions that necessarily enter into the definition of a localiser task. In addition, an overlap between the networks supporting two different tasks doesn't imply comparable neural processing mechanisms. With the present study, however, I was impressed by the authors' overall methodological approach. In particular, I found the supplementation of the localiser-based approach with the comparison between code problems and analogous sentence problems rather convincing.

      However, while I agree that computational thinking does not require coding / code comprehension, it is less clear to me what code comprehension involves when it is stripped of the computational thinking aspect. Knowing how to approach a problem algorithmically strikes me as a central aspect of coding. What, then, is being measured by the code problem versus sentence problem comparison? Knowledge of how to implement a certain computational solution within a particular programming language? The authors touch upon this briefly in the Discussion section of the paper, but I am not fully convinced by their arguments. Specifically, they state:

      "The process of code comprehension includes retrieving code-related knowledge from memory and applying it to the problems at hand. This application of task-relevant knowledge plausibly requires attention, working memory, inhibitory control, planning, and general flexible reasoning-cognitive processes long linked to the MD system [...]." (p.17)

      Shouldn't all of this also apply (or even apply more strongly) to processing of the underlying problem content rather than to code comprehension per se?

      According to the authors, the extent to which code-comprehension-related activity reflects problem content varies between different systems. At the bottom of p.9, they conclude that "MD responses to code [...] do not exclusively reflect responses to problem content", while on p.13 they argue on the basis of their voxel-wise correlation analysis that "the language system's response to code is largely (although not completely) driven by problem content". However, unless I have missed something, the latter analysis was only undertaken for the language system but not for the other systems under examination. Was there a particular reason for this? Also, what are the implications of observing problem content-driven responses within the language system for the authors' conclusion that this system is "functionally conservative"?

      Overall, the paper would be strengthened by more clarity in regard to these issues - and specifically a more detailed discussion of what code comprehension may amount to in mechanistic terms when it is stripped of computational thinking.

      2) Implications of using reading for the language localiser task

      Given that reading is also a cultural invention, is it really fair to say that coding is being compared to the "language system" here rather than to the "reading system" (in view of the visual presentation of the language task)? The possible implications of this for the interpretation of the data should be considered.

      3) Possible effects of verbalisation?

      It appears possible that participants may have internally verbalised code problems - at least to a certain extent (and likely with a considerable degree of inter-individual variability). How might this have affected the results of the present study? Could verbalisation be related to the highly correlated response between code problems and language problems within the language system?

    2. Reviewer #1:

      The manuscript is well-written and the methods are clear and rigorous, representing a clear advance on previous research comparing computer code programming to language. The conclusions with respect to which brain networks computer programming activates are compelling and well conveyed. This paper is useful to the extent that the conclusions are focused on the empirical findings: whether or not code activates language-related brain regions (answer: no). However, the authors appear to be also testing whether or not any of the mechanisms involved in language are recruited for computer programming. The problem with this goal is that the authors do not present or review a theory of the representations and mechanisms involved in computer programming, as has been developed for language (e.g. Adger, 2013; Bresnan, 2001; Chomsky, 1965, 1981, 1995; Goldberg, 1995; Hornstein, 2009; Jackendoff, 2002; Levelt, 1989; Lewis & Vasishth, 2005; Vosse & Kempen, 2000).

      1) p. 15: "The fact that coding can be learned in adulthood suggests that it may rely on existing cognitive systems." p. 3: "Finally, code comprehension may rely on the system that supports comprehension of natural languages: to successfully process both natural and computer languages, we need to access stored meanings of words/tokens and combine them using hierarchical syntactic rules (Fedorenko et al., 2019; Murnane, 1993; Papert, 1993) - a similarity that, in theory, should make the language circuits well-suited for processing computer code." If we understand stored elements and computational structure in the broadest way possible without breaking this down more, many domains of cognition would be shared in this way. The authors should illustrate in more detail how the psychological structure of computer programming parallels language. Is there an example of hierarchical structure in computer code? What is the meaning of a variable/function in code, and how does this compare to meaning in language?

      2) p. 19 lines 431-433: "Our findings, along with prior findings from math and logic (Amalric & Dehaene, 2019; Monti et al., 2009, 2012), argue against this possibility: the language system does not respond to meaningful structured input that is non-linguistic." This is an overly simple characterization of the word "meaningful". Meaning in math and logic is not the same as meaning in language. Both mathematics and computer programming have logical structure to them, but the nature of this structure and the elements that are combined in language are different. Linguistic computations take as input complex atoms of computation that have phonological and conceptual properties. These atoms are commonly used to refer to entities "in the world" with complex semantic properties and often have rich associated imagery. Linguistic computations output complex, monotonically enhanced forms. So cute + dogs = cute dogs, chased + cute dogs = chased cute dogs, etc. This is very much unlike mathematics and computer programming, where we typically do not use these expressions to refer to the "real world" when addressing interlocutors, where the outputs of an expression are not monotonic, structure-preserving combinations of the input elements, and where no semantic enhancement occurs through increased computation. This bears much more discussion in the paper, if the authors intend to make claims regarding shared/distinct computations between computer programming and language.

      3) More importantly, even if there were shared mechanisms between computer code programming and language, I'm not sure we can use reverse inference to strongly test this hypothesis. As Poldrack (2006) pointed out, reverse inference is sharply limited by the extent to which we know how cognition maps onto the brain. This is a similar point to Poeppel & Embick (2005), who pointed out that different mechanisms of language could be implemented in the brain in a large variety of ways, only one of which is big pieces of cortical tissue. In this sense, there could in fact be shared mechanisms between language and code (e.g. oscillatory dynamics, connectivity patterns, subcortical structures), but these mechanisms might not be aligned with the cortical territory associated with language-related brain regions. The authors should spend considerably more time discussing these alternative possibilities.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      This was co-submitted with the following manuscript: https://www.biorxiv.org/content/10.1101/2020.05.24.096180v3

      Summary:

      The remit of the co-submission format is to ask if the scientific community is enriched by the data presented in the co-submitted manuscripts together more so than it would be by the papers apart, or if only one paper was presented to the community. In other words, are the conclusions that can be made stronger or clearer when the manuscripts are considered together rather than separately? We felt that this was the case, despite significant concerns with each paper individually, especially regarding the theoretical structures in which the experimental results could be interpreted.

      We want to be very clear that in a non-co-submission case we would have substantial and serious concerns about the interpretability and robustness of the Liu et al. submission given its small sample size. Furthermore, the reviewers' concerns about the suitability of the control task differed substantially between the manuscripts. We share these concerns. However, despite these differences in control task and sample size, the Liu et al. and Ivanova et al. submissions nonetheless replicated each other - the language network was not implicated in processing programming code. The replication substantially mitigates the concerns shared by us and the reviewers about sample size and control tasks. The fact that different control tasks and sample sizes did not change the overall pattern of results, in our view, is affirmation of the robustness of the findings, and the value that both submissions presented together can offer the literature.

      In sum, there were concerns that both submissions were exploratory in nature, lacking a strong theoretical focus, and relied on functional localizers on novel tasks. However, these concerns were mitigated by the following strengths. Both tasks ask a clear and interesting question. The results replicate each other despite task differences. In this way, the two papers strengthen each other. Specifically, the major concerns for each paper individually are ameliorated when considering them as a whole.

      The concerns of the reviewers need addressing, including, specifically, the limits of interpretation of your results with regard to control task choice, the discussion of relevant literature mentioned by the reviewers, and most crucially, please contextualize your results with regard to the other submission's results.

    1. Reviewer #3:

      In this interesting study, the authors explored the effect of five consecutive generations of high-fat high-sugar diet (WD) in mice and their offspring's metabolic performance under a normal chow diet. It is very interesting to find that the chow-diet-fed progenies from these multigenerational western-diet-fed males develop a "healthy" overweight phenotype (that is, without glucose metabolism problems or fatty liver abnormalities) that persists for 4 subsequent generations. In parallel, the authors also performed zygotic sperm RNA injection using sperm RNAs from the WD-fed males (both from the first generation and from five generations of feeding) and showed that the sperm RNA indeed induces offspring metabolic phenotypes in F1 mice and that some phenotypes persist to F2-F3, but none persist to F4, which is different from the mating-induced phenotype (which lasts 4 generations). The study is overall well-performed and the comprehensive examinations (especially on phenotypes) represent an advance to the mammalian epigenetic inheritance field. I have a few concerns and suggestions for further improvement.

      1) In the abstract, I strongly recommend that the authors clarify what a "healthy" overweight phenotype is, which in the current paper means normal glucose metabolism and no fatty liver. This will make the abstract more informative and precise. In fact, this is the major novel discovery in the phenotypic exploration, not only for social-medical implications, but also from the perspective of evolution. It looks like the five-generational western-diet-fed males have evolved to develop a protective mechanism in glucose and liver fat metabolism that can be inherited by the offspring. The underlying mechanism is intriguing and worth exploring in the future using this model. More extensive discussions on the social-medical and evolutionary aspects could be included.

      2) Regarding the phenotype induced by sperm RNA injection, the description should be more precise, as the current description is not entirely consistent with the data presented. In Figure 4, some parameter changes persist to F2-F3, which already suggests transgenerational inheritance rather than merely intergenerational transmission. The more precise description would be that sperm RNAs can unequivocally induce an intergenerational phenotype, but may also induce some transgenerational features, although the effect is weaker than the effect induced by whole sperm. In fact, in a previous study using a mental-stress-induced model, sperm RNA injection also induced a phenotype in both F1 and F2 generations (Nat Neurosci. 2014 May;17(5):667-9.).

      3) The sperm small RNA analysis part (Fig. S4) is relatively weak. The datasets generated are in fact quite valuable as they include the sperm from the control diet, first-generation WD and the fifth-generation WD. This is an opportunity to explore the difference especially between the first-generation WD and fifth-generation WD as no one has done this before. The current data analyses are crude and do not show these differences in an informative way. At a minimum, the overall length distribution of each dataset should be provided, with the annotation of different types of small RNAs. The authors have shown some differences regarding miRNAs and tRNA-derived small RNAs (tsRNAs) in Fig. S4; it would be interesting to also look at the rRNA-derived small RNAs (rsRNAs) because rsRNAs are also extensively discovered in both mouse and human sperm and these sperm rsRNAs are sensitive to dietary changes (Nat Cell Biol. 2018 May;20(5):535-540; PLoS Biol. 2019 Dec 26;17(12):e3000559.), closely associated with mammalian epigenetic inheritance and thus represent a component of the recently proposed sperm RNA code in epigenetic inheritance (Nat Rev Endocrinol. 2019 Aug;15(8):489-498). The reanalysis of the datasets could be done with SPORTS1.0 (Genomics Proteomics Bioinformatics. 2018 Apr;16(2):144-151.), which provides the annotation and analyses of miRNAs, tsRNAs, rsRNAs and piRNAs that have been used in the above-mentioned publications (Nat Cell Biol. 2018 May;20(5):535-540; PLoS Biol. 2019 Dec 26;17(12):e3000559).

    2. Reviewer #2:

      Raad et al. examined the effects of multigenerational paternal exposure to an obesogenic diet on epigenetic and metabolic alterations at somatic and germ cell levels. The experimental work addresses an important question. The finding that sperm mRNA and natural crosses have different effects on offspring metabolic states is intriguing. The major tissue of interest explored was WAT. Fat cell size, number and gene expression were reported. The intriguing thing about these data is that the sperm RNA microinjection did not fully recapitulate the effect across multiple generations - there is little explanation of potential mechanisms.

      There is no detailed coverage of the gene changes, small RNAs, piRNAs etc observed and the pathways implicated. This would be a welcome addition.

      As this is such a complex design, more overall schematics would be helpful.

      The number of mice per group ranges widely, and it is unclear how many matings this represents. The Fig 3 legend states that 4 WD1 and 9 WD5 males from different littermates were mated with CD females - again, unclear - do you mean from different litters? The numbers shown in panel A do not seem to concur with those in panels B and C.

      Figure 1 shows outcomes for WD 1,2,3,4,5 and largely focuses on gWAT. Gene expression changes are only briefly summarised. Only 1 CD generation is represented.

      It is unclear why mice were studied at the various ages, e.g. across data sets, the ages shown are 10 weeks, 12 weeks, 16 weeks and 18 weeks. Note there are inconsistencies regarding figure formats and some details are missing, which makes it hard to understand what the authors found. Fig S3 and S5 - no n values given. Labels in S4 D, E are hard to follow.

      In several of the figures, it is not clear what the significance (*) is being compared to - is it always CD? E.g. Figure 3, Figure 4.

      It appears that variability increases from WD1 to WD5- with larger ranges evident- is this why n increases across generations? And is this a consistent observation across paternal studies of this kind?

      The effect of paternal WD on BW, GTT and adiposity is relatively larger in mice than rats- have the authors considered species differences?

      On page 10 the authors state that the diet used is not associated with hepatic steatosis - but I would have thought there was good evidence of this occurring in mice, over the timeframe described here.

      The intriguing thing about these data is that the sperm RNA microinjection did not fully recapitulate the effect across multiple generations - there is little explanation of potential mechanisms.

      It is surprising that there is no detailed coverage of the gene changes observed and the metabolic pathways implicated. The story is undersold.

    3. Reviewer #1:

      While this study focuses on an interesting hypothesis and attempts to address the molecular mechanisms at play, there are numerous flaws in the study design and the statistical tests that prevent conclusions from being drawn.

      1) In line 72, the authors state that "the average body weight of the WD-fed male mice increased gradually with multigenerational WD feeding"; however, the results of a test indicating a gradual increase are not reported. As described in the legend of Figure 1, the test performed compared body weight between the control group and each individual generation, not between the generations themselves. Visually, body weight does not in fact appear to increase gradually: for instance, a comparison of WD1 and WD3, or of WD2 and WD5, does not support the "gradual increase" in body weight that the authors are claiming.

      2) There is a lack of clarity in the methods with regard to the number of animals used in each generation, the number of founders, and what constitutes the control group. In the legend of Figure 1, it is stated that 5 males were used from WD2 onwards. However, the method section states "(...) 4 to 6 independent males of WD1 group". The reviewer assumes that the authors know how many animals were used in the WD1 group, and that the authors meant 4 to 6 animals per WD generation. However, if the details indicated in the legend of Figure 1 are accurate (5 fathers per group from WD2), how is it possible that 4 to 6 animals were used? The reviewer suggests clarifying this in the text, as well as in a more detailed experimental setup diagram stating the number of fathers in each generation, the number of offspring studied in each litter, and the total number of offspring studied for each generation.

      3) In Supplemental Figure 1I, the CD1 group appears to be composed of 7 individuals and the CD2 group of 10 individuals. This is not consistent with the numbers reported in Figure 1A (10 in CD1 and 13 in WD3) and Figure 1B (22 visible dots). It is thus difficult for the reviewer to trust that body weights were truly compared between all animals in CD1 and CD5. Regardless, the reviewer is intrigued by the choice of the authors to only study control animals from the first generation (CD1) and the fifth generation (CD5) offspring, as they describe in the methods that, for the control group, they followed the same procedure as for the WD group, which should have led to the generation of control animals in all F1, F2, F3, F4 and F5 generations. The authors should clarify this, and if they indeed generated these animals, they should use body weight data in each generation of controls and compare them to their respective generation WD group (i.e. CD1 to WD1, CD2 to WD2 etc.). By having different sample sizes in the various groups, the authors are biasing the results of the statistical tests being made, as a group with a greater sample size is more likely to test as statistically different than a group with a lower sample size (as with CD (22 observations) and WD2 (12 observations) in Figure 1B, but also with the RNA-seq results). Along the same lines, more animals were studied in WD4 and WD5 than in WD1-3, which is likely biasing the statistical analysis. Again, if the study design described in the methods section is accurately reported, it implies that an average of 3 offspring per father were used in WD1-3, and 8-10 (a full litter) for WD4-5.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript. David E James (The University of Sydney) served as the Reviewing Editor.

      Summary:

      In this manuscript, Raad and colleagues exposed male mice to a western diet before conception for 5 consecutive generations and measured body weight, adiposity and various metabolic markers in the offspring. Sequencing of small RNA in sperm from founders identified several differentially expressed tRF and miRNA species. Microinjection of RNAs recapitulated some, but not all, effects on body weight and metabolism. The authors report an aggravation of adiposity along generations and a phenotype that persists for 4 subsequent generations. Such persistence of phenotype was not observed in animals originating from microinjection of total RNAs, suggesting other epigenetic mechanisms are at play in the persistence of phenotype. Overall the studies were considered to be of interest by the referees, but one major overarching problem identified by them concerned the study design and the statistical analyses, which limited interpretation of the study. These issues need to be seriously addressed by the authors. These and other points are listed below.

    1. Reviewer #2:

      Overall, I think this is a creative study, with very interesting findings. A major weakness is that the interpretations seem a bit exaggerated and alternative interpretations not considered.

      Using a creative paradigm of perceptual filling-in, the authors show that increased attention (indexed by a reduction in alpha power over central-parietal locations, and supported by previous psychophysics studies) is associated with perceptual filling-in, and the phenomenal disappearance of targets. By tagging targets and surround with different frequencies, they show that SSVEP elicited by targets increases at the time of perceptual filling-in.

      These results suggest that SSVEP, thought to index the content of visual perception in previous binocular rivalry studies, can be dissociated from conscious perception in this paradigm, and instead reflect attention.

      While the results are interesting and novel, they are perhaps not as surprising as the authors present them to be. Given that previous studies have shown a clear connection between SSVEP and attention (e.g. Ref 14 cited by the authors), these results show that when attention and awareness are dissociated (as the last author has nicely demonstrated/argued previously), SSVEP goes with attention.

      These results do not demonstrate that all sensory-cortical activity goes along with attention instead of awareness, as the authors' abstract/significance statement/discussion suggest to be the case. E.g., in the abstract/significance statement, the authors only state "neural activity" or "neural response", instead of specifically SSVEP, which can be misleading. Similarly, in discussion, it remains a possibility that other types of neural activity (e.g. spiking rate or recurrent activity) in sensory cortex correlates with the vividness of conscious experience, which would in principle be consistent with first-order or GNW theories.

      An analysis comment:

      In discussion, the authors mention "As more targets disappeared and presumably drew attention, both the duration of their absence and strength of target SNR increased."

      The duration effect, shown in SI, is not referenced in the main text as far as I could find. In Fig. 2, in addition to investigating SSVEP's relation with the number of disappeared targets, the authors could also test its relation with the duration of PFI.

    2. Reviewer #1:

      General assessment:

      In this paper, Davidson et al. characterize the neural correlates of visual disappearance during perceptual filling-in (PFI) using steady-state visual evoked potentials (SSVEPs). They show that target disappearance actually leads to an increase rather than to a decrease of the target SNR. This finding is potentially of importance. However, the current version of the manuscript does not provide enough details regarding the underlying assumptions and neural mechanisms. The results should also be better described, interpreted and compared to the existing literature. I list my most substantive concerns below.

      Substantive concerns:

      1) I was a bit frustrated to see that almost no discussion about the neural mechanisms underlying the results is provided. It seems important to better explain the cortical processes involved (e.g. the authors could compare more carefully their results with those obtained in macaque electrophysiology by De Weerd et al. 1995).

      To go further in this direction, one possibility would be to analyse the SNRs at the intermodulation frequencies (I see in supplementary figure 3 that responses at F2-F1 = 5Hz are significantly above noise). This would make it possible to characterize and discuss the interactions between the neural responses corresponding to the processing of the targets and of the surround (see e.g. Appelbaum et al., 2008).
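
      As a rough illustration of the suggested analysis, the following minimal sketch computes an SNR at an intermodulation frequency on synthetic data. It assumes that SNR is defined as the power at the frequency of interest divided by the mean power of neighbouring frequency bins, a common SSVEP convention that is not necessarily the authors' definition. The sampling rate, the tagging frequencies (15 and 20 Hz, chosen only so that F2-F1 = 5 Hz), the noise level and all function names are hypothetical.

```python
import cmath
import math
import random


def power_at(signal, fs, freq):
    """Power of `signal` at `freq` (Hz), from a single-frequency Fourier sum."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return abs(acc / n) ** 2


def snr_db(signal, fs, freq, neighbours=(1, 2, 3)):
    """SNR in dB: power at `freq` over the mean power of nearby bins."""
    resolution = fs / len(signal)  # spacing between DFT bins, in Hz
    noise = [power_at(signal, fs, freq + sign * m * resolution)
             for m in neighbours for sign in (-1, 1)]
    return 10 * math.log10(power_at(signal, fs, freq) / (sum(noise) / len(noise)))


# Hypothetical recording parameters (not taken from the manuscript).
fs, duration = 200, 4.0
f1, f2 = 15.0, 20.0          # assumed tagging frequencies, so f2 - f1 = 5 Hz
rng = random.Random(0)       # fixed seed so the synthetic noise is reproducible

signal = []
for k in range(int(fs * duration)):
    t = k / fs
    a = math.sin(2 * math.pi * f1 * t)
    b = math.sin(2 * math.pi * f2 * t)
    # The product term models a nonlinear interaction between the two inputs;
    # it puts energy at f2 - f1 and f1 + f2 (the intermodulation frequencies).
    signal.append(a + b + 0.5 * a * b + rng.gauss(0.0, 0.2))

snr_5hz = snr_db(signal, fs, f2 - f1)   # intermodulation frequency
snr_8hz = snr_db(signal, fs, 8.0)       # control frequency with no driven response
print(f"SNR at {f2 - f1:.0f} Hz: {snr_5hz:.1f} dB; SNR at 8 Hz: {snr_8hz:.1f} dB")
```

      In this toy signal the intermodulation term exists only because the two tagged inputs are multiplied together, so a high SNR at F2-F1 indicates an interaction between them rather than independent processing of each.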

      2) When I read the whole manuscript, I had the feeling that the analysis of the SNR change latencies (which is currently described in the supplements) deserves to be documented more fully and to appear in the main document. The finding that changes in background SNR precede changes in target SNR is an important result which clarifies the temporal sequence of neural activations. It would also be nice if the authors could determine when the SNR change corresponding to the inter-modulation product (e.g. at F2-F1) appears (see my first point above).

      3) To better characterize the difference between the responses to PFI and to phenomenally matched disappearances (PMD), and to support the claim that target-SNR decreases rather than increases during PMD (l. 170), it would be great to show the target-SNR changes around button press (i.e. the equivalent of figure 2 b & e) for PMD.

      4) The target disappearance during PFI is associated with an increase of SNR and therefore, SSVEPs in this case do not reflect conscious perception. But does it necessarily imply that this target-SNR increase reflects attention instead? The authors base their interpretation on previous studies (Lou, 1999; De Weerd et al., 2006) where attending to a target feature increased PFI probability (which I think is not exactly equivalent to the PFI magnitude reported here) and also on the correlation they found between target-SNR and evoked alpha. However, this is indirect evidence, and in their experimental protocol, attention was not directly manipulated (as e.g. in Morgan et al., 1996 or Müller et al., 2006). I would suggest being a little bit more cautious with this interpretation in the manuscript.

      5) Before this study, other groups looked at the dissociation between attention and perceptual awareness (among others, see e.g. Wyart & Tallon-Baudry, 2008; 2009; Koivisto et al., 2009; Norman et al., 2013). A deeper review of the existing literature on this topic (in the introduction and/or discussion) would make it easier to understand what is already known and would also provide leads for future investigations.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      This manuscript is in revision at eLife.

      Summary:

      This manuscript describes a human EEG study which aims at characterizing the neural correlates of visual disappearance during perceptual filling-in (PFI) using steady-state visual evoked potentials (SSVEP). The authors report that, in this paradigm, target disappearance leads to an increased rather than a decreased SNR of the target SSVEP. The authors interpret this "neural correlate of invisibility" as an empirical challenge for existing theories regarding the relationship between SSVEP and conscious perception. The two reviewers have found the study to be creative and its findings to be of potential importance for the field. However, they have also raised concerns regarding the interpretation of the findings proposed by the authors, which would require additional analyses to be supported by the data and a more extensive account of the existing literature on the relationship between the neural correlates of visual awareness and attention. There are also concerns regarding the number of subjects included in the analyses, which should be clarified. The paragraphs below describe the main concerns that have been discussed among reviewers and the reviewing editor.

    1. Reviewer #3:

      In this manuscript, Robinson et al. identified alternative first exon (AFE) switching events conserved between mouse and human following macrophage inflammation. Using short and long-read sequencing, the authors identified a few unannotated transcription initiation sites (TSS) that are specific to an inflammatory response. Among those, they centered on an unannotated TSS in the Aim2 gene that drives expression of a novel isoform regulated by an iron-responsive element in its 5′UTR.

      While previous work had documented crucial AFE switching events in many other biological contexts, Robinson et al. present here an interesting AFE switching event that can have potential implications for our understanding of the molecular regulation of the innate immune response. I would expect further progress on global mechanisms and biological relevance of these AFE switching events, as well as evidence that the AFE are truly first exons/TSSs.

      Substantive concerns:

      1) Are the AFEs truly first exons/TSS? While both short-read and long-read sequencing detected changes in alternative splicing choices, neither of those is an optimal methodology for analyzing first exons. Therefore, I suggest using a more specialized method to identify (and quantify) the usage of first exons more accurately. Globally, cap analysis of gene expression (CAGE) would be ideal. For validation of specific AFE changes, the qPCR technique has a few issues. First, it does not have nucleotide resolution, so the authors should not refer to TSSs if they used this technique for validation. Second, many downstream first exons are also used as internal exons in other isoforms. There is not a direct technology to analyze specifically first exons/TSSs here. Also, RNA-sequencing technologies, depending on their depth, can definitely miss specific isoforms. Considering the low coverage at the 5' end of genes in RNA-seq analysis, this is particularly important for first exons. A qPCR would only analyze the well-known TSSs. Thus, 5'RACE or a similar technology should be performed to assess the relative usage of AFE specifically.

      2) Global mechanism. The authors assumed that the mechanism of AFE switching is generated by transcription initiation and looked for transcription factors binding and chromatin structure modifications in promoters. However, they did not rule out the possibility that the global switching effect is a post-transcriptional regulation, such as differential mRNA stability. A transcription initiation measurement (e.g., 4SU metabolic labelling) is necessary to demonstrate that the changes in AFE usage are co-transcriptional. In addition, in terms of their ATAC-Seq analysis, the chromatin structure changes in promoters can be the cause or consequence of transcription initiation. Thus, it should not be listed as one mechanism driving the expression of AFE events (line 145). Also, to demonstrate a mechanism based on transcription factor binding more than 2 transcription factors should be considered. In any case, the expression patterns of the transcription factors considered are not clear. As a minor note, the bioinformatic analysis of the two promoter regions driving the isoforms of Aim2 (line 156) is not explained in the method section.

      3) Biological relevance. Could the authors evaluate whether the translation regulation of Aim2 based on its AFE switching is a more generalized phenomenon? Are there any global gene regulation changes triggered by the other genes with significant changes in AFE usage?

    2. Reviewer #2:

      This manuscript by Robinson et al. presents an interesting and timely analysis of a wealth of transcriptome data upon immune stimulation. The unique combination of long-read Oxford Nanopore and short-read Illumina high-throughput sequencing across both human and mouse samples presents an opportunity for many interesting inter-species immune response comparisons, as well as elucidation of full-length transcript information. This paper is well-written and has interesting validation and discussions regarding Aim2. My major concern is that the paper seems to narrow in on the characterization of Aim2 and one class of RNA processing changes (alternative first exons) quite quickly without really delving into the rest of the data and how they arrived there. Below are my major/minor comments and suggestions:

      1) I would have liked the authors to provide more insight into how they honed in on first exon changes specifically, by discussing more of the other RNA processing changes they found. There is cursory mention in the text and figures of other alternative exon or splice site changes. Other studies (including those referenced by the authors) have found hundreds of RNA processing changes genome-wide upon immune stimulation, especially of cassette exons, alternative splice sites, and last exon/3'UTR changes. However, here the authors only find tens of changes (Fig 1B). Are they underpowered to identify changes, and can they do any sort of analyses to show that they are sufficiently powered (# of sequencing reads & junctions, complexity of reads, etc)?

      2) Similarly, I would also be interested in seeing an analysis indicating whether the 50 AFE events that overlap between the long-read and short-read sequencing analyses represent a statistically significant overlap. Particularly, how many overlapping events would be expected given the difference in quantification power between the two methods? How many real AFE differences might the authors be missing because the long-read sequencing methods often do not have the power to identify them (i.e., lower expressed genes in one or the other condition, thus dropout of isoforms and perhaps fewer isoform differences for differentially expressed genes)?

      3) Second, for the non-AFE changes that they did find, there is very little discussion about what those changes might represent. Specifically: (a) how many changes are validated with long-read data; (b) is there any insight into specific domains being included/changed, especially using the long-read data; (c) how many of these non-AFE changes overlap between species; and (d) which types of genes show higher overlap between species and what are their characteristics (binding sites, etc)? To my knowledge, this is the first study that is really designed to properly look at the conservation of splicing or RNA processing changes after immune activation, so I would love to see more analysis and discussion of this aspect genome-wide.

      4) The authors define significant splicing changes as those with a p-value <= 0.25 and |dPSI| >= 10. I'd like some more clarification on whether this is an adjusted p-value (BH, FDR, or some other multiple test-corrected p-value). Especially if this is adjusted, I find it surprising that the authors are choosing such a liberal statistical confidence level and that even with such a liberal threshold, they are only getting tens of significant events. I would like the authors to at least show these same trends across multiple p-value thresholds or with rank threshold analysis (top 5%, top 10%, top 20%) to show biological trends.

      5) The authors introduce their long-read sequencing data by mentioning that they wanted to identify "additional splicing events that are not captured using short-read sequencing." They then go on to only talk about novel first exon events identified with the long-read sequencing data. Did they identify any other non-AFE events using the long-read data that could then be quantified with the short-read data? And second, how do they quantify confidence for novel AFE isoforms, when long-read data seems to have lots of issues with properly sequencing the terminal ends of transcripts (particularly the 5' end when polyA primed, as occurs in ONT DirectRNA sequencing)? They mention the use of ATAC-seq data to show putative promoter support, but mention at one point in their methods that ATAC regions within 10kb of AFEs are considered. This seems like it could be a rather large region to be sure that the ATAC peak is specific to a novel AFE - what is the average distance between AFEs? Finally, I would love to also see the incorporation of CAGE-seq data (or other 5' end data) to validate the specific AFE sites - which I believe the FANTOM consortium has across many human and mouse tissues.
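
      As an illustration of the kind of analysis requested in comments 2 and 4 above, the following is a minimal sketch, not the authors' method. It assumes the overlap is assessed with a one-sided hypergeometric test and that p-values are adjusted with the Benjamini-Hochberg procedure. The function names and all counts in the usage example are hypothetical, and the choice of event "universe" is itself an assumption that would need to be justified.

```python
from math import comb


def hypergeom_overlap_pvalue(universe, set_a, set_b, overlap):
    """One-sided P(X >= overlap) for two event sets drawn from a shared universe.

    universe: number of testable events (an assumption the analyst must define)
    set_a, set_b: number of significant events called by each method
    overlap: number of events called by both methods
    """
    max_k = min(set_a, set_b)
    total = comb(universe, set_b)
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p = hypergeom_overlap_pvalue(universe=5000, set_a=200, set_b=120, overlap=50)
    print(f"overlap p-value: {p:.3g}")
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

      Repeating the overlap test across several significance thresholds or rank cut-offs would show whether the reported trends depend on the liberal p-value cut-off.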

    3. Reviewer #1:

      Our understanding of the transcriptomic impact of innate immune signaling remains incomplete. Here Robinson et al. use both long- and short-read RNA sequencing to gain further insight into LPS-induced changes to mRNA isoform expression in human and mouse macrophages. Their studies report the novel observation that the most common change in isoform expression is alternative use of the first exon. Such changes are indicative of transcriptional regulation, and are thus consistent with the known impact of innate immune signaling on activation of multiple transcription factors. Despite some minor concerns with details of the study, as enumerated below, this is a well-executed and important study that will be of interest and importance to many studying innate immunity, as well as those interested in gene regulation.

      Major comments:

      1) In some ways this is minor, but the authors should be careful to not describe alternative first exon use as alternative splicing. While a novel splice junction is created, mechanistically this is driven by changing transcriptional regulation, and then splicing occurs in the only pattern available to that TSS. In general this is described appropriately in the manuscript, but at a few points there is confusing terminology.

      2) An interesting and somewhat surprising point in the manuscript is that 50% of the AFE events don't show an overall change in gene expression. For Aim2, which does change, the authors show that the AFE change is due to activated use of the unannotated TSS in LPS-stimulated cells. For those genes for which AFE use doesn't correlate with a change in gene expression (e.g. Ncoa7, Rcan1, Ampd3 - Fig S3) is there still transcriptional activation of one TSS and transcriptional silencing of the other? In other words, is there coordinated regulation of the two TSSs to ensure overall message abundance doesn't change, or does activation of one TSS inherently shut off the other (more akin to splice site competition in traditional AS)?

      3) The data suggesting that an IRE regulates translation of the induced 5'UTR is compelling, but more work should be done to confirm it. Most importantly, the experiment in Figure 4J should be repeated with the deltaIRE version of the unannotated UTR. Also, is the IRE regulation controlled upon LPS stimulation, or just by the presence of the IRE element? In other words, what is the distribution of the annotated and unannotated isoforms in the polysome in the absence of LPS (i.e. repeat 4P without LPS)? Can the authors comment on whether the level of iron or the activity of IRP1/2 changes in LPS-stimulated cells?

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript. Timothy W Nilsen (Case Western Reserve University) served as the Reviewing Editor.

      Summary:

      There was significant enthusiasm for the work. However, it seems that considerable effort including additional experiments will be required to firm up the conclusions.

    1. Reviewer #2:

      The authors set out to investigate whether the cerebellum plays a domain-general and predictive role in speech perception. They leveraged the online platform Neurosynth to conduct a meta-analysis of fMRI studies to compare the activation results between speech perception and speech production studies. They find that there are distinct as well as overlapping regions of perception- and production-related activity in the cerebellum, and that each of these regions has a distinct connectivity fingerprint with the cerebral cortex. They mined text data from thousands of studies in Neurosynth to determine which labels best explain these speech-perception and speech-production activity patterns. They find that cerebellar regions activated by speech perception, speech production, and their overlap are also associated with cognitive and motor processes beyond the domain of speech and language. On the basis of these results, they argue for a domain-general view of cerebellar processing.

      One of the most interesting findings in this paper is that speech-perception and speech-production tasks elicit both distinct and overlapping activity patterns in the cerebellum. It has long been known that the cerebellum is activated by speech processing, however, it has been less clear to what extent these two processes (perception and production) differ in their activation patterns. Importantly, the authors also show that these distinct and overlapping networks in the cerebellum display connectivity patterns with corresponding regions of the cerebral cortex. However, there are some major concerns.

      One of the central take-aways from this study is that prediction is a domain-general mechanism that supports speech perception in the cerebellum. The authors argue for domain-generality on the basis that regions activated by speech perception and production in the cerebellum are also activated by a wide range of non-speech tasks. However, I was a bit confused by this argument. It is my understanding that the same region of the cerebellum can be activated by many different tasks, and that each task will demand its own computational description. However, that does not necessarily provide evidence for domain-generality. What could point to domain-generality is a function/computation that explains the diverse set of computations required by the tasks. That speech-related regions of the cerebellum are also activated by a range of non-speech tasks does not (in my opinion) support a domain-general view of cerebellar processing.

      Another take-away from this study is that the cerebellum plays a predictive role in speech processing. Prediction is at the core of many theories of cerebellar function (e.g., internal models, error-based learning); of course, it is a very broad term that is not necessarily unique to the cerebellum. The authors hypothesize that, "if the cerebellum is involved in prediction during natural speech perception, there should be a greater amount of activity throughout the brain when the cerebellum is not active during this task". The authors compare two different sets of speech perception studies, those that report cerebellar activation and those that do not. They then compare the level of activation in cortex versus cerebellum for both of these study types. They find that cortical activation in the "no cerebellum" studies is increased relative to cortical activation in the "cerebellum active" studies. On the basis of these results, they infer that the cerebellum must be involved in prediction and that prediction results in metabolic savings (i.e. decreased activity in cortex). However, why did the speech perception tasks in the "no cerebellum" studies not activate the cerebellum? Did they not involve prediction in some capacity? There are likely other reasons for the increased cortical activation in the "no cerebellum" studies that are unrelated to the absence of cerebellar activation.

      It is also not clear to me why speech perception studies that involved passive sound and music perception were included. How are tones related to speech perception? It would have been helpful if the authors had shown consistency across the different modalities (i.e. speech, sounds, instrumental music, and tones). I'm also assuming that the speech production studies were not matched across these four groups. Couldn't differences in activity patterns arising between the two study types potentially be attributed to sounds, instrumental music, and tones present in the speech perception studies?

    2. Reviewer #1:

      I have very much enjoyed reading this piece of work, investigating the role of the cerebellum in non-motor functions using a meta-analysis and focusing especially on speech perception and predictive processing. I believe that this work is highly relevant to the field and will contribute considerably to the understanding of cerebellar functioning.

      I appreciate the careful description of the methods and the aim to challenge the hypotheses through additional testing. I have only a few major concerns, all of which I believe are addressable:

      1) From page 8, but mainly throughout the whole paper: I am concerned with the inclusion of 22.5% of instrumental music or tone studies. The paper's overall focus is on speech perception and production, and the authors always only refer to "speech" throughout the manuscript. Whereas the inclusion of speech sound perception studies can be easily justified, tone perception is a very different matter if the focus lies on speech, e.g. due to the varying complexity of the input signal.

      Although the authors address this issue in the limitation section, it weakens the overall impact of the findings (as they also state, but downplay). For consistency the authors should exclude tone processing studies from their analysis, as the role of the cerebellum in contributing to processing of time and potential motor sequencing is widely discussed in the literature (see Gordon et al 2018, PLoS One, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6242316/ ). As I very much support the ideas presented in the paper, I believe a clear differentiation between perception of speech and perception of music is crucial for making a convincing argument regarding the role of the cerebellum in "passive" predictive language perception, if that is the focus of the paper. It would be interesting, however, to see whether the regions for perception differ when including music studies compared to speech studies only. A separate analysis of the tone studies might not be feasible for 20 or so studies.

      Generally, the authors should refrain from setting the focus on "speech perception" when the paper clearly focuses on "speech and tone perception" (or, more generally, "non-motor auditory perception"), which is, by the way, not problematic at all, as the findings support a domain-general function of the cerebellum. In that case speech perception should not be mentioned singularly in the title. However, if the authors wish to make a statement on speech perception, then they should exclude the tone perception studies from the analysis.

      2) Relatedly, page 5, last sentence: whereas I do agree with this approach and appreciate the effort to test their own hypothesis, this approach is missing the testing of an alternative hypothesis: could the decrease of general cortical activation be linked to the greater activity of a different region, other than the cerebellum? This should at least be discussed.

      3) Page 16/20: To test their hypothesis the authors compare the cortical activation of studies that report cerebellar activity and those that don't. If the cerebellum had this domain-general function in predictive processing, why would it not be active in some studies? Was there a systematic difference between the two sets of studies, and, as furthermore argued, did those studies that did not activate the cerebellum indeed use speech in novel contexts? A further investigation of the difference between the two sets of studies would be helpful in support of the argumentation.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The importance of the cerebellum in cognition generally, and in speech processing more specifically, is a timely and interesting question, and meta-analysis is a helpful tool. The paper is clearly written. However, in the opinion of at least one reviewer and the Reviewing Editor, neither of the two stated aims of the paper was satisfactorily achieved.

      The stated aims were to demonstrate:

      1) "that the cerebellum plays a domain-general role in speech perception-that is, a role that is not inherently speech specific." However, just showing coactivation with other tasks does not indicate domain-generality, for a variety of reasons. First, this conclusion is not supported because of the computational specificity issue raised by Reviewer 2, and second, coactivation in brain imaging can be an artifact of the spatial resolution of BOLD, and of preprocessing -- it does not necessarily imply coactivation at a neural level.

      2) "that the domain-general role played by the cerebellum and its connections during speech perception is related to prediction." Two lines of evidence are offered for this: (a) the reverse inference that regions identified in the paper are associated in Neurosynth with the term 'prediction'; and (b) that there was more cortical activity when the cerebellum was inactive. Just because accurate prediction should reduce activity does not mean that a reduction in activity signifies prediction.

    1. Reviewer #3:

      Otsuka et al. report the characterisation of three temperature-sensitive alleles of genes which prominently lead to overproliferation of cells in lateral root primordia. Interestingly, this phenotype, which is not underpinned by alteration of the auxin pattern, can be phenocopied by treatment with ROS and by interfering with the mitochondrial respiratory chain. This reveals that ROS modulate cell proliferation in the LR. The cloning and biochemical characterisation of the genes affected reveal that all three encode enzymes involved in mitochondrial RNA processing, and that the mutations perturb the production of certain components of the mitochondrial electron transport chain.

      This is an excellent manuscript that points to a new and very interesting link between primary metabolism and cell proliferation in lateral roots. It is remarkably well written and presented. The conclusions are fully supported by the data. As is the case for exciting new discoveries, it raises a lot of questions. It would be very interesting for future work to uncover the nature of the molecular link between ROS and cell proliferation and why LRs are so sensitive to this. It would also be interesting to speculate whether the reported existence of a hypoxic environment in the centre of the LRP has to do with this.

      The one point on which I would like to hear some comments from the authors relates to the growth conditions used to reveal the phenotype at restrictive temperature. They mention that they use explant culture on RIM (characterised by high glucose and high 2.5 µM IBA). What is the penetrance of the phenotype in standard conditions (1/2 MS, 1% sucrose, no additional auxin/IBA)?

    2. Reviewer #2:

      The manuscript by Otsuka and coworkers describes the mapping of the mutations in rrd1, rrd2 and rid4 causing the temperature-sensitive lateral root morphogenesis defects (fasciated LR meristem). Interestingly, the respective mutations all map to genes involved in mitochondrial mRNA processing, mRNA deadenylation, and mRNA editing. The authors propose that defective ROS homeostasis is causal to excessive cell proliferation in the lateral root primordia, and the associated fasciation phenotype. Overall the manuscript is well-written and convincing with respect to the characterization and mapping of the mutants, and the importance of RNA editing in mitochondria for the mutant phenotypes. I am not yet entirely convinced about the link between ROS production and the lateral root morphogenesis defects.

      1) The fasciated LR phenotype is reminiscent of mutants defective in coordination of LR emergence, such as CASP:shy2 (Vermeer et al), suggesting that defective signaling in the layers overlaying the LR could be causal to the observed phenotype. However, the phenotyping presented in this manuscript does not allow this to be assessed. A detailed staging of LRPs would be required, and/or an analysis of the LRP developmental dynamics using a root bending assay.

      2) Furthermore, the expression domain analysis shows clear expression in LRPs, but I suspect expression of at least RID4-GFP in LRP overlaying layers as well. The resolution of the picture and the interference of the bright PI counterstaining in Fig 2B preclude a thorough assessment of this.

      3) The colocalization analysis in Fig 2D and E is not very clear. The MitoTracker signal is set a bit too weak, making it difficult to assess the distinction between the GFP signal and the overlapping (yellow) signal. This could be amended by using different LUTs (also green/reds are not great for colorblind readers). Of note is the presence of a relatively large structure labeled by RRD1-GFP that does not colocalize with MitoTracker, suggesting it also localizes to another subcellular compartment. Therefore, colocalization should be addressed more quantitatively, also using additional organellar markers. Additionally, the mitochondrial localization could be further supported by western blot on purified mitochondria.

      4) The accumulation of polyadenylated transcripts in Fig 3D seems to also display temperature sensitivity in the WT. Why was this assay not done using quantitative PCR, which would allow for better appreciation of the temperature component?

      5) In contrast to the LR phenotyping as displayed in Fig 1, the LR phenotyping in Fig 4 is done in a completely different way. Why not use a uniform way to quantify? As it was done now, the suppression of rrd1 by the ags1 mutation is not very convincing, as the rrd1 phenotype is nearly abolished in the Col-0 introgressed line (Fig 4B), suggesting that the rrd1 phenotype is sensitized in the Ler background.

      6) While the authors focus on the LR morphology phenotype in the mutants, there is also a prominent effect on primary root growth that is not described. However, this phenotype does not seem to be very ecotype-specific, and is rescued in the ags1 background. A small phenotypic characterization of the primary root phenotype could thus be beneficial for the manuscript, and its wider relevance for development.

      7) Fig 5: the arrowheads in B should be explained in the legend. Bar charts using mean + and - SD should be avoided when you do not have many data points, as in D and F (N=3 and 2). Better to show the raw data. Loading controls are missing for Fig 5C and E.

      8) The section about ROS is all based on ROS-related pharmacology. However, ROS levels in the mutants were not assessed, making it difficult to use the pharmacological treatments to interpret the origin of the mutant phenotypes.

      9) What is the link to the temperature sensitivity? Are these mutants hypersensitive to ROS-inducing treatments?

      10) While the role of ROS in LR development is key to the proposed model, the authors did not introduce the state of the art regarding ROS in lateral and primary root development.

      11) In their model the authors might need to discuss whether or not ROS from the LRP could act as an intercellular coordinative developmental signal.

    3. Reviewer #1:

      This study continues research started by Professor Munetaka Sugiyama and his laboratory, who identified about 20 years ago very interesting temperature-dependent fasciation (TDF) mutants affected in lateral root primordium (LRP) morphogenesis. The authors identified and reported in this study the genes responsible for the mutant phenotype of root redifferentiation defective 1 (rrd1), rrd2, and root initiation defective 4 (rid4). Intriguingly, all the genes are involved in RNA processing. Detailed analysis of the role of RRD2 and RID4 in mitochondrial mRNA editing and RRD1 in poly(A) degradation of mitochondrial mRNA makes this work a solid and substantial study. The fact that pharmacological treatments of wild type seedlings by mitochondrial electron transport inhibitors can phenocopy the fasciated LRP phenotype is really fine. Similarly, the experiments with paraquat and ascorbate are very interesting. The main conclusion of the work (that LRP morphogenesis is linked to mitochondrial RNA processing and mitochondrion-mediated ROS generation) is novel and significant. I think this is an important step forward in our understanding of LRP morphogenesis.

      I see only one main conceptual or interpretation problem.

      The authors conclude that "mitochondrial RNA processing is required for limiting cell division during early lateral root (LR) organogenesis" (line (L) 51). A similar statement appears on L101-103, where the authors postulate that TDF genes encode "negative regulators of proliferation that are important for the size restriction of the central zone during the formation of early stage LR primordia". Again, similar statements appear on L151-152, 344, and in the section of the discussion "Mitochondrial RNA processing is linked to the control of cell proliferation", especially where the authors refer to "the control of cell proliferation at the early stage".

      In my opinion, the above conclusions are arguable and cannot be accepted. To conclude about excessive cell division, the number of anticlinal divisions must be estimated per founder cell (FC). This analysis has not been performed. The fact that at early stages LRPs are wider in the TDF mutants suggests that a greater number of FCs in the longitudinal plane participate in LRP formation. So, if this is correct, the mutations apparently affect control of lateral inhibition, and TDF genes are negative regulators of lateral inhibition. This question should be further investigated, but currently a more careful interpretation of the results is required. Also, if TDF genes encode "negative regulators of proliferation" then more frequent divisions would occur in the mutant. This question was not addressed either. If more frequent cell division is expected in early stage LRPs, this should result in formation of smaller cells. In accordance with Fig. 1D of this study and Figs. 1b and 3a of Otsuka and Sugiyama (2012), this is not the case. On the contrary, it seems that at the same developmental stage there are lower numbers of cells per unit of volume in the mutants compared to wild type. Another possible explanation of the TDF mutant phenotype, in addition to lateral inhibition, is abnormal establishment of stem cell identity or affected stem cell function. Therefore, the mechanistic explanation of the link between TDF gene action and the respective mutant phenotype is not satisfactory. The interpretation given can be corrected and carefully rephrased throughout the text.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary:

      The reviewers were very enthusiastic about your work. They identified some shortcomings, but most of it could be addressed by text edits. The reviewers were less convinced about the envisioned link to reactive oxygen species (ROS). Ideally, you should consolidate this aspect by depicting the mis-regulated ROS in the mutant, and its restoration in the suppressor double mutants (e.g. by staining).

    1. Reviewer #3:

      The manuscript by Hutchings et al. describes several previously uncharacterised molecular interactions in the coats of COP-II vesicles by using reconstituted coats of yeast COP-II. They have improved the resolution of the inner coat to 4.7 A by tomography and subtomogram averaging, revealing detailed interactions, including those made by the so-called L-loop not observed before. Analysis of the outer layer also led to new interesting discoveries. The Sec31 CTD was assigned in the map by comparing the WT and deletion mutant STA-generated density maps. It seems to stabilise the COP-II coats and further evidence from yeast deletion mutants and microsome budding reconstitution experiments suggests that this stabilisation is required in vitro. Furthermore, COP-II rods that cover the membrane tubules in a right-handed manner sometimes revealed an extra rod, which is not part of the canonical lattice, bound to them. The binding mode of these extra rods (which I refer to here as Y-shape) is different from the canonical two-fold symmetric vertex (X-shape). When the same binding mode is utilized on both sides of the extra rod (Y-Y) the rod seems to simply insert in the canonical lattice. However, when the Y-binding mode is utilized on one side of the rod and the X-binding mode on the other side, this leads to bridging different lattices together. This potentially contributes to increased flexibility in the outer coat, which may be required to adopt different membrane curvatures and shapes with different cargos. These observations build a picture where stabilising elements in both COP-II layers contribute to functional cargo transport. The paper makes significant novel findings that are described well. Technically the paper is excellent and the figures nicely support the text. I have minor suggestions that I think would improve the text and figures.

      L 108: "We collected .... tomograms". While the meaning is clear to a specialist, this may sound somewhat odd to a generic reader. Perhaps you could say "We acquired cryo-EM data of COP-II induced tubules as tilt series that were subsequently used to reconstruct 3D tomograms of the tubules."

      L 114: "we developed an unbiased, localisation-based approach". What is the part that was developed here? It seems that the inner layer particle coordinates were simply shifted to get starting points in the outer layer. Developing an approach sounds more substantial than this. Also, it's unclear what is unbiased about this approach. The whole point is that it's biased to certain regions (which is a good thing as it incorporates prior knowledge on the location of the structures).

      L 124: "The outer coat vertex was refined to a resolution of approximately ~12 A, revealing unprecedented detail of the molecular interactions between Sec31 molecules (Supplementary Fig 2A)". The map alone does not reveal molecular interactions; the main understanding comes from fitting of X-ray structures to the low-resolution map. Also "unprecedented detail" itself is somewhat problematic as the map of Noble et al (2013) of the Sec31 vertex is also at a nominal resolution of 12 A. Furthermore, Supplementary Fig 2A does not reveal this "unprecedented detail"; it shows the resolution estimation by FSC. To clarify these points, you could say: "Fitting of the Sec31 atomic model to our reconstruction of the vertex at 12-A resolution (Supplementary Fig 2A) revealed the molecular interactions between different copies of Sec31 in the membrane-assembled coat."

      L 150: Can the authors exclude the possibility that the difference is due to differences in data processing? E.g. how the maps’ amplitudes have been adjusted?

      L 172: "that wrap tubules either in a left- or right-handed manner". Don't they always do both on each tubule? Now this sentence could be interpreted to mean that some tubules have a left-handed coat and some a right-handed coat.

      L276: "The difference map" hasn't been introduced earlier but is referred to here as if it has been.

      L299: Can "Secondary structure predictions" denote a protein region "highly prone to protein binding"?

      L316: It's true that the detail in the map of the inner coat is unprecedented and the model presented in Figure 7 is partially based on that. But here "unprecedented resolution" sounds strange as this sentence refers to a schematic model and not a map.

      L325: "have 'compacted' during evolution" -> remove. It's enough to say it's more compact in humans and less compact in yeast as there could have been different adaptations in different organisms at this interface.

      L327: What exactly is meant by "sequence diversity or variability at this density"?

      L606-607: The description of this custom data processing approach is difficult to follow. Why is in-plane flip needed and how is it used here?

      L627: "Z" here refers to the coordinate system of aligned particles not that of the original tomogram. Perhaps just say "shifted 8 pixels further away from the membrane"

      L642-643: How can the "left-handed" and "right-handed" rods be separated here? These terms refer to the long-range organisation of the rods in the lattice; it's not clear how they were separated in the early alignments.

      Figure 2B. It's difficult to see the difference between dark and light pink colours.

      Figure 3C. These panels report the relative frequency of neighbouring vertices at each position; "intensity" does not seem to be the right measure for this. You could say that the colour bar indicates the "relative frequency of neighbouring vertices at each position" and add detail how the values were scaled between 0 and 1. The same applies to SFigure 1E.
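      To make the request about the colour bar concrete, here is a minimal sketch of one way a neighbour plot could be built and scaled between 0 and 1. It is not the authors' procedure: the function name `neighbour_plot`, the 2D simplification, the binning, and the choice to divide by the most populated bin are all assumptions made for illustration, and particle orientations are ignored.

```python
from collections import Counter


def neighbour_plot(vertices, max_dist, bin_size):
    """Histogram of relative neighbour positions, scaled to the range 0..1.

    vertices : list of (x, y) positions, assumed to share one reference frame
    max_dist : neighbours further than this along x or y are ignored
    bin_size : edge length of a square histogram bin
    Returns a dict mapping (bin_x, bin_y) to a relative frequency.
    """
    counts = Counter()
    for i, (xi, yi) in enumerate(vertices):
        for j, (xj, yj) in enumerate(vertices):
            if i == j:
                continue  # a vertex is not its own neighbour
            dx, dy = xj - xi, yj - yi
            if abs(dx) <= max_dist and abs(dy) <= max_dist:
                counts[(int(dx // bin_size), int(dy // bin_size))] += 1

    peak = max(counts.values(), default=0)
    if peak == 0:
        return {}
    # Assumed scaling: divide every bin by the most populated bin.
    return {bin_index: n / peak for bin_index, n in counts.items()}


# Hypothetical coordinates, for illustration only.
example = neighbour_plot([(0, 0), (10, 0), (20, 0), (10, 10)], max_dist=15, bin_size=5)
```

      With a scaling of this kind the colour bar reads directly as "relative frequency of neighbouring vertices at each position", with 1 marking the most frequently occupied position.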

      Figure 4. The COP-II rods themselves are relatively straight, and they are not left-handed or right-handed. Here, more accurate would be "architecture of COPII rods organised in a left-handed manner". (In the text the authors may of course define and then use this shorter expression if they so wish.) Panel 4B top panel could have the title "left-handed" and the lower panel should have the title "right-handed" (for consistency and clarity).

    2. Reviewer #2:

      The manuscript describes new cryo-EM, biochemistry, and genetic data on the structure and function of the COPII coat. Several new discoveries are reported including the discovery of an extra density near the dimerization region of Sec13/31, and "extra rods" of Sec13/31 that also bind near the dimerization region. Additionally, they showed new interactions between the Sec31 C-terminal unstructured region and Sec23 that appear to bridge multiple Sec23 molecules. Finally, they increased the resolution of the Sec23/24 region of their structure compared to their previous studies and were able to resolve a previously unresolved L-loop in Sec23 that makes contact with Sar1. Most of their structural observations were nicely backed up with biochemical and genetic experiments which give confidence in their structural observations. Overall the paper is well-written and the conclusions justified. However, this is the third iteration of structure determination of the COPII coat on membrane with essentially the same preparation and methods. Each time, there has been an incremental increase in resolution and new discoveries, but the impact of the present study is deemed to be modest. The science is good and appropriate for a specialized journal. Areas of specific concern are described below.

      1) The abstract should be re-written with a better description of the work.

      2) Line 166 - "Surprisingly, this mutant was capable of tubulating GUVs". This experiment gets to one of the fundamental unknown questions in COPII vesiculation. It is not clear what components are driving the membrane remodeling and at what stages during vesicle formation. Isn't it possible that the tubulation activity the authors observe in vitro is not being driven at all by Sec13/31 but rather Sec23/24-Sar1? Their Sec31ΔCTD data supports this idea because it lacks a clear ordered outer coat despite making tubules. An interesting experiment would be to see if tubules form in the absence of all of Sec13/31 except the disordered domain of Sec31 that the authors suggest crosslinks adjacent Sec23/24s.

      3) Line 191 - "Inspecting cryo-tomograms of these tubules revealed no lozenge pattern for the outer coat" - this phrasing is vague. The reviewer thinks that what they mean is that there is a lack of order for the Sec13/31 layer. Please clarify.

      4) Line 198 - "unambiguously confirming this density corresponds to the CTD." This only confirms that it is the CTD if that were the only change and the Sec13/31 lattice still formed. Another possibility is that it is density from other Sec13/31 that only appears when the lattice is formed such as the "extra rods". One possibility is that the density is from the extra rods. The reviewer agrees that their interpretation is indeed the most likely, but it is not unambiguous. The authors should consider cross-linking mass spectrometry.

      5) In the Sec31ΔCTD section, the authors should comment on why ΔCTD is so deleterious to oligomer organization in yeast when cages form so abundantly in preparations of human Sec13/31 ΔC (Paraan et al 2018).

      6) The data is good for the existence of the "extra rods", but their significance and importance are not clear. How can these extra densities be distinguished from packing artifacts due to imperfections in the helical symmetry?

      7) Figure 5 is very hard to interpret and should be redone. Panels B and C are particularly hard to interpret.

      8) The features present in Sec23/24 structure do not reflect the reported resolution of 4.7 Å. It seems that the resolution is overestimated.

      9) Lines 315/316 - "We have combined cryo-tomography with biochemical and genetic assays to obtain a complete picture of the assembled COPII coat at unprecedented resolution (Fig. 7)." Figure 7 is a schematic model/picture; the authors should reference a different figure or rephrase the sentence.

    3. Reviewer #1:

      Hutchings et al. report an updated cryo-electron tomography study of the yeast COP-II coat assembled around model membranes. The improved overall resolution and additional compositional states enabled the authors to identify new domains and interfaces, including what the authors hypothesize is a previously overlooked structural role for the SEC31 C-Terminal Domain (CTD). By perturbing a subset of these new features with mutants, the authors uncover some functional consequences pertaining to the flexibility or stability of COP-II assemblies.

      Overall, the structural and functional work appears reliable, but certain questions and comments should be addressed. This study provides a valuable refinement of our understanding of COP-II that I believe is well suited to a specialized, structure-focused journal.

      Major Comments: 1) The authors belabor the comparison between the yeast reconstruction of the outer coat vertex with prior work on the human outer coat vertex. Considering the modest resolution of both the yeast and human reconstructions, the transformative changes in cryo-EM camera technology since the publication of the human complex, and the differences in sample preparation (inclusion of the membrane, cylindrical versus spherical assemblies, presence of inner coat components), I did not find this comparison informative. The speculations about a changing interface over evolutionary time are unwarranted and would require a detailed comparison of co-evolutionary changes at this interface. The simpler explanation is that this is a flexible vertex, observed at low resolution in both studies, plus the samples are very different.

      2) As one of the major take home messages of the paper, the presentation and discussion of the modeling and assignment of the SEC31-CTD could be clarified. First, it isn't clear from the figures or the movies if the connectivity makes sense. Where is the C-terminal end of the alpha-solenoid compared to this new domain? Can the authors plausibly account for the connectivity in terms of primary sequence? Please also include a side-by-side comparison of the SRA1 structure and the CTD homology model, along with some explanation of the quality of the model as measured by Modeller. Finally, even if the new density is the CTD, it isn't clear from the structure how this sub-stoichiometric and apparently flexible interaction enhances stability. Hence, when the authors wrote "when the [CTD] truncated form was the sole copy of Sec31 in yeast, cells were not viable, indicating that the novel interaction we detect is essential for COPII coat function." Maybe, but could this statement be a leap too far? Is the putative interaction essential, or is the CTD itself essential for reasons that remain to be fully determined?

      3) Are the extra rods discussed in Fig. 4 a curiosity of unclear functional significance? This reviewer is concerned that these extra rods could be an in vitro stoichiometry problem, rather than a functional property of COP-II.

      4) The clashscore for the PDB is quite high, and I am dubious about the reliability of refining sidechain positions with maps at this resolution. In addition to the Ramachandran stats, I would like to see the Ramachandran plot as well as, for any residue-level claims, the density surrounding the modeled side chain (e.g. S742).

      Minor Comments:

      1) The authors wrote "To assess the relative positioning of the two coat layers, we analysed the localisation of inner coat subunits with respect to each outer coat vertex: for each aligned vertex particle, we superimposed the positions of all inner coat particles at close range, obtaining the average distribution of neighbouring inner coat subunits. From this 'neighbour plot' we did not detect any pattern, indicating random relative positions. This is consistent with a flexible linkage between the two layers that allows adaptation of the two lattices to different curvatures (Supplementary Fig 1E)." I do not understand this claim, since the pattern looks far from random and the interactions depend on molecular interactions that are not random. Please clarify.

      2) Related to major point #1, the authors wrote "We manually picked vertices and performed carefully controlled alignments." I do not know what it means to carefully control alignments, and fear this suggests human model bias.

      3) Why do some experiments use EDTA? I may be confused, but I was surprised to see the budding reaction employed 1mM GMPPNP, and 2.5mM EDTA (but no Magnesium?). Also, for the budding reaction, please replace or expand upon the "the 10% GUV (v/v)" with a mass or molar lipid-to-protein ratio.

      4) Please cite the AnchorMap procedure.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Response to Reviewers and Revision Plan

      We thank all three reviewers for their time and their comments on our manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Here Ryan et al. have used localization analysis following induced rapid relocalization of endogenous proteins to investigate the composition and recruitment hierarchy of a clathrin-TACC3-based spindle complex that is important for microtubule organization and stability.

      The authors generate different HeLa cell lines, each with one of four complex members (TACC3, CLTA, chTOG and GTSE1) endogenously tagged with FKBP-GFP via Cas9-mediated editing. This tag allows rapid recruitment to the mitochondria upon rapamycin addition ("knocksideways"). They ultimately quantify each of the 4 components' localization to the spindle following knocksideways of each component using fluorescently-tagged transfected constructs. The authors' interpretation of the results of this analysis is summarized in the last model figure, in which a core MT-binding complex of clathrin and TACC3 recruits the ancillary components GTSE1 and chTOG. In addition, the authors investigate the contribution of individual clathrin-binding LIDL motifs in GTSE1 to the recruitment of clathrin and GTSE1 to spindles. Their findings here largely agree with and confirm a recent report regarding the contribution of these motifs to GTSE1 recruitment to the spindle. They further analyzed GTSE1 fragments for interphase and mitotic microtubule localization, and identified a second region of GTSE1 required (but not sufficient) for spindle localization. Finally, the authors report that PIK3C2A is not part of this complex, contradicting (correcting) a previously published study.

      **Major comments:**

      1.The chTOG-FKBP-GFP cell line the authors generate has only a small fraction of chTOG tagged, and thus should not be used for any conclusions about protein localization dependency on chTOG. Because they were unable to construct a HeLa cell line with all copies tagged, the authors expect that the homozygous knock-in of chTOG-FKBP-GFP is lethal, and thus their experience is appropriate to report. However, the authors should not use this cell line alone to make statements about chTOG dependency. They would have to use similar localization analysis, but after another method to disrupt chTOG (as a second-best approach), such as RNAi. In fact, they have reported this in a previous publication (Booth et al 2011). However, the result was different. There, loss of chTOG resulted in reduced clathrin on spindles, suggesting it may stabilize or help recruit the complex. Alternatively, they could remove their chTOG data, but this would compromise the "comprehensive" nature of the work.

      The referee is correct. The point here is to show the results we had using this approach for all four proteins under study. For this reason, we do not want to remove this data and prefer to show our results “warts-and-all”. We feel that the shortcomings of our approach are honestly presented and discussed in the manuscript. While only a fraction of chTOG was tagged, we should expect some co-removal after its induced mislocalization. Since we saw no change, we concluded that chTOG is auxiliary.

      The “second best” approach suggested (RNAi of chTOG) is problematic for two reasons. First, chTOG RNAi results in gross changes to spindle structure (multipolar spindles) and it is difficult to pick apart differences in protein partner localization that result from loss of chTOG from those resulting from changes in spindle structure. Second, the paper is about induced mislocalization as a method for determining protein complexes once a normal spindle has formed. So, removing chTOG prior to mitosis is not comparable. If we get the same or different result, does it confirm or conflict with the data we have? Nonetheless, given the discrepancy with our earlier work, we should investigate this further.

      To address this concern, we will stain endogenous clathrin, TACC3 and GTSE1 following chTOG RNAi and measure their relative levels at the spindle.

      Making the chTOG-FKBP-GFP cell line was difficult. As described in the paper, we only recovered heterozygous clones despite repeated attempts. Since submission, we have been made aware of a HCT116 chTOG-FKBP-GFP cell line that is reported to be homozygously tagged (Cherry et al. 2019 doi: 10.1002/glia.23628).

      A note about this cell line has been added to the paper (Results section, final sentence of 1st paragraph).

      2. The authors initially analyze complex member localization after knocksideways experiments by antibody staining, which has the advantage of analyzing endogenous proteins (versus the later transfected fluorescent constructs). Setting aside potential artefacts from fixation, this would seem to be a better method for controlled analysis to take advantage of their setup (short of generating stable cell lines with second proteins endogenously tagged in a second color - a huge undertaking). The authors conclude that antibody specificity problems confounded their analysis and explained unusual results. However, I think it is worth investing a little more effort to sort this out, rather than bringing doubt to the whole data set. Verifying and then using another antibody for chTOG localization would be informative. Of course, the negative control should not be their chTOG-FKBP-GFP line, as it does not relocalize most of chTOG.

      In the case of GTSE1, an alternative explanation to antibody specificity issues would be that the GTSE1-FKBP-GFP cell line is not in fact homozygously tagged. Given the low expression levels on the western provided, and the detection of GTSE1 on the spindle in the induced GTSE1-FKBP-GFP cell line (but not TACC3-FKBP-GFP), it seems plausible that an untagged copy remains. If there are multiple copies of GTSE1 in Hela cells, one untagged copy could represent a small fraction of total GTSE1. This should thus be ruled out. GTSE1 clones should be analyzed with more protein extracts loaded - dilutions of the extracts can determine the sensitivity of the blot to lower protein levels. In addition, sequencing of genomic DNA can reveal a small percentage with different reads.

      We used a two-pronged approach for assessing relocalization of protein partners (staining vs transfected constructs). The staining approach is superior since endogenous proteins are examined, but it is limited by antibody specificity. The transfection approach overcomes this limitation but is in turn limited by effects of overexpression and tagging. Together the two approaches allow us, and anyone employing this method, to get a picture of protein complexes. We didn’t want to create the impression that one or other approach is confounded, but the referee is correct that this analysis would benefit from further work.

      Specifically, to address these concerns:

      • We will verify and use alternative chTOG antibodies to try to improve this dataset.
      • We will test the possibility that an untagged allele of GTSE1 remains. We will use western blotting and a summary of our genomic analysis will be added to the paper.

      3. There is a lot of data contained in the small graphs summarizing quantification of localization in Figs 3 and 4. They would be more accessible to the reader if they were larger and/or an "example" of the chart with labels was present explaining it (essentially what is in the figure legends). Furthermore, there is no statistical test applied to this data that I see. This is needed. How do authors determine whether there is an "effect"?

      Our aim was to compress a lot of information into a small space, while still showing some example primary data. All reviewers raised the same concern which tells us that we went too far towards “data visualization”.

      To address this point, we will rework these figures.
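      As an illustration of the kind of statistical test Reviewer #1 asks for, the sketch below runs an exact two-sided permutation test on the difference in mean spindle recruitment between untreated and rapamycin-treated cells. It is only one possible choice of test and is not taken from the manuscript; the function name `permutation_p_value` and all numbers are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(control, treated):
    """Exact two-sided permutation test on the difference of group means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes, so it is only practical for small samples.
    """
    pooled = list(control) + list(treated)
    n_control = len(control)
    observed = abs(mean(control) - mean(treated))

    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_control):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical spindle recruitment ratios, for illustration only.
untreated = [1.0, 1.1, 1.2, 1.3]
rapamycin = [0.4, 0.5, 0.6, 0.7]
p_value = permutation_p_value(untreated, rapamycin)
```

      A permutation test makes no assumption about the distribution of the measurements, which may suit ratio data from a modest number of cells; for larger samples a random subset of permutations would be drawn instead of the full enumeration.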

      **Minor issues:**

      1. The GTSE1 constructs used for mutation and localization analysis are 720 amino acids long. A recent study analyzing similar mutations uses a 739 amino acid construct (Rondelet et al 2020). The latter is the predominant transcript in NCBI and Ensembl databases. It appears the construct used by the authors omits the first 19 a.a. I do not think using the truncated transcript affects conclusions of the manuscript, but it could generate confusion when identifying residues based on the a.a. numbers of mutant constructs (Fig 6). This should be clarified in some way.

      We were aware of the longer transcript but were using the 720 residue form since it is the canonical sequence in Uniprot (https://www.uniprot.org/uniprot/Q9NYZ3). We did not know that the 739 form is the predominant transcript. We agree this is unlikely to affect our work but that the numbering may cause confusion.

      We have added a note to the Methods (Molecular Biology section) to accurately describe what we and Rondelet et al. have used.
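      The numbering difference can be stated as a simple conversion. The sketch below assumes, as Reviewer #1 suggests, that the 720-residue form is the 739-residue form lacking its first 19 residues and that the two are otherwise identical; the function names are hypothetical and the assumption should be checked against the actual sequences.

```python
# Assumed offset: the 739-residue form carries 19 extra N-terminal residues.
OFFSET = 739 - 720


def to_739_numbering(pos_720):
    """Convert a residue number in the 720-residue form to the 739-residue form."""
    if not 1 <= pos_720 <= 720:
        raise ValueError("position outside the 720-residue construct")
    return pos_720 + OFFSET


def to_720_numbering(pos_739):
    """Convert a residue number in the 739-residue form to the 720-residue form.

    Returns None for the first 19 residues, which are assumed to be absent
    from the shorter construct.
    """
    if not 1 <= pos_739 <= 739:
        raise ValueError("position outside the 739-residue construct")
    if pos_739 <= OFFSET:
        return None
    return pos_739 - OFFSET
```

      Under this assumption a residue numbered n in our constructs corresponds to residue n + 19 in the numbering used by Rondelet et al.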

      2. The labeling of constructs in Fig 6C/D is confusing, and appears shifted by eye at places. Please relabel this more clearly.

      Apologies for the error.

      We have relabeled Figure 6C,D and also made a similar alteration to Figure 5C.

      The recommended new experimental data (analysis of complex member levels on spindles after full perturbation of spindle chTOG; new chTOG antibody stainings in the FKBP lines; reanalysis of GTSE1 DNA/protein in the GTSE1-FKBP line) should only require a new antibody/siRNA, plus a few weeks' time to repeat the analyses already in the paper with new reagents.

      Reviewer #1 (Significance (Required)):

      While multiple individual components of this complex have been previously characterized, the structure and nature of the complex formation and its recruitment to microtubules/spindles remains a complex problem that has yet to be solved.

      Overall this study represents a comprehensive localization-dependency analysis of the Clathrin-TACC3 based spindle complex using a consistent methodology. Although several of the conclusions of the findings echo previous reports, some of the previous literature is contradictory within itself as well as with the conclusions here. Analyzing all components with a single, rapid-perturbation technique thus has great value to present a clear data set, given that the experimental setup conditions and analysis are solid (a goal to which the majority of comments refer).

      Beyond the complex localization/recruitment analysis, two novel findings of this study that emerge are:

      a) GTSE1 contains a second, separate protein region, distinct from the clathrin-binding motifs, that is required for its localization to the spindle and is most likely a microtubule-interaction site. This suggests that GTSE1 recruitment to the spindle is more complex than previously reported.

      b) PIK3C2A, which has been reported previously to be a stabilizing member of this complex, is in fact not a member, does not localize to spindles, and does not display a mitotic defect after loss. This is an important conclusion to make, as it would correct the literature and avoid future confusion.

      --

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this paper, the authors investigate the nature of interactions between members of the TACC3-chTOG-clathrin-GTSE1 complex on the mitotic spindle. By using a series of HeLa cell lines that they have created by CRISPR/Cas9 editing to enable spatial manipulation (knocksideways) of TACC3, chTOG, clathrin or GTSE1, they show that on spindle microtubules TACC3 and clathrin represent core complex members whereas chTOG and GTSE1 bind to them respectively but not to each other. Additionally, the authors find that the protein PIK3C2A, which has been implicated in this complex previously, is in fact not a component of this complex in mitotic cells. The main advance of the paper in my opinion is the endogenous tagging of the proteins for knocksideways experiments, since former experiments depended on RNAi silencing and expression of tagged proteins from plasmids, which introduced issues of protein silencing efficiency and plasmid overexpression problems. This approach seems to alleviate these problems, except in the case of chTOG, for which homozygous tagging seems to be lethal.

      **Major comments:**

      I find the key conclusions regarding the localization of the components of the complex convincing. There are some issues regarding the specificity of antibodies in immunostaining experiments (Fig. 3) and the influence of mCherry-TACC3 expression on distorted localization of the complex prior to knocksideways. However, I think the general conclusion about which complex components (clathrin and TACC3) influence the localization of the other proteins in the complex (chTOG and GTSE1) stands. One thing that I miss from the paper is the data on the consequences for spindle shape and morphology after knocksideways. I have noticed on images in both Figure 3 and Figure 4 that in some cases distribution of the signal seems to influence the spindle morphology quite a bit. Also, in Figure 3 I have noticed what seems to me quite a big variation in spindle size in the tubulin signal in both untreated and rapamycin-treated cells. Since the authors have many of these images already, I believe it would be realistic, not costly and of additional value for the paper to provide more data on the consequences of the knocksideways experiments. Change of spindle size, tubulin intensity and DNA/kinetochore misalignment upon knocksideways would be helpful to appreciate more the findings of the paper. More so since the authors on more than one occasion find their motivation in the field of cancer research and spindle stability relation to it. Some data connection to this motivation would be of value. Experiments seem reproducible.

      The focus of the paper is on using the knocksideways methodology to understand a protein complex during mitosis, rather than looking at its function. We are not keen to do new experiments that are not part of the central message of the paper. However, the Reviewer is correct that we do already have a dataset that can be mined in the manner described.

      To address this point, we will analyze spindle size parameters and also the intensity of tubulin. Our analysis will be limited to the short timeframe of our experiments, but it should reveal or refute any changes in spindle structure that may result from loss of complex members.

      **Minor comments:**

      I have some problems with the clarity of Figures 3 and 4. In Figure 3, the plots on the right are a bit small and not easy to read. Some reorganization of the figure might be beneficial. In Figure 4, the plots on the right are also too small to be clear. Also, I miss the number of cells (n); I can't see the number of individual arrows because of the size of the graphs.

      Our aim was to compress a lot of information into a small space, while still showing some example primary data. All reviewers raised the same concern which tells us that we went too far towards “data visualization”.

      To address this point, we will rework these figures.

      Reviewer #2 (Significance (Required)):

      I find that the biggest significance of the paper is in the creation of new tools (cell lines) to study the localization of proteins TACC3, chTOG, clathrin and GTSE1. Cell lines where endogenous proteins can be delocalized rapidly will be of value for scientists working not only in mitosis but also, in the case of clathrin, in vesicle formation and trafficking or, in the case of GTSE1, in p53-dependent apoptosis. In the field of mitosis it will surely help and speed up the research concerning the role of these proteins in spindle assembly and stability.

      Field of expertise: mitotic spindle

      --

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      This paper analyses the chTog/TACC3/clathrin/GTSE1 complex that crosslinks and stabilises microtubule bundles in the mitotic spindle. The authors have developed an elegant knocksideways approach to specifically analyse the effects of removing individual components of the complex from the spindle and study the effect this has on the other interactors. They report, based on these assays, that the core of the complex is formed by TACC3 and clathrin while GTSE1 and chTog are auxiliary interactors. They also refute previous evidence that this complex also incorporates PIK3C2A. Overall, this is an interesting study that distinguishes itself predominantly by its methodology. However, some of the reported results need more thorough analysis to allow convincing conclusions.

      **Major comments:**

      1) The knocksideways method is the main highlight of this paper. Unlike previous studies by the PI, this time endogenous genes are tagged, which is a key advance and allows much better interpretation of the results. I am not sure why the authors have chosen HeLa cells as their model here, given the messed up genome of these cells. A non-transformed cell line would have been preferable, but as a proof of principle study, I think HeLa are acceptable, and I wouldn't expect the authors to repeat all the experiments in another system.

      Figure 1,2 and S1 are describing and validating this approach in some detail, but this will require some more work.

      The authors state that gene targeting was validated using a combination of PCR, sequencing, Western blotting, but show only the results for westerns. PCR analysis that demonstrates homozygous or heterozygous gene targeting should be shown here.

      Another issue is the penetrance of the phenotypes induced by Rapamycin. The authors show nice data of the system working in individual cells but do not give us an idea if this happens in all cells. The localisation of the individual tagged genes should be quantified (ideally with line plots) in 50 randomly chosen mitotic cells with 3 repeats before and after rapamycin treatment. Moreover, the analysis of mitotic duration (Figure S1D) should be extended to include a plus Rapamycin cohort and this should be moved in the main Figure.

      If the system works only in a small proportion of cells, this should be clearly stated. I don't think this would prevent publication, but it is an important piece of information that is missing.

      The Reviewer raises two issues here.

      • PCR analysis should be shown. This issue was also partly raised by Reviewer 1. A summary of our PCR analysis was actually included in Table 1, since the analysis we did is pretty unwieldy. We agree though that presenting our evidence for homozygosity of the cell lines would be useful. To address this point, we will add more detail of the PCR and sequencing work done to validate these cell lines.
      • Does knocksideways happen in all cells? The answer to this depends on the transient expression of MitoTrap and sufficient application of rapamycin. We agree that this will be a useful piece of information to add to the manuscript. A related issue is whether knocksideways of complex members affects mitotic progression. We have established through other experiments that rapamycin application to wild-type cells alters mitotic progression, although application of rapalog does not have this effect. Our plan to address these points is 1) to analyze the efficacy of knocksideways that readers can expect to achieve using these, or similar, cells, and 2) to analyze mitotic duration in rapalog-treated cells expressing a rapalog-sensitive MitoTrap.

        2) Apart from a simple quantification of mitotic duration, I believe a more detailed mitotic phenotype analysis for each knocksideways gene, especially the homozygous targeted clones, should be included. This can involve more high-resolution live cell imaging of mitotic progression with SiR-DNA and GFP-tubulin, using the dark MitoTrap.

      We don’t agree that such an analysis should be included. The focus of this paper is on using the knocksideways methodology to understand a protein complex during mitosis, and not looking at its function. There are several papers on the mitotic phenotypes of these genes probed using RNAi in different cellular systems (examples for chTOG: 10.1101/gad.245603; TACC3/clathrin: 10.1038/emboj.2011.15, 10.1242/jcs.075911, 10.1083/jcb.200911091, 10.1083/jcb.200911120; GTSE1: 10.1083/jcb.201606081). Moreover, our 2013 paper used knocksideways (with RNAi and overexpression) and has a detailed analysis of mitotic progression, microtubule stability, checkpoint activity and kinetochore motions (Cheeseman et al., 2013 doi: 10.1242/jcs.124834).

      New experiments that are not part of the central message of the paper and are unlikely to give new insight are not the best use of our revision efforts for this paper (especially during the pandemic). Having said this, Reviewer 2’s suggestion to use our existing dataset to investigate mitotic phenotypes will largely answer Reviewer 3’s request.

      We will analyze spindle size parameters and also the intensity of tubulin. Our analysis will be limited to the short timeframe of our experiments, but it should reveal or refute any changes in spindle structure that result from the loss of complex members.

      3) Overall, the quantitative analysis in Figures 3, 4 and 7 is not good enough and sometimes doesn't fully support the conclusions. In Figures 3 and 4, a convoluted way of demonstrating the change in localisation is shown, and this panel is so small that it is almost impossible to read. Also, there is no statistical analysis, and the sample size seems very small. At least 25 cells should be analysed here in 3 repeats. I would suggest unifying the quantification in the MS, using the line plots shown in Figures 5 and 6 to compare each protein before and after rapamycin addition. This is much easier to read and more convincing. The images of the cell panels can be moved to a supplement as they contain very little information. This would generate space to expand the size and depth of the quantitative analysis. Instead of ANOVA tests, I would recommend using a simple t-test comparing each condition to its relevant control since this is the only relevant comparison in the experiment. Statistical significance should be calculated for each experiment with sufficient sample size. It would also be better to show the individual data points from the three repeats in different colours so that the reproducibility between repeats can be judged.

      This type of statistical analysis should be uniformly done throughout the MS and also extended to Figure 7.

      The referee raises several issues here with our data presentation and statistical analysis.

      • Our aim in Figures 3 and 4 was to compress a lot of information into a small space, while still showing some example primary data. All reviewers raised the same concern about these figures which tells us that we went too far towards “data visualization”. To address this point, we will rework Figures 3 and 4 to provide more clear data presentation.
      • The Reviewer’s comments about statistical analysis, however, are not sound. First, it is incorrect to state that simple t-tests can be applied (this is a form of p-hacking). Correction for multiple testing must be done on these datasets. Second, the reviewer arbitrarily states numbers for cells and experimental repeats without considering the effect size or, it seems, understanding the structure of the data that we have collected. Sample sizes are small but they are taken from many independent replicates. Third, and related to the previous point, the fixed and live cell data are structured differently, which means that a uniform data presentation is not possible. The live data has a paired design and each cell is an independent replicate (with replicates done over several trials). The fixed data is unpaired and we have taken measures from several experiments (independent replicates). The point about applying statistical tests to the data is also made by Reviewer 1 and we will use appropriate tests (NHST or estimation statistics) as we re-work the figures.
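
      As an illustration of the multiple-testing point only, a minimal sketch of one common correction (Holm-Bonferroni) is given here. The function name and the p-values are hypothetical placeholders, not values from the manuscript, and this is not necessarily the procedure that will be used in the revised figures.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values, in the original order.

    Illustrative helper (name is an assumption): the i-th smallest raw
    p-value is multiplied by (m - i + 1), capped at 1, and the adjusted
    values are forced to be non-decreasing along the sorted order.
    """
    m = len(p_values)
    # Indices of the raw p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three condition-versus-control comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
```

      Under this correction, a comparison that looks significant in isolation (for example a raw p-value of 0.04) need not remain below 0.05 once the number of comparisons is taken into account.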

        Reviewer #3 (Significance (Required)):

      In my opinion, the most interesting aspect of the MS is the methodology. Based on this, publication is justified and will be of interest to a wider audience. That is why a more detailed analysis of the penetrance of this manipulation across the cell population will be critical.

      The application of this method to analyse the composition of the TACC3/Clathrin complex on the spindle is the main biological advance, and the novel information is rather limited but not unimportant.

      Overall, if these results can be properly quantified I would recommend publication.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This paper analyses the chTOG/TACC3/clathrin/GTSE1 complex that crosslinks and stabilises microtubule bundles in the mitotic spindle. The authors have developed an elegant knocksideways approach to specifically analyse the effects of removing individual components of the complex from the spindle and study the effect this has on the other interactors. They report, based on these assays, that the core of the complex is formed by TACC3 and clathrin while GTSE1 and chTOG are auxiliary interactors. They also refute previous evidence that this complex also incorporates PIK3C2A. Overall, this is an interesting study that distinguishes itself predominantly by its methodology. However, some of the reported results need more thorough analysis to allow convincing conclusions.

      Major comments:

      1) The knocksideways method is the main highlight of this paper. Unlike previous studies by the PI, this time endogenous genes are tagged, which is a key advance and allows much better interpretation of the results. I am not sure why the authors have chosen HeLa cells as their model here, given the messed up genome of these cells. A non-transformed cell line would have been preferable, but as a proof of principle study, I think HeLa are acceptable, and I wouldn't expect the authors to repeat all the experiments in another system. Figures 1, 2 and S1 describe and validate this approach in some detail, but this will require some more work. The authors state that gene targeting was validated using a combination of PCR, sequencing and Western blotting, but show only the results for Westerns. PCR analysis that demonstrates homozygous or heterozygous gene targeting should be shown here. Another issue is the penetrance of the phenotypes induced by rapamycin. The authors show nice data of the system working in individual cells but do not give us an idea if this happens in all cells. The localisation of the individual tagged genes should be quantified (ideally with line plots) in 50 randomly chosen mitotic cells with 3 repeats before and after rapamycin treatment. Moreover, the analysis of mitotic duration (Figure S1D) should be extended to include a plus-rapamycin cohort, and this should be moved into the main figure. If the system works only in a small proportion of cells, this should be clearly stated. I don't think this would prevent publication, but it is an important piece of information that is missing.

      2) Apart from a simple quantification of mitotic duration, I believe a more detailed mitotic phenotype analysis for each knocksideways gene, especially the homozygous targeted clones, should be included. This can involve more high-resolution live cell imaging of mitotic progression with SiR-DNA and GFP-tubulin, using the dark MitoTrap.

      3) Overall, the quantitative analysis in Figures 3, 4 and 7 is not good enough and sometimes doesn't fully support the conclusions. In Figures 3 and 4, a convoluted way of demonstrating the change in localisation is shown, and this panel is so small that it is almost impossible to read. Also, there is no statistical analysis, and the sample size seems very small. At least 25 cells should be analysed here in 3 repeats. I would suggest unifying the quantification in the MS, using the line plots shown in Figures 5 and 6 to compare each protein before and after rapamycin addition. This is much easier to read and more convincing. The images of the cell panels can be moved to a supplement as they contain very little information. This would generate space to expand the size and depth of the quantitative analysis. Instead of ANOVA tests, I would recommend using a simple t-test comparing each condition to its relevant control since this is the only relevant comparison in the experiment. Statistical significance should be calculated for each experiment with sufficient sample size. It would also be better to show the individual data points from the three repeats in different colours so that the reproducibility between repeats can be judged. This type of statistical analysis should be uniformly done throughout the MS and also extended to Figure 7.

      Significance

      In my opinion, the most interesting aspect of the MS is the methodology. Based on this, publication is justified and will be of interest to a wider audience. That is why a more detailed analysis of the penetrance of this manipulation across the cell population will be critical. The application of this method to analyse the composition of the TACC3/Clathrin complex on the spindle is the main biological advance, and the novel information is rather limited but not unimportant. Overall, if these results can be properly quantified I would recommend publication.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this paper, the authors investigate the nature of interactions between members of the TACC3-chTOG-clathrin-GTSE1 complex on the mitotic spindle. By using a series of HeLa cell lines that they have created by CRISPR/Cas9 editing to enable spatial manipulation (knocksideways) of TACC3, chTOG, clathrin or GTSE1, they show that on spindle microtubules TACC3 and clathrin represent core complex members whereas chTOG and GTSE1 bind to them respectively but not to each other. Additionally, the authors find that the protein PIK3C2A, which has been implicated in this complex previously, is in fact not a component of this complex in mitotic cells. The main advance of the paper, in my opinion, is the endogenous tagging of the proteins for knocksideways experiments, since former experiments depended on RNAi silencing and expression of tagged proteins from plasmids, which introduced issues of protein silencing efficiency and plasmid overexpression problems. This approach seems to alleviate these problems, except in the case of chTOG, for which homozygous tagging seems to be lethal.

      Major comments:

      I find the key conclusions regarding the localization of the components of the complex convincing. There are some issues regarding the specificity of antibodies in immunostaining experiments (Fig. 3) and the influence of mCherry-TACC3 expression on distorted localization of the complex prior to knocksideways. However, I think the general conclusion about which complex components (clathrin and TACC3) influence the localization of the other proteins in the complex (chTOG and GTSE1) stands. One thing that I miss from the paper is data on the consequences for spindle shape and morphology after knocksideways. I have noticed in the images in both Figure 3 and Figure 4 that in some cases the distribution of the signal seems to influence the spindle morphology quite a bit. Also, in Figure 3 I have noticed what seems to me quite a big variation in spindle size in the tubulin signal in both untreated and rapamycin-treated cells. Since the authors have many of these images already, I believe it would be realistic, not costly and of additional value for the paper to provide more data on the consequences of the knocksideways experiments. Changes in spindle size, tubulin intensity and DNA/kinetochore misalignment upon knocksideways would help readers appreciate the findings of the paper, all the more so since the authors on more than one occasion find their motivation in the field of cancer research and the relation of spindle stability to it. Some data connecting to this motivation would be of value. Experiments seem reproducible.

      Minor comments:

      I have some problems with the clarity of Figures 3 and 4. In Figure 3, the plots on the right are a bit small and not easy to read. Some reorganization of the figure might be beneficial. In Figure 4, the plots on the right are also too small to be clear. Also, I miss the number of cells (n); I can't see the number of individual arrows because of the size of the graphs.

      Significance

      I find that the biggest significance of the paper is in the creation of new tools (cell lines) to study the localization of the proteins TACC3, chTOG, clathrin and GTSE1. Cell lines where endogenous proteins can be delocalized rapidly will be of value for scientists working not only on mitosis but also, in the case of clathrin, on vesicle formation and trafficking, or, in the case of GTSE1, on p53-dependent apoptosis. In the field of mitosis it will surely help and speed up research concerning the role of these proteins in spindle assembly and stability.

      Field of expertise: mitotic spindle

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Here Ryan et al. have used localization analysis following induced rapid relocalization of endogenous proteins to investigate the composition and recruitment hierarchy of a clathrin-TACC3-based spindle complex that is important for microtubule organization and stability. The authors generate different HeLa cell lines, each with one of four complex members (TACC3, CLTA, chTOG and GTSE1) endogenously tagged with FKBP-GFP via Cas9-mediated editing. This tag allows rapid recruitment to the mitochondria upon rapamycin addition ("knocksideways"). They ultimately quantify each of the 4 components' localization to the spindle following knocksideways of each component using fluorescently-tagged transfected constructs. The authors' interpretation of the results of this analysis is summarized in the last model figure, in which a core MT-binding complex of clathrin and TACC3 recruits the ancillary components GTSE1 and chTOG. In addition, the authors investigate the contribution of individual clathrin-binding LIDL motifs in GTSE1 to the recruitment of clathrin and GTSE1 to spindles. Their findings here largely agree with and confirm a recent report regarding the contribution of these motifs to GTSE1 recruitment to the spindle. They further analyzed GTSE1 fragments for interphase and mitotic microtubule localization, and identified a second region of GTSE1 required (but not sufficient) for spindle localization. Finally, the authors report that PIK3C2A is not part of this complex, contradicting (correcting) a previously published study.

      Major comments:

      1. The chTOG-FKBP-GFP cell line the authors generate has only a small fraction of chTOG tagged, and thus should not be used for any conclusions about protein localization dependency on chTOG. Because they were unable to construct a HeLa cell line with all copies tagged, the authors expect that the homozygous knock-in of chTOG-FKBP-GFP is lethal, and thus their experience is appropriate to report. However, the authors should not use this cell line alone to make statements about chTOG dependency. They would have to use similar localization analysis, but after another method to disrupt chTOG (as a second-best approach), such as RNAi. In fact, they have reported this in a previous publication (Booth et al 2011). However, the result was different. There, loss of chTOG resulted in reduced clathrin on spindles, suggesting it may stabilize or help recruit the complex. Alternatively, they could remove their chTOG data, but this would compromise the "comprehensive" nature of the work.

      2. The authors initially analyze complex member localization after knocksideways experiments by antibody staining, which has the advantage of analyzing endogenous proteins (versus the later transfected fluorescent constructs). Setting aside potential artefacts from fixation, this would seem to be a better method for controlled analysis to take advantage of their setup (short of generating stable cell lines with second proteins endogenously tagged in a second color - a huge undertaking). The authors conclude that antibody specificity problems confounded their analysis and explained unusual results. However, I think it is worth investing a little more effort to sort this out, rather than bringing doubt to the whole data set. Verifying and then using another antibody for chTOG localization would be informative. Of course, the negative control should not be their chTOG-FKBP-GFP line, as it does not relocalize most of chTOG.

      In the case of GTSE1, an alternative explanation to antibody specificity issues would be that the GTSE1-FKBP-GFP cell line is not in fact homozygously tagged. Given the low expression levels on the western provided, and the detection of GTSE1 on the spindle in the induced GTSE1-FKBP-GFP cell line (but not TACC3-FKBP-GFP), it seems plausible that an untagged copy remains. If there are multiple copies of GTSE1 in HeLa cells, one untagged copy could represent a small fraction of total GTSE1. This should thus be ruled out. GTSE1 clones should be analyzed with more protein extracts loaded - dilutions of the extracts can determine the sensitivity of the blot to lower protein levels. In addition, sequencing of genomic DNA can reveal a small percentage with different reads.

      3. There is a lot of data contained in the small graphs summarizing quantification of localization in Figs 3 and 4. They would be more accessible to the reader if they were larger and/or an "example" of the chart with labels was present explaining it (essentially what is in the figure legends). Furthermore, there is no statistical test applied to this data that I see. This is needed. How do authors determine whether there is an "effect"?

      Minor issues:

      1. The GTSE1 constructs used for mutation and localization analysis are 720 amino acids long. A recent study analyzing similar mutations uses a 739 amino acid construct (Rondelet et al 2020). The latter is the predominant transcript in NCBI and Ensembl databases. It appears the construct used by the authors omits the first 19 a.a. I do not think using the truncated transcript affects conclusions of the manuscript, but it could generate confusion when identifying residues based on a.a. numbers of mutant constructs (Fig 6). This should be somehow clarified.

      2. The labeling of constructs in Fig 6C/D is confusing, and appears shifted by eye at places. Please relabel this more clearly.
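
      To make the numbering concern in minor issue 1 concrete, here is a minimal sketch of how residue numbers would convert between the two constructs. It assumes, as suggested above but not confirmed, that the 720-residue construct differs from the 739-residue one only by lacking the first 19 residues; the function names are hypothetical.

```python
FULL_LENGTH = 739        # construct length used by Rondelet et al 2020
TRUNCATED_LENGTH = 720   # construct length used in this manuscript
# Assumption: the difference is a purely N-terminal truncation.
N_TERMINAL_OFFSET = FULL_LENGTH - TRUNCATED_LENGTH  # 19 residues


def to_full_length(pos_truncated):
    """Map a residue number in the 720-aa construct to the 739-aa numbering."""
    if not 1 <= pos_truncated <= TRUNCATED_LENGTH:
        raise ValueError("position outside the 720-aa construct")
    return pos_truncated + N_TERMINAL_OFFSET


def to_truncated(pos_full):
    """Map a residue number in the 739-aa numbering to the 720-aa construct.

    Returns None for residues that the shorter construct would lack.
    """
    if not 1 <= pos_full <= FULL_LENGTH:
        raise ValueError("position outside the 739-aa construct")
    if pos_full <= N_TERMINAL_OFFSET:
        return None
    return pos_full - N_TERMINAL_OFFSET
```

      Under this assumption, every residue number quoted for a mutant construct would differ by 19 between the two numbering schemes, which is why stating the reference numbering explicitly would avoid confusion.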

      The recommended new experimental data (analysis of complex member levels on spindles after full perturbation of spindle chTOG; new chTOG antibody stainings in the FKBP lines; reanalysis of GTSE1 DNA/protein in the GTSE1-FKBP line) should only require a new antibody/siRNA, plus a few weeks' time to repeat the analyses already in the paper with new reagents.

      Significance

      While multiple individual components of this complex have been previously characterized, the structure and nature of the complex formation and its recruitment to microtubules/spindles remains a complex problem that has yet to be solved.

      Overall this study represents a comprehensive localization-dependency analysis of the Clathrin-TACC3 based spindle complex using a consistent methodology. Although several of the conclusions of the findings echo previous reports, some of the previous literature is contradictory within itself as well as with the conclusions here. Analyzing all components with a single, rapid-perturbation technique thus has great value to present a clear data set, given that the experimental setup conditions and analysis are solid (a goal to which the majority of comments refer).

      Beyond the complex localization/recruitment analysis, two novel findings of this study that emerge are:

      a) GTSE1 contains a second, separate protein region, distinct from the clathrin-binding motifs, that is required for its localization to the spindle, and is most likely a microtubule-interaction site. This suggests that GTSE1 recruitment to the spindle is more complex than previously reported.

      b) PIK3C2A, which has been reported previously to be a stabilizing member of this complex, is in fact not a member, nor does it localize to spindles, nor display a mitotic defect after loss. This is an important conclusion to be made as it would correct the literature, and avoid future confusion.

    1. Reviewer #2:

      In this manuscript by de Rus Jacquet et al., authors present an interesting study to detect changes in extracellular vesicles in human PD patient derived iPSC-derived astrocytes carrying the LRRK2 G2019S mutation. Isogenic gene corrected iPSCs were used as controls in all experiments. Authors first performed RNA-Seq for global gene expression changes between G2019S and "WT" gene corrected astrocytes. GO analysis showed an upregulation of extracellular compartments (including exosome compartments) in LRRK2 astrocytes. Subsequent experiments focusing on extracellular vesicles (EVs) and multivesicular bodies (MVBs), showed specific differences of MVB area and the size of secreted EVs. Secreted EVs from G2019S astrocytes also contained more LRRK2 particles and G2019S EVs contained more phosphorylated aSyn particles. Co-culture of LRRK2 astrocytes with human dopamine neurons showed accumulation of CD63+ exosomes in neurites, compared to co-culture with WT astrocytes. Co-culture with LRRK2 astrocytes decreased viability of TH+ neurons and LRRK2 dendrites/neurites were also shorter. These co-culture findings were replicated using EV-enriched conditioned media. Finally, authors showed that the trophic effect of astrocytes on neurons was due both to soluble factors released into the media, and production and release of EVs. Overall, this is a well-written and systematically performed study. This reviewer has several comments as detailed below.

      1) Based on their data, authors conclude that astrocyte-to-neuron signaling and trophic support mediated by EVs is disrupted in LRRK2 G2019S astrocytes. Have authors measured the differences in trophic factors released by LRRK2 astrocytes in EVs and in conditioned media?

      2) Authors differentiate cells (astrocytes and neurons) from midbrain lineage NPCs. The data show convincing effects of the LRRK2 derived astrocytes on neurons, but one question is whether this is specific to dopaminergic cells. Would this genotype specific effect also be expected in other lineages, e.g. cortical neurons? Authors should discuss this point.

      3) Prior work has demonstrated reductions in neurite length in neurons derived from LRRK2 G2019S iPSCs (not specific to dopaminergic neurons in LRRK2 cells) (for example Reinhard et al 2013). It is curious that the LRRK2 G2019S mutation itself can cause such a phenotype in neuron mono-cultures, and, as shown in the current study, that LRRK2 G2019S astrocytes also induce a similar effect on WT neurons in co-culture. Can authors expand on this point in the Discussion?

      4) Authors should provide data on % dopaminergic neurons generated in the cultures.

      5) p7. Authors refer to phosphorylated a-synuclein as accelerating PD pathogenesis, but the references cited do not show this. In fact, Gorbatyuk et al 2008 showed that overexpression of S129 with constitutive phosphorylation eliminated a-synuclein induced nigrostriatal degeneration. The Fujiwara et al 2002 reference showed the presence of phospho a-synuclein in Lewy bodies and neurites. Authors should revise their statement that phospho a-synuclein is associated with accelerated pathology.

      6) Please provide details on the number of iPSC lines used for these experiments.

      7) Please clarify whether the WT neurons used for co-culture were derived from the isogenic human neurons.

    2. Reviewer #1:

      In this manuscript titled "The LRRK2 G2019S mutation alters astrocyte-to-neuron communication via extracellular vesicles and induces neuron atrophy in a human iPSC-derived model of Parkinson's disease", Jacquet and colleagues investigated the role of Parkinsonism gene mutation LRRK2 G2019S in hiPSC-differentiated astrocytes. By isolating extracellular vesicles from ACM and examining astrocytes with various electron microscopy techniques, the authors found that LRRK2 G2019S affects the morphology and distribution of MVBs and the morphology of secreted EVs in hiPSC-differentiated astrocytes. Furthermore, the authors observed that astrocyte-derived EVs can be internalized by dopaminergic neurons and such EVs support neuronal survival. However, LRRK2 G2019S EVs lost the ability of promoting neuronal survival. This is an interesting study showing a non-cell autonomous contribution to dopaminergic neuron loss in PD.

      The proposed idea of how LRRK2 G2019S dysregulates EV-mediated astrocyte-to-neuron communication is novel and exciting. However, the authors present some conflicting data that are not addressed in the discussion: they first conclude upregulated exosome biogenesis by RNAseq in G2019S vs WT astrocytes, but later show a decrease in the number of <120nm particles in G2019S mutants, suggesting a decrease in the classical exosome-sized vesicles secreted compared to WT. Lastly, their MVB images show fewer CD63 gold particles in G2019S compared to WT control (though this was not quantified). Do the authors suggest an increase or a decrease in exosome biogenesis in G2019S mutants? How do they reconcile these seemingly contradictory data? Several experiments, controls and additional analyses are needed to fully demonstrate the validity of the proposed mechanism.

      Major concerns:

      1) In Figure 1A, the authors demonstrate characterization of iPSC-derived astrocytes. Since there is no single unified and validated method for astrocyte differentiation, there is a need for more accurate characterization of iPSC-derived astrocytes. Authors should demonstrate the percentage of cells positive for astrocytic markers and prove that the obtained astrocytes are functional (able to promote synaptogenesis and take up glutamate). I would also recommend analyzing the iPSC-derived astrocyte cultures for expression of more specific astrocytic markers such as GLT1 and SOX9 in addition to those which have been analyzed. Moreover, it is highly important to know the proportion of astrocytes derived from the LRRK2 G2019S line and its isogenic control in order to be able to compare their effect on neurons.

      2) In Figure 1, the authors found a significant upregulation of exosome components in astrocytes, demonstrating an important role of LRRK2 G2019S in the EV signaling pathway. In the discussion, the authors briefly mentioned 'sub-populations of CD63- EVs may be differentially secreted in mutant astrocytes'. Since the authors have obtained the RNA-seq data, it would be nice to dig deep into the data and comment on potential EV sub-populations which can be differentially secreted. This information can be very beneficial for follow-up studies in the PD and LRRK2 field. Furthermore, the authors should assess the expression of Rab27a and CD82 in WT and LRRK2 G2019S astrocytes by western blots to verify the RT-qPCR data. The authors should also present which specific exosome biogenesis or secretion genes are altered, to provide further insight into the stage of exosome biogenesis that is affected (ESCRT0-3, VPS4, ALIX, etc).

      3) In Figure 2A and B, data show that both WT and LRRK2 G2019S astrocytes produce MVBs and that MVBs in LRRK2 G2019S astrocytes are smaller than in WT astrocytes. In Figure 2E, the authors showed the abundance of CD63 localized within MVBs in WT astrocytes but did not show the CD63 localization in MVBs in G2019S astrocytes. However, it is important to show CD63 localization in MVBs in G2019S astrocytes to fully support the conclusion that CD63+ MVBs are present in LRRK2 G2019S astrocytes. In addition, CD44 is a marker for astrocyte-restricted precursor cells. Although CD44+ cells are committed to give rise to astrocytes, it is crucial to include another astrocyte marker to ensure these cells are indeed mature astrocytes. -Related, authors should consider citing some of the MVB maturation literature to guide the readers.

      4) In Figure 3, it is impressive that the authors are able to image EVs using a cryo-EM approach and analyze their sizes. The authors also observed different shapes of EVs. Is there any shape difference between WT EVs and G2019S EVs? Is there a way that the authors could categorize these shapes and do a detailed analysis of EV shapes? Also, in Figure 3D, both WT EV and G2019S EV images should be presented side by side for comparison. -Related, the size frequencies of EVs presented suggest a difference in the types of EVs released. Interestingly, exosomes are classically known to range from ~50-120nm and this population is significantly decreased in G2019S compared to WT. What does this suggest?

      5) In Figure 3C, the SBI ELISA claims to quantify CD63+ vesicles; the authors should present more standardized particle quantification data (either by CD63 FACS for isolated EVs in WT vs G2019S or by ZetaView/QNano particle tracking). The authors should also directly quantify the total number of EVs secreted in WT vs G2019S conditions (not only CD63+).

      6) In Figure 4, the authors quantify LRRK2+/CD63+ particles by imaging. Importantly, it appears that there are fewer CD63 "large gold" particles in MVBs of G2019S compared to control. This CD63 baseline quantification in MVBs of WT vs. G2019S should be presented in this figure. These data are not convincing and should be quantified by FACS in secreted EVs. Supplementary Figure 3 should be brought into this figure.

      7) In Figure 5, using CD63 as an MVB marker is not the most accurate approach. ESCRT markers should be co-stained in these experiments to truly show MVB localization (CD63 can localize to MVBs but is known to have a wider distribution throughout the cell compared to TSG101 or other ESCRT complex proteins). Additionally, the authors must show their Supplemental Figure 3 ELISA quantification of p-aSyn in this main figure, and comment on why they conclude higher p-aSyn content in MVBs based on their IEM but then find no differences in aSyn in secreted EVs in WT vs. G2019S by ELISA.

      8) In figure 6, it is even more clear that there is a stark difference between the CD63 presence in/near MVBs between WT and G2019S conditions. Since the authors normalize several pieces of data to CD63 (MVB localization, LRRK2 co-localization, etc), it is critical to quantify the number of baseline CD63 gold particles in MVBs in WT vs G2019S.

      9) In Figure 7, the authors used the co-culture of astrocytes and neurons to assess astrocyte-derived EV uptake by dopaminergic neurons. Although 3D reconstruction of neurons and exosomes can be precise, the data may not be 100% clean. It would be better if the authors collected the EV-containing ACM fraction from WT astrocytes and G2019S astrocytes and then incubated dopaminergic neurons with it. In this way, only dopaminergic neurons are in the culture and there will be no CD63-GFP-expressing astrocytes to contaminate the CD63-GFP signal in neurons.

      10) In Figure 9, the authors must show their ACM control. They show untreated, EV-free, and EV-rich ACM, but do not show unmanipulated ACM control.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The discussion between reviewers and editors centered on a few key points. First, all reviewers felt that it is of utmost importance that a justified and appropriate number of hiPSCs and their appropriate controls are utilized throughout. In particular, there is concern that G2019S-related phenotypes may be more variable than other presumed monogenetic causes of disease, for example a low penetrance of disease causation associated with G2019S in people (e.g., 20% lifetime penetrance for PD) that may necessitate more lines analyzed than usual, and possibly lines from carriers of the mutation that appear resilient to disease. Studies in the past decade that use only one or a few lines of G2019S hiPSCs have generally failed to replicate in more than one laboratory, possibly due to low power. The reviewers were not sure how rigorous the study was in this regard. Second, reviewers felt there was over-interpretation and speculation regarding the possible roles of differential trophic factors released by the astrocytes in EVs and conditioned media without many measures of specific trophic factors, or rescue experiments, to help define the mechanism. Third, the EV data are not broadly supported by NTA (like ZetaView or NanoSight) or quantitative measures fairly standard in the EV field. For example, the authors did not clearly quantify the total number of EVs secreted in WT vs. G2019S conditions, which would be a basic experiment needed to create interest in the study in the EV community.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We are grateful for the insightful, constructive and very positive reviews provided by the three reviewers. Please find responses to each of the reviewer comments below.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors study proteins localised to the apical end of the highly polarised parasites causing toxoplasmosis and malaria. They find new proteins using BioID and examine the localisation of these along with recently identified proteins in the two different parasites. The key question they address is whether there is a conservation of the apical components in these distantly related parasites as well as in some even more distantly related organisms. This is an important question as the apical part comprises many proteins essential for invasion of host cells and shows a unique structure that defines the apicomplexans as a group. The apical structure can be highly elaborate, as in T. gondii, or less elaborate, as in P. falciparum. The authors now show that there is a large conservation between the species in the protein makeup of the apical end. The experiments are well performed, displayed and discussed and there is no doubt about the validity of the presented results. The text is eloquently written, if at times a bit wordy.

      My only main suggestion would be to possibly add data on gene disruption of the two candidates (0310700 and 1216300) that are not detected in blood stage parasites but in the insect stages. A deletion of these should be technically straightforward and would show whether the proteins are important to the parasite. Likely not all of the now many proteins are essential for the parasites but these are good candidates to rapidly investigate. But showing a functional impact might convince editors at certain journals.

      Authors’ response: The central aim of this study was to ask if the molecular composition of the conoid complex is conserved across Apicomplexa. Functional dissection of proteins is part of an exciting set of subsequent questions and studies that will now follow by us and others. However, careful and thorough phenotyping of gene disruptions is not trivial work, would be most informative to perform in both Toxoplasma and Plasmodium, and is therefore beyond the scope of this project. Regarding the two proteins suggested by this reviewer for follow-up work and the question of ‘essentiality’, that the proteins have not been lost during parasite selection through evolution is clear evidence of their relevance to the biology of Plasmodium.

      Other suggestions in chronological order (line numbers would have helped)

      title: maybe write 'conoid complex proteome'

      Authors’ response: while we initially thought that this change would be suitable, given that the subsequent part of the title is ‘reveals a cryptic conoid feature’ we think it is clearer and more logical to leave this title in its original form. The conoid complex includes the apical polar rings, and these are not considered to be cryptic or previously unrecognised, only the conoid. While our study confirms that there is conservation across all proteome components of the conoid complex, this is secondary to the primary question of this study.

      abstract: not sure about the use of the words instrument and substructures

      Authors’ response: we believe that the use of ‘instrument’ is an appropriate analogy of a tool and not different from the use of ‘machine’ and ‘machinery’ that is widely used in molecular and cellular biology. Similarly, ‘substructure’ acknowledges that within recognised structures, such as the conoid, there is further specific organisation such as the conoid base or apex.

      page 2 last lines: is tubulin monomeric or polymerized?

      Authors’ response: to specify the polymerized state of tubulin as mentioned here the text has been changed to ‘the presence of tubulin polymers’.

      page 3 name protein talked about in 9th line

      Authors’ response: we have now named this protein (RNG2) as suggested.

      third paragraph: mention previous proteomics studies e.g. from Ke Hu (mentioned later in discussion)

      Authors’ response: We feel that it is more appropriate to leave the discussion of the Hu et al (2006) proteomics study, along with various subsequent approaches used in pursuit of discovering conoid-associated proteins, to the discussion as currently occurs. In the introduction we seek to efficiently inform the reader of the current state of knowledge that makes the value and nature of the questions that we have asked in this study apparent. But we do give full credit and evaluation of previous studies in the discussion which we think is the most appropriate place for this.

      first paragraph or results could go into introduction

      Authors’ response: The first paragraph of the Results contains specific detail of just one aspect of this study, the use of hyperLOPIT. This is relevant to the new analysis that we have made of the hyperLOPIT data in this study. We, therefore, believe that it is most appropriately presented here in the Results in association with the new analyses we described. Our aim is that the Introduction is succinct and serves the entire study.

      page 4: add reference after BioID

      Authors’ response: reference added as suggested

      page 5: add definitions of the conoid; what technique was used to report YFP-SAS6?

      Authors’ response: It is unclear what this reviewer is requesting with respect to definitions of the conoid on this page. Nevertheless, we have now included a thorough definition of the conoid based on the original electron microscopy studies (fourth paragraph of the Introduction).

      With respect to the technique used to report on YFP-tagged SAS6 in the de Leon et al 2013 study, we now include fuller description of this previous study as follows:

      ‘The fluorescence imaging used in the de Leon et al study was limited to lower resolution widefield microscopy. Immuno-TEM was also used, however, contrary to their conclusions, did show YFP presence throughout transverse and oblique sections of the conoid consistent with our detection of SAS6L throughout the conoid body.’

      page 7: 'showed similar localisation' instead of 'phenocopied'?; add reference after ookinete stage; add expression levels from PlasmoDB to the Table 1 data at least for merozoites, ookinetes and sporozoites or add separate table for the 9 proteins in supplement

      Authors’ response: ‘phenocopied’ replaced, as suggested. Reference added after ookinete stage, as suggested.

      As requested, we have compiled available expression data for the Plasmodium proteins throughout the different zoite stages and will include these data as supplemental material in our subsequent revision.

      Discussion: Maybe discuss that the conoid complex is a cytoskeletal structure and that the other cytoskeletons (actin, microtubules, subpellicular network) also differ between the species investigated in their composition and overall architecture

      Authors’ response: These are reasonable suggested analogies and we will introduce them in the subsequent revision.

      page 9: at least two proteins could be deleted as they seem to not confer any growth defect on blood stages (see main comment)

      Authors’ response: This reviewer has not linked this comment to a specific statement on page 9; however, we are cautious not to equate a lack of observed growth defects in experimental scenarios with unimportant or irrelevant proteins. Maintenance, through natural selection and evolution, of the proteins of a structure indicates that they are selectively advantageous and of functional relevance. The two proteins in question are not expressed in the blood stage, so one wouldn’t expect their deletion to have consequence in this stage.

      Apart from classic TEM images also Cryo EM data is available for apex of merozoite and sporozoite. Worth to discuss?

      Authors’ response: In line with this reviewer’s subsequent suggestion (below), we are now preparing a schematic of each of the zoite stages of Plasmodium for the subsequent revision, and these draw on cryo-EM tomography data.

      Add and discuss the recent work from Curr Biol and EMBO J of the Yuan lab on ookinete formation?

      Authors’ response: These two reports are excellent studies of the polarised development of the cell pellicle during ookinete formation and control of gliding initiation, but don’t specifically relate to the conoid complex structures that are the subject of our study. We, therefore, do not see a logical place to include discussion of these works.

      Reviewer #2 (Significance (Required)):

      The paper provides a conceptual advance over previous data as it shows clearly a high level of conservation of the protein components of the conoid complex. It could introduce a new terminology for these important apical structure of Apicomplexan parasites and provides a good basis to dissect the molecular functions.

      Authors’ response: We appreciate this reviewer recognising this opportune point in time to more clearly define the terminology applied to these apical structures so that they can be more clearly and easily compared between taxa. We will use the suggested schematic figure (see comment below) that is now in preparation as a basis and guide for a refined nomenclature based on precedent in the literature.

      As it stands all scientists investigating Plasmodium and Toxoplasma invasion of host cells will be highly interested in this study, most scientists researching apicomplexan organisms should be and some evolutionary scientists will be interested in this study.

      Key papers in the field are the discovery of the Toxoplasma conoid as a highly twisted microtubule-like structure (Hu et al., JCB 2002; doi: 10.1083/jcb.200112086) the first description of an apical proteome (Hu et al., PLoS Path 2006; 10.1371/journal.ppat.0020013), the description of a tilted arrangement of the rings in Plasmodium versus Toxoplasma (Kudryashev et al., Cell Microbiol 2012; doi: 10.1111/j.1462-5822.2012.01836.x) and the discovery of apical located proteins that are essential for conoid formation (Tosetti et al., eLife 2020; 10.7554/eLife.56635) to name a few.

      If intended for a broader audience, a cartoon of a conoid complex across the different species investigated and discussed here would help for visual guidance highlighting the similarities and differences

      Authors’ response: This is a good suggestion and we are presently preparing a schematic of all stages studied and supporting this with electron microscopy.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this work, Koreny et al. characterized the localization of a new collection of conoid proteins in Toxoplasma gondii as well as in several different stages of Plasmodium berghei. The authors discovered that these proteins are located in several distinct substructures in Plasmodium and are expressed in a stage-specific manner. The data are of high quality, well‐organized, and well presented. The paper is well written. The introduction, in particular, was a pleasure to read. This reviewer (Ke Hu) does not have any new experiments to suggest.

      However, while the authors present LOPIT+BIOID as a powerful approach to identify conoid proteins, implying that it is more reliable than previously published approaches (see below), the manuscript includes no data to show what the false positive or false negative rate is with the current approach, nor any estimate of how many conoid proteins were missed entirely.

      Authors’ response: In our validation of putative conoid-associated proteins identified by the hyperLOPIT+BioID approach we reporter-tagged 18 proteins to resolve their cellular location by microscopy. All 18 were verified as being located at the site of the conoid. So, by this measure there were no false positives. The veracity of the hyperLOPIT data was also confirmed across other cell compartments in our report where 62 proteins were reporter-tagged from which there were no false positive assignments of cell location (Barylyuk et al., 2020, Cell Host & Microbe, in press; doi: 10.1016/j.chom.2020.09.011; bioRxiv: https://doi.org/10.1101/2020.04.23.057125).

      Estimating false negatives is more difficult, but we know that these would occur as for any mass spectrometry-based detection technique. However, we have not claimed to have been exhaustive, nor was this required to answer our central question of are there conserved conoid-associated proteins throughout Apicomplexa? To address this question, we required a good sample of proteins, and the methods that we have employed provided this.

      Page 7: "Previous identification of conoid complex proteins used methods including subcellular enrichment, correlation of mRNA expression, and proximity tagging (BioID) (Hu et al. 2006; Long, Anthony, et al. 2017; Long, Brown, et al. 2017). Amongst these datasets many components have been identified, although often with a high false positive rate. We have found the hyperLOPIT strategy to be a powerful approach for enriching in proteins specific to the apex of the cell, and BioID has further refined identification of proteins specific to the conoid complex region."

      The authors should state whether the candidate proteins were chosen in an unbiased way or not.

      Authors’ response: Candidate proteins selected for validation by microscopy were not biased for any known likelihood of being associated with the conoid, other than the proteomics data that we were seeking to test. However, we did give preference to proteins with the following traits: 1) proteins with strong corresponding gene knockout fitness phenotypes from published studies, 2) proteins with some evidence of conserved functional domains, and 3) genes with orthologues found in Plasmodium spp. and other apicomplexans. These traits were chosen with future functional studies in mind where proteins might be more informative of conoid-related functions and relevance in other apicomplexans. All validated proteins, however, were otherwise uncharacterised and, therefore, were not knowingly biased for more likely conoid-association over others discovered by our proteomics approach. We now include the following statement.

      “All proteins selected for validation were previously uncharacterised and with no a priori reason to be identified as conoid-associated other than our proteomics data.”

      If so, how many proteins were localized to the conoid and how many were not?

      Authors’ response: as stated above, we observed no false positives from the sample of 18 protein locations verified by microscopy.

      Related to this, the majority (14 out of 20) of the conoid proteins identified by LOPIT+BIOID in this paper were previously identified as conoid candidate proteins in Hu et al's 2006 paper, based on the number of peptides retrieved from the conoid enriched vs depleted fractions. Those data (see below) have been available from ToxoDB for many years and should be acknowledged.

      Accession# - conoid enriched : conoid depleted (from Hu et al. 2006)

      222350 - 2:0

      274120 - 3:0

      291880 - 1:0

      301420 - 3:1

      246720 - 4:0

      258090 - 10:0

      266630 - 8:1

      208340 - 4:2

      253600 - 1:0

      306350 - not found

      250840 - 1:0

      292120 - not found

      219070 - not found

      274160 - not found

      320030 - 7:1

      227000 - 10:0

      278780 - not found

      284620 - not found

      295420 - 6:0

      297180 - 4:0
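      As a quick consistency check on the list above, the peptide counts can be tallied programmatically. The following is only a sketch: the dictionary is transcribed by hand from the accessions and ratios listed here, the names `HU_2006_COUNTS` and `summarize` are hypothetical, and the 2-fold threshold is an assumption borrowed from the enrichment criterion mentioned in the authors' response rather than anything specified by the reviewer.

```python
# Peptide counts transcribed from the reviewer's list as
# accession -> (conoid-enriched, conoid-depleted); None means "not found".
HU_2006_COUNTS = {
    "222350": (2, 0),
    "274120": (3, 0),
    "291880": (1, 0),
    "301420": (3, 1),
    "246720": (4, 0),
    "258090": (10, 0),
    "266630": (8, 1),
    "208340": (4, 2),
    "253600": (1, 0),
    "306350": None,
    "250840": (1, 0),
    "292120": None,
    "219070": None,
    "274160": None,
    "320030": (7, 1),
    "227000": (10, 0),
    "278780": None,
    "284620": None,
    "295420": (6, 0),
    "297180": (4, 0),
}


def summarize(counts, min_fold=2):
    """Split accessions into detected, conoid-exclusive, and fold-enriched.

    min_fold is an assumed threshold (enriched >= min_fold * depleted).
    """
    found = [acc for acc, c in counts.items() if c is not None]
    # Exclusive: peptides seen only in the conoid-enriched fraction.
    exclusive = [acc for acc in found if counts[acc][1] == 0]
    # Enriched: seen in both fractions, but at least min_fold higher in the enriched one.
    enriched = [
        acc
        for acc in found
        if counts[acc][1] > 0 and counts[acc][0] >= min_fold * counts[acc][1]
    ]
    return found, exclusive, enriched


if __name__ == "__main__":
    found, exclusive, enriched = summarize(HU_2006_COUNTS)
    print(f"{len(found)} of {len(HU_2006_COUNTS)} accessions detected")
    print(f"{len(exclusive)} exclusive to the conoid-enriched fraction")
    print(f"{len(enriched)} enriched at least 2-fold")
```

      The tally reproduces the "14 out of 20" figure quoted by the reviewer; it says nothing about whether any individual assignment is a true positive.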

      Authors’ response: Proteomic methods and mass spectrometry have experienced revolutionary advances since this 2006 study was conducted. These include improvements in both sensitivity and quantitation accuracy. The Hu et al 2006 study provided an exciting first step towards conoid protein discovery. However, by their original estimation, at least 35% of their putative conoid-specific proteins were identifiable as false positives (e.g. ribosomal proteins) and this estimate could not account for the majority of uncharacterised proteins whose potential for false positive attribution to the conoid was untested. From almost 300 proteins, this study only validated four as associated with the conoid. The further proteins listed above were not validated as conoid proteins in the Hu et al study and, therefore, could not be distinguished from the many false positives reported in their work. In our Table 1, we have acknowledged the Hu et al study for the select proteins that they established as conoid proteins in their study.

      To further assess the utility of this 2006 conoid-enriched proteome we sorted the Hu et al detected proteins on our full hyperLOPIT assignments. Of the proteins that were reported by Hu et al as either exclusive to the conoid-enriched fraction or enriched by at least 2-fold over the conoid-depleted fraction, 15% were assigned to the apical 1 and 2 clusters (representing the relevant compartments to the conoid complex). Thus, according to the hyperLOPIT data these represent the true positives found in this study and 13 of these proteins were independently validated as conoid-associated by us. Significantly, however, 85% of the conoid-exclusive and conoid-enriched proteins from Hu et al (2006) were allocated to a non-apical location with 99% probability by hyperLOPIT, and, during our validation of 62 assignments we verified the alternative location of eight of these. False positives, therefore, greatly outnumbered true positives in this earlier dataset. This high rate of false positives in subcellular isolation proteomics is typical of the challenges that this method faces, and this was the rationale for and strength of the alternative hyperLOPIT approach. Given the overall relatively low level of conoid specificity in the earlier work we do not think that there is value in making specific protein-by-protein reference to it.

      Reviewer #3 (Significance (Required)):

      see above

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      This manuscript details the further use of the hyperplexed Localisation of Organelle Proteins by Isotope Tagging (hyperLOPIT) that the group has previously published using T. gondii tachyzoites by combining this with BioID and super-resolution microscopy in order to uncover new proteins that form part of a structurally known and functionally elusive conoid. The authors conclusively identified new proteins that localise to the conoid structure in T. gondii and also excitingly showed that not only is this structure found in all invasive forms of Plasmodium (using the P. berghei model) but there is also a different molecular make-up in the blood stage merozoites, which have a slightly reduced number of proteins (or possibly as yet unknown alternatives) compared to ookinete and sporozoite conoid structures. This study is scientifically sound and the conclusions reached are well supported by the results presented.

      **Major Comments:** No major comments

      **Minor Comments:**

      1)While both the introduction and discussion are well written and detailed they could both be a little more concise.

      Authors’ response: We take this as a style recommendation, but we note that the other reviewers commented on the text’s “eloquence” and that the introduction in particular was a “pleasure to read”. We take these comments as votes of confidence in the current form.

      2)Selection of the 5 new genes in Tg to be tagged (top pg 5) it was not clear as to the selection criteria for these 5.

      Authors’ response: Please see the same query, and response with modified text, made by Reviewer #3.

      This also leads to the second part of this question where there appears to be some genes missing from Table 1 and Table S1, specifically those found in both SAS6L and RNG2 BioID. It was mentioned that 25 were identified in both SAS6L and RNG2 BioID. In Table 1 (there are 23) there is no mention of 223790, 281650, 224700, and 293540 but they are in the Table S1 (assuming these 4 are not selected in this study for tagging) but in table S1 (there are 25 listed) 216080 (AKMT) and 234250 (CIP1) that are in the Table 1 as being identified in both SAS6L and RNG2 BioID are absent from the Table S1 does this mean there are actually 27 or was the indication of identified in both SAS6L and RNG2 BioID for 216080 (AKMT) and 234250 (CIP1) in Table 1 a mistake?

      Authors’ response: This reviewer has overlooked that Table 1 reports on all currently known conoid associated proteins, including those not detected in the hyperLOPIT data but reported in the literature, whereas Table S1 is exclusively those proteins detected and assigned as ‘apical’ by hyperLOPIT. The reported BioID-detection for each protein is then made within this framework. Thus, the proteins that occur in only one or the other table do so because they don’t satisfy these two sets of criteria. We have rechecked the numbers reported in the text and they are correct.

      3)Table 1: There is the fitness score for Pf orthologues but no mention of fitness in Pb (the model used) from the PlasmoGEM screens, considering the authors use the Pb model it would be of interest to add this in the table.

      Authors’ response: The Plasmodium berghei PlasmoGEM gene disruption screen was much more limited in number than that for P. falciparum. Consequently, fitness scores were available for only two of the Plasmodium orthologues for which we have location data. We, therefore, thought it was of limited utility to include these data in Table 1, and these data are in the public domain should a reader seek them.

      4)Figure 2: The images for localisation with SAS6L for 291880 and 258090 appear to be missing.

      Authors’ response: Initially we did not make the separate transgenic cell lines for each protein with both the SAS6L and RNG2 markers. This was because one marker was usually sufficient to resolve the relative location of the protein of interest. However, given this reviewer’s comment and the potential for some extra information to be recovered by using both markers, we have now generated all cell lines necessary for this analysis. We are presently completing the imaging of these new cell lines and these data will be included in the subsequent revision.

      5)Figure 3: It is unclear why both SAS6L and RNG2 are not used for all localisations shown (this could be clarified in the text)

      Authors’ response: see previous comment.

      6)Figure 5: It is a shame only 7 of the 9 Plasmodium orthologues were included in the super resolution as there are only 2 more to have the complete set.

      Authors’ response: Ideally, we would have been able to achieve this, but the restrictions imposed by the COVID-19 disruption to laboratory access and activities ultimately slightly limited these analyses. However, to answer the central question of whether there is conservation of the Toxoplasma conoid proteome in Plasmodium it was not necessary to perform super resolution imaging for all of these proteins. The major outcome of this study, therefore, is not affected by this.

      7)Figure 6: As with Figure 5 it would be better if more were included in the super-resolution images in this sporozoite stage.

      Authors’ response: Same response as above. Generation of sporozoites requires passage through the mosquito vector so this is even more resource-intensive than generation of ookinetes that can be differentiated in vitro from mouse-derived parasites. Again, the answers to the central questions posed by this study do not require these further, high resolution, data.

      8)Figure 7: This would be improved with at least a selection (or even all 6) to have the super-resolution images (possibly even with free merozoites)

      Authors’ response: We did apply 3D-SIM imaging to fixed merozoites; however, unlike ookinetes and sporozoites, the imaged fixed material was inferior to the live cell GFP imaging that we have included. This likely reflects the poorer fixation properties of Plasmodium merozoites, a challenge of these cell forms that is widely experienced by Plasmodium researchers. We do not have access to a 3D-SIM microscope within a containment laboratory necessary for handling viable parasites and, therefore, could not attempt to image live material with this instrument. Again, the answers to the central questions posed by this study do not require these further, high resolution, data.

      9)As there are numerous new proteins identified in 2 different parasites and with the composition of the conoid differing at different stages it would be beneficial to have some sort of schematic model of the apical complex in Tg and Pb indicating where each new protein localises

      Authors’ response: In response to this reviewer, and reviewer #2’s suggestion, we are now preparing schematic models of the apices of all of the relevant organism stages.

      Reviewer #4 (Significance (Required)):

      The authors have combined expert mass spectrometry and super-resolution microscopy to identify new components of the conoid in Tg and added to the knowledge that will help to uncover the function of the structure. But perhaps the most significant is the conclusive identification of the conoid in all 3 invasive stages of the Plasmodium parasite. Until now it was widely accepted that the conoid was missing in Plasmodium and to uncover multiple proteins that appear to make up and constitute this structure in Plasmodium is highly significant and clearly of interest to the apicomplexan field. Furthermore the suggestion that the conoid differs in the molecular makeup within Plasmodium depending on stage is very intriguing and clearly of interest. This paper expertly combined cutting-edge proteomics and microscopy to identify the conoid in Plasmodium. This manuscript would have a broad readership in parasitology, proteomics, and cell biology

      Our expertise is largely in molecular parasitology and microscopy

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript details the further use of the hyperplexed Localisation of Organelle Proteins by Isotope Tagging (hyperLOPIT) that the group has previously published using T. gondii tachyzoites by combining this with BioID and super-resolution microscopy in order to uncover new proteins that form part of a structurally known and functionally elusive conoid. The authors conclusively identified new proteins that localise to the conoid structure in T. gondii and also excitingly showed that not only is this structure found in all invasive forms of Plasmodium (using the P. berghei model) but there is also a different molecular make-up in the blood stage merozoites, which have a slightly reduced number of proteins (or possibly as yet unknown alternatives) compared to ookinete and sporozoite conoid structures. This study is scientifically sound and the conclusions reached are well supported by the results presented.

      Major Comments: No major comments

      Minor Comments:

      1)While both the introduction and discussion are well written and detailed they could both be a little more concise.

      2)Selection of the 5 new genes in Tg to be tagged (top pg 5) it was not clear as to the selection criteria for these 5. This also leads to the second part of this question where there appears to be some genes missing from Table 1 and Table S1, specifically those found in both SAS6L and RNG2 BioID. It was mentioned that 25 were identified in both SAS6L and RNG2 BioID. In Table 1 (there are 23) there is no mention of 223790, 281650, 224700, and 293540 but they are in the Table S1 (assuming these 4 are not selected in this study for tagging) but in table S1 (there are 25 listed) 216080 (AKMT) and 234250 (CIP1) that are in the Table 1 as being identified in both SAS6L and RNG2 BioID are absent from the Table S1 does this mean there are actually 27 or was the indication of identified in both SAS6L and RNG2 BioID for 216080 (AKMT) and 234250 (CIP1) in Table 1 a mistake?

      3)Table 1: There is the fitness score for Pf orthologues but no mention of fitness in Pb (the model used) from the PlasmoGEM screens, considering the authors use the Pb model it would be of interest to add this in the table.

      4)Figure 2: The images for localisation with SAS6L for 291880 and 258090 appear to be missing.

      5)Figure 3: It is unclear why both SAS6L and RNG2 are not used for all localisations shown (this could be clarified in the text)

      6)Figure 5: It is a shame only 7 of the 9 plasmodium orthologues were included in the super resolution as there is only 2 more to have the complete set.

      7)Figure 6: As with Figure 5 it would be better if more were included in the super-resolution images in this sporozoite stage.

      8)Figure 7: This would be improved with at least a selection (or even all 6) to have the super-resolution images (possibly even with free merozoites)

      9)As there are numerous new protein identified in 2 different parasites and with the composition of the conoid differing at different stages it would be beneficial to have some sort of schematic model of the apical complex in Tg and Pb indicating where each new protein localises

      Significance

      The authors have combined expert mass spectrometry and super-resolution microscopy to identify new components of the conoid in Tg and added to the knowledge that will help to uncover the function of the structure. But perhaps most significant is the conclusive identification of the conoid in all 3 invasive stages of the Plasmodium parasite. Until now it was widely accepted that the conoid was missing in Plasmodium, and to uncover multiple proteins that appear to make up and constitute this structure in Plasmodium is highly significant and clearly of interest to the apicomplexan field. Furthermore, the suggestion that the conoid differs in its molecular makeup within Plasmodium depending on stage is very intriguing and clearly of interest. This paper expertly combined cutting-edge proteomics and microscopy to identify the conoid in Plasmodium. This manuscript would have a broad readership in parasitology, proteomics, and cell biology.

      Our expertise is largely in molecular parasitology and microscopy.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this work, Koreny et al. characterized the localization of a new collection of conoid proteins in Toxoplasma gondii as well as in several different stages of Plasmodium berghei. The authors discovered that these proteins are located in several distinct substructures in Plasmodium and are expressed in a stage-specific manner. The data are of high quality, well‐organized, and well presented. The paper is well written. The introduction, in particular, was a pleasure to read. This reviewer (Ke Hu) does not have any new experiments to suggest.

      However, while the authors present LOPIT+BIOID as a powerful approach to identify conoid proteins, implying that it is more reliable than previously published approaches (see below), the manuscript includes no data to show what the false positive or false negative rate is with the current approach, nor any estimate of how many conoid proteins were missed entirely.

      Page 7: "Previous identification of conoid complex proteins used methods including subcellular enrichment, correlation of mRNA expression, and proximity tagging (BioID) (Hu et al. 2006; Long, Anthony, et al. 2017; Long, Brown, et al. 2017). Amongst these datasets many components have been identified, although often with a high false positive rate. We have found the hyperLOPIT strategy to be a powerful approach for enriching in proteins specific to the apex of the cell, and BioID has further refined identification of proteins specific to the conoid complex region."

      The authors should state whether the candidate proteins were chosen in an unbiased way or not. If so, how many proteins were localized to the conoid and how many were not? Related to this, the majority (14 out of 20) of the conoid proteins identified by LOPIT+BIOID in this paper were previously identified as conoid candidate proteins in Hu et al's 2006 paper, based on the number of peptides retrieved from the conoid enriched vs depleted fractions. Those data (see below) have been available from ToxoDB for many years and should be acknowledged.

      Accession# - conoid enriched : conoid depleted (from Hu et al. 2006)

      222350 - 2:0

      274120 - 3:0

      291880 - 1:0

      301420 - 3:1

      246720 - 4:0

      258090 - 10:0

      266630 - 8:1

      208340 - 4:2

      253600 - 1:0

      306350 - not found

      250840 - 1:0

      292120 - not found

      219070 - not found

      274160 - not found

      320030 - 7:1

      227000 - 10:0

      278780 - not found

      284620 - not found

      295420 - 6:0

      297180 - 4:0
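
      The "14 out of 20" figure can be checked against the list above. The following is a minimal Python sketch, with the peptide counts transcribed from that list and `None` standing in for "not found"; the variable names are illustrative only and are not taken from the manuscript or from ToxoDB.

```python
# Peptide counts as (conoid-enriched, conoid-depleted), transcribed from the
# list above (Hu et al. 2006). None marks accessions listed as "not found".
counts = {
    "222350": (2, 0), "274120": (3, 0), "291880": (1, 0), "301420": (3, 1),
    "246720": (4, 0), "258090": (10, 0), "266630": (8, 1), "208340": (4, 2),
    "253600": (1, 0), "306350": None, "250840": (1, 0), "292120": None,
    "219070": None, "274160": None, "320030": (7, 1), "227000": (10, 0),
    "278780": None, "284620": None, "295420": (6, 0), "297180": (4, 0),
}

# Accessions that were detected at all in the 2006 dataset.
found = {acc: c for acc, c in counts.items() if c is not None}

# Of those, accessions seen only in the conoid-enriched fraction.
enriched_only = [acc for acc, (enriched, depleted) in found.items() if depleted == 0]

print(f"{len(found)} of {len(counts)} accessions were detected in the 2006 dataset")
print(f"{len(enriched_only)} of those had no peptides in the conoid-depleted fraction")
```

      Such a tally only summarises the counts quoted here; it says nothing about false positive or false negative rates, which is the point raised above.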

      Significance

      see above

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      The authors study proteins localised to the apical end of the highly polarised parasites causing toxoplasmosis and malaria. They find new proteins using BioID and examine the localisation of these along with recently identified proteins in the two different parasites. The key question they address is whether there is a conservation of the apical components in these distantly related parasites as well as in some even more distantly related organisms. This is an important question as the apical part comprises many essential proteins of invasion of host cells and shows a unique structure that defines the apicomplexans as a group. The apical structure can be highly elaborate such as in T. gondii and less elaborate as in P. falciparum. The authors now show that there is a large conservation between the species in the protein makeup of the apical end. The experiments are well performed, displayed and discussed and there is no doubt about the validity of the presented results. The text is eloquently written, if at times a bit wordy. My only main suggestion would be to possibly add data on gene disruption of the two candidates (0310700 and 1216300) that are not detected in blood stage parasites but in the insect stages. A deletion of these should be technically straightforward and would show whether the proteins are important to the parasite. Likely not all of the now many proteins are essential for the parasites but these are good candidates to rapidly investigate. But showing a functional impact might convince editors at certain journals.

      Other suggestions in chronological order (line numbers would have helped)

      title: maybe write 'conoid complex proteome'

      abstract: not sure about the use of the words instrument and substructures

      page 2 last lines: is tubulin monomeric or polymerized?

      page 3: name the protein discussed in the 9th line

      third paragraph: mention previous proteomics studies e.g. from Ke Hu (mentioned later in discussion)

      first paragraph of results could go into introduction

      page 4: add reference after BioID

      page 5: add definitions of the conoid; what technique was used to report YFP-SAS6?

      page 7: 'showed similar localisation' instead of 'phenocopied'?; add reference after ookinete stage; add expression levels from PlasmoDB to the Table 1 data at least for merozoites, ookinetes and sporozoites or add separate table for the 9 proteins in supplement

      Discussion: Maybe discuss that the conoid complex is a cytoskeletal structure and that the other cytoskeletons (actin, microtubules, subpellicular network) also differ between the species investigated in their composition and overall architecture

      page 9: at least two proteins could be deleted as they seem to not confer any growth defect on blood stages (see main comment)

      Apart from classic TEM images, cryo-EM data are also available for the apex of the merozoite and sporozoite. Worth discussing?

      Add and discuss the recent work from Curr Biol and EMBO J of the Yuan lab on ookinete formation?

      Significance

      The paper provides a conceptual advance over previous data as it shows clearly a high level of conservation of the protein components of the conoid complex. It could introduce a new terminology for these important apical structures of apicomplexan parasites and provides a good basis to dissect the molecular functions. As it stands, all scientists investigating Plasmodium and Toxoplasma invasion of host cells will be highly interested in this study, most scientists researching apicomplexan organisms should be, and some evolutionary scientists will be interested in this study.

      Key papers in the field are the discovery of the Toxoplasma conoid as a highly twisted microtubule-like structure (Hu et al., JCB 2002; doi: 10.1083/jcb.200112086) the first description of an apical proteome (Hu et al., PLoS Path 2006; 10.1371/journal.ppat.0020013), the description of a tilted arrangement of the rings in Plasmodium versus Toxoplasma (Kudryashev et al., Cell Microbiol 2012; doi: 10.1111/j.1462-5822.2012.01836.x) and the discovery of apical located proteins that are essential for conoid formation (Tosetti et al., eLife 2020; 10.7554/eLife.56635) to name a few.

      If intended for a broader audience, a cartoon of a conoid complex across the different species investigated and discussed here would help as visual guidance, highlighting the similarities and differences.

    1. Reviewer #2

      In this manuscript, the authors applied Gaussian Process regression to drug response data and attempted to utilize the estimates of uncertainty from these regressions to improve on drug response curve fitting and biomarker discovery. Their approach and application case is an interesting one that deserves further investment and attention. However, I have substantive concerns with the current manuscript draft and would recommend to the authors that these concerns be addressed.

      1) Figure 3 and the accompanying text section of the main document seems to be focused on characterizing estimation uncertainty, which appears to simply be the between-sample dispersion of the dose-response curve (or summary statistics thereof) from replicate runs. The main conclusion seems to be that drug compounds with partial responders are the ones with the greatest between-sample dispersions.

      What is missing from this Figure and accompanying text is a comparison of these results with analogous ones for the observation uncertainty to help readers understand why one approach may be preferred over the other.

      2) Figure 5A compares the posterior probability from the Bayesian test (presumably accounting for estimation uncertainty) against the q-value from an ANOVA test. The q-value should be the False Discovery Rate, which controls for the proportion of false positives. This does not seem to be directly comparable to a posterior probability. The authors should clarify why a comparison of proportion to posterior probability is reasonable.

      3) The authors do not appear to have demonstrated how estimation uncertainty can improve on drug response curve fitting or biomarker discovery.

      For the former, the fitted curves using standard approaches appear similar to those fitted using GP regression, as the authors seemed to have focused on those curves where the two approaches are concordant and as the IC50 value differences appear minimal for those cases where IC50 is within the tested concentration range. The greatest differences are seen for those cases where IC50 values are outside the tested concentration ranges, but these cases were not in focus in the text. In addition, for these cases, it is unclear if relying on curve fits from GP regression makes sense because they are also the cases with the highest estimation uncertainty.

      For the latter, it appears that every significant biomarker identified using Bayesian posterior probability is also significant by ANOVA (using a standard q-value < 0.05 cutoff).
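
      To make the distinction in point 2 concrete, the sketch below computes q-values from a list of p-values. It assumes the Benjamini-Hochberg procedure, which is a common choice but is not confirmed here as the one used in the manuscript, and the p-values are hypothetical. A q-value bounds the expected fraction of false discoveries among all tests called significant at that threshold, whereas a posterior probability is a statement about a single hypothesis under a particular prior, so the two do not share a scale by construction.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg q-values (assumed procedure; illustrative only)."""
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p-values for four biomarker-drug association tests.
example_p = [0.01, 0.04, 0.03, 0.5]
example_q = bh_qvalues(example_p)
for p, q in zip(example_p, example_q):
    print(f"p = {p:.3f}  q = {q:.4f}")
```

      Note that a q-value depends on all the other tests in the list, while a posterior probability for one association does not.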

    2. Reviewer #1

      The authors propose two related (though distinct) methods for the improvement of pharmacological screening analysis and related biomarker analyses. The first is a Gaussian process (GP) approach to dose-response curve fitting for the estimation of IC50, AUC, and related quantities. The goal of this method is to improve point and uncertainty estimates of these quantities through more flexible functional specification and outlier-robust error modeling. The second method is a hierarchical Bayesian approach to biomarker association analysis. This incorporates uncertainty estimates produced by the GP modeling with the aim of providing more sensitive association analyses with fewer false positives.

      The combination of methods presented has some potential. Flexible modeling of dose-response relationships and better estimation of uncertainty are interesting axes to wring more information out of large-scale screening datasets. There are a few areas to shore up in the paper to increase confidence in the empirical results and generalizability of the methods.

      1) There are a number of fixed parameters in the proposed methods, and the calibration procedure used to set these is unclear to me. For the GP models, there are a set of noise parameters for Beta mixture and the length scales and variance parameter for the kernel. I'm not sure how one would generalize the GP methods to other screening datasets as a result of this ambiguity (e.g., how would one determine appropriate noise parameters?). For the hierarchical Bayesian biomarker association model, we have prior scale parameters related to both the effect size and variance parameters. The number of researcher degrees of freedom introduced by these tuned parameters also raises some concerns about the sensitivity of empirical results (e.g., 24 clinically established biomarkers and 6 novel) to these choices. It's not clear if we're seeing a corner case or a robust result. I think the work would benefit from both sensitivity analyses with respect to tuned parameters and guidance on or methods for their estimation. The latter is particularly important if other researchers hope to employ these methods in a different context.

      2) The proposed hierarchical Bayesian approach to biomarker association analysis is a reasonable start, but it was unclear to me whether changes in performance stemmed from correcting misspecification in original ANOVA or the use of uncertainty estimates. I suggest comparing results to a heteroskedasticity-robust estimator (e.g., HC3, see Long and Ervin, 2000), which would be valid under the stated model without the requirement for explicit uncertainty estimates or priors. The transformations and tuning applied to uncertainty estimates in this context also make generalization of the approach challenging. The need for the c (power) parameter suggests a potential misspecification or miscalibration at some point in the modeling chain. It would be useful to understand this misspecification better, particularly for researchers hoping to extend or reuse these methods.

      3) The GP method provides reasonable estimates of uncertainty, but it would be useful to see them compared to those from the sigmoid model (e.g., from the delta method). It wasn't clear to me how much of the difference in results is coming from incorporation of uncertainty estimates as opposed to changes in the point estimates.

      4) The handling of cases with IC50 beyond the maximum observed dose (extrapolating to 10x the maximum concentration) provided a reasonable starting point, but a few subtleties in the handling of corner cases remain unaddressed (e.g., GPs allow positive slope at right edge of range). It would be useful to provide a more general, systematic procedure to address these. Imposing monotonicity may not be the best path, but additional guidance for researchers applying these methods in other contexts would help.
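
      As an illustration of the heteroskedasticity-robust comparison suggested in point 2, the sketch below computes classical, HC0 and HC3 standard errors for the slope of a simple linear regression of a response on a binary biomarker. It is a minimal pure-Python sketch under stated assumptions: the data are hypothetical, a single covariate with an intercept is assumed, and every observation must have leverage below 1 (for a binary covariate, at least two samples per group). It is not the authors' model.

```python
from math import sqrt


def slope_with_se(x, y):
    """OLS slope of y on x (with intercept) plus classical, HC0 and HC3 SEs."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Leverage (hat value) of each observation in simple regression.
    lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]
    # Classical SE assumes one common error variance.
    classical = sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # Sandwich estimators: weight each squared residual by its own contribution.
    hc0 = sqrt(sum(((xi - xbar) / sxx) ** 2 * e * e for xi, e in zip(x, resid)))
    hc3 = sqrt(sum(((xi - xbar) / sxx) ** 2 * (e / (1 - h)) ** 2
                   for xi, e, h in zip(x, resid, lev)))
    return slope, classical, hc0, hc3


# Hypothetical data: biomarker status (0 = wild type, 1 = mutant) and a
# response summary such as log IC50, with larger spread in the mutant group.
status = [0, 0, 0, 0, 1, 1, 1, 1]
response = [1.0, 1.1, 0.9, 1.0, 2.0, 3.5, 0.5, 2.8]
slope, se_classical, se_hc0, se_hc3 = slope_with_se(status, response)
print(f"slope = {slope:.3f}")
print(f"classical SE = {se_classical:.3f}, HC0 SE = {se_hc0:.3f}, HC3 SE = {se_hc3:.3f}")
```

      Comparing association calls made with HC3 standard errors against those from the hierarchical model would help separate the effect of correcting the variance assumption from the effect of using per-curve uncertainty estimates.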

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary

      This manuscript presents two statistical approaches to evaluating drug effect measurements and associations between biomarkers, for dose curve data. Measurements of these kinds are made in many contexts, and frequently reported without accounting well for measurement uncertainties. A statistical framework of this kind will be widely useful and should be frequently applied.

    1. Reviewer #3

      This manuscript reports the first description of a eukaryotic-like Bin/Amphiphysin/RVS (BAR) domain protein in a bacterium (Shewanella oneidensis MR-1), BdpA, with conserved roles in membrane curvature control during outer membrane vesicle (OMV) formation. Consistent with this, a BdpA-defective mutant had defects in the size and shape of redox-active membrane vesicles and formed outer membrane extensions (OMEs) lacking the characteristic tubular structure. Heterologous expression of the BdpA proteins in model (Escherichia coli) and non-model (Marinobacter atlanticus) bacteria hosts promoted OME formation. The authors propose BdpA as a new subclass of prokaryotic BAR proteins with eukaryotic-like roles in membrane curvature modulation. This is an interesting finding that could be strengthened with topological studies of BdpA in OMV/OME and quantitative analyses to validate the many qualitative microscopic observations.

      Numbered summary:

      1) To my knowledge this is the first description of a BAR-domain protein in a prokaryotic organism. But the role of prokaryotic proteins with amphipathic α-helical domains in membrane binding/curvature is not new. A review by Dworkin (reference 1) describes some of these structural homologs and their role in membrane binding and curvature control via their amphipathic domains (e.g., Bacillus subtilis SpoVM, which controls forespore membrane curvature during sporulation using its helical domain). This information is important in the introduction and could help with the phylogenetic analyses (comment 4 below).

      2) I am having a hard time reconciling the presence of a galactose-binding domain in BdpA and LPS sugar binding. This would suggest that the proteins coat the OMV rather than interacting with the periplasmic side of the outer membrane to promote OMV formation and release (which I somehow assume based on the role of some eukaryotic BARs). The lack of topological studies makes these models highly speculative and weakens some of the conclusions. The paper would be strengthened with the addition of topological studies in OMVs and OMEs.

      3) Many experiments rely on microscopic observations of cells, OMVs and OMEs to support conclusions based on (at most) semiquantitative data. These experiments require validation with methods that quantitatively determine critical variables such as OMV size and size distribution. Also note the microscopic methods are poorly described or not described at all in the methods section. Thus, it is not clear how many cells they examined microscopically and how many biological replicates (cultures) they used. The variability associated with this type of microscopic assessment makes sample size (number of cells, typically in the hundreds) and replication in independent cultures critical.

      4) Many of the branch points in the phylogenetic tree (Fig. 5) have very low confidence values. The authors did not provide the alignments so I could not evaluate the accuracy of the approach to offer suggestions for improvement. The predictive value of the tree may improve by including prokaryotic amphipathic helical domains such as those from SpoVM, MinD and FtsA. These issues are not as concerning in the tree presented in Fig. S6 although I note that this tree is supposed to show the distribution of "BdpA orthologs in other prokaryotes" but most of the branches are for eukaryotic proteins. I also note that the Methods section describes important results about the homology (or lack of homology) between BdpA and other prokaryotic and eukaryotic proteins. This information is more appropriate in the Results section.

      References:

      1) Dworkin, J. Cellular polarity in prokaryotic organisms. Cold Spring Harbor perspectives in biology 1, a003368-a003368, doi:10.1101/cshperspect.a003368 (2009).

      2) Gorby, Y. et al. Redox-reactive membrane vesicles produced by Shewanella. Geobiology 6, 232-241, doi:10.1111/j.1472-4669.2008.00158.x (2008).

    2. Reviewer #2

      Some Gram-negative bacteria, such as Shewanella oneidensis, produce outer membrane extensions (OME) that mediate electron transfer to extracellular substrates. Many of the players involved in the transfer of electrons via these nanowires have been discovered but the mechanisms of outer membrane remodeling have remained mysterious. Here, Phillips, Zacharoff, and colleagues, identify BdpA as a protein that stabilizes OMEs in Shewanella oneidensis and perhaps displays outer membrane remodeling activity in other bacterial species. Given its homology to eukaryotic BAR-domain proteins, the authors suggest that BdpA and its homologs define the first prokaryotic family of BAR proteins or pBARs.

      This work tackles a number of significant questions that span broad areas of microbiology and cell biology. First, it explores a critical area of bacterial cell biology: how do Gram-negatives remodel their outer membranes? Second, it focuses on an underappreciated aspect of extracellular electron transfer, an activity widespread amongst bacteria with clear relevance to basic and applied fields. Finally, it provides a possible glimpse into the evolution of BAR-domain proteins which play diverse cellular roles in eukaryotes. Despite the substantial advances presented here, I have some concerns which, if addressed, can lead to more certain conclusions about the cellular role of BdpA.

      1) I liked the comparative proteomics approach as a tool to identify unique OME components. I was surprised that the two fractions differed so much in their protein composition. Based on the materials and methods the OM and OME fractions were isolated from cells grown under very different conditions. Could this account for the large differences between these two fractions? Looking at the list of proteins enriched in either fraction is there any indication of significant contamination from other cellular fractions? What controls were used to ensure that the purification procedure was working effectively?

      2) The authors conclude that the OM vesicles are conductive. However, some controls are needed since other cellular components (such as OM fractions containing Mtr proteins) may have contaminated the OME fraction. Is the OME fraction "enriched" for this activity compared to just the OM fraction?

      3) Is BdpA really a BAR-domain protein? The authors use computational tools (such as BLAST and homology modelling) to posit that BdpA is a BAR-domain protein. This hypothesis is strengthened by the phenotype of mutants missing bdpA. While OMEs are not absent, their architecture is visibly altered, which may point to some instability in the membrane extensions. Significantly, BdpA is sufficient to induce OME-like structures when expressed in planktonic Shewanella cells, a condition during which OMEs are not normally produced. However, as the authors state, BdpA barely meets the cutoff (as set by the program used) for a BAR-domain protein. Furthermore, some of its homologs that share high levels of sequence identity don't pass the bar set by these computational methods. Thus, we cannot say that BdpA is actually a BAR-domain protein. Its effects on membrane stability could be indirect or the result of binding to outer membrane features in a manner distinct from other BAR proteins. Therefore, some biochemical corroboration of its activity on membranes or structural data are needed to confirm its relationship to eukaryotic BAR domain proteins. On a minor note, I would prefer "bacterial" rather than "prokaryotic" since the BdpA BAR-like domain is not found in archaea. Also, other groups have proposed that bacterial proteins contain BAR domains (for instance, Tanaka et al in reference 28). How similar is BdpA to these proteins?

      4) Heterologous expression of BdpA in other bacteria provides one of the most compelling arguments for its central role in producing OMEs. However, the imaging data provided here (at least in my pdf) do not provide the clearest evidence for induction of OMEs in M. atlanticus and E. coli. This is especially the case with the E. coli images. The extended web of staining in 4c does not resemble the tubules seen in S. oneidensis. It would be great to have some electron microscopy data and/or higher resolution fluorescence images of these bacteria as corroborating evidence. Additionally, only a few cells are shown so some quantification of the proportion of cells with OMEs is needed.

      5) Other than the predicted signal peptide, does BdpA have any predicted features that indicate it is an outer membrane protein? The authors hypothesize that the putative Galactose-binding domain of BdpA mediates binding to LPS. However, it is also possible that it binds to peptidoglycan components. Therefore, independent data on localization of BdpA via microscopy or higher resolution biochemical fractionation would provide greater confidence that the protein is acting in the appropriate cellular location.

    3. Reviewer #1

      In the manuscript "A Prokaryotic Membrane Sculpting BAR Domain Protein" the authors describe the identification of the first bacterial membrane sculpting BAR domain protein, and the characterization of its function. In eukaryotes this protein is important for shaping membrane curvature. Here they identify a protein containing a BAR domain in the bacterium Shewanella oneidensis, which they name BdpA (BAR domain-like protein A). The authors show that BdpA is enriched in outer membrane vesicles (OMVs) and outer membrane extensions (OMEs), and regulates the size of OMVs and the shape of OMEs. They show this by characterizing and quantifying membrane vesicles and extensions, comparing WT with a BdpA mutant and the BdpA mutant with heterologous BdpA expression. They further show that heterologous expression of BdpA promotes OME formation in E. coli.

      In my opinion this paper provides solid support for the presence of these proteins in bacteria with an important function in membrane vesicles and membrane extensions.

      Minor Comments:

      1) In the introduction the authors summarize what is known about eukaryotic BAR proteins in terms of membrane localization and their role in membrane curvature and tubulation events. I think it is important to also provide a summary of what is known about the functional biological implications of these proteins in eukaryotes, namely whether the main function of BAR proteins in eukaryotes is always related to tubule formation or whether there are other functions attributed to these proteins.

      2) Contrast and resolution in Figure 3, panel a, is weak making it difficult to see tubules described by the authors.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary

      In this manuscript the authors propose the identification of a novel protein involved in outer membrane remodelling, named BdpA (BAR domain-like protein A). According to the proposed model BdpA has a conserved role in membrane curvature control during formation of outer membrane vesicles (OMVs) and of outer membrane extensions (OMEs) in Shewanella oneidensis. The authors also provide evidence that heterologous expression of BdpA promotes formation of OMEs in other bacteria (namely in E. coli), and that BdpA is sufficient to induce OME-like structures when expressed in conditions where OMEs are normally not formed. In eukaryotes proteins containing BAR domains are important for shaping membrane curvature. Given the homology of BdpA to eukaryotic BAR-domain proteins, the authors suggest that BdpA and its homologs define the first prokaryotic family of BAR proteins or pBARs, with eukaryotic-like roles in membrane curvature modulation.

      Overall, the reviewers think that this is a very interesting study, and provided that further support is obtained to substantiate the proposed model the reviewers agree that the findings described here tackle a number of significant questions of broad interest. However, the reviewers also think that the evidence provided in this manuscript still does not fully support the conclusion that the BdpA protein is involved in membrane curvature control in the way that eukaryotic proteins containing BAR domains are.

      We have compiled a list of comments that we hope will help the authors address the concerns of the reviewers to obtain stronger support for the function of BdpA.

      1) The reviewers are concerned that some of the conclusions are based on qualitative observations of microscopy analysis of OMVs and OMEs, and quantitative analyses are lacking to validate qualitative observations. As specified with examples in the list of minor points below, the reviewers propose that the data should be re-analyzed to obtain quantitative results. Specifically, a size distribution analysis could be applied to some microscopy data. Also note that the microscopy methods are poorly described, and as the calculation methods used are not fully available it is difficult to understand if the appropriate methods were used. Please specify how many cells were examined microscopically and how many biological replicates (cultures) were used in each experiment.

      2) Statistical analyses were not always the most accurate. In figure 2 an unpaired t-test was used for samples that have high variance; this approach may inflate the statistical difference between the strains. For figure 2 a histogram of size distribution analyses could be shown for each strain.

      3) The reviewers are concerned that the proteomic data is not clear enough to conclude that the BdpA protein is localized to or enriched in OMV/OME. Could the results be complemented with some other method to confirm BdpA localization? The reviewers are particularly concerned by the fact that a large number of proteins were identified in the OMV fraction. Could it be that some of the OMV/OME fractions were contaminated? What controls were used to ensure that the purification procedure was working effectively? Could the data be strengthened by some quality control analyses to determine how many of those proteins are actually predicted to localize to the outer membrane and periplasm? From the methods it seems that the culture conditions used to prepare the OM versus OMV were different, is this so? If yes, why were the culture conditions different? Could this affect protein expression? Please include the detailed growth conditions in the method section.

      4) The conclusion that BdpA is a BAR-domain protein is largely based on homology. The supplementary information file includes homology models that show striking similarity with eukaryotic BAR proteins. However, as the authors state, BdpA barely meets the cutoff for a BAR-domain protein. The results with the phenotype of the BdpA mutant, complementations and sufficiency data provide good support to the functional role of BdpA in membrane remodelling. However, the effect of BdpA on membrane stability could be indirect or the result of binding to outer membrane features in a manner distinct from other BAR proteins. Could these results be strengthened with some biochemical corroboration of its activity on membranes or structural data to confirm its relationship to eukaryotic BAR domain proteins?

      5) The reviewers propose that the paper would be strengthened with the addition of topological studies in OMVs and OMEs. The reviewers had problems in reconciling the presence of a galactose-binding domain in BdpA and LPS sugar binding. The authors hypothesize that the putative Galactose-binding domain of BdpA mediates binding to LPS. However, it is also possible that it binds to peptidoglycan components. This would suggest that the proteins interact with the periplasmic side of the outer membrane rather than coat the OMV to promote OMV formation and release (which one could assume based on the role of some eukaryotic BARs). The addition of topological studies (or some biochemical approach) could make these models less speculative, strengthening the conclusions.

      6) Heterologous expression of BdpA in other bacteria provides important compelling arguments for its central role in producing OMEs. However, the imaging data provided do not provide the clearest evidence for induction of OMEs in M. atlanticus and E. coli. This is especially the case with the E. coli images. The extended web of staining in 4c does not resemble the tubules seen in S. oneidensis. It would be great to have some electron microscopy data and/or higher resolution fluorescence images of these bacteria as corroborating evidence. Additionally, only a few cells are shown and quantification of the proportion of cells with OMEs is needed. Thus, as already discussed in point 1, quantitative analyses could improve this important point.
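
      The concern in point 2 can be illustrated with a minimal sketch. It assumes the test in question was the pooled-variance Student's t-test (the comment does not say which variant was used) and contrasts it with Welch's unequal-variance test. The function names and the sample values are hypothetical and are not taken from the manuscript.

```python
import math
import statistics


def student_t(a, b):
    """Pooled-variance (Student) t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical measurements for two strains with very different spread.
strain_a = [1, 2, 3, 4, 5]
strain_b = [10, 20, 30, 40, 50]
print(student_t(strain_a, strain_b))
print(welch_t(strain_a, strain_b))
```

      With equal group sizes the two statistics coincide, but Welch's test assigns fewer degrees of freedom when the variances differ, which makes the resulting p-value more conservative.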

    1. Reviewer #4

      This is an innovative and very interesting study reporting the correlation of extracted neural timescales and expression of NMDA and GABA_a receptor subunits amongst others.

      Comments:

      -definition of timescale is missing in the introduction. Fast and slow responding to sensory versus cue related information reflects a circular definition of timescales.

      -the results text says that the aperiodic component is interpreted as the timescale but not how the inference is made, i.e. what quantity is interpreted as the timescale.

      -it is difficult to keep track of which timescales are referred to when in the text, e.g. the authors start referring to neuronal timescales after having discussed ECOG based time scales and spike timescales. It seems important for cleanly separating the source of the timescale to denote them with a unique label depending on the source data that gives rise to them. Why not use a subscript for spike, epiduralECoG, subduralECoG, intracranialLFP, ... ?

      -the article seems to assume that mRNA expression for specific receptor subunits correspond to the density of expression of those receptors. It seems important that this is made explicit (if correct) and that a reference is given that shows this relationship.

      -line 142 refers to "task-free ECoG recordings in macaques" but does not clarify where the data comes from. No reference is provided.

    2. Reviewer #3

      In this paper entitled 'Neuronal timescales are functionally dynamic and shaped by cortical microstructure', Gao et al. use open access databases to address two distinct questions: 1) the relationship between hierarchically organized variations in neuronal timescales and brain gene expression and 2) the effect of task and age on the neuronal timescales of a given cortical region.

      Overall, this is a well-designed study and the combination of open access databases is well organized and astutely exploited. In particular, I very much like the analysis that tests whether variations in gene expression still account for variations in neuronal timescales when the main gradient effect is regressed out. Below are my comments on the manuscript.

      1) For the non-specialist reader, the concept of neuronal timescales that is central to the paper should be defined more explicitly in the introduction ('neuronal timescales' appear in paragraph 3, while it gets defined in paragraphs 1 and 2).

      2) In figure 2B, some T1w/T2w values are above values of 2, which is not standard. Likewise, several outliers can be observed. This might have impacted the estimation of the regression slope. This slope currently matches the one from Burt et al. 2018, although the data point distribution is different.

      3) Figure 4B is contradicting figure 2C as the evidenced timescale hierarchy is different (comparing PC, PFC and OFC). Please explain.

      4) Figure 4B and 4C, please show actual data points and justify parametric tests.

      5) Figure 4C: how consistent is the increase in delay period timescales across areas within each subject? In other words, is this a general property of the brain, with task-related effects resulting in a non-specific adjustment in neuronal timescales, or are there regional differences in the reported increase? (You might want to exclude the PFC from the analysis to remove task-related effects.)

      6) The manuscript addresses two distinct aspects of neuronal timescales: their relationship to local microarchitecture and their dynamics as a function of task or age. Although there is obviously a strong inter-relationship between these two aspects, this deserves a more extensive discussion. For example, in relation with the previous point, if local microstructural properties predict neuronal timescales, why is it that timescale changes during the delay seem to be ubiquitous (or are they)? And why should such changes (that are overall in the same range) correlate with subject performance in the PFC but not in the other areas? How does this relate to the aging observations? Although this discussion is bound to be speculative, I think it is important in order to strengthen the link between these two independent avenues of the paper, and to enrich the discussion about the functional role of these dynamic changes in neuronal timescales.

      7) Given the described age-related effect, did the authors check that the different databases they used sampled from subjects with the same age distribution?

      8) Legend of figure 1 is not self-explanatory and a lot of the symbols and information plotted in the figures are not explained. Unfortunately, this information is also missing from the result section.

      9) Figs 3E and 3F are mislabeled as 4E and 4F.

      10) Generally speaking, given that the main text itself is very dense, figure legends should be more self-explanatory. Quite often, figure detail description and contextual information are missing both from the text and the figures. This also applies to the supplementary figures.

    3. Reviewer #2

      Overall, this is an interesting manuscript and a well-done study. The main finding is that neural timescales, as quantified through the decay of the power spectrum, vary over cortical regions and are correlated with genes that regulate ionic and structural properties of neurons. The findings aren't terribly surprising and the computational impact on cognition and aging remains unclear (other than showing differences), but the overall approach is novel and interesting.

      I have an overarching concern, which is that the manuscript is written to be dense yet terse, which makes it harder to read, particularly given the complexity of the analyses. It feels like it was written for a journal with extreme word limitations. The manuscript would be overall improved if the authors would "loosen their belt" and explain the findings and methods in more detail.

      What are "these" limitations on line 96?

      Figure 1e: how is r2=1 when the dots do not fall on the line?

      I'm confused about the description of the methods on page 5. For example, "we can estimate neuronal timescale from the 'characteristic frequency'" which implies a peak in the spectrum. Yet in the next sentence they write that they extract timescale from aperiodic components.
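
      One way to reconcile the two statements, offered only as a sketch: a signal whose autocorrelation decays exponentially with time constant tau has a Lorentzian power spectrum, which is flat at low frequencies and falls off above a "knee" frequency f_k = 1/(2*pi*tau). The characteristic frequency is then a bend in the aperiodic component rather than an oscillatory peak. Whether this matches the manuscript's exact parameterization is an assumption, and the function names below are hypothetical.

```python
import math


def lorentzian_psd(freq_hz, tau_s, amplitude=1.0):
    """Power spectrum of a process with autocorrelation exp(-t / tau)."""
    return amplitude / (1.0 + (2.0 * math.pi * freq_hz * tau_s) ** 2)


def timescale_from_knee(knee_hz):
    """Convert a knee (characteristic) frequency in Hz to a timescale in s."""
    return 1.0 / (2.0 * math.pi * knee_hz)


tau = 0.02  # hypothetical 20 ms timescale
knee = 1.0 / (2.0 * math.pi * tau)  # frequency at which power drops to half
print(knee, lorentzian_psd(knee, tau), timescale_from_knee(knee))
```

      In this form the power at the knee is half the low-frequency plateau, and no spectral peak is required to read off a timescale.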

      Page 7: Are these markers also correlated with cell packing density? If so, it's possible that denser neural networks have longer timescales.

      Relatedly, how strongly inter-correlated are these genetic markers across the cortex? The authors mostly take a mass-univariate approach except for showing gene-PC1 in Figure 3a. There isn't enough information shown to evaluate whether the top PC is suitable, or whether this PC comprises many/all gene contributions or is driven by a small number, etc.

      I'm missing the modeling results. They appear as a schematic in figure 1 and are mentioned in the Methods section. Was this model actually used somewhere?

    4. Reviewer #1

      These findings are a significant advance in comparison to previous work like Murray et al. (2014) and Dotson & Gray (2018 - please cite here) in the sense that brain-wide hierarchy is considered, whereas previous work considered a smaller set of brain areas. Furthermore, several other interesting correlations are reported with timescales. Overall the analyses appear to be of very high quality, providing a standard for similar studies in the future, and the authors carefully considered problems that arise in correcting for dependent samples, which I applaud.

      Some of the claims need further discussion or refinement, in my opinion.

      1) The comparison shown in Figure 2 between spiking time-scale and ECOG time-scale might be problematic, in the sense that the spiking time-scales were taken from the Murray et al. (2014) paper where they were quantified with a different technique. My suggestion would be to quantify time-scales in the same manner as Murray, or maybe there is a convincing argument why this is not a problem.

      2) The correlations shown between transcriptomics and timescales need to be carefully considered. While the authors regress out T1w/T2w residuals, these might just be one structural factor that changes with cortical hierarchy and assumes that the underlying relationships are linear. Hence, it is possible that timescales and gene profiles are correlated with structure but that there is no causal relationship between these genes and timescales. In this sense, the correlation of genes with hierarchy might also yield similar genetic profiles. It would be important to show the correlation of hierarchy with genetic profiles, to see whether this looks different from the correlations that are obtained with timescale.

      3) The authors use T1w/T2w as the measure for cortical hierarchy. This is a gradient-based perspective on cortical hierarchy. However, there are other perspectives on hierarchy that are not gradient-based, but are based on anatomical connectivity, e.g. as pursued by Kennedy and Van Essen (Vezoli et al., 2020, bioRxiv). This needs to be discussed.

      4) The paper does not consider oscillations, which is fine, but the reader is left wondering how oscillations affect these time-scales. Discussion on this aspect would be useful.

      5) Are the rho correlation values corrected for the expected value of the surrogate distribution? That is, are they significantly overestimated due to the dependent samples issue? In this case I would recommend reporting the corrected correlation values, rather than the raw correlation values.

      6) The correlation performed in Figure 4D is a bit unclear to me. Are the different dots+lines participants, or is this a binned correlation? If it is a binned correlation, does that represent a problem for the correlation analysis?

      7) It would be useful in Figure 1/2 to show some examples of ECOG time-scales related to the actual underlying signals and PSDs, rather than just illustrating the technique on simulated data, so that the validity of the technique can be judged.

      8) In general it would be useful to report carefully the N's and the dataset that is used for each analysis, because it is easy to get lost in what is what as the authors analyze a huge number of datasets.

      9) The technique of removing spatial autocorrelations that influence the p-value appears to be sophisticated and well done. In case this analysis poses problems with other reviewers, I would recommend using a cross-validation prediction approach where a subset of subjects is used for training and the other subjects are used for testing.
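
      One possible reading of point 5 can be sketched as follows: given an observed correlation and a set of correlations computed on surrogate (null) maps, report the observed value in excess of the surrogate mean together with an empirical two-sided p-value. The function name and the numbers are hypothetical, and how the surrogates are generated (for example, preserving spatial autocorrelation) is outside this sketch.

```python
def surrogate_summary(rho_obs, rho_surrogates):
    """Return (rho in excess of the null mean, empirical two-sided p-value)."""
    n = len(rho_surrogates)
    null_mean = sum(rho_surrogates) / n
    excess = rho_obs - null_mean
    # Count surrogates at least as far from the null mean as the observation.
    n_extreme = sum(1 for r in rho_surrogates if abs(r - null_mean) >= abs(excess))
    # The +1 terms keep the p-value strictly positive for finite surrogate sets.
    p_value = (n_extreme + 1) / (n + 1)
    return excess, p_value


# Hypothetical surrogate correlations whose mean is biased away from zero.
surrogates = [0.1, 0.2, 0.3, 0.2, 0.1, 0.3, 0.2, 0.7, 0.2]
print(surrogate_summary(0.6, surrogates))
```

      If the surrogate mean is far from zero, the raw correlation overstates the effect, and the excess value is the more informative quantity to report.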

    5. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary

      Gao et al. analyze how brain-wide timescales of ECoG signals vary across the cortical hierarchy and relate these timescales to several other aspects of structure, behavior and function. They report the following main findings: 1) Timescales increase with the cortical hierarchy. 2) Time-scales, after regressing out the hierarchical T1w/T2w structure variable, correlate significantly with several genes related to synaptic receptors and ion channels. 3) Time-scales increase with working memory task vs. baseline, and predict working memory performance across subjects. 4) Time-scales decrease with aging, in a region-specific way. These findings are a significant advance in comparison to previous work by considering brain-wide hierarchy at a high spatial and temporal resolution and relating them to behaviour and genetics.

    1. Reviewer #3

      The work by Barros et al. looks at the role of the Ribosome Quality Control pathway (RQC) in regulating the expression of endogenous messages containing polybasic sequences. Using ribosome profiling and western blotting, the authors show that proteins containing various types of polybasic sequences are not targeted by the RQC. The authors argue that one of the few endogenous RQC substrates, RQC1, is not regulated via the canonical RQC pathway, but by an Ltn1p-dependent post-transcriptional mechanism.

      The question of whether there are endogenous RQC substrates has previously been explored. With the exception of the few identified substrates, such as RQC1 (Brandman et al, 2012) and SDD1 (Matsuo et al., 2020), these studies largely concluded the RQC has a minimal regulatory role for endogenous messages, and is most likely protecting cells from damage and environmental stressors. This idea is further supported by the observation that the RQC is non-essential under standard growth conditions, but becomes synthetic lethal with translation inhibitors (Kostova et al, 2017, Choe et al, 2016). The work by Barros et al. comes to the same conclusions, and therefore it is unclear how this work contributes to the already established role of the RQC.

      The authors also explore the regulation of RQC1 by the RQC and argue that this gene is regulated by Ltn1p in an RQC-independent way. However, mechanistic understanding of the proposed regulation is lacking, and the data are largely inconsistent with the previously published observations by Brandman et al, 2012.

      Major points:

      1) The authors use the dataset published by Pop et al., 2014 for their 27-29 nt no-drug ribosome profiling analysis. However, these no-drug samples have been reported to exhibit surprising heterogeneity, and similarities with CHX-pretreated samples (see Hussmann et al., 2015 for detailed analysis). It is unclear how this heterogeneity can affect the analysis in the current manuscript, and whether the authors were aware of these caveats. Have the authors used independent datasets to confirm their observations? Have they excluded replicates that show CHX-like characteristics, such as A-site occupancy bias similar to CHX-pretreated samples?

      2) It is not clear what the purpose of the analysis presented in Fig 2 is, and how it is different from the modeling in the Park and Subramaniam 2019 paper? Are the authors using these parameters (TE, Kozak score, etc.) to show adaptations that minimize ribosome collisions?

      3) Fig 3 - some of the selected examples (Dbp3, Yro2, Nop58) lack sufficient coverage in the region of interest highlighted in the right column for the short and/or long footprints. Since the data are insufficient to make conclusions about ribosome stalling and queuing, these examples should be excluded from the analysis.

      4) Fig 4:

      -Does ASC1 deletion cause frameshifting? Since the TAP-tag is C-terminal, it is possible that it is now out of frame, and therefore undetectable. Is it possible for the authors to introduce the tag on the N-terminus, and follow simultaneously the stalled nascent polypeptide (upon LTN1 deletion), and the full length protein?

      -Is the putative stalling site of Dbp3 too close to the start codon to cause collisions?

      -Can the authors include a positive control, such as TAP-tagged Sdd1 to make sure their assay works and their strains and KOs behave as expected?

      5) Fig 5:

      -What is causing the inconsistency with the Brandman et al., 2012 data about RQC-dependent regulation of RQC1? In the original paper, Rqc1p has an N-terminal FLAG tag, so the authors primarily follow the stalled nascent polypeptide, whereas the current study focuses on the full length protein. Can the authors compare the same construct (FLAG-tagged Rqc1p) in their strains, so it is an "apples to apples" comparison?

      -Fig 5c bottom panel - the read coverage is too sparse to make a conclusion. This analysis should be removed.

      -5 d, e. The comparison between the GFP-12R-RFP stalling reporter and RQC1-TAP is not fair. The GFP construct reports on the fate of the stalled nascent polypeptide, whereas the RQC1-TAP looks at the full-length protein, and remains blind to the putative stalling product. Can the authors change the location of the tag, and repeat the experiment now looking at the stalled nascent polypeptide for RQC1? In addition, the signal in Fig. 5e looks saturated. Is it possible that no effect is observed simply because the TAP signal is out of the dynamic range for the assay?

      Minor Comments:

      1) The introduction presents an overly simplistic view of ribosome stalling, arguing that stalling can be caused by polybasic stretches. We now know that stalling is much more complex, and there are many other factors, including the presence of non-optimal codon pairs, that cause ribosome collisions. Although the authors discuss these factors in their discussion, they should also be emphasized in the introductory paragraph.

    2. Reviewer #2

      In this manuscript, Barros et al. examine published ribosome profiling data in an effort to identify possible targets for ribosome-quality-control (RQC) process in yeast. They found that although many of the obvious mRNA features, such as polybasic sequences, appear to stall the ribosome, they in fact are not targets of RQC. The authors then went on to confirm these observations by western-blot analysis of a few candidate genes and observe that deletion of the RQC factors Ltn1 and Asc1 has no effect on the levels of the full-length protein products. The authors conclude that RQC has little to no endogenous targets in yeast. While I have no doubt about the authors' conclusions and most of their analyses, I have major issues with the originality of the manuscript.

      1) The argument that RQC has little to no endogenous targets is not new. Many groups, including the authors' own, have made the same argument before. The authors recently published a paper in the Biochemical Journal, "Influence of nascent polypeptide positive charges on translation dynamics". In particular, the analysis in that paper appears similar to the one carried out here. Furthermore, the Guydosh group made similar arguments in their recent paper (Meyden and Guydosh, Mol Cell).

      2) The authors conclude their abstract by stating that "our results suggest that RQC should not be regarded as a general regulatory pathway for gene expression". To the best of my knowledge, RQC has not been regarded as such and instead the consensus has been that the process is a quality control one (as the name suggests).

      3) The authors use LTN1 and ASC1 deletions to determine whether certain sequences are RQC targets or not. But for the ltn1D, instead of looking at the stabilized shorter products, the authors only looked at the full-length one. Ltn1 has no effect on readthrough on stalling sequences. A better deletion should have been that of HEL2.

    3. Reviewer #1

      In this manuscript the authors use existing high throughput data sets and perform some new experiments to explore in yeast potential physiological substrates of RQC. In a first step, they use bioinformatics to identify genes with features previously implicated in RQC (usually with reporter assays) including inhibitory codon pairs, poly-basic stretches, and poly-A tracts. With these genes in hand, they characterized various features of "translatability", using existing ribosome profiling data sets, and concluded that, with the exception of the ICPs, there were no strong signatures indicative of reduced ribosome density that might have evolved to deal with problematic ribosome queueing. The authors then looked at the RP data at higher resolution, looking for characteristic patterns of RPF distribution around the pausing site, and found that the striking patterns seen previously for Sdd1 (and for reporter analysis in D'Orazio et al. eLife) were not recapitulated for any of the top candidates in their list. In a final set of experiments, the authors took advantage of TAP-tagged variants of their proteins of interest and asked whether deletion of Asc1 or Ltn1 impacted protein levels - and found that there were no discernible effects (though validation with TAP-tagged Sdd1 is an important missing control). Importantly, expression of full length Rqc1 (previously argued to be a direct target of the RQC) was unaffected by RQC components including Asc1, Hel2 and Rqc2, but was strongly impacted by Ltn1. These data together argue for an RQC-independent role for Ltn1 in regulating Rqc1 expression.

      Overall, the manuscript was thought provoking for consideration of what might be natural targets of RQC, and in the end, one would conclude that natural targets of RQC are not encoded in the genome, but may instead be predominantly either prematurely polyadenylated mRNA substrates that escape nuclear QC, or instead, ubiquitous damaged mRNAs in the cell. In general, the discussion of the analysis of RP data indicated naivete about the identity of different RPF sizes and their relevance to mechanism (this could be corrected easily in a revised version). In the end, this manuscript brings important questions to the table, and provides some reasonable evidence to suggest that natural poly-basic stretches, including the one found in Rqc1, are not targets for the RQC under normal conditions. Moreover, the data support a non-canonical role for Ltn1 in regulating expression of Rqc1 which needs to be more fully explored. Importantly, however, what is critical to support the negative results surrounding Rqc1 is a demonstration of a role for RQC for Sdd1, around which the narrative is constructed (this gene exhibits characteristics by RP of being a target and is reported previously to be impacted by the relevant genes Asc1 etc.).

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 4 of the manuscript.

      Summary:

      There were substantial concerns about the novelty of the study, the choice of RP libraries, coverage depth, and analysis of the ribosome profiling data. Previous studies have argued that there are very few endogenous targets (so far Rqc1 and Sdd1) of the RQC pathway, and that this is rather a QC pathway for damaged mRNAs. While we appreciate that your studies were inconsistent with these earlier studies, it will be critical for you to replicate those experiments, using protein tags that allow you to follow the fate of both the full length and truncated species. Additionally, it will be important to validate using your own approaches and reagents that Sdd1 is indeed a substrate for RQC, given that your data suggest that Rqc1 itself is not. Finally, the novel Ltn1-dependent, RQC-independent pathway proposed to regulate Rqc1 expression requires further mechanistic work.

    1. Reviewer #3

      Three distinct amyloid-based cell-death pathways in fungi have been reported. The authors of the current manuscript extend their previous work on the HELLP/SBP/PNT1 pathway in Chaetomium globosum and describe a similar system in P. anserina. It is shown that the amyloid signaling domain of PNT1 can form a prion in cells deleted of HELLP, which is otherwise activated by the prion to cause cell death. Using this artificial system, the authors test whether the related RHIM motif of the human RIP1 and RIP3 proteins can also form a prion in P. anserina and whether RHIM amyloids as well as other fungal amyloid-forming motifs can cross-seed PNT1.

      The experiments are well executed and explained but I have a few suggestions:

      1) Amyloid cross seeding is usually assayed in vitro using purified protein fragments. The artificial genetic system used here is certainly clever but the expression level of different proteins needs to be measured for better comparison of cross-seeding efficiencies.

      2) Page 16, line 333-334 and Fig 8: How were recipient strains sampled? How random was it? How many samples?

      3) Jargon/abbreviations. Page 19, line 405; Page 20, line 429: What are PAMPs, MAMPs, and PCD?

    2. Reviewer #2

      This work reports the discovery of an amyloid-based cell death signaling pathway in the filamentous fungus, Podospora anserina. This makes the third such pathway in this fungus. As for the others, the amyloid in this case has prion-like activity, is selectively nucleated by a cognate innate immunity sensor protein, and results in activation of the membrane-disrupting activity of the protein. They show that all three pathways operate orthogonally - that is without cross-seeding. In contrast, cross-seeding did occur between this pathway and the putatively homologous human necroptosis pathway when it is reconstituted in P. anserina, which further supports an evolutionary relationship between them.

      Substantive concerns:

      1) The novelty of this finding is somewhat dampened by this group's prior demonstration of several of the major points of interest in previous papers. They had previously discovered and characterized the homologous pathway in a different fungus, and suggested an evolutionary link between fungal amyloid signalosomes and mammalian necroptosis using strong bioinformatic and structural evidence. In addition, they had shown that the two previously known amyloid signaling pathways in P. anserina operated orthogonally. Hence the major point of novelty, as reflected in the title, is the demonstration that this particular amyloid pathway can cross-seed the human necroptosis amyloids.

      2) Implications of "cross-seeding". The interspecific cross-seeding observed was modest; much lower than that for intraspecific templating between proteins of the same pathway. Specifically, it failed to induce a barrage, the puncta formed at different times, and colocalization was incomplete. More importantly, cross-seeding does not imply functional or evolutionary conservation. Consider the wide range of amyloid proteins that have been reported to cross-seed each other despite in some cases very different sequences, structures, and functions - for example the type-II diabetes peptide IAPP with the Alzheimer's peptide Aβ; the yeast prion protein Rnq1 with human Huntingtin; and the yeast prion Sup35 with human transthyretin. Although a direct comparison with the present data is not possible, these cross-seeding interactions appear comparably robust. The present demonstration of limited cross-seeding therefore seems not to add much additional support for an evolutionary relationship between necroptosis and fungal amyloid cell-death pathways.

      3) Rigor of the fusion experiments. In all cases, despite having generated and validated the use of RFP- and GFP-labeled proteins, all fusion experiments to examine cell death microscopically (using Evans Blue staining) were between two GFP-expressing strains. This is frustrating because it makes it impossible to know from the images alone which of the two proteins is expressed in which cells, and in which cases of mycelia crossing paths is fusion occurring. I must therefore rely entirely on the labels provided, but they sometimes appear implausible. For example, the lower fusion event demarcated in Fig. 3C left panel would have been expected to allow GFP levels to equilibrate across the point of contact; instead there remains a sharp transition in GFP intensity between the two mycelia (third panel) indicating the cytoplasm is not being shared at the time of the image. In Fig. S8 top row, there is no apparent relationship between cell death and HELLP-GFP; moreover, cell death is seen occurring in mycelia containing either punctate or diffuse GFP-RIP3. While I appreciate that Evans Blue fluorescence may overlap with that of RFP (which should be stated) and preclude its visualization without multispectral imaging capabilities that may not be available to the authors, alternative viability stains and fluorescent proteins could in principle have been used to avoid this problem.

      Minor Comments:

      1) The significance of these proteins forming "prions", as opposed to (merely) amyloids, should be articulated. This is important because prion-formation per se is irrelevant to the cell-level functions of the proteins, as nucleation of the amyloid state causes cell death and hence precludes their persistent/heritable propagation. Amyloid by nature is self-perpetuating at the molecular level and hence would seem to explain the properties of the protein. The discussion about possible exaptation of these pathways for allorecognition could be expanded or clarified in this regard.

      2) Colocalization between two proteins does not imply that one has templated the other to form amyloid, even when both are capable of forming amyloid independently (see https://doi.org/10.1073/pnas.0611158104 ).

      3) Statements of partial cross-seeding are supported by quantitation (Fig. 8). In contrast, the authors appear to use qualitative observations to support rather definitive statements about the "total absence of" (line 344) cross-seeding between other pathways.

      4) Fig. S9. "Note that induction of [Rhim] in transformants leads to growth alteration to varying extent ranging from sublethal phenotype to more or less stunted growth." Can the authors suggest an explanation for this heterogeneity? From my limited perspective, it suggests the existence of amyloid polymorphisms (i.e. a prion strain phenomenon), which is quite unexpected given the lack of polymorphism among known functional amyloids in contrast to rampant polymorphism among pathological amyloids. Hence the phenomenon could be interpreted as suggesting that amyloid is not an evolved/functional state for the PP motif. In any case the phenomenon is interesting and merits further discussion.

    3. Reviewer #1

      Bardin and colleagues identify and characterize a third prion system in P. anserina based on the PNT1/HELLP NLR-based signalosome based on the amyloid signaling motif PP from Chaetomium globosum. The C-terminal domain of HELLP is shown to exist in either soluble or aggregated states based on fluorescence microscopy of tagged protein in vivo, termed the [pi] state, and to form amyloid in vitro. These distinct states can be propagated independently and induce conversion of full-length HELLP upon cytoplasmic mixing, which leads to cell death. The PNT1 N-terminal domain also forms foci in vivo and can seed conversion of HELLP, also leading to cell death. The C-terminal domain of C. globosum HELLP and the RHIM regions of mammalian RIP1 and RIP3, which both contain PP motifs, can cross-seed HELLP conversion to the aggregated state but the other known P. anserina prions [Het-s] and [phi] are unable to do so.

      Support for the model proposed is generally qualitative in nature, with multiple instances of data described but not presented, including the timing of conversion to the aggregated state, revision of the aggregated state in meiotic progeny, the frequencies of conversion and co-localization, and the correlations between growth and prion phenotype. For the data presented, replicates, frequency of observations, and variability are not reported. In addition, a mechanism is proposed to explain the toxicity associated with HELLP conversion to the aggregated state - membrane localization - but this model is not supported by robust data such as a marker for the membrane in the fluorescence images or a biochemical fractionation. Moreover, the absence of functional data, such as mutations that disrupt amyloid formation, leaves the model with only correlative observations to support it. Finally, prior observations on the C. globosum system decrease the novelty of the observations.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary

      Bardin and colleagues identify and characterize a third prion system in P. anserina based on a cognate innate immunity signalosome comprised of PNT1/HELLP. The authors demonstrate that the three prion pathways operate orthogonally without cross-seeding; however, the newly identified PNT1/HELLP prion can be cross-seeded by the putatively homologous human necroptosis pathway when it is reconstituted in P. anserina, which further supports an evolutionary relationship between them. The review has identified substantive concerns, which limit the novelty of the work and would require significant new studies to address the mechanistic gaps. These concerns include prior work revealing several major tenets, including prion activity for PNT1/HELLP in C. globosum and evolutionary conservation to the mammalian necroptosis pathway, and the absence of robust experimental support for cross-seeding, or the absence thereof, for membrane disruption as the cause of incompatibility, and for the relationship among toxicity, growth, protein state, and protein interaction. Concerns were also raised about the data presented, or absent, in terms of replicates, frequency of observations, and variability.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Point-by-point response to reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors constructed a virtually complete fitness landscape of the P1 extension region (4-base-paired helix) in the group I intron from Tetrahymena thermophila, using a kanamycin resistance reporter to evaluate the fold-change in fitness, which is related to self-splicing activity. This was a clever choice of system because it was known from earlier work that the P1 extension adopts two different conformations during self-splicing. The fitness of each variant was determined from the number of reads acquired from the sequencing data sets and analyzed through an extensive computational pipeline. The strength of the paper is that this machine learning approach can be used to calculate how individual variants contribute to the fitness landscape and assess the directions of epistasis across a large number of identified genotypes.

      We thank the reviewer for highlighting one of the key strengths of our manuscript, the fact that our analytical approach, using SHAP values, enables contributions of individual variants to be assessed in a genotype-specific manner. This approach provides for a sound, robust, and principled way of describing and understanding the fitness impact of one mutation in the context of (potentially many) others.

      The authors argue that machine learning more successfully models subtle effects that arise from interactions between RNA residues, and that the power to analyze deep mutational sequencing experiments can better rationalize fitness constraints arising from multiple conformational states.

      We do indeed argue that machine learning is likely to play an increasing role in making sense of deep mutational scanning data. These scans provide high-resolution information on how fitness maps onto genotype, but the molecular underpinnings of this relationship often remain obscure. It is these “hidden” underpinnings, including the effects of specific mutations on RNA/protein folding, structures, and dynamics, that machine learning approaches can help elucidate.

      The results are mostly consistent with previous studies even though the authors collected the data in a more advanced and complicated way. They are also able to rationalize complex phenotypes - for example, the observed fitness defects are more prevalent under an unfavorable growth condition (30ºC), because the lower temperature hinders conformational exchange. Although such cold sensitive effects are well known in RNA, it is gratifying that this can be captured in the fitness landscape.

      Finding temperature-related fitness effects that are consistent with impaired conformational exchange was also gratifying for us and we thank the reviewer for highlighting this finding.

      The results would be more convincing if the authors directly measure the self-splicing activity of a few key variants, such as the C2C21 mutant, to determine whether these mutations alter the self-splicing mechanism of the Tte-119(C20A) master sequence in the way that they infer from their model. In interpreting their results, they may want to consider misfolding of the intron core (coupled to base pairing of P1) and reverse self-splicing. Reversibility in the hairpin ribozyme, for example, turned out to be the key for understanding the effects of certain mutations.

      We appreciate that measurements of splicing activity for individual genotypes would complement and further strengthen our study. We will therefore aim to construct strains for a few key genotypes and assay self-splicing activity using RT-qPCR – an approach we previously used successfully to monitor splicing kinetics of self-splicing introns in yeast mitochondria (see Rudan et al. 2018 eLife 7:e35330). Specifically, we will quantify the fraction of spliced and unspliced transcripts using primers that span the exon-exon and the 3’ exon-intron junction, respectively (the 5’ intron-exon junction is genotypically diverse and would require genotype-specific primers). This will be done under non-selective (-kan) conditions, where the relative fraction of spliced and unspliced transcripts is a function of intrinsic splicing ability and not confounded by selection. We aim to include the master sequence, C2C21, G3C20 and its mirror genotype C3G20, U3 (which restores perfect complementarity in the master sequence), and G5 (inferred from the high-throughput experiment to make a strong negative contribution to fitness).
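As an illustration of the quantification described above, here is a minimal sketch of how a spliced fraction might be derived from two RT-qPCR threshold cycles. It assumes both primer pairs amplify with the same, perfect efficiency (a doubling per cycle), which would need to be verified in practice; the function name and Ct values are hypothetical and not taken from the manuscript.

```python
def spliced_fraction(ct_spliced, ct_unspliced, efficiency=2.0):
    """Estimate the fraction of spliced transcripts from two Ct values.

    Assumes (hypothetically) that both amplicons amplify with the same
    per-cycle efficiency, so relative abundance scales as efficiency ** -Ct.
    """
    spliced = efficiency ** (-ct_spliced)
    unspliced = efficiency ** (-ct_unspliced)
    return spliced / (spliced + unspliced)


# Hypothetical example: the exon-exon amplicon crosses threshold one cycle
# earlier than the 3' exon-intron amplicon.
fraction = spliced_fraction(ct_spliced=20.0, ct_unspliced=21.0)
```

Because only the difference between the two Ct values enters the ratio, the estimate is insensitive to total RNA input but sensitive to any difference in primer efficiency.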

      In interpreting our results, we will consider different mechanisms of splicing failure, such as kinetic problems (slow dissociation of P1ex), misfolding of the intron core, reverse self-splicing, and the use of cryptic splice sites, which has previously been documented (see e.g. Woodson & Cech 1991 Biochemistry 30:2042-2050). We note, however, that a precise mechanistic dissection of the splicing defects of individual variants is not the purpose of this manuscript and we therefore do not aim to establish genotype-specific defects in great molecular detail.

      Related to the point above, interesting conclusions regarding the relationships between base identity and epistasis that arise from metastability should be strengthened with additional examples. For example, the authors can explain why a reverse base-pairing variant (C3G20) exhibits negative epistasis but is not similar to that of the G3C20 construct. This would ideally use the data from the screen but also be validated by checking the self-splicing activity of a few individuals at low and high temperature.

      In measuring splicing activity and its link to fitness for a subset of key variants (see point #4), we will include at least one mirror example such as C3G20/G3C20. In addition, we will highlight additional examples of this mirror asymmetry based on the results from our high-throughput screen.

      They should validate the screen by showing that kanamycin resistance does indeed correlate strictly with self-splicing activity, and not some other feature such as RNA turnover. (It would also not be a bad idea to check this in the cell, which can be done by primer extension or Northern blotting.)

      This question (i.e. whether altered RNA stability rather than splicing efficiency explains differential KNT production and ultimately fitness) has previously been addressed by Guo & Cech (2002) when introducing the knt+intron reporter system. These authors found no difference in mRNA stability in constructs that displayed differential kanamycin resistance. To shore up this conclusion further, we will measure fitness (via colony counts, growth rate or more directly through competitive fitness assays) of the key variants for which we determine splicing activity (see point #4) and then correlate splicing and fitness.

      The benefit of the machine learning model is that it can extract signals that may be hard to detect otherwise. The downside is that it doesn't produce a physical model, as far as I am aware. The parameters are themselves not meaningful - except to the degree that trends in the fitness estimates can be explained after the fact. This is something that should ideally be explained more directly in the manuscript.

      The reviewer raises an interesting point, that indeed deserves further discussion/explanation. The reviewer is right that, at first sight, high-resolution fitness landscapes like ours do not directly produce a physical (structural) model of the molecule under investigation. They connect genotype and fitness, but the molecular intermediate – a biophysical structure – is not explicit. However, over the last few years, it has become apparent that deep mutational scanning experiments can – both in principle and in practice – yield information that can be leveraged to infer such a physical model. In short, covariation in fitness between residues in a protein or bases in an RNA can be used as inputs for constraint-based modelling of physical interactions. Notably, Schmiedel & Lehner (2019, Nature Genetics 51: 1177-1186) recently demonstrated that deep mutational scanning data can be used in this manner to reconstruct secondary and tertiary protein structure with high accuracy. In principle, the same approach can be used to reconstruct RNA structures. This will require more extensive, molecule-wide fitness data, but our study points towards just this future, even for data collected from structural ensembles.

      When we stated in the original manuscript that deconvolution of the fitness landscape might help to reverse engineer structures, this ability to interpolate between genotype and fitness to reveal hidden biophysical/structural relationships is what we refer to. We will revise the manuscript to make this connection more explicit.

      The authors claim that by evaluating a large number of sequences at two conditions, they can capture variants with intermediate phenotypes (Fig. 1). This is not necessarily true. If the original screen allows only the most active variants to survive on kan+ medium, then the signature of intermediate phenotypes may not be encoded in the original data, and thus not retrievable even with sophisticated algorithms, which may also be prone to overfitting. At what limit of stringency will the screen fail to yield information about intermediate fitness? How deeply must one sequence to recover this information, especially if noisy or degraded? Some discussion of these effects would be helpful.

      The capacity of any high-throughput sequencing-based DMS experiment to resolve intermediate phenotypes does indeed depend on a number of things. The reviewer highlights two of these: First, in screens where the phenotype is not binary (dead/alive) but fitness can be measured on a continuous scale, can we – and do we – capture phenotypes with intermediate fitness? What if only the fittest/most active variants survive? This is, ultimately, an empirical question, and one we can answer quite definitively: we do observe a large range of intermediate phenotypes, which – in our study – correspond to intermediate fold-change values. For each genotype, we can provide confidence limits and assess statistical significance. Table S1 provides this information. Our capacity to resolve these intermediate phenotypes is mainly based on three things. One is adequate sequencing depth, as highlighted by the reviewer. The second is the number of biological replicates (N=6) we analyse, which allows us to differentiate biological variability from noise for a large number of genotypes. This is an important aspect of DMS experiments that has often been overlooked (i.e. there are many other studies where only a single replicate is analysed and biological heterogeneity is not taken into account). With six replicates in hand, we can directly estimate variability (as done e.g. in our DESeq2 analysis) and quantify uncertainty so as to guard against overfitting. In our view, this is arguably more important than sequencing depth in deriving appropriate fitness estimates. Finally, we can resolve intermediate phenotypes because we keep the time lag between initial exposure to kanamycin and assaying genotype frequencies relatively short (overnight growth, see Methods). Our experiment is effectively a multi-genotype competition experiment, and we provide a snapshot across the genotype pool at a given time. 
If we had measured after several days of culture, genotypes with greater relative fitness would have spread further through the population, at the cost of less fit genotypes, many of which would likely have been eliminated. We kept measurement lag relatively short on purpose so that we could see a clear differential response to kanamycin while still being able to catch more than just a handful of the very fittest genotypes.
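The role of replicates in separating biological variability from noise can be sketched as follows. This is a deliberately simplified stand-in for the DESeq2 analysis mentioned above, not a reproduction of it: it computes a per-replicate log2 fold-change in genotype frequency between selected (+kan) and unselected (-kan) pools and summarises it with a mean and standard error. All names, the pseudocount, and the counts are hypothetical.

```python
import math
import statistics


def log2_fold_changes(sel_counts, unsel_counts, sel_totals, unsel_totals,
                      pseudocount=0.5):
    """Per-replicate log2 fold-change in frequency for one genotype.

    Each argument is a list with one entry per biological replicate.
    The pseudocount (an assumption) avoids division by zero for
    genotypes that drop out of a pool.
    """
    lfcs = []
    for s, u, ts, tu in zip(sel_counts, unsel_counts, sel_totals, unsel_totals):
        freq_sel = (s + pseudocount) / ts
        freq_unsel = (u + pseudocount) / tu
        lfcs.append(math.log2(freq_sel / freq_unsel))
    return lfcs


def summarise(lfcs):
    """Return the mean and the standard error of the mean across replicates."""
    mean = statistics.mean(lfcs)
    sem = statistics.stdev(lfcs) / math.sqrt(len(lfcs))
    return mean, sem


# Hypothetical read counts for one genotype in two replicates.
lfcs = log2_fold_changes([200, 400], [100, 100], [1000, 1000], [1000, 1000],
                         pseudocount=0.0)
mean_lfc, sem_lfc = summarise(lfcs)
```

With a single replicate the standard error is undefined, which is the practical reason a one-replicate screen cannot distinguish an intermediate phenotype from noise.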

      In light of the above, it will be apparent that there are no simple answers to the reviewer’s questions about required sequencing depth, levels of stringency, etc. The ability to assign differential fitness across a large population of genotypes hinges on multiple interrelated considerations (sequencing depth, complexity of the final & starting pool, number of replicates). In revising the manuscript, we will highlight some of the key considerations just discussed, bearing in mind that the manuscript cannot possibly discuss all possible pitfalls and requirements of deep mutational scanning experiments in great detail.

      Lastly, the evolvability of RNA is fascinating and there is much to learn. However, the authors don't discuss the implications of their findings for molecular evolution although they throw the term around. It would be exciting if there is a trend in the fitness landscape that could help explain the trajectory of RNA evolution in nature.

      We agree with the reviewer that it would be exciting to link deep mutational scanning results more closely with observable patterns of RNA evolution. This is true both in relation to evolution of P1ex/group I introns specifically and evolution of dynamic RNA structures more generally. Regarding the latter, we note that selection against excess stability has previously been inferred for 5’ UTRs (see e.g. Gu et al. 2010 PLoS Comp Biol 6: e1000664), although our case is slightly different in that a helix still needs to form but be sufficiently unstable to enable swift dissociation. We also note that riboswitches might make for an excellent subject to study asymmetric constraint and selection against excess stability as they involve formation of competing helices (including participation of some but not all nucleotides in more than one helix), their structure/function is well understood, and many examples are known, providing opportunities for evolutionary analysis. We consider this outside the scope of the current study. We will, however, seek to analyse patterns of evolution in P1ex to establish whether they correspond in a meaningful way to the fitness trends we observe in the laboratory. To do so, we will analyse the distribution and evolutionary history of variants across orthologous introns in different Tetrahymena species/strains, with a focus on P1ex, P10 and the surrounding sequence. Fortunately for us, the 23S ribosomal RNA gene in which the intron is embedded has been used as a phylogenetic marker so that intron/exon sequence information is available for a reasonable number of species/strains (see Doerder 2018 J Eukaryot Microbiol 66:182-208). We will generate an alignment of these sequences and ask, for example, whether N2-N5 are subject to different constraints than N18-N21 mirroring our experimental findings. 
We have previously successfully quantified patterns of variation surrounding self-splicing introns in yeast mitochondria (Repar & Warnecke 2017 Genetics 205:1641-1648). Note here that extending this analysis beyond Tetrahymena is problematic. Specifically, the intron is absent from close relatives of Tetrahymena (Doerder 2018 J Eukaryot Microbiol 66:182-208) and P1-proximal structures of distant relatives are quite variable. In addition, we are looking at intronic regions that are not only adjacent to but also directly interact with exonic sequence. The exonic context in which the intron is embedded therefore matters but will be quite different for more distant group I introns. We therefore think that aligning and comparing distant orthologs has limited merit.

      The authors use the abbreviation DMS for deep mutational scanning; the RNA structure field uses the reagent dimethylsulfate that is also abbreviated DMS. They may want to choose a different acronym or just avoid an acronym altogether.

      We appreciate this point about false-friend acronyms. We will either find a different acronym or avoid it altogether.

      Reviewer #1 (Significance (Required)):

      As the importance of RNA structure for gene expression becomes more widely appreciated, interest in understanding the evolution of RNA structures is also increasing. Compared with the molecular evolution of proteins, evolution and fitness in RNA is far less understood, although the authors appropriately point to a number of recent studies on this topic. The main advance here is to use machine learning methods to analyze the results of a large genotypic screen, with the goal of more accurately capturing the fitness effects of sequences at varied distances from the parental sequence. The specific conclusions reached here such as the importance of metastability or the prominence of cold sensitive effects are not revolutionary, but the authors illustrate how such phenomena can be investigated more systematically and in more depth.

      We thank the reviewer for highlighting that our analytical approach showcases how deep mutational scanning data can be analysed in an unbiased and systematic manner to better understand the relationship between genotype, molecular phenotype (e.g. structure), and fitness. The reviewer also rightly points to specific results we obtain regarding temperature-related effects and metastability of P1ex/P10. However, we believe that the most important contribution of this work is a more general one, namely our proof-of-principle demonstration that deep mutational scanning data can capture multiple conformational states simultaneously, and that these states can be deconvoluted from a single fitness landscape to attribute the fitness impact of individual mutations to specific RNA conformations. To our knowledge this had not been explicitly demonstrated before and our work provides an important cornerstone for future studies looking to interpret mutational effects in either RNAs or proteins in the light of dynamic structures.

      In light of comments by reviewer #2 below, it is worth reiterating the proof-of-principle nature of this study. Many of the specific results we obtain (e.g. importance of avoiding excess stability in P1ex) are not revolutionary. Indeed, we would be worried if they were. We chose to investigate P1ex because substantial prior work exists that has furnished us with solid positive controls. This independent prior validation allows us to both have great confidence in the data we generate and demonstrate cogently that the two conformational states at the beginning and end of the splicing reaction are captured in the data.

      Finally, we believe our work, in covering a virtually complete genotype space, using multiple replicates to quantify uncertainty in fitness estimates, and using SHAP scores to interpret variant effects in genotype-specific context, sets a new high bar for this type of study and will provide valuable reference data and analytical recipes for future analyses.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Soo et al probes the effect of mutations on the fitness of the Tetrahymena Group I self-splicing intron. They used high-throughput sequencing to simultaneously identify the effect of every possible sequence in a 4-bp helix. The approach is sound and the conclusions are generally supported. However, the analysis seems overly complicated given the dataset. Both the analysis and the accompanying writing make it difficult to understand what seems to be a fairly clear conclusion - that the relative stabilities of two alternative RNA helices are important for splicing.

      We thank the reviewer for testifying to the validity of our approach and the soundness of our conclusions. Regarding the complexity of the analysis, the reviewer is right in that – for the conclusion that the relative stabilities of two alternative helices are important for fitness – a simpler analysis would have sufficed. However, as elaborated in response to point #11 above, our objective here is not merely to draw specific conclusions about the relative stabilities of P1ex and P10, but more general: a) to demonstrate that a single fitness landscape can be deconvoluted to implicate multiple conformations in fitness defects and b) to provide a basic but powerful recipe for doing so in an unbiased, systematic manner using machine learning.

      We will strive to make the writing clearer so that readers can follow this reasoning and appreciate our analytical choices.

      **Major Comments**

      The authors state that this method can identify the impact of transient conformational states. However, the two conformational states in this study are not transient - in fact they are associated with two distinct chemical steps of splicing and are quite stable. It may be that the effect of important transient states would be observed, but this study does not demonstrate that.

      We used the word “transient” to describe two alternative RNA structures formed during the life cycle of the intron. Both states (characterized by P1ex and P10 formation) are transient in as much as they disappear as splicing proceeds. In retrospect, we agree with the reviewer that this usage is too loose (given how the term is generally used in the literature) and might evoke the wrong connotations. We will therefore revise the manuscript to eliminate references to P1ex and P10 as transient states, but rather describe them as alternative conformations. Of course, the general point remains true: that deep mutational scanning data should in principle capture all fitness-relevant structural states even if these are transient (in the strict sense of the word).

      "Fitness" ends up being on an arbitrary scale, which impairs some analysis. A similar high-throughput sequencing pipeline could have been used to directly monitor splicing of every mutant, though at this point that is outside the scope of this study. Even with the arbitrary units, it would be clearer if more time were spent comparing fitness to base-pair stability on an individual basis, rather than the broad analyses. (See minor comments for details.)

      The reviewer is right in saying that a high-throughput pipeline could have been designed to monitor splicing of each genotype directly (rather than assaying fitness of the cell population that represents a particular genotype). We chose not to do so. One reason for this is that monitoring splicing directly would have necessitated design of a more complicated assay. This is because, to monitor splicing efficiency, one would have to monitor both pre-mRNA and mRNA for different genotypes. The former is straightforward (using primers that span the exon-intron junction) but the latter is not: successful splicing removes the genotype-specific information from the mRNA (that information being solely encoded in the intron). This is a solvable problem in principle. One might, for example, introduce barcodes of sufficient complexity in the mRNA that can be linked back to the intron genotype, but doing so would have introduced a further source of error and complicated analysis. We therefore opted for monitoring genotypic fitness by sequencing the plasmids from which the RNAs originate. This does mean that our measurements of fitness are not coupled to a specific molecular phenotype (such as splicing efficiency) – we presume (but are not entirely sure) this is what the reviewer refers to when talking about fitness being on an “arbitrary scale”. However, fitness derived in this manner has the advantage of providing information that does not start from a mechanistic preconception. We ask how a variant affects survival and reproduction of the cell without presuming a specific mechanism, and the results can therefore capture any mechanism, including those that we did not consider initially. The challenge then becomes to tease out possibly multiple mechanisms from unbiased data.

      We will tackle the reviewer’s final comment, regarding analysis of base-pair stability, below in response to one of the minor comments (point #20).

      **Minor Comments**

      The sentence in the abstract beginning "Using an in vivo report system..." is very difficult to comprehend. This is due both to the length of the sentence and the word usage. The final sentence of the abstract is similarly difficult. In general, the writing overemphasizes complexity at the cost of clarity.

      We will revise the entire manuscript to make the writing both clearer and more concise.

      Analysis of results in terms of "epistasis" obscures what could be a straightforward observation. This is the same as saying that mutants are not independent, or that their energetic costs are not additive. This follows obviously from the observation that the nucleotides being mutated are base-paired.

      Making explicit reference to “epistasis” is a considered choice. Framing results in terms of epistasis might be less familiar to readers grounded in RNA or protein biophysics/biochemistry, but is very much at the heart of thinking about the genotype-phenotype relationship from an evolutionary perspective, where global descriptions of epistasis are commonplace and usually provide the starting point for thinking about genotype-phenotype relationships, evolution and evolvability. So what seems unnecessarily obscure when seen through the lens of one field, is natural when considered in the context of another. Importantly, it is also the central approach adopted by many if not most prior deep mutational scanning studies (see e.g. Hayden et al. 2011; Pressman et al. 2019; Zhang et al. 2009; Li et al. 2016; Puchta et al. 2016; Domingo et al. 2018; Li and Zhang 2018; Weinreich et al. 2013; Lalić and Elena 2015; Bendixsen et al. 2017 as cited on page 3 of the manuscript) so we think this framing is helpful to compare our results to prior work.

      We expect that the readership will include many researchers interested in mapping genotype-phenotype-fitness relationships who will expect to see global analyses and descriptors of the type we present. We will, however, revise the manuscript to ensure that our description of the findings remains accessible to readers from other fields.

      More specifically, we also note that the fact that mutations are not independent (i.e. epistasis exists) might be trivial from the fact that P1ex is a base-paired helix. The magnitude and direction (“sign”) of epistasis, however, are not. In fact, as we describe, contrary to prior DMS on RNA helices, we find a lot of positive epistasis, reflecting, as we argue, selection against excess stability of P1ex to allow subsequent formation of P10.
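To make the notion of the magnitude and sign of epistasis concrete, here is a minimal sketch using one common definition: the deviation of the double mutant's log fitness from the sum of the single-mutant log fitnesses, all relative to the master sequence. This is an illustrative convention and not necessarily the exact measure used in the manuscript; the fitness values are hypothetical.

```python
import math


def pairwise_epistasis(w_a, w_b, w_ab):
    """Epistasis on a log2 scale for two mutations A and B.

    w_a, w_b: fitness of each single mutant relative to the master sequence.
    w_ab: relative fitness of the double mutant.
    A positive value means the double mutant is fitter than expected if
    the two effects combined multiplicatively; a negative value means
    it is less fit than expected.
    """
    expected = math.log2(w_a) + math.log2(w_b)
    return math.log2(w_ab) - expected


# Hypothetical values: each single mutant halves fitness.
no_epistasis = pairwise_epistasis(0.5, 0.5, 0.25)  # double mutant as expected
positive = pairwise_epistasis(0.5, 0.5, 0.5)       # double mutant fitter than expected
```

Under this convention, non-independence of paired bases only implies that the value is non-zero; whether it is positive or negative is an empirical result.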

      The novel information is the sensitivity of fitness to base pairing. This is best shown in an analysis like Figure 3A (see below), not broad measures of epistasis.

      Please see responses to points #11, #12, and #16 above for an elaboration of what we consider to be the main merits of this study and why providing broad measures of epistasis is a sensible choice.

      Figure 1C isn't necessary for the reader to understand the process.

      We are happy to follow editorial guidance as to whether this panel is superfluous and should be removed or is worth including.

      It is unclear what figure 2C is showing. It appears that the replicates are similar to each other, that 30 deg C and 37 deg C are also similar, but that +/- Kan are different. This probably doesn't need a figure in the main text.

      This figure does indeed capture what the reviewer describes: genotype pools in +/-kan are least similar to each other, while 30/37ºC are similar but distinct in the +kan condition and effectively indistinguishable in the -kan condition, in line with expectations. We agree with the reviewer that this information per se is something that would typically be found in a supplementary figure. However, we would advocate for retention of this panel in the main manuscript in this instance because of the way in which it was derived: using the Bray-Curtis dissimilarity index. To our knowledge, this is the first time that Bray-Curtis dissimilarity has been used to quantify, in a principled way, the similarity between genotype pools. Borrowed from the ecology literature, the index captures both richness (number of different species/genotypes in the ecosystem/genotype pool) and relative abundance to provide an integrated measure of genotype diversity. We believe that this measure will be useful for future studies and rather than relegating the figure to the supplement, we would aim to briefly highlight its methodological novelty.
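For readers unfamiliar with the index, the following is a minimal sketch of how Bray-Curtis dissimilarity could be computed between two genotype pools from read counts. Normalising counts to relative abundances before comparison is an assumption made here so that pools of different sequencing depth are comparable; it is not necessarily the procedure used in the manuscript, and the genotype names and counts are hypothetical.

```python
def bray_curtis(pool_a, pool_b):
    """Bray-Curtis dissimilarity between two genotype pools.

    pool_a, pool_b: dicts mapping genotype -> read count.
    Counts are converted to relative abundances (an assumption), then
    the index is sum(|a - b|) / sum(a + b) over all genotypes.
    Returns 0.0 for identical compositions and 1.0 for pools that
    share no genotypes.
    """
    total_a = sum(pool_a.values())
    total_b = sum(pool_b.values())
    numerator = 0.0
    denominator = 0.0
    for genotype in set(pool_a) | set(pool_b):
        freq_a = pool_a.get(genotype, 0) / total_a
        freq_b = pool_b.get(genotype, 0) / total_b
        numerator += abs(freq_a - freq_b)
        denominator += freq_a + freq_b
    return numerator / denominator


# Hypothetical pools: one genotype is lost under selection.
unselected = {"GGCU": 50, "GACU": 50}
selected = {"GGCU": 100}
dissimilarity = bray_curtis(unselected, selected)
```

A genotype that is absent from one pool contributes its full abundance to the numerator, which is how the index reflects richness as well as relative abundance.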

      Figure 3A could be the most informative part of the manuscript. However, predicted minimum free energy should be on the x-axis as the independent variable. The expectation then is that you would see a peak in fitness at some free energy, with fitness falling off both with increased and decreased stability. Furthermore, there should be more analysis along these lines. The authors should calculate helical stability for both P1ex and P10 for every mutant and compare with fitness. Mutations which affect both could also be separated out. Figure 4C comes the closest to this but views it only in terms of GC pairs; there is no reason not to quantify the energetic effects given that predictions of stability for helices is quite good. Deviations from a model invoking only helical stabilities would indicate another factor is involved (alternative base-pairing or tertiary structure, for example).

      We agree with the reviewer that the axes in Figure 3A should be flipped and we will do so in the revised manuscript. We also agree that, when it comes to helical stability of P1ex, the simple expectation would be to see a peak at a certain stability with drop-offs on either side, as intimated by Figure 4C. We further agree with the reviewer that Figure 4C is rather indirect and can be made more quantitative by considering helical stability across all genotypes directly. To this end, we will use one of the many tools available that allow prediction of helical stability from primary sequence (e.g. the efn2 function in RNAstructure, as used by Torgerson et al 2018 RNA, see point #24 below) and replace Figure 4C with a more quantitative fitness landscape based on these computations. To provide added confidence in the computations of helical stabilities from primary sequence in the context of our structure, we will also calculate helical stabilities from molecular dynamics simulations for the subset of genotypes we considered previously (Figure 4E/F) and see how inferred stabilities compare.
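
      The kind of stability-versus-fitness summary discussed here could be sketched as follows, with predicted free energy as the independent variable. This is only an illustration under stated assumptions: the function names, the (free energy, fitness) tuple format, and the 1 kcal/mol bin width are hypothetical, and the free energies themselves would have to come from an external predictor, which this sketch does not call.

```python
from collections import defaultdict


def mean_fitness_by_stability(records, bin_width=1.0):
    """Group (delta_g, fitness) pairs into free-energy bins and average fitness.

    records: iterable of (delta_g_kcal_per_mol, fitness) tuples
    (a hypothetical format chosen for this sketch).
    Returns a list of (bin_lower_edge, mean_fitness, n) sorted by bin edge.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    bins = defaultdict(list)
    for delta_g, fitness in records:
        # Floor division places negative free energies in the correct bin.
        edge = (delta_g // bin_width) * bin_width
        bins[edge].append(fitness)
    return [(edge, sum(v) / len(v), len(v)) for edge, v in sorted(bins.items())]


def peak_bin(binned):
    """Return the (bin_lower_edge, mean_fitness, n) entry with the highest mean fitness."""
    return max(binned, key=lambda row: row[1])
```

      If fitness peaks at an intermediate stability, `peak_bin` would return an interior bin, with lower mean fitness in the bins on both sides.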

      There appears to be a missing verb in the legend for figure 3A, second sentence.

      We will fix this error.

      Figure S5 appears to be redundant with Figure 1.

      At first glance, Figure S5 does indeed appear redundant with Figure 1 but it is not. Figure S5 shows the relevant sequence of the group I intron and bordering exons in its native context, i.e. when embedded in the 23S ribosomal RNA gene of Tetrahymena thermophila, whereas Figure 1 shows the genotype of the mutant intron embedded in knt. The sequences are different. We will revise the legend to Figure S5 to make this clearer.

      Figure S6 is a better analysis than what appears in the main text, and could be expanded to all base pairs.

      We will expand Figure S6 to include all base pairs as suggested. We disagree that this is a better analysis compared to what appears in the main text. Rather, it provides a complementary, hypothesis-driven view whereas the analysis in the main text is more systematic and unbiased in approach.

      Reviewer #2 (Significance (Required)):

      This manuscript largely focuses on the technical approach. The shift in analytic strategy described above would increase the conceptual impact. The conclusions are consistent with and fit in with recent uses of high-throughput sequencing to study RNA systems. For example Pitt & Ferré-D'Amaré, Science (2010) and Kobari et al, NAR (2015) describe fitness landscapes of the ligase and HDV ribozymes, respectively. Torgerson et al RNA (2018) make similar measurements on the glycine riboswitch, including a treatment of relative helix stability for two mutually exclusive conformations. The overall results are of interest to researchers in the field of noncoding RNA.

      We thank the reviewer for highlighting the paper by Torgerson et al, of which – embarrassingly – we were not aware. We will make reference to this paper in a revised manuscript and highlight that riboswitches might be a good model system to further explore asymmetric constraint and selection against excess stability in an evolutionary context (also see our response to point #9 above).

      As highlighted earlier, we think the main conceptual impact of our work lies not in the description of helical stabilities. Rather, it lies in a) providing a rigorous proof-of-principle that deep mutational scanning can capture multiple conformational states simultaneously, and b) that, using an unbiased machine learning approach, these states can be deconvoluted from a single fitness landscape to attribute the fitness impact of individual mutations to specific RNA conformations. A shift in analytical strategy to “cut to the chase” and narrowly focus on helical stability would be misguided in this context, as we seek to provide not only insights into the data at hand but also lay out a sound and general recipe for analysing similar datasets in the future.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The manuscript by Soo et al probes the effect of mutations on the fitness of the Tetrahymena Group I self-splicing intron. They used high-throughput sequencing to simultaneously identify the effect of every possible sequence in a 4-bp helix. The approach is sound and the conclusions are generally supported. However, the analysis seems overly complicated given the dataset. Both the analysis and the accompanying writing make it difficult to understand what seems to be a fairly clear conclusion - that the relative stabilities of two alternative RNA helices are important for splicing.

      Major Comments

      1. The authors state that this method can identify the impact of transient conformational states. However, the two conformational states in this study are not transient - in fact they are associated with two distinct chemical steps of splicing and are quite stable. It may be that the effect of important transient states would be observed, but this study does not demonstrate that.

      2. "Fitness" ends up being on an arbitrary scale, which impairs some analysis. A similar high-throughput sequencing pipeline could have been used to directly monitor splicing of every mutant, though at this point that is outside the scope of this study. Even with the arbitrary units, it would be clearer if more time were spent comparing fitness to base-pair stability on an individual basis, rather than the broad analyses. (See minor comments for details.)

      Minor Comments

      1. The sentence in the abstract beginning "Using an in vivo report system..." is very difficult to comprehend. This is due both to the length of the sentence and the word usage. The final sentence of the abstract is similarly difficult. In general, the writing overemphasizes complexity at the cost of clarity.

      2. Analysis of results in terms of "epistasis" obscures what could be a straightforward observation. This is the same as saying that mutants are not independent, or that their energetic costs are not additive. This follows obviously from the observation that the nucleotides being mutated are base-paired. The novel information is the sensitivity of fitness to base pairing. This is best shown in an analysis like Figure 3A (see below), not broad measures of epistasis.

      3. Figure 1C isn't necessary for the reader to understand the process.

      4. It is unclear what Figure 2C is showing. It appears that the replicates are similar to each other, that 30 °C and 37 °C are also similar, but that +/- Kan are different. This probably doesn't need a figure in the main text.

      5. Figure 3A could be the most informative part of the manuscript. However, predicted minimum free energy should be on the x-axis as the independent variable. The expectation then is that you would see a peak in fitness at some free energy, with fitness falling off both with increased and decreased stability. Furthermore, there should be more analysis along these lines. The authors should calculate helical stability for both P1ex and P10 for every mutant and compare with fitness. Mutations which affect both could also be separated out. Figure 4C comes the closest to this but views it only in terms of GC pairs; there is no reason not to quantify the energetic effects given that predictions of stability for helices are quite good. Deviations from a model invoking only helical stabilities would indicate another factor is involved (alternative base-pairing or tertiary structure, for example).

      6. There appears to be a missing verb in the legend for Figure 3A, second sentence.

      7. Figure S5 appears to be redundant with Figure 1.

      8. Figure S6 is a better analysis than what appears in the main text, and could be expanded to all base pairs.
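
      The point in minor comment 2, that "epistasis" here amounts to non-additivity of mutational effects, can be made concrete with one common definition of pairwise epistasis on a log-fitness scale. The sketch below is an assumption for illustration only; the manuscript's own definition may differ, and the function name is hypothetical.

```python
import math


def pairwise_epistasis(f_wt, f_a, f_b, f_ab):
    """Deviation of a double mutant from additivity on a log-fitness scale.

    All arguments are positive fitness values (wild type, single mutants A
    and B, double mutant AB). Returns 0.0 when the two mutations act
    independently, a positive value when the double mutant is fitter than
    expected, and a negative value when it is less fit than expected.
    """
    for value in (f_wt, f_a, f_b, f_ab):
        if value <= 0:
            raise ValueError("fitness values must be positive")
    expected = math.log(f_a / f_wt) + math.log(f_b / f_wt)
    observed = math.log(f_ab / f_wt)
    return observed - expected
```

      For two base-paired positions, each single mutation breaks the pair while a compensatory double mutation restores it, so a positive value would be the expected outcome under this definition.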

      Significance

      This manuscript largely focuses on the technical approach. The shift in analytic strategy described above would increase the conceptual impact. The conclusions are consistent with and fit in with recent uses of high-throughput sequencing to study RNA systems. For example Pitt & Ferré-D'Amaré, Science (2010) and Kobari et al, NAR (2015) describe fitness landscapes of the ligase and HDV ribozymes, respectively. Torgerson et al RNA (2018) make similar measurements on the glycine riboswitch, including a treatment of relative helix stability for two mutually exclusive conformations. The overall results are of interest to researchers in the field of noncoding RNA.

      Our expertise is in RNA biochemistry and biophysics. We are not qualified to evaluate the details of several of the computational pipelines described.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The authors constructed a virtually complete fitness landscape of the P1 extension region (4-base-paired helix) in the group I intron from Tetrahymena thermophila, using a kanamycin resistance reporter to evaluate the fold-change in fitness, which is related to self-splicing activity. This was a clever choice of system because it was known from earlier work that the P1 extension adopts two different conformations during self-splicing. The fitness of each variant was determined from the number of reads acquired from the sequencing data sets and analyzed through an extensive computational pipeline.

      The strength of the paper is that this machine learning approach can be used to calculate how individual variants contribute to the fitness landscape and assess the directions of epistasis across a large number of identified genotypes. The authors argue that machine learning more successfully models subtle effects that arise from interactions between RNA residues, and that the power to analyze deep mutational sequencing experiments can better rationalize fitness constraints arising from multiple conformational states. The results are mostly consistent with previous studies even though the authors collected the data in a more advanced and complicated way. They are also able to rationalize complex phenotypes - for example, the observed fitness defects are more prevalent under an unfavorable growth condition (30 °C), because the lower temperature hinders conformational exchange. Although such cold sensitive effects are well known in RNA, it is gratifying that this can be captured in the fitness landscape.

      Despite these strengths, there are several weaknesses that should ideally be addressed before publication.

      1. The results would be more convincing if the authors directly measured the self-splicing activity of a few key variants, such as the C2C21 mutant, to determine whether these mutations alter the self-splicing mechanism of the Tte-119(C20A) master sequence in the way that they infer from their model. In interpreting their results, they may want to consider misfolding of the intron core (coupled to base pairing of P1) and reverse self-splicing. Reversibility in the hairpin ribozyme, for example, turned out to be the key for understanding the effects of certain mutations.

      2. Related to the point above, interesting conclusions regarding the relationships between base identity and epistasis that arise from metastability should be strengthened with additional examples. For example, the authors can explain why a reverse base-pairing variant (C3G20) exhibits negative epistasis but is not similar to that of the G3C20 construct. This would ideally use the data from the screen but also be validated by checking the self-splicing activity of a few individuals at low and high temperature.

      3. They should validate the screen by showing that kanamycin resistance does indeed correlate strictly with self-splicing activity, and not some other feature such as RNA turnover. (It would also not be a bad idea to check this in the cell, which can be done by primer extension or Northern blotting.)

      4. The benefit of the machine learning model is that it can extract signals that may be hard to detect otherwise. The downside is that it doesn't produce a physical model, as far as I am aware. The parameters are themselves not meaningful - except to the degree that trends in the fitness estimates can be explained after the fact. This is something that should ideally be explained more directly in the manuscript.

      5. The authors claim that by evaluating a large number of sequences at two conditions, they can capture variants with intermediate phenotypes (Fig. 1). This is not necessarily true. If the original screen allows only the most active variants to survive on kan+ medium, then the signature of intermediate phenotypes may not be encoded in the original data, and thus not retrievable even with sophisticated algorithms, which may also be prone to overfitting. At what limit of stringency will the screen fail to yield information about intermediate fitness? How deeply must one sequence to recover this information, especially if noisy or degraded? Some discussion of these effects would be helpful.

      6. Lastly, the evolvability of RNA is fascinating and there is much to learn. However, the authors don't discuss the implications of their findings for molecular evolution although they throw the term around. It would be exciting if there is a trend in the fitness landscape that could help explain the trajectory of RNA evolution in nature.

      7. The authors use the abbreviation DMS for deep mutational scanning; the RNA structure field uses the reagent dimethyl sulfate, which is also abbreviated DMS. They may want to choose a different acronym or just avoid an acronym altogether.

      Significance

      As the importance of RNA structure for gene expression becomes more widely appreciated, interest in understanding the evolution of RNA structures is also increasing. Compared with the molecular evolution of proteins, evolution and fitness in RNA is far less understood, although the authors appropriately point to a number of recent studies on this topic. The main advance here is to use machine learning methods to analyze the results of a large genotypic screen, with the goal of more accurately capturing the fitness effects of sequences at varied distances from the parental sequence. The specific conclusions reached here such as the importance of metastability or the prominence of cold sensitive effects are not revolutionary, but the authors illustrate how such phenomena can be investigated more systematically and in more depth.

  2. Sep 2020
    1. Reviewer #3

      Introduction:

      1) For those not familiar with personality/trait constructs, harm avoidance should be defined.

      2) The authors unnecessarily make a distinction between emotion and cold cognition, or emotion and non-emotional perception. I don't think this distinction needs to be made and, furthermore, the separation of emotion and cognition is a little antiquated given what we know about holistic processing in the brain.

      3) There is no mention of the amygdala or bed nucleus of the stria terminalis in discussions of anxiety and especially in anticipation. Nor is there any mention of anticipatory or arousal components of anxiety.

      4) There are two competing points brought up in the introduction, regarding the pre-SMA: 1) that the pSMA is involved in time tracking and 2) that the pSMA is involved in threat related shock. This appears to be problematic due to the proposed hypotheses. Perhaps, the authors could adjust the hypotheses to illustrate why only time perception is a main effect hypothesis and time and anxiety are an interaction hypothesis.

      5) Hypothesis 5 is unclear. I assume that brain (neural) changes are being correlated with time estimation (a behavioral index?), but this is not stated.

      Methods:

      1) The NimStim images need to be described in more detail in the methods and not just in the figure caption.

      2) The specifics of the experimental methods need to be clearer regarding the differences in stimulus duration. This is an important distinction between the two studies and not enough details are given. It should be clearly worded and not left up to the reader to try to interpret the table.

      3) Why did the number of shocks differ between participants? That seems like a confound for the neural interpretation. The authors need to explain.

      4) There is no mention of fMRI screening for Study 1.

      5) Why a power calculation for Study 2, but not 1?

      6) The amount of explanation given for the two studies in the methods section is uneven and needs to be reconciled. The descriptions are quite different, e.g., how many total shocks were delivered in Study 1? There are inconsistencies throughout.

      7) Why are different analysis methods used to examine behavioral effects? ANOVA vs. paired sample t test? Details like this need to be explained throughout the manuscript if the authors are trying to compare two data sets.

      8) It isn't clear or mentioned that Study 1 was a pilot study for Study 2 until the neuroimaging analysis section. This needs to be explained and more detail should be included much earlier in the manuscript.

      9) For information: Siemens Skyra and Prisma scanners have built in dummy scans at the beginning of sequences to allow for equilibration.

      10) The neuroimaging methods require much more detail, i.e., SPM version used, etc., etc.

      11) The ROI description needs more detail, e.g., a 10 mm sphere. 10 mm what? Radius, circumference? That's a huge ROI for subcortical regions.
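
      To show why the radius-versus-diameter question in point 11 matters, the sketch below compares the two readings of "10 mm sphere". The 3 mm isotropic voxel size is an assumption made purely for illustration; the study's actual resolution is not given in this review, and the function names are hypothetical.

```python
import math


def sphere_volume_mm3(radius_mm):
    """Volume of a sphere in cubic millimetres."""
    return 4.0 / 3.0 * math.pi * radius_mm ** 3


def approx_voxel_count(radius_mm, voxel_size_mm):
    """Approximate number of isotropic voxels of the given edge length in the sphere."""
    return sphere_volume_mm3(radius_mm) / voxel_size_mm ** 3


# Assumed 3 mm isotropic voxels; not taken from the manuscript.
for label, radius in (("10 mm radius", 10.0), ("10 mm diameter", 5.0)):
    print(label, round(sphere_volume_mm3(radius)), "mm^3,",
          round(approx_voxel_count(radius, 3.0)), "voxels")
```

      Because volume scales with the cube of the radius, a sphere with a 10 mm radius has eight times the volume of one with a 10 mm diameter, which is why the ambiguity is consequential for small subcortical structures.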

    2. Reviewer #2

      The manuscript "Anxiety makes time pass quicker: neural correlates" outlines an interesting and potentially important set of experiments aimed at replicating a previously reported effect of distorted time perception while under threat of electric shock while adding fMRI measurement of brain activity during the task. The manuscript has multiple strengths, in my opinion, including the use of a cleverly designed paradigm coupled with sophisticated neuroimaging methods, pre-registered predictions and analysis plan, and a potentially informative mechanistic focus. The study is also well grounded in the literature and the manuscript well written. I have some concerns, however, with the current version of the manuscript. These concerns mostly center on the strength of evidence afforded by the current design and the interpretability of the design and results. I outline these concerns, point by point below.

      1) The choice to pre-register the predictions and analysis plan is laudable. For clarity, I believe the authors should indicate, up front, what aspects of the study were pre-registered rather than simply saying that it is pre-registered.

      2) There are potentially important differences between the study pre-registration and the reported hypotheses and analysis. Sticking rigidly to the pre-registration is certainly not necessary to benefit from a pre-registration but I believe all potentially substantive deviations from the pre-registration should be identified and explained in the manuscript for transparency. For example, the specific brain regions mentioned in Prediction 2 are not consistent between the manuscript and pre-registration.

      3) In the pre-registration, I didn't see Prediction 4 (interaction of time-related and anxiety-related neural processing) but this may be attributable to inconsistent wording between the pre-registration and manuscript.

      4) The pre-registration discusses planned hypotheses and analysis involving functional connectivity but I do not see this mentioned in the manuscript.

      5) Some description of why faces (versus anything else) were used as stimuli is needed for readers to understand the task.

      6) Related to point 5 above, it is reported that the durations of stimuli were randomized but I did not see a description of randomization of the face stimuli themselves. This is needed (if I didn't just miss it).

      7) The authors indicated that the study was powered to detect the effect of threat on (I assume) behavior. I would guess that this is one of the largest effects that could be tested for in this study. In fact, the study appears underpowered to detect anything but very large effects. This could explain why many effects tested were not found (especially the interactions). I believe this should be explicitly acknowledged as a limitation for readers to be able to appropriately evaluate the strength of evidence for the claims made.

      8) Given the short ITIs in the task, perhaps the effects attributed to anxiety caused by threat of shock are in actuality effects due to continued processing of the previous aversive shock. I know the authors said they regressed out the effect of shock from the brain measures but it is unclear how one would regress out the effects of processing of previous shocks. Perhaps this potential confound has been addressed in previous reports of this task but I think some brief attention to the issue here would help readers to evaluate the results.

      9) Given the fact that shocks always occurred during the ITI and never during the cue, readers may be left wondering if the participants were indeed anxious versus, e.g., distracted, during the temporal decision task since they technically are not even yet at risk of receiving a shock at that moment of the task. Some clarification of this point would be helpful.

      10) Related to and overlapping with some of the points above, I request that the authors add a statement to the paper confirming whether, for the experiment, they have reported all measures, conditions and data exclusions and how they determined their sample sizes. The authors should, of course, add any additional text to ensure the statement is accurate. This is the standard reviewer disclosure request endorsed by the Center for Open Science [see http://osf.io/hadz3 ]. I include it, where appropriate, in every review.

    3. Reviewer #1

      This manuscript reports a pair of studies investigating the neural correlates of the temporal underestimation that has been shown to accompany anxiety in previous studies. Hypotheses were pre-registered, including increased activation in the anterior cingulate during threat and that "threat-related bold signal changes will correlate with the threat related behavioural changes". The current work found threat-related activity in the anterior cingulate gyrus, and greater mid-cingulate activity for longer estimates of stimulus duration, with a trend toward overlap between these contrasts, which was subthreshold after correcting for multiple comparisons. In addition, activity associated with state anxiety and temporal estimation overlapped in the insula and putamen. The authors interpret these findings as consistent with the overloading hypothesis that vigilance during state anxiety and duration perception rely on overlapping areas, resulting in inaccurate duration perception during anxiety. However, these results should be interpreted with caution given that, as the authors note, there was no interaction between threat and perceived duration, and no correlation "between the underestimation of time during threat and either insula or midcingulate activation in the interaction contrast". Given the relatively small sample size, these null findings may have been the result of low power. Nevertheless, the current study will likely serve as a useful starting point for future work on this topic.

      Below are my comments on the manuscript:

      1) In the pre-registration, hypothesis 2 refers to the ACC and frontopolar areas, while in the manuscript I am not seeing the frontopolar areas. I know this region is particularly susceptible to dropout, so it is possible you were unable to adequately test this hypothesis – if so, this should be stated in the manuscript. In addition, the manuscript lists right IFG in the hypotheses, but I am not seeing results reported for this region.

      2) It would be good to explain why you chose to use 10 mm spheres centered on your ROIs, rather than using all voxels that met the p<.05 threshold in the clusters identified in Study 1.

      Minor comments:

      The abstract starts off talking about how anxiety can be adaptive; however, unless I missed something, the authors don't explicitly tie this thought to temporal underestimation. From the perspective of someone who is naive to the literature on temporal underestimation, it seems that causing temporal underestimation would be maladaptive, if it causes one to underestimate how long one has been worrying about something. I would suggest either making the relationship between these ideas more explicit in the text, or removing this first sentence or moving it to a less prominent spot.

      If there was a methodological reason for switching to a train of shocks (e.g., an expectation that it would elicit more anxiety) in Study 2, it may be helpful for future researchers to state it. If it was simply a matter of equipment available at the second site, then no changes are needed.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      This manuscript reports a pair of studies investigating the neural correlates of the temporal underestimation that has been shown to accompany anxiety in previous studies. Hypotheses were pre-registered, including increased activation in the anterior cingulate during threat and that "threat-related bold signal changes will correlate with the threat related behavioural changes". The current work found threat-related activity in the anterior cingulate gyrus, and greater mid-cingulate activity for longer estimates of stimulus duration, with a trend toward overlap between these contrasts, which was subthreshold after correcting for multiple comparisons. In addition, activity associated with state anxiety and temporal estimation overlapped in the insula and putamen. The authors interpret these findings as consistent with the overloading hypothesis that vigilance during state anxiety and duration perception rely on overlapping areas, resulting in inaccurate duration perception during anxiety.

      The reviewers and I identified several strong points:

      1) The current study may serve as a useful starting point for future work.

      2) Interesting set of experiments aimed at replicating a previously reported effect of distorted time perception while under threat of electric shock while adding fMRI measurement of brain activity during the task.

      3) Clever paradigm.

      4) Pre-registered predictions and analytic plan.

      5) Grounded in the literature.

      Yet, on balance, there was consensus that the study provides only an incremental advance, largely owing to limitations of the approach.

      Major/general concerns are:

      1) Insufficient power. E.g.

      -"The authors indicated that the study was powered to detect the effect of threat on (I assume) behavior. I would guess that this is one of the largest effects that could be tested for in this study. In fact, the study appears underpowered to detect anything but very large effects. This could explain why many effects tested were not found (especially the interactions). I believe this should be explicitly acknowledged as a limitation for readers to be able to appropriately evaluate the strength of evidence for the claims made."

      -"The results should be interpreted with caution given that, as the authors note, there was no interaction between threat and perceived duration, and no correlation "between the underestimation of time during threat and either insula or midcingulate activation in the interaction contrast". Given the relatively small sample size, these null findings may have been the result of low power."

      2) Writing style. The reviewers found the lack of attention to polishing the manuscript distracting, e.g. "The methods section appears to be written by two different authors with major inconsistencies in style and phrasing"

      3) Missing details. Crucial methodological details are lacking or inconsistent, making it difficult to fully evaluate the approach.

    1. Reviewer #3

      The authors provide a clear and effective response to the demand for robust real-time pose estimation software with closed-loop feedback capabilities. In addition, we appreciate the effort that the authors have put into making the software user-friendly and extensible. The paper is very well written and contains many tools for those in the field to effectively use.

      A small weakness is that the authors have demonstrated the LED flash latency but do not show an application such as optogenetic stimulation or behavioural manipulation using the system. Also, most of their benchmark numbers are based on videos and not camera streams, which does not fully address potential hardware issues. I believe the heavy dependence on video data rather than an actual ground-truth live video feed is something that should be checked to present accurate numbers.

      Their Kalman filter approach seems useful, but the deviations of the forward-predicted poses from the normal pose estimates are sometimes 30 px or more. People may make trade-offs between latency and accuracy when using this software. Another important factor for real-time tracking is the accuracy of the pose estimation, which determines whether the system is really useful in real applications.

      It would be nice to see a bit more validation of the software in a realistic live stream context. The quality of their code is quite high.

      1) The authors emphasize that their software enables "low-latency real-time pose estimation (within 15 ms, at >100 FPS)". Upon inspection of table 2, it appears that this range of latency and speed combinations is primarily achieved using 176x137 px images on Windows/Linux GPU based hardware, with corresponding FPS dropping to well below 100 for larger images in the DLCLive benchmarking tool on all platforms except for Windows. As the range in framerate/latency combinations appears to vary quite a bit between setups and frame sizes, we would suggest including a more realistic range for the latency and framerate in the abstract or at least mention the heavily down-sampled video used.

      2) In table 2, the mean and SD latency appear to be stable across modes, frame sizes, and GPU setups. However, there appears to be a notable spike in the latency range (14 ± 73) for the image acquisition to LED time on Windows computers that stands out from other latency figures. This latency range is concerning for the consistency of real-time feedback applications on a platform and at a frame size that is likely to be commonly used. Would the authors be able to explain a possible reason for this large SD?

      3) The DLG values appear to have been benchmarked using an existing video as opposed to a live camera feed. It is conceivable that a live camera feed would experience different kinds of hardware-based bottlenecks that are not present when streaming in a video (e.g., USB2 vs. USB3 vs. ethernet vs. wireless). Although this point is partially addressed with the demonstration of real-time feedback based on posture later in the manuscript, a replication of the DLG benchmark with a live stream from a camera at 100 FPS would be helpful to demonstrate frame rates and latency given the hardware bottlenecks introduced by cameras.

      4) In Figure 3, the measurement of the latency from frame to LED is not very clear. DLC will always return a pose estimate, even when the tongue does not appear in the image, so the LED will always turn on very quickly after the pose is obtained from the image.

      5) In "Real-time feedback based on posture", the Kalman filter approach to reduce latency through forward prediction is innovative and likely of use for rapid characterization of general behaviours. In Figure 8C, the deviation of pose predictions from non-forward predicted poses appears to follow the general trend of the trajectory but appears to deviate by as many as 50 pixels from the non-forward predicted poses. While this tolerance may be acceptable for general pose estimation, many closed-loop pose estimation implementations may focus on rapid and accurate feedback based on very small movements (e.g. small muscular movements). For example, movements differing in magnitude by a few pixels may distinguish spontaneous twitches from conditioned behaviours. Considering that the demonstrated setup achieves a mean image to LED latency of 82 ms without the Kalman filter, it appears that many users would have to make a large trade-off between accuracy and latency in order to use the system with a conventional webcam and reasonably priced setup. Although the methods discussed are state-of-the-art and impressive considering the hardware used, it may be helpful to include a discussion of how the Kalman filter approach may be improved in the future to improve pose estimation accuracy while maintaining low latency.

      6) The software is compared favourably to existing real-time tracking software in terms of latency (refs 12-14). The efficacy of the existing real-time pose estimation software has been validated on animal movements using closed-loop conditioning paradigms. If feasible, a demonstration of the software reinforcing an animal based on real-time pose estimation (e.g. a similar paradigm to that used in the DLG benchmark video) would provide useful context as to whether the pose estimation strategies discussed are effective in closed-loop experiments. In particular, this would be important to evaluate given the novel Kalman filter approach, which influences the accuracy of pose estimation. We list this closed-loop experiment as optional given the pandemic conditions we face. In contrast to the live animal reinforcement experiment, we do feel that real-world measurements of latency from streaming video to output trigger are required (pt #3).
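
      To make the accuracy versus latency trade-off raised in points 5 and 6 concrete, forward prediction can be sketched as below. This is a minimal constant-velocity Kalman filter on a single keypoint coordinate, not the authors' implementation; the function name, the noise variances `q` and `r`, and the initial covariance are assumptions chosen only for illustration.

```python
import numpy as np

def kalman_forward_predict(positions, dt, lead_time, q=1.0, r=4.0):
    """Filter a 1-D keypoint trajectory and extrapolate each estimate.

    positions : measured coordinate per frame (pixels)
    dt        : time between frames (seconds)
    lead_time : how far ahead to predict (seconds), e.g. the expected latency
    q, r      : process / measurement noise variances (hypothetical values)
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])   # transition for [position, velocity]
    H = np.array([[1.0, 0.0]])              # only the position is observed
    Q = q * np.array([[dt**4 / 4, dt**3 / 2], [dt**3 / 2, dt**2]])
    R = np.array([[r]])
    x = np.array([positions[0], 0.0])       # initial state: first sample, zero velocity
    P = np.eye(2) * 1e3                     # large initial uncertainty (assumption)
    predicted = []
    for z in positions:
        # Predict step: propagate the state one frame forward.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step: correct the state with the new measurement.
        innovation = z - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ innovation
        P = (np.eye(2) - K @ H) @ P
        # Forward prediction: extrapolate along the estimated velocity.
        predicted.append(x[0] + x[1] * lead_time)
    return np.array(predicted)

# Usage on a synthetic, noise-free ramp (100 px/s sampled at 100 FPS).
t = np.arange(200) * 0.01
ramp = 100.0 * t
ahead = kalman_forward_predict(ramp, dt=0.01, lead_time=0.05)
```

      On a smooth trajectory the extrapolated value tracks the future position well, but any abrupt change in velocity is extrapolated in the wrong direction until the filter catches up, which is the kind of deviation visible in Figure 8C.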

    2. Reviewer #2

      Kane et al. introduce a new set of software tools for implementing real-time, marker-less pose tracking. The manuscript describes these tools, presents a series of benchmarks and demonstrates their use in several experimental settings, which include deploying very low-latency closed-loop events triggered on pose detection. The software core is based on DeepLabCut (DLC), previously developed by the senior authors. The first key development presented is a new python package – DeepLabCut-Live! – which optimises pose inference to increase its speed, a key step for real-time application of DLC. The authors then present a new method for exporting trained DLC networks in a language-independent format and demonstrate how these can be used in three different environments to deploy experiments. Importantly, in addition to developing their own GUI, the authors have developed plugins for Bonsai and AutoPilot, two software packages already widely used by the systems neuroscience community to run experiments.

      The tools presented here are truly excellent and very exciting. In my view DLC has already started a revolution in the quantification of animal behaviour experiments and DeepLabCut-Live! is exactly what the community has been hoping for – to deploy the power of DLC in real-time to perform closed-loop experiments. I have very little doubt that the tools described in this manuscript and their future versions will be a mainstay of systems neuroscience very quickly and for years to come. Key to this is that the software is entirely OpenAccess and easy to deploy with inexpensive hardware. I commend, and as a DLC user, I certainly thank the authors for their efforts. I have a couple of comments below on the manuscript itself, which the authors might want to consider. As for the software itself, all of the benchmarks look good and the case studies make a compelling case for its applicability in real-life – and the beauty of it is that because it's Open Access, any issues and improvements needed will be quickly spotted by the community, and I expect duly addressed by the authors judging from their track-record on DLC.

      Main comments:

      1) One important parameter that is not really discussed throughout the manuscript is the accuracy of pose estimation. I realize that this might be more of a discussion on DLC itself, but still, when relying on DLC to run closed-loop experiments this becomes a critical parameter. While offline we can just go back, re-train a new network and try again, in a real-time experiment, classification errors might be very costly. The manuscript would benefit from discussing these errors and how they can be best minimised. It would also be helpful to show rates for false positive and false negative classification errors for the networks and use-cases presented here, to highlight the main parameters that determine them and perhaps show how classification errors vary as a function of these parameters (e.g., do any of the procedures to decrease inference latency, such as decreasing image resolution or changing the type of network, affect classification accuracy?). Along the same lines, while the use of Kalman Filters to achieve sub-zero latencies is very exciting, it is unclear how robust this approach is. This applies not only to the parameters of the filter itself, but also to the types of behaviour that this approach can work with successfully. Presumably, this requires a high degree of stereotypy and reproducibility of the actions being tracked and I feel that some discussion on this would be valuable.

      2) A related point is that some applications are likely to depend on the detection of many key-points and it is unclear how the number of key-points affects inference speed. For example, the 'light detection task' using AutoPilot uses a single key-point, how would the addition of more key-points affect performance in this particular configuration?

    3. Reviewer #1

      The authors present a new software suite enabling real-time markerless posture tracking - with the aim of making low-latency feedback in behavioral experiments possible. They demonstrate the software's capability on a variety of hardware and software platforms – including GPUs, CPUs, different operating systems, and the Bonsai data acquisition platform. Moreover, they demonstrate the real-time feedback capabilities of DeepLabCut-Live!.

      While there have been other methods that have been introduced recently that have incorporated real-time feedback on top of DeepLabCut, this software shows improved latency, has cross-platform capabilities, and is relatively easy to use. The software was thoroughly benchmarked (with one small exception that I'll outline below), and although I wasn't able to directly test it myself, I was easily able to download the code, and the documentation was sufficient for me to understand how it works. I have every confidence that this is a piece of software that will be extensively used by the field.

      My one comment is that it would have been good to have some analysis as to how the network accuracy (i.e., real space – not pixel space – error in tracking) scales with resolution, as the fundamental tracking trade-off isn't image size vs. speed, it's accuracy vs. speed. I wouldn't call this an essential revision, but I think that including these curves would greatly help potential users make important hardware and software decisions. Granted, this difference will alter depending on the network, but even getting a sense from the Dog and Mouse networks here would likely be sufficient to provide a general sense.
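
      The distinction between pixel-space and real-space error can be illustrated with a short conversion. This is a sketch under stated assumptions: the function name and the numbers used (a 320 mm field of view, a 2 px tracking error) are hypothetical and are not taken from the manuscript.

```python
def pixel_error_to_mm(error_px, image_width_px, fov_width_mm):
    """Convert a tracking error in pixels to millimetres.

    Assumes the camera images a field of view fov_width_mm wide onto
    image_width_px pixels, with square pixels and no lens distortion.
    """
    mm_per_px = fov_width_mm / image_width_px
    return error_px * mm_per_px

# The same 2 px error means very different real-space errors once the
# image has been downsampled (hypothetical values).
full_res = pixel_error_to_mm(2.0, image_width_px=640, fov_width_mm=320.0)
down_res = pixel_error_to_mm(2.0, image_width_px=160, fov_width_mm=320.0)
```

      Plotting this real-space error against inference speed for each resolution would give the accuracy versus speed curves requested above.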

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary

      This submission introduces a new set of software tools for implementing real-time, marker-less pose tracking. The manuscript describes these tools, presents a series of benchmarks and demonstrates their use in several experimental settings, which include deploying very low-latency closed-loop events triggered on pose detection. The software core is based on DeepLabCut (DLC), previously developed by the senior authors. The first key development presented is a new python package – DeepLabCut-Live! – which optimizes pose inference to increase its speed, a key step for real-time application of DLC. The authors then present a new method for exporting trained DLC networks in a language-independent format and demonstrate how these can be used in three different environments to deploy experiments. Importantly, in addition to developing their own GUI, the authors have developed plugins for Bonsai and AutoPilot, two software packages already widely used by the systems neuroscience community to run experiments.

      All three reviewers agreed that this work is exciting, carefully done, and would be of interest to a wide community of researchers. There were, however, four points that the reviewers felt could be addressed to increase the scope and the influence of the work (enumerated below).

      1) The fundamental trade-off in tracking isn't image size vs. speed, but rather accuracy vs. speed. Thus, the reviewers felt that providing a measure of how the real space (i.e., not pixel space) accuracy of the tracking was affected by changing the image resolution would be very helpful to researchers wishing to design experiments that utilize this software.

      2) The manuscript would also benefit from including additional details about the Kalman filtering approach used here (as well as, potentially, further discussion about how it might be improved in future work). For instance, while the use of Kalman Filters to achieve sub-zero latencies is very exciting, it is unclear how robust this approach is. This applies not only to the parameters of the filter itself, but also on the types of behavior that this approach can work with successfully. Presumably, this requires a high degree of stereotypy and reproducibility of the actions being tracked and the reviewers felt that some discussion on this point would be valuable.

      3) A general question that the reviewers had was how the number of key (tracked) points affects the latency. For example, the 'light detection task' using AutoPilot uses a single key-point, how would the addition of more key-points affect performance in this particular configuration? More fully understanding this relationship would be very helpful in guiding future experimental design using the system.

      4) The DLG values appear to have been benchmarked using an existing video as opposed to a live camera feed. It is conceivable that a live camera feed would experience different kinds of hardware-based bottlenecks that are not present when streaming in a video (e.g., USB3 vs. ethernet vs. wireless). Although this point is partially addressed with the demonstration of real-time feedback based on posture later in the manuscript, a replication of the DLG benchmark with a live stream from a camera at 100 FPS would be helpful to demonstrate frame rates and latency given the hardware bottlenecks introduced by cameras. If this is impossible to do at the moment, however, at minimum, adding a discussion stating that this type of demonstration is currently missing and outlining these potential challenges would be important.

    1. Reviewer #3

      This manuscript describes analysis and experiments designed to implicate CBX2 and CBX7 in breast carcinogenesis. Naturally, the analysis of existing data provides only correlative measures, and some of these are likely insignificant and driven by outliers (see specific points below). The experimental validation is done in two cell lines with a single siRNA, and data showing successful targeting of siRNA is lacking. The authors also claim direct regulation of mTORC by CBX2 and CBX7, but the evidence provided is weak. Overall the results are suggestive but do not provide conclusive evidence justifying the conclusions.

      Specific Points:

      The expression of CBX2/CBX7 correlates with breast cancer subtype, so all the predictive power may be in the subtype of cancer. Is there evidence that once standard prognostic methods are applied, CBX2 and/or CBX7 expression levels add to prediction? If not, it is not clear that these are drivers and not simply correlative markers of disease status.

      Figure 2 should include CBX2, CBX7, and other CBX RNA and protein levels to show that targeting was effective and specific. Multiple siRNAs should be used to demonstrate that it is not an off-target effect.

      Figure 3 correlations are extremely weak. Significance is driven by the large number of data points and not by correlation, and likely it is also driven by the few outliers on the left in each figure. If these are removed correlation is likely close to zero.
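
      The concern about Figure 3 can be illustrated with synthetic data (not the manuscript's data). In this sketch two independent variables are generated, a handful of extreme points is appended, and the Pearson correlation and its t-statistic are computed with and without those points. The function name, sample size, and outlier values are assumptions made for the illustration.

```python
import numpy as np

def pearson_r_and_t(x, y):
    """Return the Pearson correlation and its t-statistic (n - 2 d.o.f.)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    n = x.size
    t = r * np.sqrt((n - 2) / (1.0 - r**2))
    return r, t

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y = rng.normal(size=n)                       # independent of x by construction

# Append five extreme points in the lower-left corner of the scatter plot.
x_out = np.concatenate([x, np.full(5, -8.0)])
y_out = np.concatenate([y, np.full(5, -8.0)])

r_clean, t_clean = pearson_r_and_t(x, y)
r_out, t_out = pearson_r_and_t(x_out, y_out)
print(f"without outliers: r={r_clean:.3f}, t={t_clean:.2f}")
print(f"with outliers:    r={r_out:.3f}, t={t_out:.2f}")
```

      With a large number of points, even a weak correlation produces a large t-statistic, so reporting the correlation after excluding the outliers (or a rank-based correlation) would show whether the association is robust.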

    2. Reviewer #2

      Saluja and colleagues present a study examining the contribution of the chromobox family of proteins, specifically CBX2/7, to metabolic reprogramming of breast cancer cells. Notably, little is known regarding CBX2/7's activity in metabolism. The manuscript is well written and clearly presented. The major findings are that CBX2 and CBX7 are related to metabolic reprogramming and have opposing roles in regulating anaerobic glycolysis. Through mining of several large datasets (TCGA/METABRIC), the investigators demonstrate that amplification and upregulation of CBX2 correlates with more aggressive tumors and with increased mTORC signaling. The authors directly demonstrate that siRNA knockdown of CBX2 leads to loss of glucose uptake and a reduction in ATP production. Conversely, loss of CBX7 increased glucose uptake, ATP production, cell number, and pS6 phosphorylation. There is a significant need to better define the contribution of CBX2 and CBX7 in breast cancer, which will shed light on breast cancer progression, metabolic reprogramming, and therapeutic response. The strengths of the study include the use of large, well-annotated datasets and a novel area of cross-talk between epigenetics and metabolism. However, there are concerns detailed below that need to be addressed:

      Major:

      1) Most of the research presented consists of correlative studies with little mechanistic insight. CBX2 and CBX7 are members of the polycomb repressor complex 1 (PRC1). Is CBX2 and CBX7 expression mutually exclusive? Related to figure 3, what is the mechanism by which loss of CBX2 expression decreases mTORC signaling? CBX2 and CBX7 proteins are not likely functioning alone. In CBX2High cell lines the authors should investigate the impact of a PRC1 inhibitor in the context of anaerobic glycolysis to assess whether CBX2 is functioning independently of PRC1. Also, a discussion of the interplay between PRC1, PRC2, and metabolism should be included.

      2) The MTT and Cell Titer Glo therapeutic sensitivity assays need to be repeated using a non-metabolic readout. The major conclusion of the study is that CBX2 and CBX7 promote metabolic reprogramming, thus using metabolic outputs (Cell Titer Glo - ATP production and MTT - mitochondrial respiration) for the chemotherapy assays is flawed.

      3) Only two cell lines were examined (MCF7 [ER/PR positive] and MDA-MB-231 [triple negative]), which is a study limitation. Why were these cell lines selected? Also, only pooled siRNA for both CBX2 and CBX7 were used, thus only loss-of-function responses are evaluated. Does overexpression of CBX2 in a CBX2-low cell line exacerbate anaerobic glycolysis and, conversely, does CBX7 overexpression in a CBX7-low cell line inhibit anaerobic glycolysis?

      4) Based on figure 6, the CBX2high lines are less responsive to Rapamycin suggesting that the cells are not dependent on CBX2-mediated upregulation of mTORC. Temsirolimus was also not detected as being significant, further highlighting that CBX2-activity on mTORC is not a critical pathway. Also, given the antagonistic effect of CBX7, what are the therapeutic vulnerabilities conveyed in CBX7high?

      5) The survival curves in Figure 5 show a substantial difference between the TCGA and METABRIC data. What is the possible explanation?

    3. Reviewer #1

      The manuscript entitled "CBX2 and CBX7 antagonistically regulate metabolic reprogramming in breast cancer" analyzed multi-omics data of breast cancer mainly from METABRIC and TCGA with the focus on the chromobox family member genes (CBXs). The authors showed the association of CBX2 and CBX7 expression levels with glycolysis in tumors and with mTOR signaling, especially the levels of phosphorylation of S6 protein in tumors. Knockdown of CBX2 and CBX7 in two breast cancer cell lines showed opposite effects on glycolysis, cell viability and growth. Previous studies reported that CBX2 and CBX7 have oncogenic and tumor-suppressive roles in breast cancers. Results from this study showed their involvement in regulation of glycolysis, as well as their association with the prognosis of disease-specific survival of breast cancers. While some of the findings about CBX2 and CBX7 are interesting, most of the results showed association and provided limited insights about how CBX2 and CBX7 regulate glycolysis and their contribution in breast cancer.

      Specific comments:

      1) The authors need to provide detailed methods of analysis, including glycolysis deregulation score, where to obtain the DNA methylation levels, etc.

      2) It is unclear whether it is acceptable practice to rank breast cancer aggressiveness by subtype (from LumA, LumB, Her2 to Basal), as shown in Figure 1D, Figure 4C, 4F.

      3) 2-DG experiments were only performed in MDA-MB-231 and MCF-con cells but not cells with CBX knockdown (Fig S3). It is therefore unclear whether the changes of cell viability, proliferation by CBX knockdown are due to the metabolic changes (Figure 2).

      4) Figure 3 showed the effects of CBX on pS6 levels in breast tumors. However, it is unclear whether this change contributes to the role of CBX2, CBX7 in glycolysis. The statement on page 6, line 1 "CBX2 and CBX7 exert their effects on breast cancer metabolism via modulation of mTORC1 signaling" is not established and has no data to support.

      5) Figure 5, since the % of CBX2 high/low and CBX7 high/low differ in different subtypes of breast cancers, it is suggested to analyze the association of CBX2, CBX7 expression with prognosis in different subtypes.

      6) Figure 6, please discuss why CBX2 high cells, which supposedly have high mTOR activity, showed higher resistance towards Rapamycin compared to CBX2 low cells. Also, please discuss whether CBX7 showed opposite effects on sensitivity to the same group of compounds.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The manuscript has been reviewed by three experts in the field, including an expert in metabolism, one in breast cancer and one in bioinformatics. There are concerns that much of the data are correlative as opposed to mechanistic, and that the material thus falls short of increasing insights into the role of CBX2/7 in breast cancer. There is concern that the cell viability assays are actually readouts of metabolism, and that viability assays should be repeated using a non-metabolic readout such as trypan blue or calcein/EtBr stain. There are concerns about the possibility that the expression of CBX2 and 7 as markers of breast cancer subtype are actually driving the correlations seen. And there are concerns that only two cell lines are analyzed, and only a single siRNA. It is suggested that performing the metabolism assays in the presence of knockdown for CBX would better support the premise that there is a correlation between metabolism and proliferation, and that these are together regulated by CBX proteins. Finally, one of the reviewers requests more detail in the methods.

    1. Reviewer #2

      General assessment:

      This study uses innovative analytical tools to characterise movement-evoked patterns in the cortex and evaluate functional recovery after stroke. The authors employ a motor task wherein a sliding platform has to be pulled back by the mouse upon an acoustic cue to obtain a reward. Calcium-imaging cortical events are matched to force events. A propagation map is generated based on SPIKE-order: an asymmetric counter of threshold crossing coincidences between each and all pixels. Three propagation indicators are investigated: duration, angle and smoothness. These indicators show differences between healthy and stroke mice during the first and last three weeks of treatment.

      The proposed SPIKE-order algorithm is a promising analytical tool to characterize brain dynamics in a variety of cortical functional imaging data. The terms 'spike' or 'synfire' do not correspond to the neuronal processes, but are used analogously referring to threshold crossings, and consistent spatiotemporal patterns of spike coincidence respectively. This analysis is highly versatile, being scale and parameter free, thus this approach must be empirically validated.
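
      For readers unfamiliar with the idea, an asymmetric counter of threshold-crossing coincidences can be sketched as follows. This is a deliberately simplified illustration and not the authors' SPIKE-order implementation: it considers a single event per pixel and uses a fixed coincidence window, whereas the method is described as parameter free. The function names, the threshold, and the window are all assumptions.

```python
import numpy as np

def first_threshold_frames(signals, threshold):
    """Frame index at which each pixel first reaches the threshold (-1 if never).

    signals : array of shape (n_pixels, n_frames)
    """
    above = np.asarray(signals) >= threshold
    frames = np.argmax(above, axis=1)        # index of the first True per row
    frames[~above.any(axis=1)] = -1          # pixels that never reach threshold
    return frames

def order_matrix(cross_frames, window):
    """D[i, j] = +1 if pixel i leads pixel j within the window, -1 if it lags."""
    t = np.asarray(cross_frames)
    valid = t >= 0
    diff = t[None, :] - t[:, None]           # diff[i, j] = t_j - t_i
    coincident = (np.abs(diff) <= window) & valid[:, None] & valid[None, :]
    return np.sign(diff) * coincident

# Three pixels activated one frame after another (toy data).
signals = np.array([[0, 1, 1, 1],
                    [0, 0, 1, 1],
                    [0, 0, 0, 1]], dtype=float)
frames = first_threshold_frames(signals, threshold=0.5)
D = order_matrix(frames, window=5)
leader_score = D.sum(axis=1)                 # positive values mark leading pixels
```

      Sorting pixels by such a leader score yields a propagation order from which indicators like duration and angle could be derived.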

      Major concerns:

      1) My main concern with the study is in the use of these indicators to track recovery after stroke. There is no control group that received stroke but did not perform the task during the acute phase. An increase in oxygenation in the area over time due to collateral irrigation may account for the reported effects. Without the appropriate control, the recovery in propagation indicators cannot be attributed to motor rehabilitation.

      2) Notably, there is no effect of training on changes to these indicators in healthy mice. Previous work by Makino et al. 2017 reported decreased duration of activity as learning progressed. Looking at spatial gradients of phase, Makino et al also found a secondary activity flow at later stages of training. The authors should provide reasons for the absence of these changes in their indicators of duration and angle.

      3) There is no analysis on the frequency of action types to indicate behavioural recovery. This should be addressed in the discussion, but it may also suggest that these indicators have no relation to a longitudinal effect of the motor task.

      4) The key to the status codes is missing. There are 7 discrete statuses of the robotic slide in total, but only status 3 is described. Also, the schedule for the acoustic cue within status 3 is unclear.

      5) The nature of the reward for pulling is not specified. In the drawing in Figure 1, it looks like it could be water or sucrose. However, it is stated that mice are not water deprived. The difference in cortical activity between R and nR events is due solely to auditory cues and not voluntary action. It is important to know the nature of the reward to assess motivation and intention in the movement.

    2. Reviewer #1

      I enjoyed reading the paper by Cecchini et al. on using wide-field calcium imaging in mice to assess propagation of motor-related cortical network activity before and after focal photo-thrombotic stroke. The paper is well-written and relatively easy to follow because of the lengthy (perhaps even verbose and at times jargony) explanations of the methods and results. The authors are clearly experts in the field, having published on the topic of stroke recovery in recent years, and in the methods employed, especially in the analysis approach, which they recently developed (cf. Allegra Mascaro et al., 2019). They also cite many of the relevant papers in the field. After decades of stroke research documenting various aspects of molecular or anatomical changes in circuits after stroke, studies such as this one that focus on alterations in network activity are very important. The main technique used, single-photon calcium imaging through the skull of bulk signals on the cortical surface, is elegant in its simplicity and has clear advantages over similar wide-field imaging techniques using voltage sensors (which include sub-threshold activity not related to action potentials) or intrinsic signals (which depend on blood flow/volume and are hard to interpret in the context of stroke). The authors then use sophisticated quantitative approaches to analyze three aspects of the propagation of cortical network activity (duration, smoothness, and angle) and how they are affected by stroke and by two rehabilitative strategies. The main findings can be summarized as follows: 1) These three indicators are stable over time (4 weeks) in healthy mice; 2) After stroke, network events last longer and are more chaotic (lower smoothness); 3) A combination of motor training and silencing the healthy hemisphere after stroke drastically alters these three parameters.

      The main strengths of the paper, in my opinion, include the novelty of their analysis of wide-field calcium imaging in the context of stroke, especially when coupled with a rehabilitative strategy, and the results showing differences in propagation of activity between stroke and healthy controls. However, I have noted the following issues, some of which I consider serious.

      One problem I encountered is that the authors do not provide sufficient data on the impact of stroke, both in terms of size/location and its impact on function (motor pull task), or about the pharmacological silencing approach. Although they refer to their previous paper (Allegra Mascaro 2019), I could not find clear answers there either.

      My first recommendation is that the authors present data on the location and size of the infarcts they produced in each of the mice used in the present study. They should show at least a couple of histological examples of infarcts and, more importantly, a graph that plots infarct volume for all the individual mice (this could be in a suppl. figure), and ideally the location of the infarct with respect to the landmarks of M1. PT strokes can be quite variable, and one wonders whether some mice suffered large infarcts whereas in others they are negligible or may have missed M1 altogether.

      Second, they should clarify in a lot more detail what the behavioral deficits are after such a stroke, if any, not just as detected by the robot task but also with other behavior assays. In the Allegra Mascaro paper, the plots in Fig. 1D indicate that normal control mice have gradual reductions in peak amplitude and in slope of the force over 5 days of training (whereas stroke mice do not), but it's not clear whether this is statistically significant. Moreover, in the Results section of that paper, they claim the "amplitude and slope of the force task (...) were not significantly different across groups." I believe the authors need to show their behavior data for this new cohort of mice. In fact, if they can't find significant deficits in forelimb function with the pull task after PT stroke, then the authors should clearly state that their robot assay is insensitive (which would seriously undermine the significance of their findings.) The present manuscript states that the combined treatment promotes "a generalized recovery of the forelimb dexterity" (line 358), but this is not supported by any data provided. If the authors are unable to provide behavior data, any statements about the robot task should be modified, if not removed. Solely referring to their 2019 paper is not appropriate, since this is an entirely new group of animals. I'm very much hoping that the authors actually have these data on behavioral performance across time for all mice in the study, because they would be in a position to actually correlate changes in pulling (amplitude, slope) with network activity data and provide a more robust narrative. However, Fig. 6 indicates that the effects of Rehab were the same for all types of events (F vs. nF, Act vs. Pass, or RP vs. nRP), which suggests that there is probably no correlation between training and network activity.

      Third, regarding the BoNT/E experiments, neither the Allegra Mascaro 2019 paper nor this one provides any evidence that the procedure actually works as intended. The authors should do in vivo wide-field calcium imaging in a subset of mice in the injected hemisphere to show that spontaneous and motor-related cortical activations are eliminated in toxin-injected mice (or, at the very least, some ephys in slices), with appropriate controls of course, such as mice injected with vehicle or with denatured toxin. An important control that is currently missing is a BoNT/E alone group, without stroke (see comment #1 below).

      Lastly, I am concerned about the sample size they use for statistics. Although they discuss the numbers of mice in their power analysis, all the plots they show include many more individual points than the number of mice (what are those, FOVs? events?). The preferred sample size would be to use the number of mice. I believe the authors should show the data (and perform statistics) only for individual mice. Otherwise they need to justify why they didn't do stats with n= # mice.
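
      The requested analysis amounts to collapsing events (or FOVs) to one value per animal before running any test, so that n equals the number of mice. A minimal sketch, assuming each data point carries a mouse identifier; the function name and the toy values are hypothetical:

```python
from collections import defaultdict
from statistics import mean

def per_mouse_means(mouse_ids, values):
    """Average all measurements belonging to the same mouse.

    Returns a dict mapping mouse id -> mean value, so that downstream
    statistics use one data point per animal.
    """
    grouped = defaultdict(list)
    for mouse, value in zip(mouse_ids, values):
        grouped[mouse].append(value)
    return {mouse: mean(vals) for mouse, vals in grouped.items()}

# Toy example: five events recorded from two mice.
ids = ["m1", "m1", "m1", "m2", "m2"]
durations = [0.5, 0.75, 1.0, 2.0, 4.0]
summary = per_mouse_means(ids, durations)    # n = 2 mice, not 5 events
```

      Group comparisons would then be run on the values in `summary`. If event-level data are to be retained instead, a mixed-effects model with mouse as a random factor is the usual alternative.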

      Other comments (not necessarily minor):

      1) I agree that the pattern of activity is different in the Rehab group (presumably an effect of silencing the contralesional, healthy hemisphere). But, since it is also very different from the pattern of propagation in healthy control mice (or pre-stroke baseline), it is possible that this is also a pathologic pattern, not necessarily reflecting a "new functional efficacy" (line 358-9). The authors should comment on this possibility in the Discussion, namely that Rehab did not restore activity to a control pattern, but to a different pattern altogether. This will be easier once they analyze a BoNT/E control group in which mice are injected with BoNT but do not receive a stroke. This is a critical control that will allow the reader to determine whether the effects they see in the Rehab group reflect adaptive plasticity to restore functional connectivity, or simply disconnection from the silenced hemisphere.

      2) Regarding the standardized maps for cortical brain regions in Fig. 1, the authors should explain in more detail how the imaging fields of view (FOV) were superimposed and aligned to the contours; it is briefly described in terms of aligning to Bregma and Lambda, but more information would help if there is concern for animal to animal variability (being off by 3 pixels in any direction is >0.5 mm.) In Fig. 1d it looks like the imaging field of view is actually quite caudal, with very little motor cortex included. Is this a typical representation or was there some variability from animal to animal in the location of the imaging FOV? I recommend that the authors provide the exact location of the imaging FOV rectangle for each animal and an outline of where the PT stroke was located in the same figure. I would also recommend redrawing the contours that demarcate brain regions in Fig. 1c and d so that they do not appear so thick.

      3) I was surprised that spatiotemporal dynamics of the calcium signals did not change with learning the task; the authors suggest this is because mice learn the task so quickly (line 401-8). I wonder if, alternatively, the reason is because they don't learn at all (since they did not report significant differences across days in control mice in their 2019 paper) or because it doesn't require learning. The robot task extends the forelimb into an uncomfortable position and the mice may simply reflexively pull it back into a more comfortable resting position.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The reviewers were both very enthusiastic about the novelty and potential application of the calcium imaging technique. However, some major issues were raised that dampened their enthusiasm for the paper. The key issues were that essential controls are missing, key behavioral measurements of post-stroke recovery are not provided, and there are questions about both the statistics applied to the data and the sample size used in the experiments.

    1. Reviewer #3

      Overview and general assessment:

      In further untangling the organisation of occipitotemporal cortex (OTC), this paper attempts to explain, using behavioural and categorical models, the graded representations of images of animal faces and bodies, and objects (plants), in OTC and the face, body, and object-selective regions within OTC. The data suggest two main results. One, the representations in OTC seem to be (independently) related to an animate-inanimate distinction, a face-body distinction, and a taxonomic distinction between the images. Two, the representations in the face- and body-selective regions in OTC are related to the face/body images' similarity to the human face/body, respectively, as gauged with a behavioural experiment. This similarity to the human face/body subsumes the variance in face/body-selective OTC related to the authors' model of taxonomic distinction. These observations are used to suggest that the graded responses to animal images in OTC reported by previous studies (termed the animacy continuum in some cases) might just be based on animal resemblance to human faces and bodies rather than on a taxonomy. The claims, if valid, are a major addition to the ongoing discussion about the nature and underlying principles of the organisation of object representations in high-level visual cortex.

      There might be a multitude of issues, outlined below, with the way the observations are used to support the authors' claims. Addressing those issues might help reveal if the claims are indeed supported by the data which would be crucial in deciding whether to publish the current version of this paper.

      Main concerns:

      On "OTC does not reflect taxonomy" (line 390): Observations in Figure 4 suggest that the variance in face/body-selective OTC explained by the taxonomy RDMs is for the most part a subset of the variance explained by the human face/body similarity RDMs. This observation is used to suggest that "there is no taxonomic organisation in OTC" (line 423). Wouldn't such a statement be valid only if the taxonomy RDM did not explain any variance in OTC? Couldn't the observation that the variance it explains is also explained by human-similarity imply that the human-similarity is partly based on taxonomy? Also, the positive and strong correlation between the human-similarity RDMs and CNN RDMs in Figure 6 suggests that the human-similarity judgements reflect visual feature differences. However, how would you distinguish between the variance in the human-similarity RDM described by visual feature differences and by a more semantic concept such as taxonomy? Without disentangling these visuo-semantic factors (as done in Proklova et al. 2016 and Thorat et al. 2019) how could we be sure that OTC does not reflect taxonomy?
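
      The logic of this concern can be made concrete with a small variance-partitioning sketch for two predictors. It uses the standard closed-form R² for a two-predictor linear regression; the vectors below are hypothetical stand-ins for vectorised RDMs and are not the authors' data, and the function names are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partition_variance(neural, model_a, model_b):
    """Split the R^2 of neural ~ model_a + model_b into unique and shared parts."""
    r_a = pearson(neural, model_a)
    r_b = pearson(neural, model_b)
    r_ab = pearson(model_a, model_b)
    # Closed-form R^2 for a regression with exactly two predictors.
    r2_full = (r_a ** 2 + r_b ** 2 - 2 * r_a * r_b * r_ab) / (1 - r_ab ** 2)
    return {
        "unique_a": r2_full - r_b ** 2,
        "unique_b": r2_full - r_a ** 2,
        "shared": r_a ** 2 + r_b ** 2 - r2_full,
        "full": r2_full,
    }


# Hypothetical dissimilarity vectors (illustration only).
human_similarity = [1, 2, 3, 4, 5, 6]
taxonomy = [1, 1, 2, 2, 3, 3]      # coarser, but correlated with human_similarity
neural = list(human_similarity)    # extreme case: neural data equal one model

parts = partition_variance(neural, taxonomy, human_similarity)
taxonomy_alone = pearson(neural, taxonomy) ** 2
```

      In this constructed case the taxonomy vector has essentially zero unique variance, yet on its own it still explains most of the variance in the neural vector, all of it shared with the human-similarity vector. A vanishing unique contribution therefore does not by itself show that taxonomy is unrelated to the responses.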

      On "OTC does not represent object animacy" (line 434): Figure 2 suggests that the animacy RDM is related to the OTC RDMs, even after factoring out the face/body and taxonomy RDM contributions. The point raised in the above section also makes it harder to suggest that animacy (the semantic part) is not represented in OTC. While the studies mentioned in the discussion are part of the ongoing debate on whether animacy is indeed represented in OTC, such a definitive statement seems out of place in the discussion in this paper where the data do not seem to suggest the absence of animacy in OTC.

      On "Deep neural networks do not represent object animacy" (line 468): "trained DNNs plausibly do not represent either a taxonomic continuum or a categorical division between animate and inanimate objects" (lines 487-488). In Figure 5 there is a clear negative correlation with the animacy RDM for most of the CNNs i.e. a "categorical distinction". Other models are not factored out in Figure 5 to suggest that the animacy RDM contribution is not unique as the statement suggests. Also, the way the CNNs are trained, they are not fed explicit animacy information so whatever variance is related to animacy as quantified by the categorical/behavioural models suggests that those models might be capitalising on visual feature differences. As such, indeed, CNNs do not represent animacy – but then that is a trivial statement – it seems they do represent visual feature differences which can be associated with animacy.

      Minor comments:

      (lines 53-54) "These studies equate the idea of a continuous, graded organisation in OTC with the representation of a taxonomic hierarchy" This is false. For example, in Thorat et al. 2019 this equality was questioned by dissociating between an agency-based (which would be similar to taxonomy) hierarchy and a visual similarity hierarchy. The point about differential focus on faces or bodies for different animals is a valid point and requires further research to be elucidated.

      For the taxonomy model, is it appropriate that the assumed distance between the Mammal 1 class and the Mammal 2 class is the same as the one between the Mammal 2 class and the Birds class? Is this what we expect in OTC? In terms of Spearman correlations this assumption might be fine, but when the model contributions are partitioned using regression (e.g. in Figure 2) the emphasis shifts to the magnitude of the distances rather than their ranks. This assumption might run into a bigger problem when comparisons between the taxonomy model and human-similarity models are made. The human-similarity model seems to capture the differences with the Mammal 1 class which are collapsed into one measure in the taxonomy model. Might this difference underlie the observed results where the variance captured by the taxonomy model is subsumed by the variance explained by the human-similarity model?
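
      The difference between rank-based and magnitude-based comparisons can be seen in a short sketch. Two hypothetical taxonomy models with identical ordering but different spacing between classes give the same Spearman correlation with a hypothetical neural vector, while the Pearson correlation (which underlies linear regression) changes. All values and names here are illustrative assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical dissimilarities (illustration only).
neural = [0.1, 0.2, 0.5, 0.7, 0.9, 1.0]
taxonomy_equal_steps = [1, 2, 3, 4, 5, 6]
taxonomy_unequal_steps = [1, 2, 10, 11, 12, 30]  # same order, different spacing

spearman_equal = spearman(neural, taxonomy_equal_steps)
spearman_unequal = spearman(neural, taxonomy_unequal_steps)
pearson_equal = pearson(neural, taxonomy_equal_steps)
pearson_unequal = pearson(neural, taxonomy_unequal_steps)
```

      Because regression-based partitioning depends on the magnitudes, the choice of equal steps between taxonomic classes is a modelling assumption that can change the partitioned variance even when the rank correlations are untouched.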

      Would it be possible to acquire confidence intervals for the independent and shared variance explained by the 3 models in Figure 2 (and elsewhere where there is a similar analysis)? That might help us understand if the individual contribution of, say, the animacy model to OTC is robust. In the same vein, it might be good to indicate the robustness of the differences between the correlations of the different models with L/V-OTC in the figures.
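
      One common way to obtain such intervals is a percentile bootstrap over participants. The sketch below is only one possible approach, not the authors' analysis; the per-subject values are hypothetical, and the function name, resample count, and seed are assumptions.

```python
import random
from statistics import mean


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample subjects with replacement and record the mean of each resample.
    resampled_means = sorted(
        mean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = resampled_means[int((alpha / 2) * n_resamples)]
    upper = resampled_means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical unique variance explained by one model, one value per subject.
unique_variance = [0.02, 0.05, 0.01, 0.04, 0.03, 0.06, 0.02, 0.05]
low, high = bootstrap_ci(unique_variance)
```

      If the interval for a model's unique contribution excludes zero, that contribution can be described as robust across participants; the same resampling scheme could be applied to differences between model correlations.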

      (lines 181-182) "the taxonomic hierarchy is more apparent in VOTC-all, while the face-body division is also still clearly present" What is the significance of this distinction (also echoed in lines 222-223 after the face/body ROI analysis)?

      Across the animals, how correlated are the human-body similarity and human-face similarity RDMs? It seems that different sets of participants provided these two models. Is that the case? Are the correlations between the two models at the noise ceilings of each other? Is there any specificity of model type with ROI type, i.e. does the human-face similarity model correlate more with L/V-OTC face than with L/V-OTC body and vice versa for the human-body similarity model? Basically, how different are the two models?

      In Figure 4, what do the correlations of the mentioned models look like with L/V-OTC-object? While it is interesting to understand the graded responses in the face and body areas, it might be good to see if the human-face/body similarity models also explain the graded responses in the, arguably more general, object-selective ROIs. Of course, here the object-selective ROI would share a lot of voxels with the body- and face-selective ROIs and the results might be similar, but it might still make sense to add the object-selective ROI results as a supplemental figure to Figure 4. Also in Figure 1, it is clear that the 3 ROIs do not cover all of L/V-OTC. In making claims about the representations in OTC at large, would it be useful to also analyse L/V-OTC-all (or go further and get an anatomically-defined region) with the human face/body-similarity models?

      What is the value of the noise ceiling for VOTC-body in Figure 4B?

      Why might the animacy model be negatively correlated with the CNN layer RDMs?

    2. Reviewer #2

      The authors sought to reconcile three observations about the organisation of human high-level visual cortex: 1) the reliable presence of focal selective regions for particular categories (especially faces and bodies) 2) broader patterns of brain responses that distinguish animate and inanimate objects and 3) more recent findings pointing to organisation reflecting a taxonomic hierarchy describing the semantic relationships amongst different species. To this end, they conducted a well-designed and technically sophisticated fMRI study following a representational similarity approach, seeking to pull apart these factors via careful selection of stimuli and comparison of evoked BOLD activity with predicted patterns of (dis)similarity. This was complemented by an analysis comparing similarities of these models with the properties of the deeper layers of several deep neural networks trained to categorise images. The authors draw "deflationary" conclusions, to argue that models of OTC emphasising semantic taxonomy or animacy are unnecessarily complex, and that instead the most powerful organisational principle to account for extant findings is by reference to representations that are anchored specifically on the face and the body.

      1) In many ways, this study is designed as a response to a few specific previous papers on related topics, notably two by Connolly et al., and others by Sha et al and Thorat et al. One limitation of the paper is that it perhaps relies too much on knowledge of that previous work - for example, points about the "intuitive taxonomic hierarchy" that build on that work were not fully explicated in the Introduction and only became gradually clear through the ms. More seriously, I am concerned that the authors' conclusions depend on methodological differences with the other work. The authors focused their analyses on focal regions identified as face-, body-, or object-selective in localiser runs. Judging from Figure 1B, this generates a rather restricted set of regions that are then examined in detail with various RDM analyses. In comparison, some of the previous studies worked with much broader occipito-temporal regions of interest, and/or used searchlight methods to find regions with specific tuning properties without defining regions in advance. To put it more bluntly, the authors may have put their thumb on the scale: by focusing closely on regions that by selection are highly face or body selective, they have found that faces and bodies are key drivers of response patterns. So in this light I was confused by the section beginning at line 442 ("Based on this...") in which the authors seem to dismiss the possibility that animacy dimensions are captured over a broader spatial scale, but they have not measured responses at that scale in the present study. In sum: applied to wider regions of occipitotemporal cortex, the same approach might plausibly generate very different findings, complicating the authors' ultimate conclusions.

      2) I was not fully convinced by the inclusion of the DNN analyses. In contrast with the brain/behaviour work, this did not seem strongly hypothesis driven, but rather exploratory, and more revealing of DNN properties than answering the questions about human neuroanatomy that the authors set out in the introduction. Would this part of the study be better reported in more detail, in a different paper?

      3) Looking at Figure 1C - is it the case that each of these data-to-model comparisons is equally well-powered? The three models are not equally complex: the animacy and face-body models are binary, while the taxonomy model makes a more continuous prediction. Potentially, then, this sets a higher statistical bar for the taxonomy model than the others. That is, it is consistent with a narrower and more specific set of the space of possible results: the binary models essentially say "A should be larger than B" but the taxonomy model says "A should be larger than B, should be larger than C, etc.". If not taken into account, this difference might put the taxonomy model at an unfair disadvantage when compared directly against the other two.

      Minor Comments:

      The authors report a series of VOTC/LOTC "all" analyses, and also a series of analyses of the specific ROIs that compose these unified ROIs (e.g. face or body specific regions only). In that sense, these analyses are partly redundant to each other, rather than being independent tests. If I read this correctly, then this suggests that statistical corrections may be in order to account for this non-independence, and/or some tempering of conclusions that rely on these as being two distinct indexes of brain activity.

    3. Reviewer #1

      In this fMRI study, Ritchie et al. investigated the representation of animal faces and bodies in (human) face- and body-selective regions of OTC, testing whether animal representations reflect similarity to human faces and bodies (as rated by human observers) or a taxonomic hierarchy. Results show that similarity to humans best captures the representational similarity of animal faces and bodies in face- and body-selective regions.

      This is a well-conducted study that convincingly shows that animals' similarity to humans is important for understanding responses to animals in face- and body-selective regions. More generally, it suggests that previously observed selectivity to animals is (at least partly) driven by responses in known (human) face- and body-selective regions. These findings make a lot of sense in the context of earlier work. I was, however, a bit puzzled by the framing of the study and the interpretation of the results. I hope my comments are useful for revising the paper.

      Major comments:

      1) The study is framed around a couple of recent fMRI studies (most notably Sha et al., 2015 and Thorat et al., 2019) claiming that the animacy organization in visual cortex reflects a continuum rather than a dichotomy. The submitted study contrasts this claim with the alternative of a face-body division. The authors conclude that taking into account the face-body division explains away the proposed animacy continuum (here taken as taxonomic hierarchy) account. I had difficulty following this logic. There seem to be at least three separate questions here: 1) does the animacy organization reflect activity in face/body-selective regions, or are there animate-selective clusters that are different from known face- and body-selective regions? 2) assuming that animals activate known face- and body-selective regions, are responses in these regions organized along a human-similarity continuum? 3) what is the nature of this continuum - conceptual and/or visual? Could you clarify which questions your study addresses? See below for more explanation.

      2) One of the conclusions relates to the first question ("Our results provide support for the idea that OTC is not representing animacy per se, but simply faces and bodies as separate from other ecologically important categories of objects."). I am missing a review of previous work here: there is already strong evidence showing that the animacy organization is closely related to the face/body organization. For example, Kriegeskorte et al. (2008) showed that the animate-inanimate distinction is the top-level distinction in OTC, with the animate category consisting of face and body clusters (rather than human vs animal); see also Grill-Spector & Weiner (2014) for perhaps the leading account of how animacy and face/body selectivity may be hierarchically related. Furthermore, earlier work reported responses to animal faces and bodies in human face- and body-selective regions. For example, Kanwisher et al. (1999) found responses to animal faces "as might be expected given that animal faces share many features with human faces" and concluded: "Thus the response of the FFA is primarily driven by the presence of a face (whether human or animal), not by the presence of an animal or human per se.". Tong et al. (2000) reached similar conclusions. Similar findings were also reported for animal bodies in body-selective regions, with stronger responses to animal bodies (e.g. mammals) that are more similar to humans (Downing et al., 2001; Downing et al., 2006). Considering this literature (none of which is cited in the Introduction), it seems rather well established that the animacy organization is directly related to face/body selectivity, that animal faces/bodies activate human face-/body-selective regions, and that this activation depends on an animal's similarity to human faces/bodies. (More generally, visual similarity is well-known to be reflected in visual cortex activity, including in category-selective regions (e.g. work by Tim Andrews)). 
      It would be helpful if the current study is introduced in the context of this previous work so that it is clear what new insights the current study brings.

      3) Related to the second question, the current results provide convincing evidence for a human-similarity dimension. However, contrary to the claims of the paper, the continua proposed in Sha et al. and Thorat et al. would seem to predict a similar result, considering that these studies defined the animacy continuum in terms of an animal's similarity to humans: Sha et al.: "the degree to which animals share characteristics with the animate prototype-humans."; Thorat et al.: "the animacy organization reflects the degree to which animals share psychological characteristics with humans". To model this dimension, rather than assuming a 1-6 taxonomic hierarchy, participants could rate the animals' similarity to humans, as for example done in Thorat et al. You will likely find that these ratings correlate highly with the visual similarity ratings in the current study. The obvious problem is that animals that are similar to humans tend to share both conceptual and visual properties with humans. By the way: it would be relevant to discuss Contini et al. (2020) in the Introduction, as this paper similarly proposed a human-centric account.

      4) This brings us to the third question, whether "similarity to humans" is purely visual (i.e., image based) or whether conceptual similarity also contributes to explaining responses. Sha et al. could not address this question because their stimuli confounded the two dimensions. However, it was not clear to me that the submitted study can address this question any better, considering that the stimuli were not designed for distinguishing the two dimensions either: bodies/faces that are visually more similar to humans will belong to animals that are conceptually more similar to humans as well.

      5) The study is quite narrowly focused on debunking the taxonomy hierarchy supposedly proposed by previous studies. If this is the goal, you would need to stay close to these previous studies in terms of analyses and regions of interest. If not, it is hard to compare results across studies. For example, the abstract states that: "previous studies suggest this animacy organization reflects the representation of an intuitive taxonomic hierarchy, distinct from the presence of face- and body-selective areas in OTC." I'm not sure who made this claim, but if this was the claim that you want to test, wouldn't you need to look outside of face- and body-selective regions for this taxonomic hierarchy? Or if the study is a follow-up to Sha et al., then it would be useful to see their analyses repeated here, or at least present results in comparable ROIs. Alternatively, you could detach the research question from these studies and focus more on animal representations in face- and body-selective regions (after introducing what we know about these regions).

      Minor comments:

      1) The third paragraph of the Introduction mentions "these studies", but it is not clear which specific studies you refer to (the preceding paragraph cites many studies).

      2) Did you correct for multiple comparisons when comparing the models (e.g. p.10)?

      3) Could the human-similarity ratings partly reflect conceptual similarity? Might it not be hard for participants to distinguish purely visual properties from more conceptual properties? Perhaps the DNNs can be used to create an image-based human-similarity score?

      4) It was not entirely clear to me what the DNNs added to the study (which asks a question about human visual cortex). These are also not really introduced in the Introduction, and are only briefly mentioned in the Abstract. Was the idea to directly compare representations in DNNs to those in OTC?

      5) p.15: refers to Figures 6A and 6B instead of 4A and 4B
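
      Regarding minor comment 2 on multiple comparisons, one widely used correction when several models are compared is the Holm-Bonferroni step-down procedure. The sketch below is a generic illustration with hypothetical p-values; it is not a description of what the authors did, and the function name is an assumption.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    # Visit the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The threshold loosens as hypotheses are rejected: alpha/m, alpha/(m-1), ...
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values also fail
    return reject


# Hypothetical p-values from four pairwise model comparisons.
p_values = [0.01, 0.04, 0.03, 0.005]
decisions = holm_bonferroni(p_values)
```

      In this example all four comparisons fall below an uncorrected threshold of 0.05, but only two survive the correction.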

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The reviewers agreed that your paper reports a well-conducted study revealing several interesting results. However, they were ultimately not convinced that one of the main conclusions of the paper – the absence of an animal taxonomy – was sufficiently supported by the presented data, also considering the difference in analysis methods compared to previous studies. Furthermore, they noted that the reported results are somewhat incremental relative to earlier work reporting responses to animal faces/bodies in face-/body-selective regions.

    1. Reviewer #3

      PREreview of "Evolutionary transcriptomics implicates HAND2 in the origins of implantation and regulation of gestation length"

      Authored by Mirna Marinić et al. and posted on bioRxiv DOI: 10.1101/2020.06.15.152868

      Review authors in alphabetical order: Monica Granados, Katrina Murphy, Maria Sol Ruiz, Daniela Saderi

      This review is the result of a virtual, live-streamed journal club organized and hosted by PREreview and eLife. The discussion was joined by 17 people in total, including researchers from several regions of the world, the last preprint author, and the event organizing team.

      Overview and take-home message:

      In this preprint, Marinić et al. begin the beautiful exploration of gene involvement at the maternal-fetal interface of pregnancy evolution with a look at the importance of a known early-pregnancy gene, HAND2. The research team's findings shown through uterine models and a combination of cell, gene, and data analysis demonstrate HAND2's roles in supporting progesterone in placental mammals by down-regulating estrogen in time for implantation, and through IL15 signaling, where both the promotion of immune and placental cell migration as well as up-regulation of estrogen at the end of term for a healthy gestation length is noted. This important work also sheds some light on progesterone's role in non-placental mammal pregnancy where estrogen continues to be produced throughout the pregnancy. Although this work is an important addition to the field of pregnancy evolution, there are some points that need clarification and a few minor concerns that could be addressed in the next version. These are outlined below.

      Positive feedback:

      1) The selection of HAND2 as a hypothetical regulator of gestation was based on previous knowledge, but the authors supported this selection after an extensive phylogenetic analysis of genes expressed in the endometria of pregnant/gravid organisms from several Eutherian and non-Eutherian species.

      2) Several participants evaluated the results as encouraging for looking into other models such as organoids (as stated by the manuscript), and as a great start for a deeper understanding of pregnancy evolution via the study of gene expression.

      3) The potential implications of these results in the field of abnormalities in pregnancy/infertility were also mentioned as relevant.

      4) Definitely recommended for peer review because this is a great start for a deeper understanding of genes involved in the evolution of pregnancy!

      5) I think the fact that there could be a mechanism involved in HAND2 that ends gestation is really interesting.

      6) Cool to learn that HAND2 expression was specific to fibroblasts and the fibroblasts influence signaling in other cell types.

      7) A proposal of a new hypothesis based on "evolutionary" observations.

      8) Enjoyed learning from the author that a uterus is a counter-intuitive place with immune cells making up half the cells to allow for tolerance towards the pregnancy process.

      9) The methods section was quite detailed; including a GitHub repository and on page 17, a data availability statement for images, genes, and related data. I found the manuscript really interesting. Enjoyed it very much!

      10) In general, the manuscript was easy to follow and figures were logically arranged.

      Concerns:

      Areas that could use more clarification:

      1) It was helpful to hear from the author that the known HAND2 gene wasn't knocked out in mice, so it was an easy early pregnancy gene to start with.

      2) To reproduce the study, there were a couple of questions around the production of the conditioned media including, how long were the cells incubated in the media and what was the volume of the media used. Can more details be shared in the next version?

      3) Can you further explain why the opossum was used to measure the estrogen levels?

      4) Please explain why the researchers decided on the TPM=2 expression cut-off.

      -We heard from the author that genes with TPM less than 2 are functioning in the cell; this might be nice to add in the next version.

      5) Can you include your thoughts on why mammals have evolved this way? This might be a good addition to the discussion.

      6) I think that given the technical model limitations present in the study of the uterus, and in the study of different species, it would deserve some comments about limitations in order to highlight these great findings.

      7) The relationship between ESR1 and HAND2 is a little unclear. Is ESR1 expression correlated with HAND2 expression in all species studied?

      Acknowledgments: We thank all participants for attending the live-streamed preprint journal club. We are especially grateful for both the last author's contributions to the discussion and for those that engaged in providing constructive feedback.

      Below are the names of participants who wanted to be recognized publicly for their contribution to the discussion:

      Monica Granados | PREreview | Leadership Team | Ottawa, ON

      María Sol Ruiz | CONICET-University of Buenos Aires | Postdoctoral Researcher | Buenos Aires, Argentina

      Katrina Murphy | PREreview | Project Manager | Portland, OR

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Dear reviewers,

      Thank you very much for your constructive and helpful remarks and suggestions!

      We marked the changes in the manuscript in yellow.

      Our replies to the specific points:

      Reviewer #1 In the Introduction the authors need to cite earlier work in Chlamydomonas which first showed that binding of specific proteins to the psbA 5'UTR is correlated with increased translation in the light (Danon et al. 1991).

      As suggested, we added the reference to the introduction.

      Reviewer #1 The paper could be improved by testing for protein binding to the footprint region in high vs low light. An obvious candidate is HCF173.

      We agree that HCF173 is an obvious candidate, although its interaction could be mediated via additional proteins. Alice Barkan’s group has demonstrated that in maize HCF173 binds to the same region upstream of the translation initiation region (McDermott et al., 2019) where we detected a footprint (Supplemental Figure S11A-D). Furthermore, McDermott et al. showed that the binding sequence is conserved. We would like to analyze this question in more detail, but we currently have no approach available in the lab to specifically isolate psbA mRNA with its bound proteins for this analysis, and we therefore have to postpone the answer to this question to future studies.

      Reviewer #2: Important changes to make before full submission: 1) It is becoming clear that the translation efficiency (TE) is often not a calculation of translational output from specific mRNAs but is in fact better described as ribosome association. There can be many reasons for increased ribosome association including ribosome stalling and increased translational engagement. It would be good for the authors to add a simple Western blot to demonstrate directly increased protein output from psbA during high light as compared to low light treatments. This figure could be added to Figure S1.

      We want to stress that we have chosen a condition that is well known to increase psbA translation in higher plants as shown in the literature with different methods (e.g. Chotewutmontri and Barkan, 2018; Schuster et al., 2020). The protein encoded by psbA, the D1 subunit of photosystem II, has an increased turnover in high light, i.e. a higher amount of D1 has to be produced to compensate for the increased degradation of photodamaged D1 (Mulo et al., 2012; Li et al., 2018).

      Although there is a lot of evidence in the literature for good correlation of translation efficiency as determined by ribosome profiling and protein synthesis, the reviewer raised a valid concern. Ribosome pausing or even ribosome stalling could also cause increased ribosome binding and thereby increased amounts of ribosome footprints. Therefore, we analyzed ribosome pausing in selected genes including psbA and rbcL. The pattern of ribosome pausing was very similar in low and high light (new Supplemental Figure 14), which rules out any ribosome stalling at specific sites or drastic changes in ribosome pausing. To analyze if there is increased ribosome pausing, we determined the fraction of footprints at pause sites compared to the total number of footprints. We used two different pause scores as cutoffs to determine pause sites. To include as many pausing events as possible, we used a pause score of 1, i.e. everything higher than the mean ribosome density per nucleotide of the corresponding coding region (Gawronski et al., 2018). This fraction was unaltered in low and high light (new Supplemental Figure 14). With a more stringent pause score of 20 (20 times higher ribosome density than the mean), an increase of ribosome pausing in high light was detected for psbA, whereas we did not find differences between high and low light for rbcL and psaA. However, this increase in pausing at the psbA mRNA is insufficient to explain the increase in the total amounts of ribosome footprints. Additional pause scores were tested; the value for the psbA fraction with a pause score of 20, included in Supplemental Figure S14, showed the largest difference.
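
      The pause-site fraction described in this reply can be sketched as follows. This is a simplified reading of the description above (pause score = footprint count at a nucleotide divided by the mean count per nucleotide of the coding region; pause sites = positions whose score exceeds the cutoff) and is not the authors' pipeline. The function name and the count vectors are hypothetical.

```python
def pause_fraction(footprints, cutoff):
    """Fraction of all footprints that fall on pause sites.

    footprints: per-nucleotide footprint counts across one coding region.
    cutoff: minimum pause score (count / mean count) for a pause site.
    """
    total = sum(footprints)
    if total == 0:
        return 0.0
    mean_density = total / len(footprints)
    at_pause_sites = sum(
        count for count in footprints if count / mean_density > cutoff
    )
    return at_pause_sites / total


# Hypothetical per-nucleotide counts for one gene (illustration only).
low_light = [1, 1, 8, 1, 1, 1, 1, 1, 1, 4]
# Same profile with twice as many footprints everywhere.
high_light = [2 * count for count in low_light]

fraction_low = pause_fraction(low_light, cutoff=1)
fraction_high = pause_fraction(high_light, cutoff=1)
fraction_low_strict = pause_fraction(low_light, cutoff=3)
```

      Because the score is normalised to the mean density of the coding region, a uniform increase in ribosome occupancy leaves the fraction unchanged; only a redistribution of footprints toward specific sites would raise it.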

      Reviewer #2: Strongly suggested additions to the manuscript to improve its significance before publication: 1) Identifying the RNA-binding protein(s) (likely HCF173, which may be in a complex with other proteins) that interacts with the 5' UTR of psbA in a high-light-dependent manner would increase the significance of this study. Finding that this protein binds to other plastid transcripts with weak Shine-Dalgarno sequences would also be a nice addition to this study.

      See comment to reviewer 1. McDermott et al. (2019) describe HCF173 as relatively specific for psbA. Therefore, we do not assume that other genes with weak Shine-Dalgarno sequences are regulated via HCF173 but via different proteins using a similar molecular mechanism to influence the mRNA secondary structure at the translation initiation region.

      Reviewer #2: Strongly suggested additions to the manuscript to improve its significance before publication: 2) Mutational analysis of the RBP binding site, together with changes to the secondary structure around the start codon based on the new structure maps, to show the effects of these various changes on protein output would really provide important new findings on how important RBP binding is compared to the RNA secondary structure changes for regulating protein output from psbA. It could also allow the demonstration of the dependence or independence of these two features in regulating translation from chloroplast mRNAs.

      We agree with the reviewer that this would be a very interesting study. Unfortunately, it requires a larger collection of lines with mutated psbA sequences. Plastid transformation in Arabidopsis thaliana is still technically demanding and time consuming. Even in the case of Nicotiana tabacum, for which plastid transformation is well established, such a project would likely need several years. We therefore think that such a study is beyond the scope of the current manuscript.

      Reviewer #3: 1. In this paper, the authors mention that DMS can modify all four nucleotides under alkaline conditions. Because the chloroplast is slightly alkaline, the authors use DMS reactivity from all four nucleotides to model RNA secondary structure. A paper from Kevin Weeks's group shows that, under cell-free conditions, DMS has very limited ability to modify single-stranded G and U compared to A and C (Anthony M. Mustoe et al., 2019, PNAS 116: 24574, Fig. 1B). Lars B. Scharff's paper, which is cited by the authors, also mentions that A and C are more reliable for modeling RNA secondary structure. The authors might need to calculate the correlation between the DMS data and known RNA structure using G/U or all four nucleotides to show that DMS reactivity from G and U is also reliable enough to be used. Also, in Fig. S3B, the reproducibility of G/U between replicates is not as good as that of A/C. I don't think G and U can be used to predict RSS.

      We agree with the reviewer that DMS reactivities at G/U are less reliable than those at A/C. This was shown by Mustoe et al. (2019) and by us for chloroplast rRNAs (Gawronski et al., 2020, Plants). We included a correlation of the known 16S rRNA secondary structure and the DMS reactivities at the different nucleotides (Supplemental Figure S5A) that demonstrates that the DMS reactivities at G/U actually contain information about rRNA secondary structure. This analysis demonstrated again that the reactivities at G/U are less reliable than at A/C. Therefore, we added an analysis of the more reliable A/C for comparison with the results for all four nucleotides (Figure 1D-F, 3C-F).

      Reviewer #3: 2. Is the 5' UTR the only region that shows RSS changes? If not, how do RSS changes in other regions contribute to translation?

      Translation initiation in plastids is mainly influenced by the secondary structure of the translation initiation region, especially at the cis-elements required for the recognition of the start codon. In addition, we have analyzed several other regions, e.g. the coding regions, the coding regions without the sequences next to the start codon, the end of the coding region, and the complete 5’ UTR (Supplemental Figure S14). We added a more detailed analysis of the changes of secondary structure of the coding region of those genes we focus on (Supplemental Figure S16). This shows that the secondary structure changes of the complete coding region correlate negatively with translation efficiency (see also Supplemental Figure S14G). A similar observation was made in E. coli, where it was attributed to differences in translation initiation, which are mainly influenced by the secondary structure of the translation initiation region (Mustoe et al., 2018).

      Reviewer #3: 3. In Fig. 2A and 2B, the DMS reactivities seem very similar under low light and high light. Why did the authors obtain significantly different RNA secondary structures? Are the parameters for low light and high light the same when modelling RNA structure?

      The parameters for the RNA secondary structure predictions in Figure 2 are not identical (see Figure legend). For all structure predictions, the DMS reactivities were used as constraints, but only for the high light structure was the sequence of the RNA binding protein’s footprint forced to be single-stranded. These structure predictions are included to illustrate the mRNA structures in the presence and absence of an RNA binding protein. These structures are based on the observation that the two halves of the stem loop structure have different DMS reactivities in response to high light. The sequence including the protein footprint has lower DMS reactivities in both low and high light. This is in agreement with both a double-stranded sequence as well as a protein-bound sequence. In contrast, the other half of the stem loop, the sequence including the cis-elements of the translation initiation region, has increased DMS reactivities in high light, indicating that it is single-stranded. This suggests that there is protein binding in high light preventing the formation of the inhibitory stem loop.

      Reviewer #3: 4. In Fig. S12, the correlation between HL and LL in Ribo-seq and RNA-seq is high, which means there are no significant changes upon the light change. In this paper, psbA should show a translation change under high light conditions. I suggest that the authors label the dot representing psbA.

      Thank you very much for this suggestion! We marked psbA in the correlation plots (Supplemental Figure 12). The changes in the transcript levels are really minor, whereas for some genes the translation efficiency changes (see Figure 4 and Supplemental Figure S13).

      Reviewer #3: 5. I suggest using plants at the same stage for DMS-MaPseq and SHAPE probing.

      The different plant material was chosen because of the different requirements during probing. In this context, we would like to point out that observing the same changes in the translation initiation region in response to high light in different developmental stages is a stronger confirmation than observing the same response at the same developmental stage. This indicates that the response is not specific for a developmental stage.

      Reviewer #3: 6. In Huang's paper (Jianyan Huang et al., 2019, Cell Reports 29: 4186-4199), there are many differentially expressed genes after 0.5 h of high light. However, in the RNA-seq data here, the correlation between high light and low light conditions is very high (Fig. S12). Why? Also, it would be nice if the authors could label in Fig. S12 several DEGs whose expression changes under high light treatment.

      Supplemental Figure S12 contains only plastid-encoded RNAs, whereas Huang et al. (2019) focused on nuclear-encoded mRNAs. We clarified the figure legend of Supplemental Figure S12 by adding “of the plastid-encoded genes”. The values for the individual genes can be seen in Supplemental Figure S13.

      Reviewer #3: 7. For the MNase footprint method, is the as-SD region the only region showing enrichment under high light conditions? Besides, please provide the detailed method for the MNase footprint. Does it work for RNA footprinting?

      The methods used are described under “Ribosome profiling (Ribo-seq)” and “Processing of Ribo-seq and RNA-seq reads” in Material and Methods. The approach was very similar to the one used for ribosome profiling, with the difference that smaller read lengths were also included in the analysis (18-40 nt instead of 28-40 nt). We did this because many plastid RNA binding proteins have footprints that are smaller than a ribosomal footprint. The described footprint is the only one detected near the translation initiation region of psbA. Binding of HCF173 was detected by the Barkan group in the same region using a RIP-Seq analysis combined with RNase I digestion (McDermott et al., 2019), which confirms that our approach is working. We added a reference to the method section in the results part to clarify which approach was chosen.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      RNA can fold into secondary and tertiary structures through base-pairing. RNA structure plays a crucial role in gene function and regulation, including transcription, processing, translation and decay. Plants acclimate to fluctuating light conditions to optimize photosynthesis and minimize photodamage. Translational regulation is known to be one strategy of this acclimation. It has been reported that translation of psbA, encoding the D1 reaction center protein of Photosystem II, is increased under high light conditions. The light-controlled psbA translation has been intensively studied and was suggested to be related to redox/thiol signals, the ATP status, and certain proteins. In this manuscript, Gawroński et al. explored the possible link between RNA secondary structure and translational efficiency. They adopted DMS-MaPseq and SHAPE-seq methods to profile the RNA secondary structure in the 5' UTR of psbA under low light and high light conditions. The results showed that the DMS and SHAPE reactivities of the Shine-Dalgarno (SD) sequence, start codon and as-SD region are higher under high light conditions than under the low light control, indicating that the psbA translation initiation region becomes more single-stranded and accessible under high light conditions. MNase digestion and DMS reactivity analysis suggested that protein binding might cause the change of RNA secondary structure in the psbA translation initiation region. In addition, the authors probed the RNA secondary structure of the translation initiation region of rbcL, which encodes the large subunit of Rubisco, and found no change in the RNA structure of rbcL, although the translation of rbcL is also increased under high light conditions. To address the question of why RNA structure changes are related to the high-light-dependent translational activation of psbA but not rbcL, plastome-wide translational efficiency and RNA structure were analyzed. The results showed a significant correlation between RNA secondary structure changes and translational efficiency changes in the chloroplast-encoded mRNAs with weak SDs (such as psbA), but not in those with strong SDs (such as rbcL).

      The light-dependent translational activation of psbA is critical for maintaining photosynthetic homeostasis. Also, the molecular mechanism by which RSS affects translation is still elusive. The topic of this study is very important. However, this study only describes the phenomenon of RNA secondary structure changes in the translation initiation region, but does not give further evidence to validate the effect of these changes on the translational activation of psbA under high light conditions. Besides, the evidence that protein binding causes the RNA structure changes is weak and unclear. In addition, there is much room for improvement in this work.

      1. In this paper, the authors mention that DMS can modify all four nucleotides under alkaline conditions. Because the chloroplast is slightly alkaline, the authors use DMS reactivity from all four nucleotides to model RNA secondary structure. A paper from Kevin Weeks's group shows that, under cell-free conditions, DMS has very limited ability to modify single-stranded G and U compared to A and C (Anthony M. Mustoe et al., 2019, PNAS 116: 24574, Fig. 1B). Lars B. Scharff's paper, which is cited by the authors, also mentions that A and C are more reliable for modeling RNA secondary structure. The authors might need to calculate the correlation between the DMS data and known RNA structure using G/U or all four nucleotides to show that DMS reactivity from G and U is also reliable enough to be used. Also, in Fig. S3B, the reproducibility of G/U between replicates is not as good as that of A/C. I don't think G and U can be used to predict RSS.

      2. Is the 5' UTR the only region that shows RSS changes? If not, how do RSS changes in other regions contribute to translation?

      3. In Fig. 2A and 2B, the DMS reactivities seem very similar under low light and high light. Why did the authors obtain significantly different RNA secondary structures? Are the parameters for low light and high light the same when modelling RNA structure?

      4. In Fig. S12, the correlation between HL and LL in Ribo-seq and RNA-seq is high, which means there are no significant changes upon the light change. In this paper, psbA should show a translation change under high light conditions. I suggest that the authors label the dot representing psbA.

      5. I suggest using plants at the same stage for DMS-MaPseq and SHAPE probing.

      6. In Huang's paper (Jianyan Huang et al., 2019, Cell Reports 29: 4186-4199), there are many differentially expressed genes after 0.5 h of high light. However, in the RNA-seq data here, the correlation between high light and low light conditions is very high (Fig. S12). Why? Also, it would be nice if the authors could label in Fig. S12 several DEGs whose expression changes under high light treatment.

      7. For the MNase footprint method, is the as-SD region the only region showing enrichment under high light conditions? Besides, please provide the detailed method for the MNase footprint. Does it work for RNA footprinting?

      Significance

      see above

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study uses multiple high-throughput sequencing approaches to probe the secondary structure of the chloroplastic psbA mRNA during low and high light treatments. The authors are able to demonstrate a shift in secondary structure around the start codon of this mRNA in response to the high light treatment as compared to low light conditions. This structural shift is also accompanied by an RBP binding event that may also be involved in regulating translation from this mRNA in response to high light. I think this study is very interesting and timely. However, I think determining the relative contributions of the secondary structure and RBP binding changes to potential increases in protein output from this mRNA in response to high light would improve this manuscript. I also think that directly looking at protein levels through a straightforward Western blot to show increased psbA protein in response to high light treatment is an important addition to this study. I outline my few suggested experimental additions for this manuscript below.

      Important changes to make before full submission:

      1) It is becoming clear that translation efficiency (TE) is often not a measure of translational output from specific mRNAs but is in fact better described as ribosome association. There can be many reasons for increased ribosome association, including ribosome stalling and increased translational engagement. It would be good for the authors to add a simple Western blot to demonstrate directly increased protein output from psbA during high light as compared to low light treatments. This figure could be added to Figure S1.

      Strongly suggested additions to the manuscript to improve its significance before publication

      1) Identifying the RNA-binding protein(s) (likely HCF173, which may be in a complex with other proteins) that interacts with the 5' UTR of psbA in a high-light-dependent manner would increase the significance of this study. Finding that this protein binds to other plastid transcripts with weak Shine-Dalgarno sequences would also be a nice addition to this study.

      2) Mutational analysis of the RBP binding site, together with changes to the secondary structure around the start codon based on the new structure maps, to show the effects of these various changes on protein output would really provide important new findings on how important RBP binding is compared to the RNA secondary structure changes for regulating protein output from psbA. It could also allow the demonstration of the dependence or independence of these two features in regulating translation from chloroplast mRNAs.

      Significance

      This study definitely focuses on a research topic that is currently of interest and highly timely.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript addresses the regulation of chloroplast translation, an important topic in chloroplast biology. The authors show that specific changes in the secondary structure of the 5'UTR of the psbA mRNA involving the Shine-Dalgarno sequence and the AUG initiation codon can be correlated with changes in translational efficiency during a low light to high light shift. Based on indirect evidence they propose that this may be caused by binding of specific proteins to this region. They also show that this correlation appears to be valid to some extent for other mRNAs with a weak SD sequence. The technical quality of this manuscript is excellent and the manuscript is clearly written.

      Additional remarks

      In the Introduction the authors need to cite earlier work in Chlamydomonas which first showed that binding of specific proteins to the psbA 5'UTR is correlated with increased translation in the light (Danon et al. 1991). The paper could be improved by testing for protein binding to the footprint region in high vs low light. An obvious candidate is HCF173.

      Significance

      This work provides valuable new insights into the molecular mechanisms involving the psbA 5'UTR in the initiation of chloroplast translation.

      This work will be of interest to a wide audience interested in the mechanisms of translational regulation.

      My expertise is in chloroplast biogenesis and in assembly and regulation of the photosynthetic apparatus