  1. Nov 2020
    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Reijns and colleagues describe an approach to detect the causative agent of COVID-19, the beta coronavirus SARS-CoV-2, using an inexpensive in-house multiplex RT-qPCR. Concomitantly, viral E, N and RdRP (probe P2) as well as human RPP30 and a herpesvirus nucleic acid are detected in order to monitor both the sample quality and the sample preparation. Reijns et al. performed testing on a very large number of samples and used the data to describe the strengths and limitations of the assay. The data are sound and give a very good impression of the 4-plex PCR capabilities. The manuscript reads fluently and I consider it linguistically very good. However, I still have a few comments and remarks that would strengthen the manuscript:

      Major issues:

      In the first section of the results, many primer/probe conditions are given, which disrupts the reading flow. Instead of using (data not shown) it would be helpful to use a table or a graphic to illustrate the various approaches. In general, I suggest replacing Ct with Cq, since the IVT standards are a quantification method.

      There has already been a change away from the initial E and RdRP gene based assay because of the published sensitivity issues and the use of degenerate bases (as well as the detection of unspecific nucleic acids for the E gene). In particular, it has been shown that the Sarbeco-E assay yields false positive results (Toptan et al. 2020 (https://doi.org/10.3390/ijms21124396), Konrad et al. 2020 (https://doi.org/10.2807/1560-7917.ES.2020.25.9.2000173)), so that many laboratories no longer consider E-gene-based results for borderline samples. In this manuscript, the authors should comment on why they still use the results from the E gene / RdRP and describe their experience.

      In this manuscript, it should be indicated that the SARS-CoV-2 specific Probe P2 (according to Corman et al. 2020) was used. The reason for lower sensitivity due to nucleotide ambiguity and mismatch has to be explained in more detail. In addition to Corman et al. 2020 (see reference 2), Toptan et al. 2020 (https://doi.org/10.3390/ijms21124396) might serve as helpful literature. With regard to the marginally positive samples that were not consistent in all assays, were the PCR products analyzed using high-resolution PAA gels and, if possible, sequenced? The sequencing approach (Sanger or NGS) offers the final characterization of the PCR products (especially for pan-genotypic primers such as E-Sarbeco). The samples declared as "inconclusive" could be further characterized in this way.

      The normalization in Figure 3 should also be explained in the main text, in particular why this approach was used for normalization. Nonetheless, it looks like the normalized values cluster much more strongly than the actual values. The authors should comment on this phenomenon. It appears that the higher Cq values (less virus) are subject to a strong correction factor more often than the lower values. Are there any statistically relevant tendencies towards this phenomenon? For everyday clinical practice, does this mean that low sample Cqs (mostly) only reflect the quality of the sample, but not the viral load? Finally, it remains somewhat unclear to what extent the Cq values of RPP30 should influence routine diagnostics. The authors discuss that a fixed cutoff value would be a possibility to sort out poor swab samples, but if a Cq value is available it would also make sense to generate a kind of quality score that can display the significance of a test. It would be helpful if the authors could comment on this or other possibilities.
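
      To make the two ideas in this comment concrete, the sketch below shows one common way to normalize a viral Cq against a reference gene, and one possible form of a quality score. Both are assumptions for illustration only: the manuscript's actual normalization is not described here, and the function names, the median-based reference and the cutoff values (22 and 35) are hypothetical.

```python
from statistics import median


def normalize_cq(viral_cq, rpp30_cq):
    """Shift each viral Cq by that sample's RPP30 deviation from the cohort median.

    Assumption: a sample whose RPP30 Cq is above the median (fewer human cells
    collected) has its viral Cq lowered by the same amount, and vice versa.
    """
    reference = median(rpp30_cq)
    return [v - (r - reference) for v, r in zip(viral_cq, rpp30_cq)]


def quality_score(rpp30_cq_value, best=22.0, cutoff=35.0):
    """Map an RPP30 Cq onto a 0..1 score (hypothetical, linear, clipped).

    1.0 means the reference gene was detected at or below `best`;
    0.0 means it was at or beyond the rejection `cutoff`.
    """
    score = (cutoff - rpp30_cq_value) / (cutoff - best)
    return max(0.0, min(1.0, score))


if __name__ == "__main__":
    viral = [30.0, 30.0, 30.0]
    rpp30 = [25.0, 27.0, 29.0]
    print(normalize_cq(viral, rpp30))
    print([quality_score(r) for r in rpp30])
```

      A continuous score of this kind would let a report flag a negative result from a poor swab without discarding it outright, which a fixed cutoff cannot do.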

      Over the past few months, more and more virus subtypes have formed through the manifestation of point mutations (and amino acid substitutions). The authors should therefore definitely comment on the current strains as to whether all primers/probes are able to detect the virus variants circulating worldwide without loss of sensitivity. Along this line, which virus strains were used for the cultivation as described in line 131? Is sequence data available? If so, it would provide helpful information to characterize the viral strain.

      Line 206ff: In my opinion, this section belongs more to the discussion part than to material and methods that describe the technical implementation.

      Is there a loss of sensitivity compared to the single PCRs? These data are very important and useful for other users. They should therefore be included explicitly in the manuscript (supplements).

      Minor issues:

      Line 15 ff.: Source is missing, is this WHO-data?

      Fig S3: How was the digital droplet PCR carried out? A brief description should be included in the legend text.

      Figure 1a: PCR efficiencies are missing.

      Line 145: MS2 appears, but without explaining the context. This should be improved here with additional information (this does not appear until line 154).

      Page 15: H2O instead of H20, reaction mix instead of Reaction mix.
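
      For the PCR efficiencies requested for Figure 1a, the usual calculation takes the slope of the standard curve (Cq against log10 of input copies) and applies E = 10^(-1/slope) - 1. The sketch below assumes a simple least-squares fit; the function names and the dilution series are hypothetical and are not taken from the manuscript.

```python
import math


def standard_curve_slope(log10_copies, cq):
    """Least-squares slope of Cq versus log10(input copies)."""
    n = len(cq)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_copies, cq))
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    return sxy / sxx


def pcr_efficiency(slope):
    """Amplification efficiency from the standard-curve slope.

    A slope of about -3.32 corresponds to perfect doubling per cycle (E = 1.0).
    """
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical ten-fold dilution series with ideal doubling per cycle.
    log10_copies = [1.0, 2.0, 3.0, 4.0, 5.0]
    cq = [36.0 - math.log2(10) * (x - 1.0) for x in log10_copies]
    slope = standard_curve_slope(log10_copies, cq)
    print(round(slope, 3), round(pcr_efficiency(slope), 3))
```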

      Significance

      The novel coronavirus SARS-CoV-2 is the causative agent of the acute respiratory disease COVID-19, which has become a global concern due to its rapid spread and high death rate. While some patients have no symptoms at all, but are still able to spread the virus, others have severe symptoms, often with fatal outcome. The gold standard in SARS-CoV-2 detection is the RT-qPCR approach; however, the high-cost commercial kits are available in limited amounts only. The scarcity of resources is still a highly important issue, especially given the rapidly increasing number of cases worldwide. Thus, the manuscript is timely and of significance for the field. In particular, diagnostic laboratories in low-income countries that are involved in managing the pandemic, but also researchers, will benefit from this manuscript and save resources.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      In order to improve SARS-CoV-2 diagnostics, Reijns et al. developed a multiplexed RT-qPCR protocol that allows simultaneous detection of two viral genes, one housekeeping gene as well as an external gene as an extraction control. Compared to running parallel assays to detect genes individually, the turnaround time is much shorter and reagents are saved. Furthermore, the presented data suggest that the assay is more sensitive than commercial kits. The authors also propose the detection of the human housekeeping gene as a measure of sample quality control. In principle, this work has potential but the manuscript itself needs a better structure.

      Major concerns:

      The authors have used the Takara RT-qPCR kit for their study. Did the authors try other commercial kits? Can the authors elaborate on the supply chain of the Takara kit? Could it cover population testing in case of shortages of other commercial kits?

      For better comparison, is it possible to give information on which primers the commercial kits are based on? Also, explain better the primers used in this study. For example, the N1 and N2 primers are directed against different regions of the SARS-CoV-2 N gene.

      The result section needs a better structure as the first two pages do not refer to any of the main figures. For example, in which figure or table can the reader find the data that are discussed in lines 83 to 87?

      Table S1, instead of current Table 1, could be moved to main figures as it contains the important finding that the multiplexed assay may be more sensitive than the commercial one. The authors identified some samples that scored negative in commercial assays but positive in their new assay. This is important, however, the possibility of detecting false positives should be strengthened in a "Discussion" section.

      Figures 1 to 3 have different panels which seem to be redundant. For example, Fig 1 A and B, Fig 2 B and C, Fig 3 C and D.

      Figure 1: Give a rationale for comparing before and after extraction. This heavily depends on the extraction method and not on the detection itself. In addition, IVT RNA does not reflect the complexity of a clinical specimen. This is rather confusing and distracts from the important findings.

      Figure 3: Did any of the negative samples/patients with an undetectable housekeeping gene re-test positive? Did adding this housekeeping gene as a control actually improve the detection of any patient samples? If the authors want to convince the readership of this quality control, experimental evidence should be provided.

      Fig 3C and D seem to contain this information somewhat, as here, the values were normalized and the CT values for the E and N gene decreased. Nevertheless there is no real explanation of this figure provided in the Result section at all. While this figure has potential, the authors have to keep in mind that the number of cells in a swab can be affected by many biological factors, including age, sample timing, inflammation of the respiratory tract, etc. In addition, viral genomes can exist intra- as well as extracellular, in the form of free virus. So even in the absence of human cells/detectable housekeeping genomes, viral RNA can be or should be present in a sample in case of infection. This explains (probably) why a correlation between detectable housekeeping gene and viral RNA is absent (Fig 3A and B?). This entire Fig 3 just needs a better explanation. The provided text does not describe any results and should go into a "Discussion" section.

      Self-swabbing is surely a potential source of variability and false-negatives, but many publications have shown the suitability of saliva testing. This should also be discussed and would probably negate the need for such a quality control.

      Which assay works better, the N1E-RP or the N2E-RP assay? A final conclusion is missing here.

      Significance

      Naturally, in this pandemic, this topic is important as sensitive and affordable methods to detect SARS-CoV-2 infections are in need. This Reviewer agrees that multiplexing could be an elegant approach to fill this need.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      We thank the reviewers for their feedback and constructive comments on our work. We provide here a point-by-point response to the comments of Reviewers #1, #2 and #3 (text in grey and italic).

      Responses written in plain text correspond to Reviewer comments that have been addressed in the revised version of the manuscript provided at this stage of the review process (referred to as “revised version I” below).

      Responses written in bold text correspond to comments that need further experiments. The list of experiments we intend to perform to address these comments is provided in a separate document (Revision plan). The results of these additional experiments will be included in a later revised version of the manuscript, referred to as “revised version II” below.

      Reviewer #1

      The manuscript addresses an important topic, the posttranscriptional maturation of ribosomes. This topic is inherently interesting because we normally think of ribosome biogenesis as a sequential series of steps that automatically proceeds and cannot be "accelerated" in physiological conditions, but only "delayed" in the presence of genetic mutations. In short, the manuscript proposes that RIOK2 phosphorylation by the action of RSK, below the Ras/MAPK pathway promotes the synthesis of the human small ribosomal subunit.

      I honestly admit that I have some difficulties in reviewing this manuscript. The quality of the presented data is, in general, good. However, overall I find the whole manuscript preliminary and I am not much convinced of the conclusions. Several aspects are superficially analyzed. In short, I think that most of the conclusions are not fully supported by the data because shortcuts are present. All the aspects that I found wrong are listed below.

      Biological issue

      1. The authors claim that the effects of the inhibition of the maturation of ribosomes by acting on a pathway upstream of RIOK2 are limited to the 40S subunit. This is far from being a trivial point, for the following reason. RIOK2 is known to affect the maturation of 40S ribosomes. Hence, the fact that using an upstream inhibitor of the MAPK pathway such as PD does not inhibit 60S processing in reality would argue against a biologically relevant control in ribosome maturation (of the MAPK pathway). Have the authors considered this? In a way, also, given the fact that the mutants confirm a role in 18S final maturation, it is a bit complex to put all the data in a clear biological context.

      We agree that we put more emphasis on the effects on the pre-40S pathway than on the pre-60S pathway in the original manuscript but we did not claim that the effects of PD or LJH inhibitors of the MAPK pathway are restricted to the 40S subunit. We described that the effect of PD or LJH on the 32S was less severe than on the 30S, and we did mention variations of the 12S intermediate. These changes are in the same range of amplitude as the changes in the 21S and 18S-E intermediates in the small subunit pathway. The Northern blot data concerning the pre-60S pathway were placed in the supplementary material of the original manuscript, which may have left the reader with an impression of lesser emphasis. We rephrased this part in the present revised version I of the manuscript (Page 6, Line 26) and we now show the pre-40S and pre-60S intermediates on the same figures (Figures 1A and 1C).

      In addition, we will probe more exhaustively the intermediates of the pre-60S pathway in the revised version II of the manuscript as described in the revision plan. These data will be complemented with metabolic labeling experiments to provide a more dynamic analysis of the pre-rRNA processing defects resulting from inactivation of the MAPK pathway. Furthermore, as requested by Reviewer #2 (see below), we will quantify more accurately these data.

      A number of specific issues will be concisely described.

      Manuscript very well written. Data do not always support the strong conclusions. Low magnitude of the observed effects.

      In the introduction the authors make a general claim that ribosome biogenesis is one of the most energetically demanding cellular activities. This statement has lingered in the literature for 15 years but in reality it has never been formally proven for mammalian cells, and certainly not for HEK293 cells. The original statement, to my knowledge, can be traced to some obscure statement referring to the yeast case and then repeated as a truth. In conclusion, besides being a very banal observation, it should be referenced.

      We agree with this comment of Reviewer #1. The original statement was proposed by Jonathan R. Warner (Warner, 1999, TiBS and references therein) and data from the Bähler group also supported this statement (Marguerat et al., 2012, Cell). However, these data were indeed referring to yeast (S. cerevisiae and S. pombe). In the present revised version I of the manuscript, we introduced the reference of a review providing quantitative data on ribosome biogenesis in human cells (Lewis & Tollervey, 2000, Science) and we modified the problematic sentence as follows: “Growing human cells produce around 7500 ribosomal subunits per minute (Lewis and Tollervey 2000), which represents a significant expenditure of energy.” (Page 4, Line 1).

      Growth factors, energy status are not cues but are proteins or metabolites (introduction).

      We agree with this comment of Reviewer #1. We changed the text accordingly in the revised version I of the manuscript (Page 4, Line 8).

      Authors write about mTOR without making statements on mTORC1/2. This is very obsolete. Also I am not sure that the choice of Geyer et al., 1982, and subsequent papers makes much sense. At the very minimum TOP mRNA concepts and mTORC1 must be defined.

      We provide more details on the mTOR pathway in the revised version I of the manuscript according to Reviewer #1’s suggestions (Page 4, Line 13 and Page 5, Line 3).

      The authors claim that their work fills a major gap between known functions of MAPK and cytoplasmic translation. I would not be so sure about it.

      Our original sentence stated that “our work fills a major gap between currently known functions of MAPK signaling in Pol I transcription and cytoplasmic translation”. Indeed, although MAPK signaling was known to regulate Pol I transcription and cytoplasmic translation, the impact of the pathway on the post-transcriptional steps of ribosome synthesis, namely pre-ribosome assembly and maturation, has been very little investigated and remains poorly understood. Our data provides the first example of a detailed mechanism of regulation of the maturation of pre-ribosomal particles by the MAPK pathway. Reviewers #2 and #3 seem to agree with this point:

      Reviewer #2: “However, there is a lacking mechanistic connection of signaling pathways to pre-rRNA processing and maturation steps of ribosome biogenesis. The authors set out to provide a specific example of a direct target of MAPK signaling, RSK that regulates pre-rRNA maturation through the phosphorylation of a ribosome assembly factor (RIOK2), offering for the first time providing mechanistic insight into MAPK regulation of pre-rRNA maturation.”

      Reviewer #3: “With these provisos, the work is technically good and will be of considerable interest to the field. The post-transcriptional regulation of ribosome synthesis is increasingly recognized a significant topic.”

      Results. Authors start with a major mistake, i.e. that PMA selectively stimulates the MAPK pathway. Perhaps it stimulates, certainly it does not do it selectively.

      We agree with this comment of Reviewer #1. We removed the term “selectively” in the problematic sentence (Page 6, Line 8).

      RIOK2 phosphosites are first found by bioinformatics analysis. It should be noted that the predicted phosphosite (S483) is found only in a limited set of datasets from MS databases. The actual importance of this site would not emerge from unbiased studies. Also, there are many other phosphosites that were not analyzed in this study.

      We agree with Reviewer #1 that phosphorylation of S483 of RIOK2 has been detected in a limited number of mass spectrometry datasets, but these datasets have been reported in high-impact journals (Nature Methods, Mol Cell Proteomics, Science), attesting to the quality of these studies.

      As mentioned by Reviewer #1, there are several other phosphosites within RIOK2 that were not analyzed in our study. We provided the list of these phosphosites in Supplementary Table S1 of the original manuscript. Besides T481 and S483, none of the other sites belong to consensus motifs recognized by ERK or RSK at medium and high stringency. They are therefore less relevant to our study. We only analyzed phosphorylation at S483 because: (i) our mass spectrometry analysis revealed that S483 is the only phosphosite in RIOK2 whose level increases upon MAPK activation but not in the presence of the MAPK inhibitor PD184352 (Figure 2B); (ii) our in vitro kinase assay showed that the phosphorylation level of RIOK2 by RSK is residual when S483 is replaced by a non-phosphorylatable alanine (Figure 3D); (iii) our data presented in Figure 2C further show that mutation of T481 to an alanine does not prevent RIOK2 phosphorylation on RxRxxS/T motifs upon stimulation of the MAPK pathway.

      We clarified this point in the relevant part of the result section of the revised version I of the manuscript (Page 7, Lines 16 and 24, Page 8, Line 17 and Page 9, Line 5).

      Throughout the paper the authors use the word strongly, significantly, but the actual effects seem in general quite marginal.

      We agree with Reviewer #1 that some of the phenotypes described in the manuscript are modest, in particular the phenotypes resulting from the S483A mutation of RIOK2, which is not unusual for a point mutation. We rephrased several sentences throughout the manuscript to soften the formulation in the description and interpretation of the data and in the conclusions.

      Discussion. The authors claim that they provide solid evidence on MAPK signalling to ribosome maturation. At the very best this is circumstantial evidence for the 40S maturation.

      We rephrased the sentence accordingly (Page 16, Line 5): “Our study provides evidence that MAPK signaling applies another level of coordination during ribosome biogenesis, by directly regulating pre-40S particle assembly and maturation.”

      Figure 1.

      Unclear why LJH should increase P-ERK.

      A negative feedback loop has been described in the MAPK pathway whereby RSK activation partially inhibits ERK phosphorylation (Saha et al., 2012, Horm Metab Res; Dufresne et al., 2001, MCB; Schneider et al., 2011, Neurochem; Re Nett et al., 2018, EMBO Rep). Inactivation of RSK with LJH alleviates this inhibition, which results in increased phosphorylation levels of ERK.

      We added this information in the revised version of the manuscript along with the corresponding references (Page 6, Line 17).

      General lack of quantitation (sd, replicates, bars). Experiment done only on a single cell line in a single experimental setup.

      As also requested by Reviewer #2 (Major comment 1.), we applied RAMP quantifications to all Northern blot data in the revised version I of the manuscript. We included error bars corresponding to biological replicates.

      Furthermore, in order to validate the impact of the MAPK pathway on pre-ribosome assembly and maturation, we plan to perform the same experiments using PD inhibitors in different cell lines and we will provide a figure with accurate RAMP quantifications, error bars and statistical significance, in the revised version II of the manuscript (see revision plan).

      Very different effects on 21S by LJH, PMA and siRNA for RIOK2. Overall the message given by the authors is to me mysterious.

      We assume that the reviewer wanted to point out the difference between PMA, PMA+LJH and shRNA for RSK, since we did not perform RNAi targeting RIOK2. We agree with this comment. We believe that this difference is likely due to the different experimental setups of the two experiments. In the experiment using inhibitors, we assessed short-term effects of RSK inhibition after acute stimulation of the MAPK pathway (starved cells stimulated with PMA), while in the experiment using shRSK, we monitored long-term effects of RSK depletion in serum-growing cells in which other signaling pathways are also active. Prolonged RSK depletion is likely to induce pleiotropic cellular effects, which would interfere with ribosome biogenesis both directly and indirectly. These differences probably explain the variable effects on the 21S intermediate. However, in both experiments we do observe an accumulation of the early 30S intermediate, consistent with the phenotype observed when ERK is inactivated (PD inhibitor), therefore indicating that RSK regulates some post-transcriptional stages of ribosome biogenesis.

      To make our results clearer we have withdrawn the experiments using shRSK to avoid the risk of showing indirect effects due to the prolonged absence of RSK. Instead, we included RAMP analyses with error bars from 2 biological replicates using PD and LJH inhibitors (Figure 1B).

      Figure 2.

      Several red flags. For instance, in 2C the loaded levels of RIOK2-HA are clearly less than the ones of the other genotypes, hence the conclusion on P-RIOK2 is not convincing.

      Our aim in this experiment was to compare the impact of PMA treatment on the phosphorylation levels of different RIOK2 mutants (T481A, S483A, double mutant). For a given mutant, the levels of RIOK2 loaded in the two conditions (i.e. not stimulated and PMA stimulated) are very similar and we therefore assume that our conclusions are valid.

      We nevertheless plan to repeat these experiments and quantify the data for the revised version II of the manuscript.

      Staining with anti-P RIOK2 lacks controls; how can we be sure that the signal is due to the phosphate? Phosphatase treatment?

      We fully agree with Reviewer #1 and we did perform an experiment showing that the phosphorylation signal disappears following treatment of the protein extracts with λ-phosphatase. We did not show these data in the original version of the manuscript because of space limitations. We added these data in the supplementary material of the revised version I of the manuscript (Supplementary Figure S2B) and amended the text accordingly (Page 7, Line 24).

      Why FBS does not lead to ERK staining in HEK293? There are plenty of growth factors in FBS that should lead to ERK phosphorylation. I do not understand this experiment.

      We agree with this comment. Addition of serum to starved cells does lead to ERK and RSK phosphorylation but with much lower efficiency compared to stimulation by EGF and PMA. ERK phosphorylation is barely visible on the exposure shown in Figure 2D but RSK phosphorylation is clearly observed, although the signal is much weaker compared to EGF and PMA treatments. It is common to observe a stronger response with purified PMA and EGF (see Carrière et al., 2011, JBC; Ray et al., 2013, Oncogene). There are indeed several growth factors in the serum, but the most abundant (Insulin, IGF1, TGF) are present at ng/ml concentration, while EGF is used at 25 µg/ml in Figure 2D. Moreover, they are not very strong activators of the Ras/MAPK pathway, and it is also possible that after 20 min of FBS treatment the phosphorylation is in the decreasing phase.

      In the present revised version I of the manuscript, we included a set of western blots from another experiment showing the same results but of better quality to make the effects more visible (Fig. 2D). We also provided quantifications of phosphorylation of RIOK2 and associated statistical analyses (Fig. 2E).

      Figure 3. In vitro phosphorylation, if I understood, it relies on a truncated version of RIOK2. Why? Is the folding of the full length protein not permissive to in vitro phosphorylation?

      We did not test phosphorylation of the full length RIOK2 protein in vitro because RIOK2 has been reported to auto-phosphorylate (Zemp I. et al., 2009, JCB) and we were concerned that this auto-phosphorylation activity of RIOK2 in addition to RSK phosphorylation may render this experiment inconclusive.

      HA-RSK3 is less?

      It was reported that RSK3 is insoluble when over-expressed (Zhao et al., 1996, JBC), which explains the lower levels of protein recovered in our soluble extract. The information was present in the legend of Figure but we transferred it to the main text of the result section in the present revised version I of the manuscript (Page 10, Line 3).

      Figure 4. Immunofluorescence is low mag, difficult to understand.

      We agree with Reviewer #1. We modified the FISH experiment figure to show cells with a higher magnification and we provided more details in the text (Page 12, Lines 20-25) to facilitate the understanding of the data.

      I really like the experiments with RIOK2 mutants, however I wonder what about protein levels after the knock-in? Given the 18S phenotype overlap between the phenotype of the RIOK2 loss of function with the S483A, testing protein level becomes of the utmost importance.

      We checked RIOK2 protein levels and observed that the mutations do not decrease the level of RIOK2. On the contrary, the mutations slightly increase RIOK2 levels. Therefore, we are pretty confident that the phenotypes resulting from expression of RIOK2 mutants do not result from defects in the global accumulation of the protein. These data have been added to Figure 4C of the revised version I of the manuscript and we amended the text accordingly (Page 12, Line 5).

      Figure 5. Low quality IFL.

      Our aim in preparing this figure was to show many cells in the different images to show that the effect of our mutation was homogeneous at the level of cell populations. The drawback is that cells are small and look blurred. We improved the quality of the figure in this revised version I of the manuscript with new images from the same experiment, showing fewer cells at a higher magnification.

      Hard to think that histogram quantitation of nuclear versus cytoplasmic staining are reliable in the absence of fractionation, better quantitation, experiment done in other cell lines and so on.

      We provide in this revised version I of the manuscript a supplementary figure explaining the procedure we used to quantify the fluorescence data (Supplementary Fig. S7).

      Furthermore, to confirm this result using other experimental conditions and cell lines, we will first transfect HEK293 and HeLa cells with plasmids expressing GFP-tagged RIOK2 WT or the S483A mutant and we will compare the kinetics of nuclear import of both proteins upon inhibition of pre-40S particle export by leptomycin B using fluorescence microscopy and GFP quantifications. Second, we will transfect HeLa cells with plasmids expressing HA-tagged RIOK2 WT or S483A and perform fractionation assays to monitor their presence in both cytoplasmic and nuclear compartments. We will include these data in the revised version II of the manuscript.

      However, very beautiful Fig. 5E perhaps the best of the paper shows also mobility shift driven by S483, thus supporting posttranslational modifications.

      We thank Reviewer #1 for this comment. We added the note on the evidence of RIOK2 post-translational modification in the result section (Page 14, Line 9).

      Fig. 6. IFL studies are really impossible to interpret.

      We improved the quality of the figure with new images from the same experiment, showing fewer cells at a higher magnification. NOB1 IF data and quantifications have been transferred to the supplemental material (Supplemental Fig. S4A and S4B) to clarify the figure. In addition, we provided more explanations on the principle of this experiment and expected results in the text (Page 15, Line 9).

      The effects on RIOK2 release (this figure) and 18S maturation (Fig. 5) are very clear and of great quality.

      We thank Reviewer #1 for this comment.

      Overall conclusions. The manuscript tends to overinflate the meaning of several experiments. What to me is very clear and interesting is that the authors provide clear evidence that S483A mutants have a defect in 40S maturation. Whether this is due to MAPK signalling is only circumstantial. I would suggest building on the strong findings and eliminating ambiguous data.

      We do not fully agree with this comment of Reviewer #1. If mutation S483A were simply a partial loss of function mutation, this would not be of strong interest for the subject of this manuscript. It would just indicate that S483 is important for RIOK2 function independently of its phosphorylation status. Our data show that the impact of S483 mutation on pre-rRNA processing and other phenotypes is different depending on whether the serine is converted to an alanine (phosphorylation mutant) or to an aspartic acid (phospho-mimetic mutation). These data are a strong indication that what matters is not simply the serine residue by itself but its phosphorylation status.

      Reviewer #1 (Significance (Required)):

      The paper deals with an important topic, namely whether a regulation of ribosome maturation exists, and how it is mechanistically regulated. In this context, the analysis of the ERK pathway is highly needed considered that most works deal with effects of the PI3K-mTOR pathway, and the parallel, yet important RAS-ERK pathway, is less understood.

      As a final note, we should consider that S6K downstream of mTOR, and ribosomal S6K, downstream of ERK have been considered to share some substrates.

      We introduced this information in the revised version of the manuscript (Page 19, Line 20). A related comment has been raised by Reviewer #3 (see below, Caveat #2).

      The manuscript is interesting, but several statements given by the authors are rather superficial. An example, listed in the previous section, relates to the linguistic usage of mTOR kinase, instead of detailing whether we are dealing with mTORc1 or mTORc2.

      We agree with this comment of Reviewer #1. Given that the main focus of this manuscript is the regulation by the MAPK pathway, we had chosen to put less emphasis on mTOR in the introduction. However, we added more precise information on mTOR in the present revised version I of the manuscript to address this comment (Page 4, Line 13 and Page 5, Line 3).

      A second gross mistake is the definition of PMA as a stimulator of the ERK pathway. While this is certainly true, it is historically not correct, as seminal papers by the group of Parker define this drug as a stimulator of conventional PKC kinases. In short, this paper is a step back in knowledge from the perspective of the literature context.

      We are a bit confused by this comment because seminal papers from the Parker group clearly state that PMA activates the MAPK pathway via PKC (Adams and Parker, 1991, FEBS Lett.; Ways et al., 1992, JBC; Whelan et al., 1999, Cell Growth Differ.). We agree, as mentioned earlier by Reviewer #1, that PMA is not specific to MAPK, a comment that has been addressed above.

      Everyone interested in the crosstalk between ribosome maturation and signaling pathways will certainly read this manuscript.

      My expertise is within the ribosome biology and signalling field.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      There have been mechanistic connections of various signaling pathways to the regulation of ribosome biogenesis steps, including rDNA transcription by RNA polymerase I and III, ribosomal protein transcription, and differential mRNA translation efficiency. However, a mechanistic connection of signaling pathways to the pre-rRNA processing and maturation steps of ribosome biogenesis is lacking. The authors set out to provide a specific example of a direct target of MAPK signaling, RSK, that regulates pre-rRNA maturation through the phosphorylation of a ribosome assembly factor (RIOK2), providing for the first time mechanistic insight into MAPK regulation of pre-rRNA maturation.

      The authors observe slight pre-rRNA processing defects upon the use of RSK inhibitors and RSK depletion. They identified several candidate ribosome assembly and modification factors containing the canonical RSK substrate motif, including the RIOK2 kinase. This motif was verified to be specifically phosphorylated by the RSK1 and RSK2 isoforms in cells and in an in-vitro kinase assay. The authors produced RIOK2 knock-in eHAP1 cell lines expressing non-phosphorylatable or phosphomimetic versions of RIOK2, observing slowed cellular proliferation, decreases in global translation, and slight pre-rRNA processing abnormalities, but no changes in overall mature 18S rRNA levels. More specifically, the authors showed that the inability of RIOK2 to be phosphorylated leads to defects in RIOK2 dissociation from the pre-40S ribosomal subunit in an in-vitro assay, and to an inability for it to be recycled for reuse in pre-ribosome export from the nucleus to the cytoplasm, as assessed by immunofluorescence.

      Overall, the authors provide an interesting mechanism of MAPK regulation of a ribosome assembly factor RIOK2. However, they fail to provide the necessary reproducibility, controls, quantification, and consistent results between experiments to support their hypotheses.

      Major Comments:

      1. The northern blots reported throughout the manuscript are lacking proper reproducibility and quantification. First, the northern blots are lacking a loading control, which is necessary to report fold changes that are being measured across treatments. Please include a proper loading control (i.e. 7SL or U6 RNAs). Additionally, more rigorous analysis of the pre-rRNA precursor levels through ratio analysis of multiple precursors (RAMP) (Wang et al 2014) can be completed to provide a clearer depiction on which precursor(s) are accumulating. It is unclear for the Figure 1 northern blots if there were replicates completed and what the error bars represent in Figure 1B. Please report replicates, so that statistical analysis can be completed on the differences in precursor relative abundance. This need is emphasized by the small changes observed in pre-rRNA levels (less than 2 fold) between conditions.

      As mentioned above (Reviewer #1), we applied in the revised version I of the manuscript RAMP quantifications to all Northern blot data. These quantifications are shown as separate panels in the figures of the revised manuscript.
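
      As an illustration of the kind of ratio-based quantification discussed in this exchange, the sketch below computes log2-transformed pairwise precursor ratios for a treated sample relative to a control. It is a minimal sketch, not the authors' analysis pipeline: the function name `ramp_log2_ratios` is hypothetical and the band intensities are invented numbers used only to show the arithmetic. Because each ratio is formed between two bands of the same lane, a uniform difference in loading between lanes cancels out.

```python
import math
from itertools import combinations


def ramp_log2_ratios(treated, control):
    """Return log2 of every precursor-pair ratio in `treated`, normalized to `control`.

    Both arguments map a precursor name to its band intensity in one lane.
    (Hypothetical helper written for illustration only.)
    """
    result = {}
    for a, b in combinations(sorted(treated), 2):
        ratio_treated = treated[a] / treated[b]
        ratio_control = control[a] / control[b]
        result[f"{a}/{b}"] = math.log2(ratio_treated / ratio_control)
    return result


# Invented intensities (arbitrary units), NOT data from the manuscript.
control = {"45S": 100.0, "30S": 50.0, "21S": 40.0, "18S-E": 20.0}
treated = {"45S": 100.0, "30S": 100.0, "21S": 40.0, "18S-E": 20.0}

ratios = ramp_log2_ratios(treated, control)
for pair, value in ratios.items():
    print(f"{pair}: {value:+.2f}")
```

      In this toy example only the 30S band changes, so every ratio involving 30S shifts while the others stay at zero, which is the pattern one would look for when asking which precursor accumulates.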

      Furthermore, we are planning to repeat the Northern blot experiments of Figure 1 to obtain biological replicates in other cell lines. We will probe the membranes to detect the 7SL RNA as a loading control in all these experiments. We will perform RAMP analyses on all these Northern blot experiments to provide more accurate quantifications of the pre-rRNA levels in the different conditions. These data will be included in the revised version II of the manuscript.

      2. The western blots reported throughout the manuscript are lacking proper reproducibility and quantification. For example, the western blots validating RSK1 and RSK2 depletion in Figure 1C lack a proper loading control. Additionally, it is unclear if there are replicates completed and there is lack of statistical analysis to determine if the changes are significant. Please include loading controls, replicates, and quantification of the western blots throughout the manuscript.

      We have included actin levels as loading controls in several figures (Figures 2D, 3A, 3C, 3E, 4C) of the revised version I of the manuscript. We also added phosphorylated Rps6 at Ser235/36 to monitor RSK activity in Figures 1A, 2D, 3A.

      We provided quantifications and associated statistical analyses of phosphorylation of RIOK2 presented in Figures 3A and 3C of the revised version I of the manuscript. We also included quantifications of the in vitro phosphorylation assays presented in Figures 3F and 3G.

      We are nevertheless planning to repeat and quantify more accurately the western blot experiments presented in Figures 2A, 2C and 3E of the revised version I of the manuscript. These data will be included in the revised version II of the manuscript.

      3. Please report the full bioinformatic analysis of the RSK substrate motif search among human AMFs including other AMFs found in this search. A sorted list format would be valuable for the reader to understand other potential RSK substrates involved in ribosome biogenesis.

      We understand the request of Reviewer #2. Providing the full list of AMFs identified in our bioinformatic screen would be valuable for the reader, mostly because it would make clearer that RSK seems to regulate multiple stages of the pre-ribosome maturation pathway, and therefore that RSK inhibition induces pleiotropic defects in ribosome synthesis. However, we are currently working on a more global study of the impact of MAPK regulation on the post-transcriptional steps of ribosome synthesis that we would like to publish in the near future.

      4. The authors report that RSK inhibition/depletion leads to accumulation of the 30S pre-rRNA, yet mutation of its target site on RIOK2 or RIOK2 depletion leads to an accumulation of the 18S-E pre-rRNA. Additionally, the phosphomimic mutation of RIOK2 leads to an accumulation of 30S, the opposite of the expected result. Please elaborate on this discrepancy in processing defects observed across experiments.

      In contrast to RIOK2, which is specifically involved in the late, cytoplasmic stages of the maturation of the pre-40S particles, RSK regulates ribosome biogenesis at multiple levels. Upon activation of the MAPK pathway, RSK activates Pol I transcription in the nucleoli and promotes translation of mRNAs encoding ribosomal proteins and AMFs. In addition, our bioinformatic screen identified several AMFs at different stages of the maturation pathway of both ribosomal subunits as potential targets of RSK. These considerations imply that RSK inhibition is expected to impact ribosome biogenesis at multiple levels (Pol I transcription, availability of RPs and AMFs, export of the pre-ribosomal particles, probably several maturation steps) whereas RIOK2 inactivation more specifically delays 18S-E processing in the cytoplasm. In terms of processing, RSK inhibition induces a significant accumulation of the 30S intermediate. This is further evidence that RSK regulates pre-rRNA processing at several stages. This phenotype might result, as recently described in yeast (Yerlikaya et al., 2016, MCB), from an inhibition of RPS6 phosphorylation which affects its early incorporation into pre-ribosomes, although this has not been demonstrated in human cells. This 30S precursor accumulation affects production of the downstream intermediates and we strongly believe that this precludes accumulation of 18S-E even if the activity of RIOK2 is affected. Given the broad implication of RSK at different stages of ribosome biogenesis, it is biologically relevant to observe that inactivation of RSK does not result in the same processing defects as inactivation of RIOK2.

      We nevertheless tried to make this point clearer in the present revised version I of the manuscript. We added in the supplementary material a diagram (Supplementary Fig. S1C) showing all the known and hypothetical targets of ERK and RSK in ribosome synthesis to provide the readers with a global view of the function of RSK in this process, and we refer to this figure in the introduction and results. In the introduction, we also place more emphasis on the multiple aspects of the regulation of ribosome synthesis by ERK and RSK (Page 4, Line 18).

      Concerning the phospho-mimetic mutant, it does slightly accumulate the 45S and 30S intermediates, in contrast to the non-phosphorylatable mutant, but this is not totally unexpected. RIOK2 is incorporated into pre-ribosomes in the nucleus, at a stage that remains unclear, and constitutive RIOK2 phosphorylation may interfere with this recruitment and affect processing at an earlier stage. This point has been addressed in the discussion of the revised version I of the manuscript (Page 18, Line 7).

      Are there similar results for RSK depletion/inhibition and RIOK2 release from the pre-40S and inability to import into the nucleus? If so, this could provide phenotypic consistency between these two proteins in the proposed pathway to further support the hypothesis.

      We performed the same experiments as reported in Figure 6C to try to demonstrate a cytoplasmic retention of RIOK2 after leptomycin B treatment upon ERK inhibition (PD treatment). We also performed IF and cell fractionation experiments upon PD treatment. In all cases, we failed to observe the expected result. We strongly believe that we are facing here the same problem as described above for the previous comment of Reviewer #2. ERK and thus RSK inhibition leads to accumulation of the early, nucleolar 30S intermediate, indicating that the processing pathway is significantly blocked at an early stage preceding formation of the pre-40S particles in which RIOK2 is recruited. This early blockage most likely explains why we do not see the same phenotypes. We discussed this comment in the discussion section of the revised version I of the manuscript (Page 18, Line 19).

      5. Mature levels of 18S rRNA are not altered in the RIOK2 mutant cell lines. This could be due to compensation in these mutant cell lines since RIOK2 is essential.

      We agree with Reviewer #2 that compensation mechanisms may operate to restore mature 18S rRNA levels despite RIOK2 mutation. On the other hand, although RIOK2 is indeed essential, we may expect that the point mutation of S483 only partially affects RIOK2 function and delays the maturation of pre-40S particles but not to a sufficient extent to impact the mature 18S rRNA levels. This has been observed by others (Montellese et al., 2017, NAR; Srivastava et al., 2010, MCB).

      We added this point in the discussion section of the revised version I of the manuscript (Page 19, Line 9).

      Please report the mature 18S rRNA levels upon shRNA depletion and RSK inhibitors to provide insight into if this pathway significantly alters mature 18S rRNAs as a mechanism for the altered translation and proliferation observed.

      We will probe the levels of the mature 18S and 28S rRNAs in these experiments and the results will be included in Figure 1 of the revised version II of the manuscript.

      Minor Comments:

      1. Figure 1A lower: The authors use an RSK inhibitor LJH685, that does not inhibit RSK phosphorylation S380. Therefore, another verification of RSK inhibition must be used besides RSK-pS380 abundance as for PD184352 inhibition. Please validate the usage of this RSK inhibitor in the experiments by inclusion of quantification of a direct downstream substrate of RSK, such as YB1-pS102 quantification.

      We agree with Reviewer #2. We have probed the membrane with anti-RPS6 and anti-phospho-RPS6 antibodies to show the effect of LJH treatment on RPS6 phosphorylation. These data have been added to Figure 1A in the revised version I of the manuscript and the text has been updated (Page 6, Line 16).

      2. Page 7, Lines 8-12: The authors state that RSK knockdown led to increases in the 45S, while the LJH685 treatment led to no changes in 45S levels due to differences in growth conditions. Please elaborate more on how growth conditions would alter 45S pre-rRNA levels. It would be expected that stimulation of the MAPK pathway would increase pre-rRNA transcription compared to steady state growth conditions. However, pre-rRNA processing northern blots are only measuring steady state levels of the precursors. Thus, an rDNA transcription assay would need to be completed to evaluate these differences.

      We do observe that PMA treatment of starved cells induces an increase in 45S precursor levels, consistent with an increase in transcription but we agree that northern blot experiments measure the steady-state levels of the intermediates.

      To address this comment, we propose to perform short pulse labelings with ortho-phosphate to assess synthesis of the 45S precursor independently of its processing in the different conditions. These data will be included in the revised version II of the manuscript.

      3. Figure 2C: Please quantify these results to properly evaluate the role of these two phosphorylation sites in MAPK signaling.

      We will repeat these experiments and quantify the results in the new version of Figure 2C.

      4. Please include the RIOK2 pS483 antibody generation methodology used in this study.

      We added this information in the Materials and Methods section of the revised version I of the manuscript (Page 21, Line 22).

      5. In vitro kinase assay methods: Is the recombinant RSK1 the human version of the protein? Please clarify in methods.

      Human recombinant RSK1 was purchased from SignalChem. This information has been added in the revised version I of the manuscript (Page 30, Line 5).

      6. Figure 4B: Please include statistical analysis of the puromycin incorporation assay.

      We performed a statistical analysis of this assay based on 3 replicates. This analysis has been included in the present revised version I of the manuscript (Figure 4B).

      7. Page 13, Line 18: Please explain why RIOK2 co-IP with NOB1 is important.

      We added this explanation in the result section of the revised version I of the manuscript (Page 14, Line 3).

      8. In vitro dissociation assay: There is no control for pulldown of entire pre-40S particles and not just NOB1 protein. Thus, it is unclear if RIOK2 is dissociating from NOB1 or entire pre-40S particles. Please reference previous literature of the methodology of this experiment if applicable. Additionally, please include controls, such as western blotting of ribosomal proteins or northern blotting of rRNA in the pulldown fraction used.

      We agree with Reviewer #2. We have probed the membranes with antibodies detecting LTV1 and ribosomal protein RPS7 to show that the entire pre-40S particle is indeed pulled down. These additional data have been added in Figure 6A of the revised version I of the manuscript and the text has been amended accordingly (Page 14, Line 20).

      9. Page 16, Lines 10-12: The authors state "RSK facilitates the release of RIOK2 and other AMFs", however the only other AMF in this study was NOB1. Please reword appropriately that most likely facilitates release of RIOK2 and other AMFs in a RIOK2 dependent or independent manner if it also phosphorylates other AMFs which possess the motif.

      We agree with Reviewer #2 and we changed the text accordingly (Page 16, Line 11) but we did not introduce the hypothesis that RIOK2 may target directly other AMFs of late pre-40S particles which possess the motif because our in silico screen did not identify consensus RXRXXS/T motifs in any of these factors.
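
      To make concrete what an in silico screen for the RXRXXS/T consensus involves, the following is a minimal sketch of a motif scan over a protein sequence. It is an assumption-laden illustration rather than the screen actually used: the function name `find_motif_sites` is hypothetical, the regular expression is simply a literal reading of RXRXXS/T (X being any residue), and the toy sequence is invented and is not a real AMF sequence.

```python
import re

# Literal reading of RXRXXS/T: Arg, any, Arg, any, any, then Ser or Thr.
# The lookahead wrapper lets overlapping occurrences all be reported.
RSK_MOTIF = re.compile(r"(?=(R.R..[ST]))")


def find_motif_sites(sequence):
    """Return (1-based position of the Ser/Thr acceptor, matched 6-mer) pairs.

    Hypothetical helper written for illustration only.
    """
    return [
        (match.start() + 6, match.group(1))
        for match in RSK_MOTIF.finditer(sequence.upper())
    ]


# Invented toy sequence, NOT a real protein.
toy = "MKRARLSSAAARPRTTSGG"
sites = find_motif_sites(toy)
print(sites)
```

      A sequence match of this kind only nominates a candidate site; whether the residue is actually phosphorylated has to be tested experimentally, as was done for RIOK2 S483.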

      Reviewer #2 (Significance (Required)):

      This manuscript is significant due to the lack of mechanistic connection of cellular signaling pathways to pre-rRNA processing. There have been, for the most part, no mechanistic connection of signaling pathways to pre-rRNA processing regulation and none for direct targets of MAPK signaling (Reviewed in Gaviraghi et al 2019). They provide the groundwork for analysis of MAPK signaling in regulation of an assembly factor and inclusion of their motif analysis could provide RSK signaling targets' regulation of specific steps of ribosome biogenesis that remain to be elucidated.

      Although the research delves into a specific mechanism, its audience could be far reaching as it is in the ribosome biogenesis field and MAPK signaling, which have broad implications in cancer and developmental diseases.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The authors report that inhibition of MAPK signaling via RSK is associated with modest alterations in the relative abundance of human pre-rRNA species, that are most marked for 30S but also visible for 21S - although not clearly shown for 18S-E.

      RIOK2 has two closely spaced sites predicted as RSK targets, one of which was confirmed to be MAPK sensitive and shown to be an RSK substrate in vitro. Substitution of Ser483 with Ala was associated with reduced growth and 18S-E accumulation, consistent with impaired NOB1 cleavage activity. RIOK2-S483A also showed greater pre-ribosome association in vivo and, consistent with this, more stable association in vitro and increased cytoplasmic residence. These effects are clear, although the data do not directly demonstrate their linkage to loss of RSK phosphorylation.

      The mutations were apparently generated directly in the genome of haploid cells, potentially raising concerns that the introduction of a deleterious mutation might have been accompanied by compensatory mutations elsewhere. However, three cell lines gave similar results, mitigating this concern.

      Specific comments:

      1. To help the reader, the authors should directly discuss why they think the data on MAPK inhibition did not reveal a clearer pre-18S cleavage phenotype, as would have been expected for loss of RIOK2 activity.

      This comment is similar to major comment #4 of Reviewer #2.

      Please refer to the above response.

      2. Fig. S3: The degree of RSK depletion with the siRNAs appears very modest, as are the effects on RIOK2-P. Moreover, the double depletion is not clearly better than single depletions. These data should probably be supported by quantitation or withdrawn.

      We agree with Reviewer #3 that the effects shown in this figure are modest, but we originally chose to show these data because they further supported the role of RSK in RIOK2 phosphorylation at S483, complementing Figure 3.

      We have withdrawn this figure from the present revised version I of the manuscript.

      3. Fig. 5D: For 18S-E recovery with RIOK2, is the ratio adjusted for the increase in 18S-E abundance in the mutant - ie is recovery increased when adjusted for the increased pre-rRNA abundance?

      In these experiments, the tagged versions of RIOK2 WT and S483A have been expressed ectopically from plasmids in cells expressing the endogenous wild-type protein. RIOK2 S483A does not behave as a dominant negative mutant in these conditions and does not induce 18S-E accumulation, as shown in the northern blot analysis of the 18S-E levels in the cell lysates (lower panel). This information is indicated in the revised version I of the manuscript (Page 13, Line 26).

      Reviewer #3 (Significance (Required)):

      Overall, the analyses on the phenotype of RIOK2-S483A, and the demonstration that this site is an RSK target, appear convincing.

      Caveats are

      1) the phenotype seen on inhibition of RSK, would not have implicated RIOK2 as the obvious candidate for the factor responsible for the observed processing defects;

      We agree with this comment, which has also been raised by Reviewer #2 (Major comment 4.). We provide several lines of evidence in the manuscript that RSK phosphorylates RIOK2 on S483 in vivo and in vitro (Figure 3). However, as explained above in response to Reviewer #2, we cannot correlate the in vivo phenotypes resulting from RSK or RIOK2 inactivation for biological reasons. As mentioned in the introduction, RSK regulates multiple substrates at different stages of ribosome biogenesis (translation of RPs and AMFs, Pol I transcription, pre-ribosome maturation and export), whereas RIOK2 is specifically implicated in the cytoplasmic maturation of pre-40S particles. Inactivation of RSK is therefore expected to induce pleiotropic defects in ribosome biogenesis, and in particular early defects (reduced Pol I transcription, 30S precursor accumulation) that preclude observation of the expected phenotype linked to RIOK2 inactivation, i.e. 18S-E accumulation.

      We nevertheless tried to clarify this point as described in the response to Reviewer #2, major comment 4.

      2) the RIOK2-S483A phenotype is not demonstrated to be RSK dependent. This raises the possibility that, although RSK can phosphorylate S483, the effects of the mutation are not due to the loss of this modification.

      As mentioned by Reviewer #3, our data show that RSK can phosphorylate RIOK2 S483 in vitro and in vivo (Figure 3). We believe that Figure 4C strongly suggests that the accumulation of the 18S-E in cells expressing the RIOK2 S483A mutant is due to the loss of S483 phosphorylation, since mutation of S483 to an aspartic acid (S483D), generally considered as a mutation mimicking a phosphorylated serine, does not affect 18S-E maturation. However, although our manuscript provides many lines of evidence identifying RSK as the kinase responsible for RIOK2 phosphorylation at S483, we cannot formally exclude that other AGC kinases involved in growth and proliferation, such as S6K or Akt, may also be involved redundantly or alternatively. Our data presented in Figure 3A, showing that treatment of cells with the RSK inhibitor LJH decreases RIOK2 phosphorylation at S483, support a specific role of RSK.

      We developed this point in the discussion section (Page 18, from Line 25).

      With these provisos, the work is technically good and will be of considerable interest to the field. The post-transcriptional regulation of ribosome synthesis is increasingly recognized as a significant topic.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      There have been mechanistic connections of various signaling pathways to the regulation of ribosome biogenesis steps, including rDNA transcription by RNA polymerase I and III, ribosomal protein transcription, and differential mRNA translation efficiency. However, a mechanistic connection of signaling pathways to the pre-rRNA processing and maturation steps of ribosome biogenesis is lacking. The authors set out to provide a specific example of a direct target of MAPK signaling, RSK, that regulates pre-rRNA maturation through the phosphorylation of a ribosome assembly factor (RIOK2), providing for the first time mechanistic insight into MAPK regulation of pre-rRNA maturation.

      The authors observe slight pre-rRNA processing defects upon the use of RSK inhibitors and RSK depletion. They identified several candidate ribosome assembly and modification factors containing the canonical RSK substrate motif, including the RIOK2 kinase. This motif was verified to be specifically phosphorylated by the RSK1 and RSK2 isoforms in cells and in an in-vitro kinase assay. The authors produced RIOK2 knock-in eHAP1 cell lines expressing non-phosphorylatable or phosphomimetic versions of RIOK2, observing slowed cellular proliferation, decreases in global translation, and slight pre-rRNA processing abnormalities, but no changes in overall mature 18S rRNA levels. More specifically, the authors showed that the inability of RIOK2 to be phosphorylated leads to defects in RIOK2 dissociation from the pre-40S ribosomal subunit in an in-vitro assay, and to an inability for it to be recycled for reuse in pre-ribosome export from the nucleus to the cytoplasm, as assessed by immunofluorescence.

      Overall, the authors provide an interesting mechanism of MAPK regulation of a ribosome assembly factor RIOK2. However, they fail to provide the necessary reproducibility, controls, quantification, and consistent results between experiments to support their hypotheses.

      Major Comments:

      1. The northern blots reported throughout the manuscript are lacking proper reproducibility and quantification. First, the northern blots are lacking a loading control, which is necessary to report fold changes that are being measured across treatments. Please include a proper loading control (i.e. 7SL or U6 RNAs). Additionally, more rigorous analysis of the pre-rRNA precursor levels through ratio analysis of multiple precursors (RAMP) (Wang et al 2014) can be completed to provide a clearer depiction on which precursor(s) are accumulating. It is unclear for the Figure 1 northern blots if there were replicates completed and what the error bars represent in Figure 1B. Please report replicates, so that statistical analysis can be completed on the differences in precursor relative abundance. This need is emphasized by the small changes observed in pre-rRNA levels (less than 2 fold) between conditions.

      2. The western blots reported throughout the manuscript are lacking proper reproducibility and quantification. For example, the western blots validating RSK1 and RSK2 depletion in Figure 1C lack a proper loading control. Additionally, it is unclear if there are replicates completed and there is lack of statistical analysis to determine if the changes are significant. Please include loading controls, replicates, and quantification of the western blots throughout the manuscript.

      3. Please report the full bioinformatic analysis of the RSK substrate motif search among human AMFs including other AMFs found in this search. A sorted list format would be valuable for the reader to understand other potential RSK substrates involved in ribosome biogenesis.

      4. The authors report that RSK inhibition/depletion leads to accumulation of the 30S pre-rRNA, yet mutation of its target site on RIOK2 or RIOK2 depletion leads to an accumulation of the 18S-E pre-rRNA. Additionally, the phosphomimic mutation of RIOK2 leads to an accumulation of 30S, the opposite of the expected result. Please elaborate on this discrepancy in processing defects observed across experiments. Are there similar results for RSK depletion/inhibition and RIOK2 release from the pre-40S and inability to import into the nucleus? If so, this could provide phenotypic consistency between these two proteins in the proposed pathway to further support the hypothesis.

      5. Mature levels of 18S rRNA are not altered in the RIOK2 mutant cell lines. This could be due to compensation in these mutant cell lines since RIOK2 is essential. Please report the mature 18S rRNA levels upon shRNA depletion and RSK inhibitors to provide insight into if this pathway significantly alters mature 18S rRNAs as a mechanism for the altered translation and proliferation observed.

      Minor Comments:

      1. Figure 1A lower: The authors use an RSK inhibitor LJH685, that does not inhibit RSK phosphorylation S380. Therefore, another verification of RSK inhibition must be used besides RSK-pS380 abundance as for PD184352 inhibition. Please validate the usage of this RSK inhibitor in the experiments by inclusion of quantification of a direct downstream substrate of RSK, such as YB1-pS102 quantification.

      2. Page 7, Lines 8-12: The authors state that RSK knockdown led to increases in the 45S, while the LJH685 treatment led to no changes in 45S levels due to differences in growth conditions. Please elaborate more on how growth conditions would alter 45S pre-rRNA levels. It would be expected that stimulation of the MAPK pathway would increase pre-rRNA transcription compared to steady state growth conditions. However, pre-rRNA processing northern blots are only measuring steady state levels of the precursors. Thus, an rDNA transcription assay would need to be completed to evaluate these differences.

      3. Figure 2C: Please quantify these results to properly evaluate the role of these two phosphorylation sites in MAPK signaling.

      4. Please include the RIOK2 pS483 antibody generation methodology used in this study.

      5. In vitro kinase assay methods: Is the recombinant RSK1 the human version of the protein? Please clarify in methods.

      6. Figure 4B: Please include statistical analysis of the puromycin incorporation assay.

      7. Page 13, Line 18: Please explain why RIOK2 co-IP with NOB1 is important.

      8. In vitro dissociation assay: There is no control for pulldown of entire pre-40S particles and not just NOB1 protein. Thus, it is unclear if RIOK2 is dissociating from NOB1 or entire pre-40S particles. Please reference previous literature of the methodology of this experiment if applicable. Additionally, please include controls, such as western blotting of ribosomal proteins or northern blotting of rRNA in the pulldown fraction used.

      9. Page 16, Lines 10-12: The authors state "RSK facilitates the release of RIOK2 and other AMFs", however the only other AMF in this study was NOB1. Please reword appropriately that most likely facilitates release of RIOK2 and other AMFs in a RIOK2 dependent or independent manner if it also phosphorylates other AMFs which possess the motif.

      Significance:

      This manuscript is significant because mechanistic connections between cellular signaling pathways and pre-rRNA processing are lacking. There has been, for the most part, no mechanistic connection of signaling pathways to pre-rRNA processing regulation, and none for direct targets of MAPK signaling (reviewed in Gaviraghi et al 2019). The authors provide the groundwork for analysis of MAPK signaling in regulation of an assembly factor, and inclusion of their motif analysis could point to RSK signaling targets whose regulation of specific steps of ribosome biogenesis remains to be elucidated.

      Although the research delves into a specific mechanism, its audience could be far reaching as it is in the ribosome biogenesis field and MAPK signaling, which have broad implications in cancer and developmental diseases.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      The authors report that inhibition of MAPK signaling via RSK is associated with modest alterations in the relative abundance of human pre-rRNA species, that are most marked for 30S but also visible for 21S - although not clearly shown for 18S-E.

      RIOK2 has two closely spaced sites predicted as RSK targets, one of which was confirmed to be MAPK sensitive and shown to be an RSK substrate in vitro. Substitution of Ser483 with Ala was associated with reduced growth and 18S-E accumulation, consistent with impaired NOB1 cleavage activity. RIOK2-S483A also showed greater pre-ribosome association in vivo and, consistent with this, more stable association in vitro and increased cytoplasmic residence. These effects are clear, although the data do not directly demonstrate their linkage to loss of RSK phosphorylation.

      The mutations were apparently generated directly in the genome of haploid cells, potentially raising concerns that the introduction of a deleterious mutation might have been accompanied by compensatory mutations elsewhere. However, three cell lines gave similar results, mitigating this concern.

      Specific comments:

      1.To help the reader, the authors should directly discuss why they think the data on MAPK inhibition did not reveal a clearer pre-18S cleavage phenotype, as would have been expected for loss of RIOK2 activity.

      2.Fig. S3: The degree of RSK depletion with the siRNAs appears very modest, as are the effects on RIOK2-P. Moreover, the double depletion is not clearly better than single depletions. These data should probably be supported by quantitation or withdrawn.

      3.Fig. 5D: For 18S-E recovery with RIOK2, is the ratio adjusted for the increase in 18S-E abundance in the mutant - ie is recovery increased when adjusted for the increased pre-rRNA abundance?

      Significance

      Overall, the analyses on the phenotype of RIOK2-S483A, and the demonstration that this site is an RSK target, appear convincing.

      Caveats are

      1)the phenotype seen on inhibition of RSK, would not have implicated RIOK2 as the obvious candidate for the factor responsible for the observed processing defects;

      2)the RIOK2-S483A phenotype is not demonstrated to be RSK dependent. This raises the possibility that, although RSK can phosphorylate S483, the effects of the mutation are not due to the loss of this modification.

      With these provisos, the work is technically good and will be of considerable interest to the field. The post-transcriptional regulation of ribosome synthesis is increasingly recognized as a significant topic.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      The manuscript addresses an important topic, the posttranscriptional maturation of ribosomes. This topic is inherently interesting because we normally think of ribosome biogenesis as a sequential series of steps that automatically proceeds and cannot be "accelerated" in physiological conditions, but only "delayed" in the presence of genetic mutations. In short, the manuscript proposes that RIOK2 phosphorylation by RSK, downstream of the Ras/MAPK pathway, promotes the synthesis of the human small ribosomal subunit.

      I honestly admit that I have some difficulties in reviewing this manuscript. The quality of the presented data is, in general, good. However, overall I find the whole manuscript preliminary and I am not much convinced of the conclusions. Several aspects are superficially analyzed. In short, I think that most of the conclusions are not fully supported by the data because shortcuts are present. The aspects that I found problematic are listed below.

      Biological issue

      1. The authors claim that the effects of the inhibition of the maturation of ribosomes by acting on a pathway upstream of RIOK2 are limited to the 40S subunit. This is far from being a trivial point, for the following reason. RIOK2 is known to affect the maturation of 40S ribosomes. Hence, the fact that using an upstream inhibitor of the MAPK pathway such as PD does not inhibit 60S processing would in reality argue against a biologically relevant control of ribosome maturation by the MAPK pathway. Have the authors considered this? In a way, also, given the fact that the mutants confirm a role in 18S final maturation, it is a bit complex to put all the data in a clear biological context.

      A number of specific issues will be concisely described.

      Manuscript very well written. Data do not always support the strong conclusions. Low magnitude of the observed effects.

      In the introduction the authors make a general claim that ribosome biogenesis is one of the most energetically demanding cellular activities. This statement has lingered in the literature for 15 years, but in reality it has never been formally proved for mammalian cells, and certainly not for HEK293 cells. The original statement, to my knowledge, can be traced to an obscure statement referring to the yeast case that was then repeated as a truth. In conclusion, besides being a very banal observation, it should be referenced.

      Growth factors and energy status are not cues but are proteins or metabolites (introduction). The authors write about mTOR without making statements on mTORC1/2. This is very obsolete. Also I am not sure that the choice of Geyer et al., 1982, and subsequent papers makes much sense. At the very minimum TOP mRNA concepts and mTORC1 must be defined.

      The authors claim that their work fills a major gap between known functions of MAPK and cytoplasmic translation. I would not be so sure about it.

      Results. Authors start with a major mistake, i.e. that PMA selectively stimulates the MAPK pathway. Perhaps it stimulates, certainly it does not do it selectively.

      RIOK2 phosphosites are first found by bioinformatics analysis. It should be noted that the predicted phosphosite (S483) is found only in a limited set of datasets from MS databases. The actual importance of this site would not emerge from unbiased studies. Also, there are many other phosphosites that were not analyzed in this study.

      Throughout the paper the authors use the word strongly, significantly, but the actual effects seem in general quite marginal.

      Discussion. The authors claim that they provide solid evidence on MAPK signalling to ribosome maturation. At the very best this is circumstantial evidence for the 40S maturation.

      Figure 1. Unclear why LJH should increase P-ERK. General lack of quantitation (sd, replicates, bars). Experiment done only on a single cell line in a single experimental setup. Very different effects on 21S by LJH, PMA and siRNA for RIOK2. Overall the message given by the authors is to me mysterious.

      Figure 2. Several red flags. For instance, in 2C the loaded levels of RIOK2-HA are clearly lower than those of the other genotypes, hence the conclusion on P-RIOK2 is not convincing. Staining with anti-P RIOK2 lacks controls: how can we be sure that the signal is due to the phosphate? Phosphatase treatment? Why does FBS not lead to ERK staining in HEK293? There are plenty of growth factors in FBS that should lead to ERK phosphorylation. I do not understand this experiment.

      Figure 3. In vitro phosphorylation, if I understood correctly, relies on a truncated version of RIOK2. Why? Is the folding of the full length protein not permissive to in vitro phosphorylation? HA-RSK3 is less?

      Figure 4. Immunofluorescence is low mag, difficult to understand. I really like the experiments with RIOK2 mutants; however, I wonder about protein levels after the knock-in. Given the overlap between the 18S phenotype of RIOK2 loss of function and that of S483A, testing protein level becomes of the utmost importance.

      Figure 5. Low quality IFL. Hard to think that histogram quantitation of nuclear versus cytoplasmic staining is reliable in the absence of fractionation, better quantitation, experiments done in other cell lines and so on. However, the very beautiful Fig. 5E, perhaps the best of the paper, also shows a mobility shift driven by S483, thus supporting posttranslational modifications.

      Fig. 6. IFL studies are really impossible to interpret. The effects on RIOK2 release (this figure) and 18S maturation (Fig. 5) are very clear and of great quality. Overall conclusions. The manuscript tends to overinflate the meaning of several experiments. What to me is very clear and interesting is that the authors provide clear evidence that S483A mutants have a defect in 40S maturation. Whether this is due to MAPK signalling is only circumstantial. I would suggest building on the strong findings and eliminating ambiguous data.

      Significance

      The paper deals with an important topic, namely whether a regulation of ribosome maturation exists, and how it is mechanistically regulated. In this context, the analysis of the ERK pathway is highly needed, considering that most works deal with effects of the PI3K-mTOR pathway, and the parallel, yet important RAS-ERK pathway is less understood. As a final note, we should consider that S6K downstream of mTOR, and ribosomal S6K, downstream of ERK, have been considered to share some substrates.

      The manuscript is interesting, but several statements given by the authors are rather superficial. An example, listed in the previous section, relates to the linguistic usage of mTOR kinase, instead of detailing whether we are dealing with mTORC1 or mTORC2. A second gross mistake is the definition of PMA as a stimulator of the ERK pathway. While this is certainly true, it is historically not correct, as seminal papers by the group of Parker define this drug as a stimulator of conventional PKC kinases. In short, this paper is a step back in knowledge from the perspective of the literature context.

      Everyone interested in the crosstalk between ribosome maturation and signaling pathways will certainly read this manuscript.

      My expertise is within the ribosome biology and signalling field.

    1. Reviewer #3:

      This is an interesting study in which the authors compare Primacy and Recency weighting models' ability to predict momentary mood assessments during a well-established gambling task. They do so across a range of conditions:

      i) random/structured/structured-adaptive reward environments

      ii) different age groups

      iii) in healthy versus depressed participants

      They also perform the same task in fMRI. They find that the Primacy model wins in most cases, and relates more strongly to brain activations in fMRI.

      The paper is very clearly written and easy to read and understand. The conclusions are striking, given the greater dominance of recency-based models in the literature (e.g. Kahneman's peak-end heuristic). I do however have some major concerns with some aspects of the modelling and task design: I'm not sure if they are addressable or not. In summary, they are:

      i) the comparison of Primacy and Recency models doesn't seem fair to me, as the models also differ according to whether the E term is based on previous expectations or previous outcomes. How can the authors conclude that primacy/recency is the key feature of the winning model?

      ii) The structured and structured-adaptive versions of the task seem to me to have potential biases against the Recency model due to confounding effects: these other effects must be excluded for the conclusions to be robust.

      The following describes these and other concerns in more detail:

      Methods:

      The modelling seems to me to be problematic as a contrast between primacy and recency because the Primacy and Recency models differ in more than one respect: not just weighting of previous events (presented as the "critical difference between the two models" on p6), but also whether those events are expectations (in the Recency model) or outcomes (in the Primacy model). If the authors want to conclusively establish that Primacy is a better model than Recency then surely more models ought to be compared, at the very least using a 2x2 design with primacy/recency of expectations/outcomes? This is also an issue for the fMRI analysis: it is hard to conclude much about the models from the fact that the Primacy model E beta (but not the Recency model E beta) correlates with a BOLD cluster when the Recency model E term is based on previous expectations, not previous outcomes. The same applies to the direct comparison of the models' voxel-wise correlation images.
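
The 2x2 comparison suggested here can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: "primacy" weighting is taken to mean a running mean of the input before exponential discounting, "recency" weighting means discounting the raw input, and the input is either the stated gamble expectations or the realized outcomes. The function names `e_term` and `design_2x2` are hypothetical.

```python
import numpy as np

def e_term(values, gamma, weighting):
    """Discounted E term at the last trial for one cell of the 2x2 design.

    Assumption: 'primacy' replaces each value by the running mean of all
    values so far; 'recency' uses the raw values. Both are then weighted
    by gamma ** (t - j), where t is the current trial and j the past trial.
    """
    values = np.asarray(values, dtype=float)
    t = len(values)
    j = np.arange(1, t + 1)
    if weighting == "primacy":
        values = np.cumsum(values) / j
    elif weighting != "recency":
        raise ValueError("weighting must be 'primacy' or 'recency'")
    return float(np.sum(gamma ** (t - j) * values))

def design_2x2(expectations, outcomes, gamma):
    """Return the E term for every (weighting, source) combination."""
    sources = {"expectations": expectations, "outcomes": outcomes}
    return {
        (weighting, name): e_term(vals, gamma, weighting)
        for weighting in ("primacy", "recency")
        for name, vals in sources.items()
    }
```

Fitting all four cells to the same mood ratings would separate the contribution of the weighting scheme from that of the input (expectations versus outcomes), which is the confound raised above.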

      There also seems to be an error in Figure 1's Equation (1): presumably this just refers to the Primacy model's E term and not the Recency model's E term? Both should be shown for clarity. Also Equation (6) does not look like Equation (1) - is Equation (6) incorrect? In which case what is the R term supposed to look like in Equation (6) - is it also subject to primacy weighting or not? Also in the Discussion, the authors say the Primacy model maintained the overall exponential discounting of the E term. I might misunderstand but this seems a bit misleading because the discounting is by γ^(t-j) in one model but γ^k in the other?

      The authors also comment that the Primacy model performed better "when we did not distinguish between gambling and non-gambling trials, which was another divergence from the standard Recency model". But as I understand it, the standard Recency model was originally designed such that the certain option C was NOT the average of the two gambles, so C was required in the model (at least in the 2014 PNAS paper). Here, C is the average of the gambles, so presumably it would be identical to E in the Recency model, and therefore be extraneous in the Recency model as well as the Primacy model - did the authors do model comparison to see if it could be eliminated from the Recency model? If so, this is not another difference between the models after all. Apologies if I have misunderstood something...

      I might be misunderstanding the fitting approach here but it sounds like the leave-out sample validation is done to optimise the hyperparameters, not the parameters? In which case there is no complexity penalty to reduce overfitting in the plain MSE measure? I appreciate this is less of an issue if models have the same number of parameters...

      Results:

      The authors state that the Primacy model does best in the Random condition but this is not what is stated in Table S1, where its MSE is higher, not lower (0.006 vs 0.0008)?

      A major issue with the task structures as they stand is that the structured and structured-adaptive tasks seem to have some potential problems when it comes to assessing their impact on mood ratings:

      i) the valence of the blocks was not randomised, meaning that the results could be confounded by valence. E.g., what if negative RPE effects are longer-lasting than positive RPE effects? This seems plausible given the downward trend in mood in the random environment despite an average RPE of zero. This could also explain the pattern of mood in the other two tasks, rather than primacy?

      ii) issues of scale: if there is a non-linear relationship between cumulative RPE and mood, such that greater and greater RPEs are required to lift/decrease mood by the same amounts, then this will resemble a primacy effect? This is unlikely to be an issue in the random task but may well be a problem in the structured and certainly in the structured-adaptive tasks?

      iii) issues of individual differences in responsiveness to RPE: in the structured-adaptive task, some subjects' mood ratings may be very sensitive to RPE, and others very insensitive. One might expect that given the control algorithm has a target mood, the former group would reach this target fairly soon and then have trials without RPE, and the latter group would not reach the target despite ever increasing RPEs. In both cases the Primacy model would presumably win, due to sensitivity to outcomes in the first half or insensitivity to bigger outcomes in the second half respectively? Can these possibilities be excluded using model comparison methods?

      These issues are a concern because the plain MSE is not an ideal model comparison method, and the Streaming Prediction MSE is equivocal between the Primacy and Recency models in the Random environment - the only environment which seems unbiased towards the models (given the adolescent sample was also Structured-Adaptive).

    2. Reviewer #2:

      In this paper the authors report data from a series of online studies and one neuroimaging study in which participants played a simple game in which they had to select between a sure outcome and a gamble. Participants reported their current mood throughout the game and the authors compared the performance of a number of models of how the mood ratings were generated. They focus on two models, a standard model which assumes that participants' expectations assume a 50:50 gamble and an adapted model that uses average experienced outcomes as the expected value. They frame these models in terms of recency vs. past weighting and suggest that the results provide evidence in favour of a higher weight of earlier events on reported mood.

      The question of how humans combine experienced events into reported mood is topical. This paper takes an interesting approach to this issue.

      I struggled a bit to understand the logic of some of the arguments in the paper, in part because important experimental and methodological detail is missing. I list my points below. The overriding question is, I think, how certain we can be that the results reported by the authors reflect a true primacy effect, as opposed to some other process (e.g. just learning an expected value) that appears in this case to be a primacy effect.

      1) I didn't really understand where the weights from the primacy graph in Figure 1B came from. The recency weights make sense: there is a discount factor in the model that is less than 1, so there is an exponential discount of more distant past events. However, for the primacy model the expectation is calculated as the mean (apparently arithmetic mean) of previous outcomes (which suggests a flat weight across previous trials) and the discount factor remains, so how does this generate the decreasing pattern of weights? It would be really useful if the authors could spell this out.

      2) The models seem to differ in terms of whether they learn about the expected value of the gamble outcomes or whether they assume a 50:50 gamble (the recency model assumes this, the primacy model generates an average of all experienced outcomes). Might the benefit of the primacy model when explaining human behaviour simply be that people use experienced outcomes to generate their expectations rather than taking stated outcome probabilities as absolutes? In other words, it is not so much that people place more weight on earlier events, but that they learn.

      3) Linked to the above, the structured and adaptive environments seem to have something to learn (blocks with positive vs. negative RPEs), so it is perhaps not surprising that humans show evidence of learning here and a model with some learning outperforms one with none. The description of these environments isn't really sufficient at present; please explain how RPEs were manipulated (was it changing the probability of win/loss outcomes, if so, how? Or was it changing the magnitude of the options? For the adaptive design was the change deterministic? So was the outcome, and thus RPE, always positive if mood was low, or was this probabilistic and if so with what probability?). Also, did the recency model still estimate its expectations here as 50:50, even when (if) this was not the case? If so, can the authors justify this?

      4) What were participants told about the gambles (i.e. were they told they were 50:50, including in structured/adaptive environments)?

      5) Please report the estimated parameter values of the models (and tell us where the common parameters differed between models). This would help in understanding how they are behaving.

      6) In addition to changing the expectation term of the recency model, the primacy model also drops the term for the sure outcomes (because this improves the performance of the primacy model). Does this account for the relative advantage of the primacy over the recency model? i.e. if the sure outcome term is dropped from the recency model, does the primacy model still perform better?
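
On point 1, one possible reading of how a flat arithmetic mean combined with a discount factor can still produce decreasing weights can be checked numerically. The sketch below assumes (as a simplification that may differ from the authors' exact equations) that the expectation on trial j is the mean of outcomes on trials 1..j and that mood at trial t depends on the sum of gamma^(t-j) times that expectation. Outcome i then enters every expectation from trial i onward, so its total weight is the sum over j >= i of gamma^(t-j) / j, which is largest for the earliest outcome. The function names are hypothetical.

```python
import numpy as np

def primacy_weights(t, gamma):
    """Total weight of each outcome A_1..A_t when mood depends on
    sum_j gamma**(t - j) * mean(A_1..A_j) (assumed model form)."""
    j = np.arange(1, t + 1)
    # Contribution of trial j's running mean to each outcome it averages.
    per_trial = gamma ** (t - j) / j
    # Outcome i is part of every running mean with j >= i, so its weight
    # is a reverse cumulative sum over trials.
    return np.cumsum(per_trial[::-1])[::-1]

def recency_weights(t, gamma):
    """Weight of each trial under plain exponential discounting."""
    j = np.arange(1, t + 1)
    return gamma ** (t - j)
```

Under these assumptions `primacy_weights(t, gamma)` is strictly decreasing in trial index for any positive gamma, whereas `recency_weights(t, gamma)` is strictly increasing for gamma below 1, which would match the qualitative shapes the reviewer describes for Figure 1B.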

    3. Reviewer #1:

      Keren and colleagues present a very interesting study whose goal is to determine what are the determinants of subjective mood rating. They correctly identify as the "baseline" model the model proposed by Rutledge et al. where a big determinant of mood seems to be the reward prediction error (Recency model) and they contrast it with a Primacy model, where first events (not late events) play a more important role.

      They validate the model across different behavioural datasets, involving (supposedly) healthy subjects, teenagers and depressive patients. They also have an fMRI experiment and found that the weights of the Primacy model (and not the weights of the Recency model) correlate across subjects with prefrontal activity.

      Overall I think this paper addresses an important question and presents an impressive amount of data. However, I do believe that there are some important checks to be made both concerning the computational and the fMRI analyses.

      Concerning model comparison, I would like the authors to show us whether or not their model selection criteria allow us to correctly recover the true generative model in simulated datasets. Are we sure that the model selection criteria are unbiased toward the two models?
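
A model-recovery check of the kind requested here could look like the following sketch. It is a toy version under loud assumptions: both models are driven by the same simulated outcomes and differ only in weighting (running mean versus raw outcome before discounting), mood is a linear function of that regressor plus Gaussian noise, and selection uses the minimum in-sample MSE over a grid of discount factors. All names, parameter values and the simulated task are hypothetical and are not taken from the manuscript.

```python
import numpy as np

MODELS = ("primacy", "recency")
GAMMA_GRID = np.linspace(0.1, 0.9, 9)

def regressor(outcomes, gamma, model):
    """Discounted sum, at every trial, of the running mean (primacy)
    or of the raw outcome (recency)."""
    values = np.asarray(outcomes, dtype=float)
    if model == "primacy":
        values = np.cumsum(values) / np.arange(1, len(values) + 1)
    out = np.empty(len(values))
    acc = 0.0
    for t, v in enumerate(values):
        acc = gamma * acc + v  # recursive form of sum_j gamma**(t-j) * v_j
        out[t] = acc
    return out

def simulate_mood(outcomes, model, rng, gamma=0.6, m0=0.5, beta=0.1,
                  noise_sd=0.005):
    """Generate mood ratings from one model with assumed parameters."""
    signal = m0 + beta * regressor(outcomes, gamma, model)
    return signal + rng.normal(0.0, noise_sd, size=len(outcomes))

def best_mse(mood, outcomes, model):
    """Grid-search gamma; fit intercept and slope by least squares."""
    best = np.inf
    for gamma in GAMMA_GRID:
        X = np.column_stack([np.ones(len(mood)),
                             regressor(outcomes, gamma, model)])
        coef, *_ = np.linalg.lstsq(X, mood, rcond=None)
        best = min(best, float(np.mean((mood - X @ coef) ** 2)))
    return best

def confusion_matrix(n_sims=20, n_trials=40, seed=0):
    """Rows: generating model; columns: winning model; entries: the
    fraction of simulated datasets won by each fitted model."""
    rng = np.random.default_rng(seed)
    counts = np.zeros((2, 2))
    for row, true_model in enumerate(MODELS):
        for _ in range(n_sims):
            outcomes = rng.normal(size=n_trials)
            mood = simulate_mood(outcomes, true_model, rng)
            scores = [best_mse(mood, outcomes, m) for m in MODELS]
            counts[row, int(np.argmin(scores))] += 1
    return counts / n_sims
```

A matrix close to the identity would indicate that the selection criterion can tell the two models apart in this setting; systematic off-diagonal mass in one row would indicate a bias toward one model. A real check would substitute the actual task structure, the fitted parameter ranges and the selection criterion used in the manuscript.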

      Equally important: can the authors provide at the group level a qualitative signature of mood data that falsifies the Recency model (see Palminteri, Wyart and Koechlin, 2017)? They do so in Figure S2 for one subject, but it would be important to show the same (or similar) result at the group level. This should be easier in the structured or in the structured-adaptive conditions.

      Concerning neuroimaging, if I am not missing something, the results they present in the main text are the results of a second-level ANCOVA, where the individual weights of the Primacy model are shown to correlate with activity in the prefrontal cortex. Similar analyses using the weights of the Recency model do not produce significant results at the chosen threshold. This analysis is problematic for two reasons. First, absence of evidence does not imply evidence of absence. Second, to really validate the model the authors should show that the trial-by-trial correlates of expectations and prediction errors are consistent with the Primacy and not the Recency model. Can the authors show that the Primacy regressors explain trial-by-trial neural activity better than the competing model? They could do so formally by estimating the model using the Bayesian toolbox usually used to compare DCM models.

      Also concerning neuroimaging, it would be important to verify that the authors replicate Rutledge et al's results and Vinckier et al's results (vmPFC, insula, striatum...). This will tell us if the studies are really comparable and would be informative regardless of the result.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      This is a very interesting study whose goal is to determine what drives subjective mood over time during a reward-based decision making task. The authors report data from a series of online studies and one performed with fMRI. Participants played a well-established gambling task during which they had to select between a sure outcome and a 50:50 gamble, reporting momentary mood assessments throughout the game. The authors compared the performance of a number of models of how the mood ratings were generated.

      The authors identify as their "baseline" model that proposed by Rutledge and colleagues, in which an important determinant of mood seems to be the reward prediction error: the authors call this the Recency model. They contrast it with a Primacy model, where earlier events (in this case, average experienced outcomes) play a more important role. They validate the model across different behavioural conditions, involving healthy subjects, teenagers and depressive patients. The conclusion is that the data are more consistent with their Primacy model, in other words a higher weight of earlier events on reported mood. In the fMRI experiment they found that the weights of the Primacy model correlated with prefrontal activation across subjects, while this was not the case for the Recency model.

      The paper is clearly written and easy to understand. The question of how humans combine experienced events into reported mood is topical and the conclusions are striking, given the dominance of recency-based models in the literature (e.g., Kahneman's peak-end heuristic). The paper takes an interesting approach and presents an impressive amount of data.

      However, at some points the arguments seemed a considerable stretch, in part because important experimental and methodological detail is missing, and in part because the analyses do not currently consider a number of potential confounds in both the models and the task design. Ultimately, these concerns come down to whether we can be certain that the results reflect a true primacy effect, as opposed to some other process that simply appears at face value to be a primacy effect. To this end, some important checks need to be made concerning both the computational and the fMRI analyses, as detailed below. These do require substantial extra modelling work, and it is quite possible that the conclusions will not survive these control analyses.

    1. Reviewer #3:

      This is an interesting paper examining the role of electric fields as a tissue damage signal for epithelial cells in vivo. Previous work had indicated the presence of electric fields in wounded tissues. But whether these phenomena play a role in early wound detection by epithelial cells has been unclear. The authors use live imaging in zebrafish to track the behaviour of epithelial cells in response to wounds. Imaging of actin dynamics was used as a readout for directional sensing in these cells. The authors show that directional sensing depends on the local concentration of specific electrolytes and that application of external electric fields can stimulate directional migration. These major conclusions are interesting and well supported. Although this is not the first time that electric fields are suggested to play a role, the study offers valuable direct in vivo evidence and introduces a new system in which the mechanisms can be studied further.

      Main comment:

      The study is focused on establishing whether electric fields play a role in wound sensing and does not touch on how these effects are mediated. The experiments were designed to distinguish osmotic from electric effects, establish whether the effects are global or local and assess the direct effects of electric fields on epithelial cell motion. These are significant and do not appear trivial. Nevertheless, some insight, even in the form of discussion, into how these effects might be sensed by epithelial cells seemed to be lacking. At the minimum, the authors could provide ideas based on the literature. Ideally, the study would include an analysis of cytoskeletal rearrangements and calcium dynamics in response to electric fields or alterations of electrolytes for completeness. The authors introduce these key readouts of epithelial signalling, but they did not make full use of these in their functional assays. Depending on whether electric fields influence the calcium wave, different mechanistic hypotheses can be made for future studies.

    2. Reviewer #2:

      I enjoyed the manuscript. Driving cell movement and even overriding wound migrational cues with an electric field is very interesting. My principal concern is that it appears the manuscript has been written in a way to downplay the previous findings in this field. I am no expert on the effects of electric fields on wound healing and chemotaxis, but a cursory look at the literature shows that a lot has been published in this arena. It appears that most if not all of the findings in this manuscript have been seen before in other contexts.

      The zebrafish offers a great set of tools to interrogate electric fields on chemotaxis and wound healing. I am simply asking for a bit of clarity with respect to the history of electrical fields, cell chemotaxis and wound healing. The authors need to provide more context for their work in the introduction with respect to electrical fields and more clearly describe what has been done before. In addition, the authors need to make additions to the conclusion that clearly define what is novel in their findings and how it relates to previous studies of electric fields and cell chemotaxis.

    3. Reviewer #1:

      This manuscript by Kennard and Theriot reports that electrical cues guide the directional migration of skin cells in response to injury. The authors bring molecular tools and analysis to study environmental cues, such as osmolarity and electric fields, in vivo. The effects of electrical cues have mostly been studied in vitro. The in vivo model and the in vivo approaches with molecular and imaging techniques bring bioelectricity research closer to mainstream techniques. Demonstrating the direct effect of electrical cues independent of osmolarity represents a significant step in this field. The results demonstrating effects of NaCl, but not of quite a few osmolarity controls, are impressive.

      I have the following questions and suggestions, which I do not expect the authors to address with new experiments, because as with other pioneering research, this manuscript suggests more research questions/directions on the basis that it answers some very important questions. I believe perhaps the authors already have some results to some of those questions.

      1) Good reason for choosing laceration over transection is given. I am a bit puzzled if the EFs and osmolarity are the mechanisms, why were there such differences? The endogenous EFs and osmolarity would be expected to be the same in both the laceration and transection models. Could the laceration stretch the tissue during injury procedure, so the marked increased migration was present in the laceration model? The stretch could activate stretch activated channels, stimulate cells, and realign matrix.

      2) It is not clear what relationship can be established between the GCaMP6f response and migration speed (Fig. 1E, G, H). Inhibition of the calcium response may help to test the relationship.

      3) The local concentration of NaCl showed remarkable inhibitory effects on cell migration and cell volume. As we know, injury may activate channels and pumps, which then facilitate ionic fluxes, thus generating persistent ionic currents. Channel and pump inhibition experiments could quickly point to some molecular basis of the involvement of NaCl.

      4) I consider the use of Iso KCl very interesting, because high K+ would significantly modulate cell membrane potential; however, the effect on cell migration is very similar to those of Iso Choline Cl, Iso NaGluconate and Iso Sorbitol. This would provide further indirect evidence for the role of wound electric fields in cell migration.

      5) 200V DC is much higher than the endogenous EFs expected in such a model. Caution is warranted when interpreting the results. I also wonder whether the authors attempted the experiments (Fig. 4B, C) using wounded animals; perhaps the tissues after injury are not technically suitable (too fragmented) for such experiments.

      6) One assumption in the paper is the existence of the TEP and wound EFs in vivo. Glass microelectrodes may be able to verify those in space and time. If this works (the TEP and wound EFs can be mapped), the effects of various treatments can be tested to exclude other possibilities.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      This paper examines the role of electric fields as a damage signal for epithelial cell wounding using a zebrafish tail laceration in vivo model. While electrical fields had been previously noted in vitro, whether they played a role in early wound detection by epithelial cells has been unclear. They tracked the ability of epithelial cells to sense direction by imaging actin dynamics in zebrafish epidermis. From these studies, they find that directional sensing depends on the local concentration of specific electrolytes. Additionally, external electric fields can independently stimulate directional migration.

    1. Reviewer #3:

      The goal of this manuscript “to develop predictive tools for inferring fitness trajectories in new environments” is an important goal and I appreciate the synthesis of theoretical modeling with parameter estimation from empirical mutation studies.

      Reading through the manuscript, however, I found myself repeatedly wondering whether the stated application of the methods developed here doesn't constitute something of a tautology. This could be a misreading on my end, but I'll explain: the authors state that they have the central goal of predicting whether a population adapting to one environment will lose fitness in another "non-home" environment. Yet the parameter estimation they develop and propose for estimating fitness trajectories requires fitness measurements in both the home and non-home environments. If one already has fitness measurements for both home and non-home, how much more information is added by estimating the JDFE? I understand that the authors are estimating the fitness trajectories over time, with the incorporation of population genetic parameters, but again, I was unsure of how much information was added with the JDFE particularly given large discrepancies in the Wright-Fisher models and the decreasing predictive capacity with time. The bottom row of Figure 1 provided perhaps the most convincing evidence of the usefulness of the JDFE, but the unintuitive result was not adequately explored nor explained (see comment below). Also, perhaps an exploration of how the predictions could be extended to unmeasured environments is possible (as in Kinsler et al 2020)?

      Further specific conceptual comments and suggestions:

      1) The authors demonstrate in Figure 1 that JDFEs even with similar shapes produce markedly different fitness trajectories. They argue that the correlation coefficient of the JDFE is not a reliable predictor of fitness trajectories in the home environment. I was struck by this counterintuitive result, and found myself searching for further explanation. Are the authors arguing that the practice of simply looking at the correlation coefficient in tradeoff studies in general is insufficient for predicting the fates of pleiotropic mutations? Either way, it would be helpful to the reader to elaborate on why and under which conditions the discrepancy with the correlation coefficient and fitness trajectories arises.

      2) The modeling results throughout the manuscript reveal poor predictive capabilities in Wright-Fisher simulations. For example, the results in figure 2 show substantial discrepancy between the theoretical predictions and the results of the Wright-Fisher simulations. The authors address this only briefly stating that outside of the strong selection, weak mutation model (SSWM) the pleiotropy statistics are only "statistical predictors". But the discrepancy was systematic and wide, suggesting rather little insight from the pleiotropy statistics in sequential adaptation scenarios. I could not find discussion of this discrepancy between the SSWM and Wright-Fisher modeling predictions.

    2. Reviewer #2:

      The authors present a theoretical framework for analysing pleiotropic effects in populations evolving in different environments based on the concept of a joint distribution of fitness effects (JDFE). Simple correlation measures are derived from the JDFE that allow one to predict the evolutionary outcome in the non-home environment. Analytic theory is derived in the SSWM regime and complemented by simulations covering the regime of large mutation supply. A proof-of-concept application to collateral antibiotic resistance and sensitivity in bacteria based on a published data set for knockout strains is presented. Overall, this is an important, systematic contribution to a very timely subject.

      Major Concerns:

      1) I do not quite share the authors' surprise at the outcomes shown in Figure 1. In fact there is a simple heuristic that allows one to predict the direction of the fitness change in the non-home environment in all cases: Simply look at the y-coordinate of the tail of the JDFE corresponding to the largest beneficial effects along the x-axis.

      2) Along the three rows of panels in Figure 2, there appears to be a systematic but in two cases non-monotonic variation of the slope with the mutation supply NU_b. Do the authors have a (tentative) explanation for this behavior?

    3. Reviewer #1:

      Ardell and Kryazhimskiy use bacterial TnSeq data in multiple conditions to study the structure of pleiotropy, that is the degree to which a genetic perturbation affects multiple phenotypes, and present a theoretical framework to predict and assess fitness trajectories observed in environments other than the one selection is operating in. The work is thoroughly done and has potentially interesting implications for sequential drug therapy.

      The central object of their framework is the joint distribution of fitness effects of mutations in multiple environments where the distribution is over all mutations in the genome. The dynamics in the space of fitness in multiple environments is then modeled as a random walk (described by a diffusion equation) assuming that mutations sweep separated in time (SSWM). The model and the calculations necessary to arrive at the predictions are simple and transparent. The results quantitatively predict simulation results within the range of validity of SSWM. Outside this range, the model predicts the qualitative behavior, but is quantitatively wrong.

      1) My main disappointment with the paper is the inability to quantitatively describe the dynamics outside the SSWM regime. I would expect that the effects of competing mutations or weak selection could be accounted for at least perturbatively. Alternatively, one could determine the distribution of the effects of fixed mutations in the "home" environment in simulations and use this distribution to predict the dynamics in other environments.

      2) My other substantial concern is the question whether anything can be learned about drug resistance evolution or collateral sensitivity/resistance from TnSeq experiments. While some drug resistance evolution involves loss-of-function mutations (e.g. porin losses), it often proceeds via point mutations, up-regulation, or horizontal acquisition. Furthermore, the statistical treatment here requires many mutations to sample the joint effect distribution to give reliable answers. In clinical resistance evolution, the number of mutations observed is often quite small and their effect distributions are wide. The practical relevance of this is therefore far from clear.

      3) While the similarity of this work to similar questions in quantitative genetics is discussed in the introduction, I would like to see an extended discussion to determine whether some limits of the model at hand can be described by the quantitative genetics approach.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The reviewers agreed that pleiotropy of mutations and the resulting adaptive trajectories across different environments are important topics that are both of theoretical and applied interest. Your theoretical framework predicts fitness trajectories observed in environments other than the one selection is operating in (home environment). These trajectories in non-Home environments are calculated via integrals over the joint fitness effect distribution weighted by the fixation probability in the home environment. However, your framework assumes strong selection and weak mutation (SSWM) and deviations from this assumption seem to have strong effects. We think that these effects need to be at least partially understood. Furthermore, application to the KO library is a useful proof-of-concept, but the practical relevance of these patterns for understanding collateral sensitivity/resistance is far from obvious. In summary, we felt that the manuscript needs to make more substantive theoretical advances and/or provide more robust actionable insights into drug resistance evolution.
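
      As a rough illustration of the weighted integral described in the summary above, the sketch below replaces a continuous JDFE with a short list of hypothetical mutations. It assumes that, under SSWM, a beneficial mutation fixes with probability proportional to its home-environment effect and that neutral or deleterious mutations never fix. The function name and all numbers are illustrative and are not taken from the manuscript.

```python
def expected_nonhome_step(jdfe):
    """Average non-home effect of the next mutation to fix in the home environment.

    jdfe: list of (home_effect, nonhome_effect, probability) triples.
    Assumption: fixation probability is proportional to the home effect for
    beneficial mutations and zero otherwise (an SSWM-style approximation).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for home, nonhome, prob in jdfe:
        weight = prob * max(home, 0.0)  # chance of arising times chance of fixing
        weighted_sum += weight * nonhome
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("no beneficial mutations in the home environment")
    return weighted_sum / total_weight


# Hypothetical JDFE: many weakly beneficial mutations that are neutral or mildly
# helpful in the non-home environment, plus a rare strongly beneficial class
# that is costly there.
toy_jdfe = [
    (0.01, 0.02, 0.45),
    (0.01, 0.00, 0.45),
    (0.10, -0.05, 0.10),
]

unweighted_mean = sum(prob * nonhome for _, nonhome, prob in toy_jdfe)
fixation_weighted_mean = expected_nonhome_step(toy_jdfe)
```

      In this toy case the unweighted mean non-home effect is positive while the fixation-weighted mean is negative, because the rare large-effect class dominates what fixes. This is the same intuition as the heuristic in Reviewer #2's first concern.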

    1. Reviewer #3:

      Mahfooz et al. employed FM4-64 to assay vesicle fusion of cultured mouse hippocampal synapses. They observed that the FM destaining time course deviates from a mono-exponential function during 1-Hz and 20-Hz trains. The deviation from mono-exponential kinetics was also seen during a second stimulus train applied after recovery periods of up to eight minutes. Destaining was faster after loading at low frequency (1Hz) compared to high frequency (20 Hz). The destaining time course during high-frequency stimulation was independent of the length of preceding low-frequency trains. Conversely, short high-frequency trains did not affect destaining kinetics during subsequent low-frequency stimulation. Finally, they probed destaining in Synapsin DKO cultures and found faster destaining during short high-frequency trains and long low-frequency trains. Based on these data, the authors conclude that slowly and quickly mobilized reserve vesicles are mobilized in parallel without intermixing.

      The paper addresses an interesting question - the relationship between the mobilization and the release of synaptic vesicles. Most data are solid. However, most major conclusions are not/only weakly supported by the data presented in the manuscript. A major limitation is that direct links between the FM data and a previously established 'modular' model of reserve vesicle mobilization are missing. The following points need to be addressed:

      1) The deviation from a mono-exponential destaining time course is a central observation of the study. The quantification is essentially based on comparing relative destaining during 2-min intervals. Mostly, this 'fractional destaining' is compared between the first and last two minutes of 1- or 20-Hz stimulation. This is not convincing. By eye, most destaining time courses actually look quite mono-exponential. The authors need to provide additional evidence for a deviation from mono-exponential kinetics. For instance, data could be approximated with a double-exponential function and considered double-exponential if the amplitude and time constant of the two components significantly differ from one another, and if each component contributes significantly to the overall amplitude. How large is the amplitude of the slow exponential component, and how slow is its time course? Is the overall contribution of the slow component significant? How do the amplitudes and kinetics of reserve mobilization compare to the ones of fast/reluctant release from the RRP?

      2) Correcting for rundown/bleaching in the absence of stimulation is key for concluding that the time course differs from a single exponential. According to Raja et al. (2019), rundown was corrected by subtracting a line fit to the data before stimulus onset. The authors need to record for longer periods in the absence of stimulation and subtract these data from the data obtained in the presence of stimulation. They could also compare the resulting time constant with the slow time constant during stimulation.

      According to the methods section, data were corrected for rundown. However, many time courses (Fig. 2A, C, 3A, 4...) display a decrease in fluorescence before stimulus onset. How exactly was the data corrected? Was this also done during the recovery periods? More details are needed to conclude on a potential slowing of FM destaining.

      3) The decrease in fractional destaining depends on the duration of 1-Hz stimulation (Fig. 6B, D). How specific are the results to 1-Hz stimulation for 15 minutes? The relationship between fractional destaining and stimulation frequency/duration needs to be investigated systematically.

      4) Data is mainly represented as averages of many preparations. Individual ROIs display vastly different destaining time courses (Figure 2D). How robust are the phenotypes at the level of individual preparations? I suggest plotting and fitting average data of individual preparations in addition to showing grand averages and box plots. Moreover, it would be helpful to show the data of all ROIs and the corresponding average for one representative preparation to get a sense of the variability.

      5) The authors claim that destaining during 20-Hz stimulation is largely independent of the duration of preceding 1-Hz trains (Figure 6). However, the time courses shown in figure 6C look different. Indeed, the destaining appears slower during 20-Hz stimulation following long 1-Hz trains, arguing against the modular model. The time courses/fractional destaining of the 20-Hz data shown in figure 6 should be quantified.

      6) Destaining was faster in Synapsin DKO cultures compared to WT for short 20-Hz trains (Fig. 8A), as well as long low-frequency trains (Fig. 9). The quantification of destaining during 20-Hz stimulation for 4 s, or 0.1/1-Hz trains for the first 4 min, seems somewhat arbitrary. Is the difference by a factor of 1.5 between Synapsin DKO and WT also seen for other durations of short high-frequency trains depleting the RRP, or long low-frequency protocols?

      7) The authors claim that the destaining time courses are similar between Synapsin DKO and WT for longer 20-Hz trains (Fig 8B). However, the data shown in figure 8B indicate a difference. The destaining kinetics/fractional destaining should also be quantified for the 20-Hz trains for 100 s.

      8) The authors conclude that their data support a 'modular model', in which chains of synaptic vesicles are connected to release sites in parallel. Although this model is interesting, direct links between the FM data and the model are missing. For instance, direct links between vesicle chains, their replacement or length (Fig. 1) and the FM data are missing. I therefore suggest discussing the data in the context of the model at the end of the paper instead of starting the paper with a cartoon of the model. In general, the model, which is mainly based on previous data by the same group, should be less emphasized, and terms like "re-conceptualization" should be avoided.

      Additionally, the authors need to discuss other reasons that could explain a deviation from a mono-exponential time course. They claim to exclude potential contributions from long-term depression, because destaining is faster after 1-Hz compared to 20-Hz loading, but I don't find this convincing. How can they exclude contributions of other factors, such as pr depression (e.g. by presynaptic calcium channel inactivation; e.g. Xu and Wu, 2005), effects of endocytosis etc.? Could other aspects of the known Synapsin DKO phenotypes explain their data?
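
      To make points 1) and 2) above concrete, a minimal sketch follows. It assumes that rundown is linear and is estimated from the pre-stimulus baseline only (the line-subtraction approach referred to in point 2), and that "fractional destaining" means the fraction of the fluorescence present at the start of an interval that is lost by its end. The function names, interval boundaries and time constants are hypothetical. For a mono-exponential decay this fraction is identical in every interval of equal length, so a lower value in the last interval than in the first indicates a deviation from mono-exponential kinetics.

```python
import math


def fit_line(times, values):
    """Ordinary least-squares line; returns (slope, intercept). Needs >= 2 distinct times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def correct_rundown(times, values, stim_onset):
    """Remove the linear drift estimated from samples recorded before stim_onset.

    Assumption: the drift continues unchanged during stimulation.
    """
    pre = [(t, v) for t, v in zip(times, values) if t < stim_onset]
    slope, _ = fit_line([t for t, _ in pre], [v for _, v in pre])
    return [v - slope * (t - stim_onset) for t, v in zip(times, values)]


def fractional_destaining(f_start, f_end):
    """Fraction of the fluorescence present at interval start that is lost by its end."""
    return (f_start - f_end) / f_start


def mono_exp(t, tau):
    return math.exp(-t / tau)


def double_exp(t, a_fast, tau_fast, tau_slow):
    return a_fast * math.exp(-t / tau_fast) + (1.0 - a_fast) * math.exp(-t / tau_slow)


# First and last 2 minutes of a hypothetical 15-minute train (times in seconds).
first, last = (0.0, 120.0), (780.0, 900.0)

mono_first = fractional_destaining(mono_exp(first[0], 300.0), mono_exp(first[1], 300.0))
mono_last = fractional_destaining(mono_exp(last[0], 300.0), mono_exp(last[1], 300.0))

dbl_first = fractional_destaining(double_exp(first[0], 0.5, 100.0, 1000.0),
                                  double_exp(first[1], 0.5, 100.0, 1000.0))
dbl_last = fractional_destaining(double_exp(last[0], 0.5, 100.0, 1000.0),
                                 double_exp(last[1], 0.5, 100.0, 1000.0))
```

      With these toy parameters the mono-exponential trace gives equal fractions in both intervals, whereas the double-exponential trace gives a clearly smaller fraction in the last interval. The same comparison applied to rundown-corrected data would only be as reliable as the linear-drift assumption, which is why longer unstimulated recordings are requested in point 2).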

    2. Reviewer #2:

      This is an interesting study attempting to conceptualize the long-standing question of the mode of vesicle trafficking in presynaptic terminals. The authors used classical FM dye release experiments to support a hypothesis that rapidly and slowly releasing vesicles are mobilized in parallel without intermixing. The use of synapsin KOs effectively supports the authors' model. This modular model is also supported indirectly by the authors' recent findings of molecular links that connect a subset of vesicles in linear chains (published elsewhere). However, the scope of the model is limited by a number of caveats. The main concerns include a limited dataset measured in bulk from a highly heterogeneous synapse population, and a complex interrelationship between vesicle mobilization and FM dye de-staining kinetics. The second major limitation is measurements being performed at room temperature, which inhibits or alters a number of critical synaptic processes that are being modeled. This includes the efficiency of exo/endocytosis coupling, vesicle mobility and release site refractory period, which are stimulus- and temperature-dependent, but are not accounted for in the current model.

      Major Comments:

      1) The model lacks consideration of vesicle endocytosis efficiency. Hippocampal synapses can efficiently sustain release for at least 300APs at 35C (but not at 25C) at frequencies up to 10Hz (Fernandez-Alfonso and Ryan, 2006). Therefore a very rapid and efficient replenishment of the RRP is present at this synapse, particularly at 1Hz stimulation used in many experiments in the current study. The efficiency of endocytosis determines vesicle availability and thus release kinetics during stimulus trains; it is unclear how it is reflected in FM dye de-staining and the resulting model since the newly endocytosed and recycled vesicles are not labeled. Moreover the efficiency of exo-/endocytosis coupling is dramatically reduced at room temperatures (Fernandez-Alfonso and Ryan, 2006). It is also strongly calcium-/stimulus dependent (Leitz and Kavalali, 2011, 2014). These effects are not considered in the study, which is performed entirely at room temperature, thus greatly limiting interpretation of the results.

      2) Related to the above: authors point to lack of vesicle intermixing, a core hypothesis of the study, as being consistent with lack of vesicle mobility in previous studies. However, lack of vesicle mobility is simply an artifact of low recording temperatures (Gaffield and Betz 2007, Peng, Rotman et al. 2012); a majority of recycling synaptic vesicles are highly mobile at body temperatures (Westphal, Rizzoli et al. 2008, Kamin, Lauterbach et al. 2010, Lee, Jung et al. 2012, Park, Li et al. 2012).

      Thus intermixing might be limited or largely inhibited at room temperatures because of inefficient endocytosis or lack of vesicle mobility.

      These two considerations make it difficult to interpret the FM de-staining measurements at room temperature simply as a reflection of the mode of vesicle mobilization alone. The study would greatly benefit from more direct measurements of vesicle release, controls for endocytosis kinetics at different stimulus paradigms, and from the key measurements repeated at body temperatures.

      3) The bulk FM measurements used in the study represent an average of a highly non-homogeneous population, which is not well represented by a Gaussian distribution. Indeed, the authors show a marked variability in FM de-staining among individual synapses. Extending the model to account for variability among individual synapses would greatly strengthen the conclusions.

      4) Release site refractory period (Neher, 2010) may vary among release sites and can make substantial contributions to FM release kinetics depending on stimulation frequency. This is not accounted for in the current model.

    3. Reviewer #1:

      In this manuscript, the authors show data supporting two types of parallel reserve pool. The concept is original and interesting. However, at least for me, the manuscript is very difficult to follow because the main text and figure legends do not have sufficient explanation, and it is sometimes difficult to understand what the figures tell us (for example, what the axes mean). Therefore, after reading the manuscript several times, I cannot judge whether the data support the authors' concept or not. In addition, I have the following issues, which may come from my lack of understanding as described above.

      1) Interpretation of FM data is not necessarily straightforward, because there are stained and non-stained vesicles. In addition, stained vesicles are converted into non-stained ones after exocytosis of synaptic vesicles. It will be easier to interpret the data if the authors show EPSC data or synaptopHluorin data, which measure only exocytosis, and compare the FM data with these.

      2) Fig 3 is interesting because the data show a decrease in de-staining at the second stimulation after a longer waiting time, which is opposite to what people expect. However, the data may support the idea of mixing of stained and non-stained vesicles with time, perhaps in the same reserve pool. Figure 3 shows that dyes are completely lost after 20 Hz stimulation at the end of the protocol, which is against this idea. On the other hand, Figure 2 shows residual fluorescence remaining after 20 Hz stimulation.

      3) Fig 4 is again interesting, because loading with 1 Hz stimulation may load the vesicle pool which is used for lower stimulation frequency. However, it is not known if 1 Hz stimulation triggers more exocytosis or less compared with 20 Hz stimulation. With high frequency stimulation, there may be AP failure, Ca current inactivation, or less time for new vesicle recruitment. It would have been more informative to have additional data which directly show this (see 1).

      4) Fig 7 is not really consistent with parallel vesicle pools because 1 Hz stimulation decreases the amounts of exocytosis of the following 20 Hz stimulation (compare A and B), although C shows the amounts of exocytosis are the same between A and B.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      Overall, there was a strong enthusiasm for the topic of the study, and all the reviewers acknowledged the originality of the hypothesis being pursued. However, several technical issues and shortcomings have been raised, and as such, the present experiments fall short of compellingly supporting the conclusion of the study. These concerns have been detailed in the individual reviews below.

    1. Reviewer #3:

      This study by Haimson et al. aims at examining the diversity of dI2 interneurons and their role in coordinating activity across different regions of the spinal cord and in reporting back activity to the brain. The results show that dI2 interneurons comprise different sub-classes based on their axonal projections, soma diameter and transmitter identity. They also show that some dI2 interneurons project rostrally from the lumbar spinal cord and make putative synaptic contacts with other dI2 interneurons in the brachial spinal cord on their way to the cerebellum. Finally, it is shown that some dI2 interneurons receive putative inputs from DRG neurons and may serve to transmit movement-related feedback. An indiscriminate silencing of dI2 interneurons results in instability of locomotion. Overall, this study reports some interesting observations by showing the heterogeneity of dI2 interneurons and their potential function. I have the following concerns:

      1) 12% express Pax2 and are considered inhibitory. However, Gad is expressed in only 25% of dI2 interneurons while vGlut is expressed in 88%. These proportions suggest that there are dI2 neurons that co-express vGlut and Gad. Is this the case? Are there inhibitory dI2 neurons in addition to those expressing Pax2, which could explain the fact that Gad labels 25% of dI2 neurons? These points need some clarification and discussion.

      2) Of all dI2 interneurons, 91% are small diameter and 9% are large diameter neurons - large diameter neurons are mostly apparent in the lumbar spinal cord. The small and large diameter dI2 neurons cannot be differentiated by their expression of TFs, but can be distinguished by their transmitter identity? Is the proportion of small and large diameter neurons the same along the spinal cord?

      3) Do all dI2 neurons receive putative synaptic contacts from DRG neurons? Unless I have missed it, it would be helpful to provide quantification of the number of small vs large diameter dI2 neurons with regard to the different putative synaptic contacts they receive from DRG neurons, dI2 and V1 interneurons.

      4) Lines 218-220: It is stated that DRG putative contacts are mainly targeting dorsal dI2 neurons while ventral ones receive virtually no contacts. Since large diameter VSCT dI2 neurons are located ventrally, they do not seem to receive direct sensory information. However, the authors conclude that VSCT dI2 neurons receive sensory input (lines 227-228) and also in the Discussion. There seems to be a mismatch between the results and the conclusion drawn by the authors (lines 374-377). Unless I am missing something here, this is not consistent with the conclusions of this study. Please clarify.

      5) The silencing experiments are interesting, however it is unclear which sub-class of dI2 neurons and at what level (lumbar vs brachial spinal cord or cerebellum) the observed behavioral perturbations take place. It is possible to selectively silence excitatory vs inhibitory or only VSCT neurons to provide some link between dI2 sub-classes and behavioral perturbations.
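
      The arithmetic behind point 1) above can be written out as a short sketch. It assumes that the vGlut and Gad percentages quoted there were measured over the same population of dI2 neurons; the function name is hypothetical. Under that assumption, two markers whose percentages sum to more than 100 must overlap by at least the excess.

```python
def min_overlap_percent(pct_a, pct_b):
    """Smallest possible percentage of cells positive for both markers."""
    return max(0.0, pct_a + pct_b - 100.0)


# vGlut (88%) and Gad (25%) percentages as quoted in point 1).
min_coexpressing = min_overlap_percent(88.0, 25.0)
```

      If the percentages come from different samples or sections, this bound does not strictly apply, which is one reason the requested clarification matters.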

    2. Reviewer #2:

      This work addresses the possibility that developmentally-characterized di2 neurons contribute to the ventral spinocerebellar tract and regulate stepping in the chick. The work is sound considering that most information we have on spinal subtypes is for ventrally-born and local circuit interneurons (i.e. motor related), but less is known about the dorsally-born types and about long-range projecting neurons that link the spinal cord with higher integrative centers. Here, using a combination of cell-type specific manipulations, circuit tracing tools and kinematic analysis of gaits in the chick, the authors propose that spinal di2 interneurons contain multiple subgroups including a population that sends projections to the cerebellum. Silencing di2 neurons overall leads to impaired stepping.

      Overall, the strategy is sound and there is potential novelty, provided the weaknesses in the scientific demonstration listed below can first be addressed, experimentally and/or by additional analysis. Equally importantly, the work suffers from a severe lack of clarity (writing, figures, results).

      I start with the scientific weaknesses:

      1) Evidence for synaptic connections relies mostly on the anatomical overlap between di2 cells and the synaptic field of their putative pre-synaptic partners. While this is indeed suggestive, it is not enough to ascertain actual synaptic connections, and even less so in a comparative manner between the different groups. Furthermore, some tracers (e.g. PRVmCherry) do not seem to be under a synapse-specific promoter, so labelled elements might just as well be passing fibers. Clearer evidence of actual connections should be provided, functionally if possible or at the very least by showing clearer putative boutons onto neuronal somata/dendrites, quantifying them and quantifying differences between input cell types. Current figures (2F / 3B', C', D' / 4C, D', E', F') are not sufficiently convincing since we see only one cell and can barely detect boutons visually on some of them (not to mention that pseudo-colors keep changing, see other comment below). In addition, please consider using the term "putative" or "presumed" synapses, contacts and connections throughout the study.

      2) The loss of function and gait analysis is stronger and convincingly presented. However, unless I missed it, the strategy silences all di2 neurons but cannot discriminate the contributions of the pre-cerebellar ones. This poses problems for the interpretation of the data. Since this paper is about either subpopulations of di2, or the vSCT (see other comment about general scope of the work), it would be more robust if more specific silencing was included. It is currently assumed that one likely mechanism for the disturbed gait owes to the function of di2 as precerebellar neurons (line 385, 389) but the phenotype could also, or even entirely, be due to their proprio-spinal connectivity. This is a major caveat.

      On top of this, writing and data presentation MUST be substantially improved on multiple aspects:

      3) Please have the manuscript deeply proofread. In addition to numerous English mistakes (missing "the", "or", plural and singulars, lots of unnecessary commas, etc...) examples of confused writing include (non-exhaustive list):

      (a) Line 128: what does this phrase mean ("TF expression is redundant"...)?

      (b) Line 159: I don't understand this. Do the Di2 ascend to the cerebellum and cross the midline to the targeted di2? To which Di2 do the authors refer here? It sounds like they are in the cerebellum, or that the ascending Di2 redescend to the spinal cord...

      (c) The term "targeted" is used alternately and confusingly to refer to either "manipulated" cells or "synaptically-targeted" cells; there is also "targeted overground locomotion", etc.

      (d) Stage HH18 is sometimes referred to as E3. Please be consistent throughout.

      (e) When describing inputs onto di2, add "neurons" (i.e. "onto di2 neurons").

      4) I would appreciate more background on di2 neurons in the introduction and why these have been investigated. Currently, most of this is given in the first paragraph of the results (lines 91-100 and also line 103). Also, it is stated first that "the role of di2 neurons is elusive due to the lack of genetic targeting means" (line 59). This contradicts the later statement that "the progenitor pdi2 expresses [various transcription factors]", and that the "post mitotic di2 are defined by..." (line 103). Please clarify what is known and not known about di2 already in the introduction.

      5) Related to the above, it is not sufficiently clear what is investigated here. The genetic identity of ventral spinocerebellar neurons? Or the diversity of di2 neurons? In the way the introduction is written, it gives the impression that it is the former, but then functional investigations are not specific enough (since they are targeted to the overall di2 population, see dedicated comment later). Authors should revise to make clearer what is the scope of the work.

      6) Histology Figures should be made more convincing, self-explanatory, and to a higher standard.

      (a) Anatomical landmarks must be placed on all figures, e.g: the midline and minimal nuclei of the cerebellum, the deep cerebellar nuclei should be indicated in Fig S4,... Also, please give the orientation axis on all figures (especially the ones illustrating large territories, like 2B, 4A).

      (b) Add the CTB or HSV tracer on Fig. 2A and check coherence: I believe for instance that HSP is wrongly stated instead of HSV in Fig 2D and PRV is wrongly stated instead of CTB in Fig 2F (and there might be other confusions throughout).

      (c) It is extremely confusing that histology pseudo-colors are sometimes changed from one related figure to the other, for unclear reasons (e.g. 2B, 2B', 2C, also 2C and S4A...). Consistency will help the reader go through all panels and figures comparatively.

      (d) Figures must be addressed in proper order. This also applies to supplemental figures. Otherwise, it gives the impression we have missed something.

      (e) What is the rationale for plotting the overlap in area versus volume (Figure 2H, I)? If overlap with area shows a higher percentage than with volume, does it mean that the overlap is only limited to a given A/P plane? I'm really confused about this representation and its meaning.

      7) Authors should avoid relying on subjective formulations like "that reside at the lateral dorsal aspect of lamina VII". Instead, they MUST demonstrate the positioning of Di2 neurons within the different spinal laminae with some form of quantitative measurement. This is currently just an "impression" that large, precerebellar Di2 are more ventral, in lamina VII and possibly VIII, but without the representation of lamina borders on figures, this information cannot be appreciated by the reader. It is essential that these borders are depicted in the Figures and that neurons are quantitatively allocated to each lamina. In addition/alternatively, authors should report the average D/V position of the different subtypes and test for significant differences to make the case for different spatially-confined populations stronger.

      8) FoxD3 expression on Supplemental Figure 2B is not convincing. It is also not reported in the statistics of Fig 1E. Do we have to assume that all di2 investigated here are FoxD3-positive? If so, one would need a better illustration and quantifications should be given. Otherwise, I would suggest simply relying on the literature and removing Figure S1B which is not helping. On other panels of that supplemental Figure 2, please add arrow/arrowheads on all neurons that are or are not co-labelled so we can appreciate co-labelling.

      9) The demonstration that di2 are excitatory is essential. It is the title of a paragraph (line 102), thus I think that the corresponding data with the neurotransmitters (Vglut2, GAD) would deserve to be in the main Figures. Also, the chosen illustration only shows ONE double-labelled cell with Vglut2. Authors should be able to show a field of view that more convincingly conveys the message with more cells.

    3. Reviewer #1:

      This is a well-put-together manuscript describing carefully performed circuitry dissection and functional analysis of dl2 neurons in the chick. A genetic toolbox is used taking advantage of the electroporation technique applied to the embryos. The findings include a fairly convincing connectome for dl2 neurons and a functional phenotype that is, unfortunately, rather unsatisfying. The investigators conclude that dl2 interneurons regulate "stability" of bipedal stepping in the chick, which is fine, but the analysis misses an opportunity to more fully explore what the instability involves and thus to perhaps shed more light on the likely roles of this neuron population. The concerns/issues 3 and 4 below focus on this issue and the need for additional careful analysis of the behavior that will allow the phenotype to be more precisely described or ascribed to some aspect of stepping that might guide future studies in other models. For example, can the link between partial collapse and over-extensions be made more solid and thus argue that reduced extensor gain might be what results in the instability? What other analysis could be performed using the existing data/video to better describe the behavioral phenotype?

      Major Concerns:

      1) The connectome part of the work appears solid and supports the concept that a subpopulation of the population are likely VSCT neurons, that the non-VSCT neurons receive the bulk of the afferent input and that these neurons project to contralateral dl2 neurons (some of which may be VSCT) and other premotor neurons. Anatomically, the only concern is that no distinctions were made between the lumbar and brachial populations, and if differences in these populations exist, it would be important and interesting to describe them.

      2) Figure 2: The characterization of dl2/VSCT neurons as being primarily large dl2 neurons is quite convincing, and the observation that the dl2 neurons account for 10% of the VSCT axons is also of interest and quite compelling. A question arises, however, about the source, rostrocaudally, of the VSCT neurons and tract. Is the 10% for the total or for a specific level or levels? Can more be said/quantified about differences in these populations at different spinal levels?

      3) Whole-body collapses and subsequent over-extensions are important and speak to changes in reflex arc and motor output. The statement "usually followed by" over-extension should be followed up. Can this be further quantified? Are the two events linked or distinct, and did over-extensions happen in the absence of collapses?

      4) These issues mesh with the lower knee height and angle of the TMP joint, even when collapses are excluded. It appears as though the control system to maintain muscle shortening (force output of extensors) is altered. I agree that stability is compromised, but could we go further to state that the compromise is due to extensor gain control?

    1. Reviewer #3:

      In this work Stachniak and colleagues investigate the role of Prox1 in the development of VIP cells. Prox1 is expressed by the majority of GABAergic cells derived from the caudal ganglionic eminence (CGE), and as mentioned by the authors, Prox1 has been shown to be necessary for the differentiation, circuit integration, and maintenance of CGE-derived GABAergic cells. Here, Stachniak and colleagues show that removal of Prox1 in VIP cells leads to suppression of synaptic release probability onto cortical multipolar VIP cells in a mechanism dependent on Elfn1. This work is of interest for the field because it increases our understanding of differential synaptic maturation of VIP cells. The results are noteworthy; however, the relevance of this manuscript would potentially be increased by addressing the following suggestions:

      1) Include histology to show when exactly Prox1 is removed from multipolar and bipolar VIP-expressing cells by using the VIP-Cre mouse driver.

      2) Clarify if the statistical analysis is done using n (number of cells) or N (number of animals). The analysis between control and mutants (both Prox1 and Elfn1) needs to be done across animals and not cells.

      3) Clarify what are the parameters used to identify bipolar vs multipolar VIP cells. VIP cells comprise a wide variety of transcriptomic subtypes, and in the absence of using specific genetic markers for the different VIP subtypes, the authors should either include the reconstructions of all recorded cells or clarify if other methods were used.
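
To make the request in point 2 concrete, here is a minimal sketch of an analysis carried out across animals rather than cells. The data values, the animal identifiers, and the choice of Welch's t statistic are all assumptions made for illustration; they are not taken from the manuscript.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def per_animal_means(records):
    """Collapse (animal_id, value) cell records into one mean per animal."""
    by_animal = defaultdict(list)
    for animal_id, value in records:
        by_animal[animal_id].append(value)
    return {animal: mean(values) for animal, values in by_animal.items()}


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical paired-pulse ratios: several recorded cells per animal.
control = [("c1", 1.9), ("c1", 2.1), ("c2", 1.8), ("c2", 2.0), ("c3", 2.2)]
mutant = [("m1", 1.4), ("m1", 1.6), ("m2", 1.5), ("m3", 1.3), ("m3", 1.5)]

# The unit of analysis is the animal (N), not the cell (n).
control_means = list(per_animal_means(control).values())
mutant_means = list(per_animal_means(mutant).values())
t_stat, dof = welch_t(control_means, mutant_means)
print(f"N = {len(control_means)} vs {len(mutant_means)} animals, "
      f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Averaging within each animal first means that cells from the same animal are not treated as independent observations. A mixed-effects model with animal as a random effect would be another way to address the same concern.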

    2. Reviewer #2:

      Stachniak et al. provide an interesting manuscript on the postnatal role of the critical transcription factor, Prox1, which has been shown to be important for many developmental aspects of CGE-derived interneurons. Using a combination of genetic mouse lines, electrophysiology, FACS + RNAseq and molecular imaging, the authors provide evidence that Prox1 is genetically upstream of Elfn1. Moreover, they go on to show that loss of Prox1 in VIP+ cells preferentially impacts those that are multipolar but not the bipolar subgroup characterized by the expression of calretinin. This latter finding is very interesting, as the field is still uncovering how these distinct subgroups emerge but lacks good molecular tools to fully address these questions. Overall, this is a great combination of data that uses several different approaches to come to the conclusions presented. I have suggestions that I think would strengthen the manuscript:

      1) Can the authors add a supplemental table showing the top 20-30 genes up and down regulated in their Prox1 KOs? This would make these, and additional, data more accessible to readers.

      2) It is interesting that loss of Prox1 or Elfn1 leads to phenotypes in multipolar VIP+ cells that are absent or mild in bipolar VIP+ cells. The authors test different hypotheses, which they are able to refute, and discuss some ideas for how multipolar cells may be more affected by loss of Elfn1, even when the transcript is lost in both multipolar and bipolar cells after Prox1 deletion. If there is any way to expand upon these ideas experimentally, I believe it would greatly strengthen the manuscript. I understand there is no perfect experiment due to a lack of tools and reagents but if there is a way to develop one of the following ideas or something similar, it would be beneficial:

      a) Would it be possible to co-fill VIPCre labeled cells with biocytin and a retroviral tracer? Then, after the retroviral tracer had time to label a presynaptic cell, assess whether these were preferentially different between bipolar and multipolar cell types, the latter morphology determined by the biocytin fill? This would test whether each VIP+ subtype is differentially targeted.

      b) Another biocytin possibility would be to trace filled VIP+ cells and assess whether the dendrites of multipolar and bipolar cells differentially targeted distinct cortical laminae and whether these laminae, in the same section or parallel, were enriched for mGluR7+ afferents.

    3. Reviewer #1:

      Stachniak and colleagues examine the physiological effects of removing the homeobox TF Prox1 from two subtypes of VIP neurons, defined on the basis of their bipolar vs. multipolar morphology.

      The results will be of interest to those in the field, since it is known from prior work that VIP interneurons are not a uniform class and that Prox1 is important for their development.

      The authors first show that selective removal of a conditional Prox1 allele using a VIP cre driver line results in a change in paired pulse ratio of presumptive excitatory synaptic responses in multipolar but not bipolar VIP interneurons. The authors then use RNA-seq to identify differentially expressed genes that might contribute and highlight a roughly two-fold reduction in the expression of a transcript encoding a trans-synaptic protein Elfn1 known to contribute to reduced glutamate release in Sst+ interneurons. They then test the potential contribution of Elfn1 to the phenotype by examining whether loss of one allele of Elfn1 globally alters facilitation. They find that facilitation is reduced both by this genetic manipulation and by a pharmacological blockade of presynaptic mGluRs known to interact with Elfn1.

      Although the results are interesting, and the authors have worked hard to make their case, the results are not definitive for several reasons:

      1) The global reduction of Elfn1 may act cell autonomously, or may have other actions in other cell types. The pharmacological manipulation is less subject to this interpretation, but these results are not as convincing as they could be because the multipolar Prox1 KO cells (Fig. 3J) still show substantial facilitation comparable, for example, to the multipolar control cells in the Elfn1 Het experiment (controls in Fig. 3E). This raises a concern about control for multiple comparisons. Instead of comparing the 6 conditions in Fig. 3 with individual t-tests, it may be more appropriate to use ANOVA with posthoc tests controlled for multiple comparisons.

      2) The isolation of glutamatergic currents is not described. Were GABA antagonists present to block GABAergic currents? Especially with the Cs-based internal solutions used, chloride reversal potentials can be somewhat depolarized relative to the -65 mV holding potential. If IPSCs were included it would complicate the analysis.

      3) The assumption that protein levels of Elfn1 are reduced to half in the het is untested. Synaptic proteins can be controlled at the level of translation and trafficking and WT may not have twice the level of this protein.

      4) The authors are to be commended for checking whether Elfn1 is regulated by Prox1 only in the multipolar neurons, but unfortunately it is not. The authors speculate that the selective effects reflect a selective distribution of mGluR7, but without additional evidence it is hard to know how likely this explanation is.
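
The multiple-comparisons concern in point 1 can be illustrated with a short sketch. It applies the Holm-Bonferroni step-down correction to a set of raw p-values. This is one common family-wise correction, chosen here as an assumption; it is not the ANOVA itself and not necessarily the posthoc procedure the authors would use. The p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from six individual comparisons.
raw = [0.004, 0.030, 0.020, 0.600, 0.045, 0.010]
adjusted = holm_adjust(raw)
for p, q in zip(raw, adjusted):
    print(f"raw p = {p:.3f} -> Holm-adjusted p = {q:.3f}")
```

In this made-up example, several comparisons that fall below 0.05 when tested individually no longer do so after correction, which is the kind of outcome the reviewer's concern anticipates.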

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      This work is of interest because it increases our understanding of the molecular mechanisms that distinguish subtypes of VIP interneurons in the cerebral cortex and because of the multiple ways in which the authors address the role of Prox1 in regulating synaptic function in these cells.

    1. Reviewer #3:

      This manuscript reports results from an eye tracking study of humans walking in natural terrain. These eye movements together with images simultaneously obtained by a head-fixed camera are used to calculate optic flow fields as seen by the retina and as seen by the head-fixed camera. Next, the structure of these flow fields is described. It is noted that this structure is somewhat stable in the retinal image, due to compensatory gaze stabilisation reflexes, but varies wildly in the head-centric image. Then, the authors estimate the focus of expansion in the head-centric flow and argue that it cannot be used for locomotor control, because it also varies wildly during walking. In a second, more theoretical section of the manuscript, they calculate retinal flow for a movement over an artificial ground plane, given the locomotor and eye movements recorded previously. They describe the structure of the retinal flow and compute the distribution of curl and divergence across the retina as well as in a projection onto the ground plane. They argue that curl around the fovea and the location of the maximum of divergence can be used to estimate the direction of walking relative to the direction of gaze and in relation to the ground plane.

      I really like the experimental part of the study. However, I see fundamental issues in the theoretical part, in the general framing of the presentation, and in misrepresentations of previous literature.

      The simultaneous measurement of head-centric image and gaze with sufficient temporal resolution to calculate retinal flow during natural walking provides a beautiful demonstration of retinal flow fields, and confirms many known aspects of retinal flow. The calculation of head-centric flow from the head camera images provides a compelling, though not unexpected, demonstration that the FoE in head-centric flow is not useful for locomotor control. It is not unexpected since one of the most well-known issues in optic flow is that the FoE is destroyed when self-motion contains rotational components (Regan and Beverley, 1982, Warren and Hannon, 1990, Lappe et al. 1999). Although this is often presented as an issue of eye movements in retinal flow, it applies to all rotations and combinations of rotations that exist on top of any translational motion of the observer. Thus, the oscillatory bounce and sway motion of the head during walking is expected to render any use of the FoE in a head-centric image futile.
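
The effect of rotation on the FoE described above can be sketched numerically. The example below computes image motion for a pinhole camera (focal length 1) viewing a flat wall, first under pure forward translation and then with an added yaw rotation, and locates the point of minimal image speed in each case. The scene geometry, the speeds, and all function names are assumptions chosen for illustration; nothing here reproduces the manuscript's analysis.

```python
import numpy as np


def image_flow(points, translation, rotation):
    """Image-plane velocity of static 3D points seen by a moving camera (f = 1).

    points: (N, 3) array in camera coordinates. Relative to the camera, a
    static point moves with velocity -T - omega x P.
    """
    velocity = -translation - np.cross(rotation, points)
    x = points[:, 0] / points[:, 2]
    y = points[:, 1] / points[:, 2]
    z = points[:, 2]
    # Time derivative of the projection (X/Z, Y/Z).
    u = velocity[:, 0] / z - x * velocity[:, 2] / z
    v = velocity[:, 1] / z - y * velocity[:, 2] / z
    return np.stack([u, v], axis=1)


def singular_point(translation, rotation, depth=5.0, n=201):
    """Image location with the smallest flow speed for a wall at fixed depth."""
    axis = np.linspace(-0.5, 0.5, n)
    gx, gy = np.meshgrid(axis, axis)
    points = np.stack(
        [gx.ravel() * depth, gy.ravel() * depth, np.full(gx.size, depth)], axis=1
    )
    speed = np.linalg.norm(image_flow(points, translation, rotation), axis=1)
    k = int(np.argmin(speed))
    return gx.ravel()[k], gy.ravel()[k]


forward = np.array([0.0, 0.0, 1.0])   # hypothetical heading: straight ahead
no_rotation = np.zeros(3)
yaw = np.array([0.0, 0.05, 0.0])      # hypothetical yaw rate in rad/s

foe_pure = singular_point(forward, no_rotation)
foe_rotated = singular_point(forward, yaw)
print("singular point, translation only:", foe_pure)
print("singular point, translation + yaw:", foe_rotated)
```

With translation alone the singular point coincides with the heading direction at the image centre. Adding a small yaw rotation moves it away from the heading, so the location of the apparent FoE no longer indicates the direction of travel.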

      Yet, the first part of the manuscript is very much framed as a critique of the idea of a stable FoE in head-centric flow, presuming that this is what previous researchers commonly believed. This argument contains a logical fallacy. Previous research argued that there is no FoE in retinal flow because of eye rotations (e.g. Warren and Hannon, 1990). This does not predict, inversely, that there is an FoE in head-centric flow. In fact, it does not provide any prediction on head-centric flow. The authors often suggest that a stable FoE in head-centric flow is tacitly implied, commonly believed, etc without providing reference. In fact, the only paper I know that specifically proposed a head-centric representation of heading is by van den Berg and Beintema (1997).

      Instead, the fundamental problem of heading perception is to estimate self-motion from retinal flow when the self-motion that generates retinal flow combines all kinds of translations and rotations. The present study shows, consistent with much of the prior literature, that the patterns of retinal flow are sufficiently stable and informative to obtain the direction of one's travel in a retinal frame of reference, and, via projection, with respect to the ground plane. This is due to the stabilising gaze reflexes that keep motion small near the fovea and produce (in case of a ground plane) a spiralling pattern of retinal flow. This is well known from theoretical and lab studies (e.g. Warren and Hannon, 1990, Lappe et al., 1998, Niemann et al., 1999, Lappe et al. 1999) and, to repeat, beautifully shown for the natural situation in the present data. The presentation should link back to this work rather than trying to shoot down purported mechanisms that are obviously invalid.

      The second part of the manuscript presents a theoretical analysis of the retinal flow for locomotion across a ground plane under gaze stabilisation. This has two components: (a) the structure of the retinal flow and the utility of gaze stabilisation, and (b) ways to recover information about self-motion from the retinal flow. Both aspects have a long history of research that is neglected in the present manuscript. The essential circular structure of the retinal flow during gaze stabilisation is long known (Warren and Hannon, 1990, van den Berg, 1996, Lappe et al., 1998, Lappe et al. 1999). Detailed analyses of the statistical structure of retinal flow during gaze stabilisation have shown the impact and utility of gaze stabilisation (Calow et al., 2004; Calow and Lappe, 2007; Roth and Black, 2007) and provided links to properties of neurons in the visual system (Calow and Lappe, 2008). These studies included simulated motions of the head during walking, as in the current manuscript, and extended to natural scenes other than a simple ground plane.

      Given the structure of the retinal flow during gaze stabilisation the central question is how to recover information about self-motion from it. The authors investigate a proposal originally made by Koenderink and van Doorn (1976; 1984) that relies on estimates of curl and divergence in the visual field. They propose that locomotor heading may be determined directly in retinotopic coordinates (l. 314). This is true, but it fails to mention that other models of heading perception during gaze stabilisation similarly determine heading in retinotopic coordinates (e.g. Lappe and Rauschecker, 1993; Perrone and Stone, 1994; Royden, 1997). In fact, as outlined above, the mathematical problem of self-motion estimation is typically presented in retinal (or camera) coordinates (e.g. Longuet-Higgins and Prazdny, 1980). The problem with the divergence model in comparison to the other models above is threefold. First, it really only works for a plane, not in other environments. Second, it requires a local estimate of divergence at each position in the visual field. The alternative models above combine information across the visual field and are therefore much more robust against noise in the flow. One would need to see whether the estimate of the divergence distribution is sufficient to work with the natural flow fields. Third, being a local measure it requires a dense flow field while heading estimation from retinal flow is known to work with sparse flow fields (Warren and Hannon, 1990). Thus, the theoretical part of the manuscript should either provide proof that the maximum of divergence is superior to these other models or broaden the view to include these models as possibilities to estimate self motion from retinal flow.

      The case is similar for the use of curl. It is true that the rotational or spiral pattern around the fovea in retinal flow provides information about the direction of self-motion with respect to the direction of gaze, as has been noted many times before. This structure is used by many models of heading estimation. However, curl is, like divergence, a local property and thus not as robust as models that use the entire flow field. It may be interesting to note that neurons in optic flow responsive areas of the monkey brain can pick up this rotational pattern and respond to it consistently with their preference for self-motion across a plane (Bremmer et al., 2010; Kaminiarz et al. 2014).
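
For readers less familiar with the two quantities under discussion, the sketch below computes divergence and curl of a sampled 2D flow field by finite differences. The spiral test field and its coefficients are hypothetical stand-ins for retinal flow near the fovea, and the finite-difference estimator is an assumption, not the method used in the manuscript.

```python
import numpy as np


def divergence_and_curl(u, v, axis_values):
    """Finite-difference divergence and curl of a 2D flow field.

    u, v: (n, n) arrays of horizontal and vertical flow sampled on a grid
    built with np.meshgrid(axis_values, axis_values), so x varies along
    axis 1 and y along axis 0.
    """
    du_dx = np.gradient(u, axis_values, axis=1)
    du_dy = np.gradient(u, axis_values, axis=0)
    dv_dx = np.gradient(v, axis_values, axis=1)
    dv_dy = np.gradient(v, axis_values, axis=0)
    return du_dx + dv_dy, dv_dx - du_dy


# Hypothetical spiral field: expansion rate a plus rotation rate b.
a, b = 0.3, 0.1
axis_values = np.linspace(-1.0, 1.0, 41)
x, y = np.meshgrid(axis_values, axis_values)
u = a * x - b * y
v = b * x + a * y

div, curl = divergence_and_curl(u, v, axis_values)
print("mean divergence:", div.mean())   # analytically 2 * a
print("mean curl:", curl.mean())        # analytically 2 * b
```

Because both quantities are built from differences between neighbouring flow vectors, noise in the flow estimate enters them directly, which is the robustness concern raised for local measures in the paragraphs above.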

      I think what the authors may want to draw more attention to is the dynamics of the retinal flow and the associated self-motion in retinal (or plane projection) coordinates. The movies provide compelling illustrations of how the direction of heading (or the divergence maximum, if you want to focus on that) sways back and forth on the retina and on the plane with each step. This requires that the analysis of retinal flow (and the estimation of self-motion) has to be fast and dynamic, or maybe should include some form of temporal prediction or filtering. Work on the dynamics of retinal flow perception has indeed shown that heading estimation can work with very brief flow fields (Bremmer et al. 2017), that the brain focuses on instantaneous flow fields (Paolini et al. 2000) and that short presentations sometimes provide better heading estimates than long presentations (Grigo and Lappe, 1999). The temporal dynamics of retinal flow is an underappreciated problem that could be more in the focus of the present study.

      Additional specific comments:

      Footnote on page 2: It is not only VOR but also OKN (Lappe et al., 1998, Niemann et al., 1999) that stabilises gaze in optic flow fields.

      Line 55: Natural translation and acceleration patterns of the head have been considered by (Cutting et al., 1992; Palmisano et al. 2000; Calow and Lappe, 2007, 2008; Bossard et al., 2016)

      Line 59: The statement is misleading that the key assumption behind work on the rotation problem is that the removal of the rotational component of flow will return a translational flow field with a stable FoE. Only one class of models, those using differential motion parallax (Rieger and Lawton, 1985, Royden, 1997) explicitly constructs a translational flow field and aims to locate the FoE in that field. Other models (Koenderink and van Doorn, 1976, 1984; Lappe and Rauschecker, 1993; Perrone and Stone, 1994) do not subtract the rotation but estimate heading in retinal coordinates from the combined retinal flow. This also applies to line 109.

      Last paragraph on page 5: Measures of eye movement during walking in natural terrain were also taken by Calow and Lappe (2008) and 't Hart and Einhäuser (2012).

      Lines 140 to 163: This paragraph is problematic and misleading as pointed out before.

      Line 193: The lack of stability is expected, as outlined above. The use of a straight line motion in psychophysical experiments reflects an experimental choice to investigate the rotation problem in retinal flow, not an implicit assumption that bodily motion is usually along a straight line.

      Line 200: That gaze stabilization may be an important component in understanding the use of optic flow patterns has also long been assumed (Lappe and Rauschecker, 1993; 1994; 1995; Perrone and Stone, 1994; Glennerster et al. 2001; Angelaki and Hess, 2005; Pauwels et al., 2007).

      Line 314: Locomotor heading may be determined directly in retinotopic coordinates. Yes, and this is precisely what the above mentioned models do.

      Line 334: What is meant by "robust" here? The videos seem to show simulated flow for a ground plane, not the real flow from any of the terrains. It is not clear whether the features can be extracted from the real terrain retinal flow.

      First paragraph on page 15: This is an important discussion about the dynamics of retinal flow in conjunction with the dynamics of the gait cycle. It should be expanded and better balanced with respect to previous work and other models. It is true that any simple inference of an FoE would not work. However, models that estimate heading (not FoE) in the retinal reference frame would be consistent with the discussion. Oscillations of the head during walking affect the location of the divergence maximum and curl as much as the direction of heading in retinal coordinates. In fact, the videos nicely show how these variables oscillate with each step. This applies to all retinal flow analyses, and is a problem for any model. It requires a dynamical analysis. The speed of neural computations is an issue, of course, but it applies to divergence and curl in the same way as to other models. There is some indication, however, that neural computations on optic flow are fast, deal with instantaneous flow fields, and respond consistently to natural (spiral) retinal flow, as described above.

      Line 393: This paragraph is misleading in suggesting that naturally occurring flow fields have not been used in psychophysical and electrophysiological experiments.

      Line 516: This has been done by Bremmer et al. (2010) and Kaminiarz et al. (2014). Their results are consistent with computing heading directly in a retinal reference frame as predicted by several models of retinal flow analysis (e.g. Lappe et al. 1999).

      References:

      Angelaki, D. E. and Hess, B. J. M. (2005). Self-motion-induced eye movements: effects on visual acuity and navigation. Nat. Rev. Neurosci., 6:966-976.

      Bossard, M., Goulon, C., and Mestre, D. R. (2016). Viewpoint oscillation improves the perception of distance travelled based on optic flow. J Vis, 16(15):4.

      Bremmer, F., Kubischik, M., Pekel, M., Hoffmann, K. P., and Lappe, M. (2010). Visual selectivity for heading in monkey area MST. Exp. Brain Res., 200(1):51-60.

      Calow, D., Krüger, N., Wörgötter, F., and Lappe, M. (2004). Statistics of optic flow for self-motion through natural scenes. In Ilg, U., Bülthoff, H. H., and Mallot, H. A., editors, Dynamic Perception, Workshop of the GI Section 'Computer Vision', pages 133-138, Berlin. Akademische Verlagsgesellschaft Aka GmbH.

      Calow, D. and Lappe, M. (2007). Local statistics of retinal optic flow for self-motion through natural sceneries. Network, 18(4):343-374.

      Calow, D. and Lappe, M. (2008). Efficient encoding of natural optic flow. Network Comput. Neural Syst., 19(3):183-212.

      Cutting, J. E., Springer, K., Braren, P. A., and Johnson, S. H. (1992). Wayfinding on foot from information in retinal, not optical, flow. J. Exp. Psychol. Gen., 121(1):41-72.

      Grigo, A. and Lappe, M. (1999). Dynamical use of different sources of information in heading judgments from retinal flow. JOSA A, 16(9):2079-2091.

      't Hart, B. M. and Einhäuser, W. (2012). Mind the step: complementary effects of an implicit task on eye and head movements in real-life gaze allocation. Exp. Brain Res., 223(2):233-249.

      Kaminiarz, A., Schlack, A., Hoffmann, K.-P., Lappe, M., and Bremmer, F. (2014). Visual selectivity for heading in the macaque ventral intraparietal area. J. Neurophysiol., 112(10):2470-2480.

      Lappe, M., Pekel, M., and Hoffmann, K. P. (1998). Optokinetic eye movements elicited by radial optic flow in the macaque monkey. J. Neurophysiol., 79(3):1461-1480.

      Lappe, M. and Rauschecker, J. P. (1993). A neural network for the processing of optic flow from ego-motion in man and higher mammals. Neural Comp., 5(3):374-391.

      Lappe, M. and Rauschecker, J. P. (1994). Heading detection from optic flow. Nature, 369(6483):712-713.

      Lappe, M. and Rauschecker, J. P. (1995). Motion anisotropies and heading detection. Biol. Cybern., 72(3):261-277.

      Niemann, T., Lappe, M., Büscher, A., and Hoffmann, K. P. (1999). Ocular responses to radial optic flow and single accelerated targets in humans. Vision Res., 39(7):1359-1371.

      Pauwels, K., Lappe, M., and Van Hulle, M. M. (2007). Fixation as a mechanism for stabilization of short image sequences. Int. J. Comp. Vis., 72(1):67-78.

      Perrone, J. A. and Stone, L. S. (1994). A model of self-motion estimation within primate extrastriate visual cortex. Vision Res., 34(21):2917-2938.

      Regan, D. and Beverley, K. I. (1982). How do we avoid confounding the direction we are looking and the direction we are moving? Science, 215:194-196.

      Rieger, J. H. and Lawton, D. T. (1985). Processing differential image motion. J. Opt. Soc. Am. A, 2(2):354-360.

      Roth, S. and Black, M. J. (2007). On the spatial statistics of optical flow. Int. J. Comp. Vis., 74(1):33-50.

      Royden, C. S. (1997). Mathematical analysis of motion-opponent mechanisms used in the determination of heading and depth. J. Opt. Soc. Am. A, 14(9):2128-2143.

      van den Berg, A. V. (1996). Judgements of heading. Vision Res., 36(15):2337-2350.

      van den Berg, A. V. and Beintema, J. A. (1997). Motion templates with eye velocity gain fields for transformation of retinal to head centric flow. NeuroReport, 8(4):835-840.

    2. Reviewer #2:

      The manuscript by Matthis et al. nicely measures both the visual scene and eye, body, and head kinematics during natural locomotion. The authors propose that certain features of optic flow as observed at the retina might be useful to guide locomotion. The data are a natural follow-up to earlier work from the same group that examined patterns of gaze during locomotion across different terrains. Taken together, the work here is a fine extension of the earlier paper, suggesting an interesting perspective on the way visual information could be processed to facilitate locomotion. Unfortunately, these findings are framed in the manuscript as if they overturn a dogma about the use of the head-centered Focus of Expansion (192-195, 397-399, 440). I found this argument to be quite confusing and insufficiently supported. As a result it was hard to evaluate the impact of this work.

      The authors find that one cannot extract a useful flow-field from a head-mounted camera (section 2, 153-159). The literature cited doesn't claim that it would be, and given the familiarity with the VOR, I wouldn't expect it to. I was further confused by the fact that the authors could extract a useful FoE from drone video -- a clever calibration of their analysis! As a (mediocre) drone pilot, I know that the gimbal uses pitch/yaw/roll acceleration to stabilize a camera relative to the drone body at an angle defined by the user. If the authors can extract an FoE from such footage then certainly when the VOR does the same stabilization for the eye a similar computation ought to obtain (contra 52-53). Furthermore, it is well-established that the oculomotor system provides a veridical estimate of eye-in-orbit to the rest of the brain: wouldn't this be the final component necessary to transform retinal flow into "head-centered FoE"? There is considerable work that proposes solutions to understand the transformation from retinal coordinates to body-centered coordinates. The manuscript would benefit from consideration of these issues.

      None of this is to say that curl as computed at the fovea isn't useful for locomotion. To that point, the authors might find Oteiza et al., Nature 2017 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5873946/ interesting as an example of another sensory system that uses curl as a cue for navigation. Notably, though, the manuscript doesn't even establish that curl is useful for locomotion, only that it might be. Optic flow fields generate a strong percept of self-motion, and they have been used to study perception or the neural correlates thereof. It isn't clear that the work here truly speaks to those findings, much less overturns their foundation.

    3. Reviewer #1:

      The study of how optic flow guides perception and action dates back to the 1950s and drew inspiration from pilots flying planes and birds gliding in the sky. These relatively constant-speed translational motions are different from what humans do every day, which is walking. Nevertheless, it is often assumed that laboratory findings using stimuli simulating smooth translational self-motion can be generalized to locomotive optic flow processing. In this paper, the authors directly challenge this assumption by investigating the structure of flow during natural locomotion, using simultaneous recordings of eye and body movements and the participants' view during walking. Their findings call for reconsidering assumptions about optic flow processing during natural locomotion, including the role of stabilizing eye movements.

      One of the most substantial contributions this paper makes is the careful characterization of the structure of flow in a naturalistic context, in terms of both the behavior involved and the environment in which the behavior occurs. The dataset is rich, challenging to come by, and complex to process. I applaud the authors' efforts to describe and contextualize the observed patterns. I am convinced about most claims made in the paper, with specific concerns and ideas to strengthen them as elaborated below. This work can have a significant impact as it is relevant not only to researchers studying vision and action in naturalistic contexts but also to researchers who translate basic science knowledge to advance real-life simulation (e.g., virtual reality, simulators, rehabilitation).

      Major Comments:

      1) A key finding from this paper is that the focus-of-expansion (FoE), a cue to heading direction, is highly variable in head-centered flow without considering eye movements. Although I am convinced about the variability of FoE velocity in head-centered optic flow based on the results reported by the authors, I see potential to strengthen the interpretation of this finding. The authors attribute the instability of the FoE to head motion during natural locomotion by showing the distribution of FoE velocities (Fig. 2) and the changes in head velocity as a function of % step (from one heel strike to the next, Fig. 3), respectively. More direct evidence for this link would be that the FoE velocity changes as a function of % step, resembling the patterns shown in Figure 3. Is this the case? I believe this result, if true, will strengthen the authors' claim.

      2) The instability of the FoE is contrasted against the stability of the retinal flow, as illustrated in Figure 2. The authors did not characterize the eye movements used to achieve this stabilization and only briefly introduced the vestibulo-ocular reflex (p. 2, line 21; Fig. 1 caption). While it might be beyond the scope of this paper to characterize these eye movements, it would be appropriate to include literature on how eye movements respond to laboratory optic flow stimuli (e.g., Knöll, Pillow & Huk, 2018; Niemann, Lappe, Büscher & Hoffmann, 1999). This literature provides a link between the eyes-fixed laboratory studies cited by the authors and the eyes-free naturalistic setting adopted in this paper.

      3) The other key finding is that retinal flow contains simple geometric features (curl, divergence) corresponding with the direction of heading relative to the fovea. The authors proposed that these cues could be used to determine the heading direction. The idea that visual cues other than the FoE can guide heading direction and its perception is not new, as the authors have adequately cited previous studies suggesting so. Nonetheless, it is crucial to distinguish between speculation and empirical evidence showing the role of these cues. This paper has not demonstrated that participants can determine heading direction using these cues alone, or that the curl/divergence cues affect participants' behavior. The lack of an empirical test for these cues is concerning when combined with some statements that can be interpreted as if such a test had been done. For example, on p. 5 lines 105-110, the authors wrote: 'We show that this structure of fixation-mediated retinal optic flow provides a rich and robust source of information that is directly relevant to locomotor control without the need to subtract out or correct for the effects of eye rotations' and on p. 14 lines 347-349: 'We found that a walker can determine whether they will pass to the left or right of their fixation point by observing the sign and magnitude of the curl of the flow field at the fovea.' If the role of these cues in behavior can be demonstrated from the data (e.g., by correlating simulated retinal flow cues and kinematic data), I recommend adding this analysis to support the authors' claim. Otherwise, I think all statements related to this claim (not exclusive to the ones listed here) should be checked and altered.
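
      For readers unfamiliar with the two cues discussed in this comment, the following is a minimal sketch of how curl and divergence can be computed from a sampled 2D flow field. It is not the authors' analysis pipeline: the function name `curl_and_divergence`, the grid, and the two synthetic test fields are assumptions introduced purely for illustration, and the sign of the curl depends on the axis convention (image coordinates with y pointing down flip it).

```python
import numpy as np

def curl_and_divergence(u, v, spacing=1.0):
    """Return (curl, divergence) for a 2D flow field sampled on a regular grid.

    u, v: 2D arrays holding the horizontal and vertical flow components,
    indexed as [row (y), column (x)]. The name and signature are hypothetical.
    """
    # np.gradient returns derivatives along axis 0 (y) first, then axis 1 (x).
    du_dy, du_dx = np.gradient(u, spacing)
    dv_dy, dv_dx = np.gradient(v, spacing)
    curl = dv_dx - du_dy        # local rotation of the flow
    divergence = du_dx + dv_dy  # local expansion of the flow
    return curl, divergence

# Synthetic grid centred on a notional fixation point at (0, 0).
y, x = np.mgrid[-5:6, -5:6].astype(float)

# Pure expansion (u = x, v = y): flow points away from the centre.
curl_exp, div_exp = curl_and_divergence(x, y)

# Pure rotation (u = -y, v = x): flow circles the centre.
curl_rot, div_rot = curl_and_divergence(-y, x)

centre = (5, 5)
print("expansion: curl", curl_exp[centre], "divergence", div_exp[centre])
print("rotation:  curl", curl_rot[centre], "divergence", div_rot[centre])
```

      In this toy case the expanding field has zero curl and positive divergence at the centre, while the rotating field has non-zero curl and zero divergence. Whether such quantities actually drive behavior is exactly the empirical question raised above.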

      References:

      Knöll, J., Pillow, J. W., & Huk, A. C. (2018). Lawful tracking of visual motion in humans, macaques, and marmosets in a naturalistic, continuous, and untrained behavioral context. Proceedings of the National Academy of Sciences, 115(44), E10486-E10494. https://doi.org/10.1073/pnas.1807192115

      Niemann, T., Lappe, M., Büscher, A., & Hoffmann, K.-P. (1999). Ocular responses to radial optic flow and single accelerated targets in humans. Vision Research, 39(7), 1359-1371. https://doi.org/10.1016/S0042-6989(98)00236-3

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 4 of the manuscript. Miriam Spering (University of British Columbia) served as the Reviewing Editor for this submission.

      Summary:

      Your work is based on a fascinating and rich dataset with great potential. There was general agreement on the value of these data, and on the thoroughness with which the data were collected and preprocessed. Your approach of exploring how gait-induced instabilities of the head and terrain-dependent eye movements during natural locomotion shape retinal optic flow is important and addresses an obvious gap in the literature. It also has the potential to merge knowledge across subfields (motion processing, eye movements, locomotion). However, there are several theoretical limitations that we believe cannot be fully addressed with the current dataset, even if the manuscript were rewritten entirely, as highlighted in the reviews below.

      1) The suggestion to use curl and divergence of the retinal flow for the control of self-motion is interesting, but it is insufficiently demonstrated as a valid strategy for the visual system (and alternatives are not considered). The reviewers briefly discussed whether conducting a correlational analysis between the sign/magnitude of cues and the participants' movement at a future timepoint based on the existing data might address this issue, but the more general concern here is that any such analysis might perpetuate the (wrong) idea that these cues are used in the visual system. The seminal paper by Warren and Hannon (1990) has taken a good look at this proposal and essentially refuted it on the grounds mentioned by Reviewer 3. Their arguments still stand and not much has been made of the divergence maximum since. A more encompassing view is needed to look in general at cues that predict instantaneous heading in the retinal reference frame. Another solution could be an analysis of the dynamics of retinal heading as produced by the locomotor cycle. Then, it might be possible to provide some constraints on necessary dynamics of any of the possible algorithms for retinal flow analysis.

      2) The case against the use of the head-centric FoE is valid but presented in a confusing (and possibly misleading/exaggerated) fashion. The data presented do not appear to provide sufficient evidence to overturn the idea that the FoE is used to control heading during locomotion.

      3) The role of stabilizing eye movements on retinal flow is insufficiently discussed. Along the same lines, the purpose of the different experimental manipulations that presumably trigger significantly different eye movement patterns is never fully elaborated. It seems that there is a missed opportunity here to take a more hypothesis-guided rather than exploratory approach.

  2. Oct 2020
    1. Reviewer #3:

      The Suv39 class of methyltransferases is responsible for establishment and maintenance of constitutive heterochromatin via the deposition of H3K9me2/me3 marks. Clr4 is the sole H3K9me2/me3 HMTase in the fission yeast S. pombe and is part of the E3 ubiquitin ligase CLRC complex. It has been shown recently that CLRC mediates the ubiquitylation of the H3K14 residue, which in turn boosts the methyltransferase activity of Clr4. A region C-terminal to the chromo domain (aa 63-127) was also shown to be required to bind ubiquitin and provide specificity for ubiquitylated H3K14 relative to unmodified H3 (Oya et al. 2019, EMBO Rep. 20:e48111).

      Here the authors further explore crosstalk between Clr4 activity and H3K14Ub. They do this via a structure-function approach employing a range of structural methods combined with in vivo assays. The primary finding here is that the presence of H3K14ub on histone H3 enhances Clr4 methyltransferase activity and this H3K14ub sensing region resides within the KMT methyltransferase domain itself (aa 192-490) not the aa 63-127 region as previously reported.

      The authors further identify regions within this domain that are responsible for H3K14ub binding and Clr4 mutants which abrogate this interaction. These Clr4 mutants display dramatically reduced activity towards ubiquitylated peptide substrates. In vivo tests show that the same mutants exhibit silencing defects associated with almost a complete loss of H3K9me2/me3 from centromeric heterochromatin. Additionally, the authors show that H3K14ub sensing also appears to operate within the KMT domain of human SUV39H2 but not human G9a or Arabidopsis SUVH4.

      Thus the key differences here from the Oya et al. 2019 study are the structural approaches employed and the finding that ubiquitin is sensed by the KMT methyltransferase domain itself, without the previously identified ubiquitin binding region (aa 63-127). The authors offer a reasonable explanation for this discrepancy.

      Additional analyses would perhaps help to strengthen their conclusions.

      Major Points:

      1) The relevance of the proposed mechanism in a cellular chromatin context is unclear. A significant fraction of H3K9me2/3 nucleosomes isolated from cells should also carry H3K14ub in cis. How frequently do K9me2/3 and K14ub co-occur on nucleosomes in heterochromatin regions? This could be explored by westerns with anti-H3K9me2 and/or anti-H3K9me3: a mobility shift equivalent to monoubiquitylation should be visible.

      2) The authors should consider including mutant peptide controls such as H3K9RK14ub to make sure what is detected here is indeed H3K9 methylation. Additionally, a completely unrelated substrate such as a ubiquitylated H4 N-terminal peptide could be used in the methyltransferase assays to strengthen the authors' claims of specificity.

      3) The IP-western (Fig. 4C) shows association of Clr4 proteins with Rik1, suggesting that they are incorporated into the CLRC complex. However, a more rigorous test would be to analyze these IPs by mass spectrometry to determine if the Clr4 GS253 and F3A mutant proteins are indeed assembled into a CLRC complex containing the other components.

      4) The Clr4-F3A mutant appears to have a differential effect on the level of transcript generation from the dg and dh regions of centromeric repeats. For completeness, ChIP-qPCR data should be included for both the dg and dh regions (currently only dh is assayed, Fig. 4E) to determine if a difference is also detected.

      5) Are similar structural features found in the SUV39H2 KMT domain to those shown for Clr4 (Fig 5C) that would also allow ubiquitin to dock? Does computational comparison between Suv39H2, Clr4, G9a and SUVH4 provide insight into similarities/differences?

    2. Reviewer #2:

      In this manuscript Stirpe and colleagues describe structural insight into a novel regulation mechanism of SUV39 class histone methyltransferases. Clr4 is the sole SUV39-family H3K9me2/3 methyltransferase in fission yeast and recent evidence suggests that ubiquitylation of lysine 14 on histone H3 (H3K14ub) plays a key role in H3K9 methylation. To understand the molecular mechanisms of this regulation, the authors first set up an in vitro assay system and demonstrate that H3K14ub promotes Clr4 methyltransferase activity and that the catalytic domain of Clr4 senses the presence of H3K14-linked ubiquitin. The authors then performed hydrogen/deuterium exchange coupled to mass spectrometry analysis and show that the ubiquitin moiety binds to a region involving residues 243-261 of Clr4. Using this information, they further show that Clr4 mutants containing amino-acid substitutions in the ubiquitin binding region lose affinity for H3K14ub. The authors also demonstrate that fission yeast strains expressing mutant Clr4 display silencing defects and lose heterochromatic H3K9me2/3. Finally, the authors demonstrate that H3K14ub also stimulates the enzymatic activity of mammalian SUV39H2.

      Comments:

      This is an excellent paper that provides structural insights into how H3K14ub stimulates Clr4 methyltransferase activity. The results presented are of high quality and convincingly controlled. The paper is carefully written, and the conclusions presented are fully supported by the data included. The results described are of high interest to the field of heterochromatin and crosstalk of histone marks. However, the following points should be addressed by the authors.

      Major points:

      Is the H3K14ub-mediated stimulation a shared property of SUV39 class methyltransferases? This is an important question considering the mechanisms underlying heterochromatin assembly in eukaryotic cells. While the authors demonstrate that SUV39H2's enzymatic activity is stimulated by H3K14ub (Fig. 5A), it would be interesting to test whether the activity of SUV39H1, the other mammalian Su(var)3-9 homologue, is also stimulated by the presence of H3K14ub.

    3. Reviewer #1:

      H3K14ub is a histone modification that facilitates deposition of H3K9me on heterochromatin in fission yeast, but the mechanism by which this modification stimulates Clr4 was unknown. Using mutants and HDX, the authors identified the interaction surface of Clr4 for H3K14ub, which they used to design mutants that responded poorly to H3K14ub stimulation. In vivo, these mutations resulted in loss of heterochromatin marks and defects in heterochromatin-based silencing, suggesting that H3K14ub stimulation is essential to K9me-mediated silencing. Finally, the authors show that human SUV39H2 but not G9a or Arabidopsis SUVH4 can be stimulated by H3K14ub in a similar manner.

      The authors provided biochemical and structural insights into the mechanism that increases the H3K9-specific methyltransferase activity of Clr4 by H3K14ub. Although H3K14ub-mediated promotion of H3K9 methylation is shown in Oya et al. EMBO Rep 2019, this study further characterizes the potential mechanism. However, there are some issues with the results that need to be resolved.

      1) Similarity and difference with the previous study. As the authors acknowledge, this manuscript builds on a previous study by Oya et al. 2019, however I think the similarities and the differences need to be made even more explicit and better addressed.

      a) The authors should clearly state that Figure 1B and 1C are basically a confirmation of Oya et al. 2019.

      b) I am more puzzled by the difference in the mapping of the region required for H3K14ub stimulation. The authors suggest that a difference in the preparation of the recombinant proteins might be responsible. This can and should be tested as it would seemingly be a simple experiment (compare with and without GST tag).

      c) Possibly to reconcile their findings with the previous report, the authors state in the description of Fig. 1 that "the N-terminus plays a regulatory role in the sensing of H3K14ub by the catalytic domain", but I don't see this reflected in the data shown in Fig. 1C, given that the degree of stimulation is very similar for KMT and FL.

      2) Stimulation-defective mutants. The authors should carefully discuss the stimulation-defective mutants, which should be premised on the retention of their methyltransferase activity on unmodified H3. The authors claim that 30% loss of activity of the Clr4 KMT mutants on unmodified H3 is observed in Figure S3C (Pg 11 line 15), but this cannot be determined from the graph provided, which is normalized to unmodified H3. The authors should (1) make another graph to show the 30% loss and (2) compare Clr4 KMT mutants with catalytic-dead Clr4 KMT or dissolution buffer (no protein). It is still possible that GS253 and F3A mutations simply reduce MTase activity, thus displaying lower activity than WT in the presence of H3K14ub, which would also suggest a different interpretation for the results in vivo.
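
      The normalization concern in this point can be made concrete with a short worked example. This is only a sketch: every activity value below is hypothetical and chosen for illustration, not taken from the manuscript, and the helper names are assumptions. It shows that a plot normalized to each enzyme's own activity on unmodified H3 reports only fold stimulation, so a 30% loss of basal activity in a mutant cannot be read from it.

```python
# Hypothetical methyltransferase activities in arbitrary units.
# None of these numbers come from the manuscript.
activity = {
    "WT":     {"H3": 100.0, "H3K14ub": 1000.0},
    "mutant": {"H3": 70.0,  "H3K14ub": 140.0},
}

def fold_stimulation(values):
    """Activity on H3K14ub relative to the same enzyme on unmodified H3."""
    return values["H3K14ub"] / values["H3"]

def relative_to_wt(name, substrate):
    """Absolute activity of an enzyme relative to WT on the same substrate."""
    return activity[name][substrate] / activity["WT"][substrate]

for name, values in activity.items():
    print(
        f"{name}: fold stimulation = {fold_stimulation(values):.1f}, "
        f"basal activity vs WT = {relative_to_wt(name, 'H3'):.2f}"
    )
```

      In a graph normalized per enzyme, both unmodified-H3 bars sit at 1.0, and only the second quantity (basal activity relative to WT) reveals the loss of activity on unmodified H3. This is why an absolute-activity plot and a catalytic-dead or no-protein control are requested.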

      3) Heterochromatin localization of Clr4 mutants. The FLAG ChIP results in Fig. 4E are not very informative, as a loss of Clr4 is predicted to accompany the loss of heterochromatin. If the authors want to test whether the localization activity of Clr4 mutants is intact, (1) FLAG ChIP in the clr4+, Flag-Clr4GS253/F3A background (i.e., two clr4 alleles exist) or (2) an in vitro H3K9me2/3 binding assay should be performed. Since the Clr4 N-terminus might regulate MTase activity as discussed on Pg 18 line 19, it is also possible that amino acid substitutions in the KMT region affect the function of the N-terminus, including the CD. The co-IP in Fig. 4C is not sufficient to clarify this point, as Clr4 directly binds heterochromatin via its CD in addition to the CLRC-mediated mechanism, and it is unclear if this is affected in the mutants.

      4) Allosteric vs. binding regulation. On Pg. 11, the authors suggest that an allosteric mechanism is at play, but this is not supported by the data. In fact the observation that providing ubiquitin in trans does not stimulate and rather inhibits the activity on H3K14ub would suggest that the ubiquitin just increases binding affinity. To clarify this the authors should measure binding affinity of WT and mutants to the H3 peptide with and without ubiquitin.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      Based on the reviews and following discussion, the editors have judged your manuscript of interest but think that additional experiments are required. We also think that several of the other points made by the reviewers might help you strengthen this manuscript and encourage you to consider addressing them if possible.

      Essential Points:

      1) Additional support for the claim that the mutants are only (or mostly) impaired in the ubiquitin binding activity. This is key for the proper interpretation of the in vivo data. As suggested by the reviewers, this could entail (but is not limited to) a better quantification or presentation of enzymatic activity (absolute instead of fold-change in stimulation), additional characterization of interacting proteins by mass spec, localization of the mutants to chromatin in a wild-type context.

      2) Clarification of allostery vs. changes in binding affinities (Rev 1, point 4) ideally including measurements for the binding affinity of WT and mutants to the H3 peptide with and without ubiquitin.

      3) Better characterization of silencing defects: ChIP-qPCR data should be included for both the dg and dh regions across mutants (Rev 3, point 4).

      4) Analysis of the conservation of structural features in SUV39H2 (Rev 3, point 5).

    1. Reviewer #3:

      Non-alcoholic fatty liver disease is a growing health issue worldwide. The pathogenesis and mechanism causing the disease are poorly understood. As the authors state correctly, unravelling mechanistic details of liver lipid metabolism is extremely important yet also technically very challenging. This report aimed at defining the role and mechanism of action of HILPDA in liver cells. The presented paper shows very interesting aspects on the role of HILPDA and brings novel concepts into the field and, as such, has extremely high potential. An overwhelming amount of data is shown that leads to development of the story. However, in the current form, the novel mechanism as outlined from the title has not been worked out with sufficient detail.

      1) de la Rosa Rodriguez et al. claim that "The increase of DGAT1 activity via HILPDA is a novel mechanism that links elevated fatty acid levels to stimulation of triglyceride synthesis and storage in hepatocytes." Experiments correlate HILPDA with DGATs, e.g. upregulation of HILPDA in NASH, overexpression of HILPDA correlating with increase of DGAT1 levels, localization studies demonstrating colocalization of HILPDA with DGAT1 and DGAT2. As experienced in previous HILPDA studies, many effects are modest (e.g. decrease of TG in mouse liver with NASH upon deletion of HILPDA, changes in plasma ALT levels).

      2) As the authors correctly state in their results section, the presented data suggest that HILPDA promotes lipid storage at least partly via an ATGL-independent mechanism. Fig 3 also indicates different sized individual lipid droplets comparing Atglistatin treatment, even though the total LD area might differ significantly.

      3) HILPDA is associated with increased DGAT activity, but the suggested mechanism behind it (transcriptional activation?) is not described sufficiently. DGAT1 activity decreases FA levels and as such would feed back to down-regulate HILPDA expression. To support the very interesting and very strong claim that DGAT1 is increased by direct interaction with HILPDA, this should be shown in vitro.

    2. Reviewer #2:

      This manuscript further characterizes the role of HILPDA/HIG2 in TAG/LD biology. The major finding is that HILPDA interacts with and promotes DGAT activity and TAG synthesis, which is novel given that HILPDA has largely been thought to regulate TAG turnover as a lipolytic inhibitor.

      Characterization of the interaction between HILPDA and DGAT1 (and to a lesser extent DGAT2) is the major strength of this paper and an important advancement in the field. The early parts of the paper are not particularly novel (Fig. 1) or well-designed (Fig 2. - poor NAFLD/NASH model showing almost no effects) and the study is a bit on the thin side for data.

      1) The data shown in Figure 1 is not particularly striking given that HILPDA is a known target gene of PPAR-alpha, which is activated by FAs. Showing that HILPDA expression tracks with PLIN2 is also pretty obvious as PLIN2 tracks with LD accumulation. I really don't see the need/relevance of this figure.

      2) The MCD diet is widely regarded as a poor model for NAFLD/NASH since it doesn't replicate human NASH in so many regards. As a result, the use of this model makes these studies less relevant. Also, it is referenced that HILPDA was found to be up in a MCD study, but why not look at the plethora of human and mouse studies of NAFLD that have done RNAseq or arrays to provide a more physiological assessment of its expression in NAFLD/NASH?

      3) The conclusion that the effects are independent of ATGL is not overly convincing. Since ATGListatin is not specific for ATGL (Quiroga et al. 2018), a more thorough and quantitative analysis of TAG turnover with ATGL knockdown/out is warranted if these claims are to be made.

      4) Since DGAT1 mRNA is unchanged but protein goes up, it would be assumed that HILPDA is affecting DGAT1 stability/turnover. This should be considered.

    3. Reviewer #1:

      This study dissects the role of LD associated protein HILPDA in triglyceride and LD homeostasis in hepatic tissue. Using a mouse tissue-specific HILPDA KO, live cell imaging, and lipid analysis, it proposes that HILPDA promotes TAG storage in LDs independently of ATGL regulation. Instead, HILPDA is proposed to interact with DGAT1 and promote TAG synthesis/storage.

      This is an interesting and potentially exciting study that provides a new insight for HILPDA in liver fat storage. The proposed model differs from previous literature that proposes HILPDA regulates lipolysis via ATGL. Unfortunately, while the data presented support a potential role for HILPDA in DGAT regulation, a clear mechanism is not identified. The first half of the paper that phenotypes loss and over-expression of HILPDA is thorough and conclusive. The latter half of the paper, investigating the interplay between HILPDA and DGAT1, appears more preliminary.

      The critical issue in this study is that the nature of the HILPDA-DGAT1 interaction is not well defined. HILPDA over-expression is shown to increase DGAT1 protein levels, but the specific mechanism underlying this is not further dissected. Furthermore, it is still unclear whether this interaction is direct, or merely stochastic due to the fact that both DGAT1 and HILPDA reside on the same LDs in the experiments presented. More biochemical investigation as to whether these proteins physically interact in their native states, and if so whether that interaction affects DGAT1 enzymatic activity directly or allosterically, is required. Without this the study is mainly descriptive.

      Major concerns:

      1) Fig 4: overnight and acute fatty acid addition experiment: The authors propose that HILPDA enriches at sites where new fatty acids are being processed. Can you demonstrate that both these fluorescent FA species are even being incorporated into TAG during the time periods associated with the microscopy? An alternative explanation is simply that HILPDA localizes to regions of the cell where FA esterification or incorporation into other lipid species is occurring. TAG is potentially only one of many fates for these FAs. Can DGAT1/2 be colocalized with HILPDA in these experiments? Alternatively, what happens in these experiments if DGAT inhibitors are co-added with the FAs?

      2) Fig 5H: The DGAT activity assays indicate that HILPDA over-expression increases the incorporation of fluorescent FA and DAG into TAG, but it is unclear as written whether these assays are normalizing for DGAT1 protein amount. Does HILPDA over-expression enhance DGAT enzymatic activity in this panel, or merely promote TAG synthesis here by the increased total DGAT protein level noted later in the study? This is a clear distinction in mechanism, and needs to be dissected further.

      3) Fig 6/7: DGAT1-HILPDA interaction. The data presented in Fig 7 indicate that DGAT1 and HILPDA co-localize in cells and potentially are in very close proximity with one another. However, the data as presented are not enough to indicate whether these proteins directly interact. Do these proteins immunoprecipitate with one another? Some biochemical evidence for their interaction is necessary.

      4) Fig 7: relatedly, the mechanism by which DGAT1 is increased in protein level from HILPDA is also unclear. Is the protein more long-lived, or stabilized in the ER when HILPDA is over-expressed? Again, protein biochemical analysis would be helpful.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      This study further characterizes the role of lipid droplet (LD) associated protein HILPDA in LD biology. The authors propose that HILPDA promotes triglyceride (TAG) storage in LDs by a mechanism independent of ATGL, through activation of DGAT. This is a potentially interesting finding, however, as detailed by the reviewers below, the data presented do not identify a mechanism for how HILPDA affects DGAT.

    1. Reviewer #3:

      This study examines the role of iron-sulfur clusters in M. tuberculosis adaptation to nitric oxide (NO) and pathogenesis. The study uses transcriptomics to identify genes regulated by NO in vitro and then genetically and biochemically characterizes the role of SufR in responding to NO, modulating metabolic adaptations and promoting pathogenesis in macrophages and infected mice. The topic of this study is highly significant as it defines new mechanisms by which M. tuberculosis adapts to host NO. The manuscript has numerous strengths, including rigorous transcriptomic studies, well-defined physiological studies of wild type M. tuberculosis and thorough biochemical characterizations of SufR protein by spectrometry and DNA binding studies. However, the study suffers from a major experimental flaw that makes interpreting the conclusions from the genetic studies very difficult. The knockout of the sufR gene (which is a proposed repressor) also disrupts the NO inducibility of the downstream suf genes. Due to this polar effect, most of the experiments show partial or poor complementation. This complexity in the genetics raises questions about which aspects of the phenotype are directly controlled by SufR and which are controlled by the dysregulated suf genes or possibly unlinked mutations. This major issue impacts a significant portion of the data and needs to be experimentally addressed to ensure that the specific function of SufR is defined by the studies. Overall, this is an ambitious, potentially exciting study, but it suffers from a major flaw in the genetics that renders the major conclusions uncertain.

    2. Reviewer #2:

      The manuscript by Anand et al. describes very interesting work into the characterisation of M. tuberculosis response to NO stress. The authors identify the SufR transcriptional repressor as a sensor of NO and further show that the 4Fe-4S cluster bound to the holo-protein plays a central role in this response. Interestingly, their results indicate that SufR regulates both the suf operon and the DosR regulon in response to NO. In addition, they identified a palindromic sequence upstream of the suf operon (and some nine other genes) that holo-SufR could bind to. These results collectively indicate that SufR integrates host response to Fe-S cluster homeostasis in Mtb, providing many important contributions to the field. There are, however, several concerns and areas that need improvement and better explanations.

      Major comments:

      1) The most puzzling finding in this manuscript is the inability of sufR-Comp to complement ΔsufR, with the sufR-Comp strains showing an intermediate phenotype (e.g. Figure 5, panels D and E). The authors mention that the partial complementation is likely due to the restored expression of other sufR-specific genes (like DosR regulon). Even more surprising is the result presented in Figure 5B, in which sufR-Comp shows much slower recovery than ΔsufR. In this case, the authors argue that the induction of the entire suf operon is necessary for the growth resumption. But this doesn't explain why the sufR-Comp shows a slower phenotype compared to ΔsufR. I believe that the authors should provide a more plausible explanation for these observations.

      2) Figure 3 shows that the suf operon is not induced upon NO treatment in ΔsufR and the authors stated that removing 345 bp of sufR for constructing ΔsufR might explain this observation. Since the primary and alternative TSS (and I'd assume the promoter region) remain intact in ΔsufR, the authors are urged to come up with a better explanation for this result.

      3) As part of their argument, the authors mentioned that Mtb prefers IscS for housekeeping functions and the Suf system for managing stress, and made comparisons with the well-studied Isc and Suf systems of E. coli. This is against the current knowledge in the literature, and contrary to E. coli, the Isc system in Mtb has reduced to only IscS and the Suf system acts as the major player in the assembly of Fe-S clusters (see point #4 below).

      4) I do realise that the authors have used Acn in their experiments to indicate the effects of NO treatment on Fe-S clusters. However, it is known that Acn of Mtb is a target for Mtb-IscS and therefore the results presented in Figure 4A don't necessarily mean that the observed phenotype is due to a direct consequence of defects in the suf system upon NO treatment. The paper by Rybniker et al. (reference #65 in the current manuscript) has shown, using Y2H, activity assays and pull-down experiments, that Acn could make direct interactions with IscS in Mtb. Consistent with this, sufR-Comp didn't reinstate Acn activity. Therefore I am doubtful whether Acn is the correct enzyme to use as an indicator to look into the function of the suf operon, where its Fe-S formation depends on IscS.

      5) It is a common practice in the field that not only lung burden but also burden in at least one other organ are shown (usually spleen).

    3. Reviewer #1:

      The manuscript of Amit Singh et al. describes a set of experiments that starts with looking at the transcriptomic response towards NO stress. A large number of genes show altered expression, including the Suf operon. They decide to study in more detail the Suf operon, whose encoded proteins are involved in [Fe-S] cluster assembly.

      Some of their findings include: that Mtb SufR is a major regulator of Fe-S cluster biogenesis in Mtb under NO stress, that SufR contains a redox-responsive 4Fe-4S cluster, that functions as a repressor and that a sufR mutant is slightly attenuated in mouse infection experiments. Although the results are convincing and important, my major problem is that in fact all of these findings have been described previously, mainly by M. Pandey (Scientific Reports 8:17359 - 2018) and D. Willemse (Plos One 0200145 - 2018). The current manuscript more specifically focuses on the role of NO in this process, but this is, in my opinion, a minor advance, as the effect of NO (and H2O2) was also reported previously.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary:

      Reviewers acknowledge that your submission reports some interesting results on the relationship between Fe-S and the response to NO in Mycobacterium tuberculosis. That said, several concerns were raised regarding genetic complementation and novelty.

    1. Reviewer #3:

      In this article, the Authors study the link between alpha-synuclein (α-syn) inclusions, neuroinflammation and neurodegeneration in mice injected with α-syn pre-formed fibrils (PFF) into the striatum. While this is an important question in the context of Parkinson's disease (PD), both from a pathophysiological and a therapeutic point of view, the present work seems too preliminary at this stage.

      1) The Authors conclude that microglial activation in PFF-injected mice underlies neurodegeneration in this animal model. However, this is a correlative observation and no mechanistic experiments are included to confirm a causal relationship between the inflammatory response and cell death in these animals.

      2) Another major conclusion of this study is that diffusible oligomeric α-syn species, in contrast to fully-formed α-syn inclusions, are the major drivers of microglial activation in these animals. However, the distinction between α-syn oligomers and inclusions/aggregates is not well characterized in the present work. While the Authors performed some PK digestion experiments (i.e. indicating a pathological insoluble/aggregated beta-sheet conformation) and proximity ligation assay (PLA) experiments (i.e. to detect α-syn oligomers), these assessments have not been systematically performed and quantified throughout the different brain regions of PFF-injected mice, with only a couple of qualitative images shown in Fig 1B&C (in which α-syn oligomers are also apparently seen in PBS-injected animals).

      3) As an index of α-syn "inclusions", the Authors mainly used immunohistochemistry for phosphorylated α-syn (pSyn). While pSyn has been extensively used as an index of PD pathology, it can also be seen in tissue from control subjects (e.g. Antunes et al. 2016) and may also result from a non-specific cross-reaction with other phospho-proteins, such as phosphorylated neurofilaments (e.g. Sacino et al. 2014). In addition, the Authors did not include the full quantification and statistical analyses of pSyn signal in the different regions of the different experimental groups (they only mention in the main text some percentages of signal coverage in different brain regions of these animals without any statistical quantifications).

      4) To distinguish between the effects of PFFs versus oligomers, the Authors also injected some additional mice with α-syn oligomers. However, the experiments with α-syn oligomers are only qualitative and were performed in a very limited number of animals (n=3) in a single time-point (i.e. 13 dpi), thus precluding a conclusive comparison with the experiments in PFF-injected animals. In addition, the characterization of α-syn PFFs vs α-syn oligomers is limited to a non-denaturing Western blot (Supplementary Fig. 1) and it is not clear why for intrastriatal injections, α-syn oligomers were used non-sonicated whereas α-syn PFFs were sonicated.

      5) The level of PFF-induced dopaminergic nigral degeneration that the Authors observe at 90 dpi, although statistically significant, is quite weak (16% cell loss). In the original description of this model by Luk et al (2012), dopaminergic nigral degeneration was not statistically significant until 180 dpi. Therefore, later time-points would be needed to clearly assess the link between α-syn inclusions, inflammation and neurodegeneration. Also, while neurodegeneration in the substantia nigra was assessed by stereological cell counts of intrinsic dopaminergic nigral neurons, it is not clear why in other pSyn-containing and non-containing areas (such as the frontal cortex or hippocampus) neurodegeneration was assessed instead at synaptic level, which may reflect impairment of cell bodies projecting to these areas instead of degeneration of intrinsic neurons within these brain regions.

      6) The Authors indicate that they used both male and female animals throughout the article. However, it is not indicated how many animals of each sex have been used and if there is a potential effect of sex in their results, which could be interesting to determine.

      7) From an experimental design point of view, it seems quite odd to inject animals at different ages if the aim is to assess the temporal dynamics of PFF injections at two different time-points. Because mice of different ages might be differentially susceptible to α-syn PFFs, it would seem more important to ensure that the animals have the same age at the time of the injection rather than have the same age at the end of the two different end-points. It is also not clear why the animals were obtained from two different vendors (i.e. Charles River or Janvier Labs).

      8) For statistical analyses the Authors indicate that the values of the different parameters analyzed in ipsilateral and contralateral hemispheres from control (PBS-injected) animals were grouped, in contrast to PFF-injected animals in which ipsi and contralateral hemispheres were analyzed separately. This is justified by an apparent lack of statistical differences between ipsi and contralateral hemispheres from control animals for the different parameters analyzed. However, this is actually not shown. In absence of this information, it is not possible, for instance, to determine the level of Iba1-positive microgliosis induced by PBS injection itself within the ipsilateral hemisphere.

      9) Microgliosis (i.e. Iba1 and/or CD68 immunohistochemistry) has not been systematically performed and quantified in all different brain regions, experimental groups and time-points.

      10) The transcriptomic analysis is interesting but the Authors did not validate any of the differentially-expressed genes (DEGs) detected. Also, how are "most highly changed DEGs" defined? Does it depend on the p-value or on the fold change?

      11) A full list of DEGs and all results from the enrichment analysis for GO terms should be provided as supplementary data.
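
      Comment 10 asks whether the "most highly changed" DEGs are ranked by p-value or by fold change. As a small illustration of why the distinction matters, the sketch below ranks the same genes both ways. The gene names and numbers are made-up toy values, not results from the manuscript, and the field names `log2fc` and `padj` are assumptions.

```python
# Hypothetical toy values for illustration only; not data from the manuscript.
genes = [
    {"gene": "geneA", "log2fc": 3.0, "padj": 0.04},
    {"gene": "geneB", "log2fc": -0.6, "padj": 1e-8},
    {"gene": "geneC", "log2fc": 1.5, "padj": 1e-4},
]

def rank_by_fold_change(rows):
    """Order genes by absolute log2 fold change, largest first."""
    ordered = sorted(rows, key=lambda r: abs(r["log2fc"]), reverse=True)
    return [r["gene"] for r in ordered]

def rank_by_p_value(rows):
    """Order genes by adjusted p-value, smallest first."""
    ordered = sorted(rows, key=lambda r: r["padj"])
    return [r["gene"] for r in ordered]

print(rank_by_fold_change(genes))
print(rank_by_p_value(genes))
```

      With these toy values the two orderings are exact reverses of each other, so a "top genes" list is only interpretable once the ranking criterion is stated.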

    2. Reviewer #2:

      Garcia et al. aim to investigate the relationship between α-syn, neuroinflammation, and neurodegeneration with a model of α-syn seeding in wild-type mice. The authors use transcriptional profiling to assess modest yet detectable responses to the induction of different forms of α-syn species, the characterization of which is primarily based on immunolabeling which has inherent limitations. Moreover, the discussion regarding the pathogenicity of oligomers versus fibrils is important, yet largely unsupported by rigorous characterization of the injected oligomeric species, spread of oligomers in the PFF-injected model, and better experimental controls, thereby limiting the impact of this study. Yet, the observations should be of interest to the field.

      Substantive Concerns:

      1) The authors purport that α-syn oligomers, rather than inclusions, are stronger drivers of neurodegeneration and neuroinflammation. Their primary evidence is that inclusion pathology shows no correlation with either, while oligomers and gliosis but not inclusions are found in the hippocampus of PFF-injected animals. However, no attempt was made to investigate the actual correlation of oligomeric α-syn with gliosis or synaptic integrity, as was done with inclusion load in Fig. 4. PLA was only performed in the hippocampus, while it would be expected that oligomers form elsewhere, especially in regions with inclusions. Similarly, oligomer injections were not employed extensively enough to support the arguments about the pathogenic potential of oligomeric α-syn. The only data shown from this model were of Iba-1 immunofluorescence labeling at 13dpi. While it is remarkable that Iba-1 immunoreactivity is qualitatively very strong at this early time point, it is disputable at best that "the reaction was even stronger than 90dpi after PFF injection" (line 567-568). In addition, why was only the 13dpi time point shown? It is of considerable interest if the microglial response persists with oligomeric injection as it does with PFF injection, or if microglia are able to clear injected oligomers and better prevent pathology. Finally, it is surprising that oligomer injected animals were not included in the transcriptional profiling, which could greatly strengthen the purported link between oligomeric α-syn and microglial reactivity. It may be true that oligomers are the primary driver of neurodegeneration via interactions with microglia, but this was not proven.

      2) What sort of quality control was done on the α-syn preparations? Of important concern is endotoxin contamination, especially since oligomers and PFFs were generated with very distinct procedures. This may be confounding reported measures, especially microgliosis, if endotoxic presence is significant. Additionally, the use of two distinct sonicators may be generating fibrils with different kinetics, which can be detected with Thioflavin T binding assay amongst other methods.

      3) In Supplementary Fig. 1, the authors emphasize monomeric species in their oligomers and PFFs, yet no α-syn monomer-injected controls were employed in this study. Especially since different amounts of PFFs and oligomers were injected, it would be important to account for any noise generated by introducing various amounts of monomeric species.

      4) More extensive investigation about the disagreement between histological and transcriptional data is needed. It may not be accurate that at 90dpi, "major pathological events now appear to take place at the protein level, and are measurable with quantitative histology" (line 607-608) since these protein products were not explored via histology. For example, no biochemical or immunohistochemical assays were performed to investigate the autophagic or mitochondrial changes in this model, and Iba-1 immunolabeling was the only measure taken in pursuit of probing into the immune system. The link between apparent gliosis compared with an alleged downregulation in transcription related to immunity needs to be more thoroughly investigated.

    3. Reviewer #1:

      In this manuscript, the authors seek to assess the pathogenic role of alpha-synuclein (a-syn) inclusions in the neurodegenerative process of PD. To study this important question, the authors administered intrastriatal recombinant murine a-syn PFFs in the brain of wild-type mice (to induce inclusions) and compare the extent of neurodegeneration and microgliosis in brain regions with and without a-syn inclusions. First, the authors demonstrate that neurodegeneration occurs in brain regions with and without a-syn inclusions, a finding that led them to conclude that neuronal injury does not rely on the presence of a-syn inclusions. Second, the authors found a robust immunopositivity for microglial cells in regions with or without inclusions, which was greater than that observed after the intrastriatal administration of 6-OHDA. To note, the authors demonstrate that microgliosis did not correlate with neurodegeneration in the brains of injected mice. To gain insights into the molecular response to the intrastriatal injection of a-syn PFFs, the authors performed a bulk gene expression profile analysis and found a host of significant changes in inflammation-related genes and pathways. Because these changes did precede neuron loss, the authors surmise that the microglia contribute to the actual neurodegenerative process and that the microglial response is not merely the reflection of neurons dying.

      This is a mostly well executed study that intends to address an important question. The methods are for the most part appropriate and the results for the most part well presented. However, the enthusiasm of this reviewer for this work is significantly reduced due to the fact that this work is essentially correlative, over-interpretative, and rather incremental. Indeed, this work lacks the level of molecular dissection that is required to reach the strong conclusion the authors put forward. Moreover, this reviewer does not believe that the present data allow any compelling conclusion about the role of microglia in this model to be made and does not understand why and how this work contributes to our understanding of "...how the pathogenic properties of "prion-like" a-syn should be viewed." Aside from these general comments, some specific points can also be raised:

      1) A major emphasis is placed on "inclusions" but yet, unless overlooked, it is not clear what exactly the authors refer to. It is impossible to be certain what exactly the immunopositive structures called by the authors as inclusions are. Perhaps it would be helpful to include some EM characterizations. See Fig. 1.

      2) Using TH as a surrogate of neurodegeneration is often misleading as phenotypic markers can be readily downregulated in stressed cells. Thus, whether the reduced signal for TH indicates loss of TH expression vs living neurons is uncertain.

      3) Using IBA1 to label microglia (and macrophages) does not tell anything in terms of activation state. Moreover, it is not clear whether the quantification of the signal is the average of the whole structure of interest (likely) and if it is, from where the illustration from the striatum is derived. Indeed, one challenge in using intrastriatal injection is that it causes radial damage (center of the injection site) and depending on where one looks, the magnitude and type of changes may be very different. It is also unclear why a unilateral injection of PFF should induce changes in the SN on both sides.

      4) While the quantification morphological methods are not optimal, the authors provide enough detail to appreciate how the work was done, and given the data generated, the methods used should be acceptable.

      5) Unless one characterizes the phenotype of microglia at a single cell level, it is no longer acceptable to formulate sound conclusions about the role (or the lack thereof) of microglia in neurodegeneration. Indeed, bulk analysis is notoriously biased toward abundant genes which is not necessarily the most meaningful and fails to take into account the heterogeneity of the neuroinflammatory response. Thus, the genomic analysis provided here is of minimal value.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      While all three reviewers agreed that the question under investigation is of interest, they also raised a number of issues that decreased the overall enthusiasm for the work in its present form. Indeed, as you can see from the appended reviews, all three reviewers thought that more extensive work is needed to support your conclusion. In fact, new studies were recommended for every major aspect of the study, including greater validation of the injected material, of the neuropathology including the quantitative morphology (of note, while Rev 1 thinks that the lack of stereology is acceptable, Rev 3 does not, which suggests that more technical details and stronger justification of the method you used are required), and genomic analysis, such as using more up-to-date methodology to capture heterogeneity of the response as well as more extensive validation of the reported changes.

    1. Reviewer #3:

      The authors ask whether and how information about an upcoming choice is encoded by neuronal activities in V1. To address this question, they recorded from multiple neurons in V1 simultaneously, while monkeys performed a delayed orientation-match-to-sample task. They then asked whether and how they could decode the stimulus presented to the animal, and/or the upcoming behavioral report of their decision (choice), from these V1 recordings. They found that the combination stimulus+choice could be decoded, and that bursty neurons were most likely to affect the decoded choice. Moreover, neurons in the superficial cortical layer also appeared to have a stronger choice signal. This suggests that the choice signal may arise outside of V1, but nevertheless be reflected by spiking activity within V1.

      This study addresses an interesting and potentially important question: where do choice signals arise in the brain, and how do V1 activities relate to those choice signals? At the same time, I was quite confused about a lot of the data presented and overall remain somewhat unconvinced. My specific critiques are as follows:

      1) In Fig. 1BC: what are these population vectors? In the case of "C", I assume these are the SVM weights that are used to discriminate between choices, and the data for each choice are pooled over both stimulus types (match or non-match). But for "S+C", I don't quite follow what is going on. Is it the case that you do the decoding just on the "correct" trials (as suggested in Table 1)? This critique should highlight the fact that I failed to understand your main point, about decoding C vs "S+C". Much more writing clarity throughout the paper would help with this, and make it possible for me to evaluate the paper's main claims.

      2) Fig. 1D is claimed to tell us how neurons respond differently under different conditions, but it does not do that. It tells us how SVM decoders weight those neurons differently under different conditions. Moreover the result seems kind of trivial: it shows that "strong weights change more" between conditions. That's not very surprising: you are subtracting bigger numbers when there are stronger weights, so the differences will be larger. Is there more going on here?

      3) In Fig. 2: what time intervals were the spikes summed for the decoding? There are some values given for different window lengths, but when did those windows start? Was it at the start of the "test" image presentation? Or some other time?

      4) It seems like movement is a confound. The claim is that choice is represented in V1. But we know from recent work by Stringer et al. (Science 2019), that movement profoundly affects V1 spiking. So if any movement signals precede the behavioural report, those will correlate with choice and be reflected by V1 spiking. In that case, is it really fair to say that V1 encodes choice? Or, rather, that the pre-report motion of the animal is encoded in V1?

      5) I couldn't find strong support for the claim that decoding is better when using superficial neurons vs. deeper ones. A panel like Fig. 7E (which does this for bursty vs non-bursty neurons) but comparing the different layers would help with this. I realize this result is somewhat implied by the differences in bursty neuron fraction across layers (which is shown), but this claim is central and so should be explicitly tested.

      6) I have concerns about a lot of the statistical tests used in this paper. For example:

      a) Fig. 2D. Should do a permutation test, to randomly assign neurons to "big" vs "small" weight categories, then redo the analysis. That will give a p-value much more reliably than the t-test, which assumes (incorrectly) that the data are Gaussian. Another big issue is that the selection of small vs big can have some biasing effects, so the t-test between the two groups could greatly overemphasize significance. A permutation test is harder to fool in this way.

      b) Fig 3D statistical test compares the analysis of data with optimized weights to a case of random weights and random permutation. That's not quite fair because you optimize the weights for the real data but not for the null hypothesis you are testing. A better test would be to do random permutations of the data, then train the weights on each random permutation and test on held-out data from that random permutation. It will likely yield similar results to what you've got, but be a more compelling test in my opinion.

      c) Fig. 6B: not sure t-test is right. Are these data Gaussian?

      7) The results in Fig. 9BC seem interesting, but it's hard to parse the network diagrams. Showing 3x3 matrices for the CCM coefficients from neurons each layer to ones in each other layer would help me to evaluate the claim that the superficial layer acts as a hub.
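
      Comments 6a and 6b both recommend permutation-based tests. A minimal sketch of the simplest version, a label-shuffling test for a difference in group means, is given below. The function name `permutation_test`, the choice of statistic, and the number of permutations are illustrative assumptions, not the authors' analysis; the fuller procedure described in 6b would additionally retrain the decoder on each shuffle.

```python
import numpy as np

def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Group labels are shuffled repeatedly, so no Gaussian assumption is made.
    Returns the observed difference and the permutation p-value.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])

    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        diff = shuffled[:a.size].mean() - shuffled[a.size:].mean()
        if abs(diff) >= abs(observed):
            count += 1

    # Add-one correction so the reported p-value is never exactly zero.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value
```

      Here `group_a` and `group_b` stand in for whatever per-neuron quantity is compared between the "big" and "small" weight categories; any other statistic could replace the difference of means inside the loop.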

    2. Reviewer #2:

      Here the authors present results examining the possibility of decoding a choice signal from V1. They show that a transfer learning approach that mixes stimulus and choice during training provides information about choice that is slightly better than chance. In contrast, decoding choice directly using a linear SVM results in chance decoding. They then examine potential time-varying structure in the "choice signal" and nicely show that the strongest contributions are from bursting neurons in the superficial layers of V1.

      This is a novel approach to an interesting open problem in systems neuroscience. However, based on my understanding, there are several core issues that need to be addressed.

      Major Issues:

      1) I may have misunderstood, but it is not obvious to me that the "choice signal" that the authors report is a signature of choice and not just a stimulus-driven effect. From what I understand the same image was used during an entire recording session, and the difference between target and test is either 0deg (match) or 3-10deg (nonmatch). A decoder is trained to classify the test orientation (using the correct trials only). Then choice prediction accuracy and "choice signals" are assessed using the nonmatch trials. In this setting, it seems that if there is some tuning to the stimulus orientation and some variability in the responses that eventually influences the choice then you would see a difference in the choice signal as calculated here.

      If the "choice signal" calculated here is present for the same/different responses under the match condition I would be more convinced that this is, in some sense, a representation of choice. The authors mention there were few trials in the IM condition, but it seems valuable to show. Alternatively, and I understand it may not be feasible at this stage, I would also be more convinced if the authors got similar results when the stimulus image varied from trial to trial within a recording session. Barring that, I have trouble seeing how this is a "representation" of choice, except under an extremely loose definition of "representation".

      Unless I've misunderstood something fundamental (which is possible), it seems better to frame these results as "evidence that choice can be decoded from V1 activity at slightly better than chance in this particular task" rather than "a time-resolved code that reflects the instantaneous computation of the low-dimensional choice variable in animal's brain...[that] contributes to animal's behavior as it unfolds" (as stated in the introduction).

      If I have misunderstood maybe the authors can clarify where I went wrong and/or show results from simulations to help me understand why the "choice signal" here is distinct from a situation where you just have purely feedforward effects with noisy sensory encoding in V1 and downstream decision making in a different brain area.

      2) It is also not clear to me why the "zero crossing" is the relevant time point to consider when looking at the timing of the choice signal. The point where the choice signal is farthest from zero seems much more relevant and seems to occur very close to the point where firing rates are the highest. Some clarification on this issue would be helpful. Additionally, it could be worthwhile to test what happens when the data are not z-scored. This seems like it may get rid of the zero crossing altogether. I'm somewhat surprised that there is a difference in the same/different responses after 200ms, but the fact that similar differences appear at <50ms might point to a normalization issue.

      3) I'm also concerned about the interpretation of the "plus" and "minus" and "strong" and "weak" subnetworks. It is not obvious to me whether the decoding weights will be stable. Particularly when decoding from small populations, the weights could be influenced by overfitting and omitted variables. This is a relatively minor concern compared to the above issues, but it could be helpful to explicitly measure how stable the weights are. The authors could show weights from the 1st half and 2nd half of the data or see if the weights change when decoding based on subsets of the observed neurons.
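
      The split-half check suggested in comment 3 could be sketched as follows. This is a hedged illustration only: a ridge-regularised least-squares decoder is swapped in for the linear SVM used in the manuscript so that the example needs nothing beyond NumPy, and the function names and the `ridge` parameter are assumptions.

```python
import numpy as np

def fit_linear_decoder(X, y, ridge=1.0):
    """Ridge-regularised least-squares decoder (a stand-in for a linear SVM).

    X: (n_trials, n_neurons) array of spike counts.
    y: labels coded as -1 and +1.
    Returns one decoding weight per neuron.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each neuron's counts
    gram = Xc.T @ Xc + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, Xc.T @ (y - y.mean()))

def cosine_similarity(u, v):
    """Cosine of the angle between two weight vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def split_half_stability(X, y, ridge=1.0):
    """Fit the decoder separately on the first and second half of the
    trials and return the cosine similarity of the two weight vectors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    half = len(y) // 2
    w_first = fit_linear_decoder(X[:half], y[:half], ridge)
    w_second = fit_linear_decoder(X[half:], y[half:], ridge)
    return cosine_similarity(w_first, w_second)
```

      A value near 1 would indicate that the sign and relative size of the weights, and hence the "plus"/"minus" and "strong"/"weak" groupings, are reproducible across halves; a value near 0 would suggest the groupings are dominated by fitting noise.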

    3. Reviewer #1:

      This article asks whether V1 encodes a behavioral choice variable using visual information. The authors propose an approach, termed generalized learning, to predict the choice variable using a time-resolved code computed from V1 population spiking, in an experiment that utilizes naturalistic stimuli.

      More specifically, the authors build a decoder to predict the stimulus + choice (S+C) variable, and then utilize it to predict the choice variable. Using this approach, the authors report that population activity can predict the choice variable, relying on the overlap b/w the representation of the stimulus and the choice.

      In addition, the authors identify/study the role of different sub-populations of neurons in enabling the prediction of the choice variable. The authors report that the accumulation of a choice signal at the input of a hypothetical read-out neuron facilitates the prediction of choice from V1 population activity. The authors also report that burstiness represents a useful feature of neurons, which facilitates the accumulation of the choice signal.

      Finally, using an analysis of the intrinsic flow of V1 information with three sub-populations of neurons, the authors report that information about the choice in V1 likely comes from top-down processing.

      Major comments:

      1) In Fig. 2b, I find it difficult to assess how significantly different from chance the S+C decoder performs, compared to the choice only decoder. The authors report data from 20 sessions in Fig. 2 a. It seems to me that if the authors were to use the balanced accuracy (BAC) from these 20 sessions to build an empirical distribution of BAC across the sessions, the 95% confidence region would overlap with 0.5 (chance). Does that sound accurate to the authors?

      The authors do report that they've tested for the significance of the difference in the similarity vectors, and call them "weakly" similar.

      Put more simply, my comment relates to the following, more basic, question: how does one interpret a BAC of 0.55 vs 0.5, in terms of how much overlap this means in the shared representation between stimulus and choice? What if the BAC had been 0.7 for S+C vs 0.5 for C? Do the authors think it possible to make more precise statements about the shared representation?

      Similarly, how does one interpret different degrees of similarity? I understand the interpretation of the angle b/w the two vectors, and that at one extreme lies orthogonality and at the other co-linearity. Can one interpret the cosine of the difference in the angles as an amount of shared representation?

      I think that this represents a point that the authors should expand upon, discuss more thoroughly in the manuscript, namely can we really make a statement about how much the representations of stimulus and choice overlap?

      2) The authors' S+C analysis relies heavily on the data collected when the animal chooses correctly. As far as I understand, the authors suggest that the incorrect trials add "noise". I find this difficult to understand. Have the authors performed the S+C analysis when the animal chooses incorrectly? I could not understand clearly a) why restricting oneself to correct trials seems crucial, and b) the significance of this from the perspective of the representation of choice in the circuit.

      A true decoder of S+C would have 4 possible outcomes (two that the authors already consider, and two additional ones coming from incorrect trials). The authors focus on two of these. To me, this deserves a detailed discussion.

      I suggest that, very early on in the article, the authors make it clear that the S+C decoder conditions on correct choice, and explain a) why restricting oneself to correct trials seems crucial, and b) the significance of this from the perspective of the representation of choice in the circuit.

      3) Why do random weights (fig 4a, top right) work well? i.e. the figure looks very similar to (fig 3c). As far as I understand, the random weights come from the empirical distribution of the weights (fig 6a). This seems agnostic to the layer to which a cell belongs. How do I reconcile the authors’ statements about the importance of certain groups of cells to predicting the choice variable?

      4) The authors use different feature extraction for training and testing. The authors train on spike counts (features) and test on binary spiking activity smoothed using a first-order filter (exponential impulse response). One reason I think this might be problematic goes as follows: during training, the authors get a prediction from the SVM for a whole time segment. I have no problem with this. For testing, however, the authors get a prediction for every 1 ms bin. How does one translate that into a prediction of choice for the whole window?

      I can understand the argument that testing on a different data set represents a form of transfer learning. My reservation comes from the apparent lack of a prediction on the test set, and accuracies on the test data.

      As they stand, I find the authors’ statement about the differences in the choice signal/zero crossings etc very qualitative. It would be nice to report training and test accuracy, as standard in ML.
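
      To illustrate the mismatch raised in this point, the following is a minimal sketch of one hypothetical way to turn per-bin decoder outputs into a single prediction per window. The filter time constant, the averaging rule, and all names are assumptions made for illustration; the manuscript does not specify such a procedure.

```python
import numpy as np


def exp_smooth(spikes, tau_ms=20.0, dt_ms=1.0):
    """Causal first-order (exponential) filter of a binary spike array.

    spikes has shape (n_bins, n_cells); each spike adds 1 and then decays.
    """
    spikes = np.asarray(spikes, dtype=float)
    decay = np.exp(-dt_ms / tau_ms)
    smoothed = np.zeros_like(spikes)
    state = np.zeros(spikes.shape[1])
    for t in range(spikes.shape[0]):
        state = decay * state + spikes[t]
        smoothed[t] = state
    return smoothed


def window_prediction(smoothed, weights, bias=0.0):
    """Collapse per-bin linear decision values into one label per window.

    Assumed rule: average the decision values over bins, then take the sign.
    """
    decision = smoothed @ np.asarray(weights, dtype=float) + bias
    return 1 if decision.mean() > 0 else -1
```

      With a rule of this kind, each test window yields one predicted label, so a test accuracy can be reported alongside the training accuracy.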

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      Overall, as a group, the reviewers expressed excitement about the topic and questions posed in the paper. At the same time, the reviewers did not think that the data and results of analyses the authors report provide enough evidence here to justify the claim of having found a "representation of choice" in V1. The following represent two critiques that the reviews have in common (please refer to the individual reviews for details):

      1) The fact that the authors restrict themselves to "correct only" trials to claim that V1 encodes choice raised eyebrows.

      2) The manner in which the authors conducted the computational and statistical analysis also raised a number of questions/concerns.

    1. Reviewer #3:

      This paper looks at the effect of metal cofactor binding on the aggregation and toxicity of SOD1, which natively binds a Cu2+ and a Zn2+ ion. The authors investigate the WT SOD1, the apo SOD1 and two mutants which do not bind Cu2+ (H121F) or Zn2+ (H72F) in order to look at the effects of the metal binding on SOD aggregation and toxicity. They find by a number of assays and a computational study that Zn2+ rather than Cu2+ is the dominant factor in determining susceptibility to aggregation, membrane binding, etc. Based on this they propose that deficient Zn2+ uptake by SOD1 is responsible for the pathogenic behaviour of some mutants.

      There is a lot of interesting data in this paper supporting this hypothesis (some more so than others); however, there are some points the authors should consider:

      1) A potential weakness of the computational estimation of membrane binding affinity is that the WT crystal structure was used for WT, while structure predictions from the I-TASSER server were used for apo and Cu/Zn-deficient mutants. Since one might expect the predicted structure to be of lower quality, it might then have an enhanced propensity for membrane binding via exposed hydrophobic groups? What would be obtained if the I-TASSER server was also used to generate the structure used for WT in this calculation? This point also applies to the computational validation where predicted membrane binding free energies are compared with distance to the Zn2+ or Cu2+ site of the mutants. This again involves a 2-stage prediction - firstly of the mutant structure, then of its binding energy. Maybe the authors can give some intuition as to how this can be sufficiently accurate to be useful?

      2) Correlation functions for A488-SOD1 are shown at the extremes of no SUVs versus a high concentration of SUVs. What happens at intermediate concentrations where there would be more of a mix of bound and unbound populations - can the two components be clearly resolved in the log-linear plots of G(tau)?

      3) I may have missed something, but why does the population of membrane-bound protein saturate at much less than 100%? Is there a baseline parameter for the population at high [DPPC SUV] in addition to Ka? One thing that occurred to me is that membrane binding may quench the fluorescence somewhat, so the amplitude of the membrane-bound population may be lower than it should be, hence this effect; and the differences in folding/misfolding of the SOD mutants may lead to different binding to the SUVs which would in turn affect the relative amplitudes of the two components. This wouldn't affect the fit of the sigmoidal curves, but maybe the relative fraction of slowly diffusing components should not be literally interpreted in terms of a bound population. Rather than "population membrane bound" Fig. 2f could say "Fraction bound fluorescence" or similar? This interpretation would support the authors' contention that H72F is more apo-like and H121F more holo-like.

      4) The differences in the ratio Ksvm/Ksv are basically reflecting differences in Ksv, because the values of Ksvm are all very similar. Thus it may reflect more the differences in non membrane-bound protein than differences in membrane binding, as seems to be the inference in the paper?

      5) The finding of change in secondary structure on membrane binding based on IR data, in particular increase in alpha-helical population, for the apo form and the H72F, is very interesting and strongly supports differences in membrane interaction between WT/H121F and apo/H72F - maybe this data should be included in the main text rather than the SI in fact? To me this seems a more noteworthy change than the modest differences in membrane association constants obtained from FCS.

      6) Aggregation was studied for the reduced form of the disulfides. The authors should motivate why the aggregation is studied using the reduced form of the protein while the prior work in the paper used the oxidized form (I believe?). My knowledge in this area is limited so I'm not sure which is the form more relevant to observed pathologies.

      7) A complicating factor in the perturbation of GUV membranes by the aggregates formed with/without SUVs present is the SUVs themselves. Presumably there is a significant SUV concentration in the aliquots taken from the aggregation reaction - could the SUVs rather than differences in the aggregates be responsible for the difference in the effect on GUVs? A control could be to add just SUVs to the GUV samples.

      8) For the validation, a statistical test should be used to demonstrate the significance of the observed correlations.
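
      One possible form of the test requested in point 8 is a permutation test on the Pearson correlation. The sketch below is an assumption about how this could be done (function name, two-sided test, and number of permutations are illustrative), not a description of the authors' validation.

```python
import numpy as np


def permutation_corr_test(x, y, n_perm=10_000, seed=0):
    """Two-sided permutation test for the Pearson correlation of x and y."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r_obs = np.corrcoef(x, y)[0, 1]
    # Null distribution: shuffle y to break any pairing with x.
    r_null = np.array(
        [np.corrcoef(x, rng.permutation(y))[0, 1] for _ in range(n_perm)]
    )
    # Add-one correction keeps the p-value strictly positive.
    p_value = (np.sum(np.abs(r_null) >= abs(r_obs)) + 1) / (n_perm + 1)
    return float(r_obs), float(p_value)
```

      Here `x` could hold the distance of each mutation to the metal site and `y` the predicted membrane binding free energy; a small p-value would indicate that the observed correlation is unlikely under random pairing.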

    2. Reviewer #2:

      In this manuscript, Sannigrahi et al. studied the role of the metal binding sites of SOD1 in its aggregation and toxicity. They created Zn-only and Cu-only binding mutants as well as a Zn/Cu binding-deficient mutant. The Zn-bearing mutant behaved similarly to the wild-type protein in terms of membrane binding, aggregate formation and toxicity, while the Zn/Cu-deficient mutant behaved similarly to the Cu-bearing (no Zn) mutant. They conclude that the Zn binding pocket is crucial to keep the protein in a healthy state and that, in the absence of Zn binding, the protein aggregates, especially in the presence of membranes. Lastly, they investigated real disease mutations and sampled two mutations with different degrees of Zn binding, and confirmed the same trend: if the Zn binding pocket is affected, the mutation is more severe.

      I am not an expert on this particular biological question (ALS and the role of SOD1), but I evaluated the technical aspects of the manuscript.

      In general, the manuscript is well written, the messages are clear and the conclusions are supported by data. I have only minor points.

      1) Figure 2a - how many times were the experiments performed? Do the authors show the average of multiple measurements?

      2) Figure 2e - it would be useful to show which residues interact with the membrane in the computational model.

      3) "The apoaggm appeared to exhibit network of thin aggregates (the average size was found to be 700-800 nm with an average height of 6-8 nm) which were found to be connected by the spherical DPPC vesicles (Figure.3e, inset; Figure. 3f)." Is it possible that the H72F variant (or both mutants) induces a curvature or binds only curved membranes? The authors can address this by looking at the aggregation in GUVs.

      4) It would be interesting to see if the binding and aggregation of the Apo and H72F is dependent on membrane composition.

      5) In Figure 4, why didn't the authors use the fluorescently labelled proteins they used in Fig. 3? They could then see the aggregation specifically, as well as curvature effects and membrane deformations. GUV pore formation can also be seen directly by fluorescent proteins in the solution.

      6) I can understand that the authors picked two known mutations (G37R and I113T) to match their own mutants, and to represent a severe and a mild mutant, but it would be very useful and a lot more convincing if they also picked an intermediate mutant that is not as severe as I113T and not as mild as G37R.

    3. Reviewer #1:

      Sannigrahi et al. report the investigation of structural determinants of membrane insertion and aggregation of Cu-Zn superoxide dismutase (SOD1), an enzyme that is implicated in motor neuron disease. The authors combine mutagenesis experiments with a variety of techniques, involving tryptophan fluorescence, FTIR, AFM, ThT fluorescence, FCS, optical microscopy and computer simulation. They arrive at the conclusion that conformational change and site-specific metal binding modulate membrane insertion and aggregation of SOD1.

      Identifying the origins of SOD1 dysfunction and aggregation can have important implications in the development of therapeutic strategies for motor neuron disease. The underlying molecular biology is not well understood. The study by Sannigrahi et al. is an integrated approach involving an impressive number of complementary methods. However, the conclusions put forward are not sufficiently supported by the data presented. The applied methodologies yield data of insufficient resolution to draw the detailed molecular picture presented. Additional experimental work would be required to substantiate or provide evidence for the findings.

      1) The statistical mechanical model (WSME) is coarse-grained. For example, it considers three consecutive amino acid residues as a block. It is therefore of limited suitability for studying the effects of single-point mutations and metal binding on conformation and aggregation.

      2) The effect of mutation and Zn/Cu-binding on Trp fluorescence spectral properties of SOD1 is marginal (Fig. 2a). Likewise, the far-UV CD spectra shown in supporting information show marginal changes. The broad spectral characteristics of far-UV CD defy an accurate, quantitative deconvolution of secondary structure content. No solid conclusions concerning a conformational change can thus be inferred. FTIR spectra are broad and smooth (i.e. lack significant sub-structure) (Fig. 2b, c). Their deconvolution into seven discrete sub-states appears ambitious and error-prone.

      3) The authors propose to determine membrane affinities of SOD1 and mutants thereof by applying extrinsic fluorescence modification and by measuring binding to artificial micelles using fluorescence correlation spectroscopy (analysis of diffusion time constants). Extrinsic fluorescence labels are hydrophobic compounds and supposedly tend to strongly interact with membrane lipids. This will provide an artificial bias of conjugates to micelle membranes. Control experiments are required to rule out effects of the labels.

      4) The influence of mutation on stability and conformation of SOD1 is unclear. Mutations H72F and H121F, introduced to alter metal binding, may as well have effects on stability and conformation (folding) of the entire domain, irrespective of the metal-bound/unbound state. Mutation itself may lead to unfolding and aggregation. Mutation of a histidine to a phenylalanine, as applied by the authors, may have disruptive effects on protein structure because a small side chain is replaced by a larger one. Thermal and/or chemical denaturation experiments, carried out on isolated protein material and mutants thereof, and their analysis are required to assess the effect of mutations on folding and stability.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      The reviewers have discussed the reviews with one another. They acknowledge the integrated approach taken by you and your co-authors and the amount of data presented and discussed. However, the reviewers raise major concerns regarding both experiments and computer simulations. Not all conclusions are justified by the data presented and additional data are required.

    1. Reviewer #2:

      General assessment

      The manuscript of Zhang and colleagues studies the expression of PACAP and PAC1 mRNA in inhibitory and excitatory neurons in the entire mouse brain using a dual ISH method. Additionally, a behavioural test is carried out to provide a functional role for PACAP/PAC1 in olfaction and defensive behaviour, followed by cFos examination of selected brain regions to indicate the role of PACAP and PAC1 in such behavioural outputs.

      Summary

      In my view, this study has two parts that could work separately.

      Part 1: the PACAP/PAC1 characterization is well designed and executed. The result description is lengthy and sometimes confusing. Figures and tables (including the supplementary information) are clear and informative. The authors decide to not show Vipr1/Vipr2 data, which should be reconsidered. Overall, this part of the manuscript represents a nice piece of work and surely will be very helpful to those who wish to work with PACAP/PAC1.

      Part 2: I think this part is the critical one in this manuscript. Starting from section 4, it uses part 1 of the manuscript to review the literature and build a neuronal circuit involving PACAP/PAC1 that accounts for behavioural processes. It is literally a review inside the results section. The schematic figures are interesting but also quite speculative regarding brain signalling, since the authors did not perform any experiment to investigate the pathway of PACAP and the literature is scarce. Moreover, the role of Vip receptors was completely neglected here.

      Behavioural test: the authors opted for the predator odor paradigm based on the involvement of PACAP in the defensive circuit. However, a global PACAP KO is used instead of specifically targeting a brain region or a neuronal population. Not that this is not interesting, but the entire specificity applied in the first part of the study was not used to find a functional role for PACAP. Despite the cFos analysis demonstrating reduced activity in several brain regions in PACAP KO, the specific role of PACAP in such regions and the importance of each of the three PACAP receptors remain unknown. Also, the use of a global KO hinders understanding of the excitatory/inhibitory balance, in which the PACAP system may play a role. Moreover, given the specific requirement for the sense of olfaction in this test (and the considerable expression of PACAP in the olfactory bulb), it is not clear how much olfactory function is affected in PACAP-deficient mice and, consequently, how much this affects the defensive/fear circuit. Finally, is the change in locomotion found here due to a fear response or to hyperlocomotor activity?

    2. Reviewer #1:

      Zhang/Hernandez et al provide a fascinating and comprehensive dataset of the distribution of PACAP (Adcyap1) and PAC1 (Adcyap1r1) mRNA expressing cells in most regions of the mouse brain. Using dual (two-colour) in situ hybridization (DISH) they go further than the Allen Institute ISH datasets by revealing the co-expression with common neurotransmitters (VGAT, VGLUT1, VGLUT2) as well as linking expression to a variety of physiologically and behaviourally relevant neural circuits. Among their observations, they observe a subpopulation of PACAP-expressing CA3 neurons, find that dentate mossy cells express PACAP with a particular septo-temporal distribution, as well as prominent expression in neurons of the bed nucleus of the anterior commissure. They report overlapping PACAP/PAC1 cell groups and also find that PACAP knockout mice exhibit impaired predator odour responsiveness and reduction in neurotransmitter expression in PACAP-related regions. This is a valuable and important study on PACAPergic brain regions in mice, especially relating to the hypothalamus, but would benefit from a reorganisation to improve the presentation of data, and further quantitative criteria to strengthen the observations.

      1) The paper would benefit from a reorganisation, especially when referring to figures and tables. There are a very large number of abbreviations. A list near the beginning of the manuscript would help the reader, and would also shorten the figure legends and improve readability/flow. For the non-expert, some areas should be labelled/highlighted separately or provide more information in the figures, e.g. line 184 'ACA and the entorhinal cortex' one has to search the figure legend, find the number then search the figure panels to find the location of these brain regions. Abbreviations and brain region names should be consistent, e.g. line 241, ACC is used in text, but ACA in figure and legend. Unless mistaken, Table S1 is not mentioned in the text. Figure 9 is first mentioned in the Discussion (line 780). Since these are valuable data, refer to this figure in the main Results section in terms of the knockout. Figure S1 is very informative, but requires a lot of searching to find the panel that is referred to in the text. In Figure S1-7/7-M, panels M1-4 are identical to Fig 1E-H and the scale bar in M3 is different to 1G.

      2) In several places there are anecdotal statements and it is not clear about the reproducibility of the results. The methods for quantification (including those mentioned in Table legends) should be included in Methods. For animals, please check and state the total number of mice and rats used in the study, and whether EGFP mice were also used (as referred to in line 191). In line 816, what is a group?

      For c-fos experiments, how were these cells counted, how many sections were analysed per mouse, what was the section thickness, and how were the values calculated (mean, absolute numbers)? Was fos counting done blind to genotype?

      Was there variation between animals in terms of expression levels/strength? Case/animal numbers in figures would help. It is not clear what is meant throughout by statements such as 'strongest'. Is this by density in cells or number/intensity of puncta? For example, section 3.1, retina. What is meant by 'higher percentage than previously reported' (line 148)? Is this referring to both previous reports in mice? Also see Engelund et al Cell Tissue Res 2010. How many samples and/or mice were examined and how were ganglion cells counted?

      Similarly, lines 174 and 182-183, cortical expression in different layers, how were the values of 80% obtained? Again line 196, 'highest expression level of PAC1 among all brain regions' is a strong claim, how was this quantified? Line 249-251, need references/evidence for observations of mouse claustrum percentages. Line 272, 'more than 90%'. Line 463, 'the highest expression of PACAP was observed in the MnPO'.

      Line 484 in terms of the olfactory pathways, is there evidence of co-transmission or is this a hypothesis?

      Some claims will need careful revision. E.g. in the Fig 5 legend, the last sentence contradicts line 286.

      In line 187, the finding that 100% of the 3 GABAergic subpopulations expressed PAC1 is a big claim, yet there is no quantification to back this up. How many brain regions were examined, how many mice, sections, counted cells etc.? If it just refers to the primary somatosensory cortex, was it all or some layers?

      Table 2 (also applies to parts of Table 1), do blank areas of the table mean not examined? Or should there be '-' in these areas? For example, the medial septal complex contains vglut2 expressing cells but the corresponding row/column is blank.

      Lines 191-193: there is the claim that PACAP mRNA was not found in cell body layers, but in Table 1 it is reported that there is weak expression in VGLUT1+ cells. Since VGLUT1 cells are in the pyramidal cell layer, this seems contradictory. It would be helpful to have a higher power image of CA1 (as for rat in Fig S2). Could expression outside this layer be in subpopulations of GABAergic neurons? Were these examined (blank in Table 1)? DG is also missing from Table 1. Regarding PAC1 expression, line 195 claims it is selective for VGAT cells, but there are clear examples of VGAT- cells in Fig S3B expressing PAC1. What are these?

      3) Suggestion about paracrine/autocrine signalling. Is there evidence in literature for such a role? This seems speculative without immunohistochemical evidence. Hannibal 2002, carried out at both the protein and mRNA levels, showed axon terminals in multiple regions. Can these be mapped to the regions that express PAC1 in mice? Is there any evidence or could the authors comment on the existence of presynaptic PACAP receptors? Expression of PAC1 mRNA does not imply that the cell would express the protein exclusively along its somatodendritic membrane. 'Classical' neurotransmission presumably could occur in PACAP/PAC1 rich regions via local axons in addition to long-range axons.

      4) The observation of PACAP in part of temporal CA3, which the authors refer to as CA3c, has in fact previously been defined as CA3vv, corresponding to the coch expressing domain (see Thompson et al Neuron 2008, Fanselow and Dong Neuron 2010). PACAP may indeed be an additional marker along with calretinin for this principal cell subpopulation, and they may want to revise their model or refer to these earlier papers.

      5) PACAP KO. Some clarification would be welcome in terms of animal cohorts. Please state the experimental unit (i.e. n=9 mice/group). In D, the freezing data show only 8 mice, was one pair excluded due to lack of freezing in an animal, as for jumping mice in C? In Ai, Aii, Bi, Bii, does this show the traces for the total time?

      In the separate experiment (lines 630-635), was n=3 a separate cohort of mice or from the N=18 total as stated in the methods? Is the n=3 per group or total mice? This may require an increased sample size for this claim, or show quantification/statistical tests. For this test, were experimenters also blind to the genotype? The last sentence is difficult to follow.

      For the behavioural tests, please include details about whether the wooden boxes, room and experimenter were familiar to the mice before the test (which could affect variability), whether mice were tested at the same time of day, and if KO and WT animals were housed together.

      In the Discussion, ~line 797, can the authors comment on or provide evidence of possible developmental changes / compensatory mechanisms occurring in the KO animals?

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      The manuscript of Zhang and colleagues studies the expression of PACAP and PAC1 mRNA in inhibitory and excitatory neurons in the entire mouse brain using a dual ISH method. Additionally, a behavioural test is carried out to provide a functional role for PACAP/PAC1 in olfaction and defensive behaviour, followed by cFos examination of selected brain regions to indicate the role of PACAP and PAC1 in such behavioural outputs. The reviewers believe that this is a valuable and important study on PACAPergic brain regions in mice, especially relating to the hypothalamus, but that it would benefit from a major reorganisation to improve the presentation of data, and from further quantitative criteria to strengthen the observations.

    1. Reviewer #3:

      The authors touch upon a highly relevant issue. Non-synaptic peripheral interactions (NSIs) are of interest to the broader neuroscience community as they are typically left in the shadow of the more prominent network studies. The authors compare a simple computational model of pure NSI with the established model of lateral network inhibition, concluding that NSIs perform better in odour mixture identification and source separation. To achieve a comprehensive model study that would become a definitive reference in the field, I identified a number of required improvements with respect to clarity, validity, and interpretation of the model.

      1) Model approach

      The model mixes different methodological approaches and the model description overall lacks clarity. The model could be considerably streamlined by omitting unnecessary/unwanted simplifications and complications.

      The ORN binding rate model (Eqns.2+3) and ORN-ORN interaction (Eqn.5) are clear (see also #2) and generate activation variables x with adaptation y.

      The authors then claim to use a "biophysical spike generator", which in my eyes is not true. Rather, the transfer function (Eqn. 4) generates a firing rate nu, which is subsequently used as the intensity for stochastic point process realizations (inhomogeneous Poisson, see minor #1). The Poisson assumption is surprising and the reference to Kaissling et al. (2014) is incomplete. Nagel & Wilson (2011) argue for a Poisson-like transduction process and subsequent adaptation in the spike generating mechanism, which in a biophysical conductance/current based model generates beneficial non-renewal properties (Farkhooi et al., 2013). Omitting Eqn. 4 and the adaptation variable y in Eqns. 3+5, and using x plus noise (Poisson transduction events?) as input to a biophysical spike-generator model, would elegantly separate transduction and spike generation, and naturally implement spike frequency adaptation.

      The next step is confusing: each ORN spike is transformed into a binary signal of a certain duration and amplitude (it took me quite a while to figure out what is actually meant with spike height and width). This seems an unnecessary and unwanted complication, reminiscent of simpler binary models. The biophysical voltage model of the PN includes short synaptic (tau_s) and long adaptation (tau_x) time constants that ensure the temporally extended effect of each incoming spike and synaptic amplitude is encoded as alpha_ORN. Thus, omitting the 'spike block' of height and width should be feasible and render the model more biologically realistic and transparent.

      The authors further introduce a post-hoc model for precise ORN-ORN correlations. Considering the other model simplifications (list in Discussion) this seems a rather unmotivated complication and its effect is not explored. The experimentally observed correlation could stem from either competition of co-housed ORNs or from antennal lobe network interactions affecting ORN axons. The former was explicitly excluded from the model and the latter is not captured.
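
      For readers unfamiliar with the distinction drawn in this section, the following minimal sketch shows what rate-driven inhomogeneous Poisson spike generation typically amounts to, in contrast to a biophysical spike generator. It is a generic per-bin Bernoulli approximation under assumed names and parameters, not the authors' code.

```python
import numpy as np


def poisson_spikes(rate_hz, dt_s=0.001, seed=0):
    """Draw a spike train from a time-varying rate (inhomogeneous Poisson).

    Per-bin approximation: P(spike in bin) = rate * dt, valid when rate * dt << 1.
    Returns a boolean array with the same shape as rate_hz.
    """
    rng = np.random.default_rng(seed)
    rate_hz = np.asarray(rate_hz, dtype=float)
    p_spike = np.clip(rate_hz * dt_s, 0.0, 1.0)
    return rng.random(rate_hz.shape) < p_spike
```

      In this scheme each bin is drawn independently, so the spike train carries no memory of its own history. Adaptation can then only enter through the rate itself, which is why a spike generator with intrinsic adaptation behaves differently.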

      2) Model interpretation

      One major concern is the model reduction to two ORN types with exclusive odour sensitivity, which might overemphasize the NSI effect. Tuning of receptor types can be rather broad (e.g. Wilson et al., 2004). Related is the reduction to only two glomeruli. How would the picture change with increasing number of receptor types and glomeruli with a broader receptor tuning model?

      A second major concern is that the comparison is restricted to the pure NSI and pure LI models. If we assume that LI is present in the AL, a third option, the combined model, should ideally show synergistic effects.

      The conclusion "information about input correlations is contained in the first part of the response before adaptation takes place" in the NSI model is based on the surplus spike count within a window of 50-150ms of estimated rates above 150Hz (Fig. 8d). The 'encoding' of temporal whiff correlation was seen in the average rate for the LN but not the NSI model (Fig.8c). This looks like an ad-hoc implementation of a new measure to achieve a wanted effect of the NSI model. The authors must motivate this unusual measure with biological plausibility.

      The AL model assumes LN activation by PNs. It has been argued for different species (Galizia 2014) including D. melanogaster (Seki et al., 2010) that LNs receive direct input from ORNs. Previous computational models have used either type of implementation. What is the authors' rationale behind their choice and would ORN->LN activation change their conclusions?

      What are the crucial experiments to be conducted for testing model predictions? E.g. transient (temperature-sensitive) genetic suppression of a specific OR type? Optogenetic activation of a specific OR type?

      3) Evolutionary perspective

      The abstract promises that "... results shed light, from an evolutionary perspective, on the role of NSIs, which are normally avoided between neurons..." and I was looking forward to a knowledgeable discussion. The MS would gain relevance on a broader scope if the authors could provide (comparative) arguments. Do some (older) families within the class of insects or other arthropod classes (e.g. crustaceans) lack co-housing of different ORN types? Is there known variation within groups, e.g. between different bee species? Can this be linked to ecological demands?

    2. Reviewer #2:

      In this manuscript, the authors postulate that the observed phenomenon of stereotyped colocalization of OSNs in insect antenna coupled with evidence of "non-synaptic interactions" (NSI) can serve an important role in parsing mixture ratios. Parsing these ratios accurately has been of key interest both for the understanding of pheromone recognition, as well as the proposed concept of "concentration invariance".

      The authors perform a nice series of calculations showing that NSI can improve the resolution of synchronous inputs, and conversely, improve the separation between asynchronous inputs. Both aspects are important features of resolving stochastic and intermittent plume information in nature.

      Although I have collaborated in a number of computational studies, my main expertise is in the neuroethology of olfaction, and therefore my comments will be concentrated on this aspect. However, in general the computation performed appears reasonable for the concept to be tackled.

      However, I have a few questions on the rationale for the study, as well as its interpretation, that I would like the authors to address. I will separate my concerns into three categories for simplicity:

      1) BIOLOGY: The choice of Drosophila for the calculations is understood and likely necessary, as it is the only system for which we have sufficient neurophysiological data at both the peripheral and central levels to address this question. However, the concept of co-localization itself is known across the Arthropoda, and varies widely among species. For example, while moths and flies generally have 1-4 colocalized OSNs per sensillum (and these are the two systems that the authors reference), other systems like beetles, ants, and bees have up to 20-30 colocalized OSNs. Locusts, for which Gilles Laurent performed foundational research on blend encoding, have up to 50 OSNs in the same sensillum. Further, while it is true that pheromone blend neurons are often colocalized, this is not always the case.



      Thus, I would like the authors to take some time to consider: If NSIs are important for mixture processing, why do insects like bees (who, as shown by Giovanni Galizia and Paul Szyszka referenced in the manuscript can process mixtures at high speeds) have 20-30 OSNs together? How would this work? 


      2) ENVIRONMENT: While concentration invariance and ratio processing have been shown to be important for pheromone processing in moths and some other cases, the true complexity of odor detection is just beginning to be appreciated. See (https://doi.org/10.3389/fphys.2019.00972) for a nice recent review. First, odors are not always presented as point sources, they are not often without a chemical background, and insects themselves might not always have need for such strict attention to ratio. In the case of Drosophila, one can easily argue that when locating a rotting fruit for oviposition, the exact composition of the fruit odor might be less important, although the flies have specific OSNs to detect it.



      So, I would like the authors to address - If NSIs are important for mixture processing, what happens when they are not needed, meaning when concentration ratios are not essential for identification? Would they limit the processing otherwise? If the authors disagree with this line of thinking, I would also like them to comment on the evidence that insects always need such fine tuning of ratios in their odor detection.


      3) OTHER EXPLANATIONS: The authors, as well as others like Tim Pearce and Christiane Linster, have spent considerable time providing computational evidence regarding mixture processing (not just monomolecular odors). While there is time spent on comparing the NSI model to other models ("Comparison with related modelling works"), it mainly focuses on how the current model incorporates more information, rather than on why it performs better in detecting ratios.

      I would like the authors to take more time here to compare the NSI to other mixture processing models (several of which are not referenced) and explain why their model is better, just like they do in comparing how NSI improves ratio processing over LN/PN activity alone. Further, they mention myelination - so can the authors explain how mammals that would need similar attention to ratios accomplish this without NSIs - are there any similarities expected?

      These explanations and additions will greatly improve the relevance of this study to insect science and future research on this interesting topic.

    3. Reviewer #1:

      This is an admirably clear account of how non-synaptic interactions (NSIs) in the ORNs in the insect sensillum might improve processing of odor mixtures with complex temporal structures. The paper methodically goes through the initial constraining to data, comparison with other models, and predictions of the improved signal representation by a model incorporating NSIs.

      The fundamental computational concept here is that the NSIs can carry out highly specific high time-resolution mutual inhibition operations. All else follows directly from this.

      General comments:

      1) My major critique of the paper is that I don't think it adds much conceptually. Higher time-resolution in responses follows directly from the biophysics of ORN interactions in a sensillum. My reading is that the improvements in coding follow directly from this improved time-resolution.

      2) While the authors discuss various limitations of the model by way of simplifications, I would like to point out another by way of network structure: the only pairwise interactions possible here are those encoded by the co-expression of ORNs in a sensillum. Thus the LN network will potentially support a wider range of lateral inhibition interactions than NSIs. There should be some data on this, and certainly the authors should comment on it.

      3) I think that perhaps the authors are missing a possible additional value of NSIs, which is that if the odor filaments are fine enough to excite only a small fraction of sensilla at a time, the NSI computation might be more effective than converging multiple homotypic ORNs into the PNs and then doing lateral inhibition. I don't know if odor filaments on this scale have been demonstrated.

      In summary, I think the paper does a very good job of presenting this model and exploring its implications. However, I found the coding implications to be obvious outcomes of the higher temporal resolution of the NSIs as compared to synaptically mediated lateral inhibition. The well described model of early insect olfaction will be of value to specialists in the field.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      The study is a lucid analysis of non-synaptic interactions between ORNs in insect sensilla, with predictions on how these interactions could improve processing of odor mixtures with complex temporal structures. However, the reviewers and I had a number of major concerns with the study.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Please note that the authors have provided a formatted PDF version of this rebuttal, including additional figures and references, via the Open Science Framework: https://osf.io/5acqp/

      Reviewer #1

      This is an interesting and thorough study characterising human iPSCs with heterozygous or homozygous mutations in the PI3K pathway that lead to its hyperactivation. They prove that the increased stemness results from enhanced autocrine responsiveness to the TGF signalling pathway. The main conclusions are well supported by the presented data. Cutting-edge tools and bioinformatic analyses are adequately applied. I have only one important point:

      1) Western blot based validation of TGF pathway activation in wt and mutant iPSCs will be helpful to strengthen the results based on bioinformatic data.

      AUTHORS’ RESPONSE: We thank the Reviewer for the positive evaluation of our work.

      Functional validation of the signalling hypothesis is indeed important, and we did in fact already present supportive data. Current evidence suggests that SMAD2 is the main transcription factor mediating actions of the TGFb/NODAL pathway in an early developmental context [1,2], and we have shown increased phosphorylation of SMAD2 (S465/S467) in PIK3CAH1047R/H1047R iPSCs using RPPA in the two datasets shown in Fig.2.

      We have attempted to demonstrate increased NODAL protein directly in PIK3CAH1047R/H1047R cells, but have been unsuccessful due to poor signal on immunoblotting. We thus opted for functional testing of our hypothesis using the experiment presented in Fig. 5, wherein TGFb (a surrogate for NODAL) is removed from the culture medium. Human iPSCs depend strictly on TGFb/NODAL for maintenance of NANOG expression and thus pluripotency [3,4]. Upon exclusion of TGFb/NODAL from the culture medium of normal human iPSCs, the early responses (prior to overt differentiation) are expected to be: (A) decreased NODAL expression, due to well-established autoregulation [2], then (B) a decrease in NANOG and ultimately POU5F1 (OCT3/4) mRNA levels (see also Introduction, lines 80-90). The evidence in Fig. 5 that PIK3CAH1047R/H1047R fail to exhibit these responses upon exogenous TGFb/NODAL removal supports the notion that these cells autonomously sustain TGFb/NODAL signalling.

      For improved clarity, we have also added the following information to the revised manuscript:

      lines 202-205: “This is consistent with strong NODAL mRNA upregulation and increased pSMAD2 (S465/S467) in PIK3CAH1047R/H1047R iPSCs in the current study (Dataset S2 and RPPA data in Fig. 2, respectively), and with prior evidence of activation of the NODAL/TGFb pathway in homozygous PIK3CAH1047R iPSCs.”

      Reviewer #2

      In this manuscript, Madsen et al. have investigated the role of heterozygous versus homozygous PIK3CAH1047R gain-of-function mutation in maintaining stemness of induced pluripotent stem cells (iPSCs). The authors have performed high-depth RNAseq, proteomic, and RPPA analyses to show that biallelic PIK3CA alterations induce stronger activation of the PI3K signaling axis, compared to monoallelic mutations. The authors claim that a higher PI3K signaling dose activates the NODAL/TGF-b pathway, which in turn supports stemness in an autocrine fashion. These are important findings; however, the manuscript and its conclusions can be improved.

      AUTHORS’ RESPONSE: We thank the Reviewer for acknowledging the importance of the work and for their constructive suggestions for improvements.

      The authors have described the role of the PIK3CAH1047R gain-of-function mutation in cancer and overgrowth syndromes. However, cancer-associated somatic mutations in PIK3CA are mostly heterozygous. Similarly, PIK3CA-related overgrowth syndromes (PROS) are caused by post-zygotic mosaic PIK3CA activating mutations. As such, the relevance of homozygous PIK3CA alterations to these pathological conditions is unclear. The authors should elaborate on the biological implications of their findings.

      AUTHORS’ RESPONSE: We disagree with the Reviewer’s comment, which implies that homozygous PIK3CA mutations are not relevant to many cancers. In our previous work [5], we provided evidence that many human cancers harbour multiple PIK3CA mutant alleles. Specifically, among cancers with a unique PIK3CA mutation, approximately 50% exhibit multiple copies according to allele copy number analysis. We further demonstrated that a substantial proportion of cancers have multiple different PIK3CA variants or additional oncogenic ‘hits’ within the pathway. These findings have been supported by other recent high-profile papers [6–8]. Such multiple alterations increase activity of the PI3K pathway beyond the level seen with heterozygosity alone [5,6]. This substantial body of literature renders our PIK3CAH1047R iPSC model system highly relevant for studying disease-relevant, dose-dependent oncogenic PIK3CA activation.

      The Reviewer is correct, however, that PROS is caused by postzygotic heterozygous PIK3CA mutations almost exclusively. Observations in homozygous cells are therefore not directly relevant to the pathogenesis of PROS. On the other hand, the heterozygous cells are closely relevant, being human, carefully matched with isogenic controls, and unperturbed by further manipulations such as artificial immortalisation. Our prior studies demonstrated no clear phenotypes in heterozygous cells in the iPSC differentiation paradigm, despite the rock-solid causal nature of heterozygous mutations in PROS. This negative finding, surprising given the dramatic PROS phenotypes, is very important in understanding how best to create disease-relevant PROS models. One intent of the current study was to increase the sensitivity of our transcriptomic analysis, and to combine this with proteomic studies to determine if heterozygous cells really do not exhibit a phenotype. We now show that there are indeed faint echoes in heterozygous cells of the dramatic changes in homozygous cells. We believe that the human growth phenotype is a summative consequence of such small differences in growth behaviours sustained over months and years, highlighting how subtle differences in signalling can lead to dramatic human growth consequences across the lifecourse. Similar observations were also recently made following systematic analyses of oncogenic RAS mutations [9]. We thus contend that the new information we present about heterozygous PIK3CAH1047R cells, while much less “showy” than the cancer-relevant behaviours of homozygous cells, is very important for understanding the PROS phenotype and its experimental modelling. To emphasise this point, we have added the following statements to the abstract, introduction and discussion, respectively.

      • lines 56-57: “This work illustrates the importance of allele dosage and expression when artificial systems are used to model human genetic disease caused by activating PIK3CA mutations.”
      • lines 104-106: “We discuss the implications of our findings for understanding and modelling developmental disorders and cancers driven by genetic PI3K activation.”
      • lines 333-340: “Finally, our observations are important for future studies seeking to model human PIK3CA-related diseases. The modest changes observed in heterozygous PIK3CAH1047R cells, in sharp contrast to the radical transcriptional alterations in homozygous cells, emphasise the importance of careful allele dose titration when artificial overexpression systems are used to model disorders caused by genetic PIK3CA activation. Our findings in heterozygous cells are also a reminder that very small effect sizes in cellular systems may summate and result in major human phenotypes over a life course. That such minor changes are found in a cellular study of a rare and severe disorder emphasises the challenges of modelling much more subtle disease susceptibility conferred by GWAS-detected genetic associations, where cellular effect sizes are likely to be smaller still.”

      The role of biallelic PIK3CA mutation is reminiscent of compound mutations in PIK3CA, which have also been shown to increase PI3K signaling output. However, double PIK3CA mutations confer enhanced sensitivity to PI3K inhibition (Toska et al. Science 2019). Could the authors kindly speculate on this discrepancy?

      AUTHORS’ RESPONSE: We emphasise first that PIK3CAH1047R/H1047R cells do respond to BYL719 at the signalling level, as demonstrated previously [5] and in the manuscript (revised Figure S5; see also additional Western blot below). Our point is that the cells have undergone a switch to self-sustained stemness. That is, while PIK3CA activation was the driver of the initial change in cell state, the induced stemness phenotype is no longer reversed by removal of that trigger, with our data suggesting that this is now driven by self-sustained TGFb/NODAL signalling. This is in line with the role of this pathway in the maintenance of the pluripotent state. We speculate that this may be important in a cancer context where surviving stem cells may permit cancer persistence after toxic therapies, even if short term growth of tumours is reduced by agents such as PI3K inhibitors.

      Our data are not directly comparable to prior cellular data, for example in Vasan et al. [6], due to: (a) use of a different cell model system and (b) assessment of different functional responses. We would also sound a methodological note of caution regarding some of the prior studies alluded to, as potentially confounding differences in growth rate between the cells studied were not corrected for. It is well-established that IC50 and Emax values depend on cell division rates, and failure to correct for this can result in artefactual correlations between genotype and drug sensitivity (see, e.g., Hafner et al. Nature Methods 2016: “Growth rate inhibition metrics correct for confounders in measuring sensitivity to cancer drugs” [10]).
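
      The growth-rate confounder mentioned above can be illustrated with a minimal sketch. It uses the growth-rate inhibition (GR) value defined by Hafner et al., GR = 2^(log2(x_treated/x0) / log2(x_control/x0)) - 1. The cell counts and the two cell lines below are hypothetical and are not data from this study; the function names are illustrative only.

```python
import math


def relative_viability(treated: float, control: float) -> float:
    """Conventional endpoint readout: treated count divided by untreated control count."""
    return treated / control


def gr_value(x0: float, control: float, treated: float) -> float:
    """Growth-rate inhibition value: 2**(log2(treated/x0) / log2(control/x0)) - 1."""
    return 2 ** (math.log2(treated / x0) / math.log2(control / x0)) - 1


# Hypothetical example: the drug halves the growth rate in both lines,
# but the "fast" line divides three times during the assay and the "slow" line once.
x0 = 1000.0
fast_control, fast_treated = x0 * 2 ** 3, x0 * 2 ** 1.5
slow_control, slow_treated = x0 * 2 ** 1, x0 * 2 ** 0.5

for name, ctrl, trt in (("fast", fast_control, fast_treated),
                        ("slow", slow_control, slow_treated)):
    print(name,
          "relative viability:", round(relative_viability(trt, ctrl), 3),
          "GR:", round(gr_value(x0, ctrl, trt), 3))
```

      In this toy case the endpoint ratio makes the fast-dividing line look more drug-sensitive, whereas the GR value is the same for both lines, which is the kind of artefact the correction is meant to remove.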

      Similarly, the p110 alpha specific inhibitor, alpelisib, is highly effective against PIK3CA-mutant ER+ breast cancer and PROS. As such, the clinical relevance of the insensitivity of homozygous PIK3CA mutation to PI3K inhibitors is unclear.

      AUTHORS’ RESPONSE: Efficacy of Alpelisib in PROS is currently supported only by unregistered observational studies, but is nevertheless striking. It is not relevant to our findings in homozygous cells, as the Reviewer has previously observed, however.

      As for cancer, in a randomised phase 3 trial that compared Alpelisib/BYL719 with fulvestrant to fulvestrant alone, the overall response (irrespective of PIK3CA mutant status) was indeed greater with the combination treatment (26.6 % vs 12.8 %), with a hazard ratio of 0.65 (95% CI, 0.5 to 0.85) in patients with PIK3CA-mutant cancers versus a hazard ratio of 0.85 (95% CI, 0.58 to 1.25) in those without a PIK3CA mutation [11]. This trial demonstrated the utility of additional PIK3CA mutant-centric stratification, but a substantial proportion of patients with PIK3CA-mutant tumours (>50%) did not benefit from the BYL719 and fulvestrant combination [11]. However, these observations are not directly relevant to this manuscript and are instead included in a separate manuscript focused on PI3K signalling and stemness in human breast cancers (preprint [12]).

      Figure 2: The authors have performed RPPA analysis in the presence of 100 nM BYL719. Alpelisib is commonly used at 1 uM concentration for in-vitro experiments, and has a cMax of ~5 uM. We suggest the authors perform western blot analysis to confirm the results of RPPA.

      AUTHORS’ RESPONSE: We carefully chose the optimal concentration of BYL719 to preserve inhibitor selectivity, and to avoid undue toxicity and confounding off-target effects, rather than copying the dose “commonly used”. The Cmax is not relevant to our use of BYL719 in the current study as a precise tool compound. We refer the Reviewer to the known pharmacological characteristics of this compound [13,14]. According to available evidence, it is a selective PI3Kα inhibitor only at concentrations up to 250 nM (table below adapted from Ref. [13]; for formatted version, please see PDF version: https://osf.io/ecmhr/).

      | Enzyme | In vitro IC50 for NVP-BYL719 (nM) |
      | --- | --- |
      | PI3Kα | 4.6 +/- 0.4 |
      | PI3Kα-H1047R | 4.8 +/- 0.4 |
      | PI3Kβ | 1156 +/- 77 |
      | PI3Kδ | 290 +/- 180 |
      | PI3Kγ | 250 +/- 140 |
      | PI4Kβ | 571 +/- 42 |
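
      To make the selectivity argument concrete, the sketch below converts the in vitro IC50 values listed above into an approximate fraction of each enzyme inhibited at a given BYL719 concentration. It assumes a simple one-site inhibition model with a Hill slope of 1, which is an illustrative simplification and not a claim about cellular potency; the function name and the chosen concentrations are hypothetical.

```python
# Mean in vitro IC50 values (nM) for NVP-BYL719, taken from the table above.
IC50_NM = {
    "PI3Kα": 4.6,
    "PI3Kα-H1047R": 4.8,
    "PI3Kβ": 1156.0,
    "PI3Kδ": 290.0,
    "PI3Kγ": 250.0,
    "PI4Kβ": 571.0,
}


def fraction_inhibited(conc_nm: float, ic50_nm: float) -> float:
    """Fraction of enzyme activity inhibited, assuming a one-site model (Hill slope 1)."""
    return conc_nm / (conc_nm + ic50_nm)


# Compare the doses discussed in the text: 100 nM, 250 nM and 1 uM.
for conc in (100.0, 250.0, 1000.0):
    row = ", ".join(
        f"{enzyme}: {fraction_inhibited(conc, ic50):.2f}"
        for enzyme, ic50 in IC50_NM.items()
    )
    print(f"{conc:g} nM -> {row}")
```

      Under this simplified model, PI3Kα is largely inhibited at 100 nM while the other isoforms are only partly affected; at 1 uM the model predicts substantial inhibition of several off-target enzymes as well.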

      We have previously demonstrated (Fig. 2C in Ref. [5]) that 100 nM BYL719 is sufficient to restore pAKT (S473) levels in both heterozygous and homozygous PIK3CAH1047R to levels observed in WT cells. This is consistent with the RPPA data reported in the current work (Fig. 2B). Of note, while 500 nM BYL719 completely ablates pAKT irrespective of genotype, we previously noted substantial toxicity [5], precluding use of this or higher doses of BYL719 in our model system. This is in line with a recent Nature Cell Biology study by Yilmaz et al. ([15]) which demonstrated the essential growth-promoting role of the PI3K pathway in human pluripotent stem cells; Yilmaz et al. also demonstrate that compared to somatic cells (fibroblasts), human pluripotent stem cells suffer dramatic effects on growth/survival in response to Torin1/rapamycin [15], overall suggesting that this cell type is exquisitely sensitive to inhibition of the PI3K/AKT/mTOR pathway.

      In the present study we have also confirmed that 250 nM BYL719, used for Fig. 5 experiments, has worked as expected at the level of pAKT (S473) as shown in the below Western blot (see also revised Fig. S5; please access PDF version to view Western blot: https://osf.io/ecmhr/)

      Figures 3 and 4: The authors should expand their RNAseq analysis to demonstrate enrichment of stemness and TGFb signaling in homozygous mutant cells compared to heterozygous cells.

      AUTHORS’ RESPONSE: We thank the Reviewer for this suggestion. The unsupervised MDS plot (Fig. 1A) clearly demonstrates the overlap between wild-type and heterozygous cells, strongly suggesting functional concordance and consistent differences to homozygous counterparts. Indeed, the below count table illustrates that the majority of differentially expressed genes in homozygous versus wild-type cells are also differentially expressed in homozygous versus heterozygous cells, including the direction of the change (please access the PDF version for formatted table: https://osf.io/ecmhr/).

      | Comparison | Differentially expressed gene count |
      | --- | --- |
      | HOMvsWT | 5644 |
      | HOMvsHET | 5764 |
      | HOMvsWT AND HOMvsHET | 4825 (2300 upregulated; 2525 downregulated; 1 discordant) |
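
      The kind of overlap count reported above can be sketched as follows. The gene names and fold changes are hypothetical placeholders, not values from the study's datasets, and the function is an illustration rather than the authors' analysis code.

```python
def concordant_overlap(
    a: dict[str, float], b: dict[str, float]
) -> tuple[set[str], set[str], set[str]]:
    """Split genes significant in both comparisons by the sign of their log fold change.

    `a` and `b` map gene name -> log fold change for the differentially
    expressed genes of each comparison.
    """
    shared = a.keys() & b.keys()
    up = {g for g in shared if a[g] > 0 and b[g] > 0}
    down = {g for g in shared if a[g] < 0 and b[g] < 0}
    discordant = shared - up - down
    return up, down, discordant


# Hypothetical differentially expressed genes and log fold changes.
hom_vs_wt = {"GENE_A": 2.1, "GENE_B": -1.4, "GENE_C": 0.9, "GENE_D": -0.7}
hom_vs_het = {"GENE_A": 1.8, "GENE_B": -1.1, "GENE_C": -0.5, "GENE_E": 1.2}

up, down, discordant = concordant_overlap(hom_vs_wt, hom_vs_het)
print(len(up), "upregulated;", len(down), "downregulated;", len(discordant), "discordant")
```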

      We have now performed additional fast gene set enrichment analyses (fgsea; shown below - please access PDF version to view figure: https://osf.io/ecmhr/) using the R package fgsea ([16]) and 14 of the 50 gene sets in the Broad Institute’s Hallmark Gene Set Collection [17], with manual addition of the PLURINET signature [18]. The 14 gene sets were chosen based on their relevance to answering the Reviewer’s question as well as their connection to PI3K signalling. Fold changes for all expressed genes were included in the analyses, without further thresholding, in order to minimise bias.

      The results for homozygous vs wild-type comparisons are concordant with our upstream regulator analyses using IPA; as expected, TGFb signalling and PI3K signalling are among the top positively enriched gene sets (NES > 1) in the comparison between homozygous and heterozygous cells. Unsurprisingly, however, the strength of the enrichments is lower when comparing the two PIK3CAH1047R genotypes.

      We are not convinced that including these surplus data will add value to the manuscript and its main message; however, we will leave this decision to the discretion of the Editor (please also refer to our response to the subsequent question from Reviewer 2). Moreover, these data will remain visible in the publicly available rebuttal document.

      The authors should confirm the results of pathway analysis in vitro to show that homozygous PIK3CA mutation confers increased stemness compared to heterozygous mutation.

      AUTHORS’ RESPONSE: This was a key finding in our previous publication [5]. The aim of the current study was to interrogate this phenomenon further through high-depth transcriptomic/signalling analyses.

      Figure 5: Kindly provide direct evidence demonstrating that increased PIK3CA signaling output induces NODAL expression in this experimental setting.

      AUTHORS’ RESPONSE: We have consistently demonstrated increased NODAL mRNA expression (RNAseq data, Fig. S4 and Ref. [5]). Unfortunately, we have been unsuccessful in attempts to obtain good quality immunoblots for NODAL protein in PIK3CAH1047R/H1047R cells (as noted in response to Reviewer 1). We note, in fact, that such documentation of NODAL protein levels, while not unprecedented, is fairly rare.

      Also, please normalize gene expression data to WT cells so it is easy to visualize the changes in NODAL and NANOG expression in homozygous and heterozygous mutants compared to WT iPSCs.

      AUTHORS’ RESPONSE: It is arithmetically more precise to normalise to the highest expression (i.e. that of PIK3CAH1047R/H1047R cells) – thereby avoiding artificial inflation of fold-changes when normalising to very low levels of expression. Ultimately, the relative levels calculated – and the increased expression of NODAL in PIK3CAH1047R/H1047R cells – are identical visually. Only the entirely arbitrary units change. Thus we do not deem normalisation to WT to be necessary or to add value to the analysis.
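
      The arithmetic behind this point can be shown with a small sketch. The expression values are hypothetical arbitrary units, not measurements from Fig. 5, and the function name is illustrative.

```python
import math


def normalise(values: dict[str, float], reference: str) -> dict[str, float]:
    """Divide every value by the value of the chosen reference genotype."""
    ref = values[reference]
    return {genotype: value / ref for genotype, value in values.items()}


# Hypothetical NODAL expression in arbitrary units.
nodal = {"WT": 0.02, "HET": 0.05, "HOM": 1.0}

to_wt = normalise(nodal, "WT")    # WT becomes 1; HOM becomes a large fold change
to_hom = normalise(nodal, "HOM")  # HOM becomes 1; WT becomes a small fraction

# The ratio between any two genotypes is the same under both schemes.
assert math.isclose(to_wt["HOM"] / to_wt["HET"], to_hom["HOM"] / to_hom["HET"])
print(to_wt)
print(to_hom)
```

      Both schemes rescale the same data by a constant, so the relative pattern across genotypes is unchanged; what differs is that dividing by a very small reference value turns any measurement error in that reference into a large change in the reported fold changes.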

      Kindly quantify Fig. S5.

      AUTHORS’ RESPONSE: These brightfield micrographs were taken as part of routine practice to monitor cell health during maintenance and experimentation, and are suboptimal for direct quantitation due to uneven illumination background and lack of whole-well imaging. Nevertheless, we have now undertaken quantification as the Reviewer suggests, using individual images taken during independent experimental replicates. The results have been added to Fig. S5 and support our assertion that 250 nM BYL719 had a growth inhibitory effect in homozygous PIK3CAH1047R iPSCs. All raw images and associated data have been uploaded to the Open Science Framework (https://osf.io/hbf7x/). The following short method section detailing the image analysis algorithm has also been included in the revised supplementary material:

      “Colony size quantitation from light micrographs

      Routine cell culture light micrographs were acquired on an EVOS FL digital inverted microscope (AMF4300, Thermo Fisher Scientific) using the 4X or 10X objective (final magnification 40X and 100X, respectively). For quantitation, 4X images were used for colony segmentation with Definiens Developer XD software. Background was detected using a contrast threshold; for this each pixel was compared to those in the surrounding 24 pixels (i.e. a 5x5 pixel box), and pixels with low contrast (between -50 and +50) were classified as background. Remaining pixels were classified as colonies, and any holes (pixels that were not initially classified as being part of the colony due to low contrast) were filled. Edges of the resulting colonies were smoothened by shrinking and then growing the colonies by 2 pixels. Finally, colonies less than 2000 pixels in size were reclassified as background. The area of the resulting colonies could then be measured and averaged over each field of view.”
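
      The segmentation steps described above can be sketched in simplified form. This is not the Definiens ruleset used for the analysis: it is a pure-Python illustration that assumes "contrast" means the pixel value minus the mean of its 24 neighbours, and it omits the hole-filling and edge-smoothing steps. The toy image and all function names are hypothetical.

```python
from collections import deque


def local_contrast(img, r, c, half=2):
    """Pixel value minus the mean of its neighbours in a 5x5 box (clipped at image borders)."""
    rows, cols = len(img), len(img[0])
    total, count = 0.0, 0
    for rr in range(max(0, r - half), min(rows, r + half + 1)):
        for cc in range(max(0, c - half), min(cols, c + half + 1)):
            if (rr, cc) != (r, c):
                total += img[rr][cc]
                count += 1
    return img[r][c] - total / count


def segment_colonies(img, contrast_limit=50, min_size=2000):
    """Return the pixel areas of colonies that are at least `min_size` pixels."""
    rows, cols = len(img), len(img[0])
    # Pixels with contrast between -limit and +limit are background; the rest are colony.
    mask = [[abs(local_contrast(img, r, c)) > contrast_limit for c in range(cols)]
            for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search over 4-connected colony pixels.
            queue, size = deque([(r, c)]), 0
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_size:  # smaller objects are reclassified as background
                areas.append(size)
    return areas


def toy_image(size=12, lo=3, hi=9):
    """Flat background (value 100) with one textured 6x6 'colony' of alternating 0/200 pixels."""
    img = [[100] * size for _ in range(size)]
    for r in range(lo, hi):
        for c in range(lo, hi):
            img[r][c] = 200 if (r + c) % 2 == 0 else 0
    return img


areas = segment_colonies(toy_image(), min_size=20)  # small threshold for the toy image
mean_area = sum(areas) / len(areas) if areas else 0.0
print(areas, mean_area)
```

      The toy colony is deliberately textured so that every colony pixel has high local contrast; in real micrographs the low-contrast colony interior would be recovered by the hole-filling step that this sketch leaves out.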

      Reviewer #3

      In this manuscript by Madsen et al., a comparison of the transcriptome and proteome in heterozygous and homozygous PIK3CAH1047R human pluripotent stem cell mutants is presented. The authors demonstrate marked alterations in expression at both the protein and RNA level of homozygous mutants compared to wildtype, while heterozygous lines exhibit only minor changes. Multiple analytical approaches are employed to investigate network alterations, leading the authors to suggest a TGFβ-mediated rewiring of key pluripotent genes to induce a state of sustained stemness. Madsen et al. conclude with a set of experiments to functionally implicate NODAL/TGFβ autocrine signalling in PIK3CAH1047R dose-dependent stemness. The key conclusions are not convincing. While the unbiased omics approach sets up this study well, the study suffers from a lack of convincing functional assays (cell biological assays) to test their model and tease apart a phenotype for the het cells. More robust functional experiments are required to support the finding that NODAL/TGFβ signalling mediates the self-sustained stemness, particularly because this is the major novel finding distinguished from the authors' previous work.

      AUTHORS’ RESPONSE: We thank the Reviewer for their detailed critique. Our perspective on the robustness and novelty of our findings diverges from that of the Reviewer, however, as we elaborate on in more detail below.

      While the authors present a comprehensive omics investigation into alterations between wild type, homozygous, and heterozygous mutants, the critical functional experiments are lacking. In Figure 5, the authors seek to support the role of TGFβ in mediating stemness in the homozygous mutants; however, they are not able to directly deplete TGFβ due to technical limitations of the culture conditions. Consequently, the experiments are primarily built on the use of NODAL withdrawal and stimulation. The data presented thus implicate NODAL in the stemness phenotype, but it's not obvious that TGFβ is substantially involved, particularly considering that the inhibitor subsequently employed also inhibits NODAL type 1 receptors.

      AUTHORS’ RESPONSE: NODAL and TGFb activate shared signalling pathways downstream from their respective receptors, and indeed they (as well as Activin) can be used interchangeably in stem cell culture, which is common practice [19–21]. Commercially available Essential 8/TeSR-E8 is supplemented with TGFb, not NODAL; therefore the factor we have removed is TGFb, prior to any controlled introduction of NODAL (based on strong upregulation of its mRNA in PIK3CAH1047R/H1047R). Any residual TGFb-like ligands will be contributed by Matrigel as outlined in the text (lines 247-251). It is well-established that “NODAL/TGFb signalling” denotes signalling through SMAD2/3/4 (as opposed to BMP signalling through SMAD1/5/8), and this is how we use the term throughout the manuscript. Accordingly, it is functional activation of the “NODAL/TGFb signalling pathway” that we investigate (see also response to Reviewer 1, p.1).

      In summary, we seek not to make a distinct point about TGFb, but rather refer to NODAL/TGFb signalling as a matter of biochemical correctness. For clarity, we now replace mentions of “TGFb signalling” with “NODAL/TGFb signalling” throughout the revised manuscript. We have also revised the legend for Figure 3 to make this clearer.

      Furthermore, there is a paucity of readouts for stemness. For example, a more convincing narrative would include additional expression markers of the core pluripotency network (e.g. OCT4, SOX2, etc.) as well as functional readouts (e.g. NODAL withdrawal and assessment of differentiation) after NODAL stimulation/depletion and comparing across genotypes. Overall, the primary conclusions of this work are not well evidenced by the presented data, and the authors should consider additional functional experiments or reframing the narrative.

      AUTHORS’ RESPONSE: We chose the current strategy because we wanted to capture the earliest changes after depletion of NODAL/TGFb signalling, prior to any signalling rewiring triggered by differentiation. In fact, we believe that a strength of this study is our observation of differences in critical stemness markers in spite of the short time course. To aid non-expert readers we offered a primer on stemness genes and rationale for the markers chosen in the existing introduction (lines 80-90).

      We have further assessed additional stemness and differentiation marker genes in two independent homozygous PIK3CAH1047R cell lines using a high-throughput pluripotent stem cell scorecard (Fig. S4). This replicates the effect on cell marker genes documented by RT-qPCR in Fig.5, while also showing additional reductions in genes that were upregulated in homozygous PIK3CAH1047R cells (MYC, GDF3, FGF4) and which have previously been shown to be highly expressed in pluripotent stem cells (we have now added this additional clarification to the legend of Fig. S4) [22]. Despite the short term treatment, these data also show that no other treatment but SB431542 is capable of triggering expression of early neuroectoderm markers (CDH9, MAP2 and PAPLN) [23], prior to overt morphological changes in the cultures (Fig. S5; higher resolution images are also available via The Open Science Framework: https://osf.io/hbf7x/). Neuroectodermal gene expression is expected upon inhibition of TGFb signalling in human pluripotent stem cells [24,25].

      A key conclusion of this study is that there is a dose-dependent stemness phenotype. As this is not explicitly defined, to this reader, it would imply a graded response between wild type, heterozygotes, and homozygotes in the phenotypic and molecular characteristics. However, as is noted particularly in the omics components of the manuscript, there is in fact a "near-binary" alteration in the assayed characteristics. Again, this should be qualified more explicitly, but it is more consistent with the data, which suggest that the heterozygotes behave very similarly to the wild types, while homozygotes have substantial alterations. I would suggest the authors consider renaming their descriptions, replacing "near-binary" and "dose-dependent" with something like "dose-threshold." This suggests that after X threshold of oncogenic PI3K signalling, substantial alterations occur; under this threshold (e.g. hets), changes are marginal. In the event, however, that there is a more "dose-dependent" effect, I would expect that the transcriptomic and proteomic changes observed in the heterozygous cell lines would also be seen in the homozygous cell lines (where they are likely to be greater in magnitude, in addition to other changes).

      AUTHORS’ RESPONSE: This appears to us to be largely a matter of semantics. In talking of “dose dependency” we were certainly not implying a graded effect (as the Reviewer points out, our findings are far from this, suggesting a sharp threshold of dose which triggers widespread changes), and indeed nothing in these words strictly suggests this interpretation. Nevertheless we are sensitive to the Reviewer’s interpretation of the term, and mindful that this might be shared by other readers. On the other hand, talking of a “near-binary” effect seems to us to be an accurate description of our findings. We have edited the manuscript to minimise ambiguity with the following changes:

      • line 49 “dose” replaced with “strength”: “We demonstrate signalling rewiring as a function of oncogenic PI3K signalling strength, and provide experimental evidence that self-sustained stemness is causally related to enhanced autocrine NODAL/TGFb
      • line 102: “This work provides in-depth characterisation of the near-binary PI3K signalling effects seen in hPSCs ….”
      • lines 195, 198, 317: inserted “allele dose-dependent”

      We would also like to take issue with the case that the Reviewer seems to be making that a more graded change in gene expression across heterozygotes and homozygotes is to be expected. As mentioned in the manuscript (lines 206-210), there is evidence for NODAL/TGFb pathway activation in heterozygous cells. Nevertheless, given the known temporal, context- and dose-dependent effects of this pathway [1,2,26,27] and, importantly, the widely described biological properties of developmental systems (featuring positive feedback loops, bistability and hysteresis; see Ref. [28,29]), we have no reason to expect that transcriptomic and proteomic changes observed in homozygous cell lines will be reproduced in heterozygous cell lines.
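
      The contrast between a graded and a threshold-like ("near-binary") dose response can be illustrated with a simple Hill function. The following is a minimal sketch for illustration only and is not part of the manuscript's analysis: the use of mutant allele count as the "dose", the half-maximal dose `K_HALF` and the Hill coefficients are all assumptions chosen to show the shape of the two response types.

```python
def hill_response(dose: float, k: float, n: float) -> float:
    """Fraction of maximal response at a given dose (Hill equation)."""
    if dose <= 0:
        return 0.0
    return dose**n / (k**n + dose**n)


# Hypothetical mapping: number of mutant alleles used as the signalling "dose".
ALLELE_DOSE = {"WT": 0, "het": 1, "hom": 2}

# Assumed half-maximal dose, placed between the het and hom doses.
K_HALF = 1.5


def responses(n: float) -> dict:
    """Response per genotype for a given Hill coefficient n."""
    return {g: hill_response(d, K_HALF, n) for g, d in ALLELE_DOSE.items()}


# n = 1: graded response, het sits well between WT and hom.
graded = responses(n=1.0)

# n = 10: switch-like response, het stays close to WT while hom is near maximal.
switch_like = responses(n=10.0)

for name, result in (("graded", graded), ("switch-like", switch_like)):
    print(name, {g: round(v, 3) for g, v in result.items()})
```

      Under these assumed parameters, the heterozygous response is intermediate when the Hill coefficient is 1 but remains close to wild type when the coefficient is high, which is the qualitative pattern a threshold effect would produce.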

      The manuscript would benefit from more direct comparisons between the heterozygotes and homozygotes.

      AUTHORS’ RESPONSE: Please refer to the additional data provided in response to a similar question by Reviewer 2.

      Further to the above point, as the marginal phenotype observed in heterozygotes is a critical point in this paper, the authors would benefit from including heterozygote lines in the functional experiments presented in Fig 5. Inclusion of the hets in these experiments would instill confidence in this reader that the marginal molecular alterations characterized at the proteomic and transcriptomic level are reflected in the lack of functional stemness-sustaining behaviour.

      AUTHORS’ RESPONSE: The lack of stemness-sustaining behaviour in the heterozygous clones was demonstrated across multiple different experiments in our previous work, and further functional studies of early differentiation in these cells seemed a poor use of resources and very unlikely to give useful insights. Given the major disease phenotype associated with the same genetic change (PROS), the relative lack of phenotype in heterozygous cells was surprising and holds obvious implications for disease modelling (see also response to Reviewer 2, pp.2-3), and for how model systems are “calibrated” against human developmental disease. The aim of the current work was to:

        • Determine whether increasing the depth of signalling and transcriptomic analyses would unmask small but important changes in heterozygous mutants that might have been missed in prior studies (i.e. we actively aimed to increase the power of the study for identification of subtle changes) and
        • Characterise in greater depth the signalling and transcriptional changes underpinning the robust threshold effect observed for self-sustained stemness driven by PIK3CAH1047R/H1047R.

      We would further observe that PROS does not feature obvious qualitative errors in tissue specification, but rather excessive growth of more or less normally differentiated tissues. We conceptualise this as reflecting a small incremental growth advantage in normally differentiated tissues of certain lineages that summates to create a major disease phenotype over months and years.

      Thus, without the functional and mechanistic experiments alluded to above, the claims/ conclusions are speculative. In particular, the cancer narrative is irrelevant to the study. Considering both the lack of conclusive differentiation experiments or relevant breast cancer experiments, the discussion on differentiation therapy for breast cancer should be removed.

      AUTHORS’ RESPONSE: The reference to cancer links to a computational study of human breast cancers where we specifically looked at the relationship between strength of PI3K signalling and ‘stemness’ [12], both measured using established transcriptional indices. We have included the bioRxiv reference in our revised manuscript (see l.337). While there is an element of speculation in this cancer observation, we do feel it is important and grounded in this and the bioRxiv study, and would prefer to maintain it. However, if editors take a different view it can be removed.

      Reproducibility is a concern for this study. The authors should perform more replicates on their experiments (focusing on technical replicates of the lines employed to discern technical vs biological variability). A challenge in reading this manuscript is understanding which replicates were used for which experiments, and whether they are technical or biological (i.e. different lines). While some of the figure legends note this information, it would be helpful to provide clarity throughout the text. In addition, it should be noted that some experiments (e.g. the RPPA analysis in Fig 2B and Fig S3B) show substantial variability between replicates, but because it appears only a single technical replicate from two different cell lines was used, it is impossible to distinguish whether the variability is of a biological or technical nature. The authors would do well to focus on collecting more technical replicates of fewer biological replicates, and then expand to include more biological replicates if initial biological variation is observed.

      AUTHORS’ RESPONSE: We strenuously disagree with the Reviewer on this point. Throughout this manuscript, we have been transparent and thorough in reporting how experiments were performed, including the number of both biological and technical replicates. Representative examples include:

      Legend to Figure 2A (RPPA dataset in growth-replete conditions): “The data are based on 10 wild-type cultures (3 clones), 5 PIK3CAWT/H1047R cultures (3 clones) and 7 PIK3CAH1047R/H1047R cultures (2 clones) as indicated.”

      Legend to Figure 5: “The data are from two independent experiments, with each treatment applied to triplicate cultures of three wild-type and two homozygous iPSC clones.”

      Specifically to address the RPPA studies, and as is clear from the Figure 2 legend, we initially performed RPPA analyses in growth factor-replete conditions with extensive technical and biological replication, arguing against the Reviewer’s point. To aid interpretation, we opted for summarising this large dataset in Venn diagrams (following extensive limma-based statistical analysis, including correction for multiple comparisons and sample interdependence as advised in Ref. [30]). If the Reviewer deems it valuable, we could include a heatmap overview as shown below:

      [To view figure, please access PDF version of this rebuttal on https://osf.io/ecmhr/]

      We took the view that the above representation, while comprehensive, is not particularly informative to the reader. All individual data points for both total and phosphoproteins – with and without normalisation – are plotted as part of separate barplots in the accompanying RNotebook (https://osf.io/d9tca/). These clearly demonstrate that the technical and biological variability in canonical PI3K signalling responses at the level of AKT and immediately downstream of AKT is very low. The same applies to the increased phosphorylation of SMAD2 (S465/S467) in PIK3CAH1047R iPSCs. We include two examples below, and would be happy to include the link to the above RNotebook in the respective Figure legend if the Reviewer deems this helpful.

      [To view figure, please access PDF version of this rebuttal on https://osf.io/ecmhr/]

      The interpretation of the second RPPA experiment (Fig. 2B) in growth factor-depleted conditions is focused entirely on these responses due to their consistency across both datasets (further supported by low-throughput signalling analyses in the previous PNAS publication).

      We had made all raw data and guided analysis scripts for the above RPPA dataset publicly available, and the same is true for all original data as highlighted in the Materials & Methods section. Thus we strongly believe that readers have the opportunity to assess our work and reproduce our analyses/conclusions fully should they wish to do so.

      • Finally, we noted in the initial PNAS paper describing these models that we derived and worked with up to 10 independent homozygous PIK3CAH1047R clones, as well as with 3 and 4 independent heterozygous and wild-type clones, respectively. This exceeds the common use of 2 clones (if at all mentioned) in many similar studies in the stem cell literature (e.g. Ref. [31–34]). In our view, derivation of more than two independent clones is crucial for reproducibility in gene editing studies given substantial variability arising from genetic drift [35,36]. We have consistently shown the phenotypic robustness of our mutant clones across the two studies; note, for example, the low technical and biological variability in both heterozygous and homozygous mutants in the transcriptomic data in Fig. 1A. As noted in the manuscript, the high-depth RNAseq data analysis was performed in different clones and independently of the RNAseq reported in Ref. [5], yet yields highly similar results and confirms transcriptional rewiring of PIK3CAH1047R/H1047R iPSCs.

      Throughout the text, the authors frequently reference their previous study in PNAS and often the lines of what is novel in this paper vs. reproduction of previous findings is blurred. The authors would benefit from reducing the frequency of referencing their previous study and focusing on emphasizing the novelty of the present findings.

      AUTHORS’ RESPONSE: We have carefully reviewed all instances of citation of our previous study in the manuscript and have reduced their numbers to improve focus on the current findings as suggested. As noted above, however, the current study builds closely upon the findings of the previous work, and referring to these to put the current work in context is important. Indeed, this is reflected in some of the reviewers’ collective comments and questions which are answered by the prior study. We have carefully reviewed the places in which we have cited our previous study and note that except for 2 citations in the Introduction and 3 more in the Discussion, all remaining citations are in the context of linking new and old data, which we believe is important for clarity as suggested by the reviewers. However, if editors take a different view we can minimise this and reduce the number of citations.

      Without functional assays to complement and test their models, this manuscript is not a significant advance.

      AUTHORS’ RESPONSE: While we take the Reviewer’s point that further studies could have strengthened robustness of the evidence supporting a mediating role of NODAL/TGFb signalling in PI3K-driven stemness, we think this assertion is far too sweeping, and neglects numerous facets of the study of use and interest to several fields (as agreed by the other reviewers). To recapitulate some key points of interest/use of this study:

      • Using a carefully derived PIK3CAH1047R iPSC model system and pharmacologically relevant doses of a recently approved PI3Ka-selective inhibitor, we demonstrate that the efficacy of the latter can depend on the strength of PI3K pathway activation and phenotype under investigation – despite expected downregulation of PI3K signalling by Alpelisib, the stemness phenotype is not reversed.
      • We link this to self-sustained TGFb signalling in cells with strong PI3K activation by homozygous PIK3CAH1047R. The link between the two pathways and the underlying rewiring are likely to be relevant in other contexts, as observed recently in a breast epithelial model system [37]. Given similarity between human pluripotent stem cells and cancer cells, our findings are of wider relevance.
      • Aberrant PI3K activation has been associated with numerous pathologies, so it is important for the field to have well-characterised model systems with endogenous expression of one of the most common PIK3CA mutations. Our thorough characterisation of PIK3CAH1047R iPSCs validates one such model.
      • To our knowledge, this is the first study to provide a comprehensive and integrated characterisation of isoform-specific PI3K signalling and transcriptomic changes in human pluripotent stem cells. This is important because current knowledge of PI3K signalling in human PSCs is largely based on extrapolation of findings from mouse embryonic stem cells, with many previous studies relying on high concentrations of the non-specific pan-PI3K inhibitor LY294002 (the use of which has been discouraged by the PI3K signalling community [38]).

        I believe the narrative was written for pluripotent stem cell biologists but without robust functional and quantitative cell biological assays to test their models, I don't anticipate stem cell biologists will be very interested.

      AUTHORS’ RESPONSE: The Reviewer is incorrect in his/her assertion about the target audience. PI3K signalling plays a key role in numerous disease and physiological processes as well as in development, and is of broad interest to cancer biologists, geneticists, rare disease biologists, biochemists, cell signallers, and endocrinologists among many others. Indeed we started with a primary focus on disease modelling (cancer, PROS) rather than stem cell biology, but because our findings are significant for the role of PI3K in stem cell biology as well as for these diseases, we aimed to make findings accessible across many of these readers. We refer the Reviewer to our previous response with regards to the significance of this work.

      **Minor Comments:**

      Consider adding gridlines to the MDS plots for clarity of read

      AUTHORS’ RESPONSE: This is a matter of taste, and as we honestly cannot see how it would enhance appreciation of the very clear clustering, we have decided to leave the plot in its current form.

      In Fig S2, some of the in-figure labelling is incorrect

      AUTHORS’ RESPONSE: We thank the Reviewer for spotting this. We believe the labelling error has now been corrected and we have further tried to streamline the plot headings, but please do let us know if there is something else which we may have missed.

      In Fig S1C, the authors note poor correlation in the heterozygotes between this and a previous study. It would be helpful to qualify this discrepancy, as it is potentially concerning.

      AUTHORS’ RESPONSE: The sensitivity to detect differential gene expression is high for large fold changes (as seen in PIK3CAH1047R/H1047R mutants) in transcriptomic studies, but declines rapidly for fold changes in expression [...]

      lines 126-131: “The magnitudes of gene expression changes in PIK3CAH1047R/H1047R cells correlated strongly with our previous findings (Spearman’s rho = 0.74, p [...] WT/H1047R iPSCs (Fig. S1C), as expected given the smaller number and lower magnitude of observed gene expression changes in heterozygous cells, and the lower depth of previous transcriptomic studies.”

      Line 208, the authors state that the small p-value for the homozygotes is suggestive of a dose-dependent effect. This is not the case; it simply suggests a greater probability of the effect being non-random.

      AUTHORS’ RESPONSE: The Reviewer is formally correct, and we apologise for the imprecision of our language. Nevertheless, biological effect size is pertinent to the p value determined, and so our statement, while requiring an inductive leap from the reader, is not wholly invalid. To tidy this up and improve precision we have reworded as follows:

      lines 215-217: “This is in keeping with the much lower effect size in heterozygous cells, and consistent with a critical role for the TGFbeta pathway in mediating the allele dose-dependent effect of PIK3CAH1047R in human iPSCs.”
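
      The relationship between effect size and p value referred to above can be made concrete with a small worked example. This is a minimal sketch using a one-sample two-sided z-test, which is an assumption made for simplicity and is not the statistical method used in the manuscript; the effect sizes, standard deviation and sample sizes are hypothetical.

```python
import math


def two_sided_p(effect: float, sd: float, n: int) -> float:
    """Two-sided p value for a one-sample z-test of a mean `effect` against zero."""
    z = effect / (sd / math.sqrt(n))
    # For a standard normal, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical values: same spread and sample size, different effect sizes.
p_small_effect = two_sided_p(effect=0.2, sd=1.0, n=9)
p_large_effect = two_sided_p(effect=1.0, sd=1.0, n=9)

# Same small effect, but with a much larger (hypothetical) sample size.
p_small_effect_large_n = two_sided_p(effect=0.2, sd=1.0, n=225)

print(f"small effect, n=9:   p = {p_small_effect:.4f}")
print(f"large effect, n=9:   p = {p_large_effect:.4f}")
print(f"small effect, n=225: p = {p_small_effect_large_n:.4f}")
```

      With sample size and variability held fixed, the larger effect gives the smaller p value, which is the sense in which effect size is pertinent to the p value. The third case shows the Reviewer's formal point: a small effect with a larger sample can give the same p value, so a p value on its own does not identify the effect size.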

      What does the height in Fig 4B correspond to? It would perhaps be of value to scale nodes based on the significance value.

      AUTHORS’ RESPONSE: Fig. 4B illustrates hierarchical clustering of the module eigengenes; the height corresponds to similarity of gene expression. We clarify this in the revised manuscript.

      References

      1 Lee, K. L. et al. (2011) Graded Nodal/Activin signaling titrates conversion of quantitative phospho-Smad2 levels into qualitative embryonic stem cell fate decisions. PLoS Genet. 7.

      2 Hill, C. S. (2018) Spatial and temporal control of NODAL signaling. Curr. Opin. Cell Biol. 51, 50–57.

      3 Xu, R. H. et al. (2008) NANOG is a Direct Target of TGFβ/Activin-Mediated SMAD Signaling in Human ESCs. Cell Stem Cell 3, 196–206.

      4 Vallier, L. et al. (2009) Activin/Nodal signalling maintains pluripotency by controlling Nanog expression. Development 136, 1339–49.

      5 Madsen, R. R. et al. (2019) Oncogenic PIK3CA promotes cellular stemness in an allele dose-dependent manner. Proc. Natl. Acad. Sci. 116, 8380–8389.

      6 Vasan, N. et al. (2019) Double PIK3CA mutations in cis increase oncogenicity and sensitivity to PI3Kα inhibitors. Science 366, 714–723.

      7 Saito, Y. et al. (2020) Landscape and function of multiple mutations within individual oncogenes. Nature 582, 95–99.

      8 Gorelick, A. N. et al. (2020) Phase and context shape the function of composite oncogenic mutations. Nature.

      9 Gillies, T. et al. (2020) Oncogenic mutant RAS signaling activity is rescaled by the ERK/MAPK pathway 1–19.

      10 Hafner, M. et al. (2016) Growth rate inhibition metrics correct for confounders in measuring sensitivity to cancer drugs. Nat. Methods 13, 521–527.

      11 André, F. et al. (2019) Alpelisib for PIK3CA-mutated, hormone receptor-positive advanced breast cancer. N. Engl. J. Med. 380, 1929–1940.

      12 Madsen, R. R. et al. (2020) Relationship between stemness and transcriptionally-inferred PI3K activity in human breast cancer. bioRxiv 2020.07.09.195974.

      13 Fritsch, C. et al. (2014) Characterization of the novel and specific PI3Ka inhibitor NVP-BYL719 and development of the patient stratification strategy for clinical trials. Mol. Cancer Ther. 13, 1117–1129.

      14 Furet, P. et al. (2013) Discovery of NVP-BYL719 a potent and selective phosphatidylinositol-3 kinase alpha inhibitor selected for clinical evaluation. Bioorganic Med. Chem. Lett.

      15 Yilmaz, A. et al. (2018) Defining essential genes for human pluripotent stem cells by CRISPR–Cas9 screening in haploid cells. Nat. Cell Biol. 20, 610–619.

      16 Sergushichev, A. A. (2016) An algorithm for fast preranked gene set enrichment analysis using cumulative statistic calculation. bioRxiv 060012.

      17 Liberzon, A. et al. (2015) The Molecular Signatures Database Hallmark Gene Set Collection. Cell Syst. 1, 417–425.

      18 Müller, F. J. et al. (2008) Regulatory networks define phenotypic classes of human stem cell lines. Nature 455, 401–405.

      19 James, D. et al. (2005) TGFbeta/activin/nodal signaling is necessary for the maintenance of pluripotency in human embryonic stem cells. Development 132, 1273–82.

      20 Vallier, L. et al. (2005) Activin/Nodal and FGF pathways cooperate to maintain pluripotency of human embryonic stem cells. J. Cell Sci. 118, 4495–4509.

      21 Chen, G. et al. (2011) Chemically defined conditions for human iPSC derivation and culture. Nat. Methods 8, 424–429.

      22 Adewumi, O. et al. (2007) Characterization of human embryonic stem cell lines by the International Stem Cell Initiative. Nat. Biotechnol. 25, 803–816.

      23 Tsankov, A. M. et al. (2015) A qPCR ScoreCard quantifies the differentiation potential of human pluripotent stem cells. Nat. Biotechnol. 33, 1–15.

      24 Smith, J. R. et al. (2008) Inhibition of Activin/Nodal signaling promotes specification of human embryonic stem cells into neuroectoderm. Dev. Biol. 313, 107–117.

      25 Vallier, L. et al. (2004) Nodal inhibits differentiation of human embryonic stem cells along the neuroectodermal default pathway. Dev. Biol. 275, 403–421.

      26 Sorre, B. et al. (2014) Encoding of temporal signals by the TGF-β Pathway and implications for embryonic patterning. Dev. Cell 30, 334–342.

      27 David, C. J. and Massagué, J. (2018) Contextual determinants of TGFβ action in development, immunity and cancer. Nat. Rev. Mol. Cell Biol. 19, 1–17.

      28 Alon, U. (2007) Network motifs: theory and experimental approaches. Nat. Rev. Genet. 8, 450–461.

      29 Sonnen, K. F. and Aulehla, A. (2014) Dynamic signal encoding-From cells to organisms. Semin. Cell Dev. Biol. 34, 91–98.

      30 Germain, P. L. and Testa, G. (2017) Taming Human Genetic Variability: Transcriptomic Meta-Analysis Guides the Experimental Design and Interpretation of iPSC-Based Disease Modeling. Stem Cell Reports 8, 1784–1796.

      31 Wang, L. et al. (2017) GCN5 Regulates FGF Signaling and Activates Selective MYC Target Genes during Early Embryoid Body Differentiation. Stem Cell Reports 10, 287–299.

      32 Zeng, H. et al. (2016) An Isogenic Human ESC Platform for Functional Evaluation of Genome-wide-Association-Study-Identified Diabetes Genes and Drug Discovery. Cell Stem Cell 0, 1660–1669.

      33 Ho, L. et al. (2015) ELABELA Is an Endogenous Growth Factor that Sustains hESC Self-Renewal via the PI3K/AKT Pathway. Cell Stem Cell 17, 435–447.

      34 Roudnicky, F. et al. (2019) Modeling the effects of severe metabolic disease by genome editing of HPSC-derived endothelial cells reveals an inflammatory phenotype. Int. J. Mol. Sci. 20, 1–10.

      35 Veres, A. et al. (2014) Low incidence of Off-target mutations in individual CRISPR-Cas9 and TALEN targeted human stem cell clones detected by whole-genome sequencing. Cell Stem Cell 15, 27–30.

      36 Ben-David, U. et al. (2018) Genetic and transcriptional evolution alters cancer cell line drug response. Nature 560, 325–330.

      37 Katsuno, Y. et al. (2019) Chronic TGF-b exposure drives stabilized EMT, tumor stemness, and cancer drug resistance with vulnerability to bitopic mTOR inhibition. Sci. Signal. 12, eaau8544.

      38 Manning, B. D. and Toker, A. (2017) AKT/PKB Signaling: Navigating the Network. Cell 169, 381–405.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript by Madsen et al., a comparison of the transcriptome and proteome in heterozygous and homozygous PIK3CAH1047R human pluripotent stem cell mutants is presented. The authors demonstrate marked alterations in expression at both the protein and RNA level of homozygous mutants compared to wildtype, while heterozygous lines exhibit only minor changes. Multiple analytical approaches are employed to investigate network alterations, leading the authors to suggest a TGFβ-mediated rewiring of key pluripotent genes to induce a state of sustained stemness. Madsen et al. conclude with a set of experiments to functionally implicate NODAL/TGFβ autocrine signalling in PIK3CAH1047R dose-dependent stemness.

      Major Comments:

      1. The key conclusions are not convincing. While the unbiased omics approach sets up this study well, the study suffers from a lack of convincing functional assays (cell biological assays) to test their model and tease apart a phenotype for the het cells. More robust functional experiments are required to support the finding that NODAL/TGFβ signalling mediates the self-sustained stemness, particularly because this is the major novel finding distinguished from the authors' previous work.

      • While the authors present a comprehensive omics investigation into alterations between wild type, homozygous, and heterozygous mutants, the critical functional experiments are lacking. In Figure 5, the authors seek to support the role of TGFβ in mediating stemness in the homozygous mutants; however, they are not able to directly deplete TGFβ due to technical limitations of the culture conditions. Consequently, the experiments are primarily built on the use of NODAL withdrawal and stimulation. The data presented thus implicate NODAL in the stemness phenotype, but it's not obvious TGFβ is substantially involved, particularly considering the inhibitor subsequently employed also inhibits NODAL type 1 receptors. Furthermore, there is a paucity of readouts for stemness. For example, a more convincing narrative would include additional expression markers of the core pluripotency network (e.g. OCT4, SOX2, etc.) as well as functional readouts (e.g. NODAL withdrawal and assessment of differentiation) after NODAL stimulation/depletion and comparing across genotypes. Overall, the primary conclusions of this work are not well evidenced by the presented data and the authors should consider additional functional experiments or reframing the narrative.

      • A key conclusion of this study is that there is a dose-dependent stemness phenotype. As this is not explicitly defined, to this reader, it would imply a graded response between wild type, heterozygotes, and homozygotes in the phenotypic and molecular characteristics. However, as is noted particularly in the omics components of the manuscript, there is in fact "near-binary" alteration in the assayed characteristics. Again, this should be qualified more explicitly, but it is more consistent with the data, which suggests the heterozygotes behave very similarly to the wild types, while homozygotes have substantial alterations. I would suggest the authors consider renaming their descriptions, removing "near-binary" and "dose-dependent" to something like "dose-threshold." This suggests after X threshold of oncogenic PI3K signalling, substantial alterations occur; under this threshold (e.g. hets), changes are marginal. In the event however that there may be a more "dose-dependent" effect, I would expect the transcriptomic and proteomic changes observed in the heterozygous cell lines should be seen in the homozygous cell lines (in which they are likely greater in magnitude, in addition to other changes). The manuscript would benefit from more direct comparisons between the heterozygotes and homozygotes.

      • Further to the above point, as the marginal phenotype observed in heterozygotes is a critical point in this paper, the authors would benefit from including heterozygote lines in the functional experiments presented in Fig 5. Inclusion of the hets in these experiments would instill confidence in this reader that the marginal molecular alterations characterized at the proteomic and transcriptomic level are reflected in the lack of functional stemness-sustaining behaviour.

      2. Thus, without the functional and mechanistic experiments alluded to above, the claims/ conclusions are speculative. In particular, the cancer narrative is irrelevant to the study. Considering both the lack of conclusive differentiation experiments or relevant breast cancer experiments, the discussion on differentiation therapy for breast cancer should be removed.

      3. Reproducibility is a concern for this study. The authors should perform more replicates on their experiments (focusing on technical replicates of the lines employed to discern technical vs biological variability). A challenge in reading this manuscript is understanding which replicates were used for which experiments, and whether they are technical or biological (i.e. different lines). While some of the figure legends note this information, it would be helpful to provide clarity throughout the text. In addition, it should be noted that some experiments (e.g. the RPPA analysis in Fig 2B and Fig S3B) show substantial variability between replicates, but because it appears only a single technical replicate from two different cell lines was used, it is impossible to distinguish whether the variability is of a biological or technical nature. The authors would do well to focus on collecting more technical replicates of fewer biological replicates, and then expand to include more biological replicates if initial biological variation is observed.

      Minor Comments:

      • Consider adding gridlines to the MDS plots for clarity of read
      • In Fig S2, some of the in-figure labelling is incorrect
      • In Fig S1C, the authors note poor correlation in the heterozygotes between this and a previous study. It would be helpful to qualify this discrepancy, as it is potentially concerning.
      • Line 208, the authors state that the small p-value for the homozygotes is suggestive of a dose-dependent effect. This is not the case; it simply suggests a greater probability of the effect being non-random.
      • What does the height in Fig 4B correspond to? It would perhaps be of value to scale nodes based on the significance value.

      Significance

      Nature and significance of the advance:

      • Throughout the text, the authors frequently reference their previous study in PNAS and often the lines of what is novel in this paper vs. reproduction of previous findings is blurred. The authors would benefit from reducing the frequency of referencing their previous study and focusing on emphasizing the novelty of the present findings.

      • Without functional assays to complement and test their models, this manuscript is not a significant advance.

      State what audience might be interested in and influenced by the reported findings.

      • I believe the narrative was written for pluripotent stem cell biologists but without robust functional and quantitative cell biological assays to test their models, I don't anticipate stem cell biologists will be very interested.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      • Stem cell biology, cancer biology, systems biology, mTORC1 signalling

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      As below.

      Significance

      In this manuscript, Madsen et al. have investigated the role of heterozygous versus homozygous PIK3CAH1047R gain-of-function mutation in maintaining stemness of induced pluripotent stem cells (iPSCs). The authors have performed high-depth RNAseq, proteomic, and RPPA analyses to show that biallelic PIK3CA alterations induce stronger activation of the PI3K signaling axis, compared to monoallelic mutations. The authors claim that a higher PI3K signaling dose activates the NODAL/TGF-b pathway, which in turn supports stemness in an autocrine fashion. These are important findings; however, the manuscript and its conclusions can be improved.

      The authors have described the role of PIK3CAH1047R gain-of-function mutation in cancer and overgrowth syndromes. However, cancer-associated somatic mutations in PIK3CA are mostly heterozygous. Similarly, PIK3CA-related overgrowth syndromes (PROS) are caused by post-zygotic mosaic PIK3CA activating mutations. As such, the relevance of homozygous PIK3CA alterations to these pathological conditions is unclear. The authors should elaborate on the biological implications of their findings.

      The role of biallelic PIK3CA mutation is reminiscent of compound mutations in PIK3CA which have also been shown to increase PI3K signaling output. However, double PIK3CA mutations confer enhanced sensitivity to PI3K inhibition (Toska et al. Science 2019). Could the authors kindly speculate on this discrepancy. Similarly, p110 alpha specific inhibitor, alpelisib, is highly effective against PIK3CA-mutant ER+ breast cancer and PROS. As such, the clinical relevance of the insensitivity of homozygous PIK3CA mutation to PI3K inhibitors is unclear.

      Figure 2: The authors have performed RPPA analysis in the presence of 100 nM BYL719. Alpelisib is commonly used at 1 uM concentration for in-vitro experiments, and has a Cmax of ~5 uM. We suggest the authors perform western blot analysis to confirm the results of RPPA.

      Figures 3 and 4: The authors should expand their RNAseq analysis to demonstrate enrichment of stemness and TGFb signaling in homozygous mutant cells compared to heterozygous cells.

      The authors should confirm the results of pathway analysis in-vitro to show that homozygous PIK3CA mutation confers increased stemness compared to heterozygous mutation.

      Figure 5: Kindly provide direct evidence demonstrating that increased PIK3CA signaling output induces NODAL expression in this experimental setting. Also, please normalize gene expression data to WT cells so it is easy to visualize the changes in NODAL and NANOG expression in homozygous and heterozygous mutants compared to WT iPSCs.

      Kindly quantify Fig. S5.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      This is an interesting and thorough study characterising human iPSCs with heterozygous or homozygous mutations in the PI3K pathway that lead to its hyper-activation. They show that the increased stemness results from enhanced autocrine responsiveness to the TGF signalling pathway.

      The main conclusions are well supported by the presented data. Cutting-edge tools and bioinformatic analysis are adequately applied. I have only one important point:

      Major comment:

      1) Western blot-based validation of TGF pathway activation in WT and mutant iPSCs would help to strengthen the results based on bioinformatic data.

      Significance

      Important work for studies on signalling, cancer mutations, modelling cancer in stem cells, pluripotency regulation.

    1. Reviewer #3:

      Summary of the manuscript:

      This manuscript carefully explores different ways of analyzing fMRI data acquired during a subsequent memory paradigm. Subsequent memory paradigms (and variants thereof) are widely used in human memory research. The paradigm involves assessing activity-dependent encoding by first presenting novel stimuli (typically during human brain imaging), before classifying the stimuli post hoc using behavioral performance on a subsequent recognition test. Here, the authors use a subsequent memory paradigm to collect fMRI data from 256 volunteers, including both young (<35 years old) and older populations (>50 years old). The authors then perform cross-validated Bayesian model selection to compare categorical and parametric approaches to data analysis. The authors show that parametric models (particularly those with non-linear transformations) out-perform categorical models in explaining the fMRI signal variance during encoding.

      General assessment:

      The strengths of this manuscript are two-fold. First, the authors illustrate application of a recently published SPM toolbox (Soch et al., 2016; Soch and Allefeld, 2018), used to conduct model assessment, comparison and selection. Second, the manuscript shows that parametric models out-perform categorical models when applied to subsequent memory paradigms. The manuscript is methodologically rigorous and illustrates a pipeline for optimizing GLMs applied to fMRI data. It uses data from a large number of subjects and results are replicated in an independent cohort. The manuscript will provide a useful reference for those researchers designing subsequent memory paradigms or performing analyses on data deriving from this particular paradigm.

      Having said this, by focusing on methodological questions relating specifically to subsequent memory paradigms, the manuscript is relatively narrow in scope. Moreover, despite providing the first formal comparison of categorical and parametric models for data acquired from subsequent memory paradigms, researchers have been applying both types of model to data deriving from this task for more than 10 years.

      Major comments:

      1) The authors do not present behavioral results, yet it seems the variance in confidence on the recognition test underlies the success of the parametric modeling approach. Moreover, it seems important to show whether there are any behavioral differences between young and old adults, given the framing of the Introduction where the authors note that categorical modeling approaches may be limited by ceiling effects in young populations and low accuracy in older populations. Using the behavioral data alone, can the authors illustrate these limitations of the categorical approach?

      2) In the Introduction the authors emphasize the importance of their approach for identifying biomarkers that predict normal aging versus accelerated aging in humans. Given this comparison is not made, it seems more appropriate to move this section of the Introduction to the Discussion?

      3) Clarity of the Results section: The results are somewhat dense and hard to follow at times. One notable factor is the lack of clarity in the figures, where the key point conveyed by each figure is not always immediately apparent. Here are some suggestions to help improve this section of the manuscript:

      a) Figure 3, Figure 4A, Figure 5, Figure 6, Figure 8: it is difficult to distinguish between the red/blue/magenta colours. Can the authors use 3 colours that are more different?

      b) Can the authors explicitly state what they expect to see on selected-model maps? Given the main audience for this manuscript will be from the fMRI community, it is important that these maps are not confused with maps showing task-related modulation of the BOLD signal.

      c) Can the authors describe in more general terms the rationale behind all the different categorical models? By considering so many different models I wonder if the key comparison between categorical and parametric gets lost in the detail.

      d) Figure 3: I'm not sure how helpful this figure is for the main Results section? It doesn't address the key question posed by the authors, so is it not more suitable for the Supplement?

      e) How representative are the plots shown in Figure 4B? Do the authors observe the same gradient if assessing log Bayes factor in an ROI defined from previous subsequent memory paradigms?

      f) Section 4.2. It isn't immediately clear why models that do not include subsequent memory effects are included, if the key comparison is between subsequent memory effects in categorical and parametric models.

      g) Figure 5: The authors distinguish between 'theoretical' and 'empirical' parametric modulators. If both are defined using behavioural performance, then what is the rationale for these terms?

    2. Reviewer #2:

      This paper describes efforts to evaluate and compare different models of a subsequent memory paradigm. In particular, the goal is to improve sensitivity so that the paradigm can be used more effectively in older adults who may have memory problems.

      The paper is well written overall, and the sample size is impressive. I also think that improving sensitivity to detect memory deficits during aging and disease progression is an important goal. Finally, the approach is rigorous, as cvBMS provides a principled means of model comparison and validating the findings in another cohort is very laudable.

      That said, the paper is overly focused on a specific paradigm and it does not provide insights into neural underpinnings of a biological/cognitive function. To be clear, the goal of the paper does not appear to be to provide such insights, and is instead to "...identify several ways to improve the modeling of subsequent memory effects in fMRI".

    3. Reviewer #1:

      General assessment:

      The topic discussed in the current manuscript is interesting and the proposed framework will be a great addition to the traditional methods currently used in the studies of human memory. The manuscript investigated the applicability of parametric compared to categorical models of subsequent memory effects in fMRI. Specifically, the authors applied cross-validated Bayesian model selection (cvBMS) for fMRI models to a subsequent memory paradigm in young and older adults. The cvBMS results showed that parametric models better explained the encoding signals when compared to categorical counterparts, suggesting a new analytical framework that can be applied to participants with low memory performance including memory-impaired individuals whose data would otherwise be challenging to interpret.

      Major comments:

      1) Given that the parametric models are a critical part of this manuscript, the rationale and justifications for the use of these models especially in the context of memory fMRI experiments are currently not sufficiently discussed. For example, in the introduction, there is no reference of past findings that are in line with the assumption that BOLD signals in memory-related brain regions vary quantitatively (rather than qualitatively) as a function of the strength of encoding signals. I believe this to be critical in convincing readers why parametric models can and should be used when thinking about memory fMRI data and paradigms.

      2) While the results section is clearly written, I find the analysis section rather difficult to follow. Would it be possible to walk through each of the model subtypes more carefully and in more detail, or to set up a consistent structure for how each model subtype is explained (across model types; i.e., across 3.1, 3.2, and 3.3)? In addition, I believe readers could also benefit from more explanation of why certain models should be considered and how to think about them conceptually (e.g., what are some empirical findings which suggest that GLMs with linear, arcsine, and sine parametric modulators should be considered here and are good candidates, but not others?).

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary:

      We found that this paper is of interest to an audience of cognitive neuroscientists who perform subsequent memory experiments. It provides important technical advice for the analysis of this data. The paper is also of interest for researchers who want to carry out similar technical evaluations in other experiments.

      Whilst we have some comments that could improve the manuscript, we find the key claims of the manuscript to be well supported by the data, and that researchers who use this paradigm would benefit from following the advice to use parametric models. Furthermore the approaches used to support these claims are both thoughtful and rigorous.

    1. Reviewer #3:

      The focus of this manuscript from Moglie et al. is to investigate calcium entry in post-hearing OHCs via the activation of either voltage-gated calcium channels or the MOC efferent fibers. Based on the literature reported, very little is known about how OHCs handle increases in cellular calcium, although oncomodulin is believed to be the major calcium buffer in these cells. Therefore, this work attempts to address this gap in our knowledge by using a combination of calcium imaging and electrophysiology. From the results presented, the authors conclude that the large calcium signals generated by the opening of calcium channels appear to be modulated by ryanodine receptors. In addition, the opening of nicotinic receptors, caused by ACh released from active efferent fibers produced calcium transients that were contained by cisternal calcium-ATPases. The authors have also provided results that sorcin, a calcium binding protein involved in controlling calcium in myocytes, appears to control basal calcium levels and MOC synaptic activity in OHCs. The topic of the study is very interesting but unfortunately there are several major shortcomings in the design and execution of the work that drastically lower its impact. Moreover, the work appears to be designed and written for a specialized "auditory" audience.

      The main issue of the paper is that the imaging data is used as the primary means of quantifying calcium changes under different experimental conditions, including the measurements of basal calcium level. However, all experiments were performed with a non-ratiometric calcium dye, making most of the conclusions and assumptions extremely difficult to interpret.

      Another problem is that the authors make very specific conclusions regarding the mechanisms involved in calcium handling in OHCs, which are used to explain/understand how OHCs operate in vivo. However, experiments were done using whole-cell patch clamp, which is far from physiological, using unphysiological voltages (-100 mV) and at room temperature. The authors should provide evidence that the mechanisms proposed using the above experimental conditions are physiologically relevant.

      Figures 1 and 2 describe the same aspect, and should be combined. Also, it is not clear why 1 mM ACh was used for the experiments. How do the authors know that this is a physiologically saturating concentration?

      Figure 3G-H highlights another major issue with the method used. The similarity of the calcium change between the different stimulus durations could just be due to dye saturation, which is in fact suggested by the initially flat response in panel G despite the reduction in current. This finding should be corroborated by evidence indicating that the calcium dye is not saturated under their experimental conditions.
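      To illustrate the saturation concern, here is a minimal sketch assuming simple 1:1 binding between dye and Ca2+. The Kd values and Ca2+ concentrations in the usage note are hypothetical, chosen only for illustration, and are not taken from the manuscript.

```python
def bound_fraction(ca_nm: float, kd_nm: float) -> float:
    """Fraction of indicator bound to Ca2+ under a simple 1:1 binding model."""
    return ca_nm / (ca_nm + kd_nm)


def signal_change(ca_low_nm: float, ca_high_nm: float, kd_nm: float) -> float:
    """Change in bound fraction when Ca2+ rises from ca_low_nm to ca_high_nm."""
    return bound_fraction(ca_high_nm, kd_nm) - bound_fraction(ca_low_nm, kd_nm)


# Hypothetical example: the same doubling of Ca2+ (2 uM -> 4 uM) reported by
# a high-affinity dye (assumed Kd = 300 nM) and a lower-affinity dye
# (assumed Kd = 3000 nM).
high_affinity = signal_change(2000.0, 4000.0, kd_nm=300.0)
low_affinity = signal_change(2000.0, 4000.0, kd_nm=3000.0)
```

      Under these assumptions the high-affinity dye reports a much smaller change for the same Ca2+ increase, which is why similar fluorescence responses across stimulus durations do not by themselves show that the underlying Ca2+ changes were similar.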

      Figure 4 describes an even more problematic result. Here calcium changes are reported as DF instead of DF/F0, which is highly inappropriate as it makes comparing different recordings extremely unreliable (F0 can vary significantly between experiments, see Figure 4F). Similarly, DF measurements are done in other experiments (e.g. Figure 5D), in which data for the control condition comes from a different cell. As mentioned above, this problem could be avoided by using a ratiometric dye (e.g., fura-2 or furaptra see Beutner-Moser 2000?).
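      The difference between DF and DF/F0 can be sketched as follows. This is an illustration with made-up traces, not the authors' analysis pipeline; the function names and the choice of averaging the first samples as baseline are assumptions.

```python
def baseline(trace, n_baseline):
    """Mean of the first n_baseline samples, used as F0."""
    return sum(trace[:n_baseline]) / n_baseline


def delta_f(trace, n_baseline):
    """Absolute fluorescence change, F - F0 (depends on dye loading and F0)."""
    f0 = baseline(trace, n_baseline)
    return [f - f0 for f in trace]


def delta_f_over_f0(trace, n_baseline):
    """Fluorescence change normalised to baseline, (F - F0) / F0."""
    f0 = baseline(trace, n_baseline)
    return [(f - f0) / f0 for f in trace]


# Two hypothetical cells with the same relative response but different F0.
cell_a = [100.0, 100.0, 150.0]
cell_b = [200.0, 200.0, 300.0]

peak_df = (max(delta_f(cell_a, 2)), max(delta_f(cell_b, 2)))
peak_dff0 = (max(delta_f_over_f0(cell_a, 2)), max(delta_f_over_f0(cell_b, 2)))
```

      In this example the DF peaks differ twofold between the cells while the DF/F0 peaks are identical, which is the comparability problem raised above. Note that DF/F0 still does not give absolute Ca2+ concentrations; that would require a ratiometric dye or calibration.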

      Figure 5B. It is surprising to see that a similar variance in baseline calcium level to that reported in Figure 4E (again using non-ratiometric measurements), is now just significant and used to support one of the main conclusions of the paper. Considering that the method used does not provide quantifiable baseline calcium levels, how are the authors able to exclude bias in their measurements due to experimental variability? What number of biological replicates is needed to validate their statistics based on the mean +/- sem? Also, the fact that adding sorcin "increases" the resting calcium level does not prove that it has a role in OHC function; it only shows that sorcin affects calcium levels, which is not surprising since it is a calcium binding protein.

      Figure 7D is a bit puzzling to me but I may have missed some underlying reason from published work. Why do ryanodine concentrations that are known to either facilitate or block the receptors cause the same change in calcium level?

      The method section should contain a statistical statement. It should also explain the reason for using non-parametric analysis for the statistical comparisons. Also, most of the methods are only briefly described; although the authors have probably published these methods before, the method section should be more self-explanatory e.g. exactly how was the photobleaching correction performed?

    2. Reviewer #2:

      In this study, the group of Juan Goutman investigated Ca2+ signaling in immature cochlear outer hair cells (OHCs). The work focuses on the basolateral compartment analyzing Ca2+ signals mediated by afferent ribbon-type active zones and by efferent synapses. Ca2+ influx at the ribbon-type active zones is substantial, which is in keeping with the large ribbons found in OHCs. The authors show that it can be potentiated by ryanodine which indicates an interesting interplay between voltage-gated Ca2+ influx and ryanodine receptor mediated Ca2+ release from internal stores. Finally, adding recombinant sorcin, a Ca2+ binding protein prominently expressed in cardiomyocytes to the patch-pipette modulated the basal [Ca2+]i and efferent Ca2+ signalling in OHCs. The authors provide characterization of efferent and afferent Ca2+ signals. However, there are a number of issues which are discussed below:

      Novelty:

      The approach taken, and some of the conclusions, are similar to what the group presented for immature inner hair cells, which also feature afferent and efferent synapses in close proximity and with functional interaction. This is absolutely reasonable to do but presents an extension of the same concept to a related cell type.

      Relevance for understanding OHC function in the mature cochlea:

      The authors have performed experiments on organs of Corti from mice at postnatal days 12-14. This is around the onset of hearing in mice and represents a time window during which substantial changes have been shown to occur. Figures 4 and 5 of Hackney et al., JN2005 show that the cytosolic abundance of the Ca2+ binding proteins parvalbumin α, parvalbumin β, and calretinin changes dramatically around this stage of development. Hence, the presented data should not be taken to conclude on the situation in the mature cochlea.

      Statistical data basis/sample size:

      Analyzing highly variable Ca2+ signals in hair cells poses the challenge of capturing the underlying distribution by sufficient sample size. Several experiments in the present study fall short in acquiring such sample size.

      Role of sorcin:

      I highly recommend that the authors provide their own sorcin immunohistochemistry. Perfusion of the cytosol with recombinant Ca2+ binding proteins is expected to affect Ca2+ signalling (reducing amplitude and spread) in a way similar to the addition of synthetic Ca2+ chelators. With 3 µM of recombinant protein, it seems difficult to achieve a sizable effect (even when considering fully functional multiple EF-hands). In the present study, a non-significant trend towards a reduced amplitude of afferent Ca2+ signals was observed during whole-cell patch clamp with sorcin (molar concentration should be provided). The relevance of sorcin function for OHC function remains to be studied by deleting sorcin expression in OHCs and performing comparative perforated-patch recordings from sorcin-deficient mice or siRNA knock-down.

      Specific comments:

      Mention species in title and/or abstract

      What is meant by "we found that VGCC Ca2+ signals are larger than expected"? Please disambiguate or remove.

      Also consider replacing "VGCC Ca2+ signals" by afferent or presynaptic Ca2+ signals, as the proposed CICR contribution indicates a more complex origin of Ca2+ contributing to these signals.

      Line 56: "we found that Ca2+ signals from VGCC are unexpectedly large," see my comment above

      Line 57 and throughout: consider clarifying that you refer to signal amplitude not spatial extent of the signal (perhaps replace size by dF/F0 or amplitude)

      Line 61: "control Ca2+-based excitation-contraction coupling in cardiomyocytes"?

      Line 62: "among the most differentially expressed genes in OHCs" this statement is not useful without mentioning the cells used for the comparison

      Line 64: "Thus, the present results shed light into Ca2+ homeostasis in the hair cells involved in sound amplification at the cochlea, and unveil a role for the novel protein sorcin."

      I don't think so, please see major concerns.

      Line 70 and following: I think this first section is mainly confirmatory (work by the Mammano lab and others) and hence might better serve as supplementary information. Please add whether the data points in C-E correspond to cells and single trials or represent average responses of each OHC.

      Line 88: So, do you assume that the hotspot corresponds to a single efferent OHC synapse being activated?

      Line 97: Did this averaging include the failures? If not, the example shown in Fig. 2B does not really seem representative. Consider adding a note relating the dF/F0 for ACh and efferent transmission: 2 orders of magnitude difference. Also please reflect on finding failing Ca2+ signals despite successful IPSC.

      Legend to fig. 2 should mention the imaging approach used here. Please add whether the data points in C-E correspond to cells and single trials or represent average responses of each OHC. "during double-pulse"

      Line 102: Consider moving this explanation up to where you introduce the experiment.

      Line 117: A methods section detailing the statistical analysis is missing completely. How was the use of a non-parametric test (Friedman's test) justified: i.e. how was normality tested?

      Line 122: "localized Ca2+ rise with a measurable spread which accounted for 31 ± 5 % of the area corresponding to the imaged OHC area." How was "measurable spread" defined?

      Line 128: The maximal Ca2+ signal with 80 Hz stimulation of efferent synapses is still an order of magnitude lower than that found with ACh. The authors suggest that the Ca2+ rise is limited by SERCA pumps, but do they assume, indeed, that this clearance mechanism is not at work during ACh application?

      Line 141: How sure can the authors be that this cytosolic Ca2+ rise does not result from a store-depletion related Ca2+ entry?

      Line 155: I recommend keeping the order from 20-80 Hz as above and below to make reading easier.

      Line 178: How confident can we be that the recombinant sorcin was Ca2+ free, in other words, could the elevated basal Ca2+ simply reflect preloading of sorcin?

    3. Reviewer #1:

      This manuscript describes convincing measurements of cytoplasmic Ca2+ signals attributable to voltage gated Ca2+ channels and efferent nAChR channels. These channels coexist on the basolateral surface of OHCs and may, with the MET channels, contribute to OHC Ca2+ homeostasis. The main conclusions are that the two channel types are differentially modulated, ryanodine receptor action potentiating VGCC but not efferents, which disagrees with previous claims (Lioudyno et al 2004); and efferent responses were reduced by sorcin, a Ca2+-binding protein recently localized to OHCs, and a known inhibitor of ryanodine receptors. Neither the concentration nor exact mechanism of sorcin's action was determined.

      Specific comments:

      1) In the fluorescent images in the various figures, the image orientation is unclear- is it a radial or a transverse view? It would help if in some figures, a representation of an OHC, either as a Nomarski image or as a drawing, can accompany the fluorescent image.

      2) L126. In describing Ca2+ spread, especially with long stimulation, there is concern that the high affinity dye Fluo4 will saturate. This should be discussed. I would not have used this dye and would have preferred a lower-affinity dye such as Fluo5.

      3) Fig. 3F and L124. Express the spread in absolute units rather than percent of OHC diameter. I assume the conclusion (not stated) is that the Ca2+ rise is not confined to the sub-cisternal space but spreads throughout the cell. Why does it not activate release of the afferent neurotransmitter? A point not mentioned is that the efferent SK2 and BK channels are distributed along the lateral membrane.

      4) Fig. 6F. Ideally the spread of Ca2+ signals at the peak should be presented as (overlapping) Gaussians for the two sources. The significance of the 3.7 um separation (L319) between the sources needs some context.

      5) L169. State explicitly that the ryanodine results disagree with (Lioudyno et al 2004).

      6) L177. Refer to Corey's Shield database (Scheffer et al 2015) who first reported the presence of sorcin mRNA in OHCs.

      7) L178. A concern is over the physiological significance of the sorcin effects. Sorcin is a Ca2+ binding protein that, if present at high concentrations, could supplement oncomodulin in addition to inhibiting RyRs. Can the authors determine the sorcin concentration in OHC cytoplasm? In addition, it seems strange that the reported effect of sorcin is to inhibit RyRs, so limiting the temporal spread of the CICR, but the present results suggest Can the authors clarify these problems?

      8) L199-204. The authors could have resolved whether sorcin affected SK2 channels by (briefly) switching to -40 holding potential where the nAChR and SK2 currents would be of opposite polarity.

      9) L350. Omit 'novel'. Sorcin is not a novel protein, having been described in the 1990s.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      There was a consensus that the scope of the work is of general interest for the hearing field. However, three major critiques were raised:

      1) A high-affinity Ca2+ indicator was used, raising the possibility that the fluorescence signals might be saturated in some experiments (Fig. 3G-F; Fig. 4G-H; Fig. 5D), thus confounding the conclusions that can be made from the observation of unchanged Ca2+ signals. In particular, could saturation explain why ryanodine has no effect on the Ca2+ influx from efferent synapses, an observation that contradicts published observations by Lioudyno et al 2004?

      2) The variability of the data is large and the results are often on the verge of statistical significance, which calls for special care in the statistical methods used to evaluate the effects reported here and ensure that the sample size is large enough to reach a reliable conclusion.

      3) The experiments with sorcin appear preliminary. In particular it is worrisome that sorcin may change the Ca2+ concentration only because it is a Ca2+-binding protein.

    1. Reviewer #3:

      This neat paper continues the story of structural colour evolution in a group that is rarely appreciated for their ornamentation. The study uses colour & ecological data to model their evolution in a comparative framework, and also synthesises transcriptomic data to estimate the presence and diversity of opsins in the group. The main findings are that the tarantulas are ancestrally 'blue' and that green colouration has arisen repeatedly and seems to follow transitions to arboreality, along with evidence of perhaps underappreciated opsin diversity in the group. It's well-written and engaging, and a useful addition to our understanding of this developing story. I just have a few concerns around methods and the interpretation of results, however, which I feel need some further consideration.

      As the authors discuss in detail, this work in many ways parallels that of Hsiung et al. (2015). The two studies seem to agree in the broad-brush conclusions, which is interesting (and promising, for our understanding of the question), though their results conflict in significant ways too. Differences in methodology are an obvious cause, and they are particularly important in studies such as this in which the starting conditions (e.g. the assumed phylogeny or decisions around mapping of traits) so significantly shape outcomes. The current study uses a more recent and robust phylogeny, which is great, and the authors also emphasise their use of quantitative methods to assign colour traits (blue/green), unlike Hsiung et al.

      1) This latter point is my main area of methodological concern, and I am not currently convinced that it is as useful or objective as is suggested. One issue is that the photographs are unstandardised in several dimensions, which will render the extracted values quite unreliable. I know the authors have considered this (as discussed in their supplement), but ultimately I don't believe you can reliably compare colour estimates from such diverse sources. Issues include non-standardised lighting conditions, alternate white-balancing algorithms, artefacts introduced through image compression, differences in the spectral sensitivities of camera models, no compensation for non-linear scaling of sensor outputs (which would again differ with camera models and even lenses), and so on (the works of Martin Stevens, Jolyon Troscianko, Jair Garcia, Adrian Dyer offer good discussion of these and related challenges). Some effort is made to minimise adverse effects, such as excluding the L dimension when calculating some colour distances, but even then the consequences are overstated since the outputs of camera sensors scale non-linearly with intensity, and so non-standardised lighting will still affect chromatic channels (a & b values). So with these factors at play, it becomes very difficult to know whether identified colour differences are a consequence of genuine differences in colouration, or simply differences in white balancing or some other feature of the photographs themselves.

      2) The justification for some related decisions is also unclear to me. The CIE-76 colour distance is used, and is described as 'conservative'. But it is not so much conservative as it is an inaccurate model of human colour sensation. It fails to account for perceptual non-uniformity and actually overestimates colour differences between highly chromatic colours (like saturated blues). The authors note they preferred this to CIE-2000, which is a much better measure in terms of accuracy, because the latter was too permissive (line 300). I understand the problem, and appreciate their honesty, but this decision seems very arbitrary. If the goal is to quantitatively estimate colour differences according to human viewers, then the metric which best estimates our perceptual abilities would strike me as most appropriate. Also, the fact that all species would be classified as 'blue' using the CIE-2000, when some of them are obviously not blue by simply looking at them, is consistent with the kinds of image-processing issues noted above. I only focus on this general point because it is offered as a key advance on previous work (L 40-41), but I don't think that is clearly the case (though I agree that the scoring methods of Hsiung et al. are quite vague). I'm generally in favour of this sort of quantitative approach, but here I wonder if it wouldn't be simpler and more defensible to just ask some humans to classify images of spiders as either 'blue' or 'green', since that seems to be the end-goal anyway.
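      For reference, the CIE-76 distance discussed here is simply the Euclidean distance between two colours in CIELAB space. The sketch below shows it alongside a chromatic-only variant that ignores L (an assumed reading of "excluding the L dimension"; the function names and Lab values are hypothetical and not taken from the manuscript).

```python
import math


def delta_e_76(lab1, lab2):
    """CIE-76 colour difference: Euclidean distance over (L, a, b)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))


def delta_ab(lab1, lab2):
    """Chromatic-only distance: Euclidean distance over (a, b), ignoring L."""
    return math.hypot(lab1[1] - lab2[1], lab1[2] - lab2[2])


# Hypothetical Lab triplets (L, a, b) for two photographed patches.
patch_1 = (50.0, 3.0, 4.0)
patch_2 = (80.0, 0.0, 0.0)

full_distance = delta_e_76(patch_1, patch_2)
chromatic_distance = delta_ab(patch_1, patch_2)
```

      Dropping L removes the lightness term from the distance, but, as noted in point 1, it does not remove lighting effects from the a and b values themselves when the source photographs are unstandardised.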

      3) L26-27, 53-56, 171-176: This is a more minor point than the above, but some of the discussion and logic around hypothesised functions could be elaborated upon, given it's presented as a motivating aim of the text (52-56). The challenge with a group like this, as the authors clearly know, is that essentially none of the ecological and behavioural work necessary to identify function(s) has been done yet, so there are serious limitations on what might be inferred from purely comparative analyses at this stage. The (very interesting!) link between green colouration and arboreality is hypothesised and interpreted as evidence for crypsis, for example, but the link is not so straightforward. Light in a dense forest understory is quite often greenish (e.g. see Endler's work on terrestrial light environments) including at night which, when striking a specular, structurally-coloured green could make for a highly conspicuous colour pattern - especially achromatically (which is what nocturnal visual predators would often be relying on). This is particularly true if the substrate is brown rotten leaves or dirt, in which case they could shine like a beacon. Conversely, if the blue is sufficiently saturated and spectrally offset from the substrate it could be quite achromatically cryptic at dusk or night. To really answer these questions demands information on the viewers, viewing conditions, visual environment etc. The point being that it is a bit too simplistic to observe that, to a human, spiders are green and leaves on the forest floor may be green, and so suggest crypsis as the likely function (abstract L 22-23). So inferences around visual function(s) could either be toned down in places given the evidence at hand or shored up with further detail (though I'm not sure how much is available).

      Minor comments:

      -I'm not familiar enough with methods for creating homolog networks to comment in detail, but the use of BLASTing existing opsin sequences against transcriptomes seems straightforward enough. As do the methods for phylogenetic reconstruction.

      -L48: What constitutes a 'representative' species? And how reasonable is it to assign a value for such a labile trait to an entire genus? I understand we can only do our best of course and simplifications need to be made, but I can imagine many cases among insects (e.g. among butterflies and flies) where genus-level assignments would be meaningless due to the immense diversity of structural colouration among species (including in terms of simple presence/absence).

      -Line 168: Wouldn't this speak against a sexual function? Only in a tentative way of course, but the presence of conspicuous structural colouration in juveniles, which is absent in adults, would suggest a non-sexual origin to me.

    2. Reviewer #2:

      This paper presents a broad-ranging overview of tarantula visual pigments in relationship with the color of the spiders. The paper is interesting, well-written and presented, and will inspire further study into the visual and spectral characteristics of the genus.

      First a minor remark: Terakita and many others distinguish between opsin, the protein part of the visual pigment molecule, and the intact light-sensing, so-called opsin-based pigment, often generalized as a rhodopsin. The statement in line 65, 'convert light photons to electrochemical signals through a signalling cascade', is according to that view strictly not correct. Furthermore, the presence of opsins in transcriptomes may be telling, but it is not at all certain that they are expressed in the eyes, if at all. As the authors well know, in many animal species some of the opsins are expressed elsewhere. It may be informative to mention that.

      The blueness or greenness feature prominently in the paper, but the criteria used for determining to which class a spider belongs are not at all clear. The Colour Survey and Supplementary Table S2 refer to Birdspiders.com, but that requires a donation; not very welcoming. The other sources used also do not readily give insight into, or an overview of, which material was sampled. I therefore think that the paper would considerably gain in palatability by adding a few exemplary photographs as well as measured spectra. Of course, I am inclined to trust the authors, but I would not immediately take color photographs from the web as the best material for assessing color data with 4-digit accuracy. Furthermore, the accessible photographs do not always show nice, uniform colors, so it might be sensible to mention which body part was used to score the animals. And finally, using a CIE metric might imply to many readers that the spiders are presumably trichromatic, like us. Any further evidence?

    3. Reviewer #1:

      This study investigates the evolution of blue and green setae colouration in tarantulas using phylogenetic analyses and trait values calculated from photographs. It argues that (i) green colouration has evolved in association with arboreality, and thus crypsis, and (ii) blue colouration is an ancestral trait lost and gained several times in tarantula evolution, possibly under sexual selection. It also uses transcriptome data to identify opsin homologs, as indirect evidence that tarantulas may have colour vision.

      Otherwise, a few comments:

      1) Given that data is limited for the family (only 25% of genera could be included in this study), it seemed a shame not to discuss further the variation in colour and habit within genera. Based on Figure 1 and supplementary tables, the majority of "blue" genera contain a mix of blue and not-blue (and not-photographed) species. Does this mean that blue has been lost many more times in recent evolutionary history? And how often are "losses" on your tree likely to be the result of insufficient sampling for the genus (i.e. you happen not to have sampled the blue species)?

      2) A key conclusion of the study is that sexual selection should not be discarded as a possible explanation for spider colour. However, there is very little detail given in the discussion to build this case. Do these spiders have mating displays that might plausibly include visual signals? How common are sexually-selected colours in spiders generally? Where on the body is the blue coloration (in cases where it is not whole body)? I also missed whether the images used are of males or females or both, or how many species show sexual dimorphism in colouration (mentioned briefly in the Discussion, but not summarised for species or genera).

      3) A quick scroll through the amazing images on Rick West's site suggests that oranges and red/pinks are not rare in tarantulas. Perhaps the data is just not available, but it would be good to mention somewhere the rationale behind the blue/green focus, rather than examining all colours.

      Minor comments:

      I suggest defining stridulating / urticating setae for non-specialist readers. I had to look these up to understand that they were involved in defence.

      I notice the Rick West website says species IDs should not be made from photos alone. Is there a risk of misidentification for any photos?

      The Results section would benefit from some more clear statements of key results. For example, phrases like "AIC values to assess the relationships between greenness and arboreality are reported in Table 3" could be replaced instead with a summary statement indicating what this table shows.

      In the Figure 1 caption I think there is a typo: 'the proportions of species with images that possess blue colouration (grey = no available images)" but should this say "grey = not blue"?

      142 - the lengthy discussion here of whether there is one or more mechanisms by which blue is produced in tarantulas, and the detailed criticism of Hsiung's SEMs, seems a bit out of place given that the current study does not investigate the proximate mechanism of blue colouration but merely its presence.

      The Table S7 caption states: "A * indicates currently undescribed species with blue or green colour that can be confidently attributed to corresponding genus. However, as the described species exhibit no blue or green colour, we conservatively scored these as 0." Is this a conservative approach though? If they have been confidently assigned to genus, I don't understand why they would not be included.

      Table S6 - It is not clear to me how the values for predicted N orthologs were calculated.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      This study offers some interesting data and ideas on colour evolution in tarantulas, building upon previous work on this topic. However, the reviewers judged that the insights are too taxon-specific and that several key conclusions are too speculative. There were also concerns about the methodology for trait scoring from photographs that the authors might consider going forward.

    1. Reviewer #3:

      In this manuscript, Dr. Jeroen Bakkers and colleagues build upon their previously described cardiac-intrinsic looping of the heart, a process that is independent of the initial leftward jog of the heart that is driven by left-sided Nodal activity.

      A novel allele of tbx5a is recovered in a genetic screen for mutants affecting cardiac looping subsequent to cardiac jogging. These mutants have normal gut looping, and therefore establish LR asymmetry normally. The oudegracht (oug) allele of tbx5a is molecularly more severe than the well-known heartstrings allele, and unlike hst mutants in oug mutant hearts AV canal specification is expanded. Analysis of cardiomyocyte movement within the heart between 28 to 42 hpf demonstrates a process where while ventricular CMs are displaced in a net clockwise direction (relative to the OFT), atrial CMs do so in a counterclockwise fashion, with distinct differences between behaviour of dorsal and ventral cells in each chamber. This movement is also evident when using transgenic lines to demarcate the early left- and right-sided myocardium of the cardiac cone, which form dorsal and ventral portions of the linear heart tube. Here the dorsal myocardium is found at the outer curvature of both chambers following looping, supporting a torsional model. In the oug mutant these differences in displacement between the dorsal and ventral aspects of the chamber are not evident, perhaps explaining looping defects that are observed. Remarkably, the authors show that the looping process can be recapitulated in explanted 24 hpf hearts, with looping not requiring further addition of second heart field-derived cells. Looping defects in oug mutants can be rescued to some extent by further loss of tbx2b, supporting a model where Tbx5 and Tbx2b act to establish chamber and AVC boundaries to promote torsional rotation of the heart and cardiac looping.

      Overall this work is of a very high quality, with conclusions well supported by the evidence presented. The observations of explanted 24 hpf hearts, and demonstration of an "organ-intrinsic" process that drives looping, are of particular interest, and build well upon previously published observations.

      Substantive Concerns:

      1) Given the discrepancies observed between oug and hst mutants with respect to AVC development, have the appropriate in situs (has2, bmp4, tbx2b) been repeated in the hst background? This would be especially critical for tbx2b, given the genetic rescue experiments.

      2) The use of hearts where heartbeat has been suppressed from 28 to 42 hpf may well affect expression of nppa and formation of the outer versus inner curvature. This should be assessed. It may well be that heartbeat and flow is affected in oug mutants as well, and that defects observed are not due only to effects on CM movement/rotation. This should be commented on, at the very least.

      3) The analysis of cell shape (lines 320-332 and Figure 7) is highly confusing as presented. It was previously shown that left-derived CMs do not reach the OC (Figure 4K). Also, given the known requirements for cardiac contractility and shear stress to promote the elongation of OC CMs, these results are even more difficult to interpret. What is meant by "meandering" in this Figure is also not evident.

    2. Reviewer #2:

      Tessadori et al. address the mechanism of cardiac looping, a morphogenetic event that is essential for the generation of the different chambers of the vertebrate heart. While looping is essential for cardiac function, the complex morphogenetic events that govern this important process remain poorly understood. During the development of the two-chambered zebrafish heart, looping has been proposed to involve planar bending/buckling of the flat heart tube or torsional events that would be more similar to those involved in the formation of the helical structure of the mouse heart. In the present work, the authors use a number of elegant approaches to provide a 3-dimensional description of this process. While a recent study suggested that rotational events may be occurring at the level of the cardiac outflow tract (Lombardo et al, 2019), the present work substantially extends these findings and establishes that planar bending/buckling is only of minor importance for cardiac looping which instead depends on opposing rotational movements of the atrial and ventricular compartments that twist the heart tube around the central hinge region of the atrioventricular canal. The authors furthermore provide evidence that these morphogenetic events depend on tissue-intrinsic processes that require the function of the transcription factor tbx5a. Altogether, the present work provides important new insights into the morphogenetic events that contribute to the shaping of the zebrafish heart.

      The presented experimental work is generally of very good quality and convincing evidence is presented for the different findings. While I outline below several issues that should be clarified, the authors should already have a lot of the requested information that just needs to be included. While some additional data are requested, the required experiments should all be straightforward and allow rapid improvements that would further strengthen the work.

      Individual points:

      1) In their characterization of tbx5a/oug mutants, the authors state that cardiac looping is « defective », but a precise description of the actual type of defect is lacking. From the picture in Fig.1C it looks as if looping occurs still in the right direction, but with reduced amplitude. Is this the only type of defect observed, or are there others (e.g. absent or inverted looping)? How does this phenotype compare to the previously characterized tbx5a/hst mutant (see point 2)? The authors mention/show that cardiac looping and visceral laterality are unaffected, but numbers should be included to substantiate these claims.

      2) The authors analyse different markers of cardiac regionalization (Fig.2H) and suggest that the phenotype of tbx5a/oug mutants is different from the one previously described for tbx5a/hst (Garrity et al 2002, Camarata et al, 2010). As only oug mutant data are presented, it is however not clear to what extent the perceived differences may just be due to differences in the use / interpretation of different markers. For example Tessadori et al. talk about « Increased expression for the AV endocardial markers », which appears similar to Camarata et al. talking about « loss of AV boundary restriction » of AV marker genes. As the authors already possess the tbx5a/hst allele (used in Fig.1G) they should simply show side-by-side comparisons of marker expression for the two mutant alleles. While the similarity or difference between oug and hst mutant phenotypes is not of major importance for the main conclusions of the paper, this point should be clarified to facilitate follow-up studies that may use either mutant to further characterize the events reported here.

      3) In Fig. 2K & 4J the authors provide a visual representation of Z cell displacement during cardiac looping. While this is very nice, the study could be strengthened further if these data could be analysed in a more quantitative way (e.g. mean displacement index at the atrial/ventricular inner/outer curvature). This would allow us to see whether the changes observed in oug mutants are significant.

      4) The authors report a novel spaw:GFP transgenic line that they use to label the left cardiac field. While the expression of this transgene in the left lateral plate mesoderm is expected, it is more surprising to see spaw as a marker of the left cardiac disc, as previous studies (e.g. Fig.1D of de Campos-Baptista et al, 2008) have shown spaw to be expressed to the left of the cardiac primordium, rather than within the cmlc2-positive cardiac disc itself. As the authors themselves mention in the discussion when comparing their results to Baker et al 2008 (which used myl7:GFP), it is essential to establish which cells are actually labelled by a transgene. A dorsal view of the 23 somite stage cardiac disc (e.g. spaw:GFP/myl7-RFP or GFP/cmlc2 two colour in situ) should be provided to clarify this issue.

      5) As for spaw:GFP, the authors should provide a dorsal view of the 23 som cardiac disc to document that lft2:Gal4 is indeed specifically expressed in the left heart primordium. They should moreover clarify the orientation of the panels in Fig.S4. E.g. Fig.S4A presents two transversal sections of the 28 hpf heart tube in which left-originating lft2-expressing cells should be located dorsally. However lft2 cells are found in the upper half of the tube in the upper section, but in the lower half in the lower section. Does this mean that the D/V orientation is inverted between the two pictures? Please clarify.

      6) In Fig.4K and Fig.8D spaw:GFP is used to visualize left-originating cells in oug mutants. In both figures, spaw-GFP cells are located in the ventral part of transversally sectioned ventricles. I do not understand how this occurs: In wild-type animals left-originating cells initially give rise to the dorsal part of the ventricle. Through clockwise rotation of the outflow tract, these dorsal cells are then relocated to the outer curvature of the ventricle, as shown in Fig.3B. So if no rotation occurs in tbx5a/oug, why are spaw:GFP cells found in the ventral ventricle, rather than remaining in their initial dorsal position?

      7) Sample numbers should be provided for the experiments in Fig.5C and Fig.6C.

    3. Reviewer #1:

      This is an original paper by Tessadori et al, showing chamber movements during zebrafish heart looping. The combination of cell tracking and genetic tracing of left markers, including with a new 0.2Intr1spaw transgene, suggests differential movements in the ventricle and atrium. Using a new mutant line for tbx5a (oug), the authors show that defective heart looping is associated with defective chamber movements. This can be rescued by inactivation of tbx2b, indicating the importance of tube patterning into chamber/avc regions. Using explant experiments and pharmacological treatments, to interfere with the tube attachment and progenitor cell ingression, the authors conclude on intrinsic mechanisms of zebrafish heart looping, with a minor contribution from planar buckling.

      This study follows previous work of the team, showing that zebrafish heart looping is independent of Nodal signaling and suggesting intrinsic mechanisms from explant experiments. Whereas asymmetric morphogenesis has been mainly analysed in terms of direction and downstream of Nodal signaling, this work addresses the contribution of other factors to the shape of the heart loop, including chamber movements and tbx genes. It has the potential to provide a significant advance into looping mechanisms, provided that data analysis is strengthened.

      Major comments

      1) The chamber movements are interesting new observations. Yet, their analysis is currently insufficient. Although images and cell tracking have been performed in 3D, it is unclear why the quantification is flattened in 2D. In Fig. 2-4, angles are treated as linear values, whereas they should be treated as circular values using dedicated packages. In the context of the low penetrance (Fig. 1G) and variability (Fig. S2, S6) of the phenotype, the number of samples should be increased. In Fig. 2, it seems that the movement in the ventricle is towards the posterior (or venous pole), rather than the left, and so why are the movements qualified as opposite, rather than perpendicular? In addition, vectors in the dorsal/left ventricle are not opposite, so the rationale of a rotation of the ventricle is unclear. To support the claim that authors "map cardiomyocyte behavior during cardiac looping at a single-cell level", the movement of the overall chamber should be subtracted from the cell traces.
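
      To illustrate why angles need circular treatment, the following minimal Python sketch compares an arithmetic mean with a circular mean direction. The angles are made up for illustration and are not the authors' data; a dedicated circular-statistics package would normally be used for inference.

```python
import math

def circular_mean_deg(angles_deg):
    """Return (mean direction in degrees within [0, 360], mean resultant length R).
    R close to 1 means tightly clustered directions; R close to 0 means dispersed."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    mean = math.degrees(math.atan2(s, c)) % 360.0
    r = math.hypot(s, c) / len(angles_deg)
    return mean, r

# Two hypothetical displacement angles on either side of 0 degrees.
angles = [350.0, 10.0]
linear_mean = sum(angles) / len(angles)   # arithmetic mean points the opposite way
circ_mean, r = circular_mean_deg(angles)  # direction lies at the 0/360 boundary

print(linear_mean, circ_mean, r)
```

      Treated linearly, the two angles average to 180 degrees, the opposite of the direction both vectors actually share; the circular mean recovers the shared direction.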

      2) The staining of left transgenic markers is described as dorsal at 28hpf (text and Fig. 3A), and ventral at 48hpf (text and Fig. 3B): please explain whether this implies a 180° rotation or just a general flip of the heart relative to the embryo. What is the pattern of lft2BAC in oug mutants? The legend of Fig. 9 reports "expansion of the space occupied by left-originating cardiomyocytes": what is the percentage of the VV, VD, AV, AD regions labelled at different stages and in different experimental conditions? What is the degree of rotation of the pattern and does it correspond to that measured by cell tracking? Are markers of the inner/outer curvature (e.g. nppa) also rotating?

      3) The rationale for ruling out extrinsic cues of heart looping is currently unclear. It is very difficult to compare the impact of experimental conditions impairing extrinsic cues (Fig. 5-6), without a quantitative analysis of cardiac looping and of the patterns of left-transgenic markers. No observation of the twist is provided after treatment with SU5402 in vivo. What happens with the other 8/20 embryos? A caveat of explant experiments is that the tissue may shrink and the orientation of the sample is lost. What are the parameters of the explanted tubes (pole distance, size), and which references are used to assess patterns? The authors suggest a minor contribution of planar buckling. However, neither biological quantifications (pole distance, length of the tube axis) nor computer modelling are shown to support their views and expectations. The observation that the ventricle moves posteriorly could be compatible with a convergence of the poles, potentially contributing to looping. In Fig. 6A, it seems that pole distance is higher in oug mutants. The claim on planar buckling should be altered.

      4) The importance of the avc is suggested by the rescue experiment with tbx2b inactivation. Yet the size and constriction of the avc are not quantified in the different experimental conditions. How do cell traces/displacement vectors in this region support the proposal that the avc acts as a "fixed hinge"? Computer models would potentially be useful to understand the consequences of avc formation on the overall tube shape and chamber movement.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      While cardiac looping is essential for cardiac function, the complex morphogenetic events that govern this asymmetric process remain poorly understood. Asymmetric morphogenesis has been mainly analysed in terms of direction and downstream of left Nodal signaling. The work of Tessadori et al. now addresses the contribution of other factors to shape the heart loop. This manuscript builds upon a previous study from the same group, showing that cardiac looping is independent of the initial leftward jog of the heart that is driven by left-sided Nodal activity. A recent study from another group (Lombardo et al, 2019) suggested that rotational events occur at the level of the cardiac outflow tract. The present work substantially extends these findings by providing more evidence of intrinsic mechanisms driving looping. The authors use a number of elegant approaches to provide a 3-dimensional description of this process. The presented experimental work is generally of high quality. The combination of cell tracking and genetic tracing of left markers, including with a new 0.2Intr1spaw transgene, suggests differential movements in the ventricle and atrium. A novel allele (oug), encoding a truncated version of the transcription factor tbx5a, is analysed, showing normal gut looping, indicative of normal left-right asymmetry establishment. This allele is molecularly more severe than the well-known heartstrings allele; unlike hst mutants, in oug mutant hearts specification of the atrio-ventricular canal is expanded. Oug mutants display defective heart looping, associated with defective chamber movements. This can be rescued to some extent by further loss of tbx2b, supporting a model where Tbx5a and Tbx2b act to establish chamber and atrio-ventricular canal boundaries to promote torsional rotation of the heart tube and shape the loop. 
      Explant experiments and pharmacological treatments, to interfere with the tube attachment and progenitor cell ingression, do not prevent heart looping. Altogether, the present work provides important new insights into the morphogenetic events that contribute to the shaping of the zebrafish heart. However, there are important issues that should be addressed.

    1. Reviewer #3:

      General assessment:

      In this manuscript, the authors bring up a contemporary and relevant topic in the field, i.e. theta rhythm as a potential biomarker for prediction error in infancy. Currently, the literature is rich on discussions about how, and why, theta oscillations in infancy implement the different cognitive processes to which they have been linked. Investigating the research questions presented in this manuscript could therefore contribute to fill these gaps and improve our understanding of infants' neural oscillations and learning mechanisms. While we appreciate the motivation behind the study and the potential in the authors' research aim, we find that the experimental design, analyses and conclusions based on the results that can be drawn thereafter, lack sufficient novelty and are partly problematic in their description and implementation. Below, we list our major concerns in more detail, and make suggestions for improvements of the current analyses and manuscript.

      Summary of major concerns:

      1) Novelty:

      (a) It is unclear how the study differs from Berger et al., 2006 apart from additional conditions. Please describe this study in more detail and how your study extends beyond it.

      (b) Seemingly innovative aspects (listed below), which could make the study stand out from previous literature, are ultimately not examined. Consequently, it is also not clear why they are included.

      -Relation between Nc component and theta.

      -Consistency of the effect across different core knowledge domains.

      -Consistency of the effect across the social and non-social domains.

      -Link between infants' looking time behavior and theta.

      (c) The reason to expect (or not) a difference at this age, compared to what is known from adult neural processing, is not adequately explained.

      -Potentially because of neural generators in mid/pre-frontal cortex? See Lines 144-146.

      (d) The study is not sufficiently embedded in previous developmental literature on the functionality of theta. That is, consider theta's role in error processing, but also the increase of theta over the course of an experiment and its link to cognitive development. See, for example: Braithwaite et al., 2020; Conejero et al., 2018; Adam et al., 2020.

      2) Methodology:

      (a) Design: It is unclear what exactly a testing session entails.

      -Was the outcome picture always presented for 5secs? The methods section suggests that, but the introduction of the design and Figure 1 do not. This might be misleading. Please change in Figure 1 to 5sec if applicable.

      -Were infants' eye-movements tracked simultaneously to the EEG recording? If so, please present findings on their looking time and (if possible) pupil size. Also examine the relation to theta power. This would enhance the novelty and tie these findings to the larger looking time literature that the authors refer to in their introduction.

      (b) Analysis:

      -In terms of extracting theta power information: The baseline of 100 ms is extremely short for a comparison in the frequency domain, since it does not even contain half a cycle of the frequency of interest, i.e. 4 Hz (half a cycle lasts 125 ms). We appreciate the thought to keep the baseline the same as in the ERP analysis (which currently is hardly focused on in the manuscript), but it appears problematic for the theta analysis. Also, if we understand the spectral analysis correctly, the window the authors are using to estimate their spectral estimates is largely overlapping between baseline and experimental window. The question arises whether a baseline is even needed here, or if a direct contrast between conditions might be better suited.

      -In terms of statistical testing

      -It appears that the authors choose the frequency band that will be entered in the statistical analysis from visual inspection of the differences between conditions. They write: "we found the strongest difference between 4 - 5 Hz (see lower panel of Figure 3). Therefore, and because this is the first study of this kind, we analyzed this frequency range." (ll. 277-279). This approach seems extremely problematic since it poses a high risk for 'double-dipping'. This is crucial and needs to be addressed. For instance, the authors could run non-parametric permutation tests on the time-frequency domain using FDR correction or cluster-based permutation tests on the topography.

      -Lack of examining time- / topographic specificity.
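
      As a rough illustration of the first alternative suggested above (permutation tests across all time-frequency bins with FDR correction, rather than a band chosen by eye), here is a minimal Python sketch. It assumes a paired design with one unexpected-minus-expected power difference per infant and bin; the function names and the synthetic data are hypothetical, and this is not the authors' pipeline. A real analysis would typically rely on an established EEG toolbox.

```python
import random

def sign_flip_pvalues(diffs, n_perm=2000, seed=0):
    """diffs[s][b]: unexpected-minus-expected power for subject s, bin b.
    Returns one two-sided sign-flip permutation p-value per bin."""
    rng = random.Random(seed)
    n_sub, n_bins = len(diffs), len(diffs[0])
    observed = [abs(sum(d[b] for d in diffs)) / n_sub for b in range(n_bins)]
    exceed = [0] * n_bins
    for _ in range(n_perm):
        # Under the null, the sign of each subject's difference is exchangeable.
        signs = [rng.choice((-1, 1)) for _ in range(n_sub)]
        for b in range(n_bins):
            stat = abs(sum(s * d[b] for s, d in zip(signs, diffs))) / n_sub
            if stat >= observed[b]:
                exceed[b] += 1
    return [(e + 1) / (n_perm + 1) for e in exceed]

def benjamini_hochberg(pvals, q=0.05):
    """Return booleans marking which bins survive FDR control at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            cutoff = rank  # largest rank meeting the BH criterion
    keep = [False] * m
    for rank, i in enumerate(order, start=1):
        keep[i] = rank <= cutoff
    return keep

# Synthetic example: 12 subjects, 3 bins; only bin 0 carries a consistent effect.
diffs = [[1.0 + 0.1 * s, (-1.0) ** s, (-1.0) ** (s + 1)] for s in range(12)]
pvals = sign_flip_pvalues(diffs)
print(benjamini_hochberg(pvals))
```

      Because every bin is tested and corrected jointly, the frequency range of interest emerges from the test itself instead of being selected beforehand from the condition difference.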

      3) Interpretation of results:

      (a) The authors interpret the descriptive findings of Figure S1 as illustration of the consistency of the results across the four knowledge domains. While we would partly agree with this interpretation based on column A of that figure (even though also there the peak shifts between domains), columns B and C do not picture a consistent pattern of data. That is, the topography appears very different between domains and so does the temporal course of the 4-5Hz power, with only showing higher power in the action and number domain, not in the other two. Since none of these data were compared statistically, any interpretation remains descriptive. Yet, we would like to invite the authors to critically reconsider their interpretation. You also might want to consider adding domain (action, number etc.) as a covariate to your statistical model.

      References:

      Adam, N., Blaye, A., Gulbinaite, R., Delorme, A., & Farrer, C. (2020). The role of midfrontal theta oscillations across the development of cognitive control in preschoolers and school‐age children. Developmental Science, e12936.

      Braithwaite, E. K., Jones, E. J., Johnson, M., & Holmboe, K. (2020). Dynamic modulation of frontal theta power predicts cognitive ability in infancy. Developmental Cognitive Neuroscience, 100818.

      Conejero, Á., Guerra, S., Abundis‐Gutiérrez, A., & Rueda, M. R. (2018). Frontal theta activation associated with error detection in toddlers: influence of familial socioeconomic status. Developmental science, 21(1), e12494.

      Köster, M., Langeloh, M., & Hoehl, S. (2019). Visually Entrained Theta Oscillations Increase for Unexpected Events in the Infant Brain. Psychological Science, 30(11), 1656-166.

    2. Reviewer #2:

      The manuscript reports increases in theta power and lower NC amplitude in response to unexpected (vs. expected) events in 9-month-olds. The authors state that the observed increase in theta power is significant because it is in line with an existing theory that the theta rhythm is involved in learning in mammals. The topic is timely, the results are novel, the sample size is solid, the methods are sound as far as I can tell, and the use of event types spanning multiple domains (e.g. action, number, solidity) is a strength. The manuscript is short, well-written, and easy to follow.

      1) The current version of the manuscript states that the reported findings demonstrate that the theta rhythm is involved in processing of prediction error and supports the processing of unexpected events in 9-month-old infants. However, what is strictly shown is that watching at least some types of unexpected events enhances theta rhythm in 9-month-old infants, i.e. an increase in the theta rhythm is associated with processing unexpected events in infants, which suggests that an increase in the theta rhythm is a possible neural correlate of prediction error in this age range. While the present novel findings are certainly suggestive, more data and/or analyses would be needed to corroborate/confirm the role of the observed infant theta rhythm in processing prediction error, or document whether and how this increase in the theta rhythm supports the processing of unexpected events in infants. (As an example, since eye-tracking data were collected, are trial-by-trial variations in theta power increases to unexpected outcomes related to how long individual infants looked to the unexpected outcome pictures?) If it is not possible to further confirm/corroborate the role of the theta rhythm with this dataset, then the discussion, abstract, and title should be revised to more closely reflect what the current data shows (as the wording of the conclusion currently does), and clarify how future research may test the hypothesis that the infant theta rhythm directly supports the processing of prediction error in response to unexpected events.

      2) The current version of the manuscript states "The ERP effect was somewhat consistent across conditions, but the effect was mainly driven by the differences between expected and unexpected events in the action and the number domain (Figure S1). The results were more consistent across domains for the condition difference in the 4 - 5 Hz activity, with a peak in the unexpected-expected difference falling in the 4 - 5 Hz range across all electrodes (Figure S2)". However, the similarity/dissimilarity of NC and theta activity responses across domains was not quantified or tested. Looking at Figures S1 and S2, it is not that obvious to me that theta responses were more consistent across domains than NC responses. I understand that there were too few trials to formally test for any effect of domain (action, number, solidity, cohesion) on NC and theta responses, either alone or in interaction with outcome (expected, unexpected). It may still be possible to test for correlations of the topography and time-course of the individual average unexpected-expected difference in NC and theta responses across domains at the group level, or to test for an effect of outcome (expected, unexpected) in individual domains for subgroups of infants who contributed enough trials. Alternatively, claims of consistency across domains may be altered throughout, in which case the inability to test whether the theta and/or NC signatures of unexpected event processing found are consistent across domains (vs. driven by some domains) should be acknowledged as a limitation of the present study.

    3. Reviewer #1:

      Köster and colleagues present a brief report in which they study the electrophysiological responses of 9-month-old babies to expected and unexpected events. The major finding is that, in addition to a known ERP response, an NC present between 400-600 ms, they observe a differential effect in theta oscillations. The latter is a novel result and it is linked to the known properties of theta oscillations in learning. This is a nice study, with novel results, and well presented. My major reservation, however, concerns the push the authors make for the novelty of the results and their interpretation as reflecting brain dynamics and rhythms. The reason is that any ERP, passed through the lens of a wavelet, FFT, etc., will yield a response at a particular frequency. This is especially the case for families of ERP responses related to unexpected events (e.g., MMR and NC), for which there is plenty of literature linking them to responses to surprising events, in particular in babies, and which, given their timing, will be reflected in delta/theta oscillations. The reason why I am pressing on this issue is that there is an old but still ongoing debate attempting to dissociate intrinsic brain dynamics from simple event-related responses. This is by no means trivial and I certainly do not expect the authors to resolve it, yet I would expect the authors to be careful in their interpretation and to warn the reader that the result could just reflect the known ERP, to avoid introducing confusion in the field.

      A second aspect that I would like the authors to comment on is the power of the experimental design to measure surprise. From the methods, I gathered that the same stimulus materials were presented, with the same frequency, as expected and unexpected endings. If that is the case, what is the measure of surprise? For one, the same materials are shown repeatedly, causing habituation and reducing novelty; second, the experiment introduces a long-term expectation of a 50:50 proportion of expected/unexpected events. I might be missing something here, which is likely as the methods are quite sparse in the description of what was actually done.

      Two more comments concerning the analysis choices:

      1) The statistics for the ERP and the TF could be reported using a cluster size correction. These are well-established statistical methods in the field, which would make it possible to identify the time window/topography that maximally distinguishes between the expected and the unexpected condition for both ERP and TF. Along the same lines, the authors could report the spatial correlation of the ERP/TF effects.

      2) While I can see the reason why the authors chose to keep the baseline the same between the ERP and the TF analysis, for time-frequency analysis it would be advisable to use a baseline whose duration is comparable to the period of the frequency of interest, and to use a period that does not encroach on the period of interest, i.e., with a wavelet = 7 and a baseline of -100:0 the authors are well into the period of interest.
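
To make the concern in point 2 concrete, the following is a minimal arithmetic sketch, assuming that "wavelet = 7" refers to a 7-cycle Morlet wavelet and using the common convention sigma_t = n_cycles / (2 * pi * f) for its temporal standard deviation. The function names are hypothetical and this is not the authors' analysis code; other software may define the wavelet width differently.

```python
import math

def morlet_time_sd(freq_hz, n_cycles):
    """Temporal standard deviation (s) of a Morlet wavelet.

    Assumes the convention sigma_t = n_cycles / (2 * pi * f).
    """
    return n_cycles / (2 * math.pi * freq_hz)

def morlet_half_span(freq_hz, n_cycles):
    """Half of the time (s) occupied by n_cycles full cycles at freq_hz."""
    return n_cycles / freq_hz / 2

# Frequencies of interest named in the review (4 - 5 Hz), 7 cycles assumed.
for f in (4.0, 5.0):
    sd_ms = morlet_time_sd(f, 7) * 1000
    half_ms = morlet_half_span(f, 7) * 1000
    print(f"{f:.0f} Hz: sigma_t = {sd_ms:.0f} ms, half-span = {half_ms:.0f} ms")
```

Under these assumptions, sigma_t at 4 Hz is roughly 0.28 s, so a power estimate centered anywhere in a -100:0 ms baseline is less than one sigma_t away from stimulus onset and necessarily draws on post-stimulus data.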

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      Köster and colleagues report on a study in 9 month-old infants and their electrophysiological response to expected and unexpected events. All reviewers acknowledge the rationale of your study and find merit in the overall approach. However, there were major concerns expressed regarding various methodological as well as more conceptual and interpretational angles. In sum, there was consensus amongst reviewers and editors about a critical sum of methodological, conceptual, and novelty concerns.

    1. Reviewer #3:

      General Assessment:

      This study demonstrates that IP3R signaling (triggered by muscarinic receptor activation) affects excitability and quantal content of a subset of dopaminergic neurons to modulate flight duration and food search. I had no technical concerns and am generally supportive. My only major concern was that the narrative was fragmented. I believe this is because the perspective shifted between the IP3Rs and the dopamine neurons themselves, and was too focused. I think that streamlining the narrative and providing a broader perspective for the results will remedy this issue.

      Major Comments:

      -I would like the authors to expand upon their final section of the discussion to discuss more about 1) the potential context for cholinergic modulation of the PPL1-γ2α′1 DANs, 2) the proposed role of these DANs (which have been studied in several contexts) and 3) modulation of innate behavior in general. The paper begins with the importance of modulating innate behavior, but the discussion on this topic is sparse and focused almost entirely on research on the mushroom bodies of Drosophila. The discussion section leans heavily on summarizing the results, rather than making connections to work in other systems or networks.

      -The developmental section seemed somewhat tangential, as the authors cannot distinguish a developmental role for the IP3R from a need to express the ItprDN transgene prior to adulthood to overcome a potentially slow turnover of endogenous IP3R. In essence, it was unclear how these results contributed to the overall narrative of state modulation of behavior. Is this section informative about the development of the mushroom bodies, or is it a rigorous validation of the novel transgene?

    2. Reviewer #2:

      The results of the individual experiments reported by the authors are convincing. The approach is rigorous and they take full advantage of the many powerful molecular genetic tools available in Drosophila. The identification of a mechanism by which a small subset of dopaminergic cells may control behavior is significant. My concerns about the manuscript are relatively minor.

      Minor comments:

      I have reviewed "Modulation of flight and feeding behaviours requires presynaptic IP3Rs in dopaminergic Neurons" by Sharma and Hasan. The authors first translated to Drosophila a dominant negative (DN) strategy originally tested in mammalian cells to block the function of the fly IP3 receptor. Controls using westerns to test the expression in vivo and calcium imaging to assess inhibitory activity in an ex vivo prep were generally convincing. They then show that the DN, RNAi and a wt transgene disrupt flight, as they have shown previously using both genetic mutants and RNAi. They use genetic rescue to further show that alterations in the function of itpr in dopaminergic cells are likely to mediate at least some aspects of the flight deficit. The restricted distribution of the THD' driver was used to narrow down the identity of DA cell clusters responsible for this effect to PPL1 and/or PPL3. Additional split GAL4 lines identified a deficit when the DN was expressed in the PPL1-γ2α′1 subset of DA cells that project to the mushroom bodies. This is a key finding of the paper since it localizes the requirement of the IP3R to cells that have been implicated in other behaviors. Developmental tests using TARGET/GAL80 indicate a requirement for itpr during late development. Disruption of itpr only in the adult did not have a significant effect. This seems likely to be due to perdurance of itpr as suggested by the authors. However, these data make it difficult to determine which aspects of the phenotype are due to broad developmental deficits versus disruption of IP3R in the adult (see below). The authors next test the effects of mAChR with the idea that mAChR is likely to signal through IP3R. While it was known that developmental expression of mAChR is required for adult flight, the current data show more specifically that the PPL1-γ2α′1 DANs are required, enhancing the impact of the paper.

      To tie these results to vesicle recycling and release, the authors use the shibire[ts] transgene in PPL1-γ2α′1. Flight bouts were disrupted via exposure to the non-permissive temperature both during late pupal development and in the adult. The adult phenotype has been demonstrated previously but the developmental defect is novel. The demonstration of an effect in adults is important since it suggests loss of itpr during adulthood might also have an effect in adults even though this can't be tested due to perdurance. Expression of shibire[ts] in PPL1-γ2α′1 also disrupts feeding, and the authors next phenocopy these effects with the itpr DN, indicating that IP3R expression in PPL1-γ2α′1 is required for both feeding and flight. However, here as with the flight experiments, it is not possible to directly demonstrate an effect in adults due to perdurance. They show that knockdown of mAChR also reduces feeding, similar to its effects on flight, and suggest that the deficits are due to disruption of the mAChR -> (Gq) -> IP3R pathway. The suggestion of connections between mAChR and IP3R within PPL1-γ2α′1 and the idea that PPL1-γ2α′1 controls two distinct behaviors are a significant finding and one of the main contributions of the paper.

      To help link the shibire[ts] data set with the results of perturbing mAChR and IP3R, the authors show that carbachol-induced DA release is reduced, making excellent use of the relatively new GRAB-DA lines. As a control, they show that synapse density of PPL1-γ2α′1 in the γ2α′1 MB lobes is not altered. The demonstration that DA release is altered elevates the technical strength of the paper. Moreover, although further experiments might be needed to prove their model, these data support the argument that the mAChR -> (Gq) -> IP3R pathway is disrupted in the adult. The final set of experiments in Fig 6 indicates that excitability of the PPL1-γ2α′1 DANs is also disrupted by knockdown of IP3R. It is possible that this deficit contributes to the decrease in DA release via the mAChR -> (Gq) -> IP3R pathway, and the authors nicely explain a possible mechanism and cite relevant references in the Discussion.

      The results of the individual experiments reported by the authors are convincing. The approach is rigorous and they take full advantage of the many powerful molecular genetic tools available in Drosophila. The generation of the DN transgene is a nice idea and in combination with other tools helped them to identify specific subsets of DA neurons important for the behaviors they test. However, they have previously demonstrated similar effects with mutants and RNAi, and again use them to help map the relevant cells. Since the use of the DN construct did not really go beyond the experiments using RNAi or genetic rescue, the emphasis on the importance of this reagent might be reduced in the abstract and introduction.

      Flight deficits have also been seen in other experiments on the DANs identified by the authors. Thus, the major novel finding of this section is the demonstration that itpr is required in these cells for regulating flight. While it was previously shown that DAN projections to the MB are also required for feeding behavior, the idea that overlapping cells might control both flight and feeding is interesting. Although the idea that these two phenotypes are specifically related to each other seems somewhat speculative, one major strength of the paper lies in tying together prior observations on itpr and the DANs with their current experiments. They do this again at the cellular level using GRAB to show that carbachol-induced release of DA (but not synapse density) is reduced by itpr knock-down, thus tying together data on shibire, mAChR and itpr.

      These connections make for an exciting story, and they have been cleverly woven together by the authors. On the other hand, they also represent a possible concern about the manuscript as a whole, since causal relationships among the effects of blocking IP3R and mAChR and the deficits in neuronal excitability and vesicle release are not yet proven. It is therefore possible that all of these are relatively non-specific effects of disrupting the function of PPL1-γ2α′1 neurons. This modestly reduces the strength of the paper but is also a relatively minor concern. A second potential concern is that despite the interesting connections made by the authors as well as some exciting new data, some of the findings replicate previous data.

      A third concern is the relationship between the effects of disrupting PPL1-γ2α′1 during development versus the adult. As the authors suggest, perdurance (of protein expression) and/or "perdurance" of previously formed tetramers could easily account for the failure of itpr and mAChR knock down in the adult to cause behavioral deficits. By the same token, it is difficult to parse out the contribution of developmental defects in the DA cells versus problems with signaling in the adult and the following issues should be addressed: the observation that synaptic bouton density is not disrupted is a good way to eliminate gross disruption of connectivity during development but does not rule out other more subtle developmental defects in neuronal function. The fact that shibire[ts] can cause effects in the adult is appreciated but does not really help us to understand what IP3R and perhaps mAChR are doing during development.

      These, too, are relatively minor concerns, and the difficulty inherent in overcoming the confounding effects of perdurance is appreciated. Indeed, the authors have already made it clear that they don't know whether developmental vs adult effects of their genetic manipulations are most important. In fact, the authors have tried to address this potential concern at multiple sites, perhaps trying to address previous critiques. While all of these caveats are correct, it may be useful to consolidate some of them.

      Additional Minor Concerns.

      To validate the decrease in the overall response to carbachol in Fig 1D and E, the authors show a statistically significant difference for area under the curve. A parallel metric and statistical test might be used to support the statement that the response is delayed in 1D but not 1E.

      "Interestingly, the mitochondrial response did not exhibit a delay in reaching peak values." Why is that? A brief explanation might be useful.

      The second explanation of how shibire[ts] works might be shortened.
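
As an illustration of the first of these minor concerns, the sketch below computes an area under the curve alongside a parallel latency metric (time to peak) for a response trace. The traces, time points and function names are hypothetical and are not taken from the manuscript; the point is only that time to peak can be extracted per sample and then compared between genotypes with the same kind of statistical test used for the area under the curve.

```python
def trapezoid_auc(times, values):
    """Area under the curve by the trapezoidal rule."""
    total = 0.0
    for i in range(1, len(times)):
        total += 0.5 * (values[i] + values[i - 1]) * (times[i] - times[i - 1])
    return total

def time_to_peak(times, values):
    """Time at which the trace reaches its maximum (first maximum if tied)."""
    peak_index = max(range(len(values)), key=lambda i: values[i])
    return times[peak_index]

# Hypothetical response traces (arbitrary units) sampled every 10 s.
times = [0, 10, 20, 30, 40]
control = [0.0, 0.8, 1.0, 0.6, 0.2]
mutant = [0.0, 0.2, 0.5, 0.7, 0.4]

for name, trace in (("control", control), ("mutant", mutant)):
    print(name, trapezoid_auc(times, trace), time_to_peak(times, trace))
```

In this made-up example the second trace has both a smaller area and a later peak; a delayed response without a reduced area, or the reverse, would show up as a difference in only one of the two metrics.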

    3. Reviewer #1:

      The authors report experiments on Drosophila to show that the proper function of an IP3 receptor in a small subset of dopaminergic neurons is required for flight behavior. Most interesting is the fact that the requirement is restricted to a time point during pupal development. Technically, the authors report a novel dominant-negative mutant form of the IP3 receptor to interfere with its function. Physiologically, the IP3 receptor-dependent impairment in the function of the dopaminergic neurons affects both synaptic vesicle release and excitability. Also, muscarinic acetylcholine receptors are required for proper development of the flight-modulating circuit.

      The role of dopamine in the brain of Drosophila (as a model for general dopamine and brain function) is in the center of current research, and is studied by a large number of laboratories. More and more types of behavior are discovered that are modulated by dopaminergic neurons, and in particular those innervating the mushroom body. Therefore, the study is of very high interest for researchers working on Drosophila, but also to a broader readership.

      The experiments are well designed, with appropriate controls in place. The conclusions drawn are highly interesting and novel (dopaminergic modulation of flight behavior, perhaps in the context of food seeking behavior, molecular mechanisms of circuit maturation).

      Minor comments:

      1) A test for normal distribution of data is required to determine whether parametric statistical tests are actually appropriate.

      2) It is not clear to me why the authors conclude an acute requirement of IP3R during the adult state although the phenotype can arise through a genetic intervention during earlier time points in development (Page 9, lines 297ff). This has to be outlined much more clearly. My interpretation of the data is: During a certain time window after pupal formation IP3 signaling is required for a proper formation of the neuronal circuit. This is likely to be not only a cell-intrinsic (i.e., cell autonomous) effect because the mAChR is also required during this time window. This provides an excellent example (there are actually only very few!) of circuit development that requires synaptic interactions between neurons. If one keeps in mind that dopaminergic neurons have reciprocal synapses with Kenyon cells (e.g. Cervantes-Sandoval et al., eLife 2017; should be included in the schematic illustration!), and these release acetylcholine onto dopaminergic neurons, a potential circuit maturation based on the concerted activity is most interesting. I suggest that the authors point out more precisely how they think the actual phenotype comes about, of course, with all due caution.

      3) Statistical tests should be done across independent brains, not across different cells in the same brains.

      Additional data files and statistical comments:

      A test for normal distribution of data is required to determine whether parametric statistical tests are actually appropriate.

      Figure legend 5 C should be 5B. The scaling of the y-axis is not optimal.

      Statistical tests should be done across independent brains, not across different cells in the same brains. This would cause a mixture of dependent and independent data. This is of importance!
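
The point about independent brains can be sketched as follows: measurements from several cells are first averaged within each brain, and only the per-brain values enter the statistical test, so that the sample size equals the number of brains. The data layout, values and function name below are hypothetical and serve only to illustrate the aggregation step.

```python
from statistics import mean

def per_brain_means(cells_by_brain):
    """Collapse cell-level measurements to one value per brain."""
    return {brain: mean(values) for brain, values in cells_by_brain.items()}

# Hypothetical measurements: several cells recorded in each of three brains.
control_cells = {
    "brain_1": [1.0, 1.2, 0.8],
    "brain_2": [1.1, 0.9],
    "brain_3": [1.3, 1.1, 1.2],
}

control_means = per_brain_means(control_cells)
n_cells = sum(len(values) for values in control_cells.values())
n_brains = len(control_means)

# The test should be run on n_brains independent values, not on n_cells.
print(f"cells: {n_cells}, independent samples (brains): {n_brains}")
print(control_means)
```

A normality check and the subsequent parametric or non-parametric comparison would then be applied to these per-brain values rather than to the pooled cells.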

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript. Ronald L Calabrese (Emory University) served as the Reviewing Editor.

    1. Reviewer #2:

      Despite the availability of a high resolution, expertly annotated digital adult mouse brain atlas (Allen CCFv3), accurately labeled 3D digital atlases across mouse neural development are lacking. The authors have filled that gap by developing novel computational methods that transform slice annotations in the Allen Developing Mouse Brain Atlas into digital 3D reference atlases. They demonstrate that the resulting brain parcellations are superior to a naive agglomeration of the existing 2D labels, and provide MagellanMapper, a suite of tools to aid quantitative measures of brain structure. Cellular level whole-brain quantitative analysis is rapidly becoming a reality in many species and this manuscript provides a foundational resource for mouse developmental studies. The methods are sophisticated, carefully applied and thoroughly evaluated. I have mostly minor comments that should be interpreted as suggestions to strengthen or clarify the presentation, not an indication of any significant concerns.

      1) The authors developed a clever 'edge-aware procedure' that they first employed to extend existing labels to unannotated lateral regions of the brain, taking advantage of intensity gradations in underlying microscope images. As this is an innovative procedure, the authors should manually annotate a small part of the lateral brain region to compare accuracy and compare computationally generated labels to the partial lateral labels in P28 brain.

      2) I have questions about how well the edge-aware procedure performed internally within the brain to smooth region parcellation. First, the edge-aware procedure relies on intensity differences in the light microscope images. However, the work of neuroanatomists would be dramatically simplified if such gradations provided sufficient information for brain segmentation. Annotations present in the ADMBA took advantage of co-aligned ISH data (and computational approaches using co-aligned gene expression data have been used for de novo brain parcellation). Intensity differences in the light-microscope images may not always provide enough information for accurate segmentation. Could there be instances where adjacent regions do not have intensity differences, and the edge-aware procedure actually reduces the accuracy of the manual annotation? Second, it does appear that despite the care to avoid losing thin structures, there is some loss, for example for the light-green structure in the forebrain in Fig. 5E. Could the authors indicate if all labels were preserved, and perhaps provide information on volume changes by label size.

      3) The accuracy of non-rigid registration of light-sheet images to the references is assessed only using a DSC value for whole-brain overlaps. This does not assess the precision of registration within the brain. The authors should apply an additional metric to assess the quality of alignment within the brain (e.g. mark internal landmarks visible in the reference and original light-sheet images, and measure the post-registration distance between them).

      4) The P56 reference is close to an adult brain. The authors should compare the boundaries of their computationally derived parcellations to the recently published Allen CCFv3 brain regions.
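
Points 3 and 4 both come down to overlap and distance measures. The sketch below shows the two quantities mentioned in point 3: the Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|), which can be computed per region as well as for the whole brain, and the mean Euclidean distance between matched landmarks after registration. The voxel sets, landmark coordinates and function names are hypothetical; this is not MagellanMapper code.

```python
import math

def dice_coefficient(voxels_a, voxels_b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # two empty labels are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))

def mean_landmark_distance(points_ref, points_reg):
    """Mean Euclidean distance between matched landmark pairs."""
    if len(points_ref) != len(points_reg):
        raise ValueError("landmark lists must have the same length")
    distances = [math.dist(p, q) for p, q in zip(points_ref, points_reg)]
    return sum(distances) / len(distances)

# Hypothetical region label in the reference and in the registered image.
region_ref = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
region_reg = [(0, 0, 1), (0, 1, 1), (0, 0, 2), (0, 1, 2)]

# Hypothetical internal landmarks (voxel coordinates) marked in both images.
landmarks_ref = [(0, 0, 0), (10, 0, 0)]
landmarks_reg = [(3, 4, 0), (10, 0, 0)]

print(dice_coefficient(region_ref, region_reg))
print(mean_landmark_distance(landmarks_ref, landmarks_reg))
```

A whole-brain DSC can be high even when internal structures are misaligned, which is why a per-region overlap or a landmark distance adds information that the whole-brain value does not.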

    2. Reviewer #1:

      The manuscript demonstrated some interesting aspects of the data processing for the 3D registration of the mouse brain. At the same time, several concerns need to be addressed, by either revising the text or making additional computations.

      1) The 3D "smoothing" was the central part of the method reported in the manuscript. For example, the inclusion of the "skeletonization" step helped prevent the loss of thin structures compared to the previous methods such as the one by Niedworok et al (Ref #40 in the manuscript). However, the overall improvement did not involve any conceptually new algorithm but instead relied on the optimization of known parameters, which may appear incremental. The authors should avoid overstating their work.

      2) The pipeline of the method involved the "mirroring" before the "smoothing" steps. Is it possible to perform the "smoothing" of one hemisphere and then "mirror" the smoothed 3D atlas onto the other hemisphere to check for the alignment? By doing so, the other hemisphere could serve as an internal control for the quality and accuracy of the 3D atlas.

      3) The "edge-aware" adjustment, which was essential for the improvement of the 3D atlas, surely worked for the large brain regions with identifiable anatomical edges based on the 2D images. However, for more delicate subregions (e.g., those in the hypothalamus) without clear anatomical boundaries, this adjustment step may become ineffective. What could then be done for these subregions? Also, it is important to note that the anatomical edges required manual annotation.

      4) The results presented throughout the manuscript are the axial views of brains. It would be informative to include, at least in Figures 2 and 3, the coronal views of 3D atlases to exemplify the quality.

      5) It is unclear why the authors chose the P0 brains for the lightsheet imaging. In addition, since both male and female mice were analyzed, is there any difference observed within the 3D brain atlases obtained?

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript. Joseph G Gleeson (Howard Hughes Medical Institute, The Rockefeller University) served as the Reviewing Editor.

      Summary:

      Despite the availability of a high resolution, expertly annotated digital adult mouse brain atlas (Allen CCFv3), accurately labeled 3D digital atlases across mouse neural development are lacking. The authors have filled that gap by developing novel computational methods that transform slice annotations in the Allen Developing Mouse Brain Atlas into digital 3D reference atlases. They demonstrate that the resulting brain parcellations are superior to a naive agglomeration of the existing 2D labels, and provide MagellanMapper, a suite of tools to aid quantitative measures of brain structure. Cellular level whole-brain quantitative analysis is rapidly becoming a reality in many species and this manuscript provides a foundational resource for mouse developmental studies. The methods are sophisticated, carefully applied and thoroughly evaluated. The manuscript reports a computational approach to transforming available 2D atlases of mouse brains into the 3D volumetric datasets. By optimizing the "smoothing" steps, a better quality of such 3D atlases is produced. In addition, the authors applied their method to the imaging dataset of neonatal mouse brains obtained by lightsheet microscopy, as proof of its potential utilization in research.

    1. Reviewer #3:

      Authors aim to test the presence and functional significance of KNDy co-transmission at the GnRH distal dendrites in the ventrolateral ARN. The authors use expansion microscopy to score synaptic connections between KNDy and GnRH distal dendrites. Next, they use ex-vivo slice imaging to report the Ca2+ transients of GnRH distal dendrons during pipette application of candidate neurotransmitters. The authors go on to investigate the functional role of kisspeptin on the pulsatile firing of KNDy neurons and the subsequent release of LH using a combination of fiber photometry and repeated blood sampling. This manuscript is a continuation of a large body of work from this laboratory. Most of the techniques used here have been previously published by this group and are at the cutting edge of this research field. As a reviewer I have two points for the authors to consider:

      1) In 2016 Qiu, Nestor et al. evaluated the mechanistic properties of synchronous firing of KNDy neurons. Along with this, they demonstrated that the influence of NKB on GnRH neurons was indirect and mediated by kisspeptin from KNDy neurons. Given this, I think it is important for the authors to more specifically compare and contrast their findings with the work from Qiu, Nestor et al. 2016. While the authors do cite the manuscript, the findings are not thoroughly compared.

      2) The authors show that NKB was sufficient to induce [Ca2+] in KNDy neurons, but not in GnRH dendrons. Given this, I found it curious that a delayed, indirect, spike was not observed in (Fig 2 A,B) from KNDy induction. Can the authors clarify this?

    2. Reviewer #2:

      In this manuscript Liu and co-workers use in vitro and in vivo experiments to explore KNDy neuronal input onto GnRH nerve-fibers (called dendrons) in the arcuate nucleus median eminence area. The main strength of this work is the in vivo photometry experiment to activate ARN Kiss1 neurons combined with tail blood sampling for measurements of plasma LH as a substitute for GnRH secretion. It is well known that Kiss1 deletion causes infertility. In addition, it is known that in some Kiss1Cre mouse models homozygous animals are designed to be infertile, including the mouse model used in the current study.

      1) Using the infertile homozygous Kiss1Cre mouse, the authors showed that the lack of kisspeptin eliminates LH pulses following photometry stimulation in vivo of KNDy neurons, indicating that kisspeptin is responsible for LH pulses and is the main output signal from KNDy neurons onto GnRH terminals in the ME area. They also used this animal model to show that the absence of kisspeptin did not affect the synchronous firing of KNDy neurons, illustrating that kisspeptin is not involved in synchronous firing and that synchronous firing alone does not maintain fertility. However, previous studies both in vivo (Wakabayashi et al., 2010) and in vitro (Navarro et al., 2009, Qiu et al., 2016) had already provided substantial evidence for kisspeptin being the main output signal onto GnRH neurons and that NKB and dynorphin are responsible for synchronous firing.

      2) It is interesting that although KNDy neurons release the peptides kisspeptin, NKB and dynorphin as well as the classical neurotransmitter glutamate, only kisspeptin was able to activate GnRH dendrons in the ME area. This is surprising since this group has shown previously (Herde et al 2013) that both GABA and glutamate can depolarize GnRH distal dendrons. Specifically, they showed that puff application of glutamate (500 µM) on distal dendrons in vitro elicited bursts of action potentials. In the current study, the authors used a similar concentration of glutamate applied in vitro and found no effect on dendron calcium activity. Clearly further experiments are needed to sort out these differences. Overall, although this manuscript reports some compelling in vivo studies to ascertain the specific role of kisspeptin in the GnRH distal dendron and confirm the role of NKB and dynorphin in synchronous firing, it is of limited scope and offers limited new information.

    3. Reviewer #1:

      The authors of this high-quality paper use contemporary viral/genetic technologies to show that KNDy neurons in the ARN regulate GnRH release in median eminence (ME) via kisspeptin signaling only, even though they release all their transmitters there. They monitor GCaMP fluorescence in GnRH dendrons to establish that kisspeptin signals there, but NKB, Dyn and GLU do not, whereas these 3 transmitters signal onto Kiss1-neuron cell bodies, while kisspeptin does not. They also show that loss of kisspeptin signaling in ME prevents LH release.

      1) Fig. 6A Authors should compare dF/F trace of Kiss1-Cre -/- with +/- mice, rather than referring to unpublished results.

      2) Line 337, Authors say, "As such, it is interesting to consider whether the episodic release of NKB and dynorphin from KNDy varicosities in the region of the ventrolateral ARN may impact on other ARN neuronal cell types." It is equally interesting to consider the possibility that KNDy neurons release all their neurotransmitters in the ME and NKB, Dyn and Glu may signal to non-GnRH neurons. It would be useful to include references documenting that NKB, Dyn and GLU are released in ME, even if kisspeptin is the only molecule that can signal to GnRH dendrons. If references do not exist, would it be possible to express GCaMP6 non-specifically in ME and express ChR2 in Kiss1-Cre-/- KNDy neurons to show that cells in ME can respond to the other transmitters released by KNDy-neuron activation? Antagonists could then be used to establish which transmitters are released there.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The importance of kisspeptin signaling from arcuate KNDy neurons (expressing kisspeptin, neurokinin B, dynorphin and glutamate) for fertility is well established. KNDy neurons are thought to be critical for the episodic release of LH by acting on GnRH-neuron terminals in the median eminence. A question posed here is whether kisspeptin is the only transmitter signaling onto GnRH terminals (referred to here as dendrons) in the median eminence. Some evidence suggests that the KNDy neuropeptides can be packaged into individual vesicles; thus, it is possible that only those vesicles containing kisspeptin travel to the median eminence. Alternatively, it is possible that all peptides and glutamate are released in the median eminence, but only receptors for kisspeptin are present there. To address this issue, the authors express a calcium indicator in GnRH dendrons and determine which transmitters can generate a calcium signal. They show that only kisspeptin can do so and go on to demonstrate that in the absence of kisspeptin (using KO mice), no signal is generated. This is an important result but does not completely distinguish between the two hypotheses.

    1. Reviewer #4:

      PREreview of "Regulatory roles of 5′ UTR and ORF-internal RNAs detected by 3′ end mapping"

      Authored by Philip P. Adams et al. and posted on bioRxiv DOI: 10.1101/2020.07.18.207399

      Review authors in alphabetical order: Monica Granados, Runhua Han, Katrina Murphy, Nik Tsotakos

      This review is the result of a virtual, live-streamed journal club organized and hosted by PREreview and eLife. The discussion was joined by 20 people in total, including researchers from several regions of the world, four of the preprint authors, and the event organizing team.

      Overview and take-home message:

      Adams et al. have demonstrated with both their genome-wide and targeted analyses of RNA elements in E. coli how two labs can collaboratively come together to make significant advances in their respective fields while also producing a model paper to open the door for more research. Their research not only looked at non-coding sRNA regulation but identified so-called gene "mistakes" that also have functions. In addition, they bridged the gaps in our knowledge about how sRNAs derived from internal open reading frames can act as sponges and how termination spots of 5' untranslated regions affect sRNA regulation on their target mRNAs. Although this work is of interest in microbial gene expression, below are a few concerns that could be addressed in the next version of this paper.

      Positive feedback:

      • All this work took only a year?! Congratulations, this is really great work.
      • The journal club definitely recommends this preprint for others in the field and for peer review. This work could be an important contribution.
      • The conclusions are supported by data. Many of the newly discovered sRNAs and their regulatory mechanisms were experimentally confirmed and investigated. All the data analyses support the hypothesis and conclusion of each section.
      • Really appreciate the manuscript. The introduction had enough background for why the researchers took this approach. Really enjoyed reading this paper - even with a neuroscience background.
      • This preprint provides a lot of information on RNA termination sites by 3' end mapping in the model bacterium (E. coli), which also informs studies relevant to sRNA discovery and RNA regulation mechanisms in other bacteria.
      • One of the most exciting and novel findings is that the 3'-end termination of the 5'-UTR of some known sRNA targets can reinforce the sRNA regulation.
      • A potentially exciting opportunity for studying differential regulation via sRNAs during the exponential growth and plateau phases.
      • The paper employs very high-throughput sequencing technologies on a bacterial model that normally doesn't get so much attention, especially on the non-coding RNAs and post-transcriptional regulation.
      • The current knowledge of sRNA regulation mechanisms is expanded.
      • Primes more questions by trying out new techniques to find new regulatory areas.
      • Just the beginning of the deeper dive model into gene regulation and Rho-dependent termination - opens the door for more research and makes this paper extra referenceable moving forward. For future research or consideration, what can be extrapolated from this research for other organisms?
      • I thought the methods were very thorough, they also have a data availability statement and uploaded the sequencing data.
      • Data is available, and UCSC browser tracks made available!
      • Gold star for having code used for calling the 3' ends open and available on github (always a pro in my eyes!) +2 for GITHUB!
      • Genome-wide data can give other researchers a chance to find new mechanisms relevant to their genes and circuits of interest.
      • So much detail was available! I am not familiar with the standard techniques in the field, but from what I read the detail seemed to be reproducible.
      • I always appreciate when the Results subsections are bolded which helps gather my thoughts.

      Major concerns:

      1) The authors may consider adding another figure panel or some additional text summarising how the 3' ends they mapped are distributed over the genome - e.g. are they enriched in any specific region or well-distributed?

      2) The authors mention that they identified 412 genomic loci putatively associated with a Rho termination event, based on a Rho score of 2.0, indicated in Table S2. However, in Fig. 1C the total number of Rho-dependent termination events mentioned is 433. The discrepancy between these two numbers can be slightly confusing. Could the authors describe the methodological differences that led to the two different numbers?

      3) The authors identify the 280nt mdtJI transcript that is the result of premature termination, and show very nicely how this transcript is susceptible to read through in the presence of spermidine under elevated pH conditions (see Figure 3). In Figure 2F, however, the Northern blot indicates the presence of a longer transcript as well in the presence of the mutant Rho. Do the authors have any indications what this longer transcript (~400bp) is?

      4) With regards to the results presented in Figure 4, the authors consider the possibilities of MicA-directed cleavage of the ompA mRNA or protection from degradation due to base pairing with the sRNA. If the first possibility were true, could the probe used in the Northern blot detect smaller fragments, or was it designed to only detect the full length transcript?

    2. Reviewer #3:

      The paper of Adams et al. attempts to provide a resource of Rho-dependent and independent transcript 3' ends in the model bacterium Escherichia coli, focusing especially on 3' ends identified in 5' UTRs and within coding sequences. Studying several of these termini in detail, the authors present interesting novel types of regulatory loops involving products of pre-mature transcription termination or of mRNA transcript processing. These include, for example, small RNAs derived from 5' UTRs of targets of canonical sRNAs, which sponge the canonical sRNAs and, in turn, affect the target they are derived from. The paper will be of interest to the microbiology and RNA communities, and may inspire in-depth investigation of regulatory loops and novel sRNAs discovered here, as well as the discovery of additional novel regulatory RNAs and new structures of regulatory loops inferred from the resource that the authors provide.

      Major comments:

      Additional analyses of the data are needed, as detailed below.

      1) Comparison between the large-scale data sets of 3' ends provided by the current and previous studies. It is very important that the comparisons between the current data set of 3' ends and previous ones will be done properly, especially the comparison with a data set generated by the same protocol (Term-seq) by the developers of the protocol, Dar and Sorek (2018). There are several issues that should be considered in regard to the comparisons to previous data and evaluation of the statistical significance:

      a) Computation of the statistical significance of overlapping results by the hypergeometric test. It is not clear how the reported p-values were computed, and it is not possible to re-compute them as the value of N was not provided. For this test, the p-value of a result at least as good as the one obtained should be computed ("cumulative p-value"). Looking at the results in the Venn diagrams presented in Supplementary Figure S1, it is hard to see how p-values of <10^-100 were obtained. The authors should check their computation. They should provide the details of the computation for all hypergeometric tests included in the manuscript, to enable their assessment.

      b) Data processing to reveal 3' ends. The computational method used to process the Term-seq data is different from the one presented in the paper of Dar and Sorek. The authors should explain why they turned to a different computational scheme and what its advantages are. It would be more appropriate to compare the current data set and Dar and Sorek's data set when analyzed by the same computational methodology. The authors should apply their new computational method to Dar and Sorek's data, or analyze their results by Dar and Sorek's computational method, and re-assess the overlap in the determined 3' ends.

      c) Rho-dependent termination. It is not clear why the authors followed Dar and Sorek for determining Rho-dependent termination. Dar and Sorek used available data of BCM treated cells from Peters et al. (2012), and therefore could only evaluate the readthrough in the vicinity of determined 3' ends. Since the authors made the effort to treat the cells with BCM and generate sequencing libraries, it is not clear why they did not simply carry out Term-seq following BCM treatment and compare the identified 3' ends to those determined without BCM. Secondly, in evaluating the readthrough the authors, again, modified the computational method of Dar and Sorek. This needs justification and the parameters used need explanation (window size of 500 nt and threshold of the Rho score of at least 2). For the comparison of the results, the Dar and Sorek data set and the current data set should be analyzed by the same method and the results compared. In connection to that, since the BCM experiment was conducted in the current study only once, it would be important to analyze the Peters et al. data by the new computational method and compare the results. The analyses described in comments (1b) and (1c) might improve the overlap between the results of the different studies and reduce the inconsistencies.

      d) If the present large discrepancies between the current data set and previous one stay despite the new analyses, the authors need to carefully examine the similarities and inconsistencies, try to understand the reasons for that, and assess the reliability of their data.

      e) The authors can compare their own data sets in the different growth phases and conditions. It would be interesting to examine if the same or different 3' ends were deciphered in the three experiments. I believe it is expected that many of the termini will be re-discovered but some will be different between the different growth phases and conditions. This analysis will provide an assessment of the consistency of the results and might provide new biological insights.
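      The "cumulative p-value" requested in comment (1a) can be illustrated with a minimal sketch. It assumes N is the total number of candidate positions, K the number of 3' ends in one data set, n the number in the other, and k the observed overlap. The function name and the example numbers are hypothetical and are not taken from the manuscript.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """Return P(X >= k) for a hypergeometric variable X.

    N: population size (hypothetical: all candidate positions)
    K: number of 'successes' in the population (3' ends in data set A)
    n: number of draws (3' ends in data set B)
    k: observed overlap between the two sets
    """
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probability of every overlap at least as large as the one observed.
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible terms drop out without special handling.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Illustrative numbers only; the manuscript does not report N.
    print(hypergeom_upper_tail(N=4000, K=1000, n=1000, k=500))
```

      The result depends strongly on the choice of N, which is why the value used needs to be reported for the test to be re-computed.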

      2) Experimental results

      a) Several 3' termination sites were tested experimentally by molecular experiments. From the reported results it seems that all tested sites were re-confirmed by the molecular experiments. How were the studied sites selected? Were there sites from the large-scale data that were tested by the molecular experiments and were not confirmed as 3' ends? A report of true positives and false positives would provide another important assessment of the reliability of the data.

      b) It would be informative to assess the correspondence between the Rho score and the ratio of beta galactosidase activity between rho mutant and WT cells (Figure 2 and Supplementary Figure S2). It seems that genes with Rho scores below 2, such as sugE, may show high ratios. How should users of the provided resource consider the Rho score values?

    3. Reviewer #2:

      Adams et al. have comprehensively identified the 3' ends of transcripts in E. coli and demonstrate that many transcripts are prematurely terminated in either a Rho-dependent or an intrinsic manner. Strikingly, in addition to small RNAs prevalently discovered in 3'UTRs, the authors reveal that several premature transcripts generated from 5'UTRs or internal CDSs also function as sponges of Hfq-dependent small RNAs, i.e. pairs of ChiZ-ChiX, IspZ-OxyS and FtsO-RybB. It remains unclear which RNA chaperones and RNases are involved in the regulation. This study introduces new members to an emerging class of bona fide regulatory RNAs derived from mRNAs.

      1) Pages 10 - 12; The results of LacZ reporter assay and northern blot seem contradictory at a glance. Expectedly the reporter experiments which are carried out with the cells of OD0.4~0.6 showed a significant increase of LacZ activity in the rhoR66S mutant, which is defective in Rho-dependent termination (Figs. 2DE and S2B). On the other hand, in many cases, the northern blot analysis of total RNA extracted from the cells of OD0.4 revealed the increase of premature terminated 5'UTR fragments in the rhoR66S strain (Figs. 2F and S2C). Moreover, some 5'UTRs exhibited different patterns at OD2.0. This cannot be accounted for simply by the difference in growth phase (the last sentence of Page 10). The authors' suggestion that higher levels of longer transcripts in the absence of Rho are processed to give the shorter products (Page 12, Lines 7-8) is confusing since the increased LacZ reporter should be expressed from the longer transcripts. This point can be clarified by rehybridizing the northern blots with probes for corresponding genes downstream of the premature termination regions.

      2) In the same direction as the comment above, the northern blot analysis for mdtJI shows that the premature termination product of mdtU (~280 nt) is increased in the rhoR66S strain during growth in a normal LB medium (Fig. 2F). In stark contrast, the increase of mdtU transcript seems not significant in the LB pH9.0 without spermidine (Fig. 3E; lanes 1 and 3). However, in the presence of spermidine, the level of mdtJI long transcript was rather decreased in the rhoR66S strain (Fig. 3E; lanes 2 and 4). This result is contradictory to the result of LacZ reporter assay (Figs. 2DE). The influence of spermidine on the mdtU-lacZ reporter expression should also be tested.

      3) Pages 20-21; The effect of RybB on FtsO has not been clarified in the manuscript. When RybB is abundant, the level of FtsO was lower than the other situations (Fig. 7B, lane 6). This is indicative of coupled degradation upon base-pairing between FtsO and RybB. However, when RybB was induced by ethanol, the level of FtsO was rather increased (Fig. 7E), probably attributable to transcriptional activation of ftsI. To clarify the reciprocal regulation between RybB and FtsO and its consequence, this reviewer suggests quantifying the half-life of each sRNA in the presence or absence of its counterpart sRNA.

    4. Reviewer #1:

      In this study, Adams et al. apply various RNA-seq-based approaches to map transcript 3'ends in E. coli in a genome-wide manner and distinguish between 3' ends derived from processing, Rho-dependent, or intrinsic termination. Strikingly, classification of 3'ends revealed that less than one quarter located within a 50 bp window downstream of annotated coding sequences (CDSs), whereas a substantial fraction fell within 5'UTRs and CDSs. The authors show that several transcription termination sites (TTSs) in 5'UTRs locate downstream of known cis-regulatory elements (riboswitches, uORFs) and may arise from premature transcription termination, leading to the hypothesis that other cis-regulatory elements may be discovered by characterizing 3'ends within 5'UTRs. Indeed, further supporting this, the authors present mechanistic data for a uORF (termed mdtU) affecting Rho-dependent transcription termination of the downstream operon in response to the polyamine spermidine.

      Other 3'ends were adjacent to known sRNA target sites within mRNA 5'UTRs or ORFs. Since several of these RNA fragments accumulate to high levels under physiological conditions, the authors go on to demonstrate function for three such representatives (namely two 5'-derived RNAs, termed ChiZ, IspZ, and one ORF-internal candidate, FtsO). Interestingly, all three of them were found to be "sponges" of bona fide intergenic sRNAs, affecting either the activity of the latter (ChiZ on ChiX) or their steady-state levels (IspZ on OxyS; FtsO on RybB).

      Together, this important study expands our definition of bacterial sRNAs, demonstrates functionality of several "nonconventional" sRNAs, blurs the discrimination between regulator and target, and is expected to boost future studies looking into bacterial sRNAs derived from 5'UTRs or ORFs. The study is timely - as several recent studies proposed the existence of noncanonical sRNAs - and highly relevant as it provides data to support functionality of some of these RNAs (e.g. FtsO is the first ORF-internal sRNA with a reported function).

    5. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      In the present study, you have comprehensively identified the 3' ends of transcripts in E. coli and demonstrated that many arise from premature transcription termination in either a Rho-dependent or an intrinsic manner. As a result, you discovered numerous stable RNAs derived from 5'UTRs or CDSs and functionally characterized several of these "unconventional" RNAs as sponges of well-studied Hfq-dependent small RNAs. The reviewers all agreed that this is impressive work, the findings are novel and relevant for researchers within the microbiology and RNA communities and may inspire future studies of non-canonical bacterial sRNAs. Overall, they deem the results convincingly supported by the experimental data, but would like to see a few more experimental and analytical amendments to your work.

    1. Reviewer #2:

      The manuscript addresses a very interesting topic, namely the possibility that DHX30 protein exists in two alternatively transcribed variants that have a role, respectively, in the cytoplasm and in the mitochondria. The first of the two functions is relatively new and barely addressed in the literature. The mitochondrial localization has already been described in previous works where, among other things, it has been shown to be important for mitochondrial function, possibly acting at the transcriptional level. The experimental approach is largely based on the "specific" depletion of either one of the two isoforms, and a downstream analysis (RNAseq, a few biochemical endpoints). The phenotypic results are relatively few and the authors conclude that DHX30 may have a role in "...coordinating ribosome biogenesis, global translation and mitochondrial metabolism...".

      The main criticism that I have of this work is that, although this term is often abused in editors' polite answers, it is rather preliminary. There are a considerable number of shortcuts that, in my mind, when taken all together, cast some doubts on the correct message. I will describe these limits by going systematically through the data.

      In Figure 1, the authors describe the effects of shDHX30 on several endpoints: 1. The authors employ here a single shRNA which is really not sufficient given the very well known problem of off-target effects; 2. With the exception of a few confirmatory experiments the whole analysis is based on a single cell line; 3. In 1B there is a plot indicating the relative translation efficiency of ribosomal protein mRNAs. However the Supplementary Table 1 is not properly annotated and not all ribosomal mRNAs seem equally regulated; 4. The polysomal profiles have very low polysomes and very high 80S, raising some questions on the actual relevance of the regulation of Pol/Sub peak described in Fig. 1g (seen with a single shRNA); 5. The statement of increased ribosome biogenesis is not solid. The authors mention quantitation of 18S rRNA and nucleolar intensity of 18S staining. However, the state of the art must be pulse-chase analysis followed by autoradiography and/or Northern blotting of rRNA precursor, possibly with two shRNAs and perhaps even with a couple of cell lines; 6. The logic by which an increase in rRNA is co-regulated with an increase of translation of ribosomal protein mRNA is obscure and has no explanations: is signalling involved? Is it indirect? 7. The authors claim an effect on translation. The correct interpretation of the polysomal profile is a reduction in initiation of translation (which in itself brings back to the question of 6. what happens to mTOR signalling?). 8. The authors show a very clear increase in AHA. How does this increase in incorporation fit with the data of Fig. 2/3 showing a reduction in mitochondrial fitness? In short this Figure assembles several data without building a strong case. All these points are touched upon but not developed properly in the following tables.

      In Figure 2, the authors show the effects of shDHX30 on mitochondrial proteins. In general, this set of data is relatively convincing. What is not totally convincing is the existence of a cytosolic form of DHX30 (Fig. 2f, for instance). I believe that the existence of a cytosolic form of DHX30 is a potentially very cool finding. But a) the levels of this cytosolic form seem minimal, b) the effects of its specific downregulation with a (single) specific shRNA are absent or a bit contradictory (Fig. 2g, MRPS22 versus MRPL11), and c) none of the assays of Fig. 1 (global DHX30 downregulation) has been reproduced by the interesting experiment, here, of the specific downregulation of either a cytosolic or a mitochondrial form of DHX30.

      Finally, in Figure 3, the authors explore the effects of downregulation of DHX30 on mitochondrial functionality. Overall, the biological effects are very convincing (in short, a reduction in the oxygen consumption rate), although the mitochondrial analysis is really rudimentary (EM? ATP? ). What strikes me is that the authors started with the point of translation of mitochondrial mRNAs and then, here, look at data on mRNA levels of the OxPhos machinery. I fail to see the mechanistic connection.

      The manuscript is written in an approximate way with some confused statements. Example, methods "rRNA biogenesis was performed" (??), fluorescence is low quality with bad resolution, I failed to find Supplementary Table 2 and 3 (perhaps it is my browser, but they seem empty). If the authors would be able to clearly define a) the effects of downregulating DHX30, b) convince about the presence of a cytosolic isoform and c) its role, this paper is really interesting.

    2. Reviewer #1:

      In this manuscript, Bosco et al. propose that DHX30 coordinates cytoplasmic translation and mitochondrial function to impact on cancer cell survival. They deplete DHX30 and report that this causes an enhancement of translation including those of mRNAs encoding for cytoplasmic ribosomal proteins, while paradoxically reducing the translation of mitoribosome protein mRNAs. There are cytoplasmic and mitochondrial isoforms of DHX30 and the authors assess the long-term consequences of knockdown of the cytoplasmic versus mitochondrial + cytoplasmic proteins. Some of the novelty of this paper has been preempted by a previous publication by Antonicka and Shoubridge showing that loss of DHX30 results in impaired mitochondrial ribosome assembly, impaired mitochondria OXPHOS assembly, impaired mitochondrial mRNA precursor processing, and a very severe decrease in mitochondrial translation. I think the work, while interesting, is preliminary and should aim to provide mechanistic insight for the phenotype associated with DHX30 knockdown.

      As far as I can see, none of the targets obtained from the polysome profiling are validated in this study. This is concerning since polysome profiling was previously reported in a Cell Reports 2020 publication by the authors (GSE 95024; available at the GEO database), but the origin of the RNA-seq data in the current paper is not clear (GSE 154065; not available at the GEO database). We do not know if the RNA-seq data was generated from the same samples as the polysome profiling samples previously reported or completely independent of these (this information is lacking). Regardless, validation of any putative translation responsive genes predicted from polysome profiling data would appear to be a reasonable expectation these days.

      The authors claim that depletion of DHX30 leads to increased global translation (Figs 1f, g). They also provide evidence that translation of mRNAs encoding cytoplasmic ribosomal proteins is increased, while the translation of mRNAs encoding mitoribosome ribosomal proteins is decreased (Fig 1b). DHX30 is associated with ribosomal subunits, 80S monosome and low-molecular weight polysomes, and it also interacts with a CG-rich motif for p53-dependent death (CGPD) in 3' UTRs of mRNAs. What is lacking is a mechanism to explain these observations (if the data validates). To this reviewer the lack of mechanistic insight is a serious shortcoming of the current submission. What is responsible for the general translational increase (including cytoplasmic rps encoding mRNAs), yet mitochondrial rp mRNA translation decrease, upon DHX30 knockdown? Many rp mRNAs have TOP motifs at their 5' ends, is this pathway affected?

      The authors previously identified DHX30 as a CGPD-motif interactor. They published this as a specific DHX30 binding motif, yet this motif is not enriched in the new data set established by the authors. I don't understand the statement put forth by the authors on line 286 that " While we cannot exclude that the CGPD motif can be implicated, only a subset of RP transcripts harbors instances of it". Either it is significantly enriched or it is not. In any event, there appears to be an inconsistency with previously published data.

      The ENCODE eCLIP data suggests that DHX30 can bind to 67 cytoplasmic ribosomal and 23 mitochondrial protein transcripts. Yet in their eCLIP validation experiments using RIP, the authors probe for the potential of DHX30 to bind to only MRPL11 and MRPS22 (Fig 2a). They write "These findings suggest that DHX30 directly promotes the stability and/or translation of mitoribosome transcripts." What about the cytoplasmic ribosome protein mRNAs, which according to the ENCODE data can also bind DHX30, yet their response to DHX30 depletion is the opposite of that of the mitoribosome protein mRNAs. I think it may be premature to correlate DHX30 with mitoribosome protein regulation.

      The comparison of the efficiency of knockdown using siRNAs targeting the cytoplasmic form versus the mitochondrial + cytoplasmic forms versus shRNA knockdown efficiency is confusing and, in my humble opinion, doesn't add insight into mechanism of action. "Transient silencing of DHX30" (i.e., using siRNAs) achieves ~50% mRNA reduction in HCT and U2OS cells 48-96s following transfection. On the other hand, silencing of DHX30 mRNA using shRNA achieved better levels of reduction (60-75% decrease) in U2OS and MCF7 cells (Fig S2e). The authors use these differences in knockdown efficiencies to correlate differences in expression response of several mitochondrial encoded genes. The authors need to show the extent to which DHX30 protein levels are reduced in the siRNA treated cells (only changes in mRNA levels are presented). As well, there should be a genetic rescue experiment to show that siRNA or shRNA resistant DHX30 cDNA can overcome this effect. Lane 3 of Fig 2h appears underloaded as assessed by the actin intensity. MRPL11 protein levels appear greater in lane 2 (siDHX30-C) compared to lane 1, why is that?

      Please provide details on the siRNA and shRNAs used. It appears that only one shDHX30 was used to target cytoplasmic DHX30 and one shRNA to target cytoplasmic + mito DHX30. I couldn't find information on this.

      If mutations in DHX30 are known to trigger stress granule formation, does knockdown of DHX30 do the same? Is eIF2 alpha phosphorylated upon DHX30 knockdown?

      There appear to be several DHX30 mRNAs made through alternative splicing (see https://www.ncbi.nlm.nih.gov/gene/22907). In this study, when the authors refer to cytoplasmic DHX30, is the equivalent function being attributed to these different potential isoforms?

      The pictures in Figs 1e, 2d, and S3g are quite difficult to appreciate and should be provided at higher magnification.

      Fig 2f. Why is there so much tubulin in the mitochondrial protein extract lane?

      Suppression of DHX30 mRNA leads to lowered proliferation rates in HCT116 cells. This however was not due to significant alterations in the cell cycle (Fig 4e). Apoptotic rates do not appear to be affected (compare HCT_shNT to HCT_shDHX30 in the DMSO samples of Fig 4g). Can the authors please explain what is leading to the lowered proliferation rates if cell cycle progression and cell death are unaffected? Confusingly, "transient" silencing of DHX30 mRNA (protein levels were not assessed) in U2OS cells did not impact proliferation while in MCF7 cells it did. Although the authors attribute this difference in response to better depletion of DHX30 mRNA in MCF7 cells, they do not actually measure DHX30 protein levels and the use of different cell lines complicates the interpretation.

      Line 267 "none of the DHX30 closer homologs showed strong evidence of such localized translation". What homologs are being referred to here?

      Line 269. "Although our experiments did not enable us to confirm this in HCT116, a previous report also showed evidence for DHX30 interaction with mitochondrial transcripts in human fibroblasts by RIP-seq (Antonicka and Shoubridge, 2015). Our data instead point to a direct interaction with mitoribosome transcripts and their positive modulation as another means by which DHX30 can indirectly affect mitochondrial translation." DHX30 thus interacts with many different mRNAs and in my view it becomes difficult to ascribe a particular biological response to DHX30 to a particular set of transcripts based on interaction data.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The major weaknesses of the paper are: 1. The work is preliminary as there is very little mechanistic insight to explain the major findings. 2. Some of the conclusions are not substantiated by the data. 3. Targets from the ribosome profiling were not validated.

    1. Reviewer #2:

      Dzyubenko et al. have addressed the role of ECM in the control of inhibition and excitation in primary neuronal cultures. Their impact statement reads: "this study revealed the essential role of brain extracellular matrix in controlling synaptic inhibition and neuronal network activity", which makes it erroneously appear that no other past studies have addressed exactly this topic. There is a vast literature, notably from the Hensch group, on the link between ECM (particularly around PV-INs) and the development of inhibition, critical periods and regulation by orthodenticle homeobox 2 (Otx2). None of this literature is cited in the text. Moreover, there are numerous references indicating clear functional changes following depletion of ECM in vivo (e.g., PMID: 32457072, just to mention one of the most recent studies). In addition to failing to cite previous evidence obtained in vivo for the role of ECM in the regulation of E/I balance and development, with the exception of an anatomical study in the cortex, the authors limit themselves to studying the effects of ECM depletion in immature neuronal cultures. The following list of major concerns with the study is far from complete:

      1) It is unclear how the ratio of excitatory to inhibitory cells of 2:1 was established in the primary cultures. This seems purely coincidental based on Fig.S2, but it surely does not reflect the 4:1 or 5:1 ratio found in vivo. With such an abundance of I-cells vs E-cells in the culture, one can immediately question the physiological relevance of the findings.

      2) One of the physiological consequences of the deletion of ECM in culture is the increased amplitude and frequency of mIPSCs. However, the bimodal distribution of these mIPSC parameters begs the question of how the authors made sure that they recorded from the same neuronal types in their cultures. Moreover, the use of TTX may not ensure that the mIPSCs are Ca2+-entry independent events. Depolarized terminals, and spontaneous closures of K channels within may lead to the opening of voltage-gated Ca channels that could increase both amplitude and frequency of the "mIPSCs".

      3) A similar concern as above surrounds the MFR and MBR of the cultures as measured with the MEA. In these recordings there is no way to distinguish between the firing and bursting of E- or I-neurons.

      4) The modeling part of the study cannot but be biased by the results obtained in cultures. Does it also accurately predict the effects of BMI and CGP46381? How was the effect of CGP46381 distinguished between excitatory and inhibitory terminals, as the antagonist affects GABA-B receptors on both?

    2. Reviewer #1:

      The authors of the manuscript entitled "Extracellular matrix supports excitation-inhibition balance in neuronal networks by stabilizing inhibitory synapses" undertook a study to understand the mechanism(s) by which the extracellular matrix (ECM) of the brain may stabilize neuronal excitability and synaptic plasticity. The study heavily utilized in vitro networks consisting of mature, cultured, hippocampal neurons (with a 2:1 ratio of excitatory to inhibitory neurons) where the ECM was disrupted via enzymatic treatment with chondroitinase ABC or hyaluronidase for 16 hours. Control cells were treated with vehicle (0.1 M PBS).

      The study made several interesting observations. Using their in vitro network, the authors were able to show a reduction in both excitatory and inhibitory synapse density after ECM depletion (Figure 1C). In vivo, they observed a specific decrease only in the inhibitory synapse density after ECM depletion (Figure 2D). To understand how ECM depletion-induced reductions in inhibitory synapse density affect synaptic transmission, the authors recorded miniature inhibitory postsynaptic currents (mIPSCs) in control and ECM depleted cultures. These measurements showed an increase rather than a decrease in the amplitude and frequency of mIPSCs (Figure 3C-D). In contrast, spontaneous network activity measured via multielectrode arrays revealed a significant increase in both firing rate and bursting rate after ECM depletion. Ultrastructural microscopic analysis of scaffolds within structurally complete GABAergic and glutamatergic synapses showed that ECM depletion reduced the size of gephyrin, but not PSD95 scaffolds (Figure 4C). Although the size of the gephyrin scaffolds was reduced, the immunoreactivity of GABAA receptors inside gephyrin containing postsynapses was not altered (Figure 4B, D) nor was the total expression of GABAA receptors affected (Figure S3). A significant reduction in GABABR in VGAT+ terminals was however noted.

      The current manuscript provides ample evidence for both an ECM depletion mediated reduction in inhibitory synapse density and an increase of spontaneous network activity. However, essential functional data is needed (see the list of concerns below) to support the conclusion of a homeostatic increase in inhibitory synapse strength via the reduction of presynaptic GABAB receptors. Functional evidence should also be supplied to show an ECM depletion mediated alteration in the excitation-inhibition (E-I) balance.

      Concerns:

      1) To ensure that ECM depletion did not affect cell survival in neuronal cultures, the authors examined DAPI stained neurons for fragmented nuclei, but more specific assays for cell death such as TUNEL, Fluoro-Jade or activated caspase-3 staining should be incorporated into their study.

      2) It is unclear whether enzymatic ECM digestion/disruption is equally efficient at inhibitory and excitatory synapses. Data in Figure 4C show no reduction in the magnitude of the PSD95 scaffolds after ECM depletion. Is this reflective of specificity, or rather of a less efficient enzymatic disruption at excitatory synapses?

      3) Although the PBS vehicle and ECM digestion were delivered ipsilaterally, it was unclear whether there was an accompanying effect contralaterally. This was largely because neither quantification of synapse densities nor the magnified images of the yellow contralaterally positioned squares were shown.

      4) Additional functional tests are needed to show that ECM depletion strengthens inhibitory input to single neurons. These functional tests could include measurements of the paired-pulse ratio and uIPSCs, with analysis of both the CV for uIPSCs and the failure rate. Functional tests should also be added to show that in this in vitro cell culture preparation, ECM depletion results in a functional reduction in presynaptic GABABR activation and a subsequent increase in presynaptic release of neurotransmitter.

      5) Given that excitatory synapse densities were also reduced in the cultured neuronal preparations (Figure 1C), measurement of miniature excitatory postsynaptic currents (mEPSCs) should be included in the study. In some cases, reductions in inhibition and excitation can be balanced leading to no net change in E-I balance in the neural circuit, so it's important to consider both parameters.

      6) It is unclear whether the increased firing and bursting are due to the presynaptic blockade of GABABRs or GABABRs localized elsewhere. The equally increased firing rate in the control and ECM depleted condition after bicuculline methiodide application could be interpreted to show that (in the absence of all GABAA-mediated inhibition) the maximum neuronal firing rate is largely unaffected by ECM depletion, and remains similar to the controls.
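
      As a rough illustration of the measures named in concern 4, the sketch below computes a paired-pulse ratio, the coefficient of variation (CV) of uIPSC amplitudes, and a failure rate from evoked response amplitudes. The amplitudes, the failure threshold, and all function names are hypothetical and chosen only for illustration; they are not taken from the manuscript.

```python
from statistics import mean, pstdev

def paired_pulse_ratio(first, second):
    """Mean amplitude of the second response divided by that of the first."""
    return mean(second) / mean(first)

def coefficient_of_variation(amplitudes):
    """Standard deviation of the amplitudes divided by their mean."""
    return pstdev(amplitudes) / mean(amplitudes)

def failure_rate(amplitudes, threshold):
    """Fraction of trials whose amplitude falls below the detection threshold."""
    failures = sum(1 for a in amplitudes if a < threshold)
    return failures / len(amplitudes)

# Hypothetical uIPSC amplitudes in pA (absolute values), one entry per trial.
first_pulse = [40.0, 0.0, 60.0, 50.0, 0.0, 50.0]
second_pulse = [30.0, 20.0, 0.0, 40.0, 30.0, 30.0]

ppr = paired_pulse_ratio(first_pulse, second_pulse)
cv = coefficient_of_variation(first_pulse)
fail = failure_rate(first_pulse, threshold=5.0)  # threshold is an assumption
print(f"PPR={ppr:.2f}  CV={cv:.2f}  failure rate={fail:.2f}")
```

      Under the usual interpretation, a lower paired-pulse ratio together with a lower failure rate after ECM depletion would be consistent with increased presynaptic release probability, which is the kind of functional evidence requested above.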

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 (https://www.biorxiv.org/content/10.1101/2020.07.13.200113v2) of the manuscript.

    1. Reviewer #3:

      The manuscript "EFF-1 promotes muscle fusion, paralysis and retargets infection by AFF-1-coated viruses in C. elegans" describes how VSV coated with the AFF-1 fusogen can be targeted to specific cells in vivo in C. elegans. Using this technique, the authors elegantly show that AFF-1 viruses show tissue/cellular tropism in vivo that largely matches known AFF-1 or EFF-1 receptor expression, which they verify through genetic mutation and ectopic expression. Overall, I would like to commend the authors on a fascinating and scientifically thorough manuscript that would be of interest to a broad range of scientists, from C. elegans researchers to viral engineers. However, while there are several lines of evidence that suggest cell-to-cell fusion in the muscle upon EFF-1 ectopic expression, they are all circumstantial. So I suggest the authors tone down the strong language used throughout the manuscript that states outright that EFF-1 induces muscle fusion, including in the title, unless they use EM or photoconvertible fluorescent markers that show actual shared cytoplasm between cells.

      Major issues:

      1) The authors have not clearly shown that EFF-1 and VSV-EFF-1 cause muscle cell fusion. Nuclei count is not evidence of cell-cell fusion (Fig. 4I) and it is not clear from the images how the authors can distinguish the plasma membrane of muscle cells in order to count nuclei per cell in Fig 4I and Fig 7O-P. Furthermore, the authors claim muscle cell fusion in the myo-3p::eff-1 strain based on indistinguishable membranes expressing membrane-bound YFP and even distribution of mCherry (Fig 5). But loss of membrane-bound YFP and distribution of mCherry are not clear evidence of cell fusion, especially when assessed qualitatively rather than quantified. Definitive evidence of cell-cell fusion in the muscle can be shown with EM or using a photoconvertible fluorescent protein which could show actual sharing of cytoplasm between cells. So claims like the following (and many others including the title) are too strong given the data in the manuscript:

      a) "EFF-1 expression in BWMs induces their fusion" (Line 331)

      b) "evenly distributed cytoplasmic myo-3p::mCherry indicating fusion and content mixing between these cells during development" (lines 297-299)

      c) "EFF-1 expression in fused BWMs enables VSV∆G-AFF-1 and VSV∆G-G spreading" (line 349)

      2) Figure 3 does not convincingly show key data to fit with their hypothesis that VSV-AFF-1 infection would increase upon EFF-1 expression in a dose-dependent manner. Based on Figure 3, the authors conclude that "hypodermal infection by VSV∆G-AFF-1 increases with conditional induction of eff-1." (Lines 229-230). But they use an assay counting GFP-positive nuclei. So the result showing a decrease in GFP+ nuclei as eff-1 levels decrease is likely due to a loss of natural syncytium formation in the hypodermis rather than due to decreased infection by VSV-AFF-1. As they stated in lines 199-200, GFP+ nuclei in the hypodermis are localized closer to the injection region of the head in eff-1 mutants. So higher eff-1 expression would lead to both a larger hypodermal target for viral infection and more posterior nuclei within that target for the virus to spread towards, showing GFP expression when the syncytium becomes infected. To control for this, the authors could infect the eff-1-ts mutant with VSV-G and show no dose-dependent effect.

    2. Reviewer #2:

      In this manuscript, Meledin et al. have used the C. elegans model to investigate two interesting aspects: (1) the consequence of ectopically fusing the normally mononuclear body wall muscle cells by expressing the eff-1 fusogen, and (2) the use of VSV∆G virus particles coated with the AFF-1 fusogen to change the tropism of the virus and preferentially infect muscle cells. This manuscript describes a novel and truly innovative approach in the C. elegans model to develop methods for cell-specific viral targeting by modifying the host genome. I find the data showing preferential and efficient infection of EFF-1 expressing cells by VSV∆G-AFF-1 spectacular, as there are many applications that could be developed using this approach. In addition, showing that fused body wall muscles do not function normally is a significant finding, even though the exact causes of the strong defects that were observed are not investigated in detail. Here, the manuscript could be strengthened, for example by including an ultrastructural (EM) analysis of the fused muscle cells.

      Overall, the manuscript is very well written and based on solid data. Some figures are a bit difficult to interpret (e.g. fig. 6 showing the fused muscle cells).

    3. Reviewer #1:

      In their manuscript, the authors examine infection of C. elegans by fusogen-coated Vesicular Stomatitis Virus (VSV), based on the previously developed pseudotyped virus VSVG-AFF-1. They show that VSVG-AFF-1 can efficiently infect multiple C. elegans tissues through microinjection, and that the infection requires the function of a bilateral fusogen (AFF-1 or EFF-1) on the target cells. Furthermore, using genetic and live imaging techniques, they observed that overexpression of EFF-1 in muscle leads to paralysis, dumpy, and uncoordinated phenotypes. AFF-1 coated pseudovirus can thus infect BWMs that ectopically express EFF-1, and significantly enhance the uncoordinated behavior, which may be due to the merging of BWMs or the formation of non-functional syncytial muscle fibers. This is an interesting, well-written, and thoughtful study showing that C. elegans can be infected by a virus with the bilateral fusogen, and it represents a significant advance in identifying important players mediating virus infection in C. elegans.

      Major Comments:

      1) myo-3 encodes a myosin heavy chain, and its promoter drives very strong gene expression. Overexpression of myo-3p::GFP/mCherry from a high-concentration extrachromosomal array frequently results in uncoordinated, dumpy, or paralyzed phenotypes, which are due to inconsistent expression, chimeric expression, leaky expression and variable copy numbers that inhibit the endogenous promoter. The authors show that muscle expression of EFF-1 from an extrachromosomal array causes uncoordinated, dumpy, larval arrest, and paralysis phenotypes, which may be due to either the myo-3 promoter or EFF-1 expression in the muscle. It is very difficult to draw any solid conclusion here. As most of the data were based on extrachromosomal muscle expression of EFF-1, it is important to generate a single-copy insertion of myo-3p::EFF-1 to mimic endogenous expression levels and to test whether ectopic expression of EFF-1 is required for VSVG-AFF-1 infection and the other phenotypes reported.

      2) Is it possible to examine/observe AFF-1 and EFF-1 interaction after VSVG-AFF-1 infection and in the fused BWMs in vivo?

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      All reviewers thought this is an interesting study and most of the experiments are convincingly performed. However, they also raised a number of concerns.

    1. Reviewer #3:

      Summary

      This study used the method of lesion-symptom mapping to dissociate the neural correlates underlying syntactic and semantic functions. The results suggest that different brain regions of the language network do not share similar functions; instead, they should perform different high-level functions that contribute to linguistic processing. Specifically, the pMTG and the aSTS were found to be associated with syntactic comprehension; the pIFG and the aIFG with expressive agrammatism; and the iAG with semantic category word fluency. Overall, I find the research question interesting. However, I have some doubts about the methodology, and the interpretation of the experimental results, though not implausible, was somewhat hasty. I'll elaborate below.

      Detailed comments:

      1) The fundamental reasoning underlying the method of lesion-symptom mapping.

      I agree with the paper that high-level linguistic functions are intertwined in language performance (in language comprehension and production), and any manipulation of syntax is likely to affect semantic interpretation as well. However, it seems problematic to claim that this conundrum can be solved with the help of lesion-symptom mapping, and that lesion-symptom mapping can identify brain regions "causally" involved in linguistic functions.

      Suppose that the execution of function X crucially depends on two other functions Y and Z, while function Z also causally depends on function Y. I doubt we can discover this kind of causal network from lesion-symptom mapping. In other words, simply detecting the correlation between a lesion area and the performance of a certain linguistic task is still far from detecting the actual causal dependence between a certain brain region and a certain linguistic function. Therefore, I think the paper should avoid overclaims and include more details on how the specific procedures of the current study led to contributions "towards" revealing the general or language-specific function of a brain region.

      Y → Z

      ↓ ↙

      X
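
      The concern can be made concrete with a toy model. The sketch below illustrates the argument only, under the assumption that each region is either intact or lesioned and each function either works or fails; the two candidate architectures and all names are hypothetical. It enumerates every lesion pattern and shows that a network in which X depends on both Y and Z produces the same lesion-deficit table as one in which X depends on Z alone, so the deficit pattern by itself cannot tell the two apart.

```python
from itertools import product

def x_works_direct_and_indirect(y_intact, z_intact):
    """X needs Y and Z; Z in turn needs Y (the network sketched above)."""
    y_works = y_intact
    z_works = z_intact and y_works
    return y_works and z_works

def x_works_indirect_only(y_intact, z_intact):
    """Hypothetical alternative: X needs only Z, and Z needs Y."""
    y_works = y_intact
    z_works = z_intact and y_works
    return z_works

# All four combinations of (Y region intact, Z region intact).
lesion_patterns = list(product([True, False], repeat=2))
table_a = {p: x_works_direct_and_indirect(*p) for p in lesion_patterns}
table_b = {p: x_works_indirect_only(*p) for p in lesion_patterns}

for (y_intact, z_intact), x_works in table_a.items():
    print(f"Y intact={y_intact!s:5}  Z intact={z_intact!s:5}  X works={x_works}")

print("Tables identical:", table_a == table_b)
```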

      2) Methodological details of this paper.

      This issue is also related to the previous one. It seems that the assignment of the two groups of participants was based on some other studies. The specific lesion-mapping procedures adopted in this paper also followed some other studies. Though I understand that there might be some word limits for the submission, I still hope that (i) the paper includes more methodological details on these, so that the paper can be better self-contained, and (ii) some explanations are given on how these procedures led to contributions "towards" revealing the general or language-specific function of a brain region.

      3) The interpretation of results.

      The behavioral tasks used in this study, namely the comprehension of sentences with non-canonical word order, the description of pictures, and the naming of animals, are associated with three kinds of linguistic functions: syntactic comprehension, expressive agrammatism, and semantic category word fluency. There might be alternative ways to interpret these three linguistic tasks: e.g., (i) sentence-level processing vs. discourse-level processing vs. word-level processing; (ii) syntax vs. pragmatics vs. lexical ability; etc. The interpretation of results should include a discussion of these.

      4) How the findings were consistent with the theory proposed in Matchin & Hickok (2020)

      I read the paper by Matchin & Hickok (2020) ("The cortical organization of syntax", Cerebral Cortex), and found some discrepancies between the theory proposed in that paper and the findings from the current experiment. In that paper, the pMTG is associated with the lexical-syntactic function, underlying both language production and comprehension, while the pIFG is associated with linearization, underlying specifically language production. In the current study, the association between the pMTG and syntactic comprehension seems to suggest that the pMTG is specifically related to the processing of sentences with non-canonical order. Isn't the processing of this kind of sentence an issue related to linearization, rather than to argument structure or other lexical-syntactic issues?

    2. Reviewer #2:

      This paper attempts to disentangle the neural instantiation of syntax and semantics using VLSM correlations between regions of brain-damaged tissue and language performance across three tasks in relatively large groups of stroke patients. Although the work addresses an important, and currently debated, issue in cognitive neuroscience, the paper is significantly methodologically flawed and the results are untenable.

      Major problems:

      1) Independent measures. Three tasks were used to index (1) syntactic comprehension, (2) expressive agrammatism, and (3) semantic processing. All are problematic and reliability of measurement was not addressed for any of the tasks. This is particularly problematic for expressive agrammatism, but is of concern for all measures.

      For syntactic comprehension, a combined score reflecting comprehension of three complex sentence types with long-distance dependencies (wh-movement constructions) was contrasted with scores for active sentences. This contrast is linguistically unfounded: it is not possible to isolate syntactic processing using this contrast, since there are critical differences between the experimental and control sentences on several variables beyond syntactic processing, including the number of propositions, lexical-semantics, and sentence length, as well as domain-general processes. For any study seeking to determine the cognitive and/or neural resources engaged for syntactic processing, a fundamental requirement is that experimental conditions consist of pairs of stimuli that differ along a single dimension - the dimension of interest - with all else kept constant across conditions, lest the comparison be confounded by additional dimension(s) (cf. Grodzinsky, 2010, for discussion). To do so in the present study the non-canonical forms would need to be contrasted with their canonical counterparts, e.g., subject-relatives for object-relatives, subject questions for object questions, etc.

      Expressive agrammatism was determined based on samples of connected speech elicited by picture description or story retelling and the "presence of expressive agrammatism was . . . rated by speech and language experts . . ." This is problematic. Subjective judgement is insufficient for a study of the scope reported. Objective analysis of the speech samples is needed to quantify salient dimensions of agrammatism or, better, inclusion of a constrained task, like that used to quantify sentence comprehension, is recommended.

      2) A very gross measure of "semantic" processing was used - a word fluency task. This is arguably not a semantic task and no rationale for using it is provided. Given this, the title of the paper is inappropriate and misleading: ". . . dissociations of syntax and semantics . . .". It also is stated that assessment occurred at "a variable number of timepoints". Why? When were the time points? Were there any intervening variables between time points? Why was performance "averaged" over samples? In what way does this make the data more "reliable"? Were all participants beyond the period of spontaneous recovery (this is not evident based on data presented in Table 1)?

      3) Dependent measures. Six ROIs were selected for analysis and the rationale for their selection is based on one model of sentence processing. There are three main issues here: (1) there is no rationale for using an ROI rather than a voxel-based approach; of the two approaches, a voxel-based approach is the most rigorous as ROI analyses may lead to spurious results simply based on the ROIs selected, (2) the voxel-wise analyses were uncorrected; tables reporting the coordinates derived from voxel-wise analyses are needed; the corrected voxel-wise analyses (with corresponding data tables) should replace the ROI analysis at least for first-pass analyses, (3) greater motivation/justification for selection of the 6 ROIs is needed; there are well-known and well-conceptualized data-based models of sentence processing that include ROIs other than the six tested, e.g., pSTG/pSTS (Friederici, 2012, 2018; Friederici & Gierhan, 2013; Bornkessel-Schlesewsky & Schlesewsky, 2013; Bornkessel-Schlesewsky et al., 2012). Why do the authors overlook this important body of work? ROI selection could be better motivated based on data derived from well-controlled studies of syntactic and semantic processing (e.g., for syntactic processing: Bahlmann et al., 2007; Bornkessel et al., 2005; Bornkessel-Schlesewsky et al., 2010; Constable et al., 2004; Fiebach et al., 2005; Friederici et al., 2006; Meltzer et al., 2010; Santi & Grodzinsky, 2010; Thompson et al., 2010). In addition, there are several published meta-analyses within these domains that would better elucidate appropriate ROIs.

      4) Discussion/conclusions. Several statements in this section are overstatements, not supported by the study:

      a) "Research critically needs to incorporate insights from lesions symptom mapping in order to understand the architecture of language...". Why? Lesioned brains arguably have undergone reorganization (particularly in chronic stroke). This issue is not addressed in the paper.

      b) "...results are ...consistent with neuroanatomical models that posit distinct syntactic and semantic functions to different regions...". It is not possible to determine precise functions of brain regions based on lesioned tissue. The only conclusion that can be drawn is that the infarcted region is involved in and may disrupt the function of interest, but it cannot be said that it is responsible for it. Such an assertion fails to recognize the well-known fact that brain regions do not work in isolation, rather a network of regions is required for execution of complex tasks.

      c) "The [Matchin & Hickok] model posits that the ...pMTG is critical for processing hierarchical structure for production and comprehension.". The data presented do not address or support this claim.

      d) "Damage to the pMTG was significantly associated with semantic comprehension deficits...". Semantic comprehension was not tested.

      e) "damage to the pIFG was ...associated with agrammatic speech deficits". This observation, albeit unreliable based on limitations of the method used for quantifying agrammatism, does not support the M&H model; the authors claim that it does in spite of the fact that there was a "marginally" significant interaction between IFG and MTG.

      Given the substantial methodological limitations inherent in this study, the results and conclusions are unreliable.

    3. Reviewer #1:

      This is a lesion-symptom mapping study of syntactic comprehension, syntactic production, and a semantic measure, namely category word fluency. The authors argue that each of these language functions depends on a different brain region. With some revision this paper could be a worthwhile contribution to the literature, but in my opinion it largely replicates prior work, and the aspects in which it attempts to go beyond prior work are not very strong.

      1) The links between the brain regions and linguistic functions studied here have all been firmly established already. For the IFG and agrammatism, the authors cite two papers from their own work and two from other labs that already make this case (p. 8). For the pMTG and receptive syntax, there are many previous findings, most of which are cited in the present paper and/or the authors' 2020 review paper; Pillay et al. (2017) is a particularly compelling study reporting this association. Semantic fluency has previously been associated with inferior parietal cortex by Baldo et al. (2006), also appropriately cited in the present paper. In sum, none of the major findings of the present study are novel.

      2) The most novel aspect of this study is that the authors carry out some interaction analyses, which indeed are often not carried out when they should be when making claims about differential roles of different brain areas. But the value of this is undercut by the fact that these interaction analyses are still based on univariate analyses of lesion-behavior relationships in each region. The fact that many lesions to one region will extend to one or more of the other regions is simply ignored (as in most VLSM studies). This unrealistic model is just inherently limited (Mah et al., 2014). A multivariate approach to lesion-symptom mapping would be needed to make progress in teasing out differential contributions of different regions. Furthermore, one of the three interactions is not statistically significant, and another one (involving the semantic measure) is not well motivated because the authors present no analysis of the category fluency task, and therefore no principled reason to expect it to be associated with one or another semantic region. Regarding that finding, they end up making a reverse inference on p. 9, and although they cite Schwartz et al. (2011), they don't explain that that paper already showed differential roles of these two regions in a lesion-symptom mapping study. Finally, there are no interactions that actually address the segregation of syntax and semantics promised in the title.
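
      To illustrate why overlapping lesions matter, the toy example below uses made-up patient counts in which only region A affects the score, but lesions to regions A and B tend to co-occur. Comparing patients with and without damage to B in isolation suggests an effect of B; holding A constant removes it. Stratification is used here as a simple stand-in for a multivariate model, and all numbers and names are hypothetical.

```python
from statistics import mean

# Hypothetical cohort: (lesion_A, lesion_B, number_of_patients).
# Lesions to A and B co-occur more often than not.
cohort = [(1, 1, 6), (1, 0, 2), (0, 1, 2), (0, 0, 6)]

def score(lesion_a, lesion_b):
    """Assumed ground truth: only damage to region A lowers the score."""
    return 10.0 - 5.0 * lesion_a

# Expand the counts into one (lesion_A, lesion_B, score) row per patient.
patients = [(a, b, score(a, b)) for a, b, n in cohort for _ in range(n)]

def effect_of_b(rows):
    """Mean score with B lesioned minus mean score with B spared."""
    with_b = [s for _, b, s in rows if b == 1]
    without_b = [s for _, b, s in rows if b == 0]
    return mean(with_b) - mean(without_b)

# Univariate view: ignore region A entirely.
univariate = effect_of_b(patients)
# Stratified view: compare B lesioned vs. spared within each level of A.
stratified = {a: effect_of_b([r for r in patients if r[0] == a]) for a in (0, 1)}

print("Apparent effect of B, ignoring A:", univariate)
print("Effect of B within each level of A:", stratified)
```

      In this constructed case the univariate comparison attributes a deficit to region B purely because B lesions travel with A lesions, which is the limitation of region-by-region analyses raised above.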

      Some other issues to consider:

      1) Speech rate is used as a covariate to control for non-semantic factors influencing category word fluency, but it cannot possibly serve that purpose. There are many factors influencing speech rate, especially motor factors, and completely different factors contributing to word fluency performance, especially executive. The bottom line is category word fluency is really not a very helpful measure because there are too many contributing factors.

      2) There seems to be inadequate lesion coverage in the TP ROI.

      3) Although the uncorrected voxelwise maps are reassuring with respect to the main ROI analysis, the fact that they are uncorrected means that they don't really have any evidentiary value.

      4) It is problematic to combine two sentence comprehension measures without showing that they are on an identical scale or adjusting them accordingly.
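
      One common way to address the last point is to standardize each measure before combining them. The sketch below assumes two hypothetical comprehension tests scored on different scales and converts each to z-scores before averaging; the scores and names are invented for illustration and are not the study's data.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical scores for the same five participants on two tests.
test_a = [4, 6, 8, 5, 7]          # scored out of 10
test_b = [55, 70, 90, 40, 85]     # scored as percent correct

# Average the standardized scores so neither test dominates by scale alone.
combined = [mean(pair) for pair in zip(z_scores(test_a), z_scores(test_b))]
print([round(c, 2) for c in combined])
```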

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      The reviewers feel that the authors are addressing an interesting and important issue in cognitive neuroscience. Nevertheless, serious shortcomings in methods and analytic approaches, and in interpretation, were flagged by all three reviewers.

    1. Reviewer #3:

      Wang et al. describe a tissue-specific knockout system to target neutrophil-specific genes. A tissue-specific knockout system is an important tool for studying gene function in specific tissues. To the best of my knowledge there are only four major publications describing tissue-specific knockouts in zebrafish, two of which are not acknowledged by the authors.

      In this manuscript, the authors used a neutrophil-specific promoter to drive the expression of Cas9 and a ubiquitous promoter for sgRNA expression, or vice versa. The authors have previously published a similar paper describing a transgenic construct (Tg(lyzC:nls-cas9-2A-mCherry/U6a:polg sgRNA)) expressing Cas9 as well as sgRNAs from a single construct. The authors claim that the knockout efficiency drops significantly when the knockout line is crossed with other lines that use the neutrophil-specific promoter, possibly because another construct driven by the same neutrophil-specific promoter in the genome competes for the transcription factors needed for Cas9 expression and reduces Cas9 protein to a level that is not sufficient for efficient knockout.

      In this manuscript, the authors created an sgRNA-resistant rescue construct, incorporated biosensors into the knockout line for live imaging in the context of the cell-specific knockout, and studied the function of Rac2 and Cdk2.

      This manuscript does not offer any further advances other than showing the tissue-specific rescue and the subcellular localization of Rac activation in wild-type and rac2-knockout neutrophils.

      There is no evidence that this strategy is better than the previously published method, and quantification of knockout efficiency is absent.

    2. Reviewer #2:

      In the manuscript "A CRISPR/Cas9 vector system for neutrophil-specific gene disruption in zebrafish" by Wang et al, the authors describe methods for targeted inactivation of genes in a cell-type specific fashion, in this case in neutrophils in zebrafish embryos, and use this tool to examine the role of rac2 in neutrophil motility. The overall goal of broadening the ability to target tissue-specific gene inactivation is laudable and an ongoing need in the zebrafish toolbox, as is the goal of developing an increased understanding of motility regulation in neutrophils, as evaluated here in a series of quite stunning motion-tracking videos. Unfortunately, the current manuscript does not appear to advance the technology, nor evaluate it in sufficient depth, nor reveal sufficient new biology in regards to neutrophils/rac2.

      Major Points:

      1) With the title "A CRISPR/Cas9 vector system for neutrophil-specific gene disruption in zebrafish", the manuscript seems to be targeting a "technology" aim. As the authors cite, they have already published a neutrophil-specific CRISPR/Cas9-based knockout tool in their DMM, 2018 manuscript. The addition of the crystallin reporter in the current manuscript is a convenient method for tracking the cas9 portion of the transgene, but this is a modest alteration to the existing technology.

      2) While billed as a neutrophil-specific gene-disruption technology, the authors do not show the genomic sequence of a mutated/disrupted rac2 gene. They have previously done this in the DMM 2018 paper, so this should be feasible. Disruption of neutrophil motility is being used as a proxy read-out for rac2 disruption, but it seems that, as currently billed, the study should show neutrophil-specific disruption of rac2. The neutrophil-specific rescue experiments are very nice, but fail to show that the targeted gene disruption is limited to neutrophils, only that the gene disruption includes neutrophils. This could be of concern in a stable transgene context as well, since transgenes can exhibit ectopic gene expression (i.e. not limited to neutrophils), and this cannot be tracked with the un-tagged CAS9 in the construct.

      3) At the outset, it is expected that disruption of rac2 would lead to neutrophil motility disruption and changes in F-actin dynamics using this tool as previously described in Deng et al, "Dual roles for Rac2 in neutrophil motility and active retention in zebrafish hematopoietic tissue", Dev Cell, 2011. As a proof of concept for the ability of targeting a gene in neutrophils, this makes sense to evaluate a well-studied pathway, but it is not clear if this expands on the understanding of rac2/control of actin dynamics and neutrophil motility, or if the newly described targeting vectors allow for an analysis that was not previously possible.

      4) The ribozyme approach described in Figure 6 seems perhaps most novel as an approach to target tissue-specific inactivation of a gene, but to truly nail down the technology, this would seem to require again some analysis of (a) the specific genomic lesions induced by the combination of ubiquitous CAS9 and tissue-specific gRNA and (b) some assessment of the specificity to neutrophils (i.e. are these mutations generated in other cell types?).

    3. Reviewer #1:

      Summary:

      In their manuscript, Wang et al. use two transgenic lines to knock out the rac2 gene tissue-specifically in neutrophils. While CRISPR-Cas9 has been technically well established, tissue-specific knockouts in zebrafish are missing in the field. Therefore, the manuscript of Wang et al. is highly timely and would help advance the field further; however, the manuscript and figures would greatly benefit from thorough editing and rewriting as outlined below.

      Major comments:

      Wang et al. base all their conclusions on observations of the targeted cells, and do not show any sequenced alleles of the neutrophil cells to verify that indels occurred. To go forward with the results, including sequences of the targeted alleles is crucial. Therefore, the manuscript would greatly benefit from including these basic allele confirmations, before drawing scientific conclusions about the efficacy of the system.

      1) Line 100 onwards. "To test the efficiency of the gene knockout using this system, we injected the F2 embryos of the Tg(lyzC:cas9, cry:GFP) pu26 line with the plasmids carrying rac2 sgRNAs or ctrl sgRNAs for transient gene inactivation. The sequences of the sgRNAs are described in Fig. 1C, D. A longer sequence with no predicted binding sites in the zebrafish genome was used as a control sgRNA (Fig. 1D). As expected, we observed significantly decreased neutrophil motility in larvae of Tg(lyzC:cas9, cry:GFP) pu26 fish transiently expressing sgRNAs targeting rac2 (Fig. 1E, F and Movie S1), indicating that sufficient disruption of the rac2 gene had been achieved."

      Please include sequenced alleles from rac2 in neutrophil cells. "Significantly decreased neutrophil motility" is not an indicator that rac2 in neutrophil cells is mutated. Only sequenced alleles are.

      2) Line 107 onwards. "To test the knockout efficiency in stable lines, we generated transgenic lines of Tg(U6a/c: ctrl sgRNAs, lyzC:GFP) pu27 or Tg(U6a/c: rac2 sgRNAs, lyzC:GFP) pu28, crossed the F1 fish with Tg(lyzC:cas9, cry:GFP) pu26 and quantified the velocity of neutrophils in the head mesenchyme of embryos at 3 dpf. A significant decrease of motility was observed in the neutrophils expressing Cas9 protein and rac2 sgRNAs (Fig. 1G, H and Movie S2)."

      Also here, "a significant decrease of motility" doesn't mean the rac2 gene in neutrophils is mutated. See point 1.

      Summarizing, the authors are advised to include this basic, but necessary and very important information in their manuscript instead of drawing conclusions from their observations. Otherwise, it stays unclear if everything Wang et al. observe is really due to indels in the rac2 gene, and not some other side effect of the system.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      As you can see from the reviewer comments attached below, all reviewers appreciated the approach you took for neutrophil-specific gene disruption, as such tissue-specific tools are still largely missing in the field. Nonetheless, the reviewers all agreed that your phenotype description is insufficient to warrant the claims of the study. In particular, the lack of sequence verification of the claimed Cas9-induced mutagenesis has been picked up by all reviewers. We hope the reviewer comments are instrumental for refining your work.

    1. Reviewer #3:

      In the current study, baseline samples of salivary and plasma oxytocin were assessed in 13 and 16 participants, respectively, to assess intra-individual reliability across four time points (separated by approximately 8 days). The main results indicate that, while average salivary and plasma levels at the group level were not significantly different across time points, the within-subject coefficient of variation (CV) and intra-class correlation coefficient (ICC) showed poor absolute and relative reliability of plasma and salivary oxytocin measurements over time. Also, no association was established between plasma and salivary levels, either at baseline or after administration of oxytocin (either intranasally or intravenously). Further, salivary and plasma oxytocin were only enhanced after intranasal and intravenous administration, respectively.
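
For readers unfamiliar with the absolute-reliability measure mentioned above, the following is a minimal sketch of a within-subject coefficient of variation, assuming the common definition CV = SD / mean across a participant's repeated visits. The manuscript's exact formula is not stated in this review, and the participant IDs and concentrations below are hypothetical.

```python
from statistics import mean, stdev

def within_subject_cv(values):
    """Coefficient of variation (%) of one participant's repeated measurements."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical concentrations (arbitrary units) for three participants over four visits.
visits = {
    "P01": [4.0, 6.0, 5.0, 5.0],
    "P02": [2.0, 2.5, 1.5, 2.0],
    "P03": [8.0, 3.0, 6.0, 7.0],
}

# One CV per participant, then the mean CV across participants.
cvs = {pid: within_subject_cv(vals) for pid, vals in visits.items()}
mean_cv = mean(cvs.values())

for pid, cv in cvs.items():
    print(f"{pid}: CV = {cv:.1f}%")
print(f"mean CV = {mean_cv:.1f}%")
```

A larger CV means a participant's values vary more between visits relative to their own mean, which is the sense in which absolute reliability is described as poor.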

      The study addresses an important topic and the paper is clearly written. While the overall multi-session design seems solid, sample collections were performed in the context of larger projects and therefore there appear to be several limitations that reduce the robustness of the presented results and consequently the formulated conclusions.

      General comments

      1) A main conclusion of the current work is that 'single measures of baseline oxytocin concentrations in saliva and plasma are not stable within the same individual'. It seems, however, that the study did not adhere to a sufficiently rigorous approach to put forward this conclusion. It lacks a control for several important factors, such as the time of day at which saliva/plasma samples were obtained, as well as sample volume. In particular, while it is indicated that all visits were identical in structure, important information is missing with regard to whether or not sampling took place consistently at a particular time each day, to minimize the influence of circadian rhythm. Without this information it is not possible to draw any firm conclusions on the nature of the intra-individual variability demonstrated in the salivary and plasma sampling. Correspondingly, a deeper discussion is needed of the reason why ICCs were considerably variable across pairs of assessment sessions, with some pairs yielding good reliability, whereas others yielded (very) poor reliability. More detailed descriptions regarding sampling procedures (timing and sampling intervals) are necessary. Also, more information is needed on the volume of saliva collected at each session, to control for possible dilution effects.

      2) It is indicated that the initial sample would allow detection of intra-class correlation coefficients (ICC) of at least 0.70 (moderate reliability) with 80% power. Is this still the case after the drop-outs/outlier removals? Since the main conclusions of the work rely on negative results (conclusions drawn from failures to reject the null hypothesis), it is important to establish the risk of false negatives within a design that is possibly underpowered.

      3) Did the authors also assess within-session reliability? For example, by assessing ICC between pre and post-measurements in the placebo session.

      4) It is indicated that the intra-assay variability of the adopted radioimmunoassay constitutes <10%. Were analyses of the current study run on duplicate samples? Was intra-assay variability assessed directly within the current sample?

      Introduction & Discussion

      5) The introduction and discussion are missing a thorough overview of previous studies assessing intra-individual variability in oxytocin levels.

      6) The paper misses a discussion of previous studies addressing links between salivary/ plasma levels and central oxytocin (e.g. in cerebrospinal fluid). I understand the claim that salivary oxytocin cannot be used to form an estimate of systemic absorption, although technically, a lack of a link between salivary and plasma levels, does not necessarily imply a lack of a relationship to e.g. central levels. The lack of effect is limited to this specific relationship.

      Methods

      7) Related to the general comment, the variability in days between sessions is relatively high (average 8.80 days apart; SD 5.72; range 3-28). However, it appears that no explicit measures were taken to control the conducted analyses for this variability.

      8) A rationale for the adopted dosing and timing (115 min post administration) of the sample extraction is missing. Additionally, it seems that intravenous administrations were always given second, whereas intranasal administrations were given third, with a small delay of approximately 5 min. Hence, it seems that the timing of 115 min post-administration is only accurate for the intranasal administration.

      9) Since the ICC of baseline samples showed poor reliability, it seems suboptimal to pool across sessions for assessing the relationship between salivary and blood measurements. It should be possible to perform e.g. partial correlations on the actual scores, thereby correcting for the repeated measure (subject ID). Further, since the sample size is relatively small (13 subjects), it might be advisable to use non-parametric (e.g. Spearman) correlations instead of Pearson. The additional reporting of the Bayes factor is appreciated; it is very informative.

      10) Now, the authors only compared relationships between salivary and plasma levels, either at baseline or post administration. I'm wondering whether it would be interesting to explore relationships between pre-to-post change scores in salivary versus plasma measures.

      11) Please provide more information on the outlier detection procedure (outlier labelling rule).

      12) Please indicate how deviations from a Gaussian distribution were assessed.

      Results

      13) Please verify the degrees of freedom for the post-hoc tests performed to assess pre-post changes at each treatment level (e.g. baseline vs post administration: Spray - t(122) = 7.06, p < 0.001). Why is this 122? Shouldn't this be a simple paired-sample t-test with 13 subjects?
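
To make the degrees-of-freedom point concrete: a paired-sample t-test on n subjects has n - 1 degrees of freedom, so 13 subjects would give t(12), not t(122). The sketch below illustrates this under that assumption; the pre/post values are hypothetical and are not taken from the manuscript.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for a paired-sample t-test."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1  # df depends on the number of pairs, not the number of observations

# Hypothetical baseline and post-administration values for 13 subjects.
pre = [2.1, 1.8, 2.5, 3.0, 2.2, 1.9, 2.7, 2.4, 2.0, 2.6, 2.3, 1.7, 2.8]
post = [3.4, 2.9, 3.1, 4.2, 2.8, 3.0, 3.9, 3.1, 2.7, 3.8, 3.5, 2.2, 3.6]

t_stat, df = paired_t(pre, post)
print(f"t({df}) = {t_stat:.2f}")
```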

    2. Reviewer #2:

      Summary:

      To test whether salivary and plasma oxytocin at baseline reflect the physiology of the oxytocin system, and whether salivary oxytocin indexes plasma levels, the authors quantified baseline plasma and/or salivary oxytocin using radioimmunoassay in two independent datasets. Dataset A comprised 17 healthy men sampled on four occasions at approximately weekly intervals. In dataset A, oxytocin was administered intravenously and intranasally in a triple-dummy, within-subject, placebo-controlled design, and the authors compared baseline levels and the effects of the routes of administration. Dataset A was also used to test whether salivary oxytocin can predict plasma oxytocin at baseline and after intranasal and intravenous administration of oxytocin. Dataset B comprised baseline plasma oxytocin levels collected from 20 healthy men sampled on two separate occasions. In both datasets, single measurements of plasma and salivary oxytocin showed insufficient reliability across visits (intra-class correlation coefficient: 0.23-0.80; mean CV: 31-63%). Salivary oxytocin increased after intranasal administration of oxytocin (40 IU) but did not change significantly after intravenous administration (10 IU). Saliva and plasma oxytocin did not correlate at baseline or after administration of exogenous oxytocin (p>0.18). The authors suggest that the use of single measurements of baseline oxytocin concentrations in saliva and plasma as valid biomarkers of the physiology of the oxytocin system is questionable in men. Furthermore, they suggest that saliva oxytocin is a weak surrogate for plasma oxytocin and that the increases in saliva oxytocin observed after intranasal oxytocin most likely reflect unabsorbed peptide and should not be used to predict treatment effects.

      General comments:

      The current study tested research questions relevant to the field. The analysis of two independent datasets with different routes of oxytocin administration is a strength of the current study. However, the findings are of limited novelty and the report has several limitations, as described below.

      Specific and major comments:

      1) A previous study with similar results has already shown that saliva oxytocin is a weak surrogate for plasma oxytocin, and that increases in salivary oxytocin after intranasal administration of exogenous oxytocin most likely represent drip-down transport from the nasal to the oral cavity and not systemic absorption (Quintana 2018 in Ref 13). Therefore, the novelty of the current findings is limited. The authors should state more clearly which results are novel and which replicate previous findings.

      2) As the authors discuss in the limitations section of the discussion, the current study has several limitations, such as analyses only in male participants and non-optimized timing of saliva and blood collection due to the other experiments. These limitations are understandable, because the current study was a secondary analysis of data from other studies with different aims. However, these limitations significantly limit the interpretation of the findings.

      3) As reported on page 6, dataset A comprises administration of approximately 40 IU of intranasal oxytocin and 10 IU of intravenous oxytocin. The rationale for these doses should be described. Since 40 IU differs from the 24 IU employed in most previous publications in the research field, the potential influence of dose should be tested and discussed.

      4) It is difficult to understand why no significant elevations in plasma oxytocin levels were observed after administration of oxytocin by intranasal spray or nebuliser. In Figure 4A, the differences between levels at baseline and post administration are similar for nebuliser, spray, and placebo. Please discuss the potential interpretation of this result.

      5) On page 12, the reason for not applying any correction for multiple comparisons in the statistical analyses should be clarified.

    3. Reviewer #1:

      This article describes the investigation of a valuable research question, given the interest in using salivary oxytocin measures as a proxy of oxytocin system activity. A strength of the study is the use of two independent datasets and the comparison between intranasal and intravenous administration. The authors report poor reliability for measuring salivary oxytocin across visits, that intravenous delivery does not increase concentrations, and that salivary and blood plasma concentrations are not correlated.

      Line 77-78: While it's true that saliva collection provides logistical advantages, there are also measurement advantages (e.g., relatively clean matrix) that are summarised in the MacLean et al (2019) study, which has already been cited.

      Line 86: It is important to note that the 1IU intravenous dose in this study led to equivalent concentrations in blood compared to intranasal administration.

      Line 158: When using both ELISA and HPLC-MS, extracted and unextracted samples are correlated when measuring oxytocin concentrations in saliva, at least in dogs. (https://doi.org/10.1016/j.jneumeth.2017.08.033).

      Statistical reporting: I ran the article through statcheck R package (a web version is also available) and found a number of inconsistencies with the reported statistics and their p values. For example, on Line 302 the authors reported: t(123) = 1.54, p = 0.41, but this should yield a p value of 0.13. The authors should do the same and fix these errors.
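
The consistency check described above can be illustrated by recomputing a two-tailed p value from the reported t statistic and degrees of freedom. This is a minimal standard-library sketch that numerically integrates the Student t density with Simpson's rule; it is not statcheck's own implementation, and the function names are hypothetical.

```python
from math import exp, lgamma, log, pi

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: 1 minus the probability mass within [-|t|, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_mass = total * h / 3  # integral of the density over [0, |t|]
    return 1.0 - 2.0 * half_mass

# Reported on Line 302 of the manuscript: t(123) = 1.54, p = 0.41.
p = two_tailed_p(1.54, 123)
print(f"recomputed p = {p:.2f}")
```

The recomputed value should agree with the reviewer's figure of about 0.13 rather than the reported 0.41.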

      Line 305: The confidence intervals for these correlations should be reported.

      Line 348: This is an important point, but it's important to note that the vast majority of these studies use plasma or saliva measures. Perhaps CSF measures are more reliable, but the question wasn't assessed in the present study, and I'm not sure if anyone has looked at this question.

      Line 423: I broadly agree with this conclusion, but it should be added that "single measurements of baseline levels of endogenous oxytocin in saliva and plasma are not stable under typical laboratory conditions" Perhaps these measures can be more stable using other means (i.e., better standardising collection conditions). But the fact remains, under typical conditions these measures do not demonstrate reliability.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The strengths of the study are the findings that a single oxytocin level measured from saliva or plasma is not meaningful in the way that the field might currently be measuring. The reviewers appreciated this finding, and the careful attention to detail, but felt that the results fell short.

    1. Reviewer #3:

      This work by Sacchetti et al. describes how phenotypic plasticity contributes to local invasion and metastasis formation in colon cancer cells. Based on human classical colon carcinoma cell lines and cell sorting they identified a subpopulation of colon cancer cells that are CD44hi/EpCAMlo cells which have enhanced phenotypic plasticity that underlies enhanced invasion and metastatic behavior. In these EpCAMlo cells elevated ZEB1 expression has been identified. Increased WNT signaling results in elevated expression of the EMT associated transcription factor ZEB1. The EpCAMlo expression status is linked with the CMS4 subgroup of human malignant colon cancer. Overall this is an interesting and well written paper for which I offer a few supportive questions/remarks.

      Major comments:

      1) Page 6: The miR-200 family of miRNAs targets the mRNA of the transcription factors ZEB1 and ZEB2 in epithelial cells, but this is not transcriptional regulation.

      2) A clear association of EpCAMlo cells and elevated ZEB1 expression is identified. Conditional knockdown of ZEB1 results in a strongly decreased number of EpCAMlo cells. For now, it is not clear if ZEB1 KD results in the death of these EpCAMlo cells or that the mesenchymal gene signature is controlled by ZEB1. The functional contribution of ZEB1 as an EMT inducer should be experimentally proven as for now the role of ZEB1 is not clear.

      3) The importance of the role of EMT in relation to resistance to chemotherapeutic drugs and metastasis is not yet well established in the manuscript. Conditional KD of ZEB1 in metastasis and therapy resistance assays should be added; otherwise the title and the claims made in the abstract should be toned down.

      4) The use of AKP organoids brings further relevance to this research manuscript. Are these EpCAMlo cells also present in the AKP organoids and what is the endogenous expression status of ZEB1 in the AKP organoids?

      5) Why have the authors maintained the conditional expression of ZEB1 induced in the AKP-Z organoid transplantation experiments? This is driving the epithelial cells in a locked mesenchymal state - which is not compatible with the earlier observed plasticity with the EpCAMlo cells in SW480 and HCT116 cells. Also mesenchymal to epithelial transition is generally believed to be essential for metastasis formation. The experimental outcome of these experiments is not relevant and the authors should consider temporal ZEB1 expression control in transplanted AKP-Z organoids.

      6) The data depicted in Fig 10A & B are confusing and deserve a better explanation. How is it possible that EpCAMlo and EpCAMhi sorted cells show overlapping single-cell expression profiles upon t-SNE plotting, in particular for the SW480 cells? This is contradictory, as the authors claim earlier in the manuscript that EpCAMlo cells have a more mesenchymal gene expression profile, which is then confirmed with the 'EMT signature' analysis. Is there a difference between EpCAM protein expression and EpCAM mRNA expression?

      7) Which cell line does the heatmap of the EMT signature shown in figure 10B represent?

      Overall, the authors link the gene expression signature of EpCAMlo cells with the colon cancer consensus molecular subtype CMS4, which has the worst relapse-free and overall survival (Dienstmann R et al. 2017; Nat Rev Cancer 17, 79-92). There are multiple lines of evidence that the mesenchymal signature in CMS4 colon cancers is due to profound infiltration of stromal cells (CAFs, immune cells), extracellular matrix remodeling, and TGF-beta pathway activation, and is not the consequence of EMT in cancer cells (e.g. Calon et al. 2015; DOI: 10.1038/ng.3225). It is of course possible that a few epithelial cells in this inflammatory context are undergoing a partial EMT, but there is little evidence and this will likely happen in a minority of cells. Together, the authors should revise their manuscript regarding (partial) EMT and CMS4 and put their findings in a more critical context.

    2. Reviewer #2:

      The manuscript by Fodde et al investigates the presence of a population of colorectal cancer cells within commonly used human cell lines that have a propensity to form metastasis to the liver and lung. These cells are marked as being CD44HiEpCamlo and have increased expression of the EMT marker Zeb1. They show that this population of EpCam-low cells is able to drive metastatic colonisation and that this is likely due to levels of Zeb1. These cells have a signature similar to the CMS4 group of colorectal cancers, which are highly invasive.

      The manuscript is generally well written and presented in a stepwise and straightforward manner so is relatively easy to follow.

      There is a lot of data presented in this paper, with 10 primary figures and a number of supplementary figures. I would encourage the authors to look at which data need presenting and ask whether some of the earlier figures in particular could be combined and the paper streamlined. By the time you get to the really interesting data in the organoid transplantation and scRNA-seq, there has been a lot to get through already.

      There are some questions I have about the experimental data and presentation:

      1) Whilst the authors investigate the expression of EpCam and CD44 in cell lines, is there any evidence of this EpCam-low population in primary human tumours? Or primary tumours in the mouse? I appreciate that finding these cells in humans could be rate limiting, but what about in tumours that are generated in mice and are metastatic - specifically I am thinking about the recent work in colon showing that Notch signalling drives colonic to liver metastasis (Jackstadt et al 2019) - do the Notch active cells in this model have lower EpCam levels?

      2) For the FACS plots, could the authors include their complete gating and FMO control gating strategy in the supplementary material? It would be helpful to be able to confirm that the shifts the authors are describing are real.

      3) In figure 2, can the authors quantify the protein expression of Ecad and Zeb1? In one of the panels of the CD44 high EpCam low (SW480 cells) there seem to be cells with quite high levels of EpCam - having a quantified measure of these proteins in the two populations would be important here.

      4) It was very interesting that the different populations gave rise to different metastatic rates following injection through the spleen. Do the authors have information on whether this is because the different populations move out of the spleen and into the liver at different rates (so initiation/seeding) is different or is this a consequence of proliferation i.e. both cell populations colonise the liver, but only the EpCam-low population sticks around and colonises the tissue? Further to this, can the authors delete Zeb1 in the EpCam-low cells (as they have done in vitro) and show that colonisation is Zeb1 dependent - this latter point would not be considered essential given the following overexpression experiments.

      5) Much of the metastatic quantification is done through IVIS imaging (from what I can see) - have the authors pathologically quantified the number and size of tumours following ZEB1 overexpression in AKP derived metastasis with histology?

      6) The authors concede that the continuous activation of Zeb1 following transplantation of AKP organoids (pg9 of the PDF) could be the reason that metastatic colonisation is not as impressive as hoped - have the authors considered pulsing Dox to initiate metastatic colonisation of the liver and then withdrawing Dox to favour proliferation following metastatic seeding? It would be interesting to know whether the timing of Zeb1 expression is important for this phenotype.

      7) As Wnt signalling is important in the establishment of the EpCam-low population, have the authors inhibited this pathway (either at the ligand level or through inhibiting b-cat transcription) to confirm that the population is Wnt responsive?

      8) Finally, linked to point 7. In the scRNA sequencing, in the populations that have increased EMT and EMT-gene expression, does this correlate to a Wnt/B-catenin signature on a single cell level?

    3. Reviewer #1:

      Sacchetti and co-workers have employed established human colorectal cancer cell lines to identify a subpopulation of colorectal cancer (CRC) cells (CD44 high/EpCAM low) which represent cells with high tumorigenicity and malignancy in vitro and in vivo. These cells can also be found in patient-derived tumor organoids and in patient samples. Using bulk and single cell RNA sequencing and subsequent functional validation they go on to demonstrate that enhanced canonical Wnt signaling mediates the expression of the EMT transcription factor ZEB1 and with it an EMT-like process. Consistent with this observation, this cell population exhibits higher drug resistance as compared to the parental cells or to CD44 high/EpCAM high cells. They finally employ a number of cutting-edge computational analyses to classify several subgroups within the EMT cell subpopulation which seem to represent various stages of the EMT continuum, and thus may exhibit various degrees of cell plasticity. The particular gene expression signatures of the identified subpopulations also correlate with poor clinical outcome and with the CMS4 subclass of poor prognosis CRC.

      Overall, the manuscript is presented in a straightforward and concise manner, the experimental approaches are thoughtfully designed and appropriately controlled. However, some of the results, in particular of the first part, are not specifically novel. The correlation between CRC invasion and nuclear β-catenin and ZEB1 has been reported before, as actually appropriately cited by the authors. Moreover, the migratory and invasive and pro-metastatic and drug-resistant phenotype of ZEB1-expressing, EMT-like cancer cells have been shown before and are as expected. Finally, as detailed below, the mechanisms regulating the homeostasis of the EpCAM-low and EpCAM-high cells in cell culture and in organoids in vitro and in cancers in vivo remain elusive. While the novel insights into the potential trajectories of the genesis of the various subpopulations and the respective gene signatures is exciting, the functional validation of these signatures for the definition of cell plasticity and the actual establishment and functional validation of an identifiable gene signature for cell plasticity has not been directly addressed. Along these lines, the report goes with the mainstream literature in using the term "cell plasticity" with a rather vague description. Is it defined by EMT in general or only by a specific hybrid stage of EMT, by therapy resistance, by differentiation potential, by the reversibility of processes, by stemness, etc.? How can it be functionally tested? The manuscript, as it stands, is not adding tangible data and information on how to identify cell plasticity and what it means in terms of identifying and assessing novel therapeutic targets.

      Specific comments:

      Introduction: the literature on the role of Prrx1 in EMT/MET and the need of MET for metastatic outgrowth should be mentioned already in the Introduction. The discovery and functional characterization of the various EMT stages should also be mentioned already in the Introduction, not only in the Discussion. Finally, the term cell plasticity should be defined in the Introduction, at least how it is used in the following chapters.

      Figure 1/Suppl.1: "similarly variable"? There is a variability of 0 - 99.6% for the levels of the CD44-high/EpCAM-low subpopulation in the different CRC cell lines. Notably, there is no correlation of the levels of this subpopulation with the CMS classification of CRC origin, as is claimed later with CMS4.

      Why do the EpCAM-low cells get lost during long-term culture and turn into EpCAM-high, E-cadherin-high cells? How then is the homeostasis between the EpCAM-high and low populations maintained in the parental cells which have been cultured for decades? Also, almost all single cell clones of EpCAM-low cells turn into EpCAM-high over time. Why are some maintaining the EpCAM-low status? Is there a difference in gene expression or epigenetic imprints? Has the fetal calf serum been stripped of TGFβ or does it still contain TGFβ which could induce an EMT?

      Figure 5E, text: the reversibility of EMT by a MET is here used as equal to cell plasticity. Is this a correct definition of cell plasticity (see also above)? The EpCAM-low status seems rather unstable and not metastable in vitro and in vivo, this may not represent the homeostasis of EMT induction and its reversion and thus not true cell plasticity.

      Figure 6: The induction of an EMT by ZEB1 is not new or unexpected as is the increase of metastasis, even though the latter is not statistically significant here. The "excuse" that the incidence of metastasis could be higher, when ZEB1 expression would have been stopped by removing Dox, could have been actually tested. This would be a more meaningful experiment.

      Figure 7: RNA sequencing identifies Wnt signaling to be enhanced in EpCAM-low cells. GSK inhibition induces the expression of ZEB1 (as known before), yet this works only in HCT116 and not in SW480 cells, which actually show an induction of Wnt signaling. The results seem to indicate that there is not just a mere enhancement of Wnt signaling and that other changes/pathways are required as well. What about other cell lines?

      Is the prognostic and predictive value for the gene signature only true for CMS4 CRCs or for all subtypes? Does the EpCAM-low signature and the signatures of the various EMT stages correlate with CMS subtypes, therapy resistance and clinical outcome? This is not really clear from the data presented.

      The scRNA sequencing seems to reflect the EMT full and hybrid stages. The computational analysis is impressive and exciting, the potential trajectories offer a working model which could be experimentally tested by functional validation of the subgroups to finally pinpoint the cell populations with the highest cell plasticity. And most importantly, what defines cell plasticity at the molecular and cellular level? Is it Wnt signaling or something in addition? Here, the reader is left without a clear picture (see also comment on Discussion, below).

      Text: Seurat33 = Stuart33.

      Discussion: What is the mechanistic basis for the "further enhancement" of Wnt signaling? Is it the dose of Wnt signaling or is it the combination with other signaling pathways which cooperate with Wnt transcriptional control, such as Hippo or TGFβ signaling? There could be a hint from the RNA sequencing data to distinguish these possibilities. Do the target gene lists change with the enhancement of Wnt signaling?

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      This manuscript is in revision at eLife.

      Summary:

      While all reviewers see merit in aspects of the work, and indeed the consensus was that there were elements of novelty and interest in this manuscript, they felt that novel advances were limited as presented. Briefly, the manuscript falls into two parts; it is too long with too much data presented and we recommend focus on potentially the most exciting/novel part, i.e. the RNAseq / sc and computational analyses, and extending this to provide further functional validation. Some of the earlier figures reflect quite well understood biology (EMT, Zeb1, Wnt etc in EMT), and would require much more work to tighten up the conclusions; therefore, it was felt that even if these were improved, the data would likely confirm a lot of what we know already. It is true that the role of EMT is controversial - but what is presented in the first part of the manuscript does not add much definitive new data to inform that debate, and indeed the authors' submission letter refers to their 'confirmatory' nature.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Response to the reviewers’ comments:

      We thank both reviewers for their critical evaluation and excellent suggestions to improve the manuscript. We are making all the changes suggested by both reviewers and performing the experiments to address all the concerns, specifically those from reviewer #1. Please find below our response to the reviewers’ comments:

      Reviewer #1:

      This is an interesting study from the Rahaman group that identifies cardiolipin (CL) as a potential binding target for Drp6 recruitment to the nuclear membrane in Tetrahymena (that has a unique nuclear remodeling program). In addition, they identify a residue, I553 in the DTD region, which they claim is a key residue involved in specific CL interactions. While the experiments themselves are technically sound, and are well performed and controlled, I don't find the major conclusion that I553 is involved in direct CL interactions justified or well rationalized. By their own admission (in the discussion), the conservative mutation I553M may perturb local folding and may indirectly affect CL interactions. There is no test of DTD folding with and without the I553M mutation, nor are there other mutations (e.g. I553A and in the vicinity) tested. CD experiments in the absence and presence of CL-containing membranes will likely yield information on the impact of the I553 mutations, while DLS experiments would inform on the hydrodynamic properties (overall 3D fold) of the DTD and the impact of these mutations. CL interactions generally involve a combination of electrostatic and hydrophobic forces. Where do the electrostatic interactions come from? Why would an Isoleucine to Methionine mutation affect the hydrophobic component, even if I553 is the key hydrophobic residue?

      Response:

      We thank the reviewer for the comments that the experiments are sound and well performed with appropriate controls. While we agree that the exact mechanism of how I553 provides specificity to cardiolipin binding is not addressed in the present manuscript, our study clearly demonstrates that the isoleucine at 553 plays an important role in determining cardiolipin specificity and nuclear recruitment. As pointed out by the reviewer, it is possible that changing isoleucine to methionine may affect the local conformation. However, there is no major conformational change in the DTD due to this mutation. This conclusion is based on the clear loss of nuclear localization and cardiolipin interaction for the mutant without affecting other properties. The in vitro floatation assay clearly establishes that the effect is direct, arising from inhibition of the interaction specifically with cardiolipin-containing membranes. It should be further noted that the same domain, DTD, interacts with two other lipids (PS and PA) and the mutant retains interaction with them, arguing that the conformation of this domain is not significantly changed by the I to M mutation. Consistent with these results, the I553M mutant could be targeted to the nuclear membrane as a complex with wildtype Drp6, further confirming that I553 could form the correct self-assembled structure with wildtype protein required for association with the nuclear membrane. This is further substantiated by comparing all the known biochemical properties including GTPase activity, membrane binding via the other two lipids, and formation of helical spirals and ring structures. Hence it is clear that I553 provides specificity to bind cardiolipin and recruitment to the nuclear membrane. We will further confirm whether there is any local conformational change due to the I to M mutation by fluorescence quenching experiments, and the results will be incorporated in the revised manuscript.

      Regarding overall folding of the mutant, this is an excellent suggestion by the reviewer. We are planning to perform CD experiments of the I553M mutant and wildtype proteins to compare if there is any change in overall folding due to mutation. This result would be incorporated in the revised manuscript.

      Reviewer is right to point out that both electrostatic and hydrophobic interactions are important for interaction with cardiolipin. Electrostatic interaction is important for all the phospholipids while interacting with protein and is expected to come from other amino acid residues which are positively charged. Electrostatic interaction may contribute to the affinity of the interaction by providing additional binding energy. But considering its universal nature of interaction with all the phospholipids, it cannot give specificity for a specific lipid and hence would not discriminate among different phospholipids.

      Regarding the effect on the hydrophobic component, the reviewer is correct that both are strongly hydrophobic amino acids and the loss of I553M interaction with cardiolipin may not be due to a change in hydrophobicity.

      To test whether the loss of cardiolipin interaction is not specific to methionine and is due to the absence of isoleucine, the suggestion from the reviewer to replace I553 with A (alanine) is an excellent one. We are doing the experiments and we anticipate incorporating these results in our revised manuscript.

      Reviewer #1 (Significance (Required)):

      The addressed phenomenon is restricted to Tetrahymena and may not have far reaching implications. Regardless, the identification of CL as a binding target for Drp6 at the nuclear membrane of this organism is in itself significant. The conclusion that I553 is the key CL binding residue is however not warranted. Additional experiments are needed to dissect how this residue impacts CL interactions and examine whether the observed effect is direct or indirect.

      Response:

      We thank the reviewer for appreciating the significance of this work. We agree that our data is Tetrahymena specific. However, we believe that the study is relevant for all the proteins whose association with target membranes depends on cardiolipin, including many cardiolipin-interacting DRPs (such as DRPs involved in biogenesis and maintenance of mitochondria).

      We really appreciate the reviewer for the excellent suggestions. Based on this we are performing the following experiments.

      1. CD experiments to assess overall folding of I553M and Wildtype protein
      2. Fluorescence quenching of Tryptophan (at amino acid position 548) residue in the vicinity of I553 to compare conformation of the mutant with that of wildtype protein.
      3. Evaluation of I553A in nuclear localization and cardiolipin binding. We anticipate these results to further confirm if I553 is the key CL binding residue and if the effect is direct.

      The writing is not clear in some parts and may require a round of language editing. There are no issues with reproducibility.

      Response

      We thank the reviewer for pointing out the need for language editing. We will edit the language wherever we find it appropriate. We would highly appreciate it if the reviewer could indicate the portions that need special attention.

      Reviewer #2:

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Dynamin is a GTPase superfamily protein involved in membrane fusion and division. This paper focused on Drp6, one of the eight dynamin superfamily proteins of Tetrahymena, and analyzed its nuclear envelope localization mechanism by a combination of in vivo cytogenetical analysis and in vitro biochemical analysis for the various mutant Drp6 proteins. Results showed that a specific amino acid residue (isoleucine at the 553rd) in the membrane binding domain of Drp6 was required for its nuclear membrane localization, but this residue is not required for ER/endosome localization and GTPase activity. Furthermore, in vitro floating analysis using centrifugation indicated that Drp6 specifically bound to the cardiolipin at the 553rd isoleucine residue and this binding was required for Drp6's nuclear membrane localization. Finally, removal of cardiolipin from the conjugating cells using inhibitor treatment showed that cardiolipin was required for the new macronucleus formation (including the expansion of macronuclear envelope) through the function of Drp6. Based on these results, authors concluded that cardiolipin targets Drp6 to the nuclear membrane in Tetrahymena.

      Major comments:

      The experimental data presented in this paper are reasonable and the results are solid, and therefore I think the deduced conclusions are convincing. However, to improve this paper, I have several minor comments to be revised before publication.

      Minor comments:

      1. In the previous paper, it has been shown that GFP-Drp6 is localized in the inner nuclear membrane of both macronucleus and micronucleus. In this paper, however, this point is not clearly stated and is not shown in the figures --- I could not understand such localization pattern of GFP-Drp6 in Fig. 1C and Fig. 3b and the statements in the text. I suggest adding such statements somewhere in Introduction or Result section. Also, add adequate references to the corresponding statements in the text.
      2. Related to the comment 1, I suggest replacing Fig. 1C (images of fixed cells) with Fig. S1B (images of live cells) because nuclear localization of GFP-Drp6 are much clearer in Fig. S1B (live cell) than Fig. 1C (fixed cell), and because fixation may cause artificial redistribution of the proteins. Please add arrows in those figures to point out the position of micronucleus in those figures if necessary.
      3. Similarly, I suggest replacing images of Fig. 5B (fixed cells) with those of Fig. S3 (live cells).
      4. page 7, line 224: GFP-Nup3 is used as a marker protein of the nuclear pore complex (NPC). However, there is no description of how GFP-Nup3 is obtained or made. Add description how this DNA plasmid was obtained or generated.
      5. Related to the comment 4, "Nup3" is first discovered in Malone et al., Eukaryotic Cells, 2009, but also soon after discovered as the name of "MicNup98B" in Iwamoto et al., Curr Biol, 2009 and used in several papers including Iwamoto et al., Genes Cells, 2010; JCS, 2015; JCS 2017; and more. Because Nup3 is the Tetrahymena paralogs of human Nup98 and the name of "Nup98" is well established to call these homologs in various eukaryotes, I suggest adding the name of "MacNup98B" after the word of "Nup3" for reader's better understanding. I also suggest adding appropriate references to refer to this protein as follows: Add Malone et al. 2009 for "Nup3" and Iwamoto et al., 2009 for "MacNup98B."
      6. page 9, line 295: I wonder if "Fig. 3b" may be a mistake of "Fig. 5C." If so, please correct this.
      7. page 10, the second paragraph (lines 311-322): This paragraph discussed the possible involvement of Drp6 in the nuclear envelope expansion of the post-zygotic nucleus. It may be interesting to point out that large-scale nuclear envelope reorganization including the formation of the redundant nuclear envelope and the type-switching of the NPC (from the MIC-type NPC to the MAC-type one) has been reported at this developmental stage (Iwamoto et al., JCS 2015). For example, the peculiar shaped nuclear envelope with the redundant/overlapping nuclear envelope structure can be seen and the MAC-type NPCs rapidly assembles to the expanding nuclear envelope. It may be interesting to point out that cardiolipin and Drp6 may be involved in these phenomena. But it is too speculative and therefore consider adding such a discussion as an option.
      8. page 13, line 412: Is the word "GFP-drp6-I553M" written in italics intended for the gene for the GFP-drp6-I553M protein? If so, protein may be acceptable here. Make sure there are no problems with italicized characters. Also, check if the lowercase letter "d" in "drp6" is OK because large letters are used in other cases.
      9. page 20, figure 1: I recommend switching the positions of HDyn1 and Drp6 in Figure 1a to keep the order in Figure 1b.
      10. page 21, line 671: Add the word "Tetrahymena" before "Drp 6" to pair with the word "human dynamin 1".
      11. page 23, line 729: Remove "and."
      12. page 23, lines 729 and 731: Unify the expression of "cardiolipin" and "Cardiolipin"
      13. page 23, line 732: Add "or" before "10% Phosphatidylserin."
      14. page 24, Figure 3a: Please mark the position of I553M in the figure if possible. Alternatively, indicate the range of amino acid residues after the words "red" and "green" in the figure legend.

      Response:

      We thank the reviewer for the excellent comments that “the experimental data presented in this paper are reasonable and the results are solid, and therefore I think the deduced conclusions are convincing.” We also thank the reviewer for the minor comments, which are thorough and very insightful. They will improve the manuscript substantially. We will incorporate all the changes in the revised manuscript.

      Reviewer #2 (Significance (Required)):

      The corresponding author and his colleagues have reported that Tetrahymena Drp6 is localized to the outer nuclear membrane of both macronucleus and micronucleus of Tetrahymena (Elde et al., 2005) and that Drp6 is required for the formation of new macronuclei during nuclear differentiation (Rahaman et al., 2008). Therefore, these parts are not novel.

      The novelty of this study is as follows:

      (1) The discovery of a specific amino acid residue (isoleucine at the 553rd) of Drp6 that is required for its nuclear membrane localization.

      (2) the discovery of a lipid molecule, cardiolipin, as a critical partner for Drp6's nuclear membrane targeting.

      (3) Discovery of involvement of cardiolipin in the new macronucleus formation (the expansion of macronuclear envelope) through the function of Drp6.

      I think their findings are highly novel and will provide new insight into a field of cell biology. Especially, their findings will contribute to understanding how specific proteins targeted to the specific intracellular membranes. In addition, their methods (such as floatation assay) for analyzing the interaction between the protein of interest and lipid/liposomes will become an important tool.

      Response:

      We are very happy to note that the reviewer has pointed out the significance of the present study. We fully agree with the reviewer and appreciate the thorough analysis and excellent conclusion.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Dynamin is a GTPase superfamily protein involved in membrane fusion and division. This paper focused on Drp6, one of the eight dynamin superfamily proteins of Tetrahymena, and analyzed its nuclear envelope localization mechanism by a combination of in vivo cytogenetical analysis and in vitro biochemical analysis for the various mutant Drp6 proteins. Results showed that a specific amino acid residue (isoleucine at the 553rd) in the membrane binding domain of Drp6 was required for its nuclear membrane localization, but this residue is not required for ER/endosome localization and GTPase activity. Furthermore, in vitro floating analysis using centrifugation indicated that Drp6 specifically bound to the cardiolipin at the 553rd isoleucine residue and this binding was required for Drp6's nuclear membrane localization. Finally, removal of cardiolipin from the conjugating cells using inhibitor treatment showed that cardiolipin was required for the new macronucleus formation (including the expansion of macronuclear envelope) through the function of Drp6. Based on these results, authors concluded that cardiolipin targets Drp6 to the nuclear membrane in Tetrahymena.

      Major comments:

      The experimental data presented in this paper are reasonable and the results are solid, and therefore I think the deduced conclusions are convincing. However, to improve this paper, I have several minor comments to be revised before publication.

      Minor comments:

      1. In the previous paper, it has been shown that GFP-Drp6 is localized in the inner nuclear membrane of both macronucleus and micronucleus. In this paper, however, this point is not clearly stated and is not shown in the figures --- I could not understand such localization pattern of GFP-Drp6 in Fig. 1C and Fig. 3b and the statements in the text. I suggest adding such statements somewhere in Introduction or Result section. Also, add adequate references to the corresponding statements in the text.
      2. Related to the comment 1, I suggest replacing Fig. 1C (images of fixed cells) with Fig. S1B (images of live cells) because nuclear localization of GFP-Drp6 are much clearer in Fig. S1B (live cell) than Fig. 1C (fixed cell), and because fixation may cause artificial redistribution of the proteins. Please add arrows in those figures to point out the position of micronucleus in those figures if necessary.
      3. Similarly, I suggest replacing images of Fig. 5B (fixed cells) with those of Fig. S3 (live cells).
      4. page 7, line 224: GFP-Nup3 is used as a marker protein of the nuclear pore complex (NPC). However, there is no description of how GFP-Nup3 is obtained or made. Add description how this DNA plasmid was obtained or generated.
      5. Related to the comment 4, "Nup3" is first discovered in Malone et al., Eukaryotic Cells, 2009, but also soon after discovered as the name of "MicNup98B" in Iwamoto et al., Curr Biol, 2009 and used in several papers including Iwamoto et al., Genes Cells, 2010; JCS, 2015; JCS 2017; and more. Because Nup3 is the Tetrahymena paralogs of human Nup98 and the name of "Nup98" is well established to call these homologs in various eukaryotes, I suggest adding the name of "MacNup98B" after the word of "Nup3" for reader's better understanding. I also suggest adding appropriate references to refer to this protein as follows: Add Malone et al. 2009 for "Nup3" and Iwamoto et al., 2009 for "MacNup98B."
      6. page 9, line 295: I wonder if "Fig. 3b" may be a mistake of "Fig. 5C." If so, please correct this.
      7. page 10, the second paragraph (lines 311-322): This paragraph discussed the possible involvement of Drp6 in the nuclear envelope expansion of the post-zygotic nucleus. It may be interesting to point out that large-scale nuclear envelope reorganization including the formation of the redundant nuclear envelope and the type-switching of the NPC (from the MIC-type NPC to the MAC-type one) has been reported at this developmental stage (Iwamoto et al., JCS 2015). For example, the peculiar shaped nuclear envelope with the redundant/overlapping nuclear envelope structure can be seen and the MAC-type NPCs rapidly assembles to the expanding nuclear envelope. It may be interesting to point out that cardiolipin and Drp6 may be involved in these phenomena. But it is too speculative and therefore consider adding such a discussion as an option.
      8. page 13, line 412: Is the word "GFP-drp6-I553M" written in italics intended for the gene for the GFP-drp6-I553M protein? If so, protein may be acceptable here. Make sure there are no problems with italicized characters. Also, check if the lowercase letter "d" in "drp6" is OK because large letters are used in other cases.
      9. page 20, figure 1: I recommend switching the positions of HDyn1 and Drp6 in Figure 1a to keep the order in Figure 1b. 
      10. page 21, line 671: Add the word "Tetrahymena" before "Drp 6" to pair with the word "human dynamin 1".
      11. page 23, line 729: Remove "and."
      12. page 23, lines 729 and 731: Unify the expression of "cardiolipin" and "Cardiolipin"
      13. page 23, line 732: Add "or" before "10% Phosphatidylserin."
      14. page 24, Figure 3a: Please mark the position of I553M in the figure if possible. Alternatively, indicate the range of amino acid residues after the words "red" and "green" in the figure legend. 

      Significance

      The corresponding author and his colleagues have reported that Tetrahymena Drp6 is localized to the outer nuclear membrane of both macronucleus and micronucleus of Tetrahymena (Elde et al., 2005) and that Drp6 is required for the formation of new macronuclei during nuclear differentiation (Rahaman et al., 2008). Therefore, these parts are not novel.

      The novelty of this study is as follows: (1) The discovery of a specific amino acid residue (isoleucine at the 553rd) of Drp6 that is required for its nuclear membrane localization. (2) the discovery of a lipid molecule, cardiolipin, as a critical partner for Drp6's nuclear membrane targeting. (3) Discovery of involvement of cardiolipin in the new macronucleus formation (the expansion of macronuclear envelope) through the function of Drp6.

      I think their findings are highly novel and will provide new insight into a field of cell biology. Especially, their findings will contribute to understanding how specific proteins targeted to the specific intracellular membranes. In addition, their methods (such as floatation assay) for analyzing the interaction between the protein of interest and lipid/liposomes will become an important tool.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      This is an interesting study from the Rahaman group that identifies cardiolipin (CL) as a potential binding target for Drp6 recruitment to the nuclear membrane in Tetrahymena (that has a unique nuclear remodeling program). In addition, they identify a residue, I553 in the DTD region, which they claim is a key residue involved in specific CL interactions. While the experiments themselves are technically sound, and are well performed and controlled, I don't find the major conclusion that I553 is involved in direct CL interactions justified or well rationalized. By their own admission (in the discussion), the conservative mutation I553M may perturb local folding and may indirectly affect CL interactions. There is no test of DTD folding with and without the I553M mutation, nor are there other mutations (e.g. I553A and in the vicinity) tested. CL interactions generally involve a combination of electrostatic and hydrophobic forces. Where do the electrostatic interactions come from? Why would an Isoleucine to Methionine mutation affect the hydrophobic component, even if I553 is the key hydrophobic residue? Additional experiments are therefore essential to identify the actual residues involved in specific CL interactions. CD experiments in the absence and presence of CL-containing membranes will likely yield information on the impact of the I553 mutations, while DLS experiments would inform on the hydrodynamic properties (overall 3D fold) of the DTD and the impact of these mutations.

      The writing is not clear in some parts and may require a round of language editing. There are no issues with reproducibility.

      Significance

      The addressed phenomenon is restricted to Tetrahymena and may not have far reaching implications. Regardless, the identification of CL as a binding target for Drp6 at the nuclear membrane of this organism is in itself significant. The conclusion that I553 is the key CL binding residue is however not warranted. Additional experiments are needed to dissect how this residue impacts CL interactions and examine whether the observed effect is direct or indirect.

    1. Reviewer #3:

      This is a great paper that takes a modelled somatosensory microcircuit and, without parameter adjustment, asks whether stimulus-specific adaptation is capable of emerging. The ability to remove synaptic depression and spike-frequency adaptation, in both thalamo-cortical and cortico-cortical populations, was a definite highlight for me. Primary negatives were minimal mention of certain aspects of connectivity, and a complete lack of any mention of interneuron processing and its known role in SSA.

      Major Comments:

      1) The NMC model is derived from somatosensory cortex. It's not really discussed at all in the paper, but is the assumption that auditory cortex is similar enough in structure that it is valid to model it with a somatosensory model? Although I'm not a somatosensory expert, there are certainly numerous connectivity differences between auditory and visual cortices (interactions between L6 CT neurons, and the local cortical column for example).

      2) It was not immediately clear to me how exactly the MGB->ACtx was wired up, and consequently, how this wiring affected tuning bandwidth in ACtx. I don't think it was a one-to-one mapping that was used, because there is talk of multiple TC afferents innervating a single cell, but this should be described in detail. How do these connectivity choices affect bandwidth, at a layer-specific level? (i.e. one could imagine a broadly tuned neuron being so because it's integrating auditory information from heterogeneously tuned thalamic neurons).

      3) Related to points 1&2, it looks from Figure 1C, that the TC input is generating a tonotopically ordered map in ACtx? Is this the case? If so, in light of many recent papers that have shown substantial local heterogeneity in ACtx frequency tuning, this is not particularly plausible.

      4) I appreciate that this is not the focus of the paper, but it wasn't clear to me whether the NMC model consisted primarily of excitatory neurons, or whether there were inhibitory neurons that were included in the analysis. If the population is mixed, then this will affect interpretation of the depression experiments. In some sense, this is also my biggest negative about the paper - there is almost no mention of interneurons at all, even though interneurons also play an important role in SSA (given that they shape frequency-dependent responses) - this has been the focus of several publications from the Geffen Laboratory.

      5) It was mentioned in the discussion that the model was not capable of replicating layer-specific SSA values. Related to this, does the model capture layer-specific changes in frequency tuning properties (i.e. layer 5b pyramidal cells have far broader tuning than other cell-types)? And if not, might this affect the SSA differences, especially given how important bandwidth is in shaping SSA (TC afferents responding to both deviant and standard)?

      6) Were there any layer-specific effects on removal of thalamo-cortical vs cortico-cortical, that could be linked to the fact that different excitatory cell-types in ACtx have vastly different laminar connectivity patterns (L6 CT translaminar inhibition, L5 PT vs IT, for example)?

      7) How does the model connectivity map onto the distinct morphology of heterogeneous cell-types throughout the cortex, and does this morphology affect the SSA? (The large apical dendrites of L5b neurons, for example, will play a huge role on how they integrate ascending sensory input).

    2. Reviewer #2:

      In this study, the authors aim to explain the mechanisms responsible for induction of stimulus specific adaptation (SSA). As the model system, the authors pick the auditory cortex, where this phenomenon has been well explored. But the mechanisms they identify (synaptic depression, spike frequency adaptation, and recurrent connectivity) are general. It is thus plausible that their conclusions generalize beyond the auditory modality. I think the study is well conceived, its message well communicated, and the specific conclusions the authors make are well supported by the (model) data. The study demonstrates how high biological fidelity modeling, which has been gaining traction in neuroscience, can serve as a testbed for rapid evaluation of hypotheses and elucidation of the mechanisms behind brain computation.

      That said, I have several major comments:

      1) I am concerned about the novelty/impact of the study. The impact of the present study can be viewed through two lenses:

      (a) The novelty and added value of the modelling approach itself. While I am very enthusiastic about the merits of the high fidelity modeling used in the present study, this modeling approach has now been well established across multiple manuscripts. The cortical model itself is already published, while I do not think the MGB extension of the model itself represents a significant advancement.

      (b) The impact of the findings of the study itself. The study claims one main novel finding: contribution of the SFA in combination with recurrent cortical connectivity to the SSA. The contribution of SFA to SSA doesn't seem particularly surprising, and as the authors write, it has indeed already been proposed. The impact of recurrent connectivity on SSA has also already been explored by a previous model (Yarden et al. 2014). Furthermore, my understanding is that the model was for the first time able to replicate the weaker presence of SSA in thalamo-cortical layers, and the dependence of SSA on frequency preference of the neuron. It is my understanding that all other replicated phenomena have already been demonstrated in previous models.

      2) I was surprised no comment was made on (a) the potential difference between the anatomy of the auditory cortical column in comparison to the somatosensory column, which the present model has been designed around, and (b) the lack of functionally specific connectivity, that at least in other sensory cortices (e.g. V1) has been shown to play an instrumental role in shaping the computation. This is particularly surprising in the context of the inability of the model to reproduce some of the interesting findings on SSA (distribution of SSA values in different cortical layers, specific deviance sensitivity), and on the other hand the level of optimism on the future of the model expressed in the last paragraphs of the discussion. I think that for the modelling approach to fulfill such optimistic goals in future, both these major problems will have to be addressed, which represents a major body of new work - this should be acknowledged.

      3) I am concerned about the lack of functional verification of the model. Do the cortical neurons, for example, have frequency tuning curve characteristics that match auditory data well? Unfortunately, I am not an A1 expert, but I would expect that a wealth of data on elementary functional properties of A1 neurons exists. This represents somewhat of a paradox, where the model is at some level extremely detailed and well matched to experimental data, which (justifiably) the authors sell as a major advantage. But it is surprisingly poorly validated against the elementary computations that A1 performs, which in the context of this study is just as important as the anatomical fidelity, if not more so. I feel that, at minimum, this issue warrants thorough discussion, both in the context of the SSA, and the modelling approach itself.

    3. Reviewer #1:

      This study investigates whether a detailed biophysical model of a cortical column, simulating more than 30,000 fully detailed neurons, is able to reproduce a well known property of the auditory cortex: stimulus specific adaptation or SSA. SSA has been successfully reproduced in a simplistic model which shows that adaptation mechanisms explain the qualitative phenomenology of this effect (decreased responsiveness for repeated stimuli, specific to the repeated sound and to sounds whose representation overlaps within the repeated sound). Here the authors aimed at testing whether, without any parameter optimization, a detailed biophysical model is able to reproduce the observed phenomenon. As the model contains two well-known adaptation mechanisms, synaptic depression and spike frequency adaptation, unsurprisingly, a qualitative match between natural SSA and modeled SSA is observed. Moreover, effects related to representation overlap are found by including a mostly data-driven representation model and without fine tuning. Finally, the biophysical model suggests that both synaptic depression and spike frequency adaptation (SFA) contribute to SSA and that SFA exclusively contributes to the asymmetry of cross frequency adaptation with respect to the preferred frequency, which is observed both in the model and in the data, and can be explained by asymmetry of cochlear representations.

      This is a nice and important exercise to test the efficiency of a so-called detailed model at reproducing basic experimental observation. Unfortunately, here the model performs very well qualitatively but not quantitatively as little quantitative match is observed with spike data from auditory cortex (Figure 5). In fact there is little comparison with actual data, and this is disappointing. One of the purposes of detailed models is to identify their limitations and thereby identify useful details that may have been missed or incorrectly measured. Unfortunately, the quantitative mismatch in Fig. 5 is not mentioned in the results and no attempt is made to fill the gap. Hence, the conclusions of the paper do not go much beyond the well known role of adaptation and representation overlap. The identification of a measure to separate the two components, depression and SFA, is a nice contribution, but it is not tested experimentally, so it remains to be done (e.g. suppressing recurrencies by tetanus toxin light chain) to validate this hypothesis.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      All reviewers have acknowledged the value of a detailed model of auditory cortex, and expressed their support for an integrative approach building the link between neural circuit details and observables. It was found particularly interesting that two complementary mechanisms could play a role in stimulus specific adaptation (SSA). Nevertheless, while the reviewers recognized that the simulations were technically sound and that the conclusions represent interesting hypotheses to pursue about the mechanisms of SSA in auditory cortex, they all felt that the precision with which the specificity of auditory cortex circuits was modeled, or with which the SSA observables were captured, was not sufficient to demonstrate the advantage of the detailed modeling approach with respect to previous simpler models which reached similar conclusions.

    1. Reviewer #2:

      The authors describe the dependence of the p-value on sample size (which is true by definition) and offer a solution, using simulated data and an applied example.

      I'm not sure that the introduction successfully motivates the paper. It is unclear whether this is due to misunderstandings by the authors of some key points, or rather is a matter of awkward communication, such that the authors' intentions are not accurately conveyed.

      The authors note the link between the p-value and sample size. In particular, the authors suggest that statistical significance can be achieved by using a sufficiently large sample size, and they call this 'p-hacking'. I certainly don't recognise use of a large sample size as an example of p-hacking. Instead, this term refers to analytical behaviours which cause the p-value to lose its advertised properties (advertised type 1 error rate). Examples would include taking repeated looks at data without making any appropriate adjustment, trying tests on different groupings of data (and selecting results on the basis of significance), or trying different definitions of an outcome measure. The key point is that, when these actions are performed, reported p-values are no longer valid p-values - they do not behave as they are supposed to. So straight away the authors' argument becomes confusing. Are they criticising the behaviour of the valid p-value? Or are they trying to criticise behaviours that cause the p-value to lose its stated properties? This point remains very unclear. I believe the authors are attempting the former, but wrongly describe this as an example of p-hacking.

      But other statements in the introduction invite further confusion. The authors say " even when comparing the mean value of two groups with identical distribution, statistically significant differences among the groups can always be found as long as a sufficiently large number of observations is available using any of the conventional statistical tests (i.e., Mann Whitney U-test (Mann and Whitney, 1947), Rank Sum test (Wilcoxon, 1945), Student's ttest (Student, 1908)) (Bruns and Ioannidis, 2016)." Again, it is unclear what the authors are trying to say here, and the statement is clearly false under the most obvious interpretation. If the authors are saying that significance will always be found when the null is true and model assumptions are correct provided that the sample size is large, then this is clearly false. In this case, the test will reject the null 5% of the time, using a significance threshold of 5%. The authors can easily confirm this for themselves with a simple simulation. Are the authors trying to make the point that the error rate is conditional not only on the null, but also on the test assumptions (and so when they are violated the test may reject erroneously?) They certainly do not state this, and the fact that they refer to 'identical distribution' suggests otherwise. Another way the test assumptions could be violated is if actual p-hacking (see examples above) were present, such that the reported p-values were no longer valid. Again, the authors do not tell us that this is what they mean, if they in fact do, and this would be a criticism of p-hacking behaviours rather than of the p-value.
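
      The simulation the reviewer suggests can be sketched as follows. This is a minimal illustration, not the authors' analysis: it assumes two groups drawn from N(0,1), a two-sided z-test with known unit variance (chosen so that the example needs only the standard library), and the hypothetical function names `two_sample_z_pvalue` and `rejection_rate`.

```python
import math
import random


def two_sample_z_pvalue(x, y):
    """Two-sided p-value for equal means, assuming known unit variance in both groups."""
    n, m = len(x), len(y)
    z = (sum(x) / n - sum(y) / m) / math.sqrt(1.0 / n + 1.0 / m)
    return math.erfc(abs(z) / math.sqrt(2.0))


def rejection_rate(n, mu_diff=0.0, alpha=0.05, n_sims=2000, seed=0):
    """Fraction of simulated experiments with p < alpha when the true mean difference is mu_diff."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(mu_diff, 1.0) for _ in range(n)]
        if two_sample_z_pvalue(x, y) < alpha:
            rejections += 1
    return rejections / n_sims


# Identical distributions (the null is true) at a small and a larger sample size.
rate_small = rejection_rate(n=20)
rate_large = rejection_rate(n=200)
print(f"n=20: {rate_small:.3f}, n=200: {rate_large:.3f}")
```

      Under these assumptions the rejection rate should stay near the 5% threshold at both sample sizes. Setting `mu_diff` to a small non-zero value such as 0.01 makes the rate grow with n, which reflects the test gaining power against a false null rather than a defect of the p-value.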

      When they write "big data can make insignificance seemingly significant by means of the classical p-value" they might be thinking of confusion between statistical and practical significance, which is a common misinterpretation made in the presence of large data size, but again, if this is what the authors are thinking of they should say it. The discussion by Greenland (Valid P-Values Behave Exactly as They Should: Some Misleading Criticisms of P-Values and Their Resolution With S-Values, especially section 4.3) seems to address the concerns raised by the authors fairly decisively. For a given parameter size, increasing sample size should produce stronger evidence against the null. The p-value does not tell you about the size of the parameter directly - it measures the discrepancy between the data and the null - interpreted correctly, there is no problem.

      So, with apologies to the authors, I don't think they are successful in convincing the reader that there is a problem to be solved, and the manner of presentation (which may just be an issue of communicating the authors' intentions) is such that it causes doubt about the authors' handling of the relevant concepts. Throughout the text, there are other confusing presentations around fundamental concepts. E.g. the authors write things like "Hence, we claim that whenever there exist real statistically significant differences between two samples..." I know what a real difference is, but what is a real statistically significant difference? There are no statistically significant differences in nature. Are the authors trying to refer to instances where the null is false and is rejected? Or, are they trying to say that a 'real significant difference' is where the difference exceeds some magnitude?

      For example - the authors write things such as "When 𝑁(0,1) is compared with 𝑁(0,1), 𝑁(0.01,1) and 𝑁(0.1,1), 𝜃 is null; so those distributions are assumed to be equal. In the remaining comparisons though, 𝜃 = 1, thus there exist differences between 𝑁(0,1) and 𝑁(𝜇,1) for 𝜇 ∈ [0.25,3]", highlighting the fact that perhaps the authors really want to address the practical significance vs statistical significance issue (although again, this is not explicitly stated). If the authors are interested in size of effect/ difference, then it is not clear that this proposal offers any advantage in that regard over the p-value (which, as noted, does not tell us about the size of a parameter). If interest is in size, then it is unclear why the authors do not direct the reader to consider the estimate and confidence interval, so that they may consider this explicitly in terms of magnitude and precision.

      With apologies to the authors, who have clearly spent a large amount of time on this - I would think that the best way forward here would be to post this as a preprint and to try to invite as much feedback as possible. The authors have lofty ambitions with this work. Maybe there is a good underlying idea here, obscured by the presentation? Unfortunately, it is difficult to assess this at present.

    2. Reviewer #1:

      The paper sets out to confront p-hacking and address the dependence of the p-value on the sample size. The paper sets out the motivation behind the problem and then proposes a solution using three examples.

      I have a major problem with this work in that I do not understand the motivation and hence cannot judge the value of the proposed solution.

      The authors need to set out some definitions which might help them frame the context. I outline below what I understand as the context and hence why I do not understand how their proposal will address the problem.

      Firstly, 'p-hacking' is the term usually reserved for when researchers do not follow a pre-specified protocol on how a research question will be answered through the statistical analysis of a resource, single study or experiment, but instead analyse the data in many ways. Maybe they use slightly different assumptions, adjust the definition of an outlier or who is eligible for inclusion, or switch to a different outcome variable. In this manner they select to report the analysis that gives the smallest p-value (Ioannidis referred to some of this as vibration effects). This is a major problem in science, but it is not only a problem of the size of the data available, although the bigger the dataset, the more subgroups can be analysed. The main problem here is that we do not know how many ways the data have been analysed; we only know what researchers have selected to report. The manuscript does not address this problem at all.

      The p-value is defined as the probability of observing a result as or more extreme when the null hypothesis is true. In most settings the 'null' is that there are no differences between two or more groups, for example that all the means are the same or equal. Often this translates into the statement that we expect the distribution of p-values under the null to be uniformly distributed [0,1]. This can be demonstrated or checked by simulation. In the hypothesis testing framework we usually power our studies so we will be able to detect a (true) difference between two groups with some high probability. The specific difference we are interested in would be called the alternative hypothesis. Hence the p-value is used to reject the null, but under the alternative hypothesis the p-value will not be uniform [0,1]. It is well known that the larger your sample size, the more precise the estimates you will obtain and the smaller the differences you will be able to detect. Sample size calculations require a specific alternative to be stated (e.g. a difference in means of 0.5 of a standard deviation); then a sample size that guarantees a specific power for the specific type 1 error can be calculated.
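
      The sample size calculation described above can be sketched with the usual normal approximation for comparing two means with equal group sizes. This is an illustrative approximation, not a calculation taken from the manuscript; the function name `n_per_group` and the default power of 80% are assumptions.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_sd, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    effect_size_sd is the difference in means in units of the common standard
    deviation. Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2 (normal approximation).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2.0 * (z_alpha + z_power) ** 2 / effect_size_sd ** 2)


# The reviewer's example alternative (0.5 SD) versus a very small difference (0.01 SD).
print(n_per_group(0.5), n_per_group(0.01))
```

      Because the required n scales with 1/d², a difference of 0.01 SD needs 2,500 times as many observations per group as a difference of 0.5 SD for the same power, which is the sense in which large datasets can detect very small differences.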

      This manuscript is confusing properties of the p-value when there are no differences and minimal differences between the two groups. I think the authors are trying to make the point that a statistically significant result is not necessarily a clinically or biologically meaningful result. They have done some simulations to show the distribution of the p-value when the true difference between the two means is 0.01. This is an example of an 'unimportant' difference, but it is not the null. This problem is best addressed by reporting effect sizes and 95% confidence intervals for quantities of interest rather than trying to adjust p-values in some way. Obviously when we have access to large datasets we may have a much larger sample than we needed to detect a meaningful effect though we may find small p-values. Adjusting the p-values will not really help as it is the effect sizes that are of interest.
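
      Reporting an effect size with a 95% confidence interval, as recommended above, can be sketched as follows. This is a minimal illustration under stated assumptions: a large-sample normal approximation for the interval, Cohen's d with a pooled standard deviation as the standardised effect size, and hypothetical function names.

```python
import math
import statistics


def mean_difference_with_ci(x, y, confidence=0.95):
    """Difference in means (y minus x) with a normal-approximation confidence interval."""
    diff = statistics.fmean(y) - statistics.fmean(x)
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return diff, diff - z * se, diff + z * se


def cohens_d(x, y):
    """Standardised mean difference using the pooled sample standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.fmean(y) - statistics.fmean(x)) / math.sqrt(pooled_var)


# Tiny deterministic example so the numbers can be checked by hand.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0]
diff, ci_low, ci_high = mean_difference_with_ci(x, y)
print(f"difference = {diff:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}], d = {cohens_d(x, y):.2f}")
```

      With a very large sample and a true difference of 0.01, the same functions would return a narrow interval around a value close to 0.01, which lets the reader judge directly that the difference is small regardless of how small the p-value is.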

      I feel the manuscript needs to be redrafted to be clearer about the problem the authors are trying to fix.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary:

      The authors describe the dependence of the p-value on sample size (which is true by definition) and offer a solution, using simulated data and an applied example. Unfortunately, both reviewers found it difficult to understand the motivation for the work and hence both had difficulty judging the value of the proposed solution. Detailed comments and suggestions are provided below.

    1. Reviewer #3:

      This is a manuscript by Karimi-Rouzbahani et al, about the neural encoding of facial familiarity using EEG and MVPA.

      I essentially found the article interesting, clear and using solid methods. Besides a few minor comments, which I list below, I found only one major issue which has to be addressed.

      Major comment:

      My only major problem with the results lies in the simple interpretation of anterior contributions to the encoding of familiarity as feed-back. You find, using a clever partialling out method, that eliminating the occipital contributions from the frontal (or rather anterior, as it involves temporal cortex too) electrode pattern reduces familiarity information more strongly, earlier and for longer than the opposite analysis, in which you partial out the frontal information from the occipital/posterior electrode pattern. The former is interpreted as a signal of feed-back, the latter as feed-forward information flow. This makes sense, but only if the frontal cortex does not play a role, in its own right, in face processing. However, the inferior frontal face area (see e.g. Collins and Olson, 2014) is known to be associated with the STS and to play a role in social, dynamic and eye-movement related information processing. If we assume that these tasks are more related to the frontal than to the posterior areas, as for example Duchaine and Yovel, 2015 do, then the results of the partialling out analysis merely mean that the functions of the frontal areas are modulated more by the posterior areas (in other words, in those functions the parietal areas also play a role) than the other way around. The lower-level functions of the posterior sites are, on the other hand, modulated less, for a shorter time and later by the removal of frontal areas; in other words, the frontal cortices do not play much of a role in them.

      This is different from your conclusion, where you state feed-forward vs feed-back connections. I don't see any good way to rule out this alternative conclusion, which is simpler than your assumption about connectivity. Time would be a potential factor to resolve it, feed-back being later, but in your figures it is clear that the two periods overlap entirely and the peaks also fall into almost identical windows.

      Unless I overlooked something and you can give a convincing way to exclude this possibility, I would recommend that you a) discuss this in the paper and b) tone down your respective conclusions throughout the manuscript.

    2. Reviewer #2:

      The authors employed a clever experimental paradigm to investigate how the brain integrates visual information to reach a decision on the familiarity of a presented face. Eighteen subjects performed an EEG experiment while they were presented with images of themselves, close friends, famous individuals, or unfamiliar individuals. They were required to perform a 2AFC task to decide on the familiarity of the image (familiar/unfamiliar). The authors report behavioral differences in accuracy and reaction times depending on the task difficulty (more or less degraded images) and depending on the familiarity of the face, with self and personally familiar faces being recognized more easily and faster. Some of these behavioral differences were reflected in brain activity as evaluated by ERPs, decoding, and RSA analyses. Adopting a novel RSA-based connectivity method, the authors claim that under conditions with limited visual information (more degraded images), top-down effects from frontal areas to occipital areas are stronger than in conditions with increased visual information (less degraded images).

      The main question of this work is of interest and important in the face processing literature. The paradigm is clever and has the potential to address the question of interest. However, I have strong concerns about the methods, as well as some issues with the interpretation and framework in which the authors place the results of this work.

      Methods:

      1) There is little information about single-subject results or effect sizes, except for behavioral results. Only the mean values across subjects are reported with significance values (however, the reader cannot be sure about this as it is not explicitly mentioned anywhere). It's unclear from the description of the methods how data from different subjects were pooled for group analysis. Similarly, it's unclear how the null distributions were generated across subjects for permutation testing.

      2) Different analyses use either correct trials only or both incorrect and correct trials, without any clear rationale of why this is warranted. This is especially important in a task with highly different accuracy values depending on the conditions of interest. Figure 1B shows different levels of behavioral accuracy depending on coherence levels, while Figure 1D shows different levels of accuracy depending on familiarity type. This is very interesting, but it creates challenges for the analysis of brain data.

      On the one hand, if only correct trials are selected for the analysis (as in the decoding results), then different conditions will have a different number of trials. In turn, this will change the distribution of samples into classes, it will change the theoretical chance level, and it will change the levels of noise for estimates of central tendency. For example, the difference in decoding results between different familiarity types in Figure 3B could potentially be driven by a different number of trials belonging to each of the subclasses of familiarity.

      On the other hand, if both correct and incorrect trials are selected for the analysis (as in the RSA analysis), then results are confounded by potentially different brain processes that take place for correct and incorrect trials. Consider that in a 2AFC task, participants can be correct in one way only (correct classification), while they can be incorrect in many ways (slow RT, low attention level, or true misclassification). Given this experimental paradigm, I think the more straightforward approach would be to analyze correct and incorrect trials separately for all analyses and report both results. This would limit confounding effects in the interpretation of the data.

      3) For the decoding analyses, I find it suboptimal (and potentially problematic) to use a binary classifier (familiar vs. unfamiliar) to investigate a multiclass problem (levels of familiarity). A better approach would be to run a 4-way classification from the beginning, and then use this classifier to generate a 2-way classifier. This approach would preserve the actual structure of the data, which is divided into four classes of interest and not only two. In addition, I cannot tell from the methods whether the labels were permuted appropriately for permutation testing. Since there is a different number of trials in each class, the label permutation should maintain the same proportion of trials in each class to preserve the original structure and generate an appropriate null distribution (Etzel, 2015; Etzel & Braver, 2013; Nichols & Holmes, 2002)

      4) It's unclear to me what the brain-behavior correlation analysis is meant to represent (Figure 3C) when the decoding analysis is performed on correct trials only, while behavioral accuracy is (necessarily) computed on all trials. In addition, I am left to wonder whether the overall within-subject behavioral accuracy is predicted by (or correlates with) the overall decoding accuracy across timepoints based on within-subject brain data. If such an effect exists, then the more complicated, time-varying analysis would be warranted. However, this analysis should be reported with individual subject's results to highlight the effect size of such a correlation. Finally, I would suggest the authors move some of the text describing this analysis from the methods to the main text. I find the description in the main text to be particularly opaque and much clearer in the methods section.

      5) It's unclear how the RSA results were pooled across subjects. In addition, these analyses used both correct and incorrect trials. I don't see why these analyses cannot be performed on correct and incorrect trials separately by sub-selecting rows and columns of the RDMs for each subject. This would make the interpretation of the results much more straightforward. These results are now confounded by whether the image was correctly or incorrectly classified by the participant.

      6) I'm not convinced the partial correlation results with low-level visual features are sufficient to account for the effect of visual differences. These differences necessarily exist when using pictures of famous people with less staged pictures of friends and other individuals. I'd like to know how much each image class can be predicted by image statistics alone either by mimicking the experiment using a classifier or by training a classifier to distinguish familiarity type on the actual images. This would quantify whether the familiarity of the person can be decoded simply based on low-level visual properties (such as luminance values from pixel intensities), or from more biologically inspired features that simulate early visual cortex, such as HMAX features or the first layer of a general recognition visual DNN.

      7) I find the proposed connectivity method quite interesting, but I'm highly concerned whenever a method is developed and tested in a single dataset to support the main hypothesis. I realize it is hard to obtain a real "ground truth" dataset to test this method, especially in our global condition. However, I would be more confident in this method if it were applied to some simulated data to show that it can recover the simulated feedforward/feedback dynamics with different amounts of noise in the dataset. In addition, especially for this analysis, differences between correct and incorrect trials should be analyzed. Otherwise, the interesting findings in Figure 4D could be confounded by a different number of correct trials in each of the coherence levels (with more incorrect trials for the 22% condition).
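
      Two of the remedies raised in points 2 and 3 above, equalising trial counts across classes and permuting labels while preserving class proportions, can be sketched as follows. This is one possible reading of those suggestions and not the authors' pipeline: the function names, the idea of a "partition" (e.g. a cross-validation fold), and the example labels are all assumptions for illustration.

```python
import random
from collections import Counter, defaultdict


def equalize_trial_counts(labels, rng):
    """Return sorted trial indices, subsampled so every class has as many trials as the rarest class."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n_min))
    return sorted(keep)


def permute_within_partitions(labels, partitions, rng):
    """Shuffle labels separately inside each partition, so each partition keeps its class counts."""
    by_part = defaultdict(list)
    for idx, part in enumerate(partitions):
        by_part[part].append(idx)
    permuted = list(labels)
    for idxs in by_part.values():
        shuffled = [labels[i] for i in idxs]
        rng.shuffle(shuffled)
        for i, lab in zip(idxs, shuffled):
            permuted[i] = lab
    return permuted


# Hypothetical unbalanced set of correct trials across the four familiarity classes.
rng = random.Random(1)
labels = ["self"] * 5 + ["friend"] * 3 + ["famous"] * 4 + ["unfamiliar"] * 6
partitions = [i % 2 for i in range(len(labels))]  # two hypothetical folds

balanced_idx = equalize_trial_counts(labels, rng)
null_labels = permute_within_partitions(labels, partitions, rng)
print(Counter(labels[i] for i in balanced_idx))
```

      Subsampling discards data, so in practice it would typically be repeated over several random draws and the results averaged; the permutation function would be called once per iteration of the null distribution.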

      Interpretation:

      8) Throughout the manuscript, I find the description of the visual pathway and the face processing network to be too simplified. It is described with a simple distinction into "peri-occipital" and "peri-frontal" areas, and a dichotomy between feed-forward/feed-back connection. While EEG cannot afford a more precise spatial resolution, I think both the introduction and the discussion should place the results of this manuscript within the broader and more precise knowledge we have about the visual system and the face processing system. For example, how do these results fit within the framework of (familiar) face processing (Duchaine & Yovel, 2015; Freiwald et al., 2016; Haxby et al., 2000; Visconti di Oleggio Castello et al., 2017)?

      While I agree that the evidence for top-down effects from frontal areas in visual recognition is substantial (as the seminal work by Moshe Bar and others has shown), recurrent and feedback connections exist much earlier in the pathway (Kravitz et al., 2013). These recurrent connections have been shown to play a role in tasks with occluded images as well (Tang et al., 2018), which has similarities with the task presented in this manuscript. Thus, for this task, do we really need to assume a contribution from frontal areas? Could it be more easily explained by these recurrent connections in occipital and temporal areas alone? I think the discussion should present a more precise (and nuanced) description of the visual pathway and the face processing network, rather than a simplified dichotomy between frontal/occipital areas.

      References:

      Duchaine, B., & Yovel, G. (2015). A Revised Neural Framework for Face Processing. Annual Review of Vision Science, 1(1), 393-416.

      Etzel, J. A. (2015). MVPA Permutation Schemes: Permutation Testing for the Group Level. 2015 International Workshop on Pattern Recognition in NeuroImaging, 65-68.

      Etzel, J. A., & Braver, T. S. (2013). MVPA Permutation Schemes: Permutation Testing in the Land of Cross-Validation. 2013 International Workshop on Pattern Recognition in Neuroimaging, 140-143.

      Freiwald, W., Duchaine, B., & Yovel, G. (2016). Face Processing Systems: From Neurons to Real-World Social Perception. Annual Review of Neuroscience, 39(1), 325-346.

      Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural system for face perception. Trends in Cognitive Sciences, 4(6), 223-233.

      Kravitz, D. J., Saleem, K. S., Baker, C. I., Ungerleider, L. G., & Mishkin, M. (2013). The ventral visual pathway: an expanded neural framework for the processing of object quality. Trends in Cognitive Sciences, 17(1), 26-49.

      Nichols, T. E., & Holmes, A. P. (2002). Nonparametric permutation tests for functional neuroimaging: a primer with examples. Human Brain Mapping, 15(1), 1-25.

      Tang, H., Schrimpf, M., Lotter, W., Moerman, C., Paredes, A., Ortega Caro, J., Hardesty, W., Cox, D., & Kreiman, G. (2018). Recurrent computations for visual pattern completion. Proceedings of the National Academy of Sciences of the United States of America. https://doi.org/10.1073/pnas.1719397115

      Visconti di Oleggio Castello, M., Halchenko, Y. O., Swaroop Guntupalli, J., Gors, J. D., & Gobbini, M. I. (2017). The neural representation of personally familiar and unfamiliar faces in the distributed system for face perception. In Sci. Rep. (Issue 1, p. 138297). https://doi.org/10.1038/s41598-017-12559-1

    3. Reviewer #1:

      In this manuscript the authors report a study investigating the "neural familiarity spectrum" of face recognition. The authors used a paradigm via which stimuli (i.e. facial identities with varied levels of familiarity) were gradually revealed. In general, I entirely agree that the previous overemphasis of and/or arguing "for a dominance of feed-forward processing" ought to be replaced by a more "nuanced view". In my opinion, the constraints imposed by our methodological choices, which ultimately determine the nature of our observations, also need to be humbly considered. I commend the authors for their efforts and their well-written, interesting manuscript, which I believe represents a valuable and needed contribution to the field of face cognition and beyond.

      Major Points:

      Throughout the manuscript references are warranted to a number of studies that have:

      (i) Used similar approaches to a) decelerate the categorization process and b) investigate representations across time by applying uni-/multivariate analyses that were stimulus onset and/or reaction time aligned (eg, Carlson et al., 2006; Jiang et al., 2011; Ramon et al., 2015; Quek et al., 2018)

      (ii) Reported findings related to frontal contributions towards familiar face recognition (numerous EEG studies by Caharel and colleagues, and Ramon et al., 2010, 2015).

      What I am missing is an explicit discussion of the challenging effect of expectations related to identities (as well as specific images since observers provided stimuli themselves). The authors discuss the role of perceptual difficulty and familiarity level, but the latter is in fact confounded with expectations of the specific to-be-presented identities that moreover appear in the context of the active (vs. orthogonal) task, both of which increase signal strength. (Note: this is not a critique and applies to all studies using personally familiar identities - especially those that have used a relatively small number of identities).

      In light of this, I believe that statements related to the dominance of "feed-forward flow" in relation to perceptual difficulty should be more nuanced. Examples include:

      -"perceptual difficulty and the level of familiarity influence the neural representation of familiar faces and the degree to which peri-frontal neural networks contribute to familiar face recognition"

      -"We observed that the direction of information flow is influenced by the familiarity of the stimulus"

      Level of familiarity and perceptual difficulty are correlated in the present study, as well as most studies precisely because observers know who will be seen. Therefore, one could argue that the expectations, not the level of familiarity per se determine "the involvement of peri-frontal cognitive areas in familiar face recognition". (cf. Huang et al., (2017) and Ramon & Gobbini (2018) for a discussion).

      Related to this aspect and relevant for the analyses is the different number of trials across categories (3x as many unfamiliar face trials vs. each of the familiar ones). How was this dealt with statistically (cf. also stats reported in Figure 2) and were Ss informed about the ratio beforehand? Given the provision of self and personally familiar images, the task could also be considered an n-identity search task (cf. Besson et al., 2017), as they match sensory inputs to one of n possible known vs. an unknown number of unfamiliar identities / events. (To illustrate, the effects of expectations can determine the degree to which recovery from neural adaptation is observed across different face-preferential regions using the same task; e.g. Rotshtein et al., 2005, Nat Neurosci vs. Ramon et al., 2010, EJN)
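
      One common way to handle unequal trial counts before decoding is to subsample every condition down to the size of the smallest one. The sketch below illustrates that idea only; it is not a description of what the authors did. The condition names, trial counts, and the helper `balance_trials` are hypothetical.

```python
import random
from collections import defaultdict


def balance_trials(labels, seed=0):
    """Return sorted trial indices, subsampled so each condition has equal count."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    by_condition = defaultdict(list)
    for index, label in enumerate(labels):
        by_condition[label].append(index)
    # Every condition is cut down to the size of the smallest one.
    n_keep = min(len(indices) for indices in by_condition.values())
    kept = []
    for indices in by_condition.values():
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


# Hypothetical design: three times as many unfamiliar trials as each familiar category.
labels = ["unfamiliar"] * 30 + ["self"] * 10 + ["familiar"] * 10 + ["famous"] * 10
kept = balance_trials(labels)
counts = {c: sum(labels[i] == c for i in kept) for c in set(labels)}
```

      In practice the subsampling would typically be repeated over several random draws and the decoding results averaged, so that no single draw of unfamiliar trials drives the outcome.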

      The authors list "levels of categorization [...], task difficulty [...] and perceptual difficulty [...]" as potentially affecting "the complex interplay of feed-forward and feedback mechanisms in the brain" (l.442). I agree and point towards further relevant papers to be cited that additionally investigate the impact of expectations or "decisional space" on categorical decisions in the healthy as well as impaired brain (eg Ramon, 2018, Cogn Neuropsychol; Ramon et al., 2019, Cognition; Ramon et al., 2019, Cogn Neuropsychol).

      To summarize, can "accumulation of sensory evidence in the brain across the time course of stimulus presentation" (l.267) and "the strength of incoming perceptual evidence and the familiarity of the face stimulus", which are considered to determine the direction of information processing, be distinguished from the effect of expectations that potentially increases over time? (This is naturally non-existent for unfamiliar stimuli, for which no "domination of feed-forward flow of information" was found).

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      The reviewers appreciated the clever paradigm and the focus on top-down influences during familiar face recognition. However, the reviewers also raised several serious methodological concerns. For example, they noted that the familiarity conditions cannot be easily compared, considering that these conditions differed in multiple ways beyond the level of familiarity (e.g., staged vs supplied photos, one vs many identities).

    1. Reviewer #3:

      The manuscript by Mioka et al. is the synthesis of a lot of well executed experiments examining a "void" zone in the plasma membrane of yeast cells lacking phosphatidylserine. The authors demonstrate that this is a specialized micron-size domain with many intriguing properties. However, there are several issues that limit my enthusiasm. Some of the experiments are misinterpreted, and there are also inconsistencies and inaccuracies in the text. In my opinion, Figure 6 and Figure 7 add little to the primary findings of the paper.

      Other concerns:

      1) The void zones shown are more prevalent at 37C than 30C. This is opposite to the other micron sized phase separation in the yeast vacuole (Rayermann et al., 2017). If this is a Lo domain then rapid oscillations in temperature should control the reversible assembly and disassembly. This should be examined.

      2) It's odd to me that the filipin signal has "thickness" beyond what you would expect if it was confined to a bilayer. In other experiments it appears that the cytosolic fluorescence is also quenched in the vicinity of the voids. This is problematic as every GFP construct examined on the cytosolic side of the PM is excluded. Perhaps these cells actually have ergosterol crystals (a 3D structure) rather than a Lo domain within the bilayer. Given the importance of cholesterol crystals in being a "danger" signal and activating inflammasomes it could be worth examining. This would require specialized imaging techniques.

      3) Spira et al. (2012, NCB) highlighted the patchwork nature of the plasma membrane, with Pma1 and Ras2 being excluded from one another and proteins with similar TMDs tending to colocalize. This article should be included in the discussion to help place these findings in a greater context. Yet here all of the constructs that are examined are excluded from the void zones. This again suggests to me that this is different from an Lo domain. In the cho1 cells that do not have obvious voids, what is the localization and overlap of a few of the well characterized markers (Ras2, Pma1, Sur7, Bio5)?

      4) Figure 1B shows 40% of cells grown overnight at 37C have voids but Figure 2C shows that they are lost after ~15h. This seems inconsistent.

      5) The authors state that psd1 psd2 are PE-deficient and cho2 opi3 are PC-deficient in the figure. This is incorrect.

      6) Figure 3C is not convincing. Images on the right have substantially more red pixels and so positions where there were voids at 0 min now have a bit of green at 25 min. I also don't understand how the ergosterol rich region is able to quench signal in the cytosol. Is this an extended focus representation of multiple slices?

      7) GPI-linked proteins are crosslinked to the cell wall. The authors' conclusions cannot be drawn from this experiment. The authors could potentially do the same experiment in spheroplasts.

      8) Alternatively, adding rhodamine-PE to the cells could be used to assess the partitioning in the outer leaflet.

      9) The significance of the vacuole - void contact is unclear. Typically, ~50% of the PM is in close apposition to cER in yeast. In mammalian cells it is known that cortical actin can restrict ER-PM contact site formation. Thus, it could simply be that in the absence of cER the vacuole will come into close proximity to the PM. This can be tested by using a strain deficient in reticulons or the so-called delta tether or delta super-tether cells. If these cells also display Vac - PM contacts, then I don't see the relevance of including this figure in this study.

      10) Vacuole - void contacts are seen in roughly 50% of the cells with voids. In the cells that don't have this V-V contact do they have the nucleus or nER in contact with the PM? This is related to the above point. Is this simply a result of removing the cER and making the PM available?

      11) Figure 7 is unnecessary and just makes things more complicated. It actually detracts from the main findings since it is just a collection of observations. For instance, how would loss of the HOPS complex prevent Lo phase separation in the plasma membrane? Do these cells have less total cellular or plasmalemmal ergosterol? Do the levels of complex sphingolipids change?

      12) Provide a reference or a direct measurement showing that growing cells in pH7.0 medium impacts the cytosolic pH.

    2. Reviewer #2:

      This study shows that plasma membrane (PM) voids, regions devoid of proteins, form in cells lacking phosphatidylserine (PS). It argues these regions are enriched in ergosterol and are liquid ordered. Domain formation is reversible and may require ergosterol and sphingolipids. A number of genes that disrupt void formation are also identified. The study proposes that PS prevents the formation of void zones by interacting with ergosterol. Overall, the study is well done and makes a persuasive case that protein-free voids form in the PM and do not seem to affect cell growth; a fascinating discovery. There are, however, two weaknesses in the study that reduce its impact. One is that it does not show PS is directly involved in void formation or that void zone formation is driven by PS-ergosterol interactions, as stated in the abstract and elsewhere. This could be addressed in vitro using GUVs or supported bilayers. I realize these experiments are challenging, but they could add significant mechanistic insight. The second major weakness of the study is that it does not demonstrate PM void zones occur in wild-type cells in response to stress or in some growth conditions. There are other, more minor concerns.

      1) There is no direct demonstration that the void domains are ordered. This could be shown using order sensitive dyes like Laurdan. Further evidence could be provided by directly measuring diffusion rates of fluorescent lipids in the void zones compared to the rest of the PM. In addition, if the void domains are ordered, it should be possible to show they melt and reform as cells are heated and cooled.

      2) The role of Osh6 and Osh7 in void formation should be assessed since these proteins are thought to be necessary to maintain PS enrichment in the PM, at least in some growth conditions.

      3) The investigation of void zone-vacuoles (V-V) contact sites is not well explained. It is not clear what is being proposed. How would contact sites promote void zone formation? Are they sites of lipid transfer and, if so, how would that affect void-zone formation? Or is some other mechanism being proposed?

      4) It is not clear what the mutant analysis adds to the story. Do the mutations affect PS levels in the PM? If that is what is being proposed it should be tested. Or do the authors think the mutants affect void zone formation by some other mechanism?

    3. Reviewer #1:

      The manuscript by Mioka et al. presents an interesting and puzzling observation. The authors showed the existence of a so-called "void zone" in PS-deficient cho1∆ cells. This void zone is a membrane region devoid of proteins and with a specific lipid composition, which the authors suggest to be a microscopic liquid-ordered domain. They also tested different stress conditions and found some that prevented void zone formation in cho1∆ cells. The authors propose that PS is a key lipid in preventing macroscopic raft-like domain formation in WT cells. Although it is unclear whether such PM void zones can appear in WT cells under any stress conditions (hence a note of caution on the physiological relevance of the findings presented here), the authors' proposal that PS in WT cells can suppress the formation of macroscopic lipid domains is an interesting hypothesis that, in my opinion, deserves to be followed up. Finally, the authors start a search for genes required for void zone formation, which is interesting in my opinion, and although only partial conclusions can be drawn from it at the moment, I think this is a promising way to study the mechanisms and maybe the physiological relevance of void zone formation in the future.

      I have some concerns, especially about the apparent claim that the void zone is a liquid-ordered domain (if so, it should look more circular than the shapes shown).

      Major concerns:

      1) The authors say that Lo domains are completely depleted of transmembrane (TM) proteins. However, there are many reports (e.g. from the Levental lab), where TM proteins with "raft" affinity have been shown. The authors should express some of these raft TM markers and check whether they partition or not into the void zone.

      2) I do not think there is enough experimental evidence for the claim that the void zone is a liquid-ordered (Lo) domain. In particular:

      -Line 82: the domains are not circular; doesn't this argue against an Lo phase and favor a more gel/solid phase? Have the authors seen fusion of void zone domains in live cells?

      -Line 84: does FM4 partition equally to Lo and Ld (liquid-disordered) domains in vitro? What about gel-like domains?

      -Lines 304-307: along the same lines, this is true for some proteins, although there are TM proteins that have been shown to be targeted specifically to Lo regions in GPMVs.

      -The fact that the void zone appears at high temperature is puzzling if compared to standard liquid-ordered domains.

      -Line 687: these observations are also compatible with gel-like domains.

      -Is it possible to do some dynamic measurements of dye diffusion in void zones? FRAP? Single particle tracking?

      3) Many trafficking routes/genes are required for void zone formation. What about for the stability/maintenance? Could the authors provide dynamic anchor-away or degron-tagging of some of these candidates to test whether void zones disappear upon depletion of these proteins?

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      This manuscript shows the interesting observation that plasma membranes in yeast cells lacking phosphatidylserine (PS) present differentiated regions, the so-called "void zones". Void zones are devoid of proteins and have a specific lipid composition (are enriched in ergosterol), which the authors suggest to be a microscopic liquid-ordered domain. Void zone formation is reversible and may require ergosterol and sphingolipids for its formation. They also tested different stress conditions and found some that prevented void zone formation in cho1∆ cells. The authors propose that PS is a key lipid in preventing macroscopic raft-like domain formation in WT cells, in particular by interacting with ergosterol. Finally, a study for genes that disrupt void formation is also presented.

      As you will see, all the reviewers acknowledge that the manuscript presents high quality experiments and potentially very interesting discoveries. However, they all agree that the story has some weaknesses.

    1. Reviewer #3:

      In the manuscript, Polaski et al. compared the reported UPF1 mutations with a collection of three databases and found that 42.5% of these mutations are identical to germline genetic variation. However, most of these overlapping mutations are located within introns, and are only present in the Exome Aggregation Consortium (ExAC) database (Figure 2). This raises some concerns: since the ExAC database mainly reports exon variants rather than intron variants, the authors need to provide other information, such as allele frequency, to examine whether these intronic mutations are rare or low-frequency variants. Another suggestion is that the authors may cross-reference UPF1 mutations with the recent gnomAD v3 database (Nature 2020), which provides non-coding genetic variants at much better resolution. In addition, most of the other UPF1 exon mutations are indeed novel as they are not present in any databases (Figure 2 - figure Supplement 1). The authors need to provide some additional analysis, such as separating these two types of variants (exon/intron variants) and analyzing the frequency of overlapping UPF1 mutations.
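
      The suggested analysis (separating exon from intron variants and checking whether each is absent, rare, or common in a reference database) could be tabulated along the following lines. This is a minimal sketch under stated assumptions: the variant records, the `summarize` helper, and the 1% rarity cutoff are all hypothetical and are not taken from the manuscript or from any database.

```python
from collections import Counter

# Hypothetical records: (variant_id, region, allele_frequency_or_None).
# None means the variant was not found in the reference database.
variants = [
    ("v1", "intron", 0.12),
    ("v2", "intron", 0.0004),
    ("v3", "exon", None),
    ("v4", "exon", 0.03),
    ("v5", "intron", None),
]


def summarize(variants, rare_threshold=0.01):
    """Count, per region, variants that are absent, rare, or common in the database."""
    summary = {}
    for _, region, freq in variants:
        counts = summary.setdefault(region, Counter())
        if freq is None:
            counts["absent"] += 1
        elif freq < rare_threshold:  # assumed cutoff for "rare"
            counts["rare"] += 1
        else:
            counts["common"] += 1
    return summary


summary = summarize(variants)
```

      Reporting the three counts separately for exons and introns would show directly whether the database overlap is driven by common intronic polymorphisms.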

    2. Reviewer #2:

      This paper aims to resolve the disparity between one report (Liu et al., 2014), which described somatic mutations in pancreatic adenosquamous carcinoma (PASC) that did not typify normal pancreatic tissue of the patients, and other reports (Witkiewicz et al., 2015; Fang et al., 2017; Hayashi et al., 2020), which did not find these mutations. The authors show here that many (40%) of the mutations described by Liu et al. typify genetic variations in the human population at large, and they suggest that these mutations are not pathogenic, e.g. are not drivers of PASC, and also not somatic but, rather, are genetic in origin.

      The authors use CRISPR-Cas9 to generate, in mouse pancreatic cancer (KPC) cells, which harbor Kras and Tp53 gene mutations as do PASC patients, a Upf1 gene (and thus its product mRNA) lacking exons 10 and 11, a deletion that Liu et al. reported not only inhibits NMD by disrupting UPF1 helicase activity but also promotes tumorigenesis. After injection into mice, the authors found no detectable effects on pancreatic cancer growth compared to the injection of control cells.

      The authors acknowledge that mice may differ from humans. Thus next, rather than using mini-UPF1 genes, as did Liu et al., the authors introduced two of the Liu et al. mutations separately into the UPF1 gene of HEK293T cells. In contrast to Liu et al., the authors found modestly increased NMD efficiency and no evidence of UPF1 pre-mRNA mis-splicing. The authors note that this makes sense since these mutations are found in people not as somatic mutations but genetic mutations, and thus would not be expected to inhibit NMD given the importance of NMD to aspects of human development in utero and beyond.

      This is a very well-written paper describing carefully executed experiments that lead the reader to discount three claims made about UPF1 gene mutations in PASC as described by Liu et al., namely, that these mutations: (i) have a somatic origin, (ii) lead to UPF1 pre-mRNA mis-splicing so as to inhibit NMD, and (iii) promote tumorigenesis. The authors are careful not to over-interpret their data.

      Specific comments:

      Page 4, in reference to Figure 1f. It is unexpected that the variations in UPF1 protein levels were "uncorrelated with NMD efficiency". Possibly, this reviewer doesn't understand what the authors mean. Please clarify.

      Additionally, in this regard, it is better to draw conclusions about NMD efficiency by measuring more than just the efficiency with which mRNA from a reporter construct is targeted for NMD. It is recommended that the authors assay the levels of a few (e.g. three) cellular NMD targets, normalized to the level of their pre-mRNA to control for any changes to gene transcription.
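
      The normalization recommended above amounts to a ratio of ratios: each target's mRNA level is divided by its pre-mRNA level, and the result is compared between conditions. The sketch below works through the arithmetic with made-up numbers; the function names and all values are hypothetical.

```python
def nmd_target_ratio(mrna, pre_mrna):
    """mRNA level normalized to pre-mRNA, to control for transcriptional changes."""
    return mrna / pre_mrna


def fold_change(condition, control):
    """Ratio of normalized levels; each argument is an (mRNA, pre-mRNA) pair."""
    return nmd_target_ratio(*condition) / nmd_target_ratio(*control)


# Hypothetical measurements in arbitrary units: (mRNA, pre-mRNA).
control = (2.0, 4.0)  # normalized level 0.5
mutant = (3.0, 3.0)   # normalized level 1.0

change = fold_change(mutant, control)
```

      A fold change above 1 for an endogenous NMD target would be consistent with reduced NMD efficiency, and a value below 1 with increased efficiency, independent of any change in transcription of the gene.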

    3. Reviewer #1:

      This manuscript identifies that the UPF1 variants previously reported as frequent somatic mutations in pancreatic adenosquamous carcinoma are actually germline genetic variants with no clear effects on UPF1 splicing, protein splicing, or nonsense mediated decay. Given that the manuscript challenges a striking finding from a prior study that has not been validated in subsequent studies, it is important to publish to correct the literature. At the same time, several points should be clarified to make sure the data are as comprehensive as possible:

      1) In the experiments evaluating the effect of skipping exons 10-11 of UPF1, it is surprising that this genetic perturbation in UPF1 is actually tolerated in these cells as UPF1 is an essential gene in most cancer cell lines (this point also has likely motivated this current study). Also, the Western blots for UPF1 protein are not particularly clear (Supplementary Figure 1c) and the fact that the alteration doesn't perturb the growth of KPC cells does not prove that UPF1 alterations are not tumorigenic. Have the authors checked to see if UPF1 is still downregulated and mis-spliced in the cells following in vivo growth? A simple in vitro competition assay between UPF1 exon 10-11 targeted cells and control sgRNA cells would also be helpful. It would also be helpful to evaluate if NMD is altered in these cells given these issues.

      2) Although it is clear that the authors have used similar minigene assays as were used in the original publication, a more systematic evaluation for potential alteration in NMD with UPF1 variants (via RNA-seq) would be helpful given that this work questions the prior publication.

      3) Do the authors believe that the UPF1 variants reported as mutations initially in PASC are actually SNPs? The terminology describing what these variants are could be a little clearer in the Abstract and Discussion.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript. Eric J Wagner (University of Texas Medical Branch) served as the Reviewing Editor.

      Summary:

      The authors have sought to address what has become a considerably debated topic of whether mutations in Upf1 are tumorigenic in pancreatic adenosquamous carcinoma. Specifically, the authors introduced Upf1 mutants found in pancreatic tumors into pancreatic adenosquamous carcinoma cells, and found they did not provide significant advantage for tumor progression. Moreover, the authors described how a significant percentage of Upf1 mutants observed in pancreatic carcinoma are also present as variants in the human population, raising further doubts about their potential role as cancer drivers. Altogether, this work provides further evidence as to whether Upf1 disruptive mutations represent driving factors in pancreatic adenosquamous carcinoma.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      This is a fascinating and beautifully written article about the possible evolutionary relationship between two major protein superfamilies - the P-loop NTPases and the Rossmanns. Both are ancient and highly diverse superfamilies, containing a significant proportion of all extant domain sequences and were probably amongst the earliest enzyme superfamilies to emerge in evolution. No major evolutionary classification of proteins, such as SCOP, reports evolutionary relationships between them.

      Both share the same structural architecture of a beta-alpha-beta 3-layer sandwich and have an intriguing number of other shared structural features including the location of the binding site for phospho-ligands. However, whilst both bind phosphorylated ribonucleosides, the mode of binding differs and also the manner in which these compounds are exploited. Furthermore, there are differences in the topologies of the folds possibly suggesting distinct evolutionary trajectories. The Rossmanns appear to be more structurally conserved, whilst the P-Loops vary more in their topologies and possibly represent less stable arrangements of beta-sheets and alpha-helices. The authors have brought together several strands of evidence to explore possible evolutionary relationships. Detailed structural analyses allow the authors to explicitly detail the significant shared structural features, for example similarities in the mode of binding the phosphate moiety in the ligand. The structural features are well described and there are appropriate illustrations visualising key differences and similarities. The shared features of the phosphate binding site likely emerged and were favoured early in evolution, as supported by other analyses reported by Longo et al. However, as the authors point out there are other compelling similarities including the equivalent location of this site in the first beta-loop-alpha element in both superfamilies, which is not a necessary constraint of phosphate binding and the authors support this by giving examples of phosphate binding at the tip of alpha-4. In addition, they provide evidence supporting the common involvement of beta-2 which contains the conserved Asp in the Rossmanns' common ancestor. The Walker-B Asp in the P-loops is also at the tip of the beta-strand adjacent to beta-1, as in the Rossmanns - although this is an inserted strand relative to the Rossmann topology.

      The authors propose feasible evolutionary scenarios for how the P-Loops and Rossmanns may have diverged to acquire additional secondary structure elements extending the common beta-PBL-alpha-beta-Asp feature present in both superfamilies. Further compelling evidence is given by detection of a bridging protein - Tubulin - linking the two superfamilies. This has the distinct Rossmann topology but binds GTP in the P-loop NTPase mode. Furthermore, the GTP is hydrolysed by water activated by a ligated metal dication. Final support is given by reporting common sequence themes between the P-loop enzyme HPr kinase/phosphatase and some Rossmann proteins. The authors present further interesting and detailed analyses of similarities between the proteins sharing this unusual theme. The evidence provided by the authors for the shared beta-PBL-alpha-beta-Asp fragment seems very strong to me and has been presented in an interesting and informative way. Of course, it is not possible to know the subsequent evolutionary trajectories but the scenarios presented seem plausible.

      We thank the reviewer for their encouraging remarks on our manuscript.

      **I only have minor comments** 1) SCOP2 provides information on links between superfamilies based on rare sequence or structural features. Have the authors checked this resource for any details on beta-PBL-alpha-beta-ASP fragment? Or perhaps consulted with Alexey Murzin about this feature?

      The classification of Rossmann and P-Loop proteins in SCOP2 is consistent with the ECOD classification scheme. For further confirmation, we wrote to Alexey Murzin and he replied that Rossmanns and P-Loops are annotated as two separate evolutionary lineages, termed “hyperfamilies” in SCOP2. He found our new evidence compelling, but noted that given the current criteria for shared ancestry, P-loops and Rossmanns are separate lineages.

      2) I was rather confused by the way in which EC annotations were collected for the two superfamilies, i.e. via Pfam – wouldn’t it be better to use SUPERFAMILY, as the domain structures would map directly to these sequence relatives? I’m also surprised that they only took the common EC from a Pfam family since the aim of this analysis was to identify how many different enzyme functions the two superfamilies supported. Pfam does not classify by function and so inevitably groups functionally diverse relatives. However, to get the full range of enzyme functions supported by these superfamilies I would have thought all non-redundant EC functions across these constituent Pfam families should be counted. Perhaps I have misunderstood.

      We have updated the analysis to make use of the SUPERFAMILY database and, as per your suggestion, we now count all non-redundant EC numbers. Although the EC number counts have somewhat changed, the major point – that these are exceptionally diverse evolutionary lineages – has not.
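
      The difference between the two counting schemes discussed in this exchange can be illustrated with a toy example. The sketch assumes that "the common EC" means the single most frequent EC number per family, which is one reading of the comment above; the family names and annotation lists are hypothetical and serve only as illustration.

```python
from collections import Counter

# Hypothetical EC annotations for the member sequences of two families.
family_annotations = {
    "family_A": ["2.7.1.1", "2.7.1.1", "2.7.1.2"],
    "family_B": ["3.6.1.3", "3.6.1.3", "2.7.1.1"],
}


def most_common_per_family(annotations):
    """One EC number per family: the most frequent annotation in that family."""
    return {Counter(ecs).most_common(1)[0][0] for ecs in annotations.values()}


def all_nonredundant(annotations):
    """Every distinct EC number seen in any family."""
    return {ec for ecs in annotations.values() for ec in ecs}


per_family = most_common_per_family(family_annotations)
nonredundant = all_nonredundant(family_annotations)
```

      Here the per-family scheme yields two EC numbers while the non-redundant scheme yields three, because the minority annotation in the first family is only counted under the second scheme.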

      3) The authors refer to a set of previously curated ‘themes’ and allude to a methodology that will be reported in a forthcoming manuscript. The idea of identifying rare themes and then using them to locate very distant homologues is appealing. However, I think some details should be provided here. For example, some brief details on the technology for detecting the themes and thresholds on significance. How rare are they and how conserved do these fragments need to be between superfamilies to join their curated list? Furthermore, how many of these curated themes are similar to the one reported in their article and do they get crosslinks to other superfamilies based on closely related themes? ie how unique is this theme to the P-loop and Rossmanns and are there closely related themes linking these two superfamilies to other superfamilies? I would imagine it is quite a distinct theme but I would have liked to see a few more details on this to reassure that there are no closely related themes.

      We have updated the manuscript to include a more detailed description of the methods used to detect bridging themes shared between the Rossmann and P-Loop evolutionary lineages. In addition, we now include a supplemental table (Table S2) with all of the initial hits from the theme analysis.

      4) The authors have built model structures to allow them to estimate ligand location in proteins with no structural characterisation. It would be helpful if they reported the degree of sequence similarity between the query and template proteins and also the model quality.

      We have updated this section to include more details. In addition, we have identified a structure from the same T-group to serve as our ligand donor. The updated ligand donor is more closely related to 1ko7 than the previous ligand donor, though the positioning of the ligand is effectively unchanged. We note that the global sequence identity to both the previous and new ligand donor is low (less than 30% sequence identity). However, the phosphate binding loops align well in both sequence and structure, as is detailed in the revised Methods section.


      The study by Longo et al. was devoted to the evolutionary history of P-loop NTPases and Rossmann fold proteins. Although not related in sequence, the two protein families share some structural features that imply that they could have diverged from a common ancestor. Using bioinformatic analyses, the study under review identified some bridge proteins (of the tubulin family) that share themes of both P-loops and Rossmanns, offering possible support for the common ancestry. A minimum ancestral peptide structure is proposed based on the analysis and its possible diversification trajectory is hypothesized. Even though the divergence scenario is clearly outlined, the authors do not over-interpret the observations and admit that convergence could still explain the scenario. The methodology and results are sufficiently described and conclusions are explained in detail. Although it would be really interesting to design an experimental study to support the conclusion (and I suppose that the authors will do that), that is clearly outside the scope of this bioinformatic study.

      Obtaining experimental evidence for our hypothesis is far from trivial. Modern proteins, including the bridging ones identified here, may not be amenable to exchange due to differing contexts (epistasis). Still, we agree that highlighting experimental directions is a good idea. We have updated the sections From an ancestral seed to intact domains and Conclusion to include a brief discussion of experiments that may help test our hypotheses about the evolution of these protein lineages.

      I would not propose any major changes to the manuscript as I think that the message is very clear. **Minor comments:** (1) In the results section, the text is very clear but tends to be repetitive in places. I think the manuscript would be more easily readable if it were more to the point in some sections.

      We have edited the manuscript to remove cases of unnecessary repetition in the results section and throughout.

      (2) There are probably a few typos or unclear sentences, e.g. pg 5, mid-page, "The core, most common topology...); pg 12, three lines from the bottom "(where this element in canonical", probably should be "is canonical"; pg 11, mid page "the mode of binding of the catalytic dication of tubuling (often Ca2+)" - all the structures listed in Table S1 list Mg2+, so "often" is a bit misleading.

      We have corrected the unclear sentences and typos noted above, as well as a few others.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The study by Longo et al. was devoted to the evolutionary history of P-loop NTPases and Rossmann fold proteins. Although not related in sequence, the two protein families share some structural features that imply that they could have diverged from a common ancestor. Using bioinformatic analyses, the study under review identified some bridge proteins (of the tubulin family) that share themes of both P-loops and Rossmanns, offering possible support for the common ancestry. A minimum ancestral peptide structure is proposed based on the analysis and its possible diversification trajectory is hypothesized.

      Even though the divergence scenario is clearly outlined, the authors do not over-interpret the observations and admit that convergence could still explain the scenario. The methodology and results are sufficiently described and conclusions are explained in detail. Although it would be really interesting to design an experimental study to support the conclusion (and I suppose that the authors will do that), that is clearly outside the scope of this bioinformatic study.

      I would not propose any major changes to the manuscript as I think that the message is very clear.

      Minor comments:

      (1) In the results section, the text is very clear but tends to be repetitive in places. I think the manuscript would be more easily readable if it were more to the point in some sections.

      (2) There are probably a few typos or unclear sentences, e.g. pg 5, mid-page, "The core, most common topology...); pg 12, three lines from the bottom "(where this element in canonical", probably should be "is canonical"; pg 11, mid page "the mode of binding of the catalytic dication of tubuling (often Ca2+)" - all the structures listed in Table S1 list Mg2+, so "often" is a bit misleading.

      Significance

      I think this is a very interesting analysis of the evolutionary history of the P-loop and Rossmann fold family which are considered among the most ancient and abundant protein folds. That makes them of high interest also for origins of protein structure. The results are not firmly conclusive (because of the limits of such analyses), making the outcomes of the study partly hypothetical. I think it would be very interesting to outline suggestions for future experiments that could test the hypothesis to be more valuable to a broader audience.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This is a fascinating and beautifully written article about the possible evolutionary relationship between two major protein superfamilies - the P-loop NTPases and the Rossmanns. Both are ancient and highly diverse superfamilies, containing a significant proportion of all extant domain sequences and were probably amongst the earliest enzyme superfamilies to emerge in evolution. No major evolutionary classification of proteins, such as SCOP, reports evolutionary relationships between them.

      Both share the same structural architecture of a beta-alpha-beta 3-layer sandwich and have an intriguing number of other shared structural features including the location of the binding site for phospho-ligands. However, whilst both bind phosphorylated ribonucleosides, the mode of binding differs and also the manner in which these compounds are exploited. Furthermore, there are differences in the topologies of the folds possibly suggesting distinct evolutionary trajectories. The Rossmanns appear to be more structurally conserved, whilst the P-Loops vary more in their topologies and possibly represent less stable arrangements of beta-sheets and alpha-helices.

      The authors have brought together several strands of evidence to explore possible evolutionary relationships. Detailed structural analyses allow the authors to explicitly detail the significant shared structural features, for example similarities in the mode of binding the phosphate moiety in the ligand. The structural features are well described and there are appropriate illustrations visualising key differences and similarities.

      The shared features of the phosphate binding site likely emerged and were favoured early in evolution, as supported by other analyses reported by Longo et al. However, as the authors point out, there are other compelling similarities including the equivalent location of this site in the first beta-loop-alpha element in both superfamilies, which is not a necessary constraint of phosphate binding, and the authors support this by giving examples of phosphate binding at the tip of alpha-4. In addition, they provide evidence supporting the common involvement of beta-2 which contains the conserved Asp in the Rossmanns common ancestor. The Walker-B Asp in the P-loops is also at the tip of the beta-strand adjacent to beta-1, as in the Rossmanns - although this is an inserted strand relative to the Rossmann topology. The authors propose feasible evolutionary scenarios for how the P-Loops and Rossmanns may have diverged to acquire additional secondary structure elements extending the common beta-PBL-alpha-beta-Asp feature present in both superfamilies.

      Further compelling evidence is given by detection of a bridging protein - Tubulin - linking the two superfamilies. This has the distinct Rossmann topology but binds GTP in the P-loop NTPase mode. Furthermore, the GTP is hydrolysed by water activated by a ligated metal dication. Final support is given by reporting common sequence themes between the P-loop enzyme HPr kinase/phosphatase and some Rossmann proteins. The authors present further interesting and detailed analyses of similarities between the proteins sharing this unusual theme.

      The evidence provided by the authors for the shared beta-PBL-alpha-beta-Asp fragment seems very strong to me and has been presented in an interesting and informative way. Of course, it is not possible to know the subsequent evolutionary trajectories but the scenarios presented seem plausible.

      I only have minor comments

      1) SCOP2 provides information on links between superfamilies based on rare sequence or structural features. Have the authors checked this resource for any details on the beta-PBL-alpha-beta-Asp fragment? Or perhaps consulted with Alexey Murzin about this feature?

      2) I was rather confused by the way in which EC annotations were collected for the two superfamilies, i.e. via Pfam - wouldn't it be better to use SUPERFAMILY, as the domain structures would map directly to these sequence relatives? I'm also surprised that they only took the common EC from a Pfam family since the aim of this analysis was to identify how many different enzyme functions the two superfamilies supported. Pfam does not classify by function and so inevitably groups functionally diverse relatives. However, to get the full range of enzyme functions supported by these superfamilies I would have thought all non-redundant EC functions across these constituent Pfam families should be counted. Perhaps I have misunderstood.

      3) The authors refer to a set of previously curated 'themes' and allude to a methodology that will be reported in a forthcoming manuscript. The idea of identifying rare themes and then using them to locate very distant homologues is appealing. However, I think some details should be provided here. For example, some brief details on the technology for detecting the themes and thresholds on significance. How rare are they and how conserved do these fragments need to be between superfamilies to join their curated list? Furthermore, how many of these curated themes are similar to the one reported in their article and do they get crosslinks to other superfamilies based on closely related themes? I.e., how unique is this theme to the P-loop and Rossmanns and are there closely related themes linking these two superfamilies to other superfamilies? I would imagine it is quite a distinct theme but I would have liked to see a few more details on this to reassure that there are no closely related themes.

      4) The authors have built model structures to allow them to estimate ligand location in proteins with no structural characterisation. It would be helpful if they reported the degree of sequence similarity between the query and template proteins and also the model quality.

      Significance

      This article presents compelling new evidence on the evolutionary relationship between two major, ancient enzyme superfamilies. As far as I'm aware these insights are novel and the detection of the bridging protein relative and the common 'theme', i.e. beta-PBL-alpha-beta-Asp fragment, is a new discovery.

      This work makes an important contribution to understanding the evolution of two major enzyme superfamilies and the insights can guide future evolutionary studies and protein design studies.

      The audience will be structural and evolutionary biologists, both experimental and computational.

      My expertise is in protein evolution and protein structure analyses and I have published a number of reviews and articles analysing and discussing Rossmann-like superfamilies.

    1. Reviewer #3:

      Kinsler et al. measure the fitness of 292 mutants, which were recovered from previously performed experimental evolution in a glucose-limited batch culture condition, using barseq in 45 different conditions. They analyze the matrix of individual fitness measurements in different conditions using dimensionality reduction (singular value decomposition) and then study the explanatory power of the matrix decomposition. Although 95% of the variance is explained by the first vector, they identify 7 additional orthogonal vectors that explain a significant fraction of the remaining 5% of variance. They find that this reduced dimensionality representation of fitness profiles is able to predict mutant fitness in conditions similar to that in which the evolution experiment was performed and in environments that differ from the original selection experiment. They observe that different adaptive mutations have different effects across environments despite having similar fitness effects in the selective environment. From these findings the authors conclude that adaptive mutations affect a small number of phenotypes in the condition in which they are selected, but that they have the potential to affect additional phenotypes across conditions, i.e. that adaptive mutations are locally modular, but globally pleiotropic.
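
      The decomposition summarized above can be illustrated with a minimal sketch. It assumes a purely synthetic fitness matrix of 292 mutants by 45 conditions generated from 8 latent components plus noise, and it applies no centering or normalization; the function names (`variance_explained`, `low_rank_approximation`) are hypothetical and this is not the authors' pipeline or data.

```python
import numpy as np

def variance_explained(matrix):
    """Fraction of the total squared signal carried by each singular component.

    No centering is applied here (an assumption), so this is the share of the
    squared Frobenius norm rather than of a mean-subtracted variance.
    """
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    squared = singular_values ** 2
    return squared / squared.sum()

def low_rank_approximation(matrix, k):
    """Reconstruct the matrix from its first k singular components."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]

# Synthetic stand-in for a mutants-by-conditions fitness matrix (not real data).
rng = np.random.default_rng(0)
n_mutants, n_conditions, n_components = 292, 45, 8
mutant_scores = rng.normal(size=(n_mutants, n_components))
condition_loadings = rng.normal(size=(n_components, n_conditions))
noise = 0.01 * rng.normal(size=(n_mutants, n_conditions))
fitness = mutant_scores @ condition_loadings + noise

fractions = variance_explained(fitness)
approx = low_rank_approximation(fitness, n_components)
relative_error = np.linalg.norm(fitness - approx) / np.linalg.norm(fitness)

print("share captured by first", n_components, "components:",
      fractions[:n_components].sum())
print("relative reconstruction error:", relative_error)
```

      In a real analysis the number of components to keep would have to be justified against measurement noise, which this toy example sidesteps by construction.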

      This experimental study is well performed and the data analysis is clear and comprehensive. The authors have done an exemplary job in describing their study with clear and scholarly writing.

      However, the central question is whether the conclusions of the study are justified. The authors' goal is to establish a "genotype-phenotype-fitness" map, but as they state, "our phenotypic dimensions are not necessarily comparable to what people traditionally think of as a 'phenotype'". Indeed, I agree that what the authors have identified are not phenotypes at all but are instead properties of the genotype-fitness map assayed in different conditions. These properties are themselves interesting; however, describing them as phenotypes - observable and measurable traits of an organism -, or even inferring the number of phenotypes they represent, is incorrect. Therefore, I am not convinced that the authors have achieved their goal of defining a genotype-phenotype-fitness map.

      Key points that the authors should consider:

      -The central conclusion is not supported. The authors claim that adaptive mutations affect a small number of phenotypes in the evolved conditions, but many phenotypes over different conditions. But this conclusion cannot be drawn from the results. Why is the following scenario incompatible with the results: hundreds of "phenotypes" (e.g. the expression of 100 genes) underlie enhanced fitness in the adapted environment, but a change in the environment means that only 10 of those genes are expressed (i.e. fewer "phenotypes"), and thus the fitness effect is different in that environment? In that scenario the overall conclusion would be completely the opposite. Perhaps constructing a mechanistic model and performing simulations that explore these different possibilities would strengthen the argument.

      -A primary result of the study is that mutations that are beneficial in one condition are frequently deleterious in other conditions. This phenomenon of antagonistic pleiotropy has been described innumerable times in the experimental evolution literature - indeed, it seems to be the rule rather than the exception - and these prior observations should be more clearly described.

      -The extent to which the results are dependent on the number of environments is not investigated. For example, reducing the number of "similar" environments would likely decrease the variance explained by the first singular value as would increasing the diversity of environments that are studied. How does this variation impact the results and interpretation?

      -In figure 2, it looks like fitness is defined relative to the most fit genotype. Typically, in experimental evolution fitness is defined relative to the ancestor. Perhaps defining ancestral fitness as zero for the SVD is necessary, but this is atypical based on similar studies and may be a source of confusion for readers.

      -In figure 2C an idea of the variance is given for the EC conditions, but not for the other conditions. Some measure of uncertainty for fitness in each condition would help (given the 2-4 replicates of each).

      -Why not use an ancestral strain without a barcode for competition assays, rather than having to digest the ancestral barcode with restriction enzymes?

      -A cutoff of 1000 reads for a time point with 400 strains seems really low (or is it supposed to be reads/strain?).

      -The arrows in figure 2C are unexplained.

    2. Reviewer #2:

      In the manuscript titled "A genotype-phenotype-fitness map reveals local modularity and global pleiotropy of adaptation," the authors describe an approach for uncovering the phenotypic complexity that underlies fitness by tracking hundreds of experimentally-evolved adaptive mutants across a range of environments. This approach yields a genotype-phenotype-fitness map without actually naming and measuring the phenotypes themselves. Instead, by perturbing environmental conditions and measuring mutant fitness across environments, the authors develop a model that reveals a collection of abstract phenotypes that contribute significantly to fitness. The authors find that a low-dimensional phenotypic model is sufficient for capturing fitness of the panel of mutants across subtle environmental perturbations - which suggests that only a few phenotypes contribute to fitness near the evolution conditions. Further, the model accurately predicts fitness in environments that deviate from the evolution condition, often through components that contribute little to fitness near the evolution condition - which suggests that adaptive mutants have latent phenotypic effects that only impact fitness in distant environments. These findings lead the authors to conclude that adaptive mutations are locally modular yet globally pleiotropic, thereby lending valuable insight into our understanding of how adaptive mutations affect the complex physiological interconnectedness of the cell.

      Overall, I am very impressed with the work described in the manuscript. The manuscript is well-written, especially considering the conceptual depth of the topic and novelty of the approach. The experiments were elegantly designed and adopt a variety of molecular tools developed recently within the field. The figures are appealing and present the data in a clear manner. The conclusions are justified by the data, and the findings represent a significant contribution to the field.

    3. Reviewer #1:

      The distribution of pleiotropic effects of mutations selected in a particular environment is of broad and fundamental significance. We've known for a while from large and even larger-scale screens of beneficial genetic variation that the rising tide of these mutants in the focal environment often lifts other boats in neighboring conditions, but not in orthogonal conditions, where outcomes are unpredictable. This beautifully written, executed, and analyzed study shows that we actually can gain predictability if the number of environments scales to dozens, mutants scale to hundreds, and most importantly, multidimensional analyses are taken seriously enough to derive the most salient predictor variables. Here, the magic number is 8 parameters, and the authors do a great job of justifying this decision given the noise of batch effects and the surprising power of the few, less explanatory parameters in the selective environment to explain variation in the more foreign environments.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary:

      The distribution of pleiotropic effects of mutations selected in a particular environment is of broad and fundamental significance. We've known for a while from large and even larger-scale screens of beneficial genetic variation that the rising tide of these mutants in the focal environment often lifts other boats in neighboring conditions, but not in orthogonal conditions, where outcomes are unpredictable. This well written, executed, and analyzed study shows that we actually can gain predictability if the number of environments scales to dozens, mutants scale to hundreds, and most importantly, multidimensional analyses are taken seriously enough to derive the most salient predictor variables. The authors find that a low-dimensional phenotypic model is sufficient for capturing fitness of the panel of mutants across subtle environmental perturbations - which suggests that only a few phenotypes contribute to fitness near the evolution conditions. Further, the model accurately predicts fitness in environments that deviate from the evolution condition, often through components that contribute little to fitness near the evolution condition - which suggests that adaptive mutants have latent phenotypic effects that only impact fitness in distant environments.

    1. Reviewer #3:

      This article asks whether within-trial (present) and ITI (past) task parameters are encoded in mPFC, and how encoding during these two trial epochs is related. They claim that firing in mPFC reflects past and present, but population encoding of past and present are independent. Further they show that the present is reactivated during sleep, not the past.

      On the face of it, this seems like an interesting paper. It is novel in that ITI encoding would be highly related to what was going on in the trial. The sleep finding is also interesting but I don't quite get the distinction between present and past for sleep. That could use some clarification.

      1) I'm not an expert in regards to this type of analysis, but throughout I was left with the feeling that I would prefer at least some single neuron data and firing rate analysis to complement the highly computational analysis, which frankly, was difficult to understand or critique by somebody who is not an expert.

      2) I would have liked to see more analysis of firing correlations with behavior. It seems to me if animals were doing different things during the trial and the ITI, then it might not be a surprise that there is independent encoding.

      3) I also wonder if the finding is solely dependent on the task (which is poorly described). It seems like there should be independent coding of past and present in this circumstance because they do not feed into each other, and behavior during one is independent of behavior in the other.

      4) Relatedly, the authors suggest that independent encoding can explain how the brain resolves interference between past and present, but in this task there was no interference between past and present, and the authors do not show that when there is more or less dependent encoding that there is more or less interference. Without this, it is unclear how important this finding is as it relates to performance and general mPFC function.

      5) Could activity reflect what the animal predicts will happen on the next trial, or what they are planning to do? It wasn't clear if that was examined.

      6) I have some issue with the definition of past and present in the context of this task. More justification should be provided.

    2. Reviewer #2:

      The study by Maggi and Humphries re-examines data by Peyrache et al. (2009), which the authors have themselves analysed previously (Maggi et al., 2018), recorded in rat prelimbic/infralimbic cortex (see comment below on terminology). In particular, they look at the relationship between decoding of task events during performance of a trial, and during the subsequent intertrial interval. (n.b. in this study, unlike in many studies, the ITI is considerably longer than the trial period). They find that although task-relevant information can be decoded during these two periods, the information is encoded in orthogonal subspaces during trials ('the present') and ITIs ('the past'). They build on this to examine how information is encoded during sleep following training (vs a pre-training control period). They find that only the trial subspaces are reactivated during sleep, not the ITI subspaces, and more so if the rat received a higher rate of average reward.

      On the whole, I found this an interesting paper with a clear set of findings, and well-analysed data. Although the advance is in some ways an incremental one on previous studies of sleep/replay, and on the authors' previous analyses of this dataset, the study will undoubtedly be of interest to researchers who are interested in consolidation of past experience during sleep. In particular, the study benefits from being able to look for two different types of information ('past' and 'present' decoders) in the same sleep recording sessions. There were a few things that I felt the authors could address:

      1) For the cross-decoding analysis in figure 2 b, it is not entirely clear from the main text which part of the trial and ITI coding is being used here. It seems to me like a more useful way of showing the cross-decoding analysis would be to show the 10x10 matrix of cross decoding accuracy for each of the 5 maze positions in both trials and ITIs. This is, I think, different from what the analysis in figure 3g is trying to show (which plots the classification error after dimensionality reduction to a 2D space).

      2) It was surprising to me that the authors do not mention the finding in figure 4e anywhere in the abstract or introduction. It makes the reactivation story far more compelling if it can be linked to a change in behaviour during the preceding trials. I think this finding would benefit from not being buried deep in the results section.

      3) The finding in figure 5 seems slightly extraordinary. It suggests that reactivation decoding during sleep is reliable even if very long bins of activity are used to calculate the firing rate (e.g. up to 10s). Does this relationship ever break down? Presumably with the sleep data, it would be possible to extend bins up to 1 minute, 5 minutes, etc. If there is still more reactivation at these extremely long time-bin lengths, does this mean that these neurons are essentially more persistently active? One possible way to test for this might be to project the data recorded during sleep through the classifier weights, and then calculate the autocorrelation function of this projected data (e.g. Murray et al., Nat Neuro 2014) - if this activity becomes more persistent, the shape of the ACF may change post-training.

      4) I disagree with the use of the term 'medial prefrontal cortex' to describe this area of the rodent brain. Although this is the term used in the original paper by Battaglia et al. (2009), I would suggest the authors use the more anatomically precise description of 'prelimbic/infralimbic cortex', and mention that the recordings are ~2.7mm anterior to bregma (see supplementary figure 1 of Battaglia 2009 paper; see Laubach et al., eNeuro 2018 for further discussion on terminology). Also, when the authors discuss these recordings in the context of the wider literature, it is difficult to know how to relate activity in this dysgranular region of the rodent brain to regions of granular prefrontal cortex in the primate brain - given the anatomical correspondence between rodents and primates is very uncertain for these granular regions (e.g. citations to Schuck et al., 2015; Averbeck and Lee, 2006; etc). It would be good to acknowledge this somewhere.
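
      The test suggested in point 3 above (projecting activity through classifier weights and comparing autocorrelation functions) could be sketched roughly as follows. Everything here is an assumption for illustration: the activity is synthetic, driven by an AR(1) latent signal standing in for "less persistent" and "more persistent" epochs, the `weights` vector is a hypothetical stand-in for decoder weights, and the helper names are invented; it does not reproduce the analysis in the manuscript or in Murray et al.

```python
import numpy as np

def simulate_ar1(phi, n_bins, rng):
    """Latent signal with persistence controlled by phi (0 <= phi < 1)."""
    x = np.zeros(n_bins)
    for t in range(1, n_bins):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

def project(activity, weights):
    """Project (time_bins, neurons) activity onto a (neurons,) weight vector."""
    return activity @ weights

def autocorrelation(x, max_lag):
    """Normalized sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - lag], x[lag:]) / denom
                     for lag in range(max_lag + 1)])

rng = np.random.default_rng(1)
n_bins, n_neurons, max_lag = 5000, 20, 10
weights = rng.normal(size=n_neurons)  # hypothetical classifier weights

def synthetic_activity(phi):
    # Each neuron follows the latent signal scaled by its weight, plus noise.
    latent = simulate_ar1(phi, n_bins, rng)
    return np.outer(latent, weights) + rng.normal(size=(n_bins, n_neurons))

acf_fast = autocorrelation(project(synthetic_activity(0.2), weights), max_lag)
acf_slow = autocorrelation(project(synthetic_activity(0.9), weights), max_lag)

print("ACF at lag 5, less persistent:", acf_fast[5])
print("ACF at lag 5, more persistent:", acf_slow[5])
```

      A slower decay of the autocorrelation in one epoch than in another would be consistent with more persistent activity along the decoding axis, although with real recordings the bin width and non-stationarity during sleep would need separate consideration.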

    3. Reviewer #1:

      Maggi and Humphries examined how the coding of the present and past choices in the medial prefrontal cortex (mPFC) of the rats during a Y-maze task overlaps and whether they can be reliably distinguished. They found that the neural signals related to the animal's choice in the present and past are distinct and as a result they can be recalled separately, for example, during post-training sleep. Although these are very important questions and an interesting set of analyses have been applied, the results in this report are not entirely convincing, because the analyses did not successfully exclude some alternative hypotheses.

      1) The authors analyzed the signals related to the choice, light cue, and outcome separately, and this is possible because the relationship between the animal's choices and cues were decoupled by testing the animals under at least two different rules. There were a total of 4 alternative rules and different sessions included different subsets of these rules. It is possible that at least some results reported in this paper might vary depending on which of these rules were tested. For example, rules might affect how the animals learned the task. Therefore, the authors should provide more detailed information about how often different rules were used to collect the neural data reported in this paper, and whether any of the results change according to the rules used in a given session.

      2) The authors claim that the neural coding identified in this study does not depend on the signals in individual neurons by showing comparable results after removing the neurons with significant modulations. This logic is flawed, because the neurons without "significant" modulations might still include meaningful signals due to type II errors. Furthermore, if individual neurons carry absolutely no signals, how can a population of neurons still encode any signals? This might suggest some kind of joint coding, and the authors should not merely implicate such a possibility without more thorough tests.

      3) The authors analyzed the activity divided into 5 different epochs, where the position #3 corresponds to a choice point and #5 corresponds to the reward site. Therefore, it is surprising that the reliable outcome signals begin to emerge from the position #3 (i.e., choice point). Is this a false positive?

      4) The authors report that there is no retrospective coding, i.e., no coding of the choice in the previous trial. By contrast, during the intertrial interval (while the animal is returning to the start position), the signals related to the "past" choice were still present but different from how this information was coded earlier during the trial. This is not surprising since during the intertrial interval, the animal's movement direction is opposite compared to that during the trial, so this coding change could reflect the animal's sensory environment. Whether the brain encodes the present and past events using different coding schemes or not cannot be tested with such confounding.

      5) The authors tested whether the coding of present and past events is consistent using a transfer (cross-decoding) analysis. However, this is based simply on correlation, and does not exclude the possibility that neurons changing their activity similarly according to (for example) the animal's choice might also change their baseline activity between the two periods (as revealed by the analysis of "population activity" in Figure 3) or might additionally encode different variables. In this case, decoding based on simple correlation might not reveal consistent coding that might be present.

      6) Given the length of the inter-trial interval, it might be informative to examine whether neurons active during the early part of the inter-trial interval might be reactivated differently during sleep compared to those becoming active later during the intertrial interval.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript. Daeyeol Lee (Johns Hopkins University) served as the Reviewing Editor.

      Summary:

      Although the reviewers have acknowledged the significance of better understanding how neurons in the prefrontal cortex can simultaneously encode signals related to the animal's present and past behaviors, they were concerned that the findings reported in this paper did not control for potential confounding of behavioral variables during the epochs analyzed in this manuscript. They also raised several concerns about the analytical methods used. In the consultation, the reviewers all agree that the advance represented here is not to the level that would be expected by readers and the overall enthusiasm was limited.

    1. Reviewer #3:

      In this manuscript, Schorscher-Petcu et al. describe a very exciting new approach combining precise optogenetic stimulation of cutaneous nerve terminals with high-speed imaging for machine-guided behavior analysis. This work is timely, and there are many clear applications to understand peripheral somatosensory encoding using this strategy. More thorough methodology and guidance for future end users could be provided. However, I am much less enthusiastic about the conclusions drawn for a sparse neural coding hypothesis, based on the data presented. Significant support for this hypothesis would require more substantial revisions, including testing in mouse lines to target other specific sensory modalities, innervation regions, and possibly pain states.

      Substantive concerns:

      1) A major strength would be the ability to combine precise optogenetic stimulation with other behavioral assays. Can this be used in combination with existing nociceptive tests? For example, does the NIR-FTIR allow for tracking of spontaneous pain behaviors after intraplantar formalin or CFA? And can this then also be used to assess sensitization of genetically-identified fibers using scanned optogenetics?

      2) What is the rationale for varying the pulse-widths rather than light intensity for these experiments? Increasing light intensity will generally lead to larger ChR2 photocurrents, while changing light duration generally affects deactivation and desensitization kinetics. At a peripheral terminal, the effects of subthreshold depolarization may in fact mimic the physiological activation of endogenous receptors, like TRP channels. This level of fine-tuned control would be a significant advancement for understanding how information from different somatosensory modalities is processed and integrated.

      3) It would be useful to have more thorough characterization of the strengths and limitations of the optical system. For example, how quickly are the spatially patterned stimuli able to be moved? What is the maximal area for a single spot or array of spots, and how long does this take to scan? Does the time between patterned stimuli, both in a single spot or when spatially distributed, alter withdrawal responses? How quickly can the beam spot size be altered? These will be important points that potential users will need to consider before building this system.

      4) It would also be extremely helpful to provide more thorough details and discussion of implementing DeepLabCut analysis with this system.

      5) The proposed activation of myelinated A fibers is very surprising given the opsin expression patterns in TRPV1:ChR2 mice. The authors cite Arcourt et al.; however, that study did not find any expression of TRPV1 in its genetically-defined A-fiber nociceptors. Given this breeding strategy, can the authors please clarify and provide support for this apparent discrepancy?

      6) The response latencies in Figure 3 fit well with the hypothesis that fibers with different conduction velocities are activated by changing pulse areas. Do different stimulus intensities (or durations) preferentially activate A vs C-fiber afferents akin to electrical stimulation of dorsal roots in spinal cord recordings? Or does the larger stimulation area merely increase the probability that an A nerve ending is in the illuminated region? Could this alternatively be explained by additive depolarization or more complex spike interference at these axon collaterals that branch extensively in the skin? Also, do the response profiles vary after activation of a presumptive A vs C-fiber?

      7) Is the pain-related behavior in response to single or patterned optogenetic stimulation reduced by analgesics acting centrally or peripherally? This could reveal important differences in rapid reflex or protective behaviors and more complicated nocifensive responses, and support the authors' claims of true pain-related behaviors.

    2. Reviewer #2:

      In this manuscript, Schorscher-Petcu et al. developed a method/system for scanned optogenetic activation of nociceptors on the paw in freely behaving TrpV1-Cre::ChR2 mice, with concurrent measurement of both paw responses (using near-infrared frustrated total internal reflection to measure paw/floor contacts) and full body responses (scored using DeepLabCut). Using this approach, they showed that the number of activated nociceptors governs the timing and magnitude of rapid protective pain-related behavior. The detailed description of how to construct the setup, and the open availability of the software are useful for other labs to apply this method.

      I have three points that I would like the authors to address:

      1) I have a hard time evaluating the hierarchical bootstrap procedure, which references a pre-print. Is this method really ensuring that the results are more rigorous? Or is it needlessly complicating the reporting of fairly simple metrics for what appear to be obvious phenomena (Figure 3) like paw rise time?

      2) I have an issue with the word "sparse code". In neuroscience in general, sparse code refers to the phenomenon that a given stimulus only activates a very small percentage of neurons in a population. Here the authors refer to a single action potential elicited by optogenetic stimulus. Some other term should be used.

      3) For Figure 4 (whole body movement), the analysis should use a vector instead of a scalar. The example in Figure 4D clearly shows directionality, i.e. the nose moves toward the stimulated paw. But the authors only analyzed maximum distance (a scalar, not a vector). So the correlation in Figure 4F is showing "when body part A moves a lot, does body part B also move a lot". Instead, I think the analysis more in line with the examples would ask whether, when body part A moves in one direction, the direction of movement of body part B is correlated. In other words, the analysis needs to be done with distance expressed as some kind of vector, for example movement toward or away from the stimulated paw.
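
      One way to make the distinction in point 3 concrete is to project each body part's displacement onto the direction of the stimulated paw. The sketch below is only an illustration under stated assumptions: the function names, the 2D coordinates, and the example trajectories are hypothetical and are not taken from the manuscript or from DeepLabCut output.

```python
import math

def max_distance(trajectory):
    """Scalar measure: largest distance travelled from the starting position."""
    x0, y0 = trajectory[0]
    return max(math.hypot(x - x0, y - y0) for x, y in trajectory)

def signed_displacement_toward(target, trajectory):
    """Signed measure: displacement from the start projected on the unit
    vector pointing at the target. Positive means movement toward it."""
    x0, y0 = trajectory[0]
    ux, uy = target[0] - x0, target[1] - y0
    norm = math.hypot(ux, uy)
    if norm == 0:
        raise ValueError("body part starts at the target position")
    ux, uy = ux / norm, uy / norm
    projections = [(x - x0) * ux + (y - y0) * uy for x, y in trajectory]
    # Report the projection with the largest magnitude, keeping its sign.
    return max(projections, key=abs)

# Hypothetical tracked positions (arbitrary units) for two trials.
paw = (10.0, 0.0)
nose_toward = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
nose_away = [(0.0, 0.0), (-1.0, 0.0), (-3.0, 0.0)]

scalar_toward = max_distance(nose_toward)
scalar_away = max_distance(nose_away)
signed_toward = signed_displacement_toward(paw, nose_toward)
signed_away = signed_displacement_toward(paw, nose_away)
print(scalar_toward, scalar_away)   # identical magnitudes
print(signed_toward, signed_away)   # opposite signs
```

      The two example trials are indistinguishable by maximum distance but have opposite signs in the projected measure, which is the information a correlation across body parts would need in order to capture directionality.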

    3. Reviewer #1:

      The manuscript by Schorscher-Petcu is a very innovative study addressing an important problem in pain and somatosensory neuroscience - precise and remote delivery of sensory stimuli. The strength of this work is the experimental paradigm, as the biological insight seems quite weak and not more expansive than previous work from the authors and others in the field. One has to ask, is this work being sold on the tool or new biology? If it were the latter, this work could easily benefit by comparing the data with Trpv1-ChR2 with other sensory neuron populations - as the authors mention in the discussion. Nonetheless, the rationale for such a tool developed here is widely agreed upon in the field, and if others can easily adopt this strategy, this could become the standard for peripheral optogenetic stimulation of the hind paw.

      Major comments:

      1) It remains unclear to me how one actually remotely aims at the hind paw of interest. Is there a joystick where one aims at the paw? Relatedly, are there ever any misfires where one intends to aim at the paw but hits another area? Or does the mouse sometimes move when you intend to hit one area thus causing an unintended stimulus delivery?

      2) In Figure 2 the authors cite their previous studies which demonstrate that a brief optogenetic stimulus to the paw elicits a single action potential which is capable of causing a behavioral response. The authors then infer here that their nanosecond manipulation of light also influences single action potentials. However, without verifying that in this new experimental context, simply citing the older work is insufficient evidence to draw any correlation to action potentials.

      3) In Figure 3 the authors mention that in a fraction of trials (presumably ~35%) the paw moved but did not withdraw, and that this was detected by the acquisition system and not by eye. I am confused about what the authors are considering a paw withdrawal. Is not any paw lift also a withdrawal? Additionally, how can the acquisition system see things that cannot be seen by the experimenter? Could this point towards an error of the system? Is there an independent validation of how well the system is working compared to some benchmark?

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The manuscript by Schorscher-Petcu is a very innovative study approaching an important problem in pain and somatosensory neuroscience - precise and remote delivery of sensory stimuli. This work is timely, and there are many clear applications to understanding peripheral somatosensory encoding using this strategy. The rationale for such a tool developed here is widely agreed upon in the field, and if others can easily adopt this strategy, this could become the standard for peripheral optogenetic stimulation of the hind paw.

    1. Reviewer #3:

      The study by Mangeol et al. aims to dissect the localisations, interactions and hierarchical order of apical protein complexes crucial to the generation and maintenance of epithelial polarity in epithelial tissues.

      They analyse by super-resolution microscopy (STORM) three different mature epithelia, human and mouse intestine as well as mature Caco-2 cells in culture. Using immunofluorescence labeling of endogenous proteins, they compare individual components to markers of tight junctions, to each other and to the actin cytoskeleton. They identify defined clusters in defined sub regions of the apical domain of the analysed cells, raising interesting questions for future analyses.

      The subject matter of the study, the generation and maintenance of epithelial polarity and the role of apical polarity complexes, is clearly a very important one, especially as most organ systems are epithelial in nature. And despite decades of study, many questions are still unresolved.

      The imaging performed in this study is skilful and beautifully presented. The imaging achieving, according to the authors, an isotropic resolution of about 80nm is impressive. Because of this great gain in resolution compared to other studies of similar components I have a couple of technical questions or comments:

      1) I would very much appreciate some comments or thoughts on the fact that polarity proteins were revealed using antibodies. Antibodies are in the range of 10-15nm in length, so with an isotropic resolution of 80 nm, this might have to be taken into account when using primary and secondary antibodies to reveal proteins. In particular, monoclonal versus polyclonal antibodies might have differing effects on localisation precision.

      2) The authors use rather high concentrations of detergent (1% SDS or 1% Triton X-100) for permeabilisation according to their protocols. Are they not worried that this might affect tissue integrity and protein distribution?

      The authors rightly point out where their study fits within what has been attempted by other labs previously in order to understand and dissect apical polarity complex function. They clearly define interesting aspects, such as PALS1-PATJ and aPKC-PAR6 forming independent clusters, and the lack of colocalisation and thus maybe association with Crumbs3. In contrast to the last sentence of their abstract, 'This organization at the nanoscale level significantly simplifies our view on how polarity proteins could cooperate to drive and maintain cell polarity.', I cannot yet see what these results simplify about our understanding of apical polarity complexes and even more so what the authors' new model is of how the complexes work. This needs to be spelt out more clearly, please. And I would also point out that, in part, other studies have pointed in the same direction. The recent paper by the Ludwig lab (Tan et al. 2020 Current Biology 30, 2791-2804) points in part in a similar direction, identifying a vertebrate 'marginal zone' similar to the one already known from invertebrate epithelia, as well as identifying basal to this an apical and basal tight junction area. Furthermore, as the authors themselves discuss in the discussion, the 'splitting away' of Par3 has been observed in Drosophila epithelia (embryonic, follicle cells and eye disc), and should maybe be introduced already at an earlier point of the paper. Furthermore, papers by Wang et al. and Dickinson et al., that also analyse PAR complex clustering should be cited and mentioned in the introduction/discussion (Wang, S.-C., Low, T. Y. F., Nishimura, Y., Gole, L., Yu, W., & Motegi, F. (2017). Cortical forces and CDC-42 control clustering of PAR proteins for Caenorhabditis elegans embryonic polarization. Nature Cell Biology, 19(8), 988-995. http://doi.org/10.1016/S0960-9822(99)80042-6; Dickinson, D. J., Schwager, F., Pintard, L., Gotta, M., & Goldstein, B. (2017). A Single-Cell Biochemistry Approach Reveals PAR Complex Dynamics during Cell Polarization, 1-42. http://doi.org/10.1016/j.devcel.2017.07.024).

      I am also a bit confused by the analysis presented in Figure 5 with regards to colocalisation of components with apical F-actin structures and the deduction from these and the EM data that some components, aPKC/Par6, localise to 'the first row of' microvilli near junctions whilst PALS1-PATJ localise near the base of said microvilli. How would localisation to the apical plasma membrane outside of or within microvilli be restricted to only the ones near junctions? There is not only F-actin in microvilli but also all over and near the apical cortex, so what distinguished the ability of aPKC/PAR6 to bind to actin in microvilli? The PATJ knock-down results are interesting, and I agree suggestive of some interaction between the complexes and actin organisation. But without further analyses as to what other components might be affected in their localisation in this situation, it is hard to judge whether the effect on actin is a direct or rather indirect one, so I am unsure as to what these images add without more in depth follow-up.

      Some more specific comments:

      Figure 1: It would be good to show and demonstrate that Occludin and ZO-1 labeling are completely interchangeable in terms of localisation precision.

      Figure 3: I do understand the authors' rationale for analysing the localisation in the orientation (planar versus apical-basal) that reveals the largest distance, but it would be good to nonetheless show the other orientation for completeness (maybe as supplementary).

    2. Reviewer #2:

      The manuscript addresses a fundamental problem: the organisation of epithelial polarity determinants at the apical domain of human epithelial cells. The authors use STED microscopy to examine antibody-stained fixed Caco2 cells. My major concern is that the process of fixation and immunostaining may introduce artefacts that are causing the segregated dots to appear. This issue could be addressed by using CRISPR-knockin GFP versions of some of the proteins studied, which is technically straightforward to perform these days, and would allow the conclusions to be drawn with full confidence.

    3. Reviewer #1:

      Mangeol et al investigate the nanoscale organization of apical-basal polarity complexes using super-resolution microscopy approaches (STED) in polarized intestinal epithelial cells, both in culture and from in vivo tissue samples. They provide a careful characterization of Par3-Par6-aPKC and Patj-Pals1-Crb3a localization relative to tight junctions in both planar and apical-basal axes. They find that each protein localizes in the near vicinity of the tight junction, in a clustered organization. Through pairwise colocalization analyses, they observe significant separation of polarity proteins that are generally considered to be part of the same molecular complex based on biochemical assays. Specifically, PAR3 is not associated with aPKC or PAR6, and CRB3a colocalizes poorly with all other polarity proteins.

      Overall, this paper provides a thorough description of polarity protein localization at the submicron scale. The data are presented in a clear and convincing manner and the conclusions are largely consistent with the data. The unexpected separation of polarity proteins suggests that some of the previously described biochemical interactions may be transient, warranting further investigation comparing different stages of polarization. These findings will be of interest to those in the field of cell polarity.

      Comments/concerns:

      1) All of the results depend on antibody quality, specificity, and antigenicity, but no antibody validation is provided (with the exception of PATJ). If one primary antibody is less specific than the others, the colocalization data will be heavily skewed, and the proteins will appear not to be colocalized. Perhaps this can explain why Crb3a fails to colocalize with the other proteins? Validating the results with a second primary antibody or an endogenously tagged GFP-fusion protein would alleviate this concern.

      2) The authors show that CRB3a doesn't colocalize with PALS1 or PATJ, suggesting another transmembrane protein recruits them to the membrane. Could this function be provided by another CRB family member or is CRB3a the only one expressed in intestinal epithelia?

      3) The super-resolution characterization of actin organization is not as extensive or convincing as the description of polarity protein localization. A closer examination of actin organization relative to PATJ and aPKC at junctional, apical, and villi positions would strengthen the findings in Figure 5.

      4) In some cases the number of biological replicates is small. Only one mouse sample was used, and the quantifications of junctions are performed across just 1 or 2 cell culture replicates (although more replicates were performed, just not used for quantification). Therefore, the data reflect the variability across junctions (violin plots in Figs 1-2) but they don't reflect the variability across biological replicates. This also means the p-value in Figure 5 was calculated using n=number of junctions rather than n=experimental replicates, which would be a more appropriate comparison of means. Quantifying the data across 3 biological replicates to show the variability across experiments would greatly strengthen the results and conclusions.
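
      The distinction drawn in point 4 between n = junctions and n = experimental replicates can be sketched as follows. The measurements, replicate labels, and function names are entirely hypothetical and serve only to show how the choice of n changes the estimated uncertainty; this is not a reanalysis of the authors' data.

```python
import math
import statistics

# Hypothetical junction-level measurements grouped by biological replicate.
junctions_by_replicate = {
    "replicate_1": [1.0, 1.2, 1.1],
    "replicate_2": [2.0, 2.1, 1.9],
    "replicate_3": [3.0, 3.1, 2.9],
}

def standard_error(values):
    """Standard error of the mean for a list of independent values."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Option A: pool every junction and treat each as an independent sample.
pooled = [v for values in junctions_by_replicate.values() for v in values]
n_pooled = len(pooled)
sem_pooled = standard_error(pooled)

# Option B: summarise each replicate by its mean, then use n = replicates.
replicate_means = [statistics.fmean(v) for v in junctions_by_replicate.values()]
n_replicates = len(replicate_means)
sem_by_replicate = standard_error(replicate_means)

print(f"pooled junctions:  n={n_pooled}, SEM={sem_pooled:.3f}")
print(f"replicate means:   n={n_replicates}, SEM={sem_by_replicate:.3f}")
```

      When most of the variability lies between replicates, as in this made-up example, pooling junctions understates the uncertainty, which is why a p-value computed with n = number of junctions can look more convincing than the experiment warrants.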

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      This manuscript is in revision at eLife.

      Summary:

      The manuscript addresses a fundamental problem: the organisation of epithelial polarity determinants at the apical domain of human epithelial cells. Mangeol et al investigate this question using super-resolution microscopy approaches (STED) in polarised intestinal epithelial cells. Using immunofluorescence labeling of endogenous proteins, they provide a careful characterization of Par3-Par6-aPKC and Patj-Pals1-Crb3a localization relative to tight junctions. They find that each protein localizes in the near vicinity of the tight junction, in a clustered organization. Through pairwise colocalization analyses, they observe significant separation of polarity proteins that are generally considered to be part of the same molecular complex based on biochemical assays. Specifically, PAR3 is not associated with aPKC or PAR6, and CRB3a colocalizes poorly with all other polarity proteins, raising interesting questions for future analyses.

      The imaging performed in this study is skillful and beautifully presented and, achieving an isotropic resolution of about 80nm, is impressive. However, because of this great gain in resolution compared to other studies of similar components, the major concern of all three reviewers is that the process of fixation and immunostaining may introduce artefacts that are causing the segregated dots to appear. Variable antibody quality and insufficient validation of antibody specificity raise additional concerns about the observed patterns of localization.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      RESPONSE TO REVIEWER #1

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Ishihara et al. investigate and compare microtubule polymerization/depolymerization dynamics inside vs. at the periphery of microtubule asters in a cell-free Xenopus egg extract system. By tracking EB comets, which localize to growing microtubule ends, they find that the microtubule growth rates and EB comet lifetimes (interpreted as an indicator of microtubule catastrophe rates) are similar between the two spatially-distinct microtubule populations. However, using a tubulin-intensity-difference image analysis, the authors are also able to measure local microtubule depolymerization rates, and they find a significant difference in depolymerization rates of the two populations. Specifically, the authors report that the microtubule depolymerization rates measured within asters are faster than those measured at the periphery.

      Specific comments:

      Figure 2.

      In the text, the authors report: "The depolymerization rate was 36.3 ± 7.9 μm/min (mean, std) in the aster interior, compared to 29.2 ± 8.9 μm/min (mean, std) at the aster periphery." This difference is certainly not two-fold (as stated in the abstract). It would also be useful to mark the mean rates on the graph in 2B.

      We removed the words ‘almost two-fold’ in the abstract. In the revision, we will mark the mean rates on Fig. 2B (using vertical lines).

      The bimodal shape of the depolymerization rate distributions in 2B is very interesting. This definitely warrants further investigation. At a minimum, the depolymerization rates should be determined at 50 µm intervals, as done for other parameters in Figure 1. Could it be that there are two coexisting populations of microtubules at the same location? Or is there a clear spatial compartmentalization of the two that is not obvious here because the distance interval used for the measurements is too large? This is a very important distinction for the claims of the paper.

      We understand the reviewer’s concern. There are some technical limitations that make the depolymerization measurement more challenging. While we use widefield imaging of EB1-GFP comets to obtain polymerization rates from a field of view spanning 500 microns, we may only use TIRF imaging for depolymerization measurements. In this method, we are limited to observing microtubules very close to the cover slip in a small field of view of 80x80 microns at 500 ms time intervals (movies span 1-2 minutes). One would need to move the TIRF field every 1-2 minutes at 50 micron intervals, but the aster periphery would be changing during this time, so the exact location of the measurement is hard to define. Thus, we opted to image the two spatial extremes: interior (close to the MTOCs) and the very periphery (where MT density is still sparse.)

      Perhaps the largest limitation of this approach is the choice of peripheral regions based on the apparent sparsity of MTs in the TIRF field of view. Indeed, when we examine the depolymerization rate distributions for individual movies separately (see figure below, periphery #1-3 are three individual movies), we observe that some movies have rates as low as 20 µm/min, while others have higher values with a center around 36 µm/min. The depolymerization rates for the interior also vary, with mean values ranging from 34.8 to 43.2 µm/min (interior #1-3 are three individual movies). In general, the spread of depolymerization rate within a field of view as well as across different fields of view is much larger than for polymerization. It is possible that this is partly explained by the lack of precise definition of interior vs. periphery in this TIRF-based measurement approach.

      Our data still supports the spatial regulation of depolymerization rate. However, there is no clear evidence for a bimodal distribution of depolymerization rate in any given field of view (80x80 micron square region). To clarify this point, we have removed the language “bimodal” in the main text. In the revisions, we will provide this figure as a supplement.

      We thank reviewers #1 and #2 for the critical feedback that allowed us to clarify this issue of apparent bimodality of the depolymerization rates.

      The authors make a point here that the distribution of measured polymerization rates is fairly narrow. This appears to be in contrast with Figure 1B, where polymerization rates take on a wide range of values. How do the two distributions of polymerization rates obtained by these two methods compare?

      To address this point, we directly compare the standard deviation of the polymerization rate measurements. For Fig. 1B EB1 tracking measurements, std ranges from 7.7-10.5 µm/min for a given spatial bin (as stated in Fig. 1B legend), while for Fig. 2A TIRF measurements std is 4.0 (periphery) and 4.5 µm/min (interior) as stated in the main text. Given that the mean values of polymerization rates are similar, this suggests that the TIRF measurements are less noisy. This further highlights the relative pros and cons of the two measurement methods. To discuss these issues, we have added a new paragraph in the discussion section.

      Figure 3.

      The laser ablation figure and movies are beautiful, but don't seem to add support to the story. Importantly, the authors do not confirm any spatial variability in depolymerization rate with these experiments. As a matter of fact, although the laser ablation experiments are only performed in the aster interior, the measured depolymerization rates appear to be just as consistent with the periphery rates in Figure 2 as they are with the interior rates in Figure 2. (They span quite a large range of values, with the average right in the middle between what was measured for the two areas in Figure 2.)

      Indeed, the values obtained with laser ablation are quite variable, even compared to the physiological depolymerization rate measured via TIRF microscopy. This perhaps reflects the variability of biology as well as the nature of the laser ablation which measures depolymerization rate at the level of microtubule populations. We hope our paper will increase interest in this rarely measured parameter, and perhaps invention of new probes to measure it more accurately and conveniently.

      Given the variability of our measurements, we conclude that the results between the TIRF based approach vs. laser ablation based approach of depolymerization rates are indistinguishable. We agree with the reviewer that the data does NOT argue that laser ablation results are more consistent with the interior TIRF measurements than peripheral TIRF measurements.

      To clarify this point, we remove the following clause “, which was comparable to the modal value of the depolymerization rates in the aster interior (Fig. 2).”

      We change the concluding sentence of our laser ablation paragraph from

      “Overall, these observations suggest that depolymerization dynamics are similar for plus ends following a natural catastrophe vs. ablation in the aster interior.”

      to

      “Overall, these observations confirm that depolymerization rates are variable, and we find no statistical distinction of rates between plus ends following a natural catastrophe vs. ablation.”

      Although the authors report they don't see any correlation between the distance and depolymerization rate, they should still plot the rate as a function of initial cut positions (Figures 3D, 3E).

      To address this concern, we plan to provide a supplemental figure in the revision. Please see the preliminary figure below. Due to technical limitations with the laser ablation system (field of view for 60x magnification), we only have measurements that span 15-100 microns from the center.

      From the single decaying inward wave the authors conclude that microtubules depolymerize fully to their minus ends which are distributed throughout the aster. Can the possibility that depolymerization is stopped by microtubule lattice defects/islands be excluded by these observations?

      The existence of microtubule lattice defects is a recent finding in the field, and much remains unknown. If we assume that defects are structurally unstable, we predict that the episode of depolymerization will continue even when reaching a defect. If defects are stable and lead to instantaneous rescue of plus ends, we cannot distinguish the defects from minus ends. In this latter scenario, the interpretation of the decaying inward wave requires caution.

      What are the effects of the local increase in tubulin concentration due to the subunit release by depolymerization? What about the release of other lattice-binding MAPs (stabilizers)?

      We are interested in these questions as well. Soluble GDP-bound tubulin, released by depolymerization, is thought to exchange its nucleotide to GTP without need of a GEF, and no GEF is known. The dissociation rate of GDP is ~0.1 [1/sec], for a half-life of ~5 sec (Brylawski and Caplow, 1983, J. of Biol. Chem.), so we believe the tubulin subunits are recycled relatively quickly. It is not entirely obvious whether this necessarily results in a significant increase in ‘soluble’ tubulin concentration given tubulin diffusive transport. We hypothesize the main effect of stabilizing MAPs is on the depolymerization rate as discussed in our model in Fig. 5.
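
      As a quick check of the timescale quoted above, the half-life implied by a dissociation rate can be computed directly. The sketch assumes simple first-order (single-exponential) dissociation, which the response does not state explicitly; the function names are illustrative.

```python
import math

def half_life(rate_per_s):
    """Half-life of a first-order process: t_half = ln(2) / k."""
    return math.log(2) / rate_per_s

def fraction_remaining(rate_per_s, t_s):
    """Fraction still GDP-bound after t seconds, assuming exp(-k * t) decay."""
    return math.exp(-rate_per_s * t_s)

k_off = 0.1  # 1/s, the GDP dissociation rate quoted in the response
t_half = half_life(k_off)
print(f"half-life: {t_half:.1f} s")
print(f"fraction GDP-bound after 5 s: {fraction_remaining(k_off, 5.0):.2f}")
```

      Under this assumption a rate of about 0.1 per second gives a half-life of roughly 7 seconds, the same order of magnitude as the approximately 5 seconds quoted, so the conclusion that subunits are recycled within seconds is unchanged.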

      Figure 4.

      Is the local depletion of tubulin/EB1 thought to be only within the narrow annulus at ~100 um distance, or is it not measurable on the inside due to the polymer signal? Can the two be separated? Such a sharp transition within a discrete annular region doesn't speak to the relative effects on the inside vs. the outside of the aster?!

      Yes, we also believe the soluble tubulin levels are even lower in the more inner regions of the aster. However, polymerized tubulin accounts for a large part of the fluorescence intensity in these inner regions, and our method does not faithfully reflect the soluble fraction. It will be important for future studies to employ specific methods that may unequivocally distinguish polymer vs. soluble tubulin concentrations (see below).

      More importantly, the local depletion of either tubulin or EB1 is not a good representation of a depletion of a MAP component that associates with the microtubule lattice. Both tubulin and EB1 bind preferably to microtubule ends, not lattice. Thus showing a profile of slight local tubulin and/or EB depletion does not seem to be relevant for the proposed model. Rather, overall microtubule polymer mass/density as a function of distance may be more relevant?

      Reviewer #1 makes a valid point that tubulin and EB1 are specifically incorporated to plus ends and not to the entire lattice as we assume for the MAPs in our theoretical model. To address this issue, we analyzed the fluorescence intensity of images obtained for a MAP that associates with the MT lattice, Tau-mCherry (Mooney et al. 2017). This quantification shows a depletion pattern similar to tubulin and EB1. Thus, we believe the local depletion is a general feature. For the revision, we plan to incorporate this Tau-mCherry data in Fig. 4.

      Figure 5.

      The toy model is intuitive and clear, but not sufficient without any experimental investigation. An attempt to quantify the actual distributions of at least one or a few selected proposed MAPs is needed. Is the depletion strongest where microtubule density is highest? What is the ratio of a MAP intensity to microtubule polymer density as a function of distance? How does that relate to local depolymerization rates? What are other testable model predictions that can show support for the proposed mechanism?

      We understand that our proposal is rather speculative, and the goal of this manuscript was to propose a hypothesis that may inspire others working on the assembly of intracellular organelles. Although Tau is not an endogenous component of the egg extract system, we believe that our new quantification of Tau-mCherry depletion adds more credibility to our general proposal.

      Microtubule density is roughly uniform within the interior of the aster according to our current understanding (Ishihara et al. 2016 eLife). So the MAP:MT ratio is relatively uniform throughout the aster except at the very periphery where there are very few MTs assembled (i.e. “depletion is weakest where MT density is lowest.”)

      In the future, we may perform (1) FCS measurements of candidate MAPs to directly measure the concentration profile of the candidate MAP in soluble form and (2) depletion/addback to show which MAP most affects depolymerization rate. Although these experiments are appealing, they require the generation of new molecular reagents as well as calibration of a highly specialized optical method. Therefore, we decided to limit this paper to the unusual observation of the variation of depolymerization rate and to speculate on the underlying mechanism.

      Also, the table is insufficiently described. Are any or all of these MAPs known to be specific regulators of microtubule depolymerization rates, but not other dynamics parameters?

      There are a large number of MAPs in Xenopus eggs, as there are in all cells, and the degree to which their effects on microtubules have been characterized is variable. To address this comment, we include in the revised manuscript a list of known MAPs that are present in Xenopus egg extract, along with their estimated concentrations from a published proteomic study. We annotate each MAP as to whether it increases or decreases microtubule stability, acknowledging that these data are very incomplete, that in some cases there is disagreement in the literature, and that we are combining pure-protein and whole-cell analyses. This table illustrates the challenge of associating dynamics regulation with any one MAP, since the behavior of microtubules is regulated by all these factors operating in parallel. That said, certain MAPs stand out as candidate depolymerization regulators that have been little studied for effects on dynamics, for example, MAP7.

      In the revision, we propose to add this expanded table as a supplementary table in addition to Table 1.

      | Protein Description | Gene Symbol | Est. Conc. (nM) | MT polymerization/nucleation/rescue? | MT depolymerization/catastrophe? | Lead reference |
      |---|---|---|---|---|---|
      | Microtubule-associated protein RP/EB family member 1 | MAPRE1 | 1800 | Increase | Decrease | PMID: 18364701 |
      | Stathmin | STMN1 | 1600 | Decrease | Increase | PMID: 11792540 |
      | MAP4 | MAP4 | 960 | Increase | Decrease | PMID: 7962090 |
      | Echinoderm microtubule-associated protein-like 2 | EML2 | 580 | Decrease | Increase | PMID: 11694528 |
      | EML4 protein | EML4 | 500 | Increase | Decrease | PMID: 17196341 |
      | Disks large-associated protein 5 | DLGAP5 | 380 | Increase | Decrease | PMID: 16631580 |
      | Cytoskeleton-associated protein 5 | CKAP5 | 300 | Increase | Increase | PMID: 23666085 |
      | Kinesin-like protein KIF2C | KIF2C | 200 | Decrease | Increase | PMID: 12620232 |
      | CAP-Gly domain-containing linker protein 1 | CLIP1 | 190 | na | na | |
      | Cytoskeleton-associated protein 4 | CKAP4 | 160 | Increase | Decrease | PMID: 9799226 |
      | Echinoderm microtubule-associated protein-like 1 | EML1 | 140 | na | na | |
      | Ensconsin | MAP7 | 91 | na | Decrease | PMID: 31391261 |
      | Targeting protein for Xklp2 | TPX2 | 91 | Increase | Decrease | PMID: 26414402 |
      | Microtubule-associated protein 1B | MAP1B | 85 | Increase | Decrease | PMID: 7664878 |
      | MAP1S | MAP1S | 66 | Decrease | Decrease | PMID: 25300793 |
      | Hyaluronan mediated motility receptor | HMMR | 61 | na | na | |
      | MAP7 domain-containing protein 1 | MAP7D1 | 47 | na | na | |
      | Cytoskeleton-associated protein 2 | CKAP2 | 46 | Increase | Decrease | PMID: 15504249 |
      | Microtubule-associated tumor suppressor 1 | MTUS1 | 43 | na | na | |
      | Kinesin-like protein KIF2A | KIF2A | 37 | Decrease | Increase | PMID: 29980677 |
      | CLIP-associating protein 1 | CLASP1 | 30 | Decrease | Decrease | PMID: 29937387 |
      | Microtubule-associated protein RP/EB family member 3 | MAPRE3 | 21 | Increase | Decrease | PMID: 20850319 |
      | MAP7 domain containing 2 protein variant 2 (Fragment) | MAP7D2 | 8 | na | na | |
      | CAP-Gly domain-containing linker protein 4 | CLIP4 | 2 | na | na | |

      Minor comments:

      Figure 1.

      typo in the figure legend: "interior (distance>300 μm) vs. periphery (50 μm<distance<280 μm)" There appears to be a clear dip in EB1 density at 100 um (Figure 1C). What could be the cause of that?

      Thank you for catching the typo. We corrected this to “periphery (distance>300 µm) vs. interior (50 µm<distance<280 µm)”.

      Figure 2.

      Note that the distances used in Figure 2. to define 'interior' and 'periphery' are completely different than those in Figure 1. (Interior in Figure 1 is defined to be between 50 and 280 um from the MTOC, and exterior larger than 300 um. However, in Figure 2. interior is defined as less than 100 um, and exterior as larger than 200 um.) Given that the asters are actively growing, it would be good to clearly explain how these intervals were defined in each case.

      For both experiments, we had clearly stated the definitions of interior and periphery, either in the figure legends or in the methods section. We have added a new paragraph explaining why we could not choose exactly the same quantitative definitions for these two methods (please also see our reply to Reviewer #2 comment 1).

      In the periphery movie, there are several notable examples of apparent minus-end depolymerization and treadmilling. The authors state these are very rare - perhaps a quantification would be useful here?

      Thank you for pointing this out. We modified the sentence to reflect the outward depolymerization events in the periphery. “We observed few outward-moving depolymerization events (

      Reviewer #1 (Significance (Required)):

      The observation of distinct depolymerization rates within vs. at the periphery of microtubule asters is novel and interesting. However, the manuscript in its current form is rather preliminary. The observation can be significantly strengthened by additional experiments/analysis that would characterize the effect in more detail. Even more importantly, the authors propose a highly speculative (although compelling) mechanism, but make no attempt to test it in any way. This is a major deficiency of the current manuscript that should be addressed prior to publication.

      REFEREES CROSS COMMENTING

      I agree with Reviewer #2 that our comments are both overlapping and complementary. I also find Reviewer #2's comments fair and reasonable and see no need for further adjustments.

      RESPONSE TO REVIEWER #2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      SUMMARY

      This paper reports measurements of microtubule dynamics in interphase asters nucleated in Xenopus egg extracts. Dynamics are measured using two methods. First tracking of GFP tagged EB1 protein forming comets at the tips of growing microtubules, as used in other studies, which can only measure growth rates. Second using a recently developed automated tracking based on subtractive difference images of fluorescently labelled microtubules, which can measure both growth and shrinkage rates. The main and novel observation of this paper, using difference image tracking, is that the MT shrinkage rate is ~2 fold faster in the interior of the aster compared with the periphery of the aster, whilst rates of MT polymerisation and catastrophe vary only slightly, if at all. The authors speculate that this might be due to a reduced MAP concentration and occupancy in the aster interior. They also discuss the role of a depletion-dependent increased shrinkage rate as a feedback mechanism to maintain a low MT polymer density in the aster interior.

      MAJOR COMMENTS

      The movies are startling in their beauty and clarity and the key conclusion that the shrinkage rate is significantly faster in the interior compared to the periphery of the aster is convincing.

      The observation that the net MT plus end growth rate is ~10% faster at the periphery compared to the interior of the aster is only supported by the EB1 tip tracking method. The difference imaging method shows no significant difference in rates. The authors need to discuss this discrepancy between the established and new methods of analysis. It is insufficient to state that the growth rates obtained by the two methods are "consistent".

      This comment prompts a comparison of the two methods (EB1 tracking vs. TIRF difference imaging). On one hand, EB1 tracking is more sensitive in detecting plus ends and allows large-N observations, so it is more likely to show statistical significance. On the other hand, the EB1 tracking method is noisier (higher standard deviation) than the TIRF-based measurements (see our response to Reviewer #1). In the TIRF difference imaging, the exact location of the periphery (relative to the center as well as the overall microtubule density profile) is hard to evaluate.

      What is consistent between the two methods is the approximate mean value of polymerization rates. The 10% faster polymerization velocity is only suggested by the EB1 tracking method, calling for caution/further investigation. However, the potential relatively small difference in polymerization rate is not the main point of this paper.

      We deleted the sentence in the results section for the TIRF method: “These values of polymerization rates are consistent with EB1 comet tracking (Fig. 1).” We have added a new paragraph discussing the discrepancies between the methods in reporting polymerization rate.

      The discussion proposing MAP depletion-dependent increased shrinkage rate as a feedback mechanism to limit MT polymer density is reasonable.

      The model and discussion of the role of MAPs might be criticised as highly speculative and unsupported by any experimental data. The authors do acknowledge this. Whether the ratio of data to speculative interpretation is appropriate will be an editorial decision for whichever journal ultimately hosts this.

      Thank you. This is exactly the kind of comment that we wanted to hear from an initiative like Review Commons. It helps us gauge how our work is received and decide to which journal to submit it.

      In particular, since the aster forms by growth from the nucleating bead, early in its formation the final interior MTs must have first formed the peripheral MTs and could therefore enter fresh media and bind MAPs. The authors show by calculation that as the aster expands, these MTs and MAPs become isolated from mixing with the external media. This isolation would then suggest that any MAPs released by dissociation or MT depolymerisation must remain in the interior, and are therefore available to rebind to newly formed MTs. So, it is unclear why the MAPs should be depleted in the interior compared to the periphery, unless expansion of the aster is slowed, in which case additional MAPs could diffuse into the stationary periphery from the surrounding media. The kinetics of MT growth, MAP binding and aster expansion would then also be expected to have an effect on the outcome beyond a simple "depletion" of the internal MAP concentration.

      We use the term “depletion” to mean a significant decrease of MAP from the cytoplasm. As outlined in our toy model, more MTs lead to more MAP binding and depletion of soluble MAPs. Note that the total local abundance of MAP is constant unless there is significant diffusive transport of MAP from one region to another. We argue this transport is ineffective for the large length scale of interphase asters.

      It is also not clear how the authors' preferred model would account for the suggestion of bimodal shrinkage rates. It is not clear if this is a simplification (binning things into external and internal) applied for the purposes of discussion.

      Please see our comment to Reviewer #1. We now believe there is no evidence for bimodality of depolymerization rates. The spread of the data reflects the variability of depolymerization rates in a given field of view as well as the variability across multiple fields of view.

      MINOR COMMENTS

      Line 71

      Authors reference Gardner et al 2011, when discussing depolymerisation as a zero order process, as showing a free tubulin dimer concentration effect on shrinkage rates. However, the results in Gardner refer to the off rate during MT polymerisation, and measurements of rapid small scale events during overall growth phases and would be applicable to GTP-heterodimers, whereas the extended shrinkage events measured in this paper would presumably apply to post-catastrophe GDP-heterodimer dissociation and may not be comparable. The reference should be omitted or a further explanation given.

      Thanks, good point. We wanted to cite Gardner et al (2011) to make the point that classic assembly models may not always hold, but the reviewer is correct, that paper only looked at concentration dependence of depolymerization at growing ends. The text was changed to:

      “This assumption has been questioned for growing ends (Gardner 2011), but not for shrinking ends to our knowledge.”

      Line 89

      States "density of plus ends is approximately homogenous within interphase asters"

      However, in results section it is stated Line 111 that "the plus end density is lower at the periphery compared to the aster center".

      Please clarify

      The plus end density is approximately homogenous from the center to the periphery of the aster. However, only at the most peripheral region, where there are few microtubules, the density drops.

      Line 135

      The distances given for the interior and periphery appear to be mixed up.

      Thank you, we corrected this.

      Line 277

      "approximately consistent with our Peclet number estimate". 50µm gives a Pe value of 2.8. The Peclet number "significance" is earlier given in terms of "Pe>>1" (Line 255). Please clarify what range of experimental values is required for the argument to hold.

      Our statement was unclear. We modified the sentence in the following way to clarify our point: “The half-width of the depleted zone extended ~50 microns beyond the growing aster periphery, which is smaller than the typical aster radius. This analysis indicated that soluble protein levels may vary between subregions of growing asters due to subunit consumption.”
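
      To make the length-scale argument concrete, here is a minimal sketch using the standard definition Pe = vL/D, where v is a transport velocity, L a length scale and D a diffusion coefficient. It uses only the reviewer's figure (Pe of 2.8 at 50 µm). The threshold Pe = 10 is an arbitrary stand-in for "Pe>>1", and the function names are illustrative, not taken from the manuscript.

```python
def peclet(velocity: float, length: float, diffusivity: float) -> float:
    """Peclet number Pe = v * L / D (units must be mutually consistent)."""
    return velocity * length / diffusivity


def length_for_peclet(target_pe: float, ref_pe: float, ref_length_um: float) -> float:
    """Length at which Pe reaches target_pe, given Pe = ref_pe at ref_length_um.

    Pe is linear in L for fixed v and D, so lengths scale with the Pe ratio.
    """
    return ref_length_um * target_pe / ref_pe


ref_pe, ref_length_um = 2.8, 50.0          # values quoted by the reviewer
d_over_v_um = ref_length_um / ref_pe        # implied D/v, about 18 µm
assumed_threshold = 10.0                    # ASSUMPTION: stand-in for "Pe >> 1"

print(f"implied D/v        = {d_over_v_um:.1f} um")
print(f"length for Pe = 10 = {length_for_peclet(assumed_threshold, ref_pe, ref_length_um):.0f} um")
```

      Under these assumptions, diffusion and advection are comparable over ~50 µm, while length scales of a few hundred microns would be needed before diffusive mixing becomes clearly ineffective.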

      Line 404

      needs details of the GFP-EB1 and fluorescent tubulin used in this experiment.

      The detailed concentrations are described for each method in the subsequent sections. To avoid confusion, we removed the sentence in line 404, which omitted details.

      The tubulin depletion measurements detect a 4% reduction in tubulin concentration in the interior versus the exterior, and the same for eGFP-EB1 (Fig.4B). This observation provides important support for the depletion proposal. But the experiments apparently lack a control for potential reduction of fluorescence excitation intensity with depth in these deep specimens (equivalent to the inner filter effect in spectroscopy). Is there a component whose apparent concentration (fluorescence emission intensity) does not decrease by 4% in the interior of the aster?

      Indeed, fluorescence intensity measurements require special attention. Our samples are made by squashing 4 µl of extract under an 18 mm x 18 mm coverslip, and the resulting thickness is 10 microns, which we believe is too small a distance to result in an inner filter effect.
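
      The sample thickness can be checked from the stated volume and coverslip size. This is a minimal sketch that assumes the extract spreads uniformly to fill the whole coverslip area; the function name is illustrative.

```python
def squash_thickness_um(volume_ul: float, width_mm: float, height_mm: float) -> float:
    """Thickness (µm) of a liquid layer of volume_ul spread under a rectangular coverslip.

    Assumes uniform spreading over the full coverslip area; 1 µl equals 1 mm^3.
    """
    volume_mm3 = volume_ul
    area_mm2 = width_mm * height_mm
    return volume_mm3 / area_mm2 * 1000.0  # mm -> µm


print(f"{squash_thickness_um(4.0, 18.0, 18.0):.1f} um")
```

      The result, about 12 µm, is in line with the ~10 micron thickness stated above.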

      In response to Reviewer #2’s request for an example of a component whose fluorescence intensity is uniform, we provide the intensity profile of the inert 10kDa Dextran labeled with Alexa568. This serves as a control for the reviewer’s specific concern with our method. We will incorporate this as a supplementary figure in the revision.

      There is no direct discussion of the relative lifetime of MTs in the interior compared to the exterior of the aster. Catastrophe rates and growth rates are essentially invariant, I think this implies that MT lifetimes are essentially the same in the interior versus the exterior? Please confirm and estimate the lifetime. This could exclude a maturation process whereby one set of MAPs got replaced by another over time?

      Indeed, MT lifetime is a function of four rates: polymerization, depolymerization, catastrophe, and rescue. The figure below shows the MT lifetime as a function of depolymerization rate, assuming the other parameters are fixed at the values we reported previously (Ishihara et al. 2016). In regions with a fast depolymerization rate of 40 µm/min, the microtubule lifetime is 0.98 min. As the depolymerization rate decreases to 30 and 25 µm/min, the lifetime increases to 1.5 and 2.4 min, respectively. This implies that the microtubules at the aster periphery are longer lived than those in the interior.

      Association and dissociation rate constants have not been measured for most MAPs, but in general we expect them to be fast compared to the timescale of MT lifetime of ~1 minute. Most MAPs bind in the low micromolar or high nM regime, which implies dissociation rates of seconds or less. MAP4 and MAP7 were both shown to bind and dissociate rapidly in living cells (PMID: 16714020, PMID: 11719555).
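
      The dependence of lifetime on depolymerization rate can be sketched with the standard two-state dynamic instability model in the bounded regime, where the mean lifetime of a microtubule growing from zero length is (v_g + v_s) / (v_s f_cat - v_g f_res). The parameter values below (growth rate 30 µm/min, catastrophe rate 3.3 per min, rescue rate 2.0 per min) are assumptions chosen for illustration; this response does not list the values used, and the function name is hypothetical. With these assumed values the sketch lands close to the 0.98, 1.5 and 2.4 min quoted above.

```python
def mean_lifetime_min(v_shrink: float,
                      v_grow: float = 30.0,   # µm/min, ASSUMED illustrative value
                      f_cat: float = 3.3,     # 1/min,  ASSUMED illustrative value
                      f_res: float = 2.0) -> float:  # 1/min, ASSUMED illustrative value
    """Mean lifetime (min) of a microtubule in the bounded two-state model.

    Derivation: over one lifetime, grown length equals shrunk length
    (v_grow * t_g = v_shrink * t_s), and catastrophes outnumber rescues by one
    (f_cat * t_g = f_res * t_s + 1). Solving for t_g + t_s gives the formula.
    """
    drift = v_shrink * f_cat - v_grow * f_res
    if drift <= 0:
        raise ValueError("unbounded regime: mean lifetime is not finite")
    return (v_grow + v_shrink) / drift


for v_s in (40.0, 30.0, 25.0):  # depolymerization rates discussed above, µm/min
    print(f"v_shrink = {v_s:4.1f} um/min -> lifetime = {mean_lifetime_min(v_s):.2f} min")
```

      The formula also shows why lifetime is so sensitive to the depolymerization rate: the denominator approaches zero as the system nears the boundary between bounded and unbounded growth.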

      Reviewer #2 (Significance (Required)):

      This paper is significant as it is the first observation of spatial variation in MT shrinkage rates in an aster. It proposes the broad shape of an underlying mechanism (depletion of stabilising MAPs in the aster interior) and presents sound quantitative arguments, but the experiments do not directly test this mechanism. Aster formation in Xenopus egg extracts is widely used as a model system, and if indeed the spatial variation turns out to be due to spatial depletion of components then this will become a landmark paper. The paper may promote wider use of this method of automated analysis and encourage study of shrinkage rate mechanisms in other systems.

      REFEREES CROSS COMMENTING

      In my opinion the comments of reviewer #1 are fair and reasonable and overlap with and complement my own. In my opinion there is zero conflict requiring adjustment.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      SUMMARY

      This paper reports measurements of microtubule dynamics in interphase asters nucleated in Xenopus egg extracts. Dynamics are measured using two methods. First tracking of GFP tagged EB1 protein forming comets at the tips of growing microtubules, as used in other studies, which can only measure growth rates. Second using a recently developed automated tracking based on subtractive difference images of fluorescently labelled microtubules, which can measure both growth and shrinkage rates. The main and novel observation of this paper, using difference image tracking, is that the MT shrinkage rate is ~2 fold faster in the interior of the aster compared with the periphery of the aster, whilst rates of MT polymerisation and catastrophe vary only slightly, if at all. The authors speculate that this might be due to a reduced MAP concentration and occupancy in the aster interior. They also discuss the role of a depletion-dependent increased shrinkage rate as a feedback mechanism to maintain a low MT polymer density in the aster interior.

      MAJOR COMMENTS

      The movies are startling in their beauty and clarity and the key conclusion that the shrinkage rate is significantly faster in the interior compared to the periphery of the aster is convincing.

      The observation that the net MT plus end growth rate is ~10% faster at the periphery compared to the interior of the aster is only supported by the EB1 tip tracking method. The difference imaging method shows no significant difference in rates. The authors need to discuss this discrepancy between the established and new methods of analysis. It is insufficient to state that the growth rates obtained by the two methods are "consistent".

      The discussion proposing MAP depletion-dependent increased shrinkage rate as a feedback mechanism to limit MT polymer density is reasonable.

      The model and discussion of the role of MAPs might be criticised as highly speculative and unsupported by any experimental data. The authors do acknowledge this. Whether the ratio of data to speculative interpretation is appropriate will be an editorial decision for whichever journal ultimately hosts this.

      In particular, since the aster forms by growth from the nucleating bead, early in its formation the final interior MTs must have first formed the peripheral MTs and could therefore enter fresh media and bind MAPs. The authors show by calculation that as the aster expands, these MTs and MAPs become isolated from mixing with the external media. This isolation would then suggest that any MAPs released by dissociation or MT depolymerisation must remain in the interior, and are therefore available to rebind to newly formed MTs. So, it is unclear why the MAPs should be depleted in the interior compared to the periphery, unless expansion of the aster is slowed, in which case additional MAPs could diffuse into the stationary periphery from the surrounding media. The kinetics of MT growth, MAP binding and aster expansion would then also be expected to have an effect on the outcome beyond a simple "depletion" of the internal MAP concentration.

      It is also not clear how the authors' preferred model would account for the suggestion of bimodal shrinkage rates. It is not clear if this is a simplification (binning things into external and internal) applied for the purposes of discussion.

      MINOR COMMENTS

      Line 71 Authors reference Gardner et al 2011, when discussing depolymerisation as a zero order process, as showing a free tubulin dimer concentration effect on shrinkage rates. However, the results in Gardner refer to the off rate during MT polymerisation, and measurements of rapid small scale events during overall growth phases and would be applicable to GTP-heterodimers, whereas the extended shrinkage events measured in this paper would presumably apply to post-catastrophe GDP-heterodimer dissociation and may not be comparable. The reference should be omitted or a further explanation given.

      Line 89 States "density of plus ends is approximately homogenous within interphase asters" However, in results section it is stated Line 111 that "the plus end density is lower at the periphery compared to the aster center". Please clarify

      Line 135 The distances given for the interior and periphery appear to be mixed up.

      Line 277 "approximately consistent with our Peclet number estimate". 50µm gives a Pe value of 2.8. The Peclet number "significance" is earlier given in terms of "Pe>>1" (Line 255). Please clarify what range of experimental values is required for the argument to hold.

      Line 404 needs details of the GFP-EB1 and fluorescent tubulin used in this experiment.

      The tubulin depletion measurements detect a 4% reduction in tubulin concentration in the interior versus the exterior, and the same for eGFP-EB1 (Fig.4B). This observation provides important support for the depletion proposal. But the experiments apparently lack a control for potential reduction of fluorescence excitation intensity with depth in these deep specimens (equivalent to the inner filter effect in spectroscopy). Is there a component whose apparent concentration (fluorescence emission intensity) does not decrease by 4% in the interior of the aster?

      There is no direct discussion of the relative lifetime of MTs in the interior compared to the exterior of the aster. Catastrophe rates and growth rates are essentially invariant, I think this implies that MT lifetimes are essentially the same in the interior versus the exterior? Please confirm and estimate the lifetime. This could exclude a maturation process whereby one set of MAPs got replaced by another over time?

      Significance

      This paper is significant as it is the first observation of spatial variation in MT shrinkage rates in an aster. It proposes the broad shape of an underlying mechanism (depletion of stabilising MAPs in the aster interior) and presents sound quantitative arguments, but the experiments do not directly test this mechanism. Aster formation in Xenopus egg extracts is widely used as a model system, and if indeed the spatial variation turns out to be due to spatial depletion of components then this will become a landmark paper. The paper may promote wider use of this method of automated analysis and encourage study of shrinkage rate mechanisms in other systems.

      REFEREES CROSS COMMENTING

      In my opinion the comments of reviewer #1 are fair and reasonable and overlap with and complement my own. In my opinion there is zero conflict requiring adjustment.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Ishihara et al. investigate and compare microtubule polymerization/depolymerization dynamics inside vs. at the periphery of microtubule asters in a cell-free Xenopus egg extract system. By tracking EB comets, which localize to growing microtubule ends, they find that the microtubule growth rates and EB comet lifetimes (interpreted as an indicator of microtubule catastrophe rates) are similar between the two spatially-distinct microtubule populations. However, using a tubulin-intensity-difference image analysis, the authors are also able to measure local microtubule depolymerization rates, and they find a significant difference in depolymerization rates of the two populations. Specifically, the authors report that the microtubule depolymerization rates measured within asters are faster than those measured at the periphery.

      Specific comments:

      Figure 2. In the text, the authors report: "The depolymerization rate was 36.3 ± 7.9 μm/min (mean, std) in the aster interior, compared to 29.2 ± 8.9 μm/min (mean, std) at the aster periphery." This difference is certainly not two-fold (as stated in the abstract). It would also be useful to mark the mean rates on the graph in 2B.

      The bimodal shape of the depolymerization rate distributions in 2B is very interesting. This definitely warrants further investigation. At the minimum, the depolymerization rates should be determined at 50 um intervals, as done for other parameters in Figure 1. Could it be that there are two coexisting populations of microtubules at the same location? Or is there a clear spatial compartmentalization of the two that is not obvious here because the distance interval used for the measurements is too large? This is a very important distinction for the claims of the paper.
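
      The size of the difference can be read directly from the quoted means and standard deviations. This is a minimal sketch; the standardized difference assumes equal sample sizes in the two regions, which is not stated in the quoted text.

```python
import math

# Depolymerization rates quoted from the manuscript (mean, std), in µm/min.
interior_mean, interior_std = 36.3, 7.9
periphery_mean, periphery_std = 29.2, 8.9

fold_change = interior_mean / periphery_mean

# ASSUMPTION: equal sample sizes, so the pooled std is the root mean square of the two.
pooled_std = math.sqrt((interior_std**2 + periphery_std**2) / 2)
standardized_diff = (interior_mean - periphery_mean) / pooled_std

print(f"fold change             = {fold_change:.2f}")
print(f"percent faster          = {(fold_change - 1) * 100:.0f}%")
print(f"standardized difference = {standardized_diff:.2f}")
```

      On these numbers the interior rate is roughly a quarter faster than the peripheral rate, with a difference somewhat smaller than one pooled standard deviation.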

      The authors make a point here that the distribution of measured polymerization rates is fairly narrow. This appears to be in contrast with Figure 1B, where polymerization rates take on a wide range of values. How do the two distributions of polymerization rates obtained by these two methods compare?

      Figure 3. The laser ablation figure and movies are beautiful, but don't seem to add support to the story. Importantly, the authors do not confirm any spatial variability in depolymerization rate with these experiments. As a matter of fact, although the laser ablation experiments are only performed in the aster interior, the measured depolymerization rates appear to be just as consistent with the periphery rates in Figure 2 as they are with the interior rates in Figure 2. (They span quite a large range of values, with the average right in the middle between what was measured for the two areas in Figure 2.)

      Although the authors report they don't see any correlation between the distance and depolymerization rate, they should still plot the rate as a function of initial cut positions (Figures 3D, 3E).

      From the single decaying inward wave the authors conclude that microtubules depolymerize fully to their minus ends which are distributed throughout the aster. Can the possibility that depolymerization is stopped by microtubule lattice defects/islands be excluded by these observations?

      What are the effects of the local increase in tubulin concentration due to the subunit release by depolymerization? What about the release of other lattice-binding MAPs (stabilizers)?

      Figure 4. Is the local depletion of tubulin/EB1 thought to be only within the narrow annulus at ~100 um distance, or is it not measurable on the inside due to the polymer signal? Can the two be separated? Such a sharp transition within a discrete annular region doesn't speak to the relative effects on the inside vs. the outside of the aster?!

      More importantly, the local depletion of either tubulin or EB1 is not a good representation of a depletion of a MAP component that associates with the microtubule lattice. Both tubulin and EB1 bind preferably to microtubule ends, not lattice. Thus showing a profile of slight local tubulin and/or EB depletion does not seem to be relevant for the proposed model. Rather, overall microtubule polymer mass/density as a function of distance may be more relevant?

      Figure 5. The toy model is intuitive and clear, but not sufficient without any experimental investigation. An attempt to quantify the actual distributions of at least one or a few selected proposed MAPs is needed. Is the depletion strongest where microtubule density is highest? What is the ratio of a MAP intensity to microtubule polymer density as a function of distance? How does that relate to local depolymerization rates? What are other testable model predictions that can show support for the proposed mechanism?

      Also, the table is insufficiently described. Are any or all of these MAPs known to be specific regulators of microtubule depolymerization rates, but not other dynamics parameters?

      Minor comments:

      Figure 1. typo in the figure legend: "interior (distance>300 μm) vs. periphery (50 μm<distance<280 μm)" There appears to be a clear dip in EB1 density at 100 um (Figure 1C). What could be the cause of that?

      Figure 2. Note that the distances used in Figure 2 to define 'interior' and 'periphery' are completely different from those in Figure 1. (Interior in Figure 1 is defined to be between 50 and 280 um from the MTOC, and exterior larger than 300 um. However, in Figure 2, interior is defined as less than 100 um, and exterior as larger than 200 um.) Given that the asters are actively growing, it would be good to clearly explain how these intervals were defined in each case.

      In the periphery movie, there are several notable examples of apparent minus-end depolymerization and treadmilling. The authors state these are very rare - perhaps a quantification would be useful here?

      Significance

      The observation of distinct depolymerization rates within vs. at the periphery of microtubule asters is novel and interesting. However, the manuscript in its current form is rather preliminary. The observation can be significantly strengthened by additional experiments/analysis that would characterize the effect in more detail. Even more importantly, the authors propose a highly speculative (although compelling) mechanism, but make no attempt to test it in any way. This is a major deficiency of the current manuscript that should be addressed prior to publication.

      REFEREES CROSS COMMENTING

      I agree with Reviewer #2 that our comments are both overlapping and complementary. I also find Reviewer #2's comments fair and reasonable and see no need for further adjustments.

    1. Reviewer #3:

      This manuscript reports a series of unique experiments with a single human participant, using an electrode array implanted in the left posterior parietal cortex several years after high-level spinal cord injury. There is a small but increasing number of groups capable of performing this type of research in humans. Most of this work has been focused on the motor system, but studies like this one, characterizing the somatosensory system (touch, in particular), have been increasingly common in the past five years. However, this is the only group focusing on this higher-level, multimodal association area of the cortex.

      Most of the recorded neurons were activated bilaterally, which is consistent with earlier monkey work from this lab. Probably the most important component of the work is the analysis of the modest activation in this area that occurs simply when the participant imagines different places on her body being touched - even the insensate arm. This work is virtually impossible to do in monkeys. There are extensive and overlapping analyses of the relation between actual and imagined activation, and the activation arising from inputs (or imagined inputs) from the two sides of the body. Eliminating a number of these and clarifying the remainder may improve the impact.

      1) 63: "in a tetraplegic human subject recorded with an electrode array implanted in the left PPC". I am curious why the array was placed in the left PPC, given the clinical evidence for the greater role of the right side in the formation of internal, multi-modal maps. Some comments would be useful.

      2) Fig 1: It would be good to show a panel of representative spikes, perhaps with their single-trial raster responses. This could be in a new figure that includes panel 1D, which is presented in a bit of an odd order as it now stands, coming in the midst of higher-level analyses. Indicate how many trials went into the averages in 1D.

      3) 146: "we computed a cross-validated coefficient of determination (R^2 within) to measure the strength of neuronal selectivity for each body side." Even after reading the methods (further comments below) it is difficult to figure out what all these related measures reveal. At this point in the text it is very difficult to intuit how R^2 would measure selectivity.

      4) Fig 4: Several panels would be more effective if plotted as a function of distance rather than a category. 4E: This panel is borderline too small. 4F: Definitely too small; enlarge, perhaps with fewer examples. The curves drawn on the panels do not appear to be Gaussian, but neither are they just connected points. Show whatever it was you actually used. The Gaussian assumption does not appear to be very good for the edge cases (first two, last two), which is not terribly surprising.

      5) What is added by including both classification and Mahalanobis distance?

      6) 354: "information coding evolves for a single unit. Two complementary analyses were then performed." In what sense are they complementary? What is added (besides complexity) by including both cluster analysis and PCA?

      7) Fig 8C: Despite my best efforts, I have no idea what this is showing.

      8) 753: "Classification was performed using linear discriminant analysis with the following assumptions:"

      "One, the prior probability across tested task epochs was uniform;" It is not clear what prior probability this refers to. Just stimulus site?

      "Two, the conditional probability distribution of each unit on any epoch was normal;" Is this a reference to firing rate probability conditioned on stimulus site?

      "Three, only the mean firing rates differ for unit activity during each epoch (covariance of the normal distributions are the same for each);"

      "Four, firing rates for each input are independent (covariance of the normal distribution is diagonal)."

      Does this refer to independent firing rates of neurons across stimulus sites? This seems very unlikely, given everything we know about dimensionality of cortex. Perhaps it refers to something else. Cannot all of these assumptions be tested? Were they?
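
      For readers trying to picture what the four quoted assumptions amount to: taken together, they describe a classifier with uniform priors and a single shared, diagonal covariance. The sketch below is one plain-Python reading of that description and is not the authors' code; the function names and the data layout (a list of trials, each a list of per-unit firing rates) are assumptions.

```python
def fit_diagonal_lda(X, y):
    """Fit class means and one pooled variance per feature (shared, diagonal covariance).

    X is a list of trials (each a list of per-unit firing rates); y holds the class labels.
    Requires more trials than classes so the pooled variance is defined.
    """
    classes = sorted(set(y))
    n_features = len(X[0])
    means = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        means[c] = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
    # Pooled within-class variance per unit: same for every class, no cross-unit terms.
    pooled = [0.0] * n_features
    for x, label in zip(X, y):
        for j in range(n_features):
            pooled[j] += (x[j] - means[label][j]) ** 2
    dof = len(X) - len(classes)
    pooled = [max(v / dof, 1e-12) for v in pooled]
    return means, pooled


def predict_diagonal_lda(model, x):
    """With uniform priors, pick the class whose mean is nearest in variance-scaled distance."""
    means, pooled = model

    def scaled_distance(c):
        return sum((x[j] - means[c][j]) ** 2 / pooled[j] for j in range(len(x)))

    return min(means, key=scaled_distance)
```

      Written this way, each assumption can be checked directly, for example by comparing per-class variances for unit j or by inspecting the off-diagonal terms of the within-class covariance.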

      9) 768: "we computed the cross-validated coefficient of determination (R2 within) to measure how well a neuron's firing rate could be explained by the responses to the sensory fields." This needs a better description, and I may be missing the point entirely. I assume it is an analysis of mean firing rate (which should be stated explicitly) and that it uses something like the indicator variable of the linear analysis of individual neuron tuning above. In this case is this a logistic regression? As it is computed for each side independently, it would appear that there are only four bits to describe the firing of any given neuron. This would seem to be a pretty impoverished statistic, even if the statistical model is accurate.
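
      If the model really is a regression on indicator variables, then the least-squares prediction for a held-out trial is simply the training-set mean rate for that trial's stimulation site. The sketch below computes a leave-one-trial-out R^2 under that assumption. It is a guess at the kind of quantity meant, not the authors' implementation; the function name is hypothetical, and taking the training-set grand mean as the reference for the total sum of squares is one convention among several.

```python
def cross_validated_r2(rates, sites):
    """Leave-one-trial-out R^2 for a one-hot (indicator) model of stimulation site.

    rates[i] is the mean firing rate on trial i; sites[i] is the site touched on that trial.
    Undefined (division by zero) if the rates never deviate from the training grand mean.
    """
    n = len(rates)
    ss_res = 0.0
    ss_tot = 0.0
    for i in range(n):
        train = [(r, s) for k, (r, s) in enumerate(zip(rates, sites)) if k != i]
        grand_mean = sum(r for r, _ in train) / len(train)
        same_site = [r for r, s in train if s == sites[i]]
        # Indicator regressors reduce to the per-site mean of the training trials.
        prediction = sum(same_site) / len(same_site) if same_site else grand_mean
        ss_res += (rates[i] - prediction) ** 2
        ss_tot += (rates[i] - grand_mean) ** 2
    return 1.0 - ss_res / ss_tot
```

      A value near 1 means the site labels explain most trial-to-trial variance, and a value at or below 0 means they predict no better than the grand mean. Neither outcome says whether the tuning is narrow or broad.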

      10) 786: "The purpose of computing a specificity index was to quantify the degree to which a neuron was tuned to represent information pertaining to one side of the body over the other." This is all pretty hard to follow. The R2 metric itself is a bit mysterious, as noted above. Within and across R2 is fairly straightforward, but adds to the complexity, as does SI, which makes comparisons of three different combinations of these measures across sides. Aside from R2 itself, the math is pretty transparent. However, a better high-level description of what insight all the different combinations provide would help to justify using them all. As is, there is no discussion and virtually no description of the difference across these three scatter plots. The critical point apparently, is that, "nearly all recorded PC-IP neurons demonstrate bilateral coding". There should be a much more direct way to make this point.

      11) Computing response latency via RF discrimination is rather indirect and assumes that there is significant classification in the first place. I suspect it will add at least some delay beyond more typical tests. Why not a far simpler and more direct test of means in the same sliding window? Alternatively, a change point analysis?
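
      To make the simpler alternative concrete, here is a minimal sketch of a sliding-window test of means: each window's per-trial rate is compared with the same trial's baseline rate using a paired t statistic, and latency is the start of the first run of consecutive windows that exceed a critical value. The function names, the data layout, and the two-consecutive-window rule are assumptions for illustration. The critical value is supplied by the caller, and nothing here corrects for testing many windows.

```python
import math


def paired_t(a, b):
    """Paired t statistic for per-trial values a versus b."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)
    if var == 0.0:
        return 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)


def onset_latency(window_rates, baseline_rates, window_starts_ms, t_crit, n_consecutive=2):
    """Start time of the first run of windows whose rate differs from baseline.

    window_rates[w][trial] is the firing rate in sliding window w;
    baseline_rates[trial] is the pre-stimulus rate on the same trial.
    Returns None if no run of n_consecutive significant windows is found.
    """
    run = 0
    for w, rates in enumerate(window_rates):
        run = run + 1 if abs(paired_t(rates, baseline_rates)) > t_crit else 0
        if run == n_consecutive:
            return window_starts_ms[w - n_consecutive + 1]
    return None
```

      With five trials, for instance, the two-sided 5% critical value for 4 degrees of freedom is 2.776.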

    2. Reviewer #2:

      General assessment:

      The study by Chivukula et al. explored a unique (n=1) dataset of multi-unit neuron recordings collected in the postcentral-intraparietal area (PC-IP) of a tetraplegic human subject taking part in a brain machine interface clinical trial. The recordings were collected across a set of tasks designed to investigate neuronal responses to both experienced and imagined touch.

      Overall I found the manuscript to be well-written, the study to be interesting, and the analysis reasonable. I do, however, think the manuscript would benefit from addressing two main, and a number of minor, issues.

      Major comments:

      1) The methods would benefit from additional rationale / supporting references throughout. Whereas it is generally clear what was done, it is sometimes less clear why certain choices were made. Perhaps some of the choices are "standard practice" when working with single unit recordings, but I was left in want of a bit more reasoning (or at least direction to relevant literature). Some examples are below:

      For the population correlation (line 723): why was the correlation computed 250 times or why were the two distributions shuffled together 2000 times?
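
      One reason the shuffle count matters is that it sets the smallest p-value a permutation test can report. The sketch below is a generic two-sample permutation test on a difference of means, not the procedure used in the manuscript; the function name, the test statistic, and the fixed seed are assumptions.

```python
import random


def permutation_p_value(a, b, n_shuffles=2000, seed=0):
    """Two-sided permutation p-value for the difference in means between a and b."""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled_a, shuffled_b = pooled[:len(a)], pooled[len(a):]
        if abs(mean(shuffled_a) - mean(shuffled_b)) >= observed:
            hits += 1
    # The +1 counts the observed labelling itself, so p can never be exactly zero.
    return (hits + 1) / (n_shuffles + 1)
```

      With 2000 shuffles the smallest reportable value is 1/2001, roughly 0.0005, so the count effectively determines the resolution of the test.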

      For the decode analysis (line 752): consider providing a reference for those interested in better understanding the "peeking" effects mentioned.

      Response latency (line 798): how were the window parameters determined (for both visualization and the latency calculation), and what was the rationale for them being different - especially given that the data used for the response latency calculation was still visualized (at least in part)? Relatedly, I'd be curious to see the entire time-course for that data rather than just the shaded region of the "visualization" data. Also, it would be nice if a comment (or some data) could be provided regarding how much the latency estimates change based on these parameter choices.

      Temporal dynamics of population activity (line 830): why use a 500 ms window, stepped at 100 ms intervals instead of something else?

      Temporal dynamics of single unit activity (line 887): it is stated that the neurons were restricted to those whose 90th percentile accuracy was at least 50% to ensure only neurons with some degree of significant selectivity were used for the cluster analysis. But why these particular values? Are the results sensitive to this choice? In this section, I'd also suggest providing references for those interested in better understanding the use of Bayesian information criteria. Similarly, it is stated that PCA is a "standard method for describing the behavior of neural populations" - as such it would be nice to provide some relevant references for the reader.

      2) The manuscript would benefit from additional context in the intro as well as a more thorough discussion - particularly with respect to the imagination aspect of the experiment.

      Intro: The second paragraph did well in establishing why one might be interested in examining somatosensory processing in the PPC. It was, however, less clear why the particular questions at the end of the paragraph were being posed. Perhaps an extra paragraph could be added to bridge the notion that a sizeable body of literature has been developed around somatosensory representation within the PPC and the several "fundamental" questions remaining that are of interest here.

      Discussion: The manuscript would benefit from a more thorough discussion of "imagination per se" and the various top-down processes that might be involved - as well as better positioning with respect to previous studies investigating top-down modulation of the somatosensory system. The authors state that the cognitive engagement during the tactile imagery may reflect semantic processing, sensory anticipation, and imagined touch per se - which I would not argue. But I would also expect some explicit mention of processes like attention and prediction. Perhaps these are intended to be captured by "sensory anticipation" - but, for example, attention can be deployed even if no sensation is anticipated. Importantly, it seems that imagining a sensation at a particular body site might well involve attending to that body part. That is, one may first attend to a body part before "imagining" a sensation there - and then even continue to attend there the entire time the imagining is being done. Because of this, perhaps the authors are considering attention to be a part of "imagination per se". But since attention has been shown to modulate somatosensory cortex without imagination, how can one exclude the possibility that the neuronal activity measured here simply reflects this attention component? Regardless, I think the discussion would benefit from a more explicit treatment of these top-down processes - especially given the number of previous studies showing that they are able to modulate activity throughout the somatosensory system. Some literature that may be of interest include:

      Roland P (1981) Somatotopical tuning of postcentral gyrus during focal attention in man. A regional cerebral blood flow study. Journal of Neurophysiology 46 (4):744-754

      Johansen-Berg H, Christensen V, Woolrich M, Matthews PM (2000) Attention to touch modulates activity in both primary and secondary somatosensory areas. Neuroreport 11 (6):1237-1241

      Hamalainen H, Hiltunen J, Titievskaja I (2000) fMRI activations of SI and SII cortices during tactile stimulation depend on attention. Neuroreport 11 (8):1673-1676. doi:10.1097/00001756-200006050-00016

      Puckett AM, Bollmann S, Barth M, Cunnington R (2017) Measuring the effects of attention to individual fingertips in somatosensory cortex using ultra-high field (7T) fMRI. Neuroimage 161:179-187. doi:10.1016/j.neuroimage.2017.08.014

      Yu Y, Huber L, Yang J, Jangraw DC, Handwerker DA, Molfese PJ, Chen G, Ejima Y, Wu J, Bandettini PA (2019) Layer-specific activation of sensory input and predictive feedback in the human primary somatosensory cortex. Sci Adv 5 (5):eaav9053. doi:10.1126/sciadv.aav9053

    3. Reviewer #1:

      In this study Chivukula, Zhang, Aflalo et al. report on an extensive set of neural recordings from human PPC. It is found that many neurons are responsive to touch in specific locations. Interestingly, a considerable fraction of the neurons displayed symmetric bilateral receptive fields. Furthermore, these neurons also became active during imagined touches. The study paves the way for a deeper understanding of the role of the human PPC.

      The paper presents a wealth of analysis on an extensive set of recordings. It is generally well written and the analyses are well thought out. My main concerns are regarding missing information and unclear descriptions of some of the analyses undertaken, which are detailed below.

      1) At the start of the results section it is stated that the recordings were from "well-isolate and multi-unit neurons". This seems to contradict the Methods section, which only talks about "sorted" neurons. This needs to be clarified, and if multi-units were included, it should be stated which sections this concerns as it will have implications for the results (e.g. for selectivity for different body parts). In any case, the number of neurons included in different analyses should be evident. There are some numbers in the Methods and sprinkled throughout the Results section, but for some of the analyses (e.g. clustering analysis, which was run only on a responsive subset of neurons) no numbers are provided.

      2) The linear analysis section needs further details. The coefficients are matched to "conditions" but it is not explained how. I am assuming that each touch location is assigned to a condition c, however the way the model is described suggests that the vector X can in principle have multiple conditions active at the same time. Additionally, could the authors confirm whether it is the significance of the coefficients that determined whether a neuron was classed as responsive as shown in Figure 1? This analysis states a p-value but gives no further information on which test was run and on what data.

      3) Figure 1 C could be converted into a matrix that lists all combinations of RF numbers on either side of the body to highlight whether larger RFs on one side of the body generally imply larger RFs on the other side.

      4) I am confused about the interpretation of the coefficient of determination as shown in Figure 2A. In the text this is described as testing the "selectivity" of the neurons. To clarify, I am assuming that the "regression analysis" is referring to the linear model described in a previous section. The authors then presumably took the coefficients from this model for a single side only and tested how well they could predict the responses to the opposite side, as assessed by R^2 (Fig 2C,E). Before that in Fig 2A, they tested how well each single-side model could predict the responses. This is all fine, but the "within" comparison then simply tests how well a linear model can explain the observed responses, and has nothing to do with the selectivity of the neuron. For example, the neuron might be narrowly or broadly selective, but the model might fit equally well.

      5) Regarding the timing analysis, it is not clear to me how the accuracy can top out at 100% as shown in the figure, when the control conditions were included. Additionally, the authors should state the p value and statistic for the comparison of latencies.

      6) Spatial analysis. Could the authors provide the size of the paintbrush tip that was used in this analysis? Furthermore, as stimulation sites were 2 cm apart, it is not appropriate to specify receptive fields down to millimeter precision.

      7) Imagery: how many neurons were responsive to both imagery and real touch? Were all neurons that were responsive to imagery also responsive to actual touch? This is left vague and Figure 5 only includes the percentages per condition, but gives no indication of how many neurons responded to several conditions. Whether and how many neurons were responsive to both conditions also determines the ceiling for the correlation analysis in Figure 5D (e.g. if most neurons are responsive only to actual but not imaginary touch, this will limit the population correlation).

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript. Tamar R Makin (University College London) served as the Reviewing Editor.

      Summary:

      Chivukula and colleagues report an extensive set of multi-unit neural recordings from PPC of a tetraplegic patient taking part in a brain machine interface clinical trial. The recordings were collected across a set of tasks, designed to investigate neuronal responses to both experienced and imagined touch. It was found that many neurons are responsive to touch in specific locations. Most of the recorded neurons were activated bilaterally, which is consistent with earlier monkey work from this lab. Probably the most important component of the work is the analysis of the modest activation in this area that occurs simply when the participant imagines different places on her body being touched - even the insensate arm. This work is virtually impossible to do in animals, and as such offers a unique opportunity to describe neural properties for higher-level representation of touch. The study therefore paves the way for a deeper understanding of the role of the human PPC in the cognitive processing of somatosensation.

      Overall, we found the manuscript to be well-written, the study to be interesting, and for the most part the analyses are well thought out. But at the same time, the reviewers raised multiple main concerns regarding missing information and unclear descriptions of some of the analyses undertaken, which are detailed below over many major and minor comments. In addition, it was felt that there was unnecessary overlap across analyses - the first part especially contains a number of analyses that seem to make very similar points repeatedly or where it is not entirely clear what the point is in the first place. As such, there is a need to identify and cut a lot of the duplicative analyses/results and explain both the essential methods and the interpretation of the remaining results more succinctly and clearly. The key analyses could then be streamlined and better justified, ideally with an eye towards a consistent approach in both parts of the paper. There are also some major considerations regarding the contextualisation and interpretation of the key imagery results, as detailed in the first major comment below.

    1. Reviewer #3:

      The manuscript by Schonhaut et al. presents novel analysis on an impressive dataset of more than 1200 neurons across diverse brain areas in the human brain to investigate their modulation by hippocampal theta oscillations. They found a substantive proportion of cells phase-locked to hippocampal activity, mainly in the theta frequency band, in several areas known to be functionally related to the hippocampus, some of them receiving monosynaptic hippocampal inputs but others only indirect ones. These results extend previous reports in humans showing hippocampal interactions with these structures but at the level of mesoscopic activity and highlight the ubiquity of spike-theta timing and the importance of single-unit studies in humans. Additional analysis, detailed below, will contribute to give a better description of the data, provide stronger support for some of the authors' claims and clarify some issues.

      1) I assume that the dataset also includes hippocampal units, so why exclude them from the analysis? Although the main novelty is in the coupling of cells in other structures with hippocampal LFPs, it would be useful to also compare it with the coupling of local hippocampal cells.

      2) Include average power spectrum of hippocampal LFPs. Additional examples of raw LFP traces overlaid to spectrograms (perhaps in Supplementary) will help to illustrate the nature of hippocampal oscillations.

      3) The authors compared fractions of significantly modulated units and their preferred frequencies across regions. While very informative, these analyses are not sufficient to capture the richness of spike-LFP interactions likely existing in the dataset. Were there differences in the strength of phase-locking across regions? (this analysis could be added to Figure 2). Studies in rodents have shown that theta phase-locked units in different structures have characteristic preferred firing phases (when hippocampal LFP is used as a reference). Authors can easily look if this is also the case in their data. They should include both pooled data statistics of mean phases across regions and single neuron examples of firing probability by LFP phase (such examples could be added to the single unit plots in Figure 1).
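
      For concreteness, a unit's preferred phase and locking strength can be summarized from the LFP phase at each spike using the circular mean and the mean resultant length. The sketch below is generic circular statistics rather than the authors' pipeline; the function name is hypothetical, and the p-value uses the first-order large-sample approximation to the Rayleigh test, exp(-n R^2).

```python
import cmath
import math


def preferred_phase(phases):
    """Return (circular mean in radians, mean resultant length, approximate Rayleigh p).

    phases holds the LFP phase (radians) sampled at each spike time.
    """
    n = len(phases)
    resultant = sum(cmath.exp(1j * p) for p in phases) / n
    mrl = abs(resultant)
    # First-order approximation to the Rayleigh p-value; intended for large n.
    p_value = math.exp(-n * mrl ** 2)
    return cmath.phase(resultant), mrl, p_value
```

      Pooling the per-unit circular means by region would then give the across-region comparison of preferred phases suggested here.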

      4) Did phase-locked and non phase-locked units have different properties? The authors can compare if they differ in basic properties such as mean firing rate, waveform width, inter-spike intervals, burstiness, etc., as it has been reported in other studies in non-human primates and rodents. These analyses could be extended to show if units with different properties also differ in their preferred phase-locking frequency, or phase. It would be very interesting if these analyses reveal the existence of heterogeneous cellular populations with different relation to hippocampal theta, even if the single-unit isolation quality is limited due to the low density recordings. In relation to this, authors should also plot unit auto-correlograms. ACGs can be computed for all the spikes, but also only for the strongly phase-locked spikes, to show if, at least during periods of strong oscillatory activity, some units show rhythmicity.

      5) To better interpret the results in Figure 4, it would be important to know whether the recording sites in both hippocampi were from the same sub-region and a similar location along the longitudinal hippocampal axis in each subject, and what the degree of synchrony was between the LFPs in the two hemispheres. Coherence or phase-locking between LFPs across hemispheres should be computed, and the power spectrum of each should also be shown.

      6) In Figure 4C-D it seems that phase-locking strength across hemispheres was not correlated but preferred frequency was. This should be quantified and mentioned in text before moving to the correlation in Figure 4E.

      7) The analysis in Figure 5D should be complemented by also checking the LFP-LFP phase-locking between the local region and the hippocampus. Did periods of high LFP power correlation also reflect enhanced phase-phase coupling? Were the structures also more phase-synchronous during periods of stronger spike-LFP coupling? These analyses could provide a more direct support for the interpretation of the authors in line with the CTC hypothesis.

      8) Was there any relation of the "strongly phase-locked" periods with global variables reflecting brain state (e.g. drowsiness versus attention to the task, etc.) or with the firing dynamics of the units (instantaneous firing rate or inter-spike intervals)?

    2. Reviewer #2:

      In this study, Schonhaut et al. describe the phase locking statistics of cortical and subcortical neurons with respect to hippocampal local field potential (LFP) recorded in 18 epilepsy patients undergoing seizure monitoring. Nearly 30% of extrahippocampal neurons showed phase locking to some bandpassed hippocampal signal. Amygdalar and entorhinal neurons were more likely to be phase locked, as compared to neurons recorded in other neocortical sites. Most neurons showed the strongest phase locking to hippocampal theta (2-8 Hz), though neocortical and amygdalar neurons tended to phase lock to lower theta bands. Spikes that were phase locked to hippocampal rhythms occurred during local LFP-states that showed moderate correlations with the spectral patterns observed in the hippocampus. These data are interpreted within the broader "communication through coherence" hypothesis.

      Large N, multi-region, single unit studies from humans are rare and the kind of mesoscopic descriptive analyses provided here serve as an important bridge between the large rodent literature on hippocampal physiology and human physiology and cognition. That said, there are some weaknesses in the analyses that could be addressed. Also, a deeper discussion of the biological origin of human theta is merited in the discussion to address alternate explanations - beyond communication through coherence - of the data.

      A similar statistical mistake was made several times. The authors' logic goes like this: find the argmax in one sample, take the argument that generated that max, and use that to sample in another condition, and report that the max is higher in the first condition than the second. For example, on pg. 6 "This is difficult to reconcile with our results, in which 248/362 neurons (68.5%) phase-locked more strongly to hippocampal LFPs than to locally-recorded LFPs at their preferred hippocampal phase-locking frequency." The same flaw can be seen in Figure 5, where the spikes are sub-sampled to occur during strong phase locking in one condition, thus almost guaranteeing high power in the frequency bands that generated that strong phase locking (which was observed). This is a case in which cross-validating the data may be useful. The authors could take a subset of the hippocampal data to define the preferred frequency, and then test phase locking on the held out data from the hippocampus and cortex.
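
      The cross-validation suggested above could look roughly like the following: the preferred frequency is chosen from one half of the spikes using the hippocampal signal only, and both signals are then scored at that frequency on the other half. This is a minimal sketch, not the authors' analysis; the function names and the dictionary layout (frequency mapped to a list of per-spike phases) are assumptions, and an interleaved even/odd split is used purely for brevity.

```python
import cmath


def mean_resultant_length(phases):
    """Length of the mean unit vector of the phases (0 = uniform, 1 = perfectly locked)."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def held_out_comparison(hpc_phases, local_phases):
    """Compare hippocampal and local phase locking at a frequency chosen on separate data.

    hpc_phases[f][k] and local_phases[f][k] give the LFP phase at spike k for frequency f.
    Even-numbered spikes select the preferred hippocampal frequency;
    odd-numbered spikes are used to score both signals at that frequency.
    """
    train = {f: ph[0::2] for f, ph in hpc_phases.items()}
    preferred = max(train, key=lambda f: mean_resultant_length(train[f]))
    hpc_test = mean_resultant_length(hpc_phases[preferred][1::2])
    local_test = mean_resultant_length(local_phases[preferred][1::2])
    return preferred, hpc_test, local_test
```

      Because neighbouring spikes are not independent, splitting by contiguous blocks of time would probably be a safer choice than the interleaved split in real data.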

      The relationship between power and phase locking is not fully controlled in this paper. The phase seems to be calculated irrespective of whether there is any instantaneous power at that frequency band, introducing noise. This will bias away from finding significant phase locking to frequency bands that occur transiently. Therefore, I recommend defining some threshold of the existence of the spectral signal prior to using that signal to calculate phase.
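
      A minimal sketch of such a threshold, assuming the band-limited phase and power at each spike time have already been extracted: spikes are kept only when the power at that moment is at or above a chosen percentile of the session-wide power distribution. The function names and the 75th-percentile default are illustrative assumptions, not values from the manuscript.

```python
import cmath
import math


def percentile(values, q):
    """Nearest-rank percentile (q between 0 and 100) of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def locking_above_power(spike_phases, spike_power, all_power, q=75):
    """Mean resultant length using only spikes fired while band power was high.

    spike_phases[k] and spike_power[k] are the LFP phase and power at spike k;
    all_power is the power time series that defines the threshold.
    Returns (mean resultant length or None, number of spikes kept).
    """
    threshold = percentile(all_power, q)
    kept = [p for p, w in zip(spike_phases, spike_power) if w >= threshold]
    if not kept:
        return None, 0
    mrl = abs(sum(cmath.exp(1j * p) for p in kept)) / len(kept)
    return mrl, len(kept)
```

      Returning the number of retained spikes alongside the locking value matters, because discarding spikes changes the sample size on which the estimate is based.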

      A related point has to do with the nature of the theta rhythm in the human. There has been considerable controversy over the years as to whether this is a comparable signal to that studied in the rodent. Based on the citations in this manuscript, and the nomenclature of the spectral band, the authors seek to make explicit the commonality of the underlying physiology, or function. Rodent theta is a sustained rhythm, while primate theta seems to come in bouts, perhaps even related to sampling statistics, such as saccades, leading to the suggestion that the apparent theta may be better thought of as semi-rhythmic evoked responses. How long were the bouts of high theta power? Was eye movement tracked? If so it would be important to relate the signal to eye movements. If the low frequency signal is phase locked to eye movement and potentially reflects semi-rhythmic information arriving to (from?) the hippocampus, then a stronger case could be made that hidden "third parties" synchronize the apparent communication through coherence observed here, and in fact there may be no communication at all.

      The authors dedicate much of their discussion to relating the current result to the communication through coherence analyses. Oddly, LFP coherence was never addressed. A strong prediction of the current framing would be that: when coherence is high, phase locking should be high, and higher than other moments when power in either region is high but coherence is not observed. The authors should directly measure how phase locking is modulated by coherence.

      The authors also lump together biological entities that should have different phase locking behaviors. The amygdala is not a monolithic region, does phase locking differ by nucleus? Also, do fast spiking inhibitory cells differ from excitatory cells? The authors should relate their phase locking measure to mean firing rate to show that it is insensitive to lower level cell statistics. This is important since the conclusions of the study would be quite different if neurons in the entorhinal cortex had high rates which artifactually drove up phase locking values.
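
      The concern about firing rate can be illustrated with a small simulation. If phase locking is measured as a mean resultant length (an assumption here, since other measures behave differently), then spikes drawn at completely random phases still give a non-zero value that depends on spike count, approximately sqrt(pi / (4n)) for large n. The function names, repeat count, and seed below are arbitrary.

```python
import cmath
import math
import random


def mean_resultant_length(phases):
    """Length of the mean unit vector of the phases (0 = uniform, 1 = perfectly locked)."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def null_mrl(n_spikes, n_repeats=200, seed=0):
    """Average mean resultant length for n_spikes phases drawn uniformly at random."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_repeats):
        phases = [rng.uniform(-math.pi, math.pi) for _ in range(n_spikes)]
        total += mean_resultant_length(phases)
    return total / n_repeats
```

      Comparing null_mrl(20) with null_mrl(500) shows how strongly spike count alone moves the baseline, which is why matching spike counts across units, or using a count-corrected measure, would help rule out this confound.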

    3. Reviewer #1:

      Hippocampal theta oscillations are among the most prominent rhythms in the mammalian brain. Extensive research in rodents has shown that neurons not only within the hippocampus but in widespread cortical areas can be phase-locked to hippocampal theta. Such cross-regional communication within theta frames has been postulated to be the foundation of many hippocampal operations. While previous studies in humans have documented the relationship between LFP theta and spiking in the hippocampus, coupling between hippocampal LFP and more remote cortical areas has not been demonstrated in human subjects. This is the topic of the present work. The authors show that spikes of single (and mostly multi-unit) neurons in multiple cortical regions both in the same and opposite hemispheres are phase locked to transient occurrence of hippocampal theta LFP in the 2-6 Hz range. However, phase-locking is stronger in structures known to be part of the 'limbic system', such as the amygdala and entorhinal cortex. Theta phase locking was stronger to hippocampal than to local LFP and the magnitude of spike phase locking increased when the power of theta increased, associated with increased high frequency power. The results are straightforward and the analysis methods are reliable. The novel information is limited but informative and documents a missing aspect of theta communication in the human brain.

      Comments:

      1) Given the simple message, the text is a bit long with many repetitions and loose ends. This applies to both Introduction and Discussion. Potential implications for learning, etc. are interesting but the findings do not provide additional clues, thus those aspects of the discussion are mainly distractions. Instead, perhaps the authors would like to discuss potential mechanisms of remote unit entrainment. They are talking about multi-synaptic pathways but these are unlikely to be a valid conduit. Instead, the septum, entorhinal cortex or retrosplenial cortex, with their widespread projections, may be responsible for coordinating both hippocampal and neocortical areas.

      2) Arguably, the weakest part of the manuscript is the lack of hippocampal neurons. The authors refer to their own previous papers, but in a story which compares hippocampal theta oscillations with remote unit activity, it is strange that the magnitude of theta phase-locking to local hippocampal neurons is not available for comparison.

      3) How was the hippocampal LFP reference site chosen and did it vary substantially from subject to subject? Anterior or posterior locations?

      4) The authors list 1233 single neurons but in the discussion they make it clear that most of them were multiple neurons. This should be emphasized up front and may be used as an excuse why the authors did not attempt to separate pyramidal cells from interneurons (interneurons have a much higher propensity to be entrained by projected rhythms).

      5) Given that units were mixed, a logical extension would be to examine how hippocampal theta phase modulates high gamma in neocortical areas. This could potentially yield a much larger data base, targeting the same question.

      6) In the Discussion, the authors suggest that cross-regional theta phase coupling could be related to learning and other cognitive performance. However, spike-LFP coupling and coherence are confounded by LFP power increase, and the authors cite Herweg et al., 2020, which did not find a relationship between theta power and memory performance. Is it then not logical to assume that cross-regional coupling may also not be related to memory?

      7) Line 36. "Long-term potentiation and long-term depression in the rodent hippocampus are also theta phase-dependent (Hyman et al., 2003)." Pavlides et al. (Brain Res 1988) or Huerta and Lisman (Neuron 1995) are perhaps more relevant references here.

      8) Line 82: "significant neocortical and contralateral phase-locking suggests". This is a strange phrase. Perhaps "significant phase locking of neurons in the neocortex in both hemispheres" or similar would be a better formulation.
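
      Comment 6 above turns on how spike-LFP phase locking is quantified. As a minimal sketch, and not the authors' analysis pipeline, the snippet below computes the mean resultant length of spike phases, a common phase-locking measure, together with a large-sample Rayleigh approximation. The function names and the simulated data are hypothetical: spike phases are drawn from a von Mises distribution, and Gaussian jitter stands in for the less reliable phase estimates one would expect when LFP power is low.

```python
import cmath
import math
import random


def mean_resultant_length(phases):
    """Return the mean resultant length of spike phases given in radians.

    0 means spikes are spread evenly over the cycle; 1 means every spike
    falls at the same phase.
    """
    if not phases:
        raise ValueError("need at least one spike phase")
    resultant = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(resultant)


def rayleigh_p_value(phases):
    """Large-sample approximation to the Rayleigh test, p ~ exp(-n * R^2)."""
    n = len(phases)
    return math.exp(-n * mean_resultant_length(phases) ** 2)


def simulate_phases(n_spikes, kappa, phase_noise_sd, rng):
    """Hypothetical generator: von Mises locked spikes plus Gaussian phase jitter."""
    return [
        rng.vonmisesvariate(math.pi, kappa) + rng.gauss(0.0, phase_noise_sd)
        for _ in range(n_spikes)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Same underlying locking (kappa), different phase-estimation noise.
    clean = simulate_phases(2000, kappa=2.0, phase_noise_sd=0.0, rng=rng)
    noisy = simulate_phases(2000, kappa=2.0, phase_noise_sd=1.0, rng=rng)
    print("clean phase estimate:", round(mean_resultant_length(clean), 3))
    print("noisy phase estimate:", round(mean_resultant_length(noisy), 3))
```

      In this toy setting the measured locking drops when only the phase estimate becomes noisier, which is one way a power change could alter apparent coupling without any change in the underlying spike timing.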

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      This is a very intriguing paper showing how hippocampal local field potentials couple with the activity of other cortical regions. This mechanism has been and continues to be extensively studied in other mammals, and thus its existence and relevance in humans is exciting.

    1. Reviewer #2:

      This human psychophysics study claims to provide more evidence in support of the popular notion that visual processing of faces may involve partially independent processes for the analysis of static information such as facial shape versus dynamic information such as facial expression. In this respect the scientific hypotheses and conclusions are not novel, although some of the methods (parametric variation of facial expression dynamics using computer-generated animations) and analyses (Bayesian generative modeling of expression dynamics) are relatively new. Although the science is rigorously conducted, the paper currently feels heavy on statistics and technical details but light on data, compelling results and clear interpretation. However, the main problem is that the study fails to provide sufficient controls to support its central claims as currently formulated.

      Concerns:

      1) A central claim of the paper and the first words in the title are that the behavior studied (categorization of facial expression dynamics) is "shape-invariant". However, the lack of variation in facial shapes (n = 2) used here limits the strength of the conclusions that can be drawn, and it certainly remains an open question whether representations of facial expression dynamics are truly "shape-invariant". A simple control would have been to vary the viewing angle of the avatars, in order to dissociate 3D object shapes from their 2D projections (images). The authors also claim that "face shapes differ considerably" (line 49) amongst primate species, which is clearly true in absolute terms. However, the structural similarity of simian primate facial morphology (i.e. humans and macaques used here) is striking when compared to various non-primate species, which naturally raises questions about just how shape-invariant facial expression recognition is. The lack of data to more thoroughly support the central claim is problematic.

      2) As the authors note, macaque and human facial expressions of 'fear' and 'threat' differ considerably in visual salience and motion content - both in 3D and their 2D projections (i.e. optic flow). Indeed, the decision to 'match' expressions across species based on semantic meaning rather than physical muscle activations is a central problem here. Figure 1A illustrates clearly the relative subtlety of the human expression compared to the macaque avatar's extreme open-mouthed pose, while Fig 1D (right panels) shows that this is also true of macaque expressions mapped onto the human avatar. The authors purportedly controlled for this in an 'optic-flow equilibrated' experiment that produced similar results. However, this crucial control is currently difficult to assess since the control stimuli are not illustrated and the description of their creation (in the supplementary materials) is rather convoluted and obfuscates what the actual control stimuli were.

      The results of this control experiment that are presented (hidden away in supplementary Fig S3C) show that subjects rated the equilibrated stimuli at similar levels of expressiveness for the human vs macaque avatars. However, what the reader really needs to know is whether subjects rated the human vs macaque expression dynamics to be similarly expressive (irrespective of avatar)? My understanding is that species expression (and not species face shape) is the variable that the authors were attempting to equilibrate for.

      In short, the authors have not presented data to convince a reader that their equilibrated stimuli resolve the obvious confound in their original stimuli (namely the correlation between low level visual salience - especially around the mouth region- and the species of the expression dynamics).

      3) This paper appears to be the human psychophysics component of work that the authors have recently published using the macaque avatar. The separate paper (Siebert et al., 2020 - eNeuro) reported basic macaque behavioral responses to similar animations, while the task here takes advantage of the more advanced behavioral methods that are possible in human subjects. Nevertheless, the emphasis of the current paper on cross-species perception raises the question: how do macaques perceive these stimuli? Do the authors have any macaque behavioral data for these stimuli (even if not for the 4AFC task) that could be included to round this out? If not, I recommend rewording the title since its current grammatical structure implies that the encoding is "across species", whereas encoding of species (shape and expression) was only tested in one species (humans).

    2. Reviewer #1:

      Overall assessment:

      The strengths of this paper are the novel cross species stimuli and very interesting behavioural findings, showing sharper tuning for recognising human expression sequences compared to monkey expressions. Technically, the paper is of a very high quality, both in terms of stimulus creation, but also in terms of analysis. Appropriate control experiments have been run, and in my view, the only concern is the way the results are presented, which I believe can be dealt with by restructuring the text. Other than that, I feel this would make a very nice contribution to the field.

      Concerns:

      The only major concern that I have is that the main take-home messages do not come through clearly in the way the Results section is currently structured. I found there was still too much technical detail - despite considerable use of Supplementary Information (SI) - which made extracting the empirical findings quite hard work. The details of the multinomial regression, the model comparisons (Table 1) and even the Discriminant Functions (Fig 2), for example, could all be briefly mentioned in the main text, with details provided in Methods or SI. These are all interesting, but I feel the focus should be on the behavioural findings, not the methods.

      I would suggest using the Discussion as a guide (this clearly states the key points) making sure the focus is more on Figure 3 and then working through the points more concisely.

      Obviously, this can be achieved simply by re-writing and does not take away from the significance of the work in any way. While the quality of the English is generally very high, some very minor wording issues could also be dealt with at this stage.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The paper employs novel cross species stimuli and a well-designed psychophysical paradigm to study the visual processing of facial expression. The authors show that facial expression discrimination is largely invariant to face shape (human vs. monkey). Furthermore, they reveal sharper tuning for recognising human expressions compared to monkey expressions, independent of whether these expressions were conveyed by a human or a monkey face. Technically, the paper is of a very high quality, both in terms of stimulus creation, but also in terms of analysis.

    1. Reviewer #3:

      This paper investigates the role of Sox2 in early hippocampal development. Previously the authors investigated conditional knockout mice using a Nestin-Cre line and found few phenotypes. The authors hypothesised that Sox2 may have greater impact on earlier developmental stages. The authors used a similar approach in a previous paper (Ferri et al., 2003) studying the ventral forebrain. To test this in the dorsal telencephalon they generated conditional knockout mice using both Emx1-Cre and FoxG1Cre driver lines. These lines displayed more significant phenotypes in the hippocampus, particularly in the cortical hem and dentate gyrus, and were most severe in the FoxG1Cre cross.

      The study is well executed and carefully thought through. Appropriate controls have been included for all experiments.

      In Figure 6, the data on Gli3 has been verified with additional luciferase data. The data on Cxcr4 has been previously published and has not been further verified with luciferase analysis. Including panel C in the figure may not be justified unless additional data is included to verify the result. It could be referred to in the discussion.

      In addition, related to Figure 6, Bertolini et al., 2019, identified a number of Sox2-responsive enhancers expressed in the dorsal telencephalon, but it is not clear why these are not incorporated into the model. Further justification in the discussion would be helpful. The authors may also consider discussing how Emx2 fits into their model, since they previously showed it was a negative regulator of Sox2 (Mariani et al., 2012) and it is required for hippocampal development (e.g. Pellegrini et al., 1996; Yoshida et al., 1997; Zhao et al., 2006).

      Regarding the interpretation of the results in Figure 7, previous work by the authors showed that early deletion of Sox2 using a Bf1Cre driver line resulted in severe developmental defects of the ganglionic eminences and therefore GABAergic interneurons. Is the development of GABAergic interneurons affected in the FoxG1Cre cross? It would be preferable to include some analysis of this, or at least a discussion of this issue in the context of the electrophysiology results.

      The authors use an eYFP reporter line in Figure 1- supplement. If they have similar data demonstrating Cre activity with the eYFP reporter crossed into the FoxG1CreXSox2flox/flox and Emx1CreXSox2 flox/flox it would be good to add this. It would demonstrate cell autonomous knockout versus non-cell autonomous knockout of Sox2 and may help with the interpretation of Sox2 function.

    2. Reviewer #2:

      This study examines the phenotype of early deletion of Sox2 and shows that there is a major dentate phenotype when fl-Sox2 mice are crossed to Foxg1-Cre when compared to Emx1-Cre or Nestin-Cre. This is a novel phenotype, but I don't think the authors have examined it in enough depth to establish its basis. In addition, I am concerned about a major confounding issue (see below). I believe significant additional studies are needed to establish the specific role of Sox2 here. Below I list the major concerns.

      1) The authors rely on Foxg1-Cre for their main evidence that very early deletion of Sox2 leads to near loss of the dentate. However, it doesn't appear that the authors are aware that Foxg1 het mice have a fairly significant dentate phenotype (see this paper). The Foxg1-Cre line generated by Hebert and used by the authors is a knock-in allele that inactivates the endogenous Foxg1 gene. The authors need to address whether the phenotype they observe is actually due to loss of Sox2 alone at E9.5 vs the combined loss of Sox2 and a copy of Foxg1. In particular, could this explain the difference between Emx1-Cre and Foxg1-Cre lines? If this is the explanation for the difference, it isn't clear to me that the story really holds together without bringing in far more complex compound mutant explanations.

      2) The phenotype as described by the authors appears to be most compatible with the published Wnt3a mutant phenotype - perhaps a hypomorphic version makes the most sense or a near phenocopy of the Lef1 mutant. Given this, it appears to me this is really a hem phenotype and is likely explained by the loss of Wnt3a predominantly. Yet the authors don't show direct regulation of Wnt3a by Sox2 - the study would be dramatically enhanced by addressing the mechanism of loss of Wnt3a expression. In addition, examining the expression of Lef1 might reveal the more proximal mechanism of loss of DGC than simply less Wnt3a. This might also be another potential direct target of Sox2 since Lef1 expression is regulated by Wnt signaling but also by other morphogenic signals and could be a Sox2 target.

      3) The authors provide little specific analysis of hippocampal subfield specific markers. Their assumption is that the cells that are in the malformed dentate are granule neurons but they don't use any specific markers of DGC (eg Prox1). Instead they rely on cell position and expression of NeuroD (which is nonspecific). Similarly, it would make sense to examine other markers of mossy cells and CA3, which are also in the same region as DGC and made by adjacent neuroepithelium.

      4) Much of the study relies on the assumption that Nestin-Cre is an efficient deleter in the entire hippocampus yet there is no direct evidence of this. The authors could easily determine when Sox2 expression is lost in the various Cre-deleter lines using antibodies.

      5) I think the electrophysiology section isn't very useful or important. We know that mice with major developmental defects in the DG and hippocampus will have changes in circuit physiology. There is nothing specific about this phenotype, nor does it shed light on the important biology here.

      6) The only two direct targets they find don't seem likely to be important players in the phenotypes they describe, thus, it seems that they don't necessarily address the biology here. The Gli3 phenotypes that have been published are quite distinct from this.

      7) Some of the dentate phenotype is no doubt due to defects in CR cell production or development and this indirect effect has been seen in many other mutants that affect CR cell production (ie a disorganized dentate). It is hard to see how this part of the phenotype, which is likely due to the hem defects (the neuroepithelium that makes the CR cells) is helping us to understand the fundamental aspects of this phenotype.

    3. Reviewer #1:

      In the paper by Mercurio et al, the authors examine the role of SOX2 in the development of mouse hippocampal dentate gyrus. Using conditionally mutant SOX2 mice the authors show that early, but not late, deletion of SOX2 leads to developmental impairments of the dentate gyrus. A drawback of their study is that these findings have been reported previously by the group (Favaro et al. 2009; Ferri et al. 2013). In the current study the authors show additional examples of SOX2 target genes, which are dysregulated in the cortical hem upon early SOX2 deletion. However, as no mechanistic insights into how this may affect dentate gyrus development are provided, the general novelty of the study is limited.

      Comments:

      1) The language of the manuscript needs to be improved.

      2) Using different Cre-drivers the authors aim to delete SOX2 at different developmental stages. What references demonstrate that EMX-Cre first deletes SOX2 after E10.5? I cannot find where in Tronche et al. 1999 it is shown that Nes-Cre deletes after E11.5.

      3) At line 149 the authors state "...remarkably, in the FoxG1-Cre cKO, the DG appears to be almost absent (Figure 2A).". The question is why this finding is remarkable as it already was published in (Ferri et al. 2013).

      4) Line 154 "In the FoxG1-Cre cKO, Reelin expression (marking CRC) is greatly reduced, and a HF is not observed (Figure 2D);...". This statement has no support from Figure 2D.

      5) Some of the images presented in Figure 4 are of such poor quality that they are hard to evaluate.

      6) In Figure 6 the authors show that SOX2 interacts with the promoter region of the Cxcr4 gene and that the SOX2 bound enhancer is active in the developing Zebrafish brain. These data can be removed as they have been published previously in Bertolini et al. 2019.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript. Joseph G Gleeson (Howard Hughes Medical Institute, The Rockefeller University) served as the Reviewing Editor.

      Summary:

      The positive aspects of the paper are the examination of the role of SOX2 in the development of mouse hippocampal dentate gyrus. Using conditionally mutant SOX2 mice the authors show that early, but not late, deletion of SOX2 leads to developmental impairments of the dentate gyrus. There were substantial criticisms of the work, most importantly that the work does not advance the field as much as is expected for a high-ranking journal, considering prior publications, and that there could be some difficulties interpreting the data as presented.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      We would like to express our appreciation for both the Editors’ and Reviewers’ efforts as essential contributions to the peer review process. We highly value the Reviewers’ constructive critique of our manuscript #RC-2020_00434R entitled “A drug repurposing screen identifies hepatitis C antivirals as inhibitors of the SARS-CoV2 main protease.”

      We appreciate the Reviewers’ thoughtful consideration of our work and feel their critiques and recommendations have significantly improved our manuscript. Taken together, we believe the additional data, clarification of data presentation, and revised discussion address the heart of the Reviewers’ previous concerns. Thus we feel the work is ready for reconsideration and will be an impactful addition to the literature appropriate for publication. Below we provide a breakdown and a point by point response to previous review critiques.

      Thank you for your attention. We look forward to your response.

      Best Wishes,

      Brian Kraemer, PhD ▪ Associate Director for Research Geriatric Research Education and Clinical Center ▪ Veterans Affairs Puget Sound Health Care System ▪ Research Professor ▪ Departments of Medicine, Psychiatry and Behavioral Sciences, and Pathology ▪ University of Washington ▪ 1660 South Columbian Way ▪ Seattle, WA 98108 ▪ Phone 206-277-1071 ▪ www.kraemerlab.uw.edu

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Baker et al. report the screening of a collection of ~6,070 drugs for their inhibitory activity against the enzymatic activity of the SARS-CoV-2 Mpro protein in vitro using two peptide substrates. 50 compounds with activity against Mpro were identified and tested for their dose-dependent effect in the same assay. Several hits were identified, among which are approved drugs that target the HCV protease.

      Indeed, there is an urgent need for effective drugs for SARS-CoV-2 infection, and high throughput screenings can discover novel candidates. However, the novelty of this work is quite limited, as former screens have been published with the same target using the same substrates. Moreover, as discussed below the translational impact of the hits discussed is also quite limited, particularly in the absence of antiviral data. Lastly, there are several overstatements in the write up and it will require major editing.

      **Major comments:**

      1. Were there any positive controls previously shown to potently inhibit the SARS-CoV-2 Mpro included in the screen (e.g. ebselen)? How did these perform in this assay?

      When first designing our protease assay, we did use ebselen as the initial control. Ebselen showed low potency in all our assays and was not considered as a positive control subsequently. It should be noted that ebselen failed to work against multiple substrates. It is possible that our buffer conditions prevented ebselen activity. See data plotted below. After identifying boceprevir as a potent inhibitor, it was used in all subsequent assays as a positive control.

      It will be helpful if the authors would provide info re the 50 hits from prior screens conducted with this library of compounds - how promiscuous are they across screens? How toxic in cell based assays?

      We have updated the table to provide additional useful information as well as a footnote explaining statuses. The compounds in the Broad repurposing library are generally non-toxic and information about them can be found here: https://clue.io/repurposing

      The translational potential of the findings appears to be limited. The calculated IC50s for these drugs in the Mpro assay are very high (10-1000 fold higher) relative to their IC50 in an enzymatic assay involving the HCV protease (Boceprevir: IC50 = 0.95 μM vs. 0.084 μM in HCV), Ciluprevir (IC50 = 20.77 μM vs. 0.0087 in HCV), Telaprevir (IC50 = 15.25 μM vs. 0.050 μM in HCV) (https://aac.asm.org/content/aac/57/12/6236.full.pdf). In the absence of antiviral data, the main statement of the manuscript that "the work presented here supports the rapid evaluation of previous HCV NS3/4A inhibitors for repurposing as a COVID-19 therapy." is thus an overstatement. Even if there is some activity, since it is likely to be limited, as with the HIV protease inhibitors, its chances of eliciting a meaningful clinical effect are low. Moreover, when used in monotherapy, some of these protease inhibitors have a very low genetic barrier to resistance.

      We have reworked the discussion to incorporate these concerns and limitations of our results.
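
      The fold differences referred to in the comment above can be checked directly from the quoted values. The following is a small sketch, assuming every quoted IC50 is in μM (the ciluprevir HCV value is given without a unit in the comment); the dictionary and function names are illustrative only.

```python
# IC50 values in micromolar, copied from the reviewer comment above.
# Assumption: the ciluprevir HCV value (0.0087) is in micromolar like the others.
QUOTED_IC50_UM = {
    "boceprevir": {"mpro": 0.95, "hcv": 0.084},
    "ciluprevir": {"mpro": 20.77, "hcv": 0.0087},
    "telaprevir": {"mpro": 15.25, "hcv": 0.050},
}


def fold_difference(mpro_ic50_um, hcv_ic50_um):
    """Return how many times higher the Mpro IC50 is than the HCV protease IC50."""
    if hcv_ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return mpro_ic50_um / hcv_ic50_um


def fold_table(ic50s):
    """Map each drug name to its Mpro/HCV IC50 ratio."""
    return {
        drug: fold_difference(values["mpro"], values["hcv"])
        for drug, values in ic50s.items()
    }


if __name__ == "__main__":
    for drug, fold in fold_table(QUOTED_IC50_UM).items():
        print(f"{drug}: {fold:.0f}-fold higher IC50 against Mpro")
```

      Under that unit assumption the ratios come out at roughly 11-fold for boceprevir, 2,400-fold for ciluprevir and 305-fold for telaprevir.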

      There are additional inaccurate or overstatements - e.g. line 61 "Probably the most successful approved antivirals are protease inhibitors such as atazanavir for HIV-1 and simeprevir for hepatitis C. [reviewed in 10 and 11]."

      We have reworded this statement: (Page 4, Lines 61-62)

      “There is precedence for targeting the protease, as this approach has been successful in treating both HIV-1 and hepatitis C (10,11).”

      The manuscript requires editing - e.g. structure of sentences, commas, spacing (including in the abstract) etc.

      The manuscript has been re-proofed throughout (see tracked changes version of manuscript)

      What is the take home message? The statement "Taken together this work suggests previous large-scale commercial drug development initiatives targeting hepatitis C NS3/4A viral protease should be revisited because some previous lead compounds may be more potent against SARS-CoV-2 Mpro than Boceprevir and suitable for rapid repurposing." is unclear.

      The take home message of the manuscript is that HCV-targeting protease inhibitors have potential in blocking the SARS-CoV-2 protease and a more thorough analysis of the space is needed. As the reviewer pointed out, the identified hits boceprevir and narlaprevir are less potent when targeting the SARS-CoV-2 protease as compared to the HCV protease. However, we believe this work does show the potential for screening HCV-targeting protease inhibitors that may not have made it to the clinic. For instance, boceprevir or narlaprevir analogs may be even more potent against Mpro. Further, we believe that these compounds would benefit from further optimization through medicinal chemistry.

      We have expanded the discussion to incorporate issues brought up here and in point 3.

      Reviewer #1 (Significance (Required)):

      Limited. As discussed above

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The SARS-CoV-2 pandemic is causing a serious global health crisis. There are currently no specific medicines or vaccines to contain this virus. To address this issue, the authors developed an efficient fluorescent Mpro assay system and screened ~6070 previously used drugs. Several compounds with activity against SARS-CoV-2 Mpro in vitro were found. Most hits are hepatitis C NS3/4A protease inhibitors with fair IC50 values. In addition, the authors found that most compounds identified in in silico screens lack activity against Mpro in kinetic protease assays.

      These results are well supported and reproducible, but I have two minor questions:

      1. In describing your Mpro assay optimization, you state that the substrate MCA-AVLQSGFR-K(Dnp)-K-NH2 had drastically lower rates of Mpro-catalyzed hydrolysis and was not considered further in assay development, and Fig. 1 shows extremely low RFU changes. However, several good inhibitors were identified using this substrate in a screen reported in April. Can you explain this result?

      The substrates used in our assay appear to be much more efficiently cleaved, at least with the buffer conditions and Mpro concentrations we tested. Variables including recombinant Mpro purity and activity, differences in assay buffer, and reader sensitivity may all play a role, but our best guess is that the substrate identified by Marcin Drag’s group (https://doi.org/10.1101/2020.04.29.068890) is more readily cleaved by Mpro. Although screening with other reported substrates is feasible given previous results, we believe Ac-Abu-Tle-Leu-Gln-AFC to be superior for high-throughput screening because its cleavage kinetics yield an improved signal-to-background ratio.

      To exclude inhibitors that may act as aggregators, a detergent-based control should be run alongside the IC50 measurements.

      Compound aggregation is a concern, and our assays were all run with detergent in the buffer. Our buffer composition was 20 mM Tris pH 7.8, 150 mM NaCl, 1 mM EDTA, 1 mM DTT, 0.05% Triton X-100.

      Reviewer #2 (Significance (Required)):

      Nice work, but the significance of this article is now diminishing. Most of the screened hits have been reported in the last several months. Some inhibitor complex structures have been published or released in the Protein Data Bank. The novelty is missing. I suggest the authors add more results and resubmit.

      **Referees Cross-commenting**

      I agree with the other two reviewers' comments. The significance of this work is diminishing, but it is still of some interest. I think it could be published in a lower-impact journal if the authors address our suggestions.

      We concur with both reviewers that demonstration of antiviral activity would strengthen the impact of the manuscript. However, this work remains outside of the scope of feasibility at our institution. We believe that our screen and hit identification can stand on their own until further translational work can be completed.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this report, Baker et al. show that four inhibitors of hepatitis C virus (HCV) NS3/4 protease (ciluprevir, boceprevir, narlaprevir and telaprevir) are also effective inhibitors of the SARS-CoV-2 main protease (Mpro) in enzymatic assays, with lower IC50 values for narlaprevir and boceprevir (around 1 µM in their assay conditions). HCV NS3/4 inhibitors were identified after screening a library of >6,000 compounds of the Broad Institute, including approved drugs. Screening was done with fluorometric proteolytic assays.

      Experiments have been apparently well-done and results are sound. The manuscript needs editing.

      Reviewer #3 (Significance (Required)):

      Experiments have been apparently well-done and results are sound. However, this is a limited study since there are no data obtained in cell culture and a comparison of IC50 values of the selected drugs against HCV and SARS-CoV-2 proteases is missing. It is difficult to infer whether the drugs would be as effective against SARS-CoV-2 as against HCV and, if not, how much the doses would need to increase in order to have a therapeutic effect.

      The manuscript needs editing (see below) and the Discussion is poor. The results reported by authors are not new, and a discussion of the effects of HCV inhibitors on SARS-CoV-2 replication, based on previous publications is necessary to provide the appropriate context for the study.

      Here are some references on Covid-19 and HCV inhibitors that, in my opinion, should be considered for discussion and proper citation. As correctly pointed out by Baker and co-workers, docking studies should be considered with caution, though.

      We appreciate the feedback and have now reworked and expanded the discussion to incorporate reviewer #1 and #3 comments and suggestions.

      1: Ghahremanpour MM, Tirado-Rives J, Deshmukh M, Ippolito JA, Zhang CH, de Vaca IC, Liosi ME, Anderson KS, Jorgensen WL. Identification of 14 Known Drugs as Inhibitors of the Main Protease of SARS-CoV-2. bioRxiv [Preprint]. 2020 Aug 28:2020.08.28.271957. doi: 10.1101/2020.08.28.271957. PMID: 32869018; PMCID: PMC7457600.

      2: Sacco MD, Ma C, Lagarias P, Gao A, Townsend JA, Meng X, Dube P, Zhang X, Hu Y, Kitamura N, Hurst B, Tarbet B, Marty MT, Kolocouris A, Xiang Y, Chen Y, Wang J. Structure and inhibition of the SARS-CoV-2 main protease reveals strategy for developing dual inhibitors against Mpro and cathepsin L. bioRxiv [Preprint]. 2020 Jul 27:2020.07.27.223727. doi: 10.1101/2020.07.27.223727. PMID: 32766590; PMCID: PMC7402059.

      3: Ma C, Sacco MD, Hurst B, Townsend JA, Hu Y, Szeto T, Zhang X, Tarbet B, Marty MT, Chen Y, Wang J. Boceprevir, GC-376, and calpain inhibitors II, XII inhibit SARS-CoV-2 viral replication by targeting the viral main protease. Cell Res. 2020 Aug;30(8):678-692. doi: 10.1038/s41422-020-0356-z. Epub 2020 Jun 15. PMID: 32541865; PMCID: PMC7294525.

      4: Ke YY, Peng TT, Yeh TK, Huang WZ, Chang SE, Wu SH, Hung HC, Hsu TA, Lee SJ, Song JS, Lin WH, Chiang TJ, Lin JH, Sytwu HK, Chen CT. Artificial intelligence approach fighting COVID-19 with repurposing drugs. Biomed J. 2020 May 15:S2319-4170(20)30049-4. doi: 10.1016/j.bj.2020.05.001. Epub ahead of print. PMID: 32426387; PMCID: PMC7227517.

      5: Elzupir AO. Inhibition of SARS-CoV-2 main protease 3CLpro by means of α-ketoamide and pyridone-containing pharmaceuticals using in silico molecular docking. J Mol Struct. 2020 Dec 15;1222:128878. doi: 10.1016/j.molstruc.2020.128878. Epub 2020 Jul 10. PMID: 32834113; PMCID: PMC7347502.

      Additional computational studies:

      1: Hosseini FS, Amanlou M. Anti-HCV and anti-malaria agent, potential candidates to repurpose for coronavirus infection: Virtual screening, molecular docking, and molecular dynamics simulation study. Life Sci. 2020 Aug 8;258:118205. doi:10.1016/j.lfs.2020.118205. Epub ahead of print. PMID: 32777300; PMCID:PMC7413873.

      2: Hakmi M, Bouricha EM, Kandoussi I, Harti JE, Ibrahimi A. Repurposing of known anti-virals as potential inhibitors for SARS-CoV-2 main protease using molecular docking analysis. Bioinformation. 2020 Apr 30;16(4):301-306. doi:10.6026/97320630016301. PMID: 32773989; PMCID: PMC7392094.

      3: Chtita S, Belhassan A, Aouidate A, Belaidi S, Bouachrine M, Lakhlifi T. Discovery of Potent SARS-CoV-2 Inhibitors from Approved Antiviral Drugs via Docking Screening. Comb Chem High Throughput Screen. 2020 Jul 30. doi:10.2174/1386207323999200730205447. Epub ahead of print. PMID: 32748740.

      4: Alamri MA, Tahir Ul Qamar M, Mirza MU, Bhadane R, Alqahtani SM, Muneer I, Froeyen M, Salo-Ahen OMH. Pharmacoinformatics and molecular dynamics simulation studies reveal potential covalent and FDA-approved inhibitors of SARS-CoV-2 main protease 3CLpro. J Biomol Struct Dyn. 2020 Jun 24:1-13. doi:10.1080/07391102.2020.1782768. Epub ahead of print. PMID: 32579061; PMCID:PMC7332866.

      5: Bafna K, Krug RM, Montelione GT. Structural Similarity of SARS-CoV2 Mpro and HCV NS3/4A Proteases Suggests New Approaches for Identifying Existing Drugs Useful as COVID-19 Therapeutics. ChemRxiv [Preprint]. 2020 Apr 21. doi: 10.26434/chemrxiv.12153615. PMID: 32511291; PMCID: PMC7263768.

      6: Eleftheriou P, Amanatidou D, Petrou A, Geronikaki A. In Silico Evaluation of the Effectivity of Approved Protease Inhibitors against the Main Protease of the Novel SARS-CoV-2 Virus. Molecules. 2020 May 29;25(11):2529. doi:10.3390/molecules25112529. PMID: 32485894; PMCID: PMC7321236.

      7: Wang J. Fast Identification of Possible Drug Treatment of Coronavirus Disease-19 (COVID-19) through Computational Drug Repurposing Study. J Chem Inf Model. 2020 Jun 22;60(6):3277-3286. doi: 10.1021/acs.jcim.0c00179. Epub 2020 May 4. PMID: 32315171; PMCID: PMC7197972.

      8: Chen YW, Yiu CB, Wong KY. Prediction of the SARS-CoV-2 (2019-nCoV) 3C-like protease (3CL pro) structure: virtual screening reveals velpatasvir, ledipasvir, and other drug repurposing candidates. F1000Res. 2020 Feb 21;9:129. doi: 10.12688/f1000research.22457.2. PMID: 32194944; PMCID: PMC7062204.

      Minor comments:

      We appreciate the time that the reviewer has taken to address grammatical changes and have addressed each throughout the manuscript with tracked changes.

      p.2, line 26: > appears as an attractive

      Manuscript edited

      p.2, line 27: > we show that the existing

      Manuscript edited

      p.2, line 33: > separate numbers and units, e.g. 1.10 µM (this is a persistent error that should be corrected throughout the whole ms)

      Manuscript edited

      p.4, line 44: SARS virus should be referred to as SARS-CoV-1 throughout the whole manuscript. MERS-CoV is the name of the virus causing MERS.

      Manuscript edited

      p.4, lines 61-62: > the selection of the specific compounds seems to be arbitrary... why atazanavir and not darunavir or other? The sentence should be rewritten.

      Rewritten as: “There is precedence for targeting the protease, as this approach has been successful in treating both HIV-1 and hepatitis C.”

      p.6, line 100: Citing Fig. 2B before completing the description of Fig. 1 is distracting. Authors should think of a better way to describe their results.

      This was a mistake and should have cited Fig 1B. Thank you for catching this.

      p.7, line 116: It is not clear what "10m-20,810" means

      This has been clarified to state: “ΔRFU at 10 minutes = 20,810 relative fluorescence units”

      p.7, lines 125-126: These sentences belong to an introduction, not appropriate in results section.

      We have removed these sentences.

      Figure 2. Part A is not necessary in results (ok for introduction). Black and purple dots in part B are not a good choice since they are difficult to distinguish; maybe orange and black would be better.

      We have removed panel A, expanded the size of panel B and changed the color.

      Table 1: Status should be explained in a footnote (i.e. the distinction between launched, P2/P3, phase 2, and preclinical is not clear).

      The one compound indicated in P2/P3 development is now Phase 3 and the table has been updated. We have added a footnote:

      *Launched = compound approved for humans, though may only be approved for veterinary use in some countries

      Discussion. I think that subheadings are not necessary.

      Subheadings have been removed from the discussion.

      **Referees cross-commenting** I agree with reviewer no. 1 on the limited interest of the study. However, it could be published in a specialized lower-impact journal after addressing issues raised by reviewers 2 and 3 (likely to be completed in less than a month)

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      In this report, Baker et al. show that four inhibitors of hepatitis C virus (HCV) NS3/4A protease (ciluprevir, boceprevir, narlaprevir and telaprevir) are also effective inhibitors of the SARS-CoV-2 main protease (Mpro) in enzymatic assays, with lower IC50 values for narlaprevir and boceprevir (around 1 µM in their assay conditions). HCV NS3/4A inhibitors were identified after screening a library of >6,000 compounds of the Broad Institute, including approved drugs. Screening was done with fluorometric proteolytic assays.

      The experiments appear to have been well done and the results are sound. The manuscript needs editing.

      Significance

      The experiments appear to have been well done and the results are sound. However, this is a limited study since there are no data obtained in cell culture and a comparison of IC50 values of the selected drugs against HCV and SARS-CoV-2 proteases is missing. It is difficult to infer whether the drugs would be as effective against SARS-CoV-2 as against HCV and, if not, by how much the doses would need to increase in order to have a therapeutic effect. The manuscript needs editing (see below) and the Discussion is poor. The results reported by the authors are not new, and a discussion of the effects of HCV inhibitors on SARS-CoV-2 replication, based on previous publications, is necessary to provide the appropriate context for the study. Here are some references on COVID-19 and HCV inhibitors that, in my opinion, should be considered for discussion and proper citation. As correctly pointed out by Baker and co-workers, docking studies should be considered with caution, though.

      1: Ghahremanpour MM, Tirado-Rives J, Deshmukh M, Ippolito JA, Zhang CH, de Vaca IC, Liosi ME, Anderson KS, Jorgensen WL. Identification of 14 Known Drugs as Inhibitors of the Main Protease of SARS-CoV-2. bioRxiv [Preprint]. 2020 Aug 28:2020.08.28.271957. doi: 10.1101/2020.08.28.271957. PMID: 32869018; PMCID: PMC7457600.

      2: Sacco MD, Ma C, Lagarias P, Gao A, Townsend JA, Meng X, Dube P, Zhang X, Hu Y, Kitamura N, Hurst B, Tarbet B, Marty MT, Kolocouris A, Xiang Y, Chen Y, Wang J. Structure and inhibition of the SARS-CoV-2 main protease reveals strategy for developing dual inhibitors against M<sup>pro</sup> and cathepsin L. bioRxiv [Preprint]. 2020 Jul 27:2020.07.27.223727. doi: 10.1101/2020.07.27.223727. PMID: 32766590; PMCID: PMC7402059.

      3: Ma C, Sacco MD, Hurst B, Townsend JA, Hu Y, Szeto T, Zhang X, Tarbet B, Marty MT, Chen Y, Wang J. Boceprevir, GC-376, and calpain inhibitors II, XII inhibit SARS-CoV-2 viral replication by targeting the viral main protease. Cell Res. 2020 Aug;30(8):678-692. doi: 10.1038/s41422-020-0356-z. Epub 2020 Jun 15. PMID: 32541865; PMCID: PMC7294525.

      4: Ke YY, Peng TT, Yeh TK, Huang WZ, Chang SE, Wu SH, Hung HC, Hsu TA, Lee SJ, Song JS, Lin WH, Chiang TJ, Lin JH, Sytwu HK, Chen CT. Artificial intelligence approach fighting COVID-19 with repurposing drugs. Biomed J. 2020 May 15:S2319-4170(20)30049-4. doi: 10.1016/j.bj.2020.05.001. Epub ahead of print. PMID: 32426387; PMCID: PMC7227517.

      5: Elzupir AO. Inhibition of SARS-CoV-2 main protease 3CLpro by means of α-ketoamide and pyridone-containing pharmaceuticals using in silico molecular docking. J Mol Struct. 2020 Dec 15;1222:128878. doi: 10.1016/j.molstruc.2020.128878. Epub 2020 Jul 10. PMID: 32834113; PMCID: PMC7347502.

      Additional computational studies:

      1: Hosseini FS, Amanlou M. Anti-HCV and anti-malaria agent, potential candidates to repurpose for coronavirus infection: Virtual screening, molecular docking, and molecular dynamics simulation study. Life Sci. 2020 Aug 8;258:118205. doi:10.1016/j.lfs.2020.118205. Epub ahead of print. PMID: 32777300; PMCID:PMC7413873.

      2: Hakmi M, Bouricha EM, Kandoussi I, Harti JE, Ibrahimi A. Repurposing of known anti-virals as potential inhibitors for SARS-CoV-2 main protease using molecular docking analysis. Bioinformation. 2020 Apr 30;16(4):301-306. doi:10.6026/97320630016301. PMID: 32773989; PMCID: PMC7392094.

      3: Chtita S, Belhassan A, Aouidate A, Belaidi S, Bouachrine M, Lakhlifi T. Discovery of Potent SARS-CoV-2 Inhibitors from Approved Antiviral Drugs via Docking Screening. Comb Chem High Throughput Screen. 2020 Jul 30. doi:10.2174/1386207323999200730205447. Epub ahead of print. PMID: 32748740.

      4: Alamri MA, Tahir Ul Qamar M, Mirza MU, Bhadane R, Alqahtani SM, Muneer I, Froeyen M, Salo-Ahen OMH. Pharmacoinformatics and molecular dynamics simulation studies reveal potential covalent and FDA-approved inhibitors of SARS-CoV-2 main protease 3CL<sup>pro</sup>. J Biomol Struct Dyn. 2020 Jun 24:1-13. doi:10.1080/07391102.2020.1782768. Epub ahead of print. PMID: 32579061; PMCID:PMC7332866.

      5: Bafna K, Krug RM, Montelione GT. Structural Similarity of SARS-CoV2 M<sup>pro</sup> and HCV NS3/4A Proteases Suggests New Approaches for Identifying Existing Drugs Useful as COVID-19 Therapeutics. ChemRxiv [Preprint]. 2020 Apr 21. doi: 10.26434/chemrxiv.12153615. PMID: 32511291; PMCID: PMC7263768.

      6: Eleftheriou P, Amanatidou D, Petrou A, Geronikaki A. In Silico Evaluation of the Effectivity of Approved Protease Inhibitors against the Main Protease of the Novel SARS-CoV-2 Virus. Molecules. 2020 May 29;25(11):2529. doi:10.3390/molecules25112529. PMID: 32485894; PMCID: PMC7321236.

      7: Wang J. Fast Identification of Possible Drug Treatment of Coronavirus Disease-19 (COVID-19) through Computational Drug Repurposing Study. J Chem Inf Model. 2020 Jun 22;60(6):3277-3286. doi: 10.1021/acs.jcim.0c00179. Epub 2020 May 4. PMID: 32315171; PMCID: PMC7197972.

      8: Chen YW, Yiu CB, Wong KY. Prediction of the SARS-CoV-2 (2019-nCoV) 3C-like protease (3CL <sup>pro</sup>) structure: virtual screening reveals velpatasvir, ledipasvir, and other drug repurposing candidates. F1000Res. 2020 Feb 21;9:129. doi: 10.12688/f1000research.22457.2. PMID: 32194944; PMCID: PMC7062204.

      Minor comments:

      p.2, line 26: > appears as an attractive

      p.2, line 27: > we show that the existing

      p.2, line 33: > separate numbers and units, e.g. 1.10 µM (this is a persistent error that should be corrected throughout the whole ms)

      p.4, line 44: SARS virus should be referred to as SARS-CoV-1 throughout the whole manuscript. MERS-CoV is the name of the virus causing MERS.

      p.4, lines 61-62: > the selection of the specific compounds seems to be arbitrary... why atazanavir and not darunavir or other? The sentence should be rewritten.

      p.6, line 100: Citing Fig. 2B before completing the description of Fig. 1 is distracting. Authors should think of a better way to describe their results.

      p.7, line 116: It is not clear what "10m-20,810" means

      p.7, lines 125-126: These sentences belong to an introduction, not appropriate in results section.

      Figure 2. Part A is not necessary in results (ok for introduction). Black and purple dots in part B are not a good choice since they are difficult to distinguish; maybe orange and black would be better.

      Table 1: Status should be explained in a footnote (i.e. the distinction between launched, P2/P3, phase 2, and preclinical is not clear).

      Discussion. I think that subheadings are not necessary.

      Referees cross-commenting

      I agree with reviewer no. 1 on the limited interest of the study. However, it could be published in a specialized lower-impact journal after addressing issues raised by reviewers 2 and 3 (likely to be completed in less than a month)

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      The SARS-CoV-2 pandemic is causing a serious global health crisis. There are currently no specific medicines or vaccines to contain this virus. To address this issue, the authors developed an efficient fluorescent Mpro assay and used it to screen ~6,070 previously used drugs. Several compounds with activity against SARS-CoV-2 Mpro in vitro were found. Most hits are hepatitis C NS3/4A protease inhibitors with fair IC50 values. In addition, the authors found that most compounds identified in in silico screens lack activity against Mpro in kinetic protease assays.

      These results are well supported and reproducible. However, I have two minor questions:

      1. In your Mpro assay optimization you state that the substrate MCA-AVLQSGFR-K(Dnp)-K-NH2 had drastically lower rates of Mpro-catalyzed hydrolysis and was not considered further in your assay development, and in your Fig. 1 I saw extremely low RFU changes. However, several good inhibitors were identified in a screen using this substrate, which was reported in April. Can you explain this result?

      2. To exclude inhibitors that may act as aggregators, a detergent-based control should be run alongside the IC50 measurements.

      Significance

      Nice work, but the significance of this article has diminished. Most of the hits have been reported in the last several months, and some inhibitor complex structures have been published or released in the Protein Data Bank. The novelty is missing. I suggest the authors add more results and resubmit.

      Referees Cross-commenting

      I agree with the other two reviewers' comments. The significance of this work has diminished, but it is still of some interest. I think it can be published in a lower-impact journal if the authors address our suggestions.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Baker et al. report the screening of a collection of ~6,070 drugs for their inhibitory activity against the enzymatic activity of the SARS-CoV-2 Mpro protein in vitro using two peptide substrates. 50 compounds with activity against Mpro were identified and tested for their dose-dependent effect in the same assay. Several hits were identified, among which are approved drugs that target the HCV protease. Indeed, there is an urgent need for effective drugs for SARS-CoV-2 infection, and high-throughput screens can discover novel candidates. However, the novelty of this work is quite limited, as former screens have been published with the same target using the same substrates. Moreover, as discussed below, the translational impact of the hits discussed is also quite limited, particularly in the absence of antiviral data. Lastly, there are several overstatements in the write-up and it will require major editing.

      Major comments:

      1. Were there any positive controls previously shown to potently inhibit the SARS-CoV-2 Mpro included in the screen (e.g. ebselen)? How did these perform in this assay?

      2. It would be helpful if the authors provided information on the 50 hits from prior screens conducted with this library of compounds: how promiscuous are they across screens? How toxic are they in cell-based assays?

      3. The translational potential of the findings appears to be limited. The calculated IC50s for these drugs in the Mpro assay are very high (10-1000 fold higher) relative to their IC50s in an enzymatic assay involving the HCV protease: Boceprevir (IC50 = 0.95 μM vs. 0.084 μM in HCV), Ciluprevir (IC50 = 20.77 μM vs. 0.0087 μM in HCV), Telaprevir (IC50 = 15.25 μM vs. 0.050 μM in HCV) (https://aac.asm.org/content/aac/57/12/6236.full.pdf). In the absence of antiviral data, the main statement of the manuscript that "the work presented here supports the rapid evaluation of previous HCV NS3/4A inhibitors for repurposing as a COVID-19 therapy" is thus an overstatement. Even if there is some activity, it is likely to be limited, as with the HIV protease inhibitors, so its chances of eliciting a meaningful clinical effect are low. Moreover, when used in monotherapy, some of these protease inhibitors have a very low genetic barrier to resistance.

      4. There are additional inaccuracies or overstatements, e.g. line 61: "Probably the most successful approved antivirals are protease inhibitors such as atazanavir for HIV-1 and simeprevir for hepatitis C. [reviewed in 10 and 11]."

      5. The manuscript requires editing, e.g. sentence structure, commas, and spacing (including in the abstract).

      6. What is the take-home message? The statement "Taken together this work suggests previous large-scale commercial drug development initiatives targeting hepatitis C NS3/4A viral protease should be revisited because some previous lead compounds may be more potent against SARS-CoV-2 Mpro than Boceprevir and suitable for rapid repurposing." is unclear.
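
      The fold differences cited in comment 3 can be checked with a short calculation. The sketch below is an editorial illustration rather than part of the referee's report: it uses only the IC50 values quoted in that comment, assumes the ciluprevir HCV value (0.0087) is in μM like the others, and the names `ic50_um` and `fold_change` are hypothetical.

```python
# IC50 values (in µM) as quoted in comment 3: (SARS-CoV-2 Mpro, HCV protease).
# Assumption: the ciluprevir HCV value (0.0087) is in µM like the others.
ic50_um = {
    "boceprevir": (0.95, 0.084),
    "ciluprevir": (20.77, 0.0087),
    "telaprevir": (15.25, 0.050),
}


def fold_change(mpro_ic50: float, hcv_ic50: float) -> float:
    """Return how many times higher the Mpro IC50 is than the HCV IC50."""
    return mpro_ic50 / hcv_ic50


# Ratio per drug; a larger ratio means weaker inhibition of Mpro relative to HCV.
fold = {drug: fold_change(*values) for drug, values in ic50_um.items()}

for drug, ratio in fold.items():
    print(f"{drug}: ~{ratio:.0f}-fold higher IC50 against Mpro")
```

      Taking the quoted values at face value, the ratios come out at roughly 11-fold for boceprevir, 305-fold for telaprevir, and about 2,400-fold for ciluprevir, so the ciluprevir gap is somewhat larger than the upper end of the stated 10-1000 fold range.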

      Significance

      Limited, as discussed above.

    1. Reviewer #3:

      This fMRI study examines an interesting question, namely how computer code - as a "cognitive/cultural invention" - is processed by the human brain. However, I have a number of concerns with regard to how this question was examined in terms of experimental design, including the choice of control condition (fake code) and the way in which localiser tasks were utilised. In addition, the sample size is very small (n=15) and there appear to be large inter-individual differences in coding performance (in spite of the recruitment of expert programmers). In summary, while promising in its aims, the study's conclusions are weakened by these considerations related to its execution.

      1) The control condition

      The experiment contrasted real Python code with fake code in the form of "incomprehensible scrambled Python functions". Real and fake code also differed in regard to the task performed (code comprehension versus memory) and were distinguished via colour coding. There is a lot to unpack here in regard to how processing might differ between the two different conditions. For example, the real code blocks required code comprehension as well as computational problem solving (which does not necessarily require the use of code), while the control task requires neither. As a result of the colour coding, it also appears likely that participants will have approached the fake code blocks with a completely different processing strategy than the real code blocks. These are just a few obvious differences between the conditions but there are likely many more given how different they are. This, in my view, makes it difficult to interpret the basic contrast between real and fake code.

      2) Use of localiser tasks

      A similar concern as for point 1 holds in regard to the localiser tasks that were used in order to examine anatomical overlap (or lack thereof) between code comprehension and language, maths, logical problem solving and multiple-demand executive control, respectively. I am generally somewhat sceptical in regard to the use of functional localisers in view of the assumptions that necessarily enter into the definition of a localiser task. This concern is exacerbated by the way in which localisers were employed in the present study. Firstly, in addition to the definition of the localiser task itself, this study used localiser contrasts to define networks of interest. For example, the contrast language localiser > maths localiser served to define the "language network". Thus, assumptions about the nature of the localiser itself are compounded with those regarding the nature of the contrast. Secondly, particularly with regard to language, the localiser task was very high level, i.e. requiring participants to judge whether an active and a passive sentence had the same meaning (with both statements remaining on the screen at the same time). While of course requiring language processing, this task is arguably also a problem solving task of sorts. It is certainly more complex than a typical task designed to probe fast and automatic aspects of natural language processing.

      In addition, given that reading is also a cultural invention, is it really fair to say that coding is being compared to the "language network" here rather than to the "reading network" (in view of the visual presentation of the language task)? The possible implications of this for the interpretation of the data should be considered.

      More generally, while an anatomical overlap between networks active during code comprehension and networks recruited during other cognitive tasks may shed some initial light on how the brain processes code, it doesn't support any particularly strong conclusions about the neural mechanisms of code processing in my view. While code comprehension may overlap anatomically with regions involved in executive control and logic, this doesn't mean that the same neuronal populations are recruited in each task nor that the processing mechanisms are comparable between tasks.

      3) Sample size and individual differences

      At n=15, the sample size of this study is quite small, even for a neuroimaging study. This again limits the conclusions that can be drawn from the study results.

      Moreover, the results of the behavioural pre-test - which was commendably included - suggest that participants differed considerably with regard to their Python expertise. For the more difficult exercise in this pre-test, the mean accuracy score was 64.6% with a range from 37.5% to 93.75%. These substantial differences in proficiency weren't taken into account in the analysis of the fMRI data and, indeed, it appears difficult to meaningfully do so in view of the sample size.

    2. Reviewer #2:

      The goal of this fMRI study was to determine which brain systems support coding, by way of the extent of overlap of univariate maps with localizer tasks for language, logic, math, and executive functions. The basic conclusion is one we could have anticipated: coding engages a widespread frontoparietal network, with stronger involvement of the left hemisphere. It overlaps with all of the other tasks, but most with the map for logic. This doesn't seem too surprising, but the authors argue convincingly that others wouldn't have predicted that.

      It's unfortunate that there are differences in task difficulty among the tasks - in particular, that the logic task was the most difficult of all (both in terms of accuracy and response times), since that happens to be the one that had the largest number of overlapping voxels with the coding task. We can't know whether coding and language task voxels would have overlapped more if the language task had been more difficult.

      It seems a shame to present data only from highly experienced coders (11+ years of experience); I can imagine that the investigators are planning to write up another study examining effects of expertise, in comparison with less experienced coders. This seems like an initial paper that's laying the groundwork for a more groundbreaking one.

    3. Reviewer #1:

      This manuscript is clearly written and the methods appear to be rigorous, although the number of subjects (15) is a bit low; however, this does not appear to critically limit interpretation of the results. I appreciated the focused inclusion on expert coders to make a clear comparison to language. I also thought that the inclusion of multiple domains for comparison (logic, math, executive function, and language) was quite informative. The laterality covariance between code and language was also quite interesting. I do have some concerns with the literature review and discussion of present and previous results.

      1) My main concern with this paper is that it does not clearly review previous fMRI studies on code processing. How do the present results compare with previous studies (e.g. Castelhano et al., 2019; Floyd et al., 2017; Huang et al., 2019; Krueger et al., 2020; Siegmund et al., 2014, 2017)? It seems like the localization/lateralization obtained in the present study is largely similar to these previous studies (e.g. Siegmund et al., 2017). If so, this should be discussed: a convergence across multiple methods/authors is useful to know. Any discrepancies are also useful to know. The authors suggest that "Moreover, no prior study has directly compared the neural basis of code to other cognitive domains." However, Krueger et al. (2020) and Huang et al. (2019) appear to have done this.

      2) The authors should point out and discuss the difficulty of understanding the psychological and neural structure of coding in the absence of a clear theory of coding, of the kind that exists for language (e.g. Chomsky, 1965; Levelt, 1989; Lewis & Vasishth, 2005). On this point, I appreciate the reference to Fitch et al. (2005) regarding recursion in coding, but I think it would be most helpful to have a clear example of recursion in Python code. However, the authors at least focus their results on neural underpinnings without attempting to make strong claims about cognitive underpinnings.

      3) The authors report overlap between code comprehension and language in the posterior MTG and IFG. They note that these activations were somewhat inconsistent; yet, they did observe this significant overlap. However, the paper discusses the results as if this overlap did not occur, e.g. "We find that the perisylvian fronto-temporal network that is selectively responsive to language, relative to math, does not overlap with the neural network involved in code comprehension." This is not accurate, as there indeed was overlap. It is important to point out that among language-related regions, these two regions are the most strongly associated with abstract syntax (Friederici, 2017; Hagoort, 2005; Tyler & Marslen-Wilson, 2008; Pallier et al., 2011; Bornkessel-Schlesewsky & Schlesewsky, 2013; Matchin & Hickok, 2019), which very well could be a point of shared resources among code and language (as discussed in Fitch, 2005).
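
      As an illustration of the kind of example requested in point 2, a recursive Python function is one that calls itself on a smaller part of its input. The sketch below is hypothetical and is not taken from the manuscript or its stimuli; the function name `nested_sum` is chosen only for illustration. It sums an arbitrarily nested list, a case where the self-embedding of the code mirrors the embedded structure of the data.

```python
def nested_sum(items):
    """Sum all numbers in a list that may itself contain nested lists."""
    total = 0
    for item in items:
        if isinstance(item, list):
            # Recursive case: the function calls itself on the embedded list.
            total += nested_sum(item)
        else:
            # Base case: a plain number is added directly.
            total += item
    return total


print(nested_sum([1, [2, 3], [4, [5, 6]]]))  # prints 21
```

      The recursive call handles any depth of nesting without the caller needing to know that depth in advance, which is the property that invites the comparison with hierarchical embedding in natural-language syntax.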

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      This was co-submitted with the following manuscript: https://www.biorxiv.org/content/10.1101/2020.04.16.045732v1

      Summary:

      The remit of the co-submission format is to ask whether the scientific community is enriched more by the data presented in the co-submitted manuscripts together than it would be by the papers apart, or by only one paper being presented to the community. In other words, are the conclusions that can be made stronger or clearer when the manuscripts are considered together rather than separately? We felt that, despite significant concerns with each paper individually, especially regarding the theoretical structures in which the experimental results could be interpreted, this was the case.

      We want to be very clear that in a non-co-submission case we would have substantial and serious concerns about the interpretability and robustness of the Liu et al. submission given its small sample size. Furthermore, the reviewers' concerns about the suitability of the control task differed substantially between the manuscripts. We share these concerns. However, despite these differences in control task and sample size, the Liu et al. and Ivanova et al. submissions nonetheless replicated each other - the language network was not implicated in processing programming code. The replication substantially mitigates the concerns shared by us and the reviewers about sample size and control tasks. The fact that different control tasks and sample sizes did not change the overall pattern of results, in our view, is affirmation of the robustness of the findings, and the value that both submissions presented together can offer the literature.

      In sum, there were concerns that both submissions were exploratory in nature, lacking a strong theoretical focus, and relied on functional localizers on novel tasks. However, these concerns were mitigated by the following strengths. Both tasks ask a clear and interesting question. The results replicate each other despite task differences. In this way, the two papers strengthen each other. Specifically, the major concerns for each paper individually are ameliorated when considering them as a whole.

      The concerns of the reviewers need addressing, including, specifically, the limits of interpretation of your results with regard to control task choice, the discussion of relevant literature mentioned by the reviewers, and most crucially, please contextualize your results with regard to the other submission's results.

    1. Reviewer #2:

      This carefully designed fMRI study examines an interesting question, namely how computer code - as a "cognitive/cultural invention" - is processed by the human brain. The study has a number of strengths, including: use of two very different programming languages (Python and Scratch Jr.) in two experiments; direct comparison between code problems and "content-matched sentence problems" to disentangle code comprehension from problem content; control for the impact of lexical information in code passages by replacing variable names with Japanese translations; and consideration of inter-individual differences in programming proficiency. I do, however, have some questions regarding the interpretation of the results in mechanistic terms, as detailed below.

      1) Code comprehension versus underlying problem content

      I am generally somewhat sceptical in regard to the use of functional localisers in view of the assumptions that necessarily enter into the definition of a localiser task. In addition, an overlap between the networks supporting two different tasks doesn't imply comparable neural processing mechanisms. With the present study, however, I was impressed by the authors' overall methodological approach. In particular, I found the supplementation of the localiser-based approach with the comparison between code problems and analogous sentence problems rather convincing.

      However, while I agree that computational thinking does not require coding / code comprehension, it is less clear to me what code comprehension involves when it is stripped of the computational thinking aspect. Knowing how to approach a problem algorithmically strikes me as a central aspect of coding. What, then, is being measured by the code problem versus sentence problem comparison? Knowledge of how to implement a certain computational solution within a particular programming language? The authors touch upon this briefly in the Discussion section of the paper, but I am not fully convinced by their arguments. Specifically, they state:

      "The process of code comprehension includes retrieving code-related knowledge from memory and applying it to the problems at hand. This application of task-relevant knowledge plausibly requires attention, working memory, inhibitory control, planning, and general flexible reasoning - cognitive processes long linked to the MD system [...]." (p.17)

      Shouldn't all of this also apply (or even apply more strongly) to processing of the underlying problem content rather than to code comprehension per se?

      According to the authors, the extent to which code-comprehension-related activity reflects problem content varies between different systems. At the bottom of p.9, they conclude that "MD responses to code [...] do not exclusively reflect responses to problem content", while on p.13 they argue on the basis of their voxel-wise correlation analysis that "the language system's response to code is largely (although not completely) driven by problem content." However, unless I have missed something, the latter analysis was undertaken only for the language system and not for the other systems under examination. Was there a particular reason for this? Also, what are the implications of observing problem content-driven responses within the language system for the authors' conclusion that this system is "functionally conservative"?
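
      For readers unfamiliar with this kind of analysis, the general idea of a voxel-wise correlation can be sketched as below. This is only an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, and the names `pearson`, `code_problems`, `sentence_problems` and `unrelated` are hypothetical. It shows that two conditions sharing an underlying spatial response pattern correlate strongly across voxels, whereas an unrelated pattern does not.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of voxel responses."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Synthetic data (assumption): one response value per voxel and condition.
rng = random.Random(0)
n_voxels = 200

# A spatial pattern shared by both problem formats, plus independent noise.
shared = [rng.gauss(0, 1) for _ in range(n_voxels)]
code_problems = [s + rng.gauss(0, 0.5) for s in shared]
sentence_problems = [s + rng.gauss(0, 0.5) for s in shared]

# A pattern with no relationship to the shared one.
unrelated = [rng.gauss(0, 1) for _ in range(n_voxels)]

r_matched = pearson(code_problems, sentence_problems)
r_unrelated = pearson(code_problems, unrelated)
print(f"matched content: r = {r_matched:.2f}")
print(f"unrelated pattern: r = {r_unrelated:.2f}")
```

      Running the same comparison within each system of interest, rather than within the language system alone, is the kind of extension the question above asks about.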

      Overall, the paper would be strengthened by more clarity in regard to these issues - and specifically a more detailed discussion of what code comprehension may amount to in mechanistic terms when it is stripped of computational thinking.

      2) Implications of using reading for the language localiser task

      Given that reading is also a cultural invention, is it really fair to say that coding is being compared to the "language system" here rather than to the "reading system" (in view of the visual presentation of the language task)? The possible implications of this for the interpretation of the data should be considered.

      3) Possible effects of verbalisation?

      It appears possible that participants may have internally verbalised code problems - at least to a certain extent (and likely with a considerable degree of inter-individual variability). How might this have affected the results of the present study? Could verbalisation be related to the highly correlated response between code problems and language problems within the language system?

    2. Reviewer #1:

      The manuscript is well-written and the methods are clear and rigorous, representing a clear advance on previous research comparing computer code programming to language. The conclusions with respect to which brain networks computer programming activates are compelling and well conveyed. This paper is useful to the extent that the conclusions are focused on the empirical findings: whether or not code activates language-related brain regions (answer: no). However, the authors appear to be also testing whether or not any of the mechanisms involved in language are recruited for computer programming. The problem with this goal is that the authors do not present or review a theory of the representations and mechanisms involved in computer programming, as has been developed for language (e.g. Adger, 2013; Bresnan, 2001; Chomsky, 1965, 1981, 1995; Goldberg, 1995; Hornstein, 2009; Jackendoff, 2002; Levelt, 1989; Lewis & Vasishth, 2005; Vosse & Kempen, 2000).

      1) p. 15: "The fact that coding can be learned in adulthood suggests that it may rely on existing cognitive systems." p. 3: "Finally, code comprehension may rely on the system that supports comprehension of natural languages: to successfully process both natural and computer languages, we need to access stored meanings of words/tokens and combine them using hierarchical syntactic rules (Fedorenko et al., 2019; Murnane, 1993; Papert, 1993) - a similarity that, in theory, should make the language circuits well-suited for processing computer code." If we understand stored elements and computational structure in the broadest way possible without breaking this down more, many domains of cognition would be shared in this way. The authors should illustrate in more detail how the psychological structure of computer programming parallels language. Is there an example of hierarchical structure in computer code? What is the meaning of a variable/function in code, and how does this compare to meaning in language?
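
      One concrete sense in which code is hierarchically structured can be sketched with Python's standard `ast` module, which exposes the syntax tree that the interpreter builds for a statement. The snippet is an illustration only: the statement in `SOURCE` and the helpers `outline` and `nesting_depth` are hypothetical and are not taken from the manuscript's stimuli.

```python
import ast


def outline(node, depth=0):
    """Return one indented line per syntax-tree node, parents before children."""
    lines = ["  " * depth + type(node).__name__]
    for child in ast.iter_child_nodes(node):
        lines.extend(outline(child, depth + 1))
    return lines


def nesting_depth(node):
    """Return the number of nodes on the longest path from `node` to a leaf."""
    children = list(ast.iter_child_nodes(node))
    if not children:
        return 1
    return 1 + max(nesting_depth(child) for child in children)


# A single assignment whose right-hand side embeds one expression in another.
SOURCE = "total = price * (1 + tax)"
tree = ast.parse(SOURCE)
assign = tree.body[0]

print("\n".join(outline(tree)))
print("depth:", nesting_depth(tree))
```

      Here the parenthesised sum is a constituent of the product, which is in turn a constituent of the assignment. Whether this kind of embedding is psychologically comparable to phrase structure in language is exactly the question the authors would need to address.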

      2) p. 19 lines 431-433: "Our findings, along with prior findings from math and logic (Amalric & Dehaene, 2019; Monti et al., 2009, 2012), argue against this possibility: the language system does not respond to meaningful structured input that is non-linguistic." This is an overly simple characterization of the word "meaningful". The meaning of math and logic is not the same as in language. Both mathematics and computer programming have logical structure to them, but the nature of this structure and the elements that are combined in language are different. Linguistic computations take as input complex atoms of computation that have phonological and conceptual properties. These atoms are commonly used to refer to entities "in the world" with complex semantic properties and often have rich associated imagery. Linguistic computations output complex, monotonically enhanced forms. So cute + dogs = cute dogs, chased + cute dogs = chased cute dogs, etc. This is very much unlike mathematics and computer programming, where we typically do not make reference to the "real world" using these expressions to interlocutors, and outputs of an expression are not monotonic, structure-preserving combinations of the input elements, and there is no semantic enhancement that occurs through increased computation. This bears much more discussion in the paper, if the authors intend to make claims regarding shared/distinct computations between computer programming and language.
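
      The contrast drawn in this point can be made concrete with a toy sketch. It is not a model of linguistic composition: `combine_phrase` is a hypothetical helper that merely concatenates words, and the arithmetic expression is an arbitrary example. It shows only that the combined phrase still contains its parts, whereas the value of an evaluated expression does not.

```python
def combine_phrase(modifier: str, head: str) -> str:
    """Toy stand-in for phrase building: the output keeps both inputs intact."""
    return f"{modifier} {head}"


# Each combination step adds to the structure built so far.
inner = combine_phrase("cute", "dogs")
phrase = combine_phrase("chased", inner)

# Evaluating an expression collapses its parts into a single value.
value = 2 + 3 * 4

print(phrase)
print(value)
print("inner phrase preserved:", inner in phrase)
print("operand 3 visible in result:", "3" in str(value))
```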

      3) More importantly, even if there were shared mechanisms between computer code programming and language, I'm not sure we can use reverse inference to strongly test this hypothesis. As Poldrack (2006) pointed out, reverse inference is sharply limited by the extent to which we know how cognition maps onto the brain. This is a similar point to Poeppel & Embick (2005), who pointed out that different mechanisms of language could be implemented in the brain in a large variety of ways, only one of which is big pieces of cortical tissue. In this sense, there could in fact be shared mechanisms between language and code (e.g. oscillatory dynamics, connectivity patterns, subcortical structures), but these mechanisms might not be aligned with the cortical territory associated with language-related brain regions. The authors should spend much additional time discussing these alternative possibilities.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      This was co-submitted with the following manuscript: https://www.biorxiv.org/content/10.1101/2020.05.24.096180v3

      Summary:

      The remit of the co-submission format is to ask if the scientific community is enriched by the data presented in the co-submitted manuscripts together more so than it would be by the papers apart, or if only one paper was presented to the community. In other words, are the conclusions that can be made stronger or clearer when the manuscripts are considered together rather than separately? We felt that, despite significant concerns with each paper individually, especially regarding the theoretical structures in which the experimental results could be interpreted, this was the case.

      We want to be very clear that in a non-co-submission case we would have substantial and serious concerns about the interpretability and robustness of the Liu et al. submission given its small sample size. Furthermore, the reviewers' concerns about the suitability of the control task differed substantially between the manuscripts. We share these concerns. However, despite these differences in control task and sample size, the Liu et al. and Ivanova et al. submissions nonetheless replicated each other: the language network was not implicated in processing programming code. The replication substantially mitigates the concerns shared by us and the reviewers about sample size and control tasks. The fact that different control tasks and sample sizes did not change the overall pattern of results, in our view, affirms the robustness of the findings and the value that both submissions presented together can offer the literature.

      In sum, there were concerns that both submissions were exploratory in nature, lacked a strong theoretical focus, and relied on functional localizers on novel tasks. However, these concerns were mitigated by the following strengths. Both tasks ask a clear and interesting question. The results replicate each other despite task differences. In this way, the two papers strengthen each other. Specifically, the major concerns for each paper individually are ameliorated when the papers are considered as a whole.

      The reviewers' concerns need addressing, specifically the limits of interpretation of your results with regard to control task choice and the discussion of relevant literature mentioned by the reviewers. Most crucially, please contextualize your results with regard to the other submission's results.