  1. Feb 2022
    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In their manuscript the authors show the involvement of the AAA ATPase Atad1 in desmin degradation. They identify PLAA and Ubxn4 as partners of Atad1 that participate in its function in desmin degradation.

      A general comment is that some conclusions are overstated. The authors mention several times that Atad1 depolymerises desmin filaments. The data show that Atad1 participates in the degradation of desmin and in its solubilization. "Depolymerisation" should be kept for the model presented in figure 8 but not used in the results section.

      Major comments:

      1. It would be useful to visualize the localization of Atad1 and its partners in muscle fibers by immunofluorescence. Do they colocalize with desmin filaments or with calpain?
      2. Along the same lines, interactors were obtained from large crosslinked complexes. It would make the model more convincing if direct interactions with Atad1 were shown, for example using Proximity Ligation Assays.
      3. Evaluation of atrophy is made on cross-sections of muscles electroporated with shRNAs. Histology pictures should be shown.
      4. What is the percentage of electroporated fibers? To evaluate the effect of shRNAs it is important to have this information. For example, if the efficiency is 50% it means that the reduction in expression of the target in electroporated fibers is twice the value reported for the whole muscle. Alternatively, immunofluorescence could be provided to see the decrease in targeted proteins in electroporated fibers.
      5. The same is true for all the experiments quantifying the effect of shRNAs by western blot. Since quantifications are probably made on whole muscles (i.e., a mix of electroporated and non-electroporated fibers) and since the percentage of electroporated fibers is not given, it is not possible to estimate the efficiency of the shRNAs in electroporated fibers.
      6. Figure 2C: by decreasing solubilization of desmin, one would expect a decrease in the levels of soluble desmin. Instead, the authors observe an increase in both insoluble and soluble desmin. Of course, this can be explained by reduced degradation of desmin once solubilized, but this should be demonstrated, at least by showing that UPS inhibitors induce an increase in soluble ubiquitinated desmin.
      7. Figure 2E: the levels of Atad1 in the insoluble fraction seem to be the same in the shLacZ and GSK3DN conditions, whereas the phospho-Ser signal is different. In other words, there should be more Atad1 in the insoluble fraction with shLacZ than with GSK3DN, since the phosphorylation level with shLacZ is significantly higher.
      8. Figure 4E: the authors state that phosphorylation decreases because of increased degradation (lanes 6-8). However, Calpain also increases degradation and phosphorylation is increased (lanes 2-4), so increasing degradation does not systematically cause a decrease in phosphorylation. Similarly, lane 5 Atad1 induces less degradation than Calpain, however, it causes a decrease in phosphorylation. Explain.
      9. The AAA ATPase VCP shares partners with Atad1 and is involved in muscle atrophy. It would greatly add to the manuscript if the authors inhibited VCP to compare its effect with that of Atad1.
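
      The dilution argument in major comments 4 and 5 can be sketched numerically. This is only an illustration of the reviewer's reasoning, not a method from the manuscript: it assumes that non-electroporated fibers keep normal expression of the target and that all fibers contribute equally to the whole-muscle lysate. The function name and parameters are hypothetical.

```python
def reduction_in_electroporated_fibers(whole_muscle_reduction: float,
                                       electroporated_fraction: float) -> float:
    """Estimate the fractional knockdown inside electroporated fibers.

    Assumptions (not taken from the manuscript): non-electroporated fibers
    express the target at normal levels, and every fiber contributes equally
    to the whole-muscle measurement.
    """
    if not 0.0 < electroporated_fraction <= 1.0:
        raise ValueError("electroporated_fraction must be in (0, 1]")
    if not 0.0 <= whole_muscle_reduction <= 1.0:
        raise ValueError("whole_muscle_reduction must be in [0, 1]")

    # The whole-muscle reduction is the per-fiber reduction diluted by the
    # untransfected fibers, so dividing by the fraction undoes the dilution.
    estimate = whole_muscle_reduction / electroporated_fraction
    if estimate > 1.0:
        raise ValueError("inputs imply more than 100% knockdown; "
                         "the stated assumptions cannot both hold")
    return estimate


if __name__ == "__main__":
    # 25% reduction in whole muscle with 50% of fibers electroporated
    print(reduction_in_electroporated_fibers(0.25, 0.5))  # prints 0.5
```

      With 50% efficiency the estimate is twice the whole-muscle value, as the comment states. Without the electroporated fraction, the same whole-muscle value is compatible with a wide range of per-fiber knockdown efficiencies.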

      Minor comments:

      1. The soluble fraction contains a large number of ubiquitinated proteins. Please explain how it can be stated that an increase in total soluble polyubiquitinated proteins corresponds to an increase in ubiquitinated desmin.
      2. Page 11: the authors conclude that denervation enhances the interactions with Atad1. Figure 3D indeed shows an increase for Ubxn4, but it is not clear for the other proteins.
      3. Figure 4F: please show muscle sections.
      4. Page 21 in vivo transfection: it is stated "see details under immunofluorescence" but there is no immunofluorescence section in materials and methods.
      5. The authors show that Atad1 inhibition in innervated muscle is sufficient to induce muscle hypertrophy (Figure 4E). They conclude that the hypertrophic effect of Atad1 is due to the inhibition of Desmin degradation. However, this hypertrophic effect could be independent of the action of Atad1 on Desmin.

      Significance

      This is new information in the field, since calpain cannot hydrolyze insoluble desmin filaments and the mechanisms that give calpain access to desmin are not known.

      The authors have already made important contributions to the study of muscle atrophy, especially desmin degradation. This work constitutes a new advance in their attempts to understand the molecular mechanisms leading to desmin degradation and muscle atrophy.

      Audience: desmin is the main intermediate filament in skeletal muscle. This work will therefore interest scientists working on skeletal muscle.

      Expertise of the reviewer: molecular and cellular biology of skeletal muscles, muscle atrophy.

      Referee Cross-commenting

      I fully agree with reviewer 1.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      The manuscript reports the identification of a novel protein complex involved in denervation-induced desmin degradation. The first protein to be identified was the ATPase Atad1. A clever isolation strategy was based on the fact that the ATPase p97/VCP is involved in the extraction of ubiquitinated myofibrillar proteins but is not required for the removal of ubiquitinated desmin filaments. The authors reasoned that a related ATPase might be specifically required for desmin filaments. Atad1 was identified by treating desmin filaments with a nonhydrolyzable ATP analog and looking for ATPases that are associated with desmin filaments by proteomics. Knockdown of Atad1 caused a loss of desmin degradation and led to a loss of denervation-induced muscle atrophy. It seems that Atad1 binds desmin in a phosphorylation-dependent manner, although the binding may be mediated by a protein that hasn't yet been identified. The authors went on to identify two additional proteins which, together with Atad1, form a protein complex involved in recruiting calpain for desmin degradation.

      Overall, this study is very convincing and provides important novel insight. I have only some minor comments.

      Minor comments

      1. I wondered whether Atad1 is expressed at higher-than-normal levels in muscle and heart. I looked up the expression pattern, and it seems to be especially abundant in muscle and heart, expressed at lower levels in smooth muscle, and overall restricted in its expression. Maybe you have some data on its expression in muscle tissue. Did you perform immunostaining of muscle tissue at baseline and after denervation to examine the protein's localization?
      2. The string data presented in Figure 3C needs some further explanation with regard to the colors used for the different proteins. While the authors explained the meaning of the proteins labeled in red, there is no explanation for the other colors.
      3. Molecular weights in Fig. 2E and 3D need to be 'repaired', and additional MW information is required for the ubiquitin blot shown in 3D.
      4. Fibre size distributions are shown in Fig. 1D and 4F. Have the differences been statistically tested?
      5. For my taste the referral to the individual data (Fig. numbers) in the discussion section is too detailed and becomes a second results section. This should be substituted by a summary paragraph before the implications are discussed.
      6. The summary slide is very good. However, could you please indicate which of the three proteins in the Atad1 complex is depicted by each symbol?

      Significance

      Novel insight into the proteins involved in desmin filament degradation. Since this is an important subject both in muscle and heart and plays an important role in muscle and heart disease, it is of significant clinical importance. Currently it has only been implicated in denervation-induced skeletal muscle atrophy, but it is likely that desmin filament metabolism is similarly regulated in the heart.

      I am a researcher focusing mainly on cardiac biology, with some expertise in muscle but no specific knowledge of desmin filament biology.

      Referee Cross-commenting

      Overall, I think all three reviewers agree that this is a significant and important paper. I think that the comments made by the reviewers are fair and probably add to the quality of the manuscript.

      Thus, both reviewer 2 and I agree that it would be useful to visualize the localization of Atad1 and its partners in muscle fibers by immunofluorescence. These data would provide independent support for the model the authors are proposing, which currently is based only on biochemical analysis.

      I also support the proposed use of proximity ligation to provide further evidence of the presence of the Atad1, Ubxn4 and PLAA in a complex. However, this experiment depends on the quality of the available antibodies and I would consider this not absolutely required.

      I also agree that some further information on the proteomics data (as suggested by reviewer 3) is required with regard to how the filtering for UPS components was performed.

      The request for further information on the electroporation approach is a valid comment, and if the authors have this information, it would be good to provide it. However, I do not recommend further experiments, as overall the data are very consistent and the findings are very significant and represent a major advance in our understanding of desmin degradation.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      RESPONSE TO REVIEWER #1:

      We wish to express our appreciation to Reviewer #1 for his or her insightful comments, which will significantly improve this paper. We thank the reviewers for giving us the opportunity to improve the manuscript. We have responded to all the comments pointed out. The revised sections are highlighted in red characters and yellow backgrounds in the preliminary revised manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): This manuscript "Histidine-rich protein 2: a new pathogenic factor of Plasmodium falciparum malaria" by Iwasaki, et al. reports effects of recombinant HRP2 protein on various mammalian cell lines. The MS clearly demonstrates that recombinant HRP2 enters HT1080 cells, inhibits autolysosome fusion, increases lysosomal Ca ion concentration and reduces general autophagic degradation. The authors also show that the presence of FBS or metal chelators like EDTA and EGTA mitigates the toxicity of HRP2, as the former traps HRP2 and the latter compete with HRP2 for Ca binding. The experiments are appropriately carried out with suitable controls in most of the cases. There are some concerns as listed below:

      Major concerns:

      1. HRP2 has been shown to be associated with virulence and causes vascular leakage, particularly cerebral malaria (references 37 and 38). Plasmodium falciparum histidine-rich protein II has been demonstrated to exacerbate experimental cerebral malaria in mice, which has been proposed to be associated with vascular leakage, activation of inflammasome and cytokine production (references 37, 38 and PMID: 31858717). This study complements the previous findings of the effect of HRP2 on mammalian cells. However, this study reveals another mechanism by which HRP2 might cause toxicity, which is inhibition of general autophagy and increase in lysosomal Ca concentration. However, whether these in vitro effects would translate in vivo needs to be shown.

      Response: We sincerely appreciate the reviewer's effort to evaluate our work. As the reviewer pointed out, this is an in vitro study, so further in vivo validation is essential in the future. However, it is also true that we discovered new findings that had been overlooked, precisely because we conducted an artificial and simple in vitro experiment. In the future, it will be necessary to demonstrate the cytotoxicity, autophagy inhibition, and lysosomal calcium concentration variation caused by PfHRP2 in in vivo studies using model animals. Concretely, we need to confirm whether PfHRP2 behaves as a similar virulence factor in vivo by animal experiments using PfHRP2-administered mice or mice infected with PfHRP2-overexpressing/deficient P. falciparum. These future tasks have been added to the Discussion (page 9, lines 294–297 and 309–310; page 10, lines 339–342). We have also added the study (PMID: 31858717) reporting that PfHRP2 elicits a pro-inflammatory effect and induces vascular permeability as reference 40.

      Furthermore, the title of the original paper was vague and gave the impression that it included in vivo experiments. Therefore, to avoid misunderstanding, we modified the paper's title to be more concrete, "Plasmodium falciparum histidine-rich protein II exhibits cell penetration and cytotoxicity with autophagy dysfunction".

      Reference

      P. Dinarvand, L. Yang, I. Biswas, H. Giri, A. R. Rezaie, Plasmodium falciparum histidine rich protein HRPII inhibits the anti-inflammatory function of antithrombin. J. Thromb. Haemost. 18, 1473–1483 (2020).

      2. All the experiments are done with recombinant HRP2 and BSA as a control. The authors should show if similar effects happen with infected parasites.

      Response: As the reviewer pointed out, in vivo experiments are required, i.e., to clarify whether the phenomenon observed in the present study also occurs in PfHRP2-administered or P. falciparum-infected mouse models. However, in vivo studies are not possible immediately because we do not have the research facilities to carry out in vivo experiments. Therefore, we have added statements (page 9, lines 294–297 and 309–310; page 10, lines 339–342) to emphasize that the present findings are limited to in vitro conditions and that the further in vivo studies described above will be required in the future.

      3. HRP2 is released in circulation, making it accessible to endothelial cells and immune cells. How would it reach the equivalents of these cells in the human body?

      Response: Since PfHRP2 induces vascular permeability as described in References 37–40, we propose that PfHRP2 can reach and contact cells in the human body after causing vascular leakage. We have added this possibility of contact between PfHRP2 and cells in the human body to the Discussion (page 9, lines 287–290).

      Minor concerns

      1. p62 is an appropriate marker to assess autophagy cargo degradation. If possible, it would be good to support this with LC3 processing as well.

      Response: Following the reviewer's advice, we will use LC3 as an autophagy marker as well as p62 to evaluate the autophagy inhibition of PfHRP2. Concretely, we plan to treat HT1080 cells with PfHRP2 (1 μM) for 12–60 hours and quantify the amount of LC3 protein by Western blotting. The results of this experiment will be added to Fig. 5 in the main manuscript.

      2. HRP2 might affect the general lysosomal degradation process. The authors can also check whether HRP2 affects degradation of a lysosomal substrate.

      Response: Following the reviewer's advice, we will determine the effect of PfHRP2 on lysosomal degradation activity using the plasmid-based lysosomal-METRIQ (MEasurement of protein Transporting integrity by RatIo Quantification) probe, reported in a previous study (https://doi.org/10.1038/s41598-019-48131-2), to quantify lysosomal activity. The results of this experiment will be added to Fig. 5 in the main manuscript.

      Reviewer #1 (Significance (Required)): This study complements previous findings (references 37, 38 and PMID: 31858717). It identifies a new mechanism by which HRP2 might cause toxicity. However, it is completely an in vitro study, and the previous studies (references 37 and 38) have used in vivo models as well.

      Response: We wish to thank the reviewer for this comment. As the reviewer pointed out, this study is completely in vitro, and further in vivo studies are essential in the future. Therefore, we have added statements (page 9, lines 294–297 and 309–310; page 10, lines 339–342) to emphasize that the present findings are limited to in vitro and that further in vivo studies are required in the future. We have also added the study (PMID: 31858717) reporting PfHRP2 elicits pro-inflammatory effect and induces vascular permeability as reference 40.

      Furthermore, the title of the original paper was vague and gave the impression that it included in vivo experiments. Therefore, to avoid misunderstanding, we modified the paper's title to be more concrete.

      Reference

      P. Dinarvand, L. Yang, I. Biswas, H. Giri, A. R. Rezaie, Plasmodium falciparum histidine rich protein HRPII inhibits the anti-inflammatory function of antithrombin. J. Thromb. Haemost. 18, 1473–1483 (2020).

      We thank you again for giving us the opportunity to improve our paper, and we hope that the changes are satisfactory.

      RESPONSE TO REVIEWER #2:

      We wish to express our appreciation to Reviewer #2 for his or her insightful comments, which will significantly improve this paper. We thank the reviewers for giving us the opportunity to improve the manuscript. We have responded to all the comments pointed out. The revised sections are highlighted in red characters and yellow backgrounds in the preliminary revised manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): This paper showed that recombinant Plasmodium falciparum HRPII generated in E. coli is internalized by human tumor-derived cell lines and, at high concentrations, induces calcium-dependent cell death. The authors propose that HRPII inhibits autolysosome formation and autophagy. Of major concern is the use of E. coli-generated HRP2 without addressing the inherent confounders of copurified bacterial components, namely endotoxin LPS. It is crucial for validation of their conclusions that the authors address steps taken to remove endotoxin which is known to bind poly-histidine and HRPII, the quantification of endotoxin bound to purified protein, and the LPS sensitivity of model cell lines. Even small quantities of LPS have been shown to potentially inhibit endosome maturation (https://doi.org/10.1074/jbc.M114.611442). Would recommend caution with conclusions regarding cytotoxicity and autophagy inhibition without addressing this issue.

      Response: We sincerely appreciate the reviewer's effort to evaluate our work. The reviewer points out that the endotoxin LPS may also affect the cytotoxicity and autophagy inhibition of PfHRP2 in this study. The reviewer's point is crucial, and we agree with the reviewer. In our study, recombinant PfHRP2 was captured by anti-FLAG antibody-immobilized affinity gel (Medical & Biological Laboratories Co., Ltd., Nagoya, Japan) and washed with 20 bed volumes of washing buffer (20 mM Tris-HCl pH 7.4, 500 mM NaCl, 0.1% Triton X-100) to remove contaminants including endotoxin LPS according to the manufacturer's protocol (https://ruo.mbl.co.jp/bio/dtl/dtlfiles/3328R-ver4.0.pdf). After washing, the affinity gel was equilibrated with 10 bed volumes of washing buffer without Triton X-100, and recombinant PfHRP2 was eluted with 10 bed volumes of elution buffer (20 mM Tris-HCl pH 7.4, 500 mM NaCl, 0.1 mg/mL FLAG peptide: DYKDDDDK). However, we did not determine the residual endotoxin LPS bound to purified PfHRP2. To address the reviewer's concern, we will follow the reviewer's suggestion and quantify the residual endotoxin LPS in the purified PfHRP2 using the LAL Endotoxin Assay Kit. We also plan to test whether endotoxin LPS alone, at the same amount as the residual endotoxin LPS, affects cytotoxicity and autophagy inhibition. The results of additional experiments on endotoxin LPS will be added to Supplementary Information as Fig. S2. Furthermore, we have added additional information on the purification and washing of PfHRP2 to the Materials and methods section (page 11, lines 356–362).

      Additional concerns for specific experiments are as follows: Figure 2A. There is an increase in BSA penetration at lower pH as well which suggests nonspecific increased cell permeability.

      Response: As pointed out by the reviewer, the cell membrane permeability of BSA was enhanced at low pH (pH less than 5.8), and this result implies an increase in nonspecific cell permeability. Since we have reported in another study (https://doi.org/10.1093/bbb/zbab221) that BSA penetrates human gastric cancer cell lines at pH 5.0, the cell membrane permeability of BSA at low pH in this study is expected. However, comparing pH 7.4 and pH 5.6, the net charge of BSA increased by 21.9, from -14.0 (pH 7.4) to +7.9 (pH 5.6), and its cell penetration increased by 34%. On the other hand, the net charge of PfHRP2 increased by 79.4, from -19.2 (pH 7.4) to +60.2 (pH 5.6), and its cell penetration increased by 246%. This suggests that the increase in cell membrane permeability of PfHRP2 under low pH conditions is due to the increase in net charge, not to the nonspecific increase in cell permeability seen with BSA. The above explanation has been added to lines 97–103.
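
      The arithmetic in this response can be rechecked with a short sketch. It only recomputes the net-charge changes from the values quoted above and compares the two proteins; it does not derive net charge from sequence, and the class and variable names are hypothetical. A comparison of two proteins illustrates the argument but cannot by itself establish that net charge drives penetration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChargeShift:
    """Values quoted in the response for one protein (names are illustrative)."""
    name: str
    charge_ph74: float               # net charge at pH 7.4
    charge_ph56: float               # net charge at pH 5.6
    penetration_increase_pct: float  # increase in cell penetration, in percent

    @property
    def charge_change(self) -> float:
        # Change in net charge when going from pH 7.4 to pH 5.6
        return self.charge_ph56 - self.charge_ph74


bsa = ChargeShift("BSA", -14.0, 7.9, 34.0)
pfhrp2 = ChargeShift("PfHRP2", -19.2, 60.2, 246.0)

# How much larger the PfHRP2 shifts are relative to the BSA control
charge_ratio = pfhrp2.charge_change / bsa.charge_change
penetration_ratio = pfhrp2.penetration_increase_pct / bsa.penetration_increase_pct

if __name__ == "__main__":
    for p in (bsa, pfhrp2):
        print(f"{p.name}: charge change {p.charge_change:+.1f}, "
              f"penetration +{p.penetration_increase_pct:.0f}%")
    print(f"charge-change ratio (PfHRP2/BSA): {charge_ratio:.2f}")
    print(f"penetration-increase ratio (PfHRP2/BSA): {penetration_ratio:.2f}")
```

      The recomputed charge changes match the quoted values of 21.9 and 79.4. Both the charge shift and the penetration increase are several-fold larger for PfHRP2 than for BSA, which is the pattern the response relies on.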

      Figure 3A, 3B, and 4C. There is inconsistency between the cell viability data. For example, in panel A, 1 μM of HRPII for 24 h showed 84% cell viability whereas in panel B, the cell viability is 61% for 1 μM HRP2 by 24 hours. Figure 3A and 4C (full length) differ at cell viability for 5 μM HRP2.

      Response: We thank the reviewer for the critical remarks. There was an error in the time condition described in the graph of Fig. 3A. Fig. 3A actually shows the viability of cells treated with 1 μM PfHRP2 for 3 hours, so we have corrected the time condition described in Fig. 3A. Namely, Fig. 3A and 3B show that a 3-hour treatment with 1 μM PfHRP2 results in 84% cell viability, but a 24-hour treatment with 1 μM PfHRP2 decreases cell viability to 61%. These results are correctly described in lines 119–120, highlighted in yellow.

      On the other hand, as the reviewer points out, in Fig. 3A and Fig. 4C (full-length PfHRP2), the viability of cells treated with 5 μM PfHRP2 for 24 hours was 5% and 26%, respectively. We believe that the discrepancy in these values is due to experimental error. However, both Fig. 3A and Fig. 4C (full-length PfHRP2) agree that 5 μM PfHRP2 is significantly cytotoxic, so the discrepancy should not affect the claims of this study.

      Figure 5C. It would be more informative if the cell viability data at 1 μM of HRP at timepoints beyond 60 hours and for bafilomycin treatment is also presented.

      Response: We thank the reviewer for their suggestions. However, the purpose of the experiment in Figure 5C is to prove that PfHRP2 induces autolysosomal dysfunction. Since we confirmed that treatment of cells with 1 µM PfHRP2 for 60 hours resulted in accumulation of p62 in the same amount as the positive control, Bafilomycin A1, we believe that no further additional experiments are necessary.

      Figure 3D. (Minor) Consider additional experimental detail regarding maintenance of cell cultures for 5 days. Are there interval media changes or supplement additions?

      Response: We apologize for the insufficient information in the description of the experimental procedure in Fig. 3D. In the experiment in Fig. 3D, cell culture was maintained for 5 days by replacing the medium every day with fresh medium containing each concentration of proteins. We have added this information to the legends of Figure 3 (page 23, lines 653–655) and Figure S2 (page 4, lines 28–29).

      Reviewer #2 (Significance (Required)): The authors present the novel finding of HRP2 permeability into human cells. The significance of these findings is limited given the major confounder with endotoxin and also since the experiments were conducted in tumor-derived cell lines with supraphysiologic concentrations of HRPII. Although the authors showed cell viability effects with lower concentrations over 3 and 5 days, the bulk of the experiments were at more than 10-fold mean physiological concentrations. Also, since these are early findings in tumor-derived cell lines, it is difficult to extrapolate the physiological relevance of these findings and use of calcium chelators as therapeutics. Several studies have proposed a pathogenic role for HRP2 including those cited in the paper regarding blood-brain barrier disruption (references 37 and 38), coagulation disruption (DOI: 10.1182/blood-2010-12-326876), and pro-inflammatory signaling (DOI: 10.1111/jth.14713). The challenge with all these studies is establishing the clinical relevance of the multitude of HRPII effects. If the issue of endotoxin is addressed, this paper could establish an interesting mechanism for further study in more clinically representative systems. Our lab has studied the many functions of HRPII including catalysis of heme polymerization, inhibition of antithrombin, brain endothelial disruption using tissue culture and mouse models.

      Response: As pointed out by the reviewer, this study must clear up the effect of endotoxin LPS. In this regard, as mentioned above, we plan to quantify the residual endotoxin LPS in the purified PfHRP2 using the LAL Endotoxin Assay Kit. We will also check the effect of the endotoxin LPS itself on cytotoxicity and autophagy inhibition.

      Furthermore, as the reviewer pointed out, this is an in vitro study using high concentrations of PfHRP2 and a tumor-derived cell line, so further in vivo validation is essential in the future. However, it is also true that we discovered new findings that had been overlooked, precisely because we conducted an artificial and simple in vitro experiment. In the future, it will be necessary to demonstrate the cytotoxicity and autophagy inhibition of PfHRP2 in in vivo studies using model animals. Concretely, we need to confirm whether PfHRP2 behaves as a similar virulence factor in vivo by animal experiments using PfHRP2-administered mice or mice infected with PfHRP2-overexpressing/deficient P. falciparum. We also need to demonstrate that calcium chelators such as EDTA have an in vivo therapeutic effect. These future tasks have been added to the Discussion (page 9, lines 294–297 and 309–310). We have also added the studies (DOI: 10.1182/blood-2010-12-326876, DOI: 10.1111/jth.14713) reporting that PfHRP2 elicits a pro-inflammatory effect and induces vascular permeability as references 37 and 40.

      Furthermore, the title of the original paper was vague and gave the impression that it included in vivo experiments. Therefore, to avoid misunderstanding, we modified the paper's title to be more concrete, "Plasmodium falciparum histidine-rich protein II exhibits cell penetration and cytotoxicity with autophagy dysfunction".

      References

      M. Ndonwi, et al., Inhibition of antithrombin by Plasmodium falciparum histidine-rich protein II. Blood 117, 6347–6354 (2011).

      P. Dinarvand, L. Yang, I. Biswas, H. Giri, A. R. Rezaie, Plasmodium falciparum histidine rich protein HRPII inhibits the anti-inflammatory function of antithrombin. J. Thromb. Haemost. 18, 1473–1483 (2020).

      We thank you again for giving us the opportunity to improve our paper, and we hope that the changes are satisfactory.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      This paper showed that recombinant Plasmodium falciparum HRPII generated in E. coli is internalized by human tumor-derived cell lines and, at high concentrations, induces calcium-dependent cell death. The authors propose that HRPII inhibits autolysosome formation and autophagy.

      Of major concern is the use of E. coli generated HRP2 without addressing the inherent confounders of copurified bacterial components, namely endotoxin LPS. It is crucial for validation of their conclusions that the authors address steps taken to remove endotoxin which is known to bind poly-histidine and HRPII, the quantification of endotoxin bound to purified protein, and the LPS sensitivity of model cell lines. Even small quantities of LPS have been shown to potentially inhibit endosome maturation (https://doi.org/10.1074/jbc.M114.611442). Would recommend caution with conclusions regarding cytotoxicity and autophagy inhibition without addressing this issue.

      Additional concerns for specific experiments are as follows:

      Figure 2A. There is an increase in BSA penetration at lower pH as well which suggests nonspecific increased cell permeability.

      Figure 3A, 3B, and 4C. There is inconsistency between the cell viability data. For example, in panel A, 1 μM of HRPII for 24 h showed 84% cell viability whereas in panel B, the cell viability is 61% for 1 μM HRP2 by 24 hours. Figure 3A and 4C (full length) differ at cell viability for 5 μM HRP2.

      Figure 5C. It would be more informative if the cell viability data at 1 uM of HRP at timepoints beyond 60 hours and for bafilomycin treatment is also presented.

      Figure 3D. (Minor) Consider additional experimental detail regarding maintenance of cell cultures for 5 days. Are there interval media changes or supplement additions?

      Significance

      The authors present the novel finding of HRP2 permeability into human cells. The significance of these findings is limited given the major confounder with endotoxin and also since the experiments were conducted in tumor-derived cell lines with supraphysiologic concentrations of HRPII. Although the authors showed cell viability effects with lower concentrations over 3 and 5 days, the bulk of the experiments were at more than 10-fold mean physiological concentrations. Also, since these are early findings in tumor-derived cell lines, it is difficult to extrapolate the physiological relevance of these findings and use of calcium chelators as therapeutics.

      Several studies have proposed a pathogenic role for HRP2 including those cited in the paper regarding blood-brain barrier disruption (references 37 and 38), coagulation disruption (DOI: 10.1182/blood-2010-12-326876), and pro-inflammatory signaling (DOI: 10.1111/jth.14713). The challenge with all these studies is establishing the clinical relevance of the multitude of HRPII effects. If the issue of endotoxin is addressed, this paper could establish an interesting mechanism for further study in more clinically representative systems.

      Our lab has studied the many functions of HRPII including catalysis of heme polymerization, inhibition of antithrombin, brain endothelial disruption using tissue culture and mouse models.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      This manuscript "Histidine-rich protein 2: a new pathogenic factor of Plasmodium falciparum malaria" by Iwasaki, et al. reports effects of recombinant HRP2 protein on various mammalian cell lines. The MS clearly demonstrates that recombinant HRP2 enters HT1080 cells, inhibits autolysosome fusion, increases lysosomal Ca ion concentration and reduces general autophagic degradation. The authors also show that the presence of FBS or metal chelators like EDTA and EGTA mitigates the toxicity of HRP2, as the former traps HRP2 and the latter compete with HRP2 for Ca binding. The experiments are appropriately carried out with suitable controls in most of the cases. There are some concerns as listed below:

      Major concerns:

      1. HRP2 has been shown to be associated with virulence and to cause vascular leakage, particularly in cerebral malaria (references 37 and 38). Plasmodium falciparum histidine-rich protein II has been demonstrated to exacerbate experimental cerebral malaria in mice, which has been proposed to be associated with vascular leakage, activation of the inflammasome and cytokine production (references 37, 38 and PMID: 31858717). This study complements the previous findings on the effect of HRP2 on mammalian cells and reveals another mechanism by which HRP2 might cause toxicity, namely inhibition of general autophagy and an increase in lysosomal Ca concentration. However, whether these in vitro effects would translate in vivo needs to be shown.

      2. All the experiments are done with recombinant HRP2 and BSA as a control. The authors should show if similar effects happen with infected parasites.

      3. HRP2 is released into the circulation, making it accessible to endothelial cells and immune cells. How would it reach the equivalents of these cells in the human body?

      Minor concerns

      1. p62 is an appropriate marker to assess autophagy cargo degradation. If possible, it would be good to support this with LC3 processing as well.

      2. HRP2 might affect the general lysosomal degradation process. The authors can also check whether HRP2 affects degradation of a lysosomal substrate.

      Significance

      This study complements previous findings (references 37, 38 and PMID: 31858717). It identifies a new mechanism by which HRP2 might cause toxicity. However, it is entirely an in vitro study, whereas the previous studies (references 37 and 38) used in vivo models as well.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this paper, Harterink and colleagues investigate the establishment of minus-end-out microtubule polarity in the anterior dendrite of C. elegans PVD neurons. These neurons offer an excellent model system due to their simplicity and well-defined microtubule polarity. The authors investigate the role of two proteins in particular, the well-studied Patronin protein and a newly identified homologue of Ninein (Noca-2). They show that these proteins are redundantly required for correct minus-end-out polarity. Absence of one of these proteins results in a low penetrant phenotype, but absence of both results in a strongly penetrant phenotype. Interestingly, in all cases the neurons display either almost fully retrograde or almost fully anterograde microtubule polarity, and not a mix of retrograde and anterograde microtubules. This is probably linked to the fact that the authors show that endosomes at the distal tip of the dendrite (that are known to mediate retrograde microtubule nucleation events) are either present or absent in these mutants (to differing degrees that reflect the polarity phenotypes of each mutant type). The authors further show that Noca-2, but not Patronin, is required for proper localisation of γ-tubulin to the distal endosomes, suggesting that the proteins influence microtubule polarity in different ways. They provide some evidence that Patronin clusters, while not colocalized to the distal endosomes, are somehow connected. The paper and figures are clear and the work should be reproducible.

      Most conclusions are supported by the data, except for when the authors say: "Taken together, these results show that PTRN-1 (CAMSAP) and NOCA-2 (NINEIN) act in parallel in the PVD neuron during early development to establish minus-end out microtubule organization, and that this organization is important for proper dendritic morphogenesis." But the authors show that removal of Patr results in some neurons having a complete anterograde phenotype in the anterior dendrite, but that no Patr neurons have a severe morphology defect (Fig 2). This would suggest that the severe morphology defects in Patr/Noca-2 double mutants are not simply due to the reversal of polarity in the anterior dendrite. This should be discussed.

      We agree with the comment, and we will discuss this more clearly in a revised manuscript.

      The paper could be strengthened with some biochemistry showing that Noca-2 can associate with γ-TuRCs i.e. do purified fragments of Noca-2 pull out γ-TuRCs from a cell extract (not necessarily a neuron cell extract)? This should be possible within 1 month.

      We thank the reviewer for this suggestion. We will perform biochemistry experiments to probe the association of NOCA-2 with γ-TuRCs. However, instead of doing the IP by overexpression of NOCA-2 and γ-TuRCs in cells, we will use the CRISPR knock-in animals for NOCA-2 and γ-TuRCs, to exclude potential overexpression artifacts.

      Minor comments

      1) "However, in polarized cells such as neurons, most microtubules are organized in a non-centrosomal manner (Nguyen et al., 2011)." Need more up to date reference here, such as a recent review from Jens Lüders.

      We will update the references in the revised version of the manuscript.

      2) "and also in Drosophila Patronin was found important for dendritic microtubule polarity (Feng et al., 2019)." Also Wang et al., 2019 in eLife.

      We will add this reference.

      3) "In the non-ciliated PHC neuron or the ciliated URX neuron we did not observe microtubule organization defects in the ptrn-1 mutant (Supplemental figure 1A-B), which suggests that these neurons do less or do not dependent on PTRN-1." End of sentence needs rephrasing.

      We will rephrase the text.

      Reviewer #1 (Significance (Required)):

      Overall, the paper adds some interesting information to the field but does not make a conceptual advance that would make it attractive to a wide audience. It will, however, be of interest to those studying MT regulation in neurons. It is a shame that the molecular mechanisms that allow Noca-2, and particularly Patronin, to establish microtubule polarity remain to be determined. Figuring out these mechanisms would significantly strengthen the paper.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Harterink comments:

      In this manuscript He et al investigate the role of two key microtubule minus end regulators, Patronin/CAMSAP and NOCA-2/ninein, in establishing dendrite microtubule organization. The authors use a well-characterized branched sensory neuron in C. elegans for their analysis and make significant contributions to our understanding of neuronal microtubule organization. First, they show that C. elegans has not one, but two, ninein-like proteins, NOCA-1 and NOCA-2. Previously only NOCA-1 had been identified, and neuronal functions of ninein have remained elusive, perhaps in part because NOCA-2 had been missed. It had previously been shown that in epithelial cells NOCA-1 acts with gamma-tubulin as one arm of a microtubule minus end organizing pathway, while Patronin acts in parallel on minus ends. The current manuscript very nicely extends this functional map to neurons. The authors show that NOCA-2 helps recruit the gamma-tubulin ring complex (g-TuRC) to Rab11 endosomes that are important for microtubule nucleation at developing dendrite tips. As in epithelial cells, Patronin seems to act in parallel to this pathway and rather than being involved in recruiting the g-TuRC to Rab11 endosomes, is instead important for allowing the Rab11 endosomes to be transported to developing dendrite tips. In total this analysis not only identifies a new player in dendritic microtubule organization (NOCA-2), but also helps synthesize the functions of other players (g-TuRC, Patronin) into a model that makes sense in the broader context of microtubule organization across species and cell types.

      Specific points

      • NOCA-2 is described as a previously unidentified member of the ninein family. In order to evaluate this claim critically, it would be helpful to have a figure showing how similar NOCA-1 and NOCA-2 are to mammalian ninein. It is also critical to include a phylogeny to get a better feel for how NOCA-2 fits into the evolutionary history of the family.

      We agree with the suggestion. We will perform this analysis and add it to a revised version of the manuscript.

      • The nucleation assay used throughout is not very clear as one reads through the manuscript. The color-coding of Fig 1G could be better defined in the legend, and it would be helpful to have more information in the legend or results about what is meant here by microtubule nucleation. Is it simply initiation of new microtubule growth events? If so, how are these distinguished from catastrophe rescue? It would be good to use the same color coding in 2E.

      We thank the reviewer for pointing this out. In Fig 1G and 2E we indeed quantified EBP-2::GFP growth events. Although we later show that the microtubule nucleator gamma-tubulin localized to the distal segment where we observe increased microtubule growth events, we agree that we cannot distinguish microtubule nucleation from regrowth after catastrophe. Therefore, we will describe this more accurately in the text, legend and in the figure.

      • The colocalization of Rab-11 and NOCA-2 seems to be supported only by a single overlapping puncta in a neuron before the anterior dendrite extends (Fig 4). It would be good to flesh out this data set more as it is an important part of the argument that NOCA-2 is involved in recruiting g-TuRC to Rab11 endosomes.

      We thank the reviewer for pointing this out. We will flesh out this data either by adding several examples and/or a movie to show the localization of Rab-11 and NOCA-2 in the revised version of the manuscript.

      • Summary diagrams of results either as conclusions are made in the individual figures or synthesized at the end would help readers to understand the evidence that NOCA-2 and PTRN-1 function at different steps in establishment of MT polarity.

      We agree that a summary diagram could be helpful, and we will consider adding this to the revised version of the manuscript.

      • NOCA-1 is introduced at the beginning of the manuscript and appears to act in parallel to both NOCA-2 and PTRN-1. One is left with many questions, for example, is it also required to recruit g-TuRCs to Rab11 vesicles, or does it have some other role? However, I appreciate that it is beyond the scope of a single manuscript to answer all questions and the authors state a clear rationale for focusing on NOCA-2.

      We agree that the function of NOCA-1 would be interesting to investigate in the future, since we found that it acts redundantly with PTRN-1 and NOCA-2. As NOCA-1 is an essential gene, properly addressing its function brings some technical difficulties and would require generating novel tools. We appreciate the reviewer's understanding that this is beyond the scope of the current manuscript.

      • It would be helpful to clearly state at some point which aspects of localization that are described are seen only in developing dendrites and which are seen in both developing and mature dendrites. For example, is there any similarity in localization of PTRN-1 NOCA-2 and gip2 in mature dendrites to that shown in immature? Is there any sign of continued localization to RAB11 vesicles, or is this only transient? Perhaps a diagram to summarize these findings would also be helpful.

      We thank the reviewer for pointing this out. We will better explain the localization of NOCA-2, PTRN-1, GIP-2 and RAB-11 vesicles in developing neurons vs mature neurons in a revised version of the manuscript.

      • The authors propose several different ideas about how Patronin might contribute to Rab11 vesicle localization. However, I am not sure that they really describe the simplest one: that Patronin helps minus ends grow out from the cell body as shown in Drosophila (and Fig S6 here), and that these minus-end-out microtubules could be the tracks used to transport Rab11 into dendrites. Have I missed some reason why this model is not presented as a good fit for the data?

      We thank the reviewer for pointing this out. Feng et al. indeed showed that EB proteins can track microtubule plus- and minus-end growth in the sensory neurons of Drosophila. Since the slower events co-localize with Patronin, they suggested that these help to populate the minus-end-out microtubules in the Drosophila dendrites (Feng et al., 2019).

      Although we do not have strong data against this model for the PVD dendrites in C. elegans, there are several observations that to us suggest that it is unlikely that minus-end growth is the driving force for the forward movement of the MTOC vesicles. These include: the MTs being mixed in the distal segment, which makes it hard to imagine how one pool would specifically be growing; we do not see EBP-2 localize to the Camsap puncta as was seen in Drosophila; and the Camsap dynamics at the growth cone seem very different (less processive) from the dynamics in the shaft (which indeed could be minus-end growth). We will make this reasoning clearer in the revised manuscript.

      • There are some grammatical errors throughout, as well as a few typos (like PTNR-1 for PTRN-1).

      We will correct the grammar and typos in the revised version of the manuscript.

      Reviewer #2 (Significance (Required)):

      This analysis will help synthesize a more complete and meaningful understanding of how non-centrosomal microtubules are organized. The authors not only identify a new player in non-centrosomal microtubule organization, but also help fit together several existing players into a framework that brings together observations from other model systems and cell types into a more coherent whole.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Harterink comments:

      In this manuscript He et al investigate the role of two key microtubule minus end regulators, Patronin/CAMSAP and NOCA-2/ninein, in establishing dendrite microtubule organization. The authors use a well-characterized branched sensory neuron in C. elegans for their analysis and make significant contributions to our understanding of neuronal microtubule organization. First, they show that C. elegans has not one, but two, ninein-like proteins, NOCA-1 and NOCA-2. Previously only NOCA-1 had been identified, and neuronal functions of ninein have remained elusive, perhaps in part because NOCA-2 had been missed. It had previously been shown that in epithelial cells NOCA-1 acts with gamma-tubulin as one arm of a microtubule minus end organizing pathway, while Patronin acts in parallel on minus ends. The current manuscript very nicely extends this functional map to neurons. The authors show that NOCA-2 helps recruit the gamma-tubulin ring complex (g-TuRC) to Rab11 endosomes that are important for microtubule nucleation at developing dendrite tips. As in epithelial cells, Patronin seems to act in parallel to this pathway and rather than being involved in recruiting the g-TuRC to Rab11 endosomes, is instead important for allowing the Rab11 endosomes to be transported to developing dendrite tips. In total this analysis not only identifies a new player in dendritic microtubule organization (NOCA-2), but also helps synthesize the functions of other players (g-TuRC, Patronin) into a model that makes sense in the broader context of microtubule organization across species and cell types.

      Specific points

      1. NOCA-2 is described as a previously unidentified member of the ninein family. In order to evaluate this claim critically, it would be helpful to have a figure showing how similar NOCA-1 and NOCA-2 are to mammalian ninein. It is also critical to include a phylogeny to get a better feel for how NOCA-2 fits into the evolutionary history of the family.
      2. The nucleation assay used throughout is not very clear as one reads through the manuscript. The color-coding of Fig 1G could be better defined in the legend, and it would be helpful to have more information in the legend or results about what is meant here by microtubule nucleation. Is it simply initiation of new microtubule growth events? If so, how are these distinguished from catastrophe rescue? It would be good to use the same color coding in 2E.
      3. The colocalization of Rab-11 and NOCA-2 seems to be supported only by a single overlapping puncta in a neuron before the anterior dendrite extends (Fig 4). It would be good to flesh out this data set more as it is an important part of the argument that NOCA-2 is involved in recruiting g-TuRC to Rab11 endosomes.
      4. Summary diagrams of results either as conclusions are made in the individual figures or synthesized at the end would help readers to understand the evidence that NOCA-2 and PTRN-1 function at different steps in establishment of MT polarity
      5. NOCA-1 is introduced at the beginning of the manuscript and appears to act in parallel to both NOCA-2 and PTRN-1. One is left with many questions, for example, is it also required to recruit g-TuRCs to Rab11 vesicles, or does it have some other role? However, I appreciate that it is beyond the scope of a single manuscript to answer all questions and the authors state a clear rationale for focusing on NOCA-2.
      6. It would be helpful to clearly state at some point which aspects of localization that are described are seen only in developing dendrites and which are seen in both developing and mature dendrites. For example, is there any similarity in localization of PTRN-1 NOCA-2 and gip2 in mature dendrites to that shown in immature? Is there any sign of continued localization to RAB11 vesicles, or is this only transient? Perhaps a diagram to summarize these findings would also be helpful.
      7. The authors propose several different ideas about how Patronin might contribute to Rab11 vesicle localization. However, I am not sure that they really describe the simplest one: that Patronin helps minus ends grow out from the cell body as shown in Drosophila (and Fig S6 here), and that these minus-end-out microtubules could be the tracks used to transport Rab11 into dendrites. Have I missed some reason why this model is not presented as a good fit for the data?
      8. There are some grammatical errors throughout, as well as a few typos (like PTNR-1 for PTRN-1).

      Significance

      This analysis will help synthesize a more complete and meaningful understanding of how non-centrosomal microtubules are organized. The authors not only identify a new player in non-centrosomal microtubule organization, but also help fit together several existing players into a framework that brings together observations from other model systems and cell types into a more coherent whole.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this paper, Harterink and colleagues investigate the establishment of minus-end-out microtubule polarity in the anterior dendrite of C. elegans PVD neurons. These neurons offer an excellent model system due to their simplicity and well-defined microtubule polarity. The authors investigate the role of two proteins in particular, the well-studied Patronin protein and a newly identified homologue of Ninein (Noca-2). They show that these proteins are redundantly required for correct minus-end-out polarity. Absence of one of these proteins results in a low penetrant phenotype, but absence of both results in a strongly penetrant phenotype. Interestingly, in all cases the neurons display either almost fully retrograde or almost fully anterograde microtubule polarity, and not a mix of retrograde and anterograde microtubules. This is probably linked to the fact that the authors show that endosomes at the distal tip of the dendrite (that are known to mediate retrograde microtubule nucleation events) are either present or absent in these mutants (to differing degrees that reflect the polarity phenotypes of each mutant type). The authors further show that Noca-2, but not Patronin, is required for proper localisation of γ-tubulin to the distal endosomes, suggesting that the proteins influence microtubule polarity in different ways. They provide some evidence that Patronin clusters, while not colocalized to the distal endosomes, are somehow connected. The paper and figures are clear and the work should be reproducible.

      Most conclusions are supported by the data, except for when the authors say: "Taken together, these results show that PTRN-1 (CAMSAP) and NOCA-2 (NINEIN) act in parallel in the PVD neuron during early development to establish minus-end out microtubule organization, and that this organization is important for proper dendritic morphogenesis." But the authors show that removal of Patr results in some neurons having a complete anterograde phenotype in the anterior dendrite, but that no Patr neurons have a severe morphology defect (Fig 2). This would suggest that the severe morphology defects in Patr/Noca-2 double mutants are not simply due to the reversal of polarity in the anterior dendrite. This should be discussed.

      The paper could be strengthened with some biochemistry showing that Noca-2 can associate with γ-TuRCs i.e. do purified fragments of Noca-2 pull out γ-TuRCs from a cell extract (not necessarily a neuron cell extract)? This should be possible within 1 month.

      Minor comments

      1. "However, in polarized cells such as neurons, most microtubules are organized in a non-centrosomal manner (Nguyen et al., 2011)." Need more up to date reference here, such as a recent review from Jens Lüders.
      2. "and also in Drosophila Patronin was found important for dendritic microtubule polarity (Feng et al., 2019)." Also Wang et al., 2019 in eLife.
      3. "In the non-ciliated PHC neuron or the ciliated URX neuron we did not observe microtubule organization defects in the ptrn-1 mutant (Supplemental figure 1A-B), which suggests that these neurons do less or do not dependent on PTRN-1." End of sentence needs rephrasing.

      Significance

      Overall, the paper adds some interesting information to the field but does not make a conceptual advance that would make it attractive to a wide audience. It will, however, be of interest to those studying MT regulation in neurons. It is a shame that the molecular mechanisms that allow Noca-2, and particularly Patronin, to establish microtubule polarity remain to be determined. Figuring out these mechanisms would significantly strengthen the paper.

  2. Jan 2022
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2021-01129

      Corresponding author(s): Koji Kikuchi


      Reviewer #1

      Evidence, reproducibility and clarity (Required):

      In this manuscript, Kikuchi et al describe the characterization of MAP7D2 and MAP7D1, two MAP7 family members in mouse with specific expression patterns. Focusing mostly on MAP7D2, they assess its expression pattern across the body and find that it is mostly expressed in certain neuronal subsets. They then characterize the MT-related properties of MAP7D2 based on previous knowledge of other MAP7 family members. They show that MAP7D2 binds MTs (via the N-terminus), determine the binding affinity, and show that it can stimulate MT polymerization (or stabilization) both in vitro and in vivo. Using a specific antibody, they localize MAP7D2 to centrosomes, midbody and neurites in N1-E115 cells. Functionally, they show that loss of MAP7D1/2 mildly affects microtubule stability as judged by acetyl-tubulin staining, and properties of these cells that rely on cytoskeletal elements such as cell migration and neurite growth. Interestingly, there might be a feedback loop regulating MAP7D1/2 expression, as knockdown of MAP7D1 upregulates MAP7D2.

      Overall, the experiments and conclusions are very solid and convincing, such that I would not ask for further experiments. This is in part because the experiments are largely based on previous characterizations of other MAP7 family members, which are largely confirmed. The presentation of the data is also very clear.

      Significance (Required):

      I see the value of the study in the fact that it provides solid and specific research tools for MAP7D1/2 which could be very useful for the microtubule/neuronal cytoskeleton community.

      Response: We thank the reviewer very much for appreciating the content of our manuscript.

      Referees cross-commenting

      Reviewers 2 and 3 criticize that the evidence for an effect of MAP7D1/2 on MT dynamics is weak. I agree that ac-tub stainings and in vitro experiments are rather indirect. The experiments suggested by reviewer 2 should clarify this (esp. nocodazole should be easy). I also agree that an experiment addressing the potential involvement of kinesin-1, which seems to have been omitted by the authors, would help. A kinesin-binding deficient mutant would add another MAP7D1/2 tool and increase the value for the community.

      Response: As for the reviewer’s suggestions listed above, please refer to our responses to the comments of Reviewer #2.

      Reviewer #2

      Evidence, reproducibility and clarity (Required):

      In this study, the authors investigate 2 members from the MAP7 family Map7D2 and Map7D1. They first address the tissue distribution of Map7D2, by northern blotting using a variety of rat tissues. To complement their analysis, they also raised an antibody to look at the protein distribution. From their studies, they concluded that Map7D2 is abundantly expressed in the brain and testis. The authors went on to perform a series of functional assays. First, they biochemically demonstrated that rat Map7D2 directly binds to MTs by MT co-sedimentation assay. The MT binding domain was mapped to the N-terminal half. They performed MT turbidity assay to demonstrate enhanced MT polymerisation in the presence of Map7D2, suggesting that this Map stabilises MTs. The authors went on to characterise in detail the subcellular localisation of Map7D2 which was predominantly present in the centrosome and partially localised to MTs including within neurites from N1-E115 cells. Kikuchi et al. further revealed the overlap in expression between Map7D2 and another family member, Map7D1. The authors continued these studies by a series of functional studies in N1-E115 cells where they performed single or combined knock-downs of Map7D2 and Map7D1 and studied the levels of acetylated and detyrosinated tubulins and the effect of the knock-downs on migration and neurite extension. The main conclusion from this work was that Map7D2 and Map7D1 facilitate MT stabilization through distinct mechanisms which are important in controlling cell motility and neurite outgrowth. Map7D2 is proposed to stabilise MTs by direct binding whereas Map7D1 does it indirectly by affecting acetylation.

      Major comments:

      The main conclusion from this work, that Map7D2 and Map7D1 facilitate MT stabilization and that this is necessary for correct migration and neurite extension, has not been convincingly demonstrated. In my opinion, a more detailed study of MT properties to demonstrate a role in MT stabilisation would greatly benefit the work, e.g., experiments using MT destabilising agents such as nocodazole. In addition, a series of experiments aiming to study MT dynamics would help to understand the function of these MT regulators. The authors proposed an elevation in microtubule dynamics to explain the increase in migration and neurite extension but no experimental proof was provided.

      Response: According to the reviewer’s suggestion, we plan to assess the role of MT stabilization in greater detail by analyzing the sensitivity to the MT-destabilizing agent, nocodazole.

      To study MT dynamics, methods such as analyzing the velocity and direction of an EB1-GFP comet are commonly used. We have previously analyzed the roles of Map7 and Map7D1 in MT dynamics using HeLa cells stably expressing EB1-GFP (Kikuchi et al., EMBO Rep., 2018). However, no such tools have been developed for analyzing MT dynamics in N1-E115 cells, which were used in this study. In addition, it is difficult to analyze MT dynamics by transient expression of EB1-GFP because of the low plasmid transfection efficiency. Therefore, we instead plan to assess the effect on MT dynamics by measuring the EB1 comet length by immunofluorescence, referring to Fig. 7D in EMBO J. 32:1293–1306, 2013.

      Moreover, considering the possibility that the Map7D2 dynamics are altered when MT stability is changed, e.g., before and after differentiation induction, we analyzed the Map7D2 dynamics at the centrosome by fluorescence recovery after photobleaching (FRAP) using N1-E115 cells stably expressing EGFP-rMap7D2. We found that the dynamics were altered between the proliferative and differentiated states (see the figure below). Compared to the proliferative state, the recovery rate of EGFP-Map7D2 was reduced (lower left panel), and the immobile fraction of Map7D2 was increased in the differentiated state (lower right panel). As these data suggest that the increase in immobile Map7D2 may enhance MT stabilization, we will present them in a new figure in our manuscript along with the results of the above two experiments.
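
      For readers less familiar with the two FRAP readouts mentioned above, the following is a minimal sketch of how a mobile/immobile fraction and a recovery half-time are commonly derived from a normalized recovery curve. It is not the authors' analysis pipeline: the function names and all intensity values are hypothetical and chosen only to illustrate the arithmetic.

```python
def frap_fractions(pre_bleach, post_bleach, plateau):
    """Return (mobile, immobile) fractions from three mean intensities.

    mobile = (plateau - post_bleach) / (pre_bleach - post_bleach)
    All inputs are hypothetical, background-corrected intensities.
    """
    if pre_bleach <= post_bleach:
        raise ValueError("pre-bleach intensity must exceed post-bleach intensity")
    mobile = (plateau - post_bleach) / (pre_bleach - post_bleach)
    mobile = min(max(mobile, 0.0), 1.0)  # clamp against noise
    return mobile, 1.0 - mobile


def half_time(times, intensities, post_bleach, plateau):
    """Time at which the curve first reaches half of its recovery.

    Uses linear interpolation between neighbouring time points and
    returns None if the half-recovery level is never crossed.
    """
    half = post_bleach + 0.5 * (plateau - post_bleach)
    points = list(zip(times, intensities))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 <= half <= f1 and f1 > f0:
            return t0 + (half - f0) * (t1 - t0) / (f1 - f0)
    return None


# Invented example values, NOT data from the manuscript.
proliferative = frap_fractions(pre_bleach=1.0, post_bleach=0.2, plateau=0.8)
differentiated = frap_fractions(pre_bleach=1.0, post_bleach=0.2, plateau=0.6)
print("proliferative (mobile, immobile):", proliferative)
print("differentiated (mobile, immobile):", differentiated)
```

      In this toy example a lower plateau gives a larger immobile fraction, which is the kind of shift the response describes for the differentiated state; a slower recovery rate would show up separately as a longer half-time.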

      It has been previously demonstrated that loss of MAP7D2 leads to a decrease in cargo entry into axons, resulting in defects in axon development and neuronal migration. The C-terminus is necessary for this function as it mediates interaction with Kinesin-1 (Pan et al., 2019). Such mechanisms could also explain the defects in migration and neurite growth that the authors observed. This possibility has not been considered; instead, the subtle changes in total α-tubulin led the authors to suggest MT stabilisation as a key function without proof of causation. Could the authors provide some further experimental evidence to demonstrate that stability is the main contributor to the phenotypes observed, e.g., by rescuing migration and neurite phenotypes with a variant of MAP7D2 which cannot bind Kinesin-1?

      Response: The reviewer states “Such mechanisms could also explain the defects in migration and neurite growth that the authors observed”; however, our results showed that loss of Map7D2 elevated the rates of both cell motility and neurite outgrowth (original Fig. 5). In contrast, it has been reported in several papers that when Kinesin-1 function is impaired, both cell motility and neurite outgrowth are reduced (Curr. Biol., 23: 1018–1023, 2013; Mol. Cell. Biol., 39: e00109–19, 2019; etc.). Therefore, it is likely that the phenotypes we observed are independent of the functions associated with Kinesin-1 in N1-E115 cells. It is indeed possible that the experiment suggested by the reviewer may reveal relationships between Map7D2 and Kinesin-1 in terms of cell motility and neurite outgrowth; however, it is difficult to conduct such an experiment because transient expression of Map7D2 induces MT bundling, as shown in original Fig. 2F. Based on the above, we plan to add a discussion of the relationship between Map7D2 and Kinesin-1.

      A key conclusion proposed by the authors is that Map7D2 and Map7D1 facilitate MT stabilization through distinct mechanisms. Such different roles in MT stabilisation are important in controlling cell motility and neurite outgrowth. In my opinion, their data does not fully support this statement and the findings using MT readouts do not match the defects in migration and neurite growth. Loss of Map7D2 leads to a very subtle phenotype on α-tubulin, while Map7D1 decreases both α-tubulin and acetylated tubulin, but Map7D1 seems to have a milder or similar effect on migration and neurite growth than Map7D2. Furthermore, it would be expected that the combined loss of function would lead to a stronger phenotype in cell migration when compared to the single loss of functions due to their distinct roles on MT stability, however, this seems not to be the case.

      Response: The fact that no stronger phenotype was observed may be because, besides Map7D2 and Map7D1, other molecules are involved in MT stabilization. Another possible explanation is that the increases in both cell motility and neurite outgrowth caused by decreased MT stabilization are offset by Kinesin-1 dysfunction. We plan to add a discussion of the above two possibilities.

      Minor comments:

      1) In the first result section, the author refers to Fig. S3 to suggest the expression of MAP7D2 in the cerebral cortex, however, there are no transcripts in the cerebral cortex according to the figure. Similarly, the immunofluorescence analysis done by the authors shows marginal expression of MAP7D2 in the cerebral cortex.

      Response: According to the reviewer’s comment, we have changed the order of the data shown in Fig. 1C, top panels. The data from the olfactory bulb, cerebellum, and hippocampus, in which Map7D2 expression was detected in the database, were arranged in the top three rows, and the data from the cerebral cortex, in which Map7D2 expression was not detected in the database, were moved to the bottom row as a negative control. In addition, we have revised the relevant part of the Results section as follows: “Based on RNA-seq CAGE, RNA-Seq, and SILAC database analysis (Expression Atlas, https://www.ebi.ac.uk/gxa/home/), Map7D2 expression was detected in the cerebellum, hippocampus, and olfactory bulb, and not in the cerebral cortex (Fig. S3). We further confirmed Map7D2 expression in the above four brain tissue regions of postnatal day 0 mice by immunofluorescence. Among these regions, Map7D2 was the most highly expressed in the Map2-negative area of the olfactory bulb, i.e., the glomerular layer (Fig. 1C). Weak signals were detected in the cerebellum, and marginal signals were observed in the hippocampus and cerebral cortex (Fig. 1C).” (page 5, lines 4–11)

      2) The authors use γ-Tubulin as a housekeeping gene in Fig. 3D, since Map7D2 is enriched in centrosomes this may not be the most appropriate choice.

      Response: γ-Tubulin is abundant in both the cytosol and the nuclear compartments of cells (Sig. Transduct. Target Ther. 3: 24, 2018). As it has been used for similar purposes in several other studies (Cancer Res., 61: 7713–7718, 2001; J. Biol. Chem., 291: 23112–23125, 2016; etc.), we considered it acceptable for use as a loading control for immunoblotting.

      3) According to the authors, knockdown of Map7D2 leads to a decrease in the intensity of α-tubulin and Map7D1 (Fig. 4C and D). This data doesn't agree with the previous statement made by the authors where they show that Map7D2 knockdown or knockout did not affect Map7D1 expression by Western Blot Analysis (Fig. S2C and S5B)

      Response: The immunoblotting results indicate that the total amount of Map7D1 in the cells is not affected by loss of Map7D2. In contrast, the immunofluorescence results indicate that the amount (distribution) of Map7D1 localized around the centrosome is decreased by loss of Map7D2, presumably due to a reduction in the number of MT structures that can serve as scaffolds for Map7D1. We plan to add this interpretation in the Results section.

      4) Line 6 page 7 "Endogenous Map7D2 expression is suppressed in N1-E115 cells stably expressing EGFP-rMap7D2 and was restored by specific knock-down of EGFP-rMap7D2 using gfp siRNA (Fig. 3D)". No quantifications and stats are shown. Also, endogenous Map7D2 after knock-down of EGFP-rMap7D2 is not comparable to the control.

      Response: According to the reviewer’s suggestion, we have quantified the amount of endogenous Map7D2 or EGFP-rMap7D2, normalized it to the amount of γ-tubulin, and calculated values relative to endogenous Map7D2 in the parental control. The amount of endogenous Map7D2 was decreased to 53% in N1-E115 cells stably expressing EGFP-rMap7D2, suggesting that EGFP-rMap7D2 expression suppressed endogenous Map7D2 expression. In this cell line, the total amount of Map7D2 (EGFP-rMap7D2 + endogenous Map7D2) was increased; however, when EGFP-rMap7D2 was depleted using sigfp in this cell line, endogenous Map7D2 was expressed to the same level as EGFP-rMap7D2 before knock-down. Together with the finding that Map7d1 knock-down increased the amount of Map7D2, these findings indicate that the amount of Map7D2 in the cells is regulated in response to the amount of Map7D1 and exogenous Map7D2. We have added this interpretation in the Results section. (page 7, lines 8–15)

      In addition, we have changed the legend of the original Fig. 3D to clarify the quantification method, as follows: “(D) Generation of N1-E115 cells stably expressing EGFP-rMap7D2. To check the expression level of EGFP-rMap7D2, lysates derived from the indicated cells were probed with anti-GFP (top panel) and anti-Map7D2 (middle panel) antibodies. The blot was reprobed for γ-tubulin as a loading control (bottom panel). The amount of endogenous Map7D2 or EGFP-rMap7D2 was normalized to the amount of γ-tubulin, and the value relative to endogenous Map7D2 in the parental control was calculated.” (page 22, lines 18–20)
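
      The quantification described in this legend (normalize each band to the γ-tubulin loading control, then express it relative to endogenous Map7D2 in the parental control) can be sketched as below. This is only an illustration under stated assumptions: the function name `relative_expression`, the lane labels, and all intensity values are hypothetical placeholders, not the measured densitometry data.

```python
def relative_expression(band, loading, ref_band, ref_loading):
    """Loading-normalized band intensity, expressed relative to a reference lane."""
    if min(band, loading, ref_band, ref_loading) <= 0:
        raise ValueError("all intensities must be positive")
    # Step 1: normalize each band to its own loading control (gamma-tubulin).
    normalized = band / loading
    ref_normalized = ref_band / ref_loading
    # Step 2: express the result relative to the reference (parental control) lane.
    return normalized / ref_normalized


# Hypothetical densitometry readings in arbitrary units (illustration only).
lanes = {
    "parental": {"endogenous_map7d2": 1000.0, "gamma_tubulin": 500.0},
    "stable_line": {"endogenous_map7d2": 600.0, "gamma_tubulin": 600.0},
}

ref = lanes["parental"]
relative = {
    name: relative_expression(
        lane["endogenous_map7d2"],
        lane["gamma_tubulin"],
        ref["endogenous_map7d2"],
        ref["gamma_tubulin"],
    )
    for name, lane in lanes.items()
}
```

      By construction the parental control evaluates to 1, so every other lane reads directly as a fraction of the control level.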

      5) Line 8 page 7 "These results suggest that the expression of Map7D2 was influenced by changes in that of Map7D1" This statement seems in the wrong place, after the Map7D2 and EGFP-rMap7D2 experiment. Instead for clarity, it would be better placed after line 5 where the authors explain the effect of Map7D1 knock-down on the levels of Map7D2.

      Response: According to the reviewer’s suggestion, we have rephrased the relevant sentence as “Interestingly, Map7d1 knock-down upregulated Map7D2 expression, as confirmed with three different siRNAs (Fig. S2C), suggesting that Map7D2 expression is affected by changes in Map7D1 expression, not by off-target effects of a particular siRNA.” (page 7, lines 7, 8)

      6) Line 8 page 8 "Although the physiological role of the C-terminal region of Map7D2 is currently unknown..." This statement seems not adequate as there are several studies reporting the role of the C-terminal region of Map7D2 in Kinesin1- mediated transport. The authors mention such studies in the discussion.

      Response: According to the reviewer’s suggestion, we plan to add a discussion of the relationship between Map7D2 and kinesin-1.

      7) Line 6 page 9 " Further, the knock-down of either resulted in a comparable reduction of MT intensity (Fig. 4C and D) ..." This is not visible and/or justified by the images provided and would benefit from some sort of quantification at other regions such as neurites.

      Response: Considering the cell motility, quantification of α-tubulin/Ace-tubulin/Map7D1/Map7D2 intensities in neurites is not appropriate. Instead, we have added arrowheads indicating α-tubulin/Ace-tubulin/Map7D1/Map7D2 in Fig. 4C, for better understanding.

      8) In Fig. 2B, a band corresponding to his6-rMAP7D2 of molecular weight >97 kDa co-sedimented with the microtubules. However, the cloned rMAP7D2 had a molecular weight of 84.82 kDa and the addition of 6XHis-Tag would add another 2-3 kDa, therefore, the final protein band observed should be less than 90 kDa. It would be beneficial if the authors could specify the molecular weight of the purified protein after the addition of the V5-his tag and/or if there was addition of amino acids due to cloning strategy.

      Response: In Fig. 2B, we used full-length GST-tagged rMap7D2, as in Fig. 2D and E; therefore, we have corrected His6-rMap7D2 to GST-rMap7D2. We apologize for the mistake.

      9) In Fig. 2C, there is misalignment of the western blot with the panel or text underneath.

      Response: We thank the reviewer for pointing this out; we have corrected the misalignment of the CBB staining in Fig. 2C.

      10) In Fig. 3C the inset from the first panel seems to correspond to a different focal plane than the main image.

      Response: We have revised the relevant part of the figure legend as follows: “In C, images of differentiated cells were captured by z-sectioning, because the focal planes of the centrosome and neurites are different. Each inset shows an enlarged image of the region indicated with a white box at each focal plane. Arrowheads indicate the centrosomal localization of Map7D2.”

      11) In Fig. 4A, the cell type is not specified and is referred as "indicated cells", also the material and methods section seems to omit the specific cells used.

      Response: We have added “in N1-E115 cells treated with each siRNA” in the legend of Fig. 4A.

      12) Fig. S6 is not mentioned in the results.

      Response: We apologize for having referred to Fig. S6 only in the Discussion section in the original manuscript. We plan to describe the findings shown in the original Fig. S6 in the Results section and renumber the figures accordingly.

      Significance (Required):

      MTs play essential roles in practically every cellular process. Their precise regulation is therefore crucial for cellular function and viability. MAPs are specialised proteins that interact with MTs and regulate their behaviour in different manners. Understanding their precise function in different cellular contexts is of utmost importance for many biological and biomedical fields.

      MAPs are well known for their ability to promote MT polymerization, bundling and stabilisation in vitro (Bodakuntla et al., 2019). Several members of the Map7 family have been shown to regulate microtubule stability. For instance, MAP7 can prevent nocodazole-induced MT depolymerization and maintain stable microtubules at branch points in DRG neurons (Tymanskyj & Ma, 2019). Ensconsin, the Drosophila Map, is required for MT growth in mitotic neuroblasts by regulating the mean rate of MT polymerization (Gallaud et al., 2014). However, this family of Maps seems to have diverse functions encompassing a variety of mechanisms, as exemplified by a series of studies demonstrating the involvement of MAP7 family proteins in the recruitment and activation of kinesin1 (Hooikaas et al., 2019; Pan et al., 2019) and in microtubule remodelling and Wnt5a signalling (Kikuchi et al., 2018). Further understanding of this family of Maps and how its members differ in their function is important and will help to advance the field.

      Response: We appreciate the reviewer’s comments. We believe that our revision plan will greatly improve the quality of our manuscript.

      Reviewer #3

      Evidence, reproducibility and clarity (Required):

      Summary:

      Microtubule Associated Proteins (MAPs) are important regulators of microtubule dynamics, microtubule organization and vesicular transport by modulating motor protein recruitment and processivity. In the current manuscript the authors have characterized 2 members of the MAP7 protein family, MAP7D1 and MAP7D2. The authors characterized MAP7D2 expression pattern in the brain and its microtubule binding properties in vitro and in cells. In cells both proteins localize to the centrosome and to microtubules and upon depletion centrosome localized microtubules seem reduced, and cell migration and neurite outgrowth are increased. Surprisingly, they find that microtubule acetylation (a common marker for stable microtubules) is reduced upon MAP7D1 depletion but not MAP7D2 depletion. Based on this finding the authors conclude that these proteins have a distinct mechanism in stabilizing MTs to affect cell migration and neurite outgrowth; MAP7D2 stabilizes by binding to MTs, whereas MAP7D1 stabilizes MTs by acetylation.

      Main comments:

      - Both MAP7 proteins show strong localization to the centrosome and to a lesser degree to MTs. Knockdown of either protein leads to reduced MTs around the centrosome, which led the authors to conclude the MAP7s are stabilizing the MTs. However, the effect could just as well be an indirect effect due to a function of these MAPs at the centrosome. To address this, the authors could e.g. quantify microtubule properties in postmitotic cells. In addition, antibody specificity should be tested using knockdown or knockout cells, as this centrosome localization was not observed in HeLa cells (Hooikaas, 2019; Kikuchi, 2018). Maybe this localization is specific to rat MAP7s or to the cell line used.

      Response: We think that this comment partly overlaps with the comments raised by Reviewer #2. We plan to assess the role of MT stabilization in greater detail by analyzing the sensitivity to the MT-destabilizing agent, nocodazole, and the effect on MT dynamics by measuring the EB1 comet length by immunofluorescence.

      Regarding the reviewer’s concern about antibody specificity, we had carefully confirmed the antibody specificity, as shown in Fig. S2 of the original manuscript. Subsequently, Map7D2 localization was confirmed in N1-E115 cells stably expressing EGFP-rMap7D2, as shown in Fig. 3D, E of the original manuscript. In addition, we are currently conducting analyses using Map7d1-egfp knock-in mice, which confirmed that Map7D1 localizes around the centrosome in cortical neurons, as shown below (we would like to disclose these unpublished data to the reviewers only). Therefore, it is thought that the localization pattern of Map7D2 and Map7D1 differs depending on the cell type and cell line. We plan to add this interpretation to the Results section.

      - Centrosome nucleated microtubules are typically highly dynamic and little modified. Therefore is the Ac-tub staining at the centrosome really MTs? I cannot identify MTs in the fluorescent images in 4C. Maybe authors could consider ac-tub/alpha-tub ratio in non centrosomal region (e.g. neurites). Moreover, as both Acetylation and detyrosination are associated with long-lived/stable MTs, it is surprising that only acetylated tubulin goes down on WB. Does this suggest that long-lived MTs are still present to normal level? If so, can one still argue that the loss of acetylation is the cause of the lower MT levels? This should at least be discussed.

      Response: As for the reviewer’s statement “Centrosome nucleated microtubules are typically highly dynamic and little modified. Therefore is the Ac-tub staining at the centrosome really MTs?”, it has been previously reported that tubulin acetylation is observed around the centrosome in some cell lines (J. Neurosci., 30: 7215–7226, 2010; PLoS One, 13: e0190717, 2018; etc.). N1-E115 is one of the cell lines in which tubulin acetylation is observed around the centrosome.

      It is not surprising that “only acetylated tubulin goes down on WB,” as it has been previously reported that acetylated and detyrosinated tubulins are sometimes not synchronous (J. Neurosci., 23: 10662–10671, 2003; J. Neurosci., 30: 7215–7226, 2010; J. Cell Sci., 132: jcs225805, 2019., etc.). For instance, Montagnac et al. (Nature, 502: 567–570, 2013) showed that defects in the α-tubulin acetyltransferase αTAT1-clathrin-dependent endocytosis axis reduce only tubulin acetylation, resulting in a shift from directional to random cell migration. Although the details of the molecular function of Map7D1 are beyond the main purpose of this study, we plan to add a discussion of the reduced tubulin acetylation by Map7d1 knock-down based on the above.

      - MAP7D1 and MAP7D2 depletion leads to subtle defect in cell migration and neurite outgrowth, which the author suggest is caused by reduced MT stability. However, MAP7 proteins have well characterized functions in kinesin-1 transport, and thus the phenotypes may well be caused by defects in kinesin-1 transport. Ideally the authors would do rescue experiments with FL or just the MT binding N-termini to separate these functions. Moreover this is needed to substantiate the claim of the authors that MAP7D1 effect on MT stability is not mediated by direct binding.

      Response: As this comment largely overlaps with the comments raised by Reviewer #2, please refer to our responses to the comments of Reviewer #2.

      - The authors do not refer well to published work. Several papers have published very similar work (especially to Fig1+2) and it would help the reader much if this would be discussed/compared along the results section and not briefly mention these in the results section. In addition, authors overstate the novelty of their results e.g. page 3: these proteins are not "functionally uncharacterized" nor are their expression pattern and biochemical properties analyzed for the first time in this manuscript; page 8 "Although the physiological role of the C-terminal region of Map7D2 is currently unknown, ..." There is a clear function for the C-terminus for the recruitment/activation of kinesin-1.

      Response: According to the reviewer’s suggestion, we plan to add a comparison with data on the Map7 family members presented in previous papers in the Results section and rephrase the relevant part regarding the physiological role of the C-terminal region of Map7D2.

      Minor comments

      - P6 Map7D3 also binds with its N-terminus to MTs, like other MAP7s (Yadav et al)

      Response: According to the reviewer’s comment, we have revised this as “Map7D3 binds through a conserved region on not only the N-terminal side, but also the C-terminal side (Sun, 2011; Yadav et al., 2014).” (page 6, lines 4, 5)

      - P7 "As Map7D2 has the potential to functionally compensate for Map7D1 loss" where is this based on?

      Response: For clarity, we have rephrased this as “As Map7D2 expression was upregulated upon suppression of Map7D1 expression, Map7D2 has the potential to functionally compensate for Map7D1 loss.” (page 7, lines 17, 18)

      - Fig2F quality of black-white images is low potentially due to conversion issues

      Response: We thank the reviewer for pointing out these conversion issues, and we have made the necessary corrections.

      Significance (Required):

      At this stage the conceptual advance is limited. Part of the findings are not novel. The finding that MAP7s depletion have a different effect on MTs acetylation may be interesting to cytoskeleton researchers, although the potential mechanism has not been addressed experimentally or textually.

      However, their conclusion that this leads to reduced MTs and then to cellular migration and neurite formation defects is not sufficiently supported by experimental evidence.

      Response: We appreciate the reviewer’s comments. We believe that our revision plan will greatly improve the quality of our manuscript.

      Referees cross-commenting

      I completely agree with reviewer #2: At this stage the paper's conclusions are not sufficiently supported by the data. Important will be to further characterize the effect on the MTs (do they really have a different effect) and to look at the possible involvement of the motor recruitment. Maybe that a 3 to 6 months revision time would have been more accurate.

      Response: Please refer to our responses to the comments of Reviewer #2.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Microtubule Associated Proteins (MAPs) are important regulators of microtubule dynamics, microtubule organization and vesicular transport by modulating motor protein recruitment and processivity. In the current manuscript the authors have characterized 2 members of the MAP7 protein family, MAP7D1 and MAP7D2. The authors characterized MAP7D2 expression pattern in the brain and its microtubule binding properties in vitro and in cells. In cells both proteins localize to the centrosome and to microtubules and upon depletion centrosome localized microtubules seem reduced, and cell migration and neurite outgrowth are increased. Surprisingly, they find that microtubule acetylation (a common marker for stable microtubules) is reduced upon MAP7D1 depletion but not MAP7D2 depletion. Based on this finding the authors conclude that these proteins have a distinct mechanism in stabilizing MTs to affect cell migration and neurite outgrowth; MAP7D2 stabilizes by binding to MTs, whereas MAP7D1 stabilizes MTs by acetylation.

      Main comments:

      • Both MAP7 proteins show strong localization to the centrosome and to a lesser degree to MTs. Knockdown of either protein leads to reduced MTs around the centrosome, which led the authors to conclude the MAP7s are stabilizing the MTs. However, the effect could just as well be an indirect effect due to a function of these MAPs at the centrosome. To address this, the authors could e.g. quantify microtubule properties in postmitotic cells. In addition, antibody specificity should be tested using knockdown or knockout cells, as this centrosome localization was not observed in HeLa cells (Hooikaas, 2019; Kikuchi, 2018). Maybe this localization is specific to rat MAP7s or to the cell line used.
      • Centrosome nucleated microtubules are typically highly dynamic and little modified. Therefore is the Ac-tub staining at the centrosome really MTs? I cannot identify MTs in the fluorescent images in 4C. Maybe authors could consider ac-tub/alpha-tub ratio in non centrosomal region (e.g. neurites). Moreover, as both Acetylation and detyrosination are associated with long-lived/stable MTs, it is surprising that only acetylated tubulin goes down on WB. Does this suggest that long-lived MTs are still present to normal level? If so, can one still argue that the loss of acetylation is the cause of the lower MT levels? This should at least be discussed.
      • MAP7D1 and MAP7D2 depletion leads to subtle defect in cell migration and neurite outgrowth, which the author suggest is caused by reduced MT stability. However, MAP7 proteins have well characterized functions in kinesin-1 transport, and thus the phenotypes may well be caused by defects in kinesin-1 transport. Ideally the authors would do rescue experiments with FL or just the MT binding N-termini to separate these functions. Moreover this is needed to substantiate the claim of the authors that MAP7D1 effect on MT stability is not mediated by direct binding.
      • The authors do not refer well to published work. Several papers have published very similar work (especially to Fig1+2) and it would help the reader much if this would be discussed/compared along the results section and not briefly mention these in the results section. In addition, authors overstate the novelty of their results e.g. page 3: these proteins are not "functionally uncharacterized" nor are their expression pattern and biochemical properties analyzed for the first time in this manuscript; page 8 "Although the physiological role of the C-terminal region of Map7D2 is currently unknown, ..." There is a clear function for the C-terminus for the recruitment/activation of kinesin-1.

      Minor comments:

      • P6 Map7D3 also binds with its N-terminus to MTs, like other MAP7s (Yadav et al)
      • P7 "As Map7D2 has the potential to functionally compensate for Map7D1 loss" where is this based on?
      • Fig2F quality of black-white images is low potentially due to conversion issues

      Significance

      At this stage the conceptual advance is limited. Part of the findings are not novel. The finding that MAP7s depletion have a different effect on MTs acetylation may be interesting to cytoskeleton researchers, although the potential mechanism has not been addressed experimentally or textually.

      However, their conclusion that this leads to reduced MTs and then to cellular migration and neurite formation defects is not sufficiently supported by experimental evidence.

      Referees cross-commenting

      I completely agree with reviewer #2: At this stage the paper's conclusions are not sufficiently supported by the data. Important will be to further characterize the effect on the MTs (do they really have a different effect) and to look at the possible involvement of the motor recruitment. Maybe that a 3 to 6 months revision time would have been more accurate.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this study, the authors investigate 2 members from the MAP7 family, Map7D2 and Map7D1. They first address the tissue distribution of Map7D2, by northern blotting using a variety of rat tissues. To complement their analysis, they also raised an antibody to look at the protein distribution. From their studies, they concluded that Map7D2 is abundantly expressed in the brain and testis. The authors went on to perform a series of functional assays. First, they biochemically demonstrated that rat Map7D2 directly binds to MTs by MT co-sedimentation assay. The MT binding domain was mapped to the N-terminal half. They performed MT turbidity assay to demonstrate enhanced MT polymerisation in the presence of Map7D2, suggesting that this Map stabilises MTs. The authors went on to characterise in detail the subcellular localisation of Map7D2 which was predominantly present in the centrosome and partially localised to MTs including within neurites from N1-E115 cells. Kikuchi et al. further revealed the overlap in expression between Map7D2 and another family member, Map7D1. The authors continued these studies by a series of functional studies in N1-E115 cells where they performed single or combined knock-downs of Map7D2 and Map7D1 and studied the levels of acetylated and detyrosinated tubulins and the effect of the knock-downs on migration and neurite extension. The main conclusion from this work was that Map7D2 and Map7D1 facilitate MT stabilization through distinct mechanisms which are important in controlling cell motility and neurite outgrowth. Map7D2 is proposed to stabilise MTs by direct binding whereas Map7D1 does it indirectly by affecting acetylation.

      Major comments:

      The main conclusion from this work that Map7D2 and Map7D1 facilitate MT stabilization and that this is necessary for correct migration and neurite extension has not been convincingly demonstrated. In my opinion, a more detailed study of MT properties to demonstrate a role in MT stabilisation would greatly benefit the work, eg. experiments using MT destabilising agents such as nocodazole. In addition, a series of experiments aiming to study MT dynamics would help to understand the function of these MT regulators. The authors proposed an elevation in microtubule dynamics to explain the increase in migration and neurite extension but no experimental proof was provided.

      It has been previously demonstrated that loss of MAP7D2 leads to a decrease in axonal cargo entry to axons resulting in defects in axon development and neuronal migration. The C-terminus is necessary for this function as it mediates interaction with Kinesin-1 (Pan et al., 2019). Such mechanisms could also explain the defects in migration and neurite growth that the authors observed. This possibility has not been considered but instead, the subtle changes in total α-tubulin led to suggest MT stabilisation as a key function without proof of causation. Could the authors provide some further experimental evidence to demonstrate that stability is the main contributor to the phenotypes observed? Eg. by rescuing migration and neurite phenotypes with a variant of MAP7D2 which cannot bind kinesin1.

      A key conclusion proposed by the authors is that Map7D2 and Map7D1 facilitate MT stabilization through distinct mechanisms. Such different roles in MT stabilisation are important in controlling cell motility and neurite outgrowth. In my opinion, their data does not fully support this statement and the findings using MT readouts do not match the defects in migration and neurite growth. Loss of Map7D2 leads to a very subtle phenotype on α-tubulin, while Map7D1 decreases both α-tubulin and acetylated tubulin, but Map7D1 seems to have a milder or similar effect on migration and neurite growth than Map7D2. Furthermore, it would be expected that the combined loss of function would lead to a stronger phenotype in cell migration when compared to the single loss of functions due to their distinct roles on MT stability, however, this seems not to be the case.

      Minor comments:

      1. In the first result section, the author refers to Fig. S3 to suggest the expression of MAP7D2 in the cerebral cortex, however, there are no transcripts in the cerebral cortex according to the figure. Similarly, the immunofluorescence analysis done by the authors shows marginal expression of MAP7D2 in the cerebral cortex.
      2. The authors use -Tubulin as a housekeeping gene in Fig. 3D, since Map7D2 is enriched in centrosomes this may not be the most appropriate choice.
      3. According to the authors, knockdown of Map7D2 leads to a decrease in the intensity of -tubulin and Map7D1 (Fig. 4C and D). This data doesn't agree with the previous statement made by the authors where they show that Map7D2 knockdown or knockout did not affect Map7D1 expression by Western Blot Analysis (Fig. S2C and S5B)
      4. Line 6 page 7 "Endogenous Map7D2 expression is suppressed in N1-E115 cells stably expressing EGFP-rMap7D2 and was restored by specific knock-down of EGFP-rMap7D2 using gfp siRNA (Fig. 3D)". No quantifications and stats are shown. Also, endogenous Map7D2 after knock-down of EGFP-rMap7D2 is not comparable to the control.
      5. Line 8 page 7 "These results suggest that the expression of Map7D2 was influenced by changes in that of Map7D1" This statement seems in the wrong place, after the Map7D2 and EGFP-rMap7D2 experiment. Instead for clarity, it would be better placed after line 5 where the authors explain the effect of Map7D1 knock-down on the levels of Map7D2.
      6. Line 8 page 8 "Although the physiological role of the C-terminal region of Map7D2 is currently unknown..." This statement seems not adequate as there are several studies reporting the role of the C-terminal region of Map7D2 in Kinesin1- mediated transport. The authors mention such studies in the discussion.
      7. Line 6 page 9 " Further, the knock-down of either resulted in a comparable reduction of MT intensity (Fig. 4C and D) ..." This is not visible and/or justified by the images provided and would benefit from some sort of quantification at other regions such as neurites.
      8. In Fig. 2B, a band corresponding to his6-rMAP7D2 of molecular weight >97 kDa co-sedimented with the microtubules. However, the cloned rMAP7D2 had a molecular weight of 84.82 kDa and the addition of 6XHis-Tag would add another 2-3 kDa, therefore, the final protein band observed should be less than 90 kDa. It would be beneficial if the authors could specify the molecular weight of the purified protein after the addition of the V5-his tag and/or if there was addition of amino acids due to cloning strategy.
      9. In Fig. 2C, there is misalignment of the western blot with the panel or text underneath.
      10. In Fig. 3C the inset from the first panel seems to correspond to a different focal plane than the main image.
      11. In Fig. 4A, the cell type is not specified and is referred to as "indicated cells"; the materials and methods section also seems to omit the specific cells used.
      12. Fig. S6 is not mentioned in the results.

      Significance

      MTs play essential roles in practically every cellular process. Their precise regulation is therefore crucial for cellular function and viability. MAPs are specialised proteins that interact with MTs and regulate their behaviour in different manners. Understanding their precise function in different cellular contexts is of utmost importance for many biological and biomedical fields.

      MAPs are well known for their ability to promote MT polymerization, bundling and stabilisation in vitro (Bodakuntla et al., 2019). Several members of the Map7 family have been shown to regulate microtubule stability. For instance, MAP7 can prevent nocodazole-induced MT depolymerization and maintain stable microtubules at branch points in DRG neurons (Tymanskyj & Ma, 2019). Ensconsin, the Drosophila Map, is required for MT growth in mitotic neuroblasts by regulating the mean rate of MT polymerization (Gallaud et al., 2014). However, this family of Maps seems to have diverse functions encompassing a variety of mechanisms, as exemplified by a series of studies demonstrating the involvement of MAP7 family proteins in the recruitment and activation of kinesin1 (Hooikaas et al., 2019; Pan et al., 2019) and in microtubule remodelling and Wnt5a signalling (Kikuchi et al., 2018). Further understanding of this family of Maps and how its members differ in their function is important and will help to advance the field.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Kikuchi et al. describe the characterization of MAP7D2 and MAP7D1, two MAP7 family members in mouse with specific expression patterns. Focusing mostly on MAP7D2, they assess its expression pattern across the body and find that it is mostly expressed in certain neuronal subsets. They then characterize the MT-related properties of MAP7D2 based on previous knowledge of other MAP7 family members. They show that MAP7D2 binds MTs (via the N-terminus), determine the binding affinity, and show that it can stimulate MT polymerization (or stabilization) both in vitro and in vivo. Using a specific antibody, they localize MAP7D2 to centrosomes, midbody and neurites in N1-E115 cells. Functionally, they show that loss of MAP7D1/2 mildly affects microtubule stability, as judged by acetyl-tubulin staining, and properties of these cells that rely on cytoskeletal elements, such as cell migration and neurite growth. Interestingly, there might be a feedback loop regulating MAP7D1/2 expression, as knockdown of MAP7D1 upregulates MAP7D2.

      Overall, the experiments and conclusions are very solid and convincing, such that I would not ask for further experiments. This is in part because the experiments are largely based on previous characterizations of other MAP7 family members, which are largely confirmed. The presentation of the data is also very clear.

      Significance

      I see the value of the study in the fact that it provides solid and specific research tools for MAP7D1/2, which could be very useful for the microtubule/neuronal cytoskeleton community.

      Referees cross-commenting

      Reviewers 2 and 3 criticize that the evidence for an effect of MAP7D1/2 on MT dynamics is weak. I would agree in that ac-tub stainings and in vitro experiments are rather indirect. The experiments suggested by reviewer 2 should clarify this (esp. nocodazole should be easy). I also agree that an experiment addressing the potential involvement of kinesin-1 would help, the involvement of which seems to have been omitted by the authors. A kinesin-binding deficient mutant would add another MAP7D1/2 tool and increase the value for the community.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1:

      This is the first such piece of data to come from human infective parasites in the field. Technically this is a feat, because of the small number of parasites that are present per mL of human blood at any given time during infection with T gambiense. Nevertheless they manage to identify up to 14 unique VSGs per patient sample. And this raises the first theoretical question: can they extrapolate to the average diversity load per human?

      This is an intriguing question that we would like to eventually answer, but we do not believe we can make this estimate from the data we currently have. We know our sampling is insufficient based on the correlation between parasitemia and diversity, and we do not have sufficiently precise estimates of parasitemia that could be used to extrapolate total diversity in the blood. Moreover, our analysis was only performed on RNA extracted from whole blood samples. Recent studies indicate that significant populations of parasites reside in extravascular tissue spaces, and our analysis did not address antigenic diversity in these spaces. We believe it is unlikely that the blood alone reflects the full diversity of VSG expression in an infection, and an estimate based only on blood-resident parasites (if possible) could be misleading.

      This is important because the timing of sample collection (i.e. that it occurred within a period of weeks) suggests that an initial group of infected tsetse infected these patients (rather than a small number of interactions between a bloodmeal and a new infection - generally in itself on the order of 1 month or so). If parasitemia is low and diversity limited, this would explain both why CATT works as well as it does (because really it shouldn't at all!) and perhaps even the chronicity of infection (in the sense that the organism is unlikely to "run out" even of complete VSGs, never mind mosaics). The paper would benefit from a direct discussion on this.

      Indeed, the timing of sample collection could inform our interpretation of the data. However, sample collection occurred over a period of six months. More importantly, patients were in both early and late stage disease at the time of sample collection, so we cannot estimate how long any individual patient had been infected. We have added text (line 180) to highlight this fact. Because some patients were infected at least 6 months apart (if not much more than that), it is unlikely that patients were infected around the same time by a small group of infected tsetse flies. Reviewer #1 introduces an interesting point about the efficacy of the CATT diagnostic test as it relates to antigenic diversity. We discuss CATT sensitivity in the introduction (lines 115-120) as well as the discussion, where regional sensitivity differences are mentioned (lines 715-718). Given uncertainty about total diversity and time since initial infection, we have refrained from speculating about how diversity/timing could affect CATT sensitivity.

      An interesting feature of this new study is the apparent bias to type B N-terminal domain VSGs as well as the discovery that two patients share a specific VSG isolate (though it is not mentioned whether they are related by distance etc). This raises the possibility of substrains with different VSG archives that vary by geography.

      We found two VSGs which were expressed in more than one patient. One was expressed in two patients from the same village (village C) while the other VSG was common between two cases originating from villages C and D, some 40 km apart. We agree that our data generally support the possibility that the VSG archive might vary geographically. We have performed additional analyses suggested by reviewer #2 that support this idea: we have now shown that Tbg patient VSGs classified in this study, which originated from the DRC, are distinct from the VSGs encoded by the reference strain Tbg DAL 972 which was isolated in Cote d’Ivoire. We mention this possibility on lines 721-724.

      Alternatively it suggests that perhaps type B VSGs are picked up differentially by serology (and there the one feature of type B VSGs that could be shared, with regards to detection, is the O-hexose decoration on a number of type B VSG surfaces). Could CATT be detecting elements common to sugar decorated VSGs? Experimentally this is something that can be tested even with mouse infection materials.

      This is indeed an intriguing possibility. We mention this in the discussion (lines 772-778): “In T. brucei, several VSGs have evolved specific functions besides antigenic variation [74]. Recently, the first type B VSG structure was solved [75], revealing a unique O-linked carbohydrate in the VSG’s N-terminal domain. This modification was found to interfere with the generation of protective immunity in a mouse model of infection; perhaps structural differences between each VSG type, including patterns of glycosylation, could influence infection outcomes.” While this is an experimentally tractable explanation for the type B VSG bias we observe, we believe such experiments are beyond the scope of the current paper.

      Side comment: are the common VSGs mutated between patient samples?

      We classified VSGs as common between patient samples if they had >98% nucleotide sequence identity as well as meeting the other quality cutoffs such as 1% expression level and consistency across technical replicates. This identity cutoff still allows for several mismatches between sequences, which we do occasionally observe. However, we cannot confidently rule out that the “mutations” we observe are sequencing or PCR errors. Thus, we cannot say for sure if there are mutations between common VSGs.
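      The classification rule described above can be sketched roughly as follows. This is only an illustration, not the VSG-seq pipeline itself: the function names are hypothetical, the sequences are assumed to be pre-aligned to equal length, the 1% expression cutoff is assumed to be inclusive, and the replicate-consistency check is left out because its exact definition is not given here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def is_common_vsg(seq_a: str, seq_b: str, expr_a: float, expr_b: float,
                  min_identity: float = 98.0, min_expression: float = 1.0) -> bool:
    """Hypothetical helper: two patient VSGs count as 'common' when their
    nucleotide identity exceeds min_identity (percent) and both reach
    min_expression (percent of the population; inclusive by assumption)."""
    return (percent_identity(seq_a, seq_b) > min_identity
            and expr_a >= min_expression
            and expr_b >= min_expression)
```

      With a cutoff of this kind, two 1,500-nt sequences can differ at up to 29 positions and still be called the same VSG, which is why a few mismatches cannot be distinguished from sequencing or PCR errors.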

      Reviewer #2


      1.Throughout the manuscript you observe 'diversity' in expressed VSG and its existence becomes a principal conclusion. I feel that the meaning of diversity and its significance is not sufficiently explained for the reader. In the abstract (l48) you say that there is 'marked diversity' in parasite populations. Presumably you mean parasite infrapopulations, i.e. within patients, not across the DRC? In any case, what is 'marked' about it, and relative to what? Why does it matter that there are multiple expressed VSG in a single patient at one time? Is this not a reasonable expectation for a population of (presumably) clones capable of switching the expressed VSG? How is this different to the view typical of the literature since 1970 that one VSG dominates while others wait in the background at low frequencies. If 'diversity' is the conclusion, then you need to define it and explain its significance more.

      When we refer to diversity, we do mean infrapopulations of parasites within patients, or individual animals in this case, rather than across the DRC. We have edited the text to make this clear (see below). However, the study which benchmarked the application of VSG-seq to quantify VSG expression in vivo during mouse infection did not support the previously-held view that one VSG dominates while others wait in the background at low frequencies. Frequently we observe a handful of VSGs present at 10-20% of the population at any timepoint, and many VSGs (~50% of all detected variants) present at “In a proof-of-principle study, we used VSG-seq to gain insight into the number and diversity of VSGs expressed during experimental mouse infections [30]. This proof-of-principle study revealed significant VSG diversity within parasite populations in each animal, with many more variants expressed at a time than the few thought to be sufficient for immune evasion. This diversity suggested that the parasite’s genomic VSG repertoire might be insufficient to sustain a chronic infection, highlighting the potential importance of recombination mechanisms that form new VSGs.”

      2.Following on from 1., why does the analysis deal in counts of distinct VSG or N-terminal domains, and not then progress to their relative expression? The expression data are in Supp Table 3 and they show that, in most cases where many VSG are observed in the same patient, 1-3 of these are 'dominant', i.e. they account for >50% of the population.

      The VSG-seq analysis pipeline does estimate the relative expression level of each identified variant in the population, and this information is available in the supplemental data (Supplemental Figure 1, Supplemental Table 3). However, we chose not to rely on these measurements too heavily because there was some variation between Tbg technical replicates, which is shown in the supplemental heatmap (Supplemental Figure 1). Replicate three tends to not agree with the first two replicates. We suspect that this was due to the order of sample processing and the fact that the parasite-enriched cDNA sample was repeatedly freeze-thawed between library preparations for technical replicates. Additionally, because our sampling did not reach saturation, some VSGs are not detected in all replicate libraries, making it difficult to estimate their abundance.

      We have added a discussion of these issues to the text on lines 431-433: “Because our sampling did not reach saturation, resulting in some variability between technical replicates, we chose to focus only on the presence/absence of individual VSGs rather than expression levels within parasite populations.”

      Figure 1 deals in VSG counts, but I would then expect another figure to illustrate the reality that only a minority of these observed VSG are likely to be clinically relevant (i.e. the subject of the immune response). This impacts the 'diversity' conclusion, as given in the discussion (ll 657-9), because you cannot afford to treat all these VSG equally when their abundances are quite different.


      We agree that relative expression level is a useful metric, but absent longitudinal sampling it is impossible to determine which VSGs are clinically relevant as defined by the reviewer: low abundance VSGs at one time point may be the predominantly expressed variant at another. Moreover, the threshold for triggering an anti-VSG antibody response remains unknown. Thus, we have chosen to treat all detected variants equally.

      3.How related are these VSG? Were you able to ensure unique read mapping to the VSG assembly? Can you show that reads mapped to a single VSG only and therefore, that the RPKM values are reliable?

      Our analysis accounts for the fact that VSGs can be very similar. We only considered uniquely mapping reads in our VSG-seq analysis. We also account for mappability in our quantification, so VSG sequences that are less unique (and thus have fewer uniquely mapping reads) are not artificially underrepresented in estimates of relative expression. We have specified the parameters used for alignment (line 274) in the methods.
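      One way to picture a mappability correction of this kind is shown below. This is a sketch under stated assumptions, not the formula used in the VSG-seq pipeline: it assumes the correction shrinks the effective gene length to its uniquely mappable fraction before computing an RPKM-style value, and the function and parameter names are hypothetical.

```python
def mappability_adjusted_rpkm(unique_reads: int, length_bp: int,
                              mappable_fraction: float,
                              total_mapped_reads: int) -> float:
    """RPKM-style value computed on the uniquely mappable part of a sequence.

    mappable_fraction is the (assumed) share of the sequence to which reads
    can map uniquely; it must lie in the interval (0, 1].
    """
    if not 0 < mappable_fraction <= 1:
        raise ValueError("mappable_fraction must be in (0, 1]")
    effective_kb = length_bp * mappable_fraction / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return unique_reads / effective_kb / millions_of_reads
```

      Under this assumption, a 1,500-bp VSG that is only half uniquely mappable and collects 500 unique reads receives the same value as a fully unique 1,500-bp VSG with 1,000 reads, so the less unique sequence is not penalized for its ambiguous regions.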

      4.The authors observed no orthology between expressed VSG and DAL972 genes. This is really interesting and deserves closer attention. Presumably there is microhomology? For T. brucei VSG, with constant recombination, we would predict that a comparison of the VSG in West and Central Africa would reveal a pattern of mosaicism, such that individual sequences in DRC would break down into motifs present in multiple genes in the West African reference. Question is, how many genes? What does that distribution look like? What is the smallest homology tract? There is an opportunity here to comment on how VSG repertoires diverge under recombination. How much of the expressed VSG sequence is truly unrepresented in the West African reference (or other T.b.gambiense genome sequences available in ENA). I can believe that none of the N-terminal domains in these data are present intact in DAL972, but I cannot believe that their components are not present without evidence.

      We appreciate the reviewer’s suggestion to look at this more closely. We have performed additional analyses to address sequence similarity, or lack thereof, between the assembled DRC patient VSG and the West African reference TbgDAL972. We ran a nucleotide BLAST of expressed VSGs against the TbgDAL972 genome reference sequence pulled from TriTrypDB.org (release 54). We have added a supplemental figure depicting the results of this analysis (Supplemental Figures 6 and 7). Briefly, our analysis shows that most of the N-termini we identified have no significant similarity to DAL972 VSGs, even with very permissive search parameters. There are frequent hits in the VSG C-termini, however, which might be expected. Most BLAST hits are short spans 98% identity are short 20-25 bp regions. Given the large divergence from the reference, we were unable to infer any patterns of recombination in the VSGs. However, we believe this analysis supports our claim that the N-termini of VSGs assembled from DRC patients are novel, with their component parts largely unrepresented in the West African reference genome.
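      For readers who wish to repeat a comparison of this kind, the hit lengths can be summarized from standard BLAST tabular output (the 12-column `-outfmt 6` layout). The sketch below is illustrative only: the function name and the inclusive 98% identity threshold are assumptions, and the search parameters used for the analysis above are not reproduced here.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def high_identity_hit_lengths(blast_tsv: str, min_identity: float = 98.0) -> list[int]:
    """Return alignment lengths (bp) of hits whose percent identity is at
    least min_identity, given BLAST -outfmt 6 text."""
    reader = csv.DictReader(io.StringIO(blast_tsv),
                            fieldnames=BLAST6_COLUMNS, delimiter="\t")
    return [int(row["length"]) for row in reader
            if float(row["pident"]) >= min_identity]
```

      A distribution of these lengths concentrated at a few tens of base pairs would correspond to the short high-identity tracts described above.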

      5.Figure 4 compares NTD type composition in the DRC data with previously published mouse experiments. The latter take place over very short timescales in maladapted hosts, while the timescales of the former in natural hosts are unknown but plausibly very much longer. So are these data really comparable and are we learning anything from their comparison, given that the most likely explanation for the NTD bias in expressed VSG is the underlying genomic composition?


      Indeed, this is our intended conclusion from Figure 4. The figure is meant to illustrate our claim that the expressed VSGs in each experimental set reflect the underlying genomic composition of their corresponding reference strains, despite fluctuations over time. The language and legend for Figure 4 have been clarified to emphasize this point. We have emphasized in the text that it is unknown whether these fluctuations occur over time in much longer natural infections.


      6.Please comment on the technical reproducibility of the data, there are multiple instances in Supp Table 3 where technical replicates expressed different VSG.

      Three RNA-seq library technical replicates were prepared for each individual gHAT patient RNA sample. Replicates were prepared together in batches, so that, for example, all replicate 1 libraries were prepared on the same day. The original parasite-enriched cDNA sample was frozen and thawed between each batch. We suspect that the cDNA degraded after repeated freeze-thaw cycles, which is why replicate three tends to not agree with the other two, as can be seen on the heatmap in supp fig. 1 and the expression data in supp table 3. We also suspect that, because our sampling did not reach saturation, different VSGs were detected in individual replicate preps. We have edited the methods and mentioned this variability in the results section to communicate this issue more transparently.

      • Lines 395-397 “Using RNA extracted from 2.5 mL of whole blood from each patient, we prepared libraries for VSG-seq in three separate batches for each technical replicate.”
      • Lines 431-433: “Because our sampling did not reach saturation, resulting in some variability between technical replicates, we chose to only focus on the presence/absence of individual VSGs rather than relative expression levels within the population”

      Reviewer #3

      1. In line 499, the authors conclude that, because the expressed VSGs differ between the blood and CSF, different organs may harbor different VSG sets. Given that this is n=1 for patient samples I think this is too speculative a statement. There is also no indication as to whether the samples were taken at the same time or not.

      This is absolutely correct. The precise timing of CSF sample collection is unknown for these samples. It likely occurred within hours to days after blood collection, but even on this short time scale, the unique CSF repertoire could represent the antibody-mediated clearance of one VSG population and replacement with another. We have scaled back our language and only point out that there are unique VSGs in this space (Lines 522 – 524).

      I think that the authors need to be very careful as to the conclusions drawn about VSG expression over time in terms of hierarchy and N-terminal fluctuations. For any conclusions to be drawn on the hierarchy of VSG expression more data points are needed taken over time (this is obviously challenging when looking at patient samples). I find it too speculative to draw any conclusions when single time points are assessed and the assumption on the progression of the infection depends on whether it is a Tb or Tbr.

      Reviewer #2 also pointed this out. We agree and have attempted to limit definitive conclusions in the text and instead discuss multiple possible explanations behind our observations.

      I found some of the figure legends a bit terse. For example, in Figure 1 C, what do the black circles and lines represent? Perhaps a little more detail would help the reader.

      Clarified legends for UpSet plots in figures 1C and 3C as follows: “The intersection of expressed VSG sets in each patient. Bars on the left represent the size of the total set of VSGs expressed in each patient. Dots represent an intersection of sets with bars above the dots representing the size of the intersection.”
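      The quantities shown in such a plot can be computed in a few lines. The sketch below is an illustration rather than the plotting code used for the figures: it assumes the conventional UpSet layout in which each VSG is counted once, in the exact combination of patients that express it, and the patient and VSG names in any input are hypothetical.

```python
from collections import Counter


def upset_counts(vsgs_by_patient: dict[str, set[str]]):
    """Return (set_sizes, intersections) for an UpSet-style plot.

    set_sizes:     patient -> number of VSGs expressed (the side bars).
    intersections: frozenset of patients -> number of VSGs expressed in
                   exactly that combination of patients (the top bars).
    """
    set_sizes = {patient: len(vsgs) for patient, vsgs in vsgs_by_patient.items()}

    # Record, for every VSG, the set of patients in which it was detected.
    patients_by_vsg: dict[str, set[str]] = {}
    for patient, vsgs in vsgs_by_patient.items():
        for vsg in vsgs:
            patients_by_vsg.setdefault(vsg, set()).add(patient)

    intersections = Counter(frozenset(p) for p in patients_by_vsg.values())
    return set_sizes, intersections
```

      In this representation, a single dot corresponds to a one-patient combination (VSGs unique to that patient), while dots joined by a line correspond to a combination of two or more patients that share VSGs.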

      In figure 2, I found it difficult to distinguish between the orange and dark red in (A) and the two lighter blue colors.

      We have changed N-terminal type color palette for all plots to make red and blue hues more distinctive.

      In line 389 – estimate

      Corrected

      In line 498 - should the reference be to figure 2C?

      This should be a reference to Figure 3B. We have corrected the reference.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this work, So and Sudlow et al. have used an established methodology, VSG-seq, to assess the expressed VSG diversity in 12 patients infected with T. brucei gambiense. As in mouse models, there is diversity in VSG expression in patients. This technology has not previously been applied to patient samples and is now validated as a valuable tool to study antigenic variation in human populations. The authors have found that, in addition to the VSG diversity seen, there was a significant bias towards B-type N-terminal domains and restricted C-terminal types. This work, although on a small sample group, is an important step forward in applying this technology to understanding trypanosome immune evasion in the field.

      Major comments:

      I think that overall, the key conclusions on the expressed VSG diversity and that there are geographical variations are convincing, and I would agree with the conclusion that it is now feasible to study antigenic variation in the field. But given the sample size, I feel that two of the findings are overstated and should at least be qualified as speculative.

      1.In line 499, the authors conclude that, because the expressed VSGs differ between the blood and CSF, different organs may harbor different VSG sets. Given that this is n=1 for patient samples I think this is too speculative a statement. There is also no indication as to whether the samples were taken at the same time or not.

      2.I think that the authors need to be very careful as to the conclusions drawn about VSG expression over time in terms of hierarchy and N-terminal fluctuations. For any conclusions to be drawn on the hierarchy of VSG expression more data points are needed taken over time (this is obviously challenging when looking at patient samples). I find it too speculative to draw any conclusions when single time points are assessed and the assumption on the progression of the infection depends on whether it is a Tb or Tbr. I don't believe that any other experiments are needed and the statistical analysis is adequate.

      Minor comments:

      I found some of the figure legends a bit terse. For example, in Figure 1 C, what do the black circles and lines represent? Perhaps a little more detail would help the reader.

      In figure 2, I found it difficult to distinguish between the orange and dark red in (A) and the two lighter blue colors.

      In line 389 - estimate

      In line 498 - should the reference be to figure 2C?

      Significance

      Overall, this is an interesting study and shows the practical application of VSG-seq on the study of human infections. There is clearly interesting biology to be learned about both Tbg and Tbr infections and immune evasion by these parasites - which can now be done with the development and application of these technologies. I am a molecular cell biologist who specialises in trypanosome biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      So et al. have analyzed the expression profiles of T.b.gambiense VSG genes in 12 natural human infections in DRC during a six month period of 2013, and compared these results to existing data for T.b.rhodesiense VSG and previously published data from mice. They use the VSGseq approach developed by the Mugnier lab over the last few years to good effect and provide a description of the expression profiles using phylogenetic and network approaches. The main conclusions are that parasite infrapopulations in each patient express largely mutually exclusive VSG cohorts, with a couple of exceptions where patients 'shared' identical VSG transcripts. The authors note that these Congolese VSG are not comparable with the West African T.b.gambiense reference sequence, and there is a pronounced bias in the systematic composition of expressed VSG (towards 'B-type VSG') that is not observed in other T. brucei subspecies. These latter observations lead to the suggestion that there may be substantial variation in expressed VSG repertoire among T. brucei populations that could have important consequences for pathology, although the spatial or temporal scale upon which this variation could be expected to occur cannot be inferred from these data. Overall, a competent study and a welcome addition to, if not extension of, recent work describing the dynamics of VSG expression in multiple African trypanosomes.

      Major points:

      1.Throughout the manuscript you observe 'diversity' in expressed VSG and its existence becomes a principal conclusion. I feel that the meaning of diversity and its significance is not sufficiently explained for the reader. In the abstract (l48) you say that there is 'marked diversity' in parasite populations. Presumably you mean parasite infrapopulations, i.e. within patients, not across the DRC? In any case, what is 'marked' about it, and relative to what? Why does it matter that there are multiple expressed VSG in a single patient at one time? Is this not a reasonable expectation for a population of (presumably) clones capable of switching the expressed VSG? How is this different to the view typical of the literature since 1970 that one VSG dominates while others wait in the background at low frequencies. If 'diversity' is the conclusion, then you need to define it and explain its significance more.

      2.Following on from 1., why does the analysis deal in counts of distinct VSG or N-terminal domains, and not then progress to their relative expression? The expression data are in Supp Table 3 and they show that, in most cases where many VSG are observed in the same patient, 1-3 of these are 'dominant', i.e. they account for >50% of the population. Figure 1 deals in VSG counts, but I would then expect another figure to illustrate the reality that only a minority of these observed VSG are likely to be clinically relevant (i.e. the subject of the immune response). This impacts the 'diversity' conclusion, as given in the discussion (ll 657-9), because you cannot afford to treat all these VSG equally when their abundances are quite different.

      3.How related are these VSG? Were you able to ensure unique read mapping to the VSG assembly? Can you show that reads mapped to a single VSG only and therefore, that the RPKM values are reliable?

      4.The authors observed no orthology between expressed VSG and DAL972 genes. This is really interesting and deserves closer attention. Presumably there is microhomology? For T. brucei VSG, with constant recombination, we would predict that a comparison of the VSG in West and Central Africa would reveal a pattern of mosaicism, such that individual sequences in DRC would break down into motifs present in multiple genes in the West African reference. Question is, how many genes? What does that distribution look like? What is the smallest homology tract? There is an opportunity here to comment on how VSG repertoires diverge under recombination. How much of the expressed VSG sequence is truly unrepresented in the West African reference (or other T.b.gambiense genome sequences available in ENA). I can believe that none of the N-terminal domains in these data are present intact in DAL972, but I cannot believe that their components are not present without evidence.

      5.Figure 4 compares NTD type composition in the DRC data with previously published mouse experiments. The latter take place over very short timescales in maladapted hosts, while the timescales of the former in natural hosts are unknown but plausibly very much longer. So are these data really comparable and are we learning anything from their comparison, given that the most likely explanation for the NTD bias in expressed VSG is the underlying genomic composition?

      6.Please comment on the technical reproducibility of the data, there are multiple instances in Supp Table 3 where technical replicates expressed different VSG.

      Minor points:

      1. Typo: 'estimates', line 389

      Significance

      The significance of this work relates to the application of VSG expression profiling to natural human infections, something not previously done largely because human infections are rare and materials difficult to obtain. The approach and the conclusions are not novel and do not represent substantial advances on previous efforts, but have an important aspect in confirming for natural infections what has been observed in quite artificial experimental settings. Sample size is small and this means that the conclusions remain speculative and cannot readily be extended to all HAT settings. This is not a criticism, since the analysis of any human samples is progress, but it does mean that the study raises interesting questions (e.g. variation across the population in N-terminal domain usage) rather than providing definitive conclusions. It is likely to interest trypanosome biologists with a specific interest in antigenic variation.

      My own field concerns trypanosome genomics and the evolutionary dynamics of variant antigen genes.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this paper, Mugnier and colleagues describe the repertoire of VSGs present within a cohort of human HAT cases that occurred at relatively close geographical distance.

      VSG repertoires were first described by the senior author a few years ago, from mouse infection data. This is the first such piece of data to come from human-infective parasites in the field. Technically this is a feat, because of the small number of parasites that are present per mL of human blood at any given time during infection with T. gambiense. Nevertheless they manage to identify up to 14 unique VSGs per patient sample. And this raises the first theoretical question: can they extrapolate to the average diversity load per human? This is important because the timing of sample collection (i.e. that it occurred within a period of weeks) suggests that an initial group of infected tsetse infected these patients (rather than a small number of interactions between a bloodmeal and a new infection, generally in itself on the order of 1 month or so). If parasitemia is low and diversity limited, this would explain both why CATT works as well as it does (because really it shouldn't at all!) and perhaps even the chronicity of infection (in the sense that the organism is unlikely to "run out" even of complete VSGs, never mind mosaics). The paper would benefit from a direct discussion of this.

      An interesting feature of this new study is the apparent bias to type B N-terminal domain VSGs as well as the discovery that two patients share a specific VSG isolate (though it is not mentioned whether they are related by distance etc). This raises the possibility of substrains with different VSG archives that vary by geography. Alternatively it suggests that perhaps type B VSGs are picked up differentially by serology (and there the one feature of type B VSGs that could be shared, with regards to detection, is the O-hexose decoration on a number of type B VSG surfaces). Could CATT be detecting elements common to sugar-decorated VSGs? Experimentally this is something that can be tested even with mouse infection materials.

      Side comment: are the common VSGs mutated between patient samples?

      Significance

      Significance: high in the sense that this is the first in human field study of a disease that has been studied quite a lot in mouse models. Clearly from this work, there is still a lot to be learned from studying a disease in context.

      Audience: parasitologists

      My own expertise: parasitology and immunology

      Referees cross-commenting

      Nothing substantial to add. From the comments (all of which are worthwhile) I would suspect this would require minor revision.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity): Thank you for the opportunity to review "Population-level survey of loss-of-function mutations revealed that background dependent fitness genes are rare and functionally related in yeast" by Caudal et al. This manuscript reports on the genetic background-dependent traits resulting from natural variation. Authors use 39 natural isolates of the budding yeast (S. cerevisiae) and apply a transposon saturation mutagenesis approach to analyze fitness due to loss of function mutations. They identified background and environment dependent genes. They estimate that background specific rewiring is rare and represents instances of bridging between bioprocesses as well as connecting functionally related genes.

      Major comments

      1. Authors filtered strains based on whole chromosome aneuploidies, but what about chromosome arm aneuploidies? Were they detected and if so how were they handled? This should be discussed.

      We did not detect any chromosome arm aneuploidies. In fact, if any significant segmental duplication were present in any of the tested strains, we would have observed changes of gene essentiality for multiple successive ORFs, which was not the case.

      How does chromatin structure variation across different genetic backgrounds affect the results of the screen? Is this a confounding variable? This should be discussed.

      We thank the reviewer for raising this interesting point. There are two aspects to take into consideration. First, transposon insertion is biased by nucleosome occupancy, as is more or less expected. In previous screens and in our data, this bias appears as a lower insertion density in the promoter, in addition to the ORF, for essential genes. If nucleosome occupancy is conserved across genetic backgrounds, this insertion bias will not be a confounding factor, as the same gene will share the same bias across backgrounds. Second, if nucleosome occupancy varies across genetic backgrounds, it could potentially lead to some background-specific insertion biases; however, it is difficult to know whether it would be the cause or the consequence of the mutation. In any case, there are currently no chromatin structure data available across different genetic backgrounds, and this could be a direction for future research.

      On page 7 authors discuss the involvement of other biological processes in addition to respiration and mitochondrial function. It is not clear what they are referring to. This should be clarified in the main text.

      We clarified this point in the modified ms.

      It would be useful to annotate the functional information discussed in the text directly on the network in Fig. 4 A and B.

      We included annotations on the networks (see Fig. 4 and Fig. S4) as suggested in the modified ms.

      On page 9, authors should comment on the origin of ACP and CLG strain that would result in the similarity of their fitness profile to S288C which they note as an exception.

      ACP is an isolate from Russian wine and CLG is a clinical isolate from the UK. In terms of the overall genetic diversity, these two strains are not closely related to the reference strain S288C. As for other profiles, no correlations were observed between the background-dependent mutant fitness variation and their genetic origins.

      On page 10 authors discuss that background-specific fitness genes can belong to protein complexes. Can authors test this formally by looking at the overlap with the protein complex standard or protein interaction standard? This would strengthen this statement.

      Due to the low number of cases, it is impossible to test this using protein complex standards, as both the size of the terms and the sample size are too small. However, the enriched SAFE terms are in general representative of biological processes, which include multiple protein complexes with similar functions. The genes enriched for each SAFE term are further broken down into specific GO terms, as indicated in Table S4.

      Authors should discuss the reasons why transcription & chromatin remodeling and nuclear-cytoplasmic transport are anticorrelated with genes involved in mitochondrial translation in terms of their fitness profiles and the implications for the evolution of environment-dependent fitness genes.

      These observations were new and we are currently looking for potential explanations to this effect. Unfortunately, there is no obvious explanation we can think of and discuss at this point. More data and further experiments are needed to have some clues about this observation.

      Authors discuss the limitation of the Hermes system; however, couldn't they test this system with a different inducible promoter, such as an estradiol-regulated promoter, to remove the effect of galactose metabolism?

      For the Hermes system to work effectively, we need a highly expressed promoter system that is also inducible and GAL1 is the strongest available. As for the estradiol system, first it requires the induction machinery to be integrated in the strain and that will significantly limit the scaling of the project, and second, the maximum induction level is significantly lower than that from the GAL1 system, as is recently shown in Arita et al., MSB 2021. For these reasons, the effect of galactose metabolism is inevitable using any transposon system at present.

      Minor comments

      All figures should contain the appropriate colour bars and legends. For example, Figure S5B relies on the colour bar in Figure 5C but it should have its own colour bar.

      We modified the figures as suggested.

      Reviewer #1 (Significance): This work provides a comprehensive survey of the variation in natural isolates of yeast and would be interesting to a broad audience studying the genotype-to-phenotype relationship. It is the first study that systematically assessed the fitness effect of loss of function mutations across a large panel of natural isolates providing novel insight into the background specific and environment dependent genes. This represents a valuable resource for the community to ask questions about natural variation in yeast. My expertise is in complex genetic networks in yeast and genome evolution.

      Reviewer #2 (Evidence, reproducibility and clarity): For decades, geneticists have used loss of function (LoF) mutations to unravel the molecular bases of phenotypic variability. However, a common concern is to what extent the phenotypes observed in a strain or accession recapitulate what happens at the species level. In not a few cases, anecdotal evidence shows that an observed mutant phenotype is not recapitulated in another strain, presumably due to the "strain background". Recent efforts using different strains of Saccharomyces cerevisiae have addressed the problem, but the number has been limited. Here, Elodie Caudal et al. use an ingenious transposon-saturation strategy to carry out a large-scale, genome-wide screen of LoF mutations in 39 strains. Based on a competitive-pooling strategy, authors estimate the probability of 4,469 genes being essential, compared to the reference S288C laboratory strain. Background-specific effects were in general rare. Around 15% of these genes show an essential phenotype which is dependent on the strain background, most of them showing continuous variation across all backgrounds and one third being specific to only one strain. Such background specific genes are functionally related and are under relaxed purifying selection and show "intermediate" integration in genetic-interaction networks compared to essential and non-essential genes. The manuscript is very easy to follow, and the experimental/statistical procedures are transparent and in general well described.

      Major comments

      1. A limitation of the transposon saturation strategy is the need for galactose as the carbon source, which confounds scoring of genetic background effects. The study would highly benefit from any kind of orthogonal validation or phenotype predictions, beyond the BMH1 case presented (Fig S5). A few options would be direct testing of lethal/sick phenotypes of clean gene knockouts for discussed hits (Fig 5) in several strains and conditions including galactose, testing a few of the transposon libraries under different conditions to validate the environmental nature of the continuous behavior, or testing the predictive power of the method using data or strains used in Galardini 2019 (ref. 25).

      As all three reviewers suggested that validation of our predicted probability score should be supported by experimental data, we performed orthogonal validations for 8 genes across 17 backgrounds. We have included the new results in the revised ms.

      Showing the degree of replicability of the entire procedure would also help, from transposon insertion to phenotypic comparisons. If we understood correctly, this was indeed done for isolate AKE. What is the correlation of their probability scores?

      The AKE strain was done twice due to the mixed haploid/diploid profile, as mentioned in the text. In this case the reproducibility in terms of probability scores is expected to be lower. We plotted the predicted probability values for the two reps (attached below) and calculated the Pearson’s correlation. The correlation coefficient is 0.86 (P-value

      The use of "fitness genes" is confusing, since the main phenotypic output here scored for each gene is LoF lethality, or more specifically the probability of being lethal or non-essential. Lethality or essentiality would be a more appropriate concept throughout. A next step would indeed be to quantify the phenotypic effects in a more quantitative manner (which is generally used while referring to a gene's fitness effect).

      We clarified this point in the revised version and use “predicted fitness variation” instead of “fitness genes”.

      Some minor comments

      -Considering that part of the signal is coming from the specific environment tested, one would expect some degree of clustering among related strains based on their gene-essentiality probability (Fig2), given that growth phenotypes correlate well with strain origins when tested under different environments (Warringer et al., 2011). Please discuss.

      In Warringer et al. 2011, the correlation was more pronounced between species (S. paradoxus vs. S. cerevisiae) than intraspecifically. Moreover, it was based on a very small sample size. In fact, multiple more recent studies have shown that the growth phenotypes across a large number of conditions between strains in S. cerevisiae are not correlated with their genetic origins (Peter et al. Nature 2018). Indeed, it is not unexpected that the gene-essentiality probability profiles are not correlated with their origins.

      -Galactose is not a non-fermentable carbon source (pg 11, pg12). It is true that flux through the fermentative pathways is lower and that the respiratory pathways are induced in galactose, when compared to growth on glucose, but galactose is readily fermented under low oxygen conditions. Indeed, variation in the regulation of these pathways could explain the environmental effects detected.

      The reviewer raised a good point. While galactose is not a non-fermentable carbon source, the entry of galactose into glycolysis requires the respiration pathways and rho-/rho0 yeast mutants are unable to grow on galactose as the sole carbon source. We clarified this in the new version of the ms.

      -Examples on FigS3 were useful for a better intuition of what the actual data look like. Perhaps some of this belongs in Fig1.

      Schematic presentation of the insertion profiles is already shown in Figure 1C. Due to the limited size of Figure 1, we kept Fig S3 as it is in the new version of the ms.

      Fig2, restrict the #insertions label to the actual limits for the set of 39 strains. Currently, it seems there are strains with fewer than 100K and no strain with 300K insertions.

      We thank the reviewer for pointing this out, it was a scaling problem and we fixed it.

      -pg5 paragraph 2, a line on how representative is the set of 106 isolates and again later for the final data set of 39. Which main clades are missing or perhaps overrepresented?

      Compared to the original 106 isolates, the final 39 isolates are still broadly representative of the species diversity, although some of the most divergent clusters, such as isolates from French Guiana and from China, are underrepresented. We included this comment in the revised version.

      -pg6 paragraph 1, should be 106 or 107?

      It was 106 plus the reference strain. This point is clarified now in the new ms.

      -pg14 line2, is OD of 0.5 correct or was also 0.05 as in galactose? This is relevant, since it would change the competitive selection regime under galactose or glucose (more generations under glucose in the latter case). For clarity, authors could here state an approximate number of cell divisions in each medium.

      The OD of 0.5 was correct as this step was only intended as a “recovery phase” and was used to increase the mutant pool for sequencing. We also clarified this point in the text.

      -pg14 line 2, correct wording "to enrich for cells the transposon.."

      We clarified this point in the revised version.

      Reviewer #2 (Significance): While recent previous studies have measured genetic background dependent effects of gene mutations at the genome-wide level, this is the first study addressing the problem at the broader population level. Confirming that such effects are in general rare, even at this broad level, is a significant advance in the field. It is limited in the number of environmental conditions and subsequent insights (as in Galardini 2019, ref #25) and in more mechanistic views of specific allele interactions (as in Mullis 2018, ref #5). We feel, however, that these directions would already be out of the scope of the well-framed question here addressed. Because of the problem addressed and tackled in an ingenious and comprehensive manner, this manuscript will attract the attention of a broad audience of geneticists, genome and systems biologists. Our main expertise is in yeast genetics and functional genomics.

      Referee Cross-commenting

      Reviewer #1 commented on the possibility that insertion density could be determined by local chromatin status instead of gene essentiality, given that transposon insertion occurs more often at nucleosome-free sites (point 2). While the insertion pattern around the essential gene's vicinity is convincing, we agree that it would help to show that these phenomena are independent from one another, or that this issue must at least be discussed. The seeming need for further experimental or analytical validation was raised by reviewers #1 and #3.

      As mentioned above, we performed orthogonal validations for 8 genes across 17 backgrounds and we included the results in the revised ms.

      Reviewer #3 (Evidence, reproducibility and clarity): In this manuscript, Caudal et al tested differences in gene knockout phenotypes across genetically diverse yeast strains using a transposon system. After initially querying 106 strains, most tested strains were removed from further consideration due to low transposon insertion numbers, aneuploidies, or other issues. The authors used the remaining 39 strains to identify a set of 632 genes that are required for normal growth in some genetic backgrounds but not in others. These context-dependent fitness genes are enriched for genes with a role in respiration, which could be because the experiment is performed using galactose as carbon source. Further analysis of potential environment-dependent fitness genes revealed two separate groups of genes that were anti-correlated in their fitness profiles.

      I found this an interesting paper, that explores differential gene essentiality (fitness) across diverse yeast strains. The authors give a detailed description of their findings, thereby differentiating between "environmental" and "genetic background" factors. The paper is well-written and the results are clearly presented. I have only two main concerns, both regarding the quality of the produced data:

      Major comments:

      • Looking at the differential fitness scores in the supplementary data, none of the 57 genes that are known to show differential essentiality between S288C and Sigma1278b (Dowell et al., 2010, Science) appear to be identified as having differential fitness in the transposon screen. The authors mention that some of these genes have a severe fitness defect when deleted in the nonessential background and that some are only partially essential. Although this is certainly true for specific cases, deletion mutants of most of these 57 genes show a large difference in fitness between S288C and Sigma, and this thus doesn't sufficiently explain the complete lack of validation of 57 known positive cases. I think the authors need to further clarify why these known positive controls are not identified in their screen.

      In Dowell et al. 2010, the essentiality was determined by tetrad dissection comparing S288C and Sigma, and as shown in the supplemental data, ~1/3 out of the 57 are in fact extremely sick in one background and non-viable in the other. This strong fitness defect cannot be distinguished using the transposon method. More recently in Hou et al. PNAS 2019, it has been shown that ~15 out of the 57 original cases were due to chromosomal genetic modifiers, which again, mainly concerned the “domain essential” effect that we also captured in our data. In addition, 8 hits out of the 57 were shown to be related to mitochondrial genomes in Edward et al. PNAS 2014, and due to the galactose condition we used, these cases were not detected. Other undetected cases were due to the low coverage in the corresponding regions in either one or both backgrounds.

      • Related to the previous point, the authors perform no secondary validation of identified context-dependent essential genes. They show that they can recapitulate known sets of essential and nonessential genes in S288c, but given my previous point, it is not clear how well their logistic model works for predicting differential gene essentiality/fitness. In my opinion, experimental validation of a subset of the identified differential fitness genes is needed to be able to be confident about the results.

      As already mentioned above, new experiments were performed in order to validate a subset of the identified differential fitness genes. The results were included in the revised version of the ms.

      Minor comments:

      • The authors provide lots of data spread over many columns in the supplementary tables. However, a description of what is in each column is missing, and without it, it is not always possible to understand the data.

      We added column annotations in the spreadsheets as suggested.

      • I didn't understand the sentence at the bottom of page 5: "the number of insertion drops from -100 bp prior to CDS and extends to - 100 bp until the terminator region". Perhaps the authors can rephrase.

      We clarified this point in the revised version.

      Reviewer #3 (Significance): To my knowledge, this is the first paper exploring gene essentiality across a large number of genetically diverse yeast strains. This paper will be of interest to a broad range of geneticists.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, Caudal et al tested differences in gene knockout phenotypes across genetically diverse yeast strains using a transposon system. After initially querying 106 strains, most tested strains were removed from further consideration due to low transposon insertion numbers, aneuploidies, or other issues. The authors used the remaining 39 strains to identify a set of 632 genes that are required for normal growth in some genetic backgrounds but not in others. These context-dependent fitness genes are enriched for genes with a role in respiration, which could be because the experiment is performed using galactose as carbon source. Further analysis of potential environment-dependent fitness genes revealed two separate groups of genes that were anti-correlated in their fitness profiles.

      I found this an interesting paper, that explores differential gene essentiality (fitness) across diverse yeast strains. The authors give a detailed description of their findings, thereby differentiating between "environmental" and "genetic background" factors. The paper is well-written and the results are clearly presented. I have only two main concerns, both regarding the quality of the produced data:

      Major comments:

      • Looking at the differential fitness scores in the supplementary data, none of the 57 genes that are known to show differential essentiality between S288C and Sigma1278b (Dowell et al., 2010, Science) appear to be identified as having differential fitness in the transposon screen. The authors mention that some of these genes have a severe fitness defect when deleted in the nonessential background and that some are only partially essential. Although this is certainly true for specific cases, deletion mutants of most of these 57 genes show a large difference in fitness between S288C and Sigma, and this thus doesn't sufficiently explain the complete lack of validation of 57 known positive cases. I think the authors need to further clarify why these known positive controls are not identified in their screen.
      • Related to the previous point, the authors perform no secondary validation of identified context-dependent essential genes. They show that they can recapitulate known sets of essential and nonessential genes in S288c, but given my previous point, it is not clear how well their logistic model works for predicting differential gene essentiality/fitness. In my opinion, experimental validation of a subset of the identified differential fitness genes is needed to be able to be confident about the results.

      Minor comments:

      • The authors provide lots of data spread over many columns in the supplementary tables. However, a description of what is in each column is missing, and without it, it is not always possible to understand the data.
      • I didn't understand the sentence at the bottom of page 5: "the number of insertion drops from -100 bp prior to CDS and extends to - 100 bp until the terminator region". Perhaps the authors can rephrase.

      Significance

      To my knowledge, this is the first paper exploring gene essentiality across a large number of genetically diverse yeast strains. This paper will be of interest to a broad range of geneticists.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      For decades, geneticists have used loss of function (LoF) mutations to unravel the molecular bases of phenotypic variability. However, a common concern is to what extent the phenotypes observed in a strain or accession recapitulate what happens at the species level. In not a few cases, anecdotal evidence shows that an observed mutant phenotype is not recapitulated in another strain, presumably due to the "strain background". Recent efforts using different strains of Saccharomyces cerevisiae have addressed the problem, but the number has been limited. Here, Elodie Caudal et al. use an ingenious transposon-saturation strategy to carry out a large-scale, genome-wide screen of LoF mutations in 39 strains. Based on a competitive-pooling strategy, authors estimate the probability of 4,469 genes being essential, compared to the reference S288C laboratory strain. Background-specific effects were in general rare. Around 15% of these genes show an essential phenotype which is dependent on the strain background, most of them showing continuous variation across all backgrounds and one third being specific to only one strain. Such background specific genes are functionally related and are under relaxed purifying selection and show "intermediate" integration in genetic-interaction networks compared to essential and non-essential genes. The manuscript is very easy to follow, and the experimental/statistical procedures are transparent and in general well described.

      Major comments

      1. A limitation of the transposon saturation strategy is the need for galactose as the carbon source, which confounds scoring of genetic background effects. The study would highly benefit from any kind of orthogonal validation or phenotype predictions, beyond the BMH1 case presented (Fig S5). A few options would be direct testing of lethal/sick phenotypes of clean gene knockouts for discussed hits (Fig 5) in several strains and conditions including galactose, testing a few of the transposon libraries under different conditions to validate the environmental nature of the continuous behavior, or testing the predictive power of the method using data or strains used in Galardini 2019 (ref. 25).
      2. Showing the degree of replicability of the entire procedure would also help, from transposon insertion to phenotypic comparisons. If we understood correctly, this was indeed done for isolate AKE. What is the correlation of their probability scores?
      3. The use of "fitness genes" is confusing, since the main phenotypic output here scored for each gene is LoF lethality, or more specifically the probability of being lethal or non-essential. Lethality or essentiality would be a more appropriate concept throughout. A next step would indeed be to quantify the phenotypic effects in a more quantitative manner (which is generally used while referring to a gene's fitness effect).

      Some minor comments

      -Considering that part of the signal is coming from the specific environment tested, one would expect some degree of clustering among related strains based on their gene-essentiality probability (Fig2), given that growth phenotypes correlate well with strain origins when tested under different environments (Warringer et al., 2011). Please discuss.

      -Galactose is not a non-fermentable carbon source (pg 11, pg12). It is true that flux through the fermentative pathways is lower and that the respiratory pathways are induced in galactose, when compared to growth on glucose, but galactose is readily fermented under low oxygen conditions. Indeed, variation in the regulation of these pathways could explain the environmental effects detected.

      -Examples on FigS3 were useful for a better intuition of what the actual data look like. Perhaps some of this belongs in Fig1.

      Fig2, restrict the #insertions label to the actual limits for the set of 39 strains. Currently, it seems there are strains with fewer than 100K and no strain with 300K insertions.

      -pg5 paragraph 2, a line on how representative is the set of 106 isolates and again later for the final data set of 39. Which main clades are missing or perhaps overrepresented?

      -pg6 paragraph 1, should be 106 or 107?

      -pg14 line2, is OD of 0.5 correct or was also 0.05 as in galactose? This is relevant, since it would change the competitive selection regime under galactose or glucose (more generations under glucose in the latter case). For clarity, authors could here state an approximate number of cell divisions in each medium.

      -pg14 line 2, correct wording "to enrich for cells the transposon.."
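
      The request above for an approximate number of cell divisions in each medium can be illustrated with a short calculation. This is only a sketch: it assumes OD is proportional to cell density over the whole range, and the final OD used here is a hypothetical placeholder, not a value taken from the manuscript.

```python
import math


def generations(od_start: float, od_end: float) -> float:
    """Approximate number of population doublings between two OD readings.

    Assumes OD scales linearly with cell density across the range.
    """
    if od_start <= 0 or od_end <= 0:
        raise ValueError("OD values must be positive")
    return math.log2(od_end / od_start)


# Hypothetical harvest density; the manuscript's actual value is not given here.
FINAL_OD = 5.0

for label, start in [("inoculated at OD 0.05", 0.05), ("inoculated at OD 0.5", 0.5)]:
    print(f"{label}: ~{generations(start, FINAL_OD):.1f} generations")
```

      Whatever the final OD, a tenfold difference in starting OD shifts the count by log2(10), or about 3.3 doublings, which is the size of the difference in selection regime the comment is asking about.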

      Significance

      While recent previous studies have measured genetic background dependent effects of gene mutations at the genome-wide level, this is the first study addressing the problem at the broader population level. Confirming that such effects are in general rare, even at this broad level, is a significant advance in the field. It is limited in the number of environmental conditions and subsequent insights (as in Galardini 2019, ref #25) and in more mechanistic views of specific allele interactions (as in Mullis 2018, ref #5). We feel, however, that these directions would already be out of the scope of the well-framed question here addressed.

      Because of the problem addressed and tackled in an ingenious and comprehensive manner, this manuscript will attract the attention of a broad audience of geneticists, genome and systems biologists. Our main expertise is in yeast genetics and functional genomics.

      Referee Cross-commenting

      Reviewer #1 commented on the possibility that insertion density could be determined by local chromatin status instead of gene essentiality, given that transposon insertion occurs more often at nucleosome-free sites (point 2). While the insertion pattern around the essential gene's vicinity is convincing, we agree that it would help to show that these phenomena are independent from one another, or that this issue must at least be discussed.

      The seeming need of further experimental or analytical validation was raised by reviewers #1 and #3.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Thank you for the opportunity to review "Population-level survey of loss-of-function mutations revealed that background dependent fitness genes are rare and functionally related in yeast" by Caudal et al. This manuscript reports on the genetic background-dependent traits resulting from natural variation. The authors use 39 natural isolates of the budding yeast (S. cerevisiae) and apply a transposon saturation mutagenesis approach to analyze the fitness effects of loss-of-function mutations. They identified background- and environment-dependent genes. They estimate that background-specific rewiring is rare and represents instances of bridging between bioprocesses as well as connections between functionally related genes.

      Major comments

      1. The authors filtered strains based on whole-chromosome aneuploidies, but what about chromosome-arm aneuploidies? Were they detected and, if so, how were they handled? This should be discussed.
      2. How does chromatin structure variation across different genetic backgrounds affect the results of the screen? Is this a confounding variable? This should be discussed.
      3. On page 7 authors discuss the involvement of other biological processes in addition to respiration and mitochondrial function. It is not clear what they are referring to. This should be clarified in the main text.
      4. It would be useful to annotate the functional information discussed in the text directly on the network in Fig. 4 A and B.
      5. On page 9, the authors should comment on the origin of the ACP and CLG strains that would result in the similarity of their fitness profiles to S288C, which they note as an exception.
      6. On page 10, the authors discuss that background-specific fitness genes can belong to protein complexes. Can the authors test this formally by looking at the overlap with the protein complex standard or protein interaction standard? This would strengthen this statement.
      7. The authors should discuss the reasons why transcription & chromatin remodeling and nuclear-cytoplasmic transport are anticorrelated with genes involved in mitochondrial translation in terms of their fitness profiles, and the implications for the evolution of environment-dependent fitness genes.
      8. The authors discuss the limitation of the Hermes system; however, couldn't they test this system with a different inducible promoter, such as an estradiol-regulated promoter, to remove the effect of galactose metabolism?

      Minor comments

      All figures should contain the appropriate colour bars and legends. For example, Figure S5B relies on the colour bar in Figure 5C but it should have its own colour bar.

      Significance


      This work provides a comprehensive survey of the variation in natural isolates of yeast and would be interesting to a broad audience studying the genotype-to-phenotype relationship. It is the first study that systematically assessed the fitness effect of loss of function mutations across a large panel of natural isolates providing novel insight into the background specific and environment dependent genes. This represents a valuable resource for the community to ask questions about natural variation in yeast. My expertise is in complex genetic networks in yeast and genome evolution.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This is a very interesting paper with novel observations. The authors find that, in yeast, Rvb1/2 AAA+ ATPases couple transcription, mRNA granular localization, and mRNA translatability during glucose starvation. Rvb1 and Rvb2 were found to be enriched at the promoters and mRNAs of genes involved in alternative glucose metabolism pathways that are transcriptionally upregulated but translationally downregulated during glucose starvation.

      The following are some comments

      Introduction

      1. "Structural studies have shown that they form a dodecamer comprised of a stacked Rvb1 hexametric ring and a Rvb2 hexametric ring."
         • Rvb1 and Rvb2 form a heterohexameric ring with an alternating arrangement (not homohexamers that stack on top of each other, as suggested by this sentence).
         • In yeast, they oligomerize mostly as single hexameric rings, with dodecamers reported at less than 10% frequency in vivo (e.g. Jeganathan et al. 2015, https://doi.org/10.1016/j.jmb.2015.01.010).

        Results Section: Rvb1/Rvb2 are identified as potential co-transcriptionally loaded protein factors on the alternative glucose metabolism genes

      2. "These two proteins are generally thought to act on DNA but have been found to be core components of mammalian and yeast cytoplasmic stress granules"
         • These two papers extensively show Rvb1/Rvb2 localization to granules/condensates under stress/nutrient starvation conditions and should be cited. The Rvb1/2 foci were named Rbits:
           i. Rizzolo et al. 2017 https://doi.org/10.1016/j.celrep.2017.08.074
           ii. Kakihara et al. 2014 https://doi.org/10.1186/s13059-014-0404-4
      3. "a portion of them becomes localized to cytoplasmic granules that are not P-bodies in both 15-minute and 30-minute glucose starvation conditions (Figure 1-figure supplement 2)"
         • Supplement figure 2 only includes results under 30-min glucose starvation, no 15-min data was shown
      4. Figure 1C, unclear whether p-value here is for FC of GLC3 over HSP or FC of GLC3 over CRAPome. In addition, both FC datasets should have p-values.

        Section: Rvb1/Rvb2 are enriched at the promoters of endogenous alternative glucose metabolism genes

      5. "Here, we performed ChIP-seq on Rvb1, Rvb2, and the negative control Pgk1 in 10 minutes of glucose starvation (Figure 2-figure supplement 3, left)"
         • Unclear what figure is being referred to, panel A or panel B?
      6. "Structural studies have shown that Rvb1/Rvb2 can form a dodecamer complex. Their overlapped enrichment also indicates that Rvb1 and Rvb2 may function together."
         • They function together regardless of forming a dodecamer or not, as they assemble as heterohexamers

        Section: Engineered Rvb1/Rvb2 tethering to mRNAs directs the cytoplasmic localization and repressed translation

      7. Does binding of any protein to PP7 loop in this construct alter cytoplasmic fate? A control such as GFP-CP or any other protein attached to CP should be used.
      8. No statistical analysis was done for Figure 4E quantification
      9. "Results showed that after replenishing the glucose to the starved cells, the translation of those genes is quickly induced, with an ~8-fold increase in ribosome occupancy 5 minutes after glucose readdition for Class II mRNAs (Figure 4-figure supplement 9)"
         • Would be important to see this recovery (increase in translation after glucose replenishment) in one of the reporter constructs used in the paper, such as GL3 promoter driven CFP.

        Section: Engineered Rvb1/Rvb2 binding to mRNAs increases the transcription of corresponding genes

      10. How many biological replicates are in Figure 5B? There do not seem to be any error bars/gray sections indicating sample variation. A p-value was also not calculated.

      Reviewer #1 (Significance (Required)):

      This is a very interesting manuscript that ascribes yet another function to the highly conserved RVB1/2 AAA+ ATPases.

      **Referee Cross-commenting**

      All reviewers agree that this is an interesting paper. However, the reviewers do suggest specific experiments to verify some of the results. Carrying out these experiments will definitely improve the paper.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In their manuscript entitled "Rvb1/Rvb2 proteins couple transcription and translation during glucose starvation", Chen and co-authors use genetics and microscopy to demonstrate how budding yeast regulate cytoplasmic translation through promoter sequences and two conserved ATPases, Rvb1 and Rvb2, during nutrient stress. The authors show that these two ATPases repress translation of target mRNAs and then propose that these two proteins also recruit mRNAs to P bodies. The authors show that Rvb1/2 preferentially bind in the presence of Class II promoters using CoTrIP, that Rvb1/2 bind specifically at Class II promoters using ChIP-Seq, that Rvb1/2 are bound to transcripts with Class II promoters using RIP-Seq, that tethering of Rvb1/2 to a transcript decreases its translatability, and that Rvb1/2 binding to a transcript increases its transcript levels by increasing transcription and not by slowing mRNA decay.

      The CoTrIP experiment is clever and for the most part well executed. The key conclusions are largely convincing but some clarifications are nevertheless needed (see below). Overall, this paper is well written with well executed experiments that largely support the authors' model. No major additional experiments are needed to support the claims of the paper. There are a few minor concerns that should be addressed before this manuscript gets published. These are:

      Minor comments:

      1) Are Rvb1/2 components (enriched in) of P bodies? The model proposed by the authors suggests this but no data is shown.

      2) Fig. 1A: The model proposed by the authors indicates that Rvb1/2 and other proteins are recruited to the mRNAs in a promoter-dependent manner and not mRNA sequence dependent manner. This is largely supported by the data presented in the paper. However the authors should also discuss the possibility that RNA sequences could nevertheless contribute as only a uniform ORF has been tested. Could the promoter recruit Rvb1/2 similarly regardless of the ORF sequence tested? Please provide a sequence of the uniform ORF, discuss what this "uniformity" means and how a change in RNA sequence could affect the outcome of the experiment outlined in Fig. 1A.

      3) Fig. 2: The authors use Pgk1 in their ChIP control but this is not the appropriate control for the experiment as Pgk1 is not nuclear and thus cannot demonstrate non-specific interaction with genetic regions of tested genes. Regardless, the data is convincing enough to support the model that Rvb1/2 are specifically recruited to the promoters of Class II stress-induced genes and not Class I stress-induced genes. GFP-NLS would be a better control. The authors should discuss in their materials and methods section why they chose a cytoplasmic protein for their normalization control but preferably perform ChIP with GFP-NLS or other nuclear protein that could bind to chromatin non-specifically to further demonstrate the specificity of Rvb1/2 enrichment at Class II promoters.

      4) The authors claim that Rvb1/Rvb2 binding to transcripts leads to formation of granules that are non-colocalized with P-bodies and instead co-localized to SGs, but no SG fluorescent marker is used to demonstrate this claim. The authors should provide this data or remove this claim from their manuscript.

      5) Fluorescent images are fuzzy, very small and difficult to interpret. mRNA puncta are difficult to observe and it is hard to conclude which green puncta colocalize with P bodies and which do not (and how frequently). It is difficult to differentiate between the cytoplasm and nucleus. Consider adding DAPI overlay.

      6) The relevance of Figure 2B is not clear - please discuss.

      7) Fig 5A modeling adds little supporting evidence to the entire figure. The experimental results are more convincing. Consider moving to the Supplement.

      8) Fig. 4 and 3B. The authors suggest that Rvb1/2 loaded by the promoters onto the mRNA determine accumulation of mRNAs to P bodies. To test this model, the authors tether Rvb1/2 onto the mRNA using MS2-MCP system and then look for co-localization of the mRNA with P bodies. However, if the authors' model is correct, this experiment could have been achieved already using the constructs in Fig. 3B. The authors should look at the P body localization pattern using chimeras used in Fig. 3B.

      9) Fig. 6: The authors present a model where mRNAs transcribed from Class II promoters are decorated with Rvb1/2 co-transcriptionally, exported into the cytoplasm, recruited to P bodies and translationally repressed. However, this model is not fully supported by the data shown. Specifically, the authors have not shown that localization of mRNAs to P bodies induces translational repression or whether the recruitment is a consequence of this repression. The authors should revise their model to reflect this uncertainty. Also, the numbering of steps 1, 2 and 3 is confusing. Does it imply a temporal sequence? Some of these steps could be occurring simultaneously (like 1 and 3). How does step 3 follow from step 2? Please clarify this model.

      10) Consider showing data-points in Fig 1 figure supplement 1. The box/whisker plot doesn't give a good sense of the enrichment alone.

      11) Figure 1 Fig supplement 2 shows that the fluorophore seems to influence the % of cells with foci. Why is this the case?

      12) List gene names in Fig 2 fig supp 5.

      13) Throughout the paper the graph axis labels are very small and difficult to read.

      14) Figure 4 fig supplement 7C and 8E: on the y-axis the legend says proportion of cells (%), so the values on the y-axis should be 25, 50 and 75 rather than 0.25, 0.50 and 0.75.

      15) The last paragraph of the Introduction (page 2) detailed how Rvb1/Rvb2 are core components of the stress granule. Yet most experiments were conducted to relate Rvb1/Rvb2 with P-bodies. Maybe some information about the known roles Rvb1/Rvb2 play in the P-bodies in the Introduction section could help.

      Reviewer #2 (Significance (Required)):

      Ruvb helicase has been shown to regulate the formation of stress granules in human U2OS cells during oxidative stress (Parker lab, Cell, 2016). Thus, the authors suggest that Rvb proteins could have a broad and conserved role in the formation of RNA granules, which advances our understanding of how biomolecular condensates could form. In addition, translationally-repressed mRNAs have been shown to preferentially recruit to diverse RNA granules, from stress granules and P bodies in human cells to germ granules in C. elegans and Drosophila. These observations have gained considerable attention in the past 5 years and the exact molecular principles behind this phenomenon are not entirely clear. Long and exposed RNA sequences are thought to be sufficient for this enrichment. The authors however suggest that specific proteins (Rvb1/2) could also trigger enrichment either directly by interacting with P bodies or indirectly by repressing translation and exposing RNA sequences. This finding will be particularly relevant to the field of biomolecular condensates. My expertise is in the area of RNA biology, mRNA decay, RNA granules and mRNA localization.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Dr. Brian Zid has previously published in Nature that, in response to glucose starvation, promoters of some genes ("class II") can control synthesis of mRNAs that are sequestered in cytoplasmic P bodies or stress granules, away from the translation apparatus. In this paper, his group reports on the underlying mechanism. They have found proteins that preferentially bind class II promoters as well as their transcripts and are capable of repressing their translation and stimulating their assembly with P bodies. They found a correlation between the capacity of Rvb1/2 to bind promoters and to bind mRNAs. Using a tethering technique, they found that Rvb1/Rvb2 recruitment to a reporter mRNA (not class II) led to the association of the transcript with PBs and to its translational repression. Interestingly, binding of Rvb1/Rvb2 to the studied transcript increased transcription of its own gene, probably by remodeling the nearby chromatin. The paper uncovers a mechanism to sequester mRNAs as translationally repressed in RNA granules during starvation and warrants publication in a good journal, after responding to the various comments below.

      1. CoTrIP is a method to identify proteins that differentially bind plasmids carrying different promoters/genes. However, the claim that it identifies proteins bound to nascent mRNAs is an overreach, as the proteins bind both DNA and RNA and the purified plasmid contains both types of nucleic acids. Therefore, the title of section 1 ("Rvb1/Rvb2 are identified as potential co-transcriptionally loaded protein factors on the alternative glucose metabolism genes") should be changed to something like: Rvb1/Rvb2 are identified as proteins that are co-purified with a plasmid expressing alternative glucose metabolism genes. Description of CoTrIP and its results should be discussed throughout the manuscript accordingly.

      2. The engineered Rvb1/Rvb2 tethering to mRNAs of choice is a potentially convincing way to show the causative effect of Rvb1/Rvb2 on RNA performance. Using this method, the authors show that attachment of Rvb1/Rvb2 to an engineered mRNA mediates its association with granules and inhibits its translation. However, this experiment takes Rvb1/2 out of its natural context such that its behavior in this case may not be representative of its endogenous function. The authors are encouraged to support their results by depleting Rvbs with AID and examining the outcome of this depletion on PB formation and translation of class II genes (and class I as controls).

      3. The tethering experiments, shown in Fig. 4, would be more convincing with an additional control. To rule out the possibility that any bulky protein that is recruited to the 3'-UTR by the PP7 element affects translation (not an unlikely possibility), the authors may want to consider fusing an irrelevant protein (e.g., Pgk1p) to CP, in place of Rvb1/2.

      4. The proposal that Rvb1 binds class II transcripts during transcription is a plausible possibility (which I personally believe to represent the reality), but by no means demonstrated. This should be clearly addressed in the paper.

      5. An optional suggestion: The paper can be upgraded by performing ribosome profiling, as shown in Supplemental Fig. 9, after a short depletion of Rvb1/2 by AID (see comment 2). This, in combination with the results already shown in Supp Fig. 9, can demonstrate the role of Rvb1/2 in mRNA storage in granules and in translation shortly after glucose refeeding. The large data sets thus produced (in particular the ratio between depleted and non-depleted signal per each gene) can be used to try to correlate the extent of ribosome occupancy (or the above mentioned ratio) with cis-element(s) or known trans-acting elements within the promoters. This may identify elements within the promoters that recruit (directly or indirectly) Rvb1/2. If successful, it can pave the way to demonstrate co-transcriptional RNA binding. I also suggest moving Supp Fig. 9 as an additional panel of the main Fig. 4.

      Minor point:

      1. The original reference about "mRNA imprinting" was published by Choder in Cellular logistics 2011.
      2. The graph in 5B does not have error bars and the number of replicates is unclear.

      Reviewer #3 (Significance (Required)):

      The paper uncovers a mechanism to sequester mRNAs as translationally repressed in RNA granules during starvation. This significantly advances our understanding of how gene expression in yeast responds to the environment and warrants publication in a good journal, after responding to the various comments indicated above. My expertise is regulation of gene expression.

      **Referee Cross-commenting**

      In general all reviewers feel that the paper deals with a significant issue, each from his/her point of view, and is basically of high quality.

      I concur with all the comments of Reviewers 1 and 2. In particular, two comments drew my attention. Reviewer 1: It would be important to see an increase in translation after glucose replenishment in one of the reporter constructs used in the paper, such as GL3 promoter driven CFP. Reviewer 2: The authors should look at the P body localization pattern using chimeras used in Fig. 3B.

      There are comments common to more than one reviewer.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      Dr. Brian Zid has previously published in Nature that, in response to glucose starvation, promoters of some genes ("class II") can control synthesis of mRNAs that are sequestered in cytoplasmic P bodies or stress granules, away from the translation apparatus. In this paper, his group reports on the underlying mechanism. They have found proteins that preferentially bind class II promoters as well as their transcripts and are capable of repressing their translation and stimulating their assembly with P bodies. They found a correlation between the capacity of Rvb1/2 to bind promoters and to bind mRNAs. Using a tethering technique, they found that Rvb1/Rvb2 recruitment to a reporter mRNA (not class II) led to the association of the transcript with PBs and to its translational repression. Interestingly, binding of Rvb1/Rvb2 to the studied transcript increased transcription of its own gene, probably by remodeling the nearby chromatin. The paper uncovers a mechanism to sequester mRNAs as translationally repressed in RNA granules during starvation and warrants publication in a good journal, after responding to the various comments below.

      1. CoTrIP is a method to identify proteins that differentially bind plasmids carrying different promoters/genes. However, the claim that it identifies proteins bound to nascent mRNAs is an overreach, as the proteins bind both DNA and RNA and the purified plasmid contains both types of nucleic acids. Therefore, the title of section 1 ("Rvb1/Rvb2 are identified as potential co-transcriptionally loaded protein factors on the alternative glucose metabolism genes") should be changed to something like: Rvb1/Rvb2 are identified as proteins that are co-purified with a plasmid expressing alternative glucose metabolism genes. Description of CoTrIP and its results should be discussed throughout the manuscript accordingly.
      2. The engineered Rvb1/Rvb2 tethering to mRNAs of choice is a potentially convincing way to show the causative effect of Rvb1/Rvb2 on RNA performance. Using this method, the authors show that attachment of Rvb1/Rvb2 to an engineered mRNA mediates its association with granules and inhibits its translation. However, this experiment takes Rvb1/2 out of its natural context such that its behavior in this case may not be representative of its endogenous function. The authors are encouraged to support their results by depleting Rvbs with AID and examining the outcome of this depletion on PB formation and translation of class II genes (and class I as controls).
      3. The tethering experiments, shown in Fig. 4, would be more convincing with an additional control. To rule out the possibility that any bulky protein that is recruited to the 3'-UTR by the PP7 element affects translation (not an unlikely possibility), the authors may want to consider fusing an irrelevant protein (e.g., Pgk1p) to CP, in place of Rvb1/2.
      4. The proposal that Rvb1 binds class II transcripts during transcription is a plausible possibility (which I personally believe to represent the reality), but by no means demonstrated. This should be clearly addressed in the paper.
      5. An optional suggestion: The paper can be upgraded by performing ribosome profiling, as shown in Supplemental Fig. 9, after a short depletion of Rvb1/2 by AID (see comment 2). This, in combination with the results already shown in Supp Fig. 9, can demonstrate the role of Rvb1/2 in mRNA storage in granules and in translation shortly after glucose refeeding. The large data sets thus produced (in particular the ratio between depleted and non-depleted signal per each gene) can be used to try to correlate the extent of ribosome occupancy (or the above mentioned ratio) with cis-element(s) or known trans-acting elements within the promoters. This may identify elements within the promoters that recruit (directly or indirectly) Rvb1/2. If successful, it can pave the way to demonstrate co-transcriptional RNA binding. I also suggest moving Supp Fig. 9 as an additional panel of the main Fig. 4.

      Minor point:

      1. The original reference about "mRNA imprinting" was published by Choder in Cellular logistics 2011.
      2. The graph in 5B does not have error bars and the number of replicates is unclear.

      Significance

      The paper uncovers a mechanism to sequester mRNAs as translationally repressed in RNA granules during starvation. This significantly advances our understanding of how gene expression in yeast responds to the environment and warrants publication in a good journal, after responding to the various comments indicated above.

      My expertise is regulation of gene expression.

      Referee Cross-commenting

      In general all reviewers feel that the paper deals with a significant issue, each from his/her point of view, and is basically of high quality.

      I concur with all the comments of Reviewers 1 and 2. In particular, two comments drew my attention. Reviewer 1: It would be important to see an increase in translation after glucose replenishment in one of the reporter constructs used in the paper, such as GL3 promoter driven CFP. Reviewer 2: The authors should look at the P body localization pattern using chimeras used in Fig. 3B.

      There are comments common to more than one reviewer.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In their manuscript entitled "Rvb1/Rvb2 proteins couple transcription and translation during glucose starvation", Chen and co-authors use genetics and microscopy to demonstrate how budding yeast regulate cytoplasmic translation through promoter sequences and two conserved ATPases, Rvb1 and Rvb2, during nutrient stress. The authors show that these two ATPases repress translation of target mRNAs and then propose that these two proteins also recruit mRNAs to P bodies. The authors show that Rvb1/2 preferentially bind in the presence of Class II promoters using CoTrIP, that Rvb1/2 bind specifically at Class II promoters using ChIP-Seq, that Rvb1/2 are bound to transcripts with Class II promoters using RIP-Seq, that tethering of Rvb1/2 to a transcript decreases its translatability, and that Rvb1/2 binding to a transcript increases its transcript levels by increasing transcription and not by slowing mRNA decay.

      The CoTrIP experiment is clever and for the most part well executed. The key conclusions are largely convincing but some clarifications are nevertheless needed (see below). Overall, this paper is well written with well executed experiments that largely support the authors' model. No major additional experiments are needed to support the claims of the paper. There are a few minor concerns that should be addressed before this manuscript gets published. These are:

      Minor comments:

      1) Are Rvb1/2 components (enriched in) of P bodies? The model proposed by the authors suggests this but no data is shown.

      2) Fig. 1A: The model proposed by the authors indicates that Rvb1/2 and other proteins are recruited to the mRNAs in a promoter-dependent manner and not mRNA sequence dependent manner. This is largely supported by the data presented in the paper. However the authors should also discuss the possibility that RNA sequences could nevertheless contribute as only a uniform ORF has been tested. Could the promoter recruit Rvb1/2 similarly regardless of the ORF sequence tested? Please provide a sequence of the uniform ORF, discuss what this "uniformity" means and how a change in RNA sequence could affect the outcome of the experiment outlined in Fig. 1A.

      3) Fig. 2: The authors use Pgk1 in their ChIP control but this is not the appropriate control for the experiment as Pgk1 is not nuclear and thus cannot demonstrate non-specific interaction with genetic regions of tested genes. Regardless, the data is convincing enough to support the model that Rvb1/2 are specifically recruited to the promoters of Class II stress-induced genes and not Class I stress-induced genes. GFP-NLS would be a better control. The authors should discuss in their materials and methods section why they chose a cytoplasmic protein for their normalization control but preferably perform ChIP with GFP-NLS or other nuclear protein that could bind to chromatin non-specifically to further demonstrate the specificity of Rvb1/2 enrichment at Class II promoters.

      4) The authors claim that Rvb1/Rvb2 binding to transcripts leads to formation of granules that are non-colocalized with P-bodies and instead co-localized to SGs, but no SG fluorescent marker is used to demonstrate this claim. The authors should provide this data or remove this claim from their manuscript.

      5) Fluorescent images are fuzzy, very small and difficult to interpret. mRNA puncta are difficult to observe and it is hard to conclude which green puncta colocalize with P bodies and which do not (and how frequently). It is difficult to differentiate between the cytoplasm and nucleus. Consider adding DAPI overlay.

      6) The relevance of Figure 2B is not clear - please discuss.

      7) Fig 5A modeling adds little supporting evidence to the entire figure. The experimental results are more convincing. Consider moving to the Supplement.

      8) Fig. 4 and 3B. The authors suggest that Rvb1/2 loaded by the promoters onto the mRNA determine accumulation of mRNAs to P bodies. To test this model, the authors tether Rvb1/2 onto the mRNA using MS2-MCP system and then look for co-localization of the mRNA with P bodies. However, if the authors' model is correct, this experiment could have been achieved already using the constructs in Fig. 3B. The authors should look at the P body localization pattern using chimeras used in Fig. 3B.

      9) Fig. 6: The authors present a model where mRNAs transcribed from Class II promoters are decorated with Rvb1/2 co-transcriptionally, exported into the cytoplasm, recruited to P bodies and translationally repressed. However, this model is not fully supported by the data shown. Specifically, the authors have not shown that localization of mRNAs to P bodies induces translational repression or whether the recruitment is a consequence of this repression. The authors should revise their model to reflect this uncertainty. Also, the numbering of steps 1, 2 and 3 is confusing. Does it imply a temporal sequence? Some of these steps could be occurring simultaneously (like 1 and 3). How does step 3 follow from step 2? Please clarify this model.

      10) Consider showing data-points in Fig 1 figure supplement 1. The box/whisker plot doesn't give a good sense of the enrichment alone.

      11) Figure 1 Fig supplement 2 shows that the fluorophore seems to influence the % of cells with foci. Why is this the case?

      12) List gene names in Fig 2 fig supp 5.

      13) Throughout the paper the graph axis labels are very small and difficult to read.

      14) Figure 4 fig supplement 7C and 8E: on the y-axis the legend says proportion of cells (%), so the values on the y-axis should be 25, 50 and 75 rather than 0.25, 0.50 and 0.75.

      15) The last paragraph of the Introduction (page 2) detailed how Rvb1/Rvb2 are core components of the stress granule. Yet most experiments were conducted to relate Rvb1/Rvb2 with P-bodies. Maybe some information about the known roles Rvb1/Rvb2 play in the P-bodies in the Introduction section could help.

      Significance

      Ruvb helicase has been shown to regulate the formation of stress granules in human U2OS cells during oxidative stress (Parker lab, Cell, 2016). Thus, the authors suggest that Rvb proteins could have a broad and conserved role in the formation of RNA granules, which advances our understanding of how biomolecular condensates could form.

      In addition, translationally-repressed mRNAs have been shown to preferentially recruit to diverse RNA granules, from stress granules and P bodies in human cells to germ granules in C. elegans and Drosophila. These observations have gained considerable attention in the past 5 years and the exact molecular principles behind this phenomenon are not entirely clear. Long and exposed RNA sequences are thought to be sufficient for this enrichment. The authors however suggest that specific proteins (Rvb1/2) could also trigger enrichment either directly by interacting with P bodies or indirectly by repressing translation and exposing RNA sequences. This finding will be particularly relevant to the field of biomolecular condensates.

      My expertise is in the area of RNA biology, mRNA decay, RNA granules and mRNA localization.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This is a very interesting paper with novel observations. The authors find that, in yeast, Rvb1/2 AAA+ ATPases couple transcription, mRNA granular localization, and mRNAs translatability during glucose starvation. Rvb1 and Rvb2 were found to be enriched at the promoters and mRNAs of genes involved in alternative glucose metabolism pathways that are transcriptionally upregulated but translationally downregulated during glucose starvation.

      The following are some comments

      Introduction

      1. "Structural studies have shown that they form a dodecamer comprised of a stacked Rvb1 hexametric ring and a Rvb2 hexametric ring." o Rvb1 and Rvb2 form a heterohexameric ring with alternating arrangement (not homohexamers that stack on top of each other as suggested by this sentence) o In yeast, they oligomerize mostly as single hexameric rings, with dodecamers reported to be less than 10% in frequency in vivo (e.g. Jeganathan et al. 2015 https://doi.org/10.1016/j.jmb.2015.01.010)

      Results Section: Rvb1/Rvb2 are identified as potential co-transcriptionally loaded protein factors on the alternative glucose metabolism genes

      1. "These two proteins are generally thought to act on DNA but have been found to be core components of mammalian and yeast cytoplasmic stress granules" • These two papers extensively show Rvb1/Rvb2 localization to granules/condensates under stress/nutrient starvation conditions and should be cited. The Rvb1/2 foci were named Rbits: i. Rizzolo et al. 2017 https://doi.org/10.1016/j.celrep.2017.08.074 ii. Kakihara et al. 2014 https://doi.org/10.1186/s13059-014-0404-4
      2. "a portion of them becomes localized to cytoplasmic granules that are not P-bodies in both 15-minute and 30-minute glucose starvation conditions (Figure 1-figure supplement 2)" • Supplement figure 2 only includes results under 30-min glucose starvation, no 15-min data was shown
      3. Figure 1C, unclear whether p-value here is for FC of GLC3 over HSP or FC of GLC3 over CRAPome. In addition, both FC datasets should have p-values.

      Section: Rvb1/Rvb2 are enriched at the promoters of endogenous alternative glucose metabolism genes

      1. "Here, we performed ChIP-seq on Rvb1, Rvb2, and the negative control Pgk1 in 10 minutes of glucose starvation (Figure 2-figure supplement 3, left)" • Unclear what figure is being referred to, panel A or panel B?
      2. "Structural studies have shown that Rvb1/Rvb2 can form a dodecamer complex. Their overlapped enrichment also indicates that Rvb1 and Rvb2 may function together." • They function together regardless of forming a dodecamer or not, as they assemble as heterohexamers

      Section: Engineered Rvb1/Rvb2 tethering to mRNAs directs the cytoplasmic localization and repressed translation

      1. Does binding of any protein to PP7 loop in this construct alter cytoplasmic fate? A control such as GFP-CP or any other protein attached to CP should be used.
      2. No statistical analysis was done for Figure 4E quantification
      3. "Results showed that after replenishing the glucose to the starved cells, the translation of those genes is quickly induced, with an ~8-fold increase in ribosome occupancy 5 minutes after glucose readdition for Class II mRNAs (Figure 4-figure supplement 9)" o It would be important to see this recovery (increase in translation after glucose replenishment) in one of the reporter constructs used in the paper, such as GLC3 promoter-driven CFP.

      Section: Engineered Rvb1/Rvb2 binding to mRNAs increases the transcription of corresponding genes

      1. How many biological replicates are in Figure 5B? There do not seem to be any error bars/gray sections indicating sample variation. P-value was also not calculated.

      Significance

      This is a very interesting manuscript that ascribes yet another function of the highly conserved RVB1/2 AAA+ ATPases.

      Referee Cross-commenting

      All reviewers agree that this is an interesting paper. However, the reviewers do suggest specific experiments to verify some of the results. Carrying out these experiments will definitely improve the paper.

  3. Dec 2021
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      RC-2021-00739

      “Plasma membrane damage limits replicative lifespan in yeast and induces premature senescence in human fibroblasts”

      Kono et al.

      Point-by-point response

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this article, Kono et al worked on cellular outcomes induced by plasma membrane damage (PMD) in yeast and in human cells. Plasma membrane damage is induced by some stresses and alteration of its repair can lead to some diseases. Globally little is known about PMD. Authors observed that PMD induced by a low concentration of SDS in yeast and in human cells can limit their replicative lifespan. A genetic screen in yeast has identified the endosomal sorting complexes required for transport (ESCRT) genes as required for PMD response. In human cells, the authors observed that PMD-induced premature senescence is dependent on p53 activity but independent of DNA damage. This work sounds novel and interesting in the context of senescence in human cells. Nevertheless, there are some limits and questions that should be addressed to strongly improve this interesting work.

      Thank you very much for reviewing our manuscript. We are delighted to know that reviewer #1 thinks our work is novel and interesting.

      Major comments:

      - can the authors describe and explain what are common and divergent between replicative lifespan in yeast and human cells, for instance on telomere biology? It is particularly important as the authors jumped from replicative lifespan in yeast to replicative senescence in human cells.

      Thank you for raising this point. Telomere biology in yeast and human cells shares at least three central mechanisms, but obviously there are limitations to using yeast as a model. We included this point in the discussion (page 12, line 10-22).

      - a better characterization of premature senescence induced by SDS is required to delineate this new type of senescence: for instance, SASP content characterization and EdU incorporation assays to properly demonstrate the proliferation arrest.

      According to the reviewer’s suggestion, we added SASP qPCR results (Fig. 3I and J). We also performed EdU incorporation assays and included them in the revised manuscript (Fig. 3F).

      - the authors claimed that PMD-induced senescence is DNA damage-independent and that PMD could occur during replicative senescence. As mentioned in some references cited by the authors, replicative senescence normally occurs in response to telomere shortening and this shortening results in a DNA damage response which initiates senescence (ref 23). So authors should formulate their conclusions and discussion in the light of these well described results and tone down some of their conclusions.

      We agree with the reviewer’s point that the best-studied mechanism underlying replicative senescence is telomere shortening (Blackburn, 2001; Shay and Wright, 2001) and telomere-dependent replicative senescence is mediated by the DNA repair pathway (d'Adda di Fagagna et al., 2003). We changed the title, abstract, and introduction (title, “Plasma membrane damage limits replicative lifespan in yeast and induces premature senescence in human fibroblasts”, abstract page 2 line 12-13, introduction page 4 line 1-2). We hope the new sentences describe our findings more precisely.

      In that context it will be also interesting to investigate whether PMD occurs in other types of cellular senescence (different inducers and different cell types).

      Thank you very much for the suggestion. We performed the experiment. The results indicate that PMD does not occur in DNA damage (doxorubicin)-dependent premature senescence (Fig. S8A and B).

      - this story will be strongly improved if the authors provide some mechanistic insights. In particular if they can link their observations in yeast to their observation in human cells. For instance, does ESCRT impact SDS-induced senescence in human cells? Can this be linked to p53 activity?

      Thank you very much for the suggestion. According to the comment, we tested whether VPS4A/B overexpression extends replicative lifespan in human cells analogous to what we observed in yeast. Unfortunately, VPS4A/B overexpression from the CMV promoter gradually decreased cell viability within several days. Therefore, we could not draw conclusions about their role in lifespan extension.

      • in the discussion section, the authors discuss calcium signaling as a possible actor of PMD-induced p53 activation, can they show some data in that direction at least by measuring cytosolic calcium levels during PMD-induced senescence.

      According to the reviewer’s suggestion, we measured cytosolic calcium levels and included them in our revised manuscript (Fig. 5A-C). Our new results indicate that the cytosolic Ca2+ is increased after SDS treatment. We also added new figures confirming the previously reported result that KCl-dependent Ca2+-influx is sufficient for senescence induction (Fig. 5D-F). To test whether Ca2+ is required for PMDS, we treated the cells with both SDS and Ca2+ chelators but the cells ruptured immediately due to the failure of membrane resealing. Therefore, although it is likely that Ca2+ is required for PMDS, we could not dissect Ca2+’s function in membrane resealing and premature senescence. We will intensively analyze this point in our next paper.

      - ESCRT is involved in nuclear envelope repair. Can the authors rule out any effects of SDS on nuclear envelopes as nuclear envelope alterations can be involved in cellular senescence?

      We thank reviewer #1 for raising an important point. We can rule out the possibility based on the following evidence. Nuclear deformation and subsequent upregulation of DNA damage signaling is a striking feature of nuclear envelope damage as observed in the premature aging diseases known as laminopathies (Eriksson et al., 2003; De Sandre-Giovannoli et al, 2003; Earle et al., 2019). We found that SDS treatment did not induce nuclear envelope deformation (Fig. 1F and Fig. S2A). Moreover, ESCRT did not accumulate at the nuclear membrane after SDS treatment (Fig. S3D, green). These results suggest that the SDS-dependent cellular senescence cannot be attributed to nuclear envelope damage. We added sentences to the discussion of the revised manuscript (page 12, line 1-5).

      Minor comments:

      • images are used twice between Figure 1F and S2A, please replace the images to avoid this.

      According to the reviewer’s comment, we replaced the images.

      - in Figure 3 it will be better to present cumulative population doublings which is a more classical way to present these results.

      According to the reviewer’s comment, we replaced the graphs.

      - several human cell lines are used but most of the time for different experiments. It will be good to show that at least one of them displays the expected results with the different assays.

      According to the reviewer’s comment, we added Fig. S7 to show that WI-38 cells also show PMDS. Thank you again for reviewing our manuscript despite your hectic schedule.

      Reviewer #1 (Significance (Required)):

      see above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Makoto Nakanishi and co-workers use SDS (and EGTA) to induce plasma membrane damage (PMD) on budding yeast cells and human fibroblasts. Their results correlate SDS induced PMD with reduced replicative lifespan of budding yeast and p53 mediated senescence in human fibroblasts.

      Using genetic screens in budding yeast, 48 SDS sensitive mutants were identified, including a large set of ESCRT mutants, V-ATPase mutants, and several mutants deficient in metabolic enzymes (amino acid metabolism and lipid metabolism). Three of the SDS sensitive yeast mutants showed a reduced replicative lifespan.

      SDS induced PMD on human fibroblast triggered p53 induction (without concomitant DNA damage) and subsequent p53 mediated senescence. SDS induced PMD also induced phosphatidyl-serine (PS) externalization of PM projections that co-localized with the ESCRT-III subunit CHMP4a.

      These results describe a potentially interesting and novel pathophysiological effect of PMD.

      Thank you very much for serving as a reviewer. We are delighted that reviewer #2 considers our work to be novel and interesting.

      Major points.

      While the description of the PMD induced phenotypes in yeast and fibroblasts is interesting, mechanistic insight is not provided. Perhaps the phenotypic description could be solidified by addressing the following points:

      1. Quantification of PMD using state-of-the-art FACS analysis in yeast cells and human fibroblasts e.g. using PI together with Annexin V.

      Thank you so much for the valuable suggestion. According to the comment, we performed these experiments. We could successfully quantify the DAPI penetration in normal human fibroblasts by FACS (added to the revised manuscript as Fig. S2D). In contrast, we failed to detect the increase of Annexin V (PS externalization signals) by FACS, probably due to the detection limit of the FACS machine we used (please see below). Please note that the signals at the PS externalizing spots after PMD are extremely weak; they cannot be compared with the massive PS externalization during apoptosis. Instead, we quantified the Annexin V signals of entire cells using a Zeiss inverted confocal microscope (LSM780) and Zen blue software and included them in Fig. S3B. We hope these new data serve as objective evidence supporting our conclusion.

      • The results from the yeast screens should be better characterized and explained.

      Thank you very much for the suggestion. According to the reviewer’s comment, we performed characterization of the screening hits and identified four novel mechanisms involved in PMD response in budding yeast (Fig. S5, S6, and Supplementary texts).

      Why do the authors focus on 'replicative lifespan' rather than on e.g. 'nutrient-utilization'.

      Thank you for the comment. Indeed, we are also interested in the relation of PMD and other cellular processes, including nutrient utilization. The project is on-going. In this manuscript, we would like to focus on the point that the PMD response and the replicative lifespan regulation share some key regulators.

      In principle, this is fine with me, given that there are only 48 hits, but then the authors could rather argue e.g.: that they look into ESCRT mutants because the ESCRTs have been already implicated in resealing the PM in a Ca2+ dependent manner.

      Thank you for the comment. In the revised manuscript, we edited the text and emphasized that ESCRT was known to be involved in membrane repair in higher eukaryotes (page 6 line 25-page 7 line 2). Here, we looked into ESCRT to test our working hypothesis that the PMD responses and the replicative lifespan regulation could share part of the fundamental mechanisms.

      To drive home the point that the ESCRTs (but also Vps34 and Erg2) limit the replicative life span of budding yeast due to the accumulation of PMD, this should be experimentally tested (e.g. compare replicative life span of the mutants +/- SDS to WT cells +/- SDS). Snf7, Vps34 and Erg2 mutants could affect the replicative life-span in a number of ways that are independent from PMD.

      Thank you very much for raising this point. We performed the experiment. The result was that all mutants (snf7, vps34, and erg2) did not divide at all in the presence of SDS (replicative lifespan=0), consistent with the screening strategy that we isolated the mutants with absolutely no growth on SDS plates (Fig. S4). These results were added to the result section of the revised manuscript (page 7, line 11-13).

      • The rationale for over-expressing Vps4 is not clear to me. Vps4 is most likely not the rate limiting factor for the ESCRT machinery under these conditions.

      Thank you for asking this question. Vps4 is an AAA-ATPase promoting disassembly of the structural components (ESCRT-III filaments) and thus critical for pinching off the membrane. The most straightforward rate-limiting factor could be ATP but obviously it is nonspecific, having too many downstream consequences. Therefore, we decided to mildly overexpress VPS4 from the TEF1 promoter and luckily the strategy worked well.

      Perhaps it would be more telling to overexpress Vps4 in a snf7 mutant and test if it still improves the replicative life-span?

      Thank you for the comment. According to the comment, we constructed pTEF1-VPS4 in a snf7 mutant and found that the strain is lethal. Thus, the lifespan extension by pTEF1-VPS4 is at least partly mediated by SNF7. In addition, the synthetic lethality suggests that pTEF1-VPS4 also has a deleterious effect on some of the ESCRT functions. That makes sense because ESCRT is involved in many cellular processes including nuclear membrane repair, lysosome repair, multivesicular body formation, cytokinesis, and exosome production.

      • The finding that PMD induces p53 mediated senescence in fibroblasts is an important initial finding, as is the observation of the formation of PM extrusions that contain ESCRTs and externalize PS. Unfortunately, these experiments also remain rather descriptive. Many questions remain open: a. How is p53 activated? b. Are these 'protrusions' formed by the ESCRTs? c. Are the protrusions essential for entry into senescence or a consequence?

      We cannot thank the reviewer enough for these fascinating suggestions. We are thrilled to tackle these questions. Using mRNA seq and pathway analysis, we identified upstream regulators of p53 during PMDS. We are ready to submit this as an independent manuscript because it involves large datasets.

      Minor points:

      I understand that the authors can use FIB-SEM as a very powerful technique for volumetric ultrastructural analysis. I'm wondering why it was used in Figure 5c? Would 'simple' SEM not yield exactly the same results but given the relative ease of SEM, many more cells could be quantified...? FIB-SEM would actually be great for the analysis of PMD more directly, right after SDS treatment in both yeast cells (where the entire volume of the cell could be analyzed) and in human cells.

      Thank you very much for the valuable advice. As reviewer #2 may know very well, SEM requires dehydration of cells, and the data acquisition is performed under high-vacuum conditions. These two treatments significantly alter the structure of the plasma membrane of normal human fibroblasts. In contrast, for FIB-SEM observation, the cells in a culture dish can be directly fixed and embedded in resin, which preserves fine structures of the plasma membrane including soft and tiny projections (280-2500 nm). For these reasons, we decided to utilize FIB-SEM in Fig. 5C (now Fig. 6C in the revised manuscript).

      Reviewer #2 (Significance (Required)):

      The authors report very exciting observations that describe novel effects of plasma membrane damage (PMD) on cell (patho)physiology. Unfortunately, I find it difficult to connect the yeast part to the studies using human fibroblasts (except that SDS is used to cause PMD). While the description of the PMD induced phenotypes in yeast and fibroblasts is interesting, mechanistic insight (e.g. the role of the ESCRTs in PMD and induction of p53 mediated senescence) is largely lacking at the moment. Provided that a more thorough phenotypic description (see major points) and perhaps some mechanistic insight can be provided, this work will be of interest to a wide audience in molecular cell biology.

      Thank you very much for the encouraging comment. We are delighted to know that reviewer #2 values the potential impact of this work. Here, we would like to report that 1) PMD limits replicative lifespan in two independent eukaryotic cell types, and 2) the PMD response and the regulation of replicative lifespan partly share their fundamental mechanisms, especially the mechanisms underlying cell cycle checkpoint activation. This work opens up many exciting future directions and we are extensively following them up. We hope we will be able to report detailed mechanisms very soon. Thank you again for reviewing our manuscript despite your hectic schedule.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Makoto Nakanishi and co-workers use SDS (and EGTA) to induce plasma membrane damage (PMD) on budding yeast cells and human fibroblasts. Their results correlate SDS induced PMD with reduced replicative lifespan of budding yeast and p53 mediated senescence in human fibroblasts.

      Using genetic screens in budding yeast, 48 SDS sensitive mutants were identified, including a large set of ESCRT mutants, V-ATPase mutants, and several mutants deficient in metabolic enzymes (amino acid metabolism and lipid metabolism). Three of the SDS sensitive yeast mutants showed a reduced replicative lifespan.

      SDS induced PMD on human fibroblast triggered p53 induction (without concomitant DNA damage) and subsequent p53 mediated senescence. SDS induced PMD also induced phosphatidyl-serine (PS) externalization of PM projections that co-localized with the ESCRT-III subunit CHMP4a.

      These results describe a potentially interesting and novel pathophysiological effect of PMD.

      Major points.

      While the description of the PMD induced phenotypes in yeast and fibroblasts is interesting, mechanistic insight is not provided. Perhaps the phenotypic description could be solidified by addressing the following points:

      1. Quantification of PMD using state-of-the-art FACS analysis in yeast cells and human fibroblasts e.g. using PI together with Annexin V.
      2. The results from the yeast screens should be better characterized and explained. Why do the authors focus on 'replicative lifespan' rather than on e.g. 'nutrient-utilization'. In principle, this is fine with me, given that there are only 48 hits, but then the authors could rather argue e.g.: that they look into ESCRT mutants because the ESCRTs have been already implicated in resealing the PM in a Ca2+ dependent manner.
      3. To drive home the point that the ESCRTs (but also Vps34 and Erg2) limit the replicative life span of budding yeast due to the accumulation of PMD, this should be experimentally tested (e.g. compare replicative life span of the mutants +/- SDS to WT cells +/- SDS). Snf7, Vps34 and Erg2 mutants could affect the replicative life-span in a number of ways that are independent from PMD.
      4. The rationale for over-expressing Vps4 is not clear to me. Vps4 is most likely not the rate limiting factor for the ESCRT machinery under these conditions. Perhaps it would be more telling to overexpress Vps4 in a snf7 mutant and test if it still improves the replicative life-span?
      5. The finding that PMD induces p53 mediated senescence in fibroblasts is an important initial finding, as is the observation of the formation of PM extrusions that contain ESCRTs and externalize PS. Unfortunately, these experiments also remain rather descriptive. Many questions remain open: a. How is p53 activated? b. Are these 'protrusions' formed by the ESCRTs? c. Are the protrusions essential for entry into senescence or a consequence?

      Minor points:

      I understand that the authors can use FIB-SEM as a very powerful technique for volumetric ultrastructural analysis. I'm wondering why it was used in Figure 5c? Would 'simple' SEM not yield exactly the same results but given the relative ease of SEM, many more cells could be quantified...? FIB-SEM would actually be great for the analysis of PMD more directly, right after SDS treatment in both yeast cells (where the entire volume of the cell could be analyzed) and in human cells.

      Significance

      The authors report very exciting observations that describe novel effects of plasma membrane damage (PMD) on cell (patho)physiology. Unfortunately, I find it difficult to connect the yeast part to the studies using human fibroblasts (except that SDS is used to cause PMD). While the description of the PMD induced phenotypes in yeast and fibroblasts is interesting, mechanistic insight (e.g. the role of the ESCRTs in PMD and induction of p53 mediated senescence) is largely lacking at the moment. Provided that a more thorough phenotypic description (see major points) and perhaps some mechanistic insight can be provided, this work will be of interest to a wide audience in molecular cell biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this article, Kono et al worked on cellular outcomes induced by plasma membrane damage (PMD) in yeast and in human cells. Plasma membrane damage is induced by some stresses and alteration of its repair can lead to some diseases. Globally little is known about PMD. Authors observed that PMD induced by a low concentration of SDS in yeast and in human cells can limit their replicative lifespan. A genetic screen in yeast has identified the endosomal sorting complexes required for transport (ESCRT) genes as required for PMD response. In human cells, the authors observed that PMD-induced premature senescence is dependent on p53 activity but independent of DNA damage. This work sounds novel and interesting in the context of senescence in human cells. Nevertheless, there are some limits and questions that should be addressed to strongly improve this interesting work.

      Major comments:

      • can the authors describe and explain what are common and divergent between replicative lifespan in yeast and human cells, for instance on telomere biology? It is particularly important as the authors jumped from replicative lifespan in yeast to replicative senescence in human cells.
      • a better characterization of premature senescence induced by SDS is required to delineate this new type of senescence: for instance, SASP content characterization and EdU incorporation assays to properly demonstrate the proliferation arrest.
      • the authors claimed that PMD-induced senescence is DNA damage-independent and that PMD could occur during replicative senescence. As mentioned in some references cited by the authors, replicative senescence normally occurs in response to telomere shortening and this shortening results in a DNA damage response which initiates senescence (ref 23). So authors should formulate their conclusions and discussion in the light of these well described results and tone down some of their conclusions. In that context it will be also interesting to investigate whether PMD occurs in other types of cellular senescence (different inducers and different cell types).
      • this story will be strongly improved if the authors provide some mechanistic insights. In particular if they can link their observations in yeast to their observation in human cells. For instance, does ESCRT impact SDS-induced senescence in human cells? Can this be linked to p53 activity?
      • in the discussion section, the authors discuss calcium signaling as a possible actor of PMD-induced p53 activation, can they show some data in that direction at least by measuring cytosolic calcium levels during PMD-induced senescence.
      • ESCRT is involved in nuclear envelope repair. Can the authors rule out any effects of SDS on nuclear envelopes as nuclear envelope alterations can be involved in cellular senescence?

      Minor comments:

      • images are used twice between Figure 1F and S2A, please replace the images to avoid this.
      • in Figure 3 it will be better to present cumulative population doublings which is a more classical way to present these results.
      • several human cell lines are used but most of the time for different experiments. It will be good to show that at least one of them displays the expected results with the different assays.

      Significance

      see above.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their helpful comments. We believe that we will be able to address all of their concerns and suggestions. We have highlighted our responses in the revision plan and the changes we have already made to the manuscript in blue text. For figures where we have added data or analyses at the request of reviewers, we have highlighted the corresponding text in the figure legends.

      2. Description of the planned revisions

      Reviewer #1

      2- In Figure 4, the two mutations appear to have statistically differential effects on Rab5 and Rab7 puncta even though the data mean and distribution seem very similar. Interestingly, in each case the non-significant effect is associated with a smaller sample size. Given that the overall sample sizes used are rather small for such highly variable data, this could easily cause a statistical anomaly due to sampling bias. The sample size should be made uniform across all genotypes and should ideally be at least doubled.

      We will repeat this staining to at least double the n, and adjust our conclusions if need be, in the revised manuscript.

      3- Perhaps the most important issue relates to Figure 6, where the authors find that there is no sterol accumulation at 96h APF in the Vps50 mutant. However, given that the dendritic phenotype is slower to appear in this mutant compared to Vps54, are the authors sure that the accumulation is not just slower? This should be examined using the same temporal sequence used for Vps54 shown in Figure 6 C. In addition, the fact that sterol accumulation returns to normal in the Vps54 mutant at 1 day supports the notion of a delay phenotype (see point 1 above). These issues should be experimentally addressed to see if the data fully support the initial conclusions, or if the conclusions should be modified to suggest differential contribution of the two complexes to the process being studied and to a developmental delay phenotype.

      We had included the filipin staining for Vps50KO/KO at 1 day in Figure S4 A (which did not show a significant difference from control). We did not collect data for this genotype at 72hrs APF because the dendritic length phenotype didn’t appear until later, and so we did not include Vps50KO/KO in the full time-course in Fig 6 C. We will collect additional data so that we can include Vps50KO/KO at all timepoints in this figure in the revised manuscript.

      Reviewer #2

      It is stated that loss of VPS50 and VPS54 only causes dendrite morphogenesis defects. However, the corresponding supplemental figure S2c (which is not referenced in the text), is not suited to address this question. Axonal arborization, in particular terminal arbors, are not visible in samples where multiple/all c4da axons are labeled simultaneously (Fig. S2c). Analogous to the dendrite analysis of c4da neurons single cell resolution is essential to examine this in a meaningful way. Likely, however, c4da neurons may not be a good choice to address this question.

      We should be able to get single cell resolution of the c4da axon terminals using MARCM. We already have two of the knockout lines recombined with FRTs (Vps53 and Vps54) for this analysis and we will make the third recombinant line so that we can use MARCM for all three lines to examine single-cell axon morphology, as suggested.

      Overall, I am concerned whether the data shown here can be generalized. The c4da neurons are rather extreme cell types due to their very large dendritic compartment. It seems quite possible that many other neurons may not have a comparable sensitivity to the supply of lipids/sterols. This type of question can only be addressed if other types of neurons/dendrites are examined. Are class 2 or class 3 da neurons showing any defects in VPS mutants?

      Given that we see the phenotype emerge during the pupal stage, we want to analyze neurons that persist from the larval to adult stages. However, not all of the dendritic arborization neurons survive into adulthood: class I and II persist, while class III die during metamorphosis (Shimono et al., 2009). As we do not have adequate tools for studying the class II neurons, we will examine dendrite morphology of the class I neurons in larvae and adults in our knockout lines. We would be happy to look at class III neurons at the reviewer’s request, but our analysis will necessarily be limited to the larval stage.

      Reviewer #3

      • Some of the experiments include multiple genotypes and so it would be important to show all in all figures. For example, figure 4B,D show four groups but figure 4F, presumably from the same set of animals, shows only three. Addition of the rescue genotype to 4F is particularly important here so should be shown. The same concern is valid for figure 5, where puncta number and area must be available.

      The data from Fig. 4 F (using a genetically encoded marker for lysosomes, UAS-spin-RFP) are not from the same samples as Fig. 4 B and D (staining). We did not include the rescue for Fig 4. F because the lysosome marker, the rescue transgenes and the neuronal membrane marker are all on the third chromosome. We will build additional fly stocks so that we can include the rescue in experiments looking at lysosome morphology.

      • This concern is amplified by the images in figure 6 of the filipin staining, that are more obviously perinuclear. However, the two sets of images in 6A and 6D, where co-staining with Golgin245 is shown, look very different. Improved images are required and it may be helpful to use supplementary information to show additional examples of the staining.

      The images in Fig. 6 A are maximum projections of z stacks while Fig. 6 D shows single confocal planes, making it easier to see the perinuclear Golgi ring. Because other reviewers wanted some additional experiments related to Fig. 6 that we plan to incorporate into this figure in the revised manuscript, we will address this comment in a future revision and include additional images in the supplement.

      • For the lipid regulation experiments in figure 7, please use an orthogonal approach to show that the Osbp and fwd RNAi had the expected effects on lipid accumulation.

      In addition to sterol, Osbp and fwd both affect levels of PI4P at the Golgi. We have obtained a transgenic PI4P sensor that we can use to show the effect of these manipulations on this lipid as well.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1

      While the data presented clearly support a role for GARP in regulating sterol levels to support dendritic growth, they do not suffice to exclude a role for EARP, as important analyses to allow such a clear-cut conclusion are either insufficient or missing. If the authors wish to maintain this claim - as suggested by the title of the manuscript - further analyses are essential.

      We don’t mean to argue the EARP complex doesn’t contribute to dendrite development at all – we do show it contributes to development in Fig 3, as we discuss in the text. We want to argue that the GARP and EARP complexes contribute to dendrite development by distinct mechanisms. Losing the GARP complex inhibits dendrite development by means of sterol accumulation at the TGN, which is what we are trying to highlight with our title. The reduced dendrite growth that we observe in EARP deficient neurons must occur by some other as yet unknown means. We apologize for the confusion and have reworded the title to read “Sterol accumulates at the trans-Golgi in GARP complex deficient neurons during dendrite remodeling.”

      1- Figure 3E shows that whereas both Vps50 and Vps54 mutations reduce dendritic complexity, the Vps54 phenotype appears earlier (96h APF). Furthermore, at 7 days dendrites appear to grow again but at a slower rate than controls. This begs the question of whether these mutations are causing a delay rather than a block in the regrowth after pruning and whether the growth will eventually be normal a few days later or whether it will stop at some point.

      We have included data for an additional adult timepoint (21 days) in the new Fig. 3 E. We also included graphs in which we show the statistics for each genotype over time (new Fig. S2 D-F), and discuss this analysis in the text (lines 186-195). We have also included a table of the p-values for each comparison in the Supplemental Materials (Table S2). From this analysis, we conclude that there is not a developmental delay in the knockouts, but rather a decrease in growth during the 72-96hrs APF and 1-7 day windows when the control neurons grow. We are unable to draw conclusions about the rate of growth as we analyzed neurons from different samples at each developmental timepoint, and not the same neurons over time.

      Reviewer #2

      It would be important to know, whether the dendrite morphogenesis defect is indeed a developmental patterning defect or rather a "scaling" defect due to the fact that da neurons increase their size (but not necessarily their projection pattern) during larval maturation.

      We have analyzed the larval data for coverage index – neuron area/hemisegment (receptive field) area as defined in (Parrish et al., 2009) – to determine if there is a scaling defect at this stage in development. We do not observe a defect in scaling (Fig. S2 C), as discussed in lines 175-182.
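The coverage index described in this reply is a simple ratio, and a minimal sketch may make the scaling logic concrete. The function name and the area values below are hypothetical illustrations, not the authors' analysis code, and the sketch assumes both areas are measured in the same units.

```python
def coverage_index(neuron_area: float, hemisegment_area: float) -> float:
    """Ratio of the area covered by the dendritic arbor to the
    hemisegment (receptive field) area. Both in the same units."""
    if hemisegment_area <= 0:
        raise ValueError("hemisegment_area must be positive")
    return neuron_area / hemisegment_area


# Hypothetical values: an arbor that grows in proportion to its
# hemisegment keeps the same index, i.e. no scaling defect.
early = coverage_index(neuron_area=40.0, hemisegment_area=50.0)
late = coverage_index(neuron_area=160.0, hemisegment_area=200.0)
print(early, late)
```

Under this reading, a scaling defect would appear as an index that falls as the animal grows, whereas a patterning defect could leave the index unchanged while branch number or length differs.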

      Reviewer #3

      • The statistical analyses generally look appropriate but it would be critical to clarify what N means in every case. For example in figure 2 the authors state n=8 without clarifying if this is n=8 animals or n=8 neurons. N should always be the number of animals, but then the n of independent cells counted should also be indicated. Typically, one would either pre-average per genotype or use a mixed model that includes N of animals and n of cells (or similar).

      For experiments analyzing dendrite morphology, n represents the number of neurons, as we have clarified in our figure legends. As per another reviewer’s request, we will increase the n for the organelle and filipin staining in our planned revision and specify fly and cell number at that time.
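One reading of the reviewer's suggestion is to average neurons within each animal before comparing genotypes, so that N counts animals while n counts cells. The sketch below illustrates that bookkeeping only; the function name, the fly identifiers, and the measurements are hypothetical, and it is not the analysis used in the manuscript.

```python
from collections import defaultdict
from statistics import mean


def per_animal_means(measurements):
    """Collapse per-neuron values to one mean per animal.

    measurements: iterable of (genotype, animal_id, value), one per neuron.
    Returns {genotype: [mean for each animal]}.
    """
    by_animal = defaultdict(list)
    for genotype, animal_id, value in measurements:
        by_animal[(genotype, animal_id)].append(value)

    by_genotype = defaultdict(list)
    for (genotype, _animal_id), values in by_animal.items():
        by_genotype[genotype].append(mean(values))
    return dict(by_genotype)


# Hypothetical dendrite measurements (arbitrary units).
example = [
    ("control", "fly1", 10.0),
    ("control", "fly1", 12.0),
    ("control", "fly2", 9.0),
    ("knockout", "fly3", 6.0),
    ("knockout", "fly3", 8.0),
    ("knockout", "fly4", 5.0),
]

means = per_animal_means(example)
for genotype, animal_means in means.items():
    n_cells = sum(1 for g, _, _ in example if g == genotype)
    print(f"{genotype}: N = {len(animal_means)} animals, n = {n_cells} neurons")
```

A mixed model with animal as a random effect, the reviewer's other suggestion, would use the same grouping but keep the per-neuron values rather than collapsing them.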

      • Please add details of how experiments were blinded to genotype

      The researcher was blinded to genotype during analysis. We have included that detail in our Methods section (line 566).

      • Some of the experiments include multiple genotypes and so it would be important to show all in all figures. For example, figure 4B,D show four groups but figure 4F, presumably from the same set of animals, shows only three. Addition of the rescue genotype to 4F is particularly important here so should be shown. The same concern is valid for figure 5, where puncta number and area must be available.

      We address the first portion of this comment in section 2, for additional experiments involving generating new fly lines. We have included data on puncta area, and mean fluorescence intensity for Rab5 and Rab7 in the supplement (Fig S3). We had already included the data on puncta number and area in Fig 5, but we have added the data on mean fluorescence intensity as well.

      • Related to figure 5, please provide validation of the staining of the TGN. Typically, one would expect trans Golgi to be close to the nucleus with at least some extended stacks. A Golgin245 knockout would be ideal.

      The Golgi in most Drosophila cells is typically found as discrete puncta dispersed throughout the cytosol like what we see in the Golgin245 staining, as opposed to the ribbon “stack of pancake” morphology typically seen near the nucleus in mammalian cells. For reference, please see Figure 6D in (Ye et al., 2007), Figures 2,4,5 in (Rosa-Ferreira et al., 2015), and observations reviewed in (Kondylis and Rabouille, 2009).

      The Golgin245 antibody was well characterized in the paper first describing it (Riedel et al., 2016) (colocalization with other Golgi markers, decreased staining with Golgin245 RNAi), but we would be happy to repeat this validation in the c4da neurons at the reviewer’s request. There do not appear to be Golgin245 mutant or KO lines available, so we would also use the Golgin245 RNAi.

      • For figures 6F, G please show examples of staining for late endosomes and lysosome with appropriate validation.

      Because several of our planned revisions relate to Fig. 6, we will include images for Fig. 6 F and G when we remake this figure to incorporate those planned revisions. To clarify, we used the same reagents to mark late endosomes and lysosomes in both Fig. 4 and Fig. 6. Like the Golgin245 antibody, the Rab7 antibody was developed by the Munro lab and characterized in (Riedel et al., 2016) (partial colocalization with the endosomal marker Hrs and with the lysosomal marker Arl8). Spinster (aka benchwarmer) is a known lysosomal transmembrane protein that colocalizes with Lamp1 (Dermaut et al., 2005; Rong et al., 2011). The fluorescently tagged spin transgenes were developed by the Bellen lab and have been frequently used to mark lysosomes. We would be happy to carry out additional validation experiments at the reviewer’s specification.

      • The title of figure 2 is inaccurate, at least if I understand the experiment, as it does not show neuron-specific knockout but instead whole body knockout with neuron rescue. Please rephrase.

      Because of the lethality of whole body Vps53KO/KO in adult flies, we analyze MARCM clonal neurons that are Vps53KO/KO in flies that are otherwise heterozygous (Vps53KO/+). To clarify this experiment, we have changed the title of Fig. 2 from “Neuron-specific knockout of Vps53 results in smaller dendritic arbors” to “Vps53KO/KO MARCM clonal neurons have smaller dendritic arbors”.

      • Figure 8 needs examples of the TGN and late endosome morphology.

      We have included these images in Figure

      The order appears different in Fig. 4 B & D because we only included the rescue for the KO that shows a phenotype for each staining. The genotypes included in Fig. 4 B are: +/+, Vps50KO/KO, Vps50KO/KO + rescue, and Vps54KO/KO. The genotypes included in Fig. 4 D are +/+, Vps50KO/KO, Vps54KO/KO, Vps54KO/KO + rescue. We have changed the shading of the bars corresponding to these rescue genotypes throughout the manuscript to make it easier to distinguish the two rescue conditions.

      4. Description of analyses that authors prefer not to carry out

      References Cited

      Dermaut, B., K.K. Norga, A. Kania, P. Verstreken, H. Pan, Y. Zhou, P. Callaerts, and H.J. Bellen. 2005. Aberrant lysosomal carbohydrate storage accompanies endocytic defects and neurodegeneration in Drosophila benchwarmer. Journal of Cell Biology. 170:127–139. doi:10.1083/jcb.200412001.

      Kondylis, V., and C. Rabouille. 2009. The Golgi apparatus: Lessons from Drosophila. FEBS Letters. 583:3827–3838. doi:10.1016/j.febslet.2009.09.048.

      Parrish, J.Z., P. Xu, C.C. Kim, L.Y. Jan, and Y.N. Jan. 2009. The microRNA bantam Functions in Epithelial Cells to Regulate Scaling Growth of Dendrite Arbors in Drosophila Sensory Neurons. Neuron. 63:788–802. doi:10.1016/j.neuron.2009.08.006.

      Riedel, F., A.K. Gillingham, C. Rosa-Ferreira, A. Galindo, and S. Munro. 2016. An antibody toolkit for the study of membrane traffic in Drosophila melanogaster. Biology Open. 5:987–992. doi:10.1242/bio.018937.

      Rong, Y., C.K. McPhee, S. Deng, L. Huang, L. Chen, M. Liu, K. Tracy, E.H. Baehrecke, L. Yu, and M.J. Lenardo. 2011. Spinster is required for autophagic lysosome reformation and mTOR reactivation following starvation. Proceedings of the National Academy of Sciences. 108:7826–7831. doi:10.1073/pnas.1013800108.

      Rosa-Ferreira, C., C. Christis, I.L. Torres, and S. Munro. 2015. The small G protein Arl5 contributes to endosome-to-Golgi traffic by aiding the recruitment of the GARP complex to the Golgi. Biology Open. 4:474–481. doi:10.1242/bio.201410975.

      Shimono, K., A. Fujimoto, T. Tsuyama, M. Yamamoto-Kochi, M. Sato, Y. Hattori, K. Sugimura, T. Usui, K. Kimura, and T. Uemura. 2009. Multidendritic sensory neurons in the adult Drosophila abdomen: origins, dendritic morphology, and segment- and age-dependent programmed cell death. Neural Dev. 4:37. doi:10.1186/1749-8104-4-37.

      Ye, B., Y. Zhang, W. Song, S.H. Younger, L.Y. Jan, and Y.N. Jan. 2007. Growing Dendrites and Axons Differ in Their Reliance on the Secretory Pathway. Cell. 130:717–729. doi:10.1016/j.cell.2007.06.032.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      O'Brien et al report how deficiency in GARP specific protein VPS54 or the EARP specific protein VPS50 affects the developmental dendritic remodeling of multidendritic class IV da (c4da) neurons in Drosophila. The main findings are that while both complexes play a role in dendritic remodeling, VPS54 deficiency leads to lipid accumulation in the trans-Golgi network (TGN). Manipulating sterols at the TGN affects dendritic remodeling suggesting that lipid accumulation is responsible for control of neuron morphology in this model. Overall, the data is interesting and the authors develop the experiments enough to be convincing on their major claims. However, a few aspects need clarification and perhaps revisiting conclusions.

      Major comments

      • The statistical analyses generally look appropriate but it would be critical to clarify what N means in every case. For example in figure 2 the authors state n=8 without clarifying if this is n=8 animals or n=8 neurons. N should always be the number of animals, but then the n of independent cells counted should also be indicated. Typically, one would either pre-average per genotype or use a mixed model that includes N of animals and n of cells (or similar).
      • Please add details of how experiments were blinded to genotype
      • Some of the experiments include multiple genotypes and so it would be important to show all in all figures. For example, figure 4B,D show four groups but figure 4F, presumably from the same set of animals, shows only three. Addition of the rescue genotype to 4F is particularly important here so should be shown. The same concern is valid for figure 5, where puncta number and area must be available.
      • Related to figure 5, please provide validation of the staining of the TGN. Typically, one would expect trans Golgi to be close to the nucleus with at least some extended stacks. A Golgin245 knockout would be ideal.
      • This concern is amplified by the images in figure 6 of the filipin staining, that are more obviously perinuclear. However, the two sets of images in 6A and 6D, where co-staining with Golgin245 is shown, look very different. Improved images are required and it may be helpful to use supplementary information to show additional examples of the staining.
      • For figures 6F, G please show examples of staining for late endosomes and lysosome with appropriate validation.
      • For the lipid regulation experiments in figure 7, please use an orthogonal approach to show that the Osbp and fwd RNAi had the expected effects on lipid accumulation.
      • Figure 8 needs examples of the TGN and late endosome morphology.

      Minor comments

      • The title of figure 2 is inaccurate, at least if I understand the experiment, as it does not show neuron-specific knockout but instead whole body knockout with neuron rescue. Please rephrase.
      • For ease of reading, it would be helpful to show genotypes in the same order in all figures (see 4B, 4D)

      Significance

      The advance here is to nominate lipid accumulation at the trans-Golgi network (TGN) as sufficient to affect dendritic remodeling during development. Although the work is performed in a model system, it may have relevance to human neurodevelopmental disorders caused by mutations in the orthologous genes. The work will be of highest relevance to developmental neurobiologists, particularly those working on GARP or EARP mutations and those who use Drosophila as an appropriate model for neurodevelopment.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript presents a solid genetic analysis of components of the GARP and EARP complexes. The analysis is focused on a specialized type of sensory neuron, i.e. class IV da neurons in Drosophila larvae. The authors show that loss of multiple components (VPS50-54) disrupts dendrite morphogenesis in c4da neurons in distinct ways. Additional genetic interaction studies further support the notion of functional differences of GARP and EARP in vivo.

      Overall this is a solid study and with one exception (see below) I have little concern regarding the presented experiments. I do, however, find the exclusive focus on a highly specialized cell type c4da somewhat problematic.

      Concerns:

      Experimental concern: It is stated that loss of VPS50 and VPS54 only causes dendrite morphogenesis defects. However, the corresponding supplemental figure S2c (which is not referenced in the text), is not suited to address this question. Axonal arborization, in particular terminal arbors, are not visible in samples where multiple/all c4da axons are labeled simultaneously (Fig. S2c). Analogous to the dendrite analysis of c4da neurons single cell resolution is essential to examine this in a meaningful way. Likely, however, c4da neurons may not be a good choice to address this question.

      It would be important to know, whether the dendrite morphogenesis defect is indeed a developmental patterning defect or rather a "scaling" defect due to the fact that da neurons increase their size (but not necessarily their projection pattern) during larval maturation.

      Overall, I am concerned whether the data shown here can be generalized. The c4da neurons are rather extreme cell types due to their very large dendritic compartment. It seems quite possible that many other neurons may not have a comparable sensitivity to the supply of lipids/sterols. This type of question can only be addressed if other types of neurons/dendrites are examined. Are class 2 or class 3 da neurons showing any defects in VPS mutants?

      Significance

      At this point I am not convinced that the findings can be generalized. The c4da neuron is really an extreme cell type with a massive disproportionate increase in membrane extensions. This is rather unusual and other neuron types should be tested.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      O'Brien and colleagues use Drosophila dendrite development to dissect the roles of the GARP and EARP vesicular trafficking complexes in the development of neuronal morphology. By making complex-specific KOs they investigate the role of each complex in the growth, pruning and re-growth of sensory dendrites and conclude that the GARP, but not EARP, complex is required for proper dendrite development by limiting sterol accumulation in the neuronal TGN.

      Major comments:

      While the data presented clearly support a role for GARP in regulating sterol levels to support dendritic growth, they do not suffice to exclude a role for EARP, as important analyses to allow such a clear-cut conclusion are either insufficient or missing. If the authors wish to maintain this claim - as suggested by the title of the manuscript - further analyses are essential.

      1- Figure 3E shows that whereas both Vps50 and Vps54 mutations reduce dendritic complexity, the Vps54 phenotype appears earlier (96h APF). Furthermore, at 7 days dendrites appear to grow again but at a slower rate than controls. This begs the question of whether these mutations are causing a delay rather than a block in the regrowth after pruning and whether the growth will eventually be normal a few days later or whether it will stop at some point.

      2- In Figure 4, the two mutations appear to have statistically differential effects on Rab5 and Rab7 puncta even though the data mean and distribution seem very similar. Interestingly, in each case the non-significant effect is associated with a smaller sample size. Given that the overall sample sizes used are rather small for such highly variable data, this could easily cause a statistical anomaly due to sampling bias. The sample size should be made uniform across all genotypes and should ideally be at least doubled.

      3- Perhaps the most important issue relates to Figure 6, where the authors find that there is no sterol accumulation at 96h APF in the Vps50 mutant. However, given that the dendritic phenotype is slower to appear in this mutant compared to Vps54, are the authors sure that the accumulation is not just slower? This should be examined using the same temporal sequence used for Vps54 shown in Figure 6C. In addition, the fact that sterol accumulation returns to normal in the Vps54 mutant at 1 day supports the notion of a delay phenotype (see point 1 above).

      These issues should be experimentally addressed to see if the data fully support the initial conclusions, or if the conclusions should be modified to suggest differential contribution of the two complexes to the process being studied and to a developmental delay phenotype.

      Significance

      The study advances our understanding of the role of regulation of lipid storage in sculpting neuronal morphology during development.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to reviewers

      Reviewer #1

      I believe that this is a very sound and authoritative study. The analysis of all data seems appropriate and robust, and many connections between the data (and subsets of data) and their possible interpretations have been considered. In fact, in the massive Results section, some interpretations are supported by cited references (this is not meant as a critique). However, I wonder about the length of the Results section, and the balance between it and the relatively short Discussion section. It is difficult for me to nail down any part of Results that might be shortened, as I could not find clear redundancies. I also think that the level of speculation is absolutely warranted, and I did not find excessive claims being made to this or that end. Rather, I suggest to broaden the perspective somewhat (in their Discussion; see below under Significance), which might allow people with a less mechanistic perspective to grasp the potential relevance of this work for non-model plant systems studied mostly by evolutionary geneticists.

      Response: We thank the reviewer for their kind remarks. We have spent a very large amount of time trying to streamline the results section and we are not sure if it would be possible to shorten it any further without removing critical details.

      We appreciate the reviewer’s comment to add more detail to the discussion to make it more appealing to evolutionary geneticists and we have added the following lines to the discussion section: “The WISO or “weak inbreeder/strong outcrosser” model (Brandvain & Haig, 2005) emerges from the dynamics of parental conflict and parent-of-origin effects. Under this model, a parent from populations with higher levels of outcrossing is exposed to higher levels of conflict and can thus dominate the programming of maternal resource allocation in a cross with an individual from a population with lower levels of outcrossing. Such a phenomenon has been observed in numerous clades including Dalechampia, Arabidopsis, Capsella and Leavenworthia (Brandvain & Haig, 2018; İltaş et al., 2021; Lafon-Placette et al., 2018; Raunsgard et al., 2018). Intriguingly, loss of function phenotypes in the RdDM pathway are more severe in recently outcrossing species than in A. thaliana (Grover et al., 2018; Wang et al., 2020), which suggests that RNA Pol IV functions are more elaborate and important in these species. This raises the possibility that the role for RNA Pol IV and RdDM in parental conflict that we describe in A. thaliana here is likely heightened in and mediates the elevated level of parental conflict in species that are currently or have been recently outcrossing.”

      One aspect that might warrant more scrutiny is the mapping of sRNA reads to the reference genome. I found the short section of this (M&M section, page 20, lines 23-25) to be too brief. It is not clear to me which of ShortStack's v3 weighting scheme the authors used, which is relevant for multi-mapping reads (see NR Johnson et al. 2016, G3). In addition, it is not mentioned whether zero mismatches were allowed. Perhaps this is described in more detail in Erdmann et al. (2017), but even if so, it deserves to be clarified here.

      Response: Small RNA reads were aligned allowing two mismatches. This was indicated in the bowtie command (‘bowtie -v 2’, where -v 2 indicates two mismatches). We have added text to expand on the meaning of the commands.

      We have also expanded the commands used for ShortStack. We used the “Placement guided by uniquely mapping reads (-u)” option to divide the multi-mapping reads.

      The manuscript is well-written and concise, despite the length of the Results section. The verbal clarity and absence of typos or grammatical issues is superb. I did find some of the Figures to be somewhat "un-intuitive", in the sense that it takes acute concentration for an outsider (of sorts) to gather and interpret the underlying data. This is probably due to the many cross-comparisons of differences between two genotypes on one axis and those of a different pair of genotypes on the other axis. I am not sure how this issue can be ameliorated (nor whether this is really necessary); however, from a technical point of view, all Figures and Suppl. Figures are really well-done.

      Response: We thank the reviewer for their kind remarks. We have strived to make the figures easier to understand but we are aware that the figures do require a lot of concentration. We haven’t found an easy way to fix this. We thank the reviewer for patiently going through the figures.

      The list of references seems adequate in terms of citing relevant (both older and very recent) publications. However, almost all cited papers concern Arabidopsis or other model species; I suggest to consider adding a few relevant studies on non-Brassicaceae (whether considered model taxa or not), in conjunction with my suggestion (in Significance) to potentially broaden the scope by searching for natural phenomena that also involve parent-of-origin effects on endosperm/seed development. Curiously, many of the references are "incomplete" in the sense of stopping with the journal's name, then stating the doi, i.e. they lack volume numbers and page/article numbers. This should be harmonized throughout.

      Response: We have added references to non-Brassicaceae species and have also fixed the references.

      Reviewer #2: This manuscript provides evidence that loss of either the maternal or paternal copy of NRPD1 has different, and sometimes opposite, effects on the accumulation of small RNAs and on expression of a subset of genes, with loss of the maternal copy having more substantial effects. The manuscript is well written, and the conclusions, as far as they go, are justified by the data, which are effectively presented. The overall effect is subtle but informative and, according to the authors, supports a parental conflict model for imprinting. The experiments failed to find a smoking gun in the form of a mechanism to explain how or why the maternal and paternal alleles have different effects, and the explanation for a lack of clear phenotypic differences was reasonable, but untested. I would have liked to see it tested by looking in a plant species that is outcrossing and highly polymorphic. However, I do appreciate that the observation that the male and female alleles can have distinct effects when mutant is an important clue. My specific comments below may reflect confusion on my part, rather than real issues. If that is so, I hope that confusion can aid in clarifying what are in places quite subtle points.

      Response: We thank the reviewer for their comments. We agree that it would be potentially informative to do similar experiments in an outcrossing species but that this is beyond the scope of this manuscript. Additionally, loss of NRPD1 or other components of the RdDM pathway has dramatic effects on gametogenesis in some examined outcrossing species (Grover et al., 2018; Wang et al., 2020), which could prevent the detection of subtle parent-of-origin effects on seed development.

      Page 6, last paragraph: "Because the endosperm is triploid, in these comparisons there are 3 (wild-type), 2 (pat nrpd1+/-), 1 (mat nrpd1+/-) and 0 (nrpd1-/-) functional NRPD1 alleles in the endosperm. However, NRPD1 is a paternally expressed imprinted gene in wild-type Ler x Col endosperm and the single paternal allele contributes 62% of the NRPD1 transcript whereas 38% comes from the two maternal alleles (Pignatta et al., 2014). Consistent with paternal allele bias in NRPD1 expression, mRNA-Seq data shows that NRPD1 is expressed at 42% of wild-type levels in pat nrpd1+/- and at 91% of wild-type levels in mat nrpd1+/- (Supplementary Table 6)". I would think this would really complicate the analysis. Should all of the dosage values include NRPD1 imprinting values? That is to say, expressed in terms of expression values? This is also a bit confusing. The maternal copies together express 38% of the transcript, so why isn't the mat nrpd1 at 68%, rather than 91%? In any event, given this imprinting and differences in dosage of the male and female it appears that two variables, parental origin and expression levels are being compared. Since 91% is awfully close to 100%, are the mat pat comparisons really just comparing low with nearly normal expression of NRPD1? And actually, given that, the outsized effect of the mat nrpd1 +/- is even more striking.

      Response: We included the details of dosage rather than imprinting values because the potential for buffering of expression upon loss of one allele could not be discounted. Indeed, we do find that the endosperm transcriptome buffers against the loss of the maternal or paternal alleles (Supplementary Table 6). The reviewer is correct in pointing out that the outsized effect of mat nrpd1+/- on gene expression is even more striking, and strongly supports our view that these effects are parental rather than endospermic.

      To reduce confusion in this section, we removed the details about 38% maternal allele transcripts obtained from our previous study, and instead report only the observed values from this study (which are also consistent with the previously reported paternally-biased expression of NRPD1 in endosperm).
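The buffering argument can be made concrete with a short calculation. The sketch below uses only the percentages quoted in the reviewer's comment (62% paternal and 38% maternal contribution in wild-type; 42% and 91% observed in the heterozygotes). It assumes, purely for illustration, that a mutant allele contributes no detectable transcript and that, without buffering, the remaining alleles would keep their wild-type output. The function and variable names are hypothetical and are not taken from the manuscript.

```python
# Fractions of wild-type NRPD1 transcript, as quoted in the text.
PATERNAL_SHARE = 0.62  # contribution of the single paternal allele in wild-type
MATERNAL_SHARE = 0.38  # combined contribution of the two maternal alleles
OBSERVED = {"pat nrpd1+/-": 0.42, "mat nrpd1+/-": 0.91}


def expected_without_buffering(lost_parent):
    """Expected NRPD1 level if the remaining alleles kept wild-type output.

    Assumption (not from the manuscript): the mutant allele gives no transcript.
    """
    return MATERNAL_SHARE if lost_parent == "pat" else PATERNAL_SHARE


def buffering_summary():
    """Return {genotype: (expected, observed, observed - expected)}."""
    rows = {}
    for genotype, observed in OBSERVED.items():
        lost_parent = genotype.split()[0]  # "pat" or "mat"
        expected = expected_without_buffering(lost_parent)
        rows[genotype] = (expected, observed, observed - expected)
    return rows


if __name__ == "__main__":
    for genotype, (exp, obs, excess) in buffering_summary().items():
        print(f"{genotype}: expected {exp:.2f}, observed {obs:.2f}, excess {excess:+.2f}")
```

Under these assumptions the observed level exceeds the naive expectation in both heterozygotes, which is the pattern the response describes as buffering.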

      Page 4, Line 16. I'm afraid it's still a bit difficult to understand what was being compared to what in this section. Please clarify.

      Response: The authors in this previously published study compared sRNAs obtained from wild-type whole seeds (which consists of three different tissues, including endosperm) with mutant endosperm. We are pointing out that the difference in tissue composition makes the effect of nrpd1 mutation hard to disentangle from the tissue differences between the two genotypes.

      Page 5, Line 5. I'm sure this is fine, but it's not entirely clear what is from the previously published paper and what is reanalysis here. All the crosses and measurements were made then, but not organized in this way?

      Response: This data was indeed previously published. In that analysis, we had pooled results from different crosses and calculated significance between genotypes using chi-square tests. During a later study (Satyaki and Gehring, 2019), we realized that we were losing information by ignoring the seed abortion values per cross. So, a reanalysis of that data on a cross by cross basis allowed us to find strong evidence for maternal and paternal effects.

      Page 6, Line 26. This is an excellent dosage series, but it's complicated by imprinting. So it's not 3, 2, 1, 0 effective copies. If we set the paternal copy at ~1 and each maternal at ~0.1, then it's 1.2 (wild type), 0.20 (pat nrpd1+/-), 1 (mat nrpd1+/-), and 0 (nrpd1-/-).

      Response: At the genomic DNA level, it's 3, 2, 1 and 0 doses. The reviewer’s comment on the transcriptional dose is not clear to us. Based on measured gene expression levels, relative wild-type NRPD1 transcriptional dose = 1, pat nrpd1+/- is 0.42, and mat nrpd1+/- is 0.91.

      Page 6, line 31. Is the main thing we are comparing the difference between expression at 42% versus 91% of wild type?

      Response: We are using the small RNA-seq data alongside the mRNA-seq data to argue that loss of mat and pat nrpd1+/- have no impact on overall Pol IV activity in endosperm (as measured by small RNA production). A nrpd1 heterozygous endosperm has almost the same small RNA profile as a wild-type endosperm. Thus any effects seen in the endosperm, including the effects on mRNA expression described later in the manuscript, are likely parental rather than zygotic endospermic effects.

      Page 7, line 11. So, the overall effect in either direction on smRNA gene targets was really quite small, and I'm guessing the effect on gene expression was even smaller.

      Response: The effects of loss of maternal or paternal Pol IV on sRNAs were indeed small (Fig. 1/Fig. S3). The effect of loss of maternal Pol IV on gene expression was substantially larger and distinct from the relatively small impacts observed upon loss of paternal Pol IV (Fig. 3). This observation supports the view that Pol IV mediates parent-of-origin effects on gene expression.

      Page 7, line 17. I take it that it is this difference, rather than the overall numbers that is of interest.

      Response: Correct. The lack of a relationship between sRNAs impacted upon loss of mat and pat nrpd1 is additionally suggestive of parent-of-origin effects.

      Page 9, line 2. Really interesting, since one might expect that these are methylated loci that would be expected to be fed into any existing embryo maintenance methylation pathway. Surprising that they are maintained independently.

      Response: It is indeed surprising that Pol IV activity in parents can have different impacts on sRNAs in the endosperm. It should be noted though, that as described in Erdmann et al 2017 and in this paper later on, many endosperm sRNA loci are in fact not associated with endosperm DNA methylation. In addition, sRNA loci that are dependent on paternal Pol IV activity are more likely to be associated with DNA methylation than are sRNA loci associated with maternal Pol IV activity. These points have been described in Figure S8.

      Page 9, line 22. Proportion of total imprinted genes? Did the mutant obviate/enhance the imprinting?

      Response: We have modified the manuscript to describe effects on imprinted genes: “The expression of imprinted genes is known to be regulated epigenetically in endosperm. In mat nrpd1+/- imprinted genes were more likely to be mis-regulated than expected by chance (hypergeometric test p-15) – 15 out of 43 paternally expressed and 45 out of 128 maternally expressed imprinted genes were mis-regulated in mat nrpd1+/- while two maternally expressed imprinted genes but no paternally expressed imprinted genes were mis-regulated in pat nrpd1+/- (Table S6).” We have also added a new supplementary figure (Fig. S6) that describes the impacts of NRPD1 loss on imprinted gene expression.
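For readers unfamiliar with this kind of enrichment test, the following is a minimal sketch of an upper-tail hypergeometric test written with the Python standard library. It is not the authors' code. The counts of imprinted genes (43 + 128 = 171) and mis-regulated imprinted genes (15 + 45 = 60) come from the text; the total number of genes tested and the total number of mis-regulated genes are not given here, so the two `HYPOTHETICAL_` values are placeholders that would have to be replaced with the real figures.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric draw without replacement.

    population: total genes tested
    successes:  genes mis-regulated in the genotype of interest
    draws:      size of the gene set being tested (e.g. imprinted genes)
    k:          mis-regulated genes observed within that set
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# From the text: 15 of 43 PEGs and 45 of 128 MEGs mis-regulated in mat nrpd1+/-.
IMPRINTED_GENES = 43 + 128
MISREGULATED_IMPRINTED = 15 + 45

# Placeholders only; the real values are not stated in this response.
HYPOTHETICAL_TOTAL_GENES = 20000
HYPOTHETICAL_TOTAL_MISREGULATED = 2000

if __name__ == "__main__":
    p = hypergeom_upper_tail(
        MISREGULATED_IMPRINTED,
        HYPOTHETICAL_TOTAL_GENES,
        HYPOTHETICAL_TOTAL_MISREGULATED,
        IMPRINTED_GENES,
    )
    print(f"upper-tail p-value with placeholder totals: {p:.3g}")
```

The same function applies to any overlap question of this form, provided the background gene set is defined consistently for both lists.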

      Page 9, line 27. How could 2) occur in the homozygous mutant?

      Response: Loss of NRPD1 may impact gene expression in both parents. When the nrpd1-/- mutant endosperm is investigated, we are also examining the consequences of the inheritance of these disrupted gene expression states. We refer to this as epistatic interactions of mat and pat nrpd1.

      Page 10, line 9. Interesting!

      Response: We strongly agree!

      Page 10, line 11. Is this 2.7 versus 2.18 significant because it's statistically significant, or because it's conceptually significant?

      Response: We are pointing out that the 2.7-fold value is quite similar to the predicted value of 2.18-fold, which is arrived at by simply summing the effects of mat nrpd1 and pat nrpd1. This is a conceptually significant point.

      Are the examples in 3D representative, or the most convincing examples? And a big difference in ROS1 is of some concern, since that may well be expected to affect imprinting indirectly. I know I'm being picky here, but the pattern is so intriguing I'd be worried about confirmation bias.

      Response: The examples in 3D are representative for those genes with significant changes in expression in both mat and pat nrpd1, and other genes also behave similarly. The antagonistic effect described for 3D can also be observed as a much broader trend affecting hundreds of genes to varying extents in Fig 3C and 3E-H. The concern about ROS1 is not clear to us but we agree that an effect of ROS1 may be one way that NRPD1 controls gene expression.

      Page 10, line 18. Ok, but 0.123 is a pretty subtle negative correlation. Although I do appreciate that it clearly is not a positive correlation. If I'm understanding correctly, this was the "aha" moment, because it's exactly what one might expect of NRPD1 from the mother and father or working at cross purposes. But the numbers are getting awfully small here.

      Response: It is unclear how to calibrate our expectations of effect sizes considering that our study is the first (to our knowledge) to make such a measurement involving gene expression in parental conflict. A review of the few empirical examples of parental conflict’s impact on seeds shows that parental conflict may drive small changes in seed size (Brandvain and Haig, 2018).

      The evolution of quantitative traits may be driven by selection for large effects at a small number of loci and/or by selection of small effects at a large number of loci. In a similar vein, parental conflict can impact seed phenotypes either via large effects at a few loci or via small effects at a large number of loci. Our analysis described in Fig 3D-H can fit either possibility. Large effects can be found at a few loci such as SUC2 and PICC (Fig. 3D). Smaller antagonistic effects can be found at hundreds of loci as shown in Figure 5A. The negative correlation described in this figure can be observed even upon dropping the genes that show a statistically significant differential expression in both mat and pat nrpd1+/- (slope after dropping genes significantly mis-regulated in both mat and pat nrpd1+/- is -0.126). In summary, a correlation of -0.123 strongly supports the existence of a widespread antagonistic regulatory effect.

      Page 10, line 29. The point simply being that that other phenomenon is also significant even if the differences are that large?

      Response: We are pointing out that the magnitude of the effects we see is similar to that observed for phenomena such as dosage compensation.

      Page 12. So, there is no effect on cleavage and no obvious effect on flanking siRNA clusters. The suspense is building...

      Page 12, line 24. And not in potential regulatory regions? CNSs?

      Response: We did not identify a significant enrichment for differentially methylated regions in regulatory regions. We used the relative distance function in bedtools (https://bedtools.readthedocs.io/en/latest/content/tools/reldist.html) to calculate the relationship between the genomic location of DMRs and genomic location of a differentially expressed gene. This analysis was chosen as it does not make a priori assumptions about the size of the regulatory region of a gene. A broad association between DMRs and differentially expressed genes would be indicated by a frequency far greater than 0.02. We show the results of this analysis in Fig. S8F; we find no evidence for significant enrichment of DMRs in the regulatory regions of differentially expressed genes.
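To illustrate why 0.02 is the reference frequency, here is a simplified sketch of the relative-distance idea. It is not the bedtools implementation: features are reduced to single coordinates on one chromosome, and all function names are hypothetical. For each query position flanked by two reference positions, the relative distance is the distance to the nearer one divided by the distance between the two, so it ranges from 0 to 0.5. With bins of width 0.01, unrelated feature sets should spread roughly evenly, giving a frequency of about 0.02 per bin.

```python
from bisect import bisect_left
from collections import Counter


def relative_distances(query_positions, reference_positions):
    """Relative distance of each query point to its two flanking reference points."""
    refs = sorted(reference_positions)
    values = []
    for q in query_positions:
        i = bisect_left(refs, q)
        if i == 0 or i == len(refs):
            continue  # query is not flanked on both sides; skip it
        left, right = refs[i - 1], refs[i]
        values.append(min(q - left, right - q) / (right - left))
    return values


def reldist_frequencies(values, n_bins=50):
    """Fraction of relative distances falling in each 0.01-wide bin (0 .. 0.5)."""
    if not values:
        return {}
    counts = Counter(min(int(v * 2 * n_bins), n_bins) for v in values)
    return {bin_index: count / len(values) for bin_index, count in counts.items()}


if __name__ == "__main__":
    # Toy coordinates, for illustration only.
    genes = [100, 200, 300]
    dmrs = [110, 150, 225, 50]  # 50 is skipped: no reference on its left
    rel = relative_distances(dmrs, genes)
    print(rel)
    print(reldist_frequencies(rel))
```

A strong spatial association would show up as an excess of small relative distances, that is, frequencies well above 0.02 in the lowest bins.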

      Page 12, line 28. I guess it depends on whether or not the changes are in regulatory sequences not immediately apparent as part of the gene, doesn't it?

      Response: We examined DNA methylation over genes here because in endosperm, unlike in other tissues, many small RNAs are genic. Moreover, DNA methylation within the gene may control transcript abundance (Eimer et al., 2018; Klosinska et al., 2016). We have also examined regulatory regions adjacent to genes in Fig S8F and found no effect.

      Page 13, line 2. Not sure it's that important, but couldn't you chop all of these genes in half and see if they are no longer significant collectively?

      Response: We do not think that this will provide a useful insight.

      Page 14, line 15. I'm afraid I'm getting confused with the terms cis and trans here. Just to be clear, cis means a direct effect of small RNAs that are dependent on NRPD1 on a gene and trans means anything else? But in this context, it's not clear that is what is meant. Do you mean that gene expression is determined and preset in the gametophyte? What are the levels of expression of NRPD1 in the two gametophytes?

      Response: The reviewer’s interpretation of cis and trans is correct. However, the cis imprints may be preset in gametophytes or in the sporophytic tissues that surround or give rise to the gametophyte. Pol IV is known to be active either in the gametophyte or in related sporophytic tissues in both the mother and the father (Kirkbride et al., 2019; Long et al., 2021; Olmedo-Monfil et al., 2010).

      Page 14, line 19. Prior to fertilization?

      Response: Yes, that is the idea. As described in the manuscript, Pol IV activity in either the parental sporophyte or gametophyte prior to fertilization could impact gene expression in the endosperm after fertilization.

      Page 14, line 27. Do you mean driven by, or just associated with?

      Response: In response to the comment, we have replaced the phrase “driven by” with “due to” for increased clarity. In wild-type, DOG1 is predominantly expressed from the paternal allele. In mat nrpd1+/-, the paternal allele is somewhat upregulated but the maternal allele, which is almost silent in wild-type, is highly expressed in mat nrpd1+/-.

      Page 15, line 26. And this is really the issue. The primary conclusion, backed up by the lack (I'm assuming) of phenotypic differences between mat NRPD1 -/+ and pat NRPD1 +/- suggests that the observed differences in expression are not particularly important, unless the exceptional cases are informative.

      Response: We are not sure whether the reviewer means “issue” in a negative, neutral, or positive light. Seed phenotypes are often subtle and we have not examined phenotypic differences in sufficient detail to comment.

      Page 15, line 12. Yes, but I'm not at all clear what the mechanism for this is.

      Response: We have tested and falsified multiple hypotheses to explain how Pol IV can regulate gene expression in endosperm. Considering the complex genetics and the difficulty of isolating endosperm, we have concluded that this is a matter for a future study. The point of this study is the discovery of Pol IV’s parental effects.

      Page 15, line 23. Since this is a very small subset of genes, are these genes that you might expect to play a role in parental conflict?

      Response: The functions of most genes in endosperm remain unknown. However, some have a likely role in conflict. SUC2 is antagonistically regulated by parental Pol IV (Fig. 3D). SUC2 transports sucrose, the key form of carbon imported into seeds from the mother (Sauer & Stolz, 1994).

      Page 15, line 33. Indeed, these could be the informative exceptions.

      Response: We believe the reviewer means that the identity of strongly antagonistically regulated genes may be informative in terms of thinking about these results in the context of parental genetic conflict, which we agree with.

      Page 15, line 29. Hardly surprising, given that the paternal copy of NRPD1 is expressed at a higher level than the maternal copies, is it?

      Response: It is actually somewhat surprising since we show in Fig. 2 that the sRNA production in mat and pat nrpd1+/- is comparable to that of wild-type. The higher contribution of NRPD1 from the paternal copy does not really explain the methylation differences.

      Page 16, line 1. So this is what you mean by in cis. Presetting?

      Response: The reviewer’s previous interpretation of cis (acting directly at a target gene) is correct. However, the cis imprints may be preset in gametophyte or in the sporophytic tissues that surround or give rise to the gametophyte. Pol IV is known to be active in gametophytes and in related sporophytic tissues in both the mother and the father.

      These are intriguing results that would benefit from a test of the hypothesis by comparing these results with those obtained in an outcrossing plant species.

      Response: We agree that it would be interesting and informative to perform similar experiments in an outcrossing species. However, loss of NRPD1 or other components of the RdDM pathway has dramatic effects on gametogenesis in outcrossing species (Grover et al., 2018; Wang et al., 2020), preventing the detection of subtle parent-of-origin effects on seed development. Additionally, this would be a separate study.

      Reviewer #3

      We thank the reviewer for their comments.

      • Expression of NRPD1 was 42% of WT in paternal nrpd1 and 91% of WT in maternal nrpd1, yet throughout the paper the effect of maternal nrpd1 was far stronger than paternal nrpd1. The authors may also want to confirm that protein levels follow the same pattern, in case protein degradation or post-transcriptional regulation may play a role.

      Response: We show in Fig. 2 that sRNA production in mat and pat nrpd1+/- are similar to wild-type endosperm. This strongly suggests that NRPD1 protein is produced at functionally equivalent levels in wild-type, mat and pat nrpd1+/-. The finding that mat nrpd1+/- has a stronger effect on gene expression and small RNAs, despite having higher levels of NRPD1 transcript in endosperm, is consistent with our conclusion that the effects we are observing in heterozygous endosperm are due to NRPD1 action before fertilization.

      P. 9 line 1 - this only seems to be true for maternal ISRs, not paternal ISRs; this claim should be narrowed.

      Response: Accordingly, we have modified the text here to: “In summary, these results indicate that most maternally and some paternally imprinted sRNA loci in endosperm are dependent on Pol IV activity in the parents and are not established de novo post-fertilization.”

      A small number of sRNA loci become highly depleted in maternal nrpd1 but not paternal nrpd1 (Fig. 1D, F, Fig. 2C) - are these siren loci?

      Response: This is an interesting question. Siren loci have not been defined in Arabidopsis but are described as loci with high levels of sRNAs in ovules, seed coat, endosperm and embryo (Grover et al., 2020). Loci losing sRNAs in maternal nrpd1+/- include a large number of maternally expressed imprinted sRNAs (mat ISRs). We do not know if mat ISR loci are expressed in the ovule. In Erdmann et al (2017), we excluded loci that were also expressed in the seed coat from mat ISRs. Thus, these loci meet only some of the conditions for being defined as siren loci.

      Fig. 2 suggests that many of the downregulated sRNA regions in maternal nrpd1 are maternally biased to begin with. Related, are genic sRNAs more likely to be affected by maternal or paternal nrpd1 than non-genic or TE sRNAs?

      Response: As described in Fig. 1B and S3, loss of maternal NRPD1 has more impacts on the sRNA landscape. As a percentage of total loci, genes are more likely to be affected than TEs.

      For the sRNA loci shown in Fig. 2C, how is % maternal affected in maternal vs. paternal nrpd1? These ISRs are normally maternal or paternal biased, does this change in maternal or paternal nrpd1?

      Response: We assess the allelic bias of ISRs only when they have at least ten reads in the genotypes being compared. In mat nrpd1+/-, most mat ISRs lose almost all their reads (Fig. 2) and we can assess allelic bias only at 107/366 mat ISRs. As seen in the Rev. comment. Fig1, these 107 lose their maternal bias. In pat nrpd1+/-, loci with maternally biased sRNAs show somewhat increased expression (Fig 2E) but do not show an appreciable change in maternal bias (Figure Review 1). All paternal ISRs do not show any dramatic impacts on allelic bias in mat or pat nrpd1+/-. We have not added this additional datapoint to our paper because we were worried that the paper was becoming too dense – a concern also voiced by reviewer 1. However, we can add this to the manuscript if the reviewer prefers.

      • Might have missed this, but I didn't see the gene ontology results (p9 line 16) shown anywhere? Would like to see significance values, fold enrichments, etc. In particular, the group of paternal nrpd1 up-regulated genes seems too small to have much confidence for GO enrichment analysis.

      Response: We have added a Supplementary Table 7 with outputs of GO analyses.

      • I would suggest expanding the analysis in Fig. 3D-H to explore whether the additive model is more predictive of nrpd1-/- expression levels than other potential models (epistatic, etc.) in general at all genes, or only at the subsets of genes shown, independently of whether the effects are large enough to pass the arbitrary significance cutoffs used in E-H. Identifying specifically which genes do and don't follow this additive pattern could help dissect mechanism. For example, genes following this pattern might share a TF binding site for a TF that is regulated by Pol IV.

      Response: While we are interested, we currently cannot explore other models such as epistasis as this would require knock-down of NRPD1 in the endosperm and we plan to do this as part of a future study.

      P. 13 line 26 - how do changes in CG methylation in maternal or paternal nrpd1 compare to changes in dme or ros1? Do either set of DMRs significantly overlap dme or ros1 DMRs? Could some of these be explained by changes in ROS1 expression, since ROS1 is a Pol IV target?

      Response: Yes. It’s entirely possible that a subset of observed gene expression changes are linked to changes in ROS1 expression. However, there are no comparable methylation data for ROS1 in the endosperm. A potential role for ROS1 has been discussed on Page 11, line 4. Comparison with DMRs in the dme endosperm is difficult. dme mutant endosperm has low non-CG methylation (Ibarra et al., 2012). We have unpublished data showing that the expression of genes involved in RNA-directed DNA methylation (RdDM) is reduced in the dme endosperm. It is therefore difficult to understand if and how DME-mediated demethylation may impact RdDM.

      P. 10 line 3 - is the overlap of 36 out of 51 genes unlikely to occur by chance?

      Response: A hypergeometric test indicates that this is indeed significant. We have added it to text on Page 9, line 34.

      In sRNA and mRNA-seq libraries, what was the overall maternal/paternal ratio in each library? Did loss of Pol IV affect this?

      Response: The graphs above show the maternally derived fraction of mRNA and sRNA libraries for different genotypes. Please note that the Ler nrpd1 mutant was generated by backcrossing Col-0 nrpd1+/- into Ler. Some Col-0 regions remain in this background and are called “hold-outs”. Reads mapping to these hold-outs have been excluded while calculating the maternal fraction of each library described in the graph above. We cannot confidently judge if the overall maternal fraction of the mRNA transcriptome is affected by loss of NRPD1 as we likely need more replicates. However, we find that loss of all NRPD1-dependent sRNAs (as in the nrpd1 null mutant) leaves behind sRNAs that roughly reflect the genomic 2:1 ratio.

      P. 9 line 22 - how many paternally and maternally expressed imprinted genes were considered? Were imprinted genes statistically more likely to be misregulated in mat nrpd1?

      Response: We considered 128 maternally and 43 paternally expressed genes that had previously been identified as imprinted in Col x Ler crosses (Pignatta et al 2014). We have modified the manuscript to describe effects on imprinted genes: “The expression of imprinted genes is known to be regulated epigenetically in endosperm. In mat nrpd1+/- imprinted genes were more likely to be mis-regulated than expected by chance (hypergeometric test p-15) – 15 out of 43 paternally expressed and 45 out of 128 maternally expressed imprinted genes were mis-regulated in mat nrpd1+/- while two maternally expressed imprinted genes but no paternally expressed imprinted genes were mis-regulated in pat nrpd1+/- (Table S6).” We have also added a supplementary figure (Figure S6) that focuses on genic mRNA imprinting in NRPD1 heterozygotes and homozygous mutants.

      References cited in the response

      Brandvain, Y., & Haig, D. (2005). Divergent Mating Systems and Parental Conflict as a Barrier to Hybridization in Flowering Plants. The American Naturalist, 166(3), 330–338. https://doi.org/10.1086/432036

      Brandvain, Y., & Haig, D. (2018). Outbreeders pull harder in a parental tug-of-war. Proceedings of the National Academy of Sciences, 115(45), 11354–11356. https://doi.org/10.1073/pnas.1816187115

      Eimer, H., Sureshkumar, S., Singh Yadav, A., Kraupner-Taylor, C., Bandaranayake, C., Seleznev, A., Thomason, T., Fletcher, S. J., Gordon, S. F., Carroll, B. J., & Balasubramanian, S. (2018). RNA-Dependent Epigenetic Silencing Directs Transcriptional Downregulation Caused by Intronic Repeat Expansions. Cell. https://doi.org/10.1016/j.cell.2018.06.044

      Grover, J. W., Burgess, D., Kendall, T., Baten, A., Pokhrel, S., King, G. J., Meyers, B. C., Freeling, M., & Mosher, R. A. (2020). Abundant expression of maternal siRNAs is a conserved feature of seed development. Proceedings of the National Academy of Sciences of the United States of America, 117(26), 15305–15315. https://doi.org/10.1073/pnas.2001332117

      Grover, J. W., Kendall, T., Baten, A., Burgess, D., Freeling, M., King, G. J., & Mosher, R. A. (2018). Maternal components of RNA-directed DNA methylation are required for seed development in Brassica rapa. The Plant Journal, 94(4), 575–582. https://doi.org/10.1111/tpj.13910

      Ibarra, C. A., Feng, X., Schoft, V. K., Hsieh, T.-F., Uzawa, R., Rodrigues, J. A., Zemach, A., Chumak, N., Machlicova, A., Nishimura, T., Rojas, D., Fischer, R. L., Tamaru, H., & Zilberman, D. (2012). Active DNA Demethylation in Plant Companion Cells Reinforces Transposon Methylation in Gametes. Science, 337(6100), 1360–1364. https://doi.org/10.1126/science.1224839

      İltaş, Ö., Svitok, M., Cornille, A., Schmickl, R., & Lafon Placette, C. (2021). Early evolution of reproductive isolation: A case of weak inbreeder/strong outbreeder leads to an intraspecific hybridization barrier in Arabidopsis lyrata. Evolution, 75(6), 1466–1476. https://doi.org/10.1111/evo.14240

      Kirkbride, R. C., Lu, J., Zhang, C., Mosher, R. A., Baulcombe, D. C., & Chen, Z. J. (2019). Maternal small RNAs mediate spatial-temporal regulation of gene expression, imprinting, and seed development in Arabidopsis. Proceedings of the National Academy of Sciences, 116(7), 2761–2766. https://doi.org/10.1073/pnas.1807621116

      Klosinska, M., Picard, C. L., & Gehring, M. (2016). Conserved imprinting associated with unique epigenetic signatures in the Arabidopsis genus. Nature Plants, 2, 16145. https://doi.org/10.1038/nplants.2016.145

      Lafon-Placette, C., Hatorangan, M. R., Steige, K. A., Cornille, A., Lascoux, M., Slotte, T., & Köhler, C. (2018). Paternally expressed imprinted genes associate with hybridization barriers in Capsella. Nature Plants, 4(6), 352–357. https://doi.org/10.1038/s41477-018-0161-6

      Long, J., Walker, J., She, W., Aldridge, B., Gao, H., Deans, S., Vickers, M., & Feng, X. (2021). Nurse cell–derived small RNAs define paternal epigenetic inheritance in Arabidopsis. Science, 373(6550). https://doi.org/10.1126/science.abh0556

      Olmedo-Monfil, V., Durán-Figueroa, N., Arteaga-Vázquez, M., Demesa-Arévalo, E., Autran, D., Grimanelli, D., Slotkin, R. K., Martienssen, R. A., & Vielle-Calzada, J.-P. (2010). Control of female gamete formation by a small RNA pathway in Arabidopsis. Nature, 464(7288), 628–632. https://doi.org/10.1038/nature08828

      Raunsgard, A., Opedal, Ø. H., Ekrem, R. K., Wright, J., Bolstad, G. H., Armbruster, W. S., & Pélabon, C. (2018). Intersexual conflict over seed size is stronger in more outcrossed populations of a mixed-mating plant. Proceedings of the National Academy of Sciences, 115(45), 11561–11566. https://doi.org/10.1073/pnas.1810979115

      Sauer, N., & Stolz, J. (1994). SUC1 and SUC2: two sucrose transporters from Arabidopsis thaliana; expression and characterization in baker’s yeast and identification of the histidine-tagged protein. The Plant Journal, 6(1), 67–77. https://doi.org/10.1046/j.1365-313X.1994.6010067.x

      Wang, Z., Butel, N., Santos-González, J., Borges, F., Yi, J., Martienssen, R. A., Martinez, G., & Köhler, C. (2020). Polymerase IV Plays a Crucial Role in Pollen Development in Capsella. The Plant Cell, 32(4), 950–966. https://doi.org/10.1105/tpc.19.00938

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Short Summary:

      In this study, Satyaki and Gehring investigate the role of RNA Pol IV in Arabidopsis endosperm, focusing on parent-of-origin-specific functions and potential mechanisms. Using a combination of gene expression, sRNA profiling, and DNA methylation data from reciprocal crosses, they find that maternal loss of Pol IV has distinct, and in some cases opposite, effects on gene expression compared to paternal loss of Pol IV. This is also true to a lesser extent for sRNAs and DNA methylation, consistent with the function of RNA Pol IV in driving 24nt sRNA production and targeted DNA methylation through the RdDM pathway. DNA methylation was more strongly affected by paternal Pol IV loss while expression was much more affected in maternal Pol IV loss. Surprisingly, the authors consistently find no evidence that the minor changes in sRNA production or DNA methylation in maternal/paternal nrpd1/+ heterozygotes are correlated with gene expression changes in either heterozygote. However, while the mechanism remains unclear, evidence presented here that maternal and paternal Pol IV can have opposite, additive effects on gene/sRNA expression and phenotype (rescue of paternal excess crosses) is convincing and an interesting finding, potentially consistent with the idea that Pol IV helps mediate parental conflict in endosperm.

      Major Comments/suggestions:

      • Expression of NRPD1 was 42% of WT in paternal nrpd1 and 91% of WT in maternal nrpd1, yet throughout the paper the effect of maternal nrpd1 was far stronger than paternal nrpd1. The authors may also want to confirm that protein levels follow the same pattern, in case protein degradation or post-transcriptional regulation may play a role.
      • P. 9 line 1 - this only seems to be true for maternal ISRs, not paternal ISRs; this claim should be narrowed.
      • A small number of sRNA loci become highly depleted in maternal nrpd1 but not paternal nrpd1 (Fig. 1D, F, Fig. 2C) - are these siren loci? Fig. 2 suggests that many of the downregulated sRNA regions in maternal nrpd1 are maternally biased to begin with. Related, are genic sRNAs more likely to be affected by maternal or paternal nrpd1 than non-genic or TE sRNAs?
      • For the sRNA loci shown in Fig. 2C, how is % maternal affected in maternal vs. paternal nrpd1? These ISRs are normally maternal or paternal biased, does this change in maternal or paternal nrpd1?
      • Might have missed this, but I didn't see the gene ontology results (p9 line 16) shown anywhere? Would like to see significance values, fold enrichments, etc. In particular, the group of paternal nrpd1 up-regulated genes seems too small to have much confidence for GO enrichment analysis.
      • I would suggest expanding the analysis in Fig. 3D-H to explore whether the additive model is more predictive of nrpd1-/- expression levels than other potential models (epistatic, etc.) in general at all genes, or only at the subsets of genes shown, independently of whether the effects are large enough to pass the arbitrary significance cutoffs used in E-H. Identifying specifically which genes do and don't follow this additive pattern could help dissect mechanism. For example, genes following this pattern might share a TF binding site for a TF that is regulated by Pol IV.
      • P. 13 line 26 - how do changes in CG methylation in maternal or paternal nrpd1 compare to changes in dme or ros1? Do either set of DMRs significantly overlap dme or ros1 DMRs? Could some of these be explained by changes in ROS1 expression, since ROS1 is a Pol IV target?
      • P. 10 line 3 - is the overlap of 36 out of 51 genes unlikely to occur by chance?
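
      One way to frame the question about the 36-of-51 overlap is as a hypergeometric tail probability. The sketch below is only an illustration of that framing, not the analysis used in the manuscript. The function name `overlap_p_value`, the size of the second gene set (60) and the number of expressed genes (20000) are hypothetical placeholders, because neither value is given in this report.

```python
from math import comb


def overlap_p_value(overlap, n_a, n_b, universe):
    """Return P(X >= overlap) under the hypergeometric null.

    n_b genes are drawn at random, without replacement, from `universe`
    genes, of which n_a belong to gene set A. X is the number of drawn
    genes that fall in set A.
    """
    max_overlap = min(n_a, n_b)
    total = comb(universe, n_b)
    tail = sum(
        comb(n_a, k) * comb(universe - n_a, n_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# 36 and 51 come from the referee's comment; 60 and 20000 are
# hypothetical stand-ins for the second set size and the gene universe.
p = overlap_p_value(overlap=36, n_a=51, n_b=60, universe=20000)
print(f"P(overlap >= 36 by chance) = {p:.3g}")
```

      The result depends strongly on the chosen universe, so the set of genes detectably expressed in endosperm would be a more defensible background than all annotated genes.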

      Minor Comments:

      • In sRNA and mRNA-seq libraries, what was the overall maternal/paternal ratio in each library? Did loss of Pol IV affect this?
      • P. 9 line 22 - how many paternally and maternally expressed imprinted genes were considered? Were imprinted genes statistically more likely to be misregulated in mat nrpd1?

      Significance

      Significance:

      PolIV is a plant-specific polymerase that functions as part of the plant-specific RNA-directed DNA methylation pathway, which has been very well characterized in Arabidopsis. Mutations in PolIV were previously shown to rescue paternal excess crosses when inherited paternally (Erdmann et al. 2017), and this study extended that finding to show that maternal vs. paternal loss of Pol IV has opposite effects on seed survival in paternal excess crosses. Only one other example (met1) of opposed paternal vs. maternal effects on seed development is known, making Pol IV a useful tool for studying why and how these effects occur. As the authors note, the dominant theory on 'why' involves Pol IV mediating parental conflict over resource allocation in the seed, and the opposite effects of Pol IV maternal/paternal loss at some genes support this hypothesis. The 'how' remains unclear, although this study eliminates several possibilities, and the most likely remaining model is that Pol IV parent-of-origin specific effects occur mostly in trans. Future work can build on these findings to identify the mechanism by which Pol IV achieves these parent-of-origin specific effects.

      My background is mostly in plant epigenetics and genomics.

      Referee Cross-commenting

      The other referee comments seem fair, and I have no comments at this time.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript provides evidence that a loss of either the maternal or paternal copy of NRPD1 has different, and sometimes opposite, effects on the accumulation of small RNAs and on expression of a subset of genes, with a loss of the maternal copy having more substantial effects. The manuscript is well written, and the conclusions, as far as they go, are justified by the data, which are effectively presented. The overall effect is subtle but informative and, according to the authors, supports a parental conflict model for imprinting. The experiments failed to find a smoking gun in the form of a mechanism to explain how or why the maternal and paternal alleles have different effects, and the explanation for a lack of clear phenotypic differences was reasonable, but untested. I would have liked to see it tested by looking in a plant species that is outcrossing and highly polymorphic. However, I do appreciate that the observation that the male and female alleles can have distinct effects when mutant is an important clue. My specific comments below may reflect confusion on my part, rather than real issues. If that is the case, I hope that confusion can aid in clarifying what are in places quite subtle points.

      Specific comments:

      Page 6, last paragraph: "Because the endosperm is triploid, in these comparisons there are 3 (wild-type), 2 (pat nrpd1+/-), 1 (mat nrpd1+/-) and 0 (nrpd1-/-) functional NRPD1 alleles in the endosperm. However, NRPD1 is a paternally expressed imprinted gene in wild-type Ler x Col endosperm and the single paternal allele contributes 62% of the NRPD1 transcript whereas 38% comes from the two maternal alleles (Pignatta et al., 2014). Consistent with paternal allele bias in NPRD1 expression, mRNA-Seq data shows that NRPD1 is expressed at 42% of wild-type levels in pat nrpd1+/- and at 91% of wild-type levels in mat nrpd1+/- (Supplementary Table 6)".

      I would think this would really complicate the analysis. Should all of the dosage values include NRPD1 imprinting values? That is to say, expressed in terms of expression values? This is also a bit confusing. The maternal copies together express 38% of the transcript, so why isn't the mat nrpd1 at 68%, rather than 91%? In any event, given this imprinting and differences in dosage of the male and female, it appears that two variables, parental origin and expression levels, are being compared. Since 91% is awfully close to 100%, are the mat pat comparisons really just comparing low with nearly normal expression of NRPD1? And actually, given that, the outsized effect of the mat nrpd1 +/- is even more striking.

      Page 4, Line 16. I'm afraid it's still a bit difficult to understand what was being compared with what in this section. Please clarify.

      Page 5, Line 5. I'm sure this is fine, but it's not entirely clear what is from the previously published paper and what is reanalysis here. All the crosses and measurements were made then, but not organized in this way?

      Page 6, Line 26. This is an excellent dosage series, but it's complicated by imprinting. So it's not 3, 2, 1, 0 effective copies. If we set the paternal copy at ~1 and each maternal at ~0.1, then it's 1.2 (wild type), 0.20 (pat nrpd1+/-), 1 (mat nrpd1+/-), and 0 (nrpd1-/-).
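
      The effective-copy arithmetic in this comment can be written out as a short sketch. The per-allele weights (paternal ~1, each maternal ~0.1) are the illustrative values proposed in the comment, not measured quantities, and the names `effective_dosage` and `genotypes` are hypothetical.

```python
# Illustrative per-allele weights taken from the referee's comment:
# the paternal NRPD1 allele counts as ~1, each maternal allele as ~0.1.
PATERNAL_WEIGHT = 1.0
MATERNAL_WEIGHT = 0.1


def effective_dosage(functional_maternal, functional_paternal):
    """Weighted number of functional NRPD1 copies in the endosperm."""
    return (functional_maternal * MATERNAL_WEIGHT
            + functional_paternal * PATERNAL_WEIGHT)


# Endosperm is triploid: two maternal genome copies and one paternal copy.
# Each tuple is (functional maternal alleles, functional paternal alleles).
genotypes = {
    "wild type": (2, 1),
    "pat nrpd1+/-": (2, 0),
    "mat nrpd1+/-": (0, 1),
    "nrpd1-/-": (0, 0),
}

dosages = {name: effective_dosage(m, p) for name, (m, p) in genotypes.items()}
wild_type = dosages["wild type"]
for name, dose in dosages.items():
    percent = 100 * dose / wild_type
    print(f"{name}: {dose:.2f} effective copies ({percent:.0f}% of wild type)")
```

      With these weights the series is 1.2, 0.2, 1 and 0 effective copies, which puts the two heterozygotes at roughly 17% and 83% of wild type. The measured transcript levels quoted above are 42% and 91%, so the weights should be read as a rough illustration only.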

      Page 6, line 31. Is the main thing we are comparing the difference between expression at 42% versus 91% of wild type?

      Page 7, line 11. So, the overall effect in either direction on smRNA gene targets was really quite small, and I'm guessing the effect on gene expression was even smaller.

      Page 7, line 17. I take it that it is this difference, rather than the overall numbers that is of interest.

      Page 9, line 2. Really interesting, since one might expect that these are methylated loci that would be expected to be fed into any existing embryo maintenance methylation pathway. Surprising that they are maintained independently.

      Page 9, line 22. Proportion of total imprinted genes? Did the mutant obviate/enhance the imprinting?

      Page 9, line 27. How could 2) occur in the homozygous mutant?

      Page 10, line 9. Interesting!

      Page 10, line 11. Is this 2.7 versus 2.18 significant because it's statistically significant, or because it's conceptually significant? Are the examples in 3D representative, or the most convincing examples? And a big difference in ROS1 is of some concern, since that may well be expected to affect imprinting indirectly. I know I'm being picky here, but the pattern is so intriguing I'd be worried about confirmation bias.

      Page 10, line 18. Ok, but 0.123 is a pretty subtle negative correlation. Although I do appreciate that it clearly is not a positive correlation. If I'm understanding correctly, this was the "aha" moment, because it's exactly what one might expect of NRPD1 from the mother and father working at cross purposes. But the numbers are getting awfully small here.

      Page 10, line 29. The point simply being that that other phenomenon is also significant even if the differences are that large?

      Page 12. So, there is no effect on cleavage and no obvious effect on flanking siRNA clusters. The suspense is building...

      Page 12, line 24. And not in potential regulatory regions? CNSs?

      Page 12, line 28. I guess it depends on whether or not the changes are in regulatory sequences not immediately apparent as part of the gene, doesn't it?

      Page 13, line 2. Not sure it's that important, but couldn't you chop all of these genes in half and see if they are no longer significant collectively?

      Page 14, line 15. I'm afraid I'm getting confused with the terms cis and trans here. Just to be clear, cis means a direct effect of small RNAs that are dependent on NRPD1 on a gene and trans means anything else? But in this context, it's not clear that is what is meant. Do you mean that gene expression is determined and preset in the gametophyte? What are the levels of expression of NRPD1 in the two gametophytes?

      Page 14, line 19. Prior to fertilization?

      Page 14, line 27. Do you mean driven by, or just associated with?

      Page 15, line 26. And this is really the issue. The primary conclusion, backed up by the lack (I'm assuming) of phenotypic differences between mat NRPD1 -/+ and pat NRPD1 +/- suggests that the observed differences in expression are not particularly important, unless the exceptional cases are informative.

      Page 15, line 12. Yes, but I'm not at all clear what the mechanism for this is.

      Page 15, line 23. Since this is a very small subset of genes, are these genes that you might expect to play a role in parental conflict?

      Page 15, line 33. Indeed, these could be the informative exceptions.

      Page 15, line 29. Hardly surprising, given that the paternal copy of NRPD1 is expressed at a higher level than the maternal copies, is it?

      Page 16, line 1. So this is what you mean by in cis. Presetting?

      Page 16, line 9. So ideally, one would want to look at a highly polymorphic out-crosser. I'm not suggesting that for this paper, but would this be a good test of the hypothesis? How about maize?

      Page 16, line 15. But the pat and mat heterozygotes looked the same. No differences in phenotype?

      Page 17, line 22. I'm confused, since aren't most 24 nt smRNAs dependent on POLIV (Figure S2)? Do you mean differentially regulated smRNAs? Expression of POLIV specifically in one or the other parent?

      Page 17, line 23. How are you defining important here? Important because at least in the female NRPD1 is not expressed in the central cell? But not important, since this mutant has no effect on phenotype except in an imbalanced cross.

      Page 18, line 13. For this reason, it would be nice to know much more about these genes. Mutant phenotypes, for instance. And how many of these have this feature conserved?

      Significance

      These are intriguing results that would benefit from a test of the hypothesis by comparing these results with those obtained in an outcrossing plant species.

      Referee Cross-commenting

      I agree that the other comments seem both fair and reasonable.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      This study addresses key aspects of gene regulation in the developing endosperm of flowering plants. The endosperm is the product of the fusion of the (normally diploid) female central cell with one of the sperm cells, and is indispensable for nourishing the developing embryo, among other important functions. Evolutionary models predict that in flowering plants, the endosperm ought to be the tissue in which parental conflict over the allocation of (female) resources to progeny should manifest. Consequently, endosperm gene expression (including the phenomenon of genomic imprinting) and developmental trajectories have been studied from various perspectives, including the possibility of the fast build-up of reproductive barriers due to failing endosperm (and thus seed) development.

      More specifically, this study utilizes knock-out mutants of the NRPD1 gene, which codes for the largest subunit of RNA Polymerase IV (Pol IV), which is part of the RNA-directed DNA methylation (RdDM) pathway. It builds on previous work by the authors that suggested an important role for Pol IV in mediating allelic dosage in developing endosperm, that small interfering RNAs are produced from both paternally and maternally-derived alleles (Erdmann et al. 2017; contra purported claims by other labs), and that normally inviable seeds from paternal-excess crosses (2n x 4n) can be largely rescued by knocking out individual (paternal) components of the RdDM pathway (Satyaki & Gehring 2019).

      Here, Satyaki & Gehring characterize a variety of expression responses in reciprocal heterozygotes, i.e. products of crosses between a homozygous WT and a homozygous nrpd1 mutant (all with the Ler and Col-0 accessions of A. thaliana). The resulting heterozygotes differ in whether the maternal parent (mat nrpd1+/-) or the paternal parent (pat nrpd1+/-) contributed the nrpd1 allele. In addition, mRNA and sRNA expression was also assessed for the WT (+/+) and the homozygous nrpd1 lines (-/-).

      Key findings of this work are that the loss of Pol IV in maternal and paternal parents has different consequences for endosperm gene expression, some of which appear to be antagonistic. In other words, the presence of a functioning Pol IV in the mother and father has parent-of-origin effects on the resulting endosperm. Furthermore, one parent's copy of NRPD1 was found to be sufficient for the production of most Pol IV-dependent sRNAs, yet with a fairly small number of mostly non-overlapping loci losing sRNAs upon loss of either maternal or paternal NRPD1. Pol IV activity in the father and mother is shown to have distinct impacts on the endosperm's DNA methylation landscape.

      Interestingly, while the proportion of mis-regulated genes seems small in both heterozygotes, it is much more restricted in pat nrpd1+/-. Jointly, the authors' results suggest that paternal and maternal Pol IV are genetically antagonistic and that their effects on endosperm transcription in heterozygotes are established before fertilization.

      Major Comments:

      I believe that this is a very sound and authoritative study. The analysis of all data seems appropriate and robust, and many connections between the data (and subsets of data) and their possible interpretations have been considered. In fact, in the massive Results section, some interpretations are supported by cited references (this is not meant as a critique). However, I wonder about the length of the Results section, and the balance between it and the relatively short Discussion section. It is difficult for me to nail down any part of Results that might be shortened, as I could not find clear redundancies. I also think that the level of speculation is absolutely warranted, and I did not find excessive claims being made to this or that end. Rather, I suggest to broaden the perspective somewhat (in their Discussion; see below under Significance), which might allow people with a less mechanistic perspective to grasp the potential relevance of this work for non-model plant systems studied mostly by evolutionary geneticists.

      One aspect that might warrant more scrutiny is the mapping of sRNA reads to the reference genome. I found the short section on this (M&M section, page 20, lines 23-25) to be too brief. It is not clear to me which of ShortStack v3's weighting schemes the authors used, which is relevant for multi-mapping reads (see NR Johnson et al. 2016, G3). In addition, it is not mentioned whether zero mismatches were allowed. Perhaps this is described in more detail in Erdmann et al. (2017), but even if so, it deserves to be clarified here.

      All in all, I find this work to be meticulously presented and the data to be thoughtfully interpreted. The major conclusions seem to be convincing and adequate, given the underlying data. I have no qualms about replication issues, nor about statistics.

      Minor Comments:

      The manuscript is well-written and concise, despite the length of the Results section. The verbal clarity and absence of typos or grammatical issues are superb. I did find some of the Figures to be somewhat "un-intuitive", in the sense that it takes acute concentration for an outsider (of sorts) to gather and interpret the underlying data. This is probably due to the many cross-comparisons of differences between two genotypes on one axis and those of a different pair of genotypes on the other axis. I am not sure how this issue can be ameliorated (nor whether this is really necessary); however, from a technical point of view, all Figures and Suppl. Figures are really well-done.

      The list of references seems adequate in terms of citing relevant (both older and very recent) publications. However, almost all cited papers concern Arabidopsis or other model species; I suggest to consider adding a few relevant studies on non-Brassicaceae (whether considered model taxa or not), in conjunction with my suggestion (in Significance) to potentially broaden the scope by searching for natural phenomena that also involve parent-of-origin effects on endosperm/seed development. Curiously, many of the references are "incomplete" in the sense of stopping with the journal's name, then stating the doi, i.e. they lack volume numbers and page/article numbers. This should be harmonized throughout.

      Significance

      Significance:

      While part of the earlier data from Erdmann et al. (2017) were re-analyzed in the present study, the vast majority of the data are new and concern the expression consequences at the diploid level (2n x 2n crosses), and thus may prove to be more relevant for future comparisons with non-model flowering plants, either for normal intraspecific seed development or (partly) failing crosses between slightly diverged evolutionary lineages. In my view, this study presents a significant advance in understanding the downstream consequences (endosperm mRNA and sRNA expression levels, levels and patterns of DNA methylation) of a perturbed Pol IV expression in both parents or the female vs. male parent. Much of the emphasis in the field has been on paternal-excess crosses, within the larger realm of the "triploid block" or the reproductive barriers between plants of different ploidy (typically diploids x tetraploids in both cross directions).

      The fact that the molecular consequences of disabled Pol IV in one or both parents were assessed in balanced crosses (and not interploidy crosses) may allow an easier connection to natural phenomena such as (partial) hybrid seed lethality between closely related lineages of flowering plants, where parent-of-origin effects have emerged in the recent literature, both at the phenotypic level of endosperm/seed growth and at the molecular level (perturbed imprinting in the endosperm, mRNA and sRNA expression levels, etc.). It thus might prove worthwhile to screen recent papers in Solanum, Mimulus, and Capsella to evaluate the possibility of "connections" between the current data and recent, admittedly more descriptive, findings in diverse taxa that don't offer the same genomic resources as Arabidopsis.

      What the current version of this work already does is relate the finding of partly antagonistic influences of pat nrpd1+/- versus mat nrpd1+/- on endosperm mRNA expression to evolutionary models championed by D. Haig ("parental conflict", "kinship theory"). My above suggestions would strengthen such connections and likely would broaden the appeal of this work to scientists with diverse backgrounds outside the core expertise of plant molecular/developmental biologists. In any case, I see the prime scientific audience as the latter group, but I see potential to intrigue people with a more evolutionary background.

      Field of expertise:

      population genomics, hybrid seed lethality, speciation, genomic imprinting, evolutionary models.

      Referee Cross-commenting

      I agree that the other referee comments (while being quite complementary to mine due to differences in main expertise) seem both fair and reasonable.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Description of the planned revisions

      From Review Commons: Please find below our point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees

      **************

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):


      Comment #1: In this manuscript, the authors follow up on an interesting finding that varA null mutants of V. cholerae form spherical cells in stationary phase. The authors determine that this cell rounding is due to weakening of the cell wall via less production of tetrapeptide cross links. Mutation of the regulator csrA and the enzyme aspA lead to a model in which a varA mutant cell lacks aspartate leading to low cross-linked cell wall that is unable to hold the typical curved V. cholerae shape. The data are robust, and the manuscript is clearly written.

      Authors’ reply #1: We very much appreciate the reviewer’s accurate summary and the appraisal of the robust data and the clearly written manuscript.

      Comment #2: I think the finding is quite interesting, even though it is not clear to me if this observed cell morphology has a biological function or if it is an artifact of completely removing VarA. However, this manuscript builds the foundation to further test this question.

      Authors’ reply #2: We agree with the reviewer. However, it is worth mentioning that this two-component system (TCS) was first described in 1998, yet the input signal (or repression of the signal under certain conditions) still remains elusive. Maybe this isn’t too surprising, given that studies on V. cholerae are strongly biased towards its pathogenic lifestyle, while the varA/varS system is highly conserved among Gram-negative bacteria including non-pathogenic environmental V. cholerae strains. These strains can live under very diverse conditions of slow or fast growth, including long starvation periods. Unfortunately, we still lack significant insight into this part of V. cholerae biology. We therefore believe that the current study is very important, as the elucidation of the molecular mechanism of the observed shape transition within the varA mutant will foster fresh hypotheses on the role that the system plays in V. cholerae and what signals might be sensed.

      We would also like to remind the editor and reviewer(s) that a plethora of studies have been published based on varA and varS (and csrA) deletion mutants of V. cholerae with various readouts ranging from transcriptomics to quorum sensing defects, impairment of virulence, etc. Thus, the argument that the complete removal of varA might cause an artifact seems equally valid for previous work by others and, maybe, even the vast majority of studies in which TCS are investigated for which the sensed signal has yet to be identified.

      In conclusion, we propose to address this valid point of critique in the revised manuscript by clearly stating the caveat of the gene deletion(s). However, as the reviewer correctly stated, “this manuscript builds the foundation to further test this question.”

      Comment #3: The data all support the conclusions, but I do think the authors could have really confirmed their model by connecting mutations in csrA and aspA to restoration of a highly cross-linked cell wall similar to the WT strain, as done in Fig. 2. As it stands, this is still somewhat hypothetical and has not been directly demonstrated, although I do think their model is correct and these experiments will be confirmation of it.

      Authors’ reply #3: We thank the reviewer for their comment and the assumption that our model might be correct. It is very unlikely that the csrA suppressor mutant(s) or the ∆varA∆aspA mutant maintain the low level of cross-links and the high level of dipeptides that we observed for the ∆varA mutant. Indeed, it would be unclear how the cells could restore the Vibrio shape that we visualized in the phase contrast image under such conditions. However, as this point seems very important to the reviewer (see also comment #7 below), we will perform the suggested cell wall analysis of these mutants and include the new data in the revised manuscript.

      Comment #4: I also have a few other suggestions to improve the manuscript, but in sum I think it is a well-done research study that will be interesting to researchers working on V. cholerae and other gamma proteobacteria.

      Authors’ reply #4: Once again, we thank the reviewer for their kind words.

      Major comments:

      Comment #5: 1. The enrichment for suppressors is very creative and connected the varA impact on cell morphology to misregulation of csrA as 10/10 mutants were ultimately linked to this gene. However, insertion in aspA should also suppress this phenotype, and I am curious why this gene was not identified in the transposon suppressor screen.

      Authors’ reply #5: This is a very relevant and important comment. The reason why we did not isolate ∆varA-aspA::Tn mutants is most likely due to a growth defect that we observed for the double ∆varA∆aspA mutant compared to the ∆varA-csrA suppressor mutant(s). In the figure on the right, respective growth curves are shown [∆varA∆aspA in orange and the ∆varA-csrA suppressor mutant ∆varA-Tn A in gray]. Any ∆varA-aspA::Tn mutant is therefore expected to be outcompeted by the ∆varA-csrA suppressor mutants during the enrichment process. We will include this information and the corresponding data (e.g., final growth curves after 3 biologically independent experiments) in the revised manuscript.

      [figure not shown in online form]

      Comment #6: 2. The authors should complement at least one of their varA/csrA mutants with csrA.

      Authors’ reply #6: Agreed. We are in the process of performing the suggested experiment and will include the results in the revised manuscript.

      Comment #7: 3. The changes in cell wall structure are not directly connected to the genetic identification of csrA and aspA. Yes, I agree their model makes sense, but to really nail it down they should analyze the cell wall composition in the varA/csrA and varA/aspA double mutants and show it returns to WT levels of crosslinking.

      Authors’ reply #7: As mentioned above, we will perform those cell wall analyses (see also authors’ reply #3 above), as requested.

      Comment #8: 4. Does deletion of aspA in the WT or varA mutant impact the growth rate?

      Authors’ reply #8: This is indeed the case in the ∆varA background but not in the WT background (as shown under authors’ reply #5). These data will be included in the revised manuscript.

      Minor comments

      Comment #9: 5. Are the round cells able to divide? The data in Fig. S2 would suggest they can based on the increase in CFUs from hour 6 to hour 8, but the authors never comment on this point and it might be worth addressing in the discussion.

      Authors’ reply #9: We never observed dividing round cells. Indeed, the increase of the optical density and CFUs from 6h to 8h is most likely based on those bacteria that have not yet changed their cellular morphology and therefore keep dividing (see below the imaging/quantification for this timeframe taken from Fig. 2). What we observed, though, is that upon dilution into fresh medium, the round cells start to elongate and then divide, resulting in newborn Vibrio-shaped cells. We will include these new data in the revised manuscript.

      [figure not shown in online form]

      Comment #10: 6. Fig. S2-Why does the OD600 increase from 8 to 24 hours but the CFUs decrease in the varA mutant?

      Authors’ reply #10: This is an interesting observation that might reflect the presence of dead but not yet lysed cells in these cultures. Indeed, while it looks as if the OD600 values are still increasing for the ∆varA mutant at 24h, we cannot exclude at this point that the OD600 values increased during the 16h-time interval and went again down at 24h (e.g., like shifting the WT peak to later time points/the right of the X-axis). Notably, the purpose of this figure was mostly to i) indicate the slower growth of the ∆varA mutant while ii) emphasizing that late during growth (e.g., 24h here) the strain can still reach similar OD600 values as well as CFUs/ml as the WT strain. We will change Fig. S2 to better emphasize these two points in the revised manuscript.

      Comment #11: 7. Lines 309-A little bit more detail here would help the reader. Are the authors examining whole cell lysates or lysates from specific cellular components? I am actually very surprised this worked as there are so many proteins in crude cell lysates.

      Authors’ reply #11: Indeed, these are whole cell lysates, which were prepared as described in the methods section (lines 482 onwards; “SDS-PAGE, Western blotting and Coomassie blue staining”). We fully agree that there are many proteins in the crude cell lysates and realized that we might not have explained well enough that only the gel region containing the overproduced band in the ∆varA strain and the same location in the WT sample were analyzed by mass spectrometry (even though we were referring to the “gel pieces” in line 498 onwards). Please accept our sincerest apologies for this neglect. During the revision, we will ensure that this information is explicitly stated and that these details are included in the main text and the methods section.


      Comment #12: 8. Lines 320-321-I don't think there is evidence that CsrA enhances aspA RNA translation, merely that CsrA enhances AspA protein production. It is likely through increasing translation, but this cannot be concluded without direct evidence.

      Authors’ reply #12: We fully agree and thank the reviewer for this important comment. Indeed, we meanwhile know that the aspA mRNA levels also increase in the ∆varA mutant strain (which might or might not be linked to enhanced translation). We will add these transcript level data to the revised manuscript and discuss all possibilities that could explain the AspA overproduction.

      Comment #13: 9. Line 348-350-I do not understand the logic of this sentence stating that the "..until now, the signal that abrogates VarA phosphorylation..." as this manuscript does not contribute to our understanding of the VarS signal.

      Authors’ reply #13: We apologize that this sentence or the logic behind it wasn’t clear. As this is a combined result and discussion section, the aim of the sentence was to put the observed shape transition of the bacteria into a broader context, which required us to mention that the input signal is still unknown. We will make sure that this becomes more obvious in the revised manuscript by rephrasing this sentence.

      Comment #14: 10. I am curious if the total volume of the round versus curved cells is constant at 20 hours. This should be easy to determine using ImageJ and worth reporting.

      Authors’ reply #14: We are not entirely sure how this question is relevant for the study (e.g., for this report on the observed shape transition phenotype it doesn’t matter if the cells maintain the same volume or not). However, given the importance for the reviewer, we will perform these volume measurements on our images and add a sentence to the revised manuscript on the analysis’ outcome (plus include the data as a supplementary panel).


      Reviewer #1 (Significance (Required)):

      Comment #15: Understanding changes to cell morphology and their biological implications is a growing area of microbiology. This study makes a new contribution to this area by demonstrating a round, spherical form of V. cholerae that is driven by alterations to the cell that decrease cell-wall cross linking.

      Authors’ reply #15: Once again, we thank the reviewer for this summary and for placing our study into context. We agree that cell morphological changes and the underlying molecular mechanism(s) are an exciting and growing area of microbiology.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Comment #16: In this manuscript, Rocha et al studied the effect of the VarA response regulator on cell shape of Vibrio cholerae. VarA is part of a two-component system that also includes the histidine kinase VarS. It has previously been shown that VarA activates the expression of three redundantly acting regulatory RNAs called CsrB, CsrC, and CsrD. All three Csr RNAs share the same regulatory principle, which is to sequester the activity of the RNA-binding protein CsrA. CsrA in turn can bind hundreds of mRNA species in the cell, which in the majority of cases results in reduced translation of these mRNAs (in addition to various other modes of action that have been reported). Here, the authors discovered that deletion of the varA gene results in an abnormal, spherical cell shape in stationary phase grown V. cholerae cells. Biochemical analysis revealed an unusual peptidoglycan composition in varA-deficient cells, showing increased levels of dipeptides, a reduction of tetrapeptides, and an overall decrease in peptide-cross linkage. Interestingly, the varA phenotype was complemented by the addition of conditioned medium from wild-type cells, which are likely to provide peptidoglycan building blocks in trans. The authors further discover that varA-deficiency results in AspA over-production, which could be linked to the activity of CsrA. The authors speculate that high AspA levels deplete the cell of aspartate, which is required to produce peptidoglycan precursors. The manuscript is interesting, well-written, and the rationale of the experiments is easy to follow.

      Authors’ reply #16: We thank the reviewer for this excellent summary and the kind words on the quality of the manuscript.

      Comment #17: However, I have two major points of criticism, which reduce my enthusiasm for this work. First, the molecular pathway that links varA-deficiency to increased AspA levels is incomplete: please clarify how CsrA activates AspA levels and if this phenotype is linked to direct binding of CsrA to the aspA mRNA and, if so, how activation is achieved at the molecular level.

      Authors’ reply #17: We thank the reviewer for the comment. However, we never had the intention to decipher the entire pathway and it is indeed possible that intermediate regulators might be involved. Notably, the first part of the signaling pathway (VarA -> CsrB,C,D -> CsrA) seemed well established in the literature and we truly believe that our work supports this part of the pathway (given the numerous csrA suppressor mutants that we obtained in the varA-minus background). For the link between CsrA and AspA, we indeed do not provide direct evidence. Nonetheless, we discuss recent work in Salmonella by mentioning “Interestingly, previous studies identified the aspA mRNA amongst hundreds of direct CsrA targets in Salmonella using the CLIP-seq technique to identify protein-RNA interactions (32)”. Notably, this finding by Holmqvist et al. (2016, EMBO J.) has been reproduced for E. coli by Potts et al. (2017, Nat. Commun.; see Supplementary Data file 1, CsrA CLIP-seq data; with three highly significant peaks corresponding to aspA mRNA binding), information that we will add to the revised manuscript. Of course, neither Salmonella nor E. coli belongs to the same genus as V. cholerae (though all are gamma-Proteobacteria). Thus, to accommodate the reviewer’s comment, we will revise the manuscript to include the caveat that direct CsrA binding of the aspA mRNA has been shown in both Salmonella and E. coli but that it is still feasible that intermediate regulatory proteins might be involved in the case of V. cholerae. We will also revise the model to show such potential intermediate steps between CsrA and AspA.

      Comment #18: Second, I am not convinced about the biological relevance of the findings. The authors speculate in the discussion section that the VarA-pathway could modulate cell shape under physiological conditions, however, I am not sure such conditions exist given that VarA activity is not only controlled by VarS, but rather integrates information from multiple histidine kinases. I have several additional comments, which I list below.

      Authors’ reply #18: We regret the referee’s personal opinion that our findings might not be of biological relevance.

      However, we respectfully disagree with the notion that such physiological conditions would never occur just because several histidine kinases can feed into VarA signaling. Indeed, as discussed above under authors’ reply #2, the (VarS/)VarA-CsrA pathway is highly conserved in Vibrio species and other proteobacteria. Yet, for V. cholerae, most studies have focused on virulence-inducing conditions, while the species’ environmental lifestyle has been vastly neglected in the past. Indeed, several V. cholerae phenotypes described in our own work (natural competence for transformation [Meibom*, Blokesch* et al., 2005, Science]; T6SS production in pandemic strains [Borgeaud et al., 2015, Science]; pilus-mediated aggregation [Adams et al., 2019, Nat. Microbiol]; etc.) remained unknown for decades, given the chitin dependency for their induction – a substrate not commonly studied in lab settings. Interestingly, several of these findings were also initially considered biologically irrelevant, “artifacts”, or even non-reproducible by reviewers, while nowadays all these phenotypes have been extensively reproduced by many different research groups and are well accepted in the field as biologically highly relevant. Thus, we truly believe that one should be open to new phenotypes and, as reviewer #1 rightfully acknowledged, consider that “this manuscript builds the foundation to further test this question.”

      It should also be noted that the (VarS/)VarA-CsrA system has been studied for >15 years based on deletion strains, as we did in this study, and the readouts of these studies have been well accepted in the field without provision of the physiological conditions that would mimic the situation of these knock-out strains.

      Collectively, we truly believe that there are still many understudied physiological conditions for V. cholerae; however, finding the right conditions could take years and is therefore beyond the scope of the current study.


      Major points:

      Comment #19: - Figs. 1 and S1: I think it is interesting that the varS mutant strain does not share the cell shape phenotype with the varA mutant. As pointed out by the authors, this result indicates that varA activity is controlled by another histidine kinase. While I believe it might be beyond the scope of this manuscript to determine which other histidine kinases signal towards VarA, I think it would be useful to measure and compare CsrB/C/D levels in WT, ∆varA, and ∆varS cells.

      Authors’ reply #19: Thanks for this comment. We fully agree that finding the secondary histidine kinase is beyond the scope of this study. In the revised manuscript, we will, however, include the CsrB/C/D levels of the WT, ∆varA, and ∆varS strains, as suggested by the reviewer.

      Comment #20: - Figs. 1, S1, and 4C: The regulatory logic implied by these results suggests that deletion of varA results in reduced CsrB/C/D levels, which in turn leads to higher activity of CsrA in the cell. Thus, it would be useful to test if A) over-production of CsrB, CsrC, or CsrD can rescue the phenotype of a varA mutant and if B) combined deletion of csrB/C/D will phenocopy the mutation of varA.

      Authors’ reply #20: These are also very good suggestions. Notably, this has been done in the past for the V. cholerae system (Lenz et al., 2005, Mol. Microbiol.). Indeed, after receiving the reviewer’s comment, we immediately asked these authors to kindly share their csrB,C,D overproduction plasmids as well as the triple knock-out strain with us (as all of these constructs have been extensively verified in their published work). Unfortunately, we are not entirely sure whether it will be possible to receive these constructs any time soon, as we were told that such shipment might take >1 year (though, upon further discussion, this timeframe was lowered to ~3 months). If we manage to receive these published constructs in a reasonable timeframe, we will certainly perform the suggested experiments.


      Comment #21: - Figs. 4B & 5C: I was somewhat surprised by these results. Given that AspA overproduction is suggested to cause cell shape abnormalities in the varA mutant, I would have expected additional transposon insertions in aspA. The fact that mutations only occurred in csrA could indicate that additional (CsrA-controlled) factors could be involved in the phenotype.

      Authors’ reply #21: See authors’ reply #5 above, where we explain that the ∆varA∆aspA strain has a slight growth disadvantage. For this reason, any ∆varA-aspA::Tn transposon mutant would likely be outcompeted by the csrA suppressor mutants in our genetic screen.


      Minor points:

      Comment #22: - Throughout study: italicize gene names

      Authors’ reply #22: Gene names have been italicized in the initial manuscript; however, strain names - such as strain ∆varA – haven’t been italicized, in accordance with several of our previous publications. However, for the revision, we will italicize all strain names to accommodate the reviewer’s request.

      Comment #23: - Figs. 1C and S3C: please quantify the results of these western blots and indicate how many replicates were performed.

      Authors’ reply #23: We apologize for this oversight – indeed, all Western blots were performed three independent times, as is good scientific practice. We will add this information to the methods section of the revised manuscript.

      Concerning the quantification: the primary claim of these figures is that the HapR protein is still produced in the ∆varA mutant in the different pandemic strain backgrounds, while the luxO-mutated strains have a significant defect in HapR production (as we have previously reported; Stutzmann and Blokesch, 2016, mSphere). These data are qualitatively very clear in the Western Blots and can be considered as “black or white” results.

      However, for the revision we will quantify the bands’ intensities of the performed Western blots and provide these quantitative data, as requested by the reviewer.

      Comment #24: - Fig. 5A and B: in order to properly quantify the levels of AspA in the cell (and link them to CsrA activity in the transposon mutants), I think it would be better to add a tag to the chromosomal aspA gene and perform quantitative Western blot analysis.

      Authors’ reply #24: We respectfully disagree. Firstly, this is not a subtle difference that we observe in these cell lysates/the corresponding stained gel bands but a rather strong difference when WT is compared to the mutants (see, for instance, a copy of panel 5B below as a kind reminder). Together with the genetic experiments that follow afterwards, the link seems very solid to us. Secondly, adding a tag could change the protein’s abundance (change of the protein’s production/degradation dynamics) and/or activity, which could cause more confusion than needed (and a loss of the spherical cell shape if the enzyme loses its activity through the tagging).

      However, as mentioned above under authors’ reply #12, we have since observed that the aspA mRNA levels also increase in the ∆varA mutant. Thus, we will provide qRT-PCR data in the revised manuscript (and discuss all options on how the increase of the transcript and subsequently the protein might be caused, as mentioned above under authors’ reply #12), which we truly believe will fulfill the reviewer’s request for quantification.

      [figure not shown in online form]

      Reviewer #2 (Significance (Required)):

      Comment #25: I think this manuscript starts with an interesting observation, which is that varA mutant cells of V. cholerae display an aberrant cell shape. The manuscript also provides several important findings explaining the molecular basis of this phenotype.

      Authors’ reply #25: Once again, we thank the reviewer for the kind words.

      Comment #26: However, as pointed out in my report, I think the manuscript is yet incomplete in connecting this information to identify the underlying regulatory mechanism.

      Authors’ reply #26: As mentioned above, the focus of this study was never on the elucidation of the entire regulatory pathway. Instead, we aimed at deciphering the molecular mechanism behind an observed phenotype - that is, the cell wall modification in the varA-deficient strain that leads to the bacterium’s spherical shape, which can be restored to the WT Vibrio shape by peptidoglycan precursor cross-feeding from neighboring cells – followed by the identification of several regulators and enzymes that trigger these phenotypes. Overall, we consider this a very complete study. However, as mentioned above, we will certainly discuss in the revised manuscript that the step between CsrA and AspA could be indirect in V. cholerae, in contrast to what was experimentally shown for Salmonella and E. coli.

      **Referee Cross-commenting**

      Comment #27: As pointed out in my review, I think this manuscript is well written and easy to follow. However, I agree with reviewer #1 that the underlying phenotype is most likely an artifact, which limits the biological relevance of this study. In addition, I am missing the molecular mechanism that connects CsrA with AspA production in V. cholerae.

      Authors’ reply #27: See authors’ reply #18 above. We disagree that there is any strong indication that the observed phenotype is an artifact. Given that it is state-of-the-art to study TCS by deleting their genes, our study isn’t any more prone to being an artifact than any other study on TCSs.

      We truly believe that it is also important to not take reviewer #1’s comment out of context by stating “I agree with reviewer #1 that the underlying phenotype is most likely an artifact”. Indeed, he/she provided a rather encouraging statement in which he/she mentions the possibility of an artifact but also clearly states that this study is interesting and builds the foundation to further investigate the newly observed phenotype(s): “I think the finding is quite interesting, even though it is not clear to me if this observed cell morphology has a biological function or if it is an artifact of completely removing VarA. However, this manuscript builds the foundation to further test this question.”

      Moreover, whether there is a direct (as in Salmonella and E. coli) or indirect connection between CsrA and AspA production is not a key aspect of the current study, as discussed above.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Rocha et al studied the effect of the VarA response regulator on cell shape of Vibrio cholerae. VarA is part of a two-component system that also includes the histidine kinase VarS. It has previously been shown that VarA activates the expression of three redundantly acting regulatory RNAs called CsrB, CsrC, and CsrD. All three Csr RNAs share the same regulatory principle, which is to sequester the activity of the RNA-binding protein CsrA. CsrA in turn can bind hundreds of mRNA species in the cell, which in the majority of cases results in reduced translation of these mRNAs (in addition to various other modes of action that have been reported). Here, the authors discovered that deletion of the varA gene results in an abnormal, spherical cell shape in stationary phase grown V. cholerae cells. Biochemical analysis revealed an unusual peptidoglycan composition in varA-deficient cells, showing increased levels of dipeptides, a reduction of tetrapeptides, and an overall decrease in peptide-cross linkage. Interestingly, the varA phenotype was complemented by the addition of conditioned medium from wild-type cells, which are likely to provide peptidoglycan building blocks in trans. The authors further discover that varA-deficiency results in AspA over-production, which could be linked to the activity of CsrA. The authors speculate that high AspA levels deplete the cell of aspartate, which is required to produce peptidoglycan precursors. The manuscript is interesting, well-written, and the rationale of the experiments is easy to follow. However, I have two major points of criticism, which reduce my enthusiasm for this work. First, the molecular pathway that links varA-deficiency to increased AspA levels is incomplete: please clarify how CsrA activates AspA levels and if this phenotype is linked to direct binding of CsrA to the aspA mRNA and, if so, how activation is achieved at the molecular level.
      Second, I am not convinced about the biological relevance of the findings. The authors speculate in the discussion section that the VarA-pathway could modulate cell shape under physiological conditions, however, I am not sure such conditions exist given that VarA activity is not only controlled by VarS, but rather integrates information from multiple histidine kinases. I have several additional comments, which I list below.

      Major points:

      • Figs. 1 and S1: I think it is interesting that the varS mutant strain does not share the cell shape phenotype with the varA mutant. As pointed out by the authors, this result indicates that varA activity is controlled by another histidine kinase. While I believe it might be beyond the scope of this manuscript to determine which other histidine kinases signal towards VarA, I think it would be useful to measure and compare CsrB/C/D levels in WT, ∆varA, and ∆varS cells.
      • Figs. 1, S1, and 4C: The regulatory logic implied by these results suggests that deletion of varA results in reduced CsrB/C/D levels, which in turn leads to higher activity of CsrA in the cell. Thus, it would be useful to test if A) over-production of CsrB, CsrC, or CsrD can rescue the phenotype of a varA mutant and if B) combined deletion of csrB/C/D will phenocopy the mutation of varA.
      • Figs. 4B & 5C: I was somewhat surprised by these results. Given that AspA overproduction is suggested to cause cell shape abnormalities in the varA mutant, I would have expected additional transposon insertions in aspA. The fact that mutations only occurred in csrA could indicate that additional (CsrA-controlled) factors could be involved in the phenotype.

      Minor points:

      • Throughout study: italicize gene names
      • Figs. 1C and S3C: please quantify the results of these western blots and indicate how many replicates were performed.
      • Fig. 5A and B: in order to properly quantify the levels of AspA in the cell (and link them to CsrA activity in the transposon mutants), I think it would be better to add a tag to the chromosomal aspA gene and perform quantitative Western blot analysis.

      Significance

      I think this manuscript starts with an interesting observation, which is that varA mutant cells of V. cholerae display an aberrant cell shape. The manuscript also provides several important findings explaining the molecular basis of this phenotype. However, as pointed out in my report, I think the manuscript is yet incomplete in connecting this information to identify the underlying regulatory mechanism.

      Referee Cross-commenting

      As pointed out in my review, I think this manuscript is well written and easy to follow. However, I agree with reviewer #1 that the underlying phenotype is most likely an artifact, which limits the biological relevance of this study. In addition, I am missing the molecular mechanism that connects CsrA with AspA production in V. cholerae.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, the authors follow up on an interesting finding that varA null mutants of V. cholerae form spherical cells in stationary phase. The authors determine that this cell rounding is due to weakening of the cell wall via less production of tetrapeptide cross links. Mutation of the regulator csrA and the enzyme aspA lead to a model in which a varA mutant cell lacks aspartate leading to low cross-linked cell wall that is unable to hold the typical curved V. cholerae shape. The data are robust, and the manuscript is clearly written. I think the finding is quite interesting, even though it is not clear to me if this observed cell morphology has a biological function or if it is an artifact of completely removing VarA. However, this manuscript builds the foundation to further test this question. The data all support the conclusions, but I do think the authors could have really confirmed their model by connecting mutations in csrA and aspA to restoration of high cross-linked cell wall similar to the WT strain as done in Fig. 2. As it stands, this is still somewhat hypothetical and has not been directly demonstrated, although I do think their model is correct and these experiments will be confirmation of it. I also have a few other suggestions to improve the manuscript, but in sum I think it is a well-done research study that will be interesting to research in V. cholerae and other gamma proteobacteria.

      Major comments:

      1. The enrichment for suppressors is very creative and connected the varA impact on cell morphology to misregulation of csrA as 10/10 mutants were ultimately linked to this gene. However, insertion in aspA should also suppress this phenotype, and I am curious why this gene was not identified in the transposon suppressor screen.
      2. The authors should complement at least one of their varA/csrA mutants with csrA.
      3. The changes in cell wall structure are not directly connected to the genetic identification of csrA and aspA. Yes, I agree their model makes sense, but to really nail it down they should analyze the cell wall composition in the varA/csrA and varA/aspA double mutants and show it returns to WT levels of crosslinking.
      4. Does deletion of aspA in the WT or varA mutant impact the growth rate?

      Minor comments

      1. Are the round cells able to divide? The data in Fig. S2 would suggest they can based on the increase in CFUs from hour 6 to hour 8, but the authors never comment on this point and it might be worth addressing in the discussion.
      2. Fig. S2-Why does the OD600 increase from 8 to 24 hours but the CFUs decrease in the varA mutant?
      3. Lines 309-A little bit more detail here would help the reader. Are the authors examining whole cell lysates or lysates from specific cellular components? I am actually very surprised this worked as there are so many proteins in crude cell lysates.
      4. Lines 320-321-I don't think there is evidence that CsrA enhances aspA RNA translation, merely that CsrA enhances AspA protein production. It is likely through increasing translation, but this cannot be concluded without direct evidence.
      5. Line 348-350-I do not understand the logic of this sentence stating that the "..until now, the signal that abrogates VarA phosphorylation..." as this manuscript does not contribute to our understanding of the VarS signal.
      6. I am curious if the total volume of the round versus curved cells is constant at 20 hours. This should be easy to determine using ImageJ and worth reporting.

      Significance

      Understanding changes to cell morphology and their biological implications is a growing area of microbiology. This study makes a new contribution to this area by demonstrating a round, spherical form of V. cholerae that is driven by alterations to the cell that decrease cell-wall cross linking.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      The authors have performed highly quantitative analyses of GPCR signaling to reveal heterogenous ERK and Akt activation patterns by using kinase translocation reporters. Using a massive number of single-cell imaging data, the authors show heterogeneous responses to GPCR agonists in the absence or presence of inhibitors. By cluster analysis, the responses of ERK and Akt were classified into eight and three patterns. This paper is clearly written with sufficient information for the reproducibility. However, the conclusion may not be necessarily supported by the provided data as described below.

      **Major comments:**

      This work has been well done in an organized way and adds new insight into the regulation of protein kinases by GPCRs. The conclusion will be of great interest in the field of single-cell signal dynamics and quantitative biology. On a bit negative note, considering the complexity of the downstream of GPCRs, some of the conclusions may need revision.

      We thank the reviewer for the evaluation and for raising a number of comments that have helped us to strengthen the manuscript and that will be addressed below.

      1. The conclusion of the title that "Heterogeneity and dynamics of ERK/Akt activation by GPCR depend on the activated heterotrimeric G proteins," may not be supported by the data. The authors compared just one pair each of GPCR and ligand. The heterogeneity may come from the nature of the ligand or the characteristics of the single clone chosen for this study.

      The title may suggest that the heterogeneity depends only on the G-protein (although that is not what the title says). Instead, we mean that G-proteins play a role in the heterogeneity, as we infer from the experiments with the G-protein inhibitors. If the reviewer feels strongly about this, we are open to changing the title, for instance to:

      “Kinase translocation reporters reveal the single cell heterogeneity and dynamics of ERK and Akt activation by G protein-coupled receptors”

      • The obvious question is that why the authors did not analyze the correlation between ERK and Akt activity more extensively. Cell Profiler will be able to extract multiple cellular features. Linking the heterogeneous signals to cellular features will benefit readers in the broad cell biology field. If the authors wish to write another paper with that data, it should be at least discussed.

      We agree that we can add more information on the correlation between ERK and Akt activity and we have added a plot that shows the co-incidence of the ERK and Akt clusters. This is now panel C of figure 8. We have no wish of writing another paper and we have made the data and code available, so anyone can do a more detailed analysis if desired.
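      A co-incidence plot of this kind is typically built from a cross-tabulation of the per-cell cluster labels. The following is a minimal sketch of that counting step, assuming each cell carries one ERK cluster label and one Akt cluster label; the function name and the example labels are hypothetical and are not taken from the authors' code or data.

```python
from collections import Counter


def cluster_cooccurrence(erk_labels, akt_labels):
    """Count how many cells fall into each (ERK cluster, Akt cluster) pair."""
    if len(erk_labels) != len(akt_labels):
        raise ValueError("both label lists must describe the same cells")
    return Counter(zip(erk_labels, akt_labels))


# Hypothetical per-cell cluster assignments (one entry per cell).
erk = [1, 1, 3, 7, 7, 8, 3, 1]
akt = [1, 1, 2, 3, 3, 3, 2, 2]

counts = cluster_cooccurrence(erk, akt)
for (erk_cluster, akt_cluster), n in sorted(counts.items()):
    print(f"ERK {erk_cluster} / Akt {akt_cluster}: {n} cells")
```

      The resulting counts can be normalized per ERK cluster or per Akt cluster before plotting, depending on which conditional frequency is of interest.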

      We appreciate the suggestion to correlate activities with cellular features, such as cell area and shape. However, in our analysis we use nuclear fluorescence to segment the nuclear and cytoplasmic fluorescence (as generally done in studies that use KTRs). Therefore, the information on cellular features is not readily available. Such analysis would require a marker for the cytoplasm or membrane (or yet another image analysis procedure).

      Another apparent flaw of this work is that YM was not challenged to UK-stimulated cells. The authors probably assumed lack of effect. Nevertheless, I believe it is required to show. Or, remove the PTx data from the Histamine-stimulated cell data.

      We agree that this is valuable data to include. Unfortunately, this experiment was done in a slightly different condition than the other experiments (different spacing of the time intervals) and we initially skipped the data for these reasons. After careful examination of the data, we have decided to include these data (added to figures 3 & 5).

      We note that we still lack the YM+PTx data for UK and we currently have no way to carry out these experiments (mainly due to lack of funding). In our opinion, the absence of this data is not critical for the interpretation of the results. We prefer to show the YM+PTx data for the other two conditions.

      The most interesting response is that of S1P. ERK is biphasically activated. Combined inhibition of Gq and Gi failed to suppress ERK activity. It may be discussed why the biphasic activation pattern was not identified by the classification.

      We think that the biphasic activation pattern is reflected by clusters 7 and 8 and we now mention this in the text: “The biphasic ERK activation pattern, which is specific for stimulation with S1P, is reflected by clusters 7 and 8.”

      For clarity, we now added the dynamics for each cluster to figure 9.

      The authors argue that the brightness of the KTR reporter was not correlated with the dynamic range of ERK or Akt reporter (Supplementary Figure 3), but it is not clear. I had an impression that ERK-KTR brightness (Supplementary Figure 3A) has a slightly negative correlation with "maximum change in CN ratio" (Supplementary Figure 3B) (e.g., A6>B3>B5 in brightness and A6

      We thank the reviewer for the suggestion and have now added this data to supplemental figure 3 as panel C.

      The authors have shown cluster analyses for the temporal patterns in kinase activations. However, the only difference of cluster 3 and 5 (Figure 7) seem to be amplitude. The authors have also shown the amplitude is dependent on the dose of the activators, which together makes it difficult to see the biological meaning of discriminating the two patterns in comparing different agonists, e.g., Histamine, UK, and S1P. The authors should discuss their views on how the clustering analyses will benefit biological interpretations together with possible limitations.

      This is a valid point, and it is a consequence of the clustering method. We have added text to the discussion to explain our view: “The clustering is a powerful method for the detection of patterns and simplification of large amounts of data. Yet, it should be realized that clustering is mathematical procedure that is not necessarily reflecting the biological processes. One example is the graded response of ERK and Akt activities to ligands, whereas cells are grouped in weak, middle and strong responders. This may be solved by developing and using clustering methods that take the underlying biological processes into account.”
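      To illustrate why a graded response can end up split into weak and strong clusters, the sketch below compares two distance measures on a pair of traces that share the same time course but differ in amplitude. It is only a toy example: the traces are invented, and the distance measure actually used for the clustering in the manuscript is not stated in this reply.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally sampled traces."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation_distance(a, b):
    """1 - Pearson correlation; 0 means identical shape regardless of scale."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


# Invented response shape, scaled to mimic a weak and a strong responder.
shape = [0.0, 0.2, 1.0, 0.8, 0.5, 0.3, 0.2]
weak = [0.3 * v for v in shape]
strong = [1.0 * v for v in shape]

print("Euclidean:", euclidean(weak, strong))
print("Correlation distance:", correlation_distance(weak, strong))
```

      An amplitude-sensitive measure separates the two traces, whereas a shape-based measure treats them as the same pattern, which is one way the choice of distance can determine whether clusters reflect dose or dynamics.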

      Considering the importance of the content, the supplemental note 2 may be included in the main text.

      We appreciate this suggestion, and we have incorporated supplemental note 2 in the main text.

      **Minor comments:**

      1. The authors should clarify the cell type they used (HeLa cells) in the main text and figure legends.

      This information is now indicated in the first paragraph of the results section and in the legend of figure 1.

      Supplementary note1: The data-not-shown data (no correlation of KTR expression and its response to serum) should be very informative for the readers. The data should be shown as an independent supplementary figure.

      This relates to major point 5 and we agree that this is valuable. The data of the expression and the maximum response has been added to supplementary figure 3 as panel C.

      Supplementary Figure S2: The authors should clarify this image processing is about background subtraction. Also, the authors should clearly note "rolling ball with a radius of 70 pixels" is about an ImageJ function, "Subtract Background".

      We added text to highlight that the processing is a background subtraction and noise reduction. We added text to explain it is a FIJI function.

      Supplementary Figure S5: Figure labels are "A, A, B, B" not "A, B, C, D". Also the top two figures are lacking Y axis labels.

      Thanks for pointing this out. The labels are corrected.

      Page6 (top): The authors should mention the description is about Supplementary Figure S5 (UK) and Supplementary Figure S6 (S1P).

      This was an accidental omission; it is corrected.

      Figure 3: the figures are lacking x-axis labels (probably uM, nM and pM from left).

      Well spotted, this is fixed by adding the units to the labels for each ligand.

      Values in tables: The significant figure must be 2, at best. This should be consistent throughout the text. For example, "The EC50 values for histamine, S1P and UK were respectively 0.3 μM, 63.7 nM and 2.5 pM." This is somewhat awkward.

      This has been fixed in the text and in the table.

      Page 7, the first paragraph: No comments on S1P!

      We added our observation that: “The response to S1P is hardly affected by YM, but the amplitude is reduced by PTx.”

      Fig. 3: 100 mM must read as 100 micromolar.

      We do not understand this comment, but the units of figure 3 are now corrected (see also point 6).

      • Fig. 9: Concentration unit is missing.

      Thanks for pointing this out, units are added.

      • Page 11, line 4: EKR should read as ERK.

      Fixed

      • Page 13: "So far, only a couple of studies looked into kinase activation by GPCRs and these studies used overexpressed receptors [32,33]." Please describe precisely. Protein kinase activation by GPCR has been studied more than 20 years. Why are these two recent papers cited here?

      We updated the text to explain that: “So far, only a couple of studies looked into kinase activation by GPCRs in single cells with KTRs and these studies used overexpressed receptors”.

      "This is in marked contrast to other fluorescent biosensors that typically require an overexpressed receptor for robust responses [34]." Following words should be included in the end: "in our hands".

      We’ve included the suggested line.

      • "Histamine is reported to predominantly activate Gq in HeLa cells [36] and UK activates Gi [37]." Describe the name of receptors for the better understanding.

      We added names: “Histamine is reported to predominantly activate Gq in HeLa cells by the histamine H1 receptor [36] and UK activates Gi by α2-adrenergic receptors [37]”

      • "S1P can activate a number of different GPCRs, all known to be expressed by HeLa cells [24]." Why is this paper chosen? The authors can easily find RNA-Seq data, if they wish to see the expression level. The cited paper did not scrutinize the S1P receptors expressed in HeLa cells.

      The S1PR levels are scrutinized in the cited paper, but it is ‘hidden’ in the supplemental figure S4A. We will clarify this and explicitly mention this supplemental figure: “The situation for S1P is different. S1P can activate a number of different GPCRs, all known to be expressed by HeLa cells as shown in the supplemental figure S4A of [24]”

      Reviewer #1 (Significance (Required)):

      The authors used biosensors for ERK and Akt to examine the kinetics of activation by GPCR ligands. Technical advancement is in the massive analysis method and cluster analysis. This is an important direction for the quantitative biology. GPCR signaling is complex because of multiple receptors coupled with different G proteins. The simple ones such as histamine receptor and alpha2-adrenergic receptor can be easily analyzed as shown in this study. However, there are many S1P receptors, which make the interpretation difficult. If the authors could have shown interesting proposal on this data, the paper may interest many researchers in the field of cell biology and systems biology.

      Expertise: Cell biology, signal transduction of protein kinases, fluorescence microscopy.

      **Referee Cross-commenting**

      1. I agree with the other two reviewers in that immunoblotting data is required to show the efficiency of P2A cleavage.
      2. All reviewers think it looks strange that the authors did not show UK + YM data.
      3. Showing the dynamic range of the biosensors will reinforce the data as Reviewer #3 states. ERK-KTR is quite sensitive and can be easily saturated. Ideally, the ratio of pERK vs ERK can be quantified by the different mobility in SDS-PAGE. But, I do not know how we can do it for Akt.

        Reviewer #2 (Evidence, reproducibility and clarity (Required)):

        **Summary**

        In this paper Chavez-Abiega and colleagues investigate the dynamics of ERK and Akt activity downstream of several G protein-coupled receptors (GPCRs). Using drugs to block specific G-proteins, they probe the activation of ERK/Akt by different heterotrimeric G proteins with fluorescent biosensors at the single cell resolution. Main finding is that ERK/AKT can be activated by different G-proteins, depending on the receptor coupling to the G-protein subclass, and that the ERK/AKT dynamics for S1P are specifically heterogeneous. Moreover, it seems that the AKT signaling response is very similar to ERK after GPCR stimulation.

        **Major points:**

        1) For this paper, the authors produced a new construct to express simultaneously the nuclear marker, the Akt and the ERK biosensors. The three parts are connected by P2A peptides that determine their separation. Although the biosensors are based on existing ones, the connection between them by P2A might create artifacts if the separation of the two parts is incomplete. For that, important controls are missing, such as treatment with an ERK and an Akt inhibitor. If the two parts are well separated the inhibitors should block the cytosol translocation of one of the two components and not of the other. This control is also important to check if in HeLa cells the Akt biosensor is not phosphorylated by ERK as well, as described in other reports. Alternatively, P2A separation can be quantified on a protein blot.

      We agree that it is important to establish that the P2A sequence results in separation of the reporters. There are several observations that support our notion that the separation is efficient. First, we have been using the 2A-like sequences for over a decade in HeLa cells (first paper: doi:10.1038/nmeth.1415) and we have never encountered situations where the cleavage was problematic. Second, the distribution in signal of the nuclear Scarlet probe differs substantially from that of the mTurquoise2 and the mNeonGreen probe. Third, the dynamics of the ERK-KTR and Akt-KTR are different. Fourth, we have included new data with an ERK inhibitor, showing that the Akt-KTR responds independently of the ERK-KTR (figure S5). We have also added text to explain this: “Next, we examined the effect of the MEK inhibitor PD 0325901. Pre-incubation with the inhibitor for 20 minutes blocked the response of the ERK-KTR to FBS, but not that of Akt-KTR (Supplemental Figure S5). This supports previous observations [14] [15] that the P2A effectively separates the different components, since the Akt-KTR and ERK-KTR show independent relocation patterns.”

      This latter point is also supported by the co-incidence plot of the ERK versus Akt clusters (figure 8C) showing that the probes act independently (which is the main reason for using this strategy).

      Although any of the aforementioned points cannot exclude that a small fraction of the probe remains fused, we think that this potential issue is far outweighed by the benefits of the use of 2A peptides.

      2) The description of ERK and Akt should be reported in a more uniform way, such as using the same representations for both (e.g. the equivalent of figure 2 for Akt is missing) or the same number of clusters.

      We chose to concentrate first on ERK activity, which is why a similar plot for Akt activation is not shown. However, the Akt responses are detailed in figure 4 and supplemental figures S5 and S7.

      For the cluster analysis, we looked into the optimal number of clusters (as explained in Supplemental note S2). This number differs for ERK and Akt, since the complexity of the responses is different. We have moved supplemental note 2 to the main text, which also clarifies the different number of clusters that we used for the analysis.

      3) Figure 3 & Figure 5: It seems that the YM and YM+PTx data for the UK 14304 data is missing. This would be an interesting addition to the manuscript, and it is easy to add. A similar analysis for the Akt sensor is missing in figure 3 and should be added for consistency. Figure 4 shows data for Akt, but as timeseries and only for Histamine. See point 2, it would benefit the reader greatly if ERK and AKT are presented in a more uniform and complete fashion throughout the manuscript.

      We agree that it is valuable to add data for UK with YM. These data have been added; see also the reply to reviewer 1, major point 3.

      As for the Akt data, the response was largely similar albeit with less complexity and a lower amplitude. This is the reason to focus on ERK and this is explained in the discussion: “Therefore, the measurement of Akt does not add information. Moreover, the Akt response had a relatively poor amplitude.”

      4) In the results text of figure 4, the authors state that "...as shown in Figure 4C-D, which is in line with the effect of histamine on ERK.". It is unclear what the authors mean with this statement, the effects of single/double inhibition of Histamine stimulation on ERK are not quantified or discussed. Both responses can be quantified more carefully and compared.

      We agree that this is poorly formulated, and we rephrase it to make it clearer: “Inhibition of Gq (figure 4C) decreases the maximum activity up to ~70%, and simultaneous inhibition of Gq and Gi causes a decrease of the responses up to ~90%, as shown in Figure 4D. These Akt amplitudes and effects of inhibitors are largely similar to those observed for ERK.”

      5) This paper would benefit from a mechanistic investigation. For instance, the authors could investigate the pathways that lead to the generation of the pulse of ERK and Akt. These (preliminary) results presented call for deeper investigation into the signaling pathway from Gai and Gaq to ERK and AKT, and the authors are in a great position to probe this. One simple approach is to explore the upstream pathway, such as the MAPK cascade, PI3K, RTKs by means of inhibitors.

      We agree that there is much that can be done with the KTR technology. To this end, we deposit the probe and make all our data analysis methods available. We hope that others will benefit from our efforts and use the tools for mechanistic studies.

      6) Since different G-proteins seem to elicit similar responses on ERK and especially for Akt, it is likely a B-arrestin / beta-gamma subunit mediated mechanism? It would be interesting to hear what the authors think of this, did they investigate/consider this possibility? E.g. Perhaps blocking RTK signaling / B-arrestin signaling would reduce heterogeneity?

      We appreciate this suggestion and have added a statement to the discussion: “Based on our data, we cannot exclude that beta-arrestin or RTKs play a role in the activation of ERK and Akt. To study the role of non-classical routes to ERK activation, inhibitor studies, or probes that interrogate these processes would be useful.”

      7) The authors should take a serious effort to summarize the data in the figures better. Many plots that can be merged/presented in a more concise way, which would improve the readability of the manuscript greatly.

      We will take care to improve the data visualization during the revision. We will address any specific points that are raised.

      **Minor points:**

      1) The authors should spell out in the legend of each figure if they are representing the absolute C/N or the normalized C/N

      Thanks for pointing this out. We added this information to the legends and it is also written in the materials and methods: "data was normalized by subtracting the average of two time points prior to stimulation (usually the 5th and 6th time point) from every data point."
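
      The normalization quoted above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name `normalize_to_baseline`, the NumPy implementation, and the default zero-based indices (4, 5) for the 5th and 6th time points are assumptions made for the example.

```python
import numpy as np

def normalize_to_baseline(cn_ratio, baseline_idx=(4, 5)):
    """Subtract the pre-stimulation baseline from a C/N ratio trace.

    cn_ratio: 1-D trace (time) or 2-D array (cells x time).
    baseline_idx: zero-based indices of the two time points averaged
        to form the baseline (assumed here: the 5th and 6th points).
    """
    cn_ratio = np.asarray(cn_ratio, dtype=float)
    # Average the selected pre-stimulation points along the time axis.
    baseline = cn_ratio[..., list(baseline_idx)].mean(axis=-1, keepdims=True)
    # Subtract that baseline from every data point of the same trace.
    return cn_ratio - baseline

# Hypothetical trace: flat baseline, stimulation after the 6th point.
trace = [1.0, 1.0, 1.0, 1.0, 1.0, 1.2, 2.0, 3.0]
normalized = normalize_to_baseline(trace)
```

      Because the baseline is computed per trace, each cell is referenced to its own pre-stimulation level, so the normalized C/N reports the change from baseline rather than the absolute ratio.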

      2) In Figure 2 the authors should show the control with no stimulus. Also would be informative to inform the reader about the stimulation protocol used, or indicate the stimulation time and length in the figure.

      We have added the no stimulus control and added the information to the legend.

      3) Figure 3: This figure would benefit from a different presentation of the data, it is currently confusing. E.g. Average curves per drug condition in a single graph would present the point the authors make more clear and concise, and this single cell overview can be moved to supplements.

      Our main focus is on single cell analysis and we think that the current plots convey the message in a clear and transparent fashion. It is in line with the recently proposed idea of “superplot” (https://doi.org/10.1083/jcb.202001064). We also provide scripts and data, enabling anyone to replot the data if that is desired.

      4) Figure 4 legend states "CN ERK" and "ERK C/N", but is depicting only Akt responses? Only in 4c the axes are labeled, this together is very confusing.

      Thanks for pointing this out. This is corrected.

      5) Figure 5 is missing the controls with ERK and Akt inhibitors, to show the loss of correlation between the AUC of the two

      We have included data with a MEK inhibitor (new supplemental figure S5) to demonstrate the specificity of the probe, and it also demonstrates that Akt can be independently activated.

      6) Figure 6, the presumed lack of correlation between baseline activity and response should be confirmed statistically.

      We have improved the presentation of figure 6. We now show only the maximal response and how this varies between conditions. It is evident from the graphical representation that the curves are similar for the different start ratios. We feel that the use of statistics is not necessary here.

      7) It seems that in S1P treated cells there is a second oscillation in ERK activity well visible in figure 2 and also in S10. Could the authors comment on that?

      We add text to the discussion to address this: “We observed that activation of endogenous S1P receptors resulted in a strong, but highly heterogeneous ERK-KTR response, with two peaks in a population of cells.” and “When PTx is present, the biphasic response is abolished and the first peak of activation is reduced, suggesting that the initial response is due to Gi signaling.”

      8) In the abstract it is unclear what authors mean with "UK".

      Changed to brimonidine.

      9) Figure 9, it would be helpful to visually repeat the typical curve of the different clusters here, to guide the reader.

      This is a good suggestion and we have added the typical curves for the different clusters to the plot.

      10) The observed heterogeneity in responses might be related to different cell cycle stages, did the authors investigate/consider this possibility (e.g. with a cell cycle biosensor)?

      This is a very valid comment. We do consider its importance, but we did not investigate the effects of cell cycle.

      Reviewer #2 (Significance (Required)):

      The paper describes with high accuracy the dynamics of ERK and Akt biosensors downstream of several GPCRs.

      However, it feels like this is a preliminary report that leaves many important questions still open. It does not provide mechanistic insight and doesn't fully exploit the potential of single-cell technologies. The authors have the tools to investigate several important questions that are left open in the manuscript (e.g. connection Gaq/Gai to ERK/AKT, B-arrestin/betagamma involvement). Moreover, some important controls are missing. The authors should also consider the data presentation in the figures, to improve readability and interpretation of the manuscript.

      Properly revised, would be of interest for a broad audience in cell biology, specifically GPCR and RTK signaling fields.

      Expertise in cell biology, gpcr and rtk signaling, fluorescent biosensors.

      **Referee Cross-commenting**

      I agree with the assessments by the other reviewers.

      Indeed showing the dynamic range of the biosensors, as Reviewer #3 states, would strengthen the manuscript and put the S1P response heterogeneity in context.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript uses a live-cell biosensor approach to examine the activity kinetics of the ERK and Akt kinases in response to different GPCR ligands. The paper provides a detailed description of the development of a HeLa reporter cell line that expresses both Akt and ERK biosensors, along with a nuclear marker for use in cell tracking. The authors then catalog the individual responses from thousands of cells to three GPCR ligands. Individual cells show strong correlation in stimulated ERK and Akt activity. Using inhibitors for Gq and Gi proteins, it is shown that ERK and Akt activities are dependent on different G proteins. The authors also show that the heterogeneous responses within each population can be decomposed into several clusters representing similar dynamic behaviors; the frequencies of these clusters increase or decrease depending on treatments.

      Overall, this is a well documented extension of an existing biosensor approach to examine GPCR signaling, and the approach is clearly described. There are however, some control experiments that are essential to support the conclusions.

      **Major comments:**

      1. The maximal responses of ERK and Akt biosensors in the selected cell clone are not adequately shown. Although FBS responsiveness is used as a validation and selection criterion, it would be much more informative to show the distribution of single-cell responses for defined activators of ERK and Akt, such as EGF and IGF-1, respectively. Without seeing the variability in these responses, it is difficult to put the heterogeneity observed in GPCR responses into context. *

      The FBS is used as a (crude) way to examine responsiveness of the clones. We understand that treatment of the cells with growth factors would add more data and therefore more information to the manuscript. However, the main aim of the study is to examine whether KTR technology can be used to study endogenous GPCR signaling. It is clear that the answer is positive. Next, we asked whether we could detect differences for different GPCRs and that was the focus of this study. It is unclear how studies with EGF would add new information to our observations.

      It is not clear whether the basal activity for the biosensors represents actual activity or simply the measurement floor. This should be established by using saturating treatment inhibitors for ERK and Akt to determine the biosensor readings in the absence of any activity. Ideally, an approach such as the one shown by Ponsioen et al. (PMID: 33795873) should be used to determine the dynamic range of the sensors.

      We studied the basal levels and the effect of serum. We found that the basal levels are reduced by replacing the growth medium with serum-free medium. The reduction in C/N ratio reaches a plateau ~2 hours after replacing the medium. These data are added as supplemental figure S4. Therefore, we have performed all experiments 2 hours after replacing the growth medium with serum-free imaging medium.

      Because the biosensors are separated by self-cleaving peptides, there is the potential that incomplete cleavage could complicate the results. Cleavage efficiency should be assessed by western blot or an equivalent method.

      We agree that it is important to establish that the P2A sequence results in separation of the reporters. There are several observations that support our notion that the separation is efficient. First, we have been using the 2A-like sequences for over a decade in HeLa cells (first paper: doi:10.1038/nmeth.1415) and we have never encountered situations where the cleavage was problematic. Second, the distribution in signal of the nuclear Scarlet probe differs substantially from that of the mTurquoise2 and mNeonGreen probe. Third, the dynamics of the ERK-KTR and Akt-KTR are different. Fourth, we have included new data with an ERK inhibitor, showing that the Akt-KTR responds independently of the ERK-KTR (figure S5). We have also added text to explain this: “Next, we examined the effect of the MEK inhibitor PD 0325901. Pre-incubation with the inhibitor for 20 minutes blocked the response of the ERK-KTR to FBS, but not that of Akt-KTR (Supplemental Figure S5). This supports previous observations [14] [15] that the P2A effectively separates the different components, since the Akt-KTR and ERK-KTR show independent relocation patterns.”

      This latter point is also supported by the co-incidence plot of the ERK versus Akt clusters (figure 8C) showing that the probes act independently (which is the main reason for using this strategy).

      Although any of the aforementioned points cannot exclude that a small fraction of the probe remains fused, we think that this potential issue is far outweighed by the benefits of the use of 2A peptides.

      Ideally, an alternate method such as immunofluorescence for phosphorylated ERK/Akt or their substrates could be used in a subset of the conditions to validate the heterogeneity observed by the biosensors.

      We thank the reviewer for this suggestion. Since we see a lot of variability in the dynamics, which cannot be addressed by immunofluorescence, we do not think this experiment will be valuable. Of note, GPCR activity is known to induce ERK activity in a dose-dependent manner on a population level as determined with immunolabeling methods, and that is what we observe with the ERK KTR as well.

      **Minor comments:**

      1. In the introduction, more rationale and background could be provided for the examination of GPCR-stimulated ERK and Akt activity. There is not much information provided on why this is an interesting question. Other than the involvement of beta arrestin and RTK transactivation, which are mentioned, what mechanisms are known to be involved? Also, the importance of ERK and Akt in cancer is brought up, but it is not made clear how this approach or results would connect specifically to a cancer model.

      We think that the connections between heterotrimeric G-proteins and kinase activity are not well established. Except for the classical Gq -> PKC -> ERK pathway, not much is known, and we add this to the discussion: “The classic downstream effector of Gq is PKC, which can activate ERK. On the other hand, it is not so clear how Gq would affect Akt. The molecular network that connects the activity of Gi with kinases is also not so clear.”

      It would be helpful to provide some explanation for why the UK+YM and UK+YM+PTx data are not shown in figure 3.

      We agree that this is valuable data to include. Unfortunately, this experiment was done in a slightly different condition than the other experiments (different spacing of the time intervals) and we initially skipped the data for these reasons. After careful examination of the data, we have decided to include these data (added to figures 3 & 5).

      We note that we still lack the YM+PTx data for UK and we currently have no way to carry out these experiments (mainly due to lack of funding). We prefer to show the YM+PTx data for the other two conditions.

      • In the Abstract figure, it is not clear which samples "Inhibitor" and "Agonist" are referring to.

      Thanks for this comment. We will remove the visual abstract when the preprint is submitted to a journal.

      Reviewer #3 (Significance (Required)):

      While similar reporter approaches have been used in a number of papers to examine growth factor signaling dynamics of ERK and Akt, this manuscript is the first I have seen to examine the responses of these kinases to different GPCR ligands. In doing so, it adds significantly to the growing body of literature on single-cell signaling responses. The mechanisms of ERK and Akt activation by GPCRs remain somewhat ambiguous, and the data reported here will be helpful in refining models for this signal transduction process. The findings that the GPCR ligands examined show different G protein dependencies than anticipated is an interesting facet, as is the observation that, while ERK and Akt are generally correlated, inhibition of Gi preferentially blocks S1P-induced ERK activity more so than Akt activity. However, the main findings of heterogeneity in signaling, and the observation of clusters that describe the different dynamic behaviors present within a population, are highly consistent with what has been shown in other systems. Overall, this study is a useful confirmation that GPCR signaling to ERK and Akt follows a similar pattern to other forms of stimulation.

      **Referee Cross-commenting**

      Regarding the dynamic range, I don't think it is necessary to do a western blot (though this would be nice) - I think it would be sufficient to show maximal activation using EGF/IGF and full suppression using MEK/ERK and Akt inhibitors. I also agree with all the points raised by the other reviewers. In particular, a deeper exploration and better visualization of the relationship between ERK and Akt would be very useful, as noted by both Reviewers #1 and #2.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript uses a live-cell biosensor approach to examine the activity kinetics of the ERK and Akt kinases in response to different GPCR ligands. The paper provides a detailed description of the development of a HeLa reporter cell line that expresses both Akt and ERK biosensors, along with a nuclear marker for use in cell tracking. The authors then catalog the individual responses from thousands of cells to three GPCR ligands. Individual cells show strong correlation in stimulated ERK and Akt activity. Using inhibitors for Gq and Gi proteins, it is shown that ERK and Akt activities are dependent on different G proteins. The authors also show that the heterogeneous responses within each population can be decomposed into several clusters representing similar dynamic behaviors; the frequencies of these clusters increase or decrease depending on treatments.

      Overall, this is a well documented extension of an existing biosensor approach to examine GPCR signaling, and the approach is clearly described. There are however, some control experiments that are essential to support the conclusions.

      Major comments:

      1. The maximal responses of ERK and Akt biosensors in the selected cell clone are not adequately shown. Although FBS responsiveness is used as a validation and selection criterion, it would be much more informative to show the distribution of single-cell responses for defined activators of ERK and Akt, such as EGF and IGF-1, respectively. Without seeing the variability in these responses, it is difficult to put the heterogeneity observed in GPCR responses into context.
      2. It is not clear whether the basal activity for the biosensors represents actual activity or simply the measurement floor. This should be established by using saturating treatment inhibitors for ERK and Akt to determine the biosensor readings in the absence of any activity. Ideally, an approach such as the one shown by Ponsioen et al. (PMID: 33795873) should be used to determine the dynamic range of the sensors.
      3. Because the biosensors are separated by self-cleaving peptides, there is the potential that incomplete cleavage could complicate the results. Cleavage efficiency should be assessed by western blot or an equivalent method.
      4. Ideally, an alternate method such as immunofluorescence for phosphorylated ERK/Akt or their substrates could be used in a subset of the conditions to validate the heterogeneity observed by the biosensors.

      Minor comments:

      1. In the introduction, more rationale and background could be provided for the examination of GPCR-stimulated ERK and Akt activity. There is not much information provided on why this is an interesting question. Other than the involvement of beta arrestin and RTK transactivation, which are mentioned, what mechanisms are known to be involved? Also, the importance of ERK and Akt in cancer is brought up, but it is not made clear how this approach or results would connect specifically to a cancer model.
      2. It would be helpful to provide some explanation for why the UK+YM and UK+YM+PTx data are not shown in figure 3.
      3. In the Abstract figure, it is not clear which samples "Inhibitor" and "Agonist" are referring to.

      Significance

      While similar reporter approaches have been used in a number of papers to examine growth factor signaling dynamics of ERK and Akt, this manuscript is the first I have seen to examine the responses of these kinases to different GPCR ligands. In doing so, it adds significantly to the growing body of literature on single-cell signaling responses. The mechanisms of ERK and Akt activation by GPCRs remain somewhat ambiguous, and the data reported here will be helpful in refining models for this signal transduction process. The findings that the GPCR ligands examined show different G protein dependencies than anticipated is an interesting facet, as is the observation that, while ERK and Akt are generally correlated, inhibition of Gi preferentially blocks S1P-induced ERK activity more so than Akt activity. However, the main findings of heterogeneity in signaling, and the observation of clusters that describe the different dynamic behaviors present within a population, are highly consistent with what has been shown in other systems. Overall, this study is a useful confirmation that GPCR signaling to ERK and Akt follows a similar pattern to other forms of stimulation.

      Referee Cross-commenting

      Regarding the dynamic range, I don't think it is necessary to do a western blot (though this would be nice) - I think it would be sufficient to show maximal activation using EGF/IGF and full suppression using MEK/ERK and Akt inhibitors. I also agree with all the points raised by the other reviewers. In particular, a deeper exploration and better visualization of the relationship between ERK and Akt would be very useful, as noted by both Reviewers #1 and #2.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this paper Chavez-Abiega and colleagues investigate the dynamics of ERK and Akt activity downstream of several G protein-coupled receptors (GPCRs). Using drugs to block specific G-proteins, they probe the activation of ERK/Akt by different heterotrimeric G proteins with fluorescent biosensors at the single cell resolution. Main finding is that ERK/AKT can be activated by different G-proteins, depending on the receptor coupling to the G-protein subclass, and that the ERK/AKT dynamics for S1P are specifically heterogeneous. Moreover, it seems that the AKT signaling response is very similar to ERK after GPCR stimulation.

      Major points:

      1) For this paper, the authors produced a new construct to express simultaneously the nuclear marker, the Akt and the ERK biosensors. The three parts are connected by P2A peptides that determine their separation. Although the biosensors are based on existing ones, the connection between them by P2A might create artifacts if the separation of the parts is incomplete. For that, important controls are missing, such as treatment with an ERK and an Akt inhibitor. If the parts are well separated, each inhibitor should block the cytosol translocation of one of the two biosensors and not of the other. This control is also important to check whether, in HeLa cells, the Akt biosensor is also phosphorylated by ERK, as described in other reports. Alternatively, P2A separation can be quantified on a protein blot.

      2) The description of ERK and Akt should be reported in a more uniform way, such as using the same representations for both (e.g. the equivalent of figure 2 for Akt is missing) or the same number of clusters.

      3) Figure 3 & Figure 5: It seems that the YM and YM+PTx data for UK 14304 are missing. This would be an interesting addition to the manuscript, and it is easy to add. A similar analysis for the Akt sensor is missing in figure 3 and should be added for consistency. Figure 4 shows data for Akt, but as timeseries and only for Histamine. See point 2: it would benefit the reader greatly if ERK and AKT were presented in a more uniform and complete fashion throughout the manuscript.

      4) In the results text of figure 4, the authors state that "...as shown in Figure 4C-D, which is in line with the effect of histamine on ERK.". It is unclear what the authors mean with this statement, the effects of single/double inhibition of Histamine stimulation on ERK are not quantified or discussed. Both responses can be quantified more carefully and compared.

      5) This paper would benefit from a mechanistic investigation. For instance, the authors could investigate the pathways that lead to the generation of the pulse of ERK and Akt. The (preliminary) results presented call for deeper investigation into the signaling pathway from Gai and Gaq to ERK and AKT, and the authors are in a great position to probe this. One simple approach is to explore the upstream pathway, such as the MAPK cascade, PI3K, and RTKs, by means of inhibitors.

      6) Since different G-proteins seem to elicit similar responses on ERK and especially for Akt, it is likely a B-arrestin / beta-gamma subunit mediated mechanism? It would be interesting to hear what the authors think of this, did they investigate/consider this possibility? E.g. Perhaps blocking RTK signaling / B-arrestin signaling would reduce heterogeneity?

      7) The authors should make a serious effort to summarize the data in the figures better. Many plots can be merged or presented in a more concise way, which would improve the readability of the manuscript greatly.

      Minor points:

      1) The authors should spell out in the legend of each figure if they are representing the absolute C/N or the normalized C/N

      2) In Figure 2 the authors should show the control with no stimulus. It would also be informative to tell the reader about the stimulation protocol used, or to indicate the stimulation time and length in the figure.

      3) Figure 3: This figure would benefit from a different presentation of the data; it is currently confusing. E.g. average curves per drug condition in a single graph would present the authors' point more clearly and concisely, and this single-cell overview can be moved to the supplements.

      4) Figure 4 legend states "CN ERK" and "ERK C/N", but is depicting only Akt responses? Only in 4c the axes are labeled, this together is very confusing.

      5) Figure 5 is missing the controls with ERK and Akt inhibitors, to show the loss of correlation between the AUC of the two

      6) Figure 6, the presumed lack of correlation between baseline activity and response should be confirmed statistically.

      7) It seems that in S1P treated cells there is a second oscillation in ERK activity well visible in figure 2 and also in S10. Could the authors comment on that?

      8) In the abstract it is unclear what the authors mean by "UK".

      9) Figure 9, it would be helpful to visually repeat the typical curve of the different clusters here, to guide the reader.

      10) The observed heterogeneity in responses might be related to different cell cycle stages. Did the authors investigate or consider this possibility (e.g. with a cell cycle biosensor)?

      Significance

      The paper describes with high accuracy the dynamics of ERK and Akt biosensors downstream of several GPCRs.

      However, it feels like this is a preliminary report that leaves many important questions still open. It does not provide mechanistic insight and doesn't fully exploit the potential of single-cell technologies. The authors have the tools to investigate several important questions that are left open in the manuscript (e.g. connection Gaq/Gai to ERK/AKT, B-arrestin/betagamma involvement). Moreover, some important controls are missing. The authors should also consider the data presentation in the figures, to improve readability and interpretation of the manuscript.

      Properly revised, the paper would be of interest for a broad audience in cell biology, specifically the GPCR and RTK signaling fields.

      Expertise in cell biology, GPCR and RTK signaling, fluorescent biosensors.

      Referee Cross-commenting

      I agree with the assessments by the other reviewers.

      Indeed showing the dynamic range of the biosensors, as Reviewer #3 states, would strengthen the manuscript and put the S1P response heterogeneity in context.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The authors have performed highly quantitative analyses of GPCR signaling to reveal heterogeneous ERK and Akt activation patterns by using kinase translocation reporters. Using a massive amount of single-cell imaging data, the authors show heterogeneous responses to GPCR agonists in the absence or presence of inhibitors. By cluster analysis, the responses of ERK and Akt were classified into eight and three patterns, respectively. This paper is clearly written with sufficient information for reproducibility. However, the conclusions are not necessarily supported by the provided data, as described below.

      Major comments:

      This work has been done well and in an organized way, and it adds new insight into the regulation of protein kinases by GPCRs. The conclusion will be of great interest in the field of single-cell signal dynamics and quantitative biology. On a slightly negative note, considering the complexity of signaling downstream of GPCRs, some of the conclusions may need revision.

      1. The conclusion of the title that "Heterogeneity and dynamics of ERK/Akt activation by GPCR depend on the activated heterotrimeric G proteins," may not be supported by the data. The authors compared just one pair each of GPCR and ligand. The heterogeneity may come from the nature of the ligand or the characteristics of the single clone chosen for this study.
      2. The obvious question is why the authors did not analyze the correlation between ERK and Akt activity more extensively. CellProfiler will be able to extract multiple cellular features. Linking the heterogeneous signals to cellular features will benefit readers in the broad cell biology field. If the authors wish to write another paper with that data, it should be at least discussed.
      3. Another apparent flaw of this work is that YM was not tested on UK-stimulated cells. The authors probably assumed a lack of effect. Nevertheless, I believe it is required to show this. Or, remove the PTx data from the Histamine-stimulated cell data.
      4. The most interesting response is that of S1P. ERK is biphasically activated. Combined inhibition of Gq and Gi failed to suppress ERK activity. It may be discussed why the biphasic activation pattern was not identified by the classification.
      5. The authors argue that the brightness of the KTR reporter was not correlated with the dynamic range of ERK or Akt reporter (Supplementary Figure 3), but it is not clear. I had an impression that ERK-KTR brightness (Supplementary Figure 3A) has a slightly negative correlation with "maximum change in CN ratio" (Supplementary Figure 3B) (e.g., A6>B3>B5 in brightness and A6<B3<B5 in maximum change in CN ratio). The authors should show dot plots of average fluorescence vs. the maximum change in CN ratio.
      6. The authors have shown cluster analyses for the temporal patterns in kinase activations. However, the only difference between clusters 3 and 5 (Figure 7) seems to be amplitude. The authors have also shown that the amplitude is dependent on the dose of the activators, which together makes it difficult to see the biological meaning of discriminating the two patterns when comparing different agonists, e.g., Histamine, UK, and S1P. The authors should discuss their views on how the clustering analyses will benefit biological interpretations, together with possible limitations.
      7. Considering the importance of the content, the supplemental note 2 may be included in the main text.

      Minor comments:

      1. The authors should clarify the cell type they used (HeLa cells) in the main text and figure legends.
      2. Supplementary note 1: The data not shown (no correlation of KTR expression and its response to serum) would be very informative for the readers. The data should be shown as an independent supplementary figure.
      3. Supplementary Figure S2: The authors should clarify this image processing is about background subtraction. Also, the authors should clearly note "rolling ball with a radius of 70 pixels" is about an ImageJ function, "Subtract Background".
      4. Supplementary Figure S5: Figure labels are "A, A, B, B" not "A, B, C, D". Also the top two figures are lacking Y axis labels.
      5. Page6 (top): The authors should mention the description is about Supplementary Figure S5 (UK) and Supplementary Figure S6 (S1P).
      6. Figure 3: the figures are lacking x-axis labels (probably uM, nM and pM from left).
      7. Values in tables: Values should be given to two significant figures at most. This should be consistent throughout the text. For example, "The EC50 values for histamine, S1P and UK were respectively 0.3 μM, 63.7 nM and 2.5 pM." This is somewhat awkward.
      8. Page 7, the first paragraph: No comments on S1P!
      9. Fig. 3: 100 mM must read as 100 micromolar.
      10. Fig. 9: Concentration unit is missing.
      11. Page 11, line 4: EKR should read as ERK.
      12. Page 13: "So far, only a couple of studies looked into kinase activation by GPCRs and these studies used overexpressed receptors [32,33]." Please describe precisely. Protein kinase activation by GPCRs has been studied for more than 20 years. Why are these two recent papers cited here?
      13. "This is in marked contrast to other fluorescent biosensors that typically require an overexpressed receptor for robust responses [34]." The following words should be added at the end: "in our hands".
      14. "Histamine is reported to predominantly activate Gq in HeLa cells [36] and UK activates Gi [37]." Describe the name of receptors for the better understanding.
      15. "S1P can activate a number of different GPCRs, all known to be expressed by HeLa cells [24]." Why is this paper chosen? The authors can easily find RNA-Seq data, if they wish to see the expression level. The cited paper did not scrutinize the S1P receptors expressed in HeLa cells.

      Significance

      The authors used biosensors for ERK and Akt to examine the kinetics of activation by GPCR ligands. The technical advancement is in the large-scale analysis method and cluster analysis. This is an important direction for quantitative biology. GPCR signaling is complex because of multiple receptors coupled with different G proteins. The simple ones, such as the histamine receptor and the alpha2-adrenergic receptor, can be easily analyzed as shown in this study. However, there are many S1P receptors, which makes the interpretation difficult. If the authors could offer an interesting proposal based on these data, the paper may interest many researchers in the fields of cell biology and systems biology.

      Expertise: Cell biology, signal transduction of protein kinases, fluorescence microscopy.

      Referee Cross-commenting

      1. I agree with the other two reviewers in that immunoblotting data is required to show the efficiency of P2A cleavage.
      2. All reviewers think it looks strange that the authors did not show UK + YM data.
      3. Showing the dynamic range of the biosensors will reinforce the data as Reviewer #3 states. ERK-KTR is quite sensitive and can be easily saturated. Ideally, the ratio of pERK vs ERK can be quantified by the different mobility in SDS-PAGE. But, I do not know how we can do it for Akt.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their appreciation of our study and for their comments, which we believe helped us to improve our manuscript.

      We have carefully considered both the general comment on the significance of our work expressed by Reviewers 1 and 3 and the few specific points raised by Reviewer 2, and we believe we have answered all of the reviewers' concerns.

      2. Point-by-point description of the revisions

      Reviewers 1 and 3 agreed that our work was both original and well performed. Although they have not raised any specific issues, apart from minor editorial changes, they asked us to clarify the potential interest of our results in designing novel therapeutic interventions for hepatocellular carcinoma.

      Reviewer 1: While studies are very elegant and results convincing, it is unclear how they might be deployed to therapeutic ends.

      Reviewer 3: The authors should explain why this is an interesting finding. They mention in the abstract that this heterogeneity highlights potential vulnerabilities that could be therapeutically exploited. How do they envision this? Why is this not a trivial result and in what way can this observation help design new therapies?

      We believe that our results suggest exciting opportunities in the search for novel therapeutic options and we agree further discussion on this important issue should be included in the revised manuscript. We have now expanded the discussion on these points and commented on the clinical relevance of our findings to answer the reviewers' concern (Discussion section page 15 lines 7-15).

      Our data are consistent with the widely acknowledged role of NK cells in anti-tumour immunity (e.g. Pende et al, Frontiers Immunol. 2019, 10:1179, for review). NK activity is governed by engagement of a repertoire of activating and inhibitory receptors expressed on their surface (Shimasaki et al., Nature Rev Drug Discovery, 2020, 19, 200-218). Among the latter, homophilic interactions of CEACAM1 (CD66A) expressed on melanoma cells have been shown to protect the tumor from NK-mediated toxicity (Markel et al, J Immunol 2002; 168:2803-2810), in a strict parallel to our interpretation.

      NK cells are relatively sparse in the peripheral blood and abundant in a healthy liver. In patients with HCC, the numbers of peripheral, liver-resident and tumor-infiltrating NK cells all drop significantly, mainly due to the disappearance of the CD56dimCD16pos cell subset, corresponding to the cytotoxic NK population. Moreover, despite the continuous expression of activating receptors, the functionality of both the cytotoxic and the cytokine-producing (CD56brightCD16neg) remaining NKs is severely impaired (Cai et al Clinical Immunology (2008) 129, 428–437). The molecular mechanisms underlying NK anergy in the context of HCC have yet to be fully elucidated. However, CEACAM1 expression has been shown to suppress NK function in hepatitis C patients (Suda et al Hepatology Communications 2018;2:1247-1258) and there is ample evidence of CEACAM1 playing a major role in hepatic disease and in particular in protection against inflammation and immune-induced hepatitis (reviewed in Horst et al Int. J. Mol. Sci. 2018, 19, 3110). Thus, CEACAM1 is a bona fide regulator of NK function that is relevant in cancer and in non-cancerous liver pathology.

      In this context, our data introduce an additional notion, namely the tumor-promoting effect of a strong ERK activation in HCC that leads to CEACAM1-mediated anergy of NK.

      How might these findings be translated into future therapeutic options for HCC? Several scenarios can be envisaged, a very attractive one being cell-mediated immunotherapy, notably either autologous or allogeneic NK transfer. These therapies, which were initially developed for hematopoietic malignancies, are currently gaining momentum for solid tumors. Infusion of modified NK cells, including CAR-NK, presents major advantages over T-cell based therapies, mainly due to a much diminished risk of GVHD in the allogeneic setting and of cytokine release syndrome and neurotoxicity for autologous transfer. Moreover, because NK-mediated cytotoxicity is HLA-independent, it does not require careful haplotype matching, thus greatly increasing the speed and availability of cellular preparations (recently reviewed in Xie et al. EBioMedicine 59 (2020) 102975).

      Currently, there are 219 registered clinical trials, including 31 on HCC, for NK-mediated anti-solid tumor responses (clinicaltrials.gov). Although most of these are only in phase I or phase II, they hold great promise for the future. Our data strongly suggest that a new combination therapy might have an improved efficiency in a subset of HCC characterized by a strong ERK activation. This would involve either activated NK or CAR-NK in combination with an FDA-approved inhibitor of the MAPK/ERK pathway, such as trametinib. Our data lead us to predict that even a partial decrease in the intensity of ERK signaling would be likely to significantly increase the efficacy of NK-mediated anti-tumor activity, at least in a subset of HCC. While we appreciate that this suggestion remains speculative at this point in time, we believe the strength and novelty of our data warrant an exploration of such a novel therapeutic opportunity for this tumor type, which dramatically lacks reliable treatment options.

      • Specific points Reviewer 1

      Minor edits: Editorial review for minor, infrequent word usage edits

      We apologize for any English language mistakes in the manuscript. While the formulation of the remark makes us believe that our word usage does not impair the understanding of the text, we shall of course be willing to correct it.

      Figure 1E: Not possible to read genes in left heatmap, middle heatmap very small. Figure 3D: Units at x and y axes not legible/small. 3E: scales not legible. Figure 4: typo in legend H-> G.

      We apologize for not being more careful in preparing these figures; this has now been corrected. We realize that, due to the high number of genes, their names are still in very small print in the left panel of Fig. 1E. However, the complete list of the genes is given in supplementary table 1, and we added a larger image of the heat map with the table.

      Reviewer 2

      1. Is there any significant change of EMT like status in BMEL cells having H-RAS (high) vs H-RAS (low)?

      Several EMT markers (e.g. vimentin or loss of E-cadherin) are induced by H-Ras in BMEL cells, as we have previously reported (Akkari et al. J Hepatol, 2012). Moderate levels of Ras expression appear to be sufficient for this phenotype, since we did not detect significant differences in their expression profiles either between RASHIGH and RASLOW populations or between cells isolated from the hepatic versus the peritoneal tumors. We conclude that the phenotype of a selective advantage afforded by a high RAS expression level is not due to the EMT.

      There is no significant fold difference (MFI number) to put the sorting gates to enrich H-RAS high vs H-RAS low cells. The mRNA expression level was almost 3 fold difference. Is it correlated with protein expression level?

      This is a very valid point. We agree that the MFI difference is not strong, although it is in fact statistically significant in three independent cell sorting experiments. We were confident that the differences in the H-Ras mRNA level were reflected in the level of protein expression, since we have observed distinct transcriptional signatures as well as significant phenotypic differences in RasHIGH and RasLOW cells (Fig. 1 B and D). Nevertheless, we quite agree that the difference in protein expression level needed to be confirmed. This has now been done by immunoblot analysis of protein extracts with an antibody specific to RasG12V (Cell signaling #14412). These data have now been included in Figure 1A, and the text modified accordingly (page 4 line 9-19).

      Is there any translational relevance of these genes Al467606, Aim2, Dynap, Htra3, Itgb7, Tspan13 in HCC patients with poor survivability?

      The expression of these genes positively correlated with the level of the Ras oncogene in the ex vivo cell culture model, thus providing a nice demonstration that variation in HRAS oncogenic dosage translates into differential transcriptomic outputs. The analysis of publicly available data from The Cancer Genome Atlas (TCGA) also showed their expression in the HCC cohort (372 patient samples). The clinical impact of their expression level (shown below) is somewhat ambiguous: strong expression of ITGB7 and C16ORF54 (human ortholog of Al467606) correlated with a better prognosis, while expression of AIM2 and DYNAP had no impact on patient overall survival. Finally, HTRA3 and TSPAN13 were associated with worse outcomes and thus constitute particularly interesting candidates for future investigations. These somewhat unexpected divergent correlations likely reflect the fact that RAS/MAPK signalling is unlikely to be the sole regulator of their expression.

      4. Is there any difference between survival curve upon grafting of H-RAS (high) vs H-RAS (low) cells in Fig.2A?

      This experiment has not been performed for ethical reasons. Indeed, the difference in tumor growth upon injection of RasHIGH vs RasLOW is statistically significant 21 days after injection (Fig. 2A, p-value= 0.008). The size of the RasHIGH tumours is rather large and we chose to sacrifice the animals before they developed any signs of suffering.

      Is there any difference of H-RAS expression between liver tumor and peritoneal tumors?

      We have quantified H-Ras expression levels by RTqPCR in the flow cytometry sorted tumoral cells derived from the liver and peritoneal tumours (Fig. 3C). In the revised version of the manuscript we provide evidence that the mRNA expression levels of the oncogene correlate with the protein expression. Therefore, while the measurement of H-Ras protein has not been performed on the tumours, we would argue that it will indeed be different in the two tumour locations.

      Please provide the data for pro-inflammatory cytokines in TME.

      These data have been shown in the Suppl. Fig. 4C.

      Please provide an explanation of the DC activation with antigen presentation though the tumor is non-necrotic or apoptotic.

      While it is true that peritoneal tumors are less necrotic and have a lower apoptotic index than the matched hepatic primary ones (Fig. 3E), significant cell death can be detected at both locations. We assume that the released antigens are sufficient for presentation by the DC, as supported by the data in Fig. 4C, D and G.

      Is the TAM showing M2 phenotypes at peritoneal tumors?

      The reviewer correctly points out that the distinction between liver and peritoneal TAM polarization is not perfectly clear-cut, since some immunosuppressive but also some inflammatory markers are present at both tumor locations (Fig. 4B and Suppl. Fig. 4). This is not unexpected, as the spectrum of activation states that macrophages can adopt in vivo is neither static nor fully faithful to the M1/M2 polarization extremes inducible in vitro (see e.g. Ringelhan et al., Nat Immunol. 2018;19(3):222-232; Ruffell et al., Trends Immunol. 2012 33(3):119-26). We thus integrate these results with our observations of other modulated immune cell phenotypes in these tumors. Indeed, in addition to the macrophage polarization markers, we noted a more mature, activated phenotype in the peritoneal TAMs. Together with the cytokine expression profile in the two tumor locations (which is included as a supplementary table in the revised version of the manuscript), our data argue for a less inflammatory environment in the peritoneal tumors.

      Significance

      The data look quite promising and have a seminal impact for HCC patients with high H-RAS expression. TAM and DC showed some important immune regulation to promote HCC.

      We thank the reviewer for his appreciation of the significance of our study.

      Reviewer 3

      It is possible that RAS levels may not stay constant but dynamically go up and down. While this is a possibility that would complicate interpretations of the results, I am ok with the conclusions in the manuscript as it is, since there seems to be a significant difference between the different populations assayed.

      This is a valid point that we have addressed by comparing the H-RAS expression level in the parental BMEL population (labelled “cells before injection” in Fig 2D) to that in cells freshly isolated from the tumors after a rapid cell-sorting by flow cytometry (“tumors” in Fig. 2D) and to that in cells isolated from tumors and kept in culture for 14 days (“tumoral cell lines” in Fig. 2D). Our conclusion was that the level of RAS expression was stable upon ex vivo culture. This result does not exclude the possibility of an epigenetic regulation that operated in vivo and was maintained in the subsequent cell culture. However, even if this was the case, it would not alter the conclusion of a distinct selective advantage of the HRAS expression levels in the two tumoral locations.

      Significance

      The authors should explain why this is an interesting finding. They mention in the abstract that this heterogeneity highlights potential vulnerabilities that could be therapeutically exploited. How do they envision this? Why is this not a trivial result and in what way can this observation help design new therapies?

      This important point is very similar to the concern raised by Reviewer 1, and we have answered both together at the beginning of the rebuttal.

      We would like to thank again the reviewers for raising this issue, which prompted us to include the considerations of potential usefulness of our findings in the revised discussion.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, Lozano et al. used primary hepatocyte precursor cells expressing high and low levels of oncogenic RAS to determine the dose dependent effects of RAS on tumor formation. The authors found that cells expressing different levels of RAS encounter different selection forces that modulate tumor growth. That is, high RAS expression was required in tumor cells growing in the liver, but not in the peritoneum. These findings suggest that differences in tumor microenvironment activate selection mechanisms that trigger tumor heterogeneity. The authors also show that different levels of RAS signaling cause resistance/sensitivity to NK cell attack.

      The experiments are done well, and the conclusions follow from the data, with the challenge that the RAS high and low cells are not clones but are purified from a population of cells expressing a continuum of different levels of RAS. It is possible that RAS levels may not stay constant but dynamically go up and down. While this is a possibility that would complicate interpretations of the results, I am ok with the conclusions in the manuscript as it is, since there seems to be a significant difference between the different populations assayed.

      Significance

      I find this finding conceptually only somewhat interesting, the most interesting aspect being that the liver is a more selective environment than the peritoneum. Do the authors have an explanation for this? It is certainly no surprise that tumor cells require different signaling activities to survive and proliferate in different environments. Here it is different levels of RAS, why not.

      The authors should explain why this is an interesting finding. They mention in the abstract that this heterogeneity highlights potential vulnerabilities that could be therapeutically exploited. How do they envision this? Why is this not a trivial result and in what way can this observation help design new therapies?

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      [NOTE FROM THE EDITOR: THIS REVIEWER INDICATED S/HE DOES NOT WISH TO BE CONTACTED AGAIN]

      The manuscript by Anthony Lozano et al., "Ras/MAPK signalling intensity defines subclonal fitness in a mouse model of primary and metastatic hepatocellular carcinoma", shows a mechanistic role of the Ras/MAPK pathway in HCC with some immune mechanism. However, the authors need to address the following concerns in the manuscript.

      Major Comments-

      1. Is there any significant change of EMT like status in BMEL cells having H-RAS (high) vs H-RAS (low)?
      2. There is no significant fold difference (MFI number) to put the sorting gates to enrich H-RAS high vs H-RAS low cells. The mRNA expression level was almost 3 fold difference. Is it correlated with protein expression level?
      3. Is there any translational relevance of these genes Al467606, Aim2, Dynap, Htra3, Itgb7, Tspan13 in HCC patients with poor survivability?
      4. Is there any difference between survival curve upon grafting of H-RAS (high) vs H-RAS (low) cells in Fig.2A?
      5. Is there any difference of H-RAS expression between liver tumor and peritoneal tumors?
      6. Please provide the data for pro-inflammatory cytokines in TME.
      7. Please provide an explanation of the DC activation with antigen presentation though the tumor is non-necrotic or apoptotic.
      8. Is the TAM showing M2 phenotypes at peritoneal tumors?

      Significance

      The data look quite promising and have a seminal impact for HCC patients with high H-RAS expression. TAM and DC showed some important immune regulation to promote HCC.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This is a well-written, elegant manuscript where authors demonstrate that hepatocyte precursors (BMEL), when transformed with different copy numbers of oncologenic mutant H-Ras G12V, exhibit a dose-dependent fitness. Authors find that low dose H-ras mutant clones appear to be eliminated from primary tumors, but not from secondary tumours. Authors suggest that the different (primary v secondary) microenvironments, especially with respect to innate immunity, influence selection pressure differences. Specifically, investigators find that ceacam1-driven NK inhibition contributes to this clonal selection using murine models with and without adaptive immunity.

      Minor Weakness: While RAS pathway activation is present in 40% of HCC, given that few HCC are driven by Ras mutations, relevance could be called into question. Authors offer reasonable explanations for why they used the models they did.

      Minor edits: Editorial review for minor, infrequent word usage edits

      Figure 1E: Not possible to read genes in left heatmap, middle heatmap very small. Figure 3D: Units at x and y axes not legible/small. 3E: scales not legible. Figure 4: typo in legend H-> G.

      Significance

      HCC treatment options remain limited and outcomes are poor. Defining which pathways are coopted by HCC to improve fitness as primary or secondary tumors may improve treatment options. While studies are very elegant and results convincing, it is unclear how they might be deployed to therapeutic ends.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their helpful, detailed and insightful comments. We have modified the figures and rewritten large sections of the manuscript following the reviewers’ suggestions. In addition, we have incorporated new data throughout the manuscript and figures to clarify and better support our conclusions. All of these changes have significantly improved the coherence, consistency and clarity of our data, and have allowed us to better communicate the advance our findings represent for the fields of splicing and muscle development.

      Please find a point-by-point response to the reviewers’ comments below. The reviewers’ comments are in black and italics.

      Response to Reviewer 1

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Rbfox proteins regulate skeletal muscle splicing and function and in this manuscript, Nikonova et.al. sought to investigate the mechanisms by which Rbfox1 promotes muscle function in Drosophila.

      Using a GFP-tagged Rbfox1 line, the authors showed that Rbfox1 is expressed in all muscles examined but differentially expressed in tubular and fibrillar (IFM)muscle types, and expression is developmentally regulated. Based on RNA-seq data from isolated muscle groups, the authors showed that Rbfox1 expression is much higher in TDT (jump muscle) than IFM.

      Using fly genetics, the authors developed tools to reduce expression of Rbfox1 to different levels, and the strongest muscle-specific Rbfox1 knockdown was lethal and displayed eclosion defects (deGradFP > Rbfox1-IRKK110518 > Rbfox1-RNAi > Rbfox1-IR27286). Consistently, Rbfox1 knockdown flies have reduced jumping and climbing ability, due to defects in tubular muscle, where Rbfox1 is expressed at higher levels. Rbfox1 knockdown in IFM caused flight defects, which have been shown previously. Further characterization of IFM and tubular muscles demonstrated a requirement of Rbfox1 for the development of myofibrillar structures in both fibrillar (IFM) and tubular fiber-types in Drosophila. Interestingly, knockdown or overexpression of Rbfox1 displayed hypercontraction phenotypes in IFMs, which is often an end result of misregulation of acto-myosin interactions. This was rescued by expression of force-reduction myosin heavy chain (Mhc, P401S) in the context of Rbfox1 knockdown (the rescue experiment could not be performed with Rbfox1 overexpression due to complex genetics).

      Authors also performed computational analyses of the Rbfox binding motifs in the fly genome and identified the GCAUG motif in 3,312, 683, and 1,184 genes in intronic, 5'-UTR, and 3'-UTR regions, respectively. These genes are enriched for factors that play important roles in muscle function including transcription factors (exd, Mef2, Salm), RNA-binding proteins (Bru1), and structural proteins (TnI, encoded by wupA). Many of these gene transcripts and proteins are affected in flies with reduction or overexpression of Rbfox1. Using fly genetics, authors propose and test different mechanisms (co-regulation of gene targets by Rbfox1 and Bru1), and regulators of muscle function (exd, Mef2, Salm) and structural proteins (TnI, Mhc, Zasp52, Strn-Mlck, Sls) by which these changes could affect the muscle function.
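      As an illustration of the kind of motif search summarized above, the following is a minimal sketch that locates exact GCAUG matches in a sequence. It is not the authors' pipeline (the rebuttal refers to PWMScan and the oRNAment database, which use position weight matrices); the function name `find_motif_positions` and the example sequence are hypothetical.

```python
import re

# Canonical Rbfox binding motif named in the text.
RBFOX_MOTIF = "GCAUG"


def find_motif_positions(sequence, motif=RBFOX_MOTIF):
    """Return 0-based start positions of every exact motif match.

    The input may be DNA or RNA; T is converted to U before matching.
    A lookahead is used so that overlapping matches are also reported.
    This is an exact-match simplification, not a PWM-based scan.
    """
    rna = sequence.upper().replace("T", "U")
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() for match in pattern.finditer(rna)]


# Hypothetical sequence with two overlapping motif occurrences.
positions = find_motif_positions("AAGCATGCATGTT")
print(positions)
```

      Counting genes with at least one hit in a given region (intron, 5'-UTR, 3'-UTR) would then require an annotation of those regions, which is outside the scope of this sketch.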

      Overall, the characterization of Rbfox1 phenotypes and myofibrillar structure is very well elucidated, but the mechanisms by which Rbfox1 affects muscle function are not clear and remain largely speculative.

      We thank the reviewer for the positive evaluation of our phenotypic analysis of Rbfox1 knockdown in multiple muscle fiber types. This manuscript is the first detailed characterization of Rbfox1 in Drosophila muscle, extending far beyond our previous finding that Rbfox1-IR flies are flightless. Beyond behavioral and cellular phenotypes, we report that there are regulatory interactions between Rbfox1, Bruno1 and Salm and identify other Rbfox1 targets in flies. We acknowledge that there are molecular and biochemical details of specific regulatory mechanisms that remain to be elucidated, but this paper provides many foundational observations to guide future biochemical experiments and is thus important to the muscle field.

      **Major comments**

      1. The varying level of Rbfox1 knockdown (deGradFP > Rbfox1-IRKK110518 > Rbfox1-RNAi > Rbfox1-IR27286) was achieved by different strategies without validation at the protein level (likely due to lack of a Rbfox1 antibody). It is important to show different Rbfox1 protein levels (at least with different RNAi), especially when authors propose that autoregulation of Rbfox1 causes increased levels of Rbfox1 transcript in case of Rbfox1-RNAi (mild knockdown). Autoregulation of Rbfox1 in mammalian cells may not be similar in flies.

      To address this comment, we have toned down the discussion of level-dependent regulation throughout the manuscript, and have removed claims of Rbfox1 autoregulation. We appreciate the reviewer’s point that it would be ideal to be able to determine the protein levels of Rbfox1 in the different knockdown conditions. We have tested the published antibody against DmRbfox1, but it is very dirty and we see multiple bands in Western Blot. This background partially obscures the bands from 80-90 kDa at the molecular weight where we expect Rbfox1, and prevents accurate quantification (see Reviewer Figure 1). Verification of protein levels of Rbfox1 will require generation of a new antibody, which is beyond the scope of this study. As we do not have a good antibody, we performed two experiments to demonstrate our ability to tune knockdown efficiency. First, we crossed Rbfox1-IRKK110518 and Rbfox1-IR27286 to UAS-Dcr2, Mef2-Gal4 and demonstrated we could enhance the phenotype (Figure 2A, B). Second, we performed knockdown with the same hairpins at different temperatures and demonstrate that stronger knockdown at higher temperature leads to stronger phenotypes with the same hairpin (Figure 2B). These data support our knockdown series interpretation.

      Reviewer Figure 1. Western Blot of whole fly with anti-Rbfox1 (A2BP1) (Shukla et al., 2017). Tubulin was blotted as a loading control.

      • TnI and Act88F protein levels are inversely correlated with Rbfox1 level in IFM but did not correlate with the RNA level. Using RIP, authors showed that Rbfox1 bound to wupA transcripts (has Rbfox binding sites) but not Act88F transcripts (does not have Rbfox binding sites). Authors performed Rbfox1 IP and identified co-IP of components of cellular translational machinery and propose that wupA (TnI) levels are regulated by translation or NMD (non-sense mediated decay). A follow up experiment was not performed to identify the mechanism by which TnI level is regulated by Rbfox1.

      Further biochemical and genetic verification of the underlying mechanisms of Rbfox1 regulation in Drosophila muscle will be addressed in a future manuscript, as in vivo modulation of translation or NMD in an Rbfox1 knockdown background involves recombination to coordinate multiple genetic elements. We have modified the text to reflect this hypothesis remains to be explored in future experiments (Line 473-474).

      We have further added RT-PCR data for wupA transcript levels in IFM and TDT with Rbfox1-IRKK110518 knockdown (Figure S4 A), but as in Rbfox1-RNAi flies, there is not a significant change in expression. We do see significant downregulation of Act88F when we overexpress Rbfox1 in IFM (Figure S4 B), as well as in TDT when we knockdown Rbfox1 with either Rbfox1-IRKK110518 or Rbfox1-IR27286.

      It was known that TnI mutations (affects splice site, fliH or Mef2 binding site, Hdp-3) led to a reduction in TnI level and hypercontraction. Authors showed rescue of hypercontraction phenotype in hdp-3 background by knocking down Rbfox1, likely due to increase in wupA transcription (Mef2-dependent or independent manner). However, no rescue was observed in the fliH background. Reduced level of Rbfox1 in fliH background would be expected to cause worsening of phenotype as splicing of remaining wupA transcripts would be affected with reduced Rbfox1 level. The splicing of wupA of exon 4 is not affected in Rbfox1 knockdown (fig. 6U), it's not clear if the splicing of exon 6b1 is affected in Rbfox1 knockdown.

      We thank the reviewer for pointing out our lack of clarity regarding exon 6b1 and IFM-specific isoform 6b1. To address this comment and validate our previous data, we performed additional Sanger sequencing on RT-PCR products, added a diagram of the wupA gene region in Figure 4 A and improved the clarity of our discussion of the fliH and hdp3 alleles and our results in the text.

      To directly respond to the reviewer, first, it is unclear if the reduced level of Rbfox1 in a fliH background should actually cause a more severe phenotype. Our data suggests that Rbfox1 represses TnI expression through binding the 3’-UTR, and can likely indirectly regulate wupA expression level via Mef2. Thus, arguably, the reduced level of Rbfox1 in the fliH background might not affect splicing, as the mutations in the regulatory element should rather make wupA insensitive to increased Mef2 expression in the Rbfox-RNAi background.

      Second, we confirmed via Sanger sequencing of RT-PCR products that both IFM and TDT in control and Rbfox1-IR flies use exon 6b1 (current exon 7). The IFM isoform contains exon 3, 6b1 and 9, while the TDT isoform contains exon 3 and 6b1, but skips exon 9 (see Figure 4 A). In other tubular muscles, wupA isoforms skip exons 3 and 9, and use exon 6b2 instead of 6b1. Thus, to directly answer the reviewer’s question, no, splicing of exon 6b1 itself is not affected by Rbfox1. However, Rbfox1 does influence expression of the ”6b1 isoform”, or the wupA isoforms in IFM and TDT containing exon 6b1 and exon 3. Additionally, our data shows that Bru1, not Rbfox1, regulates alternative splicing of wupA exon 9 (Fig. S6 T).

      What the reviewer has correctly identified with this comment is that the effect on splicing in the hdp-3 allele also appears to be complex and to have not been fully clarified. Although hdp-3 results from mutation of a splice site in exon 6b1 (which based on (Barbas et al., 1993) results in aberrant use of 6b2 in IFM), it also results in a near complete absence of the longer isoform containing exon 3 in adult flies. hdp-3 is reported in the same paper to affect both IFM and TDT, which both express isoforms containing exon 3 and 6b1. It is not known how mis-splicing of exon 6b1 leads to loss of isoforms containing exon 3, but our data indicate that Rbfox1 is somehow involved. It is purely speculation and beyond the scope of this manuscript, but perhaps selection of alternative exons in wupA are not independent events (ie that the splicing of exon 3 depends on correct splicing of exon 6b1). This could be mediated with interactions with chromatin, the PolII complex or through a larger splicing factor complex (something like LASR, for example (Damianov et al., 2016)), that restricts choice in alternative events through higher-order interactions. Another possible mechanism is that a second mutation exists in the hdp-3 allele that affects splicing of exon 3, although this was not indicated in the extensive sequencing data in (Barbas et al., 1993).

      Bruno1 was identified as a co-regulator of Rbfox1 in different IFM and tubular muscle types. However, except Mhc, other Rbfox1 targets seem to be regulated by either Rbfox1 or Bruno1, not both. Analyses of RNA-seq datasets from single and double knockouts should identify additional targets to support the claim that - Rbfox1 and Bruno1 co-regulate alternative splice events in IFMs. Phenotypic changes with reduced Rbfox1 and Bruno1 double knockdowns are very severe, but the mechanistic basis of such genetic interaction resulting in synergistic phenotypes in IFMs is lacking as splicing changes in single vs double knockout is similar.

      We agree with the reviewer that RNA-seq data would be useful to obtain a genome-wide perspective on the regulatory interactions between Rbfox1 and Bru1, and we plan to generate this data as part of a future manuscript. However, the tissue-specific dissections to isolate enough material from all of the necessary genotypes will take months to complete, and are not realistic to wait to include in this manuscript. Instead, to address the reviewer’s question, we have expanded our RT-PCR experiments to cover a wider panel of events in 12 sarcomere genes (see new data in Figures 6 and S6 and summary in Figure 8). We now can show that splice events in Fhos and Zasp67 are Rbfox1 dependent, while events in sls, Strn-Mlck and wupA are Bru1 dependent. An event in Zasp66 responds to both Rbfox1 and Bru1, but in opposite directions. Events in Mhc, Tm1 and Zasp52 are regulated by both Rbfox1 and Bru1 (or are sensitive to changes in Bru1 expression in the Rbfox1 background), and change in the same direction. This data provides a clearer mechanistic basis for the synergistic phenotype observed between Rbfox1 and Bru1 in IFM.

      Rbfox1 is expressed at a high level in tubular muscle whereas Bruno1 is expressed at a high level in IFM. Rbfox1 binds to Bruno1 transcript and inversely regulates Bru1-RB level but knockdown of Bru1 does not affect Rbfox1 level (Fig. S5 G,I,J). Overexpression of Bruno1 decreased the Rbfox1 level, however, it's difficult to interpret these results as overexpression of Bruno1 may have other effects on IFM gene expression.

      The reviewer correctly pointed out that we did not observe significant changes in Rbfox1 mRNA levels in the mutant bru1M3 background, however, in the original version of this manuscript, we also showed a significant decrease in Rbfox1 expression in IFM from the bru1-IR background at both 72 h APF and 1 d adult in mRNA-Seq data. To clarify differences in Rbfox1 levels between bru1-IR and our bru1 mutant backgrounds, we have performed additional RT-PCR experiments. We examined Rbfox1 levels after knockdown of bru1 (bru1-IR), and we now show that Rbfox1 levels are significantly decreased in IFM and TDT after bru1-IR (Fig. 5S, Fig S5 I). We see a weaker effect in the bru1M2 hypomorphic mutant, which likely reflects differences in Bru1 expression levels in bru1-IR and the bru1M2 allele. These results are consistent with the mRNA-seq data we presented previously (now in Fig. 5R). These additional data suggest that loss as well as gain of Bru1 affects Rbfox1 expression levels.

      A dose-dependent effect of Rbfox1 knockdown was shown to regulate the expression of transcription factors that are important for muscle type specification and function including exd, Mef2, and Salm. However, it is not clear how Rbfox1 mechanistically regulates the expression of these transcription factors.

      We present two pieces of data suggesting possible regulatory mechanisms for Mef2. First, RIP data suggest Rbfox1 can directly bind the 3’-UTR region of Mef2, and this region contains two binding motifs identified in both the oRNAment database and in our PWMScan dataset. Second, we show that use of the 5’-UTR regions of Mef2 is altered in Rbfox1-IR muscle. Although not definitive, this suggests that regulation of alternative 5’-UTR use may influence transcript stability or translation efficiency. We feel the many experiments to elucidate the detailed mechanism of regulation (and indeed to determine the likely contribution of multiple, layered regulatory processes) are beyond the scope of this paper, and are better left for future studies. This manuscript is the first in-depth characterization of Rbfox1 function in Drosophila muscle, and we provide multiple lines of evidence suggesting that different regulatory mechanisms exist as a basis for future experiments to explore these interesting and important regulatory interactions.

      **Minor comments**

      1. It is not described if the rescue of Rbfox1 knockout by expression of force-reduction myosin heavy chain (Mhc, P401S) led to rescue of phenotypes (jumping, climbing, flight).

      Force-reduction myosin heavy chain MhcP401S is a mutation at the endogenous Mhc locus that results in a headless myosin and was previously characterized to be flightless (Nongthomba et al., 2003). It is however able to rescue jumping and walking defects observed with the hdp2 TnI allele, and supports largely normal myofibril assembly (Nongthomba et al., 2003). It is also important to note that fibrillar muscle function is very finely tuned, such that alterations that result in flightlessness in many cases do not alter myofibril structure as detected by confocal microscopy (Schnorrer et al., 2010). We therefore looked at myofiber and sarcomere structure as a more sensitive read-out of the rescue ability in the Rbfox1 knockdown, to be able to detect a partial-rescue of myofibrillar structure that may not be evident in a behavioral assay.

      Immunofluorescence (IF) and Western blotting are different techniques, and Bruno1 antibody was validated for specificity in IF but not in Western blots. Figure 5L and S5 E should include muscle samples from Bru1M2.

      We have added a Western Blot panel in Figure S5 D including bru1-IR, bru1M2 and samples of different wild-type tissues including abdomen, ovaries, testis and IFM.

      To quantify alternative splicing or percent spliced in (PSI), primers are typically designed in the exons flanking the alternative exons. A better primer design along with PSI calculation by RT-PCR will robustly validate alternative splicing changes in different genetic background (Fig 6U and S6 U).

      We do not yet have RNA-Seq data from these Rbfox1 knockdown samples to facilitate calculation of transcriptome-wide PSI values; thus, we rely on the results from our RT-PCR experiments. Our primers used to detect alternative splice events are indeed located within flanking exons or as close to the alternative exons as possible based on sequence design limitations (see schemes in Figure 6 and Figure S6). Many of the events we are detecting are complex, and not a simple “included” or “excluded” determination, and are therefore not amenable to RT-qPCR. To increase the robustness of our validation, we now provide RT-PCR gel-based quantification of exon use for the events we tested in Zasp52, Zasp66, Zasp67, wupA and Mhc (Figure 6 U-W and Figure S6 T-U).
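      For readers unfamiliar with the metric discussed in this exchange, percent spliced in (PSI) is the share of transcripts that include an alternative exon, and gel-based quantification generalizes this to the share of each isoform band among all bands in a lane. The sketch below illustrates that arithmetic only; the function names and the band intensities are hypothetical and are not taken from the manuscript.

```python
def isoform_fractions(band_intensities):
    """Return each band's share of the total lane signal, in percent.

    band_intensities maps an isoform label to its background-corrected
    band intensity (arbitrary units). Handles events with more than two
    isoforms, which a simple included/excluded ratio cannot describe.
    """
    total = sum(band_intensities.values())
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return {name: 100.0 * value / total
            for name, value in band_intensities.items()}


def percent_spliced_in(inclusion, exclusion):
    """PSI for a simple cassette exon: inclusion / (inclusion + exclusion)."""
    fractions = isoform_fractions({"included": inclusion,
                                   "excluded": exclusion})
    return fractions["included"]


# Hypothetical intensities for a two-band and a three-band event.
print(percent_spliced_in(30.0, 10.0))
print(isoform_fractions({"long": 1.0, "medium": 1.0, "short": 2.0}))
```

      Comparing these percentages between control and knockdown lanes, across biological replicates, is what allows a shift in exon use to be tested statistically.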

      Reviewer #1 (Significance (Required)):

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Understanding how muscle fiber type splicing and gene expression is regulated will conceptually move the field forward. How transcriptional and posttranscriptional programs coordinate to specify muscle fiber type gene expression is still lacking.

      Place the work in the context of the existing literature (provide references, where appropriate). Multiple RNA binding proteins and splicing factors have been shown to affect muscle function along with hundreds of gene expression and splicing changes in a complex fashion. Linking phenotypes with gene expression changes is still challenging as RNA binding proteins or RBPs are multifunctional and affect the function of other regulators that are important for muscle biology.

      We thank the reviewer for recognizing the conceptual advance our findings represent, as well as the complexity in the regulatory network we are seeking to understand. A detailed understanding of the coordination of transcriptional and posttranscriptional programs is enabled by our work and will be the subject of future investigation.

      State what audience might be interested in and influenced by the reported findings.

      Fly genetics, alternative splicing regulation, muscle specification and function.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Regulation and function of alternative splicing in muscle. I do not have a thorough knowledge of Drosophila genetics.


      Response to Reviewer 2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      This paper reports analysis of the function of RbFox1, an RNA-binding protein, best known for roles in the regulation of alternative splicing. It uses Drosophila as its in vivo model system, one that is highly suited to the analysis in vivo of complex biological events. In general, the authors present a very thorough approach with an impressive range of molecular analysis, genetic experiments and phenotypic assays.

      We thank the reviewer for recognizing the suitability of our model organism as well as the time investment and diversity of experiments that were performed in this work. We have added and revised multiple experiments during this revision, which has greatly improved the manuscript.

      The authors report that Rbfox1 is expressed in all Drosophila muscle types, and regulated in both a temporal and muscle type specific manner. Using inhibitory RNA to knock down gene function, they show that Rbfox1 is required in muscle for both viability and pupal eclosion, and contributes to both muscle development and function. A bioinformatic approach then identifies muscle genes with Rbfox1-binding motifs. They show Rbfox1 regulates expression of both muscle structural proteins and the splicing factor Bruno1, interestingly preferentially targeting the Bruno1-RB isoform. They report functional interaction between Rbfox1 and Bruno1 and that this is expression level-dependent. Lastly, they report that Rbfox1 regulates transcription factors that control muscle gene expression.

      They conclude that the effect on muscle function of RbFox1 knock down is through mis-regulation of fibre type specific gene and splice isoform expression. Moreover, "Rbfox1 functions in a fibre-type and level-dependent manner to modulate both fibrillar and tubular muscle development". They propose that it does this by "binding to 5'-UTR and 3'-UTR regions to regulate transcript levels and binding to intronic regions to promote or inhibit alternative splice events." They also suggest that Rbfox1 acts "also through hierarchical regulation of the fibre diversity pathway." They provide further evidence to the field that Rbfox1's role in muscle development is conserved.

      **MAJOR COMMENTS**

      Are key conclusions convincing?

      In terms of presentation, I suggest ensuring a clear demarcation throughout of the evidence behind the main conclusions. This can get somewhat lost as a great deal of information is presented, including all the parallels with prior findings in other systems. I am not saying this is a major problem, just highlighting the importance of clarity. Conclusions to clearly evidence include: Rbfox1 functions in a fibre-type manner to modulate both fibrillar and tubular muscle development (e.g. L664); Rbfox1 functions in a level-dependent manner (e.g. L664); Rbfox1 functions by binding to 5'-UTR and 3'-UTR regions to regulate transcript levels (e.g. L670); Rbfox1 functions by binding to intronic regions to promote or inhibit alternative splice events (e.g. L670); "Bru1 can regulate Rbfox1 levels in Drosophila muscle, and likely in a level-dependent manner" (L488) - Clearly evidence the level effect; "first evidence for negative regulation for fine tuning acquisition of muscle-type specific properties. Depending on its expression level, Rbfox1 can either promote or inhibit expression of" muscle regulators (L797). Lastly, the controlled stoichiometry of muscle structural proteins is known to be important, but all mechanisms are not known, so again make the supporting evidence as clear as possible for the interesting point of a role for Rbfox1 in this (e.g. L787).

      Using the above comments from the reviewer as a guide, we have rewritten the manuscript, including large portions of the discussion, introduction and results. We thank the reviewer for pointing out where we could more effectively communicate our results, support our conclusions and highlight the significance of our findings.

      Should some claims be qualified as preliminary or removed?

      P301 "complicated genetic recombination" - seems a bit weak to include. Either do it or don't include?

      We have removed this statement from the text.

      Also, see section below on "adequate replication of experiments"

      Are additional exps essential? (if so realistic in terms of time and cost) None essential in my view. It depends on the authors' goals, but for the most impact of the project then following up these suggestions are possible. L369-372: mutate putative Rbfox1 binding site and ask does binding still occur or not. If it doesn't, then ask if this mutation affects the expression of the putative target gene. L775-777 "Our data thus support findings that Rbfox1 modulates transcription, but introduce a novel method of regulation, via regulating transcription factor transcript stability." It would be good to demonstrate this.

      We thank the reviewer for these suggestions, and agree they are indeed interesting experiments, but beyond the scope of this manuscript. We plan to pursue the detailed molecular and biochemical mechanisms of regulation in a future project including exploring Rbfox1 binding through use of reporters, identification of direct targets via CLIP and investigation of post-transcriptional regulation of translation or NMD.

      Presented in such a way as to be reproduced

      Yes

      Are exps adequately replicated?

      A main area I would address is the authors' frequent use of "may", "tend", "trend". This is confusing the picture they present. What is statistically significant and what is not? Only the former can be used as evidence. Examples include: L170: "may display preferential exon use" - does it or doesn't it? L272: "myofibrils tended to be thicker" - were they or weren't they? L350 "wupA mRNA levels tend towards upregulation in Rbfox1-RNAi". L353 "but tended towards upregulation (Fig. S4A)" L466 "Correspondingly, we see a trend towards increased protein-level expression of Bru1-PA" L474 "both Bru1-PA and Bru1-PB tend to increase" L485 "Overexpression of Bru1 in TDT with Act79B-Gal4 also tends to reduce Rbfox1" L595 "Rbfox1-IR27286 tended towards increased exd levels in IFM (Fig. 7A)" L614 "and a trend towards increased use of Mef2-Ex20 " Also, L487 "suggesting that Bru1 can also negatively regulate Rbfox1" - one cannot use a non-significant observation to suggest something.

      We have modified the text to limit use of “may”, “tend” and “trend”, and have removed discussion of non-significant results. We thank the reviewer for the very helpful and detailed list of sentences to modify.

      **MINOR COMMENTS**

      Although individual samples are not significant, in aggregate there is a trend….

      Specific exp issues that are easily addressable

      L162: "dip in Rbfox1 expression levels around 50h APF". The Fig indicates as early as 30h. Is this significantly less than the 24h data point?

      Comparisons in Figure 1G that are significant based on DESeq2 differential expression analysis with an adjusted p-value

      L427 "this staining was lost after Rbfox1 knockdown". This conflicts with Fig 5K which says no significant difference. Again in L429 "Rbfox1 knockdown leads to a reduction of Bru1 protein levels in IFMs and TDT." Fig says no significant difference in TDT.

      We thank the reviewer for pointing out this inconsistency. We have revised the text accordingly. Our Western Blot (Figure 5L, M) and RT-PCR (Figure 5N, O) do show changes of Bru1 protein and mRNA expression levels after knockdown of Rbfox1KK110518.

      Are prior studies referenced appropriately?

      This m/s is an authoritative presentation of the field as a whole with a comprehensive, impressive reference list. However, a point related to this area is one of the main things I would consider tackling. This is to have more clarity in the demarcation of what this study has found that adds to prior knowledge. It is worthwhile in itself to demonstrate the many similarities with previous work in other systems, as part of establishing the Drosophila system with all its analytical advantages for in vivo molecular genetics as an excellent model for future study in this area of research. However, the impact/strength of this m/s would be enhanced by clarity in presenting what is new to the field in all organisms.

      We thank the reviewer for this suggestion. We have rewritten large portions of the manuscript, including the introduction and discussion, to improve the clarity of our findings and their importance to the field.

      Are the text and Figs clear and accurate?

      TEXT

      L156: more precise language than "in a pattern consistent with the myoblasts" - maybe a simple co-expression with a myoblast marker?

      We have revised this phrasing in the text. Rbfox1 expression in myoblasts was previously reported by (Usha and Shashidhara, 2010).

      L181: at first use define difference between RNAi and IR

      We use IR as an abbreviation for RNAi. In particular, we are trying to distinguish the two hairpins obtained from stock centers (27286 and KK110518) from the third, homemade RNAi hairpin, originally named UAS-dA2BP1RNAi, that was generated by Usha and Shashidhara (Usha and Shashidhara, 2010). We have better defined this in the text and methods.

      L205: maybe clearly explain the link between eclosion and tubular muscle??

      We have added a sentence explaining the link between eclosion and tubular muscle (see Line 331).

      L231: "Sarcomeres were not significantly shorter at 90h APF with the stronger Mef2-Gal4" - not clear why this is the case when the less strong knockdown conditions have shorter sarcomeres.

      We have modified the text as well as the figure labeling to clarify that the other samples were tested in 1 d adult, while the KK110518 hairpin was tested at 90 h APF. This likely indicates that the short sarcomeres observed in 1 d adults reflect hypercontraction, which in IFM is classically first apparent after eclosion when the flies actively try to use the flight muscles. The difference in timing is due to pupal lethality of the KK110518 hairpin line, so we could not evaluate adult flies.

      L234: "classic hypercontraction mutants in IFMs display a similar phenotype" - presumably not similar to the not significantly shorter sarcomeres of the previous sentence.

      We have modified the text to clarify this statement. The change in sarcomere length from 90 h APF to 1 d adult is actually the relevant observation, as this reflects the progressive shortening of sarcomeres observed in classic hypercontraction mutants.

      L244: "90h", should be "90h APF"?

      Yes, we have modified the text.

      L273: "Myofibrils in Act88F-Gal4 mediated knockdown only showed mild defects (Fig. 3 G, H, Fig. S2 C, D) despite adult flies being flight impaired". This seems worthy of discussion - the functional defect is not due to overt structure change? *

      In our own experience as well as observations included in a genome-wide RNAi screen in muscle (Schnorrer et al., 2010), there are a rather large number of knockdown conditions where few if any structural defects are observed at the level of light microscopy, but flies are completely flightless. We interpret this to reflect the narrow tuning of IFM function, where slight alterations in calcium regulation or sarcomere gene isoform expression result in dysfunction and a lack of flight. Ultrastructural evaluation might reveal defects in these cases, but the defect could also be with the dynamics of tropomyosin complex function, calcium regulation, mitochondrial function or even neuro-muscular junction structure. We have added a sentence to the text to discuss and clarify the Act88F result.*

      L281 "also known as Zebra bodies" - helpful to indicate these on the Fig, they are not. *

      We have added arrows to the figure to mark the Zebra bodies, and updated the figure legend.*

      L282: "we were unable to attempt a rescue of these defects" - I may have missed something, but what about rescue undertaken of the defects on previous pages? *

      This is the first point in the text where we introduced overexpression of Rbfox1, as the preceding experiments were knockdowns or used a GFP-tagged protein trap line at the endogenous locus. We have revised the sentence to focus on the overexpression phenotype with UH3-Gal4.*

      L283: "Over-expression of Rbfox1 from 40h APF" - this is the first over-expression experiment, so introduce why done now (and perhaps not earlier), and also explain the use of a different Gal4 driver.*

      We have reworded this section of the text. The UH3-Gal4 driver is restricted to expressing in IFM from 40h APF, so is first expressed after myofibrils have been generated and selectively in IFM. This avoids lethality observed from pan-muscle expression with Mef2-Gal4 (presumably due to severe defects in tubular muscles), and also allows us to image IFM tissue from adult flies. Later experiments with Mef2-Gal4 were performed with a later temperature shift to avoid this early lethality.*

      L290 "Interestingly, both Rbfox1 knockdown and Rbfox1 over-expression produce similar hypercontraction defects" - this could be interesting, worthy of discussion/explanation. *

      The most logical explanation is that Rbfox1 regulates the balance in fiber-type specific isoform expression. Loss of Rbfox1 would cause a shift in the relative ratio of the isoforms of structural genes, and overexpression of Rbfox1 would likely cause a similar shift in the opposite direction. This is supported by our RT-PCR panel, where we see co-regulation of different events with Bru1, and we see fiber-type specific difference in regulation of alternative splicing (Figure 8). Overexpression of Rbfox1 would be expected to make IFM look more like TDT, which would result in an isoform imbalance and lead to the observed hypercontraction phenotype. Interestingly, loss and overexpression of Bru1 also result in the same hypercontraction phenotype, similar to what we observe with Rbfox1. We have added a paragraph in the discussion about level-dependent regulation, to address this reviewer comment.*

      P305: Bioinformatic analysis. It is not clear what is taken as a potentially interesting result. On average a specific 5 base motif is found every 1000bps - so what is being looked for? How many sites in what length or position? A range of examples are described in the next pages of the m/s. For example: L337 "Bruno1.... contains 42 intronic and 2 5'-UTR Rbfox1 binding motifs" and L591 "exd contains three Rbfox1 binding sites," *

      We have redone the bioinformatic analysis completely, relying on data from oRNAment and the in-vitro determined PWM. We have also rewritten all portions of the text related to this analysis and no longer focus on the number of observed motifs in a given gene. As we unfortunately do not have RNA CLIP data, we do not know genome-wide which motifs are bound in muscle. Clustering of motifs may reflect binding, but a single, strong motif can also be bound, as we demonstrate via RIP of the wupA transcript. Thus, we identified interesting targets to test based on 1) a previously described role in the literature in myofibril assembly or contractility and 2) the presence of any Rbfox1 motif in that gene. A more elegant selection method of direct and indirect target exons will be designed for a future manuscript after integrating CLIP and mRNA-Seq data that have not yet been collected.
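
      The reviewer's back-of-envelope estimate of motif frequency can be reproduced with a short calculation. The sketch below is only an illustration: it assumes a random sequence with equal base frequencies (which real genomes do not have), and the function names are hypothetical rather than part of the analysis described above.

```python
# Expected frequency of one specific k-mer in a random sequence.
# Assumption: all four bases are equally likely and positions are independent.

def expected_spacing(k, alphabet_size=4):
    """Average number of bases between occurrences of one specific k-mer."""
    return alphabet_size ** k


def expected_hits(k, length, alphabet_size=4):
    """Expected number of occurrences of one specific k-mer on one strand."""
    positions = max(length - k + 1, 0)  # number of possible start positions
    return positions / alphabet_size ** k


if __name__ == "__main__":
    # Compare 5-, 6- and 7-base motifs in a hypothetical 100 kb locus.
    for k in (5, 6, 7):
        print(k, expected_spacing(k), round(expected_hits(k, 100_000), 1))
```

      Under these assumptions a 5-base motif is expected roughly once per kilobase and a 6-base motif roughly once per 4 kb, which is why the count of motifs in a long gene is not by itself informative.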

      L315: "many of these genes have binding or catalytic activity". "catalytic activity" seems very vague.

      For the original supplemental figure panel, we relied on Panther high-level ontology terms, which can unfortunately be rather vague, ie “catalytic activity” or “binding activity”. We have redone this analysis and rely rather on GO terms in the biological process and molecular function categories (Figure S3 B).

      L317 "When we look in previously annotated gene lists" - be more specific. What are they?

      This section of the text has been rewritten, and the “previously annotated gene lists” are described in greater detail in the Methods. *

      L327 "may also affect the neuro-muscular junction" - maybe better left for the Discussion? *

      We have removed this sentence from the Results.*

      L333 "extradenticle (exd) and Myocyte enhancer factor 2 (Mef2) contain 3 and 7 Rbfox1 motifs," Discuss the number and position of multiple motifs found in known targets? *

      We have removed the discussion of the number of binding sites for different target genes, instead incorporating this information graphically in Figure S3 C. It is not clear that the number of binding sites per gene has any influence on whether it is regulated in Rbfox1 knockdown. Thus, we have de-emphasized discussion of the number of binding sites throughout the text.*

      L350 "wupA mRNA levels " - clearer to stick to using TroponinI or WupA? *

      We have updated instances throughout the text to consistently refer to the protein as Troponin-I (TnI) and the gene or mRNA as wupA. *

      L376 "To check whether Rbfox1 regulates some target mRNAs such as wupA....." The suggestion here is more of a further indication than a "check". *

      We have reworded this section of the results to make the link between post-transcriptional regulation and our mass spectrometry results more salient.*

      L544 "In IFMs, knockdown of Rbfox1 and loss of Bru1 results in...." clarify if this is the two genes separately or the two genes together? *

      We have rewritten this entire section and present an expanded list of tested alternative events. We have taken care in this revision to clearly denote if the genotype is Rbfox1-IR or bru1M2 or a double knockdown background.*

      L580 "Our bioinformatic analysis identified Rbfox1 binding motifs in more than 40% of transcription factors genes" - is this all TFs or just "muscle" TF genes? *

      We have redone this analysis and changed this sentence in the text.*

      L598, what would be the mechanism of some decrease in Rbfox1 increasing mRNA levels and more of a decrease resulting in a decrease of the mRNA? The authors say "the nature of this regulation requires further investigation". *

      We have added more data to this section of the manuscript and repeated several of these experiments. After adding more biological replicates and additional data points, we have more consistent results that also demonstrate the variability in bru1 expression levels after Rbfox1 knockdown. Overall levels of bru1 assayed with a primer set in exons 14 and 17 now consistently show an increase in bru1 expression after Rbfox1 knockdown between all three hairpins (Rbfox1-RNAi, Rbfox1-IRKK110518 and Dcr2, Rbfox1-IR27286) (Figure 5 N).

      The relationship between expression level of Rbfox1 and expression level of bru1 and Bru1 protein isoforms is more complex. We now report a novel splice event in the annotated isoform bru1-RB that skips exon 7, resulting in a frame shift and generation of a protein that lacks all RRM domains, which we call bru1-RBshort (Figure S5). This short isoform is preferentially used in TDT, while the long isoform encoding the full-length protein is preferentially used in IFM (Figure 5 P). Presumably, this provides a mechanism, in addition to the use of different promoters, for muscle cells to regulate expression levels of different Bru1 isoforms. Knockdown of Rbfox1 in IFM results in a significant increase in the use of the long mRNA isoform, but paradoxically a decrease in the corresponding protein isoform (Figure 5, S5). We interpret this to mean that Rbfox1 regulates alternative splicing of Bru1, and likely independently a translational/post-translational mechanism regulates the expression level of Bru1-RB. This in theory could be mediated by interaction with translational machinery, post-translational modification, increased P-granule association, etc., and given the depth and breadth of experiments (as well as the multitude of isoform-specific expression reagents) required to isolate the responsible pathway, we deem it beyond the scope of this manuscript to biochemically demonstrate this specific regulatory mechanism. *

      L609 "The short 5'-UTR encoded by Mef2-Ex17". Ensure all abbreviations are defined. What does "Ex" mean here? Not straightforward to relate to the diagram in the Supplemental material that indicates the Mef2 gene has many fewer than 17 exons. In Fig7 legend too. *

      We have changed “Ex” to “exon” in the text. We apologize for the confusion. We have also added a diagram to Figure 7 E of the 5’-UTR region of Mef2, and a complete diagram of the locus in Figure S3 C. Based on the current annotation, Mef2 exons are numbered 1 to 21, corresponding to at least 16 distinct regions of the genome (18 if you include the variable 3’-UTR lengths). Exons sometimes will have more than one number in the annotation if a particular splice event causes a shift in the ORF, or if alternative splice sites or poly-adenylation sites are used. Mef2 is also on the minus strand, so as exons are numbered based on the genome scaffold, the exon numbering goes in reverse (ie exon 1 is the 3’-UTR).

      We strongly believe in following the numbering provided in the annotation, to increase reproducibility and transparency in working with complex gene loci for many different genes. Another researcher can go to FlyBase, look up the exon number for a given gene in a specific annotation, and get the exact location and sequence of the exons we name. It is challenging and time-intensive to go through older papers and figure out which exon or splice event corresponds to those in the current annotation, and we aim to alleviate this difficulty. We illustrate this in Figure 4 A for the wupA locus, where we verified exon numbers in annotation FB2021_05 by BLASTing each individual sequence and primer provided in (Barbas et al., 1993).*
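
      The effect of strand on exon numbering can be shown with a toy example. This is a minimal sketch with made-up coordinates, assuming exons are numbered by ascending genomic position as described above; `transcript_order` is a hypothetical helper, not a FlyBase function.

```python
# Exons numbered by ascending genomic coordinate, as in the annotation.
# For a minus-strand gene, the transcript is read from high to low coordinates,
# so the 5' end of the transcript carries the highest exon number.

def transcript_order(exons, strand):
    """Return annotation exon numbers in 5'-to-3' transcript order.

    exons: list of (annotation_number, start, end) tuples.
    strand: "+" or "-".
    """
    ordered = sorted(exons, key=lambda e: e[1], reverse=(strand == "-"))
    return [number for number, _start, _end in ordered]


if __name__ == "__main__":
    # Hypothetical three-exon gene; coordinates are for illustration only.
    toy_exons = [(1, 100, 200), (2, 400, 500), (3, 900, 1000)]
    print(transcript_order(toy_exons, "+"))
    print(transcript_order(toy_exons, "-"))
```

      In the toy minus-strand case, exon 1 is the last exon of the transcript, mirroring the statement that exon 1 of Mef2 is the 3'-UTR.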

      L617 "Levels of Mef2 are known to affect muscle morphogenesis but not production of different isoforms" - clarify what is meant here by "different isoforms". *

      We have revised this section of the text. This statement was meant to reflect that Mef2 affects muscle morphogenesis through regulation of transcription levels, but not at the level of alternative splicing.*

      L638 "Salm levels were significantly increased in IFM from Rbfox1-RNAi animals, but significantly decreased in IFMs from flies with Dcr2 enhanced Rbfox1-IR27286 or Rbfox1-IRKK110518". This is worth discussion or further analysis. Normally would expect an allelic series, with an effect becoming more apparent with increased loss-of-function. *

      Dcr2, Rbfox1-IR27286 and Rbfox1-IRKK110518 produce a stronger knockdown than Rbfox1-RNAi, and indeed produce significantly decreased levels of salm, thus following the allelic series. We repeated this experiment, but obtained the same results. *

      L641 "This suggests that Rbfox1 can regulated Salm". How, if there are no Rbfox1 binding sites? Deserves further analysis? *

      Our new bioinformatic analysis suggests a possible answer, in that it identified possible Rbfox1 motifs in a salm exon and a site in an intron. Previously, we had focused on introns and UTR regions. In addition, using the PWM we now recover Rbfox1 binding sites of the canonical TGCATGA as well as AGCATGA sites. The intron site in salm is an AGCATGA site. Further experiments will be required to determine if Rbfox1 directly binds to salm mRNA, if it interacts with the transcriptional machinery to regulate salm expression, or if this regulation occurs through yet a different mechanism, and are beyond the scope of this manuscript.*
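
      As a simplified illustration of the two motif variants named above, the following sketch scans a transcript-sense sequence for TGCATGA and AGCATGA. It uses an exact string match, which is a stand-in for (not a reproduction of) PWM-based scoring, and the example sequence is invented.

```python
import re

# Matches TGCATGA or AGCATGA. The lookahead reports overlapping matches too.
# Assumption: the input is the transcript-sense strand, in DNA or RNA letters.
MOTIF = re.compile(r"(?=([TA]GCATGA))")


def find_motifs(seq):
    """Return (0-based start, matched motif) for every motif occurrence."""
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq)]


if __name__ == "__main__":
    example = "ccTGCATGAttttAGCATGAgg"  # hypothetical sequence
    for start, motif in find_motifs(example):
        print(start, motif)
```

      A position weight matrix scan would additionally score near-matches, so an exact-match scan of this kind gives only a lower bound on candidate sites.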

      L674: "We found the valence of several regulatory interactions..." I'm not sure the meaning of "valence" here and elsewhere will be readily understood. *

      Thank you for pointing this out. We have used a different phrasing throughout the text.*

      FIGURES

      Fig 1 it is difficult to see the green in A-F. Can this be improved? It is clearer in I-L. *

      We have replaced the images with better examples and increased the levels to make the green channel better visible. *

      Fig 2 legend (others too), say what the clusters of small black ellipses in P and Q are. *

      Thank you for pointing out this oversight. All boxplots are plotted with Tukey whiskers, which extend from the 25th and 75th percentiles to the most extreme data points within 1.5 times the interquartile range. Dots represent outlying data points beyond this range. We have added statements in the relevant figure legends, as well as a more detailed explanation in the Methods. *
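
      The Tukey convention can be made concrete with a small calculation. This is a minimal sketch using Python's standard library; the quartile method ("inclusive") is an assumption, since plotting programs differ in how they interpolate percentiles, and the data are invented.

```python
from statistics import quantiles


def tukey_summary(values):
    """Quartiles, whisker ends and outliers under the Tukey 1.5 x IQR rule."""
    data = sorted(values)
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Hypothetical measurements with one extreme value.
    print(tukey_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30]))
```

      In this toy dataset the value 30 lies beyond the upper fence and would be drawn as an individual dot, while the upper whisker stops at 9.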

      Fig 3 it is not easy to see a shorter sarcomere in D, as the arrow partially obscures what is being indicated. Also, the data in G indicates that sarcomeres are not shorter in Mef2 GAL4 > KK110518, although the legend says this is shown in D. *

      We have rephrased the statement in the legend. The arrows are pointing to frayed or torn myofibrils.*

      Fig 5 legend "-J). Bru1 signal is reduced with Rbfox1-IRKK110518 (C, F, I)". Clarify that this is only in IFM. It is not significant in TDT or Abd-M.

      Done.*

      Fig 7 legend "quantification of the fold change in exd transcript levels" - only KK110518 in IFM is significant. *

      This panel was moved to Figure S7. The relevant regions of the text and figure legend were modified to reflect that only Rbfox1-IRKK110518 results in a significant change in exd levels.

      C - "indicates Rbfox1 binds to Mef2 mRNA" - it is not easy to see the band.

      We replaced the image and adjusted the levels to make the band more visible.

      D - what do the different lanes on the gel below the histogram in D correspond to?

      We adjusted the labeling on the figure panel. The gel is a representative image of RT-PCR results that are quantified above in the histogram.

      *Suggestions that would help the presentation of their data and conclusion **

      There is a lot of good, thorough work here, but overall there is the impression that some of the presentation/writing could be improved (also see the above lists on clarity and accuracy). I admire the authors for their comprehensive presentation of what has already been found out in this field. As the authors summarise, a lot is already known in many other species, so (as also indicated above) it is crucial to emphasise what new is found in this work that advances overall knowledge in this field. This can be obscured in many places where they say because of what was found in vertebrate systems we looked in Drosophila. These include: L417: "This led us to investigate if Rbfox1 might regulate Bru1 in Drosophila." L452: "and we were curious if these interactions are evolutionarily conserved in flies." L528 "Thus, we next checked if Rbfox1 and Bru1 co-regulate alternative splicing in Drosophila muscle." L677 "Moreover, as in vertebrates, Rbfox1 and Bru1 exhibit cross-regulatory interactions" L683 "Rbfox1 function in muscle development is evolutionarily conserved" L697 "Here we extend those findings and show that as in vertebrates......" L702 "our observations are consistent with observations in vertebrates" L707 "Studies from both vertebrates and C. elegans suggest that Rbfox1 modulates developmental isoform switches." L746 "We see evidence for similar regulatory interactions between Rbfox1 and the CELF1/2 homolog Bru1 in our data from Drosophila." *

      We thank the reviewer for this honest and helpful assessment of the manuscript. Upon rereading the original text and with the guidance of the list of sentences above, we agreed with the reviewer and we have rewritten large segments of the manuscript. In particular in the introduction and discussion, we now better emphasize what is new in our findings and how they advance overall knowledge in this field.

      L185 paragraph. The knockdown series is important for the study. A lot is presented in this paragraph, especially for a non-specialist and it could be easier to follow. Perhaps present the four genetic conditions in the order of the severity of their phenotype on viability. Also, clearly state what each Gal4 driver is used for. What is the nature of the RNAi/IR lines such that Dcr2 could enhance their action? Also comment on off targets - are any predicted?

      We have rewritten this paragraph as the reviewer requested. The hairpins are ordered by decreasing phenotypic severity, and we have more clearly described each Gal4 driver as well as Dicer2. This information is also available in the Methods, along with the off targets for the hairpins. KK110518 has one predicted off-target ichor, but this gene is not expressed in IFM, TDT or leg based on mRNA-Seq data. 27286 has no predicted off-targets. *

      L227: "In severe examples". Be as clear as possible. Are the "severe examples" using the stronger RNAi line or are they the most severe examples with a single line? I'd suggest including the result in the main Fig rather than in the Supplemental. However, as I read more of the m/s I realise there is a great deal of important information in the Supplemental Figs, and so the case is not much stronger for this example than many others. The balance of what is included where could be looked at, because it is not straightforward for the reader to read the paper and quickly flick between the main and supplemental Figs. Later in the m/s is a substantial section that starts L450 (finishes L489) and which only refers to Supplemental Figs. L503 is another area where it is necessary, and difficult, for the reader to move between main Figs and supplemental Figs. *

      We have reorganized the figure panels in several figures, notably Figures 4, 5, 6, 7 and 8 and the corresponding supplementary figures, including moving panels from the supplemental figures to the main figures and generating more comprehensive quantification panels. In the specific case referenced here for Fig. S1 P and Q, we chose to keep the most representative images of the phenotype in the main figure (Fig. 2 I, N), and have reworded the text to reflect that the most severe phenotypic instances are in the supplement. As we do not have CLIP data, we chose to keep the bioinformatics analysis in the supplement and have shortened the paragraph in the results devoted to Figure S3. We hope our reorganization and rewriting have better streamlined the text and figures.

      L258: - perhaps a Table summarising this and other phenotype trends with the different RNA conditions might be helpful. It gets quite difficult to follow.

      We have revised the text and several figure panels to make the phenotypic trends with the different RNAi conditions easier to follow.*

      Reviewer #2 (Significance (Required)):

      The advance reported is mechanistic.

      The authors already do a very good job of placing their work in the context of prior research (see comment in Section A).

      Muscle biologists interested in its development and function will be interested in this work. More broadly, those intrigued by alternative splicing will be interested. Despite its very widespread occurrence, much about alternative splicing is still poorly understood in terms of regulation and significance. This is especially the case in vivo, and this paper uses an excellent in vivo model system (Drosophila) for the genetic and mechanistic analysis of complex biological problems. My field of expertise: cell differentiation, gene expression, muscle development, Drosophila.

      Response to Reviewer 3

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **SUMMARY**

      This manuscript characterizes the role of splicing factor Rbfox1 in Drosophila muscle and explores its ability to modulate expression of genes important for fibrillar and tubular muscle development. The authors hypothesize that Rbfox1 binds directly to 5'-UTR and 3'-UTR regions to regulate transcript levels, and to intronic regions to promote or inhibit alternative splicing events. Because some of the regulated genes encode transcriptional activators and other splicing factors such as Bru1, the effects of Rbfox1 may encompass a complex regulatory network that fine-tunes transcript levels and alternative splicing patterns that shape developing muscle. Most likely the authors' hypothesis is correct that Rbfox1 is critical for muscle development in Drosophila, but overall the interesting ideas presented here are too often based only on correlations without further experimental validation. *

      We respectfully disagree with the reviewer that our hypothesis that Rbfox1 is critical for muscle development in Drosophila is based only on correlation without further experimental validation. In this manuscript we extensively characterize the knockdown phenotype of 3 RNAi hairpins against Rbfox1 as well as a GFP-tagged Rbfox1 protein in both fibrillar flight muscle and tubular abdominal and jump muscle. All hairpins produce similar phenotypes with defects in myofiber and myofibril structure and result in behavioral defects in climbing, flight and jumping, confirming this phenotype is due to loss of Rbfox1 and not a random off-target gene. We also convincingly demonstrate that Rbfox1 regulates Bru1, another splicing factor known to be critical for fibrillar specific splice events in IFM. Moreover, Rbfox1 and Bru1 genetically interact selectively in IFM and our RT-PCR data for 12 select structural genes reveals fiber-type specific alternative splicing defects regulated by Rbfox1 selectively, by Bru1 selectively, or by both Rbfox1 and Bru1. Thus, we conclude that Rbfox1 is indeed critical for muscle development, and this is the first report to demonstrate this requirement in Drosophila.*

      **MAJOR COMMENTS**

      The hypothesis that Rbfox1 plays an important role in regulating muscle development is based on previous studies in other species and supported by much new data in this manuscript. Initial bioinformatic analysis showed that many Drosophila genes, including 20% of all RNA-binding proteins, 40% of transcription factors, etc. have the motifs in introns or UTR regions. However, I think a deeper analysis is required. Any hexamer might be present about once every 4kb, and we do not expect all UGCAUG motifs are necessarily functional, so one might ask whether the association of Rbfox motifs with muscle development genes is statistically significant? Are the motifs conserved in other Drosophila species, which might support a functional role in muscle? Are the intronic motifs located as expected for regulatory effects, that is, proximal to alternative exons that exhibit changes in splicing when Rbfox1 expression is decreased or increased? *

      We appreciate the point of the reviewer that it would be ideal to distinguish genome-wide motifs that are actually bound directly by Rbfox1 from those that are unused, but our behavioral and phenotypic characterization of the knockdown phenotype in this manuscript is also valid without this data. The most effective approach to identify direct targets is to perform cross-linking immunoprecipitation, or CLIP, but we unfortunately do not have CLIP data from Drosophila muscle and it is beyond the scope of the current study to generate this data. It is not trivial to obtain the amount of material necessary to identify tissue-specific binding sites, as we would also likely expect differences in targeting specificity between tubular and fibrillar muscle. Genome-wide analysis of the evolutionary conservation of binding site motifs is also not trivial and is beyond the scope of this paper.

      Despite these limitations and to address the reviewer’s comment, we have done the following:

      1. We have completely redone our bioinformatic analysis using transcriptome data from the oRNAment database (Benoit Bouvrette et al., 2020), as well as searching genome-wide for instances of the in vitro determined PWM using PWMScan, to capture possible sites in introns (Figure S3). The oRNAment database was shown to reasonably predict peaks identified in eCLIP from human cell lines, which we assume would translate to a similar predictive capacity in the Drosophila
      2. We have calculated the expected distribution of Rbfox1 sites in a random gene list for Figure S3, and indeed the number of Rbfox1 sites in sarcomere genes is significantly enriched.
      3. We have looked more carefully at the distribution of Rbfox1 and Bru1 motifs in the transcriptome (in the oRNAment data), and find not only that these motifs frequently occur in the same muscle phenotype genes, but also that they are closer together than is expected by chance (Fig. S4 J).
      4. We marked the location of Rbfox1 and Bru1 motifs in the vicinity of select alternative splice events we tested via RT-PCR on the provided summary diagrams (Fig. 6, Fig. S6).
      5. We have tested additional alternative splice events, from 12 structural genes in total, and of the 9 events misregulated after Rbfox1 or Bru1 knockdown, all but 1 are flanked by Rbfox1 or Bru1 binding motifs. This indicates that the motifs are indeed located as expected for a regulatory effect.

      Is it possible to knock out an Rbfox motif and show that splicing of the alternative exon is altered, or regulation of transcript levels is abrogated?


      The construction and mutation of reporter constructs is possible, but would take longer than the recommended revision time-frame, in particular to generate reporters that can be evaluated in vivo. We intend to address the biochemical mechanism(s) of Rbfox1 regulation with future experiments in a separate manuscript.

      Also, what was the background set of genes used for the GO enrichment analysis? Genes expressed in muscle or all genes?

      The background set of genes for GO enrichment (now Figure S3 B) was all annotated genes for the “all genes” label and all muscle phenotype genes for the “Muscle phenotype” label.

      The data on cross regulation between Rbfox1 and Bru1 are confusing and inconsistent, since mild knockdown and stronger knockdown of Rbfox1 seem to have different effects on Bru1 expression. New data suggest that Rbfox1 can positively regulate Bru1 protein levels (Fig.5), but this seems inconsistent with the lab's earlier studies indicating opposite temporal mRNA expression profiles for Rbfox1 and Bru1 across IFM development. 


      We apologize for the confusion, but the relationship between Rbfox1 and bru1 levels across IFM development has not been published previously. We previously generated that mRNA-Seq data, but presented here (now in Figure 5Q) is a new analysis of that data, specifically focused on Rbfox1 and bru1 expression. We have corrected the phrasing in the text.

      To address this comment, along with points raised above by Reviewer 2, we have revised this part of the manuscript, added more data to this section of the manuscript and repeated several of these experiments. After adding more biological replicates and additional data points, we have more consistent results that also demonstrate the variability in bru1 expression levels after Rbfox1 knockdown. Overall levels of bru1 assayed with a primer set in exons 14 and 17 now consistently show an increase in bru1 expression after Rbfox1 knockdown between all three hairpins (Rbfox1-RNAi, Rbfox1-IRKK110518 and Dcr2, Rbfox1-IR27286) (Figure 5 N). This is consistent with our observations of inversely correlated mRNA levels during IFM development, as when Rbfox1 levels decrease, bru1 transcripts increase.

      We agree with the reviewer that the relationship between the expression level of Rbfox1 and expression level of bru1 mRNA and Bru1 protein isoforms is more complex. We now report a novel splice event in the annotated isoform bru1-RB that skips exon 7, resulting in a frame shift and generation of a protein that lacks all RRM domains, which we call bru1-RBshort (Figure S5). Unknowingly, we had previously used a primer set from exon 7 to exon 8 as “common”, which led to some confusion. This short isoform is preferentially used in TDT, while the long isoform encoding the full-length protein is preferentially used in IFM (Figure 5 P). Presumably, this provides a mechanism, in addition to the use of different promoters, for muscle cells to regulate expression levels of different Bru1 isoforms. Knockdown of Rbfox1 in IFM results in a significant increase in the use of the long mRNA isoform, but paradoxically a decrease in the corresponding protein isoform (Figure 5, S5). We interpret this to mean that Rbfox1 regulates alternative splicing of Bru1, and that a translational or post-translational mechanism likely independently regulates the expression level of Bru1-RB. This in theory could be mediated by interaction with translational machinery, post-translational modification, increased P-granule association, etc. Given the depth and breadth of experiments (as well as the multitude of isoform-specific expression reagents) required to isolate the responsible pathway, we deem it beyond the scope of this manuscript to biochemically demonstrate this specific regulatory mechanism. *

      Both Rbfox1 and Bru1 gene have many Rbfox motifs, but they are both large genes (>100kb) and would be expected to have many copies of all hexamers. How do we know whether any of them are functional?

      We do not know if all of the Rbfox1 binding sites in the Bru1 and Rbfox1 loci are bound, but the CLIP data required to assess this is beyond the scope of this manuscript, as discussed above. We do show, however, that changes in the expression level of Rbfox1 affect the expression of Bru1 on both the mRNA transcript and protein level, and changes in the expression level of Bru1 also can affect the expression level of Rbfox1. The direct or indirect nature of this regulation remains to be fully elucidated, although we do provide RIP data showing we can detect bru1 transcript bound to Rbfox1-GFP (Figure S4 I). We have modified the text to address this comment.

      Figure S4, section I, J: if changes in Bru1-RB isoform expression are correlated with Rbfox1 knockdown, it seems reasonable to test whether the Bru1-RB promoter can drive expression of GFP in an Rbfox1-dependent manner. But if I understand correctly, the assay as described on p. 19 uses the promoter region upstream of Bru1-RA. What is the logic for this experiment? It is not surprising that no effect was observed. The end result is that we have no idea whether Rbfox1 directly regulates bru1-RB. Even if it does, bru-Rb appears to be a minor component of Bru expression in IFM.

      Upon reevaluating this experiment and with respect to the reviewer’s comment, we have removed it from the manuscript to avoid confusion. Our new data indicate a switch in use of the bru1-RBlong and bru1-RBshort isoforms (Figure 5 N-P), suggesting that Rbfox1 regulation is on the level of splicing.

      Further experiments will be necessary to refine the indirect versus direct regulatory effects of Rbfox1 on Bru1, but our data do demonstrate that Bru1 levels are regulated in Rbfox1 knockdown conditions. We also provide a RIP experiment (Figure S4 I) showing that Rbfox1-GFP does directly bind bru1 mRNA, but we did not determine if this was isoform-specific. Multiple additional experiments would be necessary to distinguish between regulation of alternative splicing, direct binding to regulate transcript translation or stability, or transcriptional regulation via regulation of Salm, or some combination of these possible mechanisms. The data presented here are important to the field as they are the first report of isoform-specific regulation of Bru1 in muscle, even if we do not conclusively show if this regulation by Rbfox1 is direct or indirect.

      In the section "Rbfox1 and Bruno1 co-regulate alternative splice events in IFMs", the data show that splicing of several genes is altered by knockdown or over-expression of Rbfox1 and Bru1. The interesting conclusion is for a complex regulatory dynamic where Rbfox1 and Bru1 co-regulate some alternative splice events and independently regulate other events in a muscle-type specific manner. However, if we are to conclude that these activities are due to direct binding of Rbfox1 and Bru1 to the adjacent introns, we need information about the location of flanking Rbfox and/or Bru1 motifs. Do upstream or downstream binding sites correlate with enhancer or silencer activity, as reported in previous studies of these splicing factors in other species? For wupA, Figure S3 shows an intronic Rbfox site, but exon 4 is not labeled so the reader cannot correlate this information with the diagram in Figure 6U.

      As mentioned above, we have marked the location of Rbfox1 as well as Bru1 binding motifs in the diagrams in Figure 6 and Figure S6. We have tested additional alternative splice events, and can now show events regulated only in the Rbfox1 knockdown, only after bru1 knockdown, or in double knockdown flies (Figure 8). Eight out of nine events where we see clear changes in splicing are flanked by potential Rbfox1 or Bru1 motifs. Demonstration of direct binding and assay of genome-wide binding sites through CLIP studies is beyond the scope of this manuscript and will be pursued in the future.

      The evidence that Rbfox1 directly affects expression of transcription factor Exd seems to be based only on a correlation between Rbfox1 knockdown and decreased expression of Exd. The observation that binding of Rbfox1 to the Exd 3'UTR was not detected in RIP experiments further weakens the case.

      We agree with the reviewer and have moved the data related to exd to the supplement (Figure 7 and S7). We still mention exd in the text as it is significantly decreased after knockdown with Rbfox1-IRKK110518, but we have removed it from larger claims of transcriptional regulation as well as from the summary in Figure 8. We also note that, although we failed to detect Rbfox1-GFP bound to exd, this experiment was performed with adult flies. Since Exd is functionally important early in pupal development during fate specification of the IFMs, it is possible we might detect binding to exd mRNA at a different developmental timepoint.

      Similarly, there is a correlation of Rbfox1 knockdown with expression of alternative 5'UTRs in the Mef2 gene. However, the changes in UTR expression appear mostly not statistically significant. Do the authors have a model to explain what mechanism might allow Rbfox to regulate expression of alternative 5'UTRs, which would seem to be a transcriptional process?

      Mef2 transcript levels are significantly increased after knockdown with Rbfox1-RNAi and decreased after overexpression of Rbfox1, and we can detect direct binding of Rbfox1-GFP to Mef2 RNA via RIP. This establishes Mef2 as a likely direct target of Rbfox1 regulation, likely through the two Rbfox1 motifs in the 3’-UTR (Figure S3 C). In addition to this regulation, we made an observation that has not been previously reported in the literature, that IFM expresses a particular isoform of Mef2 that uses a short promoter encoded by Exon 17. We see both tissue-specific use of Exon 17 (Figure 7 F) as well as developmental regulation of Exon 17 use in IFM (Figure S7 C). Surprisingly, we saw that use of exon 17 in the Mef2 promoter is altered in Rbfox1 knockdown muscle. We now provide a quantification of this data, to show the change is statistically significant. We also provide a scheme of the Mef2 locus and RT-PCR primers with exons 17, 20 and 21 labelled (Figure 7 E). We have also rewritten this section of the text to increase the impact and clarity of our finding.

      For Salm, there apparently are no Rbfox motifs in the gene, and there are statistically significant but apparently inconsistent changes in Salm expression when it is knocked down in IFM by Rbfox1-RNAi (Salm increases) vs knockdown by Rbfox1-IR27286 or Rbfox1-IRKK110518 (Salm decreases). These are potentially interesting observations but more data would be needed to make stronger conclusions. How would regulation occur in the absence of Rbfox motifs?


      The best explanation we can provide for why salm expression is increased with the weak hypomorph Rbfox1-RNAi condition, but decreased with the stronger hypomorph Rbfox1-IRKK110518 or Dcr2, Rbfox1-IR27286 conditions is that salm regulation is sensitive to Rbfox1 expression or activity level. We now discuss this in a new section of the discussion. We further attempted several experiments to address this question, including obtaining an endogenously tagged Salm-GFP line, as well as a UAS-Salm line (kindly provided by F. Schnorrer). Disappointingly, there is no GFP expressed in the Salm-GFP line, either live, by immunostaining or in Western Blot of multiple developmental stages, indicating that the line has fallen apart and we have not yet redone the CRISPR targeting to generate a new line. The UAS-Salm construct works (too well), in that overexpression with Mef2-Gal4 results in early lethality and we have not yet managed to optimize the experiment and obtain enough pupal muscle where we can evaluate the effect on Bru1 or Rbfox1 levels.

      Our new bioinformatic analysis further revealed possible Rbfox1 motifs in a salm exon and a site in an intron. Previously, we had focused on introns and UTR regions. Now, using the in vitro determined PWM, we can recover Rbfox1 binding sites of the canonical TGCATGA as well as AGCATGA sites. The intron site in salm is an AGCATGA site. Further experiments will be required to determine if Rbfox1 directly binds to salm pre-mRNA, if it interacts with the transcriptional machinery to regulate salm expression, or if this regulation occurs through yet a different mechanism. We feel the many required experiments are beyond the scope of the current manuscript. Our data provides an experimental basis for future studies on this topic.
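A minimal Python sketch of the kind of motif search described above. It uses exact string matching for the two heptamers named in the response (TGCATGA and AGCATGA) rather than the PWM scoring the authors report, and the function names are illustrative assumptions, not the authors' pipeline. Because Rbfox1 binds RNA, the input is assumed to be the sense-strand (pre-mRNA) sequence; for a gene annotated on the reverse strand, the genomic sequence would first be reverse-complemented.

```python
import re

# The two heptamers named in the response, written as DNA.
MOTIFS = ("TGCATGA", "AGCATGA")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motifs(sense_seq: str, motifs=MOTIFS):
    """Return sorted (0-based start, motif) pairs for every exact match.

    Accepts DNA or RNA letters (U is treated as T). A zero-width
    lookahead is used so that overlapping matches are also reported.
    """
    seq = sense_seq.upper().replace("U", "T")
    hits = []
    for motif in motifs:
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start(), motif))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the salm locus.
    print(find_motifs("ccTGCATGAttAGCAUGAgg"))
```

Exact matching of this kind reports candidate sites only; whether any site is bound in vivo is the question the response leaves to future CLIP experiments.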

      **MINOR COMMENTS**

      1. In several figures there is a misalignment of the transcriptional driver information with the phenotype data in the bar graphs above. Please correct the alignments to make interpretation easier.

      We have revised the layout of labels for many plots throughout the manuscript to avoid a category label associated with a genotype label at a 45-degree angle, and to make interpretation easier.

      On p. 14 Brudno et al. is cited as ref for Fox motifs near muscle exons, but this paper only focused on brain-specific exons.

      In addition to brain-specific exons, Brudno et al. also analyzed a set of muscle-specific exons, and thus this is the appropriate reference. For instance, from the Brudno paper, “As an additional control in some experiments we analyzed a smaller sample of muscle-specific alternative exons that were collected exactly as described above for the brain-specific exons” and “UGCAUG was also found at a high frequency downstream of a smaller group of muscle-specific exons.” Further details of the muscle-specific exon analysis can be found in (Brudno et al., 2001).

      For Mef2, why do exons described as 5'UTR have numbers 17, 20, and 21? One would normally expect these to be exon 1, 2 or 1A, 1B, etc.

      We rely on the Flybase annotation and numbering system to refer to exons. Per Flybase, all exons are labeled in the 5’ to 3’ direction of the sequenced genome, even for genes, such as Mef2 or wupA, that are encoded on the reverse strand. We strongly believe in following the numbering provided in the annotation, to increase reproducibility and transparency in working with complex gene loci for many different genes. Another researcher can go to Flybase, look-up the exon number from a given gene from a specific annotation, and get the exact location and sequence of the exons we name. It is incredibly challenging and time intensive to go through older papers and figure out which exon or splice event corresponds to those in the current annotation. We illustrate this in Figure 4 A for the wupA locus, where we verified exon numbers in annotation FB2021_05 by BLASTing each individual sequence and primer provided in (Barbas et al., 1993). The Mhc locus is even more complex, in particular regarding alternative 3’-UTR regions and historic versus current exon designations (Nikonova et al., 2020). For clarity and reproducibility, we therefore rely on the current Flybase designations.
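The numbering convention described in this response can be illustrated with a short Python sketch. It assumes, as the response states, that exons are numbered in ascending order along the sequenced genome regardless of gene strand; the coordinates and function names below are hypothetical and are not taken from the Mef2 or wupA annotations.

```python
def number_exons_by_genome(exons):
    """Number exons 1..n by ascending genomic start coordinate.

    `exons` is a list of (start, end) tuples on the reference genome.
    Returns a dict mapping each interval to its exon number.
    """
    return {interval: i for i, interval in enumerate(sorted(exons), start=1)}


def transcript_order(exon_numbers, strand):
    """List exon numbers in the order they occur in the transcript (5' to 3').

    For a reverse-strand gene ("-"), transcription runs from high to low
    genomic coordinates, so the highest-numbered exons come first.
    """
    ordered = sorted(
        exon_numbers.items(),
        key=lambda item: item[0][0],
        reverse=(strand == "-"),
    )
    return [number for _, number in ordered]


if __name__ == "__main__":
    # Hypothetical three-exon gene.
    numbers = number_exons_by_genome([(500, 600), (100, 200), (300, 400)])
    print(transcript_order(numbers, "+"))
    print(transcript_order(numbers, "-"))
```

Under this convention the 5'-most exons of a reverse-strand transcript carry the largest numbers, which is consistent with 5'-UTR exons being labelled 17, 20 and 21.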

      Fig 8: "regulation of regulators" seems to imply that Rbfox1 is impacting transcription?? Is there precedent for this type of regulation by Rbfox1?

      Yes, indeed, there is precedent for Rbfox1 impacting transcription, as we presented in the Discussion. Rbfox2 is reported to interact with the Polycomb repressive complex 2 to regulate gene transcription in mouse (Wei et al., 2016) and in flies Rbfox1 interacts with transcription factors including Cubitus interruptus and Suppressor of Hairless to regulate transcription downstream of Hedgehog and Notch signaling (Shukla et al., 2017; Usha and Shashidhara, 2010). In addition, Rbfox1 regulates splicing of Mef2A, and Rbfox1 and Rbfox2 cooperatively regulate splicing of Mef2D during C2C12 cell differentiation (Gao et al., 2016). Our results provide a further piece of evidence implicating Rbfox1 either directly or indirectly in transcriptional regulation as well as regulation of alternative splicing.

      Reviewer #3 (Significance (Required)):

      **SIGNIFICANCE**

      These studies of a major tissue-specific RNA binding protein, Rbfox1, are definitely important for our understanding of functional differences between muscle subtypes, and between muscle and nonmuscle tissues. The broad outlines of Rbfox1 alternative splicing regulation are known, but there is very little specific detail about the important targets in muscle subtypes that might help explain functional differences between subtypes. If more experimental validation can be obtained for regulation of transcript levels by binding 3'UTRs, this would also represent new information.

      We thank the reviewer for recognizing the significance of our work and our detailed analysis of Rbfox1 phenotypes in different muscle fiber-types. Experimental validation of 3’-UTR binding will be a significant time investment in terms of building and testing in-vivo reporter constructs, assaying NMD and translation effects and performing the CLIP studies necessary for identification of directly-bound 3’-UTR regions, extending beyond the scope of this manuscript and the time allotted for revision. The data we present here represent an important advance in our understanding of how Rbfox1 contributes to muscle-type specific differentiation, and form the basis for future experiments to explore the molecular and biochemical mechanisms underlying this regulation.

      I am reviewing based on my experience studying alternative splicing in vertebrate systems, with an emphasis on Rbfox genes. Therefore I am unable to evaluate the functional data on different subtypes of muscle in Drosophila.


      Reviewer Response References

      Barbas, J. A., Galceran, J., Torroja, L., Prado, A. and Ferrús, A. (1993). Abnormal muscle development in the heldup3 mutant of Drosophila melanogaster is caused by a splicing defect affecting selected troponin I isoforms. Mol Cell Biol 13, 1433–1439.

      Benoit Bouvrette, L. P., Bovaird, S., Blanchette, M. and Lécuyer, E. (2020). oRNAment: a database of putative RNA binding protein target sites in the transcriptomes of model species. Nucleic Acids Research 48, D166–D173.

      Brudno, M., Gelfand, M. S., Spengler, S., Zorn, M., Dubchak, I. and Conboy, J. G. (2001). Computational analysis of candidate intron regulatory elements for tissue-specific alternative pre-mRNA splicing. Nucleic Acids Res 29, 2338–2348.

      Damianov, A., Ying, Y., Lin, C.-H., Lee, J.-A., Tran, D., Vashisht, A. A., Bahrami-Samani, E., Xing, Y., Martin, K. C., Wohlschlegel, J. A., et al. (2016). Rbfox Proteins Regulate Splicing as Part of a Large Multiprotein Complex LASR. Cell 165, 606–619.

      Gao, C., Ren, S., Lee, J.-H., Qiu, J., Chapski, D. J., Rau, C. D., Zhou, Y., Abdellatif, M., Nakano, A., Vondriska, T. M., et al. (2016). RBFox1-mediated RNA splicing regulates cardiac hypertrophy and heart failure. J Clin Invest 126, 195–206.

      Nikonova, E., Kao, S.-Y. and Spletter, M. L. (2020). Contributions of alternative splicing to muscle type development and function. Semin. Cell Dev. Biol.

      Nongthomba, U., Cummins, M., Clark, S., Vigoreaux, J. O. and Sparrow, J. C. (2003). Suppression of muscle hypercontraction by mutations in the myosin heavy chain gene of Drosophila melanogaster. Genetics 164, 209–222.

      Schnorrer, F., Schönbauer, C., Langer, C. C. H., Dietzl, G., Novatchkova, M., Schernhuber, K., Fellner, M., Azaryan, A., Radolf, M., Stark, A., et al. (2010). Systematic genetic analysis of muscle morphogenesis and function in Drosophila. Nature 464, 287–291.

      Shukla, J. P., Deshpande, G. and Shashidhara, L. S. (2017). Ataxin 2-binding protein 1 is a context-specific positive regulator of Notch signaling during neurogenesis in Drosophila melanogaster. Development 144, 905–915.

      Usha, N. and Shashidhara, L. S. (2010). Interaction between Ataxin-2 Binding Protein 1 and Cubitus-interruptus during wing development in Drosophila. Dev Biol 341, 389–399.

      Wei, C., Xiao, R., Chen, L., Cui, H., Zhou, Y., Xue, Y., Hu, J., Zhou, B., Tsutsui, T., Qiu, J., et al. (2016). RBFox2 Binds Nascent RNA to Globally Regulate Polycomb Complex 2 Targeting in Mammalian Genomes. Mol Cell 62, 875–889.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      SUMMARY

      This manuscript characterizes the role of splicing factor Rbfox1 in Drosophila muscle and explores its ability to modulate expression of genes important for fibrillar and tubular muscle development. The authors hypothesize that Rbfox1 binds directly to 5'-UTR and 3'-UTR regions to regulate transcript levels, and to intronic regions to promote or inhibit alternative splicing events. Because some of the regulated genes encode transcriptional activators and other splicing factors such as Bru1, the effects of Rbfox1 may encompass a complex regulatory network that fine-tunes transcript levels and alternative splicing patterns that shape developing muscle. Most likely the authors' hypothesis is correct that Rbfox1 is critical for muscle development in Drosophila, but overall the interesting ideas presented here are too often based only on correlations without further experimental validation.

      MAJOR COMMENTS

      The hypothesis that Rbfox1 plays an important role in regulating muscle development is based on previous studies in other species and supported by much new data in this manuscript. Initial bioinformatic analysis showed that many Drosophila genes, including 20% of all RNA-binding proteins, 40% of transcription factors, etc. have the motifs in introns or UTR regions. However, I think a deeper analysis is required. Any hexamer might be present about once every 4kb, and we do not expect all UGCAUG motifs are necessarily functional, so one might ask whether the association of Rbfox motifs with muscle development genes is statistically significant? Are the motifs conserved in other Drosophila species, which might support a functional role in muscle? Are the intronic motifs located as expected for regulatory effects, that is, proximal to alternative exons that exhibit changes in splicing when Rbfox1 expression is decreased or increased? Is it possible to knock out an Rbfox motif and show that splicing of the alternative exon is altered, or regulation of transcript levels is abrogated?
      Also, what was the background set of genes used for the GO enrichment analysis? Genes expressed in muscle or all genes?
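The estimate of roughly one occurrence of a given hexamer every 4 kb follows from 4^6 = 4096, under the simplifying assumption of equal base frequencies. The Python sketch below works through that arithmetic and shows one possible way to ask whether motif-containing genes are over-represented in a gene set, using a one-sided hypergeometric test. The function names are illustrative assumptions, and the test is offered as an example of the kind of analysis requested, not as the analysis performed in the manuscript.

```python
from math import comb


def expected_kmer_hits(seq_len: int, k: int = 6) -> float:
    """Expected count of one specific k-mer on one strand of a random
    sequence with equal base frequencies: (L - k + 1) / 4**k."""
    return max(seq_len - k + 1, 0) / 4 ** k


def hypergeom_enrichment_p(n_background: int, n_with_motif: int,
                           n_selected: int, n_selected_with_motif: int) -> float:
    """One-sided hypergeometric P(X >= observed).

    n_background: genes in the background set
    n_with_motif: background genes that carry the motif
    n_selected: genes in the set of interest (e.g. muscle genes)
    n_selected_with_motif: genes of interest that carry the motif
    """
    total = comb(n_background, n_selected)
    upper = min(n_with_motif, n_selected)
    tail = sum(
        comb(n_with_motif, x) * comb(n_background - n_with_motif, n_selected - x)
        for x in range(n_selected_with_motif, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # A 100 kb gene is expected to contain about 24 copies of any given
    # hexamer on one strand by chance alone.
    print(round(expected_kmer_hits(100_000), 2))
    # Hypothetical counts, for illustration only.
    print(hypergeom_enrichment_p(10, 5, 5, 5))
```

The choice of background (all genes versus muscle-expressed genes) changes `n_background` and `n_with_motif`, which is why the question about the background set matters for interpreting any enrichment.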

      1. The data on cross regulation between Rbfox1 and Bru1 are confusing and inconsistent, since mild knockdown and stronger knockdown of Rbfox1 seem to have different effects on Bru1 expression. Both the Rbfox1 and Bru1 genes have many Rbfox motifs, but they are both large genes (>100kb) and would be expected to have many copies of all hexamers. How do we know whether any of them are functional? New data suggest that Rbfox1 can positively regulate Bru1 protein levels (Fig.5), but this seems inconsistent with the lab's earlier studies indicating opposite temporal mRNA expression profiles for Rbfox1 and Bru1 across IFM development.

      2. Figure S4, section I, J: if changes in Bru1-RB isoform expression are correlated with Rbfox1 knockdown, it seems reasonable to test whether the Bru1-RB promoter can drive expression of GFP in an Rbfox1-dependent manner. But if I understand correctly, the assay as described on p. 19 uses the promoter region upstream of Bru1-RA. What is the logic for this experiment? It is not surprising that no effect was observed. The end result is that we have no idea whether Rbfox1 directly regulates bru1-RB. Even if it does, bru-Rb appears to be a minor component of Bru expression in IFM.
      3. In the section "Rbfox1 and Bruno1 co-regulate alternative splice events in IFMs", the data show that splicing of several genes is altered by knockdown or over-expression of Rbfox1 and Bru1. The interesting conclusion is for a complex regulatory dynamic where Rbfox1 and Bru1 co-regulate some alternative splice events and independently regulate other events in a muscle-type specific manner. However, if we are to conclude that these activities are due to direct binding of Rbfox1 and Bru1 to the adjacent introns, we need information about the location of flanking Rbfox and/or Bru1 motifs. Do upstream or downstream binding sites correlate with enhancer or silencer activity, as reported in previous studies of these splicing factors in other species? For wupA, Figure S3 shows an intronic Rbfox site, but exon 4 is not labeled so the reader cannot correlate this information with the diagram in Figure 6U.
      4. The evidence that Rbfox1 directly affects expression of transcription factor Exd seems to be based only on a correlation between Rbfox1 knockdown and decreased expression of Exd. The observation that binding of Rbfox1 to the Exd 3'UTR was not detected in RIP experiments further weakens the case.
      5. Similarly, there is a correlation of Rbfox1 knockdown with expression of alternative 5'UTRs in the Mef2 gene. However, the changes in UTR expression appear mostly not statistically significant. Do the authors have a model to explain what mechanism might allow Rbfox to regulate expression of alternative 5'UTRs, which would seem to be a transcriptional process?
      6. For Salm, there apparently are no Rbfox motifs in the gene, and there are statistically significant but apparently inconsistent changes in Salm expression when it is knocked down in IFM by Rbfox1-RNAi (Salm increases) vs knockdown by Rbfox1-IR27286 or Rbfox1-IRKK110518 (Salm decreases). These are potentially interesting observations but more data would be needed to make stronger conclusions. How would regulation occur in the absence of Rbfox motifs?


      MINOR COMMENTS

      1. In several figures there is a misalignment of the transcriptional driver information with the phenotype data in the bar graphs above. Please correct the alignments to make interpretation easier.
      2. On p. 14 Brudno et al. is cited as ref for Fox motifs near muscle exons, but this paper only focused on brain-specific exons.
      3. For Mef2, why do exons described as 5'UTR have numbers 17, 20, and 21? One would normally expect these to be exon 1, 2 or 1A, 1B, etc.
      4. Fig 8: "regulation of regulators" seems to imply that Rbfox1 is impacting transcription?? Is there precedent for this type of regulation by Rbfox1?

      Significance

      SIGNIFICANCE

      These studies of a major tissue-specific RNA binding protein, Rbfox1, are definitely important for our understanding of functional differences between muscle subtypes, and between muscle and nonmuscle tissues. The broad outlines of Rbfox1 alternative splicing regulation are known, but there is very little specific detail about the important targets in muscle subtypes that might help explain functional differences between subtypes. If more experimental validation can be obtained for regulation of transcript levels by binding 3'UTRs, this would also represent new information.

      I am reviewing based on my experience studying alternative splicing in vertebrate systems, with an emphasis on Rbfox genes. Therefore I am unable to evaluate the functional data on different subtypes of muscle in Drosophila.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      This paper reports analysis of the function of RbFox1, an RNA-binding protein, best known for roles in the regulation of alternative splicing. It uses Drosophila as its in vivo model system, one that is highly suited to the analysis in vivo of complex biological events. In general, the authors present a very thorough approach with an impressive range of molecular analysis, genetic experiments and phenotypic assays.

      The authors report that Rbfox1 is expressed in all Drosophila muscle types, and regulated in both a temporal and muscle type specific manner. Using inhibitory RNA to knock down gene function, they show that Rbfox1 is required in muscle for both viability and pupal eclosion, and contributes to both muscle development and function. A Bioinformatic approach then identifies muscle genes with Rbfox1-binding motifs. They show Rbfox1 regulates expression of both muscle structural proteins and the splicing factor Bruno1, interestingly preferentially targeting the Bruno1-RB isoform. They report functional interaction between Rbfox1 and Bruno1 and that this is expression level-dependent. Lastly, they report that Rbfox1 regulates transcription factors that control muscle gene expression.

      They conclude that the effect on muscle function of RbFox1 knock down is through mis-regulation of fibre type specific gene and splice isoform expression. Moreover, "Rbfox1 functions in a fibre-type and level-dependent manner to modulate both fibrillar and tubular muscle development". They propose that it does this by "binding to 5'-UTR and 3'-UTR regions to regulate transcript levels and binding to intronic regions to promote or inhibit alternative splice events." They also suggest that Rbfox1 acts "also through hierarchical regulation of the fibre diversity pathway." They provide further evidence to the field that Rbfox1's role in muscle development is conserved.

      MAJOR COMMENTS

      Are key conclusions convincing?

      In terms of presentation, I suggest ensuring a clear demarcation throughout of the evidence behind the main conclusions. This can get somewhat lost as a great deal of information is presented, including all the parallels with prior findings in other systems. I am not saying this is a major problem, just highlighting the importance of clarity. Conclusions to clearly evidence include: Rbfox1 functions in a fibre-type manner to modulate both fibrillar and tubular muscle development (e.g. L664); Rbfox1 functions in a level-dependent manner (e.g. L664); Rbfox1 functions by binding to 5'-UTR and 3'-UTR regions to regulate transcript levels (e.g. L670); Rbfox1 functions by binding to intronic regions to promote or inhibit alternative splice events (e.g. L670); "Bru1 can regulate Rbfox1 levels in Drosophila muscle, and likely in a level-dependent manner" (L488) - Clearly evidence the level effect; "first evidence for negative regulation for fine tuning acquisition of muscle-type specific properties. Depending on its expression level, Rbfox1 can either promote or inhibit expression of" muscle regulators (L797).

      Lastly, the controlled stoichiometry of muscle structural proteins is known to be important, but all mechanisms are not known, so again make the supporting evidence as clear as possible for the interesting point of a role for Rbfox1 in this (e.g. L787).

      Should some claims be qualified as preliminary or removed?

      P301 "complicated genetic recombination" - seems a bit weak to include. Either do it or don't include? Also, see section below on "adequate replication of experiments"

      Are additional exps essential? (if so realistic in terms of time and cost)

      None essential in my view. It depends on the authors' goals, but to maximize the impact of the project, the following suggestions could be pursued.

      L369-372: mutate putative Rbfox1 binding site and ask does binding still occur or not. If it doesn't, then ask if this mutation affects the expression of the putative target gene.

      L775-777 "Our data thus support findings that Rbfox1 modulates transcription, but introduce a novel method of regulation, via regulating transcription factor transcript stability." It would be good to demonstrate this.

      Presented in such a way as to be reproduced

      Yes

      Are exps adequately replicated?

      A main area I would address is the authors' frequent use of "may", "tend", "trend". This confuses the picture they present. What is statistically significant and what is not? Only the former can be used as evidence.

      Examples include:

      L170: "may display preferential exon use" - does it or doesn't it?

      L272: "myofibrils tended to be thicker" - were they or weren't they?

      L350 "wupA mRNA levels tend towards upregulation in Rbfox1-RNAi".

      L353 "but tended towards upregulation (Fig. S4A)"

      L466 "Correspondingly, we see a trend towards increased protein-level expression of Bru1-PA"

      L474 "both Bru1-PA and Bru1-PB tend to increase"

      L485 "Overexpression of Bru1 in TDT with Act79B-Gal4 also tends to reduce Rbfox1"

      L595 "Rbfox1-IR27286 tended towards increased exd levels in IFM (Fig. 7A)"

      L614 "and a trend towards increased use of Mef2-Ex20 "

      Also, L487 "suggesting that Bru1 can also negatively regulate Rbfox1" - one cannot use a non-significant observation to suggest something.

      MINOR COMMENTS

      Specific exp issues that are easily addressable

      L162: "dip in Rbfox1 expression levels around 50h APF". The Fig indicates as early as 30h. Is this significantly less than the 24h data point?

      L427 "this staining was lost after Rbfox1 knockdown". This conflicts with Fig 5K which says no significant difference. Again in L429 "Rbfox1 knockdown leads to a reduction of Bru1 protein levels in IFMs and TDT." Fig says no significant difference in TDT.

      Are prior studies referenced appropriately?

      This m/s is an authoritative presentation of the field as a whole with a comprehensive, impressive reference list. However, a point related to this area is one of the main things I would consider tackling. This is to have more clarity in the demarcation of what this study has found that adds to prior knowledge. It is worthwhile in itself to demonstrate the many similarities with previous work in other systems, as part of establishing the Drosophila system with all its analytical advantages for in vivo molecular genetics as an excellent model for future study in this area of research. However, the impact/strength of this m/s would be enhanced by clarity in presenting what is new to the field in all organisms.

      Are the text and Figs clear and accurate?

      TEXT

      L156: more precise language than "in a pattern consistent with the myoblasts" - maybe a simple co-expression with a myoblast marker?

      L181: at first use define difference between RNAi and IR

      L205: maybe clearly explain the link between eclosion and tubular muscle??

      L231: "Sarcomeres were not significantly shorter at 90h APF with the stronger Mef2-Gal4" - not clear why this is the case when the less strong knockdown conditions have shorter sarcomeres.

      L234: "classic hypercontraction mutants in IFMs display a similar phenotype" - presumably not similar to the not significantly shorter sarcomeres of the previous sentence.

      L244: "90h", should be "90h APF"?

      L273: "Myofibrils in Act88F-Gal4 mediated knockdown only showed mild defects (Fig. 3 G, H, Fig. S2 C, D) despite adult flies being flight impaired". This seems worthy of discussion - the functional defect is not due to overt structure change?

      L281 "also known as Zebra bodies" - helpful to indicate these on the Fig, they are not.

      L282: "we were unable to attempt a rescue of these defects" - I may have missed something, but what about rescue undertaken of the defects on previous pages?

      L283: "Over-expression of Rbfox1 from 40h APF" - this is the first over-expression experiment, so introduce why done now (and perhaps not earlier), and also explain the use of a different Gal4 driver.

      L290 "Interestingly, both Rbfox1 knockdown and Rbfox1 over-expression produce similar hypercontraction defects" - this could be interesting, worthy of discussion/explanation.

      P305: Bioinformatic analysis. It is not clear what is taken as a potentially interesting result. On average a specific 5 base motif is found every 1000bps - so what is being looked for? How many sites in what length or position? A range of examples are described in the next pages of the m/s. For example: L337 "Bruno1.... contains 42 intronic and 2 5'-UTR Rbfox1 binding motifs" and L591 "exd contains three Rbfox1 binding sites,"
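A rough sketch of the background expectation behind this point, assuming equal and independent base frequencies (a simplification that real genomes do not satisfy): one specific 5-base motif is expected about once per 4^5 = 1024 positions on a single strand. The function name and the sequence lengths are hypothetical and chosen only for illustration.

```python
def expected_motif_count(seq_len: int, motif_len: int = 5) -> float:
    """Expected occurrences of one specific motif on one strand of a
    random sequence with equal, independent base frequencies (assumption)."""
    positions = max(seq_len - motif_len + 1, 0)  # number of possible start positions
    return positions / 4 ** motif_len


# Hypothetical transcript or intron lengths in bases, for illustration only.
for length in (1_000, 10_000, 100_000):
    print(length, round(expected_motif_count(length), 1))
```

Under this null model a long intron collects many motifs by chance alone, which is why a criterion based on number, density, or position of sites would make the analysis easier to interpret.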

      L315: "many of these genes have binding or catalytic activity". "catalytic activity" seems very vague.

      L317 "When we look in previously annotated gene lists" - be more specific. What are they?

      L327 "may also affect the neuro-muscular junction" - maybe better left for the Discussion?

      L333 "extradenticle (exd) and Myocyte enhancer factor 2 (Mef2) contain 3 and 7 Rbfox1 motifs," Discuss the number and position of multiple motifs found in known targets?

      L350 "wupA mRNA levels " - clearer to stick to using TroponinI or WupA?

      L376 "To check whether Rbfox1 regulates some target mRNAs such as wupA....." The suggestion here is more of a further indication than a "check".

      L544 "In IFMs, knockdown of Rbfox1 and loss of Bru1 results in...." clarify if this is the two genes separately or the two genes together?

      L580 "Our bioinformatic analysis identified Rbfox1 binding motifs in more than 40% of transcription factors genes" - is this all TFs or just "muscle" TF genes?

      L598, what would be the mechanism of some decrease in Rbfox1 increasing mRNA levels and more of a decrease resulting in a decrease of the mRNA? The authors say "the nature of this regulation requires further investigation".

      L609 "The short 5'-UTR encoded by Mef2-Ex17". Ensure all abbreviations are defined. What does "Ex" mean here? Not straightforward to relate to the diagram in the Supplemental material that indicates the Mef2 gene has many fewer than 17 exons. In Fig7 legend too.

      L617 "Levels of Mef2 are known to affect muscle morphogenesis but not production of different isoforms" - clarify what is meant here by "different isoforms".

      L638 "Salm levels were significantly increased in IFM from Rbfox1-RNAi animals, but significantly decreased in IFMs from flies with Dcr2 enhanced Rbfox1-IR27286 or Rbfox1-IRKK110518". This is worth discussion or further analysis. Normally would expect an allelic series, with an effect becoming more apparent with increased loss-of-function.

      L641 "This suggests that Rbfox1 can regulated Salm". How, if there are no Rbfox1 binding sites? Deserves further analysis?

      L674: "We found the valence of several regulatory interactions..." I'm not sure the meaning of "valence" here and elsewhere will be readily understood.

      FIGURES

      Fig 1 it is difficult to see the green in A-F. Can this be improved? It is clearer in I-L.

      Fig 2 legend (others too), say what the clusters of small black ellipses in P and Q are.

      Fig 3 it is not easy to see a shorter sarcomere in D, as the arrow partially obscures what is being indicated. Also, the data in G indicates that sarcomeres are not shorter in Mef2 GAL4 > KK110518, although the legend says this is shown in D.

      Fig 4 A - Western blot. Looks over-exposed. Is this in a linear response region?

      Fig 5 legend "-J). Bru1 signal is reduced with Rbfox1-IRKK110518 (C, F, I)". Clarify that this is only in IFM. It is not significant in TDT or Abd-M.

      Fig 7 legend "quantification of the fold change in exd transcript levels" - only KK110518 in IFM is significant. C - "indicates Rbfox1 binds to Mef2 mRNA" - it is not easy to see the band. D - what do the different lanes on the gel below the histogram in D correspond to?

      Suggestions that would help the presentation of their data and conclusion

      There is a lot of good, thorough work here, but overall there is the impression that some of the presentation/writing could be improved (also see the above lists on clarity and accuracy). I admire the authors for their comprehensive presentation of what has already been found out in this field. As the authors summarise, a lot is already known in many other species, so (as also indicated above) it is crucial to emphasise what new is found in this work that advances overall knowledge in this field. This can be obscured in many places where they say because of what was found in vertebrate systems we looked in Drosophila. These include:

      L417: "This led us to investigate if Rbfox1 might regulate Bru1 in Drosophila."

      L452: "and we were curious if these interactions are evolutionarily conserved in flies."

      L528 "Thus, we next checked if Rbfox1 and Bru1 co-regulate alternative splicing in Drosophila muscle."

      L677 "Moreover, as in vertebrates, Rbfox1 and Bru1 exhibit cross-regulatory interactions"

      L683 "Rbfox1 function in muscle development is evolutionarily conserved"

      L697 "Here we extend those findings and show that as in vertebrates......"

      L702 "our observations are consistent with observations in vertebrates"

      L707 "Studies from both vertebrates and C. elegans suggest that Rbfox1 modulates developmental isoform switches."

      L746 "We see evidence for similar regulatory interactions between Rbfox1 and the CELF1/2 homolog Bru1 in our data from Drosophila."

      L185 paragraph. The knockdown series is important for the study. A lot is presented in this paragraph, especially for a non-specialist and it could be easier to follow. Perhaps present the four genetic conditions in the order of the severity of their phenotype on viability. Also, clearly state what each Gal4 driver is used for. What is the nature of the RNAi/IR lines such that Dcr2 could enhance their action? Also comment on off targets - are any predicted?

      L227: "In severe examples". Be as clear as possible. Are the "severe examples" using the stronger RNAi line or are they the most severe examples with a single line? I'd suggest including the result in the main Fig rather than in the Supplemental. However, as I read more of the m/s I realise there is a great deal of important information in the Supplemental Figs, and so the case is not much stronger for this example than many others. The balance of what is included where could be looked at, because it is not straightforward for the reader to read the paper and quickly flick between the main and supplemental Figs. Later in the m/s is a substantial section that starts L450 (finishes L489) and which only refers to Supplemental Figs. L503 is another area where it is necessary, and difficult, for the reader to move between main Figs and supplemental Figs.

      L258: - perhaps a Table summarising this and other phenotype trends with the different RNA conditions might be helpful. It gets quite difficult to follow.

      Significance

      The advance reported is mechanistic.

      The authors already do a very good job of placing their work in the context of prior research (see comment in Section A).

      Muscle biologists interested in its development and function will be interested in this work. More broadly, those intrigued by alternative splicing will be interested. Despite its very widespread occurrence, much about alternative splicing is still poorly understood in terms of regulation and significance. This is especially the case in vivo, and this paper uses an excellent in vivo model system (Drosophila) for the genetic and mechanistic analysis of complex biological problems. My field of expertise: cell differentiation, gene expression, muscle development, Drosophila.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Rbfox proteins regulate skeletal muscle splicing and function, and in this manuscript Nikonova et al. sought to investigate the mechanisms by which Rbfox1 promotes muscle function in Drosophila.

      Using a GFP-tagged Rbfox1 line, the authors showed that Rbfox1 is expressed in all muscles examined but differentially expressed in tubular and fibrillar (IFM) muscle types, and that expression is developmentally regulated. Based on RNA-seq data from isolated muscle groups, the authors showed that Rbfox1 expression is much higher in TDT (jump muscle) than IFM.

      Using fly genetics, the authors developed tools to reduce expression of Rbfox1 to different levels; the highest levels of muscle-specific Rbfox1 knockdown were lethal and produced eclosion defects (deGradFP > Rbfox1-IRKK110518 > Rbfox1-RNAi > Rbfox1-IR27286). Consistently, Rbfox1 knockdown flies have reduced jumping and climbing ability, due to a defect in tubular muscle, where Rbfox1 is expressed at higher levels. Rbfox1 knockdown in IFM caused flight defects, which have been shown previously. Further characterization of IFM and tubular muscles demonstrated a requirement of Rbfox1 for the development of myofibrillar structures in both fibrillar (IFM) and tubular fiber-types in Drosophila. Interestingly, knockdown or overexpression of Rbfox1 produced hypercontraction phenotypes in IFMs, which are often an end result of misregulation of acto-myosin interactions. In the context of Rbfox1 knockdown, this phenotype was rescued by expression of force-reduction myosin heavy chain (Mhc, P401S); the rescue experiment could not be performed with Rbfox1 overexpression due to complex genetics.

      The authors also performed computational analyses of the Rbfox binding motifs in the fly genome and identified the GCAUG motif in 3,312, 683, and 1,184 genes in intronic, 5'UTR, and 3'UTR regions, respectively. These genes are enriched for factors that play important roles in muscle function including transcription factors (exd, Mef2, Salm), RNA-binding proteins (Bru1), and structural proteins (TnI, encoded by wupA). Many of these gene transcripts and proteins are affected in flies with reduction or overexpression of Rbfox1. Using fly genetics, the authors propose and test different mechanisms (co-regulation of gene targets by Rbfox1 and Bru1), and regulators of muscle function (exd, Mef2, Salm) and structural proteins (TnI, Mhc, Zasp52, Strn-Mlck, Sls) by which these changes could affect muscle function.

      Overall, the characterization of Rbfox1 phenotypes and myofibrillar structure is very well elucidated, but the mechanisms by which Rbfox1 affects muscle function are not clear and remain largely speculative.

      Major comments

      1. The varying level of Rbfox1 knockdown (deGradFP > Rbfox1-IRKK110518 > Rbfox1-RNAi > Rbfox1-IR27286) was achieved by different strategies without validation at the protein level (likely due to lack of a Rbfox1 antibody). It is important to show the different Rbfox1 protein levels (at least with the different RNAi lines), especially when the authors propose that autoregulation of Rbfox1 causes an increased level of Rbfox1 transcript in the case of Rbfox1-RNAi (mild knockdown). Autoregulation of Rbfox1 in mammalian cells may not be similar in flies.
      2. TnI and Act88F protein levels are inversely correlated with Rbfox1 level in IFM but did not correlate with the RNA level. Using RIP, the authors showed that Rbfox1 bound to wupA transcripts (which have Rbfox binding sites) but not Act88F transcripts (which do not have Rbfox binding sites). The authors performed Rbfox1 IP, identified co-IP of components of the cellular translational machinery, and propose that wupA (TnI) levels are regulated by translation or NMD (nonsense-mediated decay). A follow-up experiment was not performed to identify the mechanism by which TnI level is regulated by Rbfox1.
      3. It was known that TnI mutations (affects splice site, fliH or Mef2 binding site, Hdp-3) led to a reduction in TnI level and hypercontraction. The authors showed rescue of the hypercontraction phenotype in the hdp-3 background by knocking down Rbfox1, likely due to an increase in wupA transcription (in a Mef2-dependent or independent manner). However, no rescue was observed in the fliH background. A reduced level of Rbfox1 in the fliH background would be expected to worsen the phenotype, as splicing of the remaining wupA transcripts would be affected by the reduced Rbfox1 level. Splicing of wupA exon 4 is not affected in Rbfox1 knockdown (Fig. 6U), but it is not clear whether splicing of exon 6b1 is affected in Rbfox1 knockdown.
      4. Bruno1 was identified as a co-regulator of Rbfox1 in different IFM and tubular muscle types. However, except for Mhc, other Rbfox1 targets seem to be regulated by either Rbfox1 or Bruno1, not both. Analyses of RNA-seq datasets from single and double knockouts should identify additional targets to support the claim that Rbfox1 and Bruno1 co-regulate alternative splice events in IFMs. Phenotypic changes with reduced Rbfox1 and Bruno1 double knockdowns are very severe, but the mechanistic basis of such a genetic interaction resulting in synergistic phenotypes in IFMs is lacking, as splicing changes in single vs double knockouts are similar.
      5. Rbfox1 is expressed at a high level in tubular muscle whereas Bruno1 is expressed at a high level in IFM. Rbfox1 binds to Bruno1 transcript and inversely regulates Bru1-RB level but knockdown of Bru1 does not affect Rbfox1 level (Fig. S5 G,I,J). Overexpression of Bruno1 decreased the Rbfox1 level, however, it's difficult to interpret these results as overexpression of Bruno1 may have other effects on IFM gene expression.
      6. A dose-dependent effect of Rbfox1 knockdown was shown to regulate the expression of transcription factors that are important for muscle type specification and function including exd, Mef2, and Salm. However, it is not clear how Rbfox1 mechanistically regulates the expression of these transcription factors.

      Minor comments

      1. It is not described if the rescue of Rbfox1 knockout by expression of force-reduction myosin heavy chain (Mhc, P401S) led to rescue of phenotypes (jumping, climbing, flight).
      2. Immunofluorescence (IF) and Western blotting are different techniques, and Bruno1 antibody was validated for specificity in IF but not in Western blots. Figure 5L and S5 E should include muscle samples from Bru1M2.
      3. To quantify alternative splicing or percent spliced in (PSI), primers are typically designed in the exons flanking the alternative exons. A better primer design along with PSI calculation by RT-PCR will robustly validate alternative splicing changes in different genetic background (Fig 6U and S6 U).
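The PSI calculation referred to in minor comment 3 can be sketched as follows, assuming the inclusion and exclusion products are quantified from the same flanking-exon RT-PCR reaction. The function name and the band intensities are hypothetical and are not data from the manuscript.

```python
def percent_spliced_in(inclusion: float, exclusion: float) -> float:
    """PSI = inclusion / (inclusion + exclusion), expressed as a percentage."""
    total = inclusion + exclusion
    if total <= 0:
        raise ValueError("no signal for either isoform")
    return 100.0 * inclusion / total


# Hypothetical band intensities (arbitrary units), for illustration only.
control_psi = percent_spliced_in(inclusion=750.0, exclusion=250.0)
knockdown_psi = percent_spliced_in(inclusion=400.0, exclusion=600.0)
delta_psi = knockdown_psi - control_psi  # negative means less exon inclusion
```

With gel-based quantification, band intensities may also need correcting for product length before the ratio is taken; that step is omitted here.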

      Significance

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Understanding how muscle fiber type splicing and gene expression is regulated will conceptually move the field forward. An understanding of how transcriptional and posttranscriptional programs coordinate to specify muscle fiber type gene expression is still lacking.

      Place the work in the context of the existing literature (provide references, where appropriate).

      Multiple RNA binding proteins and splicing factors have been shown to affect muscle function along with hundreds of gene expression and splicing changes in a complex fashion. Linking phenotypes with gene expression changes is still challenging as RNA binding proteins or RBPs are multifunctional and affect the function of other regulators that are important for muscle biology.

      State what audience might be interested in and influenced by the reported findings.

      Fly genetics, alternative splicing regulation, muscle specification and function.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Regulation and function of alternative splicing in muscle. I do not have a thorough knowledge of Drosophila genetics.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      *The manuscript by Wibisana et al. describes an impressive set of experiments that analyse the NFkB response at the single-cell level, using a variety of cutting-edge techniques (live cell imaging, single-cell RNA-seq, single-molecule RNA FISH, and single-cell ATAC-seq) in chicken DT40 B-cells.

      In the first half of the paper, the authors perform a detailed characterization of the cell-to-cell variation arising from a homogeneous stimulation with various doses of anti-IgM. They observe that the NFKB TF RelA forms clear nuclear 'foci' upon stimulation in DT40 cells: this was anecdotally shown in a different cell-type by the same authors in ref 7, but (to my knowledge) has never been systematically studied. This allows them to quantitatively analyse the foci formed in response to stimulation, and they show that this is dose-dependent, heterogeneous and bimodal, and exhibits properties of cooperativity. In parallel, the authors analyse the resulting stimulus-driven changes in gene expression, first using single-cell RNA-seq, and then, elegantly, using RNA FISH, which allows them to directly compare the number of RelA foci to gene expression in individual cells. Like the RelA foci, they find that cell-to-cell gene expression is heterogeneous and bimodal (this has been described before). Interestingly, though, they are able to show that individual stimulus-responsive genes exhibit distinct patterns of cell-to-cell heterogeneity: they can categorize 4 clusters of responding genes according to different patterns of cell-to-cell variation at distinct stimulus doses, and moreover they show that while the heterogeneity of NFKBIA arises due to bimodal expression levels, that of CD83 is simply due to broad variation between cells. Although focused on NFkB, there is a lot of information here with some important (and non-intuitive) implications that could apply to many other stimulus-driven or developmental responses that exhibit heterogeneous patterns of gene expression. A more in-depth analysis of the single-cell datasets would certainly be very worthwhile and fruitful.

      In the second half of the paper, the authors attempt to use their single-cell data, alongside ATAC-seq genomic analyses, to draw inferences about how or whether the model genes NFKBIA and CD83 are regulated by super-enhancers (SEs). Both of these genes are associated with SEs that gain accessibility upon stimulation (recapitulating the authors' findings in ref 8 in a different cell-type), and the CD83 promoter exhibits co-accessibility with two regions within an adjacent SE. The authors also show that both genes are sensitive to treatment with 1,6-HD, a compound that disrupts liquid-like condensates (a characteristic that has been reported for SEs), and CD83 is sensitive to an inhibitor of Brd4 (which has been associated to SE function). However, while these findings could be considered to be suggestive of regulation by SEs, they are clearly not definitive (nor do the authors claim so).

      Finally, the authors show (figure 4a-c) that while the level of stimulus-driven gene upregulation correlates with co-accessibility with both SEs and typical enhancers (TEs), the cell-to-cell heterogeneity of gene expression correlates only with co-accessibility with SEs. This would agree with a model in which SE-regulated gene regulation may generally impart heterogeneous or switch-like gene expression. *

      **Specific comments**

      • The experiments are adequately presented, and the authors indicate that not only the sequencing data but also the analysis code is available. Nevertheless, the methods section is rather terse, and could benefit from more detail to understand the various analyses, particularly concerning the analyses of SEs in figures 3 and S7, where it is often difficult to understand how peaks or genes are categorized.

      Response: We thank the Reviewer for pointing this out and we agree that the Methods section was not described in detail, particularly in how the SEs were analyzed and categorized. Therefore, we have added more details on how SEs were categorized in the Methods section as follows:

      “ Peak calling and enhancer identification from ATAC-seq data were performed using Homer v4.10.4 (http://homer.ucsd.edu/homer/) using the bam files generated from the Cell Ranger pipeline. Tag directories were created for the bam file from each condition using the “makeTagDirectory” program with the “--sspe -single -tbp 1” option. Peak calling was performed using the “findPeaks” program with the “-style super -typical -minDist 5000 -L 0 -fdr 0.0001” option. This procedure stitches peaks within 5 kb and ranks regions by their total normalized number reads and classifies TE and SE by a slope threshold of 1. Peak annotation was subsequently performed using the “annotatePeaks.pl” program with the GRCg6a.96 annotation file. The consequent peak files were merged between each stimulation condition for the SE and TE peaks separately using the “mergeBed” program of bedtools. Peak annotation was performed for the second time for the merged peaks to create the final SE and TE peaks. ATAC fold-change was then calculated between both conditions for the merged peaks separately for SE and TE. Genes associated with both SE and TE were assigned only to the SE.”

      Similarly, we have added more details for other analyses in the Methods section and the main text.
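The stitching and slope-based classification described in the quoted Methods text can be illustrated with a simplified sketch. This is not HOMER's implementation; it is one plausible reading of "stitch peaks within 5 kb, rank by total normalized reads, classify by a slope threshold of 1", with both axes rescaled to the range 0 to 1. All function names and the toy peak values are hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[int, int, float]  # (start, end, normalized read count) on one chromosome


def stitch_peaks(peaks: List[Peak], max_gap: int = 5000) -> List[Peak]:
    """Merge peaks separated by at most max_gap bp, summing their signal."""
    stitched: List[Peak] = []
    for start, end, signal in sorted(peaks):
        if stitched and start - stitched[-1][1] <= max_gap:
            prev_start, prev_end, prev_signal = stitched[-1]
            stitched[-1] = (prev_start, max(prev_end, end), prev_signal + signal)
        else:
            stitched.append((start, end, signal))
    return stitched


def classify_by_slope(regions: List[Peak], slope_threshold: float = 1.0) -> List[Tuple[Peak, str]]:
    """Rank regions by signal and label as 'SE' every region from the first
    point where the slope of the rescaled rank-vs-signal curve exceeds the
    threshold; all lower-ranked regions are labelled 'TE'."""
    ranked = sorted(regions, key=lambda region: region[2])
    n = len(ranked)
    if n < 2 or ranked[-1][2] <= 0:
        return [(region, "TE") for region in ranked]
    top = ranked[-1][2]
    scaled = [region[2] / top for region in ranked]  # signal rescaled to 0..1
    cutoff = n  # index of the first SE; n means no region passes
    for i in range(1, n):
        slope = (scaled[i] - scaled[i - 1]) * (n - 1)  # rank step is 1 / (n - 1)
        if slope > slope_threshold:
            cutoff = i
            break
    return [(region, "SE" if i >= cutoff else "TE") for i, region in enumerate(ranked)]


# Toy input: two nearby peaks that get stitched, plus one distant peak.
toy_stitched = stitch_peaks([(0, 100, 1.0), (3000, 3100, 2.0), (20000, 20100, 1.0)])
```

The sketch handles a single chromosome and a single condition; merging regions across stimulation conditions and assigning genes, as the quoted text describes, are separate steps not shown here.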

      • The imaging, scRNA-seq and RNA-FISH experiments are well-presented, although the supplementary figures 4 and 5 include key results that would merit inclusion within the main figures.

      Response: We thank the Reviewer for this comment. We have included supplementary figures 4b and 5d in the main figures (new Fig. 2g) since both of these figures represent the raw data revealing the differences between smFISH counts and RNA-seq derived gene expression.

      • It is striking that although all the conclusions about SEs are drawn almost exclusively from analysis of ATAC-seq data, no raw ATAC-seq data is directly shown in any figure (even in the browser snapshots of figure 4d & e). It would be important to show the actual ATAC data from which the inferences of figures 3 and 4 are drawn, especially so that it is possible to visualize the implication of a particular 'ATAC fold-change' or of 'ATAC-gained enhancers'.

      Response: We have added a browser snapshot of the ATAC-seq data, presenting the super-enhancer region assigned to both CD83 and NFKBIA (new Fig. 3c).

      Reviewer #1 (Significance (Required)):

      • This manuscript can be considered as a follow-up of the authors' previous paper (Michida 2020, ref 8), here focusing on cell-to-cell heterogeneity rather than on the overall magnitude of the stimulus-induced response. Overall, the experiments are well-performed and bring new data to an interesting angle of gene regulation. However, the analyses presented do not seem to fully exploit the data, and the authors do not manage to present any strong conclusions, particularly relating to the possible involvement of super enhancers.

      Response: To strengthen our conclusions about the possible involvement of super-enhancers in regulating heterogeneity, we performed additional analyses on the properties of the SE including the number of transcription factors, NF-κB and PU.1 binding motifs and the length of the enhancers, according to a previous report (Michida et al., 2020, Cell Rep). This was also conducted to confirm whether the ATAC-seq-based SE identification method presents results consistent with those provided by H3K27Ac-ChIP-based methods utilized in the previous study (Michida et al., 2020, Cell Rep). SEs revealed longer genomic length (new Supplementary Fig. 8a) and this length was positively correlated with the ATAC signal (new Supplementary Fig. 8b). Furthermore, gained and lost SE revealed a correlation with enhanced gene expression upregulation and downregulation, respectively, compared to TE (new Fig. 3g). We also demonstrated that SE-regulated genes have a higher Fano factor change, which is consistent with the state of an SE whether it is gained or lost (new Fig. 5a, 5b). For binding motif analysis, we observed a slightly higher PU.1 motif density at SEs (new Supplementary Fig. 11), corresponding to the results of the previous study (Michida et al., 2020, Cell Rep). Interestingly, only the density of NF-κB and not PU.1 was correlated with ATAC signal change in SE (new Fig. 4a), suggesting that those SEs were controlled by nuclear translocation of NF-κB.
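For readers unfamiliar with the Fano factor used in this response, a minimal sketch: it is the variance of per-cell expression divided by its mean. The choice of population variance, the definition of "change" as a simple difference between conditions, and the toy counts are all assumptions made for illustration; the rebuttal does not specify them.

```python
from statistics import mean, pvariance
from typing import Sequence


def fano_factor(counts: Sequence[float]) -> float:
    """Variance-to-mean ratio of per-cell expression values for one gene."""
    mu = mean(counts)
    if mu == 0:
        raise ValueError("Fano factor is undefined when the mean is zero")
    return pvariance(counts) / mu


# Hypothetical per-cell counts for one gene, not data from the study.
unstimulated = [1.0, 2.0, 3.0, 2.0]
stimulated = [0.0, 0.0, 8.0, 8.0]
fano_change = fano_factor(stimulated) - fano_factor(unstimulated)
```

A Poisson-distributed gene has a Fano factor of 1, so values well above 1 indicate more cell-to-cell variability than simple counting noise would produce.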

      As a mechanism to produce gene expression heterogeneity in phenotypically identical cells, we observed that co-accessibility, which has been reported to be concordant with genomic contacts, is correlated with Fano factor change, indicating that gene expression heterogeneity possibly stems from cis-regulatory interactions. NF-κB activation has been reported to increase the heterogeneity in some genes, and this is attributed to the accumulation of Ser5p RNAPII (Wong et al., 2018, Cell Rep). Additionally, Ser5p RNAPII has been reported to accumulate at enhancer regions (Koch et al., 2011, Nat Struct Mol Biol), and the accumulation of RNAPII is suggested to assist in gene expression activation through enhancer-promoter contact (Thomas et al., 2021, Mol Cell). Our results support these conclusions since co-accessibility or putative cis-regulatory interactions correlate with Fano factor changes. SEs can form phase-separated transcription hubs containing multiple enhancers and/or promoters, which may enable a higher diffusion rate of active enhancers and therefore a higher probability of genomic DNA interactions (Gu et al., 2018, Science). In contrast, the enrichment of the TATA motif has also been proposed to generate transcriptional heterogeneity (Faure et al., 2017, Cell Syst). Therefore, we examined this possibility with our data. However, we observed a higher occurrence of the TATA box in genes associated with lost SEs (new Supplementary Fig. 18), which might have caused gene expression heterogeneity in unstimulated cells. This heterogeneity might be due to differences in Pol II loading intervals (Tunnacliffe & Chubb, 2020, Trends Genet); however, the noise associated with gained SEs is possibly generated by the fluctuation of high-order biomolecular assembly. Therefore, we believe that the sources of heterogeneity in these conditions were different.

      Additionally, we performed Hill function analysis to reveal the threshold behavior of gene expression in our analysis, since gained SEs were previously associated with threshold gene expression (Michida et al., 2020, Cell Rep). In this study, we showed that threshold behavior in gained SEs is related to the motif density of NF-κB (Fig. 4d); however, threshold behavior does not seem to be related to heterogeneous gene expression.
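As a brief illustration of why a Hill coefficient captures threshold behavior, the sketch below evaluates the standard Hill function at doses just below and just above the half-maximal dose. Parameter names and values are hypothetical, and the fitting procedure used in the study is not reproduced here.

```python
def hill(dose: float, vmax: float, k: float, n: float) -> float:
    """Standard Hill function: vmax * dose^n / (k^n + dose^n)."""
    return vmax * dose ** n / (k ** n + dose ** n)


# Compare a graded response (n = 1) with a switch-like one (n = 4),
# using a hypothetical half-maximal dose k = 1 and vmax = 1.
for n in (1.0, 4.0):
    below = hill(0.5, vmax=1.0, k=1.0, n=n)
    above = hill(2.0, vmax=1.0, k=1.0, n=n)
    print(f"n={n}: response at 0.5k = {below:.3f}, at 2k = {above:.3f}")
```

With a larger n, the response stays near zero below k and jumps close to vmax above it, which is the switch-like behavior attributed to cooperative NF-κB binding in this response.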

      Following these results, we concluded that NF-κB activated SE has two closely related but distinct functions for gene control: (1) enhanced heterogeneity and fold-changes and (2) switch-like expression. These are controlled by different mechanisms stemming from chromatin status: (1) frequency of cis-regulatory genomic interactions possibly mediated by phase separation and (2) cooperative binding of NF-κB to DNA. These differences were well represented by expression profiles of CD83 (higher heterogeneity and weak bimodal expression) and NFKBIA (lower heterogeneity and strong bimodal expression).

      • For instance, the existence of multiple gene clusters that exhibit distinct patterns of heterogeneity implies that switch-like gene activation occurs on a per-gene basis, rather than corresponding to an all-or-nothing activation of individual cells. This would be an exciting finding, and the authors have the data to test this. Likewise, the division of heterogeneous gene expression into bimodal (like NFKBIA) or unimodal (like CD83) distributions could be a nice paradigm if systematically applied to the other 1335 differentially-expressed genes identified by the authors.

      Response: We appreciate this comment. Following your comment, we analyzed the relationship between heterogeneity and bimodality (switch-like expression or high Hill coefficient) for the remaining genes. We observed that SEs with a high Hill coefficient contained a higher number of NF-κB motifs (new Fig. 4), indicating that cooperative binding of NF-κB to DNA shaped non-linear gene expression profiles, as we indicated in a previous paper (Michida et al., 2020, Cell Rep). Additionally, as described in the earlier section, we observed that heterogeneity arises from cis-regulatory genomic interactions. We compared these gene groups and observed that these properties were not completely shared (new Supplementary Fig. 15), indicating that bimodality and heterogeneity originate from different mechanisms. We assume that those differences are mediated through a combination of chromatin accessibility and the biophysical properties of NF-κB.

      • In contrast, although the authors try to use their data to investigate gene regulation by SEs, these inferences are all somewhat indirect, and the authors themselves do not manage to draw any definitive conclusions. Response: We appreciate this comment. We performed the additional computational analysis and carefully interpreted the data. Additionally, we have now concluded that SEs have two major biological functions: (1) gene expression heterogeneity, which is mediated via cis-regulatory interactions (Fig. 5) and (2) bimodal gene expression, which is mediated by NF-κB binding (new Fig. 4). The latter finding has also been reported in mouse primary B cells, whereas the mechanism causing heterogeneity is a novel conclusion of this study.

      • I feel that the authors are under-selling their data here. As-is, the data represents more of a resource than a study with a clear message, but I believe that with more in-depth analysis the authors could make a much more significant advance, particularly concerning the cell-to-cell heterogeneity of gene expression. I would be very enthusiastic to review the same data again with a more detailed analysis, which I believe would enormously improve the manuscript. Response: We appreciate this comment. As described in this report and the revised manuscript, we performed a considerably detailed computational analysis and gained several novel insights to answer the question regarding the functional roles of SE. We are grateful to learn that gene expression patterns may be estimated from ATAC-seq profiles and that they may even be controlled. We hope that the Reviewer will recognize the scientific value of our study and provide valuable feedback on our revised manuscript.

      Reviewer #2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      "Imaging and single cell sequencing analyses of super-enhancer activation mediated by NF-κB in B cells" by Wibisana et al. examined the relationship between super-enhancers, NF-κB nuclear aggregation, and target gene regulation. The authors have generated a large amount of data from fluorescent microscopy, scRNA-seq, scATAC-seq, smRNA FISH. While this is an impressive dataset in terms of diverse technically advanced methods employed, it is not clear what to take as a main conceptual advance. What could be the functional implications of observed cell-cell variability in B cell transcriptional responses to environmental stimuli? In addition to this general point, the following are specific comments that could improve the manuscript.

      1. In Figure 2, smRNA FISH foci of CD83 and NFKBIA are quantified as # of spots per cell (Supplementary figure 5). But it is difficult to see in Figure 2 the colocalization of any mRNA spots with RelA foci. Ideally, it will be convincing to show by DNA FISH that these target loci are indeed located within NF-κB occupied super-enhancer puncta. Even with the current RNA FISH data, some colocalization analysis could have been performed. Response: In Figure 2, we were unable to perform accurate colocalization analysis with the current smFISH data as the probes used by us map to exons. Moreover, we have also previously performed DNA-FISH; nevertheless, it was difficult to assess co-localization between the DNA and RelA proteins secondary to the degradation of RelA-GFP proteins. Therefore, we decided to perform intronic smRNA-FISH, which can be used to pinpoint the site of active transcription (Levesque and Raj, 2013, Nat Methods). The results, along with the quantification results, are presented in the new Fig. 2f.
      • Supplementary Figure 5a shows lower correlations of # GFP-RelA foci to CD83 transcripts in comparison to NFKBIA. Even though the foci and smRNA FISH spots are derived from high resolution imaging data, we should remember that any snapshot measurements have limited information content for gene regulatory relationships. Live cell studies (for example, from the groups of Suzanne Gaudet, Kathryn Miller-Jensen, and Myong-Hee Sung) have shown that time-integrated measures (e.g. maximum fold change and area under the curve of RelA signaling time course in single cells) are better correlates to transcriptional output of target genes (Lee REC et al 2014 Mol Cell; Wong VC et al 2019 Biophysical J; Sung MH et al. 2014 Science Signal; Martin EW et al. 2020 Science Signal).

      Response: We thank the Reviewer for this valuable comment. One of the reasons for a lower correlation between GFP-RelA foci and CD83 transcripts compared to NFKBIA may be the difference in expression timing of CD83 and NFKBIA and the timing of nuclear localization of GFP-RelA. RelA localizes in the nucleus 10−30 min after cell stimulation, and NFKBIA is an early responsive gene, whereas CD83 is expressed later (new Supplementary Fig. 17). Therefore, this time difference possibly affects correlation accuracy. Although we agree that high-throughput time-course measurement of RelA-GFP combined with smFISH measurements, such as that reported in Wong VC et al., 2019, would be ideal, it is technically difficult since DT40 are suspension cells and the smFISH protocol requires multiple washing and centrifugation steps. Thus, with this experimental setup, we were unable to perform the time-course analysis.

      Nonetheless, we measured the time course of foci formation in the same single cells (new Supplementary Fig. 1b) and observed that it effectively represents Figure 1a, which is a snapshot of the population dynamics of RelA foci across time. Additionally, the observed dynamics, which revealed a steep initial increase and slight decrease with time, effectively recapitulate the previous reports (Lee et al., 2014, Mol Cell; Wong et al., 2019, Biophys J).

      In our analysis, we performed imaging analysis to demonstrate that NF-κB foci formation is switch-like, and this formation might be involved in the formation of phase-separated condensates enhancing DNA to DNA contact. The number of foci may depend upon the intracellular concentration of NF-κB, and fold change in the RelA signal may be correlated with gene expression as previously reported (Lee et al., 2014, Mol Cell; Wong et al., 2019, Biophys J). However, another report shows that promoter/enhancer proximity is not related to gene expression (Alexander et al. 2019, eLife). Although we were unable to perform this analysis owing to the limitations stated above, we tried to find the relationship between RelA foci and gene expression by performing biochemical perturbations (Fig 1e-f, Fig 5h) and showed that these foci are related to gene expression.

      • The analyses have been performed using DT40 cells. In the Methods section, no description was provided about what type of B cells DT40 is, even though few outside of the field may know that the cells were immortalized from chicken. This is an important consideration, because some nuclear bodies and genome organization features are different between host species and they also depend on whether the cells are primary or transformed. Because the authors do not discuss this point, it seems possible that the findings about NF-κB aggregates and super-enhancers may not necessarily hold true for primary B cells.

      Response: We thank the Reviewer for pointing out these issues. We have added the following description to the Methods section, stating that DT40 cells are chicken bursal lymphoma cells.

      DT40 B lymphocytes have been widely used as a B cell model for studying B cell receptor signaling (Mori et al., 2002, J. Exp. Med.; Patterson et al., 2002, Cell; Saeki et al., 2003, EMBO J.) due to their high gene targeting efficiency. We also previously confirmed that anti-IgM stimulation induces the NF-κB signaling pathway in mouse primary splenic B cells and DT40 and that the signaling molecules and dynamics in these cells are well conserved (Shinohara et al., 2014, Science; Shinohara et al., 2016, Sci. Rep.; Inoue et al., 2016, NPJ Syst. Biol. Appl.). However, we understand the Reviewer’s concerns. Therefore, we have provided the track view of primary B cell ATAC-seq data to demonstrate that the chromatin accessibility changes upon anti-IgM stimulation in CD83 and NFKBIA were similarly observed in primary B cell data (new Supplementary Fig. 9b) and that the upregulation and association with SE of CD83 and NFKBIA were also observed in primary B cells (new Supplementary Fig. 9a).

      • Similarly, the GFP-RelA expressing DT40 cell generation should be described with more detail (beyond "provided by ..."). N-terminal or C-terminal fusion? Did the fusion construct contain an artificial promoter (e.g. CMV) or an upstream fragment of the genomic Rela locus (chicken or human)? Methods of transfection and cloning of stable lines? These choices affect the interpretation of the data, so they must be fully described and justified.

      Response: We thank you for pointing this out. We have added the following details on the RelA-GFP construct in the Methods section:

      Mouse RelA-eGFP with eGFP on the C terminal was cloned into a pGAP vector containing Ecogpt resistance gene targeting endogenous GAPDH locus. This construct was further electroporated into wild-type cells and selected using Ecogpt to produce RelA-GFP-expressing DT40 cells.

      • DT40 cells were cultured in 39 degrees. Michael White and colleagues have shown that high temperatures can alter NF-kappaB dynamics and function (https://www.pnas.org/content/115/22/E5243). Did the authors try lower temperatures to ascertain that the NF-kB aggregates and other major findings are still observed in 37 degrees?

      Response: We performed the experiments at 39 degrees to mimic the natural body temperature of chicken since DT40 cells were derived from chicken bursal lymphoma (Saribasak and Arikawa, 2006, Subcell Biochem). Previously, we cultured DT40 cells at 37 degrees and observed that the cell growth was inhibited, and thus, we believed that it was not ideal to perform experiments of DT40 cells at 37 degrees.

      Reviewer #2 (Significance (Required)):

      It is not clear what to take as a main conceptual advance.

      Response: Considering the original manuscript, we agree with the Reviewer on the lack of strong emphasis on the conclusions of our study. Therefore, in this revised manuscript, we have focused on the comprehensive mechanism of heterogeneity and switch-like activation in gene expression control. As we described in the comments to Reviewer #1, we performed an additional in-depth computational analysis on SE and TE. Consequently, we demonstrated that enhanced heterogeneity and expression fold-changes mediated by SE are defined by the number of cis-regulatory genomic interactions in open chromatin regions (Figure 5), whereas switch-like expression (bimodal patterns) is determined by the number of NF-κB binding in SE (new Figure 4). The latter finding has also been reported in mouse primary B cells in our previous study (Michida et al. 2020, Cell Rep.). However, the mechanism causing heterogeneity is a novel conclusion obtained in this study. We also concluded that these similar, albeit quantitatively and slightly different, characteristics in gene control can be achieved through a combination of chromatin accessibility of host cells and biophysical properties of the NF-κB molecule, which is involved in phase separation.

      What could be the functional implications of observed cell-cell variability in B cell transcriptional responses to environmental stimuli?

      Response: We performed gene ontology analysis to reveal how the heterogeneously expressed genes (cluster 4) (Fig. 2d) presented enrichment for immune-related functions (Supplementary Fig. 5b). This result supports a previous study, which stated that variability in gene expression is related to function (Osorio et al., 2019, Cells).

      This discussion is incorporated in the manuscript as follows:

      “We observed that genes with an increased heterogeneity upon increasing stimulation dose are enriched with cell-type-specific immune regulatory genes (Supplementary Fig. 5b), supporting a previous report where heterogeneity in gene expression is tied to biological functions and may be used by cells as a bet-hedging or a response distribution mechanism (Osorio et al., 2019, Cells), where cells exhibit heterogeneity to enable response to changing environment and also allowing dose-dependent fractional activation respectively. This was observed in CD83, a B cell activation marker, demonstrating the involvement of heterogeneity in B cell development.”

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      "Imaging and single cell sequencing analyses of super-enhancer activation mediated by NF-κB in B cells" by Wibisana et al. examined the relationship between super-enhancers, NF-κB nuclear aggregation, and target gene regulation. The authors have generated a large amount of data from fluorescent microscopy, scRNA-seq, scATAC-seq, smRNA FISH. While this is an impressive dataset in terms of diverse technically advanced methods employed, it is not clear what to take as a main conceptual advance. What could be the functional implications of observed cell-cell variability in B cell transcriptional responses to environmental stimuli? In addition to this general point, the following are specific comments that could improve the manuscript.

      1. In Figure 2, smRNA FISH foci of CD83 and NFKBIA are quantified as # of spots per cell (Supplementary figure 5). But it is difficult to see in Figure 2 the colocalization of any mRNA spots with RelA foci. Ideally, it will be convincing to show by DNA FISH that these target loci are indeed located within NF-κB occupied super-enhancer puncta. Even with the current RNA FISH data, some colocalization analysis could have been performed.
      2. Supplementary Figure 5a shows lower correlations of # GFP-RelA foci to CD83 transcripts in comparison to NFKBIA. Even though the foci and smRNA FISH spots are derived from high resolution imaging data, we should remember that any snapshot measurements have limited information content for gene regulatory relationships. Live cell studies (for example, from the groups of Suzanne Gaudet, Kathryn Miller-Jensen, and Myong-Hee Sung) have shown that time-integrated measures (e.g. maximum fold change and area under the curve of RelA signaling time course in single cells) are better correlates to transcriptional output of target genes (Lee REC et al 2014 Mol Cell; Wong VC et al 2019 Biophysical J; Sung MH et al. 2014 Science Signal; Martin EW et al. 2020 Science Signal).
      3. The analyses have been performed using DT40 cells. In the Methods section, no description was provided about what type of B cells DT40 is, even though few outside of the field may know that the cells were immortalized from chicken. This is an important consideration, because some nuclear bodies and genome organization features are different between host species and they also depend on whether the cells are primary or transformed. Because the authors do not discuss this point, it seems possible that the findings about NF-κB aggregates and super-enhancers may not necessarily hold true for primary B cells.
      4. Similarly, the GFP-RelA expressing DT40 cell generation should be described with more detail (beyond "provided by ..."). N-terminal or C-terminal fusion? Did the fusion construct contain an artificial promoter (e.g. CMV) or an upstream fragment of the genomic Rela locus (chicken or human)? Methods of transfection and cloning of stable lines? These choices affect the interpretation of the data, so they must be fully described and justified.
      5. DT40 cells were cultured in 39 degrees. Michael White and colleagues have shown that high temperatures can alter NF-kappaB dynamics and function (https://www.pnas.org/content/115/22/E5243). Did the authors try lower temperatures to ascertain that the NF-kB aggregates and other major findings are still observed in 37 degrees?

      Significance

      It is not clear what to take as a main conceptual advance.

      What could be the functional implications of observed cell-cell variability in B cell transcriptional responses to environmental stimuli?

      Referee cross-commenting

      I concur with Reviewer #1's comments about systematic grouping of 1335 differentially expressed genes based on heterogeneity, and also about showing raw ATAC-seq data tracks and plots. We both commented that the study lacks a significant conclusion in its current form.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary of findings & key conclusions

      The manuscript by Wibisana et al. describes an impressive set of experiments that analyse the NFkB response at the single-cell level, using a variety of cutting-edge techniques (live cell imaging, single-cell RNA-seq, single-molecule RNA FISH, and single-cell ATAC-seq) in chicken DT40 B-cells.

      In the first half of the paper, the authors perform a detailed characterization of the cell-to-cell variation arising from a homogeneous stimulation with various doses of anti-IgM. They observe that the NFKB TF RelA forms clear nuclear 'foci' upon stimulation in DT40 cells: this was anecdotally shown in a different cell-type by the same authors in ref 7, but (to my knowledge) has never been systematically studied. This allows them to quantitatively analyse the foci formed in response to stimulation, and they show that this is dose-dependent, heterogeneous and bimodal, and exhibits properties of cooperativity. In parallel, the authors analyse the resulting stimulus-driven changes in gene expression, first using single-cell RNA-seq, and then, elegantly, using RNA FISH, which allows them to directly compare the number of RelA foci to gene expression in individual cells. Like the RelA foci, they find that cell-to-cell gene expression is heterogeneous and bimodal (this has been described before). Interestingly, though, they are able to show that individual stimulus-responsive genes exhibit distinct patterns of cell-to-cell heterogeneity: they can categorize 4 clusters of responding genes according to different patterns of cell-to-cell variation at distinct stimulus doses, and moreover they show that while the heterogeneity of NFKBIA arises due to bimodal expression levels, that of CD83 is simply due to broad variation between cells. Although focused on NFkB, there is a lot of information here with some important (and non-intuitive) implications that could apply to many other stimulus-driven or developmental responses that exhibit heterogeneous patterns of gene expression. A more in-depth analysis of the single-cell datasets would certainly be very worthwhile and fruitful.

      In the second half of the paper, the authors attempt to use their single-cell data, alongside ATAC-seq genomic analyses, to draw inferences about how or whether the model genes NFKBIA and CD83 are regulated by super-enhancers (SEs). Both of these genes are associated with SEs that gain accessibility upon stimulation (recapitulating the authors' findings in ref 8 in a different cell-type), and the CD83 promoter exhibits co-accessibility with two regions within an adjacent SE. The authors also show that both genes are sensitive to treatment with 1,6-HD, a compound that disrupts liquid-like condensates (a characteristic that has been reported for SEs), and CD83 is sensitive to an inhibitor of Brd4 (which has been associated to SE function). However, while these findings could be considered to be suggestive of regulation by SEs, they are clearly not definitive (nor do the authors claim so).

      Finally, the authors show (figure 4a-c) that while the level of stimulus-driven gene upregulation correlates with co-accessibility with both SEs and typical enhancers (TEs), the cell-to-cell heterogeneity of gene expression correlates only with co-accessibility with SEs. This would agree with a model in which SE-regulated gene regulation may generally impart heterogeneous or switch-like gene expression.

      Specific comments

      • The experiments are adequately presented, and the authors indicate that not only the sequencing data but also the analysis code is available. Nevertheless, the methods section is rather terse, and could benefit from more detail to understand the various analyses, particularly concerning the analyses of SEs in figures 3 and S7, where it is often difficult to understand how peaks or genes are categorized.

      • The imaging, scRNA-seq and RNA-FISH experiments are well-presented, although the supplementary figures 4 and 5 include key results that would merit inclusion within the main figures.

      • It is striking that although all the conclusions about SEs are drawn almost exclusively from analysis of ATAC-seq data, no raw ATAC-seq data is directly shown in any figure (even in the browser snapshots of figure 4d & e). It would be important to show the actual ATAC data from which the inferences of figures 3 and 4 are drawn, especially so that it is possible to visualize the implication of a particular 'ATAC fold-change' or of 'ATAC-gained enhancers'.

      Significance

      This manuscript can be considered as a follow-up of the authors' previous paper (Michida 2020, ref 8), here focusing on cell-to-cell heterogeneity rather than on the overall magnitude of the stimulus-induced response. Overall, the experiments are well-performed and bring new data to an interesting angle of gene regulation. However, the analyses presented do not seem to fully exploit the data, and the authors do not manage to present any strong conclusions, particularly relating to the possible involvement of super enhancers.

      For instance, the existence of multiple gene clusters that exhibit distinct patterns of heterogeneity implies that switch-like gene activation occurs on a per-gene basis, rather than corresponding to an all-or-nothing activation of individual cells. This would be an exciting finding, and the authors have the data to test this. Likewise, the division of heterogeneous gene expression into bimodal (like NFKBIA) or unimodal (like CD83) distributions could be a nice paradigm if systematically applied to the other 1335 differentially-expressed genes identified by the authors.

      In contrast, although the authors try to use their data to investigate gene regulation by SEs, these inferences are all somewhat indirect, and the authors themselves do not manage to draw any definitive conclusions.

      I feel that the authors are under-selling their data here. As-is, the data represents more of a resource than a study with a clear message, but I believe that with more in-depth analysis the authors could make a much more significant advance, particularly concerning the cell-to-cell heterogeneity of gene expression. I would be very enthusiastic to review the same data again with a more detailed analysis, which I believe would enormously improve the manuscript.

      Reviewer field of expertise

      My expertise is in gene regulation and genomics. I am competent to review the implications of all parts of this paper, and all the technical aspects with the exception of the microscopy.

      Referee Cross-commenting

      I agree both with the specific points raised by reviewer #2, and also with the overall comment that - despite the large amount of data - the authors do not present any clear conceptual advance or tackle the functional implications of their results.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank the two reviewers for the valuable comments and suggestions on improvements. We addressed each reviewer’s comments individually. We have carefully revised the manuscript to incorporate new data and to make necessary clarifications.

      Overall we made the following major modifications:

      1. We investigated the relevance of BHRF1 expression in the context of EBV infection, in B cells and epithelial cells. We observed that EBV reactivation leads to MT hyperacetylation and subsequent mito-aggresome formation in both cell types. An EBV+ B cell line deficient for BHRF1 was generated and allowed us to demonstrate the involvement of BHRF1 in this phenotype. These results were added to Figures 2, 3 and Figure 1 – S1 in the revised version of the manuscript.
      2. We better characterized the mechanism leading to MT hyperacetylation, by demonstrating that BHRF1 colocalizes and interacts with the tubulin acetyltransferase ATAT1. These results were added to Figure 5 and Figure 5 – S2 in the revised manuscript.
      3. We generated stable HeLa cells KO for ATG5. Using these autophagy-deficient cells, we demonstrated the involvement of autophagy in BHRF1-induced MT hyperacetylation and mito-aggresome formation. We added these results to Figure 8 in the revised version of the manuscript.
      4. We compared the impact of BHRF1 with other mitophagy inducers on MT hyperacetylation, mitochondrial morphodynamics and the inhibition of IFN production, to demonstrate the specificity of the mechanism of action of BHRF1 (Figure 4 – S1).
      5. We demonstrated that MT hyperacetylation requires mitochondrial fission, using a Drp1-deficient HeLa cell line that we have previously described (Vilmen et al., 2020). This result was added to the revised version of the manuscript in Figure 3 – S2A. Moreover, we confirmed this result in the context of EBV infection (Figure 3 – S2B).

      Reviewer#1

      Reviewer #1 (Evidence, reproducibility and clarity)

      Major comments:

      1. In the presented manuscript the authors characterize mainly BHRF1 overexpression in HeLa cells. Does BHRF1 also block type I IFN responses by microtubule hyperacetylation in the context of EBV infection? Do alpha-tubulin K40A overexpressing B cells produce more type I IFN after EBV infection?

      In the revised version of the manuscript, we added several experiments to explore the phenotype of BHRF1 during EBV infection, as requested by the two reviewers. Since EBV infects both B cells and epithelial cells, we used two different approaches. In latently-infected B cells, coming from Burkitt lymphoma (Akata cells), we induced EBV reactivation by anti-IgG treatment. To explore the importance of BHRF1 in this cell type, we constructed a cell line knocked down for BHRF1 expression, thanks to a lentivirus bearing an shRNA against BHRF1. In parallel, HEK293 cells harboring either EBV WT or EBV ΔBHRF1 genome were transfected with ZEBRA and Rta plasmids to induce the viral productive cycle in epithelial cells.

      We demonstrated that EBV infection induces MT hyperacetylation and subsequent mito-aggresome formation, both dependent on autophagy. Moreover, this phenotype requires BHRF1 expression in B cells and epithelial cells. We also observed that the expression of alpha-tubulin K40A in EBV+ epithelial cells blocks mito-aggresome formation induced by EBV reactivation. These results are now presented in Figures 2 and 3 in the revised version of the manuscript.

      Regarding regulation of IFN response during infection, several EBV-encoded proteins and non-coding RNAs have been described to interfere with the innate immune system. For example, BGLF4 and ZEBRA bind to IRF3 and IRF7, respectively, to block their nuclear activity (Hahn et al., 2005; Wang et al., 2009). Moreover, Rta expression decreases mRNA expression of IRF3 and IRF7 (Bentz et al., 2010; Zhu et al., 2014). We therefore think that studying the inhibitory role of BHRF1 on IFN response in the context of EBV reactivation will be arduous. Indeed, the lack of BHRF1 could be compensated by the activity of other viral proteins acting on innate immunity.

      2. The authors document that the observed microtubule hyperacetylation is due to the acetyltransferase ATAT1. How does BHRF1 activate ATAT1? Is there any direct interaction?

      As requested by reviewer#1, we explored a possible interaction of BHRF1 and ATAT1. First, we observed by confocal microscopy that GFP-ATAT1 colocalized with BHRF1 in the juxtanuclear region of HeLa cells (Figure 5 – S2). Second, we demonstrated by two co-immunoprecipitation assays that BHRF1 binds to exogenous ATAT1 (Figures 5E and 5F). These new results have been added to the revised version of the manuscript and clarify the mechanism of action of BHRF1. To go further, we explored whether BHRF1 was able to stabilize ATAT1 because it was recently reported that p27, an autophagy inducer that modulates MT acetylation, binds to and stabilizes ATAT1 (Nowosad et al., 2021). However, BHRF1 expression does not impact the expression of ATAT1 (data not shown).

      3. Furthermore, the authors demonstrate with pharmacological autophagy inhibitors that autophagy is increased in a BHRF1 dependent and microtubule acetylation independent manner but required for microtubule hyperacetylation. How does autophagy stimulate ATAT1 dependent microtubule hyperacetylation? Is this dependency also observed with a more specific ATG silencing or knock-out?

      We generated a stable autophagy-deficient HeLa cell line KO for ATG5, using an ATG5 CRISPR/Cas9 construct delivered by a lentivirus. The lack of ATG5 expression and LC3 lipidation was verified by immunoblot (Figure 8B). We observed that BHRF1 was unable to increase MT acetylation in this autophagy-deficient cell line (Figure 8C) in accordance with our data reported in the original manuscript using treatment with spautin 1 or 3-MA (previously Figure S5C and Figure 8A in the revised version). Moreover, the lack of hyperacetylated MT in BHRF1-expressing cells led to a dramatic reduction of mito-aggresome formation (Figures 8D and 8E). These new results demonstrate that autophagy is required for BHRF1-induced MT hyperacetylation.

      Minor comments:

      1. "Innate immunity" and "innate immune system", but not "innate immunity system" are in my opinion better wordings.

      We thank reviewer #1 for this useful comment. The term “innate immunity system” in the introduction section has been replaced by “innate immune system”. Elsewhere, we used “innate immunity”.

      2. The reader would benefit from a discussion on the role of type I IFNs during EBV infection and how important the authors think their new mechanism could be in this context.

      We thank the reviewer for this suggestion. However, we already discussed the different strategies developed by EBV to counteract IFN response induction, in our previous study, suggesting the importance of IFN in the control of EBV infection (Vilmen et al., 2020). In this study, we have focused the discussion on the role of mitophagy in the control of IFN production.

      Reviewer #1 (Significance):

      The significance of the described pathway for type I IFN production needs to be documented in the context of EBV infection.

      The revised version of the manuscript now explores the role of BHRF1 in the context of EBV infection. See above for details (major comment 1).

      Reviewer#2

      Reviewer #2 (Evidence, reproducibility and clarity)

      The work presented is a relatively straightforward cell biological dissection of a subset of the previously described functions of BHRF1, focusing on the mitochondrial aggregation phenotype. The approaches and analysis are performed in cell lines mainly using overexpression and some siRNA experiments and appear well done throughout.

      We thank reviewer #2 for this comment and would like to underline that the revised version of the manuscript now includes a study of BHRF1 in the context of infection in both B cells and epithelial cells, the generation of a stable EBV-positive B cell line knocked down (KD) for BHRF1 using an shRNA approach, and the generation of a stable autophagy-deficient cell line using CRISPR/Cas9 against ATG5.

      Reviewer #2 (Significance):

      The current study unpicks one of the phenotypes induced by BHRF1 overexpression: namely the previously reported mitochondrial aggregation phenotype. The finding that peri-nuclear mitochondrial aggregation is dependent on microtubules and retrograde motors is useful but could perhaps have been predicted. Overexpression of many proteins (or indeed chemical treatments) causing cellular and/or mitochondrial stress has been shown to cause mitochondrial perinuclear aggregation.

      To explore the specificity of BHRF1 activity on mito-aggresome formation, we decided to investigate the impact of AMBRA1-ActA, a previously characterized mitophagy inducer, on MT (Strappazzon et al., 2015). We observed that expression of AMBRA1-ActA leads to mito-aggresome formation but does not modulate acetylation of MTs, contrary to BHRF1. This result was added to the revised version of the manuscript (Figure 4 - S1A and S1B). Moreover, chemical treatments with either oligomycin/antimycin or CCCP, which induce mitochondrial stress and mitophagy (Lazarou et al., 2015; Narendra et al., 2008), do not cause mitochondrial juxtanuclear aggregation (Figure 4 - S1C). We also observed that a hyperosmotic shock induced by NaCl leads to MT hyperacetylation (Figure 4 - S1D) but not to mito-aggresome formation (data not shown), suggesting that MT hyperacetylation per se is not sufficient to induce the clustering of mitochondria. Altogether, these new results demonstrate the originality of the mechanism used by BHRF1 to induce mito-aggresome formation.

      The findings linking the process to altered tubulin acetylation are more novel and interesting and may add a new dimension to understanding of BHRF1 function. However what is lacking here is really advancing our understanding of how BHRF1 does this.

      We thank the reviewer for underlining the fact that regulation of mitochondrial morphodynamics by BHRF1 via MT hyperacetylation is novel and interesting.

      In the original version of the manuscript, we have demonstrated that autophagy and ATAT1 are required for BHRF1-induced hyperacetylation. In the revised version, we uncovered that BHRF1 interacts and colocalizes with ATAT1 (Figures 5E, 5F and Figure 5 – S2). Moreover, we demonstrated that MT hyperacetylation is involved in the localization of autophagosomes next to the nucleus, thus close to the mito-aggresome. Therefore, we better characterized the mechanism of action of BHRF1 in the revised manuscript.

      Although some downstream processes are identified in the current and previous study it still remains unclear what the exact underlying mechanisms are. Is BHRF1 doing this by disrupting mitochondrial function and making the organelles sick or by causing cellular stress indirectly leading to mitochondrial pathology? Previous studies have shown that cellular stress such as altered proteostasis can also cause stress-induced mitochondrial retrograde trafficking and aggregation. Is BHRF1 causing the same phenotype by generally stressing the cell and if it is more specifically through mitochondrial disruption what is the mechanism? As demonstrated by the authors in their previous work, BHRF1 does a number of things to cell signalling. Which of these are leading to a general disruption of cell signalling versus having specific effects on the cell or mitochondria still seems somewhat unclear.

      We previously reported that BHRF1 expression does not alter the mitochondrial membrane potential (Vilmen et al., 2020), contrary to treatment with O/A or CCCP. Moreover, we observed that these treatments do not induce mitochondrial clustering (Figure 4 – S1). Therefore, BHRF1 modulates mitochondrial dynamics in a specific and regulated manner.

      Our study clearly demonstrated that BHRF1 uses an original strategy to modulate IFN response, via a regulated pathway of successive steps, from mitochondrial fission to mitophagy, via MT hyperacetylation, rather than “a general disruption of cell signalling”.

      It would be interesting to know whether the role of microtubule hyperacetylation and ATAT1 are more generally involved in other previously described processes of stress induced mitochondrial aggregation.

      In the revised version of the manuscript, we observed that AMBRA1-ActA does not change the level of MT acetylation, whereas it induces mito-aggresome formation. These data reinforce the originality of the BHRF1 mechanism.

      Currently while this is a nicely performed follow up study to their 2020 paper, the present study neither provides in depth mechanistic advance of BHRF1 function, nor a better understanding of the molecular steps in a more generally relevant pathway (e.g. mitophagy).

      We disagree with the reviewer’s comment. Indeed, in this new study, we uncovered and characterized a new mechanism of action for BHRF1 via ATAT1-dependent MT hyperacetylation. More generally, we reported for the first time that innate immunity can be regulated by the level of MT acetylation.

      In addition, all the experiments were performed in cell lines and rely on the overexpression of a viral protein. But this is a significant over-simplification of the viral pathological process. It therefore remains unclear how pathophysiologically relevant the findings are (e.g. to EBV pathology) without further extending this element of the work.

      To address this comment, we extended our results in the infectious context, by adding several experiments performed in EBV-infected cell lines (see above reviewer#1 for details). The same phenotype was observed after reactivation of the EBV productive cycle as in BHRF1 ectopic expression. Moreover, we demonstrated that the phenotype is BHRF1-dependent. This suggests the importance of BHRF1 in EBV pathogenesis by participating in innate immunity control.

      An additional minor issue is the authors naming of the process as Mito-aggresome formation. Although this might sound catchy it is somewhat unclear what the biological basis for this is. Aggresomes are defined structures that occur in cells during pathology and due to the peri-nuclear accumulation of misfolded protein. Since the process here is simply the description of aggregated mitochondria next to the nucleus but doesn't seem to have anything to do with protein misfolding it's really unclear how this labelling is helpful to the field. The process of perinuclear mitochondrial aggregation e.g. during mitochondrial stress or damage has been described many times before without the need for calling it a mito-aggresome. This term is likely to cause unhelpful confusion.

      We understand the comment of reviewer #2, but the term “mito-aggresome” has been used in other studies since 2010 and refers to a clustering of mitochondria next to the nucleus, similar to what we observed with BHRF1 (D’Acunzo et al., 2019; Lee et al., 2010; Springer and Kahle, 2011; Strappazzon et al., 2015; Van Humbeeck et al., 2011; Yang and Yang, 2011).

      However, we took into consideration the risk of confusion for the readers, by changing how we introduced the term “mito-aggresome” in the revised version of the manuscript (page 5 line 94).

      References

      Bentz GL, Liu R, Hahn AM, Shackelford J, Pagano JS. 2010. Epstein–Barr virus BRLF1 inhibits transcription of IRF3 and IRF7 and suppresses induction of interferon-β. Virology 402:121–128. doi:10.1016/j.virol.2010.03.014

      D’Acunzo P, Strappazzon F, Caruana I, Meneghetti G, Di Rita A, Simula L, Weber G, Del Bufalo F, Dalla Valle L, Campello S, Locatelli F, Cecconi F. 2019. Reversible induction of mitophagy by an optogenetic bimodular system. Nat Commun 10:1533. doi:10.1038/s41467-019-09487-1

      Hahn AM, Huye LE, Ning S, Webster-Cyriaque J, Pagano JS. 2005. Interferon regulatory factor 7 is negatively regulated by the Epstein-Barr virus immediate-early gene, BZLF-1. J Virol 79:10040–10052. doi:10.1128/JVI.79.15.10040-10052.2005

      Lazarou M, Sliter DA, Kane LA, Sarraf SA, Wang C, Burman JL, Sideris DP, Fogel AI, Youle RJ. 2015. The ubiquitin kinase PINK1 recruits autophagy receptors to induce mitophagy. Nature 524:309–314. doi:10.1038/nature14893

      Lee J-Y, Nagano Y, Taylor JP, Lim KL, Yao T-P. 2010. Disease-causing mutations in Parkin impair mitochondrial ubiquitination, aggregation, and HDAC6-dependent mitophagy. J Cell Biol 189:671–679. doi:10.1083/jcb.201001039

      Narendra DP, Tanaka A, Suen D-F, Youle RJ. 2008. Parkin is recruited selectively to impaired mitochondria and promotes their autophagy. J Cell Biol 183:795–803. doi:10.1083/jcb.200809125

      Nowosad A, Creff J, Jeannot P, Culerrier R, Codogno P, Manenti S, Nguyen L, Besson A. 2021. p27 controls autophagic vesicle trafficking in glucose-deprived cells via the regulation of ATAT1-mediated microtubule acetylation. Cell Death Dis 12:1–18. doi:10.1038/s41419-021-03759-9

      Springer W, Kahle PJ. 2011. Regulation of PINK1-Parkin-mediated mitophagy. Autophagy 7:266–278. doi:10.4161/auto.7.3.14348

      Strappazzon F, Nazio F, Corrado M, Cianfanelli V, Romagnoli A, Fimia GM, Campello S, Nardacci R, Piacentini M, Campanella M, Cecconi F. 2015. AMBRA1 is able to induce mitophagy via LC3 binding, regardless of PARKIN and p62/SQSTM1. Cell Death Differ 22:419–32. doi:10.1038/cdd.2014.139

      Van Humbeeck C, Cornelissen T, Hofkens H, Mandemakers W, Gevaert K, De Strooper B, Vandenberghe W. 2011. Parkin Interacts with Ambra1 to Induce Mitophagy. J Neurosci 31:10249–10261. doi:10.1523/JNEUROSCI.1917-11.2011

      Vilmen G, Glon D, Siracusano G, Lussignol M, Shao Z, Hernandez E, Perdiz D, Quignon F, Mouna L, Poüs C, Gruffat H, Maréchal V, Esclatine A. 2020. BHRF1, a BCL2 viral homolog, disturbs mitochondrial dynamics and stimulates mitophagy to dampen type I IFN induction. Autophagy 17:1296–1315. doi:10.1080/15548627.2020.1758416

      Wang J-T, Doong S-L, Teng S-C, Lee C-P, Tsai C-H, Chen M-R. 2009. Epstein-Barr Virus BGLF4 Kinase Suppresses the Interferon Regulatory Factor 3 Signaling Pathway. J Virol 83:1856–1869. doi:10.1128/JVI.01099-08

      Yang J-Y, Yang WY. 2011. Spatiotemporally controlled initiation of Parkin-mediated mitophagy within single cells. Autophagy 7:1230–1238. doi:10.4161/auto.7.10.16626

      Zhu L-H, Gao S, Jin R, Zhuang L-L, Jiang L, Qiu L-Z, Xu H-G, Zhou G-P. 2014. Repression of interferon regulatory factor 3 by the Epstein-Barr virus immediate-early protein Rta is mediated through E2F1 in HeLa cells. Mol Med Rep 9:1453–1459. doi:10.3892/mmr.2014.1957

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this study the authors continue on from previous recent work demonstrating that the Epstein Barr virus encoded protein BHRF1 causes a number of cellular effects including an impact on autophagy and disruption of mitochondrial dynamics including drp1-dependent mitochondrial fragmentation and mitochondrial peri-nuclear aggregation followed by enhanced Parkin-dependent mitochondrial turnover (mitophagy). In the current study the authors further extend this work by showing that mitochondrial aggregation (as one might predict) is dependent on the microtubule network and coupling to retrograde motors. They also demonstrate that mitochondrial aggregation is dependent on ATAT1 dependent tubulin hyperacetylation.

      Overall the work presented is a relatively straightforward cell biological dissection of a subset of the previously described functions of BHRF1, focusing on the mitochondrial aggregation phenotype. The approaches and analysis are performed in cell lines mainly using overexpression and some siRNA experiments and appear well done throughout.

      Significance

      The current study unpicks one of the phenotypes induced by BHRF1 overexpression: namely the previously reported mitochondrial aggregation phenotype. The finding that peri-nuclear mitochondrial aggregation is dependent on microtubules and retrograde motors is useful but could perhaps have been predicted. Overexpression of many proteins (or indeed chemical treatments) causing cellular and/or mitochondrial stress has been shown to cause mitochondrial perinuclear aggregation. This process has often been previously reported to be dependent on retrograde (dynein-dependent) mitochondrial trafficking so finding the process is also required for BHRF1-dependent aggregation is a helpful addition but not in itself particularly impactful. The findings linking the process to altered tubulin acetylation are more novel and interesting and may add a new dimension to understanding of BHRF1 function. However what is lacking here is really advancing our understanding of how BHRF1 does this. Although some downstream processes are identified in the current and previous study it still remains unclear what the exact underlying mechanisms are. Is BHRF1 doing this by disrupting mitochondrial function and making the organelles sick or by causing cellular stress indirectly leading to mitochondrial pathology? Previous studies have shown that cellular stress such as altered proteostasis can also cause stress-induced mitochondrial retrograde trafficking and aggregation. Is BHRF1 causing the same phenotype by generally stressing the cell and if it is more specifically through mitochondrial disruption what is the mechanism? As demonstrated by the authors in their previous work, BHRF1 does a number of things to cell signalling. Which of these are leading to a general disruption of cell signalling versus having specific effects on the cell or mitochondria still seems somewhat unclear.

      It would be interesting to know whether the role of microtubule hyperacetylation and ATAT1 are more generally involved in other previously described processes of stress-induced mitochondrial aggregation. Currently while this is a nicely performed follow up study to their 2020 paper, the present study neither provides in depth mechanistic advance of BHRF1 function, nor a better understanding of the molecular steps in a more generally relevant pathway (e.g. mitophagy).

      In addition all the experiments were performed in cell lines and rely on the over expression of a viral protein. But this is a significant over-simplification of the viral pathological process. It therefore remains unclear how pathophysiologically relevant the findings are (e.g. to EBV pathology) without further extending this element of the work.

      An additional minor issue is the authors naming of the process as Mito-aggresome formation. Although this might sound catchy it is somewhat unclear what the biological basis for this is. Aggresomes are defined structures that occur in cells during pathology and due to the peri-nuclear accumulation of misfolded protein. Since the process here is simply the description of aggregated mitochondria next to the nucleus but doesn't seem to have anything to do with protein misfolding it's really unclear how this labelling is helpful to the field. The process of perinuclear mitochondrial aggregation e.g. during mitochondrial stress or damage has been described many times before without the need for calling it a mito-aggresome. This term is likely to cause unhelpful confusion.

      Referee Cross-commenting

      Reviewer 1 makes several good points.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Manuscript Nr.: RC-2021-00890 Glon et al., "Essential role of hyperacetylated microtubules in innate immunity escape orchestrated by the EBV-encoded BHRF1 protein"

      The authors demonstrate that overexpression of the early lytic Epstein Barr virus protein BHRF1 causes mitochondrial fission and aggregation of smaller mitochondria in the perinuclear area. This aggregation is dependent on microtubules that are hyperacetylated upon BHRF1 expression, and on dynein motors. Hyperacetylation is dependent on autophagy, but not required for BHRF1 induced autophagy. Expression of acetylation insensitive tubulin abolishes mitochondrial aggregation, but not fission upon BHRF1 expression. This mitochondrial aggregation is required for BHRF1 dependent inhibition of type I interferon (IFN) production and of IRF3 translocation into the nucleus. From these data the authors conclude that BHRF1 might compromise type I IFN production by microtubule acetylation dependent mitochondria aggregation in the perinuclear area.

      The presented study describes an interesting mechanism, but it remains unclear if it occurs and which role it plays during EBV infection.

      Major comments:

      1. In the presented manuscript the authors characterize mainly BHRF1 overexpression in HeLa cells. Does BHRF1 also block type I IFN responses by microtubule hyperacetylation in the context of EBV infection? Do alpha-tubulin K40A overexpressing B cells produce more type I IFN after EBV infection?
      2. The authors document that the observed microtubule hyperacetylation is due to the acetyltransferase ATAT1. How does BHRF1 activate ATAT1? Is there any direct interaction?
      3. Furthermore, the authors demonstrate with pharmacological autophagy inhibitors that autophagy is increased in a BHRF1 dependent and microtubule acetylation independent manner but required for microtubule hyperacetylation. How does autophagy stimulate ATAT1 dependent microtubule hyperacetylation? Is this dependency also observed with a more specific ATG silencing or knock-out?

      Minor comments:

      1. "Innate immunity" and "innate immune system", but not "innate immunity system" are in my opinion better wordings.
      2. The reader would benefit from a discussion on the role of type I IFNs during EBV infection and how important the authors think their new mechanism could be in this context.

      Significance

      The significance of the described pathway for type I IFN production needs to be documented in the context of EBV infection.

  4. Nov 2021
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to Reviewer’s comments

      We thank the three reviewers for their positive comments and constructive feedback. We have addressed the issues raised through additional experiments and text changes which have helped to improve the manuscript. Below, we address the specific points with detailed responses (reviewer comments are provided in italic).

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Rodriguez-Lopez et al describes the analysis of long intergenic non-coding RNA (lincRNA) function in fission yeast using both deletion and overexpression methods. The manuscript is very well presented and provides a wealth of lincRNA functional information for the field. This work is an important advance as there is still very little known about the function of lincRNAs in both normal and other conditions. An impressive array of conditions were assessed here. With a large scale analysis like this there is really not one specific conclusion. The authors conclude that lincRNAs exert their function in specific environmental or physiological conditions. This conclusion is not a novel conclusion, it has been proposed and shown before, but this manuscript provides the experimental proof of this concept on a large scale.

      The lincRNA knock-out library was assessed using a colony size screen, a colony viability screen and cell size and cell cycle analysis. Additionally, a lincRNA over-expression library was assessed by a colony size screen. These different functional analysis methods for lincRNAs were then carried out in a wide variety of conditions to provide a very large dataset for analysis. Overall, the presentation and analysis of the data was easy to follow and informative. Some points below could be addressed to improve the manuscript.

      There were 238 protein coding gene mutants assessed in parallel, to provide functional context, which was a very promising idea. But, unfortunately, the inclusion of 104 protein coding genes of unknown function restricted the use of the protein coding genes in the integrated analysis to connect lincRNAs to a known function using guilt by association.

      Reply: Yes, the unknown coding-gene mutants certainly did not help to provide functional context through guilt by association. These mutants were included to generate functional clues for the unknown proteins and to compare phenotype hits with unknown lincRNA mutants. Nevertheless, because the known coding-gene mutants included broadly cover all high-level biological processes (GO slim), we could make several useful functional inferences for certain lincRNAs as discussed.

      The colony viability screen is not described well throughout the manuscript. Firstly, the use of phloxine B dye to determine cell viability needs to be described better when first introduced at the bottom of page 6. What exactly is this viability screen and red colour intensity indicating? Please define what the different levels of red a colony would indicate as far as viability. I assume an increase in red colour indicates more dead cells? So it is confusing that later the output of this assay is described as giving a resistant/sensitive phenotype or higher/lower viability. How can you get a higher viability from an assay that should only detect lower viability? Shouldn't this assay range from viable (no, or low red, colour) to increasing amounts of red indicating increasingly less viability? Figure 4D is also confusing with the "red" and "white" annotations. These should be changed to "lower viability" and "viable" or "not viable" and "viable".

      Reply: The colony-viability screen is described in detail in our recent paper (Kamrad et al, eLife 2020). We have now better explained how phloxine B works to determine cell viability (p. 6). The reviewer’s assumption is correct: an increase in red colour indicates more dead cells. However, all phenotypes reported are relative to wild-type cells under the same condition. Many conditions lead to a general increase in cell death, but some mutants show a lower increase in cell death compared to wild-type cells. These mutants, therefore, have a higher viability than wild-type cells, i.e. they are more resistant than wild-type under the given condition. We have tried to clarify this in the text, including the legend of Fig. 4. We agree that the ‘red’ and ‘white’ annotations in Fig. 4D could be confusing. We have now changed these to ‘low viability’ and ‘high viability’. Again, this is relative to wild-type cells.
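
      The comparison to wild-type cells described in this reply can be illustrated with a minimal sketch. The function name, the redness scale and the 0.1 threshold are hypothetical and are not taken from the manuscript or from Kamrad et al.; the sketch only shows why a mutant can score as "high viability" in an assay that measures cell death.

```python
def classify_viability(mutant_redness: float, wildtype_redness: float,
                       threshold: float = 0.1) -> str:
    """Classify a mutant relative to wild type grown under the same condition.

    Redness is a hypothetical phloxine B intensity score: more red = more dead cells.
    The threshold is an illustrative assumption, not a value from the study.
    """
    difference = mutant_redness - wildtype_redness
    if difference > threshold:
        return "low viability"   # redder than wild type: more dead cells
    if difference < -threshold:
        return "high viability"  # less red than wild type: fewer dead cells
    return "no difference"


# A stress condition raises cell death in every strain, but less so in this mutant.
print(classify_viability(mutant_redness=0.4, wildtype_redness=0.7))
```

      Because the score is a difference from wild type under the same condition, a colony can be fairly red in absolute terms and still be called more viable, provided the wild-type colony is redder.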

      How are you sure that when generating the 113 lincRNA ectopic over-expression constructs by PCR that the sequences you cloned are correct? Simply checking for "correct insert size", as stated in the methods, is not really good practice and these constructs should be fully sequenced to be sure they contain the correct sequence and that constructs have not had mutations introduced by the PCR used for cloning. Without such sequence confirmation one cannot be completely confident that the data produced is specific for a lincRNA over-expression. Additionally, a selection of strains with the overexpression constructs should be tested by qRT-PCR and compared to a non-over-expressing strain to confirm lincRNA overexpression.

      Reply: To minimize errors during PCR amplification, we used the high-fidelity Phusion DNA polymerase, which features a >50-fold lower error rate than Taq DNA polymerase. We had confirmed the insert sequences for the first 17 lincRNAs cloned using Sanger sequencing (but did not report this in the manuscript). We have now checked additional inserts of the overexpression plasmids by Sanger sequencing in 96-well plate format using a universal forward primer upstream of the cloning site. This high-throughput sequencing produced reliable sequence data for 80 inserts (full insert sequences for 62 plasmids and the first ~900 bp of insert sequences for 18 plasmids). Of these, only the insert for SPNCRNA.601 showed a sequence error compared to the reference genome: a T-to-C transition at position 559. This mutation could reflect either an error that occurred during cloning or a natural sequence variant among yeast strains (lincRNA sequences are much more variable than coding sequences). So, in general, the PCR cloning accurately preserved the sequence information. We have added this information in the Methods (p. 27-28). Please note that lincRNAs depend much less on primary nucleotide sequence than mRNAs, and a few nucleotide changes are highly unlikely to interfere with lincRNA function.
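
      The sequencing figures quoted in this exchange can be tallied with a short sketch. The function and key names are hypothetical; the inputs are the numbers stated in the thread (113 overexpression constructs, 62 full and 18 partial insert sequences, one insert with a mismatch). Whether the 17 inserts sequenced earlier are part of the 80 is not stated, so they are not counted separately here.

```python
def sequencing_summary(total_constructs: int, full: int, partial: int,
                       with_mismatch: int) -> dict[str, float]:
    """Tally the Sanger sequencing checks reported in the reply."""
    checked = full + partial
    return {
        "checked": checked,
        "fraction_of_library_checked": checked / total_constructs,
        "fraction_of_checked_with_mismatch": with_mismatch / checked,
    }


# Figures quoted in the review thread: 113 constructs, 62 fully sequenced inserts,
# 18 partially sequenced inserts, and 1 insert (SPNCRNA.601) with a mismatch.
summary = sequencing_summary(total_constructs=113, full=62, partial=18, with_mismatch=1)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

      On these numbers, roughly seven in ten constructs have sequence support, which is the basis for the statement that the PCR cloning generally preserved the sequence information.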

      Minor comments:

      Page 4, lines 19-20 - "A substantial portion of lincRNAs are actively translated (Duncan and Mata, 2014), raising the possibility that some of them act as small proteins." This sentence does not make sense, lincRNAs can't "act as" small proteins, they can only "code for" small proteins. Wording needs to be changed here.

      Reply: We agree and have changed the wording as suggested.

      Figure 1A is a nice representation but what are the grey dots? Are they all ncRNAs including lincRNAs? This needs to be stated in the legend.

      Reply: The grey dots represent all non-coding RNAs across the three S. pombe chromosomes as described by Atkinson et al., 2018. This has now been clarified in the legend.

      How many lincRNAs are there in total in pombe and what percentage did you delete? These numbers should be stated in the text.

      Reply: There are 1189 lincRNAs and we mutated ~12.6% of them. These numbers are now stated at the end of the Introduction, page 5.

      It would be nice if Supplementary Figure 1 included concentrations or amounts of the conditions used. This info is buried in a Supplementary table and would be better placed here.

      Reply: Supplemental Fig. 1 provides a simple overview of the different conditions and drugs used. For most stresses and drugs, we used multiple different doses, so the figure would become cluttered if we indicated all these concentrations, detracting from the main message. Colleagues who are interested in the concentration ranges used for specific conditions can readily obtain this information from Supplemental Dataset 1. We have now added a statement in this respect to the legend of Supplemental Fig. 1.

      Page 6, last sentence. What is a "biological repeat"? Three distinct deletion strains (ie three different deletion strains made by CRISPR) or one deletion strain used three times?

      Reply: Biological repeat means that one deletion strain was assayed three times independently, each with at least two colonies (technical repeats). In most cases, we had two or more independently generated deletion strains for each lincRNA (using the same or different gRNAs), and we performed at least three biological repeats for each strain. The numbers of independent strains for each lincRNA are provided in Supplemental Dataset 1 (sheet: lincRNA_metadata, column: n_independent_ko_mutants). The total numbers of repeats carried out for each condition after QC filtering are available in Supplemental Dataset 2 (columns: observation_count). We have clarified this on p. 7, and the details are now provided in the Methods on p. 28-29 (deletion mutants) and p. 32 (overexpression mutants).
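
      The nesting of technical repeats inside biological repeats described in this reply can be sketched as a small data structure. The colony sizes below are invented for illustration, and the two-step averaging is an assumption about how such data might be summarised, not the authors' documented analysis.

```python
from statistics import mean

# Hypothetical colony sizes: one deletion strain, three independent biological
# repeats, each with two colonies (technical repeats). Values are illustrative only.
colony_sizes = {
    "repeat_1": [102.0, 98.0],
    "repeat_2": [110.0, 106.0],
    "repeat_3": [95.0, 101.0],
}


def summarise_repeats(sizes_by_repeat: dict[str, list[float]]) -> float:
    """Average technical repeats first, then average across biological repeats."""
    per_repeat_means = [mean(colonies) for colonies in sizes_by_repeat.values()]
    return mean(per_repeat_means)


print(summarise_repeats(colony_sizes))
```

      Averaging within each biological repeat first keeps a repeat with more colonies from dominating the strain-level value.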

      There is no mention in the manuscript of how other researchers can get access to the deletion strains and over-expression plasmids.

      Reply: As is usual, all strains and plasmids will be readily available upon request.

      Reviewer #1 (Significance (Required)):

      The production of lincRNA deletion strains and overexpression plasmids, and their analysis under an impressive number of conditions, provides key resources and data for the ncRNA field. This work complements nicely the analysis of protein coding gene deletion strains and provides the tools and data for future mechanistic studies of individual lincRNAs. This work would be of interest to the growing audience of ncRNA researchers in both yeast and other systems.

      Field of expertise:

      Yeast deletion strain construction and analysis, RNA functional analysis

      Referee Cross-commenting

      Reviewer #3 makes an important point that the stability of each lincRNA over expressed from plasmid is not known and therefore some lincRNAs may not be overexpressed as predicted. RT-qPCR would be required to assess lincRNA expression levels from the plasmids. It also appears that we both agree that it is important to determine the sequence of the cloned lincRNAs in the over expression plasmids.

      Reply: See reply in response to Reviewer 3.

      Reviewer #3 also makes an important point in his review that where it is predicted that a lincRNA deletion influences an adjacent gene in cis then the expression of that gene should be tested.

      Reply: See reply in response to Reviewer 3.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The Rodriguez-Lopez manuscript from the Bahler lab presents the phenotypical and functional profiling of lincRNAs in fission yeast. This is the first large-scale, extensive work of this nature in this model organism and it therefore nicely complements the well-documented examples of lincRNAs already reported in S. pombe.

      The work is very solid, using seamless genome deletion and overexpression followed by colony-based assays in response to a very wide set of conditions.

      Major comments:

      - considering that this is a descriptive work by nature and that the experiments were properly conducted as far as I can judge, I don't have major issues with this paper.

      To me the only thing that is missing is a gametogenesis assay, for two reasons: first, several reported lincRNAs in pombe critically regulate meiosis, and second, many of the analysed lincRNAs are upregulated during meiosis. Figure 6B already points to three obvious candidates. I don't think it would take too much time to look at the deletion and OE in an h90 strain and see the effect on gametogenesis for the entire set or at least the 3 candidates from Figure 6.

      If the already broad set of lincRNAs implicated in meiosis were to grow, this would be further evidence that eukaryotic cell differentiation relies on non-coding RNAs even in simpler models.

      Reply: We agree that this is a meaningful analysis to add. We have now deleted the three unstudied lincRNA genes, along with the meiRNA gene, from the sub-cluster of Figure 6B in the homothallic h90 background (to allow self-mating). We have analysed meiosis and spore viability of these four deletion strains together with a wild-type h90 control strain. These experiments indicate that cell mating is normal in the deletion mutants, but meiotic progression is somewhat delayed in SPNCRNA.1154, SPNCRNA.1530 and, most strongly, meiRNA mutants (the latter has been reported before; reviewed by Yamashita 2019). Notably, we detected significant reductions in spore viability for all four deletion mutants compared to the control strain. These results point to roles of SPNCRNA.1154, SPNCRNA.1530, and SPNCRNA.335 in meiotic differentiation, as predicted by the clustering analyses. This is a nice addition to the manuscript. We now report these results on p. 23, with a new Supplemental Figure 10, and describe the experimental procedures in the Methods (p. 34-35).

      **Minor comments:**

      - A reference to the recent work of the Rougemaille lab on mamRNA is necessary

      Reply: Yes, we now cite this reference in the Introduction (p. 4).

      - a discussion of the possibility to perform large-scale genetic interaction searches (as done by Krogan for protein-coding genes) would add to the discussion of future plans

      Reply: We have added a sentence about the potential of SGA screens in the Conclusions (p. 26).

      Reviewer #2 (Significance (Required)):

      The work unambiguously shows that most of the lincRNAs analyzed exert cellular functions in specific environmental or physiological contexts. This conclusion is critical because the biological relevance of this so-called « dark matter » is still debated despite a few well-established cases. This is an important addition to the field and the deep phenotyping work already points to some directions to analyse some of these lincRNAs in the context of cell cycle progression, metabolism or meiosis.

      **Referee Cross-commenting**

      - I agree with the issues raised by referees 1 and 3 but I am concerned about the added value of RT-qPCR. First, this is a significant amount of work considering the large set of targets. Second, and more importantly, what you'll end up with is a fold change. What will be considered as overexpression? Which threshold? This is why I prefer a biological read-out (a phenotype) because whatever the fold change, it tells us that there is an effect. It is very likely indeed that some targets are not overexpressed because of their rapid degradation. To me, this is the drawback of any large-scale study.

      - Also, looking at the expression of the adjacent gene in the case of a cis-effect is interesting though this is likely condition-dependent (because most phenotypes appear in specific conditions). So, what would be the conclusion if there is no effect in classical rich media?

      - The sequence of the insert should be specified, I agree. Most likely, it is the sequence available from PomBase (this is what I understood) but that should be clarified indeed.

      Reply: Yes, the sequences of the inserts are available from PomBase, and we provide the primer sequences used for cloning in the Supplemental Dataset 1. We have now clarified this in the Methods (p. 27).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this work from the group of Jurg Bahler, the authors take advantage of the high throughput colony-based screen approach they recently developed (Kamrad et al, eLife 2020) to perform a functional profiling analysis on a subset of 150 lincRNAs in fission yeast. Using a seamless CRISPR/Cas9-based method, they created deletion mutants for 141 lincRNAs. In addition, the authors also generated strains ectopically overexpressing 113 lincRNAs from a plasmid (under the control of the strong and inducible nmt1 promoter).

      The viability and growth of all these mutants was then assessed across benign, nutrient, drug and stress conditions (149 conditions for the deletion mutants, 47 conditions for the overexpression). For the deletion mutants, the authors also assayed in parallel mutants of 238 protein-coding genes (PCGs) covering multiple biological processes and main GO classes.

      In benign conditions, deletion of 5 and 10 lincRNAs resulted in a reduced growth phenotype (rich and minimal medium, respectively). Morphological characterization by microscopy also revealed cell size defects for 6 lincRNA mutants (2 shorter, 4 longer). In addition, 27 mutants displayed phenotypes pointing to defects in the cell cycle.

      Remarkably, the nutrient/drug/stress conditions revealed more phenotypes, with 60 of the 141 lincRNA mutants showing a growth phenotype in at least one condition, and 25 mutants showing a different viability compared to the wild-type in at least one condition.

      Also remarkable is the observation that 102/113 lincRNA overexpression strains displayed a growth phenotype in at least one condition, with 14 lincRNAs showing phenotypes in more than 10 conditions.

      The clustering analyses performed by the authors also provide functional insight for some lincRNAs.

      Overall, this is an important study, well conducted and well presented. Together, the data described by the authors are convincing and highlight that most lincRNAs would function in very particular conditions, and that deletion/inactivation and overexpression are complementary approaches for the functional characterization of lncRNAs. This has been demonstrated here, in a very elegant manner.

      I think this manuscript will be acknowledged as a pioneering work in the field.

      **A. Major comments**

      - A.1. Are the key conclusions convincing?

      In my opinion, the key conclusions of this study are convincing.

      - A.2. Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      No. The authors are careful in their claims and conclusions.

      - A.3. Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      This study is based on systematic lincRNA deletion/overexpression.

      - For the deletion strains, I could not find any information about the control of the deletions. Are the authors sure that the targeted lincRNAs were indeed properly deleted?

      Reply: Yes, we had carefully checked the correctness of the deletions using several controls as described by Rodriguez-Lopez et al. 2017. All deletion strains were checked for missing open-reading frames by PCR. For 20 strains, we also sequenced across the deletion scars. We re-checked all strains by PCR after arraying them onto the 384 plates to ensure that no errors occurred during the process. We have now specified this in the Methods (p. 27).

      - For the overexpression, there is only a control of the insert size by PCR. Sanger sequencing would have been preferable to confirm that the targeted lincRNAs were properly cloned, without any mutation. In addition, the authors did not check that the lincRNAs were indeed overexpressed (at least in the benign conditions). Is the overexpression fold similar for all the lincRNAs? Do the 14 lincRNAs showing the most consistent phenotypes in at least 10 conditions display different expression levels than the other lincRNAs?

      - A.4. Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      - Validating the deletion strains requires genomic DNA extraction and then PCR. This is repetitive and tedious, but this control is important, I think. The time needed depends on the possibility of automating the process. I think this is feasible in this lab.

      - Controlling the insert sequence into the overexpression vector requires plasmid DNA (available as it was used for PCR) and one/several primer(s), depending on the insert size. The sequencing itself is usually done by platforms.

      - Analysing lincRNA overexpression at the RNA level requires yeast cultures, RNA extraction and then RT-qPCR. Again, the time needed depends on the possibility of automating the process.

      Reply: We have now checked most overexpression constructs by Sanger sequencing of the inserts as described in response to Reviewer 1. Moreover, we have tested the overexpression levels for eight selected overexpression constructs using RT-qPCR analysis. These eight constructs feature the entire range of associated phenotype hits, including 3 lincRNAs with the highest number of phenotypes in 14 conditions, 3 with no phenotypes, and 2 with intermediate numbers of phenotypes. The RT-qPCR results show that the lincRNAs were 35- to 2200-fold overexpressed relative to the empty-vector control strain (which expresses the lincRNA at native levels). No clear pattern was evident between expression levels and phenotype hits, e.g. lincRNAs without phenotypes when overexpressed showed fold changes similar to those of a lincRNA showing 13 phenotypes. We present these results on p. 21/22 and in the new Supplemental Figure 9A, and describe the experiment in the Methods (p. 28).

      As pointed out by Reviewer 2, these fold changes in expression are actually of limited value compared to the phenotype read-outs. The important result is that we detected phenotypes for over 90% of the overexpression strains, indicating that overexpression generally worked. Given that this is a large-scale study, there might be some lincRNA constructs that are faulty or are not overexpressed. It would not be realistic or meaningful to test all constructs. Any follow-on studies focusing on specific lincRNAs will need to first validate the large-scale results, as is common practice.
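
      The fold changes quoted in the reply above can be made concrete with a small calculation. The sketch below assumes the widely used 2^-ΔΔCt method for relative RT-qPCR quantification; the reply does not state which quantification method was used, and the function name and Ct values are hypothetical illustrations rather than data from the study.

```python
def fold_change(ct_target_oe, ct_ref_oe, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt method (an assumption, not confirmed by the source)."""
    # Normalise the lincRNA signal to a reference gene in each strain
    d_ct_oe = ct_target_oe - ct_ref_oe        # overexpression strain
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl  # empty-vector control strain
    # Difference between strains; a negative value means higher expression
    dd_ct = d_ct_oe - d_ct_ctrl
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the lincRNA is detected 7 cycles earlier when overexpressed
example = fold_change(18.0, 16.0, 25.0, 16.0)
print(example)  # 128.0
```

      Under this assumption, a shift of about 5 to 11 cycles would correspond to the reported 35- to 2200-fold range. The calculation yields a number but no threshold for calling a construct overexpressed, which is the concern raised by Reviewer 2.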

      - A.5. Are the data and the methods presented in such a way that they can be reproduced?

      The methods are clearly and extensively explained. If necessary, the reader can find more details about the high-throughput colony-based screen approach in the original paper (Kamrad et al, eLife 2020); very interesting technical discussions can also be found in the reviewers' reports and in the authors' response published alongside.

      - A.6. Are the experiments adequately replicated and statistical analysis adequate?

      The experiments are replicated. However, I feel confused regarding the number of replicates used in each analysis.

      In the first part of the Results, it is mentioned that all colony-based phenotyping was performed in at least 3 independent replicates, with a median number of 9 repeats per lincRNA. In the Methods section, I read that for the high-throughput microscopy and flow cytometry for cell-size and cell-cycle phenotypes, over 80% of the 110 lincRNA mutants screened for cellular phenotypes were assayed in at least 2 independent biological repeats. For the overexpression, I read that each strain was represented by at least 12 colonies across 3 different plates and experiments were repeated at least 3 times. Each condition was assayed in three independent biological repeats, together with control EMM2 plates, resulting in at least 36 data points per strain per condition.

      Perhaps I missed something. If not, could the authors clarify this? In addition, I suggest indicating the number of replicates used for each lincRNA/condition/assay in Supplemental Dataset 2 (I could only find the information for the Flow Cytometry) and in Supplemental Dataset 6.

      Reply: For all colony-based phenotyping, we performed at least three biological repeats, meaning that the strains were assayed three times independently, each with at least two colonies (technical repeats). In most cases, we had two or more independently generated deletion strains for each lincRNA, and we performed at least three biological repeats for each strain (hence the higher median number of nine repeats per lincRNA). The numbers of independent deletion strains for each lincRNA are provided in Supplemental Dataset 1 (sheet: lincRNA_metadata, column: n_independent_ko_mutants). The total numbers of repeats carried out for each condition after QC filtering are available in Supplemental Dataset 2 (columns: observation_count). We have now clarified this on p. 6, and the details are provided in the Methods on p. 28-29 (for deletion mutants) and p. 32 (for overexpression mutants). For the high-throughput microscopy and flow cytometry experiments, we performed the repeats as described in the text.
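
      The replicate arithmetic described in this reply and in the reviewer's comment can be sketched as follows. The function name and the choice of three independent deletion strains are hypothetical, picked only to show how a median of nine repeats per lincRNA can arise; the actual per-lincRNA counts are those in the Supplemental Datasets.

```python
def min_observations(units, biological_repeats):
    """Lower bound on observations: units (strains or colonies) times biological repeats."""
    return units * biological_repeats


# Deletion mutants: e.g. 3 independent strains (hypothetical), each assayed in 3 biological repeats
deletion_repeats = min_observations(3, 3)

# Overexpression: at least 12 colonies per strain, 3 biological repeats per condition
overexpression_points = min_observations(12, 3)

print(deletion_repeats, overexpression_points)  # 9 36
```

      The two counts use different units: the nine deletion repeats count biological repeats pooled over independent strains, whereas the 36 overexpression data points count individual colonies.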

      **B. Minor comments**

      - B.1. Specific experimental issues that are easily addressable.

      - The pattern of the SPNCRNA.1343 and SPNCRNA.989 mutants is consistent with the idea that these lincRNAs act in cis and that their deletion interferes with the expression of the adjacent tgp1 and atd1 genes, respectively. The authors could easily test by RT-qPCR or Northern Blot that the lincRNA deletion leads to the induction of the adjacent gene. Also, if the hypothesis of the authors is correct, the ectopic expression of these two lincRNAs in trans should not complement the phenotypes of the corresponding mutants. These experiments would reinforce the conclusion of the authors about the specific regulatory effect of the SPNCRNA.1343 and SPNCRNA.989 lincRNAs.

      Reply: It would actually not be as easy as suggested to obtain conclusive results in this respect. For SPNCRNA.1343 and its neighbour, tgp1, the mechanisms involved have already been shown in detail based on several mechanistic studies (Ard et al., 2014; Ard and Allshire, 2016; Garg et al., 2018; Shah et al., 2014; Yague-Sanz et al., 2020). But these studies did require multiple precise genetic constructs and specialized approaches to interrogate the complex regulatory relationships between the overlapping transcripts, which can be both positive and negative. As correctly pointed out by Reviewer 2, we do not know the particular conditions where any cis-regulatory interactions take place, and a negative result would not be conclusive. We have interrogated our RNA-seq data obtained under multiple genetic and environmental conditions (Atkinson et al. 2018) to analyse the regulatory relationship between SPNCRNA.1343 and tgp1 (studied before) as well as SPNCRNA.989 and atd1 (proposed in our manuscript). Depending on the specific conditions, both of these gene pairs show positive or negative correlations in expression levels. So it is not possible to just perform the easy experiment as suggested to reach a clear conclusion.

      - Is there any possibility that some nutrient/drug/stress conditions interfere with the expression from the nmt1 promoter?

      Reply: This seems unlikely as this widely used promoter is known to be specifically regulated by thiamine. Consistent with this, we actually detected phenotypes for over 90% of the overexpression strains. But we cannot exclude the possibility that some conditions might interfere with nmt1 function.

      - Supplemental Figure 7 refers to unpublished data from Maria Rodriguez-Lopez. Is this still allowed?

      Reply: These are just control RNA-seq data from wild-type cells growing in rich medium. It does not seem that meaningful, but if required we could submit these data to the European Nucleotide Archive (ENA).

      - Supplemental Figure 8 shows drop assays to validate the growth phenotypes revealed by the screen for lincRNAs of clusters 1 and 3. As admitted by the authors in the text, in most cases, the effects are quite difficult to see with the naked eye. Did the authors consider using growth curves (for the lincRNAs/conditions they would like to highlight), which might be more appropriate to visualize weak effects?

      Reply: We have tried a few experiments in liquid medium using our BioLector microfermentor. However, the doses need to be substantially changed for liquid media (in which cells typically are more sensitive than on solid media). So the situation with the altered conditions would become too confusing and could not be used as a direct validation of our results from solid media.

      - B.2. Are prior studies referenced appropriately?

      Yes. The authors could have cited the work of Huber et al (2016) Cell Rep. (PMID: 27292640) as another pioneering study where systematic lncRNA deletion was performed, even if in this case, these were antisense lncRNAs.

      Reply: Agreed, we now cite this paper in the Introduction (p. 4).

      - B.3. Are the text and figures clear and accurate?

      Overall, I found the text and figures clear.

      Reviewer #3 (Significance (Required)):

      Eukaryotic genomes produce thousands of long non-coding RNAs, including lincRNAs which are expressed from intergenic regions and do not overlap PCGs. Several lincRNAs have been extensively studied and characterized, showing that they function in different cellular processes, such as regulation of gene expression, chromatin modification, etc. However, besides these well-documented lincRNAs, the function of most lincRNAs remains elusive. In addition, under the standard growth conditions used in labs, many of them are expressed at very low levels, and for the few cases for which it has been tested, the deletion and/or overexpression in trans often failed to produce a detectable phenotype.

      High throughput approaches for lncRNA functional profiling are currently emerging. The lab of Jurg Bahler recently developed a high throughput colony-based screen approach enabling them to quantitatively assay the growth and viability of fission yeast mutants under multiple conditions (Kamrad et al, eLife 2020). Here, they take advantage of this approach to characterize mutants of 150 lincRNAs in fission yeast, including not only deletion mutants generated using the CRISPR/Cas9 technology, but also overexpression mutants, tested in 149 and 47 growth conditions, respectively. This systematic approach allowed the authors to reveal specific phenotypes for a large fraction of the lincRNAs, emphasizing the fact that they are likely to be functional in particular nutrient/drug/stress conditions, acting in cis but also in trans.

      As I wrote in the summary above, I think that this study is important and constitutes a significant contribution to the lncRNA field.

      My field of expertise: long non-coding RNAs, yeast, genetics.

      **Referee Cross-commenting**

      I can see that reviewer #1 and I have raised the same concerns about the lack of insert sequencing for the overexpression plasmids, which is crucial to control that the correct lincRNAs were cloned and that no mutation has been introduced by the PCR. We are also both asking for RT-qPCR controls to show that the lincRNAs are indeed overexpressed. Again, this control is very important as many long non-coding RNAs are rapidly degraded by the nuclear and/or cytoplasmic RNA decay machineries. So expressing a lincRNA from a plasmid, under the control of a strong promoter, does not guarantee increased RNA levels.

      I see that reviewer #2 is asking for a gametogenesis assay. I think it should be limited to the 3 lincRNAs which belong to the same sub-cluster as meiRNA.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      In this work from the group of Jurg Bahler, the authors take advantage of the high throughput colony-based screen approach they recently developed (Kamrad et al, eLife 2020) to perform a functional profiling analysis on a subset of 150 lincRNAs in fission yeast. Using a seamless CRISPR/Cas9-based method, they created deletion mutants for 141 lincRNAs. In addition, the authors also generated strains ectopically overexpressing 113 lincRNAs from a plasmid (under the control of the strong and inducible nmt1 promoter).

      The viability and growth of all these mutants was then assessed across benign, nutrient, drug and stress conditions (149 conditions for the deletion mutants, 47 conditions for the overexpression). For the deletion mutants, the authors also assayed in parallel mutants of 238 protein-coding genes (PCGs) covering multiple biological processes and main GO classes. In benign conditions, deletion of 5 and 10 lincRNAs resulted in a reduced growth phenotype (rich and minimal medium, respectively). Morphological characterization by microscopy also revealed cell size defects for 6 lincRNA mutants (2 shorter, 4 longer). In addition, 27 mutants displayed phenotypes pointing to defects in the cell cycle.

      Remarkably, the nutrient/drug/stress conditions revealed more phenotypes, with 60 of the 141 lincRNA mutants showing a growth phenotype in at least one condition, and 25 mutants showing a different viability compared to the wild-type in at least one condition. Also remarkable is the observation that 102/113 lincRNA overexpression strains displayed a growth phenotype in at least one condition, with 14 lincRNAs showing phenotypes in more than 10 conditions.

      The clustering analyses performed by the authors also provide functional insight for some lincRNAs. Overall, this is an important study, well conducted and well presented. Together, the data described by the authors are convincing and highlight that most lincRNAs would function in very particular conditions, and that deletion/inactivation and overexpression are complementary approaches for the functional characterization of lncRNAs. This has been demonstrated here, in a very elegant manner. I think this manuscript will be acknowledged as a pioneering work in the field.

      A. Major comments

      • A.1. Are the key conclusions convincing? In my opinion, the key conclusions of this study are convincing.
      • A.2. Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? No. The authors are careful in their claims and conclusions.
      • A.3. Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      This study is based on systematic lincRNA deletion/overexpression.

      • For the deletion strains, I could not find any information about the control of the deletions. Are the authors sure that the targeted lincRNAs were indeed properly deleted?
      • For the overexpression, there is only a control of the insert size by PCR. Sanger sequencing would have been preferable to confirm that the targeted lincRNAs were properly cloned, without any mutation. In addition, the authors did not check that the lincRNAs were indeed overexpressed (at least in the benign conditions). Is the overexpression fold similar for all the lincRNAs? Do the 14 lincRNAs showing the most consistent phenotypes in at least 10 conditions display different expression levels than the other lincRNAs?
      • A.4. Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.
      • Validating the deletion strains requires genomic DNA extraction and then PCR. This is repetitive and tedious, but this control is important, I think. The time needed depends on the possibility of automating the process. I think this is feasible in this lab.
      • Controlling the insert sequence into the overexpression vector requires plasmid DNA (available as it was used for PCR) and one/several primer(s), depending on the insert size. The sequencing itself is usually done by platforms.
      • Analysing lincRNA overexpression at the RNA level requires yeast cultures, RNA extraction and then RT-qPCR. Again, the time needed depends on the possibility of automating the process.
      • A.5. Are the data and the methods presented in such a way that they can be reproduced? The methods are clearly and extensively explained. If necessary, the reader can find more details about the high-throughput colony-based screen approach in the original paper (Kamrad et al, eLife 2020); very interesting technical discussions can also be found in the reviewers' reports and in the authors' response published alongside.
      • A.6. Are the experiments adequately replicated and statistical analysis adequate? The experiments are replicated. However, I feel confused regarding the number of replicates used in each analysis.

      In the first part of the Results, it is mentioned that all colony-based phenotyping was performed in at least 3 independent replicates, with a median number of 9 repeats per lincRNA. In the Methods section, I read that for the high-throughput microscopy and flow cytometry for cell-size and cell-cycle phenotypes, over 80% of the 110 lincRNA mutants screened for cellular phenotypes were assayed in at least 2 independent biological repeats. For the overexpression, I read that each strain was represented by at least 12 colonies across 3 different plates and experiments were repeated at least 3 times. Each condition was assayed in three independent biological repeats, together with control EMM2 plates, resulting in at least 36 data points per strain per condition.

      Perhaps I missed something. If not, could the authors clarify this? In addition, I suggest indicating the number of replicates used for each lincRNA/condition/assay in Supplemental Dataset 2 (I could only find the information for the Flow Cytometry) and in Supplemental Dataset 6.

      B. Minor comments

      • B.1. Specific experimental issues that are easily addressable.
      • The pattern of the SPNCRNA.1343 and SPNCRNA.989 mutants is consistent with the idea that these lincRNAs act in cis and that their deletion interferes with the expression of the adjacent tgp1 and atd1 genes, respectively. The authors could easily test by RT-qPCR or Northern Blot that the lincRNA deletion leads to the induction of the adjacent gene. Also, if the hypothesis of the authors is correct, the ectopic expression of these two lincRNAs in trans should not complement the phenotypes of the corresponding mutants. These experiments would reinforce the conclusion of the authors about the specific regulatory effect of the SPNCRNA.1343 and SPNCRNA.989 lincRNAs.
      • Is there any possibility that some nutrient/drug/stress conditions interfere with the expression from the nmt1 promoter?
      • Supplemental Figure 7 refers to unpublished data from Maria Rodriguez-Lopez. Is this still allowed?
      • Supplemental Figure 8 shows drop assays to validate the growth phenotypes revealed by the screen for lincRNAs of clusters 1 and 3. As admitted by the authors in the text, in most cases, the effects are quite difficult to see with the naked eye. Did the authors consider using growth curves (for the lincRNAs/conditions they would like to highlight), which might be more appropriate to visualize weak effects?
      • B.2. Are prior studies referenced appropriately? Yes. The authors could have cited the work of Huber et al (2016) Cell Rep. (PMID: 27292640) as another pioneering study where systematic lncRNA deletion was performed, even if in this case, these were antisense lncRNAs.
      • B.3. Are the text and figures clear and accurate? Overall, I found the text and figures clear.

      Significance

      Eukaryotic genomes produce thousands of long non-coding RNAs, including lincRNAs which are expressed from intergenic regions and do not overlap PCGs. Several lincRNAs have been extensively studied and characterized, showing that they function in different cellular processes, such as regulation of gene expression, chromatin modification, etc. However, besides these well-documented lincRNAs, the function of most lincRNAs remains elusive. In addition, under the standard growth conditions used in labs, many of them are expressed at very low levels, and for the few cases for which it has been tested, the deletion and/or overexpression in trans often failed to produce a detectable phenotype.

      High throughput approaches for lncRNA functional profiling are currently emerging. The lab of Jurg Bahler recently developed a high throughput colony-based screen approach enabling them to quantitatively assay the growth and viability of fission yeast mutants under multiple conditions (Kamrad et al, eLife 2020). Here, they take advantage of this approach to characterize mutants of 150 lincRNAs in fission yeast, including not only deletion mutants generated using the CRISPR/Cas9 technology, but also overexpression mutants, tested in 149 and 47 growth conditions, respectively. This systematic approach allowed the authors to reveal specific phenotypes for a large fraction of the lincRNAs, emphasizing the fact that they are likely to be functional in particular nutrient/drug/stress conditions, acting in cis but also in trans. As I wrote in the summary above, I think that this study is important and constitutes a significant contribution to the lncRNA field.

      My field of expertise: long non-coding RNAs, yeast, genetics.

      Referee Cross-commenting

      I can see that reviewer #1 and I have raised the same concerns about the lack of insert sequencing for the overexpression plasmids, which is crucial to control that the correct lincRNAs were cloned and that no mutation has been introduced by the PCR. We are also both asking for RT-qPCR controls to show that the lincRNAs are indeed overexpressed. Again, this control is very important as many long non-coding RNAs are rapidly degraded by the nuclear and/or cytoplasmic RNA decay machineries. So expressing a lincRNA from a plasmid, under the control of a strong promoter, does not guarantee increased RNA levels.

      I see that reviewer #2 is asking for a gametogenesis assay. I think it should be limited to the 3 lincRNAs which belong to the same sub-cluster as meiRNA.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The Rodriguez-Lopez manuscript from the Bahler lab presents the phenotypic and functional profiling of lincRNAs in fission yeast. This is the first large-scale, extensive work of this nature in this model organism and it therefore nicely complements the well-documented examples of lincRNAs already reported in S. pombe.

      The work is very solid, using seamless genome deletion and overexpression followed by colony-based assays in response to a very wide set of conditions.

      Major comments:

      • Considering that this is a descriptive work by nature and that the experiments were properly conducted as far as I can judge, I don't have major issues with this paper. To me the only thing that is missing is a gametogenesis assay, for two reasons: first, several reported lincRNAs in pombe critically regulate meiosis, and second, many of the analysed lincRNAs are upregulated during meiosis. Figure 6B already points to three obvious candidates. I don't think it would take too much time to look at the deletion and OE in an h90 strain and see the effect on gametogenesis for the entire set or at least the 3 candidates from Figure 6. If the already broad set of lincRNAs implicated in meiosis were to grow, this would be further evidence that eukaryotic cell differentiation relies on non-coding RNAs even in simpler models.

      Minor comments:

      • A reference to the recent work of the Rougemaille lab on mamRNA is necessary
      • a discussion of the possibility to perform large-scale genetic interaction searches (as done by Krogan for protein-coding genes) would add to the discussion of future plans

      Significance

      The work unambiguously shows that most of the lincRNAs analyzed exert cellular functions in specific environmental or physiological contexts. This conclusion is critical because the biological relevance of this so-called « dark matter » is still debated despite a few well-established cases. This is an important addition to the field and the deep phenotyping work already points to some directions to analyse some of these lincRNAs in the context of cell cycle progression, metabolism or meiosis.

      Referee Cross-commenting

      • I agree with the issues raised by referees 1 and 3 but I am concerned about the added value of RT-qPCR. First, this is a significant amount of work considering the large set of targets. Second, and more importantly, what you'll end up with is a fold change. What will be considered as overexpression? Which threshold? This is why I prefer a biological read-out (a phenotype) because whatever the fold change, it tells us that there is an effect. It is very likely indeed that some targets are not overexpressed because of their rapid degradation. To me, this is the drawback of any large-scale study.
      • Also, looking at the expression of the adjacent gene in the case of a cis-effect is interesting though this is likely condition-dependent (because most phenotypes appear in specific conditions). So, what would be the conclusion if there is no effect in classical rich media?
      • The sequence of the insert should be specified, I agree. Most likely, it is the sequence available from pombase (this is what I understood) but that should be clarified indeed.
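
      The fold-change concern raised above can be made concrete with a minimal sketch of the commonly used 2^-ΔΔCt calculation. The Ct values, the function names and both thresholds below are hypothetical, chosen only for illustration, and the calculation assumes roughly 100% primer efficiency; nothing here comes from the manuscript.

```python
def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt method (assumes ~100% primer efficiency)."""
    d_ct_test = ct_target_test - ct_ref_test   # target normalised to reference gene, test strain
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl   # same normalisation, control strain
    dd_ct = d_ct_test - d_ct_ctrl
    return 2.0 ** (-dd_ct)


def is_overexpressed(fc, threshold=2.0):
    """Call overexpression above a cut-off; the cut-off itself is an arbitrary choice."""
    return fc >= threshold


# Hypothetical Ct values for one lincRNA: plasmid strain vs empty-vector control.
fc = fold_change(ct_target_test=22.0, ct_ref_test=18.0,
                 ct_target_ctrl=25.0, ct_ref_ctrl=18.0)
print(fc, is_overexpressed(fc))

# The same modest fold change is classified differently under two cut-offs.
print(is_overexpressed(1.5, threshold=2.0), is_overexpressed(1.5, threshold=1.2))
```

      The last line shows the referee's point: the classification of a borderline target depends entirely on the chosen threshold.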
    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Rodriguez-Lopez et al describes the analysis of long intergenic non-coding RNA (lincRNA) function in fission yeast using both deletion and overexpression methods. The manuscript is very well presented and provides a wealth of lincRNA functional information for the field. This work is an important advance as there is still very little known about the function of lincRNAs in both normal and other conditions. An impressive array of conditions were assessed here. With a large scale analysis like this there is really not one specific conclusion. The authors conclude that lincRNAs exert their function in specific environmental or physiological conditions. This conclusion is not a novel conclusion, it has been proposed and shown before, but this manuscript provides the experimental proof of this concept on a large scale.

      The lincRNA knock-out library was assessed using a colony size screen, a colony viability screen and cell size and cell cycle analysis. Additionally, a lincRNA over-expression library was assessed by a colony size screen. These different functional analysis methods for lincRNAs were then carried out in a wide variety of conditions to provide a very large dataset for analysis. Overall, the presentation and analysis of the data was easy to follow and informative. Some points below could be addressed to improve the manuscript.

      There were 238 protein coding gene mutants assessed in parallel, to provide functional context, which was a very promising idea. But, unfortunately, the inclusion of 104 protein coding genes of unknown function restricted the use of the protein coding genes in the integrated analysis to connect lincRNAs to a known function using guilt by association.

      The colony viability screen is not described well throughout the manuscript. Firstly, the use of phloxine B dye to determine cell viability needs to be described better when first introduced at the bottom of page 6. What exactly is this viability screen and red colour intensity indicating? Please define what the different levels of red a colony would indicate as far as viability. I assume an increase in red colour indicates more dead cells? So it is confusing that later the output of this assay is described as giving a resistant/sensitive phenotype or higher/lower viability. How can you get a higher viability from an assay that should only detect lower viability? Shouldn't this assay range from viable (no, or low red, colour) to increasing amounts of red indicating increasingly less viability? Figure 4D is also confusing with the "red" and "white" annotations. These should be changed to "lower viability" and "viable" or "not viable" and "viable".

      How are you sure that when generating the 113 lincRNA ectopic over-expression constructs by PCR that the sequences you cloned are correct? Simply checking for "correct insert size", as stated in the methods, is not really good practice and these constructs should be fully sequenced to be sure they contain the correct sequence and that constructs have not had mutations introduced by the PCR used for cloning. Without such sequence confirmation one cannot be completely confident that the data produced is specific for a lincRNA over-expression. Additionally, a selection of strains with the overexpression constructs should be tested by qRT-PCR and compared to a non-over-expressing strain to confirm lincRNA overexpression.

      Minor comments:

      Page 4, lines 19-20 - "A substantial portion of lincRNAs are actively translated (Duncan and Mata, 2014), raising the possibility that some of them act as small proteins." This sentence does not make sense, lincRNAs can't "act as" small proteins, they can only "code for" small proteins. Wording needs to be changed here.

      Figure 1A is a nice representation but what are the grey dots? Are they all ncRNAs including lincRNAs? This needs to be stated in the legend.

      How many lincRNAs are there in total in pombe and what percentage did you delete? These numbers should be stated in the text.

      It would be nice if Supplementary Figure 1 included concentrations or amounts of the conditions used. This info is buried in a Supplementary table and would be better placed here.

      Page 6, last sentence. What is a "biological repeat"? Three distinct deletion strains (ie three different deletion strains made by CRISPR) or one deletion strain used three times?

      There is no mention in the manuscript of how other researchers can get access to the deletion strains and over-expression plasmids.

      Significance

      The production of lincRNA deletion strains and overexpression plasmids, and their analysis under an impressive number of conditions, provides key resources and data for the ncRNA field. This work complements nicely the analysis of protein coding gene deletion strains and provides the tools and data for future mechanistic studies of individual lincRNAs. This work would be of interest to the growing audience of ncRNA researchers in both yeast and other systems.

      Field of expertise: Yeast deletion strain construction and analysis, RNA functional analysis

      Referee Cross-commenting

      Reviewer #3 makes an important point that the stability of each lincRNA over expressed from plasmid is not known and therefore some lincRNAs may not be overexpressed as predicted. RT-qPCR would be required to assess lincRNA expression levels from the plasmids. It also appears that we both agree that it is important to determine the sequence of the cloned lincRNAs in the over expression plasmids.

      Reviewer #3 also makes an important point in his review that where it is predicted that a lincRNA deletion influences an adjacent gene in cis then the expression of that gene should be tested.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank all reviewers for their thorough assessment and constructive comments.

      For clarity, their comments have been numbered.

      Reviewer #1

      Evidence, reproducibility and clarity:

      Summary:

      Acetylation/deacetylation controls the G1/S transition in budding yeast. The lysine acetyltransferase Esa1 is here shown to play a role, in part via acetylation of the nuclear pore complex basket component Nup60, which stimulates mRNA export.

      Major comments:

      1 • Figure 1C: The curve for esa1-ts in this figure and the curve in the supplementary figure S2B are not similar, while the first shows 10% cells budding after 60 minutes it is about 50% after 60 min in S2B. Another helpful way of presenting the data could be the length of the G1 phase (from cytokinesis to budding) in the WT, esa1-ts, gcn5delta cells over time.

      We thank the reviewer for pointing this out. Indeed, there is some day-to-day variability in the budding kinetics of the temperature-sensitive esa1 mutant, and the text referred to one individual experiment. Therefore, we have changed the text to better reflect the observed variability (p. 7) and added a graph (supplementary Figure S2C) including all individual replicates. This shows that in spite of small differences between experiments, esa1-ts cells always bud slower and less efficiently than wild-type cells. We note that the data cannot be shown in the way suggested (time from cytokinesis to budding, presumably from individual cells) because cells in these experiments were released from a G1 block (after cytokinesis), and samples from cell cultures were imaged at time intervals (and not single cells over time). Time-lapse data of single cells is shown in figure 2E.

      2 • What is the rationale for creating the Nup60-KN mutation? Does it prevent acetylation of Nup60, at least by GCN5 and/or esa1?

      The biophysical properties of asparagine resemble those of acetylated lysine. Therefore, the Nup60-KN mutant (lysine 467 to asparagine) is expected to mimic acetylation of Nup60 K467, which was found to be acetylated in earlier studies. Supporting the conclusion that Nup60-KN is indeed an acetyl-mimic, the nup60-KN mutation partially rescues the Start and mRNA export defects of Esa1-deficient cells. We make the rationale of the Nup60-KN mutation clearer in the current version (p. 8).

      3 • Given the much stronger phenotype of the esa1-ts+GCN5 delta condition for G1/S transition as compared to esa1-ts, and that GCN5 seems to strongly acetylate Nup60, I do not understand the sole focus on esa1 in the study. The fact that the Nup60-KN cells do not show G1/S transition under esa1-ts+GCN5 delta conditions in experiments presented in Fig. S3 argues that esa1-mediated acetylation of Nup60 is only one, probably minor, aspect of G1/S transition. This should be discussed in a much more balanced way.

      We focus on Esa1 because this allows us to dissect the specific role of Nup60 acetylation and mRNA export during the G1/S transition. Of course, Esa1-dependent acetylation of Nup60 is not the only process controlling the G1/S transition, which is regulated at several levels. For example, the concentration of multiple Start activators and inhibitors scales differentially with cell size (PMID: 26390151, 32246903). In addition, daughter-specific factors inhibit Start through a pathway parallel to Nup60 deacetylation (Ace2/Ash1-dependent repression of Cln3 transcription; PMID: 19841732, 19841732). We discuss these studies in the current version (p. 17).

      As for the relative contribution of Esa1 and Gcn5 to the G1/S transition and mRNA export: both of these KATs have overlapping roles in promoting transcription, probably through distinct substrates (such as histone H2 for Gcn5, H4 for Esa1) and this may contribute to their role in Start. Consistent with this, deletion of GCN5 causes a minor delay in transcription of G1/S genes (Kishkevich, Sci. Rep 2019). On the other hand, gcn5 mutants have no detectable mRNA export defects, unlike esa1-ts (our Figure 3E). This suggests that whereas Gcn5 and Esa1 may have overlapping roles in transcription of G1/S genes, Esa1 is more specifically involved in mRNA export. The ability of Nup60-KN to rescue the single mutant esa1 but not the double gcn5 esa1 is consistent with this view: the transcription defects in the double mutant may be so severe as to prevent Start even in the presence of Nup60-KN. We have modified the discussion to mention these points. In addition, we will investigate the transcription defects of esa1 and gcn5 single and double mutants to test this possibility and include the results in a revised version.

      4 • Suppl: Fig 2: I miss the hat1delta+gcn5delta condition.

      We will include the budding index of the hat1 gcn5 double mutant in a revised version.

      Minor comments:

      5 • Figure legend 2C "at least 200 cells were scored": please state number of replicates

      Figure 2C shows RT-qPCR data. The reviewer probably means figure 1C, which shows the budding index of one experiment comparing wild type, esa1, gcn5 and esa1 gcn5 strains. This experiment was repeated 3 times, as is now mentioned in the figure 1 legend.

      6 • Figure 2E: X axis "impor" should be corrected to "import"

      We have corrected this.

      7 • Would Mex67 and/or Mtr2 overexpression rescue the esa1-ts and esa1-ts+GCN5 delta phenotype?

      We will include this experiment in a revised version.

      8 • Figure 4A: The size of the daughter cells in the hos3delta condition seems smaller as compared to esa1-ts. Is this true and can you comment on this? Is a premature onset of S phase observed here?

      Since Fig 4A features only wild type and hos3∆ cells, the reviewer is probably referring to esa1-ts cells shown in figure 4B. These two figure panels are not directly comparable: cells in 4A are freely cycling, whereas those in 4B were released from a mitotic arrest using nocodazole. The mitotic arrest was done in order to avoid potentially confounding effects due to inactivation of Esa1 during S phase. However, the arrest also causes daughter cells to grow larger, explaining the size differences pointed out by the reviewer. That being said, it is true that cell size and G1 duration are intimately linked and thus the reviewer's question raises a relevant point. We previously showed that although hos3 daughter cells enter S phase prematurely, their size is not significantly different from wild type (Kumar et al., Figure 1d-g). Premature onset of S phase can lead to smaller cell size but this is not the case for hos3 cells, probably due to the slightly faster growth rate of the hos3∆ mutant relative to wild type specifically during S/G2/M phases (Kumar et al., Supplementary Fig. 1b).

      9 • Figure 4D: The still images in figures 2E and 4D do not correspond with the quantitation. E.g. in Fig 2E the esa1-ts cell shows Whi5 export at t=81 min, which is, according to the shown quantitation, unusually late.

      We will modify Figures 2E-4D in a revised version to include cells that export Whi5 at times closer to the median.

      10 • Figure 4B: it is not clear why for the quantitation a different representation is chosen as compared to 4A. It would be better to show the nuclear intensities of mother/daughter as in Figure 4A.

      The reason for the different representation between figures 4A and 4B is that 4A depicts freely cycling cells and in 4B, cells were released from a nocodazole-induced mitotic arrest (as mentioned in our response to point 8). A mitotic arrest perturbs M/D size asymmetries, as daughter cells (but not mothers) continue growing during the arrest, leading to larger nuclear size. In addition, esa1-ts daughters are smaller than wt daughters in this condition, further complicating M/D asymmetries. We thought that in this case, a better metric for protein association with the NPC is the fluorescence intensity relative to a nuclear pore component. We agree that using different types of graphs is confusing, and therefore we have removed M/D comparisons from figure 4A and now represent these data as in figure 4B: the intensity of Sac3 relative to Nup49. Finally, a good control for these experiments is the quantification of total protein levels, which we have added for Sac3. We have also removed Mtr2-GFP data until our analysis of Mtr2 total levels is complete. We hope this simplifies this figure.

      11 • Figure 4D: To strengthen these results, it would be good to perform this assay with esa1-ts Nup60-KN cells as in figure 2a. The release of Whi5-GFP is expected to behave in a similar way to the WT. This would ensure that Nup60 acetylation is a pre-requisite for Whi5 release

      I’m afraid we don't understand this suggestion. Figure 4D shows time-lapse fluorescence microscopy of Whi5 nuclear export when Sac3 is recruited to the nuclear basket. Figure 2a shows western blots of Nup60 acetylation status. Therefore it is not clear how these two assays could be done in similar ways. Perhaps the reviewer refers to a different figure panel. The purpose of the suggested experiment, if we understand properly, is to test whether Nup60 acetylation is required for Whi5 export. This is the hypothesis tested in figure 2D: Whi5-GFP export is delayed in esa1-ts, and this delay is partially rescued in esa1-ts nup60-KN, which mimics acetylation. In fact, the advance in Whi5 export observed in Figure 4D upon Sac3 anchoring to NPC is similar to that observed in a nup60-KN (Figure 2E).

      12 • Page 13 "Finally, we tested whether Esa1 targets Sac3 to G1 nuclei": The effect of esa1 knockdown on Sac3 fits with the storyline and the effect that esa1 has on mRNA export. However, stating that esa1 targets Sac3, which is part of a bigger complex, is misleading, given that you do not show proof of direct interactions, e.g. by immunoprecipitations.

      We meant to say “we tested whether Esa1 function promotes the localisation of Sac3 to the nuclear basket”. We agree that it is unknown whether this involves direct interactions between Sac3 and Esa1. We have changed the text to make this point clearer.

      13 • Page 18: "Nevertheless, our findings suggest that mammalian nucleoporins may represent a novel category of substrates for KATs and for the multiprotein complexes in which these enzymes reside, with important roles in gene expression." Given that there is little experimental evidence this statement is for my taste too strong. Rather indicate that this is a possibility which needs to be tested...

      We have changed the text as suggested.

      14 • Page 3: "Nuclear pores are macromolecular assemblies composed of approximately 30-50 different Nucleoporins": it is rather approximately 30 different nucleoporins in the species so far analyzed.

      We have corrected this as suggested.

      Significance:

      The concept of acetylation/deacetylation regulation of G1/S transition in budding yeast is very appealing. The specific (and important) contribution of Esa1, especially in comparison to GCN5 and Hat1 remains unclear as well as its precise effect on Nup60. Clarifying this, also in a more balanced way of presentation of discussion, would be of interest for the field.

      My research centers around NPC function.

      Audience: experts in the nuclear structure/function fields and cell cycle regulation.

      A more detailed characterisation of the specific roles of Esa1, Gcn5 and Hat1 in the G1/S transition and mRNA export will be included in a revised version, as mentioned in our response to point 3.

      Reviewer #2

      Evidence, reproducibility and clarity:

      In this manuscript, Gomar-Alba et al. follow up on previous work from the lab that showed that the KDAC Hos3 is targeted to the bud neck and daughter cell nuclear pore complexes in budding yeast where it slows cell cycle progression by influencing gene positioning and nucleo-cytoplasmic transport. Overall, the current manuscript describes a well-conducted study that dissects the role of acetylation and deacetylation on Nup60 during the cell cycle using genetics and microscopy. The authors conclusively identify Esa1 as counteracting Hos3 in the nucleus (Figure 1) and show that part of their effect on cell cycle progression and gene expression is mediated by acetylation of Nup60 at K467 (Figure 2). They also demonstrate that this leads to a differential localization of several mRNA export factors and suggest that deacetylation of Nup60 blocks mRNA export in daughter cells. Although this work is overall carefully done, the last conclusion is still somewhat speculative.

      I have a number of minor suggestions to improve the manuscript, but only one major concern, which revolves around the role of chromatin tethering to NPCs. The authors have shown in their previous paper that this plays a role for CLN2 and it is known that active GAL1 interacts with the nuclear periphery, but in the current manuscript this aspect is largely disregarded although I think it could play a major role in the observed mRNA export phenotypes. Therefore, I think some additional experiments and controls as well as additional analysis are required to substantiate especially the results shown in figure 5.

      Major points:

      1) Figure 2: The authors claim that the mechanism by which Nup60 acetylation promotes cell cycle progression is the enhancement of mRNA export through the NPC. In Figure 2, the authors look at the expression levels of four candidate mRNAs which all show disturbed expression in esa1-ts which is not rescued by the nup60-KN mutation, but expression of the protein of one of these candidates (CLN2) is improved. In their previous paper, the same lab has shown that the CLN2 gene is tethered to the NPC in daughter cells with deacetylated Nup60 and that this is relieved in a Nup60 K467N mutant. I think it would be important here to investigate the protein levels of additional candidates that are not regulated at the level of gene localization. Is it a general effect that protein expression is higher in the nup60KN mutant?

      We agree this is an important point. To establish if Nup60-KN regulates only genes that interact with the NPC (such as CLN2), the reviewer suggests determining the cell cycle levels of proteins encoded by other G1/S genes that do not bind NPCs. The main problem with this approach is that with the exception of CLN2, the nuclear localisation of the (about 200) G1/S regulon genes is not yet known. In addition, establishing connections between mRNA and protein levels during the first cell cycle is only possible for short-lived proteins such as Cln2. For instance, amongst the G1/S genes shown in Figure 2, Cdc21 and Rnr1 have protein half-lives of 10 and 4 h, much longer than the 90-minute yeast cell cycle (PMID 25466257). We think a more direct approach to investigate the connection between gene position and mRNA synthesis / export would be to directly visualise the localisation of single mRNAs upon perturbation of the Nup60 acetylation pathway, using single mRNA labeling techniques (smFISH or PP7). We aim to do this for CLN2 and also for GAL1 (see point 2d of this reviewer). We will attempt these experiments for a revised version of our paper.
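
      The half-life argument above can be checked with a short back-of-the-envelope sketch. It assumes simple first-order decay with no new synthesis or dilution by growth (a deliberate simplification), and that the 10 h and 4 h half-lives refer to Cdc21 and Rnr1 respectively; the function name is ours.

```python
def fraction_remaining(elapsed_min, half_life_min):
    """Fraction of a protein pool left after first-order decay for elapsed_min minutes."""
    return 0.5 ** (elapsed_min / half_life_min)


CELL_CYCLE_MIN = 90  # yeast cell cycle length quoted in the response

# Half-lives in hours as quoted in the response (assumed to be in the order listed).
for name, half_life_h in [("Cdc21", 10), ("Rnr1", 4)]:
    frac = fraction_remaining(CELL_CYCLE_MIN, half_life_h * 60)
    print(f"{name}: about {frac:.0%} of pre-existing protein remains after one cycle")
```

      Under these assumptions, most of the pre-existing pool of either protein survives a full cell cycle, so changes in mRNA level within one cycle would be hard to read out at the protein level.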

      2) Figure 5: In figure 5, the authors investigate the expression of a different inducible RNA (GAL1) to test whether the observed effect on mRNA export is more general. Since this is a crucial point for generalizing the finding, this data needs to be presented in a more convincing manner.

      2a. GAL1 is known to be tethered to the NPC upon transcription. Whether this tethering is affected by the Nup60-KN mutant is unclear, but since Nup60 has been implicated in GAL1 tethering in the literature, this possibility is not unlikely. GAL1 therefore becomes a similar case to CLN2, where it is difficult to disentangle effects directly due to mRNA export from the effects of gene tethering on mRNA transcription and processing. Therefore, this experiment should be repeated with a system that is independent of gene tethering. For example, induction of the GAL promoter via a β-estradiol inducible VP16 transactivator does not seem to induce tethering.

      This is an excellent idea. We are not aware of studies on the localisation of the GAL1 locus induced by a VP16 transactivator, but this was investigated for the HXK1 gene. This subtelomeric gene localises to NPCs in non-glucose carbon sources, and its localisation is perturbed by VP16 transactivation in glucose (PMID: 16760983). We will investigate whether the same is true for GAL1, and if so, perform the suggested experiments.

      2b. The activation kinetics in all mutants analyzed is very different from the wildtype. Therefore, the quantification made in Figure 5C is difficult to interpret. Therefore, it might be more fair to quantify for the mutant strains at an earlier timepoint after activation when the levels are similar to the levels in the wildtype strain. E.g. in the hos3d strain at around 250 min.

      This is a good point - indeed, persistent mother/daughter asymmetry in GAL1 expression in hos3 and nup60-KN mutants could be masked by saturated levels of GFP at late time points. An alternative way to test this is to determine the time of GAL1 induction in mother and daughter cells. We have done this in wild-type and hos3 mutant cells; our results indicate that GAL1 expression occurs first in wild-type mothers and later in their daughters, whereas it is almost simultaneous in nup60-KN mother/daughter mutant pairs (as shown for a single M-D pair in the new figure 5A). In a revised version, we will include data of GAL1 expression for M-D pairs at different times after galactose addition for cells in figures 5C and 5E.

      2c. Similarly - although not as drastic - , in figure 5E, quantification should be done at a timepoint when the induction level is similar between DMSO and Rapamycin treated samples to make conclusions about differences between mother and daughter cell.

      We agree. See our response to the previous point.

      2d. The major claim of the paper is that mRNA export is inhibited by Nup60 deacetylation. In this figure, the mRNA levels need to be quantified to validate that it is not transcription that is affecting expression.

      We agree. In addition to regulating mRNA export (as suggested by the effect of Sac3 anchoring to NPCs) Nup60 deacetylation may also inhibit GAL1 transcription (directly, and/or indirectly via disruption of Gal1-based transcriptional feedback; PMID 23150580). To directly assess the role of Nup60 acetylation in GAL1 transcription and mRNA export, it would be ideal to determine the levels of GAL1 mRNA in both the nucleus and the cytoplasm, using smFISH and/or PP7 tools, in wild type and in mutants of the Nup60 acetylation pathway as we proposed to do for CLN2 (see our response to point 1 of this reviewer). These or equivalent experiments will be included in a revised version.

      3) The manuscript investigates in detail the effects of a KN mutant, however, a non-acetylatable mutant is not investigated. Is such a mutant viable?

      We have obtained a Nup60-KR mutant, which is predicted to behave as a non-acetylatable mimic, and it is viable. We will describe its phenotype in a revised version.

      Minor comments:

      4) Figure 2E: Is the rescue really specific to daughter cells? The dynamic range in the daughter cells is much higher due to the slower and more heterogeneous timepoint of Whi5 export. However, zooming in on the early timepoints after Whi5 import, before the 30 min when 50% of the cells have exported Whi5, might reveal a significant increase of mother cells with shortened time to S phase entry. I suggest that the authors test this possibility. The cells shown in the image panels also suggest that the acetyl mimic might shorten mother cell time to S phase entry. If this is not the case, the authors might want to show a different example cell. Interestingly, it appears from the supplementary figure S5 that while Nup60 K467N partially rescues the export of Whi5, budding does not seem to be different from Nup60 wt. This appears to contradict the budding after alpha factor arrest shown in figure 2.

      We thank the reviewer for this suggestion. Indeed, zooming into the first 30 minutes shows a slight increase in the fraction of nup60-KN mother cells that export Whi5; however this change is not statistically significant when considering the entire cell population (p=0.6017, Mann-Whitney test). Therefore, we will replace the cell shown in figure 2E with a more representative example.

      As for figure S5, the reviewer is correct that in these experiments nup60-KN partially rescues Whi5 export (a marker of Start) but not budding (a downstream event), and this is indeed at variance with the experiment shown in figure 2B. Different experimental conditions may contribute to this apparent discrepancy: as noted in the text, the duration of G1 phase in cells synchronised with alpha factor is not directly comparable with that of freely cycling cells.

      5) Figure 3C: The authors use a truncated version of SAC3 for overexpression, since the full length is toxic (Figure S6A). I think it would be important to include this information in the main text.

      We agree, and have included this information in the main text.

      6) Figure 4B: Is there simply less Sac3 protein in the esa1-ts mutant? Although the authors address this question in figure S9, the very low expression levels of Sac3 may make this difficult to conclude from fluorescence quantification. A Western Blot would be an important control. The relative level of Sac3 still seems to be lower in esa1-ts daughter cells compared to mother cells, but no statistical test is shown.

      We are confident that the total Sac3-GFP levels are sufficient to make accurate comparisons, in both the nucleus and the entire cell. However, we will be happy to include western blot controls for Sac3 total levels in a revised version as the reviewer suggests. As for the levels of Sac3 in M vs D cells: Sac3 is indeed asymmetrically distributed in both wild-type and esa1-ts cells (p

      7) Analysis of mother daughter pairs (e.g. figure 5C): a paired t-test would be appropriate.

      We agree. Results do not change with this new analysis (in fact, p values are even lower for wild-type M-D pairs in figure 5C).
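
      As an illustration of why pairing matters for mother-daughter data, here is a minimal sketch of the paired t statistic. The intensity values are hypothetical and are not taken from figure 5C; the function name is ours. Only the t statistic and degrees of freedom are computed; a p-value would additionally need the t distribution (for instance via `scipy.stats.ttest_rel`, if SciPy is available).

```python
import math
from statistics import mean, stdev


def paired_t(xs, ys):
    """Paired t statistic and degrees of freedom for matched samples xs[i], ys[i]."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least two pairs")
    diffs = [x - y for x, y in zip(xs, ys)]   # the test works on within-pair differences
    sd = stdev(diffs)                          # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical GFP intensities (arbitrary units) for four mother-daughter pairs.
mothers = [10.0, 12.0, 9.0, 11.0]
daughters = [8.0, 9.0, 8.0, 8.0]

t, df = paired_t(mothers, daughters)
print(f"t = {t:.2f}, df = {df}")
```

      Because the statistic uses within-pair differences, variation between pairs does not inflate the error term, which is consistent with the lower p values the authors report after switching to the paired analysis.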

      8) Figure 5A: Can some representative mother-daughter pairs be shown as images for both wt and mutant in the timelapse? It is difficult to see in 5A whether there are any mother daughter pairs.

      We have modified the figure to include clearly identifiable mother-daughter pairs, as requested.

      9) Figure 4C: Please show image of localization of Sac3-GFP-FRB +/- rapamycin to the NPC.

      We have added this.

      Significance:

      This manuscript describes an important advance in understanding the role of non-histone protein modification on the regulation of cell cycle progression and gene expression. It is a logical follow-up on a previous paper from the lab (Kumar et al. 2018) and beautifully builds on this work. It is to my knowledge the first mechanistic description of regulation of nuclear pore complex function by a post-translational modification. This will therefore be a very interesting paper for anyone interested in nuclear pore complex regulation and biology, non-histone protein acetylation, asymmetric cell division, and cell cycle regulation.

      Reviewer #3

      Evidence, reproducibility and clarity:

      The pre-print is dedicated to mRNA export and G1/S transition control in mother and daughter cells of budding yeast through acetylation/deacetylation of the nuclear pore component Nup60 (hsNup153). In particular, the authors found that Esa1 (hsTip60/KAT5) acetylates the basket nucleoporin Nup60, and this event promotes recruitment of mRNA export factors to the nuclear basket and export of polyA RNA to the cytosol. This export event promotes entry of cells into S phase; in particular, Nup60 is deacetylated by the histone deacetylase Hos3, which displaces mRNA export complexes from the NPC and inhibits Start specifically in daughter cells.

      The manuscript is a well-designed and well-written study.

      Please, see my major and minor suggestions below:

      Major comments:

      1. P4-5. "deacetylation of the nuclear basket nucleoporin Nup60 does not affect Whi5 nuclear accumulation". I was confused by this statement because, in the previous article Kumar et al., 2018, both main text and abstract have the following phrase "nuclear basket and central channel nucleoporins establish daughter-cell-specific nuclear accumulation of the transcriptional repressor Whi5.." Could you please address this discrepancy?

      Thank you for pointing this out. We should have written: “deacetylation of Nup60 does not strongly affect Whi5 nuclear accumulation”. The Kumar et al. paper shows that deacetylation of central channel nucleoporins (such as Nup49) is important to increase accumulation of Whi5 in daughter cells, whereas deacetylation of the basket nucleoporin Nup60 plays a relatively minor role (see Kumar et al, Figure 7c). We have corrected this in the main text.

      2. Fig.2A: In addition to increased Nup60 acetylation, I noticed an overall increased level of Nup60 after overexpression of Esa1 and Gcn5. Is it a statistically significant increase in the Nup60 level? It is not mentioned in the main text or figure legend. Does the acetylation level of Nup60 influence its stability?

      We don’t know if acetylation of Nup60 affects its stability, although it is an intriguing possibility. Although it is true that Nup60 levels in the IP fraction seem to increase upon Esa1 and Gcn5 overexpression, nuclear levels of Nup60-mCherry are similar in wild-type, hos3∆ and nup60-KN (Supplementary Figure S11A). Therefore it is unlikely that changes in Nup60 acetylation affect its stability. We have added this information to the text.

      3. Authors determined the mRNA level of four representative genes in esa1-ts and esa1-ts nup60-KN cultures.

      3a. Do authors know if Nup60-KN expression affects the perinuclear positioning of these transcripts?

      We did not investigate the localisation of individual transcripts in this study. However, as mentioned in our replies to reviewer 2, we propose to do so for the CLN2 and GAL1 mRNAs, in order to test directly the effect of Nup60 acetylation in the positioning of specific mRNAs.

      3b. I also suggest authors investigate if Nup60-KN affects other transcripts using the RNAseq approach. Nup60-KN might improve the transcription output of other transcripts and it will be interesting to know if these transcripts share similar features.

      We agree that investigating the impact of Nup60 acetylation in mRNA synthesis genome-wide is an exciting challenge. We speculate that Nup60-KN is likely to have some effect in transcription, either directly or indirectly through perturbation of feedback regulatory loops caused by mRNA export defects (for instance, transcription of both CLN2 and GAL1 is regulated by positive feedback). However we think that these experiments are beyond the scope of our study, which is focused on mRNA export.

      3c. Do authors know if GAL1pr:HOS3-NLS expression affects specifically G1-dependent transcripts?

      Answering this question would require RNA sequencing experiments. As mentioned in the previous point, we think these are beyond the scope of our study. That being said, it is likely that the Hos3-Nup60 pathway downregulates gene expression during G1, because Nup60 deacetylation is largely restricted to this phase. Note that this is not the same as regulating expression of the G1/S regulon specifically, because Hos3 also regulates GAL1 expression (Figure 5). We mention this important point in the discussion (p. 17).

      3d. Another interesting question will be to define if there is a group of transcripts that respond specifically to the status of Nup60 acetylation during G1/S transition. Is it possible to make Nup60-KN expression ts-driven so that it can be turned ON/OFF? However, this question is beyond the scope of this paper.

      Thank you for this interesting suggestion. The proposed experiment is technically possible (for example, expression of Nup60-KN could be induced in G1 using a GAL1 promoter, followed by RNA sequencing). We agree that this is beyond the scope of our paper but would like to explore the question in future studies.

      4. Fig.2D: It is not mentioned that Cln2 is not cycling anymore upon Nup60-KN overexpression.

      The Cln2 protein peaks at 30 minutes in this experiment, and is degraded at approximately 120 minutes. This corresponds to the slow, incomplete G1/S transition wave of the esa1-ts nup60-KN mutant, as indicated in the budding index at the bottom of the panel. We added this in the figure 2 legend. Note that Nup60-KN is not overexpressed, since the KN mutation is inserted in the endogenous gene under the control of its native promoter.

      5. Fig.2E. Arrows indicating Whi5 export timing do not match the numbers in the main text. For example, yellow arrows indicate Whi5 export in wt strain at 30 and 78 min, but it is stated 15 and 59 min in the text. Also, do I understand right that Whi5-mCherry is not visible in the cytosol?

      See our reply to reviewer 2, point 4: we will replace the cell shown in figure 2E with a more representative example. As for Whi5-mCherry, it is visible in the cytoplasm but only weakly (since it is diluted into the larger cytoplasmic volume), and not at all in the images shown due to the overlay with the brightfield channel.

      6. Did the authors analyze where SAC3 and MTR2 are localized in hos3del, Nup60KN, and Esa-ts strains once their localization was affected in the nucleus? Is the overall Sac3 level affected in hos3del and Nup60KN strains?

      We have imaged the localisation of Sac3-GFP and Mtr2-GFP during the whole cycle using time-lapse microscopy. Our impression is that in wild type cells, their perinuclear levels increase during S phase in daughter cells, which mirrors the increase in Nup60 acetylation. In contrast, Sac3 and Mtr2 perinuclear levels seem more stable in hos3 and nup60-KN cells. We will include these analyses in a revised version. The total level of Sac3 is not affected, as shown in the updated figure 4; see our reply to reviewer 2, point 6.

      7. Fig4C. "Sac3-GFP-FRB partitioned equally to M and D nuclei, in the presence of Nup60-mCherry-FKBP and rapamycin (Figure 4C)." Sac3-GFP-FRB is slightly elevated in mother cells. Did you run a statistical test between the first and the third column on the box plot?

      Comparing the first and third columns in Fig 4C (Nup60 and Sac3 in control cells) shows that the mother cell accumulation is higher for Sac3 than for Nup60 (p

      8. P15. "GAL1 expression levels were higher in wild-type mother cells than in their daughter, and these differences were absent in cells lacking Hos3 or expressing Nup60KN". GAL1-10 promoter contains information necessary and sufficient for recruitment to the nuclear periphery (PMID: 27489341). I wonder if GAL1pr-driven transgenes of HOS3, spt10, hat1, etc. contain DNA sequences sufficient for targeting genes to the nuclear periphery, and whether these genes are asymmetrically expressed in mother and daughter cells because of the presence of GAL1pr?

      We agree that these genes may be expressed at different levels in mother and daughter cells. We don’t think this asymmetric expression affects our conclusions. Indeed, the phenotypes scored (growth on plates) apply to the population and not to individual cells. The one exception is figure 3D, in which mRNA nuclear accumulation is scored in single cells. In this case, it remains possible that some of the variability observed corresponds to differences between mothers and daughters. In this case, our measurements could under-estimate the effect of Hos3-NLS in inhibition of mRNA export. However, since we cannot differentiate M and D cells in this experiment, we prefer not to speculate on this possibility in the text.

      Minor comments:

      1. Supplementary Fig. S1: it would be easier to read the cell viability assays if figures 1A, S1A and S1B had the same orientation.

      We have changed the figure as suggested.

      2. Could you please clarify the difference between HOS3-NLS and GAL1pr:HOS3-NLS in the text of the figure legend? P.33

      We have fixed this (figure 1 legend).

      3. P6. I recommend adding the following sentence to help clarity of the text: "To understand how NPC acetylation regulates the G1/S transition (Start), we sought to identify the lysine acetyl-transferases (KATs) counteracting the activity of the Hos3 deacetylase. Hos3 displays asymmetric distribution between mother and daughter cells in wild type Saccharomyces cerevisiae. Overexpression of a version of Hos3 fused to a nuclear localization signal (GAL1pr-HOS3-NLS) leads to targeting of Hos3 to mother and daughter cell nuclei, deacetylation of nucleoporins, and inhibition of cell proliferation (Kumar et al, 2018)."

      We thank the reviewer for this suggestion. This has been added.

      4. P8. Misspelling: Though Nup60 acetylation

      This has been fixed.

      5. FigS7. Description of polyA distribution is missing for single gcn5del strain.

      Thank you for pointing this out. This has been added.

      6. Misspelling: We conclude that Esa1 and Nup60 acetylation promotes Start, at least in part, by targeting Sac3 to the nuclear basket, where it mediates mRNA export.

      This has been fixed.

      Significance

      The authors of this pre-print address a fundamental and not well-studied question about NPC acetylation status and S phase entry. This work is a logical extension of their previously published work (PMID: 29531309). However, this study for the first time links the status of NPC acetylation to mRNA export through lysine acetyltransferases. It will be interesting to address this question in mammalian cells considering the interaction of basket nucleoporins with Tip60/KAT5 (PMID: 24302573).

      This work might be of interest to researchers investigating RNA export, transcription regulation, and nuclear pores.

      My fields of expertise are RNA export, nucleoporins, transcription regulation.

      I do not have expertise to evaluate yeast strains used in this study.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The pre-print is dedicated to mRNA export and G1/S transition control in mother and daughter cells of budding yeasts through acetylation/deacetylation of nuclear pore component Nup60 (hsNup153). In particular, authors found that Esa1(hsTip60/KAT5) acetylates the basket nucleoporin Nup60, and this event promotes recruitment of mRNA export factors to the nuclear basket and export of polyA RNA to the cytosol. This export event promotes entry of cells into S phase; in particular, Nup60 is deacetylated by histone deacetylase Hos3 that displaces mRNA export complexes from the NPC and inhibits Start specifically in daughter cells.

      The manuscript is a well-designed and well-written study.

      Please, see my major and minor suggestions below:

      Major comments:

      1. P4-5. "deacetylation of the nuclear basket nucleoporin Nup60 does not affect Whi5 nuclear accumulation". I was confused by this statement because, in the previous article (Kumar et al., 2018), both the main text and the abstract contain the following phrase: "nuclear basket and central channel nucleoporins establish daughter-cell-specific nuclear accumulation of the transcriptional repressor Whi5.." Could you please address this discrepancy?
      2. Fig.2A: In addition to increased Nup60 acetylation, I noticed an overall increased level of Nup60 after overexpression of Esa1 and Gcn5. Is it a statistically significant increase in the Nup60 level? It is not mentioned in the main text or figure legend. Does the acetylation level of Nup60 influence its stability?
      3. Authors determined the mRNA level of four representative genes in esa1-ts and esa1-ts nup60-KN cultures. Do authors know if Nup60-KN expression affects the perinuclear positioning of these transcripts? I also suggest authors investigate if Nup60-KN affects other transcripts using the RNAseq approach. Nup60-KN might improve the transcription output of other transcripts and it will be interesting to know if these transcripts share similar features. Do authors know if GAL1pr:HOS3-NLS expression affects specifically G1-dependent transcripts?

      Another interesting question will be to define if there is a group of transcripts that respond specifically to the status of Nup60 acetylation during G1/S transition. Is it possible to make Nup60-KN expression ts-driven so that it can be turned ON/OFF? However, this question is beyond the scope of this paper.

      4. Fig.2D: It is not mentioned that Cln2 is not cycling anymore upon Nup60-KN overexpression.
      5. Fig.2E. Arrows indicating Whi5 export timing do not match the numbers in the main text. For example, yellow arrows indicate Whi5 export in wt strain at 30 and 78 min, but it is stated 15 and 59 min in the text. Also, do I understand right that Whi5-mCherry is not visible in the cytosol?
      6. Did the authors analyze where SAC3 and MTR2 are localized in hos3del, Nup60KN, and Esa-ts strains once their localization was affected in the nucleus? Is the overall Sac3 level affected in hos3del and Nup60KN strains?
      7. Fig4C. "Sac3-GFP-FRB partitioned equally to M and D nuclei, in the presence of Nup60-mCherry-FKBP and rapamycin (Figure 4C)." Sac3-GFP-FRB is slightly elevated in mother cells. Did you run a statistical test between the first and the third column on the box plot?
      8. P15. "GAL1 expression levels were higher in wild-type mother cells than in their daughter, and these differences were absent in cells lacking Hos3 or expressing Nup60KN". GAL1-10 promoter contains information necessary and sufficient for recruitment to the nuclear periphery (PMID: 27489341). I wonder if GAL1pr-driven transgenes of HOS3, spt10, hat1, etc. contain DNA sequences sufficient for targeting genes to the nuclear periphery, and whether these genes are asymmetrically expressed in mother and daughter cells because of the presence of GAL1pr?

      Minor comments:

      1. Supplementary Fig. S1: it would be easier to read the cell viability assays if figures 1A, S1A and S1B had the same orientation.
      2. Could you please clarify the difference between HOS3-NLS and GAL1pr:HOS3-NLS in the text of the figure legend? P.33
      3. P6. I recommend adding the following sentence to help clarity of the text: "To understand how NPC acetylation regulates the G1/S transition (Start), we sought to identify the lysine acetyl-transferases (KATs) counteracting the activity of the Hos3 deacetylase. Hos3 displays asymmetric distribution between mother and daughter cells in wild type Saccharomyces cerevisiae. Overexpression of a version of Hos3 fused to a nuclear localization signal (GAL1pr-HOS3-NLS) leads to targeting of Hos3 to mother and daughter cell nuclei, deacetylation of nucleoporins, and inhibition of cell proliferation (Kumar et al, 2018)."
      4. P8. Misspelling: Though Nup60 acetylation
      5. FigS7. Description of polyA distribution is missing for single gcn5del strain.
      6. Misspelling: We conclude that Esa1 and Nup60 acetylation promotes Start, at least in part, by targeting Sac3 to the nuclear basket, where it mediates mRNA export.

      Significance

      The authors of this pre-print address a fundamental and not well-studied question about NPC acetylation status and S phase entry. This work is a logical extension of their previously published work (PMID: 29531309). However, this study for the first time links the status of NPC acetylation to mRNA export through lysine acetyltransferases. It will be interesting to address this question in mammalian cells considering the interaction of basket nucleoporins with Tip60/KAT5 (PMID: 24302573).

      This work might be of interest to researchers investigating RNA export, transcription regulation, and nuclear pores.

      My fields of expertise are RNA export, nucleoporins, transcription regulation.

      I do not have expertise to evaluate yeast strains used in this study.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Gomar-Alba et al. follow up on previous work from the lab that showed that the KDAC Hos3 is targeted to the bud neck and daughter cell nuclear pore complexes in budding yeast where it slows cell cycle progression by influencing gene positioning and nucleo-cytoplasmic transport. Overall, the current manuscript describes a well-conducted study that dissects the role of acetylation and deacetylation on Nup60 during the cell cycle using genetics and microscopy. The authors conclusively identify Esa1 as counteracting Hos3 in the nucleus (Figure 1) and show that part of their effect on cell cycle progression and gene expression is mediated by acetylation of Nup60 at K467 (Figure 2). They also demonstrate that this leads to a differential localization of several mRNA export factors and suggest that deacetylation of Nup60 blocks mRNA export in daughter cells. Although this work is overall carefully done, the last conclusion is still somewhat speculative.

      I have a number of minor suggestions to improve the manuscript, but only one major concern, which revolves around the role of chromatin tethering to NPCs. The authors have shown in their previous paper that this plays a role for CLN2 and it is known that active GAL1 interacts with the nuclear periphery, but in the current manuscript this aspect is largely disregarded although I think it could play a major role in the observed mRNA export phenotypes. Therefore, I think some additional experiments and controls as well as additional analysis are required to substantiate especially the results shown in figure 5.

      Major points:

      1) Figure 2: The authors claim that the mechanism by which Nup60 acetylation promotes cell cycle progression is the enhancement of mRNA export through the NPC. In Figure 2, the authors look at the expression levels of four candidate mRNAs which all show disturbed expression in esa1-ts which is not rescued by the nup60-KN mutation, but expression of the protein of one of these candidates (CLN2) is improved. In their previous paper, the same lab has shown that the CLN2 gene is tethered to the NPC in daughter cells with deacetylated Nup60 and that this is relieved in a Nup60 K467N mutant. I think it would be important here to investigate the protein levels of additional candidates that are not regulated at the level of gene localization. Is it a general effect that protein expression is higher in the nup60KN mutant?

      2) Figure 5: In figure 5, the authors investigate the expression of a different inducible RNA (GAL1) to test whether the observed effect on mRNA export is more general. Since this is a crucial point for generalizing the finding, this data needs to be presented in a more convincing manner.

      a. GAL1 is known to be tethered to the NPC upon transcription. Whether this tethering is affected by the Nup60-KN mutant is unclear, but since Nup60 has been implicated in GAL1 tethering in the literature, this possibility is not unlikely. GAL1 therefore becomes a similar case to CLN2, where it is difficult to disentangle effects directly due to mRNA export from the effects of gene tethering on mRNA transcription and processing. Therefore, this experiment should be repeated with a system that is independent of gene tethering. For example, induction of the GAL promoter via a β-estradiol inducible VP16 transactivator does not seem to induce tethering.

      b. The activation kinetics in all mutants analyzed are very different from the wildtype, so the quantification made in Figure 5C is difficult to interpret. It might be fairer to quantify the mutant strains at an earlier timepoint after activation, when the levels are similar to those in the wildtype strain, e.g. in the hos3d strain at around 250 min.

      c. Similarly, although not as drastically, in figure 5E quantification should be done at a timepoint when the induction level is similar between DMSO- and rapamycin-treated samples, in order to draw conclusions about differences between mother and daughter cells.

      d. The major claim of the paper is that mRNA export is inhibited by Nup60 deacetylation. In this figure, the mRNA levels need to be quantified to validate that it is not transcription that is affecting expression.

      3) The manuscript investigates in detail the effects of a KN mutant, however, a non-acetylatable mutant is not investigated. Is such a mutant viable?

      Minor comments:

      4) Figure 2E: Is the rescue really specific to daughter cells? The dynamic range in the daughter cells is much higher due to the slower and more heterogeneous timing of Whi5 export. However, zooming in on the early timepoints after Whi5 import, before the 30 min point when 50% of the cells have exported Whi5, might reveal a significant increase of mother cells with shortened time to S phase entry. I suggest that the authors test this possibility. The cells shown in the image panels also suggest that the acetyl mimic might shorten mother cell time to S phase entry. If this is not the case, the authors might want to show a different example cell. Interestingly, it appears from supplementary figure S5 that while Nup60 K467N partially rescues the export of Whi5, budding does not seem to be different from Nup60 wt. This appears to contradict the budding after alpha factor arrest shown in figure 2.

      5) Figure 3C: The authors use a truncated version of SAC3 for overexpression, since the full length is toxic (Figure S6A). I think it would be important to include this information in the main text.

      6) Figure 4B: Is there simply less Sac3 protein in the esa1-ts mutant? Although the authors address this question in figure S9, the very low expression levels of Sac3 may make this difficult to conclude from fluorescence quantification. A Western Blot would be an important control. The relative level of Sac3 still seems to be lower in esa1-ts daughter cells compared to mother cells, but no statistical test is shown.

      7) Analysis of mother daughter pairs (e.g. figure 5C): a paired t-test would be appropriate.

      8) Figure 5A: Can some representative mother-daughter pairs be shown as images for both wt and mutant in the timelapse? It is difficult to see in 5A whether there are any mother daughter pairs.

      9) Figure 4C: Please show image of localization of Sac3-GFP-FRB +/- rapamycin to the NPC.
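
      To illustrate the paired test suggested in point 7, here is a minimal sketch of how a paired t statistic is computed from matched mother-daughter measurements. The function name and the intensity values are hypothetical and are not taken from the manuscript; the sketch only returns the t statistic and degrees of freedom, and a p-value would still have to be looked up from the t distribution (for example with a statistics package).

```python
import math
from statistics import mean, stdev


def paired_t_statistic(mother, daughter):
    """Return (t, degrees_of_freedom) for a paired t-test on matched samples.

    Each mother value must be matched to the daughter value from the same
    division, in the same order.
    """
    if len(mother) != len(daughter):
        raise ValueError("each mother needs exactly one matched daughter")
    n = len(mother)
    if n < 2:
        raise ValueError("need at least two pairs")
    # The paired test works on the per-pair differences, not on the two groups.
    diffs = [m - d for m, d in zip(mother, daughter)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical fluorescence intensities (arbitrary units), illustration only.
mother_cells = [10.0, 12.0, 9.0, 11.0]
daughter_cells = [8.0, 9.0, 8.0, 9.0]
t_value, dof = paired_t_statistic(mother_cells, daughter_cells)
print(f"t = {t_value:.3f}, df = {dof}")
```

      Because the test uses within-pair differences, variation between pairs (for example, overall induction level) does not inflate the error term, which is why it suits mother-daughter comparisons better than an unpaired test.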

      Significance

      This manuscript describes an important advance in understanding the role of non-histone protein modification on the regulation of cell cycle progression and gene expression. It is a logical follow-up on a previous paper from the lab (Kumar et al. 2018) and beautifully builds on this work. It is to my knowledge the first mechanistic description of regulation of nuclear pore complex function by a post-translational modification. This will therefore be a very interesting paper for anyone interested in nuclear pore complex regulation and biology, non-histone protein acetylation, asymmetric cell division, and cell cycle regulation.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Acetylation/deacetylation controls the G1/S transition in budding yeast. The lysine acetyl transferase Esa1 is here shown to play a role, in part via acetylation of the nuclear pore complex basket component Nup60, which stimulates mRNA export.

      Major comments:

      • Figure 1C: The curve for esa1-ts in this figure and the curve in supplementary figure S2B are not similar: the first shows 10% of cells budding after 60 minutes, whereas it is about 50% after 60 min in S2B. Another helpful way of presenting the data could be the length of the G1 phase (from cytokinesis to budding) in the WT, esa1-ts, gcn5delta cells over time.

      • What is the rationale for creating the Nup60-KN mutation? Does it prevent acetylation of Nup60, at least by GCN5 and/or esa1?

      • Given the much stronger G1/S transition phenotype of the esa1-ts+GCN5 delta condition as compared to esa1-ts, and given that GCN5 seems to strongly acetylate Nup60, I do not understand the sole focus on esa1 in the study. The fact that the Nup60-KN cells do not show G1/S transition under esa1-ts+GCN5 delta conditions in the experiments presented in Fig. S3 argues that esa1-mediated acetylation of Nup60 is only one, probably minor, aspect of G1/S transition. This should be discussed in a more balanced way.

      • Suppl: Fig 2: I miss the hat1delta+gcn5delta condition.

      Minor comments:

      • Figure legend 2C "at least 200 cells were scored": please state number of replicates

      • Figure 2E: X axis "impor" should be corrected to "import"

      • Would Mex67 and/or Mtr2 overexpression rescue the esa1-ts and esa1-ts+GCN5 delta phenotype?

      • Figure 4A: The size of the daughter cells in the hos3delta condition seems smaller as compared to esa1-ts. Is this true and can you comment on this? Is a premature onset of S phase observed here?

      • Figure 4D: The still images in figure 2E and 4D do not correspond with the quantitation. E.g. in Fig 2E the esa1-ts cells show Whi5 export at t=81 min, which is unusually late according to the quantitation shown.

      • Figure 4B: it is not clear why for the quantitation a different representation is chosen as compared to 4A. It would be better to show the nuclear intensities of mother/daughter as in Figure 4A.

      • Figure 4D: To strengthen these results, it would be good to perform this assay with esa1-ts Nup60-KN cells as in figure 2a. The release of Whi5-GFP is expected to behave in a similar way to the WT. This would ensure that Nup60 acetylation is a pre-requisite for Whi5 release

      • Page 13 "Finally, we tested whether Esa1 targets Sac3 to G1 nuclei": The effect of esa1 knockdown on Sac3 fits with the story line and the effect esa1 imposes on mRNA export. However, targeting of Sac3, which is part of a bigger complex, by esa1 is a misleading statement, given that you don't show proof of direct interactions, e.g. by immunoprecipitations.

      • Page 18: "Nevertheless, our findings suggest that mammalian nucleoporins may represent a novel category of substrates for KATs and for the multiprotein complexes in which these enzymes reside, with important roles in gene expression." Given that there is little experimental evidence this statement is for my taste too strong. Rather indicate that this is a possibility which needs to be tested...

      • Page 3: "Nuclear pores are macromolecular assemblies composed of approximately 30-50 different nucleoporins": it is rather approximately 30 different nucleoporins in the species so far analyzed.

      Significance

      The concept of acetylation/deacetylation regulation of G1/S transition in budding yeast is very appealing. The specific (and important) contribution of Esa1, especially in comparison to GCN5 and Hat1 remains unclear as well as its precise effect on Nup60. Clarifying this, also in a more balanced way of presentation of discussion, would be of interest for the field.

      My research centers around NPC function.

      Audience: experts in the nuclear structure/function fields and cell cycle regulation.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      I found this an exceptionally impressive manuscript. Studying the evolution of Y chromosomes has until recently been nearly impossible, and this research group have pioneered approaches that can yield reliable results in Drosophila. The study used an innovative heterochromatin-sensitive assembly pipeline on three D. simulans clade species, D. simulans, D. mauritiana and D. sechellia, which diverged less than 250 KYA, allowing comparisons with the group's previous results for the D. melanogaster Y.

      The study is both technically impressive and extremely interesting (a highly unusual combination). It includes a rich set of interesting results about these genome regions, and furthermore the results are discussed in a well-organised way, relating both to previous observations and to understanding of the genetics and evolution of Y chromosomes, illuminating all these aspects. It is a rare pleasure to read such a study. I believe that this study will inspire and be a model for future work on these chromosomes. It shows how these difficult genome regions can be studied.

      Thank you for the positive evaluation of our paper. While we did not make any specific revisions in response to these comments, we did attempt to improve the writing.

      **Major comments:**

      The conclusions are convincing. The methods are explained unusually clearly, and the reasoning from the results is convincing. When appropriate, the caveats are clearly explained. The material is clearly organised and the questions studied are well related to the results. I had a few minor comments concerning the English. Even the figures (often a major problem to understand) are very clear and helpful, with proper explanations. I have very rarely read such a good manuscript, and almost never (in a long career) found a manuscript that could be published without revision being necessary.

      Thank you for pointing out that there were minor concerns with the English. We have carefully gone through the manuscript and fixed some minor issues with the writing.

      The analysis found 58 exons missed in previous assemblies (as well as all previously known exons of the 11 canonical Y-linked genes, which are present in at least one copy across the group). FISH on mitotic chromosomes using probes for 12 Y-linked sequences was used to determine the centromere locations, and to determine gene orders and relate them to the cytological chromosome bands, demonstrating changes in satellite distribution, gene order, and centromere positions between the Y chromosomes within the D. simulans clade species. It also confirmed previous results for Y-linked ribosomal DNA genes, which are responsible for X-Y pairing in D. melanogaster males. Although 28S rDNA has been lost in D. simulans and D. sechellia (but not in D. mauritiana), the intergenic spacer (IGS) repeats between these repeats are retained on both sex chromosomes in all three species. Only sequencing can reliably reveal this, as their abundance is below the detection level by FISH in D. sechellia. The copy numbers of the 11 canonical Y-linked genes vary between the species; some duplicates are expressed and have complete open reading frames, and may therefore be functional, but most include only a subset of exons, often with duplicated exons flanking the presumed functional gene copy. Mega-introns and Y-loops were found, as already seen in Drosophila species, but this new study detects turnover in the ~2 million years separating D. melanogaster and the D. simulans clade. 49 independent duplications onto the Y chromosome were detected, including 8 not previously detected. At least half show no expression in testes, or lack open reading frames, so they are probably pseudogenes.

      Testis-expressed genes may be especially likely to duplicate onto the Y chromosome due to its open chromatin structure and transcriptional activity during spermatogenesis, and indeed most of the new Y-linked genes in the clade studied have likely functions in chromatin modification, cell division, and sexual reproduction. The study discovered two new gene families that have undergone amplification on D. simulans clade Y chromosomes, reaching very high copy numbers (36-146). Both these families appear to consist of functional protein-coding genes and show high expression.

      The paper describes intriguing results that illuminate Y chromosome evolution. The first family, SRPK, arose by an autosome-to-Y duplication of the sequence encoding the testis-specific isoform of the gene SR Protein Kinase (SRPK), after which the autosomal copy lost its testis-specific exon via a deletion. In D. melanogaster, SRPK is essential for both male and female reproduction, so the relocation of the testis-specific isoform to the Y chromosome in the D. simulans clade suggests that the change may have been advantageous by resolving sexual antagonism. The paper presents convincing evidence that the Y copy evolved under positive selection, and that gene amplification may confer advantageous increased expression in males. The second amplified gene family is also potentially related to an interesting function. Both X-linked and Y-linked duplicates are found of a gene called Ssl located on chromosome 2R. In D. simulans, the X-linked copies were previously known, and called CK2ßtes-like. In D. melanogaster, degenerated Y-linked copies are also found, with little or no expression, contrasting with complete open reading frames and high expression in testes in the D. simulans clade species, consistent with the possibility of an arms race between sex chromosome meiotic drive factors.

      Other interesting analyses document higher gene conversion rates compared to the other chromosomes, and evidence that these Y chromosomes may differ in the DNA-repair mechanisms (preferentially using MMEJ instead of NHEJ), perhaps contributing to their high rates of intrachromosomal duplication and structural rearrangements. The authors relate this to evidence for turnover of Y-linked satellite sequences, with the discovery of five new Y-linked satellites, whose locations were validated using FISH. The study also documented enrichment of LTR retrotransposons on the D. simulans clade Y chromosomes relative to the rest of the genome, together with turnovers between the species.

      Reviewer #1 (Significance (Required)):

      As described above, the advances are both technical and conceptual for the field. The manuscript itself does an excellent job of placing the work in the context of the existing literature.

      • Anyone working on sex chromosomes and other non-recombining genome regions should be interested in the findings reported.

      • My field of expertise is the evolution of sex chromosomes, and the evolution of genome regions with suppressed recombination. I have experience of genomic analyses. I have less expertise in analyses of gene expression, but I understand enough about such approaches to evaluate the parts of this study that use them.

      Reviewer #2:

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript describes a thorough investigation of the Y-chromosomes of three very closely related Drosophila species (D. simulans, D. sechellia, and D. mauritiana) which in turn are closely related to D. melanogaster. The D. melanogaster Y was analysed in a previous paper by the same group. The authors found an astonishing level of structural rearrangements (gene order, copy number, etc.), especially taking into account the short divergence time among the three species (~250 thousand years). They also suggest an explanation for this fast evolution: the Y chromosome is haploid, and hence double-strand breaks cannot be repaired by homologous recombination. Instead, it must use the less precise mechanisms of NHEJ and MMEJ. They also provide circumstantial evidence that MMEJ (which is very prone to generate large rearrangements) is the preferred mechanism of repair. As far as I know this hypothesis is new, and fits nicely with the fast structural evolution described by the authors. Finally, the authors describe two intriguing Y-linked gene families in D. simulans (Lhk and CK2ßtes-Y), one of them similar to the Stellate / Suppressor of Stellate system of D. melanogaster, which seems to be evolving as part of an X-Y meiotic drive arms race. Overall, it is a very nice piece of work. I have four criticisms that, in my opinion, should be addressed before acceptance.

      Thank you for your positive comments. We respond to your concerns point-by-point below.

      The suggestion/conclusion that MMEJ is the preferential repair mechanism (over NHEJ) should be better supported and explained. At line 387, the authors stated "The pattern of excess large deletions is shared in the three D. simulans clade species Y chromosomes, but is not obvious in D. melanogaster (Fig 6B). However, because all D. melanogaster Y-linked indels in our analyses are from copies of a single pseudogene (CR43975), it is difficult to compare to the larger samples in the simulans clade species (duplicates from 16 genes). ". Given that D. melanogaster has many Y-linked pseudogenes (described by the authors and by other researchers, and listed in Table S6), there seems to be no reason to use a sample size of 1 in this species.

      We only used pseudogenes with large alignable regions (>300 bp) to prevent a potential bias toward small indels and to increase our confidence in indel calling. As a result, we excluded most of the duplicates on the D. melanogaster Y chromosome. We now include 5 additional D. melanogaster Y-linked indels in the manuscript; however, the majority of indels in this species (36/41) are still from the same gene.
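
The filtering step described in this response can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout (`parent_gene`, `alignable_bp`, `indel_sizes`), the function names, and the toy data are all hypothetical. Only the >300 bp threshold comes from the response. The sketch keeps duplicates whose alignable region exceeds the threshold, tallies indels per parent gene, and reports the share contributed by the most frequent gene, which is the kind of imbalance (36/41) the response acknowledges.

```python
from collections import Counter

# Threshold quoted in the response; everything else here is illustrative.
MIN_ALIGNABLE_BP = 300


def indels_by_parent_gene(duplicates, min_alignable_bp=MIN_ALIGNABLE_BP):
    """Count indels per parent gene, keeping only long alignable duplicates.

    `duplicates` is a list of dicts with hypothetical keys:
      parent_gene  - name of the gene the duplicate derives from
      alignable_bp - length of the region alignable to the parent gene
      indel_sizes  - signed indel lengths (negative = deletion)
    """
    counts = Counter()
    for dup in duplicates:
        if dup["alignable_bp"] > min_alignable_bp:
            counts[dup["parent_gene"]] += len(dup["indel_sizes"])
    return counts


def top_gene_share(counts):
    """Return (gene, fraction) for the gene contributing the most indels."""
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    gene, n = counts.most_common(1)[0]
    return gene, n / total


# Toy records, not real data.
toy = [
    {"parent_gene": "geneA", "alignable_bp": 500, "indel_sizes": [-3, -12, 4]},
    {"parent_gene": "geneB", "alignable_bp": 250, "indel_sizes": [-1]},
    {"parent_gene": "geneB", "alignable_bp": 800, "indel_sizes": [-40]},
]
counts = indels_by_parent_gene(toy)
gene, share = top_gene_share(counts)
```

A high share for a single gene signals that the indel size distribution for that species mostly reflects one locus, which is why the comparison with the simulans clade sample is treated cautiously.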

      Furthermore, given that D. melanogaster is THE model organism, it is the species that most likely will provide information to assess the "preferential MMEJ" hypothesis proposed by the authors.

      A previous paper has shown that male flies deficient in MMEJ have a strong bias toward female offspring (McKee et al. 2000), suggesting that MMEJ is necessary for successfully producing Y-bearing sperm, consistent with our hypothesis. We agree with the reviewer that careful genetic and cytological experiments in D. melanogaster could further clarify the role of MMEJ in the repair of Y-linked mutations. Even more revealing would be experiments using the simulans clade species, where we hypothesize the MMEJ bias is even more pronounced on the Y chromosome. We believe, however, that these experiments are beyond the scope of this study and should merit their own papers.

      Still on the suggestion/conclusion that MMEJ is the preferential repair mechanism (over NHEJ). The Y chromosome is heterochromatic, haploid and non-recombining. In order to ascribe its mutational pattern to the haploid state (and the consequent impossibility of homologous recombination repair), the authors compared it to chromosome IV (the so-called "dot chromosome"). This may not be the best choice: while chr IV lacks recombination in wild type flies, it is not typical heterochromatin. E.g., "results from genetic analyses, genomic studies, and biochemical investigations have revealed the dot chromosome to be unique, having a mixture of characteristics of euchromatin and of constitutive heterochromatin". Riddle and Elgin, FlyBook 2018 (https://doi.org/10.1534/genetics.118.301146). Given this, it seems appropriate to also compare the Y-linked pseudogenes with those from typical heterochromatin. In Drosophila, these are the regions around the centromeres ("centric heterochromatin"). There are pseudogenes there; e.g., the gene rolled is known to have partially duplicated exons.

      Thank you for the suggestion. We now include the data from pericentric heterochromatin and pseudogenes in supplemental data (see Fig 7). Both data types support our conclusion that indel size is only larger on Y chromosomes, which is consistent with the comparison between the dot chromosome and pericentric heterochromatin reported by Blumenstiel et al. 2002.

      In some passages of the ms there seems to be a confusion between new genes and pseudogenes, which should be corrected. For example, in line 261: "Most new Y-linked genes in D. melanogaster and the D. simulans clade have presumed functions in chromatin modification, cell division, and sexual reproduction (Table S7)". Which are these "new genes"? If they are those listed in Table S6 (as other passages of the text suggest), most if not all of them are pseudogenes. If they are pseudogenes, it is not appropriate to refer to them as "new genes". The same ambiguity is present in line 263: "Y-linked duplicates of genes with these functions may be selectively beneficial, but a duplication bias could also contribute to this enrichment (...)". Pseudogenes can be selectively beneficial, but only in very special cases (e.g., gene regulation). If the authors are suggesting this, they must openly state this, and explain why. Pseudogenes are common in nearly all genomes, and should be clearly separated from genes (the latter as a shortcut for functional genes). The bar for "genes" is much higher than simple sequence similarity, including expression, evidence of purifying selection, etc., as the authors themselves applied for the two gene families they identified in D. simulans (Lhk and CK2ßtes-Y).

      Thank you for the suggestion. We now state our criteria for calling genes based on the expression and long CDS and correct the sentences that the reviewer refers to. The protein evolution rates of many Y-linked duplicates were surveyed in Tobler et al. 2017, who found that most are not under strong purifying selection. Our study supports this previous report. We think that protein evolution rate alone may not be a good indicator for functionality. Our current study does not focus on the potential function of these genes, and we think further population studies are required to get a solid conclusion. We changed the text to clarify this point: “Most new Y-linked duplications in D. melanogaster and the D. simulans clade are from genes with presumed functions in chromatin modification, cell division, and sexual reproduction (Table S7), consistent with other Drosophila species [17, 77].” (p15 L281-284)

      The authors center their analysis on "11 canonical Y-linked genes conserved across the melanogaster group ". Why did they exclude the CG41561 gene, identified by Mahajan & Bachtrog (2017) in D. melanogaster? Given that most D. melanogaster Y-linked genes were acquired before the split from the D. simulans clade (Koerich et al Nature 2008), the same most likely is true for CG41561 (i.e., it would be Y-linked in the D. simulans clade). Indeed, computational analysis gave a strong signal of Y-linkage in D. yakuba (unpublished; I have not looked in the other species). If CG41561 is Y-linked in the simulans clade, it should be included in the present paper, for the only difference between it and the remaining "canonical genes" was that it was found later. Finally, the proper citation of the "11 canonical Y-linked genes" is Gepner and Hays PNAS 1993 and Carvalho, Koerich and Clark TIG 2009 (or the primary papers), instead of ref #55.

      Thank you for the suggestion. CG41561 is indeed a relatively young Y-linked gene because it’s not Y-linked in D. ananassae (Muller’s element E). We already have CG41561 in Table S6 and we think that it is reasonable to separate a young Y-linked gene from the others. We also fixed the reference as suggested (p5 L116).

      Other points/comments/suggestions:

      a) Possible reference mistake: line 88 "For example, 20-40% of D. melanogaster Y-linked regulatory variation (YRV) comes from differences in ribosomal DNA (rDNA) copy numbers [52, 53]." reference #53 is a mouse study, not Drosophila.

      Thank you for pointing out this error; we fixed the reference (p4 L91).

      b) Possible reference mistake: line 208 "and the genes/introns that produce Y-loops differs among species [75]". ref #75 is a paper on the D. pseudoobscura Y. Is it what the authors intended?

      Yes, our previous paper (ref 75) found that Y-loops do not originate from the kl-3, kl-5, and ORY genes in D. pseudoobscura because they don’t have large introns in this species.

      c) line 113. "We recovered all known exons of the 11 canonical Y-linked genes conserved across the melanogaster group, including 58 exons missed in previous assemblies (Table S1; [55])." Please show in Table S1 which exons were missing in the previous assemblies. I guess that most if not all of these missing exons are duplicate exons (and many are likely to be pseudogenes). If they indeed are duplicate exons, the authors should make it clear in the main text, e.g., "We recovered all known exons of the 11 canonical Y-linked genes conserved across the melanogaster group, plus 58 duplicated exons missed in previous assemblies."

      Thank you for the suggestion. However, the 58 exons did not include the duplicated exons. We were similarly surprised by how much is missed when the Y chromosome is not assembled carefully. We now mark these exons in red in Table S1 to make this point clearer.

      d) line 116 "Based on the median male-to-female coverage [22], we assigned 13.7 to 18.9 Mb of Y-linked sequences per species with N50 ranging from 0.6 to 1.2 Mb." The method (or a very similar one) was developed by Hall et al BMC Genomics 2013, which should be cited in this context.

      e) line 118: "We evaluated our methods by comparing our assignments for every 10-kb window of assembled sequences to its known chromosomal location. Our assignments have 96, 98, and 99% sensitivity and 5, 0, and 3% false-positive rates in D. mauritiana, D. simulans, and D. sechellia, respectively (Table S2)." The procedure is unclear. Why break the contigs into 10kb intervals, instead of treating each as a unit, assignable to Y, X or A? The latter is the usual procedure in computational identification of suspect Y-linked contigs (Carvalho and Clark Gen Res 2013; Hall et al BMC Genomics 2013). The only reason I can think of for analyzing the contigs piecewise is a suspicion of misassemblies. If this is the case, I think it is better to explain.

      Thank you for the suggestion. We did not break the contigs into 10kb intervals when we assigned the Y-linked contigs. As you suspect, our motivation for evaluating our methods and analyzing the contigs in 10kb intervals was to detect possible misassemblies. We rewrote the sentence to make this point clearer (p6 L129-132).

      f) Fig. 1. It may be interesting to put a version of Fig 1 in the SI containing only the genes and the lines connecting them among species, so we can better see the inversions etc. (like the cover of Genetics, based on the paper by Schaeffer et al 2008).

      Thank you for the suggestion. We would like to make a figure like that fantastic cover image you refer to, but the repetitive nature of the Y chromosome makes it difficult to illustrate rearrangements based on alignments at the contig-level. We instead opted to update Figure 1 to better highlight the rearrangements, still based on the unique protein-coding genes which are supported by the FISH experiments.

      g) Table S6 (Y-linked pseudogenes). Several pseudogenes listed as new have been studied in detail before: vig2, Mocs2, Clbn, Bili (Carvalho et al PNAS 2015), Pka-R1, CG3618, Mst77F (Russell and Kaiser Genetics 1993; Krsticevic et al G3 2015). Note also that at least two are functional (the vig2 duplication and some Mst77 duplications).

      Thank you for the suggestion. We now include a column to indicate the potential function of Y-linked duplicates (see Table S6).

      h) line 421: "one new satellite, (AAACAT)n, originated from a DM412B transposable element, which has three tandem copies of AAACAT in its long terminal repeats." The birth of satellites from TEs has been observed before, and should be cited here. Dias et al GBE 6: 1302-1313, 2014.

      Thank you for the suggestion. We now include a sentence to cite this reference (p27 L467-468).

      i) Fig S2 shows that the coverage of PacBio reads is smaller than expected on the Y chromosome. Any explanation? This has been noticed before in D. melanogaster, and tentatively attributed to the CsCl gradient used in the DNA purification (Carvalho et al GenRes 2016). However, it seems that the CsCl DNA purification method was not used in the simulans clade species (is it correct?). Please explain in the ms, or in the SI. The issue is relevant because PacBio sequencing is widely believed to be unbiased in relation to DNA sequence composition (e.g., Ross et al Genome Biol 2013).

      Yes, we used Qiagen's Blood and Cell Culture DNA Midi Kit for DNA extraction. We suspect that the underrepresentation of Y-linked reads is driven by the presence of endoreplicated tissue in adults. Heterochromatin is underreplicated in endoreplicated cells, and thus there may simply be less heterochromatin in these tissues. Consistent with this idea, we find that all heterochromatin seems to be underrepresented in the reads, not just the Y chromosome (see Chakraborty et al. 2021; Flynn et al. 2020). We now include this discussion in the SI of our paper (see supplementary text p75).

      j) I may have missed it, but in which public repository have the assemblies been deposited?

      We link to the assemblies in Github (https://github.com/LarracuenteLab/simclade_Y) and they will also be in the Dryad Digital Repository (doi forthcoming).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Due to suppressed recombination, Y chromosomes have degenerated, undergone extensive structural rearrangements, and accumulated ampliconic gene families across species. The molecular processes and selective pressures guiding dynamic Y chromosome evolution are not well understood. In this study, Chang et al. generate updated Y assemblies of three closely related species in the D. simulans complex using long-read PacBio sequencing in combination with FISH. Despite the species having diverged only 250,000 years ago, the authors find structural rearrangements, two newly amplified gene families and evidence of positive selection across D. simulans. The authors also suggest the high level of Y duplications and deletions may be mediated by MMEJ biased repair.

      The authors generated a valuable resource for the study of Y-chromosome evolution in Drosophila and describe Y chromosome evolution patterns found in previous Y chromosome sequencing studies, such as newly amplified genes, positive selection, and structural rearrangements. The authors' improvements to the Drosophila simulans clade Y chromosomes are commended, as assembly of the highly repetitive Y chromosome sequences is challenging. However, the manuscript is largely descriptive, the claims are largely speculative, and it lacks a clear question. There are also a number of concerns with the text and figures (see concerns below). Overall, the manuscript would be significantly improved if the authors focused on a specific question as opposed to a survey of sequence features of the Y chromosome. For example, development of the idea that MMEJ is the primary mechanism for loss of Y chromosome sequence could be a nice new twist.

      Our aim is to discover and understand the many different factors and processes that shape the evolution of Y chromosome organization and function. Because these Y chromosomes were largely unassembled, we needed to first generate the sequence assembly before we could ask specific questions. We prefer not to focus the manuscript solely on one specific topic such as MMEJ repair, as our other observations and analyses may be interesting to a wide range of scientists studying topics other than mutation and DNA repair. We are therefore choosing to present the more comprehensive story about Y chromosome evolution that we included in our original manuscript.

      We also respectfully disagree with the comment that our paper is just a descriptive survey of Y chromosomal sequence features. On the contrary, we present thorough evolutionary analyses to test hypotheses about the forces shaping the evolution of Y chromosome organization and Y-linked genes. Specifically, we use molecular evolution and phylogenetic and comparative genomics approaches to show that multi-copy gene families experience rampant gene conversion and positive selection. We posit that one simulans clade-specific Y-linked gene family has undergone subfunctionalization, potentially resolving sexual conflict, and another may be involved in meiotic drive. We also use evolutionary genomic approaches to show that the distribution of Y-linked mutations indeed suggests that Y chromosomes disproportionately use MMEJ and we propose that this unique feature may shape the evolution of Y chromosome structural organization. This is, as far as we know, a novel hypothesis. We think that follow-up studies of either hypothesis merit different papers.

      **Major concerns:**

      Title: The authors use "unique structure" in the title, which is a vague point. Are not Y chromosomes, or any chromosome, "unique" in some manner? Also, are there not more evolutionary processes governing the rapid divergence of the Y's?

      Thank you for raising your concern. We believe that we are justified in referring to the Y chromosome as unique among all other chromosomes in its structural properties (e.g. combination of its hemizygosity, abundant tandem repeats, large scale rearrangements, and highly amplified testis-specific genes). Because there are many properties of Y chromosomes that we believe contribute to their rapid divergence, we opted for the general phrase ‘unique structure’ to capture all of these features. Many evolutionary processes likely shape the evolution of that unique structure (e.g. Muller’s Ratchet, background selection, Hill Robertson effects; see Charlesworth and Charlesworth 2000 for a review), and these processes are well-studied, especially on newly evolved sex chromosomes. Here our focus is on evolutionarily old Y chromosomes, which may have comparatively fewer targets of purifying selection and are more likely to be shaped by positive selection (Bachtrog 2008).

      p.2, line 53-56: The authors claim that sexually antagonistic selection and regulatory evolution are causes of recombination suppression. Couldn't this statement be reversed? Recombination suppression via inversions or other rearrangements enable sexually antagonistic selection. This is a chicken or egg question, so it should be revised to have both possibilities be equal.

      Thank you for the suggestion. We think that it is unlikely that recombination suppression itself is beneficial, but for sexually antagonistic selection and regulatory evolution, recombination suppression can have short-term benefits. We rephrased this sentence to be agnostic about the direction (p2 L56).

      p.5, 118-120: Are the assemblies de novo or have they been guided based upon the D. melanogaster Y chromosome assembly? Please clarify how the authors evaluate their methods by comparing their Y-sequence assignments to known chromosomal locations.

      Thank you for the suggestion. We didn’t use D. melanogaster Y chromosome assembly to guide our assemblies. “All assemblies are generated de novo”, and thus we don’t think there is any potential bias. We first assigned Y-linked sequences using the presence of known Y-linked genes, and used this assignment to evaluate our methods. We now make the sentence clear (p5 L112).

      While the gene copy number estimates are accurate, the PacBio-based genome assemblies are still not able to accurately assemble large segmental duplications (see Evan Eichler's laboratories recent primate and human genome assemblies). A statement mentioning the concerns about accuracy of the underlying sequence and genomic architecture shown should be included in the main text. FISH provides support for the location of the contigs, but not for the accuracy of the underlying genomic architecture.

      Thank you for the suggestion. We can’t validate all Y-linked regions. We did validate the larger structural features of the assembly and only discuss the results that we are confident in. We now include sentences to address this concern (p7 L150-152).

      The authors assigned Y-linked sequences based on median male-to-female coverage. Is this method feasible for assigning ampliconic sequence to the Y given the N50 of 0.6-1.2Mb? Are the authors potentially excluding novel Y-linked ampliconic sequence?

      We validated our methods to assign contigs to a chromosome by comparing 10-kb intervals to the contigs with known chromosomal location, including the Y chromosome. Our assignments have high (96, 98, and 99%) sensitivity and low (5, 0, and 3%) false-positive rates in D. mauritiana, D. simulans, and D. sechellia, respectively (see Table S2). Based on these results, we think that this method is reasonable for Y-linked contigs with N50 of 0.6-1.2Mb.
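
The window-based evaluation described above can be sketched as follows. This is an illustrative outline, not the authors' code: the function names, the 0.1 ratio cutoff, and the toy inputs are assumptions, and "false-positive rate" is taken here to mean FP / (FP + TN), which may differ from the definition used for Table S2. A window is called Y-linked when its median female coverage is low relative to its median male coverage, and the calls are then scored against windows of known chromosomal location.

```python
from statistics import median


def is_y_linked_window(male_cov, female_cov, max_female_to_male=0.1):
    """Call a window Y-linked from per-base coverage in males and females.

    The 0.1 cutoff is a hypothetical choice for illustration only.
    """
    m = median(male_cov)
    f = median(female_cov)
    if m == 0:
        return False  # no male coverage: cannot be called Y-linked
    return (f / m) <= max_female_to_male


def sensitivity_and_fpr(calls, truth):
    """Score boolean Y-linked calls against known Y-linkage per window.

    Sensitivity = TP / (TP + FN); false-positive rate = FP / (FP + TN)
    (an assumed definition, see the note above).
    """
    tp = sum(1 for c, t in zip(calls, truth) if c and t)
    fn = sum(1 for c, t in zip(calls, truth) if not c and t)
    fp = sum(1 for c, t in zip(calls, truth) if c and not t)
    tn = sum(1 for c, t in zip(calls, truth) if not c and not t)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return sens, fpr


# Toy example: five windows with known location (True = Y-linked).
calls = [True, True, False, False, True]
truth = [True, True, True, False, False]
sens, fpr = sensitivity_and_fpr(calls, truth)
```

Scoring fixed-size windows rather than whole contigs means a contig that mixes Y-linked and non-Y-linked sequence shows up as disagreeing windows, which matches the stated motivation of detecting possible misassemblies.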

      We might exclude some novel Y-linked sequences since we only assigned ~15Mb out of a total ~40 Mb Y-linked sequences. We acknowledged this possibility, and now include a sentence to address this concern (p31 L554-556).

      Where did the rDNA sequences go in D. simulans and D. sechellia? Can they be detected on another chromosome?

      Please see Fig S5 for detailed results. We found a few copies of rDNA on the contigs of autosomes. We assembled many copies of rDNA that can’t be confidently assigned to Y chromosomes. It’s possible that they might be located on other chromosomes. Based on our FISH data (Fig S4) and previous papers, most of these non-Y-linked rDNA copies should be on the X chromosome. However, in this study, we did not make a concerted effort to assign X-linked contigs.

      Figure 2B is hard to follow and it is unclear what additional value it provides to part A. Why is expression level of specific exons important?

      Exon duplication may be an important contributor to Y-linked gene evolution: most genes have duplications and our figure shows that at least some of these duplicates are expressed. The patterns we see indicate that duplication may play different roles in genes depending on their length. For example, the duplications involving short genes (e.g., ARY) may be functional and influence protein expression, whereas duplications involving large genes (e.g. kl-2) may not influence the overall protein expression level from this gene, although the expressed duplicated exons may play some other role. We revised a sentence in the main text and added a sentence to the figure 2 legend to make this point clearer.

      Figure 3 There are many introns that contain gaps, so it is unclear how confident one can be in intron length when there are gaps.

      Indeed, we are not confident about the length of introns with gaps. Therefore, we separated these introns and showed them in different colors.

      Figure 4: What are the authors using as a common ancestor in this figure to infer duplications in the initial branch?

      We used phylogenies to infer the origin of Y-linked duplicates. Any duplications that happened earlier than the divergence among the four species are listed on the initial branch. We also edited the legend to make this point clearer.

      p.15, paragraph 2: The authors describe a newly amplified gene, CK2Btes-Y, in D. simulans. In the first half of the paragraph the authors state that Y-linked copies are also found in D. melanogaster but have "degenerated and have little or no expression" and call them pseudogenes. Later in the paragraph, the authors state that the D. melanogaster Y-linked copies are Su(Ste), a source of piRNAs that are in conflict with X-linked Stellate. Lastly in the paragraph, the authors discuss Su(ste) as a D. melanogaster homolog of CK2Btes-Y. The logic of defining CK2Btes-Y origins is confusing. Was CK2Btes-Y independently amplified on the D. simulans Y, or were CK2BtesY and Su(Ste) amplified in a common ancestor but independently diverged?

      The amplification of CK2Btes-Y and CK2Btes-like happened in the ancestor of D. melanogaster and D. simulans (Fig S11). However, both CK2Btes-Y and CK2Btes-like became pseudogenes in D. melanogaster (D. melanogaster CK2Btes-Y is named PCKR in a previous study). On the other hand, Ste and Su(Ste) are limited to D. melanogaster based on phylogenetic analyses (Fig 5A) and are a chimera of CK2Btes-like and NACBtes. The evolutionary history of this gene family has been detailed in other papers, except for the presence of CK2Btes-Y in the D. simulans complex, which we describe for the first time in this study. We now include a new figure (Figure 5B), a schematic of the inferred evolutionary history of sex-linked Ssl/CK2ßtes paralogs.

      Figure 5: Is each FISH signal a different gene copy?

      Yes, based on our assemblies, Lhk-1 and Lhk-2 are mostly located on different contigs. Unfortunately, we are not able to design probes that can separate Lhk-1 from Lhk-2.

      The authors suggest DNA-repair on the Y chromosome is biased towards MMEJ based on indel size and microhomologies. Is there any evidence MMEJ is responsible for variable intron length in the canonical Y-linked genes or the amplification of new gene families? Since MMEJ is error-prone, it's a more tolerable repair mechanism in pseudogenes, so their findings might be biased. Rather than comparing pseudogenes to their parent genes, they should compare chrY pseudogenes to autosomal pseudogenes. Even better would be to track MMEJ on the dot chromosome, which is known not to recombine and is highly heterochromatic like the Y chromosome.

      We did compare chrY pseudogenes to autosomal pseudogenes in our study. We also add new analyses to address other issues from reviewer 2, which are similar to your concern. We now include data from pericentric heterochromatin and pseudogenes (see Fig 7). Both data types support our conclusion that indel size is only larger on Y chromosomes. This is consistent with a report that the dot chromosome and pericentric heterochromatin have similar indel size distributions (Blumenstiel et al. 2002).

      Reviewer #3 (Significance (Required)):

      While it is a benefit to have much improved Y chromosome assemblies from the three D. simulans clade species, the gap in knowledge this manuscript is trying to address is unclear. The manuscript is almost entirely descriptive and the figures are difficult to follow.

      As stated above, we respectfully disagree with the comment that the manuscript is entirely descriptive, as we present thorough evolutionary analyses to test hypotheses about the forces shaping the evolution of Y chromosome organization and Y-linked genes. We have two guiding hypotheses about the importance of sexual antagonism and DNA repair pathways for Y chromosome evolution, and we conduct sequence analyses that support these hypotheses that sexual antagonism and MMEJ affect Y chromosome evolution.

      References cited in this response:

      Bachtrog D. The temporal dynamics of processes underlying Y chromosome degeneration. Genetics. 2008 Jul;179(3):1513-25. doi: 10.1534/genetics.107.084012. Epub 2008 Jun 18. PMID: 18562655; PMCID: PMC2475751.

      Blumenstiel JP, Hartl DL, Lozovsky ER. Patterns of Insertion and Deletion in Contrasting Chromatin Domains. Molecular Biology and Evolution, Volume 19, Issue 12, December 2002, Pages 2211–2225, https://doi.org/10.1093/oxfordjournals.molbev.a004045

      Chakraborty M, Chang CH, Khost DE, Vedanayagam J, Adrion JR, Liao Y, Montooth KL, Meiklejohn CD, Larracuente AM, Emerson JJ. Evolution of genome structure in the Drosophila simulans species complex. Genome Res. 2021 Mar;31(3):380-396. doi: 10.1101/gr.263442.120. Epub 2021 Feb 9. PMID: 33563718; PMCID: PMC7919458.

      Charlesworth B, Charlesworth D. The degeneration of Y chromosomes. Philos Trans R Soc Lond B Biol Sci. 2000 Nov 29;355(1403):1563-72. doi: 10.1098/rstb.2000.0717. PMID: 11127901; PMCID: PMC1692900.

      Flynn J, Long M, Wing RA, Clark AG. Evolutionary Dynamics of Abundant 7-bp Satellites in the Genome of Drosophila virilis. Molecular Biology and Evolution, Volume 37, Issue 5, May 2020, Pages 1362–1375, https://doi.org/10.1093/molbev/msaa010

      McKee, Bruce D. et al. “On the Roles of Heterochromatin and Euchromatin in Meiosis in Drosophila: Mapping Chromosomal Pairing Sites and Testing Candidate Mutations for Effects on X–Y Nondisjunction and Meiotic Drive in Male Meiosis.” Genetica 109 (2000): 77-93.

      Tobler R, Nolte V, Schlötterer C. High rate of translocation-based gene birth on the Drosophila Y chromosome. Proc Natl Acad Sci U S A. 2017 Oct 31;114(44):11721-11726. doi: 10.1073/pnas.1706502114. Epub 2017 Oct 19. PMID: 29078298; PMCID: PMC5676891.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Due to suppressed recombination, Y chromosomes have degenerated, undergone extensive structural rearrangements, and accumulated ampliconic gene families across species. The molecular processes and selective pressures guiding dynamic Y chromosome evolution are not well understood. In this study, Chang et al. generate updated Y assemblies of three closely related species in the D. simulans complex using long-read PacBio sequencing in combination with FISH. Despite the species having diverged only 250,000 years ago, the authors find structural rearrangements, two newly amplified gene families and evidence of positive selection across D. simulans. The authors also suggest the high level of Y duplications and deletions may be mediated by MMEJ biased repair.

      The authors generated a valuable resource for the study of Y-chromosome evolution in Drosophila and describe Y chromosome evolution patterns found in previous Y chromosome sequencing studies, such as newly amplified genes, positive selection, and structural rearrangements. The authors' improvements to the Drosophila simulans clade Y chromosomes are to be commended, as assembly of the highly repetitive Y chromosome sequences is challenging. However, the manuscript is largely descriptive, its claims are largely speculative, and it lacks a clear question. There are also a number of concerns with the text and figures (see concerns below). Overall, the manuscript would be significantly improved if the authors focused on a specific question as opposed to a survey of sequence features of the Y chromosome. For example, development of the idea that MMEJ is the primary mechanism for loss of Y chromosome sequence could be a nice new twist.

      Major concerns:

      1. Title: The authors use "unique structure" in the title, which is a vague point. Are not Y chromosomes, or any chromosome, "unique" in some manner? Also, are there not more evolutionary processes governing the rapid divergence of the Y's?
      2. p.2, lines 53-56: The authors claim that sexually antagonistic selection and regulatory evolution are causes of recombination suppression. Couldn't this statement be reversed? Recombination suppression via inversions or other rearrangements enables sexually antagonistic selection. This is a chicken-or-egg question, so the text should be revised to present both possibilities equally.
      3. p.5, 118-120: Are the assemblies de novo or have they been guided based upon the D. melanogaster Y chromosome assembly? Please clarify how the authors evaluate their methods by comparing their Y-sequence assignments to known chromosomal locations.
      4. While the gene copy number estimates are accurate, the PacBio-based genome assemblies are still not able to accurately assemble large segmental duplications (see the recent primate and human genome assemblies from Evan Eichler's laboratory). A statement mentioning the concerns about accuracy of the underlying sequence and genomic architecture shown should be included in the main text. FISH provides support for the location of the contigs, but not for the accuracy of the underlying genomic architecture.
      5. The authors assigned Y-linked sequences based on median male-to-female coverage. Is this method feasible for assigning ampliconic sequence to the Y given the N50 of 0.6-1.2Mb? Are the authors potentially excluding novel Y-linked ampliconic sequence?
      6. Where did the rDNA sequences go in D. simulans and D. sechellia? Can they be detected on another chromosome?
      7. Figure 2B is hard to follow and it is unclear what additional value it provides to part A. Why is expression level of specific exons important?
      8. Figure 3: Many introns contain gaps, so it is unclear how confident one can be in the reported intron lengths.
      9. Figure 4: What are the authors using as a common ancestor in this figure to infer duplications in the initial branch?
      10. p.15, paragraph 2: The authors describe a newly amplified gene, CK2Btes-Y, in D. simulans. In the first half of the paragraph the authors state that Y-linked copies are also found in D. melanogaster but have "degenerated and have little or no expression" and call them pseudogenes. Later in the paragraph, the authors state that the D. melanogaster Y-linked copies are Su(Ste), a source of piRNAs that are in conflict with X-linked Stellate. Lastly in the paragraph, the authors discuss Su(Ste) as a D. melanogaster homolog of CK2Btes-Y. The logic of defining CK2Btes-Y origins is confusing. Was CK2Btes-Y independently amplified on the D. simulans Y, or were CK2Btes-Y and Su(Ste) amplified in a common ancestor but independently diverged?
      11. Figure 5: Is each FISH signal a different gene copy?
      12. The authors suggest DNA repair on the Y chromosome is biased towards MMEJ based on indel size and microhomologies. Is there any evidence MMEJ is responsible for variable intron length in the canonical Y-linked genes or the amplification of new gene families? Since MMEJ is error-prone, it is a more tolerable repair mechanism in pseudogenes, so their findings might be biased. Rather than comparing pseudogenes to their parent genes, they should compare chrY pseudogenes to autosomal pseudogenes. Even better would be to track MMEJ on the dot chromosome, which is known not to recombine and is highly heterochromatic like the Y chromosome.
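
      The concern in point 5 can be illustrated with a minimal sketch of coverage-based Y assignment. This is not the authors' pipeline: the function name, the threshold `min_ratio`, the `pseudocount`, and all depth values are hypothetical and chosen only to show why a Y-linked amplicon with paralogs elsewhere in the genome might be missed.

```python
from statistics import median

def classify_contig(male_depth, female_depth, min_ratio=4.0, pseudocount=1.0):
    """Return "Y" if the median male-to-female depth ratio is at least min_ratio.

    male_depth, female_depth: per-window read depths along one contig.
    min_ratio and pseudocount are illustrative values, not the authors' settings.
    """
    if not male_depth or len(male_depth) != len(female_depth):
        raise ValueError("depth lists must be non-empty and of equal length")
    # The pseudocount avoids division by zero where females have no reads.
    ratios = [(m + pseudocount) / (f + pseudocount)
              for m, f in zip(male_depth, female_depth)]
    return "Y" if median(ratios) >= min_ratio else "not Y"

# Single-copy Y-linked contig: reads in males, almost none in females.
print(classify_contig([20, 22, 18], [0, 0, 1]))
# Autosomal contig: similar depth in both sexes.
print(classify_contig([40, 42, 38], [40, 41, 39]))
# Hypothetical Y-linked amplicon with X-linked or autosomal paralogs:
# female reads mis-map to it, the ratio drops, and it is not called Y-linked.
print(classify_contig([20, 20, 20], [10, 12, 9]))
```

      Under these assumed numbers the third contig falls below the threshold even though it is Y-linked, which is the kind of exclusion the question about novel ampliconic sequence refers to.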

      Significance

      While it is a benefit to have much improved Y chromosome assemblies from the three D. simulans clade species, the gap in knowledge this manuscript is trying to address is unclear. The manuscript is almost entirely descriptive and the figures are difficult to follow.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript describes a thorough investigation of the Y chromosomes of three very closely related Drosophila species (D. simulans, D. sechellia, and D. mauritiana), which in turn are closely related to D. melanogaster. The D. melanogaster Y was analysed in a previous paper by the same group. The authors found an astonishing level of structural rearrangements (gene order, copy number, etc.), especially taking into account the short divergence time among the three species (~250 thousand years). They also suggest an explanation for this fast evolution: the Y chromosome is haploid, and hence double-strand breaks cannot be repaired by homologous recombination. Instead, it must use the less precise mechanisms of NHEJ and MMEJ. They also provide circumstantial evidence that MMEJ (which is very prone to generate large rearrangements) is the preferred mechanism of repair. As far as I know this hypothesis is new, and fits nicely with the fast structural evolution described by the authors. Finally, the authors describe two intriguing Y-linked gene families in D. simulans (Lhk and CK2ßtes-Y), one of them similar to the Stellate / Suppressor of Stellate system of D. melanogaster, which seems to be evolving as part of an X-Y meiotic drive arms race. Overall, it is a very nice piece of work. I have four criticisms that, in my opinion, should be addressed before acceptance.

      The suggestion/conclusion that MMEJ is the preferential repair mechanism (over NHEJ) should be better supported and explained. At line 387, the authors stated "The pattern of excess large deletions is shared in the three D. simulans clade species Y chromosomes, but is not obvious in D. melanogaster (Fig 6B). However, because all D. melanogaster Y-linked indels in our analyses are from copies of a single pseudogene (CR43975), it is difficult to compare to the larger samples in the simulans clade species (duplicates from 16 genes)." Given that D. melanogaster has many Y-linked pseudogenes (described by the authors and by other researchers, and listed in Table S6), there seems to be no reason to use a sample size of 1 in this species. Furthermore, given that D. melanogaster is THE model organism, it is the species that most likely will provide information to assess the "preferential MMEJ" hypothesis proposed by the authors.

      Still on the suggestion/conclusion that MMEJ is the preferential repair mechanism (over NHEJ): the Y chromosome is heterochromatic, haploid and non-recombining. In order to ascribe its mutational pattern to the haploid state (and the consequent impossibility of homologous recombination repair), the authors compared it to chromosome IV (the so-called "dot chromosome"). This may not be the best choice: while chr IV lacks recombination in wild-type flies, it is not typical heterochromatin. E.g., "results from genetic analyses, genomic studies, and biochemical investigations have revealed the dot chromosome to be unique, having a mixture of characteristics of euchromatin and of constitutive heterochromatin" (Riddle and Elgin, FlyBook 2018; https://doi.org/10.1534/genetics.118.301146). Given this, it seems appropriate to also compare the Y-linked pseudogenes with those from typical heterochromatin. In Drosophila, these are the regions around the centromeres ("centric heterochromatin"). There are pseudogenes there; e.g., the gene rolled is known to have partially duplicated exons.

      In some passages of the ms there seems to be a confusion between new genes and pseudogenes, which should be corrected. For example, in line 261: "Most new Y-linked genes in D. melanogaster and the D. simulans clade have presumed functions in chromatin modification, cell division, and sexual reproduction (Table S7)". Which are these "new genes"? If they are those listed in Table S6 (as other passages of the text suggest), most if not all of them are pseudogenes. If they are pseudogenes, it is not appropriate to refer to them as "new genes". The same ambiguity is present in line 263: "Y-linked duplicates of genes with these functions may be selectively beneficial, but a duplication bias could also contribute to this enrichment (...)". Pseudogenes can be selectively beneficial, but only in very special cases (e.g., gene regulation). If the authors are suggesting this, they must openly state it, and explain why. Pseudogenes are common in nearly all genomes, and should be clearly separated from genes (the latter as a shortcut for functional genes). The bar for "genes" is much higher than simple sequence similarity, including expression, evidence of purifying selection, etc., as the authors themselves applied for the two gene families they identified in D. simulans (Lhk and CK2ßtes-Y).

      The authors center their analysis on "11 canonical Y-linked genes conserved across the melanogaster group". Why did they exclude the CG41561 gene, identified by Mahajan & Bachtrog (2017) in D. melanogaster? Given that most D. melanogaster Y-linked genes were acquired before the split from the D. simulans clade (Koerich et al Nature 2008), the same most likely is true for CG41561 (i.e., it would be Y-linked in the D. simulans clade). Indeed, computational analysis gave a strong signal of Y-linkage in D. yakuba (unpublished; I have not looked in the other species). If CG41561 is Y-linked in the simulans clade, it should be included in the present paper, for the only difference between it and the remaining "canonical genes" is that it was found later. Finally, the proper citation for the "11 canonical Y-linked genes" is Gepner and Hays PNAS 1993 and Carvalho, Koerich and Clark TIG 2009 (or the primary papers), instead of ref #55.

      Other points/comments/suggestions:

      a) Possible reference mistake: line 88 "For example, 20-40% of D. melanogaster Y-linked regulatory variation (YRV) comes from differences in ribosomal DNA (rDNA) copy numbers [52, 53]." reference #53 is a mouse study, not Drosophila.

      b) Possible reference mistake: line 208 "and the genes/introns that produce Y-loops differs among species [75]". ref #75 is a paper on the D. pseudoobscura Y. Is it what the authors intended?

      c) line 113. "We recovered all known exons of the 11 canonical Y-linked genes conserved across the melanogaster group, including 58 exons missed in previous assemblies (Table S1; [55])." Please show in Table S1 which exons were missing in the previous assemblies. I guess that most if not all of these missing exons are duplicate exons (and many are likely to be pseudogenes). If they are indeed duplicate exons, the authors should make it clear in the main text, e.g., "We recovered all known exons of the 11 canonical Y-linked genes conserved across the melanogaster group, plus 58 duplicated exons missed in previous assemblies."

      d) line 116 "Based on the median male-to-female coverage [22], we assigned 13.7 to 18.9 Mb of Y-linked sequences per species with N50 ranging from 0.6 to 1.2 Mb." The method (or a very similar one) was developed by Hall et al BMC Genomics 2013, which should be cited in this context.

      e) line 118: "We evaluated our methods by comparing our assignments for every 10-kb window of assembled sequences to its known chromosomal location. Our assignments have 96, 98, and 99% sensitivity and 5, 0, and 3% false-positive rates in D. mauritiana, D. simulans, and D. sechellia, respectively (Table S2)." The procedure is unclear. Why break the contigs into 10-kb intervals, instead of treating each as a unit, assignable to Y, X or A? The latter is the usual procedure in computational identification of suspect Y-linked contigs (Carvalho and Clark Gen Res 2013; Hall et al BMC Genomics 2013). The only reason I can think of for analyzing the contigs piecewise is a suspicion of misassemblies. If this is the case, I think it is better to explain.

      f) Fig. 1. It may be interesting to put a version of Fig 1 in the SI containing only the genes and the lines connecting them among species, so we can better see the inversions etc. (like the cover of Genetics , based on the paper by Schaeffer et al 2008).

      g) Table S6 (Y-linked pseudogenes). Several pseudogenes listed as new have been studied in detail before: vig2, Mocs2, Clbn, Bili (Carvalho et al PNAS 2015); Pka-R1, CG3618, Mst77F (Russell and Kaiser Genetics 1993; Krsticevic et al G3 2015). Note also that at least two are functional (the vig2 duplication and some Mst77F duplications).

      h) line 421: "one new satellite, (AAACAT)n, originated from a DM412B transposable element, which has three tandem copies of AAACAT in its long terminal repeats." The birth of satellites from TEs has been observed before, and should be cited here. Dias et al GBE 6: 1302-1313, 2014.

      i) Fig S2 shows that the coverage of PacBio reads is smaller than expected on the Y chromosome. Any explanation? This has been noticed before in D. melanogaster, and tentatively attributed to the CsCl gradient used in the DNA purification (Carvalho et al GenRes 2016). However, it seems that the CsCl DNA purification method was not used in the simulans clade species (is this correct?). Please explain in the ms, or in the SI. The issue is relevant because PacBio sequencing is widely believed to be unbiased in relation to DNA sequence composition (e.g., Ross et al Genome Biol 2013).

      j) I may have missed it, but in which public repository have the assemblies been deposited?
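
      For point e), the window-based evaluation can be sketched as follows. This is an assumption-laden illustration, not the authors' code: it takes the conventional definitions sensitivity = TP / (TP + FN) and false-positive rate = FP / (FP + TN), which may differ from the definitions behind Table S2, and the window labels are invented.

```python
def sensitivity_and_fpr(predicted, truth, positive="Y"):
    """Compare per-window chromosome calls against known locations.

    predicted, truth: equal-length lists of labels such as "Y", "X", "A".
    Returns (sensitivity, false_positive_rate); a rate is None when its
    denominator is zero.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    tp = fn = fp = tn = 0
    for p, t in zip(predicted, truth):
        if t == positive:
            if p == positive:
                tp += 1   # truly Y-linked window called Y
            else:
                fn += 1   # truly Y-linked window missed
        else:
            if p == positive:
                fp += 1   # non-Y window wrongly called Y
            else:
                tn += 1   # non-Y window correctly rejected
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    fpr = fp / (fp + tn) if (fp + tn) else None
    return sensitivity, fpr

# Hypothetical labels for six 10-kb windows.
calls = ["Y", "Y", "A", "Y", "X", "A"]
known = ["Y", "Y", "Y", "A", "X", "A"]
print(sensitivity_and_fpr(calls, known))
```

      Scoring whole contigs instead of windows, as the cited earlier studies do, would use the same function with one label per contig.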

      Significance

      see above.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      I found this an exceptionally impressive manuscript. Studying the evolution of Y chromosomes has until recently been nearly impossible, and this research group have pioneered approaches that can yield reliable results in Drosophila. The study used an innovative heterochromatin-sensitive assembly pipeline on three D. simulans clade species, D. simulans, D. mauritiana and D. sechellia, which diverged less than 250 KYA, allowing comparisons with the group's previous results for the D. melanogaster Y.

      The study is both technically impressive and extremely interesting (a highly unusual combination). It includes a rich set of interesting results about these genome regions, and furthermore the results are discussed in a well-organised way, relating both to previous observations and to understanding of the genetics and evolution of Y chromosomes, illuminating all these aspects. It is a rare pleasure to read such a study. I believe that this study will inspire and be a model for future work on these chromosomes. It shows how these difficult genome regions can be studied.

      Major comments:

      The conclusions are convincing. The methods are explained unusually clearly, and the reasoning from the results is convincing. When appropriate, the caveats are clearly explained. The material is clearly organised and the questions studied are well related to the results. I had a few minor comments concerning the English. Even the figures (often a major problem to understand) are very clear and helpful, with proper explanations. I have very rarely read such a good manuscript, and almost never (in a long career) found a manuscript that could be published without revision being necessary.

      The analysis found 58 exons missed in previous assemblies (as well as all previously known exons of the 11 canonical Y-linked genes, which are present in at least one copy across the group). FISH on mitotic chromosomes using probes for 12 Y-linked sequences was used to determine the centromere locations, and to determine gene orders and relate them to the cytological chromosome bands, demonstrating changes in satellite distribution, gene order, and centromere positions between the Y chromosomes of the D. simulans clade species. It also confirmed previous results for Y-linked ribosomal DNA genes, which are responsible for X-Y pairing in D. melanogaster males. Although 28S rDNA has been lost in D. simulans and D. sechellia (but not in D. mauritiana), the intergenic spacer (IGS) repeats between these repeats are retained on both sex chromosomes in all three species. Only sequencing can reliably reveal this, as their abundance is below the detection level by FISH in D. sechellia.

      The 11 canonical Y-linked genes' copy numbers vary between the species. Some duplicates are expressed and have complete open reading frames, and may therefore be functional, but most include only a subset of exons, often with duplicated exons flanking the presumed functional gene copy. Mega-introns and Y-loops were found, as already seen in Drosophila species, but this new study detects turnovers in the ~2 million years separating D. melanogaster and the D. simulans clade. 49 independent duplications onto the Y chromosome were detected, including 8 not previously detected. At least half show no expression in testes, or lack open reading frames, so they are probably pseudogenes.

      Testis-expressed genes may be especially likely to duplicate into the Y chromosome due to its open chromatin structure and transcriptional activity during spermatogenesis, and indeed most of the new Y-linked genes in the clade studied have likely functions in chromatin modification, cell division, and sexual reproduction. The study discovered two new gene families that have undergone amplification on D. simulans clade Y chromosomes, reaching very high copy numbers (36-146). Both these families appear to encode functional protein-coding genes and show high expression. The paper describes intriguing results that illuminate Y chromosome evolution. The first, SRPK, arose by an autosome-to-Y duplication of the sequence encoding the testis-specific isoform of the gene SR Protein Kinase (SRPK), after which the autosomal copy lost its testis-specific exon via a deletion. In D. melanogaster, SRPK is essential for both male and female reproduction, so the relocation of the testis-specific isoform to the Y chromosome in the D. simulans clade suggests that the change may have been advantageous by resolving sexual antagonism. The paper presents convincing evidence that the Y copy evolved under positive selection, and that gene amplification may confer advantageous increased expression in males.

      The second amplified gene family is also potentially related to an interesting function. Both X-linked and Y-linked duplicates are found of a gene called Ssl located on chromosome 2R. In D. simulans, the X-linked copies were previously known, and called CK2ßtes-like. In D. melanogaster, degenerated Y-linked copies are also found, with little or no expression, contrasting with complete open reading frames and high expression in testes in the D. simulans clade species, consistent with the possibility of an arms race between sex chromosome meiotic drive factors.

      Other interesting analyses document higher gene conversion rates compared to the other chromosomes, and evidence that these Y chromosomes may differ in the DNA-repair mechanisms (preferentially using MMEJ instead of NHEJ), perhaps contributing to their high rates of intrachromosomal duplication and structural rearrangements. The authors relate this to evidence for turnover of Y-linked satellite sequences, with the discovery of five new Y-linked satellites, whose locations were validated using FISH. The study also documented enrichment of LTR retrotransposons on the D. simulans clade Y chromosomes relative to the rest of the genome, together with turnovers between the species.

      Significance

      As described above, the advances are both technical and conceptual for the field. The manuscript itself does an excellent job of placing the work in the context of the existing literature.

      • Anyone working on sex chromosomes and other non-recombining genome regions should be interested in the findings reported.

      • My field of expertise is the evolution of sex chromosomes, and the evolution of genome regions with suppressed recombination. I have experience of genomic analyses. I have less expertise in analyses of gene expression, but I understand enough about such approaches to evaluate the parts of this study that use them.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Summary

      Fusion and fission of the mitochondrial network has been one of the hottest topics in mitochondrial biology in recent years. The process is obviously necessary to allow cells to control the quality of small individual organelles, which are degraded by autophagy or mitophagy if they are not working properly. Since a healthy mitochondrial network is essential for every cell in the body, the molecular players involved in these two processes are heavily investigated. In this paper, the authors investigate in vitro and in vivo the role of MTFP1, a protein which has been studied, and named, because it seemed to be an important fission factor in cultured cells.

      Surprisingly and excitingly, the authors find that mitochondrial morphology and homeostasis are not affected by knocking out this protein in the heart of mice. On the contrary, they show that this protein is a critical regulator of mitochondrial inner membrane coupling via the adenine nucleotide transporter (ANT). A loss of MTFP1 leads to a decline in the mitochondrial membrane potential, leading to cell death, which finally results in dilated cardiomyopathy and causes early death of the animals. Therefore, this paper gives an important mitochondrial inner membrane protein a new role which may become very important to understand the opening of the large channel (MPTP channel), which is responsible for some kinds of cell death in almost all cell types.

      Major comments:

      The conclusions are convincing, additional experiments on the molecular nature of the interaction between MTFP1 and ANT may be easily proposed by a reviewer; however, this will open a completely new line of research and should not be asked at this moment. Data and methods are presented in a perfect way, typical for the Wai lab. Statistical analysis has been performed meticulously, and there is nothing to add here. I have read the paper very carefully, but cannot find many points which should be changed.

      Minor Points:

      I must admit I hate the title, but the authors are in good company using the "genetic argument", as many others do. Mitochondrial fission process controls energetic efficiency - that is correct, but it does not prevent inflammatory cardiomyopathy and heart failure in mice. It is intact mitochondria which prevent inflammatory cardiomyopathy and heart failure, and as long as we do not know what exactly MTFP1 does, this title is misleading, although it may be considered attractive for readers. I would reformulate that and mention the new role of this protein in coupling of the mitochondrial inner membrane potential, but I leave this to the authors, of course.

      P. 2, line 45: The loss of MTFP1 promotes ... (erase "the")

      P. 12, line 321: There is clearly no indication of mitochondrial elongation, but I do see clearly in these pictures a separation between the organelles in the mutant mice in contrast to wild type, where mitochondria touch each other (Fig. 3c to d). If this is consistent, it should be mentioned.

      P. 12, line 324: I am not a true expert in fusion and fission, so wouldn't a blot showing all the OPA1 isoforms be necessary here?

      P. 13, line 341: The same argument is repeated in two sentences following each other. I suggest writing here "Our data collectively indicate that MTFP1, unlike DRP1, is not an essential fission protein, contrary to its namesake, either in vitro or in vivo."

      P. 13, line 349: "We sought to investigate..."

      Significance

      Understanding mitochondrial dynamics (fusion and fission) and bioenergetics (which some people considered to be fully known since the 1950s) is of utmost importance for biology and biomedicine. Since this paper gives a prominent protein, which the field believes is a fission factor, a completely new role, it is a paper of high interest. As the authors state, using these mice the protein may help to understand the molecular function of the mitochondrial membrane permeability transition pore (MPTP), which is still enigmatic, but important for so many ways of cell death. The paper is therefore state of the art and at the frontline of cell biology, and the large mitochondrial community will be very interested to read the paper.

      I have been working on mitochondria for 35 years, starting with bioenergetics, switching then to mitochondrial biogenesis regulated by transcription of nuclear genes as well as the mitochondrial genome, followed by studying the consequences of mtDNA mutations, and now considering how mitochondrial dysfunction may be involved in the normal aging process. Therefore, I feel competent to critically judge the quality of this paper. I am not a molecular biologist; therefore, the molecular details of protein-protein interaction do not lie in the focus of my interest. On the contrary, I feel that sometimes too much emphasis is laid on such molecular details, while the big question - in this case, how mitochondrial membrane potential is regulated - is not addressed at all.

      Referee Cross-commenting

      I guess we all suffer from reviewers of our own papers asking for more mechanistic insight. This paper unexpectedly shows a new role for MTFP1 - which is important for the mito community - and opens the door to more mechanistic studies of how it uncouples the mitos and leads to cell death via ANT and MPTP - which is important for a very broad community.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Donnarumma et al. characterize cardiac-specific KO of Mitochondrial fission process 1 (MTFP1), a mysterious mitochondrial protein thought to be involved in mitochondrial inner membrane fission. They initially demonstrate that survival, cardiac function and respiration are diminished in the KO mouse and seek to find a mechanism. In MEF cells, surprisingly, they do not report any changes in fission, though mitochondrial morphology is altered. The authors then identify loss of MTFP1 as being damaging through exacerbation of cell death, possibly due to enhanced activation of the mitochondrial permeability transition. This is a beautiful and thorough paper. The data presented is of high quality and the conclusions are well supported by the figures. There was little to criticize in the manuscript!

      1. Although total mtDNA levels were no different, was there mtDNA release into the cytoplasm in Mtfp1 cKO? This is one possible mechanism to consider regarding the interferon response, as this would be a potent trigger for the innate immune response, as pointed out in the discussion and in PMC4409480.
      2. The authors show mitochondrial morphology in the pre-symptomatic period. What happens during DCM? Does this effect become exacerbated in the KO compared to WT?
      3. Given the cellular phenotypes seen in the ppif/Mtfp1 DKO cells, does this translate into a survival benefit in these animals? (If this data is easily available would recommend showing it, even if negative; but if entirely new crosses and 20-30 weeks of follow-up are required then it's fine to not address this question here).
      4. Methods (line 1281): It appears that only male mice were imaged from 10-34 weeks? Why only show one sex, especially as the authors note a difference in survival between males and females? Also, it is unclear why the data on female HF is relegated to the Supplement. This should be in the main manuscript side-by-side with the male data on the same scale to allow comparison of effect sizes on similar assays.

      Minor comments:
      5. Please change the title: "inflammatory cardiomyopathy" is a poorly defined term and would suggest myocarditis or inflammatory cell infiltrates, which are not shown in the manuscript. In addition, the only discussion of inflammation is through the innate immunity pathway in the RNA-seq data, with no real further follow-up.
      6. Line 39, Abstract: "ANT" needs to be in brackets/parentheses.
      7. Figure 1M: It would be good to see a higher magnification image showing fibrosis in the trichrome stain.
      8. Line 180, "gender" should more properly read "sex".
      9. At line 321, the authors state that there are no changes in mitochondrial elongation, however, Figure 3D seems to suggest that mitochondrial area is decreased in MKO cells. Is this an error or are the authors suggesting that the data in 3D is not significant? How was elongation measured?
      10. At line 335, the authors state that MTFP1 KO mitochondria were not protected from fragmentation, this is supported by the data in Figures 3G-H. However, to my eye, it appears that the mitochondria from the KO cells were far more fragmented in response to hydrogen peroxide. Is this data not significant?

      Significance

      This paper is novel in that it constitutes the first description of a mouse cardiac knockout of MTFP1, a poorly studied protein previously thought to be involved in mitochondrial fission. Previously MTFP1 has been described in knockdown cells (Aung et al. J Cell Mol Med. 2017 Dec; 21(12)) and the current paper builds upon this research. The current paper demonstrates that MTFP1 is important for cardiac function, but intriguingly, does not share the prior in vitro phenotypes related to mitochondrial fission, suggesting that it may have some other physiological function. Most of the methods shown are standard, though there are some quite novel machine learning-based analyses of imaging data. The paper is quite thorough and of relevance to a wide range of investigators interested in cardiac mitochondrial function, mitochondrial kinetics (fusion/fission), and cell death mechanisms more broadly. Our field of expertise is in cardiac mitochondrial function. The ML computational tools are very interesting, but these are not our expertise.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript entitled "Mitochondrial fission process 1 (MTFP1) controls bioenergetic efficiency and prevents inflammatory cardiomyopathy and heart failure in mice" by Donnarumma and collaborators investigates the role of MTFP1 in the heart in vivo. Mice with a cardiac-specific deletion of Mtfp1 were generated and fully characterized. Structure-function analyses were performed prior to and at the onset of the cardiomyopathy and show that homozygous Mtfp1 ko mice develop DCM progressing to heart failure and death in middle age, associated with increased fibrosis. RNAseq data revealed a severe impairment of metabolic processes including reduced oxidative phosphorylation, TCA cycle and mitochondrial gene expression. Mitochondrial respiration was significantly reduced in mitochondria isolated from Mtfp1 ko mice, while global mitochondrial proteins and activity of the Krebs cycle remained normal. Further assessment of a variety of processes revealed an increase in proton leak through ANT as the major contributor to the mitochondrial defects and cardiac dysfunction in Mtfp1 ko mice. Cardiomyocytes isolated from Mtfp1 ko mice were also more sensitive to stress-induced apoptosis and to mPTP opening. The major conclusion of the study is that contrary to previous reports documenting a role of MTFP1 in mitochondrial fission, MTFP1 does not regulate mitochondria morphology but rather is essential for cardiac energy balance. This is substantiated by mass spectrometry experiments which identify mitochondrial proteins of the complex I/IV and proteins regulating mPTP as MTFP1 partners.

      Overall, this is an elegant study with an impressive amount of work performed in isolated mitochondria and in vivo before and at the onset of DCM. Results are important because they challenge previous findings that established a role of MTFP1 in mitochondrial fission and therefore reveal another function of MTFP1. To rigorously establish how MTFP1 regulates cardiac bioenergetics, additional experiments are needed and are listed below. In particular, the use of wildtype mice as control is concerning because some transgenic lines of Myh6-Cre+ develop DCM. Also, experiments addressing MTFP1 as an essential fission protein should be performed in adult ventricular myocytes isolated from Mtfp1 ko mice to show consistency with experiments performed in MEFs.

      Major comments:

      One major concern is with the control mice, which appear to be wildtype (Myh6-Cre+/+ Mtfp1 LoxP/LoxP). The proper control group should be Myh6-Cretg/+. This is important because some models of Myh6-Cre+ mice develop DCM including mitochondrial dysfunction (Buerger et al., J Card Failure 2006; Hall et al., Am J Physiol Heart Circ Physiol 2011). At a minimum, the most critical assays evaluating mitochondrial function should be performed using Myh6-Cre+ as control to verify that they do not develop pathological cardiac remodeling.

      The observation that Mtfp1ko mice show a complete loss of the protein by Western blot analysis is intriguing because it suggests that Mtfp1 is only expressed in ventricular myocytes and not in the other cells populating the heart. Can you please comment on this?

      Was Seahorse analysis from ventricular myocytes isolated from Mtfp1ko performed in parallel with the analysis in MEF and U2OS cells? This should be done to establish the cell specific defects observed in cardiac mitochondria lacking Mtfp1.

      Mitochondrial morphology under normal or stress conditions was assessed in MEFs, whose characteristics are very distinct from those of primary cardiac cells. The experiment using oligomycin, rotenone and CCCP should be performed in ventricular myocytes isolated from Mtfp1 ko mice, to rigorously reach the conclusion that MTFP1 is not essential for mitochondrial fission.

      Related to that, is it possible that while total levels of mitochondrial fission and fusion proteins are similar in Mtfp1 ko and wt mice, their phosphorylated forms may be different?

      Figure 4: Cell death in Mtfp1 ko and control cardiomyocytes is measured using supervised ML-assisted high throughput live-cell imaging (Cretin et al., 2021). This result should be substantiated by additional apoptosis assays.

      Cell death assays are performed by treating cardiomyocytes isolated from Mtfp1 ko and wt mice with the cardiotoxic anthracycline doxorubicin (DOX). The DOX dose of 60 µM is extremely high. Can cell death be observed at lower concentrations of DOX?

      Minor comments:

      Line 349: there is a typo. Please replace "we sought investigate whether MTFP1 loss specifically..." with "we sought to investigate whether MTFP1 loss specifically..."

      Line 417: What the authors mean is that "the modest level of over-expression did not negatively impact cardiac function in vivo (Figure S5B-C)".

      Line 490-500: this is a very long sentence. Please break it down into 2 sentences to ease the reading.

      Significance

      The role of MTFP1 has been investigated in isolated cells where conflicting results were reported in the literature. The in vivo role of MTFP1 in the heart is currently unknown. RNAseq and a panoply of approaches assessing mitochondrial structure and function, before symptomatic DCM occurs, provide important insights on early events causing the cardiomyopathy. This study is potentially conceptually innovative and could reveal a new role of MTFP1 in maintaining energy metabolism in the heart as well as in other organs.

      Referee Cross-commenting

      I have read the comments of the other 3 reviewers in detail. Like reviewers 3 and 4, I believe that the study is very well performed and provides new knowledge on the role of MTFP1 in cardiac energetics, assuming that the control mice do not develop DCM. I agree with the issues they identified. Regarding the issues raised by reviewer 1, especially concerning the lack of mechanistic insights, I actually thought that the full characterization of the Mtfp1 cko mouse model before and at the onset of the cardiomyopathy showing a strong cardiac phenotype, the RNAseq data showing alteration of metabolic genes and the detailed experiments performed in isolated mitochondria and isolated cells including rescue experiments, provide strong evidence that Mtfp1 regulates energy metabolism. That being said, I agree that direct causality could be better demonstrated by adding siRNA experiments to knock down Mtfp1 and see if it can recapitulate the adverse effects seen in Mtfp1 ko mice. This was attempted in MEFs and U2OS cells, which did not show the expected results. I would perform this experiment in cardiac cells, which is the relevant cell type to investigate underlying mechanisms. Adding causality experiments would strengthen the study even more.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The study investigated the role of mitochondrial fission process 1 (MTFP1) on cardiac structure and function. MTFP1 deletion in the heart resulted in adult-onset dilated cardiomyopathy (DCM), reduced membrane potential, and increased non-phosphorylation-dependent respiration. MTFP1 deletion also increased the sensitivity to programmed cell death, which was accompanied by an opening of the mitochondrial permeability transition pore (mPTP) in vitro. Thus, the authors conclude that MTFP1 influences mitochondrial coupling and cell death sensitivity.

      I have the following concerns regarding the study and its main conclusions:

      Major concerns:

      1- While the study challenges previous reports regarding the role of MTFP1 in mitochondrial fission, the study is descriptive and does not provide any mechanistic insights delineating the impact of MTFP1 on cardiac energy metabolism and cell death.

      2- The significance of the RNA sequencing data is not clear, and the authors need to put these changes in context and explain how these changes may fit in the study context. It is also not clear why the authors decided to only comment on the changes in Nppa and Nppb levels?

      3- It is not clear how MTFP1 influences bioenergetic efficiency, and the authors do not provide any evidence to suggest that this might be the case.

      4- In Figure 2F, there is a decrease in the expression of ATP5A complex in the cMKO mitochondria, which could explain the changes in state 4 respiration and membrane potential. The authors need to delineate how MTFP1 could influence the activity of the ATP5 complex.

      5- In Figure S4, the author should report the baseline measurements of LV function and structure pre-doxorubicin treatment to ensure no significant difference in these parameters occurred prior to the treatment protocol.

      6- How does MTFP1 modify PTP activity? More work is needed to characterize this effect.

      7- Co-immunoprecipitation data in figure S5 are confusing and have no clear significance. Therefore, the authors need to discuss the significance of these changes and how they might be relevant in the study context.

      Minor:

      • Line 222, "wholesale" > whole cell

      Significance

      Significance: This study challenges existing dogma, although the data are not yet strong enough to make this challenge convincing.

      Referee Cross-commenting

      I have read the comments of the other 3 reviewers, and I agree with their comments. This is an interesting study, that if adequately revised would make an important contribution to the literature.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1: **Major comments:**

      The authors state that all the RNA and contaminating DNA was validated and verified with nanodrop and BioAnalyzer which is the correct and accepted approach. However, the following concerns arise with testing reaction efficiency and data analysis:

      Comment 1.

      For reaction efficiency, the standard curves for each reference gene and gene of interest target should be included in the supplemental data. A four-point standard curve is the bare minimum to assess reaction efficiency and raises concerns about the data quality. The unknown samples being tested should also be plotted on the corresponding standard curves to assess their efficiency.

      Response:

      We have indeed calculated primer efficiencies by serial dilution and performed a four-point standard curve wherever possible. In other cases, at least a three-point dilution curve was performed to assess primer efficiency. To have a more extensive range of Cq values in the standard curve, the dilution series was done as a 10-fold serial dilution, as indicated in the materials and methods section under the heading “Amplification Efficiencies”. This provided a range of about 6.6 cycles (three-point dilution) and about 9.9 cycles (four-point dilution) for the primers tested. If the Cq values of the fourth dilution fell beyond the detection range of the machines (above 29 cycles) or close to the No-RT control, only the first three dilutions were taken into consideration. We have now included the standard curves for all the genes in the metadata/source data and updated the Figshare DOI. All sample Cq values were within these standard curves, as mentioned in the materials and methods section, and they have already been disclosed in the metadata files. The raw qPCR Cq output for all references and targets for both datasets can be retrieved from the data file in Figshare. Moreover, we will also add a new sentence in the methods section clarifying the standard curve dilutions and data availability.
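
      The cycle ranges quoted above follow from the usual standard-curve arithmetic: with perfect doubling, each 10-fold dilution shifts Cq by log2(10) ≈ 3.32 cycles, and efficiency is commonly estimated from the fitted slope as E = 10^(-1/slope) - 1. The sketch below illustrates that calculation only; the Cq values are hypothetical (an idealised four-point series), and the function names are illustrative rather than taken from the authors' analysis files.

```python
import math


def fit_standard_curve(log10_dilutions, cq_values):
    """Ordinary least-squares fit of Cq = slope * log10(dilution) + intercept."""
    n = len(log10_dilutions)
    mean_x = sum(log10_dilutions) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_dilutions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_dilutions, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency from the standard-curve slope; 1.0 corresponds to 100 %."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical, idealised data: undiluted sample plus three 10-fold dilutions.
log10_dilutions = [0, -1, -2, -3]
cycles_per_tenfold = math.log2(10)  # about 3.32 cycles at perfect doubling
cq_values = [18.0 + cycles_per_tenfold * i for i in range(4)]

slope, intercept = fit_standard_curve(log10_dilutions, cq_values)
efficiency = amplification_efficiency(slope)
cq_span = cq_values[-1] - cq_values[0]  # Cq range covered by the curve

print(f"slope = {slope:.3f}, efficiency = {efficiency:.1%}, span = {cq_span:.2f} cycles")
```

      With real data, the fitted efficiency would then be compared against whatever acceptance window is in use (95% to 105% in this rebuttal).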

      Comment 2. The statement starting on line 510: "The WT experimental group was omitted from this analysis as it was used as the experimental calibrator for differential expression. The mean Fold Change of the WT group is always at 1 regardless of the gene/method in question and therefore it is redundant to test for statistical significance of the WT fold change levels across different methods for each gene." indicates that data analysis was not performed in a rigorous and generally accepted manner. Please check the analysis with that described in: https://www.cell.com/trends/biotechnology/fulltext/S0167-7799(18)30342-1

      The generally accepted methodology for relative, normalized qPCR data analysis is well described in Figure 5 of that article. qPCR statistical analysis should be performed on the log transformed expression results also well described in that paper.

      Response:

      We apologize for any lack of clarity on this line. We have always compared the Control group to the Test group while performing statistical analysis, as shown in Figure 3 and Figure 6. This is a fundamental point of any study and we have strictly adhered to it. The highlighted statement pertains to supplementary figures 1 & 2, where we compare the fold changes of the “Test” groups between qPCR and RNA-Seq in both datasets. Comparing the Control groups with one another between these methods is redundant, as the mean fold change of the control groups is always 1 because we are measuring relative expression. Thus, we cannot perform any meaningful statistical testing between the control groups between RNA-Seq and qPCR, regardless of the method employed for testing.

      Furthermore, the use of the 2-ΔΔCt method for relative expression is in strict adherence to the initial papers describing this method (Livak and Schmittgen 2001, Schmittgen and Livak, 2008), which is again recapitulated in the article that you have cited. This can be seen in the metadata where the excel files that were used for calculating differential expression for all samples and datasets can be accessed. However, we would like to remark that we use more stringent criteria for primer validation (Efficiency between 95% and 105% as opposed to between 90% and 110% as mentioned in the paper). Moreover, the statistical testing and data representation prescribed in Figure 5 of the article that you have mentioned are not well founded for the following reasons:

      • We cannot perform parametric t-tests using low sample sizes. Furthermore, we cannot test for data normality using the few data points employed in standard qPCR assays. Thus, neither our qPCR assays nor the ones used in the mentioned article have enough samples to perform a t-test. Hence, we have used a non-parametric ordinal Mann-Whitney test for testing statistical significance in our study, as it is more apt for such low sample sizes and distributions.
      • The article proposes data representation with the mean and SEM or 95% Confidence Intervals (CI). We would like to kindly remark that SEM and CI are sampling parameters that arise when we perform sampling of data points from a larger population. In our study, we have always shown all the data points (biological replicates) for each experimental group. Hence, we can only show the distribution around the mean with the standard deviations (SD) and not with SEM or CI. We have not performed any sampling whatsoever nor has the study mentioned by the reviewer.
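
      To make the calculation under discussion concrete, the following is a minimal sketch of the 2^-ΔΔCq fold-change computation (Livak and Schmittgen 2001) followed by an exact two-sided Mann-Whitney U test. All Cq values are hypothetical, the function names are illustrative, the group size of four is an assumption for the example, and the exact test shown assumes no tied values; it is not the authors' spreadsheet workflow.

```python
from itertools import combinations


def fold_changes(target_cq, reference_cq, calibrator_delta_cq):
    """Per-sample relative expression, 2^-(dCq_sample - dCq_calibrator)."""
    return [2.0 ** -((t - r) - calibrator_delta_cq)
            for t, r in zip(target_cq, reference_cq)]


def u_statistic(x, y):
    """Number of (x_i, y_j) pairs with x_i > y_j (no ties assumed)."""
    return sum(1 for xi in x for yj in y if xi > yj)


def mann_whitney_exact(a, b):
    """Exact two-sided Mann-Whitney test by enumerating all group splits."""
    pooled = list(a) + list(b)
    centre = len(a) * len(b) / 2.0
    u_observed = u_statistic(a, b)
    observed_dev = abs(u_observed - centre)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        x = [pooled[i] for i in idx]
        y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(u_statistic(x, y) - centre) >= observed_dev:
            extreme += 1
    return u_observed, extreme / total


# Hypothetical Cq values, four biological replicates per group.
control_target = [24.10, 24.35, 23.90, 24.20]
control_ref = [18.00, 18.10, 17.95, 18.05]
test_target = [22.00, 22.30, 21.85, 22.15]
test_ref = [18.00, 18.20, 17.90, 18.10]

# Calibrator: mean dCq of the control group.
control_delta = [t - r for t, r in zip(control_target, control_ref)]
calibrator = sum(control_delta) / len(control_delta)

control_fc = fold_changes(control_target, control_ref, calibrator)
test_fc = fold_changes(test_target, test_ref, calibrator)

u_obs, p_value = mann_whitney_exact(test_fc, control_fc)
print(f"U = {u_obs}, exact two-sided p = {p_value:.4f}")
```

      With four replicates per group there are only 70 possible splits, so the smallest two-sided p-value this test can return is 2/70 ≈ 0.029, which is worth keeping in mind when choosing the significance threshold for small qPCR experiments.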

        Comment 3. The authors used Normfinder to assess reference gene stability. Since Normfinder uses a particular algorithm for assessing stability, it is recommended to assess stability using a combination of these "stability calculators" including: GeNorm, NormFinder and BestKeeper. This is described in Table 1 of: https://www.cell.com/trends/biotechnology/fulltext/S0167-7799(18)30342-1. This will give a much more reliable perspective on the ranking of reference genes by their stability.

      Response:

      The method used in our study for reference gene validation is a combination of CV, NormFinder and statistical testing of raw expression profiles. In our previous study (Sundaram et al 2019), we have categorically shown that using a combination of different existing methods such as GeNorm, NormFinder and BestKeeper and comparing their ranks results in a suboptimal choice of reference genes. This is because GeNorm ranks genes with similar expression patterns as stable even if they vary significantly among groups. BestKeeper calculates variation based on Cq values, which are exponents, while the expression levels are calculated in the linear scale (2^Cq). NormFinder stability scores are influenced by the presence of genes with significant overall variation. More evidence backed up with data can be found in our previous study, where we have clearly shown that combining these methods and calculating an overall rank (as proposed in the article you have mentioned) is not the best strategy. Hence, we devised the approach used in the present study, which has been previously validated and published (Sundaram et al 2019, PLoS ONE) and was designed taking into account the advantages and disadvantages of the different existing approaches.
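
      The point about Cq values being exponents can be illustrated numerically: the same set of measurements gives a very different coefficient of variation (CV) depending on whether it is computed on raw Cq values or on linear-scale quantities. The sketch below uses hypothetical Cq values and takes 2^-Cq as the linear-scale quantity (an assumption of perfect doubling); it is only an illustration of the scale effect, not the validation workflow of Sundaram et al 2019.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


# Hypothetical Cq values for one candidate reference gene across five samples.
cq_values = [20.0, 20.5, 21.0, 20.2, 20.8]

# Linear-scale relative quantities, assuming perfect doubling per cycle.
linear_quantities = [2.0 ** -cq for cq in cq_values]

cv_cq = coefficient_of_variation(cq_values)
cv_linear = coefficient_of_variation(linear_quantities)

print(f"CV on Cq scale:     {cv_cq:.1%}")
print(f"CV on linear scale: {cv_linear:.1%}")
```

      A one-cycle spread looks negligible on the Cq scale but corresponds to a two-fold range in linear quantities, so stability rankings can differ depending on which scale a tool operates on.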

      Comment 4. Finally, since many currently studied targets for relative gene expression are expressed at low levels, it would be important to also examine three differentially expressed targets in the Cq range of 29 to 32. Yes, the variability will be higher, but these data will give a more realistic test of reference gene stability.

      Response:

      The target genes used in the study range from about 12 cycles to about 29 cycles (both datasets included; please refer to the source data/metadata). This falls well within the standard curves of all these genes, as mentioned earlier. The stability of the reference genes has been shown with absolute parameters such as the coefficient of variation and the NormFinder S scores (Tables 1, 2, 3 & 4). Although we are not opposed to adding more target genes, we fail to see how adding target genes with Cq values above 29 cycles would reflect on the stability of reference genes. The variability that will be observed is a mere reflection of the variability of Cq values of the target genes in the Cq range of 29 to 32, as it approaches the detection limits of qPCR assays. The Cq values of the best reference genes would still remain the same. Therefore, this exercise cannot test the “stability” of the reference genes but only demonstrate the limit of qPCR detection (which is already well known). We would also like to remark that we have used No-RT controls in our qPCR assays, which exhibit a signal (different dissociation peak) in this Cq range for some genes, and hence this is not a signal that arises from the cDNA. Therefore, we do not consider values above 29 cycles reliable in our qPCR setup and we switch to droplet digital PCR for such low-expressed genes in our studies.

      Reviewer #2: **Summary + Minor Comments** Reference gene selection is one of the most critical steps in gene expression analysis using qPCR. The authors compared data quality using references selected based on RNA-Seq or using panel of often used reference genes. The manuscript is well prepared and easy to understand. Figures are nice and clear. I do not have major comments, but rather a few suggestions to make the manuscript more advanced. Since it is based on already available data or a few more expression measurements could be easily added, I would suggest to include total RNA factor, some rRNA and mtRNA as potential references. It will be interesting to compare their stability and effect on results of other targeted genes.

      In discussion, authors suggested that: "stable reference genes for qPCR data normalisation can be obtained from any random set of candidates provided the statistical approach of reference gene validation is sound and consistent". I do not think the word random in many sentences is appropriate. Panel of reference genes used in this study contains many known stable genes and that does not look random to me. I would rephrase these sentences. Usually panels of reference genes (for human and mouse are commercially available and contains several genes used in study) are composed of genes coding various biological processes to ensure that some of them will be stably expressed in experiments.

      Response:

      We understand the reviewer’s perspective on the use of the words “random reference genes”. We have replaced it with the words “conventional reference genes” throughout the manuscript.

      Regarding the addition of other RNA species as reference genes, we would like to clarify that we have used only protein coding transcripts (encoded by nuclear genes) as reference genes as all our target genes also belong to the same RNA category. This was done in accordance with the MIQE guidelines for qPCR data publication (Bustin et al 2009, DOI: 10.1373/clinchem.2008.112797) which states that rRNA should not be used for mRNA target gene normalization. This is because the vast majority of RNA from total RNA extraction is rRNA and only about 1% - 5% is mRNA. Thus, it is advisable to normalize mRNA targets with mRNA reference genes as it serves as a control for the extraction and RT PCR protocol. This argument can also be extended to other RNA species either in type or in origin (mtRNA). Regarding the total RNA factor, we have always used the same quantity of total RNA from all samples for RT-PCR as mentioned in the materials and methods section.

      Reviewer #3

      **Summary + Minor Comments**

      The aim of this study was to demonstrate that the statistical approach to determine the best reference genes from randomly selected "standard" reference genes might be more sufficient than employing reference genes as indicated by RNA-Seq.

      In a previous study they established a qPCR data normalization workflow, after comparing several statistical approaches for the assessment of reference gene stability. In this study they apply this workflow to compare "random" reference genes with preselected references genes based on RNA-Seq data. They test their hypothesis in two different experimental setups, varying sample material and methodology. After establishing the most "stable" reference genes, the suitability of these genes for normalization was put on trial by investigating their ability to normalize differential expression of target genes. These results were compared to one another and to fold-changes computed from RNA-Seq. The results indicate that as stated in the title of the study, "RNA-Seq is not required to determine stable reference genes for qPCR normalization", since both approaches render similar results. Potential pitfalls when selecting genes from RNA-seq data are discussed and an integration of influencing factors is suggested.

      The key conclusions of the study are convincing and well-supported by the experiments conducted, which are realistic in terms of time and resources. Data and methods are presented articulately and are reproducible. Experiments are adequately replicated and statistical analysis is adequate. The manuscript is well written, tables and figures provided are sound and corroborate a better understanding of the presented results. Minor changes would be:

      Figure 1, 2, 3, 4, 5, 6: in the figure are uppercase letters, in the figure legend are lowercase letters, please adjust that.

      p10 line 347: I understand what is meant, with "using the NF as the reference gene", however, stating again that the combined NF of the two most stable ref genes was used here, would make it clearer.

      P11 line 355f: the first sentences here are negligible, as already stated elsewhere.

      P30 line 777: The last sentence is not clear to me.

      Response:

      All minor concerns have been addressed in the revised manuscript as follows:

      1. Figure 1, 2, 3, 4, 5, 6: in the figure are uppercase letters, in the figure legend are lowercase letters, please adjust that – Has been modified
      2. p10 line 347: I understand what is meant, with "using the NF as the reference gene", however, stating again that the combined NF of the two most stable ref genes was used here, would make it clearer. – Has been modified
      3. P11 line 355f: the first sentences here are negligible, as already stated elsewhere – Have been removed
      4. P30 line 777: The last sentence is not clear to me.

        We wanted to say that our study aptly addressed the strongest hurdle in performing reliable qPCR assays, which is the choice of good reference genes. This choice is not dependent on RNA-Seq results. We have modified this sentence for better clarity.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The aim of this study was to demonstrate that the statistical approach to determine the best reference genes from randomly selected "standard" reference genes might be more sufficient than employing reference genes as indicated by RNA-Seq.

      In a previous study they established a qPCR data normalization workflow, after comparing several statistical approaches for the assessment of reference gene stability. In this study they apply this workflow to compare "random" reference genes with preselected references genes based on RNA-Seq data. They test their hypothesis in two different experimental setups, varying sample material and methodology. After establishing the most "stable" reference genes, the suitability of these genes for normalization was put on trial by investigating their ability to normalize differential expression of target genes. These results were compared to one another and to fold-changes computed from RNA-Seq. The results indicate that as stated in the title of the study, "RNA-Seq is not required to determine stable reference genes for qPCR normalization", since both approaches render similar results. Potential pitfalls when selecting genes from RNA-seq data are discussed and an integration of influencing factors is suggested.

      The key conclusions of the study are convincing and well-supported by the experiments conducted, which are realistic in terms of time and resources. Data and methods are presented articulately and are reproducible. Experiments are adequately replicated and statistical analysis is adequate. The manuscript is well written, tables and figures provided are sound and corroborate a better understanding of the presented results. Minor changes would be:

      Figure 1, 2, 3, 4, 5, 6: in the figure are uppercase letters, in the figure legend are lowercase letters, please adjust that.

      p10 line 347: I understand what is meant, with "using the NF as the reference gene", however, stating again that the combined NF of the two most stable ref genes was used here, would make it clearer.

      P11 line 355f: the first sentences here are negligible, as already stated elsewhere.

      P30 line 777: The last sentence is not clear to me.

      Significance

      In recent years the necessity of stable reference genes for the normalization of qPCR data has become more and more apparent, since it has been shown that selecting the most "popular" genes might not always lead to correct expression profiles: depending on the experimental setup, significant variation can occur. Numerous studies exist validating potential reference genes, employing several well-established statistical approaches (GeNorm, NormFinder etc.) and, more recently, RNA-Seq data. RNA-Seq is definitely accompanied by more work effort and higher costs. Therefore, employing the "simpler" approach while obtaining the same results might be beneficial for scientists establishing a new qPCR protocol, in particular at a time when working cost-effectively is a prerequisite in most laboratories.

      The authors performed a thorough analysis of the two approaches compared in this study. By investigating two entirely different experimental set-ups with a similar outcome, they nicely substantiate their findings. Furthermore, by investigating differential expression of target genes, for both experimental setups, they put their results to the test, convincingly corroborating their results.

      This manuscript is well-written, experiments are thoroughly performed, the findings are convincing and it clearly is an important contribution for the scientific community.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Reference gene selection is one of the most critical steps in gene expression analysis using qPCR. The authors compared data quality using references selected based on RNA-Seq or using panel of often used reference genes. The manuscript is well prepared and easy to understand. Figures are nice and clear. I do not have major comments, but rather a few suggestions to make the manuscript more advanced. Since it is based on already available data or a few more expression measurements could be easily added, I would suggest to include total RNA factor, some rRNA and mtRNA as potential references. It will be interesting to compare their stability and effect on results of other targeted genes.

      In discussion, authors suggested that: "stable reference genes for qPCR data normalisation can be obtained from any random set of candidates provided the statistical approach of reference gene validation is sound and consistent". I do not think the word random in many sentences is appropriate. Panel of reference genes used in this study contains many known stable genes and that does not look random to me. I would rephrase these sentences. Usually panels of reference genes (for human and mouse are commercially available and contains several genes used in study) are composed of genes coding various biological processes to ensure that some of them will be stably expressed in experiments.

      Significance

      Good reference gene selection is needed for most experiments, where quantities and qualities of samples are not identical. Unfortunately, every experiment has its own stable and reliable reference genes. Validation can be time consuming and expensive. RNA-Seq experiments covering a broad spectrum of biological samples are potentially a way for faster identification of unknown stable genes, which could be used for normalization in qPCR. The authors compared the effectiveness of reference genes selected based on RNA-Seq and using a panel of potential reference genes. I like their comparison, but do not fully agree with "random" selection.

      I am not aware of any other study comparing the quality of qPCR references from RNA-Seq or preselected genes. I think the manuscript will be appreciated by technically or methodically oriented readers (gene expression area).

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This article contrasts RNAseq and random selection to assess reference genes for relative gene expression. The study was well contrived with a solid experimental design.

      Major comments:

      The authors state that all the RNA and contaminating DNA was validated and verified with nanodrop and BioAnalyzer which is the correct and accepted approach. However, the following concerns arise with testing reaction efficiency and data analysis:

      1. For reaction efficiency, the standard curves for each reference gene and gene of interest target should be included in the supplemental data. A four point standard curve is the bare minimum to assess reaction efficiency and raises concerns about the data quality. The unknown samples being tested should also be plotted on the corresponding standard curves to assess their efficiency.
      2. The statement starting on line 510: "The WT experimental group was omitted from this analysis as it was used as the experimental calibrator for differential expression. The mean Fold Change of the WT group is always at 1 regardless of the gene/method in question and therefore it is redundant to test for statistical significance of the WT fold change levels across different methods for each gene." indicates that data analysis was not performed in a rigorous and generally accepted manner. Please check the analysis against that described in: https://www.cell.com/trends/biotechnology/fulltext/S0167-7799(18)30342-1

      The generally accepted methodology for relative, normalized qPCR data analysis is well described in Figure 5 of that article. Statistical analysis of qPCR data should be performed on the log-transformed expression results, as is also well described in that paper.
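      The two requests above (reaction efficiency from a standard curve, and analysis on log-transformed, normalized expression) can be illustrated with a minimal sketch. It assumes the standard relationship E = 10^(-1/slope) - 1 between the slope of Cq versus log10 input and the amplification efficiency. All function names and numerical values are hypothetical and are not taken from the manuscript or from the cited article.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(log10_input, cq):
    """Efficiency from a standard curve: E = 10^(-1/slope) - 1.

    E = 1.0 corresponds to perfect doubling in every cycle.
    """
    slope, _ = fit_line(log10_input, cq)
    return 10 ** (-1.0 / slope) - 1.0


def log2_normalized_expression(cq_target, cq_reference,
                               e_target=1.0, e_reference=1.0):
    """log2 of the efficiency-corrected target quantity relative to a reference gene.

    The quantity of each amplicon is taken as proportional to (1 + E)^(-Cq).
    """
    log2_target = -cq_target * math.log2(1.0 + e_target)
    log2_reference = -cq_reference * math.log2(1.0 + e_reference)
    return log2_target - log2_reference


# Hypothetical five-point, 10-fold dilution series (invented values).
dilution_log10 = [0, -1, -2, -3, -4]
cq_values = [18.0, 21.32, 24.64, 27.96, 31.28]
efficiency = amplification_efficiency(dilution_log10, cq_values)

# Hypothetical Cq values for one control and one treated sample.
control = log2_normalized_expression(cq_target=24.0, cq_reference=20.0)
treated = log2_normalized_expression(cq_target=22.0, cq_reference=20.0)
log2_fold_change = treated - control  # statistics would be run on values like this
```

      Keeping the values on the log2 scale until the final reporting step is what allows the calibrator group to retain its variance, so it can be included in statistical tests rather than being fixed at a fold change of 1.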

      The authors used NormFinder to assess reference gene stability. Since NormFinder uses one particular algorithm for assessing stability, it is recommended to assess stability using a combination of these "stability calculators", including geNorm, NormFinder and BestKeeper. This is described in Table 1 of: https://www.cell.com/trends/biotechnology/fulltext/S0167-7799(18)30342-1. This will give a much more reliable perspective on the ranking of reference genes by their stability.

      Finally, since many currently studied targets for relative gene expression are expressed at low levels, it would be important to also examine three differentially expressed targets in the Cq range of 29 to 32. Yes, the variability will be higher, but these data will give a more realistic test of reference gene stability.
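      To make concrete why different stability calculators can rank genes differently, the following is a simplified sketch of the pairwise-variation idea behind the geNorm M value: for each candidate gene, the standard deviation of its log2 expression ratio to every other candidate is computed across samples, and these standard deviations are averaged. This is an illustration only, not the geNorm software itself. The gene names and values are invented.

```python
from statistics import stdev


def genorm_m(log2_expression):
    """Average pairwise variation (M) per gene.

    log2_expression: dict mapping gene name -> list of log2 relative
    quantities, one per sample, with samples in the same order for every gene.
    A lower M indicates a more stably expressed candidate.
    """
    genes = list(log2_expression)
    m_values = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            # log2 ratio of the two genes in each sample
            log_ratios = [a - b for a, b in
                          zip(log2_expression[gene], log2_expression[other])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Hypothetical data: geneA and geneB co-vary, geneC does not.
example = {
    "geneA": [10.0, 10.1, 9.9, 10.0],
    "geneB": [8.0, 8.1, 7.9, 8.0],
    "geneC": [5.0, 7.0, 4.0, 6.0],
}
m_scores = genorm_m(example)
ranking = sorted(m_scores, key=m_scores.get)  # most stable first
```

      Because this measure rewards co-varying genes, two co-regulated candidates can score well together. This is one reason to compare rankings across several algorithms.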

      Significance

      This article will be useful for all labs conducting gene expression experiments. It also uncovers additional contrasts between qPCR and RNA seq which are helpful in choosing the appropriate technology for given experiments.

      Referee Cross-commenting

      I agree with the other reviewers' comments.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Hello, we wrote our review before seeing that you have special formatting requirements. We're just going to post our review in its entirety rather than rewrite it based on these suggestions. It encompasses the above content, it's just not formatted in the suggested order. We hope that's OK! **Full review:** This manuscript makes a strong case for the evolvability of multicellular size via selection for settling rate in the Ichthyosporea. The use of an experimental evolution framework to assess the evolvability of multicellular phenotypes, using sedimentation rate as a selective pressure, extends the previous work of others into a new domain within the Holozoa, among the closest living relatives of animals. The natural, ecological significance of selection for sedimentation rate is a novel idea, and the connection between sedimentation rate and multicellular evolution in natural as opposed to contrived experimental circumstances is an interesting one. The results are striking and well supported, with laboratory evolution rapidly adjusting both the cellular composition and the multicellular phenotypes of the organisms involved in ways that are well explained. This is an important result that brings the laboratory study of the evolution of multicellularity forward into a different branch of the tree of life and shows its broad applicability. Sequencing of evolved lines adds significantly to the completeness of the story. While the causal role of these mutations in the production of the observed multicellular phenotypes is not demonstrated via manipulation or breeding, this is quite understandable in the light of the unusual model organism and the observed homologies and role of the genes involved.

      While this is largely clear from a reading, we believe the manuscript would benefit from a brief analysis of the numerical enrichment of genes with homologs involved in cytokinesis, cell membrane composition, and cell cycle control relative to the null hypothesis of genes picked randomly from the genome. If this is beyond the scope of this research in an unusual model organism with many poorly annotated genes, then a slightly expanded verbal discussion of the potential roles of the apparent functions of these genes in the evolution of multicellular clumping would be an appropriate substitute. We wholeheartedly recommend the publication of this manuscript with a number of minor revisions, which, while not affecting the main conclusions or points of the manuscript, will clarify important points, correct small errors, and point the reader to relevant literature and concepts.

      ANSWER: We would like to heartily thank the reviewers for their appreciation of our work.

      **Major points:** none. **Minor points:** Line 79 - is sedimentation rate really invariably associated with multicellularization? Active swimming would seem to prevent this.

      ANSWER: We meant to refer to the fact that all published examples of the emergence of multicellularity from unicellular ancestors have been accompanied by an increased sedimentation rate. Active swimming alone would just increase the diffusion rate of cells and not counteract the effects of increased size and density; such an active mechanism would also require directionality away from the tendency to sediment. A more passive mechanism, whereby a genetic variant, or cell cycle transition, which simultaneously causes a relative decrease in density while increasing cell size, leaving the net sedimentation rate the same as the ancestor, while conceivable, has not been observed in the literature. We changed the text from “invariably” to “frequently” at line 80 to emphasize how this is an empirical observation.

      Line 164 - the precise phenotype in the evolution experiment being referred to is unclear without further context, with the ordering of paragraphs possibly needing a little work.

      ANSWER: We tightened the two paragraphs and merged them; the sentence containing “this phenotype” was removed.

      Line 178 - is sorting them into three classes informative? Are there different mutations associated with these, or is it just visual clumping on the number line? Perhaps not a useful classification, but the existence of great variation is an important point to get across. A more useful classification might be those that increase sedimentation with large density changes versus exclusively by clumping.

      ANSWER: We agree with this argument and ultimately decided to remove the visual classification. We revised the text and figures accordingly.

      Line 254 - excess cellular density is referred to interchangeably with density, when these are very different figures. This continues in line 269, and in the figure legends of Figure 4.

      ANSWER: We fixed this.

      Line 341 - the role of the RCC1 homolog in other organisms could be expanded on in slightly more detail. Similarly, other mutations in this same section known to affect cytokinesis could have potential mechanisms for affecting clumping commented upon, especially given the cell membrane results in the figures.

      ANSWER: We share the reviewer’s enthusiasm about some of these mutations. However, we try to be very conservative about what each gene or protein could be doing. Indeed, the absence of genetic tools does not allow us to directly test the effect of each mutation. We added a couple of extra sentences about RCC1 as well as about cytokinetic proteins and their potential role in clumping phenotypes.

      Line 387 - awkward formatting or sentence structure, with dashes and commas.

      ANSWER: We fixed the sentence structure.

      Line 395 - this cellular process, or this evolutionary process of selection for faster settling?

      ANSWER: We revised this appropriately.

      Line 408 - per unit volume

      ANSWER: Fixed.

      Line 425 - the idea of clumpiness as ancestral is quickly put forward and dismissed within a single sentence. This could be explored in slightly more detail as an option, before concluding that what is clear is that the phenotype is easy to change.

      ANSWER: We agree that it would be interesting to pursue the ecological role and distribution of clumping and cell cycle phenotypes for other ichthyosporean species. We could propose alternative scenarios of which trait came or went first and test this hypothesis by calculating the correlation of the presence or absence of the trait with the branch lengths and branching patterns of phylogenetic trees we have built using genome sequences. However, for our dataset, this would nonetheless remain a fragile correlation consisting of five data points. We do not feel such speculation is helpful for the text.

      However, because two reviewers made suggestions in this direction, we expanded the discussion and annotated the tips of the species tree in figure 5 with the traits of interest. The result shows that the S. gastrica, S. tapetis and S. nootkatensis species exhibit clumpiness as a trait. However, the data are not sufficient to resolve whether the traits are “derived” or “ancestral”.

      Line 437 - sedimentation as a highly variable trait, or a highly evolvable trait?

      ANSWER: Evolvable trait. We fixed it in the text.

      Figure 1G, 1H: We are fairly certain that the logarithmic scale of DNA content and coenocyte volume are mislabeled. The scale that is labeled log2 in 1G in the legend goes up by factors of 2 rather than single digits. The axis is obviously logarithmic, and the log2 in the legend is superfluous and misleading. Similarly, in 1H a scale labeled as log10 goes from 1 to 30, which on a logarithmic scale would be a sphere approximately 100 kilometers wide. The numbers can remain, but the legend should remove the log10.

      ANSWER: Fixed. It is indeed a log scale. We made sure to remove the confusing log2 and log10 from the figure and legend.

      **General:** Were there any head to head competitions performed? Not suggesting you need to, but it's a nice way to directly examine fitness consequences of multicellularity, and is commonly done in the field. If you have done this it wasn't clear to us.

      ANSWER: We have now included a fitness experiment, performed previously, in which the clumpy S01 and S03 lines were placed in head-to-head competition with the Ancestor (AN). The results are shown in Figure 2E and Figure 2 – figure supplement 1D. They show that the fast-sedimenting clumpy phenotype is highly advantageous under our experimental evolution selection procedure, but deleterious in the absence of selection.

      Reviewer #1 (Significance (Required)): see the above comments about writing the review before realizing there were specific formatting suggestions. I hope you understand us not wanting to re-write the review having already written it once.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): The present work adds to the growing literature on sedimentation rate as a major player in the evolution of multicellularity. Via rigorous experimentation, the authors convincingly show that they can select for increased sedimentation rate and identify two mechanisms underlying this increase: incomplete cellular separation leading to multicellular groups and increases in cellular density. They also show surprising natural variation in sedimentation and argue that, along with similar evidence from other organisms, their findings cement the likely major role of sedimentation and go farther by revealing the tight genetic control that it is under.

      Reviewer #3 (Significance (Required)): This is a very significant study because it illuminates processes and underlying mechanisms that could have played a major role in the transition to multicellularity. Their results will likely greatly influence conceptual and theoretical thinking and will foster additional empirical directions. My only quibble with the manuscript is that I wished for a bit more ecological context and grounding of the main findings: in that respect, both the abstract and the last paragraph of the discussion leave me wanting and occasionally puzzled. If maintaining buoyancy is such a strong selective pressure and the variation in sedimentation rate is such a challenge to it, then I think explaining a bit more exactly why sedimentation would evolve, why so much variation would exist, etc., would be really helpful to the more naive reader. Just a bit further elaboration on selective pressures (even presumed ones and even if speculative) would be helpful to put the picture together.

      ANSWER: We would like to thank Reviewer #3 for his/her comments. We do believe that extensive ecological context is highly relevant. Throughout the manuscript, we strived to be conservative in the way we describe both our model system and its experimental and natural settings, perhaps to a fault, but we now offer an evolutionary model that tries to shed light on the phenotypic evolution of the various species through different routes (Fig. 5H). To elaborate on the rationale behind this strategy, we offer the following points:

      1. we are investigating a sizeable, but still very limited, number of Sphaeroforma species (six). Therefore, we feel that any statement about which trait may be considered ancestral is speculative given the known species tree (we revised our Discussion in this regard and updated figure 5A).
      2. our knowledge about the ecological niches of Sphaeroforma species is limited. We avoid extensive speculation, and while inference of the potential ecological context is part of the scope of this study, we relied on an experimental approach to tackle our questions, rather than ecological observation or computational modeling.
      • throughout the text we aimed to avoid taking a strong stance on the “adaptiveness” of the traits which we are measuring. This is because, depending on the model specification and parameters, ecological models could be made for or against whether the cellular traits of size and density, and their effects on the higher-level trait of sedimentation rate, might be adaptive “in the wild”.

      We hope that future studies will be able to tackle any open questions on the understanding of the ecology of ichthyosporeans, hopefully benefitting from our inferred evolutionary insights in this study.

      **A more minor point:** I remember seeing a talk by Will Ratcliff a while back in which he showed that in S. cerevisiae they also see the two mechanisms of increased sedimentation: increased cellular size and clumping. Yet, I didn't see a reference to that work in the context of the cell density mechanism discussion and wondered why.

      ANSWER: We believe we have cited the relevant papers from the Ratcliff lab. To be clear, we observed two separate physical mechanisms for fast sedimentation:


      1. by cell-clumping (increasing size),
      2. by increasing the number of nuclei per unit volume (increasing density).

      To our knowledge, the first mechanism was indeed observed in snowflake yeasts (for which we referenced all relevant studies), whereas the second, which we believe might be specific to multinucleated cells, has not been measured in those studies, although it could conceivably be affected by mutations in the organisms they used. We added a new model figure (Figure 5H) to hopefully better get this message across.


      Reviewer #4 (Evidence, reproducibility and clarity (Required)): In this study Dudin et al. explored the variability of sedimentation rates in members of the Sphaeroforma genus and found that sedimentation rates are very variable between different isolates as well as during the life cycle of each isolate. Following this observation Dudin et al. evolved S. arctica under a regime favoring fast-settling objects. After a few hundred generations they observed that most lineages increased their sedimentation rate. Characterization of some of these evolved populations suggests two distinct mechanisms allowing fast sedimentation: cluster formation by non-separation of cells post-cellularization and increase in object density. By sequencing the evolved lines Dudin et al. were able to identify that several mutations have been under positive selection and that some of the mutations relate to mechanisms involved in cell separation and cellularization.

      ANSWER: We dearly thank Reviewer #4 for his/her time and efforts.

      **Major comments:**

      • Line 143, I don't understand how figure 1G shows that "nuclear division cycles were periodic...".

      ANSWER: From previously published results (Ondracka et al 2018 & Dudin et al 2021), we know that nuclear divisions in S. arctica are strictly synchronized and occur within defined time intervals. As can be seen in Figure 1G, DNA content doubles at a constant interval of about 9 hrs. Likewise, this phenomenon is clearly depicted in Figure 4F and Figure S4H. These results, combined with the results shown in Figure 1F, demonstrate that division cycles are still periodic in our experimental setting and are not occurring asynchronously, as no odd number of nuclei per cell was observed.

      • When characterizing the evolved lines, the authors display (and measure?) the size and the sedimentation rate separately, but don't directly compare them. If the statement that density plays a role in the sedimentation rate of S4 and S9 but not S1 is correct, then the correlation between size and sedimentation should be similar between AN and S1 and changed in S4 and S9. It would be nice to see these relationships and the correlations.

      ANSWER: We do indeed measure the size and the sedimentation rate of each fast-settling mutant separately. This is shown in figure 1C, where sedimentation rate is plotted against cell size for our dataset and the older Smayda (1973) data. Further, both measurements feed directly into the estimation of cellular density in Figures 4C and S4D (explained extensively in the methods). The cellular density estimates show the correlations and relationships between S1 and AN as well as between S4 and S9.
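      How size and sedimentation rate can jointly yield a density estimate is sketched below. The sketch assumes Stokes' law for a small sphere settling in a viscous fluid, v = 2 r² Δρ g / (9 μ), which can be inverted to give the excess density Δρ from a measured radius and velocity. Whether this matches the exact procedure in the manuscript's methods is an assumption, and all parameter values are invented.

```python
GRAVITY = 9.81  # gravitational acceleration, m/s^2


def stokes_velocity(radius_m, excess_density, viscosity):
    """Terminal settling velocity (m/s) of a small sphere.

    v = 2 r^2 * delta_rho * g / (9 * mu)
    radius_m: sphere radius in metres
    excess_density: particle density minus fluid density, kg/m^3
    viscosity: dynamic viscosity of the fluid, Pa*s
    """
    return 2.0 * radius_m ** 2 * excess_density * GRAVITY / (9.0 * viscosity)


def excess_density_from_velocity(radius_m, velocity, viscosity):
    """Invert Stokes' law: delta_rho = 9 * mu * v / (2 * g * r^2), in kg/m^3."""
    return 9.0 * viscosity * velocity / (2.0 * GRAVITY * radius_m ** 2)


# Hypothetical cell: 15 µm radius, 50 kg/m^3 denser than the medium,
# settling in a water-like medium (viscosity 1e-3 Pa*s).
radius = 15e-6
velocity = stokes_velocity(radius, excess_density=50.0, viscosity=1.0e-3)
recovered = excess_density_from_velocity(radius, velocity, viscosity=1.0e-3)
```

      Under this relation, a clump that settles faster purely because it is larger follows the same r² scaling as the ancestor, whereas a denser cell settles faster than its size alone predicts. This is the distinction the reviewer asks to see plotted.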

      • Line 288: "surviving 780 generations of passaging for all 10 isolates" what data is this referring to?

      ANSWER: This refers to laboratory cultures of the fast-settling mutants that were passaged tens of times without any selection. These cultures maintained their clumping phenotypes even without constant selection, suggesting that the phenotypes are due to a genetic modification. We are unsure how else to answer Reviewer #4, as this is the data we are referring to. However, we changed “surviving” to “persisting for” and hope this clarifies the sentence.

      • The weakest aspect of the paper is that there is neither a statistical argument (with a single anecdotal exception), from seeing the same genes or pathways mutated in parallel experiments, nor an experimental reconstruction that argues that any of the observed mutations were selected as opposed to being neutral mutations that hitch-hiked with adaptive mutations. One strongly suspects that some of the observed mutations were selected, but from the available data, it is impossible to know which were selected and which were hitch-hiking.

      ANSWER: We agree that our draft did not elaborate in depth on whether mutations were drivers or passengers, a fact also mentioned by another reviewer. To be fair, however, there are several important considerations to make.

      First, and most importantly, we do offer an unprecedented look into the genetic underpinnings of this novel model organism, and demonstrate highly parallel phenotypic evolution in response to selection. The molecular genetic signal reflects this finding given a skewed dN/dS-ratio > 1. While the precise molecular changes are not as easy to interpret, molecular parallelism at the level of genes is not a prerequisite for directional selection in repeat lineages, especially given the complex genomic architecture of S. arctica.

      Second, while we did not emphasize this much, the results from our bioinformatic analyses are quite unusual. We are dealing with a non-standard model organism here, with a highly intriguing placement in the tree of life, but with a large genome of >140 Mbp. This is 1-2 orders of magnitude larger than that of other single-celled model systems used in evolution experiments, including E. coli or S. cerevisiae. Unlike the latter two, this organism’s genome contains extensive levels of intergenic and intronic sequence, as well as a high amount of (simple sequence) duplication. Hence, the analyses of the resequencing data were a major effort, and it took an extensive amount of time to identify the mutations.

      Third, there are no genetic tools that would allow us to either perform molecular genetics or crossing with S. arctica as of now. This will change in the future, and in this event, our comprehensive list of target genes will hopefully be valuable to the field and beyond.

      • Even if the authors knew which mutations were selected, it is not possible to say whether the mutations that have been selected are directly advantageous in the settling regime; they could be due to adaptation to lab conditions, higher temperatures, etc. A control evolution experiment with no settling selection would be required to reach the conclusion that the mutants were selected for faster sedimentation.

      ANSWER: We agree that a “no-selection” control experiment would have been helpful for the molecular interpretation. But the clumping phenotype has never been observed to occur in many generations of passaging in any of the labs culturing these organisms and at different temperatures (we made sure to specify this in the text). As such, we argue that any adaptation to laboratory conditions must have happened before we conducted our selection experiment. Given that the molecular signals were unique (with one exception), we have reason to believe that the highly controlled nature of the experiment, with a constant environment throughout, did at least not bias the molecular signals toward extensive genetic parallelism.


      **Minor comments:**

      • Line 164, the authors write "this phenotype", it is unclear what phenotype is referred to as.

      ANSWER: Fixed

      • Line 187: the authors use the word "radius" in the text, while using "perimeter" in the figure.

      ANSWER: Fixed

      • Line 224: Is the use of the expression "incomplete detachment between daughter and mother cell" appropriate given that all cells emerge from a multinucleated cell?

      ANSWER: Fixed – “incomplete detachment between cells.”

      • Line 151, typo, the "with" should be removed.

      ANSWER: We believe the reviewer wanted to point out the “with” in line 251, which we fixed.

      • The intro about changes in ecology is nice but does not make sense given the rest of the paper, I would add it to the discussion.

      ANSWER: We beg to differ with Reviewer #4 here, as the water column distribution of plankton in marine environments is one of the key aspects of our paper and is a critical parameter in models of water body ecology.

      • Line 399 "increase their cell size by increasing cell-cell adhesion post-cellularization" the first use of "cell" is misleading because the objects are now a collection of cells rather than a single cell.

      ANSWER: Fixed

      Reviewer #4 (Significance (Required)): Most of the findings made in this study have been obtained in previous studies done with more genetically tractable organisms, however this is the first time that such experimental evolution was made on a unicellular non-model system organism closely related to animals. The significance of the work is reduced by the failure to produce evidence to answer two critical questions about the observed mutations: 1) were they selected during the experiment or did they hitch-hike with other selected mutations, and 2) if they were selected, were they selected because they led to faster sedimentation or some other aspect of the conditions in which they were passaged. It would take serious effort to perform additional experiments to address these questions and thus the authors are likely to be better off explaining that their work is unable to answer the questions and thus they are speculating about both the causality of the mutants and the nature of the advantage they conferred.


      ANSWER: We beg to differ with the reviewer’s argument.

      We believe that our study demonstrates heritable phenotypic changes for an evolvable, ecologically relevant trait, and their tight cellular regulation. We identify and carefully quantify how two cellular growth phenotypes (the nuclear division rate and cell size control) can vary heritably and independently of one another, and together directly shape variation in a critical ecological parameter of a marine organism. Therefore, in addition to the fact that the work was performed in an emerging model marine organism, this work provides fundamental “novel” insight into cellular trait evolution more generally.

      Our results do not depend upon knowing the exact genetic mutations or molecular mechanisms which have caused these phenotypic changes. Nor, as the reviewer implies, do we claim to have identified particular mutations that were selected, or their effects on particular cellular phenotypes. We do, however, provide a large amount of evidence that the changes are likely genetic. With our sequencing effort, we find a strong, statistically significant, molecular signal of adaptation in the lineages (dN/dS > 1), and we publish a curated list of affected genes which are potentially causative for the phenotypes we observe.

      Because we did not observe frequently recurrent mutations, as most directed (and cancer, antimicrobial resistance, etc.) evolution studies find, our results suggest that there is a large mutational target size affecting the phenotype of interest, reflecting its potentially broad genetic and molecular control mechanisms. We view these results as a great strength of the study, and consider this result in and of itself “novel”. Furthermore, we have now added a statistical genetic approach to quantify the heritability of traits, that is, the proportion of the variance in phenotype that is due to an individual’s inherited state (Figure 1 – figure supplement 1A). The results show that heritability exceeds 95% across phenotypes, and across the entire dataset, H exceeded 99% of the total phenotypic variance (ANOVA F = 1118 on 252 and 735 DF, p = 0). This means that for a typical individual genotype in a given environment, we could predict its average phenotypic measurement with >97% accuracy.
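      For readers unfamiliar with this type of estimate, the following is a minimal sketch of one common way to obtain broad-sense heritability from a balanced one-way ANOVA, with genotype as the grouping factor. The between-genotype variance component is taken as (MS_between - MS_within) / n, and H² is its share of the total variance. This is an assumed, textbook-style estimator and not necessarily the approach used for the figure supplement. The measurements are invented.

```python
def broad_sense_heritability(groups):
    """Estimate H^2 and the ANOVA F statistic from replicate measurements.

    groups: dict mapping genotype -> list of replicate measurements.
    Assumes a balanced design (same number of replicates per genotype).
    Returns (h2, f_statistic).
    """
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("balanced design required")
    n = sizes.pop()   # replicates per genotype
    k = len(groups)   # number of genotypes

    means = {g: sum(v) / n for g, v in groups.items()}
    grand_mean = sum(sum(v) for v in groups.values()) / (k * n)

    ss_between = sum(n * (m - grand_mean) ** 2 for m in means.values())
    ss_within = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))

    # Variance component attributable to genotype, floored at zero.
    var_genotype = max((ms_between - ms_within) / n, 0.0)
    h2 = var_genotype / (var_genotype + ms_within)
    return h2, ms_between / ms_within


# Hypothetical sedimentation measurements for three lines, three replicates each.
measurements = {
    "AN": [10.0, 10.2, 9.8],
    "S01": [20.1, 19.9, 20.0],
    "S04": [15.0, 15.3, 14.7],
}
h2, f_stat = broad_sense_heritability(measurements)
```

      A value of H² close to 1 indicates that replicate measurements of the same genotype differ little compared with the differences between genotypes, which is the sense in which the phenotype is predictable from the genotype.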

      The fact that we do not conclusively identify which particular mutations are causative does not obviate the overwhelming evidence that heritable changes occurred in our samples, leading to repeated phenotypic convergence affecting the trait of sedimentation rate. We believe these phenotypic changes, and our quantification of their magnitude, to be a “novel” and “significant” contribution to the literature on cellular trait evolution, ecology, and multicellularity.





    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this study Dudin et al. explored the variability of sedimentation rates in members of the Sphaeroforma genus and found that sedimentation rates are very variable between different isolates as well as during the life cycle of each isolate. Following this observation Dudin et al. evolved S. arctica under a regime favoring fast-settling objects. After a few hundred generations they observed that most lineages increased their sedimentation rate. Characterization of some of these evolved populations suggests two distinct mechanisms allowing fast sedimentation: cluster formation by non-separation of cells post-cellularization and increase in object density. By sequencing the evolved lines Dudin et al. were able to identify that several mutations have been under positive selection and that some of the mutations relate to mechanisms involved in cell separation and cellularization.

      Major comments:

      • Line 143, I don't understand how figure 1G shows that "nuclear division cycles were periodic...".
      • When characterizing the evolved lines, the authors display (and measure?) the size and the sedimentation rate separately, but don't directly compare them. If the statement that density plays a role in the sedimentation rate of S4 and S9 but not S1 is correct, then the correlation between size and sedimentation should be similar between AN and S1 and changed in S4 and S9. It would be nice to see these relationships and the correlations.
      • Line 288: "surviving 780 generations of passaging for all 10 isolates" what data is this referring to?
      • The weakest aspect of the paper is that there is neither a statistical argument (with a single anecdotal exception), from seeing the same genes or pathways mutated in parallel experiments, nor an experimental reconstruction that argues that any of the observed mutations were selected as opposed to being neutral mutations that hitch-hiked with adaptive mutations. One strongly suspects that some of the observed mutations were selected, but from the available data, it is impossible to know which were selected and which were hitch-hiking.
      • Even if the authors knew which mutations were selected, it is not possible to say whether the mutations that have been selected are directly advantageous in the settling regime; they could be due to adaptation to lab conditions, higher temperatures, etc. A control evolution experiment with no settling selection would be required to reach the conclusion that the mutants were selected for faster sedimentation.

      Minor comments:

      • Line 164, the authors write "this phenotype", it is unclear what phenotype is referred to as.
      • Line 187: the authors use the word "radius" in the text, while using "perimeter" in the figure.
      • Line 224: Is the use of the expression "incomplete detachment between daughter and mother cell" appropriate given that all cells emerge from a multinucleated cell?
      • Line 151, typo, the "with" should be removed.
      • The intro about changes in ecology is nice but does not make sense given the rest of the paper, I would add it to the discussion.
      • Line 399 "increase their cell size by increasing cell-cell adhesion post-cellularization" the first use of "cell" is misleading because the objects are now a collection of cells rather than a single cell.

      Significance

      Most of the findings made in this study have been obtained in previous studies done with more genetically tractable organisms, however this is the first time that such experimental evolution was made on a unicellular non-model system organism closely related to animals. The significance of the work is reduced by the failure to produce evidence to answer two critical questions about the observed mutations: 1) were they selected during the experiment or did they hitch-hike with other selected mutations, and 2) if they were selected, were they selected because they led to faster sedimentation or some other aspect of the conditions in which they were passaged. It would take serious effort to perform additional experiments to address these questions and thus the authors are likely to be better off explaining that their work is unable to answer the questions and thus they are speculating about both the causality of the mutants and the nature of the advantage they conferred.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The present work adds to the growing literature on sedimentation rate as a major player in the evolution of multicellularity. Via rigorous experimentation, the authors convincingly show that they can select for increased sedimentation rate and identify two mechanisms underlying this increase: incomplete cellular separation leading to multicellular groups and increases in cellular density. They also show surprising natural variation in sedimentation and argue that, along with similar evidence from other organisms, their findings cement the likely major role of sedimentation and go farther by revealing the tight genetic control that it is under.

      Significance

      This is a very significant study because it illuminates processes and underlying mechanisms that could have played a major role in the transition to multicellularity. Their results will likely greatly influence conceptual and theoretical thinking and will foster additional empirical directions. My only quibble with the manuscript is that I wished for a bit more ecological context and grounding of the main findings: in that respect, both the abstract and the last paragraph of the discussion leave me wanting and occasionally puzzled. If maintaining buoyancy is such a strong selective pressure and the variation in sedimentation rate is such a challenge to it, then I think explaining a bit more exactly why sedimentation would evolve, why so much variation would exist, etc., would be really helpful to the more naive reader. Just a bit further elaboration on selective pressures (even presumed ones and even if speculative) would be helpful to put the picture together.

      A more minor point:

      I remember seeing a talk by Will Ratcliff a while back in which he showed that in S. cerevisiae they also see the two mechanisms of increased sedimentation: increased cellular size and clumping. Yet, I didn't see a reference to that work in the context of the cell density mechanism discussion and wondered why.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Hello, we wrote our review before seeing that you have special formatting requirements. We're just going to post our review in its entirety rather than rewrite it based on these suggestions. It encompasses the above content, it's just not formatted in the suggested order. We hope that's OK!

      Full review:

      This manuscript makes a strong case for the evolvability of multicellular size via selection for settling rate in the Ichthyosporea. The use of an experimental evolution framework to assess the evolvability of multicellular phenotypes, using sedimentation rate as a selective pressure, extends the previous work of others into a new domain within the holozoans and the closest living relatives of animals. The natural, ecological significance of selection for sedimentation rate is a novel idea, and the connection between sedimentation rate and multicellular evolution in natural as opposed to contrived experimental circumstances is interesting. The results are striking and well supported, with laboratory evolution rapidly adjusting both the cellular composition and the multicellular phenotypes of the organisms involved in ways that are well explained. This is an important result that brings the laboratory study of the evolution of multicellularity forward, into a different branch of the tree of life and showing its broad applicability.

      Sequencing of evolved lines adds significantly to the completeness of the story. While the causal role of these mutations in the production of the observed multicellular phenotypes is not demonstrated via manipulation or breeding, this is quite understandable in the light of the unusual model organism and the observed homologies and role of the genes involved. While this is largely clear from a reading, we believe the manuscript would benefit from a brief analysis of the numerical enrichment of genes with homologs involved in cytokinesis, cell membrane composition, and cell cycle control relative to the null hypothesis of genes picked randomly from the genome. If this is beyond the scope of this research in an unusual model organism with many poorly annotated genes, then a slightly expanded verbal discussion of the potential roles of the apparent functions of these genes in the evolution of multicellular clumping would be an appropriate substitute.
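For readers unfamiliar with the kind of enrichment analysis requested above, the following is a minimal sketch of a one-sided hypergeometric test against the null hypothesis of genes drawn at random from the genome. It is not part of the review or the manuscript: the function name and every count in the example are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(genome_genes, category_genes, mutated_genes, mutated_in_category):
    """One-sided p-value P(X >= mutated_in_category).

    X is the number of category genes hit when `mutated_genes` genes are
    drawn at random, without replacement, from a genome of `genome_genes`
    genes, of which `category_genes` belong to the functional category.
    """
    upper = min(category_genes, mutated_genes)
    total = comb(genome_genes, mutated_genes)
    tail = sum(
        comb(category_genes, k) * comb(genome_genes - category_genes, mutated_genes - k)
        for k in range(mutated_in_category, upper + 1)
    )
    return tail / total


# Hypothetical counts for illustration only (not taken from the manuscript):
# 8000 annotated genes, 200 with cytokinesis-related homologs,
# 20 mutated genes in the evolved lines, 4 of which fall in the category.
p = hypergeom_enrichment_p(
    genome_genes=8000, category_genes=200, mutated_genes=20, mutated_in_category=4
)
print(f"one-sided enrichment p-value: {p:.3g}")
```

A small p-value would indicate that the observed overlap is unlikely under random sampling of genes. With poorly annotated genomes, the choice of the background gene set strongly affects the result and should be reported.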

      We wholeheartedly recommend the publication of this manuscript with a number of minor revisions, which while not affecting the main conclusions or points of the manuscript will clarify important points, adjust small errors, and point the reader at relevant literature and concepts.

      Major points:

      none.

      Minor points:

      Line 79 - is sedimentation rate really invariably associated with multicellularization? Active swimming would seem to prevent this.

      Line 164 - the precise phenotype in the evolution experiment being referred to is unclear without further context, with the ordering of paragraphs possibly needing a little work.

      Line 178 - is sorting them into three classes informative? Are there different mutations associated with these, or is it just visual clumping on the number line? Perhaps not a useful classification, but the existence of great variation is an important point to get across. A more useful classification might be those that increase sedimentation with large density changes versus exclusively by clumping.

      Line 254 - excess cellular density is referred to interchangeably with density, when these are very different figures. This continues in line 269, and in the figure legends of Figure 4.

      Line 341 - the role of the RCC1 homolog in other organisms could be expanded on in slightly more detail. Similarly, other mutations in this same section known to affect cytokinesis could have potential mechanisms for affecting clumping commented upon, especially given the cell membrane results in the figures.

      Line 387 - awkward formatting or sentence structure, with dashes and commas.

      Line 395 - this cellular process, or this evolutionary process of selection for faster settling?

      Line 408 - per unit volume

      Line 425 - the idea of clumpiness as ancestral is quickly put forward and dismissed within a single sentence. This could be explored in slightly more detail as an option, before concluding that what is clear is that the phenotype is easy to change.

      Line 437 - sedimentation as a highly variable trait, or a highly evolvable trait?

      Figure 1G, 1H: We are fairly certain that the logarithmic scale of DNA content and coenocyte volume are mislabeled. The scale that is labeled log2 in 1G in the legend goes up by factors of 2 rather than single digits. The axis is obviously logarithmic, and the log2 in the legend is superfluous and misleading. Similarly, in 1H a scale labeled as log10 goes from 1 to 30, which on a logarithmic scale would be a sphere approximately 100 kilometers wide. The numbers can remain, but the legend should remove the log10.

      General:

      Were there any head to head competitions performed? Not suggesting you need to, but it's a nice way to directly examine fitness consequences of multicellularity, and is commonly done in the field. If you have done this it wasn't clear to us.

      Significance

      See the above comments about writing the review before realizing there were specific formatting suggestions. I hope you understand us not wanting to re-write the review having already written it once.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General Statements [optional]

      We thank the reviewers for their thoughtful, constructive, and highly actionable critique. The reviewers mentioned that “the experiments presented are well-designed, the methods well-implemented, and communication of the authors' findings is clear and concise”. We are happy to hear that “figure presentation and manuscript layout are top notch and... these data are easy to read and interpret”.

      We appreciate the reviewers’ suggestions for improving the interpretability of the morphodynamic representation and address each of the Reviewers’ comments (typeset in blue) in the document below.

      Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.

      Reviewer # 1 (major points)

      * The Trajectory Feature Vectors (TFVs) are averaged over time - this seems to lose a lot of the salient information in the trajectories themselves, resulting in the low(ish) accuracy of the GMM. Could a Hidden Markov Model trained on the trajectories in state space help to identify/classify those trajectories that change their morphology/motion over time?

      Thanks for the suggestion. We did recognize that averaging will smooth the dynamics in each cell trajectory and reduce diversity of phenotypes. On the other hand, the temporal smoothing serves to reduce the noise, especially when the cells have reached steady state dynamics after being stimulated with pro- or anti-inflammatory cytokines. Our experiments were constructed to probe steady state dynamics and therefore we opted to use temporal smoothing.

      It is possible to identify rare transitions even with some temporal smoothing.

      In our analysis of rare transitions (Fig. 4C), we extracted long trajectories and split them into segments (10~15 frames, 1.5~2 hours). By applying Gaussian Mixture Model (GMM) to each segment, we identified a sequence of states along the full trajectory, from which state transitions were identified.

      During the revision, we will employ the Hidden Markov Model (HMM) to model state transitions in the latent shape space as suggested by the reviewer to detect rare transitions. Our expectation is that HMM will be able to identify more transition events due to its higher time resolution (frame instead of segment), though it may also be affected by unexpected imaging artifacts and noise.
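To make the frame-level state decoding mentioned above concrete, here is a minimal sketch of Viterbi decoding for a two-state hidden Markov model with Gaussian emissions, followed by extraction of transition events. This is an illustration under stated assumptions, not the authors' implementation: the scalar per-frame feature, the state means and standard deviations, and the transition probabilities are all hypothetical, and in practice the model parameters would be fitted to the latent shape space rather than fixed by hand.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi(observations, means, stds, log_start, log_trans):
    """Most likely hidden-state path for a Gaussian-emission HMM (log domain)."""
    n_states = len(means)
    score = [
        log_start[s] + gaussian_logpdf(observations[0], means[s], stds[s])
        for s in range(n_states)
    ]
    backpointers = []
    for x in observations[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at this frame.
            best_prev = max(range(n_states), key=lambda r: score[r] + log_trans[r][s])
            new_score.append(
                score[best_prev] + log_trans[best_prev][s]
                + gaussian_logpdf(x, means[s], stds[s])
            )
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def transitions(path):
    """List of (frame index, previous state, new state) for every state change."""
    return [(t, path[t - 1], path[t]) for t in range(1, len(path)) if path[t] != path[t - 1]]


# Hypothetical scalar feature per frame (e.g. one latent PC); values are made up.
obs = [0.1, -0.2, 0.0, 0.3, 2.1, 1.9, 2.2, 2.0, 1.8, 0.2]
means, stds = [0.0, 2.0], [0.5, 0.5]           # assumed state parameters
stay, switch = math.log(0.9), math.log(0.1)    # assumed transition probabilities
path = viterbi(obs, means, stds, [math.log(0.5)] * 2, [[stay, switch], [switch, stay]])
print(path)
print(transitions(path))
```

Because decoding operates on individual frames, a single noisy frame can register as a transition. The self-transition probability plays a role similar to the temporal smoothing discussed above, trading sensitivity to brief events against robustness to noise.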

      Reviewer # 1 (minor points)

      Could the authors provide some example images showing interpolation of each PC using the generative decoder?

      Thanks for the suggestion. However, the discrete nature of the latent codebook of VQ-VAE makes it challenging to use interpolation as a proxy for the utility of the learned representation. A possible link between interpolation abilities and usefulness of representation learned by autoencoders has been explored in this paper by Berthelot et al. As Berthelot et al. note, “We perform interpolation in the VQ-VAE by interpolating continuous latents, mapping them to their nearest codebook entries, and decoding the result. Assuming a sufficiently large codebook, a semantically “smooth” interpolation may be possible. On the lines task, we found that this procedure produced poor interpolations. Ultimately, many entries of the codebook were mapped to unrealistic datapoints, and the interpolations resembled those of the baseline autoencoder.”

      Reviewer # 2 (major points)

      -It's unclear what the effect of speed is on the final state determination. TFVs were composed of auto-encoder-based features (PCs from latent space) and speed of the cells. Would the states be very different without speed as part of the TFVs or with TFVs consisting only of speed features? Please quantify and discuss.

      Thanks for your comment. We agree that speed of the cell is a main factor that contributes to the clustering, though shape features (from VQ-VAE) do contribute (Fig. 3B, histograms) to discrimination of cell states. In the revision, we will perform the clustering analysis with only shape features and compare with current results of Fig. 4.

      Reviewer # 3 (major points)

      1. Temporal consistency regularization

      In the authors' framework, models are regularized to minimize the l2 norm between embeddings of adjacent timepoints.

      This approach is conceptually well-motivated, but could have some unintended effects.

      For instance, some cells may make a rapid state transition such that state(t-1) = A, state(t) = B, state(t+1) = A'.

      In these cases, a regularized model may best minimize the joint loss by returning an embedding at time t that interpolates between state A and A', rather than returning an embedding that reflects the true distinct state B.

      The work would be strengthened if the authors analyzed the impact of this regularization term on the detection of rapid state transitions that occur for only a few frames (e.g. when cells that exhibit filopodial motility "jump" in an actin/myosin contraction).

      This might be accomplished through experiments scanning different regularization hyperparameters on some of the authors' real data, fitting models on temporally downsampled versions of the real data where "slow" multi-timestep transitions now occur in a few timesteps, or perhaps using simulations where rapid state transitions are known to occur.

      Even if the regularization does have some negative impacts, it does not argue against the utility of the general approach, but it is important for users to understand the constraints on downstream applications.

      In our revision, we will evaluate the optimal matching loss for our dataset by training the model with a series of temporal matching loss weights. With this computational experiment, we will illustrate the trade-offs introduced by the relative strengths of matching and reconstruction losses.

      Our expectation is that with very high matching loss, the embeddings (latent vectors) of the frames of the same trajectory will collapse regardless of morphology. For a relatively wide range of matching loss weights, rank relations between transition pairs ([A->B] + [B->A'] >> [A->A']) should be preserved, from which the rare transitions can be robustly identified. In our experiments, most cells reached steady state morphodynamics when imaged, i.e., the matching loss between two adjacent frames arises primarily due to variations in background/noise. Fast transitions are “rare” in our data. Numerically, fast transitions contribute less to the matching loss during training and therefore their latent representations are not minimized. In other words, if B is a morphologically different state from A/A', the model is driven more by the reconstruction loss due to morphological difference rather than temporal smoothness across three consecutive frames.
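The trade-off discussed in this exchange can be illustrated with a deliberately simplified toy model. It is an assumption made for illustration, not the VQ-VAE objective itself: the reconstruction term is stood in for by the squared distance between the embedding at time t and the embedding that would faithfully represent state B, and the matching term is the weighted squared L2 distance to the two neighbouring embeddings. Under that assumption the optimal embedding has a closed form, and a scan over weights shows both the pull toward interpolation between A and A' and the preserved rank relation.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def optimal_middle_embedding(prev_z, true_z, next_z, weight):
    """Minimiser of ||z - true_z||^2 + weight * (||z - prev_z||^2 + ||z - next_z||^2).

    Setting the gradient to zero gives z = (true_z + weight * (prev_z + next_z)) / (1 + 2 * weight).
    """
    return [
        (t + weight * (p + n)) / (1 + 2 * weight)
        for p, t, n in zip(prev_z, true_z, next_z)
    ]


# Hypothetical 2-D embeddings: A and A' are close, B is a distinct state.
a, b, a_prime = [0.0, 0.0], [4.0, 0.0], [0.2, 0.0]

for weight in (0.0, 0.1, 1.0, 10.0):
    z = optimal_middle_embedding(a, b, a_prime, weight)
    detour = sq_dist(a, z) + sq_dist(z, a_prime)   # [A->B] + [B->A'] in embedding space
    direct = sq_dist(a, a_prime)                   # [A->A']
    print(f"weight={weight:>4}: z_t={z}, detour={detour:.3f}, direct={direct:.3f}")
```

In this toy setting the embedding at time t moves from B toward the midpoint of A and A' as the weight grows, while the detour distance stays larger than the direct distance over a range of weights and shrinks toward it only when the weight dominates. Whether the same holds for the trained model is exactly what the proposed hyperparameter scan would test.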

      Baseline comparisons

      The authors evaluate their method by assessing the correlation of embedding PCs with heuristic features (Fig. 2C,D + supp.), variation of embedding PCs across cell treatment groups (Fig. 3), and qualitative interpretation of embedding trajectories.

      In the supplement, the authors compare their VQ-VAE approach to VAEs and AAEs and chose to use a VQ-VAE based on lower reconstruction error and higher PC/heuristic feature correlation.

      However, the authors do not compare their method to much simpler baseline approaches to this problem.

      Existing literature suggests that heuristic features of cell shape and motion (similar to those the authors use to evaluate the relevance of their embeddings) are sufficient to perform many of the same tasks a VQ-VAE is used for in this work.

      For instance, in Fig. 3 it appears that a simple analysis of cell centroid speed recovers much of the same information as the complex VQ-VAE embeddings.

      In Fig. 2 - Supp. 6, it appears that after regressing out many heuristic features of cell geometry, the latent space largely explains cell non-autonomous information about the background environment, suggesting the heuristic features are largely sufficient.

      To demonstrate the usefulness of their deep modeling approach relative to simple baselines, the authors should compare against existing heuristics and embeddings of heuristics (e.g. PCA) using some of the tasks shown for the VQ-VAE (recovery of perturbation state, state transition detection, qualitative trajectory analysis, discrimination of cell types).

      Heuristics might include those already calculated here, or a more comprehensive set as cited in the Introduction.

      The authors may also consider comparing against baselines that don't include time information for some of their tasks (e.g. recovery of perturbation state could arguably be achieved with CNNs that are either ignorant of the timestep or use simple temporal conditioning, without including trajectory information).

      If these features are sufficient for many of the same tasks performed in this work, the authors should provide a clear argument for readers as to why the unsupervised VQ-VAE approach may be preferable (e.g. ability to recover potentially unknown cell changes, for which no heuristic exists).

      The VQ-VAE doesn't need to be superior along every axis to hold merit, but the work would be strengthened if the authors could show clear superiority along some dimension.

      Thanks for your comments. We agree that through our exploration, specific heuristic features are found to be correlated with latent shape features. We did not start with heuristic features, but instead identified them after observing how cell morphology changes along the principal components of the latent shape space. Discovering the heuristic shape features that describe the variation in shape space, in our view, reinforces the value of self-supervised learning of complex cellular morphologies.

      We’d argue that the dynamorph pipeline complements heuristic approaches: it enables discovery of cell states through unbiased encoding and clustering, and the correlation of learned features with heuristic features enables interpretation of the cell state/data distribution more quantitatively than using either approach in isolation. Our argument is further reinforced by the related work (e.g., Zaritsky et al. and others mentioned in the introduction) on self-supervised learning of cell shape and interpretation of its latent space.

      More specifically, self-supervised learning with temporal matching generates unbiased and smooth encodings for cell morphologies, from which we identified the rank correlations between top PCs and certain geometric properties. However, this does not indicate that the set of heuristics chosen a priori will be equally descriptive of the shape distribution. For example, optical density of cells (phase) is a heuristic feature that has not been used in previous studies, which we recognized after sampling the PCs of shape space. Further identification of such correlations is by itself an interesting discovery enabled by self-supervised learning.

      In the current manuscript, we compared learned latent features (PCA on VQ-VAE latent embeddings) against a simple baseline (top PCs of raw images) and showed superior performance, which already illustrates the advantage of self-supervised learning in denoising data and extracting key diversities. In the revision, we will compare PCs of multiple heuristic features (e.g., cell size) with latent features to further strengthen the above point.

      Reviewer # 3 (minor points)

      For Fig. 4 - supp 1 -- isn't it expected that the GMM cluster of a vector can be predicted from the vector? The GMM clusters were derived from the vectors to begin with, so this seems like a bit of a circular analysis. If I'm missing something, this figure might benefit from more exposition.

      Thanks for your question. The original purpose of this confusion matrix was to parallel Fig. 3 - supp 2, showing that the GMM generated distinct cell states that describe the population better than perturbation conditions. The confusion matrix itself is trivial, so we will evaluate how to make this point more precisely during the revision.

      For Fig. 4 - Supp 3, the authors should consider changing the "state" and "cluster" colors on the embedding projections so that they do not match. As presented, it appears as if the states and clusters were co-assayed and linked by some experimental label, when in fact the State 1::Cluster1, State 2::Cluster 2 relationship is just inferred.

      Thanks for your comment, we will change the color scheme for Fig. 4 - supp 3 to avoid confusion in the revision.

      * Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      Reviewer # 1 (major points)

      * The temporal matching to enforce a smooth latent space representation is interesting. The authors mention that they mask out surrounding cells with a median pixel value. Have the authors considered using a pixel weighting in the reconstruction/matching loss to differentiate foreground/background? Also, does this affect detection of any fast (or indeed rare) transitions in the trajectories?

      Thanks for your comment and question. Yes, we indeed incorporated a pixel weighting strategy during training. In addition to masking out surrounding cells, we used a smoothed and enlarged version of individual cell's segmentation mask to emphasize accurate reconstruction of the center cell in each patch, and reduce the influence of the surrounding cells/artifacts/background fluctuations. Matching loss is computed from latent vectors, which will be indirectly affected by the pixel weighting as well.

      More detailed description of the weighting strategy will be added to the methods section. The code for our weighting strategy can be found at: https://github.com/czbiohub/dynamorph/blob/b3321f4368002707fbe39d727bc5c23bd5e7e199/HiddenStateExtractor/vq_vae_supp.py#L287
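As a rough illustration of the idea described above, the sketch below builds a weight map from an enlarged and smoothed segmentation mask and uses it in a weighted mean-squared reconstruction error. It is not the authors' code (which is linked above): the 3x3 dilation, the 3x3 mean filter, the `background_weight` value, and all function names are assumptions chosen to keep the example small.

```python
def dilate(mask):
    """Enlarge a binary mask by one pixel (3x3 maximum filter)."""
    h, w = len(mask), len(mask[0])
    return [
        [
            max(
                mask[i][j]
                for i in range(max(r - 1, 0), min(r + 2, h))
                for j in range(max(c - 1, 0), min(c + 2, w))
            )
            for c in range(w)
        ]
        for r in range(h)
    ]


def box_smooth(image):
    """Smooth with a 3x3 mean filter (edge pixels use the available neighbours)."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            vals = [
                image[i][j]
                for i in range(max(r - 1, 0), min(r + 2, h))
                for j in range(max(c - 1, 0), min(c + 2, w))
            ]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def weighted_mse(target, recon, weights):
    """Mean squared error in which each pixel is scaled by its weight."""
    num = sum(
        wt * (t - x) ** 2
        for t_row, x_row, w_row in zip(target, recon, weights)
        for t, x, wt in zip(t_row, x_row, w_row)
    )
    den = sum(wt for w_row in weights for wt in w_row)
    return num / den


# Hypothetical 5x5 patch with a one-pixel "cell" in the centre.
mask = [[1 if (r, c) == (2, 2) else 0 for c in range(5)] for r in range(5)]
background_weight = 0.1  # assumed floor so the background is not ignored entirely
soft = box_smooth(dilate(mask))
weights = [[background_weight + (1 - background_weight) * s for s in row] for row in soft]

target = [[0.0] * 5 for _ in range(5)]
err_centre = [[1.0 if (r, c) == (2, 2) else 0.0 for c in range(5)] for r in range(5)]
err_corner = [[1.0 if (r, c) == (0, 0) else 0.0 for c in range(5)] for r in range(5)]
centre_loss = weighted_mse(target, err_centre, weights)
corner_loss = weighted_mse(target, err_corner, weights)
print(centre_loss, corner_loss)
```

The same unit error costs more at the centre of the patch than at a corner, which is the intended effect: reconstruction of the centred cell is emphasised while neighbouring cells and background fluctuations contribute less.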

      Reviewer # 1 (minor points)

      I was a little confused by the labels given to the PCs, as they seem to vary between figures. For example, In Fig2, PC1 and PC2 are Size and Peak Retardance, but in Fig3 they are referred to as Size and Cell Density (which could be interpreted as the number of cells per unit area). Could the authors clarify these in the captions?

      We have clarified the text to distinguish between cell density (population) and optical density (phase).

      The authors note that single-cell tracking is of vital importance. This should be elaborated upon. Also - could the VQ-VAE encodings be used to help track linking in cases of high density?

      We added a clearer reference to the methods section containing details of the tracking procedure. Additionally, we clarified in the discussion that the methods used for segmentation and tracking cells can be refined for high density cultures. Since we rely on the tracks to compute the temporal matching loss and regularize the VQ-VAE encodings (shape space) during the training, the encodings are not usable for refining tracking in high-density populations.

      Reviewer # 2 (major points)

      -'Cell state' in the field of cell biology has been operationally defined in so many different ways and with so many different types of measurement data, that 'cell state' is becoming a somewhat vacuous term. This is not only a problem of this paper but a challenge for the field. In this case, cells are clustered using a Gaussian mixture model that uses the first few principal components of the latent space coefficients as well as speed - both averaged across the frames of cell tracks. This is fine and descriptive, but it's unclear whether this definition of 'cell state' is easily applied to other datasets and how this definition can be operationalized for hypothesis generation and experimentation. For other datasets, e.g. other cell types and other processes, such as differentiation, where e.g. tracking and segmentation may be more difficult and images would look quite different, can one still apply the same approach towards describing cell states? One could state that this definition of cell state is very specific to the dataset and therefore not generally useful. How would the authors respond to such a statement?

      This is an excellent point. We agree that the meaning of a “cell state” or a “cell type” can depend on the context. Cell state can be rigorously described in terms of measurements of the cells, and recent developments of new cell probing techniques, including imaging modalities and single-cell genomics keep adding to the growing list of the features that can be measured. Time-lapse imaging is high dimensional and therefore admits multiple definitions of cell state. Our use of the terms ‘latent shape space’ and ‘trajectory feature vectors’ clarifies how we define the cell state. Given the increasingly wider use of live cell imaging for biological studies and drug discovery, both of these descriptors of cell state are valuable. In the current manuscript, we focus on a combination of morphodynamic features, including but not limited to the cell shape, size, and speed. We use these features to cluster cells in an unbiased manner to detect morpho-dynamic “states” unique for this particular culture system. Our approach can be generalized to other cell culture systems, such as cell differentiation, where cell architecture evolves substantially.

      To clarify this point, we add the following text in the manuscript:

      Line 85: “The meaning of a "cell state" can vary with the physiological and methodological context. In this work, we refer to "morphodynamic states" as a combination of morphological and temporal features. From the trajectory of cells in the latent shape space, we identified transitions among morphodynamic states of single cells. The same approach enabled detection of transitions in the morphodynamic states of cells as a result of immunogenic perturbations.”

      In the discussion:

      Line 333: “ Our work formalizes an analytical approach for data-driven discovery of morphodynamic cell states based on the quantitative shape and motion descriptors. A cell state can be rigorously described in terms of measurements of the cells, and recent developments in measurement techniques, including imaging modalities and single-cell genomics keep adding to the growing list of the features that can be measured. Time-lapse imaging is high dimensional and therefore admits multiple definitions of a cell state.”

      -It's unclear to the reviewer whether the training data (unperturbed microglia) are close enough to the test data (perturbed microglia) such that application of the trained model to the test data makes sense. The authors provide reconstruction loss numbers, but they are difficult to interpret. Can the authors create plots of the unperturbed and perturbed microglia cells in the latent space and show overlap, or in other ways show that training data and test data are close enough for this application?

      Thank you for pointing out the lack of clarity in generalizability of the model. We trained the model on control, untreated microglia acquired during one experiment, and then applied it to a separate dataset acquired during another experiment that included perturbed and control microglia. The reconstructions shown in Fig. 2 are from the test dataset that was not used during training. The quality of reconstructions supports that the shape space of the training set is representative of the shape space of the larger test set. We will add a density plot in the supplementary figures showing the overlapping latent space distribution of unperturbed (training dataset) and perturbed (test dataset) microglia.

      We now include the revised sentence in the manuscript to clarify the results:

      Line 132: “Comparison of reconstructed shapes from the test set and training set along with the analysis of the shape space described in the next section show that our self-supervised model trained on training dataset generalized well between independent experiments and can be used to compare cell state changes between control microglia and cells treated with multiple perturbations”.

      -Only a small amount of intensity variation is explained; 17% using the first 4 PC components which are mainly used in the analyses. This seems like a very low number. There is a lot of variation in the intensity images that is not explained by the autoencoder. The autoencoder seems to be doing a bad job. At the same time, the downstream analyses using the latent space are insightful and sensible. Can the authors provide more explanation?

      Thanks for your question. We would like to first clarify that the autoencoder (VQ-VAE) used in this work follows the design of the original reference, which doesn't have a very large compression. Given the latent space size (16x16x16), it is understandable that the 4 top PCs captured relatively smaller portions of the variance. The fact that cell shape cannot be described with few principal components is likely due to: a) diversity of morphology of microglia, b) diversity of modalities used to train the model.

      We now include the following text in the manuscript: Line 158: “The high variance of the shape space of microglia can be due to more complex shapes of microglia, such as diversity of protrusions, sub-cellular structures and variations in cell optical density, location of nuclei in migrating cells, etc. As we mentioned above, the inclusion of several imaging channels (brightfield, phase, and retardance) increases the performance of the model, possibly by increasing the diversity of morphological information encoded in our input data.”

      As you note, the downstream analyses from the learned latent space are insightful, e.g., we do detect substantial changes in top PCs upon perturbations. This supports our view that the shape space of microglia as encoded by our data is intrinsically high dimensional and the transients in the shape space are informative.
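For context on the 17% figure, the following sketch shows how the fraction of variance explained by the top k principal components follows from the PCA eigenvalue spectrum. It assumes, as an interpretation rather than a confirmed detail, that the 16x16x16 latent is flattened to 4096 dimensions before PCA; both spectra below are hypothetical and serve only as reference points.

```python
def explained_fraction(eigenvalues, k):
    """Fraction of total variance captured by the k largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


latent_dim = 16 * 16 * 16  # 4096 dimensions if the latent tensor is flattened

# Reference point 1: perfectly isotropic variance (all eigenvalues equal).
flat_spectrum = [1.0] * latent_dim
flat_fraction = explained_fraction(flat_spectrum, 4)

# Reference point 2: a hypothetical slowly decaying spectrum (eigenvalue ~ 1/rank).
decaying_spectrum = [1.0 / (rank + 1) for rank in range(latent_dim)]
decaying_fraction = explained_fraction(decaying_spectrum, 4)

print(f"isotropic: {flat_fraction:.4%}, 1/rank decay: {decaying_fraction:.1%}")
```

With 4096 dimensions, four components of an isotropic spectrum would capture only 4/4096 of the variance, so a figure of 17% already implies substantial structure in the leading components even though most of the variance lies elsewhere.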

      Reviewer # 2 (minor points)

      -The motivation for GMMs over k-means is unclear. K-means clustering leads to spatial separation between clusters (states) since all cells/tracks that are closest to their cluster mean are by definition further away from the means of other clusters. This is not the case with the more flexible GMMs; e.g. they allow one to have a smaller cluster (with small variance components) inside of a larger cluster (with large variance). The latter scenario seems undesirable for interpretation in terms of states.

      Thanks for your comments. The major reason for choosing GMMs over K-means clustering is that GMM allows different prior distributions for different perturbations. In practice, K-means would be capable of generating clusters regardless of perturbation conditions, while GMM enables a finer separation of states which are very likely correlated with perturbations. We agree that GMM has certain caveats as you mentioned in the comment. In our analyses, we didn’t notice the issues such as ‘nesting of components’ that you described.
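The nesting scenario raised by the reviewer can be reproduced in one dimension with a minimal sketch. The mixture parameters below are hypothetical and hand-picked, not fitted to any data: a narrow component sits inside a broad one, so the broad component claims points on both sides of the narrow one, whereas nearest-mean assignment (as in k-means) always yields contiguous regions.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def gmm_assign(x, weights, means, stds):
    """Index of the mixture component with the highest posterior responsibility."""
    dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    return max(range(len(dens)), key=dens.__getitem__)


def kmeans_assign(x, centers):
    """Index of the nearest cluster centre."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))


# Hypothetical mixture: component 0 is broad, component 1 is narrow and nested inside it.
weights, means, stds = [0.5, 0.5], [0.0, 0.5], [3.0, 0.2]

points = [-3.0, 0.5, 3.0]
gmm_labels = [gmm_assign(x, weights, means, stds) for x in points]
kmeans_labels = [kmeans_assign(x, means) for x in points]
print(gmm_labels, kmeans_labels)
```

In this example the point at 3.0 is closer to the narrow component's mean but is assigned to the broad component by the mixture model. Checking fitted covariances and component means for this pattern is one way to verify that nesting does not occur in a given analysis.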

      -Related to the previous point, 'self-supervised' sounds nice, but it's still optimizing towards something, in this case explaining the variation in input intensity images. A lot of the variation in the intensity images may not be of interest for the biological investigation of shape and dynamics. Did the authors uncover that indeed some of the latent dimensions are encoding other aspects of the images which may be less related to the biology and more to image properties/artifacts/biases?

      We agree with your assessment. Precisely for the reasons you point out, we use temporal regularization to counter the dependence of the learned representation on non-biological variations in the data. This point is also recognized by Reviewer #3. We now clarify in the manuscript that not all latent features represent the biology of the cells; some represent features of the instrument and the experiment. We report this for the top few PCs of the latent representation and provide the code for the interested reader to discover what the other PCs report.

      -The original images are 3D (5 z-planes). The analyzed images were 2D. The reviewer missed how the authors went from 3D to 2D. And since cells are 3D, can the authors describe what they gained by going to 2D and what they potentially lost?

      We added additional text to the methods subsection describing the Dynamorph Pipeline (line 590):

      “The input data for both semantic segmentation and VQ-VAE models are 2D-images of computed phase and retardance that measure integrated optical density and anisotropy across the depth of the cell. The raw collected data is 3-dimensional (5 z-slices acquired in multiple polarization channels). The 2D phase is computed from the full stack of brightfield images via deconvolution. The retardance is computed from an average of the intensities across the 5 z-slices. Subsequent model training is more tractable with 2D data instead of 3D, while capturing the cell architecture across the depth.”

      Reviewer # 3 (major points)

      Cell state transition interpretation

      In line 278, the authors propose that the unbalanced nature of transitions such that p(1 -> 2) >> p(2 -> 1) must represent some difference in timescales across the transitions because "cell states should have reached equilibrium after several days in culture at the time of the imaging experiments".

      This logic is unclear to me for two reasons.

      * If the population obeys detailed balance (e.g. transitions have equal frequency), then observed transitions should be balanced on a reasonably long time window, even if individual transitions occur on different timescales.

      * The assumption that cell states are balanced after a few days in culture is at odds with a few different aspects of the biology. Cell density and nutrient availability are continually changing in the dish, so culture conditions are non-stationary. Imaging apparatuses also commonly impact the cell biology of imaged samples due to imperfect incubation, etc. (2 or 3)

      It seems likelier that these data represent an unbalanced transition due to the non-stationary nature of the culture system.

      Given the authors' emphasis on the value of measuring these transitions, the work would be strengthened by a more careful interpretation of these results, additional analysis details (e.g. how large are most state transitions? are these mostly small shifts "over the border" in state space, or large jumps?), and an attempt at biological interpretation of the observed phenomenon.

      The authors' RNA-seq data may be helpful in this latter regard.

      This is an excellent point. We agree that the cell culture conditions, including nutrient availability, the accumulating presence of metabolites, and imaging-induced changes, constantly introduce new variations to the system. In an attempt to mitigate these dynamic changes to the system, we maintained cells in culture for six days before starting the experiment. To avoid cell stimulation due to freshly added nutrients and growth factors from the culture media, we consistently exchanged the media and performed cytokine treatments 24 hours before each imaging experiment. Each imaging round was started after the cells were allowed to equilibrate to the environmental chamber for at least one hour. Despite these efforts, we agree with the reviewer that the conditions cannot be considered fully stationary. We removed the sentence “Given that cell states should have reached equilibrium after several days in culture at the time of the imaging experiments, these results suggest that the transitions from state 2 to state 1 occur at a different time scale (i.e., much slower)” and changed the text to reflect this point:

      Line 294:

      “In our analysis, transition events are very rare among cells treated with IFN beta, while the most frequent cell transitions were observed among cells treated with GBM supernatant. One possible explanation for this imbalance is that IFN-treated cells represent a single polarization axis, while a heterogeneous cell signaling milieu derived from cancer cells provides conflicting pro- and anti-inflammatory signals, instructing cells to transition between the states. While both directions of transitions were observed within the imaging period, cells in state-1 are more likely to transition to state-2 than vice versa within the chosen time frame. This imbalance between the rates of state transitions correlates with the higher state-2/state-1 ratio in GBM and control environment and may explain the longitudinal accumulation of cells in a more activated state under these culture conditions.”

      1. Single cell RNA-seq analysis

      The authors performed a very interesting experiment where they profiled the same cell population using both timelapse imaging and single cell RNA-seq.

      The authors argue that the global structure of the state space resolved by each modality is analogous, but this seems a bit of a stretch to me.

      The behavior state space is unimodal (bifurcated into two states by GMM clustering), while the mRNA-seq space has several distinct clusters.

      The argument that these states are analogous would be significantly strengthened by biological interpretation of the RNA-seq data.

      Do the mRNA profiles exhibit differentially expressed genes that might explain differences in behavior in the cell behavior states?

      The analyses in Fig. 4 - Supp 4 are suggestive that "State 1" contains interferon-responsive cells and not control cells, but broader conclusions don't appear well supported by current analyses.

      We agree with the reviewer’s comment that the analogy between molecular cell states defined with scRNAseq analysis and morphodynamic cell states defined with dynamorph needs to be clarified. In our current work, the correlative measurement of morphodynamics and transcriptome was exploratory and relied on population statistics measured with each modality. More detailed studies linking morphodynamic states to the single cell transcriptomics, such as Patch-Seq or laser microdissection, are needed to decisively link morphodynamics and molecular programs underlying these phenotypes.

      Single cell transcriptomics simultaneously measures thousands of mRNA species in individual cells. Therefore, it can provide a nuanced interpretation of the molecular states of each population, as can be seen in the more granular separation of sub-states in scRNAseq clustering. For example, Cluster 1-2 was defined by high expression of interferon response genes, and predictably, this cluster was primarily derived from the cells treated with IFNb. Interferon exposure induces morphological changes associated with increased cell perimeter, which reports ramification of microglia plasma membrane (Aw et al., PMID: 33183319). It was also shown that infections with neurotropic viruses, which trigger an interferon response, also lead to decreased velocity and distance traveled for cultured microglia cells (Fekete et al., PMID: 30027450). These observations are in direct agreement with our morphodynamic analysis demonstrating a higher proportion of cells in State 1, characterized by lower cell velocity. Interestingly, scRNAseq analysis also identified a population of cells with high expression of cell cycle genes (Cluster 1-3), which would also be predicted to have a slower speed and potentially larger cell body. These results point to the fact that different molecular states may underlie very similar morphodynamic states.

      We now provide a revised statement to reflect the above.

      Line 290: “We further compared the detected morphodynamic states with scRNA measurements of the same cell populations. Interestingly, the separation of cells in state-1 and state-2 from control and IFN group parallels the clusters identified with cell transcriptome, suggesting that correlative analysis of gene expression and morphodynamics can reveal molecular programs underlying these phenotypes. In our preliminary analysis, scRNAseq revealed a greater degree of granularity in each of the cell populations, such as cluster 1 of the scRNAseq separating into three additional subclusters. Cluster 1-2 was defined by high expression of interferon response genes, and predictably, this cluster was primarily derived from the cells treated with IFNb. Interferon exposure induces morphological changes associated with increased cell perimeter, which reports ramification of microglia membrane (Aw et al., 2020). It was also shown that infections with neurotropic viruses, which trigger an interferon response, also lead to decreased velocity and distance traveled for cultured microglia cells (Fekete et al., 2018). These observations are in direct agreement with the higher proportion of cells in State 1, characterized by lower cell velocity. Interestingly, scRNAseq analysis also identified a population of cells with high expression of cell cycle genes (Cluster 1-3), which would also be predicted to have a slower speed and potentially larger cell body. These results point to the fact that different molecular states may underlie very similar morphodynamic states. Correlative single-cell measurements of morphodynamic states and single cell transcriptomics, such as Patch-Seq or laser microdissection, are needed to decisively link morphodynamics and molecular programs underlying these phenotypes.”

      Reviewer # 3 (minor points)

      1. Check grammar. Some articles are missing and some subject-verb agreements are mismatched. e.g. line 624 "we regularized [the] latent space", line 713 "after both loss[es] achieved".

      Thanks for pointing this out. We have thoroughly checked the grammar and corrected typos in this submission.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      Here, the authors present Dynamorph, an unsupervised learning framework for timelapse cell microscopy data built on VQ-VAEs.

      The authors apply this method to the analysis of microglial cell behavior under a series of perturbation conditions.

      Methodologically, the primary contributions of this work are the introduction of a temporal consistency regularization penalty on the latent space of a VQ-VAE model for application to timeseries data and the introduction of a "temporal feature vector"-ization procedure to summarize complex temporal trajectories in a single low-dimensional vector for analysis. Biologically, the primary contributions are the demonstration that microglial responses to different perturbogens and dynamic state transitions can be resolved by transmitted light microscopy.

      Overall, the experiments presented are well-designed, the methods well-implemented, and communication of the authors' findings is clear and concise.

      However, there are unaddressed potential caveats to the proposed framework and the manuscript fails to compare the proposed method to any existing baselines, such that the particular strengths and weaknesses of the method are unclear to readers.

      Major Points

      1. Temporal consistency regularization

      In the authors' framework, models are regularized to minimize the l2 norm between embeddings of adjacent timepoints. This approach is conceptually well-motivated, but could have some unintended effects.

      For instance, some cells may make a rapid state transition such that state(t-1) = A, state(t) = B, state(t+1) = A'. In these cases, a regularized model may best minimize the joint loss by returning an embedding at time t that interpolates between state A and A', rather than returning an embedding that reflects the true distinct state B.

      The work would be strengthened if the authors analyzed the impact of this regularization term on the detection of rapid state transitions that occur for only a few frames (e.g. when cells that exhibit filopodial motility "jump" in an actin/myosin contraction). This might be accomplished through experiments scanning different regularization hyperparameters on some of the authors' real data, fitting models on temporally downsampled versions of the real data where "slow" multi-timestep transitions now occur in a few timesteps, or perhaps using simulations where rapid state transitions are known to occur.

      Even if the regularization does have some negative impacts, it does not argue against the utility of the general approach, but it is important for users to understand the constraints on downstream applications.
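      The concern about rapid A, B, A' transitions can be made concrete with a minimal sketch of the penalty term in isolation. The two-dimensional embeddings and the function name `temporal_penalty` are hypothetical, and the sketch deliberately omits the reconstruction loss and the regularization weight, both of which would counteract the smoothing in a real model. It only shows that, for the penalty alone, the midpoint of A and A' costs far less at time t than a faithful embedding of a distinct state B.

```python
def temporal_penalty(track):
    """Sum of squared L2 distances between embeddings of adjacent frames."""
    total = 0.0
    for prev, curr in zip(track, track[1:]):
        total += sum((a - b) ** 2 for a, b in zip(prev, curr))
    return total


# Hypothetical 2D embeddings for three consecutive frames.
state_a = (0.0, 0.0)        # state at t-1
state_b = (3.0, 4.0)        # brief, genuinely distinct state at t
state_a_prime = (1.0, 0.0)  # state at t+1

# Trajectory whose middle embedding faithfully reflects state B.
faithful = [state_a, state_b, state_a_prime]

# Midpoint of A and A': the middle embedding that minimizes the penalty alone.
midpoint = tuple((a + c) / 2 for a, c in zip(state_a, state_a_prime))
smoothed = [state_a, midpoint, state_a_prime]

penalty_faithful = temporal_penalty(faithful)
penalty_smoothed = temporal_penalty(smoothed)

print(penalty_faithful, penalty_smoothed)
```

      How strongly a trained model is pulled toward the smoothed solution depends on the relative weight of this term against the reconstruction loss, which is what the suggested hyperparameter scan would measure.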

      2. Baseline comparisons

      The authors evaluate their method by assessing the correlation of embedding PCs with heuristic features (Fig. 2C,D + supp.), variation of embedding PCs across cell treatment groups (Fig. 3), and qualitative interpretation of embedding trajectories. In the supplement, the authors compare their VQ-VAE approach to VAEs and AAEs and chose to use a VQ-VAE based on lower reconstruction error and higher PC/heuristic feature correlation.

      However, the authors do not compare their method to much simpler baseline approaches to this problem. Existing literature suggests that heuristic features of cell shape and motion (similar to those the authors use to evaluate the relevance of their embeddings) are sufficient to perform many of the same tasks a VQ-VAE is used for in this work. For instance, in Fig. 3 it appears that a simple analysis of cell centroid speed recovers much of the same information as the complex VQ-VAE embeddings. In Fig. 2 - Supp. 6, it appears that after regressing out many heuristic features of cell geometry, the latent space largely explains cell non-autonomous information about the background environment, suggesting the heuristic features are largely sufficient.

      To demonstrate the usefulness of their deep modeling approach relative to simple baselines, the authors should compare against existing heuristics and embeddings of heuristics (e.g. PCA) using some of the tasks shown for the VQ-VAE (recovery of perturbation state, state transition detection, qualitative trajectory analysis, discrimination of cell types). Heuristics might include those already calculated here, or a more comprehensive set as cited in the Introduction. The authors may also consider comparing against baselines that don't include time information for some of their tasks (e.g. recovery of perturbation state could arguably be achieved with CNNs that are either ignorant of the timestep or given simple temporal conditioning, without including trajectory information).

      If these features are sufficient for many of the same tasks performed in this work, the authors should provide a clear argument for readers as to why the unsupervised VQ-VAE approach may be preferable (e.g. ability to recover potentially unknown cell changes, for which no heuristic exists). The VQ-VAE doesn't need to be superior along every axis to hold merit, but the work would be strengthened if the authors could show clear superiority along some dimension.

      3. Cell state transition interpretation

      In line 278, the authors propose that the unbalanced nature of transitions such that p(1 -> 2) >> p(2 -> 1) must represent some difference in timescales across the transitions because "cell states should have reached equilibrium after several days in culture at the time of the imaging experiments". This logic is unclear to me for two reasons.

      • If the population obeys detailed balance (e.g. transitions have equal frequency), then observed transitions should be balanced on a reasonably long time window, even if individual transitions occur on different timescales.
      • The assumption that cell states are balanced after a few days in culture is at odds with a few different aspects of the biology. Cell density and nutrient availability are continually changing in the dish, so culture conditions are non-stationary. Imaging apparatuses also commonly impact the cell biology of imaged samples due to imperfect incubation, etc.

      It seems likelier that these data represent an unbalanced transition due to the non-stationary nature of the culture system. Given the authors' emphasis on the value of measuring these transitions, the work would be strengthened by a more careful interpretation of these results, additional analysis details (e.g. how large are most state transitions? are these mostly small shifts "over the border" in state space, or large jumps?), and an attempt at biological interpretation of the observed phenomenon. The authors' RNA-seq data may be helpful in this latter regard.
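      The first bullet above can be checked with a minimal two-state Markov chain sketch. The per-step transition probabilities and the starting distribution are hypothetical, chosen only for illustration, and the function name `step` is not from the manuscript. At the stationary distribution the probability flux from state 1 to state 2 equals the reverse flux even though the two rates differ fourfold, whereas a population started away from stationarity shows unbalanced observed transitions.

```python
def step(dist, p12, p21):
    """Advance a two-state distribution by one time step.

    Returns the new distribution and the two probability fluxes.
    """
    flux_12 = dist[0] * p12  # probability mass moving from state 1 to state 2
    flux_21 = dist[1] * p21  # probability mass moving from state 2 to state 1
    new_dist = (dist[0] - flux_12 + flux_21, dist[1] - flux_21 + flux_12)
    return new_dist, flux_12, flux_21


# Hypothetical per-step transition probabilities with very different rates.
p12, p21 = 0.2, 0.05

# Stationary distribution of a two-state chain.
stationary = (p21 / (p12 + p21), p12 / (p12 + p21))
_, flux_12_eq, flux_21_eq = step(stationary, p12, p21)

# Hypothetical non-stationary population, mostly in state 1.
start = (0.9, 0.1)
_, flux_12_ne, flux_21_ne = step(start, p12, p21)

print(flux_12_eq, flux_21_eq)  # balanced at stationarity
print(flux_12_ne, flux_21_ne)  # unbalanced away from stationarity
```

      For a two-state chain, stationarity always implies balanced fluxes, so a persistent excess of 1 -> 2 transitions in such a model indicates a population that has not reached its stationary distribution rather than a difference in timescales alone.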

      4. Single cell RNA-seq analysis

      The authors performed a very interesting experiment where they profiled the same cell population using both timelapse imaging and single cell RNA-seq. The authors argue that the global structure of the state space resolved by each modality is analogous, but this seems a bit of a stretch to me. The behavior state space is unimodal (bifurcated into two states by GMM clustering), while the mRNA-seq space has several distinct clusters.

      The argument that these states are analogous would be significantly strengthened by biological interpretation of the RNA-seq data. Do the mRNA profiles exhibit differentially expressed genes that might explain differences in behavior in the cell behavior states? The analyses in Fig. 4 - Supp 4 are suggestive that "State 1" contains interferon-responsive cells and not control cells, but broader conclusions don't appear well supported by current analyses.

      Minor Points

      1. Check grammar. Some articles are missing and some subject-verb agreements are mismatched. e.g. line 624 "we regularized [the] latent space", line 713 "after both loss[es] achieved".
      2. For Fig. 4 - supp 1 -- isn't it expected that the GMM cluster of a vector can be predicted from the vector? The GMM clusters were derived from the vectors to begin with, so this seems like a bit of a circular analysis. If I'm missing something, this figure might benefit from more exposition.
      3. For Fig. 4 - Supp 3, the authors should consider changing the "state" and "cluster" colors on the embedding projections so that they do not match. As presented, it appears as if the states and clusters were co-assayed and linked by some experimental label, when in fact the State 1::Cluster1, State 2::Cluster 2 relationship is just inferred.

      Positive comments

      1. Figure presentation and manuscript layout are top notch. Thanks to the authors for making these data easy to read and interpret.

      Significance

      See above.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). Please place your comments about significance in section 2.

      -The authors describe Dynamorph; a deep-learning based autoencoder to represent - in an interpretable latent space - live cell microscopy image data of motile microglia in unperturbed and perturbed situations. Using Dynamorph, the authors identify and describe 'morphodynamic' states of the microglia.

      Major comments:

      Are the key conclusions convincing?

      -Yes, the methodology, observations and conclusions are clearly explained and convincing.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      -'Cell state' in the field of cell biology has been operationally defined in so many different ways and with so many different types of measurement data, that 'cell state' is becoming a somewhat vacuous term. This is not only a problem of this paper but a challenge for the field. In this case, cells are clustered using a Gaussian mixture model on the first few principal components of the latent space coefficients as well as speed, both averaged across the frames of cell tracks. This is fine and descriptive, but it's unclear whether this definition of 'cell state' is easily applied to other datasets and how this definition can be operationalized for hypothesis generation and experimentation. For other datasets, e.g. other cell types and other processes, such as differentiation, where e.g. tracking and segmentation may be more difficult and images would look quite different, can one still apply the same approach towards describing cell states? One could state that this definition of cell state is very specific to the dataset and therefore not generally useful. How would the authors respond to such a statement?

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary to evaluate the paper as it is, and do not ask authors to open new lines of experimentation.

      -It's unclear to the reviewer whether the training data (unperturbed microglia) are close enough to the test data (perturbed microglia) such that application of the trained model to the test data makes sense. The authors provide reconstruction loss numbers, but they are difficult to interpret. Can the authors create plots of the unperturbed and perturbed microglia cells in the latent space and show overlap, or in other ways show that training data and test data are close enough for this application?

      -It's unclear what the effect of speed is on the final state determination. TFVs were composed of auto-encoder-based features (PCs from latent space) and speed of the cells. Would the states be very different without speed as part of the TFVs or with TFVs consisting only of speed features? Please quantify and discuss.

      -Only a small amount of intensity variation is explained; 17% using the first 4 PC components which are mainly used in the analyses. This seems like a very low number. There is a lot of variation in the intensity images that is not explained by the autoencoder. The autoencoder seems to be doing a bad job. At the same time, the downstream analyses using the latent space are insightful and sensible. Can the authors provide more explanation?

      -Related to the previous point, 'self-supervised' sounds nice, but it's still optimizing towards something, in this case explaining the variation in input intensity images. A lot of the variation in the intensity images may not be of interest for the biological investigation of shape and dynamics. Did the authors uncover that indeed some of the latent dimensions are encoding other aspects of the images which may be less related to the biology and more to image properties/artifacts/biases?

      Are the suggested experiments realistic for the authors? It would help if you could add an estimated cost and time investment for substantial experiments.

      -These are computational experiments based on already existing data/results/code. It should be relatively straightforward to do these additional computational experiments. Careful analysis and interpretation require time.

      Are the data and the methods presented in such a way that they can be reproduced?

      -The methods are described with sufficient detail. The complicated experimental and computational processes seem reproducible to a decent extent. The code is captured in Github repos. The reviewer did not attempt to reproduce computational results. The reviewer did not check whether the available data meets FAIR requirements.

      Are the experiments adequately replicated and statistical analysis adequate?

      -Yes, and there is lots of useful supplementary material which helps with interpretation of the results.

      Minor comments:

      Specific experimental issues that are easily addressable.

      -The motivation for GMMs over k-means is unclear. K-means clustering leads to spatial separation between clusters (states) since all cells/tracks that are closest to their cluster mean are by definition further away from the means of other clusters. This is not the case with the more flexible GMMs; e.g. they allow one to have a smaller cluster (with small variance components) inside of a larger cluster (with large variance). The latter scenario seems undesirable for interpretation in terms of states.

      -The original images are 3D (5 z-planes). The analyzed images were 2D. The reviewer missed how the authors went from 3D to 2D. And since cells are 3D, can the authors describe what they gained by going to 2D and what they potentially lost?

      Are prior studies referenced appropriately?

      -Yes, citations are ample and relevant.

      Are the text and figures clear and accurate?

      -Yes, the figures are informative.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      -No specific suggestions

      Significance

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      -This is a technological/computational advance using a large integrative (experimental+computational) approach.

      Place the work in the context of the existing literature (provide references, where appropriate).

      -The authors have done an excellent job at this.

      State what audience might be interested in and influenced by the reported findings.

      -Cell biologists, brain researchers, computer vision computational biologists

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      -Cell biology, cancer biology, systems biology, machine learning, statistics, data integration

      -Brain biology aspects (biological significance of the findings on morphodynamic microglial states) are difficult to assess for the reviewer

      Referee Cross-commenting

      Comments by Reviewer #1 look great and useful. I think they are in line with my comments. I think this manuscript would benefit from a reviewer that could comment on the biological significance. The review reports are skewed towards questions and remarks about the computational approach.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors use a combination of quantitative phase microscopy and machine learning to determine the state space of microglia cells. The key conclusions are that a VQ-VAE is able to capture a compact latent representation of the cell morphology and, combined with motion features, can predict state changes in single cell trajectories and discriminate between perturbations.

      Major comments:

      Overall - I very much enjoyed reading the manuscript. The work has been carefully performed and the results are interesting.

      • The temporal matching to enforce a smooth latent space representation is interesting. The authors mention that they mask out surrounding cells with a median pixel value. Have the authors considered using a pixel weighting in the reconstruction/matching loss to differentiate foreground/background? Also, does this affect detection of any fast (or indeed rare) transitions in the trajectories?
      • The Trajectory Feature Vectors (TFVs) are averaged over time - this seems to lose a lot of the salient information in the trajectories themselves, resulting in the low(ish) accuracy of the GMM. Could a Hidden Markov Model trained on the trajectories in state space help to identify/classify those trajectories that change their morphology/motion over time?

      Minor comments:

      • Could the authors provide some example images showing interpolation of each PC using the generative decoder?
      • I was a little confused by the labels given to the PCs, as they seem to vary between figures. For example, in Fig. 2, PC1 and PC2 are Size and Peak Retardance, but in Fig. 3 they are referred to as Size and Cell Density (which could be interpreted as the number of cells per unit area). Could the authors clarify these in the captions?
      • The authors note that single-cell tracking is of vital importance. This should be elaborated upon. Also - could the VQ-VAE encodings be used to help track linking in cases of high density?
      • I was pleased to see the full source code available!

      Significance

      Nature and significance:

      This is a significant, mostly technical piece of work, that explores a complex new area of science -- using ML and large datasets to gain insight into biological systems. There are significant challenges, not least that interpreting ML models can be challenging.

      Existing literature/context:

      There have been relatively few examples of using self-supervised learning to gain insight into these complex datasets. Much of the work has concentrated on learning morphological descriptors. The present work starts to introduce the time dimension more explicitly.

      Target Audience:

      Broadly applicable to those studying cell biology, microscopy and machine learning.

      My expertise:

      ML applied to microscopy data. Single cell tracking.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      The authors do not wish to provide a response at this time.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The authors examine an important and challenging question in current biology: the role of RNA in the structural maintenance of nuclear and cytoplasmic membrane-less organelles including stress granules, processing bodies, nucleolus, Cajal bodies, and nuclear speckles. Furthermore, the authors explored super-enhancer complexes involved in the regulation of gene expression. The authors used RNase L, an interferon-induced ribonuclease which, upon activation in the cytoplasm or targeted to the nucleus, degrades all RNAs within the cell. They then took a quantitative approach to analyze the effect of RNA degradation on disassembly or reorganization of membrane-less organelles. Interestingly, the authors observed that RNAs present within nuclear organelles are susceptible to RNase L degradation, leading to their disassembly. In contrast, super-enhancer-containing eRNAs are largely unaffected.

      Major concerns

      Many studied organelles are challenging to see in many of the figures. Thus this reviewer encourages the authors to present clearer insets at higher magnification to illustrate what is being quantified, and then show that quantification in the central figure next to the immunofluorescent images.

      The extent of degradation of specific RNAs after induction of RNase L should be analyzed by qRT-PCR and quantified for several assemblies. This will corroborate the observations obtained by microscopy on an individual cell basis.

      The main issue regards the connection between RNA and its role in the formation and structural integrity of nuclear organelles. There is consensus that these nuclear assemblies are built on specific nascent transcripts which act as a nucleation scaffold. If specific RNA synthesis is impaired, these assemblies collapse. The authors should discuss this. It would be relevant to mention two experimental works on this topic, DOI: 10.1038/ncb2140 and DOI: 10.1038/ncb2157.

      The study is limited to observed macroscopic changes in the appearance of assemblies. The authors must dig deeper and provide more conclusive results by colocalizing several components of these assemblies. It has been documented that visualizing a single selective marker for a specific assembly is not enough to establish its functionality or dysfunctionality, nor the extent of its disassembly. For example, in Figure 4A the authors should more convincingly visualize the nascent 47/45S pre-rRNA transcript to demonstrate that the nucleolus is built on ongoing pre-rRNA synthesis reflected by the tripartite nucleolar substructures. The loss of the GC component after rRNA depletion should be better presented with NPM1 colocalization.

      In Figure 4C, D the authors used the term "coilin assemblies". That is confusing for a reader. After activation of RNase L, the Cajal body likely undergoes a structural rearrangement, which cannot be justified only by the presence of rearranged coilin foci. The authors should colocalize them with at least one or two functional markers.

      Enhancer RNAs likely play a role in gene control rather than serving as a nucleation element to build nuclear assemblies. This should be discussed in the explanation of the observed differences between MED1 foci and other assemblies.

      Significance

      Understanding changes in the nuclear and cellular organization that accompany and drive changes in the formation and maintenance of cellular structures is an essential and not well-understood topic. Thus, this manuscript is relevant. However, the presented data in this paper are based on a limited approach, and particularly their interpretation and presentation could be substantially improved. Consequently, the conclusions are not convincingly supported by published data. However, some open questions need to be addressed. Specific criticisms are outlined above.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript by Decker et al. "RNA is required for the maintenance of multiple cytoplasmic and nuclear membrane-less organelles" investigates the structural role of RNA in membraneless organelles. The authors show that degradation of RNA in transient or constitutive membraneless organelles results in the altered formation and structure of many but not all organelles studied. The main assay is the activation of RNAseL activity by dsRNA which then destroys mRNA in the cell. The collected data leads the authors to highlight the possible roles of RNA in membraneless organelle formation and categorize the organelles: some relying more on the RNA-RNA interactions while others on protein-RNA or protein-protein interactions. The manuscript is well written and the data is sound.

      Major comments:

      The authors study the maintenance of organelles by RNA. For the transient ones, like stress granules (SG), it would be very interesting to see the formation/clearance kinetics with and without RNA. Also maybe using something other than dsRNA to trigger the formation. The idea being - if RNA is needed for SG maintenance, then the clearance kinetics with RNA would differ from that of the depleted RNA.

      The experiments were done in cells. It is known that core components of the organelles can form granule like structures in vitro without RNA. If it is possible to show that RNA presence improves the integrity in vitro, that would support the authors claim. For example studying SG maintenance with and without RNAseL using the previously developed SG extraction protocol.

      Minor comments:

      In Figure 1a, it is not clear if the smaller granules are different from SGs as mentioned in the text; using additional markers might make this clearer. Figures 3 and 4 require quantification.

      Significance

      This is a solid paper that advances our understanding of membraneless organelle formation and dynamics. This field is of high general interest for the broader scientific community. My expertise is in the field of membraneless organelles.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In the paper entitled "RNA is required for the maintenance of multiple cytoplasmic and nuclear membrane-less organelles", Decker et al set out to test the role of RNA in maintaining the integrity of a variety of biomolecular condensates. To do this, they assess how multiple different assemblies in the cytoplasm and nucleus retain their structural integrity following RNAseL activation. They identified many condensates which are solubilized and have protein components redistributed following RNAseL activation and presumably subsequent RNA digestion. These experiments largely concur with previous findings from RNAseA treatment. The implication is that RNA rather than protein is the essential organizing component for most tested condensates. The manuscript is well written, and the data are convincing. It is my judgement that this is worthy of publication following a few additional experiments/clarifications.

      1. The authors identify condensates which are sensitive to or refractory to RNAseL. It would be good if the authors more conclusively eliminated the possibility that the remaining condensates contain specific residual antiviral RNAs and that this is the reason why these condensates remain intact. Are any of these condensates enriched in antiviral RNAs like IFNbeta following polyIC treatment, as assessed by FISH, for example (PMID: 31494035)?
      2. Is there a particular protein feature, charge, IDR-types etc. which is common to solubilized versus not solubilized groups? What about dissolved and novel formed assemblies? A simple table comparing protein features in the three groups would suffice, with particular emphasis on RNA binding domains PMID: 32243832 and intrinsic disordered regions PMID: 24773235.
      3. Demonstrate that the RNAseL treatment is reversible (i.e. withdraw polyIC, particularly for a protein that ends up in a novel assembly) or remove the word maintenance from the title.
      4. Control for RNA-dependence of the activity. Try to dissolve a non-RNA dependent/enriched condensate with RNAseL. SPOP mutations (PMID: 30244836) might be interesting as both SPOP and RNAseL loss of function mutations (PMID: 11799394) are associated with prostate cancer.
      5. A caveat is that certain regions of condensates enriched in RNA may not be accessible to RNAseL protein. A way to address this might be to attempt to directly target the enzyme to a compartment that is deemed refractory to the activity (and inferred to not require RNA) via an inducible system (i.e. FKBP12/FK506).
      6. Overall, this paper would be greatly enhanced by including a more extensive discussion on the basic biological implications for these findings. Why are some condensates RNA dependent? What function(s) are common to these condensates? How does disruption of this lead to disease?

      Significance

      This work addresses the neglected role of RNA in structuring condensates throughout the cell. Despite the prevalence of RNA in many condensates and the enrichment of RNA-binding proteins in condensates, there is still a highly limited understanding of the structural roles RNA plays in their assembly, as most work has been protein/IDR-centric. This work seeks to systematically assess the RNA-dependence of the assemblies.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      The authors do not wish to provide a response at this time.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In mice, failures in conducting meiosis during spermatogenesis can be rescued by injecting prophase I male chromosomes into oocytes, to allow them to undergo the two meiotic divisions within the oocyte, together with the chromosomes of the oocyte. However, segregations are highly error prone and rarely lead to a live birth when the resulting embryos are reimplanted into foster mothers. In this study, the authors show that segregation errors in meiosis I oocytes harboring both male and female chromosomes mainly affect the male chromosome set. Most errors are due to precocious segregation of sister chromatids in unpaired male chromosomes (univalents). A delay in alignment of male chromosomes compared to female chromosomes was also observed. Reducing the volume of the oocyte cytoplasm by half leads to a significant reduction in the errors occurring, and hence, a significant increase in successful births after reimplantation. Excitingly, with this technique, live births were obtained from male mice with a spermatogenic arrest phenotype.

      Main points:

      1)The authors conclude that halving the oocyte cell size is helping in proper segregation of male meiosis I chromosomes in the cytoplasm of meiosis I oocytes. It is also possible that the experimental procedure involved in removing half of the cytoplasm is promoting proper segregation for some unknown reason. The authors should include a condition where half of the cytoplasm is aspirated but then put back again, so oocytes have the same volume as before but the cytoplasm underwent the same treatment as in the halved oocytes. Also, increasing the cytoplasm volume of the oocyte should not lead to a better segregation of male chromosomes but make things worse, have the authors checked for that?

      2) The authors mention that male chromosomes align with a delay, compared to the female chromosomes. Does this delay depend on activation of error correction, or the spindle assembly checkpoint? Is it possible that dilution of factors required for checkpoint control, and hence for assuring proper chromosome segregation, is the reason for error-prone segregation in oocytes harboring twice the number of chromosomes? If yes, have the authors stained for SAC proteins at the kinetochores? Maybe slight overexpression of SAC proteins would be sufficient to rescue male meiotic divisions in the oocyte; have the authors tested this hypothesis?

      3) The authors state that male chromosomes have a hard time segregating in the huge cytoplasm of the oocytes. Maybe it is not the fact that the chromosomes came from a male pronucleus, but just a matter of doubling the number of chromosomes that have to be segregated in the oocyte cytoplasm. How do male chromosomes behave in enucleated oocytes undergoing meiosis I? Conversely, if female chromosomes coming from another oocyte are injected into the recipient oocyte instead of male chromosomes, do those segregate correctly, or are the delay in chromosome alignment and the error rate comparable to the situation when the additional chromosome set comes from the male?

      4) In the rescue of mice with spermatogenic arrest the authors find aneuploidies of sex chromosomes in the offspring, not of autosomes. To the best of my knowledge, autosome aneuploidies are not viable in the mouse, hence this result does not indicate that sex chromosomes are the main source of aneuploidies. Nevertheless, it is attractive to speculate that aneuploidies are mainly due to sex chromosomes, because the oocyte is not prepared to segregate a male sex-chromosome bivalent. The authors should determine whether the segregation errors in meiosis I in oocytes harboring the additional male chromosome set mainly concern the male sex chromosomes, by doing FISH analysis after meiosis I.

      Significance

      This study is very interesting and of high significance, and very well executed. I think the study can go much further as far as mechanistic insights are concerned, only requiring techniques and tools that the authors have at their disposition.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Previously, the team showed that a primary spermatocyte nucleus can undergo meiosis when transplanted into immature oocytes, and later obtained normal mice from the fertilized oocytes (Zygotes 1997, PMID: 9276513; PNAS 1998, PMID: 9576931). However, the efficiency was quite low (~1%) due to chromosome aberrations, and thus not feasible for basic/clinical research applications. In this study, Ogonuki et al., extrapolating from a recent study showing that reduction of the ooplasm ameliorates chromosome segregation errors during meiosis (Dev Cell 2017, PMID: 28486131), injected the spermatocyte nucleus into half-sized GV oocytes and succeeded in obtaining live murine pups at a high rate (the birth rate improved from 1% with full-sized oocytes to 19% with half-sized oocytes). Further, through detailed observation with high-resolution 3D live imaging, the authors clarified that the misalignment of paternal chromosomes could be ameliorated by reducing the volume of ooplasm. Finally, the authors applied this technology and obtained live pups from azoospermic mice, suggesting a potential application in human infertility treatment.

      Major comments:

      This is a great study combining the expertise on both sperm and oocytes. The experiments are well designed and performed. The key conclusions are convincing.

      Line 228. The authors claimed that all the pups born following the injection of wild-type or mutant spermatocytes grew into fertile adults.

      Because the authors tested 3 males from wt spermatocytes (line 197), the above sentence should be rephrased.

      The authors found one XXY male among the three male mice from wt spermatocytes. Was the XYY male mouse fully fertile without XY/XYY mosaicism?

      How many females and males were obtained from wt spermatocytes?

      Minor comments:

      The authors clearly showed the technique can be applied to rescue the spermatogenic arrest. The readers would appreciate if the authors include any unsuccessful cases.

      To prevent sex-chromosome aberration, are there any potential markers for selecting most developed spermatocytes?

      Significance

      One in six couples suffers from infertility, and 70-90% of male infertility cases are related to defects in spermatogenesis. Clinically, intracytoplasmic injection of sperm is common, but it is not applicable to men who lack haploid germ cells. Injection of primary spermatocyte nucleus can give pups but the efficiency was poor (~1%, PNAS 1998, PMID: 9576931). In the present study, by using halved oocytes as recipient, the authors improved the efficiency from 1% to 19%. With the great improvement, they further obtained healthy fertile offspring from the male mice genetically lacking haploid cells. This approach opens up the window for the infertile patients suffering from spermatogenic arrest.

      The reviewer's field of expertise: knockout mice, male infertility, spermatogenesis, sperm function, fertilization.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Ogonuki et al developed a new technique using primary spermatocyte-injected oocytes for offspring production. They examined chromosome segregation error in biparental meiosis using spermatocyte-injected oocytes. They showed that artificially reducing ooplasmic volume rescued highly error-prone chromosome segregation by preventing sister separation in biparental meiosis. Their live-imaging analysis demonstrated that erroneous chromosome segregation derived from univalent-like chromosomes followed by predivision of sister chromatids during prometaphase I in biparental meiosis. They showed that the birth rate was improved using halved oocytes. Furthermore, they showed that production of offspring was successful using spermatocyte from azoospermic mice.

      Overall, the data are convincing and the manuscript addresses important questions. The data were produced at a technically high level. The presented data are sufficient to support the authors' conclusions, and further provide significant insight into the production of offspring from azoospermic animals. Thus, the manuscript should be of interest to the field and deserves publication, provided the authors address the following minor concerns.

      Fig1A, Line 117 This is an amazing experiment to set up biparental meiosis using spermatocyte nuclei. Since spermatocytes are in different stages during progression through meiotic prophase, some of them (late pachytene) should yield crossover but others (before mid-pachytene) are yet to complete recombination. Thus, whether donor paternal chromosomes have bivalents or univalents depends on which stage spermatocytes derived from. The authors should describe how spermatocytes were picked up for injection and whether they used a particular stage of spermatocytes.

      Line 159-160 The authors stated that paternal chromosomes are susceptible to errors in ooplasm-hosted biparental meiosis. This is a nice demonstration that traces the origin of separated chromatids. In Fig2C right graph, 1 to 2 paternal chromosomes showed misalignment. It is unclear whether premature separation is biased to any particular paternal chromosome, e.g. XY. The authors should discuss this further.

      Line 176-177 The authors stated that most of errors were preceded by premature separation of bivalent chromosomes into univalent-like structures. This implies that premature separation of bivalent chromosomes happens prior to anaphase onset. Does this depend on spindle force? Or is cohesion intrinsically fragile in donor spermatocyte chromosomes? The authors should discuss more about it.

      Fig3E, The authors depicted that in normal sized oocytes, univalent-like chromosomes undergo predivision at anaphase. This is somewhat too simplified, because Fig3B shows that a certain population exhibits nondisjunction. This model and description should be corrected to fit the data they demonstrated. If sister segregation at anaphase is predominant, I wonder what happens to sister kinetochore mono-orientation and sister centromeric protection in such univalent-like chromosomes. It would be nice to show centromeric proteins MEIKIN, SGO2 in donor spermatocyte chromosomes versus those of oocyte to examine centromeric cohesion. The authors should clarify this issue.

      Line296-294 What do the authors mean by the sentence " It is known that sex chromosomes are prepared to undergo meiosis later than autosomes."?

      Significance

      The manuscript will provide biological significance for the reproduction field. There are two major points of biological significance: the authors addressed the mechanism of erroneous chromosome segregation in biparental meiosis, and they showed that biparental meiosis using spermatocyte-injected oocytes can be applied to the production of offspring from azoospermic mice, which would have great impact on the reproductive biology field. The data were produced with a high level of technique.

      Referee Cross-commenting

      I agree with the point raised in Reviewer #3's main point 2. It would be better to see SAC proteins.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      The comments of the reviewers were highly insightful and enabled us to greatly improve the quality of our manuscript. We provided point-by-point responses to each of the reviewers’ comments. Revisions in the text are highlighted in yellow. We hope that the revisions in the manuscript and our accompanying responses will be sufficient to make our manuscript suitable for publication.

      2. Point-by-point description of the revisions

      Reviewer #1

      - The authors provide no rationale for using the PTI score to measure the protein-coding potential of transcripts. The only attempt to justify this measure is given in the methods: "The definition of PTI score is motivated by our hypothetical concept that translation of pPTI is limited by alternate competing sPTIs." (lines 426-427, page 20). What the PTI score measures is the dominance of the largest predicted ORF over the predicted ORFs, in terms of length. It is not clear why there would be competition for translation of putative ORFs for genuine protein-coding transcripts. An alternative hypothesis, briefly touched upon in the discussion (lines 318-320) is that translation of non-functional ORFs could give rise to the production of toxic proteins, in addition to being costly in terms of energy. The authors should provide the reasoning behind the PTI score and should explain the biological mechanisms that may underlie differences between coding and non-coding transcripts.

      Thank you for your comment. We previously identified a de novo gene, NCYM, and showed that its protein has a biochemical function (Suenaga et al 2014; Suenaga et al 2020). However, NCYM was previously registered as a non-coding RNA in the public database, and an established predictor of protein-coding potential, the coding potential assessment tool (CPAT), gave NCYM a coding probability of 0.022, labeling it as a noncoding RNA (new Supplementary Figure 1B). Therefore, we sought to identify a new indicator for coding potential, comparing NCYM with a small subset of coding and non-coding RNAs to determine whether NCYM has sequence features that would allow it to be registered as a coding transcript (data not shown). We found that predicted ORFs, other than major ORFs, seem to be short in coding RNAs. In addition, it has been reported that upstream ORFs inhibit the translation of major ORFs (Calvo et al 2009). Therefore, we hypothesized that the predicted ORFs may reduce the translation of major ORFs, thereby becoming short in the coding transcripts, including NCYM, during evolution. The term ORF refers to an RNA sequence that is translated into an actual product; however, the biological significance of non-translating, predicted ORFs has been largely ignored and remains to be characterized. Therefore, we defined a PTI as an RNA sequence from the start codon sequence to the end codon sequence and did not assume that it would result in a translated product. Thus, PTI can be defined even in genuine non-coding RNAs. The major ORFs are often the longest PTIs (hereafter, primary PTIs or pPTIs) in coding transcripts. Thus, to investigate the importance of pPTIs relative to other PTIs (hereafter, secondary PTIs, or sPTIs) for the evolution of coding genes, we defined a PTI score as the ratio of the pPTI length to the total length of all PTIs (Figure 1A–B) and assumed that the PTI score was high in coding transcripts. This is the rationale for using the PTI score for protein-coding potential and is now included in the revised manuscript (lines 92-115, pages 5-6).
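
      The definition above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `find_ptis` and `pti_score` are hypothetical, and the sketch assumes that PTIs are scanned in the three forward reading frames, that each PTI runs from an ATG to the first in-frame stop codon, and that ATGs nested inside an already-counted PTI are not counted again.

```python
# Illustrative sketch only; scanning rules are assumptions, not the authors' code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_ptis(seq):
    """Return all start-to-stop segments found in the three forward frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    ptis = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Extend codon by codon until the first in-frame stop codon.
            j = i + 3
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no stop codon left in this frame
            ptis.append(seq[i:j + 3])
            i = j + 3  # resume scanning after the stop codon
    return ptis


def pti_score(seq):
    """Length of the longest PTI (pPTI) divided by the summed length of all PTIs."""
    lengths = [len(p) for p in find_ptis(seq)]
    if not lengths:
        return None  # score is undefined when no PTI exists
    return max(lengths) / sum(lengths)
```

      Under these assumptions, a transcript with a single PTI scores 1.0, and a transcript whose pPTI is 12 nt long alongside one 6-nt sPTI scores 12/18.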

      To examine the biological mechanism underlying the difference between coding and noncoding RNAs, we investigated the relationship between translation and PTI scores. We chose a dataset of non-coding RNAs that translated small proteins derived from the databases SmProt and sORF.org. From ribosome profiling and mass spectrometry data, the databases include noncoding RNAs that encode small proteins (less than 100 residues) as well as mRNAs that have extra-small ORFs in addition to major ORFs. The SmProt database divides these small ORFs into three categories: upstream (uORF), small (sORF), and downstream (dORF). The definitions are based on their locations: uORFs and dORFs are located in 5’ and 3’ UTRs, respectively, and sORFs overlap with major ORFs using different reading frames (new Figure 2B). We first calculated PTI scores of lincRNAs encoding small proteins and found that the distribution of these lincRNAs shifted to higher PTI scores compared with the distribution of all lincRNAs (new Figure 2A). Therefore, lincRNA translation is correlated with higher PTI scores. Next, we examined whether PTI scores were associated with the translation occupancy of major ORFs in coding RNAs. We calculated PTI scores in mRNAs with uORFs, sORFs, or dORFs and found that the distribution of mRNAs encoding such small proteins shifted to lower PTI scores (new Figure 2C). Similar data were obtained from the sORF.org dataset (Supplementary Figure 5). These data support the idea that the PTI score is related to the occupancy of the major ORF during translation. These results are now included in the results of the revised manuscript (lines 241-271, pp 12-13).

      Translation of small proteins from noncoding RNAs seems to inhibit noncoding functions because of ribosome binding and subsequent translation. On the other hand, translation of sPTIs in coding RNAs seems to inhibit the translation of major ORFs because of competing translations (Calvo et al 2009). At the same time, however, the translation of such proteins may have the advantage of producing new functional proteins/regulatory mechanisms during evolution. Therefore, the right and left shifts of the PTI score that we observed for noncoding and coding RNAs, respectively, seem to be slightly deleterious or beneficial. As further discussed in the responses below, the overlap of distributions of PTI scores between coding and noncoding transcripts was negatively correlated with the effective population size of the species. Therefore, as nearly neutral theory predicts, mutations causing such slightly deleterious/beneficial effects of translation in coding and noncoding transcripts seem to be fixed in species with small effective population sizes (including humans) by genetic drift (Kimura 1968, 1983; Ohta 1992). Clearly, PTI scores are related to translation of PTIs, and their distributions suggest a mechanism for producing bifunctional RNAs that are simultaneously coding and noncoding. The discussion has now been included in the revised manuscript (lines 487-503, pp 23-24).

      Calvo SE, Pagliarini DJ, Mootha VK. Upstream open reading frames cause widespread reduction in protein expression and are polymorphic among humans. Proc Natl Acad Sci U S A. 2009 May 5;106(18):7507-12. doi: 10.1073/pnas.0810916106. Epub 2009 Apr 16. PMID: 19372376; PMCID: PMC2669787.

      Kimura M. 1968. Evolutionary rate at the molecular level. Nature. 217(5129):624-6. PMID: 5637732. https://doi.org/10.1038/217624a0

      Kimura M. 1983. The Neutral Theory of Molecular Evolution. Cambridge: Cambridge University Press. https://doi.org/10.1093/obo/9780199941728-0132

      Ohta T. 1992. The Nearly Neutral Theory of Molecular Evolution. Annu Rev Ecol Syst. 23:263-86.

      - The presence of ORFs in transcripts has long been used as a predictor of their protein-coding potential. For example, the ORF size and the ORF coverage are part of the set of predictors implemented in CPAT (Wang et al., 2013). The PTI score is necessarily related to these methods, yet no comparison is provided. If the PTI score is to be used as a measure to classify transcripts as coding or non-coding, its performance should be compared to other classifiers, including those that use the presence of ORFs as a predictor (e.g., CPAT) but not only (e.g., PhyloCSF, based on the pattern of sequence evolution).

      Thank you for your comment. As you noted, our reasons for using the PTI score were not clearly described in the original manuscript and are now included in the Results section (lines 92-115, page 5-6). As mentioned in response to comment 1, CPAT was not able to predict NCYM as a coding transcript (Supplementary Figure 1B). Furthermore, we intended to use this new concept to identify the RNA sequence elements that determine protein-coding potential, but did not intend to use the score as a classifier of coding or non-coding RNAs. Many studies have identified bifunctional RNAs that are simultaneously coding and noncoding (Li and Liu 2019; Huang Y et al. 2021). Moreover, neutrally evolving peptides are encoded by small ORFs of noncoding RNAs, possibly contributing to the evolutionary origin of new functional proteins (Ruiz-Orera et al. 2014). Therefore, we argue that such dichotomous classification is often misleading, by unconsciously ignoring ncRNAs that encode functional or nonfunctional small proteins. Additionally, this approach has several technical problems. For a training set for use with such a classification, we need a dataset of genuine noncoding RNAs. However, it is quite difficult to define such noncoding RNAs without bias, for example, for cell or tissue types, including cancer or normal cells/tissues. Increasing evidence has shown peptide translation from known noncoding RNAs (Li and Liu 2019; Huang Y et al. 2021); moreover, some of these peptides are specific to the cellular context (Dohka et al 2021). Therefore, we cannot be certain that we are identifying genuine noncoding RNAs from the datasets from ribosome profiling or mass spectrometry, which neither cover all cell/tissue types nor all physiological contexts.

      We agree with you in that we need to compare PTI scores with other indicators of coding potential, such as transcript length, ORF size, and ORF coverage. ORFs of less than 100 residues have been used to define noncoding RNAs; thus, such RNAs necessarily have shorter ORF sizes relative to coding RNAs. Therefore, we calculated these indicators by focusing on noncoding RNAs that encode proteins, but not coding RNAs (new Supplementary Figure 4). The PTI score distribution shifted to the right for lincRNAs encoding small proteins, indicating that the PTI score is related to translation (new Figure 2C). In contrast, the distributions of transcript length, ORF size, and ORF coverage did not shift higher for noncoding RNAs encoding small proteins (new Supplementary Figure 4), although a slight shift to higher ORF coverage was found. Therefore, we argue that the PTI score is a better indicator of translation than transcript length, ORF size or ORF coverage. These results are now included in the results of the revised manuscript (lines 241-255, page 12).

      - The authors compare the observed PTI score distributions with the PTI scores from random or shuffled sequences. They conclude that the PTI scores do not depend on transcript lengths but on transcript sequences (lines 122-123). However, this is not true for non-coding RNAs, for which the observed and randomized distributions are very similar. The relationship between transcript length and PTI scores should be analyzed into more detail. Are the annotated non-coding transcripts with high PTI scores particular in terms of length?

      We analyzed the length of high-PTI-score transcripts compared to all lncRNA transcripts. The average high-PTI-score with high coding potential (0.6 PTI score −29), consistent with the distribution of transcript length in lincRNAs translating small proteins (new Supplementary Figure 4C). Therefore, the high PTI scores are not simply due to the larger ORF size derived from longer transcript length, but also because of the occupancy of pPTI among all PTIs. The occupancy of pPTI can be estimated by ORF coverage or PTI score, and we can easily see that transcript length (the denominator of ORF coverage) correlates with the sum of the lengths of all PTIs (the denominator of the PTI score). Thus, we need to clarify which indicators have more biological significance in terms of gene evolution. Higher PTI scores in noncoding RNAs cause overlap of the coding and noncoding transcripts in eukaryotes, especially in multicellular eukaryotes (new Figure 4 and 5). The overlaps of PTI score distributions between coding and noncoding RNAs (Opti) were positively and negatively correlated with mutation rate and effective population size, respectively, and approximated by logarithmic or exponential relationships (new Figure 6). Because the inverse of the effective population size defines the strength of genetic drift relative to the strength of selection, the overlaps quantified by Opti seem to be derived from genetic drift. These results clearly suggest that the observed PTI score distribution of noncoding RNAs is not random. In contrast, ORF coverage (Ocov) showed a weaker relationship with mutation rates and effective population sizes (new Supplementary Figure 8 and 9). These results suggest that ORF coverage is less related to gene evolution than PTI score, with the weak relationship seemingly indirectly derived from the correlation with the PTI score. We have now included these results in the revised manuscript (lines 306-322, page 15).
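
      How an overlap between two PTI score distributions might be quantified can be sketched as follows. This excerpt does not state how Opti is computed, so the sketch assumes a histogram intersection over equal-width bins on [0, 1]; the function names and the bin count are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; the overlap measure is an assumed histogram intersection.
def relative_frequencies(scores, n_bins=20):
    """Bin scores in [0, 1] into equal-width bins and return relative frequencies."""
    counts = [0] * n_bins
    for s in scores:
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 falls in the last bin
        counts[idx] += 1
    total = len(scores)
    return [c / total for c in counts]


def distribution_overlap(coding_scores, noncoding_scores, n_bins=20):
    """Sum over bins of the smaller relative frequency (0 = disjoint, 1 = identical)."""
    f_coding = relative_frequencies(coding_scores, n_bins)
    f_noncoding = relative_frequencies(noncoding_scores, n_bins)
    return sum(min(a, b) for a, b in zip(f_coding, f_noncoding))
```

      With this measure, two identical score sets give an overlap of 1 and two sets that never share a bin give 0, so a larger value means coding and noncoding transcripts are harder to separate by PTI score alone.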

      - The authors discuss in depth the correlation between PTI scores and PTI-based protein-coding potential measures (e.g., section "PTI scores correlate with protein coding potential in humans and mice", starting line 125; section "Relationship between the PTI score and protein-coding potential", starting line 243). Given that the protein-coding potential is directly derived from the PTI score distributions for coding and non-coding transcripts, it is not surprising that the two should be correlated. The significance of observing a linear or a sigmoid relationship is not clearly explained.

      As you noted, the protein-coding potential was directly derived from the PTI score distributions. Therefore, if the distribution for coding RNA shows a higher or lower PTI score compared to that of noncoding RNA, the protein-coding potential is expected to be positively or negatively correlated with the PTI score. If the distributions of coding and noncoding RNA significantly overlapped (Opti > 0.7), the protein-coding potential became constant and was not correlated with the PTI score (new Figure 7 and new Supplementary Figure 10). Thus, the PTI score is not always positively correlated with the protein-coding potential.

      We had divided the species into three groups; the sigmoidal group, the linear group, or others based on the intercept and slope in the linear approximation, but considering the fit of the linear approximation, there is no essential difference between the sigmoidal and linear groups. Therefore, in the revised text, we classify the species into two groups: linear and constant (new Figure 7 and Supplementary Figure 10). We have now replaced the figures and added a new interpretation of the results in the revised manuscript (lines 341-353, pages 16-17).

      - The authors use the entire set of annotated coding and non-coding transcripts to assess the distribution of PTI scores and to define the protein-coding potential. Traditionally, for methods that aim to classify transcripts as coding or non-coding, this is done using "bona fide" coding and non-coding transcripts, which are used as training sets. The efficiency of the method can then be evaluated using a test set of transcripts. This aspect is lacking here and should be implemented.

      As we wrote in response to your comment 2, we aimed to examine what RNA sequence elements determine genuine-coding RNA but not to identify the classifier of coding and noncoding RNA. Technically, the “bona fide” coding and noncoding RNAs cannot be rigorously defined, given the possible existence of unidentified bifunctional RNAs in the testing sets; therefore, more traditional approaches often eliminate such possibilities.

      - The comparisons among species are likely biased by the quality of lncRNA annotations in non-model organisms - cf. high variations among primates, which are likely driven by the annotation quality and depth.

      As written in the response to comment 3, the variation of PTI score distribution in lncRNA is not random, and overlaps with the distribution of coding RNA are negatively correlated with effective population size (new Figure 6). In addition, we found that the tissue-specific expression of lncRNA influences the PTI score distribution in multicellular eukaryotes (new Figure 8 C and D and new Supplementary Figure 11 and 12). Therefore, the variation is caused, at least in part, by the specificity of gene expression, and it thus contains biological significance. These results are now included in the revised manuscript (lines 383-402, pages 18-19).

      Based on these results, we expect that the lncRNA annotations derived from the two major databases, Ensembl and RefSeq, are well curated and sufficient to compare the PTI score distributions. Realistically, no other database catalogs a comparable number of curated lncRNAs from various species. However, we also expect that recent progress in whole genome sequencing and transcriptome analysis of vertebrates may improve the annotation of lncRNAs, including non-model organisms, and provide more ideal datasets for comparisons among species.

      - The differences among bacteria, archaea and eukaryotes should be discussed into more depth. In bacteria, the genuine ORF is well defined by the presence of translation signals (e.g., Shine-Dalgarno sequence). Other factors are also at work in both prokaryotes and eukaryotes, including RNA secondary structures. The relationship between these factors and the PTI score should be discussed.

      The Shine–Dalgarno sequence in bacteria and the Kozak sequence in eukaryotes have been identified as important regulatory elements for ribosome binding, but these sequences are not essential for all coding RNAs, and their significance is not well characterized, especially in noncoding RNAs that are translated. Recent research has sought to identify the determinants that regulate ribosome binding to lncRNAs using 99 characteristics, including the weight of each base at the −6 to +1 positions relative to the start codon (Kozak-like sequence) or RNA secondary structure (Zeng et al 2018). They found that transcript length is a stronger indicator than either of these characteristics for ribosome binding in human lncRNAs. Because the PTI score is a better indicator for translation of lincRNAs than transcript length (new Supplementary Figure 4C), we would argue that Kozak sequences and RNA secondary structures are not reliable indicators for ribosome binding of lncRNAs, and their significance should be limited to more specific transcript classes. Furthermore, Hata et al. recently showed that the Kozak sequence is a negative regulator of de novo gene birth in plants (Hata et al. 2021). Therefore, these sequence characteristics seem to evolve after the birth of coding transcripts and are not generally involved in new coding gene origination from noncoding RNAs.

      Zeng C, Hamada M. 2018. Identifying sequence features that drive ribosomal association for lncRNAs. BMC Genomics. 19(Suppl 10):906. PMID: 30598103; PMCID: PMC6311901. https://doi.org/10.1186/s12864-018-5275-8

      Hata T, Satoh S, Takada N, Matsuo M, Obokata J. 2021. Kozak sequence acts as a negative regulator of de novo transcription initiation of newborn coding sequences in the plant genome. Mol Biol Evol. 38:2791-2803. PMID: 33705557; PMCID: PMC8233501. https://doi.org/10.1093/molbev/msab069

      - From an evolutionary perspective, the effective population size (Ne) is also likely related to the "quality" of the ORFs. An analysis of Ne vs. the PTI score distributions would be an interesting addition to this manuscript.

      We appreciate this comment. We now include an analysis of the relationship between Ne and PTI scores by defining an indicator of the extent of overlap in the PTI score distributions between coding and noncoding transcripts. This overlapping score was calculated based on PTI scores or ORF coverage and named Opti or Ocov, respectively. Opti showed positive and negative correlations with mutation rates (Up) and effective population size (Ne), respectively (new Figure 6A), suggesting that the overlap of PTI score distribution is related to slightly deleterious or beneficial mutations fixed in populations due to genetic drift. Furthermore, using the relationship between Ne and Opti, we calculated the minimum effective population size to be approximately 1000, which is consistent with the results from conservation biology (Frankham et al. 2014). Indeed, species at risk of extinction had significantly higher Opti than species with little risk of extinction (left panel, new Figure 6B). In addition, Opti was higher for species with decreasing population sizes than for those with stable population sizes (right panel, new Figure 6B). These results are now included in the revised manuscript (lines 323-332, pages 15-16).
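
To make the idea of an overlap score concrete, the sketch below computes a simple histogram overlap coefficient between two sets of scores in [0, 1]. This is an illustration only: the exact definition of Opti is not given in this reply, so the function name `overlap_score`, the bin count, and the choice of the bin-wise-minimum overlap coefficient are all assumptions rather than the authors' method.

```python
def overlap_score(scores_a, scores_b, bins=20):
    """Overlap coefficient of two score distributions on [0, 1].

    Hypothetical stand-in for an overlap indicator: each sample is turned
    into a histogram of relative frequencies, and the bin-wise minima are
    summed. The result is 1.0 for identical histograms and 0.0 for
    histograms that share no bin.
    """
    if not scores_a or not scores_b:
        raise ValueError("both samples must be non-empty")

    def relative_histogram(scores):
        counts = [0] * bins
        for s in scores:
            # Clamp s == 1.0 into the last bin.
            idx = min(int(s * bins), bins - 1)
            counts[idx] += 1
        return [c / len(scores) for c in counts]

    hist_a = relative_histogram(scores_a)
    hist_b = relative_histogram(scores_b)
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))
```

Under this assumed definition, a value near 1 would mean that coding and noncoding transcripts are nearly indistinguishable by the score, and a value near 0 would mean that they are cleanly separated.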

      Frankham R, Bradshaw CJA. 2014. Genetics in conservation management: Revised recommendations for the 50/500 rules, Red List criteria and population viability analyses, Biological Conservation, 170:56-63, https://doi.org/10.1016/j.biocon.2013.12.036

      Reviewer #1 (Significance (Required)):

      This manuscript is lacking in novelty and is not well positioned in the field. If the aim of this work is to provide a method to classify transcripts as coding or noncoding, the authors should provide detailed comparisons with existing methods (see above). If the aim is to understand what defines a genuine protein-coding transcript, then the biological mechanisms should be better described and the comparisons among species and among functional categories of genes should be further developed. The idea of using the "dominance" of the largest ORF compared to the other predicted ORFs is interesting, and provides a new element compared to existing methods that rely exclusively on ORF length and ORF coverage. I would recommend that the authors develop this idea further and discuss the advantages of using the ORF dominance compared to just the ORF length or coverage.

      Thank you for your comment. To address this, we have revised the description of our aim, which is to investigate what defines a genuine protein-coding transcript. In doing so, we found that the extent of overlap of PTI score distributions between coding and noncoding transcripts is negatively correlated with effective population size. In addition, we have added characterizations of functional categories of high-PTI-score lncRNAs in mice (new Supplementary Tables 6 to 8) and C. elegans (new Supplementary Tables 9, 10, and 11). Comparison of ORF size and coverage to PTI score showed that PTI score is a better indicator for translation of lncRNAs than these indicators and has biological significance in molecular evolution because of its clear correlation with mutation rate and effective population size. These results and related descriptions are now included in the revised manuscript (lines 323-332, pages 15-16; lines 210-218, pages 10-11).

      **Referee Cross-commenting**

      I fully agree with Reviewer 2's remarks. In particular, adding ribosome profiling analyses is an excellent idea and could substantially improve the manuscript.

      We investigated the PTI scores in lncRNAs that are translated, using ribosome profiling data, and found that PTI scores correlated with translation (lines 241-271, pages 12-13). Thank you for this excellent suggestion.

      Reviewer 2

      **Major comments:**

      - some validation of their predictions of coding potential would be good to add. There are plenty of ribosome profiling experiments out there for some of the studied organisms (human, mouse, E. coli) that could be used to show that indeed some of the non-coding RNAs are misclassified and have ribosome density across the predicted open reading frames.

      Thank you for your comment. As noted in our response to Reviewer 1 above, we calculated the PTI scores of translated lncRNAs from the two databases and found that the PTI score correlates with translation of both coding and noncoding RNAs (new Figure 2 and new Supplementary Figures 4 and 5). As noted above, such translation seems to produce slightly deleterious/beneficial effects, thereby becoming fixed in species with smaller effective population sizes by genetic drift. These results and related discussion are now included in the revised manuscript (lines 241-271, pages 12-13; lines 323-332, pages 15-16; lines 487-503, pages 23-24).

      - the manuscript is at times difficult to follow and the implication of the statements may not be immediately clear to the readers, particularly those without formal training in bioinformatic methods; even in the abstract. Some examples: "The relationship between the PTI score and protein-coding potential was sigmoidal in most eukaryotes; however, it was linear passing through the origin in three distinct eutherian lineages, including humans". Here it is not clear what this means (without reading the paper) - and even after reading the paper the importance of noting the sigmoidal vs linear relationship of PTI vs. protein-coding potential is unclear. I would encourage the authors to double-check that they provide a clear interpretation of their results, with readers unschooled in proper statistics in mind.

      Thank you for these comments. As we noted in response to comment 4 of Reviewer 1, considering the fit of the linear approximation, there was no essential difference between the sigmoidal and linear groups. Therefore, in the revised manuscript, we classify the species into two groups: linear and constant (new Figure 7 and Supplementary Figure 10). We also propose and diagram a new gene birth model to help readers understand our interpretations more easily (Figure 9). These results and discussion are now included in the revised manuscript (lines 341-353, pages 16-17; lines 514-538, pages 24-25).

      - For the definition of PTI and protein-coding potential the authors refer to the Materials and Methods. I would encourage to explain in plain terms in the results section 1.) how they decided on this particular formalization and 2.) explain clearly what this means.

      Thank you for your suggestion. We have included a concise definition in plain terms in the revised text (lines 107-115, pages 5-6; lines 144-146, page 7).

      - The definition of protein coding potential for appears to be dependent on database classification of a transcript as either coding and non-coding. Particularly for organisms with complex transcriptomes, databases may not contain the proper information - what are the implications for their protein-coding potential score?

      Organisms with complex transcriptomes, such as multicellular organisms, present difficulties in classifying coding vs. noncoding transcripts because RNAs classified as noncoding based on proteomic data from a subset of cell types may encode functional proteins in other cell types for which proteomic data are not available. To examine whether cell types affect the PTI distribution of coding and noncoding transcripts, we analyzed transcriptomic data from five mammals (human, mouse, rat, macaque, and opossum) and found that the PTI score distributions were similar in most cell or tissue types for noncoding transcripts (new Figure 8C and Supplementary Figure 11). However, PTI score distributions for noncoding RNA in mature testes showed a rightward shift for all five species (new Figure 8C and Supplementary Figure 11).

      Furthermore, we found that tissue specificity of RNA expression was correlated with PTI score (new Figure 8D and new Supplementary Figure 12 and 13), with more specific expression associated with higher PTI scores in all five species, with the majority of the tissue-specific expression in mature testis. Therefore, the mature testis is a special tissue that expresses noncoding RNAs with high coding potentials. These results support the hypothesis that the testis is a special organ for new gene origination (Kaessmann 2010). We have added these results and discussion to the revised manuscript (lines 383-402, pages 18-19; lines 427-434, pages 20-21; lines 435-445, page 21).

      Kaessmann H. 2010. Origins, evolution, and phenotypic impact of new genes. Genome Res, 20:1313-26. Epub 2010 Jul 22. PMID: 20651121; PMCID: PMC2945180. https://doi.org/10.1101/gr.101386.109

      - The authors completely ignore plants - would it make sense to expand their analysis to this branch of the tree of life?

      In Supplementary Figure 5 of our original manuscript (new Supplementary Figure 7), we have included the PTI score distributions from plants. We also present their overlapping scores (Opti) in the revised manuscript.

      Reviewer 2 (Significance (Required)):

      The manuscript presents an elegant way to predict protein-coding and non-coding RNAs, which may be very relevant to the study of organisms with complex transcriptomes. The audience for the manuscript at the moment may be more limited to scientists trained and working in the field of bioinformatics, but with some integration of transcriptomics and ribosome profiling data, as well as an effort to make the results accessible to scientists not trained in bioinformatics, this manuscript may be relevant and of interest to researchers working on the biology of long non-coding RNAs and translation in general. My expertise: systems biology of RNA binding proteins, transcriptomics, RNA biology.

      **Referee Cross-commenting**

      I fully agree with my co-reviewer regarding additional analyses to strengthen the manuscript.

      Thank you for these comments. We analyzed noncoding RNAs using ribosome profiling data and transcriptomes in different tissues. We found that high PTI scores correlated with translation of noncoding RNAs, and that such high-PTI-score noncoding RNAs were specifically expressed in mature testes. Because the effective population size was inversely correlated with the overlap of PTI distributions, the slightly deleterious or beneficial mutations in germ cells of mature testes seem to generate high-PTI-score noncoding RNAs as candidates for new coding genes in the next generation. This idea is consistent with the hypothesis that new coding transcripts are derived from noncoding transcripts expressed in spermatocytes and spermatids in mature testes. In addition, we found that human noncoding transcripts with high PTI scores tended to be involved in transcriptional regulation, and the target gene of MYCN was significantly enriched as the original gene. A recent study showed that binding sites for transcription factors, including MYCN, are mutational hotspots in human spermatogonia (Kaiser et al. 2021). Therefore, the PTI score offers an opportunity to integrate the concept of gene birth with classical molecular evolutionary theory, thereby contributing to our understanding of evolution.

      Kaiser VB et al. 2021. Mutational bias in spermatogonia impacts the anatomy of regulatory sites in the human genome. Genome Res. Epub ahead of print. PMID: 34417209. https://doi.org/10.1101/gr.275407.121

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In the manuscript "Potentially translated sequences determine protein-coding potential of RNAs in cellular organisms" Suenaga and colleagues analyze the available transcriptomes from 100 prokaryotes and eukaryotes, as well as >100 viruses to understand whether transcripts tend to be translated or not. They develop a potentially translated island score (PTI) that combines the number and length of open reading frames in a transcript. From there they develop a protein-coding potential score that combines PTI with database information on coding and non-coding transcripts in various organisms and that in some sense predicts whether a transcript would fall in the coding or non-coding category. The main takeaway appears to be that in prokaryotes PTIs and protein coding potential strongly differentiates coding and non-coding transcripts, while in eukaryotes these differences appear to be more fluid. The manuscript presents an interesting bioinformatic analysis of coding properties across the phylogenetic field and may represent an interesting resource. The audience for the manuscript at the moment may be more limited to scientists trained and working in the field of bioinformatics, but with some integration of transcriptomics and ribosome profiling data, as well as an effort to make the results accessible to scientists not trained in bioinformatics, this manuscript may be relevant and of interest to researchers working on the biology of long non-coding RNAs and translation in general.

      Major comments:

      • some validation of their predictions of coding potential would be good to add. There are plenty of ribosome profiling experiments out there for some of the studied organisms (human, mouse, E. coli) that could be used to show that indeed some of the non-coding RNAs are misclassified and have ribosome density across the predicted open reading frames.
      • the manuscript is at times difficult to follow and the implication of the statements may not be immediately clear to the readers, particularly those without formal training in bioinformatic methods; even in the abstract. Some examples: "The relationship between the PTI score and protein-coding potential was sigmoidal in most eukaryotes; however, it was linear passing through the origin in three distinct eutherian lineages, including humans". Here it is not clear what this means (without reading the paper) - and even after reading the paper the importance of noting the sigmoidal vs linear relationship of PTI vs. protein-coding potential is unclear. I would encourage the authors to double-check that they provide a clear interpretation of their results, with readers unschooled in proper statistics in mind.
      • For the definition of PTI and protein-coding potential the authors refer to the Materials and Methods. I would encourage to explain in plain terms in the results section 1.) how they decided on this particular formalization and 2.) explain clearly what this means.
      • The definition of protein coding potential for appears to be dependent on database classification of a transcript as either coding and non-coding. Particularly for organisms with complex transcriptomes, databases may not contain the proper information - what are the implications for their protein-coding potential score?
      • The authors completely ignore plants - would it make sense to expand their analysis to this branch of the tree of life?

      Significance

      The manuscript presents an elegant way to predict protein-coding and non-coding RNAs, which may be very relevant to the study of organisms with complex transcriptomes.

      The audience for the manuscript at the moment may be more limited to scientists trained and working in the field of bioinformatics, but with some integration of transcriptomics and ribosome profiling data, as well as an effort to make the results accessible to scientists not trained in bioinformatics, this manuscript may be relevant and of interest to researchers working on the biology of long non-coding RNAs and translation in general.

      My expertise: systems biology of RNA binding proteins, transcriptomics, RNA biology.

      Referee Cross-commenting

      I fully agree with my co-reviewer regarding additional analyses to strengthen the manuscript.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The manuscript submitted by Suenaga and co-authors presents a method to evaluate the protein-coding potential of transcripts. This method is based on an index that they name the PTI (potentially translated island) score, which represents the ratio between the length of the largest predicted ORF and the sum of all the predicted ORF lengths, for each transcript. The authors compare PTI score distributions between transcripts classified as protein-coding and as non-coding in public nucleotide databases, for a wide range of species, including bacteria, archaea, eukaryotes and viruses. They derive from this comparison a measure of the protein-coding potential of transcripts. To validate this approach, the authors evaluated the distributions of Ka/Ks values for transcripts annotated as coding or non-coding, in various classes of PTI-based protein-coding potential. The main finding of the manuscript stems from the comparison among species: the authors find that bacteria and archaea have narrow, non-overlapping PTI distributions for coding and non-coding transcripts, while eukaryotes have broader and more overlapping PTI distributions.
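
The ratio described in this summary can be illustrated with a short sketch. It assumes, purely for illustration, that a predicted ORF runs from an ATG to the first in-frame stop codon in one of the three forward reading frames; the manuscript's actual ORF-calling rules are not stated here and may differ. The names `find_orf_lengths` and `pti_score` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf_lengths(seq):
    """Lengths (in nt, stop codon included) of ATG-to-stop ORFs.

    Assumption: only the three forward frames are scanned, and each ORF
    starts at the first ATG seen after the previous in-frame stop.
    """
    seq = seq.upper().replace("U", "T")
    lengths = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                lengths.append(i + 3 - start)
                start = None
    return lengths


def pti_score(seq):
    """Length of the largest ORF divided by the summed length of all ORFs.

    Returns None when the sequence contains no complete ORF.
    """
    lengths = find_orf_lengths(seq)
    if not lengths:
        return None
    return max(lengths) / sum(lengths)
```

With this toy definition, a transcript containing a single ORF scores 1.0, and the score falls as additional competing ORFs appear, which is the "dominance" property the referee highlights.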

      Major comments

      • The authors provide no rationale for using the PTI score to measure the protein-coding potential of transcripts. The only attempt to justify this measure is given in the methods: "The definition of PTI score is motivated by our hypothetical concept that translation of pPTI is limited by alternate competing sPTIs." (lines 426-427, page 20). What the PTI score measures is the dominance of the largest predicted ORF over the predicted ORFs, in terms of length. It is not clear why there would be competition for translation of putative ORFs for genuine protein-coding transcripts. An alternative hypothesis, briefly touched upon in the discussion (lines 318-320) is that translation of non-functional ORFs could give rise to the production of toxic proteins, in addition to being costly in terms of energy. The authors should provide the reasoning behind the PTI score and should explain the biological mechanisms that may underlie differences between coding and non-coding transcripts.
      • The presence of ORFs in transcripts has long been used as a predictor of their protein-coding potential. For example, the ORF size and the ORF coverage are part of the set of predictors implemented in CPAT (Wang et al., 2013). The PTI score is necessarily related to these methods, yet no comparison is provided. If the PTI score is to be used as a measure to classify transcripts as coding or non-coding, its performance should be compared to other classifiers, including those that use the presence of ORFs as a predictor (e.g., CPAT) but not only (e.g., PhyloCSF, based on the pattern of sequence evolution).
      • The authors compare the observed PTI score distributions with the PTI scores from random or shuffled sequences. They conclude that the PTI scores do not depend on transcript lengths but on transcript sequences (lines 122-123). However, this is not true for non-coding RNAs, for which the observed and randomized distributions are very similar. The relationship between transcript length and PTI scores should be analyzed into more detail. Are the annotated non-coding transcripts with high PTI scores particular in terms of length?
      • The authors discuss in depth the correlation between PTI scores and PTI-based protein-coding potential measures (e.g., section "PTI scores correlate with protein-coding potential in humans and mice", starting line 125; section "Relationship between the PTI score and protein-coding potential", starting line 243). Given that the protein-coding potential is directly derived from the PTI score distributions for coding and non-coding transcripts, it is not surprising that the two should be correlated. The significance of observing a linear or a sigmoid relationship is not clearly explained.
      • The authors use the entire set of annotated coding and non-coding transcripts to assess the distribution of PTI scores and to define the protein-coding potential. Traditionally, for methods that aim to classify transcripts as coding or non-coding, this is done using "bona fide" coding and non-coding transcripts, which are used as training sets. The efficiency of the method can then be evaluated using a test set of transcripts. This aspect is lacking here and should be implemented.
      • The comparisons among species are likely biased by the quality of lncRNA annotations in non-model organisms - cf. high variations among primates, which are likely driven by the annotation quality and depth.
      • The differences among bacteria, archaea and eukaryotes should be discussed into more depth. In bacteria, the genuine ORF is well defined by the presence of translation signals (e.g., Shine-Dalgarno sequence). Other factors are also at work in both prokaryotes and eukaryotes, including RNA secondary structures. The relationship between these factors and the PTI score should be discussed.
      • From an evolutionary perspective, the effective population size (Ne) is also likely related to the "quality" of the ORFs. An analysis of Ne vs. the PTI score distributions would be an interesting addition to this manuscript.

      Significance

      This manuscript is lacking in novelty and is not well positioned in the field. If the aim of this work is to provide a method to classify transcripts as coding or non-coding, the authors should provide detailed comparisons with existing methods (see above). If the aim is to understand what defines a genuine protein-coding transcript, then the biological mechanisms should be better described and the comparisons among species and among functional categories of genes should be further developed. The idea of using the "dominance" of the largest ORF compared to the other predicted ORFs is interesting, and provides a new element compared to existing methods that rely exclusively on ORF length and ORF coverage. I would recommend that the authors develop this idea further and discuss the advantages of using the ORF dominance compared to just the ORF length or coverage.

      Referee Cross-commenting

      I fully agree with Reviewer 2's remarks. In particular, adding ribosome profiling analyses is an excellent idea and could substantially improve the manuscript.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2021-01041

      Corresponding author(s): Gregory P. Way, PhD

      1. General Statements

      On behalf of the authors, I’d like to thank the Review Commons team for sending our manuscript out for review. I’d also like to thank the three anonymous reviewers for providing valuable feedback that will improve the clarity, focus, and analysis interpretation presented in our manuscript.

      To prompt the editorial team, our paper provides two well-controlled innovations:

      We are the first to train variational autoencoders (VAEs) on classical image features extracted from Cell Painting images. VAEs are commonplace in, and have contributed major discoveries to, other biomedical data types (e.g. transcriptomics), but they have been underexplored in morphology data. In our paper, we trained and optimized three different VAE variants using Cell Painting readouts and compared these variants against shuffled data, against PCA (a linear dimensionality reduction algorithm commonly used as a VAE control), and against L1000 (mRNA) readouts from the same perturbations. We found that cell morphology VAEs train with different settings than gene expression data, and that they generate interpretable latent spaces that depend on the chosen VAE variant.

      We tested special VAE properties to predict polypharmacology cell states in a novel way. Polypharmacology is a major reason why drugs fail to reach the bedside. Off-target effects cause unintended toxicity, and lead to adverse clinical events. In our paper, we used VAE latent space arithmetic (LSA) to predict polypharmacology cell states; in other words, what cells might look like if we perturbed them with a compound that had two mechanisms of action (MOA). We compared our results to shuffled data, PCA, and to LSA performed with VAEs trained using L1000 readouts. We found that cell morphology and gene expression provide complementary information, and that we could predict some polypharmacology cell states robustly, while others were more difficult to predict.
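
As a rough illustration of latent space arithmetic, the sketch below combines mean latent vectors for two single-MOA groups relative to a control group. The specific formula (mean of A plus mean of B minus mean of control) is a commonly used formulation and is an assumption here, since this statement does not spell out the arithmetic; the names `mean_vector` and `predict_combined_latent` are hypothetical, and the latent vectors would in practice come from a trained VAE encoder.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def predict_combined_latent(control, moa_a, moa_b):
    """Assumed LSA rule: z_AB = mean(z_A) + mean(z_B) - mean(z_control).

    Each argument is a list of latent vectors (lists of floats) for one
    group of samples. The result is a single predicted latent vector that
    a decoder could map back to feature space.
    """
    mu_c = mean_vector(control)
    mu_a = mean_vector(moa_a)
    mu_b = mean_vector(moa_b)
    return [a + b - c for a, b, c in zip(mu_a, mu_b, mu_c)]
```

Subtracting the control mean once keeps the shared baseline from being counted twice when the two perturbation effects are added.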

      We found value in all of the reviewer comments. We intend to conduct all but four of the proposed analyses to supplement our aforementioned innovations.

      In the following revision plan, we include all reviewer comments exactly as they were written. The reviewers often had overlapping suggestions. In these cases, we grouped together similar reviewer comments and responded to them once.

      We include three sections: 1) A description of the revisions we plan to conduct in the near future; 2) A description of changes we have already made; and 3) A description and rationale of changes we will not pursue.

      Lastly, we would like to highlight that all reviewers provided positive feedback in their reviews. They discussed our paper as “conceptually and technically unique” and were positive about our methods section, stating that we did a “good job making everything available and reproducible”. Our methods section is complete, and we provide a fully reproducible and versioned github repository. We will release a second version of our github repository when we complete our revision plan to maintain clarity for our submitted version and the peer-reviewed version.

      2. Description of the planned revisions

      2.1. Address UMAP interpretability to provide a deeper description of MOA performance

      Reviewer 1: Instead of using UMAP embedding, it would be better to compare reconstruction error or show a reconstructed image with the original image to claim that models reliably approximate the underlying morphology data.

      Reviewer 1: Rather than just stating that the VAE's did not span the original data distribution and saying beta-VAE performed best by eye, some simple metrics can be drawn to analyze the overlap in data for a more direct and quantified comparison. Researchers should also explain what part of the data is not being captured here. Some analysis of what the original uncaptured UMAP represents is important in understanding the limitations of the VAEs' capacity.

      Reviewer 2: The authors compare generation performance based on UMAP. In the UMAP space, data tend to cluster together even though they might be far from each other in the feature space. I would like to see more quantitive metrics on how well these methods capture morphology distributions. You can compute metrics like MMD distance, kullback leibler (KL), earthmoving distance, or a simple classifier trained on actual MoA classes tested on generated data.

      We agree with the reviewers that evaluating reconstruction loss in addition to providing the UMAP coordinates would improve understanding of VAE limitations and enable a better comparison of VAE performance. We will analyze reconstruction loss across models and include these data as a new supplementary figure, which will enable direct comparisons across models and across different MOAs.
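
      One way to tabulate reconstruction loss per MOA is sketched below. It assumes mean squared error as the reconstruction metric and one MOA label per profile; the function names and toy values are hypothetical.

```python
from collections import defaultdict
from statistics import fmean

def mse(original, reconstructed):
    """Mean squared error between one profile and its reconstruction."""
    return fmean((o - r) ** 2 for o, r in zip(original, reconstructed))

def per_moa_reconstruction_loss(profiles, reconstructions, moa_labels):
    """Average the per-profile reconstruction error within each MOA."""
    losses = defaultdict(list)
    for x, x_hat, moa in zip(profiles, reconstructions, moa_labels):
        losses[moa].append(mse(x, x_hat))
    return {moa: fmean(values) for moa, values in losses.items()}

# Toy profiles with two features each (hypothetical values).
profiles = [[1.0, 2.0], [0.0, 1.0], [3.0, 3.0]]
reconstructions = [[1.0, 1.0], [0.0, 1.0], [1.0, 3.0]]
labels = ["moa_x", "moa_x", "moa_y"]
loss_by_moa = per_moa_reconstruction_loss(profiles, reconstructions, labels)
```

      Running the same tabulation for each model gives a table that can be compared across models and across MOAs.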

      We also agree that UMAP interpretation can be misleading. While currently state-of-the-art, UMAP has mathematical limitations that prevent interpretation of global data structures. However, there are emerging tools, including a new dimensionality reduction algorithm, called PaCMAP, which aims to preserve both local and global structure (Wang et al, 2021). We will explore this tool to determine, both mathematically and empirically, which is most appropriate for our dataset by cross-referencing the visualization with our added supplementary figure describing per-MOA reconstruction loss.

      We would also like to emphasize that we trained our VAEs using CellProfiler readouts from Cell Painting images and not the raw Cell Painting images themselves. As this was one of our primary innovations, this detail is extremely important. Therefore, we have improved clarity and added emphasis to this point in the manuscript introduction and discussion (see section 3).

      2.2. More specific comparisons of MOA predictions to shuffled data and improved description of MOA label accuracy

      Reviewer 1: It is difficult to know the clear threshold for successful performance is on figures like Figure 7 and SFigure 9, but by and large, it appears that the majority of predicted combination MOAs were not successful. Without the ability to either A) adequately predict most all combinations from individual profiles that were used in training or B) an explanation prior to analysis of which combination will be able to predict, it is difficult to see this method being used since the combinatorial predictions are more likely not informative.

      Reviewer 1: The researchers justify the poor performance compared to shuffled data, by saying that A) MOA annotations are noisy and unreliable and B) they MOAs may only manifest in other modalities like what was seen in the L1000 vs morphology predictability. While these might be true, knowing this the researchers should make an effort to clean and de-noise their data and select MOAs that are well-known and reliable, as well as, selecting MOAs for which we have a known morphological or genetic reaction.

      Reviewer 3: Figure 6 is missing error bars (standard deviation of the L2 distance) and, as such, is hard to draw conclusions from.

      We thank the reviewers for raising this concern. We agree that it is critical, and we appreciate the opportunity to address it.

      All three of these comments relate to being unable to draw conclusions from our results when most A∩B predictions appear to have no difference from shuffled controls. Therefore, to address these comments, we will update our LSA evaluation to compare each MOA to a matched set of randomly shuffled data. Specifically, we identified a methodological flaw in how our existing comparison displays the shuffled data. We should compare specific MOA combinations to their corresponding shuffled results instead of comparing all to all, because the all-to-all comparison artificially decreases apparent performance when some polypharmacology predictions fail to recapitulate the ground truth cell states.
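
      The matched comparison can be sketched as follows, assuming that for each MOA combination we have one real LSA prediction and a set of predictions generated from shuffled data for that same combination. The summary statistic (the fraction of shuffled predictions that are at least as close to the ground truth as the real prediction) is one illustrative choice among several, and all names are hypothetical.

```python
import math

def l2_distance(u, v):
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def matched_shuffle_comparison(real_distance, shuffled_distances):
    """Fraction of matched shuffled predictions at least as close to ground truth as the real one."""
    at_least_as_close = sum(d <= real_distance for d in shuffled_distances)
    return at_least_as_close / len(shuffled_distances)

# Toy example for a single MOA combination (hypothetical values).
ground_truth = [1.0, 2.0, 3.0]
real_prediction = [1.0, 2.0, 4.0]
shuffled_predictions = [
    [4.0, 6.0, 3.0],
    [1.0, 2.0, 3.5],
    [0.0, 0.0, 0.0],
    [2.0, 2.0, 5.0],
]
real_d = l2_distance(real_prediction, ground_truth)
shuffled_d = [l2_distance(p, ground_truth) for p in shuffled_predictions]
fraction = matched_shuffle_comparison(real_d, shuffled_d)
```

      A small fraction indicates that the real prediction outperforms its own shuffled baseline, independent of how other combinations perform.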

      We have connected with Paul Clemons, the Senior Director of Computational Chemical Biology Research at the Broad Institute of MIT and Harvard, who has informed us that the Drug Repurposing Hub annotations are among the most well documented. Therefore, while we know that biological annotations are often incomplete, our original text overemphasized the amount of noise contributed by inaccurate labels. We therefore added the following sentence to the discussion to clarify this important point:

      “However, the Drug Repurposing Hub MOA annotations are among the most well-documented resources, so other factors like different dose concentration and non-additive effects contribute to weak LSA performance for some compound combinations (Corsello et al, 2017).”

      We will also update our supplementary figure to account for specific MOA shuffling and include additional text comparing Cell Painting and L1000 showing which MOAs perform best in which modality.

      2.3. More detailed evaluation of MOA performance across drug variance and drug classes

      Reviewer 1: With the small number of combinations that are successfully predicted, to build confidence in the performance, it would be necessary to explain the reason for the differences in performance. Further experimentations should be done looking into any relationship between the type of MOAs (and their features) and the resulting A|B predictability. Looking at Figure 7, the top-performing combinations are comprised entirely of inhibitor MOAs. If the noisiness of the data is a factor, there should be some measurable correlation between feature noisiness and variation and the resulting A|B predictability from LSA.

      We agree with the reviewer that further experimentation would be helpful to gain confidence in our LSA performance. We plan to perform two different analyses to address this question. First, we will compare profile reproducibility (median pairwise correlations among MOAs) to MOA predictability. This will clarify the relationship between MOA measurement variance and performance. Second, we will split MOAs by category (e.g. inhibitor, activator) and test if there are significant performance differences between categories across VAE models in both L1000 and Cell Painting data. This will tell us if there are certain trends in the type of MOAs we’re able to predict. If there are, this would be useful knowledge since it could suggest that certain types of MOAs are associated with a more consistent cell state.
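
      The reproducibility measure in the first analysis can be sketched as below, assuming Pearson correlation between every pair of profiles that share an MOA. The helper names and toy profiles are hypothetical.

```python
import math
from itertools import combinations
from statistics import fmean, median

def pearson(u, v):
    """Pearson correlation between two equal-length profiles."""
    mean_u, mean_v = fmean(u), fmean(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm = math.sqrt(
        sum((a - mean_u) ** 2 for a in u) * sum((b - mean_v) ** 2 for b in v)
    )
    return cov / norm

def median_pairwise_correlation(profiles):
    """Median Pearson correlation over all pairs of profiles in one MOA."""
    return median(pearson(u, v) for u, v in combinations(profiles, 2))

# Three toy profiles from the same MOA (hypothetical values).
replicates = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 3.0, 2.0]]
reproducibility = median_pairwise_correlation(replicates)
```

      Computing this value per MOA and plotting it against LSA performance would show whether poorly reproducible MOAs are also the ones that are hard to predict.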

      2.4. Higher confidence in LSA overfitting assessment

      Reviewer 1: To show that the methodology works well on unseen data, researchers withheld the top 5 performing A|B MOAs (SFig 9) and showed they were still well predicted. This is not the most compelling demonstration since the data to be held out was selected with bias as the top-performing samples. It would be much more interesting to withhold an MOA that was near or only somewhat above the margin of acceptability and see how many holdouts affected the predictability of those more susceptible data points. From my best interpretation, the hold-out experiment also only held out the combination MOA groups from training. It would be better if single MOAs (for example A) which were a part of a combination of MOA (A|B) were also held out to see if predictability suffered as a result and if generalizability did extend to cells with unseen MOAs (not just cells which had already highly performing combinations of seen MOAs).

      We believe our original analysis was extremely compelling. Even when we removed the top MOAs from training, we were still able to capture their combination polypharmacology cell states through LSA. We find this similar to removing all pictures of sunglasses in an image corpus of human faces, but still being able to reliably infer pictures of people wearing sunglasses. Specifically, this tells us that our model is learning some fundamental data generating function that our top performing MOAs tap into regardless of whether they are present in training.

      However, we agree with the reviewer that withholding intermediate-performing MOAs would also be informative, but for a separate reason. Unlike the best predicted MOAs, the intermediate MOAs are likely more susceptible to changes in the training data, so it would be interesting to determine if intermediate MOAs’ performance is a result of overfitting instead of truly learning aspects of the data generating function. We plan to perform this new analysis and add the results to Supplementary Figure 8 as a subpanel and add a full description of the approach to the appropriate methods subsection.

      2.5. Additional metrics to evaluate LSA predictions to provide more confident interpretation

      Reviewer 2: The predictions are evaluated using L2 distances, which I find not that informative. I would like to see other metrics (correlation or L1 or distribution distances in previous comments)

      We agree with the reviewer that using more than one metric would be helpful because oftentimes a single metric does not tell a complete story. We will add a panel to the LSA supplementary figure (Supplementary Figure 7), using Pearson correlation instead. While L2 distances will tell us how close predictions are to ground truth, Pearson correlations will tell us how consistently, on average, we are able to predict feature direction.
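
      The toy example below illustrates why the two metrics are complementary. It assumes a prediction that moves every feature in the right direction but is offset in magnitude; all values are hypothetical.

```python
import math
from statistics import fmean

def l2_distance(u, v):
    """Euclidean distance: sensitive to differences in magnitude."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pearson(u, v):
    """Pearson correlation: sensitive to the pattern across features, not the offset."""
    mean_u, mean_v = fmean(u), fmean(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm = math.sqrt(
        sum((a - mean_u) ** 2 for a in u) * sum((b - mean_v) ** 2 for b in v)
    )
    return cov / norm

ground_truth = [1.0, 2.0, 3.0, 4.0]
prediction = [3.0, 4.0, 5.0, 6.0]  # same pattern across features, shifted by 2

distance = l2_distance(prediction, ground_truth)
correlation = pearson(prediction, ground_truth)
```

      Here the L2 distance is large while the correlation is perfect, so a prediction judged poor by distance alone can still capture the relative pattern across features.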

      2.6. Adding a performance-driven feature level analysis to categorize per-feature modeling ability

      Reviewer 2: I would like to see feature-level analysis, which features are well predicted and which ones are more challenging to predict?

      We agree with the reviewer that feature level analysis would be interesting to study. We believe that understanding which features are easy and hard to model could give insight into why certain MOAs (which could be associated with more signal in certain Cell Painting features) are predicted better than others.

      However, we are concerned that it is difficult to have an objective measurement of which features are easier to model because features that have less variation might be easier to model. So, we will analyze the correlation between individual feature reconstruction loss vs. feature variance across profiles. We will color-code the points to represent feature groups or channels. This analysis will not only demonstrate the relationship between feature variance and modeling ability, but also provide insight into the difficulty of modeling individual CellProfiler features.
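
      A minimal sketch of the quantities behind this planned scatter plot is shown below, assuming mean squared error per feature as the reconstruction loss and population variance as the measure of feature variation. The function names and toy values are hypothetical.

```python
from statistics import fmean, pvariance

def per_feature_mse(profiles, reconstructions):
    """Mean squared reconstruction error of each feature across all profiles."""
    n_features = len(profiles[0])
    return [
        fmean((x[j] - r[j]) ** 2 for x, r in zip(profiles, reconstructions))
        for j in range(n_features)
    ]

def per_feature_variance(profiles):
    """Population variance of each feature across all profiles."""
    return [pvariance(column) for column in zip(*profiles)]

# Toy data: feature 0 is constant, feature 1 varies (hypothetical values).
profiles = [[0.0, 1.0], [0.0, 3.0], [0.0, 5.0]]
reconstructions = [[0.0, 2.0], [0.0, 3.0], [0.0, 4.0]]

# One (variance, loss) point per feature, ready to plot or correlate.
points = list(zip(per_feature_variance(profiles), per_feature_mse(profiles, reconstructions)))
```

      In this toy case the constant feature is reconstructed perfectly, which is the confound described above: low loss may reflect low variance rather than good modeling.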

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      3.1. Documenting positive feedback as provided by the three reviewers

      Reviewer 1: With access to the dataset, the posted GitHub, and documentation in the paper, I believe that the experiments are reproducible.

      Reviewer 1: The experiments are adequately replicated statistically for conventions of deep learning.

      Reviewer 1: This paper proposes a conceptually and technically unique proposal in terms of application, taking existing technologies of VAEs and LSA and, and as far as I know, uses them in a novel area of application (predicting and simulating combination MOAs for compound treatments). If this work is shown to work more broadly and effectively, is seen through to it completion, and is eventually successfully implemented, it will help to evaluate the effects of drugs used in combination on gene expression and cell morphology. An audience in the realm of biological deep learning applications as well as an audience working in the compound and drug testing would be interested in the results of this paper. Authors successfully place their work within the context of existing literature, referencing the numerous VAE applications that they build off of and fit into the field of (Lafarge et al, 2018; Ternes et al, 2021, etc...), citing the applications of LSA in the computer vision community (Radford et al, 2015, Goldsborough et al, 2017), and discussing the biological context that they are working in (Chandrasekaran et al, 2021).

      Reviewer 2: The main novelty of the work is applying VAEs on cell painting data to predict drug perturbations. The final use case could be guiding experimental design by predicting unseen data. However, the authors do not show such an example and use case which is understandable due to the need for doing further experiments to validate computational results and maybe not the main focus of this paper. The authors did a good job of citing existing methods and relevant. The potential audience could be the computational biology and applied machine learning community.

      Reviewer 3: The manuscript is beautifully written in a crystal clear manner. The authors have made a visible effort towards making their work understandable. The methods section is clear and comprehensive. All experiments are rigorously conducted and the validation procedures are sound. The conclusions of the paper are convincing and most of them are well supported by the data. Both the data and the code required to reproduce this work are freely available. Overall, the article is of high quality and relevance to several scientific communities.

      We thank the reviewers for their encouraging remarks and overall positive sentiment. As early-career researchers, we feel empowered by these words.

      3.2. Moved Figure 2 to supplement and removed Figure 5

      Reviewer 1: Fig 2 is not informative so it can go to supplementary.

      Reviewer 2: I liked the paper's GitHub repo, the authors did a good job making everything available and reproducible. As a suggestion, you can move the learning curves in two the sup figures cause they might not be the most exciting piece of info for the non-technical reader.

      Reviewer 3: I would suggest removing Figure 5 (or moving it to the supplementary) as it revisits the content of Figure 1 and does not bring much extra information.

      We agree that Figure 2 might not be informative to a non-technical reader, so we have accepted this suggestion by both reviewers 1 and 2, and we have moved Figure 2 to supplementary.

      We agree with the reviewer and have removed Figure 5.

      3.3. Clarified our data source as CellProfiler readouts, not raw Cell Painting images

      Reviewer 1: In Fig 4, it would be useful to show a few sample representative images with respect to CellProfiler feature groups.

      Reviewer 1: Figure 6, what does it means original input space? Does it mean raw pixel image? As researchers extracted CellProfiler feature groups already, it would be interesting to compare mean L2 distance based on CellProfiler features so that whether VAE improves performance or not (compared to handcrafted features) as a baseline.

      Reviewer 3: While what "morphological readouts" concretely mean becomes clearer later on in the paper, it would be useful to give a couple of examples early on when introducing the considered datasets.

      We thank the reviewers for these suggestions, which bring to light a common source of confusion that we must resolve. We are working with CellProfiler readouts (features extracted using classical algorithms) of the Cell Painting images and not the images themselves. We have made several edits throughout the manuscript to improve clarity and remove this confusion, including in the introduction, in which we clearly state our model input data:

      “Because of the success of VAEs on these various datasets, we sought to determine if VAEs could also be trained using cell morphology readouts (rather than directly on images), and further, to carry out arithmetic to predict novel treatment outcomes. We derive the cell morphology readouts using CellProfiler (McQuin et al, 2018), which measures the size, structure, texture, and intensity of cells, and use these readouts to train all models.”

      This decision comes with tradeoffs: The benefit of using CellProfiler readouts instead of images is that they are more manageable but we might lose some information. We more thoroughly discuss this important tradeoff in the discussion section:

      “We determined that VAEs can be trained on cell morphology readouts rather than directly using the cell images from which they were derived. This decision comes with various trade-offs. Compared to cell images, cell morphology readouts as extracted by image analysis tools (e.g. CellProfiler) are a more manageable data type; the data are smaller, easier to distribute, substantially less expensive to analyze and store, and faster to train (McQuin et al, 2018). However, it is likely some biological information is lost, because these tools might fail to measure all morphology signals. The so-called image-based profiling pipeline also loses information, by nature of aggregating inherently single-cell data to bulk consensus signatures (Caicedo et al, 2017).”

      3.4. Clarified future directions to infer cell health readouts from simulated polypharmacology cell states

      Reviewer 1: Authors also make the claim that they can infer toxicity and simulate the mechanism of how two compounds might react. This is a claim that would not be supported even if the method were able to successfully predict morphology or gene profiles. Drug interaction and toxicity are quite complex and goes beyond just morphology and expression. VAEs predicting a small set of features would not be able to capture information beyond the readouts, especially when dealing with potentially unseen compounds for which toxicity is not yet known. For example, two compounds might produce a morphology that appears similar to other safe compounds but has other factors that contribute to toxicity. Further, here they show no evidence of toxicity or interaction analysis.

      The reviewer is correct that such a claim is unsupported by our research. Our message was actually that inferring toxicity could be a potential future application of our work. Specifically, for example, we can apply orthogonal models of cell toxicity that we previously derived using other data (Way et al, 2021a) to our inferred polypharmacology cell states. We thank this reviewer for noticing our lack of clarity, and we have made changes in the discussion to make it clear that inferring toxicity is something we may do in the future and is not something that is discussed in the manuscript:

      “In the future, by predicting cell states of inferred polypharmacology, we can also infer toxicity using orthogonal models (e.g. Way et al. 2021) and simulate the mechanisms of how two compounds might interact.”

      3.5. Clarified our method of splitting data, and noting how a future analysis will answer overfitting extent

      Reviewer 2: Could authors outline detailed data splits? Which MoA are in train and which are held out from training? As I understood, there were samples from MoAs that were supposed to be predicted in the calculation of LSA? Generally, the predicted MoA should not be seen during training and not in LSA calculation.

      We now more explicitly detail how we split our data in the methods:

      “As input into our machine learning models, we split the data into an 80% training, 10% validation, and 10% test set, stratified by plate for Cell Painting and stratified by cell line for L1000. In effect, this procedure evenly distributes compounds and MOAs across data splits.”
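
      The split described in the quoted methods text can be sketched as follows. This assumes that "stratified by plate" means each plate contributes the same 80/10/10 proportions to the three sets; the function name, the seed, and the toy sample IDs are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(sample_ids, strata, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split samples into train/validation/test sets, preserving proportions within each stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for sample, stratum in zip(sample_ids, strata):
        by_stratum[stratum].append(sample)

    train, valid, test = [], [], []
    for members in by_stratum.values():
        rng.shuffle(members)
        n_train = round(fractions[0] * len(members))
        n_valid = round(fractions[1] * len(members))
        train += members[:n_train]
        valid += members[n_train:n_train + n_valid]
        test += members[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test

# Toy example: 40 profiles from two plates (hypothetical IDs).
samples = list(range(40))
plates = ["plate_1"] * 20 + ["plate_2"] * 20
train, valid, test = stratified_split(samples, plates)
```

      Because each plate is split separately, every plate is represented in all three sets in the same proportions.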

      We also thank the reviewer for this comment, because they express an important concern about making sure that we are not overfitting to the data. We have explained in the manuscript that because of lack of data, MOAs were repeated in training and LSA. However, we believe overfitting is not playing a large role in model performance. Through our hold 5 out experiment, we are able to show that our models are able to predict the same MOAs irrespective of whether they were in the training data, indicating that we did not overfit to the distribution of certain MOAs.

      Reviewer 1 also suggested that we do the hold 5 out experiment on A∩Bs that were barely predicted. After we do that, we will explicitly demonstrate the extent of overfitting.

      3.6. Introduced acronyms when they first appear in the manuscript

      Reviewer 3: The Kullback-Leibler divergence is properly introduced in the methods part, but not at all in the introduction (it directly appears as "the KL divergence"). To enhance readability, it would be better to fully spell it before using the acronym, and maybe give a one-sentence intuition of what it is about before pointing out to the methods part for more details.

      We thank the reviewer for bringing this to our attention. We have carefully reviewed the entire manuscript and now spell out each acronym where it first appears.

      3.7. Fixed minor text changes

      Reviewer 3: In Figure 1, I would recommend changing "compression algorithms" to "dimension reduction algorithm" or "embedding algorithm". In a compression setting, I would expect the focus to be on the number of bits of information each method requires (or the dimension of the resulting embedding) to encode the data while guaranteeing a certain quality threshold. This is obviously not the case here as the dimension of the embedding is fixed and the focus is on exploring how the embedding is constructed (eg how much it decorrelates the different features, etc) - which may be misleading.

      Reviewer 3: I recommend using "A n B" or "A & B" or "(A, B)" to denote the combination of two independent modes of action A and B. The current notation "A | B" overloads the statistical "A given B" which appears in the VAE loss and is therefore misleading.

      We agree with the reviewer, and aim to minimize all sources of potential confusion. We have made the change in the figure.

      We also agree that our current notation can be confusing. We have updated all instances of “A|B” with “A ∩ B”.

      3.8. Added hypothesis of MMD-VAE oscillations to supplementary figure legend

      Reviewer 3: Do the authors have a hypothesis of what may be causing MMD-VAE to oscillate during validation when data are shuffled? This seems to be the case on two of the three considered datasets (Figure 2 and SuppFigure 1) and is not observed for the other models. Including a few sentences on that in the text would be interesting.

      We believe this is largely because the optimal MMD-VAE had a much higher regularization term than the optimal Beta or Vanilla VAE, which puts a greater emphasis on forming normal latent distributions. Forcing the VAE to encode a shuffled distribution into a normally distributed latent distribution would be difficult to do consistently across different randomly shuffled data subsets, and therefore might cause oscillations in the training curve across epochs when the penalty for that term is high. As these observations may be interesting to a certain population of readers, we have incorporated this explanation into the supplementary figure legend (which is where this figure is shown):

      “Forcing the VAE to consistently encode a shuffled distribution into a normally distributed latent distribution would be difficult, and therefore might cause oscillations in the training curve across epochs.”

      3.9. Explained our selection of VAE variants

      Reviewer 3: The different types of considered VAE and their differences are very clearly introduced. It may however be good to motivate a bit more the focus on beta-VAE and MMD-VAE among all the possible VAE models. This is partly done through examples in the second paragraph of page 2, but could be elaborated further.

      We thank the reviewer for their encouraging remarks. We have made edits to the manuscript’s introduction, explaining why we chose these two variants out of all the possible choices:

      “We trained vanilla-VAEs, β-VAEs, and MMD-VAEs only, and not other VAE variants and other generative model architectures, such as generative adversarial networks (GANs), because these VAE variants are known to facilitate latent space interpretability.”

      4. Description of analyses that authors prefer not to carry out

      4.1. We will not explore additional latent space dimensions in more detail, as this is out of scope

      Reviewer 1: As both reconstructed and simulated data did not span the full original data distribution, it might be better to look at reconstruction error and increase the dimension of latent space.

      We thank the reviewer for bringing up this important point. Our VAE loss function consists of the sum of reconstruction error and some form of KL divergence. Specifically, this reviewer is suggesting that if we only minimize reconstruction error (or focus more on reconstruction over KLD by lowering beta), a higher latent dimension would result in better overall reconstruction. This is true, but doing so would have negative consequences. While we would perhaps get the UMAPs to show the full data distribution, the UMAPs are not our focus; predicting polypharmacology through LSA is. We found that when we have a higher focus on the reconstruction term, we have more feature entanglement, as indicated by lower performance when simulating data and overlapping feature contribution per latent feature. The fact that simulating data would logically require less disentanglement than performing LSA shows that we require higher regularization (and hence lower focus on reconstruction) than the one we got from simulating data.
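
      The trade-off between the two loss terms can be made concrete with a minimal sketch of a β-weighted VAE loss. It assumes a summed squared error reconstruction term and the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior; the exact loss terms in the trained models may differ, and the names and values are hypothetical.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, diag(exp(log_var))) and N(0, I)."""
    return -0.5 * sum(
        1.0 + lv - m ** 2 - math.exp(lv) for m, lv in zip(mu, log_var)
    )

def beta_vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction error plus beta-weighted KL regularization."""
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return reconstruction + beta * kl_to_standard_normal(mu, log_var)

# Toy profile, reconstruction, and latent parameters (hypothetical values).
x, x_hat = [1.0, 2.0], [1.0, 1.0]
mu, log_var = [1.0, 0.0], [0.0, 0.0]

loss_beta_1 = beta_vae_loss(x, x_hat, mu, log_var, beta=1.0)
loss_beta_4 = beta_vae_loss(x, x_hat, mu, log_var, beta=4.0)
```

      Lowering `beta` shifts the optimization toward the reconstruction term, while raising it strengthens the regularization that encourages disentangled latent features.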

      Essentially, while the reviewer's suggestion would improve reconstruction and allow us to improve the UMAPs, doing so would likely worsen LSA performance, which is the main focus of the project. Also, increasing the latent dimension without changing beta would likely cause little to no change: because beta encourages disentanglement, the newly added dimension would have little variation and encode little information that wasn’t already encoded.

      We have also previously explored the concept of toggling the latent dimensions in a separate project (Way et al, 2020). We are very interested in this area of research in general, and any additional analyses (beyond hyperparameter optimization) deserve a much deeper dive than what we can provide in this paper.

      Lastly, we intend to include a deeper description and analysis of reconstruction loss across models, datasets, and MOAs, as was suggested by a previous reviewer comment (see section 2.1 above).

      4.2. We will not review Gaussian distribution assumptions of the VAE as we feel it is not informative

      Reviewer 1: By looking at SFigure 6, I am wondering whether latent distribution actually met gaussian distribution (assumption of VAE). It may show skew distribution as some of latent features shows low contribution.

      This reviewer’s comment is interesting, but we do not believe it would change the findings of our study. Suppose we find that the latent dimensions aren’t normally distributed. This wouldn’t change much, because a Gaussian latent distribution is not critical for performing LSA. We need the latent code to be disentangled, but having normally distributed latent features doesn't necessarily mean that we have good disentanglement (see https://towardsdatascience.com/what-a-disentangled-net-we-weave-representation-learning-in-vaes-pt-1-9e5dbc205bd1).

      4.3. In this paper, we will not train or compare conditional VAEs nor cycle GANs

      Reviewer 2: While authors provided a comparison between vanilla VAE and MMD-VAE, B-VEA, there are other methods capable of doing similar tasks (data simulation, counterfactual predictions ), I would like to see a comparison with those methods such as conditional VAE( https://papers.nips.cc/paper/2015/hash/8d55a249e6baa5c06772297520da2051-Abstract.html, CVAE + MMD : https://academic.oup.com/bioinformatics/article/36/Supplement_2/i610/6055927?login=true) or cycle GANs(https://arxiv.org/abs/1703.10593 ).

      While such comparisons would be interesting, they are not the main focus of the manuscript, which is to benchmark the use of VAEs in cell morphology readouts and to predict polypharmacology.

      We think that a CVAE would not be appropriate for our study. In a CVAE, the encoder and decoder are both conditioned on some variable. In our situation, where we are predicting the cell states of different MOAs, it would make most sense to condition on the MOA. However, because we’re using the MOA labels in our LSA experiment, conditioning on them is likely to bias our results and not be effective for MOAs outside the conditioning.

      For cycle GANs, we have found that training using these data, in a separate study in our lab, is extremely difficult. Our lab has not published this yet, but once we are able to better understand cycleGAN behavior in these data, it will require a separate paper in which we compare performance and dissect model properties in much greater detail.

      Nevertheless, we have added citations to multi-modal approaches like cycle GANs (see section 4.4) as they will point a reader to useful resources for future directions.

      4.4. We will not be comparing with multi-modal integration, but we clarified our focus on Cell Painting VAE novelty and added multi-modal citations

      Reviewer 1: Researchers found that the optimal VAE architectures were very different between morphology and gene expression, suggesting that the lessons learned training gene expression VAEs might not necessarily translate to morphology. It would be interesting to compare the result with multimodal integration as baseline (i.e., Seurat).

      Our focus in this paper was to train and benchmark different variational autoencoder (VAE) architectures using Cell Painting data and to demonstrate an important, unsolved application in predicting polypharmacology that we show is now possible for a subset of compounds. It was a natural and useful extension to compare Cell Painting VAE performance with L1000 VAE performance, especially since our data set contained equivalent drug perturbations. We feel that any extension including multi-modal data integration will distract focus away from the Cell Painting VAE novelty, and requires a much deeper dive beyond the scope of our current manuscript.

      Additionally, there have been other, more in-depth and very recent multi-modal data integration efforts using the same or similar datasets (Caicedo et al, 2021; Haghighi et al, 2021). In a separate paper that we just recently submitted, we also dive much deeper to answer the question of how the two modalities complement one another in various ways and for various tasks (Way et al, 2021b). These two papers already provide a deeper and more informative exploration of Cell Painting and L1000 data integration.

      Therefore, because multi-modal data integration, while certainly interesting, would distract from the Cell Painting VAE novelty and is redundant with other recent publications, we feel it is beyond the scope of this current paper.

      Nevertheless, multi-modal data integration is important to mention, so we add it to the discussion. Specifically, we discuss how multi-modal data integration might help with predicting polypharmacology in the future and include pertinent citations so that we, or another reader, might be able to follow-up in the future. The new section reads:

      “Because we had access to the same perturbations with L1000 readouts, we were able to compare cell morphology and gene expression results. We found that both models capture complementary information when predicting polypharmacology, which is a similar observation to recent work comparing the different technologies’ information content (Way et al, 2021). We did not explore multi-modal data integration in this project; this has been explored in more detail in other recent publications (Caicedo et al, 2021; Haghighi et al, 2021). However, using multi-modal data integration with models like CycleGAN or other style transfer algorithms might provide more confidence in our ability to predict polypharmacology in the future (Zhu et al, 2017).”

      1. References

      Caicedo JC, Cooper S, Heigwer F, Warchal S, Qiu P, Molnar C, Vasilevich AS, Barry JD, Bansal HS, Kraus O, et al (2017) Data-analysis strategies for image-based cell profiling. Nat Methods 14: 849–863

      Caicedo JC, Moshkov N, Becker T, Yang K, Horvath P, Dancik V, Wagner BK, Clemons PA, Singh S & Carpenter AE (2021) Predicting compound activity from phenotypic profiles and chemical structures. bioRxiv: 2020.12.15.422887

      Corsello SM, Bittker JA, Liu Z, Gould J, McCarren P, Hirschman JE, Johnston SE, Vrcic A, Wong B, Khan M, et al (2017) The Drug Repurposing Hub: a next-generation drug library and information resource. Nat Med 23: 405–408

      Haghighi M, Singh S, Caicedo J & Carpenter A (2021) High-Dimensional Gene Expression and Morphology Profiles of Cells across 28,000 Genetic and Chemical Perturbations. bioRxiv: 2021.09.08.459417

      McQuin C, Goodman A, Chernyshev V, Kamentsky L, Cimini BA, Karhohs KW, Doan M, Ding L, Rafelski SM, Thirstrup D, et al (2018) CellProfiler 3.0: Next-generation image processing for biology. PLoS Biol 16: e2005970

      Wang Y, Huang H, Rudin C & Shaposhnik Y (2021) Understanding how dimension reduction tools work: an empirical approach to deciphering t-SNE, UMAP, TriMAP, and PaCMAP for data visualization. J Mach Learn Res 22: 1–73

      Way GP, Kost-Alimova M, Shibue T, Harrington WF, Gill S, Piccioni F, Becker T, Shafqat-Abbasi H, Hahn WC, Carpenter AE, et al (2021a) Predicting cell health phenotypes using image-based morphology profiling. Mol Biol Cell 32: 995–1005

      Way GP, Natoli T, Adeboye A, Litichevskiy L, Yang A, Lu X, Caicedo JC, Cimini BA, Karhohs K, Logan DJ, et al (2021b) Morphology and gene expression profiling provide complementary information for mapping cell state. bioRxiv: 2021.10.21.465335

      Way GP, Zietz M, Rubinetti V, Himmelstein DS & Greene CS (2020) Compressing gene expression data using multiple latent space dimensionalities learns complementary biological representations. Genome Biol 21: 109

      Zhu J-Y, Park T, Isola P & Efros AA (2017) Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. arXiv [csCV]

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this paper, the authors explore the use of VAE to learn low-dimensional representations of morphological features of cells. They demonstrate that the representations learned by the different VAE models considered accurately model the distribution of features in real data and can be complemented by other biological readouts such as gene expression. Additionally, the structure of the learned feature space appears to be sufficient to generate accurate predictions relying on latent space arithmetic - for instance allowing to predict the morphology of samples subjected to two perturbations knowing the morphology of samples affected by either of these perturbations in isolation.

      Comments:

      The manuscript is beautifully written in a crystal clear manner. The authors have made a visible effort towards making their work understandable. The methods section is clear and comprehensive. All experiments are rigorously conducted and the validation procedures are sound. The conclusions of the paper are convincing and most of them are well supported by the data. Both the data and the code required to reproduce this work are freely available.

      Overall, the article is of high quality and relevance to several scientific communities. I only have a couple of minor comments that I think could help improve it further:

      • The Kullback-Leibler divergence is properly introduced in the methods part, but not at all in the introduction (it directly appears as "the KL divergence"). To enhance readability, it would be better to fully spell it before using the acronym, and maybe give a one-sentence intuition of what it is about before pointing out to the methods part for more details.
      • While what "morphological readouts" concretely mean becomes clearer later on in the paper, it would be useful to give a couple of examples early on when introducing the considered datasets.
      • The different types of considered VAE and their differences are very clearly introduced. It may however be good to motivate a bit more the focus on beta-VAE and MMD-VAE among all the possible VAE models. This is partly done through examples in the second paragraph of page 2, but could be elaborated further.
      • In Figure 1, I would recommend changing "compression algorithms" to "dimension reduction algorithm" or "embedding algorithm". In a compression setting, I would expect the focus to be on the number of bits of information each method requires (or the dimension of the resulting embedding) to encode the data while guaranteeing a certain quality threshold. This is obviously not the case here as the dimension of the embedding is fixed and the focus is on exploring how the embedding is constructed (eg how much it decorrelates the different features, etc) - which may be misleading.
      • Do the authors have a hypothesis of what may be causing MMD-VAE to oscillate during validation when data are shuffled? This seems to be the case on two of the three considered datasets (Figure 2 and SuppFigure 1) and is not observed for the other models. Including a few sentences on that in the text would be interesting.
      • I recommend using "A n B" or "A & B" or "(A, B)" to denote the combination of two independent modes of action A and B. The current notation "A | B" overloads the statistical "A given B" which appears in the VAE loss and is therefore misleading.
      • I would suggest removing Figure 5 (or moving it to the supplementary) as it revisits the content of Figure 1 and does not bring much extra information.
      • Figure 6 is missing error bars (standard deviation of the L2 distance) and, as such, is hard to draw conclusions from.

      Significance

      Nature and significance:

      This work does not hold new conceptual or technical contributions per se as it focuses on showcasing the use of existing techniques established in other fields (eg in the context of natural image processing for latent space arithmetics) to biological data analysis. That said, popularizing successful methodologies beyond the scientific community where they have been developed, as done in this work, is immensely valuable. As such, the approach presented in the paper is likely to inspire and enable many other studies and is therefore a significant contribution (especially so thanks to the code availability!)

      Comparison to existing published knowledge:

      While a bunch of published works use VAEs on biological data, I am not aware of existing ones that study the relative merit of the representations obtained with different VAE models as done here and explore their use in a generative setting with latent space arithmetics. As such, this work is novel and distinguishes itself from existing published knowledge.

      Audience:

      This work is likely to be of interest to life scientists with an enthusiasm for state-of-the-art data analysis techniques. Because the paper is clearly written and makes very few assumptions of prior expert knowledge, it is also likely to be a good entry point to the wider VAE/generative models literature for non-experts. I also believe that this manuscript can be of interest to computer scientists and machine learning researchers as it presents a concrete example of the use of published methods in the context of biological data analysis.

      My expertise:

      Computer vision and machine learning. I do not feel qualified to assess the clinical relevance of this work.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Ler et al. propose a series of VAE-based methods to predict compound polypharmacology for cell painting data. They first learn a latent space and try to answer the following counterfactual:

      how would the cell morphology or gene expression of a cell perturbed with Drug A change if it were perturbed with Drug A and B (A+B), given that we have the measurements for Drug A and Drug B? They address the problem by doing latent space arithmetic (LSA) and decoding the predicted morphology measurements. They first train different VAE models to compare the training stability and simulation performance by sampling from the latent space. A further analysis deconvolves the learned latent space into the feature space. I like the application of LSA+VAE on cell painting datasets, which is the main novelty of the paper. However, I have some major comments and concerns:

      Major comments:

      While the authors provided a comparison between vanilla VAE, MMD-VAE, and beta-VAE, there are other methods capable of doing similar tasks (data simulation, counterfactual predictions). I would like to see a comparison with those methods, such as conditional VAE (https://papers.nips.cc/paper/2015/hash/8d55a249e6baa5c06772297520da2051-Abstract.html), CVAE + MMD (https://academic.oup.com/bioinformatics/article/36/Supplement_2/i610/6055927?login=true), or cycle GANs (https://arxiv.org/abs/1703.10593). The authors compare generation performance based on UMAP. In the UMAP space, data tend to cluster together even though they might be far from each other in the feature space. I would like to see more quantitative metrics on how well these methods capture morphology distributions. You can compute metrics like MMD distance, Kullback-Leibler (KL) divergence, earth mover's distance, or a simple classifier trained on actual MoA classes and tested on generated data.
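
      One of the metrics suggested above, MMD, can be estimated directly from two sets of feature vectors. The sketch below is a minimal, dependency-free illustration of the biased squared-MMD estimator with a Gaussian (RBF) kernel. The function names, the fixed `gamma` bandwidth, and the toy data are assumptions for illustration and are not taken from the manuscript.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    k_xx = sum(rbf_kernel(a, b, gamma) for a in xs for b in xs) / (len(xs) ** 2)
    k_yy = sum(rbf_kernel(a, b, gamma) for a in ys for b in ys) / (len(ys) ** 2)
    k_xy = sum(rbf_kernel(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return k_xx + k_yy - 2.0 * k_xy


# Toy data (hypothetical): "real" profiles and a shifted copy standing in
# for poorly matching generated profiles.
real = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
shifted = [[x + 3.0, y + 3.0] for x, y in real]

same_score = mmd_squared(real, real)      # identical samples give a value near 0
shift_score = mmd_squared(real, shifted)  # clearly separated samples give a larger value
```

      In practice the kernel bandwidth matters a great deal; a common heuristic is to set it from the median pairwise distance, which is not done in this sketch.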

      The predictions are evaluated using L2 distances, which I find not that informative. I would like to see other metrics (correlation, L1, or the distribution distances from the previous comment). I would also like to see a feature-level analysis: which features are well predicted and which ones are more challenging to predict?

      • Could authors outline detailed data splits? Which MoA are in train and which are held out from training? As I understood, there were samples from MoAs that were supposed to be predicted in the calculation of LSA? Generally, the predicted MoA should not be seen during training and not in LSA calculation.

      Minor comments:

      I liked the paper's GitHub repo; the authors did a good job making everything available and reproducible. As a suggestion, you can move the learning curves into the supplementary figures because they might not be the most exciting piece of info for the non-technical reader.

      Significance

      The main novelty of the work is applying VAEs on cell painting data to predict drug perturbations. The final use case could be guiding experimental design by predicting unseen data. However, the authors do not show such an example and use case which is understandable due to the need for doing further experiments to validate computational results and maybe not the main focus of this paper.

      • The authors did a good job of citing existing methods and relevant
      • The potential audience could be the computational biology and applied machine learning community.
      • My expertise is in computational biology and machine learning.
    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Researchers used two primary data modalities (L1000 sequencing data, and Cell painting morphology features) for cell data perturbed by a series of compounds, each with labeled (individual and combination) mechanisms of action. Using several VAEs and ML methods, they evaluated their ability to encode interpretable latent spaces (evaluated by subtracting +/-3stds and checking the contribution of features to the latent space) and adequately reconstruct the input data. Using the constructed latent spaces and labeled MOAs, researchers performed latent space arithmetic to remove base DMSO features and add features of individual MOAs to produce the features of combination MOAs (evaluated by the significance of difference to shuffled data). Researchers found that MMD-VAE encoded the most information and that VAEs successfully simulated morphology and gene expression features. They found that the optimal VAE architectures were very different between morphology and gene expression. Researchers found that VAEs were able to use individual MOA profiles to simulate some combination MOA profiles with varied success.
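
      The latent space arithmetic summarized above can be sketched in a few lines. This is a minimal illustration that assumes the common centroid form of LSA (mean latent vector of A, plus mean latent vector of B, minus mean latent vector of the DMSO control); the manuscript's exact procedure may differ, and the function names and toy values are hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length latent vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def lsa_predict(z_dmso, z_a, z_b):
    """Predict the latent centroid of the A+B combination as mean(A) + mean(B) - mean(DMSO)."""
    mu_dmso = mean_vector(z_dmso)
    mu_a = mean_vector(z_a)
    mu_b = mean_vector(z_b)
    return [a + b - d for a, b, d in zip(mu_a, mu_b, mu_dmso)]


# Toy 2-dimensional latent vectors (hypothetical values, two samples per group).
z_dmso = [[0.0, 0.0], [2.0, 2.0]]
z_a = [[2.0, 1.0], [4.0, 3.0]]
z_b = [[1.0, 5.0], [1.0, 7.0]]

z_ab_predicted = lsa_predict(z_dmso, z_a, z_b)
```

      In a full pipeline, the predicted latent vector would be passed through the trained VAE decoder to obtain predicted morphology or expression features, which can then be compared with measured A+B profiles.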

      Comments:

      • Researchers found that the optimal VAE architectures were very different between morphology and gene expression, suggesting that the lessons learned training gene expression VAEs might not necessarily translate to morphology. It would be interesting to compare the result with multimodal integration as baseline (i.e., Seurat).

      -Instead of using UMAP embedding, it would be better to compare reconstruction error or show a reconstructed image with the original image to claim that models reliably approximate the underlying morphology data. As both reconstructed and simulated data did not span the full original data distribution, it might be better to look at reconstruction error and increase the dimension of latent space.

      -Fig 2 is not informative so it can go to supplementary.

      -In Fig 4, it would be useful to show a few sample representative images with respect to CellProfiler feature groups.

      -Looking at SFigure 6, I am wondering whether the latent distribution actually follows a Gaussian distribution (the assumption of the VAE). It may show a skewed distribution, as some of the latent features show low contribution.

      -Figure 6: what does "original input space" mean? Does it mean the raw pixel image? As researchers extracted CellProfiler feature groups already, it would be interesting to compare mean L2 distance based on CellProfiler features as a baseline, to see whether the VAE improves performance (compared to handcrafted features).

      -It is difficult to know what the threshold for successful performance is in figures like Figure 7 and SFigure 9, but by and large, it appears that the majority of predicted combination MOAs were not successful. Without the ability to either A) adequately predict most combinations from individual profiles that were used in training or B) explain, prior to analysis, which combinations will be predictable, it is difficult to see this method being used since the combinatorial predictions are more likely not informative.

      -Authors also make the claim that they can infer toxicity and simulate the mechanism of how two compounds might react. This is a claim that would not be supported even if the method were able to successfully predict morphology or gene profiles. Drug interaction and toxicity are quite complex and goes beyond just morphology and expression. VAEs predicting a small set of features would not be able to capture information beyond the readouts, especially when dealing with potentially unseen compounds for which toxicity is not yet known. For example, two compounds might produce a morphology that appears similar to other safe compounds but has other factors that contribute to toxicity. Further, here they show no evidence of toxicity or interaction analysis.

      -The researchers justify the poor performance compared to shuffled data by saying that A) MOA annotations are noisy and unreliable and B) the MOAs may only manifest in other modalities, as was seen in the L1000 vs morphology predictability. While these might be true, knowing this, the researchers should make an effort to clean and de-noise their data and select MOAs that are well-known and reliable, as well as MOAs for which we have a known morphological or genetic reaction.

      -With the small number of combinations that are successfully predicted, to build confidence in the performance, it would be necessary to explain the reason for the differences in performance. Further experimentations should be done looking into any relationship between the type of MOAs (and their features) and the resulting A|B predictability. Looking at Figure 7, the top-performing combinations are comprised entirely of inhibitor MOAs. If the noisiness of the data is a factor, there should be some measurable correlation between feature noisiness and variation and the resulting A|B predictability from LSA.

      -To show that the methodology works well on unseen data, researchers withheld the top 5 performing A|B MOAs (SFig 9) and showed they were still well predicted. This is not the most compelling demonstration since the data to be held out was selected with bias as the top-performing samples. It would be much more interesting to withhold an MOA that was near or only somewhat above the margin of acceptability and see how many holdouts affected the predictability of those more susceptible data points. From my best interpretation, the hold-out experiment also only held out the combination MOA groups from training. It would be better if single MOAs (for example A) which were a part of a combination of MOA (A|B) were also held out to see if predictability suffered as a result and if generalizability did extend to cells with unseen MOAs (not just cells which had already highly performing combinations of seen MOAs).

      -Rather than just stating that the VAE's did not span the original data distribution and saying beta-VAE performed best by eye, some simple metrics can be drawn to analyze the overlap in data for a more direct and quantified comparison. Researchers should also explain what part of the data is not being captured here. Some analysis of what the original uncaptured UMAP represents is important in understanding the limitations of the VAEs' capacity.

      -My suggestions are realistic and feasible. The recommended tests and validations would cost no additional money (outside of researcher labor and re-training on the existing GPUs), as my recommendations are simply further analysis and training on the same data. Time would be dependent on the time required to train the VAE models, but seeing as 2-layer VAEs are relatively small for the deep learning community, time to train and analyze through existing pipelines should be minimal. This is confirmed by looking at their GitHub code, where jupyter notebooks show that models can be trained in a few minutes.

      -With access to the dataset, the posted GitHub, and documentation in the paper, I believe that the experiments are reproducible.

      -The experiments are adequately replicated statistically for conventions of deep learning.

      Significance

      My background of expertise is developing and applying deep learning and VAEs applied to single cell imaging and expression data. There is no part of this paper that I do not have sufficient expertise to evaluate.

      This paper makes a conceptually and technically unique proposal in terms of application, taking existing technologies (VAEs and LSA) and, as far as I know, using them in a novel area of application (predicting and simulating combination MOAs for compound treatments). If this work is shown to work more broadly and effectively, is seen through to its completion, and is eventually successfully implemented, it will help to evaluate the effects of drugs used in combination on gene expression and cell morphology. An audience in the realm of biological deep learning applications as well as an audience working in compound and drug testing would be interested in the results of this paper. Authors successfully place their work within the context of existing literature, referencing the numerous VAE applications that they build off of and fit into the field of (Lafarge et al, 2018; Ternes et al, 2021, etc...), citing the applications of LSA in the computer vision community (Radford et al, 2015, Goldsborough et al, 2017), and discussing the biological context that they are working in (Chandrasekaran et al, 2021).

  5. Oct 2021
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:** The manuscript submitted by Djekidel et al entitled: "CovidExpress: an interactive portal for intuitive investigation on SARS-CoV-2 related transcriptomes" reports on a new web portal to search and analyze RNAseq data related to SARS-CoV-2 infections. The authors downloaded and reprocessed data of more than 40 different studies, which is available on the web portal along with all available meta data. The web portal allows to perform numerous differential expression and gene set enrichment analyses on the data and provides publication ready figures. Because of batch effects that could not be removed, the authors do not recommend to analyze data across studies at this point. The authors conclude that the web portal is unique and will allow scientists to rapidly analyze gene expression signatures related to SARS-CoV-2 infections with the potential to make new discoveries.

      **Major comments:** Based on the scientific literature, the web portal seems to be an unprecedented resource to search and analyze SARS-CoV-2-related RNAseq data and as such would certainly be a useful resource for the SARS-CoV-2 scientific community. The authors argue that new discoveries are possible by using their web portal in providing use cases. However, the section detailing the analyses the authors did to generate new hypotheses about genes potentially relevant in SARS-CoV-2 infections are very difficult to follow and without more guidance very difficult to reproduce with the web portal. It would require substantial expert knowledge in RNAseq data analysis without more information being provided. It also seems that key candidate genes identified by their analyses have all been studied or identified to be related to SARS-CoV-2 infections, so it is somewhat unclear whether new hypotheses can be generated by the reanalysis of RNAseq datasets, especially because combining the data from different studies is currently not recommended by the authors. The manuscript would benefit from providing fewer use cases but for each of them providing more information on how the portal and which studies were used to generate them and which findings were not described in the publication of the used studies. Some observations in the manuscript are not substantiated with significance calculations (see below). At times, the English writing (grammar) should be improved.

      We thank the reviewer for the positive comments. We suppose the reviewer concluded that substantial expert knowledge in RNAseq data analysis is needed because video tutorials were lacking. We have now posted several video tutorials, and more will be added later in response to users’ feedback. We believe this will help ease the reviewer’s concern.

      Regarding whether new hypotheses can be generated: we apologize if this was not clear. For all the case studies and our “CovidExpress Reveals Insights and Potential Discoveries”, our portal has provided information not reported by the original publications, as listed below:

      1. Case study #1: The original publication employed a multi-omics approach to find predictor genes that distinguish ICU from non-ICU patients. However, it is not obvious which genes were selected mainly because of their expression levels and which because of other data the authors included (e.g., mass spectrometry data). Our portal allows users to quickly check expression levels and find that SESN2 does not show strong expression differences.
      2. Case study #2: We replaced this case study with bacterial-susceptibility genes to show that such questions can be quickly asked and answered using our portal. Such an investigation has not been reported before.
      3. FURIN’s function has been well linked to SARS-CoV-2. However, all the reports we could find focused on furin cleavage sites of SARS-CoV-2 or on whether FURIN is expressed in SARS-CoV-2-sensitive tissues. That SARS-CoV-2 infection can up-regulate FURIN expression has never been reported before. The study that published the data did not mention FURIN at all. We made this discovery simply by using the CovidExpress portal to find the differentially expressed genes and overlapping them with the literature-based gene list (Supplementary Table S2); we believe users could make more discoveries by selecting different data.
      4. A PubMed search for OASL AND "SARS-CoV-2" returns only 5 results, indicating that this gene is under-studied. None of them indicates that OASL can be up-regulated in both SARS-CoV-2-infected lung and rhinovirus-infected nasal tissue in humans.

      We are not sure whether we have correctly understood the reviewer’s suggestion of “fewer use cases”. We therefore have not removed any use cases; instead, we provide more details to help users understand which discoveries not reported by the original studies we made using CovidExpress, and how we made them.

      Finally, the manuscript has undergone substantial scientific editing to improve the grammar.

      **Minor comments:**

      Page 6 last sentence: The statement of this sentence is very much what one would expect. It remains unclear whether the authors mean this as a result to validate the processing of the RNAseq data or as a new discovery. Please, clarify.

      We apologize for the confusion. We intended this statement to be a result confirming what we had expected. We have now amended the text to make this point clearer.

      Figure 3A: The violin plots are so tiny that it is impossible to see any trends. It is also difficult to understand which categories one should compare with each other. If there is anything significant to observe, please, add a statistical test and better guide the reader.

      We agree with the reviewer; therefore, we have removed this figure from the paper. The goal of this figure was to demonstrate how to use violin plots for exploratory analysis; however, in this case, the violin plot did not show a clear trend. By using more filtering and other plots (e.g., Figure 3B-C), we believe we now provide better insight.

      Figure 3C: A legend for the color scale is missing. The signal (I guess expression amounts) for SESN2 seems very weak and the same between ICU and non-ICU samples. What is the significance for assigning this gene to the group of genes being upregulated in ICU samples? Also contrary to what the authors state on page 8, SESN2 does not seem to be highly expressed in ICU samples, however, without knowing what the colors represent (fold changes or absolute expression values?) this is somewhat speculative.

      We thank the reviewer for bringing this to our attention. We have now added a legend for the color scale in the revised figure. In Figures 3A-C, we are showcasing how an exploratory analysis can be performed using CovidExpress. As an example, we investigated the expression of the top 20 genes identified by the random forest classifier of Overmyer et al., 2021, as predictors of ICU and non-ICU cases. In the original Overmyer et al. paper, only the general performance metrics of the models are presented (Fig. 6c-g), but the authors do not show the expression patterns of the top predictors. Hence, we demonstrate how CovidExpress can be used to further investigate some questions not explored in the original paper. SESN2 was listed as a top predictor; however, its expression did not vary between ICU and non-ICU samples, as was also observed by the reviewer. We suspect SESN2 was a top predictor due to other data the Overmyer et al. paper included, such as mass spectrometry data. Our statement about SESN2 was not accurately reflected in the figure; therefore, we have rewritten this section to make it clearer.

      Page 9 first sentence: Please, specify what you mean by "starting list". Furthermore, in this paragraph, how do your results compare to the results from the study that you re-analyze here?

      We thank the reviewer for the question. By “starting list,” we meant the top genes from the Overmyer et al., 2021, article as predictors of ICU and non-ICU cases. We have now rewritten this section to make it clearer. We did not expect our results to differ from their data. Our goal was to ask which of their top predictors (by multi-omics data) show a difference in gene expression. When we downloaded their TPM values from their GEO records, the values were very similar overall (see below).

      Figure 3F: Please add labels to your axes and is there a particular reason why in a correlation plot like this one, the y and x axis are not shown with the same range and why does the y axis not start at 0?

      We thank the reviewer for this helpful comment. Our reasoning for presenting the figure in this way is that different genes can have very different expression levels but still be correlated. For example, if gene A expressed 1, 5, and 10 in samples 1,2, and 3, while gene B expressed 100, 500, and 1000 for samples 1, 2, and 3, then their range would be very different but still perfectly correlated (see panel A below). If we draw the x- and y-axes using the same range, this correlation will not be visually obvious (see panel B below).
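
      The arithmetic behind this example can be checked with a short calculation. The sketch below computes the Pearson correlation coefficient for the gene A and gene B values quoted above; the helper function is a plain textbook implementation written for illustration and is not code from the CovidExpress portal.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(variance_x * variance_y)


gene_a = [1, 5, 10]        # gene A expression in samples 1, 2, and 3 (from the text)
gene_b = [100, 500, 1000]  # gene B expression in the same samples (from the text)

r = pearson(gene_a, gene_b)  # very close to 1 despite the 100-fold difference in range
```

      Because the coefficient is unchanged when either variable is rescaled by a positive factor, the two genes are perfectly correlated even though their expression ranges differ by two orders of magnitude.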

      This comparison is different from the correlation plots that compare the expression of one gene in different samples. We apologize for the confusion and to avoid misleading readers, we have enlarged the gene names in the Figure labels to ensure that readers notice their differences. We have also added an option to the correlation plot on our portal so that users can choose the optimal format (see below).

      Page 9 second last sentence: It remains unclear which kind of analysis the authors intend to do here and what the starting question is. Please, try to rewrite with less technical terms (i.e. what do you mean by "precalculated contrasts"). In line with this, it remains unclear what Figure 3I is supposed to show. Please, provide some more information to readers who are not RNAseq analysis experts.

      We thank the reviewer for this suggestion. To avoid any misleading claims, we followed Reviewer #2’s suggestion and replaced the coagulation gene list with a filtered gene list from the “Coronavirus disease - COVID-19” KEGG pathway (hsa05171) to showcase how to identify experiments in which this gene signature is enriched or depleted. We also replaced the related figures and text with new results and rewrote this section to avoid using technical terms.

      Figure 3J is somewhat confusing. Why is the mean expression range indicated from 0 to 1 and why are all genes apparently having a mean expression of 1?

      We thank the reviewer for this question. Because the levels of expression of different genes can vary greatly, in Figure 3J (new Figure 3A and 3I), we normalized the mean expression levels of the genes to their maximum values across groups to improve the visualization. We have now made this clearer in the figure, legend, and text.
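
      As an illustration of this normalization, the sketch below divides each gene's group means by that gene's largest group mean. The gene names, group names, and values are hypothetical, and the function is an assumed reading of the description above rather than the portal's actual code.

```python
def scale_to_max(group_means):
    """Divide each group's mean expression by the largest group mean for that gene."""
    peak = max(group_means.values())
    if peak == 0:
        return {group: 0.0 for group in group_means}
    return {group: value / peak for group, value in group_means.items()}


# Hypothetical mean expression (e.g. TPM) of two genes in two sample groups.
mean_expression = {
    "geneA": {"group1": 2.0, "group2": 4.0},
    "geneB": {"group1": 900.0, "group2": 300.0},
}

scaled = {gene: scale_to_max(means) for gene, means in mean_expression.items()}

for gene, means in scaled.items():
    print(gene, means)
```

      After scaling, every gene reaches 1 in the group in which it is most highly expressed, which is why the scale runs from 0 to 1 regardless of the absolute expression level.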

      Page 10 line 5-6. Are you referring to coagulation markers here or general expression patterns? In case of the latter, how does this statement fit to the paragraph about analyzing expression patterns of coagulation markers? Please, specify. And in line with this, are the highlighted genes in Figure 3K coagulation markers? If not, what is the relevance of these to make the point that one can use the portal to investigate the role of coagulation markers in SARS-CoV-2 infections?

      As mentioned above, to avoid any misleading claims, we followed Reviewer #2’s suggestion and replaced the coagulation gene list with a filtered gene list from the “Coronavirus disease - COVID-19” KEGG pathway (hsa05171). This revision enables us to show how to identify experiments in which this gene signature is enriched or depleted. We have now replaced these figures and text with new results.

      The appearance of describing batch effects and attempts to remove them from the studies was somewhat surprising on page 10 as I would expect this kind of results rather earlier in the results section before describing use cases of the data. You may consider changing the order of your results for a better flow.

      We apologize for the confusion. However, we want to make it clear that the analysis before page 10 did not involve “batch effect”; all analyses were performed within each study. Thus, it is not necessary to change the order in which the results are presented. Also, based on Reviewer #2’s comments, we did not accurately use the term “batch effect,” because “batch effects are purely due to technical differences.” We have now revised the corresponding text to make this point clearer.

      Page 11, second paragraph. Please, explain briefly what the silhouette score is supposed to reflect and thus how Figure S4G should be interpreted. The difference of both bars in Figure S4G is very marginal and thus, does not seem to support the statement of the authors that the ssGSEA scores-based projection is better unless you perform a significance test or I misunderstood. Please, clarify.

      We thank the reviewer for this suggestion. We have now added an explanation of the silhouette score in the manuscript. Briefly, a silhouette score is a metric of the degree of separability of clusters from their nearest neighboring cluster. For a given sample, let $a$ be the mean intra-cluster distance and $b$ be the mean distance to the nearest cluster. The silhouette score (sil) is calculated as follows:

      $$\mathrm{sil} = \frac{b - a}{\max(a, b)}$$

      The silhouette score ranges between -1 and 1. A value near 1 means that the clusters are well separated, and a value near -1 means that the clusters are intermingled. Using a Wilcoxon rank test, we showed that using ssGSEA scores significantly improves the separability of global GTEx tissues (in Figure S4G; p=8.75e-26).
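
      For readers unfamiliar with the metric, the standard per-sample silhouette definition, (b - a) / max(a, b), where a is the mean intra-cluster distance and b is the mean distance to the nearest other cluster, can be sketched as follows. The coordinates are hypothetical, and Euclidean distance is an assumption; the distance measure used in the manuscript is not stated here.

```python
import math


def mean_distance(point, others):
    """Mean Euclidean distance from `point` to every point in `others`."""
    return sum(math.dist(point, other) for other in others) / len(others)


def silhouette(point, own_cluster, other_clusters):
    """Silhouette score of one sample.

    own_cluster: the other members of the sample's cluster (sample excluded).
    other_clusters: list of clusters, each a list of points.
    """
    a = mean_distance(point, own_cluster)
    b = min(mean_distance(point, cluster) for cluster in other_clusters)
    return (b - a) / max(a, b)


# Hypothetical 2-D coordinates, e.g. the first two principal components.
sample = (0.0, 0.0)
near = [(0.0, 1.0), (1.0, 0.0)]
far = [(10.0, 0.0), (0.0, 10.0)]

well_separated = silhouette(sample, own_cluster=near, other_clusters=[far])
misassigned = silhouette(sample, own_cluster=far, other_clusters=[near])

print(well_separated, misassigned)
```

      A sample that lies close to its own cluster and far from the others scores near 1, and a sample that lies closer to another cluster than to its own scores below 0.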

      Page 11, third paragraph: Figure 4B, to the best of my understanding, does not support the claim that samples clustered less according to study cohorts using the ssGSEA approach. Please, quantify the effect and test for significance or better explain.

      We apologize for the confusion. We quantified the separability between cohorts (GSE ids) by using the silhouette score. In Figure S4H (panel A below), we show that the TPM-based PCA leads to more separation by studies than does the Covid contrast ssGSEA scores in which the separation between studies is less prominent (p-value=0.0045, paired Wilcoxon test).

      For the analyses described starting on page 12 it remains largely unclear whether they were conducted across studies or within studies and which studies were used. This section until the end of the results would especially benefit from providing more information on how the analyses were performed, either in the results or in the methods section.

      We apologize for the confusion. The goal of the analysis on page 12 and the corresponding Figure 4G was to identify genes whose expression increased in both SARS-CoV-2–infected lung and rhinovirus-infected nasal tissue. Hence, we did a log2(fold-change) vs log2(fold-change) comparison. The log2(fold-change) values were calculated independently within each study. Because we compared values by using the same ranking metric, the cross-study comparison was possible, as shown in Figure 4G. We have now added more details to the Methods section to clarify this point.
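
      The logic of this comparison can be sketched as follows: fold changes are computed within each study, and only the resulting values are compared across studies. The gene names, expression values, pseudocount, and 2-fold cutoff are all hypothetical, and the simple mean-ratio estimate stands in for whatever differential-expression method was actually used.

```python
import math


def log2_fold_change(infected, control, pseudocount=1.0):
    """log2 ratio of mean expression, with a pseudocount to avoid division by zero."""
    mean_infected = sum(infected) / len(infected)
    mean_control = sum(control) / len(control)
    return math.log2((mean_infected + pseudocount) / (mean_control + pseudocount))


def study_log2fc(study):
    """Compute one log2(fold-change) per gene, using only samples from one study."""
    return {gene: log2_fold_change(inf, ctl) for gene, (inf, ctl) in study.items()}


# Hypothetical expression values: gene -> (infected samples, control samples).
lung_study = {
    "geneA": ([31, 31], [7, 7]),
    "geneB": ([15, 15], [3, 3]),
    "geneC": ([99, 99], [99, 99]),
}
nasal_study = {
    "geneA": ([63, 63], [7, 7]),
    "geneB": ([3, 3], [3, 3]),
    "geneC": ([99, 99], [99, 99]),
}

lung_fc = study_log2fc(lung_study)
nasal_fc = study_log2fc(nasal_study)

THRESHOLD = 1.0  # assumed cutoff: at least a 2-fold increase
up_in_both = {g for g in lung_fc if lung_fc[g] >= THRESHOLD and nasal_fc[g] >= THRESHOLD}
up_in_lung_only = {g for g in lung_fc if lung_fc[g] >= THRESHOLD and nasal_fc[g] < THRESHOLD}

print("up in both:", up_in_both)
print("up in lung only:", up_in_lung_only)
```

      Because each fold change is a within-study ratio, differences in overall expression scale between studies do not enter the comparison.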

      Figures 4J and 4K miss axis labels and since we look at correlations, the figures could be redrawn using the same ranges on x and y axis.

      We thank the reviewer for this suggestion. We have now added axes labels to the new figures. However, we have not used the same range on the x and y axes because they depict expression levels of different genes. For example, if gene A is expressed 1, 5, and 10 in samples 1, 2, and 3, while gene B is expressed 100, 500 and 1000 for samples 1, 2, and 3, their range would be very different but still perfectly correlated (panel A below). If we draw x and y axes using the same range, this correlation will not be visually obvious (panel B below).

      This comparison is different from the correlation plots that compare the expression of one gene in different samples. We apologize for the confusion. To avoid misleading readers, we have enlarged the gene names in the figure labels to ensure that readers notice that they are different genes. We have also added an option to the correlation plot on our portal so that users can choose the optimal format (see below).

      Page 14 line 5: Is this the right figure reference here to Figure 4G? If yes, then it is unclear how Figure 4G supports the statement in this sentence. Please, clarify.

      We apologize for the confusion. In Figure 4G, we labeled several important genes and used different colors to indicate whether the gene was regulated by SARS-CoV-2 only (purple), Rhinovirus only (black), or both (red). FURIN was significantly upregulated only by SARS-CoV-2. The data in Figure 4G were from GSE160435 (“SARS-CoV-2 infection of primary human lung epithelium for COVID-19 modeling and drug discovery”); that study used lung organoid alveolar type 2 (AT2) cells as the model. We think this confusion was caused by our failure to provide the details about the GSE160435 study. We have now amended the manuscript to include these details in the Methods section to avoid confusion. We also enlarged the gene labels in the figure to make them more visible. In the manuscript, we have changed the text from “our results found FURIN gene was also upregulated in SARS-CoV-2–infected lung organoid alveolar type 2 cells (Figure 4G, Supplementary Table S3).” to “We found that FURIN was upregulated in SARS-CoV-2-infected lung organoid alveolar type 2 cells (Figure 4G, Supplementary Table S4) (Mulay, Konda et al., 2021), it has reported that TGF-β signaling could also regulates FURIN (Blanchette, Rivard et al., 2001). Our gene enrichment analysis also found TGF-β signaling enriched only for up-regulated genes in SARS-CoV-2-infected lung cells (FDR correct p=7.58E-05, Supplementary Table S4), these observations implicated a positive feedback mechanism only for SARS-CoV-2-infected lung but not RV-infected nasal cells.”

      Figure 2 is of too low resolution. Many details cannot be read. Please, provide a higher resolution figure.

      We apologize for the inconvenience. However, we did not expect the reader to read the details in Figure 2, as it is just an overview of the CovidExpress portal. The aim is to give the reader an impression of the functions that CovidExpress offers.

      Reviewer #1 (Significance (Required)):

      Providing a single platform for the analysis of SARS-CoV-2-related RNAseq data is certainly of high value to the scientific community. However, as the portal and manuscript are currently presented, for scientists that are not RNAseq analysis specialists, more guidance would be required to understand and use correctly the functionalities of the portal. Unfortunately, because batch effects could not be removed from the studies, the authors, correctly, do not recommend to combine data from different studies for analyses, however, this likely will also limit the potential of the resource to make new discoveries beyond what the original studies have already published. As indicated above, the authors could support their claim by comparing their findings with findings published from the studies they reanalyzed. The portal is only of use to scientists studying SARS-CoV-2. I am not an expert in RNAseq data analysis and thus cannot comment on the technicalities, especially the processing of the RNAseq datasets.

      We thank the reviewer for the positive comments. We apologize for the confusion and acknowledge that we should not describe our effort using the term “batch effect.” As described by Reviewer #2 (and we agree), batch effect should be used only to indicate a purely technical difference in the same biological system; for example, differences in experiments performed on different days or by different lab personnel. Thus, we cannot correct for “batch effect” by using CovidExpress. We hope that the reviewer realizes that what we did was correct for the effect caused by differences in software and parameters across the studies. For example, in our approach, the DEGs from GSE155518 and GSE160435 (both from primary lung alveolar AT2 cells; Mulay et al., Cell Reports, 2021) were significantly correlated (panel A below; p = 1.36e-24, F-test). However, when we downloaded the TPM values from their GEO records, GSE155518 appeared to have a genome-wide decrease in expression in the SARS-CoV-2–infected samples (panel B below). We suspect that this is because the expression of the viral transcripts themselves was also considered in their data processing. Thus, using the processed data directly, without carefully reviewing the methods, might lead to false hypotheses.
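
      The suspected mechanism follows from how TPM is defined: every transcript's value is a share of a fixed total of one million, so abundant viral transcripts in the denominator lower all host values. The sketch below uses hypothetical counts and transcript lengths and illustrates the arithmetic only; it is not a reconstruction of the processing used for GSE155518.

```python
def tpm(counts, lengths):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = {name: counts[name] / lengths[name] for name in counts}
    total = sum(rates.values())
    return {name: rate / total * 1e6 for name, rate in rates.items()}


# Hypothetical read counts and transcript lengths (bp) for one infected sample.
lengths = {"hostGene1": 1000, "hostGene2": 1000, "viralRNA": 10000}
counts = {"hostGene1": 100, "hostGene2": 300, "viralRNA": 6000}

# TPM with the viral transcript included in the normalization.
with_virus = tpm(counts, lengths)

# TPM computed over host transcripts only.
host_only_counts = {name: n for name, n in counts.items() if name != "viralRNA"}
host_only = tpm(host_only_counts, lengths)

print(with_virus["hostGene1"], host_only["hostGene1"])
```

      The host read counts are identical in both calculations, yet every host gene has a lower TPM when the viral transcript is included, which would appear as a genome-wide decrease in infected samples.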

      Finally, researchers can make new discoveries, such as our OASL and FURIN findings, by using the many other features that CovidExpress provides.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Djekidel and colleagues describe a web portal to explore several SARS-CoV-2 related datasets. The authors applied a uniform reprocessing pipeline to the diverse RNA-seq datasets and integrated them into a cellxgene-based interface. The major strengths of the manuscript are the scale of the compiled data, with over one thousand samples included, and the data portal itself, which has useful visualization and analysis functions, including GSEA and DEG analysis. My primary concerns with the study are centered on the analysis examples that are presented and their interpretation, as well as the user interface for the data portal. **Major Comments:**

      1. The literature analysis feels out of place and is not informative (Fig 1E), as the conclusions that can be drawn from literature mining are minimal. In evidence of this, the authors highlight that CRP is a top-studied "gene" and later voice their interest in how CRP is not a differentially expressed gene (pg6). This illustrates the problems with the literature-based analysis, since in the context of COVID-19, CRP is a common blood laboratory measurement that is used as a general marker of inflammation. Transcription of CRP is essentially exclusively in hepatocytes as an acute phase reactant (see GTEx portal for helpful reference), and would therefore not be expected to be found in the various datasets collected by the authors. The one exception might be liver RNA-seq samples from COVID-19 patients, but I do not think these are available in the current collection. I would therefore suggest to remove the literature analysis parts from the manuscript.

      We thank the reviewer for sharing knowledge about CRP. As discussed in our manuscript, we agree that not all top genes from literature-based analysis were expected to be included in RNA-seq analysis. We apologize for the confusion, and we have amended our description to make this point clearer. However, we still believe that literature-based analyses are very useful in the following aspects:

      1. This type of analysis bridges the gap between data-driven research and hypothesis-driven research. For example, we found many genes in our meta-analysis, but it is not feasible to describe the functions of all of them. Thus, in Figure 1F, we color-coded genes in red if they also appeared as top genes in the literature-based analysis and read related manuscripts to build confidence that the meta-analysis is useful. Then we expanded our review to more top genes and found more interesting evidence (Supplementary Table S2, “TopGenesbyDifferentialAnalysis” tab).
      2. Literature-based analyses also reduce the time researchers spend prioritizing their investigations. For example, in our comparison of SARS-CoV-2–infected lung and Rhinovirus-infected nasal tissue, we found >2000 genes upregulated only in SARS-CoV-2–infected lung but not in Rhinovirus-infected nasal cells. It is not easy to derive a hypothesis from so many genes. When we overlapped the gene list with the literature-based analysis, FURIN popped up as the most well-studied gene, and we did not find any report that mentioned that SARS-CoV-2 can regulate FURIN. This raised our interest and led to a suggested mechanism in which SARS-CoV-2 could evolve to induce FURIN expression and gain superior infectivity. FURIN’s upregulation is significant but not among the top genes, in terms of fold change (>2-fold change, FDR p th by fold change). Thus, without the literature-based analysis, this observation could have easily been neglected.
      3. Such analyses help researchers to prime their hypotheses for novel findings. For example, in our comparison between SARS-CoV-2–infected lung and Rhinovirus-infected nasal tissues (Figure 4G, Supplementary Figure 5D and E), we found many upregulated genes, but OASL was not in our literature-based analysis, which indicated that it is under-studied and worth highlighting. We hope the reviewer will agree that we should retain the literature-based analysis in our paper. These analyses were not meant to be conclusive but rather a way to prioritize investigations. Finally, we removed CRP from Fig 1E and the main text to avoid confusion.
      1. The data portal, implemented through cellxgene, is accessible for non-programmers to use. However, it is very easy to end up with an "Unexpected HTTP response 400, BAD REQUEST" error, with essentially no description of the cause of the error or how to rectify it. When this occurs (and in my experience it occurs very frequently), this also forces the user to refresh the page entirely, losing any progress they may have made. I see that the authors describe this error in their FAQ page, but their answer is not very intuitive and I was unsure of what they meant: "This happens because the samples you selected doesn't contain all "Group by" you want compare for each "Split by" group. You could confirm using the "Diff. groups" buttons.".

      We apologize for the confusion. This excellent point made by the reviewer required an improvement in the software engineering, which we have now completed. We have figured out how to avoid this error and have run thorough tests to ensure that it does not appear anymore. We also added a gitter chat channel to our landing page, so that users can report if they encounter this or other errors.

      I would therefore ask that the authors provide more detailed tutorials (ideally step-by-step) on common analyses that users will want to perform, hopefully minimizing the amount of frustration that users will encounter.

      We thank the reviewer for this suggestion. We have uploaded several video tutorials to our landing page and will gradually add more. We also added a gitter chat channel, so users can ask questions, report bugs, or suggest new studies to include in the portal.

      1. Selection of samples is not very quick or intuitive. If I wanted to select only the samples from one specific GEO accession, I had to resort to individually checking the boxes of the sample IDs that I wanted. If I instead selected the GEO accession under the samples source ID, then used the "Subset to currently selected samples" button, I invariably got the HTTP error 400 message. Of course, this may simply reflect my lack of familiarity with cellxgene; I would nevertheless encourage the authors to improve the FAQ to include a step-by-step example for how to do common analyses/procedures.

      We apologize for the confusion. To select an individual GEO accession, users can simply tick the box beside “Samples Source ID.”

      This clears all the boxes under “Samples Source ID,” so that users can then select only the accession they want. We have also uploaded video tutorials to help users learn how to navigate the portal.

      We apologize for the “HTTP error 400” messages. We figured out that users would encounter that message frequently after they encounter it once due to a back-end cache mechanism. We have now improved the portal from the software-engineering side. In our recent tests of the latest version, this error does not appear anymore. We also added a gitter chat channel on our landing page so that users can report encountering this or other errors.

      1. The second case study, centered on coagulation genes, is misguided. Alteration of coagulation lab values in severe COVID-19 patients is reflecting the general inflammatory state of these patients, and would not be expected to manifest on the transcriptional level in infected cells/tissues. Coagulation labs are measuring the functional status of the coagulation cascade, which is far-removed from the direct transcription of the corresponding genes - proteolytic processing of clotting factors, etc. As with CRP (see above comment), most clotting factors are transcribed almost exclusively in the liver (check GTEx portal); I would not expect upregulation of coagulation factors in lung cell lines/organoids/cultures etc after infection with SARS-CoV-2. I would recommend the authors to pick a different gene ontology set for a case study, as the current one focusing on coagulation is confusing in a pathophysiologic sense.

      We thank the reviewer for this suggestion. To avoid any misleading claims, we have replaced the coagulation gene list with a filtered gene list from the “Coronavirus disease - COVID-19” KEGG pathway (hsa05171) to showcase how to identify experiments in which this gene signature is enriched or depleted. We also replaced Figures 3G-J with new results.

      1. The two large clusters of blood-derived samples vs other tissues is not surprising and the authors' interpretation is confusing. The authors write that "the COVID-19 signature was not able to overcome the tissue specificity and that immune cells might respond to SARS-CoV-2 differently." This should be immediately obvious given the pathophysiology of COVID-19 infection; the cell types that are directly infected by SARS-CoV-2 will of course have a distinct response compared to the circulating blood cells of COVID-19 patients, which are responding by mounting an immune response. There is no reason to expect a priori that the DEGs in the directly infected lung cells would be similar to that of immune cells that are mounting a response against the virus.

      We thank the reviewer for these comments. We agree that it should be obvious that directly infected lung cells would differ from immune cells. However, this has never been shown in a large dataset. Also, it is not obvious whether all other tissues would respond to SARS-CoV-2 differently. Thus, we believe it is important to present this overview. We have amended the description to deliver a clearer message: “This confirmed immune cells respond to SARS-CoV-2 differently from other tissues also suggested the response of most other tissues might sharing similar features.”

      1. The authors devote considerable space in the manuscript to exploring "batch effects" and trying to minimize them (pg10-11 Fig 4A-D, Fig S4). However, given that the compiled datasets are from entirely different experimental and biological systems (e.g. in vitro infection vs patient infection, different cell lines, timepoints after virus exposure, diverse tissues, varying disease severity), it is inappropriate to simply refer to all of these differences as "batch effects" alone. Usually, the term "batch effect" would refer to the same biological experiment/system (i.e. A549 cells infected with CoV vs control), but performed on different days or by different lab personnel - in other words, batch effects are purely due to technical differences. This term clearly does not apply when comparing samples from entirely different cell lines, or tissues, etc, and the authors should not keep describing these differences as batch effects that should be "corrected" out.

      We thank the reviewer for the insight. We apologize for the confusion caused by using the phrase “batch effect correction” to describe our approach. We agree that the differences between studies should not be referred to as “batch effects” and have now amended the descriptions to avoid confusion.

      Indeed, the authors themselves state that the main point of their "batch effect correction" efforts is only for PCA visualization. I therefore feel this section contributes very little to the overall manuscript, especially given the authors' own recommendation that all analyses should be performed on individual datasets (which I certainly agree with). I assume that the authors were required to provide some sort of dimensional reduction projection for the cellxgene browser, but this is more a quirk in their choice of platform for the web portal. Thus, this section of the manuscript should be deemphasized.

      We thank the reviewer for these comments and again apologize for the confusion caused by our use of the term “batch effect correction” to describe our approach. However, we believe these parts of the paper should be retained for the following reasons:

      • In practice, sample mislabeling can happen. PCA or simple clustering approaches are very useful for drawing researchers’ attention to suspicious samples, so that they can check for possible mislabeling.
      • Even within a study, one sample can be an outlier due to low or unequal sample quality. Removing outliers would help boost the significance of real findings. Without our approach, it would be harder for users to notice and remove outliers from their investigations.
      • Finally, these efforts are useful for generating hypotheses. For example, although we collected a lot of data, it is not feasible for us to read all the details in all the manuscripts published. We observed a similarity between SARS-CoV-2–infected lung samples and Rhinovirus–infected nasal samples by exploring our portal’s capabilities (Figure 3E-F). Then we read the manuscripts in which those data were published and found that our discovery was consistent with the original studies’ results. We believe these efforts are essential to help researchers generate or refine their hypotheses. As we update the database with more samples, this approach will become increasingly powerful.
        1. Given the limitations of any combined multi-dataset analyses, one very useful feature would be to conduct "meta-analyses" across multiple datasets. For instance, it would be informative to find which genes are commonly DEGs in user-selected comparisons, calculated separately for each dataset and then cross-referenced across the relevant/user-selected datasets.

      We thank the reviewer for this comment. Indeed, we agree that “meta-analyses” are useful and have now compiled Supplementary Table S2 and Figure 1F to demonstrate the commonly regulated genes. To enable user-selected comparisons across studies on our portal, we need to design a thoughtful user interface. Otherwise, the results from our portal could easily cause fatal misinterpretation. For example, GSE154613 includes samples like DMSO, Drug, SARS-CoV-2, and DMSO+SARS-CoV-2. If a user simply selected to compare SARS-CoV-2 versus Control, the results would be SARS-CoV-2 and DMSO+SARS-CoV-2 versus DMSO and Drug. Such functions need time to design and implement; therefore, we will consider this suggestion for further development of our portal.
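
      The pitfall described for GSE154613 can be made concrete with a short sketch. The keyword-matching rule below is a hypothetical one-click comparison, not the portal's implementation, and the curated contrast is only one plausible choice that would need to be confirmed against the study design.

```python
# Sample groups in GSE154613 as listed above.
groups = ["DMSO", "Drug", "SARS-CoV-2", "DMSO+SARS-CoV-2"]


def naive_split(groups, keyword="SARS-CoV-2"):
    """Hypothetical one-click comparison: everything matching the keyword vs the rest."""
    case = [g for g in groups if keyword in g]
    control = [g for g in groups if keyword not in g]
    return case, control


case, control = naive_split(groups)

# An explicit, curated contrast keeps the vehicle matched on both sides.
# Which pairs are appropriate is an assumption that depends on the study design.
curated_contrasts = [("DMSO+SARS-CoV-2", "DMSO")]

print("naive case:", case)
print("naive control:", control)
```

      The naive split pools drug-treated samples into the control arm and mixes vehicle-treated with untreated infected samples, which is why a cross-study comparison function needs a carefully designed interface.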

      **Minor comments:**

      1. Fig S1G, color legend should be added (I understand that these colors are the same from S1H).

      We thank the reviewer for the comment. We have now added information about the colors in the figure legend.

      1. Mouseover text for trackPlot on the data portal is incorrect (it says the heatmap text instead).

      We thank the reviewer for this comment. We have now corrected this bug.

      1. Abstract should be revised to describe only the 1093 final remaining RNA-seq samples after filtering/QC steps.

      We thank the reviewer for this comment. We have now amended the Abstract to include this information.

      1. Text in many figures is too small to be legible. I would suggest pt 6 font minimum for all figure text, including the various statistics in the figure panels.

      We thank the reviewer for this comment. We have now amended the font sizes and will provide high-resolution figures in revision.

      1. Are the DE analyses in Fig 1F specifically limited to control vs SARS-CoV-2/COVID-19 comparisons? Many of the samples included in this study are from other respiratory infections (labeled "other" in Fig 1B).

      We thank the reviewer for the question. Figure 1F was not originally limited to control vs SARS-CoV-2/COVID-19 comparisons, because we thought control vs virus, drug vs mock, or differences between time points would also be interesting. If we narrow the analysis to contrasts only between control and SARS-CoV-2/COVID-19, Figure 1F would still look similar (as below) because the genes in that comparison comprise the largest share of genes included in the original graphic.

      In the end, we replaced Figure 1F to avoid confusion and added more details in the Methods.

      1. The word cloud format is not conducive for understanding or interpretation. It would be much more informative to simply have a barplot or similar to clearly indicate the relative "abundance" of a given gene among all 315 DE analyses.

      We thank the reviewer for this comment but respectfully disagree with this point. Visualization of the relative “abundance” of genes with word clouds is a relatively novel concept in computational biology. However, we believe that, in this case, it has certain advantages over visualization using traditional bar plots. The word cloud format allows us to highlight genes relative to their importance, with the word “importance” being used here in the sense of combined metrics from DEGs, as shown in Figure 1F, or the frequency with which genes are mentioned/discussed in various literature sources, as shown in Figure 1E. For this purpose, the exact values will most likely not be important for most users/readers. By presenting a word cloud visualization, we allow readers to easily discern the top genes and use them in the exploration of their own data or the CovidExpress portal. However, if users want to analyze raw values, we provide in Supplementary Table S3 a full list of all genes and gene sets, which can be downloaded from our landing page (section “CovidExpress Expression Data Download”) in GMT format. Also, when we visualized the ranks of genes by using bar plots as the reviewer suggested, the results were much harder to read (as shown in the bar graph below) than simply looking at the raw data in supplementary tables.

      1. Claims of increased/decreased dataset separability should have statistical analysis on the silhouette score boxplots (Fig S4G-I).

      We thank the reviewer for the reminder. We have added statistical tests (Wilcoxon rank test) to the silhouette score boxplots referred to by the reviewer.

      1. Regarding Fig 4E-F - what are the key genes that contribute to PC1, and how do they relate to the DEGs in Fig 4G?

      We thank the reviewer for this question and apologize for the confusion. In Figure 4E-F, the PCA was based on ssGSEA scores, in which each gene set, rather than each individual gene, has a score for each sample. Thus, the top contributors to PC1 were gene sets upregulated or downregulated in certain contrasts. On the portal’s landing page, we provide detailed results for the top gene sets (for the ssGSEA approach) and genes (for the TPM approach) that contributed to various PCs (“Clustering Results for Reviewing and Download” section). This allows users to download and further explore these data.

      1. Statistics describing the relation between OASL and TNF/PPARGC1A should be included to justify the authors' statements. This could be correlation, mutual information, regression, etc.

      We thank the reviewer for this suggestion, and we have updated Figures 4J-K to show the correlation values and corresponding F-statistics. The Pearson correlation between OASL and TNF was significant (Pearson correlation = 0.75, p-value = 6.85e-72). The correlation between OASL and PPARGC1A had a negative slope but was weak and did not reach conventional significance thresholds (Pearson correlation = -0.08, p-value = 0.12), so it supports our statement only to a limited degree. We have now updated the corresponding text in the manuscript.
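
      For reference, the link between a Pearson correlation and an F-statistic can be written explicitly. The identity below is the standard one for a simple linear regression of one gene on the other, under the assumption that this is the F-test reported; $r$ is the Pearson correlation and $n$ is the number of samples, which is not stated here.

```latex
F = \frac{r^{2}\,(n - 2)}{1 - r^{2}},
\qquad
F \sim F_{1,\,n-2} \ \text{under the null hypothesis of no correlation,}
\qquad
F = t^{2} \ \text{with} \ t = r\,\sqrt{\frac{n - 2}{1 - r^{2}}}.
```

      For a fixed $n$, a large $|r|$ gives a large $F$ and a small p-value, whereas an $r$ close to zero gives an $F$ close to zero.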

      1. There are several studies now that have performed scRNA-seq on the lung resident and peripheral immune cells of COVID-19 patients. To more definitively tie in their analyses in Fig 4J-K/Fig S5D-E (to affirm "its important role in the innate immune response in lungs"), the authors should assess whether OASL is upregulated in the lung macrophages of COVID-19 patients vs controls.

      We thank the reviewer for this suggestion. Indeed, Liao et al. recently reported that “BALFs of patients with severe/critical COVID-19 infection contained higher proportions of macrophages and neutrophils and lower proportions of mDCs, pDCs, and T cells than those with moderate infection” (Nature Medicine, 2020, https://doi.org/10.1038/s41591-020-0901-9). They further refined the macrophage data into subclusters and reported the top enriched GO terms as “response to virus” (group 1), “type I interferon signaling pathway” (group 2), “neutrophil degranulation” (group 3), and “cytoplasmic translational initiation” (group 4). When we investigated their data, we found that OASL was identified as a marker gene for both group 1 and group 2, indicating that OASL might respond to the virus and contribute to type I interferon signaling. Furthermore, another data set (from Ren et al., Cell, 2021, https://dx.doi.org/10.1016%2Fj.cell.2021.01.053) showed several clusters in patients with severe COVID-19 (left panel below) that were enriched for OASL expression (right panel below).

      We have now added these observations to strengthen our hypothesis about the role of OASL.

      1. The visualization and analysis functions in the data portal appear to work reasonably well out of the box. However, the download buttons for plots did not work in my hands. I realized that a workaround is to right click -> "Save image as" (which then downloads a .svg file), but this is not ideal and should be fixed to improve usability. I had tested the data portal on both Firefox and Edge browsers, using a Windows 10 PC.

      We agree with the reviewer. Due to some technical issues with the figure javascript plugin, the download feature does not work unless the figure is saved as a file on the server side. To avoid any security issues, we tried to minimize new file generations, hence, for the moment we have disabled this feature. Users can still download high-resolution .svg figures by using the right-click -> “save image as.” This information is now included in the FAQ section on the portal’s landing page.

      Reviewer #2 (Significance (Required)): The data portal appears to have useful analysis and visualization features, and the data collection appears to be quite comprehensive. I would strongly encourage the authors to continue collecting datasets as they become available and further improving the usability of the portal. As noted in the above comments, I think there is potential for their cellxgene-based browser to be useful to non-computational biologists, but at present, the data portal is not as simple to use as it should be. With further efforts to developing step-by-step tutorials for common analysis/visualization tasks, more informative case studies, and the other revisions suggested above, this study could be a valuable resource for the community. Of note, this review is written from the perspective of a primary wet-lab biologist with extensive bioinformatics experience but limited web development expertise.

      We thank the reviewer for the positive comments. We understand the importance of data updating. Our plan is to complete quarterly updates once this manuscript has been accepted or when 10 new studies have been either collected by us or suggested by users. This information is also now included in the FAQs of the portal’s landing page. We have also uploaded several tutorial videos to the landing page and will gradually add more. We also added a gitter chat channel, so users can ask questions, report bugs, or suggest new studies to add to the database.

      **Referee Cross-commenting** I agree with the comments of the other reviewers.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:** The ongoing COVID-19 pandemic is a big threat to human health. The researchers have conducted studies to explore the gene expression regulations of human cells responding to COVID-19 infection. A website that integrating those datasets and providing user-friendly tools for gene expression analysis is a valuable resource for the COVID-19 study community. The authors collected published RNASeq datasets and developed a database and an interactive portal for users to investigate the gene expression of SARS-CoV-2 related samples. This website would be of great value for the SARS-CoV-2 research community if the batch normalization problems are solved.

      **Major comments:** 1) The major concern of CovidExpress is the batch effects from different studies. As the authors have shown and mentioned in their discussion that "For the current release, we strongly suggest investigators to perform gene expression comparison within individual study." This limits the usage of CovidExpress as integrating analysis from multiple datasets of different studies is the key value and purpose of CovidExpress.

      We thank the reviewer for the comment. Reviewer #2 reminded us, and we agree, that differences between studies should not be considered “batch effects.” We apologize for the confusion. The GSEA function provided in the portal does not suffer from batch effects, because all the pre-ranked lists of genes are based on contrasts from the same studies. Although we cannot correct for the differences between studies, we did correct for effects caused by differences in the software and parameters used. For example, in our approach, the DEGs from GSE155518 and GSE160435 (both studies of primary lung alveolar AT2 cells from Mulay et al., Cell Reports, 2021) were significantly correlated (panel A below; p-value = 1.36e-24, F-test). However, if we simply download the TPM values from their GEO records, GSE155518 appears to show a genome-wide decrease in expression in SARS-CoV-2–infected samples (panel B below). Such processing artifacts might lead to false hypotheses.

      2) The authors should include experimental protocols as one key parameter in the description and further integrating analysis of different datasets. As the authors showed that QuantSeq is a 3' sequencing protocol of RNA sequencing. However, it is not convincing to me that simply excluding QuantSeq samples is the ideal solution for downstream integrating analysis as QuantSeq has been shown that it has pretty good correlations with normal RNASeq methods in gene quantifications. It is interesting that there are 21.2% of samples were biased toward intronic reads. What protocol differences or experimental variations would explain the biases?

      We thank the reviewer for the comment and apologize for not being clearer. One of our main goals in re-processing all samples is to correct for pipeline processing–related batch effects, that is, to reduce the effects introduced by using different software or parameters. QuantSeq and similar protocols are heavily biased toward the 3’ UTR; thus, the software and parameters used for RNA-seq data are not suitable for them. We agree that the downstream results from QuantSeq correlate well with those from RNA-seq (we observed a correlation of ~0.75 when comparing the log2 fold-changes from QuantSeq with those from RNA-seq). However, we could not confirm that QuantSeq always correlates well with RNA-seq in terms of individual quantification. For example, Jarvis et al. recently reported a correlation of only ~0.35 between QuantSeq and RNA-seq (https://doi.org/10.3389/fgene.2020.562445). Theoretically, the correlation would be weaker for genes with a small 3’ UTR. Thus, we will not include QuantSeq data in this portal. However, if we collect enough studies in the future, we will consider launching a separate portal just for QuantSeq, using a pipeline optimized for protocols biased toward the 3’ UTR.

      For the 21.2% of samples that were biased toward intronic reads, we believe they reflect differences in the kits used. For example, of the 162 samples with “BASE_INTRON (%)” >30% (Supplementary Table S1) that passed QC, 76 were total RNA obtained using the SMARTer kit and 36 were total RNA obtained using the Trio kit. Given that we have 105 samples of total RNA derived using the SMARTer kit and 38 samples of total RNA derived using the Trio kit, we conclude that the Trio kit was more biased toward introns, and the SMARTer kit was also strongly biased. This finding is consistent with those of others who have reported the bias of the SMARTer kit (Song et al., https://doi.org/10.1186/s12864-018-5066-2). Users can find these results in our Supplementary Table S1. We have also uploaded the protocol information to our portal.
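
      The comparison between kits can be made explicit with a short calculation. The sketch below uses only the counts quoted in this reply and assumes that the numerators (samples with “BASE_INTRON (%)” >30%) and the denominators (all total-RNA samples per kit) refer to the same QC-passed sample set; the variable and function names are hypothetical.

```python
# Counts quoted in the reply: (intron-biased samples, all total-RNA samples).
# Assumption: both numbers per kit are drawn from the same QC-passed set.
kit_counts = {
    "SMARTer": (76, 105),
    "Trio": (36, 38),
}


def intron_biased_fraction(biased, total):
    """Fraction of a kit's samples with BASE_INTRON (%) above 30%."""
    if total <= 0 or not 0 <= biased <= total:
        raise ValueError("counts must satisfy 0 <= biased <= total, total > 0")
    return biased / total


fractions = {
    kit: intron_biased_fraction(biased, total)
    for kit, (biased, total) in kit_counts.items()
}

for kit, fraction in fractions.items():
    print(f"{kit}: {fraction:.1%} of samples are intron-biased")
```

      Under that assumption, roughly 72% of SMARTer samples and roughly 95% of Trio samples exceed the 30% threshold, which matches the ordering stated in the reply.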

      3) How do the authors plan to update and maintain CovidExpress?

      We thank the reviewer for this question. We understand the importance of data updating. Our plan is to update the database quarterly once this manuscript has been accepted or when 10 new studies have been collected by us or suggested by users. We have added this information to the FAQs on the portal’s landing page. We also understand the importance of maintaining the service for a feasible amount of time for research. Therefore, we will keep the server activated for at least 2 years after the WHO announces that COVID-19 is no longer a global pandemic. We will also ensure that, even after we take down the server, scientists with programming skills will be able to create local servers based on the data provided on CovidExpress.

      **Minor comments:** 1) Some texts in figures are not readable. For example, Fig2B, 2C, 2D, 2E.

      We thank the reviewer for this comment. We have now increased the font sizes and provided high-resolution figures in the revision.

      2) The authors could use Videos to demonstrate how to use CovidExpress on the website as they have shown in Fig3.

      We thank the reviewer for this suggestion. We have uploaded several video tutorials to the landing page and will gradually add more. We also added a gitter chat channel so that users can ask questions, report bugs, or suggest new studies to include in the database.

      Reviewer #3 (Significance (Required)): The ongoing COVID-19 pandemic is a big threat to human health. Many molecular and cellular questions related to COVID-19 pathophysiology remain unclear and many researchers have conducted studies to explore the gene expression regulations of human cells responding to COVID-19 infection. However, there is no database/website that integrating all RNASeq data to provide user-friendly tools for gene expression analysis for COVID-19 researchers. The authors collected the published RNASeq datasets and developed a database and an interactive portal, named CovidExpress, to allow users to investigate the gene expressions response to COVID-19 infection. CovidExpress is a valuable resource for the COVID-19 study community once the batch normalization problems are solved. The users who came up with ideas about the regulation of COVID-19 response could use the system to test their hypothesis, without experience in bioinformatics and RNASeq data analysis. This will be more important when more RNASeq data from samples with different tissues, cell lines, and conditions are integrated into the database.

      We thank the reviewer for the positive comments. We apologize for the confusion and acknowledge that we should not describe our effort using the term “batch effect.” As described by Reviewer #2 (and we agree), batch effect should be used only to indicate a purely technical difference in the same biological system; for example, differences in experiments performed on different days or by different lab personnel. Thus, we cannot correct for “batch effect” by using CovidExpress. We hope it is now clear that what we did was to correct for the effects caused by differences in software and parameters across the studies. For example, in our approach, the DEGs from GSE155518 and GSE160435 (both studies of primary lung alveolar AT2 cells from Mulay et al., Cell Reports, 2021) were significantly correlated (panel A below; p = 1.36e-24, F-test). However, when we downloaded the TPM values from their GEO records, GSE155518 appeared to have a genome-wide decrease in expression in SARS-CoV-2–infected samples (panel B below).

      Thus, using the processed data directly without carefully reviewing the methods might lead to false hypotheses. Finally, researchers can make new discoveries, such as our OASL and FURIN findings, by using the many other features that CovidExpress provides.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The ongoing COVID-19 pandemic is a big threat to human health. The researchers have conducted studies to explore the gene expression regulations of human cells responding to COVID-19 infection. A website that integrating those datasets and providing user-friendly tools for gene expression analysis is a valuable resource for the COVID-19 study community. The authors collected published RNASeq datasets and developed a database and an interactive portal for users to investigate the gene expression of SARS-CoV-2 related samples. This website would be of great value for the SARS-CoV-2 research community if the batch normalization problems are solved.

      Major comments:

      1) The major concern of CovidExpress is the batch effects from different studies. As the authors have shown and mentioned in their discussion that "For the current release, we strongly suggest investigators to perform gene expression comparison within individual study." This limits the usage of CovidExpress as integrating analysis from multiple datasets of different studies is the key value and purpose of CovidExpress.

      2) The authors should include experimental protocols as one key parameter in the description and further integrating analysis of different datasets. As the authors showed that QuantSeq is a 3' sequencing protocol of RNA sequencing. However, it is not convincing to me that simply excluding QuantSeq samples is the ideal solution for downstream integrating analysis as QuantSeq has been shown that it has pretty good correlations with normal RNASeq methods in gene quantifications. It is interesting that there are 21.2% of samples were biased toward intronic reads. What protocol differences or experimental variations would explain the biases?

      3) How do the authors plan to update and maintain CovidExpress?

      Minor comments:

      1) Some texts in figures are not readable. For example, Fig2B, 2C, 2D, 2E.

      2) The authors could use Videos to demonstrate how to use CovidExpress on the website as they have shown in Fig3.

      Significance

      The ongoing COVID-19 pandemic is a big threat to human health. Many molecular and cellular questions related to COVID-19 pathophysiology remain unclear and many researchers have conducted studies to explore the gene expression regulations of human cells responding to COVID-19 infection. However, there is no database/website that integrating all RNASeq data to provide user-friendly tools for gene expression analysis for COVID-19 researchers. The authors collected the published RNASeq datasets and developed a database and an interactive portal, named CovidExpress, to allow users to investigate the gene expressions response to COVID-19 infection. CovidExpress is a valuable resource for the COVID-19 study community once the batch normalization problems are solved. The users who came up with ideas about the regulation of COVID-19 response could use the system to test their hypothesis, without experience in bioinformatics and RNASeq data analysis. This will be more important when more RNASeq data from samples with different tissues, cell lines, and conditions are integrated into the database.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Djekidel and colleagues describe a web portal to explore several SARS-CoV-2 related datasets. The authors applied a uniform reprocessing pipeline to the diverse RNA-seq datasets and integrated them into a cellxgene-based interface. The major strengths of the manuscript are the scale of the compiled data, with over one thousand samples included, and the data portal itself, which has useful visualization and analysis functions, including GSEA and DEG analysis. My primary concerns with the study are centered on the analysis examples that are presented and their interpretation, as well as the user interface for the data portal.

      Major Comments:

      1. The literature analysis feels out of place and is not informative (Fig 1E), as the conclusions that can be drawn from literature mining are minimal. In evidence of this, the authors highlight that CRP is a top-studied "gene" and later voice their interest in how CRP is not a differentially expressed gene (pg6). This illustrates the problems with the literature-based analysis, since in the context of COVID-19, CRP is a common blood laboratory measurement that is used as a general marker of inflammation. Transcription of CRP is essentially exclusively in hepatocytes as an acute phase reactant (see GTEx portal for helpful reference), and would therefore not be expected to be found in the various datasets collected by the authors. The one exception might be liver RNA-seq samples from COVID-19 patients, but I do not think these are available in the current collection. I would therefore suggest to remove the literature analysis parts from the manuscript.
      2. The data portal, implemented through cellxgene, is accessible for non-programmers to use. However, it is very easy to end up with an "Unexpected HTTP response 400, BAD REQUEST" error, with essentially no description of the cause of the error or how to rectify it. When this occurs (and in my experience it occurs very frequently), this also forces the user to refresh the page entirely, losing any progress they may have made. I see that the authors describe this error in their FAQ page, but their answer is not very intuitive and I was unsure of what they meant: "This happens because the samples you selected doesn't contain all "Group by" you want compare for each "Split by" group. You could confirm using the "Diff. groups" buttons.".

      I would therefore ask that the authors provide more detailed tutorials (ideally step-by-step) on common analyses that users will want to perform, hopefully minimizing the amount of frustration that users will encounter.

      1. Selection of samples is not very quick or intuitive. If I wanted to select only the samples from one specific GEO accession, I had to resort to individually checking the boxes of the sample IDs that I wanted. If I instead selected the GEO accession under the samples source ID, then used the "Subset to currently selected samples" button, I invariably got the HTTP error 400 message. Of course, this may simply reflect my lack of familiarity with cellxgene; I would nevertheless encourage the authors to improve the FAQ to include a step-by-step example for how to do common analyses/procedures.
      2. The second case study, centered on coagulation genes, is misguided. Alteration of coagulation lab values in severe COVID-19 patients is reflecting the general inflammatory state of these patients, and would not be expected to manifest on the transcriptional level in infected cells/tissues. Coagulation labs are measuring the functional status of the coagulation cascade, which is far-removed from the direct transcription of the corresponding genes - proteolytic processing of clotting factors, etc. As with CRP (see above comment), most clotting factors are transcribed almost exclusively in the liver (check GTEx portal); I would not expect upregulation of coagulation factors in lung cell lines/organoids/cultures etc after infection with SARS-CoV-2. I would recommend the authors to pick a different gene ontology set for a case study, as the current one focusing on coagulation is confusing in a pathophysiologic sense.
      3. The two large clusters of blood-derived samples vs other tissues is not surprising and the authors' interpretation is confusing. The authors write that "the COVID-19 signature was not able to overcome the tissue specificity and that immune cells might respond to SARS-CoV-2 differently." This should be immediately obvious given the pathophysiology of COVID-19 infection; the cell types that are directly infected by SARS-CoV-2 will of course have a distinct response compared to the circulating blood cells of COVID-19 patients, which are responding by mounting an immune response. There is no reason to expect a priori that the DEGs in the directly infected lung cells would be similar to that of immune cells that are mounting a response against the virus.
      4. The authors devote considerable space in the manuscript to exploring "batch effects" and trying to minimize them (pg10-11 Fig 4A-D, Fig S4). However, given that the compiled datasets are from entirely different experimental and biological systems (e.g. in vitro infection vs patient infection, different cell lines, timepoints after virus exposure, diverse tissues, varying disease severity), it is inappropriate to simply refer to all of these differences as "batch effects" alone. Usually, the term "batch effect" would refer to the same biological experiment/system (i.e. A549 cells infected with CoV vs control), but performed on different days or by different lab personnel - in other words, batch effects are purely due to technical differences. This term clearly does not apply when comparing samples from entirely different cell lines, or tissues, etc, and the authors should not keep describing these differences as batch effects that should be "corrected" out.

      Indeed, the authors themselves state that the main point of their "batch effect correction" efforts is only for PCA visualization. I therefore feel this section contributes very little to the overall manuscript, especially given the authors' own recommendation that all analyses should be performed on individual datasets (which I certainly agree with). I assume that the authors were required to provide some sort of dimensional reduction projection for the cellxgene browser, but this is more a quirk in their choice of platform for the web portal. Thus, this section of the manuscript should be deemphasized.

      1. Given the limitations of any combined multi-dataset analyses, one very useful feature would be to conduct "meta-analyses" across multiple datasets. For instance, it would be informative to find which genes are commonly DEGs in user-selected comparisons, calculated separately for each dataset and then cross-referenced across the relevant/user-selected datasets.

      Minor comments:

      1. Fig S1G, color legend should be added (I understand that these colors are the same from S1H).
      2. Mouseover text for trackPlot on the data portal is incorrect (it says the heatmap text instead).
      3. Abstract should be revised to describe only the 1093 final remaining RNA-seq samples after filtering/QC steps.
      4. Text in many figures is too small to be legible. I would suggest pt 6 font minimum for all figure text, including the various statistics in the figure panels.
      5. Are the DE analyses in Fig 1F specifically limited to control vs SARS-CoV-2/COVID-19 comparisons? Many of the samples included in this study are from other respiratory infections (labeled "other" in Fig 1B).
      6. The word cloud format is not conducive for understanding or interpretation. It would be much more informative to simply have a barplot or similar to clearly indicate the relative "abundance" of a given gene among all 315 DE analyses.
      7. Claims of increased/decreased dataset separability should have statistical analysis on the silhouette score boxplots (Fig S4G-I).
      8. Regarding Fig 4E-F - what are the key genes that contribute to PC1, and how do they relate to the DEGs in Fig 4G?
      9. Statistics describing the relation between OASL and TNF/PPARGC1A should be included to justify the authors' statements. This could be correlation, mutual information, regression, etc.
      10. There are several studies now that have performed scRNA-seq on the lung resident and peripheral immune cells of COVID-19 patients. To more definitively tie in their analyses in Fig 4J-K/Fig S5D-E (to affirm "its important role in the innate immune response in lungs"), the authors should assess whether OASL is upregulated in the lung macrophages of COVID-19 patients vs controls.
      11. The visualization and analysis functions in the data portal appear to work reasonably well out of the box. However, the download buttons for plots did not work in my hands. I realized that a workaround is to right click -> "Save image as" (which then downloads a .svg file), but this is not ideal and should be fixed to improve usability. I had tested the data portal on both Firefox and Edge browsers, using a Windows 10 PC.

      Significance

      The data portal appears to have useful analysis and visualization features, and the data collection appears to be quite comprehensive. I would strongly encourage the authors to continue collecting datasets as they become available and further improving the usability of the portal. As noted in the above comments, I think there is potential for their cellxgene-based browser to be useful to non-computational biologists, but at present, the data portal is not as simple to use as it should be. With further efforts to developing step-by-step tutorials for common analysis/visualization tasks, more informative case studies, and the other revisions suggested above, this study could be a valuable resource for the community. Of note, this review is written from the perspective of a primary wet-lab biologist with extensive bioinformatics experience but limited web development expertise.

      Referee Cross-commenting

      I agree with the comments of the other reviewers.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript submitted by Djekidel et al entitled: "CovidExpress: an interactive portal for intuitive investigation on SARS-CoV-2 related transcriptomes" reports on a new web portal to search and analyze RNAseq data related to SARS-CoV-2 infections. The authors downloaded and reprocessed data of more than 40 different studies, which is available on the web portal along with all available meta data. The web portal allows to perform numerous differential expression and gene set enrichment analyses on the data and provides publication ready figures. Because of batch effects that could not be removed, the authors do not recommend to analyze data across studies at this point. The authors conclude that the web portal is unique and will allow scientists to rapidly analyze gene expression signatures related to SARS-CoV-2 infections with the potential to make new discoveries.

      Major comments:

      Based on the scientific literature, the web portal seems to be an unprecedented resource to search and analyze SARS-CoV-2-related RNAseq data and as such would certainly be a useful resource for the SARS-CoV-2 scientific community. The authors argue that new discoveries are possible by using their web portal in providing use cases. However, the section detailing the analyses the authors did to generate new hypotheses about genes potentially relevant in SARS-CoV-2 infections are very difficult to follow and without more guidance very difficult to reproduce with the web portal. It would require substantial expert knowledge in RNAseq data analysis without more information being provided. It also seems that key candidate genes identified by their analyses have all been studied or identified to be related to SARS-CoV-2 infections, so it is somewhat unclear whether new hypotheses can be generated by the reanalysis of RNAseq datasets, especially because combining the data from different studies is currently not recommended by the authors. The manuscript would benefit from providing fewer use cases but for each of them providing more information on how the portal and which studies were used to generate them and which findings were not described in the publication of the used studies. Some observations in the manuscript are not substantiated with significance calculations (see below). At times, the English writing (grammar) should be improved.

      Minor comments:

      Page 6 last sentence: The statement of this sentence is very much what one would expect. It remains unclear whether the authors mean this as a result to validate the processing of the RNAseq data or as a new discovery. Please, clarify.

      Figure 3A: The violin plots are so tiny that it is impossible to see any trends. It is also difficult to understand which categories one should compare with each other. If there is anything significant to observe, please, add a statistical test and better guide the reader.

      Figure 3C: A legend for the color scale is missing. The signal (I guess expression amounts) for SESN2 seems very weak and the same between ICU and non-ICU samples. What is the significance for assigning this gene to the group of genes being upregulated in ICU samples? Also contrary to what the authors state on page 8, SESN2 does not seem to be highly expressed in ICU samples, however, without knowing what the colors represent (fold changes or absolute expression values?) this is somewhat speculative.

      Page 9 first sentence: Please, specify what you mean by "starting list". Furthermore, in this paragraph, how do your results compare to the results from the study that you re-analyze here?

      Figure 3F: Please add labels to your axes and is there a particular reason why in a correlation plot like this one, the y and x axis are not shown with the same range and why does the y axis not start at 0?

      Page 9 second last sentence: It remains unclear which kind of analysis the authors intend to do here and what the starting question is. Please, try to rewrite with less technical terms (i.e. what do you mean by "precalculated contrasts"). In line with this, it remains unclear what Figure 3I is supposed to show. Please, provide some more information to readers who are not RNAseq analysis experts.

      Figure 3J is somewhat confusing. Why is the mean expression range indicated from 0 to 1 and why are all genes apparently having a mean expression of 1? Page 10 line 5-6. Are you referring to coagulation markers here or general expression patterns? In case of the latter, how does this statement fit to the paragraph about analyzing expression patterns of coagulation markers? Please, specify. And in line with this, are the highlighted genes in Figure 3K coagulation markers? If not, what is the relevance of these to make the point that one can use the portal to investigate the role of coagulation markers in SARS-CoV-2 infections?

      The appearance of describing batch effects and attempts to remove them from the studies was somewhat surprising on page 10 as I would expect this kind of results rather earlier in the results section before describing use cases of the data. You may consider changing the order of your results for a better flow. Page 11, second paragraph. Please, explain briefly what the silhouette score is supposed to reflect and thus how Figure S4G should be interpreted. The difference of both bars in Figure S4G is very marginal and thus, does not seem to support the statement of the authors that the ssGSEA scores-based projection is better unless you perform a significance test or I misunderstood. Please, clarify.

      Page 11, third paragraph: Figure 4B, to the best of my understanding, does not support the claim that samples clustered less according to study cohorts using the ssGSEA approach. Please, quantify the effect and test for significance or better explain.

      For the analyses described starting on page 12 it remains largely unclear whether they were conducted across studies or within studies and which studies were used. This section until the end of the results would especially benefit from providing more information on how the analyses were performed, either in the results or in the methods section.

      Figures 4J and 4K miss axis labels and since we look at correlations, the figures could be redrawn using the same ranges on x and y axis.

      Page 14 line 5: Is this the right figure reference here to Figure 4G? If yes, then it is unclear how Figure 4G supports the statement in this sentence. Please, clarify. Figure 2 is of too low resolution. Many details cannot be read. Please, provide a higher resolution figure.

      Significance

      Providing a single platform for the analysis of SARS-CoV-2-related RNAseq data is certainly of high value to the scientific community. However, as the portal and manuscript are currently presented, for scientists that are not RNAseq analysis specialists, more guidance would be required to understand and use correctly the functionalities of the portal. Unfortunately, because batch effects could not be removed from the studies, the authors, correctly, do not recommend to combine data from different studies for analyses, however, this likely will also limit the potential of the resource to make new discoveries beyond what the original studies have already published. As indicated above, the authors could support their claim by comparing their findings with findings published from the studies they reanalyzed. The portal is only of use to scientists studying SARS-CoV-2. I am not an expert in RNAseq data analysis and thus cannot comment on the technicalities, especially the processing of the RNAseq datasets.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2021-01024

      Corresponding author(s): Martin Spiess

      1. Description of the planned revisions — point-by-point response


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Apart from the default constitutive pathway for protein secretion some specialized cells (e.g., neuroendocrine cells, exocrine cells, peptidergic neurons and mast cells) exhibit additional regulated secretory pathway, where peptide hormones are stored as highly concentrated ordered manner inside electron opaque "dense core" of secretory granule for long duration until secretagogue mediated burst release. Although the general sorting receptor for packaging hormones in secretory granules is not yet identified, self-aggregation in the trans-Golgi network is a common shared property of peptide hormones and is a well-accepted potential sorting mechanism. Here the authors have hypothesized that cysteine containing small disulphide loop (CC loop), which is abundant in several hormone precursors, acts as aggregation mediator in TGN for sorting into secretory granule. They have tested the aggregation propensity of a misfolded reporter protein, NPΔ, in ER by attaching the CC loop segment of different hormones which promoted the pathological aggregation in endoplasmic reticulum (ER) of mutant provasopressin in the case of diabetes insipidus. Immunofluorescence and immunogold electron microscopy revealed accumulation of aggregates in the ER when CC loop of different hormonal origin fused NPΔ was transiently transfected in COS-1 fibroblasts and Neuro-2a neuroblastoma cells. The authors have also shown small disulphide loop mediated functional aggregation in TGN can sort a constitutively secreted protein, α1-protease inhibitor, into the secretory granule. The rerouting capacity of CC loop was tested in stably expressed AtT-20 cell line by confirming their localization with CgA-positive secretory granule as well as by studying BaCl2 mediated stimulated secretion and by testing secretory granule specific lubrol insolubility.

      **Major comments:**

      The study is highly impressive, and the results fully support the CC loop mediated hormone sorting hypothesis. However, it would be nice if the authors characterize the nature of the CC-loop mediated aggregates as hormones are reported to be stored inside secretory granules as functional amyloid (Maji et al., 2009). The mechanistic reason behind the small disulfide loop mediated aggregation was not explained in the paper. Authors may propose the probable molecular reasons behind CC loop mediated aggregation to completely justify their hypothesis.

      Although the hypothesis and the experimental results are highly impressive, the authors may consider adding the following experiments.

      The authors replaced CC-loop by the proline/glycine repeat sequence (Pro1) as a negative control which was previously reported to abolish aggregation as well. However, the authors may completely delete the small loop forming segment, CCv, and may check the status of His-tagged fused neurophysin II (NPΔ) segment as an additional negative control.

      We plan to use a NP∆ construct completely lacking any N-terminal extension as a further negative control, as proposed by the reviewer.

      To find the ultrastructure authors have done immunogold assay with anti-His antibody which indicated different CC loop mediated ER aggregation. Since the amyloid-like fibril nature of pro-vasopressin mutant mediated ER aggregates was previously reported (Beuret et al., 2017), authors must check the nature of the CC loop mediated ER aggregates with amyloid specific antibody.

      We will test staining ER aggregates of our CC loop–NP∆ constructs with anti-amyloid antibodies. A caveat is that CC loops cannot form a classical cross-β structure (strict β-sheets) because of the ring closure – which is why we suggest their aggregation to be "amyloid-like". These structures may not be recognized by anti-amyloid antibodies.

      Since hormones are known to form reversible functional amyloid during their storage inside secretory granule, authors may consider characterizing the nature of the aggregates formed by CC loop fused constitutive protein in AtT-20 cell line by immunostaining, immunoprecipitation and dot blot assay using amyloid specific antibody.

      Endogenous AtT20 granules are expected to be positive for amyloid stains or antibodies anyway (if the size and mass of the granules is sufficient for detection; Maji et al. used pituitary tissue and purified granules).

      **Minor comments:**

      In the quantification study (Figure 2C) CCc and CCr showed almost similar ER aggregates (around 40%). But authors have commented that all constructs except CCc produce statistically significant increases in cells compared to background. Authors must clarify the statement.

      CCc also increased, but in a statistically not significant manner (p = 0.08). We will change the sentence to: "It confirmed the ability of all constructs to produce an increase of cells with aggregates above background in COS-1 cells (Figure 2C), although not statistically significant for CCc (p = 0.08)."

      In lubrol insolubility assay, the otherwise constitutively secreted protein A1Pimyc (negative control) showed 23% insolubility. The authors explained the observation by commenting about trapping of the protein inside granule aggregate. But CCv and CCa fused proteins showed a very slight increase (around 30%). Only CCc construct showed more than 40% insolubility. If the trapping of constitutive protein may result in 23% insolubility, all the insolubility data except CCc is not satisfactory to claim as secretory granular content of aggregated protein. The authors must explain that.

      Lubrol insolubility is an empirical assay with high specificity for Golgi/post-Golgi forms, but with a relatively high background that we suggest to be due to trapping. Interpretation is based on statistical analysis of several independent experiments. It supports the conclusion of the other assays from an independent angle.

      We present the data of the paired t-test

      The authors have satisfactorily referenced prior studies in the field. However, authors may consider adding the following papers as they are directly connected with the hypothesis. The sorting of POMC hormone into secretory granules by disulphide loop was previously studied (Cool et al., 1995). The N-terminal loop segment was also previously used to reroute a constitutive protein, chloramphenicol acetyltransferase (Tam et al., 1993). S. K. Maji and coworkers had previously shown, using cyclic somatostatin as a model peptide, that the disulphide bond maintains the native reversible functional amyloid structure relevant to hormone storage inside secretory granules, whereas disulphide bond disruption led to rapid irreversible amyloid aggregation (Anoop et al., 2014).

      We will be happy to add these references (Anoop et al., 2014, is already discussed in the text).

      Authors must check grammar and may reconstruct a few sentences where sentence construction seems complicated.

      We will go through the text to improve readability.

      Reviewer #1 (Significance (Required)):

      This manuscript has a significant contribution to enrich academia with fundamental research knowledge of hormone sorting mechanisms. Although constitutive and regulated secretory pathways are known for long times, the exact sorting mechanism is not yet elucidated. There is no common receptor identified yet for recruiting regulated secretary proteins inside the secretory granules.

      Aggregation in the TGN is a well-accepted mechanism for sorting. However, the triggering factor for aggregation is not yet known. This study has shed light on a novel hypothesis, which has considered intramolecular disulfide bond mediated small CC loop in hormone may act as aggregation mediator. Since many regulated secretory proteins contain the short disulphide loop, the hypothesis proposed in the manuscript is interesting.

      It has been confirmed that TGN is the last compartment which is common to both regulated and constitutive pathways (Kelly, 1985). There is no sorting mechanism required for the constitutive one as this is the default mechanism, whereas a regulated secretory pathway requires a specific sorting mechanism to be efficiently packaged in the secretory granules. There are two popular hypotheses about protein sorting in regulated secretory pathways. They are "sorting for entry" and "sorting for retention" (Blázquez and Shennan, 2000). In "sorting for entry" hormones destined to go to the regulated secretory pathway start to form aggregates in the TGN specific environment excluding other proteins destined to go to the constitutive pathway. Arvan and Castle proposed the second mechanism as some hormones, like proinsulin, are initially packaged with lysosomal enzymes in immature secretory granules (ISG) (Arvan and Castle, 1998). But with time they start to aggregate and lysosomal enzymes are removed from ISG by small constitutive-like vesicles. Although in both mechanisms aggregation is an essential sorting criterion, the molecular events that lead to aggregation are not yet elucidated. TGN specific environmental conditions including pH (around 6.5), divalent metal ions (Zn2+, Cu2+), and glycosaminoglycans (GAGs) have potential to trigger aggregation (Dannies, 2012). Though each hormone has aggregation prone regions in the amino acid sequence, there is no common amino acid sequence responsible for aggregation. The authors in this manuscript have pointed out an interesting observation that many hormones contain small disulfide loops which are exposed due to their presence in N or C terminal or close to the processing site. Based on their observation, they hypothesized CC loop may act as aggregation driver for hormone sorting. In-cell study with CC construct from different hormones successfully rerouted a constitutively secretory protein into the regulated pathway which supported their novel hypothesis.

      However, the hypothesis raises some questions to be answered regarding the molecular mechanism of CC loop mediated aggregation. Why does CC-loop promote aggregation? Does the amino acid sequence, size of the loop play a role in aggregation? The granular structure shown in the manuscript from different CC loops has different size and shape (Figure 2 and 3). What is the reason for the structural heterogeneity of the CC loop mediated dense core? Since authors have shown CC loop mediated aggregation both in functional as well as in diseased aggregation, a very important aspect to address would be the structure-function relationship of the aggregates. Since authors have rightly pointed out that not all hormones or prohormones contain CC loop, another curious question would be about the sorting mechanism of those without CC loop. The best part of the study is that it has tried to explain the well-established aggregation mediated sorting mechanism from a new perspective, which raises room for many questions to be addressed by further research.

      These are very valid questions, but beyond the scope of this study in which we address the contribution of CC loops in a cellular context. This is a novel extension to published in vitro studies, where a few CC loop proteins (vasopressin, oxytocin, somatostatin-14) have already been shown to enable amyloid(-like) aggregation in vitro.

      From this study, the audience will get to know about the role of small disulphide loop in functional and diseased associated protein/peptide aggregation. The audience will also get an idea about the sorting mechanism in the regulated secretory pathway from the study. According to my expertise and knowledge where I do protein aggregation related to human diseases and hormone storage, I see this manuscript is a fantastic addition to understand the secretory granules biogenesis of hormones with storage and subsequent release.

      References:

      Maji, Samir K., et al. "Functional amyloids as natural storage of peptide hormones in pituitary secretory granules." Science 325.5938 (2009): 328-332.

      Beuret, Nicole, et al. "Amyloid-like aggregation of provasopressin in diabetes insipidus and secretory granule sorting." BMC biology 15.1 (2017): 1-14.

      Cool, David R., et al. "Identification of the sorting signal motif within pro-opiomelanocortin for the regulated secretory pathway." Journal of Biological Chemistry 270.15 (1995): 8723-8729.

      Tam, W. W., K. I. Andreasson, and Y. Peng Loh. "The amino-terminal sequence of pro-opiomelanocortin directs intracellular targeting to the regulated secretory pathway." European journal of cell biology 62.2 (1993): 294-306.

      Anoop, Arunagiri, et al. "Elucidating the Role of Disulfide Bond on Amyloid Formation and Fibril Reversibility of Somatostatin-14: RELEVANCE TO ITS STORAGE AND SECRETION." Journal of Biological Chemistry 289.24 (2014): 16884-16903.

      Kelly, Regis B. "Pathways of protein secretion in eukaryotes." Science 230.4721 (1985): 25-32.

      Blázquez, Mercedes, and Kathleen I. Shennan. "Basic mechanisms of secretion: sorting into the regulated secretory pathway." Biochemistry and Cell Biology 78.3 (2000): 181-191.

      Arvan, Peter, and David Castle. "Sorting and storage during secretory granule biogenesis: looking backward and looking forward." Biochemical Journal 332.3 (1998): 593-610.

      Dannies, Priscilla S. "Prolactin and growth hormone aggregates in secretory granules: the need to understand the structure of the aggregate." Endocrine reviews 33.2 (2012): 254-270.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      This manuscript by Reck and colleagues aims at determining the importance of short disulfide loops for the correct sorting to, and release from, secretory granules. They utilize hybrid secretory proteins where sequences encoding disulfide loops from different hormones are cloned in frame with the same secretory peptide, and assess how the presence of the disulfide loop affects the ability of the protein to aggregate in the ER and to get sorted for secretion. By immunofluorescence analysis they show that the presence of a disulfide loop increases the ability of the peptide hormone to form aggregates in the ER, and these observations are confirmed by immunogold-EM. Importantly, aggregate formation is seen both in professional secretory (N-2a) and non-secretory (COS-1) cells. Using immunofluorescence and quantitative immunoblotting, they also show that the ability of the secretory proteins to aggregate coincides with increased localization to secretory granules and with increased release from cells in response to stimuli.

      The results from this study are interesting and suggest that small disulfide loops may be an important part of the cargo sorting mechanism in secretory cells, and perhaps also a cause of sorting defects in certain diseases. The study is overall well conducted and worthy of publication after revision.

      **Major comments:**

      1) It is unclear to me what the relationship between the CC-loop and amyloid is. They are not involved in the formation of fibrils and amyloid, yet the authors conclude that they support the amyloid hypothesis of granule biogenesis. This must be clarified.

      Maji et al. (2009) concluded in their Science paper that secretory granules of the pituitary are made of functional amyloids formed by the protein hormones themselves. Evidence for this is that many purified protein hormones formed fibrillar aggregates in vitro with amyloid characteristics. Among the hormones analyzed were 4 CC loop-containing ones: vasopressin, oxytocin, somatostatin-14 (these are just the CC loop segments of the respective precursors), and full-length prolactin (199 aa, containing an N- and a C-terminal CC loop). Amyloid formation of somatostatin-14 was further analyzed in vitro with and without the disulfide bond by Anoop et al. (2014). On the tissue level, it was only shown that granules are stained by amyloid dyes (Maji et al., 2009). Our own lab found that folding-deficient mutant forms of provasopressin formed fibrillar aggregates in vitro (Birk et al., 2009) and in the ER of expressing cells (Birk et al., 2009; Beuret et al., 2011). These ER aggregates likely represent mislocalized amyloid formation that normally happens at the TGN for granule sorting.

      In the present study, we therefore tested the role of different CC loops in cells with respect to (1) inducing ER aggregation of a folding-incompetent reporter and (2) inducing granule sorting of a folded constitutive cargo protein. Unfortunately, the ER aggregates were all very compact and did not reveal fibrillarity. However, secretory granules, which contain functional amyloids, similarly do not have a fibrillar appearance.

      In this study, we do not directly provide evidence for the amyloid (or rather amyloid-like) character of aggregation. The concept of granules consisting of functional amyloids of peptide hormones was the starting point for our analysis. Our results are in line with the functional amyloid hypothesis and thus provide first functional support for it.

      2) What is the actual function of the CC-loops? The authors show that the loops promote aggregation of cargo proteins, yet the mechanism behind this is unclear. For example, would the proteins used in this study be able to aggregate in vitro (i.e. the CC-loop enable aggregation) or do they require some co-factor/chaperone? It would also be good if the authors could clarify or explain why some CC-loops cause aggregation and others not.

      Maji et al. (2009) showed for 3 different CC loops (vasopressin, oxytocin and somatostatin-14) that they aggregate in an amyloid-like form in vitro in purified form in the absence of chaperones or other protein cofactors. Anoop et al. (2014) analyzed in vitro amyloid formation of somatostatin-14 with and without disulfide bond in more detail. The proposed function is aggregation of the hormone into secretory granules as functional amyloids, which is supported by the finding that secretory granules are positive for amyloids.

      In the present study, we tested a variety of CC loops for aggregation in cells rather than in vitro. Many proteins and peptides have been shown to be able to form amyloids in vitro. The hallmark of pathological or functional amyloids is that they are still able to do it in living cells despite the presence of chaperones, whose function is to generally prevent aggregation.

      We found all CC loops to have the ability to mediate ER aggregation and granule sorting, although to different extents. The differences are likely due to their intrinsic potency and/or the way they are presented by the reporter proteins, since we used the same rather short linkers.

      We plan to go through the manuscript text to make our points clearer.

      3) The MS data in table 2 is very confusing, since half of the data points are missing. It is also not clear what the numbers in the table represent and if they are from a single experiment or multiple. As it is presented now, and as I interpret it, these results do not give support to the conclusion that CC loops form disulfide bonds. Since this is an important conclusion from the paper, these experiments need to be clarified, repeated or a different experimental approach used.

      Thanks to this comment, we realize that Table II may have presented the result in a confusing way, giving the impression that a lot of data are missing, while in fact the values were measured to be 0. To improve it, we will write 0 instead of – to indicate that no signal could be detected for a particular peptide. In addition, we will move the missing results for CCpN-NP∆ into the figure legend to avoid confusion. In the legend, we will also note that the intensities detected by mass spectrometry differ strongly for different peptides. One experiment is shown, because the numbers for peak areas inherently differ between experiments. We will revise the text to make the experiment clearer.

      Proposed new Table II:

      Table II. Cysteines of CC loops are oxidized in secreted reporter fusion proteins.

      | | nonreduced + IAA | reduced + IAA | Diagnostic peptide* |
      |---|---|---|---|
      | CCv disulf | 1637 | 10 | CYFQNCPR↓ |
      | CCv 2xmod | 0 | 696 | |
      | CCa disulf | 4 | 0 | ↓CNTATCATQTGEDPQGDAAQK↓ |
      | CCa 2xmod | 0 | 23 | |
      | CCc disulf | 6 | 0 | ↓CGNLSTCMLGTTGEDPQGDAAQK↓ |
      | CCc 2xmod | 0 | 32 | |
      | CCr disulf | 570 | 152 | ↓CSRLYTACVYHK↓ |
      | CCr 2xmod | 0 | 246 | |

      CC loop fusion proteins with A1Pimyc were immunoprecipitated from the media of producing AtT20 cell lines and reduced with TCEP or not before treatment with iodoacetic acid (+IAA). Samples were analyzed by mass spectrometry for the expected peptide masses. The peak areas, normalized to the intensity of the peptide LQHLENELTHDIITK within A1Pi, are shown in arbitrary units. It should be noted that intensities detected by mass spectrometry differ strongly by peptide. *CC loop sequences are shown in green with red cysteines, the N-terminal sequence of A1Pi in blue, linker sequence in black. CCv-, CCa-, and CCc-NP∆ containing samples were digested with trypsin, CCr- and CCpN-NP∆ containing samples with Lys-C. The peptides for CCpN-NP∆ (↓LPICPGGAARCQVTTGEDPQGDAAQK↓, disulfide bonded or carbamidomethylated) could not be detected.

      4) As the authors state, it is well-known that the concentration of proteins in the ER will influence the ability to aggregate. In figure 1 and 2, the authors use transient overexpression to assess the ability of different CC-loops to induce aggregation in the ER. How were these results normalized to expression levels of the proteins? In later experiments the authors instead use stable cell lines expressing similar amounts of the different proteins. However, in these cells there is no obvious aggregation in the ER (see figure 4). It therefore becomes unclear what the role of ER aggregation for sorting to granules is.

      The ER aggregation experiments were not normalized for expression levels. Plasmids were identical except for the short CC loop segments and produced similar transfection efficiencies. Stable cell lines with useful expression levels of CC-NP∆ could not be obtained, most likely because expression of mutant proteins inhibits growth.

      To analyze granule sorting, we expressed CC fusion proteins with rapidly folding A1Pi as a reporter that does not accumulate in the ER. Stable cell lines were important to select clones with moderate and very similar expression levels.

      5) What is the basal secretion of the different proteins, i.e. how much goes through the constitutive secretory pathway and how much goes through the regulated secretory pathway? The authors should show the resting secretion (before BaCl2 addition) for all conditions tested instead of just the change in relation to control (i.e. the way data is presented now it is not possible to tell whether BaCl2 stimulation actually cause an increased release of the peptides).

      The experiment is done by comparing resting secretion (– lanes) with BaCl2 stimulated secretion (+ lanes) in Fig. 5A and C. Stimulated secretion is calculated as a ratio of resting secretion / stimulated secretion (after normalization for cell number and supernatant loading).

      6) Lastly, the importance of CC-loops for the sorting of native peptides is unclear. The authors should test the importance of these loops for aggregation, sorting and secretion of a non-hybrid hormone with naturally occurring CC-loops (and a mutated version lacking the loop). This is important, since it is so far only shown that loops can affect the secretion of non-biologically relevant hybrid hormones.

      In our previous study (Beuret et al., 2017), we analyzed the segments contributing to ER aggregation of folding-incompetent mutant provasopressins and to granule sorting for folding-competent mutants of provasopressins by self-aggregation at the TGN. We found separate protein segments – vasopressin (=CCv) and the glycopeptide – to contribute to aggregation in both localizations. Our study is a follow-up on the finding for vasopressin, expanding to other CC loops found in peptide hormones. Our results show that CC loops in general have the ability to aggregate and contribute to granule sorting.

      As exemplified by provasopressin, the CC loop may not be the only contributor. Preliminary experiments suggest the same for growth hormone. The detailed analysis of the aggregating sequences in one or more prohormone is clearly beyond the scope of our study.

      **Minor comments:**

      1) Stated that the 2x CC-loop constructs showed a positive effect in the cases of CCv and CCr, but this is not evaluated statistically.

      We will add the statistics to the respective figures.

      2) Explain the abbreviation POMC

      We will add the full name to the text.

      3) Figure 6D. Paired Student's t-test is not appropriate for determining significance when data is not paired (unpaired t-tests used throughout the rest of the paper).

      Only in the lubrol insolubility experiment did we find considerable shifts between experiments (particularly obvious for the yellow experiment). Instead of normalizing to the control construct, we used the paired t-test. However, using the unpaired t-test does not produce fundamentally different significance. If required, we will change the figure as suggested.

      Figure 6D using unpaired t-test: [Figure]
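      The distinction between the two tests discussed above can be illustrated with a minimal sketch. The values below are hypothetical and deliberately chosen to exaggerate a between-experiment shift; they are not the lubrol insolubility data from Figure 6D, and the function names are illustrative. The paired statistic works on within-experiment differences, so a shift that moves control and construct together cancels out, whereas the unpaired (pooled-variance Student) statistic counts that shift as noise.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """t statistic for paired samples: mean of differences over its standard error."""
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def unpaired_t(x, y):
    """Student's two-sample t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))


# Hypothetical percent-insoluble values from three independent experiments.
# Each experiment is shifted as a whole, but the construct stays about
# 8 points above the control within every experiment.
control = [20.0, 25.0, 35.0]
construct = [28.0, 34.0, 43.0]

t_paired = paired_t(construct, control)      # uses within-experiment differences
t_unpaired = unpaired_t(construct, control)  # treats the two groups as independent
```

      With these toy numbers the paired statistic is far larger than the unpaired one, because the experiment-to-experiment shift dominates the pooled variance. Whether the same holds for the real data depends on how consistent the within-experiment differences are, which is what the comparison of the two versions of Figure 6D addresses.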

      Reviewer #2 (Significance (Required)):

      The work in this paper builds on previous work from the same group and reinforces the notion that peptide aggregation is an important part of the sorting process that controls efficient delivery of certain proteins to nascent secretory granules, and suggests that short loops formed by disulfide bridges between closely apposed cysteine residues may be part of this sorting mechanism. The paper is of general cell biological interest, but perhaps of special interest to researchers working on professional secretory cells and mechanisms of secretory protein sorting and secretion. My own research focuses on stimulus-secretion coupling pathways in secretory cells and we primarily use live cell imaging approaches to visualize different steps of secretory granule biogenesis and release.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Since the small disulfide loop of the nonapeptide vasopressin has been previously demonstrated to play a role in the self-aggregation and secretory granule targeting of vasopressin precursor (Beuret et al., 2017), and as several other peptide hormones contain small disulfide loops, Reck and colleagues investigate in this study the requirement of small disulfide loops coming from four additional peptide hormones for the self-aggregation and secretory granule targeting of their precursors. Then, they studied the aggregation role of small disulfide loops in the ER and the TGN of two cell lines, COS1 and Neuro-2a. Using confocal and TEM, an aggregation has indeed been observed, although to different extents depending on the cell line. When fused to a constitutively secreted reporter protein, these disulfide loops induced their sorting into secretory granules, increased the stimulated secretion and Lubrol insolubility in endocrine AtT20 cells. All these results led the authors to hypothesize that small disulfide loops may act as a general device for peptide hormone aggregation and sorting, and therefore for secretory granule biogenesis.

      **Major comments:**

      The authors demonstrated the ability of small disulfide loops of peptide hormones to induce peptide precursor aggregation in ER using confocal microscopy, in COS1 and Neuro-2a cell lines, with a higher extent in COS1 cells. The authors have to moderate this conclusion and to include in their interpretation that distinct results may be due to the distinct secretory phenotype of these two cell lines: COS1 are epithelial cells, i.e. with a unique constitutive secretory pathway, while Neuro-2a as well as AtT20 cells also possess a regulated secretory pathway. Thus, the differences could be explained by the distinct molecular mechanisms involved in the formation of constitutive vesicles or secretory granules, and therefore aggregation and/or sorting processes could be distinct in the two cell types. We can also suggest to remove COS1-related results, to avoid hasty conclusions.

      As suggested, we will amend the text to point out that the two cell lines differ with respect to regulated secretion and to explain why they were used. COS-1 and Neuro-2a cells were previously used by Birk et al. (2009) to study ER aggregation of disease mutants of provasopressin. COS-1 cells were used because they are large with an extensive ER suitable for immunofluorescence microscopy. Neuro-2a cells are of neuroendocrine origin and thus more comparable to the cell types where ER aggregation of disease mutants of provasopressin or growth hormone was observed. However, the presence or absence of a regulated pathway has no relevance for ER aggregation experiments, since the different pathways diverge only at the TGN.

      The data and the methods can be reproduced and the experiments are adequately replicated, using timely statistical analysis.

      **Minor comments:**

      • Figure 3: to complete TEM study, the concomitant use of an ER specific antibody would definitely demonstrate that small disulfide loop-containing aggregates are linked to ER compartment.

      In our previous study Birk et al. (2009), we performed double-immunogold staining for provasopressin mutants and calreticulin to confirm aggregation in the ER. This anti-calreticulin antibody is unfortunately not commercially available anymore and other antibodies we tested were not suitable for immuno-EM. Instead, we colocalized PDI with CC-NP∆ constructs for immunofluorescence microscopy. Colocalization is so extensive that we believe EM confirmation to be unnecessary.

      • Along abstract, introduction and discussion sections, the authors should avoid to conclude on the role of small disulfide loops on secretory granule biogenesis, but rather limit their conclusion on prohormone aggregation and targeting. Indeed, the present study did not highlight any direct molecular / physical link between disulfide loops and TGN membrane to drive secretory granule formation.

      Granule biogenesis involves a number of processes including interaction of cargo components with the membrane and of the actomyosin complex with the forming buds, but also self-aggregation of cargo as functional amyloids. However, we will reword our statements in the Abstract avoiding the term "granule biogenesis".

      Reviewer #3 (Significance (Required)):

      • This study highlights small disulfide loops as novel signals for self-aggregating and secretory granule sorting of prohormone precursors in cells with a regulated secretory pathway. These results help to understand the molecular mechanism driving peptide hormone secretion, a physiological process which is crucial for interorgan communication and functional synchronization. Moreover, their previous study revealed that vasopressin small disulfide loop is involved in toxic unfolded mutant aggregation in the ER (Beuret et al., 2017), which highlights the clinical potential of the work.
      • Audience that might be interested in and influenced by the reported findings: cell biologists interested in cell trafficking, peptide hormone secretion
      • My field of expertise: secretory granule biogenesis, hormone sorting, secretory cells, neurosecretion.

      2. Description of the revisions that have already been incorporated in the transferred manuscript

      The manuscript has not yet been revised.

      3. Description of analyses that authors prefer not to carry out

      As indicated in the point-by-point response above, we consider additional analyses of in vitro aggregation with purified proteins to be beyond the scope of our study.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Since the small disulfide loop of the nonapeptide vasopressin has been previously demonstrated to play a role in the self-aggregation and secretory granule targeting of the vasopressin precursor (Beuret et al., 2017), and as several other peptide hormones contain small disulfide loops, Reck and colleagues investigate in this study the requirement of small disulfide loops from four additional peptide hormones for the self-aggregation and secretory granule targeting of their precursors. They then studied the aggregation role of small disulfide loops in the ER and the TGN of two cell lines, COS1 and Neuro-2a. Using confocal microscopy and TEM, aggregation was indeed observed, although to different extents depending on the cell line. When fused to a constitutively secreted reporter protein, these disulfide loops induced its sorting into secretory granules and increased its stimulated secretion and Lubrol insolubility in endocrine AtT20 cells. All these results led the authors to hypothesize that small disulfide loops may act as a general device for peptide hormone aggregation and sorting, and therefore for secretory granule biogenesis.

      Major comments:

      The authors demonstrated the ability of small disulfide loops of peptide hormones to induce peptide precursor aggregation in ER using confocal microscopy, in COS1 and Neuro-2a cell lines, with a higher extent in COS1 cells. The authors have to moderate this conclusion and to include in their interpretation that distinct results may be due to the distinct secretory phenotype of these two cell lines: COS1 are epithelial cells, i.e. with a unique constitutive secretory pathway, while Neuro-2a as well as AtT20 cells also possess a regulated secretory pathway. Thus, the differences could be explained by the distinct molecular mechanisms involved in the formation of constitutive vesicles or secretory granules, and therefore aggregation and/or sorting processes could be distinct in the two cell types. We can also suggest to remove COS1-related results, to avoid hasty conclusions.

      The data and the methods can be reproduced and the experiments are adequately replicated, using timely statistical analysis.

      Minor comments:

      • Figure 3: to complete TEM study, the concomitant use of an ER specific antibody would definitely demonstrate that small disulfide loop-containing aggregates are linked to ER compartment.
      • Along abstract, introduction and discussion sections, the authors should avoid to conclude on the role of small disulfide loops on secretory granule biogenesis, but rather limit their conclusion on prohormone aggregation and targeting. Indeed, the present study did not highlight any direct molecular / physical link between disulfide loops and TGN membrane to drive secretory granule formation.

      Significance

      • This study highlights small disulfide loops as novel signals for self-aggregating and secretory granule sorting of prohormone precursors in cells with a regulated secretory pathway. These results help to understand the molecular mechanism driving peptide hormone secretion, a physiological process which is crucial for interorgan communication and functional synchronization. Moreover, their previous study revealed that vasopressin small disulfide loop is involved in toxic unfolded mutant aggregation in the ER (Beuret et al., 2017), which highlights the clinical potential of the work.
      • Audience that might be interested in and influenced by the reported findings: cell biologists interested in cell trafficking, peptide hormone secretion
      • My field of expertise: secretory granule biogenesis, hormone sorting, secretory cells, neurosecretion.
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript by Reck and colleagues aims at determining the importance of short disulfide loops for the correct sorting to, and release from, secretory granules. They utilize hybrid secretory proteins in which sequences encoding disulfide loops from different hormones are cloned in frame with the same secretory peptide, and assess how the presence of the disulfide loop affects the ability of the protein to aggregate in the ER and to get sorted for secretion. By immunofluorescence analysis they show that the presence of a disulfide loop increases the ability of the peptide hormone to form aggregates in the ER, and these observations are confirmed by immunogold-EM. Importantly, aggregate formation is seen both in professional secretory (N-2a) and non-secretory (COS-1) cells. Using immunofluorescence and quantitative immunoblotting, they also show that the ability of the secretory proteins to aggregate coincides with increased localization to secretory granules and with increased release from cells in response to stimuli.

      The results from this study are interesting and suggest that small disulfide loops may be an important part of the cargo sorting mechanism in secretory cells, and perhaps also a cause of sorting defects in certain diseases. The study is overall well conducted and worthy of publication after revision.

      Major comments:

      1) It is unclear to me what the relationship between the CC-loop and amyloid is. They are not involved in the formation of fibrils and amyloid, yet the authors conclude that they support the amyloid hypothesis of granule biogenesis. This must be clarified.

      2) What is the actual function of the CC-loops? The authors show that the loops promote aggregation of cargo proteins, yet the mechanism behind this is unclear. For example, would the proteins used in this study be able to aggregate in vitro (i.e. the CC-loop enable aggregation) or do they require some co-factor/chaperone? It would also be good if the authors could clarify or explain why some CC-loops cause aggregation and others not.

      3) The MS data in table 2 is very confusing, since half of the data points are missing. It is also not clear what the numbers in the table represent and if they are from a single experiment or multiple. As it is presented now, and as I interpret it, these results do not give support to the conclusion that CC loops form disulfide bonds. Since this is an important conclusion from the paper, these experiments need to be clarified, repeated or a different experimental approach used.

      4) As the authors state, it is well-known that the concentration of proteins in the ER will influence the ability to aggregate. In figure 1 and 2, the authors use transient overexpression to assess the ability of different CC-loops to induce aggregation in the ER. How were these results normalized to expression levels of the proteins? In later experiments the authors instead use stable cell lines expressing similar amounts of the different proteins. However, in these cells there is no obvious aggregation in the ER (see figure 4). It therefore becomes unclear what the role of ER aggregation for sorting to granules is.

      5) What is the basal secretion of the different proteins, i.e. how much goes through the constitutive secretory pathway and how much goes through the regulated secretory pathway? The authors should show the resting secretion (before BaCl2 addition) for all conditions tested instead of just the change in relation to control (i.e. the way the data are presented now, it is not possible to tell whether BaCl2 stimulation actually causes an increased release of the peptides).

      6) Lastly, the importance of CC-loops for the sorting of native peptides is unclear. The authors should test the importance of these loops for aggregation, sorting and secretion of a non-hybrid hormone with naturally occurring CC-loops (and a mutated version lacking the loop). This is important, since it is so far only shown that loops can affect the secretion of non-biologically relevant hybrid hormones.

      Minor comments:

      1) Stated that the 2x CC-loop constructs showed a positive effect in the cases of CCv and CCr, but this is not evaluated statistically.

      2) Explain the abbreviation POMC

      3) Figure 6D. Paired Student's t-test is not appropriate for determining significance when data is not paired (unpaired t-tests used throughout the rest of the paper).
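      To make the distinction in minor comment 3 concrete, the following minimal sketch contrasts the two test statistics. It uses only the Python standard library; the function names and the numbers are hypothetical illustrations, not data or code from the manuscript. A paired test works on per-unit differences, so it is only meaningful when each value in one group has a matched partner in the other.

```python
import math
import statistics


def welch_t(a, b):
    """Unpaired (Welch) t statistic for two independent samples."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def paired_t(a, b):
    """Paired t statistic; assumes a[i] and b[i] come from the same matched unit."""
    if len(a) != len(b):
        raise ValueError("a paired test needs matched observations")
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical values, for illustration only.
control = [10.0, 12.0, 14.0, 16.0]
treated = [11.0, 13.5, 15.0, 17.5]

print("unpaired t:", welch_t(control, treated))
print("paired t:  ", paired_t(control, treated))
```

      With these made-up numbers the paired statistic is far larger in magnitude than the unpaired one, because pairing removes the between-unit spread. Applying it to unmatched measurements can therefore overstate significance.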

      Significance

      The work in this paper builds on previous work from the same group and reinforces the notion that peptide aggregation is an important part of the sorting process that controls efficient delivery of certain proteins to nascent secretory granules, and suggests that short loops formed by disulfide bridges between closely apposed cysteine residues may be part of this sorting mechanism. The paper is of general cell biological interest, but perhaps of special interest to researchers working on professional secretory cells and mechanisms of secretory protein sorting and secretion. My own research focuses on stimulus-secretion coupling pathways in secretory cells and we primarily use live cell imaging approaches to visualize different steps of secretory granule biogenesis and release.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Apart from the default constitutive pathway for protein secretion, some specialized cells (e.g., neuroendocrine cells, exocrine cells, peptidergic neurons and mast cells) exhibit an additional regulated secretory pathway, in which peptide hormones are stored in a highly concentrated, ordered manner inside the electron-opaque "dense core" of secretory granules for long periods until secretagogue-mediated burst release. Although a general sorting receptor for packaging hormones into secretory granules has not yet been identified, self-aggregation in the trans-Golgi network (TGN) is a common property of peptide hormones and is a well-accepted potential sorting mechanism. Here the authors hypothesize that the cysteine-containing small disulphide loop (CC loop), which is abundant in several hormone precursors, acts as an aggregation mediator in the TGN for sorting into secretory granules. They tested the aggregation propensity of a misfolded reporter protein, NPΔ, in the ER by attaching the CC loop segment of different hormones, which promoted the pathological aggregation in the endoplasmic reticulum (ER) of mutant provasopressin in the case of diabetes insipidus. Immunofluorescence and immunogold electron microscopy revealed accumulation of aggregates in the ER when NPΔ fused to CC loops of different hormonal origin was transiently transfected into COS-1 fibroblasts and Neuro-2a neuroblastoma cells. The authors have also shown that small disulphide loop-mediated functional aggregation in the TGN can sort a constitutively secreted protein, α1-protease inhibitor, into secretory granules. The rerouting capacity of the CC loops was tested in stably expressing AtT-20 cell lines by confirming their colocalization with CgA-positive secretory granules, by studying BaCl2-mediated stimulated secretion, and by testing secretory granule-specific Lubrol insolubility.

      Major comments:

      The study is highly impressive, and the results fully support the CC loop mediated hormone sorting hypothesis. However, it would be nice if the authors characterize the nature of the CC-loop mediated aggregates as hormones are reported to be stored inside secretory granules as functional amyloid (Maji et al., 2009). The mechanistic reason behind the small disulfide loop mediated aggregation was not explained in the paper. Authors may propose the probable molecular reasons behind CC loop mediated aggregation to completely justify their hypothesis.

      Although the hypothesis and the experimental results are highly impressive, the authors may consider adding the following experiments.

      The authors replaced CC-loop by the proline/glycine repeat sequence (Pro1) as a negative control which was previously reported to abolish aggregation as well. However, the authors may completely delete the small loop forming segment, CCv, and may check the status of His-tagged fused neurophysin II (NPΔ) segment as an additional negative control.

      To find the ultrastructure authors have done immunogold assay with anti-His antibody which indicated different CC loop mediated ER aggregation. Since the amyloid-like fibril nature of pro-vasopressin mutant mediated ER aggregates was previously reported (Beuret et al., 2017), authors must check the nature of the CC loop mediated ER aggregates with amyloid specific antibody. Since hormones are known to form reversible functional amyloid during their storage inside secretory granule, authors may consider characterizing the nature of the aggregates formed by CC loop fused constitutive protein in AtT-20 cell line by immunostaining, immunoprecipitation and dot blot assay using amyloid specific antibody.

      Minor comments:

      In the quantification study (Figure 2C) CCc and CCr showed almost similar ER aggregates (around 40%). But authors have commented that all constructs except CCc produce statistically significant increases in cells compared to background. Authors must clarify the statement.

      In the Lubrol insolubility assay, the otherwise constitutively secreted protein A1Pimyc (negative control) showed 23% insolubility. The authors explained the observation by commenting on trapping of the protein inside granule aggregates. But the CCv and CCa fused proteins showed only a very slight increase (around 30%). Only the CCc construct showed more than 40% insolubility. If trapping of a constitutive protein can result in 23% insolubility, the insolubility data for all constructs except CCc are not sufficient to claim that the aggregated protein is secretory granule content. The authors must explain that.

      The authors have satisfactorily referenced prior studies in the field. However, the authors may consider adding the following papers, as they are directly connected with the hypothesis. The sorting of the POMC hormone into secretory granules by a disulphide loop was previously studied (Cool et al., 1995). The N-terminal loop segment was also previously used to reroute a constitutive protein, chloramphenicol acetyltransferase (Tam and Peng, 1993). S. K. Maji and coworkers previously showed, using cyclic somatostatin as a model peptide, that the disulphide bond maintains the native reversible functional amyloid structure relevant to hormone storage inside secretory granules, whereas disulphide bond disruption led to rapid irreversible amyloid aggregation (Anoop et al., 2014).

      Authors must check grammar and may reconstruct a few sentences where sentence construction seems complicated.

      Significance

      This manuscript makes a significant contribution to fundamental research knowledge of hormone sorting mechanisms. Although constitutive and regulated secretory pathways have been known for a long time, the exact sorting mechanism is not yet elucidated. No common receptor has yet been identified for recruiting regulated secretory proteins into the secretory granules.

      Aggregation in the TGN is a well-accepted mechanism for sorting. However, the triggering factor for aggregation is not yet known. This study has shed light on a novel hypothesis, which has considered intramolecular disulfide bond mediated small CC loop in hormone may act as aggregation mediator. Since many regulated secretory proteins contain the short disulphide loop, the hypothesis proposed in the manuscript is interesting.

      It has been confirmed that the TGN is the last compartment common to both regulated and constitutive pathways (Kelly, 1985). No sorting mechanism is required for the constitutive pathway, as this is the default mechanism, whereas the regulated secretory pathway requires a specific sorting mechanism for efficient packaging into secretory granules. There are two popular hypotheses about protein sorting in regulated secretory pathways: "sorting for entry" and "sorting for retention" (Blázquez and Shennan, 2000). In "sorting for entry", hormones destined for the regulated secretory pathway start to form aggregates in the TGN-specific environment, excluding other proteins destined for the constitutive pathway. Arvan and Castle proposed the second mechanism, as some hormones, like proinsulin, are initially packaged with lysosomal enzymes in immature secretory granules (ISG) (Arvan and Castle, 1998). With time they start to aggregate, and lysosomal enzymes are removed from ISG by small constitutive-like vesicles. Although aggregation is an essential sorting criterion in both mechanisms, the molecular events that lead to aggregation are not yet elucidated. TGN-specific environmental conditions, including pH (around 6.5), divalent metal ions (Zn2+, Cu2+) and glycosaminoglycans (GAGs), have the potential to trigger aggregation (Dannies, 2012). Though each hormone has aggregation-prone regions in its amino acid sequence, there is no common amino acid sequence responsible for aggregation. The authors of this manuscript have pointed out the interesting observation that many hormones contain small disulfide loops which are exposed due to their presence at the N or C terminus or close to the processing site. Based on this observation, they hypothesized that the CC loop may act as an aggregation driver for hormone sorting. In-cell studies with CC constructs from different hormones successfully rerouted a constitutively secreted protein into the regulated pathway, which supported their novel hypothesis.

      However, the hypothesis raises some questions regarding the molecular mechanism of CC loop-mediated aggregation. Why does the CC loop promote aggregation? Do the amino acid sequence and the size of the loop play a role in aggregation? The granular structures shown in the manuscript for different CC loops have different sizes and shapes (Figure 2 and 3). What is the reason for the structural heterogeneity of the CC loop-mediated dense core? Since the authors have shown CC loop-mediated aggregation both in functional and in disease-associated aggregation, a very important aspect to address would be the structure-function relationship of the aggregates. Since the authors have rightly pointed out that not all hormones or prohormones contain a CC loop, another question would be about the sorting mechanism of those without a CC loop. The best part of the study is that it has tried to explain the well-established aggregation-mediated sorting mechanism from a new perspective, which leaves room for many questions to be addressed by further research.

      From this study, the audience will learn about the role of the small disulphide loop in functional and disease-associated protein/peptide aggregation. The audience will also get an idea of the sorting mechanism in the regulated secretory pathway. Based on my expertise in protein aggregation related to human diseases and hormone storage, I see this manuscript as a fantastic addition to the understanding of secretory granule biogenesis, hormone storage and subsequent release.

      References:

      Maji, Samir K., et al. "Functional amyloids as natural storage of peptide hormones in pituitary secretory granules." Science 325.5938 (2009): 328-332.
      Beuret, Nicole, et al. "Amyloid-like aggregation of provasopressin in diabetes insipidus and secretory granule sorting." BMC Biology 15.1 (2017): 1-14.
      Cool, David R., et al. "Identification of the sorting signal motif within pro-opiomelanocortin for the regulated secretory pathway." Journal of Biological Chemistry 270.15 (1995): 8723-8729.
      Tam, W. W., K. I. Andreasson, and Y. Peng Loh. "The amino-terminal sequence of pro-opiomelanocortin directs intracellular targeting to the regulated secretory pathway." European Journal of Cell Biology 62.2 (1993): 294-306.
      Anoop, Arunagiri, et al. "Elucidating the Role of Disulfide Bond on Amyloid Formation and Fibril Reversibility of Somatostatin-14: RELEVANCE TO ITS STORAGE AND SECRETION." Journal of Biological Chemistry 289.24 (2014): 16884-16903.
      Kelly, Regis B. "Pathways of protein secretion in eukaryotes." Science 230.4721 (1985): 25-32.
      Blázquez, Mercedes, and Kathleen I. Shennan. "Basic mechanisms of secretion: sorting into the regulated secretory pathway." Biochemistry and Cell Biology 78.3 (2000): 181-191.
      Arvan, Peter, and David Castle. "Sorting and storage during secretory granule biogenesis: looking backward and looking forward." Biochemical Journal 332.3 (1998): 593-610.
      Dannies, Priscilla S. "Prolactin and growth hormone aggregates in secretory granules: the need to understand the structure of the aggregate." Endocrine Reviews 33.2 (2012): 254-270.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      FULL REVISION

      Manuscript number: RC-2021-00934

      Corresponding author(s): Seiya, Mizuno

      General Statements

      We would like to thank all the reviewers for their comments on improving the manuscript. We are encouraged by the overall positive responses from the reviewers. According to the reviewers’ comments, we have further refined our manuscript. We are confident that we have addressed all the reviewers’ comments and suggestions by incorporating them into the revised manuscript. We highlighted the changed text in the manuscript in red. The point-by-point responses to all comments follow.

      Point-by-point description of the revisions

      Reviewer 1:

      The study by Akihiro and colleagues describes the generation of a multiplex genotyping method for detecting CRISPR gene editing alleles using nanopore sequencing and a machine learning program. The method is based on long-range PCR amplification of the intended target loci from gene-edited animals followed by nanopore sequencing. A PCR index is introduced into the sample pooling system before sequencing, thus allowing sequencing of up to 100 samples in one flow cell. The study developed a machine learning program for allele binning, analysis, and presentation. To demonstrate the applicability of the method, the study has validated it for detection of point mutations, deletions, and flox insertions. The study has in principle provided sufficient investigation and data to demonstrate the validity of the method. All the figures are very nicely and clearly presented. However, there are a few concerns that should be taken into consideration.

      We appreciate the constructive and important comments from the reviewer.

      Reviewer 1_Comment #1:

      Many previously reported unintended structural variations caused by CRISPR off-targets are typically much larger deletions/insertions/translocations that occur outside the target sites. The current study is more about targeted allele genotyping. The use of the term structural variation (SV) throughout the study should be thoroughly reconsidered.

      SV typically refers to genomic variation of approximately 1 kb and above. What the study describes is still within the indel range instead. Thus, comparing DAJIN with NanoSV and Sniffles on reads with 50, 100 and 200 base deletions is not appropriate.

      The detection of SV alleles in the whole study is most likely a result of minor indel alleles and sequencing errors. Figure 2b, BC32, WT mice also contains a proportion of SV allele, which is apparently caused by sequencing error. Such SV, which is not related to CRISPR gene editing, is also seen in other genotyping results, e.g. Figure 3a, Figure 4b, Figure 5c, Figure 6b.

      Another co-factor that contributes to the SVs is the PCR-error from the method.

      Thank you very much for your comments. We agree that structural variation traditionally referred to genomic alterations that are larger than 1 kb in length. Although the application of sequencing technology has expanded the spectrum of structural variation to include smaller events >50 bp in length (PMID: 21358748, PMID: 26432246), there is no common understanding of how to name genomic rearrangements >50 bp in length arising through genome editing. We changed the name of the unexpected mutation reads of more than approximately 50 bp in length to “Large rearrangements (LAR)”. We changed the description of the names of the reads that DAJIN annotates in the Methods (Page 6, Line 205) and Results sections (Page 8, Line 249) as well as in all other parts of the manuscript.

      Although we believe most of the LAR alleles are real alleles generated through genomic rearrangements (Fig. 3b&3c, S12, and S16), we recognize that minor fractions of the LAR alleles, including those observed in WT mice, are composed of reads with a high sequencing error rate. Visualized BAM files and consensus sequences can serve as indicators of the annotation results, informing users of DAJIN that minor alleles similar in proportion to those in the WT sample can be artificial alleles. We also cannot exclude the possibility that LAR alleles include those generated through PCR error. ‘Pseudo-LoxP’ alleles could be generated if PCR products that included the LoxP on one side but not on the other acted as PCR primers and annealed to the WT allele in the next PCR step (Page 12, Line 425-427). Recently developed methods may address these limitations. We added a description in the Discussion section (Page 17-18, Line 608-620).
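      The idea of treating a minor allele as suspect when its proportion resembles that seen in the WT sample can be sketched as below. This is an illustrative heuristic only, not DAJIN's implementation: the function name, the `margin` factor, the `expected` labels and all percentages are assumptions made for the example.

```python
def flag_background_alleles(sample, wt_control, expected=("WT",), margin=2.0):
    """Return allele labels in `sample` that may be artifacts.

    `sample` and `wt_control` map allele labels to read percentages.
    A label is flagged when it is not an expected allele, it also appears
    in the WT control, and its percentage is at most `margin` times the
    percentage seen in the control.
    """
    flagged = []
    for label, pct in sample.items():
        if label in expected:
            continue
        background = wt_control.get(label, 0.0)
        if background > 0 and pct <= margin * background:
            flagged.append(label)
    return flagged


# Hypothetical percentages, for illustration only.
wt_control = {"WT": 97.0, "LAR": 3.0}
edited_a = {"WT": 45.0, "intended PM": 50.0, "LAR": 5.0}
edited_b = {"WT": 10.0, "intended PM": 50.0, "LAR": 40.0}

print(flag_background_alleles(edited_a, wt_control))
print(flag_background_alleles(edited_b, wt_control))
```

      In this sketch a flagged label is only a prompt to inspect the BAM visualization and consensus sequence; it does not by itself show that the allele is artificial.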

      Reviewer 1_Comment #2:

      The reason that the current method detects more than two alleles from one animal is probably the chimerism of the animal. However, when looking at the BAM files and figures presented in Figure 1b, 2c, 3b, 3d, 4c, as well as those in the Supplementary figures, there seems to be more than one allele (indel reads with different sizes) presented in one category.

      For example, in Figure 2C, mouse BC12, the alignment is not fully consistent between all alleles and the presented allele 1 and allele 2. For allele 1, which is called SV, there are reads with indels of different sizes. For allele 2, which is called intended PM, some reads are a hybrid of a deletion and the intended substitution.

      Thank you for checking the data in detail. As the reviewer pointed out, some of the reads in each allele showed indels of different sizes. We think these indel mutations are due to nanopore sequencing errors. Although the error rate of nanopore sequencing has improved, an error rate of 5% has been reported for 1D sequencing with R9.4 flow cells, which are the same flow cells used in our study (DOI: 10.1002/wfs2.1323). In this study, DAJIN mitigated the nanopore sequencing errors by calculating the MIDS score (Fig. S7), but the visualization using the BAM file showed the raw reads including the sequencing errors. For this reason, a single allele appears to include different indel alleles.

      To evaluate the point, we performed Sanger sequencing and found that there were no hybrid sequences containing indel mutations, but only intended point mutation in BC12 allele 2 (Fig. 2d). The results of Sanger sequencing suggested that the indel mutations visualized by the BAM file were due to nanopore sequencing errors. To clarify the points, we updated the description in the Discussion section (Page 15-16, Line 528-548).

      Reviewer 1_Comment #3:

      What is the advantage of the current method compared to the one previously reported by Bi et al., 2020, Genome Biology?

      Thank you for pointing it out. We believe that one of the advantages of IDM-seq, developed by Bi et al., is that it performs quantitative analysis by correcting PCR bias via Unique Molecular Identifiers (UMIs). However, when multiple samples are processed simultaneously, it is impractical in terms of cost and workability to prepare primers for the UMIs. While IDM-seq has the advantage of quantifying the precise amount of each allele in a single sample, DAJIN is more suitable for primary and comprehensive analysis of multiple genome-edited samples. We have described these points in the Discussion section (Page 15, Line 509-513).
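      The general principle by which UMIs correct PCR bias can be illustrated with a small sketch: reads that share an allele and a UMI are assumed to derive from one original molecule and are counted once. The code below is a simplified illustration under that assumption and does not reproduce the IDM-seq pipeline; the function name, UMI strings and read counts are hypothetical.

```python
from collections import Counter


def allele_fractions(reads, use_umi):
    """Estimate allele fractions from (allele, umi) read records.

    With use_umi=True, reads sharing the same allele and UMI are collapsed
    to a single observation, so over-amplified molecules count once.
    """
    observations = set(reads) if use_umi else list(reads)
    counts = Counter(allele for allele, _ in observations)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical reads: two WT molecules and two deletion molecules,
# with the WT molecules amplified more strongly during PCR.
reads = (
    [("WT", "AAAC")] * 6
    + [("WT", "GGTA")] * 2
    + [("del", "CTTG")]
    + [("del", "TACG")]
)

print("raw read fractions:", allele_fractions(reads, use_umi=False))
print("UMI-collapsed fractions:", allele_fractions(reads, use_umi=True))
```

      The trade-off described in the response follows from this: the correction requires every input molecule to carry its own tag, which adds primer cost and handling per sample.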

      Reviewer 1_Comment #4:

      The reported machine learning method is developed for calling the different alleles. Have the authors compared DAJIN with e.g. NanoCaller, which is developed for calling SNPs and small indels based on a DNN?

      We are thankful to the referee for bringing the comparison with NanoCaller to our attention. We ran NanoCaller and found that it detected the point mutation better than Medaka and Clair. However, because NanoCaller could not detect the LAR (formerly labelled as “SV”) alleles, it incorrectly reported the genotype of BC25 as 'point mutation', not 'LAR with point mutation'. We added the results of NanoCaller in Table S9 and described these points in the Results section (Page 10, Line 338-339).

      Reviewer 1_Comment #5:

      Apart from genotyping, many CRISPR studies performed in cells focus on profiling the indels in a pool of edited cells. Detecting different indel types in such samples and conditions would broaden the applicability of the method. Current methods, such as TIDE/ICE, NGS-based amplicon sequencing and IDAA, can only detect smaller indels. DAJIN would add the advantage of detecting longer indels for such applications.

      Thank you very much for your comments. We added description on application of DAJIN in the Discussion section (Page 17, Line 592-596).

      Reviewer #1 Significance :

      Although similar methods have been reported for genotyping CRISPR editing outcomes, the current study introduces PCR barcoding and, in particular, a bioinformatic toolbox for allele binning and calculation, contributing a useful tool to the field. The study has demonstrated multiple applications, showing its broad applicability.

      Reviewer 2:

      CRISPR nucleases typically generate DNA double strand breaks (DSBs) at the target site, which typically produce small insertions and deletions (indels), enabling precise gene knockout or knock-in. However, the accompanying DNA DSBs often induce unwanted large deletions or chromosomal translocations. Thus, assessing such large variations as well as small indels is crucial in the genome editing field. In this manuscript, the authors developed a long-range assessment tool, named Determine Allele mutations and Judge Intended genotype by Nanopore sequencer (DAJIN), using a long-read sequencer, Nanopore sequencing. Overall, the topic will be interesting for a broad readership and the tool looks technologically sound. I would suggest a few comments that may strengthen this manuscript, as follows.

      We are grateful for the referee’s valuable suggestions to improve our manuscript.

      Reviewer 2_Comment #1:

      Another key study is missed in this manuscript. Recently, a tool with similar concept to DAJIN was published in Nat Methods, which uses also long-read sequencers, Nanopore or PacBio [PMID: 33432244]. It is necessary to describe the benefits of DAJIN against the previous study.

      Thank you for pointing this out. Our method has an advantage over those utilizing unique molecular identifiers (UMIs) in its automatic identification and classification of genomic rearrangements including unexpected mutations in multiple samples obtained under different editing conditions (different target loci). As per our response to the Reviewer #1_Comment #3, one of the disadvantages of UMIs is the cost. More accessible methods of routine assessment of on-target genome editing outcomes are required, as well as unbiased assessment of editing products (PMID: 32643177). We showed in the manuscript that the machine-learning-based model could bypass molecular tagging to provide a feasible approach for routine assessment of genome editing outcomes. DAJIN will make a very significant contribution to speeding up and improving the accuracy of this experimental process.

      We agree that the approach reported by Karst et al. has certainly contributed to the generation of highly accurate single-molecule consensus sequences. Analysis of a small portion of samples using UMI-based methods may compensate for the limitations of DAJIN, such as PCR bias and/or PCR-mediated recombination, as you described in your comment #6. We added a description to the Discussion section (Page 15, Line 509-513; Page 17, Line 615-618).

      Reviewer 2_Comment #2:

      In Figure 1a, the authors used Barcoding but detailed information is not present in the main text. The length and context information should be described in the main text.

      We thank the reviewer for these comments. In response, we illustrated the process of PCR-based barcoding in Fig. 1a. In addition, we described the barcode length under "Library preparation and nanopore sequencing" in the Methods section (Page 4, Line 137 & 140).

      Reviewer 2_Comment #3:

      The term "SV (structural variation)" over "Single-nucleotide variant (SNV)" seems ambiguous. Does "SV" include large deletion and chromosomal translocation? In this manuscript, I guess that SNV indicates small indels, whereas SV indicates large indels. The detailed definition is needed for better understanding.

      Thank you very much for your comments. We intended to classify and label large genomic rearrangements, including large deletions and chromosomal translocations, as “SV (structural variation)”. We agree that structural variation traditionally referred to genomic alterations larger than 1 kb in length. Although the application of sequencing technology has expanded the spectrum of structural variation to include smaller events >50 bp in length (PMID: 21358748, PMID: 26432246), there is no common understanding of how to name genomic rearrangements >50 bp in length that arise through genome editing. We renamed the unexpected-mutation reads longer than approximately 50 bp as “Large rearrangements (LAR)”. We changed the description of the names of the reads that DAJIN annotates in the Methods (Page 6, Line 205) and Results sections (Page 8, Line 249), as well as in all other parts of the manuscript.

      Reviewer 2_Comment #4:

      In Figure 2, IGV exhibits several SNVs (i.e., random errors) in each query sequence, which might be due to the low accuracy of Nanopore sequencing. I understand that DAJIN makes a consensus sequence based on those long-read sequences. But I wonder how DAJIN pinpoints the point mutation (PM) so exactly?

      Thank you for pointing this out. As you mentioned, the low accuracy of Nanopore long-read sequencing made PM detection difficult. We tackled the issue and partly solved it by (i) calculating a MIDS score (Fig. S7), (ii) reducing the dimensionality of the data by principal component analysis (PCA), and (iii) setting proper parameters for HDBSCAN.

      DAJIN converts ACGT nucleotide information to MIDS (Match, Insertion, Deletion, and Substitution) (Fig. S6). Subsequently, DAJIN subtracts the relative frequency of MIDS in a control from that in a sample. We called the subtracted relative frequency the 'MIDS score' (Fig. S7). The subtraction mitigates sequencing errors because the error patterns are similar between a sample and a control. We next perform clustering using the MIDS score. DAJIN compresses the score by PCA and extracts five dimensions. The dimension reduction may help to mitigate sequencing errors because sequencing errors have lower scores than true mutations. Subsequently, DAJIN performs HDBSCAN, a density-based clustering method. HDBSCAN has a parameter, 'min_cluster_size', that sets the minimum number of samples in a cluster. DAJIN searches values of this parameter in the range of 10% to 40% of reads and selects the value that returns the most frequent number of clusters. This means that DAJIN ignores minor clusters that contain less than 10% of reads. We set this criterion because sequencing errors often produced such minor clusters.
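
      As an illustration of the subtraction step described above, the following is a minimal sketch, not DAJIN's actual code. It assumes each read has already been converted to a string over the MIDS alphabet with one symbol per reference position; the function names `mids_frequencies` and `mids_score` and the toy reads are hypothetical.

```python
from collections import Counter

MIDS = "MIDS"  # Match, Insertion, Deletion, Substitution


def mids_frequencies(reads):
    """Per-position relative frequency of each MIDS symbol.

    Assumption: `reads` is a non-empty list of equal-length strings over
    M/I/D/S, one symbol per reference position.
    """
    n_reads = len(reads)
    length = len(reads[0])
    freqs = []
    for pos in range(length):
        counts = Counter(read[pos] for read in reads)
        freqs.append({s: counts.get(s, 0) / n_reads for s in MIDS})
    return freqs


def mids_score(sample_reads, control_reads):
    """Sample frequency minus control frequency, per position and symbol."""
    sample = mids_frequencies(sample_reads)
    control = mids_frequencies(control_reads)
    return [{s: sp[s] - cp[s] for s in MIDS} for sp, cp in zip(sample, control)]


# Toy data: position 1 carries a true substitution in the sample only;
# position 2 carries the same recurrent error in both sample and control.
control_reads = ["MMMM", "MMSM", "MMMM", "MDMM"]
sample_reads = ["MSMM", "MSSM", "MSMM", "MSMM"]

score = mids_score(sample_reads, control_reads)
print(score[1]["S"])  # substitution enriched in the sample
print(score[2]["S"])  # shared error cancels out
```

      In this toy case the error shared by sample and control at position 2 cancels to zero, while the sample-specific substitution at position 1 keeps a high score, which is the effect the response attributes to the subtraction.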

      In summary, we consider the MIDS score, PCA and the parameter setting of HDBSCAN support DAJIN's accurate PM detection. To clarify the point, we updated the description in the Methods section (Page 7, Line 217-225).
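
      The parameter search described in this response could be sketched roughly as follows. This is an assumption-laden illustration rather than DAJIN's implementation: `count_clusters` is a hypothetical callable that would run a density-based clustering (for example HDBSCAN) with the given `min_cluster_size` and return the number of clusters found, and the number of candidate values (`steps`) and the tie-breaking rule (smallest qualifying value) are choices made here, not taken from the manuscript.

```python
from collections import Counter


def choose_min_cluster_size(n_reads, count_clusters, steps=7):
    """Pick a min_cluster_size between 10% and 40% of the reads.

    `count_clusters(size)` is a hypothetical callable returning how many
    clusters a clustering run yields for that min_cluster_size.
    """
    lo = max(2, n_reads * 10 // 100)   # 10% of reads
    hi = max(lo, n_reads * 40 // 100)  # 40% of reads
    step = max(1, (hi - lo) // (steps - 1))
    candidates = list(range(lo, hi + 1, step))

    # Number of clusters obtained for each candidate value.
    results = {size: count_clusters(size) for size in candidates}

    # The cluster count that occurs most often across the candidates.
    most_common_k, _ = Counter(results.values()).most_common(1)[0]

    # Assumption: take the smallest candidate yielding that count.
    chosen = min(size for size, k in results.items() if k == most_common_k)
    return chosen, most_common_k


# Stand-in for a real clustering run: small thresholds split off an extra
# minor cluster, larger thresholds do not.
def toy_count_clusters(size):
    return 3 if size < 15 else 2


chosen_size, n_clusters = choose_min_cluster_size(100, toy_count_clusters)
print(chosen_size, n_clusters)
```

      Because the lower bound of the search is 10% of the reads, any group smaller than that can never form its own cluster, which matches the statement that minor clusters below 10% are ignored.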

      Reviewer 2_Comment #5:

      On page 9, the authors also used next-generation sequencing (NGS). I guess this NGS indicates Illumina-based short-read sequencing. A clearer definition is necessary here.

      We thank the referee for bringing this lack of clarity to our attention. Following the reviewer's comment, we changed 'NGS' to 'Illumina-based short-read next-generation sequencing' or 'short-read NGS' throughout the text.

      Reviewer 2_Comment #5-1:

      Whereas DAJIN could report SVs, PM, and WT, the NGS could not capture SVs. Could you write the reason here? I guess that the short-read sequences including SVs might be discarded during the alignment process, which means that it is because of software limitation, rather than the NGS itself.

      Thank you for pointing this out. In this study, we performed the short-read NGS analysis by paired-end sequencing (2 x 151 bases) of PCR amplicons about 200 bp in length. We consider that the main reason NGS could not capture LAR (formerly labelled as “SV”) lies in the PCR process. Allele 2 in BC20, BC25, and BC26 of the Tyr point mutation had a large deletion that included the primer annealing sites, which makes it impossible to obtain a PCR amplicon of this allele. In addition, allele 1 in BC25 had insertions of about 60-70 bp. The insertion might make it difficult to amplify the whole length of this allele because of the limited number of cycles in short-read NGS.

      To examine whether the short-read sequencing reads were discarded during the alignment process, we calculated the mapping percentages of BC20, BC25, and BC26 and found that 97-99% of reads were successfully aligned to the mm10 reference genome. We think this result can support our hypothesis. We added the results in Table S10 and described the points in the Results section (Page 10, Line 329-332).

      Reviewer 2_Comment #6:

      Basically, DAJIN amplifies the target region using PCR, thus PCR bias (e.g. unequal amplification according to different lengths) should be considered. The authors should address it. Moreover, it is better to describe the limitations of the current DAJIN in the discussion section.

      Thank you very much for your comments. PCR amplification of genomic DNA is essential in our method described in the manuscript. As we have described in a paragraph in the Discussion section (Page 17, Line 597-601), we recognize that PCR bias is an unavoidable limitation. We also cannot exclude the possibility that large rearrangements (‘LAR’, formerly labeled as ‘SV’) include alleles generated through PCR and/or sequencing error. ‘Pseudo-LoxP’ alleles could be generated if PCR products that included LoxP on one side but not the other acted as PCR primers and annealed to the WT allele in the next PCR step (Page 17, Line 608-613). We recognize that minor fractions of the ‘LAR’ alleles, including those observed in WT mice, are composed of reads with a high sequencing error rate. Recently developed methods, including the one you kindly mentioned in comment #1, may address these limitations. We added a description to the Discussion section (Page 17-18, Line 615-618).

      Reviewer #2 Significance:

      Overall, the topic will be interesting for broad readers

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      General comments

      CRISPR nucleases typically generate DNA double strand breaks (DSBs) at target site, which typically generate small insertion and deletion (indel) enabling precise gene knockout or knock-in. However, accompanied DNA DSBs often induce unwanted large deletions or chromosomal translocation. Thus, to assess such large variations as well as small indels is crucial in the genome editing field. In this manuscript, the authors developed a long-range assessment tool, named Determine Allele mutations and Judge Intended genotype by Nanopore sequencer (DAJIN), using a long-read sequencer, Nanopore sequencing. Overall, the topic will be interesting for broad readers and this tool looks technologically sound. I would suggest a few comments that may strengthen this manuscript, as follows.

      Specific Comments:

      1. Another key study is missed in this manuscript. Recently, a tool with similar concept to DAJIN was published in Nat Methods, which uses also long-read sequencers, Nanopore or PacBio [PMID: 33432244]. It is necessary to describe the benefits of DAJIN against the previous study.
      2. In Figure 1a, the authors used Barcoding but detailed information is not present in the main text. The length and context information should be described in the main text.
      3. The term "SV (structural variation)" over "Single-nucleotide variant (SNV)" seems ambiguous. Does "SV" include large deletion and chromosomal translocation? In this manuscript, I guess that SNV indicates small indels, whereas SV indicates large indels. The detailed definition is needed for better understanding.
      4. In Figure 2, IGV exhibits several SNVs (i.e., random errors) in each query sequence, which might be due to the low accuracy of Nanopore sequencing. I understand that DAJIN makes a consensus sequence based on those long-read sequences. But I wonder how DAJIN pinpoints the point mutation (PM) so exactly?
      5. On page 9, the authors also used next-generation sequencing (NGS). I guess this NGS indicates Illumina-based short-read sequencing. A clearer definition is necessary here.
        • 5-1. Whereas DAJIN could report SVs, PM, and WT, the NGS could not capture SVs. Could you write the reason here? I guess that the short-read sequences including SVs might be discarded during the alignment process, which means that it is because of software limitation, rather than the NGS itself.
      6. Basically, DAJIN amplifies the target region using PCR, thus PCR bias (e.g. unequal amplification according to different lengths) should be considered. The authors should address it. Moreover, it is better to describe the limitations of the current DAJIN in the discussion section.

      Significance

      Overall, the topic will be interesting for broad readers

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      The study by Akihiro and colleagues describes the generation of a multiplex genotyping method for detecting CRISPR gene editing alleles using nanopore sequencing and a machine learning program. The method is based on long-range PCR amplification of intended targeted loci from gene edited animals followed by nanopore sequencing. A PCR-index is introduced to the sample pooling system before sequencing, thus allowing sequencing of up to 100 samples in one flowcell. The study developed a machine learning program for allele binning, analysis, and presentation. To demonstrate the applicability of the method, the study has validated the method for detection of point mutations, deletion, and flox insertion. The study has in principle provided sufficient investigation and data to demonstrate the validity of the method. All the figures are very nicely and clearly presented. However, there are a few concerns that should be taken into consideration.

      1. Many previously reported unintended structural variations caused by CRISPR off-targets are typically much larger deletion/insertion/insertion/translocation events that occurred outside the target sites. The current study is more for targeted allele genotyping. The use of the term structural variation (SV) throughout the study should be thoroughly reconsidered.

      SV typically refers to genomic variation of approximately 1 kb and above. What the study describes is still within indel types instead. Thus, comparing DAJIN with NanoSV and Sniffles on reads with 50, 100 and 200 base deletions is not appropriate.

      The detection of SV alleles in the whole study is most likely a cause of minor indel alleles and sequencing errors. Figure 2b, BC32, WT mice also contains a proportion of SV allele, which is apparently caused by sequencing error. Such SV which is not related to CRISPR gene editing is also seen in other genotyping results e.g. Figure 3a. Figure 4b, Figure 5c, Figure 6b.

      Another co-factor that contributes to the SVs is the PCR-error from the method.

      2. The reason that the current method detects more than two alleles from one animal is probably the chimerism of the animal. However, when looking at the BAM file and figures presented in Figure 1b, 2c, 3b, 3d, 4c, as well as those in the Supplementary figures, there seems to be more than one allele (indel reads of different sizes) presented in one category.

      For example, in Figure 2C, mouse BC12, the reads are not fully aligned between all alleles and the allele 1 and allele 2 presented. For allele 1, which is called SV, there are reads with different sizes of indels. For allele 2, which is called intended PM, some reads are a hybrid of deletion and intended substitution.

      3. What is the advantage of the current method as compared to the one reported previously by Bi et al., 2020, Genome Biology?

      4. The reported machine learning method is developed for calling the different alleles. Have the authors compared DAJIN with e.g. NanoCaller, which is developed for calling SNPs and small indels based on DNN?

      5. Apart from genotyping, many CRISPR studies performed in cells are focusing on profiling the indel profiles in a pool of edited cells. It would broaden the applicability of the method for detecting different indels types in such samples and conditions. Current methods, such as TIDE/ICE, NGS-based amplicon sequencing, IDAA can only detect smaller indels. DAJIN will add the advantage of detecting longer indels for such application.

      Significance

      Although similar methods have been reported for genotyping CRISPR editing outcomes, the current study introduces PCR barcoding and, in particular, a bioinformatic toolbox for allele binning and calculation, contributing a useful tool to the field. The study has demonstrated multiple applications, showing its broad applicability.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: #RC-2021-00992

      Corresponding author(s): Parisa Kakanj and Maria Leptin

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this study, the authors use the fruit fly as a model to understand the role and regulation of autophagy in epidermal integrity during development and wound healing. They discover that hyper activation of autophagy via overexpression of Atg1 leads to disruption of epithelial organization, junctional protein localization, and syncytium formation. In addition, these epidermal defects were found to be dependent on TORC1 as knockdown or inhibition of TORC1 antagonists resulted in similar epidermal defects which could be rescued by knockdown of Atg1 or Atg5. Wound healing in fruit fly epidermis is known to induce cell fusion and here the authors show that syncytium formation is dependent on autophagy. GFP-Atg8a autophagosomes were found to accumulate in cells adjacent to the wound site, but Atg1-induced syncytium formation was dispensable for wound repair. However, the authors found that hyper activation of autophagy prior to injury slowed wound closure. This may be due to defects in actomyosin organization or another developmental defect the authors observed in the epidermis. Overall, the key conclusions of this study are convincing, but the experiments would be strengthened by validation of all the RNAi strains used as well as demonstration that epidermal barrier remains intact as described.

      **Major Comments**

      1. This study uses a number of UAS-RNAi strains as well as dominant negative and overexpression transgenes. There is no validation that these genetic perturbations work as expected.

        Almost all of the lines we use have been extensively used and validated by others as documented in the literature. We append a table (below, page 14) with these references. It would be close to impossible for us to show their tissue specific efficacy in the larval epidermis because it is extremely difficult to obtain clean dissections of epidermis without contamination from other tissues (muscles, nerves, etc.), and we believe we can rely on the known validation of most of the lines. It is true that some of the lines are less well characterised, and we comment on those below, and will eliminate our speculation on their effects in the manuscript.

      In fact, the authors state on pg 5 that RNAi to Atg6, Atg7, and Atg12 may be less effective, but do not verify the knockdown efficiency for the gene of interest (i.e. that Atg5 RNAi knocks down Atg5 transcript or protein).

      The Atg12 and Atg7 RNAi lines have been shown (PMID: 25882046) by quantitative RT-PCR to effectively reduce RNA levels in the midgut during the larval to pupal transition. We will therefore have to eliminate our speculation that the weak effect in the epidermis may be due to ineffective knock-down. Rather, it seems that these components are accessory but not necessarily essential for the completion of autophagy, as also observed by others (PMID: 25882046, PMID: 1805642, PMID: 23599123, PMID: 15296714, PMID: 23873149, PMID: 23406899).

      This is particularly important as authors use a single UAS-rictor RNAi strain to conclude that autophagy is dependent on TORC1 and not TORC2. If rictor RNAi is also weak or ineffective then this conclusion would be erroneous.

      The function of rictor has been validated by classic genetics: Animals homozygous for deletions of rictor show no defects throughout their normal life cycle (Hietakangas and Cohen, 2007). We have also shown that epidermis of homozygous rictor∆1 larvae (marked with Src-GFP, DsNuc-Red2) shows no abnormalities in cell shapes or cell membranes. We include an image here.

      Figure A | Effect of rictor deletion on the epidermis. a,b, Fluorescence micrographs of larval epidermis expressing the indicated markers in a larva homozygous for a rictor deletion (rictorEY08986, also named rictor∆1). a, Lower magnification showing the entire width of larval segments A3 or A4. n=16-18 larvae each genotype. Scale bars: a, 50 μm; b, 20 μm.

      A major conclusion of this study is that autophagy remodels the lateral cell membranes and not the basal or apical, so the membrane integrity remains intact. This is described and shown in Fig S3a, but it is hard to see that the apical membrane is intact. It would be helpful if authors could show a true membrane marker, such as UAS-CD8mGFP to see if there is a continuous membrane.

      We will include new experiments with this marker.

      Alternatively, is there a barrier assay that could help demonstrate that syncytium formation does not disrupt epithelial integrity?

      This follows from the fluorescence recovery we performed (Supplementary Video 13), where we observe rapid diffusion between areas in the epidermis, but never any leakage of fluorescence in the y-axis into the body cavity. We will emphasize this more clearly in the text.

      This could be performed in the fly gut, using the smurf assay (Rera M et al. 2011), since the authors also describe (pg 9) a similar role for autophagy in disruption of epithelial lateral membranes.

      We had done a smurf assay, and observed no leakage from the gut, but didn’t document this at the time because of difficulties during the period of Covid restrictions of accessing a dissecting scope/camera set up in a lab outside our own. We will try to repeat this now in the hope that with current reduced restrictions we can record the result.

      Is autophagy dependent syncytium formation cell autonomous?

      Our clonal analysis in wound healing addresses this point (Figure 2; text page 5 and 6). Clones of GFP-expressing cells neighbouring a wound share their cytoplasmic contents with their neighbours during wound closure. However, a clonal cell that is Atg5-deficient in a wild-type background does not share its content with the neighbouring cells. This shows that, to participate in syncytium formation, each cell must itself be competent to perform autophagy. We will expand the explanation of this point in the text.

      The A58-Gal4 is not cell-type specific as authors describe (pg 9) similar effects in trachea, salivary glands, and intestine and it is unclear if effects are due to disruption of autophagy in epidermal cells or general disruption in fly's physiology. The authors should determine, using a more restrictive Gal4 driver, whether syncytium formation is due to activation of autophagy in the epidermal cells or another cell type (trachea, salivary glands, or intestine).

      We apologize if our phrasing of ‘ectodermal’ led to the impression that A58-Gal4 is cell-type specific. A58 also drives expression in the tracheal system, as all other available epidermal drivers do. A58 expression in the salivary gland is presumably due to the origin of the Gal4 construct, which like many other Gal4 drivers (e.g. NP1-Gal4) includes salivary gland specific enhancers (PMID: 8223268 and PMID: 12324947). A58 is not active in the gut, and for the experiments in the gut we used the NP1 driver. We will rephrase the text in the paper to avoid confusion. There is no driver that is absolutely restricted to the epidermis.

      Alternatively, if no other Gal4 is available for the larval epidermis then authors could at least show using enterocytes driver (NP1-Gal4) that overexpression of Atg1 is sufficient to induce syncytium formation and its effect on gut barrier integrity.

      We did do this experiment but didn’t include the images because of the large number of figures we already had. We now show them here. As mentioned above, barrier integrity is not compromised. We can also provide images of the phenotype in tracheal cells.

      Figure B | Effect of uncontrolled autophagy on enterocytes and salivary glands. Larval gut or salivary glands expressing the indicated markers and overexpression (Tsc1,2 or Atg1S) or RNAi (raptori) constructs using the NP1-Gal4 driver. Images are from live imaging of gut or salivary gland of 6 to 11 larvae for each genotype. Scale bars, 20 μm.

      In Fig 8, authors nicely show that Atg1 RNAi can rescue Tor RNAi and raptor RNAi, but, what about the reverse. Is overexpression of Tor sufficient to inhibit the overexpression Atg1 and reduce autophagy-induced syncytium formation?

      Overexpression of Tor would affect both TORC1 and TORC2. We have done this experiment using a UAS-Torwt construct but found that it leads to excessive autophagy rather than suppression, consistent with similar results reported by others (PMID: 12324961 and PMID: 15186745). This approach can therefore not be used for the proposed experiment. Instead, one could use downregulation of the Tor inhibitor TSC1, which acts on TORC1 and which we have shown to reduce autophagosome formation in wound healing (Fig. 1d). Another option is to overexpress the TORC1-specific activator Rheb (PMID: 12893813, PMID: 17208179 and PMID: 31422886). We will set up the experiments with these constructs in the hope that they will yield interpretable results.

      **Minor comments:**

      1. Check spelling of abbreviations, Sqh is often misspelled Shq in figures

        We will correct them. Thanks for alerting us.

      The order of images in Figure 3 should match the description in the text (pg. 6).

      We would prefer to retain the current order because it is then consistent with all the other figures. Re-writing the text to reflect this order would make it less clear.

      Atg1W is described in text, but not shown in Fig 3a-c. Also, upstream activators of TORC1 are described first, but shown last in this Figure making it difficult to follow.

      We will now only mention Atg1W later in the text where we also show it in a figure.

      Fig7a should show junctional effect of Atg1W alone and in combination with Atg5i which is used in 7b.

      We had left this out to save space, but we will now include these data.

      It is unclear why authors switched to this weak overexpression for this photobleaching assay when Atg1S was predominantly used in the rest of the study.

      The reason we used Atg1W in this particular experiment is that we had it on a chromosome where it was recombined with GFP, which made it genetically much easier to use for FLIP experiments. However, perhaps these constructs merit some discussion. Atg1W and Atg1S were originally called “weak” and “strong” based on studies in other tissues and other stages (PMID: 33253201). However, we found that in the epidermis their effects are practically indistinguishable, as judged by TEM (Fig. 3d,e) (Fig. 5e,f) (Suppl. Fig. 5a,b and Suppl. Fig. 6b,c) and by all markers we used in confocal analyses (which we will include). Thus, to avoid confusion, we will change the nomenclature we use in our paper to the neutral Atg1GS and Atg16B.

      Reviewer #1 (Significance (Required)):

      This study elucidates the role and regulation of TORC1 and autophagy in epithelial membrane remodeling. This is important work that is significant to both developmental and wound healing research. Many cell types become multinucleate during differentiation, aging, and wound healing and here the authors find a novel role for autophagy in remodeling lateral cellular junctions to facilitate syncytium formation.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In their present manuscript Kakanj and colleagues show that during epithelial wound healing the autophagy pathway controls plasma membrane integrity and homeostasis. Furthermore, elevated autophagic activity is sufficient to induce syncytium formation, which is essential for wound closure and healing. Authors used the epidermis of fruit fly larvae as a model to study wound healing and video microscopy to examine this process. The methodology is well established, since authors already published a related study in 2016 using similar tools.

      The findings presented here are interesting and promising, the quality of most experiments is satisfactory, the confocal images/videos are excellent and I truly appreciate that authors used electron microscopy to support some of their claims. Findings are well presented and the text is well written and easy to read.

      Overall, my opinion is very positive about this manuscript.

      I believe most of the findings are very well supported, but I have some suggestions, which may help strengthen the authors' points.

      1) Authors used the GFP-Atg8a reporter to follow autophagy during wound healing. While I also believe that the GFP-Atg8a dots appearing after wounding represent autophagic vesicles, GFP-Atg8a has certain limitations. First: Atg8a (or LC3 in mammals) is removed from the outer surface of autophagosomes by Atg4, and the Atg8a trapped inside the autophagosomes will be degraded in the autolysosomal lumen. Thus Atg8a does not always localize to autolysosomes; in fact, Atg8a immunostaining mostly labels autophagosomes (and phagophores) but not autolysosomes in insect cells. Accordingly, the GFP-Atg8a reporter is also subject to autolysosomal degradation, and furthermore most of the GFP signal is quenched in the acidic lumen of autolysosomes, since GFP loses fluorescence at lower pH. In addition, if lysosomal degradation proceeds normally, GFP-Atg8 will be degraded completely. Thus, some of the autolysosomes cannot be detected using this reporter. For this purpose, mCherry-Atg8a reporters can be used, since mCherry is more resistant than GFP and thus accumulates inside lysosomes and retains its fluorescence in acidic environments.

      This is a good suggestion and we had done these experiments. However, the red fluorophores have a serious problem in that they all tend to form small aggregates or puncta. This does not happen in all tissues or at all stages, but it is a very widespread phenomenon and is even observed in in vitro experiments (own observations). This makes quantification of vesicles or other small structures such as autophagosomes completely impossible. Nevertheless, here are a few figures from our analyses. While some of the spots clearly appear to be autophagosomes, as judged by their positions, they cannot be objectively distinguished from the other spots.

      Figure C | Autophagy during epidermal wound healing. Time-lapse series of single-cell wound healing in larva expressing mCherry-Atg8a (black) to mark autophagosomes and autolysosomes (A58>mCherry-Atg8a). a, z-projections of a time-lapse series. b, Higher magnification of the areas marked by magenta boxes in (a). n=11 larvae. Each frame is a merge of 57 planes spaced 0.28 μm apart. Scale bars: a, 20 μm; b, 10 μm.

      However, I still believe that for video microscopy GFP-Atg8a was a perfect choice. I just suggest confirming the appearance of autophagosomes after wounding by other means: for instance, immunostaining of the epidermis against Atg8a after wounding (120 min) should confirm the presence of autophagosomes. There are a few specific antibodies available that work in flies, which are listed in the reviews of Nagy (PMID: 25481477) or more recently Lorincz (PMID: 28704946).

      This is technically a huge challenge. We would have to induce a single cell wound, then filet and fix the epidermis, during which it rolls up and often destroys the area of interest. If it doesn’t, then the prep can be flattened out, but it still can be very difficult to find the wound in the prep.

      2) One of the major claims of the authors is that elevated autophagy leads to the breakdown or removal of lateral plasma membranes to promote syncytium formation. It is clearly seen on the confocal or EM images that lateral membranes disappear after wounding. However, it is also suggested that the lateral plasma membrane material is incorporated into autophagosomes or that the plasma membrane is a potential membrane source for autophagosome formation. I believe this is the least supported claim of the manuscript since no direct evidence for this is presented. It is based on BODIPY staining only, namely that BODIPY-positive vesicles accumulate inside the cells. If this is indeed the case, plasma membrane components should be detected in autophagic vesicles. Thus, I recommend co-staining membrane components with autophagic markers.

      This is indeed the clear next step, and we did a number of experiments along those lines, but they were once again compromised by the problem with the mCherry aggregates. This made the interpretation in the unwounded epidermis with artificially upregulated autophagy impossible. However, experiments with naturally upregulated autophagy in wound healing yielded results that are consistent with plasma membrane components being associated with autophagosomes (with the caveat that not every red dot we see represents an autophagosome). We have just repeated some of these using the septate junction marker FasIII and have obtained some beautiful movies that show FasIII labelled membrane (green) being surrounded by mCherry spots, and as the membrane begins to dissociate, the mCherry spots turn from red to yellow. We have included stills from results of these analyses here and will include them in a new figure in the revised manuscript.

      Figure D | Colocalization of Atg8a and the septate junction component FasIII during epidermal wound healing. a, Time-lapse series of single-cell wound healing in a larva expressing mCherry-Atg8a (red) (A58>mCherry-Atg8a) and endogenously tagged FasIII (GFP gene trap; green), a transmembrane component of septate junctions. b, Higher magnification of the time-lapse marked by magenta boxes in (a). n=11 larvae. a,b, Each frame is a merge of 68 planes spaced 0.28 μm apart. Scale bars: a,b, 20 μm.

      However, if the authors observe no colocalization of plasma membrane components with autophagy markers, I still believe this study is worth publishing. I would like to recommend the review of Ungermann and Reggiori (PMID: 29966469) in which the trafficking of Atg9 is discussed,

      Yes, indeed. And there is in fact now a further paper that goes in a similar direction (PMID: 34257406). We had left this out because we did not have direct data on Atg9, but will be happy to include it in the discussion in which we cite the paper that shows that Drosophila Atg9 is localized on the lateral plasma membrane in nurse cells, and loss of it leads to syncytium formation.

      since the source of autophagosomal Atg9 is in part the plasma membrane in mammalian cells. Therefore, these findings may strengthen the authors' claims.

      **Minor points:**

      Figure 2A: I believe authors wanted to use the word 'maintaining' not mating in their scheme.

      Indeed. Thanks for alerting us.

      Discussion: Authors suggest that another function of autophagy in the cells surrounding the wound may be to clear up debris, as in planarian and other cell types autophagy is activated in healthy cells, which simultaneously phagocytose cell debris. Honestly, I do not believe that this is the case here. Some of the Atg proteins are indeed required for phagocytosis during LC3-associated phagocytosis (LAP) (see: PMID: 30787029), but LAP is independent from Atg1

      Good point, we will include this in the discussion.

      and if LAP happened in the cells surrounding the wound, then GFP-Atg8a-positive phagosomes would appear in those cells. However, it is clearly not the case here.

      Reviewer #2 (Significance (Required)):

      I highly recommend this manuscript to be uploaded to a relevant journal and I believe the findings presented here will be interesting for biologists specialized in regeneration and readers from the autophagy fields alike.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The larval epidermis of Drosophila is a prime model for studying wound healing by combining live imaging with cellular, genetic and molecular analysis of the processes involved. Autophagy is known to be activated and necessary for efficient wound healing in animal models through secretion of cytokines and clearance of bacteria. This manuscript implicates autophagy in cellular syncytium formation during wound healing. Live imaging demonstrates autophagy activation in cells surrounding the wound. Inhibition of autophagy by RNAi against atg1 or atg5, required for autophagy initiation and autophagosome formation, had no effect on the rate of constriction and closing of the wound site. However, elegant live imaging demonstrates that autophagy is required cell autonomously for cell fusion, a normal process during wound healing in flies. Autophagy can also be instructive for cell fusion. Strong induction of autophagy by TORC1 inhibition, TSC1/2 overexpression or Atg1 overexpression induces cell fusion that is genetically dependent on atg5, a gene acting downstream of atg1 in autophagosome formation. As treatment with chloroquine, a chemical inhibiting autophagosome fusion to the lysosome and lysosomal breakdown, showed no effect, the authors suggest that later steps of autophagy are not involved. Live imaging with a selection of cellular fluorescently tagged markers of apical, lateral and basolateral membrane domains, combined with electron microscopy, shows clearly that lateral membranes are disrupted and removed within the epithelium. During this process, large membranous vesicles "drift" away from the plasma membrane. Whether these vesicles relate to autophagy is not addressed. In addition to the effect on cell fusion, strong autophagy induction also leads to autophagy within the nucleus, chromatin condensation and distortion of the nuclear membrane. The manuscript is well written and easy to follow. Figure panels and data are clearly presented. All experiments are well described throughout and skillfully executed with appropriate controls and statistical analysis. It remains unknown what induces autophagy in response to wounding. It also remains unclear whether autophagy deconstructs or engulfs parts of the plasma membrane, or if parts of the autophagy machinery have additional roles in plasma membrane fusion.

      **Major comments:**

      • Are the key conclusions convincing? -Conclusions are generally balanced and convincing.

      -I have seldom seen a paper so well written, presented and balanced by first pass. Hence my experimental suggestions are few.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? -Claims are well founded.
      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary to evaluate the paper as it is, and do not ask authors to open new lines of experimentation.

        -The inhibition of autophagy is performed using knockdown of two genes: atg1, part of the ULK1 kinase complex acting in autophagy initiation, and atg5, required for autophagosome formation. Genes acting later in the autophagy process, such as in autophagosome closure, fusion with the lysosome or degradation, were not analyzed. In the abstract, the authors state "Proper functioning of TORC1 is needed to prevent autophagy from destroying the larval epidermis which depends on membrane isolation and phagophore expansion, but not fusion of autophagosomes to lysosomes". As far as I can see, the last statement on fusion derives from experiments with Chloroquine. Although frequently used for qualitative experiments, CQ is not suited for conclusive experiments. Without genetic experiments targeting genes for autophagosome-lysosome fusion such as snap29, stx17 or vamp7, this statement is in my mind not well supported.

      We agree this would strengthen our findings, and we had indeed ordered these strains from the Bloomington stock collection. However, they were dead on arrival and both our labs in Heidelberg and Cologne currently have major problems with shipments from Bloomington and German customs. Other colleagues whom we asked did not have them available either. We will continue to search for appropriate constructs, but even if we find them and they arrive alive, and are processed by customs within a reasonable time, it will take many weeks to establish and then expand them and subsequently do the multi-generation crosses to obtain the stocks with all the relevant drivers and markers to set up the experiment. Three months is the absolute lower limit provided everything works according to plan, and first time round 6 months is a more realistic assumption. We hope that the referees and the editors agree that while this is a desirable experiment, it is not essential for the publication of the other results we present.

      • Are the suggested experiments realistic for the authors? It would help if you could add an estimated cost and time investment for substantial experiments. -Given the expertise of the authors, these experiments should be easy to perform within 3 months.

        • Are the data and the methods presented in such a way that they can be reproduced?

        • The manuscript is well written and an excellent example of how methods and experiments should be presented. Methods, tools and experiments are all well described.

        • Are the experiments adequately replicated and statistical analysis adequate? -Replicates and statistics are adequate and customary for the type of analysis performed.

        **Minor comments:**

      • Specific experimental issues that are easily addressable. Figure 3 h. The live imaging documents the striking disappearance of lateral cell membranes using SRC-GFP. In 3h, large vesicle formation and movement towards the cell interior is shown. How frequent is this?

      This can only be seen clearly in experiments with time-controlled (Gal80ts) induction of autophagy, where we can observe the process unfolding. We see these structures very frequently, but there is great variability in their morphology, and the structures are not always captured clearly in the plane of imaging. We provide further examples here.

      Figure E | Autophagy in unwounded epidermis. a-c, Three additional examples showing apparent extrusions from lateral membranes after induction of autophagy (same experiment as in Figure 3h). Time-lapse series of epidermal cells expressing Src-GFP and Atg1S. Transgene expression is induced at the end of the second larval instar, live imaging started 6 h later (t=0) and continued for an additional 6 hours. a-c, Src-GFP containing material appears to be taken out of and eventually detached from lateral cell membranes (arrows).

      Is this believed to be the mechanism of lateral membrane removal?

      We would of course like to believe that, but we have no proof, and would therefore only be able to speculate.

      If so, is it dependent on the autophagy machinery? Are these vesicles positive for autophagy markers?

      Some autophagy markers have indeed been reported to be associated with the plasma membrane (e.g. Atg9, Atg16), but a conclusive study, while highly desirable, in our view goes beyond the scope of this study.

      Resolving this issue may lift the conclusions of the paper. Using 3xCherry-Atg8 together with SRC-GFP, this should be possible.

      We are intrigued by this suggestion and will be setting up the necessary crosses to do the experiments. However, it will take a long time to generate the necessary stocks (see genetics described below), and we will then again encounter the problem with the mCherry aggregates (see response to referee #2). We are curious about the outcome, but we do not think it will be reasonable to promise as part of this revision that we will be able to provide conclusive results in the foreseeable future. Along with the many other things to do, this may just have to become part of a future paper, especially if there turn out to be other problems to be solved along the way, for example having to make an infrared Atg8 construct (such as mIFP or mKate, with which we have had much better experience in other contexts).

      Using CQ, the authors should be able to detect plasma membrane and junctional components in autophagosomes or autolysosomes (by confocal and electron microscopy) as degradation is inhibited. This should help to distinguish whether lateral membranes are engulfed and digested or if cells simply fuse by using a part of the autophagy machinery.

      We have many interesting EM images on which we have had extensive discussions with Paolo Ronchi and Yannick Schwab at the EMBL (whom we embarrassingly forgot to acknowledge in our manuscript, which will now be corrected), and one of the authors of this paper (BM) is an expert in EM imaging of the larval epidermis. It was agreed that some structures could indeed be interpreted as autophagosomes with content resembling junctional material. However, in the absence of absolute proof, we did not include them in the paper. We now show them here.

      Figure F | Autophagosomes with junctional material in unwounded epidermis. Transmission electron micrographs of sections through the epidermis of a larva with elevated autophagy (A58>Atg1S) at two different magnifications. Arrows mark the autophagosomal membrane with content resembling junctional material.

      The authors state that strong autophagy activation also leads to syncytium formation of tracheal cells, salivary glands and gut EC cells. Representative images in a supplementary figure would be useful for future reference.

      See response to other comments above (response to referees # 1). We have added some images in this document (Figure B) and will be happy to add additional ones in the revised manuscript.

      • Are prior studies referenced appropriately? -Yes. Key literature and findings are cited and discussed.

      • Are the text and figures clear and accurate? -Yes

        • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      -See suggested experiments above.

      Reviewer #3 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. -The findings clearly document a role of autophagy in syncytium formation in the physiological process of wounding. This has parallels to muscle syncytium formation, but has to my knowledge not been demonstrated in any other cell type to be performed by autophagy. Moreover, the authors show that strong autophagy induction can lead to fusion of epithelial cells. This may have relevance for processes and diseases where polyploidy is observed.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      • State what audience might be interested in and influenced by the reported findings. -The data are very strong and the demonstration that autophagy controls syncytium formation outside of muscle development is surprising and significant. It is of interest to the field of cell biology and development in general and the autophagy field in particular. It will also be of interest for the medical field that deals with multinuclear phenotypes, such as cancer.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. -Development, cell signaling, autophagy, vesicle trafficking.

      Table 2 | Fly stocks used in experiments

      | Transgenes | Stock ID | Source | Publications using this construct | Reference |
      |---|---|---|---|---|
      | UAS-GFP-Kuk (UAS-GFP-KukEY07696(w+)) | | Jörg Großhans | PMID: 16421189; https://flybase.org/reports/FBal0161312 | 29 |
      | UAS-Atg1i (UAS-Atg1RNAi) | V # 16133 (GD7149) | | PMID: 19363474; PMID: 31995752; PMID: 32032548; PMID: 32915229; https://flybase.org/reports/FBtp0034071.html | |
      | UAS-Atg5i (UAS-Atg5RNAi) | V # 104461 (KK108904) | | PMID: 31995752; PMID: 32032548; https://flybase.org/reports/FBtp0046851.html | |
      | UAS-Atg6i (UAS-Atg6RNAi) | V # 110197 (KK102460) | | PMID: 28581519; PMID: 23599123; PMID: 27542914; PMID: 25644700; Dissertation of Philipp Trachte, Abb. 23. https://refubium.fu-berlin.de/handle/fub188/27709; Dissertation of Sirena Soriano Rodríguez. https://roderic.uv.es/bitstream/handle/10550/50749/Tesis%20SSoriano.pdf?sequence=1 | |
      | UAS-Atg7i (UAS-Atg7RNAi) | V # 45558 (GD11671) | | PMID: 25882046; PMID: 31995752; PMID: 32032548; PMID: 23599123; https://flybase.org/reports/FBtp0025106.html | |
      | UAS-Atg12i (UAS-Atg12RNAi) | V # 29791 (GD15230) | | PMID: 25882046; PMID: 17568747; PMID: 31995752; https://flybase.org/reports/FBtp0027770.html | |
      | UAS-TSC1,2 (UAS-TSC1, UAS-TSC2) | | Iswar K. Hariharan | PMID: 15296714; PMID: 11348592 | 64 |
      | UAS-TSC1i (UAS-TSC1RNAi) | V # 22252 (GD11836) | | PMID: 23144631; PMID: 29144896; PMID: 29456138; https://flybase.org/reports/FBtp0025266.html | |
      | UAS-Tori (UAS-TorRNAi) | BL # 33951 | Norbert Perrimon | PMID: 25882046; PMID: 26395483; https://flybase.org/reports/FBtp0065159.html | 65 |
      | UAS-TORDN (UAS-TORTED) | BL # 7013 | Thomas P. Neufeld | PMID: 15296714; PMID: 29144896; https://flybase.org/reports/FBtp0016360.html | 66 |
      | UAS-raptori (UAS-raptorRNAi) | BL # 34814 | Norbert Perrimon | PMID: 25882046; PMID: 31048465; https://flybase.org/reports/FBtp0068814.html | 65 |
      | UAS-raptori-2 (UAS-raptorRNAi) | BL # 41912 | Norbert Perrimon | PMID: 32097403; https://flybase.org/reports/FBtp0081336.html | 65 |
      | UAS-rictori (UAS-rictorRNAi) | BL # 36699 | Norbert Perrimon | PMID: 25882046; https://flybase.org/reports/FBtp0070835.html | 65 |
      | UAS-Atg1S (UAS-Atg16B) | | Thomas P. Neufeld | PMID: 33253201; https://flybase.org/reports/FBtp0041043.html | 67 |
      | UAS-Atg1W, UAS-GFP (UAS-Atg1GS10797) | | Thomas P. Neufeld | PMID: 33253201; https://flybase.org/reports/FBal0216676.html | 67 |
      | UAS-S6Ki (UAS-S6KRNAi) | BL # 41895 | Norbert Perrimon | PMID: 25284370; https://flybase.org/reports/FBtp0080798.html | 65 |
      | UAS-SqaKA (UAS-SqaT279A/CyO) | | Guang-Chao Chen | PMID: 21169990; https://flybase.org/reports/FBtp0071419 | 30 |
      | UAS-RhoAi (UAS-RhoARNAi) | V # 12734 (GD4726) | | PMID: 23853710; PMID: 33789114; https://flybase.org/reports/FBtp0031970.html | |
      | UAS-Roki (UAS-RokRNAi) | V # 104675 (KK107802) | | PMID: 24995985; PMID: 33789114; https://flybase.org/reports/FBtp0046110.html | |
      | UAS-RhebAV4 | BL # 9690 | Fuyuhiko Tamanoi | PMID: 31909714; PMID: 28829944; https://flybase.org/reports/FBal0141561.html | 69 |

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The larval epidermis of Drosophila is a prime model for studying wound healing by combining live imaging with cellular, genetic and molecular analysis of the processes involved. Autophagy is known to be activated and necessary for efficient wound healing in animal models through secretion of cytokines and clearance of bacteria. This manuscript implicates autophagy in cellular syncytium formation during wound healing. Live imaging demonstrates autophagy activation in cells surrounding the wound. Inhibition of autophagy by RNAi against atg1 or atg5, required for autophagy initiation and autophagosome formation, had no effect on the rate of constriction and closing of the wound site. However, elegant live imaging demonstrates that autophagy is required cell autonomously for cell fusion, a normal process during wound healing in flies. Autophagy can also be instructive for cell fusion. Strong induction of autophagy by TORC1 inhibition, TSC1/2 overexpression or Atg1 overexpression induces cell fusion that is genetically dependent on atg5, a gene acting downstream of atg1 in autophagosome formation. As treatment with chloroquine, a chemical inhibiting autophagosome fusion to the lysosome and lysosomal breakdown, showed no effect, the authors suggest that later steps of autophagy are not involved. Live imaging with a selection of cellular fluorescently tagged markers of apical, lateral and basolateral membrane domains, combined with electron microscopy, shows clearly that lateral membranes are disrupted and removed within the epithelium. During this process, large membranous vesicles "drift" away from the plasma membrane. Whether these vesicles relate to autophagy is not addressed. In addition to the effect on cell fusion, strong autophagy induction also leads to autophagy within the nucleus, chromatin condensation and distortion of the nuclear membrane. The manuscript is well written and easy to follow. Figure panels and data are clearly presented. All experiments are well described throughout and skillfully executed with appropriate controls and statistical analysis. It remains unknown what induces autophagy in response to wounding. It also remains unclear whether autophagy deconstructs or engulfs parts of the plasma membrane, or if parts of the autophagy machinery have additional roles in plasma membrane fusion.

      Major comments:

      • Are the key conclusions convincing? -Conclusions are generally balanced and convincing. -I have seldom seen a paper so well written, presented and balanced by first pass. Hence my experimental suggestions are few.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? -Claims are well founded.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary to evaluate the paper as it is, and do not ask authors to open new lines of experimentation.

      -The inhibition of autophagy is performed using knockdown of two genes: atg1, part of the ULK1 kinase complex acting in autophagy initiation, and atg5, required for autophagosome formation. Genes acting later in the autophagy process, such as in autophagosome closure, fusion with the lysosome or degradation, were not analyzed. In the abstract, the authors state "Proper functioning of TORC1 is needed to prevent autophagy from destroying the larval epidermis which depends on membrane isolation and phagophore expansion, but not fusion of autophagosomes to lysosomes". As far as I can see, the last statement on fusion derives from experiments with Chloroquine. Although frequently used for qualitative experiments, CQ is not suited for conclusive experiments. Without genetic experiments targeting genes for autophagosome-lysosome fusion such as snap29, stx17 or vamp7, this statement is in my mind not well supported.

      • Are the suggested experiments realistic for the authors? It would help if you could add an estimated cost and time investment for substantial experiments. -Given the expertise of the authors, these experiments should be easy to perform within 3 months.

      • Are the data and the methods presented in such a way that they can be reproduced?

      • The manuscript is well written and an excellent example of how methods and experiments should be presented. Methods, tools and experiments are all well described.

      • Are the experiments adequately replicated and statistical analysis adequate? -Replicates and statistics are adequate and customary for the type of analysis performed.

      Minor comments:

      • Specific experimental issues that are easily addressable. Figure 3 h. The live imaging documents the striking disappearance of lateral cell membranes using SRC-GFP. In 3h, large vesicle formation and movement towards the cell interior is shown. How frequent is this? Is this believed to be the mechanism of lateral membrane removal? If so, is it dependent on the autophagy machinery? Are these vesicles positive for autophagy markers? Resolving this issue may lift the conclusions of the paper. Using 3xCherry-Atg8 together with SRC-GFP, this should be possible.

      Using CQ, the authors should be able to detect plasma membrane and junctional components in autophagosomes or autolysosomes (by confocal and electron microscopy) as degradation is inhibited. This should help to distinguish whether lateral membranes are engulfed and digested or if cells simply fuse by using a part of the autophagy machinery.

      The authors state that strong autophagy activation also leads to syncytium formation of tracheal cells, salivary glands and gut EC cells. Representative images in a supplementary figure would be useful for future reference.

      • Are prior studies referenced appropriately?

      -Yes. Key literature and findings are cited and discussed.

      • Are the text and figures clear and accurate?

      -Yes

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      -See suggested experiments above.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      -The findings clearly document a role of autophagy in syncytium formation in the physiological process of wounding. This has parallels to muscle syncytium formation, but has to my knowledge not been demonstrated in any other cell type to be performed by autophagy. Moreover, the authors show that strong autophagy induction can lead to fusion of epithelial cells. This may have relevance for processes and diseases where polyploidy is observed.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      • State what audience might be interested in and influenced by the reported findings. -The data are very strong and the demonstration that autophagy controls syncytium formation outside of muscle development is surprising and significant. It is of interest to the field of cell biology and development in general and the autophagy field in particular. It will also be of interest for the medical field that deals with multinuclear phenotypes, such as cancer.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      -Development, cell signaling, autophagy, vesicle trafficking.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      In their present manuscript Kakanj and colleagues show that during epithelial wound healing the autophagy pathway controls plasma membrane integrity and homeostasis. Furthermore, elevated autophagic activity is sufficient to induce syncytium formation, which is essential for wound closure and healing. The authors used the epidermis of fruit fly larvae as a model to study wound healing and video microscopy to examine this process. The methodology is well established, since the authors already published a related study in 2016 using similar tools.

      The findings presented here are interesting and promising, the quality of most experiments is satisfactory, the confocal images/videos are excellent and I truly appreciate that the authors used electron microscopy to support some of their claims. Findings are well presented and the text is well written and easy to read.

      Overall, my opinion is very positive about this manuscript.

      I believe most of the findings are very well supported, but I have some suggestions, which may help strengthen the authors' points.

      1) Authors used a GFP-Atg8a reporter to follow autophagy during wound healing. While I also believe that the GFP-Atg8a dots appearing after wounding represent autophagic vesicles, GFP-Atg8a has certain limitations. First, Atg8a (or LC3 in mammals) is removed from the outer surface of autophagosomes by Atg4, and the Atg8a trapped inside the autophagosomes will be degraded in the autolysosomal lumen. Thus, Atg8a does not always localize to autolysosomes; in fact, Atg8a immunostaining mostly labels autophagosomes (and phagophores) but not autolysosomes in insect cells. Accordingly, the GFP-Atg8a reporter is also subject to autolysosomal degradation, and furthermore most of the GFP signal is quenched in the acidic lumen of autolysosomes, since GFP loses fluorescence at lower pH. In any case, if lysosomal degradation proceeds normally, GFP-Atg8 will be degraded completely. Thus, some of the autolysosomes cannot be detected using this reporter. For this purpose mCherry-Atg8a reporters can be used, since mCherry is more resistant than GFP and thus accumulates inside lysosomes and retains its fluorescence in acidic environments. However, I still believe that for video microscopy GFP-Atg8a was a perfect choice. I just suggest confirming the appearance of autophagosomes after wounding by other means: for instance, immunostaining of the epidermis after wounding (120 min) against Atg8a should confirm the presence of autophagosomes. There are a few specific antibodies available that work in flies, which are listed in the reviews of Nagy (PMID: 25481477) or more recently in Lorincz (PMID: 28704946).

      2) One of the major claims of the authors is that elevated autophagy leads to the breakdown or removal of lateral plasma membranes to promote syncytium formation. It is clearly seen on the confocal or EM images that lateral membranes disappear after wounding. However, it is also suggested that the lateral plasma membrane material is incorporated into autophagosomes or that the plasma membrane is a potential membrane source for autophagosome formation. I believe this is the least supported claim of the manuscript since no direct evidence for this is presented. This is based on BODIPY staining only, namely that BODIPY-positive vesicles accumulate inside the cells. If this is indeed the case, plasma membrane components should be detected in autophagic vesicles. Thus, I recommend co-staining membrane components with autophagic markers. However, if the authors observe no colocalization of plasma membrane components with autophagy markers, I still believe this study is worth publishing. I would like to recommend the review of Ungermann and Reggiori (PMID: 29966469) in which the trafficking of Atg9 is discussed, since the source of autophagosomal Atg9 is in part the plasma membrane in mammalian cells. Therefore, these findings may strengthen the authors' claims.

      Minor points:

      Figure 2A: I believe authors wanted to use the word 'maintaining' not mating in their scheme. Discussion: Authors suggest that another function of autophagy in the cells surrounding the wound may be to clear up debris, as in planarian and other cell types autophagy is activated in healthy cells, which simultaneously phagocytose cell debris. Honestly, I do not believe that this is the case here. Some of the Atg proteins are indeed required for phagocytosis during LC3-associated phagocytosis (LAP) (see: PMID: 30787029), but LAP is independent from Atg1, and if LAP happened in the cells surrounding the wound, then GFP-Atg8a-positive phagosomes would appear in those cells. However, it is clearly not the case here.

      Significance

      I highly recommend this manuscript to be uploaded to a relevant journal and I believe the findings presented here will be interesting for biologists specialized in regeneration and readers from the autophagy fields alike.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this study, the authors use the fruit fly as a model to understand the role and regulation of autophagy in epidermal integrity during development and wound healing. They discover that hyperactivation of autophagy via overexpression of Atg1 disrupts epithelial organization and junctional protein localization and induces syncytium formation. In addition, these epidermal defects were found to be dependent on TORC1, as knockdown or inhibition of TORC1 antagonists resulted in similar epidermal defects, which could be rescued by knockdown of Atg1 or Atg5. Wound healing in the fruit fly epidermis is known to induce cell fusion, and here the authors show that syncytium formation is dependent on autophagy. GFP-Atg8a autophagosomes were found to accumulate in cells adjacent to the wound site, but Atg1-induced syncytium formation was dispensable for wound repair. However, the authors found that hyperactivation of autophagy prior to injury slowed wound closure. This may be due to defects in actomyosin organization or another developmental defect the authors observed in the epidermis. Overall, the key conclusions of this study are convincing, but the experiments would be strengthened by validation of all the RNAi strains used and by a demonstration that the epidermal barrier remains intact as described.

      Major Comments

      1. This study uses a number of UAS-RNAi strains as well as dominant negative and overexpression transgenes. There is no validation that these genetic perturbations work as expected. In fact, the authors state on pg 5 that RNAi to Atg6, Atg7, and Atg12 may be less effective, but do not verify the knockdown efficiency for the gene of interest (i.e. that Atg5 RNAi knocks down the Atg5 transcript or protein). This is particularly important as the authors use a single UAS-rictor RNAi strain to conclude that autophagy is dependent on TORC1 and not TORC2. If rictor RNAi is also weak or ineffective, then this conclusion would be erroneous.
      2. A major conclusion of this study is that autophagy remodels the lateral cell membranes and not the basal or apical, so the membrane integrity remains intact. This is described and shown in Fig S3a, but it is hard to see that the apical membrane is intact. It would be helpful if authors could show a true membrane marker, such as UAS-CD8mGFP to see if there is a continuous membrane. Alternatively, is there a barrier assay that could help demonstrate that syncytium formation does not disrupt epithelial integrity? This could be performed in the fly gut, using the smurf assay (Rera M et al. 2011), since the authors also describe (pg 9) a similar role for autophagy in disruption of epithelial lateral membranes.
      3. Is autophagy-dependent syncytium formation cell autonomous? The A58-Gal4 driver is not cell-type specific, as the authors describe (pg 9) similar effects in trachea, salivary glands, and intestine, and it is unclear whether the effects are due to disruption of autophagy in epidermal cells or to a general disruption of the fly's physiology. The authors should determine, using a more restrictive Gal4 driver, whether syncytium formation is due to activation of autophagy in the epidermal cells or in another cell type (trachea, salivary glands, or intestine). Alternatively, if no other Gal4 is available for the larval epidermis, then the authors could at least show, using an enterocyte driver (NP1-Gal4), that overexpression of Atg1 is sufficient to induce syncytium formation, and examine its effect on gut barrier integrity.
      4. In Fig 8, the authors nicely show that Atg1 RNAi can rescue Tor RNAi and raptor RNAi, but what about the reverse? Is overexpression of Tor sufficient to inhibit the effect of Atg1 overexpression and reduce autophagy-induced syncytium formation?

      Minor comments:

      1. Check the spelling of abbreviations; Sqh is often misspelled Shq in figures.
      2. The order of images in Figure 3 should match the description in the text (pg. 6). Atg1W is described in the text but not shown in Fig 3a-c. Also, upstream activators of TORC1 are described first but shown last in this figure, making it difficult to follow.
      3. Fig 7a should show the junctional effect of Atg1W alone and in combination with Atg5i, which is used in 7b. It is unclear why the authors switched to this weak overexpression for the photobleaching assay when Atg1S was predominantly used in the rest of the study.

      Significance

      This study elucidates the role and regulation of TORC1 and autophagy in epithelial membrane remodeling. This is important work that is significant to both developmental and wound healing research. Many cell types become multinucleate during differentiation, aging, and wound healing, and here the authors find a novel role for autophagy in remodeling lateral cellular junctions to facilitate syncytium formation.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      The manuscript is interesting and well presented. The authors propose the use of an antifibrotic drug to attenuate resistance to RTK inhibitors.

      **Specific comments**

        • It is not entirely clear how Nintedanib decreases tumour growth. It may be due to its effect on resistant melanoma cells as proposed, but it could also be due to the effect on CAFs. This should be at least discussed.

      The reviewer asks about a potential effect of Nintedanib on CAFs in our mouse model. While we show that Nintedanib has a direct action on melanoma cells in vitro, the in vivo situation can indeed be more complex. We agree that we cannot rule out the possibility that its therapeutic efficacy could be attributed in part to inhibition of CAFs, knowing that BRAF inhibitors have been shown to activate CAFs in melanoma, generating a host-tumor niche that can mediate therapeutic escape. However, addressing the contribution of CAFs in vivo is challenging and would represent an entirely new study. As requested by the reviewer, we have discussed this important issue and added 3 new references (see discussion section lines 377-381).

      • A potential caveat is that the drug used is non-specific, as it also blocks PDGFR signalling. Hyperactivation of RTKs is a mechanism of BRAFi resistance and, for example, in Figure 1J they see that BIBF1120/Nintedanib has a significant effect on BRAFi-resistant cells, which may indicate that the growth inhibition seen in allografts could be a combination of an "anti-fibrotic" role and its own activity inhibiting the survival of resistant cells. This needs to be considered.

      We thank the reviewer for raising this interesting issue. Nintedanib was chosen due to its inhibitory action on extracellular matrix deposition and as an example of a rapidly available drug that could be exploited therapeutically to increase the effect of targeted therapy and delay the emergence of therapy-resistant cells. We recognize that a possible disadvantage of Nintedanib is its multi-targeted nature (e.g. PDGFR (α and β), FGFR-1, -2, -3, -4 and VEGFR-1, -2, -3 as well as Src, Lck or Lyn), but it is one of the only approved molecules for the treatment of fibroproliferative diseases. Upregulation of PDGFRβ/AKT signaling was previously shown to contribute to acquired resistance in M238R (Shi et al. Cancer Res. 2011;71:5067-74; Nazarian et al. Nature. 2010;468:973-7). Our in vitro results indicate that Nintedanib inhibits survival of these resistant cells along with a decrease in their myofibroblast-like dedifferentiated phenotype (Fig. 1 I-J).

      To address the reviewer’s comment, we have now examined the contribution of PDGFRβ inhibition to Nintedanib’s effects on resistant cells. We have performed experiments on M238R using the selective PDGFR inhibitor CP673451 in comparison with Nintedanib (please see results section lines 120-127 and new Supplementary Fig. S1F-H). The data show that selective inhibition of the PDGFR pathway attenuates the myofibroblast-like signature typical of resistant cells to a similar degree as Nintedanib and affects melanoma cell viability (new Supplementary Fig. S1G-H). However, CP673451 was less efficient than Nintedanib in inducing a phenotype switch toward a more differentiated phenotype (new Supplementary Fig. S1G). To further confirm the involvement of the RTK pathway in the phenotype observed, we analyzed the tyrosine phosphorylation status of EGFR, PDGFR and FGFR (another RTK inhibited by Nintedanib) and activation of AKT in M238R melanoma cells upon treatment with Nintedanib or CP673451 (new Supplementary Fig. S1F and additional results for the reviewers). Nintedanib had no effect on FGFR tyrosine phosphorylation and slightly decreased pEGFR levels. However, we found that the two inhibitors showed similar efficiency in decreasing phospho-PDGFRβ and phospho-AKT levels (Supplementary Fig. S1F). The results section has been modified according to these new results (lines 126-127).

      Altogether these data suggest that inhibition of PDGFR signaling likely plays a prominent role in the efficacy of Nintedanib in vitro on M238R survival. Thus, as proposed by the reviewer, we can predict that the growth inhibition induced by Nintedanib seen in vivo could be a combination of its "anti-fibrotic" action and PDGFR inhibitory activity inhibiting the survival of resistant cells. It is important to note that, compared to Nintedanib, inhibition of PDGFR/AKT signaling by the CP673451 compound is not sufficient to direct melanoma cells to a more differentiated state. This is now discussed in the manuscript (Discussion section lines 404-405).

      • Does the viability decrease in BRAFi-sensitive cells? For instance, in the parental cells?

      This information was already addressed in the manuscript. As shown in Supplemental Fig. S1D, Nintedanib had no effect on BRAFi-sensitive M238P viability. We have also confirmed this result using a crystal violet viability assay on M238P and UACC62 cells treated with different doses of BIBF1120.

      • Figure 1 b-e, in vivo and in vivo experiments. How many animals were used? Collagen decrease is not quantified (statistics missing).

      We apologize for this omission and have now added the number of animals in the legend of Fig.1 (n = 6). We have also performed statistics for collagen quantification and included this analysis in Fig.1F (see lines 720/723). We also provide to the referee the detailed statistical analysis of mature collagen fibers between the different treatment groups.

      • The title is not accurate. "prevent" resistance in melanoma is an overestimation because the cells do become resistant, albeit later.

      We agree with the reviewer and we have modified the title accordingly. The new title is now: “Blockade of pro-fibrotic response mediated by the miR-143/-145 cluster prevents targeted therapy-induced phenotypic plasticity and delays resistance in melanoma”.

      Reviewer #1 (Significance):

      As the authors discussed, they and others have previously studied the contribution of ECM and stromal remodelling to resistance to targeted therapies in melanoma. Previous data from E. Sahai's lab show that BRAFi activate CAFs and increase the production and remodelling of the extracellular matrix, but in this work, they look at a cell-autonomous mechanism mediated by miRs that promotes fibrosis and propose the use of an antifibrotic drug to attenuate resistance to RTK inhibitors.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this very interesting study, Diazzi and colleagues show that during adaptation to MAPK-targeted therapy (MAPKi), melanoma cells upregulate a miRNA profibrotic cluster (miR-143, -145), which drives a phenotypic switch towards a drug resistant undifferentiated mesenchymal-like state. From the miRNA targets, authors identify FSCN1 as a gene that needs to be downregulated during adaptation to MAPKi by the miRNAs, since FSCN1 ablation promotes the drug resistant phenotype. Importantly, authors show in a preclinical mouse melanoma model that the anti-fibrotic drug nintedanib (BIBF) improves response to MAPKi and delays onset of resistance.

      The study conclusions are convincing and the data are adequately replicated and presented, authors should be commended for having the manuscript in such good shape. However, there are a few issues that authors should clarify/expand.

      We sincerely thank the reviewer for his/her careful review and constructive comments.

      1. The study starts with the in vivo YUMM1.7 model and combination BRAFi+MEKi, and then authors use this combination in many in vitro experiments. However, when studying resistant lines, only BRAFi-resistant and -sensitive pairs were used. I would suggest including more validation of the upregulation of the miRNA and the fibrotic genes on BRAFi+MEKi-resistant lines, and this could be easily gathered from published transcriptomes of several BRAFi+MEKi-resistant melanoma lines from Roger Lo's lab (Song et al 2017 Cancer Discov, including M238, M229, M249 used by the authors). To complement this approach, miRNA expression could be evaluated in large collections of melanoma cell lines classified as more or less undifferentiated (correlating with more or less resistance) as in Tsoi 2018 Cancer Cell and Verfaillie 2015 Nat Commun.

      We thank the reviewer for these interesting suggestions. We have performed several analyses, summarized below:

      • First, we have analyzed the expression of the miRNA-143/-145 cluster and pro-fibrotic signature by qPCR in A375 parental and BRAFi/MEKi double resistant melanoma cell lines described in Shen et al. Nat Commun. 2019;10:5713. We observed the upregulation of both mature miRNAs along with a pro-fibrotic signature in several A375 DR clones compared to parental cells. This new result is described in the results section (lines 147-150) and shown in new Supplementary Fig. S2B. In addition, we have included in the results section the important information that the undifferentiated/mesenchymal-like BRAFi-resistant M229R and M238R cells used in our work also displayed cross-resistance to MEKi (results section, line 112 and 1 new reference).

      • Second, as recommended, we have also fully (re)analyzed the mentioned studies and associated datasets. We provide a summary of the different studies including samples number, design of the study, platform used and accession.

      A general observation is that unfortunately, none of these published studies provided an available small RNA-seq dataset, which thus does not allow quantifying the expression levels of mature miRNAs. However, some interesting observations have been uncovered from these datasets, confirming at least in part some of our data:

      i) The dataset from Song et al. 2017 compared 18 isogenic parental versus resistant cell lines. Two subsets of resistant cells were identified, with MAPK addiction (Ra) or Resistance with MAPK redundancy (Rr). The expression of the pri-miR-143/145 precursor, named MIR143HG, was detected in these cells and was found significantly upregulated in Rr cell lines compared to parental cells. Of note, MIR143HG was also part of the Rr specific signature associated with a mesenchymal phenotype. This interesting observation is now discussed in the manuscript (Discussion section, lines 392-394).

      ii) The dataset from Tsoi et al. 2018 focused on transcriptome analysis of 53 human melanoma cell lines including paired acquired resistance sublines established from patient biopsies. Unfortunately, MIR143HG expression is not detected in this dataset, probably due to a limited sequencing depth. Interestingly, we found that FSCN1 expression was decreased in most mesenchymal-like resistant cell lines compared to their parental counterpart. These data cannot be added in the manuscript since we cannot correlate the expression of the miRNAs with their target.

      iii) The dataset from Verfaillie et al. 2015 reported transcriptomic analyses of 11 short-term cultures derived from patient biopsies before therapy and gave access to RNA-seq data of tumors with a proliferative or an invasive phenotype. MIR143HG is not detected and FSCN1 expression does not appear to be associated with a specific phenotype. We have measured miR-143-3p and miR-145-5p expression by qPCR in some of these short-term cultures, confirming that miR-143/-145 expression is not associated with a specific phenotype in therapy-naïve melanoma cells (results for referees, see below). Expression of miR-143-3p and miR-145-5p in each short-term culture was compared to the average expression of the analyzed miRNA in the proliferative short-term cultures. These results are consistent with the findings of our study describing that expression of the miR-143/145 cluster is triggered by the inhibition of the BRAF oncogenic pathway.
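
      The normalization described in iii), where each culture is compared to the average of the proliferative cultures, can be sketched as follows. This is a minimal illustration only: the function name, the sample labels and the expression values are hypothetical and are not data from the study, and the values are assumed to be on a linear relative-expression scale.

```python
from statistics import mean

def fold_change_vs_reference(expression, reference_ids):
    """Divide each sample's expression by the mean of the reference group."""
    ref_mean = mean(expression[s] for s in reference_ids)
    return {sample: value / ref_mean for sample, value in expression.items()}

# Hypothetical relative miRNA levels (linear scale), for illustration only.
levels = {"prolif_1": 0.8, "prolif_2": 1.2, "invasive_1": 1.0, "invasive_2": 1.5}

# The proliferative cultures serve as the reference group.
fold = fold_change_vs_reference(levels, ["prolif_1", "prolif_2"])
```

      By construction, the reference group averages to a fold change of 1, so values for the other cultures read directly as enrichment or depletion relative to the proliferative phenotype.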

      Related to this, the clinical relevance would increase if findings were validated using patient samples, for example, from published transcriptomes (Hugo 2015 Cell, Song 2017 Cancer Discov, Wagle 2014 Cancer Discov...) or even from TCGA, which could be used to identify if patients with high miRNA have worse prognosis.

      We agree with the reviewer about the importance of providing clinical data supporting our observations. We have carefully analyzed all these profiling studies and provide below a summary.

      Overall, these studies have several limitations: i) as underlined above, expression of the miRNA cluster is specifically induced in response to therapy and is not present (or barely) in tumors at diagnosis; ii) no small RNA-seq datasets are available yet; iii) melanoma tumors are highly heterogeneous and invaded with stroma, especially CAFs and vessels that also express these miRNAs. We have looked at the expression of the MIR143HG precursor in these datasets and it was not present, probably due to low to medium sequencing depths in these clinical studies.

      We have also carefully explored TCGA datasets to look at possible association between prognosis and mature / precursor miRNA as well as miRNA target (FSCN1) expression in skin cutaneous melanoma (SKCM) using the tools developed by Anaya et al. 2016, PeerJ Computer Science 2:e67. Cox regression models and Kaplan-Meier analysis (using different percentiles) did not show any association of our candidates with survival on a cohort of 459 SKCM patients (median survival of 2.4 years).
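
      For readers unfamiliar with this type of analysis, the sketch below illustrates a Kaplan-Meier estimate after splitting patients by an expression cutoff. It is a pure-Python illustration under stated assumptions, not the tool of Anaya et al. nor the analysis actually run: the function names, the cutoff rule and all numbers are hypothetical, and Cox regression is not covered.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times: follow-up durations; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def split_by_cutoff(expression, fraction):
    """Indices of patients below / at-or-above the value ranked at `fraction`."""
    ranked = sorted(expression)
    cutoff = ranked[int(fraction * (len(ranked) - 1))]
    low = [i for i, v in enumerate(expression) if v < cutoff]
    high = [i for i, v in enumerate(expression) if v >= cutoff]
    return low, high

# Hypothetical follow-up (years), event flags and expression values.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
expression = [5, 1, 3, 2, 4, 6]

curve = kaplan_meier(times, events)
low, high = split_by_cutoff(expression, 0.5)
```

      Repeating the split at several fractions corresponds to testing different percentiles; the curves of the low and high groups would then be compared with a log-rank test or a Cox model.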

      Finally, during the revision process, we obtained access, for research purposes, to 9 relapsed melanoma biopsies from the Dermatology Department of Nice University Hospital (CHU), collected following treatment with targeted therapies, immunotherapies or a combination of them. We have analyzed in these biopsies the expression of fibrotic/mesenchymal genes, FSCN1 and the miR-143/145 cluster compared to the mean expression of the same genes/miRNAs in therapy-naïve patient-derived xenografts (MEL003, MEL006, MEL015, MEL047). Our first results indicate that relapsed tumors acquire a strong fibrotic signature, which is associated with increased expression of the miR-143/-145 cluster and decreased expression of FSCN1 (8 out of 9 patients).

      These results are encouraging and represent a good indicator for further clinical validation but are not solid enough to be incorporated into the manuscript. Overall, validation of our hypotheses in patient samples would require an entirely new and highly complex clinical study comparing tumors at diagnosis with relapsed tumors after targeted therapies, ideally processed using single-cell RNA-seq and/or RNA FISH to take into account the stromal compartment.

      • While blocking the miRNA improves BRAFi response (Fig.3H), it is not clear that this combination would overcome resistance (using resistant lines), although authors show that BIBF does overcome resistance (Fig.1J). This also applies to line 277 "… mirroring the effect of miR143/145 ASOs, forced expression of FSCN1 in M238R cells decreased viability in the presence of BRAFi (Fig.5H)." However, the miRNA ASOs were used in parental cells (Fig.3H).

      To address the reviewer’s comment, we have conducted new experiments in resistant melanoma cells using different approaches to silence the 2 mature miRNAs simultaneously: i) an ASO-directed RNase H degradation of the miR-143/145 precursor, as described by Plaisance et al., JACC Basic Transl Sci. 2016, 1:472-493 to knock down the pri-miRNA in cardiomyocytes, and ii) a combination of the 2 anti-miR ASOs. Unfortunately, the first approach failed to efficiently inhibit the expression of mature miR-143-3p and miR-145-5p, suggesting that the miR-143/145 cluster has a different precursor gene in melanoma than the one described in cardiomyocytes.

      Concerning the second approach, as expected, the 2 anti-miRs ASOs as well as the combination of the 2 ASOs efficiently targeted the mature miRNAs (new Supplementary Fig.S6C). Inhibition of miR-145-5p alone and combined inhibition of the two miRNAs significantly affected the viability of BRAFi resistant melanoma cells (M238R) in the absence of BRAFi (new Supplementary Fig.S6D) in a similar way as Nintedanib/BIBF (Fig. 1J).

      • Analysis of cytoskeletal changes. Text (lines 284-287) is missing references, regarding "…morphological changes with cells assuming flattened spindle-like shape" and "..function of FSCN1 in F-actin microfilaments reorganization...".

      We apologize for these omissions and have added the relevant references in the text (lines 305/306).

      Besides, authors say that transient overexpression of miRNAs reproduced these morphological changes as shown by F-actin staining. These would have benefited from including also side-by-side comparison of BRAFi treatment on these cell lines. To my knowledge, these melanoma lines (M238, M229, etc) have not been characterized in that regard (F-actin, focal adhesions). In Nazarian et al 2010, only brightfield pictures are shown in a supplementary figure.

      The same applies to YAP and especially MRTF activation upon miRNA overexpression, and whether this mirrors what BRAFi does to YAP and MRTF. In Misek et al 2020 and Kim et al 2015 YAP and MRTF were shown to be more enriched in the nucleus in resistant than in parental cells. Kim et al also show in time course experiments that there is significantly higher nuclear YAP after 7-14 days of BRAFi treatment. In the present manuscript, authors seemed to have assessed nuclear YAP/MRTF after 72h miRNA overexpression. Does it mirror MAPKi?

      As suggested by the reviewer, we have compared side-by-side the effect of oncogenic MAPK pathway inhibition to the effect of miR-143 or miR-145 overexpression on cytoskeleton and focal adhesion dynamics as well as YAP and MRTFA nuclear translocation in M238P, M229P and UACC62P melanoma cells. These analyses clearly show that transient overexpression of miR-143-3p or miR-145-5p mirrors the effects of BRAF or BRAF/MEK inhibition after 3 days on mechanopathways and acto-myosin remodeling. We thank the referee for this comment, which is helpful for the interpretation of the data. The new additional panels have been included in new Fig. 6B-D, new Fig. 7B-D, new Supplementary Fig. S10B-D and new Supplementary Fig. S11C-D.

      Regarding the decreased proliferation/survival after miRNA overexpression, is it truly slow cycling and not combined with some cell death? Table S1 has a "cell death of tumor cell lines" theme after miRNA overexpression.

      Following the reviewer's suggestion, Annexin V/DAPI staining has been performed in M238P cells upon transient overexpression of miR-143 or miR-145. No significant cell death was observed (new Supplementary Fig. S4D). Detailed statistical analysis and quantification of the experiment are provided. Staurosporine (Stauro) treatment was used as a positive control of cell death induction.

      Related to this, in Supp. Fig.4C the effect on the cell cycle is very small; is this significant? It is unclear when the cell cycle was assessed after miRNA overexpression (72h?); it could be a matter of timing. According to Fig.3E, there is a reduction in growth from 60-72h onwards.

      As suggested by the reviewer, we performed cell cycle analysis at a later time point after transfection (96 hours) (new Supplementary Fig. S4C). We observed a significant accumulation of melanoma cells in G0/G1 phase upon miR-143 or miR-145 overexpression and a significant decrease of the percentage of cells in S phase. Detailed statistical analysis of the described experiment is provided.

      Statistics. While multiple comparison tests were used, most graphs have asterisks on top of some bars, and it is unclear what is being compared with what. For example, Fig.2B has asterisks on top of the BRAFi+MEKi group; does it mean it is significant vs the vehicle group? In this and other similar cases (1J, 2C, S1B and others), a comparison against the combination group (BRAFiMEKi+BIBF) is also relevant. This should be revised throughout the manuscript.

      As recommended by the reviewer, statistical analyses have been modified in the mentioned figures: Fig. 1J (lines 732/733), Fig. 2B (lines 745/746), Fig. 2C (lines 749/750) and Fig. S1B (see new figures and lines 251/252 of Supplementary materials).

      **Minor:**

      - For all the studies using stable cell lines, authors should state how long after transduction and selection experiments were performed.

      As recommended, we have now added this information (see lines 8-12 of Supplementary materials).

      - Authors only show single miRNA overexpression or inhibition. However, both miRNA are upregulated upon MAPKi. Did authors try the double overexpression or blockade?

      As suggested by the reviewer, we tested the double blockade in M238P and 1205Lu cells treated with MAPK inhibitors. Results are presented in new Fig. 3B, 3D, 3H and Supplementary Fig. S6A-B. Overall, combined inhibition of the two miRNAs had an effect comparable to or more significant than single miRNA inhibition, depending on the cellular parameter analyzed.

      Concerning the double overexpression, we had already tested lentivirus-mediated stable overexpression of the two miRNAs in two melanoma cell lines. Results are presented in Supplementary Fig. S5A-F and confirm the functional effects observed with single miRNA overexpression.

      - For the 1205Lu xenograft experiment, authors should also show the tumour growth curves, and explain how long treatment was and when miRNA expression was analysed (endpoint?). In addition, why in 5A there are only 3 dots (mice?) per group, while in 5B there are more (6-7 in control, 4-5 in BRAFi)?

      We apologize for this omission. We have added, at line 270 of the manuscript, the reference to the previous study in which the experiment is described. miRNA expression was analyzed in tumors at the endpoint of the experiment, i.e. 2 weeks after the start of Vemurafenib treatment. Moreover, we repeated the analysis of FSCN1 and miR-143/145 expression with the same number of mice (n = 6); please see new Fig. 5A.

      - In a few graphs, the axis legend should give more information. For example, Fig.2 says Fold change, and it should be Fold change expression, or similar; Fig.4G fold change FSCN mRNA expression; Fig. S2 log2 expression (resistant/par), S5A...

      We have corrected this and modified y-axis legends in the corresponding figures.

      - Fig.1E-G and S1B. Is this at endpoint for each group?

      Yes, it is as stated in the materials and methods section.

      - Fig.3H and S6B. how long were these experiments?

      Experiments shown in Fig. 3H and Fig. S6B were carried out over 72 h. This information has been included in the legend of the corresponding figures.

      - Fig.7B and D. Why is the MRTFA signal in miR-neg and siCTRL so different? Same for UACC in S11A vs S11D.

      We apologize for this inaccuracy. We have revised the figures to show more representative pictures (new Figs. 7B, 7D and S11A, S11D and new Fig. 6C).

      • Fig.5C and 5E. FSCN1 knockdown in 5C is very efficient, while not so much in 5E. However, effects on MITF, AXL etc in 5C are quite impressive. Are these knockdowns representative?

      We again apologize for this inaccuracy. We performed a new experiment and we are now showing a more representative FSCN1 knockdown in new Fig. 5E.

      - Fig.6-7 legend. When mentioning scale bar, it reads uM, should it be um?

      We have corrected this mistake.

      • Fig.7A. In the graph, the "YAP nuclear enrichment", do the numbers represent the nuclear/cytoplasm ratio?

      Yes, numbers represent the nuclear/cytoplasm ratio. This information was added in the legend of the corresponding figures.
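
      As an illustration of such a ratio, the sketch below divides the mean nuclear intensity by the mean cytoplasmic intensity for a single cell. It assumes that pixel intensities have already been extracted from segmented nuclear and cytoplasmic regions; the function name and the values are hypothetical and do not reproduce the quantification pipeline used in the manuscript.

```python
from statistics import mean

def nuclear_cytoplasmic_ratio(nuclear_pixels, cytoplasmic_pixels):
    """Mean nuclear intensity divided by mean cytoplasmic intensity."""
    cytoplasm_mean = mean(cytoplasmic_pixels)
    if cytoplasm_mean == 0:
        raise ValueError("cytoplasmic signal is zero; ratio is undefined")
    return mean(nuclear_pixels) / cytoplasm_mean

# Hypothetical background-subtracted intensities for one cell.
ratio = nuclear_cytoplasmic_ratio([120, 130, 110], [60, 55, 65])
```

      A ratio above 1 indicates nuclear enrichment of the protein, and a ratio below 1 indicates a predominantly cytoplasmic signal.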

      - When showing migration and a picture (Fig.3F, 5D, S4D, S5E...), the blue over dark background is difficult to see, using greyscale or a brighter pseudocolour would help

      We thank the reviewer for this useful suggestion. We have done this and used the gray scale to improve the quality of the pictures.

      Reviewer #2 (Significance):

      These findings have important preclinical implications, since the study proposes a biomarker of resistance (profibrotic signature) and, importantly, a potential new therapy to delay MAPKi resistance in melanoma (BIBF). It could also apply to other BRAF-mutant cancers and to diseases presenting with fibrosis.

      Field of expertise: melanoma, drug resistance, cytoskeleton

      Reviewer #3:

      Major comments:

      The manuscript is well written, data are convincing, well presented and supportive of the conclusions.

      We thank the reviewer for his/her interest in our study and supportive comments.

      **Minor points that may be improved:**

      - The expression of miR-143/145 increases in melanoma cell lines treated with BRAFi and/or MEKi for 72h (Fig. 2B, Supp. Fig. 2B-F), and also after the development of resistance to MAPK-targeted therapies (Fig. 2A, Supp. Fig. 2A). The transient overexpression of miRs in therapy-naive cells leads to cell de-differentiation toward a mesenchymal/MAPK resistant state. On the other hand, these cells become more sensitive to BRAFi treatment when combined with LNA-mediated inhibition of miRs activity. It would be important to determine if the same occurs also in resistant cells, or whether, once MAPKi resistance is established, cells are no longer sensitive to miRs blockade.

      The answer to this point is the same as for point 2 raised by reviewer #2.

      Following the reviewers' suggestion, we have conducted new experiments in resistant melanoma cells using different approaches to silence the 2 mature miRNAs simultaneously: i) an ASO-directed RNase H degradation of the miR-143/145 precursor, as described by Plaisance et al., JACC Basic Transl Sci. 2016, 1:472-493 to knock down the pri-miRNA in cardiomyocytes, and ii) a combination of the 2 anti-miR ASOs. Unfortunately, the first approach failed to efficiently inhibit the expression of mature miR-143-3p and miR-145-5p, suggesting that the miR-143/145 cluster has a different precursor gene in melanoma than the one described in cardiomyocytes.

      Concerning the second approach, as expected, the 2 anti-miRs ASOs as well as the combination of the 2 ASOs efficiently targeted the mature miRNAs (Supplementary Fig.S6C). Inhibition of miR-145-5p alone and combined inhibition of the two miRNAs significantly affected the viability of BRAFi resistant melanoma cells (M238R) in the absence of BRAFi (new Supplementary Fig.S6D) in a similar way as BIBF (Fig. 1J).

      - In 2 out of 4 melanoma PDX samples naïve/resistant to combo BRAFi/MEKi therapy, the expression level of miR-143/145 cluster correlates with the de-differentiated transcriptomic profile of resistant tumor. How is Fascin1 expression in these samples?

      The reviewer legitimately asks about the expression level of the miR-143/-145 target FSCN1 in the PDX samples used in the study. Expression of FSCN1 in PDX resistant vs naïve samples has been assessed by RT-qPCR. Results are provided. We observed decreased expression of FSCN1 in only 1 out of the 2 samples showing increased miR-143/145 expression. This may be due to the heterogeneity of the subpopulations composing the tumor sample. It would have been interesting, and probably more informative, to also test FSCN1 expression at the protein level, since miRNA targets are often inhibited at the translational level, but unfortunately we did not have access to protein extracts corresponding to these samples.

      - The clinical relevance of the data could be strongly improved by assessing the expression of the miRs cluster and of its target Fascin1 in resistant subsets of patients, comparing their expression to patients before treatment, making use of available datasets.

      We agree with the reviewer about the importance of providing clinical data supporting our observations. We have carefully analyzed all available profiling studies and datasets and provide below a summary.

      Overall, these studies have several limitations: i) as demonstrated in our study, expression of the miRNA cluster is specifically induced in response to therapy and is absent or barely detectable in tumors at diagnosis; ii) no small RNA-seq datasets are available yet; iii) melanoma tumors are highly heterogeneous and invaded with stroma, especially CAFs and vessels that also express these miRNAs. We have looked at the expression of the MIR143HG precursor in these datasets and it was not present, probably due to low to medium sequencing depths in these clinical studies.

      We have also carefully explored TCGA datasets to look at possible association between prognosis and mature / precursor miRNA as well as miRNA target (FSCN1) expression in skin cutaneous melanoma (SKCM) using the tools developed by Anaya et al. 2016 PeerJ Computer Science 2:e67. Cox regression models and Kaplan-Meier analysis (using different percentiles) did not show any association of our candidates with survival on a cohort of 459 SKCM patients (median survival of 2.4 years, see Kaplan plots below).
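      For readers unfamiliar with the Kaplan-Meier estimate mentioned in this reply, a minimal sketch follows. It uses toy follow-up times rather than the TCGA cohort, and the function name `kaplan_meier` is hypothetical; the analysis itself relied on the tools of Anaya et al., whose implementation is not reproduced here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)   # patients still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy data (illustrative only): four patients, the second one censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)  # [(1, 0.75), (3, 0.375), (4, 0.0)]
```

      In an analysis like the one described, such a curve would be computed separately for patients above and below a chosen expression percentile, and the two curves compared.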

      Finally, during the revision process, we obtained access, for research purposes, to 9 relapsed melanomas from the Dermatology Department of Nice University Hospital (CHU) following treatment with targeted therapies, immunotherapies or a combination of them. We analyzed in these samples the expression of fibrotic/mesenchymal genes, FSCN1 and the miR-143/145 cluster compared to the mean expression of the same genes/miRNAs in therapy naïve patient-derived xenografts (MEL003, MEL006, MEL015, MEL047). Our results indicate that relapsed tumors acquire a strong fibrotic signature which is associated with increased expression of the miR-143/145 cluster and decreased expression of FSCN1 (8 out of 9 patients).

      This represents a good indicator for further clinical validation but is not solid enough to be incorporated in the manuscript. Overall, validation of our hypotheses in patient samples would require an entire new and highly complex clinical study comparing tumors at diagnosis with relapsed tumors after targeted therapies and ideally processed using single-cell RNA-seq and/or RNA FISH to take into account the stromal compartment.

      Minor comments:

      - Fig. 4C, lower legend: M238P not M238S.

      We apologize for this mistake and have corrected it.

      Reviewer #3 (Significance):

      **Nature and significance of the advances:**

      The findings not only suggest the combination therapy with the anti-fibrotic drug Nintedanib to be effective in enhancing MAPKi treatment in melanoma, reducing the development of resistance, but identify the molecular mechanism via the induction of the miR-143/145 cluster and the effects on the target Fascin1.

      **Compare to existing knowledge**

      These two miRNAs have been shown to have both oncogenic and oncosuppressor activities and have already been involved in EMT induction. The findings add yet one more piece to the puzzle.

      **Audience**

      This manuscript is not only of interest for oncology researchers but also of general interest for the understanding of fundamental biological processes and their effects on cancer therapy.

      **Your expertise**

      Molecular biologist and cancer research, transcriptional control of tumor transformation and progression including EMT, microRNAs -143/145

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In the present work Diazzi and co-authors describe the mechanism through which the anti-fibrotic drug Nintedanib potentiates MAPK-targeted therapy efficacy in melanoma cells. Nintedanib prevents the MAPK-induced pro-fibrotic response and is associated with loss of miR-143/-145 cluster expression. These miRs promote melanoma cells de-differentiation towards a pro-fibrotic mesenchymal-like state that correlates with resistance to MAPK inhibitors. Looking for miR-143/-145 targets responsible for this phenotype switch, the authors identified Fascin1 as a crucial regulator of cytoskeleton dynamics and mechanopathways.

      Major comments:

      The manuscript is well written, data are convincing, well presented and supportive of the conclusions.

      Minor points that may be improved:

      • The expression of miR-143/145 increases in melanoma cell lines treated with BRAFi and/or MEKi for 72h (Fig. 2B, Supp. Fig. 2B-F), and also after the development of resistance to MAPK-targeted therapies (Fig. 2A, Supp. Fig. 2A). The transient overexpression of miRs in therapy-naive cells leads to cell de-differentiation toward a mesenchymal/MAPK resistant state. On the other hand, these cells become more sensitive to BRAFi treatment when combined with LNA-mediated inhibition of miRs activity. It would be important to determine if the same occurs also in resistant cells, or whether, once MAPKi-resistance is established, cells are no longer sensitive to miRs blockade.
      • In 2 out of 4 melanoma PDX samples naïve/resistant to combo BRAFi/MEKi therapy, the expression level of miR-143/145 cluster correlates with the de-differentiated transcriptomic profile of resistant tumor. How is Fascin1 expression in these samples?
      • The clinical relevance of the data could be strongly improved by assessing the expression of the miRs cluster and of its target Fascin1 in resistant subsets of patients, comparing their expression to patients before treatment, making use of available datasets.

      Minor comments:

      • Fig. 4C, lower legend: M238P not M238S

      Significance

      Nature and significance of the advances:

      The findings not only suggest the combination therapy with the anti-fibrotic drug Nintedanib to be effective in enhancing MAPKi treatment in melanoma, reducing the development of resistance, but identify the molecular mechanism via the induction of the miR-143/145 cluster and the effects on the target Fascin1.

      Compare to existing knowledge

      These two miRNAs have been shown to have both oncogenic and oncosuppressor activities and have already been involved in EMT induction. The findings add yet one more piece to the puzzle.

      Audience

      This manuscript is not only of interest for oncology researchers but also of general interest for the understanding of fundamental biological processes and their effects on cancer therapy.

      Your expertise

      Molecular biologist and cancer research, transcriptional control of tumor transformation and progression including EMT, microRNAs -143/145

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this very interesting study, Diazzi and colleagues show that during adaptation to MAPK-targeted therapy (MAPKi), melanoma cells upregulate a miRNA profibrotic cluster (miR-143, -145), which drives a phenotypic switch towards a drug resistant undifferentiated mesenchymal-like state. From the miRNA targets, authors identify FSCN1 as a gene that needs to be downregulated during adaptation to MAPKi by the miRNAs, since FSCN1 ablation promotes the drug resistant phenotype. Importantly, authors show in a preclinical mouse melanoma model that the anti-fibrotic drug nintedanib (BIBF) improves response to MAPKi and delays onset of resistance.

      The study conclusions are convincing and the data are adequately replicated and presented, authors should be commended for having the manuscript in such good shape. However, there are a few issues that authors should clarify/expand.

      1. The study starts with the in vivo YUMM1.7 model and combination BRAFi+MEKi, and then authors use this combination in many in vitro experiments. However, when studying resistant lines, only BRAFi-resistant and -sensitive pairs were used. I would suggest including more validation of the upregulation of the miRNA and the fibrotic genes on BRAFi+MEKi-resistant lines, and this could be easily gathered from published transcriptomes of several BRAFi+MEKi-resistant melanoma lines from Roger Lo's lab (Song et al 2017 Cancer Discov, including M238, M229, M249 used by the authors). To complement this approach, miRNA expression could be evaluated in large collections of melanoma cell lines classified as more or less undifferentiated (correlating with more or less resistance) as in Tsoi 2018 Cancer Cell and Verfaillie 2015 Nat Commun.

      Related to this, the clinical relevance would increase if findings were validated using patient samples, for example, from published transcriptomes (Hugo 2015 Cell, Song 2017 Cancer Discov, Wagle 2014 Cancer Discov...) or even from TCGA, which could be used to identify if patients with high miRNA have worse prognosis.

      2. While blocking the miRNA improves BRAFi response (Fig.3H), it is not clear that this combination would overcome resistance (using resistant lines), although authors show that BIBF does overcome resistance (Fig.1J). This also applies to line 277 ".. mirroring the effect of miR143/145 ASOs, forced expression of FSCN1 in M238R cells decreased viability in the presence of BRAFi (Fig.5H)." However, the miRNA ASOs were used in parental cells (Fig.3H).
      3. Analysis of cytoskeletal changes. Text (lines 284-287) is missing references, regarding "..morphological changes with cells assuming flattened spindle-like shape" and "..function of FSCN1 in F-actin microfilaments reorganization..". Besides, authors say that transient overexpression of miRNAs reproduced these morphological changes as shown by F-actin staining. These would have benefited from including also side-by-side comparison of BRAFi treatment on these cell lines. To my knowledge, these melanoma lines (M238, M229, etc) have not been characterized in that regard (F-actin, focal adhesions). In Nazarian et al 2010, only brightfield pictures are shown in a supplementary figure. The same applies to YAP and especially MRTF activation upon miRNA overexpression, and whether this mirrors what BRAFi does to YAP and MRTF. In Misek et al 2020 and Kim et al 2015 YAP and MRTF were shown to be more enriched in the nucleus in resistant than in parental cells. Kim et al also show in time course experiments that there is significantly higher nuclear YAP after 7-14 days of BRAFi treatment. In the present manuscript, authors seem to have assessed nuclear YAP/MRTF after 72h miRNA overexpression. Does it mirror MAPKi?
      4. Regarding the decreased proliferation/survival after miRNA overexpression, is it truly slow cycling and not combined with some cell death? Table S1 has a "cell death of tumor cell lines" theme after miRNA overexpression.

      Related to this, in Supp. Fig.4C the effect on the cell cycle is very small, is this significant? It is unclear when the cell cycle was assessed after miRNA overexpression (72h?), it could be a matter of timing. According to Fig.3E, there is a reduction in growth from 60-72h onwards.

      5. Statistics. While multiple comparison tests were used, most graphs have asterisks on top of some bars, and it is unclear what is being compared with what. For example, Fig.2B has asterisks on top of BRAFi+MEKi group, does it mean it is significant vs vehicle group? In this and other similar cases (1J, 2C, S1B and others), a comparison against the combination group (BRAFiMEKi+BIBF) is also relevant. This should be revised throughout manuscript.

      Minor:

      -For all the studies using stable cell lines, authors should state how long after transduction and selection experiments were performed.

      -Authors only show single miRNA overexpression or inhibition. However, both miRNA are upregulated upon MAPKi. Did authors try the double overexpression or blockade?

      -For the 1205Lu xenograft experiment, authors should also show the tumour growth curves, and explain how long treatment was and when miRNA expression was analysed (endpoint?). In addition, why in 5A there are only 3 dots (mice?) per group, while in 5B there are more (6-7 in control, 4-5 in BRAFi)?

      -In a few graphs, the axis legend should give more information. For example, Fig.2 says Fold change, and it should be Fold change expression, or similar; Fig.4G fold change FSCN mRNA expression; Fig. S2 log2 expression (resistant/par), S5A...

      -Fig.1E-G and S1B. Is this at endpoint for each group?

      -Fig.3H and S6B. How long were these experiments? Fig.7B and D. Why is the MRTFA signal in miR-neg and siCTRL so different? Same for UACC in S11A vs S11D.

      -Fig.5C and 5E. FSCN1 knockdown in 5C is very efficient, while not so much in 5E. However, effects on MITF, AXL etc in 5C are quite impressive. Are these knockdowns representative?

      -Fig.6-7 legend. When mentioning scale bar, it reads uM, should it be um?

      -Fig.7A. In the graph, the "YAP nuclear enrichment", do the numbers represent the nuclear/cytoplasm ratio?

      -When showing migration and a picture (Fig.3F, 5D, S4D, S5E...), the blue over dark background is difficult to see, using greyscale or a brighter pseudocolour would help.

      Significance

      These findings have important preclinical implications, since the study proposes a biomarker of resistance (profibrotic signature) and importantly, a potential new therapy to delay MAPKi resistance in melanoma (BIBF). It could also apply to other BRAF-mutant cancers and diseases presenting with fibrosis.

      Field of expertise: melanoma, drug resistance, cytoskeleton

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript is interesting and well presented. The authors propose the use of an antifibrotic drug to attenuate resistance to RTK inhibitors.

      Specific comments

      1. It is not entirely clear how Nintedanib decreases tumour growth. It may be due to its effect on resistant melanoma cells as proposed, but it could also be due to the effect on CAFs. This should be at least discussed
      2. A potential caveat is that the drug used is non-specific as it also blocks PDGFR signalling. Hyperactivation of RTKs is a mechanism of BRAFi resistance and for example in Figure 1J, they see that BIBF1120/Nintedanib has a significant effect on BRAFi-resistant cells, which may indicate that the growth inhibition seen in allografts could be a combination of an "anti-fibrotic" role and its own activity inhibiting the survival of resistant cells. This needs to be considered.
      3. Does the viability decrease in BRAFi-sensitive cells? For instance, in the parental cells.
      4. Figure 1 b-e, in vivo and in vivo experiments. How many animals were used? Collagen decrease is not quantified (statistics missing).
      5. The title is not accurate. "prevent" resistance in melanoma is an overestimation because the cells do become resistant, albeit later.

      Significance

      As the authors discussed, they and others have previously studied the contribution of ECM and stromal remodelling to resistance to targeted therapies in melanoma. Previous data from E. Sahai's lab show that BRAFi activate CAFs and increase the production and remodelling of the extracellular matrix, but in this work, they look at a cell-autonomous mechanism mediated by miRs that promotes fibrosis and propose the use of an antifibrotic drug to attenuate resistance to RTK inhibitors.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements [optional]

      We thank the reviewers for their critical comments and suggestions. We are glad that the reviewers appreciated the quality of the data and the novel findings connecting the secretory trafficking machinery with extracellular matrix-related signaling.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Jung et al reports on an interesting finding that focal adhesion signaling regulates the expression of Sec23A and thereby regulates COPII-dependent trafficking. The data presented are mostly solid and the finding itself is highly novel, as it tackles an area of secretory trafficking that remains poorly understood, namely the connection between the ECM and secretion.

      I will list below all comments that I have mixing both technical and conceptual topics:

      **Technical issues:**

      1-The authors should provide a better description of how they designed this siRNA library. What were the inclusion criteria for these 378 genes? I might have missed it, but I could not find this information easily.

      Reply: The library has been designed in-house based on gene annotations and literature to include cytoskeleton structural proteins, motor proteins, and other associated and regulatory proteins. We will add this information in the Materials and Methods section.

      2-Figure 2: I know this is challenging for EM images, but is there a way the authors could quantify these data? How many images were looked at? What was the average width of ER cisternae?

      Reply: We will provide image quantifications and statistics

      3-Figure 4: I think that the characterization of the FA phenotype is a bit underdeveloped. There is no quantification of these data. Is the size of FA changing? Is the number of FA per cell changing? Is the length of FAs changing? I think that more work is needed to increase the confidence in these data.

      I could also not easily see what type of cells these are. A better description of this experiment is also required. Also, how many cells were analyzed? I think it is important that this experiment is done with a sufficient number of cells to increase the confidence in the data.

      Reply: We agree with the reviewer that our observations regarding the focal adhesion (FA) phenotype will benefit from image quantification and we intend to include this in the revised manuscript. All FA experiments were performed on HeLa cells. We will update the materials and methods sections to better describe this experiment.

      **Conceptual issues:**

      1-The finding that focal adhesion signaling negatively affects ER-export is surprising, because cancer cells that grow on stiff substrates have more focal adhesions and are more invasive and migratory. Both migration and invasion are expected to depend on ER-export. Although the authors did not formally test Sec23A expression under different stiffnesses, I would expect that stiff substrates would lower Sec23A expression and thereby negatively affect ER-export. It would certainly increase the breadth of this work to include data like this and to also discuss this highly surprising finding. However, it is of course the decision of the authors and the editors to decide whether such an experiment would benefit the entire story.

      Reply: In this work, we have shown that cells plated on ECM or matrigel have decreased SEC23A expression compared to control cells. We have also shown that inhibition of FA kinase leads to an increase in SEC23A expression (Figure 5). Whether this translates into a change in ER transport is a fair point that we will address in the revision. Regarding stiffness, we have done a preliminary experiment that shows that cells plated on a soft synthetic substrate have less SEC23A than cells plated on plastic. This is in line with our ECM experiments because Matrigel and fibroblast-derived ECM are softer than plastic.

      2-The authors postulate that this novel mechanism could be part of a feedback loop. If this were the case one would expect the acute effect of FA to increase ER-export (or secretion) and the negative feedback will then reduce secretion. However, the acute effect of FA is not addressed in this manuscript. In order to postulate a feedback loop, the authors would need to test the individual nodes of this loop.

      Reply: The question appears to be whether an acute effect on FA would affect the expression of SEC23A and therefore ER transport. If by the acute effect the reviewer means a pharmacological manipulation, we have shown that upon treatment with the FAK inhibitor the expression of SEC23A increases (Fig 5A). Whether this increase in SEC23A expression translates into a corresponding increase in ER transport remains to be seen. This will be tested in our revised manuscript as mentioned above in reply to point # 1.

      Our data encouraged us to propose a hypothetical feedback loop that would connect the deposition of ECM through the expression of SEC23A. We will have more data to support (or reject) this idea once we do the transport experiments as mentioned above. However, we think that a full characterization of this hypothetical loop by testing individual nodes is beyond the scope of this manuscript

      Reviewer #1 (Significance (Required)):

      I think that the basic finding of this manuscript is highly novel, by showing the impact of the ECM and focal adhesions on COPII-dependent trafficking. I think that this will not only appeal to people from the trafficking community, but also to people working on cell migration and on mechanobiology. The work in its current form does not require much extra effort (max. 3 months). However, if the authors would decide to increase the breadth of data, they would require 3-6 months.

      Reply: We thank reviewer #1 for the comments. We also believe that this story will appeal to a broader audience and would help to bridge the gap between membrane trafficking and mechanobiology communities.

      **Referees cross-commenting**

      I went through the comments of the two other reviewers and agree with their verdict. Some extra work on the characterization of the early secretory pathway would be good. Both reviewers provided a nice catalogue of possible experiments to choose from.

      Reply: We have characterized the early secretory pathway in terms of ER exit sites, Beta-COP, and Golgi morphology (FIG. 2B-H and S1A-B). Together, these data strongly characterize the nature of ER-block. Moreover, the finding that our interactors affect the expression of SEC23A allows us to explain mechanistically why an ER transport block occurs. This is further strengthened by the rescue experiments (FIG. 3F). We believe that further characterization of the secretory pathway will not contribute substantially to the main message of this manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Jung et al, which is based on a targeted siRNA screen, demonstrates regulation of SEC23A (component of the SEC23 complex of the COP coat) levels at the transcriptional level downstream of focal adhesion signaling. Using siRNA-mediated downregulation, the authors were able to identify proteins which either increased or decreased traffic of VSVG through the secretory pathway when combined with downregulation of either SEC23A or SEC23B. Authors have focused on a group of SEC23B functional interactors, downregulation of which increases the size of focal adhesions and also downregulates SEC23A levels, thus providing an explanation for reduced secretory traffic. Authors further show that plating cells on fibronectin or Matrigel, which activate focal adhesion kinase signaling, also results in downregulation of SEC23A transcript levels. The screen is conducted in a well-controlled manner for most parts with a clear explanation of the analysis routines, and the data presentation is of very good quality. Most important results have been validated by more than one experimental strategy, which lends substantial confidence to the findings. The results also open further avenues for understanding the transcriptional regulation in different physiological and disease contexts.

      There are certain issues, which the authors should address with regards to controls and some conflicting observations with published results with respect to the phenotypes associated with downregulating proteins on focal adhesions size. Additionally, authors don't tie the ends by monitoring secretory traffic in cells grown on different matrices but include it in the model. Addressing/explaining these issues could improve this manuscript and the model may have to be tweaked a bit.

      **Major comments:**

      1)I wonder why the authors only used siRNA control in their screen when the effects are scored in context of double knockdown fashion in combination with mild knockdown of SEC23A and SEC23B to get functional interactors. Control siRNA in combination with SEC23A and SEC23B should have been two ideal negative controls in the screen. Nevertheless, in data presented Figure 1E and whole of Figure 2, using control siRNA in combination with SEC23B siRNA would have been ideal control to show that the combination does not induce any trafficking defects which could impact the findings of the study. Hence, a few of the data presented from some of these figures should have sicontrol+SEC23B siRNA combination as a control.

      Reply: There seems to be a misunderstanding. In the screen, the negative controls are only used as a reference as the scoring is based on a 5X5 matrix centered on the siRNA of interest. This is done to overcome possible plate effects and to normalize data across different biological replicates. As seen in figure 1B, the negative controls (Control siRNA, Control siRNA + SEC23A siRNA, or Control siRNA + SEC23B siRNA) are very close to 0 (but not exactly 0) as they were not used in the normalization process. It is important to mention that all single knockdowns also contain our control siRNA to keep the same final siRNA concentration in single and double knockdowns. In Fig 1E we will include the images from Control + SEC23A siRNAs and Control + SEC23B siRNA as a reference. For Figure 2 all except 2A and 2H have the single knockdowns as controls.
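
      As a rough illustration of the neighborhood-based scoring described in this reply, the sketch below scores each well against the median of the wells in the 5X5 window around it. This is only one plausible reading: the use of the median, the subtraction, the edge handling, and the name `local_scores` are assumptions made for illustration and are not taken from the authors' analysis pipeline.

```python
from statistics import median


def local_scores(plate, radius=2):
    """Score each well relative to its local neighbourhood.

    plate  -- list of rows, each a list of raw per-well readouts
    radius -- 2 gives a 5x5 window centred on the well of interest
    Assumption: score = well value minus the median of the other wells
    in the window; windows are simply truncated at the plate edges.
    """
    n_rows = len(plate)
    n_cols = len(plate[0])
    scores = []
    for r in range(n_rows):
        row_scores = []
        for c in range(n_cols):
            neighbours = [
                plate[i][j]
                for i in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for j in range(max(0, c - radius), min(n_cols, c + radius + 1))
                if (i, j) != (r, c)  # exclude the well being scored
            ]
            row_scores.append(plate[r][c] - median(neighbours))
        scores.append(row_scores)
    return scores


# Toy plate (illustrative only): uniform readout with one deviating well.
plate = [[1.0] * 5 for _ in range(5)]
plate[2][2] = 3.0
scores = local_scores(plate)
print(scores[2][2])  # 2.0
```

      In this toy plate only the deviating well receives a non-zero score, which shows how a local reference can absorb plate-wide effects without relying on dedicated control wells for normalization.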

      2)What is the identity of post-ER structures which authors refer to in Figure 2A? Could the images represent VSVG concentrated at ER exit sites? Authors should stain with markers for ERES to see if the VSVG puncta colocalize with it.

      Reply: We have done the experiment, and indeed these structures colocalize with an ER exit site marker (SEC31A). We intend to include this data into the revised manuscript. Our observations are in agreement with what is known in the literature about VSVG transport.

      3)Based on RNA sequencing results, authors chose to follow up on SEC23A levels in background of siRNA knockdown of components (like MACF1, ROCK1, FERMT2 etc.) which regulate Focal adhesions in cells and show that there is a reduction in both transcript and protein levels of SEC23A. In images shown in Figure 2B and Figure 2C, levels of SEC31A and β-Cop1 are reduced. Authors should test using qPCR and western blots whether there is a downregulation of SEC31A, β-Cop1 and SEC23B in siRNA knockdowns of MACF1, ROCK1, FERMT2 etc. It would provide new insights if there were a co-regulation of secretory machinery to modulate the secretory traffic in response to Focal Adhesion based signaling.

      Reply: Our transcriptomics data (FIG 3C and Table 5) shows that SEC31A and COPB1 mRNAs are not altered upon any of the knockdowns. For SEC23B, we observed only a slight decrease in ROCK1 knockdown. This data suggests that a co-regulation of the secretory machinery might not be present. Instead, the curation of secretory pathway genes in our transcriptome data shows that SEC23A is the only commonly differentially expressed gene.

      4)The major concern with this manuscript relates to the results presented in Figure 4C. Authors show that in response to all the knockdowns, they see more focal adhesions as monitored by Vinculin staining and this, along with the experiments with cells plated on Matrigel and Fibronectin, leads to the conclusion that increased focal adhesion signaling downregulates SEC23A levels, which presumably modulates secretory traffic. I am not an expert on focal adhesions but based on my understanding of the literature on that topic, downregulation of ROCK1 and FERMT2 disrupts focal adhesions. (See: Theodosiou et. al., Elife, 2016 or Lock et. al., Plos One, 2012 for example). How do authors explain their results in siRNA knockdown of ROCK1 and FERMT2, which leads to an increased size of focal adhesions and seems contradictory to the published results? To clarify these results authors should test phosphorylation of FAK in their siRNA backgrounds, which is another readout of focal adhesion signaling.

      The experiments from cells grown on Fibronectin and Matrigel favor the argument which authors put forth, but authors may have to tweak the model a bit based on FAK phosphorylation and FAK signaling in context of above-mentioned knockdowns.

      Reply: Based on the images for vinculin staining, in our current manuscript we propose that changes in FAs occur upon knocking down our interactors. In our revised manuscript we will provide a more robust quantitative assessment of those changes (change in number, size, or intensity) as mentioned in our reply to Reviewer #1.

      As for the discrepancies in the FA phenotype upon depletion of ROCK1 and FERMT2, we want to point out that this effect depends on the cell type used. For instance, the papers listed by the reviewer here use fibroblasts and keratinocytes respectively while we have used HeLa Kyoto cells which are epithelial in nature. Another example is that while in fibroblasts depletion of FERMT2 leads to a rounded morphology and almost an absence of FAs (Theodosiou et. al., Elife, 2016), in podocytes (Qu et al JCS, 2011), it leads to fewer FAs but an increase in their size. Nonetheless, this is a very keen observation from the reviewer and we will address this point in our revised manuscript discussion.

      5)What happens to VSVG traffic or RUSH-Cadherin traffic when cells are plated on Matrigel and Fibronectin? Reduction in secretory traffic of these is an important experiment which is missing to close the loop and validate the model presented. Authors must test these experiments either with cells grown on matrix alone or in combination with siRNA to SEC23B. Authors should also monitor ERES and transport carriers in this background.

      Reply: We agree with the reviewer and intend to perform these experiments.

      6)This is not such a major issue, but it would be good to see a comparison of SEC23A levels in the siRNA knockdown condition with those when cells are grown on different substrates and in ROCK1 and FERMT2 knockdowns (blots of which authors already have in this manuscript).

      Reply: We will assess the level of SEC23A at the protein level for cells plated on matrigel or Fibroblast-derived ECM.

      Minor comments:

      1)Scale bars are missing in EM images in Figure 2H.

      Reply: We will add the scales in our EM images

      2)Show molecular weight markers in Western blots in main figure 3E and supplementary figure S1E.

      Reply: We will add molecular weight markers in our Western-Blots

      Reviewer #2 (Significance (Required)):

      I have looked at the manuscript through the lens of a cell biologist as that is predominantly my area of expertise. In that respect I find the screen conducted by authors particularly interesting as they aim to connect how extracellular cues regulate the secretory pathway. A screen seems justified as there is no comprehensive understanding linking the two above-mentioned processes. Authors have done a functional interaction screen and analyzed a lot of images to identify candidates which either increase or decrease secretory traffic in combination with SEC23A and SEC23B. Such a functional screen has helped authors identify candidates which were otherwise missed in single siRNA knockdowns in their previous work from 2012. This definitely opens up interesting avenues to test the candidates identified in the screen in different physiological contexts and in disease, as well as the transcriptional program connecting Focal adhesion signaling with the regulation of components governing secretion. Such functional interaction screens could also be employed to identify crosstalk of different cellular processes with the regulation of the secretory pathway at the ER as well as at the Golgi apparatus.

      Reply: We thank reviewer #2 for the comments. As we mentioned in our reply to reviewer #1, we strongly believe that these results will encourage further research at the crossroads of membrane trafficking and mechanobiology.

      Referees cross-commenting

      I agree with the comments from both the referees that the manuscript is very interesting and most experiments are well controlled, but the quantification of the focal adhesion phenotype in knockdowns needs to be done in an extensive manner and secretion phenotypes need to be measured upon plating cells on different matrices to validate the model presented.

      Reply: These two experiments will be included in our revision

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary

      The authors use a synchronized cargo release assay following codepletion of either Sec23 paralog with cytoskeletal and associated proteins to identify potential functional interactions between COPII trafficking and the cytoskeleton. This screen yields a number of Sec23b functionally interacting molecules that stall cargo trafficking to various degrees within the secretory pathway upon codepletion, and in the case of MACF1 reduce ERES number despite not physically interacting. Depletion of the majority of the identified Sec23b functional interactors alone surprisingly caused the downregulation of Sec23a at the mRNA and protein levels, and cargo trafficking could be partially or fully rescued by Sec23a overexpression depending on the codepleted cytoskeletal factor. RNA-seq enrichment analysis and imaging of a focal adhesion marker suggest that genes involved in cell adhesion were differentially regulated following depletion of the cytoskeletal functional interactors. Finally, the authors show that Sec23a expression levels are reduced when cells are cultured on dishes with high amounts of ECM to induce focal adhesions, and that inhibition of focal adhesion kinase can rescue Sec23a expression levels.

      Major comments

      #1 The authors successfully implicate a group of cytoskeletal proteins and their actions at focal adhesions in negatively regulating Sec23a expression levels and COPII trafficking. This description of a shared, novel mode of COPII transcriptional regulation by cytoskeletal factors is convincingly shown to be at least a contributor to the delayed trafficking in the presence of focal adhesions. In general, the data are reproducible and use appropriate statistical analysis. However, a more robust description of the architecture of early secretory pathway would be beneficial, especially in the case of MACF1 codepletion which cannot be fully rescued by Sec23a-YFP overexpression. In contrast, trafficking during codepletion of FERMT2 is fully rescued by Sec23a-YFP despite both MACF1 and FERMT2 showing similar loss of Sec23a mRNA levels upon codepletion. This data suggests that while the trafficking delay in FERMT2 codepletion might be exclusively due to reduced Sec23a expression levels, there are likely additional causes for the trafficking delay observed in MACF1 codepletion.

      Reply: We thank the reviewer for the appreciation of our results and the importance they might bear for the field. The reviewer has very neatly highlighted that each of our interactor hits might have roles in the secretory pathway beyond the ER or independent of the expression levels of SEC23A. This phenomenon could also explain the differential rescue of the arrival of VSVG at the plasma membrane upon SEC23A overexpression in FERMT2 and MACF1 knockdowns (FIG 3F). For instance, MACF1 has been involved in Golgi to Plasma Membrane transport as well (Kakinuma et al. Exp. Cell Res. 2004, Burgo et al. Dev. Cell 2012). So a possibility is that SEC23A overexpression rescues only ER to Golgi transport, while a defect in the step between Golgi and plasma membrane, independent of SEC23A expression levels, remains unrescued; this would result in reduced rescue in the case of MACF1 compared to FERMT2. To support this, in our revised manuscript, we will provide example images from the experiment.

      Nonetheless, we agree that these are very important observations from Reviewer #3 and warrant a detailed discussion in the light of other interactors as well, which we intend to highlight in our revised manuscript.

      #2 While there is indeed a reduction in the number of ERESs following MACF1 codepletion, the authors report an even more dramatic reduction in 'transport intermediates / cell' as marked by COPI. However, as recent cryo-EM analysis of ERESs has definitively shown, COPI exists stably at ERGIC membranes (1). Thus, an alternative possibility for the more dramatic reduction of COPI sites compared to Sec31a sites in Figures 2B-E is that ERGIC membranes are destabilized following MACF1 codepletion in a manner independent of Sec23a expression, and this destabilization compounds with reduced ERES number to ultimately delay trafficking. To more directly determine whether ERGIC membrane stability is regulated by MACF1, the authors should compare COPI and ERGIC-53 staining among MACF1 codepleted and FERMT2 codepleted cells with and without Sec23a-YFP overexpressed to levels that rescue cargo trafficking. If Sec23a-YFP restores the number of ERGIC puncta marked by these stains in FERMT2 but not MACF1 codepleted cells, it would suggest a role for MACF1 in forming or stabilizing ERGIC membranes, which are known to associate with microtubules and WHAMM, an actin nucleator. Additionally, it would be useful to costain COPII with COPI or ERGIC-53 in control, MACF1 depleted, MACF1 codepleted, and MACF1 codepleted and Sec23a-YFP rescued cells to determine their colocalization. COPII and ERGIC membranes should be almost entirely coupled and juxtaposed in control cells and may be decoupled upon loss of MACF1 if it plays a role in ERGIC membrane localization and stability. These proposed experiments are relevant because ERGIC membranes are sites of COPII cargo delivery and changes in ERGIC stability or localization would suggest an additional mechanism for cytoskeletal regulation of COPII trafficking. These immunofluorescence studies should be straightforward and completed in a few weeks.

      Reply: Although a possible additional role of MACF1 in the organisation of the early secretory pathway, stability of ERES, etc., independent of the expression of SEC23A is interesting on its own, we believe that an extensive characterization of these possible roles/pathways as proposed by the reviewer is beyond the scope of this manuscript.

      #3 The choice to use VSVG and E-Cadherin for the synchronized release assays unfortunately convolutes interpreting the 'transport ratios' used by the authors to compare the effects of the various codepletions. Each protein progresses beyond the Golgi during secretion, and the authors choose to calculate the ratio of cargo intensity at the plasma membrane normalized to the total cellular cargo. This means that the synchronized release assays and calculated 'transport ratios' assay not only ER to Golgi trafficking, but also trafficking from the Golgi to the plasma membrane. In instances where Sec23a-YFP overexpression does not fully rescue the codepletion, it is possible that additional trafficking delays occur during Golgi to plasma membrane trafficking that cause the 'transport score' to decrease. Thus, the 'transport score' as the authors calculate it is needlessly nonspecific to COPII trafficking and should not be used to compare the codepletions for COPII functional interactors.

      Reply: We agree that the “transport score” used here and in our previous genome-wide screen (Simpson et. al Nat. Cell Biol. 2012) does not allow us to distinguish between the individual transport substeps in the transport of VSVG from the ER to the plasma membrane. However, as we see in Fig 1E, the proteins that we have decided to follow in more detail in this study do have a clear ER transport block phenotype (except for CRKL). So for 6 out of 7 of these proteins, the images clearly show that the decrease in the “transport score” is due to a decreased ER to Golgi transport.
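      As an illustration of the point under discussion, the ratio described by the reviewer (cargo intensity at the plasma membrane divided by total cellular cargo) can be sketched per cell as below. This is only a minimal sketch under stated assumptions: the function names and the intensity values are hypothetical, and it is not the authors' actual image-analysis pipeline. It shows why such a ratio alone cannot tell which transport sub-step is delayed: any block between the ER and the plasma membrane lowers the numerator in the same way.

```python
from statistics import mean, median


def transport_ratio(pm_intensity, total_intensity):
    """Plasma-membrane cargo signal divided by total cellular cargo signal (one cell)."""
    if total_intensity <= 0:
        raise ValueError("total cargo intensity must be positive")
    return pm_intensity / total_intensity


def summarize(cells):
    """Summarize per-cell ratios; cells is a list of (pm_signal, total_signal) pairs."""
    ratios = [transport_ratio(pm, total) for pm, total in cells]
    return {"n": len(ratios), "mean": mean(ratios), "median": median(ratios)}


# Hypothetical per-cell measurements in arbitrary units: (PM signal, total signal).
control = [(40.0, 100.0), (55.0, 110.0), (30.0, 60.0)]
# One knockdown cell behaves like control, mimicking cell-to-cell heterogeneity.
knockdown = [(10.0, 100.0), (45.0, 90.0), (12.0, 120.0)]

control_summary = summarize(control)
knockdown_summary = summarize(knockdown)
print(control_summary)
print(knockdown_summary)
```

      Reporting the median alongside the mean is one possible way to limit the influence of cells that escaped the knockdown; whether that is appropriate depends on the actual distribution of the data.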

      #4 To mitigate unwanted contributions of post-COPII trafficking events from altering 'transport scores,' the authors should use a cargo for synchronized release assays that does not progress past the Golgi such as α-Mannosidase II and quantify a ratio of the perinuclear cargo signal to whole cell signal. Ideally, the screen would be repeated with a more appropriate cargo generating new 'transport scores' for the full list of cytoskeletal proteins. However, this may not be feasible, and as such 'transport scores' based on a Golgi resident protein should at least be produced for the 7 Sec23b functional interactors featured in this manuscript. These Golgi 'transport scores' would add much needed quantification of ER to Golgi transport delays that currently can only be inferred from the representative images in Figure 1E, which unfortunately show significant heterogeneity among cells from the same image. The authors should also explicitly state that any 'transport score' from a synchronous release assay using a cargo destined for the plasma membrane will take into account trafficking rate changes due not only to COPII, but also COPI from the ERGIC to the Golgi, and transport carriers departing from the TGN. These synchronized release assays would likely take between a few weeks to a few months depending on their ability to automate image analysis.

      Reply: We consider that having a “Golgi transport score” won't add any new information as the proteins that we have chosen to follow are the ones that show a strong ER-block phenotype. However, we agree that such a “Golgi score” would indeed be useful if one would like to study other interactors, for instance, the ones that induce transport acceleration.

      Also, we don't expect all cells to behave similarly as the level of knockdown might be slightly different or because of the cell to cell variability. Even in control conditions (no knockdown), this heterogeneity is evident. As suggested by the reviewer, in our revised manuscript we will explicitly state that a change in the transport scores could mean a change in any sub-step of the transport from the ER to the PM in our assay.

      Minor comments

      It would be useful for the authors to quantify the number of focal adhesions present from Vinculin stains from Figure 4C and 5C instead of just showing representative images. It would be interesting to determine if there is a meaningful relationship between focal adhesion number induced by the codepletions or tissue culture coating and Sec23a expression levels like in Figure 3D. Generally, the figures, text, and references were appropriate.

      Reply: As also pointed out by the other reviewers we will quantify the FA changes
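      For orientation, counting focal adhesions and measuring their size from a thresholded vinculin image amounts to labelling connected regions in a binary mask. The following is a minimal sketch assuming a pre-thresholded mask and 4-connectivity; the function name and the toy mask are hypothetical, and a real analysis would normally rely on an established image-analysis package rather than this hand-written flood fill.

```python
from collections import deque


def adhesion_sizes(mask):
    """Return the pixel area of each connected foreground region (4-connectivity)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one region.
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


# Hypothetical binary mask: 1 marks pixels above the vinculin intensity threshold.
mask = [
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 1, 0],
]
sizes = adhesion_sizes(mask)
count = len(sizes)
mean_area = sum(sizes) / count
print(count, sizes, mean_area)
```

      The per-cell count and mean area obtained this way could then be compared against SEC23A expression levels, as the reviewer suggests.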

      Reviewer #3 (Significance (Required)):

      In recent years, significant effort has been devoted to elucidating mechanisms by which COPII trafficking is modulated in response to cellular cues. These studies have revealed that changes in nutrient availability, growth factors, ER stress, autophagy, and T-cell activation all cause changes in COPII trafficking via unique gene expression, splicing, or post-translational control (2-7). This work elucidates a novel mechanism of transcriptional control driven by focal adhesions. Additionally, it provides a number of potentially useful Sec23a and Sec23b functional interactors among cytoskeletal factors for further study. These unexplored factors may have unique mechanisms of COPII regulation that could contribute to our understanding of ER export modulation. Altogether, this and similar works are building an increasingly complex set of regulatory pathways that when integrated ultimately dictate COPII trafficking kinetics.

      The reported findings are not only relevant to those who study COPII trafficking, but also other fields where secretion is studied in the context of the ECM. This work would suggest that secretion of factors involved in crosstalk between cells, including in tumors, is likely to be controlled by the interactions of cells with ECM.

      Reply: We thank reviewer #3 for the comments and insightful discussion about the limitations of our assay that we will highlight in the revised manuscript and in general for the insight into the early secretory pathway regulation. Furthermore their explicit summary of how our study could bridge COPII trafficking, ECM signaling and the relevance to various pathophysiologies is highly appreciated.

      Expertise keywords: cell biology, light microscopy, membrane trafficking

      References

      1.Weigel A V., Chang CL, Shtengel G, Xu CS, Hoffman DP, Freeman M, et al. ER-to-Golgi protein delivery through an interwoven, tubular network extending from ER. Cell. 2021 Apr;184(9):2412-2429.e16.

      2.Farhan, H., Wendeler, M. W., Mitrovic, S., Fava, E., Silberberg, Y., Sharan, R., Zerial, M., & Hauri, H. P. (2010). MAPK signaling to the early secretory pathway revealed by kinase/phosphatase functional screening. Journal of Cell Biology, 189(6), 997-1011.

      3.Zacharogianni, M., Kondylis, V., Tang, Y., Farhan, H., Xanthakis, D., Fuchs, F., Boutros, M., & Rabouille, C. (2011). ERK7 is a negative regulator of protein secretion in response to amino-acid starvation by modulating Sec16 membrane association. EMBO Journal, 30(18), 3684-3700.

      4.Lillmann, K.D., V. Reiterer, F. Baschieri, J. Hoffmann, V. Millarte, M.A. Hauser, A. Mazza, N. Atias, D.F. Legler, R. Sharan, et al 2015. Regulation of Sec16 levels and dynamics links proliferation and secretion. J. Cell Sci. 128:670-682.

      5.Liu, L., Cai, J., Wang, H., Liang, X., Zhou, Q., Ding, C., Zhu, Y., Fu, T., Guo, Q., Xu, Z., Xiao, L., Liu, J., Yin, Y., Fang, L., Xue, B., Wang, Y., Meng, Z. X., He, A., Li, J. L., ... Gan, Z. (2019). Coupling of COPII vesicle trafficking to nutrient availability by the IRE1α-XBP1s axis. Proceedings of the National Academy of Sciences of the United States of America, 116(24), 11776-11785.

      6.Jeong, Y.-T., Simoneschi, D., Keegan, S., Melville, D., Adler, N. S., Saraf, A., Florens, L., Washburn, M. P., Cavasotto, C. N., Fenyö, D., Cuervo, A. M., Rossi, M., & Pagano, M. (2018). The ULK1-FBXW5-SEC23B nexus controls autophagy. ELife, 1-25.

      7.Wilhelmi, I., Kanski, R., Neumann, A., Herdt, O., Hoff, F., Jacob, R., Preußner, M., & Heyd, F. (2016). Sec16 alternative splicing dynamically controls COPII transport efficiency. Nature Communications, 7, 12347. https://doi.org/10.1038/ncomms12347

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      4. Description of analyses that authors prefer not to carry out

      Reviewer #3 suggested that we robustly characterise the early secretory pathway in response to the depletion of our interactors, for instance the role of MACF1 in the organization and the stability of ERES. This view is also supported by reviewer #1. However, in our revised manuscript we would like to focus more on the novel aspect of our study (as highlighted by all the reviewers), namely how ECM signaling and changes in FAs affect SEC23A and possibly ER transport. For this, we would like to present a more quantitative outlook of the FA phenotype and concentrate on the transport experiments. The reason for not delving into a more extensive characterization of the early secretory pathway is that these experiments are very interesting on their own, and merit a separate study that would deconvolve in detail the individual trafficking steps, and their relation to SEC23A expression, ERES stability, and ECM signaling.

      Reviewer #2 suggested that to better characterize the FA phenotype and solve the apparent discrepancies between our data and the literature, we could test FAK phosphorylation. As we mentioned in our reply to this point, we think that most of the discrepancies arise from the different cell types used. Nevertheless, we agree that a quantitative approach is needed for a better characterisation of FA phenotype, therefore we intend to perform quantification of the vinculin stainings.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The authors use a synchronized cargo release assay following codepletion of either Sec23 paralog with cytoskeletal and associated proteins to identify potential functional interactions between COPII trafficking and the cytoskeleton. This screen yields a number of Sec23b functionally interacting molecules that stall cargo trafficking to various degrees within the secretory pathway upon codepletion, and in the case of MACF1 reduce ERES number despite not physically interacting. Depletion of the majority of the identified Sec23b functional interactors alone surprisingly caused the downregulation of Sec23a at the mRNA and protein levels, and cargo trafficking could be partially or fully rescued by Sec23a overexpression depending on the codepleted cytoskeletal factor. RNA-seq enrichment analysis and imaging of a focal adhesion marker suggest that genes involved in cell adhesion were differentially regulated following depletion of the cytoskeletal functional interactors. Finally, the authors show that Sec23a expression levels are reduced when cells are cultured on dishes with high amounts of ECM to induce focal adhesions, and that inhibition of focal adhesion kinase can rescue Sec23a expression levels.

      Major comments

      The authors successfully implicate a group of cytoskeletal proteins and their actions at focal adhesions in negatively regulating Sec23a expression levels and COPII trafficking. This description of a shared, novel mode of COPII transcriptional regulation by cytoskeletal factors is convincingly shown to be at least a contributor to the delayed trafficking in the presence of focal adhesions. In general, the data are reproducible and use appropriate statistical analysis. However, a more robust description of the architecture of early secretory pathway would be beneficial, especially in the case of MACF1 codepletion which cannot be fully rescued by Sec23a-YFP overexpression. In contrast, trafficking during codepletion of FERMT2 is fully rescued by Sec23a-YFP despite both MACF1 and FERMT2 showing similar loss of Sec23a mRNA levels upon codepletion. This data suggests that while the trafficking delay in FERMT2 codepletion might be exclusively due to reduced Sec23a expression levels, there are likely additional causes for the trafficking delay observed in MACF1 codepletion.

      While there is indeed a reduction in the number of ERESs following MACF1 codepletion, the authors report an even more dramatic reduction in 'transport intermediates / cell' as marked by COPI. However, as recent cryo-EM analysis of ERESs has definitively shown, COPI exists stably at ERGIC membranes (1). Thus, an alternative possibility for the more dramatic reduction of COPI sites compared to Sec31a sites in Figures 2B-E is that ERGIC membranes are destabilized following MACF1 codepletion in a manner independent of Sec23a expression, and this destabilization compounds with reduced ERES number to ultimately delay trafficking. To more directly determine whether ERGIC membrane stability is regulated by MACF1, the authors should compare COPI and ERGIC-53 staining among MACF1 codepleted and FERMT2 codepleted cells with and without Sec23a-YFP overexpressed to levels that rescue cargo trafficking. If Sec23a-YFP restores the number of ERGIC puncta marked by these stains in FERMT2 but not MACF1 codepleted cells, it would suggest a role for MACF1 in forming or stabilizing ERGIC membranes, which are known to associate with microtubules and WHAMM, an actin nucleator. Additionally, it would be useful to costain COPII with COPI or ERGIC-53 in control, MACF1 depleted, MACF1 codepleted, and MACF1 codepleted and Sec23a-YFP rescued cells to determine their colocalization. COPII and ERGIC membranes should be almost entirely coupled and juxtaposed in control cells and may be decoupled upon loss of MACF1 if it plays a role in ERGIC membrane localization and stability. These proposed experiments are relevant because ERGIC membranes are sites of COPII cargo delivery and changes in ERGIC stability or localization would suggest an additional mechanism for cytoskeletal regulation of COPII trafficking. These immunofluorescence studies should be straightforward and completed in a few weeks.

      The choice to use VSVG and E-Cadherin for the synchronized release assays unfortunately convolutes interpreting the 'transport ratios' used by the authors to compare the effects of the various codepletions. Each protein progresses beyond the Golgi during secretion, and the authors choose to calculate the ratio of cargo intensity at the plasma membrane normalized to the total cellular cargo. This means that the synchronized release assays and calculated 'transport ratios' assay not only ER to Golgi trafficking, but also trafficking from the Golgi to the plasma membrane. In instances where Sec23a-YFP overexpression does not fully rescue the codepletion, it is possible that additional trafficking delays occur during Golgi to plasma membrane trafficking that cause the 'transport score' to decrease. Thus, the 'transport score' as the authors calculate it is needlessly nonspecific to COPII trafficking and should not be used to compare the codepletions for COPII functional interactors.

      To mitigate unwanted contributions of post-COPII trafficking events from altering 'transport scores,' the authors should use a cargo for synchronized release assays that does not progress past the Golgi such as α-Mannosidase II and quantify a ratio of the perinuclear cargo signal to whole cell signal. Ideally, the screen would be repeated with a more appropriate cargo generating new 'transport scores' for the full list of cytoskeletal proteins. However, this may not be feasible, and as such 'transport scores' based on a Golgi resident protein should at least be produced for the 7 Sec23b functional interactors featured in this manuscript. These Golgi 'transport scores' would add much needed quantification of ER to Golgi transport delays that currently can only be inferred from the representative images in Figure 1E, which unfortunately show significant heterogeneity among cells from the same image. The authors should also explicitly state that any 'transport score' from a synchronous release assay using a cargo destined for the plasma membrane will take into account trafficking rate changes due not only to COPII, but also COPI from the ERGIC to the Golgi, and transport carriers departing from the TGN. These synchronized release assays would likely take between a few weeks to a few months depending on their ability to automate image analysis.

      Minor comments

      It would be useful for the authors to quantify the number of focal adhesions present from Vinculin stains from Figure 4C and 5C instead of just showing representative images. It would be interesting to determine if there is a meaningful relationship between focal adhesion number induced by the codepletions or tissue culture coating and Sec23a expression levels like in Figure 3D. Generally, the figures, text, and references were appropriate.

      Significance

      In recent years, significant effort has been devoted to elucidating mechanisms by which COPII trafficking is modulated in response to cellular cues. These studies have revealed that changes in nutrient availability, growth factors, ER stress, autophagy, and T-cell activation all cause changes in COPII trafficking via unique gene expression, splicing, or post-translational control (2-7). This work elucidates a novel mechanism of transcriptional control driven by focal adhesions. Additionally, it provides a number of potentially useful Sec23a and Sec23b functional interactors among cytoskeletal factors for further study. These unexplored factors may have unique mechanisms of COPII regulation that could contribute to our understanding of ER export modulation. Altogether, this and similar works are building an increasingly complex set of regulatory pathways that when integrated ultimately dictate COPII trafficking kinetics.

      The reported findings are not only relevant to those who study COPII trafficking, but also other fields where secretion is studied in the context of the ECM. This work would suggest that secretion of factors involved in crosstalk between cells, including in tumors, is likely to be controlled by the interactions of cells with ECM.

      Expertise keywords: cell biology, light microscopy, membrane trafficking

      References

      1.Weigel A V., Chang CL, Shtengel G, Xu CS, Hoffman DP, Freeman M, et al. ER-to-Golgi protein delivery through an interwoven, tubular network extending from ER. Cell. 2021 Apr;184(9):2412-2429.e16.

      2.Farhan, H., Wendeler, M. W., Mitrovic, S., Fava, E., Silberberg, Y., Sharan, R., Zerial, M., & Hauri, H. P. (2010). MAPK signaling to the early secretory pathway revealed by kinase/phosphatase functional screening. Journal of Cell Biology, 189(6), 997-1011.

      3.Zacharogianni, M., Kondylis, V., Tang, Y., Farhan, H., Xanthakis, D., Fuchs, F., Boutros, M., & Rabouille, C. (2011). ERK7 is a negative regulator of protein secretion in response to amino-acid starvation by modulating Sec16 membrane association. EMBO Journal, 30(18), 3684-3700.

      4.Lillmann, K.D., V. Reiterer, F. Baschieri, J. Hoffmann, V. Millarte, M.A. Hauser, A. Mazza, N. Atias, D.F. Legler, R. Sharan, et al 2015. Regulation of Sec16 levels and dynamics links proliferation and secretion. J. Cell Sci. 128:670-682.

      5.Liu, L., Cai, J., Wang, H., Liang, X., Zhou, Q., Ding, C., Zhu, Y., Fu, T., Guo, Q., Xu, Z., Xiao, L., Liu, J., Yin, Y., Fang, L., Xue, B., Wang, Y., Meng, Z. X., He, A., Li, J. L., ... Gan, Z. (2019). Coupling of COPII vesicle trafficking to nutrient availability by the IRE1α-XBP1s axis. Proceedings of the National Academy of Sciences of the United States of America, 116(24), 11776-11785.

      6.Jeong, Y.-T., Simoneschi, D., Keegan, S., Melville, D., Adler, N. S., Saraf, A., Florens, L., Washburn, M. P., Cavasotto, C. N., Fenyö, D., Cuervo, A. M., Rossi, M., & Pagano, M. (2018). The ULK1-FBXW5-SEC23B nexus controls autophagy. ELife, 1-25.

      7.Wilhelmi, I., Kanski, R., Neumann, A., Herdt, O., Hoff, F., Jacob, R., Preußner, M., & Heyd, F. (2016). Sec16 alternative splicing dynamically controls COPII transport efficiency. Nature Communications, 7, 12347. https://doi.org/10.1038/ncomms12347

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript by Jung et al, which is based on a targeted siRNA screen, demonstrates regulation of SEC23A (component of the SEC23 complex of the COP coat) levels at the transcriptional level downstream of focal adhesion signaling. Using siRNA-mediated downregulation, the authors were able to identify proteins which either increased or decreased traffic of VSVG through the secretory pathway when combined with downregulation in the levels of either SEC23A or SEC23B. Authors have focused on a group of SEC23B functional interactors, downregulation of which increases the size of focal adhesions and also downregulates SEC23A levels, thus providing an explanation for reduced secretory traffic. Authors further show that plating cells on fibronectin or Matrigel, which activate Focal adhesion kinase signaling, also results in downregulation of SEC23A transcript levels. The screen is conducted in a well-controlled manner for the most part, with a clear explanation of the analysis routines, and the data presentation is of very good quality. Most important results have been validated by more than one experimental strategy, which lends substantial confidence to the findings. The results also open further avenues for understanding the transcriptional regulation in different physiological and disease contexts.

      There are certain issues which the authors should address with regard to controls, and some observations conflict with published results with respect to the effect of downregulating these proteins on focal adhesion size. Additionally, authors don't tie up the loose ends by monitoring secretory traffic in cells grown on different matrices but include it in the model. Addressing/explaining these issues could improve this manuscript, and the model may have to be tweaked a bit.

      Major comments:

      1)I wonder why the authors only used siRNA control in their screen when the effects are scored in context of double knockdown fashion in combination with mild knockdown of SEC23A and SEC23B to get functional interactors. Control siRNA in combination with SEC23A and SEC23B should have been two ideal negative controls in the screen. Nevertheless, in data presented Figure 1E and whole of Figure 2, using control siRNA in combination with SEC23B siRNA would have been ideal control to show that the combination does not induce any trafficking defects which could impact the findings of the study. Hence, a few of the data presented from some of these figures should have sicontrol+SEC23B siRNA combination as a control.

      2)What is the identity of post-ER structures which authors refer to in Figure 2A? Could the images represent VSVG concentrated at ER exit sites? Authors should stain with markers for ERES to see if the VSVG puncta colocalize with it.

      3)Based on RNA sequencing results, the authors chose to follow up on SEC23A levels against a background of siRNA knockdown of components (like MACF1, ROCK1, FERMT2 etc.) which regulate focal adhesions in cells, and show that there is a reduction in both transcript and protein levels of SEC23A. In the images shown in Figure 2B and Figure 2C, levels of SEC31A and β-Cop1 are reduced. The authors should test using qPCR and western blots whether there is a downregulation of SEC31A, β-Cop1 and SEC23B in siRNA knockdowns of MACF1, ROCK1, FERMT2 etc. It would provide new insights if there were a co-regulation of secretory machinery to modulate the secretory traffic in response to focal adhesion-based signaling.

      4)The major concern in this manuscript surrounds the results presented in Figure 4C. The authors show that in response to all the knockdowns they see more focal adhesions, as monitored by vinculin staining, and from this, along with the experiments with cells plated on Matrigel and fibronectin, they arrive at the conclusion that increased focal adhesion signaling downregulates SEC23A levels, which presumably modulates secretory traffic. I am not an expert on focal adhesions, but based on my understanding of the literature on that topic, downregulation of ROCK1 or FERMT2 disrupts focal adhesions (see, for example, Theodosiou et al., eLife, 2016 or Lock et al., PLoS One, 2012). How do the authors explain their results in siRNA knockdown of ROCK1 and FERMT2, which leads to an increased size of focal adhesions, seemingly contradicting the published results? To clarify these results, the authors should test phosphorylation of FAK in their siRNA backgrounds, which is another readout of focal adhesion signaling. The experiments with cells grown on fibronectin and Matrigel favor the argument which the authors put forth, but the authors may have to tweak the model a bit based on FAK phosphorylation and FAK signaling in the context of the above-mentioned knockdowns.

      5)What happens to VSVG traffic or RUSH-Cadherin traffic when cells are plated on Matrigel and Fibronectin? Reduction in secretory traffic of these is an important experiment which is missing to close the loop and validate the model presented. Authors must test these experiments either with cells grown on matrix alone or in combination with siRNA to SEC23B. Authors should also monitor ERES and transport carriers in this background.

      6)This is not such a major issue, but it would be good to see a comparison of SEC23A levels in the siRNA knockdown condition with those when cells are grown on different substrates and in ROCK1 and FERMT2 knockdowns (blots of which the authors already have in this manuscript).

      Minor comments:

      1)Scale bars are missing in EM images in Figure 2H.

      2)Show molecular weight markers in Western blots in main figure 3E and supplementary figure S1E.

      Significance

      I have looked at the manuscript through the lens of a cell biologist, as that is predominantly my area of expertise. In that respect I find the screen conducted by the authors particularly interesting, as they aim to connect how extracellular cues regulate the secretory pathway. A screen seems justified, as there is no comprehensive understanding linking the two above-mentioned processes. The authors have done a functional interaction screen and analyzed a lot of images to identify candidates which either increase or decrease secretory traffic in combination with SEC23A and SEC23B. Such a functional screen has helped the authors identify candidates which were otherwise missed in single siRNA knockdowns in their previous work from 2012. This definitely opens up interesting avenues to test the candidates identified in the screen in different physiological contexts and in disease, as well as the transcriptional program connecting focal adhesion signaling with the regulation of components governing secretion. Such functional interaction screens could also be employed to identify crosstalk of different cellular processes with the regulation of the secretory pathway at the ER as well as at the Golgi apparatus.

      Referees cross-commenting

      I agree with the comments from both the referees that the manuscript is very interesting and most experiments are well controlled, but the quantification of the focal adhesion phenotype in knockdowns needs to be done more extensively, and secretion phenotypes need to be measured upon plating cells on different matrices to validate the model presented.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Jung et al. reports an interesting finding that focal adhesion signaling regulates the expression of Sec23A and thereby regulates COPII-dependent trafficking. The data presented are mostly solid and the finding itself is highly novel, as it tackles an area of secretory trafficking that remains poorly understood, namely the connection between the ECM and secretion.

      I will list below all comments that I have mixing both technical and conceptual topics:

      Technical issues:

      1-The authors should provide a better description of how they designed this siRNA library. What were the inclusion criteria for these 378 genes? I might have missed it, but I could not find this information easily.

      2-Figure 2: I know this is challenging for EM images, but is there a way the authors could quantify these data? How many images were looked at? What was the average width of the ER cisternae?

      3-Figure 4: I think that the characterization of the FA phenotype is a bit underdeveloped. There is no quantification of these data. Is the size of FAs changing? Is the number of FAs per cell changing? Is the length of FAs changing? I think that more work is needed to increase the confidence in these data. I also could not easily see what type of cells these are, so a better description of this experiment is required. Also, how many cells were analyzed? I think it is important that this experiment is done with a sufficient number of cells to increase the confidence in the data.

      Conceptual issues:

      1-The finding that focal adhesion signaling negatively affects ER-export is surprising, because cancer cells that grow on stiff substrates have more focal adhesions and are more invasive and migratory. Both migration and invasion are expected to depend on ER-export. Although the authors did not formally test Sec23A expression under different stiffnesses, I would expect that stiff substrates would lower Sec23A expression and thereby negatively affect ER-export. It would certainly increase the breadth of this work to include data like this and to also discuss this highly surprising finding. However, it is of course the decision of the authors and the editors to decide whether such an experiment would benefit the entire story.

      2-The authors postulate that this novel mechanism could be part of a feedback loop. If this were the case one would expect the acute effect of FA to increase ER-export (or secretion) and the negative feedback will then reduce secretion. However, the acute effect of FA is not addressed in this manuscript. In order to postulate a feedback loop, the authors would need to test the individual nodes of this loop.

      Significance

      I think that the basic finding of this manuscript is highly novel, showing the impact of the ECM and focal adhesions on COPII-dependent trafficking. I think that this will appeal not only to people from the trafficking community, but also to people working on cell migration and on mechanobiology. The work in its current form does not require much extra effort (max. 3 months). However, if the authors decide to increase the breadth of the data, they would require 3-6 months.

      Referees cross-commenting

      I went through the comments of the two other reviewers and agree with their verdict. Some extra work on the characterization of the early secretory pathway would be good. Both reviewers provided a nice catalogue of possible experiments to choose from.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We want to thank all three reviewers for their positive feedback, constructive comments, and suggestions for clarity and improvement. We are delighted to find their consensus that the manuscript represents a contribution to the field.

      Accordingly, we made changes in the text (all highlighted in blue in the revised manuscript) and added a new figure as detailed in the point-by-point response.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors describe results of the comprehensive analysis of the prevalence and functionality of intrinsically disordered regions of the pathogen-encoded signaling receptor Tir, which serves as an illustrative example of the bacterial effector proteins secreted by Attaching and Effacing (A/E) pathogens. This is an interesting and important study that represents an impressive amount of data generated computationally and using a broad spectrum of biophysical techniques. The work serves as a model of the well-designed and perfectly conducted study, where intriguing conclusions are based on the results of the comprehensive experiments. The manuscript is well-written and concise, and I have a real pleasure reading it. The text and figures are clear and accurate.

      We thank the Reviewer for these positive comments on our work.

      Although, in general, prior studies are referenced appropriately, the authors should mention that the pre-formed structural elements they found in Tir are in line with the concept of "PreSMos" (pre-structured motifs) previously introduced and described in several important studies from the laboratory of Kyou-Hoon Han.

      We thank the Reviewer for this suggestion. We have added a sentence to acknowledge the presence of “PreSMos” in the target-free state of Tir as putative signatures for target-binding, referring to a review article summarizing several local structural elements in unbound IDPs:

      “This supports the presence of pre-structured motifs (PreSMos) as pre-existing signatures for target binding and function within target-free Tir (72).”

      Please, note that we decided to keep this discussion to a minimum, as we cannot rule out the contribution of the induced fit model to the binding mechanism (i.e., disorder-to-order transition upon binding).

      Reviewer #1 (Significance (Required)):

      Solid evidence is provided that structural disorder and short linear motifs represent common features of A/E pathogen effectors. In fact, using a set of bioinformatics tools, the authors first show that although prokaryotic proteins typically contain significantly less intrinsic disorder than eukaryotic proteins, A/E pathogen effectors are as disordered as eukaryotic proteins. Using the translocated intimin receptor (Tir) as a subject of focused study, the authors then utilized a number of biophysical techniques to draw an impressive picture of disorder-based functionality. This study clearly represents a major advancement in the field of functional intrinsic disorder in general and in disorder-based functionality of proteins expressed by pathogenic bacteria. This work adds significantly to the field and will have a noticeable impact.

      Again, reading this manuscript was a real joy. Finally, this work perfectly fits my area of expertise, since for the past 25 years or so I have been working on different aspects of intrinsically disordered proteins.

      Thank you for this encouraging assessment.

      **Referee Cross-commenting**

      I agree with the amended recommendation of reviewer #3 to add in the manuscript EPEC O127.

      According to the suggestion of Reviewer #3, we have now included EPEC O127:H6 in the manuscript.

      I completely agree with comments of reviewer #2 and partially agree with reviewer #3. In my view, comparison of various strains as references for EPEC represents an interesting but independent project. It can be recommended to the authors as one of the potential future developments of their work.

      Thanks for the suggestion. We are pursuing that line of research.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The general impression is that this is an excellent study that establishes

      The C-terminal intracellular region of Tir called C-Tir spanning residues 338 to 550 is largely disordered, however, observe helical structural elements involved with lipid interactions; multi-phosphorylation. The intracellular N-terminal part of Tir called N-Tir spanning residues 1 to 233 is also partially disordered but include a folded domain that is shown to assemble into a dimer

      The only major concern is that no SDS-PAGE gels or size exclusion chromatograms have been included to verify the purity and monodispersity of the various constructs worked on. In particular, SAXS and CD measurements are highly sensitive to purity and to the level of degradation, as IDPs are notorious for being difficult to handle in solution. It would strengthen the arguments made based that

      We produced N-Tir and C-Tir as fusion proteins with a cleavable N-terminal thioredoxin tag (Trx-His6) and C-terminal Strep-tag. The latter allowed us to purify them via Strep-tag affinity chromatography as indicated by SDS-PAGE (please see Fig. S1).

      We agree with the Reviewer that even small amounts of impurities (i.e., higher oligomers/degradation) can interfere with the data analysis and make interpretation of the resulting data difficult and potentially misleading. So, to avoid such problems, all samples were purified in monodispersed forms by size-exclusion chromatography (SEC) before any biophysical study.

      Following the Reviewer's suggestion, we added a new supplementary figure (Fig. S5) showing the SEC-SAXS chromatogram profiles of C-Tir, N-Tir, and NS-Tir. Briefly, in the inline SEC-SAXS experiment, the sample elutes from an HPLC system directly and continuously into a BioSAXS flow cell for subsequent X-ray interrogation. Under our experimental conditions, C-Tir elutes as a single peak with Rg-values and mass compatible with a disordered monomeric protein, providing an excellent fit to the experimental SAXS curves. For N-Tir and NS-Tir, SEC-SAXS allowed us to separate the dimer from small amounts of higher-order oligomers and so obtain the experimental SAXS curves of the pure dimers.

      “Fig. S5. SEC-SAXS chromatograms of (A) C-Tir, (B) N-Tir, and (C) NS-Tir. Each plane shows normalized total scattering intensity I(s), over the entire s range, from each frame acquired along elution volume and respective Rg-value (black circles). The flat variation of Rg reflects a pure monodisperse sample. The column type for size exclusion chromatography and sample concentrations are on the top left of each panel. For reference, the retention volume for monomeric BSA (66.4 kDa) is displayed by red triangles.”

      **Minor Comments**

      Read through the manuscript to remove passages with spoken language

      We thank the Reviewer for this suggestion. We went through the manuscript and improved the writing to reduce passages with spoken language.

      Line 263, "To do so", should be removed

      Line 290 "Our data thus" replaced with "this"

      We have amended the manuscript accordingly.

      Line 292 "lipid bilayers that might potentially fine-tune Tir's activity in the host cell." Weak sentence and the word fine-tune is slang. Rewrite the sentence. The interaction with lipids is fascinating!

      Thanks for the suggestion. The sentence has now been changed to “**This shows that C-Tir can undergo multivalent and tunable electrostatic interaction with lipid bilayers via pre-structured elements, suggesting that membrane-protein interplay at the intracellular side might control the activity and interactions of Tir in host cells.**”

      We also reinforce this fascinating message in the abstract by adding the sentence: “Membrane affinity is residue-specific and modulated by lipid composition, suggesting a previously unrecognized mechanism for interaction with the host.”

      Line 192 "In figure Fig. 3A," remove the Fig

      Fixed.

      Line 326, "In a similar fashion," is redundant. Rewrite the sentences below.

      We have modified the sentence as follows: “We evaluated whether the N-terminal cytosolic region of Tir (N-Tir; Fig S1) was also intrinsically disordered ...

      Line 342 add spaces between digit and SI unit "52kDa" there are more cases of this.

      Thank you for pointing this out. This has now been corrected to 52 kDa.

      Reviewer #2 (Significance (Required)):

      I expect this study to have broad relevance to microbiologists working with the intimin and translocated intimin receptor, in particular the lipid interaction is likely to be followed up by the community.

      We thank the reviewer for this comment. Indeed, we believe that further studies on Tir's lipid-binding ability as a novel molecular strategy in host-pathogen interactions, will potentially provide new insights on virulence, transmembrane signaling in general, and disorder-mediated functions.

      **Referee Cross-commenting**

      What reviewer 3 suggested in the comments sounds like added value and should be included.

      I agree with reviewer 1, that the strain comparison potentially is beyond the scope presented in this manuscript.

      We have now included EPEC O127:H6 in the manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      This interesting manuscript looks at the structure of the Nter and Cter of the effector Tir from enteropathogenic E. coli. The authors confirmed a previous study highlighting the "disordered" part of the Cter. However, the extended experimental work (NMR, small-angle X-ray scattering and CD spectroscopy) from this study also reveals the connection between different areas of Tir and its implication during Tir phosphorylation and its interactions with the SH2 domain.

      We thank the Reviewer for this positive remark. Indeed, in our work, we highlight the structural features of the SH2-mediated interaction between Tir and host SHP-1 protein, and we also show that C-Tir is capable of lipid interaction via pre-structured motifs and that N-Tir is disordered but assembled into a dimer. Overall, we provide an updated and wide picture of Tir's intracellular side that goes beyond the scrutiny of previously described disorder features.

      **Major Comments:**

      The authors used the E2348/69 (O127:H6) strain as a reference for EPEC. However, this strain has the fewest effectors of all the EPEC sequences, so its use may overestimate the PDR in EPEC. It would be wiser to use a strain like B171 as a reference for EPEC to be able to conclude "Disordered Proteins (PDR) with long disordered regions occur in EPEC effectors similar to the human proteome". I believe that the PDR in EPEC is similar to EHEC and CR. I do not have any major concern for the rest of the work.

      We thank the Reviewer for this comment. So, to clarify, we amended “EPEC” with “EPEC O127:H6” in text and figures.

      We also added a paragraph at the beginning of the Discussion section to acknowledge that our prediction analysis concerns EPEC O127:H6 and two additional representative A/E bacteria strains:

      “Among the enteropathogenic Escherichia coli strains, EPEC O127:H6 (E2348/69) is commonly used as a prototype strain to study EPEC biology, genetics, and virulence (69). Here, we have determined the structural disorder propensity of EPEC O127:H6 sequences and two additional representatives of A/E bacteria: EHEC O157:H7 and CR ICC168.”

      Finally, the Reviewer suggests including EPEC strain B171 (serotype O111:NM) in our analysis. We agree that considering additional strains would be of value; however, we believe that this is beyond the scope of this manuscript, which mainly focuses on the characterization of the structural features of the E2348/69 Tir effector. We are currently working on a broader comparative analysis among different Escherichia coli pathogenic strains, including B171, and we hope to share our findings with the community in the near future.

      **Minor comments**

      Statistical problem: the Mann-Whitney U test (Wilcoxon rank-sum test) is a comparison of two independent samples with the underlying assumption that the data are normally distributed or that the samples are sufficiently large. It is not certain that either of these assumptions is correct. In addition, the effectors are part of the whole proteome. Can both groups then be considered independent?

      We thank the Reviewer for this remark, which allows us to clarify the choice of this particular test. Indeed, the Mann-Whitney U test is a non-parametric test for comparing two samples, with the alternative hypothesis being that one of the two samples is stochastically greater than the other. Because it is a non-parametric test, the samples are not required to be normally distributed, as they are for Student's t-test.

      Regarding the independence of the samples, when comparing the effector collections to their corresponding proteomes, we excluded the effector sequences from the latter. We have clarified this point in the Supplementary Material and Methods section.
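
      The comparison described in this reply can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' actual pipeline: the protein identifiers and disorder fractions are invented, the helper names are hypothetical, and the p-value uses the large-sample normal approximation (with tie and continuity corrections), which is rough for a toy sample this small. It shows the two points made above: the effectors are removed from the proteome background so that the two groups do not overlap, and the test works on ranks rather than assuming normality.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u_greater(x, y):
    """One-sided Mann-Whitney U test: is x stochastically greater than y?

    Returns (U for x, p-value from the normal approximation).
    """
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = rank_with_ties(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    # Variance of U under the null hypothesis, corrected for ties.
    n = n1 + n2
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0

    z = (u1 - n1 * n2 / 2 - 0.5) / math.sqrt(variance)  # continuity correction
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability
    return u1, p_value


# Hypothetical disorder fractions per protein (illustrative values only).
disorder = {
    "p1": 0.05, "p2": 0.10, "p3": 0.08, "p4": 0.12, "p5": 0.02, "p6": 0.15,
    "e1": 0.40, "e2": 0.35, "e3": 0.55,
}
effector_ids = {"e1", "e2", "e3"}

effectors = [disorder[pid] for pid in sorted(effector_ids)]
# Exclude the effectors from the background so the two samples are disjoint.
background = [v for pid, v in disorder.items() if pid not in effector_ids]

u_stat, p_value = mann_whitney_u_greater(effectors, background)
print(f"U = {u_stat}, one-sided p = {p_value:.4f}")
```

      With real proteome-scale data, an established statistics library would normally be preferable to this hand-written version, particularly for exact p-values in small samples.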

      Line 120 and 442: O127 not H127

      Thank you for pointing this out. It has now been corrected to O127.

      Line 212: positions 409 or 405?

      Yes, it should be 405. Thank you.

      Reviewer #3 (Significance (Required)):

      **Nature and significance:**

      Tir plays a major role during EPEC infection. It is a signalling platform that has been reported to interact with multiple proteins. Whereas the extracellular part has been well characterised and crystallised, the intracellular part has so far proven difficult to study. Over the last decade, no progress has been made to explain how Tir works. This manuscript provides interesting information that sheds some light on how the protein could work.

      **Existing literature:**

      The last research manuscript trying to highlight the structural function of Tir dates from 2007 (PMC1896257). This study is far more extended and in depth than any other previous work done.

      **Audience:**

      The audience is probably limited to researchers working in the field of cellular microbiology and on the functions associated with bacterial effectors in the host. This study could also be a useful tool to identify new effectors based on their "disorder".

      We thank the Reviewer for recognizing the importance of this study. We agree that our work highlights the pivotal role of disordered regions in bacterial effectors, thus enabling a better understanding of the molecular mechanisms used by pathogens to subvert the host-cell processes. We indeed believe that our work can stimulate further research on the characterization of intrinsically disordered effectors, and also beyond the cellular microbiology field, in order to gain a broader knowledge on the molecular dialogue at the host-pathogen interface, which is essential to design better therapeutic strategies.

      **Expertise:**

      I have been working on A/E pathogens for the last 15 years with a particular interest in Tir signalling. My domain of expertise is more in relation to cell signalling than crystallography or structural study.

      **Referee Cross-commenting**

      I agree with both reviewers. My comment about EPEC is more about the conclusion for some of the figures. I don't think they should conclude for the whole EPEC. The Tir variation among EHEC O157:H7 is low, but it is far more diverse for EPEC. Simply adding in the manuscript EPEC O127 should be enough.

      We thank the Reviewer for this comment. As mentioned above, we now state in the manuscript, in both Results and Discussion sections, that we used E2348/69 as a representative strain for EPEC.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This interesting manuscript looks at the structure of the Nter and Cter of the effector Tir from enteropathogenic E. coli. The authors confirmed a previous study highlighting the "disordered" part of the Cter. However, the extended experimental work (NMR, small-angle X-ray scattering and CD spectroscopy) from this study also reveals the connection between different areas of Tir and its implication during Tir phosphorylation and its interactions with the SH2 domain.

      Major Comments:

      The authors used the E2348/69 (O127:H6) strain as a reference for EPEC. However, this strain has the fewest effectors of all the EPEC sequences, so its use may overestimate the PDR in EPEC. It would be wiser to use a strain like B171 as a reference for EPEC to be able to conclude "Disordered Proteins (PDR) with long disordered regions occur in EPEC effectors similar to the human proteome". I believe that the PDR in EPEC is similar to EHEC and CR. I do not have any major concern for the rest of the work.

      Minor comments

      Statistical problem: the Mann-Whitney U test (Wilcoxon rank-sum test) is a comparison of two independent samples with the underlying assumption that the data are normally distributed or that the samples are sufficiently large. It is not certain that either of these assumptions is correct. In addition, the effectors are part of the whole proteome. Can both groups then be considered independent?

      Line 120 and 442: O127 not H127

      Line 212: positions 409 or 405?

      Significance

      Nature and significance:

      Tir plays a major role during EPEC infection. It is a signalling platform that has been reported to interact with multiple proteins. Whereas the extracellular part has been well characterised and crystallised, the intracellular part has so far proven difficult to study. Over the last decade, no progress has been made to explain how Tir works. This manuscript provides interesting information that sheds some light on how the protein could work.

      Existing literature:

      The last research manuscript trying to highlight the structural function of Tir dates from 2007 (PMC1896257). This study is far more extended and in depth than any other previous work done.

      Audience:

      The audience is probably limited to researchers working in the field of cellular microbiology and on the functions associated with bacterial effectors in the host. This study could also be a useful tool to identify new effectors based on their "disorder".

      Expertise:

      I have been working on A/E pathogens for the last 15 years with a particular interest in Tir signalling. My domain of expertise is more in relation to cell signalling than crystallography or structural study.

      Referee Cross-commenting

      I agree with both reviewers. My comment about EPEC is more about the conclusion for some of the figures. I don't think they should conclude for the whole EPEC. The Tir variation among EHEC O157:H7 is low, but it is far more diverse for EPEC. Simply adding in the manuscript EPEC O127 should be enough.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The general impression is that this is an excellent study that establishes The C-terminal intracellular region of Tir called C-Tir spanning residues 338 to 550 is largely disordered, however, observe helical structural elements involved with lipid interactions; multi-phosphorylation. The intracellular N-terminal part of Tir called N-Tir spanning residues 1 to 233 is also partially disordered but include a folded domain that is shown to assemble into a dimer

      The only major concern is that no SDS-PAGE gels or size exclusion chromatograms have been included to verify the purity and monodispersity of the various constructs worked on. In particular, SAXS and CD measurements are highly sensitive to purity and to the level of degradation, as IDPs are notorious for being difficult to handle in solution. It would strengthen the arguments made based that

      Minor Comments

      Read through the manuscript to remove passages with spoken language

      Line 263, "To do so", should be removed

      Line 290 "Our data thus" replaced with "this"

      Line 292 "lipid bilayers that might potentially fine-tune Tir's activity in the host cell." Weak sentence and the word fine-tune is slang. Rewrite the sentence. The interaction with lipids is fascinating!

      Line 192 "In figure Fig. 3A," remove the Fig

      Line 326, "In a similar fashion," is redundant. Rewrite the sentences below.

      Line 342 add spaces between digit and SI unit "52kDa" there are more cases of this.

      Significance

      I expect this study to have broad relevance to microbiologists working with the intimin and translocated intimin receptor, in particular the lipid interaction is likely to be followed up by the community.

      Referee Cross-commenting

      What reviewer 3 suggested in the comments sounds like added value and should be included.

      I agree with reviewer 1, that the strain comparison potentially is beyond the scope presented in this manuscript.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The authors describe results of the comprehensive analysis of the prevalence and functionality of intrinsically disordered regions of the pathogen-encoded signaling receptor Tir, which serves as an illustrative example of the bacterial effector proteins secreted by Attaching and Effacing (A/E) pathogens. This is an interesting and important study that represents an impressive amount of data generated computationally and using a broad spectrum of biophysical techniques. The work serves as a model of the well-designed and perfectly conducted study, where intriguing conclusions are based on the results of the comprehensive experiments. The manuscript is well-written and concise, and I have a real pleasure reading it. The text and figures are clear and accurate.

      Although, in general, prior studies are referenced appropriately, the authors should mention that the pre-formed structural elements they found in Tir are in line with the concept of "PreSMos" (pre-structured motifs) previously introduced and described in several important studies from the laboratory of Kyou-Hoon Han.

      Significance

      Solid evidence is provided that structural disorder and short linear motifs represent common features of A/E pathogen effectors. In fact, using a set of bioinformatics tools, the authors first show that although prokaryotic proteins typically contain significantly less intrinsic disorder than eukaryotic proteins, A/E pathogen effectors are as disordered as eukaryotic proteins. Using the translocated intimin receptor (Tir) as a subject of focused study, the authors then utilized a number of biophysical techniques to draw an impressive picture of disorder-based functionality. This study clearly represents a major advancement in the field of functional intrinsic disorder in general and in disorder-based functionality of proteins expressed by pathogenic bacteria. This was adds significantly to the field and will have a noticeable impact.

      Again, reading this manuscript was a real joy. Finally, this work fits perfectly within my area of expertise, since I have been working on different aspects of intrinsically disordered proteins for the past 25 years or so.

      Referee Cross-commenting

      I agree with the amended recommendation of reviewer #3 to add EPEC O127 to the manuscript.

      I completely agree with comments of reviewer #2 and partially agree with reviewer #3. In my view, comparison of various strains as references for EPEC represents an interesting but independent project. It can be recommended to the authors as one of the potential future developments of their work.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Author Response to Reviewer Comments

      Review Commons

      Manuscript number: RC-2021-00979

      Corresponding author(s): Horvitz, H Robert

      Reviewer #1:

      Major comments: The manuscript is very well written and results have been very clearly presented. The key conclusions drawn by the authors are convincing. However, one of the claims by the authors is not supported by the data. In lines 206-215 the authors discuss experiments where they visualized the morphology of the AIAs in ctbp-1 mutants where ctbp-1 expression is restored temporally in the L4-young adult stage using a heat-shock promoter construct. The authors conclude that "ctbp-1 can act ... in older worms to maintain aspects of AIA morphology in a manner similar to AIA gene expression." However, the data presented in Fig. 3I-L show no statistically significant difference between ctbp-1 mutants and mutants with the HS-construct, either with or without heat shock. Thus, although there seems to be some effect of the heat shock, this is not significant and thus does not support the conclusion of the authors. In addition, an important control is missing. How does the heat shock affect the morphology of AIAs in wt or ctbp-1 animals, without the hs-construct?

      We agree with this comment and have updated the manuscript to clarify that the suggestion that CTBP-1 acts to prevent further disruption of AIA morphology is speculative. We will conduct the suggested control experiment and include the results in a revised version of the manuscript.

      Apart from the above, all strong claims by the authors are valid. In addition, the authors suggest a mechanism, where CTBP-1 regulates the function of the EGL-13 transcription factor in AIA and that overexpression of CEH-28 in AIA contributes to the olfactory adaptation defect observed in the ctbp-1 mutant animals. These mechanistic speculations could be relatively easily strengthened by two additional experiments. One, does ctbp-1 loss of function affect egl-13 expression? The model presented in Fig 8 suggests that egl-13 expression levels are not affected, but from the data in the paper it is not even clear if egl-13 is expressed in AIA. Whether egl-13 is expressed in AIA, and if its expression levels are affected by mutation of ctbp-1, could be tested using egl-13::gfp expressing animals.

      This is an excellent suggestion, and we had already been attempting these experiments. Once they are complete, we will include the findings in a revised version of the manuscript.

      Two, does overexpression of ceh-28 cause an olfactory adaptation defect? This could be tested by cell specific overexpression of ceh-28 in AIA.

      This is also a great suggestion. We will conduct this experiment and include the findings in a revised manuscript.

      The data and the methods have been presented in such a way that they can be reproduced. I do have some doubts with regard to the statistical analysis. The authors report that statistical analysis involved unpaired t-tests. But as all results involve the analysis of data from 3-5 different strains, a multiple sample analysis should be used. To correct for the number of samples, one should first use an ANOVA to test for statistical differences, followed by a post hoc analysis to identify those that are significantly different.

      We agree with this criticism. We have replaced instances of multiple sample analyses with a one-way ANOVA test followed by Tukey’s multiple test correction. The current version of the manuscript reflects these changes in figures, figure legends and in the Materials and Methods.
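
      As an illustration of the kind of analysis discussed in this exchange (a one-way ANOVA followed by Tukey-style pairwise comparisons), the following is a minimal sketch in plain Python. It is not the authors' analysis code: the function names `one_way_anova` and `tukey_q`, the group labels, and all measurement values are hypothetical and chosen only for illustration. The sketch computes the F statistic and the Tukey-Kramer studentized range statistic q for each pair of groups; turning these into p-values requires the F and studentized range distributions, which is left to a statistics package.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def one_way_anova(groups):
    """Return (F statistic, within-group mean square, within-group df)."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, ms_within, df_within


def tukey_q(groups, ms_within):
    """Tukey-Kramer studentized range statistic q for every pair of groups."""
    q = {}
    for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
        standard_error = sqrt(ms_within / 2 * (1 / len(a) + 1 / len(b)))
        q[(name_a, name_b)] = abs(mean(a) - mean(b)) / standard_error
    return q


# Hypothetical data: one score per animal for three strains.
strains = {
    "wild type": [0.10, 0.20, 0.15, 0.05],
    "mutant": [0.80, 0.90, 0.85, 0.75],
    "rescue": [0.20, 0.30, 0.25, 0.15],
}
f_stat, ms_within, df_within = one_way_anova(strains)
q_stats = tukey_q(strains, ms_within)
```

      The point of the two-step design is that the ANOVA asks once whether any strain differs, and the pairwise q statistics are then judged against a single critical value that accounts for the number of groups, rather than running an uncorrected t-test for every pair.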

      Reviewer #2: Major comments:

      1. The paper is well written and figures are clearly organized. The authors made suitable conclusions based on the data provided. Materials and methods are appropriately described for reproducibility.
      2. The model (Figure 8) would be strengthened by testing the physical interaction between CTBP-1 and EGL-13 in AIA using BiFC.

      We agree and are currently attempting such experiments. Meaningful results from these experiments will be included in a revised manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): Major comments:

      • The key conclusions of this manuscript are highly convincing and are supported by multiple mutant alleles and rescue experiments.

      • There are certain claims in the manuscript that need to be clarified (detailed below).

      • No additional experiments are essential to support the claims of the paper.

      - Most of the data and the methods are presented well; however, a Table listing genes identified in the AIA-specific RNA Seq is required. The GEO accession number has been made available for the RNA Sequencing data; however, listing the genes identified would aid the reader. Were ctbp-1 and egl-13 shown to be expressed in the AIAs using this approach?

      We have included such a table, replacing Fig. S6 (which previously showed only ceh-28 expression) with a table listing expression of all confirmed hits from the scRNA-Seq experiment. ctbp-1 and egl-13 were also found to be expressed in the AIA neurons in this scRNA-Seq experiment.

      - No evidence is presented that EGL-13 is expressed in the AIAs?

      As noted above, the scRNA-Seq experiment showed egl-13 expression in the AIAs. We also will assay egl-13 expression in the AIAs using a GFP reporter and include the results in a revised manuscript.

      - Can the authors comment and include in the manuscript information regarding whether the promoters of AIA-expressed genes that are regulated by EGL-13 contain EGL-13 binding sites? Also, are the promoters of AIA-expressed genes not regulated by EGL-13 missing these sites?

      We have added such information to the manuscript. Briefly, our analysis identified no promising candidates for EGL-13 binding sites in the promoter regions of either ceh-28 or acbp-6, suggesting that regulation of these by EGL-13 is likely indirect. Further, no previous work has indicated that either of these genes is regulated directly by EGL-13, although in the case of acbp-6 little is known about this gene or the ways in which it is regulated. However, the claim that EGL-13 regulates expression of acbp-6 and ceh-28 indirectly is speculative and is not a conclusion of this current work.

      - Experiments and statistical analysis are adequate.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, Saul et al. found that the CTBP-1 transcriptional co-repressor acts cell-autonomously to maintain aspects of AIA neuronal fate, morphology and function. They found that CTBP-1 utilizes the Sox transcription factor EGL-13 to transcriptionally repress specific genes in the AIA neurons. This work proposes that CTBP-1 and other co-repressors play critical roles in selectively maintaining or repressing expression of specific genes.

      Major comments:

      • The key conclusions of this manuscript are highly convincing and are supported by multiple mutant alleles and rescue experiments.
      • There are certain claims in the manuscript that need to be clarified (detailed below).
      • No additional experiments are essential to support the claims of the paper.
      • Most of the data and the methods are presented well; however, a Table listing genes identified in the AIA-specific RNA Seq is required. The GEO accession number has been made available for the RNA Sequencing data; however, listing the genes identified would aid the reader. Were ctbp-1 and egl-13 shown to be expressed in the AIAs using this approach?
      • No evidence is presented that EGL-13 is expressed in the AIAs?
      • Can the authors comment and include in the manuscript information regarding whether the promoters of AIA-expressed genes that are regulated by EGL-13 contain EGL-13 binding sites? Also, are the promoters of AIA-expressed genes not regulated by EGL-13 missing these sites?
      • Experiments and statistical analysis are adequate.

      Minor comments:

      I list below a number of changes and typographical errors that will improve the manuscript.

      Page 11 Line 235 - the authors state that ctbp-1 L4s have an increased attraction to butanone. As the chemotaxis index is 0 for the ctbp-1 mutant compared to -0.5 in WT, I understand what the authors mean here, but the statement of "increased attraction" suggests that ctbp-1 mutants are attracted to butanone when they are actually ambivalent to it.

      Page 12 Line 248 - change functioning to functional

      Page 18 Line 397 - it would be helpful to the reader if the authors referred back to the ctbp-1 mutant data (Figure 5) for comparison in Fig 7D.

      Page 19 Line 404 - remove the word causally

      Page 19 Line 414 "However, while conditioned ctbp-1 ceh-28 double mutants appeared similar to both the wild type and ctbp-1 single mutants at the L1 stage (Fig. 7I-J), these double mutants displayed an intermediate phenotype between wild-type and ctbp-1 animals for adaptation at the L4 larval stage (Fig. 7K-L)."

      This sentence is confusing, as the ctbp-1 ceh-28 phenotype is not significantly different from that of the ctbp-1 single mutant.

      Page 50 Line 1001 - change mlg-1 to mgl-1

      Figure 7A-C - please label with the genotype examined.

      Significance

      • This work identifies a function for the transcriptional corepressor CTBP-1 in controlling the expression of a subset of genes in the AIA neurons. It suggests that CTBP-1 may play a similar role in controlling subsets of gene expression in diverse neuronal classes. This would be interesting to examine in single cell sequencing experiments of all C. elegans neurons.
      • This work adds to the literature that describes CTBP-1 functions in the C. elegans nervous system. It also speculates that other transcriptional co-repressors play similar functions in other cells and tissues in other organisms.
      • An audience with interests in cell fate determination and the function of specific gene regulatory modules that control subsets of genes within a cell.
      • My field of expertise is C. elegans neurobiology (axon guidance and cell fate) and I am therefore well-qualified to review this manuscript.

      Referee Cross-commenting

      Comments from other reviewers are fair. I am happy with the overall conclusions.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this paper, the authors identified several mutations from a forward genetic screen in the transcriptional corepressor gene ctbp-1 that cause misexpression of an M4 neuronal marker in the two AIA interneurons in C. elegans. ctbp-1 mutant AIA neurons also display a defect in morphology and sensory function. The penetrance and severity of these defects in gene expression, morphology, and function progressively increase with age. Their data suggest that ctbp-1 acts cell-autonomously and in older worms to maintain gene expression, morphology, and function in AIA neurons. Single-cell RNA sequencing was performed to identify changes in AIA transcriptional profiles between wild type and ctbp-1 mutants. Using the data from AIA transcriptional profiles, they showed that ctbp-1 mutant AIA neurons lose the expression of two genes characteristic of the adult AIA while misexpressing at least two genes uncharacteristic of AIA. Taken together, their findings demonstrate that ctbp-1 acts to maintain the AIA identity at the level of gene expression, morphology, and function, while ctbp-1 does not act to establish the AIA cell identity. Furthermore, the authors identified a few mutations of a SOX family transcription factor gene egl-13 from a forward genetic screen that suppress the ctbp-1 mutant phenotype. The authors conclude from their results that ctbp-1 maintains AIA function and some aspects of AIA gene expression by antagonizing egl-13 function and that ctbp-1 maintains AIA morphology through pathways independent of egl-13.

      Major comments:

      1. The paper is well written and figures are clearly organized. The authors made suitable conclusions based on the data provided. Materials and methods are appropriately described for reproducibility.
      2. The model (Figure 8) would be strengthened by testing the physical interaction between CTBP-1 and EGL-13 in AIA using BiFC.

      Minor comments:

      1. The authors mentioned a previous finding that the mammalian ortholog of EGL-13, SOX6, interacts with the mammalian ortholog of CTBP-1, CtBP2. The authors should also discuss the function of interacting SOX6 and CTBP-1 in mammalian systems.
      2. It would be good to increase the font size of some figures and tables for easier reading.

      Significance

      This study identifies roles of conserved transcriptional corepressor CTBP-1 and a SOX family transcription factor gene egl-13 from unbiased forward genetic screens in the maintenance of AIA interneurons in C. elegans.

      Since CTBP-1 and EGL-13 have mammalian orthologs, although the roles of their mammalian orthologs were not discussed, this study may have broad implications for development in a range of organisms.

      The findings of this study will be of interest to a broad audience in the field of developmental biology, particularly in transcriptional regulation of cell identity maintenance.

      I have expertise in transcriptional regulation of sensory neuron diversification using C. elegans as a model. I am comfortable evaluating this manuscript.

      Referee Cross-commenting

      I agree with the comments from reviewers 1 and 3.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Saul et al. identify the transcriptional corepressor ctbp-1 as a regulator of ceh-28::gfp expression in the AIA neurons of the nematode C. elegans. They find 18 independent mutants in this gene, including several presumptive null alleles. Using cell-specific rescue and temporal expression of the ctbp-1 gene, the authors show that ctbp-1 acts cell autonomously in AIA to regulate ceh-28 expression, and can do so in young adult animals. Next, using various reporters they show that the AIAs do not transdifferentiate to M4-like cells, but the AIAs do show morphological defects, which increase with age of the animal. Using behavioral experiments the authors next determine the functionality of the AIAs in the ctbp-1 mutant animals. They find that loss of ctbp-1 in AIA affects the function of the AIAs and that ctbp-1 does so in young adult animals. The authors conclude the characterization of the AIAs of ctbp-1 mutant animals by identifying several other genes whose expression is misregulated in ctbp-1 animals, using a single cell RNAseq experiment, confirmed using gfp-fusion constructs. These experiments identify one other gene, acbp-6, that is misexpressed in the AIAs of L4 ctbp-1 animals and two genes, sra-11 and glr-2, that are normally expressed in AIA, but not in ctbp-1 animals.

      To find out how ctbp-1 regulates gene expression in AIA, the authors perform a genetic suppressor screen and show that loss of function of egl-13 suppresses the ceh-28::gfp misexpression in AIA in ctbp-1 mutants. They show that egl-13 functions cell-autonomously in the AIAs. They find it does not suppress the morphological defects of the AIAs in ctbp-1 mutants, but it does suppress the effect of ctbp-1 loss of function on olfactory adaptation. In addition, mutation of egl-13 suppressed the misexpression of acbp-6, but not that of sra-11 and glr-2. Finally, the authors show that the olfactory adaptation defect observed in ctbp-1 mutant animals can be partially suppressed by inactivating ceh-28, suggesting that the behavioral defect is caused in part by overexpression of ceh-28.

      The manuscript is very well written and results have been very clearly presented. The key conclusions drawn by the authors are convincing. However, one of the claims by the authors is not supported by the data. In lines 206-215 the authors discuss experiments where they visualized the morphology of the AIAs in ctbp-1 mutants where ctbp-1 expression is restored temporally in the L4-young adult stage using a heat-shock promoter construct. The authors conclude that "ctbp-1 can act ... in older worms to maintain aspects of AIA morphology in a manner similar to AIA gene expression." However, the data presented in Fig. 3I-L show no statistically significant difference between ctbp-1 mutants and mutants with the HS-construct, either with or without heat shock. Thus, although there seems to be some effect of the heat shock, this is not significant and thus does not support the conclusion of the authors. In addition, an important control is missing. How does the heat shock affect the morphology of AIAs in wt or ctbp-1 animals, without the hs-construct?

      Apart from the above, all strong claims by the authors are valid. In addition, the authors suggest a mechanism, where CTBP-1 regulates the function of the EGL-13 transcription factor in AIA and that overexpression of CEH-28 in AIA contributes to the olfactory adaptation defect observed in the ctbp-1 mutant animals. These mechanistic speculations could be relatively easily strengthened by two additional experiments. One, does ctbp-1 loss of function affect egl-13 expression? The model presented in Fig 8 suggests that egl-13 expression levels are not affected, but from the data in the paper it is not even clear if egl-13 is expressed in AIA. Whether egl-13 is expressed in AIA, and if its expression levels are affected by mutation of ctbp-1, could be tested using egl-13::gfp expressing animals.

      Two, does overexpression of ceh-28 cause an olfactory adaptation defect? This could be tested by cell specific overexpression of ceh-28 in AIA.

      These are relatively simple experiments that would not take much time or investment, but would strengthen or clarify the model presented.

      The data and the methods have been presented in such a way that they can be reproduced. I do have some doubts with regard to the statistical analysis. The authors report that statistical analysis involved unpaired t-tests. But as all results involve the analysis of data from 3-5 different strains, a multiple sample analysis should be used. To correct for the number of samples, one should first use an ANOVA to test for statistical differences, followed by a post hoc analysis to identify those that are significantly different.

      Minor comments:

      Page 7: in the heat shock rescue experiment, the authors conclude that ctbp-1 acts "in older worms" to prevent expression of ceh-28 in AIA. "Older" is quite unspecific. Please be specific, i.e. in L4-young adult animals. The same applies to various other phrases where "older" worms are mentioned. Line 229, the authors state that animals were "briefly starved". Please be precise and indicate how long the animals were starved.

      Significance

      Most studies that address cell fate focus on the first phase, where cell fate is determined. How cell fate is maintained is far less well understood. This manuscript convincingly identifies two transcription regulators that are important for cell fate maintenance, both a transcriptional repressor and an activator. The manuscript provides first clues as to how this process functions, and as such provides important conceptual insights. These not only apply to the worm, C. elegans, but, as these are strongly conserved proteins, probably also provide a firm basis for our understanding of cell fate maintenance mechanisms in higher organisms including mammals. In addition, this study reports an excellent model that can be used to further unravel this mechanism. As such, I expect that this manuscript will be of interest to a broad range of scientists interested in cell fate determination and maintenance and transcriptional control.

      My expertise lies in C. elegans behavior, where we focus on identification of the molecular and cellular mechanisms that allow C. elegans to respond to its environment even in changing circumstances. In addition, we study the mechanisms of cell fate determination and maintenance in C. elegans sensory neurons.

      Referee Cross-commenting

      I agree with the comments of reviewers 2 and 3.