  1. Oct 2021
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      Authors developed a novel primer/probe set for detection of subgenomic (sgE) transcripts of SARS-CoV-2, with the aim of developing a system that may predict the presence of infectious virus in patient samples. After studying the specificity and sensitivity of their system, they compared it with already validated/published systems for the diagnosis of SARS-CoV-2 infection. Interestingly, they also studied the effect of the isolation conditions. They showed Vero E6 cells expressing TMPRSS2 (Vero E6-TMPRSS2) to be more sensitive to infection than Vero E6 cells, allowing a higher number of isolations from patient samples. They also showed their system to be more sensitive than a previously published sgE system and than a negative-strand RNA assay, but less sensitive than the WHO/Charité primer/probe set. Nevertheless, all samples containing infectious particles (successful virus isolation on Vero E6-TMPRSS2) were detected with their primer/probe system, in contrast to the other sgE assay tested. They showed the negative-strand assay to be unlikely to detect viral genetic material in samples that nevertheless contain infectious particles.

      **Major comments:**

      Are the key conclusions convincing?

      I salute the authors' intention to try to fix cut-off values for infectious patients, but I would be more careful about the assertion that "using a total viral RNA Ct cut-off of >31 or specifically testing for sgRNA can serve as an effective rule-out test for viral infectivity". It is true that in this study virus was not isolated from any of the samples with a Ct above 31 or negative in the developed sgE assay, but all those assays were done in cell culture. We do not know how human-to-human transmission could occur for those samples. Being able to fix a Ct cut-off for a defined PCR/RT-PCR system would be a great improvement for SARS-CoV-2-infected patients who have to stay in quarantine. It is even more important for Ebola-positive patients in Africa, who have to stay in quarantine in precarious conditions (under tents, in warm temperatures and without privacy) for long periods because they are still positive by RT-PCR. Unfortunately, fixing those values would require a very high number of experiments, including animal experiments.

      We appreciate the reviewer’s acknowledgment of the significance of this issue. We agree that in vivo animal experiments to more precisely determine the lowest infectious or transmissible dose would be valuable, but such experiments are outside the scope of the current study. To acknowledge the reviewer’s important point regarding the unavoidable limitations of cell culture systems, we have modified the abstract (line 51) to say “an effective rule out test for the presence of culturable virus,” a conclusion that is fully supported by our data.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      No

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      No

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Yes.

      Are the data and the methods presented in such a way that they can be reproduced?

      Kinetic of SARS-CoV-2 (figure 2):

      The method is not detailed in the Methods section and is not clear in the figure legend. When supernatants are collected, is all the supernatant removed, or only an aliquot? If an aliquot, do you replace it with new medium?

      We apologize for this omission and have included the requested details in the methods. We seeded a separate well for each time point and collected the entire supernatant for a given time point, rather than replacing media. We added the following text to the methods section (lines 402-412): “Viral growth kinetics were measured in Vero E6 or Vero E6 TMPRSS2 cells at an MOI of 0.001. Separate wells were seeded for each time point, and growth curves were conducted in technical duplicates for each biological experiment. Supernatants and cell lysates were collected twice daily 1 & 2 dpi, and again on 3, 4, 7 and 8 dpi (Vero E6 TMPRSS2 cells were harvested for the final time at day 7 due to faster growth kinetics in this cell type). For each time point, the supernatant was removed and clarified to remove cellular debris, before being split into separate aliquots for RNA extraction (mixed 1:1 with AVE lysis buffer) and viral titration (by focus assay). Dead cells/debris that was pelleted after clarifying supernatants was combined with cells scraped from each well into PBS and spun again to obtain a pellet of all cell material from each timepoint. This pellet was then lysed in AVE viral lysis buffer for RNA extraction.”

      Stability of infectious SARS-CoV-2:

      I am very surprised by your results on the stability of cultured virus, given that we observed a decrease in SARS-CoV-2 titer in our lab after freezing/thawing steps. Do you freeze cell supernatant directly or do you prepare your samples another way? Please state this in the Methods section.

      We measured the stability after freeze/thaw for our normal high concentration viral stocks. Our viral stocks are grown in DMEM with 10% FBS, 1% HEPES, 1% pen/strep, and clarified before use. It is possible that lab-to-lab variation in the media components or HEPES concentration used to prepare viral stocks explains the differences seen in our work vs the reviewer’s lab. We have added the following additional detail to the methods section (lines 415-418) of the manuscript to clarify how these experiments were performed: “High concentration viral stocks (prepared as above in DMEM, 10% FBS, 1% HEPES, 1% pen/strep) were used to measure viral stability over time and after multiple freeze-thaw cycles. Stocks were stored at the indicated temperatures in the dark and aliquots were removed at the indicated days or after each freeze-thaw cycle for measuring infectious virus by focus assay.”

      Are the experiments adequately replicated and statistical analysis adequate?

      Yes

      **Minor comments:**

      Specific experimental issues that are easily addressable.

      Figure 2C and D: Instead of Ct values in cells, it would be more relevant to normalize these results to an endogenous gene and present them as fold change relative to mock-infected cells, because you state that the level of RNA declines and then stays stable over time, but you also note that there is CPE. If you have fewer cells but the same level of viral RNA, it means the RNA level in live cells has increased.

      We have measured the GAPDH level in these cells over time, and that data is included as gray lines in Fig 2 C&D (see new figure 2). As we are combining the cell pellet from clarified supernatants with the cells that remain adherent to the dish for each harvested timepoint we expect to be harvesting the majority of cells/cell debris for each time point. The levels of GAPDH remain broadly similar over the viral growth curve, with no drop in RNA levels.
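
      The normalization the reviewer describes is commonly done with the 2^-ΔΔCt method, which is not what the response above reports (it shows GAPDH levels alongside the viral RNA instead). A minimal Python sketch of that calculation, assuming roughly 100% amplification efficiency for both assays. Because mock-infected cells contain no viral RNA, the calibrator here is assumed to be an early time point; the function name and all Ct values are hypothetical and chosen only for illustration:

```python
def fold_change(ct_target, ct_ref_gene, ct_target_calib, ct_ref_gene_calib):
    """Relative quantity of a target RNA by the 2^-ddCt method.

    Assumes both assays amplify with ~100% efficiency (one doubling per cycle).
    """
    # Normalize the sample to the housekeeping gene (e.g. GAPDH).
    d_ct_sample = ct_target - ct_ref_gene
    # Normalize the calibrator (reference condition) the same way.
    d_ct_calib = ct_target_calib - ct_ref_gene_calib
    dd_ct = d_ct_sample - d_ct_calib
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values, not taken from the study.
late = fold_change(ct_target=18.0, ct_ref_gene=20.0,
                   ct_target_calib=24.0, ct_ref_gene_calib=20.0)
print(late)
```

      With an unchanged housekeeping Ct, a 6-cycle drop in the viral Ct corresponds to a 64-fold increase in this sketch.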

      It would have been interesting to have the isolation results at different time points of treatment for patient samples (figure 3A and B) to see whether the virus is stable in samples.

      We have access to only a limited volume (several hundred µl) of residual patient sample, which would make it technically challenging to compare multiple days of storage conditions/temperatures. Unfortunately, we do not have any remaining sample volume for the specimens used in this study, and so we are unable to perform additional isolations at other times/temperatures. While we agree this would be an interesting line of future inquiry, we feel it is outside the scope of the current study.

      Are prior studies referenced appropriately? Yes

      Are the text and figures clear and accurate?

      Yes.

      Line 140: "this delay in virus and RNA production". You do not talk about RNA yet...

      We have replaced “virus and RNA production” with “infectious virus production” in this sentence.

      Lines 156 to 163: sgE RNA detected in cell-free supernatant. Can't it come from lysed cells?

      We have replaced “cell-free” with “clarified”.

      Line 167: "...virus in cell culture time course experiment in TMPRSS2 expressing cells (fig.2)"

      We have modified this text to read according to the Reviewer’s suggestion.

      Line 258: Fig 6A and B

      We have added the missing reference to Fig 6B as requested.

      -Do you have suggestions that would help the authors improve the presentation of their data and conclusions? No

      Reviewer #1 (Significance (Required)):

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This new primer/probe system will contribute to the accurate diagnosis of SARS-CoV-2. The comparison with existing methods is relevant to highlight the strengths and weaknesses of each system. The comparison of SARS-CoV-2 isolation on commonly used Vero E6 cells with isolation on Vero E6-TMPRSS2 cells will lead to a great improvement in the isolation method for SARS-CoV-2.

      We appreciate the Reviewer’s assessment of the significance of our study and the improvement in our isolation method compared to the existing standard of using Vero E6 cells.

      Place the work in the context of the existing literature (provide references, where appropriate).

      Properly done in the introduction of the paper.

      State what audience might be interested in and influenced by the reported findings.

      Diagnostic laboratories

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Virology, Molecular Biology, cell biology

      Not enough expertise to evaluate ROC data/analysis

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Bruce et al. present a new RT-PCR assay with primer sets that specifically detect sgE RNA from SARS-CoV-2 samples. The authors compare this assay to other diagnostic assays in an effort to identify assays capable of correlating RNA detection with culturable virus (i.e. infectious virus). While this new assay identified 100% of culturable isolates, only 56% of isolates testing positive actually had culturable virus. Compared with other assays, the WHO total E RNA assay had better parameters when used at a cutoff Ct value of 31 (PPV of 61%). Overall, this manuscript provides a novel primer/probe set for an RT-PCR diagnostic assay and compares it with other assays on the same clinical samples. There are some areas that the authors should address prior to publication.
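
      The percentages quoted in this summary are standard diagnostic-test metrics. A minimal Python sketch of how sensitivity, PPV and NPV follow from paired assay and culture results; the function name and the counts are hypothetical, chosen only so that the output reproduces the 100% and 56% figures mentioned above, and are not the study's data:

```python
def diagnostic_metrics(assay_positive, culturable):
    """Sensitivity, PPV and NPV from paired boolean results, one pair per sample."""
    pairs = list(zip(assay_positive, culturable))
    tp = sum(a and c for a, c in pairs)              # assay+, culture+
    fp = sum(a and not c for a, c in pairs)          # assay+, culture-
    fn = sum((not a) and c for a, c in pairs)        # assay-, culture+
    tn = sum((not a) and (not c) for a, c in pairs)  # assay-, culture-
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "ppv": tp / (tp + fp) if tp + fp else nan,
        "npv": tn / (tn + fn) if tn + fn else nan,
    }


# Hypothetical cohort: 14 culturable samples (all assay-positive),
# 11 assay-positive but non-culturable, 15 negative by both.
assay = [True] * 14 + [True] * 11 + [False] * 15
culture = [True] * 14 + [False] * 11 + [False] * 15
metrics = diagnostic_metrics(assay, culture)
print(metrics)
```

      A sensitivity and NPV of 1.0 combined with a modest PPV is the pattern that makes an assay more useful for ruling out culturable virus than for confirming it.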

      **Major comments:**

      The authors repeatedly tout VeroE6 TMPRSS2 cells as supporting higher viral infection. Therefore, the authors should address why one clinical isolate (E16) was culturable in VeroE6 but not VeroE6 TMPRSS2. Was this experiment repeated multiple times? What are the reasons for this discrepancy?

      We did not have sufficient residual sample volume to repeat isolation attempts of any clinical specimen, so we are limited to a single data point for each cell line. It is possible that this sample had levels of infectious virus at the limit of detection, and stochastic probability meant infectious virus was only present in the aliquot used to infect the Vero E6 (rather than Vero E6-TMPRSS2) cells. It is also possible that viral adaptation/evolution occurred in the VeroE6 well that allowed this virus to successfully grow, but we do not have sequencing data or remaining nucleic acids to test this theory.
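
      The sampling argument above can be made concrete with a Poisson model, a common assumption for counts of infectious units in small aliquots. A minimal Python sketch; the mean of 0.5 infectious units per aliquot is a hypothetical value, not an estimate from the study, and the function name is illustrative:

```python
import math


def p_discordant(mean_units_per_aliquot):
    """Probability that exactly one of two equal aliquots holds >= 1 infectious unit.

    Assumes infectious units are Poisson-distributed and that a single unit
    is enough for successful isolation in either cell line (i.e. it ignores
    any difference in susceptibility between the two lines).
    """
    p_empty = math.exp(-mean_units_per_aliquot)  # Poisson P(X = 0)
    return 2.0 * p_empty * (1.0 - p_empty)


print(round(p_discordant(0.5), 3))
```

      Under these assumptions, discordant results between two aliquots of the same specimen are far from rare when the specimen is near the limit of detection.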

      The authors' argument at lines 166-169 is not supported by the data in Fig. 2. The levels of viral RNA between VeroE6 and VeroE6 TMPRSS2 appear to show similar trends in the supernatant across the time course but the infectious viral levels are dramatically different. This discordance between FFU levels and RNA levels cannot be explained by instability of viral particles alone. Have the authors looked into differences in viral particles produced from these two cell lines? The authors should collect virus particles from these two cell lines and conduct the stability experiment in Fig 2D to directly test the hypothesis that indeed the drop seen in FFU in VeroE6 TMPRSS2 is due to instability.

      We apologize for the confusion. We did not intend to make claims about differences in particle stability as a result of the cell line used for viral production, but rather to highlight a general observation that RNA was more stable than infectious virus. This is more obvious in the TMPRSS2 cell line, as replication is faster and more synchronized than in Vero E6 cells (the TMPRSS2 cells are largely dead by day 4, whereas infection progresses more slowly in Vero E6 cells so that new virions continue to be produced during the measured time period). We have added clarifying text at lines 167-169, “We observed that SARS-CoV-2 RNA species persist for much longer than infectious virus in cell culture time course experiments, a feature that was most obvious in Vero E6 TMPRSS2 cells due to their viral kinetics but is likely not cell specific (Fig 2).”

      The evidence for the packaging of sgE RNA into virions is weak. GAPDH detection by PCR is not a proof that the concentration process did not pellet RNA nonspecifically. First, the authors should provide ample information about viral isolation process at line 379 including rotor, centrifuge and speed utilized. In addition, ribosomes typically stay intact following viral lysis (and can be found in supernatant after release from dead cells). Actively translating ribosomes can contain sgE RNA as well. The authors should consider detecting ribosomal RNAs in their samples to rule out the possibility of contaminating ribosomes. In addition, the authors should strongly consider repeating the experiment with high EDTA concentration to break up ribosomes and only pellet virions.

      We have added additional experimental details (rotor, centrifuge and speed) describing how the viral concentration step was performed (line 389-394), “Viral RNA (courtesy of David Bauer, The Francis Crick Institute, UK) from concentrated SARS-CoV-2 (England02 strain, B lineage ‘Wuhan-like’) was obtained by clarifying viral supernatants (2 x 4000 rpm for 30 mins at 4°C in a Beckman Allegra X-30R centrifuge with a SX4400 rotor), overlaying clarified media onto a 30% sucrose/PBS cushion (1/4th tube volume) and concentrating by ultracentrifugation in a Beckman ultra XPN-90 centrifuge with SW32TI rotor for 90 min at 25,500 rpm at 4°C. Pellets were then resuspended in buffer and extracted with TRIzol LS.” We thank the reviewer for their suggestion of including an additional control, and we have added an 18S primer-probe set (see new Figure 8). This data, while not as pronounced as the GAPDH control, suggests that the ultracentrifugation step has removed significant amounts of 18S RNA (though the clarified supernatants retain similar amounts of 18S RNA as the cells, suggesting that clarification alone is not sufficient to remove contaminating ribosomes). While we agree that repeating the ultracentrifuge concentration with high concentrations of EDTA is an interesting line of inquiry we feel it is outside the scope of this manuscript (and we face additional technical restrictions to pursue this as we currently lack access to an ultracentrifuge at BSL-3). We have updated the discussion to include the possibility of residual ribosome-protected fragments of sgE as a potential alternative interpretation (line 350-352).

      **Minor comments:**

      At line 197, the authors refer to "viruses" with lower levels of SARS-CoV-2 RNA. This is incorrect and should be changed to "isolates", as the SARS-CoV-2 virus particle does not package variable amounts of genomic RNA.

      We have changed this to “clinical specimens” for clarity.

      The authors' statement on lines 210-212 does not seem to be clearly supported by Fig. 5. The authors should consider including trendlines as well as other analyses that help show the correlation between viral RNA and FFU. In addition, the authors should label the Y-axis clearly for Fig. 5.

      We have added clarifying labels to both the X and Y axes. Due to the limited sample volume we were unable to directly measure the infectious titers from the clinical samples used in this study, and thus the FFU/mL represents the titer post-isolation while the CT represents the amount of RNA pre-isolation. Nonetheless, we do see broad trends (i.e., the colored dots are generally arranged in rainbow order from left to right, though we agree there is variation within this trend). We have also modified the text at lines 212-217 to reflect the reviewer’s concern: “Greater initial viral RNA levels were broadly associated with faster viral growth in both cell lines (seen in the progression of colors from left to right), however we saw significant variation within these trends. Our data suggests that when standard SARS-CoV-2 RNA RT-PCR values are the only available data for patient or population-level viral loads, they are useful in gauging the presence of infectious virus in patient NP samples (Fig 5).”

      The authors should expand on the methodology for creating ROC curves at line 467.

      We have included the following text in the methods section for ROC curve analysis:

      “ROC curves were generated using R and plotted with the ggplot2 package [43]. For each potential scoring marker (CT_e, CT_sge1, CT_sge2, neg_e) samples were ordered by that marker, followed by culturable status. The false-positive rate was calculated as the cumulative count of culturable samples (after ordering by marker intensity) divided by the total count of culturable samples; the true positive rate was calculated as the cumulative count of non-culturable samples (after ordering) divided by the total count of non-culturable samples. The false positive rate was plotted on the X axis of the ROC curves and the true positive rate on the Y axis.”
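
      The cumulative-count procedure described in the quoted methods can be sketched as follows. The authors used R with ggplot2; this Python version is only an illustration, and the function name, the descending sort order (highest Ct first) and the example values are assumptions rather than details from the manuscript:

```python
def cumulative_rates(marker, culturable, descending=True):
    """Cumulative fraction of each class after ordering samples by a marker.

    Returns two lists with one entry per sample: the running fraction of
    culturable samples and the running fraction of non-culturable samples.
    Ties in the marker are broken by culturable status. Assumes at least
    one sample of each class is present.
    """
    order = sorted(zip(marker, culturable),
                   key=lambda mc: (mc[0], mc[1]), reverse=descending)
    n_cult = sum(culturable)
    n_non = len(culturable) - n_cult
    cult_frac, non_frac = [], []
    seen_cult = seen_non = 0
    for _, is_cult in order:
        if is_cult:
            seen_cult += 1
        else:
            seen_non += 1
        cult_frac.append(seen_cult / n_cult)
        non_frac.append(seen_non / n_non)
    return cult_frac, non_frac


# Hypothetical Ct values and culture outcomes for six samples.
ct = [35.0, 33.0, 31.0, 28.0, 24.0, 20.0]
cult = [False, False, True, False, True, True]
x, y = cumulative_rates(ct, cult)  # x: culturable fraction, y: non-culturable fraction
print(list(zip(x, y)))
```

      Following the quoted text, the culturable fraction goes on the X axis and the non-culturable fraction on the Y axis, which amounts to treating "non-culturable" as the outcome being detected.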

      Reviewer #2 (Significance (Required)):

      This study is significant because it assesses the utility of several clinical assays for the measurement of viral RNA and correlates it with culturable virus. This is important in the field because it helps to identify methods whereby infectivity can be predicted from a simple diagnostic test. This is important to know as a virologist working in the SARS-CoV-2 field. It is also important from a public health perspective to better define quarantine requirements for persons testing positive. While the study provided a new primer/probe set, it appears that the already available WHO total E RNA assay is superior in predicting infectivity and this study provides further evidence to support this notion.

      We appreciate the Reviewer’s assessment that this study is significant and provides information of high interest to SARS-CoV-2 virologists that also has important public health implications.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). Authors developed a novel primer/probe set for detection of subgenomic (sgE) transcripts for SARS-CoV-2 with the aim to develop a system that may predict the presence of infectious virus in patient samples. After studying the specificity and sensitivity of their system, they compared it with already validated/published systems for diagnostic of SARS-CoV-2 infection. Interestingly, they also studied the effect of the conditions of isolation. They showed Vero E6 expressing TMPRSS2 (Vero E6-TMPRSS2) to be more sensitive to infection than Vero E6, allowing a higher number of isolation from patient samples. They also showed their system to be more sensitive than a previously published sgE system as well as than a negative-strand RNA assay but less sensitive than the WHO/Charité primer/probe set. Anyway, all samples containing infectious particles (successful virus isolation on Vero E6-TMPRSS2) were detected with their primer/probe system contrary to the other tested sgE assay. They showed the negative strand assay to be unlikely to detect virus genetic material in samples which nevertheless contain infectious particles.

      **Major comments:**

      Are the key conclusions convincing?

      I salute the intention of the authors to try to fix cut-off values for infectious patients but I would be more careful on the assertion of "using a total viral RNA Ct cut-off of >31 or specifically testing for sgRNA can serve as an effective rule-out test for viral infectivity". It is true that in this study, virus was not isolated from any of the samples below a Ct of 31 or negative in the developed sgE assay but all those assays are done on cell culture. We do not know how the transmission could occur for those samples from human to human. Being able to fix a cut-off in Ct value for a define PCR/RT-PCR system would be a great improvement for SARS-CoV-2 infected patient having to stay in quarantine. It is even more important for Ebola positive patients in Africa who has to stay in quarantine in precarious conditions under tents, warm temperatures and without privacy for long period because they still positive by RT-PCR. Unfortunately, fix those values would need a very high number of experiments, including animal experiment.

      We appreciate the reviewer’s acknowledgment of the significance of this issue. We agree that in vivo animal experiments to more precisely determine the lowest infectious or transmissible dose would be valuable. But such experiments are outside the scope of the current study. To acknowledge the reviewer’s important point regarding the unavoidable limitations of cell culture systems, we have modified the abstract (line 51) to say “an effective rule out test for the presence of culturable virus,” a conclusion that is fully supported by our data.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? No

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. No

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Yes.

      Are the data and the methods presented in such a way that they can be reproduced?

      Kinetic of SARS-CoV-2 (figure 2): The method is not detailed in the Methods part and is not clear in the figure legend. When supernatant are collected, is it all the supernatant that is remove? An aliquot? If aliquot, do you replace with new medium?

      We apologize for this omission and have included the requested details in the methods. We seed a separate well for each time point and collected the entire supernatant for a given time point, rather than replacing media. We added the following text to the methods section (lines 402-412): “**Viral growth kinetics were measured in Vero E6 or Vero E6 TMPRSS2 cells at an MOI of 0.001. Separate wells were seeded for each time point, and growth curves were conducted in technical duplicates for each biological experiment. Supernatants and cell lysates were collected twice daily 1 & 2 dpi, and again on 3, 4, 7 and 8 dpi (Vero E6 TMPRSS2 cells were harvested for the final time at day 7 due to faster growth kinetics in this cell type). For each time point, the supernatant was removed and clarified to remove cellular debris, before being split into separate aliquots for RNA extraction (mixed 1:1 with AVE lysis buffer) and viral titration (by focus assay). Dead cells/debris that was pelleted after clarifying supernatants was combined with cells scraped from each well into PBS and spun again to obtain a pellet of all cell material from each timepoint. This pellet was then lysed in AVE viral lysis buffer for RNA extraction.”

      Stability of infectious SARS-CoV-2: I am very surprise by your results on stability of cultured virus, knowing we observed a decreased of SARS-CoV-2 titer in our lab after freezing/thawing steps. Do you freeze cell supernatant directly or do you prepare your samples another way? Please state it in the Methods part

      We measured the stability after freeze/thaw for our normal high concentration viral stocks. Our viral stocks are grown in DMEM with 10% FBS, 1% HEPES, 1% pen/strep, and clarified before use. It is possible that lab-lab variation in the media components or HEPES concentration used to prepare viral stocks explains the differences seen in our work vs the reviewer’s lab. We have added the following additional detail to the methods section (lines 415-418) of the manuscript to clarify how these experiments were performed: “High concentration viral stocks (prepared as above in DMEM, 10% FBS, 1% HEPES, 1% pen/strep) were used to measure viral stability over time and after multiple freeze-thaw cycles. Stocks were stored at the indicated temperatures in the dark and aliquots were removed at the indicated days or after each freeze-thaw cycle for measuring infectious virus by focus assay.”

      Are the experiments adequately replicated and statistical analysis adequate? Yes

      **Minor comments:**

      Specific experimental issues that are easily addressable.

      Figure 2C and D: Instead of Ct values in cells, it would be more relevant to normalize these results with an endogenous gene and present results as fold change to mock-infected cells. Because you affirm that the level of RNA decline than stay stable over the time but you also note there is CPE. If you have less cells but same level of viral RNA, it means you have an increase in the RNA level in alive cells.

      We have measured the GAPDH level in these cells over time, and that data is included as gray lines in Fig 2 C&D (see updated figure). As we are combining the cell pellet from clarified supernatants with the cells that remain adherent to the dish for each harvested timepoint we expect to be harvesting the majority of cells/cell debris for each time point. The levels of GAPDH remain broadly similar over the viral growth curve, with no drop in RNA levels.

      It would have been interesting to have the results of isolation at different time-point of treatment for patient samples (figure 3A and B) to see if the virus is stable in samples

      We have access to only limited volume (several hundred µl) of residual patient sample which would make it technically challenging to compare multiple days of storage conditions/ temperatures. Unfortunately, we do not have any remaining sample volume for the specimens used in this study, and so we are unable to perform additional isolations at other times/temperatures. While we agree this would be an interesting line of future inquiry, we feel it is outside the scope of the current study.

      Are prior studies referenced appropriately? Yes

      Are the text and figures clear and accurate? Yes.

      Line 140: "this delay in virus and RNA production". You do not talk about RNA yet...

      We have removed “and RNA” from this sentence and replaced with “infectious virus production”.

      Line 156 to 163: sgE RNA detected in cell free supernatant. Can't it come from lysed cells?

      We have replaced “cell-free” with “clarified”.

      Line 167: "...virus in cell culture time course experiment in TMPRRS2 expressing cells (fig.2)"

      We have modified this text to read according to the Reviewer’s suggestion.

      Ligne 258: Fig 6A and B

      We have added the missing reference to Fig 6B as requested.

      -Do you have suggestions that would help the authors improve the presentation of their data and conclusions? No

      Reviewer #1 (Significance (Required)):

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This new primer/probe system will participate to the accurate diagnostic of SARS-CoV-2. The comparison with the existing methods is relevant to highlight the strengths and weaknesses of each system. Comparison of isolation of SARS-CoV-2 on commonly used Vero E6 with Vero E6-TMPRSS2 will lead to a great improvement of the isolation method for SARS-CoV-2.

      We appreciate the Reviewer’s assessment of the significance of our study and the improvement in our isolation method compared to the existing standard of using Vero E6 cells.

      Place the work in the context of the existing literature (provide references, where appropriate). Properly done in the introduction of the paper.

      State what audience might be interested in and influenced by the reported findings. Diagnostic laboratories

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Virology, Molecular Biology, cell biology Not enough expertise to evaluate ROC data/analysis

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Bruce et al present a new RT-PCR assay with primer sets that specifically detect sgE RNA from SARS-CoV2 samples. The authors compare this assay to other diagnostic assays in an effort to identify assays capable of correlating RNA detection with culturable virus (i.e. infectious virus). While this new assay identified 100% of culturable isolates, only 56% of isolates testing positive actually had culturable virus. Compared with other assays, the WHO total E RNA assay had better parameters when used at a cutoff Ct value of 31 (PPV of 61%). Overall, this manuscript provides a novel primer probe set for RT-PCR diagnostic assay and conducted comparisons with other assays on the same clinical samples. There are some areas that the authors should address prior to publication.

      **Major comments:**

      The authors repeatedly tout VeroE6 TMRSS2 cells as supporting higher viral infection. Therefore, the authors should address why one clinical isolate (E16) was culturable in VeroE6 but not VeroE6 TMRSS2. Was this experiment repeated multiple times? What are the reasons for this discrepancy?

      We did not have sufficient residual sample volume to repeat isolation attempts of any clinical specimen, so we are limited to a single data point for each cell line. It is possible that this sample had levels of infectious virus at the limit of detection, and stochastic probability meant infectious virus was only present in the aliquot used to infect the Vero E6 (rather than Vero E6-TMPRSS2) cells. It is also possible that viral adaptation/evolution occurred in the VeroE6 well that allowed this virus to successfully grow, but we do not have sequencing data or remaining nucleic acids to test this theory.

      The authors' argument at lines 166-169 is not supported by the data in Fig. 2. The levels of viral RNA between VeroE6 and VeroE6 TMRSS2 appear to show similar trends in the supernatant across the time course but the infectious viral levels are dramatically different. This discordance between FFU levels and RNA levels cannot be explained by instability of viral particles alone. Have the authors looked into differences in viral particles produced from these two cell lines? The authors should collect virus particles from these two cell lines and conduct the stability experiment in Fig 2D to directly test the hypothesis that indeed the drop seen in FFU in VeroE6 TMRSS2 is due to instability.

      We apologize for the confusion. We did not intend to make claims about differences in particle stability as a result of the cell line used for viral production, but rather to highlight a general observation that RNA was more stable than infectious virus. This is more obvious in the TMPRSS2 cell line, as replication is faster and more synchronized than in Vero E6 cells (the TMPRSS2 cells are largely dead by day 4, whereas infection progresses more slowly in Vero E6 cells so that new virions continue to be produced during the measured time period). We have added clarifying text at lines 167-169, “We observed that SARS-CoV-2 RNA species persist for much longer than infectious virus in cell culture time course experiments, a feature that was most obvious in Vero E6 TMPRSS2 cells due to their viral kinetics but is likely not cell specific (Fig 2).”

      The evidence for the packaging of sgE RNA into virions is weak. GAPDH detection by PCR is not a proof that the concentration process did not pellet RNA nonspecifically. First, the authors should provide ample information about viral isolation process at line 379 including rotor, centrifuge and speed utilized. In addition, ribosomes typically stay intact following viral lysis (and can be found in supernatant after release from dead cells). Actively translating ribosomes can contain sgE RNA as well. The authors should consider detecting ribosomal RNAs in their samples to rule out the possibility of contaminating ribosomes. In addition, the authors should strongly consider repeating the experiment with high EDTA concentration to break up ribosomes and only pellet virions.

      We have added additional experimental details (rotor, centrifuge and speed) describing how the viral concentration step was performed (line 389-394), “Viral RNA (courtesy of David Bauer, The Francis Crick Institute, UK) from concentrated SARS-CoV-2 (England02 strain, B lineage ‘Wuhan-like’) was obtained by clarifying viral supernatants (2 x 4000 rpm for 30 mins at 4°C in a Beckman Allegra X-30R centrifuge with a SX4400 rotor), overlaying clarified media onto a 30% sucrose/PBS cushion (1/4th tube volume) and concentrating by ultracentrifugation in a Beckman ultra XPN-90 centrifuge with SW32TI rotor for 90 min at 25,500 rpm at 4°C. Pellets were then resuspended in buffer and extracted with TRIzol LS.” We thank the reviewer for their suggestion of including an additional control, and we have added an 18S primer-probe set (see new Figure 8). This data, while not as pronounced as the GAPDH control, suggests that the ultracentrifugation step has removed significant amounts of 18S RNA (though the clarified supernatants retain similar amounts of 18S RNA as the cells, suggesting that clarification alone is not sufficient to remove contaminating ribosomes). While we agree that repeating the ultracentrifuge concentration with high concentrations of EDTA is an interesting line of inquiry we feel it is outside the scope of this manuscript (and we face additional technical restrictions to pursue this as we currently lack access to an ultracentrifuge at BSL-3). We have updated the discussion to include the possibility of residual ribosome-protected fragments of sgE as a potential alternative interpretation (line 350-352).

      **Minor comments:**

      At line 197, the authors refer to "viruses" with lower levels of SARS-CoV2 RNA. This is incorrect and should be changed to "isolates" as the SARS-CoV2 virus particle does not package variable amount of genomic RNA.

      We have changed this to “clinical specimens” for clarity.

      The authors statement on lines 210-212 does not seem to be supported clearly by Fig. 5. The authors should consider including trendlines as well as other analyses that help show the correlation between viral RNA vs FFU. In addition, the authors should label the Y-axis clearly for Fig. 5.

      We have added clarifying labels to both the X and Y axes. Due to the limited sample volume we were unable to directly measure the infectious titers from the clinical samples used in this study, and thus the FFU/mL represents the titer post-isolation while the CT represents the amount of RNA pre-isolation. Nonetheless, we do see broad trends (i.e., the colored dots are generally arranged in rainbow order from left to right, though we agree there is variation within this trend). We have also modified the text at lines 212-217 to reflect the reviewer’s concern: “Greater initial viral RNA levels were broadly associated with faster viral growth in both cell lines (seen in the progression of colors from left to right); however, we saw significant variation within these trends. Our data suggest that when standard SARS-CoV-2 RNA RT-PCR values are the only available data for patient or population-level viral loads, they are useful in gauging the presence of infectious virus in patient NP samples (Fig 5).”

      The authors should expand on the methodology for creating ROC curves at line 467.

      We have included the following text in the methods section for ROC curve analysis:

      “ROC curves were generated using R [43]. For each potential scoring marker (CT_e, CT_sge1, CT_sge2, neg_e) samples were ordered by that marker, followed by culturable status. The false positive rate was calculated as the cumulative count of culturable samples (after ordering by marker intensity) divided by the total count of culturable samples; the true positive rate was calculated as the cumulative count of non-culturable samples (after ordering) divided by the total count of non-culturable samples. The false positive rate was plotted on the X axis of the ROC curves and the true positive rate on the Y axis.”
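
      The procedure quoted above can be sketched in a few lines. The authors worked in R; the Python below is only an illustration of the same cumulative-count logic, not their code. It assumes samples are ordered from highest to lowest Ct, so that a "positive" rule-out call means non-culturable. The function names and the toy values in the usage note are hypothetical.

```python
def roc_points(marker_values, culturable):
    """Build ROC points by cumulative counting, as in the quoted methods text.

    Assumption: samples are ordered by descending marker value (highest Ct
    first), then by culturable status (non-culturable before culturable).
    """
    ordered = sorted(zip(marker_values, culturable),
                     key=lambda pair: (-pair[0], pair[1]))
    total_culturable = sum(1 for _, c in ordered if c)
    total_non_culturable = len(ordered) - total_culturable
    if total_culturable == 0 or total_non_culturable == 0:
        raise ValueError("need at least one sample of each class")

    fpr, tpr = [0.0], [0.0]
    seen_culturable = seen_non_culturable = 0
    for _, is_culturable in ordered:
        if is_culturable:
            seen_culturable += 1        # counts toward the false positive rate
        else:
            seen_non_culturable += 1    # counts toward the true positive rate
        fpr.append(seen_culturable / total_culturable)
        tpr.append(seen_non_culturable / total_non_culturable)
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC points."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
               for i in range(1, len(fpr)))
```

      For a toy input such as Ct values `[35, 34, 20, 18]` with culturable flags `[False, False, True, True]`, the two classes separate completely and the area under the curve is 1.0.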

      Reviewer #2 (Significance (Required)):

      This study is significant because it assesses the utility of several clinical assays for the measurement of viral RNA and correlating it with culturable virus. This is important in the field because it helps to identify methods whereby infectivity can be predicted from a simple diagnostic test. This is important to know as a virologist working in the SARS-CoV2 field. It is also important from a public health perspective to better define quarantine requirements for persons testing positive. While the study provided a new primer probe set, it appears that the already available WHO total E RNA assay is superior in predicting infectivity and this study provides further evidence to support this notion.

      We appreciate the Reviewer’s assessment that this study is significant and provides information of high interest to SARS-CoV-2 virologists that also has important public health implications.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Bruce et al present a new RT-PCR assay with primer sets that specifically detect sgE RNA from SARS-CoV2 samples. The authors compare this assay to other diagnostic assays in an effort to identify assays capable of correlating RNA detection with culturable virus (i.e. infectious virus). While this new assay identified 100% of culturable isolates, only 56% of isolates testing positive actually had culturable virus. Compared with other assays, the WHO total E RNA assay had better parameters when used at a cutoff Ct value of 31 (PPV of 61%). Overall, this manuscript provides a novel primer probe set for RT-PCR diagnostic assay and conducted comparisons with other assays on the same clinical samples. There are some areas that the authors should address prior to publication.

      Major comments:

      -The authors repeatedly tout VeroE6 TMPRSS2 cells as supporting higher viral infection. Therefore, the authors should address why one clinical isolate (E16) was culturable in VeroE6 but not VeroE6 TMPRSS2. Was this experiment repeated multiple times? What are the reasons for this discrepancy?

      -The authors' argument at lines 166-169 is not supported by the data in Fig. 2. The levels of viral RNA between VeroE6 and VeroE6 TMPRSS2 appear to show similar trends in the supernatant across the time course but the infectious viral levels are dramatically different. This discordance between FFU levels and RNA levels cannot be explained by instability of viral particles alone. Have the authors looked into differences in viral particles produced from these two cell lines? The authors should collect virus particles from these two cell lines and conduct the stability experiment in Fig 2D to directly test the hypothesis that indeed the drop seen in FFU in VeroE6 TMPRSS2 is due to instability.

      -The evidence for the packaging of sgE RNA into virions is weak. GAPDH detection by PCR is not a proof that the concentration process did not pellet RNA nonspecifically. First, the authors should provide ample information about viral isolation process at line 379 including rotor, centrifuge and speed utilized. In addition, ribosomes typically stay intact following viral lysis (and can be found in supernatant after release from dead cells). Actively translating ribosomes can contain sgE RNA as well. The authors should consider detecting ribosomal RNAs in their samples to rule out the possibility of contaminating ribosomes. In addition, the authors should strongly consider repeating the experiment with high EDTA concentration to break up ribosomes and only pellet virions.

      Minor comments:

      -At line 197, the authors refer to "viruses" with lower levels of SARS-CoV2 RNA. This is incorrect and should be changed to "isolates" as the SARS-CoV2 virus particle does not package variable amount of genomic RNA.

      -The authors statement on lines 210-212 does not seem to be supported clearly by Fig. 5. The authors should consider including trendlines as well as other analyses that help show the correlation between viral RNA vs FFU. In addition, the authors should label the Y-axis clearly for Fig. 5.

      -The authors should expand on the methodology for creating ROC curves at line 467.

      Significance

      This study is significant because it assesses the utility of several clinical assays for the measurement of viral RNA and correlating it with culturable virus. This is important in the field because it helps to identify methods whereby infectivity can be predicted from a simple diagnostic test. This is important to know as a virologist working in the SARS-CoV2 field. It is also important from a public health perspective to better define quarantine requirements for persons testing positive. While the study provided a new primer probe set, it appears that the already available WHO total E RNA assay is superior in predicting infectivity and this study provides further evidence to support this notion.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). Authors developed a novel primer/probe set for detection of subgenomic (sgE) transcripts for SARS-CoV-2 with the aim to develop a system that may predict the presence of infectious virus in patient samples. After studying the specificity and sensitivity of their system, they compared it with already validated/published systems for diagnostic of SARS-CoV-2 infection. Interestingly, they also studied the effect of the conditions of isolation. They showed Vero E6 expressing TMPRSS2 (Vero E6-TMPRSS2) to be more sensitive to infection than Vero E6, allowing a higher number of isolation from patient samples. They also showed their system to be more sensitive than a previously published sgE system as well as than a negative-strand RNA assay but less sensitive than the WHO/Charité primer/probe set. Anyway, all samples containing infectious particles (successful virus isolation on Vero E6-TMPRSS2) were detected with their primer/probe system contrary to the other tested sgE assay. They showed the negative strand assay to be unlikely to detect virus genetic material in samples which nevertheless contain infectious particles.

      Major comments:

      -Are the key conclusions convincing?

      I salute the intention of the authors to try to fix cut-off values for infectious patients but I would be more careful on the assertion of "using a total viral RNA Ct cut-off of >31 or specifically testing for sgRNA can serve as an effective rule-out test for viral infectivity". It is true that in this study, virus was not isolated from any of the samples below a Ct of 31 or negative in the developed sgE assay but all those assays are done on cell culture. We do not know how the transmission could occur for those samples from human to human. Being able to fix a cut-off in Ct value for a defined PCR/RT-PCR system would be a great improvement for SARS-CoV-2 infected patients having to stay in quarantine. It is even more important for Ebola-positive patients in Africa who have to stay in quarantine in precarious conditions under tents, in warm temperatures and without privacy for long periods because they are still positive by RT-PCR. Unfortunately, fixing those values would need a very high number of experiments, including animal experiments.

      -Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? No

      -Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. No

      -Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Yes.

      -Are the data and the methods presented in such a way that they can be reproduced?

      -Kinetics of SARS-CoV-2 (figure 2): The method is not detailed in the Methods part and is not clear in the figure legend. When supernatants are collected, is all the supernatant removed? An aliquot? If an aliquot, do you replace it with new medium?

      -Stability of infectious SARS-CoV-2: I am very surprised by your results on stability of cultured virus, knowing we observed a decrease in SARS-CoV-2 titer in our lab after freezing/thawing steps. Do you freeze cell supernatant directly or do you prepare your samples another way? Please state it in the Methods part.

      -Are the experiments adequately replicated and statistical analysis adequate? Yes

      Minor comments:

      • Specific experimental issues that are easily addressable.

      Figure 2C and D: Instead of Ct values in cells, it would be more relevant to normalize these results with an endogenous gene and present results as fold change to mock-infected cells. You affirm that the level of RNA declines and then stays stable over time, but you also note there is CPE. If you have fewer cells but the same level of viral RNA, it means you have an increase in the RNA level in living cells. It would have been interesting to have the results of isolation at different time-points of treatment for patient samples (figure 3A and B) to see if the virus is stable in samples.

      -Are prior studies referenced appropriately? Yes

      -Are the text and figures clear and accurate? Yes.

      Line 140: "this delay in virus and RNA production". You do not talk about RNA yet...

      Line 156 to 163: sgE RNA detected in cell free supernatant. Can't it come from lysed cells?

      Line 167: "...virus in cell culture time course experiment in TMPRSS2 expressing cells (fig.2)"

      Line 258: Fig 6A and B

      -Do you have suggestions that would help the authors improve the presentation of their data and conclusions? No

      Significance

      -Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This new primer/probe system will contribute to the accurate diagnosis of SARS-CoV-2. The comparison with the existing methods is relevant to highlight the strengths and weaknesses of each system. Comparison of isolation of SARS-CoV-2 on commonly used Vero E6 with Vero E6-TMPRSS2 will lead to a great improvement of the isolation method for SARS-CoV-2.

      -Place the work in the context of the existing literature (provide references, where appropriate). Properly done in the introduction of the paper.

      -State what audience might be interested in and influenced by the reported findings. Diagnostic laboratories

      -Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Virology, Molecular Biology, cell biology Not enough expertise to evaluate ROC data/analysis

  2. Sep 2021
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewer for their input. Our response to their comments is in the attached preliminary revision plan.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      • Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). Please place your comments about significance in section 2.

      This manuscript provides a detailed and very clear description of multiSero, which is an open source multiplex-ELISA platform for analyzing antibody responses to SARS-CoV-2 infection. This tool is a very promising step towards fully open-source multiplex testing. Using terrific visualizations the different steps involved in measuring the antibody levels is carefully explained. It starts with a clear explanation of the principle of printed antigen arrays, the usage of developed and opensource software Pysero to analyse the colorimetric signal of each spot associated with a different antigen. The colorimetric signal was read using both a commercial reader and an inexpensive, open plate reader. The comparison between the two proved that the open plate reader is as good as the commercial reader is.

      Major comments:

      • Are the key conclusions convincing? • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. • Are the data and the methods presented in such a way that they can be reproduced? • Are the experiments adequately replicated and statistical analysis adequate?

      The authors provide a new method to measure antibody levels. A comparison with an existing ELISA for anti-spike IgG would be worthwhile.

      A gradient boosting tree was used to combine the signal from multiple antigens. However, this did not lead to any notable improvements in classification performance. This result could be due to a property of the data or the algorithm. Two things that would be very useful here would be to plot the data (e.g. anti-Spike vs anti-N) and use a much simpler algorithm such as a logistic regression.

      The performance of the tool is based on one positive and one negative pool. As the authors mention, antibody levels are highly dependent on severity and time since infection. The performance of the classifier therefore strongly depends on the characteristics of the positive pool. Providing additional information on this pool, if possible, would improve the manuscript. If not, I think this should be mentioned as a shortcoming in the discussion. Possibly, having a serum panel with more asymptomatic infections or a longer time since infection would result in poorer performance from the classifier.

      Related to the point above is what is written in line 248-249. The direction of the performance of the tool with additional samples depends on the characteristics (time since infection, age, severity) of the currently used samples and the samples to be added. The assumption that the performance can only increase is in my opinion not correct.

      The authors compared three normalization methods to circumvent using a standard curve. The normalization of ODs by the mean of anti-IgG Fc ODs is most promising as shown in Fig. S5. A comparison between this normalization method and using a standard curve is not given. It would be worthwhile to look at the distribution of a serum panel from different plates, in relative antibody units as well as normalized ODs. Is the antibody distribution captured by normalized ODs as good as relative antibody concentrations derived from the standard dilution?
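
      To make the normalization under discussion concrete, here is a minimal sketch of dividing each antigen spot's OD by the mean OD of the anti-IgG Fc control spots. It assumes the normalization is applied per well, which the review text does not specify. The function name, the dictionary layout, and the example values are hypothetical and do not reflect the pysero API.

```python
from statistics import mean


def normalize_by_igg_fc(antigen_ods, igg_fc_ods):
    """Divide each antigen OD by the mean OD of the anti-IgG Fc spots.

    antigen_ods: dict mapping antigen name -> OD for one well (assumed layout)
    igg_fc_ods:  list of ODs from the anti-IgG Fc control spots in that well
    """
    reference = mean(igg_fc_ods)
    if reference <= 0:
        raise ValueError("anti-IgG Fc reference OD must be positive")
    return {antigen: od / reference for antigen, od in antigen_ods.items()}
```

      For example, `normalize_by_igg_fc({"spike": 0.8, "N": 0.4}, [1.5, 2.5])` rescales both antigens by a reference OD of 2.0. Comparing such normalized values against standard-curve-derived units across plates is the check the reviewer asks for.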

      In the abstract, the reader is told that the multiSero tool could be used with up to 48 antigens. I assume that at this number of antigens, the use of duplicate/triplicate antigens is not possible anymore? Also, would the layout and spacing of the antigen array with more antigens introduce more experimental artifacts like comets and debris?

      In Fig. S3 and lines 146/147, the authors state that they find that the presence of comets does not cause observable bias or variance. This strikes me as rather subjective, and my impression of Fig. S3 B3 is that there is some bias due to comets?

      Minor comments:

      • Specific experimental issues that are easily addressable. • Are prior studies referenced appropriately? • Are the text and figures clear and accurate? • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      According to the authors, an important reason for developing the multiSero tool is the deployment of high-content, multiplex serology platforms across the world, and this paper makes a huge step towards this goal. The main hurdle to implementing the multiSero tool in low-resource settings is its dependency on printed antigen arrays, which are produced by machines costing around $100,000-300,000 as mentioned by the authors. The authors also realize this and acknowledge this bottleneck in the discussion. I think it would possibly be good to elaborate a little further on why this is a limitation. Does the user have less freedom in what they want to test because they depend on the producer of printed 96-well plates?

      In line 47, I suppose the word "are" is missing.

      The overall language use is very clear. An improvement in my opinion would be to replace words such as "cognate" (line 46) and "in lieu of" (line 227) by easier alternatives, such as "associated" and "instead of".

      Comets and debris are first mentioned in line 129/130 but require more explanation. An explanation of what is meant with comets only became clear to me after reading the discussion. I would use the explanation mentioned in line 250/250 right after the first time mentioning comets. What debris means, remains unclear to me.

      In line 174, I suppose that the word "points" should be "line".

      Pysero sometimes starts with a capital P, sometimes with a lower case p, see for example line 108 and 109.

      The authors describe using a standard curve as labor-intensive (line 190); I find this too strongly put.

      In lines 281-283 the authors mention they are unaware of examples of classifiers distinguishing positive from negative samples based on more than one antigen. An example could be the classification of cholera using 2-6 antigens by Azman et al, 2019 in Sci. Transl. Med.

      In Fig S1 the Nautilus plate reader is shown. The costs of this reader are estimated to be less than $1500. These are the costs without the motorized

      Significance

      With this manuscript, the authors show that multiplex serology platforms can become more accessible to low- and medium-income countries due to their development of a new open source tool. This means that multiplex serology seems to be becoming more accessible in low-resource settings. The next step is to use this multiSero tool in a low-resource setting.

      The specific audience potentially interested consists of computational biologists involved in the analysis, visualization, and interpretation of the results of such techniques, and microbiologists quantitating and measuring antibodies. A broader audience that could be interested is infectious disease epidemiologists, especially those who are involved in serosurveillance and are keen to pick up new methods to potentially improve epidemiological descriptions of immunity to several infections in low- and medium-resource settings.

      My field of expertise is limited to field-epidemiology and sero-epidemiology. Techniques such as the detection of spots and registering grids with multiSero are outside of my expertise. The construction of the Nautilus reader is new to me, and it is therefore hard to assess how easy it would be to set up such a system in low-resource settings. I also feel my expertise regarding the choice of classifiers is limited, as I have not used gradient boosting before. Further, I am not an expert in the field of new developments in multiplex assays and therefore not up-to-date with the latest literature in this field.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      There is a need for multiplex serological tests, and ELISA is the most applicable platform. However, the current ELISA-based multiplex serological tests are heavily dependent on expensive and sophisticated instruments and software, and this hinders their wide application. To address this challenge, by incorporating open-source instruments and developing new analysis software, the authors proposed an integrated platform for multiplex serological testing. To test the platform, SARS-CoV-2 was included as the example. Overall, this study is more technically oriented. The major contents are the establishment and optimization of the platform. The aim is focused and clear, and the design of the experiments is comprehensive. The conclusions can be supported by the data.

      Major comments:

      • Are the key conclusions convincing? Yes
      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? No
      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. No
      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. N/A
      • Are the data and the methods presented in such a way that they can be reproduced? Yes
      • Are the experiments adequately replicated and statistical analysis adequate? Yes

      Minor comments:

      • Specific experimental issues that are easily addressable.
      • Are prior studies referenced appropriately? Yes
      • Are the text and figures clear and accurate? Yes
      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? No

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This study is more technically centered. The major contribution is the development of an ELISA-based platform for multiplex serological testing. The authors intended to make their platform applicable in resource-limited regions. However, the problem here is that the current platform is still too complicated for wide application in the real world. For a platform to be widely applied, especially in poor regions, it needs to meet several key features: 1. Low cost; 2. Standardized; 3. Simple (reduce operations to as few as possible). The major focus of this study is the first feature, and the other two features were barely touched. But even "low cost" is still valuable and worth publication. The reviewer suggests that the authors modify the manuscript to better reflect this fact.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      The existing literatures were well referenced.

      • State what audience might be interested in and influenced by the reported findings.

      Researchers who are interested in assay development.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Protein microarray technology. Assay development. SARS-CoV-2 antibody response analysis.

      The reviewer is not familiar with the software part.

      Other specific points:

      1. The authors mentioned that the multiplex serological test could be applied to differentiate infection and vaccination, in the case of SARS-CoV-2, how could this be possible if there is no specific biomarker?
      2. Have the authors also tested IgM?
      3. To simplify the normalization, the authors have tried several strategies; however, none works well. The results need to be further explained. Is there any other strategy that could be attempted?
      4. What's the definition of the "background"?
      5. What's the rationale to select the two concentrations? Will more concentrations be better?
      6. The authors stated "open source analysis tools can be adapted for multiplexed detection of pathogens by printing pathogen-specific antibodies, instead of antigens". This is true, however, highly specific antibodies are required.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

      We thank the reviewers for their constructive and helpful comments on our manuscript and are delighted to find their consensus that the manuscript represents an important contribution to the field. We provide a detailed response to specific points below. In addition, we propose to include new data showing that our method can be applied to experimentally infected lung tissue. Namely, we show highly sensitive detection of SARS-CoV-2 RNA in infected hamster lung section.

      2. Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.

      Reviewer #1 **Major comments:**

      The authors used approaches provided in FISH-quant (Mueller et al, Nat Methods 2013) and big-fish. However, these tools to analyze RNA aggregates were not designed and validated for such massive aggregations as observed with SARS-CoV-2. They were developed for cases such as transcription sites with much smaller aggregations, with a few tens to a hundred molecules. With a regular spot detection approach, usually a few thousand spots can be detected in a cell (e.g. King et al, J Virol 2018), but this also depends on the microscope used and the available cellular volume. Higher RNA concentrations cannot be resolved with a standard approach, because RNA spots start to overlap. Decomposing RNA aggregations can help but will not work reliably for the high RNA densities observed for SARS-CoV-2, especially at later infection time-points. The tools will then no longer provide accurate estimates. To my knowledge, there is currently no accurate quantification method for such massive RNA levels in smFISH. What has been done in the past is using cellular intensity as an approximation and performing calibrations with cells having lower and thus still resolvable RNA counts (Raj et al., PLoS Biology; https://doi.org/10.1371/journal.pbio.0040309.sg003). The authors proposed three expression regimes (partially resistant, permissive, and super permissive). My concerns here apply mainly to the category super-permissive, where an accurate estimation can't be performed. Here a more cautious quantification should be applied. To a lesser extent, this will also apply to some of the quantifications of gRNAs per factory, with counts exceeding 100s of molecules. As mentioned above, this does not affect any of the conclusions, but would reflect more accurately what kind of reliable information can be drawn from such experiments.

      We agree with the reviewer that approaches like FISH-quant and Big-FISH cannot reliably quantify RNA spots with high spatial density such as our examples of “super-permissive” cells. Single molecule quantitation of such cases is likely to underestimate RNA expression as noted by us and King et al 2018 (doi: 10.1128/JVI.02241-17). Therefore, we integrated the combined smFISH signal intensity within entire cellular volumes and compared it to the median intensity of single molecules in cells with lower infection density. We will (i) revise the methods and results sections to explain more carefully and explicitly the quantification of RNA in super-permissive cells, and (ii) provide a calibration plot for the quantitation as previously reported (Raj et al 2006, doi: 10.1371/journal.pbio.0040309).
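      The intensity-based estimate described above can be sketched in a few lines. This is only an illustration of the general idea (integrated cell intensity divided by a reference single-molecule intensity), not the authors' pipeline: the function name, the background term and all numbers are hypothetical, and the reference intensities are assumed to be integrated spot intensities measured in the same units as the cell signal.

```python
from statistics import median


def estimate_rna_count(cell_integrated_intensity, single_molecule_intensities,
                       background=0.0):
    """Approximate the RNA count of a dense cell from its integrated intensity.

    Hypothetical helper: divides the background-corrected cell signal by the
    median integrated intensity of resolvable single-molecule spots.
    """
    reference = median(single_molecule_intensities)
    if reference <= 0:
        raise ValueError("reference single-molecule intensity must be positive")
    # Clamp at zero so that a cell dimmer than background yields zero molecules.
    signal = max(cell_integrated_intensity - background, 0.0)
    return signal / reference


# Made-up values for illustration only.
reference_spots = [95.0, 100.0, 105.0, 110.0, 90.0]
estimate = estimate_rna_count(2_500_000.0, reference_spots, background=500_000.0)
```

      In practice such an estimate would be checked against cells whose spots can still be counted individually, which is the purpose of the calibration plot mentioned in the reply.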

      We agree that high local RNA density has the potential to interfere with quantification of gRNAs within viral factories. We have used the “cluster.decomposition()” function of Big-FISH to quantify viral factories, which is conceptually similar to the “Integrated intensity” mode of FISH-quant. Applying this algorithm to non-super permissive cells allows us to use the mean intensity of a reference single-molecule spot to estimate the number of molecules in a cluster. We are confident such estimates are reliable in the majority of viral factories, which contain less than or equal to 200 single gRNA molecules. We will revise the methods section to clarify this method of analysis.

      Reviewer #1 __**Minor comments:**__

      1.Page 6; the authors state that "smFISH identifies ... cellular distribution .... within ER-like membranous structures". However, the authors didn't directly show such a localization, could they provide an experiment with an ER stain?

      This text was based on previous light microscopy and EM studies that reported SARS-CoV-2 RNA in ER-derived membranes (termed Double Membrane Vesicles - DMVs) or co-localisation of anti-dsRNA (J2) with ER-markers (Cortese et al 2020; Hackstadt et al 2021; Mendonca et al 2021)*. We propose to clarify the text on page 6, including the citation of these publications, and to tone down our claim that the virus is located in ER-like membranous structures.

      *Cortese et al 2020, doi: 10.1016/j.chom.2020.11.003

      Hackstadt et al 2021, doi: 10.3390/v13091798

      Mendonca et al 2021, doi: 10.1038/s41467-021-24887-y

      2.It might be worthwhile pointing out that the probe-sets can be used in different host organisms (Vero - African green monkey; human cell lines).

      We propose to revise the text to emphasise more clearly the applicability of SARS-CoV-2 probes for the study of many different host organisms.

      3.I really liked the experiment, where the authors showed absence of signal when infecting with another virus & elegant control with the J2 AB. Maybe the authors could explain more clearly that they used a different coronavirus & that based on their sequence alignment no/little signal would be expected.

      Thank you for this supportive comment. We plan to follow the reviewer’s suggestion and expand our explanation of the rationale of this experiment in the text.

      7.The experiment with the isolated virions shows nicely that the smFISH approach has single-virus sensitivity. Did the authors compare the intensity of these isolated virions with the signal in Fig 1B? This might be a question of personal taste, but to me, this section might actually fit better in the first paragraph of page 4/5, where the authors describe single virions in cells.

      Thank you for the interesting question. We have not performed a direct comparison of the spot intensities of intracellular genomic RNA molecules and those from the isolated virions, because isolated SARS-CoV-2 requires poly-L-lysine coating for the coverslip attachment while our infection strategy utilises cells growing on uncoated glass. Nonetheless, the isolated virion spot intensities follow a unimodal distribution, and their shape approximates to the point-spread function of the microscope. Since spots at 2 hpi are largely derived from non-replicative viral genomes and they are measured in the intracellular environment with the same background (autofluorescence), they are a better ‘single RNA molecule’ reference.

      We also thank the reviewer for suggesting rearranging the text section. To address this point we plan to move the relevant text to the second paragraph of the Results section.

      8.Page 6. The authors state "+ORF-N and +ORF-S single labelled spots, corresponding to sgRNAs, were more uniformly distributed throughout the cytoplasm than dual labelled gRNA". This is difficult to appreciate from the image. Is this something the authors could quantify, e.g. with the metrics proposed by Stueland et al, Scientific Reports 2019?

      To address this point, we plan to: (i) present an alternative image illustrating a clearer example of differential spatial localisation of gRNA and sgRNA, and (ii) perform quantification of spatial dispersion indices for gRNA and sgRNA using the suggested method for our revision.

      9.Page 6. The authors perform a FISH/IF experiment including a co-localization analysis, where a "limited overlap" with sgRNAs was observed. I was wondering if this overlap could actually be simply due to rather high density of the sgRNAs. Maybe a control analysis by slightly changing the RNA positions could provide insight here, and give a threshold for what's to be expected randomly at a given RNA density.

      The reviewer’s comment is correct, in that a high density of sgRNAs and nucleocapsid protein could lead to signal overlap due to chance. This is why we excluded “super-permissive” cells from this analysis. Our co-localisation data showed that gRNA spots had a bimodal nucleocapsid immunofluorescence intensity distribution (data not shown), suggesting nucleocapsid-associated and “free” gRNAs, providing a threshold for this analysis. Nevertheless, we agree with the reviewer that the analysis of randomly positioned transcripts of the same density would provide a valuable control. In our revised MS we will include: (i) a random distribution analysis comparing the overlap between sgRNA and nucleocapsid in the “Observed” and a “Randomised” simulation, and (ii) a plot showing a full distribution of co-localised nucleocapsid immunofluorescence intensity for both genomic and sub-genomic viral RNAs.
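      A randomised control of the kind proposed here could look like the following sketch. It is an assumption-laden illustration rather than the planned analysis: spots are treated as 2D points, co-localisation is defined by a hypothetical distance cut-off, and random positions are drawn uniformly from a rectangular bounding box, whereas a real analysis would sample within the segmented cell mask.

```python
import math
import random


def colocalised_fraction(rna_spots, protein_foci, max_dist):
    """Fraction of RNA spots lying within max_dist of any protein focus."""
    if not rna_spots:
        return 0.0
    hits = sum(
        1 for (x, y) in rna_spots
        if any(math.hypot(x - px, y - py) <= max_dist for (px, py) in protein_foci)
    )
    return hits / len(rna_spots)


def randomised_fractions(n_spots, protein_foci, max_dist, width, height,
                         n_iter=200, seed=0):
    """Null distribution: same spot count, positions drawn uniformly at random."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_iter):
        spots = [(rng.uniform(0, width), rng.uniform(0, height))
                 for _ in range(n_spots)]
        fractions.append(colocalised_fraction(spots, protein_foci, max_dist))
    return fractions


# Made-up coordinates (arbitrary units) for illustration only.
protein_foci = [(10.0, 10.0), (50.0, 50.0)]
rna_spots = [(10.5, 10.0), (50.0, 50.2), (80.0, 80.0), (20.0, 70.0)]
observed = colocalised_fraction(rna_spots, protein_foci, max_dist=1.0)
null = randomised_fractions(len(rna_spots), protein_foci, 1.0, 100.0, 100.0)
# Empirical one-sided p-value with the usual +1 correction.
p_value = (1 + sum(f >= observed for f in null)) / (1 + len(null))
```

      Comparing `observed` with the spread of `null` gives the density-dependent baseline the reviewer asked for: overlap that falls inside the randomised distribution cannot be distinguished from chance.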

      10.I don't fully follow the argument about stability on page 8. The authors also see an increase in the RNA levels. Couldn't this increase compensate for loss of RNA due to degradation? Would it be possible to perform an experiment at very high Remdesivir concentrations which would block transcription?

      Remdesivir is a nucleoside analogue that inhibits viral RNA polymerase activity. While this drug inhibits viral replication, the inhibition is incomplete and using higher concentrations results in cellular toxicity. At the present time there are no stronger polymerase inhibitors available, so these experiments are the best approximation possible to assess viral RNA stability. We propose to revise the text to discuss the limitations of Remdesivir for modelling RNA stability.

      12.How did the authors define/detect replication factories? I couldn't find information about this in the methods.

      This is a good point raised by both the reviewers. Please see [Reviewer 2 General comment #1] for our response.

      Reviewer #2 **General comments:**

      1.The authors' definition of viral factories, in part as foci with at least 4 gRNA molecules, comes across as arbitrary. Perhaps a clearer explanation of this cutoff would be helpful to the readers' understanding of this definition. Additionally, confirmation of the functionality of such factories by immunofluorescence with anti-RdRp, for example, in addition to identifying staining of gRNAs and (-) sense viral RNAs at each focus could provide valuable support to the authors' conclusions.

      We thank both reviewers for requesting further information on our explanation of viral factories. We defined viral factories as smFISH signals with spatially extended foci that exceed the size of the point spread function of the microscope and the intensity of a reference single molecule. We then filtered these candidate factories by comparing the radius of the signal foci with EM-measured radii of double-membrane vesicles and single-membrane vesicles formed by SARS-CoV-2 (150 nm pre-8hpi and 200 nm post-8hpi) (Cortese et al 2020; Mendonca et al 2021). Our terminology encompasses both replication and viral assembly sites. The threshold of 4 genomic RNA molecules was selected as a technical threshold to limit an over-estimation of viral factories at later timepoints. For our spinning-disk confocal imaging system, we found that thresholds of 3-7 RNA molecules provided satisfactory results. We propose to revise both the Results and Methods sections to clarify our rationale for defining and quantifying viral factories.
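      One plausible reading of this definition can be written down as a short sketch. It is not the authors' implementation: it assumes that dense foci have already been decomposed into individual molecule coordinates (in nm), that the EM-derived radius is used as a linking distance between molecules, and that 8 hpi itself belongs to the later group. All function names and coordinates are hypothetical.

```python
import math


def linking_radius_nm(hpi):
    # 150 nm before 8 hpi and 200 nm afterwards, as quoted in the reply.
    # Assigning exactly 8 hpi to the later group is an assumption.
    return 150.0 if hpi < 8 else 200.0


def find_factories(spots_nm, hpi, min_molecules=4):
    """Group spots closer than the linking radius; keep groups >= min_molecules."""
    radius = linking_radius_nm(hpi)
    parent = list(range(len(spots_nm)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(spots_nm)):
        for j in range(i + 1, len(spots_nm)):
            if math.dist(spots_nm[i], spots_nm[j]) <= radius:
                parent[find(i)] = find(j)

    groups = {}
    for i, spot in enumerate(spots_nm):
        groups.setdefault(find(i), []).append(spot)
    return [g for g in groups.values() if len(g) >= min_molecules]


# Made-up 3D coordinates in nm for illustration only.
spots = [(0, 0, 0), (50, 0, 0), (0, 60, 0), (40, 40, 0),
         (1000, 1000, 0), (1100, 1000, 0), (5000, 5000, 0)]
factories = find_factories(spots, hpi=6)
```

      With these toy coordinates only the first four spots form a factory; the pair and the isolated spot fall below the four-molecule threshold, which illustrates how the cut-off suppresses chance groupings at high RNA density.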

      As the reviewer mentioned, we have shown a partial overlap of positive sense genomic RNAs with negative sense genomic RNAs (Figure 2D, S2C), suggesting these viral factories represent double membrane vesicles. The use of antibodies against the viral polymerase (nsp12) is also a possibility to detect replication centres. However, replication centres are not the only ‘viral factories’ as there are also double-membrane structures where viral particles assemble (Mendonca et al 2021) and they, in principle, lack negative sense RNA and replication machinery, so neither smFISH probes against the negative strand nor an nsp12 antibody will comprehensively detect viral factories. We appreciate the valuable suggestion, but the classification of viral factories into replication and assembly sites would be challenging due to reagent availability and is beyond the scope of this manuscript.

      2.The random distribution of super-permissive cells in each cell line was demonstrated early in the infection, primarily at 8 hpi. The authors do not show how this pattern changes over time (8, 10, 12, 16, 24 hpi, for example). Do clusters of super-permissive cells appear at later time points, or does the pattern of 'highly' infected cells remain random for each virus? Any strain-specific differences identified from such patterns may be important for understanding infection progression. Finally, the authors do acknowledge this point, but it cannot be overstated that these data were taken from cell culture systems that have limited similarities to the human respiratory epithelium. A better model for such studies might be primary cultured human bronchial epithelial cells, but of course, these cells are not as readily accessible as the cell lines used in this manuscript.

      We share the same view that the presence and the spatial distribution of “super-permissive” cells can provide unique insights into SARS-CoV-2 infection dynamics. Our findings suggest that even at 24 hours post infection (hpi), not all cells become “super-permissive” and the culture maintains a heterogenous population of “partially resistant”, “permissive” and “super-permissive” cells (Figure 3C, S3C-D). We agree with the reviewer that the spatial distribution of “super-permissive” cells at later timepoints is of interest. To address this point, we plan to: (i) analyse the spatial distribution of “super-permissive” cells at 24 hpi, and (ii) compare the distribution of “super-permissive” cells at 24 hpi between VIC and B.1.1.7 strains.
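      Whether "super-permissive" cells cluster or remain randomly scattered can be tested with a simple label-permutation approach, sketched below. This is a generic illustration under stated assumptions, not the analysis the authors plan: cells are reduced to 2D centroid coordinates, clustering is summarised by the mean nearest-neighbour distance among super-permissive cells, and the null model redraws the same number of cells at random from all imaged cells.

```python
import math
import random


def mean_nn_distance(points):
    """Mean distance from each point to its nearest neighbour in the set."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


def clustering_test(cell_centres, is_super, n_iter=500, seed=0):
    """Return (observed mean NN distance, one-sided p-value for clustering)."""
    rng = random.Random(seed)
    selected = [c for c, s in zip(cell_centres, is_super) if s]
    if len(selected) < 2:
        raise ValueError("need at least two super-permissive cells")
    observed = mean_nn_distance(selected)
    as_tight = 0
    for _ in range(n_iter):
        # Null model: the same number of cells chosen at random from all cells.
        sample = rng.sample(cell_centres, len(selected))
        if mean_nn_distance(sample) <= observed:
            as_tight += 1
    return observed, (1 + as_tight) / (1 + n_iter)


# Toy field of view: a 10 x 10 grid of cells with four adjacent
# "super-permissive" cells in one corner (made-up data).
centres = [(10.0 * ix, 10.0 * iy) for ix in range(10) for iy in range(10)]
corner = {(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)}
labels = [c in corner for c in centres]
observed_nn, p_clustered = clustering_test(centres, labels)
```

      A small p-value indicates that super-permissive cells sit closer together than expected by chance; repeating the test per time point and per strain would address the reviewer's question about how the pattern changes over time.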

      We appreciate the comment on the limitations of the cell culture systems to the human respiratory tract. However, Calu-3 and A549-ACE2 lung epithelial cells have been used in many studies over the last year and we feel it is important to publish single cell quantitation with these models to enable comparison with the published literature. We believe our results provide valuable information on the intrinsic nature of host cell susceptibility to support viral replication. During the review of this manuscript, we applied our smFISH probes to detect SARS-CoV-2 RNA in infected Golden Syrian hamster lung sections, which show an uneven distribution of infected cells. While the identification and spatial characterisation of susceptible cell types in the lung are beyond the scope of this manuscript, we are excited to include this data in our revised paper to demonstrate the utility of this sensitive approach to track spatiotemporal viral infection dynamics.

      3.The difference in early replication kinetics between the VIC and B.1.1.7 strains is an exciting finding that may have implications for clinical outcomes and transmissibility of these viruses. However, the authors did not clearly demonstrate how these differences in RNA production correlate to infectious viral load released from these cells (in bulk) at each time point. An explanation of this omission would be helpful.

      We will provide data on the level of infectious virus secreted from VIC and B.1.1.7 infected cells at all time points in the revised paper.

      In my opinion, findings related to specific cell lines are of much less importance (and are much less biologically relevant) than identification of replicative differences among strains. Such differences could be used, in part, to aid prediction of the transmissibility of VOC, for example. I think this point gets a bit 'lost in the weeds' of the rest of the paper.

      To address this comment, we will revise text on the differential replication kinetics of the SARS-CoV-2 strains to make this more prominent in our paper.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      Reviewer #1 __**Minor comments:**__

      4.I might have missed this, but the authors could also mention the positive control data about Calu3 infected with SARS-CoV-2. One thing I was wondering: why did the authors use two different cell lines for this experiment?

      To address this point, we have added a sentence about a positive control visualising SARS-CoV-2 in Calu-3 cells using our probe set (page 5 – line 17).

      The experiments with HCoV-229E were done in Huh-7.5 cells because SARS-CoV-2 and HCoV-229E have distinct cell preferences. Using the J2 antibody we show that the levels of the dsRNA derived from viral replication are similar in the two cell lines and with the two viruses. Therefore, the lack of smFISH signal in HCoV-229E infected cells supports the high specificity of the probe set.

      5.Fig 1E. Would be nice to have the intensity scale for all time-points to permit a comparison of image intensities along the different time-points.

      6.Fig 3B. Would be important to have intensity scale bars to judge the signal intensities across the different time-points.

      The fluorescence intensity scale in Figure 1E is applicable to all timepoints, except for the lower panel at 24 hpi, which was intended to show a wider dynamic contrast range. To address this point, we have provided intensity scales for all time-points studied in this figure and also in Figure 3B.

      11.Fig 3C. maybe indicate the two groups with dashed lines.

      We have added a dashed line at the 10² mark in Figure 3C to visually differentiate “partially resistant” and “permissive” cells.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In "Absolute quantitation of individual SARS-CoV-2 RNA molecules: a new paradigm for infection dynamics and variant differences", Lee and colleagues adapt fluorescence in situ hybridization (FISH) to track viral RNAs at the single-molecule level, illustrating heterogeneity during the infection process with potential for significant clinical implications. The authors have meticulously demonstrated use of this approach to investigate the kinetics of early infections, as well as infection heterogeneity between the original and variant strains. Most notably, the authors have identified differences in early infection kinetics between an early strain and more transmissible variant.

      General Comments:

      1.The authors' definition of viral factories, in part as foci with at least 4 gRNA molecules, comes across as arbitrary. Perhaps a clearer explanation of this cutoff would be helpful to the readers' understanding of this definition. Additionally, confirmation of the functionality of such factories by immunofluorescence with anti-RdRp, for example, in addition to identifying staining of gRNAs and (-) sense viral RNAs at each focus could provide valuable support to the authors' conclusions.

      2.The random distribution of super-permissive cells in each cell line was demonstrated early in the infection, primarily at 8 hpi. The authors do not show how this pattern changes over time (8, 10, 12, 16, 24 hpi, for example). Do clusters of super-permissive cells appear at later time points, or does the pattern of 'highly' infected cells remain random for each virus? Any strain-specific differences identified from such patterns may be important for understanding infection progression. Finally, the authors do acknowledge this point, but it cannot be overstated that these data were taken from cell culture systems that have limited similarities to the human respiratory epithelium. A better model for such studies might be primary cultured human bronchial epithelial cells, but of course, these cells are not as readily accessible as the cell lines used in this manuscript.

      3.The difference in early replication kinetics between the VIC and B.1.1.7 strains is an exciting finding that may have implications for clinical outcomes and transmissibility of these viruses. However, the authors did not clearly demonstrate how these differences in RNA production correlate to infectious viral load released from these cells (in bulk) at each time point. An explanation of this omission would be helpful.

      Significance

      Adaptation of RNA-based imaging to understand viral infection cycles is critical to the development of antivirals and other mitigation strategies, highlighting the significance of this work. This manuscript represents an almost herculean effort to identify viral replication dynamics using a series of thoughtful and well-controlled experiments. This paper is likely to be valuable to the field, and will serve as a launch pad for future studies in the role of viral RNA production in SARS-CoV-2 infection, clinical outcomes, and transmissibility.

      Expertise keywords: influenza virus, virus transmission, oligonucleotide-based imaging and therapeutics

      I do not have significant experience with quantitation of fluorescence imaging and signal co-localization in cell images.

      Referees cross-commenting

      Reviewer 1's comments regarding the application of smFISH and RNA quantitation are very helpful and address some key limitations of the research presented in this manuscript. I agree that the experiments are well thought out and include appropriate controls. I think the reviewer's comments and concerns are fair and that it would be appropriate to ask the authors to address their points.

      However, my primary concern remains with the biology and focus of the manuscript. In my opinion, findings related to specific cell lines are of much less importance (and are much less biologically relevant) than identification of replicative differences among strains. Such differences could be used, in part, to aid prediction of the transmissibility of VOC, for example. I think this point gets a bit 'lost in the weeds' of the rest of the paper.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors use single-molecule FISH (smFISH) to study the early-time points of SARS-Cov-2 infection/replication. By targeting genome and sub-genomic RNAs, they can decipher different stages during the infection cycle, and identify different cell populations with distinct behavior. By applying both smFISH and IF with the J2 antibody recognizing dsRNA, the authors nicely demonstrate how smFISH is more sensitive, especially during early infection when viral RNA levels are still relatively low. The investigation of the two SARS-Cov-2 strains is well thought through and provides evidence that these strains have similar viral uptake and infection rates, but differ in the replication kinetics, opening the door for future investigations. The paper is a pleasure to read and the authors provide a wealth of controls that not only convincingly illustrate the specificity of their approach but also how it provides unique information, complementing both IF and sequencing-based approaches. The provided methods are explained in detail and will allow users to quickly get started. Paper provides not only very interesting biological insights, but also nicely illustrates how smFISH can be used to study infection by providing unique information.

      Major comments:

      The key conclusions were convincingly presented and, as far as I can judge as a biophysicist with limited experience in SARS-Cov-2 biology, backed up with adequate controls and analysis. In general, the authors provide exemplary validations to illustrate the specificity of their approach. RNA detection and single-molecule sensitivity are validated in several experiments, by the "standard" probe-splitting approach, where a dual-color labeling of the same RNA is performed, but also by RNAse and Remdesivir treatment. Further, the authors show the specificity of their smFISH probes by applying them to another coronavirus (HCov-229E), where no signal was detected. Further, the authors provide very detailed methods, which should make it easy for other researchers to apply these methods in their own research, and also reproduce the results. The imaging data is nicely complemented with quantitative analysis where needed and the provided plots are both adequately chosen and visually pleasing.

      However, I have one major concern about the RNA abundance analysis. While this comment concerns some of the analysis, it does not question the obtained conclusions. The authors used approaches provided in FISH-quant (Mueller et al, Nat Methods 2013) and big-fish. However, these tools to analyze RNA aggregates were not designed and validated for such massive aggregations as observed for SARS-Cov-2. They were developed for cases such as transcription sites with much smaller aggregations, with a few tens to a hundred molecules. With a regular spot detection approach, usually a few thousand spots can be detected in a cell (e.g. King et al, J Virol 2018), but this depends also on the used microscope and the available cellular volume. Higher RNA concentrations cannot be resolved with a standard approach, because RNA spots start to overlap. Decomposing RNA aggregations can help but will not work reliably for the high RNA densities observed for SARS-Cov-2, especially at later infection time-points. The tools will then not provide accurate estimates anymore. To my knowledge, there is currently no accurate quantification method for such massive RNA levels in smFISH. What has been done in the past is using cellular intensity as an approximation and performing calibrations with cells having lower and thus still resolvable RNA counts (Raj et al., PLoS Biology; https://doi.org/10.1371/journal.pbio.0040309.sg003). The authors proposed three expression regimes (partially resistant, permissive, and super permissive). My concerns here apply mainly to the category super-permissive, where an accurate estimation can't be performed. Here a more cautious quantification should be applied. To a lesser extent, this will also apply to some of the quantifications of gRNAs per factory, with counts exceeding 100s of molecules. As mentioned above, this does not affect any of the conclusions, but would reflect more accurately what kind of reliable information can be drawn from such experiments.

      Minor comments:

      I have a few minor comments/questions.

      1.Page 6; the authors state that "smFISH identifies ... cellular distribution .... within ER-like membranous structures". However, the authors didn't directly show such a localization, could they provide an experiment with an ER stain?

      2.It might be worthwhile pointing out that the probe-sets can be used in different host organisms (Vero - African green monkey; human cell lines).

      3.I really liked the experiment, where the authors showed absence of signal when infecting with another virus & elegant control with the J2 AB. Maybe the authors could explain more clearly that they used a different coronavirus & that based on their sequence alignment no/little signal would be expected.

      4.I might have missed this, but the authors could also mention the positive control data about Calu3 infected with SARS-CoV-2. One thing I was wondering: why did the authors use two different cell lines for this experiment?

      5.Fig 1E. Would be nice to have the intensity scale for all time-points to permit a comparison of image intensities along the different time-points.

      6.Fig 3B. Would be important to have intensity scale bars to judge the signal intensities across the different time-points.

      7.The experiment with the isolated virions shows nicely that the smFISH approach has single-virus sensitivity. Did the authors compare the intensity of these isolated virions with the signal in Fig 1B? This might be a question of personal taste, but to me, this section might actually fit better in the first paragraph of page 4/5, where the authors describe single virions in cells.

      8.Page 6. The authors state "+ORF-N and +ORF-S single labelled spots, corresponding to sgRNAs, were more uniformly distributed throughout the cytoplasm than dual labelled gRNA". This is difficult to appreciate from the image. Is this something the authors could quantify, e.g. with the metrics proposed by Stueland et al, Scientific Reports 2019?

      9.Page 6. The authors perform a FISH/IF experiment including a co-localization analysis, where a "limited overlap" with sgRNAs was observed. I was wondering if this overlap could actually be simply due to rather high density of the sgRNAs. Maybe a control analysis by slightly changing the RNA positions could provide insight here, and give a threshold for what's to be expected randomly at a given RNA density.

      10.I don't fully follow the argument about stability on page 8. The authors also see an increase in the RNA levels. Couldn't this increase compensate for loss of RNA due to degradation? Would it be possible to perform an experiment at very high Remdesivir concentrations which would block transcription?

      11.Fig 3C. maybe indicate the two groups with dashed lines.

      12.How did the authors define/detect replication factories? I couldn't find information about this in the methods.

      Significance

      The authors present their established smFISH approach for the detection of SARS-Cov-2 RNA. As mentioned above, they provide extensive validations and detailed protocols (including the necessary probe sequences). This should allow also relative newcomers to the field to quickly perform these experiments. While the technical advance might not be major, the convincing presentation will certainly be appealing for an audience which has not been using imaging-based approaches to study (early) viral infection events and was relying more on other approaches, such as sequencing or bulk-PCR.

      There are a few papers using smFISH to study SARS-Cov-2, but to my knowledge this study provides the most detailed analysis of the early time-points of infection, where smFISH with its sensitivity really shines. This paper not only provides new insights about SARS-Cov-2 biology, but also illustrates very nicely what kind of unique information smFISH can provide and how this complements orthogonal approaches such as single-cell RNA-seq. Hence, this will certainly be interesting for virologists/biologists working on this pathogen by providing new insight about the replication kinetics, but can also help them to potentially integrate smFISH into their own research.

      I'm a biophysicist working on transcriptional regulation. I contributed to the development of both experimental methods and analysis tools to study single-molecule FISH data. I have only limited expertise in virology, and thus did not evaluate in detail the biological findings concerning SARS-Cov-2.

      Referees cross-commenting

      I completely agree with the assessment of reviewer #2 and have nothing to add.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      We want to thank all three reviewers for their positive and constructive comments and suggestions for improvement. We have now thoroughly revised the manuscript, including new analysis, extra figures, and new material in the wiki. The manuscript has significantly improved because of the reviewers' input. Detailed responses to questions and comments are given below.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Lange et al. have developed an automatic feeding system for zebrafish facilities. The system is open-source and relatively easy to implement. The authors propose two systems, one that delivers the same amount of food for each aquarium (ZAF) and a second (ZAF+) that can adjust the amount of delivered food to each aquarium. The authors show no difference in fish weight, spawning and water quality, when fed using the automatic system or manually.

      In my opinion, the ZAF and ZAF+ are an excellent first approach to solve the complex problem of automatizing feeding in fish facilities. So far, only one company offers this option which is extremely expensive and demands a lot of maintenance.

      The manuscript is very well written and easy to follow. The supplementary material is very well detailed. It is clear that the authors intended to facilitate the implementation of the ZAF by potential users.

      We appreciate the supportive comments from Reviewer 1 and address all comments below:

      I just have a few comments regarding the system:

      1) The authors do not indicate how the system is cleaned. The system drains itself, but will any deposits of food remain in the tubes? Why is the system not flushed with clear water after each feeding? Do the tubes get clogged?

      We agree that the cleaning process was not clearly explained in the manuscript. We added clear sentences in ‘Box 1’ to describe the first cleaning step (see text and figure). Indeed, after each feeding we flush water and then air into the tubes. Moreover, we explain in ‘Box 2’ that we have a second level of cleaning in the form of a special cleaning program that is run at least once a day with no food distribution (i.e., the same program as used for feeding but without any food mixed in: we flush lots of clean water and then air through the system). Finally, in the discussion we clarify the different cleaning steps by adding extra explanations in the first paragraph.

      All these procedures and programs are very effective in preventing system clogging and in reducing the accumulation of debris and algae. After more than 19 months of ZAF and ZAF+ feeding in our facility we never experienced any tube clogging.

      2) How long was the system tested for?

      ZAF has run in the facility for 9 months and ZAF+ for 10 months since September. We added a sentence about the testing time to the discussion. We never experienced any major problems, only a few minor malfunctions, which are reported in the new troubleshooting table added to the wiki (as suggested by Reviewer 2).

      3) The ZAFs were used to feed 16 aquariums. For such a small rack, manually feeding takes less than 5 min. The authors should highlight that, at least for such small systems, the ZAFs will be especially very useful for feeding during weekends and holidays. Still, adding 16 commercially available small automatic feeders to each aquarium, could be simpler to implement.

      As the reviewer notes, ZAFs are very useful when staff are not present (weekends, vacations, etc.). To emphasize this point, we added a sentence to the first paragraph of the discussion. The small automatic feeders available commercially are usually very difficult to attach to zebrafish facilities. They do not fit conventional laboratory aquatic facility racks because they are designed for pet aquariums. They also have fewer features than the ZAFs (the food quantity is difficult to adjust, more food is wasted, they are cumbersome, etc.). Additionally, multiplying the number of devices (one small feeder is needed per tank) increases the risk of malfunction as well as the maintenance time required for food filling, cleaning, etc.

      Thus, small automatic feeders are difficult to adapt to laboratory aquatic housing racks, are a source of feeding errors, are more cumbersome, and are potentially more time-consuming. They are simply not designed for professional aquaculture systems. In contrast, ZAFs can be easily adapted to all commercially available aquatic facilities. The fact that ZAFs simply ‘interface’ via tubes to fish facility racks makes them very versatile and unintrusive.

      4) How do authors envisage implementing the ZAFs in much larger facilities (from 100 to 1000 tanks) ? Implementing a specific ZAF for each rack containing ~20 tanks may not be realistic.

      Indeed, building multiple ZAFs would be complex and resource-consuming. Thus, we designed ZAFs to be adaptable and modular, so one ZAF (or ZAF+) can easily be scaled to handle bigger facilities. The supplementary information and the wiki describe all the steps required to build a ZAF for 16 tanks and a ZAF+ for 30 tanks, along with many tips to scale up these devices without major modifications (up to 80 tanks for ZAF, with no restrictions for ZAF+). Of course, we do think that for truly large facilities there is probably a sweet spot that balances the number of individual devices against the per-device capacity. Having a single device feed 1000 tanks is probably not wise; perhaps 5 devices serving 200 tanks each (ZAF+) would be best. Please note that the hardware cost and complexity scale roughly linearly with the number of tanks, with no surprises here. Moreover, in the case of ZAF+ it is possible to use splitters to feed even more tanks from the same line.
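The scaling arithmetic above can be sketched in a few lines of Python. This is only an illustration and is not part of the ZAF software: the function name `devices_needed` is hypothetical, and the capacities passed in are simply the figures quoted in this reply (200 tanks per ZAF+ in the 1000-tank example, up to 80 tanks for a scaled-up ZAF).

```python
import math


def devices_needed(n_tanks: int, tanks_per_device: int) -> int:
    """Return the smallest number of feeders that covers n_tanks.

    Hypothetical helper for illustration; not taken from the ZAF code base.
    """
    if n_tanks < 0 or tanks_per_device <= 0:
        raise ValueError("n_tanks must be >= 0 and tanks_per_device must be > 0")
    # Round up: a partially filled device is still a whole device to build.
    return math.ceil(n_tanks / tanks_per_device)


# Example from the reply: 1000 tanks served by ZAF+ devices of 200 tanks each.
print(devices_needed(1000, 200))  # 5
# A 100-tank facility with ZAF devices scaled to their stated limit of 80 tanks.
print(devices_needed(100, 80))    # 2
```

Because cost and complexity are described as scaling roughly linearly with the number of tanks, the device count mainly trades off the impact of a single failure against the number of units to maintain.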

      We added pages to the ZAF/ZAF+ wiki to help users extend the feeding capacity of their ZAFs (see “tips to scale up ZAF” and “tips to scale up ZAF+” in the wiki). We also mentioned in the discussion the possibility of distributing food to more tanks with one device by increasing the number of outputs, and referenced the wiki accordingly.

      Having said this, we did not primarily design ZAFs for very large fish facilities. Instead, we designed the ZAF systems to facilitate the adoption of fish models by many small and medium-sized labs. We hope that our system will lower the bar for labs with moderate resources to get started with aquatic models, or for labs that just want to ‘try’ a new aquatic model organism ‘on the side’.

      5) How does the length of the tubes influence the efficiency of feeding? For feeding many tanks with the same ZAF it is necessary that the tubes will be of the same length. In that case, the system will become very cumbersome. Longer tubes will probably need stronger pumps. What's the maximal length of tubes tested? That will limit the number of aquariums a ZAF can feed.

      How does the length of the tubes influence the efficiency of feeding? For ZAF, the length of the tubes is very important because its design assumes homogeneous food distribution. In contrast, ZAF+ distributes the entire water and food mix to each tank sequentially, so tube length is not an issue. To make sure that tube length or tube layout does not affect feeding efficiency, we evaluated the weight of fish from tanks housed on two different rows (top and bottom). This was not clearly explained in the Methods section, and we changed the text to reflect that. Additionally, at the end of each ZAF+ run, the washing sequence runs a relatively large quantity of water to ensure that all food gets flushed out to the right tanks. We did not evaluate the precise amount of food delivered. However, after each feeding and cleaning all tubes are empty (see the last sentences of Box 2).

      For feeding many tanks with the same ZAF it is necessary that the tubes will be of the same length. In that case, the system will become very cumbersome. This is a fair concern. However, with a good design and with the help of cable ties it is very easy to organise the tubing and avoid ‘tube-hell’. We added a sentence to the wiki to clarify the organisation (see ZAF>Hardware>Tubing in the wiki).

      Longer tubes will probably need stronger pumps. What's the maximal length of tubes tested? That will limit the number of aquariums a ZAF can feed. We never measured this precisely because the generic pumps we use are very powerful and their running time can be adjusted in the software by changing constants in the source code (see the new troubleshooting supplementary table). Therefore, the length of the tubes should not be a limiting factor. Even stronger pumps (more amps) can be readily sourced on Amazon if really needed, although we doubt that this is necessary. Regarding the number of tanks that ZAF can feed, we simply recommend adding more pumps to increase its capacity (see previous comments or “tips to scale up ZAF” in the wiki).

      Despite these comments, this is an excellent first approach, and the fact that the authors made it open-source and open access, make the ZAFs a very important contribution to the community. I have no doubt that some fish facilities will implement it and the community will help to improve it.

      Thank you. We do think that the main benefit of an open source project is the community around it. We are currently collecting a growing list of interested labs, and we would like to organise an online workshop to discuss ZAF and ZAF+, with some talks, Q&As, and more to help people get started.

      Reviewer #1 (Significance (Required)):

      This is the first open-source open-access automatic feeding system ever published.

      It is the first but very important step to the automation of research fish facilities.

      **Referee Cross-commenting**

      I agree with all the other reviewers.

      We also have to take into account that the system is a first prototype and although not ideal, it is open source. This will allow other labs to develop and improve their own models based on the ZAF.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      The manuscript proposes an open source automated feeder for zebrafish facilities, although it would be amenable to other species. Overall, the manuscript is clearly written and easy to understand, the wiki is well sourced and clear. The commitment to open source is commendable.

      I have some questions regarding the long-term sustainability of this setup, as well as some discrepancies in the methods. Finally, as this aims to be useful to people with no engineering/electronics competence, I feel that it is not yet at a level that is accessible enough.

      We are very pleased to see that the Reviewer appreciates our manuscript and our commitment to open access. We thank the Reviewer for their comments, in particular the comments about accessibility, and address them below:

      **Major comments**

      It would be useful to have a centralized list of parts and components, which would make it easier for users to order all that is needed to assemble the ZAF or ZAF+, at the moment the information is distributed through the wiki as hyperlinks.

      Extremely important! This was clearly an oversight on our part. We agree that a table listing all the components would help with constructing ZAF and ZAF+. We have added two tables to the wiki, one for ZAF and another for ZAF+, listing all the parts and components required to build both devices, with article numbers, suppliers and costs in dollars. We thank the reviewer for this excellent suggestion.

      A troubleshooting guide for the common problems the team ran into (if any) would be useful for newcomers, even just as issues on the GitHub. The team may also consider some form of chat/forum/google group to allow discussions between users and experts.

      The reviewer raises an important point, so we added a troubleshooting guide to the ZAF wiki that lists the minor malfunctions we observed. Additionally, users will be able to ask questions or report bugs on the ZAF GitHub using issues. GitHub issues will allow users to discuss and track ideas and feedback within the ZAF user community. Finally, we just created a Gitter room: https://gitter.im/ZAF-Zebrafish-Automatic-Feeder to enable more interactive discussion.

      Did the author observe any algal or bacterial growth in the feeding tubes over the 60 days? Do they have an estimate on how long the tubes stay "clean" enough? The authors mention tube changing every 10 weeks, can they explain the rationale, and did they assess the bacterial/algal contamination over that time? Do the splitter panel and food mixing flask also need replacing regularly?

      After several weeks of usage we indeed observed algal and bacterial growth in the tubes. To document and justify the need to change the tubes, we made a new supplementary figure illustrating tube cleanliness over time, mainly with respect to algal and bacterial growth (see Suppl. Fig 3). We found that 12 weeks is actually the optimal tubing replacement interval in our facility. Algal and bacterial growth depends on characteristics of the facility environment such as light intensity, water and air temperature, and feeding frequency, so the interval may need to be adapted to the user's facility. The splitter tubing can be changed based on user observations; we now mention this in the ZAF tubing supplementary material and on the wiki.

      The authors mention that the tubing needs to be of similar length to ensure similar resistance and food distribution, did they compare the body weight of fish in racks at the top or at the bottom of their system? There are no overall differences, but maybe the bottom racks would receive slightly more food? Furthermore, did they quantify the differences in food/water delivery as a function of length differences?

      Similar tube lengths are only required for ZAF, because its accessible design assumes homogeneous distribution of the water-food mix through a passive splitter system, which is susceptible to variable fluid resistance. In contrast, ZAF+ distributes the water-food mix one tank at a time, ensuring that the correct amount of food is entirely flushed through any required tube length (the pumps are strong enough and we flush enough water). If a tube is too long, the user can adjust the pump running time by changing constants in the code (see the troubleshooting table in the wiki and the corresponding links).
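As a rough guide to how such a running-time constant could be chosen, the sketch below estimates the time needed to push one tube volume of water through a tube. It is an assumption-laden illustration, not the ZAF implementation: the function name `flush_time_s`, the tube inner diameter, the pump flow rate and the safety factor are all hypothetical values that a user would have to measure or choose for their own hardware.

```python
import math


def flush_time_s(tube_length_m: float,
                 inner_diameter_mm: float,
                 flow_ml_per_s: float,
                 safety_factor: float = 2.0) -> float:
    """Estimate the pump running time needed to flush one tube.

    All parameters are assumptions supplied by the user; none of them
    are taken from the ZAF source code.
    """
    if flow_ml_per_s <= 0:
        raise ValueError("flow_ml_per_s must be positive")
    radius_cm = inner_diameter_mm / 20.0   # mm diameter -> cm radius
    length_cm = tube_length_m * 100.0      # m -> cm
    # Cylinder volume; 1 cm^3 equals 1 mL.
    volume_ml = math.pi * radius_cm ** 2 * length_cm
    # The safety factor pushes more than one tube volume through.
    return safety_factor * volume_ml / flow_ml_per_s


# Hypothetical example: 2 m tube, 4 mm inner diameter, pump delivering 10 mL/s.
print(round(flush_time_s(2.0, 4.0, 10.0), 2))  # 5.03
```

The estimate scales linearly with tube length, so doubling a tube's length doubles the suggested running time for the same pump.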

      We thank the reviewer for suggesting that we compare the weight of fish housed at the two extreme heights. Although we did not explicitly report this in the first version of the manuscript, we had anticipated this potential issue and therefore collected data for ZAF and ZAF+ from tanks housed on the top and bottom rows. We added a clear description of the weighing process to the Materials and Methods, highlighting the housing conditions of the tanks tested.

      Finally, after each feeding run the tubes have been fully flushed and are empty without food debris or pellets remaining, irrespective of their sizes. So we did not find it relevant to evaluate the precise amount of food effectively delivered as we control that already upstream.

      Methods fish weight: The methods mention different amounts of food than the wiki, the rationale in the wiki is also different from the 5% of body weight outlined in the methods (which then matches the food amount of the methods). Which is the correct amount?

      We thank the reviewer for noticing the inconsistency. The amounts given in the Methods are the correct ones, so we changed the wiki. We wrote some sections of the wiki early during the development of the hardware, made a mistake when editing the figures, and unfortunately forgot to correct the inconsistencies.
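The 5% of body weight rule cited by the reviewer can be written out as a one-line calculation. The sketch below is illustrative only: the function name `ration_mg`, the number of fish per tank and the mean fish weight are hypothetical inputs, and whether the 5% applies per day or per feeding should be taken from the Methods rather than from this example.

```python
def ration_mg(n_fish: int, mean_weight_g: float, fraction: float = 0.05) -> float:
    """Food ration for one tank as a fraction of total fish body weight.

    Returns milligrams of dry food. The 5% default follows the figure
    quoted from the Methods; all other inputs are user-supplied assumptions.
    """
    if n_fish < 0 or mean_weight_g < 0 or not 0 <= fraction <= 1:
        raise ValueError("inputs must be non-negative and fraction within [0, 1]")
    total_weight_g = n_fish * mean_weight_g
    return total_weight_g * fraction * 1000.0  # g -> mg


# Hypothetical tank: 20 adult fish averaging 0.5 g each.
print(ration_mg(20, 0.5))
```

Keeping the fraction as a parameter makes it easy to check wiki and Methods values against each other with the same fish counts and weights.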

      The code is decently commented for scientific software with clear variable names, but I wonder how flexible it is if users cannot get access to the specific hardware (especially the pumps) used in ZAF/ZAF+? Can the authors briefly comment on this point?

      The pumps are simply built from 12V motors, and a large variety of such pumps can be found online (Amazon, etc.). We have tried several ourselves, and there is no need to use the exact same model. We added a note about this to the tubing sections of the ZAF and ZAF+ wiki.

      The only components that cannot be easily exchanged are the Arduino and Raspberry Pi, but that is not an issue as these components are very easy to source.

      The wiki could use more pictures or, to borrow the Proust Madeleine allusion, schematics akin to LEGO with more intermediary steps clearly outlined. Some pictures are also a bit small/busy (such as 2D and 2E in the frame section, or the magnet pictures), they may benefit from cartoons/schematics to clarify what is done. Alternatively, videos/timelapses may help with better visualising the assembly.

      We appreciate the reviewer's comments and have added new pictures, schematics and extra legends to the wiki to help potential ZAF builders. In the ZAF hardware section of the wiki, we increased the size of all the pictures for all the steps and added new legends to clarify the assembly. There are also now more pictures illustrating the construction steps (e.g., in “frame” and “pumps and valve”), and we added a simple schematic for “servo and food container”. Picture sizes have been increased in “ZAF electronics” and pictures have been added to the “Raspberry Pi and Servo Hat” section. We increased the picture sizes and added more legends to the ZAF+ Hardware “Pumps & Valve” section. Moreover, we added more photos to the “tubing” section and the “ZAF+ Electronics” section.

      We agree that videos or GIFs would have been great for visualizing the assembly. Unfortunately, we did not record such videos during construction. We created ZAF as an open source project and hope to build a community that will share assembly pro-tips and perhaps construction videos on GitHub.

      Our institute is expanding on zebrafish research so we will build additional ZAFs and will use this opportunity to prepare nice videos to add to the wiki. We envision that the wiki will be improved over time with better material, some of it contributed, as well as perhaps newer and better versions of ZAF.

      The main question that would affect if this approach were taken up would be how reliable it is in the long run. Have the authors experienced any issue over the 2 months test? Is this system still being used currently? If so, could the authors update the water quality logs?

      The reviewer suggests that the key question is whether ZAFs can be used all year long. The answer is yes! We used ZAF for 9 months, and have now used ZAF+ for the past 10 months in our fish facility, with great success. We never experienced major malfunctions, and the minor issues we encountered are reported in the troubleshooting table. Since ZAF and ZAF+ have been used daily for months with logs recorded every day, we have extended the water quality logs to 3 months. We have been using the ZAFs in full autonomy for a total of 19 months, which has frankly been invaluable.

      Getting a sense of how long it can run without problems, how much troubleshooting is involved per month would be very useful in answering those questions.

      Apart from manual cleaning and tube replacement, ZAF requires no other major maintenance. Of course, the food reserve needs to be changed at least once per week. We listed the malfunctions in the troubleshooting guide in the wiki. In our facility, ZAFs require an average of 1 hour of maintenance per month. If any hardware part fails, it can be replaced immediately because all the parts are cheap and easily replaceable. In fact, we recommend keeping spares of all the key components (pumps, valves, Arduino, Raspberry Pi, tubes, ...).

      **Minor comments**

      • Main text page 3: Fig. Supp. 2 instead of Supp. Fig. 2. Furthermore, would the authors have similar data for the manual feeding? If so, it could be useful to add here for comparison (although that is not necessary if the data is unavailable).

      We changed the text but we don’t have data available for the water logs with manual feeding.

      Main text page 3: it would be useful to add how long it takes to change all the tubing after 10 weeks?

      This depends on the ZAF tubing layout and on the fish facility; in our hands it takes about one hour. We mention this in the ZAF paragraph of the results section.

      Methods fish weight: The phrasing as it stands makes it unclear whether the same method was used for ZAF and ZAF+, the authors may consider to start with the description of the common weighing method, then the specifics of ZAF+.

      Thank you, we changed the text accordingly.

      Supp.Fig.1a: "Waste water drain pipe"

      Thank you, we changed the text accordingly.

      Acknowledgments: "...for their help..."

      Thank you, we changed the text accordingly.

      ZAF - Servo Hat connection: "to control the pumps"

      Thank you, we changed the text accordingly.

      ZAF - Installation: the dependencies should be listed as they are in ZAF+, or the two sections merged, unless the GUI is not functional (see below).

      Thank you, we now list the dependencies in the wiki.

      ZAF - How to use: there is no mention of the GUI, is it not yet implemented? If not, is the touch screen needed?

      The standard ZAF hardware is controlled by a very simple Python-based program with a command-line interface. To interact with the Raspberry Pi for installation and configuration, we therefore strongly recommend building ZAF with a screen. The touch screen is an easy way to point and click quickly without a mouse, which can be cumbersome to use when no clean horizontal surface is available in a lab environment.

      ZAF+ - soldering: "A 12V power supply (at least 10A best 20A) provides power to the electronics, except the Raspberry Pi and the two Arduino Megas." It seems the sentence is incomplete, or at least I cannot make sense of it.

      Changed to “A 12V power supply (at least 10A, but ideally 20A) provides power to the electronics, except for the Raspberry Pi and the two Arduino Megas that are powered by the Raspberry Pi 5V GPIOs.”

      Reviewer #2 (Significance (Required)):

      This manuscript provides a significant technical advance to the zebrafish field. The proposed automated feeder would be a very useful option for smaller labs, to ensure the consistency of feeding, and to remove one of the routine aspects of fish husbandry.

      As the authors state, there is certainly interest in the zebrafish community [9,10] for automation of feeding. I am not aware of any other DIY fully automated feeding system; commercial systems do exist, but are expensive.

      The manuscript, and proposed automated feeder, would certainly be of interest within the zebrafish community, as well as other researchers using aquatic models that can rely on dry food. How many in the community would embrace this method will depend on how confident they are in the long-term stability.

      I am neither an electronics nor a husbandry expert. As such I am not qualified to comment on any long-term approach this may prove, if any, for fish health. My expertise lies in image and data analysis, as well as microscopy.

      **Referee Cross-commenting**

      I think the major points are shared by all reviewers, I think the other reviews are fair in their content and I have nothing specific to comment on.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      This technical report describes an open-source fully automated feeding system for husbandry of zebrafish (and potentially other aquatic organisms). It provides detailed instructions for assembling individual components into two different feeding systems of varying adaptability, as well as their operation. Links to relevant control software are also provided. The characterization of the systems' performance appears somewhat limited (e.g. only maintenance of adult fish over a period of 8 weeks and use of dry food is documented). These systems could be of use for husbandry in a large number of research labs and, in addition, for automated reward delivery in large-scale associative conditioning assays.

      We thank the Reviewer for their encouraging comments and appreciate their helpful suggestions. We respond to the Reviewer's comments below:

      **Major comments:**

      Providing food to large numbers of tanks in aquatic animal facilities in a regular fashion is a time- and resource-consuming process. Some automated feeding systems for large numbers of tanks are commercially available, but these feeder robots are expensive and are restricted to systems of specific vendors. Therefore, an adaptable automated system that can be assembled from off-the-shelf components is a very attractive option for many research labs to both save resources and standardize the feeding process.

      The instructions for assembly provided by the authors appear quite detailed and sufficient to allow non-experts the assembly and operation of the automated feeder systems. The design of the system appears appropriate for the task.

      While additional experiments are not required to support the claims of the article, I feel that it would be significantly improved by the provision of additional information. My suggestions in that regard include:

      Description of the washing procedure of the system (which solvents, how often, how long?). The authors mention that an exchange of the tubing is required every 10 weeks, but since the tubing transports liquid food mixture, it is easily conceivable that microbial growth will occur rapidly in the system without thorough hygiene / washing procedures. Also could the authors provide some information, which type of tubing material they are using (Silicone, Tygon etc.)?

      Description of the washing procedure of the system (which solvents, how often, how long?).

      We agree that the cleaning procedure needed to be clarified. We added a clearer description of the process to the first paragraph of the discussion and clarified the explanation of cleaning in Box 1 and Box 2 (as also suggested by Reviewer 1). To summarise, there are two levels of cleaning. The first happens just after a food distribution program, by flushing water and air through the system (Box 1). Additionally, at least once a day, we run an entire program without food to rinse and clean the system (Box 2). This last step is programmable using the ZAF software.
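The two cleaning levels described above can be pictured as one program run with or without food. The following is a minimal sketch under stated assumptions, not the actual ZAF control code: the function `build_program` and the step names (`dispense_food_mix`, `flush_water`, `flush_air`) are hypothetical, and the per-tank loop mirrors the sequential behaviour described for ZAF+.

```python
from typing import List, Tuple

Step = Tuple[str, int]  # (hypothetical action name, tank number)


def build_program(tanks: List[int], with_food: bool = True) -> List[Step]:
    """Build an ordered list of steps for a feeding or cleaning run.

    A feeding run dispenses the food-water mix and then flushes water and
    air (first cleaning level). A cleaning run is the same program without
    food (second cleaning level). Step names are illustrative only.
    """
    steps: List[Step] = []
    for tank in tanks:
        if with_food:
            steps.append(("dispense_food_mix", tank))
        steps.append(("flush_water", tank))  # rinse the line after delivery
        steps.append(("flush_air", tank))    # empty the tube of remaining water
    return steps


feeding_run = build_program([1, 2], with_food=True)
cleaning_run = build_program([1, 2], with_food=False)
print(len(feeding_run), len(cleaning_run))  # 6 4
```

Expressing the cleaning run as the feeding program minus the food step keeps both runs on the same code path, which matches the description of the cleaning program as "the same program without food".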

      The authors mention that an exchange of the tubing is required every 10 weeks, but since the tubing transports liquid food mixture, it is easily conceivable that microbial growth will occur rapidly in the system without thorough hygiene / washing procedures

      Following all reviewers' comments, we added an extra supplementary figure justifying the need to change the tubes every 12 weeks (updated based on our latest observations). We monitored the cleanliness (algal/microbial growth) of the tubes and found that it becomes necessary to replace them every 12 weeks (Suppl. Fig 3). Interestingly, we noted that microbial and algal growth depends on facility-specific factors such as light intensity and temperature.

      Also could the authors provide some information, which type of tubing material they are using (Silicone, Tygon etc.)?

      For ZAF we used silicone-based tubing. We then changed to PVC-based tubes for ZAF+ because they are cost-effective and have similar specifications for our usage. We added a note about the tubing material to the ZAF tubing and ZAF+ tubing pages of the wiki.

      In a related point, I was left wondering how long the food is being mixed in the mixing flask before being applied to the animals? Too long mixing might lead to a loss of nutrients into the solution (through diffusion). Could the authors comment on that, please? Do the food pellets remain more or less integral so that the majority of delivered food is actually ingested by the fish?

      • In a related point, I was left wondering how long the food is being mixed in the mixing flask before being applied to the animals? Too long mixing might lead to a loss of nutrients into the solution (through diffusion). Could the authors comment on that, please? This is a very relevant point: it is indeed important that the food is not mixed in water for too long, to avoid pellet dissolution and loss of nutrients. The food manufacturer's website states: “duration of “wet” feeding should be kept short” (https://zebrafish.skrettingusa.com/pages/faq). We therefore adapted our feeding program to keep the “wet” phase extremely short. For ZAF and ZAF+, the software is designed to deliver the mix of food and water to the tank(s) within 3 minutes at most. To clarify this, we added a sentence to the Box describing the feeding: “Overall, they share many common features, like the quick distribution of food and water mix, to avoid pellet dissolution in water and loss of nutrients.”

      • Do the food pellets remain more or less integral so that the majority of delivered food is actually ingested by the fish? We manually evaluated the integrity of the food pellets in the early phase of development. Because these parameters are difficult to quantify, we decided to record fish weight as a readout of good food delivery and general effectiveness. However, we understand the reviewer's concern and have therefore added a supplementary video to the manuscript that shows the distribution of the food pellets and their integrity once they reach the tanks.

      In yet another related point, I was left wondering, whether the authors observed any negative impact of feeder usage on water quality (besides pH and conductivity, which they report)? Especially, with regards to ammonia that might arise from the decomposition of uneaten food items?

      According to aquaculture textbooks, ammonia toxicity induces clinical and microscopic changes that reduce growth and increase susceptibility to pathogens, as summarized here: https://zebrafish.org/wiki/health/disease_manual/water_quality_problems#ammonia_toxicity. However, we never observed such abnormal phenotypes in our facility, and our regular aquatic PCR health monitoring profiles have always been negative for pathogens. Additionally, high ammonia levels are associated with husbandry conditions such as high fish density or inadequate water circulation, neither of which is present in our fish facility. We therefore did not find it relevant to test for ammonia levels.

      The authors only tested the feeder on adult fish, but discuss that it would easily be transferable to a system that is used for raising fish fry. In that context, could the authors comment, on whether the system of using water as the carrier for the dry food (after mixing) would work as well for the smaller pellets required in feeding fish fry (e.g. 75 or 100 um pellet size as compared to the 500 um pellet size they use)? With smaller pellets, break-down of the dry food during the mixing process seems to be an even larger problem, I could imagine.

      We appreciate the reviewer's comment about using different food pellet sizes, a very important point for ZAF adoption beyond adult fish. During ZAF testing we tested different pellet sizes (from 100 µm to 500 µm) and did not observe differences in pellet distribution. Most industrial aquatic food pellets are oily and designed for automatic distribution (for large farming environments). They therefore keep their integrity and are not easily broken. Besides, as mentioned previously, the food stays wet (mixed with water) for a relatively short time during distribution, which helps maintain pellet integrity.

      **Minor comments:**

      (1) The average weight of animals is given as lying in the range of 5 to 6g. That seems very high. The "standard" weight range of adult zebrafish is more around 1g [see, for example: Clark, T. S., Pandolfo, L. M., Marshall, C. M., Mitra, A. K. & Schech, J. M. Body Condition Scoring for Adult Zebrafish (Danio rerio). J Am Assoc Lab Anim Sci (2018)]. Could the authors comment on that discrepancy?

      Good observation by the reviewer. We did make a mistake during figure preparation and our legends were actually not reflecting the exact weight of the fish. The scale bars of the figures have been changed to reflect the real weight of the fish (below 1g). We thank the reviewer for noticing the mistakes.

      (2) The authors state that spawning success is not negatively affected by the automated feeding, and they quantify the number of successful crosses. Could the authors briefly confirm or state, that or whether the clutch size was also unaffected?

      We never precisely quantified clutch size or quality, but we have now been using ZAFs to feed our facility for 19 months and have never observed any problem with our clutches. Our lab works on early development and relies crucially on clutch quality.

      (3) The manual feeding procedure / regime that is used to compare husbandry success against the automated feeding regime is not described in any detail. That seems important given the topic of the article.

      We agree and have added a brief description of the protocol to the Methods section (“Animal and husbandry”).

      (4) The authors cite two recent papers that describe semi-automatic feeding systems for zebrafish in the introduction. The authors might want to consider discussing some key differences between their system and these semi-automatic systems in the discussion.

      The two published semi-automatic feeding systems are completely different from the devices presented in our paper. They are also open access, but they need to be manually operated by facility staff. In contrast, our solutions are fully automatic and do not require human intervention during operation. We mention these two solutions in our brief literature overview in the introduction. However, since they belong to a different category, we did not judge it necessary to comment on them in the discussion.

      (5) What do the error bars in Fig. 1c signify (s.d., s.e.m.)? Please state in Figure legend.

      We thank the reviewer for their attention to detail. We now explain in the figure legend that s.e.m. denotes the standard error of the mean.

      (6) I do think that the system could be of particular interest to researchers that study learning and that use food rewards in automated associative conditioning experiments. While this might be obvious to researchers with such an interest, this aspect is not at all discussed in the paper. Mentioning it might further underscore the versatility of the feeder system.

      We agree with the reviewer that ZAF can be adapted to experimental conditions such as behavioral conditioning, nutrition and drug delivery. Any experiment requiring the automatic delivery of solid pellets or liquid can benefit from ZAF. We have revised the text to mention this in the discussion.

      (7) A list of all required equipment with vendors and price estimates (e.g. in the Supplement) would make this paper an even more readily accessible resource.

      This is a very important point already suggested by another reviewer. We added two extra tables in the wiki with the necessary parts and components, listing models, references, and prices.

      Reviewer #3 (Significance (Required)):

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This article signifies a purely technical advance in that it provides a characterization of an open-source, scalable automated feeder for aquatic facilities. As such, it presents a significant advance in the field of aquatic animal husbandry. In addition, this system could also be useful for automated large- or medium-scale associative conditioning paradigms, in which food rewards are given as positive reinforcers.

      Place the work in the context of the existing literature (provide references, where appropriate).

      The authors refer to previously published semi-automatic feeder systems. Regardless of the advantages or disadvantages of all these systems, the field will benefit from a broad(er) choice of automatic feeding systems that are described in sufficient detail to be easily assembled in the laboratory.

      State what audience might be interested in and influenced by the reported findings.

      This study is of interest for any research laboratory working with zebrafish or other aquatic model organisms. Thus, the audience for this article is very broad. Specific interest might also arise in researchers that are performing learning studies in zebrafish (see above).

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Zebrafish, neural circuits, sensory systems.

      **Referee Cross-commenting**

      Many of the major points are shared by all three reviewers. Beyond these shared points, I agree with the other reviews; they raise important questions. All reviews are fair, in my opinion.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This technical report describes an open-source fully automated feeding system for husbandry of zebrafish (and potentially other aquatic organisms). It provides detailed instructions for assembling individual components into two different feeding systems of varying adaptability, as well as their operation. Links to relevant control software are also provided. The characterization of the systems' performance appears somewhat limited (e.g. only maintenance of adult fish over a period of 8 weeks and use of dry food is documented). These systems could be of use for husbandry in a large number of research labs, and, in addition, for automated reward delivery in large-scale associative conditioning assays.

      Major comments:

      Providing food to large numbers of tanks in aquatic animal facilities in a regular fashion is a time- and resource-consuming process. Some automated feeding systems for large numbers of tanks are commercially available, but these feeder robots are expensive and are restricted to systems of specific vendors. Therefore, an adaptable automated system that can be assembled from off-the-shelf components is a very attractive option for many research labs to both save resources and standardize the feeding process.

      The instructions for assembly provided by the authors appear quite detailed and sufficient to allow non-experts the assembly and operation of the automated feeder systems. The design of the system appears appropriate for the task.

      While additional experiments are not required to support the claims of the article, I feel that it would be significantly improved by the provision of additional information. My suggestions in that regard include:

      Description of the washing procedure of the system (which solvents, how often, how long?). The authors mention that an exchange of the tubing is required every 10 weeks, but since the tubing transports liquid food mixture, it is easily conceivable that microbial growth will occur rapidly in the system without thorough hygiene / washing procedures. Also, could the authors state which type of tubing material they are using (Silicone, Tygon etc.)?

      In a related point, I was left wondering how long the food is mixed in the mixing flask before being delivered to the animals. Mixing for too long might lead to a loss of nutrients into the solution (through diffusion). Could the authors comment on that, please? Do the food pellets remain more or less intact so that the majority of delivered food is actually ingested by the fish?

      In yet another related point, I was left wondering, whether the authors observed any negative impact of feeder usage on water quality (besides pH and conductivity, which they report)? Especially, with regards to ammonia that might arise from the decomposition of uneaten food items?

      The authors only tested the feeder on adult fish, but discuss that it would easily be transferable to a system that is used for raising fish fry. In that context, could the authors comment on whether the system of using water as the carrier for the dry food (after mixing) would work as well for the smaller pellets required in feeding fish fry (e.g. 75 or 100 µm pellet size as compared to the 500 µm pellet size they use)? With smaller pellets, break-down of the dry food during the mixing process seems to be an even larger problem, I could imagine.

      Minor comments:

      (1) The average weight of animals is given as lying in the range of 5 to 6g. That seems very high. The "standard" weight range of adult zebrafish is more around 1g [see, for example: Clark, T. S., Pandolfo, L. M., Marshall, C. M., Mitra, A. K. & Schech, J. M. Body Condition Scoring for Adult Zebrafish (Danio rerio). J Am Assoc Lab Anim Sci (2018)]. Could the authors comment on that discrepancy?

      (2) The authors state that spawning success is not negatively affected by the automated feeding, and they quantify the number of successful crosses. Could the authors briefly state whether the clutch size was also unaffected?

      (3) The manual feeding procedure / regime that is used to compare husbandry success against the automated feeding regime is not described in any detail. That seems important given the topic of the article.

      (4) The authors cite two recent papers that describe semi-automatic feeding systems for zebrafish in the introduction. The authors might want to consider discussing some key differences between their system and these semi-automatic systems in the discussion.

      (5) What do the error bars in Fig. 1c signify (s.d., s.e.m.)? Please state in Figure legend.

      (6) I do think that the system could be of particular interest to researchers that study learning and that use food rewards in automated associative conditioning experiments. While this might be obvious to researchers with such an interest, this aspect is not at all discussed in the paper. Mentioning it might further underscore the versatility of the feeder system.

      (7) A list of all required equipment with vendors and price estimates (e.g. in the Supplement) would make this paper an even more readily accessible resource.

      Significance

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This article signifies a purely technical advance in that it provides a characterization of an open-source, scalable automated feeder for aquatic facilities. As such, it presents a significant advance in the field of aquatic animal husbandry. In addition, this system could also be useful for automated large- or medium-scale associative conditioning paradigms, in which food rewards are given as positive reinforcers.

      Place the work in the context of the existing literature (provide references, where appropriate).

      The authors refer to previously published semi-automatic feeder systems. Regardless of the advantages or disadvantages of all these systems, the field will benefit from a broad(er) choice of automatic feeding systems that are described in sufficient detail to be easily assembled in the laboratory.

      State what audience might be interested in and influenced by the reported findings.

      This study is of interest for any research laboratory working with zebrafish or other aquatic model organisms. Thus, the audience for this article is very broad. Specific interest might also arise in researchers that are performing learning studies in zebrafish (see above).

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Zebrafish, neural circuits, sensory systems.

      Referee Cross-commenting

      Many of the major points are shared by all three reviewers. Beyond these shared points, I agree with the other reviews; they raise important questions. All reviews are fair, in my opinion.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The manuscript proposes an open source automated feeder for zebrafish facilities, although it would be amenable to other species. Overall, the manuscript is clearly written and easy to understand, the wiki is well sourced and clear. The commitment to open source is commendable. I have some questions regarding the long-term sustainability of this setup, as well as some discrepancies in the methods. Finally, as this aims to be useful to people with no engineering/electronics competence, I feel that it is not yet at a level that is accessible enough.

      Major comments

      • It would be useful to have a centralized list of parts and components, which would make it easier for users to order all that is needed to assemble the ZAF or ZAF+. At the moment, the information is distributed through the wiki as hyperlinks.

      • A troubleshooting guide for the common problems the team ran into (if any) would be useful for newcomers, even just as issues on the GitHub. The team may also consider some form of chat/forum/google group to allow discussions between users and experts.

      • Did the authors observe any algal or bacterial growth in the feeding tubes over the 60 days? Do they have an estimate of how long the tubes stay "clean" enough? The authors mention tube changing every 10 weeks; can they explain the rationale, and did they assess the bacterial/algal contamination over that time? Do the splitter panel and food mixing flask also need replacing regularly?

      • The authors mention that the tubing needs to be of similar length to ensure similar resistance and food distribution. Did they compare the body weight of fish in racks at the top or at the bottom of their system? There are no overall differences, but maybe the bottom racks would receive slightly more food? Furthermore, did they quantify the differences in food/water delivery as a function of length differences?

      • Methods fish weight: The methods mention different amounts of food than the wiki, the rationale in the wiki is also different from the 5% of body weight outlined in the methods (which then matches the food amount of the methods). Which is the correct amount?

      • The code is decently commented for scientific software with clear variable names, but I wonder how flexible it is if users cannot get access to the specific hardware (especially the pumps) used in ZAF/ZAF+? Can the authors briefly comment on this point?

      • The wiki could use more pictures or, to borrow the Proust Madeleine allusion, schematics akin to LEGO with more intermediary steps clearly outlined. Some pictures are also a bit small/busy (such as 2D and 2E in the frame section, or the magnet pictures), they may benefit from cartoons/schematics to clarify what is done. Alternatively, videos/timelapses may help with better visualising the assembly.

      • The main question that would affect if this approach were taken up would be how reliable it is in the long run. Have the authors experienced any issue over the 2 months test? Is this system still being used currently? If so, could the authors update the water quality logs? Getting a sense of how long it can run without problems, how much troubleshooting is involved per month would be very useful in answering those questions.

      Minor comments

      • Main text page 3: Fig. Supp. 2 instead of Supp. Fig. 2. Furthermore, would the authors have similar data for the manual feeding? If so, it could be useful to add here for comparison (although that is not necessary if the data is unavailable).

      • Main text page 3: It would be useful to add how long it takes to change all the tubing after 10 weeks.

      • Methods fish weight: The phrasing as it stands makes it unclear whether the same method was used for ZAF and ZAF+. The authors may consider starting with the description of the common weighing method, then the specifics of ZAF+.

      • Supp.Fig.1a: "Waste water drain pipe"

      • Acknowledgments: "...for their help..."

      • ZAF - Servo Hat connection: "to control the pumps"

      • ZAF - Installation: the dependencies should be listed as they are in ZAF+, or the two sections merged, unless the GUI is not functional (see below).

      • ZAF - How to use: there is no mention of the GUI, is it not yet implemented? If not, is the touch screen needed?

      • ZAF+ - soldering: "A 12V power supply (at least 10A best 20A) provides power to the electronics, expect the Raspberry Pi and the two Arduino Megas." It seems the sentence is incomplete, or at least I cannot make sense of it.

      Significance

      This manuscript provides a significant technical advance to the zebrafish field. The proposed automated feeder would be a very useful option for smaller labs, to ensure the consistency of feeding, and to remove one of the routine aspects of fish husbandry.

      As the authors state, there is certainly interest in the zebrafish community [9,10] for automation of feeding. I am not aware of any other DIY fully automated feeding system; commercial systems do exist, but are expensive.

      The manuscript, and proposed automated feeder, would certainly be of interest within the zebrafish community, as well as other researchers using aquatic models that can rely on dry food. How many in the community would embrace this method will depend on how confident they are in the long-term stability.

      I am neither an electronics nor a husbandry expert. As such, I am not qualified to comment on any long-term approach this may prove, if any, for fish health. My expertise lies in image and data analysis, as well as microscopy.

      Referee Cross-commenting

      I think the major points are shared by all reviewers. The other reviews are fair in their content, and I have nothing specific to comment on.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Lange et al. have developed an automatic feeding system for zebrafish facilities. The system is open-source and relatively easy to implement. The authors propose two systems, one that delivers the same amount of food for each aquarium (ZAF) and a second (ZAF+) that can adjust the amount of delivered food to each aquarium. The authors show no difference in fish weight, spawning and water quality, when fed using the automatic system or manually.

      In my opinion, the ZAF and ZAF+ are an excellent first approach to solve the complex problem of automating feeding in fish facilities. So far, only one company offers this option, which is extremely expensive and demands a lot of maintenance.

      The manuscript is very well written and easy to follow. The supplementary material is very well detailed. It is clear that the authors intended to facilitate the implementation of the ZAF by potential users.

      I just have a few comments regarding the system:

      1) The authors do not indicate how the system is cleaned. The system drains itself, but will any deposits of food remain in the tubes? Why is the system not flushed with clear water after each feeding? Do the tubes get clogged?

      2) How long was the system tested for?

      3) The ZAFs were used to feed 16 aquariums. For such a small rack, manual feeding takes less than 5 min. The authors should highlight that, at least for such small systems, the ZAFs will be especially useful for feeding during weekends and holidays. Still, adding 16 commercially available small automatic feeders, one to each aquarium, could be simpler to implement.

      4) How do the authors envisage implementing the ZAFs in much larger facilities (from 100 to 1000 tanks)? Implementing a specific ZAF for each rack containing ~20 tanks may not be realistic.

      5) How does the length of the tubes influence the efficiency of feeding? For feeding many tanks with the same ZAF, the tubes must be of the same length. In that case, the system will become very cumbersome. Longer tubes will probably need stronger pumps. What is the maximal length of tubes tested? That will limit the number of aquariums a ZAF can feed.

      Despite these comments, this is an excellent first approach, and the fact that the authors made it open-source and open access makes the ZAFs a very important contribution to the community. I have no doubt that some fish facilities will implement it and the community will help to improve it.

      Significance

      This is the first open-source open-access automatic feeding system ever published. It is the first but very important step toward the automation of research fish facilities.

      Referee Cross-commenting

      I agree with all the other reviewers.

      We also have to take into account that the system is a first prototype and although not ideal, it is open source. This will allow other labs to develop and improve their own models based on the ZAF.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Response to reviewer comments on:

      “Recruitment of Scc2/4 to double strand breaks depends on γH2A and DNA end resection”, by Martin Scherzer et al

      We would like to thank the editors and reviewers for their time spent, as well as their appreciated and insightful comments on our manuscript. We have now initiated the revision as outlined point by point below. We provide a description of the plan for how to resolve the points of concern still remaining and also list the modifications and improvements already incorporated in the revised and transferred manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In the manuscript entitled "Recruitment of Scc2/4 to double strand breaks depends on γH2A and DNA end resection", Scherzer et al. study the role of Scc2 in DSB repair in yeast. Scc2 is part of the cohesin loader and it is required for cohesin loading in response to DSB. The authors study the chromatin association of Scc2 by ChIP-qPCR and use genetics to identify factors that affect its recruitment. They show that Scc2 is enriched up to 10 kb from the break site, similar to cohesin, and identify MRE, TEL1 and γH2A as important factors for Scc2 chromatin binding. Remarkably, MEC1, which has been shown to regulate cohesin under these conditions, is dispensable for Scc2 recruitment. While DNA resection is important for Scc2 recruitment, chromatin remodelers don't play a significant role in it despite numerous reports on their effect on cohesin loading during the cell cycle. The manuscript provides new and important information on cohesin regulation in response to DNA damage.

      **Major comments:**

      The experiments are done appropriately and contain the required controls. The results are presented clearly and with adequate statistics and support the conclusions. The experiments provide valuable information. However, the low resolution of the experimental setup is limiting, and dynamic information on Scc2 binding is lacking. I would agree with the authors that this kind of information may be beyond their scope. However, the absence of this information reduces the overall impact of the manuscript.

      1. ChIP-seq, of at least some of the key experiments, could provide information on the specific Scc2 binding sites and elucidate whether cohesin is translocated from the loading sites or accumulate in its proximity.

      ChIP-seq would indeed increase the resolution of the Scc2 and Cohesin DSB accumulation, especially beyond 1 kb. However, to gain insight into the dynamics of the binding, numerous timepoints for both strains would have to be analyzed, which we feel would be beyond the possibilities for this study (see also comment under point 4 of this document). For Scc2 we believe that we have shown high enough resolution, determining binding from 0.1 to 30 kb away from the break. We have also provided a time course experiment from 90 minutes up to 6 hours and show that the Scc2 binding is continuously increasing. In the revised version of the manuscript, we have added experiments looking at the Cohesin binding in close vicinity of the break, similar to what we previously did for Scc2. With this we confirm the binding pattern of Cohesin previously reported. We have also compared Cohesin binding at 90 and 180 min after break induction, for increased information on the dynamics of its binding at the DSB, and see no change in Cohesin positioning in relation to the DSB site. Rather, the general level of binding increases equally over the region with time (compare Fig 1B and 4A with Fig 1C and Fig S3). This to us indicates that there is no translocation of Cohesin from one loading site to final binding sites. However, to further clarify this issue we plan to include ChIP qPCR experiments on an ATPase deficient mutant of Cohesin, which has been found to be able to be loaded on DNA but not translocated (Hu et al 2010, “ATP Hydrolysis is required for relocating Cohesin from sites occupied by its Scc2/4 loading complex”). These experiments will potentially allow us to explore the possibility that Cohesin is loaded at one (or several) site(s) in the DSB region and then translocated away to the final binding locations with time. The generation of such a strain is ongoing and the results from these experiments will be included in a fully revised version of the manuscript.

      1. It has been suggested that Scc2 and Pds5 are mutually exclusive in cohesin complexes. It would be interesting to check in the current experimental setup (ChIP-qPCR) if Pds5 is mimicking the Scc2 pattern.

      We have generated a strain where Pds5 is FLAG-tagged, and include experiments determining the loading/binding of Pds5 at the break region in the revised version of the manuscript. These show (Fig S1B) that the binding of Pds5 mimics that of Cohesin, indicating that it binds as part of the Cohesin complex. In addition, it is seemingly not affected by the presence of a DSB and therefore most likely not important for the Scc2 or Cohesin loading at the DSB.

      **Minor comments:**

      1. Adding a threshold line to the graphs at fold change = 1 (no enrichment with respect to wild type) will increase their readability.

      We appreciate this suggestion. The line has now been added and is indeed helpful.

      1. Fig. 1A- Add times to the schematic. Modify the text to GAL addition/break induction.

      Thank you for the good suggestion; the figure has now been modified.

      1. Page 9. The authors write: "Cohesin failed to be loaded at the DSB in a mec1Δ background (Fig 3A)". However, the figure shows reduced cohesin binding in mec1delta with respect to the wild type.

      In this graph Cohesin binding in response to break induction is shown. The level of binding in the mec1 deletion mutant is comparable to that of Cohesin in the absence of break induction. See Fig S3 for a newly added experiment showing wt binding of Cohesin at the same timepoint. The text describing Fig 3A on page 9 has also been slightly modified.

      1. Page 10. ".......recruitment to the DSB compared to wild type (Fig 3D)." Should be Fig. 4D.

      Thank you for noticing this mistake; this has now been corrected.

      1. Figure legend 3. "........Protein samples were taken after 3 hours arrest (G2/M, lane 1),....." The benomyl arrest is referred to as G2 arrest in the text but G2/M arrest in the legend. Consistency is needed.

      We agree on the need for consistency and have thus changed to G2/M throughout the manuscript.

      I suggest presenting the suggested model in a figure.

      We plan to add an illustrative model figure as Fig 6 in a fully revised version of the manuscript.

      Reviewer #1 (Significance (Required)):

      I am an expert in cohesin biology. The Scc2-Scc4 complex has been identified as an essential factor for cohesin loading during the cell cycle (Ciosk et al., 2000). This function has been shown to be essential for cohesin's role in response to DNA DSB (Unal et al., 2004, Strom et al., 2004). The interplay between Scc2 and cohesin has been studied mostly in the context of the cell cycle. It has been shown that Scc2 activates the ATPase activity of cohesin and promotes its translocation from the loading site. Scc2 and Pds5 are mutually exclusive and their switch suppresses cohesin ATPase activity (Hu et al., 2011, Petela et al., 2011). However, the Scc2-cohesin interplay has been poorly studied in the context of DNA repair. The current work adds valuable information on the factors that recruit Scc2 to the break site and identifies end resection as the key event in this process. This information is novel and important and its contribution to the fields of cohesin and DNA repair should not be overlooked. However, ChIP-seq information can increase the overall impact.

      We appreciate the nice verdict. We agree to some extent with the ChIP-seq comment; however, based on the discussion under major point 1, we do not see that adding ChIP sequencing experiments to this study will be possible.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Cohesin is a key structural component of chromosomes. Amongst its functions, cohesin plays a critical role in ensuring the accurate repair of double stranded DNA breaks (DSBs). Intuitive as this may seem, a number of fundamental open questions remain. One of these questions is, how does the cohesin loading machinery recognise a DSB? This issue is addressed in the present study. The manuscript begins with a well-written introduction into the fields of DSB repair, as well as cohesin. The research aim is clearly laid out. Experiments follow that sequentially investigate known steps of the DSB repair pathway, asking how these steps intersect with the cohesin loading machinery.

      On the positive side, this is a technically very well conducted study (investigating the cohesin loader has proven tricky in many contexts). The study is systematic and explores the known steps during DSB repair for their impact on cohesin loader recruitment. The authors find a surprising separation of function. The DSB pathway up until H2AX phosphorylation and DNA end resection is required for both cohesin loader recruitment, as well as consequently for cohesin loading. The Mec1 checkpoint kinase, in contrast, is dispensable for cohesin loader recruitment but is required for cohesin loading. This suggests that Mec1 supports cohesin loading at a step beyond that of attracting the cohesin loader. The manuscript thus contains important information that will be of interest to a wide range of researchers in the DNA repair and cohesin fields.

      The limitation of the study lies in the fact that the molecular determinant for cohesin loader recruitment to DSBs remains unknown. H2AX phosphorylation and DNA end resection are shown to be prerequisites, but how do these events form a molecular mark that the cohesin loader recognises? And what is this mark? Equally, how does the Mec1 kinase permit cohesin loading additionally to the cohesin loader?

      We appreciate the positive comments as well as the criticism. We are unfortunately fully aware of the lack of precise knowledge regarding the actual mark made by phosphorylation of H2A, and resection, for recruitment of Scc2. The same is true for the limited understanding of the exact contribution of Mec1 to Cohesin loading. We would have liked to carry out a screening-based approach to find the single determinant; however, this has to be performed outside the scope of this study.

      **Specific comments:**

      Figure 1. It would be interesting to overlay the Scc2 profile around the DSB with that of Scc1 (obtained previously under similar conditions?), to contrast the loading site with the final cohesin distribution.

      In the revised version of the manuscript, we have looked at the binding of Cohesin close to the break and outwards in the same way as for Scc2, with this experimental system. These binding profiles, shown as Fig 1B and 1C, do not overlap. Their different distribution is very clear. This also confirms what has been reported previously for Cohesin binding, where the region closest to the break is in principle rather devoid of Cohesin (Fig 1C). This binding pattern is also not changed with increased time for break induction (Fig S3), indicating that there is likely no major translocation of Cohesin from a loading site to the final binding sites around the DSB, at least not during the time frame analyzed, but rather an overall increase in Cohesin binding in the break region. While we cannot exclude translocation completely, we hope that experiments using a Cohesin transition state mutant, deficient in translocation, will address this better.

      Figure 2. Using the same y-axis scale from 1-4 amongst panels A-D could make evaluation of the data easier.

      We agree that the comparison is easier when the scale is the same; this has now been changed within the figures.

      Figure 3. Panels A and B contain data that are important to interpret the DNA end resection results shown in Figure S2. Maybe that latter data, which conveys the main conclusion from the figure, could be incorporated within the main figure?

      This is a good point and we have changed the figure accordingly; the resection experiments in the absence of Scc2 from Fig S2 are now shown as Fig 3C.

      Figure 5. In this figure, the authors begin to investigate possible contributions of candidate cohesin loader receptors, in the form of chromatin remodelling complexes. The Swr1 and INO80 remodellers have an effect on DNA end resection that parallels the effect on Scc2 recruitment, suggesting that their main contribution might be that of facilitating DNA end resection.

      This relationship remains less well documented in the case of Sth1 depletion. Both when using the sth1-3 allele, or degron depletion, the authors observe a relative reduction of cohesin loader recruitment, compared to what they would otherwise expect. However, in both cases a side-by-side analysis of a similarly-treated wild type strain is missing. Whether or not RSC inactivation impacts cohesin loader recruitment therefore remains uncertain.

      In the revised version of the paper we have included experiments where wild-type cells were grown in the same culturing system as the Sth1 degron strain, included as Figure 5A. The best control would be to use the Sth1 degron strain and not degrade Sth1 as the wt control. However the poor growth of these cells in -Met media with raffinose as the sole carbon source is not compatible with the design of this experiment.

      For the experiment including the ts allele of Sth1, it was not possible to keep the wt control arrested in G2 during the course of the experiment. We agree that a comparison with a wt control would be interesting; however, because we lack a proper readout for the impairment of sth1, we decided to omit the data from the ts strain from the manuscript. Based on our results we would conclude that Sth1 inactivation affects Scc2 recruitment through impaired end resection. We deem it unlikely, though, that this is mediated by direct interaction, as has been shown in S-phase.

      It is also not documented what the corresponding effect of RSC inactivation on DNA end resection might be. Given that previous results suggested that RSC might contribute to cohesin loading at DSBs, the nature of how RSC does this could maybe be clarified before publication.

      In the revised version of the manuscript we include RPA ChIP data for the Sth1-degron strain. These show that resection is slightly, albeit significantly, reduced after degradation of Sth1. We believe this to be the explanation for the reduced Scc2 loading in its absence, in line with what is seen in the swr1 and nhp10 deletion mutants.

      Reviewer #2 (Significance (Required)): see above.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): This paper presents data analysing the recruitment of Scc2 to double strand breaks. It makes the interesting observation that its recruitment is Tel1 but not Mec1 dependent, and does not require remodelers (it seems). It does correlate with resection but the mechanism of loading is unclear. I have a few issues on controls and alignment of text with results in this manuscript. Also there is some omission of important recent work and some old studies. But if these points can be resolved it could be published. **Major points:**

      1. The cut efficiency under all conditions tested needs to be presented and the CHIP needs to be normalized in every assay to the cut efficiency. This is particularly relevant in the mutants of remodelers as they definitely influence the efficiency of Gal-HO induction. This must be included for every chip result.

      We agree that the cut efficiency could influence the degree of recruitment, because the strength of the signal from the break affects recruitment of the initial DSB response factors that we show are important for recruitment of Scc2. In the previous version of the manuscript we therefore already showed, in Fig S3C, that the cut efficiency in the chromatin remodeler mutants was comparable to that in WT cells after 3 hours. We have now repeated this type of experiment three times for most strains used in the study and calculated an average cut efficiency for each strain, which is then used for normalization of the ChIP-qPCR results. Alternatively, we have used an RT-PCR based method to quantify the cut efficiency on the actual ChIP samples when available. The average cut efficiency is indicated for each strain in the figure legends in the new version of the manuscript. Normalization of the ChIP data to the cut efficiency does not, in general, change the results or conclusions presented previously throughout the manuscript.
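The normalization described in this reply can be illustrated with a minimal sketch. It assumes that normalization means dividing each ChIP-qPCR enrichment value by the average fraction of cut chromosomes; the reply does not state the exact formula, and the function name and all numbers below are hypothetical.

```python
from statistics import mean


def normalize_to_cut_efficiency(enrichments, cut_efficiencies):
    """Scale ChIP-qPCR enrichment values by the average cut efficiency.

    Assumption (not stated in the reply): normalization is a simple
    division by the mean fraction of cut chromosomes, a value in (0, 1].
    """
    avg_cut = mean(cut_efficiencies)
    if not 0 < avg_cut <= 1:
        raise ValueError("cut efficiency must be a fraction in (0, 1]")
    return [value / avg_cut for value in enrichments]


# Hypothetical numbers: three replicate cut-efficiency measurements and
# enrichment values at three distances from the break.
normalized = normalize_to_cut_efficiency([2.0, 3.0, 1.5], [0.75, 0.80, 0.85])
print(normalized)
```

Under this assumption, strains with a lower cut efficiency have their enrichment values scaled up proportionally, so differences between strains are not simply a reflection of fewer cut chromosomes.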

      The arp8 delta mutant is clearly polyploid and probably has some suppressor mutation or another problem. They should discard the arp8 results and get a proper and controlled arp8 delta strain (from another lab in europe - there are several with good W303 strains).

      We have repeated the Arp8 transformation in different W303 strains, which likewise resulted in polyploidy. Loss of INO80 components has been shown to confer polyploidy in an S288C background, with the loss of Arp8 being an exception. Considering the apparent differences regarding INO80 (the INO80 ATPase subunit is essential in W303 but not in S288C), we deemed it plausible that polyploidization could be a resulting phenotype of an Arp8 deletion in W303. Prompted by the comments put forward here, we have now transformed a clean W303 background wild type strain and indeed see no sign of polyploidy. It could be that polyploidization is a consequence of the presence of the GAL:HO in combination with an extra recognition sequence for HO. We are now preparing crosses to answer this question. Depending on the outcome, these experiments might be added to a final revision of the manuscript. In this version of the manuscript the arp8delta experiments have been removed.

      1. The text does not accurately reflect the results in several places. For instance, on page 10 where the result of the sgs1 exo1 mutant strain is described, it is said that "Recruitment of Scc2 to the DSB was drastically reduced..." and "consistent with long range resection the effect was less prominent closer to the break." First, the word "drastic" is not appropriate for a drop of about 50% (on average), and in reality the drop is more significant near the cut (+1kb) than far from the break (+10 or 30 kb): the data are the opposite of what is stated, and the drop is not drastic. I do not contest that it correlates with resection, if the HO-cut efficiency is equal in all strains.

      We apologize for the discrepancies between the results shown and their description in a few places. We have reworded the results section to reflect the data more accurately. We have also removed the sgs1 exo1 deletion mutant data close to the break, as we have not investigated all mutants in the region closest to the break and thereby lack a comprehensive comparison.

      The results with INO80 and SWR1 are not really compelling: what is the cut efficiency in these strains? Moreover, the "confusion" in the literature is only because people look at different loci and different conditions. INO80 does affect resection (see Van Attikum et al., 2007; and Cheblal A et al., Molecular Cell 2020, for resection assays in wt and mutant strains). And it is very strange that Van Attikum et al., Cell 2004 (the back-to-back paper with Morrison et al., Cell 2004) is not cited. The data on resection are clear in this early work. But it appears that the arp8 mutant used has other mutations and polyploidization, and should clearly be discarded. The Nhp10 impact is a bit controversial, but not that of arp8 with a good strain. The references in general are missing Cheblal A et al., Molecular Cell 2020 for Cohesin recruitment, impact on resection and arp8 impact and ditto. Also missing is Deshpande I et al., Molecular Cell 2017 for RPA-Ddc2-Mec1 interactions. These omissions are strange and in fact create confusion in the ms.

      We would like to thank the reviewer for drawing our attention to some very relevant articles published in the field, which have now been referenced, we hope correctly. In the revised version of the manuscript we have also adjusted the ChIP qPCR results to the average efficiency of break induction.

      **Minor points:** The English usage needs to be corrected in a few places, and figures are not always cited correctly; see page 10 especially, where there is no Figure 3D.

      It is unfortunately not so easy to correct the language without specific examples. We have however gone through the text carefully, and also asked a native English speaker to assess the language, and corrected accordingly. We are sorry for the Figure mistake, this has now been corrected together with a general update of figure numbers based on some modifications of the manuscript structure.

      Reviewer #3 (Significance (Required)): The advance is not groundbreaking but still interesting and worthy of publishing, if proper controls and better referencing can be done.

      We hope that, having related all ChIP qPCR data to the averaged cut efficiency for each strain and edited the discussion to relate it more appropriately to the correct references, both new and older, we have addressed the issues raised and made the case for publication of the study.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This paper presents data analysing the recruitment of Scc2 to double strand breaks. It makes the interesting observation that its recruitment is Tel1 but not Mec1 dependent, and does not require remodelers (it seems). It does correlate with resection but the mechanism of loading is unclear. I have a few issues on controls and alignment of text with results in this manuscript. Also there is some omission of important recent work and some old studies. But if these points can be resolved it could be published.

      Major points:

      1. The cut efficiency under all conditions tested needs to be presented and the CHIP needs to be normalized in every assay to the cut efficiency. This is particularly relevant in the mutants of remodelers as they definitely influence the efficiency of Gal-HO induction. This must be included for every chip result.
      2. The arp8 delta mutant is clearly polyploid and probably has some suppressor mutation or another problem. They should discard the arp8 results and get a proper and controlled arp8 delta strain (from another lab in europe - there are several with good W303 strains).
      3. The text does not accurately reflect the results in several places. For instance, on page 10 where the result of the sgs1 exo1 mutant strain is described, it is said that "Recruitment of Scc2 to the DSB was drastically reduced..." and "consistent with long range resection the effect was less prominent closer to the break." First, the word "drastic" is not appropriate for a drop of about 50% (on average), and in reality the drop is more significant near the cut (+1kb) than far from the break (+10 or 30 kb): the data are the opposite of what is stated, and the drop is not drastic. I do not contest that it correlates with resection, if the HO-cut efficiency is equal in all strains.
      4. The results with INO80 and SWR1 are not really compelling: what is the cut efficiency in these strains? Moreover, the "confusion" in the literature is only because people look at different loci and different conditions. INO80 does affect resection (see Van Attikum et al., 2007; and Cheblal A et al., Molecular Cell 2020, for resection assays in wt and mutant strains). And it is very strange that Van Attikum et al., Cell 2004 (the back-to-back paper with Morrison et al., Cell 2004) is not cited. The data on resection are clear in this early work. But it appears that the arp8 mutant used has other mutations and polyploidization, and should clearly be discarded. The Nhp10 impact is a bit controversial, but not that of arp8 with a good strain. The references in general are missing Cheblal A et al., Molecular Cell 2020 for Cohesin recruitment, impact on resection and arp8 impact and ditto. Also missing is Deshpande I et al., Molecular Cell 2017 for RPA-Ddc2-Mec1 interactions. These omissions are strange and in fact create confusion in the ms.

      Minor points:

      The English usage needs to be corrected in a few places, and figures are not always cited correctly; see page 10 especially, where there is no Figure 3D.

      Significance

      The advance is not groundbreaking but still interesting and worthy of publishing, if proper controls and better referencing can be done.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Cohesin is a key structural component of chromosomes. Amongst its functions, cohesin plays a critical role in ensuring the accurate repair of double stranded DNA breaks (DSBs). Intuitive as this may seem, a number of fundamental open questions remain. One of these questions is, how does the cohesin loading machinery recognise a DSB? This issue is addressed in the present study. The manuscript begins with a well-written introduction into the fields of DSB repair, as well as cohesin. The research aim is clearly laid out. Experiments follow that sequentially investigate known steps of the DSB repair pathway, asking how these steps intersect with the cohesin loading machinery.

      On the positive side, this is a technically very well conducted study (investigating the cohesin loader has proven tricky in many contexts). The study is systematic and explores the known steps during DSB repair for their impact on cohesin loader recruitment. The authors find a surprising separation of function. The DSB pathway up until H2AX phosphorylation and DNA end resection is required for both cohesin loader recruitment, as well as consequently for cohesin loading. The Mec1 checkpoint kinase, in contrast, is dispensable for cohesin loader recruitment but is required for cohesin loading. This suggests that Mec1 supports cohesin loading at a step beyond that of attracting the cohesin loader. The manuscript thus contains important information that will be of interest to a wide range of researchers in the DNA repair and cohesin fields.

      The limitation of the study lies in the fact that the molecular determinant for cohesin loader recruitment to DSBs remains unknown. H2AX phosphorylation and DNA end resection are shown to be prerequisites, but how do these events form a molecular mark that the cohesin loader recognises? And what is this mark? Equally, how does the Mec1 kinase permit cohesin loading additionally to the cohesin loader?

      Specific comments:

      Figure 1. It would be interesting to overlay the Scc2 profile around the DSB with that of Scc1 (obtained previously under similar conditions?), to contrast the loading site with the final cohesin distribution.

      Figure 2. Using the same y-axis scale from 1-4 amongst panels A-D could make evaluation of the data easier.

      Figure 3. Panels A and B contain data that are important to interpret the DNA end resection results shown in Figure S2. Maybe that latter data, which conveys the main conclusion from the figure, could be incorporated within the main figure?

      Figure 5. In this figure, the authors begin to investigate possible contributions of candidate cohesin loader receptors, in the form of chromatin remodelling complexes. The Swr1 and INO80 remodellers have an effect on DNA end resection that parallels the effect on Scc2 recruitment, suggesting that their main contribution might be that of facilitating DNA end resection.

      This relationship remains less well documented in the case of Sth1 depletion. Both when using the sth1-3 allele, or degron depletion, the authors observe a relative reduction of cohesin loader recruitment, compared to what they would otherwise expect. However, in both cases a side-by-side analysis of a similarly-treated wild type strain is missing. Whether or not RSC inactivation impacts cohesin loader recruitment therefore remains uncertain. It is also not documented what the corresponding effect of RSC inactivation on DNA end resection might be. Given that previous results suggested that RSC might contribute to cohesin loading at DSBs, the nature of how RSC does this could maybe be clarified before publication.

      Significance

      see above.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In the manuscript entitled "Recruitment of Scc2/4 to double strand breaks depends on yH2A and DNA end resection", Scherzer et al. study the role of Scc2 in DSB repair in yeast. Scc2 is part of the cohesin loader and it is required for cohesin loading in response to DSB. The authors study the chromatin association of Scc2 by ChIP-qPCR and use genetics to identify factors that affect its recruitment. They show that Scc2 is enriched up to 10 kb from the break site, similar to cohesin and identify MRE, TEL1 and yH2A as important factors for Scc2 chromatin binding. Remarkably, MEC1 that has been shown to regulate cohesin under these conditions is dispensable for Scc2 recruitment. While DNA resection is important for Scc2 recruitment, chromatin remodelers don't play a significant role in it despite numerous reports on their effect on cohesin loading during the cell cycle. The manuscript provides new and important information on cohesin regulation in response to DNA damage.

      Major comments:

      The experiments are done appropriately and contain the required control. The results are presented clearly and with adequate statistics and support the conclusions. The experiments provide valuable information. However, the low resolution of the experimental setup is limiting, and dynamic information of Scc2 binding is lacking. I would agree with the authors that this kind of information may be beyond their scope. However, the absence of this information reduces the overall impact of the manuscript.

      1. ChIP-seq, of at least some of the key experiments, could provide information on the specific Scc2 binding sites and elucidate whether cohesin is translocated from the loading sites or accumulate in its proximity.
      2. It has been suggested that Scc2 and Pds5 are mutually exclusive in cohesin complexes. It would be interesting to check in the current experimental setup (ChIP-qPCR) whether Pds5 mimics the Scc2 pattern.

      Minor comments:

      1. Adding a threshold line to the graphs at fold change = 1 (no enrichment with respect to wild type) will increase their readability.
      2. Fig. 1A- Add times to the schematic. Modify the text to GAL addition/break induction.
      3. Page 9. The authors write: "Cohesin failed to be loaded at the DSB in a mec1Δ background (Fig 3A)". However, the figure shows reduced cohesin binding in mec1delta with respect to the wild type.
      4. Page 10. ".......recruitment to the DSB compared to wild type (Fig 3D).". Should be Fig. 4D.
      5. Figure legend 3. "........Protein samples were taken after 3 hours arrest (G2/M, lane 1),....." The benomyl arrest is referred to as G2 arrest in the text but G2/M arrest in the legend. Consistency is needed.
      6. I suggest presenting the suggested model in a figure

      Significance

      I am an expert in cohesin biology.

      The Scc2-Scc4 complex has been identified as an essential factor for cohesin loading during the cell cycle (Ciosk et al., 2000). This function has been shown to be essential for cohesin's role in the response to DNA DSBs (Unal et al., 2004, Storm et al., 2004). The interplay between Scc2 and cohesin has been studied mostly in the context of the cell cycle. It has been shown that Scc2 activates the ATPase activity of cohesin and promotes its translocation from the loading site. Scc2 and Pds5 are mutually exclusive and their switch suppresses cohesin ATPase activity (Hu et al., 2011, Petela et al., 2011). However, the Scc2-cohesin interplay has been poorly studied in the context of DNA repair. The current work adds valuable information on the factors that recruit Scc2 to the break site and identifies end resection as the key event in this process. This information is novel and important and its contribution to the fields of cohesin and DNA repair should not be overlooked. However, ChIP-seq information could increase the overall impact.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1:

      Summary

      Copy number variations at the 1q21.1 locus, both deletions and duplications, have been associated with neurodevelopmental disease. In particular, deletions of this locus result in a variety of neuronal phenotypes including microcephaly and schizophrenia in varying levels of severity. Duplications of the 1q21.1 locus are often associated with autism and/or macrocephaly.

      In this study Nomura et al. generated 1q21.1 deletion and duplication hESC lines to study the impact of these CNVs on neuronal development. They generated brain organoids and observed a bidirectional effect of this CNV on organoid size, with 1q21.1 deletion lines showing smaller brain organoids, whereas the 1q21.1 dup lines grew larger than controls. This is in line with the micro- and macrocephaly observed in patients. They further analyzed these organoids at the gene expression level using single cell RNAseq and performed some electrophysiological assessment on neurons from dissociated organoids.

      This study is certainly of interest given the association of this locus with NDDs such as autism, epilepsy and schizophrenia. At this stage, the study is mainly descriptive, showing differences between the 1q21.1 del/dup lines and controls but also between the del and dup lines. There is no mechanistic insight provided. For example, the 1q21.1 CNV encompasses several genes, of which some have already been linked to micro/macrocephaly (e.g. NOTCH2NL). More importantly, most of the conclusions drawn by the authors are based on a limited set of experiments/analyses which are not always carefully performed and/or presented. In general, the data presented are premature and therefore do not support the claims/conclusions made by the authors (e.g. the title). This makes the overall impact of this study limited.

      As the reviewer pointed out, NOTCH2NL (both A and B) have been regarded as micro/macrocephaly-related genes (Fiddes et al., Cell, 2018; Suzuki et al., Cell, 2018). In this study, however, we focused on the distal region of 1q21.1 between BP3 and BP4, which contains neither NOTCH2NLA nor NOTCH2NLB, because the target site is thought to be the core region of clinical 1q21.1 microdeletion/microduplication syndrome (Mefford et al., NEJM., 2008; Brunetti-Pierri et al., Nat. Genet., 2008; Van Dijck et al., EJMG, 2015). Although both NOTCH2NLA and B are located outside of our target, these genes are important for human neocortical development and neurogenesis, so we cite these papers (Fiddes et al. and Suzuki et al.) and discuss them in the discussion of the revised manuscript.

      Main comments

      In general, the interpretation of the data is too premature:

      1. The title is not supported in any means by data

      As requested by the reviewer, we have changed the title to “Modeling reciprocal CNVs of chromosomal 1q21.1 in cortical organoids reveals alterations in neurodevelopment”.

      1. Brain organoid size and development: In figure 2 the authors analyzed the development of the organoids. Based on the human phenotype, the deletion would lead to smaller and the duplication to larger brain organoids. The presented data to support these claims are rather scarce. They indeed provide data on organoid size; however, there is no information on how this micro/macrocephaly comes about. Only a limited number of cell types are investigated with immunocytochemistry, which gives little insight into the mechanism. Fig 3. The authors performed some very basic immunostaining and concluded that the neuronal maturity of 1q del seemed to be accelerated, whereas 1q dup decelerated from the NPC stage. However, there is no direct evidence provided for this. With simple additional immunostainings the authors could already get a much better idea of what is going on. For example, the authors could measure the number of differentiating versus proliferating cells, cell cycle exit, etc. (e.g. BrdU, Ki67, pHH3 staining, ...).

      We thank the reviewer for the suggestion. In response, we plan to analyze additional markers such as phospho-histone H3 (pHH3) by immunostaining to evaluate the late-G2/M status. In addition, to explain the smaller organoid size observed in 1q del organoids, we will check apoptosis markers such as cleaved caspase-3 by immunostaining and western blotting.

      Further there are some technical aspect that would need to be resolved:

      There is a general lack of brain organoid characterization of the controls. It is unclear on how many independent clones these experiments were performed.

      We constructed one clone per genotype (1q21.1 deletion (1q del), 1q21.1 duplication (1q dup) and CTRL) from one human ES cell strain (khES-1) by next-generation chromosome engineering using the CRISPR/Cas9 system. In line with the reviewer’s comment, we have added information on each clone, including the actual number of each clone, to the results section. We also recognize the importance of comparing several targeted clones of the same genotype to verify the cellular phenotypes of a targeted clone. However, we consider that isogenic ES cell lines are at least less affected by genetic variation in other regions and by epigenetic changes than patient-derived iPS cells.

      • Fig 2C: it is unclear why brain organoid sizes reduce over time. Is this an indication of increased apoptosis? Did the authors measure this?

      In order to respond to the reviewer’s comment, we plan to examine apoptotic markers such as cleaved caspase-3 by immunostaining or western blotting, as mentioned above.

      • The reason for using a t-test with Bonferroni correction, as opposed to a one-way (or even two-way) ANOVA, is unclear in Fig 2C.

      Analysis of variance (ANOVA) has been regarded as optional when multiple comparisons without F-statistics are performed (Jason Hsu. 1996. Multiple Comparisons: Theory and Methods (Guilford School Practitioner)). We selected the Bonferroni test because we thought it would evaluate our data more strictly than the Tukey-Kramer test. In response to the reviewer’s request, we analyzed our data using one-way ANOVA with the Tukey-Kramer test and confirmed that the statistical significances were consistent (we can provide both data sets if requested). We have changed the description in the figure legend and methods section of the revised manuscript.
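For readers less familiar with the correction discussed here, the Bonferroni adjustment multiplies each pairwise p-value by the number of comparisons and caps the result at 1. The following is a minimal sketch using hypothetical p-values, not data from the study; the function name is illustrative.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: p * m, capped at 1.0."""
    m = len(p_values)  # number of pairwise comparisons
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for the three pairwise genotype comparisons.
raw = {
    "CTRL vs 1q del": 0.004,
    "CTRL vs 1q dup": 0.010,
    "1q del vs 1q dup": 0.600,
}
adjusted = dict(zip(raw, bonferroni_adjust(list(raw.values()))))
print(adjusted)
```

Because every p-value is multiplied by the full number of comparisons, the procedure is conservative, which is consistent with the authors' description of it as the stricter choice.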

      • Fig 2E: it is unclear how the authors came to the conclusion that the dosage-dependent size difference in NPC organoids was caused by the number of cells within an organoid, and not by the size of each cell or by different cell types, since they only measured the number of Sox2-positive cells and used Sox2 to measure cell diameter, whereas Sox2 is mainly expressed in the nucleus.

      We thank the reviewer for the comment. We used images of SOX2 staining because the contrast of each cell in bright-field images was too low to be detected using the fluorescence microscope, BZ-X analyzer, and because cell sizes seemed similar between bright-field images and SOX2 staining images. However, this method was not desirable. To respond to the reviewer’s comment, we have counted the number of cells in the images of each NPC organoid using the BZ-X analyzer and calculated the cell number per 1000 µm². We found that the cell density was not significantly different among the 3 genotypes. We understand that counting the total cell number of a single organoid would be ideal, but this was impossible because each NPC organoid was too small. We have changed Figure 2E, the descriptions in the methods and results sections, and the corresponding figure legend in the revised manuscript.
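The density measure mentioned in this reply (cells per 1000 µm²) is a simple ratio. A minimal sketch, with a hypothetical function name and made-up counts and areas:

```python
def cells_per_1000_um2(cell_count, area_um2):
    """Cell density expressed as cells per 1000 square micrometres."""
    if area_um2 <= 0:
        raise ValueError("imaged area must be positive")
    return cell_count * 1000 / area_um2


# Hypothetical example: 50 cells counted in a 25,000 µm² image region.
density = cells_per_1000_um2(50, 25_000)
print(density)
```

Expressing counts per unit area makes organoids of different sizes comparable, which is the point of the revised Figure 2E as the reply describes it.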

      • How do the authors explain that the Dup cells express neither Tubb nor CTIP2? Do they contain only NPCs and no neurons?

      We consider this finding supports the immaturity in the cortical organoids with 1q21 duplication. However, we have checked only a few markers for intermediate progenitors and mature neurons so far. We plan to examine immature neuronal markers such as DCX and other mature neuronal markers such as NeuN by immunocytochemistry (ICC) to confirm this finding. Similarly, we will perform expression analysis by real-time qPCR to check mature and immature neuronal cell markers.

      In short, the characterization of the brain organoids at the level of general development, cell types, proliferation, differentiation is underdeveloped.

      We will examine the characterization of the brain organoids in more detail by different techniques as described above.

      1. Electrophysiological assessment of brain organoids derived neurons:

      In figure 4 the authors claim that both CNVs (Del/Dup) show hyperexcitability and altered expressions of glutamate system as common features between the Del/Dup lines. The data to support this are however scarce and far from being convincing:

      The poor quality of the data is represented by images in 4B-E:

      • First, the authors chose to dissociate the organoids prior to measuring the cells on MEAs. This takes away the advantage of 3D brain organoids, will add a lot of non-physiological stress, cause cell death and lead to unequal distribution of cells over the electrodes; see Fig 4B.

      We are afraid that the reviewer might have misunderstood our experiment. In this experiment, we used 2-D neurons rather than 3-D brain organoids. Based on established neural differentiation protocols (Fujimori et al., Stem Cell Reports, 2017, Toyoshima et al., Transl. Psychiatry, 2016, Matsumoto et al., Stem Cell Reports, 2016), we seeded single cells dissociated from neurospheres on MEA dishes at the same density (8 × 10⁵ cells per dish) on day 33 and continued culturing them for 28 days on the MEA dish before analysis. Thus, we did not dissociate the cells just before analysis, and we could avoid adding non-physiological stress because the cells were cultured on the MEA dish for 28 days.

      • MEA recordings are meant to measure network activity and are heavily (read: fully) dependent on the network being formed. Cherry-picking electrodes for analysis is not justified; analysis should be performed per MEA chip, not per electrode. Inclusion/exclusion parameters should be defined before analysis.

      In response to the reviewer's request, we have performed the statistical analysis with all chips (electrodes) per genotype. Even though the distributions of firing rate were not consistent among electrodes, we found significant differences between CTRL and each mutant (Ctrl vs 1q del: p < 0.001, Ctrl vs 1q dup: p < 0.001, 1q del vs 1q dup: p = 1.0). We have changed Figure 4E, the descriptions in the methods section, and the corresponding figure legend in the revised manuscript. We also reanalyzed burst rates so that all electrodes were included in the statistical analysis. We have changed supplementary Figure 3 and edited the descriptions in the methods and the corresponding figure legend in this revised manuscript.
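The reviewer's request to analyze per MEA chip rather than per electrode can be sketched as follows: every electrode is kept, and rates are averaged within each chip before genotypes are compared. This is only an illustration with hypothetical chip identifiers and firing rates; it is not the authors' analysis pipeline.

```python
from collections import defaultdict
from statistics import mean


def mean_firing_rate_per_chip(records):
    """Average firing rate (spikes/min) over all electrodes of each chip.

    `records` is an iterable of (chip_id, spikes_per_min) pairs, one pair
    per electrode. No electrode is excluded.
    """
    by_chip = defaultdict(list)
    for chip_id, rate in records:
        by_chip[chip_id].append(rate)
    return {chip_id: mean(rates) for chip_id, rates in by_chip.items()}


# Hypothetical electrode-level data for two chips.
electrodes = [
    ("ctrl_chip1", 1.0),
    ("ctrl_chip1", 3.0),
    ("del_chip1", 10.0),
    ("del_chip1", 14.0),
]
print(mean_firing_rate_per_chip(electrodes))
```

Treating the chip as the unit of analysis avoids counting electrodes from the same culture as independent samples.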

      • MEA parameters such as mean firing rate (spikes/min) and burst rate are very sensitive to plating conditions, especially the number of cells and the clustering of cells around electrodes (see Fig. 4B). Given that the organoids already differ in size and, according to the authors, in cell number, but also in the amount of starting NPCs, one can expect very different cell densities/cell types per experiment/genotype. The authors should therefore show the matching cell culture images for every genotype. Also, with regard to the claims made about GABAergic neurons, the cell type composition at the time of the MEA recording should be characterized for every genotype.

      As mentioned above, in the MEA analysis, we used 2-D neuronal culture and seeded cells on each chip at the same density. The distribution patterns of cells were similar among the 3 genotypes. We will show the images of cultured neurons from the 3 genotypes in the revised figure. As for the cell type composition, we plan to examine the expression of GABAergic markers using RNA extracted from neuronal cells at around 28 days post-dissociation (dpd). As reviewer #2 suggested, we also consider that drug treatment with bicuculline in this MEA system would be meaningful. We plan to perform this experiment if the experimental conditions can be optimized.

      • Fig 4B illustrates the points made above. The fact that no activity is observed in the control cells can be due to many different reasons: unequal plating, stress after dissociating cells, poor coverage of the electrodes, poor maturation, too early a measuring time point, etc. Because the authors have no control over the amount of cells covering the electrodes, the data presented here carry very little information. Fig 4B best illustrates this, with large cell clumps and areas without cell bodies. Measurements from these cell cultures are irrelevant and no conclusion can be drawn.

      We suggest that the authors first benchmark this technique with their own differentiation protocol, show robust and reliable recordings on control cells, and only compare to the CRISPR lines at a time point at which the control cells show a decent amount of activity (1 Hz). When doing so, reduced activity can also be monitored (for examples see Trujillo et al., Cell Stem Cell 2019 or Frega et al., Nat Commun 2019).

      As mentioned above, we seeded dissociated neurospheres in equal numbers on MEA dishes and kept culturing the neurons gently for 28 days before analysis. Cell distribution was similar among the 3 genotypes, and we could observe cell bodies in the area outside aggregates (we will provide additional bright-field images in the revised manuscript later). Low activity in CTRL neurons at 28 dpd was observed even in electrodes covered with dense cells, and this was consistent among 3 independent experiments as described above. Nonetheless, we agree with the reviewer that culture conditions under which even CTRL neurons show stable activity would be more desirable. We have already tried longer cultures three times, but we could not perform sufficient analyses because the neuronal cells became unhealthy after 35 dpd. We will try to improve the experimental conditions and perform the analyses if the conditions can be optimized.

      • MEAs measure the output of the network (action potentials). In a network, this can be influenced by virtually every neuronal property (morphology, synaptic input, types of synapses, intrinsic excitability, etc.). Therefore, the authors cannot conclude based only on Fig 4E that the Del/Dup cells are intrinsically hyperactive. To draw this conclusion, they should measure this directly by assessing the passive and active intrinsic properties of individual neurons.

      In the control condition, many electrodes do not give any signal. From these experiments it is impossible to know whether this is because of a lack of cells on the particular electrode or a real absence of activity. Certainly one could not conclude that the del and dup cells are intrinsically hyperexcitable.

      As described above, we observed similar cell distributions among the 3 genotypes. However, as the reviewer mentioned, assessing the activity of individual neurons would be better. Thus, we will perform patch-clamp recordings in addition to the MEA analysis.

      It seems from the introduction that the authors try to link 1q21 CNVs to epilepsy and ASD, thereby justifying the observed phenotypes.

      • How do the authors reconcile the fact that a more mature GABA system is observed in the Del lines with the so-called increased activity compared to controls but not to the Dup lines?

      We assumed that cell type compositions differed between 1q del and 1q dup, although network hyperexcitability was observed in both mutants. We agree that this assumption lacks sufficient evidence even though we have shown the scRNA-seq results (Figure 6E). We consider that checking cell type compositions is needed to confirm this. Although mature GABAergic neurons were increased in 1q del lines as mentioned by the reviewer, we think GABAergic signals and unknown factors such as epilepsy-associated genes (e.g., GRIN2A and SCN1A) may be involved in the abnormal neuronal firing. We will check the expression of these genes and examine the expression of GABAergic markers in neuronal cells.

      Single cell RNAseq

      • I'm not a specialist on single cell RNAseq; however, it seems that the analysis is underdeveloped and the conclusions drawn from these experiments are premature. It would be essential to validate some of the generated hypotheses, e.g. GABA maturity, and not merely state them as a conclusion (e.g. in the title).

      We thank the reviewer for the suggestion. We have revised the title as we mentioned above, and we will revise the main text based on our results appropriately.

      • How do the authors explain that a majority of the cells are glial cells at day 27, with no neurons present?

      On day 27 in our 3-D organoid protocol, cells were still at a developmental stage. That is why we consistently described them as “NPC organoids” rather than “brain organoids” in this paper. Indeed, our rationale for the scRNA-seq study was to determine the gene(s) or gene regulatory network(s) involved at the time point when the difference in circumference was significant among genotypes (Fig. 2C). Although the underlying mechanism could not be fully determined from our results, we interpret this result as follows. Radial glial cells (RGs) have the ability to self-renew through symmetric divisions and play a role in both neurogenesis and gliogenesis (Lui et al. Cell 2011, A Kriegstein et al., Annu Rev Neurosci 2009). A recent study showed that the reduction of NF1, a tumor suppressor protein in the RAS/MAPK pathway, induced excessive production of glial cells, i.e., mainly oligodendrocyte precursor cells (OPCs) accompanied by astrocyte precursor cells, from RGs; furthermore, the reduction of NF1 also enhanced the cell divisions of the generated OPCs (Z Shen, BioRxiv 2020). We have confirmed that the expression of NF1 in the glial cluster was also downregulated in our scRNA-seq data. Thus, we reasoned that the predominance of 1q dup cells in the glial cluster reflected the excessive production of glial cells from RGs, which was related to the alteration of the RAS/MAPK pathway. We will add this interpretation in the revised manuscript next time.

      • How relevant are the changes in the extremely low amounts of GABAergic neurons in the Del cells, given that no excitatory neurons are present, only NSCs?

      In a previous paper, CA Trujillo et al. showed the cell type composition in 3-D human cortical organoids at different time points. GABAergic cells were restricted to later stages and their ratio was still very limited at 6 months (Figure 1J in CA Trujillo et al., Cell Stem Cell 2019). For this reason, we regarded the emergence of GABAergic neurons as meaningful even if the ratio was very low. As for excitatory neurons, we will further check the expression of excitatory neuronal markers. (According to the screening chart we used, we did not explore excitatory neuronal markers further because the cells did not express SLC17A7 significantly.)

      Minor comments

      • It is unclear how many clones were assessed per genotype

      We constructed one clone per genotype. As we mentioned above, we have added the information in the results section of this preliminary revised manuscript.

      • The authors should properly annotate the genotypes 1q21.1 instead of 1q del (line 134)

      We have already annotated the abbreviations of 1q21.1 deletion and duplication in lines 87 and 93.

      • The introduction seems to be somewhat off topic, since the 1q21.1 locus is associated with several neurodevelopmental disorders, including SCZ, but is certainly not specific to ASD and epilepsy. So the premise on line 86, to study the 1q21.1 locus to understand ASD/epilepsy, is somewhat misleading. I propose that the introduction be focused on 1q21.1 and not on ASD/epilepsy in general.

      As the reviewer pointed out, 1q21.1 CNVs are associated with other neurodevelopmental and neuropsychiatric disorders. Since our research aims to elucidate the underlying mechanism of ASD, we mainly focused on two representative comorbidities (abnormal brain size and epilepsy), which seemed relatively reproducible in vitro. However, we agree with the reviewer that the lack of information about clinical symptoms of 1q21.1 microdeletion and microduplication syndrome besides ASD was not appropriate. Thus, we will revise the introduction to mention the neurodevelopmental phenotypes of 1q21.1 CNVs in the revised manuscript next time.

      • It is unclear whether they generated heterozygous or homozygous deletions.

      We thank the reviewer for pointing it out. We have generated clones with heterozygous deletion and duplication. We have added the information in the results section of this revised manuscript.

      • The authors should cite Fiddes, I. T. et al. Human-Specific NOTCH2NL Genes Affect Notch Signaling and Cortical Neurogenesis. Cell 173, 1356-1369.e22 (2018).

      As the reviewer suggested, we will cite two papers regarding NOTCH2NL (NOTCH2NLA: Fiddes, I. T. et al., Cell 173, 2018; NOTCH2NLB: Ikuo K Suzuki et al., Cell 173, 2018) when we discuss the alteration of neuronal maturity and brain size. We will add the information in the revised manuscript next time.

      • Many unclear statements eg line 138: Next, we analyzed each single-cell in an organoid

      We thank the reviewer for noticing it. We have made an effort to remove inappropriate sentences in this revised manuscript.

      • Discussion on E/I is very speculative, not supported by any evidence

      In response to the reviewer’s suggestion, we will remove the overly speculative descriptions from the discussion section of the revised manuscript later.

      Significance

      The general topic of this study is of high interest given the strong association of 1q21.1 with disease. The authors developed an interesting ESC line to study deletion and duplication in parallel. Unfortunately, the level of analysis performed on these organoids is not up to the current state of the art, the experiments are of low quality, and the analyses are limited. Therefore no clear conclusion can be drawn except for the size of the organoids, and very little mechanism is provided. This therefore remains a purely descriptive study for which the presented data are of rather low quality and limited impact in its current shape.

      We thank the reviewer for the interest and criticism of our paper. As discussed above, we plan to perform additional analyses and experiments to justify our hypothesis more clearly and try to meet the reviewer’s requests.

      Reviewer #2

      This study was initiated to look at specific cellular and molecular mechanisms of the duplication and deletion CNVs frequently observed at the 1q21.1 gene locus in an isogenic human embryonic stem (hES) cell model. The authors note that these CNVs are associated with higher than normal penetrance of ASD and epilepsy and aim to elucidate gene expression differences with single cell RNAseq and functional changes in this model system. The authors further sought to assess proliferation and differentiation states, in addition to neuronal activity, using both 2D cultures and 3D organoid models. The 1q21.1 gene locus model system made here is unique and the results broadly recapitulate the patient phenotype, particularly with observations of macrocephaly in the "1q dup" and microcephaly in the "1q del".

      Reviewers statement:

      We have joint expertise in GABAergic neuronal development, iPSC 2D and 3D culture and ASD human molecular genetics.

      Major comments:

      • Not sure why ASD (if used it should also be spelled out) is mentioned in the title if ASD is only seen in a proportion of human 1q21.1 duplication (~36% will have autism) and 1q21.1 deletion (<10% will have autism) carriers. I would prefer to use 'neurodevelopmental phenotype'. A good updated review that is accurate with respect to the role of this CNV in autism is PMID: 29398931. The authors should also put their results into the context of what is known about other neuropsychiatric phenotypes also seen in these CNV events.

      We thank the reviewer for the suggestion and valuable information. We have corrected the title in the revised manuscript this time. We will also refer to the paper by Fernandez and Scherer (Dialogues Clin. Neurosci., 2017) to discuss the detail of roles and neuropsychiatric phenotypes of targeted CNVs.

      • In Fig 1D the ddPCR validation for the genetic alterations in 1q del shows a normal return to 2 copies of GPR89B. However, in the 1q dup the CNV level is still elevated for GPR89B. Please determine how much further the duplication goes as there are five more potentially affected genes in this region (eg PDZK1P1). Modify the text appropriately to note the potential influence of any of these other genes on the experimental outcomes.

      We thank the reviewer for pointing this out. Figure 1D showed the results of aCGH analysis to confirm the copy number alteration of the targeted region in each clone. In this analysis, the target region was expected to contain GPR89B, as confirmed by PCR shown in Fig. 1B. However, as the reviewer commented, the cleavage sites shown in Figure 1D do not seem consistent with the result of Fig. 1B. We think this reflects a limitation of the microarray-based CGH technique. Since the locus between GPR89B and LOC101927468 contains extensive repeat sequences, aCGH may not be an appropriate method. Thus, we will apply quantitative PCR (or ddPCR) to determine the copy number alteration of each clone in addition to microarray-based CGH.

      • The authors claim that dosage-dependent size differences in NPC organoids are caused by a change in the number of cells within the organoid rather than size. From Fig. 2D, cells in the 1q del organoid appear more compact; a quantification of cell number should be done to support this claim. IHC of D27/28 organoids with GABAergic markers would support the authors' claim of alterations of GABAergic components in 1q del cells. These suggested experiments would take 2-3 days if the organoids are available.

      In response to the reviewer’s suggestion, we have counted the number of cells in the images of each NPC organoid using the fluorescence microscope and BZ-X analyzer, and calculated the cell number per unit area (1000 µm²). We found that the cell density was not significantly different among the 3 genotypes. We have changed Figure 2E, the descriptions in the methods and results sections, and the corresponding figure legend in this revised manuscript. As for exploring GABAergic components in the NPC organoids, we plan to perform immunocytochemistry (ICC) and RT-qPCR analysis.
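
      The normalization described above (cell number per 1000 µm²) can be illustrated with a minimal sketch. The function name and the example counts and image area are hypothetical and are not taken from the manuscript; only the 1000 µm² unit area comes from the reply.

```python
def cells_per_unit_area(cell_count, image_area_um2, unit_area_um2=1000.0):
    """Return the number of cells per unit area (default: per 1000 µm²)."""
    if image_area_um2 <= 0:
        raise ValueError("image area must be positive")
    return cell_count * unit_area_um2 / image_area_um2


# Hypothetical example: 450 nuclei counted in a 300 µm x 300 µm region.
density = cells_per_unit_area(450, 300 * 300)
print(f"{density:.2f} cells per 1000 µm²")
```

      Normalizing by area in this way lets organoids of different sizes be compared on cell density rather than on total cell count.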

      • Fig 4 E shows MEA data from "top 10". What is the top ten? Do you mean data points? There are batch differences in 1q dup with one batch having a lower expression than the other. Increasing the n value to accommodate the high variance observed in this group will greatly increase the validity of the data generated. Also, change the figure legend to indicate the age of these cultures. Given that the controls are not spiking, this data should be extended to probe the developmental profile further to week 9 when normal cells should be spiking so that the baseline activity of this isogenic line can be determined.

      Top 10 meant the ten electrodes with the highest spike rates within one MEA dish. To respond to the reviewer’s suggestion, we have performed statistical analysis with all electrodes per genotype. Even though the distributions of firing rate were quite heterogeneous among different electrodes, we found significant differences between CTRL and each mutant per MEA dish. We have changed Figure 4E, descriptions in the methods section, and the corresponding figure legend in the revised manuscript this time.

      The reviewer is correct that the spike rates in 1q dup were quite different between different batches. We noticed from our experiments that spike rates were easily affected by the health conditions of cells. Some mutant batches showed mild spike activities like circles in 1q dup, and some had very vigorous activities. We have even checked the reproducibility of significant differences between CTRL and each mutant per MEA dish with 3 independent experiments. As for the extended cultures to detect more frequent signals in CTRL neurons, we have already tried longer cultures three times. However, we could not perform sufficient analyses because neurons became unhealthier after 35 dpd. We will further try to improve the experimental setup and perform analyses if the experimental conditions could be optimized.
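
      The difference between the original "top 10" summary and the all-electrode analysis described above can be sketched as follows. This is only an illustration: the electrode count per dish (64 here) and the spike rates are hypothetical values, and the function names are not from the manuscript.

```python
from statistics import mean


def top_n_mean(rates, n=10):
    """Mean spike rate over the n electrodes with the highest rates."""
    ranked = sorted(rates, reverse=True)
    return mean(ranked[:n])


def all_electrode_mean(rates):
    """Mean spike rate over every electrode on the dish, silent ones included."""
    return mean(rates)


# Hypothetical dish: 54 silent electrodes and 10 active ones (spikes/min).
rates = [0.0] * 54 + [6.0, 12.0, 18.0, 24.0, 30.0, 36.0, 42.0, 48.0, 54.0, 60.0]

print("top-10 mean:       ", top_n_mean(rates))
print("all-electrode mean:", all_electrode_mean(rates))
```

      Because the top-10 summary ignores silent electrodes, it can overstate activity on a sparsely active dish, which is one reason an all-electrode analysis is the more conservative choice.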

      • Single cell RNAseq data suggests a cluster of GABAergic cell types that are appearing in the 1q del condition, but not in the 1q dup or control groups. The authors suggest that these GABAergic cells are excitatory because the chloride gradient has not yet been altered (no change to KCC2 expression). The authors should substantiate this idea in the MEA system with bicuculline treatment to block GABAergic transmission (drug washed in and out) to show that the spike activity observed in the 2D MEA experiments is due to GABAergic excitatory transmission. Ideally, this should be done for both the 1q dup, 1q del as well as controls.

      We thank the reviewer for the suggestion. We agree with the reviewer that drug treatment with bicuculline in this MEA system would be meaningful for identifying cellular properties. We will try to set up the experimental conditions and perform this experiment if the conditions can be optimized.

      • Fig 5A. The clustering method for single cell RNAseq seems to show a large proportion of "other" class cells, begging the question as to what they are. Is there another cluster analysis that might be used, e.g. partially supervised/unsupervised clustering methods from the Allen Institute, to help determine what these might be?

      We initially made the screening chart for cell-type specification according to cellular markers from the Allen Brain Map (http://celltypes.brain-map.org/rnaseq/human_ctx_smart-seq) and a published paper (CA Trujillo et al., Cell Stem Cell 2019). We defined this cluster as “other” because it did not have any significant genes in the first screening, although we understand that specifying all clusters would be desirable. To investigate the cellular properties of this cluster, we put its significant genes into Metascape to check gene ontology. We found some terms related to immune cells (mainly lymphocytes and macrophages), cancer cells, inflammation, and the apoptotic process, although miscellaneous terms were also included. We have provided the screening chart as supplementary Table 4 in this revised manuscript. Next time, we will add a more detailed description of the ‘other’ cluster in the revised manuscript.

      • Fig 5 B. The manuscript requires additional markers used in the cluster analysis. Particularly, expression of the GABAergic progenitor markers DLX5 and 6 as well as EMX1 for the progenitor cells. Details of all markers and cluster algorithms should be made available in supplementary tables and R scripts, so that others can repeat this analysis.

      In response to the reviewer’s suggestion, we will check these GABAergic progenitor markers and add them to the revised figure and manuscript later. As we mentioned above, we performed the cell type specification of each cluster manually using our screening chart and did not use R scripts. We have provided the information on the screening process in supplementary Table 4 of this revised manuscript.

      • Fig 6. Expanding the heat map of 1q del and 1q dup with CTRL expression would help with context for baseline levels in this isogenic cell line. Please also include the additional GABAergic markers GABRA1, GABRB2 and GABRG2 (subunits of the most common GABA-A receptor), SOM, VIP, NPY (other GABAergic interneurons in addition to PVALB), DLX6 and EMX1, and the excitatory markers GRIA2, GRIA3 and GRIA4 (all of which have developmentally regulated expression patterns), which will provide more context with the synaptic receptor literature. GRIN2D is expressed only in GABAergic cell types and so I would suggest including this NMDA receptor subunit as well.

      We thank the reviewer for the valuable suggestions. To further explore the cellular properties in 1q del and 1q dup, we will check these cell markers additionally and show the results in the revised figure and manuscript next time.

      Minor comments:

      1. Additional references (eg. Schafer et al. 2019) should be discussed in relation to the authors' suggestions of altered neuronal maturity.

      As the reviewer suggested, we will include the paper in our references and discuss the associations between neurodevelopmental disorders and altered neuronal maturity.

      2. The authors show no change in PAX6 expression between genotypes, but significant differences in TBR2 expression between genotypes (Fig. 2C) - this alteration in normal cortical development should be included in results and discussed.

      Radial glial cells (RGs) are capable of both self-renewal and neurogenesis (Lui et al. Cell 2011, Fiddes, I. T. et al., Cell 2018). Fiddes et al. showed that if the balance leans toward neurogenesis, premature differentiation with higher TBR2 expression is observed in week 4 human cortical organoids (Fiddes, I. T. et al., Cell 2018). The predisposition to neurogenesis is thought to cause an earlier shortage of RGs eventually; however, these cells still remain abundant in week 4 organoids. We considered this to be why TBR2 expression was significantly different in 1q del, but PAX6 expression was not. We will add this interpretation in the revised manuscript next time.

      3. In the introduction (Line 67): The authors state that "alterations in brain size is common in patients with ASD" using one meta-study to support this claim. Further primary studies should be consulted and the authors should give the proportion of the population with ASD and altered brain size to support this statement. In addition, the age range should be supported with primary papers.

      As the reviewer suggested, we have cited some primary studies about the prevalence of altered brain size in ASD patients and its age range in this revised manuscript. Since it seems still controversial whether the enlargement of brain size persists or not until adolescence and adulthood (E H Aylward et al., Neurology 2002; J Piven et al., Am J Psychiatry 1995), we have also modified the description in this manuscript.

      4. Line 73. The authors suggest that the brain growth deviations are "Postnatal stage restrictive". Citations are needed to support this statement.

      As the reviewer suggested, we have cited some primary studies as described above and revised the manuscript.

      5. In the scRNAseq data results please report total cell numbers counted for each cluster and for genotype group.

      We apologize for the lack of information and thank the reviewer for noticing it. We have added the information in the results section of the revised manuscript this time.

      6. In the results section (lines 269-270) the authors suggest that 1q del cells are in a more mature state because GABAergic cells are present and glutamatergic genes are similarly altered in 1q dup and 1q del. However, the results from the gene cluster data suggest that there is a very high proportion of progenitor cells (Progenitor 1 and 2 clusters), which seems to argue against faster maturation. This suggests to me that cell fate is being modified here.

      We thank the reviewer for the valuable suggestion. Schafer et al. (the suggested paper in minor comment 1) reported that altered gene expressions in neuronal modules have already been observed in NSCs derived from ASD patient-derived iPSCs. As the reviewer suggested, we plan to consider our results in terms of the alteration of cell fate and neuronal maturity in the revised manuscript later.

      7. Label figures on each page for ms.

      As the reviewer suggested, we have labeled figures at the bottom right of each page.

      8. Fix typos and heat map legends (currently no colors for log2 fold change in Fig 5 or 6)

      We apologize to the reviewer for typos and grammatical errors. We made an effort to remove them. We also apologize for the lack of color information in the legends of Figure 5 and Figure 6 and thank the reviewer for noticing it. We have added the color information in the figure legends of the revised manuscript this time.

      Significance

      Overall the study is clearly described, and the outcomes have been substantiated to a certain degree, but requires a bit more work. This paper does represent a technical 'tour de force' and the authors should be applauded for sticking it out where other labs have so far failed. It might be useful to mention even in brief, of the number of 'failed' (failed or inaccurate) events. The availability of the lines should also be clearly stated.

      We thank the reviewer for the positive comments. In addition to the plans described above, we have added more detailed information, e.g., how many screenings were carried out to get positive clones, in the revised version of the methods and results section. We have also added the descriptions about the availability of the 1q21.1 CNV cell lines in the data availability section of this revised manuscript.

      Reviewer #3

      In this research study by Nomura et al., the authors develop novel hESC-based models of reciprocal CNVs in distal 1q21.1 using CRISPR/Cas9 genome editing technology. Specifically, the authors genome edit KhES-1 cells to produce two isogenic hESC lines that contain either a deletion or duplication of this chromosomal region. Patients with 1q21.1 deletion and 1q21.1 duplication syndromes show abnormal head size in conjunction with multiple neurodevelopmental co-morbidities such as epilepsy, developmental delay, and neuropsychiatric abnormalities. This is an important study since it provides robust research tools to understand the molecular and cellular mechanisms that may underlie these syndromes. Through generation of cortical organoid models, the authors demonstrate that 1q21.1 deletion and duplication organoids show deficits in growth and over-growth, respectively. Additionally, the authors provide data that 1q21.1 deletion and duplication organoids show altered signaling cascades which may underlie growth deficits, and also abnormal neurodevelopment which may underlie hyperexcitable neurons as demonstrated by multi-electrode array analysis. While my enthusiasm for this study remains high, I do have a significant number of major and minor reservations specific to the experimental design and analysis that, if addressed, would provide for an excellent contribution to the field.

      Major concerns:

      1. Though the authors provide extensive data in this study, major revisions are necessary to interpret all of their data in the context of the phenotypes they are observing in organoids and MEA analyses. In addition, the current study lacks cohesiveness throughout the various experiments and does not provide text that clearly unifies the results of the study. For example, no interpretation of higher TBR2 levels in 1q21.1 deletion is provided. Does this mean these organoids show accelerated neuronal differentiation? Also please see my comment regarding TBR2 staining in the next section.

      There are other examples throughout the manuscript in which there is no clear interpretation of the data or in which the results of the experiments are inadequately unified.

      We thank the reviewer for pointing out that our manuscript lacked integrity and cohesiveness across our data. With the additional data described below, we plan to improve these issues in the revised manuscript later. As for TBR2 expression, we considered that the higher TBR2 expression in week 4 human cortical organoids indicated a predisposition to neurogenesis in 1q del, as demonstrated in a previous paper (Fiddes, I. T. et al., Cell 2018). We will add this description in the revised manuscript later.

      • a. Additional interpretation of why 1q21.1 duplication organoids show increased growth is lacking. The single cell RNA sequencing results show there are more glia, but no further interpretation is given as to why these organoids show an overgrowth phenotype. Conversely, the 1q21.1 deletion organoids show more progenitor cells, but it is not apparent why this should result in decreased cell growth.

      As we have mentioned above, we considered that the predominance of 1q dup cells in the glial cluster reflected the excessive gliogenesis from radial glial cells and enhanced cell divisions in relation to the alteration of the RAS/MAPK pathway (Z Shen, BioRxiv 2020). We plan to analyze additional markers related to cell proliferation and cell division by immunostaining to validate the above hypotheses. To investigate how 1q del organoids showed smaller size, we plan to examine apoptotic markers such as cytochrome C and caspase 3 by culturing NPC organoids again.

      • b. The authors suggest that 1q21.1 duplication organoids are resistant to neuronal differentiation. What data support this hypothesis other than the fact that no mature neuronal cells are present in their single cell RNA sequencing data?

      We considered that the results in Figure 3B and Figure 3D, in which 1q dup organoids showed lower intensity of neuronal markers, also supported this hypothesis. Since we have only checked a few markers by immunocytochemistry (ICC), we plan to examine additional markers, i.e., immature neuronal markers such as DCX and other mature neuronal markers such as NeuN, as well as proliferation markers such as phospho-histone H3, to test this hypothesis.

      • c. The MEA analyses show hyperexcitability in both 1q21.1 deletion and duplication cultures. Since the authors suggest 1q21.1 duplication organoids are resistant to neuronal maturation, no interpretation is given why they show hyperexcitable phenotypes.

      In the MEA analyses, we used not 3-D cortical organoids but 2-D neurons because the required culture period to emit electrical activities was thought to be much shorter in 2-D neurons according to some previous studies with human pluripotent cells (A Taga et al., Stem Cells Transl Med 2019; CA Trujillo et al., Cell Stem Cell 2019). We considered that 2-D neurons on 28 dpd (day 63) had much higher maturity than NPC organoids and even 1q dup neurons had already become mature enough to emit spike activities. We will also check neuronal marker expressions using 2-D neurons around 28 dpd by RT-qPCR to ensure this.

      • d. The current study is lacking extensive immunohistochemical stains of representative markers that validate their findings from their single cell RNA sequencing experiments. For example, glial cell markers such as GFAP should be analyzed in 1q21.1 duplication organoids. Additionally, progenitor cell markers such as PAX6 and neuronal markers such as MAP2 and synaptic markers such as SYNAPSIN and others should be incorporated in the study.

      We thank the reviewer for the suggestions. We plan to perform additional IHC staining for NPC organoids with the suggested markers and OPC markers.

      2. Major details are lacking for the single cell RNA sequencing experiments.
      • a. How many cells were analyzed from each group? How many organoids and what age of organoids were analyzed from each group, and were they pooled together? Why was a log2FC >1.2 used as a threshold? It is unclear how the authors identify Progenitor 1 and 2 cell clusters. Are they distinct clusters or is this a continuum of differentiation? The progenitor 1 and 2 clusters were chosen based on expression of the ID transcription factors, but no text was provided why these genes specify progenitor cells.

      We apologize for the lack of information and thank the reviewer for noticing it. We described the number of analyzed cells (32,171 cells in total: 1q del, 10,682; 1q dup, 11,987; CTRL, 9,502) in the results section (line 186) of the original manuscript. However, we could not count how many organoids were analyzed because they were too small (diameter: 400-700 µm). Many organoids were needed to obtain the prescribed number of cells (25,000 cells per genotype). According to the size measurement data for NPC organoids obtained by fluorescence microscopy, at least 1,500 organoids were collected per genotype. We gathered all cultured organoids in the same batch, dissociated them, and then loaded the prescribed number of cells into the machine. We have added the description of the number of input cells in the methods section of this revised manuscript.

      We used the threshold of |log2FC| > 1.2 so that the total number of DEGs was around 100-1000 in both the bulk data and the NSC cluster, avoiding a very high or very low number of DEGs. Some previous transcriptome studies used the same or even smaller thresholds (Xiaoming Ma et al., Front Genet 2020; J Zhong et al., Brain Res 2016; Y Wang et al., BMC Genomics 2016). We have added these descriptions in the methods section of this revised manuscript.
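
      As an illustration only, the thresholding described above can be sketched as follows. The gene names and fold-change values are hypothetical and are not data from the study; the sketch assumes a strict cutoff on the absolute log2 fold change, and a real analysis would typically also apply a significance cutoff, which is not modeled here.

```python
# Minimal sketch of DEG selection by absolute log2 fold change.
# All gene names and values below are hypothetical, for illustration only.
LOG2FC_THRESHOLD = 1.2


def filter_degs(log2fc_by_gene, threshold=LOG2FC_THRESHOLD):
    """Keep genes whose |log2FC| is strictly greater than the threshold."""
    return {
        gene: fc
        for gene, fc in log2fc_by_gene.items()
        if abs(fc) > threshold
    }


def count_degs(log2fc_by_gene, threshold=LOG2FC_THRESHOLD):
    """Return (number of upregulated DEGs, number of downregulated DEGs)."""
    degs = filter_degs(log2fc_by_gene, threshold)
    up = sum(1 for fc in degs.values() if fc > 0)
    down = len(degs) - up
    return up, down


example = {
    "GENE_A": 2.5,   # upregulated, passes
    "GENE_B": -1.3,  # downregulated, passes
    "GENE_C": 0.8,   # below threshold
    "GENE_D": -0.4,  # below threshold
    "GENE_E": 1.2,   # exactly at threshold, excluded by the strict cutoff
}

print(filter_degs(example))
print(count_degs(example))
```

      Raising or lowering the threshold in such a filter directly changes the number of genes retained, which is the trade-off referred to above.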

      As for progenitor-1 and 2, we regarded them as a continuum based on the marker expressions. We chose ID transcription factors for progenitor cells, referring to a published paper (CA Trujillo et al., Cell Stem Cell 2019) as we have described in the methods section (line 633). Several articles have reported that ID transcription factors regulate proliferation and differentiation of neural precursor cells (K Yun et al., Development 2004; D Patel et al., Biochim Biophys Acta 2015).

      Minor concerns:

      1. I would suggest rephrasing the title of the study as it does not clearly convey the advancement to the field. I would suggest the following or something similar that is more concise: "Modeling Reciprocal CNVs of Chromosomal 1q21.1 in Cortical Organoids Reveals Alterations in Neurodevelopment."

      We thank the reviewer for the concrete suggestion. We have revised the title as the reviewer suggested in this preliminary revised manuscript.

      2. The length of the discussion is over extended and should be revised to become more concise.

      We thank the reviewer for pointing it out. We will shorten the beginning part and delete unnecessary sentences in the discussion section of the revised manuscript later.

      3. Additional experiments should be performed to characterize pluripotency of hESC clones generated after genome editing other than staining for alkaline phosphatase activity.

      At minimum, karyotyping in addition to measuring pluripotency markers such as NANOG and OCT3/4 should be performed.

      Karyotyping of wild-type ES cells was checked by the Institute for Frontier Medical Sciences, Kyoto University before the cells were provided. After genome editing, we performed aCGH analysis for all 3 genotypes using the wild-type ES cells as the reference and confirmed that no chromosome aberrations were generated. We have added the information about karyotyping in the methods section of this preliminary revised manuscript.

      As for pluripotency markers, we performed RT-qPCR analyses with ES cells after genome editing and confirmed that OCT3/4 was more highly expressed than internal control genes. (We can provide the raw data if requested.)

      4) There are several dozen instances of spelling/grammatical and word choice errors throughout the manuscript. For example, line 24 reads "We generate isogenic..." should read "We generated isogenic... "

      • a. Line 25: "opposite organoid size" as written is confusing to interpret.
      • b. Line 46: "have been considered in the context of ASD" would read more clearly as "have been thought to underlie ASD etiology."
      • c. Line 53: "in the study of neurological development" should read "nervous system development".
      • d. Line 118: ".. to detect the CRISPR target site for deletion" should read "to detect the CRISPR target site. For the deletion, we checked... "
      • e. Line 119: "...flanking the CRISPR target site; for duplication, we amplified..." should read "flanking the CRISPR target site, and for the duplication, we amplified...".
      • f. Line 127: "we prepared control cells (CTRL) that transfected...." should read "we prepared control cells (CTRL) that were transfected....".
      • g. Line 185: "organoid size and mature level" should read "organoid size and developmental maturity."
      • h. In line 40, "We made cryosections of ...." should read "We performed IHC for the three organoid genotypes on day 27...."
      • i. In Supplementary Figure 8, line 554, "replictes" is misspelled.

      We apologize to the reviewer for the many typos and grammatical errors and thank the reviewer for pointing them out in detail. We have corrected these errors as the reviewer suggested.

      5) Line 181: "with a little higher degree of.. " should be re-written more precisely and with more scientific accuracy.

      As the reviewer requested, we have corrected the sentence in this revised manuscript.

      6) Line 216: The use of the colloquial phrase "On the other hand..." should be replaced with more formal language. For example, "In contrast, the number of downregulated...."

      We thank the reviewer for pointing it out. We have corrected this colloquial phrase at 4 locations.

      7) In line 201, Pprogenitor is misspelled.

      We apologize and thank the reviewer for noticing it. We have corrected it in this preliminary revised manuscript.

      8) In Figure 3, images showing TBR2 staining do not appear correct as this protein should be localized to the nucleus similar to SOX2 staining. I would suggest optimizing conditions such as utilizing antigen retrieval or other methods to reduce non-specific cytoplasmic staining.

      We thank the reviewer for the valuable suggestion. We plan to optimize the conditions and try other neuronal lineage markers such as DCX and NeuN.

      9) I would suggest simplifying the text describing the primers utilized in this study and display them in a table format.

      As the reviewer requested, we will make a supplementary table of primer sequences in the revised manuscript later.

      10) Information regarding the number of technical replicates used in this study is lacking throughout the manuscript. For example, how many hESC clones were analyzed? How many organoids were analyzed for each specific assay such as single cell RNA sequencing and MEA analyses? How many independent experiments were used for these studies?

      We apologize for the lack of information. We constructed one clone per genotype from one human ES cell line (KhES-1) and performed all further analyses with these clones. The precise number of NPC organoids in scRNA-seq could not be counted, as we mentioned above. As for MEA analysis, 8 x 10^5 cells were seeded on each dish as described in the original manuscript. However, it was unclear how many neurons were observed on each electrode because multiple cells and neurites covered each electrode. Thus, spike activities were detected as the activity of a network of many neurons. We have added the information in the methods section of this preliminary revised manuscript.

      11) It is not clear why the authors chose two types of organoid methods in the study. The first protocol referred to as the "NPC organoid method" is synonymous with neurosphere culturing and should be referred to as neurospheres throughout the manuscript.

      One protocol (Fujimori et al., Stem Cell Rep., 2017) was not for 3-D organoids but for 2-D neurons (Figure 4A). Thus, we considered neurospheres and NPC organoids to be different.

      12) In Figure 4, panel C should be referred to as a local field potential trace and not a waveform.

      We thank the reviewer for pointing it out. We have corrected the description as the reviewer suggested.

      Reviewer #3

      This is an important study since it provides robust research tools to understand molecular and cellular mechanisms that may underlie 1q21.1 deletion and duplication syndromes.

      We thank the reviewer for the positive comments. We plan to perform additional analyses and experiments as described above and try to meet the reviewer’s requests.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this research study by Nomura et al., the authors develop novel hESC-based models of reciprocal CNVs in distal 1q21.1 using CRISPR/Cas9 genome editing technology. Specifically, the authors genome edit KhES-1 cells to produce two isogenic hESC lines that contain either a deletion or duplication of this chromosomal region. Patients with 1q21.1 deletion and 1q21.1 duplication syndromes show abnormal head size in conjunction with multiple neurodevelopmental co-morbidities such as epilepsy, developmental delay, and neuropsychiatric abnormalities. This is an important study since it provides robust research tools to understand molecular and cellular mechanisms that may underlie these syndromes. Through generation of cortical organoid models, the authors demonstrate 1q21.1 deletion and duplication organoids show deficits in growth and over-growth, respectively. Additionally, the authors provide data that 1q21.1 deletion and duplication organoids show altered signaling cascades which may underlie growth deficits and also abnormal neurodevelopment which may underlie hyperexcitable neurons as demonstrated by multi-electrode array analysis. While my enthusiasm for this study remains high, I do have a significant number of major and minor reservations specific to the experimental design and analysis that if addressed would provide for an excellent contribution to the field.

      Major concerns:

      1. Though the authors provide extensive data in this study, major revisions are necessary to interpret all of their data in the context of the phenotypes they are observing in organoids and MEA analyses. In addition, the current study lacks cohesiveness throughout the various experiments and does not provide text that clearly unifies the results of the study. For example, no interpretation of higher TBR2 levels in 1q21.1 deletion is provided. Does this mean these organoids show accelerated neuronal differentiation? Also please see my comment regarding TBR2 staining in the next section. Other examples in which there is no clear interpretation of the data, or in which the results of the experiments are inadequately unified, appear throughout the manuscript:
        • a. Additional interpretation why 1q21.1 duplication organoids show increased growth is lacking. The single cell RNA sequencing results show there are more glia, but no further interpretation is given why these organoids show an overgrowth phenotype. Inversely, the 1q21.1 deletion organoids show more progenitor cells, but it is not apparent why this should result in decreased cell growth.
        • b. The authors suggest that 1q21.1 duplication organoids are resistant to neuronal differentiation. What data supports this hypothesis other than the fact that no mature neuronal cells are present in their single cell RNA sequencing data?
        • c. The MEA analyses show hyperexcitability in both 1q21.1 deletion and duplication cultures. Since the authors suggest 1q21.1 duplication organoids are resistant to neuronal maturation, no interpretation is given why they show hyperexcitable phenotypes.
        • d. The current study is lacking extensive immunohistochemical stains of representative markers that validate their findings from their single cell RNA sequencing experiments. For example, glial cell markers such as GFAP should be analyzed in 1q21.1 duplication organoids. Additionally, progenitor cell markers such as PAX6 and neuronal markers such as MAP2 and synaptic markers such as SYNAPSIN and others should be incorporated in the study.
      2. Major details are lacking for the single cell RNA sequencing experiments.
        • a. How many cells were analyzed from each group? How many organoids and what age of organoids were analyzed from each group, and were they pooled together? Why was a log2FC >1.2 used as a threshold? It is unclear how the authors identify Progenitor 1 and 2 cell clusters. Are they distinct clusters or is this a continuum of differentiation? The progenitor 1 and 2 clusters were chosen based on expression of the ID transcription factors, but no text was provided why these genes specify progenitor cells.

      Minor concerns:

      1. I would suggest rephrasing the title of the study as it does not clearly convey the advancement to the field. I would suggest the following or something similar that is more concise: "Modeling Reciprocal CNVs of Chromosomal 1q21.1 in Cortical Organoids Reveals Alterations in Neurodevelopment."
      2. The length of the discussion is over extended and should be revised to become more concise.
      3. Additional experiments should be performed to characterize pluripotency of hESC clones generated after genome editing other than staining for alkaline phosphatase activity. At minimum, karyotyping in addition to measuring pluripotency markers such as NANOG and OCT3/4 should be performed.
      4. There are several dozen instances of spelling/grammatical and word choice errors throughout the manuscript. For example, line 24 reads "We generate isogenic...." should read "We generated isogenic...."
        • a. Line 25: "opposite organoid size" as written is confusing to interpret.
        • b. Line 46: "have been considered in the context of ASD" would read more clearly as "have been thought to underlie ASD etiology."
        • c. Line 53: "in the study of neurological development" should read "nervous system development".
        • d. Line 118: "...to detect the CRISPR target site for deletion" should read "to detect the CRISPR target site. For the deletion, we checked....".
        • e. Line 119: "...flanking the CRISPR target site; for duplication, we amplified..." should read "flanking the CRISPR target site, and for the duplication, we amplified......".
        • f. Line 127: "we prepared control cells (CTRL) that transfected...." should read "we prepared control cells (CTRL) that were transfected...."
        • g. Line 185: "organoid size and mature level" should read "organoid size and developmental maturity."
        • h. In line 40, "We made cryosections of ...." should read "We performed IHC for the three organoid genotypes on day 27...."
        • i. In Supplementary Figure 8, line 554, "replictes" is misspelled.
      5. Line 181: "with a little higher degree of..." should be re-written more precisely and with more scientific accuracy.
      6. Line 216: The use of the colloquial phrase "On the other hand..." should be replaced with more formal language. For example, "In contrast, the number of downregulated...."
      7. In line 201, Pprogenitor is misspelled.
      8. In Figure 3, images showing TBR2 staining do not appear correct as this protein should be localized to the nucleus similar to SOX2 staining. I would suggest optimizing conditions such as utilizing antigen retrieval or other methods to reduce non-specific cytoplasmic staining.
      9. I would suggest simplifying the text describing the primers utilized in this study and display them in a table format.
      10. Information regarding the number of technical replicates used in this study is lacking throughout the manuscript. For example, how many hESC clones were analyzed? How many organoids were analyzed for each specific assay such as single cell RNA sequencing and MEA analyses? How many independent experiments were used for these studies?
      11. It is not clear why the authors chose two types of organoid methods in the study. The first protocol referred to as the "NPC organoid method" is synonymous with neurosphere culturing and should be referred to as neurospheres throughout the manuscript.
      12. In Figure 4, panel C should be referred to as a local field potential trace and not a waveform.

      Significance

      This is an important study since it provides robust research tools to understand molecular and cellular mechanisms that may underlie 1q21.1 deletion and duplication syndromes.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study was initiated to look at specific cellular and molecular mechanisms of the duplication and deletion CNVs frequently observed at the 1q21.1 gene locus in an isogenic human embryonic stem (hES) cell model. The authors note that these CNVs are associated with higher than normal penetrance of ASD and epilepsy and aim to elucidate gene expression differences with single cell RNAseq and functional changes in this model system. The authors further sought to examine proliferation and differentiation states, in addition to neuronal activity, using both 2D cultures and 3D organoid models. The 1q21.1 gene locus model system made here is unique and the results broadly recapitulate the patient phenotype, particularly with observations of macrocephaly in the "1q dup" and microcephaly in the "1q del".

      Reviewers' statement: We have joint expertise in GABAergic neuronal development, iPSC 2D and 3D culture and ASD human molecular genetics.

      Major comments:

      • Not sure why ASD (if used it should also be spelled out) is mentioned in the title if ASD is only seen in a proportion of human 1q21.1 duplication (~36% will have autism) and 1q21.1 deletion (<10% will have autism) carriers. I would prefer to use 'neurodevelopmental phenotype'. A good updated review that is accurate with respect to the role of this CNV in autism is PMID: 29398931. The authors should also put their results into the context of what is known about the other neuropsychiatric phenotypes seen in these CNV events.
      • In Fig 1D the ddPCR validation for the genetic alterations in 1q del shows a normal return to 2 copies of GPR89B. However, in the 1q dup the CNV level is still elevated for GPR89B. Please determine how much further the duplication goes as there are five more potentially affected genes in this region (eg PDZK1P1). Modify the text appropriately to note the potential influence of any of these other genes on the experimental outcomes.
      • The authors claim that dosage-dependent size differences in NPC organoids are caused by a change in the number of cells within the organoid rather than cell size. From Fig. 2D, cells in the 1q del organoid appear more compact; a quantification of cell number should be done to support this claim. IHC of D27/28 organoids with GABAergic markers would support the authors' claim of alterations of GABAergic components in 1q del cells. These suggested experiments would take 2-3 days if the organoids are available.
      • Fig 4 E shows MEA data from "top 10". What is the top ten? Do you mean data points? There are batch differences in 1q dup with one batch having a lower expression than the other. Increasing the n value to accommodate the high variance observed in this group will greatly increase the validity of the data generated. Also, change the figure legend to indicate the age of these cultures. Given that the controls are not spiking, this data should be extended to probe the developmental profile further to week 9 when normal cells should be spiking so that the baseline activity of this isogenic line can be determined.
      • Single cell RNAseq data suggests a cluster of GABAergic cell types that are appearing in the 1q del condition, but not in the 1q dup or control groups. The authors suggest that these GABAergic cells are excitatory because the chloride gradient has not yet been altered (no change to KCC2 expression). The authors should substantiate this idea in the MEA system with bicuculline treatment to block GABAergic transmission (drug washed in and out) to show that the spike activity observed in the 2D MEA experiments is due to GABAergic excitatory transmission. Ideally, this should be done for both the 1q dup, 1q del as well as controls.
      • Fig 5A. The clustering method for single cell RNAseq seems to show a large proportion of "other" class cells, begging the question as to what they are. Is there another cluster analysis that might be used, eg partially supervised/unsupervised clustering methods from the Allen Institute, to help determine what these might be?
      • Fig 5 B. The manuscript requires additional markers used in the cluster analysis. Particularly, expression of the GABAergic progenitor markers DLX5 and 6 as well as EMX1 for the progenitor cells. Details of all markers and cluster algorithms should be made available in supplementary tables and R scripts, so that others can repeat this analysis.
      • Fig 6. Expanding the heat map of 1q del and 1q dup with CTRL expression would help with context for baseline levels in this isogenic cell line. Please also include additional GABAergic markers GABRA1, GABRB2 and GABRG2 (subunits of the most common GABA-A receptor), SOM, VIP, NPY (other GABAergic interneurons in addition to PVALB), DLX6, EMX1, and for excitatory markers GRIA2, GRIA3 and GRIA4 (all of which have developmentally regulated expression patterns) that will provide more context with the synaptic receptor literature. GRIN2D is expressed only in GABAergic cell types and so I would suggest including this NMDA receptor subunit as well.

      Minor comments:

      1. Additional references (eg. Schaefer et al. 2019) should be discussed in relation to the authors' suggestions of altered neuronal maturity.
      2. The authors show no change in PAX6 expression between genotypes, but significant differences in TBR2 expression between genotypes (Fig. 2C) - this alteration in normal cortical development should be included in results and discussed.
      3. In the introduction (Line 67): The authors state that "alterations in brain size is common in patients with ASD" using one meta-study to support this claim. Further primary studies should be consulted and the authors should give the proportion of the population with ASD and altered brain size to support this statement. In addition, the age range should be supported with primary papers.
      4. Line 73. The authors suggest that the brain growth deviations are "Postnatal stage restrictive". Citations are needed to support this statement.
      5. In the scRNAseq data results please report total cell numbers counted for each cluster and for genotype group.
      6. In the results section (lines 269-270) the authors suggest that 1q del cells are in a more mature state because the GABAergic cells are present and glutamatergic genes are similarly altered in 1q dup and 1q del. However, the results from the gene cluster data suggest that there is a very high proportion of progenitor cells (Progenitor 1 and 2 clusters), which seems to argue against faster maturation. This suggests to me that cell fate is being modified here.
      7. Label figures on each page for ms.
      8. Fix typos and heat map legends (currently no colors for log2 fold change in Fig 5 or 6)

      Significance

      Overall the study is clearly described, and the outcomes have been substantiated to a certain degree, but it requires a bit more work. This paper does represent a technical 'tour de force' and the authors should be applauded for sticking it out where other labs have so far failed. It might be useful to mention, even in brief, the number of 'failed' (failed or inaccurate) events. The availability of the lines should also be clearly stated.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Copy number variations in the 1q21.1 locus, deletions and duplications, have been associated with neurodevelopmental disease. In particular, deletions of this locus result in a variety of neuronal phenotypes including microcephaly and schizophrenia in varying levels of severity. Duplications of the 1q21.1 locus are often associated with autism and/or macrocephaly.

      In this study Nomura et al. generated 1q21.1 deletion and duplication hESC lines to study the impact of these CNVs on neuronal development. They generated brain organoids and observed a bidirectional effect of this CNV on organoid size, with the 1q21.1 deletion showing smaller brain organoids, whereas the 1q21.1 dup lines grew larger than controls. This is in line with the micro- and macrocephaly observed in patients. They further analyzed these organoids at the gene expression level using single cell RNAseq and performed some electrophysiological assessment on neurons from dissociated organoids.

      This study is certainly of interest given the association of this locus with NDDs such as autism, epilepsy and schizophrenia. At this stage, the study is mainly a descriptive study, showing differences between the 1q21.1 del/dup versus controls but also between the del and dup lines. There is no mechanistic insight provided. For example, the 1q21.1 CNV encompasses several genes, of which some have already been linked to micro/macrocephaly (eg. NOTCH2NL). More importantly, most of the conclusions drawn by the authors are based on a limited set of experiments/analyses which are not always carefully performed and/or presented. In general, the data presented are premature, therefore not supporting the claims/conclusions made by the authors (eg title). This makes the overall impact of this study limited.

      Main comments

      In general, the interpretation of the data is too premature:

      1. The title is not supported by the data in any way.
      2. Brain organoid size and development: In Figure 2 the authors analyzed the development of the organoids. Based on the human phenotype, the deletion would lead to smaller and the duplication to larger brain organoids. The data presented to support these claims are rather scarce. They do provide data on organoid size; however, there is no information as to how this micro/macrocephaly comes about. Only a limited number of cell types are investigated with immunocytochemistry, which gives little insight into the mechanism. Fig 3: The authors performed some very basic immunostaining and concluded that the neuronal maturity of 1q del seemed to be accelerated, whereas 1q dup decelerated from the NPC stage. However, there is no direct evidence provided for this. With simple additional immunostainings the authors could already get a much better idea of what is going on. For example, the authors could measure the amount of differentiating versus proliferating cells, cell cycle exit, etc. (eg BrdU, KI67, pHH3 staining,...). Further, there are some technical aspects that would need to be resolved:
        • There is a general lack of brain organoid characterization of the controls. It is unclear on how many independent clones these experiments were performed.
        • Fig 2C: it is unclear why brain organoid sizes reduce over time. Is this an indication of increased apoptosis? Did the authors measure this?
        • The reason for using t-tests with Bonferroni correction as opposed to one-way (or even two-way) ANOVA in Fig 2C is unclear.
        • Fig 2E: it is unclear how they came to the conclusion that the dosage-dependent size difference in NPC organoids was caused by the number of cells within an organoid, not by the size of each cell or different cell types. They only measured the number of Sox2-positive cells and used Sox2 to measure cell diameter, whereas Sox2 is mainly expressed in the nucleus.
        • How do the authors explain that the Dup cells express neither Tubb nor CTIP2? Do they contain only NPCs and no neurons?

      In short, the characterization of the brain organoids at the level of general development, cell types, proliferation, differentiation is underdeveloped.
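
      To make the statistical point on Fig 2C concrete, a one-way ANOVA tests all genotypes in a single comparison instead of running several pairwise t-tests. The following is a minimal sketch of the F statistic only, under the assumption of three independent groups; the values are made up and are not organoid measurements from the manuscript, and the function name is hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups.

    F = (between-group sum of squares / (k - 1))
        / (within-group sum of squares / (N - k))
    where k is the number of groups and N the total number of observations.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Made-up values standing in for three genotypes (not data from the study).
del_group = [1.0, 2.0, 3.0]
ctrl_group = [2.0, 3.0, 4.0]
dup_group = [5.0, 6.0, 7.0]

f_value = one_way_anova_f([del_group, ctrl_group, dup_group])
print(f_value)
```

      The p-value would then be read from the F distribution with (k - 1, N - k) degrees of freedom, followed by a post hoc test if the overall effect is significant; neither step is shown here.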

      1. Electrophysiological assessment of brain organoids derived neurons: In figure 4 the authors claim that both CNVs (Del/Dup) show hyperexcitability and altered expressions of glutamate system as common features between the Del/Dup lines. The data to support this are however scarce and far from being convincing: The poor quality of the data is represented by images in 4B-E:
        • First, the authors chose to dissociate the organoids prior to measuring the cells on MEAs. This takes away the advantage of 3D brain organoids, adds a lot of non-physiological stress, causes cell death and leads to an unequal distribution of cells over the electrodes (see Fig 4B).
        • MEA recordings are meant to measure network activity and are heavily (read: fully) dependent on the network being formed. Cherry-picking electrodes for analysis is not justified; analysis should be performed per MEA chip, not per electrode. Inclusion/exclusion parameters should be defined before analysis.
        • MEA parameters such as mean firing rate (spikes/min) and burst rate are very sensitive to plating conditions, especially the number of cells and the clustering of cells around electrodes (see Fig 4B). Given that the organoids already differ in size and, according to the authors, in cell number, but also in the amount of starting NPCs, one can expect very different cell densities/cell types per experiment/genotype. The authors should therefore show the matching cell culture images for every genotype. Also, with regard to the claims made about GABAergic neurons, the cell type composition at the time of the MEA recording should be characterized for every genotype.
        • Fig 4B illustrates the points made above. The fact that no activity is observed in the control cells can be due to many different reasons: unequal plating, stress after dissociating cells, poor coverage of the electrodes, poor maturation, too early a measuring time point, etc. Because the authors have no control over the number of cells covering the electrodes, the data presented here carry very little information. Fig 4B best illustrates this, with large cell clumps and areas without cell bodies. Measurements from these cell cultures are irrelevant and no conclusion can be drawn. We suggest that the authors first benchmark this technique with their own differentiation protocol, show robust and reliable recordings on control cells, and only compare to the CRISPR lines at a time point at which the control cells show a decent amount of activity (> 1 Hz). When doing so, reduced activity can also be monitored (for examples see Trujillo et al., Cell Stem Cell 2019 or Frega et al. 2019, Nat Comm).
        • MEAs measure the output of the network (action potentials). In a network, this can be influenced by virtually every neuronal property (morphology, synaptic input, types of synapses, intrinsic excitability, etc.). Therefore, the authors cannot conclude based only on Fig 4E that the Del/Dup cells are intrinsically hyperactive. To draw this conclusion they should measure this directly by assessing the passive and active intrinsic properties of individual neurons. In the control condition many electrodes do not give any signal. From these experiments it is impossible to know whether this is because of a lack of cells on the particular electrode or a real absence of activity. Certainly one cannot conclude that the Del and Dup cells are intrinsically hyperexcitable.
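      The per-chip analysis and the control benchmark requested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the spike counts and the 600 s recording length are hypothetical, and applying the > 1 Hz threshold to the chip-level mean is one possible reading of the suggestion, not the authors' pipeline.

```python
from statistics import mean


def chip_mean_firing_rate(spike_counts, duration_s):
    """Mean firing rate (Hz) over ALL electrodes of one chip.

    Silent electrodes are kept (count 0) so that no electrode is
    cherry-picked out of the analysis.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return mean(count / duration_s for count in spike_counts)


def control_is_benchmarked(control_chips, duration_s, threshold_hz=1.0):
    """True only if every control chip exceeds the activity threshold.

    threshold_hz=1.0 is an assumption taken from the '> 1 Hz' remark.
    """
    return all(
        chip_mean_firing_rate(chip, duration_s) > threshold_hz
        for chip in control_chips
    )


# Hypothetical spike counts per electrode (placeholder values, not study data).
control_chips = [[900, 750, 0, 1200], [640, 0, 0, 820]]
benchmark_ok = control_is_benchmarked(control_chips, duration_s=600)
```

      Defining the inclusion rule as a function before looking at the data keeps the exclusion criteria fixed in advance, which is the point of the second bullet.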

      It seems from the introduction that the authors try to link 1q21 CNVs to epilepsy and ASD, thereby justifying the observed phenotypes.

      • How do the authors reconcile the fact that a more mature GABA system is observed in the Del lines with the so-called increased activity compared to controls but not to the Dup lines?

      Single cell RNAseq

      • I'm not a specialist in single-cell RNAseq; however, it seems that the analysis is underdeveloped and the conclusions drawn from these experiments are premature. It would be essential to validate some of the generated hypotheses, e.g. GABA maturity, and not merely state them as a conclusion (e.g. in the title).
      • How do the authors explain that a majority of the cells are glial cells at day 27, with no neurons present?
      • How relevant are the changes in the extremely low numbers of GABAergic neurons in the Del cells, given that no excitatory neurons are present, only NSCs?

      Minor comments

      • It is unclear how many clones were assessed per genotype
      • The authors should properly annotate the genotypes 1q21.1 instead of 1q del (line 134)
      • The introduction seems to be somewhat off topic, since the 1q21.1 locus is associated with several neurodevelopmental disorders, including SCZ, but is certainly not specific to ASD and epilepsy. So the premise on line 86, to study the 1q21.1 locus to understand ASD/epilepsy, is somewhat misleading. I propose that the introduction be focused on 1q21.1 and not on ASD/epilepsy in general.
      • It is unclear whether they generated heterozygous or homozygous deletions.
      • The authors should cite Fiddes, I. T. et al. Human-Specific NOTCH2NL Genes Affect Notch Signaling and Cortical Neurogenesis. Cell 173, 1356-1369.e22 (2018).
      • Many unclear statements eg line 138: Next, we analyzed each single-cell in an organoid
      • Discussion on E/I is very speculative, not supported by any evidence

      Significance

      The general topic of this study is of high interest given the strong association of 1q21.1 with disease. The authors developed interesting ESC lines to study deletion and duplication in parallel. Unfortunately, the level of analysis performed on these organoids is not up to the current state of the art: the experiments are of low quality and the analyses are limited. Therefore no clear conclusion can be drawn except for the size of the organoids, and very little mechanism is provided. This remains a purely descriptive study for which the presented data are of rather low quality and limited impact in its current shape.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their careful reading, positive feedback and constructive criticisms of our manuscript. Their primary points of concern were that the discussion was too long and too speculative, and that the title did not sufficiently represent our work. We have now cut the discussion in half, and we have also changed the title to more precisely reflect our paper, and made some other minor changes in the text (all highlighted in blue).

      Below, we provide responses to each of the raised issues.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): The study was very well conducted by the group, selecting appropriate methods for achieving the stated objectives. The samples were abundant and the statistical treatment was suitable for the sample sizes, as well as for comparing the different methods used in this study. The results in general were properly exploited by the authors, clarifying many aspects of the role/function of the trophallaxis fluid. The results of this manuscript apparently suggest that young colonies prioritize the metabolization of carbohydrates, while mature colonies prioritize the accumulation and transmission of stored resources, amongst other processes. This study clarified many aspects of the role/function of the trophallaxis fluid for the colony.

      We are happy the reviewer agrees with our choices of methods, sample sizes, and statistics, and we are pleased that they have come to the same conclusions.

      Even considering the high level of present investigation, still there are some aspects that could be improved by the authors:

      • The text in general is relatively long, with an overuse of literature citations;
      • The discussion is interesting, but sometimes too speculative; if the authors could attenuate their speculative statements, the text would become more objective and fluid;

      Thank you for this feedback. These comments truly helped us strengthen the manuscript. We have now streamlined the text, cutting down the introduction, cutting in half the discussion and we have made more explicit what is statement and what is speculation (more on this in response to reviewer 2).

      • The results shown in figures 6A and 6D relate to the processes of neutrophil degranulation and complement cascade, respectively. The authors did not discuss these results; do they have a meaning for the role of trophallaxis fluid in the colony? This was not discussed in the manuscript.

      We thank reviewer #1 for pointing out these results. We have now addressed these terms in lines 277-284 of the discussion:

      “Our gene-set enrichment analysis showed significant enrichment in immunity-related proteins characteristic of phagocytic hemocytes (58) in trophallactic fluid (‘innate immune system’, ‘complement cascade’, ‘neutrophil degranulation’). These results indicate that hemocytes may themselves be transmitted mouth-to-mouth, and generally shows the involvement of the social circulatory system in colony-level immune responses with implications for social immunity.”

      • Considering the very high scientific quality of the present study, the authors could deposit all the raw proteomic data in a reliable international repository of proteins/DNA DB, since it will be required by top journals.

      We wholeheartedly agree, and all data are now shared online through ProteomeXchange.

      Reviewer #1 (Significance (Required)): Significance: the present investigation represents an important contribution to the knowledge of the exchange of signals within the colony, to synchronize the physiology and development of the hive as a whole (the concept of superorganism). The existing data about the composition and potential role of the components of trophallaxis fluid are very limited compared to the present results. The present study is a masterpiece of knowledge about the importance of eusociality.

      Thank you for recognizing the importance of this study and affirming our work in such a wonderful way!

      **Audience:** all those scientists involved with social insects; biochemists/proteomists dedicated to insect biology, biochemistry and physiology. **My expertise:** biochemistry of arthropod secretions, especially of honeybees, ants and wasps. **Referee Cross-commenting** I think that both reviews are complementary to each other; both reviews agree on the need to reorganize the text, making it more compact and objective. Essentially, the authors must focus on the concept of trophallaxis. Thus, the biochemical processes outlined by proteomic analysis should be addressed to explain how the shared physiology of the colony works.

      Our discussion now focuses more on trophallaxis as a whole, and the biomarker-like quality of the changing proteome. We agree the biochemical processes and their role in the shared colony physiology are fascinating topics. We have not yet performed follow-up experiments with the many proteins present in this fluid and thus do not want to over-conclude. We have now stated more clearly in the discussion what the current data can reveal about these topics, what is assumed via orthology, and what needs to be addressed in future studies.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): This ms provides a comprehensive proteomic analysis of the trophallactic fluids extracted from carpenter ants. The analytical methods are state-of-the-art, and the results presented should fuel many studies. The vision of the research program, embodied in the title of the paper, is very exciting and is to be encouraged. However, the title of the paper in no way reflects the content of the paper, as none of the functional processes mentioned have been proven. This will require a lot of work and perhaps the development of new bioassays. I truly hope the PI's lab takes this on in a deep and substantial way; the notion of trophallaxis and its socially exchanged fluid has long captivated the fancy of social insect biologists, but with a few specific exceptions, the promise has not yet been realized. The technical and descriptive results presented here lay a strong foundation. For purposes of present publication, I strongly recommend a different title and a revised discussion that reflects the disconnect I outline. Cause/consequence issues need to be addressed.

      We thank reviewer #2 for seeing our vision and that this is indeed foundational work that will “fuel many studies.” We also agree that the title and discussion contained too much speculation. The aim of this paper was to prove that there is systematic variation in trophallactic fluid in natural populations that correlates with biologically important social conditions, and further, that some proteins in this fluid can both act as biomarkers and be informative about underlying molecular processes. We have now communicated this more clearly in the introduction. In the revised version of the paper, we have reduced the speculation, and where appropriate, made it clear when there is speculation.

      For example, discussion lines 233-238:

      “Overall, our data reveal a rich network of trophallactic fluid proteins connected to the principal metabolic functions of ant colonies and their life cycle. Pinpointing contexts that induce changes in trophallactic fluid, along with the exact targets and functions of the proteins, are important subjects for future work. Our establishment of biomarkers transmitted over the social circulatory system that correlate with social life will allow researchers to formulate and test hypotheses on these proteins’ functional roles.”

      Three technical points: 1) Sample sizes are low for some analyses (2/group)--though they are cleverly pooled.

      We are not sure what the reviewer is referring to – none of our sample types had this low sample size (see SI Table 1 for sampling scheme). In contrast, for a proteomics study, our sample sizes are quite high. We are aware that for a study focusing on a natural population, the colony-level sample size of 16 (laboratory colonies) can be considered low, but this has been taken into account in our stringent statistical analyses.

      2) How to distinguish between what animals actually transmit and what is found in the gut? There could be differences.

      This has been addressed in our previous work, where it was shown that the crop content is equivalent to what is exchanged among individuals of this same species during the act of adult-adult stomodeal trophallaxis (Figure 1A, LeBoeuf et al. eLife 2016). We have now clarified this in the methods section of the current paper (line 361-364).

      “Trophallactic fluid was obtained from CO2- or cold-anesthetized workers whose abdomens were gently squeezed to force them to regurgitate the contents of their crops. This method of collection was shown previously to correspond to the fluid shared during the act of adult-adult stomodeal trophallaxis (17).”

      3) Is there evidence that the substances found are not just the product of digestion of ingested food? The differences between lab and field colony samples support this.

      In the type of proteomic analysis we have performed (the most commonly used proteomics approach when a genome is available), we detect only proteins found in the reference genome of interest (in our case Camponotus floridanus), so excepting cannibalism, we should not see proteins that originate from food. Note that this is why we do not provide lab colonies with the typical lab-reared ant diet that includes honey, as bees are also Hymenoptera, and royal jelly and trophallactic fluid have many proteins in common. Cannibalism could result in trace observation of many proteins, but could not produce the consistent and high-abundance set of proteins that we have observed as they are not produced in those precise ratios in larvae or adults.

      The observed shift in trophallactic fluid from field to lab may reflect a change in diet or microbiome and these are questions that could be further investigated in future work (mentioned in lines 229-232). The clear difference we observe between trophallactic fluid of young and mature colonies, or the difference between the worker castes within a colony, is evidence that the variation observed in trophallactic fluid reflects more than diet.

      “Trophallactic fluid complexity declines over time when colonies are brought from the field to the laboratory. This may reflect dietary, microbiome or environmental complexity – typical of traits that have evolved to deal with environmental cues and stressors (e.g. immunity, (37)).”

      Reviewer #2 (Significance (Required)): The paper addresses a very important topic that should be of widespread interest to social biologists. Journal choice should reflect that this is a technically excellent paper that presents descriptive information but functional significance is highly speculative.

      We appreciate that the reviewer agrees that our results are of widespread interest to social biologists. Indeed, our results must be somewhat descriptive, as we are working on a mostly unexplored socially exchanged fluid in a natural population. However, our study design tests clear hypotheses with preplanned sampling and experimental transfer of ant colonies to a new laboratory environment. We present confirmatory results for the hypothesis that trophallactic fluid is a complex mixture of biomarker-like molecules and that these biomarkers can be used to predict sample origin through machine learning (see random forest predictions, emphasized in lines 151-152). The fact that our evidence for this is correlative does not render it speculative. Indeed, in both ecology and much of medicine, using correlative evidence is the norm, as it is often impossible to manipulate ecosystems, natural populations and some organisms in a safe and controlled manner. This is what convinced us to invoke the term ‘biomarkers,’ as biomarkers are excellent examples of molecular correlates of larger conditions that have spurred advances in biology and medicine.

      Some of the next steps in our research will be, as reviewer #2 suggested, additional studies on the roles of individual compounds of trophallactic fluid, building on the results of this paper. Additionally, while this study may not have explored the roles of specific molecules, open ended exploration is extremely important and necessary for any scientific advancement in the long run (eLife 2020;9:e52157).

      All in all, we are grateful for this comment, as it showed us that we must communicate the aims of our work more clearly – which we have now done both in introduction (line 77-91) and throughout the discussion.

      **Referee Cross-commenting** Yes. Most of the discussion is pure speculation because we don't know what is exchanged and what the modes of action might be. But it's a great start!

      We have reduced the speculation on the roles of single molecules, and we hope our responses to the points above clarify some of the reviewer’s uncertainties about what is exchanged. However, we do still outline hypotheses for potential functions and origins in the discussion section, as this study is intended to be a foundation for new lines of research.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This ms provides a comprehensive proteomic analysis of the trophallactic fluids extracted from carpenter ants. The analytical methods are state-of-the-art, and the results presented should fuel many studies. The vision of the research program, embodied in the title of the paper, is very exciting and is to be encouraged. However, the title of the paper in no way reflects the content of the paper, as none of the functional processes mentioned have been proven. This will require a lot of work and perhaps the development of new bioassays. I truly hope the PI's lab takes this on in a deep and substantial way; the notion of trophallaxis and its socially exchanged fluid has long captivated the fancy of social insect biologists, but with a few specific exceptions, the promise has not yet been realized. The technical and descriptive results presented here lay a strong foundation. For purposes of present publication, I strongly recommend a different title and a revised discussion that reflects the disconnect I outline. Cause/consequence issues need to be addressed.

      Three technical points:

      1) Sample sizes are low for some analyses (2/group)--though they are cleverly pooled.

      2) How to distinguish between what animals actually transmit and what is found in the gut? There could be differences.

      3) Is there evidence that the substances found are not just the product of digestion of ingested food? The differences between lab and field colony samples support this.

      Significance

      The paper addresses a very important topic that should be of widespread interest to social biologists.

      Journal choice should reflect that this is a technically excellent paper that presents descriptive information but functional significance is highly speculative.

      Referee Cross-commenting

      Yes. Most of the discussion is pure speculation because we don't know what is exchanged and what the modes of action might be. But it's a great start!

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The study was very well conducted by the group, selecting appropriate methods for achieving the stated objectives. The samples were abundant and the statistical treatment was suitable for the sample sizes, as well as for comparing the different methods used in this study.

      The results in general were properly exploited by the authors, clarifying many aspects of the role/function of the trophallaxis fluid. The results of this manuscript apparently suggest that young colonies prioritize the metabolization of carbohydrates, while mature colonies prioritize the accumulation and transmission of stored resources, amongst other processes. This study clarified many aspects of the role/function of the trophallaxis fluid for the colony.

      Even considering the high level of present investigation, still there are some aspects that could be improved by the authors:

      • The text in general is relatively long, with an overuse of literature citations;
      • The discussion is interesting, but sometimes too speculative; if the authors could attenuate their speculative statements, the text would become more objective and fluid;
      • The results shown in figures 6A and 6D relate to the processes of neutrophil degranulation and complement cascade, respectively. The authors did not discuss these results; do they have a meaning for the role of trophallaxis fluid in the colony? This was not discussed in the manuscript.
      • Considering the very high scientific quality of the present study, the authors could deposit all the raw proteomic data in a reliable international repository of proteins/DNA DB, since it will be required by top journals.

      Significance

      Significance: the present investigation represents an important contribution to the knowledge of the exchange of signals within the colony, to synchronize the physiology and development of the hive as a whole (the concept of superorganism).

      The existing data about the composition and potential role of the components of trophallaxis fluid are very limited compared to the present results. The present study is a masterpiece of knowledge about the importance of eusociality.

      Audience:

      all those scientists involved with social insects; biochemists/proteomists dedicated to insect biology, biochemistry and physiology.

      My expertise:

      biochemistry of arthropod secretions, especially of honeybees, ants and wasps.

      Referee Cross-commenting

      I think that both reviews are complementary to each other; both reviews agree on the need to reorganize the text, making it more compact and objective. Essentially, the authors must focus on the concept of trophallaxis. Thus, the biochemical processes outlined by proteomic analysis should be addressed to explain how the shared physiology of the colony works.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #2 (Evidence, reproducibility and clarity):

      This paper attempts to address a current, clinically relevant question utilizing novel statistical modeling. The authors comprehensively assessed the presence of criteria and non-criteria aPL in a heterogeneous cohort of 75 COVID patients and 20 non-infected controls. They found 66% of COVID patients had positive aPL and demonstrated a correlation between aPL and anti-SARS-CoV-2. However, I have several major concerns:

      1. The cohort is extremely heterogeneous. COVID-19 samples that were used included hospitalized patients and those who had COVID more than 2 months ago and were convalesced (29% of samples). Severity of disease does influence autoreactivity and the presence of autoantibodies. The prevalence of autoantibodies among patients who are acutely ill will be much different than those who are convalesced. I think it would be prudent to assess the presence and correlation of aPL among those two groups separately.

      We thank you for pointing out the complexity of our study population, consisting of multiple cohorts from different centres. Precisely this heterogeneity of our cohorts and their variables is the reason why we employed linear mixed-effects models. Linear mixed-effects models, accounting for both fixed and random effects, are suitable for addressing potentially confounding factors. Along these lines, disease severity (different in the convalescent and the acutely ill individuals) as well as the relation of the time of sampling to the time of disease occurrence (days post onset of disease manifestation) were included as fixed effects in our mixed model. Thus, our model accounts for potential differences between the acute phase of infection and the convalescent phase and would capture them if relevant.

      In order to increase the rigour, we have performed an additional analysis where we excluded the convalescent individuals from the model (see Fig. 3C). The results obtained are in line with results already shown (Fig. 3B, 3D).

      In general, we have pursued a largely data-driven exploratory, and not a hypothesis-driven, approach. Clearly, we could have decided to set a stringent focus on a cohort without complexity. Yet, our approach encourages heterogeneity, which we address using an adequate model. Since, perhaps, the model choice, the model itself, and the data-driven approach were not explained extensively enough, we have added a more detailed account in the manuscript, lines 317-334 and lines 394-403.

      1. Sampling of the patients is concerning, 35% are plasma and 65% are serum. It is undesirable to put data from plasma and serum together to perform analysis.

      We thank the reviewer for raising this important concern. We have aimed to be as rigorous and transparent as possible in the description of the cohorts (see Tables 1 and 2 for serum/plasma). While we agree that, in general, it would be best if either only plasma (i.e., only heparin plasma or only EDTA plasma) or only serum was used, we wish to clarify that for both SARS-CoV-2 IgG profiling and LIA, plasma and serum can be used interchangeably. We can formally show this. We have conducted a SARS-CoV-2 IgG profiling experiment on patient-matched samples (plasma and serum). The data show unambiguously that the choice of plasma or serum has no effect on the assay outcome (Fig. S3A and S3B), with a Pearson correlation coefficient of 0.9942 (95% confidence interval: 0.9865-0.9975) and R² of 0.9885. Bland-Altman analysis does not indicate any significant bias (Fig. S3C).

      For the detection of APS antibodies with ELISA, the literature suggests that the use of plasma or serum does not relevantly interfere with the measured value (Pham et al., 2019). To formally reassess this, we measured aPL autoantibodies with LIA in one matched plasma and serum sample of an individual with high-titre aPL antibodies and of one high-titre individual whose plasma was spiked into non-reactive plasma and serum (Fig. S2A and Fig. S2B). We found the same pattern of IgM and IgG aPL-positivity in both matched serum and plasma samples as well as in spiked serum and plasma samples, with a Pearson correlation coefficient of 0.9974 (95% confidence interval: 0.9611-1.034) and R² of 0.9813 (Fig. S2A). Bland-Altman analysis did not indicate a significant bias (Fig. S2B).

      We therefore conclude that in our study, using both plasma as well as serum has no effect on the validity of our results.
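      For readers unfamiliar with the two agreement measures named above, a minimal sketch of how a Pearson correlation coefficient and a Bland-Altman bias with 95% limits of agreement are computed for paired serum/plasma readings is given here. The function names and the paired values are hypothetical placeholders; this is not the analysis code used for Fig. S2 or Fig. S3.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) of agreement.

    bias is the mean paired difference; the limits are
    bias +/- 1.96 * sample standard deviation of the differences.
    """
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired readings (placeholder values, not study data).
serum = [10.0, 20.0, 30.0, 40.0, 50.0]
plasma = [12.0, 19.0, 31.0, 39.0, 52.0]
r = pearson_r(serum, plasma)
bias, lower, upper = bland_altman(serum, plasma)
```

      A high correlation alone does not rule out a systematic offset between the two sample types, which is why the Bland-Altman bias is reported alongside it.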

      1. LIA based assays were used to assess the presence of aPL and results were reported in OD rather than standardized units. While the same group demonstrated a positive correlation in the past between LIA OD and internationally accepted ELISA-based aPL assays, the validity and clinical utility of these LIA assays still require further evaluation. Furthermore, OD>50 was used as a positive cut-off. How this cut-off was determined and how it relates to internationally accepted positive aPL cut-offs (99th percentile or greater than 40) remains unclear.

      We thank the reviewer for mentioning concerns on LIA. The validity of this technology has been confirmed in multiple peer-reviewed publications (Roggenbuck et al. Arthr Res Ther 2016;18:11, Nalli et al. Autoimmunity Highlights 2018;9,6). In terms of cut-off detection, processed strips were analysed densitometrically employing a scanner with the evaluation software Dr. DotLine Analyzer (GA Generic Assays GmbH). The cut-off of 50 OD units was determined by calculating the 99th percentile of 150 apparently healthy individuals as recommended by the international classification criteria for aPL testing and Clinical and Laboratory Standards Institute (CLSI) guideline C28-A3 (Roggenbuck et al. Arthr Res Ther 2016;18:11, Nalli et al. Autoimmunity Highlights 2018;9,6). A corresponding sentence has been added to the METHODS AND MATERIALS section.
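      As an illustration of how a 99th-percentile cut-off can be derived from a reference population, a minimal sketch follows. It uses the simple nearest-rank percentile definition, which is an assumption made here for clarity; the CLSI guideline and the Dr. DotLine Analyzer software may use a different estimator, and the readings below are placeholders rather than the 150 healthy-donor values.

```python
from math import ceil


def nearest_rank_percentile(values, pct):
    """Return the pct-th percentile using the nearest-rank definition."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Hypothetical OD readings from 150 reference individuals (placeholder values).
reference_od = list(range(1, 151))
cutoff = nearest_rank_percentile(reference_od, 99)
```

      With 150 reference values, the nearest-rank 99th percentile is the 149th smallest reading, so the cut-off is effectively set by the two highest values in the reference group.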

      For our study, we aimed to perform the maximum number of tests possible with limited sample volume and have therefore chosen LIA. We are aware of the discussion on internationally accepted cut-offs for clinical APS diagnostics. However, we would like to point out that our manuscript is not a case report on patients diagnosed with APS, nor do we aim to modify diagnostic standards set in the international consensus statement for the classification criteria for definite APS (established in 2006).

      Moreover, the OD ≥ 50 was used as a cut-off in one analysis (with Fisher’s exact test for statistics) in our manuscript and was re-assessed using Mann-Whitney/Wilcoxon rank sum test on a continuous scale (Fig. 1C and 1D). All subsequent analyses were not contingent on an OD cut-off. We believe that this is clearly stated in the manuscript.

      1. While the authors attempted to evaluate the presence of both IgG and IgM aPL in COVID patients, only 65% of samples were tested for both IgG and IgM aPL.

      We agree that testing the entire collective for IgG and IgM isotypes would have been best. In fact, we would have been interested in also including the IgA isotype. Inconveniently, sample volume is sometimes limiting.

      We have been clear about the omission of IgG aPL measurements in the samples from Zurich (see lines 214-215). We consider this a limitation, however, our data indicated that IgM aPLs are more immediately relevant in the context of SARS-CoV-2. While this has been surprising to us, we would like to highlight that this is a manifestation of the quality of a data-driven approach where data, much more than belief, build the foundation for conclusions. Along these lines, we could have easily omitted all data on IgG aPLs without compromising the message contained in our manuscript. However, we stand behind our decision to show all data even if, in the case of IgG aPL, (1) they are mostly negative and (2) they are incomplete.

      1. 26 patients had anti-SARS-CoV-2 data already available. Whether those were tested on the same samples and at the same time points as aPL is not clear.

      We apologise for not having been clear about this in the text. The 26 samples from Zurich had been included in another study where their respective anti-SARS-CoV-2 Spike ECD, RBD, and NC p(EC50) values were used (Emmenegger et al., 2020). Thus, the p(EC50) values have been re-used in the current manuscript. The aPL autoantibodies were measured on exactly the same samples. We have tried to improve the explanation of this in the text, see lines 300-301.

      1. The novel statistical modelling design is interesting. However, as there are concerns about the data put into the modelling, the validity of the conclusions is debatable.

      We thank the reviewer for being interested in the statistical model we used. Linear regression analysis belongs to the standard equipment when performing epidemiological analyses (see e.g., Szklo, Nieto, Epidemiology: Beyond the Basics). Here, we have employed a linear mixed-effects model to infer changes in the predictive power of fixed and random variables (e.g. SARS-CoV-2 IgG levels, disease severity, age, sex, days post onset of disease manifestation), to determine which of these variables reliably predict an outcome (e.g. PT aPL levels), and in what combination.

      We recognised that the manuscript would benefit from a more thorough explanation of the model and how it helps to evaluate the validity of the data. We have therefore added lines 317-334 in the manuscript.

      All authors are appreciative of the reviewer’s critique. In the light of the answers we provided, we are convinced about our conclusions, based on the data and our dataset. We hope that, with our responses, we have adequately addressed the concerns raised by the reviewer.

      Reviewer #2 (Significance):

      See above.

      Reviewer #3 (Evidence, reproducibility and clarity):

      It is being recognized that SARS-CoV-2 infection leads to acquired thrombophilia with increased arteriovenous thrombosis and endothelial injury and organ damage. This has multiple mechanisms, including the hypercoagulable state with platelet activation, endothelial dysfunction, increased circulating leukocytes, cytokines and fibrinogen, but the acquired thrombophilia could also be due to acquired APS in these patients. In this study, Emmenegger et al. evaluated aPL antibody responses in SARS-CoV-2 infected individuals in connection with antibodies against the SARS-CoV-2 components and found that the strength of the antibody response against SARS-CoV-2 proteins is associated with PT IgM aPL antibody.

      Reviewer #3 (Significance):

      This is overall an interesting and thought-provoking study, as it may explain the development of thrombophilia after SARS-CoV-2 vaccination. While the study provides a possible association of the development of antibodies against SARS-CoV-2 infection and aPL, it does not go to molecular details about the homology between anti-SARS-CoV-2 antibodies and aPL. Therefore, the study remains an association study.

      First of all, we would like to thank the reviewer for the careful evaluation of our work. We are fully aware of the descriptive nature of our work. Thanks to the suggestion of the reviewer (see below), we have aimed to go one step further towards a more functional/mechanistic description.

      It is not surprising that they found a difference in IgM rather than IgG as IgM development is an early response.

      The overall conclusion is supported by the rigorous statistical analyses, yet the study remains a correlative and association study.

      Significance: Thrombophilia associated with SARS-CoV-2 may be due to immunity against SARS-CoV-2 rather than a pure cytokine response.

      Furthermore, they did not characterize the PT IgM aPL to find which part could be immunogenic or epitope similarity with anti-SARS-CoV-2 antibodies. Identification of these epitopes is crucial for further understanding of the antibody development and further intervention.

      Existing literature does not connect with antibody responses against SARS-CoV-2.

      Could the authors provide some molecular epitope analysis of IgM aPL and anti-SARS-CoV-2 antibodies? Even computational analysis will improve the paper tremendously.

      We thank the reviewer for coming up with this idea. Clearly, the presence of cross-reactive IgM antibodies to human prothrombin, triggered against the SARS-CoV-2 Spike protein, would be a direct and simple explanation for our observation. We have put efforts into analysing epitopes of SARS-CoV-2 Spike protein and prothrombin (see lines 374-390 in the manuscript and Fig. 4). We conclude there is very limited similarity, and that the mechanism is most likely indirect.
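
      The reply does not describe how the epitope comparison was carried out. As a rough illustration of what a purely computational similarity screen could look like, the sketch below slides a fixed-length window over two amino-acid sequences and reports the best ungapped match. The function name, the window length and both toy peptides are hypothetical and are not taken from the manuscript; a real analysis would use the full Spike and prothrombin sequences and a proper alignment or epitope-prediction tool.

```python
def best_window_identity(seq_a, seq_b, window=8):
    """Return (matches, pos_a, pos_b) for the best ungapped window pair.

    `matches` is the number of identical residues between the two windows;
    the positions are 0-based start indices. Returns (0, -1, -1) if either
    sequence is shorter than the window.
    """
    best = (0, -1, -1)
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            matches = sum(x == y for x, y in zip(win_a, win_b))
            if matches > best[0]:
                best = (matches, i, j)
    return best


# Toy peptides (hypothetical, NOT Spike or prothrombin sequences).
toy_a = "MKTAYIAKQRQISFVK"
toy_b = "GGSAYIAKQWWPLNDE"
result = best_window_identity(toy_a, toy_b, window=6)
print(result)
```

      A low best-window identity across all windows would be consistent with "very limited similarity", although linear identity alone cannot rule out conformational cross-reactivity.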

      There is no ethical concern.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      It is being recognized that SARS-CoV-2 infection leads to acquired thrombophilia with increased arteriovenous thrombosis and endothelial injury and organ damage. This has multiple mechanisms, including the hypercoagulable state with platelet activation, endothelial dysfunction, increased circulating leukocytes, cytokines and fibrinogen, but the acquired thrombophilia could also be due to acquired APS in these patients. In this study, Emmenegger et al. evaluated aPL antibody responses in SARS-CoV-2 infected individuals in connection with antibodies against the SARS-CoV-2 components and found that the strength of the antibody response against SARS-CoV-2 proteins is associated with PT IgM aPL antibody.

      Significance

      This is overall an interesting and thought-provoking study, as it may explain the development of thrombophilia after SARS-CoV-2 vaccination. While the study provides a possible association of the development of antibodies against SARS-CoV-2 infection and aPL, it does not go to molecular details about the homology between anti-SARS-CoV-2 antibodies and aPL. Therefore, the study remains an association study.

      It is not surprising that they found a difference in IgM rather than IgG as IgM development is an early response.

      The overall conclusion is supported by the rigorous statistical analyses, yet the study remains a correlative and association study.

      Significance: Thrombophilia associated with SARS-CoV-2 may be due to immunity against SARS-CoV-2 rather than a pure cytokine response.

      Furthermore, they did not characterize the PT IgM aPL to find which part could be immunogenic or epitope similarity with anti-SARS-CoV-2 antibodies. Identification of these epitopes is crucial for further understanding of the antibody development and further intervention.

      Existing literature does not connect with antibody responses against SARS-CoV-2.

      Could the authors provide some molecular epitope analysis of IgM aPL and anti-SARS-CoV-2 antibodies? Even computational analysis will improve the paper tremendously.

      There is no ethical concern.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      This paper attempts to address a current, clinically relevant question utilizing novel statistical modeling. The authors comprehensively assessed the presence of criteria and non-criteria aPL in a heterogeneous cohort of 75 COVID patients and 20 non-infected controls. They found 66% of COVID patients had positive aPL and demonstrated a correlation between aPL and anti-SARS-CoV-2. However, I have several major concerns:

      1. The cohort is extremely heterogeneous. COVID-19 samples that were used included hospitalized patients and those who had COVID more than 2 months ago and were convalesced (29% of samples). Severity of disease does influence autoreactivity and the presence of autoantibodies. The prevalence of autoantibodies among patients who are acutely ill will be much different than those who are convalesced. I think it would be prudent to assess the presence and correlation of aPL among those two groups separately.

      2. Sampling of the patients is concerning, 35% are plasma and 65% are serum. It is undesirable to put data from plasma and serum together to perform analysis.

      3. LIA based assays were used to assess the presence of aPL and results were reported in OD rather than standardized units. While the same group demonstrated a positive correlation in the past between LIA OD and internationally accepted ELISA-based aPL assays, the validity and clinical utility of these LIA assays still require further evaluation. Furthermore, OD>50 was used as a positive cut-off. How this cut-off was determined and how it relates to internationally accepted positive aPL cut-offs (99th percentile or greater than 40) remains unclear.

      4. While the authors attempted to evaluate the presence of both IgG and IgM aPL in COVID patients, only 65% of samples were tested for both IgG and IgM aPL.

      5. 26 patients had anti-SARS-CoV-2 data already available. Whether those were tested on the same samples and at the same time points as aPL is not clear.

      6. The novel statistical modelling design is interesting. However, as there are concerns about the data put into the modelling, the validity of the conclusions is debatable.

      Significance

      See above.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Full Revision

      Manuscript number: RC-2021-00785

      Corresponding author: Christian G. Specht

      1. General Statements

      Dear Editor,

      We greatly appreciate the reviewers’ constructive comments on our manuscript ‘Identification of a stereotypic molecular arrangement of glycine receptors at native spinal cord synapses’. We were particularly pleased that all four reviewers agreed that our data yield new insights into the structure of inhibitory glycinergic synapses, and represent both a technical and conceptual advance in the field of synaptic neuroscience.

      The reviewers have consistently raised one main criticism, namely the use of endogenously expressed GlyRs tagged with the fluorescent protein mEos4b, which could potentially have an impact on receptor expression, trafficking and function. We have addressed this point by performing whole-cell recordings of GlyR currents in cultured neurons that show that glycinergic transmission and therefore function is preserved. We have also addressed all other comments of the reviewers in the revised manuscript, including a thorough revision of the text and the addition of new data and figures as detailed in the point-by-point response.

      Point-by-point description of the revisions

      Reviewer 1:

      Summary:

      In this manuscript Maynard et al describe a newly generated knockin mouse to study the endogenous distribution of Gly receptors in the spinal cord. Using quantitative confocal imaging and SMLM the distribution and levels of GlyRs at spinal cord synapses is compared between dorsal and ventral horn. They found that levels of synaptic GlyR are higher in dorsal than ventral spinal cord synapses. Nevertheless, the ratio to gephyrin seems constant, except for synapses in superficial layers of the dorsal horn, where gephyrin levels exceeded the levels of GlyRs. There are also fewer, but larger synapses in the ventral horn than in the dorsal horn. These findings are further corroborated by an SR-CLEM approach. Furthermore, it is shown that in a mouse model for hyperekplexia GlyR levels are lower, but still enriched at synapses, and the dorsal-ventral gradient in GlyR expression was maintained. The difference in size of ventral and dorsal synapses observed in WT animals was also lost in the oscillator mouse, suggesting that particularly the ventral synapses are affected. Despite these differences, the density of GlyRs per synapse remained similar.

      Major comments:

      Line 113: "labeling the β-subunit has proven difficult". This statement is unclear and it would be informative for readers to grasp what exactly has been difficult, and why the approach described here overcomes that? Related to that, the authors state "KI animals reach adulthood and display no overt phenotype, suggesting that the presence of the N-terminal fluorophore does not affect receptor expression and function". That is indeed reassuring, but it does not exclude that receptor numbers, function and distribution are altered. As it seems there is no prior literature on tagging the beta subunit, additional evidence that the tag does not interfere with receptor trafficking or functioning would be desirable.

      We have clarified why it has been difficult to label the GlyR beta subunit until now, lines 113-115 “To date, labeling of GlyRβ in situ using immunocytochemistry has proven difficult due to a lack of reliable antibodies that recognize the native β-subunit (only antibodies for Western blotting recognizing the denatured protein are available), which has severely limited the study of the receptor.” Hence it was important to us to generate this knock-in mouse in order to study the endogenous GlyR at synapses, which is the least well studied receptor mediating fast synaptic transmission.

      The reviewer makes an important point regarding the labeling of the GlyRβ-subunit with a fluorescent protein that has also been raised by the other reviewers. We have now verified receptor function by patch clamp recordings of glycine currents in whole-cell configuration in spinal cord neuron cultures from the mEos4b KI mouse (new Supplementary Fig. S2C). At saturating glycine concentrations of 300 μM we found no difference in chloride influx between mEos4b KI and WT mice. Since glycine concentrations in the synaptic cleft are in the millimolar range during synaptic transmission, these data strongly suggest that glycinergic transmission is not affected by the presence of the mEos4b under physiological conditions, despite a minor shift in the EC50.

      There are several other strong arguments that suggest that mEos4b-GlyRβ expression, subcellular localization and function are the same as those of the native subunit. Firstly, the mEos4b sequence was inserted after the signal peptide and before the beginning of the coding sequence of the mature β-subunit (Fig. S1). Since the mEos4b sequence does not interrupt the coding sequence it is less likely to affect the receptor conformation. Secondly, we did not notice any behavioural phenotypes in animals carrying the GlrbEos allele. At the time of weaning, the genotypes of the pups corresponded to the expected Mendelian frequency (new Fig. S2A). Moreover, we did not observe a reduction in life expectancy of GlrbEos/Eos animals (new Fig. S2B), demonstrating that the mEos4b-GlyRβ does not cause pathology in older animals.
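
      As an illustration of how agreement with the expected Mendelian frequency can be checked, the sketch below runs a chi-square goodness-of-fit test against a 1:2:1 ratio. It assumes a heterozygous x heterozygous cross, which the reply does not state explicitly, and the genotype counts are hypothetical placeholders rather than the data shown in Fig. S2A.

```python
def mendelian_chi_square(observed, ratio=(1, 2, 1)):
    """Chi-square statistic for observed genotype counts vs. an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of the chi-square distribution, 2 degrees of freedom, alpha = 0.05.
CHI2_CRIT_DF2_ALPHA_05 = 5.991

# Hypothetical counts (wild type, heterozygous, homozygous KI); not study data.
counts = (24, 52, 24)
stat = mendelian_chi_square(counts)
consistent = stat < CHI2_CRIT_DF2_ALPHA_05
print(f"chi2 = {stat:.3f}; consistent with 1:2:1 at alpha = 0.05: {consistent}")
```

      A statistic below the critical value means the counts give no evidence of a deviation from the expected ratio, for example through reduced viability of one genotype.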

      Most importantly, our imaging data (Fig. 1-3) provide exhaustive evidence that mEos4b-GlyRβ assembles with GlyR alpha subunits as heteropentameric receptor complexes that are trafficked to the plasma membrane and inserted into the synaptic membrane due to their interaction with the gephyrin scaffold at functional synapses. Using quantitative imaging, we have also shown that homozygous GlrbEos/Eos KI mice have exactly twice the number of receptors at synapses as heterozygous animals, strongly suggesting no interference in receptor trafficking to the plasma membrane and gephyrin binding. As the mEos4b mice were also bred with the oscillator mouse model of hyperekplexia, which is lethal when homozygous, we could further test the combined effect of GlrbEos and GlyRa1spt-ot. The presence of both alleles did not lead to any noticeable phenotypes in heterozygous oscillator mice. On the contrary, both synaptic targeting and the packing density of the receptors were not altered in this model, despite a region-specific reduction in synapse size due to the reduced availability of the intact GlyRα1 subunit.

      We believe that these data overwhelmingly support our conclusion that the presence of the mEos4b tag does not alter the structure and function of the receptor, making this mouse model uniquely suited to study the dynamics and regulation of glycinergic synapses in a quantitative manner and at the molecular level.

      In the Discussion the authors conclude that "Our quantitative SR-CLEM data lend support to the first model, whereby inhibitory PSDs in the spinal cord are composed of sub-domains that shape the distribution of the GlyRs". This conclusion seems however based on one example image in Fig 3G that is not very convincing. The EM image seems to show two clearly separated PSDs opposed by two distinct active zones. So, although this conclusion is of high interest, more support should be given to substantiate this conclusion. More general, these subsynaptic domains (SSDs) are hardly further explored, but seem relevant for transmission, particularly given that the synaptic pool of GlyRs at these synapses is not saturated by single release events. How general are these SSDs at these synapses?

      The representative image in Fig. 3G shows two SSDs within the same postsynaptic site with a continuous presynaptic active zone. It should be noted that the PALM/SRRF images were taken of the entire 2 µm thick slice, whereas the electron micrograph shows only a single 70 nm section. We verified throughout the full 3D stack of serial sections that the presynaptic site remains continuous. We would also like to point out the scale of the image, which shows that the two SSDs are only around 170 nm apart, i.e. spatially very close. Our conclusions are, however, not based on this single image but on the whole dataset. The graph in Fig. 3I shows 3 synapses (out of N = 36), in which the GlyR density at separate SSDs could be quantified, demonstrating that the receptor density is not different between SSDs. The reviewer is correct that we do not further analyse the SSDs beyond their density and the analysis of the segmentation of the postsynaptic sites (Fig. 3E-G). Further work on the functional role of SSDs in synaptic transmission is outside the scope of this manuscript and would indeed merit future study.

      The approach for counting molecules based on the PALM acquisition has been developed in prior publications and seems robust. It would however be worth to present the reader with a bit more background and explain the assumptions of this approach in more detail. Particularly, since counting of mEos4b can be problematic, as there are multiple dark and fluorescent states of this fluorophore that could be influenced by the illumination scheme, see for instance De Zitter et al., Nat Methods 2019. Since the preceding SRRF acquisition already exposes the fluorophore to high and continuous 561-nm laser power this could skew the counting due to unaccounted conversion and perhaps bleaching of mEos4b. In line with this, although throughout the manuscript the term 'absolute copy numbers' is used the reported numbers are at best an estimate based on a number of assumptions. I think the wording 'absolute numbers' is therefore deceiving and should be nuanced.

      We have clarified how the molecule conversion is calculated (Fig. S7 legend), to provide a more complete description of the way in which the values were obtained. Further, we have explained how we calculated the probability of detection. Since the probability of detection accounts for any unconverted or non-functional mEos4b molecules, our molecule counting approach is relatively resistant to potential pre-bleaching of fluorophores. It should be noted that 561 nm illumination had no obvious effect on the non-converted (green) mEos4b fluorophores, as judged by the fact that the intensity of receptor puncta was unaffected by the SRRF recordings. We appreciate the reviewer's point regarding the term ‘absolute copy number’ and we have adjusted our wording throughout the manuscript accordingly.
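
      The role of the detection probability can be sketched as a simple correction: if only a fraction of the tagged subunits is ever detected, the estimated copy number is the detected count divided by that fraction. The function name and the numbers below are hypothetical and only illustrate the arithmetic; the actual calculation is the one described in the Fig. S7 legend.

```python
def estimate_copy_number(detected_molecules, p_detect):
    """Correct a detected molecule count for incomplete detection.

    p_detect is the probability that a given fluorophore is detected at all
    (it absorbs unconverted, non-functional or pre-bleached molecules).
    """
    if not 0 < p_detect <= 1:
        raise ValueError("p_detect must be in the interval (0, 1]")
    return detected_molecules / p_detect


# Hypothetical example: 60 detected molecules, 50% detection probability.
estimated = estimate_copy_number(60, 0.5)
print(estimated)
```

      Because the correction is a single scaling factor, any uncertainty in the detection probability propagates directly into the estimate, which is why such values are best described as estimated rather than absolute copy numbers.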

      Related, most of the quantifications are in estimating the number of receptors, and not so much the distribution within the PSD. The term "molecular arrangement" - also used in the title - might therefore be misleading, there is in fact little characterization of how GlyRs are placed within the PSD. More focused analysis quantifying the distribution of receptors within the PSD and/or SSDs would strengthen the manuscript.

      By estimating the number of receptors and the exact size of synapses, the main conclusion of our study is that receptor density at dorsal and ventral synapses is identical, independent of synapse size, subdomains, or in fact loss of GlyRs in a mouse model of hyperekplexia. This observation clearly relates to how receptors are packed within synapses, and thus describes their molecular arrangement.
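
      The density argument amounts to dividing the estimated receptor number by the synapse area and comparing the result across synapses of different sizes. A minimal sketch, using hypothetical (copy number, area) pairs rather than measured values:

```python
from statistics import fmean, pstdev


def packing_densities(synapses):
    """Receptors per square micrometre for (copies, area_um2) pairs."""
    return [copies / area_um2 for copies, area_um2 in synapses]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / fmean(values)


# Hypothetical synapses of increasing size: (estimated copies, area in um^2).
synapses = [(60, 0.125), (120, 0.25), (240, 0.5)]
densities = packing_densities(synapses)
cv = coefficient_of_variation(densities)
print(densities, cv)
```

      In this toy case the copy number scales with area, so the density is the same for all three synapses and the coefficient of variation is zero; real data would show some scatter around a common density.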

      The reported N is confusing and makes it hard to judge the reproducibility of the data. Sometimes it refers to number of images, sometimes number of synapses, but it is unclear from how many experiments these are drawn. This should be reported more completely (number of animals should be reported at least) and consistently. In figure 1, the N numbers (N=3-5 images) are particularly low and question how consistent these findings are across multiple animals.

      We have clarified the N in the figure legends, to reflect the full size of the datasets that have been analysed.

      The levels of mRFP-Gephyrin seem to differ between the different mouse lines, is this a significant difference?

      No significant differences in mRFP-gephyrin levels were found in animals with different mEos4b-GlyRβ genotypes (Fig. 1B). However, expression of mRFP-gephyrin in heterozygous animals is 50% of that in homozygous mRFP-gephyrin KI animals (not shown).

      The ICQ analysis for co-localization is hardly explained. How do we interpret this parameter? What does an average value of ~0.3 mean? A comparison with sets of proteins that do not overlap as a negative control would strengthen the conclusion.

      We have clarified that an ICQ value of 0.3 is indicative of a very high spatial correlation between pixels, and provided a corresponding reference for ICQ analysis (lines 209-210). We would like to point out that the ICQ ranges from -0.5 to 0.5, meaning that a value of 0.3 comes close to complete correlation.
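
      For readers unfamiliar with the measure, the intensity correlation quotient is commonly defined as the fraction of pixels in which both channels deviate from their respective means in the same direction, minus 0.5. The sketch below follows that definition for two flattened pixel lists; the function name and the toy intensities are illustrative assumptions, and the handling of pixels whose product is exactly zero (counted as not positive here) is one possible convention.

```python
from statistics import fmean


def icq(channel_a, channel_b):
    """Intensity correlation quotient for two equally sized pixel lists.

    Returns a value between -0.5 (segregated staining) and +0.5
    (fully dependent staining); values near 0 indicate random staining.
    """
    if len(channel_a) != len(channel_b) or not channel_a:
        raise ValueError("channels must be non-empty and of equal length")
    mean_a = fmean(channel_a)
    mean_b = fmean(channel_b)
    positive = sum(
        1
        for a, b in zip(channel_a, channel_b)
        if (a - mean_a) * (b - mean_b) > 0
    )
    return positive / len(channel_a) - 0.5


# Toy intensities (hypothetical): identical and mirrored channels.
same = icq([1, 2, 3, 4], [1, 2, 3, 4])
opposite = icq([1, 2, 3, 4], [4, 3, 2, 1])
print(same, opposite)
```

      With this convention a value of 0.3 means that 80% of the pixels vary in the same direction in both channels.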

      Minor comments:

      "Very little fluorescence was detected in the forebrain, despite the high reported expression of the Glrb transcript". Can the authors expand on this? What would explain this discrepancy?

      We have clarified the text to include “suggesting that protein levels are controlled by post-translational mechanisms in a region-specific manner, as previously proposed (Weltzien et al., 2012)” (Lines 152-153). The reason for this discrepancy is not known. However, the distribution of mEos4b expression throughout the brain is as expected, based on the literature.

      What region is quantified in Fig 1B? Is it the same region in all conditions? This should be specified more clearly as the manuscript presents a clear gradient in expression levels in the spinal cord and thus the location will influence the intensity measurements.

      We have explained in the text that this is the region at the centre of the ventral horn identified by the white square in Fig. 1A, and that the same region was analysed for all images across all animals. Page 5, lines 160-161 “The same region of the ventral horn, indicated by the white square in Fig. 1A was taken for quantification of mEos4b-GlyRβ and mRFP-gephyrin expression in all conditions.”

      The labeling approach does not differentiate between surface and internal receptors, this should be made more explicit in the text.

      Whilst this is correct, we have only analysed mEos4b-positive synapses that had corresponding gephyrin clusters, meaning synapses where receptors are located in the postsynaptic membrane. Indeed, we found that all mEos4b clusters imaged colocalised with mRFP-gephyrin clusters. We have adjusted the text accordingly, page 6, lines 205-206 “All mEos4b-GlyR clusters closely matched the mRFP-gephyrin clusters, confirming the localization of the receptors in the postsynaptic membrane.”

      Significance:

      The presented data are interesting and the experiments are technically advanced and carefully performed. Particularly the SR-CLEM approach is technically advanced. The datasets present a quantitatively detailed characterization of spinal cord synapses and will be of interest for researchers working in the field of spinal cord circuitry, as well as super-resolution imaging. The conceptual advance for the field is however somewhat limited. It seems that the presented data confirm the general notion that receptor numbers and synapse size are highly correlated. So, although this manuscript describes very interesting observations, in its present form the manuscript does not provide any new mechanistic insight or significant advance in our understanding of how these synapses operate.

      We thank the reviewer for his/her comments relating to the technical quality of our manuscript. However, we think that the statement “The conceptual advance for the field is however somewhat limited” is unfair, as this level of detail on the organisation of inhibitory synapses at the molecular scale has never been achieved before, as pointed out by the other reviewers, and especially not with regard to different ages of animals and a disease model that directly affects receptor numbers in a region-specific manner. We therefore believe that our study will have a substantial impact within the fields of synaptic neuroscience as well as quantitative neurobiology.

      Referee cross-commenting:

      I agree with the other reviewers that this study is technically advanced, but I remain critical towards the extent of conceptual advancement this study brings and there are some important concerns with the presented data that need to be addressed. Nevertheless, indeed many of these concerns can be addressed without additional experiments. As pointed out also by other reviewers additional validation that the fusion proteins are not disrupting their function or organization would be important.

      Reviewer 2:

      Summary:

      Maynard et al. investigate (inhibitory) glycinergic synapses in mouse spinal cord, which regulate motor and sensory processes. The authors analyse the molecular architecture and ultra-structure of these synapses in native spinal cord tissue using quantitative super-resolution correlative light and electron microscopy. The major finding is that GlyRs exhibit equal receptor-scaffold occupancy and constant absolute packing densities across the spinal cord and throughout adulthood, although ventral and dorsal inhibitory synapses differ in size. Moreover, what the authors call a "stereotypic arrangement" is even maintained in a hypomorphic mutant (oscillator), which is deficient in the adult GlyR α1 subunit.

      Specific comments:

      To reach their conclusions the authors generate two knock-in mouse lines, one with mEOS-labelled GlyR β-subunit and one with mRFP-labelled gephyrin, a subsynaptic scaffolding protein of inhibitory synapses, which are subsequently crossed. Both changes are not unproblematic, as mutations in the N-terminal end of the GlyR β subunit polypeptide chain might interfere with the assembly of functional GlyR (consisting of α and β subunits) and mutations at the N-terminal end of gephyrin interfere with its homo-oligomerization into higher molecular assemblies.

      We have demonstrated that the function of mEos4b-GlyRβ does not differ significantly from WT GlyRs, by carrying out electrophysiological experiments (new Fig. S2C). For a detailed response, please see the response to the first comment of reviewer 1. The mRFP-gephyrin KI strain has been validated and published previously (see Machado et al., 2011, J Neurosci; Specht et al. 2013, Neuron) and was not specifically generated for this study. The experiments with the oscillator mutant did not include the mRFP-gephyrin allele. In these experiments, the wildtype GlrbEos/Eos (Fig. 4, 5) behaves exactly as the GlrbEos/Eos in the double knock-in (Fig. 1, 2), further validating the mouse models used.

      However, in this experimental design both labelled proteins reach postsynaptic membrane specialisations. In case of the β-subunit quantitative evaluation confirms that heterozygous animals contain only half of the labelled protein as homozygous, which is an indication but not a proof that the correct stoichiometry of adult GlyR is maintained. Likewise, mRFP-labelled gephyrin assembles with WT-gephyrin in subsynaptic domains, but it is not clear, if the size and density of the synapses is changed by the knock-in procedure as compared to WT-synapses.

      An effect of the mRFP tag on gephyrin clustering can be ruled out, since we observed no difference in synapse size and receptor density in GlrbEos/Eos animals with (Fig. 1, 2) and without the GphnmRFP allele (Fig. 4, 5, oscillator wild-type controls). Similarly, the synaptic mEos4b-GlyRβ levels in heterozygous animals were precisely half those of the homozygous animals, strongly suggesting that the expression and trafficking of the tagged receptor subunit is unchanged, as the reviewer acknowledges. In the absence of any obvious behavioural and/or functional phenotypes (Fig. S2) this KI model is, in our view, an exceptional tool to study GlyRs expressed at endogenous levels in a cell-type specific manner.

      Accepting these constraints, which to the knowledge of this reviewer have never been addressed to satisfaction, the authors provide a technically excellent, comprehensive analysis of glycinergic synapses in the spinal cord of double knock-in mice. Therefore, it should be stated in the title, that the investigations were performed with double knock-in instead of "native" spinal cord. Text and figures are clear and accurate and represent the state of the art.

      We thank the reviewer for the positive comments regarding the techniques used in the study, and the clarity of the text and figures. We have adjusted the title as requested.

      Finally, the reviewer would like to raise a minor point: the term postsynaptic density is derived from electron microscopical studies of synapses, where asymmetrical synapses display a "postsynaptic density" but symmetrical synapses do not. The latter were identified as inhibitory synapses and therefore, by definition, inhibitory synapses do not have a postsynaptic density, but rather a postsynaptic membrane specialisation. The use of the term "postsynaptic density" should, therefore, be restricted to excitatory synapses.

      We are conscious of the importance of correct definitions and have revised the terminology, referring to “postsynaptic sites”, “postsynaptic domains”, and “postsynaptic specializations” as appropriate throughout the manuscript.

      Significance:

      The authors provide a state of the art advanced light and electron microscopical analysis of glycinergic synapses in the mouse spinal cord. They suggest a robust "stereotypical" mechanism in place, which guarantees a fixed stoichiometry of relevant components, which is even maintained in a hypomorphic mutant, which is believed to represent a mouse model of human hyperekplexia (startle disease).

      Referee cross-commenting:

      I would like to corroborate the arguments of the previous reviewer: it is not clear to which extent the fusion proteins influence the measurements, which are technically very advanced and well done, however. The authors do definitely not investigate "native spinal cord" as stated in the title.

      The argument concerning fusion proteins must be taken especially seriously as the fusions were introduced in regions known to be responsible for assembly of glycine receptors and oligomerization of gephyrin.

      We have verified the receptor function with electrophysiological recordings and clarified exactly where the fluorescent protein was inserted (see reviewer 1 response). Given the similarity in synapse size, fluorescence intensities and molecule densities observed in neurons expressing different combinations of tagged and native receptors and scaffold proteins, we strongly believe that all animal models used are well suited to the experimental aims of our study.

      Reviewer 3:

      Summary:

      Glycinergic synapses are the least well understood of synapses that mediate fast synaptic transmission. The manuscript by Maynard et al. adds new information about the structural aspects of these synapses, using PALM and EM imaging of spinal cord synapses from mice at 2 and 10 months. The authors created a knock-in mouse that expresses a tagged GlyRbeta subunit, allowing synaptic localization of glycine receptors; all synaptically localized glycine receptors are thought to require the beta subunit to be tethered by gephyrin. The authors compare synaptic profiles from: 2 month old vs. 10 month old mice; dorsal vs. ventral horn; and GlyRα1-reduced vs. wild type mice. Strikingly, they find a tight relationship across all of these variables between glycine receptor puncta and gephyrin puncta, as well as an apparently constant "packing density" of glycine receptors. They conclude that synaptic extent is likely to be the most important determinant of synaptic strength, as the density of receptors within the postsynaptic density is constant. These results use cutting-edge imaging and are analyzed with care, and add new information to our understanding of these relatively less well characterized synapses.

      Major comments:

      The key conclusions are convincing and the claims appear solid. Additional experiments are not needed to support these claims. The data and the methods are largely presented in such a way that they can be reproduced, although there are minor suggestions for improvement below.

      We thank the reviewer for his/her positive comments.

      Minor comments:

      Do the authors have any comment on the requirement during, e.g. LTP, for insertion of a gephyrin-GlyR unit? The lead author has speculated that gephyrin creates "slots" for GlyRs; yet apparently each slot is already filled in the snapshots taken here. How might postsynaptic LTP occur (Kandler group, Kauer group papers)?

      Given the reciprocity of GlyR and gephyrin clustering at synapses, the occupancy of binding sites (and in turn the number of available ‘slots’) is dependent on the strength of receptor-scaffold interactions, as discussed previously (Specht 2020, Neuropharmacol). In this study we demonstrate that the density of GlyRs at synapses is constant, which implies that the receptor occupancy is also the same, with the possible exception of mixed inhibitory synapses in the superficial dorsal horn that contain a majority of GABAARs. The PALM/SRRF data are represented as rendered image reconstructions and not as pointillist representations, and the detection of unoccupied binding sites is below the spatial resolution of our approach. However, the high spatial correlation of the signal intensities (ICQ ≈ 0.3) suggests that receptor occupancy is equal between and within synapses. It has previously been established that there are more scaffold proteins than receptors at synapses (Specht et al. 2013, Neuron; Patrizio et al. 2017, Sci Rep). Based on these studies we report that approximately half the gephyrin binding sites are occupied by receptors (lines 262-655). We have also expanded the discussion, describing how shape and size of synapses may affect synaptic transmission, as well as the possible role of receptor-gephyrin interactions in synaptic plasticity at glycinergic synapses.
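      For readers unfamiliar with the ICQ value quoted above, the following is a minimal Python sketch, assuming that ICQ refers to the intensity correlation quotient as it is commonly defined: the fraction of pixels in which both channels deviate from their respective means in the same direction, minus 0.5. The function name `icq` and the optional `mask` argument are illustrative choices and are not taken from the manuscript.

```python
import numpy as np

def icq(channel_a, channel_b, mask=None):
    """Intensity correlation quotient of two equally sized images.

    Values lie between -0.5 (segregated staining) and +0.5 (fully
    dependent staining); values near 0 indicate random overlap.
    The name and signature are hypothetical, for illustration only.
    """
    a = np.asarray(channel_a, dtype=float).ravel()
    b = np.asarray(channel_b, dtype=float).ravel()
    if mask is not None:
        # Restrict the analysis to pixels inside a region of interest.
        keep = np.asarray(mask, dtype=bool).ravel()
        a, b = a[keep], b[keep]
    # Product of the differences from the mean (PDM), pixel by pixel.
    pdm = (a - a.mean()) * (b - b.mean())
    # Fraction of pixels with a positive PDM, shifted so that 0 = random.
    return np.count_nonzero(pdm > 0) / pdm.size - 0.5
```

      In such a scheme, `channel_a` and `channel_b` would be the two reconstructed super-resolution images of the same synapse, and `mask` a segmentation of the synaptic area.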

      It would be very interesting in the discussion to contrast the present observations with what is known about excitatory synapses (NMDA and AMPAR distributions) and GABAergic synapses. Are the authors at all surprised that receptor packing is constant across conditions? Can the authors speculate on how non-gephyrin binding receptors (homomeric alpha receptors, which are found in recordings) may function and be tethered to the membrane?

      We have included additional information about receptor numbers and distributions at excitatory (lines 428-438) and GABAergic (lines 389-393) synapses in the discussion. So far, homomeric GlyRs composed of alpha subunits have been found to be exclusively extrasynaptic. As stated on page 4, lines 111-112, the beta subunit is required for binding of the GlyR to gephyrin and subsequent anchoring at the synapse. Previous studies have shown that exocytosis of receptors occurs at extrasynaptic sites, followed by lateral diffusion to synapses. Homomeric GlyRs are therefore most likely targeted to the extrasynaptic plasma membrane, where they remain due to the lack of the beta subunit.

      Figure S1. It would be most helpful to quantify this; at the least to include an atlas-like drawing to allow identification of the structures illustrated and containing Glrb; better yet would be quantification of staining in regions where this is strongest.

      We have added an atlas indicating the different brain regions expressing mEos4b-GlyRb protein as a new Supplementary Fig. S3. The regional expression pattern agrees with the available literature about protein expression of the GlyRb subunit in different brain regions and hence provides further evidence that mEos4b-GlyRb is expressed like the native receptor. Due to the relatively low resolution of the tiled image no accurate quantification was possible. We have however added higher magnification confocal images of representative brain regions expressing varying amounts of GlyRb.

      The fact that the lower panel in B is labeled as +/+ across all groups is initially confusing; perhaps relabel as mEos4 -/-, +/- and +/+?

      We assume that the reviewer is referring to Fig. 1B. The genotype of both the GlrbEos and the GphnmRFP allele is now indicated on the x-axes, and the legend has been modified to clarify that all these animals were homozygous for GphnmRFP/mRFP. We have strived to remain consistent throughout the manuscript when referring to genotypes and protein levels.

      Do gephyrin levels drop in WT mice as well as in the mEos4b-GlyRb mouse between 2 and 10 months? Do the authors have any thoughts on this (Supp figure S2)?

      We found no differences in gephyrin levels between 2 and 10 months. Fig. S2 (now Fig. S4C) shows the number of synaptic gephyrin clusters, which was the same at different ages and genotypes.

      Significance:

      Glycinergic synapses are the least well understood of synapses that mediate fast synaptic transmission. The manuscript by Maynard et al. adds new information about the structural aspects of these synapses, using PALM and EM imaging of spinal cord synapses from mice at 2 and 10 months. The authors created a knock-in mouse that expresses a tagged GlyRbeta subunit, allowing synaptic localization of glycine receptors.

      This will be of interest to those studying inhibitory synapses, and more broadly to synaptic morphologists, physiologists and imagers for comparison with other synapse types.

      My own expertise is NOT in these techniques, but I am a synaptic physiologist with a standing interest in glycinergic synapses; thus I am not providing serious technical critiques.

      Referee cross-commenting:

      Hi all, I agree with the other two reviewers, and do not have anything else to add.

      Reviewer 4:

      Summary:

      The authors used a correlative approach and combined photo-activated localization microscopy with electron microscopy to characterise Glycinergic synapses in spinal cord tissue. Some of the major findings are:

      • The receptor-scaffold occupancy and packing densities of glycinergic synapses in different regions of the spinal cord are the same.
      • Gephyrin clusters in the spinal cord are composed of sub-domains that shape the GlyR clusters.
      • Ventral horn synapses are generally larger, more complex (containing a number of gaps) and contain more GlyRs.
      • In a mouse model of Hyperekplexia, the number of GlyRs is reduced resulting in smaller synapses in the ventral spinal cord.

      Major comments:

      Are the key conclusions convincing? Yes

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? No

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. No

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. N/A

      Are the data and the methods presented in such a way that they can be reproduced? Yes

      Are the experiments adequately replicated and statistical analysis adequate? Yes

      Minor comments:

      Specific experimental issues that are easily addressable. Please see below

      Are prior studies referenced appropriately? Yes

      Are the text and figures clear and accurate? Yes

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions? Please see below.

      As the authors pointed out, fusing mEos to the extrasynaptic terminal of GlyRb has been difficult and therefore this construct would benefit the larger scientific community. Fig 1C is a nice imaging control for expression efficiency, however, it is in stark contrast with the lack of functional control. Do authors have any electrophysiological evidence showing that the insertion of mEos4b doesn't modulate channel function? I would assume that the construct would be tested in cell lines before the KI mouse line was created. Was any functional analysis done? If yes, it would be very useful to show it. I do appreciate that the authors used a standard insertion between the 4th and 5th AA in the extracellular domain, which in most cases does not abolish channel function. Given the lack of an obvious phenotype in the KI mouse model, I believe that this is also the case here. However, I disagree with the statement in lines 120-121: "the presence of the N-terminal fluorophore does not affect receptor expression and function." I believe that if there are no electrophysiological measurements of GlyR function, this statement remains speculative. As the authors pointed out in their previous publication: "receptor function and gephyrin binding are not independent properties. Instead, we think that conformational changes triggered at extracellular or intracellular protein domains have downstream consequences on channel opening as well as receptor clustering." In line with this, my concern is that the modulation of channel function by mEos4b could result in an altered cluster size at synapses. There is a large body of literature showing that just one missense mutation in the extracellular domain of ion channel subunits can lead to synaptopathies because the channel function gets modulated, and there is an abundance of similar examples involving mutations of GlyR and GABAAR subunits. 
      In my view, comparing the function of GlyRs incorporating wt-GlyRb and mEos4b-GlyRb subunits is important for the correct interpretation of the main findings of this work and would strengthen the publication.

      As the reviewer points out, the insertion of the mEos4b sequence was considered carefully in order to have the least impact on receptor function. GlyR channelopathies are often caused by point mutations within the coding sequence, which is not the case in the GlrbEos allele. Instead, the mEos4b sequence was inserted after the signal peptide of GlyRb, duplicating several amino acid residues in order to maintain the correct cleavage site and N-terminus of the mature receptor, and to not interrupt the GlyRb coding sequence (Fig. S1B). In order to verify that the mEos4b-tag does not affect GlyR function, we have now carried out electrophysiological experiments (new Fig. 2C). For a detailed description please see the response to the first comment of reviewer 1.

      Line 189: Are the authors making conclusions based on intensity comparison of red mEos4b and mRFP? The title of this section implies that the red form of mEos was compared to mRFP(?) But mEos converts from green to red only partially. Was the probability for conversion taken into account at this point? Please clarify which version of mEos was compared to mRFP.

      In line 189 (now 218) we compared the intensities of mRFP-gephyrin with those of converted (red) mEos4b in SRRF / PALM super-resolution images of the synapses (Fig. 2D). Since the absolute intensities are altered by the process of image reconstruction, the probability that mEos4b is photoconverted does not have to be taken into account. The constant ratio of the SRRF and PALM image intensities confirms the data in Fig. 1D showing that GlyR and gephyrin amounts are highly correlated throughout the spinal cord (with the exception of the superficial layers of the dorsal horn). We have clarified in the text that this analysis was carried out on reconstructed SRRF images of mRFP-gephyrin and PALM images of mEos4, line 202.

      Line 192: Please clarify how the density threshold was calculated/determined? This is important for the replication of the experiments, and it also has implications for the calculated probability of detection of mEos4b. I am not aware that this probability was calculated before for mEos4b and therefore other researchers may decide to rely on the value calculated here.

      We have now clarified in more detail how the probability of detection was calculated (new Supplementary Fig. S7 legend).

      In Fig. 2 Gephyrin clusters look consistently smaller than GlyR clusters, which is inconsistent with the published work. I assume that the difference in size is a consequence of different image reconstruction methods(?) However, I would assume that SRRF would have lower resolution than your PALM measurements and that would result in wider Gephyrin clusters. Could you please explain this discrepancy? Also, could you provide an estimate for the image resolution in SRRF and PALM techniques? For SMLM, localization precision would suffice.

      We have provided an estimate of the resolution of the two techniques using Fourier ring correlation, which gave 46 nm for SRRF and 21 nm for PALM. Additionally, we have clarified the discrepancy between reconstruction methods, page 6, lines 194-200: “The spatial resolution was estimated using Fourier ring correlation (FRC), which measures the similarity of two images as a function of spatial frequency by comparing the odd and even frames of the raw image sequence. According to this analysis, the spatial resolution of SRRF was 46 nm and that of PALM 21 nm. It should be noted that the synaptic puncta in the SRRF images appear somewhat smaller and brighter due to differences in the reconstruction methods that result in differences in the dynamic intensity range.”
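      The FRC estimate described above can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names `frc_curve` and `frc_resolution`, the restriction to square images, and the 1/7 cut-off (a commonly used FRC threshold) are assumptions made here for illustration. In practice, `img1` and `img2` would be two reconstructions built from the odd and the even frames of the same raw image sequence.

```python
import numpy as np

def frc_curve(img1, img2):
    """Fourier ring correlation of two square images of equal size.

    Returns (freq, frc): spatial frequency in cycles per pixel and the
    correlation of the two Fourier transforms within each ring.
    """
    img1 = np.asarray(img1, dtype=float)
    img2 = np.asarray(img2, dtype=float)
    if img1.shape != img2.shape or img1.shape[0] != img1.shape[1]:
        raise ValueError("expected two square images of identical shape")
    n = img1.shape[0]
    f1 = np.fft.fftshift(np.fft.fft2(img1))
    f2 = np.fft.fftshift(np.fft.fft2(img2))
    # Ring index = integer distance of each frequency pixel from the centre.
    yy, xx = np.indices((n, n))
    ring = np.hypot(yy - n // 2, xx - n // 2).astype(int).ravel()
    n_rings = n // 2
    keep = ring < n_rings  # ignore the incomplete rings in the corners
    ring = ring[keep]
    cross = np.real(f1 * np.conj(f2)).ravel()[keep]
    power1 = (np.abs(f1) ** 2).ravel()[keep]
    power2 = (np.abs(f2) ** 2).ravel()[keep]
    num = np.bincount(ring, weights=cross, minlength=n_rings)
    denom = np.sqrt(np.bincount(ring, weights=power1, minlength=n_rings)
                    * np.bincount(ring, weights=power2, minlength=n_rings))
    frc = np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)
    freq = np.arange(n_rings) / n
    return freq, frc

def frc_resolution(freq, frc, pixel_size_nm, threshold=1 / 7):
    """Resolution in nm at the first ring where FRC drops below threshold.

    Returns None if the curve never falls below the threshold.
    """
    below = np.nonzero(frc[1:] < threshold)[0]  # skip the zero-frequency ring
    if below.size == 0:
        return None
    return pixel_size_nm / freq[below[0] + 1]
```

      Real FRC curves are usually smoothed before the threshold crossing is read off, so a sketch like this would only give a rough figure.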

      Why is the data in Fig. 5D and E represented as Detections/Synapse instead of GlyRs/Synapse? Could you please re-plot this so that a comparison with Fig. 2H and I is straightforward?

      We have converted the detections to receptor copy numbers as requested (Fig. 5D,E).

      Figure S5C: for P=0.5, P²=0.25. Please correct. Also, I assume that the second graph is what would be observed experimentally for dimers and P=0.5. Please clarify in the figure caption.

      This was a mistake and has been corrected. We have also clarified which parts of the calculations are theoretical and which values were derived from our experimental data. We have provided a more detailed description in the figure legend of Supplementary Fig. S7.
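      The arithmetic behind this point can be checked with a short Python sketch based on a plain binomial model. It assumes that each fluorophore in a complex is detected independently with the same probability; the function names are hypothetical, and the sketch is not the derivation given in Supplementary Fig. S7.

```python
from math import comb

def detection_distribution(n_fluorophores, p_detect):
    """Probability of detecting k = 0..n fluorophores in a complex of n,
    assuming independent detection with probability p_detect each."""
    n, p = n_fluorophores, p_detect
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def observed_distribution(n_fluorophores, p_detect):
    """Distribution of k = 1..n among complexes that are seen at all,
    i.e. renormalised after discarding complexes with zero detections."""
    dist = detection_distribution(n_fluorophores, p_detect)
    visible = 1 - dist[0]
    return [d / visible for d in dist[1:]]

# Dimers with a detection probability of 0.5
print(detection_distribution(2, 0.5))
print(observed_distribution(2, 0.5))
```

      Under these assumptions, a dimer with P = 0.5 shows both fluorophores with probability P² = 0.25, and only a third of the dimers that are visible at all would appear as dimers.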

      Line 606: Please provide a complete derivation of this formula.

      We have provided a full derivation of this formula (new Fig. S7C).

      Significance:

      The work described here seems to be a natural progression of a publication by Patrizio et al., 2017 that came out from the same laboratory. This study uses advanced methodologies in the imaging space to visualise and characterise Glycinergic synapses in spinal cord tissue. The experiments described here are technically demanding as evidenced by the relatively small number of publications describing super-resolution measurements in tissue samples. Even more rare are studies that attempt to do single protein counting in neuronal culture and tissue sections. Therefore, I believe that this work brings significant technical advancement in the field of super-resolution and correlative microscopy. The findings are also highly significant for all fields of neuroscience in which the structure of inhibitory Glycinergic synapse is relevant, ranging from the fundamental understanding of inhibitory synapse function to pathologies involving Glycinergic signalling.

      I have substantial experience in different microscopy methods, including quantitative super-resolution microscopy based on single molecule counting. My background also covers the structure and function of GABAA and Glycine receptors using electrophysiology. I am familiar with the methods used in electron microscopy and the process of creating KI mouse lines, however I don't have hands-on experience in these fields.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #4

      Evidence, reproducibility and clarity

      Summary:

      The authors used a correlative approach and combined photo-activated localization microscopy with electron microscopy to characterise Glycinergic synapses in spinal cord tissue. Some of the major findings are:

      • The receptor-scaffold occupancy and packing densities of glycinergic synapses in different regions of the spinal cord are the same.
      • Gephyrin clusters in the spinal cord are composed of sub-domains that shape the GlyR clusters.
      • Ventral horn synapses are generally larger, more complex (containing a number of gaps) and contain more GlyRs.
      • In a mouse model of Hyperekplexia, the number of GlyRs is reduced resulting in smaller synapses in the ventral spinal cord.

      Major comments:

      • Are the key conclusions convincing? Yes
      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? No
      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. No
      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. N/A
      • Are the data and the methods presented in such a way that they can be reproduced? Yes
      • Are the experiments adequately replicated and statistical analysis adequate? Yes

      Minor comments:

      • Specific experimental issues that are easily addressable. Please see below
      • Are prior studies referenced appropriately? Yes
      • Are the text and figures clear and accurate? Yes
      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? Please see below.
      1. As the authors pointed out, fusing mEos to the extrasynaptic terminal of GlyRb has been difficult and therefore this construct would benefit the larger scientific community. Fig 1C is a nice imaging control for expression efficiency, however, it is in stark contrast with the lack of functional control. Do authors have any electrophysiological evidence showing that the insertion of mEos4b doesn't modulate channel function? I would assume that the construct would be tested in cell lines before the KI mouse line was created. Was any functional analysis done? If yes, it would be very useful to show it. I do appreciate that the authors used a standard insertion between the 4th and 5th AA in the extracellular domain, which in most cases does not abolish channel function. Given the lack of an obvious phenotype in the KI mouse model, I believe that this is also the case here. However, I disagree with the statement in lines 120-121: "the presence of the N-terminal fluorophore does not affect receptor expression and function." I believe that if there are no electrophysiological measurements of GlyR function, this statement remains speculative. As the authors pointed out in their previous publication: "receptor function and gephyrin binding are not independent properties. Instead, we think that conformational changes triggered at extracellular or intracellular protein domains have downstream consequences on channel opening as well as receptor clustering." In line with this, my concern is that the modulation of channel function by mEos4b could result in an altered cluster size at synapses. There is a large body of literature showing that just one missense mutation in the extracellular domain of ion channel subunits can lead to synaptopathies because the channel function gets modulated, and there is an abundance of similar examples involving mutations of GlyR and GABAAR subunits. In my view, comparing the function of GlyRs incorporating wt-GlyRb and mEos4b-GlyRb subunits is important for the correct interpretation of the main findings of this work and would strengthen the publication.
      2. Line 189: Are the authors making conclusions based on intensity comparison of red mEos4b and mRFP? The title of this section implies that the red form of mEos was compared to mRFP(?) But mEos converts from green to red only partially. Was the probability for conversion taken into account at this point? Please clarify which version of mEos was compared to mRFP.
      3. Line 192: Please clarify how the density threshold was calculated/determined? This is important for the replication of the experiments, and it also has implications for the calculated probability of detection of mEos4b. I am not aware that this probability was calculated before for mEos4b and therefore other researchers may decide to rely on the value calculated here.
      4. In Fig. 2 Gephyrin clusters look consistently smaller than GlyR clusters, which is inconsistent with the published work. I assume that the difference in size is a consequence of different image reconstruction methods(?) However, I would assume that SRRF would have lower resolution than your PALM measurements and that would result in wider Gephyrin clusters. Could you please explain this discrepancy? Also, could you provide an estimate for the image resolution in SRRF and PALM techniques? For SMLM, localization precision would suffice.
      5. Why is the data in Fig. 5D and E represented as Detections/Synapse instead of GlyRs/Synapse? Could you please re-plot this so that a comparison with Fig. 2H and I is straightforward?
      6. Figure S5C: for P=0.5, P²=0.25. Please correct. Also, I assume that the second graph is what would be observed experimentally for dimers and P=0.5. Please clarify in the figure caption.
      7. Line 606: Please provide a complete derivation of this formula.

      Significance

      The work described here seems to be a natural progression of a publication by Patrizio et al., 2017 that came out from the same laboratory. This study uses advanced methodologies in the imaging space to visualise and characterise Glycinergic synapses in spinal cord tissue. The experiments described here are technically demanding as evidenced by the relatively small number of publications describing super-resolution measurements in tissue samples. Even more rare are studies that attempt to do single protein counting in neuronal culture and tissue sections. Therefore, I believe that this work brings significant technical advancement in the field of super-resolution and correlative microscopy. The findings are also highly significant for all fields of neuroscience in which the structure of inhibitory Glycinergic synapse is relevant, ranging from the fundamental understanding of inhibitory synapse function to pathologies involving Glycinergic signalling.

      I have substantial experience in different microscopy methods, including quantitative super-resolution microscopy based on single molecule counting. My background also covers the structure and function of GABAA and Glycine receptors using electrophysiology. I am familiar with the methods used in electron microscopy and the process of creating KI mouse lines, however I don't have hands-on experience in these fields.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Glycinergic synapses are the least well understood of synapses that mediate fast synaptic transmission. The manuscript by Maynard et al. adds new information about the structural aspects of these synapses, using PALM and EM imaging of spinal cord synapses from mice at 2 and 10 months. The authors created a knock-in mouse that expresses a tagged GlyRbeta subunit, allowing synaptic localization of glycine receptors; all synaptically localized glycine receptors are thought to require the beta subunit to be tethered by gephyrin. The authors compare synaptic profiles from: 2 month old vs. 10 month old mice; dorsal vs. ventral horn; and GlyR1-reduced vs. wild type mice. Strikingly, they find a tight relationship across all of these variables between glycine receptor puncta and gephyrin puncta, as well as an apparently constant "packing density" of glycine receptors. They conclude that synaptic extent is likely to be the most important determinant of synaptic strength, as the density of receptors within the postsynaptic density is constant. These results use cutting-edge imaging and are analyzed with care, and add new information to our understanding of these relatively less well characterized synapses.

      Major comments:

      The key conclusions are convincing and the claims appear solid. Additional experiments are not needed to support these claims. The data and the methods are largely presented in such a way that they can be reproduced, although there are minor suggestions for improvement below.

      Minor comments:

      Do the authors have any comment on the requirement during, e.g. LTP, for insertion of a gephyrin-GlyR unit? The lead author has speculated that gephyrin creates "slots" for GlyRs; yet apparently each slot is already filled in the snapshots taken here. How might postsynaptic LTP occur (Kandler group, Kauer group papers)?

      It would be very interesting in the discussion to contrast the present observations with what is known about excitatory synapses (NMDA and AMPAR distributions) and GABAergic synapses. Are the authors at all surprised that receptor packing is constant across conditions? Can the authors speculate on how non-gephyrin binding receptors (homomeric alpha receptors, which are found in recordings) may function and be tethered to the membrane?

      Figure S1. It would be most helpful to quantify this; at the least to include an atlas-like drawing to allow identification of the structures illustrated and containing Glrb; better yet would be quantification of staining in regions where this is strongest.

      The fact that the lower panel in B is labeled as +/+ across all groups is initially confusing; perhaps relabel as mEos4 -/-, +/- and +/+?

      Do gephyrin levels drop in WT mice as well as in the mEos4b-GlyRb mouse between 2 and 10 months? Do the authors have any thoughts on this (Supp figure S2)?

      Significance

      Glycinergic synapses are the least well understood of synapses that mediate fast synaptic transmission. The manuscript by Maynard et al. adds new information about the structural aspects of these synapses, using PALM and EM imaging of spinal cord synapses from mice at 2 and 10 months. The authors created a knock-in mouse that expresses a tagged GlyRbeta subunit, allowing synaptic localization of glycine receptors.

      This will be of interest to those studying inhibitory synapses, and more broadly to synaptic morphologists, physiologists and imagers for comparison with other synapse types.

      My own expertise is NOT in these techniques, but I am a synaptic physiologist with a standing interest in glycinergic synapses; thus I am not providing serious technical critiques.

      Referee Cross-commenting

      Hi all, I agree with the other two reviewers, and do not have anything else to add.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Identification of a stereotypic molecular arrangement of glycine receptors at native spinal cord synapses

      Maynard et al. investigate (inhibitory) glycinergic synapses in mouse spinal cord, which regulate motor and sensory processes. The authors analyse the molecular architecture and ultra-structure of these synapses in native spinal cord tissue using quantitative super-resolution correlative light and electron microscopy. The major finding is that GlyRs exhibit equal receptor-scaffold occupancy and constant absolute packing densities across the spinal cord and throughout adulthood, although ventral and dorsal inhibitory synapses differ in size. Moreover, what the authors call a "stereotypic arrangement" is even maintained in a hypomorphic mutant (oscillator), which is deficient in the adult GlyR α1 subunit.

      To reach their conclusions the authors generate two knock-in mouse lines, one with mEOS-labelled GlyR β-subunit and one with mRFP-labelled gephyrin, a subsynaptic scaffolding protein of inhibitory synapses, which are subsequently crossed. Both changes are not unproblematic, as mutations in the N-terminal end of the GlyR β subunit polypeptide chain might interfere with the assembly of functional GlyR (consisting of α and β subunits), and mutations at the N-terminal end of gephyrin interfere with its homo-oligomerization into higher molecular assemblies.

      However, in this experimental design both labelled proteins reach postsynaptic membrane specialisations. In the case of the β-subunit, quantitative evaluation confirms that heterozygous animals contain only half as much labelled protein as homozygous animals, which is an indication but not a proof that the correct stoichiometry of adult GlyR is maintained. Likewise, mRFP-labelled gephyrin assembles with WT-gephyrin in subsynaptic domains, but it is not clear whether the size and density of the synapses are changed by the knock-in procedure as compared to WT synapses.

      Accepting these constraints, which to the knowledge of this reviewer have never been addressed to satisfaction, the authors provide a technically excellent, comprehensive analysis of glycinergic synapses in the spinal cord of double knock-in mice. Therefore, it should be stated in the title that the investigations were performed with double knock-in instead of "native" spinal cord. Text and figures are clear and accurate and represent the state of the art.

      Finally, the reviewer would like to raise a minor point: the term postsynaptic density is derived from electron microscopical studies of synapses, where asymmetrical synapses display a "postsynaptic density" but symmetrical synapses do not. The latter were identified as inhibitory synapses and therefore, by definition, inhibitory synapses do not have a postsynaptic density, but rather a postsynaptic membrane specialisation. The use of the term "postsynaptic density" should, therefore, be restricted to excitatory synapses.

      Significance

      The authors provide a state-of-the-art, advanced light and electron microscopical analysis of glycinergic synapses in the mouse spinal cord. They suggest that a robust "stereotypical" mechanism is in place that guarantees a fixed stoichiometry of relevant components and is maintained even in a hypomorphic mutant believed to represent a mouse model of human hyperekplexia (startle disease).

      Referee Cross-commenting

      I would like to corroborate the arguments of the previous reviewer: it is not clear to what extent the fusion proteins influence the measurements, which are otherwise technically very advanced and well done. The authors definitely do not investigate "native spinal cord" as stated in the title.

      The argument concerning fusion proteins must be taken especially seriously, as the fusions were induced in regions known to be responsible for assembly of glycine receptors and oligomerization of gephyrin.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript Maynard et al describe a newly generated knockin mouse to study the endogenous distribution of Gly receptors in the spinal cord. Using quantitative confocal imaging and SMLM the distribution and levels of GlyRs at spinal cord synapses is compared between dorsal and ventral horn. They found that levels of synaptic GlyR are higher in dorsal than ventral spinal cord synapses. Nevertheless, the ratio to gephyrin seems constant, except for synapses in superficial layers of the dorsal horn, where gephyrin levels exceeded the levels of GlyRs. There are also fewer, but larger synapses in the ventral horn than in the dorsal horn. These findings are further corroborated by an SR-CLEM approach. Furthermore, it is shown that in a mouse model for hyperekplexia GlyR levels are lower, but still enriched at synapses, and the dorsal-ventral gradient in GlyR expression was maintained. The difference in size of ventral and dorsal synapses observed in WT animals was also lost in the oscillator mouse, suggesting that particularly the ventral synapses are affected. Despite these differences, the density of GlyRs per synapse remained similar.

      Major comments:

      • Line 113: "labeling the β-subunit has proven difficult". This statement is unclear; it would be informative for readers to learn what exactly has been difficult and why the approach described here overcomes it. Related to that, the authors state "KI animals reach adulthood and display no overt phenotype, suggesting that the presence of the N-terminal fluorophore does not affect receptor expression and function". That is indeed reassuring, but it does not exclude that receptor numbers, function and distribution are altered. As there seems to be no prior literature on tagging the beta subunit, additional evidence that the tag does not interfere with receptor trafficking or function would be desirable.
      • In the Discussion the authors conclude that "Our quantitative SR-CLEM data lend support to the first model, whereby inhibitory PSDs in the spinal cord are composed of sub-domains that shape the distribution of the GlyRs". This conclusion, however, seems to be based on one example image in Fig 3G that is not very convincing. The EM image seems to show two clearly separated PSDs opposed by two distinct active zones. So, although this conclusion is of high interest, more support should be given to substantiate it. More generally, these subsynaptic domains (SSDs) are hardly explored further, but seem relevant for transmission, particularly given that the synaptic pool of GlyRs at these synapses is not saturated by single release events. How general are these SSDs at these synapses?
      • The approach for counting molecules based on the PALM acquisition has been developed in prior publications and seems robust. It would, however, be worth presenting the reader with a bit more background and explaining the assumptions of this approach in more detail, particularly since counting of mEos4b can be problematic: there are multiple dark and fluorescent states of this fluorophore that could be influenced by the illumination scheme (see for instance De Zitter et al., Nat Methods 2019). Since the preceding SRRF acquisition already exposes the fluorophore to high and continuous 561-nm laser power, this could skew the counting due to unaccounted conversion and perhaps bleaching of mEos4b. In line with this, although the term 'absolute copy numbers' is used throughout the manuscript, the reported numbers are at best an estimate based on a number of assumptions. I think the wording 'absolute numbers' is therefore deceiving and should be nuanced.
      • Relatedly, most of the quantifications estimate the number of receptors rather than their distribution within the PSD. The term "molecular arrangement", also used in the title, might therefore be misleading: there is in fact little characterization of how GlyRs are placed within the PSD. More focused analysis quantifying the distribution of receptors within the PSD and/or SSDs would strengthen the manuscript.
      • The reported N is confusing and makes it hard to judge the reproducibility of the data. Sometimes it refers to the number of images, sometimes to the number of synapses, but it is unclear from how many experiments these are drawn. This should be reported more completely (at least the number of animals should be reported) and consistently. In figure 1, the N numbers (N=3-5 images) are particularly low and raise the question of how consistent these findings are across multiple animals.
      • The levels of mRFP-Gephyrin seem to differ between the different mouse lines; is this a significant difference?
      • The ICQ analysis for co-localization is hardly explained. How do we interpret this parameter? What does an average value of ~0.3 mean? A comparison with sets of proteins that do not overlap as a negative control would strengthen the conclusion.
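
      For readers unfamiliar with the parameter, the following is a minimal sketch of how an intensity correlation quotient (ICQ) is commonly computed. It assumes the standard definition: the fraction of pixels whose two channel intensities deviate from their respective means in the same direction, minus 0.5. Under that definition values range from -0.5 (complete segregation) through about 0 (random overlap) to +0.5 (fully dependent staining). This is not the authors' pipeline; the function name `icq` and the flat pixel lists are illustrative assumptions.

```python
from statistics import fmean


def icq(channel_a, channel_b):
    """Intensity correlation quotient for two equally sized pixel lists.

    Assumes the common definition: (pixels with a positive product of
    mean-centred intensities / pixels with a non-zero product) - 0.5.
    """
    if len(channel_a) != len(channel_b) or not channel_a:
        raise ValueError("channels must be non-empty and of equal length")
    mean_a = fmean(channel_a)
    mean_b = fmean(channel_b)
    # Product of the deviations from the channel means, pixel by pixel.
    products = [(a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b)]
    # Pixels with a zero product carry no sign information and are skipped.
    informative = [p for p in products if p != 0]
    if not informative:
        return 0.0
    positive = sum(1 for p in informative if p > 0)
    return positive / len(informative) - 0.5
```

      Two identical channels give the maximum value and two mirrored channels the minimum, so an average of ~0.3 would sit between random and fully dependent staining; a non-overlapping protein pair, as requested above, would be expected to fall at or below zero.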

      Minor comments:

      • "Very little fluorescence was detected in the forebrain, despite the high reported expression of the Glrb transcript". Can the authors expand on this? What would explain this discrepancy?
      • What region is quantified in Fig 1B? Is it the same region in all conditions? This should be specified more clearly, as the manuscript presents a clear gradient in expression levels in the spinal cord and thus the location will influence the intensity measurements.
      • The labeling approach does not differentiate between surface and internal receptors; this should be made more explicit in the text.

      Significance

      The presented data are interesting and the experiments are technically advanced and carefully performed. Particularly the SR-CLEM approach is technically advanced. The datasets present a quantitatively detailed characterization of spinal cord synapses and will be of interest for researchers working in the field of spinal cord circuitry, as well as super-resolution imaging. The conceptual advance for the field is however somewhat limited. It seems that the presented data confirm the general notion that receptor numbers and synapse size are highly correlated. So, although this manuscript describes very interesting observations, in its present form the manuscript does not provide any new mechanistic insight or significant advance in our understanding of how these synapses operate.

      Referee Cross-commenting

      I agree with the other reviewers that this study is technically advanced, but I remain critical towards the extent of conceptual advancement this study brings and there are some important concerns with the presented data that need to be addressed. Nevertheless, indeed many of these concerns can be addressed without additional experiments. As pointed out also by other reviewers additional validation that the fusion proteins are not disrupting their function or organization would be important.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      1. General Statements [optional]

      We thank the reviewers for their critical review of our manuscript. We are excited to see that the reviewers agree that we have presented high-quality data that advance the centrosome field and are worthy of publication following revision. We also agree with the reviewers that the data presentation requires improvement, that some experiments require additional replicates with robust statistical analyses and that a model or summary would help clarify the differences between previously published results and ours. We will address all these concerns in the revised version of our manuscript. The reviewer comments in their entirety can be found below in italics, followed by our response in bold.

      Considering that the manuscript was very well received we believe it makes a strong candidate for publication in eLife. In terms of editors at eLife, we believe that Anna Akhmanova and Jeremy Reiter would be very well suited to handle this manuscript.

      We hope that you will concur with us that the revision plan detailed below adequately addresses the reviewers’ comments.

      2. Description of the planned revisions

      Reviewer 1, Major points

        • Previous data suggested that an important role of TRIM37 was to limit accumulation of CEP192 levels, yet here CEP192 levels appeared unchanged in TRIM37 knockout cells that stably express wild-type or RING domain mutant TRIM37. However, in agreement with previous work, transient expression of TRIM37 reduced CEP192 levels along with those of other PCM and centriole components in an E3-dependent manner. These data are rather confusing in light of the literature, and the current report does not really deal with these discrepancies, but to me they suggest that high levels of TRIM37 can target multiple centrosome components for degradation, although this may be an experimental artefact.

      We agree that acutely overexpressed TRIM37 results in decreased CEP192 levels, which is consistent with published results. We also provide evidence that CEP192 levels are not correspondingly increased in the absence of TRIM37, nor are they decreased in a cell line that stably overexpresses FLAG-BirA TRIM37. This suggests that the decrease in CEP192 (and PCNT and CEP120) after acute overexpression of TRIM37 might be short-lived or a consequence of overexpression. We will discuss this possibility more clearly in the revised manuscript. In addition, we will perform Western blots for TRIM37 in wild type cells, cells stably expressing FLAG-BirA TRIM37 and cells induced to express TRIM37-3xFLAG to more directly compare the amount of TRIM37 present in these cell lines.

      • The choice of cells for particular experiments is not always stated or explained. For instance, a Trim37 KO pool is used in Figure 3A, while a TRIM37 single KO is used in Figure 3B. These are then combined with both transient and stable expression of TRIM37 mutants.

      We apologize for this and will clarify the choice of cell lines in the results section. Importantly, because some of our results challenge previously published reports, we performed critical experiments using multiple cell lines. For example, we show that centrinone B-induced growth arrest is independent of TRIM37 E3 ligase activity using a single RPE-1 TRIM37-/- clone, an RPE-1 TRIM37-/- pool and an A375 TRIM37-/- pool. We feel this is a highlight of our work and this new data will be included in the revised version of the manuscript and will be emphasized.

      • Two different concentrations (200 nM and 500 nM) of centrinone were used to compare responses to too many or no centrosomes in RPE1 and A375. While these concentrations result in centrosome amplification (200 nM) and loss (500 nM) in RPE1 cells, the phenotypes seem much less clear-cut in A375 cells. At 200 nM, 70% of cells have 0 or 1 centrioles (~35% in each category) and only about 15% have centrosome amplification, whereas centrosome amplification occurs in 30% of RPE1 cells, with 0-1 centrioles seen in fewer than 10% (Figure 4 - figure supplement 1H). Hence the different outcomes of centrinone treatment make conclusions about cell-type-specific responses difficult. This difference may be due to differences in drug uptake/efflux, PLK4 activity or the expression of other components of these pathways. In fact, 167 nM centrinone B in A375 cells would have been a much closer match to the 200 nM treatment of RPE-1. These points should be discussed as they impact the conclusions.

      The reviewer rightly points out that the response to centrinone appears to differ between cell types, as shown previously (Meitinger et al., 2020 and Yeow et al., 2020), and that this difference may impact our conclusions. Although we don’t think that the major conclusions drawn will change, we will discuss these caveats within the results and discussion of the manuscript.

      • I find the different outcomes of stable versus acute expression of TRIM37 ligase mutant confusing. Here, stable expression of TRIM37 ligase mutant increases mitotic length compared to that of TRIM37 wild-type, which contradicts a recent report (Meitinger et al. 2021). What could be the potential reason for these differences?

      It is unclear why we obtain results that differ from Meitinger et al. We are using similar cell lines (RPE-1 hTert vs. RPE-1 hTert Cas9) with similar TRIM37 constructs (TRIM37-3xFLAG) that are induced in similar ways (both are doxycycline inducible but using different systems). For our experiment, we used a single TRIM37 KO clone. As an independent validation, we will repeat this experiment using our TRIM37 KO pools in both RPE-1 and A375 cells and discuss these results and implications.

      What could be the mechanism for TRIM37 action in regulating spindle assembly/mitotic duration and cell proliferation upon centrosome loss? How do those acentrosomal MTOCs form that decrease mitotic duration and promote proliferation?

      These are insightful questions that we feel lie at the heart of TRIM37 function. Current models posit that in the absence of TRIM37, PLK4 condensates form and are required to nucleate ectopic accumulations of PCM components (e.g. CEP192) that facilitate mitosis (Meitinger et al. 2020). A number of our findings are not consistent with this model. First, PLK4 is detected in the Cenpas/condensates using only a single antibody (Wong et al., 2015); two other antibodies have been reported (Sillibourne et al., 2010, Moyer et al., 2015) and we have used another (Millipore MABC544 clone 6H5), but none of these three detects PLK4 at the condensates. Additionally, the PLK4 signal observed is not sensitive to PLK4 siRNA (Balestra et al. 2021, Figure 4 – figure supplement 1I). In our manuscript we also provide evidence that overexpressed PLK4-3xFLAG cannot be detected (using PLK4 or FLAG antibodies) at these structures. Moreover, our experiments using TRIM37 mutants show that Cenpas formation and ectopic PCM assembly are mechanistically distinct; Cenpas are not resolved after expression of TRIM37 C18R, yet ectopic PCM structures are suppressed (Figure 5E and G). Our data do, however, suggest that the ability to form ectopic PCM structures is inversely correlated with growth arrest activity (i.e. cells that form ectopic PCM fail to arrest). How these structures form and how they affect growth arrest are still critical, open questions. We will discuss these possibilities further in the revised manuscript.

      Do the authors find a difference in the % of cells expressing TRIM37 mutants upon stable or acute expression? This part needs a better summary, and again a table would help. I also wonder about protein expression levels; wild-type FB-TRIM37 seems to be expressed at much lower levels than the mutants in Figure 5B.

      The differences in overall abundance are not due to heterogeneous expression within the population. The TRIM37 mutants are expressed in all cells after stable and acute expression. We will provide quantification of immunofluorescence images and statistics to show this. TRIM37 mediates its own degradation in an E3-dependent manner (Meitinger et al. 2021, Figure 3f). Our results are consistent with this, as the TRIM37 C18R and TRIM37 ΔRING mutants have a higher overall abundance compared to TRIM37 or TRIM37 Δ505-709. These experiments are ongoing and we will discuss this further in the revised manuscript and provide a summary table.

      • Other means of centrosome depletion (Cenpj, SAS6, etc.) would have been useful to include in the manuscript in support of E3 ligase-dependent and -independent roles of TRIM37. It is not essential to perform these experiments, but if data are available, including these would improve the paper.

      We will generate new data using a double TRIM37 KO, SASS6 KO line to address TRIM37 ligase-dependent and -independent functions.

      • The authors show that TRIM37 regulates PLK4 phosphorylation and that this modification could only be observed in HEK293T and not in RPE1. Why would there be a difference between HEK293 and RPE1?

      We will address this by surveying a panel of cell lines to determine if there are any cell-type-dependent differences in TRIM37 modification. Any potential differences will be addressed in the discussion.

      • Statistical analysis for graphs should be included. Figure 5 is ok but graphs in Figures 3, 4, 6, 7 would benefit.

      This point is well taken. In the revised manuscript, we will ensure that all experiments are performed in biological triplicate and that proper statistical analyses are included to support our conclusions.

      • The authors characterise TRIM37 localisation. They detect it at centrosomes (as shown by Yeow et al 2021) and more specifically at the PCM, but apparently the signal is not present in all cells. They should also provide a quantification of the % of cells with centrosomal TRIM37 signal and compare this to cells expressing Flag-tagged Trim37. The specificity of the antibody signal using TRIM37-/- should be confirmed.

      We will perform immunofluorescence experiments using wild type and TRIM37-/- cells to demonstrate the specificity of the antibody signal. We will also provide a more detailed analysis regarding TRIM37 localization noting 1) the number of cells with centrosomal TRIM37 2) cell cycle correlation with centrosomal TRIM37 and 3) a comparison with FLAG-BirA tagged TRIM37.

      Reviewer 1, Minor points

      1. Page 3: "A recent screen for mediators of supernumerary centrosome-induced arrest identified PIDDosome/p53 and placed the distal appendage protein ANKRD26 within this pathway [31]". It appears that the reference for Burigotto et al. is missing.

      This reference will be inserted.

      2. Page 6: The authors state that TP53BP1, USP28 and CDKN1A are also suppressors in the Nutlin-3a screen and suggest that they act in a general p53 pathway. However, Meitinger et al (2016) showed that depletion of TP53BP1 or USP28 did not affect the upregulation of p53 and p21 upon Mdm2 inhibition.

      Our data is consistent with previous reports that TP53BP1 and USP28 are required for cell arrest after Nutlin-3a treatment (Cuella-Martin R et al. 2016). We will discuss possible explanations for the results observed by Meitinger et al.

      3. Page 9: "First, we performed live cell imaging to measure mitotic length in cells grown in centrinone". For consistency the authors should say centrinone B here as well.

      We will change the text to indicate using centrinone B.

      4. Page 9: "Cells lacking TRIM37 suppressed the growth arrest from 150 to 500 nM centrinone B in RPE-1 and 167 to 500 nM in A375 cells". The growth data for the A375 cells seem to be missing from the figures.

      We refer to Figure 4D and Figure 4 – figure supplement 1G that contain the RPE-1 and A375 growth data, respectively. We will modify the text to more clearly refer to the data.

      5. Page 10: "Our results confirmed that PLK4 and TRIM37 form a complex in RPE-1 cells (Figure 3G)". It appears the authors referred to the wrong figure; it should be Figure 4B.

      Our apologies. The correct figure reference will be used.

      6. Figure 1C: The nuclear p53 signal is not apparent with 500 nM centrinone B in the exemplary cells. Did the authors use thresholding to quantify p53/p21 positive cells?

      The p53 staining in centrinone-treated cells is somewhat variable. To quantify the data, we used automated image analysis and set a cut off based on p53 intensity in DMSO-treated cells to indicate p53-positive cells. To improve the figure we will repeat the experiment and use a lower magnification image to show a more representative field of cells stained for p53. The quantification pipeline will be better explained in the methods section.
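
      As a minimal sketch of the kind of cutoff-based scoring described above, the following assumes the cutoff is taken as a percentile of the per-cell p53 intensities in DMSO-treated control cells. The 95th percentile, the function names and the per-cell intensity lists are assumptions made for illustration only; they are not the parameters or the software used for the manuscript.

```python
def percentile_cutoff(control_intensities, percentile=95):
    """Nearest-rank percentile of the control (DMSO) intensities.

    `percentile` is an integer from 1 to 100; the default of 95 is an
    illustrative assumption, not a value taken from the manuscript.
    """
    if not control_intensities:
        raise ValueError("control_intensities must not be empty")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in the range 1-100")
    ordered = sorted(control_intensities)
    # Ceiling of percentile * n / 100 using integer arithmetic only.
    rank = max(1, -(-percentile * len(ordered) // 100))
    return ordered[rank - 1]


def fraction_positive(intensities, cutoff):
    """Fraction of cells whose intensity lies strictly above the cutoff."""
    if not intensities:
        raise ValueError("intensities must not be empty")
    return sum(1 for value in intensities if value > cutoff) / len(intensities)
```

      In use, the cutoff would be computed once from the DMSO-treated cells of an experiment and then applied unchanged to every centrinone B condition from that experiment, so that the scored fractions remain comparable across conditions.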

      7. Figure 4D and Figure 4 - Figure supplement 1G: The graph is misleading and should not be presented as a continuous line.

      We are sorry that the reviewer finds the graph misleading. We will change the way this data is presented to make it easier to understand and to facilitate indicating statistical differences. Instead of a scatter plot of all the data, we will present the data as individual boxplots at each centrinone B concentration with statistical differences indicated. We hope this will address any confusion regarding these data.

      8. Figure 5A and C: A direct statistical comparison of mitotic timing upon expression of the different Trim37 mutants with wild-type and trim37-/- cells is missing.

      In Figure 5A we compare RPE-1 WT to TRIM37-/- at each centrinone B concentration and within each line we compare each centrinone B concentration to DMSO. Perhaps we do not understand the reviewer’s concern here, but we do not think any comparisons are missing from this panel. In Figure 5C, we compare the mitotic lengths between cell lines expressing TRIM37 WT or TRIM37 C18R since we focus on the requirement for the E3 ligase activity of TRIM37. For this experiment we did not include a wild-type control, but we will perform statistical analyses between control cells expressing FLAG-BirA and those expressing FB-TRIM37 WT or FB-TRIM37 C18R. We hope this addresses this concern.

      9. Figure 6B: A loading control/Ponceau staining is missing, as well as the quantification of protein levels.

      This experiment will be repeated for proper quantification and we will include a loading control for our representative results.

      10. Figure 6D: It is unclear if the centrosomal signal intensity was quantified in interphase or mitotic cells.

      The centrosomal signal was quantified in mitotic cells only. The results text and figure legend will be updated to indicate this more clearly.

      11. Figure 7C: A loading control/Ponceau staining is missing.

      The experiment will be repeated and a sample will be taken prior to immunoprecipitation to indicate the input amounts for each sample.

      12. Figure 2 - figure supplement 2F and G: It would help if the authors could highlight the cell line, e.g. RPE-1 (F) or A375 (G), in the Venn diagrams.

      In Figure 2 – figure supplement 2G we highlight the genes found in RPE-1 and A375 screens only in the overlap of the Venn diagram using font colour. We will colour code the hits from each cell line in panels (F) and (G). We thank the reviewer for this suggestion.

      13. Figure 4 - figure supplement 1E: It appears that the BirA antibody gives only an unspecific signal. It would be useful to show if the different TRIM37 variants are able to localise to the centrosomes. Furthermore, it appears that centrosomes are missing in the C18R and 505-709 variants. It would be useful if the authors quantified centrosome numbers upon expression of different Trim37 variants as shown in Figure 4 - figure supplement 1. To make the identification of the cell easier, it would help to include a DNA signal or indicate the outline of the cell.

      The anti-BirA antibody does give a slightly diffuse signal, although we disagree that it is unspecific considering that the BirA signal is only observed in cells expressing FLAG-BirA alone or BirA fusion proteins.

      We agree with this reviewer that we did not make any statements about the centrosomal localization of the TRIM37 mutants. We will re-analyze our images to quantify relative centrosomal localization of these proteins. The images as displayed in this Figure panel appear to be somewhat confusing to the reviewer. In terms of scale, only a small portion of the cell surrounding the centrosome is shown, therefore a nuclear or cell outline cannot be displayed on these images. In each image a centrosome is present, even in the C18R and 505-709 samples. We will show images of entire cells with insets to highlight the region surrounding the centrosome.

      14. The generation of stable and dox-inducible cell lines is missing in the material and methods.

      We apologize for this omission. This information will be added.

      Reviewer 2, Major points

        • The centrosomal localization of endogenous TRIM37 should be validated by comparing control and knockout/knockdown cells.

      We will perform these experiments as outlined in response to Reviewer 1, Major point 8.

      • Some of the quantifications are derived from only two experiments and in many cases no statistical testing was done. The authors should test the observed effects and add extra replicates to make the data more robust, where required.

      We will ensure experiments are performed in biological triplicate and that appropriate statistical analyses are performed (see comment to Reviewer 1, Major point 7).

      • Fig. 5 supplements: panels showing effects on marker proteins in cells by IF lack quantification of the claimed effects. Without providing some type of quantification for key findings, it is unclear how strong or penetrant the effects are.

      Quantification and statistical testing will be performed for these experiments.

      Reviewer 2, Minor points

      I would suggest a final, summarizing schematic that illustrates the main findings in a cartoon/flow chart manner.

      We will improve the discussion of our main findings as well as provide a model/table of comparisons to improve the clarity of our manuscript.

        • Please revise incorrect abstract sentence: "We identify TRIM37 as a key mediator of growth arrest when PLK4 activity is partially or fully inhibited but is not required for growth arrest triggered by supernumerary centrosomes."

      In our screens, we find that TRIM37 is required for growth arrest after treating cells with 200 and 500 nM centrinone B. Treatment of cells with 200 nM centrinone B causes centriole overduplication, and our initial hypothesis was that centriole overduplication alone induces growth arrest. To test this in a parallel manner, we also overexpressed PLK4 to induce centriole overduplication. Surprisingly, but consistent with recently published results (Evans et al., 2020), TRIM37 was not required for growth arrest after PLK4 overexpression. Thus, TRIM37 is required for growth arrest after 200 nM centrinone treatment, but not after PLK4 overexpression, yet both of these conditions induce centriole overduplication. This concept will be highlighted, discussed and clarified in the text. We will change the abstract sentence to ‘We identify TRIM37 as a key mediator of growth arrest when PLK4 activity is partially or fully inhibited, but it is not required for growth arrest after PLK4 overexpression’.

      Please also see similar comment to Reviewer 3, Major point 1.

      • In various figures and supplements showing centrosome and condensates/Cenpas, these are very difficult to distinguish due to their small size. I suggest magnifying regions of interest and/or adding arrowheads in different colors marking the specific structures.

      This comment is similar to Reviewer 1, Minor point 13. We will use coloured arrowheads to indicate different structures. Where possible, we will use magnified regions to improve clarity.

      • Fig. 2A: What is the purpose of the schematics on the right of panel A? The labels in the graph are unreadable and the network diagram without any labels is also not very useful. This could be removed.

      The schematics on the right indicate a ‘generic analysis’ using the NGS sequencing data. We agree it is not essential and it will be removed.

      • Fig. 2B: The network presentation is not very easy to read. What are the functional groups/pathways here? The clusters should be labeled accordingly. What is the meaning of the different sizes of the circles? Maybe key interactions (e.g. TRIM37) could be indicated in a different color shade to highlight these?

      In our figure we tried to highlight 1) the connectivity among screening conditions and 2) complexes that were identified by the screens. Each node (other than the six hub nodes that denote a screen condition) represents a hit from the screens. Thus, the nodes are connected by edges only to the screening conditions, not to each other. In this scenario, highlighting TRIM37 ‘interactions’ would only highlight the screening conditions for which TRIM37 was a hit (200 nM RPE-1, 500 nM RPE-1, 200 nM A375, 500 nM A375). We could try to overlay functional enrichment data on the graph, but these data are presented separately in Figure 2 – figure supplement A-D. The large circles represent hits found in previous screens, as indicated in the legend. Given the challenges of this figure, we will modify it to improve its clarity.

      Reviewer 3, Major points

        • The presentation throughout the manuscript sometimes made it difficult to follow exactly what the authors meant when they referred to the various doses of Centrinone used in their experiments, often using the terms "low" or "high" without specifying exactly what they mean. In Figure 1A, for example, they present a growth inhibition curve using a log10 scale of Centrinone concentration, and they conclude that growth was inhibited "at concentrations above 150nM, with full inhibition observed at concentrations greater than 200nM". I presume this is just sloppy language, as it appears that growth is significantly inhibited at 150nM and full growth inhibition is achieved at 200nM. However, in Figure 4D, the authors show another growth inhibition curve (this time presented on a linear scale) where significant growth inhibition is seen well below 100nM and full inhibition appears to be achieved at ~125nM. The discrepancy between these experiments is not noted, nor any reason for it explained.

      We agree with the reviewer and apologize for using ‘low’ and ‘high’, as they are ambiguous. We will ensure that we refer specifically to each concentration of centrinone B used (e.g. 50 nM, 150 nM etc.). The comparison between Figure 1A and Figure 4D is not straightforward. The experiments presented were performed approximately 6 years apart and in slightly different ways. As reviewer 3 indicates, Figure 1A is presented in a log scale; this makes it difficult for the reader to determine the exact concentrations of centrinone B used. For this panel, we used 0 (DMSO), 10, 30, 75, 165, 200 and 500 nM centrinone B. For Figure 4D, we used 0, 50, 125, 150, 167, 200 and 500 nM. The only point that might be anomalous is 75 nM in Figure 1A. We do see approximately 25% inhibition using 50 nM centrinone B in Figure 4D, but no inhibition using 75 nM in Figure 1A. We can offer two explanations for this discrepancy. First, we noticed small deviations in the potency of centrinone B batches. Second, for Figure 1A, cells were assayed using a passaging assay where they are continuously plated, counted and re-seeded. Cells in Figure 4D were assayed using a clonogenic assay where cells are plated at low density and allowed to grow over the course of approximately two weeks. It is possible that a combination of these factors led to the highlighted discrepancy. We feel that the discrepancy is a minor one and we propose the following as a solution. We will present the growth data in Figure 1A as a scatter / box plot using only 200 and 500 nM centrinone B, since these are the drug concentrations we use for the screen conditions and the key conclusions are derived only from these concentrations (i.e. both concentrations result in p53-dependent growth arrest, where centrioles are overduplicated after 200 nM centrinone B, while centrioles are lost after treatment with 500 nM). We hope that this explanation and these changes satisfy the reviewers.

      While discrepancies such as this may seem trivial, they make it hard to interpret some of the authors conclusions. For example, in their initial screen, the "low" dose of Centrinone (200nM) leads to centriole amplification and genes that block centriole duplication or PIDDosome function (which normally signals the presence of extra centrioles) are required for the growth arrest triggered by this concentration of the drug (Figure 1B). To me, this suggests that centriole amplification is required for this growth arrest at 200nM. However, when the authors test a more graded series of concentrations they conclude "excess centrioles might not be the trigger for this arrest at low Centrinone B concentrations". I assume they are using "low" here to indicate concentrations at or below 150nM (even though they use low to mean 200nM in their initial screen)? In the Discussion, they state that TRIM37 is "required for the growth arrest in response to partially or fully inhibited PLK4, but this activity was independent of the presence of excess centrioles". Again, it is not clear to which experiments they are referring when they talk about "partially" or "fully" inhibited PLK4, but, if this is correct, then why are genes required for centriole duplication and PIDDosome function identified in their initial screen as being required for the growth arrest at 200nM but not 500nM? Do they consider 200nM to be fully inhibiting PLK4? *

      We observed that cells arrested after treatment with either 200 or 500 nM centrinone B. Additionally, we observed centriole overduplication after 200 nM but centriole loss at 500 nM. Our initial hypothesis was therefore that either centriole overduplication or loss resulted in growth arrest. Our subsequent results with TRIM37 caused us to question this simple interpretation. To determine if centriole overduplication caused by 200 nM centrinone B triggers growth arrest in this case, we induced centriole overduplication by overexpressing PLK4 and, surprisingly, TRIM37 was not required for growth arrest in these conditions, similar to that observed by Evans et al., 2020. Thus, we have two conditions where centriole overduplication is observed, but the growth arrest is dependent on TRIM37 in only one of them. This is an important difference that we will better highlight in our revised manuscript. We will also present a better model and/or table outlining our most salient results. Briefly, it is thought that partial inhibition of PLK4 blocks its auto-phosphorylation and therefore blocks its degradation. The overall abundance of PLK4 therefore increases under these conditions and overduplication occurs. In our hands, we consider PLK4 to be partially inhibited in RPE-1 or A375 cells at any centrinone B concentration of 200 nM or lower.

      Please also see similar comment to Reviewer 2, Minor point 1.

      Presumably it will only require textual changes to address this point, but it is hard to assess the broader significance of the paper until these points are clarified: is the main point of this paper that the cells' response to Centrinone treatment is complicated and the role of TRIM37 equally so; or, is there a narrative that leads to a clear hypothesis that can explain these surprising findings?

      We don’t currently have a model that explains all the results we observe with TRIM37. We have data that is consistent with some previously published results and data that challenges some of these recent reports. The current model suggests that TRIM37 E3-dependent remodeling of CEP192 underlies its growth arrest activity after centriole loss. Importantly, we find that TRIM37 supports growth arrest in an E3-ligase-independent manner. We will discuss this further in our revised manuscript, as well as providing additional hypotheses based on our other observations of TRIM37 function.

      • It seems a striking omission that the authors show that p53 and p21 are induced by 200nM and 500nM Centrinone (Figure 1D), but they don't assay these proteins at any concentration lower than this. Perhaps they are saving this data for a subsequent manuscript, but the authors certainly seem to draw conclusions from several experiments they perform at concentrations below 200nM, so they should at least explain why they don't assay p53 and p21 status in these experiments.

      We apologize for not including this data in the original version of the manuscript. It will be included in the revised version.

      Reviewer 3, Minor points

        • In the abstract the authors claim that the way in which altered centrosome numbers cause a p53-dependent growth arrest is evolutionarily conserved. This is misleading, as it implies that the loss and gain of centrosomes trigger the same arrest (which is probably not correct), and most of the data to date suggests that flies and worms (two popular models for centrosome research) do not have such a growth-arrest pathway.

      This is a good point. We will modify this statement to indicate that p53-dependent arrest is confined to mammalian cells: “Altered centrosome numbers cause a p53-dependent growth arrest in both mouse and human cells through mechanisms that are still poorly defined”.

      Reviewer 3, comment in ‘significance’

      I could not discern, however, whether one could draw any broader conclusions than this, in part due to the presentation problems described above. Moreover, in the abstract the authors propose that altering PLK4 activity alone is sufficient to signal growth arrest. This would be an important conclusion, and I presume this refers to the very low dosage Centrinone experiments that trigger growth arrest without altering centrosome numbers and which do not require TRIM37? If so, this arrest is poorly characterised here and will be the subject of a future investigation, so it seems strange to have this as a major conclusion in the abstract.

      We agree. As reviewer 3 points out, based on our findings we hypothesize that altered PLK4 activity could itself signal growth arrest. As this is not supported experimentally, we will remove it from the abstract and discuss this tantalizing possibility within the discussion.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      Most of the experiments are currently ongoing, and the preliminary results we have obtained are discussed in the previous section. The revised manuscript will be modified to address each and every concern of the three reviewers as detailed above.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      We will carry out all the experiments requested by the reviewers as detailed above.

      References

      Balestra FR et al., TRIM37 prevents formation of centriolar protein assemblies by regulating Centrobin. Elife. 2021 Jan 25

      Cuella-Martin R et al., 53BP1 Integrates DNA Repair and p53-Dependent Cell Fate Decisions via Distinct Mechanisms. Mol Cell. 2016 Oct 6;64(1):51-64

      Evans LT et al., ANKRD26 recruits PIDD1 to centriolar distal appendages to activate the PIDDosome following centrosome amplification. EMBO J. 2021 Feb 15;40(4)

      Meitinger F et al., TRIM37 controls cancer-specific vulnerability to PLK4 inhibition. Nature. 2020 Sep;585(7825):440-446

      Moyer TC et al., Binding of STIL to Plk4 activates kinase activity to promote centriole assembly. J Cell Biol. 2015 Jun 22;209(6):863-78

      Sillibourne JE et al., Autophosphorylation of polo-like kinase 4 and its role in centriole duplication. Mol Biol Cell. 2010 Feb 15;21(4):547-61

      Wong YL et al., Cell biology. Reversible centriole depletion with an inhibitor of Polo-like kinase 4. Science. 2015 Jun 5;348(6239):1155-60

      Yeow ZY et al., Targeting TRIM37-driven centrosome dysfunction in 17q23-amplified breast cancer. Nature. 2020 Sep;585(7825):447-452

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, Tkach et al. analyse the molecular pathways that lead to the growth arrest of either RPE-1 or A375 cells in response to varying doses of the PLK4 inhibitor Centrinone B (hereafter Centrinone). They show that both 200nM and 500nM Centrinone cause a strong growth arrest, but the lower concentration actually leads to centrosome amplification, while the higher concentration leads to centrosome loss. They identify the Ubiquitin E3 ligase TRIM37 as a key mediator of the growth arrest at both drug concentrations, although they confirm previous findings that TRIM37 is not required for the growth arrest induced by the supernumerary centrosomes that are formed when PLK4 is overexpressed. Perhaps most importantly, the authors test the ability of various mutated forms of TRIM37 to function in the growth arrest induced by Centrinone treatment, and they conclude that, surprisingly, the E3 ligase activity of TRIM37 is not required for this growth arrest.

      The experiments presented here are generally of a high quality, although I found some aspects of the presentation a little confusing (as detailed below).

      Major Comments:

      1. The presentation throughout the manuscript sometimes made it difficult to follow exactly what the authors meant when they referred to the various doses of Centrinone used in their experiments, often using the terms "low" or "high" without specifying exactly what they mean. In Figure 1A, for example, they present a growth inhibition curve using a log10 scale of Centrinone concentration, and they conclude that growth was inhibited "at concentrations above 150nM, with full inhibition observed at concentrations greater than 200nM". I presume this is just sloppy language, as it appears that growth is significantly inhibited at 150nM and full growth inhibition is achieved at 200nM. However, in Figure 4D, the authors show another growth inhibition curve (this time presented on a linear scale) where significant growth inhibition is seen well below 100nM and full inhibition appears to be achieved at ~125nM. The discrepancy between these experiments is not noted, nor any reason for it explained.

      While discrepancies such as this may seem trivial, they make it hard to interpret some of the authors' conclusions. For example, in their initial screen, the "low" dose of Centrinone (200nM) leads to centriole amplification and genes that block centriole duplication or PIDDosome function (which normally signals the presence of extra centrioles) are required for the growth arrest triggered by this concentration of the drug (Figure 1B). To me, this suggests that centriole amplification is required for this growth arrest at 200nM. However, when the authors test a more graded series of concentrations they conclude "excess centrioles might not be the trigger for this arrest at low Centrinone B concentrations". I assume they are using "low" here to indicate concentrations at or below 150nM (even though they use low to mean 200nM in their initial screen)? In the Discussion, they state that TRIM37 is "required for the growth arrest in response to partially or fully inhibited PLK4, but this activity was independent of the presence of excess centrioles". Again, it is not clear to which experiments they are referring when they talk about "partially" or "fully" inhibited PLK4, but, if this is correct, then why are genes required for centriole duplication and PIDDosome function identified in their initial screen as being required for the growth arrest at 200nM but not 500nM? Do they consider 200nM to be fully inhibiting PLK4?

      Presumably it will only require textual changes to address this point, but it is hard to assess the broader significance of the paper until these points are clarified: is the main point of this paper that the cells' response to Centrinone treatment is complicated and the role of TRIM37 equally so; or, is there a narrative that leads to a clear hypothesis that can explain these surprising findings?

      2. It seems a striking omission that the authors show that p53 and p21 are induced by 200nM and 500nM Centrinone (Figure 1D), but they don't assay these proteins at any concentration lower than this. Perhaps they are saving this data for a subsequent manuscript, but the authors certainly seem to draw conclusions from several experiments they perform at concentrations below 200nM, so they should at least explain why they don't assay p53 and p21 status in these experiments.

      Minor comments:

      In the abstract the authors claim that the way in which altered centrosome numbers cause a p53-dependent growth arrest is evolutionarily conserved. This is misleading, as it implies that the loss and gain of centrosomes trigger the same arrest (which is probably not correct), and most of the data to date suggests that flies and worms (two popular models for centrosome research) do not have such a growth-arrest pathway.

      Significance

      Significance and comparison to existing literature:

      The question of how centrosome loss or amplification leads to senescence or apoptosis in many cell types is currently a hot topic, and TRIM37 has previously been identified as a potentially important player, most recently in two high-profile papers from the Oegema/Loncarek (Meitinger et al., Nature 2021) and Holland/Chapman (Yeow et al., Nature 2021) labs. In these papers, TRIM37 is shown to be overexpressed in certain cancer cells, where it appears to degrade PCM components (most notably Cep192) to prevent the formation of ectopic spindle poles that help to ensure mitotic fidelity in these abnormal cells. Moreover, mutations in TRIM37 cause Mulibrey nanism, which has recently been shown to be associated with the formation of ectopic Centrobin-dependent PCM condensates (Balestra et al., eLife 2021; Meitinger et al., JCB, 2021).

      This manuscript makes an important contribution to this area, and it will be of considerable interest to researchers in several fields (most obviously the centrosome, but also ubiquitin ligase, cancer and Mulibrey fields). In its current form, this contribution is largely to illustrate that treating cells with Centrinone (which is widely used by many centrosome researchers) triggers a complex cellular response that varies with drug dosage, and that the role of TRIM37 in triggering this response also appears to be surprisingly complicated. These are significant points that are of sufficient importance to warrant publication.

      I could not discern, however, whether one could draw any broader conclusions than this, in part due to the presentation problems described above. Moreover, in the abstract the authors propose that altering PLK4 activity alone is sufficient to signal growth arrest. This would be an important conclusion, and I presume this refers to the very low dosage Centrinone experiments that trigger growth arrest without altering centrosome numbers and which do not require TRIM37? If so, this arrest is poorly characterised here and will be the subject of a future investigation, so it seems strange to have this as a major conclusion in the abstract.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The study by Tkach et al. investigates the molecular basis of the previously described, p53-dependent growth arrest that is triggered by manipulation of PLK4 kinase activity, a master regulator of centrosome biogenesis. To address this, they use CRISPR/Cas9 screening in human cell lines, gene-specific knockout and rescue experiments, and biochemical interaction assays. As in previously conducted similar screens they identify the E3 ligase TRIM37 as a key mediator of growth arrest after PLK4 inhibition, but not growth arrest induced by increased centrosome number. Importantly, contrary to suggestions in previous studies, they find that TRIM37 function in growth arrest is independent of E3 ligase function, but may involve regulation of PLK4.

      Major comments:

      Overall, I found the key conclusions convincing, assuming the claimed effects are significant. In this regard, some data requires quantification and some of the quantifications may require additional replicates.

      1) The centrosomal localization of endogenous TRIM37 should be validated by comparing control and knockout/knockdown cells.

      2) Some of the quantifications are derived from only two experiments and in many cases no statistical testing was done. The authors should test the observed effects and add extra replicates to make the data more robust, where required.

      3) Fig. 5 supplements: panels showing effects on marker proteins in cells by IF lack quantification of the claimed effects. Without providing some type of quantifications for key findings, it is unclear how strong or penetrant the effects are.

      Minor comments:

      Overall, I felt that the presentation of the data can be improved. After reading the abstract, it was not clear at all to me, what message the authors want to convey, also in comparison to previous work. In particular the final part of the abstract should be improved. The results part is well written, but may still be improved, by providing more summarizing statements that extract the key conclusion from particular experiments and by explaining better why particular experiments were done. The specific rationale may be clear to expert readers but less so to non-experts. Only after reading the discussion, the findings and how they relate to previous work became clearer. I would suggest a final, summarizing schematic that illustrates the main findings in a cartoon/flow chart manner.

      1) Please revise incorrect abstract sentence: "We identify TRIM37 as a key mediator of growth arrest when PLK4 activity is partially or fully inhibited but is not required for growth arrest triggered by supernumerary centrosomes."

      2) In various figures and supplements showing centrosome and condensates/Cenpas, these are very difficult to distinguish due to their small size. I suggest to magnify regions of interest and/or add arrowheads in different colors marking the specific structures.

      3) Fig. 2A: What is the purpose of the schematics on the right of panel A? The labels in the graph are unreadable and the network diagram without any labels is also not very useful. This could be removed.

      4) Fig. 2B: The network presentation is not very easy to read. What are the functional groups/pathways here? The clusters should be labeled accordingly. What is the meaning of the different sizes of the circles? Maybe key interactions (e.g. TRIM37) could be indicated in a different color shade to highlight these?

      Significance

      While the authors start out by essentially reproducing results from previously conducted screens, which may seem to be of limited novelty, the current work reaches conclusions that differ in important aspects from those in previous studies. Moreover, the current work nicely compares in different cell backgrounds PLK4 partial inhibition (extra centrosomes), full inhibition (less/no centrosomes), and p53 pathway inhibition, to obtain an integrated view of the mechanisms involved in growth arrest and tease apart molecular requirements. The results challenge some of the conclusions from previous studies, including high-profile papers where this pathway has been identified as a potential target for cancer treatment. For these reasons I consider this very important work.

      My expertise is in centrosome biology and microtubule organization including mitotic spindle assembly.

      Referee Cross-commenting

      Hi everyone,

      Overall it seems that we all agree that this is an important study. However, as noted by several comments, the presentation definitely needs to be improved and the new findings need to be highlighted better and contrasted with previous studies. I had relatively few major concerns, but, after reading the other reviews, I found the additional comments also important and useful.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Centrosome loss and gain both elicit a p53-dependent cell cycle arrest but the molecular pathways involved are still not fully understood. To address this question the Pelletier lab performed several genome-wide CRISPR screens using two different concentrations of centrinone that cause centrosome amplification (low) or loss (high) in RPE1 and A375 cells. In order to distinguish between pathways that act by regulating p53 levels in cells vs those that mediate p53 response to abnormal centrosome numbers, they also performed a screen in cells where p53 levels were artificially elevated by Nutlin treatment. The top hits from the low/high centrinone screen confirmed previous results from other groups, highlighting the importance of the 53BP1/USP28/p53 complex and PIDDosome/ANKRD26 complex in the cell cycle response. TRIM37 was shared between both centrinone conditions, while being absent from the Nutlin screen, and thus the authors focused their analysis on the function of TRIM37. Overall the data quality and presentation are both good and the manuscript reads well. The CRISPR-Cas9 screens have been performed to a high standard and it is reassuring that the same candidates emerge as from previous screens focusing on centrosome loss and gain.

      TRIM37 has been the subject of several high profile papers over the past year. This current manuscript has the potential to clarify some of the outstanding questions but in its present form the manuscript brings more confusion than clarity to this area of research. Although the authors conduct a careful analysis of TRIM37 function, unless someone is a die-hard specialist, it is difficult to follow what is already known, what the authors find and how or why their data fits/contradicts previous work. The key observations are that i) TRIM37 may not actually control CEP192 levels (unless overexpressed transiently), ii) its E3 ligase activity and its binding to PLK4 are independent of its ability to promote growth arrest upon centrinone treatment, iii) its influence on mitotic duration is independent of its E3 activity or its role in growth arrest upon centrinone treatment. The result that a TRIM37-dependent growth arrest may also exist without increased mitotic duration is another interesting finding, as is the link between TRIM37 and condensates of centrosomal proteins. Including a table that summarises which roles of TRIM37 require PLK4 binding, E3 ligase activity etc would be useful not only to non-specialists. Some of the data contradicts current models for TRIM37 function in growth suppression, so the authors should consider showing a revised model, too.

      Major Points:

      1. Previous data suggested that an important role of TRIM37 was to limit accumulation of CEP192 levels, yet here CEP192 levels appeared unchanged in TRIM37 knockout cells that stably express wild-type or RING domain mutant TRIM37. However, in agreement with previous work, transient expression of TRIM37 reduced CEP192 levels along with those of other PCM and centriole components in an E3-dependent manner. These data are rather confusing in light of the literature, and the current report does not really deal with these discrepancies but to me they suggest that high levels of TRIM37 can target multiple centrosome components for degradation, but this may be an experimental artefact.
      2. The choice of cells for particular experiments is not always stated or explained. For instance, in Figure 3A a TRIM37 KO pool is used, while in Figure 3B a TRIM37 single KO is used. These are then combined with both transient and stable expression of TRIM37 mutants.
      3. Two different concentrations (200 nM and 500 nM) of centrinone were used to compare responses to too many or no centrosomes in RPE1 and A375. While these concentrations result in centrosome amplification (200 nM) and loss (500 nM) in RPE1 cells, the phenotypes seem much less clear-cut in A375 cells. At 200nM 70% of cells have 0 or 1 centrioles (~35% each category) and only about 15% have centrosome amplification, whereas centrosome amplification occurs in 30% of RPE1 with 0-1 centrioles seen in fewer than 10% (Figure 4 - figure supplement 1H). Hence the different outcomes of centrinone treatment make conclusions about cell-type specific responses difficult. This difference may be due to differences in drug uptake/efflux, PLK4 activity or in expression of other components of these pathways. In fact, 167nM centrinone B in A375 cells would have been a much closer match to the 200nM treatment of RPE-1. These points should be discussed as they impact the conclusions.
      4. I find the different outcomes of stable versus acute expression of TRIM37 ligase mutant confusing. Here, stable expression of TRIM37 ligase mutant increases mitotic length compared to that of TRIM37 wild-type, which contradicts a recent report by (Meitinger et al. 2021). What could be the potential reason for these differences? What could be the mechanism for TRIM37 action in regulating spindle assembly/mitotic duration and cell proliferation upon centrosome loss? How do those acentrosomal MTOCs form that decrease mitotic duration and promote proliferation? Do the authors find a difference in the % of cells expressing TRIM37 mutants upon stable or acute expression? This part needs a better summary, and again a table would help. I also wonder about protein expression levels; wild-type FB-TRIM37 seems to be expressed at much lower levels than the mutants in Figure 5B.
      5. Other means of centrosome depletion (Cenpj, SAS6 etc) would have been useful to include in the manuscript in support of E3 ligase dependent and independent roles of TRIM37. It is not essential to perform these experiments but if data are available, including these would improve the paper.
      6. The authors show that TRIM37 regulates PLK4 phosphorylation and that this modification could only be observed in HEK293T and not in RPE1. Why would there be a difference between HEK293 and RPE1?
      7. Statistical analysis for graphs should be included. Figure 5 is ok but graphs in Figures 3, 4, 6, 7 would benefit.
      8. The authors characterise TRIM37 localisation. They detect it at centrosomes (as shown by Yeow et al 2021) and more specifically at the PCM, but apparently the signal is not present in all cells. They should also provide a quantification of the % of cells with centrosomal TRIM37 signal and compare this to cells expressing Flag-tagged Trim37. The specificity of the antibody signal using TRIM37-/- should be confirmed.

      Minor Points

      • Page 3: "A recent screen for mediators of supernumerary centrosome-induced arrest identified PIDDosome/p53 and placed the distal appendage protein ANKRD26 within this pathway [31]". It appears that the reference for Burigotto et al. is missing.

      • Page 6: The authors state that: TP53BP1, USP28 and CDKN1A are also suppressors in the Nutlin-3a screen and suggest that they act in a general p53 pathway. However Meitinger et al (2016) showed that depletion of TP53BP1 or USP28 did not affect the upregulation of p53 and p21 upon Mdm2 inhibition.

      • Page 9: "First, we performed live cell imaging to measure mitotic length in cells grown in centrinone". For consistency the authors should say centrinone B here as well

      • Page 9: "Cells lacking TRIM37 suppressed the growth arrest from 150 to 500 nM centrinone B in RPE-1 and 167 to 500 nM in A375 cells". The growth data for the A375 cells seem to be missing from the figures.

      • Page 10: "Our results confirmed that PLK4 and TRIM37 form a complex in RPE-1 cells (Figure 3G)" It appears the authors referred to the wrong figure, it should be Figure 4B.

      • Figure 1C: The nuclear p53 signal is not apparent with 500 nM centrinone B in the exemplary cells. Did the authors use thresholding to quantify p53/p21 positive cells?

      • Figure 4D and Figure 4 - Figure supplement 1G: The graph is misleading and should not be presented as a continuous line.

      • Figure 5A and C: A direct and statistical comparison of mitotic timing upon expression of different TRIM37 mutants to wildtype and TRIM37-/- cells is missing

      • Figure 6B: A loading control/Ponceau staining is missing as well as the quantification of protein levels

      • Figure 6D: It is unclear if the centrosomal signal intensity was quantified in interphase or mitotic cells

      • Figure 7C: A loading control/Ponceau staining is missing

      • Figure 2 - figure supplement 2F and G: It would help if the authors could highlight the cell line, e.g. RPE-1 (F) or A375 (G) in the venn diagrams.

      • Figure 4 - figure supplement 1E: it appears that the BirA antibody gives only an unspecific signal. It would be useful to show if the different TRIM37 variants are able to localise to the centrosomes. Furthermore it appears that centrosomes are missing in the C18R and 505-709 variants. It would be useful if the authors quantify centrosome numbers upon expression of different Trim37 variants as shown in Figure 4 - figure supplement 1. To make the identification of the cell easier it would help to include a DNA signal or indicate the outline of the cell.

      • The generation of stable and dox-inducible cell lines is missing in the material and methods

      Significance

      Centrosome loss in mammalian cells triggers a somewhat mysterious p53-dependent irreversible cell cycle arrest that bears similarities with senescence. A key modulator of this arrest is the E3 ubiquitin ligase TRIM37; TRIM37-overexpressing cells show increased sensitivity to centrosome loss whereas TRIM37 deletion restores normal growth to cells lacking centrosomes. The precise function of TRIM37 in this process is still not clear.

      The authors here report a two-pronged approach to improve our understanding; first, they perform several genome-wide CRISPR/Cas9 screens in two cell lines to identify new players that modulate growth arrest following inhibition of centrosome duplication, and second, they analyse the function of TRIM37, their top candidate, in this process. Whereas the screens recapitulate previous reports by identifying a near identical set of genes, the functional work of TRIM37 provides interesting new data that go beyond (and at places contradict) published work. They describe a complex relationship between TRIM37 function, PLK4 inhibition and growth arrest, and suggest that TRIM37 acts via modulating PLK4 phosphorylation/stability and perhaps its role in autophagy also contributes to the overall phenotype. These possibilities will need to be tested in the future but the current manuscript contains enough interesting and potentially important data that it is worthy of publication following revision.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this paper, the authors use a previously published method, SHAP, for interpreting deep learning (DL) models (specifically LSTMs) that are trained to predict physicochemical attributes of peptides (such as antigenicity and collisional cross section). The paper shows that it is capable of identifying some amino acid residues contributing to the prediction results of the DL models.

      Reviewer #1 (Significance (Required)):

      1. One main idea of the paper is to use SHAP to determine the significant amino acids at each position (or pairs of AA at each position) contributing to the prediction. Some of the interpretation results are consistent with findings reported previously. This is very nice; however, most of these findings are statistical results such as "XX is often present at the second position for the peptides with the positive outcome", which are relatively straightforward and may be derived by using some statistical methods without using DL models. We expect more complex patterns can be discovered in addition to these statistical observations.

      We thank the reviewers for these comments.

      First, to the point about discovering complex patterns, we note that one use of PoSHAP we discuss later in the paper is that PoSHAP enables interposition dependence analysis, which depends on interactions between residues and would not be reflected by summary statistics.

      Second, we agree it is important to show whether PoSHAP produces different residue importance maps than simple statistical summaries of amino acids in each group. The strongest binding peptides, or the highest mobility for the CCS model, were determined by taking only peptides that fall above a linear regression best fit of the ranked experimental values. Statistical summary heatmaps were created and then compared to those from PoSHAP revealing some similarities but also many differences. We added the following text and new figure to the results section to illustrate these points:

      “We wondered whether the patterns revealed by PoSHAP simply reflect the summary statistics for the high-binding or high-CCS subset of peptides. As expected, due to known differences in amino acid abundance across the proteome, the prevalence of amino acids differed across the training data and was also heterogeneous across positions (Figure 5A). To determine the subset of high CCS peptides, peptides were ordered in the training set by their CCS rank and then linear regression was performed to get the average trend line (Figure 5B). Any peptide above that trendline was defined as “high CCS”, and the frequency of amino acids at each position in this set was summarized using a heatmap (Figure 5C). Compared to the statistical amino acid frequencies, PoSHAP suggests a greater importance to arginine at both termini, the importance of tryptophan to increase CCS becomes apparent, and interior glutamic acid contributes less to high CCS than the frequencies would suggest (Figure 5D). The same analysis was repeated for MHC data (Supplementary Figures 9 and 10). This demonstrates that PoSHAP found non-linear relationships between the inputs and the outputs that are not apparent from simple correlation.”

      Figure 5: Amino acid summary statistics differ from PoSHAP values for the CCS data. (A) Amino acid counts as a function of position for training data. (B) Procedure for picking the ‘top peptides’ with the highest CCS. Linear regression was performed on the peptides ranked by their actual CCS value. Any peptide that fell above the trendline and overall mean were defined as ‘top peptides’. (C) Counts of amino acids for the top peptides were summarized in a heatmap. (D) Mean SHAP values across amino acids and positions from PoSHAP analysis.
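      The selection step described for Figure 5B can be sketched roughly as follows. This is a minimal illustration and not the code used for the manuscript: the function names `fit_line` and `select_top_peptides` and the toy CCS values are hypothetical, and the sketch assumes that a "top peptide" must lie above both the rank trendline and the overall mean, as the caption states.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def select_top_peptides(values):
    """Return indices of values above both the rank trendline and the overall mean.

    Assumption: peptides are ranked in ascending order of their measured value,
    and the regression is value ~ rank.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = list(range(len(order)))
    ranked_values = [values[i] for i in order]
    slope, intercept = fit_line(ranks, ranked_values)
    overall_mean = mean(values)
    return [
        idx
        for rank, idx in enumerate(order)
        if values[idx] > slope * rank + intercept and values[idx] > overall_mean
    ]


# Hypothetical CCS-like values: a flat bulk and two clear outliers.
ccs = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 5.0, 9.0]
top = select_top_peptides(ccs)
```

      In this toy input only the two largest values sit above both thresholds, so the amino acid counts for a panel like Figure 5C would be tallied from those peptides alone.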

      We also added the corresponding supplemental figures showing the same examples for the MAMU A001 model and human MHC models:

      Supplemental Figure 9: Amino acid summary statistics differ from PoSHAP values for the A001 MAMU MHC I data. (A) Amino acid counts as a function of position for training data. (B) Procedure for picking the ‘top peptides’ with the highest CCS. Linear regression was performed on the peptides ranked by their actual CCS value. Any peptide that fell above the trendline and overall mean were defined as ‘top peptides’. (C) Counts of amino acids for the top peptides were summarized in a heatmap. (D) Mean SHAP values across amino acids and positions from PoSHAP analysis. For the MAMU model, the amino acid frequencies of the input peptides show no obvious preference for amino acid position, but some amino acids are over-represented overall. The presence of the “end” token is more likely to be a high binder statistically (C), but the PoSHAP reveals that this end token is not the main determinant of binding (D).

      Supplemental Figure 10: Amino acid summary statistics differ from PoSHAP values for the human A1101 MHC I data. (A) Amino acid counts as a function of position for training data, showing the distribution of amino acids in this data. (B) Procedure for picking the ‘top peptides’ with the highest CCS. Linear regression was performed on the peptides ranked by their actual CCS value. Any peptide that fell above the trendline and overall mean were defined as ‘top peptides’. (C) Counts of amino acids for the top peptides were summarized in a heatmap. (D) Mean SHAP values across amino acids and positions from PoSHAP analysis. There are clear differences between the summary statistics of top peptides (C) and PoSHAP heatmap (D). For example, the end token is prominent in the summary statistics but absent from the PoSHAP interpretation. Also, the preference for S/T/V at position two is tempered according to PoSHAP, but would be determined to be very important by the summary statistics.

      Although the interpreting results reported in the paper largely agree with previous reports, the paper did not explicitly model the frequency of different amino acids in the training data. For instance, if the amino acid 'A' happens to be over-represented in the positive samples of peptides in the training data, the DL model may consider it to contribute to the positive prediction, which may not be true. This issue might become more serious when pairs of amino acids are considered. The authors may want to analyze this potential issue in their results.

      We agree and understand the concern for the overrepresentation of amino acids that might skew the training of our models. To determine if this is an issue, as part of the response to the previous question, we looked at the amino acid counts for all peptides (Figure 5A, Supplemental Figures 9A and 10A). In general, the PoSHAP heatmaps (panel Ds in the same figures) look very different from the frequencies of amino acids (panel Cs in the figures), suggesting that amino acid frequencies have not caused any problem.

      Even on a balanced training dataset, the LSTM model to be interpreted may still contain arbitrary bias due to invertible overfitting, which the authors did not discuss. It will be more convincing by training multiple models using different hyper-parameters and optimization algorithms, and then see if similar interpretation results can be reached among most or all of these models.

      We assume the reviewer meant ‘inevitable overfitting’ instead of “invertible overfitting”? If so, the original manuscript did assess overfitting in Figure S4 based on the training and validation loss over training epochs.

      We think the reviewer makes a good point that different models might produce different interpretations, so we trained new models without optimization and with different hyperparameters and with a different optimizer (RMS prop). We see essentially the same PoSHAP interpretations. We added the following text to the results section along with these three new supplemental figures:

      “Given the dependence of the model interpretation results on the model used, the same model architecture trained with different parameters might result in different model interpretation. Given this, models for each of the three tasks mentioned here were retrained with different hyperparameters including the “RMS prop” optimizer. Each model produces similar or better prediction performance compared to the earlier version, and the model interpretation by PoSHAP was almost identical to the previous results in all three cases (Supplementary figures 12, 13, 14). This suggests that the model architecture drives the differences in interpretation, not the model training process.”

      Supplemental Figure 12. PoSHAP Analysis of Mamu A001 With Unoptimized Hyperparameters and RMSprop. A new model for the Mamu data was trained using the same architectures but with different hyperparameters and RMSprop as the optimization algorithm. Loss was plotted as mean squared error compared to the validation data. (A) Similar metrics for MSE, r, and p-values were obtained (B). Similar patterns are also observed for the PoSHAP heatmap of A001. (C) A dependence plot for A001 shows similar patterns to the Adam optimized model, including the positional dependence of proline at position two for high SHAP values of serine and threonine.

      Supplemental Figure 13. PoSHAP Analysis of A:11*01 With Unoptimized Hyperparameters and RMSprop. A new model for the A:11*01 data was trained using the same architectures but with different hyperparameters and RMSprop as the optimization algorithm. Loss was plotted as mean squared error compared to the validation data. (A) Similar metrics for MSE, r, and p-values were obtained (B). Similar patterns are also observed for the PoSHAP heatmap of A:11*01. (C) The SHAP ranges by position plot for A:11*01 shows similar patterns to the Adam optimized model, including the largest range of SHAP values at position two, nine, and ten.

      Supplemental Figure 14. PoSHAP Analysis of CCS With Unoptimized Hyperparameters and RMSprop. A new model for the CCS data was trained using the same architectures but with different hyperparameters and RMSprop as the optimization algorithm. Loss was plotted as mean squared error compared to the validation data. (A) Similar metrics for MSE, r, and p-values were obtained (B). Similar patterns are also observed for the PoSHAP heatmap of CCS. (C) Dependence analysis was performed on the dataset and the combined distance-interaction type bar plot shows similar relationships between the groupings, notably charge repulsion’s split.

      For the dependence analysis, it is not completely clear why the distance is used as the variable, while the relative position of the amino acid residue in the peptide is ignored. For example, if there is a strong interaction between the first and the last residues in the peptide, their distance changes depending on the peptide length. In figure 6, the authors showed strong interactions between amino acid that are 8-9 residues apart may suggest the peptide length actually plays a role here.

      We used distance because the dependence analysis calculates the difference in means between two distributions of SHAP values, split according to the amino acid present at another position. We believe that the distance between these interacting points is a natural choice and among the most informative metrics to explain these interactions. We agree with the reviewer that peptide length is important to the magnitude of the interactions between amino acids. We also recognize that there may be interactions between the peptide termini that could be obscured by the interactions of the longer peptides. To better explore this possibility, we performed the dependence analysis on each of the different peptide lengths separately (8, 9, or 10 here). Unfortunately, given the smaller size of these data subsets, we were unable to show significant differences in the interaction groupings. Interestingly, however, the significant interactions for the peptides of length eight only occurred between neighboring amino acids or the termini. This may suggest an interaction between termini that could be explored in the future.
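      To make the quantity concrete, the difference in means can be sketched as below. This is an illustrative outline under stated assumptions, not the analysis code from the manuscript: `dependence_effect`, the 0-based positions, and the toy peptides and SHAP values are all hypothetical, and the significance testing and multiple-testing correction used in the paper are omitted.

```python
from statistics import mean


def dependence_effect(peptides, shap_values, target, partner):
    """Difference in mean SHAP of `target` with vs. without `partner` present.

    `target` and `partner` are (position, amino_acid) tuples, 0-based.
    `shap_values[k][p]` is the SHAP value of position p in peptide k.
    Returns (difference_in_means, distance), or None if either group is empty.
    """
    t_pos, t_aa = target
    p_pos, p_aa = partner
    with_partner, without_partner = [], []
    for seq, shap_row in zip(peptides, shap_values):
        # Skip peptides that are too short or lack the target residue.
        if len(seq) <= max(t_pos, p_pos) or seq[t_pos] != t_aa:
            continue
        bucket = with_partner if seq[p_pos] == p_aa else without_partner
        bucket.append(shap_row[t_pos])
    if not with_partner or not without_partner:
        return None
    return mean(with_partner) - mean(without_partner), abs(t_pos - p_pos)


# Hypothetical data: does a proline at position 2 shift the SHAP of serine at position 1?
peptides = ["ASPK", "ASGK", "ATPK", "ASPR", "ASGR"]
shap_values = [
    [0.0, 0.5, 0.1, 0.0],
    [0.0, 0.25, 0.0, 0.0],
    [0.0, 0.3, 0.1, 0.0],
    [0.0, 0.75, 0.1, 0.0],
    [0.0, 0.25, 0.0, 0.0],
]
effect, distance = dependence_effect(peptides, shap_values, (1, "S"), (2, "P"))
```

      Returning the distance alongside the effect is what allows significant interactions to be grouped as neighboring, near, or far afterwards.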

      We added the following text and supplemental figure 11 to the results:

      “Finally, to ask whether the absolute positions of amino acids in the peptide are relevant for the interaction, the data was split into 8, 9, or 10mers before analysis (Supplemental Figure 11). This revealed that there may be interactions between the termini, but this effect may be difficult to observe because there are significantly fewer 8mers and 9mers in the CCS dataset.”

      Supplemental Figure 11. SHAP Values of Collisional Cross Section by Peptide Length. The impact of peptide length on SHAP values was explored for the CCS data. The dataset was split into peptides of length 8, 9, and 10. All SHAP values were plotted as violin plots. The mean SHAP values were plotted in heatmaps by position and amino acid and standardized. Significant interactions by dependence analysis were plotted in bar charts by distance between interactions.

      To further support our decision to use distance as an interaction metric, we have also now included an additional box plot for Figure 7, demonstrating the interactions between each of the categories combined with distance. We have found that some of the bimodality of the interaction categories is explained by the distance at which they interact. Most striking is charge repulsion, which decreases CCS when the residues are neighboring but increases CCS when they are farther apart.

      We added the following text and updated Figure 7 to the results section:

      “Additionally, there are interesting differences in the interactions of the amino acid among the significant set of interactions (Figure 7B). All significant interactions from the CCS data (Supplemental Table 3, adj. p-value Though it is evident that the mean of each interaction type corresponds to the expected impact those interactions would have on CCS, each of the interaction dependence plots are bimodal, with some interactions increasing CCS and some decreasing it. To dissect this observation further, we combined the two methods of splitting the data to see if the bimodality of interaction types would be resolved by distance (Figure 7C). Though definitive conclusions cannot be made for most categories, likely due to the ever decreasing sample size by splitting, of note is the difference between neighboring charge repulsion and non-neighboring charge repulsion. Neighboring charge repulsion seems to decrease CCS while distant charge repulsion increases CCS (see adjusted p-value from Tukey’s posthoc test in Figure 7D). When distant, charge repulsion makes intuitive sense as the amino acids are forced apart, linearizing the peptide and increasing the surface area. When neighboring, it is possible that the repulsion causes a kink in the linear peptide, decreasing the cross section. Overall, these analyses demonstrate that the models were able to learn fundamental chemical properties of the amino acids and through PoSHAP analysis we were able to uncover them.”

      Figure 7. Dependence analysis of CCS model. (A) Significant (Bonferroni corr. P-value = charge repulsion, * = other, and δ = polar. For the distance analysis, interactions were grouped into three categories, neighboring (distance = 1), near (distance = 2, 3, 4, 5, 6), and far (distance = 7, 8, 9). * indicates significance (ANOVA with Tukey’s post hoc test p-value

      Also, it would be better to show what the result looks like when this method is applied to peptides in the negative samples (e.g., the peptides that are not bound by MHC in the antigenicity prediction experiment). Will the interpreting results also be negative?

      We agree this is an interesting idea. We updated the supplemental figure showing PoSHAP of top peptide subsets to also show PoSHAP of bottom peptide subsets (supplemental figure 8). The results suggest that certain amino acid positions are detrimental to binding, for example D/E at various positions. We updated this section to add:

      “We also performed the same analysis with the eight peptides with the lowest binding predictions (Supplemental Figure 8). These PoSHAP heatmaps are primarily composed of negative SHAP values, suggesting that using this subset reveals amino acids at certain positions that are detrimental to MHC binding.”

      Supplemental Figure 8. Pooled PoSHAP for bottom and top predicted subsets of the data. The mean SHAP values for each amino acid at each position were calculated for the peptides with the bottom (A) or top (B) 0.013% predicted intensity (top 8 peptides) for the “A” Mamu alleles. Due to the small sample size, most of the amino acid positions have a value of zero. The positions with extreme values, however, illustrate important amino acids for prediction. Notably for A001 and A002, aspartic acid and glutamic acid contribute to low prediction along the peptide, suggesting charge may inhibit binding. For the top predictions, phenylalanine or leucine are important at the first position for both A001 and A008. A serine or threonine at position two is important for A001, A002, and A008. All alleles demonstrate the importance of a proline near the middle of the peptide.

      Finally, it will be interesting to see the interpreting results when the method is applied to the DL models on more challenging tasks such as the prediction of tandem mass spectra of peptides. The authors may want to discuss these applications.

      We agree it would be very interesting to apply this method to interpret predictions of tandem mass spectra. In this paper we already demonstrated PoSHAP on three different datasets with three different models, so we feel that adding a fourth model is out of the scope of this work. We do agree that we would like to explore this option in the future. We added this idea to the discussion section:

      “Altogether the advances described herein are likely to find widespread use for interpreting models trained from biological sequences, including models not covered here such as those to predict tandem mass spectra (reviewed in 33).”

      I am primarily interested in algorithmic and statistical problems in genomics and proteomics. We have developed deep learning models for predicting the full tandem mass spectrum of peptides, and I am interested in model interpretation methods to explain the fragmentation mechanism resulting in non-conventional fragment ions in tandem mass spectra of peptides. I reviewed the paper in collaboration with my Ph.D. students, who are developing deep learning models for computational mass spectrometry.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): **Comments to the Authors** In this study, the authors developed a framework named PoSHAP for the interpretation of neural networks trained on biological sequences. The current manuscript can be stronger if the following issues can be clearly addressed.

      1. As interpreting model with SHAP is a vital part of this manuscript, it would be better to provide descriptions of the underlying principles of SHAP to enable the readers to understand the paper easily.

      We recognize that understanding the principles of SHAP is vital. To better explain SHAP, we have added the following text to the introduction:

      “SHAP is a perturbation-based explanation method where the contribution of an input is calculated by hiding that input and determining the effect on the output. SHAP expands this using the game theoretic approach of Shapley values, which ensures that the contributions of the inputs plus a calculated baseline sum to the predicted output.”
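      The additivity property in this description can be checked on a toy example. The sketch below computes exact Shapley values by enumerating every ordering of the inputs, with a "hidden" input represented by a baseline value. It is an assumption-laden illustration only: `exact_shapley` and `toy_model` are hypothetical names, and this brute-force enumeration is not how the SHAP package's KernelExplainer works internally (that method approximates these values by sampling against a background dataset).

```python
from itertools import permutations


def exact_shapley(model, x, baseline):
    """Exact Shapley values for one input `x`, hiding features via `baseline`.

    Averages each feature's marginal contribution over all orderings in which
    features are revealed. Feasible only for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)      # start with every feature hidden
        previous_output = model(current)
        for i in order:
            current[i] = x[i]         # reveal feature i
            new_output = model(current)
            phi[i] += new_output - previous_output
            previous_output = new_output
    return [total / len(orderings) for total in phi]


def toy_model(features):
    """Hypothetical model with one additive term and one interaction term."""
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)
base_value = toy_model(baseline)
```

      Here the interaction term `b * c` is shared equally between the two features that produce it, and `base_value + sum(phi)` equals `toy_model(x)`.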

      It is emphasized in the manuscript that PoSHAP is introduced to interpret neural networks trained on biological sequences. However, it is not clear why the authors choose the Model Agnostic Kernel SHAP, which is based on Linear LIME. Although it can be used for any model, its performance may not be optimal. In this regard, perhaps Deep SHAP or Gradient SHAP is more appropriate, both of which are designed for deep learning networks [1]. It would be better to provide some additional experiments on Deep SHAP, and this work will be more convincing if they yield the same or similar contribution of each position on each peptide as Kernel SHAP. [1] Lundberg, S., and S. I. Lee. "A Unified Approach to Interpreting Model Predictions." Nips 2017.

      Our goal in using KernelExplainer was to demonstrate that PoSHAP was not dependent on model specific interpretation methods. However, we have realized that this intention may not have been clearly stated or demonstrated. To expand on this, we have included a new Figure 8, which shows PoSHAP analysis comparisons to other classes of machine learning models, all using KernelExplainer. This result was interesting because it revealed that even though the XGBoost model technically performed better at prediction (Figure 8A, reduced MSE and higher Spearman rho), and produced a similar PoSHAP motif heatmap, the interpositional dependences from the perspective of distance (Figure 8C) or chemical interactions (Figure 8D) were substantially muted. This is also apparent with the other standard machine learning model ExtraTrees. This result shows that the choice of model architecture is important, and this direct comparison would not be possible if we used the DeepExplainer.

      We added the following text and figure to the manuscript:

      “PoSHAP uses the SHAP KernelExplainer method, which is based on Local interpretable model-agnostic explanations (LIME). Using the general KernelExplainer method enables direct comparison of interpretations produced by different models trained from the same data. To ask whether PoSHAP interpretation changes based on the model used, the CCS data was used to train XGBoost or ExtraTrees models. Surprisingly, the XGBoost model performed better than the LSTM model with regard to MSE and Spearman rho between true and predicted values in the test set (Figure 8A). ExtraTrees was slightly worse than the other two models. The model interpretation heatmaps from PoSHAP were similar between the LSTM and XGBoost, but the interpretation from the ExtraTrees model was missing the high average SHAP due to N-terminal histidine or arginine (Figure 8B). Even though XGBoost produced a similar PoSHAP heatmap, the interpositional dependence with regard to distance (Figure 8C) and chemical interactions (Figure 8D) was muted. This shows that the choice of model is important for revealing amino acid interactions.”

      Figure 8. CCS PoSHAP of Various Machine Learning Models. PoSHAP analysis was performed on two additional machine learning models, Extra Trees and Extreme Gradient Boosting (XGB). Predictions were plotted against experimental values and the Mean Squared Error and r values are reported for each model (A). PoSHAP heatmaps were created for each model (B), illustrating an increase in model complexity as more sophisticated models are used. Dependence analysis was performed on each model and the significant interactions are plotted by distance (C) and by combined distance and interaction type (D).

      As described in the manuscript, "Correlations between true and predicted values were assessed by MSE, Spearman's rank correlation coefficient, and the correlation p-value." As an important indicator for evaluation, the exact p-values should be provided in the seven subgraphs in Figure 2, not p=0.0.

      We agree with the reviewer that reporting accurate p-values can assist in evaluation. We have updated the figures to reflect the p-values as far as we were able to determine them. Unfortunately, we are limited by the nature of the double data type in Python and so reported that the p-value was less than the minimum value allowed by a double in six of the seven graphs. Additionally, the scales have been marked symmetrically, as requested in comment 4.
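      The floating-point limit mentioned here can be illustrated with the standard library alone. The sketch below is a hedged example rather than the plotting code used for the figures: `format_p_value` is a hypothetical helper, and it assumes that "the minimum value allowed by a double" refers to the smallest positive normalized double, `sys.float_info.min`.

```python
import sys

# Smallest positive normalized double, roughly 2.2e-308.
# (Subnormal values can be smaller, but with reduced precision.)
smallest_normal = sys.float_info.min

# A product whose true value is 1e-400 cannot be represented and underflows to 0.0,
# which is how an extremely small p-value ends up displayed as "p = 0.0".
underflowed = 1e-200 * 1e-200


def format_p_value(p):
    """Report an upper bound instead of a misleading zero."""
    if p < smallest_normal:
        return f"p < {smallest_normal:.2e}"
    return f"p = {p:.2e}"


label_tiny = format_p_value(underflowed)
label_normal = format_p_value(0.03)
```

      Reporting the bound avoids implying that the p-value is exactly zero.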

      It should be noted that the coordinate scales of Figure 2B and Figure 2C need to be marked symmetrically. And from Figure 2B, we can see that, the IC50 with smaller (0.8) values cannot be well predicted. Can the authors provide a detailed explanation about these results?

      We understand the reviewer’s concern with poor prediction of extreme values. Figure 2B represents the IC50 prediction for the A1101 human allele, which was the smallest of the datasets we used for training. It only consists of 4,522 entries, around 1/10 of the data used for the Mamu alleles and CCS. Because of this, it is likely that there were not enough examples of datapoints at the extremes to reliably train the model to account for them. However, given the limited size of the dataset, we were surprised by how satisfactory the predictions were. More importantly, the purpose of our paper is model interpretation, not model prediction accuracy, and this shows that even when predictions are not perfect, the model interpretation by PoSHAP can still be effective. We thank the reviewer for noticing this and added the following statement to the results:

      “Remarkably, this was achieved for A*11:01 using a total dataset of only 4,522 examples, which shows that PoSHAP can be effective with even less than 10,000 training examples.”

      References are needed in some descriptions in the manuscript. For example, "one might train a network to take an input of peptide sequence and predict chromatographic retention time", "RNNs have found extensive application to natural language processing, and by extension as a similar type of data, predictions from biological sequences such as peptides or nucleic acids".

      We apologize for missing these references. We have now cited these statements and have added many additional references as part of our revision.

      The description of the adopted three models in the section "Model architecture" is a bit confusing. As described in this section, "The LSTM layer outputs a 50x128 dimensional matrix to a dropout layer where a proportion of values are randomly set to 0", "a second LSTM layer outputs a tensor with length 128 and a second dropout layer then randomly sets a proportion of values to 0". But as shown in the Supplemental Figure 3, the output size of the first LSTM was 10x128. Also, as shown in Table 1, the dropout rates were not 0. Therefore, the section should be adjusted for clear clarification.

      We apologize for the confusing wording. We meant that dropout layers randomly set values=0, not that the dropout proportion was 0. We reworded this part to read:

      “The LSTM layer outputs a 10x128 dimensional matrix to a dropout layer where a proportion of values are randomly “dropped”, or set to 0. For the MHC models, a second LSTM layer outputs a tensor with length 128 to a second dropout layer. Then in all models, a dense layer reduces the data dimensionality to 64. For the MHC models, the data is then passed through a leaky rectified linear unit (LeakyReLU) activation before a final dropout layer, present in all models.”
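      For readers unfamiliar with the term, the "dropping" performed by a dropout layer can be sketched in a few lines. This is a simplified, framework-free illustration and not the layer used in the models: the `dropout` function and its `scale` option are hypothetical, and the optional rescaling of the surviving values by 1 / (1 - rate) reflects the common "inverted dropout" convention applied during training.

```python
import random


def dropout(values, rate, rng, scale=False):
    """Set each value to 0.0 with probability `rate`.

    If `scale` is True, surviving values are divided by (1 - rate) so that the
    expected sum of the outputs matches the inputs.
    """
    keep_scale = 1.0 / (1.0 - rate) if scale and rate < 1.0 else 1.0
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)  # seeded only to make the sketch reproducible
activations = [0.2, -1.3, 0.7, 0.5, 1.1, -0.4]
dropped = dropout(activations, rate=0.5, rng=rng)
```

      The dropout proportion is a hyperparameter of the layer, which is distinct from the value 0 assigned to the dropped entries.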

      Reviewer #2 (Significance (Required)): Pls refer to my comments provided as above.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): **Summary:** The main goal of the work is to provide the interpretation of Deep Neural Networks (LSTM in the paper) trained on biological sequences. For this purpose authors used the framework introduced earlier - SHapley Additive exPlanations (SHAP), in particular - the slight adaptation of this method called positional SHAP (PoSHAP), because they are interested in the impact of each position of the input sequence to the model output. They demonstrate this on three regression tasks that predict peptide properties. **Major comments** The main contribution, highlighted in the paper: authors showed how PoSHAP discloses amino acid motifs that influence MHC I binding. Further they described how PoSHAP enables understanding of interpositional dependence of amino acids that result in high affinity predictions. Also they argued that this work also contributes to a method for accurate prediction of peptide-MHC I affinity using peptide array data enabled by novel application of a neural network that combines amino acid embedding and LSTM layers.

      There are some comments about the statements above: 1.Why was the LSTM model chosen? Recent publications showed the success of the Transformer model for biological sequences; however this direction was not covered in the related work overview. The architecture choice then should be better justified. Also the choice of LSTM for the biological sequences is not new and authors should better claim their statement about "novel application of a neural network that combines amino acid embedding and LSTM layers ". Where exactly is the novelty? Could the community use the pretrained embeddings for their purpose?

      The reviewer is correct that transformer models are highly effective for making predictions from biological sequences. In fact, many models do well, and there is no single correct choice of model for this task. Though there are many models to choose from, our models are sufficiently accurate. Importantly, the main contribution of our manuscript is not to train the most accurate models, but rather to demonstrate a strategy for positional model interpretation based on SHAP. Related to that point, please note our response to reviewer #2’s second comment that our approach uses the kernel explainer and can be applied to any model. However, we do agree that we neglected coverage of the transformer model in the introduction and have added a paragraph to the introduction covering some of the recent work in this area:

      “Many effective deep learning model architectures are available for making predictions from inputs of biological sequences, and there is currently no single correct choice. CNN models such as MHCflurry 2.0 (40) and LSTM models are effective at predicting MHC binding of peptides (41). Even simpler models, such as random forests, have been used to predict MHC binding (42,43). Prediction of other peptide properties like tandem mass spectra are often done with CNN or LSTM models (33). More recently, given the extraordinary performance of transformer models like BERT (44) and GPT-3 (45) for NLP, there is interest in transformer models for biological sequences (46).”

      We also want to be sure we do not overstate the novelty of our contributions. We have updated our discussion to better reflect the nature of our contributions. We reworded the statement quoted above to read:

      “Overall, the three modeling examples laid out herein serve as a tutorial for PoSHAP interpretation of almost any model trained from almost any biological sequence.”

      The attention mechanism itself provides a great opportunity to interpret the model predictions. In the introduction section authors made a statement that attention layers may limit the flexibility of model architecture when designing new models. Could they better explain this limit? Because recent state of the art models successfully work with long biological sequences and show better results than any other models (one example could be found here: https://openreview.net/pdf?id=YWtLZvLmud7). Authors should cover these limits more, which also relates to the motivation of the LSTM choice.

      We added a paragraph to our introduction to expand on attention and its limitations:

      “Attention mechanisms have been successful in recapitulating experimentally defined binding motifs, but require that the model be constructed with attention layers. This may limit the flexibility of model architecture when designing new models. For example, attention mechanisms are specific to neural networks. Simpler models, such as random forests and XGBoost, may also be more suitable for some applications, and these cannot utilize attention. Also, while attention mechanisms are currently very effective, there is always a possibility that new architectures will emerge that make interpretations using attention infeasible. Beyond this, attention is a metric of the model itself, while SHAP values are calculated on a per input basis. By looking at the model through the lens of the inputs, we can understand the model’s “reasoning” behind any peptide. Attention mechanisms also do not enable dissection of interpositional dependencies between amino acids. Thus, new methods for model agnostic interpretation are desirable.”

      Another statement was made about the PoSHAP - adaptation of the SHAP method. It is hard to follow through the explanation of this adaptation - it is not clear what exactly is this adaptation. For example, Kernel SHAP from the original paper computes feature importance, in this paper authors compute the impact of each position, that is basically also the feature importances. Thus authors should better explain the statement about PoSHAP novelty. Will it be possible to use PoSHAP for any other model trained for the same purpose? If yes, for better reproducibility, authors should provide the place where exactly in the repo is the code for this. Also mathematical notations are missing in the Positional SHAP (PoSHAP) section - it is better to explain the adaptation with them to increase the understanding of the section.

      We apologize for the ambiguous wording in the abstract stating that “PoSHAP adapts SHAP”. We have reworded this statement to “PoSHAP utilizes SHAP”. The novelty of this approach is taking the feature importance values calculated by SHAP and structuring them to include each position’s index to allow for the interpretation of biological sequences. As we demonstrate here, this allows for novel interpretations of previously published data and will enable model interpretation in future studies that learn from biological sequences. Although this is practically very simple, we are not yet aware of any examples in the literature that do this.
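
      The reviewer asks for mathematical notation, so the following is a minimal sketch of how the approach described above could be written down. The symbols are chosen here for illustration and are not taken from the manuscript. The first expression is the standard Shapley value that SHAP approximates. The second assumes that the positional summary is the mean SHAP value over all peptides carrying amino acid a at position p.

```latex
% Standard Shapley value of feature i, for model f, input x, and feature set F
\phi_i(f, x) = \sum_{S \subseteq F \setminus \{i\}}
  \frac{|S|!\,(|F| - |S| - 1)!}{|F|!}
  \bigl[ f_x(S \cup \{i\}) - f_x(S) \bigr]

% Assumed positional aggregation: mean SHAP value of amino acid a at position p,
% where N_{a,p} is the set of peptides with residue a at position p
\bar{\phi}_{a,p} = \frac{1}{|N_{a,p}|} \sum_{n \in N_{a,p}} \phi_p\bigl(f, x^{(n)}\bigr),
\qquad N_{a,p} = \{\, n : x^{(n)}_p = a \,\}
```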

      The following two SHAP force plots demonstrate the difference between using SHAP as-is versus PoSHAP. There is a demonstrated need for such a framework, considering the dearth of biological sequence model interpretation using SHAP and the ambiguity within biological sequence SHAP interpretation. For example, Meier et al., Nature Communications, 2021 performed an analysis like our Figure S7C, which just shows the range of SHAP values per residue. Although we can learn something about which AAs are important based on the range of their SHAP values, SHAP as-is doesn’t reveal a motif. While our position indexing is a simple change, it enables all the rich, sequence dependent analysis we performed in this paper. We added the following text to the results section with this new supplementary figure:

      “PoSHAP utilizes the standard SHAP package but adapts the analysis by simply appending an index to each input and maintaining positional information after the KernelExplainer interpretation, which enables tracking of each input position’s contribution to an output prediction (supplementary figure 5, showing force plot with and without index).”

      Supplemental Figure 5. SHAP Forceplots Demonstrating PoSHAP Indexing. Two forceplots were created with the SHAP forceplot method of the third peptide in the CCS testing set. (A) shows the plot with encoded inputs mapped to their amino acid. (B) shows the plot with the encoded inputs mapped to their amino acid and position. The addition of positional indexing removes the ambiguity of contributions, for example, glutamine having both a positive and a negative SHAP contribution to the prediction of the third peptide.

      We have updated the repository to include a tutorial that demonstrates PoSHAP on provided data and explains how to use PoSHAP with your own model and data.

      In the experimental section, authors first compare the results with previously known. For example, for the human MHC allele A*11:01 model PoSHAP analysis shows the similar results as was shown with another approach. Based on the provided explanation, it is not clear why PoSHAP is better than the previously published method. The advantage of the PoSHAP should be better explained.

      We agree with the reviewer that the benefits of our approach should be as clear as possible. The referenced section of the paper is to validate our approach compared to another model interpretation technique. We added a new third paragraph to the discussion section to clearly explain the benefits of PoSHAP:

      “There are several benefits of PoSHAP over competing methods. First, PoSHAP determines important residues despite biases in the frequencies of amino acids (Figure 5, Supplementary Figures 9 and 10). PoSHAP is also applicable to any model trained from sequential data (Figure 8), and enables dissection of interpositional dependencies (Figures 6 and 7). Finally, we include a clearly explained jupyter notebook on Github that will take any model and dataset and perform PoSHAP analysis.”

      In the experimental section, after the PoSHAP performance verification, hypothesis generation was introduced. However, it is not clear how many hypotheses were generated; how many of them were known before; what kind of other categories are inside these hypotheses (unknown, possible and potentially interesting, etc).

      We are unsure as to how to quantify the number of hypotheses generated by our approach. In a sense, the SHAP value of each amino acid at each position within a heatmap represents a hypothesis of the contribution of that amino acid to the metric being predicted. Each significant interaction listed in the first three supplemental tables represents a hypothesis of the interactions between two given amino acids at two positions. Turning these into testable hypotheses requires some analysis, as we have discussed, for example for the two binding motifs (L-T-P, F-S-P) of A001 or the distance-type interactions within the CCS.

      The README section in the GitHub repo is not easily understandable. An additional explanation for each step is required (e.g., links to the folders where the calculated SHAP values, the trained models, all splits and all-important benchmarks are).

      We have updated the README and repository to explain how to use PoSHAP and to describe each item in the repository.

      **Minor comments**

      1. The prior studies should be covered better (see Major comments).

      We apologize for not better covering prior studies. We have significantly expanded the introduction by adding two new paragraphs and at least 10 additional citations.

      The work consists of some typos, for example: "However, because many reports forgo model interpretation" - "t" is missed.

      We did intend to use the word “forgo” not “forget” in that sentence. We have checked again thoroughly for spelling and grammar mistakes.

      The hyperparameters table, hyperparameter search section should be moved to the supplemental material, that's technical details.

      We moved this table to the supplementary materials.

      Reviewer #3 (Significance (Required)): Interpretation of the model results is an important topic for biology. New findings here could lead to the discovery of new interactions, new drug development, etc. This is relevant for applied ML researchers and computational biologists. This paper aims to provide a way to do it. Because my field of interest and expertise lies in Machine Learning for healthcare, language modelling of biological sequences and Natural Language Processing, this work is of great interest to me. So I mostly evaluated the ML methodology presented in the paper.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The main goal of the work is to provide the interpretation of Deep Neural Networks (LSTM in the paper) trained on biological sequences. For this purpose authors used the framework introduced earlier - SHapley Additive exPlanations (SHAP), in particular - the slight adaptation of this method called positional SHAP (PoSHAP), because they are interested in the impact of each position of the input sequence to the model output. They demonstrate this on three regression tasks that predict peptide properties.

      Major comments

      The main contribution, highlighted in the paper: authors showed how PoSHAP discloses amino acid motifs that influence MHC I binding. Further they described how PoSHAP enables understanding of interpositional dependence of amino acids that result in high affinity predictions. Also they argued that this work also contributes to a method for accurate prediction of peptide-MHC I affinity using peptide array data enabled by novel application of a neural network that combines amino acid embedding and LSTM layers.

      There are some comments about the statements above:

      1. Why was the LSTM model chosen? Recent publications showed the success of the Transformer model for biological sequences; however, this direction was not covered in the related work overview. The architecture choice should therefore be better justified. Also, the choice of LSTM for biological sequences is not new, and the authors should better support their statement about a "novel application of a neural network that combines amino acid embedding and LSTM layers". Where exactly is the novelty? Could the community use the pretrained embeddings for their own purposes?

      1. The attention mechanism itself provides a great opportunity to interpret the model predictions. In the introduction section, the authors state that attention layers may limit the flexibility of model architecture when designing new models. Could they better explain this limit? Recent state-of-the-art models successfully work with long biological sequences and show better results than any other models (one example can be found here: https://openreview.net/pdf?id=YWtLZvLmud7). The authors should cover these limits in more detail, as they also relate to the motivation for the LSTM choice.
      2. Another statement was made about the PoSHAP - adaptation of the SHAP method. It is hard to follow through the explanation of this adaptation - it is not clear what exactly is this adaptation. For example, Kernel SHAP from the original paper computes feature importance, in this paper authors compute the impact of each position, that is basically also the feature importances. Thus authors should better explain the statement about PoSHAP novelty. Will it be possible to use PoSHAP for any other model trained for the same purpose? If yes, for better reproducibility, authors should provide the place where exactly in the repo is the code for this. Also mathematical notations are missing in the Positional SHAP (PoSHAP) section - it is better to explain the adaptation with them to increase the understanding of the section.
      3. In the experimental section, authors first compare the results with previously known. For example, for the human MHC allele A*11:01 model PoSHAP analysis shows the similar results as was shown with another approach. Based on the provided explanation, it is not clear why PoSHAP is better than the previously published method. The advantage of the PoSHAP should be better explained.
      4. In the experimental section, after the PoSHAP performance verification, hypothesis generation was introduced. However, it is not clear how many hypotheses were generated; how many of them were known before; what kind of other categories are inside these hypotheses (unknown, possible and potentially interesting, etc).
      5. The README section in the GitHub repo is not easily understandable. An additional explanation for each step is required (e.g. links to the folders where the calculated SHAP values, the trained models, all splits and all important benchmarks are).

      Minor comments

      1. The prior studies should be covered better (see Major comments).
      2. The work consists of some typos, for example: "However, because many reports forgo model interpretation" - "t" is missed.
      3. The hyperparameters table, hyperparameter search section should be moved to the supplemental material, that's technical details.

      Significance

      Interpretation of the model results is an important topic for biology. New findings here could lead to the discovery of new interactions, new drug development, etc. This is relevant for applied ML researchers and computational biologists. This paper aims to provide a way to do it. Because my field of interest and expertise lies in Machine Learning for healthcare, language modelling of biological sequences and Natural Language Processing, this work is of great interest to me. So I mostly evaluated the ML methodology presented in the paper.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Comments to the Authors

      In this study, the authors developed a framework named PoSHAP for the interpretation of neural networks trained on biological sequences. The current manuscript can be stronger if the following issues can be clearly addressed.

      1. As interpreting model with SHAP is a vital part of this manuscript, it would be better to provide descriptions of the underlying principles of SHAP to enable the readers to understand the paper easily.
      2. It is emphasized in the manuscript that PoSHAP is introduced to interpret neural networks trained on biological sequences. However, it is not clear why the authors chose the model-agnostic Kernel SHAP, which is based on Linear LIME. Although it can be used for any model, its performance may not be optimal. In this regard, Deep SHAP or Gradient SHAP may be more appropriate, as both are designed for deep learning networks [1]. It would be better to provide some additional experiments with Deep SHAP; this work would be more convincing if Deep SHAP gave the same or a similar contribution for each position in each peptide as Kernel SHAP. [1] Lundberg, S., and S. I. Lee. "A Unified Approach to Interpreting Model Predictions." NIPS 2017.
      3. As described in the manuscript, "Correlations between true and predicted values were assessed by MSE, Spearman's rank correlation coefficient, and the correlation p-value." As an important indicator for evaluation, the exact p-values should be provided in the seven subgraphs in Figure 2, not p=0.0.
      4. It should be noted that the coordinate scales of Figure 2B and Figure 2C need to be marked symmetrically. And from Figure 2B, we can see that, the IC50 with smaller (<0) and larger (>0.8) values cannot be well predicted. Can the authors provide a detailed explanation about these results?
      5. References are needed in some descriptions in the manuscript. For example, "one might train a network to take an input of peptide sequence and predict chromatographic retention time", "RNNs have found extensive application to natural language processing, and by extension as a similar type of data, predictions from biological sequences such as peptides or nucleic acids".
      6. The description of the adopted three models in the section "Model architecture" is a bit confusing. As described in this section, "The LSTM layer outputs a 50x128 dimensional matrix to a dropout layer where a proportion of values are randomly set to 0", "a second LSTM layer outputs a tensor with length 128 and a second dropout layer then randomly sets a proportion of values to 0". But as shown in the Supplemental Figure 3, the output size of the first LSTM was 10x128. Also, as shown in Table 1, the dropout rates were not 0. Therefore, the section should be adjusted for clear clarification.

      Significance

      Please refer to my comments provided above.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this paper, the authors use a previously published method SHAP for interpreting deep learning (DL) models (specifically LSTMs) that are trained for predicting physicochemical attributes of peptides (such as antigenicity and collisional cross section). The paper shows that it's capable of identifying some amino acid residues contributing to the prediction results of the DL models.

      Significance

      1. One main idea of the paper is to use SHAP to determine the significant amino acids at each position (or pairs of AAs at each position) contributing to the prediction. Some of the interpretation results are consistent with findings reported previously. This is very nice; however, most of these findings are statistical results such as "XX is often present at the second position for the peptides with the positive outcome", which are relatively straightforward and may be derived by using some statistical methods without using DL models. We expect that more complex patterns can be discovered in addition to these statistical observations.
        1. Although the interpretation results reported in the paper largely agree with previous reports, the paper did not explicitly model the frequency of different amino acids in the training data. For instance, if the amino acid 'A' happens to be over-represented in the positive samples of peptides in the training data, the DL model may consider it to contribute to the positive prediction, which may not be true. This issue might become more serious when pairs of amino acids are considered. The authors may want to analyze this potential issue in their results.
        2. Even on a balanced training dataset, the LSTM model to be interpreted may still contain arbitrary bias due to inevitable overfitting, which the authors did not discuss. It would be more convincing to train multiple models using different hyper-parameters and optimization algorithms, and then see whether similar interpretation results are reached among most or all of these models.
        3. For the dependence analysis, it is not completely clear why the distance is used as the variable, while the relative position of the amino acid residue in the peptide is ignored. For example, if there is a strong interaction between the first and the last residues in the peptide, their distance changes depending on the peptide length. In figure 6, the authors showed strong interactions between amino acid that are 8-9 residues apart may suggest the peptide length actually plays a role here.
        4. Also, it would be better to show how the results look when this method is applied to peptides in the negative samples (e.g., the peptides that are not bound by MHC in the antigenicity prediction experiment). Will the interpretation results also be negative?
        5. Finally, it will be interesting to see the interpreting results when the method is applied to the DL models on more challenging tasks such as the prediction of tandem mass spectra of peptides. The authors may want to discuss these applications.

      I am primarily interested in algorithmic and statistical problems in genomics and proteomics. We have developed deep learning models for predicting the full tandem mass spectrum of peptides, and I am interested in model interpretation methods to explain the fragmentation mechanisms resulting in non-conventional fragment ions in tandem mass spectra of peptides. I reviewed the paper in collaboration with my Ph.D. students, who are developing deep learning models for computational mass spectrometry.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank reviewers for helping us clarify our manuscript. Some key information was only in the Supporting Information document, and was not obvious to find. We have now introduced some of this information into the main text, and otherwise clarified to which specific sub-paragraph of the Supporting Information document we refer every time we mention it. Another aspect which we have clarified is the relevance of controls previously published in our paper PLOS Comp Biol 16: 1-23. These controls address many of the remarks raised by the reviewers, regarding for instance rhythm detection methods, detection threshold, the effect of normalization of time-series data in rhythm detection, the consideration of biological replicates in time-series data, or the relationship between rhythms and highly expressed genes. We have now introduced some of these results within the main text to clarify these points, or have specified to which specific result of our previous paper we refer.

      REVIEWER #1

      Major comments:

      They assumed the optimal constant level would be the maximum over the rhythm period when rhythmic regulation is absent. They also assumed the trade-off between the benefits of not producing proteins when they are not needed (costs saved) and the costs involved in making it rhythmic (costs of complexity), which they argued lead to the expectation that costlier genes be more frequently rhythmic. However, there was no explicit definition for the trade-off, so it is unclear how it leads to the expectation. [...]

      Second, the "costs of complexity" were not defined

      We have now clarified these points:

      Thus, a first evolutionary advantage given by rhythmic biological processes would be an optimization of the overall cost (over a 24-hour period), compared to a constant expression at a high level of proteins, when this high level is necessary **for fitness at least at some point of time.

      • Thus, a first evolutionary advantage given by rhythmic biological processes would be an optimization of the overall cost (over a 24-hour period), compared to the costs generated over the same period by optimizing a constant level of proteins. The reasonable assumption that the optimal constant level would be the maximum over the rhythm period strengthens the case for selection on expression cost.

      • Our results suggest that rhythmicity of protein expression has been favored by selection for cost control of gene expression, while keeping optimal expression levels. In the case of rhythmic genes, what would that optimal constant level be? We can propose two hypotheses. The first is that it would be the mean expression over the period, since this maintains the same overall amount of protein. The second is that it would be the maximum over the rhythm period, since that is the level needed at least at some point. The second hypothesis explains better the existence of this maximum level during the cycle. Of note, it also strengthens the case for selection on expression cost. Thus, for the case of rhythmic genes, the optimal constant level should at least correspond to the mean expression level (Fig 1d). We provide results obtained using both the maximum and the mean of expression in Fig. 2a. We have modified Fig. 1d accordingly, and specified in Supp Fig. S2 that the delta value was calculated from mean expression levels.

      We assume that the maximal expression level gives an estimation of the level that would be constantly maintained in the absence of rhythmic regulation

      • We assume that, in the absence of rhythmic regulation, the constant optimal level is included between the mean and the maximum expression level observed in rhythmic expression. Here, we studied the evolutionary costs and benefits that shape the rhythmic nature of gene expression at the RNA and protein levels. For this, we analysed characteristics we presume to be part of the trade-off.
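
      The difference between the two hypotheses for the constant level can be made concrete with a small numerical sketch. The cosine profile, the sampling interval, and the assumption that cost is proportional to the amount expressed are illustrative choices only and are not taken from the manuscript.

```python
import math


def expression_costs(levels):
    """Total cost of a rhythmic profile and of two constant alternatives.

    levels: expression level sampled at regular intervals over one period.
    Cost is assumed to be proportional to the amount expressed (arbitrary units).
    """
    n = len(levels)
    rhythmic = sum(levels)
    constant_at_max = max(levels) * n          # hypothesis: constant level = maximum
    constant_at_mean = (sum(levels) / n) * n   # hypothesis: constant level = mean
    return rhythmic, constant_at_max, constant_at_mean


# Hypothetical 24 h profile sampled every 2 h: baseline 10, amplitude 5.
profile = [10 + 5 * math.cos(2 * math.pi * t / 24) for t in range(0, 24, 2)]
rhythmic, at_max, at_mean = expression_costs(profile)

# Fraction of the cost saved by being rhythmic instead of constant at the maximum.
saving_vs_max = 1 - rhythmic / at_max
```

      With a constant level set at the mean, the total is unchanged by construction, so any cost saving from rhythmicity only appears when the constant level is placed above the mean, up to the maximum.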

      • Here, we studied the evolutionary costs and benefits that shape the rhythmic nature of gene expression at the RNA and protein levels. For this, we analysed characteristics we presume to be part of the trade-off determining the rhythmic nature of gene expression between its advantages (cost economy over 24h, non-ribosomal occupancy) and disadvantages (costs of complexity related to precise temporal regulation). The evolutionary** origin of maintaining large cyclic biological systems, in term of adaptability, can be seen as a trade-off between disadvantages such as cost or noise induced by the added complexity, and advantages such as economy over a daily time-scale, temporal organization, or adaptability.

      • Most rhythmic genes are tissue-specific (Zhang et al. 2014, Boyle et al. 2017), which means that their rhythmic regulation is not a general property of the gene and is therefore expected to be advantageous only in those tissues in which they are found rhythmic. This argues that rhythmic regulation has costs, since it is not general. These costs are **probably related to the complexity of regulation** to maintain precise temporal organisation. Thus, cyclic biological systems are expected to have adaptive origins.

        It would be more convincing to define a fitness function or cost function to demonstrate their argument that costlier genes have fitness advantages if they are rhythmic.

      Considering rhythmicity as an economy strategy is quite intuitive, and our results confirm what is currently accepted (Wang et al. 2015). We show and discuss to what extent this is true by comparing expression costs at different expression levels. Defining a fitness function more precisely in our case would require an experiment in which we could compare fitness between two populations (e.g. prokaryote growth rates): WT versus a strain in which the promoters of the costliest genes are controlled by non-cyclic transcription factors. We do not feel that this is a reasonable extension of this work, but rather a whole new research program.

      First, when proteins are not needed, it can be either the case of not producing extra proteins (cost saved) or the case of degrading excessive proteins (cost incurred). […]

      The cost function presented in this paper may be oversimplified. It only takes into account the costs to produce protein. The authors argued that a more complex cost calculation would not change the observation, but without proving it. However, protein degradation, including ubiquitination and proteolysis, requires energy; for a rhythmic gene, it is also necessary to consider the cost of maintaining the rhythmicity, including the temporally precise regulation of protein expression when the proteins are needed and of protein destruction when they are not.

      We have now clarified this in Section 4.1 of the Supporting Information document:

      Protein decay can be due to spontaneous decay of unstable molecules (no cost), cellular dilution (no cost), or active protein degradation, whose cost has been shown to be negligible. Costs of protein decay are small enough not to be opposed by selection. Indeed, Lynch and Marinov (2015) and Wagner (2005) have shown that “degradation in a lysosome may cost essentially nothing, and amino-acid export back to the cytoplasm consumes 1 ATP for every 3 to 4 amino acids”. Compared with the cost of producing a single nucleotide, which consumes 49 ~P, protein decay costs are negligible relative to transcriptional costs, which are themselves negligible relative to translational costs. This is all the more true given that amino acids from degradation are reused and do not need to be produced by the cell, which therefore economizes around 30 ~P per amino acid (~P: high-energy phosphate bonds).
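
      The orders of magnitude quoted above can be checked with a few lines of arithmetic. This is only a sketch: the 300-residue protein is hypothetical, and one ATP is assumed to count as one high-energy phosphate bond (~P).

```python
# Figures quoted in the passage above, in high-energy phosphate bonds (~P).
# Assumption: 1 ATP is counted as 1 ~P.
ATP_PER_EXPORTED_AA_RANGE = (1 / 4, 1 / 3)  # 1 ATP for every 3 to 4 amino acids
SAVED_PER_RECYCLED_AA = 30                  # ~P saved by reusing one amino acid


def degradation_budget(protein_length):
    """Export cost range and recycling saving for one degraded protein."""
    low, high = (rate * protein_length for rate in ATP_PER_EXPORTED_AA_RANGE)
    saved = SAVED_PER_RECYCLED_AA * protein_length
    return low, high, saved


# Hypothetical 300-residue protein.
low, high, saved = degradation_budget(300)

# Export cost as a fraction of the recycling saving, taking the costlier bound.
worst_case_ratio = high / saved
```

      Under these assumptions the export cost stays around one percent of what recycling the amino acids saves, whatever the protein length, since both terms scale linearly with it.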

      In Section 3 of the Supporting Information document, we also show why rhythmic and highly expressed proteins are costlier for the cell per time-unit than rhythmic and lowly expressed proteins, even considering decay costs or protein half-lives.

      Thus, the order of costs between genes is not expected to be affected by a more complex calculation accounting for protein decay and protein half-lives.

      We think these points should be in Supporting Information document since they are not novel. Lynch and Marinov as well as Wagner have studied and reported these points in detail in their work. We have replicated their results and have used them to understand rhythmicity, which is the focus of our manuscript.

      The authors claimed that cycling genes are enriched in highly expressed genes, by showing rhythmic proteins are costlier than non-rhythmic proteins (based on the expression cost function) in several species. However, only the first 15% of proteins based on p-values ranking from their rhythm detection algorithms were classified rhythmic. One potential artifact of this classification is that the identified rhythmic genes are biasedly highly expressed genes because the lower-amplitude genes are harder to detect and excluded by the algorithm. If changing the threshold for rhythmicity to include more rhythmic genes with intermediate p-values (p-value Since the results of this paper would be sensitive to the accuracy of identifying rhythmicity at both mRNA and protein levels, it is crucial to validate the rhythm detection algorithm by cross-checking algorithm-generated results with those known rhythmic genes. Can the authors estimate the false positive and false negative rates in each group of the rhythmic and non-rhythmic proteins or mRNAs identified by their algorithm?

      Our 2020 paper (https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007666) addresses these issues, but we did not make this sufficiently clear here. We have now added some details of our previous results to the main text to clarify this, as the remark points to a logical limitation. We mostly use GeneCycle based on the results of the benchmarking in that paper; it notably produces a uniform distribution under the null hypothesis and a skew towards low p-values for all empirical data.
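
      For readers unfamiliar with rank-based calling, the cut-off the reviewer refers to (the first 15% of genes by p-value) can be sketched as follows. The function name `call_rhythmic`, the gene identifiers, the p-values, and the rounding rule are all hypothetical. The p-values themselves would come from a rhythm detection method such as GeneCycle, which is not reimplemented here.

```python
def call_rhythmic(p_values, fraction=0.15):
    """Label the lowest-p-value fraction of genes as rhythmic.

    p_values: dict mapping a gene identifier to its rhythm-detection p-value.
    fraction: share of genes to call rhythmic. 0.15 mirrors the 15% cut-off
              quoted by the reviewer; the rounding rule is an assumption.
    """
    ranked = sorted(p_values, key=p_values.get)  # most significant first
    n_rhythmic = round(len(ranked) * fraction)
    return set(ranked[:n_rhythmic])


# Hypothetical p-values for 20 genes, evenly spread between 0.05 and 1.
p_values = {f"gene{i}": (i + 1) / 20 for i in range(20)}
rhythmic = call_rhythmic(p_values)
```

      A rank-based cut-off of this kind fixes the proportion of genes called rhythmic, which is why the shape of the p-value distribution under the null hypothesis matters for judging how many of those calls are likely to be false positives.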

      Furthermore, cycling genes have been shown to **be over-represented among highly expressed genes (Laloum & Robinson-Rechavi 2020, Wang et al. 2015).

      • Furthermore, we have shown in our previous work that rhythmic genes are largely enriched in highly expressed genes, and that the differences in rhythm detection obtained between highly and lowly expressed genes either reflect true biology or a lower signal to noise ratio in lowly expressed genes (Laloum & Robinson-Rechavi 2020).

        Higher gene expression usually leads to lower genetic noise. The authors thus applied a definition of the stochastic gene expression (SGE) that controls the biases associated with the correlation between the expression mean and variance to evaluate expression noise. They found lower noise with rhythmic transcripts. However, they did not explain, mechanistically, why rhythmic RNA has lower noise and what is the biological meaning behind this finding. It is also unclear whether they considered the phase difference between signal and noise that usually exists in an oscillatory system.

      Please see answer to second reviewer.

      Minor comments:

      It would be helpful if the authors could interpret their observations including where the results may not be as significant. A few examples are listed below.

      1) In tissue-specific studies, they used the transcriptomics datasets from 11 mouse tissues to compare the difference in expression levels (based on z-score) of each gene between tissue groups of rhythmic and non-rhythmic expression and found higher gene expression in rhythmic tissues. However, proteins showed a bimodal distribution, and it would be helpful to add interpretation or discussion regarding this bimodal distribution.

      Note that for proteins, the delta was calculated based on only 3 or 4 tissues, which greatly limits our detection power. We now propose the following hypothesis:

      • We also provide results obtained from other datasets in supplementary Table S3, although they must be taken with caution since only 2 to 4 tissues were available, and sometimes data were coming from different experiments. Of note, for proteomic data, the distributions of the delta values are bimodal (Fig. S3), separating rhythmic proteins into two groups, with low or high protein levels in the tissues in which they are rhythmic. **A hypothesis is that for some tissue-specific proteins the rhythmic regulation is not tissue-specific, making them rhythmic also in tissues where they are lowly expressed. But the very small sample size does not allow us to test it, and we caution against any over-interpretation of this pattern before it can be confirmed.

        2) They also calculate partial correlation for rhythmicity with expression level over tissues for all tissue-specific genes (tau>0.5) and found Spearman's correlation coefficient is skewed towards negative (suggesting a correlation), but Pearson's correlation showed a positive peak. It indicates that a subset of genes is less rhythmic in the tissues where they are most expressed. Is this positive peak significant or expected? What are these genes? Any evolutionary benefits? Can the authors discuss the functional difference between these genes and other genes that follow the predictions?

      While Spearman’s correlation is clearly skewed towards negative correlations, i.e. lower p-values thus stronger signal of rhythmicity in the tissue where genes are more expressed, Pearson’s correlation also has a smaller peak of positive correlations (Fig. S4), suggesting a subset of genes which are less rhythmic in the tissues where they are most expressed.

      • While Spearman’s correlation is clearly skewed towards negative correlations, i.e. lower p-values thus stronger signal of rhythmicity in the tissue where genes are more expressed, Pearson’s correlation also has a smaller peak of positive correlations (Fig. S4a), suggesting a subset of genes which are less rhythmic in the tissues where they are most expressed. We show that tissue-specific genes which are mostly rhythmic in tissues where they are highly expressed are under stronger selective constraint than those which are rhythmic in tissues where they are lowly expressed (Fig. S4b). Thus, rhythmic expression of this second set of genes might be under weaker constraints.

      We added Fig. S4b in Supplementary figures.

      3) In SGE analysis, the scRNA data of Arabidopsis was from roots, while the data for detecting the rhythmicity was from leaves. Without knowing whether the gene expression patterns in these two different parts are comparable, it is hard to judge the results. The authors may want to provide some discussion.

      Indeed, this limits the interpretation for Arabidopsis, as noted in the results and in the discussion. We still prefer to report this pattern rather than remove it, but we have now moved the results obtained for Arabidopsis into Supplementary Table S5.

      • In Arabidopsis, the single-cell data used are from the root, while transcriptomic time-series data used to detect rhythmicity are from the leaves, which limits the interpretation. Despite this limitation, we found no evidence of lower noise for genes that are rhythmic at the protein level (Table 1b and 1e, and Supplementary Table S5), and trends towards lower noise in almost all cases for genes with rhythmic mRNAs (Table 1a, 1c, and 1d).
      • Our results in mouse are consistent with all of these considerations (Table 1 and Supplementary Table S5), although it was not fully the case for Arabidopsis (Supplementary Table S5). However, this last point might be explained by the tissue-specificity of rhythmic gene expression. Indeed, for Arabidopsis, the time-series dataset comes from leaves whereas the single-cell RNA data come from roots.

        For Mouse tissues, while most show lower noise for rhythmic genes, they saw the opposite in Muscle. Is this significant? Any discussion?

      For mouse muscle, we had not mentioned it since it was the only tissue showing such a trend. We have now added a comment on this in the main text:

      • In mouse, muscle tissue gave the opposite result, possibly because skeletal muscle is one of the least rhythmic tissues in the body.

        In various places of the text, the authors only pointed the readers to "Supporting information" without explicitly referring to a specific supplemental figure by its number. It would be helpful to cite a table or figure explicitly.

      We agree, and have corrected this. See first General Statements.

      Figure 2 does not have legends in the graphs.

      This is now corrected, thank you for your attention.

      REVIEWER #2

      Major comments:

      • Our major concern regards the identification of rhythmic genes.

      Although we are not experts in the specific method used (details are not provided in the manuscript), a method looking for a statistically significant periodicity in a noisy signal will provide a low p-value for a signal sufficiently above the noise level. Gene expression data are noisy because of stochastic gene expression and technical noise (e.g., the sampling noise due to RNA capture in RNAseq data). This noise scales with the average level of expression. Lowly expressed genes generally display larger relative fluctuations (e.g., sampling noise is essentially Poisson-like). As a result, the method will identify with a higher probability genes that are highly expressed as rhythmic genes since the signal to noise ratio is generally higher.

      This could significantly bias the subsequent analysis, since most of the claims are related to a link between expression levels and rhythmicity.

      [There is not even an obvious separation of timescales that can be invoked between a possible 24-hour periodic signal and the fluctuations. For example, the timescale of protein fluctuations can be largely set by dilution and thus have a timescale comparable to the cell cycle.]

      The authors should discuss this issue, which is overlooked in the current manuscript.

      How much this potential bias affects the selection of rhythmic genes can probably be assessed using synthetic data.

      • It would be useful to clarify in the main text what are the units of measurement of gene expression at the mRNA and at the protein level. If we understood correctly, the authors used FPKM and protein counts respectively. The dynamics in time could in principle be different if an absolute or a normalized level of expression is considered. For example, the cell cycle can be correlated with the circadian clock (as reported for example in cyanobacteria). Since the absolute amount of total proteins has to approximately double during a cell cycle (for cell size homeostasis), this can create a periodic signal in protein counts with a 24-hour period.

      The same reasoning does not hold true if the measurement is normalized, as in the FPKM case.

      The authors should discuss this issue or simply show that the results for proteins are robust if the protein count is normalized (for example with respect to the total protein amount).

      We haven’t focused the present manuscript on these issues since we recently published another paper which addresses these points: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007666

      We have now added some details of our previous results in the main text to make the work more relevant.
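
      The expression-dependent detection bias raised in the first comment above can be probed with synthetic data, as the reviewer suggests. The following is a minimal sketch under stated assumptions, not the detection method used in the manuscript: it simulates Poisson-distributed counts with the same relative 24-hour amplitude at two mean expression levels and tests each series with a simple harmonic (cosinor) regression F-test. The function names, the 2-hour sampling over 48 hours, the relative amplitude of 0.2, and the significance threshold are all hypothetical choices made for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Knuth's Poisson sampler; adequate for the moderate means used here."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def cosinor_p_value(values, times, period=24.0):
    """F-test p-value for a single-harmonic fit.

    Assumes evenly spaced samples covering whole periods, so the cosine,
    sine and constant regressors are mutually orthogonal.
    """
    n = len(values)
    mean = sum(values) / n
    a = 2.0 / n * sum(v * math.cos(2 * math.pi * t / period) for v, t in zip(values, times))
    b = 2.0 / n * sum(v * math.sin(2 * math.pi * t / period) for v, t in zip(values, times))
    total_ss = sum((v - mean) ** 2 for v in values)
    model_ss = n / 2.0 * (a * a + b * b)
    resid_ss = max(total_ss - model_ss, 1e-12)
    dof = n - 3  # residual degrees of freedom (mean, cosine, sine fitted)
    f_stat = (model_ss / 2.0) / (resid_ss / dof)
    # Closed-form survival function of the F(2, dof) distribution.
    return (1.0 + 2.0 * f_stat / dof) ** (-dof / 2.0)


def detection_rate(mean_level, rel_amplitude=0.2, n_genes=200, alpha=0.01, seed=0):
    """Fraction of truly rhythmic synthetic genes called rhythmic."""
    rng = random.Random(seed)
    times = [2.0 * i for i in range(24)]  # every 2 h over 48 h
    hits = 0
    for _ in range(n_genes):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        counts = [
            poisson(mean_level * (1.0 + rel_amplitude * math.cos(2 * math.pi * t / 24.0 - phase)), rng)
            for t in times
        ]
        if cosinor_p_value(counts, times) < alpha:
            hits += 1
    return hits / n_genes


if __name__ == "__main__":
    for level in (5.0, 50.0, 500.0):
        print(level, detection_rate(level))
```

      Because every simulated gene is rhythmic with the same relative amplitude, any difference in detection rate between expression levels reflects the sampling noise alone, which is the effect the reviewer asks the authors to quantify.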

      • The expression cost defined in the manuscript seems dominated by the expression level.

      It would be useful to report the scatter plot and the correlation level of cost versus average expression. A high correlation between these two quantities can largely recapitulate the results in Figure 2 (even though the results presented are still interesting per se). In other words, the relation between cost and rhythmicity sounds like a simple rephrasing of the relation between average expression level and rhythmicity (previously reported as correctly referenced in the manuscript).

      We now provide these results in Fig. S2 (Supplementary figures) and show a negative and significant correlation between the order of the rhythmicity signal and the total expression cost (calculated from the mean expression level). Since our previous benchmark shows that the ordering of genes from most to least rhythmic is not very reliable for known methods, including the one used here, we prefer to present this result in the Supplementary figures document.

      • The empirical observation of a relation between noise and rhythmicity in mRNA expression is interesting, but we cannot fully understand its link with the theoretical arguments proposed.

      The Authors suggest that periodicity in mRNA expression could decrease protein noise at the peak of mRNA expression (Fig.S1). But this is not what they can measure in the single-cell data analyzed, where cell-to-cell variability is reported at a single timepoint for a cell population. If the oscillations are not synchronized in the cell population, an oscillating transcript would simply display a high cell-to-cell variability dominated by the amplitude of oscillations. Even if the oscillations are synchronized, there is no information in the dataset about the mRNA dynamics. Thus, mRNA cell-to-cell variability could have been measured at any point of its (putative) cyclic dynamics.

      We therefore suggest clarifying the connections between the theoretical arguments and the empirical observation about noise in gene expression.

      Thank you for pointing out this issue. We have clarified the following in the main text:

      These considerations lead to predictions which we test here: i) a decreased stochasticity strategy for genes with rhythmically accumulated mRNAs ....

      • These considerations lead to predictions which we test here: i) a strategy to periodically decrease stochasticity for genes with rhythmically accumulated mRNAs .... Assuming that genes with low noise have noise-sensitive functions (and thus noise is tightly controlled), these results support the hypothesis that noise is globally reduced thanks to rhythmic regulation at the transcriptional level.

      • Our results show that noise is globally reduced for genes with rhythmic regulation at the transcriptional level. Since rhythmic genes are not all in the same phase (Fig. S9a in Supporting information), we expect this result obtained for a given time-point (noise estimation based on a single time-point scRNA dataset) to be general to all time-points (section 6.3 in Supporting information). Assuming that genes with low noise have noise-sensitive functions (and thus noise is tightly controlled), these results suggest that rhythmic genes have their noise periodically and drastically reduced through periodic high accumulation of their mRNAs.

      • Thus, since we find lower noise among rhythmic transcripts, rhythmic expression of RNAs might be a way to periodically reduce expression noise of highly expressed genes (Figure 2 and Fig. S1-S2), which are under stronger selection. Indeed, we found that genes with rhythmic transcripts are under stronger selection, even controlling for expression level effect. As proposed by Horvath et al. (2019) and supported by results in mouse by Barroso et al. (2018), genes under strong selection could also be less tolerant to high noise of expression. Thus, periodic accumulation of mRNAs might be a way to periodically reduce expression noise of noise-sensitive genes (Fig 1c), i.e. genes under stronger selection. However, our results are limited by the fact that noise estimation is based on a single time-point measurement since no scRNA time-series data are currently available for these species. Since the peak time of rhythmic transcripts is distributed across all times (Supporting Information Fig. S9a), the mean noise estimated at a given time-point includes the noise of the genes that are peaking at that time (lowest noise) and all the others that have a higher noise than those at their own peak time-point (Supporting Information Fig. S9b). Our results suggest that rhythmic genes peaking at the time-point of the scRNA measurement have sufficiently low noise for the mean noise of rhythmic genes to be much lower than that of non-rhythmic genes.
        • As a simple additional test of robustness of the rhythmic gene selection, biological replicates can be used, although this would not resolve the possible bias discussed above. As explained by the Authors, some of the datasets analyzed have biological replicates. It would be interesting to know the robustness of the detection method across replicates. How much is the set of genes identified as rhythmic conserved if estimated on different replicates? Spearman correlation or simply the overlap between the sets (maybe assessed with a hypergeometric test) can be used.

      These points have already been addressed in our 2020 paper https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007666 (paragraph “The importance of having an informative dataset”) as well as in recent guidelines (Hughes et al. 2017). We specified in Methods that we considered replicates as new cycles, as recommended.
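
      The replicate-overlap check suggested by the reviewer (overlap between rhythmic gene sets, assessed with a hypergeometric test) can be sketched as follows. This is an illustrative example only: the function names and the gene counts in the usage lines are hypothetical and are not taken from the manuscript or its datasets.

```python
from math import comb


def overlap_p_value(n_total, n_a, n_b, n_overlap):
    """Upper-tail hypergeometric p-value.

    Probability of observing at least `n_overlap` shared genes when sets of
    size `n_a` and `n_b` are drawn independently from `n_total` genes.
    """
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_total, n_b)


def jaccard(set_a, set_b):
    """Overlap between two gene sets as a fraction of their union."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Hypothetical numbers: 1000 tested genes, 150 called rhythmic in each
    # replicate, 60 of them shared between the two replicates.
    print(overlap_p_value(1000, 150, 150, 60))
    print(jaccard({"g1", "g2", "g3"}, {"g2", "g3", "g4"}))
```

      A small p-value indicates that the two replicates share more rhythmic genes than expected by chance; as the reviewer notes, such agreement would not by itself rule out a shared expression-level bias.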

      Minor comments:

      • The claim that "transcriptional noise is known to be the main driver of overall expression noise", which is present in the discussion is questionable.

      For example, the quantitative large-scale dataset referenced by the Authors for E.coli (Taniguchi et al) shows instead that the dominant source of noise is extrinsic for many of the genes tested.

      We have clarified in the main text that by “main driver of the overall noise” we refer to the relative contribution of transcriptional versus translational noise into the overall noise.

      We have also added section 6.1 to the Supporting Information document:

      • Relative to translational noise, transcriptional noise is the main driver of the overall noise (Raj and van Oudenaarden 2008) and should give a good estimation of the output noise. Indeed, based on estimations of coefficients of variation (CV, cell-to-cell variation in protein level) for diverse transcription and translation rates in E. coli and S. cerevisiae, Hausser et al. (2019) have shown that for a fixed transcriptional rate, CV is almost constant for diverse translational rates. Thus, changes in protein level have little to no impact on gene expression noise. The availability of mRNA molecules seems to drive the final noise. That is, compared with the noise caused by translational activity, the availability of low-copy-number molecules such as transcription factors (subject to the stochasticity of diffusion and binding in the cell environment) is the main factor in the output cell-to-cell variation in protein abundances.

      We have also modified the main text:

      Indeed, transcriptional noise, which we measure here, is known to be the main driver of overall expression noise (Raj & van Oudenaarden 2008).

      • Relative to translational noise, transcriptional noise is the main source of the overall noise (Raj & van Oudenaarden 2008) (section 6.1 in Supporting information). In addition, highly expressed proteins are all precisely expressed and they display little variation in noise (also shown by Hausser et al. (2019) who reused Taniguchi et al. (2010) data). The noise of these highly expressed proteins is also just above a limit which is the noise floor. This "noise floor" is dominated by extrinsic noise as suggested by Hausser et al. and Taniguchi et al.: “The extrinsic noise in the last three terms in Eq. 4 (of the noise floor) might originate from fluctuations in cellular components such as metabolites, ribosomes, and polymerases and dominates the noise of high copy proteins” (Taniguchi et al.). Thus, highly expressed proteins are precisely expressed and their residual noise is similar to the noise floor, which is due to the extrinsic noise (imperfect synchrony of cell states inherent or due to the environment).
      • We suggest avoiding explicit statements about a causal link between expression level and rhythmicity, as in the caption title of Figure 2. A detected correlation is not a proof of a causal relation.

      We have corrected the sentence as follows:

      Rhythmic proteins are costly proteins due to their high level of expression.

      • High level of expression is the main factor explaining the higher cost observed in rhythmic proteins.
        • Supplementary Figures attached at the end of the main text and Supplementary Figures in the Supporting Information file have the same numbering...so there are two different versions of Fig.S1 S2 etc.

      This complicates the work of the reader.

      We have modified the numbering of figures to make them easier to follow.

      -The legend of Fig 2 is missing (the legend is instead reported in Fig.S1).

      This is now corrected, thank you for your attention.

      Other modifications:

      We also show how cost can explain the tissue-specificity of rhythmic gene expression. Indeed, the nycthemeral transcriptome has long been known to be tissue-specific (Zhang et al. 2014, Boyle et al. 2017, Korenčič et al. 2014), i.e. a given gene can be rhythmic in some tissues, and constantly or not expressed in others.

      • Furthermore, the nycthemeral transcriptome has long been known to be tissue-specific (Zhang et al. 2014, Boyle et al. 2017, Korenčič et al. 2014), i.e. a given gene can be rhythmic in some tissues, and constantly or not expressed in others. Here, we provide a first explanation for the tissue-specificity of rhythms in gene expression by showing that genes are more likely to be rhythmic in tissues where they are specifically highly expressed.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The manuscript proposes an interesting hypothesis to explain the widespread presence of rhythmicity in gene expression. The Authors suggest that rhythmicity can be the combined result of cost optimization and control of gene expression noise. To support this hypothesis, they analyzed several proteomic and RNA sequencing datasets across different species. Specifically, putative rhythmic genes were identified using a published tool from time-series datasets. Their first claim concerns the typical expression cost (Cp) for rhythmic vs. non-rhythmic genes. The evaluated Cp is empirically (slightly but significantly) higher for rhythmic genes, mainly because these genes on average show higher expression levels than non-rhythmic genes. The analysis of tissue-specific expression data further supports this relation between expression levels and rhythmicity. Genes are more likely to be rhythmic in tissues where they are specifically highly expressed. To investigate the additional hypothesis of a relation with noise control, the Authors compared expression fluctuations of rhythmic and non-rhythmic genes, measuring noise only at the mRNA level (and using a specific noise measure). According to this measure, genes displaying rhythmicity, in particular at the transcript level, are indeed in most cases less noisy than non-rhythmic genes. Finally, the analysis of protein evolutionary conservation between rhythmic and non-rhythmic genes suggests that genes with rhythmic transcription are under strong purifying selection.

      The paper is concise and well written. The data used are described in sufficient detail to reproduce the results.

      Major comments:

      • Our major concern regards the identification of rhythmic genes.

      Although we are not experts in the specific method used (details are not provided in the manuscript), a method looking for a statistically significant periodicity in a noisy signal will provide a low p-value for a signal sufficiently above the noise level. Gene expression data are noisy because of stochastic gene expression and technical noise (e.g., the sampling noise due to RNA capture in RNAseq data). This noise scales with the average level of expression. Lowly expressed genes generally display larger relative fluctuations (e.g., sampling noise is essentially Poisson-like). As a result, the method will identify with a higher probability genes that are highly expressed as rhythmic genes since the signal to noise ratio is generally higher. This could significantly bias the subsequent analysis, since most of the claims are related to a link between expression levels and rhythmicity. [There is not even an obvious separation of timescales that can be invoked between a possible 24-hour periodic signal and the fluctuations. For example, the timescale of protein fluctuations can be largely set by dilution and thus have a timescale comparable to the cell cycle.]

      The authors should discuss this issue, which is overlooked in the current manuscript. How much this potential bias affects the selection of rhythmic genes can probably be assessed using synthetic data.

      • It would be useful to clarify in the main text what are the units of measurement of gene expression at the mRNA and at the protein level. If we understood correctly, the authors used FPKM and protein counts respectively. The dynamics in time could in principle be different if an absolute or a normalized level of expression is considered. For example, the cell cycle can be correlated with the circadian clock (as reported for example in cyanobacteria). Since the absolute amount of total proteins has to approximately double during a cell cycle (for cell size homeostasis), this can create a periodic signal in protein counts with a 24-hour period.

      The same reasoning does not hold true if the measurement is normalized, as in the FPKM case. The authors should discuss this issue or simply show that the results for proteins are robust if the protein count is normalized (for example with respect to the total protein amount).

      • The expression cost defined in the manuscript seems dominated by the expression level. It would be useful to report the scatter plot and the correlation level of cost versus average expression. A high correlation between these two quantities can largely recapitulate the results in Figure 2 (even though the results presented are still interesting per se). In other words, the relation between cost and rhythmicity sounds like a simple rephrasing of the relation between average expression level and rhythmicity (previously reported as correctly referenced in the manuscript).
      • The empirical observation of a relation between noise and rhythmicity in mRNA expression is interesting, but we cannot fully understand its link with the theoretical arguments proposed. The Authors suggest that periodicity in mRNA expression could decrease protein noise at the peak of mRNA expression (Fig.S1). But this is not what they can measure in the single-cell data analyzed, where cell-to-cell variability is reported at a single timepoint for a cell population. If the oscillations are not synchronized in the cell population, an oscillating transcript would simply display a high cell-to-cell variability dominated by the amplitude of oscillations. Even if the oscillations are synchronized, there is no information in the dataset about the mRNA dynamics. Thus, mRNA cell-to-cell variability could have been measured at any point of its (putative) cyclic dynamics. We therefore suggest clarifying the connections between the theoretical arguments and the empirical observation about noise in gene expression.
      • As a simple additional test of robustness of the rhythmic gene selection, biological replicates can be used, although this would not resolve the possible bias discussed above. As explained by the Authors, some of the datasets analyzed have biological replicates. It would be interesting to know the robustness of the detection method across replicates. How much is the set of genes identified as rhythmic conserved if estimated on different replicates? Spearman correlation or simply the overlap between the sets (maybe assessed with a hypergeometric test) can be used.

      Minor comments:

      • The claim that "transcriptional noise is known to be the main driver of overall expression noise", which is present in the discussion is questionable. For example, the quantitative large-scale dataset referenced by the Authors for E.coli (Taniguchi et al) shows instead that the dominant source of noise is extrinsic for many of the genes tested.
      • We suggest avoiding explicit statements about a causal link between expression level and rhythmicity, as in the caption title of Figure 2. A detected correlation is not a proof of a causal relation.
      • Supplementary Figures attached at the end of the main text and Supplementary Figures in the Supporting Information file have the same numbering...so there are two different versions of Fig.S1 S2 etc. This complicates the work of the reader. -The legend of Fig 2 is missing (the legend is instead reported in Fig.S1).

      Significance

      The hypothesis of a link between rhythmic expression, expression cost and noise control is intriguing and can be of interest for a large audience of scientists from computational and evolutionary biologists to interdisciplinary researchers interested in models of gene expression.

      Our combined expertise (keywords):

      Physical biology, mathematical modelling, stochastic gene expression, transcriptomic data, quantitative cell physiology, genomics.

      Referee Cross-commenting

      The other report looks fair to me too. We seem to agree on the relevance of the questions asked, but also on some major concerns about the methods used to support the conclusions. Thanks!

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      This paper explored the evolutionary advantages of having nycthemeral rhythmicity in many genes, using genome-wide transcriptomics and proteomics datasets from bacteria, plants, animals, and specific mouse tissues. As the main findings of this paper, the authors first applied a cost function with the proteomics data in four species and showed that rhythmic proteins are costlier. They also evaluated the stochastic gene expression (SGE) using single-cell RNA (scRNA) data from several plant and animal species and found that genes with rhythmic mRNAs had lower noise than non-rhythmic mRNAs. They argued that rhythmic genes are evolutionarily selected because of the cost-saving advantage at the protein level and the noise control strategy at the mRNA level.

      In addition to their main findings, the authors also compared the protein evolutionary conservation between rhythmic and non-rhythmic genes using dN/dS data (the ratio of non-synonymous to synonymous substitutions). They found that genes with rhythmic transcripts were more conserved even after controlling for the effect of gene expression and suggested that rhythmic transcripts are important for genes under strong purifying selection.

      Major comments:

      The finding that rhythmic genes are costlier does not convincingly lead to the conclusion that protein rhythmicity has a cost-saving advantage. To make sense of this conclusion, the authors made several assumptions that lack convincing support. They assumed the optimal constant level would be the maximum over the rhythm period when rhythmic regulation is absent. They also assumed the trade-off between the benefits of not producing proteins when they are not needed (costs saved) and the costs involved in making it rhythmic (costs of complexity), which they argued lead to the expectation that costlier genes be more frequently rhythmic. However, there was no explicit definition for the trade-off, so it is unclear how it leads to the expectation. First, when proteins are not needed, it can be either the case of not producing extra proteins (cost saved) or the case of degrading excessive proteins (cost incurred). Second, the "costs of complexity" were not defined. It would be more convincing to define a fitness function or cost function to demonstrate their argument that costlier genes have fitness advantages if they are rhythmic. The cost function presented in this paper may be oversimplified. It only takes into account the costs to produce protein. The authors argued that a more complex cost calculation would not change the observation, but without proving it. However, protein degradation, including ubiquitination and proteolysis, requires energy; for a rhythmic gene, it is also necessary to consider the cost of maintaining the rhythmicity, including the temporally precise regulation of protein expression when the proteins are needed and of protein destruction when they are not.

      The authors claimed that cycling genes are enriched in highly expressed genes, by showing rhythmic proteins are costlier than non-rhythmic proteins (based on the expression cost function) in several species. However, only the top 15% of proteins, ranked by p-value from their rhythm detection algorithm, were classified as rhythmic. One potential artifact of this classification is that the identified rhythmic genes are biased towards highly expressed genes because the lower-amplitude genes are harder to detect and are excluded by the algorithm. If the threshold for rhythmicity is changed to include more rhythmic genes with intermediate p-values (p-value<=0.05), will this change the results? Since the results of this paper would be sensitive to the accuracy of identifying rhythmicity at both mRNA and protein levels, it is crucial to validate the rhythm detection algorithm by cross-checking algorithm-generated results with those known rhythmic genes. Can the authors estimate the false positive and false negative rates in each group of the rhythmic and non-rhythmic proteins or mRNAs identified by their algorithm?

      Higher gene expression usually leads to lower genetic noise. The authors thus applied a definition of the stochastic gene expression (SGE) that controls the biases associated with the correlation between the expression mean and variance to evaluate expression noise. They found lower noise with rhythmic transcripts. However, they did not explain, mechanistically, why rhythmic RNA has lower noise and what is the biological meaning behind this finding. It is also unclear whether they considered the phase difference between signal and noise that usually exists in an oscillatory system.

      Minor comments:

      It would be helpful if the authors could interpret their observations including where the results may not be as significant. A few examples are listed below.

      1) In tissue-specific studies, they used the transcriptomics datasets from 11 mouse tissues to compare the difference in expression levels (based on z-score) of each gene between tissue groups of rhythmic and non-rhythmic expression and found higher gene expression in rhythmic tissues. However, proteins showed a bimodal distribution, and it would be helpful to add interpretation or discussion regarding this bimodal distribution.

      2) They also calculate partial correlation for rhythmicity with expression level over tissues for all tissue-specific genes (tau>0.5) and found Spearman's correlation coefficient is skewed towards negative (suggesting a correlation), but Pearson's correlation showed a positive peak. It indicates that a subset of genes is less rhythmic in the tissues where they are most expressed. Is this positive peak significant or expected? What are these genes? Any evolutionary benefits? Can the authors discuss the functional difference between these genes and other genes that follow the predictions?

      3) In SGE analysis, the scRNA data of Arabidopsis was from roots, while the data for detecting the rhythmicity was from leaves. Without knowing whether the gene expression patterns in these two different parts are comparable, it is hard to judge the results. The authors may want to provide some discussion. For Mouse tissues, while most show lower noise for rhythmic genes, they saw the opposite in Muscle. Is this significant? Any discussion?
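
      For readers unfamiliar with the tissue-specificity threshold mentioned in point 2 (tau>0.5), the sketch below computes a tau index for one gene. It assumes the commonly used definition of Yanai et al. (2005), in which tau ranges from 0 (uniform expression) to 1 (expression in a single tissue); the report does not state which definition the manuscript uses, and the function name and example values are hypothetical.

```python
def tau(expression):
    """Tissue-specificity index for one gene across n >= 2 tissues.

    Assumes the Yanai et al. (2005) definition:
    tau = sum(1 - x_i / x_max) / (n - 1)
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs expression values from at least two tissues")
    peak = max(expression)
    if peak <= 0:
        return 0.0  # gene not expressed anywhere; treated here as non-specific
    return sum(1.0 - x / peak for x in expression) / (n - 1)


if __name__ == "__main__":
    # Hypothetical expression values for one gene in four tissues.
    print(tau([12.0, 3.0, 0.5, 0.0]))
```

      Under this definition, a gene expressed at half its peak level in every other tissue has tau = 0.5, which gives an intuition for the cut-off quoted in the report.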

      In various places of the text, the authors only pointed the readers to "Supporting information" without explicitly referring to a specific supplemental figure by its number. It would be helpful to cite a table or figure explicitly. Figure 2 does not have legends in the graphs.

      Significance

      The paper attempts to understand the origins of why many genes display nycthemeral rhythmicities. The question, if addressed, would have a significant impact in the fields of computational systems biology and evolutionary biology. But the findings of this study do not provide a satisfying answer to the question, thus reducing the significance. The conclusions are too overarching without providing significant biological insights and interpretation. Our field of expertise is in systems biology, but we do not have sufficient expertise to evaluate computational tools used to classify genome-wide gene expression data.

      Referee Cross-commenting

      I have reviewed other reports, which look fair to me. I have no comments. Thanks!

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      First of all, I sincerely appreciate the critical reading of our manuscript by the reviewers.

      Point-by-point responses to reviewer #1’s comments

      Most of the key conclusions are valid but the main one should be either reinforced or tuned down.

      Through our study, we want to indicate that MTCL2 preferentially associates with perinuclear MTs accumulated around the Golgi complex, and its target is not necessarily restricted to “Golgi-associated (nucleated) MTs.” In this sense, the sentences in the previous manuscript, such as “MTCL2 preferentially associates with Golgi-associated MTs” and “MTCL1 and 2 …. are specifically condensed on Golgi-associated MTs,” were overstatements and completely misleading.

      According to reviewer#1’s comment, we carefully revised these sentences throughout the manuscript and eliminated ambiguity on this point as far as possible.

      The corresponding revisions are as follows.

      In particular, the authors tend to give central role to MTCL2 in regulating the formation and organization of Golgi-associated MT network, and conversely in organizing Golgi elements, without considering the other factors identified (the authors cite relevant papers though but do not discuss this). They should analyze the function of MTCL2 in relation to the role of CLASP2, AKAP450, Golgi-g-Tubulin, or even EB proteins (like EB3).

      I agree with the above comment since it is important to analyze how MTCL2 preferentially associates with the perinuclear MTs accumulated around the Golgi complex.

      In the revised manuscript, we included new data analyzing knockdown effects of CLASP1/2 and AKAP450 on the subcellular localization of MTCL2 (Fig. 7A). These data indicate that CLASPs but not AKAP450 are required for the preferential localization of MTCL2 to the perinuclear MTs around the Golgi. We also demonstrate that the minimum Golgi-localizing region of MTCL2 (the N-terminal coiled-coil region) physically associates with CLASP2 (Fig. 7B), further supporting the idea that CLASPs mediate the Golgi association of MTCL2. Additional involvement of another Golgi element, giantin, is also suggested through Fig. 7C and Appendix Fig. S6. We believe that these revisions substantially address the weakness previously pointed out by the reviewer.

      I also do not think that carrying out super resolution microscopy is enough to "reveal the possibility that MTCL2 mediates the association of the Golgi membrane with stabilized MTs". More generally, the authors cannot conclude that MTCL2 preferentially associates with Golgi-MT only from their immunofluorescence and KD experiments. The centrosome (the main MTOC) is indeed also localized in the perinuclear area. Easy-to-do additional experiments may help to confirm these conclusions (see below). Also, the authors could strengthen the way they study how MTCL1 and MTCL2 bind to microtubules and Golgi (see below). The localization or interaction of MTCL2 with Golgi-associated MT is not directly shown.

      Previously, we demonstrated that the N-terminal region of MTCL2 shows clear Golgi-localization activity, whereas the C-terminal region directly binds to MTs. These data support our conclusion that MTCL2 mediates the association of the Golgi membrane with general MTs (although not with Golgi-associated or stabilized MTs).

      In the revised manuscript, we reinforced these data by newly revealing that four point mutations (4LA) in the first coiled-coil motif disrupt the Golgi localization of the N-terminal region of MTCL2 (Fig. 4D). Thereafter, we found that introduction of the same mutations in full-length MTCL2 abolished its preferential association with the perinuclear MTs accumulating around the Golgi without affecting its localization to MTs (Fig. 4E and F). In addition, we provide data on candidate molecules mediating the Golgi association of MTCL2, as stated above (Fig. 7). These results reinforce our immunofluorescence analysis results (Fig. 2) and indicate that the preferential association of MTCL2 with perinuclear MTs accumulating around the Golgi is facilitated by physical interactions between the N-terminal region of MTCL2 and the Golgi-resident proteins, such as CLASPs and giantin.

      The title should be changed also. I am not sure I understand what an asymmetric microtubule network means in this context. I guess that the authors mean non-centrosomal microtubule network.

      We acknowledge the confusion caused by our previous manuscript. By “an asymmetric MT network,” we did not mean something equivalent to a “non-centrosomal MT network.”

      In many cases, microtubules do not elongate radially (symmetrically) from the centrosome but intensely accumulate around the Golgi area and show asymmetric organization (see Meiring et al. Curr. Opin. Cell Biol. 62: 86-95, 2020). “An asymmetric MT network” in the title corresponds to this asymmetric array of general MTs accumulating around the GA.

      The present findings that MTCL2 depletion severely disrupted MT accumulation around the Golgi and induced random and rather symmetric arrays of MTs (Fig. 5A) are very impressive. We believe that the knockdown/rescue experiments in this study strongly support the title by demonstrating that MTCL2 facilitates MT accumulation around the Golgi through its dual binding activity to MTs and the Golgi membrane.

      We changed the title in the revised manuscript but still used the term “asymmetric microtubule organization” based on this rationale.

      The authors also state that tubulin acetylation is induced by MTCL1 C-MTBD but it may simply be stabilized. They should also clarify if MTCL2 regulates Golgi-dependent microtubule nucleation.

      Yes, we think that MTCL1 C-MTBD enhances tubulin acetylation by simply stabilizing the polymerization state of MTs (see Kader et al. PLoS One 12: e0182641, 2017). As for the second point, please see our response below to the comment (7).

      I was not convinced by the use of the quantification of "skewness", in particular in figure 5B. Whether a Wilcoxon test is adequate is unclear to me.

      I understand that skewness, a measure of the asymmetry of a distribution, has not been widely used in previous studies. In fact, the skewness of the tubulin signal distribution in pixels does not by itself indicate in what way MTs are distributed asymmetrically. However, quantification of this statistical parameter does not require any arbitrary factors and thus minimizes the room for subjective judgment. Therefore, we are confident that this is the best way to estimate, without any preconception, the asymmetric organization of microtubules, which is severely affected by various conditions.
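
      As an illustration of the parameter discussed here, the following is a minimal sketch of sample skewness computed over pixel intensities. It assumes the standard moment-based definition (third central moment divided by the 1.5 power of the second central moment); the manuscript's exact implementation is not described in this reply, and the function name and toy pixel values are hypothetical.

```python
def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5.

    Positive when a few values lie far above the bulk, zero for a symmetric sample.
    """
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    if m2 == 0:
        return 0.0  # constant signal; skewness is undefined, reported as 0 here
    return m3 / m2 ** 1.5


# Hypothetical pixel intensities from two cells.
# A few very bright pixels on a dim background (signal concentrated in one area):
concentrated = [1, 1, 1, 1, 1, 1, 1, 1, 9, 9]
# Intensities spread symmetrically around their mean:
symmetric = [1, 2, 3, 4, 5]

skew_concentrated = skewness(concentrated)
skew_symmetric = skewness(symmetric)
```

      The computation takes only the list of pixel values as input, with no threshold or region of interest to choose, which is the property the argument above relies on.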

      The two biological phenomena we attempted to elucidate here (microtubule arrays and Golgi ribbon expansion) are thought to be context-dependent in each cell (for example, cell cycle, cell densities, etc.). Therefore, we do not have any substantial reason to assume a normal distribution for variation of the two values (skewness of tubulin signal distribution and Golgi ribbon expansion angle) in our cell population. For this reason, we considered that the Wilcoxon test, being a non-parametric rank test, was the most appropriate and safest test to use.
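
      For readers unfamiliar with rank tests, a minimal sketch follows. The reply does not specify which Wilcoxon variant was used; this sketch assumes the two-sample rank-sum form (equivalent to the Mann-Whitney U test), which suits independent groups of cells, and uses a normal approximation without tie correction. The function name and the skewness values are hypothetical.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns (U statistic for sample a, z score, approximate two-sided p-value).
    """
    pooled = sorted((value, group) for group, sample in enumerate((a, b)) for value in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values so they all receive the same midrank.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    n1, n2 = len(a), len(b)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)  # rank sum of sample a
    u = w - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u, z, p


# Hypothetical per-cell skewness values for two conditions.
knockdown = [1.0, 2.0, 3.0, 4.0, 5.0]
control = [6.0, 7.0, 8.0, 9.0, 10.0]

u_kd, z_kd, p_kd = rank_sum_test(knockdown, control)
u_ct, z_ct, p_ct = rank_sum_test(control, knockdown)
```

      Because only the ranks of the pooled values enter the statistic, no assumption about the shape of the underlying distribution is needed. For small samples, an exact p-value would be preferable to the normal approximation used in this sketch.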

      To demonstrate that MTCL2 associates with Golgi-MT, microtubule regrowth experiments following nocodazole treatment have to be conducted (time course). Another efficient way to analyze such events, as shown by the Kaverina and the Akhmanova labs for example, is to use fluorescent EB proteins (e.g. EB3) to image microtubule plus ends and back-track them to identify nucleation points. Carrying out such an experiment (nocodazole washout and EB tracking) in the presence or absence of MTCL2 would allow to confirm, or not, the functional hypothesis of the authors.

      We did not want to demonstrate that MTCL2 preferentially associates with “Golgi-MTs.” From this point of view, we do not think the experiments suggested by reviewer#1 were necessarily required for our study.

      However, there is no doubt that one of the main components of the “perinuclear MTs accumulating around the Golgi” is “Golgi-associated (nucleated) MTs.” In this sense, we still agree with reviewer#1’s comment that it is better to examine whether MTCL2 is involved in MT nucleation from the Golgi membrane. The results of these experiments will be informative for readers particularly because we previously reported that MTCL1 stabilizes Golgi-associated (nucleated) MTs.

      In keeping with the above consideration, we have performed both experiments (nocodazole washout and EB tracking) according to the previous studies (for example, Sanders et al. M.B.C. vol. 28; 3181-3192, 2017). However, we ultimately decided against the inclusion of the data as we could not overcome large cell-to-cell deviations.

      Nevertheless, we believe that our current dataset adequately addresses the specific questions we explored. Briefly, even if these experiments succeeded in demonstrating the functional importance of MTCL2 for the development of Golgi-nucleated microtubules, they would not necessarily indicate the physical interaction of MTCL2 with Golgi-associated microtubules. In this respect, as described above, we have significantly supplemented data on the molecular mechanisms by which MTCL2 mediates MT–Golgi interactions. We believe that this improvement sufficiently compensates for the lack of data from the experiments suggested by reviewer#1.

      Several lines of circumstantial evidence suggest that MTCL2 is not involved in the development of Golgi-associated (nucleated) MTs in contrast to MTCL1. We discussed this issue in the “Discussion” of the revised manuscript.

      Additionally, carrying out electron microscopy analysis would be important to better characterize the effects observed on Golgi complexes upon depletion. The authors mention the effects on the "morphology of Golgi ribbon" but it is rather unclear.

      We did not perform electron microscopy analysis, because we are not implicating a change in the ultrastructure of the Golgi apparatus in MTCL2-knockdown cells. We specifically want to demonstrate that MTCL2 knockdown changes the assembly structures of the Golgi ribbons, and we believe that it is feasible to do so by light microscopy. We realize that using the term “Golgi morphology” may be misleading in this context. In the revised manuscript, we replaced this term with appropriate ones, such as “assembly structures of the Golgi stacks” or “compactness of the Golgi ribbon.”

      Last, because the authors compare the way MTCL1 and MTCL2 bind microtubules, and suggest intriguing differences, domain swapping experiments between these two isoforms would be important to carry out.

      We conducted the suggested experiments and obtained interesting results. However, we ultimately decided against their inclusion given that the functional difference between MTCL1 and 2 is not the main point of discussion in our study.

      Some studies are referred to, but the published data are not actually used (with the exception of the final scheme). The authors should comment on the fact that other Golgi-associated MT binding proteins have been shown to be involved in the mechanisms highlighted here. Why they would not take over in the absence of MTCL2 should be properly discussed.

      In the revised manuscript, we included data regarding the involvement of CLASPs and AKAP450 in the Golgi association of MTCL2. Accordingly, we introduced their roles in the development of Golgi-associated MTs as far as possible in the “Introduction” (see lines 29-36 and 38-42), “Results” (see lines 306-309 and 344-347), and “Discussion” (see lines 398-402 and 442-444).

      Similarly, in the discussion, the authors indicate that SOGA has been found as an interacting partner of CLASP2. As CLASP2 is a microtubule binding protein also localized at the Golgi complex and binding to acetylated microtubules, the authors should at least comment on the putative role of the interaction between MTCL2 and CLASP2 in the phenotypes they described. The role of the interaction between CLASP2 and MTCL2 should be discussed and ideally tested.

      As described above, we provided the data indicating the role of the interaction between MTCL2 and CLASP2 in the revised manuscript.

      In the introduction, page 3 line 74-77, the authors wrote « The resultant N-terminal fragment is released into the cytoplasm to suppress autophagy by interacting with the Atg12/Atg5 complex, whereas the C-terminal fragment is secreted after further cleavage (see Fig. 1A, boxed illustration). » while on the Fig1 the boxed area indicates that SOGA bears Atg16 and Rab5 binding domains. Please double check the interacting partners of SOGA1.

      Thank you for pointing this out. The sentence in the “Introduction” was revised to “… interacting with the Atg12/Atg5/Atg16 complex” (Rev. Endocr. Metab. Disord. 15, 137–147, 2014).

      Figure 1 B and C are not cited in the main text.

      These figures were cited in the “Introduction” section (line 65 in the previous manuscript). In the revised manuscript, these figures were replaced with Fig. EV1 A and C and cited in the “Introduction” section (line 59) as well as in the legend to Fig. 1 (line 757).

      Figure 1E: a loading control is needed to evaluate the expression level of SOGA/MTCL2 in the mouse tissues.

      Sample loading in each lane shown in previous Fig. 1E (Fig. 1D in the revised manuscript) was normalized by total protein amount (25 mg), as indicated in the figure legends. However, we have decided to add the data for α-tubulin expression in each lane as a reference, although they are not equal for each lane.

      In the liver, the size of the bands is different than in other tissues (smaller size). The authors might comment if these smaller bands correspond to the cleaved version of SOGA that was previously described in mouse hepatocytes.

      In Fig. 1D of the revised manuscript, we added arrowheads indicating the bands of smaller sizes observed in some tissues such as the liver. In addition, we commented on them in the corresponding part of the “Results” section by describing that “we cannot exclude a possibility that MTCL2 is subjected to the reported cleavage and works as SOGA in these tissues.”

      Figure 2A: single color picture for the anti-tubulin immunolabeling would help to see the distribution of microtubules in the perinuclear area. The perinuclear region is a crowded area with many intracellular compartments accumulating there as well as cytoskeleton elements.

      We completely revised Fig. 2 following the reviewers’ suggestion. To provide single-color pictures for the anti-MTCL2 and anti-tubulin immunolabeling, we added new pictures examining colocalization of MTCL2 with MTs at the peripheral regions where densities of both signals are rather low. In Fig. 2B, the colocalization was further examined via a line scan analysis across MTs. Finally, we have included new data demonstrating that exogenously expressed MTCL2 similarly colocalized with MTs even at the peripheral regions when its expression was suppressed to the endogenous level (Fig. 2C).

      Figure 2C: same comment as above, a single-color picture for the anti-MTCL2 and anti-GM130 immunolabeling are required.

      Owing to space limitations, we could not include a single-color picture for the anti-GM130 immunolabeling in Fig. 2, although we enlarged the merged figure so that readers can easily verify our statement: “some overlapped with the Golgi marker signals” (lines 146-147).

      Alternatively, we included a new Appendix Fig. S8, in which immunofluorescence signals of MTCL2 and CLASP1/2 (A) or giantin (B) are compared at a super-resolution microscopic level. In these figures, we included single-color pictures together with merged data.

      page 7, line 132-134: the authors state: « Close inspection using super-resolution microscopy further revealed the possibility that MTCL2 mediates the association of the Golgi membrane with stabilized MTs (Fig. 2D, arrows). » In my opinion, the data are over-interpreted. The signals partially co-localize but this does not indicate a function of MTCL2 in mediating the interaction.

      We deleted the previous Fig. 2D and the corresponding sentence. By doing so, we ceased to suggest that MTCL2 functions to mediate MT–Golgi interactions only based on immunofluorescence data.

      Figure 3: Another way of merging the anti-MTCL2 and GS28 pictures has to be provided. The pictures are difficult to interpret with the current display.

      We deleted the previous Fig. 4 and ceased to discuss colocalization of MTCL2 with Golgi proteins only based on immunolabeling data as mentioned above.

      Figure 4C: please indicate the meaning of « ppt »

      We included the explanation of “ppt” in the legends to the corresponding figure (Fig. 3C in the revised manuscript) as follows (lines 801-802):

      “ppt represents the MT precipitate obtained after centrifugation (200,000 × g) for 20 min at 25°C.”

      Figure 5B and C: for easier reading of the figure, it would be useful to annotate which MTCL2 construct is overexpressed following doxycycline treatment (MTCL2 WT (A) and MTCL2 delta C-MTBD (C)).

      We followed the suggestion. Please see new Fig. 5 and Fig. EV4 and 5.

      Figure 6 A and C: the labels are wrong. Bottom pictures correspond to anti-GM130 immunostaining not anti-tubulin. If I am not mistaken, it is MTCL2 delta C which is studied in panel C.

      Thank you for pointing this out. We corrected this error in Fig. EV5 (previous Fig. 6) in the revised manuscript.

      Page 11, line 212: Supplementary Figure 2 (knockdown in RPE1 cells) is intended to be cited not Supplementary Figure 3.

      Thank you for pointing this out. We corrected the error in the revised manuscript appropriately.

      Figure 8A: single color pictures are needed to appreciate the distribution of the signals

      One of the major comments from the three reviewers concerned Fig. 8, which reports that MTCL1 and 2 differentially regulate microtubules. We agree that the previous data in Fig. 8 A–C are rather preliminary. Although we could improve these figures according to the reviewers’ comments, we decided to omit these data and cease the discussion that MTCL1 and 2 localize with microtubules in a mutually exclusive manner, as this was not the main focus of the study.

      Point-by-point responses to reviewer #2’s comments

      In figure 1D, a loading control should be included for the Western Blot probing for V5-mMTCL2 in HEK293T cells.

      We did include loading controls for the indicated lanes. However, because the HEK293T cell extract in lanes 1–3 was diluted, the signals were too weak to be visualized in this figure (Fig. 2C in the revised manuscript).

      The authors use the anti-SOGA antibody to detect MTCL2. However, in Figure 1A they do not show the sequence similarity between this region in MTCL1 and MTCL2. The authors should include this, as well as show that the anti-SOGA antibody is specific for MTCL2 and does not detect MTCL1.

      In new Fig. EV1, we included amino acid sequence alignment data for the region corresponding to the used anti-SOGA1 antibody epitope. The data indicate significant divergence of the sequence from MTCL1 (6% homology, 23% similarity).

      We also included new western blot data (Fig. 1B in the revised manuscript) demonstrating that anti-SOGA1 antibody does not react with MTCL1 exogenously expressed in HEK293T cells.

      Line 132-134. The authors conclude that MTCL2 possibly mediates association between Golgi membrane and stabilized MTs based on localization microscopy only. This is an overstatement and should be corrected. Not only is the microscopy technique used able to produce a resolution of only 140nm, which is not enough to show direct association; the staining technique used (double antibody staining) also places the fluorophores approximately 20-30nm away from the intended target (MTs, MTCL2, or Golgi). Thus, the conclusion drawn is overstated and should be refined at this point in the manuscript.

      I agree with reviewer#2’s comment that the previous data in Fig. 2D are insufficient to draw the conclusion that MTCL2 mediates the association between the Golgi membrane and stabilized MTs. We deleted the figure and the corresponding sentence reviewer #2 indicated.

      We want to demonstrate that “MTCL2 mediates the association between the Golgi membrane and MTs (not restricted to the stabilized MTs).” In this sense, we have already obtained supportive data in the previous manuscript that the N-terminal region of MTCL2 has clear Golgi-localization activity, whereas the C-terminal region directly binds to MTs.

      In the revised manuscript, we reinforced these data by revealing that four point mutations (4LA) in the first coiled-coil motif disrupt the Golgi localization of the N-terminal region of MTCL2 (Fig. 4D). Thereafter, we found that introduction of the same mutations in full-length MTCL2 abolished its preferential association with the perinuclear MTs accumulating around the GA without affecting its colocalization with MTs (Fig. 4E and F). We also provide data on candidate molecules mediating the Golgi association of MTCL2 (Fig. 7). These results reinforce our immunofluorescence analysis results (Fig. 2) and indicate that the preferential association of MTCL2 with perinuclear MTs accumulating around the Golgi is facilitated by physical interactions between the N-terminal region of MTCL2 and the Golgi-resident proteins, such as CLASPs and giantin.

      The authors should include some quantification of MTCL2 signals along stabilized microtubules near the Golgi and in peripheral regions of the cell in Figure 2. This will show that MTCL2 preferentially localizes to MTs in the Golgi region but not the periphery, as the authors claim (lines 124-130). This quantification could be in the form of linescans along or across MT signals.

      We included line scan data across peripheral MTs to confirm MTCL2 colocalization with MTs (Fig. 2C). However, it is difficult to perform a line scan for the perinuclear regions, where the signals of both MTCL2 and MTs are too dense. Therefore, we demonstrate the preferential colocalization of MTCL2 with the perinuclear MTs by comparing peripheral signals of MTCL2 with those of MAP4 (Fig. 2D).
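
      A line scan of the kind mentioned here can be sketched as follows. This is an illustrative outline only, not the analysis used in the manuscript: it assumes two single-channel images stored as nested lists, samples the nearest pixel at evenly spaced points along a straight line, and compares the two profiles with a Pearson correlation. All names and pixel values are hypothetical.

```python
import math


def line_scan(image, start, end, samples):
    """Return intensities at `samples` evenly spaced points from start to end.

    `start` and `end` are (row, col) pairs; the nearest pixel is used at each point.
    """
    if samples < 2:
        raise ValueError("need at least two sample points")
    (r0, c0), (r1, c1) = start, end
    profile = []
    for i in range(samples):
        t = i / (samples - 1)
        row = round(r0 + t * (r1 - r0))
        col = round(c0 + t * (c1 - c0))
        profile.append(image[row][col])
    return profile


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)


# Hypothetical 3x5 images: a filament runs vertically through column 2 in both channels.
tubulin_channel = [[0, 1, 9, 1, 0] for _ in range(3)]
mtcl2_channel = [[1, 2, 8, 2, 1] for _ in range(3)]

# Scan horizontally across the filament along the middle row.
tubulin_profile = line_scan(tubulin_channel, (1, 0), (1, 4), samples=5)
mtcl2_profile = line_scan(mtcl2_channel, (1, 0), (1, 4), samples=5)
overlap = pearson(tubulin_profile, mtcl2_profile)
```

      Coinciding peaks in the two profiles give a correlation close to 1. In dense perinuclear regions, individual filaments cannot be isolated along the scan line, which is consistent with the difficulty described above.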

      The authors show that ectopic expression of the C-terminus of MTCL2 can rescue MTCL2 siRNA phenotypes. Since the N-terminus localizes strongly to the Golgi membrane, the authors should do corresponding experiments with this fragment, to determine if membrane binding of MTCL2 can have a similar rescue effect or if MT binding is essential. This is especially important for the Golgi-ribbon organization (Figure 6).

      We did not include data indicating rescue activity of the C-terminal fragment of MTCL2. In the previous Fig. 5 and 6, we demonstrated that MTCL2 lacking the C-terminal microtubule-binding region does not show rescue activities. Therefore, we did not follow reviewer#2’s suggestion directly.

      However, we included new data indicating that an MTCL2 mutant (4LA) that associates with MTs but not with the Golgi membrane also lacks rescue activities for asymmetric MT organization and Golgi ribbon compactness (new Fig. 5 and Fig. EV4). I hope these revisions are satisfactory.

      Line 261-2. The authors claim that MTCL1 and MTCL2 function in a mutually exclusive manner. As with point 3, this is an overstatement based solely on localization microscopy. The authors cannot draw this conclusion from the data associated with this statement (Figure 8A) and it should be refined to reflect that they only comment on the respective localization patterns of MTCL1 and MTCL2. Additionally, to show that MTCL1 and MTCL2 do not overlap on MTs, the authors should include linescans along MTs showing the anti-V5 and anti-MTCL1 intensities.

      One of the major comments from the three reviewers concerned Fig. 8, which reports that MTCL1 and 2 differentially regulate microtubules. We agree that the previous data in Fig. 8 A–C are rather preliminary. Although we could improve these figures according to the reviewers’ comments, we decided to omit these data and cease the discussion that MTCL1 and 2 localize with microtubules in a mutually exclusive manner, as this was not the main focus of the study.

      In Figure 8C the authors show acetylated tubulin staining in cells depleted of MTCL2. Based on this localization pattern, it seems the MT network is not grossly altered, as was shown in Figure 5 where perinuclear accumulation of MTs was lost. The authors should comment on whether acetylated tubulin presence and localization is altered in MTCL2-depleted cells. This is also mentioned in the discussion where the authors conclude that the major function of MTCL2 is to crosslink and accumulate MTs in the Golgi region. However, based on acetylated tubulin staining patterns, stable MTs seem to still accumulate in the Golgi region. The authors need to show this accumulated population of stable MTs is no longer crosslinked in the absence of MTCL2 to support their claim.

      Acetylated microtubules represent a minor fraction of the perinuclearly accumulated microtubules. From this point of view, it is possible that the accumulation of perinuclear microtubules is severely affected, whereas that of acetylated microtubules is not. MTCL1 might crosslink these acetylated microtubules.

      In any case, we have decided to delete the previous Fig. 8 A–C, as stated above.

      To investigate potential functional overlap between MTCL1 and MTCL2, the authors should include a double depletion experiment where MT organization and Golgi organization are investigated. The currently shown experiments do not test a functional relationship between the two paralogs. Additionally, the authors should show Western Blot analysis of MTCL1 levels in MTCL2-depleted cells, and vice versa. While there does not seem to be an overlap in localization patterns of the two proteins, that does not mean there is no functional relationship.

      We did not follow reviewer#2’s comment for the reason stated above.

      Lines 120-30 and 297-9. The authors state that based on the localization pattern of MTCL2 it mostly localizes along MTs in the perinuclear region (shown in Figure 2). Then, in the discussion they state MTCL2 preferentially localizes to Golgi membranes. Please clarify which of the two sites MTCL2 localizes to preferentially.

      We agree that we should be more careful while describing the subcellular localization of MTCL2. We revised the information in the manuscript to indicate that MTCL2 preferentially localizes to perinuclearly accumulated microtubules showing partial colocalization to the Golgi membrane.

      Loss of Golgi organization as described in Figure 6 does not appear in polarized cells in Figure 7. The authors should comment on the loss of the phenotype in polarized cells.

      Since RPE1 cells cultured at high density show abnormally elongated shapes, as described in the original text (line 238; in the revised text, line 326), Golgi ribbons in these cells do not appear to be as expanded. However, their loss of compactness in MTCL2-knockdown cells can be easily recognized in the previous Fig. 7C (corresponding to Fig. 6C in the revised manuscript).

      The authors should consider using colorblind friendly palettes in figures. For example, magenta/green instead of red/green and magenta/cyan/yellow instead of red/blue/green. Additionally, for tri-color images the combination red/green/white (Figure 4B, 7C) should be avoided, as overlapping red/green signals will show up as yellow which is difficult to distinguish from the white signals. Finally, human eyes detect shades of red much poorer than for example green. Therefore, the main point of a figure should not be in red. For example, MTCL2 is frequently shown as red signal in a merged image and should be replaced with a different color.

      We incorporated the reviewer’s suggestion.

      The authors claim the mouse MTCL2 protein lacks 203 N-terminal amino acids. Authors should clarify in the text that this is relative to mouse MTCL1. The authors should also include the human comparisons, as they work on human cell lines in the majority of the manuscript.

      I am afraid that this comment is based on a misunderstanding by reviewer #2, because we did not claim that mouse MTCL2 lacks 203 N-terminal amino acids. Instead, we described that SOGA, a mouse MTCL2 isoform, lacks 203 N-terminal amino acids compared to the full-length mouse MTCL2, the cDNA of which was used in this work.

      In Figure 1D the authors show Western Blots where various amounts of HEK293T extracts were probed for exogenously expressed MTCL2. As a control, authors should include a non-transfected control. From Figure 1E, it would be expected that HEK293 (kidney cells) would not express endogenous MTCL2, but the control should be included anyway.

      In the revised Fig. 2B, we included a lane in which a non-transfected HEK293T cell extract was loaded, according to reviewer #2’s comment (see lanes indicated as mock).

      In Figure 3, the color scheme in the final column of images should be changed. Red/white contrast is very poor and no conclusions can be drawn from these images. Additionally, the authors should include a box to show where the inset is located in the overview images.

      In the revised manuscript, we deleted the “final column of images using red/white contrast” from Fig. 2D (previous Fig. 3), to avoid drawing a conclusion on the interaction between MTCL2 and the Golgi membrane only from immunofluorescence data.

      In addition, we included boxes in the overview images to show where the inset is located, wherever it is required in the revised manuscript.

      Authors claim that MTCL2 is not detected near more dynamic MTs in the periphery of the cell and references Figures 2A and 3. They should include annotation in the figures to highlight this. This can be done with arrowheads or other markings, or with additional insets enlarging a peripheral region of the cell.

      To respond to the comment, we separately provided enlarged views of perinuclear and peripheral regions in the revised Fig. 2.

      The authors should clarify in the main text and figure legend which superresolution microscopy technique was used in Figure 2D.

      As mentioned above, we deleted the previous Fig. 2D.

      The authors use methanol fixation to examine localization of MTCL2, MTs, and Golgi. Methanol extracts lipids and thus affects intracellular membrane compartments, and can affect the localization pattern of GM130, a Golgi matrix protein. The authors should include samples fixed with a crosslinking fixative to ensure their conclusions drawn from methanol-fixed samples are not affected by the choice of fixative.

      According to the reviewer’s suggestion, we included additional data obtained using PFA fixations (Fig. EV2). PFA fixation also revealed a similar localization pattern of MTCL2 to that obtained by methanol fixation.

      In Supplementary Figure 1B a third, relatively high expressing cell can be seen in the top panel. The GM130 signal for this cell seems to be comparable to non-transfected cells in the same image. Can the authors address this? Alternatively, to show differences in expression levels between these three cells in that panel and others, authors could use a heatmap LUT of the V5 signal to differentiate expression levels more clearly in different cells.

      I am unsure whether the reviewer is referring to the cell located at the bottom-left corner of the panel in the previous Supplementary Fig. 1B (Appendix Fig. S1B in the revised manuscript). The cell shows a rather normal distribution pattern of exogenous MTCL2, similar to the endogenous one. We think this is the reason why it maintains a rather normal assembly structure of the Golgi ribbon. We included the word “frequently” in the sentence (line 153 in the revised text) to indicate that high levels of exogenous MTCL2 do not always disrupt the normal Golgi ribbon structure. We do not think it is necessary to differentiate the expression levels of exogenous MTCL2 more clearly by using a heatmap, since this issue is not critical for the conclusions of this paper.

      Line 139. How was the ectopic expression 'suppressed to endogenous levels'? The panels in Suppl Fig. 1 of 'low expression' clearly show increased MTCL2 signal when compared to non-transfected cells in the same panel still. This would suggest ectopic expression is still above endogenous levels.

      We did not suppress the expression actively. We identified the cells expressing exogenous MTCL2 at low levels comparable to those of endogenous MTCL2. The information provided in line 139 of the previous text is not accurate. Thank you for pointing out this issue; we revised the sentence as follows: “However, when the expression levels were similar to the endogenous levels, … (lines 154-155 in the revised text)”

      Figure 5C. The label for MTCL2 construct should read mMTCL2 ΔC-MTBD to clarify the expression construct used.

      Since the labeling in previous Fig. 5 and 6 was confusing, we revised them all by adding the name of the expressed MTCL2 mutant under the label “+dox” (see Fig. 5, Fig. EV4, and Fig. EV5 in the revised manuscript).

      In Figures 6A and 6C the label shows a-tubulin, but the staining is of a Golgi marker.

      Thank you for pointing this out. We corrected this error in the corresponding figure (Fig. EV5) in the revised manuscript.

      In Figures 6B and 6D the different conditions should be separated more in the graph, the datapoints overlap.

      In the revised manuscript, we significantly improved the presentation of the statistical data shown in the previous Figs. 5 and 6 (Fig. 5 and Figs. EV4 and 5 in the revised manuscript). As part of these improvements, we decided to include only data of biological replicates from a single typical experiment in the main figures. As a result, the data points in the previous Fig. 6B and D are fewer in number and no longer overlap (see Figs. EV4 and EV5D). In addition, we have included new figures (Appendix Fig. S4) in which the results of technical replicates (three independent experiments) are presented.

      Lines 246-7. The authors claim the Golgi-associated and centrosomal MTs can be easily distinguished in MTCL2 knockdown cells. They should include annotation in the corresponding figures to highlight these different populations.

      We followed the reviewer’s suggestion by adding arrows in Fig. 6C of the revised manuscript.

      Figure 8A. A horizontal line is missing in the panel showing MTCL/a-tub merge.

      Thank you for pointing this out. As mentioned above, we deleted the previous Fig. 8A from the manuscript.

      Figures 8C and 8D. The acetylated tubulin staining in control cells (control RNAi and GFP) in these panels vary greatly. Can the authors comment on this?

      Expression of MTCL1 C-MTBD strongly induces tubulin acetylation. Therefore, to obtain appropriate pictures under non-saturated conditions, we had to decrease the photomultiplier gain of the confocal microscopy system for the previous Fig. 8D. This is why acetylated tubulin signals in control cells appear much weaker in the previous Fig. 8D than in Fig. 8C.

      In any case, we deleted the previous Fig. 8C in the revised manuscript as stated above. The previous Fig. 8D is solely included in Fig. EV3.

      Additionally, there appears to be an increase in acetylated tubulin on the Western Blot (8E) shown in cells expressing GFP-MTCL2 C-MTBD that is not reflected in the image in Figure 8D. Since a significant population of GFP-MTCL2 C-MTBD localizes to the nucleus, it is possible that the functional population of GFP-MTCL2 C-MTBD that can stabilize MTs is much lower than that of GFP-MTCL1 C-MTBD despite showing equal levels in the Western Blot. The authors should compare signal intensity in the cytosol of GFP-expressing cells and base their analysis of acetylated tubulin levels on cells where cytosolic levels are comparable.

      We agree with this reviewer’s comment and did not include WB data in Fig. EV3B corresponding to the previous Fig. 8D.

      As for quantification of the fluorescence data in Fig. 8D, we provided a typical result on the acetylated-tubulin signals normalized by GFP and a-tubulin signals in the boxed regions where cytosolic GFP signals are comparable.

      **Point-by-point responses to reviewer #3’s comments**

      While the standard fluorescence images are of good quality, the quality of the super-resolution microscopic images is quite low and insufficient. Fig. 8A looks like an enlarged standard laser scanning microscope image, but does not achieve the resolution of a super-resolution image by far, which should be well below the µm range. However, such a resolution would be required to support the claim that MTCL1 and 2 locate on MTs in a mutually exclusive manner. (Negative) data from immunoprecipitation experiments also provide only weak evidence for the absence of a heterocomplex. I also fear that the fixation process creates artifacts. Experiments to image living cells would definitely bolster the data and also provide information about the dynamics of the interactions.

      One of the major comments from the three reviewers concerned Fig. 8, which reports that MTCL1 and 2 differentially regulate microtubules. We agree that the previous data in Fig. 8 A–C are rather preliminary. In the revised manuscript, we deleted these data and no longer discuss whether MTCL1 and 2 localize with microtubules in a mutually exclusive manner, as this was not the main focus of the study.

      We also deleted the previous Fig. 2D (showing another super-resolution image) and the corresponding sentence. By doing so, we ceased to suggest that MTCL2 functions in mediating MT–Golgi interactions only based on immunofluorescence data.

      It would also be relevant to confirm that the results are not a cell line artifact in HeLa cells.

      In the previous manuscript, we included data indicating that the knockdown effects observed in HeLa-K cells (reduced accumulation of MTs around the Golgi as well as lateral expansion of the Golgi ribbon) are also induced in RPE1 cells by MTCL2 knockdown (Supplementary Fig. 2 in the previous manuscript). We included the same figure in the revised manuscript as Appendix Fig. S4.

      A standard method for detecting microtubule association in cultured cells would be to use an extraction protocol. This has to be done to show that MTCL2 actually behaves like a microtubule-associated protein (MAP).

      In the revised manuscript, we included new immunofluorescence data obtained using PFA fixation with or without pre-extraction, which revealed a localization pattern of MTCL2 similar to that obtained by methanol fixation (Fig. EV2). Pre-extraction was performed using BRB80 buffer supplemented with 0.5% TX-100 and 4 mM EGTA for 30 s, according to a protocol provided by the Mitchison laboratory.

      I don't see that the study proves that MTCL2 is essential for the organization of an asymmetric microtubule network as the title claims. The experiments shown in Fig. 5 demonstrate a change in the skewness of the pixel intensity distribution dependent on the presence of MTCL2, which may indicate a contribution of MTCL2 (provided that the fixation and staining do not produce an artifact). However, they do not prove that MTCL2 is essential.

      We cannot understand how an artifact due to the fixation and staining may be responsible for the results shown in the previous Fig. 5 (Fig. 5 and Figs. EV4 and 5 in the revised manuscript).

      In many cases, microtubules do not elongate radially (symmetrically) from the centrosome but intensely accumulate around the Golgi area and show asymmetric organization (see Meiring et al. Curr. Opin. Cell Biol. 62: 86-95, 2020). “An asymmetric MT network” in the title corresponds to this asymmetric array of general MTs accumulating around the Golgi complex.

      In this respect, our findings that MTCL2 depletion severely disrupted MT accumulation around the Golgi and induced random and rather symmetric arrays of MTs (Fig. 5A) are very impressive. We believe that the knockdown/rescue experiments in this study strongly support the title by demonstrating that MTCL2 facilitates MT accumulation around the Golgi through its dual binding activity to MTs and the Golgi membrane.

      We are unable to comprehend the reviewer’s standpoint in not allowing us to conclude the essential role of MTCL2 in the organization of an asymmetric microtubule network. Nevertheless, we changed the title of the revised manuscript as follows:

      “MTCL2 promotes asymmetric microtubule organization by crosslinking microtubules on the Golgi membrane”

      There is also a large oversampling of the data by plotting each individual cell from only two separate experiments. It would be better and more reliable to present the data as the mean of the experiments (then of course more than 2 would be required). The same applies to the experiments in which the "Golgi ribbon expanding angle" was determined (Fig. 6).

      In my opinion, statistical theories based on ideal assumptions cannot simply be applied to the quantitative analysis of biological phenomena. In our case, the MT distributions, as well as the Golgi ribbon expansion angles, vary substantially from cell to cell in a context-dependent manner (for example, with cell cycle stage or cell density). The variation of these values between cells (in biological replicates) is much larger than the experimental variation, which mainly depends on stochastic elements (in technical replicates). I understand that this is the reason why many journals in cell biology do not necessarily require “three” independent experiments for statistical analysis.

      In the revised manuscript, however, we included data from three independent experiments for all rescue experiments (Fig. 5, Figs. EV4 and 5, and Appendix Fig. S4) to further demonstrate the reliability of our data.

      In the main figures (Fig. 5, Figs. EV4 and 5), we included statistical data of a single typical experiment to demonstrate reproducibility in biological replicates in each condition. To compensate for these figures, we listed statistical data for each biological replicate of all experiments in Appendix Fig. S4 A. In Appendix Fig. S4 B and C, we further provided statistical data of technical replicates (three independent experiments) by comparing the average of each biological replicate. We concluded that this is the best way to statistically demonstrate the reliability of the biological analysis.
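As an illustration only, and not the analysis code used for the manuscript, the comparison of experiment-level averages described above could be sketched as follows. The function name `summarize_experiments`, the experiment labels, and all numbers are hypothetical; the sketch assumes that each independent experiment contributes a list of per-cell measurements (for example, skewness values).

```python
from statistics import mean, stdev


def summarize_experiments(per_cell_values):
    """Collapse per-cell measurements into one mean per independent experiment.

    per_cell_values maps a (hypothetical) experiment label to the list of
    per-cell measurements obtained in that experiment.
    """
    experiment_means = {
        label: mean(cells) for label, cells in per_cell_values.items()
    }
    values = list(experiment_means.values())
    # The spread across experiments is only defined for two or more experiments.
    spread = stdev(values) if len(values) > 1 else float("nan")
    return experiment_means, mean(values), spread


# Made-up numbers, for illustration only.
example = {
    "exp1": [1.0, 2.0, 3.0],
    "exp2": [2.0, 4.0],
    "exp3": [3.0, 5.0, 4.0],
}
means, grand_mean, sd = summarize_experiments(example)
```

With this layout, the number of observations entering the final comparison equals the number of independent experiments rather than the number of cells, which is the point raised in the reviewer's comment on oversampling.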

      We believe that the data collectively presented by these figures strongly support the reliability of our conclusions.

      It would be good to support the claim that MTCL2 affects the Golgi ribbon structure through ultrastructural analysis (EM).

      We did not perform electron microscopy analysis, because we are not implicating a change in the ultrastructure of the Golgi apparatus in MTCL2-knockdown cells. We specifically want to demonstrate that MTCL2 knockdown changes the assembly structures of the Golgi ribbons, and we believe that it is feasible to do so by light microscopy. We realize that using the term “Golgi morphology” may be misleading in this context. In the revised manuscript, we replaced this term with appropriate ones, such as “assembly structures of the Golgi stacks” or “compactness of the Golgi ribbon.”

      The critical mechanistic question is which molecule on the Golgi side interacts with MTCL2, since the experiments with the deletion constructs would suggest that it is not the microstructure of the microtubules. As shown, the study is mainly descriptive in relation to this aspect.

      We significantly improved this weakness by including new data indicating the possible involvement of CLASPs and giantin in mediating the Golgi association of MTCL2 (see Fig. 7 and Appendix Figs. S5–7).

      We also revealed that four point mutations (4LA) in the first coiled-coil motif disrupt the Golgi localization of the N-terminal region of MTCL2 (Fig. 4D). Thereafter, we found that introduction of the same mutations in full-length MTCL2 abolished its preferential association with the perinuclear MTs accumulating around the GA without affecting its colocalization with MTs (Fig. 4E and F).

      These results reinforce our immunofluorescence results (Fig. 2) and indicate that the preferential association of MTCL2 to perinuclear MTs accumulating around the Golgi is facilitated by physical interactions between the N-terminal region of MTCL2 and the Golgi-resident proteins, such as CLASPs and giantin.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Matsuoka et al. describe an MTCL1 paralogue (MTCL2) that is present in vertebrates and binds to the Golgi membrane and interacts with microtubules. In contrast to MTCL1, MTCL2 contains only one microtubule binding region and does not stabilize any microtubules. The authors provide evidence that MTCL2 may be involved in accumulating microtubules on the Golgi and promote directed migration. The study is based on experiments with cell lines, predominantly HeLa cells, and relies heavily on the immunofluorescence staining of methanol-fixed cells. While the concept of a functional Golgi-microtubule interaction is interesting and may be relevant for directed migration, I am not convinced of the experimental support and interpretation provided by the authors.

      1. The study relies entirely on the examination of cell lines, mainly HeLa cells, and the immunofluorescence of fixed cells. While the standard fluorescence images are of good quality, the quality of the super-resolution microscopic images is quite low and insufficient. Fig. 8A looks like an enlarged standard laser scanning microscope image, but does not achieve the resolution of a super-resolution image by far, which should be well below the µm range. However, such a resolution would be required to support the claim that MTCL1 and 2 locate on MTs in a mutually exclusive manner. (Negative) data from immunoprecipitation experiments also provide only weak evidence for the absence of a heterocomplex. I also fear that the fixation process creates artifacts. Experiments to image living cells would definitely bolster the data and also provide information about the dynamics of the interactions.
      2. It would also be relevant to confirm that the results are not a cell line artifact in HeLa cells.
      3. A standard method for detecting microtubule association in cultured cells would be to use an extraction protocol. This has to be done to show that MTCL2 actually behaves like a microtubule-associated protein (MAP).
      4. I don't see that the study proves that MTCL2 is essential for the organization of an asymmetric microtubule network as the title claims. The experiments shown in Fig. 5 demonstrate a change in the skewness of the pixel intensity distribution dependent on the presence of MTCL2, which may indicate a contribution of MTCL2 (provided that the fixation and staining do not produce an artifact). However, they do not prove that MTCL2 is essential. There is also a large oversampling of the data by plotting each individual cell from only two separate experiments. It would be better and more reliable to present the data as the mean of the experiments (then of course more than 2 would be required). The same applies to the experiments in which the "Golgi ribbon expanding angle" was determined (Fig. 6).
      5. It would be good to support the claim that MTCL2 affects the Golgi ribbon structure through ultrastructural analysis (EM).
      6. The critical mechanistic question is which molecule on the Golgi side interacts with MTCL2, since the experiments with the deletion constructs would suggest that it is not the microstructure of the microtubules. As shown, the study is mainly descriptive in relation to this aspect.

      Significance

      The study is based on experiments with cell lines, predominantly HeLa cells, and relies heavily on the immunofluorescence staining of methanol-fixed cells. While the concept of a functional Golgi-microtubule interaction is interesting and may be relevant for directed migration, I am not convinced of the experimental support and interpretation provided by the authors.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this work, Matsuoka et al. describe a novel microtubule (MT) crosslinking factor, MTCL2. They use Western Blot analysis to show the presence of MTCL2 in various tissues and use a previously developed antibody to show its localization in cultured cells. The authors show that MTCL2 localizes along MTs in the Golgi region and that upon depletion of MTCL2, these MTs do not accumulate in the Golgi and Golgi organization is affected, leading to defects in migration. Through deletion mutant analysis, they show that MTCL2 C-terminus binds to MTs and that the N-terminus binds to Golgi membranes, though this may be lost or reduced in the full length protein. Expression of the C-terminal fragment rescues the phenotypes observed in MTCL2-depleted cells. Finally, the authors show that MTCL1 and MTCL2 show non-overlapping localization patterns and conclude they may have different functions in crosslinking and stabilizing MTs and Golgi organization.

      Major comments:

      1. In figure 1D, a loading control should be included for the Western Blot probing for V5-mMTCL2 in HEK293T cells.
      2. The authors use the anti-SOGA antibody to detect MTCL2. However, in Figure 1A they do not show the sequence similarity between this region in MTCL1 and MTCL2. The authors should include this, as well as show that the anti-SOGA antibody is specific for MTCL2 and does not detect MTCL1.
      3. Lines 132-134. The authors conclude that MTCL2 possibly mediates association between the Golgi membrane and stabilized MTs based on localization microscopy only. This is an overstatement and should be corrected. Not only does the microscopy technique used achieve a resolution of only 140 nm, which is not enough to show direct association; the staining technique used (double antibody staining) also places the fluorophores approximately 20-30 nm away from the intended target (MTs, MTCL2, or Golgi). Thus, the conclusion drawn is overstated and should be refined at this point in the manuscript.
      4. The authors should include some quantification of MTCL2 signals along stabilized microtubules near the Golgi and in peripheral regions of the cell in Figure 2. This will show that MTCL2 preferentially localizes to MTs in the Golgi region but not the periphery, as the authors claim (lines 124-130). This quantification could be in the form of linescans along or across MT signals.
      5. The authors show that ectopic expression of the C-terminus of MTCL2 can rescue MTCL2 siRNA phenotypes. Since the N-terminus localizes strongly to the Golgi membrane, the authors should do corresponding experiments with this fragment, to determine if membrane binding of MTCL2 can have a similar rescue effect or if MT binding is essential. This is especially important for the Golgi-ribbon organization (Figure 6).
      6. Line 261-2. The authors claim that MTCL1 and MTCL2 function in a mutually exclusive manner. As with point 3, this is an overstatement based solely on localization microscopy. The authors cannot draw this conclusion from the data associated with this statement (Figure 8A) and it should be refined to reflect that they only comment on the respective localization patterns of MTCL1 and MTCL2. Additionally, to show that MTCL1 and MTCL2 do not overlap on MTs, the authors should include linescans along MTs showing the anti-V5 and anti-MTCL1 intensities.
      7. In Figure 8C the authors show acetylated tubulin staining in cells depleted of MTCL2. Based on this localization pattern, it seems the MT network is not grossly altered, as was shown in Figure 5 where perinuclear accumulation of MTs was lost. The authors should comment on whether acetylated tubulin presence and localization is altered in MTCL2-depleted cells. This is also mentioned in the discussion where the authors conclude that the major function of MTCL2 is to crosslink and accumulate MTs in the Golgi region. However, based on acetylated tubulin staining patterns, stable MTs seem to still accumulate in the Golgi region. The authors need to show this accumulated population of stable MTs is no longer crosslinked in the absence of MTCL2 to support their claim.
      8. To investigate potential functional overlap between MTCL1 and MTCL2, the authors should include a double depletion experiment where MT organization and Golgi organization are investigated. The currently shown experiments do not test a functional relationship between the two paralogs. Additionally, the authors should show Western Blot analysis of MTCL1 levels in MTCL2-depleted cells, and vice versa. While there does not seem to be an overlap in localization patterns of the two proteins, that does not mean there is no functional relationship.
      9. Lines 120-30 and 297-9. The authors state that based on the localization pattern of MTCL2 it mostly localizes along MTs in the perinuclear region (shown in Figure 2). Then, in the discussion they state MTCL2 preferentially localizes to Golgi membranes. Please clarify which of the two sites MTCL2 localizes to preferentially.
      10. Loss of Golgi organization as described in Figure 6 does not appear in polarized cells in Figure 7. The authors should comment on the loss of the phenotype in polarized cells.
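The linescan quantification suggested in major comments 4 and 6 could, as a minimal sketch, look like the following. This is an illustration under stated assumptions, not a prescribed method: the function name `linescan` is hypothetical, the image is assumed to be a single-channel 2D array given as a list of rows, and nearest-neighbour sampling is used for simplicity where real analyses would typically interpolate and average over a line width.

```python
def linescan(image, start, end, n_samples):
    """Sample pixel intensities at evenly spaced points along a straight line.

    image is a list of rows (image[row][col]); start and end are (row, col)
    tuples that must lie inside the image. Nearest-neighbour sampling is used.
    """
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2")
    (r0, c0), (r1, c1) = start, end
    profile = []
    for i in range(n_samples):
        t = i / (n_samples - 1)  # fractional position along the line, 0..1
        row = int(round(r0 + t * (r1 - r0)))
        col = int(round(c0 + t * (c1 - c0)))
        profile.append(image[row][col])
    return profile


# Toy image with a bright horizontal filament in the middle row.
toy = [
    [0, 0, 0, 0, 0],
    [0, 5, 9, 5, 0],
    [0, 0, 0, 0, 0],
]
along = linescan(toy, (1, 0), (1, 4), 5)   # along the filament
across = linescan(toy, (0, 2), (2, 2), 3)  # across the filament
```

Running the same scan on two channels (for example, the MTCL2 and tubulin channels) and plotting the two profiles together would show whether the signals peak at the same positions.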

      Minor comments:

      1. The authors should consider using colorblind-friendly palettes in figures. For example, magenta/green instead of red/green and magenta/cyan/yellow instead of red/blue/green. Additionally, for tri-color images the combination red/green/white (Figure 4B, 7C) should be avoided, as overlapping red/green signals will show up as yellow, which is difficult to distinguish from the white signals. Finally, human eyes detect shades of red much more poorly than, for example, green. Therefore, the main point of a figure should not be in red. For example, MTCL2 is frequently shown as a red signal in a merged image and should be shown in a different color.
      2. The authors claim the mouse MTCL2 protein lacks 203 N-terminal amino acids. Authors should clarify in the text that this is relative to mouse MTCL1. The authors should also include the human comparisons, as they work on human cell lines in the majority of the manuscript.
      3. In Figure 1D the authors show Western Blots where various amounts of HEK293T extracts were probed for exogenously expressed MTCL2. As a control, authors should include a non-transfected control. From Figure 1E, it would be expected that HEK293 (kidney cells) would not express endogenous MTCL2, but the control should be included anyway.
      4. In Figure 3, the color scheme in the final column of images should be changed. Red/white contrast is very poor and no conclusions can be drawn from these images. Additionally, the authors should include a box to show where the inset is located in the overview images.
      5. The authors claim that MTCL2 is not detected near more dynamic MTs in the periphery of the cell and reference Figures 2A and 3. They should include annotation in the figures to highlight this. This can be done with arrowheads or other markings, or with additional insets enlarging a peripheral region of the cell.
      6. The authors should clarify in the main text and figure legend which superresolution microscopy technique was used in Figure 2D.
      7. The authors use methanol fixation to examine localization of MTCL2, MTs, and Golgi. Methanol extracts lipids and thus affects intracellular membrane compartments, and can affect the localization pattern of GM130, a Golgi matrix protein. The authors should include samples fixed with a crosslinking fixative to ensure their conclusions drawn from methanol-fixed samples are not affected by the choice of fixative.
      8. In Supplementary Figure 1B a third, relatively high expressing cell can be seen in the top panel. The GM130 signal for this cell seems to be comparable to non-transfected cells in the same image. Can the authors address this? Alternatively, to show differences in expression levels between these three cells in that panel and others, authors could use a heatmap LUT of the V5 signal to differentiate expression levels more clearly in different cells.
      9. Line 139. How was the ectopic expression 'suppressed to endogenous levels'? The panels in Suppl Fig 1 of 'low expression' clearly show increased MTCL2 signal when compared to non-transfected cells in the same panel still. This would suggest ectopic expression is still above endogenous levels.
      10. Figure 5C. The label for MTCL2 construct should read mMTCL2 ΔC-MTBD to clarify the expression construct used.
      11. In Figures 6A and 6C the label shows a-tubulin, but the staining is of a Golgi marker.
      12. In Figures 6B and 6D the different conditions should be separated more in the graph, the datapoints overlap.
      13. Lines 246-7. The authors claim the Golgi-associated and centrosomal MTs can be easily distinguished in MTCL2 knockdown cells. They should include annotation in the corresponding figures to highlight these different populations.
      14. Figure 8A. A horizontal line is missing in the panel showing MTCL/a-tub merge.
      15. Figures 8C and 8D. The acetylated tubulin staining in control cells (control RNAi and GFP) in these panels varies greatly. Can the authors comment on this? Additionally, there appears to be an increase in acetylated tubulin on the Western Blot (8E) shown in cells expressing GFP-MTCL2 C-MTBD that is not reflected in the image in Figure 8D. Since a significant population of GFP-MTCL2 C-MTBD localizes to the nucleus, it is possible that the functional population of GFP-MTCL2 C-MTBD that can stabilize MTs is much lower than that of GFP-MTCL1 C-MTBD despite showing equal levels in the Western Blot. The authors should compare signal intensity in the cytosol of GFP-expressing cells and base their analysis of acetylated tubulin levels on cells where cytosolic levels are comparable.

      Significance

      This work describes a novel MT crosslinking protein, MTCL2. The authors show that MTCL2 may function predominantly on non-centrosomal MTs associated with the Golgi and suggest a function in linking the centrosome and Golgi in polarized, migrating cells. However, the manuscript is highly descriptive as the authors do not uncover a mechanism for how MTCL2 stabilizes and crosslinks MTs and do not address potential functional interactions between MTCL1 and MTCL2. Additionally, there are some contradictory findings that are not addressed in the current manuscript.

      This work adds a new factor to an expanding list of proteins that regulate non-centrosomal MTs (reviewed in Meiring et al., 2019, Current Opinion in Cell Biology, and Sanders and Kaverina, 2015, Frontiers in Neuroscience), and would be of interest to those interested in cell biology of MT organization and function.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      In this manuscript, the authors identify MTCL2, a paralog of the MTCL1 protein and study its interaction with the Golgi complex and with microtubules. A shorter version of this protein was identified before and named SOGA (suppressor of glucose from autophagy). A role of MTCL2 in regulating the polymerization of Golgi associated microtubules is reported as well as an implication in cell polarity and migration.

      Major comments:

      - Are the key conclusions convincing?

      Most of the key conclusions are valid but the main one should be either reinforced or toned down. In particular, the authors tend to give a central role to MTCL2 in regulating the formation and organization of the Golgi-associated MT network, and conversely in organizing Golgi elements, without considering the other factors identified (the authors cite relevant papers but do not discuss them). They should analyze the function of MTCL2 in relation to the role of CLASP2, AKAP450, Golgi-g-Tubulin, or even EB proteins (like EB3). I also do not think that carrying out super resolution microscopy is enough to "reveal the possibility that MTCL2 mediates the association of the Golgi membrane with stabilized MTs". More generally, the authors cannot conclude that MTCL2 preferentially associates with Golgi-MTs only from their immunofluorescence and KD experiments. The centrosome (the main MTOC) is indeed also localized in the perinuclear area. Additional experiments that are easy to do may help to confirm these conclusions (see below). Also, the authors could strengthen the way they study how MTCL1 and MTCL2 bind to microtubules and the Golgi (see below). The localization or interaction of MTCL2 with Golgi-associated MTs is not directly shown. The title should also be changed. I am not sure I understand what an asymmetric microtubule network means in this context. I guess that the authors mean a non-centrosomal microtubule network.

      - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The authors also state that tubulin acetylation is induced by MTCL1 C-MTBD but it may simply be stabilized. They should also clarify whether MTCL2 regulates Golgi-dependent microtubule nucleation.

      - Would additional experiments be essential to support the claims of the paper?

      Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. To demonstrate that MTCL2 associates with Golgi-MTs, microtubule regrowth experiments following nocodazole treatment have to be conducted (time course). Another efficient way to analyze such events, as shown by the Kaverina and the Akhmanova labs for example, is to use fluorescent EB proteins (e.g. EB3) to image microtubule plus ends and back-track them to identify nucleation points. Carrying out such experiments (nocodazole washout and EB tracking) in the presence or absence of MTCL2 would make it possible to confirm, or refute, the functional hypothesis of the authors. Additionally, carrying out electron microscopy analysis would be important to better characterize the effects observed on Golgi complexes upon depletion. The authors mention the effects on the "morphology of Golgi ribbon" but this is rather unclear. Last, because the authors compare the way MTCL1 and MTCL2 bind microtubules, and suggest intriguing differences, domain swapping experiments between these two isoforms would be important to carry out.

      - Are the suggested experiments realistic in terms of time and resources?

      It would help if you could add an estimated cost and time investment for substantial experiments. The proposed experiments to study Golgi-based nucleation are easy and inexpensive, as are domain swapping experiments. Electron microscopy on the other hand is quite expert and requires either internal knowledge, access to a facility or setting-up a collaboration. A few months, 3-4, would be needed.

      - Are the data and the methods presented in such a way that they can be reproduced?

      yes

      - Are the experiments adequately replicated and statistical analysis adequate?

      I am not a statistician but I was not convinced by the use of the quantification of "skewness", in particular in figure 5B. Whether a Wilcoxon test is adequate is unclear to me.

      Minor comments:

      -Specific experimental issues that are easily addressable.

      Yes

      -Are prior studies referenced appropriately?

      Some studies are referred to but the published data are not actually used (with the exception of the final scheme). The authors should comment on the fact that other Golgi-associated MT binding proteins have been shown to be involved in the mechanisms highlighted here. Why they would not take over in the absence of MTCL2 should be properly discussed. Similarly, in the discussion, the authors indicate that SOGA has been found as an interacting partner of CLASP2. As CLASP2 is a microtubule binding protein also localized at the Golgi complex and binding to acetylated microtubules, the authors should at least comment on the putative role of the interaction between MTCL2 and CLASP2 in the phenotypes they described. The role of the interaction between CLASP2 and MTCL2 should be discussed and ideally tested.

      -Are the text and figures clear and accurate? In general, yes. There are however quite a few problems:

      • In the introduction, page 3 line 74-77, the authors wrote « The resultant N-terminal fragment is released into the cytoplasm to suppress autophagy by interacting with the Atg12/Atg5 complex, whereas the C-terminal fragment is secreted after further cleavage (see Fig. 1A, boxed illustration). » while on the Fig1 the boxed area indicates that SOGA bears Atg16 and Rab5 binding domains. Please double check the interacting partners of SOGA1.

      • Figure 1 B and C are not cited in the main text.

      • Figure 1E: a loading control is needed to evaluate the expression level of SOGA/MTCL2 in the mouse tissues. In the liver, the size of the bands is different than in other tissues (smaller size). The authors might comment on whether these smaller bands correspond to the cleaved version of SOGA that was previously described in mouse hepatocytes.

      • Figure 2A: a single-color picture for the anti-tubulin immunolabeling would help to see the distribution of microtubules in the perinuclear area. The perinuclear region is a crowded area with many intracellular compartments accumulating there as well as cytoskeleton elements.

      • Figure 2C: same comment as above, single-color pictures for the anti-MTCL2 and anti-GM130 immunolabeling are required.

      • page 7, line132-134: the authors state: « Close inspection using super-resolution microscopy further revealed the possibility that MTCL2 mediates the association of the Golgi membrane with stabilized MTs (Fig. 2D, arrows). » In my opinion, the data are over-interpreted. The signals partially co-localize but this does not indicate a function of MTCL2 in mediating the interaction.

      • Figure 3: Another way of merging the anti MTCL2 and GS28 pictures has to be provided. The pictures are difficult to interpret with the current display.

      • Figure 4C: please indicate the meaning of « ppt »

      • Figure 5B and C: for easier reading of the figure, it would be useful to annotate which MTCL2 construct is overexpressed following doxycycline treatment (MTCL2 WT (A) and MTCL2 delta C-MTBD (C)).

      • Figure 6 A and C: the labels are wrong. Bottom pictures correspond to anti-GM130 immunostaining not anti-tubulin. If I am not mistaken, it is MTCL2 delta C which is studied in panel C.

      • Page 11, line 212: Supplementary Figure 2 (knockdown in RPE1 cells) is intended to be cited not Supplementary Figure 3.

      • Figure 8A: single color pictures are needed to appreciate the distribution of the signals

      -Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Yes, see above

      Significance

      - Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      It adds a new player in the machinery involved in the interplay between the Golgi complex and microtubules for their mutual organization. To me a key observation is the unlinking between the Golgi complexes and the centrosome but this observation is not really used and studied (here again, maybe a nocodazole wash-out experiment and real-time analysis would help).

      - Place the work in the context of the existing literature (provide references, where appropriate).

      A large number of studies, cited by the authors, have identified proteins involved in mutual organization of Golgi membranes and microtubules. Identification and study of MTCL1 and 2 are important in this context. It also questions the role and function of the initially identified SOGA.

      - State what audience might be interested in and influenced by the reported findings.

      This is a pure cell biology study that will primarily interest people studying the Golgi complex and microtubules. People interested in the internal organization of the cell, the interaction between the centrosome and the Golgi and intracellular polarity would also be interested, as well as people studying migration.

      - Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I studied Golgi dynamics and function as well as microtubule dynamics. I have no expertise in statistical analysis.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer 1 (Evidence, reproducibility and clarity):

      The main message of this paper, as far as I understood since I am not a molecular bioinformatician but I am certainly interested in mtDNA variations especially related to disease, is that there is a very obvious bias among synonymous changes in the ORF of human mtDNA, more frequent for amino acids with 4 variants, more frequent in P position, and much more frequently characterized by transversion rather than transition substitutions. This survey is well written and, although edited in a rather technical language, the message is reachable and interesting. I also agree on the conclusions of the Author concerning the considerations that this set of new data should prompt one to draw also considering non-synonymous, potentially pathogenic mutations. The only contribution I feel I can provide to this manuscript is to invite the Authors to consider the possibility that the selection may be due to a preferred codon bias, linked to the higher or lower compliance of different codons to be translated by the translational in situ machinery of mitochondria. I am not sure that this applies also for mitochondrial mitochondria and related factors (you may want to ask Aleksey Amunts in Stockholm or Bob Lightowlers or Zoscha Lightowlers in Newcastle on this matter). I do know that this is certainly a problem for recombinant proteins containing, for instance, mammalian MTS fused with a bacterial restriction enzyme; in most cases the bacterial sequence has to be recoded using the preferred codons for a mammalian system in order to increase translation by a eukaryotic (mammalian) translation machinery. I wonder whether you could discuss this possibility in your paper and maybe perform some further comparative measurement to test it.

      I appreciate the supportive comments of Reviewer 1 regarding the accessibility of our manuscript, and I address comments related to codon bias below.

      Reviewer 1 (Significance):

      The paper provides novel information on the structure and constrains of mtDNA variants in humans, opens an area of investigation which is new and potentially relevant, with some possible implications also on pathogenic mtDNA mutations in humans.

      I thank Reviewer 1 for their positive comments about the novelty of this work and the important implications of our study.

      Reviewer 1 (Referee Cross-commenting):

      I said in my first comment that I am not a bioinformatician, but Referee 2 did a great job in identifying some critical points and suggesting to the Authors how to address them. I maintain my opinion, which I think is shared by Referee 2, that the paper conveys an interesting and rather unexpected message, and that if the Authors are able to answer properly the points raised by Referee 2 the paper should be published.

      I am quite glad to hear that Reviewer 1 would like to see this manuscript published, provided that the items noted by the reviewers are properly addressed.

      Response to Reviewer 1:

      R1Q1 (Continuation from Referee Cross-commenting): I confirm that the only contribution I feel I can provide to this manuscript is to invite the Authors to consider the possibility that the selection may be due to a preferred codon bias, linked to the higher or lower compliance of different codons to be translated by the translational in situ machinery of mitochondria. I wonder whether the Authors could consider this possibility in the Discussion and possibly perform some further comparative measurement to test it.

      R1A1: My manuscript takes into consideration the possibility that codon-specific preferences would determine the frequency of mtDNA variants. Findings that argue against codon bias as a strong source of selection include:

      1) At two-fold degenerate P3s, nearly every site (> 97%) harbored at least one HelixMTdb sample associated with a non-reference base. It is worth noting that HelixMTdb is not enriched for known mitochondrial disease variants.

      2) SSNEs are very tightly associated with transversions from the human reference sequence, implicating mutational biases as a cause of any limited diversity in the HelixMTdb.

      3) Every possible base can be found at 99% of >500 analyzed I-P3 positions (those P3s at which the base at codon positions one and two is identical throughout the alignment), arguing against the idea that codon bias plays a significant role in controlling variant frequency across mammals. The only exception that I identified in my extensive analysis is the P3 found within the first methionine codon of COX3.

      4) Earlier, more limited studies of mitochondrial codon choice (citations of these earlier studies can be found in the manuscript) also argue against substantial selection based upon codon choice.

      5) Finally, I would note that the set of tRNAs encoded by vertebrate mtDNAs is quite limited, with only one tRNA linked to each codon family defined by codon positions P1 and P2. There is no evidence, to my knowledge, that nucleus-encoded tRNAs enter human mitochondria. Therefore, the scope of potential selection linked to, for example, translation speed and protein folding seems particularly limited at vertebrate mitochondria.

      While most evidence does not support strong selection on mtDNA codon choice in vertebrates, I do report divergence in TSS distributions obtained from the I-P3s of different amino acids within the same degeneracy class (e.g. two-fold purine, two-fold pyrimidine, four-fold), hinting at some minimal role for codon preferences at P3. However, on the whole, mutational propensities are likely to be the predominant factor controlling synonymous variation.

      Reviewer 2 (Evidence, reproducibility and clarity):

      The manuscript explores a large database of human mtDNA sequences and performs some comparative analysis across mammals to characterise the profile of mtDNA mutations. It finds that some variants are surprisingly poorly represented in human mtDNA and suggests that mutational bias rather than selection is the dominant driver of this heterogeneity.

      This is an interesting message and an efficient and interpretable use of a large-scale dataset to shed light on biological mechanisms, which is a highly desirable philosophy. The factors shaping human mtDNA heterogeneity are of immense interest for several fields from population genetics to medicine, making this a valuable perspective. My comments are mainly quite fine-grained and reflect instances where I think the argument could be tighter, rather than fundamental flaws in the approach. In the cases where these points are due to my own naivety, I apologise and suggest that more explanation of these points could help other readers like me!

      I am happy to read that Reviewer 2 (Dr. Iain Johnston) finds my approach to be fundamentally sound, and I certainly appreciate the insightful comments and suggestions that he has provided.

      Reviewer 2 (Significance):

      I wrote the above review without realising the reviewer interface would be categorised in this way. Here's a repeat of my "significance" comments

      The manuscript explores a large database of human mtDNA sequences and performs some comparative analysis across mammals to characterise the profile of mtDNA mutations. It finds that some variants are surprisingly poorly represented in human mtDNA and suggests that mutational bias rather than selection is the dominant driver of this heterogeneity.

      This is an interesting message and an efficient and interpretable use of a large-scale dataset to shed light on biological mechanisms, which is a highly desirable philosophy. The factors shaping human mtDNA heterogeneity are of immense interest for several fields from population genetics to medicine, making this a valuable perspective.

      I am very pleased that the reviewer appreciates the importance and potential impact of my analysis. I agree that mtDNA heterogeneity is likely to be of high medical relevance.

      Response to Reviewer 2:

      R2Q1: The first paragraph is focused on humans without explicitly saying so; missing heritability is less of an issue in, for example, plants [Brachi et al., 2011. Genome biology, 12(10), pp.1-8]. This focus should be clearer (or the differences across kingdoms mentioned!). It's also worth noting that the argument about pathogenic variants being infrequent because of selection can only address missing heritability in pathogenic variants, and cannot (directly) inform the missing heritability in traits like height etc. Also, the whole motivation with respect to missing heritability currently comes across as a bit of a non sequitur. An introduction section could be used to help describe how the analysis of the provenance of mtDNA mutations contributes to the missing heritability question.

      R2A1: I agree that beginning the manuscript with a discussion of genome-wide association studies may distract the reader from the main topic at hand: the utility of variant frequency when predicting pathogenicity in humans. I have changed the Introduction accordingly.

      R2Q2: I also suggest that such an introduction section introduces the (later cited) previous work from Reyes and others on mutational profiles in mtDNA to set the scene.

      R2A2: I now provide these citations in the second paragraph of the Introduction. However, I do not expand further upon mutational propensities in that section, with an eye toward minimizing manuscript length toward publication as a short report.

      R2Q3: An early result, that 35% of possible synonymous mutations do not appear in a dataset, lacks a null hypothesis. Depending on the size of the dataset this may be very surprising or very unsurprising: an order of magnitude estimate of what proportion would be expected under uniform mutation and zero selection would help comparison here. I guess this can be as simple as 16k/3*4 << 200k. Also the ancestry of the dataset is important here: if all samples are highly related then a more homogenous mutational profile is unsurprising. Perhaps one could assign a quantity like an effective population size to the database and compare this to 16k/3*4?

      R2A3: The reviewer raises an excellent point regarding how 'surprising' it should be to the reader, previous to downstream analyses revealing transition/transversion biases, that so many synonymous substitutions are lacking within this dataset. While the authors of the HelixMT study removed mtDNA from highly related individuals from the analysis, the vast majority of the mtDNAs analyzed (91.2%) were from haplogroup N and of inferred European ancestry (doi.org/10.1101/798264). The authors of the HelixMTdb study do note that nearly all mtDNA lineages were present in the study, presumably encompassing roughly 100,000 years of human mtDNA evolution. That said, how this information alone may be used to quantitatively model expectations under zero selection is unclear.

      To address this question of whether sample diversity might be very limited in the HelixMTdb study, I have carried out additional analyses on this dataset. I now assess, for third codon positions allowing two-fold synonymous change (serine and leucine not included, due to their decoding by two different tRNAs), how often only one nucleotide was found at that position. For two-fold degenerate P3s, > 97% (n=1604) harbored both nucleotide possibilities within the database. This result strongly suggests that mtDNA diversity was well sampled in the HelixMTdb study, since a database consisting of highly related samples would presumably be characterized by a greater number of sites showing total identity. Moreover, when considering analyzed four-fold degenerate P3s (again, leucine and serine codons were omitted), only a very small number of sites showed no diversity (1%), with more than half of sites harboring at least three different bases. My interpretation is that the HelixMTdb authors have successfully sampled a very diverse set of human mitochondrial genomes. I have added these new analyses to the manuscript as Fig. 2a and 2b.
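
      The site-diversity tally described above can be sketched as follows. This is a minimal illustration on toy data, not the script used for the manuscript; the function name `fraction_polymorphic`, the site positions, and the input layout (one collection of observed bases per P3 site) are all assumptions made for the example.

```python
def fraction_polymorphic(sites, min_bases=2):
    """Return the fraction of sites at which at least `min_bases` distinct
    nucleotides were observed across all samples."""
    if not sites:
        raise ValueError("no sites supplied")
    polymorphic = sum(1 for bases in sites.values() if len(set(bases)) >= min_bases)
    return polymorphic / len(sites)


# Toy input (hypothetical positions and bases, not HelixMTdb data):
# each key is a P3 position, each value the bases seen at that position.
toy_two_fold_sites = {
    3308: {"T", "C"},
    3311: {"C"},
    3314: {"A", "G"},
    3317: {"T", "C"},
}

share = fraction_polymorphic(toy_two_fold_sites)
```

      Raising `min_bases` to 3 gives the corresponding tally for four-fold degenerate sites that harbor at least three different bases.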

      I have also changed the word 'surprising' to 'noteworthy' within the relevant portion of my manuscript text.

      R2Q4: I think some comments and additional framing of the diversity in the central database would be valuable and important for interpretation. I believe it has, for example, rather more European rows than African ones, thus (to take a very basic view) sampling a less diverse population more than a more diverse one.

      R2A4: I now state explicitly that the vast majority of the mtDNAs analyzed (91.2%) were from haplogroup N and of inferred European ancestry. Also, please see point R2A3 for further discussion of the human mtDNA diversity reflected within HelixMTdb.

      R2Q5: Another rhetorically important number lacking a comparison with a null is that guanine was detected at >3000 P3 positions accepting synonymous purine substitutions. This is cited as evidence that nucleotide frequencies at P3s don't reflect selection inherent to translation. But this link isn't clear -- if such selection was present, how different from 3000 would we expect this number to be? Isn't there a continuum of possibilities? Is the key idea that 3000 is greater than some other number, and if so, what is that?

      R2A5: The purpose of this figure is simply to demonstrate that no nucleotide is ruled out when considering silent substitutions at the P3 of any amino acid. This is consistent with (although does not prove, and I believe that the I-P3 analysis provides stronger evidence on this point) a minimal role for mitochondrial codon preference in mtDNA evolution. To reflect that my point is more general, and not to be taken as a quantitative comparison, I changed my text to: 'However, even considering the relative depletion of guanine from all four-fold degenerate P3s and two-fold degenerate purine P3s, guanine was nonetheless detected at thousands of P3 positions (Fig. 3b)'.

      R2Q6: I also wasn't clear whether/how the finding that little selection inherent to translation was implicitly extended to suggest little general selection overall. The following section only considers selection acting at specific P3 sites, thus implicitly discarding other hypotheses about general selection based on nucleotide content but not inherent to translation. Perhaps I am misunderstanding this translation link, but selection based on general nucleotide profiles (for example, due to thermodynamic stability [Samuels, Mech. Ageing Dev. 2005; 126: 1123-1129] or availability of nucleotides [Aalto & Raivio, Mech. Ageing Dev. 2005; 126: 1123-1129; Ott et al., Apoptosis. 2007; 12: 913-922]) would seem to still be on the table?

      R2A6: I would argue against selection upon nucleotide choice linked to local changes to mtDNA thermodynamic stability. Most prominently, when considering two-fold degenerate sites, nucleotide differences from the reference sequence were identified within the HelixMTdb at almost every analyzed position (Fig. 2a), even though hydrogen bond strength between opposing bases would be affected in every case (AT>GC or vice versa). Of course, my argument here applies generally, and there may be a small subset of sites for which nucleotide substitutions can cause a pronounced functional defect because of a change to local mtDNA structure.

      I would also argue against mitochondrial nucleotide availability as a source of selective pressure within the human population. When considering the entire L-strand sequence (NC_012920.1), nucleotide counts are as follows:

      A 5124

      C 5181

      G 2169

      T 4094

      And when considering both strands, nucleotide counts and frequencies are as follows:

      A 9218 (27.8%)

      C 7350 (22.2%)

      G 7350 (22.2%)

      T 9218 (27.8%)

      One nucleotide substitution would lead to a change in nucleotide frequencies by less than 0.02%. While the formal possibility exists that mitochondrial nucleotide availability lies exquisitely close to an important threshold, there is no current evidence to support this proposition. And here again, the diversity of P3 nucleotide choice found among the HelixMTdb samples would argue against this possibility.
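
      The arithmetic behind these figures can be checked with a short sketch. It assumes only the single-strand counts quoted above and standard Watson-Crick pairing; the variable and function names are illustrative.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Single-strand counts as quoted above.
single_strand = {"A": 5124, "C": 5181, "G": 2169, "T": 4094}


def both_strand_counts(counts):
    """Add each base's count to the count of its complement on the opposite strand."""
    return {base: counts[base] + counts[COMPLEMENT[base]] for base in counts}


both = both_strand_counts(single_strand)
total = sum(both.values())
percent = {base: round(100 * n / total, 1) for base, n in both.items()}

# A single substitution changes one base pair, so each affected base
# gains or loses one count out of `total`.
shift_percent = 100 / total
```

      With these inputs the two-strand counts reproduce the values listed above, and `shift_percent` is roughly 0.003%, comfortably below the 0.02% bound.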

      That said, it is worth noting that nucleotide frequencies, and mtDNA mutation rates relative to nuclear mutation rates do appear to differ among clades (PMID: 8524045 and 28981721). Therefore, while selection related to nucleotide availability seems an unlikely explanation for the variant frequencies that I have recovered at degenerate sites among human samples, I certainly would not rule out taxon-specific dietary, environmental, or physiological factors that, over longer evolutionary timescales, might shape mtDNA nucleotide frequencies.

      I would like to raise the possibility of another source of selection upon nucleotide choice. Specifically, one might propose that synonymous mtDNA substitutions could affect the binding of proteins controlling the replication, compaction, or expression of mtDNA. Indeed, an intriguing study has reported that human cells manifest a mtDNA footprinting pattern (PMID: 30002158), suggestive of regulatory sites bound to protein or sites of transcriptional pausing. However, Blumberg et al. found no statistically significant difference in human synonymous change at footprinted sites, arguing against a strong selective pressure on nucleotide choice at footprinted P3s. Moreover, footprinting sites identified in the above-mentioned study are conserved in mouse and human, but I have shown that all four nucleotides are acceptable at all four-fold degenerate sites (n=252), all two-fold degenerate pyrimidine sites (n=157), and 99% of two-fold degenerate purine sites (n=152) within the mammalian I-P3 set, again arguing against general limitations on nucleotide choice caused by protein association. These analyses cannot, however, totally rule out the possibility that a subset of individual P3s are under some selection due to their role in binding or traversal of proteins.

      R2Q7: A reptile is chosen as an outgroup for a comparative analysis of mammals. As always when a choice is made, the question arises: what if that choice was different? Perhaps the corresponding figures can be presented for two other choices of outgroup to demonstrate that there's nothing particularly unrepresentative about this reptile?

      R2A7: While preparing this revised manuscript, I have performed an updated analysis using the most current mammalian mtDNA dataset available on RefSeq. For these new tests, I used Iguana iguana, rather than Anolis punctatus, as an outgroup. The new results are essentially indistinguishable from my previous findings. Importantly, when old TSS values and new TSS values for I-P3 sites were compared by linear regression, the R-squared value is 0.9955, with a p-value of

      R2Q8: Another analysis involves classifying variant frequency into discrete groups based on percentage appearance, then seeking links with the TSS statistic. First, it is not clear why discretisation is needed here. A statistical model embracing the continuous nature of variant frequency requires fewer arbitrary choices (e.g. of numbers and boundaries of classes).

      R2A8: A primary audience of this manuscript will certainly be the human genetics community, which commonly speaks in terms of variant classes (e.g. 'common', 'rare', 'ultra-rare'). Therefore, I prefer to also use such classifications when analyzing the relationship between TSS and mtDNA variant frequency. I took advantage of the following references when generating frequency classifications:

      Bomba L, Walter K, Soranzo N. 2017. The impact of rare and low-frequency genetic variants in common disease. Genome Biol 18:77.

      McInnes G, Sharo AG, Koleske ML, Brown JEH, Norstad M, Adhikari AN, Wang S, Brenner SE, Halpern J, Koenig BA, Magnus DC, Gallagher RC, Giacomini KM, Altman RB. 2021. Opportunities and challenges for the computational interpretation of rare variation in clinically important genes. Am J Hum Genet 108:535–548.

      R2Q9: Second, an interpretation point here is in danger of equating absence of evidence with evidence of absence. Without an estimate of statistical power, an absence of a significant relationship cannot suggest that anything is likely or unlikely, only that there may not be sufficient power to detect an effect.

      R2A9: To address this point, I have changed my text as follows:

      Old: 'However, I detected no significant relationship between TSS and variant frequency for four-fold degenerate I-P3s (Fig. 2d), indicating that the highly elevated SSNE abundance at four-fold degenerate P3s is unlikely to be due to selection.'

      New: 'However, I detected no significant relationship between TSS and variant frequency for four-fold degenerate I-P3s (Fig. 2d), consistent with the idea that the highly elevated SSNE abundance at four-fold degenerate P3s is unlikely to be due to selection.'

      R2Q10: Figs 1a and 1e have a log vertical axis but I think the lowest points actually corresponds to zero? This is not compatible with a log axis and the zero position should be explicitly labelled with its own tick (perhaps in parentheses to highlight the discontinuity).

      R2A10: Quite correct, and I had neglected to clarify those details in the previous version of the manuscript. I now designate the samples with zero counts in the population using a smaller dot size, and I describe this approach in the figure legend.

      R2Q11: The methods are presented in an interesting way, with specific filenames for the code associated with each part of the pipeline explicitly provided. This is (very!) nice but it would also be good to describe in words what each piece of code does (e.g. "this was used as input for x.py, which counts the mutations and outputs a profile" or some such). This is indeed sometimes written but some parts lack an explanation.

      R2A11: I have now expanded my description of several scripts within the Methodology section.

      R2Q12: I could do with an additional sentence or two on the statistical analysis. As Kolmogorov-Smirnov tests examine differences between distributions, it's not immediately unambiguous how they are applied to total count statistics. Are count distributions with respect to variant frequency analysed for each amino acid separately? Or are the amino acids somehow ordered and the distributions across them compared? Or something else?

      R2A12: TSS distributions are held for each individual amino acid, which are then compared by Kolmogorov-Smirnov testing only within a given degeneracy category (four-fold degenerate, two-fold degenerate purine, two-fold degenerate pyrimidine). I have now elaborated upon this statistical test selection, and other details of the analysis, in the Methodology section.
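
      To illustrate the comparison described above, here is a minimal sketch of the two-sample Kolmogorov-Smirnov statistic applied to per-amino-acid TSS values. The TSS values are invented for the example, the function name `ks_statistic` is hypothetical, and only the test statistic is computed; a library routine such as `scipy.stats.ks_2samp` would also report a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute difference
    between the empirical cumulative distribution functions of the samples."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The ECDFs only change at observed values, so checking those is sufficient.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy TSS values (hypothetical) for two amino acids in the same degeneracy class.
tss_alanine = [2, 3, 3, 5, 8]
tss_glycine = [1, 2, 4, 4, 9]

d_stat = ks_statistic(tss_alanine, tss_glycine)
```

      Restricting each comparison to amino acids within one degeneracy category, as stated above, keeps differences in the number of possible synonymous changes from confounding the result.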

      Reviewer 2 (Referee Cross-commenting):

      I agree that codon bias is an interesting potential axis of selection. Even if the analysis rejects the hypothesis of selective effects inherent to translation, it is conceivable that codon bias could be shaped by selection in other indirect ways (depending on how "inherent" is defined, these could include tRNA/nucleotide availability, GC content and thermodynamic stability, etc). I think this aligns with my suggestion that modes of selection that are not directly linked to translation could be explored in more depth before discounting selective effects overall. IJ

      I hope that I have now successfully addressed points related to codon bias, GC content, and thermodynamic stability in the manuscript, as well as here in this response to the reviewers.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      The manuscript explores a large database of human mtDNA sequences and performs some comparative analysis across mammals to characterise the profile of mtDNA mutations. It finds that some variants are surprisingly poorly represented in human mtDNA and suggests that mutational bias rather than selection is the dominant driver of this heterogeneity.

      This is an interesting message and an efficient and interpretable use of a large-scale dataset to shed light on biological mechanisms, which is a highly desirable philosophy. The factors shaping human mtDNA heterogeneity are of immense interest for several fields from population genetics to medicine, making this a valuable perspective. My comments are mainly quite fine-grained and reflect instances where I think the argument could be tighter, rather than fundamental flaws in the approach. In the cases where these points are due to my own naivety, I apologise and suggest that more explanation of these points could help other readers like me!

      The first paragraph is focused on humans without explicitly saying so; missing heritability is less of an issue in, for example, plants [Brachi et al., 2011. Genome biology, 12(10), pp.1-8]. This focus should be clearer (or the differences across kingdoms mentioned!). It's also worth noting that the argument about pathogenic variants being infrequent because of selection can only address missing heritability in pathogenic variants, and cannot (directly) inform the missing heritability in traits like height etc. Also, the whole motivation with respect to missing heritability currently comes across as a bit of a non sequitur. An introduction section could be used to help describe how the analysis of the provenance of mtDNA mutations contributes to the missing heritability question. I also suggest that such an introduction section introduces the (later cited) previous work from Reyes and others on mutational profiles in mtDNA to set the scene.

      An early result, that 35% of possible synonymous mutations do not appear in a dataset, lacks a null hypothesis. Depending on the size of the dataset this may be very surprising or very unsurprising: an order-of-magnitude estimate of what proportion would be expected under uniform mutation and zero selection would help comparison here. I guess this can be as simple as 16k/34 << 200k. Also, the ancestry of the dataset is important here: if all samples are highly related then a more homogeneous mutational profile is unsurprising. Perhaps one could assign a quantity like an effective population size to the database and compare this to 16k/34? I think some comments and additional framing of the diversity in the central database would be valuable and important for interpretation. I believe it has, for example, rather more European rows than African ones, thus (to take a very basic view) sampling a less diverse population more than a more diverse one.
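
      As a rough illustration of the kind of null comparison requested here, the sketch below computes the fraction of possible variants expected to remain unobserved if a given number of independent mutation events fell uniformly at random over all possible variants, with no selection. The function names and the numbers in the usage lines are hypothetical and are not taken from the manuscript or the database; shared ancestry between samples would reduce the effective number of independent events.

```python
import math


def expected_fraction_unseen(n_events: float, n_possible: int) -> float:
    """Expected fraction of possible variants never hit when n_events
    independent mutation events land uniformly on n_possible variants."""
    return (1.0 - 1.0 / n_possible) ** n_events


def events_needed(fraction_unseen: float, n_possible: int) -> float:
    """Number of uniform, independent events that would leave the given
    fraction of possible variants unobserved (inverse of the above)."""
    return math.log(fraction_unseen) / math.log(1.0 - 1.0 / n_possible)


# Hypothetical numbers, for illustration only.
n_possible = 10_000
print(expected_fraction_unseen(10_000, n_possible))  # close to exp(-1)
print(events_needed(0.35, n_possible))               # events implied by 35% unseen
```

      Comparing the number of events implied by the observed unseen fraction with a plausible count of independent mutation events in the database gives a first sense of whether 35% is surprising.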

      Another rhetorically important number lacking a comparison with a null is that guanine was detected at >3000 P3 positions accepting synonymous purine substitutions. This is cited as evidence that nucleotide frequencies at P3s don't reflect selection inherent to translation. But this link isn't clear -- if such selection was present, how different from 3000 would we expect this number to be? Isn't there a continuum of possibilities? Is the key idea that 3000 is greater than some other number, and if so, what is that?

      I also wasn't clear whether/how the finding that little selection inherent to translation was implicitly extended to suggest little general selection overall. The following section only considers selection acting at specific P3 sites, thus implicitly discarding other hypotheses about general selection based on nucleotide content but not inherent to translation. Perhaps I am misunderstanding this translation link, but selection based on general nucleotide profiles (for example, due to thermodynamic stability [Samuels, Mech. Ageing Dev. 2005; 126: 1123-1129] or availability of nucleotides [Aalto & Raivio, Mech. Ageing Dev. 2005; 126: 1123-1129; Ott et al., Apoptosis. 2007; 12: 913-922]) would seem to still be on the table?

      A reptile is chosen as an outgroup for a comparative analysis of mammals. As always when a choice is made, the question arises: what if that choice was different? Perhaps the corresponding figures can be presented for two other choices of outgroup to demonstrate that there's nothing particularly unrepresentative about this reptile?

      Another analysis involves classifying variant frequency into discrete groups based on percentage appearance, then seeking links with the TSS statistic. First, it is not clear why discretisation is needed here. A statistical model embracing the continuous nature of variant frequency requires fewer arbitrary choices (e.g. of numbers and boundaries of classes). Second, an interpretation point here is in danger of equating absence of evidence with evidence of absence. Without an estimate of statistical power, an absence of a significant relationship cannot suggest that anything is likely or unlikely, only that there may not be sufficient power to detect an effect.
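
      The power question can be explored by simulation. The sketch below is a minimal Monte Carlo estimate of the power to detect a linear association between two continuous variables, using a Fisher z-test on the Pearson correlation. It assumes bivariate normal data and a two-sided 5% threshold; the function names, effect sizes and sample size are hypothetical and unrelated to the manuscript's actual data.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def power_for_correlation(true_r, n, n_sim=2000, z_crit=1.96, seed=0):
    """Fraction of simulated datasets of size n in which a true correlation
    of true_r is declared significant by a two-sided Fisher z-test."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [true_r * x + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1) for x in xs]
        z = math.atanh(pearson_r(xs, ys)) * math.sqrt(n - 3)
        if abs(z) > z_crit:
            hits += 1
    return hits / n_sim


# Hypothetical effect sizes and sample size, for illustration only.
for r in (0.0, 0.1, 0.5):
    print(r, power_for_correlation(r, n=50))
```

      A weak true effect is detected only rarely at this sample size, which is why a non-significant result alone says little without a power estimate.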

      Figs 1a and 1e have a log vertical axis but I think the lowest points actually correspond to zero? This is not compatible with a log axis and the zero position should be explicitly labelled with its own tick (perhaps in parentheses to highlight the discontinuity).

      The methods are presented in an interesting way, with specific filenames for the code associated with each part of the pipeline explicitly provided. This is (very!) nice but it would also be good to describe in words what each piece of code does (e.g. "this was used as input for x.py, which counts the mutations and outputs a profile" or some such). This is indeed sometimes written but some parts lack an explanation.

      I could do with an additional sentence or two on the statistical analysis. As Kolmogorov-Smirnov tests examine differences between distributions, it's not immediately unambiguous how they are applied to total count statistics. Are count distributions with respect to variant frequency analysed for each amino acid separately? Or are the amino acids somehow ordered and the distributions across them compared? Or something else?
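
      For reference, the two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between two empirical cumulative distribution functions, so it needs two samples of observations rather than two totals. The sketch below is a minimal pure-Python version; how the manuscript's counts are arranged into such samples is exactly the open question above, and the example inputs are hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x, so ties are handled
        # by evaluating both CDFs at the same point.
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    return d


# Hypothetical samples, for illustration only.
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```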

      Iain Johnston

      Significance

      I wrote the above review without realising the reviewer interface would be categorised in this way. Here's a repeat of my "significance" comments

      The manuscript explores a large database of human mtDNA sequences and performs some comparative analysis across mammals to characterise the profile of mtDNA mutations. It finds that some variants are surprisingly poorly represented in human mtDNA and suggests that mutational bias rather than selection is the dominant driver of this heterogeneity.

      This is an interesting message and an efficient and interpretable use of a large-scale dataset to shed light on biological mechanisms, which is a highly desirable philosophy. The factors shaping human mtDNA heterogeneity are of immense interest for several fields from population genetics to medicine, making this a valuable perspective.

      Referee Cross-commenting

      I agree that codon bias is an interesting potential axis of selection. Even if the analysis rejects the hypothesis of selective effects inherent to translation, it is conceivable that codon bias could be shaped by selection in other indirect ways (depending on how "inherent" is defined, these could include tRNA/nucleotide availability, GC content and thermodynamic stability, etc). I think this aligns with my suggestion that modes of selection that are not directly linked to translation could be explored in more depth before discounting selective effects overall. IJ

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The main message of this paper, as far as I understood since I am not a molecular bioinformatician but I am certainly interested in mtDNA variations especially related to disease, is that there is a very obvious bias among synonymous changes in the ORF of human mtDNA, more frequent for amino acids with 4 variants, more frequent in P position, and much more frequently characterized by transversion rather than transition substitutions. This survey is well written and, although edited in a rather technical language, the message is reachable and interesting. I also agree with the conclusions of the Author concerning the considerations that this set of new data should prompt one to draw when also considering non-synonymous, potentially pathogenic mutations. The only contribution I feel I can provide to this manuscript is to invite the Authors to consider the possibility that the selection may be due to a preferred codon bias, linked to the higher or lower compliance of different codons to be translated by the translational in situ machinery of mitochondria. I am not sure that this applies also for mitochondrial mitochondria and related factors (you may want to ask Aleksey Amunts in Stockholm or Bob Lightowlers or Zoscha Lightowlers in Newcastle on this matter). I do know that this is certainly a problem for recombinant proteins containing, for instance, mammalian MTS fused with a bacterial restriction enzyme; in most of the cases the bacterial sequence has to be recoded using the preferred codons for mammalian systems in order to increase translation by a eukaryotic (mammalian) translation machinery. I wonder whether you could discuss this possibility in your paper and maybe perform some further comparative measurement to test it.

      Significance

      The paper provides novel information on the structure and constraints of mtDNA variants in humans, opens an area of investigation which is new and potentially relevant, with some possible implications also on pathogenic mtDNA mutations in humans.

      Referee Cross-commenting

      I said in my first comment that I am not a bioinformatician, but Referee 2 did a great job in identifying some critical points and suggesting to the Authors how to address them. I maintain my opinion, which I think is shared by referee 2, that the paper conveys an interesting and rather unexpected message, and that if the Authors are able to properly answer the points raised by referee 2 the paper should be published. I confirm that the only contribution I feel I can provide to this manuscript is to invite the Authors to consider the possibility that the selection may be due to a preferred codon bias, linked to the higher or lower compliance of different codons to be translated by the translational in situ machinery of mitochondria. I wonder whether the Authors could consider this possibility in the Discussion and possibly perform some further comparative measurement to test it.

  3. Aug 2021
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reply to Reviewers:

      ## Comments by Reviewer 1

      For the sake of clarity, it may help to provide some table illustrating the proportion of gastruloids behaving precisely as the best example shown

      We thank the reviewer for raising this point. For the expression patterns of each of the main endodermal markers analyzed, at the timepoints considered, we will provide additional context on the variability among Gastruloids. We envisage providing such results in a format similar to that used e.g. in Supplementary Figure 4 of https://doi.org/10.1038/s41467-021-23653-4 [Xu, Peng-Fei, et al. "Construction of a mammalian embryo model from stem cells organized by a morphogen signalling centre." (2021)]; i.e. categorization of the phenotypes observed among gastruloids, and quantification of their proportions.

      It would also strengthen the message even more to provide some quantitation of co-expression for the main markers. As the behaviour seems very consistent, it is likely that such quantification would not be very arduous, and it would show the strength of the model.

      We again thank the reviewer for highlighting the value of co-localisation data, and in fact this is a suggestion also put forward by our other reviewer. We are currently developing a pipeline to segment the nuclei on the DAPI channel of each immunostained Gastruloid, and extract marker intensities within each cell. We envisage the results to be presented in a format similar to that shown e.g. in Fig. 2 of https://doi.org/10.1242/dev.159103 [Mulas, Carla, et al. "Oct4 regulates the embryonic axis and coordinates exit from pluripotency and germ layer specification in the mouse embryo." (2018)]. Such data would undoubtedly strengthen the reliability of the claims we here draw mostly from a qualitative inference of colocalisation.
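
      To make the idea of such a pipeline concrete, the toy sketch below thresholds a small 2D DAPI image, labels 4-connected foreground regions as nuclei, and averages a marker channel within each label. It is not the authors' pipeline, which is not described here; the function names, the threshold and the tiny arrays are hypothetical, and real Gastruloid images would need 3D segmentation and splitting of touching nuclei.

```python
from collections import deque


def label_nuclei(dapi, threshold):
    """Label 4-connected regions of pixels brighter than threshold.
    Returns (label image, number of labels); background is 0."""
    rows, cols = len(dapi), len(dapi[0])
    labels = [[0] * cols for _ in range(rows)]
    n_labels = 0
    for r in range(rows):
        for c in range(cols):
            if dapi[r][c] > threshold and labels[r][c] == 0:
                n_labels += 1
                labels[r][c] = n_labels
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one nucleus
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and dapi[ny][nx] > threshold
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = n_labels
                            queue.append((ny, nx))
    return labels, n_labels


def mean_marker_per_nucleus(labels, marker):
    """Mean marker intensity over the pixels of each labelled nucleus."""
    sums, counts = {}, {}
    for label_row, marker_row in zip(labels, marker):
        for lab, val in zip(label_row, marker_row):
            if lab:
                sums[lab] = sums.get(lab, 0.0) + val
                counts[lab] = counts.get(lab, 0) + 1
    return {lab: sums[lab] / counts[lab] for lab in sums}


# Hypothetical 3x4 images, for illustration only.
dapi = [[9, 9, 0, 0], [9, 9, 0, 8], [0, 0, 0, 8]]
marker = [[1, 3, 0, 0], [5, 7, 0, 10], [0, 0, 0, 20]]
labels, n = label_nuclei(dapi, threshold=5)
print(n, mean_marker_per_nucleus(labels, marker))
```

      Per-nucleus intensities of two markers obtained this way can then be plotted against each other to quantify co-expression.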

      The introduction could be shortened, and the results a bit more to the point.

      As this is also a point raised by both reviewers, we will make sure to go back over the entirety of the manuscript and make the necessary adjustments, especially for what concerns the Results section. We would however like to point out that the unusual length of our Introduction section is to be understood as a deliberate departure from the limits and conventions prescribed by traditional publishing formats. In line with our intentional choice of a journal-independent publication platform (i.e. a preprint server), and of a journal-independent review process, we also presented and contextualized our research in a journal-independent format. We see this as an opportunity to present research in a voice and in a form truer to that of the researchers that carried it out. We thank both reviewers for their feedback on the format and will certainly still make sure to minimize redundancy of information throughout our manuscript.

      ## Comments by Reviewer 2

      The following conclusions were not warranted by the findings:

      Endoderm emergence can occur in the absence of extraembryonic tissues and embryonic architecture: It is unclear if extraembryonic tissues (e.g., primitive endoderm and visceral endoderm-like cells) are absent in the early phase of gastruloid development.

      The reviewer raises an important point regarding the presence/absence of cells with extraembryonic endoderm identity within the early developing gastruloid. Our claim of absence of these cell types is grounded on previous gastruloid research (Turner, David A., et al. "Anteroposterior polarity and elongation in the absence of extra-embryonic tissues and of spatially localised signalling in gastruloids: mammalian embryonic organoids." (2017); van den Brink, Susanne C., et al. "Single-cell and spatial transcriptomics reveal somitogenesis in gastruloids." Nature 582.7812 (2020): 405-409.), which has indeed found no evidence of extraembryonic endoderm in Gastruloids. While other datasets do not include early Gastruloid stages where such cells may instead be detected, transcriptional analyses of later timepoint Gastruloids [Rossi, Giuliana, et al. "Capturing cardiogenesis in gastruloids." (2021)] also do not seem able to uniquely define extraembryonic endoderm signatures. Accordingly, we did not see this claim as having to be further substantiated.

      In light of the most recent reports (see https://doi.org/10.1038/s41467-021-23653-4 [Xu, Peng-Fei, et al. "Construction of a mammalian embryo model from stem cells organized by a morphogen signalling centre." (2021)]), the necessary absence of extraembryonic endoderm types within self-organising embryonic models may appear to need to be at least revisited. Xu et al indeed report extraembryonic endoderm cells in an embryoid model in many ways analogous to the Gastruloid system. These claims are mainly based on the recovery of DBA (Dolichos Biflorus Agglutinin) signal, a marker traditionally associated with extraembryonic endoderm in vivo. Yet, and based on our own unpublished exploration of the topic, we presently cannot confirm DBA lectins to be an exclusive marker of extraembryonic endoderm in an in vitro setting (i.e. in the cell types obtained from and amongst differentiating stem cells), as not only do we detect DBA positivity in wide cellular domains that are incompatible with any realistic estimate of the extraembryonic makeup of Gastruloids, but we also see this marker decorating the membrane of 2D colonies of pluripotent mouse embryonic stem cells maintained under 2iLIF conditions. When counterstaining for Ttr (Transthyretin; aka prealbumin) (a major discriminant of extraembryonic vs embryonic endoderm, as recovered from single-cell datasets; https://endoderm-explorer.com/), we further cannot detect positivity in either DBA+ or DBA- cells.

      To the best of the knowledge and evidence available to us, we thus still consider the absence of extraembryonic endoderm in Gastruloids to be a substantiated claim. We do, however, thank the reviewer for raising this important point and agree that dedicated characterization of extraembryonic endoderm signatures/markers in the developing Gastruloid could certainly help to support the validity of the claims made in previous Gastruloid literature, from which we draw the premise that no extraembryonic endoderm is present. To facilitate this process, and in fact to better contextualize our results in light of what is now published in Xu et al, we will now include detected DBA and Ttr (absent) patterns in early Gastruloids, as well as the expression patterns of extraembryonic endoderm markers for the timepoints for which single cell transcriptomics data is available (i.e. 96h onwards).

      While the gastruloid does not replicate the morphological feature of the post-gastrula embryo, it nevertheless has a certain degree of tissue organization. Perhaps the emergence of DE-like cells in 2-D culture would be a more convincing model for "the absence of extraembryonic tissues and embryonic architecture".

      The observation is correct: gastruloids do not replicate the architecture of the peri-gastrulation mouse embryo. Concomitantly, they display a striking degree of tissue (re-/)organization as they proceed throughout differentiation. We were deliberate in the presentation of our data, and as such we always refer to “absence of embryonic architecture” rather than “absence of architecture” (i.e. of any architecture), as the latter assertion would in fact be contrary to a main finding of our investigation.

      Indeed, the observation that Gastruloids, and specifically endodermal cells within them, can give rise to such developed tissue architectures, by self-organisation and without the need of externally-supplied matrices, is a major focus of our manuscript. Since Gastruloids start as an unstructured, epithelioid, cluster of stem cells, the architectural rearrangements observed in the mature Gastruloid highlight an intrinsic propensity of endodermal cells to form epithelia. A key aspect of this point is that these architectural rearrangements are carried out by cells in a landscape that is not architecturally similar to that of the embryo (specifically, where endoderm progenitors do not and cannot arise from a columnar epithelium and do not and cannot have a visceral-endoderm-like destination in which to intercalate).

      Considering what is discussed in this and in the previous point, we struggle to frame the Gastruloid as not a “convincing model for the absence of extraembryonic tissues and embryonic architecture", given that it satisfies both criteria. More complete and articulated discussions and appraisals of the value of Gastruloids and other 3D in vitro models towards uncovering fundamental features of embryonic development are available elsewhere, including in their comparison to 2D differentiation assays (van den Brink, Susanne C., and Alexander van Oudenaarden. "3D gastruloids: a novel frontier in stem cell-based in vitro modeling of mammalian gastrulation." (2021); Simunovic, Mijo, and Ali H. Brivanlou. "Embryoids, organoids and gastruloids: new approaches to understanding embryogenesis." (2017); Turner, David A., Peter Baillie‐Johnson, and Alfonso Martinez Arias. "Organoids and the genetically encoded self‐assembly of embryonic stem cells." (2016)). We refer readers to these reviews to help inform their own assessment on the matter.

      Of course, this is not to say that 2D culture models are not an equally valuable system to study peri-gastrulation development (if only as exemplified by https://doi.org/10.1186/s12915-014-0063-7 [Turner, David A., et al. "Brachyury cooperates with Wnt/β-catenin signalling to elicit primitive-streak-like behaviour in differentiating mouse embryonic stem cells." (2014)] and by the numerous studies on micropatterned stem cells). To the best of our knowledge, however, no 2D model of spontaneous, undirected endodermal differentiation has been investigated in detail and from a developmental perspective. We share the reviewer’s interest in the insight that this kind of approach could provide. Still, claims of absence of some degrees of architectural organisation or of extraembryonic tissues are not straightforward in self-organising 2D systems either. On the other hand, 2D approaches that consist in directed differentiation of stem cells to specific endodermal fates are clearly not the type of investigation we were interested in within the scope of our experimental questions. We also believe that e.g. differentiation of 2D epithelia that then bud to form 3D spheres are generally contexts that are too decoupled from embryonic modes of development to provide the same degree of developmental insight as e.g. Gastruloids. Having both 2D and 3D platforms at our availability, we opted for the latter.

      The following conclusions were not warranted by the findings: […]

      The FoxA2+/Sox17+ endoderm progenitors never transitioning through the mesenchymal intermediates and never leaving the epithelial compartment that they arise: Given that the stereotypic morphogenetic activity was not documented during the development of the gastruloid, it is not possible to exclude the possibility that the progenitors undergo a partial EMT (loss of epithelial feature and cellular polarity and display of morphogenetic movement, as in vivo) in the transition from progenitor to the epithelial endoderm cells. The DE-like cells when first discerned in the gastruloid are apparently epithelialized. In the absence of lineage tracing results, it is not clear whether they are still residing in the "epithelial compartment that they arise".

      We agree with the reviewer’s comment: the ability to trace endodermal cells throughout their journey in the Gastruloid and throughout differentiation, specifically in conjunction with a live monitor of their epithelial status (e.g. overlaid with a Cdh1 reporter) would provide clear and definitive insight on the endodermal and epithelial transitions taking place in this system. Our conclusions are based on timed immunostaining showing that in early Gastruloids all cells are epithelioid. Finding FoxA2+ (and Sox17+) cells consistently within a Cdh1+ context, while also having necessarily emerged within an epithelioid context, suggests that these cells never leave an epithelial compartment. Within the text, we also put forward alternative hypotheses equally consistent with our observations: namely that these cells would indeed leave the epithelial compartment (still, through incomplete EMT processes not relying on Snai1 programmes), but reintegrate it at short timescales. We do not find FoxA2+ cells within the mesodermal compartment, as one would expect from comparison with the embryo, and we do not see Snai1 expression within Cdh1+ cells. We would like a live tracing system whereby we could track endodermal identity (e.g. a FoxA2 reporter) while being also able to track its epithelial (E-cadherin, Cdh1) status. We do not foresee being able to perform such an experiment in the near future. Live tracking of Cdh1+ cells in Gastruloids has been described in [Hashmi, Ali, et al. "Cell-state transitions and collective cell movement generate an endoderm-like region in gastruloids." BioRxiv (2020).].

      We use the term “mesenchymal” to signify “not-epithelial”, and as such “Cdh1-negative”. When we hypothesise endoderm cells not to go through EMT, we imply a classical, complete, Snai1-mediated EMT. By “leaving the epithelial compartment” we mean “losing Cdh1 expression” (and as such not being associated anymore with the epithelial/epithelioid compartment). As such, we are not excluding the possibility put forward by the reviewer: i.e. endodermal progenitors going through a partial EMT with loss of epithelial architecture, but not epithelial markers, and movement in this epithelioid state. In fact, this is the interpretation we are favoring in our report. We will clarify our use of each term within the text of the preprint and provide more clarity to these points. Accordingly, we will rephrase each of the terms above (“mesenchymal”, “absence of EMT”, and “epithelial compartment”) in terms of the Cdh1 and Snai1 status of the cells.

      The mature endoderm cells are patterned segmentally in the gastruloid. The findings that the molecular phenotype (marker expression) of the mature endoderm cells "aligns with (cellular) identities along the entire length of the embryonic gut tube" are not sufficient evidence of spatial A-P patterning of endoderm cells. The expression pattern of Foxa2/Cdh1 (Fig 5d) was not informative of tissue patterning.

      We share the reviewer’s point that alignment of the molecular phenotype (transcript expression) of Gastruloid cells with that of cells at varying positions along the gut tube of the embryo does not in fact necessarily imply that these cells are spatially patterned within the Gastruloid. We were deliberate in our presentation of the data on this point, and in fact explicitly presented to readers the equally probable possibility that “the variety of cell identities uncovered in the single cell dataset are intermingled throughout the core of the Gastruloid” (rather than being spatially patterned; lines 787-789). This possibility is in fact provided as the rationale to highlight the need for alternative investigations able to provide spatial information (provided in the following section).

      Promisingly, the markers we did investigate (Pax9 for anterior endoderm; Cdx2 and TBra for posterior identities) were found to not be intermingled throughout the primordium but correctly expressed at anterior and posterior domains (as already known for TBra and Cdx2 from previous Gastruloid literature). We do agree that showing the distribution of AP markers within a same sample would provide a more immediate and compelling visual of the AP patterning of anterior and posterior markers. We also agree that the number of markers we investigated spatially is still restricted, and further characterisation, specifically of middle markers, would provide a more complete picture of the extent to which the Gastruloid primordium is effectively patterned. We would like however to point out that we do not make claims of spatial patterning beyond those supported by the markers we did confirm by immunostaining/HCR; i.e. we only claim spatial patterning of anterior and posterior domains (“Gastruloid endoderm contains patterned anterior and posterior endodermal types”; line 663).

      We agree with the reviewer that the expression pattern of Foxa2/Cdh1 is not informative of tissue patterning. When using these markers on Gastruloids, we indeed use them as “pan-endodermal” markers to identify the general cellular domain at the core of the Gastruloid.

      […] Whether endoderm cells are patterned or not is, however, irrelevant for the understanding of the mode of endoderm formation, unless the timing and the mechanism of allocation of endoderm cells of specific segmental property has been studied in the gastruloid.

      The relevance of the specific set of results indicated by the reviewer (i.e. the topic of patterning of endodermal cells in the Gastruloid) is maybe better appreciated in the context of understanding the modes of later development and maturation of endoderm in vitro, and its self-organising and self-patterning abilities after it has formed, rather than provide direct insights into the formation of the germ layer itself. In the preprint, we indeed present this investigation as a segue to the first set of experiments that instead focus on the formation of the endoderm itself. As the reviewer points out, we have not investigated the specific aspect of timing and mechanisms of allocation of endoderm cells to specific segmental identities as these cells are emerging within the Gastruloid. Our reporting that Gastruloids contain endodermal identities that do end up specified to different segmental identities in the first place sets the basis for the kind of investigation the reviewer suggests.

      it is also unclear if a structure reminiscent of the embryonic gut (closed or partly open) was formed (or self-organised) in the gastruloid.

      We thank the reviewer for raising this point. The structures we describe are initially multi-branched and whisk-shaped (120h), and in turn resolve into a single rod-like tissue that follows the outer geometry of the Gastruloid (144h), interfacing with an outer envelope of mesenchymal (non-endodermal) cells. We provide depth-coded maximal intensity projection of these structures (under Cdh1 immunostaining) to try to best convey images of their three-dimensional shape (Figure 4A, Figure 5C). While we believe this epithelial primordium to be fully closed and non-hollow, in fact quite different than the folding/folded epithelium of the post-gastrulating mouse gut endoderm, a better idea of the 3D organisation of this primordium could certainly be provided by outputs from light sheet imaging and series of optical sections along the z-axis of our immunostained samples. We plan to include these alternative visualisations and in the meantime refer to already published light-sheet data as found in [Rossi, Giuliana, et al. "Capturing cardiogenesis in gastruloids." (2021)]. Here too, the endodermal primordium appears as a dense, closed mass of cells. To contextualise these structures with respect to most recent literature, we do not see the kind of vacuolated and segmented structures described instead in https://doi.org/10.1038/s41467-021-23653-4 [Xu, Peng-Fei, et al. "Construction of a mammalian embryo model from stem cells organized by a morphogen signalling centre." (2021)].

      The information regarding the spatial localization of specific germ layer markers in the gastruloids at different timepoints would be important to understand how the morphology progresses and how it is comparable to the developing embryo itself. How is the organisation of the mesoderm and endoderm layers in comparison to embryo in the early timepoints and later timepoints of gastruloids?

      We agree with the reviewer about the helpfulness of such characterisations. While a temporal characterisation of the evolution of the endoderm compartment in relationship to the other cell types is provided in Figure 4A, cells of the other two germ layers are not explicitly labelled. We will provide analogous immunostaining series across Gastruloid development (early to late timepoints), choosing markers able to highlight cells of each of the three germ layers. Given the difficulty of finding specific germ layer markers, and the often-unforgiving limitations on the markers that can be chosen for simultaneous staining due to host-species antibody cross-reactivity, we also find it useful to piece together relative germ layer localisation from the much wider imaging data now widely available in previous Gastruloid literature. We thus point interested readers to complement the endoderm descriptions from our preprint with dedicated characterisations of the axial organisation and distribution of the other two germ layers in Gastruloids published in https://doi.org/10.1038/s41586-018-0578-0 [Beccari, Leonardo, et al. "Multi-axial self-organization properties of mouse embryonic stem cells into gastruloids." Nature 562.7726 (2018): 272-276.], which also specifically discusses these aspects in relation to the germ layer organisation of the embryo proper.

      Clarify if Foxa2 and Sox17 double positive cells exist in the Cdh1 patches (Fig 3a). In Fig 4, authors have demonstrated the development of epithelial primordium with overlaying mesodermal wings, however it is important to show if Foxa2, Sox17, or other definitive endoderm markers co-express in these cells.

      We thank the reviewer for highlighting the value of co-localisation data, and in fact this is a suggestion also put forward by our other reviewer. We are currently developing a pipeline to segment the nuclei on the DAPI channel of each immunostained Gastruloid, and extract marker intensities within each cell. We envisage the results to be presented in a format similar to that shown e.g. in Fig. 2 of https://doi.org/10.1242/dev.159103 [Mulas, Carla, et al. "Oct4 regulates the embryonic axis and coordinates exit from pluripotency and germ layer specification in the mouse embryo." (2018)]. Such data would undoubtedly strengthen the reliability of the claims we here draw mostly from a qualitative inference of colocalisation. When showing the distribution of markers within the 120h epithelial primordium, the assumption was that since the entire primordium is FoxA2+, any other marker would colocalise with a subset of them, when expressed within the primordium. We do note the importance of colocalisation data for a more exact description of the cell type identities contained within the primordium.

      It was suggested that E-Cadherin is maintained during endoderm differentiation. N-cadherin expression may be examined to determine if N-cad is expressed in the other region of gastruloids.

      We share the interest in describing N-cadherin expression patterns, especially in the context of EMT development and endoderm epitheliality, and thank the reviewer for highlighting the value of this marker. At the time of publication, we had unfortunately been unable to source an N-cadherin antibody giving good signal quality in our hands. We are planning further staining of early and late Gastruloids for this marker. Given what was recently reported in https://doi.org/10.1038/s41556-021-00694-x [Scheibner, Katharina, et al. "Epithelial cell plasticity drives endoderm formation during gastrulation." (2021)], we expect endodermal cells to display double cadherin expression (Cdh1+/Cdh2+), and the mesodermal compartment to display N-cadherin only.

      In Fig 6, FACS quantification is not proportional to the expression of the TBra:GFP as shown in the microscopic images at 96 hr, 120 hr. Fig 6D does not show the TBra:GFP positive cells on the y-axis in the top-left quadrant, even though it is quite visible in microscopy at 96 and 120 hr. Microscopic images suggest TBra signal is almost completely lost at 128 hr whereas FACS does not represent that. In fact, at 120 hours, the plot shows the opposite of what microscopy shows.

      The reviewer is correct in pointing out these discrepancies and we will make sure to flag them explicitly in the main body of our report. It appears that the trend in reporter expression highlighted by FACS data is delayed with respect to what is visible by live imaging (where TBra loss of expression can already be seen starting at t=120h). The decrease in TBra expression does not, however, seem to be recovered even by the last FACS timepoint. Concomitantly, the TBra signal detected during time-lapse seems to decrease to abnormally low levels (indeed, to be lost), especially if compared to equivalent timepoints processed for FACS. We will investigate whether TBra reporter signal is particularly vulnerable to sustained illumination (i.e. timelapse conditions), as it appears to still be present at later timepoints in both FACS and end-imaged Gastruloids (see clear posterior expression in Fig 6B). Maintained presence of TBra expression is also what is expected and detected by immunostaining in both our data and in general Gastruloid literature. In light of the possible effects of sustained imaging on our reporters, we only use timelapse data to describe global cell movement (and that of the FoxA2+ cells, rather than changes in reporter expression), and instead rely on the FACS data for claims based on intensity values. While we resolve the effect of timelapse imaging on TBra reporter detection, we will explicitly highlight the discrepancy between the two investigative approaches within the Results section, and thank the reviewer for bringing this point to our attention.

      Gastruloids were sampled at 96-168 hours for single cell transcriptome analysis. However, the specimens documented in this study were those only up to 144 hours. How does the gastruloid morphology look at 168 hours? It is essential to show the morphology and characterise the further development from 144 to 168 hours, to compare the single cell RNA seq data with the morphology of the gastruloid.

      The reviewer is correct in pointing out the absence of imaging data showing the internal endodermal primordium at t=168h. Examples of 168h Gastruloids (but not of the internal primordium) are shown in Figure 8D, where we compare transcriptomics data with spatial patterning and verify the spatial distribution of late gut markers. At this stage, the internal endodermal primordium is mostly similar to its configuration at t=144h. We will incorporate additional morphological data for Gastruloids at 168h to show the organisation of the endodermal primordium at this later stage and to facilitate morphology-transcriptome comparisons.

      In Fig 7, it is surprising to see that the proportion of cells in the two clusters 13, 4 that mark endoderm are a minor portion of the whole dataset collected, whereas the microscopic images suggest that the majority of the gastruloid structure from 120hr onwards is marked by Foxa2 and shows the epithelial primordium morphology as claimed.

      Microscopic images are optical sections through the midplane of each Gastruloid, so as to capture the full extent of the internal Gastruloid epithelial primordium. We believe that these pictures fail to accurately convey the degree to which this primordium is in fact completely surrounded by a thick and dense layer of mesenchymal cells, not only in the lateral dimension as can be appreciated e.g. in Figure 4B, but also above and below the plane of the microphotographs. A better idea of the volume occupied by such cells, and the proportion of endodermal vs non-endodermal cells, could be provided by lightsheet imaging. At later timepoints, and as hinted by e.g. panel 4B [144h], the volume occupied by non-endodermal cells can be considerable. Linking back to a previous suggestion from the reviewer, we believe documentation of Gastruloid morphology at 168h could help to further clarify the relationship between data coming from the single cell dataset and morphological data as seen from the Gastruloids themselves. This might also be a further opportunity to underscore that the single-cell dataset accessed for our endoderm-targeted analysis was produced in the context of a different study https://doi.org/10.1016/j.stem.2020.10.013 [Rossi, Giuliana, et al. "Capturing cardiogenesis in gastruloids." (2021)], in which Gastruloids made with this same cell line were additionally treated with cardiogenesis-inducing factors. The proportion of cells classified as cardiac mesoderm and associated mesodermal types is thus likely much over-represented compared to what is present in non-induced Gastruloids (i.e. those considered in this report), as in fact illustrated by the imaging data presented in the paper.

      The single-cell RNA-seq data should be analysed for the co-expression of multiple segment-specific cell markers to ensure that the mature endoderm cells align with high-confidence with the known cell types in different segments of the embryonic gut, and that the localization of representative cell types can be validated spatially along an endoderm structure in single gastruloids.

      We will analyze the single-cell RNA-seq dataset to show the co-expression of segment-specific markers. As the reviewer points out, we have only shown the patterns of expression of single markers at a time. As mentioned in a previous point, we also agree that the number of markers we verified to be spatially patterned is still restricted, and further characterisation, specifically of middle markers, would provide a more complete picture of the extent to which the Gastruloid primordium is effectively patterned. We also agree that showing the distribution of anterior and posterior markers within the same sample would provide a more immediate and compelling visual of the AP patterning.

      It is not clearly indicated how many replicates were performed to assure consistency/reproducibility of the gastruloid results. Statistical results were not provided for most of the immunostaining experiments, either in the main text or in the figure legends.

      Both reviewers highlight the qualitative nature of much of the data that is presented. Accordingly, we will more clearly and more consistently indicate the number of samples analysed and the number of replicates considered. As for the “statistical results” of immunostaining experiments, and as indicated in response to a previous comment, we envisage providing such results in a format similar to what is used e.g. in Supplementary Figure 4 of https://doi.org/10.1038/s41467-021-23653-4 [Xu, Peng-Fei, et al. "Construction of a mammalian embryo model from stem cells organized by a morphogen signalling centre." (2021)]; i.e. categorization of the phenotypes observed among gastruloids, and quantification of their proportions. To convey statistics on the variability among Gastruloids, our current pipeline can output scatterplots like the one presented in Figure 4C, where the datapoint spread informs about the variability of the data. We will provide statistics on these plots in a numerical format.

      Majority of the images presented in the manuscript are shown as Maximum Intensity Projections, and it is not clearly stated if the localisation of the cells expressing specific protein markers are present on the surface or in the internal layers of the gastruloids. Optical slices of the gastruloid images may be presented as supplementary information.

      Most of the images presented in the manuscript are optical cross-sections through the midplane of the Gastruloid. We can make this clearer in the text. Indeed, since this report focuses on the spatial distribution of markers along the AP axis of the Gastruloids (and not the DV or LR axes), we found midplane optical sections to best capture the entirety of the axis and thus the full extent of marker pattern (where those patterns are not asymmetric along the DV axis). All cells positive for immunostained markers are thus to be interpreted as lying within the midplane of the Gastruloid, and as such internal to the Gastruloid if falling internally to the midplane, and external to the Gastruloid when falling towards the edges. As suggested by the reviewer, we will acquire optical sections along the z-stack of our immunostained samples and provide them as supplementary information. Maximum intensity projections were only shown when wanting to better show the 3D structure of the internal epithelial primordium, and were always depth-coded to aid visualisation (Figure 4A, rightmost panel, Figure 5C).

      - Are prior studies referenced appropriately?

      Yes, except for the study on endoderm formation by lineage tracing in vivo, high-resolution single-cell analytics and functional analysis of genetic mutant embryos.

      We thank the reviewer for pointing out inappropriate referencing of studies on these three topics, yet given the breadth of each of these topics and the absence of any more specific information, we remain unsure how to address this comment appropriately. We would thus like to warn readers that this comment might have remained unaddressed.

      Results may be presented with reference to the data figures in the appropriate sequential order, or the figures may be re-organised to match the presentation in the Results.

      We thank the reviewer for bringing this to our attention. The figures are automatically organised in the order they are presented in the Results, yet since most of the figures are page-long their placement may appear odd. We will look into the matter and readjust figure positioning where possible, and/or reduce mentions of data shown in previous figures.

      Reduce the verbosity throughout the manuscript, especially the Results and Discussion.

      As this is a point raised by both reviewers, we will make sure to go back over the entirety of the manuscript and make the necessary adjustments, especially as concerns the Results section. We would however like to point out that the unusual length of our Introduction section is to be understood as a deliberate departure from the limits and conventions prescribed by traditional publishing formats. In line with our intentional choice of a journal-independent publication platform (i.e. a preprint server), and of a journal-independent review process, we also presented and contextualized our research in a journal-independent format. We see this as an opportunity to present research in a voice truer to that of the researchers that carried it out. We thank both reviewers for their feedback on the format and will certainly still make sure to minimize redundancy of information throughout our manuscript.

      This study on endoderm development, however, is confounded by the inherent limitation of the experimental model: lack of extraembryonic tissue components, the atypical morphological structure and the deviation from the in vivo schedule of development and morphogenesis. This may raise doubt of the relevance of the findings to the understanding of the morphogenetic activity and molecular control of endoderm formation during gastrulation in the embryo.

      We find it difficult to share the view expressed by the reviewer here. The “lack of extraembryonic tissue components, the atypical morphological structure, and the deviation from the in vivo schedule of development and morphogenesis”, which they identify here as the inherent limitations of the model, represent in fact its value for us and for most in the Gastruloid field. For a more complete and articulated discussion on how it is precisely the differences with the embryo proper that provide insights when studying in vitro models of embryonic development, we refer interested readers to dedicated discussions on the topic [van den Brink, Susanne C., and Alexander van Oudenaarden. "3D gastruloids: a novel frontier in stem cell-based in vitro modeling of mammalian gastrulation." (2021); Simunovic, Mijo, and Ali H. Brivanlou. "Embryoids, organoids and gastruloids: new approaches to understanding embryogenesis." (2017); Turner, David A., Peter Baillie‐Johnson, and Alfonso Martinez Arias. "Organoids and the genetically encoded self‐assembly of embryonic stem cells." (2016)]. In fact, we share the reviewer’s concern about the relevance of these findings to “understanding of the morphogenetic activity and molecular control of endoderm formation during gastrulation in the embryo”. The constant questioning of such relevance is in fact a central point in Gastruloid research, where insight comes from a dialectic comparison between what is observed in vitro and what is known to happen in vivo. The relevance of our findings might be better framed in that they provide a better “understanding of the morphogenetic activity and molecular control of endoderm formation” tout court: in this case, outside of the embryo (and inside a self-organising developmental system). Reflections on this inform a better understanding of how endoderm might be developing in vivo.

      Our findings in the Gastruloid open lines of inquiry to be verified and tested in the embryo, with both similarities and differences being equally informative on the intrinsic and extrinsic elements of endoderm behaviour. In fact, where some of the aspects we describe have been investigated in the embryo proper (see [Scheibner, Katharina, et al. "Epithelial cell plasticity drives endoderm formation during gastrulation." (2021)]), many of the same themes have emerged.

      Knowledge gleaned from the present study on the gastruloid study added little to that of a recent study of the morphogenetic program of endoderm formation in the mouse embryo and the ESC differentiation model (https://doi.org/10.1038/s41556-021-00694-x).

      The reviewer is absolutely correct in mentioning [Scheibner, Katharina, et al. "Epithelial cell plasticity drives endoderm formation during gastrulation." (2021)] to the readers of this public review. Given that these results were published after our preprint, we could only reference them in our later versions, and we did this extensively throughout the text. We consider the paper mentioned by the reviewer [Scheibner, Katharina, et al. "Epithelial cell plasticity drives endoderm formation during gastrulation." (2021)] to represent a major contribution to the topic of (mouse) endoderm development, specifically in its investigation of endoderm EMT mechanisms as they take place within the mouse embryo itself. In relationship with what we describe here in Gastruloids, we see what is reported by Scheibner et al. as extremely validating and as a very strong example of the investigative validity of in vitro models of development. Here is an in vivo exploration of the same topic highlighting many of the endoderm features we had inferred from in vitro observations, or that our observations further supported (specifically, incomplete EMT, tight association with epithelioid character, low evidence for mesendodermal intermediates, etc.).

      To say that our study added very little to what is now available in [Scheibner, Katharina, et al. "Epithelial cell plasticity drives endoderm formation during gastrulation." (2021)] represents, however, an inaccurate view of our study as being a study of endoderm development in vivo (see also answer to previous point). We would also want to point out that a major portion of this preprint describes the self-organisation of endoderm in vitro, the emergence and development of almost all AP-endoderm identities by self-organisation, the effective spatial patterning of at least some of these (awaiting further characterisation), and the description of an accessible, tractable, reproducible in vitro model system to study endoderm development and provide populations of interest for culture. Our study also provides further insight on the necessary inputs to endoderm development and patterning, and whether extracellular matrices and extraembryonic tissues are part of such necessary inputs. Sharing the view expressed by the other reviewer, we see great insight from all of these aspects, and these are certainly not the topic or focus of the study referenced by this reviewer.

      As we do throughout the preprint, we strongly encourage readers interested in the topic to refer to [Scheibner, Katharina, et al. "Epithelial cell plasticity drives endoderm formation during gastrulation." (2021)] for insights coming from the embryo proper. We also take the opportunity to stress the value of all types of science, be it incremental, consolidating, or complementary.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      • Are the key conclusions convincing?

      The following conclusions were not warranted by the findings: Endoderm emergence can occur in the absence of extraembryonic tissues and embryonic architecture: It is unclear if extraembryonic tissues (e.g., primitive endoderm and visceral endoderm-like cells) are absent in the early phase of gastruloid development. While the gastruloid does not replicate the morphological feature of the post-gastrula embryo, it nevertheless has a certain degree of tissue organization. Perhaps the emergence of DE-like cells in 2-D culture would be a more convincing model for "the absence of extraembryonic tissues and embryonic architecture".

      The FoxA2+/Sox17+ endoderm progenitors never transitioning through the mesenchymal intermediates and never leaving the epithelial compartment that they arise: Given that the stereotypic morphogenetic activity was not documented during the development of the gastruloid, it is not possible to exclude the possibility of the progenitors undergoing a partial EMT (loss of epithelial feature and cellular polarity and display of morphogenetic movement, as in vivo) in the transition from progenitor to the epithelial endoderm cells. The DE-like cells when first discerned in the gastruloid are apparently epithelialized. In the absence of lineage tracing results, it is not clear whether they are still residing in the "epithelial compartment that they arise".

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The mature endoderm cells are patterned segmentally in the gastruloid. The findings that the molecular phenotype (marker expression) of the mature endoderm cells "aligns with (cellular) identities along the entire length of the embryonic gut tube" are not sufficient evidence of spatial A-P patterning of endoderm cells. Only the spatial regionalization of Pax6-expressing cells (Fig. 8) and Cdx2-expressing cells (Fig 4C) were shown on different gastruloid specimens. The expression pattern of Foxa2/Cdh1 (Fig 5d) was not informative of tissue patterning. It is also unclear if a structure reminiscent of the embryonic gut (closed or partly open) was formed (or self-organised) in the gastruloid. Whether endoderm cells are patterned or not is, however, irrelevant for the understanding of the mode of endoderm formation, unless the timing and the mechanism of allocation of endoderm cells of specific segmental property has been studied in the gastruloid.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Specific points:

      1. The information regarding the spatial localization of specific germ layer markers in the gastruloids at different timepoints would be important to understand how the morphology progresses and how it is comparable to the developing embryo itself. How is the organisation of the mesoderm and endoderm layers in comparison to embryo in the early timepoints and later timepoints of gastruloids?
      2. Clarify if Foxa2 and Sox17 double positive cells exist in the Cdh1 patches (Fig 3a). In Fig 4, authors have demonstrated the development of epithelial primordium with overlaying mesodermal wings, however it is important to show if Foxa2, Sox17, or other definitive endoderm markers co-express in these cells.
      3. It was suggested that E-Cadherin is maintained during endoderm differentiation. N-cadherin expression may be examined to determine if N-cad is expressed in the other region of gastruloids.
      4. In Fig 6, FACS quantification is not proportional to the expression of the TBra:GFP as shown in the microscopic images at 96 hr, 120 hr. Fig 6D does not show the TBra:GFP positive cells on the y-axis in the top-left quadrant, even though it is quite visible in microscopy at 96 and 120 hr. Microscopic images suggest TBra signal is almost completely lost at 128 hr whereas FACS does not represent that. In fact, at 120 hours, the plot shows the opposite of what microscopy shows.
      5. Gastruloids were sampled at 96-168 hours for single cell transcriptome analysis. However, the specimens documented in this study were those only up to 144 hours. How does the gastruloid morphology look at 168 hours? It is essential to show the morphology and characterise the further development from 144 to 168 hours, to compare the single cell RNA seq data with the morphology of the gastruloid.
      6. In Fig 7, it is surprising to see that the proportion of cells in the two clusters 13, 4 that mark endoderm are a minor portion of the whole dataset collected, whereas the microscopic images suggest that the majority of the gastruloid structure from 120hr onwards is marked by Foxa2 and shows the epithelial primordium morphology as claimed.
      7. The single-cell RNA-seq data should be analysed for the co-expression of multiple segment-specific cell markers to ensure that the mature endoderm cells align with high-confidence with the known cell types in different segments of the embryonic gut, and that the localization of representative cell types can be validated spatially along an endoderm structure in single gastruloids.
      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      The suggested experiments may be accomplished in a few months.

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes for the methods. It is not clearly indicated how many replicates were performed to assure consistency/reproducibility of the gastruloid results.

      • Are the experiments adequately replicated and statistical analysis adequate?

      Statistical results were not provided for most of the immunostaining experiments, either in the main text or in the figure legends.

      Minor comments:

      • Specific experimental issues that are easily addressable.

      Majority of the images presented in the manuscript are shown as Maximum Intensity Projections, and it is not clearly stated if the localisation of the cells expressing specific protein markers are present on the surface or in the internal layers of the gastruloids. Optical slices of the gastruloid images may be presented as supplementary information.

      • Are prior studies referenced appropriately?

      Yes, except for the study on endoderm formation by lineage tracing in vivo, high-resolution single-cell analytics and functional analysis of genetic mutant embryos.

      • Are the text and figures clear and accurate?

      Results may be presented with reference to the data figures in the appropriate sequential order, or the figures may be re-organised to match the presentation in the Results.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      While taking into consideration the limitation of this embryo model for studying morphogenesis, highlight the interesting/unique findings of the gastruloid study, albeit they may have been discovered in the study of embryos in vivo.

      Reduce the verbosity throughout the manuscript, especially the Results and Discussion.

      Significance

      • Describe the nature and significance of the advance (e.g., conceptual, technical, clinical) for the field.

      This study demonstrated that definitive endoderm (DE)-like cells can be generated in the stem cell-derived embryo-like structure, the gastruloid. This observation is significant in that gastruloids can serve as an amenable experimental model, in comparison to other 2D in vitro differentiation models, for elucidating the requisite cellular process in endoderm differentiation and the acquisition of cell identity. It was inferred that, in the gastruloid, epithelial-mesenchyme transition may not be a requisite cellular process for the formation of the DE-like cells and that endoderm formation may not involve progression through an intermediate mesenchymal state. The progenitor cells may have acquired and retained the attribute of epithelization during differentiation and organization of the DE-like cells into an endoderm layer.

      This study on endoderm development, however, is confounded by the inherent limitation of the experimental model: lack of extraembryonic tissue components, the atypical morphological structure and the deviation from the in vivo schedule of development and morphogenesis. This may raise doubt of the relevance of the findings to the understanding of the morphogenetic activity and molecular control of endoderm formation during gastrulation in the embryo.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      Knowledge gleaned from the present study on the gastruloid study added little to that of a recent study of the morphogenetic program of endoderm formation in the mouse embryo and the ESC differentiation model (https://doi.org/10.1038/s41556-021-00694-x). This recent study has advanced the understanding of the functional attributes that segregate the endoderm progenitors and mesoderm progenitors in the primitive streak and the posterior epiblast, and has characterised the role of epithelial plasticity and the modulation of EMT activity under the control of Forkhead box transcription factor A2 and modulation of WNT signalling in the formation of the definitive endoderm.

      • State what audience might be interested in and influenced by the reported findings.

      Developmental biologists, stem cell scientists and embryo modellers

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Mouse embryogenesis, gastrulation, in vitro stem cell differentiation, advanced microscopy, single cell transcriptomics.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The study addresses a long-standing question in mouse gastrulation: the existence of a mesendodermal progenitor, similar to other species. Recent data in the mouse point towards a direct transition from epiblast to endoderm without going through a bipotent progenitor, and without loss of epithelial characteristics (notably Probst 2021). The authors take advantage of the gastruloid system to follow the emergence of endoderm cells. Through co-staining with markers of various germ layers at different timepoints, live imaging, and reanalysis of single cell RNASeq data, they propose a model in which endoderm cells differentiate from E-cadherin positive cells without going through a bipotent stage. Interestingly, they then organise a rod-like structure along the anterior-posterior axis of the gastruloid, which displays some polarity illustrated by a marker of anterior gut.

      The data presented show very high reproducibility, and the authors fully exploit the organoid system by analysing a large number of samples and showing very similar results. The results are qualitatively very convincing. For the sake of clarity, it may help to provide a table illustrating the proportion of gastruloids behaving precisely as the best example shown. It would also strengthen the message even more to provide some quantitation of co-expression for the main markers. As the behaviour seems very consistent, it is likely that such quantification would not be very arduous, and it would show the strength of the model.

      The manuscript is pleasantly written and all statements are clearly explained. It is a bit long though, in particular the introduction, some of which might read more like a review of the field. It is less striking in the Results part, but there is some level of repetition that is a bit distracting.

      Significance

      The question is important and timely, and the data are clear and convincing. There have been a number of publications addressing it in the last years, either in the embryo or in gastruloids (notably Hashmi 2021). All data appear to point in the same direction, and this study is certainly an important contribution. An original aspect to my knowledge is the self organisation of the central rod. The system is simple, reproducible, and opens novel possibilities to have a large number of cells available to explore the emergence of endoderm subpopulations.

      The fact that several studies converge is rather an advantage as they all have specificities, and I believe they are adequately cited here.

      In summary this is an important and well conducted study that may just benefit from some additional quantification to prove robustness. In terms of writing, it is pleasant and quite literary, but perhaps a little bit too much so. The introduction could be shortened, and the results a bit more to the point.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank the reviewers for their thoughtful comments and efforts towards improving our manuscript. Based on the reports, we have revised the manuscript entitled “Bi-phasic effect of gelatin in myogenesis and skeletal muscle regeneration (RC-2021-00854)”. We have addressed all the concerns and revised the text and figures. Modified parts are in red, and line numbers are tracked in this response letter. Our detailed point-by-point responses to the reviewers’ comments are listed below. We believe that these changes strengthen the manuscript and are grateful to the reviewers for all of their suggestions.

      Point-by-point description of the revisions

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The manuscript by Xiao Ling Liu and colleagues titled "Bi-phasic effect of gelatin in myogenesis and skeletal muscle regeneration" deals with the effect of gelatin on differentiation of myoblast cell line, in vitro, and on skeletal muscle regeneration upon muscle injury, in vivo. In vivo, the gelatin is a product of collagen breakdown associated with skeletal muscle regeneration upon acute or chronic muscle damage.

      Specifically, the authors define a dose-dependent effect of gelatin, beneficial at low dose and detrimental at high dose. This effect is mediated by the level of ROS accumulation leading to the induction of different cytokines with opposite effects on skeletal muscle regeneration.

      **Major comments:**

      The experimental purpose is well tackled from both biochemical and functional point of view, and the proposed experiments are quite exhaustive.

      Response: We are grateful for the encouraging comments from this reviewer.

      However, I would suggest some additional experimental analyses to improve the robustness and quality of the study, as well as text and figure editing, as reported below.

      Regarding the additional experiments/analyses/images:

      • Figure 5: I would suggest to add an image of C2C12 cells in GM (growth medium), as representative images of proliferation analysis upon LCG/HCG/NAC treatment.

      Response: We appreciate this suggestion by the reviewer. New images of C2C12 cells upon LCG/HCG/NAC treatment have now been added as Supplementary Figure 4B. The new results are described in main text Lines 237-239.

      • Figure 5: I would suggest to repeat the main si-NOX2 experiments with an alternative siRNA to rule out off target effects.

      Response: We thank the reviewer for this suggestion. We have now added a new siRNA targeting NOX2 with an independent sequence and shown the results in Figure 5F-J and Supplementary Figure 4H-J. The sequences of si-NOX2 and si-NC are shown in Materials and Methods Lines 681-687. The newly added siRNA confirmed the results obtained with the previous siRNA, suggesting that off-target effects are unlikely. This result is described in the main text Lines 246-256.

      • In vivo experiments could be improved by adding DHE or DCFH staining on muscle TA cryosections to quantify the level of oxidative stress.

      Response: We appreciate this suggestion by the reviewer. We have now stained DCFH-DA in TA cryosections to quantify oxidative status in situ and show the results in Supplementary Figure 8A-B. Indeed, low- and high-dose gelatin injections both triggered ROS production, and high-dose injection resulted in a high accumulation of ROS. The new result is described in the main text Lines 318-321.

      • The proposed model could be better tackled by additional in vivo treatment with Ab anti IL-6 or anti TNFalpha in combination with CTX and LCG or HCG, followed by H/E staining at 14 dpi.

      Response: We appreciate this suggestion by the reviewer. We have now added in vivo treatment with IL-6 or TNFα neutralizing antibody (Ab) in combination with CTX and LCG or HCG. The procedure is illustrated in Figure 8J and described in main text Lines 522-525. H&E staining showed that anti-IL-6 Ab injection significantly reduced the beneficial effect of LCG, but had no effect on HCG-treated mice. By contrast, anti-TNFα Ab injection significantly suppressed infiltration of macrophages into the injury site upon HCG and reversed the deleterious effect of HCG on muscle repair, resulting in myofibers with higher CSA and more myofibers with central nuclei. The new results are described in Figure 8J-K and main text Lines 331-346.

      **Minor comments:**

      Each acronym should be indicated in full in the main text at the first mention (for instance BHP, NAC and others). Moreover, I would suggest to add an acronym list for reagents and factors.

      Response: We thank the reviewer for this suggestion and have now added an acronym list in Material and Methods, Lines 466-504.

      Experimental methods should be better detailed; for instance I would suggest:

      • Add a detailed description of the quantification of differentiation indexes

      Response: We thank the reviewer for this suggestion. We have now added a detailed description of the quantification method of myogenesis and differentiation to Material and Methods, Lines 577-582.

      • Explain how cell growth (OD 450nm) and optical density (570nm) assays have been performed.

      Response: We thank the reviewer for pointing this out. The cell growth was examined by measuring dehydrogenase activities that generate a soluble formazan dye, whose OD 450nm value was measured as a proportional value to the number of viable cells in the sample (CCK-8 kits according to manufacturer’s instructions).

      The transwell cell migration ratio was determined by measuring the optical density of crystal violet staining (570 nm) of migrated cells on the bottom of the transwell filters (transwell filters have 8 μm pores to allow cells to pass through). The non-migrated cells on the top of the transwell filter were scraped away.

      Detailed descriptions have now been added to Material and Methods, Lines 593-609.

      • Explain how ROS species and antioxidant enzymes have been measured (Fig 4C and 4D)

      Response: We thank the reviewer for pointing this out. The levels of ROS species (O2·−, OH· and H2O2) and antioxidant enzymes in Figure 4C and 4D were examined using commercial kits according to the manufacturer’s instructions (Nanjing Jiancheng Bioengineering Institute, Nanjing, China). Briefly, cells were lysed in RIPA lysis buffer and protein concentrations were determined using the BCA method. The O2·− level and SOD activity were measured by adding electron transfer substances that reduce azo blue tetrazole to blue methionine. The activity of SOD was evaluated by the absorption of methionine. GSH-PX facilitates the reaction between H2O2 and GSH to produce H2O and oxidized glutathione (GSSG). The activity of GSH-PX can thus be obtained by measuring the consumption of GSH in this enzymatic reaction. OH· was measured using the Fenton reaction. The level of H2O2 was determined according to the reaction with molybdic acid. We have now added the corresponding information to the Figure 4 legend and main text Lines 1066-1067. Detailed protocols can be found in Material and Methods, Lines 704-714.

      Figures and figure legends:

      • Please add in the figures the figure number in order to facilitate the reading of the pdf file

      Response: We thank the reviewer for this suggestion and have added figure numbers to the PDF file.

      • The sequence of the panels should be coherent with the alphabet and reading left to right and up to down

      Response: Yes, we have rearranged the figure layouts to be coherent.

      • In the Figure 1, I would suggest to add the whole TA sections for H/E staining in order to appreciate the overall beneficial or detrimental effect of LCG and HCG, respectively.

      Response: We thank the reviewer for this suggestion and have added the whole-section view of TA with H&E staining as Supplementary Figure 1C. The results are described in the main text Lines 132-136.

      • I would suggest to show Supplementary figures 1C and 1D in the Figure 1.

      Response: We thank the reviewer for this suggestion and have now moved Supplementary Figures 1C and 1D to Figure 1F and 1G.

      In the Supplementary Fig 2A, I would suggest to the authors to show and comment only the data about proliferation: the earlier orientation, fusion and differentiation of C2C12 exposed to LCG are a consequence of the positive effect of LCG on proliferation

      Response: We thank the reviewer for this suggestion and have removed data and comments except for proliferation of C2C12 in Supplementary Figure 2A.

      • The quality of representative images of western blot is not always high: the bands are fuse and, consequently, the quantification is not reliable. For instance Fig S4 B, S4 G; in Fig 2 the representative image does not really represent the reported quantification.

      Response: We thank the reviewer for this suggestion and have replaced western blot images in Supplementary Figure 4B, 4G and Figure 2.

      • Please specify in the figure legends the meaning of the acronym in the axis title (for instance RFU, MFI or DCF) and in the axis title the unit of measure (for instance Count (?)).

      Response: We thank the reviewer for this suggestion and have spelt out “RFU” as “Relative fluorescence intensity”, changed “MFI” to “Relative fluorescence intensity of NOX2” in Figure 4H and Figure 8H. The unit of measurement is the fluorescence intensity. The related modifications have been described in the legend of Figure 4H and Figure 8H, Lines 1071, 1125-1126. We have now included “DCF: 2',7'-dichlorofluorescein” in the acronym list in Material and Methods, Line 473. DCF positive ratio is now used to reflect ROS level in the population. We have explained “Count” in the legend as “Cell counts”.

      • The authors always wrote "filed" in place of "field".

      Response: We apologize for this typo and have corrected it with others throughout the text.

      • In the figures 7 and 8 the letters for densitometry panel are missed.

      Response: We could not identify the missing labels and letters in the densitometry panels. We suspect this issue could arise from the software used to open the documents.

      • In the figure legend 8 the panel letters do not match the panels in the figure.

      Response: We thank the reviewer for pointing this mistake out and have corrected Figure legend 8.

      • Figure 3I: replace "nucleis" with "nuclei".

      Response: We thank the reviewer for pointing out this mistake and have modified Figure 3I.

      Text editing:

      • Line 64: Satellite cells (SCs) would be more appropriate than myoblasts

      Response: We thank the reviewer for this suggestion and have replaced myoblasts with satellite cells in the main text Line 60.

      • Line 64: Please define more carefully the location of SCs

      Response: We thank the reviewer for this suggestion. SCs are located between the basal lamina and the myofiber plasma membrane in resting skeletal muscle. This is now described in the main text Lines 60-65.

      • Line 67: MyoD+/Myog+ would be more appropriate than Pax7+/Myog+

      Response: We thank the reviewer for this suggestion and have changed Pax7+/MyoG+ to MyoD+/MyoG+ in the main text Line 64.

      • Line 67: (Pax7+)/Myog+ "myocytes" in place of myotubes ... and fuse with each other to generate myotubes

      Response: We thank the reviewer for pointing this out and have modified the sentence in Lines 64-65.

      • Line 69: Please add Myf6/MRF4 to MRFs list

      Response: We thank the reviewer for pointing this out and have added Myf6/MRF4 to the MRFs list in the main text Line 66.

      • Line 112: replace "its" with "their"

      Response: We thank the reviewer for pointing out this mistake and have replaced “its” with “their”.

      • Lines 330-331: "promoted IL-6" "enhanced TNF", please insert secretion/production

      Response: We thank the reviewer for this suggestion and have inserted “production and secretion” after IL-6 and TNFα in the main text Lines 310-312.

      • Line 352-353: this sentence is not necessary

      Response: We have deleted this sentence.

      Reviewer #2 (Significance (Required)):

      The study reports robust and interesting data applicable to both basic research and translational research, such as tissue engineering applications.

      Response: We thank the reviewer for sharing this positive and important opinion on our work. We are delineating the biological pathways of gelatin treatments, motivated by the application of this biocompatible and industrial material for treating disease and aging-related skeletal muscular dystrophies.

      Keywords for field of expertise of this reviewer:

      Skeletal muscle regeneration

      Duchenne Muscular Dystrophy

      Inflammation

      Macrophages

      Oxidative stress

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      The Review Commons submission by Liu and colleagues entitled "Biphasic effect of gelatin in myogenesis and skeletal muscle regeneration" provides a systematic in vitro and in vivo evaluation of the effects of myogenic cell exposure to low or high dose gelatin. Through these analyses they uncover pro- and anti-regenerative effects of gelatin that are dose dependent and through a series of cell and molecular studies, they attribute these dual effects to a ROS-IL6/TNFa signaling axis. The study is well designed and executed. For the most part the figures and experimental details are clear and transparent, and most of the conclusions are supported by the data. Specific comments follow.

      Response: We are grateful for the reviewer’s encouraging comments.

      **Major Comments**

      1. Study framing in the Introduction is mis-aligned with the research conducted. Currently the Introduction sets the stage for an in vivo exploration of the effects of endogenous gelatin produced in the course of muscle regeneration. However, there are no experiments in this paper investigating the presence or effects of endogenous gelatin.

      Response: We thank the reviewer for pointing this out and apologize for the misleading parts in our abstract and introduction. The reported phenomenon of a temporary breakdown of collagen after skeletal muscle injury has inspired us to apply gelatin for achieving a pro-regenerative effect. Although it will be a very interesting biological pathway to study, no adequate research tools are available for measuring endogenous gelatin in vivo. We have rewritten parts of the abstract and introduction to avoid confusion; see Lines 34-35 and 78-93.

      1.a. The impact of the study would indeed be increased by including a systematic characterization of endogenous gelatin levels during muscle regeneration in healthy mice as compared to in those where fibrosis is prevalent. A demonstration that ROS-IL6/TNFa levels align with patterns seen in the in vitro studies, and pharmacological manipulations to 'rescue' would all provide a demonstration of a hormesis gelatin response in vivo. Meaning, is this process something that naturally occurs in the physiological context, or is it one that is possible, but only by supraphysiological gelatin injections?

      Response: We agree with the reviewer and would have loved to investigate whether what we have observed is a fundamental mechanism of the natural healing process. However, we are currently limited by the lack of adequate tools for endogenous gelatin quantification. We appreciate the reviewer’s suggestion to compare normal and aberrant repair processes such as fibrosis and would like to explore the possibility in a separate future study.

      1.b. Alternatively to 1.a., the authors should reframe the Introduction to focus on understanding the effects of gelatin as a biomaterial that is being used in regenerative medicine applications. In this case, the authors should delete/edit/reframe lines 74-102 and instead use lines 103 on to motivate the study so as to be consistent.

      Response: We have rephrased sections of Abstract and Introduction to focus on gelatin as a biomaterial. Please find the changes in the main text Lines 34-35, 78-93.

      2. Satellite cell conclusions in Figures 2C-D that are based upon representative images provided in 2A, are questionable. Pax7 staining in mouse tissue sections is notoriously difficult and the antibodies can have dramatic lot to lot variability. The immunostaining provided in the representative images is not convincing, and hence, draws into question the conclusions based upon them.

      Response: We thank the reviewer for the suggestion and have optimized Pax7 staining based on the published protocol by Feng et al., 2018 in JoVE. The new images can be found in Figure 2A.

      2.a. If the authors wish to leave the satellite cell conclusions in their study, they will need to optimize their Pax7 staining and repeat this study. They should focus on Pax7+ objects that contain a nucleus and are located below the basal lamina. Also, the word 'activation' in line 180 should be edited to 'expansion' as the histological analysis and study design preclude an evaluation of satellite cell activation.

      Response: Our new results after optimizing the staining protocol show that low-dose gelatin injection increased the number of Pax7+ cells (green) underneath the basal lamina (red) in injured TA muscle, whereas high-dose gelatin injection reduced the number of Pax7+ SCs at 7 D.P.I. The new results are shown in Figure 2A and described in the main text Lines 152-155.

      We have changed the word “activation” to “expansion” according to the reviewer’s suggestion in the main text Line 161.

      2.b. Alternatively to 2.a., it would not diminish the impact of this study to remove the 'satellite cell' findings in their entirety from the manuscript.

      Response: Please see above. We hope the new images are convincing to this reviewer.

      3. It is surprising that the molecular hallmarks of low vs high gelatin injection shown in Fig. 8 would still be present at a time point 2-weeks after the initial injection.

      Response: Please see below 3.b.

      3.a. It would increase the impact of the study to better understand the basis of this surprising observation. This point links to point 1.a. as one would ideally need to quantify baseline gelatin levels pre-injury and post-injury. For example, is the injected gelatin still present 14-days after injection? Or is the MMP profile altered in a way that sustains these levels one direction or the other? Etc.

      Response: Please see below 3.b.

      3.b. Alternatively to 3.a., the authors should use the Discussion to note this point and speculate on the significance.

      Response: We thank the reviewer for making this important point. The sustained effect of gelatin materials has been reported in several previous studies, and we have now discussed possible mechanisms such as MMP expression profiles and lasting interplay between SCs, macrophages, ECM, and myoblasts. See the main text Lines 437-459.

      **Minor Comments**

      • The authors should conduct a careful review of the manuscript to address minor typos and grammatical errors.

      Response: We thank the reviewer for pointing this out and have carefully reviewed the manuscript and corrected minor typos and grammatical errors.

      • It is unclear from the Figure Legends, Results, or Methods what the 'PBS' condition in Figure 1 refers to. Is this the uninjured control? If so, consider using 'Cntrl' as the label and then defining it in the figure legend for clarity.

      Response: We thank the reviewer for pointing this out. Yes, PBS/phosphate-buffered saline is the vehicle for delivering CTX thus representing the uninjured model. In the revised manuscript, we have replaced “PBS: phosphate-buffered saline” with “Ctrl” in Figures, Figure Legends, Results and Methods.

      • It is unclear from the Figure Legends, Results, or Methods what is quantified to read-out the 'myogenesis index' and 'fusion index' that is reported in Fig. 5D & E. Please reconcile.

      Response: This point has been addressed in the previous section. Myotube formation was quantified using myogenesis and fusion index. Myogenesis index (% nuclei within MyHC-stained myocytes/total nuclei) and fusion index (% nuclei in myotubes with >5 nuclei/total nuclei in MyHC-stained cells) are now explained both in legends and in Materials and Methods. Please see the main text Lines 577-582.

      Reviewer #3 (Significance (Required)):

      The manuscript constitutes a technical advance, and offers a molecular mechanism, in support of the notion that intramuscular injection of low dose gelatin serves to expedite the process of skeletal muscle regeneration. This study has translational implications for a regenerative medicine application of this knowledge. It is my opinion that this aspect of the study is well supported by the results and requires <1 month of additional edits to finalize the manuscript.

      Response: We thank the reviewer for sharing this positive and important comment. Indeed, the motivation of this work was to delineate biological pathways of gelatin treatments in myogenesis and muscle regeneration for potential therapeutic applications.

      The manuscript would constitute both a technical and a conceptual advance by addressing Major Point 1a as the authors would show, for the first time, that low and high dose gelatin levels naturally exist in vivo to mediate the process of muscle endogenous repair. This would be highly significant, because as the authors rightly point out in Lines 74-76 of their manuscript "...by what mechanisms ECM regulates the functional, morphological, and molecular events of skeletal muscle regeneration remain poorly understood". It is my opinion that making this latter point would require 1-3 months of additional studies.

      Response: We thank the reviewer for pointing to this important scientific direction. Regrettably, the lack of adequate tools for quantifying endogenous gelatin in vivo currently makes such a study prohibitive.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The Review Commons submission by Liu and colleagues entitled "Biphasic effect of gelatin in myogenesis and skeletal muscle regeneration" provides a systematic in vitro and in vivo evaluation of the effects of myogenic cell exposure to low or high dose gelatin. Through these analyses they uncover pro- and anti-regenerative effects of gelatin that are dose dependent and through a series of cell and molecular studies, they attribute these dual effects to a ROS-IL6/TNFa signaling axis. The study is well designed and executed. For the most part the figures and experimental details are clear and transparent, and most of the conclusions are supported by the data. Specific comments follow.

      Major Comments

      1. Study framing in the Introduction is mis-aligned with the research conducted. Currently the Introduction sets the stage for an in vivo exploration of the effects of endogenous gelatin produced in the course of muscle regeneration. However, there are no experiments in this paper investigating the presence or effects of endogenous gelatin.

      1.a. The impact of the study would indeed be increased by including a systematic characterization of endogenous gelatin levels during muscle regeneration in healthy mice as compared to in those where fibrosis is prevalent. A demonstration that ROS-IL6/TNFa levels align with patterns seen in the in vitro studies, and pharmacological manipulations to 'rescue' would all provide a demonstration of a hormesis gelatin response in vivo. Meaning, is this process something that naturally occurs in the physiological context, or is it one that is possible, but only by supraphysiological gelatin injections?

      1.b. Alternatively to 1.a., the authors should reframe the Introduction to focus on understanding the effects of gelatin as a biomaterial that is being used in regenerative medicine applications. In this case, the authors should delete/edit/reframe lines 74-102 and instead use lines 103 on to motivate the study so as to be consistent.

      2. Satellite cell conclusions in Figures 2C-D that are based upon representative images provided in 2A, are questionable. Pax7 staining in mouse tissue sections is notoriously difficult and the antibodies can have dramatic lot to lot variability. The immunostaining provided in the representative images is not convincing, and hence, draws into question the conclusions based upon them.

      2.a. If the authors wish to leave the satellite cell conclusions in their study, they will need to optimize their Pax7 staining and repeat this study. They should focus on Pax7+ objects that contain a nucleus and are located below the basal lamina. Also, the word 'activation' in line 180 should be edited to 'expansion' as the histological analysis and study design preclude an evaluation of satellite cell activation.

      2.b. Alternatively to 2.a., it would not diminish the impact of this study to remove the 'satellite cell' findings in their entirety from the manuscript.

      3. It is surprising that the molecular hallmarks of low vs high gelatin injection shown in Fig. 8 would still be present at a time point 2-weeks after the initial injection.

      3.a. It would increase the impact of the study to better understand the basis of this surprising observation. This point links to point 1.a. as one would ideally need to quantify baseline gelatin levels pre-injury and post-injury. For example, is the injected gelatin still present 14-days after injection? Or is the MMP profile altered in a way that sustains these levels one direction or the other? Etc.

      3.b. Alternatively to 3.a., the authors should use the Discussion to note this point and speculate on the significance.

      Minor Comments

      1. The authors should conduct a careful review of the manuscript to address minor typos and grammatical errors.
      2. It is unclear from the Figure Legends, Results, or Methods what the 'PBS' condition in Figure 1 refers to. Is this the uninjured control? If so, consider using 'Cntrl' as the label and then defining it in the figure legend for clarity.
      3. It is unclear from the Figure Legends, Results, or Methods what is quantified to read-out the 'myogenesis index' and 'fusion index' that is reported in Fig. 5D & E. Please reconcile.

      Significance

      The manuscript constitutes a technical advance, and offers a molecular mechanism, in support of the notion that intramuscular injection of low dose gelatin serves to expedite the process of skeletal muscle regeneration. This study has translational implications for a regenerative medicine application of this knowledge. It is my opinion that this aspect of the study is well supported by the results and requires <1month of additional edits to finalize the manuscript.

      The manuscript would constitute both a technical and a conceptual advance by addressing Major Point 1a as the authors would show, for the first time, that low and high dose gelatin levels naturally exist in vivo to mediate the process of muscle endogenous repair. This would be highly significant, because as the authors rightly point out in Lines 74-76 of their manuscript "...by what mechanisms ECM regulates the functional, morphological, and molecular events of skeletal muscle regeneration remain poorly understood". It is my opinion that making this latter point would require 1-3 months of additional studies.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Xiao Ling Liu and colleagues titled "Bi-phasic effect of gelatin in myogenesis and skeletal muscle regeneration" deals with the effect of gelatin on differentiation of myoblast cell line, in vitro, and on skeletal muscle regeneration upon muscle injury, in vivo. In vivo, the gelatin is a product of collagen breakdown associated with skeletal muscle regeneration upon acute or chronic muscle damage.

      Specifically, the authors define a dose-dependent effect of gelatin, beneficial at low dose and detrimental at high dose. This effect is mediated by the level of ROS accumulation leading to the induction of different cytokines with opposite effects on skeletal muscle regeneration.

      Major comments:

      The experimental purpose is well tackled from both biochemical and functional point of view, and the proposed experiments are quite exhaustive. However, I would suggest some additional experimental analyses to improve the robustness and quality of the study, as well as text and figure editing, as reported below.

      Regarding the additional experiments/analyses/images:

      • Figure 5: I would suggest to add an image of C2C12 cells in GM (growth medium), as representative images of proliferation analysis upon LCG/HCG/NAC treatment.
      • Figure 5: I would suggest to repeat the main si-NOX2 experiments with an alternative siRNA to rule out off target effects.
      • In vivo experiments could be improved by adding DHE or DCFH staining on muscle TA cryosections to quantify the level of oxidative stress.
      • The proposed model could be better tackled by additional in vivo treatment with Ab anti IL-6 or anti TNFalpha in combination with CTX and LCG or HCG, followed by H/E staining at 14 dpi.

      Minor comments:

      Each acronym should be indicated in full in the main text at the first mention (for instance BHP, NAC and others). Moreover, I would suggest to add an acronym list for reagents and factors

      Experimental methods should be better detailed; for instance I would suggest:

      • Add a detailed description of the quantification of differentiation indexes
      • Explain how cell growth (OD 450nm) and optical density (570nm) assays have been performed
      • Explain how ROS species and antioxidant enzymes have been measured (Fig 4C and 4D)

      Figures and figure legends:

      • Please add in the figures the figure number in order to facilitate the reading of the pdf file
      • The sequence of the panels should be coherent with the alphabet and reading left to right and up to down
      • In the Figure 1, I would suggest to add the whole TA sections for H/E staining in order to appreciate the overall beneficial or detrimental effect of LCG and HCG, respectively.
      • I would suggest to show Supplementary figures 1C and 1D in the Figure 1
      • In the Supplementary Fig 2A, I would suggest to the authors to show and comment only the data about proliferation: the earlier orientation, fusion and differentiation of C2C12 exposed to LCG are a consequence of the positive effect of LCG on proliferation
      • The quality of representative images of western blot is not always high: the bands are fuse and, consequently, the quantification is not reliable. For instance Fig S4 B, S4 G; in Fig 2 the representative image does not really represent the reported quantification.
      • Please specify in the figure legends the meaning of the acronym in the axis title (for instance RFU, MFI or DCF) and in the axis title the unit of measure (for instance Count (?)).
      • The authors always wrote "filed" in place of "field"
      • In the figures 7 and 8 the letters for densitometry panel are missed.
      • In the figure legend 8 the panel letters do not match the panels in the figure
      • Figure 3I: replace "nucleis" with "nuclei"

      Text editing:

      • Line 64: Satellite cells (SCs) would be more appropriate than myoblasts
      • Line 64: Please define more carefully the location of SCs
      • Line 67: MyoD+/Myog+ would be more appropriate than Pax7+/Myog+
      • Line 67: (Pax7+)/Myog+ "myocytes" in place of myotubes ... and fuse with each other to generate myotubes
      • Line 69: Please add Myf6/MRF4 to MRFs list
      • Line 112: replace "its" with "their"
      • Lines 330-331: "promoted IL-6" "enhanced TNF", please insert secretion/production
      • Line 352-353: this sentence is not necessary

      Significance

      The study reports robust and interesting data applicable to both basic research and translational research, such as tissue engineering applications.

      Keywords for field of expertise of this reviewer:

      Skeletal muscle regeneration

      Duchenne Muscular Dystrophy

      Inflammation

      Macrophages

      Oxidative stress

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Firstly, we would like to thank the reviewers for their helpful and insightful comments.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In the manuscript of Ramadan et al., the authors use an ex vivo organoid approach to compare gene expression in organoids derived from adult-type stem cells when these organoids are grown in different matrices. The presence of collagen type I induces the emergence of cells with a transcriptome similar to that of fetal progenitors. In contrast, laminin, the main component of Matrigel, induces a protruded organoid phenotype with a stem-cell-type transcriptome. They then correlate these data with the expression of collagens and laminins in publicly available data. They show by qRT-PCR that laminins are more highly expressed in mesenchymal than in epithelial fractions postnatally. On this basis, they hypothesize that remodeling at the postnatal stage likely depends only on the mesenchymal compartment and involves the interaction of laminins with integrin α6.

      It seems that some of the presented data have already been described and could not be considered as « novel ».

      For some of the statements, like this one « the basement-membrane produced by the epithelium is not sufficient to increase stem cell numbers and induce a morphological crypt formation », the conclusion is not sustained by provided experiments. To draw definitive conclusion on this particular point, authors could reproduce the experiment presented in Fig. 4d but using Cre recombinases specific for mesenchymal and epithelial compartments rather than the ubiquitous Cre line. It would be interesting to investigate if organoids grown from lamc1-/- mice can generate protruded organoids or not.

      In addition, how should one interpret the fact that « fetal organoids up » is associated with « laminin interactions » in Fig. 1c?

      The statement that the epithelium-produced basement membrane is not sufficient to increase stem cell numbers is based on our in vitro observations. Analysis of the RNAseq data shows that the expression of several laminins is increased on collagen (see heatmap of laminin interactions below, which will be added to the manuscript). This is also the reason why ‘laminin interactions’ is highly significant in the gene set enrichment analysis (Fig. 1C). Despite this upregulation, we never observed morphological changes (or expression changes) as when laminin is added to the collagen-hydrogel. In addition, we showed that the vast majority of ECM components is produced by the mesenchyme in vivo, in line with previous literature as cited in the manuscript. The mentioned Cre lines to address the question in vivo are unfortunately not available to our collaborators with the Lamc1 k.o. mice and it would therefore take too long to perform these experiments.

      However, to address this point in vitro we will grow organoids from Lamc1 fl/fl mice and induce loss of laminin in the pure epithelial cell culture. Organoids will then be analysed for morphological changes, as well as proliferation and gene expression changes.

      One major point to address concerns statistics. In the Materials and Methods, the paragraph describing statistical analyses is missing. Moreover, in the figures presenting qPCR data (Figs. 1g, 3b, 3d, 3g, 4c and 4f), no statistical analysis is provided, and the number of samples for some conditions is extremely limited (n=2). In general, the term « independent experiment » should be clarified: does it correspond to one organoid line for which the experiment was repeated, or to one single experiment using different organoid lines?

      In Fig. 4c, all collagen conditions are set to 1.

      The avoidance of statistical inference for most of the experiments was a deliberate choice. In line with several comments (e.g. Vaux, D. L. (2012) Know when your numbers are significant. Nature 492, 180–181), we chose to show all individual data points (with the exception of Fig. 3D, n=5, to ease interpretation) without statistics. In addition, for most expression data, we have data from RNAseq, single-cell RNAseq and qPCRs repeated at different hydrogel concentrations to obtain reliable results. Further, the in vivo mesenchymal qPCR expression data were validated with RNA in situ hybridization showing the mainly mesenchymal expression.

      The term independent experiment was used mainly for repeated experiments with the same organoid lines (the exception being the RNAseq data, for which different organoids were derived from individual mice). While conducting these experiments, we realised that their variability comes from time in culture, cell density and even Matrigel variation. The experiment in Fig. 4c (n=4, each time with all the controls) was performed with longer intervals in between and showed variation in the absolute levels of expression. However, relative to each control, we believe the effect is clear.
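
      To illustrate what normalising each repeat to its own control means in practice, the following is a minimal sketch only. It is not the analysis code used for the manuscript; the function name, condition labels and expression values are all hypothetical. It shows why every collagen control becomes 1 after normalisation, while the fold change in the treated condition can still be compared across repeats whose absolute levels differ.

```python
def normalise_to_control(repeats, control="collagen"):
    """Divide every condition in a repeat by that repeat's own control value.

    `repeats` is a list of dicts mapping condition name -> expression level.
    The condition names and values used below are hypothetical.
    """
    normalised = []
    for rep in repeats:
        ref = rep[control]  # control measured within the same repeat
        normalised.append({cond: value / ref for cond, value in rep.items()})
    return normalised


# Hypothetical absolute expression levels from four repeats performed
# at different times; the absolute levels vary between repeats.
repeats = [
    {"collagen": 2.0, "collagen+laminin": 6.0},
    {"collagen": 0.5, "collagen+laminin": 1.0},
    {"collagen": 4.0, "collagen+laminin": 10.0},
    {"collagen": 1.0, "collagen+laminin": 3.0},
]

relative = normalise_to_control(repeats)
fold_changes = [rep["collagen+laminin"] for rep in relative]
```

      With such a normalisation the control carries no variance by construction, so any later statistical test would have to be applied to the fold changes (or to the paired raw values) rather than to the control column.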

      As we will perform additional experiments for the revision of this paper, we will then perform statistical tests in the key experiments (e.g. Itga6 experiment) to alleviate any concerns regarding significance.

      Regarding the experiment presented in Fig. 4c, the authors should include additional control conditions: anti-α6 integrin antibody in Matrigel and use of an isotype antibody.

      We will conduct additional experiments regarding Itga6. In addition to including the mentioned controls for the neutralizing antibody, we will genetically inactivate Itga6 via an inducible CRISPR/Cas9 system. This should enable us to delete Itga6 when the cells are grown on collagen, and hence reduce the possibility of compensation in Matrigel-derived organoids.

      Another point concerns the RNAscope data presented in Fig. 4b: it is surprising to observe such a difference in expression between E19 and P0. Does this mean that birth dramatically upregulates Itga6 expression within a few hours? The authors should comment on this point if verified.

      We do believe that birth is a timepoint where a dramatic change in the ECM and its receptors can be observed. The epithelial RNAseq data already indicate that at E18.5 there is an increase in expression compared to E16. This upregulation of the receptor is in line with the dramatic remodelling of the ECM at birth, as shown by the expression of the basement membrane components in Fig. 3d.

      Authors should avoid the word « signaling » for laminin-integrin interactions as they do not study this aspect at all in their experiments.

      The word signaling was used for the protein:receptor interaction and to distinguish it from changes to the physical characteristics of the hydrogel. But we agree with the reviewer, that we did not study laminin signaling per se and therefore will change the wording accordingly.

      Regarding Col1a1, the authors cannot claim that its expression only slightly changed (Fig. 3d), as it is clearly upregulated between E17 and P0.

      The reviewer is right, and we apologise for the misleading sentence. The contrast was meant to the basement membrane components that are very lowly expressed at E17 and then suddenly show the burst of expression at birth, whereas collagen seems to be continuously expressed with a peak at P7. We will rephrase the sentence.

      Reviewer #1 (Significance (Required)):

      Overall, the methodology used for the asked questions is accurate.

      One potential problem for publication comes from the fact that some of the findings are already reported and that the present data do not provide further advances.

      For example, collagen and the fetal-like expression profile, and Ly6a sorting and replating in culture (Yui et al., 2018; Jabaji et al., 2013).

      We obviously do not agree with the reviewer on this point. We build upon the work of Jabaji et al. 2013, and Wang 2017 to characterise the specific effect of collagen on the intestinal epithelium compared to a pure Matrigel culture. The emergence of Ly6a cells was nicely shown by Yui et al., however it was unclear if collagen changes the fate of all intestinal cells or only a few. We strongly feel that our data extends these findings as it associates the changes we observe in vitro to the development of the crypt morphology and intestinal stem cells.

      The phenotype of Lamc1-/- mice and the observed reduced stem cell marker expression are also reported by Fields et al, 2019.

      Indeed, and we cited this paper. However, we predicted, based on our in vitro model, that deletion of laminin would result in this specific fetal-like gene expression, and hence were happy to include these findings in our manuscript.

      Finally, the authors do not interpret their ex vivo data in the context of fetal progenitors, which grow as spheres in Matrigel (containing laminin).

      Our ex-vivo (in vitro) data would suggest that adult epithelial cells express some genes that are characteristic for fetal organoids, however we do not think these cells completely revert back to a fetal stage. Regarding the comment of spheres, it is noteworthy that fetal cells from E14-16 stay as spheres in Matrigel, whereas fetal cultures from E19 initially grow as spheres and then develop into organoids within 30 days in vitro (M. Navis et al., “Mouse fetal intestinal organoids: new model to study epithelial maturation from suckling to weaning,” EMBO Rep., vol. 20, no. 2, pp. 1–12, 2019.).

      In figure 5, should we interpret that there is no laminin at all in the fetal mesenchyme?

      We now see how the image is a bit misleading for that stage. The levels of laminin are lower in the fetal stage as can be seen by the IF image in Fig.3f and the image will be updated. We also apologise for the lack of labeling in Figure 3f, which should be E19, P7 and adult.

      Also, the authors do not cite a paper reporting on the role of the epithelium (stem cells) in regulating its own extracellular matrix composition, a process that modulates stem cell number and fate (Fernandez-Vallone et al. 2020). As this contradicts the claim that only the mesenchyme impacts crypt morphogenesis, the authors could discuss this point.

      In the paper by Fernandez-Vallone, deletion of Lgr5 in E16.5 embryos resulted in decreased expression of several ECM genes. Further, the authors could show that the fetal epithelium does express, for example, Col1a1 at this point, which decreases with maturation. However, even for the example of Col1a1, it is evident in their paper that the mesenchyme expresses Col1a1 at much higher levels. Our proposed experiments with Lamc1 k.o. in organoids will show if the laminins produced by the epithelium are essential.

      This manuscript could be interesting for an audience in the stem cell and developmental fields (my field of expertise).

      **Referee Cross-commenting**

      Considering the pertinent and sometimes overlapping comments of the two other reviewers, the estimated time is revised to 3-6 months.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      In this manuscript, Ramadan and colleagues demonstrate that the cellular composition of the epithelium changes depending on the extracellular matrix (ECM) composition in which mouse intestinal organoids and/or 2D intestinal epithelial cells are grown. Organoids plated on 2D collagen layers show a unique cell cluster characterised by fetal-like genes, while organoids plated with increased amounts of Matrigel in 2D or in 3D exhibit a shift towards higher stem cell abundance and the absence of the fetal-like gene cluster. Specifically, the ECM component Laminin supports acquisition of stem cell identity in small intestinal epithelial cells, correlating with a transient increase in expression levels of collagen and laminin genes in vivo spanning time points of crypt formation. The authors report the functional contribution of laminin signaling (Lamc1 KO) via integrin alpha 6 (antibody-blocking) to intestinal stem cell acquisition in vitro and in vivo.

      There are a handful of comments/concerns that would need to be addressed before publication.

      Major points:

      1. The authors claimed: "the effect of ECM components on gene expression is not due to difference in morphology (2D collagen vs. 3D Matrigel)". The conclusion of the 2D vs. 3D experiment should be toned down to state that organoid morphology (2D vs. 3D) does not directly impact the expression of fetal-like genes. Otherwise, more analysis of the RNAseq data with different groups of genes (e.g., in different mechanosensing pathways) should be provided with Fig. S1D. Also, would it be technically feasible to perform experiments of SI in collagen (3D) in all groups of experiments? Directly comparing 3D Matrigel with 3D collagen avoids the concern of the 2D vs. 3D effect.

      We apologise for the too strong claim of structure of growth vs. signalling of the ECM and its effect on the transcriptome. Indeed, the main message is that a changed morphology from 3D to 2D is not responsible for the expression of fetal-like genes. The paragraph will be rephrased.

      Also, would it be technically feasible to perform experiments of SI in collagen (3D) in all groups of experiments? Directly comparing 3D Matrigel with 3D collagen avoids the concern of the 2D vs. 3D effect.

      To address this point, we want to refer to the excellent idea of growing established organoids in collagen (3D) vs. Matrigel (3D) as suggested by this reviewer (and reviewer #3) in Minor points #5. This circumvents the need for Wnt3a addition, which affects stem cell and Paneth cell gene expression.

      2. For Fig. 1f, the authors should include overlapping stainings of Lyz (or Olfm4, CD44 etc.) and Aldolase B signal, or they could perform Aldolase B staining in the Lgr-5-DTR-GFP and/or Lyz-RFP organoid line. From the current data provided one cannot draw clear conclusions on the crypt morphology as claimed by the authors. Additionally, when talking about crypt morphology and apical accumulation of Actin specifically in the Lyz+ cells, the authors should show a higher-magnification view of the picture and either add an orthogonal slice to see the apical and basal sides and the specific accumulation in one of the cell types, or also co-label with apical polarity markers.

      We will perform additional co-stainings to further highlight the differences in the spatial distribution of differentiated cells and undifferentiated-crypt-like cells. Further we will provide higher magnification images highlighting the apical accumulation of Actin in the crypt-like structures, which can also be seen in mature organoids.

      3. The authors refer to the transient change in the organoids as a fetal-like state. To examine the similarity of ECM-induced reprogramming to regenerative-type reprogramming, it would be essential to compare the expression of the selected fetal-like genes (Anxa3, Ly6a/Sca1, Msln, Col4a2, etc.), as well as bulk and single-cell (if applicable) RNA-seq data.

      Here, we would like to refer to the excellent study by Yui et al. (S. Yui et al., “YAP/TAZ-Dependent Reprogramming of Colonic Epithelium Links ECM Remodeling to Tissue Regeneration,” Cell Stem Cell, vol. 22, no. 1, pp. 35-49.e7, 2018). In this study, the authors detected the same gene signature in the repairing epithelium. We can provide a GSEA for the Ly6a+ signature that was derived from this paper, if necessary.

      4. For the in vivo data, the authors looked at the normal development of the intestine. Following the point that organoid culture recapitulates regeneration, it would be relevant to check the in vivo ECM changes by staining during intestinal regeneration, or to discuss whether the fetal-like genes are involved in regeneration.

      We will address this point in the discussion as it also involves the study by Yui et al.

      5. For Fig. 2d and 2e, it would be important to measure compactness vs. the emergence/probability of Ly6+ cells to see if there is a correlation.

      If we understand the reviewer correctly, this would address the important relationship between cell shape and cell fate/type. However, this is a topic that needs more attention than a simple correlation and would exceed the scope of this manuscript as we are not able to modulate cell shape to make any further points about its effect on the fetal gene expression program.

      6. In Fig. 2d, Ly6a expression is very obscure, and it would be important to show control staining for cell boundaries (e.g. Phalloidin, PM) to visualize which nuclei show Ki67 staining and are high or low in Ly6a (plus quantification).

      We will improve the image in Fig.2d and include the mentioned Actin staining. In addition we will perform an analysis via Flow Cytometry to quantify the level of Ly6a staining and EdU positivity.

      7. In Fig. 2f-g, FACS-ed Ly6a+ and Ly6- cells embedded in Matrigel can grow into organoids with crypts. Here the imaging of Paneth cell staining is not clear, and a quantification of the number of Paneth cells per crypt would be very helpful to confirm the phenotype. Also, the authors should either provide data on the initial size of seeded cell clusters and report organoid growth and cell type composition in more detail when plating from Ly6a+ and Ly6- cells, or report the variation in the respective populations.

      This comment suggests that we may not have described the experimental settings properly. The sorted cells were embedded as single cells, not as clusters, in drops of matrigel (10k cells/25ul Matrigel). The emergence of Paneth cells together with a normal organoid architecture grown from Ly6a+ cells shows their stem cell capacity, as has been shown by Yui et al. before from the regenerating colon. In addition, organoids from both cell populations (Ly6a+ and Ly6a-) could be passaged, indicating presence of intestinal stem cells.

      8. The authors could also test whether Ly6+ cells have any advantages over Ly6- cells when grown on collagen I instead of Matrigel.

      We will sort Ly6a+ and Ly6a- cells and plate them on collagen I. It will be interesting to see if the Ly6a+ cells can give rise to the other cell types when plated on collagen or if they stay Ly6a+ cells. This will also answer whether Ly6a+ cells need the presence of Ly6a- cells in the cultures. In addition, the experiment proposed in #6 will also highlight any proliferative advantage of Ly6a+ cells compared to Ly6a- cells on collagen.

      9. In Fig. 3f, a control membrane protein staining should be added for the experiment. The increased Laminin signal could be caused by a global increase in protein when there are more cells or when tissues are more compact. When the authors conclude "Dramatic remodelling of ECM during crypt formation", the experiment should also count cell numbers vs. Laminin (intensity). The phenotype could come from an increased area of interface between epithelium and mesenchyme instead of active remodelling.

      We agree with the reviewer that by itself the IF images are not enough to make such a claim. However, we would point to the qPCR data and RNA in situ, that can be more easily normalised and shows the dramatic increase in expression of all laminins at birth. To show that laminin protein is increasing is more difficult than we initially anticipated. However, in the study by De Arcangelis (A. De Arcangelis et al., “Hemidesmosome integrity protects the colon against colitis and colorectal cancer,” Gut, vol. 66, no. 10, pp. 1748–1760, 2017.) the authors use an EDTA assay to show that the epithelium detaches easily when Itga6 is deleted. Within the figure, it seems also that the epithelium detaches easily at P2, compared to P14. As EDTA is disrupting laminin polymerisation, this would further indicate increased laminin protein deposition after birth.

      10. The authors claim that intestinal stem cells in vivo are controlled by Laminin signalling that goes via Integrin alpha 6. However, there is no evidence provided that supports the contribution of ITGA6 in the in vivo setting. So, the authors should either tone down that point or show a convincing in vivo experiment (e.g., inhibit ITGA6 in vivo by inhibitor injections, or extract the ECM of a wild-type mouse and seed intestinal epithelial cells without vs. with ITGA6 blocking antibody, which should recapitulate the phenotype in Fig. 4c).

      We apologise for this confusion. We are well aware of the limitations of our Itga6 blocking experiment in vitro and its relevance in vivo. We tried to get material of the inducible VilCreER Itga6 mouse as referenced in the discussion of the manuscript, without any luck so far. Therefore, we will highlight further that any claims about the laminin:Itga6 interaction can only be made in vitro.

      11. Fig. 4: For the data on ITGA6 expression and all analyses of protein expression by staining, normalization to cell numbers should be performed.

      The RNAseq data that shows the upregulation of Itga6 in the epithelium at E18 is normalized within. Our RNAscope only further validates these expression changes and highlights the specific enriched expression at the bottom of the nascent crypts. We can add quantification of the RNAscope if required.

      12. Two questions on mechanisms:

      (a) What is the mechanism from ITGA signaling to Ly6a+ cell fate?

      (b) Would Laminin induce ITGA expression, and if so, how? Depending on how deep the authors would like to go with the project, these could be addressed further with functional studies or touched on in the discussion.

      These are important questions; however, we do agree that this would go too deep for the scope of this manuscript. We will address these open questions in the discussion and leave the experimental part for a follow-up study.

      **Minor points:**

      1. Text in Fig.S1d regarding 'in' or 'on' collagen, could be clearer by changing the terms to 2D and 3D correspondingly.

      We agree and the text will be changed accordingly.

      2. Fig. S1a: it is great that the authors showed similar stiffness in Matrigel and collagen I. It would be important to check the concentration of collagen I vs. stiffness (also for increasing concentrations of Laminin in Fig. 3b), since this is also the type of ECM change that might lead to the change of cell status in cancer progression or collective cell migration.

      We will perform further stiffness measurements of the hydrogels and update the Fig. S1a.

      3. When plating intestinal epithelial cells on collagen I, is the Ly6+ phenotype altered upon Wnt addition? This is not so clear from the RNAseq data in Fig. S1d, so the authors should provide antibody stainings (stem cells/Paneth cells). This could give insight into whether Ly6+ cells are still able to convert into stem cells/Paneth cells by changing morphogen concentration vs. ECM composition.

      We will reanalyse the RNAseq dataset further, specifically analysing the ratio of stem cell and Paneth cell gene expression. However, as mentioned before, Wnt3a specifically does reduce the expression of Paneth cell markers.

      Similar to this point, the enteroendocrine cell fate is also absent in the collagen I condition (Fig. 2a, b); the authors could address this point by medium-induced EE cell fate.

      Due to the reduced number of secretory cells, the clustering in Fig. 2a/b does not separate all the different cell lineages. However, EE cells are present in the collagen cultures, as characterised by expression of Chga, just reduced in number (see Suppl. Fig. 2B).

      4. It would be more informative to indicate the thickness of the ECM layer in the 2D collagen I culture, as well as an image of the whole well demonstrating any morphological variation between the middle and the periphery of the ECM layer.

      The thickness of the collagen layer is about 1 mm in a 6-well plate, and we do not observe any morphological differences in the cells between the periphery and center of the well.

      5. After the formation of PC/SC clusters, would the ECM contribute to their maintenance? Transferring mature organoids from Matrigel to collagen I in 3D would help to clarify this.

      This is an interesting experiment that we will conduct; we thank the reviewer for this suggestion. Indeed, established organoids should be able to grow in collagen I without Wnt3a addition. The paper by Sachs et al. (N. Sachs, Y. Tsukamoto, P. Kujala, P. J. Peters, and H. Clevers, “Intestinal epithelial organoids fuse to form self-organizing tubes in floating collagen gels,” Development, vol. 144, no. 6, pp. 1107–1112, 2017) used extensive washing with PBS to remove Matrigel from the organoids. We will go one step further and try to completely remove laminin specifically by EDTA incubation, as has been shown recently (J. Y. Co et al., “Controlling Epithelial Polarity: A Human Enteroid Model for Host-Pathogen Interactions,” Cell Rep., vol. 26, no. 9, pp. 2509-2520.e4, 2019). This should then also answer whether disruption of laminin signalling is sufficient to induce fetal-gene expression without the addition of collagen I in a 3D setting.

      6. Check the secretome and individual culture of the mesenchyme to see if the increase in Laminin is epithelium-independent.

      We agree that the mesenchyme is key for laminin production, therefore these are important questions. Our prediction would be that epithelium from birth (P0) versus adult might result in different responses on the mesenchyme. However, we feel these experiments are better suited for a follow-up study.

      7. In general, the authors should look at cell polarity markers to check the ECM contribution to cell polarity in different cell types.

      We thank the reviewer for the suggestions and as mentioned above, we will perform additional stainings.

      Reviewer #2 (Significance (Required)):

      **Significance:**

      The work highlights the role of ECM on stem cell niche and is of great interest to the organoid and stem cell community.

      Our field of expertise is image- and seq-technology-based quantitative biology, regeneration and mechanics in organoid.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Ramadan et al. present a highly informative paper detailing how the extracellular matrix influences the development of the intestine. Specifically, the authors provide a thorough analysis of how manipulating the components of the ECM can affect the growth, morphology, and gene expression of organoids. Most importantly, the authors isolate laminin as a critical component of the ECM which impacts the development of fetal-like epithelium. While the in vitro work is generally compelling and of interest to the field, the in vivo data is somewhat lacking in depth and novelty. Particularly, these conclusions from the abstract could be much better supported: "This laminin:ITGA6 signalling is essential for the stem cell induction and crypt formation in vitro. Importantly, deletion of laminin in the adult mouse results in a fetal-like epithelium with a marked reduction of adult intestinal stem cells." The in vivo work was largely published previously and has caveats noted below, while the in vitro association of ITGA6 signaling with crypt formation is over-interpreted based upon an antibody blocking experiment and a lack of statistical rigor. Despite these concerns, this reviewer finds the work of considerable interest in an important area of the field (epithelial/stromal interactions of the intestine).

      **Major Concerns:**

      The use of the Ubc-Cre Lamc1-flox mouse model is an interesting way to test the impact of Lamc1 loss on intestinal development. However, with the Ubc-Cre, does the mouse model have other deleterious effects on the mouse beyond the intestine?

      Can the authors use a more localized Cre to observe specifically the impacts of Lamc1 loss in the intestine? What is the fate of these mice? Can the authors show swiss roll, low mag sections to let the reader know the extent of this phenotype? OLFM4 and Ki67 IHC should be conducted over a timecourse to show how the changes occur over time after loss of Lamc1. How long does Lamc1's protein product perdure after tamoxifen treatment? More details of this exciting, in vivo validation of the authors' in vitro studies are key to elevating the impact of this work. However, it appears that much of this mouse model was previously published, but the previous findings are not well summarized in the current manuscript.

      We will describe the model in more detail and refer readers to the excellent study of our collaborators which answers most of the raised questions. It is interesting to note that although a ubiquitous Cre was used to delete Lamc1 in adult mice, a phenotype was only observed in the intestine indicating a specific role for continuous laminin production here.

      Can the authors show that ITGA6 loss has functional consequences in vivo with an epithelial knockout or via an organoid knockdown? A more rigorous genetic test of this proposed function would be important for substantiating the claims made in the abstract.

      As referenced in the discussion of the manuscript, there is a VilCreER Itga6 mouse described in the literature (A. De Arcangelis et al., “Hemidesmosome integrity protects the colon against colitis and colorectal cancer,” Gut, vol. 66, no. 10, pp. 1748–1760, 2017), a study which mainly focuses on the colon. However, the authors use an EDTA assay in the small intestine to show that the epithelium detaches easily when Itga6 is deleted (Fig. 1J). Within the figure, it also seems that the epithelium detaches easily at P2, compared to P14. As EDTA disrupts laminin polymerisation, this would further indicate increased laminin protein deposition after birth, which is dependent on Itga6.

      We tried to get material of the inducible VilCreER Itga6 mouse however without any luck so far.

      We will conduct additional experiments regarding Itga6 in vitro. In addition to including additional controls for the neutralizing antibody, we will genetically inactivate Itga6 via an inducible CRISPR/Cas9 system. This should enable us to delete Itga6 when the cells are grown on collagen, and hence reduce the possibility of compensation in Matrigel-derived organoids.

      The authors state that the changes in gene expression are not due to differences in morphology, but rather are specific to the components of the environment. While the authors show that organoids treated with Wnt3a "in Matrigel" and "in CollagenI" appear to have similar morphologies and yet still result in different gene expression profiles, it would be of great interest to see whether that difference persists without Wnt3a when organoids are "in Matrigel" and "in CollagenI". While the reviewer understands the difficulties of culturing organoids in 3D Collagen without Wnt3a, organoids can indeed be cultured in 3D using "floating collagenI rings" (Sachs et al 2017).

      This is an interesting experiment that we will conduct; we thank the reviewer for this suggestion. Indeed, established organoids should be able to grow in collagen I without Wnt3a addition. The paper by Sachs et al. (N. Sachs, Y. Tsukamoto, P. Kujala, P. J. Peters, and H. Clevers, “Intestinal epithelial organoids fuse to form self-organizing tubes in floating collagen gels,” Development, vol. 144, no. 6, pp. 1107–1112, 2017) used extensive washing with PBS to remove Matrigel from the organoids. We will go one step further and try to completely remove laminin specifically by EDTA incubation, as has been shown recently (J. Y. Co et al., “Controlling Epithelial Polarity: A Human Enteroid Model for Host-Pathogen Interactions,” Cell Rep., vol. 26, no. 9, pp. 2509-2520.e4, 2019). This should then also answer whether disruption of laminin signalling is sufficient to induce fetal-gene expression without the addition of collagen I in a 3D setting.

      Similarly, while the authors indicate that increasing Matrigel concentrations altered the gene expression patterns in a dose-dependent manner, it is unknown whether this can fully be attributed to the Matrigel composition, or whether the layer of Matrigel is providing the capability to transition from 2D to 3D culture.

      We are not entirely sure we understood this point. The Matrigel and the collagen I are mixed before they solidify, which yields a homogeneous hydrogel. The different hydrogels are not layered (if that is what the reviewer is referring to).

      The authors cultured organoids in different concentrations of Laminin/CollagenIV when mixed with CollagenI. Can organoids be sustained only on a matrix of CollagenIV and/or Laminin? Would this show more direct differences between CollagenI vs. Laminin cultured organoids?

      Organoids cannot be grown in pure Collagen IV, but pure laminin should be feasible. We did initial experiments with 3-5 mg/ml laminin in PBS, and that was sufficient to allow organoid growth. We will perform additional experiments with pure laminin and show the impact on organoid growth.

      With the reduction in stem-cell and Paneth cells in the Lamc1-KO mice, it would also be of interest to determine what cell types are now prominent within the heavily elongated intestinal "crypt" structures seen in the Lamc1-KO mice and whether populations are more TA-cells or enterocytes to consider differentiation status of the cells. Additionally, it would also be of interest to see if the Itga6 expression is significantly altered in the absence of Lamc1.

      We will test for expression changes of Itga6 in the epithelium of the Lamc1-KO mice via qPCR. Additionally, we can stain tissue from these mice for Sox9, Ki67 and differentiation markers (e.g. CD44, AldolaseB, Villin) to determine whether the elongated, hyperproliferative crypts contain progenitor cells or secretory enterocytes.

      **Minor concerns:**

      Matrigel is a complex matrix derived from mouse tumors. In many instances in the manuscript, the authors portray it as a simpler mix of laminin/Collagen4 (fig 3a). It should be made clearer to the reader that Matrigel is not a mix of recombinant proteins; a clearer depiction of how Matrigel is derived will be critical for this study, given the focus on specific ECM components and how they affect intestinal epithelial growth.

      We agree and will change the oversimplified view of Matrigel.

      Some labels of specific conditions would be appreciated in the figures as opposed to only the figure legends (i.e. Fig. 1b and 1d should be labeled with comparisons; Fig. 3f labels of fluorescence, Fig. 4b label of itga6 staining).

      We apologise for this and the Figure labels will be updated.

      With the light staining of Lamc1 in-situ, it is hard to appreciate the expression of laminin within the stroma of the intestine when compared to Col4. This reviewer is also curious about the biological relevance of the concentrations of Laminin/CollagenIV when culturing organoids in Fig. 4a.

      Indeed, Lamc1 is more difficult to see because of its lower expression compared to Col4a1. The reviewer may have overlooked Suppl. Fig. 4A, where the blue channel of these in situ images shows more contrast. If required, we can try to optimise the hybridisation times to increase the signal a bit further.

      When culturing the organoids with the mixture of collagen and laminin or collagen IV, the concentrations of the two single components were selected to be similar to their concentrations in Matrigel. How the absolute concentrations of laminin/collagen IV in vitro compare with their “concentration” in vivo is much harder to answer. In addition to the unknown concentrations in vivo, there are many more Laminin types present, with specific localisation and even specific receptor interactions. For our in vitro studies we relied on the Laminin present in EHS tumours, which is Laminin alpha 1 beta 1 gamma 1. We are currently investigating decellularization protocols to purify the ECM from mouse intestinal tissue, but again this would be more suited for a follow-up study.

      It would be appreciated if gene expression analyses presented in figures would include p-values to provide context for differences in gene expression.

      The avoidance of statistical inference for most of the experiments was a deliberate choice. In line with several comments (e.g. Vaux, D. L. (2012) Know when your numbers are significant. Nature 492, 180–181), we chose to show all individual data points (with the exception of Fig. 3D, n=5, to ease interpretation) without statistical testing. For most expression data, we have data from RNAseq, single-cell RNAseq and qPCRs repeated at different hydrogel concentrations to obtain reliable results. Further, the in vivo mesenchymal qPCR expression data were validated with RNA in situ hybridization showing the mainly mesenchymal expression.

      As we will perform additional experiments for the revision of this paper, we can perform statistical tests in the key experiments (e.g. Itga6 experiment) to alleviate any concerns regarding significance.

      In figure 3f, the authors report "immunofluorescence of laminin". How is this measured? Can more details be given about the antibody in the text and figure legend? Laminins are a family of genes, and it's not clear what's being demonstrated in this figure panel. Developmental stages of the samples are also not clear.

      We apologise for the lack of labeling in Fig. 3f. The details of the antibody were hidden in the Materials and Methods of the manuscript (slides were incubated with Laminin Polyclonal Antibody (1/200, Thermo Fisher #PA5-22901) overnight at 4 °C). This pan-laminin antibody reacts with most Laminin isoforms alpha1, alpha2, beta1, gamma1. We will describe it as a pan-laminin antibody in the Figure legend to help future readers.

      Reviewer #3 (Significance (Required)):

      This work is in an exciting "hot" area of research to understand the role of non-epithelial cells in intestinal epithelial development and function. The audience would be those in the GI field and those studying tissue-tissue interactions.

      There's some concern that the in vivo portion of the manuscript (4th figure) uses a model that was previously characterized and published by this group, and that isn't clearly disclosed. The manuscript would benefit from more disclosure and detail about the in vivo phenotype. Such changes would substantially increase the impact and novelty of the study.

      We would like to point out that we cited the paper of the original study that uses the model throughout the manuscript. We will disclose in more detail that this group did the study and that the reduction in stem cell genes was already mentioned in the original publication.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Ramadan et al present a highly informative paper detailing how the extracellular matrix influences the development of the intestine. Specifically, the authors provide a thorough analysis of how manipulating the components of the ECM can affect organoid growth, morphology, and gene expression of the organoids. Most importantly, the authors isolate laminin as a critical component of the ECM which impacts the development of fetal-like epithelium. While the in vitro work is generally compelling and of interest to the field, the in vivo data is somewhat lacking in depth and novelty. Particularly, these conclusions from the abstract could be much better supported: "This laminin:ITGA6 signalling is essential for the stem cell induction and crypt formation in vitro. Importantly, deletion of laminin in the adult mouse results in a fetal-like epithelium with a marked reduction of adult intestinal stem cells." The in vivo work was largely published previously and has caveats noted below, while the in vitro association of ITGA6 signaling with crypt formation is over-interpreted based upon an antibody blocking experiment and a lack of statistical rigor. Despite these concerns, this reviewer finds the work of considerable interest in an important area of the field (epithelial/stromal interactions of the intestine).

      Major Concerns:

      The use of the Ubc-Cre Lamc1-flox mouse model is an interesting way to test the impact of loss of Lamc1 on intestinal development. However, with the Ubc-Cre, does the mouse model have other deleterious effects on the mouse beyond the intestine? Can the authors use a more localized Cre to observe specifically the impacts of Lamc1 loss in the intestine? What is the fate of these mice? Can the authors show swiss roll, low mag sections to let the reader know the extent of this phenotype? OLFM4 and Ki67 IHC should be conducted over a timecourse to show how the changes occur over time after loss of Lamc1. How long does Lamc1's protein product perdure after tamoxifen treatment? More details of this exciting, in vivo validation of the authors' in vitro studies are key to elevating the impact of this work. However, it appears that much of this mouse model was previously published, but the previous findings are not well summarized in the current manuscript.

      Can the authors show that ITGA6 loss has functional consequences in vivo with an epithelial knockout or via an organoid knockdown? A more rigorous genetic test of this proposed function would be important for substantiating the claims made in the abstract. The authors state that the changes in gene expression are not due to differences in morphology, but rather are specific to the components of the environment. While the authors show that organoids treated with Wnt3a "in Matrigel" and "CollagenI" appear to have similar morphologies and yet still result in different gene expression profiles, it would be of great interest to see whether that difference persists without Wnt3a when organoids are "in Matrigel" and "in CollagenI". While the reviewer understands the difficulties of culturing organoids in 3D Collagen without Wnt3a, organoids can indeed be cultured in 3D using "floating collagenI rings" (Sachs et al 2017). Similarly, while the authors indicate that increasing Matrigel concentrations altered the gene expression patterns in a dose-dependent manner, it is unknown whether this can fully be attributed to the Matrigel composition, or whether the layer of Matrigel is providing the capability to transition from 2D to 3D culture.

      The authors cultured organoids in different concentrations of Laminin/CollagenIV when mixed with CollagenI. Can organoids be sustained only on a matrix of CollagenIV and/or Laminin? Would this show more direct differences between CollagenI vs. Laminin cultured organoids? With the reduction in stem-cell and Paneth cells in the Lamc1-KO mice, it would also be of interest to determine what cell types are now prominent within the heavily elongated intestinal "crypt" structures seen in the Lamc1-KO mice and whether populations are more TA-cells or enterocytes to consider differentiation status of the cells. Additionally, it would also be of interest to see if Itga6 expression is significantly altered in the absence of Lamc1.

      Minor concerns:

      Matrigel is a complex matrix derived from mouse tumors. In many instances in the manuscript, the authors portray it as a simpler mix of laminin/Collagen4 (fig 3a). It should be made clearer to the reader that Matrigel is not a mix of recombinant proteins; a clearer depiction of how Matrigel is derived will be critical for this study, given the focus on specific ECM components and how they affect intestinal epithelial growth.

      Some labels of specific conditions would be appreciated in the figures as opposed to only the figure legends (i.e. Fig. 1b and 1d should be labeled with comparisons; Fig. 3f labels of fluorescence, Fig. 4b label of itga6 staining).

      With the light staining of Lamc1 in-situ, it is hard to appreciate the expression of laminin within the stroma of the intestine when compared to Col4. This reviewer is also curious about the biological relevance of the concentrations of Laminin/CollagenIV when culturing organoids in Fig. 4a. It would be appreciated if gene expression analyses presented in figures would include p-values to provide context for differences in gene expression.

      In figure 3f, the authors report "immunofluorescence of laminin". How is this measured? Can more details be given about the antibody in the text and figure legend? Laminins are a family of genes, and it's not clear what's being demonstrated in this figure panel. Developmental stages of the samples are also not clear.

      Significance

      This work is in an exciting "hot" area of research to understand the role of non-epithelial cells in intestinal epithelial development and function. The audience would be those in the GI field and those studying tissue-tissue interactions.

      There's some concern that the in vivo portion of the manuscript (4th figure) uses a model that was previously characterized and published by this group, and that isn't clearly disclosed. The manuscript would benefit from more disclosure and detail about the in vivo phenotype. Such changes would substantially increase the impact and novelty of the study.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, Ramadan and colleagues demonstrate that, depending on the extracellular matrix (ECM) composition in which mouse intestinal organoids and/or 2D intestinal epithelial cells are grown, the cellular composition of the epithelium changes. Organoids plated on 2D collagen layers show a unique cell cluster characteristic of fetal-like genes, while organoids plated with increased amounts of Matrigel in 2D or in 3D exhibit a shift towards higher stem cell abundance and the absence of the fetal-like gene cluster. Specifically, the ECM component Laminin supports acquisition of stem cell identities in small intestinal epithelial cells, correlating with a transient increase in expression levels of collagen and laminin genes in vivo spanning time points of crypt formation. The authors reported the functional contribution of laminin signaling (Lamc1 KO) via integrin alpha 6 (antibody-blocking) to intestinal stem cell acquisition in vitro and in vivo.

      There are a handful of comments/concerns that would need to be addressed before publication.

      Major points:

      1. The authors claimed: "the effect of ECM components on gene expression is not due to difference in morphology (2D collagen vs. 3D Matrigel)". The conclusion of the 2D vs. 3D experiment should be toned down to state that organoid morphology (2D vs. 3D) does not directly impact the expression of fetal-like genes. Otherwise, more analysis of the RNAseq data with different groups of genes (e.g., in different mechanosensing pathways) should be provided with Fig. S1D. Also, would it be technically feasible to perform experiments of SI in collagen (3D) in all groups of experiments? Directly comparing 3D Matrigel with 3D collagen avoids the concern of the 2D vs. 3D effect.
      2. For Fig. 1f, the authors should include overlapping stainings of Lyz (or Olfm4, CD44 etc.) and Aldolase B signal, or they could perform Aldolase B staining in Lgr-5-DTR-GFP and/or Lyz-RFP organoid lines. From the current data provided one cannot draw clear conclusions on the crypt morphology as claimed by the authors. Additionally, when talking about crypt morphology and apical accumulation of Actin specifically in the Lyz+ cells, the authors should show a higher zoom-in of the picture and either add an orthogonal slice to see the apical and basal sides and the specific accumulation in one of the cell types, or also co-label with apical polarity markers.
      3. The authors refer to the transient change in organoids as a fetal-like state. To examine the similarity of ECM-induced reprogramming with the regenerative type of reprogramming, it would be essential to compare the expression of the selected fetal-like genes (Anxa3, Ly6a/Sca1, Msln, Col4a2, etc.), as well as bulk and single-cell (if applicable) RNA-seq data.
      4. For the in vivo data, the authors looked at the normal development of the intestine. Following the point that organoid culture recapitulates regeneration, it would be relevant to check the in vivo ECM changes by staining during intestinal regeneration, or to discuss whether the fetal-like genes would be involved in regeneration.
      5. For Fig. 2d and e, it would be important to measure compactness vs. the emergence/probability of Ly6+ cells to see if there is a correlation.
      6. In Fig. 2d, Ly6a expression is very obscure, and it would be important to show control staining for cell boundaries (e.g. Phalloidin, PM) to visualize which nuclei show Ki67 staining and are high or low in Ly6a (plus quantification).
      7. In Fig. 2f-g, FACS-ed Ly6a+ and Ly6- cells embedded in Matrigel can grow into organoids with crypts. Here the imaging of Paneth cell staining is not clear, and a quantification on number of Paneth cells per crypt would be very helpful to confirm the phenotype. Also, authors should either provide data on the initial size of seeded cell clusters and report organoid growth and cell type composition in more detail when plating from Ly6a+ and Ly6- cells or report the variation in the respective populations.
      8. The authors could also test whether Ly6+ cells have any advantages over Ly6- cells when grown on collagen I instead of Matrigel.
      9. In Fig. 3f, a control of membrane protein staining should be added for the experiment. The increased Laminin signal can be caused by the global increase of protein when there are more cells, or when tissues are more compact. When the authors conclude that there is "Dramatic remodelling of ECM during crypt formation", the experiment should also count cell numbers vs. Laminin (intensity). The phenotype can come from an increased area of interface between epithelium and mesenchyme instead of active remodelling.
      10. The authors claim that intestinal stem cells in vivo are controlled by Laminin signalling that goes via Integrin alpha 6. However, there is no evidence provided that supports the contribution of ITGA6 in the in vivo setting. So, the authors should either tone down that point or show a convincing in vivo experiment (e.g., inhibit ITGA6 in vivo by inhibitor injections, or extract the ECM of a wild-type mouse and seed intestinal epithelial cells without vs. with ITGA6 blocking antibody, which should recapitulate the phenotype in Fig. 4c).
      11. Fig. 4: For the data of ITGA6 expression and all sorts of analysis on protein expression with staining, normalization with cell numbers should be performed.
      12. Two questions on mechanisms: a) What is the mechanism from ITGA signaling to Ly6a+ cell fate? b) Would Laminin induce ITGA expression, and if so, how? Depending on how deep the authors would like to go with the project, these could be addressed further with functional studies or touched on in the discussion.

      Minor points:

      1. The text in Fig. S1d regarding 'in' or 'on' collagen could be clearer by changing the terms to 2D and 3D correspondingly.
      2. Fig. S1a: it is great that the authors showed similar stiffness in Matrigel and collagen I. It would be important to check the concentration of collagen I vs. stiffness (also for increasing concentrations of Laminin in Fig. 3b), since this is also the type of ECM change that might lead to the change of cell status in cancer progression or collective cell migration.
      3. When plating intestinal epithelial cells on collagen I, is the Ly6+ phenotype altered upon Wnt addition? This is not so clear from the RNAseq data (Fig. S1d), so the authors should provide antibody stainings (stem cells/Paneth cells). This could give insight into whether Ly6+ cells are still able to convert into stem cells/Paneth cells by changing morphogen concentration vs. ECM composition. Similar to this point, enteroendocrine cell fate is also absent in the collagen I condition (Fig. 2a, b); the authors could address this point by medium-induced EE cell fate.
      4. It would be more informative to indicate the thickness of the ECM layer in the 2D collagen I culture, as well as an image of the whole well, demonstrating the morphological variation in the middle and periphery of the ECM layer.
      5. After the formation of PC/SC clusters, would ECM contribute to maintenance? Putting mature organoids from Matrigel to Collagen I 3D would help to clarify.
      6. Check the secretome and individual culture of mesenchyme to see if the increase of Laminin is epithelium-independent.
      7. In general, the authors should look at cell polarity markers to check the ECM contribution to cell polarity in different cell types.

      Significance

      The work highlights the role of ECM on stem cell niche and is of great interest to the organoid and stem cell community.

      Our field of expertise is image- and seq-technology-based quantitative biology, regeneration and mechanics in organoids.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In the manuscript of Ramadan et al., the authors use the ex vivo organoid approach to compare gene expression in organoids derived from adult-type stem cells when these organoids are grown using different matrices. The presence of Collagen type I induces the emergence of cells with a transcriptome similar to fetal progenitors. In contrast, laminin, the main component of Matrigel, induces an organoid-protruded phenotype with a transcriptome of stem cell type. Then, they correlate these data with expression of collagens and laminins from publicly available data. They show by qRT-PCR that laminins are more expressed in mesenchymal versus epithelial fractions postnatally. They hypothesize on this basis that the remodeling at the postnatal stage is likely only dependent on the mesenchymal compartment and that it involves interaction of laminins with integrin a6. It seems that some of the presented data have already been described and could not be considered as « novel ». For some of the statements, like this one « the basement-membrane produced by the epithelium is not sufficient to increase stem cell numbers and induce a morphological crypt formation », the conclusion is not sustained by the provided experiments. To draw a definitive conclusion on this particular point, the authors could reproduce the experiment presented in Fig. 4d but using Cre recombinases specific for mesenchymal and epithelial compartments rather than the ubiquitous Cre line. It would be interesting to investigate whether organoids grown from lamc1-/- mice can generate protruded organoids or not. In addition, how do the authors interpret the fact that « fetal organoids up » is associated with « laminin interactions » in fig. 1c?

      One major point to address regards statistics. In material and methods, the paragraph describing statistical analyses is missing. Moreover, in the figures presenting qPCR data (figs 1g, 3b, 3d, 3g, 4c and 4f), no statistical analysis is provided, and the number of samples for some conditions is extremely limited (n=2). In general, the term « independent experiment » should be clarified: does it correspond to one organoid line for which the experiment was repeated or one single experiment using different organoid lines? In fig 4c, all collagen conditions are set to 1.

      Regarding the experiment presented in fig 4c, the authors should include additional control conditions: anti-a6 integrin antibody in Matrigel and use of an isotype antibody.

      Another point regards the RNAscope data presented in Fig 4b: it is surprising to observe such a difference in terms of expression between E19 and P0. Does this mean that birth dramatically upregulates Itga6 expression within a few hours? The authors should comment on this point if verified. The authors should avoid the word « signaling » for laminin-integrin interactions, as they do not study this aspect at all in their experiments.

      Regarding Col1a1, the authors cannot claim that its expression only slightly changed (fig 3d), as it is clearly upregulated between E17 and P0.

      Significance

      Overall, the methodology used for the asked questions is accurate.

      One potential problem for publication comes from the fact that some of the findings are already reported and that the present data do not provide further advances.

      For example, the collagen and fetal-like expression profile, and Ly6a sorting and replating in culture (Yui et al, 2018; Jabaji et al, 2013).

      The phenotype of Lamc1-/- mice and the observed reduced stem cell marker expression are also reported by Fields et al, 2019.

      Finally, the authors do not interpret their ex vivo data in the context of fetal progenitors, which grow as spheres in Matrigel (containing laminin). In figure 5, should we interpret that there is no laminin at all in the fetal mesenchyme?

      Also, the authors do not cite a paper reporting on the role of the epithelium (stem cells) in regulating its own extracellular matrix composition, this process modulating the stem cell number and fate (Fernandez-Vallone et al. 2020). As this is contradictory with the claim that only mesenchyme impacts crypt morphogenesis, the authors could discuss this point.

      This manuscript could be interesting for an audience in the stem cell and developmental fields (my field of expertise).

      Referee Cross-commenting

      Considering the pertinent and sometimes overlapping comments of the two other reviewers, the estimated time is revised to 3-6 months.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      This manuscript by Gulyrutlu and co-workers addresses the role of CUG expanded repeat RNA associated with DM1 in regulating the formation of higher order RNP assemblies such as stress granules and P-bodies in the cell. The authors used lens epithelial cells (hLECs) derived from a DM1 patient

      We used cell lines from several patients and age-matched controls to avoid effects of individual cell-line variation. We will make sure that this is clear in the text.

      or a HeLa cell inducible model of DM1 to investigate whether expression of the CUG repeat-associated protein MBNL1 and CUGBP1 affected the formation and dispersal of stress granules and P-bodies. The authors show that MBNL1 and CUGBP1 are components of SGs and PBs in hLECs and HeLa cells. In cells expressing the CUG repeat, there are minor alterations in the dispersal of stress granules as well as in the formation of P-bodies.

      The alterations in the formation and dispersal of stress granules are not minor. For example, in the HeLa cell model, stress granules take more than twice as long to form in cells expressing the CUGexp repeats associated with DM1 and disperse in half the time. These data are already in the results section, but we will highlight them in a revision and have added a further representation of the data to the figure, using graphs of ‘proportion of cells with stress granules’ against time. The changes we see are as large as, or larger than, results published elsewhere (see appendix below).

      MBNL1 could affect the formation and dispersal of SGs independent of the CUG repeat.

      In fact, we present data in HeLa cells with MBNL1 almost completely removed by shRNA, revealing that this has a much smaller effect on stress granules than does the expression of CUGexp RNA. This is an important point, as it is widely assumed that most of the cellular defects in DM1 are caused by the ‘sequestration’ of MBNL1 in the CUGexp foci. We will revise the results and discussion to further emphasise this point.

      Finally, in HeLa cells, overexpression of MBNL1 can reduce the dispersal of P-bodies upon 1,6-hexanediol treatment.

      This is not what our data show. In the hexanediol experiments, both cell lines over-express MBNL1 in similar amounts. The difference between them is that one cell line expresses a DMPK1 mini-gene with a CUG expansion and the other expresses a mini-gene without the expansion. Again, our results show that the alteration to P-body responses to 1,6-hexanediol can be attributed to the presence of the CUGexp RNA, rather than altered levels of MBNL1. We will revise the results and discussion to further emphasise this point.

      Major comments:

      One limitation of the work is that the perturbations seen with stress granules or P-bodies are all relatively small, and no evidence for a functional consequence on gene expression is demonstrated. Specifically, the authors observe only minor alterations in the formation or disaggregation of PBs and SGs in these DM1 models. Further, some of the effects observed are independent of the CUG repeat expression, suggesting that MBNL1 and CUGBP1 might have independent roles in modulating some properties of SG and PB formation or dispersal.

      As above, the changes we see in SG formation and dispersal are not small. There are already numerous studies of the effects of DM1 on gene expression and mRNA splicing. This is not what we set out to study: we are interested in perturbations to the organisation of cellular structures associated with the expression of the CUGexp repeat RNA characteristic of DM1. We do show some data relating specifically to the proteins MBNL1 and CUGBP1 in the paper. shRNA resulting in almost complete loss of these proteins has much smaller effects than the expression of CUGexp RNA, suggesting that the major part of the effects caused by expression of the CUGexp RNA is not mediated through changes in MBNL1 or CUGBP1 levels. MBNL1 and CUGBP1 levels may well contribute to alterations in SG dynamics, but our data suggest that they are minor contributors. This is an advance in our current knowledge.

      1. The authors could investigate whether the CUG repeat RNA itself is localized to SGs or PBs in their models, and whether the presence of the repeat RNA is absolutely necessary for regulating the dynamics of SG or PB formation.

      We have now done this. The CUG repeat RNA is not localised in stress granules or PBs to a detectable extent. This suggests that the effect we see on these structures by expression of the expanded RNA occurs despite the absence of the RNA from the structures. This is similar to the effects of the ALS-associated paraspeckle protein FUS, which can affect the integrity of nuclear LLPS structures (gems) despite not co-localising with them (https://doi.org/10.1016/j.celrep.2012.08.025). We have added these data to the manuscript, as part of draft figure 8, and will add text emphasising this as an additional example of disease-causing macromolecules affecting the structure of LLPS domains in which they are not found.

      1. The authors use 1,6-hexanediol to suggest that PBs and SGs in HeLa cells show behavior analogous to LLPS. However, the use of 1,6-hexanediol to establish an assembly as a LLPS is a relatively limited analysis (despite its widespread use in the field), since this compound can affect the formation of multiple cellular substructures that are not always LLPS (for example, see Wheeler et al, 2016, eLife).

      We are aware of and have cited this publication. Our comments about LLPS structures are measured, as there is still controversy about how to definitively identify them in cells. SGs and PBs have, however, previously been widely published to be formed by LLPS. The rapid exchange of SG and PB components during FRAP and the ability of SGs to both fuse and bud (seen in our supplementary movies) are also supportive of these structures behaving as LLPS structures in our models. Wheeler et al. showed that, in yeast, the nuclear pore complex and some cytoskeletal structures were affected by 1,6-hexanediol but membrane-bound structures such as the ER and mitochondria were not. The disruption of the nuclear pore complex is not unexpected, since phase separation is involved in cargo shuttling through the NPC (reviewed in https://doi.org/10.1016/j.devcel.2020.06.033). We will revise our discussion to make it clearer that we are not relying only on the use of 1,6-hexanediol to define SGs and PBs as LLPS structures but also on other aspects of their dynamic behaviour and on extensive prior literature.

      Significance

      This study would be of interest to the field if the impact of the DM1 repeat RNAs on PB and SG were more substantial...

      As above, the effects we see on SG formation and loss are substantial. Tissue types affected in DM1 are prone to stress, particularly the lens of the eye, so alterations to cellular response to stress associated with the presence of CUG repeats are of key importance to understanding the cellular pathology of DM1.

      ...and if some functional consequences were demonstrated.

      In terms of function, we show altered responses to stress caused by the expression of CUGexp RNA and probably mediated through alterations in the propensity of LLPS cytoplasmic structures (SGs and PBs) to form and be resolved. Additionally, we can now show that SGs in HeLa cells expressing CUGexp RNA contain less total polyA RNA than is seen in controls, and that ‘docking’ events between SGs and PBs are compromised in cells with CUGexp RNA. These docking events are proposed to mediate transfer of RNA from SGs to PBs (reviewed in https://doi.org/10.1007/978-1-4614-5107-5_12). These new data demonstrate functional impairment of SGs and PBs associated with DM1. We have included this as an additional draft figure 8.

      The lack of a strong effect on SG or PB formation in the DM1 models, along with the CUG repeat-independent effect of MBNL1 on the formation and dispersal of these complexes, argues that MBNL1/CUGBP1 may not significantly affect the formation or dispersal of SGs and PBs.

      We are actually not arguing that MBNL1 and CUGBP1 are the main effectors in the changes we see to SGs and PBs, but that the CUGexp RNA is the key player, so we are a little confused by this comment.


      Reviewer #2:

      In the current study, the authors compared the dynamics of P-bodies (PBs) and stress granules (SGs) between control and several DM1 cell lines. They found that MBNL1 and CUGBP1, two CUG repeat RNA-binding proteins that are primarily nuclear, could also co-localize with PBs in the cytoplasm and re-localize to SGs under stress. Small differences were observed in SG assembly and disassembly dynamics between control and DM1 HLECs, between HeLa cells expressing either CTG12 or CTG960, and between HeLa cells with and without shRNAs targeting CUGBP1 or MBNL1.

      As detailed above, the alterations in SG assembly and disassembly in cells expressing CUGexp RNA are not small, in contrast to those in cells with lowered expression of MBNL1 and CUGBP1, which are much smaller, suggesting that the changes caused by CUGexp RNA largely do not result from loss of MBNL1 (or CUGBP1). We have inserted additional graphs of ‘proportion of cells with stress granules against time’ and will modify the text to emphasise both of these points.

      Overall, the experiments were clearly described and the results properly presented. However, critical controls, as detailed below, are missing in multiple analyses. The mechanisms underlying these apparent differences are also unknown.

      We do not consider that any ‘critical controls’ are missing, but can supply all of the additional analysis of our data that the reviewer requests below. We can also now provide additional mechanistic insight and will add an additional figure showing lowered amount of polyA RNA in stress granules in cells expressing CUGexp RNA and compromised docking events between stress granules and P-bodies, suggesting impaired communication between them.

      Major concerns:

      1. Throughout the study, the authors compared MBNL1 and CUGBP1 association with PBs and SGs without considering the potential differences in their cytoplasmic abundance between control and DM1 cell lines, which seems to be the case for MBNL1 abundance in CTG960-expressing HeLa cells (Fig. 3). Provided that PBs and SGs exchange components with the cytosol at an equilibrium, if the cytoplasmic abundance of, for example, MBNL1 is decreased in DM1, one would expect the equilibrium to be shifted, resulting in less MBNL1 associated with PB/SG. Therefore, before measuring the association or the assembly/disassembly kinetics of PB and SG, the authors should first test whether MBNL1 and CUGBP1 abundance may be different between control and DM1 cell lines.

      There is, in fact, no difference in the relative cytoplasmic abundance of GFP-MBNL1 between CTG12 and CTG960- expressing HeLa cells. Each has approximately a 50/50 split between nucleus and cytoplasm, with <3% of nuclear GFP-MBNL1 found in nuclear CUGexp foci when they are present. We have added a graph demonstrating this to the supplementary data. The abundance of total endogenous MBNL1 is also not altered in DM1 patient-derived lens cell lines compared to controls, as shown by semi-quantitative western blot analysis, which we have also added to the supplementary data. However, if the expression of CUGexp RNA did cause a major loss of cytoplasmic MBNL1, this change would be reflective of the situation seen in DM1 and would not invalidate our results or conclusions.

      The same caveat applies to MBNL1/CUGBP1 knockdown experiments, where knocking down one may change the abundance of the other.

      To carry out FRAP experiments or live cell analysis of SG formation and loss, it is necessary to over-express a tagged version of the protein being studied. For the knockdown experiments shown in figure 6, therefore, when we knocked down MBNL1, CUGBP1 was present in excess as a GFP-tagged protein and when we knocked down CUGBP1, MBNL1 was present in excess as a GFP-tagged protein. Thus, any effects of the knockdowns on expression of the endogenous proteins being analysed would be highly unlikely to influence the results.

      1. Similarly, the authors did not consider the possibility that changes in SG/PB dynamics may be due to changes in the abundance/availability of essential SG/PB components such as GE1 and G3BP1.

      From our immunofluorescence experiments, there was certainly no obvious reduction in GE1 or TIA1 abundance (we did not assess G3BP1). We have quantitative proteomic analysis (unpublished) from a similar pair of cell lines, expressing CUGexp RNA alongside GFP rather than GFP-MBNL1. This shows no change in GE1 or G3BP1, so we would not expect to see any here either. We can easily carry out a quantitative western blot analysis to confirm this and will add it to the supplementary data.

      1. Most of the observed differences between control and DM cell lines were modest, leaving one to wonder whether they could be simply due to cell line-to-cell line variability. Whenever possible, the authors should present results for each individual line. For example, in Fig. 2, 3 DM1 lines and 2 control lines were used. Was the difference in SG disassembly (Fig. 2B) observed in each of the 3 lines?

      Some of the alterations were modest and there is cell line-to-cell line variability in the lens cell lines. This is why we pooled the data: on average, DM1 cells disperse their SGs more quickly than control cells do. This is not an unusual way to present data from patient cell lines of diverse genetic background. We have added data for stress granule loss in the individual cell lines to the supplementary data. These data show a consistent trend towards quicker dispersal of stress granules in patient cell lines. The variability between the patient lens cell lines was also the primary reason for us to develop the inducible system in HeLa cells, on a fixed genetic background, as explained in the manuscript.

      Minor points:

      1. Western blot in Fig. 3 shows two protein products from both endogenous and overexpressed MBNL1. Please explain.

      Many of the commercially available anti-MBNL1 antibodies show this double-band in some cell lines as evidenced in numerous publications and on manufacturers’ websites (for example https://abclonal.com/catalog-antibodies/MBNL1RabbitmAb/A5149, https://www.ptglab.com/products/MBNL1-Antibody-66837-1-Ig.htm). We haven’t analysed the two bands in detail, but assume this to be a result of a post translational modification of some sort. Since GFP-MBNL1 and endogenous MBNL1 show the same thing, we do not consider it to be a major concern. We do mention the double-band as ‘characteristic’ in the figure legend for figure 3 so are not seeking to conceal anything here.

      1. No data were shown to substantiate the statement that "MBNL1 localises to CUGexp foci and CUGBP1 does not" (page 6).

      This has been published many times and is shown in figure 1A. However, we will add in a citation for this and have added an additional supplementary figure showing the lack of co-localisation in the foci from figure 1A more clearly together with separate data confirming that MBNL1 and CUGexp RNA do not co-localise with CUGBP1 in the nuclei of line HeLa_CTG960_GFPMBNL1.

      1. The y-axis of Fig. 4D should not go beyond 1.

      We will trim the axis. There are no data points above 1.0, just the indicator of statistical significance.

      Significance:

      The nature of the current study is highly descriptive with little mechanistic insights.

      Our work is not descriptive, as we observed a change in stress granules in patient cells, which we could then replicate (and enhance) in a novel inducible model of DM1 designed to abrogate the unavoidable variation in patient-derived cell lines. We also now have additional mechanistic insights (see above) and have added an additional figure (draft figure 8) detailing these.

      For the subtle differences observed between control and DM1 cell lines, it remains unclear whether it may be due to cell line-to-cell line variation (see above).

      We cannot completely rule out an influence of cell line-to-cell line variation in the patient-derived lens cell lines (see above), though we think this unlikely as we saw the same effect repeated and amplified in the inducible HeLa-derived cell model, which was designed to minimise this concern. Furthermore, for stress granule loss, we see a larger effect in the HeLa cell model after 72hrs of induction than after 24hrs (figure 5C). This argues strongly that the effects seen are due to the expression of CUGexp RNA and we will emphasise this point more strongly.

      Some differences appear to be specific to one model but not the others (e.g., SG formation is slower in HeLa-CTG960 cells but not in DM1 HLECs).

      Even for the observations that seem consistent between models, the current results yielded little novel biological insights into whether and how these subtle differences in PB/SG dynamics may relate to DM1 pathogenesis. Collectively, these weaknesses render the current study incremental at best.

      The key biological insight the results provide is that the presence of the CUGexp repeat RNA results in defects in LLPS structures that are largely separable from any sequestration of MBNL1 in nuclear foci. With many researchers attributing the cellular defects in DM1 simply to the loss of MBNL1 by sequestration into nuclear foci, both this separation of altered stress response from MBNL1 levels and the involvement of altered LLPS formation (evidenced by the changes in PB behaviour on 1,6-hexanediol treatment) are novel biological insights into the cellular pathology of DM1. Additionally, our results shift the emphasis from nuclear effects to those seen in the cytoplasm.

      In terms of specific DM1 pathogenesis, the eye lens is subject to constant repeated stress and to continued growth throughout the life span, relying on lens epithelial cells as a stem cell pool. Epithelial cells are also vital to the homeostatic regulation of ions, growth factor and nutrient flow from the aqueous humor to the underlying fibre cells. Any alterations in the response of lens epithelial cells, in particular, to stress are highly relevant to the pathology of cataract seen in DM1. We will revise our discussion to emphasise these key points more strongly.


      Reviewer #3

      The manuscript entitled "Phase-separated stress granules and processing bodies are compromised in Myotonic Dystrophy Type 1" by Gulyurtlu et al., characterizes the composition and dynamics of stress granules and P-bodies in two Myotonic Dystrophy Type 1 (DM1) cell models, human lens epithelial cells from DM1 patients and age-matched controls and HeLa_CTG12_GFPMBNL1 and HeLa_CTG960_GFPMBNL1 cell lines. The manuscript is somewhat descriptive with lack of functional data and some discrepancies. For example, in the discussion section, the authors conclude that "MBNL1 appears to be absent from P-bodies in cells with CUGexp foci in their nuclei. This observation suggests that the role of MBNL1 in P-bodies may be disrupted by the presence of CUGexp RNA." Figure 4A shows that "P-bodies in the DM1 model line, HeLa_CTG960_GFPMBNL1 do not contain detectable amounts of GFPMBNL1". However, Figure 4E shows similar levels of total cellular MBNL1 per PB between the control CTG12 and mutated CTG960 lines.

      There is no discrepancy here. The reviewer has misinterpreted our data. PBs in the HeLa CTG960 cell line do not contain detectable amounts of GFP-MBNL1 under normal growth conditions, as shown in figure 4A. The data shown in figure 4E concern arsenite-treated cells, where some PBs in the CTG960 line do contain detectable levels of GFP-MBNL1, but significantly less than in the control CTG12 cells. We will reword these sections to make sure this is clear.

      Most importantly, in Figure S3 the authors show that CUGexp foci are present in 1-2 % of the cells. The claim appears to be too strong for the data presented in the manuscript.

      This is not what is shown in figure S3. The reviewer has misinterpreted our data. Figure S3 shows that in cells from line CTG960, only 1-2% of the total nuclear GFP-MBNL1 signal is found in the CUGexp foci, despite the intensity of the signal within them. Virtually all of the cells from the CTG960 cell line contain CUGexp foci (>95%). We will add a statement to this effect into the results section. We would not have continued working with a cell line in which only 1-2% of cells showed the DM1 phenotype of nuclear CUGexp RNA foci.

      Although the findings are interesting and of potential impact for a better understanding of the implications of RNA-protein condensate dynamics in the pathogenesis of DM1, the work presented here is still descriptive and preliminary in my opinion. In summary, the conclusions are not so convincing and additional experiments are essential to support the authors' claims.

      This reviewer seems to have misinterpreted several of our data sets, including the specific points above, leading to the assertion that our conclusions are not convincing.

      Several months of work will be required to consolidate data and reorganize and ameliorate the manuscript, including the way data are presented and quantified.

      We already have data with which to address the majority of the queries posed, so should be able to make the adjustments relatively quickly.

      Specific comments:

      "On removal of stress, clearance of stress granules is mediated largely by a form of autophagy." This statement is not correct since the majority of stress granules disassemble and are not targeted to autophagy; in healthy cells only 5 % (or less) of the total SGs tend to persist in presence of autophagy or lysosome inhibitors, while the vast majority disassembles. Please rephrase carefully.

      The degree and manner of dispersal of stress granules in healthy cells on removal of stress are not well understood, but are known to differ depending on the type and duration of the stress (DOI: 10.1126/science.abj2400). We do not yet know how this may be altered in DM1; however, compromised autophagy is implicated in cataract formation, which is of relevance to our study. We will re-phrase this section of the discussion carefully to reflect the complex situation.

      Figure 1: RNA-protein complexes have heterogeneous composition. In HLECs, do all PBs colocalize with MBNL1 and CUGBP1 or only a fraction of them?

      We do not routinely see PBs without MBNL1 or CUGBP1 in the HLECs, in contrast to the situation in the HeLa CTG960 line. We have data available in order to quantify this and will add the numbers to the text of the results.

      Figure 2: Stress granules and P-Bodies are known to touch each-other, a process referred to as a "kissing event". The authors have studied the mobility of GFP-MBNL1 inside these two types of assemblies. It would be important also to quantify the "kissing" events. Is this altered in DM1 cells?

      We couldn’t find reference in the literature to ‘kissing events’ between SGs and PBs, but found several references to ‘docking’ events. We have noticed such interactions between PBs and SGs in our models. We are currently quantifying this and our first experiments (one in the HeLa cell model and one comparing one of the patient-derived lens cell lines to a control) suggest that there is a change in the frequency and/or size of such interactions in the HeLa CTG960 cell line compared to the CTG12 control and in the DM1 lens cell line compared to its control. If this holds true in our repeat experiments (currently in progress), this would also provide the mechanistic insight requested by reviewer 2. We have included this, together with our data showing a decrease in total polyA RNA in stress granules in HeLa cells expressing CUGexp RNA, as an additional draft figure 8.

      Figure 3: In HeLa cells overexpressing CTG960_GFPMBNL1, beside the accumulation of one bright CUGexp puncta, several intranuclear GFPMBNL1 protein foci are visible. This subcellular distribution is different from the one observed in the control HeLaCTG12 GFPMBNL1. Can the author describe what these intranuclear GFPMBNL1 protein foci are?

      In most cells expressing CUGexp RNA, several nuclear foci form (usually one large and several smaller) and all of them contain MBNL1 (or GFP-MBNL1 in the HeLa_CTG960_GFPMBNL1 cell line). Figure S3 shows object identification using MBNL1 in this cell line, with two clear foci detected as the reviewer points out. We have added an additional panel to supplementary figure 1 to confirm that the additional foci are also CUGexp RNA foci and will clarify in the text of the results that there is not a single focus of CUGexp RNA in each nucleus.

      Is GFPMBNL1 accumulating at the level of splicing speckles? Or paraspeckles? Or other types of intranuclear condensates such as e.g. PML nuclear bodies? The different intranuclear distribution of GFPMBNL1 should be better characterized.

      The sub-nuclear distribution of MBNL1 is, indeed, very complex. MBNL1 also sometimes co-localises to splicing speckles/interchromatin granule clusters, as we have previously reported in lens epithelial cell lines (DOI: 10.1042/BJ20130870). The details of differences in the nuclear distribution of MBNL1, beyond its accumulation in CUGexp RNA foci, in DM1 cells compared to controls are the subject of another manuscript we have in preparation but are beyond the scope of the current study.

      Moreover, the % of cells expressing CTG960_GFPMBNL1 and forming intranuclear CUGexp foci is only mentioned in the discussion (Figure S3); for clarity it should be reported in the main text when describing Figure 3.

      The number of cells forming nuclear CUGexp foci on expression of CTG960_GFP-MBNL1 is >95% and we will add this to the text of the results section.

      "Figure S2: Quantitation of GFPMBNL1 in P-bodies in HeLa cell model of DM1." The authors report in the legend "Some, but not all, of these P-bodies contain detectable amounts of GFPMBNL1". However, the figure only shows a representative image of cells without quantification. Quantitation should be provided.

      We have data available to provide this simple quantitation. Approximately 38% of PBs in arsenite-treated cells from line HeLa_CTG960_GFPMBNL1 contain detectable levels of GFPMBNL1 using a manually-assigned cut-off intensity. We will add this to the relevant figure legend (now figure S5). However, this method of analysis requires an intensity to be manually set above which GFP-MBNL1 signal is considered ‘detectable’. This is hugely subjective, and in our opinion, the automatically generated quantitative comparison of “% total cellular MBNL1 per P-body” as shown in figure 4E is a more experimentally robust way to demonstrate a small loss of MBNL1 from P-bodies in cells from line HeLa_CTG960_GFPMBNL1 treated with arsenite compared to the relevant control.
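
      The contrast between the two quantitation approaches described above can be illustrated with a minimal sketch. It is not the analysis pipeline used for the manuscript: the function names, the input lists and the example cut-off are hypothetical, and real measurements would come from segmented P-bodies in imaging software. The first function depends on a manually chosen cut-off, whereas the second expresses each P-body's integrated signal as a percentage of the whole-cell signal and needs no threshold.

```python
def fraction_above_cutoff(pb_mean_intensities, cutoff):
    """Fraction of P-bodies whose mean GFP signal exceeds a manual cut-off.

    The cut-off is a hypothetical, user-chosen value, which is why this
    measure is subjective.
    """
    if not pb_mean_intensities:
        raise ValueError("need at least one P-body measurement")
    detectable = sum(1 for value in pb_mean_intensities if value > cutoff)
    return detectable / len(pb_mean_intensities)


def percent_of_cell_total_per_body(pb_integrated_signals, cell_total_signal):
    """Integrated signal of each P-body as a percentage of the cell total.

    No threshold is involved, so the result does not depend on a manual choice.
    """
    if cell_total_signal <= 0:
        raise ValueError("total cellular signal must be positive")
    return [100.0 * signal / cell_total_signal for signal in pb_integrated_signals]


if __name__ == "__main__":
    # Illustrative numbers only; they are not data from the study.
    print(fraction_above_cutoff([5, 12, 20, 3, 8], cutoff=10))
    print(percent_of_cell_total_per_body([1.0, 2.0], cell_total_signal=1000.0))
```

      Changing the cut-off in the first function changes the reported fraction, while the second measure is unaffected, which is the sense in which the threshold-free comparison is the more robust of the two.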

      The authors report "a subtle change in stress granule architecture associated with the presence of CUGexp RNA". This statement is not supported by experimental data and should be omitted.

      We will qualify this statement to make it clear that we are referring to a subtle alteration in the co-localisation between CUGBP1 and MBNL1 specifically in the SGs, as our experimental data shown in figure 4D clearly support that, showing a statistically significant increase in the Pearson’s co-efficient of co-localisation between MBNL1 and CUGBP1 in cells containing CUGexp RNA compared to the relevant control (0.90+/-0.05 for CTG960; 0.87+/-0.07 for CTG12).
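
      For readers unfamiliar with the measure, the Pearson’s co-efficient referred to above is the standard correlation between the pixel intensities of two channels within a region of interest. The following is a minimal sketch under stated assumptions: the function name and the flat lists of paired pixel intensities are hypothetical, and in practice the calculation would be done by image analysis software over the pixels of each segmented stress granule.

```python
from math import sqrt


def pearson_coefficient(channel_a, channel_b):
    """Pearson correlation between paired pixel intensities of two channels.

    Both inputs are equal-length sequences, one value per pixel in the
    region of interest (for example, a single stress granule).
    """
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("channels must have equal length of at least 2 pixels")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    spread_a = sum((a - mean_a) ** 2 for a in channel_a)
    spread_b = sum((b - mean_b) ** 2 for b in channel_b)
    if spread_a == 0 or spread_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return covariance / sqrt(spread_a * spread_b)


if __name__ == "__main__":
    # Illustrative intensities only; perfectly proportional channels give 1.
    print(pearson_coefficient([1, 2, 3, 4], [2, 4, 6, 8]))
```

      A value near 1 indicates that the two signals rise and fall together pixel by pixel. The measure needs enough pixels per structure to be meaningful, which is consistent with the reluctance expressed elsewhere in this reply to apply it to very small structures such as P-bodies.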

      Figure 4. MBNL1 and CUGBP1 co-localise in P-bodies. What is the % of colocalization?

      We’re not sure exactly what is being requested here or what biological question the reviewer is asking us to address. MBNL1 and CUGBP1 co-localise in virtually all PBs (except in the HeLa CTG960 line where MBNL1 is undetectable in PBs under normal growth conditions). Figure 4E shows that, in cells with PBs upregulated by sodium arsenite, the mean amount of total cellular MBNL1 per PB is 0.1%, so it will be similar in cells grown under normal conditions as the PB sizes are similar and they appear to be of similar brightness by immunofluorescence. Again, this would be straightforward to quantify with our existing data if this is, indeed, what the reviewer is requesting, but we question the biological significance. We would be reluctant to derive a Pearson’s co-efficient for the degree of co-localisation between CUGBP1 and MBNL1 in P-bodies as the structures are too small in size for this to be meaningful within the limits of imaging capabilities. We could, however, provide this if this is a specific request.

      Figure 5: "Treatment with sodium arsenite was then carried out under time-lapse microscopy, with Z-stacks of images taken every 4 minutes until stress granule formation was clearly seen (Fig.5A). This revealed a pronounced delay in formation of stress granules in cells containing CUGexp foci (HeLa CTG960 GFPMBNL1, 36 min +/- 12) compared to those without (HeLa CTG12 GFPMBNL1, 15 min +/- 2) (Fig.5B)." Data representation in Figure 5 is unclear and the pronounced delay in stress granule formation is not appreciated. Since the authors performed live imaging taking pictures every 4 minutes, it would be more informative to plot the data and show the assembly and disassembly kinetics over time for both control and CTG960_GFPMBNL1 cell lines (similar to what is shown in e.g. Gwon et al., Science 2021, Ubiquitination of G3BP1 mediates stress granule disassembly in a context-specific manner, Figure 2G).

      The bar graph in figure 5B shows that cells from the CTG960 line take more than twice as long as controls to form SGs and that their SGs are lost in half the time, with the precise numbers given in the text. A simple bar graph seemed the clearest way to present this. However, we have plotted our existing data in a manner similar to that in the cited reference and added this to figure 5. These graphs clearly show that the differences we see are at least as great as in other published literature, including the reference given by the reviewer (see below).
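
      As an illustration of how per-cell timings from time-lapse imaging can be converted into a ‘% of cells with stress granules against time’ curve, a minimal sketch follows. It assumes, hypothetically, that each cell has been scored for the first frame at which stress granules were seen (or None if none formed during imaging); the function name and the example values are not taken from the study, and only assembly, not loss, is modelled.

```python
def percent_cells_with_granules(onset_times_min, time_points_min):
    """Percentage of cells that have formed stress granules by each time point.

    onset_times_min: per-cell time (minutes) at which granules were first
        seen, or None if the cell never formed granules during imaging.
    time_points_min: the imaging time points (minutes) to report.
    """
    total_cells = len(onset_times_min)
    if total_cells == 0:
        raise ValueError("need at least one scored cell")
    curve = []
    for time_point in time_points_min:
        cells_with_granules = sum(
            1 for onset in onset_times_min
            if onset is not None and onset <= time_point
        )
        curve.append(100.0 * cells_with_granules / total_cells)
    return curve


if __name__ == "__main__":
    # Illustrative onsets only, sampled on a 4-minute imaging interval.
    onsets = [12, 16, 40, None]
    print(percent_cells_with_granules(onsets, [0, 16, 40]))
```

      Plotting the returned percentages against the time points for each cell line gives a kinetic curve of the kind the reviewer requests, while the mean and SD of the onset times give the summary values reported in a bar graph.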

      Concerning Figure 1, the authors report no difference in the kinetics of stress granule formation in HLECs. However, they only report data after 45 and 60 min of arsenite treatment; at these time-points the assembly step is maximal. Thus, for consistency, the authors should include earlier time-points in the analysis of stress granule assembly also in HLECs, similar to what was done in HeLa cells in Figure 5.

      The assembly step is not ‘maximal’ in these cell lines after 45 minutes. Figure 2A clearly shows that only ~30% of cells have SGs after 45 minutes of treatment, compared with 100% of cells after 90 minutes shown in figure 2B. We have additional data at 10, 20 and 30 minutes all showing no significant differences. We had omitted them to keep the graph simple, but have now included them as a graph of ‘% of cells with stress granules against time’ in figure 2.

      "Having established that MBNL1 and CUGBP1 co-localise closely in stress granules": the authors investigated the colocalization of each of these two proteins with stress granule markers but they did not verify whether MBNL1 and CUGBP1 co-localise.

      In figure 1B we show that endogenous CUGBP1 and endogenous MBNL1 both co-localise with the stress granule marker TIA1 in stress granules in lens epithelial cells. It would, therefore, be highly unlikely that CUGBP1 and MBNL1 would not co-localise with each other in stress granules. We have also previously verified that GFPMBNL1 behaves identically to its endogenous counterpart (Coleman et al, 2014). Furthermore, in figure 4C and 4D, we show close co-localisation between endogenous CUGBP1 and GFPMBNL1 in stress granules in our HeLa cell model, using high-resolution AiryScan microscopy for which we provide detailed quantitation.

      This aspect should be addressed experimentally since the authors also conclude that "a complex relationship between MBNL1 and CUGBP1 in stress granules" exists. Thus, the authors need to assess the colocalization of GFPMBNL1 with endogenous CUGBP1 in stress granules and the one of GFPCUGBP1 with endogenous MBNL1.

      The complex relationship we propose is based on the effects of CUGBP1 or MBNL1 knockdown on the dynamic behaviours of each other by FRAP assay and not solely on their co-localisation, although we have already analysed their co-localisation in detail as above.

      Figure 6: Please add antibody labeling to microscopy panels A and B.

      Certainly, this was an accidental omission and has been added.

      Moreover, specify if the numbers refer to minutes in panel F. The data representation is also unclear - see comment above, Figure 5.

      As stated in the figure legend and on the graph axes, these numbers have been normalised to the mean time taken for SG formation/loss in the control CTG12 cell line (set at 100%). The precise numbers in minutes for mean and SD are given in the text. We have added additional graphs of ‘% of cells with stress granules against time’ to this figure, with the values in minutes given to clarify the exact time-scale.
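
      The normalisation described above (times expressed relative to the mean of the control line, set at 100%) amounts to a single division. A minimal sketch, with a hypothetical function name and illustrative values rather than data from the study:

```python
def normalise_to_control(times_min, control_times_min):
    """Express times (minutes) as a percentage of the control mean.

    The mean of control_times_min is defined as 100%.
    """
    if not control_times_min:
        raise ValueError("need at least one control measurement")
    control_mean = sum(control_times_min) / len(control_times_min)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return [100.0 * value / control_mean for value in times_min]


if __name__ == "__main__":
    # Illustrative values only: a control mean of 15 min maps 30 min to 200%.
    print(normalise_to_control([30, 45], [10, 20]))
```

      Because the normalised values are unitless percentages, the axis cannot be read in minutes; reporting the absolute mean and SD alongside, as done in the text, preserves the time-scale.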

      Figure 7: was 1,6-hexanediol added in presence of arsenite? Or was arsenite removed?

      Arsenite was not removed (neither was Doxycycline) as we wanted to examine the effect of 1,6-hexanediol on SGs and PBs without the added complication of the effects of stress removal. We will clarify this point in the methods/results.

      "Aberrant persistent stress granules have been implicated in age-related (Mateju et al., 2017) and neurodegenerative diseases (Protter and Parker, 2016), such as ALS and FTD (Jain et al., 2016; Markmiller et al., 2018; Zhang et al., 2018). These are proposed to result from increased liquid-to-solid phase transitions within the stress granules (Mateju et al., 2017)." The authors should better define what are aberrant stress granules (e.g. see Ganassi et al., 2016; Turakhiya et al., 2018, PMID: 29804830).

      We will expand on this subject in the discussion.

      "Persistent stress granules have long been associated with degenerative conditions, notably ALS (Li et al., 2013)". I suggest updating the reference adding a more recent one.

      We selected this 2013 review to emphasise that there is a long history of association of persistent stress granules with degenerative conditions. We will add in an additional, more recent review.

      Significance

      The work is descriptive; thus, in this form I do not consider that it is strongly advancing the field.

      Having noted alterations to stress granule disassembly in lens epithelial cells from DM1 patients, we went on to develop a novel inducible model in which we replicated and enhanced these effects by expressing the large CUGexp RNA that causes DM1 as part of a DMPK mini-gene mimicking the genetic mutation seen in DM1 patients. This is not purely descriptive. Furthermore, we are now in a position to add an additional figure showing two pieces of evidence for functional defects in stress granules associated with CUGexp RNA expression: 1) reduced accumulation of total polyA RNA in stress granules, indicating compromised function, and 2) compromised ‘docking’ events between stress granules and P-bodies, a process proposed to be integral to the function of both structures.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript entitled "Phase-separated stress granules and processing bodies are compromised in Myotonic Dystrophy Type 1" by Gulyurtlu et al., characterizes the composition and dynamics of stress granules and P-bodies in two Myotonic Dystrophy Type 1 (DM1) cell models, human lens epithelial cells from DM1 patients and age-matched controls and HeLa_CTG12_GFPMBNL1 and HeLa_CTG960_GFPMBNL1 cell lines. The manuscript is somewhat descriptive with lack of functional data and some discrepancies. For example, in the discussion section, the authors conclude that "MBNL1 appears to be absent from P-bodies in cells with CUGexp foci in their nuclei. This observation suggests that the role of MBNL1 in P-bodies may be disrupted by the presence of CUGexp RNA." Figure 4A shows that "P-bodies in the DM1 model line, HeLa_CTG960_GFPMBNL1 do not contain detectable amounts of GFPMBNL1". However, Figure 4E shows similar levels of total cellular MBNL1 per PB between the control CTG12 and mutated CTG960 lines. Most importantly, in Figure S3 the authors show that CUGexp foci are present in 1-2 % of the cells. The claim appears to be too strong for the data presented in the manuscript.

      Although the findings are interesting and of potential impact for a better understanding of the implications of RNA-protein condensate dynamics in the pathogenesis of DM1, the work presented here is still descriptive and preliminary in my opinion. In summary, the conclusions are not fully convincing and additional experiments are essential to support the authors' claims. Several months of work will be required to consolidate the data and to reorganize and improve the manuscript, including the way data are presented and quantified.

      Specific comments:

      "On removal of stress, clearance of stress granules is mediated largely by a form of autophagy." This statement is not correct since the majority of stress granules disassemble and are not targeted to autophagy; in healthy cells only 5 % (or less) of the total SGs tend to persist in presence of autophagy or lysosome inhibitors, while the vast majority disassembles. Please rephrase carefully.

      Figure 1: RNA-protein complexes have heterogeneous composition. In HLECs, do all PBs colocalize with MBNL1 and CUGBP1 or only a fraction of them?

      Figure 2: Stress granules and P-bodies are known to touch each other, a process referred to as a "kissing event". The authors have studied the mobility of GFP-MBNL1 inside these two types of assemblies. It would also be important to quantify the "kissing" events. Are these altered in DM1 cells?

      Figure 3: In HeLa cells overexpressing CTG960_GFPMBNL1, besides the accumulation of one bright CUGexp punctum, several intranuclear GFPMBNL1 protein foci are visible. This subcellular distribution is different from the one observed in the control HeLaCTG12 GFPMBNL1. Can the authors describe what these intranuclear GFPMBNL1 protein foci are? Is GFPMBNL1 accumulating at the level of splicing speckles? Or paraspeckles? Or other types of intranuclear condensates such as PML nuclear bodies? The different intranuclear distribution of GFPMBNL1 should be better characterized. Moreover, the % of cells expressing CTG960_GFPMBNL1 and forming intranuclear CUGexp foci is only mentioned in the discussion (Figure S3); for clarity it should be reported in the main text when describing Figure 3.

      "Figure S2: Quantitation of GFPMBNL1 in P-bodies in HeLa cell model of DM1." The authors report in the legend "Some, but not all, of these P-bodies contain detectable amounts of GFPMBNL1". However, the figure only shows a representative image of cells without quantification. Quantitation should be provided.

      The authors report "a subtle change in stress granule architecture associated with the presence of CUGexp RNA". This statement is not supported by experimental data and should be omitted.

      Figure 4. MBNL1 and CUGBP1 co-localise in P-bodies. What is the % of colocalization?

      Figure 5: "Treatment with sodium arsenite was then carried out under time-lapse microscopy, with Z-stacks of images taken every 4 minutes until stress granule formation was clearly seen (Fig.5A). This revealed a pronounced delay in formation of stress granules in cells containing CUGexp foci (HeLaCTG960 GFPMBNL1, 36 min +/- 12) compared to those without (HeLaCTG12 GFPMBNL1, 15 min +/- 2) (Fig.5B)." Data representation in Figure 5 is unclear and the pronounced delay in stress granule formation cannot be appreciated. Since the authors performed live imaging, taking pictures every 4 minutes, it would be more informative to plot the data and show the assembly and disassembly kinetics over time for both control and CTG960_ GFPMBNL1 cell lines (similar to what is shown in e.g. Gwon et al., Science 2021, Ubiquitination of G3BP1 mediates stress granule disassembly in a context-specific manner, Figure 2G). Concerning Figure 1, the authors report no difference in the kinetics of stress granule formation in HLECs. However, they only report data after 45 and 60 min of arsenite treatment; at these time-points the assembly step is maximal. Thus, for consistency, the authors should include earlier time-points in the analysis of stress granule assembly in HLECs as well, similar to what was done in HeLa cells in Figure 5.
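
      As an illustration of the kind of kinetics plot requested above, the following is a minimal sketch of how a cumulative assembly curve could be derived from per-cell onset times. It assumes that each cell is scored for the first frame at which stress granules are visible; the function name, the onset values, and the 60-minute duration are hypothetical placeholders, not data from the manuscript. Only the 4-minute frame interval is taken from the quoted text.

```python
def assembly_curve(onset_times_min, frame_interval_min=4, duration_min=60):
    """Return (time, fraction of cells with stress granules) for each imaged frame.

    onset_times_min: first time point (in minutes) at which each cell shows
    stress granules. All values used below are hypothetical.
    """
    n_cells = len(onset_times_min)
    if n_cells == 0:
        raise ValueError("at least one cell is required")
    curve = []
    for t in range(0, duration_min + 1, frame_interval_min):
        # A cell counts as positive from its onset frame onwards.
        n_positive = sum(1 for onset in onset_times_min if onset <= t)
        curve.append((t, n_positive / n_cells))
    return curve


# Hypothetical onset times (minutes) for two cell lines, for illustration only.
control_onsets = [12, 16, 16, 20]
expanded_onsets = [24, 36, 40, 48]

control_curve = assembly_curve(control_onsets)
expanded_curve = assembly_curve(expanded_onsets)
```

      Plotting the two curves against time on the same axes would show both the delay and the spread between cells, which a single mean value with an error term does not convey.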

      "Having established that MBNL1 and CUGBP1 co-localise closely in stress granules": the authors investigated the colocalization of each of these two proteins with stress granule markers but they did not verify whether MBNL1 and CUGBP1 co-localise. This aspect should be addressed experimentally since the authors also conclude that "a complex relationship between MBNL1 and CUGBP1 in stress granules" exists. Thus, the authors need to assess the colocalization of GFPMBNL1 with endogenous CUGBP1 in stress granules and the one of GFPCUGBP1 with endogenous MBNL1.

      Figure 6: Please add antibody labeling to microscopy panels A and B. Moreover, specify if the numbers refer to minutes in panel F. The data representation is also unclear - see comment above, Figure 5.

      Figure 7: was 1,6-hexanediol added in presence of arsenite? Or was arsenite removed?

      Minor comments:

      "Aberrant persistent stress granules have been implicated in age-related (Mateju et al., 2017) and neurodegenerative diseases (Protter and Parker, 2016), such as ALS and FTD (Jain et al., 2016; Markmiller et al., 2018; Zhang et al., 2018). These are proposed to result from increased liquid-to-solid phase transitions within the stress granules (Mateju et al., 2017)." The authors should better define what aberrant stress granules are (e.g. see Ganassi et al., 2016; Turakhiya et al., 2018, PMID: 29804830).

      "Persistent stress granules have long been associated with degenerative conditions, notably ALS (Li et al., 2013)". I suggest updating the reference adding a more recent one.

      Significance

      The work is descriptive; thus, in this form I do not consider that it is strongly advancing the field.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In the current study, the authors compared the dynamics of P-bodies (PBs) and stress granules (SGs) between control and several DM1 cell lines. They found that MBNL1 and CUGBP1, two CUG repeat RNA-binding proteins that are primarily nuclear, could also co-localize with PBs in the cytoplasm and re-localize to SGs under stress. Small differences were observed in SG assembly and disassembly dynamics between control and DM1 HLECs, between HeLa cells expressing either CTG12 or CTG960, and between HeLa cells with and without shRNAs targeting CUGBP1 or MBNL1. Overall, the experiments were clearly described and the results properly presented. However, critical controls, as detailed below, are missing in multiple analyses. The mechanisms underlying these apparent differences are also unknown.

      Major concerns:

      1. Throughout the study, the authors compared MBNL1 and CUGBP1 association with PBs and SGs without considering the potential differences in their cytoplasmic abundance between control and DM1 cell lines, which seems to be the case for MBNL1 abundance in CTG960-expressing HeLa cells (Fig. 3). Provided that PBs and SGs exchange components with the cytosol at an equilibrium, if the cytoplasmic abundance of, for example, MBNL1 is decreased in DM1, one would expect the equilibrium to shift, resulting in less MBNL1 associated with PB/SG. Therefore, before measuring the association or the assembly/disassembly kinetics of PB and SG, the authors should first test whether MBNL1 and CUGBP1 abundance may be different between control and DM cell lines. The same caveat applies to MBNL1/CUGBP1 knockdown experiments, where knocking down one may change the abundance of the other.
      2. Similarly, the authors did not consider the possibility that changes in SG/PB dynamics may be due to changes in the abundance/availability of essential SG/PB components such as GE1 and G3BP1.
      3. Most of the observed differences between control and DM cell lines were modest, leaving one to wonder whether they could simply be due to cell line-to-cell line variability. Whenever possible, the authors should present results for each individual line. For example, in Fig.2, 3 DM1 lines and 2 control lines were used. Was the difference in SG disassembly (Fig. 2B) observed in each of the 3 lines?
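
      The abundance caveat in point 1 and the per-line reporting in point 3 can be illustrated together with a minimal sketch. It assumes that a granule signal and a surrounding cytoplasmic signal are measured in each cell, and it reports their ratio separately for every cell line. The function names, line labels, and intensity values are hypothetical and are not taken from the manuscript.

```python
from statistics import mean


def enrichment_ratio(granule_intensity, cytoplasm_intensity):
    """Granule signal normalised to the cytoplasmic signal of the same cell."""
    if cytoplasm_intensity <= 0:
        raise ValueError("cytoplasmic intensity must be positive")
    return granule_intensity / cytoplasm_intensity


def per_line_enrichment(measurements):
    """Mean enrichment ratio for each cell line, reported line by line."""
    return {
        line: mean([enrichment_ratio(g, c) for g, c in cells])
        for line, cells in measurements.items()
    }


# Hypothetical (granule, cytoplasm) intensities in arbitrary units.
measurements = {
    "control_line_1": [(200.0, 100.0), (220.0, 110.0)],
    "DM1_line_1": [(100.0, 50.0), (120.0, 60.0)],
}

ratios = per_line_enrichment(measurements)
```

      In this made-up example the raw granule signal in the DM1 line is roughly half that of the control, yet the enrichment ratio is identical in both lines, which is the situation the reviewer asks the authors to rule out.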

      Minor points:

      1. Western blot in Fig. 3 shows two protein products from both endogenous and overexpressed MBNL1. Please explain.
      2. No data were shown to substantiate the statement that "MBNL1 localises to CUGexp foci and CUGBP1 does not" (page 6).
      3. The y-axis of Fig. 4D should not go beyond 1.

      Significance

      The nature of the current study is highly descriptive with little mechanistic insight. For the subtle differences observed between control and DM1 cell lines, it remains unclear whether they may be due to cell line-to-cell line variation (see above). Some differences appear to be specific to one model but not the others (e.g., SG formation is slower in HeLa-CTG960 cells but not in DM1 HLECs). Even for the observations that seem consistent between models, the current results yielded little novel biological insight into whether and how these subtle differences in PB/SG dynamics may relate to DM1 pathogenesis. Collectively, these weaknesses render the current study incremental at best.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript by Gulyurtlu and co-workers addresses the role of CUG expanded repeat RNA associated with DM1 in regulating the formation of higher order RNP assemblies such as stress granules and P-bodies in the cell. The authors used lens epithelial cells (hLECs) derived from a DM1 patient or a HeLa cell inducible model of DM1 to investigate whether expression of the CUG repeat-associated proteins MBNL1 and CUGBP1 affected the formation and dispersal of stress granules and P-bodies. The authors show that MBNL1 and CUGBP1 are components of SGs and PBs in hLECs and HeLa cells. In cells expressing the CUG repeat, there are minor alterations in the dispersal of stress granules as well as in the formation of P-bodies. MBNL1 could affect the formation and dispersal of SGs independent of the CUG repeat. Finally, in HeLa cells, overexpression of MBNL1 can reduce the dispersal of P-bodies upon 1,6-hexanediol treatment.

      Major comments:

      One limitation of the work is that the perturbations seen with stress granules or P-bodies are all relatively small, and no evidence for a functional consequence on gene expression is demonstrated. Specifically, the authors observe only minor alterations in the formation or disaggregation of PBs and SGs in these DM1 models. Further, some of the effects observed are independent of the CUG repeat expression, suggesting that MBNL1 and CUGBP1 might have independent roles in modulating some properties of SG and PB formation or dispersal.

      1. The authors could investigate whether the CUG repeat RNA itself is localized to SGs or PBs in their models, and whether the presence of the repeat RNA is absolutely necessary for regulating the dynamics of SG or PB formation.
      2. The authors use 1,6-hexanediol to suggest that PBs and SGs in HeLa cells show behavior analogous to LLPS. However, the use of 1,6-hexanediol to establish an assembly as a LLPS is a relatively limited analysis (despite its widespread use in the field), since this compound can affect the formation of multiple cellular substructures that are not always LLPS (for example, see Wheeler et al, 2016, eLife).

      Significance

      This study would be of interest to the field if the impact of the DM1 repeat RNAs on PB and SG were more substantial, and if some functional consequences were demonstrated. The lack of a strong effect on SG or PB formation in the DM1 models, along with the CUG repeat-independent effect of MBNL1 on the formation and dispersal of these complexes, argues that MBNL1/CUGBP1 may not significantly affect the formation or dispersal of SGs and PBs.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1:

      This paper puts together a nice set of data showing that a specific gene called Resf1 when deleted affects the ability of ESCs to self-renew and proceed to germline fates. I believe the data are sound and that they provide the evidence needed for the authors to make their conclusions. While I think the data are presented well and the manuscript is well-written, the "modest" functional results suggest this work would be more suited for a specialized journal.

      We thank Reviewer 1 for their supportive comments.


      Reviewer #2:

      1. In the presence of LIF, there is no difference between Resf1 knockout mESCs and WT mESCs except the expression of Esrrb, Nanog and Pou5f1. What about other genes? RNA-seq is needed to distinguish the two cell lines.

      Fukuda et al. have shown that deletion of Resf1 leads to misregulation of ~1000 genes (adj. p-value 2) in presence of LIF. This highlights large differences between transcriptomes of Resf1 KO and WT cells that occur despite only a marginal difference in self-renewal efficiency between Resf1 KO cells and WT in the presence of LIF. It is therefore questionable whether the time and resources required to perform the requested RNA-seq would produce data that could unambiguously identify the potential causative effector difference downstream of Resf1.

      As an alternative approach, we have reanalysed the Fukuda et al RNA-seq data. We find that Esrrb is significantly downregulated (in agreement with our Q-RT-PCRs), as are Klf4 and LifR (FDR 1.5). However, our meta-analysis of the Fukuda et al data did not show Pou5f1 and Nanog to be differentially expressed (FDR 1.5). This is in line with the lower level of downregulation of Pou5f1 and Nanog, compared to Esrrb in our Q-RT-PCR data. Notably, our gene expression analyses were performed in 5 biological replicates, whereas Fukuda et al. performed RNA-seq in two biological replicates. We can include the meta-analysis of the Fukuda et al data in our submission. As the change in ESC self-renewal that we see at low LIF concentrations could result from a decrease in Lifr expression, we will verify the change in expression of Lifr by Q-RT-PCR. Importantly, we will do this in a way that discriminates between expression of the transmembrane Lifr and soluble LifR, since the latter acts antagonistically (PMID: 9396734, Chambers, BJ, 1997).


      2. The authors showed Resf1 is not required for Nanog function, so how does Resf1 regulate the expression of pluripotency genes? Through epigenetic modifications or signaling pathways? The authors should design experiments to explain the detailed mechanisms.

      The strength of the immunoblot signal for RESF1 is low, even when Resf1 is expressed episomally. Therefore, although we could try to co-immunoprecipitate with the Resf1-v5 cell line and endogenous Nanog, the expression level of RESF1 may mean this effort is unsuccessful. Given the fact that the result will not affect the conclusions of our study, we do not think this effort is justifiable.

      3. The authors showed that Resf1 interacts with Nanog, but they used forced expressed proteins. Does the endogenous Resf1 interact with endogenous Nanog? Do they bind to some same DNA sequences?

      These are important questions to answer. However, many more experiments would be required to reach firm conclusions. The reviewer is right to say that the mechanisms by which Resf1 affects pluripotency are unknown and remain to be answered in future. We therefore propose to improve the text discussing similarities in pluripotency phenotype between deletions of Trim28, SETDB1, YTHDC1 and RESF1. As deletion of the RESF1 partner SETDB1 or other proteins involved in repression of retrotransposons leads to downregulation of pluripotency genes and in some cases collapse of ESCs (e.g. PMID: 19884255, Bilodeau et al. 2009; PMID: 19884257, Yuan et al. 2009), we hypothesise that the RESF1 phenotype may be explained by affecting SETDB1 chromatin binding and therefore repression of SETDB1 targets. The mild phenotype of RESF1 KO indicates that RESF1 would not be an essential component of this repressor complex but rather “a modulatory protein”.

      It is also worth noting that the meta-analysis of the RNA-seq data from Fukuda et al. suggests that Resf1-null ESCs may express reduced levels of LifR mRNA, and this is something we plan to investigate.


      4. In figure 5C, some Resf1 positive cells showed Nanog negative. Are these Nanog negative cells pluripotent?

      Nanog-null ESCs are pluripotent (PMID: 18097409, Chambers et al., 2007). In addition, NANOG-negative cells in FCS/LIF cultures can retain pluripotency. Our purpose in this figure was therefore not to say whether NANOG-negative:RESF1-positive cells are pluripotent but to draw attention to the broader expression of RESF1 in FCS/LIF compared to NANOG. Such broader expression has also been noted for other heterogeneously expressed factors (PMID: 31582397, Pantier et al. 2019).

      5. In figure 6A, the naïve mESCs are induced to EpiLCs. Is the transition efficiency of Resf1 knockout cells the same with WT mESCs? The finally obtained PGCLCs should be identified.

      We show that the key TFs of the EpiLC state are expressed similarly in WT and Resf1 KO cells (Supplementary figure 4) and we have data showing that WT and Resf1 KO EpiLCs have a similar morphology. Together this suggests an efficient transition to an EpiLC state. Our analysis has identified expression of Blimp1/Ap2g/Prdm14 in Resf1-null cultures. Compared to wild-type cells these levels are reduced up to 3-fold. As this is from an unsorted population and the number of SSEA1/CD61-positive cells is decreased around 2x, this suggests that the PGCLC population formed by Resf1-null cells is reduced in proportion but is otherwise normal.

      We will add photographs of EpiLC colonies formed by Resf1 KO and WT cells.


      1. in figure 5c, the scale bar is missing.

      We will add the missing scale bars to Figure 5C.

      Reviewer #3:

      1. What was less clear was an explanation of why colonies 4 and 24 were chosen. Were there other colonies with the desired expression? Was this amount of expression repeated in replicative experiments with approximately 2 colonies only available to be selected?

      Approximately 30 colonies were selected for analysis. Of these, only 2 had deletion of both Resf1 alleles. We will make this point clearer in the text.


      1. Figure 1C, 5C and S2B with microscopic images should include a scale bar.

      Missing scale bars in Figure 1C will be added. Unfortunately, the microscopy setup used to collect the images in Figures 5C and S2B did not allow scale bars to be added at the time of imaging and these cannot be added retrospectively. However, we do not think that inclusion of scale bars, even were it possible, would affect the conclusions of our manuscript.

      1. Figure 1E needs a better explanation of the significance, "less clear cut" is not adequate. Reporting statistics, or lack of significance, on the graph would help.

      We will update the manuscript and Figure 1E to include the results of a statistical analysis (Wilcoxon rank-sum test) comparing formation of AP+ colonies between Resf1 KO and WT cells at different LIF concentrations. These results show that both Resf1 KO cell lines have a lower median number of AP+ colonies than WT cells at LIF concentrations 0 and 1 (p.adj. *
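
      For readers unfamiliar with the test named above, the following is a minimal sketch of a two-sided Wilcoxon rank-sum comparison followed by a multiple-testing adjustment across LIF concentrations. It uses the normal approximation with a tie correction and no continuity correction, and it assumes a Holm adjustment, since the adjustment method is not stated in the text. The function names and colony counts are hypothetical and do not reproduce the analysis in the manuscript.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U statistic for x, p-value). No continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    tie_term = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average rank shared by tied values
        for k in range(i, j + 1):
            ranks[k] = midrank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    r1 = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    n = n1 + n2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # all values tied: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    return u1, math.erfc(abs(z) / math.sqrt(2))


def holm_adjust(pvalues):
    """Holm step-down adjusted p-values (assumed method, see lead-in)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[idx]))
        adjusted[idx] = running_max
    return adjusted


# Hypothetical AP+ colony counts at two LIF concentrations.
wt = {0: [30, 34, 28, 33, 31], 1: [40, 42, 45, 39, 44]}
ko = {0: [12, 15, 10, 14, 11], 1: [25, 22, 27, 24, 21]}
raw_p = [rank_sum_test(ko[lif], wt[lif])[1] for lif in (0, 1)]
adj_p = holm_adjust(raw_p)
```

      With only a handful of replicates per group, an exact version of the test would normally be preferred over the normal approximation used in this sketch.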

      1. Its translatability to medicine, although perhaps that is not the intention, is somewhat lacking. Is there a naturally occurring situation where LIF is absent that would require this pathway to be used? These were mouse ESCs; perhaps this study could incorporate information about relevant translation to a human condition to aid in the significance. This manuscript suggests a mechanistic evaluation by which self-renewal can occur other than the canonical pathway, which is interesting and can inform the field.

      Our results suggest that RESF1 directly or indirectly supports self-renewal of ESCs. Interestingly, the Human cell atlas identified RESF1 expression as a negative predictor of survival in renal cancer, and RESF1 was found to be expressed in testis cancer cells and other cancer tissues. Therefore, RESF1 could promote self-renewal of cancer cells similarly to ESCs. However, this is speculative and needs further study. As this is outside both the scope of this manuscript and our expertise, we do not think it prudent for us to pursue this line of inquiry. However, we agree that further studies could evaluate RESF1 function in human tissues, especially pluripotent cells and germ cells. As we show that RESF1 deletion leads to reduced induction of PGCLCs and previous studies showed infertility of Resf1 KO mice, investigating the link between human fertility and RESF1 could have implications in reproductive medicine.

      We will improve our discussion to highlight the possible significance of RESF1 function in human fertility.





    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The authors aimed to study RESF1 in ESCs to understand its role in germ cell specification and PGCLC differentiation under in vitro experimental challenges.

      The experiments performed and reported were thorough and convincing. Data and methods were clearly explained.

      What was less clear was an explanation of why colonies 4 and 24 were chosen. Were there other colonies with the desired expression? Was this amount of expression repeated in replicative experiments with approximately 2 colonies only available to be selected?

      Figure 1C, 5C and S2B with microscopic images should include a scale bar.

      Figure 1E needs a better explanation of the significance, "less clear cut" is not adequate. Reporting statistics, or lack of significance, on the graph would help.

      Significance

      Understanding the specific interactions and suggested role of RESF1 in self-renewal is informative on a molecular biology and developmental biology level.

      Its translatability to medicine, although perhaps that is not the intention, is somewhat lacking. Is there a naturally occurring situation where LIF is absent that would require this pathway to be used? These were mouse ESCs; perhaps this study could incorporate information about relevant translation to a human condition to aid in the significance. This manuscript suggests a mechanistic evaluation by which self-renewal can occur other than the canonical pathway, which is interesting and can inform the field.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors uncovered the new roles of Resf1 in mESC self-renewal and germline entry. They showed that Resf1 deletion reduced mESC self-renewal, and it's not required for Nanog function. In addition, the efficiency of PGCLC specification of Resf1 knockout mESC is less than WT mESC. However, these conclusions are too preliminary and the underlying mechanism is missing.

      Major comments:

      1. In the presence of LIF, there is no difference between Resf1 knockout mESCs and WT mESCs except the expression of Esrrb, Nanog and Pou5f1. What about other genes? RNA-seq is needed to distinguish the two cell lines.
      2. The authors showed Resf1 is not required for Nanog function, so how does Resf1 regulate the expression of pluripotency genes? Through epigenetic modifications or signaling pathways? The authors should design experiments to explain the detailed mechanisms.
      3. The authors showed that Resf1 interacts with Nanog, but they used forced expressed proteins. Does the endogenous Resf1 interact with endogenous Nanog? Do they bind to some same DNA sequences?
      4. In figure 5C, some Resf1 positive cells showed Nanog negative. Are these Nanog negative cells pluripotent?
      5. In figure 6A, the naïve mESCs are induced to EpiLCs. Is the transition efficiency of Resf1 knockout cells the same with WT mESCs? The finally obtained PGCLCs should be identified.

      Minor comments:

      1. in figure 5c, the scale bar is missing.

      Significance

      The authors uncovered the new roles of Resf1 in mESC self-renewal and germline entry.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This paper puts together a nice set of data showing that a specific gene called Resf1 when deleted affects the ability of ESCs to self-renew and proceed to germline fates. I believe the data are sound and that they provide the evidence needed for the authors to make their conclusions.

      Significance

      While I think the data are presented well and the manuscript is well-written, the "modest" functional results suggest this work would be more suited for a specialized journal.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      First of all, we would like to thank each of the expert reviewers for their effort in evaluating our study. We are confident that we can positively address each of the issues and queries raised by the reviewers.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): **Summary:** This study investigates the functions of the tricalbin proteins in S. cerevisiae, which are homologs of the extended synaptotagmins in mammals. It suggests the tricalbins modulate plasma membrane phospholipid composition and are particularly important for surviving a shift to elevated temperature. The tricalbins are proposed to directly or indirectly promote phosphatidylserine transport from the ER to the plasma membrane during heat stress. They also promote the localization of the kinase Pkh1 to the plasma membrane during heat stress.

      **Major comments:**

      General Response: We thank the reviewer for the comments and opinions. While they are starkly different from the other two reviewers’, they have caused us to consider how we can add additional support for our conclusions and to consider alternative possibilities.

      1. To determine whether the tricalbins and other tethering proteins play a role in phospholipid homeostasis, lipid distribution is measured using biosensors (FLARES) and phospholipid levels are determined by mass spec. The experiments are well done but say little about what role the tricalbins or other tethering proteins play in homeostasis. There are no measurements of lipid transport or rates of phospholipid production, degradation or modification. It is reasonable to propose the tricalbins transport lipids, since other proteins with SMP domains do, but this study does not present any evidence that they do.

      Response: We thank the reviewer for stating that the quantitative microscopy and lipidomics experiments in our study are well done. However, we strongly disagree that the results do not provide new information on the roles of the tricalbins or other tethering proteins in membrane lipid homeostasis. In fact, the lipidomics results definitively show that Ist2 and Scs2/22 control phosphatidylserine (PS) levels. In contrast, loss of the tricalbins does not significantly affect PS levels. We will provide a new figure to make this even more clear to expert and non-expert readers.

      While the tricalbins do not regulate PS levels, the data clearly show that the tricalbins control PS acyl chain saturation and cellular distribution. A PS reporter is reduced at the plasma membrane (PM) upon loss of the tricalbins and we have now confirmed increased localization at the endoplasmic reticulum (ER). These new results will be included in a revised manuscript. As the lipid desaturase Ole1 is localized in the ER, the lipidomics are consistent with increased ER residency of PS species. Thus, the data indicate that while the tricalbins do not regulate PS levels, they control either delivery of PS from the ER to the PM, and/or they may control the organization and stability of PS at the PM which would be a novel finding on its own.

      The reviewer must be aware that there are currently no in vivo cell assays that directly (only) measure lipid transport. These experiments are subject to several factors including lipid metabolism, anterograde transport rates, bilayer organization, lipid accessibility, and retrograde transport rates. While our findings clearly show that the Tcb proteins do not control PS levels, we agree that there are alternative possibilities to explain the changes in PS distribution. In our revised manuscript, we will perform additional experiments to distinguish between these possibilities. New experiments will include mutant forms of Tcb3 bearing substitutions in the SMP domain. We will also examine whether the Tcb proteins control PS organization, availability/accessibility, and stability at the PM (also see Reviewer #3, comment 2). This latter possibility may reveal a novel concept regarding Tcb/E-Syt protein function that goes beyond their proposed conventional role as lipid transfer proteins. Based on the outcome of these experiments, we shall adjust the final cartoon model and conclusions in the discussion accordingly.

      The study convincingly demonstrates that there are fewer Pkh1 puncta formed after temperature shift in cells lacking tricalbins than in wild-type cells (Fig. 6 C,D). However, there is no demonstration that this change in localization alters Pkh1 function or signaling.

      Response: Regulation of Pkh1 by lipids is outside the scope of our current study that is focused on providing new understanding of Tcb protein function. However, the decrease in heat-induced Pkh1 puncta may provide insight into the PM integrity defects in cells lacking Tcb1/2/3 (as Pkh1 is required for PM integrity). To test whether Pkh1 function is compromised in the tcb1/2/3 mutant cells, we can test whether constitutively active Ypk1 (which acts downstream of Pkh1) rescues the PM integrity defects in tcb1/2/3 mutant cells.

      There is no demonstration that association of tricalbins and Sfk1 (Fig. 4) has any functional significance or affects phosphoinositide metabolism.

      Response: We thank the reviewer for raising this issue. If the association of Tcb3 and Sfk1 has functional significance, then one would expect that loss of the proteins should phenocopy one another. Deletion of the Sfk1 cytoplasmic domain necessary for co-localization with Tcb3 should also phenocopy loss of Tcb3. This is exactly what we find. Localization of the PS probe is decreased at 42°C upon loss of Sfk1 or truncation of the Sfk1 cytoplasmic tail, similar to cells lacking Tcb3. Furthermore, we find that Tcb3 regulates sterol homeostasis at the PM (using the D4H probe), as has been recently reported for Sfk1 (Kishimoto et al, 2021). Thus, Tcb3 and Sfk1 not only co-localize, but they also share common functions in PM lipid organization. These new results will be presented in our revised manuscript.

      The reviewer also inquired about potential roles for Tcb3 and Sfk1 in phosphoinositide lipid homeostasis, as Sfk1 has reportedly been implicated in heat-induced PI(4,5)P2 synthesis. However, while we find clear roles for Sfk1 and the tricalbins in PS and sterol homeostasis, we did not find a requirement for Sfk1 or the tricalbins in PI(4,5)P2 homeostasis upon heat stress conditions. These findings will be included in our revised manuscript. Importantly, our results indicate that the tricalbins and Sfk1 primarily control PS and sterol homeostasis at the PM, and may regulate phosphoinositides indirectly, and thus provide new insight into the key role of these proteins.

      The study proposes the tricalbins directly or indirectly promote phosphatidylserine transport after temperature shift, but transport has not been measured and other possibilities are not ruled out.

      Response: While the Tcb3 SMP domain has been shown to transfer phospholipids in vitro (Qian et al., 2021), we agree that a role in PS transfer in vivo should be examined in more detail and that other possible roles in PS homeostasis should also be considered (also see responses to Reviewer #3, comment 2).

      Upon heat shock, we not only observe a decrease in relative levels of the PS reporter at the PM in the tcb1/2/3 mutant cells (as shown in our original manuscript), but also a corresponding increase in relative levels of the PS probe at the ER and vacuole membrane (also see response to Reviewer #3, comment 5). This could reflect impaired delivery of PS from the ER to the PM and reduced stability of PS at the PM (i.e. increased internalization of PS into the cell).

      In our original manuscript, we showed that deletion of the SMP domain (the lipid transfer domain) phenocopies deletion of the full-length Tcb3 protein in terms of PS distribution and PM integrity following heat shock. To more rigorously test whether lipid transport activity of the SMP domain is responsible for these phenotypes, we will generate amino acid substitutions within the SMP domain of Tcb3 that maintain its overall structure but impair its ability to transport lipids (by targeting conserved key residues identified in Saheki et al., 2016). We will then assess whether SMP-mediated lipid transfer is necessary for PS homeostasis and PM integrity under heat stress.

      We also agree that other possibilities should be examined. First, to rule out a defect in PS production upon heat stress, we are performing new mass spectrometry lipidomics experiments to measure levels of individual phospholipid species in the tcb1/2/3 mutant and wild type cells after heat stress.

      Second, we have considered whether the Tcb proteins control phospholipid bilayer distribution (e.g. flip and flop). However, cells lacking the Tcbs are not hypersensitive to duramycin (Omnus et al. 2016) and thus phosphatidylethanolamine exposure on the extracellular leaflet is not increased. Moreover, cells lacking the Tcbs (and Scs2/22 and Ist2) are not impaired in the uptake of exogenous NBD-labelled phospholipids (and thus flip across the PM bilayer is not impaired). Possibly, there may be increased lipid ‘flop’ in the mutant cells at high temperature. We can test whether there is increased phospholipid exposure in the extracellular leaflet at high temperature, but our results thus far indicate accumulation of PS on internal membrane compartments (the ER and vacuole membrane).

      Another potentially exciting possibility is that the tricalbin proteins bind and stabilize PS within the cytosolic leaflet of the PM and prevent its internalization by endocytosis or non-vesicular transfer. This mechanism would be completely independent of lipid transport to the PM and would constitute new mechanistic insight into Tcb function. We will test whether PS (and sterol) becomes more accessible (less stable or reduced sequestered pools at the PM) and internalized into the cell, upon removal of the tricalbin proteins. For example, we will monitor PS distribution in cells where endocytosis is blocked with latrunculin A.

      As mentioned, there is currently no cellular lipid transport assay that directly (and exclusively) measures anterograde transport. However, if the Tcb3 SMP domain mutants are impaired in PS homeostasis and PM integrity, then we can consider monitoring PS transfer in vivo. By performing the experiments outlined here, we will have thoroughly characterised the roles of the tricalbin proteins in PS homeostasis at the PM. Moreover, the new findings may even reveal novel roles that are independent of transport.

      Reviewer #1 (Significance (Required)): While this study is likely to be of interest to those studying the tricalbins or phospholipid homeostasis, it is incremental and provides little conceptual advance on what is already known about the tricalbins and extended synaptotagmins. They have already been implicated in lipid homeostasis in the plasma membrane and this study provides no new mechanistic insight into how this occurs. Similarly, it has already been shown that the tricalbins play a role in maintaining cell integrity during heat stress and there is little new insight into what role the tricalbins play. Perhaps the most notable part of the study is the idea that tricalbins are necessary for phosphatidylserine transport during stress, but considerable additional work is necessary to make a strong case for this claim.

      Response: We strongly disagree with the reviewer’s opinions. Indeed, Reviewer #2 found our study “novel and detailed” and Reviewer #3 found the results in our study to be “highly valuable” and “interesting”.

      In contrast to the reviewer’s claims, there are certainly novel findings in our study. Foremost, this is the first study that demonstrates a role of the tricalbins in PS homeostasis. Previous studies have implicated E-Syt family members in diacylglycerol and phosphoinositide regulation. Our results indicate that the tricalbins and Sfk1 primarily control PS and sterol homeostasis at the PM, not phosphoinositides, and thus provide new insight into the key role of these proteins. Second, while a previous study by Collado et al reported a role of the tricalbins in PM integrity upon heat stress, this work did not provide mechanistic insight into this process. We performed the PM integrity assays for the Collado et al study (as co-authors). Our current study now shows that Tcb function is needed for PS homeostasis and Pkh1 recruitment at the PM upon heat stress; both factors are needed for PM integrity under these conditions. As such, our current study does provide new insight into roles of the Tcbs in PM integrity. Finally, we are exploring roles of the Tcb proteins in PS homeostasis that go beyond their proposed functions as lipid transfer proteins. We are convinced that our study will provide novel and deep mechanistic understanding of the Tcb/E-Syt protein family.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): The study examines the roles of tricalbins in signalling in yeast. By using several mutations they are able to evaluate whether the interactions are caused by tethering the PM and ER or by other mechanisms. Particularly powerful is their use of cryo-EM tomography and shot-gun mass spectrometry. They look at all the major lipids of the ER and PM and consider their interconversions. They identify large changes in lipid species with one double bond and with two, particularly for PS.

      Reviewer #2 (Significance (Required)): There is little information about membrane physical properties and how it changes as a result of changes in lipid molecular species. Nevertheless, the information provided is novel and detailed. The topic of ER-PM contact sites is new and evolving and this paper advances our understanding of the yeast system considerably. It also looks at protein-protein interactions by fluorescence methods and studies the consequences of heat shock.

      Response: We are pleased that the reviewer concluded that our study on the Tcb proteins “advances our understanding of the yeast system considerably” and found our use of lipidomics to be “powerful”.

      This reviewer had only one critique: there “is little information about membrane physical properties and how it changes as a result of changes in lipid molecular species”. In our revised manuscript, we will provide new data showing changes in levels of sterol (ergosterol) accessibility/availability at the PM in cells lacking the tricalbin proteins. Sterol lipids exist in distinct pools in the PM bilayer (extracellular vs. cytoplasmic leaflet, accessible vs. sequestered) that control the biophysical and mechanical properties of the PM (packing order, permeability, etc.). Moreover, PS and sterol lipids are proposed to undergo mutual associations whereby PS controls sterol accessibility (the ‘umbrella’ model) and sterol in turn stabilizes PS in the cytoplasmic leaflet of the PM. Our findings demonstrate that the primary function of the Tcb proteins is PS and sterol organization in the PM, providing new mechanistic insight into regulatory mechanisms for membrane homeostasis. We will attempt to further characterize changes in PM mechano-chemical and biophysical properties to further understand how changes in membrane lipid composition affect membrane integrity.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): The Tcb/ESyt proteins play an important role in contact site formation and non-vesicular lipid transport. However, their exact functions remain highly controversial. This study by Stefan and colleagues revealed a new role for the yeast Tcb proteins, especially Tcb3, in regulating plasma membrane phosphatidylserine, as well as PIP2. These results are highly valuable to people working on membrane contact sites and lipid trafficking. Overall, the results are fairly convincing and interesting. There are however some concerns and suggestions:

      Response: We are pleased that the reviewer stated that our study has “revealed a new role for the yeast Tcb proteins” and that the findings are “fairly convincing and interesting”. We also thank the reviewer for providing constructive criticisms and helpful suggestions.

      1. Fig. 4, for Tcb3 and Sfk1 interaction, what about Tcb1/2, which would be good controls for the specificity of the interaction.

      Response: We agree that it would be useful to determine if the Tcb3-Sfk1 interaction is specific. We will perform additional BiFC experiments to address whether Tcb1 and Tcb2 associate with Sfk1. Our previous work suggested that Tcb3 forms heterodimers with Tcb1 and Tcb2 necessary for PM integrity (Collado et al., 2019). However, the functional association between Sfk1 and Tcb3 may be specific to Tcb3, and we will test this possibility. It would be interesting to identify a function specific to an individual Tcb protein.

      A central question is whether Tcb3 transfers PS by itself through the SMP domain or requires other lipid transfer proteins. One possibility is that the transfer is mediated by Osh6 and Osh7. Did the expression level of Osh6/7 change between the delta tether and the ist2scs2/22 null strain? Under normal and stressful conditions?

      Response: We agree that it is important to address whether Tcb3 transfers PS via its SMP domain (an issue also raised by Reviewer #1) or whether this activity is carried out by another lipid transfer protein, such as Osh6 and Osh7. We have considered several alternative possibilities, as well as designed and performed new experiments, as described below (two key experiments are in italics).

      To rigorously test whether SMP domain-mediated lipid transport activity is required for PS homeostasis at the PM under heat stress, we will generate amino acid substitutions within the Tcb3 SMP domain that maintain its overall structure but impair its ability to transport lipids (targeting conserved key residues described in Saheki et al., 2016 & Bian et al., 2018). We will then assess whether SMP-mediated lipid transfer is necessary for PS homeostasis and PM integrity under heat stress.

      As suggested by the reviewer (and discussed in our original manuscript), the tricalbins might serve as scaffolds for PS transfer proteins, including the Osh6 and Osh7 proteins, under stress conditions. Osh6 and Osh7 are recruited to ER-PM contacts through interactions with the Ist2 tether protein where they mediate PS delivery to the PM under non-stress conditions (D’Ambrosio et al., 2020). However, strong lines of evidence suggest that Ist2, Osh6, and Osh7 are not required for PS homeostasis at the PM under stress conditions. First, loss of Ist2 has no impact on the PS probe under heat stress conditions (this result will be included in the revised manuscript). Therefore, the Ist2-Osh6/7 interaction is not required for PS homeostasis upon heat stress. Ist2 is not required for PM integrity upon heat stress either (Omnus et al., 2016).

      These results do not rule out the possibility that the tricalbins serve as scaffolds for other PS transfer proteins, such as Osh6 and Osh7, under stress conditions. In this scenario, a switch between Osh6/7 tethering proteins may occur: from Ist2 under normal growth conditions to the Tcb proteins during stress conditions. However, our findings also suggest that Osh6 and Osh7 function is impaired upon stress conditions, including heat and nutrient starvation. Notably, Osh6 and Osh7 become mis-localized from the PM upon heat stress (we can provide these data in the revised manuscript). The mechanisms for Osh6/7 attenuation upon stress conditions are outside the scope of our current study, but our preliminary results suggest that changes in cytoplasmic pH and ion homeostasis are involved; this will be a focus of a future study. This is also in line with results from a previous study (Omnus et al., 2020) that showed Osh3 forms intracellular aggregates in response to heat stress. The activity of the Osh proteins and other lipid transfer proteins, in general, may be impaired upon stress conditions (see below).

      To directly determine whether Osh6 and Osh7 are required for PS homeostasis at the PM under heat stress conditions, we will monitor the distribution of the PS probe in cells lacking Osh6 and Osh7 (osh6 osh7 double mutant cells) upon heat stress. This is a key experiment that will directly address the reviewer’s concerns.

      As suggested by the reviewer, we can also examine the expression and localization of GFP-tagged Osh6 and/or Osh7 in tcb1/2/3, scs2/22 ist2, and ‘delta tether’ mutant cells. However, we have not observed any changes in other Osh proteins, including Osh2 and Osh3, in the ‘delta tether’ mutant strain.

      Finally, we have considered yet another possibility. PS transfer to the PM may be generally attenuated under heat stress conditions and a key role of the Tcb proteins may be to bind and stabilize PS at the PM. In other words, the Tcb proteins may act as a ‘buttress’ to stabilize and maintain pre-existing pools of PS at the PM under stress conditions. Consistent with this idea, the Tcb3 C2 domains are required for PS homeostasis and PM integrity upon heat stress. If the Tcb3 SMP domain mutants are not impaired in PS homeostasis or PM integrity, then this alternative mechanism may be very relevant to PM quality control and organization in response to membrane stress. To address this possibility, we will address whether PS is internalized or removed (extracted) from the PM upon heat stress in cells lacking the Tcb proteins. This may uncover a novel role of the Tcb proteins that is independent of SMP domain-mediated lipid transfer.

      Figure 5A. It is not obvious that the intensity of C2 decreases in tcb null cells at 42 degree. Perhaps there is more internal staining.

      Response: We thank the reviewer for pointing out this issue. We think Figures 5A and 5B together convincingly show increased cytoplasmic localization of the PS reporter in the tcb1/2/3 mutant cells upon heat stress. More importantly, we thank the reviewer for pointing out the increased localization at internal membrane compartments. We realized it would be important to identify the PS-containing membrane compartments in the tcb1/2/3 mutant cells upon heat stress. We have now confirmed that the PS probe localizes at the ER and vacuole membrane (to be included in the revised manuscript). The example in Figure 5A also shows accumulation of the PS probe at both the nuclear ER and vacuole membrane. Thus, whilst wild type cells show little PS reporter localization at the ER or vacuole membrane, loss of the tricalbin proteins leads to an increase in ER and vacuole membrane PS probe localization after heat shock. Accumulation at the ER may reflect impaired PS delivery from the ER to the PM, and possibly rerouting to the vacuole membrane. Alternatively, as discussed above, vacuole membrane localization may be due to increased PS removal from the PM and delivery to the vacuole membrane in tcb1/2/3 cells upon heat stress.

      It is important to determine the primary function of Tcb3 since defects in both PIP2 and PS were observed. If the change in PIP2 is due to a lack of PS, can overexpressing Osh6/7 rescue the PIP2 defect in the tetherless mutant?

      Response: We agree that it is important to determine whether the primary function of the Tcb proteins is regulation of PS or PI(4,5)P2 homeostasis. Our new findings definitively indicate that their primary function is PS regulation, not PI(4,5)P2 regulation. A clear effect in the distribution of the PS reporter was observed following heat shock in the tcb1/2/3 mutant cells. In contrast, there is no difference in the localization of the PI(4,5)P2 reporter in tcb1/2/3 cells compared to wild type after heat shock (also see response to reviewer #1, comment 3). In addition, cells lacking the Tcb proteins were not impaired in heat-induced PI(4,5)P2 synthesis, as assessed by metabolic labelling and HPLC analysis. These findings will be included in our revised manuscript, as they indicate that the Tcb proteins primarily control PS at the PM, not phosphoinositides, and thus provide new insight into the main role of these proteins.

      Both PS and sterol are required for the proper recruitment of type I PIP5K to the PM (Nishimura et al., 2019). Therefore, defects in PS distribution could be responsible for the PI(4,5)P2 effects observed in Figure 2. However, overexpression of Osh7 in the ‘delta tether’ mutant did not significantly rescue localization of the PI(4,5)P2 reporter (included in the revision plans). Sterol organization is also perturbed in the ‘tether’ mutant cells (Quon et al., 2018), and this may explain why Osh7 expression did not rescue. Accordingly, Osh2/3/4 (sterol transfer proteins) rescue the PI(4,5)P2 effects.

      The detection of PS by LactC2 has been well-established. However, an alternative approach would be to use the 2XPH in permeabilized cells. See PMID: 33929485 for some detailed discussions on the techniques. It is not a requirement for the authors to adopt the 2XPH.

      Response: We thank the reviewer for suggesting another technique to confirm the results of the LactC2 domain as a PS probe. In this study, we have primarily used a genetically encoded LactC2 probe to observe PS distribution within live, intact cells. Whilst this approach was sufficient to identify accumulation of PS on cytosolic membrane leaflets of the ER and vacuole (see above), the addition of an exogenous probe to permeabilized cells may allow the detection of PS on luminal and extracellular membrane leaflets. Therefore, we plan to repeat our heat shock experiments using permeabilized cells and a purified tagged form of the LactC2 protein. This may allow for improved imaging of intracellular PS localisation and bilayer distribution. However, these experiments are technically challenging, and fixation and permeabilization conditions have not yet been optimized for yeast cell experiments. It is not yet known whether we will be able to optimize these protocols in a reasonable amount of time while completing revisions to the manuscript.

      **Minor:**

      1. the discussion seems to be a bit long

      Response: We will shorten the discussion and modify our final conclusions based on the results from the new experiments.

      Reviewer #3 (Significance (Required)): These proteins are highly important in cell biology/contact sites. The redundancy made it difficult to pinpoint their function. Previous studies have had a number of models. The current study proposed a new function of these proteins, i.e. PS transfer, and this is very interesting and valuable. There will be a good audience for this work. I specialize in lipid storage and trafficking, lipid droplets, cholesterol and phosphatidylserine.

      Response: We are pleased that this expert reviewer found our study to be “very interesting and valuable”.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The Tcb/ESyt proteins play an important role in contact site formation and non-vesicular lipid transport. However, their exact functions remain highly controversial. This study by Stefan and colleagues revealed a new role for the yeast Tcb proteins, especially Tcb3, in regulating plasma membrane phosphatidylserine, as well as PIP2. These results are highly valuable to people working on membrane contact sites and lipid trafficking. Overall, the results are fairly convincing and interesting. There are however some concerns and suggestions:

      Major:

      1. Fig. 4, for Tcb3 and Sfk1 interaction, what about Tcb1/2, which would be good controls for the specificity of the interaction.
      2. A central question is whether Tcb3 transfers PS by itself through the SMP domain or requires other lipid transfer proteins. One possibility is that the transfer is mediated by Osh6 and Osh7. Did the expression level of Osh6/7 change between the delta tether and the ist2scs2/22 null strain? Under normal and stressful conditions?
      3. Figure 5A. It is not obvious that the intensity of C2 decreases in tcb null cells at 42 degree. Perhaps there is more internal staining.
      4. It is important to determine the primary function of Tcb3 since defects in both PIP2 and PS were observed. If the change in PIP2 is due to a lack of PS, can overexpressing Osh6/7 rescue the PIP2 defect in the tetherless mutant?
      5. The detection of PS by LactC2 has been well-established. However, an alternative approach would be to use the 2XPH in permeabilized cells. See PMID: 33929485 for some detailed discussions on the techniques. It is not a requirement for the authors to adopt the 2XPH.

      Minor:

      1. the discussion seems to be a bit long

      Significance

      These proteins are highly important in cell biology/contact sites. The redundancy made it difficult to pinpoint their function. Previous studies have had a number of models. The current study proposed a new function of these proteins, i.e. PS transfer, and this is very interesting and valuable. There will be a good audience for this work. I specialize in lipid storage and trafficking, lipid droplets, cholesterol and phosphatidylserine.

      Referee Cross-commenting

      I was asked to include my expertise in the Significance part. Other reviewers did not include it. Maybe my last sentence should be removed?

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The study examines the roles of tricalbins in signalling in yeast. By using several mutations they are able to evaluate whether the interactions are caused by tethering the PM and ER or by other mechanisms. Particularly powerful is their use of cryo-EM tomography and shot-gun mass spectrometry. They look at all the major lipids of the ER and PM and consider their interconversions. They identify large changes in lipid species with one double bond and with two, particularly for PS.

      Significance

      There is little information about membrane physical properties and how it changes as a result of changes in lipid molecular species. Nevertheless, the information provided is novel and detailed. The topic of ER-PM contact sites is new and evolving and this paper advances our understanding of the yeast system considerably. It also looks at protein-protein interactions by fluorescence methods and studies the consequences of heat shock.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      This study investigates the functions of the tricalbin proteins in S. cerevisiae, which are homologs of the extended synaptotagmins in mammals. It suggests the tricalbins modulate plasma membrane phospholipid composition and are particularly important for surviving a shift to elevated temperature. The tricalbins are proposed to directly or indirectly promote phosphatidylserine transport from the ER to the plasma membrane during heat stress. They also promote the localization of the kinase Pkh1 to the plasma membrane during heat stress.

      Major comments:

      1. To determine whether the tricalbins and other tethering proteins play a role in phospholipid homeostasis, lipid distribution is measured using biosensors (FLARES) and phospholipid levels are determined by mass spec. The experiments are well done but say little about what role the tricalbins or other tethering proteins play in homeostasis. There are no measurements of lipid transport or rates of phospholipid production, degradation or modification. It is reasonable to propose the tricalbins transport lipids, since other proteins with SMP domains do, but this study does not present any evidence that they do.
      2. The study convincingly demonstrates that there are fewer Pkh1 puncta formed after temperature shift in cells lacking tricalbins than in wild-type cells (Fig. 6 C,D). However, there is no demonstration that this change in localization alters Pkh1 function or signaling.
      3. There is no demonstration that association of tricalbins and Sfk1 (Fig. 4) has any functional significance or affects phosphoinositide metabolism.
      4. The study proposes the tricalbins directly or indirectly promote phosphatidylserine transport after temperature shift, but transport has not been measured and other possibilities are not ruled out.

      Significance

      While this study is likely to be of interest to those studying the tricalbins or phospholipid homeostasis, it is incremental and provides little conceptual advance on what is already known about the tricalbins and extended synaptotagmins. They have already been implicated in lipid homeostasis in the plasma membrane and this study provides no new mechanistic insight into how this occurs. Similarly, it has already been shown that the tricalbins play a role in maintaining cell integrity during heat stress and there is little new insight into what role the tricalbins play. Perhaps the most notable part of the study is the idea that tricalbins are necessary for phosphatidylserine transport during stress, but considerable additional work is necessary to make a strong case for this claim.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We are grateful to all reviewers for the critical comments and valuable suggestions that have helped improve our paper. We provide point-by-point answers to the comments and have added detailed explanations in the preliminary revised manuscript. The reviewers' comments are shown in italics, with our responses below.

      Description of the planned revisions

      Reviewer #1

      **Major comments:**

      - The authors thought almost all iRFP forms a holo-complex with BV when HO1 is expressed in fission yeast. This should be proved by calculating the percentage of BV-iRFP in yeast. It is meaningful to compare the percentage of BV-iRFP and PCB-iRFP in vitro and in yeast.

      We would like to thank the reviewer for this valuable comment. We plan to quantify the percentage of iRFP holo-complex in vitro and in yeast by using fluorescence correlation spectroscopy (FCS). In FCS, the fluctuation of fluorescence emitted from a very small volume in solution (i.e., the confocal volume, ~1 fL) is measured and statistically analyzed by autocorrelation, providing the average number of fluorescent molecules in the confocal volume (Krichevsky and Bonnet, 2002, Rep. Prog. Phys.). Fusion proteins of iRFP with mNeonGreen (mNG) will be purified or expressed in yeast, so the stoichiometry of iRFP to mNG must be 1:1. The numbers of fluorescent iRFP and mNG molecules will then be measured by FCS in vitro or in yeast. If iRFP forms 100% holo-complex with BV or PCB, the number of fluorescent iRFP molecules equals that of mNG. We have used FCS to measure the concentration of fluorescent proteins in mammalian cells (Sadaie et al., 2014, MCB; Komatsubara et al., 2020, JBC), and therefore anticipate no technical difficulties. One possible limitation is the maturation efficiency of mNG in vitro and in yeast, although this should not affect at least the qualitative comparison of holo-complex formation between iRFP-BV and iRFP-PCB. In this preliminary version of the revised manuscript, we added the FCS measurements in yeast, but not yet in vitro, in Supplementary Figure S5A. The numbers of fluorescent iRFP molecules relative to the numbers of mNG were comparable between iRFP-BV (HO1-expressing cells) and iRFP-PCB (SynPCB2.1-expressing cells). Of note, the number of fluorescent iRFP molecules in cells treated with external BV was significantly lower than that in cells expressing HO1, suggesting the low permeability of BV.
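The counting logic described above can be illustrated with a minimal Python sketch. It assumes the idealized single-species FCS relation in which the autocorrelation amplitude at zero lag is the inverse of the mean number of molecules in the confocal volume, G(0) = 1/N, and it assumes that both channels probe the same effective volume (in practice a wavelength-dependent volume correction and background correction would be needed). The function names `molecules_from_g0` and `holo_fraction` are hypothetical and are not taken from the manuscript.

```python
def molecules_from_g0(g0: float) -> float:
    """Mean number of fluorescent molecules in the confocal volume.

    Assumes an ideal single-species measurement, where the
    autocorrelation amplitude at zero lag satisfies G(0) = 1 / N.
    """
    if g0 <= 0:
        raise ValueError("G(0) must be positive")
    return 1.0 / g0


def holo_fraction(g0_irfp: float, g0_mng: float) -> float:
    """Fraction of iRFP in the fluorescent holo form.

    Because iRFP and mNG are fused 1:1, the number of mNG molecules
    serves as a proxy for the total number of iRFP molecules, so
    N_iRFP / N_mNG estimates the holo fraction.
    Assumes equal effective confocal volumes for both channels and
    fully matured mNG (both are simplifying assumptions).
    """
    n_irfp = molecules_from_g0(g0_irfp)
    n_mng = molecules_from_g0(g0_mng)
    return n_irfp / n_mng


if __name__ == "__main__":
    # Illustrative amplitudes only, not measured values.
    print(holo_fraction(g0_irfp=0.5, g0_mng=0.25))
```

Incomplete mNG maturation would lower the denominator and overestimate the holo fraction, which is why the comparison between iRFP-BV and iRFP-PCB is more robust than either absolute value.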

      In addition to the FCS measurement, we are preparing the following backup experiment. Recombinant iRFP proteins will be mixed with sufficient amounts of BV or PCB and used as references. The iRFP proteins will then be expressed in and purified from yeast, and subjected to zinc blot analysis alongside the reference iRFP proteins to compare the percentage of iRFP holo-complex in vitro and in yeast. This method is feasible, but it does not allow us to quantitatively determine the percentage of iRFP holo-complex with BV or PCB.

      - The brightness of fluorescent proteins in organisms often depends on the molecular brightness (fluorescence quantum yield and extinction coefficient) and the amounts of fluorescent proteins. The authors indicated that iRFP-PCB is brighter than iRFP-BV at the molecular level. To calculate the amounts of iRFP-PCB and iRFP-BV when different proteins are expressed in yeast, it is better to explain the results that phycocyanobilin was better than biliverdin for the imaging in fission yeast.

      We agree with the reviewer’s comment. We will quantify the amounts of iRFP in the fission yeast cells expressing iRFP, iRFP/HO1, and iRFP/SynPCB2.1 by western blot analysis and fluorescence imaging. For the western blot analysis, iRFP protein levels will be normalized to tubulin, which is used as the reference protein.

      During this preliminary revision period, we assessed the expression levels by fluorescence microscopy. The fusion proteins of iRFP fused with mNG were expressed in yeast and the expression levels were quantified by the fluorescence signals of mNG. In this preliminary version of the revised manuscript, we only added fluorescence imaging data in Supplementary Figure S4. The fluorescence intensities of mNG among cells expressing iRFP, iRFP/HO1, and iRFP/SynPCB2.1 were comparable as well as cells treated with BV or PCB. We will further confirm this result by western blotting analysis.

      To test the application of PCB as a chromophore in mammalian cells, HO1 gene knock-out mammalian cells should be used.

      We would like to thank the reviewer for this excellent suggestion. It is indeed interesting to examine the effect of an HO1 gene knock-out on the iRFP fluorescence and the application of PCB as a chromophore in mammalian cells. We plan to knock out the HO1 gene in HeLa and HeLa/BVRA KO cells by conventional CRISPR/Cas9 genome editing techniques. After the establishment of HO1-KO HeLa cell lines, iRFP fluorescence will be assessed in the same way as in Figure S9. In addition, we will carefully investigate the effect of BV in serum on iRFP fluorescence.

      The authors may use the all-in-one plasmids carrying SynPCB2.1 and iRFP fusion protein genes to image the target proteins in mammalian cells.

      According to the reviewer’s suggestion, we will construct an all-in-one plasmid carrying SynPCB2.1 and an iRFP fusion protein gene for iRFP imaging in mammalian cells.

      To this end, the IRES and iRFP genes will be inserted downstream of the SynPCB2.1 gene cassette, and iRFP fluorescence will be confirmed in mammalian cells.

      Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1

      **Minor comments:**

      - In line 35. iRFP is derived from bacteriophytochromes.

      We have replaced “phytochrome” with “bacteriophytochrome”.

      - In line 62. The reference of Rodriguez et al. deals with allophycocyanin instead of bacteriophytochromes.

      We have included the following note in the revised manuscript.

      (Page 3, line 56)

      “Near-infrared fluorescent proteins have been developed through the engineering of phytochromes, which are photosensory proteins of plants, bacteria, and fungi (Chernov et al., 2017), or allophycocyanin, which is a light-harvesting phycobiliprotein of cyanobacteria (Rodriguez et al., 2016).”

      - In lines 65-66. Bacteriophytochromes bind BV, phycocyanin, allophycocyanin and cyanobacterial phytochromes bind PCB, and plantal phytochromes bind PΦB.

      Thank you for the notification. We have added the explanation in the revised manuscript as follows.

      (Page 3, line 65)

      “Unlike the canonical fluorescent proteins derived from jellyfish or coral, phytochromes and allophycocyanin require a linear tetrapyrrole as a chromophore such as biliverdin IXα (BV), phycocyanobilin (PCB), or phytochromobilin (PΦB); bacteriophytochromes bind to BV, allophycocyanin and cyanobacterial phytochromes bind to PCB, and plantal phytochromes bind to PΦB.”

      - In lines 67-71. The authors described the biosynthesis of BV, PCB and PΦB, but many references are missing.

      We agree with this reviewer’s comment, and have included the references in the revised manuscript as follows.

      (Page 3, line 70)

      “These linear tetrapyrroles are produced from heme (Terry and Lagarias 1991; Beale 1993). Heme-oxygenase (HO) catalyzes oxidative cleavage of heme to generate BV with the help of ferredoxin (Fd), an electron donor, and ferredoxin-NADP+ reductase (Fnr) (Cornejo, Willows, and Beale 1998). In cyanobacteria, PCB is produced from BV through PcyA, Fd, and Fnr, while in plants PΦB is synthesized from BV using HY2, Fd, and Fnr (Muramoto et al. 1999; Frankenberg et al. 2001; Kohchi et al. 2001).”

      - In line 270. One full stop should be deleted in "cells.. Fission".

      We have corrected this mistake.

      - In line 272. How did the authors add the high concentration of PCB (625 microM) into the culture? PCB is insoluble.

      As the reviewer pointed out, 625 microM PCB seems to be insoluble in PBS solution and fission yeast culture medium, because insoluble PCB debris was observed in the medium. We have included the following note in the Materials and Methods of the revised manuscript.

      (Page 20, line 505)

      “Of note, 625 μM PCB or 625 μM BV is insoluble in PBS solution and fission yeast culture medium, and 25 μM PCB or 25 μM BV is insoluble in mammalian cell culture medium, because insoluble PCB or BV debris is observed.”

      - In line 401. smURFP is derived from allophycocyanin instead of cyanobacteriochromes.

      We have corrected this mistake.

      - In line 474. As BV and PCB are insoluble, how do the authors add the pigments into DMEM?

      We directly added BV or PCB dissolved in DMSO into the culture medium of HeLa cells, i.e., DMEM, 10% FBS, and maintained cells for 3 hours. We have included the following note in the figure legend of the revised Supplementary Information.

      Fig. S9 legend

      “Of note, 25 μM BV or PCB remained unsolved in the cultured medium for mammalian cells, and therefore chromophores in the medium could be saturated under these conditions.”

      - In line 509. In which solvent are BV and PCB dissolved?

      BV or PCB dissolved in DMSO was added into His-iRFP PBS solution. We have added the explanations as follows.

      (Page 22, line 555)

      “Purified His-iRFP in PBS solution was mixed with an excess amount (1:5 molar ratio) of BV or PCB dissolved in DMSO, followed by size exclusion chromatography with NAP-5 Columns (Cytiva, 17085301) to remove free BV or PCB.”

      - In lines 546, 547. What is ROI?

      We apologize for the confusion. ROI is the abbreviation of “region of interest.” We have included the full expression in the revised manuscript.

      - The authors should indicate whether codon optimization was necessary when HO1, PcyA, Fd and Fnr were expressed in yeast or in mammalian cells.

      We would like to thank the reviewer for pointing out this issue. Codon optimization was needed to express the SynPCB genes in mammalian cells (Uda et al., PNAS, 2017). In fission yeast, we used the same SynPCB genes optimized for human codon usage, not for fission yeast codon usage. In our experience, there have been no problems with PCB synthesis in fission yeast. We have included this important note in the revised manuscript as follows.

      (Page 20, line 483)

      “The nucleotide sequence of these genes and SynPCB were optimized for human codon usage (see Benchling link; Table S1).”

      - Figure S1 should show the software and algorithm used for the construction of the phylogenetic tree.

      Thank you for pointing out the ambiguity in our statement. We did not construct the species tree ourselves; instead, we drew the tree manually in Adobe Illustrator, based on the latest genome-scale phylogeny of fungi (Yuanning Li et al. 2021), in which the results of multiple algorithms were compared to evaluate the reliability of the phylogenies. To make this explicit, we revised the manuscript as follows.

      (Page 24, line 627)

      “We have manually drawn the evolutionary relationship among representative species (Nguyen et al. 2017) based on a recent genome-scale phylogeny (Yuanning Li et al. 2021), which is consistent with the current consensus view of the fungal tree of life (James et al. 2020).”

      Here, to validate that the tree in Figure S1 is consistent with the current consensus phylogeny, we have added a reference to a recent review on the fungal tree of life (Timothy Y. James et al., Ann Rev Microbiol, 2020). Additionally, we modified the tree in Figure S1 to represent the remaining ambiguity among subdivisions in Basidiomycota (Yuanning Li et al. 2021, Figure 4G).

      - In Figure 1C and Figure 2C. The authors should explain the reasons why the fluorescence of iRFP was lower when yeast was treated with excessive BV and PCB.

      The decrease in iRFP intensity under 625 μM PCB or 625 μM BV could be due to cell death and/or toxicity by the excess DMSO. We have included the explanation in the revised manuscript.

      Figure 1 legend

      “The decrease in iRFP intensity under 625 μM BV could be due to cell death and/or toxicity by the excess DMSO”

      Figure 2 legend

      “The decrease in iRFP intensity under 625 μM PCB or BV could be due to cell death and/or toxicity by the excess DMSO”

      - In Supplementary Information, line 51. What is "teh"?

      We have corrected this mistake.

      Reviewer #2

      **Minor comments:**

      - In the paragraph headed "iRFP brightens iRFP more efficiently..." there is mention of "NLS-iRFP-NLS". Please introduce the abbreviation (nuclear localisation sequence?) and, if possible, why the sequence is present N- and C-terminally.

      We would like to thank the reviewer for this comment. The abbreviation NLS (nuclear localization signal) was already defined in the previous version of the manuscript (page 5, line 113). Two NLSs are fused with iRFP because the addition of a single NLS does not sufficiently localize the protein to the nucleus. We have included this note in the revised manuscript (page 5, line 114).

      - The fluorescence intensity increase upon PCB compared to BV treatment of S. pombe cultures expressing iRFP (Fig. 2C,D) appears about 7-fold, which is far more than the factor of 2 measured on the level of fluorescence quantum yield, and the protein levels were determined to be comparable. Are there any conceivable factors that further enhance the signal of PCB-bound iRFP in S. pombe?

      As we discussed in the previous version of the manuscript, we presume that three factors enhance the fluorescence of iRFP-PCB in fission yeast: (1) the increase in quantum yield (~1.61-fold, Fig. 3F), (2) the blue-shifted excitation and emission spectra of iRFP-PCB, which are beneficial for our microscopic setup, and (3) the efficiency of chromophore formation (~1.75-fold) (Rumyantsev et al. 2015). During this revision, we calculated to what extent the second factor potentially enhances iRFP-PCB fluorescence compared to iRFP-BV. Based on the emission spectrum (Fig. S3C, left), iRFP-PCB is approximately 1.3-fold more effectively excited by the 640 nm excitation laser than iRFP-BV. Similarly, the detection of iRFP-PCB fluorescence is about 2.0-fold more efficient than that of iRFP-BV with our emission filter (665-705 nm). Based on these data, a rough estimation yields a 1.61*1.3*2*1.75 = 7.3-fold increase, which is comparable with the experimental results showing a 5- to 10-fold increase in iRFP-PCB fluorescence (Figs. 2C, 2D, 3G). We have included this additional discussion in the revised manuscript as follows:

      (page 17, line 429)

      “Based on the emission and excitation spectrum (Fig. S3C), iRFP-PCB is approximately 1.3-fold more effectively excited by 640 nm of the excitation laser, and detected about 2.0-fold more efficiently with our emission filter (665-705 nm emission filter) in comparison to iRFP-BV.”

      (page 17, line 433)

      “Based on these data, the rough estimation yields 1.61*1.3*2*1.75 = 7.3-fold increase, which is comparable with the experimental results showing the 5~10-fold increase in iRFP-PCB fluorescence compared to iRFP-BV (Figs. 2C, 2D, 3G).”
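
      The rough estimate quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes, as the estimate itself does, that the four contributions are independent and combine multiplicatively. The dictionary keys are illustrative labels; the numerical factors are those stated in the response.

```python
from math import prod

# Fold-change contributions of iRFP-PCB over iRFP-BV, as stated in the response.
factors = {
    "quantum_yield": 1.61,              # Fig. 3F
    "excitation_at_640_nm": 1.3,        # excitation efficiency with the 640 nm laser
    "emission_filter_665_705_nm": 2.0,  # detection efficiency with the emission filter
    "chromophore_formation": 1.75,      # Rumyantsev et al. 2015
}


def combined_fold_change(contributions: dict[str, float]) -> float:
    """Multiply independent fold-change factors into one overall estimate."""
    return prod(contributions.values())


estimate = combined_fold_change(factors)
print(f"Estimated increase: {estimate:.1f}-fold")
```

      The product falls within the 5- to 10-fold range reported for the experiments, which is the comparison the response draws.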

      - In the spectral data of Fig. 3D,E, there appears to be a spectrometer artifact at around 450 nm in all spectra, which is not commented on. It is certainly not part of the cofactor / holoprotein spectra.

      The peaks at around 450 nm in the spectra are an artifact of our spectrometer, the cause of which is unknown. We have added an explanation of this artifact in the revised manuscript.

      (page 10, line 287,290)

      “Of note, there is a spectrometer artifact at around 450 nm in all spectra.”

      - Lines 292/296: "Fig. 3H" should read "Fig. 2G".

      We have corrected these mistakes.

      - The reader may wonder whether it is a common or rather rare phenomenon that the exchange of BV by PCB or some other bilin as (usually covalently bound) cofactor in iRFPs (or, more general, also in the parental phytochromes from different branches of the tree of Life) can be performed. Are there comparable studies available in the literature in addition to the work of Rumyantsev et al. (2015)?

      According to the reviewer’s suggestion, we examined the fluorescence of miRFP670 and miRFP703 in fission yeast cells under the conditions of DMSO, BV, or PCB treatment, or SynPCB2.1 expression. We found that the fluorescence intensities of miRFP670 and miRFP703 with PCB treatment or SynPCB2.1 expression showed much higher values than those with BV treatment as observed in iRFP fluorescence in this study (Figure S7). These data indicate that the replacement of BV with PCB is beneficial for enhancing the fluorescence intensity of other iRFPs. We have included the data in the revised manuscript as follows:

      (page 12, line 340)

      “To explore the generality of the application of PCB and SynPCB2.1 system to other near-infrared fluorescent proteins, we measured fluorescence intensities of miRFP670 and miRFP703, which are derived from a different branch of bacteriophytochrome RpBphP1 (Shcherbakova et al. 2016), in fission yeast treated with BV or PCB or expressing SynPCB2.1 (Fig. S7). The fluorescence intensities of both miRFP670 and miRFP703 were enhanced by the addition of PCB and the expression of SynPCB2.1 compared to the addition of BV in a similar manner iRFP (Fig. S7). From these data, we concluded that PCB biosynthesis by SynPCB2.1 is suitable for imaging with near-infrared fluorescent proteins in fission yeast.”

      Reviewer #3

      (1) In Fig. 1D, the authors tried to measure BV incorporation into fission yeast cells. What happens after 180 min? Does the fluorescence still increase?

      Following the reviewer’s suggestion, we repeated the experiments in Figure 1D for up to 24 hours. As the reviewer expected, iRFP fluorescence still increased gradually for up to 24 hours. We replaced Figure 1D with the new data. We do not have a good explanation for why iRFP fluorescence gradually increases in the presence of BV.

      (2) The authors refer to Fig. 3H in the manuscript, but I did not find panel H in Fig. 3.

      We apologize for the confusion. We have corrected these mistakes.

      (3) Information on the BVRA KO HeLa cells needs to be provided.

      We have included the information of the BVRA KO HeLa cells in the materials and methods.

      (page 21, line 519)

      “BVRA KO HeLa cells have been established previously (Uda et al. 2017)”

      Description of analyses that authors prefer not to carry out

      Reviewer #1

      - What is the affinity (KD) of BV and PCB to iRFP? Is this related to the fluorescence intensity of iRFP in yeast?

      The attachment of BV or PCB to iRFP is an irreversible reaction, i.e., BV or PCB is covalently attached to iRFP. For this reason, it is technically impossible to measure equilibrium dissociation constant (Kd) values to evaluate the affinity of BV or PCB to iRFP.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, entitled "Near-infrared imaging in fission yeast by genetically encoded biosynthesis of phycocyanobilin", Sakai and co-workers report that phycocyanobilin (PCB) could function as a brighter chromophore for iRFP than BV, and that biosynthesis of PCB allows live-cell imaging with iRFP in fission yeast. They first found that fission yeast cells did not produce BV and therefore did not show any iRFP fluorescence, due to the lack of an HO gene. Upon the addition of external BV, the iRFP fluorescence signal could be recovered. In addition, expression of HO1 in fission yeast cells also resulted in iRFP fluorescence. Next, they found that PCB brightens iRFP more efficiently than BV in fission yeast and in vitro, while the fluorescence excitation and emission spectra were 10 nm blue-shifted in iRFP-PCB compared to iRFP-BV. Finally, they introduced a system named SynPCB2.1 for efficient PCB biosynthesis in fission yeast and concluded, based on the experimental data, that PCB biosynthesis by SynPCB2.1 is ideal for iRFP imaging in fission yeast. They also developed all-in-one plasmids carrying SynPCB2.1 and iRFP fusion protein genes to image the target proteins in fission yeast. In the final part of this manuscript, PCB was tested as an iRFP chromophore in HeLa cells; however, it does not offer a significant advantage over BV. Overall, this work provides a system for efficient iRFP imaging in fission yeast. Yet, several issues have been identified and need to be addressed in order to strengthen or even validate some of the conclusions made by the authors.

      (1) In Fig. 1D, the authors tried to measure BV incorporation into fission yeast cells. What happens after 180 min? Does the fluorescence still increase?

      (2) The authors refer to Fig. 3H in the manuscript, but I did not find panel H in Fig. 3.

      (3) Information on the BVRA KO HeLa cells needs to be provided. To test the application of PCB as a chromophore in mammalian cells, HO1 gene knock-out mammalian cells should be used. The authors may use the all-in-one plasmids carrying SynPCB2.1 and iRFP fusion protein genes to image the target proteins in mammalian cells.

      Significance

      This work provides new information regarding the chromophore of iRFP. It shows that PCB brightens iRFP more efficiently than BV in fission yeast and in vitro. The all-in-one plasmid system developed in this work supplies a tool for iRFP imaging in organisms without BV production. Although iRFP-PCB produces brighter fluorescence than iRFP-BV, its fluorescence excitation and emission spectra were ~15 nm blue-shifted compared to those of iRFP-BV, which might restrict its applications in tissue imaging. As BV is abundant in mammalian cells, iRFP-PCB does not offer a significant advantage over iRFP-BV.

      Referee Cross-commenting

      All the reviewers gave reasonable suggestions.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      iRFPs have been engineered from the chromophore-binding domains of bacterial phytochromes to expand the wavelength range of genetically-encoded markers for fluorescence imaging and sensing applications into the far-red or near-infrared. In contrast to fluorescent proteins from jellyfish or corals, in which the chromophore is formed autocatalytically, iRFPs require a bilin chromophore for fluorescence such as biliverdin IX alpha (BV), phycocyanobilin (PCB) or phytochromobilin (PΦB). While available in some cell types, these molecules may not be available in sufficient amounts in the particular cell type under study, and their use therefore frequently requires the introduction of at least one further gene such as heme oxygenase (HO, to generate BV from heme) or in addition PcyA (which produces PCB from BV). In this study, the authors show that one of the previously described iRFPs (denoted as iRFP713 in the original studies), which is non-fluorescent upon expression in the fission yeast Schizosaccharomyces pombe due to the lack of an endogenous HO gene, is able to complement with PCB in S. pombe when added externally (to the culture medium) or if produced intracellularly with the help of a plasmid developed for PCB synthesis (termed SynPCB2.1). It is shown that iRFP with PCB bound as cofactor exhibits brighter fluorescence than the BV-bound original iRFP713, which is due to a nearly two-fold increased fluorescence quantum yield as shown by in vitro reconstituted and in vivo generated iRFP with bound PCB. S. pombe cells harbouring the SynPCB2.1 system for PCB synthesis even release ("leak") PCB into the surrounding medium to be taken up by cells not producing PCB. Furthermore, the authors report the generation of a plasmid for C-terminal tagging of a protein of interest by iRFP, novel genome integration vectors and all-in-one plasmids harbouring the genes for PCB synthesis and iRFP-fused marker proteins. The genome integration vectors ensure that stable one-copy integration into the genome occurs at different uncritical loci on each of the chromosomes (different from the common Z-locus), which are in gene-free regions and distant enough for crossing strains, and at sites that do not affect the auxotrophy characteristics of cells. The plasmid system, termed pSKI, harbours elements for propagation in E. coli, constitutive or inducible promoters, a multiple cloning site, a selection marker cassette and homology arms separated by a unique restriction enzyme cutting site for plasmid linearization. The viability of the C-terminal fluorescence tagging system relying on PCB-bound iRFP in S. pombe is shown with 10 different proteins of variant and highly specific intracellular location, which all show the expected cellular distribution pattern by fluorescence imaging. Multi-spectral fluorescence labelling schemes (manifold of five) were successfully tested in S. pombe with four specifically localized proteins harbouring different conventional FP tags, and, in addition, one labeled with iRFP(PCB). Finally, it was tested whether PCB can be exploited to yield a brighter iRFP fluorophore also in a mammalian model cell line, HeLa cells. Here, treatment of cells with externally added BV or PCB produced a similar increase in iRFP fluorescence and also knockout cells devoid of the BV breakdown enzyme BVRA did not show different fluorescence levels upon BV or PCB treatment. Therefore, it is concluded that PCB is applicable to iRFP imaging in mammalian cells though offering no advantage, which to some extent limits the advantages to cells which naturally do not produce BV from heme.

      Major comments:

      The claims regarding the developed plasmid system for multi-spectral fluorescence imaging applications in fission yeast S. pombe, which relies on PCB synthesis and incorporation of PCB into iRFP, are fully justified by the data shown in the manuscript. Interpretation and the derived conclusions are put forward in a decent and balanced way. No additional experiments are required to justify the claims.

      Minor comments:

      • In the paragraph headed "iRFP brightens iRFP more efficiently..." there is mention of "NLS-iRFP-NLS". Please introduce the abbreviation (nuclear localisation sequence?) and, if possible, why the sequence is present N- and C-terminally.
      • The fluorescence intensity increase upon PCB compared to BV treatment of S. pombe cultures expressing iRFP (Fig. 2C,D) appears about 7-fold, which is far more than the factor of 2 measured on the level of fluorescence quantum yield, and the protein levels were determined to be comparable. Are there any conceivable factors that further enhance the signal of PCB-bound iRFP in S. pombe?
      • In the spectral data of Fig. 3D,E, there appears to be a spectrometer artifact at around 450 nm in all spectra, which is not commented on. It is certainly not part of the cofactor / holoprotein spectra.
      • Lines 292/296: "Fig. 3H" should read "Fig. 2G".
      • The reader may wonder whether it is a common or rather rare phenomenon that the exchange of BV by PCB or some other bilin as (usually covalently bound) cofactor in iRFPs (or, more general, also in the parental phytochromes from different branches of the tree of Life) can be performed. Are there comparable studies available in the literature in addition to the work of Rumyantsev et al. (2015)?

      Significance

      The observation of an increased fluorescence quantum yield upon PCB insertion into iRFP is a technical advance with value as such, which will stimulate more detailed studies by spectroscopists (fluorescence, IR, Raman in particular) to clarify the principles underlying the increased quantum yield. While the findings and the developments of this study present a clear advance for imaging applications in fission yeast (available for cell biologists), at least the data on HeLa cells show that the method may not present an advance in terms of delivering a better (i.e. brighter) near-infrared chromophore for mammalian cells. It seems that the ability of a cell line to produce BV (from heme with the help of a HO) limits the potential of PCB addition to iRFP-expressing cells, since BV is readily taken up by the protein shortly after protein biosynthesis. Thus, more elaborate interventions in heme metabolism may be required to fully exploit the potential of the findings in mammalian cell systems and to fulfil the claim that also the optogenetics toolbox may benefit from the findings in the future.

      My own expertise relates to spectroscopic analyses (fluorescence, IR, Raman) of iRFP and mutant variants thereof in search of the determinants for increased fluorescence quantum yield, fluorescence lifetime analyses and photoreceptor research.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      • This paper provided a method for near-infrared imaging using iRFP and proved that phycocyanobilin was better than biliverdin for the imaging in fission yeast.

      Major comments:

      • What is the affinity (KD) of BV and PCB to iRFP? Is this related to the fluorescence intensity of iRFP in yeast?
      • The authors thought almost all iRFP forms a holo-complex with BV when HO1 is expressed in fission yeast. This should be proved by calculating the percentage of BV-iRFP in yeast. It is meaningful to compare the percentage of BV-iRFP and PCB-iRFP in vitro and in yeast.
      • The brightness of fluorescent proteins in organisms often depends on the molecular brightness (fluorescence quantum yield and extinction coefficient) and the amounts of fluorescent proteins. The authors indicated that iRFP-PCB is brighter than iRFP-BV at the molecular level. Calculating the amounts of iRFP-PCB and iRFP-BV when the different proteins are expressed in yeast would better explain the result that phycocyanobilin was better than biliverdin for imaging in fission yeast.

      Minor comments:

      • In line 35. iRFP is derived from bacteriophytochromes.
      • In line 62. The reference of Rodriguez et al. deals with allophycocyanin instead of bacteriophytochromes.
      • In lines 65-66. Bacteriophytochromes bind BV, phycocyanin, allophycocyanin and cyanobacterial phytochromes bind PCB, and plantal phytochromes bind PΦB.
      • In lines 67-71. The authors described the biosynthesis of BV, PCB and PΦB, but many references are missing.
      • In line 270. One full stop should be deleted in "cells.. Fission".
      • In line 272. How did the authors add the high concentration of PCB (625 microM) into the culture? PCB is insoluble.
      • In line 401. smURFP is derived from allophycocyanin instead of cyanobacteriochromes.
      • In line 474. As BV and PCB are insoluble, how do the authors add the pigments into DMEM?
      • In line 509. In which solvent are BV and PCB dissolved?
      • In lines 546, 547. What is ROI?
      • The authors should indicate whether codon optimization was necessary when HO1, PcyA, Fd and Fnr were expressed in yeast or in mammalian cells.
      • Figure S1 should show the software and algorithm used for the construction of the phylogenetic tree.
      • In Figure 1C and Figure 2C. The authors should explain the reasons why the fluorescence of iRFP was lower when yeast was treated with excessive BV and PCB.
      • In Supplementary Information, line 51. What is "teh"?

      Significance

      • Near-infrared fluorescent proteins broaden the spectrum of fluorescent proteins and facilitate deep tissue imaging. Based on iRFP, this work successfully constructed a near-infrared fluorescent protein with PCB as the chromophore in fission yeast by using a method of synthesizing PCB that this group had previously realized in mammalian cells. The method of synthesizing PCB in yeast not only facilitates fluorescence imaging, but is also meaningful for optogenetics experiments in yeast. This work is useful for the study of fluorescence imaging, optogenetics and biosynthesis when using yeast as a model organism.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2021-00831

      Corresponding author(s): Lu, Gan

      Reviewer comments are in regular font. Our rebuttal is in bolded font. Experiments that we plan for the full revision are preceded with “FULL:”. In the revision files, the changes are highlighted in yellow.

      1. General Statements

      We thank the reviewers for their detailed feedback. There are two major concerns. First, the manuscript lacks functional analysis of the meiotic triple helices (MTHs). Second, the manuscript makes claims about the properties of synaptonemal complexes (SCs) and MTHs that are inadequately supported. In order to address the first concern, we would need extensive experiments to first identify and then perturb the genes associated with the MTH. Such experiments are beyond the scope of this manuscript and are the focus of future studies. We will address the second concern with mostly text revisions. We will also improve some of the imaging analysis with new experiments that can be done in a few months’ time.

      2. Description of the planned revisions

      We will acquire new cryo-ET data of pachytene-arrested cell cryolamellae using our new K3-GIF camera. These new data have higher signal-to-noise ratios and allow us to generate a higher-resolution subtomogram average of the MTHs. The achievable resolution will depend on the conformational homogeneity of the MTH segments and on the number of cryotomograms we can capture. If we are able to achieve a subnanometer-resolution reconstruction, we will narrow down the possible identities of the subunits. Even if we cannot achieve subnanometer resolutions, the new data will allow us to test if ladder-like densities were missed in our lower-resolution older data, thereby improving our understanding of the SC’s structure. We will also perform subtomogram analysis of purified ribosomes as a control to strengthen our handedness determination.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      The Reviewers’ original comments are reproduced in regular font. Line numbers refer to the preliminary revision.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Ma and coworkers report studies of budding yeast undergoing meiosis by cryo-ET. They fail to detect structures interpretable as synaptonemal complex, and instead detect feather-like bundles of what appears to be a triple helix. These structures do not appear to be related to the synaptonemal complex, as spo11 mutants that do not initiate recombination, red1 mutants that lack axial elements, and zip1 mutants that lack the central element of the SC still make these bundles. These structures are absent from cells treated with latrunculin A, which depolymerizes actin filaments, but expected structures are not visible in light microscopy of cells treated with two different F-actin-staining reagents. However, it should be noted that another study (Takagi et al, 2021, bioRxiv) did detect actin associated with these structures by immunogold labeling. The structures are also reversibly dissolved in 7% hexanediol.

      This part of the paper's findings is well supported by data and is certainly of interest, although interest is somewhat limited by the unknown nature of these structures: what they contain, let alone their function, remains to be determined. In fact, it is not even determined whether or not they are made of protein. However, as an initial report of a previously unknown phenomenon, the paper is of some value.

      __Thank you for raising the issue of whether MTHs are composed of protein or not. Aside from proteins, the only other materials capable of forming large bundles of linear polymers are polysaccharides and DNA. Yeast polysaccharides are found in the cell wall, so they are unlikely to be a candidate for the MTHs. In the nucleus, DNA is abundant. While we favor that MTHs are composed of protein, we cannot rule out that the MTHs are non-chromatin DNA-protein complexes. Depending on the resolution of future subtomogram averages, we may get a better idea of the MTH’s composition.__

      There is, unfortunately, a second aspect of the paper that cannot be supported. Although it is clear that synaptonemal complex is present in the cells examined (by standard cytological methods) the authors cannot detect structures consistent with SC in their cryo-ET images. Unfortunately, authors then extrapolate from their inability to detect SC to conclusions about SC, such as that it is not crystalline, and even go so far as to suggest that their failure to detect SC invalidates two models for crossover interference, and that the ladder-like structure reported for SC in many organisms using many different approaches may be a fixation artifact. Authors should remember that the absence of evidence is not evidence for absence; the speculation described above should be removed from both the abstract and discussion.

      We have removed the speculation about crossover interference and limited the scope of our discussion on SCs to budding yeast only.

      The differences between traditional EM and cryo-EM are not trivial. In the introduction, we added more details to explain the differences in both the sample preparation and contrast generation:

      Lines 79-87: “Meiotic nuclei have been studied for decades by traditional EM (Fawcett, 1956; Moses, 1956), but not by cryo-ET. Cryo-ET can reveal 3-D nanoscale structural details of cellular structures in a life-like state because the samples are kept unfixed, unstained, and frozen-hydrated during all stages of sample preparation and imaging (Ng and Gan, 2020). The densities seen in cryo-ET data come from electron scattering of the biological macromolecules. In comparison, the densities seen in traditional EM come from electron scattering of heavy metals such as uranium, tungsten, and osmium, which have adhered to a subset of biological macromolecules that were not extracted in earlier steps.”

      We have also removed the term ‘artifact’ from lines 201-202:

      Original: “The ladder model is based on images of traditional EM samples, which are vulnerable to fixation and staining artifacts.”

      Revised: “The ladder model is based on images of traditional EM samples.”

      Reviewer #1 (Significance (Required)):

      This paper reports a previously unreported structure in the nuclei of yeast cells undergoing meiosis. The composition and function of this structure remain to be determined. This considerably limits the significance of the paper.

      **Referee Cross-commenting**

      I agree with the concerns of the other reviewers. I also agree with reviewer 3 that to raise the significance of the paper would require much work. But I think that the raw observation is of value, albeit in a journal of record. So I would stick with my recommendation of text changes, keeping in mind that there may not be a suitable journal in LSA's portfolio.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This work describes helical filamentous structures observed in budding yeast nuclei that were cryosectioned and imaged using cryo-electron tomography (cryoET).

      The goal of this work seems to have been to conduct an ultrastructural analysis of the synaptonemal complex (SC), a meiosis-specific protein structure that holds chromosomes together during meiosis and is thought to regulate meiotic recombination. In conventional TEM images of fixed, embedded, and stained sections obtained from pachytene nuclei, SCs usually appear as long, thin, transversely striated structures. At pachytene, SCs extend along the full length of a thin (100-nm) gap between paired chromosomes (typically 1-6 µm long in yeast cells). Surprisingly, the authors did not observe SCs, perhaps because these structures do not produce much contrast in cryo-EM images. Instead, they observe abundant triple helical structures in the nucleoplasm, which they designate as "meiotic triple helices" (MTHs). The authors report that these triple helices assemble at the same time as but independently of the SC. They published a preprint in which they indicated that these structures were somehow related to SCs, but the revised version reports that they appear independently of SCs. They further report that treatment of cells with the F-actin depolymerizing drug Latrunculin-A (LatA) resulted in a lack of detectable triple helical structures in the nucleus, suggesting that these structures may be a form of actin or, alternatively, that they may require F-actin for their assembly.

      While the work is technically mostly sound, its significance is unclear because the reported structures have no known function. Many, if not most proteins will form helical structures if their concentration rises above a threshold defined by their binding affinity for themselves (see https://doi.org/10.1186/1741-7007-11-119 and references therein), so this may simply be an example of an abundant nuclear protein that polymerizes to form helical filaments under the conditions that trigger yeast sporulation.

      Thank you for raising the interesting possibility that MTHs form helices because their subunits have exceeded a critical concentration threshold. In the revised text, we discuss the possibility that the MTH is simply a protein that is highly expressed in meiotic cells and polymerizes either because it exceeds a critical concentration or because it has undergone a biochemical change such as a post-translational modification:

      Lines 408-415: “Note it is possible that the MTHs may not be directly involved in meiosis, but are instead a protein that is at a sufficient concentration or has the right biochemical modifications to form helices in pachytene because it is known that many proteins can form a helix under the right conditions {Crane, 1950; Pauling, 1953; Theriot, 2013}. These MTHs also have lateral interactions that allow them to pack with crystalline density. Their sensitivity to 1,6-hexanediol suggests that the polymerization both within and between MTHs is based on hydrophobic interactions. Further work will be needed to determine the identity of the MTH’s subunits and their potential function.”

      The authors perform cryosectioning and cryoEM on yeast cells undergoing meiosis to show that the assembly and disassembly of MTHs follow a similar time course as that of the SC. The observation of these triple-helical filaments in meiotic nuclei has also been reported in another study (https://www.biorxiv.org/content/10.1101/778100v2.full.pdf+html), which proposed, based on immuno-EM labeling, that they may be actin cables. This study reports that the structures are not detected using phalloidin or Lifeact-mCherry. However, treatment with LatA did eliminate detection of MTH structures, suggesting that they may be comprised of actin.

      In my view, there are a number of issues that should be addressed before publication. Many of these relate to the presentation of the findings. Detailed comments below:

      1. The presentation of the work is very confusing. The authors clearly expected to observe SC structures, but did not. They conclude that MTHs are not SCs, since they do not depend on molecular components required for SC assembly. They should describe their findings in a more straightforward way rather than veering from introducing the SC to describing the MTHs.

      We have restructured the manuscript to tone down the discussion about the SC. However, we have to start with the SC because it is the most iconic feature of pachytene cells and a major organizer of chromatin in meiosis. Furthermore, its presence, as indicated by Zip1-GFP signals, was key to establishing that our cells were indeed arrested in pachytene. It would have been confusing to then overlook the absence of structural features as conspicuous and prevalent as what was expected of SCs. The other sections did have room for improvement. In the sections below, we describe the changes point by point.

      Similarly, the discussion section on "recombination and chromosome segregation" seems inappropriate and irrelevant, since no data are presented in this study regarding the functions of the MTHs, and there is no reason to think that they contribute to crossover interference, chromosome segregation, or other aspects of meiosis. Additionally, most of the ideas presented in this section seem very muddled. I recommend deleting this section.

      We have deleted the section entitled "Recombination and chromosome segregation".

      Throughout the text, we have also changed the term “meiosis-specific” to “meiosis-related” when describing MTHs. Doing so allows for the possibility that MTHs might just form as a consequence of being expressed to a high enough concentration as discussed above.

      Along the same lines as comment #1: The title should be changed - absence of evidence for "ladders" is not evidence of absence. Prior work using TEM and superresolution fluorescence microscopy has clearly shown that ladder-like SC structures exist in pachytene nuclei of budding yeast and many other organisms, although they apparently cannot be visualized using the methods described here.

      We have changed the title to be less forceful, yet report what we see and don’t see by cryo-ET:

      “Cryo-ET detects bundled triple helices but not ladders in meiotic budding yeast”

      The authors should clarify whether cryo-sectioning was performed through the full volume of pachytene nuclei.

      This comment refers to our attempt at serial cryosectioning, as shown in Fig. S8. We have revised the text in lines 332-334 to reflect the estimated volume covered:

      “We successfully reconstructed six sequential sections from one ndt80Δ cell (Figure S8), which represents approximately one third of a nucleus (assuming a spherical shape).”

      We also changed Fig. S8’s title so that it doesn’t sound like we reconstructed an entire nucleus:

      “MTH bundles are extensive throughout the cell nucleus.” → “MTH bundles are extensive.”
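
      The "approximately one third of a nucleus" estimate above rests on a spherical approximation. As a rough illustration only, the fraction of a sphere covered by a stack of parallel sections can be sketched as below. The nuclear diameter, section thickness, and central placement of the stack are hypothetical values chosen for the example; they are not taken from the manuscript.

```python
import math


def slab_volume_fraction(z_low, z_high, radius):
    """Fraction of a sphere's volume lying between the planes z = z_low and z = z_high."""
    # Clamp the slab to the sphere so that regions outside it contribute nothing.
    a = max(-radius, min(radius, z_low))
    b = max(-radius, min(radius, z_high))
    if b <= a:
        return 0.0
    # Integrate the circular cross-section area pi * (R^2 - z^2) from a to b.
    slab = math.pi * (radius ** 2 * (b - a) - (b ** 3 - a ** 3) / 3.0)
    sphere = 4.0 / 3.0 * math.pi * radius ** 3
    return slab / sphere


# Hypothetical inputs for illustration (assumptions, not values from the manuscript).
nucleus_diameter_nm = 2000.0
section_thickness_nm = 75.0
n_sections = 6

radius = nucleus_diameter_nm / 2.0
stack = n_sections * section_thickness_nm
# Assume the stack of serial sections is centered on the nuclear equator.
fraction = slab_volume_fraction(-stack / 2.0, stack / 2.0, radius)
print(f"Fraction of nuclear volume covered: {fraction:.2f}")
```

      A stack that is offset from the equator covers a smaller fraction for the same total thickness, so the placement assumption matters as much as the thickness.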

      It is not clear from the manuscript which camera/microscope configuration was used to acquire the cryoET data that were used for sub-tomogram averaging. The authors state in the methods that Falcon II and K3-GIF was used for projection images, but it's not clear if this applies to all images. These technical details should be clarified.

      We have added a new column to Table S4 that reports the camera used for all the projection imaging and tomography experiments. In the original MS, all of the subtomogram averaging was done using Falcon II data.

      FULL: In the full revision, we plan to incorporate new subtomogram averages of MTHs in situ, using K3-GIF data of cryolamellae.

      The analysis of the handedness of the helices seems to be questionable as the resolution of the reconstructions for 80S are also quite low. I am uncertain whether this can be used to state with confidence that the MTH are right-handed.

      FULL: In the full revision, we will use purified 70S ribosomes, imaged on the same K3-GIF camera and using the same software workflow as for the new subtomogram averaging of MTHs in situ. We expect higher resolution for both ribosomes and MTHs, which will make the handedness assignment unambiguous.

      The authors claim that treatment with Latrunculin-A (LatA) leads to disappearance of MTHs. However, they support this with projection images of cells treated with LatA. The projection images are of poor quality and the vitrification in these cells (as well as the DMSO treated cells) do not look appropriate. They should present data for LatA-treated cells and DMSO-treated controls obtained using the same approach and ideally imaged in parallel with untreated cells. They should also quantify the number of sections and cells imaged for all conditions.

      Once we realized that the MTH bundles were visible in projections, we chose to report detections of MTH bundles by projection imaging instead of the costly tilt series. The apparent poor quality and questionable vitrification come from the fact that the projection images show the cryosection’s crevasses and knife marks, which reside on opposite cryosection surfaces. These sectioning artifacts are computationally excluded from tomographic slices. The following line was added to the figure caption to explain this:

      “These image features are not devitrification artifacts; they are absent from the tomographic slices in other figures because they can be computationally excluded.”

      The quantification of the MTHs in LatA-treated vs control cells is in Table 2. We have now added these numbers to both the text and the figure caption.

      The similarities between the MTHs and SCs - that both are present in meiotic nuclei and sensitive to hexanediol - seem unlikely to be functionally relevant. Again, I think the presentation suffers from being focused on the SC which was not seen, rather than on the MTHs.

      We have toned down the discussion on SCs throughout the manuscript. We have retained the motivation for using 1,6-hexanediol to probe the MTHs’ physicochemical properties and the fact that previous work on SCs motivated this perturbation experiment. However, we removed the comparison of their relative sensitivities to 1,6-hexanediol (see reply to point #10 below).

      The absence of 100-nm-wide zones containing nucleosomes is again not evidence for lack of SCs. SCs are ribbon-like structures - they are about 100 nm in one dimension but the thickness has not been characterized reliably; even if SCs do exclude nucleosomes (which is not certain) the excluded volume might be much smaller than the authors imagine.

      We did not argue for “lack of SCs”; these structures clearly exist in our cells given the fluorescent linear structures seen in Zip1-GFP expressing cells. We only say that the textbook portrayal of SCs needs revision, though we should have restricted our statement to yeast. In the literature and textbooks, whenever the SC’s central element is drawn, it is depicted as free of internal nucleosomes and densely packed with SC proteins.

      Does Lifeact-mCherry enter the nucleus? This information is important in interpreting the failure to detect MTHs using this probe.

      While Lifeact-mCherry is small enough to passively diffuse through the nuclear pore, our data cannot rule out that this molecule is excluded from the nucleus. We added the following sentence as a caveat:

      Lines 244-245 “Note that we cannot rule out that Lifeact-mCherry is excluded from the nucleus.”

      The sensitivity of the MTH structures to 1,6-hexanediol treatment is potentially interesting, but it does not reveal anything about their structure or function, only that their assembly likely depends in part on hydrophobic interactions. Caution should be used in interpreting these findings.

      We have toned down the discussion about the meaning of the MTH bundles’ 1,6-hexanediol sensitivity by removing this line from the original Results:

      “MTH bundles are therefore sensitive to a slightly higher concentration of 1,6-hexanediol than SCs are and reform when 1,6-hexanediol is removed.”

      We have also added the following line to more clearly say what sensitivity to 1,6-hexanediol means:

      Lines 412-414: “Their sensitivity to 1,6-hexanediol suggests that the polymerization both within and between MTHs is based on hydrophobic interactions”

      The figure legends and/or Methods sections should clarify what is represented in each figure, and how the data were acquired. In particular, cryotomographic slices of varying dimensions (6 nm, 10 nm, 12 nm, or 70 nm) are mentioned in the captions of several figures (2-6, and S1, S2, S3). However, it is often unclear whether these represent physical or computational sections.

      “Tomographic slice” refers to a rendering of a computationally extracted slice from a reconstructed tomogram. To make it clearer, we have added the term “computational” to describe the tomographic slices in each figure caption.

      Page 23 has a supplemental figure but no captions. Is this the same as Fig. S8?

      Yes, this is a copy of Fig. S8 that appeared due to MS Word’s jumping-figure bugs. We will manually edit the PDF in the future revision.

      I do not find the model figure (Figure 7) to be helpful. Additionally, the failure to detect SCs and the presence of MTHs do not warrant a "revised model of the meiotic yeast nucleus."

      We now call panels A and B the “Traditional EM” and the “Cryo-EM” models, respectively. The figure therefore reports the large nuclear bodies seen by the two methods and no longer implies correctness.

      We also changed the related sentence in the Introduction:

      Original: “Our work strongly suggests that current models of pachytene nuclear cytology need revision.”

      Revised: “Our analysis shows that MTHs coexist with SCs, which have an unknown cryo-ET structure.”

      The absence of MTHs in haploid cells induced to undergo meiosis should perhaps be studied in more detail. Even SCs are present in haploid meiotic cells, so the absence of these structures may be informative as to their function. Haploid cells should also be stained for SCs and imaged by immunofluorescence to verify that they are in meiosis.

      The haploid strain that was treated with sporulation media cannot enter meiosis. Haploid cells that are capable of entering meiosis need to be disomic for chromosome III, with each copy having a different mating type at the MAT locus. We believe that the construction and studies of such strains would be more meaningful after we identify the MTH’s subunits and determine its function in diploid cells.

      The yeast strain is SK1, not SK-1.

      Thank you. This mistake is corrected.

      What are "self-pressurized-frozen samples" (p.2)?

      Self-pressurized-frozen samples are generated by an alternative to the conventional machine-based high-pressure-freezing method. We have added more details in the new lines 135-141:

      “Self-pressurized freezing is a simpler and lower-cost alternative to conventional high-pressure freezing, which requires a dedicated machine that consumes large amounts of cryogen. In the self-pressurized freezing method, the sample is sealed in a metal tube and rapidly cooled in liquid ethane. The material in direct contact with the metal cools first and expands by forming crystalline ice, which exerts pressure on the material in the center of the tube (Leunissen and Yi, 2009; Yakovlev and Downing, 2011; Han et al., 2012).”

      Reviewer #2 (Significance (Required)):

      The observation of MTHs is novel but (as stated above) of unclear significance, given that their molecular identity and function are unknown.

      This work may be of interest to the meiosis field, with the caveats described above that the functional relevance is currently unclear.

      This review was co-written by referees with expertise in meiosis, chromosome organization, SC structure and function, and cryoEM.

      **Referee Cross-commenting**

      The reviews are strikingly concordant so I don't think much needs to be added.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The authors report the observation of filamentous structures (further termed meiotic triple helices MTH) in the nucleoplasm during meiosis in yeast cells. Those structures are visualized using Cryo-ET. While the authors initially seem to assume an association of those structures with synaptonemal complexes, they discover that those structures rely on filamentous actin and are not affiliated with synaptonemal complexes after all.

      While I think that the observation of those MTH by Cryo-ET is interesting, the overall structure of the paper and presentation of the data are not very well done. As the authors find throughout their experiments that the MTHs are not associated with synaptonemal complexes, the strong focus in the first figures on synaptonemal complexes as well as the title of the paper are very misleading. The authors try to initially make the point that other labs have observed ladder-like structures by transmission electron microscopy and want to make the claim that those observed structures might be an artifact of sample preparation, hence the title: Meiotic budding yeast assemble bundled triple helices but not ladders. However, at the end those structures seem unrelated to synaptonemal complexes.

      In addition, several labs have reported the presence of nuclear actin in meiosis and mitosis and have even succeeded in showing those structures by transmission electron microscopy, questioning the "artifact" argument.

      Presumably, the Reviewer means intranuclear actin, as opposed to perinuclear (cytoplasmic) actin. If so, then we have only seen one paper, Takagi et al., 2021 (bioRxiv), that has reported seeing structures associated with nuclear actin. Note, however, that the revised Takagi bioRxiv paper is very careful in saying that the filament bundles contain actin, which is not the same as saying that the filaments are polymers of actin.

      Our “artifact” argument – now removed – referred to SCs, not to nuclear actin.

      In the revised manuscript, we use the term “intranuclear” to make it clear that we are referring to structures inside the nucleus. We also use the term F-actin, where appropriate, to refer to the best-studied actin polymer, which resembles a double helix. Doing so eliminates confusion about other forms of actin: G-actin, which cryo-ET cannot yet directly visualize due to its small size; and non-canonical actin polymers, for which there are no previous experimental X-ray or cryo-EM structures for comparisons.

      The story line of the paper is weak and I think the authors would have been better off reporting their cryo-ET structures and making a better link to actin or determining what else they think might be a component of those structures. Immuno-EM (as actually shown in Reference 41) of actin would have been much more convincing. The authors could also use the power of Cryo-ET and the achievable higher resolution to describe those filaments in much more detail. In my opinion this would have been a much better and more exciting paper.

      We disagree that “making a better link to actin” is the right approach because doing so presupposes that the structures are composed of actin, for which the present evidence is inadequate. We do agree that determining what the MTHs are (or are not) would be valuable.

      FULL: Now that we have better access to a K3-GIF camera and a cryo-FIB-SEM, we will attempt higher-resolution subtomogram averaging analysis. If we are able to achieve subnanometer resolution, we will attempt to narrow down the fold of the MTH subunit. Note that this goal will require that the MTHs are conformationally homogeneous and that we can image sufficient copies of the MTHs.

      In summary: While I think the Cryo-ET images of those structures could be very exciting, the paper unfortunately does not do a very good job of presenting this data and is at times misleading in trying to prove, or rather disprove, a connection to synaptonemal complexes. Based on this I think that the paper cannot be published in the current form and needs major revisions that would require a significant amount of time.

      **Minor comments:**

      Figure 1: The choice of timepoints is confusing and makes it hard to compare. While wild type is shown at 0, 1, 3, 4, 5, 8 h, the mutant is shown at 0, 2, 3, 4, 5, 6, 7 h. It would be appropriate to select the same timepoints for both conditions.

      FULL: We will recollect fluorescence images of the mutant cells at the same time points as the wild type.

      Figure 3 and 4 need a quantification of the number of observed MTHs, in particular as only selected regions of the nuclei are shown.

      These images were taken from single cryosections instead of serial cryosections, which would have been too difficult to do for multiple conditions and multiple cells. Therefore, quantification would be obfuscated by the fact that each cryosection samples a small fraction of the nuclear volume. We believe that reporting the number of cell cryosections that are MTH-positive (Table 2) is at present the best way to characterize their abundance and ability to polymerize. Once we are able to identify the MTH gene products, we will be able to perform GFP tagging experiments and thereby get a much better estimate of the polymer mass as a function of biochemical perturbations.

      Fig 7. The data certainly does not support a "REVISED" model of the yeast nuclear organization.

      We have changed the Figure title to “A cryo-ET model of the meiotic yeast nucleus.” We now also refer to panels A and B as “Traditional EM model” and “Cryo-ET model”, respectively.

      Reviewer #3 (Significance (Required)):

      Several publications have already shown the presence of actin in meiotic and mitotic nuclei and have even succeeded in observing those structures by transmission electron microscopy. Based on this it is not clear why the authors have not tried to put their work in context to all these observations and used their technology to obtain novel information on the structure, which might be helpful to identify which proteins compose the MTHs. Based on how the data is presented I do not think that this paper contributes anything new to the field.

      Presumably, the reviewer means that F-actin has been imaged in yeast cells, because for actin to exist in nuclei in mitosis/meiosis, the organism would have to undergo closed mitosis/meiosis. Furthermore, for actin to be observable by transmission electron microscopy, it would have to be in the filamentous (F-actin) form. We could not find any publications that report transmission electron microscopy of F-actin in yeast cells. We therefore cannot relate our results to F-actin in the meiotic nucleus.

      My field of expertise is meiosis and mitosis as well as imaging (light and electron microscopy).

      **Referee Cross-commenting**

      All reviewers seem to agree that the general observation of these structures is interesting but that there is a reduced significance as the function and identity of these structures remains unknown.

      4. Description of analyses that authors prefer not to carry out

      The main unanswered question is: what is the function of the MTH bundles? To address this question, we would first need to identify the gene products that are needed for MTH assembly. Next, we would need to do genetic perturbation experiments to actually determine the MTHs’ function. These experiments would constitute a complete study, which is better suited for a separate, future manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The authors report the observation of filamentous structures (further termed meiotic triple helices MTH) in the nucleoplasm during meiosis in yeast cells. Those structures are visualized using Cryo-ET. While the authors initially seem to assume an association of those structures with synaptonemal complexes, they discover that those structures rely on filamentous actin and are not affiliated with synaptonemal complexes after all.

      While I think that the observation of those MTH by Cryo-ET is interesting, the overall structure of the paper and presentation of the data are not very well done. As the authors find throughout their experiments that the MTHs are not associated with synaptonemal complexes, the strong focus in the first figures on synaptonemal complexes as well as the title of the paper are very misleading. The authors try to initially make the point that other labs have observed ladder-like structures by transmission electron microscopy and want to make the claim that those observed structures might be an artifact of sample preparation, hence the title: Meiotic budding yeast assemble bundled triple helices but not ladders. However, at the end those structures seem unrelated to synaptonemal complexes.

      In addition, several labs have reported the presence of nuclear actin in meiosis and mitosis and have even succeeded in showing those structures by transmission electron microscopy, questioning the "artifact" argument.

      The story line of the paper is weak and I think the authors would have been better off reporting their cryo-ET structures and making a better link to actin or determining what else they think might be a component of those structures. Immuno-EM (as actually shown in Reference 41) of actin would have been much more convincing. The authors could also use the power of Cryo-ET and the achievable higher resolution to describe those filaments in much more detail. In my opinion this would have been a much better and more exciting paper.

      In summary: While I think the Cryo-ET images of those structures could be very exciting, the paper unfortunately does not do a very good job of presenting this data and is at times misleading in trying to prove, or rather disprove, a connection to synaptonemal complexes. Based on this I think that the paper cannot be published in the current form and needs major revisions that would require a significant amount of time.

      Minor comments:

      Figure 1: The choice of timepoints is confusing and makes it hard to compare. While wild type is shown at 0, 1, 3, 4, 5, 8 h, the mutant is shown at 0, 2, 3, 4, 5, 6, 7 h. It would be appropriate to select the same timepoints for both conditions.

      Figure 3 and 4 need a quantification of the number of observed MTHs, in particular as only selected regions of the nuclei are shown.

      Fig 7. The data certainly does not support a "REVISED" model of the yeast nuclear organization.

      Significance

      Several publications have already shown the presence of actin in meiotic and mitotic nuclei and have even succeeded in observing those structures by transmission electron microscopy. Based on this it is not clear why the authors have not tried to put their work in context to all these observations and used their technology to obtain novel information on the structure, which might be helpful to identify which proteins compose the MTHs. Based on how the data is presented I do not think that this paper contributes anything new to the field. My field of expertise is meiosis and mitosis as well as imaging (light and electron microscopy).

      Referee Cross-commenting

      All reviewers seem to agree that the general observation of these structures is interesting but that there is a reduced significance as the function and identity of these structures remains unknown.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This work describes helical filamentous structures observed in budding yeast nuclei that were cryosectioned and imaged using cryo-electron tomography (cryoET).

      The goal of this work seems to have been to conduct an ultrastructural analysis of the synaptonemal complex (SC), a meiosis-specific protein structure that holds chromosomes together during meiosis and is thought to regulate meiotic recombination. In conventional TEM images of fixed, embedded, and stained sections obtained from pachytene nuclei, SCs usually appear as long, thin, transversely striated structures. At pachytene, SCs extend along the full length of a thin (100-nm) gap between paired chromosomes (typically 1-6 µm long in yeast cells). Surprisingly, the authors did not observe SCs, perhaps because these structures do not produce much contrast in cryo-EM images. Instead, they observe abundant triple helical structures in the nucleoplasm, which they designate as "meiotic triple helices" (MTHs). The authors report that these triple helices assemble at the same time as but independently of the SC. They published a preprint in which they indicated that these structures were somehow related to SCs, but the revised version reports that they appear independently of SCs. They further report that treatment of cells with the F-actin depolymerizing drug Latrunculin-A (LatA) resulted in a lack of detectable triple helical structures in the nucleus, suggesting that these structures may be a form of actin or, alternatively, that they may require F-actin for their assembly.

      While the work is technically mostly sound, its significance is unclear because the reported structures have no known function. Many, if not most proteins will form helical structures if their concentration rises above a threshold defined by their binding affinity for themselves (see https://doi.org/10.1186/1741-7007-11-119 and references therein), so this may simply be an example of an abundant nuclear protein that polymerizes to form helical filaments under the conditions that trigger yeast sporulation.

      The authors perform cryosectioning and cryoEM on yeast cells undergoing meiosis to show that the assembly and disassembly of MTHs follow a similar time course as that of the SC. The observation of these triple-helical filaments in meiotic nuclei has also been reported in another study (https://www.biorxiv.org/content/10.1101/778100v2.full.pdf+html), which proposed, based on immuno-EM labeling, that they may be actin cables. This study reports that the structures are not detected using phalloidin or Lifeact-mCherry. However, treatment with LatA did eliminate detection of MTH structures, suggesting that they may be comprised of actin.

      In my view, there are a number of issues that should be addressed before publication. Many of these relate to the presentation of the findings. Detailed comments below:

      1. The presentation of the work is very confusing. The authors clearly expected to observe SC structures, but did not. They conclude that MTHs are not SCs, since they do not depend on molecular components required for SC assembly. They should describe their findings in a more straightforward way rather than veering from introducing the SC to describing the MTHs. Similarly, the discussion section on "recombination and chromosome segregation" seems inappropriate and irrelevant, since no data are presented in this study regarding the functions of the MTHs, and there is no reason to think that they contribute to crossover interference, chromosome segregation, or other aspects of meiosis. Additionally, most of the ideas presented in this section seem very muddled. I recommend deleting this section.
      2. Along the same lines as comment #1: The title should be changed - absence of evidence for "ladders" is not evidence of absence. Prior work using TEM and superresolution fluorescence microscopy has clearly shown that ladder-like SC structures exist in pachytene nuclei of budding yeast and many other organisms, although they apparently cannot be visualized using the methods described here.
      3. The authors should clarify whether cryo-sectioning was performed through the full volume of pachytene nuclei.
      4. It is not clear from the manuscript which camera/microscope configuration was used to acquire the cryoET data that were used for sub-tomogram averaging. The authors state in the methods that Falcon II and K3-GIF were used for projection images, but it's not clear if this applies to all images. These technical details should be clarified.
      5. The analysis of the handedness of the helices seems to be questionable, as the resolution of the reconstructions for 80S is also quite low. I am uncertain whether this can be used to state with confidence that the MTHs are right-handed.
      6. The authors claim that treatment with Latrunculin-A (LatA) leads to disappearance of MTHs. However, they support this with projection images of cells treated with LatA. The projection images are of poor quality and the vitrification in these cells (as well as the DMSO-treated cells) does not look appropriate. They should present data for LatA-treated cells and DMSO-treated controls obtained using the same approach and ideally imaged in parallel with untreated cells. They should also quantify the number of sections and cells imaged for all conditions.
      7. The similarities between the MTHs and SCs - that both are present in meiotic nuclei and sensitive to hexanediol - seem unlikely to be functionally relevant. Again, I think the presentation suffers from being focused on the SC which was not seen, rather than on the MTHs.
      8. The absence of 100-nm-wide zones containing nucleosomes is again not evidence for lack of SCs. SCs are ribbon-like structures - they are about 100 nm in one dimension but the thickness has not been characterized reliably; even if SCs do exclude nucleosomes (which is not certain) the excluded volume might be much smaller than the authors imagine.
      9. Does Lifeact-mCherry enter the nucleus? This information is important in interpreting the failure to detect MTHs using this probe.
      10. The sensitivity of the MTH structures to 1,6-hexanediol treatment is potentially interesting, but it does not reveal anything about their structure or function, only that their assembly likely depends in part on hydrophobic interactions. Caution should be used in interpreting these findings.
      11. The figure legends and/or Methods sections should clarify what is represented in each figure, and how the data were acquired. In particular, cryotomographic slices of varying dimensions (6 nm, 10 nm, 12 nm, or 70 nm) are mentioned in the captions of several figures (2-6, and S1, S2, S3). However, it is often unclear whether these represent physical or computational sections.
      12. Page 23 has a supplemental figure but no captions. Is this the same as Fig. S8?
      13. I do not find the model figure (Figure 7) to be helpful. Additionally, the failure to detect SCs and the presence of MTHs do not warrant a "revised model of the meiotic yeast nucleus."
      14. The absence of MTHs in haploid cells induced to undergo meiosis should perhaps be studied in more detail. Even SCs are present in haploid meiotic cells, so the absence of these structures may be informative as to their function. Haploid cells should also be stained for SCs and imaged by immunofluorescence to verify that they are in meiosis.
      15. The yeast strain is SK1, not SK-1.
      16. What are "self-pressurized-frozen samples" (p.2)?

      Significance

      The observation of MTHs is novel but (as stated above) of unclear significance, given that their molecular identity and function are unknown.

      This work may be of interest to the meiosis field, with the caveats described above that the functional relevance is currently unclear.

      This review was co-written by referees with expertise in meiosis, chromosome organization, SC structure and function, and cryoEM.

      Referee Cross-commenting

      The reviews are strikingly concordant so I don't think much needs to be added.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Ma and coworkers report studies of budding yeast undergoing meiosis by cryo-ET. They fail to detect structures interpretable as synaptonemal complex, and instead detect feather-like bundles of what appears to be a triple helix. These structures do not appear to be related to the synaptonemal complex, as spo11 mutants that do not initiate recombination, red1 mutants that lack axial elements, and zip1 mutants that lack the central element of the SC still make these bundles. These structures are absent from cells treated with latrunculin A, which depolymerizes actin filaments, but expected structures are not visible in light microscopy of cells treated with two different F-actin-staining reagents. However, it should be noted that another study (Takagi et al, 2021, bioRxiv) did detect actin associated with these structures by immunogold labeling. The structures are also reversibly dissolved in 7% hexanediol.

      This part of the paper's findings is well supported by data and is certainly of interest, although interest is somewhat limited by the unknown nature of these structures: what they contain, let alone their function, remains to be determined. In fact, it is not even determined whether or not they are made of protein. However, as an initial report of a previously unknown phenomenon, the paper is of some value.

      There is, unfortunately, a second aspect of the paper that cannot be supported. Although it is clear that synaptonemal complex is present in the cells examined (by standard cytological methods), the authors cannot detect structures consistent with SC in their cryo-ET images. Unfortunately, the authors then extrapolate from their inability to detect SC to conclusions about SC, such as that it is not crystalline, and even go so far as to suggest that their failure to detect SC invalidates two models for crossover interference, and that the ladder-like structure reported for SC in many organisms using many different approaches may be a fixation artifact. The authors should remember that the absence of evidence is not evidence for absence; the speculation described above should be removed from both the abstract and discussion.

      Significance

      This paper reports a previously unreported structure in the nuclei of yeast cells undergoing meiosis. The composition and function of this structure remain to be determined. This considerably limits the significance of the paper.

      Referee Cross-commenting

      I agree with the concerns of the other reviewers. I also agree with reviewer 3 that to raise the significance of the paper would require much work. But I think that the raw observation is of value, albeit in a journal of record. So I would stick with my recommendation of text changes, keeping in mind that there may not be a suitable journal in LSA's portfolio.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Saha and colleagues investigated the functions of the long non-coding RNA (lncRNA) DRAIC in malignant glioma. They find that DRAIC expression decreases cell migration/invasion and tumorsphere/colony formation in vitro, and tumor growth in vivo using established cell lines. Mechanistically, DRAIC is known to inhibit NF-kB signaling and the authors demonstrate that DRAIC activates AMPK leading to repression of mTOR, which decreases protein synthesis and increases autophagy. This is a solid study highlighting a potentially interesting pathway of tumor growth and invasion in brain tumors.

      Answer: We appreciate Reviewer 1's positive feedback on our study.


      Major Comments:


      1) It is unclear whether the presented values (mean +/- SD) in the histograms refer to repeat measurements (in which case n = 1) or independent experiments (n>1). The number of replicate experiments is not stated in the methods or figure legends. This must be included.

      Answer: We want to thank Reviewer 1 for pointing out this omission. We have now included this information in the Materials and Methods section and in the figure legends.


      2) I don't think the immunoblot for p62 in Fig. 5C shows a convincing increase following DRAIC knockout, so the statement on p.8 should be revised.

      Answer: We have revised the statement to say: Consistent with DRAIC decrease being associated with a decrease in autophagic flux, and despite a decrease in p62 mRNA, the level of p62 protein is increased in three of the DRAIC KO prostate cancer cells (Fig. 5C, KO1, KO2, KO4 compared to WT) and unchanged in the other two.

      3) On p.8/Fig.5 the authors make a case that increased DRAIC levels increase lysosomal degradation of autophagosome core proteins LC3 II / p62 (resulting in decreased protein levels of both), while simultaneously increasing gene expression of LC3B and p62 (causing increased mRNA levels). The data for DRAIC overexpression fit this logic fairly well (even though I think more work is needed to fully support this claim), but I am finding it difficult to reconcile the DRAIC knockout data with this scenario - here, loss of DRAIC results in increased protein levels due to decreased autophagy, but also decreased gene expression. To fully support this argument, rescue experiments would be needed using FoxO3a knockout/overexpression.

      Answer: Note that the mRNA level is not always correlated with protein expression. This is particularly true for LC3 and p62, whose protein levels are significantly affected by the extent of fusion of autophagosomes and lysosomes and subsequent degradation in autophagolysosomes. Thus, although the mRNA of these genes is decreased in DRAIC KO cells (Fig. 5E), the proteins are increased (Fig. 5C) because of decrease of autophagic flux (and decrease of degradation in the autophagolysosomes).

      The overexpression of FoxO3a in the DRAIC KO cells will not restore mRNA levels of LC3 or p62, because we show in Fig. 4H that FoxO3 phosphorylation by AMPK is suppressed by DRAIC KO. This phosphorylation is important for the induction of LC3 or p62 mRNA by FoxO3.

      FoxO3 knockout or knockdown in DRAIC OE cells should decrease LC3B or p62 mRNA in Fig. 5D, but it is already known from the literature that FoxO3a is necessary for inducing LC3B or p62 mRNA (Cell Metab. 2007 Dec;6(6):458-71. doi: 10.1016/j.cmet.2007.11.001. PMID: 18054315).

      4) Similarly, the data supporting increased autophagy following DRAIC overexpression (Fig. 5F/G) are a bit weak and lack controls (is the LC3B-GFP overlapping with endogenous LC3B and autophagosomes? Was the transfection efficiency comparable? Is there fusion with lysosomes?). In the absence of stronger data, the authors should temper their claims that DRAIC increases autophagy.

      Answer: LC3B of fusion protein LC3B-GFP is known to overlap with the p62 puncta (similar to endogenous LC3B). This result is in Fig. 4A of the citation that we have now added (Proc Natl Acad Sci U S A. 2016 Nov 22;113(47): E7490-E7499. doi: 10.1073/pnas.1615455113. Epub 2016 Oct 17)

      To support our hypothesis that DRAIC OE induces more autophagy compared to empty vector, we used Bafilomycin A1 in Figure 5B to inhibit the autophagosome and lysosome fusion. We see the accumulation of more LC3B upon treatment with Bafilomycin A1 in the DRAIC OE cells (compared to EV containing U251 cells), consistent with the idea that autophagosome-lysosome fusion is increased by DRAIC OE.



      5) No information is provided on animal numbers used in this study. How many mice were used per cohort? Were male and female mice used? Authors should follow ARRIVE guidelines in reporting animal experiments. The method for calculating tumor volume needs to be specified.

      Answer: We have included the details of the animal study in the methods section of our modified manuscript.

      6) Student's T-test is inappropriate for comparisons of more than two groups (i.e. all experiments using DRAIC knockout cells) - for these experiments a Kruskal Wallis test or ANOVA should be used. Did the authors test for normal distribution of their data? This may affect statistical testing and should be taken into consideration.

      Answer: We have now modified our statistical calculation and included it in the statistical analysis section of our modified manuscript.
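
      As an illustration of the kind of multi-group comparison the reviewer requests, the following is a minimal sketch of a Kruskal-Wallis test in plain Python. It is not the authors' analysis: the group names and values (`wt`, `ko1`, `ko2`) are made up for illustration, the H statistic is computed without a tie correction, and the p-value shortcut shown is valid only for exactly three groups (two degrees of freedom).

      ```python
      import math
      from itertools import chain


      def average_ranks(values):
          """Return 1-based ranks; tied values share the mean of their positions."""
          order = sorted(range(len(values)), key=lambda i: values[i])
          ranks = [0.0] * len(values)
          i = 0
          while i < len(order):
              j = i
              # Extend j across a run of equal values.
              while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
                  j += 1
              mean_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
              for k in range(i, j + 1):
                  ranks[order[k]] = mean_rank
              i = j + 1
          return ranks


      def kruskal_wallis_h(*groups):
          """Return (H, degrees of freedom) for two or more groups (no tie correction)."""
          pooled = list(chain.from_iterable(groups))
          n_total = len(pooled)
          ranks = average_ranks(pooled)
          weighted_rank_sums = 0.0
          start = 0
          for group in groups:
              rank_sum = sum(ranks[start:start + len(group)])
              weighted_rank_sums += rank_sum ** 2 / len(group)
              start += len(group)
          h = 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)
          return h, len(groups) - 1


      def chi2_sf_two_df(x):
          """Chi-square survival function for exactly 2 degrees of freedom."""
          return math.exp(-x / 2)


      # Hypothetical measurements for one wild-type and two knockout clones.
      wt = [1.0, 2.0, 3.0]
      ko1 = [4.0, 5.0, 6.0]
      ko2 = [7.0, 8.0, 9.0]

      h_stat, dof = kruskal_wallis_h(wt, ko1, ko2)
      p_value = chi2_sf_two_df(h_stat)  # asymptotic approximation, 3 groups only
      print(f"H = {h_stat:.2f}, df = {dof}, p = {p_value:.4f}")
      ```

      With groups this small the chi-square approximation is rough, so an exact or permutation-based p-value from a statistics package would be preferable in practice; a significant overall result would then be followed by a suitable post-hoc test for the pairwise comparisons.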


      Minor Comments:


      7) Authors mention that DRAIC expression is undetectable in immortalized astrocytes and GBM cancer stem cells (Fig. S1). What is the source of these cells and how were they cultured?

      Answer: The immortalized astrocytes and GBM stem cells and their culture conditions are now described.

      8) The immunoblot in Fig. 3D could be replaced with a slightly lower exposure to make the difference between WT and DRAIC KO more obvious.

      Answer: We have now replaced the immunoblot with a lower exposure.

      9) Some immunoblots in Fig. 3 (panel E, p-S6K and S6K; panel H, actin) are not of the best quality and an effort should be made to replace them.

      Answer: We have now replaced the p-S6K immunoblot as the reviewer mentioned.



      10) Why are different loading controls used in Fig. 3 (a-Tubulin v actin)?

      Answer: We use multiple loading controls to make sure that we are not underestimating or overestimating changes in the experimental protein because of unexpected changes in the loading controls.

      11) Compared to other blot images in the same figure (e.g. Fig. 3E), the bands for p-mTOR and mTOR in Fig. 3F look compressed and should be shown appropriately sized.

      Answer: We have modified the figure as the reviewer suggested.


      12) The layout of Fig. 4 is somewhat confusing. I would suggest organizing this according to DRAIC overexpression in A172 and U373 cells versus DRAIC knockout in LNCaP cells. Each immunoblot should be clearly labelled with the corresponding cell line, and it should be clearly explained why p-FoxO3a was tested in U251 cells, rather than A172/U373 as in the rest of the figure.

      Answer: We thank the reviewer for the constructive criticism. We have labeled all the cell lines in the figure as the reviewer suggested. We have now systematically alternated the prostate cancer cells (for KO) and the GBM cells (for OE), as we looked at each relevant marker. We have now included the western blot for p-FoxO3a from another glioblastoma cell line, U373. Please find the modified Figure 4K for p-FoxO3a.

      13) Labelling of immunoblot in Fig. 5B is confusing and should be improved.

      Answer: We have modified the Fig. 5B to make the label clearer.

      14) Changes in GLUT1 expression (Fig. 7A) should be validated on the protein level.

      Answer: We have included the immunoblot for GLUT1 from DRAIC KO cells in Figure 7B. GLUT1 protein is increased upon DRAIC KO.


      Reviewer #1 (Significance (Required)):

      The authors describe a novel link between the lncRNA DRAIC and AMPK activation through inhibition of NF-kB-mediated regulation of GLUT1. This study extends their previous work on DRAIC inhibition of NF-kB in prostate cancer (Saha et al. Cancer Res 2020). There is one study describing DRAIC effects on growth and invasion in glioma cell lines (Li et al. Eur Rev Med Pharmacol Sci 2020), but the work presented by Saha and colleagues contains stronger experimental data and a more detailed and previously undescribed mechanism.

      The current study presents a mechanistic advance that increases the understanding of tumor growth and protein synthesis in cancer cells. The data presented in the study are not supported by in vivo experiments (other than suppression of tumor growth by DRAIC overexpression), validation in human tissue and/or primary patient-derived human glioblastoma cells, or even substantial rescue experiments. This limits the influence of the work on the field. I'm also not sure how transferable findings from DRAIC knockout in prostate cancer cell lines are to glioma, although the results are mostly complementary to the data from glioma cell lines. This is particularly relevant to the proposed mechanism of GLUT1 regulation by NF-kB, as the bulk of experimental data in Figures 6 and 7 was generated in prostate cancer cell lines and is only poorly validated in glioma cells. The study results will be most relevant for researchers investigating cell signaling pathways and autophagy in cancer.

      Answer: We would like to thank the reviewer for the positive comments on our study. The DRAIC KO experiments of Fig. 6 and 7 cannot be done in glioma cells because, as we show in Supp. Fig. S1, there are no glioma cells or GSCs that express DRAIC at levels comparable to LNCaP. We have shown that GLUT1 mRNA decreases in the glioma cells when DRAIC is overexpressed (Supp. Fig. S4). We also show in Fig. 7G that AMP levels increase when DRAIC is overexpressed in glioma cells.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors describe DRAIC as a lncRNA downregulated in prostate cancer. They postulate that DRAIC expression suppresses invasion, migration and growth. Mechanistically, the authors show that DRAIC activates AMPK by suppressing NFkB target gene TOR and indirectly impacting translation and autophagy. Collectively the observation is interesting and robust. However, I have several technical requests, particularly regarding the mechanistic part of the paper.

      Answer: We appreciate the positive feedback. We have addressed the reviewer’s concerns in our modified manuscript.


      Major Comments:


      1) The authors should rescue KO phenotypes by overexpressing DRAIC to consider potential off-target effects.

      Answer: DRAIC OE alone is sufficient to have exactly the opposite effect to DRAIC KO on protein translation (Fig. 3C-F), so DRAIC OE will rescue the effect of DRAIC KO. We make a similar argument for all the phenotypes, including mTOR, S6K and ULK1(S757) phosphorylation (Fig. 3G-J), AMPK and FoxO3a phosphorylation (Fig. 4B-C; J-L), autophagic flux (Fig. 5B, C) and effects on LC3B and p62 mRNAs (Fig. 5D, E). The same is true for our published phenotypes of DRAIC KO on invasion, migration and NF-kB activity (Saha, Cancer Research, 2020).


      2) The blots showing TOR and ULK1 phosphorylation need to be repeated. This is an important part of the paper and I feel that these blots are hard to interpret. p-S6K typically runs a bit higher in gels. There may be a technical problem.

      Answer: We are not sure which specific blots the reviewer is referring to, and it is possible that the blots the other reviewers pointed to are the ones under question. We have changed those blots so that the results are clear.


      3) GLUT1-related results are interesting, but the authors should provide genetic evidence that the effects are mediated by GLUT1. How do we know that glucose uptake is indeed upregulated upon knockout?

      Answer: In Fig. 7 C-F we show that the effects of DRAIC KO on invasion, protein translation, AMP levels and AMPK activity are reversed by the GLUT1 inhibitor Bay-876. This is a cleaner result than using siRNA to knockdown GLUT1. siRNAs can have off-target activity and sometimes cannot decrease a protein sufficiently below the threshold necessary to see reversal of action.


      Minor Comments:

      4) The figures need to be updated. Fonts are all different, there are lots of unaligned graphs, and the quality of the blots is poor.

      Answer: We have updated the figures and changed the fonts as the reviewer mentioned.

      Reviewer #2 (Significance (Required)):

      The observation is interesting, but the mechanism is incompletely understood. This is a nice addition to the literature, even without the mechanism.

      Answer: We want to thank the reviewer for the constructive criticism.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Saha and colleagues present a study demonstrating the tumor suppressive role of DRAIC, a long non-coding RNA transcript, through transmission of the signal from IKK/NF-kB to the AMPK/mTOR pathway via regulation of GLUT1 expression. The inhibition of mTOR by this pathway results in the reduction of protein translation, cellular invasion and activation of autophagy. Several diseases and models as well as multiple genetic and pharmacological manipulations were used to investigate the mechanisms at play. The manuscript is well written and the experiments are well designed. The conclusions are supported by the results. The following major and minor comments should be addressed:

      Answer: We appreciate the reviewer for the positive comments on our study.


      Major Comments:


      1) In addition to reporting the effect of DRAIC overexpression on tumor volume, the authors should present survival studies with one or more models.

      Answer: We thought of doing the survival study in our glioblastoma model but unfortunately, the tumor growth is very rapid (exceeding the size permitted by our IACUC in 2-3 weeks). The animal ethics welfare committee did not allow us to keep the mice for a longer time to perform the survival study.



      2) Since the authors study metabolic energy sensor pathways, related to glycolysis, it would be important to perform some of the key experiments in physiological level of glucose: e.g., pmTOR, pAMPK, LC3-II expression level in DRAIC overexpressing and deficient cells.


      Answer: The concentration of glucose in plasma is 1 g/L, while that of the RPMI medium is 2 g/L. We do not think we are too far from the physiological levels of glucose.
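
      For readers who think in molar units, the two concentrations quoted in this answer can be converted with a short calculation. This is a minimal sketch: the function and constant names are illustrative, and the only outside input is the molar mass of D-glucose (about 180.16 g/mol).

      ```python
      # Molar mass of D-glucose (C6H12O6), in g/mol.
      GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.16


      def g_per_l_to_mmol_per_l(conc_g_per_l, molar_mass=GLUCOSE_MOLAR_MASS_G_PER_MOL):
          """Convert a mass concentration (g/L) to a molar concentration (mmol/L)."""
          return conc_g_per_l / molar_mass * 1000.0


      plasma_mm = g_per_l_to_mmol_per_l(1.0)  # plasma value quoted in the answer
      rpmi_mm = g_per_l_to_mmol_per_l(2.0)    # RPMI value quoted in the answer

      print(f"plasma: {plasma_mm:.2f} mM, RPMI: {rpmi_mm:.2f} mM")
      ```

      On these figures the medium contains roughly 11 mM glucose against roughly 5.5 mM in plasma, that is, a two-fold difference.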


      3) In addition to RT-PCR data, GLUT1 protein levels should be investigated in the different DRAIC expressing cells.

      Answer: We have incorporated the GLUT1 protein expression data from DRAIC KO cells in Figure 7B and from DRAIC overexpressing cells in Supplementary Figure 4G-H. The blots from the same gels were split into different panels, so the loading control GAPDH remains the same in Figure 4K and Supplementary Figure S4H.


      4) The effect of DRAIC on GLUT1 expression is also measured in condition of glucose saturation, which does not reflect disease state. The decrease of GLUT1 in response to DRAIC overexpression and the increased GLUT1 level in DRAIC deficient cells should be investigated in physiological levels of glucose.

      Answer: Same as above. We are near physiological levels of glucose.


      Minor Comments:

      5) All the data are generated with established cell lines (e.g., U87) but more clinically relevant models, such as patient-derived primary cells like the ones used in Fig. S1, could be used to replicate some of the key findings.

      Answer: As we showed in Fig. S1, DRAIC is not expressed in glioma stem cells, so knockout experiments are not possible. We believe that the knockout experiments are the most relevant to this paper because they do not run the risk of artefacts from overexpression of an RNA far beyond physiological levels.


      6) Also please provide further details about the patient-derived cells from Fig. S1.

      Answer: We have mentioned the details of the cell lines in our modified manuscript.


      7) The statistical analysis section states that the number of measurements is indicated however I don't see the sample size of the experiments.

      Answer: We have now incorporated the number of experiments in our modified text.


      Reviewer #3 (Significance (Required)): The study reports a new model of tumor regulation via a long non-coding RNA. This article adds to the growing literature. The topic and content of the article are relevant and significant to the field of tumor research, but the significance and impact could be enhanced with the use of more physiologically relevant models and conditions, as pointed out in the major comments.

      Answer: We want to thank the reviewer for the positive feedback on our study.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Saha and colleagues present a study demonstrating the tumor suppressive role of DRAIC, a long non-coding RNA transcript, through transmission of the signal from IKK/NF-kB to the AMPK/mTOR pathway via regulation of GLUT1 expression. The inhibition of mTOR by this pathway results in the reduction of protein translation, cellular invasion and activation of autophagy. Several diseases and models as well as multiple genetic and pharmacological manipulations were used to investigate the mechanisms at play. The manuscript is well written and the experiments are well designed. The conclusions are supported by the results. The following major and minor comments should be addressed:

      Major comments:

      1. In addition to reporting the effect of DRAIC overexpression on tumor volume, the authors should present survival studies with one or more models.
      2. Since the authors study metabolic energy sensor pathways, related to glycolysis, it would be important to perform some of the key experiments in physiological level of glucose: e.g., pmTOR, pAMPK, LC3-II expression level in DRAIC overexpressing and deficient cells.
      3. In addition to RT-PCR data, GLUT1 protein levels should be investigated in the different DRAIC expressing cells.
      4. The effect of DRAIC on GLUT1 expression is also measured in condition of glucose saturation, which does not reflect disease state. The decrease of GLUT1 in response to DRAIC overexpression and the increased GLUT1 level in DRAIC deficient cells should be investigated in physiological levels of glucose.

      Minor comments:

      1. All the data are generated with established cell lines (e.g., U87) but more clinically relevant models, such as patient-derived primary cells like the ones used in Fig. S1, could be used to replicate some of the key findings.
      2. Also please provide further details about the patient-derived cells from Fig. S1.
      3. The statistical analysis section states that the number of measurements is indicated however I don't see the sample size of the experiments.

      Significance

      The study reports a new model of tumor regulation via a long non-coding RNA. This article adds to the growing literature. The topic and content of the article are relevant and significant to the field of tumor research, but the significance and impact could be enhanced with the use of more physiologically relevant models and conditions, as pointed out in the major comments.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, the authors describe DRAIC as a lncRNA downregulated in prostate cancer. They postulate that DRAIC expression suppresses invasion, migration and growth. Mechanistically, the authors show that DRAIC activates AMPK by suppressing NFkB target gene TOR and indirectly impacting translation and autophagy. Collectively the observation is interesting and robust. However, I have several technical requests, particularly regarding the mechanistic part of the paper.

      • The authors should rescue KO phenotypes by overexpressing DRAIC to consider potential off-target effects.
      • The blots showing TOR and ULK1 phosphorylation need to be repeated. This is an important part of the paper and I feel that these blots are hard to interpret. p-S6K typically runs a bit higher in gels. There may be a technical problem.
      • GLUT1-related results are interesting but the authors should provide genetic evidence that the effects are mediated by GLUT1. How do we know that glucose uptake is indeed upregulated upon knockout?

      Minor:

      The figures need to be updated. Fonts are all different, there are lots of unaligned graphs, and the quality of the blots is poor.

      Significance

      The observation is interesting, but the mechanism is incompletely understood. This is a nice addition to the literature, even without the mechanism.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Saha and colleagues investigated the functions of the long non-coding RNA (lncRNA) DRAIC in malignant glioma. They find that DRAIC expression decreases cell migration/invasion and tumorsphere/colony formation in vitro, and tumor growth in vivo using established cell lines. Mechanistically, DRAIC is known to inhibit NF-kB signaling and the authors demonstrate that DRAIC activates AMPK leading to repression of mTOR, which decreases protein synthesis and increases autophagy. This is a solid study highlighting a potentially interesting pathway of tumor growth and invasion in brain tumors.

      Major comments:

      • It is unclear whether the presented values (mean +/- SD) in the histograms refer to repeat measurements (in which case n = 1) or independent experiments (n>1). The number of replicate experiments is not stated in the methods or figure legends. This must be included.
      • I don't think the immunoblot for p62 in Fig. 5C shows a convincing increase following DRAIC knockout, so the statement on p.8 should be revised.
      • On p.8/Fig.5 the authors make a case that increased DRAIC levels increase lysosomal degradation of autophagosome core proteins LC3 II / p62 (resulting in decreased protein levels of both), while simultaneously increasing gene expression of LC3B and p62 (causing increased mRNA levels). The data for DRAIC overexpression fit this logic fairly well (even though I think more work is needed to fully support this claim), but I am finding it difficult to reconcile the DRAIC knockout data with this scenario - here, loss of DRAIC results in increased protein levels due to decreased autophagy, but also decreased gene expression. To fully support this argument, rescue experiments would be needed using FoxO3a knockout/overexpression.
      • Similarly, the data supporting increased autophagy following DRAIC overexpression (Fig. 5F/G) are a bit weak and lack controls (is the LC3B-GFP overlapping with endogenous LC3B and autophagosomes? Was the transfection efficiency comparable? Is there fusion with lysosomes?). In the absence of stronger data, the authors should temper their claims that DRAIC increases autophagy.
      • No information is provided on animal numbers used in this study. How many mice were used per cohort? Were male and female mice used? Authors should follow ARRIVE guidelines in reporting animal experiments. The method for calculating tumor volume needs to be specified.
      • Student's t-test is inappropriate for comparisons of more than two groups (i.e. all experiments using DRAIC knockout cells) - for these experiments a Kruskal-Wallis test or ANOVA should be used. Did the authors test for normal distribution of their data? This may affect statistical testing and should be taken into consideration.
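
      The statistical point above can be illustrated with a minimal sketch of a one-way ANOVA across three groups. The group names and values are hypothetical and are not taken from the manuscript; the function only computes the F statistic and its degrees of freedom. Obtaining a p-value requires the F distribution, which a statistics library (for example `scipy.stats.f_oneway`, or `scipy.stats.kruskal` for the non-parametric alternative) would provide.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = len(all_values) - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical measurements from three independent experiments per group
wild_type = [1.0, 2.0, 3.0]
ko_clone_1 = [2.0, 3.0, 4.0]
ko_clone_2 = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([wild_type, ko_clone_1, ko_clone_2])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

      A single test across all groups avoids the inflated false-positive rate that comes from running several pairwise t-tests; pairwise differences would then be examined with a post hoc test.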

      Minor comments:

      • Authors mention that DRAIC expression is undetectable in immortalized astrocytes and GBM cancer stem cells (Fig. S1). What is the source of these cells and how were they cultured?
      • The immunoblot in Fig. 3D could be replaced with a slightly lower exposure to make the difference between WT and DRAIC KO more obvious.
      • Some immunoblots in Fig. 3 (panel E, p-S6K and S6K; panel H, actin) are not of the best quality and an effort should be made to replace them.
      • Why are different loading controls used in Fig. 3 (α-tubulin vs. actin)?
      • Compared to other blot images in the same figure (e.g. Fig. 3E), the bands for p-mTOR and mTOR in Fig. 3F look compressed and should be shown appropriately sized.
      • The layout of Fig. 4 is somewhat confusing. I would suggest organizing this according to DRAIC overexpression in A172 and U373 cells versus DRAIC knockout in LNCaP cells. Each immunoblot should be clearly labelled with the corresponding cell line, and it should be clearly explained why p-FoxO3a was tested in U251 cells, rather than A172/U373 as in the rest of the figure.
      • Labelling of immunoblot in Fig. 5B is confusing and should be improved.
      • Changes in GLUT1 expression (Fig. 7A) should be validated on the protein level.

      Significance

      The authors describe a novel link between the lncRNA DRAIC and AMPK activation through inhibition of NF-kB-mediated regulation of GLUT1. This study extends their previous work on DRAIC inhibition of NF-kB in prostate cancer (Saha et al. Cancer Res 2020). There is one study describing DRAIC effects on growth and invasion in glioma cell lines (Li et al. Eur Rev Med Pharmacol Sci 2020), but the work presented by Saha and colleagues contains stronger experimental data and a more detailed and previously undescribed mechanism.

      The current study presents a mechanistic advance that increases the understanding of tumor growth and protein synthesis in cancer cells. The data presented in the study are not supported by in vivo experiments (other than suppression of tumor growth by DRAIC overexpression), validation in human tissue and/or primary patient-derived human glioblastoma cells, or even substantial rescue experiments. This limits the influence of the work on the field. I'm also not sure how transferable findings from DRAIC knockout in prostate cancer cell lines are to glioma, although the results are mostly complementary to the data from glioma cell lines. This is particularly relevant to the proposed mechanism of GLUT1 regulation by NF-kB, as the bulk of experimental data in Figures 6 and 7 was generated in prostate cancer cell lines and is only poorly validated in glioma cells. The study results will be most relevant for researchers investigating cell signaling pathways and autophagy in cancer.

      Reviewer keywords:

      neurooncology, cancer stem cells, signaling pathways in cancer

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      The authors do not wish to provide a response at this time.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In the study, the authors found that cancer cells, including breast, bladder, lung, neuroglioma, and prostate, display an increase in lipid droplets (LDs) after 6 Gy X-ray irradiation (Fig.1). Cells containing high LD levels showed more radioresistance than cells with low LD levels after X-ray irradiation (Fig.2). Ferritin heavy chain (FTH1), the main intracellular iron storage protein, was found to be upregulated after 6 Gy exposure or in LD-high cells. FTH1 knockdown decreases LD accumulation and increases radiation sensitivity (Fig.3). Overexpression of FTH1 or treatment with DFO (an iron chelator) in shFTH1 cells rescues LD accumulation and cancer radioresistance (Fig.4).

      Major comments:

      1. The conclusion conflicts with a publication (PMC5928893) showing that fatty acid oxidation, not LDs, leads to cancer radioresistance. The authors should therefore rule out this possibility through knockdown of CPT1.
      2. Lipid droplets are dynamic in cells. The sorted cells (10% highest or lowest LD-expressing cells) in Fig.3 may not represent a subpopulation, so the authors should add exogenous lipids or cholesterol to test cancer cell radioresistance.
      3. It is impossible to overexpress FTH1 in shFTH1 cells (the stable shRNA will target all mRNA of FTH1) (Fig.4 and methods section: cell culture and FTH1 Reconstitution).
      4. The relationship between the free cytoplasmic iron and LD accumulation is not so convincing. Add exogenous iron to test LD accumulation.

      Minor comments:

      1. Remove fig.1C which is not related to the conclusion;
      2. Fig.2 label error: LD520 should be LD540;
      3. Fig.3, A: change the loading control HSC70, which is not so stable in the cells; D: add quantification of LD number.
      4. Fig.4, A: change loading control HSC70, and repeat western of MCF7 shFTH1/pcDNA3
      5. Line 111, "...that general ROS levels resulted not altered..." should be "...that general ROS levels were not altered..."
      6. Fig.S4 legend: "Figure S2" should be "Figure S4".

      Significance

      The finding that FTH1 affects lipid droplets is novel (some results in this study have already been published: radiation led to LD accumulation (PMC5928893) and an increase of FTH1 (PMC4688087 and PMID:32937103)).

      The finding could help improve radiation therapy, which may be combined with drugs targeting FTH1 or iron metabolism.

      Researchers working on cancer treatment will be interested in this finding.

      My expertise is cancer lipid metabolism and cancer therapy.

      Referee Cross-commenting

      I completely agree with the comments from the other Reviewer. The authors need to strengthen the correlation between the lipid droplet genes (e.g., DGAT1/2, SOAT1, ACAT1) and iron metabolism (e.g., FTH1), and improve data quality as suggested by the reviewers.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this work, Tirinato and co-authors used different experimental approaches to trace correlations between cancer cell stemness, lipid droplets, iron homeostasis and radiation resistance. Their findings were obtained using cancer cell lines of different origins and diverse techniques, including cytometry, microscopy, clonogenic assays and shRNAi. In general, the manuscript is well written and organized. It is also easy to follow, and the authors manage to convince the reader of the importance of this aspect of cancer metabolism. The finding that lipid droplet content might condition cancer cell survival after radiotherapy is novel and therefore deserves attention in the design of new anticancer therapies or protocols. There are, however, some issues that should be addressed to improve the quality of the manuscript.

      Major comments:

      The main concern is that all the work is based on cancer cell lines. The use of some cell lines derived from clinical samples, or an analysis of clinical data already deposited in public databases, could be useful to support their conclusions. The latter should be easy. Exploring these repositories could help the authors reinforce the correlation between the expression of lipid droplet genes, as well as genes related to iron metabolism, and radiotherapy efficacy in patients.

      When analyzed in detail I have some comments regarding specific sections:

      Lipid droplet detection images (Figs 1, 3 and 4). Why does the nuclear size look so different between the two conditions? FTH1 expression (Fig. 3A vs 3C and 4A): as depicted, it is difficult to get a clear picture of the variations in FTH1 expression across the different cell lines. Why do the authors evaluate the success of shRNAi by PCR if the Western blot works so well? Is there any correlation between the RR and the expression of FTH1 within cell lines? What happens when FTH1 is downregulated in the rest of the cell lines? Rather than restoring FTH1, the authors overexpressed it; what is the result when FTH1 is overexpressed in the different cell lines?

      Minor comments:

      The correlation between Nile red staining and ROS is not clear (Fig. S2). The authors could try plotting the ROS MFI of the different subpopulations (x axis, linear scale) against the Nile red MFI (y axis, log scale). In the introduction, does the abbreviation ER stand for endoplasmic reticulum? A figure with a putative model could help summarize the findings.

      Significance

      As stated above, the idea that lipid droplets and iron metabolism might be determinants of cancer cell survival after radiotherapy is novel and therefore deserves attention in the design of new anticancer therapies or protocols.

      Although I have experience in cancer lipid metabolism I am not an expert in the field of lipid droplets.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Summary:

      In this study, the investigators describe two new Trypanosoma brucei brucei strains which have been subjected to minimal passage in rodents or in culture. In addition to characterizing some of the phenotypic and genetic characteristics of these isolates and lines derived from them, the authors also characterize the changes in gene copy numbers occurring during in vitro passage. While this study illuminates the impact of in vitro culturing for adaptation, which might have been overlooked for years in unicellular eukaryotes, the experimental design and method description make the conclusions less convincing.

      Major comments:

      My main discomfort with this work is that it provides just enough information to be interesting and perhaps important, but not enough to be definitive. Part of this is not the fault of the authors - especially in dealing with various "lab" strains and their genomes - over which they have no control in terms of where and how they were isolated, maintained and investigated. But the authors seem to add to this problem by not determining the clonality of their new strains, failing to provide a full genome sequence (although they seem to have the data they need - including long-read sequencing - to do so), and giving often vague descriptions of the analyses they performed as part of this report. My biggest concern has to do with the population structure of the beginning parasite set and to what extent the cultivation - even though limited - is selecting for variants within an existing non-clonal population rather than showing evidence of evolution of a (mostly?) clonal population. The authors note that they have not cloned so as to "minimize manipulation" - which is sensible. But the provided references (particularly Ref 21) use microsatellite markers to establish wide clonality in populations in individuals. Although the authors have not indicated it here, one would presume that the "evolution" being observed in their lines over time (in animal or in culture) has not altered the microsatellite signature of the lines (thus they are still clonal based upon the standards of Ref 21), but has resulted in genetically and biologically altered lines - i.e. clonal variants. It seems that, without knowing the complexity of the beginning populations (granted - not easy), it is hard to say whether the changes over time are due to selection of a subset of the initial population, genetic changes in an initially clonal population, or a combination of the two.
Indeed, a selection process would explain the unexpected differences found in early sampled (low parasitemic but high stumpy) and late sampled (high parasitemic but low stumpy) lines as well as the erratic growth in culture. The authors seem to go back and forth on this issue but the key problem is that there is not enough information here to understand this clearly, and for that reason, the work, although considerable, adds mostly anecdotal observations but relatively little definitively. The authors attribute gene/chromosome ploidy changes under in vitro culturing as adaptation of parasites due to chromosome replication and/or segregation. But this conclusion can not be reached unless the populations were clonal at the beginning. Related to this point, it is surprising that there is not a comparison between the A and B lines of each isolate to each other (e.g. 65A/65B). If the isolates are clonal to begin with then one would expect little difference between the 2 harvested 1 day apart from different rats.

      Overall, the manuscript is difficult to follow and interpret and very light on details. Figure 2 is very confusing and is not particularly helped by the authors' summary of it. A little more detail on how the experiment was done might help. It should be made clear and explicitly stated that the cultures were split (by how much/to what level - seems like it is 0.5 x 10e6?) once cultures reached what level (there doesn't appear to be any pattern there - so are they split every 24hrs unless they have declined?). Readers should not have to go to a previous publication to find these types of details needed to interpret the results. The text refers to "diluting them" but presumably a portion of the culture was reseeded/cultured - or was there a constant dilution of the culture over time? The authors refer to a "slowed growth" after 1-2 days, but 3 of the 4 cultures also declined again between 6 and 11 days. This is not consistent with the hypothesis that the rebound after 2-3 days was the result of an adaptation to a depleted component (if so, why would an additional "adaptation" be necessary?). 2E seems to have been handled differently from A-D. The growth curves in S2 fig do not exhibit the same drops as evident in the Fig 2 cultures.

      A fundamental finding of this research is the detection of gene copy number variation from the sequencing results of the strains after in vitro passages. However, the description of how the sequencing reads were processed to determine the gene copy number is vague. The description lacks basic information such as the analysis workflow, software packages, parameters, and other important details. These details are key to supporting the conclusions.

      According to the authors, only one representative was considered for multicopy genes. But what are the criteria for calling genes 'multicopy' here? How homologous must these genes be to count as multicopy? The authors also mention that reads may map to homologous genes; how is this issue dealt with during the analysis?

      Overall, the study seems rushed and too preliminary; much of the writing is loose and lacking in details. The authors mention that they have long-read nanopore sequence but don't appear to use it to appropriately address some of the questions (e.g. Line 195 "In the meantime..."). As a result, although the results are intriguing (but not totally surprising - this work uncovers a bit of the dirty little secret most scientists realize underlies the pathogen strains that we often present as stable entities while knowing that they have changed and are changing over time), the study presents more questions than it answers - as the authors seem to acknowledge in their final paragraph.

      Minor comments:

      Line 79-83 - statements require some references.

      The strain "A/B" nomenclature comes into use in the manuscript (line 129 and Fig 1 legend) without defining what the difference between A and B is. Apparently these are the samples from 2 different cultures (Line 187) or from different rats, but it takes a bit of work to figure that out. Fig. 1C could be interpreted to mean that the 2 rats got infected by either the A or B lines, or that the resulting lines were named based on the rat they came from, or that they are just different cultures of the same stabilate.

      Line 141 "firmly confirmed" sounds a bit off - just "confirmed" or "firmly supported" is perhaps better. Line 140/141 - S fig 2 is a bit hard to understand - as are the remaining paragraphs on culture outcomes. As this is expressed as a ratio, could it not be argued that polyA+ enriches for (or ribo-minus "depletes" - using the authors' terminology) short RNAs (or that ribo-minus enriches for longer)? It does seem likely that the relatively rare extra-long RNAs are underrepresented in the poly A+ method, but it is not clear that this shows anything other than that the different techniques have different strengths/weaknesses.

      Line 156 ."... had run out of a nutrient..."

      Line 170: "several additional attempts ..." these were from the mouse blood "stabilates"?

      Line 188 and 189, figure S1 is figure S2, not S1.

      Line 192, comma is in the wrong place.

      Line 225: do you mean 'figure 4E' instead of 'figure 4B'? It would be informative if the authors could provide a statistical comparison of the frequency of copy number alteration between tandemly repeated genes and those that are not in tandem to support their statement in lines 293-294.

      For the figures with chromosomes displayed, are these data points from the copy number of each unique gene? Or are they the average gene copy number per region? The chromosome numbers should be in order: Chr10 and Chr11 should come after Chr9, not after Chr1.

      Significance

      There is a lot of good information here, but it could be more clearly presented and better detailed. Placement in the literature is fine - although comparison to long-read sequenced T. brucei or T. cruzi genomes might deserve mention.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      Most laboratory research using T. b. brucei has made use of two strains, the monomorphic Lister 427 strain and the pleomorphic EATRO1125 strain. These strains have been passaged in vitro for decades in separate laboratories, thus selecting for populations that differ from both the original isolate and between laboratories.

      In this study, Mulindwa et al. describe the isolation of two new T. b. brucei strains from cattle in Uganda, MAK65 and MAK98, which show differences in virulence and propensity for stumpy form differentiation in rodents. To assess the effect of culture adaptation on trypanosomes, the researchers compared the gene copy number of the MAK65 and MAK98 isolates before and after culture adaptation. The study found that isolates cultured for as little as a week already demonstrated a broader gene copy number distribution than those that were not culture adapted. Broader gene copy number distributions, compared to the non-culture-adapted isolates, were also observed for a number of routinely-passaged Lister 427 and EATRO1125 cultures. The researchers observed reproducible increases in copy number for certain genes, such as those encoding histones, HSP70 and PFR proteins. The study postulated that changes in gene copy number observed across the genome increased bias towards rapid proliferation and stress tolerance upon culture adaptation.

      Major comments

      I have some concerns about the method that was used for the copy-number calculations. Firstly, I can imagine that a non-uniform distribution of the reads across the genome for any of the datasets could influence the results. Was this checked? Secondly, in the methods section lines 394/395 it is said that the modal RPKM for each dataset was 'adjusted slightly' to get a symmetrical distribution. Upon checking the modal RPKMs and the adjusted values used for the calculations, the adjusted values appear to have been adjusted to different extents between the different datasets to fit the authors' assumptions. Do the authors believe that this could perhaps account for some of the subtle gene copy number changes observed (as is discussed in lines 287-290)? Next, why were the reads aligned 20 times? I think in general the method needs to be explained far more clearly so that the audience can understand what you did.
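
      To make the concern about the modal RPKM concrete, here is a minimal sketch of one way a relative copy number could be derived from read counts. It is an assumption for illustration only, not the authors' pipeline, which the manuscript does not describe in enough detail to reproduce: each gene's RPKM is divided by the modal RPKM of the dataset, so any adjustment of the mode rescales every copy number in that dataset. The gene names, counts, bin width and the helper functions `rpkm`, `modal_value` and `copy_numbers` are all hypothetical.

```python
from collections import Counter

def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

def modal_value(values, bin_width):
    """Centre of the most populated histogram bin (lowest bin wins ties)."""
    bins = Counter(int(v // bin_width) for v in values)
    top_bin, _ = max(bins.items(), key=lambda kv: (kv[1], -kv[0]))
    return (top_bin + 0.5) * bin_width

def copy_numbers(rpkm_by_gene, mode):
    """Copy number relative to the mode (assumed to mark single-copy genes)."""
    return {gene: value / mode for gene, value in rpkm_by_gene.items()}

# Hypothetical dataset: (read count, gene length in bp) per gene
total_mapped = 1_000_000
genes = {"geneA": (10, 1000), "geneB": (21, 2000),
         "geneC": (5, 500), "geneD": (40, 1000)}
rpkm_by_gene = {g: rpkm(n, length, total_mapped) for g, (n, length) in genes.items()}

mode = modal_value(rpkm_by_gene.values(), bin_width=1.0)
cn = copy_numbers(rpkm_by_gene, mode)
# Shifting the mode by hand changes every estimate in the dataset
cn_adjusted = copy_numbers(rpkm_by_gene, mode * 1.1)
print(round(cn["geneD"], 2), round(cn_adjusted["geneD"], 2))
```

      In this toy case a 10% adjustment of the mode moves the estimate for the amplified gene by the same proportion, which is why the size of the adjustment applied to each dataset matters when small copy-number changes are being interpreted.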

      Minor comments

      Additional experiments:

      Would it be possible to generate a phylogenetic tree comparing these new isolates with the Lister 427, TREU927 and EATRO1125 isolates in circulation to get an idea of how these strains may have evolved? I think this could add to the strength of the manuscript.

      Title: Since no other unicellular eukaryotes are mentioned throughout the text, I think a more appropriate title for this study might be 'Adaptation of Trypanosoma brucei brucei bloodstream forms to in vitro culture results in gene copy-number changes.'

      Line 59: There have been more recent studies than the referenced paper (Cross et al., 2014) that quote far higher numbers of alternative VSG genes or pseudogenes in the Lister 427 strain. In Müller et al (2018, doi: 10.1038/s41586-018-0619-8) over 2500 VSGs are quoted and in Cosentino et al (2021, doi: 10.1101/2021.04.13.439624) 2872 VSGs are identified.

      Line 109/110: The dates of isolation written here, Feb 1st and July 30th 2016, are different to what is written under the Date of Isolation column in Supplementary text 1, table 1- 25/5/2016 and 30/7/2017. Do these two dates represent different things?

      Line 233: Should Figure 5 A, B, C and E (rather than A, B, D and E) be referenced here to show all the initial, not cultured results?

      Line 270/271: PIP39 does not promote differentiation to stumpy forms, but instead contributes towards the efficient differentiation of stumpies to procyclic forms (Szoor et al, 2010, doi: 10.1101/gad.570310).

      Discussion:

      It should be discussed in the text that changes in gene copy number have also been observed upon Leishmania culture adaptation (see, for e.g., Gerald Späth's work). This will strengthen the authors' conclusions. Furthermore, similar observations of aneuploidy and triploidy in some T. brucei Lister 427 strains have recently been reported in Cosentino et al (2021, doi: 10.1101/2021.04.13.439624). This could also be referenced somewhere in the text.

      Line 364: I think the Nijuru et al (2005, doi: 10.1007/s00436-004-1267-5) paper would be the correct reference here since it describes the use of the ITS-1 PCR. Reference 13 is actually referring to another paper, I cannot find Nijuru et al in the references list.

      Line 379: PAD staining should be referenced to be more informative and allow reproducibility.

      Figure 1, Line 580: How many cells were counted for determining the % PAD positivity?

      Figure 1, Line 581: Scale bar should be included in D.

      Supplement Text 1:

      I found Supplement Text 1 a little confusing for two reasons. Firstly, I was never sure if the tables or references being referenced were from the main text or from the supplement text 1, perhaps this could be made a little clearer to aid the reader. Secondly, the names of the isolates switched between, for e.g., MAK 65 and Tb065. For simplicity it could help to try and stick to one naming system.

      It might be worth adding a sentence about why Tb236B was not followed up. Was this because it could not easily be distinguished from MAK65 by microsatellite analysis?

      S3 Figure: Legend describes red bars but there are none in the figure.

      Table 2: In the legend for Sheet 2, the cut-off for the increase is missing.

      Significance

      Though the T. b. brucei Lister 427 and EATRO1125 strains are used most commonly for laboratory-based research, they have been extensively passaged in vitro without characterisation of the changes that have occurred between them and the original isolate, or indeed, between laboratories.

      Mulindwa et al. demonstrated that changes to gene copy number occur rapidly upon adaptation to culture of field isolates, and that different cultures of the same isolate can furthermore have different ploidies. This is an important advance and raises awareness a) that trypanosomes undergo changes upon laboratory adaptation, b) of the nature of some of these changes and c) that the changes can occur rapidly. Changes to gene copy number have also been shown to affect Leishmania donovani upon culture adaptation (Prieto Barja et al, 2017, doi: 10.1038/s41559-017-0361-x).

      This study will be of interest to the trypanosome community in general, but particularly those who work on biology that we already know is impacted by prolonged in vitro passage: differentiation, virulence and antigenic variation.

      Reviewer expertise: BSF differentiation, antigenic variation

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Trypanosoma brucei brucei is a protozoan parasite studied both because it is the cause of Human African Trypanosomiasis, and as a unicellular eukaryotic model organism. However, only a few isolates are lab-adapted and most literature focuses on derivatives of Lister 427 and EATRO1125, which have both been cultured under in vitro conditions for several decades. In this manuscript, Mulindwa and colleagues describe and characterise two novel T. b. brucei strains (MAK65 and MAK98) isolated from the field with minimal passage through rodents. These new strains are clearly more closely related to trypanosomes in the field, compared to the aforementioned in vitro models. The authors investigate infection dynamics in murine models, in particular to assess differentiation capabilities of the parasites. Furthermore, optimisation of in vitro adaptation is also attempted successfully. Finally, the authors show, through genome sequencing, that adaptation of both strains to laboratory-based culture affects gene ploidy, and several examples of interest are discussed. A key conclusion is that cultured lines from the same original isolate can rapidly (in a matter of weeks) develop karyotype differences, which could carry implications for the study, as well as genetic manipulation of these organisms. The availability of novel laboratory strains for the study of T. b. brucei is a significant outcome for the trypanosome research community.

      Major comments:

      The key conclusions, in particular that adaptation to in vitro culture results in reproducible genomic alterations are convincing. In my opinion, this is a descriptive study, in that the underlying causes (and consequences) of the findings are not investigated further.

      The authors state several hypotheses based on their results in the 'Outlook' section. The majority of these hypotheses are valid points to make, but I feel the following should be toned down: Lines 321-323 - These hypotheses are speculative, based on available data, and should be revised. There are likely many factors that will impact parasite growth in initial stages of culture adaptation, but most commonly there is a selection bottleneck effect that will impact initial growth.

      Line 344 - With the data presented, it is difficult to say whether tissue distribution is different between the two trypanosome strains. I would suggest removing this claim, unless work was carried out to investigate differences in tissue distribution.

      Changes in gene copy number are discussed, but one interesting aspect is whether this translates to changes at the mRNA or protein level (e.g. are there increases in HSP70 mRNA or protein levels after in vitro adaptation?). Whilst not essential, carrying out qPCRs or Western blots to confirm this would enhance the impact of this study. If these experiments are not carried out, this point should be added to the discussion.

      It would be very interesting to further analyse the genomics data to assess whether there are further changes to metabolic gene copy number, perhaps as a result of in vitro adaptation. These changes could be linked to the high levels of nutrients available in HMI-9 medium. Currently, only a pteridine transporter is mentioned, although the tables list several genes such as glycerol kinase and an arginine transporter. Are there changes in glucose transporter copy number after culturing these strains? Copy number of this array can often vary in field isolates.

      The arginine transporter, AAT5 was previously shown to be an essential transporter (PMID: 28045943). Loss of copy numbers could reflect reduced need in a nutrient-rich environment such as in vitro culture medium. I feel this may be worth mentioning, but not investigating further.

      More detail is required in the Materials and Methods section, in particular the "sequence analysis" section: Which Illumina kits were used? What tool/software was used for genome alignment? Which reference genome was used (if TREU 927, what version)? What parameters were used for tryprnaseq and DeSeqU1 (if default settings were used, please mention this)? What tool/software was used for gene copy analysis (i.e. the alignment that was restricted to 20 alignments)? How was modal RPKM value adjusted to obtain a symmetrical distribution? These details are essential for this work to be reproducible. In addition, they would help establish a pipeline that can be used to directly compare other novel field isolates.

      In the RNA preparation section (line 372), please indicate replicate number for each sample group used for transcriptomics analysis, as well as parasitaemia or cell number used for extractions. In addition, if RNA extraction was carried out according to the Trifast manual, please state this.

      The protocol for staining with PAD1 requires more detail (line 380). If this protocol is available in a previous study, it is sufficient to reference this.

      The study mentions attempts to adapt cells using methylcellulose (line 179). It would be helpful for the reader to know what percentage (w/v) was attempted. In addition, did the authors consider using serum supplements from different host sources (e.g. goat or adult bovine?). If only FBS supplementation was attempted, it would be worthwhile mentioning this.

      This reviewer is not an expert on statistical analysis of genomics data, but it may be of interest to carry out statistical analysis of the differences in copy number between non-culture-selected and culture-selected parasites (figure 6) to determine whether these differences are significant.

      In my opinion, experiments are adequately replicated.

      Minor comments:

      Line 136 - What is the rationale of calculating Log2 fold changes of 65A/98A to compare to log2 fold change of ST/SL? Given the authors are in possession of normalised read count data for all 4 sample groups, it would also be interesting to show correlation between the individual datasets (e.g. 65A vs 98A, 65A vs ST, 65A vs SL, etc). Perhaps this would be more informative, and give an idea whether, for example, 65A is more similar to previously published datasets derived from stumpy or slender parasites.
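      The two comparisons raised in this comment (a log2 fold change between two groups, and pairwise correlation between all four groups) can be sketched as follows. This is only an illustration: the read counts are invented, the pseudocount of 1 is an assumption, and the helper names `log2_fold_change` and `pearson` are hypothetical rather than taken from the manuscript's pipeline.

```python
import math
from itertools import combinations

# Hypothetical normalised read counts per gene for the four sample groups.
# These values are invented for illustration; they are not from the study.
counts = {
    "65A": [120.0, 45.0, 300.0, 80.0, 15.0],
    "98A": [100.0, 60.0, 280.0, 40.0, 30.0],
    "ST": [110.0, 50.0, 310.0, 70.0, 20.0],
    "SL": [90.0, 65.0, 250.0, 35.0, 40.0],
}


def log2_fold_change(numerator, denominator, pseudocount=1.0):
    """Per-gene log2 ratio; the pseudocount (an assumption) avoids log of zero."""
    return [
        math.log2((n + pseudocount) / (d + pseudocount))
        for n, d in zip(numerator, denominator)
    ]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Ratio-of-ratios comparison: 65A/98A against ST/SL.
ratio_corr = pearson(
    log2_fold_change(counts["65A"], counts["98A"]),
    log2_fold_change(counts["ST"], counts["SL"]),
)

# Direct pairwise comparison of the individual datasets on a log2 scale.
log_counts = {k: [math.log2(v + 1.0) for v in vals] for k, vals in counts.items()}
pairwise = {
    (a, b): pearson(log_counts[a], log_counts[b])
    for a, b in combinations(log_counts, 2)
}

for pair, r in pairwise.items():
    print(pair, round(r, 3))
```

      The pairwise table answers the question of which published dataset a given sample most resembles directly, whereas the ratio-of-ratios correlation only shows whether the two contrasts move in the same direction.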

      Prior studies are referenced appropriately, with the following exception: Line 70 - note T. b. gambiense as an example of African trypanosome that does not undergo sexual recombination (reference PMID: 26809473)

      To make the text and legends clearer, it would be beneficial to add commas to numbers >1,000. In addition, text would be more legible if spaces are added between numbers and units (e.g. 480 bp, instead of 480bp), with the exception of temperature and percentages. There are also inconsistencies in spacing around the multiplication symbol when stating cell densities (line 170: 1.5 x106/ml; line 176: 5x 105/ml) - please add spacing on both sides.

      In figures 1A and 1B, y-axis numbering is not consistent (i.e. 1A is linear, 1B is logarithmic). I would recommend maintaining consistency between these two experiments.

      In figure 1C, the 'A' and 'B' variants of the MAK strains are not defined in the legend, nor main text. It would be helpful for the reader to know what is meant by '98A' vs '98B'.

      In figure 1E, the statistical test used to generate R and R2 is not indicated.

      In figure 2, please make y-axis scale consistent (e.g. 2A consistent with 2B, 2C consistent with 2D) as this makes it much easier for the reader to compare and contrast the data.

      In all figures showing gene copy number overviews (4, 5 and S3), the chromosomes are not in order (i.e. chromosomes 10 and 11 should come after 9). I feel these figures would be improved by keeping chromosomes in increasing number, left to right.

      In figures 5A and 5B, does the 'c' refer to culture or clone? This should be clarified. In addition, it is difficult to tell which colour represents which culture/clone in panel A in particular.

      Line 308 - Could the authors explain why the translation factor eIF3c result (significant decrease in copy number) is an "odd" one?

      In the legends for Tables 1 and 2 it is worth mentioning that the "Tb927" prefix has been removed for legibility.

      Several genes/proteins are discussed in the study (e.g. HSP70, histones). However, the implications of these changes, and their potential significance, are not discussed. A paragraph on what these changes could mean for the biology of the parasite would be of interest.

      As mentioned above, lack of PAD1 staining in MAK98 does not necessarily mean differences in tissue distribution compared to MAK65 and this sentence should be revised.

      Can the authors comment on expected differences in drug sensitivity in these strains compared to Lister 427 and EATRO1125? For example, changes in pteridine transporter copy number, if reflected at the protein level, could impact anti-folate uptake.

      A previous study investigating transcriptomics of culture-derived vs. in vivo derived T. brucei concluded that in vitro-cultured cells can be used as alternatives to animal-derived parasites (PMID: 29606092). How do the results presented here impact the findings of this previous study?

      Significance

      The availability of new strains to study African trypanosomes is significant. Furthermore, low passage number means that these strains more accurately reflect naturally occurring strains in the field, compared to the commonly used EATRO1125 and Lister 427, both of which have been adapted for laboratory use for decades.

      The finding that adaptation to laboratory culture has a significant effect on gene copy number is also of importance, and suggests variation in karyotype between strains in different labs is common, even if they are derived from the same original isolate. The logical next step is to find out how these karyotype differences impact cell biology.

      Few trypanosome strains are routinely used in a laboratory setting, in particular the monomorphic Lister 427 and the pleomorphic EATRO1125 (AnTat1.1). The adaptation of two novel strains with minimal passage and differing virulence is a significant outcome. In the past, adaptation involved growing parasites on feeder layer cells but addition of cysteine removes this requirement (PMID: 4045385). Comparison of culture- and animal-derived T. brucei has been carried out before (for example, PMID 29606092), and in addition, genomic analysis has previously suggested aneuploidy does not occur (PMID: 30256189). Therefore, the finding that adaptation to in vitro culture can result in changes in gene copy number is novel, and significant. It is unknown whether these changes are reflected at the mRNA or protein level and this is an important next step.

      The main audience for these findings is the trypanosome research community, especially groups working on the genomics of African trypanosomes. In addition, the findings are of significance (as mentioned in the manuscript) to researchers designing genetic manipulation experiments with culture-derived trypanosomes.

      The background of this reviewer is in biochemical parasitology, with work focusing on livestock trypanosomes (Trypanosoma congolense, Trypanosoma vivax). Key words: Metabolism, Metabolomics, Transcriptomics, Drug mode of action, Drug resistance.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors present an analysis of gene copy number in Trypanosoma brucei lines based on resequencing. The analysis includes 2 strains isolated very recently, which are sequenced straight from passage in animals and following culture in flasks. They then compare gene copy number in these strains to T. brucei lines commonly used in the lab and reference DNA probably isolated after multiple passages through animals, concluding that there are some genes that consistently change copy number on adaptation to culture. There is some overlap with the authors' previous work [PMID: 26715446] comparing isolates of T. b. rhodesiense (data also included here), but here they have the addition of lines pre and post adaptation to culture, and focus more on gene copy number over transcriptomic changes.

      The question is a really interesting one: what are labs selecting for when they adapt these parasites to culture? The authors generate some useful data on this subject and suggest some of the additional analysis that could be done with these data in future in the Outlook section. However, I have 2 major comments about the analysis presented that I believe affect whether it can be used to support the conclusions made in this manuscript.

      Major comments:

      1) In the analysis, the authors discuss variation in measured copy number as if representing genuine changes in gene number. However, variation can also result from differences in sequencing depth or be introduced during DNA isolation, processing, sequencing and/or mapping. In my opinion, this variation is not adequately dealt with in the analysis presented and as a result at least some of the subsequent interpretations are likely to be incorrect. While array copy numbers in particular would be expected to become more heterogeneous as a population ages, this is unlikely to affect the majority of diploid genes and should not greatly influence the distribution around copy number of 2. Moreover, where average diploid genes do change, they are expected to do so across whole chromosomes or large sections of chromosomes, which does not appear to be the pattern seen in S3 Fig. In addition, heterogeneity in the population of chromosomes present in the culture resulting from changes in replication/segregation [as suggested in the text] would affect the spread of copy numbers between chromosomes, but not the distribution on each chromosome (as the genes are still linked even when the average chromosome copy number is non-integer). For example, for the distribution broadening seen in MAK98_cA (which appears to be broadening of the distribution at all positions on all chromosomes) to be the result of a biological process would require widespread and extreme fragmentation of the chromosomes. I would suggest this is very unlikely against the alternative of variation introduced during DNA processing, sequencing and/or mapping - and indeed, this is likely the cause of the majority of the observed difference in the overall distribution around 2 copies.

      This source of variation in copy number estimates needs to be accounted for in testing for differences as the certainty for each estimate will vary with sample preparation and also gene length. In addition, the distributions for Lister427 1313 and MAK98_cA probably suggest that these should be removed from the analysis.

      2) The lines being analysed are not phylogenetically independent, meaning that similarity can reflect shared ancestry rather than selection. The 3 lines derived from Lister 427 are obviously expected to be very closely related, and the authors have previously shown that the 4 T. b. rhodesiense isolates are highly similar (suggesting recent common ancestor) [PMID: 26715446]. In addition, although EATRO1125 was derived independently, this too might be more closely related to 427 than to the new lines. As such, copy number in each cannot be taken as an independent measurement. I'm afraid I think this makes the statistical approach taken to identify 'reproducible' gene copy number changes invalid. For example, the monooxygenase Tb927.9.1400 in Fig. 6E is present as 2 copies in T. b. rhodesiense isolates and 3 copies in EATRO1125/Lister427 lines, but this probably only represents a single difference between 2 clades, not 8 independent measurements. On adaptation to culture this gene is unchanged in MAK98 (excluding cA) and goes from 2→3 in MAK65. This last observation is interesting (especially if the 4 MAK65 lines are completely independent, which wasn't clear from the text), but whether such change is statistically significant across the whole genome needs testing.

      Notwithstanding the above, unique to this study are the copy number estimates for the same populations pre and post adaptation. I think if the authors could apply a method that tests for directional changes in these while accounting for effects of gene size and the heterogeneity due to sequencing discussed above, then an analysis of which of these were observed in both MAK65 and MAK98 - and which also shared by Lister427/EATRO1125 - would be very interesting and potentially very informative, but I don't think the current analysis is sufficient to support the claims made. The authors may want to consult a statistician, but I suspect biological replicates of the sequencing samples will be required.

      Minor comments:

      The concept of "gene ploidy" is rather unhelpful and I would suggest that the authors make a more strict distinction between changes in ploidy (caused by duplication/loss of chromosomes) and changes in gene copy number (predominantly due to array expansion/contraction). For example, a gene with copy number of 4 can exist as 2 copies on each of 2 homologous diploid chromosomes - and an expansion of one array that increases the copy number to 5 is not a change in ploidy.

      The purpose of Fig 1A/B is to compare the infections between 2 T.b. isolates (MAK65/98), but the parasitaemias are displayed on different scales and with different transforms making such comparison extremely difficult. Same for Fig 2A-D and F-G - text discusses growth rate, but can't be judged on linear scale.

      Very difficult to see PAD1 staining when overlaid with transmitted light (Fig1D) - could this be separated?

      Discussion is made comparing the proportion of cells with PAD1 staining, but it is unclear how many cells and replicates this is based on - percentages should be given with estimate of SEM, confidence intervals or similar.

      I found naming of the cultures derived from MAK65 and MAK98 confusing and counter-intuitive. I believe the following to be correct:

      MAK65 cA/cB are independent cultures with a total of 30 days in culture and a doubling time of ~7h (after ~day 20). MAK65 cC/cD have only 17 days in culture but achieved a similar doubling time to cA/B after ~day 8. MAK98 cA has 33 days in culture, but grew slowly throughout (doubling time ~13h). MAK98 cB has 24 days in culture including a freezing step, but doubling time dropped to 10h after ~day 9. This really needs to be clearer in the manuscript and it would be extremely useful to have some indication of the number of generations each line has been through in culture. To me, it was confusing that MAK65 cA/cB have been longer in culture than cC/cD.
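      As a rough illustration of the requested generation counts, the number of doublings at a constant doubling time is simply the hours in culture divided by the doubling time. The sketch below applies this only to MAK98 cA, the one line described above as growing at a roughly constant rate throughout; the function name `generations` is hypothetical, and for the other lines the doubling time changed during adaptation, so a single-rate estimate would not be appropriate.

```python
def generations(days_in_culture, doubling_time_h):
    """Approximate number of population doublings, assuming a constant doubling time."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return days_in_culture * 24.0 / doubling_time_h


# Figures as summarised in this review: 33 days in culture, doubling time ~13 h.
mak98_ca = generations(33, 13)
print(f"MAK98 cA: ~{mak98_ca:.0f} generations")  # prints "MAK98 cA: ~61 generations"
```

      For lines with a slow phase followed by a fast phase, the same function could be applied to each phase separately and the results summed, provided the manuscript reports the doubling time for each phase.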

      "Our observations on copy number are the tip of the iceberg: a survey of just a single ~10 kb region revealed selection for smaller insertions, deletions and point mutations (S4 Fig).". S4 Fig shows a region around a repeated HSP70 gene, with a large number of SNP/INDELs versus reference. As would be expected, some of these are haplotype-specific, but changes in the ratio of the sequencing reads between the haplotypes should not be presented as evidence for selection without a statistical test for enrichment.
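      One simple form such an enrichment test could take is a two-sided Fisher's exact test on the haplotype read counts before and after culture. The sketch below is an assumption about how this might be done, not the authors' method: the read counts are invented, the function name `fisher_exact_two_sided` is hypothetical, and treating reads as independent observations ignores overdispersion between libraries, so biological replicates would still be needed.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables that are at least as unlikely as the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical read counts at one heterozygous site (not from the study):
# rows are pre- and post-culture samples, columns are haplotype A and B reads.
pre_a, pre_b = 52, 48
post_a, post_b = 75, 25
print(fisher_exact_two_sided(pre_a, pre_b, post_a, post_b))
```

      Because many sites would be tested across the region, the resulting p-values would also need correction for multiple testing.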

      The Supplemental data are an important resource here. They are too large to check extensively, but in trying to reproduce the copy number estimates for the unique genes, I found the read counts for MAK98 and EATRO1125 in Sheet 5 to be identical except for some NA values in EATRO1125. This is presumably an error. I was also surprised that no genes have 0 count when DNA was mapped to 927 reference - except for DNA from 927 itself which is the one sample I'd expect not to have this behaviour. Is there an explanation for this?

      Significance

      Combined with evidence, reproducibility and clarity.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to Reviewers

      We thank the reviewers for their valuable and pertinent suggestions that helped us immensely in revising our manuscript. Listed below are the responses to the reviewers’ comments, where the reviewers’ comments are highlighted in bold while the responses are in normal font.

      Reviewer #1 (Evidence, reproducibility and clarity (Required))

      As HIV-1 bnAbs show extensive somatic mutations and have several unique properties such as long CDR and ability to recognize glycopeptides, there is a possibility that some of them can cross-react with SARS-CoV-2 and a fraction of these could even neutralize the virus. To explore if this is true, Mishra et al. from the Luthra Lab investigated the cross-reactivity of existing HIV-1 neutralizing antibodies with purified and stabilized SARS-CoV-2 trimeric ectodomain S2P and receptor binding domain. Indeed, they find that antibodies that recognize the CD4 binding site and MPER region of HIV-envelope show fairly strong reactivity to SARS-CoV-2 RBD and S2P like the CoV-1 cross reactive antibody CR3022. They further go on to test neutralization using pseudoviruses decorated with SARS-CoV-2 spike and find that one of the several binding antibodies, N6, can indeed neutralize, though not potently, as expected. They also assessed the cross-reactivity and neutralization of plasma from children with chronic HIV-1 infection to SARS-CoV-2 pseudoviruses. Some plasma samples indeed showed neutralization of SARS-CoV-2 pseudoviruses. Based on these observations, the authors propose cross-reactive epitopes such as the N6 epitope can be used as a blueprint to engineer variant super-antibodies that can be effective against several viral pathogens including SARS-CoV-2. Overall, the study is well conducted and detailed with clarity.

      Q1. The authors should include an authentic SARS-CoV-2 neutralizing antibody in the study as a positive control. There are several RBD as well as NTD binding nAbs that have been identified. This will allow one to assess the extent of cross-reactivity and neutralization exhibited by HIV-1 bnAbs and gauge the extent of practical utility. CR3022, while important, is not enough.

      R1. We have now included an anti-RBD nAb, CC12.1, as a positive control for both binding and neutralization assays (line numbers 73 – 77).

      Q2. Some plasma of HIV-1 infected children showed neutralizing activity against SARS-CoV-2. However, the neutralization seen may not be solely due to HIV-1 bnAbs in plasma. Antibodies to other human coronaviruses likely contribute to this neutralization. The authors should also test reactivity/neutralization to 229E, NL63, OC43, HKU1, MERS spike/pseudoviruses to support their claim. In its absence, the authors should tone down their interpretation and provide an alternate explanation.

      R2. We have now performed neutralization assays with the plasma from all the HIV-1 infected children against all seven coronaviruses (SARS-CoV-1, SARS-CoV-2, HKU1, NL63, MERS, OC43, and 229E) (line numbers 150 – 160 and 551 – 564).

      Q3. The authors start out with the following premise "Given the unique nature of HIV-1 bnAbs and their ability to recognize and/or accommodate viral glycans, we reasoned that the glycan shield of SARS-CoV-2 spike protein can be targeted by HIV-1 specific bnAbs". While HIV-1 envelope is heavily glycosylated, it is interesting to note that the cross-reactive HIV-1 bnAbs target RBD of SARS-CoV-2 spike protein which is not heavily glycosylated (2 sites). In the absence of glycosylation-probing experiments, it is difficult to justify this premise. Rather the extensive SHM and self-reactivity (MPER-directed antibodies) could be the likely reason for cross-reactivity.

      R3. We have discussed the polyreactivity and/or autoreactivity of HIV-1 bnAbs as one of the plausible reasons for the observed cross-reactivity with SARS-CoV-2 (line numbers 166 – 180).

      Minor

      Q4. The color profile in figure 3 needs to be revised or symbols added to allow the curves to be distinguished from each other.

      R4. We have changed the color profile of figure 3 and have added symbols as well (line number 527). Figure 3 from the original manuscript is now figure 4 in the revised manuscript.

      Reviewer #2 (Significance (Required))

      This is a significant study that highlights the concept of developing broadly cross-reactive antibodies that may be effective against multiple pathogens, a concept not usually ascribed to antibodies that are known for their specificity unlike drugs that can be repurposed. In-depth study of epitopes that can elicit multi-pathogen cross-reactive/neutralizing antibodies can on the other hand allow generation of antigenic templates that can be effective as vaccines. This study is suitable for broader audiences to disseminate above-mentioned concepts.

      **Referee Cross-commenting**

      Q1. There is a lot of evidence now that lenti-pseudoviruses are a good surrogate for identifying SARS-CoV-2 neutralizing antibodies. That said the authors may want to ensure the data is reproducible even with authentic SARS-CoV-2 viruses as reviewer 2 suggests to avoid surprises.

      R1. We have now tested the ability of N6 to neutralize authentic SARS-CoV-2 virus using a cytopathic effect (CPE) based neutralization assay. N6 failed to show any reduction in CPE up to a concentration of 20 µg/ml (line numbers 135 – 141).

      We have now discussed N6’s inability to neutralize authentic virus in discussion section (line numbers 181 – 187).

      Q2. I also agree with the Reviewer 2 that structural analyses of the cross-neutralizing antibody with SARS-CoV-2 spike will strengthen the manuscript.

      R2. While we do agree with both reviewers’ suggestion that a structural analysis of N6 binding with SARS-CoV-2 will provide details into the exact nature of epitopes-paratopes that are responsible for the cross-reactivity observed, we believe it to be out of scope for the current work. The current work was designed to show the ability of polyreactive and somatically hypermutated antibodies generated in chronic HIV-1 infection to bind to new and emerging pathogens such as SARS-CoV-2. While only N6 showed neutralization of SARS-CoV-2, it is also noteworthy that several HIV-1 bnAbs showed significant binding to both purified RBD as well as surface expressed spike from SARS-CoV-2. As all these distinct bnAbs have different paratopes, a detailed structural analysis to understand the exact nature by which several distinct bnAbs cross-reacted with SARS-CoV-2 is not feasible. We are optimistic that our results will provide the impetus for future detailed structural studies (line numbers 194 – 198).

      Q3. Also, the authors may want to avoid 293T-ACE2 cells as their targets for carrying out neutralization assays. There is data now from Davide Corti's group that indicate ACE2 overexpression can significantly underestimate the neutralizing capacity of antibodies especially those that bind to RBDs (bioRxiv 2021.04.03.438258; doi: https://doi.org/10.1101/2021.04.03.438258). This may be particularly true for N6. A cell line like Vero E6 that endogenously expresses ACE2 would be the way to go. If the authors decide to use Lenti pseudoviruses as the first step in Vero cells, please make sure to use G89V mutated HIV gag-pol to avoid TRIM5alpha restriction.

      R3. Based on the reviewer’s suggestion, we have now performed neutralization assays for all bnAbs using both HEK293T/ACE2 cells and Vero-E6 cells. We have used an HIV-1 proviral backbone that contains the G89V mutation in the gag-pol region (line numbers 96 – 121).

      Reviewer #3 (Evidence, reproducibility and clarity (Required))

      The authors report that HIV-1 sp-bnMAb cross-neutralized SARS-CoV-2 pseudotyped virus in vitro. The finding that N6 can block HIV-based-SARS-CoV-2 pVs from entering 293T-ACE2 cells is intriguing and requires further investigation before the manuscript can be considered for publication.

      Major comments

      Q1. The authors must demonstrate that N6 is capable of neutralizing authentic SARS-CoV-2.

      R1. We have now tested the ability of N6 to neutralize authentic SARS-CoV-2 virus using a cytopathic effect (CPE) based neutralization assay. N6 failed to show any reduction in CPE up to a concentration of 20 µg/ml (line numbers 135 – 141).

      We have now discussed N6’s inability to neutralize authentic virus in discussion section (line numbers 181 – 187).

      Q2. The authors should also demonstrate it works in another system i.e., recombinant VSV/SARS-CoV-2 (used by Naveenchandra Suryadevara et al., 2021 "Neutralizing and protective human monoclonal antibodies recognizing the N-terminal domain of the SARS-CoV-2 spike protein" https://doi.org/10.1016/j.cell.2021.03.029) which is a well validated system for SARS-CoV-2 neutralization activity.

      R2. We have additionally used an MLV-based pseudovirus system to further increase the robustness of neutralization assays. In addition, we have now used three distinct cell lines (HEK293T/ACE2, Vero-E6 and Huh7 cells) to exclude any bias in neutralization assays due to endogenous overexpression of ACE2 on HEK293T cells (line numbers 96 – 121).

      Q3. Please, reference the HIV-1 based SARS pVs neutralization assay used in this work if it has been used earlier, either by your team or others. Details regarding the performance of this particular assay for SARS-CoV-2 Nab detection will aid the reader to understand its robustness and weaknesses.

      R3. We have provided several references for both the HIV-1 lentiviral and MLV retroviral pseudovirus neutralization assays. Furthermore, we have added detailed results for optimization of neutralization assays that we performed (line numbers 96 – 121 and figure 3). In addition, we have discussed the limitations and weaknesses of pseudovirus neutralization assays in comparison to authentic virus neutralization assays (line numbers 183 – 187).

      Q4. In the method section, a list of all controls used to guarantee neutralization assay performance must be included (i.e., positive and negative mAb controls, virus controls, etc.); additionally, it might be necessary to show raw data of all control results.

      R4. We have provided the details of all controls that were used to guarantee the robustness of the neutralization assay in the methodology section on ‘neutralization assay’. Furthermore, we have now added an entire section in the results titled ‘Optimized conditions for neutralization of pseudotyped coronaviruses’, where we now provide all the technical details that were optimized to ensure the robustness of the neutralization assay (line numbers 96 – 121).

      Q5. Control descriptions are missing in the method section for all or most of the assays (some descriptions are found in the result section, but definitely they should be found in M&M, for each technique).

      R5. We have now provided the description for all the controls in methodology section too.

      Q6. The authors draw some similarities between HIV-1 Env and SARS-CoV-2 Spike proteins from structure and glycosylation point of view, but the fact that N6 binds to ENV-CD4bs is beyond these pictures and for that reason a detailed structural analysis MUST be performed to show the mechanism of cross-neutralization and corroborate it exists.

      R6. While we do agree with both reviewers’ suggestion that a structural analysis of N6 binding with SARS-CoV-2 will provide details into the exact nature of epitopes-paratopes that are responsible for the cross-reactivity observed, we believe it to be out of scope for the current work. The current work was designed to show the ability of polyreactive and somatically hypermutated antibodies generated in chronic HIV-1 infection to bind to new and emerging pathogens such as SARS-CoV-2. While only N6 showed neutralization of SARS-CoV-2, it is also noteworthy that several HIV-1 bnAbs showed significant binding to both purified RBD as well as surface expressed spike from SARS-CoV-2. As all these distinct bnAbs have different paratopes, a detailed structural analysis to understand the exact nature by which several distinct bnAbs cross-reacted with SARS-CoV-2 is not feasible. We hope that our results will provide the impetus for future detailed structural studies (line numbers 194 – 198).

      Q7. As the authors may realize, based on previous comments, any false positive results that could have biased the main conclusion of this article must be excluded in order to claim such an intriguing finding.

      R7. Wherever applicable, we have tried to use proper positive and negative controls and have tried to tone down the inferences.

      Q8. Regarding BINDING AND NEUTRALIZATION OF SARS-COV-2 WITH PLASMA FROM HIV-1 INFECTED PAEDIATRIC PATIENTS, how can you discard previous exposure to actual SARS-CoV-2 in such population, and/or previous infection with other common hu-CoVs, which has been shown to elicit cross reactive SARS-CoV-2 Abs?

      R8. All the pediatric plasma samples used in the binding and neutralization assays were pre-pandemic (2013-2014). While we do agree that exposure to common endemic coronaviruses might have elicited some degree of cross-reactive SARS-CoV-2 antibodies, we have now performed neutralization assay for all seven coronaviruses (including common endemic coronaviruses HKU1, OC43 and 229E) using the plasma that showed SARS-CoV-2 neutralization (line numbers 150 – 160 and figure 6).

      Minor comments can be addressed once these main concerns are solved.

      Reviewer #3 (Significance (Required)):

      Q9. The fact that N6 (and plasma from HIV-1 infected patients) can neutralize SARS-CoV-2 is novel and intriguing; however, these observations must be supported with new experiments. Showing that N6 neutralizes authentic SARS-CoV-2 would be a key point to address. Moreover, the mechanism for such interaction should be explored using a detailed structural analysis.

      R9. Please see response to query 6.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors report that HIV-1 sp-bnMAb cross-neutralized SARS-CoV-2 pseudotyped virus in vitro. The finding that N6 can block HIV-based-SARS-CoV-2 pVs from entering 293T-ACE2 cells is intriguing and requires further investigation before the manuscript can be considered for publication.

      Major comments:

      The authors must demonstrate that N6 is capable of neutralizing authentic SARS-CoV-2.

      The authors should also demonstrate it works in another system i.e., recombinant VSV/SARS-CoV-2 (used by Naveenchandra Suryadevara et al., 2021 "Neutralizing and protective human monoclonal antibodies recognizing the N-terminal domain of the SARS-CoV-2 spike protein" https://doi.org/10.1016/j.cell.2021.03.029) which is a well validated system for SARS-CoV-2 neutralization activity.

      Please, reference the HIV-1 based SARS pVs neutralization assay used in this work if it has been used earlier, either by your team or others. Details regarding the performance of this particular assay for SARS-CoV-2 Nab detection will aid the reader to understand its robustness and weaknesses.

      In the method section, a list of all controls used to guarantee neut assay performance must be included (i.e., positive and negative mAb controls, virus controls, etc.); additionally, it might be necessary to show raw data of all control results.

      Control descriptions are missing in the method section for all or most of the assays (some descriptions are found in the result section, but definitely they should be found in M&M, for each technique).

      The authors draw some similarities between HIV-1 Env and SARS-CoV-2 Spike proteins from structure and glycosylation point of view, but the fact that N6 binds to ENV-CD4bs is beyond these pictures and for that reason a detailed structural analysis MUST be performed to show the mechanism of cross-neutralization and corroborate it exists.

      As the authors may realize, based on previous comments, any false positive result that could biased the main conclusion of this article, must be excluded in order to claim such intriguing finding.

      Regarding BINDING AND NEUTRALIZATION OF SARS-COV-2 WITH PLASMA FROM HIV-1 INFECTED PAEDIATRIC PATIENTS, how can you discard previous exposure to actual SARS-CoV-2 in such population, and/or previous infection with other common hu-CoVs, which has been shown to elicit cross reactive SARS-CoV-2 Abs.

      Minor comments can be addressed once these main concerns are solved.

      Significance

      The fact that N6 (and plasma from HIV-1 infected patients) can neutralize SARS-CoV-2 is novel and intriguing; however, these observations must be supported with new experiments. Showing that N6 neutralizes authentic SARS-CoV-2 would be a key point to address. Moreover, the mechanism for such an interaction should be explored using a detailed structural analysis.

      Referee Cross-commenting

      I am happy to see that both reviewers agree on the major points. These comments must now be answered properly by the authors.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      As HIV-1 bnAbs show extensive somatic mutations and have several unique properties, such as long CDRs and the ability to recognize glycopeptides, there is a possibility that some of them can cross-react with SARS-CoV-2 and that a fraction of these could even neutralize the virus. To explore whether this is true, Mishra et al. from the Luthra Lab investigated the cross-reactivity of existing HIV-1 neutralizing antibodies with purified and stabilized SARS-CoV-2 trimeric ectodomain S2P and receptor binding domain. Indeed, they find that antibodies that recognize the CD4 binding site and MPER region of the HIV envelope show fairly strong reactivity to SARS-CoV-2 RBD and S2P, like the CoV-1 cross-reactive antibody CR3022. They further go on to test neutralization using pseudoviruses decorated with SARS-CoV-2 spike and find that one of the several binding antibodies, N6, can indeed neutralize, though not potently as expected. They also assessed the cross-reactivity and neutralization of plasma from children with chronic HIV-1 infection to SARS-CoV-2 pseudoviruses. Some plasma samples indeed showed neutralization of SARS-CoV-2 pseudoviruses. Based on these observations, the authors propose that cross-reactive epitopes such as the N6 epitope can be used as a blueprint to engineer variant super-antibodies that can be effective against several viral pathogens, including SARS-CoV-2. Overall, the study is well conducted and detailed with clarity.

      1. The authors should include authentic SARS-CoV-2 neutralizing antibodies in the study as a positive control. Several RBD- as well as NTD-binding nAbs have been identified. This will allow one to assess the extent of cross-reactivity and neutralization exhibited by HIV-1 bnAbs and gauge the extent of their practical utility. CR3022, while important, is not enough.
      2. Some plasma of HIV-1 infected children showed neutralizing activity against SARS-CoV-2. However, the neutralization seen may not be solely due to HIV-1 bnAbs in plasma. Antibodies to other human coronaviruses likely contribute to this neutralization. The authors should also test reactivity/neutralization to 229E, NL63, OC43, HKU1, and MERS spike/pseudoviruses to support their claim. In its absence, the authors should tone down their interpretation and provide an alternate explanation.
      3. The authors start out with the following premise: "Given the unique nature of HIV-1 bnAbs and their ability to recognize and/or accommodate viral glycans, we reasoned that the glycan shield of SARS-CoV-2 spike protein can be targeted by HIV-1 specific bnAbs". While the HIV-1 envelope is heavily glycosylated, it is interesting to note that the cross-reactive HIV-1 bnAbs target the RBD of the SARS-CoV-2 spike protein, which is not heavily glycosylated (2 sites). In the absence of glycosylation-probing experiments, it is difficult to justify this premise. Rather, the extensive SHM and self-reactivity (MPER-directed antibodies) could be the likely reason for cross-reactivity.

      Minor:

      1. The color profile in figure 3 needs to be revised, or symbols added, to allow the curves to be distinguished from each other.

      Significance

      This is a significant study that highlights the concept of developing broadly cross-reactive antibodies that may be effective against multiple pathogens, a concept not usually ascribed to antibodies, which are known for their specificity, unlike drugs that can be repurposed. In-depth study of epitopes that can elicit multi-pathogen cross-reactive/neutralizing antibodies can, on the other hand, allow generation of antigenic templates that can be effective as vaccines. This study is suitable for broader audiences to disseminate the above-mentioned concepts.

      Referee Cross-commenting

      1. There is a lot of evidence now that lentipseudoviruses are a good surrogate for identifying SARS-CoV-2 neutralizing antibodies. That said, the authors may want to ensure the data are reproducible even with authentic SARS-CoV-2 viruses, as reviewer 2 suggests, to avoid surprises.
      2. I also agree with Reviewer 2 that structural analyses of the cross-neutralizing antibody with SARS-CoV-2 spike will strengthen the manuscript.
      3. Also, the authors may want to avoid 293T-ACE2 cells as their targets for carrying out neutralization assays. There are data now from Davide Corti's group indicating that ACE2 overexpression can significantly underestimate the neutralizing capacity of antibodies, especially those that bind to RBDs (bioRxiv 2021.04.03.438258; doi: https://doi.org/10.1101/2021.04.03.438258). This may be particularly true for N6. A cell line like Vero E6 that endogenously expresses ACE2 would be the way to go. If the authors decide to use lentipseudoviruses as the first step in Vero cells, please make sure to use G89V mutated HIV gagpol to avoid TRIM5alpha restriction.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2021-00733

      Corresponding author(s): Haiyun Gan

      Sara Monaco, PhD Managing Editor Review Commons

      Dear Dr. Monaco,

      We would like to thank all the reviewers for their insightful and constructive comments, which have helped us to improve our work. Please find below our responses to each of the concerns raised by the reviewers.

      Sincerely

      Haiyun


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      • In this article, Li and colleagues demonstrate the utility of a novel proximity labeling-based strategy, which they term AMAPEX (antibody-mediated protein A APEX) for the proteomic characterization of the protein environment of specific histone modifications. They apply this new methodology in mouse embryonic fibroblasts for subsequent purification of biotinylated proteins and mass-spectrometric identification.

      **Major comments:**

      • The authors present a high quality, descriptive manuscript that introduces a novel proximity labeling approach from a mostly technical point of view. The presentation of the data and methods are mostly clear, such that they should be easy to reproduce.

        We thank the reviewer for the positive view of our work.

      • The biggest shortcoming of the study in its current form seems to be the lack of a proper assessment of the method's sensitivity AND -most importantly also- specificity. The authors do not validate potentially new interactors of the modified histones experimentally, which would highlight their technology as a discovery tool. Without the assessment of newly identified proteins, these could simply represent false positives, which would point towards an additional requirement for optimization of the experimental setups. In that light the newly identified proteins are merely potential histone interactors. Validation would require establishment or purchase of additional antibodies, or alternatively cloning and transfection of respective candidates, which will probably take an extra 2-3 months of work. While the study may be publishable in its current form, this validation seems a valuable investment to strengthen the preliminary conclusions.

        We agree with the reviewer that the lack of validation of our approach is the major shortcoming of our study. As described in our manuscript, we could identify most of the histone-interacting proteins found by other methods in the published data (Vermeulen et al, 2010; Villasenor et al, 2020) (Fig. 2A). We also provide evidence that the H3K27me3 proteins (NSD1, Brd9) identified by H3K27me3 AMAPEX are co-localized with H3K27me3 at the same genomic regions (Fig. 2C). In addition, as we observed an enrichment of splicing proteins among the H4K5ac-interacting proteins, we purified nucleosomes, performed SF3B1 IP, and confirmed the interaction of SF3B1 with H4K5ac. We also performed H3K27me3 mono-nucleosome IP and confirmed the interaction between H3K27me3 and NSD2. Together, these results suggest that AMAPEX is a reliable method to map histone-proximal proteins. We have added the SF3B1/H4K5ac and H3K27me3/NSD2 binding data to the revised manuscript (Fig. 2E and Fig. 7C).

      • With two biological replicates the study fulfills the minimum requirements for reproducibility, however, it would benefit from an additional replicate.

        We added extra replicates in the assays for H3K27me3 and H3K4me3 (Fig. 2A and Fig. 6A).

      **Minor point:**

      • The sentences in lines 22f (and 44f) could be misunderstood. Please rephrase statements to be unambiguous that it specifies the proteomic surrounding of post-translationally modified proteins. The proximity labeling technologies may themselves be limited in identifying post-translationally modified proteins, as they label reactive side-chains that are often targeted by PTMs. Post-translationally modified lysine residues cannot be targeted by biotin ligase-based methodologies, as tyrosine phosphorylations cannot be assessed by peroxidase-based labeling.

        We agree that these sentences are confusing. We rephrased the sentence to “cannot be applied to identify proteins surrounding of post-translationally modified proteins” and “mapping the biomolecules that are proximal to post-translationally modified (PTM) proteins like histone modifications is complex” in our revised manuscript.

      Reviewer #1 (Significance (Required)):

      • From a technical point of view, most of the key conclusions of this paper are convincing. Assessing the proteomic environment of post-translationally modified proteins is of extremely high interest and the methodology seems broadly applicable to address such questions. It should be noted that proximity labeling has not originally been utilized to map protein-protein interactions, as the enzymatic activities available probe their surroundings, this could be an over-interpretation. Nonetheless, the "proteomic surrounding" of target proteins, which would also include hard to identify transient interactions is of high general interest for molecular biologists.

        Thanks so much, we have rephrased our sentences in the revised manuscript.

      • Although it can be generally agreed that having to express an exogenous fusion protein is a limitation of current proximity labeling setups, the methodology presented here in turn has the limitation that it cannot be performed in living cells, which is a significant disadvantage.

        We agree that it is a disadvantage that our assay cannot be done in living cells; most antibody-based assays share this limitation. However, we have successfully applied AMAPEX to cell samples under native conditions and have included the results in the revised manuscript (Fig. 8).

      Moreover, our method working under fixation conditions can be potentially applied to FFPE samples. We will add discussion to our revised manuscript.

      • My lab is establishing and constantly improving various proximity labeling methodologies in combination with mass spectrometry for sub-organellar proteomics. While the technology is sound and well executed, I am not an expert in the biology of histone modifications and the proteins involved and cannot assess the novelty nor the actual value of the generated datasets. It appears to me though that additional validation by independent experiments would strengthen the manuscript. The authors describe the usefulness of the technology mainly in confirming known interactors of modified histones, however, it would be nice (for non-specialists) to explicitly state how high the coverage of the known interactors is and discuss why some of them might have been missed.

        Thanks for the suggestion. We have added the statement of the coverage of the known interactors in our revised manuscript and we also include discussions.

      • The manuscript in its current form completely lacks a discussion of the presented data. Even if the focus will remain on the technical aspects, the findings should be properly discussed by comparison to other proteomics approaches studying histones. Ideally the data should also be discussed in the light of current proximity labeling technologies and potential future directions.

        We have added the discussion part to our revised manuscript.

      • While I cannot assess the value for histone research, the manuscript will be very interesting for experts focusing on proximity labeling technologies and subcellular proteomics.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      • This work describes interesting approach for mapping interactome of specifically modified histone protein using antibody-based APEX2. Although it contains interesting results and useful techniques for biological community, I found that it requires revision and addition for the publication.

      • I could understand the reason using antibody-based proximity labeling approach for mapping the biomolecules that interact with post-translationally modified histone because it should be complicated to map it with exogenous protein expression approach. However, the introduction of antibody-conjugated APEX2 should require fixation and permeabilization steps that can usually compromise ultrastructure. The authors should comment whether these procedures can affect protein composition and structures of nucleosome in the Discussion part of this manuscript.

        We agree with the reviewer. To address the concerns raised here, we have performed AMAPEX under native conditions (H3K27me3). Our results showed that AMAPEX under native conditions can still identify most of the surrounding proteins identified with the crosslinking approach. We have added the data to the revised manuscript.

      There are also advantages to performing AMAPEX with fixed samples, which may help capture transient events. We have added this to the discussion of the revised manuscript.

      Formaldehyde crosslinking may cause undesired effects. For instance, nuclear proteins/loci are not cross-linked with equal efficiency; cross-linking may trigger the DNA damage response; and crosslinking can also mask the epitopes of some antibodies, affecting antibody/antigen binding. However, most approaches for studying chromatin/nucleosomes in cells require crosslinking and permeabilization. We have included a discussion of whether fixation and permeabilization affect the protein composition and structure of nucleosomes.

      • For generation of the biotin-phenoxyl radical, an HRP-conjugated antibody can be utilized, as shown in the BAR method (Daniel Z Bar et al. Nat Methods, 2018, 15, 127-133). Since secondary-HRP antibody is commercially available, one need not make the effort to express and purify pA-APEX2 for this approach. The authors should clearly explain why they selected APEX2 and what advance(s) they expect from using APEX2 in their approach.

        We have compared pA-APEX2 with secondary-HRP and found that pA-APEX2 gives better specificity, as more of the proteins identified by H3K27me3-mediated APEX2 labeling are located in the nucleus than of those identified by HRP. While only 33.31% of the proteins identified by the HRP-based method are located in the nucleus, 69.4% of those identified by pA-APEX2 are nuclear, which indicates the increased specificity of the pA-APEX2 method compared to the HRP-conjugated secondary antibody. We have added the data to the revised manuscript (Fig. 3C, D).

      • For the detection of the APEX-mediated biotinylated proteins, direct mass identification of tyrosine-modified peptides with chemical probes can give the most accurate information about the proximal proteins (see Lee SY et al. J. Am. Chem. Soc. 2017, 139, 3651-3662; Namrata D Udeshi et al. Nature Methods 2017, 14, 1167-1170). Thus, if the authors can obtain the biotin-modified peptide information from each antibody-conjugated APEX2, the quality of their interactome results should be much improved. If the authors are not in a position to conduct further mass spectrometry experiments, they should check whether their important hits (e.g. Arid2, Brd7, Nsd2) are really "biotinylated" by conducting a Streptavidin-HRP western blot experiment after enrichment of those proteins using a primary antibody. If biotinylation is specifically conducted by pA-APEX2 with the H3K27me3 antibody, the authors can observe the SA-HRP blot signal on the enriched protein band on the membrane. Negative controls should be samples omitting pA-APEX2, H3K27me3 antibody, biotin-phenol, or H2O2, respectively, or using a primary antibody targeting a different PTM. This result can confirm that their findings are enriched from proximity-dependent biotinylation by APEX2, not from spurious binding events to other biotinylated proteins or self-labeled bait proteins.

        We thank the reviewer for the suggestion. We agree that finding the biotinylated peptides of the discovered proteins in our experiments could make the results more convincing. A variable modification of biotin-phenol on tyrosine was added to the search settings of MaxQuant (version 1.6.10.43), which led to the identification of very few (only ten) biotinylated peptides, none of them from Arid2, Brd7, or Nsd2. However, these results were not surprising, since we pulled down the low-abundance target proteins based on the strong binding of biotin to streptavidin (Kd ≈ 10⁻¹⁴–10⁻¹⁶ M) (Laitinen et al, 2007), which is also unusually stable against heat, denaturants, extremes of pH and proteolytic enzymes (Wilchek et al, 2006). These properties were considered obvious advantages of the technology. Thereafter, to obtain better peptide coverage, on-bead tryptic digestion was performed under relatively mild conditions (50 mM HEPES pH 8.0, 1 μM CaCl2, and 2% ACN). Therefore, it is highly likely that the biotinylated peptides remained trapped on the streptavidin beads and were not detectable in the samples.

      We did perform the Streptavidin-HRP western blot experiment. Compared to samples omitting the H3K27me3 antibody or H2O2, or to IgG samples, there is an increased SA-HRP blot signal in the samples labeled by pA-APEX2 with the H3K27me3 antibody (Supplementary Fig. 1F).

      Reviewer #2 (Significance (Required)):

      • This work is not the first to use an antibody-binding APEX2 strategy, and the authors should mention the precedent work in the manuscript: Jisu Lee et al. Chem. Commun., 2015, 51, 10945-10948. The authors should also comment in the manuscript on antibody-based proximity labeling mapping works, including Daniel Z Bar et al. Nat Methods, 2018, 15, 127-133.

        We are sorry for the oversight. We have rephrased our sentences in the revised manuscript. In addition, we have also compared our method with the one published in Nat Methods.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      • The manuscript by Li and coworkers describes an approach for detecting PTM-specific interacting proteomes through a proximity labeling strategy. In this method, cellular proteomes were first cross-linked by formaldehyde and then incubated with PTM-specific antibodies. Afterwards, a protein A-APEX2 fusion protein was added to target the antibody and label the proteins in proximity with biotin. The biotinylated proteins can be enriched and subsequently identified by LC-MS/MS. The authors claim that these proteins are PTM-specific interacting proteins because they are close to the PTM-recognition antibody. They mainly applied the method to histone marks and identified the interacting proteins for several histone modifications. They found that the identified proteins overlap partially with the published ChIP-seq data and drew complicated interaction networks based on the proteomic data using the STRING database.

      **Major comments:**

      • The APEX technology has been widely used for capturing subcellular proteome or proteome proximal to the target protein of interest. Here the authors aimed to use antibodies specific for detecting certain histone PTMs to guide the APEX enzyme for proximal labeling. While the idea is interesting, the actual applicability is rather limited as APEX does not enable identification of direct interactions. With the current methodology design, the reviewer did not see its advantage over a formaldehyde crosslinking followed by a conventional immunoprecipitation using the PTM-specific antibody. The authors are suggested to compare the head-to-head performance of these two approaches.

        We agree with the reviewer that APEX2 is for proximal labeling, and we have rephrased our statements from “binding” or “interaction” to “proximal”.

      We have compared our results with published data (Vermeulen et al., 2010; Villasenor et al., 2020) (Fig. 2A) and were able to identify most of the proteins identified in that work. To improve our method, we performed AMAPEX under native conditions and still obtained very robust labeling of nearby proteins, which is an improvement over formaldehyde crosslinking followed by conventional immunoprecipitation using the PTM-specific antibody.

      • From the methodological point of view, it is unclear why formaldehyde crosslinking is included here. Neither did the authors justify its necessity for the whole workflow. With treatment with 0.1% formaldehyde for proteome crosslinking, how about physiological status of the cells? Authors should assess survival of cells under formaldehyde treatment. After a long labeling time, will the labeling results still reflect the physiological status of the protein interaction networks in living cells?

        We agree with the reviewer regarding the role of crosslinking in our method. To address this point, we have performed AMAPEX under native conditions and still obtained very robust labeling of nearby proteins. On the other hand, crosslinking allows our method to be potentially used for FFPE samples.

      • Overall, the manuscript lacks sufficient description on how the pipeline was optimized. For example, does it matter where the APEX2 is fused to protein A? The authors need to do a better job to present their optimization process.

        We found the activity is robust with current pA-APEX2. There is published data that there is only minimal effect on enzymatic activity when APEX2 is fused to Tn5 at different positions, which suggested APEX2’s enzymatic activity is unlikely to be affected by the location of fusion proteins.

      • Crosslinking will result in high background in proximity labeling, in figure 1 D, the signal from IgG is obviously high. Strangely, the background signals completely disappeared in figure 2A. What are the differences between these two experiments? Similarly, in figure S4, the IgG lanes in A, B and C looked quite different from each other. The results suggested that the workflow might not be as robust as the authors have claimed.

        The high background of IgG in Fig. 4A is very possibly caused by the relatively less effective labeling by H3K4me3. As noted in our earlier responses, we have included new results obtained under native conditions, which reduced the background.

      • The authors drew complicated interaction networks for each histone PTMs, however, a comparison between the networks for a modified histone versus the unmodified one is missing. Such data would be more valuable and informative as they can provide new clues on how the PTM mediates different interaction networks for cellular signaling.

        We have performed AMAPEX with antibodies against different histone modifications. While there are overlaps between these modifications, we could find proximal proteins specifically identified with individual histone modifications.

      • The manuscript lacks experimental validation of the identified interacting protein partners. The authors used the published ChIP-seq data; however, the cell lines used for ChIP-seq are different from the one they used for APEX labeling. In this regard, a side-by-side comparison of this method with the ChIP-seq results should be performed, in terms of both purification efficiency and specificity.

        To validate our findings, we have purified nucleosomes, performed SF3B1 IP, and confirmed the interaction of SF3B1 with H4K5ac. We have also performed mono-nucleosome H3K27me3 IP and confirmed the binding between H3K27me3 and NSD2. We have included these validation results in our revised manuscript.

      • For the proteomic data, reproducibility calculation should use intensity ratio instead of intensity. The high dynamic of signal intensity will mislead the audience. P values between replicates should be calculated, a cutoff of < 0.05 or < 0.01 is mostly used in proteome data.

        We have calculated P values between replicates by t-test and added them in supplementary table s1.
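
        As a rough illustration of the reviewer's point (a minimal sketch, not the analysis actually used for supplementary table S1), the Python below works on per-protein log2 intensity ratios rather than raw intensities. All intensity values and function names are hypothetical. It computes a one-sample t statistic only; converting it to a P value requires a t distribution with n - 1 degrees of freedom from a statistics package.

```python
import math
from statistics import mean, stdev


def log2_ratios(sample, control):
    """Per-protein log2(sample / control) intensity ratios for one replicate."""
    return [math.log2(s / c) for s, c in zip(sample, control)]


def pearson(x, y):
    """Pearson correlation between two equally long lists of values."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def one_sample_t(values, mu=0.0):
    """t statistic for the null hypothesis that the mean of `values` equals mu."""
    n = len(values)
    return (mean(values) - mu) / (stdev(values) / math.sqrt(n))


# Hypothetical intensities for four proteins (antibody sample vs. IgG control).
# Replicate 2 has twice the raw signal of replicate 1, but the same ratios.
rep1 = log2_ratios([8.0e6, 4.0e6, 2.0e6, 1.0e6], [1.0e6] * 4)
rep2 = log2_ratios([1.6e7, 8.0e6, 4.0e6, 2.0e6], [2.0e6] * 4)

# Reproducibility judged on ratios, as the reviewer suggests.
replicate_correlation = pearson(rep1, rep2)

# Enrichment of one hypothetical protein across three replicates,
# tested against a log2 ratio of 0 (no enrichment over IgG).
t_stat = one_sample_t([3.0, 2.8, 3.2])
```

        Working on ratios removes the replicate-wide difference in raw signal, which is why the two hypothetical replicates above agree even though their intensities differ twofold.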

      **Minor comments:**

      • Reference format should be checked. eg. redundant citation, incorrect format etc.

        We have checked our references for errors and corrected as many as we could find.

      • Loading controls should be attached along with western blotting results (figure 1d, figure 2A).

        We have included the Ponceau S staining as loading controls of these western blotting results in our revised manuscript.

      • This manuscript lacks clear conclusion or discussions. Subtitles are needed to delineate results. The overall logical organization of this manuscript should be strengthened. Currently, it is quite hard to follow and appreciate which part of the data is more technically novel and biologically significant.

        We have added subtitles and discussion in the revised manuscript.

      • The title is too broad as the authors only showed the application on histone PTMs.

        We have changed our title to “Defining proximity proteomics of Histone modifications by antibody-mediated protein A-APEX2 labeling”.

      • The introduction lacks proper description of other strategies for mapping PTM specific interactome, especially those by photo-affinity peptides or photo-affinity unnatural amino acids.

        We have added the other strategies for mapping PTM specific interactome in the introduction in the revised manuscript.

      Reviewer #3 (Significance (Required)):

      • It is more of a technical improvement, fusing protein A with APEX2 so that the proximity labeling can be guided around a specific PTM using the proper antibody. Researchers in the field of histone modifications will be interested in the current technique. The reviewer has expertise in proteomics with a focus on PTM analysis.

      References

      Laitinen OH, Nordlund HR, Hytönen VP, Kulomaa MS (2007) Brave new (strept)avidins in biotechnology. Trends Biotechnol 25: 269-277

      Vermeulen M, Eberl HC, Matarese F, Marks H, Denissov S, Butter F, Lee KK, Olsen JV, Hyman AA, Stunnenberg HG (2010) Quantitative interaction proteomics and genome-wide profiling of epigenetic histone marks and their readers. Cell 142: 967-980

      Villasenor R, Pfaendler R, Ambrosi C, Butz S, Giuliani S, Bryan E, Sheahan TW, Gable AL, Schmolka N, Manzo M et al (2020) ChromID identifies the protein interactome at chromatin marks. Nat Biotechnol 38: 728-736

      Wilchek M, Bayer EA, Livnah O (2006) Essentials of biorecognition: the (strept)avidin-biotin system as a model for protein-protein and protein-ligand interaction. Immunol Lett 103: 27-32

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Li and coworkers describes an approach for detecting PTM-specific interacting proteomes through a proximity labeling strategy. In this method, cellular proteomes were first cross-linked by formaldehyde and then incubated with PTM-specific antibodies. Afterwards, a protein A-APEX2 fusion protein was added to target the antibody and label the proteins in proximity with biotin. The biotinylated proteins can be enriched and subsequently identified by LC-MS/MS. The authors claim that these proteins are PTM-specific interacting proteins because they are close to the PTM-recognition antibody. They mainly applied the method to histone marks and identified the interacting proteins for several histone modifications. They found that the identified proteins overlap partially with the published ChIP-seq data and drew complicated interaction networks based on the proteomic data using the STRING database.

      Major comments:

      1. The APEX technology has been widely used for capturing subcellular proteome or proteome proximal to the target protein of interest. Here the authors aimed to use antibodies specific for detecting certain histone PTMs to guide the APEX enzyme for proximal labeling. While the idea is interesting, the actual applicability is rather limited as APEX does not enable identification of direct interactions. With the current methodology design, the reviewer did not see its advantage over a formaldehyde crosslinking followed by a conventional immunoprecipitation using the PTM-specific antibody. The authors are suggested to compare the head-to-head performance of these two approaches.
      2. From the methodological point of view, it is unclear why formaldehyde crosslinking is included here. Neither did the authors justify its necessity for the whole workflow. With treatment with 0.1% formaldehyde for proteome crosslinking, how about physiological status of the cells? Authors should assess survival of cells under formaldehyde treatment. After a long labeling time, will the labeling results still reflect the physiological status of the protein interaction networks in living cells?
      3. Overall, the manuscript lacks sufficient description on how the pipeline was optimized. For example, does it matter where the APEX2 is fused to protein A? The authors need to do a better job to present their optimization process.
      4. Crosslinking will result in high background in proximity labeling, in figure 1 D, the signal from IgG is obviously high. Strangely, the background signals completely disappeared in figure 2A. What are the differences between these two experiments? Similarly, in figure S4, the IgG lanes in A, B and C looked quite different from each other. The results suggested that the workflow might not be as robust as the authors have claimed.
      5. The authors drew complicated interaction networks for each histone PTMs, however, a comparison between the networks for a modified histone versus the unmodified one is missing. Such data would be more valuable and informative as they can provide new clues on how the PTM mediates different interaction networks for cellular signaling.
      6. The manuscript lacks experimental validation of the identified interaction partners. The authors used published ChIP-seq data; however, the cell lines used for ChIP-seq are different from the one they used for APEX labeling. In this regard, a side-by-side comparison of this method with the ChIP-seq results should be performed, in terms of both purification efficiency and specificity.
      7. For the proteomic data, the reproducibility calculation should use intensity ratios instead of raw intensities. The high dynamic range of signal intensity can mislead the audience. P values between replicates should be calculated; a cutoff of < 0.05 or < 0.01 is commonly used for proteomic data.
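
A minimal sketch of the ratio-based reproducibility check suggested in point 7, assuming per-protein intensities for a PTM-antibody sample and an IgG control in each of two replicates. All intensity values and function names below are hypothetical illustrations, not data or code from the manuscript.

```python
import math
from statistics import mean


def log2_ratios(sample, control):
    """Per-protein log2(sample / control) intensity ratios."""
    return [math.log2(s / c) for s, c in zip(sample, control)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical intensities for four proteins (illustration only).
rep1_ptm = [8.0e6, 2.0e5, 4.0e7, 1.0e4]
rep1_igg = [1.0e6, 1.0e5, 1.0e7, 1.0e4]
rep2_ptm = [7.0e6, 2.5e5, 3.0e7, 1.2e4]
rep2_igg = [1.0e6, 1.0e5, 1.0e7, 1.0e4]

# Correlating raw intensities is dominated by the few most abundant proteins.
r_intensity = pearson(rep1_ptm, rep2_ptm)

# Correlating log2 enrichment ratios weights every protein comparably.
r_ratio = pearson(log2_ratios(rep1_ptm, rep1_igg),
                  log2_ratios(rep2_ptm, rep2_igg))

print(f"raw intensity r = {r_intensity:.3f}, log2 ratio r = {r_ratio:.3f}")
```

With more replicates, a per-protein significance test on the log2 ratios would supply the P values mentioned above; that step is omitted here.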

      Minor comments:

      1. The reference format should be checked, e.g., redundant citations, incorrect formatting, etc.
      2. Loading controls should be included with the western blot results (figure 1D, figure 2A).
      3. This manuscript lacks a clear conclusion or discussion. Subtitles are needed to delineate the results. The overall logical organization of this manuscript should be strengthened. Currently, it is quite hard to follow and to appreciate which part of the data is more technically novel and biologically significant.
      4. The title is too broad as the authors only showed the application on histone PTMs.
      5. The introduction lacks a proper description of other strategies for mapping PTM-specific interactomes, especially those using photo-affinity peptides or photo-affinity unnatural amino acids.

      Significance

      This is mainly a technical improvement: fusing protein A with APEX2 allows proximity labeling to be guided to a specific PTM using the appropriate antibody. Researchers in the field of histone modifications will be interested in the current technique. The reviewer has expertise in proteomics, with a focus on PTM analysis.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This work describes an interesting approach for mapping the interactome of specifically modified histone proteins using antibody-based APEX2. Although it contains interesting results and useful techniques for the biological community, I find that it requires revisions and additions before publication.

      1. I understand the reason for using an antibody-based proximity labeling approach to map the biomolecules that interact with post-translationally modified histones, because it would be complicated to map them with an exogenous protein expression approach. However, the introduction of antibody-conjugated APEX2 requires fixation and permeabilization steps that can compromise ultrastructure. The authors should comment in the Discussion section of this manuscript on whether these procedures can affect the protein composition and structure of the nucleosome.

      2. For generation of the biotin-phenoxyl radical, an HRP-conjugated antibody can be utilized, as shown in the BAR method (Daniel Z Bar et al. Nat Methods, 2018, 15, 127-133). Since HRP-conjugated secondary antibodies are commercially available, one need not make the effort to express and purify pA-APEX2 for this approach. The authors should clearly explain why they selected APEX2 and what advance(s) they expect from using APEX2 in their approach.

      3. For the detection of APEX-mediated biotinylated proteins, direct mass spectrometric identification of tyrosine-modified peptides with chemical probes provides the most accurate information on the proximal proteins (see Lee SY et al. J. Am. Chem. Soc. 2017, 139, 3651-3662; Namrata D Udeshi et al. Nature Methods 2017, 14, 1167-1170). Thus, if the authors can obtain the biotin-modified peptide information from each antibody-conjugated APEX2, the quality of their interactome results should be much improved. If the authors cannot conduct further mass spectrometry experiments, they should check whether their important hits (e.g. Arid2, Brd7, Nsd2) are really "biotinylated" by performing a Streptavidin-HRP western blot after enrichment of those proteins with a primary antibody. If biotinylation is specifically carried out by pA-APEX2 with the H3K27me3 antibody, the authors should observe an SA-HRP blot signal on the enriched protein band on the membrane. Negative controls should be samples that omit pA-APEX2, the H3K27me3 antibody, biotin-phenol, or H2O2, respectively, or that use a primary antibody targeting a different PTM. This result can confirm that their findings are enriched through proximity-dependent biotinylation by APEX2, not through spurious binding to other biotinylated proteins or self-labeled bait proteins.

      Significance

      This work is not the first to use an antibody-binding APEX2 strategy, and the authors should mention the precedent work in the manuscript: Jisu Lee et al. Chem. Commun., 2015, 51, 10945-10948. And the author also commented the antibody-based proximity labeling mapping works including Daniel Z Bar et al. Nat Methods, 2018, 15, 127-133 in the manuscript.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this article, Li and colleagues demonstrate the utility of a novel proximity labeling-based strategy, which they term AMAPEX (antibody-mediated protein A APEX) for the proteomic characterization of the protein environment of specific histone modifications. They apply this new methodology in mouse embryonic fibroblasts for subsequent purification of biotinylated proteins and mass-spectrometric identification.

      Major comments:

      • The authors present a high quality, descriptive manuscript that introduces a novel proximity labeling approach from a mostly technical point of view. The presentation of the data and methods is mostly clear, such that they should be easy to reproduce.

      • The biggest shortcoming of the study in its current form seems to be the lack of a proper assessment of the method's sensitivity and, most importantly, specificity. The authors do not validate potentially new interactors of the modified histones experimentally, which would highlight their technology as a discovery tool. Without the assessment of newly identified proteins, these could simply represent false positives, which would point towards an additional requirement for optimization of the experimental setups. In that light, the newly identified proteins are merely potential histone interactors. Validation would require establishment or purchase of additional antibodies, or alternatively cloning and transfection of respective candidates, which will probably take an extra 2-3 months of work. While the study may be publishable in its current form, this validation seems a valuable investment to strengthen the preliminary conclusions.

      • With two biological replicates the study fulfills the minimum requirements for reproducibility, however, it would benefit from an additional replicate.

      Minor point:

      • The sentences in lines 22f (and 44f) could be misunderstood. Please rephrase statements to be unambiguous that it specifies the proteomic surrounding of post-translationally modified proteins. The proximity labeling technologies may themselves be limited in identifying post-translationally modified proteins, as they label reactive side-chains that are often targeted by PTMs. Post-translationally modified lysine residues cannot be targeted by biotin ligase-based methodologies, as tyrosine phosphorylations cannot be assessed by peroxidase-based labeling.

      Significance

      • From a technical point of view, most of the key conclusions of this paper are convincing. Assessing the proteomic environment of post-translationally modified proteins is of extremely high interest, and the methodology seems broadly applicable to address such questions. It should be noted that proximity labeling was not originally used to map protein-protein interactions; because the available enzymatic activities probe their surroundings, describing the results as interactions could be an over-interpretation. Nonetheless, the "proteomic surrounding" of target proteins, which would also include hard-to-identify transient interactions, is of high general interest for molecular biologists.

      • Although it can be generally agreed that having to express an exogenous fusion protein is a limitation of current proximity labeling setups, the methodology presented here in turn has the limitation that it cannot be performed in living cells, which is a significant disadvantage.

      • My lab is establishing and constantly improving various proximity labeling methodologies in combination with mass spectrometry for sub-organellar proteomics. While the technology is sound and well executed, I am not an expert in the biology of histone modifications and the proteins involved and cannot assess the novelty nor the actual value of the generated datasets. It appears to me though that additional validation by independent experiments would strengthen the manuscript. The authors describe the usefulness of the technology mainly in confirming known interactors of modified histones, however, it would be nice (for non-specialists) to explicitly state how high the coverage of the known interactors is and discuss why some of them might have been missed.

      • The manuscript in its current form completely lacks a discussion of the presented data. Even if the focus will remain on the technical aspects, the findings should be properly discussed by comparison to other proteomics approaches studying histones. Ideally the data should also be discussed in the light of current proximity labeling technologies and potential future directions.

      • While I cannot assess the value for histone research, the manuscript will be very interesting for experts focusing on proximity labeling technologies and subcellular proteomics.

  4. Jul 2021
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Dear reviewers,

      We thank all reviewers for their appreciation of our work and even more so for their constructive comments and suggestions, which will significantly improve the quality of the manuscript. We were able to complete the revision and address all reviewer comments. Aside from a more stringent discussion of the literature and rewording of certain paragraphs for clarity, we also generated additional experimental data.

      More importantly, to address the concern that we did not provide a positive marker for the intranuclear compartment, we present new images. We attempted to label gamma-Tubulin by generating new antibodies, GFP-tagged strains, and trying multiple commercial antibodies since the beginning of the project. Only recently we found an antibody providing a more specific signal at the expected location, although with some likely cross-reactivity with alpha- and beta-tubulin, and now show these data in the supplements. Additionally, we generated expansion microscopy samples stained with a fluorophore-coupled NHS-Ester, a bulk protein label. These data show that the centrosome contains an exceptionally protein dense hourglass-shaped region, which spans from the extranuclear to the intranuclear compartment, as revealed by centrin and tubulin co-staining. This fortifies our claims about the distinct nature of the intranuclear centrosome compartment containing the microtubule nucleation sites.

      Further, we add images of 5-SiR-Hoechst, SPY555-Tubulin, Centrin1-GFP triple labelling live cells to demonstrate the specificity of the microtubule dye and to underline that we are indeed acquiring the dynamics from the first nuclear division on.

      In terms of formatting we added line numbers and uploaded high quality figures separately. Due to the added data and panels we needed to split Fig. 1 into two separate figures, rewrote the figure legends and moved them to the end of the document.

      Please find below a point-by-point response to the comments.

      Best regards,

      Julien Guizetti

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Simon and collaborators addresses the dynamic changes of spindle and hemi-spindle microtubules occurring along schizogony in Plasmodium falciparum. The work explores the temporal correlation of the changes observed in intranuclear spindles with changes at the level of the centriolar plaque; the nuclear microtubule organizing center of these parasites, using centrin as a bona fide marker of the structure. The study shows that spindle microtubules organize from an intranuclear region, devoid of chromatin, distinct from the centrin region which had not been observed or described before. It further shows that centrin does not localize at the nuclear envelope, but it is actually extranuclear.

      This work significantly expands on previous knowledge regarding the functional and spatial organization of the nucleus in P. falciparum, and the structure once defined as "an electron dense mass on the nuclear envelope." It uses state of the art microscopy approaches such as STED, UExM and CLEM, in combination with immunolabeling, dyes and parasites over expressing fluorescent protein fusions, to address these questions.

      **Major comments:**

      • Are the key conclusions convincing?

      I find the manuscript successfully addresses the posed questions. The data presented supports the conclusions.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      No

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      N/A

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes

      • Are the experiments adequately replicated and statistical analysis adequate?

      Yes

      **Minor comments:**

      • Specific experimental issues that are easily addressable.

      On the data shown in Figure 1, it is unclear to me what elements are taken into account to define "anaphase." Anaphase could be defined by using chromatin markers - such as CenH3- which have been identified in Plasmodium and the authors make use of in Figure 1F.

      We acknowledge that the term anaphase is ill-defined here. Further it suggests a mitotic morphology analogous to the one observed in “classical” models (prophase, metaphase, anaphase,…), which is not fully appropriate. In line with the comments by Reviewer 3 we, therefore, decided to use the term “extended spindle” instead (Fig. 1 & 2). This better reflects the morphological criterion on which we based the stage definition.

      • Are prior studies referenced appropriately?

      The authors state that "with the exception of centrins and gamma tubulins" few canonical centrosome components are conserved in Plasmodium. These parasites are in fact able to assemble a more or less canonical centriole for microgamete basal body formation. Widely conserved centriolar components such as Sas6 are coded by the malaria genome, and have been characterized previously. This work is neither referenced nor discussed in the manuscript.

      The reviewer is right to point out this omission. While writing this section, we were too focused on the blood stage centriolar plaques, where centrioles are not observed. Of course, centriole-like structures are relevant in other life cycle stages, such as microgametes, and should be discussed (line 104). Some previous attempts to endogenously tag Sas6 to verify its localization in blood stages were unfortunately not successful.

      • Are the text and figures clear and accurate?

      I find the timings shown in Figure 1A, with respect to the schematic quantification shown in Figure 1B, confusing. Shown as it is, one naturally correlates the images in Fig. 1A above with the cell cycle progression timing shown in Fig. 1B below. However, by time 260 min, for example, two somewhat adjacent centrin signals can be observed. Though this is defined as anaphase, by an unspecified criterion, this could very well be representative of metaphase. Nonetheless, the timing shown in Figure 1B for "anaphase" onset is 170 min, which is inconsistent with the images above. I suggest that either the quantification is shown in a different format (e.g., bar plots) which could then better reflect the cell-to-cell variations observed (by use of error bars, for example) or that the figure explanation in the results section clarifies this issue.

      We understand how this representation is misleading and have adjusted the figure and text accordingly. We modified the time stamps in Fig. 1A (now Fig. 1C) to the scale used in Fig. 1B (now Fig. 1D) i.e. collapse of the hemispindle is t=0 and explain this in the text (line 158). Since we feel that Fig. 1B (now Fig. 1D) is a good and compact visual representation of progression through the first division we kept the bar plots in the supplements (Fig. S1), but added a title clarifying that average duration between multiple movies are shown.

      As presented, the data in Figure 1C is rather uninformative. A pattern could be more immediately extracted if dots corresponding to subsequent appearance of centrin dots in the same nucleus were connected to each other.

      Concerning the appearance of the centrin signals we adopted the good suggestion by the reviewer and connected “paired” centrin signals by lines (Fig. 1E).

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      There are a number of edits required in the text. Row numbers would have been helpful in pointing these out. I point out some edits below, but a thorough revision of the manuscript for grammatical and syntactic errors would be beneficial.

      • Cytokinetic segmeter - please replace with "segmented"

      • Please refer to Figure 1D when appropriate - there is quite an extensive paragraph describing the results shown on this figure, but it is only referenced at the start.

      • "..., as did the and the number of branches per nucleus,..." please rewrite as appropriate.

      We apologize for not providing line numbers, but have corrected the addressed points and applied a grammatical check throughout the manuscript. We have added additional references to Figure 1D (now Fig. 2A) in the text.

      Reviewer #1 (Significance (Required)):

      This manuscript could be of interest to a wide audience interested in cell cycle, cell division, cell organization and organelle positioning, infectious diseases and microscopy. However, the introduction assumes that readers are somewhat expert in the malaria field. I suggest the authors include a brief introduction of the malaria life cycle, and a schematic representation of the division mode. This will help non-experts follow the narrative more easily.

      We are happy to read that the reviewer sees value of this study for a broader audience. Following the suggestion, we added a small schematic (Fig. 1A, lines 54, 62) highlighting the relevant steps of schizogony and expanded the introduction of the life cycle (line 46).

      This work rectifies long-standing inconsistencies observed by different experimental approaches in the nuclear organization of malaria parasites during schizogony. However, the possible functional consequences of the alternative modes of spindle organization in malaria are not clearly stated or discussed. In this respect, as it stands, the manuscript is rather descriptive and lacks mechanistic insight. Nonetheless, the data presented are of superb quality, and the manuscript represents a tremendous leap in structural insight and imaging resolution for the field of malaria. I find the data suitable for publication provided minor adjustments are made (especially to Figure 1 and/or the description of the results shown in Figure 1, for consistency).

      We agree that the value of this manuscript lies in the clarification of conflicting data, unprecedented structural insight, and providing a useful working model for the malaria parasite centrosome. Although this study is ultimately descriptive it forms the indispensable basis to generate more meaningful functional insight about centrosome biology and nuclear division. Some of the functional consequences worth considering are: i) The (at least) bipartite composition indicating that centrosome functionality is spatially spread throughout the nucleoplasm/cytoplasm boundary. ii) The delayed appearance of the centrin signal after tubulin signal allows the prediction that centrosome assembly is a staged process occurring over an elongated period of time. iii) The generally amorphous structure of the compartment predicts the involvement of yet to be uncovered matrix-like proteins harbouring microtubule nucleation sites. iv) Lastly, our model has important implications for the mechanism of centrosome duplication. In a centrosome containing centrioles (like in vertebrates), the duplication event can easily be explained by physical separation of the daughter and mother centrioles. Spindle pole body duplication in yeasts is achieved by de novo formation of a new one, which remains connected by a half bridge until it is split. The centriolar plaque organization revealed here suggests that we need an entirely new model of centrosome duplication (or splitting) to describe and understand this process in malaria parasites. We now address those points more explicitly in the discussion section (e.g. lines 375, 443, 467).

      **Referee Cross-commenting**

      I agree with all the other reviewers' comments. I'm glad the reviewers seem to be experts in the field of malaria cell division and have pointed out previous studies which were not appropriately referenced. I second those comments.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      In this manuscript, Simon et al. have used advanced cell biology techniques such as STED, expansion microscopy and live cell imaging to decipher the configuration of microtubules, centrin and nuclear pores during the unconventional cell division process in the malaria parasite. They have shown the dynamics of centrin and its localisation with respect to the centriolar plaque that is characteristic of these parasite cells during schizogony. They also infer from their studies that there is an extended intranuclear compartment which is devoid of chromatin.

      **Major Comments**

      • Are the key conclusions convincing? Yes to some extent

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      *Some parts are preliminary and speculative, as there is no solid data supporting them. Please see below.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      *Yes to substantiate their claim

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      *They can do these quite quickly, in less than a month

      • Are the data and the methods presented in such a way that they can be reproduced?

      *Yes

      • Are the experiments adequately replicated and statistical analysis adequate?

      *Yes

      The authors present beautiful imaging and some in-depth structural data using tomography and CLEM to show the location of centrin, which is generally considered the marker for the centrosome or microtubule organising centre in the malaria parasite. These approaches have not yet been applied in Plasmodium and are hence very informative. Although they present some advanced microscopy, many of these concepts for the hemispindle were shown earlier in many light and super-resolution microscopy studies. The authors claim that they are the first to show that there is space between centrin and the nucleus, but this has been shown previously in centrin studies in Plasmodium berghei using super-resolution microscopy (Roques et al. 2019, Fig. 1 and supplementary videos 1 & 2) as well as recently by expansion microscopy by the group of Brochet et al. 2021.

      We thank the reviewer for the appreciation of our work. We are, indeed, not the first to describe the gap between centrin and tubulin or the nucleus. We just aimed to reiterate this finding, also visible in our data, in order to transition to the analysis of nuclear pore positioning to clarify whether centrin is actually extranuclear. Nevertheless, we should have cited the Roques and Bertiaux et al. studies again in this context, which we have now rectified (line 252).

      In addition, microtubule dynamics were also recently shown with Kinesin-5 live cell imaging for schizogony in Plasmodium berghei (PMID: 33154955), which the authors have omitted from their manuscript.

      We thank the reviewer for pointing out the Kinesin-5 study by Zeeshan et al., which we failed to cite and discuss. We now state the findings of this publication and put it into the context of our work (see also answer to next point). Microtubule associated proteins, such as the microtubule plus end tracking EB1 and the aforementioned Kinesin-5, are indeed useful markers to investigate microtubule dynamics leading to the interesting results shown by Zeeshan et al. Nevertheless, we want to point out that labelling microtubule associated proteins (MAPs) remains an approximation of the underlying microtubule organization. As the authors in Zeeshan et al. indicate by themselves, Kinesin-5 does not decorate axonemal microtubules or the membrane-associated microtubule structure formed during cytokinesis in very late schizont stages. Further, colocalization between alpha-tubulin and kinesin-5 in schizont-stage parasites is not complete indicating a preferential decoration of certain sections of the microtubule structures (possibly the microtubule ends), which could only be resolved by super-resolution microscopy. Using a live cell dye, such as SPY555-tubulin, which directly binds to microtubules will provide a uniform labelling of any microtubule species and hopefully prove useful to the field in the future. Lastly, we present time-lapse microscopy analysis of blood stage cells, contrary to single time point images of live cells, providing a quantified chronology of microtubule reorganization at single cell level (with time stamps). Therefore, we feel that our claim, although it should be relativized, is formally speaking accurate.

      It is also important that the authors properly discuss previous studies on the hemispindle and microtubule dynamics with respect to schizogony (PMID: 18693242; PMID: 11606229; PMID: 33154955), rather than giving the impression that they are the first to present this concept of hemispindle dynamics and centrin location during schizogony.

      We agree that those studies should be discussed in more detail. We are grateful to the reviewer for pointing out the Fowler et al. 2001 (PMID: 11606229) study. They use an antibody against gamma-tubulin to demonstrate its presence at the apical pole of subpellicular microtubules (f-MAST) in the merozoite and cytokinetic stages (line 102). However, we were unable to reveal a specific gamma-tubulin staining using the antibody used by them in the preceding schizont stage. After trying many different commercial gamma-tubulin antibodies and attempting to generate our own we now finally observe a gamma tubulin localization at the poles of intranuclear spindles in schizont stage, although the only successful antibody still displays some background staining, possibly including cross-reactivity with alpha or beta-tubulin (Fig. S4, line 237).

      The highly insightful study by Mahajan et al. 2008 (PMID: 18693242) indeed suggests that centrin localizes away from the DNA and demonstrate the distinct localization from tubulin. They, however, likely due to the resolution limit of their microscopy techniques, speculate that the centrin signal is embedded in the membrane, while we could show by super-resolution and nuclear pore staining that centrin is distinct from the membrane (now Fig. 2A; line 257). The work done by Zeeshan et al. 2020 (PMID: 33154955) nicely shows dynamics of kinesin-5 in nuclear division. In schizont stages Kinesin-5 signal elongates and splits alongside the mitotic spindle with which it overlaps for the most part. Colocalization with centrin is less strong although the authors note some overlap. Our data suggest that centrin and tubulin are clearly distinct. In male gametes the authors show nicely time-resolved data of kinesin spreading along the elongating spindle, although hemispindles are not observed at this stage. We introduce and discuss these findings (lines 123, 432).

      The concept of a bipartite centrosome has already been discussed in Toxoplasma, and the claim made by the authors for Plasmodium here is not substantiated experimentally. They showed that centrin is part of the outer region, but they do not show any marker for the inner region. It would be very helpful if the authors used gamma-tubulin or MORN1 to show the location with respect to centrin and microtubules. In the absence of this localisation, the claims are preliminary and speculative. If the centrosomal protein complex is not involved in microtubule nucleation, then how does nucleation happen? What are the molecules present in this amorphous matrix? It would be great to check the location of gamma-tubulin or some inner centrosome molecules described in Toxoplasma that is deemed to be the MTOC.

      We share the opinion that our Plasmodium data should be compared to Toxoplasma, while still being assessed independently. Although Toxoplasma belongs to the Apicomplexa, the conclusion that their centrosomes should be organized in a similar fashion is by no means self-evident considering, for example, their significant evolutionary distance. Actually, several noteworthy morphological differences have already been well documented. i) The Toxoplasma MTOC does contain centrioles in the outer core, which is coherent with the centrin and gamma-tubulin localization in this region. ii) The Toxoplasma MTOC contains an additional nuclear membrane protrusion enclosing the inner core. iii) Mitotic microtubules in Toxoplasma are thought to penetrate the nuclear membrane to connect to centromeres. iv) The inner and the outer core are both extranuclear and therefore not to be equated with the intranuclear compartments. We now expand a bit on the discussion of the aforementioned differences (line 382). Nevertheless, we thank the reviewer for making us realize that the term “bipartite” is a poor choice to describe the centriolar plaque organization in this context. Therefore, we replaced it in the abstract (line 29) and the main text (line 375).

      We acknowledge the fact that it would be desirable to show a marker localizing to the intranuclear compartment, and not only through visualizing the microtubule nucleation complex (Fig. 4A-B) and the positioning of the microtubule ends in this region (Fig. 3A). Concerning MORN1 we found no indication in the published localization data that it is, like in Toxoplasma, associated with the nucleus in Plasmodium species, where it is only found associated with the budding complex (and we are currently unable to procure an antibody) (line 422). We have attempted gamma-tubulin visualization on many occasions throughout the project (transgenic parasite lines, commercial antibodies, self-made antibodies) and only recently found an antibody revealing some specific signal. Indeed, we found localization at the poles of the spindles i.e. the intranuclear compartment (line 237). Unfortunately, this “best-possible staining” still showed some unspecific spindle staining likely resulting from cross-reactivity with alpha- or beta-tubulin causing us to put these data into the supplements (Fig. S4).

      We had more luck with attempting a “new” type of staining, recently used in Plasmodium (Bertiaux & Balestra et al. 2021) using a fluorophore-coupled NHS-Ester in expanded samples. This chemical unspecifically stains proteins and revealed that the centrosomal region contains an exceptionally protein dense “hourglass-shaped” structure (Fig. 3F-H). Since the outer part of this structure colocalizes with centrin and the inner part overlaps with microtubules we assume that the centrosomal complex stretches throughout the nucleo-cytoplasmic boundary and fills part of the intranuclear compartment (line 320). Especially the highly protein dense region at the neck of the “hourglass” seems very coherent with the nuclear membrane embedded electron dense region which can be seen in electron microscopy (e.g. Fig. 3E & 4B). We feel that this staining strongly supports the presence of this novel intranuclear compartment.

      The expansion microscopy is very nice, and some of the images presented in the supplement could be moved to the main section.

      Thanks for sharing our enthusiasm about this imaging technique. We have now selected a representative image of a hemispindle and mitotic spindle stage nucleus imaged by U-ExM and added it to the main section (Fig. 2B, line 231).

      The localisation of CenH3 is a bit puzzling, as it has been shown that centromeres/kinetochores cluster and are present during early and mid schizogony. The various foci with respect to nuclei are not what has been seen previously. Please discuss the difference between these two findings.

      The localization pattern can easily be explained by the increased resolution of the STED nanoscopy used in this study. Previous studies (e.g. Hoeijmakers et al. 2012 and Zeeshan et al. 2020) used classical confocal microscopy. Under those imaging conditions the individual foci seen here cannot be resolved and would, in accordance with the other studies, appear as one cluster. We slightly modified the text for more clarity (line 247).

      Reviewer #2 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      * This is more of a technical advancement on the subject of centrin, using STED, tomography and CLEM.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      * This work is relevant to cell division during schizogony in asexual stages, on par with Toxoplasma or Apicomplexa in general.

      • State what audience might be interested in and influenced by the reported findings.

      Researchers working on Apicomplexa, protists, cell division and mitosis.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Working on Cell division in Plasmodium.

      **Referee Cross-commenting**

      I agree with the other reviewers that some of the suggested experiments and the minor details have to be addressed. There are some loose ends, and these suggestions will enhance the clarity of the data. It is a very nice study, and some of the comments suggested by the reviewers will improve the manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The centrosome is the primary microtubule-organizing center (MTOC) in eukaryotic cells and nucleates the spindle microtubules necessary for chromosome segregation. In most eukaryotic cells, the canonical centrosome is composed of centrioles surrounded by an electron-dense proteinaceous matrix named the pericentriolar matrix (PCM), which is competent for microtubule nucleation and mitotic spindle assembly. Following the breakdown of the nuclear envelope, the mitotic spindle microtubules gain access to the kinetochores of the condensed mitotic chromosomes. Once the mitotic spindle is fully developed, centrosomes are at opposite poles of the cell, and chromosomes are pulled toward opposite poles. Cell division completes with cytokinesis resulting in the active formation of two nuclei within two daughter cells. Interestingly, during its asexual replication cycle, the malaria parasite Plasmodium falciparum undergoes multiple asynchronous rounds of mitosis with segregation of uncondensed chromosomes followed by nuclear division within an intact nuclear envelope. The multi-nucleated cell is then subjected to a single round of cytokinesis that produces dozens of daughter cells. What we know about the Plasmodium centrosome is that it is made of an acentriolar structure embedded in the nuclear envelope and serves as MTOC during cell division. However, the biogenesis and regulation of the Plasmodium centrosome are poorly understood. Given the peculiarity of cell division in Plasmodium parasites, understanding the molecular mechanisms that drive and regulate MTOC duplication and maturation could unveil novel targets for the treatment of malaria. In this study, Simon et al. successfully applied challenging and cutting-edge microscopy techniques to monitor the dynamic formation of the spindle microtubules and MTOC during Plasmodium intraerythrocytic mitosis. In addition, they remarkably combined stimulated emission depletion (STED) with ultrastructure expansion microscopy to define an uncharacterized intranuclear compartment devoid of chromatin as the nucleation site of nuclear microtubules. And lastly, the authors adapted an in-resin correlative light and electron microscopy (CLEM) approach to define the centriolar plaque position in a novel intranuclear compartment with centrosomal function.

      **Major comments:**

      1. In the methods section, it is stated that across this study, three different anti-tubulin antibodies (alpha-tubulin B-5-1-2, alpha-tubulin TAT1, beta-tubulin KMX1) were used, and two anti-centrin antibodies (TgCentrin1 and PfCentrin3) were used, one of which seems to have been generated in this study (anti-PfCentrin3). It is unclear in the figures or results section when each of these antibodies was used, and the authors should give a rationale for using multiple antibodies in combination.

      To label microtubules we used the mouse anti-alpha-tubulin B-5-1-2 (Sigma, T5168) antibody throughout the study, except for U-ExM, where we added two additional primary antibodies against tubulin. Due to the expansion of the samples, the antibody binding epitopes are stretched out in space. This causes a significant reduction of local epitope concentration (an expansion factor of 4.5 in all directions results in a ~90-fold increase in volume), which can reduce the signal intensity. Adding multiple antibodies binding different epitopes of tubulin can compensate for this dilution effect to some degree, as has been shown before by Gao et al. 2018. At the same time, the expansion contributes to the accessibility of the usually densely packed tubulin epitopes within the microtubule polymer, which certainly adds to the success of U-ExM. What the respective contributions of those effects are is not clear, but we found superior signal-to-noise ratios when combining three tubulin antibodies instead of using one. The TgCentrin1 antibody was only used in Fig. 2C (now Fig. 3B) and validated the localization pattern of our new PfCentrin3 antibody, which we used in the other pictures. We now provide a clearer description of antibody usage in the methods section and a new supplemental table.
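
      The dilution argument above is cube-law arithmetic: isotropic expansion by a linear factor scales the volume, and hence dilutes the epitopes, by the cube of that factor. A minimal sketch follows; the function names are illustrative and not taken from the manuscript.

```python
def volume_fold_increase(linear_factor: float) -> float:
    """Fold increase in volume for an isotropic linear expansion factor."""
    return linear_factor ** 3


def relative_epitope_density(linear_factor: float) -> float:
    """Epitope density after expansion, relative to the unexpanded sample.

    Assumes epitopes are conserved and only spread out in space.
    """
    return 1.0 / volume_fold_increase(linear_factor)


# Linear factor reported in the reply (4.5) and the previously
# published P. falciparum factor mentioned by the reviewer (4.3).
for factor in (4.5, 4.3):
    fold = volume_fold_increase(factor)
    print(f"linear x{factor}: volume x{fold:.1f}, "
          f"relative density {relative_epitope_density(factor):.4f}")
```

      Under these assumptions, a linear factor of 4.5 corresponds to a volume increase of about 91-fold, while 4.3 corresponds to about 80-fold.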

      The anti-PfCentrin3 antibody seems to have been generated for this study. If this is the case, the authors should provide evidence that this antibody binds to the recombinant PfCentrin3 it was raised against and binds PfCentrin3 in parasite lysates.

      The anti-PfCentrin3 antibody was, indeed, produced for this study and we should have provided our western blot data right away. We now show the requested blot, which shows bands at the appropriate size in parasite lysate as well as for the recombinant protein, in the supplements (Fig. S2, line 178).

      In the first paragraph of the Results section, the authors remark of centrin foci that they are "...only detectable later (Mov. S2) or sometimes not at all." In Figure 1 A-C, it is implied that the first observed division is the first nuclear division of that parasite. Given that some nuclei do not have a visible centrin focus, it cannot be concluded with certainty that these parasites only contain a single nucleus and that this is their first division. The authors would need to include a quantifiable DNA stain to show unequivocally that there is a single nucleus that has undergone DNA replication, similar to the Klaus et al., 2021 bioRxiv paper. In the absence of a DNA stain, the authors should reword to clarify that this is the first observed division and speculate that it is the first division of that nucleus, but the authors should draw no firm conclusions about the first division.

      Indeed, the variability in protein levels that can result from exogenous expression can lead to some cells not showing clear Centrin1-GFP foci. Although this is a rare event, we wanted to acknowledge this observation. The live-cell microtubule staining with SPY555-Tubulin that we use is, however, highly specific and sensitive and would stain any nucleus undergoing division, including the first one. If there were more than one nucleus in the observed cell, it would unequivocally show two clearly separated tubulin signals (hemispindle or mitotic spindle). To illustrate this, we added Fig. 1B (line 148) showing two live parasites stained with SPY555-Tubulin plus a Hoechst-based dye, with one or two nuclei alongside the corresponding tubulin signal. We modified the text to clarify how we stage the parasite for time-lapse acquisition (line 154). We have already experimented extensively with state-of-the-art fluorogenic live-cell DNA dyes (e.g. from Spirochrome and the Johnsson group) to visualize the nuclei directly in time-lapse microscopy, but even at minimal concentrations they all significantly inhibit mitotic progression. We have also added this information to the main text (line 150).

      In the first paragraph of the Results section, the authors write: "We quantified the duration of hemispindle, accumulation and anaphase stages ...." Anaphase means that the sister chromatids are separated. In the absence of a centromeric marker like NDC80, it is difficult to claim the anaphase stage. The authors should write "extended spindle." The authors might also consider using the term collapsed spindle instead of accumulation to reflect the dynamics of the intranuclear microtubules during blood-stage replication. The same modification should be made for Figure 1B, so that it reads "hemispindle, collapsed spindle and extended spindle."

      We thank the reviewer for this suggestion, which is very much in line with a comment by Reviewer 1 on the definition of anaphase. We acknowledge that the term is ill-defined here. Further, it suggests a mitotic morphology analogous to the one observed in “classical” models (prophase, metaphase, anaphase,…), which is not fully appropriate. Consequently, we decided to adopt the suggested terminology in Fig. 1 (and also new Fig. 2) and in the text (line 160).

      Based on the evidence in this study, it cannot be stated unequivocally that the centrosome is entirely extranuclear, at least not as it is implied in Figure 3C. In Supplementary Figure 4, the microtubules appear to be extruding from a circular structure that may either be intranuclear or span the nuclear envelope. In Supplementary Figure 6, the structure pointed to as the centrosome appears to be embedded within the nuclear membrane with a top structure on the cytosolic side of the nuclear envelope. Thus, the best support for an extranuclear centrosome comes from the CLEM images. Still, it is noteworthy that the double membrane of the nuclear envelope is not visible on this slice in the region where the centrin fluorescence is found. Considering some of the fluorescence pixels for centrin are outside the parasite plasma membrane, and some of the Hoechst pixels are outside the nuclear envelope, this data does not show unequivocally that centrosomes are entirely extranuclear. However, this argument would be strengthened if the authors performed a proteinase K protection assay (or something similar) to determine if Centrin1 and Centrin3 are exposed to the cytosol. However, in the absence of that or further evidence, the authors should dampen their claims about the centrosome being exclusively extranuclear, as represented in the schematic in figure 3C.

      We thank the reviewer for this comment, which highlights an issue in our communication of our working model of the centriolar plaque. At no point did we intend to claim that the centrosome is exclusively extranuclear. Rather, centrins, which are currently the only reliable marker proteins, localize to a subcompartment of the extranuclear region of the centriolar plaque. Additionally, the centrosome clearly contains an intranuclear region. The composition of this intranuclear compartment is elusive, except that it harbors microtubule nucleation sites. Indeed, our model in Fig. 3C (now Fig. 4C) is misleading and not well annotated. The newly added NHS-Ester staining fortifies this claim (Fig. 3F-H). Consequently, we corrected our working model by adding explicit figure labelling (now Fig. 4C).

      We apologize for the misleading labelling in Fig. S6 (now Fig. S7). The green arrow was intended to point out the electron dense region associated with the nuclear membrane, which has been seen in previous studies, and was not intended to represent the entire extended centriolar plaque. If anything, this smaller region might provide the link between the intra and extranuclear compartments that the reviewer also identified in Suppl. Fig. 4 (now Fig. 2D). We modified the annotation of the Fig. S7 and Fig. 4A-B accordingly, labelling it the “electron dense region”. More importantly, we hope that our newly added data using NHS-Ester staining of protein dense regions (Fig. 3F-H) highlights the spread of the centrosome across the nucleo-/cytoplasmic boundary more clearly.

      Concerning whether centrin is actually extranuclear, we feel that the data shown in Fig. 2A (now 3A) are convincing. We have, however, added two panels of the relevant regions showing centrin localization relative to the nuclear pore and adjusted the contrast, as we acknowledge the limited “visibility” within the unadjusted panels. The fact that the centrin signal slightly overlaps with the nuclear envelope in CLEM images can be explained by the relatively poor resolution of the widefield microscope we had to use to image the sections. From the other super-resolution images in the manuscript, we know that the perimeter of the better resolved centrin signal is significantly smaller. Otherwise one would have to assume from the CLEM data that centrin is also in the cytosol of the red blood cell and that DNA is localized outside the nucleus. On a similar note, the fluorescence image is, contrary to the tomography image, a single slice, since the thickness of the sample section (about 200 nm) is significantly below the z-resolution (about 500 nm) of a fluorescence microscope.

      Throughout the study, the level of biological replication is unclear. The authors rigorously include all the data points for each of their graphs and the total number of images/videos quantified. What needs to be added, in either the figure legends or a methods section, is the number of biological replicates each of these measures came from.

      We have added the number of replicates in the figure legends.

      **Minor comments:**

      STED is present as an acronym in the abstract and should be spelled out in full and clarified that it is a super-resolution microscopy technique.

      We opted to remove STED from the abstract (leaving it at super-resolution, which includes expansion microscopy) to avoid disrupting the “flow” of the abstract and now spell out the acronym at the first mention in the introduction (line 127).

      The second paragraph of the Results section states that ring and early trophozoite stage parasites do not express tubulin or centrin. Still, only an early trophozoite is shown in Supplementary Figure 2. Therefore, the authors should either include a similar image of a ring-stage parasite or remove ring-stage parasites from that statement.

      We have removed the ring stages from the statement.

      The second paragraph of the Results section contains the sentence, "At which point tubulin is reorganized into the bipolar microtubule array, which then forms the mitotic spindle cannot be resolved here." The authors are implying that the point at which tubulin is reorganized into the microtubule array, which goes on to form the mitotic spindle, cannot be resolved here. This is not particularly clear, though, and this sentence could be reworded for clarity.

      We reformulated the sentence to clarify the point we failed to make with the previous wording (line 188).

      The second paragraph of the Results section contains some statements about the results without referencing the figures that these statements come from. The authors should clarify this to make clear which figures each statement refers to.

      We added more references to the appropriate figure throughout the paragraph (lines 188, 219, 223).

      In the third paragraph of the introduction section, the authors write, "Centriolar plaques seem partially embedded in the nuclear membrane, but their positioning relative to the nuclear pore-like "fenestra" remains unclear." Unfortunately, the lack of a reference did not allow me to understand whether the authors are citing the literature or commenting on past published results.

      We added the reference which was incorrectly positioned before the sentence instead of at the end (line 82).

      The authors could add some references:

      • Second section of the introduction: "the 8-28 nuclei are packaged into individual daughter cells, called merozoites" (Rudlaff et al. 2019 PMID: 31097714)

      • Third section of the introduction: "The centrosome of P. falciparum is called centriolar plaque" (Arnot et al. 2011, Sinden 1991a); "the nuclear pore-like "fenestra" remains unclear" (Wall et al. 2018; Zeeshan et al. 2020).

      • Fourth section of the introduction: "tubulin antibody staining are extensive structures measuring around 2-4 µm" (Ref?)

      • When the authors introduce subpellicular microtubules of segmented schizonts, a reference to a study that shows these structures should be included.

      • A previous study that shows the distinct structure of microtubule minus ends should be cited when this structure is described.

      • Third section of the results, the authors should cite Bertiaux et al. 2021 with the Gambarotto et al. 2019 paper regarding U-ExM.

      We apologize for missing some important references or putting them in the wrong position. We have now added all the references or cited them again at the appropriate locations throughout the text.

      Figure 1E shows hemispindle and mitotic spindle lengths of U-ExM expanded parasites, but the position within the figure and figure legend implies that these lengths were determined in unexpanded parasites. Therefore, it should be stated in the figure legend that these measurements come from U-ExM expanded parasites. Moreover, I encourage the authors to include U-ExM images in the main figures. The images are beautiful, represent a significant technical achievement, and directly relate to Figure 1E. To the best of my knowledge, this is only the second study to perform expansion microscopy on Plasmodium and the first to use PFA-fixed parasites and a nuclear stain. It would be valuable for the Plasmodium and ExM communities to see this technical advancement represented in the main text.

      We thank the reviewer for the appreciation of our ExM data and added it to Fig. 2B before the quantification of the microtubule length and number and added the information to the legend.

      In the second paragraph of the Results section, the authors write, "but clearly display the microtubule cytoskeleton associated with the inner membrane complex." It would bring clarity to define in a few words what the IMC is.

      We included a short definition of the IMC (line 223).

      The methods section details that the length of microtubules was determined by dividing the observed values by an expansion factor of 4.5. If the authors recorded the expansion factors of their gels, this data should be included, and how it was recorded should be stated in the methods. If not, the authors should include the rationale of using an expansion factor of 4.5 as this is slightly different from the previously published expansion factor of P. falciparum of 4.3.

      We recorded the expansion factor by measuring the gel size pre and post expansion with a ruler and found a factor of 4.5 on average. We added this information in the methods (line 688).
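
      The correction described above can be sketched as follows: the expansion factor is the ratio of post- to pre-expansion gel size, and lengths measured in expanded samples are divided by that factor. The gel sizes and the measured length below are invented for illustration and are not data from the study.

```python
from statistics import mean


def expansion_factor(pre_size_mm: float, post_size_mm: float) -> float:
    """Linear expansion factor from gel size before and after expansion."""
    if pre_size_mm <= 0:
        raise ValueError("pre-expansion size must be positive")
    return post_size_mm / pre_size_mm


def to_unexpanded(length_um: float, factor: float) -> float:
    """Convert a length measured in an expanded sample to its real size."""
    return length_um / factor


# Hypothetical ruler measurements (pre, post) in mm for three gels.
gels = [(12.0, 54.0), (12.0, 53.0), (12.0, 55.0)]
avg_factor = mean(expansion_factor(pre, post) for pre, post in gels)

# Hypothetical microtubule length measured in the expanded gel, in µm.
measured_um = 9.0
print(f"average expansion factor: {avg_factor:.2f}")
print(f"corrected length: {to_unexpanded(measured_um, avg_factor):.2f} µm")
```

      Averaging the per-gel factors, rather than assuming a fixed literature value, keeps the correction tied to the gels that were actually imaged.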

      There are several parasite lines used in this study, and some figures are not clear what parasite line was used. Could the authors please include the parasite lines in the figure legends of Figure 1 D-F, Figure 3, Supplementary Figures 1-2, and Supplementary Figures 4-7?

      We added the parasite line information in the legends as requested.

      Nuclear pore complexes, of which Nup313 is a component, can have cytoplasmic, integral, and nuclear-facing components. If it has been shown previously that PfNup313 is the homolog of Nup214 in vertebrates present on the cytosolic side of NPC, this should be stated. If not, then it should be clarified that it is unknown whether Nup313 faces the cytoplasm, nucleus, or is embedded in the NE, as this has implications for the colocalization of Nup313 and Centrin.

      Nuclear pore proteins are very poorly conserved in P. falciparum, and Nup313 has only recently been identified as such (Kehrer et al. 2018), mainly by the presence of FG-repeats (as for all the other newly defined proteins). The only related ortholog that can be found through BLAST search against humans, yeasts, and Arabidopsis is Nup100 from S. cerevisiae. ScNup100 is a central pore localizing protein, but the sequence similarity to Nup313 is low. We are not aware of any findings showing relatedness to vertebrate Nup214, while sequence analysis rather indicates the absence of orthology. To clearly demonstrate the individual positioning of the few known Nups within the parasite's nuclear pore complex would require a dedicated long-term project. However, due to the presence of FG-repeats one can assume that it is part of the central FG-Nups layer rather than of the intranuclear basket or the cytoplasmic filaments (line 255). Therefore it would localize more closely to the nuclear envelope than the latter. Either way, a clear gap between centrin and Nup313 signal can be identified and colocalization has not been observed. These data indicate that the exact position of Nup313 on the cytoplasmic, integral or nuclear-facing side is not decisive for the conclusions made in this study, and our observations preclude scenarios where centrin is not extranuclear.

      It seems from the image in Figure 2C that DRAQ5 and Hoechst have at least visually indistinguishable localizations. Have the authors taken any STED deconvolved images of nuclei stained with both Hoechst and DRAQ5? Considering the striking increase in detail of the Hoechst signal in STED deconvolved images, it may be informative both to this study and to people who work on chromatin organization what the chromatin staining looks like in the absence of bias towards chromatin state.

      It would, indeed, be interesting to analyse chromatin organization by those means, but DRAQ5 is not a STED-compatible dye and is highly prone to bleaching, and is therefore not suitable for such analysis. Being an infrared dye, DRAQ5 also yields a lower spatial resolution than the UV-excited Hoechst, since resolution is limited by the emission wavelength.

      For the tomography and TEM images, the centrosome is indicated with an arrow, but it isn't entirely clear what that arrow is pointing to for some images. It would be clearer if the centrosome were outlined in green, like the NE, rather than just an arrow. This is particularly important for Supplementary Figure 4, where to my eye, it appears that the microtubules inside the chromatin-free region are coming directly out of a circular structure, which could be interpreted as the centriolar plaque.

      The reviewer is right to point out the use of arrows for centrosome annotation. It was intended to orient readers by indicating the “likely position of the centriolar plaque”, since a clear boundary around the centrosome cannot be defined. It would have been more precise to indicate that the arrow is pointing at the electron dense region associated with the nuclear membrane, which is of course only one of the sub-regions of the centrosome. This is particularly important since we want to emphasize the extended dimensions of the centrosome. Consequently, we modified the annotation to “electron dense region” in all concerned figures and corresponding legends.

      The ordering of Figure 2A-C seems to imply that the DNA-free region was measured in the STED deconvolved images, but the methods imply that it was in the confocal images. The authors should clarify this in the figure legend or by rearranging B and C's order.

      Hoechst signal was indeed acquired and measured in confocal mode and to avoid confusion we have changed the order of the figures (now Fig. 3B-C) as suggested.

      The authors should provide some more detail on how the DNA-free zone was measured. For example, was it measured on single slices or maximum intensity projections? Was it measured from the middle, far, or near side of the centrin focus? Etc.

      The measurement was carried out in the slice where the DNA-free zone was in focus. Depth was measured from below the centrin signal until the “bottom” of the DNA-free zone. We hoped that the little schematic above the figure would clarify this question, but acknowledge the need to more clearly explain the measurement method, which we now do in the corresponding figure legend (Fig. 3C).

      The methods state that the mCherry signal in figure 2C was detected using a mCherry nanobody. This should be clarified in the figure legends as it currently seems as if we see endogenous mCherry fluorescence.

      The visible signal is certainly a combination of the mCherry plus the “boosting” effect from the Atto594-coupled nanobody that we added. Clearly, this should be mentioned in the figure legend, which we now do.

      The data in Supplementary Figure 4 seems vital to the interpretation of the study. Therefore, for clarity, I encourage the authors to include Supplementary Figure 4 in Figure 2.

      We share the reviewer's view on these data and moved them to the main figures (now Fig. 3D).

      In the last sentence of the discussion, it is unclear what the authors mean by how the nuclear compartment "splits," could they please clarify?

      We were referring to the event of centrosome duplication, which has to occur during nuclear division. In a structure without centrioles or a spindle pole body structure forming a half bridge, we therefore need a new model to explain how the two poles of the spindle are formed. Potential modes are splitting or de novo assembly. This aspect, as also pointed out by another reviewer, warrants a bit more explanation, which can now be found in the discussion (line 468).

      If the pArl-PfCentrin3-GFP plasmid or pDC2-cam-coCas9-U6.2-hDHFR have been published previously, the respective studies should be cited. If not, the study where the vector backbones were first established should be cited.

      We have now cited the original studies publishing the vector backbones for the first time in the methods (lines 490, 501).

      From the current text, it is not clear that the Nup313 tagged parasites also had a GlmS ribozyme. It is shown in Supplementary Figure 3, but the authors should clarify either in the text of the results, or figure legends, that this parasite line was Nup313_3xHA_GlmS

      The Nup313-tagged line indeed has a glms ribozyme after the HA-tag, which we now mention in the figure legends.

      In the plasmid constructs section of the methods, the authors list several primers by number but not by sequence. Instead, the authors should include the sequence and orientation of each of the primers mentioned in a table as supplementary data.

      This is a good suggestion. We have generated a table at the end of the supplementary data file and on this occasion we also added tables of all the antibodies and dyes used in this study.

      The authors should cite the study where the TgCentrin1 antibody was generated and provide the Rat anti-HA 3F10 antibody catalog number, as catalog numbers are provided for other commercial primary antibodies.

      We now provide the missing catalog numbers in the supplemental data table.

      There is an issue with the formatting of the journal-title in the Kukulski et al. reference.

      Thank you for noticing this error, which we have now corrected.

      Reviewer #3 (Significance (Required)):

      The genome of P. falciparum is fully sequenced; however, over 50% of encoded proteins are of unknown function, with many of these proteins unique to Plasmodium parasites. By identifying and characterizing essential biological processes, especially those divergent from human host cell processes, we will formulate ways to interfere with them by developing novel antimalarial drugs. The process of Plasmodium cell division differs from the classical cell cycle of its human host. In the study led by Caroline Simon, authors successfully utilized recent developments of super-resolution microscopies on expanded parasites to identify novel features of cell division machinery of the malaria blood-stage parasite.

      Simon et al.'s work highlights the growing interest in the diversity of cell division modes of Apicomplexan parasites, which will likely contribute to a deeper understanding of the origin and functional role of the centrosome in eukaryotic life. In 2020, the Open Biology journal published a unique article collection named Focus on Centrosome Biology showcasing research that advanced our knowledge on centrosome function, evolution and abnormalities. In addition, the reported findings will interest research groups studying cell cycle regulation and evolution beyond the field of parasitology.

      Our lab studies the peculiar cell cycle of Plasmodium falciparum to gain a functional understanding of mechanistic principles of nuclear envelope assembly and integrity during the cell division of the human malaria parasite.

      **Referee Cross-commenting**

      It is a wonderful study, and once all reviewers' comments are addressed, the manuscript should be in excellent shape for publication.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The centrosome is the primary microtubule-organizing center (MTOC) in eukaryotic cells and nucleates the spindle microtubules necessary for chromosome segregation. In most eukaryotic cells, the canonical centrosome is composed of centrioles surrounded by an electron-dense proteinaceous matrix named the pericentriolar matrix (PCM), which is competent for microtubule nucleation and mitotic spindle assembly. Following the breakdown of the nuclear envelope, the mitotic spindle microtubules gain access to the kinetochores of the condensed mitotic chromosomes. Once the mitotic spindle is fully developed, centrosomes are at opposite poles of the cell, and chromosomes are pulled toward opposite poles. Cell division completes with cytokinesis resulting in the active formation of two nuclei within two daughter cells. Interestingly, during its asexual replication cycle, the malaria parasite Plasmodium falciparum undergoes multiple asynchronous rounds of mitosis with segregation of uncondensed chromosomes followed by nuclear division within an intact nuclear envelope. The multi-nucleated cell is then subjected to a single round of cytokinesis that produces dozens of daughter cells. What we know about the Plasmodium centrosome is that it is made of an acentriolar structure embedded in the nuclear envelope and serves as MTOC during cell division. However, the biogenesis and regulation of the Plasmodium centrosome are poorly understood. Given the peculiarity of cell division in Plasmodium parasites, understanding the molecular mechanisms that drive and regulate MTOC duplication and maturation could unveil novel targets for the treatment of malaria. In this study, Simon et al. successfully applied challenging and cutting-edge microscopy techniques to monitor the dynamic formation of the spindle microtubules and MTOC during Plasmodium intraerythrocytic mitosis. In addition, they remarkably combined stimulated emission depletion (STED) with ultrastructure expansion microscopy to define an uncharacterized intranuclear compartment devoid of chromatin as the nucleation site of nuclear microtubules. And lastly, the authors adapted an in-resin correlative light and electron microscopy (CLEM) approach to define the centriolar plaque position in a novel intranuclear compartment with centrosomal function.

      Major comments:

      1. In the methods section, it is stated that across this study, three different anti-tubulin antibodies (alpha-tubulin B-5-1-2, alpha-tubulin TAT1, beta-tubulin KMX1) were used, and two anti-centrin antibodies (TgCentrin1 and PfCentrin3) were used, one of which seems to have been generated in this study (anti-PfCentrin3). It is unclear in the figures or results section when each of these antibodies was used, and the authors should give a rationale for using multiple antibodies in combination.
      2. The anti-PfCentrin3 antibody seems to have been generated for this study. If this is the case, the authors should provide evidence that this antibody binds to the recombinant PfCentrin3 it was raised against and binds PfCentrin3 in parasite lysates.
      3. In the first paragraph of the Results section, the authors remark of centrin foci that they are "...only detectable later (Mov. S2) or sometimes not at all." In Figure 1 A-C, it is implied that the first observed division is the first nuclear division of that parasite. Given that some nuclei do not have a visible centrin focus, it cannot be concluded with certainty that these parasites only contain a single nucleus and that this is their first division. To show this unequivocally, the authors would need to include a quantifiable DNA stain demonstrating a single nucleus that has undergone DNA replication, similar to the Klaus et al., 2021 bioRxiv paper. In the absence of a DNA stain, the authors should reword to clarify that this is the first observed division and speculate that it is the first division of that nucleus, but the authors should draw no firm conclusions about the first division.
      4. In the first paragraph of the Results section, the authors write: "We quantified the duration of hemispindle, accumulation and anaphase stages ...." Anaphase implies that the sister chromatids are separated. In the absence of a centromeric marker like NDC80, it is difficult to claim the anaphase stage. The authors should write "extended spindle." The authors might also consider using the term "collapsed spindle" instead of "accumulation" to reflect the dynamics of the intranuclear microtubules during blood-stage replication. The same modification should be made for Figure 1B, so that it reads "hemispindle, collapsed spindle and extended spindle."
      5. Based on the evidence in this study, it cannot be stated unequivocally that the centrosome is entirely extranuclear, at least not as it is implied in Figure 3C. In Supplementary Figure 4, the microtubules appear to be extruding from a circular structure that may either be intranuclear or span the nuclear envelope. In Supplementary Figure 6, the structure pointed to as the centrosome appears to be embedded within the nuclear membrane with a top structure on the cytosolic side of the nuclear envelope. Thus, the best support for an extranuclear centrosome comes from the CLEM images. Still, it is noteworthy that the double membrane of the nuclear envelope is not visible on this slice in the region where the centrin fluorescence is found. Considering that some of the fluorescence pixels for centrin are outside the parasite plasma membrane, and some of the Hoechst pixels are outside the nuclear envelope, these data do not show unequivocally that centrosomes are entirely extranuclear. The argument would be strengthened if the authors performed a proteinase K protection assay (or something similar) to determine whether Centrin1 and Centrin3 are exposed to the cytosol. In the absence of that or further evidence, the authors should temper their claims about the centrosome being exclusively extranuclear, as represented in the schematic in Figure 3C.
      6. Throughout the study, the level of biological replication is unclear. The authors rigorously include all the data points for each of their graphs and the total number of images/videos quantified. What needs to be added, in either the figure legends or a methods section, is the number of biological replicates that each of these measures came from.

      Minor comments:

      1. STED is present as an acronym in the abstract and should be spelled out in full and clarified that it is a super-resolution microscopy technique.
      2. The second paragraph of the Results section states that ring and early trophozoite stage parasites do not express tubulin or centrin. Still, only an early trophozoite is shown in Supplementary Figure 2. Therefore, the authors should either include a similar image of a ring-stage parasite or remove ring-stage parasites from that statement.
      3. The second paragraph of the Results section contains the sentence, "At which point tubulin is reorganized into the bipolar microtubule array, which then forms the mitotic spindle cannot be resolved here." The authors are implying that the point at which tubulin is reorganized into the microtubule array, which goes on to form the mitotic spindle, cannot be resolved here. This is not particularly clear, though, and this sentence could be reworded for clarity.
      4. The second paragraph of the Results section contains some statements about the results without referencing the figures that these statements come from. The authors should clarify this to make clear which figures each statement refers to.
      5. In the third paragraph of the introduction section, the authors write, "Centriolar plaques seem partially embedded in the nuclear membrane, but their positioning relative to the nuclear pore-like "fenestra" remains unclear." Unfortunately, the lack of a reference did not allow me to tell whether the authors are citing the literature or commenting on previously published results.
      6. The authors could add some references:
         • Second section of the introduction: "the 8-28 nuclei are packaged into individual daughter cells, called merozoites" (Rudlaff et al. 2019, PMID: 31097714).
         • Third section of the introduction: "The centrosome of P. falciparum is called centriolar plaque" (Arnot et al. 2011, Sinden 1991a); "the nuclear pore-like "fenestra" remains unclear" (Wall et al. 2018; Zeeshan et al. 2020).
         • Fourth section of the introduction: "tubulin antibody staining are extensive structures measuring around 2-4 µm" (Ref?).
         • When the authors introduce subpellicular microtubules of segmented schizonts, a reference to a study that shows these structures should be included.
         • A previous study that shows the distinct structure of microtubule minus ends should be cited when this structure is described.
         • Third section of the results: the authors should cite Bertiaux et al. 2021 with the Gambarotto et al. 2019 paper regarding U-ExM.
      7. Figure 1E shows hemispindle and mitotic spindle lengths of U-ExM expanded parasites, but the position within the figure and figure legend implies that these lengths were determined in unexpanded parasites. Therefore, it should be stated in the figure legend that these measurements come from U-ExM expanded parasites. Moreover, I encourage the authors to include U-ExM images in the main figures. The images are beautiful, represent a significant technical achievement, and directly relate to Figure 1E. To the best of my knowledge, this is only the second study to perform expansion microscopy on Plasmodium and the first to use PFA-fixed parasites and a nuclear stain. It would be valuable for the Plasmodium and ExM communities to see this technical advancement represented in the main text.
      8. In the second paragraph of the Results section, the authors write, "but clearly display the microtubule cytoskeleton associated with the inner membrane complex." It would bring clarity to define in a few words what the IMC is.
      9. The methods section details that the length of microtubules was determined by dividing the observed values by an expansion factor of 4.5. If the authors recorded the expansion factors of their gels, this data should be included, and how it was recorded should be stated in the methods. If not, the authors should include the rationale of using an expansion factor of 4.5 as this is slightly different from the previously published expansion factor of P. falciparum of 4.3.
      10. There are several parasite lines used in this study, and for some figures it is not clear which parasite line was used. Could the authors please include the parasite lines in the figure legends of Figure 1 D-F, Figure 3, Supplementary Figures 1-2, and Supplementary Figures 4-7?
      11. Nuclear pore complexes, of which Nup313 is a component, can have cytoplasmic, integral, and nuclear-facing components. If it has been shown previously that PfNup313 is the homolog of Nup214 in vertebrates present on the cytosolic side of NPC, this should be stated. If not, then it should be clarified that it is unknown whether Nup313 faces the cytoplasm, nucleus, or is embedded in the NE, as this has implications for the colocalization of Nup313 and Centrin.
      12. It seems from the image in Figure 2C that DRAQ5 and Hoechst have at least visually indistinguishable localizations. Have the authors taken any STED deconvolved images of nuclei stained with both Hoechst and DRAQ5? Considering the striking increase in detail of the Hoechst signal in STED deconvolved images, it may be informative both to this study and to people who work on chromatin organization what the chromatin staining looks like in the absence of bias towards chromatin state.
      13. For the tomography and TEM images, the centrosome is indicated with an arrow, but it isn't entirely clear what that arrow is pointing to for some images. It would be clearer if the centrosome were outlined in green, like the NE, rather than just an arrow. This is particularly important for Supplementary Figure 4, where to my eye, it appears that the microtubules inside the chromatin-free region are coming directly out of a circular structure, which could be interpreted as the centriolar plaque.
      14. The ordering of Figure 2A-C seems to imply that the DNA-free region was measured in the STED deconvolved images, but the methods imply that it was in the confocal images. The authors should clarify this in the figure legend or by rearranging B and C's order.
      15. The authors should provide some more detail on how the DNA-free zone was measured. For example, was it measured on single slices or maximum intensity projections? Was it measured from the middle, far, or near side of the centrin focus? Etc.
      16. The methods state that the mCherry signal in figure 2C was detected using a mCherry nanobody. This should be clarified in the figure legends as it currently seems as if we see endogenous mCherry fluorescence.
      17. The data in Supplementary Figure 4 seems vital to the interpretation of the study. Therefore, for clarity, I encourage the authors to include Supplementary Figure 4 in Figure 2.
      18. In the last sentence of the discussion, it is unclear what the authors mean by how the nuclear compartment "splits," could they please clarify?
      19. If the pArl-PfCentrin3-GFP plasmid or pDC2-cam-coCas9-U6.2-hDHFR have been published previously, the respective studies should be cited. If not, the study where the vector backbones were first established should be cited.
      20. From the current text, it is not clear that the Nup313-tagged parasites also had a GlmS ribozyme. It is shown in Supplementary Figure 3, but the authors should clarify, either in the text of the results or in the figure legends, that this parasite line was Nup313_3xHA_GlmS.
      21. In the plasmid constructs section of the methods, the authors list several primers by number but not by sequence. Instead, the authors should include the sequence and orientation of each of the primers mentioned in a table as supplementary data.
      22. The authors should cite the study where the TgCentrin1 antibody was generated and provide the Rat anti-HA 3F10 antibody catalog number, as catalog numbers are provided for other commercial primary antibodies. There is an issue with the formatting of the journal-title in the Kukulski et al. reference.
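
      To illustrate why the choice of expansion factor in minor comment 9 matters, the correction can be sketched as follows. This is a minimal sketch only: the function name and the post-expansion lengths are hypothetical and are not taken from the manuscript; only the factors 4.5 and 4.3 come from the comment above.

```python
def corrected_length(measured_um: float, expansion_factor: float) -> float:
    """Convert a length measured in the expanded gel back to pre-expansion scale."""
    if expansion_factor <= 0:
        raise ValueError("expansion factor must be positive")
    return measured_um / expansion_factor


# Hypothetical post-expansion lengths in micrometres (illustration only).
measured_lengths = [9.0, 13.5]

for measured in measured_lengths:
    with_4_5 = corrected_length(measured, 4.5)  # factor used in the methods
    with_4_3 = corrected_length(measured, 4.3)  # previously published factor
    print(f"{measured} um expanded: {with_4_5:.2f} um (4.5) vs {with_4_3:.2f} um (4.3)")

# Relative increase in every corrected length when 4.3 is used instead of 4.5.
relative_shift = 4.5 / 4.3 - 1
print(f"relative shift: {relative_shift:.1%}")
```

      Under these assumptions, switching between the two factors changes every reported length by the same few percent, which is why recording the measured expansion factor per gel would be informative.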

      Significance

      The genome of P. falciparum is fully sequenced; however, over 50% of encoded proteins are of unknown function, with many of these proteins unique to Plasmodium parasites. By identifying and characterizing essential biological processes, especially those divergent from human host cell processes, we will formulate ways to interfere with them by developing novel antimalarial drugs. The process of Plasmodium cell division differs from the classical cell cycle of its human host. In the study led by Caroline Simon, the authors successfully utilized recent developments in super-resolution microscopy on expanded parasites to identify novel features of the cell division machinery of the malaria blood-stage parasite.

      Simon et al.'s work highlights the growing interest in the diversity of cell division modes of Apicomplexan parasites, which will likely contribute to a deeper understanding of the origin and functional role of the centrosome in eukaryotic life. In 2020, the Open Biology journal published a unique article collection named Focus on Centrosome Biology showcasing research that advanced our knowledge on centrosome function, evolution and abnormalities. In addition, the reported findings will interest research groups studying cell cycle regulation and evolution beyond the field of parasitology.

      Our lab studies the peculiar cell cycle of Plasmodium falciparum to gain a functional understanding of mechanistic principles of nuclear envelope assembly and integrity during the cell division of the human malaria parasite.

      Referee Cross-commenting

      It is a wonderful study, and once all reviewers' comments are addressed, the manuscript should be in excellent shape for publication.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Simon et al. uses advanced cell biology techniques such as STED, expansion microscopy and live-cell imaging to decipher the configuration of microtubules, centrin and nuclear pores during the unconventional cell division process of the malaria parasite. The authors show the dynamics of centrin and its localisation with respect to the centriolar plaque that is characteristic of these parasite cells during schizogony. They also infer from their studies that there is an extended intranuclear compartment which is devoid of chromatin.

      Major Comments

      • Are the key conclusions convincing? Yes to some extent
      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Some parts are preliminary and speculative, as there is no solid data supporting them. Please see below.
      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. Yes, to substantiate their claims.
      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. They can do these quite quickly, in less than a month.
      • Are the data and the methods presented in such a way that they can be reproduced? Yes
      • Are the experiments adequately replicated and statistical analysis adequate? Yes

      The authors present beautiful imaging and some in-depth structural data using tomography and CLEM to show the location of centrin, which is generally considered the marker for the centrosome or microtubule organising centre in the malaria parasite. These approaches have not yet been applied in Plasmodium and are hence very informative. Although the authors present some advanced microscopy, many of these concepts regarding the hemispindle were shown earlier in several light and super-resolution microscopy studies. The authors claim that they are the first to show that there is space between centrin and the nucleus, but this has been shown previously in centrin studies in Plasmodium berghei using super-resolution microscopy (Roques et al 2019, Fig. 1 and supplementary videos 1 & 2) as well as, more recently, by expansion microscopy by the group of Brochet et al. 2021. In addition, microtubule dynamics during schizogony in Plasmodium berghei were also recently shown with Kinesin5 live-cell imaging (PMID: 33154955), which the authors have omitted from their manuscript. It is also important that the authors properly discuss previous studies on the hemispindle and microtubule dynamics with respect to schizogony (PMID: 18693242; PMID: 11606229; PMID: 33154955), rather than giving the impression that they are the first to present the concept of hemispindle dynamics and centrin location during schizogony. The concept of a bipartite centrosome has already been discussed in Toxoplasma, and the authors' claim for Plasmodium presented here is not substantiated experimentally. They show that centrin is part of the outer region, but they do not show any marker for the inner region. It would be very helpful if the authors used gamma-tubulin or MORN1 to show their location with respect to centrin and microtubules. In the absence of this localisation, the claims are preliminary and speculative. If the centrosomal protein complex is not involved in microtubule nucleation, then how does nucleation happen? What are the molecules present in this amorphous matrix? It would be great to check the location of gamma-tubulin or of some molecules of the inner centrosome described in Toxoplasma, which is deemed to be the MTOC.

      Minor comments:

      • Specific experimental issues that are easily addressable.
      • Are prior studies referenced appropriately? Partially
      • Are the text and figures clear and accurate? Yes
      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? Perform experiments with gamma tubulin, add some references, and move the expansion microscopy to the main figures.

      The expansion microscopy is very nice, and some of what is presented in the supplementary material could be moved to the main section. The localisation of CenH3 is a bit puzzling, as it has been shown that centromeres/kinetochores cluster and are present during early and mid schizogony. The various foci with respect to nuclei are not what has been seen previously. Please discuss the difference between these two findings.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.
        • This is more of a technical advancement on the subject of centrin, using STED, tomography and CLEM.
        • Place the work in the context of the existing literature (provide references, where appropriate).
        • This work is relevant to cell division during schizogony in asexual stages, on par with Toxoplasma or Apicomplexa in general.
        • State what audience might be interested in and influenced by the reported findings. Those working on Apicomplexa, protists, cell division and mitosis.
        • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Working on Cell division in Plasmodium.

      Referee Cross-commenting

      I agree with the other reviewers; some of the suggested experiments and the minor details have to be addressed. There are some loose ends, and these suggestions will enhance the clarity of the data. It is a very nice study, and some of the comments suggested by the reviewers will improve the manuscript.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Simon and collaborators addresses the dynamic changes of spindle and hemi-spindle microtubules occurring during schizogony in Plasmodium falciparum. The work explores the temporal correlation of the changes observed in intranuclear spindles with changes at the level of the centriolar plaque, the nuclear microtubule organizing center of these parasites, using centrin as a bona fide marker of the structure. The study shows that spindle microtubules organize from an intranuclear region, devoid of chromatin and distinct from the centrin region, which had not been observed or described before. It further shows that centrin does not localize at the nuclear envelope but is actually extranuclear.

      This work significantly expands on previous knowledge regarding the functional and spatial organization of the nucleus in P. falciparum, and the structure once defined as "an electron dense mass on the nuclear envelope." It uses state-of-the-art microscopy approaches such as STED, U-ExM and CLEM, in combination with immunolabeling, dyes and parasites overexpressing fluorescent protein fusions, to address these questions.

      Major comments:

      • Are the key conclusions convincing? I find the manuscript successfully addresses the posed questions. The data presented supports the conclusions.
      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?
      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. No
      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. N/A
      • Are the data and the methods presented in such a way that they can be reproduced? Yes
      • Are the experiments adequately replicated and statistical analysis adequate? Yes

      Minor comments:

      • Specific experimental issues that are easily addressable.

      Regarding the data shown in Figure 1, it is unclear to me what elements are taken into account to define "anaphase." Anaphase could be defined by using chromatin markers, such as CenH3, which have been identified in Plasmodium and which the authors make use of in Figure 1F.

      • Are prior studies referenced appropriately?

      The authors state that "with the exception of centrins and gamma tubulins" few canonical centrosome components are conserved in Plasmodium. These parasites are in fact able to assemble a more or less canonical centriole for microgamete basal body formation. Widely conserved centriolar components such as Sas6 are coded by the malaria genome, and have been characterized previously. This work is neither referenced nor discussed in the manuscript.

      • Are the text and figures clear and accurate?

      I find the timings shown in Figure 1A, with respect to the schematic quantification shown in Figure 1B, confusing. As it is shown, one naturally correlates the images in Fig. 1A above with the cell cycle progression timing shown in Fig. 1B below. However, by time 260 min, for example, two somewhat adjacent centrin signals can be observed. Though this is defined as anaphase, by an unspecified criterion, it could very well be representative of metaphase. Nonetheless, the timing shown in Figure 1B for "anaphase" onset is 170 min, which is inconsistent with the images above. I suggest that either the quantification is shown in a different format (e.g. bar plots), which could then better reflect the cell-to-cell variations observed (by use of error bars, for example), or that the figure explanation in the results section clarifies this issue. As presented, the data in Figure 1C are rather uninformative. A pattern could be more immediately extracted if dots corresponding to the subsequent appearance of centrin dots in the same nucleus were connected to each other.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      There are a number of edits required in the text. Row numbers would have been helpful in pointing these out. I point out some edits below, but a thorough revision of the manuscript for grammatical and syntactic errors would be beneficial.
      • "Cytokinetic segmeter": please replace with "segmented".
      • Please refer to Figure 1D when appropriate; there is quite an extensive paragraph describing the results shown in this figure, but it is only referenced at the start.
      • "..., as did the and the number of branches per nucleus,...": please rewrite as appropriate.

      Significance

      This manuscript could be of interest to a wide audience interested in cell cycle, cell division, cell organization and organelle positioning, infectious diseases and microscopy. However, the introduction assumes that readers are somewhat expert in the malaria field. I suggest the authors include a brief introduction to the malaria life cycle, and a schematic representation of the division mode. This will help non-experts follow the narrative more easily.

      This work rectifies long-standing inconsistencies observed by different experimental approaches in the nuclear organization of malaria parasites during schizogony. However, what the functional consequences of the alternative modes of spindle organization in malaria could be is not clearly stated or discussed. In this respect, as it stands, the manuscript is rather descriptive and lacks mechanistic insight. Nonetheless, the data presented are of superb quality, and the manuscript represents a tremendous leap in structural insight and imaging resolution for the field of malaria. I find the data suitable for publication, provided minor adjustments are made (especially to Figure 1 and/or the description of the results shown in Figure 1, for consistency).

      Referee Cross-commenting

      I agree with all the other reviewers' comments. I'm glad the reviewers seem to be experts in the field of malaria cell division and have pointed out previous studies which were not appropriately referenced. I second those comments.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1

      This paper proposes a noise-aware approach, SCRaPL, for modelling the associations in single-cell multi-omic data. For gene expression, it uses a Poisson-lognormal model. For DNAm data, it uses a Binomial noise model which explicitly takes into account the average within the region. The Bayesian hierarchical framework employed by SCRaPL could achieve higher sensitivity and better robustness in identifying correlations, and also offers a template for the application of more complex analysis techniques to multi-omics data. The symbols in this paper are a little confusing, and I suggest the authors check them carefully.

      We thank the reviewer for his/her appreciation, and apologise for the confusion arising from the dense notation, which we will thoroughly revise.
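
      The two noise models named in the reviewer's summary can be illustrated with a small generative sketch. This is not the SCRaPL implementation: the coupling through a latent bivariate Gaussian, the logistic link, the parameter values and all function names are assumptions made purely for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson count with Knuth's multiplication method (fine for small rates)."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_pair(rho, n_cells, coverage, rng):
    """Simulate expression counts and methylated-read counts for one gene/region pair."""
    expression, methylated = [], []
    for _ in range(n_cells):
        # Two standard normals with correlation rho (2x2 Cholesky construction).
        e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
        z_expr = e1
        z_meth = rho * e1 + math.sqrt(1 - rho ** 2) * e2
        # Poisson-lognormal expression: the log of the Poisson rate is Gaussian.
        rate = math.exp(1.0 + 0.5 * z_expr)
        expression.append(sample_poisson(rate, rng))
        # Binomial methylation: 'coverage' reads, logistic link (an assumption).
        prob = 1.0 / (1.0 + math.exp(-z_meth))
        methylated.append(sum(rng.random() < prob for _ in range(coverage)))
    return expression, methylated


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)
expression, methylated = simulate_pair(rho=0.8, n_cells=200, coverage=10, rng=rng)
print("latent correlation:", 0.8)
print("Pearson on raw observations:", round(pearson(expression, methylated), 3))
```

      With small counts and low coverage, the Pearson estimate on the raw observations is typically pulled toward zero relative to the latent value, which is the kind of attenuation a noise-aware model is meant to address.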

      1. The symbols used in this paper are messy. For example, "1" and "2" are subscripts in Eq. (2) but become superscripts in Figure 5. Besides, there are many symbols that are not explained, such as $m_j$, $H_j$, $\Psi_0$, etc. Also, I don't know if $x_{j,i}^{(1)}$, $x_{j,i}^{(2)}$ in Figure 5 are the same as $x_{ij1}$ and $x_{ij2}$ in Eq. (3). There are many mismatches, and the authors should check carefully.
      2. Why are the equations in Fig. 5 totally different from those in Section 4.2? For example, $p_j \sim \mathrm{Beta}(\alpha_j, \beta_j)$ in Fig. 5 but $\rho_j \sim \mathrm{Beta}(1,1)$ in Eq. (8).

      We apologise for the notational confusion, this will be fully revised.

      The paper involves many hyperparameters (for example, c1, c2, d1, d2) but does not explain how they were selected.

      This is a good point. We will include a sensitivity analysis on the hyperparameters, justifying the choices on both simulated and real data.

      In Section 4.8, I am confused about $ρ_j$ in experiments 2, 5, 8 and 11. Why does $ρ_j$ represent both the ZI rate and the correlation?

      We apologise for the notational oversight, which will be rectified.

      In Section 4.5, it is difficult to understand the sentence "for me threshold u". Besides, what does $r$ represent in Section 4.5?

      We apologise for the confusing sentence. $r$ is the Pearson correlation coefficient, as explained at the start of Section 4.5.

      Why is there "(6a) Agreement between SCRaPL and Pearson" in Fig. 4?

      This simply means that the panel shows a methylation/expression scatterplot for a gene where SCRaPL and Pearson both return a significant association. We will expand the caption to explain further.

      For Fig. 1, I cannot see the text in the rectangle.

      Apologies, we will improve the readability of the figures.

      I would like to see the efficiency analysis for SCRaPL.

      As part of a re-implementation in a more accessible programming language, we have performed a preliminary efficiency analysis for MCMC, demonstrating linear scalability. Results will appear in the revised manuscript.

      Reviewer 2

      The authors present a Bayesian model to determine noise-corrected correlation coefficients for gene expression (RNA) and DNA-methylation data at single-cell resolution. The authors present a series of simulation data and an example of matched multi-omics data, and compare their results with Pearson correlation. Noise modelling allows the model to determine gene-methylation correlation patterns more accurately. While the authors demonstrate a neat application on accurate quantification of correlation coefficients, I see a limited use of the model for the broader single-cell community. The authors may therefore improve their manuscript on several aspects.

      We thank the reviewer for the encouraging words and for the critical observations, which we have taken to heart, considerably broadening the scope of our paper to make it more attractive to a larger community.

      - Abstract: please specify the omics layers that you are analyzing (RNA + DNA methylation) in the abstract

      We acknowledge that, while SCRaPL is potentially general, in the first submission we focused only on RNA and DNA methylation. We have now decided to expand our analyses to include 10X data of simultaneous chromatin accessibility (ATAC-seq) and RNA.

      - What is the benefit of using a Bayesian model formulation in this setting?

      The benefit is twofold: a principled treatment of noise, and a quantification of the resulting uncertainty which allows for a meaningful way to compute Bayesian significance levels. We will expand the discussion of the relative merits of a Bayesian vs frequentist approach.
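
      The first of these benefits, the treatment of noise, can be illustrated with a small simulation. The following is a minimal sketch, not SCRaPL itself: it assumes a pair of correlated Gaussian latent variables per cell, Poisson counts with a log-normal rate for expression, and binomial methylation reads through a logistic link. The link function, the parameter values (`rho`, `n_cells`, `n_cpg`) and all function names are assumptions made for illustration. It compares the Pearson correlation of the latent variables with the Pearson correlation of the noisy observations.

```python
import math
import random


def pearson(xs, ys):
    """Plain sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate(rho=0.8, n_cells=2000, n_cpg=5, seed=0):
    """Return (latent correlation, correlation of the noisy observations)."""
    rng = random.Random(seed)
    latent_expr, latent_meth, counts, meth_frac = [], [], [], []
    for _ in range(n_cells):
        # Correlated latent pair (z1, z2) with correlation rho (assumed value).
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Expression: Poisson counts whose rate is log-normal.
        counts.append(poisson(rng, math.exp(z1)))
        # Methylation: binomial reads over n_cpg sites (logistic link assumed).
        p = 1.0 / (1.0 + math.exp(-z2))
        methylated = sum(rng.random() < p for _ in range(n_cpg))
        meth_frac.append(methylated / n_cpg)
        latent_expr.append(z1)
        latent_meth.append(z2)
    r_latent = pearson(latent_expr, latent_meth)
    r_observed = pearson([math.log1p(c) for c in counts], meth_frac)
    return r_latent, r_observed


r_latent, r_observed = simulate()
print(f"latent r = {r_latent:.2f}, observed r = {r_observed:.2f}")
```

      In this toy setting the correlation computed from the noisy counts and methylation fractions is attenuated relative to the latent correlation, which is the effect a noise-aware model is meant to correct.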

      - Does it also apply to unmatched data?

      In principle, given measurements with the same number of cells in all modalities, it is possible to apply SCRaPL. However, unless there is a natural pairing between different cells, the scaling of this approach will be quadratic in the number of cells, hence potentially expensive (although largely parallelizable). We will discuss this now, particularly in the light of applying SCRaPL in conjunction with other suites such as Seurat.

      • Would SCRaPL allow for differential correlation testing?

      At the moment, SCRaPL does not allow for differential correlation testing. Of course, one may run SCRaPL separately on two groups of cells and compare the resulting estimates, which would be informative. Nevertheless, extending SCRaPL to perform differential correlation testing (e.g. using Bayesian model selection) would be a non-trivial effort. We will add a comment on this issue to the discussion section.
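
      The informal comparison of two separate runs mentioned above could be sketched as follows. This is an illustration only, not a SCRaPL feature and not a formal differential correlation test: it assumes that each run yields posterior draws of the correlation for a given gene (replaced here by synthetic stand-in draws), and the function name `compare_correlation_draws` is hypothetical.

```python
import random


def compare_correlation_draws(draws_a, draws_b):
    """Summarise rho_A - rho_B from two independent sets of posterior draws."""
    n = min(len(draws_a), len(draws_b))
    # Pairing draws by index is valid because the two runs are independent.
    diffs = sorted(a - b for a, b in zip(draws_a[:n], draws_b[:n]))
    lower = diffs[int(0.025 * (n - 1))]
    upper = diffs[int(0.975 * (n - 1))]
    prob_a_greater = sum(d > 0 for d in diffs) / n
    return {
        "mean_diff": sum(diffs) / n,
        "ci95": (lower, upper),
        "prob_a_greater": prob_a_greater,
    }


def clamp(value):
    """Keep a stand-in draw inside the valid correlation range [-1, 1]."""
    return min(1.0, max(-1.0, value))


# Synthetic stand-ins for posterior draws of one gene in two groups of cells.
rng = random.Random(1)
group_a = [clamp(rng.gauss(-0.6, 0.05)) for _ in range(4000)]
group_b = [clamp(rng.gauss(-0.2, 0.05)) for _ in range(4000)]

summary = compare_correlation_draws(group_a, group_b)
print(summary)
```

      A credible interval for the difference that excludes zero would suggest that the two groups differ for that gene, although this falls short of the model-selection approach described in the response.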

      • Figure 1: The graphical description of the model is rudimentary. I believe that the model description could profit from a graphical model representation of SCRaPL (as presented in figure 5).

      We will redraw Fig. 1 and incorporate the graphical model from Fig 5.

      - Simulated data: all experiments seem to have rather low cell numbers (max. 200) and genes (max. 300). Given that 10X Genomics is the most widely-used sequencing platform with approx. 10,000 cells and 3,000 (highly variable) genes per experiment, and given that the authors show a use-case with 9480 genes in 487 cells, it seems appropriate to extend the simulations and runtime estimates of the presented model to several thousands of cells and genes, respectively.

      Thank you for this comment. The original simulation settings were designed with scMT data in mind, where indeed only a few hundred cells can be assayed at most. Partly because of this feedback, and also because of the request to implement SCRaPL in a different language, we are working on a more scalable TensorFlow implementation which will be able to handle thousands of cells and genes in a matter of tens of minutes. The new simulated data will therefore extend into this regime with larger data sets.

      - Figure 4: Please revise the figure legend as I did not understand the plotted results based on the description.

      We will do so.

      - Results section 2.5: Please formulate your whole argument about epigenetic regulators. I do not think that "For further information please refer to supplementary figure XYZ." is an appropriate closing statement for a paragraph, nor does it motivate the reader to look at the supplementary figures (I did look at them and I do not see how they support the point made in the paragraph). Please elaborate and consider a "take home message" for the paragraph such that the reader is able to understand the benefit of SCRaPL without revisiting the original data publication.

      Thank you for this pointer, we will take it on board in the full revision.

      - Conclusion: The authors mention that SCRaPL would further offer a "template for the application of more complex analysis techniques (such as clustering, dimensionality reduction and network inference)". If that was the case, the authors should consider a comparison to other tools, which offer exactly that (e.g. Seurat's CCA or non-negative matrix factorization in LIGER). Further, the authors should set their work into context with tools like bindSC.

      Thank you for the suggestion. As far as we can tell, all of these methods are designed for unmatched data, rather than multi-omics assays performed in the same cells. Having said that, it is in principle possible to “preprocess” data with SCRaPL and then feed the latent means computed by SCRaPL to Seurat or other tools. We will include an example of how this may be done in the revision.

      - Implementation: Matlab is used in about 6% of the single-cell RNAseq tools (according to scrna-tools.org). To reach a larger scientific community, do the authors plan to provide an R or Python implementation of their model?

      We are now implementing SCRaPL in Python using TensorFlow Probability, hoping to achieve substantial speedups (see response to previous point).

      Additional minor points about formatting by Reviewer 2 will all be addressed.

      Reviewer 3

      Maniatis et al. propose a sound strategy to analyse single-cell multi-omic data sets. A key advance is to use bespoke error models for each of the omics data. These are integrated into a multivariate Gaussian model. This method is a novel and, in my opinion, valuable addition to the analyses of the growing multi-omics single-cell data sets.

      We thank this reviewer for his/her appreciation of our work.

      - Authors make a convincing argument for the importance of principled methods and in particular of using noise models that are tailored to the data at hand. To further support this, can the authors elaborate on how results would differ from those of commonly applied methods, e.g. those embedded in the Seurat, OSCA, and scanpy 'suites'? The authors compare to Pearson correlation-based methods, but it is not clear whether that is the true state of the art for those methods.

      As far as we know, volcano plots of p-value versus Pearson correlation are the most commonly employed approaches to assess correlations amongst different molecular modalities in single-cell multi-omics (see e.g. Argelaguet et al, Nature 2020). Seurat and other methods normally do not deal with single-cell multi-omics (i.e., multiple omics measured in the same cell), rather with multiple single-cell omics (different molecular modalities assayed in different cells). Nevertheless, it is possible to pre-apply SCRaPL to non-matched data and then use another suite; as an illustration, we will perform an analysis on scMT data using SCRaPL followed by Seurat.

      - In the case study on mouse embryonic stem cells, the authors excluded the chromatin accessibility. Why not use it to show the value of the method more clearly?

      We did also apply SCRaPL to chromatin accessibility; however, the signal was weaker and we did not include it in the manuscript. We will now present these results as supplementary material.

      - It would also be great if the authors would use a different single-cell multi-omic data set, using other data modalities, e.g. CITE-Seq data. If this is not possible, they should at least elaborate on which omics SCRaPL can handle, what the noise models would be for different data types, etc.

      We have started analysing a joint scATAC-scRNA-seq data set generated using the new 10X commercial platform, and will add the results of this analysis to the revised manuscript. We will also expand the description of the suitability for different data types.

      - As the authors acknowledge, computational burden is high, which presumably limits scalability. Are the authors able to explore this further (scalability on in silico data)? Or how complex would it be to adopt the variational inference method suggested? I appreciate that the variational inference implementation might be out of the scope of this paper, though.

      • It is a pity that the method is in Matlab. Nearly no-one in single-cell omics uses Matlab. Our own lab is largely invested in this topic and we do not even have Matlab licenses. I strongly encourage the authors to implement their method in e.g. R or Python, ideally compatible with the broadly used 'suites' (Seurat, OSCA, and scanpy,...).

      We are addressing these two comments jointly by re-implementing SCRaPL in TensorFlow Probability (Python-based), which allows us to leverage powerful libraries for variational inference. We hope that this will lead to a substantial increase in scalability, making it possible to run on thousands of cells/genes in under one hour (results will appear in ).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      Maniatis et al. propose a sound strategy to analyse single-cell multi-omic data sets. A key advance is to use bespoke error models for each of the omics data. These are integrated into a multivariate Gaussian model. This method is a novel and, in my opinion, valuable addition to the analyses of the growing multi-omics single-cell data sets.

      I have some comments below that I hope are helpful for the authors:

      • Authors make a convincing argument for the importance of principled methods and in particular of using noise models that are tailored to the data at hand. To further support this, can the authors elaborate on how results would differ from those of commonly applied methods, e.g. those embedded in the Seurat, OSCA, and scanpy 'suites'? The authors compare to Pearson correlation-based methods, but it is not clear whether that is the true state of the art for those methods.
      • In the case study on mouse embryonic stem cells, the authors excluded the chromatin accessibility. Why not use it to show the value of the method more clearly?
      • It would also be great if the authors would use a different single-cell multi-omic data set, using other data modalities, e.g. CITE-Seq data. If this is not possible, they should at least elaborate on which omics SCRaPL can handle, what the noise models would be for different data types, etc.

      Minor:

      • As the authors acknowledge, computational burden is high, which presumably limits scalability. Are the authors able to explore this further (scalability on in silico data)? Or how complex would it be to adopt the variational inference method suggested? I appreciate that the variational inference implementation might be out of the scope of this paper, though.
      • It is a pity that the method is in Matlab. Nearly no-one in single-cell omics uses Matlab. Our own lab is largely invested in this topic and we do not even have Matlab licenses. I strongly encourage the authors to implement their method in e.g. R or Python, ideally compatible with the broadly used 'suites' (Seurat, OSCA, and scanpy,...).

      This also precludes checking software and reproducibility of results.

      Significance

      I think this is an important methodological development for the analysis of single-cell multi-omic data.

      To my knowledge, it goes beyond existing methods and does so in a principled manner.

      The audience is mostly bioinformaticians dealing with the analysis of this type of data, i.e. single-cell multi-omics.

      My expertise is in the computational analysis of omics data, though less in its statistical fundamentals. Hence, my group members and I are probable users of this method (if implemented in free software, as mentioned above).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors present a Bayesian model to determine noise-corrected correlation coefficients for gene expression (RNA) and DNA-methylation data at single-cell resolution. The authors present a series of simulation data and an example of matched multi-omics data, and compare their results with Pearson correlation. Noise modelling allows the model to determine gene-methylation correlation patterns more accurately. While the authors demonstrate a neat application on accurate quantification of correlation coefficients, I see a limited use of the model for the broader single-cell community. The authors may therefore improve their manuscript on several aspects.

      Major comments:

      • Abstract: please specify the omics layers that you are analyzing (RNA + DNA methylation) in the abstract
      • What is the benefit of using a Bayesian model formulation in this setting?
      • Does it also apply to unmatched data?
      • Would SCRaPL allow for differential correlation testing?
      • Figure 1: The graphical description of the model is rudimentary. I believe that the model description could profit from a graphical model representation of SCRaPL (as presented in figure 5).
      • Simulated data: all experiments seem to have rather low cell numbers (max. 200) and genes (max. 300). Given that 10X Genomics is the most widely-used sequencing platform with approx. 10,000 cells and 3,000 (highly variable) genes per experiment, and given that the authors show a use-case with 9480 genes in 487 cells, it seems appropriate to extend the simulations and runtime estimates of the presented model to several thousands of cells and genes, respectively.
      • Figure 4: Please revise the figure legend as I did not understand the plotted results based on the description.
      • Results section 2.5: Please formulate your whole argument about epigenetic regulators. I do not think that "For further information please refer to supplementary figure XYZ." is an appropriate closing statement for a paragraph, nor does it motivate the reader to look at the supplementary figures (I did look at them and I do not see how they support the point made in the paragraph). Please elaborate and consider a "take home message" for the paragraph such that the reader is able to understand the benefit of SCRaPL without revisiting the original data publication.
      • Conclusion: The authors mention that SCRaPL would further offer a "template for the application of more complex analysis techniques (such as clustering, dimensionality reduction and network inference)". If that was the case, the authors should consider a comparison to other tools, which offer exactly that (e.g. Seurat's CCA or non-negative matrix factorization in LIGER). Further, the authors should set their work into context with tools like bindSC.

      Minor comments:

      • Implementation: Matlab is used in about 6% of the single-cell RNAseq tools (according to scrna-tools.org). To reach a larger scientific community, do the authors plan to provide an R or Python implementation of their model?
      • Fig. 2: Legends for mean, median and y=0 are hardly legible.
      • Figure order: 6a is referenced before 4b and 4c (what about 4a?) - seems like a referencing issue as 6a is also listed in the figure legend of Figure 4.
      • Figure 6: AIC histogram is difficult to make out behind the blue bars of the DIC histogram. Please adapt.

      Reference:

      Dou J, Liang S, Mohanty V, Cheng X, Kim S, Choi J, Li Y, Rezvani K, Chen R, Chen K. Unbiased integration of single cell multi-omics data. bioRxiv, 2020. https://www.biorxiv.org/content/10.1101/2020.12.11.422014v1

      Significance

      Significance:

      The use of a single-cell specific noise-model to infer accurate correlation coefficients for multi-omic analysis is a novel approach to assess information from DNA-methylation and RNA-sequencing data at single-cell resolution. As far as I am aware, methods like canonical correlation analysis (CCA), as used in Seurat, rely on the accuracy of Pearson correlation, yet, the authors of this manuscript made a convincing point on the devastating impact of noise from transcription and methylation levels on Pearson correlation.

      Audience:

      In order to address downstream analysis questions such as gene regulatory network inference, it is essential to have an accurate metric to assess the regulatory impact of methylation on gene expression at hand. However, an efficient implementation in a more common language (e.g. R, Python or C++) would be advisable to create a broader applicability of the model.

      The reviewer's field of expertise: single-cell RNAsequencing, data analysis, data integration, Bayesian modelling

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      This paper proposes a noise-aware approach, SCRaPL, for modelling the associations in single-cell multi-omic data. For gene expression, it uses a Poisson-lognormal model. For DNAm data, it uses a Binomial noise model which explicitly takes into account the average within the region. The Bayesian hierarchical framework employed by SCRaPL could achieve higher sensitivity and better robustness in identifying correlations, and also offer a template for the application of more complex analysis techniques to multi-omics data. The symbols in this paper are a little confusing, and I suggest that the authors check them carefully. My comments are as follows:

      1. The symbols used in this paper are messy. For example, "1" and "2" are subscripts in Eq. (2) but become superscripts in Figure 5. Besides, many symbols are not explained, such as $m_j$, $H_j$, $\Psi_0$, etc. Also, I don't know whether $x_{j,i}^{(1)}$ and $x_{j,i}^{(2)}$ in Figure 5 are the same as $x_{ij1}$ and $x_{ij2}$ in Eq. (3). There are many such mismatches; the authors should check carefully.
      2. Why are the equations in Fig. 5 totally different from those in Section 4.2? For example, $p_j \sim \mathrm{Beta}(\alpha_j, \beta_j)$ in Fig. 5 but $\rho_j \sim \mathrm{Beta}_{-1,1}$ in Eq. (8).
      3. The paper involves many hyper-parameters without explaining how they were selected, for example $c_1$, $c_2$, $d_1$, $d_2$.
      4. In Section 4.8, I am confused about $\rho_j$ in experiments 2, 5, 8 and 11. Why does $\rho_j$ represent both the ZI rate and the correlation?
      5. In Section 4.5, it is difficult to understand the sentence "for me threshold u". Besides, what does $r$ represent in Section 4.5?
      6. Why there is "(6a)Agreement between SCRaPL and Pearson" in Fig. 4?
      7. For Fig.1, I cannot see the text in the rectangle.
      8. I would like to see the efficiency analysis for SCRaPL.

      Significance

      Readers interested in multi-omic data, single-cell RNA and machine learning will be interested in this paper.

      My field of expertise: machine learning, single-cell RNA

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      First of all, we would like to thank the editor and all reviewers for the effort to evaluate our paper in this difficult era of COVID-19.

      Reviewer #1

      (Significance): Overall, this manuscript is very clear and easy to follow. The manuscript could be improved by making the following changes:

      We thank the reviewer for the favorable comment and will revise the manuscript according to the suggestions.

      Reviewer #2

      (Evidence, reproducibility and clarity): The use of genetics is particularly impressive but the lack of major discoveries dampens the enthusiasm. Additional efforts to mechanistically define wave initiation and wave propagation would significantly improve the impact of the manuscript. Moreover, some of the conclusions are not fully supported by the data and require further experimentation and/or analysis.

      We admit that the marked redundancy of function among the EGFR ligands and their essential roles in cell growth prevent us from obtaining very clear results. Considering the importance of EGFR ligands in biology, we believe our observations will offer invaluable suggestions to those who wish to clarify the roles played by EGFR-family proteins in other biological contexts.

      (Significance): While it is known that ADAM17 is critical to process EGFR ligands, the specific or redundant roles of different ligands remains an open question. The authors find that all ADAM17 ligands contribute to ERK signaling waves but may have specific contributions to other phenotypes. This work would be of interest to the signaling dynamics, epithelial and developmental biology communities.

      We thank the reviewer for the favorable comment.

      Reviewer #3

      (Evidence, reproducibility and clarity): Overall, this study is carried out with a high degree of rigor and technical excellence, with clear reporting of experimental details and replication. The writing and figures are very clear, and there are no obvious technical problems. However, there are some areas in which the strength and clarity of the conclusions could be strengthened by relatively simple experiments.

      We thank the reviewer for the favorable comment. We have already performed some of the experiments suggested by the reviewer. As the reviewer might have anticipated, co-culture with the wild type MDCK cells helps mutant cells to survive. We believe we could propose a clearer model in the revised paper.

      (Significance): This study definitively establishes the role of 4 EGFR ligands in the generation of ERK activity waves in MDCK cells. While other studies, including some from the senior author's lab, have strongly indicated that EGFR autocrine signaling is important for these waves, this study goes further in comparing the roles of these ligands using knockouts to unambiguously establish the autocrine factors involved. Others who use this common experimental system (MDCK) to study epithelial dynamics will find this study of great interest. A wider audience of those who work on EGFR-mediated signaling will also find the data quite fascinating as an example of the complex relationship between ERK activation and its downstream effects. The technical excellence of the paper will make it a must-read for those in these fields. However, there are some factors that limit the scope of the significance. MDCK cells are an important experimental model system but differ in substantial ways from other epithelial cells, particularly in the expression of EGFR ligands. Because different ligands such as amphiregulin dominate in other systems (as noted by the authors, and PMID 27405981), the ability to extrapolate from these findings to other cell types is somewhat limited. Also, the paper avoids addressing the major question of how ERK waves relate to collective migration rate. From the data presented it is clear that this relationship is complex; for example, bath application of the ligands restores a high migration rate but not ERK waves. Given this lack of a clear relationship it is an understandable decision to leave this question for future work; however this does limit the conclusions that can be drawn from the study.

      We completely agree with the reviewer’s view. It is uncertain to what extent the observations with MDCK cells can be generalized to other cell types. We also admit that the conclusion is not very simple because EGFR signaling is required for various cellular functions including cell survival and migration. Even though gene editing has become easy, knocking out many genes in a single cell line and characterizing the result extensively is still laborious work. We believe the data shown in our work will provide a basis for the understanding of EGFR ligands.

      Reviewer #1

      For Fig 1F, 3 individual experiments should be conducted to confirm results.

      We will follow the reviewer’s suggestion and repeat the experiment.

      For Fig 1G, could the authors please show the original western blot data in full rather than just the densitometry graphs?

      We omitted them only for the sake of brevity. We are happy to include the images as supplementary data.

      The authors should explain the origin/phenotype of MDCK cells for those who are not familiar with the cell line.

      We will modify the text according to the reviewer’s suggestion.

      The authors should give a future outlook/direction for future experimentation to further confirm redundancy in EGF ligands in the propagation of ERK activation waves.

      We will discuss the redundancy in other cell types based on available NGS data.

      Some mention of the use of biosensors in the abstract and introduction is recommended as this is a major part of the experimental work.

      We will refer to the biosensors in the abstract and introduction.

      Reviewer #2

      There are conflicts with some of the conclusions made about ligands. dEGFR cells have basal ERK activity as high as WT which argues against EGF being responsible for basal ERK activity. Further, basal ERK activity was not rescued by restoration of EGF in the 4KO-EGF cells. The authors should address this discrepancy.

      We agree that some new questions have arisen from our observations. The discrepancy between the phenotypes of dEGFR cells and dEGF cells is an example. We are currently establishing dEGF cell lines in which different genomic sequences of the EGF gene are targeted. We have already started to develop these cell lines and will obtain them within a month. The result will provide some clues to answer the questions. However, even if we cannot resolve the question, we believe it is worth reporting observations that cannot be easily understood, because such questions often lead to further discoveries.

      Besides the ones genetically disrupted in this work, other EGFR ligands seem to play functional roles given that dEGFR cells show less migration and fewer ERK waves than 4KO cells. The authors could test if other ligands are upregulated in 4KO cells to compensate. On a similar note, determining whether ADAM17 deficient cells are more similar to 4KO cells or dEGFR cells could provide some insight.

      Following the reviewer’s suggestion, we will conduct qPCR of growth factors in the mutant cell lines to see whether the expression levels of the seven EGFR ligands have changed significantly. At the same time, as the reviewer suggested, we will establish ADAM17 knockout cell lines and compare their phenotype with those of cell lines deficient in EGFR ligand genes.

      The authors propose that Nrg1 is responsible for ERK waves in QKO, 4KO, dEGFR, and 4KO-EGF cells but are limited in testing this due to Nrg1 being essential in 4KO cells. First, Nrg1 should have been deleted in TKO cells to confirm that it is only essential in the absence of the four EGFR ligands. Additionally, Nrg1 could be knocked out in 4KO-EGF cells to demonstrate the claim that EGF-induced ADAM17 cleavage of Nrg1 is responsible for ERK waves.

      We do not think the deletion of Nrg1 in the TKO cells will abolish the ERK activation waves because EREG in TKO cells could transmit the waves. To overcome the problem of cell growth, we will try to obtain 5KO cells by Cre-induced deletion of NRG1 in 5KO-loxP-NRG1 cells, wherein EGF is supplied exogenously. We already have preliminary data suggesting that co-culture with wild-type MDCK cells helps 5KO cells grow.

      The authors state that ERK activation waves are important for collective migration and seek to understand the roles of each EGFR ligand, but despite measuring migration and properties of ERK activity, there is very little analysis or commentary on the relationship between the two. The ability of HB-EGF to restore migration without ERK waves suggests that waves are not required per se. It is interesting to note that with restoration of ligands, migration is higher than WT but ERK activity is lower.

      We refrained from devoting much space to the essential role of ERK activation waves in collective cell migration, because several papers have already described this issue.

      We should probably have devoted more space to emphasizing that collective cell migration comprises at least two different phenomena: the migration of leader cells and that of follower cells. The ERK activation waves are essential for the follower cells but not the leader cells. In 4KO cells, both the leader cell and follower cell migrations are impaired. We showed that expression of the GFs restores the migration of leader cells, but not that of follower cells. We will revise the text to include this issue.

      It is suggested that the total amount of EGFR ligands may be the primary determinant of migration, but deletion of TGFα alone causes a significant decrease in migration comparable to the DKO cells. TGFα has the lowest expression of the four ligands studied but is the only ligand to have a significant impact on migration in the single knockout context, which disagrees with that conclusion.

      Each EGFR ligand has a different affinity for EGFR, which makes it difficult to link the mRNA levels directly to the effect of each EGFR ligand. We will modify the discussion to include this argument.

      Other:

      Fig. S3B needs clarification that the WT (black) and 4KO (green) did not receive a stimulus.

      We will follow the reviewer’s advice.

      Reviewer #3:

      The experiments in Fig. 5 are undertaken with the purpose of assessing whether NRG acts as an additional ligand that mediates the residual ERK waves in 4KO/QKO cells. However, this question is never addressed in the NRG/4KO cells. While it might be challenging due to the proliferative defect, it seems important to attempt this experiment in some way; measuring the ERK waves for these cells would establish whether all of the critical autocrine factors have been identified. Can the proliferation be rescued by application of high amounts of growth factors?

      This question is similar to a question raised by reviewer #2.

      To overcome the problem of cell growth, we will try to obtain 5KO cells by Cre-induced deletion of NRG1 in 5KO-loxP-NRG1 cells, wherein EGF is supplied exogenously.

      The bath exposure to EGFR ligands shown in Fig. S3A is an important experiment, but it is surprising that ERK signaling is not maintained under these conditions. Is this due to depletion of the added ligands, perhaps locally? Or is the intermittent nature of paracrine signaling needed to maintain ERK activity? These possibilities could be distinguished by checking whether the added EGF or the other ligands are depleted after several hours, or by restimulating with a new bolus of ligand after several hours.

      We thank the reviewer for this invaluable suggestion. We will conduct the experiments suggested by the reviewer.

      The connection between ERK activity and migration is somewhat confusing. It would be helpful to show the dose sensitivity of migration to a MEK or ERK inhibitor. Are other pathways downstream of EGFR such as PI3K involved in the autocrine-mediated migration? This could also be established with the appropriate inhibitors.

      We should have devoted more space to emphasizing that collective cell migration comprises at least two different phenomena: the migration of leader cells and that of follower cells. The ERK activation waves are essential for the follower cells but not the leader cells. In 4KO cells, both the leader cell and follower cell migrations are impaired. We showed that expression of the GFs restores the migration of leader cells, but not that of follower cells. We will emphasize this issue in the revised manuscript.

      Reviewer #1:

      Line 47 in Abstract should read "Aiming for" not "Aiming at".

      We have corrected the mistake as suggested.

      Some in the field call fluorescence lifetime microscopy "FLIM", you could adopt the same wording in your manuscript to attract more readers.

      We have included FLIM according to the reviewer’s suggestion.

      Reviewer #1:

      Figure 1D, the images should be presented using the same scale for both the EKAREV and EKARrEV constructs so that they can be directly compared.

      Because the basal FRET/CFP ratio differs significantly between EKAREV-NLS and EKARrEV-NLS, the changes during mitosis would become unclear if we applied the same scale. This figure was prepared to show the reactivity to Cdk1 during mitosis; therefore, we believe the current scale is better for presentation.

      The names QKO and 4KO are a bit confusing. Could the authors please change the naming of the knockout cells so that readers understand that QKO and 4KO are two separate cell types? Perhaps instead of 4KO use FKO for "full knockout" or something similar. The 5KO line might also need to be named something else if you change to FKO.

      We have discussed this issue with the co-authors but could not arrive at a better alternative. Instead of changing the names, we will include a detailed explanation of each cell line.

      Reviewer #2:

      The interpretation of the RA-SOS coculture experiments is confusing. Based on the author's reasoning, I would expect ADAM17 shedding in the RA-SOS cells to trigger signaling at the interface of both WT and 4KO cells but the 4KO should be unable to propagate the wave farther away from the interface. This does not seem to be the case. Do RA-SOS ADAM17KO cells still trigger waves of ERK signaling in the WT cells? Do ADAM17KO cells behave as the 4KO cells in this coculture system?

      The reviewer has probably misunderstood the method. The GF-less 4KO cells were co-cultured with wild-type cells harboring the RA-SOS system. We will describe the method in more detail to avoid misunderstanding.

      Finally, the growth curve in Fig. 5B indicates that 5KO-loxP-NRG1-CreERT2 cells are viable for about two days after Cre induction. The authors could perform a confinement release assay of these cells 1-1.5 days after Cre induction to look for further reduction of ERK waves and migration to demonstrate the role of Nrg1.

      This experiment may not be necessary. It is clear that NRG1 is required for the survival of 4KO cells. The cells are still alive 1 to 2 days after 4-OHT application simply because residual NRG1 protein remains. The results would be difficult to interpret during the phase in which NRG1 is declining.

      In Fig. 1G, the normalization of all WT pERK samples to 1 artificially lowers the variance to zero when performing the T-test.

      For the comparison of immunoblotting data derived from independent experiments, the signals must be normalized to the control. We believe the use of pERK/ERK of the wild type cells as the control is reasonable for this experiment.
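
      The statistical point at issue in this exchange can be illustrated with a minimal sketch. The pERK/ERK values below are made up for illustration and are not taken from the manuscript, and the helper names are hypothetical. The sketch shows that dividing each replicate by its own control forces every control value to exactly 1, so the control group has zero variance. One commonly used alternative, shown here only as an assumption and not as the approach of the authors or the reviewer, is a one-sample t-test of the normalized treated values against 1.

```python
from math import sqrt
from statistics import mean, stdev


def normalize_to_control(control, treated):
    """Divide each paired replicate by the control value from the same blot."""
    norm_control = [c / c for c in control]
    norm_treated = [t / c for c, t in zip(control, treated)]
    return norm_control, norm_treated


def one_sample_t(values, mu=1.0):
    """t statistic for the null hypothesis mean(values) == mu."""
    n = len(values)
    return (mean(values) - mu) / (stdev(values) / sqrt(n))


# Hypothetical pERK/ERK ratios from three independent blots (illustrative only).
wt = [0.80, 1.10, 0.95]
ko = [0.40, 0.66, 0.38]

norm_wt, norm_ko = normalize_to_control(wt, ko)

# After normalization every control value is exactly 1.0, so its spread is 0.0.
print("normalized WT:", norm_wt, "stdev:", stdev(norm_wt))

# The treated group keeps its blot-to-blot variability and can be tested against 1.
print("normalized KO:", norm_ko)
print("one-sample t vs 1 (df = n - 1):", one_sample_t(norm_ko))
```

      With normalized data, the blot-to-blot variability survives only in the treated group, which is why a test that compares the treated values against the fixed value 1 is often preferred over a two-sample test that treats the all-ones control as a measured group.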

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      This manuscript seeks to clarify the mechanisms that underlie traveling "waves" of ERK activity that occur in monolayers of migrating epithelial cells. A combination of live cell imaging with ERK activity biosensors and CRISPR-mediated knockouts for autocrine regulators are used to dissect the factors that make these waves possible. The authors utilize the MDCK cell line, which shows very prominent wave behavior, and they perform an impressive number of knockouts to eliminate the most abundant autocrine EGFR ligands. They also introduce a novel ERK FRET reporter, which is less sensitive to off-target phosphorylation by Cdk1. Analysis of ERK biosensor data from the knockouts shows that knockout of all four main EGFR ligands is needed to substantially reduce the amplitude of ERK waves, although it does not completely eliminate it. Re-expression of any of the four ligands, with the exception of HBEGF, restores strong ERK waves. Application of the same ligands in solution restores migration but not the ERK waves.

      Overall, this study is carried out with a high degree of rigor and technical excellence, with clear reporting of experimental details and replication. The writing and figures are very clear, and there are no obvious technical problems. However, there are some areas in which the strength and clarity of the conclusions could be strengthened by relatively simple experiments.

      Major:

      1. The experiments in Fig. 5 are undertaken with the purpose of assessing whether NRG acts as an additional ligand that mediates the residual ERK waves in 4KO/QKO cells. However, this question is never addressed in the NRG/4KO cells. While it might be challenging due to the proliferative defect, it seems important to attempt this experiment in some way; measuring the ERK waves for these cells would establish whether all of the critical autocrine factors have been identified. Can the proliferation be rescued by application of high amounts of growth factors?
      2. The bath exposure to EGFR ligands shown in Fig. S3A is an important experiment, but it is surprising that ERK signaling is not maintained under these conditions. Is this due to depletion of the added ligands, perhaps locally? Or is the intermittent nature of paracrine signaling needed to maintain ERK activity? These possibilities could be distinguished by checking whether the added EGF or the other ligands are depleted after several hours, or by restimulating with a new bolus of ligand after several hours.

      Minor (I think this is an important point overall, but it is outside of the scope of the paper as defined by the authors, which is focused on the ERK waves rather than how the waves relate to migration):

      1. The connection between ERK activity and migration is somewhat confusing. It would be helpful to show the dose sensitivity of migration to a MEK or ERK inhibitor. Are other pathways downstream of EGFR such as PI3K involved in the autocrine-mediated migration? This could also be established with the appropriate inhibitors.

      Significance

      This study definitively establishes the role of 4 EGFR ligands in the generation of ERK activity waves in MDCK cells. While other studies, including some from the senior author's lab, have strongly indicated that EGFR autocrine signaling is important for these waves, this study goes further in comparing the roles of these ligands using knockouts to unambiguously establish the autocrine factors involved. Others who use this common experimental system (MDCK) to study epithelial dynamics will find this study of great interest. A wider audience of those who work on EGFR-mediated signaling will also find the data quite fascinating as an example of the complex relationship between ERK activation and its downstream effects. The technical excellence of the paper will make it a must-read for those in these fields. However, there are some factors that limit the scope of the significance. MDCK cells are an important experimental model system but differ in substantial ways from other epithelial cells, particularly in the expression of EGFR ligands. Because different ligands such as amphiregulin dominate in other systems (as noted by the authors, and PMID 27405981), the ability to extrapolate from these findings to other cell types is somewhat limited. Also, the paper avoids addressing the major question of how ERK waves relate to collective migration rate. From the data presented it is clear that this relationship is complex; for example, bath application of the ligands restores a high migration rate but not ERK waves. Given this lack of a clear relationship it is an understandable decision to leave this question for future work; however this does limit the conclusions that can be drawn from the study.

      Areas of expertise: growth factor signal transduction, biosensors, quantitative modeling

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Lin et al. address the mechanisms underlying ERK signaling waves in epithelial cells. While it is known that ADAM17 is critical to process EGFR ligands, the specific or redundant roles of different ligands remain an open question. First the authors generate a modified ERK FRET sensor with reduced cross-reactivity to CDK1 in MDCK cells and systematically knock out EGF, HBEGF, TGFα and EREG (the highest expressed ligands in MDCK cells). The authors use live cell imaging of ERK activity upon release from confinement and find that all ligands contribute to ERK signaling waves. While differences in basal signaling and other dynamic features are found in individual knockouts, only the quadruple KO cells show a significant decrease in ERK waves. To determine if the 4KO cells are defective in wave propagation (as opposed to wave initiation), the authors coculture 4KO cells with an inducible cell line and conclude that 4KO cells are unable to propagate waves. Individual EGFR ligands are then restored in 4KO cells, and EGF, TGFα, and EREG, but not HBEGF, can rescue ERK activity waves. Finally, the authors attempt to eliminate all ERK activation waves by deletion of Nrg1 but find that it is essential in 4KO cells. The paper is well-written and technically sound. The use of genetics is particularly impressive but the lack of major discoveries dampens the enthusiasm. Additional efforts to mechanistically define wave initiation and wave propagation would significantly improve the impact of the manuscript. Moreover, some of the conclusions are not fully supported by the data and require further experimentation and/or analysis.

      1. There are conflicts with some of the conclusions made about ligands. dEGFR cells have basal ERK activity as high as WT which argues against EGF being responsible for basal ERK activity. Further, basal ERK activity was not rescued by restoration of EGF in the 4KO-EGF cells. The authors should address this discrepancy.
      2. Besides the ones genetically disrupted in this work, other EGFR ligands seem to play functional roles given that dEGFR cells show less migration and fewer ERK waves than 4KO cells. The authors could test if other ligands are upregulated in 4KO cells to compensate. On a similar note, determining whether ADAM17-deficient cells are more similar to 4KO cells or dEGFR cells could provide some insight.
      3. The interpretation of the RA-SOS coculture experiments is confusing. Based on the author's reasoning, I would expect ADAM17 shedding in the RA-SOS cells to trigger signaling at the interface of both WT and 4KO cells but the 4KO should be unable to propagate the wave farther away from the interface. This does not seem to be the case. Do RA-SOS ADAM17KO cells still trigger waves of ERK signaling in the WT cells? Do ADAM17KO cells behave as the 4KO cells in this coculture system?
      4. The authors propose that Nrg1 is responsible for ERK waves in QKO, 4KO, dEGFR, and 4KO-EGF cells but are limited in testing this due to Nrg1 being essential in 4KO cells. First, Nrg1 should have been deleted in TKO cells to confirm that it is only essential in the absence of the four EGFR ligands. Additionally, Nrg1 could be knocked out in 4KO-EGF cells to demonstrate the claim that EGF-induced ADAM17 cleavage of Nrg1 is responsible for ERK waves. Finally, the growth curve in Fig. 5B indicates that 5KO-loxP-NRG1-CreERT2 cells are viable for about two days after Cre induction. The authors could perform a confinement release assay of these cells 1-1.5 days after Cre induction to look for further reduction of ERK waves and migration to demonstrate the role of Nrg1.
      5. The authors state that ERK activation waves are important for collective migration and seek to understand the roles of each EGFR ligand, but despite measuring migration and properties of ERK activity, there is very little analysis or commentary on the relationship between the two. The ability of HB-EGF to restore migration without ERK waves suggests that waves are not required per se. It is interesting to note that with restoration of ligands, migration is higher than WT but ERK activity is lower.
      6. It is suggested that the total amount of EGFR ligands may be the primary determinant of migration, but deletion of TGFα alone causes a significant decrease in migration comparable to the DKO cells. TGFα has the lowest expression of the four ligands studied but is the only ligand to have a significant impact on migration in the single knockout context, which disagrees with that conclusion.

      Other:
      7. In Fig. 1G, the normalization of all WT pERK samples to 1 artificially lowers the variance to zero when performing the T-test.
      8. Fig. S3B needs clarification that the WT (black) and 4KO (green) did not receive a stimulus.

      Significance

      While it is known that ADAM17 is critical to process EGFR ligands, the specific or redundant roles of different ligands remain an open question. The authors find that all ADAM17 ligands contribute to ERK signaling waves but may have specific contributions to other phenotypes. This work would be of interest to the signaling dynamics, epithelial and developmental biology communities.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      see below for comments.

      Significance

      Overall, this manuscript is very clear and easy to follow. The manuscript could be improved by making the following changes:

      • Line 47 in Abstract should read "Aiming for" not "Aiming at".
      • Some mention of the use of biosensors in the abstract and introduction is recommended as this is a major part of the experimental work.
      • The names QKO and 4KO are a bit confusing. Could the authors please change the naming of the knockout cells so that readers understand that QKO and 4KO are two separate cell types? Perhaps instead of 4KO use FKO for "full knockout" or something similar. The 5KO line might also need to be named something else if you change to FKO.
      • Figure 1D, the images should be presented using the same scale for both the EKAREV and EKARrEV constructs so that they can be directly compared.
      • Some in the field call fluorescence lifetime microscopy "FLIM", you could adopt the same wording in your manuscript to attract more readers.
      • For Fig 1F, 3 individual experiments should be conducted to confirm results.
      • For Fig 1G, could the authors please show the original western blot data in full rather than just the densitometry graphs?
      • The authors should explain the origin/phenotype of MDCK cells for those who are not familiar with the cell line.
      • The authors should give a future outlook/direction for future experimentation to further confirm redundancy in EGF ligands in the propagation of ERK activation waves.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). Please place your comments about significance in section 2.

      In this paper the authors used a targeted approach to identify rare mutations in a cohort of glioma patients. Using this approach they identified a recurrent mutation in the TOP2A gene, which encodes topoisomerase 2A, and suggest that this mutation creates a more effective protein that binds DNA more strongly and may be more enzymatically active. RNAseq analysis of TOP2A WT and TOP2A mut tumor samples suggests different transcription patterns and points to possible splicing defects. The most recurrent variant (E948Q) is described in depth, and some experimental information shows this variant might be a gain-of-function mutation.

      **Major comments:**

      • Are the key conclusions convincing? The validation of both the methodology and the presence of previously undescribed TOP2A variations in HGG is done quite successfully. Interesting evidence about the relevance of the most frequent mutation is provided. However, although computational and biochemical assays were performed, the lack of details about the statistics of the in vitro experiments (no p-values, sample sizes, or numbers of repetitions are provided in figures 4 and 5) weakens the conclusions claimed by the authors about the properties of the mutated topoisomerase. Ad. In the revised version we provide more details about the in vitro experiments, including statistics where applicable, sample sizes, and numbers of repetitions. In Fig. 4 we show the results of two repetitions (so we cannot calculate statistics), but I would like to stress that we independently tested two fragments of the protein and the results were similar, which supports our conclusion. However, we do agree with the reviewer that a statistical analysis of those biochemical tests is required. We have already started to produce a new batch of recombinant proteins and we will add repetitions to reinforce our claims. We will provide details of the statistical analysis once all experiments have been performed.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Claims about E948Q variant function should be revised. Data is not presented in a convincing way, plus there is ambiguous language used from the results ("We conclude that the E948Q TOP2A protein is functional, and MIGHT have a higher activity than the WT protein") to the rest of the paper where they strongly support the claims about the TOP2A activity. Ad. We will provide more data on the biochemical features of the TOP2A variant to confirm the impact of the E948Q substitution on enzyme activity, which would allow stronger conclusions and present our results in a more convincing way. The language of the manuscript has been critically revised and modified (see the version with tracked changes).

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. In line with the presented data in the paper, additional experiments that show catalytic changes of the E948Q variation must be added. It is shown that there are differences in the DNA binding capacity by EMSA compared to the WT form; however, the DNA supercoil relaxation activity is not that different, at least the way the results are presented. The authors suggest that the TOP2A mutation is a driver mutation but no in vitro validation of this claim is shown. Can this mutation alone or in combination with e.g. tumor suppressors transform normal cells to cancer cells? Do cell lines expressing this mutation (compared to parental TOP2A wt expressing cells) display increased transcription? Increased invasion? Ad. In the revised version we moderated our conclusions and we do not state that the mutated TOP2A is an oncogenic driver. We suggest that this mutation (and possibly other TOP2A mutations, as we analyzed the impact of other variants on TOP2A protein function) contributes to gliomagenesis. This conclusion is based not only on the changes in biochemical properties, but also on the observed impact of the mutation on transcription and patient survival. We expanded the analysis of TOP2A mutations and expression levels in TCGA datasets, and these new results support our conclusions about the pathogenic nature of TOP2A overexpression and mutations (Supplementary Fig. 4). We believe that in this situation there is no rationale for performing a classical oncogenic driver experiment.

      Because TOP2A mutations are rare, it is impossible to find a patient-derived cell line with such a defect. We attempted to overexpress TOP2A in glioma cells, but apparently some autoregulation prevents overexpression of this protein in cells with endogenous TOP2A expression. Therefore, we cannot verify whether cell lines expressing this variant (compared to parental TOP2A wt expressing cells) have increased transcription. Moreover, such experiments are costly and require a substantial time investment.

      I would like to stress that modeling some events in cell cultures is difficult, and that in GBMs we found a link between the mutated TOP2A and increased transcription, along with decreased expression of splicing factors.

      We have attempted to make a CRISPR/Cas9-mediated knock-in in glioma cells but without success. This is a difficult and time-consuming procedure. Although in principle we agree with the rationale for such an experiment, we think that the current data are consistent and convincing. If the reviewers find it necessary, we may attempt to create glioma cell lines with a TOP2A knock-out and overexpression of the mutated TOP2A gene and study them functionally, but this would require more time.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. If the authors can complement the already presented in vitro experiments with additional ones supporting their hypothesis, this should be feasible. The authors can use patient derived glioma cells or glioma cell lines manipulated to express either the parental TOP2A wt enzyme or the identified mutated form. __Ad. Because TOP2A mutations are rare, it is impossible to find a patient-derived cell line with such a defect. Our findings partly relied on frozen historical samples, so it is not possible to develop patient-derived cell lines. As mentioned above, we can create a TOP2A knock-out cell line and overexpress a wild-type or mutated version, but there is no certainty that TOP2A-deficient cells would survive (this is an essential enzyme) or that such manipulation would be feasible.__

      • Are the data and the methods presented in such a way that they can be reproduced? Yes, the authors provide a quite detailed explanation of the methods implemented to reach each one of the results they are presenting.

      • Are the experiments adequately replicated and statistical analysis adequate? No, there is no information about the statistical analysis or number of replicates in any of the in vitro experiments performed. This information should be added to the manuscript.

      Ad. In the revised version the requested information was added where possible, and additional repetitions of the biochemical experiments are currently in progress.

      **Minor comments:**

      • Specific experimental issues that are easily addressable.
      • Are prior studies referenced appropriately? Yes, authors clearly address the state of the art regarding previous NGS methodologies and let us know the advantages and novelty of their approach.
      • Are the text and figures clear and accurate? There are some discrepancies between the strength of the language used in different sections of the paper to refer the conclusions they can infer from the results they are showing. While they are all valid, authors should revise it. Ad. The text of the manuscript has been unified and revised.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? First of all, describe the statistical analysis used in every figure, include number of biological and technical replicates. I would also suggest to change the title or the scope of the discussion, there is too much focus on the TOP2A in the introduction, neglecting all the technical NGS work that actually lead to several new variants being described. This may be confusing when it collides with a conclusion that is heavily focused on the first half describing potential implications of at least another 3 proteins where genetic alterations were described. Given the fact there is not much experimental work that shows TOP2A mutations relevance in HGG or strong enough evidence of the variant's function I would suggest to change a bit the scope of the title. Ad. The description of the results and the discussion have been revised to include additional data and discussion on technical aspects and on other findings not related to TOP2A. We performed additional computational analyses of TOP2A expression and mutations in the TCGA datasets. We believe that the planned experiments on genetically modified cell lines would provide additional support for our claims. We think that the revised version strikes a good balance between the landscape/NGS content and the TOP2A content.

      Reviewer #1 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. The authors describe a methodology that proved to be sensitive and specific enough in order to allow them to detect rare genetic alterations in patient glioma samples. This information could be valuable to describe new driver mutations or infer in genetic pathway alterations that could be potential therapeutic targets. As the authors state at the beginning of the paper, given the poor therapeutical approaches existing for HGG currently, information of this kind could still be highly useful and provide a better outcome to a specific cohort of patients.

      On a personal note, I think there is too much speculation about how TOP2A mutations could be interesting from a biological point of view (authors referred to evidence about implications of this mutation in other forms of cancer) but since no experimental validation is provided in glioma cells, it is difficult to conclude that this enzyme gain-of-function mutation could have a relevant role in HGG and thus make these variants a potential therapeutic target. There are no experiments conducted in glioma cells that express TOP2A variants, it would be interesting to see if it has an effect in the migratory/invasive phenotype like described in other cancer types or like it is suggested by analysis of the genetic pathways activated in the HGG patients samples harboring TOP2A mutation. In addition, there is no evidence of the TOP2A mutations possible role as a driver mutation, which is an interesting aspect that could be further explored from both a computational and an experimental approach.

      Ad. As mentioned above, there are no glioma cells that express TOP2A variants, and we are not convinced that such an experiment will be feasible, given the essential role of TOP2A. We will attempt to perform experiments with CRISPR/Cas9 knock-in cell lines and functional validation, but so far we have not achieved a knock-in in glioma cells. We will try to knock out the endogenous TOP2A using CRISPR and express the TOP2A WT or E948Q variant from plasmids encoding these proteins, but we cannot predict whether TOP2A KO cells would survive. If we manage to produce such cells, we will investigate the proliferation, migration and invasion of cells expressing the TOP2A WT or the mutated variant.

      We do agree with the reviewer that our previous conclusions were too strong, and in the revised version we moderated our claims. We do not say that the mutated TOP2A is an oncogenic driver. We suggest that this mutation (and possibly other TOP2A mutations, as we analyzed the impact of other variants on the TOP2A protein structure) contributes to gliomagenesis.

      __The data in Fig. 1A suggest that TOP2A has a mutational hotspot at position 948 (E948Q) in our dataset. In the revised version of the manuscript we have extended the RNA-seq analysis of our datasets and the TCGA PanCancer datasets to search for TOP2A mutations and overexpression. We found that an additional computational prediction using the CADD algorithm places TOP2A E948Q in the top 1% of most deleterious variants in the human genome (CADD score >20). This result was added to Supplementary Table 2.__
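
      For readers unfamiliar with how a CADD score of 20 maps to the "top 1%", the following minimal sketch may help. It assumes the PHRED-like scaling used for scaled CADD scores, in which the score equals -10 times the base-10 logarithm of the variant's rank fraction among all possible substitutions. The function names are hypothetical and the sketch is not part of the authors' analysis.

```python
import math


def cadd_phred_to_top_fraction(phred_score):
    """Fraction of all possible variants ranked at or above this scaled score."""
    return 10 ** (-phred_score / 10)


def top_fraction_to_cadd_phred(fraction):
    """Scaled score that corresponds to a given top-ranked fraction."""
    return -10 * math.log10(fraction)


# Thresholds commonly quoted for scaled CADD scores.
for score in (10, 20, 30):
    fraction = cadd_phred_to_top_fraction(score)
    print(f"score {score}: top {fraction * 100:g}% of variants")
```

      Under this scaling, a score above 20 corresponds to the top 1% of ranked variants, which is the threshold quoted in the response above.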

      • Place the work in the context of the existing literature (provide references, where appropriate). The quality of the paper is high and in line with other studies in the literature that perform genome and transcriptome analysis of tumor samples. It is only the experimental validation that is lacking data supporting the "in silico" findings. Ad. We would like to point out that we provided the results of experimental, biochemical validation (2 assays) showing that the variant TOP2A proteins have different properties. The association of transcriptional dysregulation with gliomas bearing the variant TOP2A was not an in silico prediction but the result of an analysis of real tumor samples.

      As stated above, we are ready to perform further biological validation if the editors find it necessary.

      • State what audience might be interested in and influenced by the reported findings. Computational biologists are the right audience to target this paper. If additional experimental work further validating their initial bioinformatic findings is added to the manuscript then probably a wider population could be targeted.

      Ad. As stated above, we are now working on providing more replicates of the biochemical assays, and we are ready to perform further biological validation if the editors find it necessary. I would like to stress that genome editing by knock-in is not always feasible, and this type of experiment is time-consuming and costly.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Brain tumors, immunotherapy, cancer stem cells, tumor microenvironment, tumor heterogeneity. I do not have sufficient expertise to evaluate the bioinformatic analysis and software/programs used to analyze the NGS data.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      By exon-targeted resequencing of 664 genes frequently mutated in cancer, the authors identify novel mutations associated with glioma in a cohort of 182 Polish and Canadian samples. Most of these novel mutations have been identified as potential rare germline mutations, somatic mosaicism or loss-of-heterozygosity variants. Among them, the authors focus on mutations in the TOP2A gene, which encodes one of the two type II topoisomerase paralogs present in humans. Based on a limited number of in vitro experiments, the authors conclude that the recurrent TOP2A variant E948Q displays increased binding to DNA and increased topoisomerase activity. Therefore, the authors suggest that the TOP2A E948Q variant is a gain-of-function mutation.

      **Major comments:**

      • Authors show an interesting plethora of new exon mutations associated with High Grade Glioma. Nevertheless, the characterization of TOP2A E948Q variant, which is the main focus of the study, although very interesting and potentially clinically relevant, remains incomplete. Association of the TOP2A E948Q glioma variant with a gain-of-function mutation would require to improve the statistical power of the presented experiments (increase number of replicates). With the existing experimental evidence, the increased DNA binding and activity of the TOP2A E948Q variant should be considered as preliminary, especially in the case of 431-1193 aa fragment. I would consider mandatory to increase experimental replicates and to analyse statistical significance in the case of DNA binding experiments and DNA relaxation assays with the TOP2A 431-1193 aa fragment. A more detailed biochemical characterization should be performed. A titration of different amounts of protein should be included in these experiments, and at least two batches of purified proteins should be analysed. Decatenation assays should also be performed to characterize the activity of the mutant protein in more detail. Recapitulation of DNA binding and activity results with other TOP2A variants obtained in this study will significantly reinforce authors claims too. This improved biochemical characterization should not take longer than two months.

      Ad. We would like to stress that, although only two replicates are presented, we tested two forms of the TOP2A protein and the results were similar, which supports our conclusions. We agree, however, that additional replicates would strengthen our claims. We are therefore producing another batch of recombinant proteins to increase the number of replicates and to calculate statistics for the biochemical assays (binding and relaxation assays). We will also perform a titration of different amounts of protein using two batches of purified proteins.

      The occurrence of other TOP2A variants is low (identified in only a single patient sample), therefore we will perform experimental validation only for E948Q. However, we performed additional computational analysis for other TOP2A variants showing the influence of the substitution on DNA binding by docking the DNA fragment into TOP2A binding pocket (Supplementary table 4).

      • To increase the significance of the results, I would encourage authors to include experiments showing the functional impact of this TOP2A mutation in cells. The connection with transcriptomic alterations is merely correlative, and would be greatly strengthened by functional experiments in cellular models. To draw definitive conclusions regarding the changes in transcription, I would encourage authors to complement the results with experiments that point to the physiological impact of TOP2A variants within the cell. Overexpression of WT and E948Q variants in a cell model and transcriptomic analysis would be desirable, but validation in these experimental models of some of the target genes identified as deregulated in patients could suffice. These experiments could be accomplished in no more than 3-4 months.

      Ad. We agree that the connection of the TOP2A mutation with transcriptomic alterations is correlative, and would be greatly strengthened by functional experiments in cellular models. If we develop a TOP2A E948Q knock-in cell line or TOP2A KO cell line with E948Q over-expression, we are planning to evaluate transcriptomic changes on selected genes by qPCR or whole transcriptome by RNAseq. We estimate that developing a stable CRISPR/Cas9 cell line may take up to 6 months.

      We provided additional results showing that the connection of the TOP2A mutation with transcriptomic alterations may be due to different expression of splicing factors (Supplementary Fig. 6).

      • Some of the methods are not presented with sufficient detail. Regarding the DNA and RNA sequencing experiments, I consider necessary to specify the DNA fragmentation method, reference for the indexed adapters and ligation and amplification procedures (ligase reference, number of PCR cycles, etc). It would be helpful to clarify or reference which are the "special oligonucleotide probes" that are mentioned. Finally, a reference for the "special beads" and final amplification number of cycles is needed. The sequence of primers used for TOP2A cloning and mutagenesis should be included. The reference for the "site mutagenesis kit" used is missing. When studying the survival rate of glioma patients depending of TOP2A expression levels, it should be clarified what is considered HIGH or LOW expression (i.e: which percentiles are used).

      Ad. We expanded the description of the methodological aspects of the DNA and RNA sequencing experiments, and more details are provided in the revised version. Regarding cloning and mutagenesis, we added a table with primer sequences (Supplementary Table 5). We did not use any kit for cloning and mutagenesis; standard methods and primers with modified nucleotides were used.

      We have included information about the partitioned groups in the survival analyses in the Figure 2 caption: “D - Kaplan-Meier overall survival curve for patients with high (> TOP2A mRNA median expression x 1.25) or low (…

      • There is a major concern about how the experiments are replicated and about the statistical analysis, which is inexistent in some cases. Indeed, Figures 4 and 5 do not present any statistical analysis, it is therefore hard to draw any conclusion. In Figure 4b, the results for the 890-996 aa fragment looks qualitatively clear, but this is not the case for the 431-1193 aa fragment. More replicates and statistical analysis are mandatory, together with a protein titration. The replicates should be performed with at least two independent batches of protein purifications. The individual values of each experiment should be included in the graph to provide a better understanding of experimental variability. All this also applies to Figure 5.

      Ad. We will increase the number of replicates for the binding and relaxation assays. We will perform a titration of different amounts of protein in these experiments using two batches of purified proteins.

      **Minor comments:**

      • The effect on transcription of co-occurrence of TOP2A mutations with other mutations could also be analysed with the already available data. Also, a more detailed analysis of genome-wide transcription could also be used to at least partially address the proposed hypotheses of increased transcriptional rate or splicing aberrations.

      Ad. We don’t have enough samples with the TOP2A mutation to analyze the effect on transcription of co-occurrence of TOP2A with other mutations.

      We addressed the hypothesis of increased transcriptional rate or splicing aberrations by performing additional analyses of the RNA-seq data. Indeed, we found splicing machinery genes to be down-regulated in the E948Q TOP2A glioma samples (Supplementary Fig. 6).

      • There is no reference for the following argument "As the identified germline variants were exceptionally rare in the general population ... it is likely that these variants are pathogenic". I also find low number of references to support the suggested high frequency of altered genes in gliomas compare to other cancer types. I miss specific works relating TOP2 activity with transcriptional regulation.

      Ad. The appropriate references are now provided to back up these statements.

      • At several points in the text there are quantitative and comparative statements that should be backed up by the actual numbers (e.g. "The results of the targeted sequencing indicate a high frequency of altered genes", "The most altered gene was TP53, followed by IDH1...", "Other genes that were found to be frequently altered included KDM6B...", "These partial results combined with a low frequency of this variant in the Polish population suggest a somatic mutation"). The same thing applies to the co-occurrence of mutations, in which the percentage of co-occurrence and significance is not indicated. This lack of detail in the description is also observed in the description of the transcriptomic alterations in which no detail is provided regarding how many of the 105 analyzed samples correspond to low or high gliomas.

      Ad. We apologize that the frequencies of mutated genes were not specified. This information is included in the main text of the revised version. We now provide a gnomAD frequency for all variants of interest, confirming the low frequency in the population (AF …).

      Regarding the total number of samples in the transcriptomic analysis, we provided an updated supplementary table covering also samples that were used for transcriptomic analyses (Supplementary Table 1).


      • For TOP2A mutation analysis, it is sometimes not clear when the analysis is done with the 9 mutated samples and when with the 4 recurrent TOP2A E948Q variants. For example, in figure 2b and 2c analysis are done with 9 samples while the figure 2e is based on the 4 E948Q variants. At least this is what I have deduced from the main text; it should be clarified in the figure legend.

      Ad. This information has been included in the captions of Figure 2B, 2C and 2E and now we specify how many samples were used in each analysis.

      • Fig1. In figure 1b it would be interesting to color-code patients by glioma grade. This would also apply to Figure S1a, S1c, 2a, S3 and S4. In figure 1D it would be very informative to distinguish mutations that passed the quality control or not with different colors.

      Ad. Following the reviewer’s suggestions, we have added this information, and the oncoplot figures derived from the germline analysis now have a distinct color for each glioma grade. In Figure 1D, all of the presented mutations have passed quality control in terms of sequencing quality. One additional criterion that was used for all genomic results (except some of the TOP2A variants) was 20% variant penetration (20% of reads at the position had to come from the alternative allele). We corrected the description in the Supplementary Table to “passed 20% penetration criterion”. The rationale for exempting some TOP2A variants from this criterion was that in one of the E948Q samples the penetration was ~13%, and we did not want to lose this sample from the analysis given the rarity of the mutation.
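
      The 20% penetration criterion and the TOP2A exception described above can be illustrated with a minimal sketch. The record layout (`gene`, `ref_reads`, `alt_reads`), the function names, and the read counts are hypothetical and are not taken from the authors' pipeline; only the 20% threshold and the fact that some TOP2A variants were exempted come from the response. For simplicity the sketch exempts every TOP2A variant, which is an assumption.

      ```python
      from dataclasses import dataclass


      @dataclass
      class Variant:
          gene: str
          ref_reads: int  # reads supporting the reference allele
          alt_reads: int  # reads supporting the alternative allele


      def alt_fraction(v: Variant) -> float:
          """Fraction of reads at the position that carry the alternative allele."""
          total = v.ref_reads + v.alt_reads
          return v.alt_reads / total if total else 0.0


      def passes_penetration(v: Variant, threshold: float = 0.20,
                             exempt_genes: frozenset = frozenset({"TOP2A"})) -> bool:
          """True if the variant meets the threshold or its gene is exempt (assumed rule)."""
          return v.gene in exempt_genes or alt_fraction(v) >= threshold


      # Illustrative read counts only (not real data).
      variants = [
          Variant("TP53", 60, 40),   # 40% alternative reads: kept
          Variant("IDH1", 90, 10),   # 10% alternative reads: dropped
          Variant("TOP2A", 87, 13),  # 13% alternative reads: kept only via the exemption
      ]
      kept = [v.gene for v in variants if passes_penetration(v)]
      ```

      Passing `exempt_genes=frozenset()` applies the threshold uniformly, which makes it easy to see which samples are retained only because of the exemption.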


      • Fig2. In figure 2b and 2c the statistical significance of differences between TOP2A and the rest of genotypes should be included. Looking at Figures 2d and 2e it looks surprising how similar is the overall survival of HIGH TOP2A mRNA expression (500 days, fig 2d) with the overall survival of the TOP2A WT samples (400 days, fig 2e). Here I would include a graph that summarizes the TOP2A mRNA expression levels of each group in fig 2d and 2e.

      Ad. We agree that median overall survival is similar when comparing patients with high TOP2A mRNA expression to TOP2A WT patients in our cohort. It is worth noting, however, that the two datasets were produced using different library protocols and different methodology, so the levels cannot be expected to be equal. We think that adding two more graphs, as suggested, adds another layer of information to this section of the analysis. We have included two boxplots depicting TOP2A mRNA RPKMs, and it is now clear that the medians of the high TOP2A mRNA group and the TOP2A mutant (E948Q) group are more closely related, despite the fact that we only have a few patients with the mutation.

      • Fig3. It would be interesting to include the same simulation for the rest of TOP2A mutations as supplementary figure.

      Ad. We agree that the other TOP2A SNPs could potentially affect DNA binding. We focused on the recurrent mutation and did not analyze those occurring in a single patient. In the revised version we included predictions of whether these variants could affect TOP2A DNA binding. For WT TOP2A and the variants, we calculated the Gibbs free energy (ΔG). This information can be found in Supplementary Table 4. We have extended the description in the Results section “The TOP2A E948Q substitution may affect protein-DNA interactions”.

      • Fig4 and Fig5. Include statistical analysis and dots representing individual replicates.

      Ad. For Fig 4 we have two replicates for two protein fragments, so we can’t present statistics now. As mentioned above, we are preparing a new batch of proteins and will perform more repetitions of the EMSA and relaxation assays. For Fig 5 we have 3 replicates, but despite a trend there is no statistical significance. We intend to perform more replicates with a separate protein preparation. After including the additional repetitions we will present the results as dots representing individual replicates.

      • Fig6. In Figure 6d I would increase the size differences in the dots representing the gene counts, as it is not easily perceived with current parameters.

      Ad. The dot size in Fig 6d did not reflect the true meaning. To make it easier to understand, we changed a plot type to a barplot, which now represents the number of differentially expressed genes involved in each pathway.

      • FigS2. In figure S2B, it would be informative to establish which dots are significatively above or below the diagonal.

      Ad. The purpose of this figure was to show which oncogenic signaling pathways from TCGA cohorts were affected in our cohort. The pathway's size is a variable that is used to normalize the calculation (shown on the abscissa in S2B). The RTK-RAS and NOTCH pathways contain hundreds of genes, whereas other pathways, such as the NRF2 oncogenic pathway, contain only a few. On the other hand, we counted how many genes in each pathway were mutated in our cohort (shown on the ordinate in S2B). We used logarithms on both axes for visualization purposes, but this has no effect on the enrichment of these pathways, which is shown in the color-coded legend.
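
      The relationship between the two axes of S2B can be sketched as follows. The response does not state how the enrichment is computed, so the sketch assumes it is the fraction of a pathway's genes that are mutated; the function name and the numbers in the usage line are placeholders, not values from the TCGA pathway definitions or from the cohort. The point illustrated is that log-scaling both axes changes only the plotted coordinates: the vertical distance of a point from the diagonal equals the logarithm of that fraction.

      ```python
      import math


      def pathway_summary(pathway_size: int, mutated_count: int) -> dict:
          """Plot coordinates and mutated fraction for one pathway (assumed definition)."""
          if pathway_size <= 0:
              raise ValueError("pathway_size must be positive")
          return {
              # x coordinate: log10 of the number of genes in the pathway
              "log10_size": math.log10(pathway_size),
              # y coordinate: log10 of the mutated genes (minus infinity if none)
              "log10_mutated": math.log10(mutated_count) if mutated_count > 0 else float("-inf"),
              # size-normalised value, unaffected by the axis scaling
              "fraction_mutated": mutated_count / pathway_size,
          }


      # Placeholder numbers for illustration only.
      example = pathway_summary(pathway_size=100, mutated_count=10)
      offset_from_diagonal = example["log10_mutated"] - example["log10_size"]
      ```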

      • FigS3. How were the samples shown selected from the total?

      Ad. In this plot we show only somatic variants that were found in at least two different patients. We apologize that this information was missing, and we have added it to the figure's caption.

      • FigS4. I would include a line with the TOP2A mutation to have an idea of how these mutations are distributed between groups.

      Ad. Based on the feedback of the reviewer, this figure has been modified and improved. A new row has been added to the figure, displaying TOP2A mutations alongside other highly frequent mutations in other genes.

      Reviewer #2 (Significance (Required)):

      In this work authors have identified new mutations associated to gliomas by targeted exome sequencing using an important cohort of 182 samples. Among these new mutations epigenetic enzymes and modifiers are found. These results potentially increase the repertoire of putative molecular targets for future cancer therapies. Authors focus in mutations associated to TOP2A gene, that provides stronger DNA binding and DNA relaxation capacity in vitro. Although further characterization is needed, tumours harbouring this kind of mutations could show higher level of sensitivity to TOP2 drugs, providing potentially interesting clinical implications. Although the link between TOP2A expression and cancer prognosis is well established, the relevance of specific mutations is still largely unexplored.

      On one hand this work brings novelties in the field of Glioma providing a series of putative new players in the development of this type of cancer. Audience interested in basic or clinical aspects of these tumours would be a good target for this work. On the other hand, this putative gain-of-function mutation of TOP2A represents an interesting aspect for the DNA topology and topoisomerases field, although, as stated above, a more detailed biochemical and functional characterization would be required to draw the attention of this audience.

      Scientifically, I have experience in the DNA topology and topoisomerases field, 3D genome organization and gene regulation. I have no experience in Gliomas or any other clinical aspect of cancer, so it is difficult for me to properly establish the potential impact of the newly discovered mutations. Technically I have no capacity to critically evaluate the aspects related to the targeted exome sequencing and the suitability of the analysis performed for mutation identification.

      **Referee Cross-commenting**

      I fully agree with the comments of the other reviewer, which are perfectly aligned with my own regarding the preliminary nature of the conclusions about the biochemical and functional characterization of the TOP2A mutations.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      By exon targeted resequencing of 664 genes frequently mutated in cancer, authors identify novel mutations associated to Glioma in a cohort of 182 Polish and Canadian samples. Most of these novel mutations have been identified as potential rare germline mutations, somatic mosaicism or loss-of-heterozygosity variants. Among them, authors focus on mutations associated to the TOP2A gene, which encodes one of the two Type II topoisomerases paralogs present in humans. By a limited number of in vitro experiments, authors conclude that TOP2A recurrent variant E948Q, displays increased binding to DNA and topoisomerase activity. Therefore, authors suggest that the TOP2A E948Q variant is a gain-of-function mutation.

      Major comments:

      • Authors show an interesting plethora of new exon mutations associated with High Grade Glioma. Nevertheless, the characterization of TOP2A E948Q variant, which is the main focus of the study, although very interesting and potentially clinically relevant, remains incomplete. Association of the TOP2A E948Q glioma variant with a gain-of-function mutation would require to improve the statistical power of the presented experiments (increase number of replicates). With the existing experimental evidence, the increased DNA binding and activity of the TOP2A E948Q variant should be considered as preliminary, especially in the case of 431-1193 aa fragment. I would consider mandatory to increase experimental replicates and to analyse statistical significance in the case of DNA binding experiments and DNA relaxation assays with the TOP2A 431-1193 aa fragment. A more detailed biochemical characterization should be performed. A titration of different amounts of protein should be included in these experiments, and at least two batches of purified proteins should be analysed. Decatenation assays should also be performed to characterize the activity of the mutant protein in more detail. Recapitulation of DNA binding and activity results with other TOP2A variants obtained in this study will significantly reinforce authors claims too. This improved biochemical characterization should not take longer than two months.
      • To increase the significance of the results, I would encourage authors to include experiments showing the functional impact of this TOP2A mutation in cells. The connection with transcriptomic alterations is merely correlative, and would be greatly strengthened by functional experiments in cellular models. To draw definitive conclusions regarding the changes in transcription, I would encourage authors to complement the results with experiments that point to the physiological impact of TOP2A variants within the cell. Overexpression of WT and E948Q variants in a cell model and transcriptomic analysis would be desirable, but validation in these experimental models of some of the target genes identified as deregulated in patients could suffice. These experiments could be accomplished in no more than 3-4 months.
      • Some of the methods are not presented with sufficient detail. Regarding the DNA and RNA sequencing experiments, I consider necessary to specify the DNA fragmentation method, reference for the indexed adapters and ligation and amplification procedures (ligase reference, number of PCR cycles, etc). It would be helpful to clarify or reference which are the "special oligonucleotide probes" that are mentioned. Finally, a reference for the "special beads" and final amplification number of cycles is needed. The sequence of primers used for TOP2A cloning and mutagenesis should be included. The reference for the "site mutagenesis kit" used is missing. When studying the survival rate of glioma patients depending of TOP2A expression levels, it should be clarified what is considered HIGH or LOW expression (i.e: which percentiles are used).
      • There is a major concern about how the experiments are replicated and about the statistical analysis, which is inexistent in some cases. Indeed, Figures 4 and 5 do not present any statistical analysis, it is therefore hard to draw any conclusion. In Figure 4b, the results for the 890-996 aa fragment looks qualitatively clear, but this is not the case for the 431-1193 aa fragment. More replicates and statistical analysis are mandatory, together with a protein titration. The replicates should be performed with at least two independent batches of protein purifications. The individual values of each experiment should be included in the graph to provide a better understanding of experimental variability. All this also applies to Figure 5.

      Minor comments:

      • The effect on transcription of co-occurrence of TOP2A mutations with other mutations could also be analysed with the already available data. Also, a more detailed analysis of genome-wide transcription could also be used to at least partially address the proposed hypotheses of increased transcriptional rate or splicing aberrations.
      • There is no reference for the following argument "As the identified germline variants were exceptionally rare in the general population ... it is likely that these variants are pathogenic". I also find low number of references to support the suggested high frequency of altered genes in gliomas compare to other cancer types. I miss specific works relating TOP2 activity with transcriptional regulation.
      • At several points in the text there are quantitative and comparative statements that should be backed up by the actual numbers (e.g. "The results of the targeted sequencing indicate a high frequency of altered genes", "The most altered gene was TP53, followed by IDH1...", "Other genes that were found to be frequently altered included KDM6B...", "These partial results combined with a low frequency of this variant in the Polish population suggest a somatic mutation"). The same thing applies to the co-occurrence of mutations, in which the percentage of co-occurrence and significance is not indicated. This lack of detail in the description is also observed in the description of the transcriptomic alterations in which no detail is provided regarding how many of the 105 analyzed samples correspond to low or high gliomas.
      • For TOP2A mutation analysis, it is sometimes not clear when the analysis is done with the 9 mutated samples and when with the 4 recurrent TOP2A E948Q variants. For example, in figure 2b and 2c analysis are done with 9 samples while the figure 2e is based on the 4 E948Q variants. At least this is what I have deduced from the main text; it should be clarified in the figure legend.
      • Fig1. In figure 1b it would be interesting to color-code patients by glioma grade. This would also apply to Figure S1a, S1c, 2a, S3 and S4. In figure 1D it would be very informative to distinguish mutations that passed the quality control or not with different colors.
      • Fig2. In figure 2b and 2c the statistical significance of differences between TOP2A and the rest of genotypes should be included. Looking at Figures 2d and 2e it looks surprising how similar is the overall survival of HIGH TOP2A mRNA expression (500 days, fig 2d) with the overall survival of the TOP2A WT samples (400 days, fig 2e). Here I would include a graph that summarizes the TOP2A mRNA expression levels of each group in fig 2d and 2e.
      • Fig3. It would be interesting to include the same simulation for the rest of TOP2A mutations as supplementary figure.
      • Fig4 and Fig5. Include statistical analysis and dots representing individual replicates.
      • Fig6. In Figure 6d I would increase the size differences in the dots representing the gene counts, as it is not easily perceived with current parameters.
      • FigS2. In figure S2B, it would be informative to establish which dots are significatively above or below the diagonal.
      • FigS3. How were the samples shown selected from the total?
      • FigS4. I would include a line with the TOP2A mutation to have an idea of how these mutations are distributed between groups.

      Significance

      In this work authors have identified new mutations associated to gliomas by targeted exome sequencing using an important cohort of 182 samples. Among these new mutations epigenetic enzymes and modifiers are found. These results potentially increase the repertoire of putative molecular targets for future cancer therapies. Authors focus in mutations associated to TOP2A gene, that provides stronger DNA binding and DNA relaxation capacity in vitro. Although further characterization is needed, tumours harbouring this kind of mutations could show higher level of sensitivity to TOP2 drugs, providing potentially interesting clinical implications. Although the link between TOP2A expression and cancer prognosis is well established, the relevance of specific mutations is still largely unexplored.

      On one hand this work brings novelties in the field of Glioma providing a series of putative new players in the development of this type of cancer. Audience interested in basic or clinical aspects of these tumours would be a good target for this work. On the other hand, this putative gain-of-function mutation of TOP2A represents an interesting aspect for the DNA topology and topoisomerases field, although, as stated above, a more detailed biochemical and functional characterization would be required to draw the attention of this audience.

      Scientifically, I have experience in the DNA topology and topoisomerases field, 3D genome organization and gene regulation. I have no experience in Gliomas or any other clinical aspect of cancer, so it is difficult for me to properly establish the potential impact of the newly discovered mutations. Technically I have no capacity to critically evaluate the aspects related to the targeted exome sequencing and the suitability of the analysis performed for mutation identification.

      Referee Cross-commenting

      I fully agree with the comments of the other reviewer, which are perfectly aligned with my own regarding the preliminary nature of the conclusions about the biochemical and functional characterization of the TOP2A mutations.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). Please place your comments about significance in section 2. In this paper the authors used a targeted approach to identify rare mutations in a cohort of glioma patients. Using this approach they identified a recurrent mutation in the TOP2A gene encoding for Topoisomerase 2A, and suggest that this mutation creates a more effective protein, binding DNA strongly and maybe more enzymatically active. RNAseq analysis of TOP2Awt and TOP2Amut tumor samples suggest different transcription patterns and points to possible splicing defects. The most recurrent variant (E948Q) is described in depth and some experimental information shows this variant might be a gain-of-function mutation.

      Major comments:

      • Are the key conclusions convincing? The validation of both the methodology and the presence of never described TOP2A variations in HGG is done quite successfully. Interesting evidence about relevance of the most frequent mutation is provided. However, besides having computational and biochemistry assays performed, lack of details about in vitro experiment statistics (no p-values are provided in figures 4 and 5, neither sample size, repetitions) weakens the conclusions claimed by the authors about the properties of the mutated topoisomerase.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Claims about E948Q variant function should be revised. Data is not presented in a convincing way, plus there is ambiguous language used from the results ("We conclude that the E948Q TOP2A protein is functional, and MIGHT have a higher activity than the WT protein") to the rest of the paper where they strongly support the claims about the TOP2A activity.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. In line with the presented data in the paper, additional experiments that show catalytic changes of the E948Q variation must be added. It is shown that there are differences in the DNA binding capacity by EMSA compared to the WT form, however, the DNA supercoil relaxation activity is not that different, at least the way the results are presented. The authors suggest that TOP2A mutation is a driver mutation but no validation in vitro of this claim is shown. Can this mutation alone or in combination with e.g. tumor suppressors transform normal cells to cancer cells? Do cell lines expressing this mutation (compared to parental TOP2A wt expressing cells) display increased transcription? Increased invasion?

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. If the authors can complement the already presented in vitro experiments with additional ones supporting their hypothesis, this should be feasible. The authors can use patient derived glioma cells or glioma cell lines manipulated to express either the parental TOP2A wt enzyme or the identified mutated form.

      • Are the data and the methods presented in such a way that they can be reproduced? Yes, the authors provide a quite detailed explanation of the methods implemented to reach each one of the results they are presenting.

      • Are the experiments adequately replicated and statistical analysis adequate? No, there is no information about the statistical analysis or number of replicates in any of the in vitro experiments performed. This information should be added to the manuscript.

      Minor comments:

      • Specific experimental issues that are easily addressable.

      • Are prior studies referenced appropriately? Yes, authors clearly address the state of the art regarding previous NGS methodologies and let us know the advantages and novelty of their approach.

      • Are the text and figures clear and accurate? There are some discrepancies between the strength of the language used in different sections of the paper to refer the conclusions they can infer from the results they are showing. While they are all valid, authors should revise it.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? First of all, describe the statistical analysis used in every figure, include number of biological and technical replicates. I would also suggest to change the title or the scope of the discussion, there is too much focus on the TOP2A in the introduction, neglecting all the technical NGS work that actually lead to several new variants being described. This may be confusing when it collides with a conclusion that is heavily focused on the first half describing potential implications of at least another 3 proteins where genetic alterations were described. Given the fact there is not much experimental work that shows TOP2A mutations relevance in HGG or strong enough evidence of the variant's function I would suggest to change a bit the scope of the title.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. The authors describe a methodology that proved to be sensitive and specific enough in order to allow them to detect rare genetic alterations in patient glioma samples. This information could be valuable to describe new driver mutations or infer in genetic pathway alterations that could be potential therapeutic targets. As the authors state at the beginning of the paper, given the poor therapeutical approaches existing for HGG currently, information of this kind could still be highly useful and provide a better outcome to a specific cohort of patients.

      On a personal note, I think there is too much speculation about how TOP2A mutations could be interesting from a biological point of view (the authors refer to evidence about implications of this mutation in other forms of cancer), but since no experimental validation is provided in glioma cells, it is difficult to conclude that this enzyme gain-of-function mutation could have a relevant role in HGG and thus make these variants a potential therapeutic target. There are no experiments conducted in glioma cells that express TOP2A variants; it would be interesting to see whether they affect the migratory/invasive phenotype, as described in other cancer types or as suggested by the analysis of the genetic pathways activated in the HGG patient samples harboring the TOP2A mutation. In addition, there is no evidence for a possible role of the TOP2A mutations as driver mutations, which is an interesting aspect that could be further explored with both computational and experimental approaches.

      • Place the work in the context of the existing literature (provide references, where appropriate). The quality of the paper is high and in line with other studies in the literature that perform genome and transcriptome analysis of tumor samples. It is only the experimental validation that is lacking data supporting the "in silico" findings.

      • State what audience might be interested in and influenced by the reported findings. Computational biologists are the right audience to target this paper. If additional experimental work further validating their initial bioinformatic findings is added to the manuscript then probably a wider population could be targeted.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Brain tumors, immunotherapy, cancer stem cells, tumor microenvironment, tumor heterogeneity. I do not have sufficient expertise to evaluate the bioinformatic analysis and software/programs used to analyze the NGS data.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We are grateful for the constructive and highly supportive reviews provided by our Reviewers. We especially appreciate the efforts they have made to provide suggestions on how to make our revised manuscript even more robust. We have incorporated many of these suggestions into the revised manuscript that will be posted to bioRxiv and submitted to an affiliate journal. We have provided point-by-point responses to each Reviewer below each item (starting with Response: …), along with any changes made in response to that comment/suggestion (starting with In our revised manuscript, …).

      Finally, we agree with all Reviewers that this work should be of broad interest to the molecular biology, cell biology, and parasitology communities. Our discovery that Plasmodium and two related genera have taken the unorthodox approach of duplicating their NOT1 protein, and that Plasmodium has dedicated it to its unique transmission strategy, is a fascinating adaptation of the use of this core eukaryotic complex. We believe that those who focus on diverse aspects of RNA biology, including RNA preservation/decay, the maternal to zygotic transition, translational repression, and beyond will find this work to be of interest and relevant to their own research questions.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript “The Plasmodium NOT1-G paralogue acts as an essential nexus for sexual stage maturation and parasite transmission” investigates the two forms of NOT1 in rodent malaria parasites. The authors found that the original NOT1 is crucial for gametocyte induction as well as transmission to the mosquito, and they therefore renamed it NOT1-G. The paralogous protein, on the other hand, appears to be crucial for intraerythrocytic growth, since it cannot be knocked out. The authors then investigated NOT1-G in more detail, using standard phenotyping assays. They found a slightly increased gametocytemia and a minor effect on transmission to the mosquito.

      Response: In our submitted manuscript, we do focus on PyNOT1-G because of the exciting role it has for both sexes of gametocytes, which results in a complete defect in transmission to mosquitoes. Our investigation of which domains of PyNOT1-G are required for this role focused on the most likely suspect: the putative tristetraprolin-binding domain (TTPbd). It was through deletion of this domain that we observed only a minor defect in the prevalence of infection of mosquitoes, indicating that the portion of PyNOT1-G that is required for transmission lies elsewhere (in part or in total). It is also important to correct Reviewer 1’s statement regarding the other (perhaps canonical) PyNOT1. To our surprise, PyNOT1 could be deleted, but its deletion resulted in a parasite that has an extreme fitness cost and a very slow growth phenotype. This is in stark contrast to other eukaryotes, where NOT1 is essential.

      Reviewer #1 (Significance (Required)):

      If the authors are able to provide convincing data that NOT1-G is indeed important for gametocyte induction and transmission to the mosquito, then the report would be of high significance for the malaria and molecular cell biology fields.

      Response: We have in fact shown this and more in the originally submitted manuscript, and thus we are grateful that Reviewer 1 considers this work to be of high significance to a broad readership (molecular and cell biology, parasitology). In our revised manuscript, we have added text throughout to make these results even more apparent and clear for the reader.

      My expertise: molecular cell biology of gametocytes, translational regulation, parasite transmission

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      The manuscript by Hart et al. builds upon a fascinating finding presented in a previous manuscript by the same authors, in which they show that CCR4 seems to be able to associate with two members of the NOT1 family. In this work, the authors first re-annotate the two NOT1 paralogs in Plasmodium yoelii and then perform an in-depth characterization of the role of NOT1-G during gametocytogenesis and early mosquito development. Using gene knockout and different genetic crosses, the authors show that NOT1-G is essential for male gametocyte development and that its loss leads to an arrest of development in zygotes arising from female gametocytes. Using RNA-seq, the authors show that deletion of NOT1-G leads to lower transcript abundances, leading to the hypothesis that NOT1-G might be involved in preserving mRNAs in a larger RNA-binding complex. Lastly, the authors characterize a NOT1-G-defining TTP-binding domain and find that it is not essential for either the male or the female phenotype observed for the whole gene KO.

      Response: We appreciate the concise and accurate summary of these findings.

      **Major comments:**

      • Are the key conclusions convincing?

        The phenotypic characterization of NOT1-G during gametocytogenesis / early mosquito development is nicely presented and the experiments are well performed. Because a duplication of NOT1 with possibly opposing roles of the paralogs is a very unique feature with broad implication on RNA metabolism, it would have been great to see two select experiments on the molecular level adding evidence that 1) NOT1/NOT1-G are mutually exclusive in a complex with CCR4/CAF1 and 2) NOT1-G acts post-transcriptionally in an antagonistic way to NOT1 (i.e. as a mRNA 'stabilizer' as proposed by the authors).

      Response: We agree that inclusion of those two aspects would make for a more complete story about these two NOT1 paralogues.

      First, we also think that it is highly likely that NOT1 and NOT1-G are mutually exclusive, as in other eukaryotes NOT1 acts as a scaffold protein upon which effector proteins bind and bridging interactions are made. In our original manuscript, we did not include a mention of our previous attempts to address this question through colocalization and proteomic approaches, as they were largely unsuccessful. Specifically, we generated rabbit polyclonal antisera to PyNOT1-G’s tristetraprolin-binding domain, but it did not pass our rigorous quality control (e.g. too much staining persisted in pynot1-g- parasites). Using both asexual and sexual blood stage parasites, we also attempted immunoprecipitation (with and without chemical crosslinking) and proximal labeling approaches via BioID and TurboID, but none of these approaches produced rigorous results, and thus we did not report them in our original manuscript. However, this question of whether the two NOT1 paralogues are mutually exclusive in complexes was also taken up by the Bozdech Laboratory in their 2020 preprint (Liu et al.), where they were able to capture the P. falciparum NOT1-G and NOT1 proteins (called Not1.1 and Not1.2 in that work). While their proteomic evidence showed that they could capture these bait proteins and that the NOT1 paralogues were not in the same complex, these results should be taken with a grain of salt: all mass spectrometry-based proteomic approaches are limited in that an absence of evidence does not mean that the protein is not present/interacting. Moreover, these efforts only identified a few other proteins that were already known to interact with the CAF1/CCR4/NOT complex, and even so, they did not use statistically rigorous methods to quantify these results. In our revised manuscript, we have included additional text to describe our unsuccessful efforts to do these capture proteomics experiments, and we have expanded our discussion of the Liu et al. findings that provide some evidence in support of a mutually exclusive complex.

      Second, we also hypothesize that PyNOT1-G acts post-transcriptionally to affect mRNA abundance and translation. However, it is important to emphasize that NOT1 proteins typically act as scaffolds, with the recruited effector proteins acting to hasten the degradation of and/or to preserve associated transcripts. We believe that studying these effector proteins is the next important effort to undertake. In fact, we hypothesized that these antagonistic effector proteins would be analogous to the TTP and ELAV/HuR-family proteins found in other eukaryotes, and that the critical interaction with PyNOT1-G would be via its putative TTP-binding domain. It was for that reason that we interrogated the TTP-binding domain itself, and we were surprised that its deletion did not phenocopy the complete gene deletion. Ongoing work will be focused on identifying these antagonistic effector proteins, which are likely expressed in a stage-enriched manner, and on defining how they interact with PyNOT1-G in order to direct specific mRNAs to their fates. Additionally, it would be very important and exciting to directly test if PyNOT1 and PyNOT1-G are functionally opposed. However, this would be exceptionally challenging to study from a technical standpoint. While we were able to delete the pynot1 gene after many repeated attempts, these parasites are very sickly and grow very slowly. Because of this, we believe that assessing direct versus indirect effects of PyNOT1 in these cells would not be feasible or robust. Given this, comparing functions between PyNOT1 and PyNOT1-G could not be done in a conclusive manner. In our revised manuscript, we have expanded our descriptions of the mechanisms by which we believe PyNOT1-G and its complex affect mRNA fates. In particular, we have expanded our Discussion section to incorporate the results that indicate that the TTP-binding domain is not required for the essential functions of PyNOT1-G.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

        The authors describe the role of NOT1-G as 'preserving' mRNA. The lower abundance of many transcripts in the NOT1-G knockout suggests this, but experimental proof is not provided (see suggestions below). Maybe rephrase to 'putatively preserved/stabilized' or 'has a potentially stabilizing function'. The same is true for the mutually exclusive association of the two paralogs with CCR4/CAF1. The authors refer to a protein co-IP of CCR4 showing that CCR4 can interact with both NOT1 and NOT1-G, but a reciprocal experiment is lacking.

      Response: In our first publication on the deadenylase members of this complex, we also saw a similar effect on specific mRNAs when pyccr4-1 was deleted: the abundance of specific mRNAs went up in pyccr4-1- parasites. In that work and here in this manuscript, we have carefully decided to apply the word “preserved” to the fate of these mRNAs as it describes in a general way what is happening. Robustly stating that mRNAs are stabilized by PyNOT1-G (directly or indirectly) would require additional experiments designed to test this (more description on this is provided in a response below). Second, as described above, we agree that doing a reciprocal IP for mass spectrometry-based proteomics would be ideal; we attempted four different approaches to do this, to no avail. However, the composite proteomics data already available in the literature and via the Liu et al. preprint from the Bozdech Lab all indicate that these interactions occur, and perhaps that NOT1 and NOT1-G are mutually exclusive as expected. In our revised manuscript, we have provided further explanation in the Discussion for our use of the descriptor “preserve” instead of “stabilize”, and, as noted above, we have expanded our Discussion to more comprehensively define the interaction network depicted in Figure 7.

      In both cases, the conclusions of the authors are very likely (e.g. downregulation of many genes as seen by RNA-seq), but the final experimental evidence is not provided and a network such as in Figure 7 is not fully supported. If the authors would like to maintain these statements, then they should be rephrased and made clear or the additional experimental evidence suggested below is necessary.

      Response: We hold that the published proteomic datasets do support such a network, with further support offered by the preliminary proteomic evidence from the Liu et al. preprint. Therefore, we have not modified our manuscript beyond the additional text now provided in the Discussion as noted above.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

        The essential claim that NOT1-G is important for gametocytogenesis and early mosquito development is well presented and fully supported by the experiments. As for the role of NOT1-G in 'preserving' mRNA, an mRNA half-life experiment would be necessary (or the text should be adjusted as mentioned above). In a short-term in vitro culture, pynot1-g- and WT parasites could be treated with ActD and abundances of select transcripts are measured by RT-qPCR.

      Response: We appreciate that Reviewer 2 considers the rigor of our experiments to be high. Regarding the use of the term “preserve” vs “stabilize”, we agree that to shift from our more general descriptor (preserve) to one that has specific connotations (stabilize) would require additional experimentation. To correctly and most robustly make the claim of stabilization would require work on par with that done by Painter et al. (PMID: 29985403) that uses a thiol-containing nucleotide (4-TU) along with a yeast-derived fusion enzyme (yFCU) to convert it for use by Plasmodium. Previously we have shown that an associated deadenylase (PyCCR4-1) also acted to preserve mRNAs, and moreover that deletion of its gene resulted in no discernable effect upon the poly(A) tail or 3’ UTR of an mRNA that is bound by this complex (p28).

      While understanding mRNA stability is an exciting area of study, this 4-TU labeling experiment alone warranted a standalone, high-impact publication for Painter et al. As this has not been adapted for any rodent-infectious Plasmodium species to date, and as adaptation of this labeling approach took several years for Dr. Painter while in the Llinas Laboratory (personal communication), we believe this work is beyond the scope of this study. Moreover, the additional information that it would provide to understand NOT1-G functions (preserve vs stabilize) would be incremental beyond the major storyline presented in this manuscript. In our revised manuscript, we have added text to ensure that our choice of “preserve” is well defined and explained.
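
      As a rough illustration of how the ActD time course suggested by Reviewer 2 could be analysed (this is not an analysis performed in this study), transcript half-life might be estimated from RT-qPCR data under an assumed first-order decay model, N(t) = N0·exp(−k·t), so that t½ = ln(2)/k. The function name, time points, and abundance values below are hypothetical and serve only as a sketch.

```python
import math


def estimate_half_life(times_min, rel_abundance):
    """Estimate mRNA half-life (same units as times_min).

    Assumes first-order decay after transcription is blocked, so that
    ln(abundance) falls linearly with time. The decay constant k is the
    negative slope of an ordinary least-squares fit.
    """
    if len(times_min) != len(rel_abundance) or len(times_min) < 2:
        raise ValueError("need at least two matched time/abundance points")
    logs = [math.log(a) for a in rel_abundance]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k = -sxy / sxx  # decay constant
    if k <= 0:
        raise ValueError("abundance does not decay over the time course")
    return math.log(2) / k


# Hypothetical relative abundances (e.g. 2**-dCt, normalised to t = 0)
example_times = [0, 30, 60, 120]
example_abundance = [1.0, 0.5, 0.25, 0.0625]
example_half_life = estimate_half_life(example_times, example_abundance)
```

      Comparing such estimates between pynot1-g- and wild-type parasites for a few select transcripts would address stability directly, although, as noted above, a labeling-based approach would be more rigorous.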

      To support the idea that NOT1 and NOT1-G associate in a mutually exclusive way, or just to show that they act in distinct complexes despite their similar expression patterns, an IFA with a double-stained NOT1/NOT1-G cell line could be performed. Alternatively, the authors could perform a protein co-IP using the already existing NOT1/NOT1-G-GFP cell line and show that the proteins don't interact with each other or even have certain distinct interaction partners.

      Response: We agree, and these studies were attempted but were unsuccessful (described in our responses above). In our revised manuscript, we have included this information as noted above.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

        All necessary cell lines for a NOT1/NOT1-G co-IP and the ActD experiment are already present. The authors already present a ring to schizont in vitro culture (for ActD) and also have substantial experience in protein co-IP and proteomics.

        I am not sure about the cost of a proteomics experiment at the authors' institute and I don't want to make a guess on time investment given the ongoing COVID situation.

      Response: We agree that these experiments would be interesting, but they would be costly to do at a transcriptome-wide scale and would require substantial time to conduct. We believe that the 4-TU approach noted above is the most rigorous, but is well beyond the scope of this study as it has not yet been adapted to rodent-infectious malaria parasites. As noted above, we have attempted four different proteomics approaches to provide reciprocal evidence for the complex composition, all of which were unsuccessful. In our revised manuscript, we have added text to ensure that our choice of “preserve” is well defined and explained, and have noted the unsuccessful reciprocal proteomics approaches.

      • Are the data and the methods presented in such a way that they can be reproduced?

        The MM section is well structured and presented and the supplemental material includes all data.

      Response: Thank you. We want to ensure that our work is clearly described and can be reproduced with the information reported.

      • Are the experiments adequately replicated and statistical analysis adequate?

        There is hardly any test of significance presented in the main text of the manuscript (e.g. Figure 3B and 4A). Please show the individual data points for these graphs and make sure the n= and the statistical test is described in the figure legend. If you use the term significant in the text, then just add the p-value behind it. This is also true for the RNA-seq data: Genes are sorted by fold-changes, leaving it unclear if these changes are significant. These data are however presented in Table S1 and could be incorporated in the main text.

      Response: We agree. In our revised manuscript, we have incorporated additional details about the statistical tests used, p-values for noteworthy comparisons, and have included more panels for our comparative RNA-seq datasets (heatmap, PCA, MA plots). We have also made adjustments to our plots to make individual data points more readily observed, especially when error bars may block them (e.g. Figure 3B). And as in the original submission, all of the pertinent values, including fold changes, statistics and more are provided in our comprehensive supplementary files. We have structured the Supplementary Tables to flow from one tab to the next with the filtering/threshold applied noted both in the tab name and in the README tab that is found first among the tabs.

      **Minor comments:**

      • Specific experimental issues that are easily addressable.

        One idea that is not discussed but could be added is, for example, that NOT1-G doesn't have a stabilizing effect itself, but acts as a decoy for other components of the CCR4/Caf1 complex, keeping them from associating with NOT1. In the NOT1-G knockout, the decrease in RNA abundance might then be just a result of an 'overactivity' of CCR4/Caf1/NOT1.

      Response: This hypothesis proposed by Reviewer 2, that PyNOT1-G is acting as a decoy or a binding partner sponge, is certainly feasible. For this scenario to be effective, PyNOT1-G would need to be in excess of PyNOT1 and/or would need to be able to bind to the critical effector protein(s) better than does PyNOT1. However, our microscopy data, along with the transcriptomic data presented here and previously published proteomic data would indicate that these two gene products are in approximately balanced proportions and are similarly localized. This does not exclude the possibility that PyNOT1-G could act as a sponge for relevant binding partners. In our revised manuscript, we have raised this possibility as an alternate explanation for the phenotype in the Discussion section.

      • Are prior studies referenced appropriately?

        Throughout the manuscript, the authors should make clear what results come from which organism. Just as an example, the genome wide KO screens were performed in P. berghei and P. falciparum, CCR4/CAF1 experiments were performed in P. yoelii, whereas the original DDX6 work was done in P. berghei.

      Response: We agree. In our revised manuscript, we have added additional text to further clarify what data comes from which Plasmodium species.

      • Are the text and figures clear and accurate?

        The Introduction is a bit long and partially turns into a minireview of eukaryotic RNA degradation. In the main text on page 13, the authors introduce a model for proteins involved in translational repression. This is not fully accurate, since for many of the proteins in this network, an effect on translation has actually not been shown. This includes NOT1-G characterized in the present work, which most likely has an effect on mRNA stability, but for which a role in regulating translation is not presented.

      Response: We believe the length and content of this Introduction is appropriate to provide the context that some readers outside of the parasitology field will need to appreciate these findings. Regarding designations for these proteins as being related to translational repression, we think that the ample proteomic evidence tying them to translationally repressive complexes warrants this. In our revised manuscript, we have made it clearer that these proteins themselves have not been directly implicated in translational repression.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

        Overall, the RNA-seq is underrepresented, and Figure 5 could easily be expanded by adding several panels that would help the future reader get a better idea of the data:

      1. Summary graphs such as PCA/MDS plots of the different replicates and MA-plots (all of which can be easily generated in DESeq2)
      2. Heatmaps comparing the expression patterns of pynot1-g-, pbdozi-, pbcith-, pyalba4- highlighting some key gametocyte genes mentioned in the text
      3. Alternatively to 2., a simple Venn Diagram would already be very informative

        An informative representation might also be to sort the differentially expressed genes as predominant male and/or female. The P. berghei data by Yeoh et al (PMID: 28923023) could be a starting point.

      Response: We agree. In our revised manuscript, we have expanded Figure 5 to include additional plots that speak to the rigor of these datasets. Specifically, we have added a comprehensive heatmap and PCA plots, as well as MA plots as recommended. We have chosen not to include a Venn diagram for the overlap of affected mRNAs across these transgenic parasite lines, as we hold that this information is best provided in the text (high level observations) and the Supplement (details).
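
      For readers unfamiliar with MA plots, the quantities plotted for each gene can be sketched as follows. This is a minimal illustration of the classic definition (M = log2 fold change, A = mean log2 abundance), not the authors' pipeline; DESeq2's own MA plot places the mean of normalized counts on the x-axis instead. The function name, pseudocount, and count values are hypothetical.

```python
import math


def ma_point(wt_mean_count, ko_mean_count, pseudocount=1.0):
    """Return (A, M) for one gene from mean normalised counts.

    M is the log2 ratio of knockout to wild type (negative when the
    transcript is less abundant in the knockout); A is the average of
    the two log2 abundances. A pseudocount avoids log2(0).
    """
    log_wt = math.log2(wt_mean_count + pseudocount)
    log_ko = math.log2(ko_mean_count + pseudocount)
    m_value = log_ko - log_wt
    a_value = 0.5 * (log_ko + log_wt)
    return a_value, m_value


# Hypothetical gene: 7 normalised counts in wild type, 3 in the knockout
example_a, example_m = ma_point(wt_mean_count=7.0, ko_mean_count=3.0)
```

      Plotting M against A for every gene, with significantly changed genes highlighted, gives the overview of direction and magnitude that the reviewers requested.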

      Reviewer #2 (Significance (Required)):

      **Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.**

      Technically, this manuscript builds on standard methods of the field that are well executed. There is no direct clinical advancement, although one might argue that a unique adaptation of the parasite could always be a novel therapeutic target. Conceptually, this is a great advancement for the parasitology field as it is, providing additional evidence for the importance of post-transcriptional regulation for parasite transmission. With the two experiments suggested above and the additional evidence gained from them, this manuscript could also be of great interest to readers outside the field by clearly showing how alternative ways to regulate RNA stability evolved.

      Response: We are grateful for your careful review of our work and for the recommendations that you provided. We have incorporated many of them into the revised manuscript to make it even more rigorous and comprehensive. We also appreciate hearing that this work would be of great interest to a broader community. We feel that this is already the case, as the duplication of NOT1 and the dedication of one paralogue to an essential function is exciting and novel among eukaryotes.

      **Place the work in the context of the existing literature (provide references, where appropriate)**

      The work builds on the early reports of the particular RNA metabolism in gametocytes performed in the groups of Andy Waters. Since then, the authors themselves have published a great set of manuscripts extending our knowledge of the proteins involved in gametocytogenesis and nicely place the current work into this framework.

      Response: We appreciate this positive feedback. This is a fascinating topic to study.

      **State what audience might be interested in and influenced by the reported findings.**

      The manuscript as it stands is particularly interesting for the parasitology and potentially the evolutionary biology field. For a broader readership for example in the RNA field, the possibly antagonistic roles and mutually exclusive association with CAF1/CCR4 are likely most interesting.

      Response: We agree that this should be interesting to readers beyond our own field, as the duplication and specialization of NOT1, and the finding that the “canonical” PyNOT1 can be deleted, are both of general interest to how eukaryotes have adapted and deployed a highly conserved and essential RNA metabolic complex.

      **Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.**

      **Expertise:**

      RNA biology, Plasmodium falciparum, Bioinformatics

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors investigate the requirement of two possible Not1 paralogs for the development of asexual blood stages and for the sexual transmission stages of Plasmodium yoelii. While Not1 is critical for asexual blood stages, its putative paralog, Not1G, is important for the development of sexual transmission stages. In the absence of Not1G, male gametes are not formed, while female gametes are formed and can be fertilised by WT male gametes. However, the resulting zygote cannot develop further into an ookinete. The in vitro genetic cross assay to show this is elegant! A transcriptomic analysis further indicates that the transcriptomes of Not1G-deficient parasites are significantly different from those of their WT counterparts.

      Response: We are thrilled that you found our evidence and approaches to be rigorous and compelling. Thank you.

      **Major comments:**

      The discussion section is very nice and the authors describe well what is speculative and should be further confirmed by additional experiments. However, I did find this was not the case in the results section where the authors are proposing conclusions that are not supported by the results. I think the reading of this manuscript would be much more enjoyable if the authors only describe the results shown and move all the discussions to the dedicated section. Below are some examples. The data presented in this manuscript is not showing a nexus, this is a suggestion based on the results of other articles, the word should thus be removed from the title (and kept for a future review!). The last two sentences of the localisation section should be moved to the discussion because they do refer to results not shown in this manuscript. The last sentence of the second paragraph of the zygote development section should also be moved to the discussion. For the transcriptomic analysis there is also no formal comparison with transcriptomes of other previously analysed mutants: the results of the comparisons should either be shown or not discussed in the result section. Finally, the discussions mentioning interactors of the complex should be removed from the result section and moved to the discussion unless the results are formally analysed.

      Response: We again thank you for the compliment. In our original manuscript, we opted to provide some limited interpretations and context within the Results section in order to help guide readers along our train-of-thought and line-of-experimentation. While a more traditional split of keeping essentially all discussion and interpretation for the Discussion is a tried-and-true approach, we prefer this more narrative method and have opted to keep these short sections in the Results section.

      I would strongly suggest that the authors better present and describe their transcriptomic results. There is only one volcano plot indicating the overall defect in mixed gametocytes in the main figure. Apart from this, the results are only described in the main text or in supplementary tables. It is therefore difficult to understand the subtleties of the analysis. For example, the authors frequently mention dysregulated genes, but without specifying whether they are up- or down-regulated in the mutant. To address this issue, I would suggest that the authors better describe their results in the figures. They could show the GO term enrichment analysis they mention and show how they assign GO terms or transcripts to male and female parasites. It would also be nice to discuss some of the results in a bit more detail. For example, it is not surprising to see a reduction in transcripts that are under the control of AP2-O in retort-arrested ookinetes, as the parasites do not reach this stage. It is thus highly speculative to specifically link this observation with ALBA4 without further detailed analysis. On the other hand, it is more surprising to see a decrease in ap2g transcripts, while the authors observe an increased gametocytaemia. Could the authors comment on this observation? It may also be nice to better present the comparison between gametocytes and schizonts to possibly speculate on the early requirement of Not1G in committed schizonts.

      **Response**: We (and Reviewer 2) agree. In our revised manuscript, we have expanded Figure 5 to include additional plots that speak to the rigor of these datasets. Specifically, we have added a heatmap, and PCA and MA plots as recommended. We have chosen not to include a Venn diagram for the overlap of affected mRNAs across these transgenic parasite lines for the reasons stated above in our response to Reviewer 2. Similarly, we have opted to keep the specifics of the GO Term analyses in the Supplement, as we believe these should always be taken with a grain of salt (especially high-level GO Terms, as many choose to report). Finally, we have expanded our discussion of our observation that pyapiap2-g transcript levels are lower in the pynot1-g- line, despite seeing a slight increase in gametocytemia.

      The conclusion regarding the similar localisation of Not1 and Not1G with other members of the CAF1/CCR4/NOT complex is not really convincing for two reasons. First, there is no colocalization shown and, second, the distribution is not very peculiar, so it is difficult to draw any conclusion with this level of resolution. The presence of alpha-tubulin in the nucleus of male gametocytes is also very surprising, as it is rather nucleus-excluded in both P. falciparum and P. berghei. Could the authors comment on this peculiar localisation?

      **Response**: We agree and disagree here. First, we agree that no colocalization data are presented here to place NOT1-G within the limit of resolution of fluorescence microscopy. What we can (and do) state is that these proteins are all localized to cytosolic puncta, which matches what is observed for essentially all other studied eukaryotes. In further support of this, our published, quantitative proteomic data indicate that the bioinformatically predictable members of the CAF1/CCR4/NOT complex do associate as anticipated. In the same vein, the micrographs presented were not captured by confocal microscopy, and thus the apparent localization of alpha tubulin “in” the nucleus is most likely attributable to signal from above and/or below the nucleus. Taken together, we do feel that the combined evidence is convincing. As we have already made all of these points in the original manuscript, we have not adjusted the revised manuscript further.

      One of my major frustrations when reading this manuscript was that the authors do not try to discriminate between an early role of Not1G during gametocytogenesis and a later role in gametogenesis. The fact that the transcriptomes of gametocytes and schizonts seem to show similarities suggests that the phenotypes observed during both male gametogenesis and ookinete development are probably linked to early knock-on defects during gametocytogenesis. Could the authors test whether male gametocytes replicate DNA or whether females activate translation? These are of course non-essential experiments, as the authors are careful with their conclusions and mention possible defects during both gametocytogenesis and gametogenesis. Addressing this question may however add significant insights into the requirement for Not1G.

      **Response**: We are sorry for the frustration. We wrote the manuscript so as to state what we feel we could robustly say, and where we are drawn to speculate, we made that speculation clear. As Reviewer 3 notes, we have not attempted to discriminate between functions that PyNOT1-G may be playing in different stages or substages of development because we do not believe the experiments allow that discrimination. While we could investigate finer and finer aspects of possible defects in both male and female gametocyte development, the most impactful take-home messages remain the same. We continue to address questions related to translational repression and its release, and anticipate that PyNOT1-G will play a substantial and essential role in this. As Reviewer 3 noted, we have already discussed these possibilities in the original manuscript, and thus have not added anything further about this in our revised manuscript.

      **Minor comments:**

      Please use page and line numbering for your next submissions! Please describe what "bioinformatics" was used. I would show the nice localisation in oocyst and sporozoite in the main section. The conclusions drawn from the genetic cross seem to come from a single biological replicate; if this is the case, please indicate it clearly.

      **Response**: We apologize for these oversights. In our revised manuscript, we have provided page and line numbering, have expanded on what bioinformatic processes were used in the manuscript, and have made it clearer that the genetic crosses come from multiple biological replicates (biological triplicate for the transmission-based genetic cross, biological duplicate for the in vitro culture genetic cross). However, we have opted to retain the oocyst and sporozoite IFA data in the Supplement, as the rest of the story is focused on blood stage and early mosquito stage.

      Reviewer #3 (Significance (Required)):

      This manuscript highlights the requirement of a Not1 paralog in the transmission stages of a Plasmodium parasite. More specifically it describes a new player in the control of RNA biology during this process where our knowledge is scarce. It will be a valuable manuscript for molecular parasitologists interested in transmission or RNA biology.

      **Response**: We agree and are grateful that our colleagues find this study to be a valuable addition in our efforts to understand how malaria parasites have adapted classic eukaryotic mechanisms to suit their purposes.

      Our expertise is largely in molecular and cellular parasitology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, the authors investigate the requirement of two possible Not1 paralogs for the development of asexual blood stages and for the sexual transmission stages of Plasmodium yoelii. While Not1 is critical for asexual blood stages, its putative paralog, Not1G, is important for the development of sexual transmission stages. In the absence of Not1G, male gametes are not formed, while female gametes are formed and can be fertilised by WT male gametes. However, the resulting zygote cannot develop further into an ookinete. The in vitro genetic cross assay to show this is elegant! A transcriptomic analysis further indicates that the transcriptomes of Not1G-deficient parasites are significantly different from those of their WT counterparts.

      Major comments:

      The discussion section is very nice and the authors describe well what is speculative and should be further confirmed by additional experiments. However, I did find this was not the case in the results section, where the authors propose conclusions that are not supported by the results. I think reading this manuscript would be much more enjoyable if the authors only described the results shown and moved all the discussion to the dedicated section. Below are some examples. The data presented in this manuscript do not show a nexus; this is a suggestion based on the results of other articles, and the word should thus be removed from the title (and kept for a future review!). The last two sentences of the localisation section should be moved to the discussion because they refer to results not shown in this manuscript. The last sentence of the second paragraph of the zygote development section should also be moved to the discussion. For the transcriptomic analysis, there is also no formal comparison with the transcriptomes of other previously analysed mutants: the results of the comparisons should either be shown or not discussed in the results section. Finally, the passages mentioning interactors of the complex should be removed from the results section and moved to the discussion unless the results are formally analysed.

      I would strongly suggest that the authors better present and describe their transcriptomic results. There is only one volcano plot indicating the overall defect in mixed gametocytes in the main figure. Apart from this, the results are only described in the main text or in supplementary tables. It is therefore difficult to understand the subtleties of the analysis. For example, the authors frequently mention dysregulated genes, but without specifying whether they are up- or down-regulated in the mutant. To address this issue, I would suggest that the authors better describe their results in the figures. They could show the GO term enrichment analysis they mention and show how they assign GO terms or transcripts to male and female parasites. It would also be nice to discuss some of the results in a bit more detail. For example, it is not surprising to see a reduction in transcripts that are under the control of AP2-O in retort-arrested ookinetes, as the parasites do not reach this stage. It is thus highly speculative to specifically link this observation with ALBA4 without further detailed analysis. On the other hand, it is more surprising to see a decrease in ap2g transcripts while the authors observe an increased gametocytaemia. Could the authors comment on this observation? It may also be nice to better present the comparison between gametocytes and schizonts to possibly speculate on the early requirement of Not1G in committed schizonts.

      The conclusion regarding the similar localisation of Not1 and Not1G with other members of the CAF1/CCR4/NOT complex is not really convincing for two reasons. First, there is no colocalization shown and, second, the distribution is not very peculiar, so it is difficult to draw any conclusion with this level of resolution. The presence of alpha-tubulin in the nucleus of male gametocytes is also very surprising, as it is rather nucleus-excluded in both P. falciparum and P. berghei. Could the authors comment on this peculiar localisation?

      One of my major frustrations when reading this manuscript was that the authors do not try to discriminate between an early role of Not1G during gametocytogenesis and a later role in gametogenesis. The fact that the transcriptomes of gametocytes and schizonts seem to show similarities suggests that the phenotypes observed during both male gametogenesis and ookinete development are probably linked to early knock-on defects during gametocytogenesis. Could the authors test whether male gametocytes replicate DNA or whether females activate translation? These are of course non-essential experiments, as the authors are careful with their conclusions and mention possible defects during both gametocytogenesis and gametogenesis. Addressing this question may however add significant insights into the requirement for Not1G.

      Minor comments:

      Please use page and line numbering for your next submissions! Please describe what "bioinformatics" was used. I would show the nice localisation in oocyst and sporozoite in the main section. The conclusions drawn from the genetic cross seem to come from a single biological replicate; if this is the case, please indicate it clearly.

      Significance

      This manuscript highlights the requirement of a Not1 paralog in the transmission stages of a Plasmodium parasite. More specifically it describes a new player in the control of RNA biology during this process where our knowledge is scarce. It will be a valuable manuscript for molecular parasitologists interested in transmission or RNA biology.

      Our expertise is largely in molecular and cellular parasitology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The manuscript by Hart et al. builds upon a fascinating finding presented in a previous manuscript by the same authors, in which they show that CCR4 seems to be able to associate with two members of the NOT1 family. In this work, the authors first re-annotate the two NOT1 paralogs in Plasmodium yoelii and then perform an in-depth characterization of the role of NOT1-G during gametocytogenesis and early mosquito development. Using gene knockout and different genetic crosses, the authors show that NOT1-G is essential for male gametocyte development and that its loss leads to an arrest of development in zygotes arising from female gametocytes. Using RNA-seq, the authors show that deletion of NOT1-G leads to lower transcript abundances, leading to the hypothesis that NOT1-G might be involved in preserving mRNAs in a larger RNA-binding complex. Lastly, the authors characterize a NOT1-G defining TPP domain and find that it is not essential for either the male or the female phenotype observed for the whole-gene KO.

      Major comments:

      • Are the key conclusions convincing?

      The phenotypic characterization of NOT1-G during gametocytogenesis / early mosquito development is nicely presented and the experiments are well performed. Because a duplication of NOT1 with possibly opposing roles of the paralogs is a very unique feature with broad implications for RNA metabolism, it would have been great to see two select experiments on the molecular level adding evidence that 1) NOT1/NOT1-G are mutually exclusive in a complex with CCR4/CAF1 and 2) NOT1-G acts post-transcriptionally in an antagonistic way to NOT1 (i.e. as an mRNA 'stabilizer' as proposed by the authors).

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The authors describe the role of NOT1-G as 'preserving' mRNA. The lower abundance of many transcripts in the NOT1-G knockout suggests this, but experimental proof is not provided (see suggestions below). Maybe rephrase to 'putatively preserved/stabilized' or 'has a potentially stabilizing function'. The same is true for the mutually exclusive association of the two paralogs with CCR4/CAF1. The authors refer to a protein co-IP of CCR4 showing that CCR4 can interact with both NOT1 and NOT1-G, but a reciprocal experiment is lacking.

      In both cases, the conclusions of the authors are very likely (e.g. downregulation of many genes as seen by RNA-seq), but the final experimental evidence is not provided and a network such as in Figure 7 is not fully supported. If the authors would like to maintain these statements, then they should be rephrased and made clear or the additional experimental evidence suggested below is necessary.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      The essential claim that NOT1-G is important for gametocytogenesis and early mosquito development is well presented and fully supported by the experiments. As for the role of NOT1-G in 'preserving' mRNA, an mRNA half-life experiment would be necessary (or the text should be adjusted as mentioned above). In a short-term in vitro culture, pynot1-g- and WT parasites could be treated with ActD and the abundances of select transcripts measured by RT-qPCR.
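
      As a minimal sketch of how data from such an ActD chase might be analysed, the snippet below fits a first-order decay to relative transcript abundances and reports a half-life. It assumes simple exponential decay after transcriptional shut-off; the function name, time points, and abundance values are hypothetical and are not taken from the manuscript.

```python
import math


def estimate_half_life(times, abundances):
    """Estimate an mRNA half-life from a transcription shut-off time course.

    Assumes first-order decay, A(t) = A0 * exp(-k * t), and fits
    ln(A) against t by ordinary least squares. Returns ln(2) / k in
    the same units as `times`.
    """
    if len(times) != len(abundances) or len(times) < 2:
        raise ValueError("need at least two matched time points")
    if any(a <= 0 for a in abundances):
        raise ValueError("abundances must be positive")

    logs = [math.log(a) for a in abundances]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))

    decay_rate = -sxy / sxx  # k is the negative of the fitted slope
    if decay_rate <= 0:
        raise ValueError("no decay detected in this time course")
    return math.log(2) / decay_rate


# Hypothetical relative abundances (normalised to t = 0), in minutes after ActD.
time_points = [0, 15, 30, 60]
wt_abundance = [2 ** (-t / 30) for t in time_points]       # synthetic, 30 min half-life
mutant_abundance = [2 ** (-t / 10) for t in time_points]   # synthetic, 10 min half-life

wt_half_life = estimate_half_life(time_points, wt_abundance)
mutant_half_life = estimate_half_life(time_points, mutant_abundance)
```

      Under these assumptions, a shorter fitted half-life for a given transcript in the knockout than in WT would be consistent with a stabilizing role, whereas similar half-lives would argue against it.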

      To support the idea that NOT1 and NOT1-G associate in a mutually exclusive way, or to just show that they act in distinct complexes despite their similar expression patterns, an IFA with a double-stained NOT1/NOT1-G cell line could be performed. Alternatively, the authors could perform a protein co-IP using the already existing NOT1/NOT1-G-GFP cell line and show that the proteins don't interact with each other or even have certain distinct interaction partners.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      All necessary cell lines for a NOT1/NOT1-G co-IP and the ActD experiment are already present. The authors already present a ring to schizont in vitro culture (for ActD) and also have substantial experience in protein co-IP and proteomics.

      I am not sure about the cost for a proteomics experiment at the authors' institute and I don't want to make a guess on time investment given the ongoing COVID situation.

      • Are the data and the methods presented in such a way that they can be reproduced?

      The MM section is well structured and presented and the supplemental material includes all data.

      • Are the experiments adequately replicated and statistical analysis adequate?

      There is hardly any test of significance presented in the main text of the manuscript (e.g. Figure 3B and 4A). Please show the individual data points for these graphs and make sure the n= and the statistical test are described in the figure legend. If you use the term significant in the text, then just add the p-value behind it. This is also true for the RNA-seq data: genes are sorted by fold change, leaving it unclear whether these changes are significant. These data are however presented in Table S1 and could be incorporated in the main text.

      Minor comments:

      • Specific experimental issues that are easily addressable.

      One idea that is not discussed but could be added is, for example, that NOT1-G does not have a stabilizing effect itself, but acts as a decoy for other components of the CCR4/CAF1 complex, keeping them from associating with NOT1. In the NOT1-G knockout, the decrease in RNA abundance might then be just a result of an 'overactivity' of CCR4/CAF1/NOT1.

      • Are prior studies referenced appropriately?

      Throughout the manuscript, the authors should make clear what results come from which organism. Just as an example, the genome wide KO screens were performed in P. berghei and P. falciparum, CCR4/CAF1 experiments were performed in P. yoelii, whereas the original DDX6 work was done in P. berghei.

      • Are the text and figures clear and accurate?

      The Introduction is a bit long and partially turns into a minireview of eukaryotic RNA degradation. In the main text on page 13, the authors introduce a model for proteins involved in translational repression. This is not fully accurate, since for many of the proteins in this network, an effect on translation has actually not been shown. This includes NOT1-G, characterized in the present work, which most likely has an effect on mRNA stability, but for which a role in regulating translation is not presented.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Overall, the RNA-seq is underrepresented, and Figure 5 could easily be expanded by adding several panels that would help the future reader get a better idea of the data:

      1. Summary graphs such as PCA/MDS plots of the different replicates and MA-plots (all of which can be easily generated in DESeq2)
      2. Heatmaps comparing the expression patterns of pynot1-g-, pbdozi-, pbcith-, pyalba4- highlighting some key gametocyte genes mentioned in the text
      3. Alternatively to 2., a simple Venn Diagram would already be very informative
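
      To illustrate what an MA-plot conveys, and how the direction of change per gene could be reported, here is a minimal sketch using the classical definitions M = log2(mutant) - log2(WT) and A = the mean of the two log2 values. This is not DESeq2's implementation; the function names, pseudocount, threshold, and counts are hypothetical.

```python
import math


def ma_values(wt_mean, mutant_mean, pseudocount=1.0):
    """Return (M, A) for one gene from mean normalised counts.

    M is the log2 fold change (mutant over WT); A is the average log2
    abundance. The pseudocount avoids log2(0) and is an illustrative choice.
    """
    log_wt = math.log2(wt_mean + pseudocount)
    log_mut = math.log2(mutant_mean + pseudocount)
    return log_mut - log_wt, 0.5 * (log_mut + log_wt)


def direction(m_value, threshold=1.0):
    """Label a gene as up, down, or unchanged in the mutant by its M value."""
    if m_value >= threshold:
        return "up"
    if m_value <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical mean normalised counts: gene -> (WT, mutant).
example_counts = {
    "gene_a": (7.0, 1.0),   # lower in the mutant
    "gene_b": (3.0, 15.0),  # higher in the mutant
    "gene_c": (7.0, 7.0),   # no change
}

summary = {}
for gene, (wt, mutant) in example_counts.items():
    m, a = ma_values(wt, mutant)
    summary[gene] = (m, a, direction(m))
```

      A fold-change threshold alone says nothing about significance; in practice the adjusted p-values from the differential expression analysis would be reported alongside the direction.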

      An informative representation might also be to sort the differentially expressed genes as predominant male and/or female. The P. berghei data by Yeoh et al (PMID: 28923023) could be a starting point.

      Significance

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Technically this manuscript builds on standard methods of the field that are well executed. There is no direct clinical advancement, although one might argue that a unique adaptation of the parasite could always be a novel therapeutic target. Conceptually this is a great advancement for the parasitology field as it is, providing additional evidence for the importance of post-transcriptional regulation for parasite transmission. With the two experiments suggested above and the additional evidence gained from them, this manuscript could also be of great interest to readers outside the field by clearly showing how alternative ways to regulate RNA stability evolved.

      Place the work in the context of the existing literature (provide references, where appropriate)

      The work builds on the early reports of the particular RNA metabolism in gametocytes performed in the groups of Andy Waters. Since then, the authors themselves have published a great set of manuscripts extending our knowledge of the proteins involved in gametocytogenesis and nicely place the current work into this framework.

      State what audience might be interested in and influenced by the reported findings.

      The manuscript as it stands is particularly interesting for the parasitology and potentially the evolutionary biology field. For a broader readership for example in the RNA field, the possibly antagonistic roles and mutually exclusive association with CAF1/CCR4 are likely most interesting.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Expertise:

      RNA biology, Plasmodium falciparum, Bioinformatics

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript "The Plasmodium NOT1-G paralogue acts as an essential nexus for sexual stage maturation and parasite transmission" investigates the two forms of NOT1 in rodent malaria parasites. The authors found that the original NOT1 is crucial for gametocyte induction as well as transmission to the mosquito; they therefore renamed it NOT1-G. The paralogous protein, on the other hand, appears to be crucial for intraerythrocytic growth, since it cannot be knocked out. The authors then investigated NOT1-G in more detail, using standard phenotyping assays. They found a slightly increased gametocytemia and a minor effect on transmission to the mosquito.

      Significance

      If the authors are able to provide convincing data that NOT1-G is indeed important for gametocyte induction and transmission to the mosquito, then the report would be of high significance for the malaria and molecular cell biology fields.

      My expertise: molecular cell biology of gametocytes, translational regulation, parasite transmission

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We thank all four reviewers for their positive and constructive comments! We have carefully considered these comments and provided a point-by-point response below.

      Reviewer #1 (Evidence, reproducibility and clarity):

      This paper explores an interesting problem of SHP1/SHP2 preferences of inhibitory immunoreceptors. The authors are quick to point out that many of their individual data points confirm published results at some level, but the power of the paper is in the parallel analysis of both PD1, which is strongly biased towards SHP2, and BTLA, which is biased towards SHP1. This gives them the opportunity to test the predictions of descriptive experiments by making simple mutated receptors with swapped ITIM or ITSM domains.

      The work is very well done, and the authors are generally quite careful and precise about the language used to describe results.

      The results are quite striking in that they find plenty of evidence for transient interaction of SHP1 with PD1 based on the biophysical measurements, but don't detect the interactions in pull-down or in "in cell" microcluster recruitment experiments. In describing the pull-downs they discuss the issue of dissociation during washing potentially missing interactions that are taking place. I would prefer the framing that a pull-down is fine evidence for binding, but lack of pull-down is not evidence for lack of binding. They should double-check that this language is consistent. Also, unless something has changed in the microcluster binding experiments, this in situ recruitment of SHP2 to PD1 is only observed for 2-3 minutes and then can't be detected, the situation for SHP2 becoming the same as it is for SHP1. If the kinetics are different in the cleaner systems that have now been developed, they should show this in a primary figure, as this would then be different from what was reported previously.

      We agree with the reviewer that pull-down is evidence for binding. Indeed, in most, if not all, of our assays, our results with pull-down were consistent with those in the microcluster imaging. As suggested by the reviewer, we will check through the manuscript and ensure the language is accurate and consistent. In our recent study (Xu et al., JCB, 2020, PMID: 32437509), we conducted a side-by-side comparison of SHP2 and SHP1 recruitment kinetics to PD-1 in a similar system as the current study. Both microcluster imaging and co-IP assays showed that PD-1:SHP2 association lasted at least 10 minutes, whereas PD-1:SHP1 recruitment was nearly undetectable. The duration of PD-1:SHP2 association was in good agreement with Takashi Saito’s finding in CD4+ mouse T cells (Yokosuka et al., JEM, 2012, PMID: 22641383). Regardless of the somewhat different kinetics in different studies, SHP2 recruitment was transient, as pointed out by the reviewer. We believe that some other effectors contribute to PD-1 inhibitory signaling. In support of this notion, we recently found that PD-1 remains partially inhibitory in CD8+ T cells deficient in both SHP1 and SHP2 (Xu et al., JCB, 2020).

      The gap in this study is the lack of any functional analysis. The Jurkat model could be quite useful, as they have a relatively clean system for asking if the transient binding of SHP1 to PD1 has any functional impact, which they have not yet followed through on. Does PD-1-recruited SHP2 have any impact on function after the 5 minutes? Furthermore, the authors need to keep in mind that mice deficient in SHP2 respond to anti-PD1 checkpoint therapies (Rota, G., Niogret, C., Dang, A. T., Barros, C. R., Fonta, N. P., Alfei, F., Morgado, L., Zehn, D., Birchmeier, W., Vivier, E., & Guarda, G. (2018). Shp-2 Is Dispensable for Establishing T Cell Exhaustion and for PD-1 Signaling In Vivo. Cell Rep, 23(1), 39-49. https://doi.org/10.1016/j.celrep.2018.03.026). This is an important issue to discuss in light of the very interesting binding analysis the authors have performed. But I think the functional analysis can be part of a future paper.

      In our recent publication (Xu et al., JCB, 2020, PMID: 32437509), we found that deletion of SHP1 from Jurkat cells had little, if any, effect on PD-1-mediated suppression of IL-2 production. As the reviewer alluded to, we did observe SHP2 dissociation from PD-1 after 10 minutes, so the question of whether and how the PD-1:SHP2 complex influences T cell function over the longer term is a great one. We are currently pursuing a hypothesis that there is a SHP2-independent mechanism of PD-1 inhibitory function, and indeed, in our recent study (Xu et al., JCB, 2020, PMID: 32437509), we found that PD-1 retains its partial inhibitory function in SHP1/SHP2 double knockout murine primary T cells. These results are consistent with the in vivo data by Rota et al. cited by the reviewer. We will also briefly discuss this point in a revised manuscript.

      I would suggest that the title be modified slightly from "SHP1/SHP2 discrimination" to "differential SHP1/SHP2 interaction" and that discussion of discrimination be left until they have the functional data integrated over times that are relevant to T cell transcriptional regulation (1-2 hrs). The functional analysis can be in another paper, but it would be interesting to have a paragraph in the discussion raising the outstanding issues beyond stable binding detected by the pull-down and microcluster recruitment experiments: what are the implications for function? Could the transient interactions in the noise of the steady state and equilibrium measurements be functional?

      We thank the reviewer for the suggestion, even though reviewer #3 felt that our current title is appropriate. We will be happy to change the title at the editors’ discretion.

      I would summarise that the work is outstanding as biochemistry and biophysics and it should be published nearly as is. I'm suggesting minor revisions in that the changes are just to text, but I think this is an important and somewhat nuanced aspect of the paper that will make it even more helpful to readers.

      We appreciate the positive and insightful comments!

      Reviewer #1 (Significance):

      The authors generate a detailed descriptive data set about the component interaction of SHP1 and SHP2 SH2 domains with PD1 and BTLA intracellular domains. They then test hypotheses generated from the descriptive data set to better define the nature of the interactions and why PD1 recruits primarily SHP2, while BTLA mainly recruits SHP1. PD1 is a major driver of the cancer immunotherapy revolution and SHP2 is the major candidate for a signalling effector of PD1. This paper can become the reference paper for the specificity and engineering of this interaction, which will make it highly significant in a very active and still expanding field.

      Referee Cross-commenting

      I still feel that "discrimination" has a functional/activity connotation that is not addressed at all in this paper, but can be addressed. I'm happy to have the suggestion stand and let the authors decide. They need to live with it once it's published. Another suggestion: the citations on regulation are mostly old. A good recent paper is Pádua, R. A. P., Sun, Y., Marko, I., Pitsawong, W., Stiller, J. B., Otten, R., & Kern, D. (2018). Mechanism of activating mutations and allosteric drug inhibition of the phosphatase SHP2. Nature Communications, 9(1). https://doi.org/10.1038/s41467-018-06814-w.

      We believe that some of the functional questions raised by this reviewer, including the SHP1 and SHP2 contribution to PD-1 signaling, were addressed in our recent publication (Xu et al., JCB, 2020). Using SHP1 KO and SHP2 KO T cells, we showed that PD-1 inhibitory function is contributed by SHP2, but very little, if any, by SHP1. Thus in the current study, we focus on the mechanism behind the striking SHP2 preference by PD-1. We thank this reviewer for suggesting this excellent reference. We will cite this reference in the revised manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity):

      In this study, Xu and co-workers investigate the biophysical nature of the interaction between the structurally related non-transmembrane PTPs Shp1 and Shp2 and the ITIM/ITSM-containing inhibitory receptors PD-1 and BTLA, using cell-based, biochemical, biophysical and domain-swapping assays. The primary aim is to better understand how these receptors discriminate between binding Shp1 and/or Shp2, and the orientation of Shp1 and Shp2 engagement. These are major unresolved questions in the field that the authors go some way to addressing in a methodical, rigorous, clear and concise manner. Findings are convincing, correlate well with previous findings and internally, and are complemented with excellent schematics, making them easy to comprehend.

      Major comments

      The authors focus primarily on binding affinities to explain differential binding of Shp1 and Shp2 by PD-1 and BTLA ITIMs and ITSMs, but this is only part of the story. Avidity, compartmentalization, stoichiometry of kinases, and relative abundance of Shp1 and Shp2 are also important aspects of the discriminatory mechanism that are not addressed. Competition assays would go some way to addressing the latter point and should be at least be considered and discussed.

      We agree that various parameters mentioned by this reviewer, such as compartmentalization and relative expression levels, would be a concern for purely cell-free assays such as SPR. However, we feel that our cell-based assays already integrate these parameters. This is precisely the reason why we chose to examine the recruitment of Shp1/2 in a cellular context instead of a purely cell-free system.

      Regarding the competition, we have confirmed our key results in both WT and SHP2 KO background, with or without the potential competition from endogenous SHP2, suggesting that competition might not be a dominant mechanism for the recruitment specificity we observed.

      Similarly, authors do not address how distortion of the pY binding pocket of Shp1 and Shp2 nSH2 domains in the auto-inhibited conformation is released, allowing the domain to engage with phospho-ITIM/ITSM. Again, this should be at least discussed. Current binding studies do not address this issue.

      We feel that the overall recruitment to the PD-1 microclusters, as we observed in cells, already integrates this auto-inhibition mechanism of Shp1 and Shp2, because we used full-length proteins. We do agree with the reviewer that future studies are warranted to address the contributions of each mechanism, including auto-inhibition, concentration, competition, etc., to the overall recruitment. This might require careful and extensive biophysical analyses coupled with mathematical modeling.

      Minor comments:

      Phosphorylation should be indicated in schematic representations in Figures 3, 6 b, c.

      We thank the reviewer for this advice; we will indicate phosphorylation in the revised Figure 3.

      Cellular and physiological significance should be further discussed, as well as broader implications of findings to other ITIM/ITSM-containing receptors in other lineages.

      We will further discuss this as suggested.

      Reviewer #2 (Significance)

      Findings from this study advance our knowledge of how inhibitory checkpoint regulatory receptors discriminate between Shp1 and Shp2, which has important implications for understanding how the unique biochemical, cellular and physiological functions of these receptors and phosphatases are dictated. Indeed, findings lay the foundation for a universal mechanism, that may apply to all ITIM/ITSM receptors in other cell lineages, and perhaps novel ways of targeting these interactions therapeutically.

      Compare to existing published knowledge

      Although largely correlative with previous studies, findings from this study start to fill major gaps in our knowledge of these biochemical processes, in a highly rigorous, concise and clear manner. Findings from previous studies were more 'piecemeal', whereas this study consolidates and advances important nuances of these interactions. Moreover, it lays the foundation for further structural, physiological and therapeutic studies.

      Audience

      The immune receptor signaling community and beyond, including any lineage in which ITIM/ITSM-containing receptors play a major role in regulating cellular responses.

      Your expertise

      ITIM/ITSM-containing receptors, kinase-phosphatase molecular switches, cellular reactivity to extracellular matrix proteins

      Referee Cross-commenting

      Generally agree with reviewer's comments. Constructive overall and fair. Although I was thinking additional competition experiments, I do not think necessary. Over the top for this study. Hence, 1 month should suffice to revise accordingly.

      We thank this reviewer for the excellent comments and understanding!

      Reviewer #3 (Evidence, reproducibility and clarity):

      Summary:

      Inhibitory immune receptors containing ITIMs function through recruiting the phosphatases SHP-1 and SHP-2. SHP-1 and SHP-2 are remarkably similar yet have different roles in vivo. How can ITIM-containing immune receptors specifically recruit SHP-1 or SHP-2? In this paper, Xu et al ask how SHP-1 vs SHP-2 specificity is achieved. They use very thorough biochemical assays to measure the affinity of SHP-1 and SHP-2 for various ITIM/ITSMs and finally pin point some key amino acids that switch an ITIM/ITSM from SHP-2 to SHP-1 specificity. The in vitro biochemical assays are augmented by in cell assays that support their conclusions. Overall, this paper is an incredibly elegant and straight forward paper addressing how SHP-1/SHP-2 specificity is achieved.

      Major Comments: none

      Minor Comments:

      • Could the western blots in Figure 1 be quantified as the western blots in other figures?

      We will quantify the western blots in Figure 1 as suggested in the revised manuscript.

      • The data that the y+1 residue is essential for SHP-1/2 specificity is very convincing. We are curious if the other residues of the ITIM/ITSM also contribute to this specificity, albeit less potently. The PD-1 G224A mutant is still less potent than the PD-1 BTLA ITIM swap, suggesting that while the y+1 position is most important, the other residues contribute some specificity. The authors also included data on a PD-1 variant with the BTLA ITIM A224G mutation (8f), which is slightly better at recruiting SHP-1 than the PD-1 ITIM. It may be worth mentioning this data in the text of the paper as well as displaying it in the figure.

      The reviewer raised an excellent point. Yes, our data do suggest that other pY-flanking residues within the ITIM also contribute to SHP1 binding. However, the pY+1 residue replacement produced the strongest effect, as the reviewer noted. In the revised manuscript, we will acknowledge the potential contributions of other residues.

      • A brief introduction to ITIM vs ITSM in the introduction of the paper may be helpful background for readers. For example, ITIM receptors are reasonably well known but how ITSM functionally differs is probably less well known.

      We will rewrite the introduction about ITIM and ITSM for better clarity.

      • Although not the major focus of the paper, broadening out this SHP-1/2 specificity to other immune receptors in the discussion is fascinating. (a) The authors find that a Valine, Leucine, or Isoleucine in place of the Alanine in y+1 is very close to equivalent, yet the A is highly conserved. The authors speculate that there may be an advantage to sub-maximal SHP-1 affinity because it is more easy to regulate. I think this is reasonable speculation but a little unsatisfying given the very small observed difference in SHP-1 binding. If the authors have additional thoughts, I would be interested to hear them. (b) The authors note that PD-1 is the only ITIM with a glycine in the Y+1 position. Are there other receptors that function primarily through SHP-2, and how might they achieve this specificity?

      Response to a: Even though valine, leucine or isoleucine did not produce a striking enhancement in Shp1 recruitment over alanine, the differences were statistically significant. In fact, when we performed these point mutations in a BTLA ITIM background, valine, leucine or isoleucine markedly enhanced SHP1 recruitment (see unpublished data below). We speculate that other pY-flanking residues in BTLA, as this reviewer alluded to above, create an environment that amplifies the differences. The strong sensitivity to the pY+1 residue, as observed in BTLA, might hold for other SHP1-recruiting receptors too. If they were to have leucine or isoleucine at the pY+1 position of the ITIM, they might recruit too much SHP1, which would presumably decrease the fitness/growth of the cells. We propose to show these unpublished data as a supplemental figure in the revised manuscript. We will also discuss the potential contributions of other pY-flanking residues as this reviewer suggested.

      {{images cannot be rendered at this time in reply letters}}

      Response to b: Among the several receptors that we tested, PD-1 is the only receptor that exhibited no recruitment of SHP1. The lack of SHP1 recruitment is also true for murine PD-1, which has a glutamate residue (charged) at the Y+1 position. In addition, earlier work reported that PECAM1 also selectively recruits SHP2, but not SHP1. We have noted that PECAM1 contains a threonine (polar) at the pY+1 position of its ITIMs. Thus, their inability to recruit SHP1 is consistent with our model that a nonpolar residue at the Y+1 position is required for strong SHP1 recruitment. We will discuss these points in the revised manuscript.

      • Figure 9 b Val not Vla, Figure 3a - a legend for the color code may be nice (ie, 20-1000 nM)

      Thanks for catching this, we will fix the error in Figure 9b and provide the color code in Figure 3a in the revised manuscript.

      Reviewer #3 (Significance):

      Significance:

      SHP-1 and SHP-2 play a critical role in regulating immune system function. In addition, the receptors recruiting these phosphatases (like PD-1) are important immunotherapy targets. Previously, the question of SHP-1/SHP-2 specificity has been primarily described for ITIM bearing receptors individually. Other studies have predicted consensus sequences for the tSH2 domains of SHP-1 or SHP-2, but not addressed the defining molecular characteristics of these consensus sites or how these could be combined on ITIM receptors to generate selectivity between these related phosphatases. This paper represents a significant step forward because it provides a unifying mechanism explaining how ITIM-bearing immune receptors specifically recruit SHP-1 or SHP-2. I expect this paper will be broadly interesting to biochemists, immunologists and cancer biologists.

      Referee Cross-commenting

      I generally think the other reviewers comments are reasonable and insightful. Together, they suggest no new experiments are necessary. As for the proposed title change, I prefer the authors title and find it to be justified given their data.

      Reviewer #4 (Evidence, reproducibility and clarity):

      In this manuscript, Xu and colleagues performed an elaborate study to investigate the molecular basis of Shp1 and Shp2 discrimination by immune checkpoints PD-1 and BTLA. The paper is original, clear, and well written. I only have a few minor comments:

      1. Please label the molecular weights to all the western blots/IPs results.

      We will label the molecular weights to all the blots in the revised manuscript.

      2. Please add scale bars to all the microscopy pictures.

      We will add scale bars to all the microscopy images in the revised manuscript.

      3. For the SPR data, please add the fitting curves.

      We thank the reviewer for the suggestion. However, we did not use the fitting curve to calculate the Kd; instead, we plotted the maximum response as a function of concentration to determine the Kd. This is another well-accepted method for Kd calculation. In fact, some of the SPR curves fit poorly with the existing algorithm. Thus, showing the fitting curve might distract the readers.
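
      As an illustration of the approach described in this response (maximum response plotted against concentration), the following is a minimal sketch of a steady-state, one-site binding fit, R = Rmax * C / (Kd + C). It is not the authors' analysis code: the one-site model, the function name `fit_one_site`, the search range, and the example concentrations and responses are all assumptions made for illustration.

```python
import math


def fit_one_site(conc, resp, n_grid=200):
    """Least-squares fit of R = Rmax * C / (Kd + C); returns (Kd, Rmax).

    conc: analyte concentrations (hypothetical units, e.g. nM)
    resp: maximum (plateau) response at each concentration
    """
    def best_rmax(kd):
        # For a fixed Kd the model is linear in Rmax, so Rmax has a closed form.
        f = [c / (kd + c) for c in conc]
        return sum(r * x for r, x in zip(resp, f)) / sum(x * x for x in f)

    def sse(log_kd):
        kd = math.exp(log_kd)
        rmax = best_rmax(kd)
        return sum((r - rmax * c / (kd + c)) ** 2 for c, r in zip(conc, resp))

    # Coarse scan over log(Kd); the search range is an assumption tied to
    # the span of the measured concentrations.
    lo = math.log(min(conc) / 100.0)
    hi = math.log(max(conc) * 100.0)
    step = (hi - lo) / n_grid
    best = min((lo + i * step for i in range(n_grid + 1)), key=sse)

    # Golden-section refinement around the best grid point.
    a, b = best - step, best + step
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(100):
        x1, x2 = b - g * (b - a), a + g * (b - a)
        if sse(x1) < sse(x2):
            b = x2
        else:
            a = x1
    kd = math.exp((a + b) / 2.0)
    return kd, best_rmax(kd)


if __name__ == "__main__":
    # Synthetic, noise-free example values (not data from the manuscript).
    conc = [20.0, 50.0, 100.0, 200.0, 500.0, 1000.0]
    resp = [80.0 * c / (150.0 + c) for c in conc]
    kd, rmax = fit_one_site(conc, resp)
    print(f"Kd = {kd:.1f}, Rmax = {rmax:.1f}")
```

      Because this kind of fit uses only the plateau responses, it does not depend on how well the individual sensorgrams follow a kinetic model, which is consistent with the reasoning given in the response.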

      Reviewer #4 (Significance):

      The strength of this paper relies on the details they dissected by using a series of mutagenesis screening experiments, which should be interesting to cell biologists and cancer immunologists.

      Referee Cross-commenting

      I think the other reviewer's comments are insightful and constructive, the suggested experiments are necessary and will improve the paper.

      We thank this reviewer for the positive comments!

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      In this manuscript, Xu and colleagues performed an elaborate study to investigate the molecular basis of Shp1 and Shp2 discrimination by immune checkpoints PD-1 and BTLA. The paper is original, clear, and well written. I only have a few minor comments:

      1. Please label the molecular weights to all the western blots/IPs results.
      2. Please add scale bars to all the microscopy pictures.
      3. For the SPR data, please add the fitting curves.

      Significance

      The strength of this paper relies on the details they dissected by using a series of mutagenesis screening experiments, which should be interesting to cell biologists and cancer immunologists.

      Referee Cross-commenting

      I think the other reviewer's comments are insightful and constructive, the suggested experiments are necessary and will improve the paper.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Inhibitory immune receptors containing ITIMs function through recruiting the phosphatases SHP-1 and SHP-2. SHP-1 and SHP-2 are remarkably similar yet have different roles in vivo. How can ITIM-containing immune receptors specifically recruit SHP-1 or SHP-2? In this paper, Xu et al ask how SHP-1 vs SHP-2 specificity is achieved. They use very thorough biochemical assays to measure the affinity of SHP-1 and SHP-2 for various ITIM/ITSMs and finally pin point some key amino acids that switch an ITIM/ITSM from SHP-2 to SHP-1 specificity. The in vitro biochemical assays are augmented by in cell assays that support their conclusions. Overall, this paper is an incredibly elegant and straight forward paper addressing how SHP-1/SHP-2 specificity is achieved.

      Major Comments:

      none

      Minor Comments:

      • Could the western blots in Figure 1 be quantified as the western blots in other figures?
      • The data that the y+1 residue is essential for SHP-1/2 specificity is very convincing. We are curious if the other residues of the ITIM/ITSM also contribute to this specificity, albeit less potently. The PD-1 G224A mutant is still less potent than the PD-1 BTLA ITIM swap, suggesting that while the y+1 position is most important, the other residues contribute some specificity. The authors also included data on a PD-1 variant with the BTLA ITIM A224G mutation (8f), which is slightly better at recruiting SHP-1 than the PD-1 ITIM. It may be worth mentioning this data in the text of the paper as well as displaying it in the figure.
      • A brief introduction to ITIM vs ITSM in the introduction of the paper may be helpful background for readers. For example, ITIM receptors are reasonably well known but how ITSM functionally differs is probably less well known.
      • Although not the major focus of the paper, broadening out this SHP-1/2 specificity to other immune receptors in the discussion is fascinating. (a) The authors find that a Valine, Leucine, or Isoleucine in place of the Alanine in y+1 is very close to equivalent, yet the A is highly conserved. The authors speculate that there may be an advantage to sub-maximal SHP-1 affinity because it is more easy to regulate. I think this is reasonable speculation but a little unsatisfying given the very small observed difference in SHP-1 binding. If the authors have additional thoughts, I would be interested to hear them. (b) The authors note that PD-1 is the only ITIM with a glycine in the Y+1 position. Are there other receptors that function primarily through SHP-2, and how might they achieve this specificity?
      • Figure 9 b Val not Vla, Figure 3a - a legend for the color code may be nice (ie, 20-1000 nM)

      Significance

      SHP-1 and SHP-2 play a critical role in regulating immune system function. In addition, the receptors recruiting these phosphatases (like PD-1) are important immunotherapy targets. Previously, the question of SHP-1/SHP-2 specificity has been primarily described for ITIM bearing receptors individually. Other studies have predicted consensus sequences for the tSH2 domains of SHP-1 or SHP-2, but not addressed the defining molecular characteristics of these consensus sites or how these could be combined on ITIM receptors to generate selectivity between these related phosphatases. This paper represents a significant step forward because it provides a unifying mechanism explaining how ITIM-bearing immune receptors specifically recruit SHP-1 or SHP-2. I expect this paper will be broadly interesting to biochemists, immunologists and cancer biologists.

      Referee Cross-commenting

      I generally think the other reviewers comments are reasonable and insightful. Together, they suggest no new experiments are necessary. As for the proposed title change, I prefer the authors title and find it to be justified given their data.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this study, Xu and co-workers investigate the biophysical nature of the interaction between the structurally-related non-transmembrane PTPs Shp1 and Shp2 with the ITIM/ITSM-containing inhibitory receptors PD-1 and BTLA using cell-based, biochemical, biophysical and domain swapping assays. The primary aim being to better understand how these receptors discriminate between binding Shp1 and/or Shp2, and the orientation of Shp1 and Shp2 engagement. These are major unresolved questions in the field that the authors go some way to addressing in a methodical, rigorous, clear and concise manner. Findings are convincing, correlate well with previous findings and internally, and are complemented with excellent schematics, making it easy to comprehend.

      Major comments

      The authors focus primarily on binding affinities to explain differential binding of Shp1 and Shp2 by PD-1 and BTLA ITIMs and ITSMs, but this is only part of the story. Avidity, compartmentalization, stoichiometry of kinases, and relative abundance of Shp1 and Shp2 are also important aspects of the discriminatory mechanism that are not addressed. Competition assays would go some way to addressing the latter point and should at least be considered and discussed.

      Similarly, authors do not address how distortion of the pY binding pocket of Shp1 and Shp2 nSH2 domains in the auto-inhibited conformation is released, allowing the domain to engage with phospho-ITIM/ITSM. Again, this should be at least discussed. Current binding studies do not address this issue.

      Minor comments:

      Phosphorylation should be indicated in schematic representations in Figures 3, 6 b, c.

      Cellular and physiological significance should be further discussed, as well as broader implications of findings to other ITIM/ITSM-containing receptors in other lineages.

      Significance

      Findings from this study advance our knowledge of how inhibitory checkpoint regulatory receptors discriminate between Shp1 and Shp2, which has important implications for understanding how the unique biochemical, cellular and physiological functions of these receptors and phosphatases are dictated. Indeed, findings lay the foundation for a universal mechanism, that may apply to all ITIM/ITSM receptors in other cell lineages, and perhaps novel ways of targeting these interactions therapeutically.

      Compare to existing published knowledge

      Although largely correlative with previous studies, findings from this study start to fill major gaps in our knowledge of these biochemical processes, in a highly rigorous, concise and clear manner. Findings from previous studies were more 'piecemeal', whereas this study consolidates and advances important nuances of these interactions. Moreover, it lays the foundation for further structural, physiological and therapeutic studies.

      Audience

      The immune receptor signaling community and beyond, including any lineage in which ITIM/ITSM-containing receptors play a major role in regulating cellular responses.

      Your expertise

      ITIM/ITSM-containing receptors, kinase-phosphatase molecular switches, cellular reactivity to extracellular matrix proteins

      Referee Cross-commenting

      Generally agree with reviewer's comments. Constructive overall and fair. Although I was thinking additional competition experiments, I do not think necessary. Over the top for this study. Hence, 1 month should suffice to revise accordingly.

    5. Referee #1

      Evidence, reproducibility and clarity

      This paper explores an interesting problem of SHP1/SHP2 preferences of inhibitory immunoreceptors. The authors are quick to point out that many of their individual data points confirm published results at some level, but the power of the paper is in the parallel analysis of both PD1, which is strongly biased towards SHP2, and BTLA, which is biased towards SHP1. This gives them the opportunity to test the predictions of descriptive experiments by making simple mutated receptors with swapped ITIM or ITSM domains.

      The work is very well done, and the authors are generally quite careful and precise about the language used to describe results.

      The results are quite striking in that they find plenty of evidence for transient interaction of SHP1 with PD1 based on the biophysical measurements, but don't detect the interactions in pull down or in "in cell" microcluster recruitment experiments. In describing the pull-downs they discuss the issue of dissociation during washing potentially missing interactions that are taking place. I would prefer the framing that a pull down is fine evidence for binding, but lack of pull down is not evidence for lack of binding. They should double check that this language is consistent. Also, unless something has changed in the microcluster binding experiments, this in situ recruitment of SHP2 to PD1 is only observed for 2-3 minutes and then can't be detected, the situation for SHP2 becoming the same as it is for SHP1. If the kinetics are different in the cleaner systems that have now been developed, they should show this in a primary figure, as this would then be different from what was reported previously.

      The gap in this study is lack of any functional analysis. The Jurkat model could be quite useful as they have a relatively clean system for asking if the transient binding of SHP1 to PD1 has any functional impact, which they have not yet followed through on. Does PD-1 recruited SHP2 have any impact on function after the 5 minutes? Furthermore, the authors need to keep in mind that mice deficient in SHP2 respond to anti-PD1 checkpoint therapies (Rota, G., Niogret, C., Dang, A. T., Barros, C. R., Fonta, N. P., Alfei, F., Morgado, L., Zehn, D., Birchmeier, W., Vivier, E., & Guarda, G. (2018). Shp-2 Is Dispensable for Establishing T Cell Exhaustion and for PD-1 Signaling In Vivo. Cell Rep, 23(1), 39-49. https://doi.org/10.1016/j.celrep.2018.03.026). This is an important issue to discuss in light of the very interesting binding analysis the authors have performed. But I think the functional analysis can be part of a future paper.

      I would suggest that the title be modified slightly from "SHP1/SHP2 discrimination" to "differential SHP1/SHP2 interaction" and leave discussion of discrimination until they have the functional data integrated over times that are relevant to T cell transcriptional regulation (1-2 hrs). The functional analysis can be in another paper, but it would be interesting to have a paragraph in the discussion raising the outstanding issues beyond stable binding detected by the pull-down and microcluster recruitment experiments- what are the implications for function. Could the transient interactions in the noise of the steady state and equilibrium measurements be functional?

      I would summarise that the work is outstanding as biochemistry and biophysics and it should be published nearly as is. I'm suggesting minor revisions in that the changes are just to text, but I think this is an important and somewhat nuanced aspect of the paper that will make it even more helpful to readers.

      Significance

      The authors generate a detailed descriptive data set about the component interaction of SHP1 and SHP2 SH2 domains with PD1 and BTLA intracellular domains. They then test hypotheses generated from the descriptive data set to better define the nature of the interactions and why PD1 recruits primarily SHP2, while BTLA mainly recruits SHP1. PD1 is a major driver of the cancer immunotherapy revolution and SHP2 is the major candidate for a signalling effector of PD1. This paper can become the reference paper for the specificity and engineering of this interaction, which will make it highly significant in a very active and still expanding field.

      Referee Cross-commenting

      I still feel that "discrimination" has a functional/activity connotation that is not addressed at all in this paper, but can be addressed. I'm happy to have the suggestion stand and let the authors decide. They need to live with it once it's published. Another suggestion: the citations on regulation are mostly old. A good recent paper is Pádua, R. A. P., Sun, Y., Marko, I., Pitsawong, W., Stiller, J. B., Otten, R., & Kern, D. (2018). Mechanism of activating mutations and allosteric drug inhibition of the phosphatase SHP2. Nature Communications, 9(1), 4507. https://doi.org/10.1038/s41467-018-06814-w .

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      This study reveals the role of WW-PLEKHAs (PLEKHA5, 6 and 7) in the basolateral targeting of copper (Cu) transporter ATP7A. The Authors suggest that the WW-PLEKHAs/PDZD11/ATP7A interaction directs Cu-induced trafficking of ATP7A to the basolateral surface of epithelial cells. Suppression of WW-PLEKHAs impairs basolateral delivery of ATP7A and causes increased intracellular Cu levels. On the contrary, WW-PLEKHAs do not seem to participate in the retrieval of ATP7A back to the Golgi once the Cu levels return to basal values. To support these notions the manuscript provides a substantial set of data, obtained with a wide repertoire of methods. In my view, this manuscript could be of interest to a broad readership, ranging from cell biologists to medical doctors. However, further revision should address the concerns outlined below.

      Major points:

      1. The Authors claim that at basal Cu conditions ATP7A resides in the TGN regardless of PDZD11 or WW-PLEKHAs depletion (Figs. 3, 4 and Fig. S6, S7). However, colocalization with TGN marker and its quantification are not shown. Thus, the colocalization of ATP7A with TGN marker (Golgin 97 should work in all cell types) has to be shown and its quantification (Pearson coefficient) has to be provided for control and all KO cells.

      Response: We thank the Reviewer for this comment. We plan to carry out the IF colocalization of ATP7A with Golgin97 and quantifications for WT and KO clonal lines at basal Cu conditions.

      2. Along the same line, ATP7A colocalization with TGN marker and its quantification also has to be conducted for the Cu washout experiments.

      Response: We plan to carry out the IF colocalization of ATP7A with Golgin97 and quantifications for WT and KO clonal lines for the Cu washout conditions.
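
      For readers unfamiliar with the Pearson coefficient requested in the two points above, here is a minimal sketch of how it is computed from paired pixel intensities in two channels (for example ATP7A and a TGN marker). The function name `pearson_colocalization` and the flat-list input format are assumptions for illustration; this is not the authors' quantification pipeline, and a real analysis would typically also involve background subtraction and region-of-interest selection.

```python
import math


def pearson_colocalization(channel_a, channel_b):
    """Pearson correlation between two equally sized lists of pixel intensities."""
    n = len(channel_a)
    if n == 0 or n != len(channel_b):
        raise ValueError("channels must be non-empty and of equal length")
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant channel has no defined correlation")
    return cov / math.sqrt(var_a * var_b)
```

      Values near 1 indicate that the two signals rise and fall together pixel by pixel, values near 0 indicate no linear relationship, and negative values indicate mutual exclusion.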

      3. The authors say that upon addition of Cu ATP7A labeling was detected along lateral contacts, and near the apical and basal plasma membranes (Fig. 3B, WT). Here again "near apical" localization of ATP7A has to be clarified. This could either represent the ATP7A pool that still remains in the Golgi (which is usually close to apical surface in polarized epithelial cells) or the ATP7A pool delivered to the apical membrane of the cells. However, apical targeting of ATP7A would be odd considering previously published data that shows basolateral localization in polarized epithelial cells. Thus, the authors have to show whether "apical" ATP7A overlaps with TGN marker or with an apical marker (Gp135).

      Response: This Reviewer is correct. Indeed, we believe that the localization of ATP7A that we observe in cysts is not apical, but sub-apical, as shown for the localization of PLEKHA5, where colocalization with the apical marker gp135 clearly shows a different localization (Fig. 2I).

      Therefore, we will carry out co-localization of ATP7A with gp135 in MDCK cells (the monoclonal antibody does not work on mCCD cells) and the labeling of the micrographs in Fig. 3B and 4B will be revised (sub-apical instead of apical). The labeling of PLEKHA5 sub-apical pool will also be revised (sub-apical and not apical) in Fig. 2.

      4. PDZD11 or PLEKHA6/7 KOs lead to an ATP7A pattern, which looks like pretty large scattered vesicles that do not overlap with basolateral marker. What are these round ATP7A structures, endosomes? Colocalization assessments with EEA1 (early endosomes), VPS35 (sorting endosome) and LAMP1 (late endosomes) would be needed to clarify this. Alternatively, these vesicles could represent a fragmented Golgi with ATP7A inside. To establish this, labelling with TGN marker at these conditions is required.

      Response: We thank the Reviewer for this comment. To clarify the nature (endosomes, Golgi, etc) of the membrane vesicles where ATP7A is localized in KO lines we will carry out double IF colocalization of ATP7A with either Golgin97 or early/sorting (which are mostly overlapped) endosome or the late endosome markers.

      5. Biotinylation experiments. The Authors say that KO of either PDZD11, or PLEKHA7, or both PLEKHA6 and PLEKHA7, but not PLEKHA6 alone, decreased ATP7A levels at the basolateral surface of mCCD cells (Fig. 3G), while a small decrease in the basolateral levels of ATP7A is observed in PLEKHA5-KO, but not PLEKHA6-KO MDCK cells (Fig. 4G). Honestly, it is tough to see this. In Fig. 4G all ATP7A bands in the biotinylated fraction look similar. In Fig. 3G, the P11 and P6/7 KO bands of biotinylated ATP7A might be a bit less intense than in WT, while the P6 KO signal looks even more intense than WT. More convincing blots with quantification have to be provided for both figures.

      Response: We will carry out additional immunoblots and quantifications of the biotinylation experiments results.

      6. Along the same line. Why was apical biotinylation of ATP7A not included? It absolutely should be done to understand whether any KO induces apical mistargeting of ATP7A.

      Response: The levels of ATP7A at the apical surface upon basal or elevated copper are negligible and not physiologically relevant, as established by previous biotinylation studies (for example Greenough et al AJP 2004, and Nyasae et al AJP-Gastrointest Liver Physiol 2007). We will carry out IF analysis of WT and KO cells with ATP7A and apical markers (e.g. gp135) to clarify whether the subapical labeling for ATP7A is on the apical membrane. Importantly, LESS, and not more, subapical labeling is detected in KO lines (Fig. 3B, Fig. 4B), as we pointed out in the results section. Therefore, the KO lines do not show increased apical (mistargeting of) ATP7A.

      7. Copper metabolism. The authors say that KO of either PDZD11 or PLEKHA6/7 results in higher Cu levels. What does this mean in terms of physiology and pathology? In the context of Menkes disease one has to show that this intracellular Cu increase is due to a reduction in Cu release from the cells. So, Cu release from the cells into medium has to be measured by ICP-MS or Cu64. On the other hand, it would be important to understand whether Cu accumulation in KO cells is toxic. To this end viability of KO cells should be tested in Cu dose-response experiments.

      Response: The focus of this paper is the molecular mechanisms of ATP7A targeting to the BL plasma membrane, rather than a quantitative analysis of copper transport by ATP7A or of the physiology/pathology of copper homeostasis in our WT and KO cell lines. Our measurement of the intracellular copper using the CF4 probe was designed as a physiological readout to confirm that altered localization at the BL plasma membrane correlates with reduced copper extrusion, as can be hypothesized. That said, to address this point, we plan to carry out an ICP-MS analysis of intracellular copper in selected WT and KO lines, after loading cells with different amounts of copper, and at different times after return to basal copper levels. CF4 and ICP-MS generally track, but they do measure distinct copper pools: CF4 measures exchangeable Cu pools while ICP-MS measures total Cu pools. We will also carry out a crystal violet analysis (see Gudekar et al, Scient. Reports, 2020) of the viability of WT and KO cells in the absence and presence of low or elevated copper levels, as suggested by the Reviewer.

      1. How critical is WW-PLEKHAs or PDZD11 deficiency in terms of Cu metabolism? Are there genetic disorders or mouse phenotypes associated with their loss of function? If yes, do these phenotypes include any impairment of Cu metabolism?

      Response: To our knowledge, no genetic study has addressed the role of WW-PLEKHAs and PDZD11 in Cu metabolism in vivo. PLEKHA7-KO mice are viable and were not reported to display any phenotype consistent with grossly altered Cu metabolism (Popov et al 2015). Mice KO for either PLEKHA5 or PLEKHA6 or PDZD11 have not been described. However, if WW-PLEKHAs have redundant functions in the trafficking of ATP7A, one would expect that mutation/KO of only one of them may not yield a significant phenotype. Furthermore, we cannot exclude that additional PDZ-containing proteins may participate in the trafficking of ATP7A, compensating for a pathological or experimental loss of PDZD11. So, answering this question will require generating single, double and triple KO mice for WW-PLEKHAs, and carrying out a detailed analysis of in vivo Cu metabolism. This is beyond the scope of this paper. The text of the Discussion will be revised to address this comment.

      1. Discussion. Could PH domains of WW-PLEKHAS be involved in their basolateral localization, thereby generating a targeting patch for ATP7A? Some publications suggest that the basolateral membrane might be enriched in specific PIPs, which in turn generates a favorable environment for some PH domains. Is this the case for PH domains of WW-PLEKHAS?

      Response: This is an interesting hypothesis that should be investigated in future studies (lipidomic analysis of KO lines, overexpression studies, etc), but is outside of the scope of the present manuscript.

      Minor points:

      1. Fig. 6C. CFP-HA is a negative control but still gives a band (although of lower intensity). So how can one be sure that other interactions are specific? This is particularly worrying because the quantification shows a very minor (less than 1.5) increase in the intensity of bands corresponding to specific interactors.

      Response: CFP-HA is used as a “negative control” 3rd protein, added to bait (GST-PDZD11) and prey (GFP-ATP7A-Cter) (Fig. 6C). The IB shows that in the presence of CFP-HA the bait binds the prey, which is in agreement with the previously reported interaction between PDZD11 and the C-terminal region of ATP7A (Stephenson et al JBC, 2005). The point of the Figure is to show that the interaction between bait and prey is enhanced in the presence of HA-tagged WW-PLEKHAs (again, CFP-HA is the negative control). We agree that the increase is not huge, but it is nevertheless statistically significant, based on several experiments (Fig. 6E).

      1. Page 11. The result section title "WW-PLEKHAs promote PDZD11 binding to ATP7A through PDZD11 (Figure 6)" does not sound right and has to be corrected.

      Response: The text was revised (“WW-PLEKHAs promote PDZD11 binding to ATP7A”).

      Reviewer #1 (Significance):

      Delivery of copper transporter ATP7A to the basolateral surface of epithelial cells is of great importance for maintenance of copper metabolism and, hence for human health in general. Impairment of this process in enterocytes causes fatal Menkes disease. However, the mechanisms driving basolateral targeting of ATP7A remained poorly characterized. This study provides a significant advance in our understanding of these mechanisms and opens new avenues for investigation of how WW-PLEKHAs/PDZD11-mediated targeting of ATP7A might be affected in the context of inherited disorders of copper metabolism.

      Reviewer #2 (Evidence, reproducibility and clarity):

      This manuscript uncovers new PDZD11 interactors that participate in trafficking of the copper transporter ATP7A from the Golgi/TGN to the cell periphery in response to high copper concentrations. These interactors named PLEKHA5, PLEKHA6, and PLEKHA7 interact with the N-terminal Pro-rich domain of PDZD11 through their WW domains. As PDZD11 interacts with the C-terminal region of ATP7A, the authors investigated the hypothesis that WW-PLEKHAs are required for copper-induced relocalization of ATP7A from the TGN to the plasma membrane where it functions in copper efflux. In vitro pull down experiments verified the formation of ATP7A-, PLEKHAs-, and PDZD11-containing complexes. Using CRISPR/Cas9 technology, the authors have generated PDZD11-, PLEKHA5-, PLEKHA6-, and PLEKHA6/7-KO cell lines.

      Cells lacking one (or more) of these proteins were examined by microscopy with respect to their ability of targeting ATP7A to the cell periphery in response to copper. Abnormal trafficking of ATP7A in these mutant cell lines (PDZD11-, PLEKHA5-, PLEKHA6-, PLEKHA7-, and PLEKHA6/7-KOs) presumably prevented copper efflux since elevated intracellular copper was detected using the fluorescent copper probe CF4.

      Although it is difficult to read across the article's figures and supporting figure files (going back-and-forth repeatedly), the manuscript is generally clear and well written, and the results seem well documented accompanied by a tremendous amount of work.

      Comments.

      1. Two-hybrid screen occurs in the nucleus. How could the authors explain the fact that the use of PDZD11 as a bait exhibited an interaction with PLEKHA5 and PLEKHA6 (as well as PLEKHA7) in this system? Microscopic analysis of PLEKHA5 showed a cytoplasmic submembrane localization with E-cadherin, whereas PLEKHA6 exhibited a localization along the plasma membrane at apical junctions. In the case of PLEKHA7, it is an adherens junction protein. Furthermore, these three proteins are quite big (1116, 1297, and 1121 AAs, respectively) with their WW regions at their N termini, which involved the expression of very long cDNAs fused to the TA domain. As truly membrane-associated proteins, isn't it surprising that a two hybrid approach worked?

      Response: We carried out several Y2H screens with a number of different baits, and we have always validated the physiological significance of the high score interactions (Pulimeno et al JBC 2011, Guerrera et al, 2016 JBC, and other unpublished data). So, it is an approach that reliably works very well. The Hybrigenics human placenta library that we used contains fragments of proteins, not the full-length proteins. Fig. 1A shows the preys identified with the Y2H using PDZD11 as a bait. The preys that were found comprise only the N-terminal regions of WW-PLEKHAs, not the FL proteins.

      1. Fig. 1C, what are the 4 bands seen for the second blot (anti-HA) in lanes 1, 2, 9 and 10? The blot was cut in a way that not enough of the membrane can be evaluated. Why use Ponceau for GST-baits and not anti-GST antibodies? It would be much better to have a uniform method (Western blot assays) to show the data.

      This latter comment is true for Fig 1D, E, and Fig 6.

      Response: The 4 bands seen for the second blot (anti-HA) in lanes 1, 2, 9 and 10 are non-specific cross-reactions of the antibodies with the baits, which are present in high concentration (and present also where there is no CFP-HA, in lanes 1 and 9). The preys can be identified on the basis of their molecular size. For example, no CFP-HA prey is detected, since its size is intermediate between the baits, thus the negative control is validated. We use Ponceau for 2 reasons: 1) Ponceau can detect baits very well on nitrocellulose membranes; 2) to use GST antibodies we would need to cut the membranes. But cutting membranes is not possible when the size of some preys (in this case, the negative control) is in the same range of sizes as the baits. Thus, if we used anti-GST antibodies we would have to strip and re-probe the membranes, which in our experience is not optimal for obtaining good signals.

      1. Fig 1B, why Caco-2 cells? All the other experiments were conducted with other cell lines such as mCCD and MDCK. Using different cell lines could give different results.

      Response: We used Caco2 cells because the Y2H was carried out with a human bait on a human placental library, and Caco2 are human cells. We also tried to use MDCK cells, but the efficiency of the IP was lower.

      1. Fig 1D, it is unclear whether the GST-PDZD11 fusion protein (bait) was present or not when used in pull down assays with GFP alone. This is a clear disadvantage of Ponceau, immunoblot would be much better to use.

      Response: The labeling by Ponceau is not optimal in one image (Fig. 1D), probably due to a problem of transfer. But a clearer image for the same pulldown with the same bait is shown in the bottom panel of Fig. 1E (where we show 3 PDZD11 baits, FL, N-term and delta-24), and it clearly shows good normalization of baits. We stain baits with Ponceau for normalization.

      1. In Fig 3A, under basal copper conditions, microscopic image of the PLEKHA6/7-KO seems to indicate a distinct pattern of localization for ATP7A in comparison to that of WT. However, this difference does not seem to be highlighted in Fig 3E.

      Response: We will re-examine all the micrographs used for the quantification and integrate the data with the results of the colocalization between ATP7A and TGN marker. This should allow us to establish whether there is a dissociation of ATP7A labeling from TGN marker labeling in KO cells, or else a fragmentation of the TGN in the double-KO mCCD cells.

      In Figs 3F and 4F, what was the method for quantification?

      Response: The methods for quantifications are described in the “Image quantification” section of the Methods. We will add new data about the quantification of co-localization of ATP7A with TGN and endosomal markers.
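      As an illustration of how such a co-localization measurement can be computed, the following is a minimal sketch of the Pearson coefficient that Reviewer #1 requests for ATP7A and a TGN marker. It is not the authors' pipeline: the function name `pearson` and the flat lists of per-pixel intensities are assumptions made for the example, and real analyses would normally run on segmented regions of interest in an image-analysis package.

```python
import math


def pearson(channel_a, channel_b):
    """Pearson correlation between two equally sized lists of pixel intensities.

    channel_a and channel_b are hypothetical inputs: intensities of the same
    pixels in two fluorescence channels (e.g. ATP7A and a TGN marker).
    """
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("channels must have the same length (at least 2 pixels)")
    mean_a = sum(channel_a) / len(channel_a)
    mean_b = sum(channel_b) / len(channel_b)
    # Covariance and variances, left unnormalized because the factor cancels.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / math.sqrt(var_a * var_b)
```

      Values near 1 indicate that the two labels vary together across pixels, values near 0 indicate no linear relationship, and negative values indicate mutually exclusive labeling.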

      Along these lines, what is the copper concentration under basal conditions? How much copper was used for elevated copper conditions and what was the time of treatment?

      Response: Basal conditions refer to normal cell culture medium (“Cell culture” section of the Methods), and elevated copper is 315 µM of CuCl2 dissolved in culture medium. Cells were treated for 4 hr (MDCK) or 5 hr (mCCD) when cultured on Transwells, and overnight in the case of cysts.

      1. Is there any evidence for Atp7A-PDZD11-PLEKHAs association in vivo? Have the authors assessed these protein-protein interactions using methods such as bimolecular fluorescence in cells?

      Response: We have attempted co-IP experiments with endogenous proteins, but they were inconclusive, probably due to the different extraction conditions required to solubilize membrane (ATP7A) and cytoplasmic (WW-PLEKHAs, PDZD11) proteins, and a disassembly of the complex under the conditions required to solubilize ATP7A. We have not tried bimolecular fluorescence, but for the revision we plan to carry out Proximity Ligation Assay (PLA) experiments, which in our hands are very effective in assessing physiological proximity of proteins in cells. Our pulldown experiments however provide evidence that the three proteins form a complex, and that WW-PLEKHAs enhance the interaction between PDZD11 and ATP7A (Fig. 6C-E). This is a mechanism that we have shown occurs also for the complex between PLEKHA7, PDZD11 and Tspan33 (Shah et al, 2018 Cell Rep, Rouaud et al, 2020 JBC).

      1. In Fig 5, have the authors verified the mRNA (or/and protein) steady-state levels of metallothioneins? Probing whether metallothioneins are induced would strongly reinforce their conclusion as to whether an increase in intracellular copper levels occurred in PDZD11-, PLEKHA5-, PLEKHA6-, PLEKHA7-, and PLEKHA6/7-KO cell lines.

      Response: We thank the Reviewer for this comment. In the revision we will carry out RT-PCR analysis of the levels of expression of mRNAs for Metallothioneins I and II.

      1. In Fig 6E, what was the method for quantitative immunoblot assays? Have you used an Odyssey infrared imaging system (Li-Cor)? What was the loading (internal) control under the same analytical method?

      Response: The Li-Cor imaging system was used to capture the signals, and intensities were measured in the Image Studio Lite program (Li-Cor). Signals from the prey (C-terminus of ATP7A) were normalized to signals from the bait (GST-PDZD11), which were used as loading control.
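      The normalization described in this response can be sketched as follows. This is only an illustrative outline under stated assumptions: the function names, the dictionary layout, and the intensity values are hypothetical, and expressing each lane relative to the CFP-HA negative-control lane is an assumed convention rather than a confirmed detail of the authors' analysis.

```python
def normalized_ratio(prey_signal, bait_signal):
    """Prey band intensity divided by the bait band intensity of the same lane."""
    if bait_signal <= 0:
        raise ValueError("bait signal must be positive")
    return prey_signal / bait_signal


def fold_change(lanes, control="CFP-HA"):
    """Express each lane's bait-normalized prey signal relative to the control lane.

    lanes maps a lane name to a (prey_signal, bait_signal) tuple.
    """
    ratios = {
        name: normalized_ratio(prey, bait) for name, (prey, bait) in lanes.items()
    }
    reference = ratios[control]
    return {name: value / reference for name, value in ratios.items()}


# Hypothetical intensities in arbitrary units, not data from the manuscript.
example_lanes = {
    "CFP-HA": (1000.0, 5000.0),
    "PLEKHA7-HA": (1500.0, 5000.0),
}
result = fold_change(example_lanes)
```

      Dividing by the bait signal first corrects for unequal bait loading between lanes, so that the remaining differences reflect how much prey was retained per unit of bait.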

      1. In the case of the manuscript section entitled " PLEKHA5, PLEKHA6 and PLEKHA7 show distinct localizations in cells and tissues and define cytoplasmic...", (pages 5 to 7) the reader would benefit having a Table that would summarize all the data. It would be more understandable.

      Response: We thank the Reviewer for this suggestion. We will include a Table in the revision.

      1. Do PDZD11, PLEKHA5, PLEKHA6, and PLEKHA7 proteins exist as multiple isoforms? If that is the case, for each of them, are they exhibiting the same tissue-specific expression profiles as shown in Fig S3? For each protein, if different isoforms exist, perhaps some of them participate in a different way for the targeting of ATP7A?

      Response: No PDZD11 isoforms are known, but 15, 5, and 9 different protein-coding transcripts are reported (ensembl.org) for PLEKHA5, PLEKHA6 and PLEKHA7, respectively, the largest ones being the WW-containing transcripts. We focused exclusively on the WW-containing isoforms of PLEKHAs because PDZD11 binds to the WW domains, and the Y2H identified only the WW-containing isoforms of PLEKHA5, PLEKHA6 and PLEKHA7. The observation that the phenotype of PDZD11-KO cells is similar to that of either PLEKHA6-KO, PLEKHA7-KO or double-KO mCCD cells suggests that PLEKHA5, PLEKHA6 and PLEKHA7 WW-containing isoforms act in a complex with PDZD11. This is consistent with the previous observations that highlight a role of the C-terminal region of ATP7A in regulating its traffic, and the binding of the same region to PDZD11. However, we cannot exclude that PLEKHA5/6/7 isoforms that lack the WW domains could participate in the regulation of the targeting of ATP7A, through other, PDZD11-independent mechanisms. The text of the Discussion will be revised to clarify this point.

      1. Is it known whether PDZD11, PLEKHA5, PLEKHA6, and PLEKHA7 proteins participate in the copper-regulated trafficking of the ATP7B (Wilson) protein? In Fig S3, it is shown that they are expressed in liver, with PLEKHA7 exhibiting a slower migration (protein modification?). Alternatively, are they strictly involved in the regulation of ATP7A (Menkes)? Could the authors discuss this?

      Response: ATP7B lacks the PDZ-binding motif that is responsible for PDZD11 binding, and the C-terminus of ATP7B does not interact with PDZD11 (AIPP1) by beta-galactosidase assays in yeast, unlike ATP7A (Stephenson et al, JBC 2005). For this reason, ATP7B is not expected to be regulated by PDZD11 and WW-PLEKHAs. However, analysis of the localization of ATP7B in our cell lines could be done in future studies. The text of the Discussion will be revised to make this point.

      1. The proposed model in Fig 7 is unclear illustrating a nucleus that consumes a lot of space while it is not involved in the proposed mechanism. Cellular proteins that are involved in the proposed mechanism should be bigger and their interactions that lead to formation of protein complexes must be better illustrated as a function of copper availability.

      Response: The model of Fig. 7 will be re-drawn to take into account these suggestions.

      1. Typo. Line 320: remove "or" and replace it by "and" : ...both PLEKHA6 and PLEKHA7 (Fig. 5A-D).

      2. Typo. Line 329: remove (Figure 6) in the title.

      Response: The typos were corrected.

      Reviewer #2 (Significance ):

      This study represents a significant advance in the copper field.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Summary

      The authors identified some major interesting findings including the key role of WW-PLEKHAs (PLEKHA5, PLEKHA6, PLEKHA7) in the recruitment of PDZD11 targeting ATP7A to the cell periphery in response to elevated copper. Generating the antibodies against PLEKHAs and PDZD11 and various knock out cell lines and validating their expression in these cell lines and tissues is innovative. Further, the authors showed that copper dependent WW-PLEKHAs and PDZD11 regulate the localization and function of ATP7A to modulate cellular copper homeostasis.

      Major comments:

      We are in agreement with the manuscript conclusions. Based on the presented studies, the authors propose the in-vivo role of WW-PLEKHAs and PDZD11 in ATP7A trafficking, and how microtubule dynamics and trafficking machinery regulate ATP7A localization. Additionally, investigating the effects of the cell membrane trimolecular complexes ATP7A-PDZD11-WW-PLEKHA on elevated copper would be impactful.

      Additional notes:

      1. Figure 4, the authors provide excellent data and images showing localization of PLEKHA, PDZD11 and ATP7A within different cell lines. Nevertheless, showing PLEKHA, PDZD11 and ATP7A localization on the membrane of the cell surface under elevated copper conditions with a cell fractionation technique, and their interaction through co-immunoprecipitation (co-IP), could validate the authors' hypothesis. At least the authors should comment on this.

      Response: As stated in the response to comment n.6 from Reviewer #2, we attempted co-IP experiments with endogenous proteins, which were inconclusive. Our pulldown experiments provide evidence that the three proteins form a complex, and that WW-PLEKHAs enhance the interaction between PDZD11 and ATP7A (Fig. 6C-E). This is a mechanism that we have shown occurs also for the complex between PLEKHA7, PDZD11 and Tspan33 (Shah et al, 2018 Cell Rep, Rouaud et al, 2020 JBC). We plan for the revision to carry out Proximity Ligation Assay (PLA) experiments, which in our hands are very effective in assessing proximity of proteins in cells, when co-IPs are technically difficult or impossible.

      1. Figure 5, Alternatively, intracellular copper levels by ICPMS in the cell lines would strengthen the results. As the authors treated the cell lines with a very high copper concentration, copper concentration dependent studies would be appreciated to verify how the response of PLEKHAs and PDZD11 depends on copper concentration.

      Also, the authors should clearly mention the number of replicates for each experiment and indicate in the figure legends.

      Response. We will carry out ICP-MS to evaluate intracellular copper levels as a function of genotype. Depending on the results, we will carry out studies about the dose-dependence of the effects of copper. The number of replicates of the experiment will be mentioned in the Figure legend in the revised text.

      Minor comments:

      1. Figure1B, co-IP efficiency is lower in Caco-2 cells, therefore endogenous levels of PLEKHA5, PLEKHA6 and PDZD11 in Caco-2 should be checked and shown. Mention the number of replicates for the experiments.

      Response. Endogenous levels of proteins are shown in the Input lanes. The low levels of PLEKHA5 in Caco2 cells are consistent with the IB analysis of tissue lysates, showing relatively low levels in intestine (Fig. S3D). The number of replicates of the experiment will be mentioned in the Figure legend in the revised text.

      1. Figure 2A, as per result of 2A, E-cadherin labeling is missing. Figure 2M and 2N, the authors analyzed the co-localization of PLEKHA5 in the presence of nocodazole but not PDZD11. It would be interesting to see PDZD11 as well after nocodazole treatment.

      Response. E-cadherin-labelled panel will be added to Fig. 2A in the revision. We will also show the effect of nocodazole on the localization of PDZD11.

      1. The result section title for figure 6 (line 329) is misleading. Also, membrane localization of the trimolecular complex of PLEKHAs, PDZD11 and ATP7A at elevated copper concentration could be shown by immunofluorescence, if possible.

      Response. The title of the section was revised, to reflect more accurately the results of Figure 6. It is now “WW-PLEKHAs promote the binding of the C-terminal region of ATP7A to PDZD11”.

      Triple IF colocalization of endogenous PLEKHAs, PDZD11 and ATP7A is not possible for 2 reasons: 1) PDZD11 antibodies can only reveal endogenous junctional (clustered) labeling (Guerrera et al JBC 2016); the lateral and cytoplasmic labeling is too weak, and can only be appreciated upon overexpression of PDZD11, as shown in Fig. 2B-E (co-expression with selected WW-PLEKHAs highlights how each PLEKHA directs PDZD11 to a different pool). 2) Both antibodies against PDZD11 and ATP7A were raised in rabbits, which makes it technically impossible to do triple labeling. We will address the question of the existence of the ATP7A-containing trimolecular complex by PLA analysis (ATP7A+PDZD11 and ATP7A+WW-PLEKHAs).

      1. General comment: it would be interesting to see the hypothesis and findings tested in a mouse model with copper accumulation (for example Atp7b KO mice) as PLEKHAs and PDZD11 are sensitive to copper concentration. Or at least the authors can comment on this future possibility.

      Response. We agree with the Reviewer that mouse models could be useful to test the relevance of WW-PLEKHAs and PDZD11 as targets or effectors of copper-sensing mechanisms in vivo.

      The text of the Discussion will be modified to envisage these possible future studies.

      Reviewer #3 (Significance):

      In conditions including Menkes disease, occipital horn syndrome (OHS), and ATP7A-related distal motor neuropathy (DMN), characterized by altered intestinal copper metabolism, the new knowledge that ATP7A associates with WW-PLEKHAs (PLEKHA5, PLEKHA6, PLEKHA7) and PDZD11 is an important finding for the study of copper homeostasis.

      As ATP7A is structurally similar to ATP7B (60% homology), the current study opens an area of research where WW-PLEKHAs (PLEKHA5, PLEKHA6, PLEKHA7) and PDZD11 could also play a role in ATP7B trafficking, to address not only Menkes disease but also Wilson disease and other diseases related to altered copper levels.

      This is a well written and presented manuscript with excellent mechanistic work utilizing molecular imaging techniques and several confirmatory experiments. I recommend the manuscript to be accepted for publication with minor modifications.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The authors identified some major interesting findings including the key role of WW-PLEKHAs (PLEKHA5, PLEKHA6, PLEKHA7) in the recruitment of PDZD11 targeting ATP7A to the cell periphery in response to elevated copper. Generating the antibodies against PLEKHAs and PDZD11 and various knock out cell lines and validating their expression in these cell lines and tissues is innovative. Further, the authors showed that copper dependent WW-PLEKHAs and PDZD11 regulate the localization and function of ATP7A to modulate cellular copper homeostasis.

      Major comments:

      We are in agreement with the manuscript conclusions. Based on the presented studies, the authors propose the in-vivo role of WW-PLEKHAs and PDZD11 in ATP7A trafficking, and how microtubule dynamics and trafficking machinery regulate ATP7A localization. Additionally, investigating the effects of the cell membrane trimolecular complexes ATP7A-PDZD11-WW-PLEKHA on elevated copper would be impactful.

      Additional notes:

      1. Figure 4, the authors provide excellent data and images showing localization of PLEKHA, PDZD11 and ATP7A within different cell lines. Nevertheless, showing PLEKHA, PDZD11 and ATP7A localization on the membrane of the cell surface under elevated copper conditions with a cell fractionation technique, and their interaction through co-immunoprecipitation (co-IP), could validate the authors' hypothesis. At least the authors should comment on this.
      2. Figure 5, Alternatively, intracellular copper levels by ICPMS in the cell lines would strengthen the results. As the authors treated the cell lines with a very high copper concentration, copper concentration dependent studies would be appreciated to verify how the response of PLEKHAs and PDZD11 depends on copper concentration. Also, the authors should clearly mention the number of replicates for each experiment and indicate in the figure legends.

      Minor comments:

      1. Figure1B, co-IP efficiency is lower in Caco-2 cells, therefore endogenous levels of PLEKHA5, PLEKHA6 and PDZD11 in Caco-2 should be checked and shown. Mention the number of replicates for the experiments.
      2. Figure 2A, as per result of 2A, E-cadherin labeling is missing. Figure 2M and 2N, the authors analyzed the co-localization of PLEKHA5 in the presence of nocodazole but not PDZD11. It would be interesting to see PDZD11 as well after nocodazole treatment.
      3. The result section title for figure 6 (line 329) is misleading. Also, membrane localization of the trimolecular complex of PLEKHAs, PDZD11 and ATP7A at elevated copper concentration could be shown by immunofluorescence, if possible.
      4. General comment: it would be interesting to see the hypothesis and findings tested in a mouse model with copper accumulation (for example Atp7b KO mice) as PLEKHAs and PDZD11 are sensitive to copper concentration. Or at least the authors can comment on this future possibility.

      Significance

      In conditions including Menkes disease, occipital horn syndrome (OHS), and ATP7A-related distal motor neuropathy (DMN), characterized by altered intestinal copper metabolism, the new knowledge that ATP7A associates with WW-PLEKHAs (PLEKHA5, PLEKHA6, PLEKHA7) and PDZD11 is an important finding for the study of copper homeostasis.

      As ATP7A is structurally similar to ATP7B (60% homology), the current study opens an area of research where WW-PLEKHAs (PLEKHA5, PLEKHA6, PLEKHA7) and PDZD11 could also play a role in ATP7B trafficking, to address not only Menkes disease but also Wilson disease and other diseases related to altered copper levels.

      This is a well written and presented manuscript with excellent mechanistic work utilizing molecular imaging techniques and several confirmatory experiments. I recommend the manuscript to be accepted for publication with minor modifications.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript uncovers new PDZD11 interactors that participate in trafficking of the copper transporter ATP7A from the Golgi/TGN to the cell periphery in response to high copper concentrations. These interactors named PLEKHA5, PLEKHA6, and PLEKHA7 interact with the N-terminal Pro-rich domain of PDZD11 through their WW domains. As PDZD11 interacts with the C-terminal region of ATP7A, the authors investigated the hypothesis that WW-PLEKHAs are required for copper-induced relocalization of ATP7A from the TGN to the plasma membrane where it functions in copper efflux. In vitro pull down experiments verified the formation of ATP7A-, PLEKHAs-, and PDZD11-containing complexes. Using CRISPR/Cas9 technology, the authors have generated PDZD11-, PLEKHA5-, PLEKHA6-, and PLEKHA6/7-KO cell lines. Cells lacking one (or more) of these proteins were examined by microscopy with respect to their ability of targeting ATP7A to the cell periphery in response to copper. Abnormal trafficking of ATP7A in these mutant cell lines (PDZD11-, PLEKHA5-, PLEKHA6-, PLEKHA7-, and PLEKHA6/7-KOs) presumably prevented copper efflux since elevated intracellular copper was detected using the fluorescent copper probe CF4.

      Although it is difficult to read across the article's figures and supporting figure files (going back-and-forth repeatedly), the manuscript is generally clear and well written, and the results seem well documented accompanied by a tremendous amount of work.

      Comments.

      1. Two-hybrid screen occurs in the nucleus. How could the authors explain the fact that the use of PDZD11 as a bait exhibited an interaction with PLEKHA5 and PLEKHA6 (as well as PLEKHA7) in this system? Microscopic analysis of PLEKHA5 showed a cytoplasmic submembrane localization with E-cadherin, whereas PLEKHA6 exhibited a localization along the plasma membrane at apical junctions. In the case of PLEKHA7, it is an adherens junction protein. Furthermore, these three proteins are quite big (1116, 1297, and 1121 AAs, respectively) with their WW regions at their N termini, which involved the expression of very long cDNAs fused to the TA domain. As truly membrane-associated proteins, isn't it surprising that a two hybrid approach worked?
      2. Fig. 1C, what are the 4 bands seen for the second blot (anti-HA) in lanes 1, 2, 9 and 10? The blot was cut in a way that not enough of the membrane can be evaluated. Why use Ponceau for GST-baits and not anti-GST antibodies? It would be much better to have a uniform method (Western blot assays) to show the data. This latter comment is true for Fig 1D, E, and Fig 6.
      3. Fig 1B, why Caco-2 cells? All the other experiments were conducted with other cell lines such as mCCD and MDCK. Using different cell lines could give different results.
      4. Fig 1D, it is unclear whether the GST-PDZD11 fusion protein (bait) was present or not when used in pull down assays with GFP alone. This is a clear disadvantage of Ponceau, immunoblot would be much better to use.
      5. In Fig 3A, under basal copper conditions, microscopic image of the PLEKHA6/7-KO seems to indicate a distinct pattern of localization for ATP7A in comparison to that of WT. However, this difference does not seem to be highlighted in Fig 3E.

      In Figs 3F and 4F, what was the method for quantification?

      Along these lines, what is the copper concentration under basal conditions? How much copper was used for elevated copper conditions and what was the time of treatment?

      1. Is there any evidence for Atp7A-PDZD11-PLEKHAs association in vivo? Have the authors assessed these protein-protein interactions using methods such as bimolecular fluorescence in cells?
      2. In Fig 5, have the authors verified the mRNA (or/and protein) steady-state levels of metallothioneins? Probing whether metallothioneins are induced would strongly reinforce their conclusion as to whether an increase in intracellular copper levels occurred in PDZD11-, PLEKHA5-, PLEKHA6-, PLEKHA7-, and PLEKHA6/7-KO cell lines.
      3. In Fig 6E, what was the method for quantitative immunoblot assays? Have you used an Odyssey infrared imaging system (Li-Cor)? What was the loading (internal) control under the same analytical method?
      4. In the case of the manuscript section entitled " PLEKHA5, PLEKHA6 and PLEKHA7 show distinct localizations in cells and tissues and define cytoplasmic...", (pages 5 to 7) the reader would benefit having a Table that would summarize all the data. It would be more understandable.
      5. Do PDZD11, PLEKHA5, PLEKHA6, and PLEKHA7 proteins exist as multiple isoforms? If that is the case, for each of them, are they exhibiting the same tissue-specific expression profiles as shown in Fig S3? For each protein, if different isoforms exist, perhaps some of them participate in a different way for the targeting of ATP7A?
      6. Is it known whether PDZD11, PLEKHA5, PLEKHA6, and PLEKHA7 proteins participate in the copper-regulated trafficking of the ATP7B (Wilson) protein? In Fig S3, it is shown that they are expressed in liver, with PLEKHA7 exhibiting a slower migration (protein modification?). Alternatively, are they strictly involved in the regulation of ATP7A (Menkes)? Could the authors discuss this?
      7. The proposed model in Fig 7 is unclear illustrating a nucleus that consumes a lot of space while it is not involved in the proposed mechanism. Cellular proteins that are involved in the proposed mechanism should be bigger and their interactions that lead to formation of protein complexes must be better illustrated as a function of copper availability.
      8. Typo. Line 320: remove "or" and replace it by "and" : ...both PLEKHA6 and PLEKHA7 (Fig. 5A-D).
      9. Typo. Line 329: remove (Figure 6) in the title.

      Significance

      This study represents a significant advance in the copper field.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This study reveals the role of WW-PLEKHAs (PLEKHA5, 6 and 7) in the basolateral targeting of the copper (Cu) transporter ATP7A. The Authors suggest that the WW-PLEKHAs/PDZD11/ATP7A interaction directs Cu-induced trafficking of ATP7A to the basolateral surface of epithelial cells. Suppression of WW-PLEKHAs impairs basolateral delivery of ATP7A and causes increased intracellular Cu levels. By contrast, WW-PLEKHAs do not seem to participate in the retrieval of ATP7A back to the Golgi once the Cu levels return to basal values. To support these notions, the manuscript provides a substantial set of data obtained with a wide repertoire of methods. In my view, this manuscript could be of interest to a broad readership, ranging from cell biologists to medical doctors. However, further revision should address the concerns outlined below.

      Major points:

      1. The Authors claim that at basal Cu conditions ATP7A resides in the TGN regardless of PDZD11 or WW-PLEKHAs depletion (Figs. 3, 4 and Fig. S6, S7). However, colocalization with TGN marker and its quantification are not shown. Thus, the colocalization of ATP7A with TGN marker (Golgin 97 should work in all cell types) has to be shown and its quantification (Pearson coefficient) has to be provided for control and all KO cells.
      2. Along the same line, ATP7A colocalization with TGN marker and its quantification also has to be conducted for the Cu washout experiments.
      3. The authors say that upon addition of Cu ATP7A labeling was detected along lateral contacts, and near the apical and basal plasma membranes (Fig. 3B, WT). Here again "near apical" localization of ATP7A has to be clarified. This could either represent the ATP7A pool that still remains in the Golgi (which is usually close to apical surface in polarized epithelial cells) or the ATP7A pool delivered to the apical membrane of the cells. However, apical targeting of ATP7A would be odd considering previously published data that shows basolateral localization in polarized epithelial cells. Thus, the authors have to show whether "apical" ATP7A overlaps with TGN marker or with an apical marker (Gp135).
      4. PDZD11 or PLEKHA6/7 KOs lead to an ATP7A pattern, which looks like pretty large scattered vesicles that do not overlap with basolateral marker. What are these round ATP7A structures, endosomes? Colocalization assessments with EEA1 (early endosomes), VPS35 (sorting endosome) and LAMP1 (late endosomes) would be needed to clarify this. Alternatively, these vesicles could represent a fragmented Golgi with ATP7A inside. To establish this, labelling with TGN marker at these conditions is required.
      5. Biotinylation experiments. The Authors say that KO of either PDZD11, or PLEKHA7, or both PLEKHA6 and PLEKHA7, but not PLEKHA6 alone, decreased ATP7A levels at the basolateral surface of mCCD cells (Fig. 3G), while a small decrease in the basolateral levels of ATP7A is observed in PLEKHA5-KO, but not PLEKHA6-KO MDCK cells (Fig. 4G). Honestly, it is tough to see this. In Fig. 4G all ATP7A bands in the biotinylated fraction look similar. In Fig. 3G, the P11 and P6/7 KO bands of biotinylated ATP7A might be a bit less intense than in WT, while the P6 KO signal looks even more intense than WT. More convincing blots with quantification have to be provided for both figures.
      6. Along the same line. Why was apical biotinylation of ATP7A not included? It absolutely should be done to understand whether any KO induces apical mistargeting of ATP7A.
      7. Copper metabolism. The authors say that KO of either PDZD11 or PLEKHA6/7 results in higher Cu levels. What does this mean in terms of physiology and pathology? In the context of Menkes disease one has to show that this intracellular Cu increase is due to a reduction in Cu release from the cells. So, Cu release from the cells into medium has to be measured by ICP-MS or Cu64. On the other hand, it would be important to understand whether Cu accumulation in KO cells is toxic. To this end viability of KO cells should be tested in Cu dose-response experiments.
      8. How critical is WW-PLEKHAs or PDZD11 deficiency in terms of Cu metabolism? Are there genetic disorders or mouse phenotypes associated with their loss of function? If yes, do these phenotypes include any impairment of Cu metabolism?
      9. Discussion. Could the PH domains of WW-PLEKHAs be involved in their basolateral localization, thereby generating a targeting patch for ATP7A? Some publications suggest that the basolateral membrane might be enriched in specific PIPs, which in turn generates a favorable environment for some PH domains. Is this the case for the PH domains of WW-PLEKHAs?
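
      The Pearson-based colocalization quantification requested in major points 1 and 2 could be computed along the following lines. This is a minimal sketch, not the authors' analysis pipeline: it assumes the ATP7A and TGN-marker channels have already been flattened into equally sized lists of pixel intensities, and the function name and the optional region-of-interest mask are hypothetical.

```python
import math


def pearson_colocalization(channel_a, channel_b, mask=None):
    """Pearson correlation between two channels given as flat lists of pixel intensities.

    `mask` (hypothetical helper input) is an optional list of booleans selecting
    the pixels that belong to the region of interest, e.g. a single cell.
    """
    if len(channel_a) != len(channel_b):
        raise ValueError("channels must have the same number of pixels")
    if mask is not None and len(mask) != len(channel_a):
        raise ValueError("mask must have the same number of pixels as the channels")

    # Keep only the pixel pairs inside the region of interest.
    pairs = [
        (a, b)
        for i, (a, b) in enumerate(zip(channel_a, channel_b))
        if mask is None or mask[i]
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pixels to compute a correlation")

    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / math.sqrt(var_a * var_b)
```

      Values close to 1 would indicate that ATP7A intensity tracks the TGN marker, whereas values near 0 would indicate no linear relationship. Reporting the coefficient per cell for control and all KO lines would allow a statistical comparison.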

      Minor points:

      1. Fig. 6C. CFP-HA is a negative control but still gives a band (although of lower intensity). So how can one be sure that other interactions are specific? This is particularly worrying because the quantification shows a very minor (less than 1.5) increase in the intensity of bands corresponding to specific interactors.
      2. Page 11. The result section title "WW-PLEKHAs promote PDZD11 binding to ATP7A through PDZD11 (Figure 6)" does not sound right and has to be corrected.

      Significance

      Delivery of copper transporter ATP7A to the basolateral surface of epithelial cells is of great importance for maintenance of copper metabolism and, hence for human health in general. Impairment of this process in enterocytes causes fatal Menkes disease. However, the mechanisms driving basolateral targeting of ATP7A remained poorly characterized. This study provides a significant advance in our understanding of these mechanisms and opens new avenues for investigation of how WW-PLEKHAs/PDZD11-mediated targeting of ATP7A might be affected in the context of inherited disorders of copper metabolism.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Editor comments:

      Thank you for sending your manuscript entitled "In situ imaging of bacterial membrane projections and associated protein complexes using electron cryo-tomography" to Review Commons. We have now completed the peer review of the manuscript. Please find the full set of reports below.

      We thank the editors of Review Commons and all the reviewers for their insightful comments which helped us to improve our manuscript. We have now modified our manuscript based on the Reviewers’ comments and would like to ask you to consider our revised manuscript for publication.

      Reviewer #1:

      This manuscript by the Jensen lab surveys a plethora of bacterial outer-membrane projections captured over the years by in situ cryo-tomography under near-native conditions. The authors classify the different visualized structures, highlighting both similarities and differences among them. They further describe molecular complexes that are associated with these projections. The manuscript highlights the abundance of such understudied structures in nature, indicating the need to deepen our exploration into their biological functions and mechanisms of action.

      We thank the reviewer for her/his insightful comments that allowed us to improve our manuscript.

      The authors should state in the Abstract and Introduction that only diderm bacteria and outer-membrane extensions are included in the study.

      Done. We have modified the title, the abstract and the introduction to explicitly highlight this point.

      In the Introduction or Discussion the authors should mention the limits of in situ cryo-tomography, such as the difficulty of observing regions in between neighbouring bacterial cells and within the thick bacterial cell body.

      Done. We have added the following to our revised manuscript:

      “Currently, only electron cryo-tomography (cryo-ET) allows visualization of structures in a near-native state inside intact (frozen-hydrated) cells with macromolecular (~5 nm) resolution. However, this capability is limited to thin samples (few hundred nanometers thick, like individual bacterial cells of many species) while thicker samples like the central part of eukaryotic cells, thick bacterial cells, or clusters of bacterial cells are not amenable for direct cryo-ET imaging. Such thick samples can be rendered suitable for cryo-ET experiments by thinning them first using different methods including focused ion beam milling and cryosectioning [30]. Cryo-ET has already been invaluable in revealing the structures of several membrane extensions, including Shewanella oneidensis nanowires [6], Helicobacter pylori tubes [15], Delftia acidovorans nanopods [25], Vibrio vulnificus OMV chains [16], and more recently cell-cell bridges in the archaeon Haloferax volcanii [31].” (Lines 108-118)

      Please provide a legend to Table S1 explaining the numbers (organelles?), how many cells were viewed? I think that at least part of it should be included in the main text. Also, there are examples of vesicles emanating from H. pylori. This information is missing from Table S1.

      Done. We added a column to the table indicating the number of cells available for each species. We also added the information about the vesicles in H. pylori to the table. This table is now incorporated into the main text of the manuscript as Table 1.

      Please provide an ordered list including all the strains (and IDs of the specific isolates) used in this study and their genotypes.

      Done. We added Table S1 to the revised manuscript that contains this information. This table also includes relevant references to all the published papers where these strains were previously used.

      The authors describe in detail the H. pylori tubes that seem to be flagellum-core independent. However, the authors found previously (ref 15) that during infection, these structures are dependent on CagA T4SS, and they visualized T4SS sub-complexes in proximity to the point of tube emanation. This should be described and discussed in the text. Also, please indicate if the "host-independent" tubes are similarly dependent on T4SS.

      Done. We added the following to the revised manuscript:

      “The scaffolded uniform tubes of H. pylori that we observed were formed in samples not incubated with eukaryotic cells, indicating that they can also form in their absence. However, the tubes we found had closed ends and no clear lateral ports, while some of the previously-reported tubes (formed in the presence of eukaryotic host cells) had open ends and prominent ports [15]. It is possible that such features are formed only when H. pylori are in the vicinity of host cells. Moreover, while it was previously hypothesized that the formation of membrane tubes in H. pylori (when they are in the vicinity of eukaryotic cells) is dependent on the cag T4SS [15], we could not identify any clear correlation between the emanation of membrane tubes and cag T4SS particles in our samples where H. pylori was not incubated with host cells. We also show that the tubes of H. pylori are CORE-independent, indicating that they are different from the CORE-dependent nanotubes described in other species.” (Lines 303-313)

      Is there any difference in the frequency or length of the tubes in the mutants presented in Figure 4? The flgS mutant in the image exhibits a very short filament; is that typical?

      We did not see any statistically significant difference in the number or lengths of the tubes in these different mutants. We added Table S2 to the revised manuscript, which details the number of cells we visualized for each mutant and the number of tubes seen there. In all these mutants the lengths of the tubes ranged from a few tens to hundreds of nanometers. In addition, we added Fig. S2 to show more examples of these tubes in each of these mutants.

      Minor points:

      -Please check full bacterial names that are sometimes missing (e.g., lines 110-112).

      Done.

      -There is no reference to panel 2G. Please check the references to all panels.

      Done. Please see lines 154 and 183 in the main text.

      -Lines 181-184: There is no figure related to the formation of teardrop-like extensions from C. pinensis. Please review the text accordingly.

      Done. Corrected.

      -Line 235, not clear what "as these" refers to.

      Done. We modified the text as follows:

      “As these MEs/MVs from S. oneidensis were purified” (Lines 246-247)

      -Line 241, not clear what "a secretin-like complex" is, and no reference is provided.

      Done. We modified the text as follows:

      “In the third category, we observed a secretin-like complex in many tubes and vesicles of F. johnsoniae. Secretins are proteins that form a pore in the outer membrane and are associated with many secretion systems like type IV pili and type II secretion systems (T2SS) [39–41]” (Lines 252-254)

      Reviewer #1 (Significance)

      As described in this manuscript, even in model bacteria these structures are generated (e.g., Caulobacter forms the hardly studied nanopod extensions). The manuscript also provides visual categories of these structures, defining "extension types" that are likely to be used by the scientific community for years to come, similar to the initial pili classification during the 1960s-70s. It is a "descriptive study," in the positive sense of the term, as it significantly contributes to the field of bacteriology.

      We thank the reviewer for her/his kind words and enthusiasm about our work. It is an honor to have our work compared to the seminal pili classification work done in the 1960s-70s by pioneers in the field of bacteriology.

      Reviewer #2:

      The manuscript "In situ imaging of bacterial membrane projections and associated protein complexes using electron cryo-tomography" by Kaplan et al., identifies and catalogues membrane extensions (MEs) and membrane vesicles (MVs) from 13 different species using cryo-electron tomography. Furthermore, they identify and discuss several protein complexes observed in these membrane projections.

      The manuscript is beautifully written, interesting, and genuinely got this reviewer excited about the biology. I applaud the authors on their manuscript and have only minor comments and a few thoughts that the authors may wish to think on and discuss.

      We thank the reviewer for her/his kind words and insightful comments that allowed us to improve our manuscript.

      Some schematics throughout the introduction would be useful to readers new to the field/ outside the field who are not used to these different membrane structure features.

      We thank the Reviewer for this suggestion. We first made an extra figure with schematics showing the cell body and membrane tubes, but it was rather redundant with Figure 8. For this reason, we instead added explicit labels to Figure 1 highlighting the cell body and the tubes in these examples to help the reader follow that figure and the subsequent ones. However, if the Reviewer has a specific suggestion about the schematics, we would be very happy to implement it.

      The size of scale bars should be indicated on the figure panels themselves rather than in the figure legend to assist the reader.

      Done.

      In reference to lines 193-196 - what was the extracellular environment like in these micrographs? Were other cells present? Could it be the extracellular environment/surrounding cells that stimulate pearling? Have the authors considered this? Please discuss if relevant/insightful.

      This is a good point. The cells were usually plunge-frozen in their standard growth media (except for H. pylori, where the cells were resuspended in PBS and subsequently plunge-frozen). Other cells are indeed present in the sample; however, usually only one cell is present in the field of view of the tomogram, as areas with multiple cells have thick ice and are therefore not amenable to cryo-ET imaging. We added the following to the revised manuscript:

      “As usually only one (or part of a) cell is present in the cryo-tomogram, we can’t exclude that differences in the extracellular environments, like the presence of a cluster of cells in the vicinity of the individual cells with pearling tubes, might play a role in this observation” (Lines 198-201).

      "Randomly-located complexes" in this reviewer's opinion should actually be described as "seemingly randomly-located complexes" given there may be an organization present that is beyond the resolution limit of this study.

      This is a good point. Indeed, we can’t exclude that these complexes have a preferred localization in specific lipid patches that we can’t detect in our cryo-tomograms. We added the following statement to the revised manuscript:

      “These complexes, which were also found in the OM of intact cells, did not exhibit a preferred localization or regular arrangement within the tube at least within the fields of view provided by our cryo-tomograms (Fig. 5a & b).” (lines 227-230).

      In reference to lines 287-292 - is it possible this has to do with lipid composition? Have the authors considered this? Please discuss if relevant/insightful.

      Done. We added the following to the revised manuscript:

      “In addition, differences in the lipid compositions among the various species investigated here might also play a role in the formation of these different forms of projections” (Lines 299-301).

      Reviewer #2 (Significance ):

      These results advance the field by shedding new light on bacterial membrane extension morphologies. The authors use cryo-ET to catalogue membrane extensions and membrane vesicles, which has not been done before.

      This paper is likely to be of interest to structural biologists, biophysicists, membrane protein biologists, virologists and microbiologists.

      This reviewer is a single-particle cryo-EM structural biologist with interest in membrane proteins.

      We thank the reviewer for her/his enthusiasm about our work described here.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript "In situ imaging of bacterial membrane projections and associated protein complexes using electron cryo-tomography" by Kaplan et al., identifies and catalogues membrane extensions (MEs) and membrane vesicles (MVs) from 13 different species using cryo-electron tomography. Furthermore, they identify and discuss several protein complexes observed in these membrane projections.

      The manuscript is beautifully written, interesting, and genuinely got this reviewer excited about the biology. I applaud the authors on their manuscript and have only minor comments and a few thoughts that the authors may wish to think on and discuss.

      • Some schematics throughout the introduction would be useful to readers new to the field/ outside the field who are not used to these different membrane structure features.
      • The size of scale bars should be indicated on the figure panels themselves rather than in the figure legend to assist the reader.
      • In reference to lines 193-196 - what was the extracellular environment like in these micrographs? Were other cells present? Could it be the extracellular environment/surrounding cells that stimulate pearling? Have the authors considered this? Please discuss if relevant/insightful.
      • "Randomly-located complexes" in this reviewer's opinion should actually be described as "seemingly randomly-located complexes" given there may be an organization present that is beyond the resolution limit of this study.
      • In reference to lines 287-292 - is it possible this has to do with lipid composition? Have the authors considered this? Please discuss if relevant/insightful.

      Significance

      These results advance the field by shedding new light on bacterial membrane extension morphologies. The authors use cryo-ET to catalogue membrane extensions and membrane vesicles, which has not been done before.

      This paper is likely to be of interest to structural biologists, biophysicists, membrane protein biologists, virologists and microbiologists.

      This reviewer is a single-particle cryo-EM structural biologist with interest in membrane proteins.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript by the Jensen lab surveys a plethora of bacterial outer-membrane projections captured over the years by in situ cryo-tomography under near-native conditions. The authors classify the different visualized structures, highlighting both similarities and differences among them. They further describe molecular complexes that are associated with these projections. The manuscript highlights the abundance of such understudied structures in nature, indicating the need to deepen our exploration into their biological functions and mechanisms of action.

      Comments:

      1. The authors should state in the Abstract and Introduction that only diderm bacteria and outer-membrane extensions are included in the study.
      2. In the Introduction or Discussion the authors should mention the limits of in situ cryo-tomography, such as the difficulty of observing regions in between neighbouring bacterial cells and within the thick bacterial cell body.
      3. Please provide a legend to Table S1 explaining the numbers (organelles?), how many cells were viewed? I think that at least part of it should be included in the main text. Also, there are examples of vesicles emanating from H. pylori. This information is missing from Table S1.
      4. Please provide an ordered list including all the strains (and IDs of the specific isolates) used in this study and their genotypes.
      5. The authors describe in detail the H. pylori tubes that seem to be flagellum-core independent. However, the authors found previously (ref 15) that during infection, these structures are dependent on CagA T4SS, and they visualized T4SS sub-complexes in proximity to the point of tube emanation. This should be described and discussed in the text. Also, please indicate if the "host-independent" tubes are similarly dependent on T4SS.
      6. Is there any difference in the frequency or length of the tubes in the mutants presented in Figure 4? The flgS mutant in the image exhibits a very short filament; is that typical?

      Minor points:

      • Please check full bacterial names that are sometimes missing (e.g., lines 110-112).
      • There is no reference to panel 2G. Please check the references to all panels.
      • Lines 181-184: There is no figure related to the formation of teardrop-like extensions from C. pinensis. Please review the text accordingly.
      • Line 235, not clear what "as these" refers to.
      • Line 241, not clear what "a secretin-like complex" is, and no reference is provided.

      Significance

      As described in this manuscript, even in model bacteria these structures are generated (e.g., Caulobacter forms the hardly studied nanopod extensions). The manuscript also provides visual categories of these structures, defining "extension types" that are likely to be used by the scientific community for years to come, similar to the initial pili classification during the 1960s-70s. It is a "descriptive study," in the positive sense of the term, as it significantly contributes to the field of bacteriology.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Mishima et al. designed a reporter system (dubbed PACE, for Parallel Analysis of Codon Effects) to assess the effect of codon usage in regulating mRNA stability in a controlled sequence context. This reporter corresponds to a stretch of 20 repetitions of a given codon (to be tested for its effect on mRNA stability), each repetition being separated by one codon corresponding to each of the 20 canonical amino acids. This stretch is inserted at the 3' end of the coding sequence of a superfolder GFP flanked by fixed 5' and 3' untranslated regions. In vitro transcribed, capped and polyadenylated RNAs are then produced from these reporters (each with a specific stretch of repetitions of a given codon), pooled together and injected into zebrafish zygotes to monitor their relative abundance at different time points after injection.
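
      To make the tag layout concrete, the interleaved design described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' construct: the spacer codon chosen for each amino acid, the order of the spacers, and the placement of the test codon before each spacer are all hypothetical, as is the function name.

```python
# Hypothetical spacer codons: one arbitrarily chosen codon per canonical amino acid.
# The real PACE reporters may use different spacer codons and a different order.
SPACER_CODONS = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}

STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_codon_tag(test_codon, spacers=SPACER_CODONS):
    """Return a tag in which the test codon alternates with one spacer per amino acid."""
    test_codon = test_codon.upper()
    if len(test_codon) != 3 or set(test_codon) - set("ACGT"):
        raise ValueError("test codon must be three DNA bases")
    if test_codon in STOP_CODONS:
        raise ValueError("stop codons cannot be used as test codons")

    codons = []
    for spacer in spacers.values():
        # Assumed layout: test codon first, then the spacer codon.
        codons.append(test_codon)
        codons.append(spacer)
    return "".join(codons)
```

      With 20 spacers, the resulting tag contains 40 codons (120 nucleotides), 20 of which are the test codon. Holding the spacers constant across reporters is what allows differences in stability to be attributed to the test codon alone.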

      Using the PACE reporter, the authors were able to obtain a quantitative estimation of the impact of 58 out of the 61 sense codons on modulating mRNA stability. Their results are in agreement with a previous report that estimated the effect of codon usage on mRNA stability using endogenous mRNAs and an ORFeome library (Bazzini et al., 2016). However, contrary to relying on endogenous mRNAs and ORFeome reporters, the advantage of the PACE strategy is that the effect of the codon to be studied can be probed in a defined context, thus avoiding the presence of other motifs or transcript features that could also regulate mRNA stability. Similarly to results from Bazzini et al., 2016, the authors show that blocking translation completely abrogates the effect of codon usage, indicating that translation is required to drive codon-dependent mRNA degradation from their reporters. Also, the extent of codon-dependent mRNA decay is correlated with tRNA abundance and occurs through a process involving mRNA deadenylation as previously described in the zebrafish (Mishima et al., 2016 and Bazzini et al., 2016).

      Having validated their PACE protocol, the authors performed ribosome profiling to test whether ribosome occupancy on tested codons is correlated with their capacity to drive mRNA degradation. Their results indicate that, at least for polar amino acids, there is indeed an inverse correlation between ribosome occupancy at tested codons and mRNA stability, thus suggesting that slow decoding of codons due to low levels of available cognate tRNA can induce mRNA degradation. The authors further validate this finding by reducing the levels of aminoacylated tRNAAsn (corresponding to a polar amino acid) and showing that stability of the reporter RNA carrying a stretch of AAC codons (decoded by tRNAAsnGUU) is reduced. To test whether codon-dependent mRNA degradation in the context of slow ribosome decoding leads to ribosome stalling and collisions, the authors generated a mutant zebrafish strain with impaired expression of ZNF598 (an essential factor of the No-Go decay (NGD) pathway in yeast). They also integrated a known ribosome stalling sequence from hCMV (and a mutant version that does not trigger ribosome stalling) in their sfGFP reporter construct as a positive control for NGD in their assays. Their results indicate that although ZNF598 depletion impairs degradation of the hCMV reporter (as expected), it does not affect codon-dependent mRNA degradation, which appears to occur for most codons in an NGD-independent manner. Finally, through the use of a tandem ORF reporter assay separated by codon tags to be tested, the authors show that destabilizing codons do not stall ribosomes but only lead to their transient slowdown, which induces mRNA deadenylation and degradation in a ZNF598-independent manner.

      Overall, the manuscript is very well written and pleasant to read. The introduction is well documented and relevant to the study as it allows readers to place the study in the current context of the field while highlighting open questions that have not been addressed yet. The results are clearly presented, the technical approaches are elegant and the conclusions convincing.

      Below you will find some major and minor points that, in my opinion, should be addressed by the authors.

      **Major point:**

      • One interesting aspect of the PACE reporter assay is the possibility to monitor ribosome occupancy in parallel for all codon-tags tested, which the authors did in Figure 3. However, instead of using RNA-seq data to normalize ribosome footprints and obtain ribosome occupancy, the authors used an alternative normalization approach which consists, for each codon-tag, of dividing the number of ribosome footprints with test codons in the A site by the number of ribosome footprints with spacer codons in the A site. This approach is elegant and appears to work with codons corresponding to polar amino acids. However, it might have its limitations for other codons.

      Indeed, ribosome dwell times (in yeast and mammals) have been shown to respond not only to tRNA availability but also to other features such as the nature of the pair of adjacent codons and the nature of the amino acid within the exit channel (Gobet C et al., 2020 PNAS; Gamble CE et al., 2016 Cell; Pavlov MY et al., 2009 PNAS). However, based on the work of "Buschauer R et al., 2020 Science", only ribosomes lacking an accommodated tRNA at the A site are able to recruit Ccr4-Not to mediate mRNA deadenylation and degradation. Other events that increase ribosome dwell time (and thus occupancy), such as slow peptidyl-transfer, do not lead to Ccr4-Not recruitment and are resolved by eIF5A. It is therefore possible that, depending on the nature of the codon that is being tested, ribosome occupancy at test and spacer codons can be biased by the nature of codon-pairs and "dilute" the effects of tRNA availability.

      If the authors performed RNA-seq together with the ribosome profiling experiment, it might be interesting to use the RNA-seq data to calculate ribosome occupancy on "tested" and "spacer" codons to check whether, using this normalization, they do find a negative correlation between ribosome occupancy and PACE stability. A different approach would be to perform ribosome run-off experiments using harringtonine and estimate the elongation speed across the codon tag. However, I am aware that this experiment could be tedious and expensive.

      • Figure 6: Insertion of the Lys x8 AAA stretch in the tandem ORF reporter leads to a decrease in HA-DsRedEx expression compared to that of Myc-EGFP. However, results from "Juszkiewicz and Hedge, 2017" using a similar reporter in mammalian cells indicate that stretches of Lys AAA below 20 repetitions only elicit poor RQC (less than 10% of true ribosome stalling for 12 repetitions of the AAA codon). Instead, most of the loss in RFP signal results from a change in the reading frame of ribosomes due to the "slippery" translation of the poly(A) stretch. I therefore think that it could be important to perform the experiment in ZNF598 KO embryos to validate that the observed reduction in HA-dsRedEx does indeed result from stalling and RQC and not from a change in the reading frame of ribosomes.

      On a similar note, how do the authors explain the decrease in signal of the Flag-EGFP and HA-DsRedEx observed when using the Flag-EGFP with non-optimal codons? I understand that RQC occurring through NGD leads to ribosome disassembly at the stalling site and possibly mRNA cleavage (thus explaining the decrease in HA-DsRedEx signal compared to Myc-EGFP). However, I would assume that codon-mediated mRNA decay (even for an ORF of more than 200 non-optimal codons) should trigger mRNA deadenylation, followed by decapping and co-translational 5'-to-3' mRNA degradation, following the last translating ribosome. I would therefore expect not to see any change in the HA-DsRedEx/Myc-EGFP ratio even for the non-optimal Flag-EGFP reporter. Could the 200 non-optimal codons trigger some background RQC through NGD? Or could there be some ribosome drop-off? It might be interesting to test the optimal and non-optimal Flag-EGFP reporters in the ZNF598 KO background to check whether the observed decrease in the relative amount of HA-DsRedEx results from stalling-dependent RQC.

      **Minor comments:**

      • The color-coded CSC results from "Bazzini et al., 2016" presented at the bottom of panel B in figure 2 are misleading because many codons (such as PheUUU, AsnAAU, TyrUAC...) are lacking information. I have the impression that the authors used the combined data from the rCSCI (obtained from the reporter RNAs) and CSC (obtained from endogenous transcripts) corresponding to Figure 1F from Bazzini et al., 2016. This data set excluded all codons that were not concordant between the endogenous and reporter CSCs (which are those that are lacking a color code in Figure 2B from this study). However, in the scatter-plot of PACE vs. CSC (from Supplemental Figure 1D of this study), the authors used the complete set of CSC values from Bazzini et al., 2016. Could the authors please use the complete set of CSC values from Bazzini et al., 2016 to color code codons in their Figure 2B?

      • Figure 4B. The charged tRNA measurements seem to have been done in a single biological replicate (there aren't any error bars in the chart). I understand that the procedure is tedious and requires a large amount of total RNA to begin with, but it would be preferable to perform it in three biological replicates.

      • Supplementary Figure 2B. I do not understand what the figure represents. The legend is quite cryptic and states that the panel corresponds to the information content of each reading frame. More information should be given so that readers can understand how to interpret the figure and extract periodicity information.

      Reviewer #1 (Significance (Required)):

      Since the seminal work from Jeff Coller's laboratory in 2015 (Presnyak et al., 2015 Cell) showing a global and major role for codon optimality in determining mRNA half-lives in yeast, the role of codon usage in modulating translation and stability of mRNAs has been widely studied in different organisms (including zebrafish and mammals). As stated by the authors in the introduction, most studies have relied on correlation analyses between codon usage and mRNA half-lives from endogenous transcripts or from ORF libraries with fixed 5'UTR and 3'UTRs. These approaches could suffer from the presence of transcript features that can participate in other mRNA degradation pathways, which could limit their use when performing further mechanistic studies.

      The work by Mishima and collaborators presents an original reporter assay that allows one to evaluate the role of codon usage in regulating mRNA stability in a defined context, thus avoiding the impact of confounding factors that could bias the measurement of mRNA stability. Results obtained using this reporter are in good agreement with previous reports from zebrafish (Bazzini et al., 2016, and Mishima et al., 2016). From this validated reporter approach, the authors further show that codon-dependent mRNA degradation is directly related to tRNA availability and (at least partially) to ribosome occupancy (two factors already suggested as being important for codon-mediated decay in zebrafish, although they were based on correlation analyses). Furthermore, the authors show that codon-mediated mRNA decay occurs during productive mRNA translation and that it is functionally distinct from RQC induced by ribosome stalling. As a consequence, codon-mediated mRNA degradation is independent from the RQC factor ZNF598 (which they also validate for the first time as an important RQC factor in zebrafish). This information is new within metazoans, since only in yeast has it been clearly shown that codon-mediated mRNA decay is distinct from RQC induced by ribosome stalling and collisions.

      Taken together, the reported findings will be of interest to the community working on mRNA metabolism and translation. It could also interest, more broadly, scientists working on translational selection and genome evolution.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Mishima et al aim to determine if the RNA-mediated decay determined by codon optimality is part of the ribosome quality control pathway, triggered by slowed codon decoding and ribosome stalling or it is an independent pathway.

      To this end, the authors capitalize on their previous work to design a very elegant high-throughput reporter system that can analyze individually codon usage, ribosome occupancy and tRNA abundance. This reporter system, called PACE, is rigorously validated throughout the manuscript, because blocking translation with a morpholino blocking the AUG codon demonstrated that the effects on RNA stability are translation dependent.

      When most of the available codons are tested using the PACE system, the authors recapitulate codon optimality profiles similar to the ones previously uncovered using transcriptome-wide approaches.

      Thanks to the design of the reporter, which alternates repeats of a test codon with random codons, the authors can calculate how quickly a ribosome decodes the test codon on average. With this approach, the authors uncover a negative correlation between RNA stability and ribosome density on codons for polar amino acids and suggest that codon optimality is related to a slower decoding of the codons.

      With the PACE reporter validated, the authors can interrogate the system to gain mechanistic insights of codon optimality. First, they test if RNA decay and deadenylation mediated by codon optimality is determined, in part, by the levels of aminoacylated tRNAs available. The authors use a very elegant approach, as they overexpress a bacterial enzyme (AnsB) in zebrafish that degrades asparagine, effectively reducing the levels of tRNA-Asn. The authors demonstrate that AnsB turns a previously optimal Asn codon, AAC, into a non-optimal one. This effect is translated into RNA destabilization and deadenylation, but this effect is not extended to other codons encoding amino acids not affected by Asn. These results provide a direct experimental validation of the previously published observation of tRNA levels and codon optimality.

      Finally, the authors interrogate the relationship between the codon optimality pathway and the ribosome quality control pathway, which takes care of stalled ribosomes. The authors generate a zebrafish mutant of Znf598, a vertebrate homolog of the yeast protein in charge of resolving stalled ribosomes. Using a maternal-and-zygotic mutant, the authors demonstrate that in these mutants codon optimality proceeds as usual but ribosome stalling is not resolved, providing evidence for the first time that Znf598 is involved in ribosome quality control in vertebrates.

      Altogether, this manuscript presents work that builds on the previous findings of the authors and other labs but it is a qualitative leap forward rather than a marginal increment, because the body of work in the current manuscript i) establishes a reporter to dissect the mechanisms of codon optimality, ii) demonstrates that ribosome slow-down but not stalling is part of the trigger of RNA decay mediated by codon optimality, iii) demonstrates that this pathway is independent of ribosome quality control pathway and finally iv) demonstrates that vertebrate Znf598 is involved in the RNA decay mediated by ribosome stalling.

      Due to these novel findings, and the rigor of the experimental design, this manuscript should be accepted for publication. The authors should first address the following comments:

      **Major comment:**

      1. The authors very elegantly demonstrate the impact of AnsB on the stability of the RNA reporter, and it is precisely the simplicity of the reporter that allows the authors to draw clear conclusions. Nevertheless, it would be interesting to determine if the reporter results in embryos injected with AnsB also translate to endogenous genes rich in AAC codons. The authors could perform a polyA-selected RNA-Seq in embryos treated with AnsB to determine if the transcripts rich in AAC codons are destabilized compared to wild-type, thus validating the reporter results in endogenous genes.

      **Minor comments:**

      1. In figure 5J the authors plot the normalized codon tag levels of the PACE reporter run in the MZznf598 mutant. The authors color code the labels in the x-axis following the PACE results in wild-type (figure 2B). The authors should also plot the wild-type values to have a direct visual comparison of the results trend in both genotypes.
      2. The authors focus in the title on the role of Znf598 or the lack thereof in RNA decay induced by codon optimality. However, for the non-aficionados in codon-optimality, Znf598 is an unknown protein and adds little information to the title. The authors should provide a more informative title, directly pinpointing that codon-optimality is independent of the ribosome quality control pathway.

      Reviewer #2 (Significance (Required)):

      This manuscript presents work that builds on the previous findings made by the authors and other laboratories but it is a qualitative leap forward rather than a marginal increment, because the body of work in the current manuscript i) establishes a reporter to dissect the mechanisms of codon optimality, ii) demonstrates that ribosome slow-down but not stalling is part of the trigger of RNA decay mediated by codon optimality, iii) demonstrates that this pathway is independent of ribosome quality control pathway and finally iv) demonstrates that vertebrate Znf598 is involved in the RNA decay mediated by ribosome stalling.

      In addition to the conceptual findings, the authors establish a new high-throughput reporter system to evaluate the influence of codon optimality in RNA decay.

      The work is done in zebrafish embryos, an in vivo model system where codon optimality has been extensively tested by the authors and others, following the stability of reporter and endogenous genes.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Mishima et al. address a very timely topic of how the codon composition of the ORF and the associated translation elongation speed affect mRNA stability. Several studies have already shown a strong correlation between codon optimality and mRNA stability - meaning the more "optimal" the codons, the faster supposedly the elongation speed and the more stable the mRNA. Most of these studies were done by analyzing global expression data, with limited follow up, therefore being also impacted by other co-translational mRNA decay pathways and in addition these studies could also not test directly the effect of each single codon on mRNA stability. The authors took a systematic reporter-based assay approach, called PACE, which allowed them to test systematically the effect of codon composition on mRNA decay. By integrating also ribosome profiling data, the authors could nicely show that the speed of translation (measured by ribosome density) correlates with their determined mRNA stability effect of each codon and also the corresponding tRNA levels. However, interestingly this seems to be the case only for codons encoding polar amino acids, but not the ones that encode charged or non-polar amino acids. It will be very interesting to find out why that is. Finally, the authors address if some of the effects they see might be due to ribosome collisions and associated no go decay (NGD). For this they generated a Znf598 mutant by CRISPR-Cas9. Znf598 is the proposed homolog of Hel2, the protein in yeast that is essential for NGD. The authors go on to show that NGD is defective in this mutant, but that codon mediated decay, which is elongation dependent, is to a large part not dependent on Znf598.

      **All minor comments:**

      1. It is intriguing why only polar AAs show a tRNA amount specific effect in the ribosome footprint data. Some hypothesis/discussion about this could be expanded further in the discussion or results.
      2. On the same token some additional analysis might be helpful. For example, in Figure 2E, the authors group codons in weak, neutral and strong based on PACE measurements and then look at the tRNA expression range for each of the three groups. Could the authors do this also separately for the codons of polar, non-polar and charged amino acids? What do you see - still the same pattern as for all the codons or do again only polar amino acids show the trend?
      3. Can the authors elaborate on the development of their PACE system? Why is it designed the way it is? What parameters did they test? For example, why the 20 amino acids tail, did you test shorter sequences of the amino acid, spacer repeats, etc?
      4. The next few questions are a bit more of a technical nature regarding the reporter construct used for PACE. a. Did all AA pairs (Codon of interest + spacer codon) behave the same in the footprint assay? Does the data have enough information and resolution for this? b. Was the order of the spacer codons always the same in all the constructs? Could the specific order, if it is consistent, have any unseen consequences (i.e. interaction with the exit tunnel)? Did the authors test other orders? c. Are the spacer codons optimized?
      5. Are the codons affected in the NGD mutant the ones that are most different in the Bazzini data?
      6. The authors inject directly mRNA into the embryos, therefore avoiding that the reporter mRNA is ever in the nucleus. However, there could be nuclear events (e.g. loading of particular proteins) that might affect the fate of an mRNA in the cytosol, among these the translation efficiency and also stability. Maybe some comment in the discussion as to the effect of missing nuclear factors would be welcome. This is not a criticism; it would just be nice to hear the authors' thoughts on that.
      7. Page 6; final paragraph: "Finally, we compared the speed of the ribosome translating mRNA destabilizing codons to that of an aberrantly stalled ribosome." Not sure the authors did that actually. They tested the effect of ribosome slowing down on protein production and mRNA levels and compared that to stalling ribosomes, but did not compare the "speed" directly and I am not even sure what they mean by that in this context. Probably good to rephrase.
      8. Page 7, upper half: ".....by taking the positional effect of codon-mediated decay into account (Mishima and Tomari, 2016)." This is my limited knowledge of the literature, but I think you should mention what this positional effect is and not just cite a paper.
      9. Very minor, but on page 8 when PACE is introduced, the authors show the different destabilizing effects of the three Ile codons. While that is ok, in the section before, when the authors tested their construct by qRT-PCR, they focused on the two Leu codons. I would also mention them here and do a direct comparison of the qRT-PCR results with the pooled PACE result for these two codons. Based on the figure the two codons seem to behave qualitatively like expected, but I am not sure how well the quantitative behavior matches.
      10. The AnsB experiment - the authors only mention data about one of the two Asn codons (AAC), but what about the second Asn codon (AAU) - do you also see an effect on that codon upon overexpression of AnsB as well? AAU is already a quite destabilizing codon and you might not see a further increase in destabilization, but it would be great to know if there was or not.
      11. Page 13, second paragraph: More out of interest, but it is quite intriguing that GCG turned into a destabilizing codon (opposite of what one would expect if NGD would play a bit of a role). Any speculation why?
      12. Page 14, end of page and related to Figure 6C: AAU seems much more destabilizing than AAC. Therefore, I would have expected that the inserted sequence with the AAU codons would lead actually to downregulation of the mRNA and therefore the EGFP and DsRFP total protein signal relative to the construct with the AAC inserted in between, even if the ratio of EGFP/DsRed seems unchanged. However, based on the western blot in 6C the total protein levels seem very similar. Isn't that surprising? Although AAU obviously allows translation to proceed, it should still induce a stronger mRNA decay than AAC and therefore result in less total mRNA (and protein level as a consequence). Did the authors quantify the exact levels of the reporter proteins and mRNA and compare them between the two constructs?
      13. Page 15, last sentence: Somehow for me the word "transient" is a bit hard to grasp in this context. What do you mean by that - do you really mean "impermanent" or "lasting only for a short amount of time"? Don't you simply mean "weaker", "less strong"?
      14. Page 17, second sentence: I think the authors want to reference here Figure 2E and not Figure 2D.

      Reviewer #3 (Significance (Required)):

      All in all, I have to say that it was a real pleasure to read this manuscript. The authors were extremely thorough with their experiments and almost never overstated any of their conclusions. It is a very insightful story, which in my opinion will contribute greatly to the field of gene expression and posttranscriptional gene expression regulation in particular. The PACE assay, although a bit artificial, gave very clean results, which agree with the previous literature and could be very useful for future studies. Generating the Znf598 mutant and showing that the codon-dependent decay is independent from NGD is a great addition to this paper. Although it is a bit of a pity that we do not see more of a characterization of the Znf598 mutant in this paper, I do agree with the authors that this might take away a bit of the focus of this manuscript and that the mutant actually deserves its own story. I only have very minor comments/questions for the authors that they should be able to address easily. Finally, I can only repeat myself by saying: congrats on this great paper and I fully support publication.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Mishima et al. address a very timely topic of how the codon composition of the ORF and the associated translation elongation speed affect mRNA stability. Several studies have already shown a strong correlation between codon optimality and mRNA stability - meaning the more "optimal" the codons, the faster supposedly the elongation speed and the more stable the mRNA. Most of these studies were done by analyzing global expression data, with limited follow up, therefore being also impacted by other co-translational mRNA decay pathways and in addition these studies could also not test directly the effect of each single codon on mRNA stability. The authors took a systematic reporter-based assay approach, called PACE, which allowed them to test systematically the effect of codon composition on mRNA decay. By integrating also ribosome profiling data, the authors could nicely show that the speed of translation (measured by ribosome density) correlates with their determined mRNA stability effect of each codon and also the corresponding tRNA levels. However, interestingly this seems to be the case only for codons encoding polar amino acids, but not the ones that encode charged or non-polar amino acids. It will be very interesting to find out why that is. Finally, the authors address if some of the effects they see might be due to ribosome collisions and associated no go decay (NGD). For this they generated a Znf598 mutant by CRISPR-Cas9. Znf598 is the proposed homolog of Hel2, the protein in yeast that is essential for NGD. The authors go on to show that NGD is defective in this mutant, but that codon mediated decay, which is elongation dependent, is to a large part not dependent on Znf598.

      All minor comments:

      1. It is intriguing why only polar AAs show a tRNA amount specific effect in the ribosome footprint data. Some hypothesis/discussion about this could be expanded further in the discussion or results.
      2. On the same token some additional analysis might be helpful. For example, in Figure 2E, the authors group codons in weak, neutral and strong based on PACE measurements and then look at the tRNA expression range for each of the three groups. Could the authors do this also separately for the codons of polar, non-polar and charged amino acids? What do you see - still the same pattern as for all the codons or do again only polar amino acids show the trend?
      3. Can the authors elaborate on the development of their PACE system? Why is it designed the way it is? What parameters did they test? For example, why the 20 amino acids tail, did you test shorter sequences of the amino acid, spacer repeats, etc?
      4. The next few questions are a bit more of a technical nature regarding the reporter construct used for PACE. a. Did all AA pairs (Codon of interest + spacer codon) behave the same in the footprint assay? Does the data have enough information and resolution for this? b. Was the order of the spacer codons always the same in all the constructs? Could the specific order, if it is consistent, have any unseen consequences (ie. interaction with the exit tunnel)? Did the authors test other orders? c. Are the spacer codons optimized?
      5. Are the codons affected in the NGD mutant the ones that are most different in the Bazzini data?
      6. The authors inject directly mRNA into the embryos, therefore avoiding that the reporter mRNA is ever in the nucleus. However, there could be nuclear events (e.g. loading of particular proteins) that might affect the fate of an mRNA in the cytosol, among these the translation efficiency and also stability. Maybe some comment in the discussion as to the effect of missing nuclear factors would be welcome. This is not a criticism; it would just be nice to hear the authors' thoughts on that.
      7. Page 6; final paragraph: "Finally, we compared the speed of the ribosome translating mRNA destabilizing codons to that of an aberrantly stalled ribosome." Not sure the authors did that actually. They tested the effect of ribosome slowing down on protein production and mRNA levels and compared that to stalling ribosomes, but did not compare the "speed" directly and I am not even sure what they mean by that in this context. Probably good to rephrase.
      8. Page 7, upper half: ".....by taking the positional effect of codon-mediated decay into account (Mishima and Tomari, 2016)." This is my limited knowledge of the literature, but I think you should mention what this positional effect is and not just cite a paper.
      9. Very minor, but on page 8 when PACE is introduced, the authors show the different destabilizing effects of the three Ile codons. While that is ok, in the section before, when the authors tested their construct by qRT-PCR, they focused on the two Leu codons. I would also mention them here and do a direct comparison of the qRT-PCR results with the pooled PACE result for these two codons. Based on the figure the two codons seem to behave qualitatively like expected, but I am not sure how good the quantitative behavior matches.
      10. The AnsB experiment - the authors only mention data about one of the two Asn codons (AAC), but what about the second Asn codon (AAU) - do you also see an effect on that codon upon overexpression of AnsB as well? AAU is already a quite destabilizing codon and you might not see a further increase in destabilization, but it would be great to know if there was or not.
      11. Page 13, second paragraph: More out of interest, but it is quite intriguing that GCG turned into a destabilizing codon (opposite of what one would expect if NGD would play a bit of a role). Any speculation why?
      12. Page 14, end of page and related to Figure 6C: AAU seems much more destabilizing than AAC. Therefore, I would have expected that the inserted sequence with the AAU codons would lead actually to downregulation of the mRNA and therefore the EGFP and DsRFP total protein signal relative to the construct with the AAC inserted in between, even if the ratio of EGFP/DsRed seems unchanged. However, based on the western blot in 6C the total protein levels seem very similar. Isn't that surprising? Although AAU obviously allows translation to proceed, it should still induce a stronger mRNA decay than AAC and therefore result in less total mRNA (and protein level as a consequence). Did the authors quantify the exact levels of the reporter proteins and mRNA and compare them between the two constructs?
      13. Page 15, last sentence: Somehow for me the word "transient" is a bit hard to grasp in this context. What do you mean by that - do you really mean "impermanent" or "lasting only for a short amount of time"? Don't you simply mean "weaker", "less strong"?
      14. Page 17, second sentence: I think the authors want to reference here Figure 2E and not Figure 2D.

      Significance

      All in all, I have to say that it was a real pleasure to read this manuscript. The authors were extremely thorough with their experiments and almost never overstated any of their conclusions. It is a very insightful story, which in my opinion will contribute greatly to the field of gene expression and posttranscriptional gene expression regulation in particular. The PACE assay, although a bit artificial, gave very clean results, which agree with the previous literature and could be very useful for future studies. Generating the Znf598 mutant and showing that the codon-dependent decay is independent from NGD is a great addition to this paper. Although it is a bit of a pity that we do not see more of a characterization of the Znf598 mutant in this paper, I do agree with the authors that this might take away a bit of the focus of this manuscript and that the mutant actually deserves its own story. I only have very minor comments/questions for the authors that they should be able to address easily. Finally, I can only repeat myself by saying: congrats on this great paper and I fully support publication.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Mishima et al aim to determine if the RNA-mediated decay determined by codon optimality is part of the ribosome quality control pathway, triggered by slowed codon decoding and ribosome stalling or it is an independent pathway.

      To this end, the authors capitalize on their previous work to design a very elegant high-throughput reporter system that can analyze individually codon usage, ribosome occupancy and tRNA abundance. This reporter system, called PACE, is rigorously validated throughout the manuscript, because blocking translation with a morpholino blocking the AUG codon demonstrated that the effects on RNA stability are translation dependent.

      When most of the available codons are tested using the PACE system, the authors recapitulate codon optimality profiles similar to the ones previously uncovered using transcriptome-wide approaches.

      Thanks to the design of the reporter, which alternates repeats of a test codon with random codons, the authors can calculate how quickly a ribosome decodes the test codon on average. With this approach, the authors uncover a negative correlation between RNA stability and ribosome density on codons for polar amino acids and suggest that codon optimality is related to a slower decoding of the codons.

      With the PACE reporter validated, the authors can interrogate the system to gain mechanistic insights of codon optimality. First, they test if RNA decay and deadenylation mediated by codon optimality is determined, in part, by the levels of aminoacylated tRNAs available. The authors use a very elegant approach, as they overexpress a bacterial enzyme (AnsB) in zebrafish that degrades asparagine, effectively reducing the levels of tRNA-Asn. The authors demonstrate that AnsB turns a previously optimal Asn codon, AAC, into a non-optimal one. This effect is translated into RNA destabilization and deadenylation, but this effect is not extended to other codons encoding amino acids not affected by Asn. These results provide a direct experimental validation of the previously published observation of tRNA levels and codon optimality.

      Finally, the authors interrogate the relationship between the codon optimality pathway and the ribosome quality control pathway, which takes care of stalled ribosomes. The authors generate a zebrafish mutant of Znf598, a vertebrate homolog of the yeast protein in charge of resolving stalled ribosomes. Using a maternal-and-zygotic mutant, the authors demonstrate that in these mutants codon optimality proceeds as usual but ribosome stalling is not resolved, providing evidence for the first time that Znf598 is involved in ribosome quality control in vertebrates.

      Altogether, this manuscript presents work that builds on the previous findings of the authors and other labs but it is a qualitative leap forward rather than a marginal increment, because the body of work in the current manuscript i) establishes a reporter to dissect the mechanisms of codon optimality, ii) demonstrates that ribosome slow-down but not stalling is part of the trigger of RNA decay mediated by codon optimality, iii) demonstrates that this pathway is independent of ribosome quality control pathway and finally iv) demonstrates that vertebrate Znf598 is involved in the RNA decay mediated by ribosome stalling.

      Due to these novel findings, and the rigor of the experimental design, this manuscript should be accepted for publication. The authors should first address the following comments:

      Major comment:

      1. The authors very elegantly demonstrate the impact of AnsB on the stability of the RNA reporter, and it is precisely the simplicity of the reporter that allows the authors to draw clear conclusions. Nevertheless, it would be interesting to determine if the reporter results in embryos injected with AnsB also translate to endogenous genes rich in AAC codons. The authors could perform a polyA-selected RNA-Seq in embryos treated with AnsB to determine if the transcripts rich in AAC codons are destabilized compared to wild-type, thus validating the reporter results in endogenous genes.

      Minor comments:

      1. In figure 5J the authors plot the normalized codon tag levels of the PACE reporter run in the MZznf598 mutant. The authors color code the labels in the x-axis following the PACE results in wild-type (figure 2B). The authors should also plot the wild-type values to have a direct visual comparison of the results trend in both genotypes.
      2. The authors focus in the title on the role of Znf598, or the lack thereof, in RNA decay induced by codon optimality. However, for the non-aficionados in codon optimality, Znf598 is an unknown protein and adds little information to the title. The authors should provide a more informative title, directly pinpointing that codon optimality is independent of the ribosome quality control pathway.

      Significance

      This manuscript presents work that builds on the previous findings made by the authors and other laboratories but it is a qualitative leap forward rather than a marginal increment, because the body of work in the current manuscript i) establishes a reporter to dissect the mechanisms of codon optimality, ii) demonstrates that ribosome slow-down but not stalling is part of the trigger of RNA decay mediated by codon optimality, iii) demonstrates that this pathway is independent of ribosome quality control pathway and finally iv) demonstrates that vertebrate Znf598 is involved in the RNA decay mediated by ribosome stalling.

      In addition to the conceptual findings, the authors establish a new high-throughput reporter system to evaluate the influence of codon optimality in RNA decay.

      The work is done in zebrafish embryos, an in vivo model system where codon optimality has been extensively tested by the authors and others, following the stability of reporter and endogenous genes.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Mishima et al., designed a reporter system (dubbed PACE, for Parallel Analysis of Codon Effects) to assess the effect of codon usage in regulating mRNA stability in a controlled sequence context. This reporter corresponds to a stretch of 20 repetitions of a given codon (to be tested for its effect on mRNA stability), each repetition being separated by one codon corresponding to each of the 20 canonical amino acids. This stretch is inserted at the 3' end of the coding sequence of a superfolder GFP flanked with fixed 5' and 3' untranslated regions. In vitro transcribed capped and polyadenylated RNAs are then produced from these reporters (each with a specific stretch of repetitions of a given codon), pooled together and injected into zebrafish zygotes to monitor their relative abundance at different time points upon injection.

      Using the PACE reporter, the authors were able to obtain a quantitative estimation of the impact of 58 out of the 61 sense codons on modulating mRNA stability. Their results are in agreement with a previous report that estimated the effect of codon usage on mRNA stability using endogenous mRNAs and an ORFeome library (Bazzini et al., 2016). However, contrary to relying on endogenous mRNAs and ORFeome reporters, the advantage of the PACE strategy is that the effect of the codon to be studied can be probed in a defined context, thus avoiding the presence of other motifs or transcript features that could also regulate mRNA stability. Similarly to results from Bazzini et al., 2016, the authors show that blocking translation completely abrogates the effect of codon usage, indicating that translation is required to drive codon-dependent mRNA degradation from their reporters. Also, the extent of codon-dependent mRNA decay is correlated with tRNA abundance and occurs through a process involving mRNA deadenylation as previously described in the zebrafish (Mishima et al., 2016 and Bazzini et al., 2016). Having validated their PACE protocol, the authors performed ribosome profiling to test whether ribosome occupancy on tested codons is correlated with their capacity to drive mRNA degradation. Their results indicate that, at least for polar amino acids, there is indeed an inverse correlation between ribosome occupancy at tested codons and mRNA stability, thus suggesting that slow decoding of codons due to low levels of available cognate tRNA can induce mRNA degradation. The authors further validate this finding by reducing the levels of aminoacylated tRNAAsn (corresponding to a polar amino acid) and showing that stability of the reporter RNA carrying a stretch of AAC codons (decoded by tRNAAsnGUU) is reduced.
      To test whether codon-dependent mRNA degradation in the context of slow ribosome decoding leads to ribosome stalling and collisions, the authors generated a mutant zebrafish strain with impaired expression of ZNF598 (an essential factor of the No-Go decay (NGD) pathway in yeast). They also integrated a known ribosome stalling sequence from hCMV (and a mutant version that does not trigger ribosome stalling) in their sfGFP reporter construct as a positive control for NGD in their assays. Their results indicate that although ZNF598 depletion impairs degradation of the hCMV reporter (as expected), it does not affect codon-dependent mRNA degradation, which appears to occur for most codons in an NGD-independent manner. Finally, through the use of a tandem ORF reporter assay separated by codon tags to be tested, the authors show that destabilizing codons do not stall ribosomes but only lead to their transient slowdown, which induces mRNA deadenylation and degradation in a ZNF598-independent manner.

      Overall, the manuscript is very well written and pleasant to read. The introduction is well documented and relevant to the study as it allows readers to place the study in the current context of the field while highlighting open questions that have not been addressed yet. The results are clearly presented, the technical approaches are elegant and the conclusions convincing.

      Below you will find some major and minor points that, in my opinion, should be addressed by the authors.

      Major point:

      • One interesting aspect of the PACE reporter assay is the possibility to monitor ribosome occupancy in parallel for all codon-tags tested, which the authors did in Figure 3. However, instead of using RNA-seq data to normalize ribosome footprints and obtain ribosome occupancy, the authors used an alternative normalization approach that consists, for each codon-tag, of dividing the number of ribosome footprints with test codons in the A site by the number of ribosome footprints with spacer codons in the A site. This approach is elegant and appears to work with codons corresponding to polar amino acids. However, it might have its limitations for other codons.

      Indeed, ribosome dwell times (in yeast and mammals) have been shown to respond both to tRNA availability but also to other features such as the nature of the pair of adjacent codons, and the nature of the amino acid within the exit channel (Gobet C et al., 2020 PNAS; Gamble CE et al., 2016 Cell; Pavlov MY et al., 2009 PNAS). However, based on the work of "Buschauer R et al., 2020 Science", only ribosomes lacking an accommodated tRNA at the A site are able to recruit Ccr4-Not to mediate mRNA deadenylation and degradation. Other events that increase ribosome dwell time (and thus occupancy), such as slow peptidyl-transfer, do not lead to Ccr4-Not recruitment and are resolved by eIF5A. It is therefore possible that depending on the nature of the codon that is being tested, ribosome occupancy at test and spacer codons can be biased by the nature of codon-pairs and "dilute" the effects of tRNA availability.

      If the authors performed RNA-seq together with the ribosome profiling experiment, it might be interesting to use the RNA-seq data to calculate ribosome occupancy on "tested" and "spacer" codons to check whether, using this normalization, they do find a negative correlation between ribosome occupancy and PACE stability. A different approach would be to perform ribosome run-off experiments using harringtonine and estimate the elongation speed across the codon tag. However, I am aware that this experiment could be tedious and expensive.
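      The two normalizations contrasted in this point can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the tag layout (test codons at even indices, spacer codons at odd indices of a 40-codon tag) and the function names `a_site_ratio` and `rna_normalized_occupancy` are hypothetical.

```python
from collections import Counter

# Hypothetical layout: the codon tag is indexed 0..39, with test codons at
# even indices and spacer codons at odd indices (20 of each).
TEST_POSITIONS = range(0, 40, 2)
SPACER_POSITIONS = range(1, 40, 2)


def a_site_ratio(a_site_indices, test=TEST_POSITIONS, spacer=SPACER_POSITIONS):
    """Footprints with a test codon in the A site divided by
    footprints with a spacer codon in the A site."""
    counts = Counter(a_site_indices)
    test_count = sum(counts[i] for i in test)
    spacer_count = sum(counts[i] for i in spacer)
    if spacer_count == 0:
        raise ValueError("no footprints on spacer codons; ratio undefined")
    return test_count / spacer_count


def rna_normalized_occupancy(footprint_count, rna_seq_count):
    """Alternative normalization: footprints divided by RNA-seq reads
    for the same reporter."""
    if rna_seq_count == 0:
        raise ValueError("no RNA-seq reads; occupancy undefined")
    return footprint_count / rna_seq_count


# Toy input: each entry is the tag index of the codon in the A site
# of one footprint (made-up values, for illustration only).
toy_footprints = [0, 0, 2, 1, 3, 5]
ratio = a_site_ratio(toy_footprints)
```

      The first function is internally normalized, so it cannot separate a slow test codon from a fast spacer codon; the second depends on an external RNA-seq measurement but treats test and spacer positions independently.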

      • Figure 6: Insertion of the Lys x8 AAA stretch in the tandem ORF reporter leads to a decrease in HA-DsRedEx expression compared to that of Myc-EGFP. However, results from "Juszkiewicz and Hegde, 2017" using a similar reporter in mammalian cells indicate that stretches of Lys AAA below 20 repetitions only elicit poor RQC (less than 10% of true ribosome stalling for 12 repetitions of the AAA codon). Instead, most of the loss in RFP signal results from a change in the reading frame of ribosomes due to the "slippery" translation of the poly(A) stretch. I therefore think that it could be important to perform the experiment in ZNF598 KO embryos to validate that the observed reduction in HA-DsRedEx does indeed result from stalling and RQC and not from a change in the reading frame of ribosomes. On a similar note, how do the authors explain the decrease in signal of the Flag-EGFP and HA-DsRedEx observed when using the Flag-EGFP with non-optimal codons? I understand that RQC occurring through NGD leads to ribosome disassembly at the stalling site and possibly mRNA cleavage (thus explaining the decrease in HA-DsRedEx signal compared to Myc-EGFP). However, I would assume that codon-mediated mRNA decay (even for an ORF longer than 200 non-optimal codons) should trigger mRNA deadenylation, followed by decapping and co-translational 5'-to-3' mRNA degradation, following the last translating ribosome. I would therefore expect not to see any change in the HA-DsRedEx/Myc-EGFP ratio even for the non-optimal Flag-EGFP reporter. Could the 200 non-optimal codons trigger some background RQC through NGD? Or could there be some ribosome drop-off? It might be interesting to test the optimal and non-optimal Flag-EGFP reporters in the ZNF598 KO background to check whether the observed decrease in the relative amount of HA-DsRedEx results from stalling-dependent RQC.

      Minor comments:

      • The color-coded CSC results from "Bazzini et al., 2016" presented at the bottom of panel B in figure 2 are misleading because many codons (such as PheUUU, AsnAAU, TyrUAC...) are lacking information. I have the impression that the authors used the combined data from the rCSCI (obtained from the reporter RNAs) and CSC (obtained from endogenous transcripts) corresponding to Figure 1F from Bazzini et al., 2016. This data set excluded all codons that were not concordant between the endogenous and reporter CSCs (which are those that are lacking a color code in Figure 2B from this study). However, in the scatter-plot of PACE vs. CSC (from Supplemental Figure 1D of this study), the authors used the complete set of CSC values from Bazzini et al., 2016. Could the authors please use the complete set of CSC values from Bazzini et al., 2016 to color code codons in their Figure 2B?
      • Figure 4B. The charged tRNA measurements seem to have been done in a single biological replicate (there aren't any error bars in the chart). I understand that the procedure is tedious and requires a large amount of total RNA to begin with, but it would be preferable to perform it in three biological replicates.
      • Supplementary Figure 2B. I do not understand what the figure represents. The legend is quite cryptic and states that the panel corresponds to the information content of each reading frame. More information should be given so that readers can understand how to interpret the figure and extract periodicity information.

      Significance

      Since the seminal work from Jeff Coller's laboratory in 2015 (Presnyak et al., 2015 Cell) showing a global and major role for codon optimality in determining mRNA half-lives in yeast, the role of codon usage in modulating translation and stability of mRNAs has been widely studied in different organisms (including zebrafish and mammals). As stated by the authors in the introduction, most studies have relied on correlation analyses between codon usage and mRNA half-lives from endogenous transcripts or from ORF libraries with fixed 5'UTR and 3'UTRs. These approaches could suffer from the presence of transcript features that can participate in other mRNA degradation pathways, which could limit their use when performing further mechanistic studies.

      The work by Mishima and collaborators presents an original reporter assay that makes it possible to evaluate the role of codon usage in regulating mRNA stability in a defined context, thus avoiding the impact of confounding factors that could bias the measurement of mRNA stability. Results obtained using this reporter are in good agreement with previous reports from zebrafish (Bazzini et al., 2016, and Mishima et al., 2016). From this validated reporter approach, the authors further show that codon-dependent mRNA degradation is directly related to tRNA availability and (at least partially) to ribosome occupancy (two factors already suggested as being important for codon-mediated decay in zebrafish, although they were based on correlation analyses). Furthermore, the authors show that codon-mediated mRNA decay occurs during productive mRNA translation and that it is functionally distinct from RQC induced by ribosome stalling. As a consequence, codon-mediated mRNA degradation is independent from the RQC factor ZNF598 (which they also validate for the first time as an important RQC factor in zebrafish). This information is new within metazoans since only in yeast has it been clearly shown that codon-mediated mRNA decay is distinct from RQC induced by ribosome stalling and collisions.

      Taken together, the reported findings will be of interest to the community working on mRNA metabolism and translation. It could also interest, more broadly, scientists working on translational selection and genome evolution.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their thoughtful comments. We were delighted the reviewers found our results “compelling”, “striking”, “well presented”, “implications exciting”, “excellent results! really nice!”, “this microscopy is beautiful!” and “translational-dependence (of mRNA localization) in a transcript-specific way without perturbing translation globally”, which is a “complete surprise, and opens exciting doors to investigate how translation leads to mRNA organization and its connection to tissue development” and “may represent a new pathway of mRNA transport”.

      We also appreciated the comments regarding the “wide appeal”, “broad readership of readers”, and “broad interest” the reviewers gave to our manuscript regarding its impact, and also the comments of “well-written (and) well-cited”.

      We can address all the concerns raised by the reviewers. In addition to textual changes, we will add the following to the Results section:

      1. Additional quantitation of smFISH beyond Figure 2;
      2. Addition of a negative (uniformly distributed) mRNA control and its quantitation;
      3. Western blots for our ΔATG lines to determine what and how much protein is made.
      4. Unbiased nuclear masking.

      Our specific responses are shown below, in blue.

      Reviewer #1

      **Major comments**

      Fig. 1: Main and supplementary figures present smFISH signals for eight localized mRNAs, while in the results section authors describe that they analyzed twenty-five transcripts. Authors should explain the choice of transcripts presented in the paper.

      We will include a panel in Fig. S1E to show every mRNA that we tested, and we will edit Table 1 to describe the observed subcellular localization.

      We will edit the text, adding a few sentences to clarify, along the lines of: “Our survey revealed mRNAs with varying degrees of localization within epithelia that we divided into three classes: CeAJ/membrane localized, perinuclearly localized, and unlocalized (Fig. 1 and S1 and Table 1).” and “The rest of our tested mRNAs did not possess any evident subcellular localization at any of the analyzed embryonic stages/tissues and were not further investigated (Fig. S1E and Table 1).”

      Moreover, smFISH signal of different localized mRNAs in epidermal cells was visualized at different stages (bean, comma or late comma), and authors did not comment what was the reason of such conditions. This may make transcripts localization results difficult to interpret, as further analysis showed that mRNA localization varied in a stage-specific manner.

      We have clarified this point now in Figure legend 1: “Specific embryonic stages were selected for each transcript based on the highest degree of mRNA localization they exhibited.”

      Did the authors use smFISH probes designed against endogenous mRNAs for all tested transcripts?

      We did not. We clarify this point now in Materials and methods: “All probes were designed against the endogenous mRNA sequences except dlg-1 (some constructs), pkc-3, hmp-2, spc-1, let-805, and vab-10a, whose mRNA were detected with gfp probes in their corresponding transgenic lines (Table S2). An exception to this is Fig. S1A where we used probes against the endogenous dlg-1 mRNA.”.

      Marking dlg-1 mRNA as dlg-1-gfp suggests that smFISH probe was specific for gfp transcript. Is it true? If yes, authors should compare localization of wild-type endogenous dlg-1 mRNA with that of the transcript encoding a fusion protein, to confirm that fusion does not affect mRNA localization.

      Yes, in Fig. 1C we show smFISH for GFP (i.e., the tagged dlg-1 only). In Fig. S1A, we show smFISH against endogenous dlg-1. Tagged and endogenous dlg-1 mRNAs are both localized. We clarified this point in the main text: “Five of these transcripts were enriched at specific loci at or near the cell membrane: laterally and at the CeAJ for dlg-1 (Fig. 1C for endogenous/GFP CRISPR-tagged dlg-1::gfp mRNA and S1A for endogenous/non-tagged dlg-1 mRNA), (…)”. And in the Supplemental figure legend (Fig. S1A): “Endogenous/non-tagged dlg-1 mRNA shows CeAJ/membrane localization like its endogenous/GFP CRISPR-tagged counterpart.”

      Fig. 2B: Authors conclude that at later stages of pharyngeal morphogenesis mRNA enrichment at the CeAJ decreased gradually in comparison to comma stage. Data do not show statistically significant decrease in ratio of localized mRNAs - for dlg-1: bean: 0.39±0.09, comma: 0.29±0.08, 1.5-fold: 0.30±0.09; for ajm-1: bean: 0.36±0.08, comma: 0.30±0.05, 1.5-fold: 0.28±0.09.

      t-test (one-tailed) analysis revealed a significant difference between bean and comma stages for both dlg-1 and ajm-1 mRNAs. Statistical analysis and data will be provided.
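      For readers who want to see what a one-tailed comparison of two stages involves, a minimal sketch from summary statistics follows. It is an illustration only and rests on several assumptions not stated in the exchange above: Welch's unequal-variance form of the t-test is used, the reported "±" values are treated as standard deviations, and the sample size `n = 10` per stage is a placeholder. The function names are hypothetical, and the p-value is obtained by numerically integrating the Student t density with the standard library.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on the density from 0 to |t|.

    `steps` must be even.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # integral of the density over [0, |t|]
    upper = 0.5 - area            # upper-tail probability beyond |t|
    return upper if t >= 0 else 1 - upper


# dlg-1, bean vs comma, using the values quoted by the reviewer.
# n = 10 is a PLACEHOLDER, not a number taken from the manuscript.
n = 10
t_stat, dof = welch_t(0.39, 0.09, n, 0.29, 0.08, n)
p_value = one_tailed_p(t_stat, dof)
```

      The resulting p-value depends strongly on the true sample sizes and on whether the reported spread is a standard deviation or a standard error, which is why both need to be stated alongside the test.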

      Fig. 4: What was the difference between the first and the second ΔATG transgenic line? Authors should analyze the size of the truncated DLG-1 protein that is expressed from the second ΔATG transgenic line that localizes to CeAJ. Knowing alternative ATGs and protein size may suggest domain composition of the truncated protein. This will allow to confront truncated protein localization with the results from.

      We will perform a Western blot to determine the size and levels of proteins produced.

      Fig. 5. Moreover, to prove that the localization of dlg-1 mRNA at the CeAJ is translation-dependent, additional experiment should be performed where transcripts localization will be analyzed in embryos treated with translation inhibitors such as cycloheximide (translation elongation inhibitor) and puromycin (that induces premature termination).

      We believe this comment might refer to Fig. 4. If this is the case: drugs like cycloheximide and puromycin affect the translation of the whole transcriptome, whereas with our ΔATG experiment, we aimed to target the translation of one specific transcript and avoid secondary effects. Nevertheless, we understand Reviewer #1’s concern and will include a second experiment. In our hands, cycloheximide and puromycin have never worked in older embryos (it’s hard to get past the eggshell and into the embryo). Instead, we will use stress conditions, which induce a “ribosome drop-off” (Spriggs et al., 2010). Heat stress has been shown to decrease polysome occupancy (Arnold et al., 2014). We, therefore, have used heat-shock at 33°C for 30’, and the results are now shown in Fig. S4. These show the loss of RNA localization upon heat shock.

      **Minor comments**

      In the introduction section authors should emphasize the main goal and scientific significance of the paper.

      We added these sentences to state the significance before summarizing the results: “To investigate the impact of mRNA localization during embryonic development, we conducted a single molecule fluorescence in situ hybridization (smFISH)-based survey (…)” and “Our data demonstrate that the dlg-1 UTRs are dispensable, whereas translation is required for localization, therefore providing an example of a translation-dependent mechanism for mRNA delivery in C. elegans.”

      Fig 1A: It's hard to distinguish different colors on the schematics. Schematics presents intermediate filaments that are not included in the Table 1.

      We modified Table 1 based on this and other reviewers’ comments.

      Fig. 1C: dlg-1 transcript is marked as dlg-1-gfp on the left panel and dlg-1 on the right panel.

      Corrected.

      Fig. 2B: Axis labels and titles are not visible, larger font size should be used.

      We will modify the graph (following Reviewer #2’s suggestion) and axes label and title sizes will be taken into account.

      Fig. 5C: Enlarge the font size.

      Will do.

      Fig. S2: Embryonic stages should be marked on the figure for easier interpretation.

      Added.

      Reviewer #2

      Major comments

      Figure 2 requires a negative (or uniformly distributed) mRNA control for comparison. Figure 2C should be quantified. The plot quality should be improved, and appropriate statistical tests should be employed to strengthen the claimed findings.

      We will add a negative control (jac-1 mRNA), and quantify Fig. 2C as well. Plots will be changed according to the suggestion.

      Most claims of perinuclear mRNA localization are difficult to see and not well supported visually or statistically. The usage of DAPI markers, membrane markers, 3D rendering, or a quantified metric would bolster this claim. Also, sax-7 is claimed to be perinuclear and elsewhere claimed to be uniform then used as a uniform control. Please explain or resolve these discrepancies more clearly.

      Regarding perinuclear mRNAs:

      We are not trying to make a big statement out of these data as perinuclear (ER) localization of mRNAs coding for transmembrane/secreted proteins is well known. The aim of our study was to describe transcripts localized at or in the proximity of the junction. However, we thought it was worth mentioning these examples of perinuclearly localized mRNAs (hmr-1, sax-7, and eat-20) for two reasons: scientific correctness – showing accessory results that might be interesting for other scientists – and use as positive controls for our smFISH survey – these mRNAs were expected to localize perinuclearly for the reasons mentioned above. We will rewrite the text to make these points clearer.

      Regarding sax-7 mRNA:

      sax-7 mRNA localizes perinuclearly in sporadic instances (Fig S1C), but it is predominantly scattered throughout the cytoplasm (i.e., unlocalized). It presumably localizes perinuclearly in a translation-dependent manner as sax-7 codes for a transmembrane protein that would be targeted to the ER. We have described this ER-type of localization in the introduction and reiterated it partially in the first paragraph of the results. sax-7 UTRs are therefore presumably not responsible for subcellular localization, which would instead depend on a signal sequence. We will better clarify this point in the main text.

      The major concern about the paper is the data display and interpretation of Figure 5C. I'm not comfortable with the approach the authors took of blurring out the nucleus. A more faithful practice would be to use an automated mask over DAPI staining or to quantify the entirety of the cell. If the entirety of the cell were quantified, one could still focus analysis on specific regions of relevance. The interpretations distinguishing membrane versus cytoplasmic localization (or mislocalization) are hard to differentiate in these images especially since they are lacking a membrane marker. The ability to make these distinctions forms the basis of Tocchini et al's two pathways of dlg-1 mRNA localization. These interpretations also heavily rely on how the image was processed through the different Z-stacks, and it's not clear to me how that was done. For example, the diffusion of mRNA in figure 5F and 5I are indistinguishable to my eye but are claimed to be different.

      In the images, the nuclei have been blurred to allow the reader to focus on the cytoplasmic signal and not on the nuclear (transcriptional) signal as it is not meaningful for this study. In the quantitation, the nuclear signal has been unbiasedly and specifically removed from the analysis by cropping out the DNA signal from the other channels. The frontal plane views of the seam cells in Fig. 5 show maximum intensity projections (MIPs) of 3 Z-stacks (0.54 µm total) that each contain nuclei and, therefore, the transcriptional signal (schematics in Fig. 5B). We will clarify these points in the text.
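      The two image operations described in this reply, removing the nuclear signal using the DNA channel and collapsing a few Z planes into a maximum intensity projection, can be sketched as follows. This is a toy illustration on nested lists, not the script used for the figures: the threshold value, the order of operations (mask first, then project) and the function names are assumptions.

```python
def mask_nuclei(signal_plane, dna_plane, threshold):
    """Set smFISH pixels to 0 wherever the DNA channel is at or above threshold."""
    return [
        [0 if dna >= threshold else sig for sig, dna in zip(sig_row, dna_row)]
        for sig_row, dna_row in zip(signal_plane, dna_plane)
    ]


def max_intensity_projection(planes):
    """Collapse equally sized 2D planes into one plane by per-pixel maximum."""
    return [
        [max(pixels) for pixels in zip(*rows)]
        for rows in zip(*planes)
    ]


# Toy 2x2 images, two Z planes each (made-up intensities).
smfish_stack = [
    [[1, 5], [2, 0]],
    [[3, 1], [9, 4]],
]
dna_stack = [
    [[0, 10], [0, 0]],
    [[0, 10], [10, 0]],
]

# Hypothetical threshold separating nuclear from non-nuclear pixels.
DNA_THRESHOLD = 5

masked = [
    mask_nuclei(sig, dna, DNA_THRESHOLD)
    for sig, dna in zip(smfish_stack, dna_stack)
]
projection = max_intensity_projection(masked)
```

      Masking each plane before projecting keeps a bright nuclear pixel in one plane from dominating the projected value at that position; with a fixed threshold on the DNA channel, the mask does not depend on manual outlining.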

      Regarding cytoplasmic versus membrane-associated mRNAs, although we did not have a membrane marker, we relied on the brightness of the DLG-1::GFP signal to identify the cell borders (i.e., membranes) after over-exposure. This approach allowed us to discern apicobasal and apical sides for the intensity profile analyses. We will clarify this point as well in the text and, in parallel, we will try a different approach using transverse sections on top views to clarify our data.

      To my eye, it seems that Figure 5 could be more faithfully interpreted to state that DLG-1 protein localization depends on the L27-SH3 domains. The Huk/Guk domains are dispensable for DLG-1 protein localization; however, through other studies, we know they are important for viability. In contrast, dlg-1 mRNA localization requires all domains of the protein (L27-Guk). It is exceptionally interesting to find a mutant condition in which the mRNA and protein localizations are uncoupled. It would be very interesting to explore in the discussion or by other means what the purpose of localized translation may be. Because, in this instance, proper mRNA localization and protein function are closely associated, it may suggest that DLG-1 needs to be translated locally to function properly.

      We will rewrite the Results and Discussion to clarify our model. We agree that L27 and SH3 domains are critical, but we also detected effects of the HooK/GuK domains. We have refined our model to describe functions of the N and C termini for membrane or junctional localization.

      The manuscript requires an improved materials & methods description of the quantification procedures and statistics employed.

      We will add these points.

      Minor & Major comments together - text

      Summary statement: Is "adherent junction" supposed to be "adherens junction?"

      Corrected.

      Abstract: Sentence 1, I think they should add a caveat word to this sentence. Something like "...phenomenon that can facilitate sub-cellular protein targeting." In most instances this isn't very well characterized or known.

      Corrected.

      In the first paragraph, it might be good to mention that Moor et al also showed that mRNA localize to different regions to alter their level of translation (to concentrate them in high ribosome dense regions of the cell).

      Added as follows: “For example, a global analysis of localized mRNAs in murine intestinal epithelia found that 30% of highly expressed transcripts were polarized and that their localization coincided with highly abundant regions in ribosomes (Moor, 2017).”

      There are some new studies of translation-dependent mRNA localization - that might be good to highlight - Li et al., Cell Reports (PMID: 33951426) 2021; Sepulveda et al., 2018 (PCM), Hirashima et al., 2018; Safieddine, et al 2021. Also, Hughes and Simmonds, 2019 reviews membrane associated mRNA localization in Drosophila. And a new review by Das et al (Nat Rev MCB) 2021 is also nice.

      We will add them to the text.

      Parker et al. did not show that the 3'UTR was dispensable for mRNA localization. They showed the 3'UTR was sufficient for mRNA localization.

      Quoting from the paper Parker et al.: “3′UTRs of erm-1 and imb-2 were not sufficient to drive mRNA subcellular localization. Endogenous erm-1 and imb-2 mRNAs localize to the cell or nuclear peripheries, respectively, but mNeonGreen mRNA appended with erm-1 or imb-2 3′UTRs failed to recapitulate those patterns (Fig. 4A-D).” We will make this point clearer in the rewritten text.

      In the second paragraph, the sentence about bean stages is missing one closing parenthesis.

      Corrected.

      Last paragraph: FISH is fluorescence, not fluorescent.

      Corrected.

      Both "subcellular" and "sub-cellular" are used.

      Corrected.

      Minor comments – Figures

      Figure 1

      o Figure 1A is confusing. It's not totally clear what the rectangles and circles signify. There are many acronyms within the figure. Which of the cell types depicted in the figure are shown here? For example, for the dorsal cells, which is the apical v. basal side?

      We tried to simplify the cartoon for a general C. elegans epithelial cell. We followed schematics already shown in previous publications to maintain consistency. Acronyms and color-codes are listed in the corresponding figure legend and have been better clarified.

      o Some of the colors are difficult to distinguish, particularly when printed out or for red/green colorblind readers. Is erm-1 meant to be a cytoskeletal associated or a basolateral polarity factor?

      We understand the issue, but unfortunately, with 8 classes of factors, shades of gray might not solve the problem. We tried to circumvent the red-green issue changing red to dark grey. Furthermore, we added details about shapes to the figure legends. We will work to make the colors work better.

      ERM-1 is a cytoskeletal-associated factor.

      o The nomenclature for dlg-1 is inconsistent within "C".

      Corrected.

      o Please specify what the "cr" is in "cr.dlg-1:-gfp" in the legend.

      Added.

      Figure 2

      o Can Figure 2C be quantified in a similar manner to 2A/2B?

      Currently our script cannot do that, but we will try to optimize it to be able to quantify this type of images.

      o 2B - please jitter the dots to better visualize them when they land on top of one another

      Yes, we will.

      o Please include a negative control example, a transcript that is not peripherally localized for comparison.

      Yes, we will.

      o There is no place in the text of the document where Fig 2C is referenced

      Corrected (it was wrongly referred to as “2B”).

      o I can't see any discernable ajm-1 localization in Fig 2A.

      We added some arrowheads to point at specific examples and increased the intensities of the corresponding smFISH signal for better visualization.

      o I can't see any dlg-1 pharyngeal localization in Fig2C.

      We added some arrowheads to point at specific examples and increased the intensities of the corresponding smFISH signal for better visualization.

      o More details on how the quantification was performed would be welcome. Particularly, in 2B, what is the distance from the membrane in which transcripts were called as membrane-associated? What statistics were used to test differences between groups?

      We will add a full description of the script used as well as the statistical details.

      Figure 3

      o Totally optional but might be nice: can you make a better attempt to approximate the scale of the cartoon depiction?

      The UTRs, especially the 5’ one, are much smaller than the dlg-1 gene sequence. Scaling the cartoon to the actual sequences would draw attention away from the main subjects of this figure, the UTRs. Nevertheless, we made sure the corresponding figure legend states clearly that the cartoon is not to scale: “The schematics are not in scale with the actual size of the corresponding sequences. UTR lengths: dlg-1 5’UTR: 61 nucleotides; sax-7 5’UTR: 63 nucleotides; dlg-1 3’UTR: 815 nucleotides; unc-54 3’UTR: 280 nucleotides.”

      o The GFP as an asterisk illustration may be confusing for some readers. Could you add another rectangular box to depict the gfp coding sequence?

      Corrected.

      o This microscopy is beautiful!

      Thanks Reviewer #2!

      o Were introns removed? Is the endogenous copy still present?

      All the transgenes were analyzed in a wild-type background, therefore, yes, the endogenous copy was still present. All the transgenes possessed introns. We will change the corresponding text as follows: “To test whether the localization of one of the identified localized mRNAs, dlg-1, relied on zip codes, we generated extrachromosomal transgenic lines carrying a dlg-1 gene whose sequence was fused to an in-frame GFP and to exogenous UTRs.”. In the figure “dlg-1 ORF” has been replaced with “dlg-1 gene”.

      o The wording in the legend "CRISPR or transgenic" may be confusing as Cas9 genome editing is still a form of transgenesis.

      We added “extrachromosomal” to clarify the nature of the mRNA.

      o The authors state that the 5'-3'UTR construct produces perinuclear dlg-1 transcripts but in the absence of DAPI imaging, it's not clear that this is the case.

      We could not find such a statement, but we tried to clarify the localization of these mRNAs in the text: “The mRNA localization patterns of the two UTR reporters were compared to the localization of dlg-1 transcripts from the CRISPR line (“wild-type”, Fig. 3A; Heppert et al., 2018), described in Fig. 2. Both reporter strains showed enrichment at the CeAJ and localization dynamics of their transcripts that were comparable to the wild-type cr.dlg-1 (Fig. 3B). These results indicate that the UTR sequences of dlg-1 mRNA are not required for its localization.”

      o Which probe set was used? The gfp probe?

      Yes, please see the main text: “Given that the transgenic constructs were expressed in a wild-type background, smFISH experiments were conducted with probes against GFP RNA sequences to focus on the transgenic dlg-1::GFP mRNAs (cr.dlg-1 and tg.dlg-1).”

      o Here, sax-7 is used as a uniform control, but sax-7 is claimed in Fig S1B-D as being perinuclear. This is a bit confusing.

      sax-7 mRNA localizes perinuclearly in sporadic instances (Fig S1C), but it is predominantly scattered throughout the cytoplasm (i.e., unlocalized). It presumably localizes perinuclearly in a translation-dependent manner as sax-7 codes for a transmembrane protein that would be targeted to the ER. We have described this ER-type of localization in the introduction and reiterated it partially in the first paragraph of the results. sax-7 UTRs are therefore presumably not responsible for any subcellular localization, which would instead rely on a signal sequence. We will better clarify this point in the main text.

      Figure 4

      o Excellent results! Really nice!

      Thanks Reviewer #2!

      o Fig 4A. The GFP depicted as a circle is strange.

      We changed it into a rectangle.

      o Fig 4A. Can you include the gene/protein name for easy skimming?

      Added.

      o Fig 4B. the color here is too faint and it is unclear what is being depicted. Overall, this part of the figure could be improved.

      We are optimizing the coloring and simplifying the schematics.

      o Were the introns removed?

      No, the introns were maintained in this and in all our transgenic lines. We described our transgenic lines in the materials and methods section (now with more detail). What we depict in the scheme (Fig. 4A) is the mature RNA (now specified in the figure), therefore no introns depicted. We will also specify this in the main text.

      Figure 5

      o Fig 5A. can you add the gene/protein name

      Added.

      o Fig 5B. Can you make the example apicobasal (non-apical) mRNA more distinctive? If it had its own peak in the lower trace, the reader would more clearly understand that this mRNA will be excluded from apical measurements whereas it will be included in apicobasal measurements.

      We actually wanted to show this specific example: a cytoplasmic mRNA and a junctional mRNA may seem close from the apicobasal analysis (partially overlapping peaks that Reviewer #2 mentioned). With the apical analysis, instead, we can show that these mRNAs are actually not close, and they belong to two different compartments (cytoplasm and junction). We would therefore like to keep the current scheme, while better clarifying this point in the corresponding figure legend.

      o D' - I' The grey font is too light.

      Noted. We will change it.

      o D' - I' The inconsistent y-axis scaling makes it difficult to compare across these samples. Can you set them to the same maximum number?

      The values are indeed quite different. We tried to use the same scale, but this would make some of the data difficult to appreciate. The idea was to evaluate, within each graph, how mRNA and protein are localized relative to the junctional marker. We will make this clearer in the text.

      o D' - I' The x-axis labels are formatted incorrectly

      Corrected.

      o The practice of masking out the nucleus appears to remove potentially important mRNAs that are not nuclear localized. This could really impact the findings and interpretation. Instead, consider an automated DAPI mask.

      The masking on the images is not the same as that used for the analysis: in the images, a shaded circle has been drawn on the DNA channel and moved onto its corresponding location in the other channels or merges. For the analysis, the DNA signal has been specifically removed in the channel with the smFISH signal. Given that the analysis has been performed on maximum intensity projections of 3 Z-stacks, we believe we did not remove any non-nuclear mRNA. We will clarify this point in Materials and methods.
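
      The two operations mentioned above, a maximum intensity projection and removal of signal under a DNA mask, can be sketched in Python as follows. This is not our analysis script: the function names, the toy 2 x 2 planes, the mask, and the order of the two steps are all assumptions made for illustration, and plain lists are used only to keep the sketch dependency-free (an array library would normally be used):

```python
def max_intensity_projection(planes):
    """Pixel-wise maximum across a list of equally sized 2D planes."""
    return [
        [max(pixels) for pixels in zip(*rows)]  # same pixel across all planes
        for rows in zip(*planes)                # same row across all planes
    ]


def remove_masked_signal(image, mask):
    """Set a pixel to 0 wherever the mask is True (here, a hypothetical DNA mask)."""
    return [
        [0 if masked else value for value, masked in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Toy smFISH channel: 3 Z planes of 2 x 2 pixels (made-up intensities).
planes = [
    [[1, 0], [0, 2]],
    [[0, 5], [3, 0]],
    [[4, 1], [0, 0]],
]
dna_mask = [[False, True], [False, False]]  # hypothetical: one pixel overlaps DNA

projection = max_intensity_projection(planes)
cleaned = remove_masked_signal(projection, dna_mask)
print(projection, cleaned)
```

      In this sketch only pixels inside the mask are zeroed, so signal outside the masked area is carried through unchanged.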

      o I can't see what the authors are calling membrane diffuse versus cytoplasmic. This is making it hard for me to see their "two step" pathway to localization.

      We will add in Fig. 5B-C an example of a membrane-localized mRNA. Furthermore, we will add transverse sections of membrane and cytoplasm to make the data clearer to the reader.

      o Can more details of the quantification be included? How were Z-sections selected, chosen for inclusion? Which Z-sections and how many were selected?

      We will add the details to Materials and methods.

      o Also, why do these measurements focus on what I think are the seam cells when Lockwood et al., 2008 show the entire epithelium that is much easier to see?

      We are focusing on the seam cells at the bean stage as these are the cells and the embryonic stage where we see the highest localization of dlg-1 mRNA in the wild-type.

      o Please name these constructs to correlate the text more explicitly to the figures.

      Added.

      o How many embryos were analyzed for each trace? How many embryos showed consistent patterns?

      We will add the details of the analysis to Materials and methods.

      o Why were these cells used for study here? Lockwood et al., 2008 use a larger field of epithelial cells for visualization.

      As stated before: we are focusing on the seam cells at the bean stage as these are the cells and the embryonic stage where we see the highest localization of dlg-1 mRNA in the wild-type.

      Figure 6

      There are major discrepancies between what this figure is depicting graphically and what is described in the text. Again, I'm not comfortable making the "two step" claims this figure purports given the data shared in Figure 5.

      We are planning to re-write the last part of the results to better clarify our two-step model. A two-step model had been previously suggested in McMahon et al., 2001, where they could show that DLG-1 and AJM-1 (referred to in that publication as JAM-1) are initially localized laterally and only later in development are then enriched apically. Our data agree with McMahon very well, so we used the earlier study as a start. We will cite and explain this paper in greater depth during the rewriting.

      **Minor comments - Tables & Supplemental Figures**

      Table 1

      I think this table could be improved to more clearly illustrate which mRNAs were tested and what their mRNA localization patterns were (for example, gene name identifiers included, etc). Could the information that is depicted by gray shading instead be added as its own column? For example, have a column for "Observed mRNA localization"

      We modified Table 1 based on these and the other reviewers’ comments.

      Can you add distinct column names for the two columns that are labeled as "protein localization - group"

      We modified Table 1 based on these and the other reviewers’ comments.

      Can you also add which of these components are part of ASI v. ASII (as described in the introduction?)

      A new table has been added with the factors belonging to the two adhesion systems (same color code as in Table 1).

      Supplemental Figure 1

      It is hard to see that some of these spots are perinuclear. More information (membrane marker, 3D rendering, improved metrics) is required to support this claim.

      We are not trying to make a big statement out of these data, as perinuclear localization of mRNAs coding for transmembrane/secreted proteins is well known. The aim of our study was to describe transcripts localized at or in the proximity of the junction. We thought it was worth mentioning these examples of perinuclearly localized mRNAs (hmr-1, sax-7, and eat-20) for two reasons: scientific correctness – show accessory results that might be interesting for other scientists – and use as positive controls for our smFISH survey – these mRNAs were expected to have a somewhat perinuclear localization for the reasons mentioned above.

      What do these images look like over the entire embryo, not just in the zoomed in section?

      We added a column with the zoom-out embryos.

      sax-7 localization in S4 looks similar but a different localization claim is made.

      sax-7 mRNA can localize perinuclearly in sporadic instances (Fig S1C), but is predominantly scattered throughout the cytoplasm (i.e., unlocalized). It presumably localizes perinuclearly in a translation-dependent manner as sax-7 codes for a transmembrane protein that would be targeted to the ER. We have described this ER-type of localization in the introduction and reiterated it partially in the first paragraph of the results. sax-7 UTRs are therefore presumably not responsible for any subcellular localization, which would instead rely on a signal sequence. We will better clarify this point in the main text.

      Supplemental Figure 2

      Before adherens junctions even exist, dlg-1 goes to the membrane - this is really neat!

      Thanks Reviewer #2!

      Supplemental Figure 3

      Technical question: If either 5 or 3 stack images are used, how does this work? Do they have different z-spacings? Or do the 5-stack images represent a wider Z-space?

      This is the sentence in question: “Maximum intensity projections of 5 (1.08 µm) (A) and 3 (0.54 µm) (B) Z-stacks”. The spacing between consecutive Z-stack images is constant in all our imaging: 270 nm. When we consider 5 planes, the distance from the 1st to the 5th is 4 x 270 nm = 1.08 µm, whereas for 3 planes it is 2 x 270 nm = 0.54 µm.
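
      The arithmetic above holds for any number of planes: n planes are separated by n - 1 intervals. A minimal Python sketch, assuming the 270 nm spacing stated above; the function name `projection_depth_nm` is hypothetical and used only for illustration:

```python
Z_STEP_NM = 270  # constant spacing between consecutive Z planes, as stated above


def projection_depth_nm(n_planes: int, z_step_nm: int = Z_STEP_NM) -> int:
    """Distance from the first to the last plane of an n-plane projection."""
    if n_planes < 1:
        raise ValueError("a projection needs at least one plane")
    # n planes are separated by (n - 1) intervals of z_step_nm each.
    return (n_planes - 1) * z_step_nm


for n in (5, 3):
    depth_nm = projection_depth_nm(n)
    print(f"{n} planes: {depth_nm} nm = {depth_nm / 1000:.2f} µm")
```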

      Supplemental Figure 4

      Line #2 retains translation and keeps mRNA localization.

      Totally optional, but consider showing both lines in the main figure to illustrate the two possibilities.

      Noted.

      Materials and methods - how did they create the ATG mutations? Is it an array? - why does one translate, and one doesn't?

      We will clarify this point in Materials and methods: “dlg-1 deletion constructs ΔATG (SM2664 and SM2663) and ΔL27-PDZs (SM2641) were generated by overlap extension PCR using pML902 as a template.”.

      We will perform a Western blot to clarify Reviewer #2’s last point. Currently we do not know what peptide is translated, but the comparison with our full-length control will probably shed some light on the issue.

      Reviewer #3

      Major comments

      The smFISH results are striking and implications exciting. The conclusions made from the smFISH results reported in all Figures will be strengthened considerably by quantifying the mRNA localized to the defined specific subcellular regions. At the very least, localization to the cytoplasm versus the plasma membrane should be determined as performed in Figure 2B, but quantifying finer localization will enhance the conclusions made about regional localization (e.g. CeAJ versus plasma membrane mRNA localization in Figure 5). Inclusion of a non-localizing control in Figures 1-4 will enable statistical comparisons between mRNA localizing and non-localizing groups.

      We will add more quantitation, statistics, and negative controls.

      The script used for smFISH quantitation should be included in the methods or published in an accessible forum (Github, etc). Criteria for mRNA "dot" calling should be defined in the methods. All raw smFISH counts should also be reported.

      We will add the full description of the script in Materials and methods, and we will provide the raw data in an additional supplementary table.

      Figure 2: What is the localizing ratio of a non-localizing control mRNA (e.g. jac-1)? Including an unlocalized control with quantitation would strengthen the localization arguments presented.

      Yes, we will add quantitation for an unlocalized mRNA.

      Figure 5: Quantifying colocalization of mRNA and protein (+/- AJM-1) will strengthen the arguments made about mRNA/protein localization.

      Yes, we will quantify Fig. S5 to have a full picture of the cells (the images in Fig. 5 represent only a portion of the cell).

      Discussion of the CeAJ mRNA localization mechanism is warranted. Do the authors speculate that the newly translated protein drives localization during translation, similar in concept to SRP-mediated localization to the ER, or ribosome association is a trigger to permit a secondary factor to drive mRNA localization, or another model?

      Unfortunately, this is hard to say at the moment as we do not have any data regarding where translation actually occurs. We will add a conjecture to the Discussion.

      Minor comments

      Please complete the following sentence: "We identified transcripts enriched at the CeAJ in a stage- and cell type-specific."

      Corrected.

      It would be helpful to provide reference(s) for the protein localization summary in Table 1.

      Added.

      Figure 2B: Did dlg-1 and ajm-1 localize at similar ratios? Appropriate statistics comparing the different ratios may be informative.

      We will modify the graph (following Reviewer #2’s suggestion) and add the requested details.

      Figure 2: In the paragraph that begins, "Morphogenesis of the digestive track," the text should refer to Figure 2C? If not, the text requires further clarification.

      Corrected.

      Figure 2: Reporting the smFISH localizing ratios of 8E and 16E will be informative.

      We will add the information.

      Please include citations when summarizing the nonsense-mediated decay (NMD) mechanism and AJM-1 identifying the CeAJ.

      Added.

      The sentence, "Embryos from our second ΔATG transgenic line displayed a little GFP protein and some dlg-1::gfp mRNA," should refer to Figure S4.

      Added.

      An immunoblot of this reporter versus wild type may be informative regarding the approximate position of putative alternative start codon.

      We will perform a Western blot to verify the size of the protein product produced.

      Figure 5: N's and repetitions performed should be included for localization experiments.

      Yes, we will add them here and in all the other quantifications we will add to the manuscript.

      Please clarify that the "the mechanism of UTR-independent targeting is unknown in any species" refers to dlg-1 mRNA localization.

      Added.

      "Our findings suggest..." discussion paragraph should reference Figure 6.

      Added.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Subcellular localization of mRNAs plays a critical role in gene regulation and ultimately cellular function. While mRNA untranslated regions often serve as key regulatory codes for expression, mRNA translation can also have a significant effect, a notable example being secretory peptides delivering translating transcripts to the endoplasmic reticulum. A complete understanding of the signals that organize mRNAs in the cell remains an open question. Here, Tocchini, et al. use the C. elegans embryo and single molecule FISH (smFISH) to determine the subcellular localization of key mRNAs involved in epithelial morphogenesis. This survey identifies several mRNAs that appear to localize to specific regions of the cell, such as the plasma membrane or apical junction, and in a developmental stage-specific manner. Dissection of the mRNA of dlg-1/discs large, an apical junction component, provides evidence that mRNA localization requires active translation, but surprisingly the untranslated regions are dispensable. Further mRNA truncation mapping supports the model that the N-terminal coding region helps target mRNAs to the apical junctions, but the C-terminal coding regions are sufficient to localize dlg-1 mRNA to the plasma membrane. The manuscript describes a two-step model for dlg-1 localization and recruitment to the apical junction that depends on translation.

      MAJOR:

      1. The smFISH results are striking and implications exciting. The conclusions made from the smFISH results reported in all Figures will be strengthened considerably by quantifying the mRNA localized to the defined specific subcellular regions. At the very least, localization to the cytoplasm versus the plasma membrane should be determined as performed in Figure 2B, but quantifying finer localization will enhance the conclusions made about regional localization (e.g. CeAJ versus plasma membrane mRNA localization in Figure 5). Inclusion of a non-localizing control in Figures 1-4 will enable statistical comparisons between mRNA localizing and non-localizing groups.
      2. The script used for smFISH quantitation should be included in the methods or published in an accessible forum (Github, etc). Criteria for mRNA "dot" calling should be defined in the methods. All raw smFISH counts should also be reported.
      3. Figure 2: What is the localizing ratio of a non-localizing control mRNA (e.g. jac-1)? Including an unlocalized control with quantitation would strengthen the localization arguments presented.
      4. Figure 5: Quantifying colocalization of mRNA and protein (+/- AJM-1) will strengthen the arguments made about mRNA/protein localization.
      5. Discussion of the CeAJ mRNA localization mechanism is warranted. Do the authors speculate that the newly translated protein drives localization during translation, similar in concept to SRP-mediated localization to the ER, or ribosome association is a trigger to permit a secondary factor to drive mRNA localization, or another model?

      MINOR:

      1. Please complete the following sentence: "We identified transcripts enriched at the CeAJ in a stage- and cell type-specific."
      2. It would be helpful to provide reference(s) for the protein localization summary in Table 1.
      3. Figure 2B: Did dlg-1 and ajm-1 localize at similar ratios? Appropriate statistics comparing the different ratios may be informative.
      4. Figure 2: In the paragraph that begins, "Morphogenesis of the digestive track," the text should refer to Figure 2C? If not, the text requires further clarification.
      5. Figure 2: Reporting the smFISH localizing ratios of 8E and 16E will be informative.
      6. Please include citations when summarizing the nonsense-mediated decay (NMD) mechanism and AJM-1 identifying the CeAJ.
      7. The sentence, "Embryos from our second ΔATG transgenic line displayed a little GFP protein and some dlg-1::gfp mRNA," should refer to Figure S4. An immunoblot of this reporter versus wild type may be informative regarding the approximate position of putative alternative start codon.
      8. Figure 5: N's and repetitions performed should be included for localization experiments.
      9. Please clarify that the "the mechanism of UTR-independent targeting is unknown in any species" refers to dlg-1 mRNA localization.
      10. "Our findings suggest..." discussion paragraph should reference Figure 6.

      Significance

      This well-written, well-cited manuscript describes the striking subcellular localization pattern of a critical, conserved gene involved in both animal development and human disease. The observation that the start codon, and thus translation, is necessary for transcript localization is a complete surprise, and opens exciting doors to investigate how translation leads to mRNA organization and its connection to tissue development. As such, this manuscript will be of broad interest to RNA, cell and developmental biologists, particularly those who investigate post-transcriptional gene regulation and protein complex assembly. However, while the images are indeed supportive of the manuscript's claims, the conclusions will be markedly strengthened by quantifying the subcellular localization of mRNAs in the smFISH experiments, paired with negative controls (e.g. non-localizing, cytoplasmic mRNA). Addition of more quantitative smFISH analyses will enhance the experimental reproducibility, rigor, and statistical significance. The text, figures, and methods should also be revised to include more details about the smFISH analyses, in particular the inclusion of n's, descriptions of how spots were identified, descriptions of scripts used, and the raw mRNA counts. Regardless, the reporter genes tested were well conceived and dlg-1 shows promise to be a fantastic model to further investigate the mechanisms underlying translation-dependent mRNA localization.

      My expertise covers post-transcriptional gene regulation, the C. elegans model organism, and fluorescent imaging with smFISH.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Tocchini et al. screened apical junction and cell membrane proteins for mRNA localization. They identified multiple proteins that are translated from localized mRNAs. Of these, dlg-1 (Discs large) mRNA localizes to cell cortices of dorsal epithelial cells, endoderm cells, and epidermal (seam) cells and is dependent on active translation for transport. The manuscript dissects the contributions of different DLG-1 protein domains to mRNA localization.

      A major strength of the paper is the way it assesses translational dependence in a transcript-specific way without perturbing translation globally. The authors cleverly combine mutations in ATG start sites with a knockdown of the nonsense-mediated decay pathway. This allows Tocchini et al. to examine whether dlg-1 mRNA depends on active translation for localization, which it does. The authors observe an interesting finding, that the domains required for protein localization can be separated from those required for mRNA localization. Namely, mRNA localization (but not protein localization) requires C-terminal domains of the protein.

      My major points of concern focus on the presentation and interpretation of Figure 5. In this figure, the blocking approach used seems confounding, the observations described by the authors are not visible, the quantification is confusing, and the interpretations seem like an over-reach.

      Major comments:

      • Figure 2 requires a negative (or uniformly distributed) mRNA control for comparison. Figure 2C should be quantified. The plot quality should be improved, and appropriate statistical tests should be employed to strengthen the claimed findings.

      • Most claims of perinuclear mRNA localization are difficult to see and not well supported visually or statistically. The usage of DAPI markers, membrane markers, 3D rendering, or a quantified metric would bolster this claim. Also, sax-7 is claimed to be perinuclear and elsewhere claimed to be uniform then used as a uniform control. Please explain or resolve these discrepancies more clearly.

      • The major concern about the paper is the data display and interpretation of Figure 5C. I'm not comfortable with the approach the authors took of blurring out the nucleus. A more faithful practice would be to use an automated mask over DAPI staining or to quantify the entirety of the cell. If the entirety of the cell were quantified, one could still focus analysis on specific regions of relevance. The interpretations distinguishing membrane versus cytoplasmic localization (or mislocalization) are hard to differentiate in these images especially since they are lacking a membrane marker. The ability to make these distinctions forms the basis of Tocchini et al's two pathways of dlg-1 mRNA localization. These interpretations also heavily rely on how the image was processed through the different Z-stacks, and it's not clear to me how that was done. For example, the diffusion of mRNA in figure 5F and 5I are indistinguishable to my eye but are claimed to be different.

      • To my eye, it seems that Figure 5 could be more faithfully interpreted to state that DLG-1 protein localization depends on the L27-SH3 domains. The Huk/Guk domains are dispensable for DLG-1 protein localization; however, through other studies, we know they are important for viability. In contrast, dlg-1 mRNA localization requires all domains of the protein (L27-Guk). It is exceptionally interesting to find a mutant condition in which the mRNA and protein localizations are uncoupled. It would be very interesting to explore in the discussion or by other means what the purpose of localized translation may be. Because, in this instance, proper mRNA localization and protein function are closely associated, it may suggest that DLG-1 needs to be translated locally to function properly.

      • The manuscript requires an improved materials & methods description of the quantification procedures and statistics employed.

      Minor & Major comments together:

      Text

      • Summary statement: Is "adherent junction" supposed to be "adherens junction?"

      • Abstract: Sentence 1, I think they should add a caveat word to this sentence. Something like "...phenomenon that can facilitate sub-cellular protein targeting." In most instances this isn't very well characterized or known.

      • In the first paragraph, it might be good to mention that Moor et al. also showed that mRNAs localize to different regions to alter their level of translation (to concentrate them in ribosome-dense regions of the cell).

      • There are some new studies of translation-dependent mRNA localization - that might be good to highlight - Li et al., Cell Reports (PMID: 33951426) 2021; Sepulveda et al., 2018 (PCM), Hirashima et al., 2018; Safieddine, et al 2021. Also, Hughes and Simmonds, 2019 reviews membrane associated mRNA localization in Drosophila. And a new review by Das et al (Nat Rev MCB) 2021 is also nice.

      • Parker et al. did not show that the 3'UTR was dispensable for mRNA localization. They showed the 3'UTR was sufficient for mRNA localization.

      • In the second paragraph, the sentence about bean stages is missing one closing parenthesis.

      • Last paragraph: FISH is fluorescence, not fluorescent.

      • Both "subcellular" and "sub-cellular" are used.

      Minor comments - Figures

      • Figure 1

      o Figure 1A is confusing. It's not totally clear what the rectangles and circles signify. There are many acronyms within the figure. Which of the cell types depicted in the figure are shown here? For example, for the dorsal cells, which is the apical v. basal side?

      o Some of the colors are difficult to distinguish, particularly when printed out or for red/green colorblind readers. Is erm-1 meant to be a cytoskeletal associated or a basolateral polarity factor?

      o The nomenclature for dlg-1 is inconsistent within "C".

      o Please specify what the "cr" is in "cr.dlg-1:-gfp" in the legend.

      • Figure 2

      o Can Figure 2C be quantified in a similar manner to 2A/2B?

      o 2B - please jitter the dots to better visualize them when they land on top of one another

      o Please include a negative control example, a transcript that is not peripherally localized for comparison.

      o There is no place in the text of the document where Fig 2C is referenced

      o I can't see any discernable ajm-1 localization in Fig 2A.

      o I can't see any dlg-1 pharyngeal localization in Fig2C.

      o More details on how the quantification was performed would be welcome. Particularly, in 2B, what is the distance from the membrane in which transcripts were called as membrane-associated? What statistics were used to test differences between groups?

      • Figure 3

      o Totally optional but might be nice: can you make a better attempt to approximate the scale of the cartoon depiction?

      o The GFP as an asterisk illustration may be confusing for some readers. Could you add another rectangular box to depict the gfp coding sequence?

      o This microscopy is beautiful!

      o Were introns removed? Is the endogenous copy still present?

      o The wording in the legend "CRISPR or transgenic" may be confusing as Cas9 genome editing is still a form of transgenesis.

      o The authors state that the 5'-3'UTR construct produces perinuclear dlg-1 transcripts but in the absence of DAPI imaging, it's not clear that this is the case.

      o Which probe set was used? The gfp probe?

      o Here, sax-7 is used as a uniform control, but sax-7 is claimed in Fig S1B-D as being perinuclear. This is a bit confusing.

      • Figure 4

      o Excellent results! Really nice!

      o Fig 4A. The GFP depicted as a circle is strange.

      o Fig 4A. Can you include the gene/protein name for easy skimming?

      o Fig 4B. the color here is too faint and it is unclear what is being depicted. Overall, this part of the figure could be improved.

      o Were the introns removed?

      • Figure 5

      o Fig 5A. can you add the gene/protein name

      o Fig 5B. Can you make the example apicobasal (non-apical) mRNA more distinctive? If it had its own peak in the lower trace, the reader would more clearly understand that this mRNA will be excluded from apical measurements whereas it will be included in apicobasal measurements.

      o D' - I' The grey font is too light.

      o D' - I' The inconsistent y-axis scaling makes it difficult to compare across these samples. Can you set them to the same maximum number?

      o D' - I' The x-axis labels are formatted incorrectly

      o The practice of masking out the nucleus appears to remove potentially important mRNAs that are not nuclear localized. This could really impact the findings and interpretation. Instead, consider an automated DAPI mask.

      o I can't see what the authors are calling membrane diffuse versus cytoplasmic. This is making it hard for me to see their "two step" pathway to localization.

      o "F" looks the same as "I" to me, but the authors claim they represent different patterns and use these differences as the basis for their claim that X.

      o Can more details of the quantification be included? How were Z-sections selected, chosen for inclusion? Which Z-sections and how many were selected?

      o Also, why do these measurements focus on what I think are the seam cells when Lockwood et al., 2008 show the entire epithelium that is much easier to see?

      o Please name these constructs to correlate the text more explicitly to the figures.

      o How many embryos were analyzed for each trace? How many embryos showed consistent patterns?

      o Why were these cells used for study here? Lockwood et al., 2008 use a larger field of epithelial cells for visualization.

      • Figure 6

      o There are major discrepancies between what this figure is depicting graphically and what is described in the text. Again, I'm not comfortable making the "two step" claims this figure purports given the data shared in Figure 5.

      Minor comments - Tables & Supplemental Figures

      Table 1

      • I think this table could be improved to more clearly illustrate which mRNAs were tested and what their mRNA localization patterns were (for example, gene name identifiers included, etc). Could the information that is depicted by gray shading instead be added as its own column? For example, have a column for "Observed mRNA localization"

      • Can you add distinct column names for the two columns that are labeled as "protein localization - group"

      • Can you also add which of these components are part of ASI v. ASII (as described in the introduction)?

      Supplemental Figure 1

      • It is hard to see that some of these spots are perinuclear. More information (membrane marker, 3D rendering, improved metrics) is required to support this claim.

      • What do these images look like over the entire embryo, not just in the zoomed in section?

      • sax-7 localization in S4 looks similar but a different localization claim is made.

      Supplemental Figure 2

      • Before adherens junctions even exist, dlg-1 goes to the membrane - this is really neat!

      Supplemental Figure 3

      • Technical question: If either 5- or 3-stack images are used, how does this work? Do they have different z-spacings? Or do the 5-stack images represent a wider Z-space?

      Supplemental Figure 4

      • Line #2 retains translation and keeps mRNA localization.

      • Totally optional, but consider showing both lines in the main figure to illustrate the two possibilities.

      • Materials and methods - how did they create the ATG mutations? Is it an array? - why does one translate, and one doesn't?

      Significance

      The authors discover that dlg-1, ajm-1, and hmr-1 mRNAs (among others) are locally translated, and this represents an important conceptual advance in the field as these are well studied proteins and important markers. This is the first study to illustrate translation-dependent mRNA localization in C. elegans, to my knowledge. The mechanisms transporting these mRNAs and their associated translational complexes to the membrane may represent a new pathway of mRNA transport and are therefore significant. The authors identify the domains within DLG-1 responsible for localization, which is a nice advance. If they are unable to order the events of association as they claim in Figure 5 (a claim that I dispute), this doesn't detract from the impact of the paper.

      Other high-profile studies have recently been published that echo how mRNA localization to membranes can be observed for transcripts that encode membrane-associated proteins (Choaib et al., Dev Cell, 2020; Li et al., Cell Reports, 2021 (PMID: 33951426); and reviewed in Hughes & Simmonds, Front Gen, 2019). These recent findings underscore the impact of Tocchini et al.'s paper. Similar studies have identified mRNAs localizing through translation-dependent mechanisms to a variety of different regions of the cell (Sepulveda et al., eLife, 2018; Hirashima et al., Sci Reports, 2018; Safieddine et al., Nat Comm, 2021; and reviewed in Ryder et al., JCB 2020). Given the timely nature of these findings and the recent interest in these concepts, a broad readership should be interested in this paper.

      My field of expertise is in mRNA localization imaging and quantification. I feel sufficiently qualified to evaluate the manuscript on all its merits.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In the current study, Tocchini et al. analyze mRNA localization during development of Caenorhabditis elegans embryonic epithelia. Using an smFISH-based method, they identified mRNAs associated with the cell membrane or cortex, and with apical junctions. They showed that most of the mRNAs involved in the AS-II cell adhesion system localize to the membrane. To examine how epithelial morphogenesis affects mRNA localization, the authors studied two transcripts encoding DLG-1 and AJM-1, which form a complex. The data showed that enrichment of these mRNAs at the CeAJ varies across stages of embryogenesis and cell types. The study then focused on one of the identified transcripts, dlg-1/discs large. Using transgenic lines, the authors demonstrated that dlg-1 localization to the CeAJ is UTR-independent but requires active translation. Moreover, the authors mapped the protein domains involved in that process.

      Major comments:

      Fig. 1: The main and supplementary figures present smFISH signals for eight localized mRNAs, while in the results section the authors describe analyzing twenty-five transcripts. The authors should explain the choice of transcripts presented in the paper. Moreover, the smFISH signal of different localized mRNAs in epidermal cells was visualized at different stages (bean, comma or late comma), and the authors did not comment on the reason for this. This may make the transcript localization results difficult to interpret, as further analysis showed that mRNA localization varied in a stage-specific manner. Did the authors use smFISH probes designed against endogenous mRNAs for all tested transcripts? Marking dlg-1 mRNA as dlg-1-gfp suggests that the smFISH probe was specific for the gfp transcript. Is this true? If yes, the authors should compare localization of wild-type endogenous dlg-1 mRNA with that of the transcript encoding the fusion protein, to confirm that the fusion does not affect mRNA localization.

      Fig. 2B: The authors conclude that at later stages of pharyngeal morphogenesis mRNA enrichment at the CeAJ decreased gradually in comparison to the comma stage. The data do not show a statistically significant decrease in the ratio of localized mRNAs - for dlg-1: bean: 0.39±0.09, comma: 0.29±0.08, 1.5-fold: 0.30±0.09; for ajm-1: bean: 0.36±0.08, comma: 0.30±0.05, 1.5-fold: 0.28±0.09.

      Fig. 4: What was the difference between the first and the second ΔATG transgenic line? The authors should analyze the size of the truncated DLG-1 protein that is expressed from the second ΔATG transgenic line and localizes to the CeAJ. Knowing the alternative ATGs and protein size may suggest the domain composition of the truncated protein. This would allow the localization of the truncated protein to be compared with the results from Fig. 5. Moreover, to prove that the localization of dlg-1 mRNA at the CeAJ is translation-dependent, an additional experiment should be performed in which transcript localization is analyzed in embryos treated with translation inhibitors such as cycloheximide (a translation elongation inhibitor) and puromycin (which induces premature termination).

      Minor comments:

      In the introduction section, the authors should emphasize the main goal and scientific significance of the paper. Fig 1A: It's hard to distinguish the different colors in the schematic. The schematic presents intermediate filaments that are not included in Table 1.

      Fig. 1C: dlg-1 transcript is marked as dlg-1-gfp on the left panel and dlg-1 on the right panel.

      Fig. 2B: Axis labels and titles are not visible, larger font size should be used.

      Fig. 5C: Enlarge the font size.

      Fig. S2: Embryonic stages should be marked on the figure for easier interpretation.

      Significance

      This study makes several contributions to understanding mRNA localization in Caenorhabditis elegans during embryo development. Firstly, it identifies adhesion system II mRNAs associated with epithelial cells. Secondly, it demonstrates a case study of a translation-dependent dlg-1/DLG-1 mRNA localization mechanism that does not involve zip codes. Finally, it provides a model showing the roles of different DLG-1 domains in dlg-1 localization. The results are compelling and the experiments are well presented, although in my opinion the authors should provide stronger evidence to support the idea that active translation is essential for dlg-1 localization.

      Overall, I believe the work will have a wide appeal covering areas such as mRNA localization, developmental biology and embryogenesis.

      My field of expertise is in RNA-protein interactions and mRNA turnover using biochemical methods as well as in vivo studies in C. elegans and mammalian cell lines. I do not have expertise in smFISH-based methods.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      We thank the Referees for their evaluation and their useful comments.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The MS from Bonaventure and colleagues used a CRISPR screen to identify novel IFN-induced antiviral effectors targeting HIV-1. One hit, the DEAD-box helicase DDX42, while not itself part of the IFN response, exerts a substantial inhibitory effect on HIV-1 replication when overexpressed, and gives a several-fold boost to viral replication when knocked down in cells. The effect of DDX42 KO or O/E is manifest at reverse transcription, and PLA analysis suggests an interaction with incoming virions. Moreover, DDX42 appears to exert an inhibitory effect generally against retroviruses and retroelements, with evidence that it associates with viral/transposon RNA. The authors further show that DDX42 has antiviral activity against a range of (but not all) RNA viruses, with very striking phenotypes seen especially with Zika, CHIKV and SARS CoV2, with DDX42 associating with dsRNA in infected cells. These data suggest DDX42 is a constitutively expressed broad-spectrum inhibitor of a range of mammalian RNA viruses. The manuscript is very well written, the data are of good quality and clearly DDX42 is having a general effect on viral replication. The results are novel, important and potentially of wide interest. Where the MS is somewhat lacking is in understanding whether DDX42 has direct antiviral activity or is globally affecting cellular RNA metabolism. Some important areas for the authors to consider are:

      • DDX42 has a potential role in splicing and/or RNA metabolism so I think it would be important to see whether there is any clear global change in gene expression in knockout or knockdown cells vs control that might be suggestive of a generalized effect.

      Responses

      We thank the reviewer for this important question. Indeed, DDX42 didn’t impact the replication of 2 negative-strand RNA viruses and this suggested that DDX42 didn’t have a global impact on the target cells, but we could not formally exclude a generalized effect. Therefore, we have performed RNA-seq analysis in order to evaluate the impact of DDX42 depletion (using 3 different siRNAs targeting DDX42 in comparison to a CTRL siRNA in U87-MG cells, and 2 different siRNAs in comparison to a CTRL siRNA in A549-ACE2 cells, in samples obtained in 3 independent silencing experiments). The RNA-seq data (See Supplemental File 1 and Figure S5) showed that only 63 genes were commonly differentially expressed with the 3 siRNAs targeting DDX42 in U87-MG cells and only 23 of these genes were also found differentially expressed in A549-ACE2 cells depleted for DDX42. Importantly, the identity of these genes could not explain the observed antiviral phenotypes. These data argue against a generalized effect on the target cells that could have explained the antiviral phenotypes observed with the sensitive viruses.

      • The HIV experiments in primary cells are only one round at present. Does the DDX42 knockdown enhance viral replication in multiround? Does it lead to more viral PAMPs for PRRs to induce IFN?

      Responses

      We agree with the reviewer that it would have been very informative to measure the impact of DDX42 knockdown in multiround infections in primary T cells. However, we tried several times to do this experiment (with primary T cells from several donors) and we were not successful: indeed, DDX42 KO appeared to slow down cell division, which could be taken into account for a short, one-cycle experiment (i.e. 24 h) 3 days post-Cas9/sgRNA electroporation by adjusting the number of cells at the time of infection. However, DDX42 KO appeared quite toxic in longer experiments, with cells ceasing to grow.

      The question regarding the generation of more viral PAMPs for PRRs to induce IFN is also very interesting. We know from published work (including ours) that primary T cells don’t normally produce IFN following HIV-1 infection (see for instance Bauby and Ward et al, mBio 2021). However, one can indeed hypothesize that as more viral DNAs are produced in the absence of DDX42, perhaps the primary T cells could detect them and produce IFN. To address this question in primary T cells, we would have needed to be able to perform multiround infections, which was not possible, as mentioned above. Moreover, we could not test this hypothesis in the cell lines that we used, such as U87-MG/CD4/CXCR4 cells, as they are unable to produce IFN following HIV-1 infection.

      • More could be made mechanistically of the lack of sensitivity of Flu and VSV to DDX42. In particular showing whether or not DDX42 interacts with the RNA of the insensitive virus, or whether DDX42/virus or dsRNA interactions by PLA occur with Flu would highlight the relevance of these observations to the antiviral mechanism.

      Responses

      This is an excellent remark. We have now performed RNA immunoprecipitation experiments using 2 viruses targeted by DDX42 (CHIKV and SARS-CoV-2) and 1 virus that is insensitive to DDX42 (IAV) (See New Figure 4J-L): whereas CHIKV and SARS-CoV-2 RNAs could be specifically pulled-down with DDX42 immunoprecipitation, this was not the case for IAV RNA. This strongly argues for a direct mechanism of action of DDX42 helicase on viral RNAs.

      Reviewer #1 (Significance (Required)):


      The role of helicases in host defence is of wide interest and importance. This has the potential to be a very important study that deserves a wide audience. However, in my opinion it needs some further mechanistic insight along the lines I have suggested.

      Responses

      As mentioned above, we have now added important data: First, DDX42 is able to interact with RNAs from targeted viruses (and not from an insensitive virus); Second, we have checked that DDX42 didn’t have a substantial impact on the cell transcriptome. Taken together, these data are clearly in favour of a direct mode of action of DDX42.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this brief report, the authors use a CRISPR screening approach to identify cellular proteins that limit HIV infection. The screen itself is elegantly designed and most of the top hits are components of the interferon signaling pathway that would be expected to emerge from such a screen, thus providing confidence in the results. The authors followed up on DDX42 as a new hit identified in their screen and confirmed that targeting DDX42 with distinct guide RNAs resulted in increased HIV infection in at least 3 cell lines. Conversely, DDX42 overexpression inhibited infection. They also confirmed a role for DDX42 in inhibiting HIV infection in primary macrophages and CD4 T cells using siRNA and CRISPR KO strategies, respectively. They also demonstrate that DDX42 inhibits several other divergent lentiviruses as well as Chikungunya virus and SARS-CoV-2, but not influenza virus. These data convincingly show that DDX42 plays a role in inhibiting many lentivirus and positive-sense RNA virus infections. Using PCR assays for reverse transcription products they conclude that DDX42 inhibits an early process in the HIV life cycle occurring after virus entry, though the statistical significance of these differences is not clear. They further use proximity ligation assays to suggest that DDX42 is in proximity to HIV-1 and SARS-CoV-2 replication complexes. Mechanistically, these data are largely unsatisfying as they do not provide specific insight into how DDX42 so broadly inhibits virus replication. Overall, although the manuscript presents a significant advance, it also has some weaknesses, as listed below.

      1. Statistical analysis is not included in any of the figures.

      Response

      Statistical analyses have now been included.

      2. Many of the figure legends do not state how many independent biological replicates the figures are based on.

      Response

      The number of biological replicates for each panel is stated at the very end of each figure legend.

      3. Detailed mechanistic understanding of DDX42 effects on virus replication is not provided by the manuscript.


      Response

      As mentioned in response to Reviewer 1, we have now added data showing that DDX42 could interact with RNAs from targeted viruses but not from an insensitive virus, arguing for a direct antiviral mode of action of this DEAD-box helicase.

      Reviewer #2 (Significance (Required)):

      DDX42 is a new antiviral protein identified and confirmed in this manuscript. It was also identified as one of many hits in a genome wide CRISPR screen for cellular proteins that regulate SARS-CoV-2 infections, but was not followed up. Thus, the identification and confirmation of DDX42 antiviral activity is highly significant for both the HIV and SARS-CoV-2 fields. This high significance may compensate to some extent for the lack of mechanistic insight contained in this initial report.

      **Referees Cross-commenting**

      I find the comments of the other reviewers to be fair and reasonable, and I concur that the work is overall important and novel. It seems that reviewers generally agreed that some additional mechanistic insights would be desirable for publication in a high impact journal. Reviewer 1 makes some good suggestions in this regard. As for mouse experiments, I would reserve these for a follow up manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):


      In this manuscript, Bonaventure et al. report the results of a screen to identify cellular inhibitors of HIV-1 infection in IFN-treated cells. They identify DDX42 as such a factor though, unexpectedly, DDX42 did not turn out to be an ISG. Strikingly, DDX42 turns out to inhibit a wide range of retroviruses as well as retrotransposons and + sense, but not - sense, RNA viruses, among which SARS-CoV2 turns out to be especially sensitive to DDX42, with siRNAs specific for DDX42 increasing SARS-CoV2 RNA expression by a startling 3 orders of magnitude, compared to only a 2-5 fold positive effect with HIV-1.

      Response

      We agree with the reviewer that DDX42’s impact on HIV-1 may appear as somewhat modest, however, it is highly reproducible across cell lines and primary cells and, more importantly, it is observed upon depletion of the endogenous protein (either by KO or silencing) in target cells that are highly permissive to viral replication, such as activated primary CD4+ T cells. We therefore believe that these findings, combined with the findings that other positive-strand RNA viruses are targeted, are of high interest.

      Reviewer #3 (Significance (Required)):


      I found this paper generally convincing and technically sound though the emphasis was odd and clearly driven more by the history of how this work was done than by the actual results obtained. Specifically, the emphasis is on HIV-1 yet the most interesting data are the dramatic effects seen with Chikungunya and SARS2. If I was writing this paper, I would delete figure 4 and focus this paper entirely on retroviruses and retrotransposons. In that form, I think it would be competitive at PLoS Pathogens or perhaps EMBO Journal. The RNA virus work shown in figure 4 could then be figure 1 of a new, high impact, paper looking at the mechanism of action of DDX42 as an inhibitor of + sense, but not - sense, viral gene expression. Though Wei et al do mention DDX42 in their SARS-CoV2 screening paper this is certainly not a major theme of that paper so I don't think that would be a problem.

      Responses

      We thank the reviewer for this comment. We had hesitated to present the manuscript as suggested by the reviewer (i.e. focusing only on HIV-1, retroviruses and retroelements) and prepare a second manuscript with the remaining data. We’ve finally decided against it, as we believe that showing a broad antiviral effect of DDX42 on +strand RNA viruses increases the impact of our findings.

      On another note, a conditional DDX42 KO mouse has been generated by the Wellcome Trust Sanger Institute and it would greatly improve this manuscript if they could show an in vivo result similar to figure 3F using MLV.

      Responses

      We thank the reviewer for this information. We completely agree that in vivo work would be a massive plus and we will be planning to explore this in the future, but not at this stage as it would require specific funding and resources.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, Bonaventure et al. report the results of a screen to identify cellular inhibitors of HIV-1 infection in IFN-treated cells. They identify DDX42 as such a factor though, unexpectedly, DDX42 did not turn out to be an ISG. Strikingly, DDX42 turns out to inhibit a wide range of retroviruses as well as retrotransposons and + sense, but not - sense, RNA viruses, among which SARS-CoV2 turns out to be especially sensitive to DDX42, with siRNAs specific for DDX42 increasing SARS-CoV2 RNA expression by a startling 3 orders of magnitude, compared to only a 2-5 fold positive effect with HIV-1.

      Significance

      I found this paper generally convincing and technically sound though the emphasis was odd and clearly driven more by the history of how this work was done than by the actual results obtained. Specifically, the emphasis is on HIV-1 yet the most interesting data are the dramatic effects seen with Chikungunya and SARS2. If I was writing this paper, I would delete figure 4 and focus this paper entirely on retroviruses and retrotransposons. In that form, I think it would be competitive at PLoS Pathogens or perhaps EMBO Journal. The RNA virus work shown in figure 4 could then be figure 1 of a new, high impact, paper looking at the mechanism of action of DDX42 as an inhibitor of + sense, but not - sense, viral gene expression. Though Wei et al do mention DDX42 in their SARS-CoV2 screening paper this is certainly not a major theme of that paper so I don't think that would be a problem. On another note, a conditional DDX42 KO mouse has been generated by the Wellcome Trust Sanger Institute and it would greatly improve this manuscript if they could show an in vivo result similar to figure 3F using MLV.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this brief report, the authors use a CRISPR screening approach to identify cellular proteins that limit HIV infection. The screen itself is elegantly designed and most of the top hits are components of the interferon signaling pathway that would be expected to emerge from such a screen, thus providing confidence in the results. The authors followed up on DDX42 as a new hit identified in their screen and confirmed that targeting DDX42 with distinct guide RNAs resulted in increased HIV infection in at least 3 cell lines. Conversely, DDX42 overexpression inhibited infection. They also confirmed a role for DDX42 in inhibiting HIV infection in primary macrophages and CD4 T cells using siRNA and CRISPR KO strategies, respectively. They also demonstrate that DDX42 inhibits several other divergent lentiviruses as well as Chikungunya virus and SARS-CoV-2, but not influenza virus. These data convincingly show that DDX42 plays a role in inhibiting many lentivirus and positive-sense RNA virus infections. Using PCR assays for reverse transcription products they conclude that DDX42 inhibits an early process in the HIV life cycle occurring after virus entry, though the statistical significance of these differences is not clear. They further use proximity ligation assays to suggest that DDX42 is in proximity to HIV-1 and SARS-CoV-2 replication complexes. Mechanistically, these data are largely unsatisfying as they do not provide specific insight into how DDX42 so broadly inhibits virus replication. Overall, although the manuscript presents a significant advance, it also has some weaknesses, as listed below.

      1. Statistical analysis is not included in any of the figures.
      2. Many of the figure legends do not state how many independent biological replicates the figures are based on.
      3. Detailed mechanistic understanding of DDX42 effects on virus replication is not provided by the manuscript.

      Significance

      DDX42 is a new antiviral protein identified and confirmed in this manuscript. It was also identified as one of many hits in a genome wide CRISPR screen for cellular proteins that regulate SARS-CoV-2 infections, but was not followed up. Thus, the identification and confirmation of DDX42 antiviral activity is highly significant for both the HIV and SARS-CoV-2 fields. This high significance may compensate to some extent for the lack of mechanistic insight contained in this initial report.

      Referees Cross-commenting

      I find the comments of the other reviewers to be fair and reasonable, and I concur that the work is overall important and novel. It seems that reviewers generally agreed that some additional mechanistic insights would be desirable for publication in a high impact journal. Reviewer 1 makes some good suggestions in this regard. As for mouse experiments, I would reserve these for a follow up manuscript.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The MS from Bonaventure and colleagues used a CRISPR screen to identify novel IFN-induced antiviral effectors targeting HIV-1.

      One hit, the DEAD-box helicase DDX42, while not itself part of the IFN response, exerts a substantial inhibitory effect on HIV-1 replication when overexpressed, and gives a several-fold boost to viral replication when knocked down in cells. The effect of DDX42 KO or O/E is manifest at reverse transcription, and PLA analysis suggests an interaction with incoming virions. Moreover, DDX42 appears to exert an inhibitory effect generally against retroviruses and retroelements, with evidence that it associates with viral/transposon RNA. The authors further show that DDX42 has antiviral activity against a range of (but not all) RNA viruses, with very striking phenotypes seen especially with Zika, CHIKV and SARS CoV2, with DDX42 associating with dsRNA in infected cells. These data suggest DDX42 is a constitutively expressed broad-spectrum inhibitor of a range of mammalian RNA viruses.

      The manuscript is very well written, the data are of good quality and clearly DDX42 is having a general effect on viral replication. The results are novel, important and potentially of wide interest. Where the MS is somewhat lacking is in understanding whether DDX42 has direct antiviral activity or is globally affecting cellular RNA metabolism. Some important areas for the authors to consider are:

      • DDX42 has a potential role in splicing and/or RNA metabolism so I think it would be important to see whether there is any clear global change in gene expression in knockout or knockdown cells vs control that might be suggestive of a generalized effect.

      • The HIV experiments in primary cells are only one round at present. Does the DDX42 knockdown enhance viral replication in multiround? Does it lead to more viral PAMPs for PRRs to induce IFN?

      • More could be made mechanistically of the lack of sensitivity of Flu and VSV to DDX42. In particular showing whether or not DDX42 interacts with the RNA of the insensitive virus, or whether DDX42/virus or dsRNA interactions by PLA occur with Flu would highlight the relevance of these observations to the antiviral mechanism.

      Significance

      The role of helicases in host defence is of wide interest and importance. This has the potential to be a very important study that deserves a wide audience. However, in my opinion it needs some further mechanistic insight along the lines I have suggested.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General Statements

      We appreciate the thoughtful and constructive comments provided by the reviewers and the opportunity to submit our revision plan for consideration. We have copied the reviewers’ comments below and have detailed our proposed revisions and/or clarifications after each comment (or set of comments). We also provide a partially revised manuscript with editorial changes highlighted in red.

      Reviewer 1:

      In this work, the authors Titialii-Torres and Morris assess how hyperglycemia affects the development of the neural retina using a genetic and a nutritional approach in the model organism zebrafish. This is important as diabetes can contribute to retinal degeneration during the progression of diabetic retinopathy, which often leads to blindness in adults. The authors examine how different cell types in the neural retina are affected in a genetic hyperglycemic model, the pdx1 mutant embryos, and in a nutritional model, in which hyperglycemia is induced by glucose and dexamethasone exposure. Titialii-Torres and Morris show that in both models, photoreceptor rods and cones, as well as horizontal cells, are reduced in number. Additionally, they report a delay in retinal cell differentiation accompanied by increased ROS production in the hyperglycemic retina. Altered expression of metabolism-related genes and effects on visual function were also found in their hyperglycemic models. Overall, the assessment of the different retinal cell types impacted by hyperglycemia and examination of potential molecular mechanisms contributes important and novel data to the field. However, the data as presented fall short of supporting the conclusions of the authors.

      **Major comments**

      Overall, the conclusions would be more strongly supported by improving the clarity of the images, and by additional analyses.

      Figure1:

      Referring to figure 1 E' the text states that an arrowhead points to the shorter and thinner outer segment of a rod. In the figure there is an arrow pointing to a cell without a visible outer segment, making it hard to make the same conclusion. Additionally the GFP signal is very weak in D and E in the dorsal retina. Therefore it is not possible to see if there is also a decreased amount of rods in the dorsal retina as claimed. In the text it is mentioned that cones in the ventral region are affected. Is there also a difference in the dorsal region?

      Response: In our revised manuscript, we will include higher magnification panels for better visualization of the morphological differences between photoreceptor outer segments; we will also revise the graphs to show separate quantification of photoreceptors in the dorsal vs ventral retina.

      Figure 3: Rods and cones might be better displayed in close-ups from sections rather than from projections of the whole eye.

      Response: We will make this change

      The authors write about a reduction of cones upon glucose treatment. In the graph this is not highlighted as significant.

      Response: the change is significant; the graph will be edited to indicate this

      Figure 4: as the overall number of cones was already assessed before, focusing on a smaller region might help the reader to see the Zpr3 staining showing that the outer segments of the cones are stunted (as stated in the main text). In the figure panels presented, outer segments cannot be clearly seen.

      Response: we agree, and will make this change

      Figure 6: scale bar is missing. Please clarify what the red and the green is. Why is there a red signal from Mitosox outside of the embryo (panel C)? The fluorescence of the superoxide probe should be displayed in a more convincing way. For example, in sections to enable assignment of signal to tissues and cells, as shown in Supplemental Figure 5.

      Response: for the revised manuscript we will replace this figure with one containing analysis of tissue sections, with appropriate figure annotation and scale bar

      Figure 8: Is the coincubation with methylene blue leading to a significant increase in photoreceptors? If yes, this should be indicated in the graph.

      Response: for methylene blue treatment alone, the increase was not statistically significant; we have added text in the Results to clarify this. For the revised manuscript, we are also performing additional experiments with a methylene blue + SOD treatment group and with other ROS inhibitors, so this figure will be updated with those data.

      Supplemental Figure 2: The authors assert that TUNEL+ cell labeling coincides with Müller glial cells. This would be better supported with a magnified view of the INL, optimally by applying TUNEL staining to hyperglycemic, GFAP:GFP transgenic samples.

      Response: we will repeat this experiment using the GFAP:GFP line as suggested

      It would be of interest to determine if an incubation with methylene blue also affects photoreceptors in pdx1 mutants. Is it possible to confirm that Methylene blue treatment reduces ROS in the retina ? Can changes in ROS response gene expression be demonstrated by qPCR ? The assumptions about ROS should be either strengthened by additional experiments or less emphasized in the discussion.

      Response: for the revised version we will include the ROS inhibitor experiments on pdx1 mutants as suggested, as well as imaging with the Mitosox probe to confirm the efficacy of the ROS inhibitors; we are also testing additional ROS inhibitors as described above.

      For completeness, glucose metabolism in the genetic model should be also addressed and compared to the nutritional model.

      Response: While we agree that it would be helpful to have these data, it would take a very long time to collect the necessary number of pdx1 mutant individuals needed for this experiment due to the small numbers of homozygous mutants recovered in each clutch. As an alternative approach, for the revised manuscript we will use qRT-PCR to test a subset of the genes on the pdx1 mutants that showed significant changes in the nutritional model.

      The authors talk about a "long term" return to normoglycemia and long term effects of hyperglycemia. Analysis at 7 dpf after a 2 day return to normoglycemic conditions can hardly be called long term. To make these statements, an assessment after a longer time period (one week or more if possible?) would be more convincing.

      Response: for our revised manuscript, we are adding an additional time point for analysis at one week post hyperglycemia

      The claims of 'reactive gliosis' in glucose-treated larvae are overstated. Biologically meaningful differences in cell shape between control and treated samples are not evident from the images (Fig. 5A-F). This should at least be quantitated by shape analysis. The Glucose+Dex samples do not show an increased number of Müller glial cells, and glucose treatment alone leads to highly variable glucose levels. This complicates and weakens a correlation with hyperglycemia.

      Response: we will add the suggested shape quantification of these images; we are also performing Western blots with an anti-GFAP antibody to further strengthen our conclusions – this is a well-accepted method for demonstrating gliosis.

      **Minor comments**

      Some figures would benefit if they would follow the sequence of the text. Eg: figure 1 and 3, the text addresses first the rods and then the cones. In several places the panels referred to in the text do not match the figures or figure panels are not mentioned at all. For example: Pg 3 "Quantification revealed a significant decrease in both rod and cone photoreceptors in pdx1 mutants at 5 dpf (Fig. 1C)." - the quantification is in panels C and F. The main text does not mention or explain Figure 2A.

      Pg 5 "The results confirmed that rods and cones from hyperglycemic larvae have shorter outer segments compared to wild type larvae at 5 dpf (Fig. 4A-C)." - panel C is a graph of Saccades. Fig. S3 - only panel Y is referred to in the text.

      Response: the text has been edited to correct these issues

      Supplemental figure 2: the authors claim a significant increase of apoptotic cells in the genetic model. In the corresponding graph significance is not indicated.

      Response: the increase in apoptotic cells was significant for the nutritional but not the genetic model; the text has been corrected to reflect this.

      Figure 5: scale bars are missing, the figure text and the numbering of the figure do not fit.

      Response: the suggested corrections will be made to this figure and the corresponding text

      Supplemental figures 4 and 5: The Prox1 staining is hard to see and it is unclear what was counted as cells.

      Response: annotations will be added to Sup Figs 4 and 5 to clarify which cells are being quantified

      In Supp Fig. 4E the PKC staining looks increased compared to the controls.

      Response: the variability in staining intensity is within the normal range of what we have observed across all treatments and genotypes

      The graphs could have similar y axes, especially because in Supp Fig. 5 the amount of cells/µm is also different. Why not always use per 50µm? Shouldn't the amount of cells in wild types and untreated embryos be the same per 50µm? Also the labelling of the y axes could be made coherent in the two figures.

      Response: The denominator will be standardized for all graphs. The scale of the y-axes varies by cell type because some retinal cell classes are significantly more abundant than others.

      Supplemental figure 6: K is not mentioned in the legend.

      Response: this has been corrected

      2-NBDG treatment is not explained in the Materials and Methods.

      Response: this information has been added to the Methods

      Reviewer #1 (Significance (Required)):

      **Significance**

      Titialii-Torres et al. characterize the impaired development of neural retinal cells under hyperglycemic conditions in zebrafish larvae and also show evidence of impaired visual function. This work will be of interest for researchers in the field of diabetes, especially those focused on diabetic retinopathy, and for developmental biologists interested in pathologies that impact human development. While the manuscript provides insights into the development of the retina under hyperglycemic conditions, a revision addressing weaknesses of figure presentation and some additional confirmatory experiments would be of great benefit.

      Response: we appreciate the reviewer’s assessment that our work will be of interest to various research communities, and agree that the suggested revisions to the figures and confirmatory experiments will greatly strengthen the impact.

      Reviewer 2:

      **Summary:**

      This paper uses immersion of embryonic zebrafish in high glucose solution to model the effects of hyperglycemia on retinal development. The paper finds that high glucose causes a reduction in the number of photoreceptors and horizontal cells, abnormalities in the morphology of photoreceptors and Müller glia, increased retinal cell apoptosis, a change in the timing of neuronal cell birth, and a defect in the optokinetic response. The mechanistic link between high glucose and changes in retinal development is not well described but may involve an increase in reactive oxygen species.

      **Major comments:**

      1. Is the photoreceptor phenotype a degenerative rather than a developmental phenotype? In embryos treated with high glucose, photoreceptors in the periphery of the retina near the ciliary margin, which are younger in age, seem to be structurally more normal than those at the center, away from the ciliary margin, which are older in age. Could this reflect the fact that photoreceptor development proceeds normally followed by degenerative changes?

      Response: this is certainly a possibility, given the increase in TUNEL positive cells we detected in hyperglycemic retinas. However, we did not detect many apoptotic cells in the ONL at 3 and 4 dpf, suggesting that there is not widespread degeneration among differentiated photoreceptors at that stage. This result, in combination with the altered differentiation timing data shown in Figure 7, is what led us to favor a developmental phenotype. In the revised manuscript, we will add text to the Discussion that more thoroughly explores these alternative interpretations.

      2. For many or most phenotypes the main examined treatment is glucose + dexamethasone. The authors state this combination achieves more uniform glucose concentrations in the embryos as compared to glucose alone. However, dexamethasone may have effects independently of glucose, and the dexamethasone-only control is not used in some or most experiments. For example, in Fig. 3, could dexamethasone alone cause changes in photoreceptor morphology? In the combo treatment, is it possible that some effects are simply due to a synergism of glucose+dex and not because dex causes a more uniformly high intraembryonic glucose?

      Response: we have evaluated photoreceptor number and morphology in the dex alone treatment group and found no significant differences. We will add these results to the main text and the supplemental figures.

      3. It is interesting that hyperglycemic retinas show more neurons born between 2-5 days post fertilization in the RGC layer than in the outer nuclear layer (Fig 7). One interpretation is delayed birth of RGCs after hyperglycemia, as the authors suggest. Another interpretation is that non-RGC cell types are now in the RGC layer, or that some proliferating progenitors persist at 5 dpf. Co-localization of EdU with differentiation markers, and EdU analysis after a short pulse of 2 hours, would help to nail down if there is developmental delay or something else going on here.

      Response: we appreciate the suggestion, and will perform this experiment for the revision

      4. Do Müller cells go into cycle after high glucose treatment?

      Response: this is a great question – we will do a co-localization experiment and add these results to the revised manuscript.

      5. The increase in ROS in Fig. 6 does not seem very convincing. Is the difference between untreated and glucose or glucose/dex treatments statistically significant? I would avoid making too much of this unless some type of phenotype rescue with N-acetylcysteine or vitamin C, or Trolox, can be shown. Methylene blue is a bit non-specific as an antioxidant.

      Response: for methylene blue treatment alone, the increase was not statistically significant; we have added text in the Results to clarify this. For the revised manuscript, we are also performing additional experiments with a methylene blue + SOD treatment group and with other ROS inhibitors, so this figure will be updated with those data.

      Reviewer #2 (Significance (Required)):

      The translational significance of the findings is that they might provide a model to study how embryonic hyperglycemia due to maternal diabetes changes embryonic development. Pitfalls include the fact that its relevance to humans is unclear. Is maternal diabetes known to cause visual abnormalities due to abnormal retinal development in newborns? The basic biology significance may be to provide a model to investigate how glucose metabolism is connected to developmental decisions. However it is unclear whether glucose metabolism within retinal cells mediates the observed effects; and the high glucose used here is likely unphysiological as at these developmental stages zebrafish embryos feed from the yolk sac.

      Response: yes, maternal diabetes is associated with retinal abnormalities in humans, although there are not many published studies on this topic. In the Discussion, we talked about how our results align with prior clinical studies which documented reduced inner and outer macular thickness in children of diabetic pregnancies. At the suggestion of Reviewer 3, we have added this information to the Introduction as well to highlight the relevance of our study to humans. With respect to the comment about physiological relevance, we feel that the inclusion of the genetic model, which does not rely on high levels of exogenous glucose and yet exhibits a similar photoreceptor phenotype, speaks to this issue.

      Reviewer 3:

      **Summary:**

      The authors use a combination of genetic and pharmacological immersion approaches to investigate the effects of hyperglycemia on development of the retina in zebrafish larvae. They demonstrate a rather mild phenotype (though still convincing) such that photoreceptor maturation is delayed/impaired and the Müller glia are also affected. Visual function is modestly impacted, as measured with an assay that can be influenced by motor as well as sensory defects. The authors conclude that altered timing of the differentiation of retinal cells, together with accumulation of reactive oxygen species (ROS), underlie the photoreceptor defects and reduced visual function in the hyperglycemic larvae.

      **Major comments**

      The retinal phenotype related to hyperglycemia is quite subtle, but sufficiently consistent. This phenotype would be more convincing, and lead to more definitive conclusions, if the authors could include some ultrastructural (TEM) information, or even high-resolution/magnification color images of thinner sections processed using conventional histological methods, such as H&E, or toluidine blue/pyronin B. It is difficult to appreciate the features of the apical projections of the photoreceptors in the fluorescently-labeled images.

      Response: we are adding higher magnification images to the photoreceptor figures (also suggested by Reviewer 1) and will incorporate an H&E stain as well.

      Comparison of zpr1 labeling with the TaC:eGFP transgenic is unfortunate. Ideally the authors would use the pdx1 mutant on this transgenic background. Alternatively, the authors could perform TaC in situ hybridizations.

      Response: we have crossed the pdx1 line onto the TaC:eGFP transgenic background and will have this experiment completed for the revision

      The visual function defect is also quite mild. The authors should mention that the OKR assay also relies upon motor function, and so the defect may be related to sensory deficit, motor deficit, or both. Larval ERGs would address this issue.

      Response: we will add this alternative explanation for the OKR results to the text.

      The "reactive gliosis" phenotype is also mild/subtle, and not entirely convincing. More information should be provided regarding what the authors considered an "abnormal shape" of an MG cell body. Ideally, there is an at least somewhat objective means to score normal vs. abnormal and then quantify.

      Response: for the revision, we are adding shape quantification and Western blots (please see our response to the similar comment made by Reviewer 1)

      In Figure 6 legend, the authors state that superoxide production is increased, but the graph does not appear convincing in this regard, and no statistical evaluation is provided.

      Response: for methylene blue treatment alone, the increase was not statistically significant; we have added text in the Results to clarify this. For the revised manuscript, we are also performing additional experiments with a methylene blue + SOD treatment group and with other ROS inhibitors, so this figure will be updated with those data

      The authors do not indicate whether they checked datasets for having a normal distribution prior to the selection of a t-test (or ANOVA) for analysis vs. nonparametric tests.

      Response: a more thorough description of our statistical analyses will be added to the methods

      The model and accompanying text in the Discussion seem overly wordy and speculative. This discussion also does not acknowledge that the effects upon the retina may be indirect, mediated by other tissues that are impacted by hyperglycemia. For example, ocular vascular defects have been described to result from hyperglycemia, over a similar time frame of analysis, and the effects on the retina may be downstream of these defects.

      Response: we will revise the Discussion to remove extraneous information and to incorporate alternative mechanisms that could explain the retinal phenotypes induced by hyperglycemia

      **Minor comments:**

      Introduction - the statement appearing in the Discussion (offspring of diabetic pregnancies had significantly thinner inner and outer macula as well as lower macular volume [43].) should appear in the Introduction to better capture the interest of the reader.

      Response: this change has been made

      Page 1. (..in nearly 10% of US pregnancies) - citation needed.

      Response: this has been added

      Page 2. Pdx1 mutation should be briefly described when first mentioned.

      Response: this has been added

      Legend for Figure 2 could benefit from a definition of 2-NBDG.

      Response: the figure legend has been revised

      Figure 2B does not show Whole Body [Glucose] because the heads were removed for histological analysis.

      Response: this correction will be made to the figure

      It is this reviewer's experience and opinion that zpr-3 labels rods and the RH2 members of the double cones, due to the sequence similarity of RH1 (rhodopsin) and RH2. The cited paper (Yin et al., 2012) hints at this as well, but is slightly unclear due to the terminology used in the paper in describing cone subtypes.

      Response: we have edited the Results to clarify that Zpr3 labels the Rh2-expressing member of the double cones

      Page 8. The marker used to detect bipolar neurons should be mentioned within the Results section.

      Response: this information has been added to the Results

      Pages 12-13. The localization of TUNEL+ profiles may be related to microglia extending processes into the ONL, engulfing photoreceptors, and then internally transporting the bits to their cellular "eating stations" within other retinal layers.

      Response: we will add text to the Results including this as a possibility. We are also (at the suggestion of Reviewer 1) adding an experiment to determine whether some of the TUNEL+ cells co-localize with Müller glia markers.

      Reviewer #3 (Significance (Required)):

      **Significance/Comparison to published knowledge:**

      The zebrafish model(s) are sufficiently novel, versatile, and interesting to constitute an advance in this field. A literature search by this reviewer revealed that the focus upon retinal cells in hyperglycemic, larval zebrafish appears novel. However, the phenotype is remarkably mild, and there is concern that follow-up studies in pursuit of more mechanistic insights will be challenging to perform by the authors and by others in the field. This paper does lay some key groundwork, but what comes next sounds like a lot of fishing expeditions.

      Response: we appreciate the reviewer’s assessment that our work represents a novel advance in this field, and lays “key groundwork” for future studies. Although our nutritional and genetic models do not present with photoreceptor loss so severe that it causes complete blindness at the timepoints we tested, the photoreceptor reductions we observe are consistent, readily scoreable, and are associated with demonstrable defects in visual behavior. Given that we also provide evidence of both altered cell differentiation kinetics and increased oxidative stress in embryonic hyperglycemic retinas, we feel that these are excellent starting places for future work to uncover more mechanistic insights. Finally, our results have implications for human visual system development under hyperglycemic conditions. Timely vision development in infancy is required for attainment of a host of developmental milestones. Even mild delays in this process could have long term consequences for intellectual and social development, and due to the difficulty of measuring visual acuity in infants, subtle but significant impairments may go undetected at this critical stage. Therefore, having a reliable animal model for embryonic hyperglycemia will facilitate efforts to better understand this condition with the goal of developing appropriate intervention and treatment strategies.